Defining your data

When transforming or cleansing data, you must define the data that you are working with.

You define the data by importing or creating table definitions. You can then save the table definitions for use in your job designs.

Table definitions are the key to your IBM® InfoSphere® DataStage® project and specify the data to be used at each stage of a job. Table definitions are stored in the repository and are shared by all the jobs in a project. At a minimum, you need one table definition for each data source and one for each data target in the data warehouse.

When you develop a job, you typically load your stages with column definitions from table definitions held in the repository. You do this on the relevant Columns tab of the stage editor. If you select the relevant options in the Grid Properties dialog box, the Columns tab also displays two extra fields: Table Definition Reference and Column Definition Reference. These fields show the table definition and the individual columns from which the columns on the tab were derived.
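
The relationship between a shared table definition and the columns loaded into a stage can be pictured with a small conceptual model. The following Python sketch is an illustration only and is not a DataStage API: the class names (Repository, TableDefinition, ColumnDefinition, StageColumn), the load_columns method, and the sample table CustomerSource are all hypothetical. It shows one way a loaded column might carry a reference back to the table definition and column definition it was derived from.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ColumnDefinition:
    """One column in a table definition (hypothetical model)."""
    name: str
    sql_type: str
    nullable: bool = True


@dataclass
class TableDefinition:
    """A named set of column definitions (hypothetical model)."""
    name: str
    columns: List[ColumnDefinition] = field(default_factory=list)


@dataclass
class StageColumn:
    """A column on a stage, with optional references to its origin."""
    name: str
    sql_type: str
    nullable: bool
    table_definition_reference: Optional[str] = None
    column_definition_reference: Optional[str] = None


class Repository:
    """Holds table definitions shared by all jobs in a project."""

    def __init__(self) -> None:
        self._tables: Dict[str, TableDefinition] = {}

    def save(self, table: TableDefinition) -> None:
        self._tables[table.name] = table

    def load_columns(self, table_name: str) -> List[StageColumn]:
        """Copy column definitions into stage columns, keeping references."""
        table = self._tables[table_name]
        return [
            StageColumn(
                name=col.name,
                sql_type=col.sql_type,
                nullable=col.nullable,
                table_definition_reference=table.name,
                column_definition_reference=col.name,
            )
            for col in table.columns
        ]


# Assumed sample data: a table definition for one data source.
repo = Repository()
repo.save(
    TableDefinition(
        name="CustomerSource",
        columns=[
            ColumnDefinition("CUST_ID", "Integer", nullable=False),
            ColumnDefinition("CUST_NAME", "VarChar"),
        ],
    )
)

# Loading the stage copies the columns and records where each came from.
stage_columns = repo.load_columns("CustomerSource")
for column in stage_columns:
    print(column.name, column.table_definition_reference,
          column.column_definition_reference)
```

In this sketch, the two reference attributes play the role of the Table Definition Reference and Column Definition Reference fields: each stage column records where it came from, so one table definition can be reused across many stages and jobs.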

You can import, create, or edit a table definition using the Designer.