Each of the four DataStage® parallel
jobs contains one or more stages that connect to the STAGEDB database.
The jobs extract new data changes and keep track of their progress. In this
lesson, you will modify the stages to add connection information,
and create and link to the data set files that DataStage populates.
Stages have predefined and editable properties. This lesson
walks you through the process of changing some of these properties
for the STAGEDB_ASN_PRODUCT_CCD_extract parallel job. After you are
finished, you will repeat the steps for the STAGEDB_ASN_INVENTORY_CCD_extract
parallel job, and also change some properties for stages of the STAGEDB_ST00_AQ00_getExtractRange
and STAGEDB_ST00_AQ00_markRangeProcessed parallel jobs.
This
lesson assumes that you are still logged in to the DataStage Designer after importing the table
metadata. If not, open the Designer.
Procedure
1. Browse the Designer repository tree to the SQLREP folder,
open it, and select the STAGEDB_ASN_PRODUCT_CCD_extract parallel job.
Right-click the job name and click Edit. The design window of the parallel job opens on the Designer
canvas.
2. Locate the green icon at the left side of the job design. This icon represents
the DB2® connector stage that
is used for extracting data from the CCD table. Double-click the icon.
Figure 1. extract_From_CCD_Table stage icon
A stage editor window opens.
Figure 2. Loading
connection information into stage editor
3. In the stage editor, click Load to populate
the fields with connection information, and then click OK to
close the stage editor and save your changes.
4. Go back to the design window for the STAGEDB_ASN_PRODUCT_CCD_extract
parallel job, find the icon for the getSynchPoints DB2 connector stage, and double-click it.
5. In the stage editor, click Load to
populate the fields with connection information.
Note: If you are using a database other than STAGEDB as your Apply control
server, select that database when you load the connection information. The
getSynchPoints stage interacts with the control tables rather than the CCD
table.
Close the stage editor and save your changes.
6. Create an empty text file named productdataset.ds on the system where
InfoSphere® DataStage runs, and
make note of where you saved it. DataStage will write to this data set after
it fetches changes from the CCD table. Data sets that are used to
move data between linked jobs are known as persistent data sets and
are represented by a Data Set stage.
DataStage
can also write to regular flat files, databases, and other targets.
For better performance, you can avoid writing the data to disk and
instead keep it in the parallel processing nodes of the DataStage engine, where
the data can be partitioned based on an algorithm. In that approach, instead
of immediately writing the changes to a file, you feed the data directly to
the actual processing stages.
7. In the design window, open the stage editor for the insert_into_a_dataset
stage.
Figure 3. insert_into_a_dataset stage icon
8. On the Properties tab, make sure that the Target folder is open
and highlight the File = DATASETNAME property. Then, in the File field
on the right, enter the full path to the productdataset.ds file
and click OK.
Figure 4. Selecting
File = DATASETNAME property
You have now updated all
the necessary properties for the STAGEDB_ASN_PRODUCT_CCD_extract parallel
job. Close the design window and save all changes.
9. In the repository pane of the Designer, locate and open
the STAGEDB_ASN_INVENTORY_CCD_extract parallel job and repeat Steps
3 - 8. Keep these points in mind:
- If your control server is not STAGEDB, you need to load the connection
information for the control server database into the stage editor
for the getSynchPoints stage.
- For the insert_into_a_dataset stage, name the data set file inventorydataset.ds.
10. For the STAGEDB_ST00_AQ00_getExtractRange and STAGEDB_ST00_AQ00_markRangeProcessed
parallel jobs, open all of the DB2 connector
stages and use the load function to add connection information for
the STAGEDB database (or for the separate control server, if you are using
a two-database configuration).
11. Make sure that all design windows are closed and saved.
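The empty data set files that the procedure asks you to create (productdataset.ds and inventorydataset.ds) can be created from a shell prompt on the engine host. A minimal sketch, assuming /tmp/sqlrep is an example directory that the DataStage engine user can write to (any writable path works):

```shell
# /tmp/sqlrep is an assumed example path; substitute any directory
# that the DataStage engine user can write to.
mkdir -p /tmp/sqlrep

# Create the two empty data set files that the parallel jobs will populate.
touch /tmp/sqlrep/productdataset.ds
touch /tmp/sqlrep/inventorydataset.ds

# Confirm the files exist and note their full paths.
ls -l /tmp/sqlrep
```

Record the full paths that you used, because you enter them in the File = DATASETNAME property of each insert_into_a_dataset stage.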
Lesson checkpoint
With the stage properties
set, you are done configuring the DataStage jobs
and are ready to compile and run them. In the next module, you
will take these steps and then put everything together in a test run.