
Lesson 2.5: Setting properties for the DataStage jobs

Each of the four DataStage® parallel jobs contains one or more stages that connect with the STAGEDB database. The jobs extract new data changes and keep track of progress. In this lesson, you will modify the stages to add connection information, and create and link to data set files that DataStage populates.

Stages have predefined and editable properties. This lesson walks you through the process of changing some of these properties for the STAGEDB_ASN_PRODUCT_CCD_extract parallel job. After you are finished, you will repeat the steps for the STAGEDB_ASN_INVENTORY_CCD_extract parallel job, and also change some properties for stages of the STAGEDB_ST00_AQ00_getExtractRange and STAGEDB_ST00_AQ00_markRangeProcessed parallel jobs.

This lesson assumes that you are still logged on to the DataStage Designer after importing table metadata. If not, open the Designer.

Procedure

  1. Browse the Designer repository tree to the SQLREP folder, open it, and select the STAGEDB_ASN_PRODUCT_CCD_extract parallel job. Right-click the job name and click Edit. The design window of the parallel job opens in the Designer palette.
  2. Locate the green icon at the left side. This icon represents the DB2® connector stage that is used for extracting data from the CCD table. Double-click the icon.
    Figure 1. extract_From_CCD_Table stage icon

    A stage editor window opens.

    Figure 2. Loading connection information into stage editor
  3. In the editor, click Load to populate the fields with connection information, and then click OK to close the stage editor and save your changes.
  4. Go back to the design window for the STAGEDB_ASN_PRODUCT_CCD_extract parallel job, find the icon for the getSynchPoints DB2 connector stage, and double-click the icon.
  5. In the stage editor, click Load to populate the fields with connection information.
    Note: If you are using a database other than STAGEDB as your Apply control server, select it to load the connection information for the getSynchPoints stage, which interacts with the control tables rather than the CCD table.
    Close the stage editor and save your changes.
  6. Create an empty text file on the system where InfoSphere® DataStage runs. Name the file productdataset.ds and make note of where you saved it. DataStage will write to this data set after it fetches changes from the CCD table. Data sets that are used to move data between linked jobs are known as persistent data sets and are represented by a Data Set stage.

    Tip: DataStage can also write to regular flat files, databases, and other targets. For better performance, you can avoid writing the data to disk and instead keep it in parallel processors in the DataStage engine, where data can be partitioned based on an algorithm. In this manner, instead of immediately writing the changes to a file, you feed the data directly to the processing stages.

  7. In the design window, open the stage editor for the insert_into_a_dataset stage.
    Figure 3. insert_into_a_dataset stage icon
  8. On the Properties tab, make sure the Target folder is open and highlight the File = DATASETNAME property. Then in the File field on the right, enter the full path to the productdataset.ds file, and click OK.
    Figure 4. Selecting File = DATASETNAME property

    You have now updated all the necessary properties for the STAGEDB_ASN_PRODUCT_CCD_extract parallel job. Close the design window and save all changes.

  9. In the repository pane of the Designer, locate and open the STAGEDB_ASN_INVENTORY_CCD_extract parallel job and repeat Steps 2-8. Keep these points in mind:
    • If your control server is not STAGEDB, you need to load the connection information for the control server database into the stage editor for the getSynchPoints stage.
    • For the insert_into_a_dataset stage, name the data set file inventorydataset.ds.
  10. For the STAGEDB_ST00_AQ00_getExtractRange and STAGEDB_ST00_AQ00_markRangeProcessed parallel jobs, open each DB2 connector stage and click Load to add connection information for the STAGEDB database (or for the separate control server database, if you are using a two-database configuration).
  11. Make sure that you have saved your changes and closed all design windows.
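
Steps 6 and 9 each call for an empty file on the system where DataStage runs. If you prefer to script this, the following is a minimal sketch, assuming that Python 3 is available on that system and that you have write access to the directory you choose. The function name create_empty_dataset_files and the temporary demo directory are illustrative only and are not part of the tutorial; replace the demo directory with the location where you want the files to live. The printed full paths are what you would enter in the File field in Step 8.

```python
import tempfile
from pathlib import Path


def create_empty_dataset_files(directory,
                               names=("productdataset.ds", "inventorydataset.ds")):
    """Create empty placeholder files and return their absolute paths.

    The default file names come from Steps 6 and 9 of this lesson;
    the function itself is a hypothetical helper.
    """
    target_dir = Path(directory)
    # Create the directory if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for name in names:
        path = target_dir / name
        # touch() creates an empty file; an existing file is left unchanged.
        path.touch(exist_ok=True)
        created.append(path.resolve())
    return created


if __name__ == "__main__":
    # Demo only: a temporary directory stands in for your real location.
    demo_dir = tempfile.mkdtemp()
    for full_path in create_empty_dataset_files(demo_dir):
        print(full_path)
```

Make note of the printed paths, because the stage editor expects the full path to each file.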

Lesson checkpoint

With the stage properties set, you are done configuring the DataStage jobs and are ready to compile and run the jobs. In the next module, you will take these steps and then put everything together in a test run.

Last updated: 2011-10-21