Scheduling ETL Jobs on Linux

In production environments on Linux, use cron jobs to schedule ETL job runs.

About this task

A cron job runs a command or script periodically, on a schedule that you define.

Procedure

To schedule an ETL job on Linux, complete the following steps:

  1. Open the crontab file for editing by running the following command: # crontab -e
  2. Specify the job to be run.

    The cron job syntax is 1 2 3 4 5 /<directory path>/<script>

    where

    • 1: Minute (0-59)
    • 2: Hour (0-23)
    • 3: Day of the month (1-31)
    • 4: Month (1-12 [12 == December])
    • 5: Day of the week (0-7 [0 or 7 == Sunday])

    The cron daemon runs the job at the scheduled time.
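
The field layout above can be illustrated with a small shell sketch. The function name describe_cron_entry is hypothetical and is not part of cron; it only splits an entry into the five schedule fields and the command, and reports an error when a field is missing. It does not check value ranges or handle special strings.

```shell
#!/bin/sh
# describe_cron_entry ENTRY
# Hypothetical helper for illustration only: prints the five schedule
# fields and the command of a single crontab entry.
describe_cron_entry() {
    # read splits on whitespace and does not expand "*" as a file glob.
    read -r minute hour dom month dow command <<EOF
$1
EOF
    # With fewer than six words, the command part ends up empty.
    if [ -z "$command" ]; then
        echo "error: expected 5 schedule fields followed by a command" >&2
        return 1
    fi
    printf 'minute=%s hour=%s day_of_month=%s month=%s day_of_week=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow"
    printf 'command=%s\n' "$command"
}
```

For example, calling describe_cron_entry "0 3 * * * /<directory path>/<script>" prints minute=0 and hour=3 with an asterisk in the remaining three fields. An entry with only four schedule fields is rejected.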

Example

To run the ETL job every day at 3:00 AM, create the following crontab entry.

Open the crontab file for editing by running the following command: # crontab -e

Append the following entry: 0 3 * * * /opt/IBM/customer/scripts/CustomerName_scheduler.sh

Save and close the file.
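
As an alternative to editing the file interactively, the entry can be added from a script. The following is a minimal sketch, not a prescribed method: merge_cron_entry is a hypothetical helper name, and it assumes that the existing crontab is passed on standard input. It prints the existing entries and appends the new entry only if an identical line is not already present.

```shell
#!/bin/sh
# merge_cron_entry ENTRY
# Hypothetical helper for illustration only: reads an existing crontab
# from standard input and prints it with ENTRY appended, unless an
# identical line already exists.
merge_cron_entry() {
    entry=$1
    existing=$(cat)
    # Reprint the existing entries unchanged.
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string (so "*" is literal), -x: whole line, -q: quiet.
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}
```

With this function defined, the pipeline crontab -l 2>/dev/null | merge_cron_entry "$ENTRY" | crontab - installs the merged result, where ENTRY holds the crontab line. Running crontab -l afterward lists the installed entries so that you can confirm the change.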

The CustomerName_scheduler.sh file has the following content. The dsjob command in the file starts the ETL main sequence with the supplied parameters.

CustomerName_scheduler.sh

#!/bin/sh
# Load the engine environment settings from dsenv before calling dsjob.
cd /opt/export/IBM/InformationServer/Server/DSEngine || exit 1
. ./dsenv
# Start the ETL main sequence. Each trailing backslash continues the command.
/opt/export/IBM/InformationServer/Server/DSEngine/bin/dsjob \
  -domain domainName:9080 \
  -server serverName \
  -user wasadmin -password wasadmin \
  -run -mode NORMAL -warn 0 \
  -param SecurityFlagsPS=securityFlagOff \
  -param IncrementalRefresh=PartialRefresh \
  -param VSMTargetDBInfoPS=VSMTargetFile \
  -param SuiteDatamartPS=ReportingDB \
  -param ECMSourceDBInfoPS=EcmSource \
  -param ECMReportPaths=suite_rep \
  -param CognosBIServerPS=cognos_ps \
  -param LinesExtension=Process \
  projectName LoadingECMDataSequence
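
Because cron runs jobs without a terminal, it can help to capture the output of the scheduler script in a log file. The following is a minimal sketch under stated assumptions: run_logged is a hypothetical helper name, and the log file location is chosen by the caller. The function appends a timestamped start line, the command output, and the exit status to the log, and returns the exit status of the command.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper for illustration only: runs COMMAND, appends its
# standard output and standard error to LOGFILE, and records the
# start time and exit status.
run_logged() {
    log_file=$1
    shift
    printf '%s start: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$log_file"
    "$@" >> "$log_file" 2>&1
    status=$?
    printf '%s end: exit status %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" >> "$log_file"
    # Pass the exit status of the command back to the caller.
    return "$status"
}
```

A wrapper script that defines this function could call run_logged with a log file path and /opt/IBM/customer/scripts/CustomerName_scheduler.sh, and the crontab entry would then point to the wrapper. The recorded status is whatever the scheduler script itself returns.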