Managing source control with Git

You can integrate your Git server with IBM® DataStage® Flow Designer. This integration allows you to publish jobs and related artifacts to different Git branches and load other versions of a job from Git onto the IBM DataStage Flow Designer canvas.

Benefits

Using IBM DataStage Flow Designer with Git provides the following benefits:
  • Helps with continuous integration and continuous delivery (CI/CD) pipeline automation. Your assets are readily available in Git, and you can move them from the development (DEV) branch to quality assurance (QA) and then to production.
  • Helps with auditing requirements. You can easily see who changed what.
  • You can map a version of a job in the XMETA repository to a version in Git.
  • You can work on multiple versions of a job by creating temporary branches.
  • You can easily roll back to a prior version of a job.

Supported Git repositories

The following Git repositories are supported:
  • Bitbucket
  • GitHub
  • GitLab
  • Microsoft Team Foundation Server

Artifacts

IBM DataStage Flow Designer publishes the job .json file to Git. You can also publish .isx files if you select Include compiled code, binaries, and dependant parameters when publishing to Git. This allows you to move jobs between different InfoSphere® Information Server instances.

Job actions for Git and XMETA repositories

The following table details how you can manage jobs for either the Git or XMETA repositories.

Table 1.
Job scenario: A job exists in the XMETA repository but does not exist in the Git repository.
Possible actions:
  • The job can be published to the Git repository.

Job scenario: A job exists in both the XMETA repository and the Git repository.
Possible actions:
  • The job from the XMETA repository can be loaded onto the IBM DataStage Flow Designer canvas and published to the Git repository.
  • The latest or earlier version of the job can be loaded from the Git repository and can be saved to the XMETA repository.

Publishing jobs to Git by using IBM DataStage Flow Designer

To publish jobs:
  1. Log in to IBM DataStage Flow Designer and open the job that you want to publish.
  2. Click the compile icon to compile the job. Then, save it.
  3. Click the Publish to Git icon to publish the file.
  4. Select the branch that you want to add the file to and add a comment. Select Include compiled code, binaries, and dependant parameters if you want the binaries to be published to Git. This publishes the .isx package, which can be moved to other InfoSphere Information Server environments, such as QA or production.
  5. Click Publish.

Loading jobs from Git by using IBM DataStage Flow Designer

To load jobs:
  1. Log in to IBM DataStage Flow Designer and open the job for which you want to load a different version from Git.
  2. Click the Repository version link next to the job name.
  3. Select the branch that you want to load the job from and select the version number.
  4. Click Load.

Scenario: While you are developing Job1, you must fix an urgent defect in Job2. You set your in-progress Job1 changes aside on the Job1_branch in Git, fix Job2 on the Job2_branch, and then return to Job1. Your administrator created a Git configuration and you created a Git profile. You also pushed Job1 and Job2 to master prior to this scenario.

  1. You load Job1 to develop a requirement that is complex to implement.
  2. A critical situation (CritSit) raises a defect that requires an immediate update to Job2, which you must deliver to the customer.
  3. You push the changes that you made so far to Job1 to the branch Job1_branch.
  4. You load Job2, fix the defect, and push to Job2_branch.
  5. You get the changes reviewed and push the fix to the master branch.
  6. You load Job1 from the Job1_branch and continue to develop the requirement.
  7. When you are done, you push directly to master or go through the branch/review process, depending on your company policy.

Moving jobs from Git branches to different environments by using the command-line interface

You can move jobs that were published to Git from IBM DataStage Flow Designer into other environments by using the command-line interface. This allows you to move jobs between different InfoSphere Information Server environments, such as QA and production, and helps satisfy continuous integration and continuous delivery (CI/CD) and auditing requirements. The command-line utility retrieves the .isx packages that were published by using IBM DataStage Flow Designer and imports them into the InfoSphere Information Server metadata repository by using the parameters that you specify on the command line.

Use the script dfdGitCli.sh or dfdGitCli.bat to move jobs from Git to an InfoSphere Information Server metadata repository. The script is located in /opt/installation_directory/Clients/DFD or c:/installation_directory/Clients/DFD, where installation_directory is the directory where you installed InfoSphere Information Server. For example, /opt/IBM/InformationServer.

You must run the command on the services tier.

To run the script, issue the following command:
  • Linux and UNIX
    [root@fall1 DFD]# ./dfdGitCli.sh load
  • Windows
    C:/IBM/InformationServer/Clients/DFD> dfdGitCli.bat load
Below is the command syntax. Optional parameters and values in the syntax are enclosed in brackets, [ ].
-domain | -dom <IIS domain server:port>
-username | -u <IIS user name>
-password | -p <IIS user password>
-projectName | -prj <IIS project name to load jobs into>
[-branch | -br <Commits from git branch; default: master>]
[-gitJobPath | -gitPath <IIS job path from version control>]
[-gitProjectPath | -gitPrjPath <all commits under project git path>]
[-includeDesign | -incdes <include design yes/no; default: no>]
[-replaceAssets | -replace <replace dependencies yes/no; default: no>]
[-version | -ver <git commit id>]
[-beforeDate | -before <commits before date>]
[-afterDate | -after <commits after date>]
-branch
Enter the Git version control branch name. The default is master. This parameter is optional.
-domain
Enter the host name and port number of the InfoSphere Information Server services tier in the format <host_name>:<port>. For example, server.ibm.com:9446.
-gitJobPath
Enter the complete path of the InfoSphere Information Server job name from version control. For example, dstage1/Jobs/testJob.isx or dstage/Jobs/testJob.json. This parameter is optional.
-gitProjectPath
Enter the complete path of the InfoSphere Information Server project name from version control. For example, dstage1/Jobs/EmptyJob/. This parameter is optional.
-username
Enter your InfoSphere Information Server user name.
-password
Enter your InfoSphere Information Server password.
-projectName
Enter the InfoSphere DataStage project name. For example, dstage1.
-jobName
Enter the name of the InfoSphere DataStage job. Instead of specifying a job name, you can enter -all to publish all jobs. This parameter is optional.
-version
Enter the version control commit ID. For example, enter the Git commit ID d66b893 or d66b893ac9f9df7ac6be620929ee82a2fde3c93b. This parameter is optional.
-includeDesign
Enter yes or no to include the design. If you enter no, then only job executables are imported, but not job designs. If you enter yes, then both job executables and job designs are imported. This parameter is optional.
-beforeDate
Enter the date. All commits before the date you enter are included. For example, 2018-11-29T16:00:00-0700. This parameter is optional.
-afterDate
Enter the date. All commits after the date you enter are included. For example, 2016-01-28T16:45:00-0700. This parameter is optional.
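
The -beforeDate and -afterDate examples above use the layout YYYY-MM-DDTHH:MM:SS followed by a numeric UTC offset. The following is a minimal sketch, not part of the product, of two shell helpers that produce and check values in that layout. The function names dfd_timestamp and is_dfd_timestamp are hypothetical, and the sketch assumes a date implementation that supports the %z conversion (GNU and BSD date do). Whether the utility accepts other date layouts is not covered here.

```shell
# Print the current local time in the layout used by the -beforeDate and
# -afterDate examples, for example 2018-11-29T16:00:00-0700.
# Assumption: the local date command supports %z (numeric UTC offset).
dfd_timestamp() {
    date '+%Y-%m-%dT%H:%M:%S%z'
}

# Return 0 if the argument has the same layout, 1 otherwise.
# Only the shape is checked, not whether the date exists on the calendar.
is_dfd_timestamp() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}[+-][0-9]{4}$'
}
```

A script could call is_dfd_timestamp on a user-supplied value before passing it to -beforeDate or -afterDate, and use dfd_timestamp when the cut-off is the moment the script runs.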

Scenario:

You want to publish a job that you created in one InfoSphere Information Server instance by using IBM DataStage Flow Designer to Git, and then move the same job into another InfoSphere Information Server instance, such as a QA instance or production instance.

  1. Create and save the job Job_1 in IBM DataStage Flow Designer. Then, publish the job to Git.
  2. Run the dfdGitCli.sh or dfdGitCli.bat script from the /opt/IBM/InformationServer/Clients/DFD/ or C:\IBM\InformationServer\Clients\DFD directory. Enter the required parameters and any optional parameters that identify the job, such as the branch where Job_1 is in Git and the Git job path, to move the job to your other system.
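
Step 2 can be wrapped in a small shell function so that a missing parameter is caught before the utility is called. The following is a sketch under stated assumptions, not a supported script: the function name run_load and the environment variables DFD_CLI and DFD_PASSWORD are hypothetical. It uses only the load parameters that are listed in the command syntax, and it requires a Git job path as a choice of this sketch even though -gitJobPath is optional for the utility.

```shell
# Hypothetical wrapper around "dfdGitCli.sh load" for a single job.
# Usage: run_load DOMAIN USER PROJECT GIT_JOB_PATH [BRANCH]
# DFD_PASSWORD (assumed name) supplies the password, so that it is not
# hard-coded in the calling script.
# DFD_CLI (assumed name) overrides the script path; it defaults to
# ./dfdGitCli.sh, so run the function from the Clients/DFD directory.
run_load() {
    domain=${1:-}
    user=${2:-}
    project=${3:-}
    job_path=${4:-}
    branch=${5:-master}   # master is the documented default branch

    if [ -z "$domain" ] || [ -z "$user" ] || [ -z "$project" ] || [ -z "$job_path" ]; then
        echo "usage: run_load DOMAIN USER PROJECT GIT_JOB_PATH [BRANCH]" >&2
        return 2
    fi
    if [ -z "${DFD_PASSWORD:-}" ]; then
        echo "run_load: DFD_PASSWORD is not set" >&2
        return 2
    fi

    "${DFD_CLI:-./dfdGitCli.sh}" load \
        "-domain=$domain" "-username=$user" "-password=$DFD_PASSWORD" \
        "-projectName=$project" "-branch=$branch" "-gitJobPath=$job_path"
}
```

Setting DFD_CLI=echo prints the arguments instead of running the utility, which is a convenient dry run. For example, with DFD_PASSWORD set, run_load myserver.ibm.com:9443 isadmin dstage1 dstage1/Jobs/Job_1/Job_1.isx would load the job from the master branch; the Git job path shown here is illustrative only.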

Examples

Example 1: Move a single job from Git to the InfoSphere Information Server system myserver.ibm.com:9443:
./dfdGitCli.sh load -domain=myserver.ibm.com:9443 -username=isadmin -password=isadmin \
-projectName=dstage1 -gitJobPath=dstage1/Jobs/MyJob10/MyJob10.isx \
-version=f000af7 -replaceAssets=yes -includeDesign=yes
Example 2: Move multiple jobs that were published to Git within a particular interval from Git to the InfoSphere Information Server system myserver.ibm.com:9443:
./dfdGitCli.sh load -dom=myserver.ibm.com:9443 -u=isadmin -p=isadmin -projectName=dstage1 \
-branch=testBranch -includeDesign=yes -replaceAssets=yes \
-beforeDate=2018-11-28T16:00:00-0700 \
-afterDate=2018-11-28T06:00:00-0700
Example 3: Move multiple jobs under a project Git path from Git to the InfoSphere Information Server system myserver.ibm.com:9443:
./dfdGitCli.sh load -dom=myserver.ibm.com:9443 \
-u=isadmin -p=isadmin -prj=IISM -br=master \
-gitProjectPath=dstage1/Jobs/EmptyJob/
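
When a script takes the value for -version from user input or from another tool, a quick shape check can catch a branch name or a truncated value before the utility runs. The following is a minimal sketch with a hypothetical function name, is_commit_id. It assumes commit IDs of 7 to 40 hexadecimal characters, which covers the two forms shown in the -version description; Git itself can accept shorter abbreviations, and whether the utility does is not stated here.

```shell
# Return 0 if the argument looks like an abbreviated or full Git commit ID
# (7 to 40 hexadecimal characters), 1 otherwise.
# This checks the shape only; it does not confirm that the commit exists.
is_commit_id() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{7,40}$'
}
```

For example, is_commit_id f000af7 succeeds, while is_commit_id master fails because the value contains characters that are not hexadecimal digits.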