Running scripts in a wrapped stage in DataStage

You can wrap an executable script and turn it into a DataStage® stage.

The examples in this topic use Python, Netezza®, and Db2® scripts.

Creating a wrapped stage with a script

After you create the wrapped stage with your script, you can include the new stage in your DataStage flows. Wrapped stages that you create are visible on the canvas under User-defined stages and on the Assets tab of your project in the "DataStage components" list.

Example: Python script

The following example shows how to create a wrapped stage that contains a Python script. The example demonstrates a simple use case: finding the largest integer in the input. See Defining wrapped stages in DataStage for details on the information in the example.

Source code
See the following source code for this example. The script reads integers from standard input, one per line, then finds and prints the largest number, or the smallest number if you pass -S.
# Searches through a list of input numbers and finds smallest or largest number
# Syntax: python search_num.py <arg>
# where arg=-S or -L

import sys

numStr = ""
i = 0
min_max = 0
newStr = 0
arg = "Largest"

if len(sys.argv) > 1:
   arg = sys.argv[1]

if arg.lower() == "-s":
   arg = "Smallest"
else:
   arg = "Largest"

for line in sys.stdin:
   if line.strip() != "":
      strVal = line.strip()
      newStr = strVal.replace('"', '')
      if i == 0:
         min_max = newStr
         numStr = '"' + line.strip() + '"'
         i = 1
      else:
         numStr = numStr + ', "' + line.strip() + '"'
      intNum = int(newStr)
      intMax = int(min_max)
      if arg == "Smallest":
         if intNum < intMax:
            min_max = newStr
      else:
         if intNum > intMax:
            min_max = newStr

print(" [Numbers:" + numStr + "]. " + arg + " number: " + str(min_max))
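To make the script's behavior easier to follow, the same logic can be sketched as a function and run on a small input. This is an illustration only, not the script to wrap; the function name summarize and the sample values are hypothetical.

```python
def summarize(lines, arg="-L"):
    """Mirror the logic of search_num.py and return the line it would print."""
    # Any argument other than -s (case-insensitive) selects the largest number
    mode = "Smallest" if arg.lower() == "-s" else "Largest"
    # Skip blank lines, as the script does
    values = [line.strip() for line in lines if line.strip() != ""]
    # Double quotes are removed before the values are compared as integers
    cleaned = [value.replace('"', "") for value in values]
    quoted = ", ".join('"' + value + '"' for value in values)
    if not cleaned:
        result = 0  # the script prints its initial value of 0 when there is no input
    elif mode == "Smallest":
        result = min(cleaned, key=int)
    else:
        result = max(cleaned, key=int)
    return " [Numbers:" + quoted + "]. " + mode + " number: " + str(result)


print(summarize(["3\n", "17\n", "5\n"]))        # [Numbers:"3", "17", "5"]. Largest number: 17
print(summarize(["3\n", "17\n", "5\n"], "-S"))  # [Numbers:"3", "17", "5"]. Smallest number: 3
```

Each printed line starts with a single space, as in the original script.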
  1. Create your Python script by using the source code in this example. Save the script as search_num.py. As a cluster administrator, create the directory /ds-storage/tools/python3.9.9/python/ and place search_num.py in it.

    If multiple PX runtime instances are involved, you must create the directory and place the script in it for each instance.

  2. From your project, on the Assets tab, click New asset > Component editors > DataStage component > Wrapped stage.
  3. Provide a name and optional description. Then, click Create.
  4. Specify details on the General tab:
    1. Wrapper name
    2. Execution mode: Parallel only
    3. Command (script information). For example,
      /ds-storage/tools/python3.9.9/python/ds-storage/search_num.py
    4. Optional description
  5. Click Add property + to add arguments for the script:
    1. Name: Arg
    2. Data type: String
    3. Prompt: Argument
    4. Default: 10
    5. Repeats column: cleared
    6. Conversion: -Name Value
  6. Click Save.
  7. On the Wrapped tab, on the Input page, specify input ports and the properties:
    1. Click Add link +.
    2. Click in the Data definition column to select a table that describes the metadata for the port.
  8. Repeat step 7 for the Output ports.
  9. Click Save, then click Generate.
Sample flow for running Python script
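
Before you generate the stage, it can help to check locally that a command behaves as expected when rows arrive on standard input. The following is a minimal sketch, assuming that the wrapped command receives one row per line on standard input, which is how search_num.py reads its data. The helper name run_with_rows and the stand-in command are hypothetical.

```python
import subprocess
import sys


def run_with_rows(command, rows):
    """Run a command, send each row as one line on standard input, and
    return its exit code and standard output."""
    payload = "".join(str(row) + "\n" for row in rows)
    completed = subprocess.run(command, input=payload, capture_output=True, text=True)
    return completed.returncode, completed.stdout


# Stand-in command (an assumption for illustration): counts the lines it receives
demo_command = [sys.executable, "-c", "import sys; print(len(sys.stdin.readlines()))"]
exit_code, output = run_with_rows(demo_command, [3, 17, 5])
print(exit_code, output.strip())
```

To try the script from this example instead, replace the stand-in command with something like [sys.executable, "search_num.py", "-L"], adjusting the path to wherever you saved the script.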

Example: Netezza script

The following example shows how to connect to a Netezza database, then run a simple query against the database. See Defining wrapped stages in DataStage for details on the information in the example.

  1. From your project, on the Assets tab, click New asset > Component editors > DataStage component > Wrapped stage.
  2. Provide a name and optional description. Then, click Create.
  3. Set the Execution mode to Sequential only so that the commands run only once.
  4. On the Properties tab, set host and port properties.
  5. On the Wrapped tab, for Exit codes, select All Exit codes successful so that the stage does not fail when a command returns an error.
  6. Click Generate.
  7. Create a DataStage flow with the new stage.
  8. Create your set of NZ-SQL statements and pass them as an input to the wrapped stage. The example uses the Row Generator stage to pass a blank input to the Transformer stage, which generates three SQL statements that are passed through the Funnel stage and into the Wrapped stage. This example shows a Transformer output that defines an SQL statement. The example flow uses three outputs.

    Sample SQL statement for running Netezza script

  9. Compile and run the job.

Wrapped stage flow for running Netezza script
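
The data that reaches the wrapped stage in this flow can be pictured as a single stream of SQL statements, one per row. The following sketch models that stream in Python under stated assumptions: the table name sales_orders and the three statements are hypothetical, and the funnel function is a simplification that keeps the outputs in order, which an actual Funnel stage might not do depending on its configuration.

```python
def transformer_outputs(table):
    """Return three outputs with one SQL statement each, standing in for the
    three Transformer outputs in the example flow."""
    return [
        [f"SELECT COUNT(*) FROM {table};"],
        [f"SELECT * FROM {table} LIMIT 5;"],
        ["SELECT CURRENT_TIMESTAMP;"],
    ]


def funnel(outputs):
    """Merge several row streams into one stream (simplified: order is preserved)."""
    return [row for output in outputs for row in output]


# Hypothetical table name, used only for illustration
for statement in funnel(transformer_outputs("sales_orders")):
    print(statement)
```

Each printed line corresponds to one row that the wrapped stage would receive as input.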

Example: Db2 script

The following example shows how to connect to a Db2 database, then run a simple query against it. See Defining wrapped stages in DataStage for details on the information in the example.

  1. From your project, on the Assets tab, click New asset > Component editors > DataStage component > Wrapped stage.
  2. Provide a name and optional description. Then, click Create.
  3. On the General tab of the new stage, set the Execution mode to Parallel only and specify a command of db2 -s -v.
  4. On the Wrapped tab, specify input ports and their properties.
  5. Create a data definition asset to define the input schema for the wrapped stage. Use a single column of type VARCHAR(255) to input SQL statements.
  6. Create a DataStage flow with a Row Generator stage, a Transformer stage, a Funnel stage, and the new wrapped stage as a target.
  7. Configure the Row Generator stage to pass an input to the Transformer stage.
  8. Configure the Transformer stage to generate your SQL statements and add any job parameters. Pass them through the Funnel stage as an input to the wrapped stage. This example shows a Transformer output that defines an SQL statement. The example flow uses three outputs.

    Sample SQL statement for running Db2 script

  9. Compile and run the job.
Wrapped stage flow for running Db2 script
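
Because the input schema is a single VARCHAR(255) column, each SQL statement must fit in one row of at most 255 characters. The following sketch shows one way to check statements before you put them in the Transformer stage. The helper name to_input_rows, the database name sample, and the table name staff are assumptions for illustration only.

```python
MAX_LEN = 255  # matches the VARCHAR(255) column in the data definition


def to_input_rows(statements, max_len=MAX_LEN):
    """Collapse each statement onto one line and verify that it fits the column."""
    rows = []
    for statement in statements:
        text = " ".join(statement.split())  # one statement per row, no embedded newlines
        if len(text) > max_len:
            raise ValueError(f"Statement is {len(text)} characters; limit is {max_len}")
        rows.append(text)
    return rows


# Hypothetical statements: connect, run a simple query, then disconnect
statements = [
    "CONNECT TO sample",
    """SELECT COUNT(*)
       FROM staff""",
    "CONNECT RESET",
]
for row in to_input_rows(statements):
    print(row)
```

A statement that exceeds the limit raises ValueError instead of being silently truncated, so the problem surfaces before the job runs.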