Using stored procedures in DataStage

A stored procedure is a block of procedural constructs and embedded SQL statements. Use the Call procedure option under the Read method or Write method property of the Stage tab to call an existing stored procedure from a database. Use the Call procedure statement option to write your own procedure statement syntax in the text box.

Available databases and features

Note: All databases are available in source, transform, and target modes.

See the following table to check which databases or features you can use for your stored procedures. Stored procedures can be run on connectors with input links, output links, or both.

Table 1. A list of database systems with their corresponding stored procedure capabilities.
Database Call procedure Call procedure statement Inout parameters Optional parameters Query timeout User-defined functions
Amazon RDS for PostgreSQL    
IBM Cloud® Databases for PostgreSQL    

IBM Db2®

 

IBM® Db2 Big SQL

 

IBM Db2 on Cloud

 

IBM Db2 for i

 
IBM® Db2 Warehouse  
IBM Db2 for z/OS®    
Google BigQuery        
Microsoft Azure SQL Database  
Microsoft SQL Server
Microsoft Azure Synapse Analytics    
Oracle    
PostgreSQL    
SAP ASE    
Snowflake  
Teradata    

Stored procedures in a source stage

Select Add procedure return value to schema to add the procedure return code and message to the schema as columns ProcCode and ProcMess. The first column on the output link holds the value of the code. The second column holds the value of the message. If you also select Forward row data, the input columns are added to the output link after the ProcCode and ProcMess columns.

Stored procedures in a target stage

Select Execute procedure for each row to run the procedure for each row of the input data. If you do not select Execute procedure for each row, the procedure runs once, for the first row. By default, Execute procedure for each row is selected.

Parameter values

In the Parameter values section of the Stage tab, use parameter specifications to map the input and output parameters of your stored procedure to input and output columns. The mapping determines which columns data is returned in. Parameter names in the Stage tab should match parameter names in your stored procedure. Each parameter name can be mapped to a single column. If a mapping is not provided, the output data for each column is returned in the order the columns are specified. A connector with an input link has input parameters, a connector with an output link has output parameters, and a connector with both input and output links has both.

When a connector has both input and output links, you can edit the Marker / Literal value for your input parameters on the Stage tab. You can specify an input parameter name as:
@
The default value for the parameter. The column value that is specified in the mapping is passed to the parameter.
Literal
A constant specified as an input parameter. The value of the constant is used, and the value from the input link is discarded.
Local parameter or a parameter from a parameter set
For a parameter from a parameter set, use (#name#) format. The value of the parameter is used, and the value from the input link is discarded.

Each parameter must be mapped to the input column. For example, P1 should be mapped to COLUMN_1, and P2 to COLUMN_2. Parameter mapping does not replace the column definition. However, you can discard the value from an input link and use the value in the Marker / Literal column.

In parameters

For stored procedures that require only input parameters, you can specify the parameters with a ?. For example, if you have a stored procedure that takes in 2 values: parameter1 and parameter2, the syntax for the call statement is as follows:

CALL my_procedure (?, ?)
If you want to specify a custom input value, replace ? with the custom value:
CALL my_procedure (4, ?)

Inout parameters

You can use an inout parameter type for your stored procedures. The inout parameter acts both as an input and an output parameter. A procedure receives a value for the parameter, processes it, and returns a modified value through the same parameter. You can use the inout parameters only when connector is configured in a transform mode.

You can pass the input value of the inout parameter only though the link or from a parameter map under the Parameter values section.

The input values for the inout parameter cannot be fixed in your statement. For example, ?=(CALL) test (1,?,?) is an incorrect statement, while ?=CALL test (?,?,?) is correct.

To create a call procedure statement that takes inout parameters, specify the inout parameter with a ?, the same way that you specify an output parameter. You can specify input parameters with @input and output parameters with ?. For example, if a procedure accepts one input parameter and one inout parameter, use the following syntax where ? represents the inout parameter.

CALL my_procedure (@input_param, ?)

Out parameters

For stored procedures that require only output parameters, you can specify the parameters with a ?. For example, if you have a stored procedure that outputs 2 values: parameter1 and parameter2, the syntax for the call statement is as follows:

CALL my_procedure (?, ?)

Optional parameters

You can use optional parameters only for the stored procedures. User-defined functions do not support optional parameters. A stored procedure can have one or more optional parameters. If you want to omit optional parameters while calling a stored procedure, use Call procedure statement. If a stored procedure has one or more optional parameters, all optional parameters must be defined after output parameters. Optional parameters with the Null value must be grouped at the end.

Example: A stored procedure MYSCHEMA.MYSP has two required input parameters (inparam1, inparam2), two optional inputs (optinparam3, optinparam4), and two output parameters. See the following examples of valid and invalid use of Call procedure statement:

Valid:

In this case the Input value for optinparam3 and optinparam4 is used:
CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, optinparam3, optinparam4)
In this case the Input value for optinparam3 is used, and default value for optinparam4 is used:
CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, optinparam3)
In this case the default value for both optinparam3 and optinparam4 is used:
CALL MYSCHEMA.MYSP(in1, in2, ?, ?)
Invalid:
CALL MYSCHEMA.MYSP(inparam1, inparam2, optinparam3, optinparam4, ?, ?)
CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, , optinparam4)

Query timeout

For long-running queries and stored procedures, set the Query timeout property to limit the number of seconds that a procedure runs for. The default value is 300 seconds. If you set a value greater than 600 seconds, you must also edit the environment variable $OSH_JOB_START_TIMEOUT and set a value higher than the Query timeout.

User-defined functions

When Call procedure or Call procedure statement is selected, you can select User-defined function to specify a user-defined function to execute. A user-defined function can take input parameters and returns either a single value (for scalar functions) or a set of values (for table functions). In IBM Db2 for z/OS, user-defined functions must be called as part of a SELECT statement. In Teradata, macro-type functions are also available.

User-defined functions support both source and transform modes. For the Source mode, the data is read from the database. For the Transform mode, the data is read, transformed, and written to the target node.

You can use two types of user-defined functions:
  • Scalar
    Returns a single value.
  • Table
    Returns a table.
When a function has no input parameters, it can be executed when the database acts as a source. When a function has input parameters, it can be executed when the database acts as a transformer.
Use the following procedure statements for your user-defined functions:
  • For scalar user-defined function:
    SELECT  “Sclr_udf_name”(@arg1,@arg2,…);
  • For table user-defined function:
    SELECT * FROM  “table_udf_name”(@arg1,@arg2,…);