Using stored procedures in DataStage
A stored procedure is a block of procedural constructs and embedded SQL statements. Use the Call procedure option under the Read method or Write method property of the Stage tab to call an existing stored procedure from a database. Use the Call procedure statement option to write your own procedure statement syntax in the text box.
Available databases and features
See the following table to check which databases or features you can use for your stored procedures. Stored procedures can be run on connectors with input links, output links, or both.
| Database | Call procedure | Call procedure statement | Inout parameters | Optional parameters | Query timeout | User-defined functions |
|---|---|---|---|---|---|---|
| Amazon RDS for PostgreSQL | ✓ | ✓ | ✓ | ✓ | ||
| IBM Cloud® Databases for PostgreSQL | ✓ | ✓ | ✓ | ✓ | ||
|
IBM Db2® |
✓ | ✓ | ✓ | ✓ | ✓ | |
|
IBM® Db2 Big SQL |
✓ | ✓ | ✓ | ✓ | ✓ | |
|
IBM Db2 on Cloud |
✓ | ✓ | ✓ | ✓ | ✓ | |
|
IBM Db2 for i |
✓ | ✓ | ✓ | ✓ | ✓ | |
| IBM® Db2 Warehouse | ✓ | ✓ | ✓ | ✓ | ✓ | |
| IBM Db2 for z/OS® | ✓ | ✓ | ✓ | ✓ | ||
| Google BigQuery | ✓ | ✓ | ||||
| Microsoft Azure SQL Database | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Microsoft SQL Server | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Microsoft Azure Synapse Analytics | ✓ | ✓ | ✓ | ✓ | ||
| Oracle | ✓ | ✓ | ✓ | ✓ | ||
| PostgreSQL | ✓ | ✓ | ✓ | ✓ | ||
| SAP ASE | ✓ | ✓ | ✓ | ✓ | ||
| Snowflake | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Teradata | ✓ | ✓ | ✓ | ✓ |
Stored procedures in a source stage
Select Add procedure return value to schema to add the procedure return code and message to the schema as columns ProcCode and ProcMess. The first column on the output link holds the value of the code. The second column holds the value of the message. If you also select Forward row data, the input columns are added to the output link after the ProcCode and ProcMess columns.
Stored procedures in a target stage
Select Execute procedure for each row to run the procedure for each row of the input data. If you do not select Execute procedure for each row, the procedure runs once, for the first row. By default, Execute procedure for each row is selected.
Parameter values
In the Parameter values section of the Stage tab, use parameter specifications to map the input and output parameters of your stored procedure to input and output columns. The mapping determines which columns data is returned in. Parameter names in the Stage tab should match parameter names in your stored procedure. Each parameter name can be mapped to a single column. If a mapping is not provided, the output data for each column is returned in the order the columns are specified. A connector with an input link has input parameters, a connector with an output link has output parameters, and a connector with both input and output links has both.
- @
- The default value for the parameter. The column value that is specified in the mapping is passed to the parameter.
- Literal
- A constant specified as an input parameter. The value of the constant is used, and the value from the input link is discarded.
- Local parameter or a parameter from a parameter set
- For a parameter from a parameter set, use
(#name#)format. The value of the parameter is used, and the value from the input link is discarded.
Each parameter must be mapped to the input column. For example, P1 should be mapped to COLUMN_1, and P2 to COLUMN_2. Parameter mapping does not replace the column definition. However, you can discard the value from an input link and use the value in the Marker / Literal column.
In parameters
For stored procedures that require only input parameters, you can specify the parameters with a
?. For example, if you have a stored procedure that takes in 2 values:
parameter1 and parameter2, the syntax for the call
statement is as follows:
CALL my_procedure (?, ?)
? with the custom
value:CALL my_procedure (4, ?)Inout parameters
You can use an inout parameter type for your stored procedures. The
inout parameter acts both as an input and an output parameter. A procedure receives
a value for the parameter, processes it, and returns a modified value through the same parameter.
You can use the inout parameters only when connector is configured in a transform
mode.
You can pass the input value of the inout parameter only though the link or from
a parameter map under the Parameter values section.
The input values for the inout parameter cannot be fixed in your statement. For
example, ?=(CALL) test (1,?,?) is an incorrect statement, while ?=CALL test
(?,?,?) is correct.
To create a call procedure statement that takes inout parameters, specify the
inout parameter with a ?, the same way that you specify an output parameter. You
can specify input parameters with @input and output parameters with
?. For example, if a procedure accepts one input parameter and one inout parameter,
use the following syntax where ? represents the inout parameter.
CALL my_procedure (@input_param, ?)
Out parameters
For stored procedures that require only output parameters, you can specify the parameters with a
?. For example, if you have a stored procedure that outputs 2 values:
parameter1 and parameter2, the syntax for the call
statement is as follows:
CALL my_procedure (?, ?)
Optional parameters
You can use optional parameters only for the stored procedures. User-defined functions do not support optional parameters. A stored procedure can have one or more optional parameters. If you want to omit optional parameters while calling a stored procedure, use Call procedure statement. If a stored procedure has one or more optional parameters, all optional parameters must be defined after output parameters. Optional parameters with the Null value must be grouped at the end.
Example: A stored procedure MYSCHEMA.MYSP has two required input
parameters (inparam1, inparam2), two optional inputs (optinparam3, optinparam4), and two output
parameters. See the following examples of valid and invalid use of Call procedure
statement:
Valid:
CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, optinparam3, optinparam4)
In this case the Input value for optinparam3 is used, and default value for optinparam4 is
used:CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, optinparam3) In this case
the default value for both optinparam3 and optinparam4 is
used:CALL MYSCHEMA.MYSP(in1, in2, ?, ?)CALL MYSCHEMA.MYSP(inparam1, inparam2, optinparam3, optinparam4, ?, ?)CALL MYSCHEMA.MYSP(inparam1, inparam2, ?, ?, , optinparam4)Query timeout
For long-running queries and stored procedures, set the Query timeout
property to limit the number of seconds that a procedure runs for. The default value is 300 seconds.
If you set a value greater than 600 seconds, you must also edit the environment variable
$OSH_JOB_START_TIMEOUT and set a value higher than the Query
timeout.
User-defined functions
When Call procedure or Call procedure statement is selected, you can select User-defined function to specify a user-defined function to execute. A user-defined function can take input parameters and returns either a single value (for scalar functions) or a set of values (for table functions). In IBM Db2 for z/OS, user-defined functions must be called as part of a SELECT statement. In Teradata, macro-type functions are also available.
User-defined functions support both source and transform modes. For the Source mode, the data is read from the database. For the Transform mode, the data is read, transformed, and written to the target node.
-
- Scalar
- Returns a single value.
-
- Table
- Returns a table.
- For scalar user-defined
function:
SELECT “Sclr_udf_name”(@arg1,@arg2,…); - For table user-defined
function:
SELECT * FROM “table_udf_name”(@arg1,@arg2,…);