What's new and changed in DataStage

DataStage® updates can include new features, bug fixes, and security updates. Updates are listed in reverse chronological order so that the latest release is at the beginning of the topic.

You can see a list of the new features for the platform and all of the services at What's new in IBM® Cloud Pak for Data.

Cloud Pak for Data Version 4.5.3

A new version of DataStage was released in October 2022 with Cloud Pak for Data 4.5.3.

Operand version: 4.5.3

The 4.5.3 release of DataStage includes the following features and updates:

Stored procedures for additional connectors
You can now use stored procedures in the following connectors:
  • IBM Db2® for z/OS®
  • IBM Db2 for i
For more information, see Using stored procedures.
Connect to storage volumes as data sources in DataStage flows
You can now include data from a volume on an NFS server or a persistent volume claim in your DataStage flows. For more information, see Adding connections to analytics projects and Asset browser.
Before-job and after-job subroutines support bash and KSH scripts
You can now use before-job and after-job subroutines to run both bash and KSH scripts. For more information, see Setting up before-job and after-job subroutines.
Share files with Watson™ Studio Pipelines
You can now use a shared storage volume to share files between Watson Studio Pipelines and DataStage. For details, see Sharing storage volumes.
New stage
You can now use the Slowly Changing Dimension stage in your DataStage flows to store and manage current and historical data over time.

For more information, see Slowly Changing Dimension stage.
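To picture what storing current and historical data over time means, the following Python sketch applies a Type 2 style update to a small in-memory dimension table. It is an illustration only and does not show how the Slowly Changing Dimension stage is implemented. The `apply_scd2` function, the column names (`valid_from`, `valid_to`, `is_current`), and the sample data are all hypothetical.

```python
from datetime import date

def apply_scd2(dimension, incoming, key, tracked, today):
    """Return a new dimension list after a Type 2 style update.

    Hypothetical helper: rows are plain dicts that carry 'valid_from',
    'valid_to', and 'is_current' bookkeeping columns.
    """
    result = [dict(row) for row in dimension]  # work on copies
    current = {row[key]: row for row in result if row["is_current"]}
    for record in incoming:
        existing = current.get(record[key])
        if existing is not None and all(existing[c] == record[c] for c in tracked):
            continue  # nothing changed, keep the current row as is
        if existing is not None:
            # Expire the old version instead of overwriting it.
            existing["valid_to"] = today
            existing["is_current"] = False
        new_row = dict(record, valid_from=today, valid_to=None, is_current=True)
        result.append(new_row)
        current[record[key]] = new_row
    return result

dimension = [
    {"customer_id": 1, "city": "Austin", "valid_from": date(2022, 1, 1),
     "valid_to": None, "is_current": True},
]
incoming = [
    {"customer_id": 1, "city": "Boston"},   # changed attribute
    {"customer_id": 2, "city": "Chicago"},  # new key
]
updated = apply_scd2(dimension, incoming, "customer_id", ["city"], date(2022, 10, 1))
for row in updated:
    print(row["customer_id"], row["city"], row["is_current"])
```

In this sketch, the Austin row is kept as history and marked as no longer current, and the Boston and Chicago rows are added as current versions.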

Simplify setup with the DSStageName server macro
You can add the DSStageName macro to stage properties or transformer functions. The macro simplifies the setup of DataStage jobs and flows because it acts as a DataStage function that outputs data without needing arguments. When the job compiles, "DSStageName" is replaced with the name of the stage.

For more information, see Macros.
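The replacement of "DSStageName" at compile time can be pictured with a small Python sketch. This is a conceptual model only, not the DataStage compiler: the `expand_macros` function, the dictionary layout of a stage, and the property names are hypothetical.

```python
import re

def expand_macros(stage):
    """Replace the DSStageName token in every string property with the stage name.

    Hypothetical model: a stage is a dict with a 'name' and a 'properties' dict.
    """
    pattern = re.compile(r"\bDSStageName\b")
    name = stage["name"]
    expanded = {
        # A function replacement avoids treating the stage name as a regex template.
        key: pattern.sub(lambda _match: name, value) if isinstance(value, str) else value
        for key, value in stage["properties"].items()
    }
    return {"name": name, "properties": expanded}

stage = {
    "name": "Sort_Customers",
    "properties": {"reject_file": "/tmp/DSStageName.rej", "buffer_size": 4096},
}
compiled = expand_macros(stage)
print(compiled["properties"]["reject_file"])
```

Because the token resolves to the stage's own name, the same property value can be reused in several stages without editing it for each one.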

New parameter types
In DataStage jobs you can use two new parameter types:
Boolean
Use the Boolean parameter type to specify a true or false value.
List
Use the List parameter type to specify a list of values that are available for selection in a job.

For more information about parameters, see Creating and using parameters and parameter sets.
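To make the two types concrete, here is a minimal Python sketch of how a Boolean value and a List selection might be validated before a job run. The classes and parameter names are hypothetical and are not a DataStage API. Falling back to the first listed value when nothing is selected is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class BooleanParameter:
    name: str
    default: bool = False

    def resolve(self, value: Optional[bool] = None) -> bool:
        if value is None:
            return self.default
        if not isinstance(value, bool):
            raise ValueError(f"{self.name} must be true or false")
        return value

@dataclass(frozen=True)
class ListParameter:
    name: str
    choices: Tuple[str, ...]

    def resolve(self, value: Optional[str] = None) -> str:
        # Assumption: use the first listed value when nothing is selected.
        selected = self.choices[0] if value is None else value
        if selected not in self.choices:
            raise ValueError(f"{self.name} must be one of {self.choices}")
        return selected

truncate = BooleanParameter("truncate_target", default=False)
region = ListParameter("region", ("emea", "amer", "apac"))
run_values = {
    "truncate_target": truncate.resolve(True),
    "region": region.resolve("apac"),
}
print(run_values)
```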

Add and edit column metadata for data definitions
In data definitions, you can now add and edit metadata properties at the column level. For example, you can set properties such as field level, delimiter, quotation mark, and string type.

For more information about data definitions, see Defining data definitions.

Copy and paste subflows
You can copy and paste shared subflows within a DataStage flow or between different DataStage flows in the same project. You can copy and paste a subflow as part of a larger flow or as just the subflow itself.

For more information about subflows, see Subflows.

Oracle, Snowflake, and Teradata connectors can now have multiple input links
Previously, the Oracle, Snowflake, and Teradata connectors had only one input link, and you specified the link's properties in the Stage properties.

Now, the connectors can have multiple input links, and each link can have different properties. Therefore, each link can have an individual action, such as read, write, or append. You can view the properties by switching the links in the Input tab.

Bug fixes
This release includes the following fixes:
  • Issue: Migrated Salesforce jobs fail to run.

    Resolution: This issue is now fixed.

  • Issue: If a subflow is copied and pasted to another flow, an error is thrown.

    Resolution: This issue is now fixed.

  • Issue: Upgrading to 4.5.2 is delayed by approximately an hour.

    Resolution: This issue is now fixed.

  • Issue: The SAP Bulk Extract connector requires a value for System Number to be provided, even when Use System Number is not checked.

    Resolution: This issue is now fixed.

Security fixes
This release includes the following fixes:
  • CVE-2016-10739
  • CVE-2018-1313, CVE-2018-19591, CVE-2018-20796
  • CVE-2019-16866, CVE-2019-19126, CVE-2019-25013, CVE-2019-25033, CVE-2019-6488, CVE-2019-7309, CVE-2019-9169, CVE-2019-9192
  • CVE-2020-10029, CVE-2020-1751, CVE-2020-1752, CVE-2020-27618, CVE-2020-6096
  • CVE-2021-33098, CVE-2021-33150, CVE-2021-3326, CVE-2021-34558, CVE-2021-35942, CVE-2021-36221, CVE-2021-38297, CVE-2021-38604, CVE-2021-39293, CVE-2021-39711, CVE-2021-39715, CVE-2021-41771, CVE-2021-41772, CVE-2021-44716
  • CVE-2022-1012, CVE-2022-1586, CVE-2022-1652, CVE-2022-1785, CVE-2022-1897, CVE-2022-1927, CVE-2022-1976, CVE-2022-2097, CVE-2022-22476, CVE-2022-23218, CVE-2022-23219, CVE-2022-25375, CVE-2022-2596, CVE-2022-29622, CVE-2022-31159, CVE-2022-32206, CVE-2022-32208, CVE-2022-32250

Cloud Pak for Data Version 4.5.2

A new version of DataStage was released in August 2022 with Cloud Pak for Data 4.5.2.

Operand version: 4.5.2

The 4.5.2 release of DataStage includes the following features and updates:

Send reports by using before/after-job subroutines
You can now send full job reports by using the Send mail function of before/after-job subroutines.
Support for migrating Db2 server-type data connection objects from traditional DataStage
Traditional DataStage supports data connection objects of the type Db2 server. When you migrate these data connection objects to modern DataStage, they are automatically converted to Db2 connector objects so that you can still use them in your DataStage flows and jobs.
New Custom stage
You can now use the Custom stage in your DataStage flows. Use the Custom stage to process your data according to your own specifications.

For details on the Custom stage, see Defining custom stages in DataStage.

Use function libraries in the Transformer stage
You can now define your own functions for use in the Transformer stage's expression editor by creating a function library asset from an SO file. For details on function libraries, see Function libraries in DataStage.
Use new functions in the Transformer stage
You can now use the ConvertDatum and NextValidDate functions in the Transformer stage as part of your DataStage flows.

For the full list of available functions, see Parallel transform functions.

Migrate nested sequence jobs to Watson Pipelines
A traditional DataStage sequence job can contain a "Job Activity" type that can invoke another sequence job, making the second sequence job a nested sequence job.

You can now migrate these nested sequence jobs into modern DataStage. When migrated, a sequence job in the ISX import file is converted to a Watson pipeline.

Connect to a new data source
You can now include data from Apache HBase in your DataStage flows.

For the full list of DataStage connectors, see DataStage connectors.

Kerberos authentication supported for Apache Hive connections
You can now use the Kerberos authentication protocol to connect to an Apache Hive data source when you create the connection from the DataStage service. For information, see Apache Hive connection.
The Db2 (optimized) connector can now have multiple input links, each with an individual action
Previously, the Db2 (optimized) connector had only one input link, and you specified the link's properties in the Stage properties. Now, the connector can have multiple input links, and each link can have different properties. This enhancement means that each link can have an individual action, such as read, write, or append. You can view the properties by switching the links in the Input tab.
Bug fixes
This release includes the following fixes:
  • Issue: CSS causes issues in the browser.

    Resolution: This issue is now fixed.

  • Issue: When an Excel target node is added by using the asset browser, mapping is not preserved.

    Resolution: This issue is now fixed.

  • Issue: Excessive calls are being made from the DataStage designer canvas when a new DataStage flow is created.

    Resolution: This issue is now fixed.

  • Issue: Validation for function name and argument name does not work properly.

    Resolution: This issue is now fixed.

Security fixes
This release includes the following fixes:
  • CVE-2015-3627
  • CVE-2016-4658, CVE-2016-5131
  • CVE-2017-0663, CVE-2017-15412, CVE-2017-18258, CVE-2017-7375, CVE-2017-9047, CVE-2017-9048, CVE-2017-9049, CVE-2017-9050
  • CVE-2018-20406
  • CVE-2019-16775, CVE-2019-16776, CVE-2019-16777
  • CVE-2020-15095, CVE-2020-1734, CVE-2020-1737
  • CVE-2021-3114, CVE-2021-32760, CVE-2021-35065, CVE-2021-3583, CVE-2021-3618, CVE-2021-41103, CVE-2021-43784, CVE-2021-43816, CVE-2021-46195
  • CVE-2022-1729, CVE-2022-2257, CVE-2022-23648, CVE-2022-25169, CVE-2022-25896, CVE-2022-28356, CVE-2022-29162, CVE-2022-30126, CVE-2022-31030, CVE-2022-31129, CVE-2022-33099, CVE-2022-34494

Cloud Pak for Data Version 4.5.1

A new version of DataStage was released in July 2022 with Cloud Pak for Data 4.5.1.

Operand version: 4.5.1

The 4.5.1 release of DataStage includes the following features and updates:

Connect to more data sources in DataStage
You can now include data from these data sources in your DataStage flows:
  • Cognos® Analytics
  • SAP IQ

For the full list of DataStage connectors, see DataStage connectors.

Use new functions and features in the DataStage Transformer stage
  • You can now use the Fold, Fmt, and Rmunprint functions in the Transformer stage as part of your DataStage flows.
  • The Transformer stage now supports partitions.
  • You can now use type-ahead search in the Transformer stage for functions, columns, and variables.

For the full list of available functions, see Parallel transform functions.

Bug fixes
This release includes the following fixes:
  • Issue: Deleting a build stage does not clear all background files.

    Resolution: This issue is now fixed.

  • Issue: Duplicate is not supported for build stage.

    Resolution: This issue is now fixed.

Security fixes
This release includes fixes for the following security issues:

CVE-2017-12626, CVE-2017-5644

CVE-2019-12415, CVE-2019-20372

CVE-2020-15522

CVE-2021-23017, CVE-2021-29425, CVE-2021-3629, CVE-2021-3634, CVE-2021-41771, CVE-2021-41772, CVE-2021-44716, CVE-2021-45868

CVE-2022-1271

Cloud Pak for Data Version 4.5.0

A new version of DataStage was released in June 2022 with Cloud Pak for Data 4.5.0.

Operand version: 4.5.0

The 4.5.0 release of DataStage includes the following features and updates:

Connect to more data sources in DataStage
You can now include data from these data sources in your DataStage flows:
  • IBM Match 360
  • SAP Bulk Extract
  • SAP Delta Extract

For the full list of connectors, see DataStage connectors.

Preview and add metadata faster for Apache Kafka and Netezza® Performance Server (optimized) connectors
After you create the connection, you can drag the Asset browser to the DataStage canvas, select a connection, and drill down to add or preview the data for these connectors:
  • Apache Kafka (Preview is available only if the connection has a schema registry configured.)
  • Netezza Performance Server (optimized)
Use new stages in DataStage
You can now use the following stages in your DataStage flows:
Two-source Match
Compare two sources of input data (reference records and data records) for matches.
Wrapper
Define a Wrapped stage to specify a UNIX command that is run by another DataStage stage.

For the full list of stages, see DataStage stages.
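The idea behind the Wrapper stage, running a UNIX command over data as one step of a flow, can be sketched in Python as follows. This is an analogy only and does not show how DataStage runs wrapped commands. The `run_wrapped` helper is hypothetical, and the example assumes a UNIX-like system where the `sort` command is available.

```python
import subprocess

def run_wrapped(command, rows):
    """Feed rows to a UNIX command on stdin and return its stdout lines.

    Hypothetical helper: each row is one line of text.
    """
    completed = subprocess.run(
        command,
        input="\n".join(rows) + "\n",
        capture_output=True,
        text=True,
        check=True,  # raise if the command exits with an error
    )
    return completed.stdout.splitlines()

# Sort three rows in reverse order by piping them through `sort -r`.
sorted_rows = run_wrapped(["sort", "-r"], ["apple", "cherry", "banana"])
print(sorted_rows)
```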

Orchestrate flows with Watson Studio Pipelines
Beta feature: You can now create a pipeline to run a sequence of DataStage flows. You can add conditions, loops, expressions, and scripts to a pipeline. For details, see Orchestrating flows.

This component is offered as a beta feature and must be installed separately. For details, see Watson Studio Pipelines.

Customer-reported issues fixed in this release
Bug fixes
This release includes the following fixes:
  • Issue: Snowflake JDBC connections fail to write data to Snowflake.

    Resolution: This issue is now fixed.

Security fixes
This release includes fixes for the following security issues:

CVE-2016-4971

CVE-2018-1000876, CVE-2018-20483

CVE-2019-20916, CVE-2019-9923

CVE-2020-1751, CVE-2020-1752

CVE-2021-33503, CVE-2021-35942, CVE-2021-37322

CVE-2022-26612