What's new and changed in DataStage
DataStage® updates can include new features, bug fixes, and security updates. Updates are listed in reverse chronological order so that the latest release is at the beginning of the topic.
You can see a list of the new features for the platform and all of the services at What's new in IBM® Cloud Pak for Data.
Installing or upgrading DataStage
Ready to install or upgrade DataStage?
Cloud Pak for Data Version 4.5.3
A new version of DataStage was released in October 2022 with Cloud Pak for Data 4.5.3.
Operand version: 4.5.3
The 4.5.3 release of DataStage includes the following features and updates:
- Stored procedures for additional connectors
- You can now use stored procedures in the following connectors:
- IBM Db2® for z/OS®
- IBM Db2 for i
- Connect to storage volumes as data sources in DataStage flows
- You can now include data from a volume on an NFS server or a persistent volume claim in your DataStage flows. For more information, see Adding connections to analytics projects and Asset browser.
- Before-job and after-job subroutines support KSH scripts and bash
- You can now use before-job and after-job subroutines to run both bash and KSH scripts. For more information, see Setting up before-job and after-job subroutines.
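A minimal after-job script might record the job outcome for downstream tooling. The following is a sketch only: the argument positions, `DSHOME` fallback, and file names are illustrative assumptions, not a documented contract; check your environment's conventions before relying on them.

```shell
#!/bin/bash
# Hypothetical after-job subroutine script. We assume (illustratively)
# that the job name and status arrive as positional arguments.
JOB_NAME="${1:-unknown_job}"
JOB_STATUS="${2:-0}"

# Write a one-line summary to a log directory (path is illustrative).
LOG_DIR="${DSHOME:-/tmp}/job_reports"
mkdir -p "$LOG_DIR"

echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) job=$JOB_NAME status=$JOB_STATUS" \
  >> "$LOG_DIR/after_job.log"

# A non-zero exit code from a subroutine script marks the step as failed,
# so scripts like this should end successfully on the happy path.
```

The same script works under ksh because it uses only POSIX-compatible constructs.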
- Share files with Watson™ Studio Pipelines
- You can now use a shared storage volume to share files between Watson Studio Pipelines and DataStage. For details, see Sharing storage volumes.
- New stage
- You can now use the Slowly Changing Dimension stage in your DataStage flows to store and manage current and
historical data over time.
For more information, see Slowly Changing Dimension stage.
- Simplify setup with the DSStageName server macro
- You can add the DSStageName macro to stage properties or use it in Transformer stage functions. The macro simplifies the setup of DataStage jobs and flows because it acts as a DataStage function that returns a value without requiring arguments. When the job compiles, "DSStageName" is replaced with the name of the
stage.
For more information, see Macros.
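For example, a Transformer output derivation might tag each row with the name of the stage that produced it. This is a sketch; the column name is illustrative, and `:` is the DataStage concatenation operator.

```
Derivation for output column source_stage (sketch):
  DSStageName : "_out"
```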
- New parameter types
- In DataStage jobs you can use two new parameter types:
- Boolean
- Use the Boolean parameter type to specify a true or false value.
- List
- Use the List parameter type to specify a list of values that are available for selection in a job.
For more information about parameters, see Creating and using parameters and parameter sets.
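Once defined, parameters of any type can be referenced in stage properties with the usual `#...#` syntax. The parameter names and values below are illustrative, not taken from a real job:

```
File name property (sketch):        #OutputDir#/daily_extract.csv
Transformer constraint (sketch):    #EnableAudit# = 'true'
```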
- Add and edit column metadata for data definitions
- In data definitions, you can now add and edit metadata properties at the column level. For
example, you can set properties such as field level, delimiter, quotation mark, and string
type.
For more information about data definitions, see Defining data definitions.
- Copy and paste subflows
- You can copy and paste shared subflows within a DataStage flow or between different DataStage flows in the same project. You can copy
and paste a subflow as part of a larger flow or as just the subflow itself.
For more information about subflows, see Subflows.
- Oracle, Snowflake, and Teradata connectors can now have multiple input links
- Previously, the Oracle, Snowflake, and Teradata connectors supported only one input link, whose properties you specified in the Stage properties.
Now the connectors can have multiple input links, and you can set the properties for each link independently. As a result, each link can perform its own action, such as read, write, or append. You can view a link's properties by switching between links on the Input tab.
- Bug fixes
- This release includes the following fixes:
- Issue: Migrated Salesforce jobs fail to run.
Resolution: This issue is now fixed.
- Issue: If a subflow is copied and pasted to another flow, an error is
thrown.
Resolution: This issue is now fixed.
- Issue: Upgrading to 4.5.2 is delayed by approximately an
hour.
Resolution: This issue is now fixed.
- Issue: The SAP Bulk Extract connector requires a value for System Number to
be provided, even when Use System Number is not checked.
Resolution: This issue is now fixed.
- Security fixes
- This release includes the following fixes:
- CVE-2016-10739
- CVE-2018-1313 CVE-2018-19591 CVE-2018-20796
- CVE-2019-16866 CVE-2019-19126 CVE-2019-25013 CVE-2019-25033 CVE-2019-6488 CVE-2019-7309 CVE-2019-9169 CVE-2019-9192
- CVE-2020-10029 CVE-2020-1751 CVE-2020-1752 CVE-2020-27618 CVE-2020-6096
- CVE-2021-33098 CVE-2021-33150 CVE-2021-3326 CVE-2021-34558 CVE-2021-35942 CVE-2021-36221 CVE-2021-38297 CVE-2021-38604 CVE-2021-39293 CVE-2021-39711 CVE-2021-39715 CVE-2021-41771 CVE-2021-41772 CVE-2021-44716
- CVE-2022-1012 CVE-2022-1586 CVE-2022-1652 CVE-2022-1785 CVE-2022-1897 CVE-2022-1927 CVE-2022-1976 CVE-2022-2097 CVE-2022-22476 CVE-2022-23218 CVE-2022-23219 CVE-2022-25375 CVE-2022-2596 CVE-2022-29622 CVE-2022-31159 CVE-2022-32206 CVE-2022-32208 CVE-2022-32250
Cloud Pak for Data Version 4.5.2
A new version of DataStage was released in August 2022 with Cloud Pak for Data 4.5.2.
Operand version: 4.5.2
The 4.5.2 release of DataStage includes the following features and updates:
- Send reports by using before/after-job subroutines
- You can now send full job reports by using the Send mail function of before/after-job subroutines.
- Support for migrating Db2 server-type data connection objects from traditional DataStage
- Traditional DataStage supports data connection objects of the type Db2 server. When you migrate these data connection objects to modern DataStage, they are automatically converted to Db2 connector objects so that you can still use them in your DataStage flows and jobs.
- New Custom stage
- You can now use the Custom stage in your DataStage flows. Use the Custom stage to process
your data according to your own specifications.
For details on the Custom stage, see Defining custom stages in DataStage.
- Use function libraries in the Transformer stage
- You can now define your own functions for use in the Transformer stage's expression editor by creating a function library asset from an SO file. For details on function libraries, see Function libraries in DataStage.
- Use new functions in the Transformer stage
- You can now use the ConvertDatum and NextValidDate functions in the Transformer stage as part of your DataStage flows.
- Migrate nested sequence jobs to Watson Pipelines
- A traditional DataStage sequence job
can contain a "Job Activity" type that can invoke another sequence job, making the second sequence
job a nested sequence job.
You can now migrate these nested sequence jobs into modern DataStage. When migrated, a sequence job in the ISX import file is converted to a Watson pipeline.
- Connect to a new data source
- You can now include data from Apache HBase in
your DataStage flows.
For the full list of DataStage connectors, see DataStage connectors.
- Kerberos authentication supported for Apache Hive connections
- You can now use the Kerberos authentication protocol to connect to an Apache Hive data source when you create the connection from the DataStage service. For information, see Apache Hive connection.
- The Db2 (optimized) connector can now have multiple input links, each with an individual action
- Previously, the Db2 (optimized) connector had only one input link, whose properties you specified in the Stage properties. Now the connector can have multiple input links, and you can set the properties for each link independently. As a result, each link can perform its own action, such as read, write, or append. You can view a link's properties by switching between links on the Input tab.
- Bug fixes
- This release includes the following fixes:
- Issue: Browser display issues occur that are caused by CSS.
Resolution: This issue is now fixed.
- Issue: When an Excel target node is added by using the asset browser, mapping is not preserved.
Resolution: This issue is now fixed.
- Issue: Excessive calls are made from the DataStage designer canvas when a new DataStage flow is created.
Resolution: This issue is now fixed.
- Issue: Validation for
function name and argument name does not work properly.
Resolution: This issue is now fixed.
- Security fixes
- This release includes the following fixes:
- CVE-2015-3627
- CVE-2016-4658, CVE-2016-5131
- CVE-2017-0663, CVE-2017-15412, CVE-2017-18258, CVE-2017-7375, CVE-2017-9047, CVE-2017-9048, CVE-2017-9049, CVE-2017-9050
- CVE-2018-20406
- CVE-2019-16775, CVE-2019-16776, CVE-2019-16777
- CVE-2020-15095, CVE-2020-1734, CVE-2020-1737
- CVE-2021-3114, CVE-2021-32760, CVE-2021-35065, CVE-2021-3583, CVE-2021-3618, CVE-2021-41103, CVE-2021-43784, CVE-2021-43816, CVE-2021-46195
- CVE-2022-1729, CVE-2022-2257, CVE-2022-23648, CVE-2022-25169, CVE-2022-25896, CVE-2022-28356, CVE-2022-29162, CVE-2022-30126, CVE-2022-31030, CVE-2022-31129, CVE-2022-33099, CVE-2022-34494
Cloud Pak for Data Version 4.5.1
A new version of DataStage was released in July 2022 with Cloud Pak for Data 4.5.1.
Operand version: 4.5.1
The 4.5.1 release of DataStage includes the following features and updates:
- Connect to more data sources in DataStage
- You can now include data from these data sources in your DataStage flows:
- Cognos® Analytics
- SAP IQ
For the full list of DataStage connectors, see DataStage connectors.
- Use new functions and features in the DataStage Transformer stage
-
- You can now use the Fold, Fmt, and Rmunprint functions in the Transformer stage as part of your DataStage flows.
- The Transformer stage now supports partitions.
- You can now use type-ahead search in the Transformer stage for functions, columns, and variables.
For the full list of available functions, see Parallel transform functions.
- Bug fixes
- This release includes the following fixes:
- Issue: Deleting a build stage does not clear all background files.
Resolution: This issue is now fixed.
- Issue: Duplicate is not supported for build stage.
Resolution: This issue is now fixed.
- Security fixes
- This release includes fixes for the following security issues:
- CVE-2017-12626, CVE-2017-5644
- CVE-2019-12415, CVE-2019-20372
- CVE-2020-15522
- CVE-2021-23017, CVE-2021-29425, CVE-2021-3629, CVE-2021-3634, CVE-2021-41771, CVE-2021-41772, CVE-2021-44716, CVE-2021-45868
- CVE-2022-1271
Cloud Pak for Data Version 4.5.0
A new version of DataStage was released in June 2022 with Cloud Pak for Data 4.5.0.
Operand version: 4.5.0
The 4.5.0 release of DataStage includes the following features and updates:
- Connect to more data sources in DataStage
- You can now include data from these data sources in your DataStage flows:
- IBM Match 360
- SAP Bulk Extract
- SAP Delta Extract
For the full list of connectors, see DataStage connectors.
- Preview and add metadata faster for Apache Kafka and Netezza® Performance Server (optimized) connectors
- After you create the connection, you can open the Asset browser from the DataStage canvas, select a connection, and drill down to add or preview the data for these connectors:
- Apache Kafka (Preview is available only if the connection has a schema registry configured.)
- Netezza Performance Server (optimized)
- Use new stages in DataStage
- You can now use the following stages in your DataStage flows:
- Two-source Match
- Compare two sources of input data (reference records and data records) for matches.
- Wrapper
- Define a Wrapped stage to specify a UNIX command that is run by another DataStage stage.
For the full list of stages, see DataStage stages.
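A Wrapped stage runs a UNIX command as a filter: rows stream in on stdin and transformed rows stream out on stdout. The following sketch shows a command shaped like that; the function name and data are illustrative, not part of the DataStage configuration itself.

```shell
# Sketch of a UNIX filter that a Wrapped stage could invoke. It reads
# rows on stdin and writes rows on stdout, matching the stream-in /
# stream-out model a Wrapped stage expects. (Name and data are
# illustrative.)
uppercase_names() {
  tr '[:lower:]' '[:upper:]'
}

# Example invocation, as the Wrapped stage would pipe data through it:
printf 'alice\nbob\n' | uppercase_names
```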
- Orchestrate flows with Watson Studio Pipelines
- Beta feature: You can now create a pipeline to run a sequence of DataStage flows. You can add conditions, loops, expressions, and scripts to a pipeline. For details, see Orchestrating flows.
This component is offered as a beta feature and must be installed separately. For details, see Watson Studio Pipelines.
- Bug fixes
- This release includes the following fixes:
- Issue: Snowflake JDBC connections fail to write data to Snowflake.
Resolution: This issue is now fixed.
- Security fixes
- This release includes fixes for the following security
issues:
- CVE-2016-4971
- CVE-2018-1000876, CVE-2018-20483
- CVE-2019-20916, CVE-2019-9923
- CVE-2020-1751, CVE-2020-1752
- CVE-2021-33503, CVE-2021-35942, CVE-2021-37322
- CVE-2022-26612