What's new and changed in Data Refinery

Important: IBM Cloud Pak® for Data Version 4.7 will reach end of support (EOS) on 31 July, 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.7 reaches end of support. For more information, see Upgrading IBM Software Hub in the IBM Software Hub Version 5.1 documentation.

Data Refinery updates can include new features, bug fixes, and security updates. Updates are listed in reverse chronological order so that the latest release is at the beginning of the topic.

You can see a list of the new features for the platform and all of the services at What's new in IBM Cloud Pak for Data?

Cloud Pak for Data Version 4.7.0

A new version of Data Refinery was released in June 2023 with Cloud Pak for Data 4.7.0.

Operand version: 7.0.0

This release includes the following changes:

New features

The 7.0.0 release of Data Refinery includes the following features and updates:

The Calculate operation works on date columns
You can now use the Calculate operation on date data type columns to add or subtract day or month values.

For more information, see GUI operations in Data Refinery.
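
The following Python sketch illustrates the kind of date arithmetic that the Calculate operation now supports: adding or subtracting day or month values. It is not Data Refinery code. The function names `add_days` and `add_months` are hypothetical, and clamping to the last day of a shorter month is an assumption of this sketch rather than documented Data Refinery behavior.

```python
import calendar
from datetime import date, timedelta


def add_days(d: date, days: int) -> date:
    """Return d shifted by a number of days (negative values subtract)."""
    return d + timedelta(days=days)


def add_months(d: date, months: int) -> date:
    """Return d shifted by a number of months (negative values subtract).

    Assumption of this sketch: if the target month is shorter than the
    source day, the result is clamped to the last day of the target month.
    """
    # Count months from year 0 so that carrying into the year is a divmod.
    index = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))
```

For example, `add_months(date(2023, 1, 31), 1)` yields 28 February 2023 under the clamping assumption, and `add_months(date(2023, 3, 15), -3)` yields 15 December 2022.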

Updates for environments for running Data Refinery flow jobs
  • The Default Spark 3.2 & R 3.6 environment was removed.
  • The Default Spark 3.3 & R 3.6 environment is deprecated and will be discontinued in a future update.
  • The Default Spark 3.3 & R 4.2 environment is now available.

    You can select Default Spark 3.3 & R 4.2 when you select an environment for a Data Refinery flow job.

If you are upgrading from a previous version of Cloud Pak for Data and your flow jobs use a discontinued environment, a deprecated environment, or a custom Spark 3.0 environment, update those jobs to use the new Default Spark 3.3 & R 4.2 environment. Also use the new environment for any new jobs.

For more information, see Data Refinery environments.

The environment change affects the following GUI operations:

  • Split
  • Tokenize

If you are upgrading from a previous version of Cloud Pak for Data and your flow jobs include these GUI operations, you must update the Data Refinery flow. To update a flow, open it and save it. For more information, see Managing Data Refinery flows.
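
If you track your flow jobs in an inventory of your own, the upgrade guidance above can be expressed as a small checklist helper. The following Python sketch is illustrative only and does not call any Cloud Pak for Data API. The function `upgrade_actions` and its parameters are hypothetical. The environment and operation names come from this topic.

```python
from typing import Iterable, List

# Environment and operation names as listed in this topic.
REMOVED_ENVIRONMENTS = {"Default Spark 3.2 & R 3.6"}
DEPRECATED_ENVIRONMENTS = {"Default Spark 3.3 & R 3.6"}
NEW_ENVIRONMENT = "Default Spark 3.3 & R 4.2"
AFFECTED_OPERATIONS = {"Split", "Tokenize"}


def upgrade_actions(
    environment: str,
    operations: Iterable[str],
    is_custom_spark_3_0: bool = False,
) -> List[str]:
    """Return the follow-up actions for one flow job after an upgrade.

    Hypothetical helper: the caller supplies the job's environment name,
    the GUI operations used in its flow, and whether the environment is
    a custom Spark 3.0 environment.
    """
    actions = []
    needs_new_environment = (
        environment in REMOVED_ENVIRONMENTS
        or environment in DEPRECATED_ENVIRONMENTS
        or is_custom_spark_3_0
    )
    if needs_new_environment:
        actions.append(f"Update the job to use {NEW_ENVIRONMENT}")
    # Flows that use Split or Tokenize must be opened and saved again.
    if AFFECTED_OPERATIONS.intersection(operations):
        actions.append("Open the flow and save it")
    return actions
```

A job that already runs on Default Spark 3.3 & R 4.2 and uses neither Split nor Tokenize returns an empty list, which means that no follow-up action is needed.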

Audit logging
Data Refinery now integrates with the Cloud Pak for Data audit logging service. Auditable events for Data Refinery flows are forwarded to the security information and event management (SIEM) solution that you integrate with.

Security fixes

This release includes fixes for the following security issues:

CVE-2023-32695, CVE-2023-31125, CVE-2023-30535, CVE-2023-27535, CVE-2023-25690, CVE-2023-23931, CVE-2023-0842, CVE-2023-0361, CVE-2023-0286

CVE-2022-43552, CVE-2022-42004, CVE-2022-42003, CVE-2022-35252, CVE-2022-27664, CVE-2022-24823, CVE-2022-23491, CVE-2022-2048, CVE-2022-2047, CVE-2022-1471

CVE-2021-46877, CVE-2021-44906, CVE-2021-37533, CVE-2021-35065, CVE-2021-33036, CVE-2021-29469

CVE-2020-36518, CVE-2020-13956

CVE-2019-0205

CVE-2018-1330

CVE-2015-5237, CVE-2015-3627