What's new in IBM Cloud Pak for Data?

See what new features and improvements are available in the latest release of IBM® Cloud Pak for Data.

Refresh 2 of Version 4.0

Released: October 2021

This refresh of Cloud Pak for Data introduces support for Red Hat® OpenShift® Container Platform Version 4.8 and for Red Hat OpenShift Container Platform clusters running on Power® and s390x (IBM Z® and LinuxONE) hardware.

This refresh also includes the initial release of the following services on Cloud Pak for Data Version 4.0:
  • Informix®
  • Watson™ Knowledge Studio

In addition, this refresh introduces a new OADP backup and restore utility.

Software Version What does it mean for me?
Cloud Pak for Data platform

(Provided by the IBM Cloud Pak® for Data platform operator)

  • CASE: 2.0.5
  • Operator: 2.0.4
  • Operand: 4.0.2
The 4.0.2 release of the Cloud Pak for Data platform includes the following features and updates:
Support for Power and s390x hardware
You can now install Cloud Pak for Data on Red Hat OpenShift Container Platform clusters running on:
  • s390x (z14 or later)
  • Power

Not all services support these environments. For a list of services that can be installed on Power or s390x hardware, see Hardware requirements.

Support for Red Hat OpenShift Container Platform Version 4.8
You can now install Cloud Pak for Data on Red Hat OpenShift Container Platform Version 4.8.

Keep in mind that Version 4.8 is a non-EUS release and goes out of full support on 26 November, 2021 (maintenance support ends on 26 January, 2022). If you decide to install Cloud Pak for Data on OpenShift 4.8, your cluster should continue to function as originally intended after the release goes out of support. For details, see Red Hat OpenShift Container Platform for Cloud Pak for Data.

Version 4.0.2 of the Cloud Pak for Data platform includes various fixes.

Cloud Pak for Data command-line interface (cpd-cli) 10.0.1
The 10.0.1 release of the Cloud Pak for Data command-line interface includes the following features and updates:
New backup and restore utility
The OpenShift APIs for Data Protection (OADP) backup and restore utility enables you to seamlessly back up and restore your entire Cloud Pak for Data deployment. With the OADP backup and restore utility, you can back up:
  • The persistent volume claims (PVCs) that are associated with a set of Red Hat OpenShift Container Platform projects (Kubernetes namespaces)
  • The Kubernetes and OpenShift metadata associated with your deployment

You can restore your Cloud Pak for Data deployment into the same OpenShift cluster or to a new or secondary cluster.

Restriction: This utility is supported only on Linux x86-64.

For details, see Backing up and restoring your deployment.
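The restriction above can be expressed as a small preflight check. This is only a sketch: it assumes the restriction applies to the machine where the check runs, and the function name `oadp_utility_supported` is hypothetical, not part of cpd-cli or the OADP utility.

```python
import platform


def oadp_utility_supported(system: str, machine: str) -> bool:
    """Return True only for Linux on x86-64, per the restriction above.

    Hypothetical helper: the name and the accepted machine strings are
    assumptions made for illustration.
    """
    return system == "Linux" and machine.lower() in {"x86_64", "amd64"}


if __name__ == "__main__":
    # platform.system() and platform.machine() describe the local machine.
    ok = oadp_utility_supported(platform.system(), platform.machine())
    print("supported" if ok else "not supported")
```

Passing the system and machine names as arguments keeps the check testable on any platform.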

Cloud Pak for Data common core services
  • CASE: 1.0.2
  • Operator: 1.0.2
  • Operand: 4.0.2
The 4.0.2 release of the common core services includes changes to support features and updates in Watson Studio and Watson Knowledge Catalog.
The common core services include the following features and updates:
Create analytics projects with default Git integration
In projects with default Git integration, you always have your own view of the project based on the contents of your local Git clone.

This feature enables you to associate the same Git repository with different analytics projects, even if the projects are associated with different instances of Cloud Pak for Data. For details, see Projects with default Git integration.

Support on s390x hardware
You can create, train, and deploy machine learning models on Cloud Pak for Data on IBM Z and LinuxONE. However, deployments on this hardware support a limited set of features.

For a list of the features that are available on IBM Z and LinuxONE, see Capabilities on IBM Z.

If you install or upgrade a service that requires the common core services, the common core services will also be installed or upgraded.

Version 4.0.2 of the common core services includes various fixes. For details, see What's new and changed in the common core services.

Cloud Pak for Data scheduling service
  • CASE: 1.2.3
  • Operator: 1.2.3
  • Operand: 1.2.3

Version 1.2.3 of the scheduling service includes various fixes. For details, see What's new and changed in the scheduling service.

Analytics Engine Powered by Apache Spark
  • CASE: 4.0.2
  • Operator: 1.0.2
  • Operand: 4.0.2

Version 4.0.2 of the Analytics Engine Powered by Apache Spark includes various fixes. For details, see What's new and changed in Analytics Engine Powered by Apache Spark.

Related documentation:
Analytics Engine Powered by Apache Spark
Cognos® Analytics
  • CASE: 4.0.4
  • Operator: 4.0.2
  • Operand: 4.0.2
The 4.0.2 release of Cognos Analytics includes the following features and updates:
Support for upgrade
You can now upgrade Cognos Analytics from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5 to version 4.0.2
Email notifications
You can now notify people by email using either a report task or an email task. For details, see Notification methods in the Cognos Analytics product documentation.

Version 4.0.2 of the Cognos Analytics service includes various fixes.

Related documentation:
Cognos Analytics
Cognos Dashboards
  • CASE: 2.0.2
  • Operator: 1.0.2
  • Operand: 4.0.2
The 4.0.2 release of Cognos Dashboards includes the following features and updates:
Support for Power
You can install Cognos Dashboards on a Red Hat OpenShift Container Platform cluster running on Power hardware.
Related documentation:
Cognos Dashboards
Data Refinery
  • CASE: 1.0.2
  • Operator: 1.0.2
  • Operand: 4.0.2

Version 4.0.2 of the Data Refinery service includes various fixes. For details, see What's new and changed in Data Refinery.

Related documentation:
Data Refinery
Data Virtualization
  • CASE: 1.7.2
  • Operator: 1.7.2
  • Operand: 1.7.2
The 1.7.2 release of Data Virtualization includes the following features and updates:
Support for upgrade
You can now upgrade Data Virtualization from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.1
Back up and restore
Data Virtualization now supports backup and restore with the new OADP backup and restore utility. For details, see Backing up and restoring your deployment.
Enhancements for user groups support
This release includes the following enhancements for user groups support:
  • Schemas with CREATEIN privilege for a group are included in the list of schemas you can specify when you virtualize an object.
  • Data protection rules that are defined on user groups are enforced.
Enhancements for Excel support
The Excel source wrapper in Data Virtualization now allows access to spreadsheets of unlimited size. For details, see Creating a virtualized table from files.
New tutorial for using remote connectors
In this tutorial, you learn how to improve performance for your data virtualization data sources with remote connectors. For details, see Improve performance for your Data Virtualization data sources with remote connectors.

Version 1.7.2 of the Data Virtualization service includes various fixes. For details, see What's new and changed in Data Virtualization.

Related documentation:
Data Virtualization
DataStage®
  • CASE: 4.0.3
  • Operator: 1.0.0
  • Operand: 4.0.2
In the 4.0.2 release, the DataStage service has been redesigned and modernized.

With DataStage, you can design and run data flows that move and transform data anywhere, at any scale.

No matter how complex your data landscape, DataStage can streamline your data movement costs and increase productivity. DataStage offers:
  • A best-in-breed parallel processing engine that enables you to process your data where it resides
  • Automated job design
  • Simple integration with cloud data lakes, real-time data sources, relational databases, big data, and NoSQL data stores

The DataStage service uses Cloud Pak for Data platform connections and integration points, with services like Data Virtualization, to simplify the process of connecting to and accessing your data.

With DataStage, Data Engineers can use the simple user interface to build no-code/low-code data pipelines. The interface offers hundreds of functions and connectors that reduce development time and inconsistencies across pipelines. The interface also makes it easy to collaborate with your peers and control access to specific analytics projects.

The service also provides automatic workload balancing to provide high performance pipelines that make efficient use of available compute resources.

Related documentation:
DataStage
Db2®
  • CASE: 4.0.3
  • Operator: 1.0.3
  • Operand: 4.0.3
The 4.0.3 release of Db2 includes the following features and updates:
Support for gathering diagnostic data
When you request support from IBM for the Db2 service, you can use the Support > Diagnostics feature to collect the diagnostic logs for the service. Ensure that you select Db2 when you create a new diagnostic job. For details, see Gathering diagnostic information.
Separate storage for temporary table spaces
When you deploy the Db2 service, you can now specify a separate storage area for temporary table spaces on the Db2 pod to reduce I/O bottlenecks and improve performance.
Version 4.0.3 of the Db2 service includes various fixes.
Related documentation:
Db2
Db2 Big SQL
  • CASE: 7.2.2
  • Operator: 7.2.2
  • Operand: 7.2.2
The 7.2.2 release of Db2 Big SQL includes the following features and updates:
Support for upgrade
You can now upgrade Db2 Big SQL on Cloud Pak for Data Version 4.0.1 to 4.0.2.
Back up and restore
Db2 Big SQL now supports backup and restore with the new OADP backup and restore utility. For details, see Backing up and restoring your deployment.
Version 7.2.2 of the Db2 Big SQL service includes various fixes.
Related documentation:
Db2 Big SQL
Db2 Data Gate
  • CASE: 4.0.2
  • Operator: 2.0.2
  • Operand: 2.0.2
The 2.0.2 release of Db2 Data Gate includes the following features and updates:
Support on s390x hardware
You can install Db2 Data Gate on a Red Hat OpenShift Container Platform cluster running on s390x hardware (IBM Z and LinuxONE).
Data compression for row-based Db2 target databases
When you add a table to a Db2 target database, Db2 Data Gate creates the corresponding backend table target and selects a compression algorithm to be applied to the data when the table is loaded or synchronized. The data compression saves storage space and improves the overall performance.
Version 2.0.2 of the Db2 Data Gate service includes various fixes.
Related documentation:
Db2 Data Gate
Db2 Data Management Console
  • CASE: 4.0.2
  • Operator: 1.0.2
  • Operand: 3.1.5.2
The 3.1.5.2 release of Db2 Data Management Console includes the following features and updates:
Support on s390x hardware
You can install Db2 Data Management Console on a Red Hat OpenShift Container Platform cluster running on s390x hardware (IBM Z and LinuxONE).

Version 3.1.5.2 of the Db2 Data Management Console service includes various fixes. For details, see What's new and changed in Db2 Data Management Console.

Related documentation:
Db2 Data Management Console
Db2 Warehouse
  • CASE: 4.0.3
  • Operator: 1.0.3
  • Operand: 4.0.3
The 4.0.3 release of Db2 Warehouse includes the following features and updates:
Support for gathering diagnostic data
When you request support from IBM for the Db2 Warehouse service, you can use the Support > Diagnostics feature to collect the diagnostic logs for the service. Ensure that you select Db2 Warehouse when you create a new diagnostic job. For details, see Gathering diagnostic information.
Separate storage for temporary table spaces
When you deploy the Db2 Warehouse service, you can now specify a separate storage area for temporary table spaces on the Db2 Warehouse pod to reduce I/O bottlenecks and improve performance.
Version 4.0.3 of the Db2 Warehouse service includes various fixes.
Related documentation:
Db2 Warehouse
Decision Optimization
  • CASE: 4.0.2
  • Operator: 4.0.2
  • Operand: 4.0.2
The 4.0.2 release of Decision Optimization includes the following features and updates:
Support for Git-based projects
You can now use Decision Optimization experiments in Watson Studio Git-based projects.
Python 3.8
Python 3.8 is now the default version for Decision Optimization experiments. Python 3.7 is deprecated but still supported.
Version 4.0.2 of the Decision Optimization service includes various fixes. For details, see What's new and changed in Decision Optimization.
Related documentation:
Decision Optimization
EDB Postgres
  • CASE: 4.0.2
  • Operators:
    • PostgreSQL (third party): 1.8.0
    • EDB Postgres (IBM): 4.0.2
  • Operand: 12.8
In the 4.0.2 release, the EDB Postgres service includes the following features and updates:
Updated PostgreSQL operator
The Cloud Native PostgreSQL operator, which is packaged with EDB Postgres, is now at Version 1.8.0.
Execution Engine for Apache Hadoop
  • CASE: 4.0.2
  • Operator: 1.0.2
  • Operand: 4.0.2
The 4.0.2 release of Execution Engine for Apache Hadoop includes the following features and updates:
Support on s390x hardware
You can train and deploy machine learning models on Cloud Pak for Data on IBM Z and LinuxONE. However, deployments on this hardware support a limited set of features.

For a list of the features that are available on IBM Z and LinuxONE, see Capabilities on IBM Z.

Version 4.0.2 of the Execution Engine for Apache Hadoop service includes various fixes. For details, see What's new and changed in Execution Engine for Apache Hadoop.
Related documentation:
Execution Engine for Apache Hadoop
IBM Match 360
  • CASE: 1.0.115
  • Operator: 1.1.90
  • Operand: 1.1.90
The 1.1.90 release of IBM Match 360 includes the following features and updates:
Back up and restore
IBM Match 360 now supports backup and restore with the new OADP backup and restore utility. For details, see Backing up and restoring your deployment.
Version 1.1.90 of the IBM Match 360 service includes various fixes. For details, see What's new and changed in IBM Match 360.
Related documentation:
IBM Match 360 with Watson
Informix
  • CASE:
    • Install: 4.0.2
    • Deployment: 4.0.2
  • Operator:
    • Install: 4.0.0
    • Deployment: 4.0.0
  • Operand: 4.0.0
Informix is now available on Cloud Pak for Data 4.0.

The 4.0.0 release of Informix includes the following features and updates:

Improved resilience
The Informix service is more resilient. The service has been rearchitected to use microservices for better resiliency and more flexibility. This release also introduces a highly available MACH11 cluster with a primary server and an HDR secondary server as part of the default installation. You can optionally specify up to eight RS secondary servers. For details, see Creating a database deployment on the cluster.
Enhanced user experience
The Informix service automatically integrates with the Informix Data Management Console, which enables you to manage and monitor Informix databases that are deployed on Cloud Pak for Data from a single user interface. For details, see Viewing Informix Data Management Console.
Integration with Certificate manager
Informix leverages the Certificate manager (provided by IBM Cloud Pak foundational services) to generate TLS certificates for secure internal communications between pods.

Version 4.0.0 of the Informix service includes various fixes. For details, see Fix list for Informix Server 14.10.xC6 release.

Related documentation:
Informix
MongoDB
  • CASE: 4.0.2
  • Operators:
    • MongoDB Enterprise (third party): 1.12.0
    • MongoDB (IBM): 4.0.2
  • Operand: 4.2.6 or 4.4.0
Version 4.2.6 and Version 4.4.0 of the MongoDB service include various fixes.
Related documentation:
MongoDB
OpenPages®
  • CASE: 2.0.2 *
  • Operator: 8.203.2
  • Operand: 8.203.2
The 8.203.2 release of OpenPages includes the following features and updates:
Back up and restore
OpenPages now supports backup and restore with the new OADP backup and restore utility. For details, see Backing up and restoring your deployment.

Version 8.203.2 of the OpenPages service includes various fixes.

* The CASE package number includes additional metadata. However, only the base version information is shown.

Related documentation:
OpenPages
Planning Analytics
  • CASE: 4.0.2
  • Operator: 4.0.2
  • Operand: 4.0.2
A new refresh of Planning Analytics is available on Cloud Pak for Data Version 4.0.
Related documentation:
Planning Analytics
Product Master
  • CASE: 1.0.1
  • Operator: 1.0.1
  • Operand: 1.0.1
The 1.0.1 release of Product Master includes the following features and updates:
Support for upgrade
You can now upgrade Product Master on Cloud Pak for Data Version 4.0.1 to 4.0.2.

Version 1.0.1 of the Product Master service includes various fixes. For details, see What's new and changed in Product Master.

Related documentation:
Product Master
RStudio® Server with R 3.6
  • CASE: 1.0.2
  • Operator: 1.0.2
  • Operand: 4.0.2
The 4.0.2 release of RStudio Server with R 3.6 includes the following features and updates:
Support for Anaconda Repository for IBM Cloud Pak for Data
You can use software packages and libraries from Anaconda Repository for IBM Cloud Pak for Data in RStudio Server with R 3.6. See Using libs from Anaconda Repository.
Version 4.0.2 of the RStudio Server with R 3.6 service includes various fixes. For details, see What's new and changed in RStudio Server with R 3.6.
Related documentation:
RStudio Server with R 3.6
SPSS® Modeler
  • CASE: 1.0.2
  • Operator: 4.0.2
  • Operand: 4.0.2
Version 4.0.2 of the SPSS Modeler service includes various fixes. For details, see What's new and changed in SPSS Modeler.
Related documentation:
SPSS Modeler
Voice Gateway
  • CASE: 1.0.3
  • Operator: 1.0.3
  • Operand: 1.0.7
A new refresh of Voice Gateway is available on Cloud Pak for Data Version 4.0. This refresh includes fixes to image security vulnerabilities.
Related documentation:
Voice Gateway
Watson Assistant
  • CASE: 4.0.2
  • Operator: 4.0.2
  • Operand: 4.0.2
The 4.0.2 release of Watson Assistant includes the following features and updates:
Integration with the Cloud Pak for Data auditing service
Watson Assistant integrates with the Cloud Pak for Data auditing service feature, providing standard auditing records for important lifecycle and security events. The service generates audit records for events such as intent edits, entity creation, dialog node deletion, and more.
Related documentation:
Watson Assistant
Watson Discovery
  • CASE: 4.0.2
  • Operator: 4.0.2
  • Operand: 4.0.2
For information about what's new in the 4.0.2 refresh of the Watson Discovery service, see the release notes.
Related documentation:
Watson Discovery
Watson Knowledge Catalog
  • CASE: 4.0.2
  • Operator: 1.0.2
  • Operand: 4.0.2
The 4.0.2 release of Watson Knowledge Catalog includes the following features and updates:
Support for Power
You can install Watson Knowledge Catalog on a Red Hat OpenShift Container Platform cluster running on Power hardware.
Data discovery from Netezza® sources
You can run automated discovery and quick scan jobs on Netezza data sources by using a generic JDBC platform connection. For information on creating platform connections, see Connecting to data sources at the platform level.
Usability improvement for editing governance artifacts
When you edit a governance artifact property and select an artifact, you can display basic information for the selected artifact in the same edit panel.
Support for new connection type
Watson Knowledge Catalog can now connect to Amazon RDS for Oracle.
Support for additional data sources
Metadata import now supports the following data sources:
  • Databases for MongoDB
  • MongoDB
  • SAP HANA
Version 4.0.2 of the Watson Knowledge Catalog service includes various fixes. For details, see What's new and changed in Watson Knowledge Catalog.
Related documentation:
Watson Knowledge Catalog
Watson Knowledge Studio
  • CASE: 4.0.2
  • Operator: 4.0.2
  • Operand: 4.0.2
Watson Knowledge Studio is now available on Cloud Pak for Data Version 4.0.
Related documentation:
Watson Knowledge Studio
Watson Machine Learning
  • CASE: 4.0.3
  • Operator: 1.1.1
  • Operand: 4.0.2
The 4.0.2 release of Watson Machine Learning includes the following features and updates:
Support for new SPSS data source
You can now use data from a Netezza database as input for SPSS model deployments. For details, see Batch deployment details.
Support on s390x hardware
You can train and deploy machine learning models on Cloud Pak for Data on IBM Z and LinuxONE. However, deployments on this hardware support a limited set of features.

For a list of the features that are available on IBM Z and LinuxONE, see Capabilities on IBM Z.

Enhanced capabilities to import content into a space
For details, see Importing spaces and projects into existing deployment spaces.
Support for Spark 3.0 frameworks
For details, see Supported frameworks.
Create deployment jobs for scripts from space Assets page
For details on creating jobs for Python and R scripts without having to create a batch job, see Managing jobs.
Support for new data sources for AutoAI
For details on connecting to data files such as Parquet files in Cloud Object Storage (S3), see AutoAI overview.
Version 4.0.2 of the Watson Machine Learning service includes various fixes. For details, see What's new and changed in Watson Machine Learning.
Related documentation:
Watson Machine Learning
Watson Machine Learning Accelerator
  • CASE: 2.3.2
  • Operator: 1.0.2
  • Operand: 2.3.2
The 2.3.2 release of Watson Machine Learning Accelerator includes the following features and updates:
Support for new libraries
Watson Machine Learning Accelerator includes support for new deep learning libraries. For details, see Supported deep learning frameworks in the Watson Machine Learning Accelerator documentation.
Version 2.3.2 of the Watson Machine Learning Accelerator service includes various fixes.
Related documentation:
Watson Machine Learning Accelerator
Watson OpenScale
  • CASE: 2.2.0
  • Operator: 1.2.0
  • Operand: 4.0.2
The 4.0.2 release of Watson OpenScale includes the following features and updates:
Evaluating custom metrics
You can configure an evaluation of custom metrics to validate your custom monitor thresholds. You can view the results from the Insights dashboard.
Version 4.0.2 of the Watson OpenScale service includes various fixes.
Related documentation:
Watson OpenScale
Watson Studio
  • CASE: 2.0.2
  • Operator: 2.0.2
  • Operand: 4.0.2
The 4.0.2 release of Watson Studio includes the following features and updates:
Create analytics projects with default Git integration
In projects with default Git integration, you always have your own view of the project based on the contents of your local Git clone.

This feature enables you to associate the same Git repository with different analytics projects, even if the projects are associated with different instances of Cloud Pak for Data. For details, see Projects with default Git integration.

Support for Microsoft Azure repositories
If you create an analytics project with default Git integration, you can optionally associate the project with a repository in Microsoft Azure. For details, see Accessing a Git repository.
CPDCTL support for code packages
You can use CPDCTL to:
  • Promote a code package (ZIP archive file) to a space. (When the package is promoted to the space, it is registered as a code package asset.)
  • Create a job to run the files in the code package asset.
For details, see Code packages.
Support on s390x hardware
You can create, train, and deploy machine learning models on Cloud Pak for Data on IBM Z and LinuxONE. However, deployments on this hardware support a limited set of features.

For a list of the features that are available on IBM Z and LinuxONE, see Capabilities on IBM Z.

Support for new connection type
Watson Studio can now connect to Amazon RDS for Oracle.
Version 4.0.2 of the Watson Studio service includes various fixes. For details, see What's new and changed in Watson Studio.
Related documentation:
Watson Studio
Watson Studio Runtimes
  • CASE: 1.0.2
  • Operator: 1.0.2
  • Operand: 4.0.2
The 4.0.2 release of Watson Studio Runtimes includes the following features and updates:
Support on s390x hardware
You can run Jupyter Notebooks with Python 3.8 on IBM Z and LinuxONE.

For a list of the features that are available on IBM Z and LinuxONE, see Capabilities on IBM Z.

Version 4.0.2 of the Watson Studio Runtimes service includes various fixes.
Related documentation:
Jupyter Notebook runtimes for Watson Studio

Refresh 1 of Version 4.0

Released: August 2021

This refresh of Cloud Pak for Data introduces support for upgrades from:
  • Cloud Pak for Data Version 3.5
  • Cloud Pak for Data Version 4.0
Restriction: Not all services support upgrade from both versions. Information about which services support upgrade from Version 3.5 and Version 4.0 is listed in the preceding overviews.
This refresh also includes the initial release of the following services on Cloud Pak for Data Version 4.0:
  • Voice Gateway
  • Watson Assistant
  • Watson Discovery
  • Watson Speech to Text
  • Watson Text to Speech
Software Version What does it mean for me?
Cloud Pak for Data platform

(Provided by the IBM Cloud Pak for Data platform operator)

  • CASE: 2.0.3
  • Operator: 2.0.3
  • Operand: 4.0.1
The 4.0.1 release of the Cloud Pak for Data platform includes the following features and updates:
Support for upgrade
You can now upgrade your Cloud Pak for Data installation:
  • If you are running Cloud Pak for Data Version 3.0.1, you must upgrade your installation to Version 3.5 before you can upgrade to Version 4.0.x. For details on upgrading to Version 3.5, see Upgrading IBM Cloud Pak for Data control plane in the Version 3.5 documentation.
  • If you are running Cloud Pak for Data Version 3.5.3, you can upgrade to Cloud Pak for Data Version 4.0.1 or later refreshes.
    Important: If you are running an earlier refresh of Cloud Pak for Data Version 3.5, you must upgrade to Version 3.5.3 or later before you upgrade to 4.0.x.
  • If you are running Cloud Pak for Data Version 4.0.x, you can upgrade to the latest refresh.

After you upgrade Cloud Pak for Data, upgrade the services on your cluster.
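
The upgrade rules above can be read as a small decision procedure. The following is a minimal sketch of it, not an IBM tool: the function names and step strings are hypothetical. For Version 3.0.1, it assumes that the intermediate 3.5 installation should already be at refresh 3.5.3 or later, which combines the first rule with the Important note.

```python
from typing import List, Tuple

# Step descriptions paraphrase the upgrade rules listed above.
TO_35 = "Upgrade to Version 3.5 (refresh 3.5.3 or later)"  # assumption: see lead-in
TO_353 = "Upgrade to Version 3.5.3 or later"
TO_40 = "Upgrade to Version 4.0.1 or a later refresh"
TO_LATEST = "Upgrade to the latest 4.0.x refresh"
SERVICES = "Upgrade the services on the cluster"


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert a dotted version string such as '3.5.3' to a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def upgrade_steps(installed: str) -> List[str]:
    """Return the ordered upgrade steps for an installed platform version."""
    version = parse_version(installed)
    if version[:2] == (4, 0):
        return [TO_LATEST, SERVICES]
    if version[:2] == (3, 5):
        # Tuple comparison treats '3.5' as earlier than '3.5.3'.
        if version >= (3, 5, 3):
            return [TO_40, SERVICES]
        return [TO_353, TO_40, SERVICES]
    if version == (3, 0, 1):
        return [TO_35, TO_40, SERVICES]
    # Versions that the rules above do not mention are rejected, not guessed.
    raise ValueError(f"No upgrade rule listed for version {installed}")
```

For example, `upgrade_steps("3.5.1")` returns three steps: move to 3.5.3 or later, then to 4.0.1 or a later refresh, then upgrade the services.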

Volumes file browser
The integrated file browser for storage volumes is no longer a tech preview. The file browser is now a fully supported feature in Cloud Pak for Data.

Version 4.0.1 of the Cloud Pak for Data platform includes various fixes.

Cloud Pak for Data common core services
  • CASE: 1.0.1
  • Operator: 1.0
  • Operand: 4.0.1
The 4.0.1 release of the common core services includes changes to support features and updates in Watson Studio and Watson Knowledge Catalog.
The common core services include the following features and updates:
Support for job features
The common core services now support job notifications and job retention.
Support for Python 3.8
You can now select Python 3.8 in environments for notebooks and Python scripts.

If you install or upgrade a service that requires the common core services, the common core services will also be installed or upgraded.

Version 4.0.1 of the common core services includes various fixes. For details, see What's new and changed in the common core services.

Cloud Pak for Data scheduling service
  • CASE: 1.2.2
  • Operator: 1.2.2
  • Operand: 1.2.2
The 1.2.2 release of the scheduling service includes the following features and updates:
Support for upgrade
You can now upgrade the scheduling service from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x

Version 1.2.2 of the scheduling service includes various fixes. For details, see What's new and changed in the scheduling service.

Analytics Engine Powered by Apache Spark
  • CASE: 4.0.1
  • Operator: 1.0.1
  • Operand: 4.0.1
The 4.0.1 release of Analytics Engine Powered by Apache Spark includes the following features and updates:
Support for upgrade
You can now upgrade Analytics Engine Powered by Apache Spark from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x

Version 4.0.1 of the Analytics Engine Powered by Apache Spark includes various fixes. For details, see What's new and changed in Analytics Engine Powered by Apache Spark.

Related documentation:
Analytics Engine Powered by Apache Spark
Cognos Analytics
  • CASE: 4.0.2
  • Operator: 4.0.1
  • Operand: 4.0.1
The 4.0.1 release of Cognos Analytics includes the following features and updates:
Support for upgrade
You can now upgrade Cognos Analytics from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 4.0.x
Important: Upgrade from Cloud Pak for Data 3.5 is not supported.

Version 4.0.1 of the Cognos Analytics service includes various fixes.

Related documentation:
Cognos Analytics
Cognos Dashboards
  • CASE: 2.0.1
  • Operator: 1.0.1
  • Operand: 4.0.1
The 4.0.1 release of Cognos Dashboards includes the following features and updates:
Support for upgrade
You can now upgrade Cognos Dashboards from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Related documentation:
Cognos Dashboards
Data Refinery
  • CASE: 1.0.1
  • Operator: 1.0.1
  • Operand: 4.0.1

Version 4.0.1 of the Data Refinery service includes various fixes. For details, see What's new and changed in Data Refinery.

Related documentation:
Data Refinery
Data Virtualization
  • CASE: 1.7.1
  • Operator: 1.7.1
  • Operand: 1.7.1
The 1.7.1 release of Data Virtualization includes the following features and updates:
Support for upgrade
Upgrading from Data Virtualization 1.5.0 on Cloud Pak for Data Version 3.5 to Data Virtualization Version 1.7.1 on Cloud Pak for Data Version 4.0.1 is a manual process. If you want to upgrade from Data Virtualization 1.5.0, contact IBM Support for information.
Virtualize tables in IBM Cloud Object Storage, Amazon S3, and Ceph® data sources
You can now virtualize tables in IBM Cloud Object Storage by using an enhanced virtualization flow to browse and preview files. Data Virtualization integrates with object storage connections and supports PARQUET (or PARQUETFILE), Optimized Row Columnar (ORC), CSV (Comma Separated Values), TSV (TAB Separated Values), and JSON data formats. For details, see Connecting to Cloud Object Storage.
Support for user groups
Data Virtualization supports user groups. To use groups in Data Virtualization, follow these steps:
  1. Set up Cloud Pak for Data user groups.
  2. Assign Data Virtualization roles to the user groups.
  3. Grant user groups access to virtual objects in Data Virtualization.
Build applications that can connect to Data Virtualization
You can build applications that access and use the data from the Data Virtualization service by enabling the Db2 REST interface. For more information, see Building database applications.
Db2 Data Management Console enhancements
Data Virtualization automatically installs or upgrades the Db2 Data Management Console service. For information about new features or fixes, see What's new and changed in Db2 Data Management Console.

Version 1.7.1 of the Data Virtualization service includes various fixes. For details, see What's new and changed in Data Virtualization.

Related documentation:
Data Virtualization
DataStage
  • CASE: 4.0.2
  • Operator: 1.0.1
  • Operand: 4.0.1
The 4.0.1 release of DataStage includes the following features and updates:
Support for upgrade
You can now upgrade DataStage from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Version 4.0.1 of the DataStage service includes various fixes.
Related documentation:
DataStage
Db2
  • CASE: 4.0.1
  • Operator: 1.0.1
  • Operand: 4.0.1
The 4.0.1 release of Db2 includes the following features and updates:
Support for upgrade
You can now upgrade Db2 from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Related documentation:
Db2
Db2 Big SQL
  • CASE: 7.2.1+20210819.120000
  • Operator: 7.2.1
  • Operand: 7.2.1
The 7.2.1 release of Db2 Big SQL includes the following features and updates:
Support for upgrade
You can now upgrade Db2 Big SQL from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
Important: Upgrade from Cloud Pak for Data 4.0 is not supported.
Related documentation:
Db2 Big SQL
Db2 Data Gate
  • CASE: 4.0.1
  • Operator: 2.0.1
  • Operand: 2.0.1
The 2.0.1 release of Db2 Data Gate includes the following features and updates:
Support for upgrade
You can now upgrade Db2 Data Gate from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Version 2.0.1 of the Db2 Data Gate service includes various fixes.
Related documentation:
Db2 Data Gate
Db2 Data Management Console
  • CASE: 4.0.1
  • Operator: 1.0.1
  • Operand: 3.1.5
The 3.1.5 release of Db2 Data Management Console includes the following features and updates:
Support for upgrade
You can now upgrade Db2 Data Management Console from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
New method for specifying service instances
You can now create a Db2 Data Management Console service instance by creating a custom resource.

For details, see Creating a service instance.

Version 3.1.5 of the Db2 Data Management Console service includes various fixes. For details, see What's new and changed in Db2 Data Management Console.

Related documentation:
Db2 Data Management Console
Db2 Warehouse
  • CASE: 4.0.1
  • Operator: 1.0.1
  • Operand: 4.0.1
The 4.0.1 release of Db2 Warehouse includes the following features and updates:
Support for upgrade
You can now upgrade Db2 Warehouse from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Related documentation:
Db2 Warehouse
Decision Optimization
  • CASE: 4.0.1
  • Operator: 4.0.1
  • Operand: 4.0.1
The 4.0.1 release of Decision Optimization includes the following features and updates:
Support for upgrade
You can now upgrade Decision Optimization from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Sort data tables
You can now sort data tables in the Prepare Data view.
Support for Python 3.8
By default, Decision Optimization experiments use Python 3.7. However, you can edit the run parameters for your experiment to use Python 3.8 instead.
Version 4.0.1 of the Decision Optimization service includes various fixes. For details, see What's new and changed in Decision Optimization.
Related documentation:
Decision Optimization
EDB Postgres
  • CASE: 4.0.1
  • Operators:
    • PostgreSQL (third party): 1.6.0
    • EDB Postgres (IBM): 4.0.1
  • Operand: 12.7
Version 12.7 of the EDB Postgres service includes the following features and updates:
Support for upgrade
You can now upgrade EDB Postgres from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 4.0.x
Important: Upgrade from Cloud Pak for Data 3.5.x is not supported.
Updated PostgreSQL operator
The Cloud Native PostgreSQL operator, which is packaged with EDB Postgres, is now at Version 1.6.0.
Related documentation:
Execution Engine for Apache Hadoop
  • CASE: 4.0.1
  • Operator: 4.0.1
  • Operand: 4.0.1
The 4.0.1 release of Execution Engine for Apache Hadoop includes the following features and updates:
Support for upgrade
You can now upgrade Execution Engine for Apache Hadoop from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Version 4.0.1 of the Execution Engine for Apache Hadoop service includes various fixes. For details, see What's new and changed in Execution Engine for Apache Hadoop.
Related documentation:
Execution Engine for Apache Hadoop
Jupyter Notebooks with Python 3.7 for GPU   See Watson Studio Runtimes.
Jupyter Notebooks with R 3.6   See Watson Studio Runtimes.
IBM Match 360
  • CASE: 1.0.48
  • Operator: 1.1.14
  • Operand: 1.1.14
The 1.1.14 release of IBM Match 360 includes the following features and updates:
Support for upgrade
You can now upgrade IBM Match 360 from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 4.0.x
Version 1.1.14 of the IBM Match 360 service includes various fixes. For details, see What's new and changed in IBM Match 360.
Related documentation:
IBM Match 360 with Watson
MongoDB
  • CASE: 4.0.1
  • Operators:
    • MongoDB Enterprise (third party): 1.10.0
    • MongoDB (IBM): 4.0.1
  • Operand: 4.2.6 or 4.4.0
The 4.0.1 release of MongoDB includes the following features and updates:
Support for upgrade
You can now upgrade MongoDB from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Related documentation:
MongoDB
OpenPages
  • CASE: 2.0.1
  • Operator: 8.203.1
  • Operand: 8.203.1
The 8.203.1 release of OpenPages includes the following features and updates:
Support for upgrade
You can now upgrade OpenPages from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 4.0.x
Important: Upgrade from Cloud Pak for Data 3.5.x is not supported.
Unprivileged mode for SCC used for Db2 as a service
The Db2 as a service instance that is provisioned by OpenPages now uses unprivileged mode for the security context constraint (SCC). This mode provides more security. Previously, the database required additional privileges.

Version 8.203.1 of the OpenPages service includes various fixes.

Planning Analytics
  • CASE: 4.0.1
  • Operator: 4.0.1
  • Operand: 4.0.1
The 4.0.1 release of Planning Analytics includes the following features and updates:
Support for upgrade
You can now upgrade Planning Analytics from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 4.0.x
Important: Upgrade from Cloud Pak for Data 3.5.x is not supported.

Version 4.0.1 of the Planning Analytics service includes various fixes.

Related documentation:
Planning Analytics
RStudio Server with R 3.6
  • CASE: 1.0.1
  • Operator: 4.0.1
  • Operand: 4.0.1
The 4.0.1 release of RStudio Server with R 3.6 includes the following features and updates:
Support for upgrade
You can now upgrade RStudio Server with R 3.6 from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Version 4.0.1 of the RStudio Server with R 3.6 service includes various fixes. For details, see What's new and changed in RStudio Server with R 3.6.
Related documentation:
RStudio Server with R 3.6
SPSS Modeler
  • CASE: 1.0.1
  • Operator: 4.0.1
  • Operand: 4.0.1
The 4.0.1 release of SPSS Modeler includes the following features and updates:
Support for upgrade
You can now upgrade SPSS Modeler from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Job retention
New retention options are available when you create or edit jobs. You can turn on the Retention Configuration option to retain a set number of finished job runs and job run logs, and choose whether to retain them by duration (days) or by quantity.
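As a rough illustration of the two retention modes described above, the following sketch shows how a policy might decide which finished runs to keep. The `JobRun` type and the `runs_to_keep` function are hypothetical names used only for this example; they are not part of SPSS Modeler or of any Cloud Pak for Data API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class JobRun:
    """Hypothetical record of a finished job run (illustration only)."""
    run_id: str
    finished_at: datetime


def runs_to_keep(runs: List[JobRun], now: datetime,
                 max_runs: Optional[int] = None,
                 max_age_days: Optional[int] = None) -> List[JobRun]:
    """Return the runs that a retention policy would keep, newest first.

    Exactly one of max_runs (retain by quantity) or max_age_days
    (retain by duration) must be given.
    """
    if (max_runs is None) == (max_age_days is None):
        raise ValueError("specify exactly one of max_runs or max_age_days")
    newest_first = sorted(runs, key=lambda r: r.finished_at, reverse=True)
    if max_runs is not None:
        # Retain by quantity: keep only the most recent max_runs runs.
        return newest_first[:max_runs]
    # Retain by duration: keep runs that finished within the last max_age_days days.
    cutoff = now - timedelta(days=max_age_days)
    return [r for r in newest_first if r.finished_at >= cutoff]
```

In this sketch, the two modes are mutually exclusive, which mirrors the choice between duration and quantity in the Retention Configuration option.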
New documentation
A new Reference information section has been added, covering various topics such as tips and shortcuts, a CLEM language reference, and SPSS statistical algorithms. Additional information will be added to this section in the future, such as a scripting and automation guide.
Version 4.0.1 of the SPSS Modeler service includes various fixes. For details, see What's new and changed in SPSS Modeler.
Related documentation:
SPSS Modeler
Voice Gateway
  • CASE: 1.0.2
  • Operator: 1.0.1
  • Operand: 1.0.7
Voice Gateway is now available on Cloud Pak for Data Version 4.0.
The 1.0.7 release of Voice Gateway includes the following features and updates:
Default value of the CLUSTER_WORKERS environment variable

The default value of the CLUSTER_WORKERS Docker environment variable changed from 1 to 0. With this variable set to 0, a number of workers equal to one less than the number of CPUs is spawned.
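
The effect of the new default can be sketched as follows. The function name `resolve_worker_count` is hypothetical, and the handling of non-zero values and of single-CPU hosts is an assumption made for this example; only the "0 means one less than the number of CPUs" rule comes from the description above.

```python
import os
from typing import Optional


def resolve_worker_count(cluster_workers: int, cpu_count: Optional[int] = None) -> int:
    """Return how many workers a given CLUSTER_WORKERS value would produce.

    0 (the new default) means one less than the number of CPUs.
    """
    if cluster_workers != 0:
        # Assumption: any non-zero value is used as an explicit worker count.
        return cluster_workers
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Assumption: the result is never negative; on a single-CPU host the
    # formula yields 0, and the actual behavior in that case is not
    # described here.
    return max(cpus - 1, 0)
```

For example, `resolve_worker_count(0, cpu_count=4)` evaluates to 3, whereas the previous default of 1 corresponds to `resolve_worker_count(1, cpu_count=4)`, which evaluates to 1.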

Prometheus metrics formatting

The names of the metrics used by the monitoring feature for the Prometheus text format changed to use the prefix application_ instead of application:. For example, application:vg_max_calls_per_second becomes application_vg_max_calls_per_second.
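
If you have dashboards, alert rules, or scripts that refer to the old metric names, a small helper like the following can rewrite them. This is a minimal sketch: `migrate_metric_names` is a hypothetical function name, and the regular expression assumes that the old prefix appears only at the start of a metric name.

```python
import re

# Matches the old "application:" prefix at the start of a metric name.
# The word boundary prevents matches inside longer identifiers.
_OLD_PREFIX = re.compile(r"\bapplication:(?=[A-Za-z_])")


def migrate_metric_names(text: str) -> str:
    """Replace the old 'application:' metric prefix with 'application_'."""
    return _OLD_PREFIX.sub("application_", text)
```

The helper leaves names that already use the new prefix unchanged, so it can safely be applied more than once to the same text.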

Added support for Watson Assistant

Voice Gateway now supports the Watson Assistant search response type. For more information, see Handling Watson Assistant response types.

Related documentation:
Voice Gateway
Watson Assistant
  • CASE: 4.0.0
  • Operator: 4.0.0
  • Operand: 4.0.0
Watson Assistant is now available on Cloud Pak for Data Version 4.0.
The 4.0.0 release of Watson Assistant includes the following features and updates:
Universal language
You can now build an assistant in any language that you want to support. If a dedicated language model is not available for your target language, create a skill that uses the universal language model. The universal model applies a set of shared linguistic characteristics and rules from multiple languages as a starting point. It then learns from training data written in the target language that you add to it. For details, see Understanding the universal language model.
Premessage, postmessage, and log webhooks
A set of new webhooks is available for each assistant. You can use the webhooks to perform preprocessing tasks on incoming messages and postprocessing tasks on the corresponding responses. You can use the new log webhook feature to log each message with an external service. For details, see Webhook overview.
Related documentation:
Watson Assistant
Watson Assistant for Voice Interaction  
Watson Assistant for Voice Interaction consists of the following services:
  • Voice Gateway
  • Watson Assistant
  • Watson Speech to Text
  • Watson Text to Speech
Related documentation:
Watson Assistant for Voice Interaction
Watson Discovery
  • CASE: 4.0.0
  • Operator: 4.0.0
  • Operand: 4.0.0
Watson Discovery is now available from the Cloud Pak for Data 4 catalog.

For information about what's new in version 4 of the Watson Discovery service, see the release notes.

Related documentation:
Watson Discovery
Watson Knowledge Catalog
  • CASE: 4.0.1
  • Operator: 4.0.1
  • Operand: 4.0.1
The 4.0.1 release of Watson Knowledge Catalog includes the following features and updates:
Support for upgrade
You can now upgrade Watson Knowledge Catalog from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Data discovery: quick scan results
This release includes the following changes to data discovery:
  • The View data quality permission now grants access to quick scan results.
  • You can reanalyze individual tables.
  • You can set the status Reviewed for columns.
  • You can enable overwrite of existing term assignments when results are republished.
For details, see: Reviewing and working with quick scan results
Data discovery: automated discovery
You can now select multiple folders and individual files for discovery for file connection types such as HDFS. For details, see Running automated discovery.
Export and import of all governance artifacts from a single file
You can now export all governance artifacts to a single ZIP file and import them all at once by using the REST API. For details, see:
Metadata sync with external repositories
You can configure Watson Knowledge Catalog to sync governance artifacts and catalog assets with external repositories such as other instances of Watson Knowledge Catalog, IBM InfoSphere® Information Governance Catalog, or Apache Atlas. The external metadata repositories must comply with ODPi Egeria standards and support its Open Metadata Repository Services (OMRS). Synchronization between the repositories happens through participation in an Egeria cohort. For details, see: Synchronizing with an external repository
Improved performance in governance workflows
The performance was improved for workflow tasks with larger sets of data, for example when you import governance artifacts.
Support for new connection type
Watson Knowledge Catalog can now connect to SQL Query.
Support for additional data sources
Metadata import now supports the following data sources:
  • MariaDB
  • Snowflake
  • SQL Query
Support for tags in categories
You can now assign one or more tags to a category.

For details, see: Managing categories

Version 4.0.1 of the Watson Knowledge Catalog service includes various fixes. For details, see What's new and changed in Watson Knowledge Catalog.
Related documentation:
Watson Knowledge Catalog
Watson Machine Learning
  • CASE: 4.0.2
  • Operator: 4.0.1
  • Operand: 4.0.1
The 4.0.1 release of Watson Machine Learning includes the following features and updates:
Support for upgrade
You can now upgrade Watson Machine Learning from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x

After you upgrade Watson Machine Learning, review Working with assets after an upgrade.

Support for new frameworks
Build and deploy machine learning models using an expanded set of popular frameworks, including frameworks built on Python 3.8.
Custom URLs
Generate a custom URL to serve an online deployment. For details, see Creating an online deployment.
Assign a deployment owner
If a deployment owner leaves a space, you can assign a new deployment owner to keep the deployment available for use. For details, see Updating a deployment.
Version 4.0.1 of the Watson Machine Learning service includes various fixes. For details, see What's new and changed in Watson Machine Learning.
Related documentation:
Watson Machine Learning
Watson Machine Learning Accelerator
  • CASE: 2.3.1
  • Operator: 2.3.1
  • Operand: 2.3.1
The 2.3.1 release of Watson Machine Learning Accelerator includes the following features and updates:
Support for upgrade
You can now upgrade Watson Machine Learning Accelerator from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Related documentation:
Watson Machine Learning Accelerator
Watson OpenScale
  • CASE: 2.1.0
  • Operator: 1.0.1
  • Operand: 4.0.1
The 4.0.1 release of Watson OpenScale includes the following features and updates:
Support for upgrade
You can now upgrade Watson OpenScale from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x

Ensure that you review the guidance in Upgrading the Watson OpenScale service before you upgrade the service.

Version 4.0.1 of the Watson OpenScale service includes various fixes.
Related documentation:
Watson OpenScale
Watson Speech to Text
  • CASE: 4.0.0
  • Operator: 4.0.0
  • Operand: 4.0.0
Watson Speech to Text is now available on Cloud Pak for Data Version 4.0.

For a list of new features in Watson Speech to Text, see the Watson Speech to Text release notes for IBM Cloud Pak for Data.

Related documentation:
Watson Speech to Text
Watson Studio
  • CASE: 2.0.1
  • Operator: 2.0.1
  • Operand: 4.0.1
The 4.0.1 release of Watson Studio includes the following features and updates:
Support for upgrade
You can now upgrade Watson Studio from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
Migrating existing Python environments from
Starting with Cloud Pak for Data Version 4.0.0, all of the default Python environments that are included with Watson Studio use the latest versions of open source libraries from the Open Cognitive Environment (Open-CE) rather than from IBM Watson Machine Learning Community Edition (WML-CE).

Cloud Pak for Data Version 4.0.x no longer supports WML-CE integration. For details on migrating your existing environments to Open-CE, see Migrating Python environments from Cloud Pak for Data 3.5.

Support for Python 3.8
When you install Watson Studio, the Jupyter Notebook with Python 3.8 runtime is automatically installed.

The Jupyter Notebooks with Python 3.8 for GPU runtime is available through the Watson Studio Runtimes service.

You can select these environments in notebooks and Python scripts.

Support for job features
Watson Studio now supports job notifications and job retention.
Support for a new connection type
Watson Studio can now connect to SQL Query.
Version 4.0.1 of the Watson Studio service includes various fixes. For details, see What's new and changed in Watson Studio.
Related documentation:
Watson Studio
Watson Studio Runtimes
  • CASE: 1.0.1
  • Operator: 1.0.1
  • Operand: 4.0.1
The 4.0.1 release of Watson Studio Runtimes includes the following features and updates:
Support for upgrade
You can now upgrade Watson Studio Runtimes from the following Cloud Pak for Data releases:
  • Cloud Pak for Data Version 3.5.x
  • Cloud Pak for Data Version 4.0.x
New Python 3.8 for GPU environment
You can optionally install the Jupyter Notebooks with Python 3.8 for GPU runtime.

For details, see: Installing Jupyter Notebooks with Python 3.8 for GPU runtime.

Version 4.0.1 of the Watson Studio Runtimes service includes various security fixes. For details, see What's new and changed in Watson Studio Runtimes.
Related documentation:
Jupyter Notebook runtimes for Watson Studio
Watson Text to Speech
  • CASE: 4.0.0
  • Operator: 4.0.0
  • Operand: 4.0.0
Watson Text to Speech is now available on Cloud Pak for Data Version 4.0.

For a list of new features in Watson Text to Speech, see the Watson Text to Speech release notes for IBM Cloud Pak for Data.

Related documentation:
Watson Text to Speech

What's new in Version 4.0

IBM Cloud Pak for Data 4.0 introduces operator-based installations, an improved user management experience, more platform monitoring data from the web client, and an improved connections interface. In addition, Cloud Pak for Data 4.0 includes new services, like IBM Match 360 with Watson and Product Master, and enhancements to existing services, such as Watson Knowledge Catalog, Decision Optimization, and OpenPages.

If you have an existing installation of Cloud Pak for Data 3.5, you cannot upgrade to Cloud Pak for Data 4.0. The initial release of Cloud Pak for Data 4.0 supports only new installations.

Platform enhancements

The following table lists the new features that were introduced in Cloud Pak for Data Version 4.0.

What's new What does it mean for me?
Adoption of IBM Cloud Pak foundational services
In previous releases of Cloud Pak for Data, you could optionally install the IBM Cloud Pak foundational services on your cluster to integrate IAM Service and the License Service.

In Cloud Pak for Data Version 4.0, the IBM Cloud Pak foundational services are a prerequisite and Cloud Pak for Data integrates with additional features from IBM Cloud Pak foundational services:

Operand Deployment Lifecycle Manager (ODLM)
ODLM enables Cloud Pak for Data services to orchestrate and manage their dependencies.

For details about ODLM, see the IBM/operand-deployment-lifecycle-manager repository on GitHub.

Namespace Scope Operator
The Namespace Scope Operator enables the Cloud Pak for Data operators and the IBM Cloud Pak foundational services operators to be deployed in the same namespace and to manage the namespaces where Cloud Pak for Data services are deployed.

The Namespace Scope Operator can be used in tandem with the ownNamespace operator group to improve security in shared clusters. Combining these means that you do not need to grant cluster-wide authority to the Cloud Pak for Data operators and the IBM Cloud Pak foundational services operators.

For details about the Namespace Scope Operator, see:
Certificate manager
Cloud Pak for Data uses the Certificate manager to generate:
  • TLS root CA certificates
  • TLS certificates and keys for secure internal communications between microservices

For details, see IBM Certificate manager in the IBM Cloud Pak foundational services documentation.

Identity and Access Management Service (IAM Service)
The IAM Service enables Cloud Pak for Data to use multiple identity providers for authentication. Additionally, the service enables single sign-on across multiple IBM Cloud Pak installations.

By default, Cloud Pak for Data is not integrated with the IAM Service. Review Security on Cloud Pak for Data before you enable Cloud Pak for Data to use the IAM Service.

Enhanced platform monitoring interface
In Cloud Pak for Data Version 4.0, the Platform management page has been updated and renamed to Monitoring. In addition to the status and quota information for the platform, services, service instances, environments, and pods, the page now includes:
Event and alert information
By default, Cloud Pak for Data records information for the following events:
  • A persistent volume claim (PVC) is unbound. (In other words, there is no volume that meets the specifications of the PVC.)
  • A StatefulSet or deployment has unavailable replicas.
  • A service reaches or exceeds the vCPU or memory alert threshold.
  • A service reaches or exceeds the vCPU or memory quota.

However, you can create custom monitors. A monitor periodically checks the state of an entity and generates events.

Additionally, Cloud Pak for Data issues alerts for events that are in warning or critical state for a specific period of time.

The events and alerts are accessible from the Events card on the Monitoring page. For details, see Managing the platform.

Historical use data
By default, Cloud Pak for Data stores the vCPU and memory use data for the last 30 days. You can see the platform resource use data from the last 12 hours from the Monitoring page.

From the Monitoring page, you can access the Status and use page, where you can see up to 72 hours' worth of data. On the Status and use page, you can see the resource use by service, service instances, environments, and pods.

For details, see Managing the platform.

User management enhancements
Cloud Pak for Data Version 4.0 includes several enhancements to user management:
More details about permissions
The Roles page now provides more information about the actions that a user can take when they have a specific permission.
New permissions for creating and managing projects and deployment spaces
Depending on the services that you have installed, the Roles page includes the following permissions:
Project permissions
  • Create projects
  • Manage projects
  • Monitor project workloads
Deployment space permissions
  • Create deployment spaces
  • Manage deployment spaces
  • Monitor deployment activity
Updated catalog permissions
The catalog permissions have been updated to separate the creation of catalogs from the management of catalogs. Users with the Create catalog permission can only create catalogs. Users with the Manage catalogs permission can:
  • Create catalogs
  • View list of all catalogs
  • Join any catalog as an Admin
  • Reconfigure the default catalog
Updated administration permissions
The administrative permissions have been updated to provide more granular control. Any permissions that are associated with the default admin user are now associated with one of the following permissions:
Platform administration permissions
  • Administer platform
  • Manage configurations
  • Manage platform health
  • View platform health
User administration permissions
  • Manage platform roles
  • Manage user groups
  • Manage users
These changes ensure that you do not need to retain the default admin user to complete specific tasks.

For details, see Predefined roles and permissions.

In addition, an administrator with the appropriate permissions to manage users can use the View assigned permissions button to see a complete list of the permissions that a user has based on their assigned roles. For details, see Managing users.

Create platform connections that use shared credentials
When you create a platform connection, you can optionally use shared credentials. With shared credentials, all users use the same credentials to access the connection. Previously all platform connections required personal credentials. With personal credentials, each user must specify their own credentials to access the connection. For details, see Connecting to data sources.
Important: This feature is enabled by default. If you do not want to allow users to specify shared credentials, you must update your settings. For details, see Changing shared credentials settings.
Enhanced connections interface
When you create a new connection, the interface includes enhancements that make it easier to create the connection:
  • The Provider filter enables you to identify IBM data sources, third-party data sources, and user-defined data source types.
  • The Compatible services filter enables you to easily find the connection types that you can use with a specific service.

Alternatively, if you know the name of the connection type that you are looking for, you can enter it in the Find field.

New Cloud Pak for Data CLI commands
The Cloud Pak for Data command-line interface (cpd-cli) is no longer used to install the Cloud Pak for Data software. However, the cpd-cli is used to complete other administrative tasks.

Starting in Cloud Pak for Data Version 4.0, the cpd-cli includes new commands:

Diagnostic command
You can use the cpd-cli diag command to gather diagnostic information about your Cloud Pak for Data deployment. For details, see Gathering diagnostics from Cloud Pak for Data services.
User management command
You can use the cpd-cli user-mgmt command to import and manage Cloud Pak for Data users. The command also enables you to import multiple users from a JSON or CSV file. For details, see Managing Cloud Pak for Data users by command line.
Support for SMB storage volumes
You can now connect to external SMB file share servers from Cloud Pak for Data. For details, see Managing storage volumes.

Service enhancements

The following table lists the new features that are introduced for existing services in Cloud Pak for Data Version 4.0:

What's new Version What does it mean for me?
Analytics Engine Powered by Apache Spark
  • Operator: 1.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Analytics Engine Powered by Apache Spark service includes the following features and updates:
New version of the Spark jobs REST API
The Spark jobs REST API Version 2 is deprecated and replaced by Version 3, which supports additional spark-submit options. For details, see Getting started with Spark applications.
Run Spark applications interactively
You can now also run Spark applications interactively by leveraging the Kernel API. For details, see Getting started with Spark applications.
Related documentation:
Analytics Engine Powered by Apache Spark
Cognos Analytics
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Cognos Analytics service is available on Cloud Pak for Data 4.0. This release of Cognos Analytics does not include new features or updates.
Related documentation:
Cognos Analytics
Cognos Dashboards
  • Operator: 1.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Cognos Dashboards service is available on Cloud Pak for Data 4.0. This release of Cognos Dashboards does not include new features or updates.
Related documentation:
Cognos Dashboards
Data Refinery
  • Operator: 1.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Data Refinery service includes the following features and updates:
New Spark environment for running Data Refinery flow jobs
You can now select Default Spark 3.0 & R 3.6 when you select an environment for a Data Refinery flow job. This selection gives you the latest supported runtime environment. For details, see Data Refinery environments.
Short videos showcase the Data Refinery GUI operations
The Data Refinery GUI operations topic now includes a short video for each operation to help you learn by example.

If you have feedback on the videos, you can submit it through the Watson Studio and Machine Learning community. (You must sign in to leave comments.)

Related documentation:
Data Refinery
Data Virtualization
  • Operator: 1.7.0
  • Operand: 1.7.0
Version 1.7.0 of the Data Virtualization service includes the following features and updates:
Monitor the service with Db2 Data Management Console
The Data Virtualization monitoring dashboard includes new features such as alerts and notifications, job management and scheduling, and monitoring of historical data. Additionally, the Explorer view has been renamed Data. For more information, see What's new and changed in Db2 Data Management Console.

To learn more, see Monitoring and exploring Data Virtualization.

Support for additional data sources
  • You can now connect to the following data sources in Data Virtualization:
  • The following data sources have been optimized to take advantage of native query capabilities:
    • Google BigQuery
    • SAP HANA
    • Snowflake
Improve cache recommendations by using machine learning models
This release of Data Virtualization extends support for recommendation-based caching by introducing machine learning based recommendations. The machine learning cache recommendation algorithm considers underlying query patterns in the workload and predicts caches that help future query workloads. Data Virtualization uses a pre-trained model that was trained on an industry standard data set. The recommendation engine consolidates and ranks the final set of recommendations from both rule-based and machine learning based models. For details, see Managing data caches and queries.
You can also use three new APIs to manage your caches:
Improve statistics collection when you virtualize a table
The optimizer makes its decisions based on statistical information that it has about the data that is being queried. To ensure optimal query performance, you must have accurate and up-to-date statistics on the data. In this release, significant enhancements were made to the statistics collection that is performed during table virtualization for data sources that support catalog-based statistics. You should ensure that accurate catalog-based statistics are available in the remote data sources before you virtualize a table.

For details, see Collecting statistics (Data Virtualization).

Improve query performance with enhanced AVG() and SUM() functions
The AVG() and SUM() functions can now process larger result sets. Queries that previously resulted in an overflow error now complete successfully. These changes improve compatibility with other relational database management systems and enable larger data sets to be processed without arithmetic overflow errors.

The precision (accuracy) of results of the AVG() function is also improved.
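
The following sketch shows, in general terms, why an aggregate with a narrow accumulator can fail on large inputs while a wider accumulator succeeds. It is an illustration only: the 32-bit width and the function names `sum_int32`, `sum_wide`, and `avg_wide` are assumptions for this example and do not describe the internals of Data Virtualization.

```python
from fractions import Fraction
from typing import Iterable, List

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def sum_int32(values: Iterable[int]) -> int:
    """Sum with a simulated 32-bit signed accumulator; raise on overflow."""
    total = 0
    for v in values:
        total += v
        if not INT32_MIN <= total <= INT32_MAX:
            raise OverflowError("arithmetic overflow in 32-bit accumulator")
    return total


def sum_wide(values: Iterable[int]) -> int:
    """Sum with an arbitrary-precision accumulator (a Python int)."""
    return sum(values)


def avg_wide(values: List[int]) -> Fraction:
    """Exact average computed from the wide sum, with no rounding."""
    return Fraction(sum_wide(values), len(values))
```

With the inputs `[2_000_000_000, 2_000_000_000]`, `sum_int32` raises `OverflowError`, while `sum_wide` returns 4000000000 and `avg_wide` returns an exact average of 2000000000.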

For details, see:

Related documentation:
Data Virtualization
DataStage
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the DataStage Enterprise and DataStage Enterprise Plus services is available on Cloud Pak for Data 4.0. This release of DataStage Enterprise and DataStage Enterprise Plus does not include new features or updates.
Related documentation:
DataStage
Db2
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Db2 service includes the following features and updates:
Support for multiple HADR standbys
The Db2 service now supports multiple standby databases for the High Availability Disaster Recovery (HADR) feature. By creating multiple standbys, you can have your data in more than two sites, which provides improved data protection with a single technology. You can now use HADR to achieve:
  • High availability objectives
  • Disaster recovery objectives, which previously required another technology

For details, see Using multiple HADR standby databases.

Support for tethered projects
You can provision Db2 instances into a tethered project (an isolated Kubernetes namespace). When a Db2 instance is in a tethered project, you can manage the Db2 instance from Cloud Pak for Data but the instance is otherwise isolated from Cloud Pak for Data and the other services that are running.
Snapshot backup and restore
Db2 supports a new backup and restore method that is based on the OpenShift APIs for Data Protection. This method enables snapshot backup and restore for both Kubernetes resources and storage data. Snapshot backups have significant advantages over traditional backups; snapshot backups:
  • Are completed quickly, regardless of database size
  • Allow for fast recovery
  • Do not affect most database operations, apart from momentarily pausing operations that perform writes

For details, see Performing a snapshot backup.

Related documentation:
Db2
Db2 Big SQL
  • Operator: 1.7.0
  • Operand: 7.2.0
Version 7.2.0 of the Db2 Big SQL service includes the following features and updates:
Support for the LIKE url clause on the CREATE TABLE (HADOOP) statement
When you use the LIKE clause, you can specify that the table columns are to have a name and type that are most similar to columns in a file at the specified URL. For details, see CREATE TABLE (HADOOP) statement.
Support for the EXCHANGE PARTITION clause on the ALTER TABLE (HADOOP/HBASE) statement
You can move a partition from a source table to a target table and alter each table's metadata by specifying the EXCHANGE PARTITION clause on the ALTER TABLE statement. For details, see ALTER TABLE (HADOOP/HBASE) statement.
Temporary directory bypass feature for INSERT
You can bypass temporary directories when you insert data into tables that are located on object stores. The data is written directly into a table's location. Enabling this feature can significantly improve ingestion performance and reduce the total number of operations that must be run against the object stores. For details, see Bypassing temporary directories to increase the performance of insert operations into object stores.
Integration with the Cloud Pak for Data auditing service
Db2 Big SQL Version 7.2.0 integrates fully with the Cloud Pak for Data auditing service feature, providing standard auditing records for important lifecycle and security events in the service instance. For details, see Auditing Cloud Pak for Data.
Hadoop cluster support
Db2 Big SQL Version 7.2.0 can connect to Hadoop clusters on Cloudera Data Platform (CDP) Private Cloud Base 7.1.6 only.

Support for other Hadoop clusters is dropped. For details, see Removals and deprecations.

Related documentation:
Db2 Big SQL
Db2 Data Gate
  • Operator: 1.0.0
  • Operand: 2.0.0
Version 2.0.0 of the Db2 Data Gate service includes the following features and updates:
Seamless credential updates
You can update the credentials for the Db2 for z/OS® source database or the Db2 target database without interrupting the synchronization flow between the two databases.

For the target database, see Creating a Db2 Data Gate instance.

Query routing support (beta)
Route your analytical Db2 for z/OS queries to a Db2 Data Gate instance to shift the query workload and save z/OS processing resources.

This feature is in beta state. Contact IBM support before using it.

Related documentation:
Db2 Data Gate
Db2 Data Management Console
  • Operator: 1.0.0
  • Operand: 3.1.5
Version 3.1.5 of the Db2 Data Management Console service includes the following features and updates:
Alerts and notifications
  • Support for availability and performance alerts. You can now set up the console to trigger alerts when there is a problem related to the availability or performance of a connected database.

    For details, see Setting up alerts.

  • Support for custom (user-defined) alerts. You can now define custom alerts and specify a corresponding action for each alert in the monitoring profile for a database connection. With this feature, you can:
    • Create alerts based on SQL scripts.
    • Schedule alerts to run at a specific time, or to repeat at certain intervals for one or more databases that are listed in the monitoring profile.
    • Configure email and Slack notifications to be sent to one or more users depending on the success of the alert.
    • View the history of all alerts that run on your databases.

    For details, see Creating custom alerts.

  • Manage alerts in the notification center. You can now view alert information and share all received alerts from the notification center. The details of an alert include alert severity, status, group, problem analysis, and resolution.

    For details, see Managing alerts.

Job management and scheduling
Db2 Data Management Console now supports job creation, job scheduling, and job management for your connected databases. With this feature, you can:
  • Define jobs through SQL scripts
  • Create on-demand jobs or schedule jobs to run at a specific time for one or more databases.
  • Categorize jobs using tags
  • View logs, results, and execution status of jobs that run on the database.

For details, see Creating and scheduling jobs.

Additional KPIs for reports
Db2 Data Management Console reports now include the following key performance indicators (KPIs):
  • Transaction commits per minute
  • Transaction rollbacks per minute
  • Rows read per minute
  • Rows returned per minute
  • Rows modified per minute
  • Rows read per fetched row (rows read / rows returned)
  • Logical reads per minute
  • Direct reads per minute
  • Direct writes per minute
  • Lock wait time
  • Lock timeouts
  • Deadlocks
  • Lock escalations
  • Other wait time breakdown
  • I/O time breakdown
  • Other processing time breakdown
In addition, users can download the top 10 full SQL statements. System SQL statements are excluded from this list.

For details, see Creating monitor reports.
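
As a rough sketch of how two of these KPIs relate to raw counters, the following Python helpers compute the ratio given in the list (rows read / rows returned) and normalize a counter to a per-minute rate. The function names, the handling of zero returned rows, and the per-minute normalization are assumptions made for illustration; they do not describe how the console computes its reports.

```python
def rows_read_per_fetched_row(rows_read, rows_returned):
    """Return rows read / rows returned, or None when no rows were returned."""
    if rows_returned == 0:
        return None
    return rows_read / rows_returned


def per_minute(count, interval_seconds):
    """Normalize a counter collected over interval_seconds to a per-minute rate."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return count * 60 / interval_seconds


# Hypothetical counters collected over a 5-minute (300-second) interval.
ratio = rows_read_per_fetched_row(rows_read=50_000, rows_returned=500)
commits_per_minute = per_minute(count=1_200, interval_seconds=300)
print(ratio, commits_per_minute)
```

A high ratio means that many rows are scanned for each row that is returned, which is often a hint that a query could benefit from better indexing.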

Object management
A new function called Generate DDL has been added to explore objects. The Generate DDL function extracts the Data Definition Language (DDL) statements for the identified database objects. You can use this function to generate a DDL script. The script contains statements to recreate the database objects you have selected.
Event monitoring
The Event monitoring profile page now includes the following enhancements:
  • Added support for configuring the table space. The Tablespace usage field displays the current table space usage for the event monitors on the database.
  • Added support for workload customization for activity and statistics event monitors. You can view the list of relevant workloads and enable, disable, or edit the collection settings for each workload.
  • Added a Utility types field on the Utility tab for selecting utilities. The following utilities are available:
    • BACKUP
    • LOAD
    • MOVETABLE
    • ONLINERECOVERY
    • REDISTRIBUTE
    • REORG
    • RESTORE
    • ROLLFORWARD
    • RUNSTATS

For details, see Setting up event monitor profile.

Support for Data Virtualization
Db2 Data Management Console now enables you to monitor historical data for Data Virtualization. Additionally, you can access Data Virtualization without navigating to the Data Virtualization interface.
REST APIs
Db2 Data Management Console now provides REST APIs that you can use to access your monitoring data.
Persist event monitor data
To prevent the loss of data during a Db2 Data Management Console upgrade, the configurations for the event monitor profile and the monitor profile are persisted in the target database by the console.
Related documentation:
Db2 Data Management Console
Db2 Warehouse
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Db2 Warehouse service includes the following features and updates:
Support for multiple HADR standbys
The Db2 Warehouse service now supports multiple standby databases for the High Availability Disaster Recovery (HADR) feature. By creating multiple standbys, you can have your data in more than two sites, which provides improved data protection with a single technology. You can now use HADR to achieve:
  • High availability objectives
  • Disaster recovery objectives, which previously required another technology

For details, see Using multiple HADR standby databases.

Support for tethered projects
You can provision Db2 Warehouse instances into a tethered project (an isolated Kubernetes namespace). When a Db2 Warehouse instance is in a tethered project, you can manage the Db2 Warehouse instance from Cloud Pak for Data but the instance is otherwise isolated from Cloud Pak for Data and the other services that are running.
Snapshot backup and restore
Db2 Warehouse supports a new backup and restore method that is based on the OpenShift APIs for Data Protection. This method enables snapshot backup and restore for both Kubernetes resources and storage data. Snapshot backups have significant advantages over traditional backups; snapshot backups:
  • Are completed quickly, regardless of database size
  • Allow for fast recovery
  • Do not affect most database operations, apart from momentarily pausing operations that perform writes

For details, see Performing a snapshot backup.

Related documentation:
Db2 Warehouse
Decision Optimization
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Decision Optimization service includes the following features and updates:
New Decision Optimization runtime
When you run a model in a Decision Optimization experiment, the new do_20.1 runtime is used by default. For details, see Run model view.
Support for C# models
You can delegate the Decision Optimization solve to run on Watson Machine Learning .NET (CPLEX or CPO) models. For details, see Delegating the Decision Optimization solve.
New features in the Modeling Assistant
The Modeling Assistant now provides support for:
User-defined decisions
You are no longer restricted to only using decisions deduced from your intent. You can now define your own decisions using the advanced settings and decision tabs, where you can select your decision type and its dimensions (data table or column). You can then configure new rules and objectives which use your newly defined decision.

For details, see Defining custom decisions.

Multi-concept iteration
You can specify new groups of rules in natural language by combining different concepts and iterating over these combinations. For example, you can combine employees and days and then state that you want your rule to apply to each employee-day combination.

For details, see Using multi-concept iteration.

Logical constraints and associated concepts
You can specify that if one constraint applies, then another constraint also applies. You can also express certain conditions in constraints more concisely and intuitively using the word associated in your natural language expression. This automatically makes the necessary logical connection between the concepts you are referring to, without you having to use more complicated join expressions.
For example, this constraint illustrates both logical constraints (if ... then) and the associated keyword in the Modeling Assistant natural language:
For each employee-day combination,
  if (the number of assignments of Employee is equal to 0)
    then (the number of associated Oncall duties is equal to 0)

For details, see Using logical constraints.
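
To make the meaning of the example constraint concrete, the following Python sketch iterates over every employee-day combination and reports the combinations that break the if-then rule. It only checks a given schedule; it does not solve a model, and it is not how the Modeling Assistant works internally. The sample data and the function name are hypothetical.

```python
from itertools import product

employees = ["Ana", "Ben"]
days = ["Mon", "Tue"]

# Hypothetical sample schedule: (employee, day) -> count.
assignments = {("Ana", "Mon"): 2, ("Ana", "Tue"): 0,
               ("Ben", "Mon"): 0, ("Ben", "Tue"): 1}
oncall_duties = {("Ana", "Mon"): 1, ("Ana", "Tue"): 0,
                 ("Ben", "Mon"): 1, ("Ben", "Tue"): 0}


def violations(employees, days, assignments, oncall_duties):
    """Return the employee-day combinations that break the if-then rule."""
    broken = []
    for employee, day in product(employees, days):  # each employee-day combination
        key = (employee, day)
        # if the number of assignments is 0, then on-call duties must also be 0
        if assignments.get(key, 0) == 0 and oncall_duties.get(key, 0) != 0:
            broken.append(key)
    return broken


broken_combinations = violations(employees, days, assignments, oncall_duties)
print(broken_combinations)
```

In this sample, Ben has no assignments on Monday but still has an on-call duty, so that combination is the only one reported.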

For a video demonstrating these new Modeling Assistant features, see Use Decision Optimization Modeling Assistant video.

CPLEX V.20.1
CPLEX V.20.1 is now available in Watson Machine Learning. For details, see Model deployment.
Support for audit logging
Decision Optimization integrates with the Cloud Pak for Data audit logging feature. Events related to Decision Optimization experiments, scenarios and solves now generate audit records. For details, see Services that support audit logging.
Related documentation:
Decision Optimization
EDB Postgres
  • Operator: 4.0.0
  • Operand: 12.7 or 13.3
Versions 12.7 and 13.3 of the EDB Postgres service are available on Cloud Pak for Data 4.0. These versions of EDB Postgres do not include new features or updates.
Related documentation:
Execution Engine for Apache Hadoop
  • Operator: 1.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Execution Engine for Apache Hadoop service is available on Cloud Pak for Data 4.0. Execution Engine for Apache Hadoop no longer requires a separate license to install. It is included in your purchase of Cloud Pak for Data. This release of Execution Engine for Apache Hadoop does not include new features or updates.
Related documentation:
Execution Engine for Apache Hadoop
Jupyter Notebooks with Python 3.7 for GPU
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Jupyter Notebooks with Python 3.7 for GPU service is available on Cloud Pak for Data 4.0. This release of Jupyter Notebooks with Python 3.7 for GPU does not include new features or updates.
Related documentation:
Jupyter Notebooks with Python 3.7 for GPU
Jupyter Notebooks with R 3.6
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Jupyter Notebooks with R 3.6 service is available on Cloud Pak for Data 4.0. This release of Jupyter Notebooks with R 3.6 does not include new features or updates.
Related documentation:
Jupyter Notebooks with R 3.6
MongoDB
  • Operator: 4.0.1
  • Operand: 4.2.6
In the 4.0 release, the MongoDB service includes the following features and updates:
Deploy Ops Manager separately from MongoDB
You can now provision the MongoDB Ops Manager separately from individual MongoDB databases, which enables you to have a common Ops Manager that can work with multiple MongoDB databases and that can scale to match the number of database instances that your organization uses. Ops Manager helps you complete common tasks such as managing users and database access or configuring nodes.
Related documentation:
MongoDB
OpenPages
  • Operator: 8.203.0
  • Operand: 8.203.0
Version 8.203.0 of the OpenPages service includes the following features and updates:
Support for an external database
When you provision an OpenPages instance, you can either use the database that is provided by the OpenPages service, or you can use a Db2 database on Linux® that is outside of Cloud Pak for Data.

You might use an external database to meet your organization's data security requirements, for example.

For more information, see Using an external database with OpenPages.

Support for tethered projects
If you use an external database, you can provision OpenPages instances into a tethered project (an isolated Kubernetes namespace). When an OpenPages instance is in a tethered project, you can manage the OpenPages instance from Cloud Pak for Data but the instance is otherwise isolated from Cloud Pak for Data and the other services that are running.

For more information, see Using an external database with OpenPages.

User authentication and management
OpenPages now integrates with the user authentication and management features in Cloud Pak for Data.
  • You can now sign in to OpenPages by using your Cloud Pak for Data credentials rather than signing in through OpenPages.
  • Create and maintain users and groups in Cloud Pak for Data. Users and groups in Cloud Pak for Data are automatically synchronized with OpenPages. Role templates, profiles, and locales are still managed in OpenPages.
  • Use Cloud Pak for Data to load users and groups from your organization's LDAP server and then give the users and groups access to OpenPages.
  • OpenPages API authentication uses a JSON Web Token (JWT) through Cloud Pak for Data.
  • Integrate single sign-on (SSO) through Cloud Pak for Data.

For details, see Managing users.

Other new features and enhancements
This release of OpenPages includes additional features and enhancements that were introduced in OpenPages Version 8.2.0.3.

For details, see New features in version 8.2.0.3 in the OpenPages documentation.

Related documentation:
OpenPages
Planning Analytics
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Planning Analytics service includes bug fixes from all the constituent components of Planning Analytics Workspace 2.0.63 SC.

For details, see IBM Planning Analytics 2.0 Fix Lists in the IBM support documentation.

Related documentation:
Planning Analytics
RStudio Server with R 3.6
  • Operator: 1.0.0
  • Operand: 4.0.0
Version 4.0.0 of the RStudio Server with R 3.6 service includes the following features and updates:
Create custom R environment images for RStudio
You can build custom images based on the RStudio runtime image available in Watson Studio. Building custom images enables you to optimize the standard software configuration of your RStudio runtime for your application needs.

For details, see Building custom images.

Related documentation:
RStudio Server with R 3.6
SPSS Modeler
  • Operator: 1.0.0
  • Operand: 4.0.0
Version 4.0.0 of the SPSS Modeler service includes the following features and updates:
Environment size
The default option has been changed from 4 vCPU + 12 GB of RAM to 2 vCPU + 8 GB of RAM. If you have compute-intensive workloads that require more vCPU (such as Auto Classifier nodes, Auto Cluster nodes, Auto Numeric nodes, Auto Data Prep nodes, or Text Analytics nodes), we recommend that you define a custom environment size.
Interactive decision trees
An interactive tree builder is now available for the C&R Tree, CHAID, and QUEST nodes. These nodes use decision tree models to develop classification systems that predict or classify future observations based on a set of decision rules.

Previously, you could only generate a tree model automatically, where the algorithm decides the best split at each level. Now you can use the interactive tree builder to take control, applying your business knowledge to refine or simplify the tree before saving the model nugget.

For details, see The interactive tree builder.

New sample projects and tutorials
The SPSS Modeler documentation includes tutorials that are based on sample projects that you can download. For details, see the SPSS Modeler tutorials.
Simulation Evaluation (Sim Eval) node
The Sim Eval node is now available for evaluating continuous fields. For more information, see Sim Eval node.
Streaming TCM node
The Streaming TCM node has been added for building and scoring temporal causal models in one step. For more information, see Streaming TCM node.
Support for external R and Python libraries
You can now load R and Python libraries to use with the extension nodes. Also note that R 4.0 is now supported. For more information, including instructions for installing packages that your scripts require, see Extension nodes.
Upload streams
If you use SPSS Modeler desktop Version 18.3, you can now upload streams to Cloud Pak for Data directly from the desktop user interface. For details, see Saving streams to Cloud Pak for Data.
Usability improvements
The following interface changes have been made to improve usability:
  • You can now copy/paste nodes between browser tabs within the same project
  • Undo/redo functionality
  • Improvements to the Expression Builder interface, such as a new Validate button
  • Analysis node output improvements
  • Output preview improvements
Building custom images to install ODBC drivers
You can now build custom images based on the SPSS Modeler runtime images available in IBM Watson Studio. You can use custom images to install custom ODBC drivers. To create a custom image, you need to download the image of the SPSS Modeler runtime that you want to customize, build a new custom image by adding ODBC drivers to the image you downloaded, register the new image, and finally update the environment definition you created in your project to use the new custom image. For details, see Building custom images.
New setting in the GLE node
A new setting called Perform non negative least squares is now included with the GLE node in the estimation options under Parameter Estimation. Non-negative least squares (NNLS) is a type of constrained least squares problem where the coefficients are not allowed to become negative. Not all data sets are suitable for NNLS, because NNLS requires a positive or no correlation between predictors and target. For more information about using the GLE node, see GLE node.
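
The effect of the non-negativity constraint can be sketched in the simplest possible case: one predictor and no intercept. For that case the least-squares objective is a convex quadratic in the single coefficient, so the constrained optimum is the ordinary slope clipped at zero. The Python function below is a simplification for illustration only and does not represent the estimation that the GLE node performs; its name is hypothetical.

```python
def nnls_single_predictor(x, y):
    """Least-squares slope for y ~ b * x (no intercept) with b >= 0.

    For one predictor the objective is a convex quadratic in b, so the
    constrained optimum is the unconstrained slope clipped at zero.
    """
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        return 0.0
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return max(0.0, sxy / sxx)


x = [1.0, 2.0, 3.0]
positive = nnls_single_predictor(x, [2.0, 4.0, 6.0])     # target rises with x
negative = nnls_single_predictor(x, [-2.0, -4.0, -6.0])  # target falls with x
print(positive, negative)
```

When the target falls as the predictor rises, the coefficient is pinned at zero and the predictor contributes nothing to the fit, which is why data with a negative relationship is a poor match for NNLS.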
Continuous machine learning
Model drift is the process by which models become outdated as your data changes over time. SPSS Modeler provides continuous automated machine learning to help overcome model drift.

A result of IBM research, and inspired by natural selection in biology, continuous machine learning is now available for the Auto Classifier node and the Auto Numeric node.

For details, see Continuous machine learning.

Related documentation:
SPSS Modeler
Watson Knowledge Catalog
  • Operator: 1.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Watson Knowledge Catalog service includes the following features and updates:
Enhancements when importing COBOL copybooks
When you import COBOL copybooks, the relationships between the copybooks and the corresponding virtual tables are imported into the catalog.

You can also select individual COBOL copybooks for metadata import.

In addition, the performance of importing COBOL copybook metadata is improved.

For details, see Importing metadata.

Use groups to manage data governance
You can use groups to add collaborators to:
You can also specify user groups in:

User groups are currently not supported in data quality projects.

Support for new connection types
Watson Knowledge Catalog can now connect to:
  • Databases for MongoDB
  • Microsoft Azure File Storage
In addition, the following connection names have changed:
  • Sybase is now SAP ASE
  • Sybase IQ is now SAP IQ

This change impacts only the connection type names. The connection settings remain the same.

Usability improvements for metadata import
Metadata import includes support for Box as a data source. It also includes the following improvements:
  • Create and add tags to the metadata import asset
  • Directly edit the configuration from the review section
  • Edit a metadata import asset from within the asset
  • See the status of imported data assets
  • More options when setting the data scope

For details, see Importing metadata.

Assign custom attributes to default asset types
The default asset types that are included with Watson Knowledge Catalog can now have custom attributes. Because a default asset type cannot be directly modified, you use the API to apply custom attributes from one or more other asset types to the default asset type, which gives the default asset type custom attributes.

For details, see Adding assets to a catalog.

Custom relationships between governance artifacts
You can now create and use custom attributes of the type relationship to define relationships between governance artifacts.

For details, see Custom attributes.

Data protection rules enhancements
You can now use column names in rule conditions, and you can mask columns based on the business terms, data classes, or tags assigned to a column or based on the column name.

For details, see Managing data protection rules.

Data discovery and data quality enhancements
Additional connections
You can now use the following connection types in quick scan, automated discovery, and data quality projects:
  • Amazon Redshift
  • Apache Kudu
  • Data Virtualization

For details, see Discovering assets.

More details in the run history of data rule sets
The run details of a data rule set now include an Output tab where you can see the output data for the configured output setting.

For details, see Running rule sets.

Quick scan results UI
You can now do bulk assignments or removals of business terms, and you can publish results at schema level.

For details, see Working with quick scan results.

Support for audit logging
Watson Knowledge Catalog integrates with the Cloud Pak for Data audit logging feature. Events in the following areas generate logs:
  • Metadata import
  • Policies
  • Policy rules
  • Profiling
  • Catalog
  • Workflow
  • Business glossary

For details, see Services that support logging.

Smaller installation footprint
You can optionally install the Watson Knowledge Catalog service without the legacy user interface that provides advanced curation and data quality.

You can choose between the core installation or the full installation. For details, see Installing Watson Knowledge Catalog.

Improved search across the platform
You can now use the global search bar to search for assets across all the projects, catalogs, and deployment spaces to which you have access. You can also search for governance artifacts across the categories to which you have access.

The search now finds results across more asset properties and governance artifacts. You can now search for exact words or phrases by surrounding search terms with double quotation marks. For details, see Searching across the platform.

Related documentation:
Watson Knowledge Catalog
Watson Machine Learning
  • Operator: 4.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Watson Machine Learning service includes the following features and updates:
Use groups to manage deployment space collaborators
You can now add a user group as a collaborator in a deployment space. All users in the group have the role that you assign to the group. For details, see Collaborator permissions for spaces.
New permission required to create deployment spaces
To create a deployment space, users must have the Manage deployment spaces or Create deployment spaces permission. Restricting who can create deployment spaces gives you more control over resources on the cluster.

By default, these permissions are only associated with the Administrator role. For details, see Predefined roles and permissions.

If you are upgrading from Cloud Pak for Data Version 3.5, your existing users will not have permission to create deployment spaces unless they have the Administrator role.

To give users one of these permissions, you can edit an existing role or create a new role. For details, see Managing roles.

New permissions for managing deployment spaces
Cloud Pak for Data includes new permissions that give you more control over deployment spaces:
Manage deployment spaces permission
Users with the Manage deployment spaces permission can:
  • Create deployment spaces
  • View list of all deployment spaces
  • Join any deployment space as an Admin
  • View deployment activity across all spaces
Monitor deployment activity permission
Users with the Monitor deployment activity permission can:
  • View list of all deployment spaces
  • View deployment activity across all spaces

For details, see Collaboration permissions.

Use a connection asset to secure credentials
Previously, when you accessed connected data as an input to a deployment, you had to enter authentication credentials with every connection attempt. Now, you can use the connection_asset type to refer to the data source connection by its ID. This method improves security by eliminating the need to enter authentication details multiple times. For details, see Batch deployment details.
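
As a sketch of the idea, the following Python helper builds a data reference that identifies the connection by ID instead of embedding credentials. Only the type value connection_asset comes from the text above; the other field names, the placeholder ID, and the location details are assumptions for illustration, so check Batch deployment details for the actual payload format.

```python
import json


def connection_asset_reference(connection_id, location):
    """Build a data reference that points at a connection by its ID.

    Only the type value "connection_asset" comes from the release notes;
    the surrounding field names are assumed for illustration.
    """
    if not connection_id:
        raise ValueError("connection_id is required")
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": dict(location),
    }


reference = connection_asset_reference(
    "<connection-id>",           # placeholder, not a real ID
    {"file_name": "input.csv"},  # hypothetical location details
)
print(json.dumps(reference, indent=2))
```

The reference carries no user name or password; the credentials stay with the connection asset itself.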
Support for additional frameworks
Watson Machine Learning includes support for an expanded set of popular frameworks and software specifications for building and deploying machine learning models.
AutoAI training enhancements
Now you can train AutoAI experiments with:
An expanded list of data sources
AutoAI experiments now support an expanded set of data sources.
Joined data sets
Tech preview: You can optionally combine multiple data sources that share a common column (key) and use that data to train AutoAI experiments. For details, see Building an AutoAI experiment with joined data.
AutoAI time series experiments
Tech preview: Use AutoAI time series experiments to predict future activity based on historical, sequential data. For details, see Creating a time series experiment.
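
To show what combining data sources on a common key column means in general terms, the following Python sketch joins two small row sets on a shared column. The source does not state which join semantics AutoAI applies, so the inner join used here, the function name, and the sample data are all assumptions.

```python
def inner_join(left_rows, right_rows, key):
    """Join two lists of dict rows on a shared key column (inner join)."""
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)
    joined = []
    for left in left_rows:
        for right in right_index.get(left[key], []):
            merged = dict(left)
            merged.update(right)
            joined.append(merged)
    return joined


# Hypothetical data sources that share the customer_id column.
customers = [{"customer_id": 1, "region": "EU"},
             {"customer_id": 2, "region": "US"}]
orders = [{"customer_id": 1, "amount": 40},
          {"customer_id": 1, "amount": 15},
          {"customer_id": 3, "amount": 99}]

training_rows = inner_join(customers, orders, "customer_id")
print(training_rows)
```

Only rows whose key appears in both sources survive an inner join, so customer 2 (no orders) and the order for customer 3 (no customer record) are dropped.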
Save AutoAI experiment code to a notebook
You can save all of the AutoAI experiment code to a notebook, where you can review all of the transformations and feature engineering that were used to create the model pipelines. For details, see Saving an AutoAI generated notebook.
Manage the lifecycle of models with the cpdctl CLI
You can use the cpdctl command-line interface (CLI) to manage the lifecycle of a model in Cloud Pak for Data. The cpdctl CLI helps you automate the end-to-end flow, from creating projects, to training models, to creating deployment jobs. For details, see Managing AI lifecycle with cpdctl.
Support for additional data sources for SPSS models
Watson Machine Learning supports additional input data sources for deploying an SPSS model. For details, see Batch deployment details.
Centralized job management
Simplify your DevOps process by promoting notebooks to deployment spaces, which give you a centralized place to manage jobs. For details, see Deployment spaces.
Federated Learning
Tech preview: Federated Learning provides new ways for you to tune your experiments, including support for:
  • A party threshold metric (quorum)
  • Terminating an experiment when experiment accuracy thresholds are not met

For details, see Federated learning.
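
The two tuning options can be pictured as simple checks. The Python sketch below assumes that the quorum is expressed as the fraction of parties that must respond and that the accuracy threshold is a minimum value; both interpretations, and the function names, are assumptions for illustration and are not taken from the Federated Learning implementation.

```python
def quorum_met(responding_parties, total_parties, quorum):
    """True when the fraction of responding parties reaches the quorum."""
    if total_parties <= 0:
        raise ValueError("total_parties must be positive")
    return responding_parties / total_parties >= quorum


def should_terminate(accuracy, min_accuracy):
    """True when the experiment accuracy is below the required threshold."""
    return accuracy < min_accuracy


# Hypothetical values: 3 of 4 parties responded, and the quorum is 75%.
enough_parties = quorum_met(responding_parties=3, total_parties=4, quorum=0.75)
stop_experiment = should_terminate(accuracy=0.62, min_accuracy=0.70)
print(enough_parties, stop_experiment)
```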

New authentication method
Watson Machine Learning supports an additional authentication method for Cloud Pak for Data environments that use the IBM Cloud Pak foundational services Identity and Access Management Service (IAM Service) to authenticate. For details, see Authentication.
Related documentation:
Watson Machine Learning
Watson Machine Learning Accelerator
  • Operator: 2.3.0
  • Operand: 2.3.0
Version 2.3.0 of the Watson Machine Learning Accelerator service includes the following features and updates:
Support for new deep learning libraries
Watson Machine Learning Accelerator now includes support for the following deep learning libraries:
  • TensorFlow 2.4.1
  • PyTorch 1.7.1
  • NVIDIA CUDA Toolkit 11.0, which supports NVIDIA Ampere 100 GPU
Support for high availability
You can deploy highly available instances of Watson Machine Learning Accelerator with multiple active replicas.

For details, see Installing Watson Machine Learning Accelerator.

Support for OpenShift Container Storage
You can now install Watson Machine Learning Accelerator with OpenShift Container Storage.

For details, see Compute, memory, and storage requirements.

Support for audit logging
Watson Machine Learning Accelerator integrates with the Cloud Pak for Data audit logging feature. The following events generate logs:
  • Batch training create, read and update
  • Notebook create, read and update
  • Resource Plan create, read and update
  • Hyperparameter optimization create, read and update
  • Inference deployment create, read and update
  • Inference model create, read and update
  • Platform application read and stop

For details, see Services that support audit logging.

Related documentation:
Watson Machine Learning Accelerator
Watson OpenScale
  • Operator: 1.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Watson OpenScale service includes the following features and updates:
Support for additional capabilities in batch environments
You can now use the following capabilities in batch environments:
  • Generate explanations for every record
  • Detect bias
  • Compute fairness metrics on your training data

For details, see Batch processing.

Support for Db2 as a data source in batch environments
Previously, Db2 was supported as a data source only for online subscriptions. Now, you can use Db2 as a data source in batch environments for payload, feedback, and training data. For details, see Batch processing.
Support for JDBC connections
You can now use a JDBC connection type when you configure data sources. For details, see Configuring the batch processor.
Use a remote instance of Watson Machine Learning
You can now bind a machine learning provider to:
  • Watson Machine Learning on IBM Cloud
  • Watson Machine Learning on another instance of Cloud Pak for Data

In addition, this release of Watson OpenScale provides enhanced usability for local instances of Watson Machine Learning.

Latest version of scikit-learn
Retrain your drift model to use the latest libraries that are available in scikit-learn 0.24. Beginning in Watson OpenScale on Cloud Pak for Data 4.0.1, drift detection models that are built on scikit-learn 0.20.1 will stop working. For details, see Configuring the drift detection monitor.
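To decide which drift models to retrain, you might record the scikit-learn version that each model was built with and flag the older ones. The following is a minimal sketch under two assumptions: version strings are plain numeric (for example, 0.20.1), and any version below 0.24 is treated as needing a retrain, even though the text above names only 0.20.1. The function name is hypothetical.

```python
def needs_drift_retrain(sklearn_version, minimum=(0, 24)):
    """Return True if a model built on this scikit-learn version should be retrained.

    Assumption: only the major and minor components are compared.
    """
    major_minor = tuple(int(part) for part in sklearn_version.split(".")[:2])
    return major_minor < minimum


if __name__ == "__main__":
    for version in ("0.20.1", "0.24.2"):
        print(version, needs_drift_retrain(version))
```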
Related documentation:
Watson OpenScale
Watson Studio
  • Operator: 2.0.0
  • Operand: 4.0.0
Version 4.0.0 of the Watson Studio service includes the following features and updates:
Use groups to manage project collaborators
You can now add a user group as a collaborator in an analytics project. All users in the group have the role that you assign to the group. For details, see Project collaborators.
Python environments use libraries from Open Cognitive Environment (Open-CE)
In IBM Cloud Pak for Data 4.0, all default Python environments that are included in Watson Studio use the latest open source versions of many popular machine learning libraries, such as TensorFlow, XGBoost, and PyTorch, from the Open Cognitive Environment (Open-CE). For details, see Environments.
New permission required to create projects
To create a project, users must have the Manage projects or Create projects permission. Restricting who can create projects gives you more control over resources on the cluster.

By default, these permissions are associated only with the Administrator role. For details, see Predefined roles and permissions.

If you are upgrading from Cloud Pak for Data Version 3.5, your existing users will not have permission to create projects unless they have the Administrator role.

To give users one of these permissions, you can edit an existing role or create a new role. For details, see Managing roles.
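The rule can be pictured as a simple set check: a user can create projects if they hold at least one of the two permissions. The sketch below only illustrates that logic. The user names and the way permissions are stored are hypothetical and do not reflect how Cloud Pak for Data represents roles internally.

```python
# Permission names come from the text above. Everything else is hypothetical.
CREATE_PROJECT_PERMISSIONS = {"Manage projects", "Create projects"}


def can_create_project(user_permissions):
    """Return True if the user holds at least one project-creation permission."""
    return bool(CREATE_PROJECT_PERMISSIONS & set(user_permissions))


if __name__ == "__main__":
    users = {
        "upgraded_user": ["Access catalogs"],  # hypothetical 3.5 user after upgrade
        "data_scientist": ["Create projects"],
    }
    for name, permissions in users.items():
        print(name, can_create_project(permissions))
```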

Notebooks support more data source connections
You can now use the Insert to code function in notebooks with more database connections. For details, see Data load support.
Access assets with the ibm-watson-studio-lib library
The ibm-watson-studio-lib library provides convenient access to the assets in analytics projects or in deployment spaces.

The ibm-watson-studio-lib library replaces the project-lib library, which is deprecated.

For details on the advantages of using this new asset library in notebooks in projects or spaces, see Using ibm-watson-studio-lib.

Promote notebooks to deployment spaces
You can promote notebooks to deployment spaces, then deploy them and make the URL available to users. You can do this manually from a project, or programmatically by using CPDCTL commands. For details, see Promoting notebooks to spaces.
New version of JupyterLab
JupyterLab is now at Version 3.0, which includes several Elyra 2.x extensions to the JupyterLab interface. These extensions enhance the development of data science and AI models. For details about which Elyra extensions are included, see JupyterLab.
Support for new connection types
Watson Studio can now connect to:
  • Databases for MongoDB
  • Microsoft Azure File Storage
In addition, the following connection names have changed:
  • Sybase is now SAP ASE
  • Sybase IQ is now SAP IQ

This change impacts only the connection type names. The connection settings remain the same.

Related documentation:
Watson Studio

New services

The following table lists the new services that are introduced in Cloud Pak for Data Version 4.0:

Category Service Pricing What does it mean for me?
AI IBM Match 360 with Watson Included with Cloud Pak for Data

IBM Match 360 with Watson on IBM Cloud Pak for Data seamlessly consolidates data from disparate sources to establish a single, trusted, 360-degree view of your customers.

Information about your customers can come from multiple sources across your business. Use IBM Match 360 with Watson to simplify the process of creating a 360-degree view of your customers. Consolidate data from:
  • Existing master data management systems
  • Transactional data sources
  • Clickstream data sources
Easier for data engineers
When you add a new data source to IBM Match 360, the service generates a customizable data model, reducing the need to manually map attributes. After you load data, you can run a matching algorithm to create enriched master data entities. You can also tune and train the algorithm to improve future matches.
Accessible to your enterprise
Business users can access IBM Match 360 to search, explore, and analyze master data entities. The service also includes a rich set of APIs that your applications can use to access trusted master data.

Operator: 1.0.0

Operand: 1.0.0

Related documentation:
IBM Match 360 with Watson
Data governance IBM Product Master Separately priced
Use IBM Product Master to centralize and optimize your enterprise’s product information. IBM Product Master creates a single, accurate, and up-to-date registry of product and service information that your business can use for strategic initiatives such as automating ingestion and governance of product information.
Product Master provides trusted product management information and collaborative master data management capabilities, such as:
Persona-targeted view of information
Provide users with easy access to their personalized daily tasks, while controlling system access based on roles and privileges.
Adaptive data model
Define and adapt data models with an easy-to-use UI.
Data governance
Streamline processes, rules, and validations and maintain data quality with highly scalable workflows and business performance management capabilities.
Data integration
Create on-demand or scheduled jobs to integrate product information in real time across your enterprise. Use built-in connectors to integrate with enterprise resource planning software like SAP and JD Edwards, and with e-commerce providers like Amazon and eBay.
Digital asset management
Manage unstructured digital assets, such as images, videos, and PDFs, across multiple file formats in one interface.
Machine-learning-assisted data stewardship
Use machine learning to accelerate manual tasks, shorten review cycles, and improve data quality. Use machine-learning-based product categorization and data enrichment features to automate tedious and time-consuming manual tasks such as categorizing product names, enriching product attributes, and standardizing product descriptions.

Operand: 1.0.0

Operator: 1.0.0

Related documentation:
Product Master

Installation enhancements

What's new What does it mean for me?
Red Hat OpenShift Container Platform support
You can deploy Cloud Pak for Data Version 4.0 on the following versions of Red Hat OpenShift Container Platform:
  • Version 4.6
Operator-based installation
The Cloud Pak for Data control plane and services are installed using operators, which simplify the process of upgrading, scaling, and rolling back software on Red Hat OpenShift Container Platform.

For an overview of operators, see the Red Hat OpenShift: Operators Framework video from Red Hat. (This video is also available on YouTube: https://www.youtube.com/watch?v=LymzLHRbQdk.)

For details on installing Cloud Pak for Data and Cloud Pak for Data services, see Installing Cloud Pak for Data.

Cloud Pak for Data Version 4.0 offers two installation methods for installing the control plane:
Express installation
This method enables you to install the Cloud Pak for Data control plane with fewer manual steps. However, it offers less control over namespace scoping. Additionally, you must be a Red Hat OpenShift cluster administrator to use this installation method.
Specialized installation
This method provides more control over your environment. However, it requires additional manual steps. A Red Hat OpenShift project administrator can complete some of the steps after a cluster administrator completes the initial setup.
Both of the options are explained in more detail in:
Fewer custom security context constraints
Starting in Cloud Pak for Data Version 4.0, many services can use the default restricted security context constraint (SCC).

However, some services still require custom SCCs. For details about which services require custom SCCs, see Creating custom security context constraints for services.

Removals and deprecations

What's changed What does it mean for me?
Support for Red Hat OpenShift Container Platform Version 3.11
You cannot install Cloud Pak for Data Version 4.0 on Red Hat OpenShift Container Platform 3.11.

If you have an existing cluster on Red Hat OpenShift Container Platform 3.11, you must upgrade to Red Hat OpenShift Container Platform 4.6.

Support for Red Hat OpenShift Container Platform 4.5
You cannot install Cloud Pak for Data Version 4.0 on Red Hat OpenShift Container Platform 4.5.

If you have an existing cluster on Red Hat OpenShift Container Platform 4.5, you must upgrade to Red Hat OpenShift Container Platform 4.6.
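A pre-upgrade script could encode these two entries as a version check. The sketch below assumes that the supported set for the initial Version 4.0 release is exactly the list given under Installation enhancements (Version 4.6). Adjust the set for later refreshes. The function names are hypothetical.

```python
# Supported OpenShift (major, minor) pairs for the initial 4.0 release,
# as listed in this section. This set is an assumption for later refreshes.
SUPPORTED_OPENSHIFT = {(4, 6)}


def parse_major_minor(version):
    """Turn a string such as '4.6.42' into the tuple (4, 6)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def can_install_cpd_40(openshift_version, supported=SUPPORTED_OPENSHIFT):
    """Return True if the cluster version is in the supported set."""
    return parse_major_minor(openshift_version) in supported


if __name__ == "__main__":
    for version in ("3.11", "4.5", "4.6.42"):
        status = "ok" if can_install_cpd_40(version) else "upgrade the cluster first"
        print(f"OpenShift {version}: {status}")
```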

Services that are not releasing on Cloud Pak for Data Version 4.0
The following services cannot be deployed on Cloud Pak for Data Version 4.0:
  • Db2 for z/OS Connector
  • Edge Analytics
  • Financial Crimes Insight®
  • Master Data Connect
  • Streams
  • Streams Flows
  • Watson Knowledge Studio
  • Watson Language Translator

If you need to use any of these services, you must use Cloud Pak for Data Version 3.5.
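Before you plan an upgrade, it can help to compare the services that you currently use with this list. The following minimal sketch does that comparison. The list of installed services is hypothetical, and service availability can change in later refreshes, so treat the set as a snapshot of this section only.

```python
# Services that this section lists as unavailable on Version 4.0.
NOT_ON_40 = frozenset({
    "Db2 for z/OS Connector",
    "Edge Analytics",
    "Financial Crimes Insight",
    "Master Data Connect",
    "Streams",
    "Streams Flows",
    "Watson Knowledge Studio",
    "Watson Language Translator",
})


def services_blocking_upgrade(installed):
    """Return the installed services that cannot be deployed on Version 4.0."""
    return sorted(service for service in installed if service in NOT_ON_40)


if __name__ == "__main__":
    installed = ["Watson Studio", "Streams", "Edge Analytics"]  # hypothetical
    print(services_blocking_upgrade(installed))
```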

Connecting to CDH and HDP clusters from Db2 Big SQL
Starting in Cloud Pak for Data Version 4.0, Db2 Big SQL does not support connecting to Hadoop clusters on:
  • Cloudera Distribution for Hadoop (CDH)
  • Hortonworks Data Platform (HDP)
adm, install, and scale commands are removed from the Cloud Pak for Data CLI
In Cloud Pak for Data Version 4.0, the following commands are no longer included in the Cloud Pak for Data command-line interface (cpd-cli):
  • adm
  • install
  • scale

Instead, the control plane and services are installed and managed through operators.
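If you have automation that was written for earlier releases, you might scan it for the removed commands before you upgrade. The sketch below searches script text for cpd-cli followed by adm, install, or scale. The sample script lines and their placeholder arguments are made up for illustration, and the function name is hypothetical.

```python
import re

# Commands that this section lists as removed in Version 4.0.
REMOVED_COMMANDS = ("adm", "install", "scale")
PATTERN = re.compile(r"\bcpd-cli\s+(" + "|".join(REMOVED_COMMANDS) + r")\b")


def find_removed_commands(script_text):
    """Return (line_number, command) pairs for each use of a removed command."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        match = PATTERN.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


if __name__ == "__main__":
    # Made-up sample. The arguments are placeholders, not real flags.
    sample = (
        "cpd-cli adm <args>\n"
        "cpd-cli version\n"
        "./cpd-cli install <args>\n"
    )
    for number, command in find_removed_commands(sample):
        print(f"line {number}: 'cpd-cli {command}' is removed in Version 4.0")
```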

Previous releases

Looking for information about previous releases? See the following topics in IBM Documentation: