What's new and changed in watsonx.data integration
watsonx.data integration updates can include new features and fixes. Releases are listed in reverse chronological order so that the latest release is at the beginning of the topic.
You can see a list of the new features for the platform and all of the services at What's new in IBM Software Hub.
Installing or upgrading watsonx.data integration
Ready to install or upgrade watsonx.data integration?
- To install watsonx.data integration along with the other IBM® Software Hub services, see Installing IBM Software Hub.
- To upgrade watsonx.data integration along with the other IBM Software Hub services, see Upgrading IBM Software Hub.
- To install or upgrade watsonx.data integration independently, see watsonx.data integration.
Remember: All of the IBM Software Hub components associated with an instance of IBM Software Hub must be installed at the same version.
IBM Software Hub Version 5.4.0
A new version of watsonx.data integration was released in June 2026 with IBM Software Hub 5.4.0.
Operand version: 2.4.0
This release includes the following changes:
- New features
-
This release of watsonx.data integration includes the following features:
- Connect to AlloyDB for PostgreSQL databases
-
You can now use the AlloyDB for PostgreSQL connector in your DataStage flows to read data from and write data to AlloyDB for PostgreSQL databases.
- Access data in AWS Databricks
-
You can now use the AWS Databricks connector in your DataStage flows to access and process data in Databricks workspaces.
- Access files in Microsoft SharePoint
-
You can now use the Microsoft SharePoint Files on Canvas connector in your DataStage flows to read and write files stored in SharePoint document libraries.
- Access data in Microsoft Dynamics 365
-
You can now use the Microsoft Dynamics 365 connector in your DataStage flows to read business data from and write business data to Dynamics 365 applications.
- Export and import compiled pipeline binaries
-
You can now export and import compiled Python binaries with optimized runner pipelines, which means that you can move pipelines together with their compiled assets. You control this behavior by using the include-python-binaries and include-common-binaries options in cpdctl.
- Data encryption for Teradata connections
-
You can now enable full session data encryption for Teradata optimized flows by using the new Data Encryption option. This option uses either TDGSS or TLS/SSL to encrypt network traffic, SQL statements, data requests, and responses for the entire session.
- Create parameter sets from connection properties
-
You can now create parameter sets directly from connection properties for supported connectors. Select one or more connection types and add their properties as parameters so that you can easily reuse and manage configuration values across pipelines.
- Run remote engines on s390x systems
-
You can now run remote engines on s390x (IBM Z and LinuxONE) systems, deployed as Docker containers or in Kubernetes clusters. This allows you to submit jobs from x86_64 environments and execute them on s390x hardware. This capability enables workload distribution across heterogeneous architectures.
- Receive alerts in Microsoft Teams or PagerDuty
-
You can now create alert receivers to connect Data Observability to your Microsoft Teams channels or PagerDuty services. When you create a PagerDuty alert receiver, you can track triggered alerts and manage events with your existing PagerDuty services. When you create a Microsoft Teams alert receiver, you can receive detailed notifications about triggered alerts in your Microsoft Teams channels.
- Identify trends in your data by using metric charts
-
You can now add metric charts to your Data Observability dashboard. Metric charts show how a metric has changed across job runs, which can help you identify trends in your data.
- Reuse connection details in StreamSets flows
-
When you deploy a Data Collector engine version 7.4.0, you can include connections in StreamSets flows.
- Easily manage and reuse StreamSets flows by using parameters
-
You can now use parameters in StreamSets flows to set values for stage properties at run time. You can change parameter values for each job run without editing the flow, making your flows easier to manage and reuse.
- Choose how your browser connects to StreamSets engines
-
StreamSets engines can now use the tunneling communication method, giving you more flexibility in how your browser connects to the engine. With tunneling, the browser communicates with watsonx.data integration, which securely relays data to the engine through an encrypted tunnel. This method requires no additional setup and is enabled by default.
- Run multiple engines for a StreamSets environment to support job failover
-
When you run multiple engines for a StreamSets environment, jobs can now fail over to another engine if the current engine becomes unavailable. The job restarts on an available engine and continues processing from where it stopped.
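The failover behavior described above can be illustrated with a minimal Python sketch. It is not the StreamSets implementation: the Engine and OffsetStore classes, the fail_after setting, and the engine names are all hypothetical, and the sketch only assumes that progress is committed to a store that every engine can read, so that a replacement engine resumes from the last committed position.

```python
class OffsetStore:
    """Hypothetical shared store that remembers how far the job has progressed."""

    def __init__(self):
        self.offset = 0


class Engine:
    """Hypothetical engine that can simulate an outage after a few records."""

    def __init__(self, name, fail_after=None):
        self.name = name
        self.fail_after = fail_after  # records handled before the simulated outage

    def run(self, records, store, output):
        handled = 0
        while store.offset < len(records):
            if self.fail_after is not None and handled >= self.fail_after:
                raise ConnectionError(f"{self.name} became unavailable")
            output.append((self.name, records[store.offset]))
            store.offset += 1  # commit progress after each record
            handled += 1


def run_with_failover(engines, records):
    """Try each engine in turn, resuming from the last committed offset."""
    store = OffsetStore()
    output = []
    for engine in engines:
        try:
            engine.run(records, store, output)
            return output
        except ConnectionError:
            continue  # fail over to the next available engine
    raise RuntimeError("no engine available to finish the job")


result = run_with_failover(
    [Engine("engine-a", fail_after=2), Engine("engine-b")],
    ["r1", "r2", "r3", "r4"],
)
print(result)
```

In this sketch the second engine picks up at the third record because the offset was committed before the first engine stopped, so no record is processed twice.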
- Track StreamSets job run history
-
You can now view a detailed history of a StreamSets job run to diagnose issues and understand the run state, including cases where a run remains in the Queued or Canceling status. The run history lists timestamped events that show status changes, retries, failovers, and other run activities.
- Capture a snapshot of data as it moves through a StreamSets job run
-
You can now capture and view a snapshot to verify how a StreamSets job processes data. A snapshot is a set of data that is captured as it moves through a running job.
Similar to previewing a flow, you can view how snapshot data moves through a job stage by stage. You can drill down to review the values of each record to determine whether the stage transforms data as expected.
- Process unstructured documents in multiple languages
-
You can now ingest and curate unstructured data documents in the following languages:
- French
- German
- Italian
- Japanese
- Korean
- Polish
- Spanish
- Use semantic chunking in Unstructured Data Integration
-
You can now select semantic chunking in the Chunking operator. This option produces chunks that follow natural topic and meaning boundaries rather than arbitrary size limits, resulting in more coherent context units, higher‑quality embeddings, more accurate retrieval, and reduced noise during downstream question‑answering.
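As a rough illustration of the idea behind semantic chunking, the following Python sketch starts a new chunk whenever two adjacent sentences stop being similar. It is a toy example and not the algorithm used by the Chunking operator: the word-count vectors stand in for real embeddings, and the stopword list, the 0.3 threshold, and all function names are assumptions made for this example.

```python
import math
import re
from collections import Counter

# Assumed minimal stopword list for this toy example.
STOPWORDS = {"the", "to", "in", "a", "are", "is", "from", "of"}


def bag_of_words(sentence):
    """Toy stand-in for an embedding: counts of lowercase non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(sentences, threshold=0.3):
    """Start a new chunk when a sentence is not similar enough to the previous one."""
    chunks = []
    for sentence in sentences:
        if chunks and cosine(bag_of_words(chunks[-1][-1]), bag_of_words(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return chunks


sentences = [
    "The connector reads rows from the database.",
    "The connector writes rows to the database.",
    "Alerts are sent to the on-call channel.",
    "Alerts in the channel include run details.",
]
chunks = semantic_chunks(sentences)
for chunk in chunks:
    print(chunk)
```

A fixed-size chunker would split this text by length alone, whereas the sketch keeps the two connector sentences together and the two alert sentences together because the split point follows the change of topic.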
- Summarize chunks with AI in Unstructured Data Integration
-
Generate AI-powered summaries for each document chunk to improve context understanding and retrieval accuracy.
- Ingest and store unstructured data by using more supported connectors
- You can now ingest data from the following sources:
- Confluence
- Google Drive
You can also use the following target databases for vector store (unstructured data curation supports a subset of these connectors):
- OpenSearch
- DataStax Astra DB
- Microsoft Azure Databricks
- PostgreSQL
- Db2
- Oracle
- Work with more file types in Unstructured Data Integration
-
You can now process the following file types:
- HTML
- XLSX
- BMP
- GIF
- JFIF
- JPG
- JPEG
- PNG
- TIFF
- TIF
- New features from 5.3.1 patches
- This release of watsonx.data integration includes the following features that were introduced in IBM Software Hub Version 5.3.1 patches:
- Process Kafka 4.x data with StreamSets flows
- When you deploy a StreamSets Data Collector engine version 7.2.0, you can use Kafka stages to process data in Kafka 4.x, in addition to Kafka 3.x.
- Updates
- The following updates were introduced in this release:
- Before you install or upgrade watsonx.data integration, a cluster administrator must now create cluster-scoped resources, such as custom resource definitions, cluster roles, and cluster role bindings.
- The StreamSets component in watsonx.data integration now automatically creates several defensive network policies.
- Customer-reported issues fixed in this release
- For a list of customer-reported issues that were fixed in this release, see the Fix List for IBM Cloud Pak for Data on the IBM Support website.
- Security issues fixed in this release
- The following security issues were fixed in this release:
CVE-2021-23337
CVE-2023-26920, CVE-2023-34104, CVE-2023-40403
CVE-2024-29371
CVE-2025-9820, CVE-2025-11226, CVE-2025-13465, CVE-2025-14104, CVE-2025-14831, CVE-2025-15281, CVE-2025-15284, CVE-2025-15558, CVE-2025-15599, CVE-2025-27821, CVE-2025-48431, CVE-2025-48924, CVE-2025-53864, CVE-2025-54988, CVE-2025-55130, CVE-2025-55131, CVE-2025-55132, CVE-2025-58098, CVE-2025-59465, CVE-2025-59466, CVE-2025-62718, CVE-2025-65082, CVE-2025-66200, CVE-2025-66516, CVE-2025-67735, CVE-2025-69873
CVE-2026-0540, CVE-2026-0636, CVE-2026-0861, CVE-2026-0915, CVE-2026-1225, CVE-2026-1525, CVE-2026-1526, CVE-2026-1527, CVE-2026-1528, CVE-2026-1839, CVE-2026-2229, CVE-2026-2327, CVE-2026-2332, CVE-2026-2391, CVE-2026-2581, CVE-2026-2950, CVE-2026-29786, CVE-2026-31802, CVE-2026-32141, CVE-2026-3260, CVE-2026-32280, CVE-2026-32281, CVE-2026-32282, CVE-2026-32283, CVE-2026-32288, CVE-2026-32289, CVE-2026-33228, CVE-2026-33349, CVE-2026-33412, CVE-2026-33532, CVE-2026-33671, CVE-2026-33672, CVE-2026-33750, CVE-2026-33814, CVE-2026-33870, CVE-2026-33871, CVE-2026-33891, CVE-2026-33894, CVE-2026-33895, CVE-2026-33896, CVE-2026-33916, CVE-2026-33937, CVE-2026-33938, CVE-2026-33939, CVE-2026-33940, CVE-2026-33941, CVE-2026-34043, CVE-2026-34073, CVE-2026-34165, CVE-2026-34477, CVE-2026-34478, CVE-2026-34479, CVE-2026-34480, CVE-2026-34481, CVE-2026-34601, CVE-2026-34982, CVE-2026-34986, CVE-2026-39883, CVE-2026-39892, CVE-2026-40175, CVE-2026-40895, CVE-2026-41168, CVE-2026-41238, CVE-2026-41239, CVE-2026-41240, CVE-2026-41305, CVE-2026-41602, CVE-2026-41603, CVE-2026-41604, CVE-2026-41605, CVE-2026-41606, CVE-2026-41607, CVE-2026-41650, CVE-2026-41672, CVE-2026-41673, CVE-2026-41674, CVE-2026-41675, CVE-2026-42033, CVE-2026-42034, CVE-2026-42035, CVE-2026-42036, CVE-2026-42037, CVE-2026-42038, CVE-2026-42039, CVE-2026-42041, CVE-2026-42042, CVE-2026-42043, CVE-2026-42044, CVE-2026-42264, CVE-2026-43868, CVE-2026-43869, CVE-2026-43870, CVE-2026-44728, CVE-2026-44740, CVE-2026-44973
- Deprecated features
- The following features were deprecated in this release:
- StreamSets environments for Data Collector engine versions 6.4.x - 7.0.x
- StreamSets Data Collector engine versions 6.4.x - 7.0.x are now deprecated. They will be removed from service in an upcoming release.