June 23, 2021 By Hemanth Manda 4 min read

When did you last consider whether you’re operating in a truly predictive enterprise, and how easy it is for your data consumers, models and apps to access the right data? More often than not, the answer is a resounding “not very”. Between the proliferation of data types and sources and tightening regulations, data is often held captive, sitting in silos. Traditionally, strategies for overcoming this challenge relied on consolidating the physical data into a single location, structure and vendor. While this strategy seemed great in theory, anyone who has undertaken a migration of this magnitude can tell you it’s easier said than done.

Earlier this year at THINK we unveiled our plans for the next generation of IBM Cloud Pak for Data, our alternative to help customers connect the right people to the right data at the right time. Today, I’m excited to share more details on how the latest version of the platform, version 4.0, will bring that vision to life through an intelligent data fabric.

The journey so far

Since the launch of IBM Cloud Pak for Data in 2018, our goal has always been to help customers unlock the value of their data and infuse AI throughout their business. Understanding the needs of our clients, we doubled down on delivering a first-of-its-kind containerized platform that provided flexibility to deploy the unique mix of data and AI services a client needs, in the cloud environment of their choice.

IBM Cloud Pak for Data supports a vibrant ecosystem of proprietary, third party and open source services that we continue to expand on with each release. With version 4.0 we take our efforts to the next level. New capabilities and intelligent automation help business leaders and users tackle the overwhelming data complexity they face to more easily scale the value of their data.

Weaving the threads of an intelligent data fabric

A data fabric is an architectural pattern that dynamically orchestrates disparate data sources across a hybrid and multicloud landscape to provide business-ready data in support of analytics, AI and applications. The modular and customizable nature of IBM Cloud Pak for Data offers the ideal environment to build a data fabric from best-in-class solutions, tailored to your unique needs. The tight integration of the microservices within the platform allows the management and usage of distributed data to be streamlined further by infusing intelligent automation. With version 4.0 we’re applying this automation in three key areas:

  1. Data access and usability – AutoSQL is a universal query engine that automates how you access, update and unify data across any source or type (clouds, warehouses, lakes, etc.) without the need for data movement or replication. With AutoSQL you can query distributed data across disparate landscapes up to 8x faster than the standard data warehouse.
  2. Data ingestion and cataloging – AutoCatalog automates the discovery and classification of data to streamline the creation of a real-time catalog of data assets and their relationships across disparate data landscapes.
  3. Data privacy and security – AutoPrivacy uses AI to intelligently automate the identification and monitoring of sensitive data across the organization, and the enforcement of policies on that data, to help minimize risk and ensure compliance.
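
To make the first item concrete, here is a minimal sketch of the general idea of querying data where it lives rather than copying it into one store. It is not AutoSQL and does not use any IBM Cloud Pak for Data API. It uses Python's built-in sqlite3 module, with two separate in-memory databases standing in for a "warehouse" and a "lake". The table names, the `lake` alias and both function names are assumptions made up for this illustration.

```python
import sqlite3


def build_federated_connection() -> sqlite3.Connection:
    """Open one connection that can see two separate databases.

    The main database stands in for a warehouse; the attached one,
    given the hypothetical alias "lake", stands in for a second source.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS lake")

    # Each table lives in its own database; nothing is copied between them.
    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE lake.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO lake.orders VALUES (?, ?)",
        [(1, 120.0), (1, 80.0), (2, 50.0)],
    )
    conn.commit()
    return conn


def total_spend_by_customer(conn: sqlite3.Connection) -> list:
    """Join across both databases with a single SQL statement."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM main.customers AS c
        JOIN lake.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    connection = build_federated_connection()
    for name, total in total_spend_by_customer(connection):
        print(name, total)
```

The point of the sketch is only that a single query spans both sources. A production query engine would also have to handle remote connectivity, differing SQL dialects, security and performance, none of which is modeled here.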

Register for the webinar to learn more about our intelligent data fabric and how you can take advantage of these new technologies.

Additional enhancements woven into 4.0

Further augmenting the intelligent automation of our data fabric capabilities is another new service coming to IBM Cloud Pak for Data, IBM Match 360 with Watson. Match 360 provides a machine learning-based, easy to use experience for self-service entity resolution. Non-developers can now match and link data from across their organization, helping to improve overall data quality.
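
For readers unfamiliar with entity resolution, the sketch below shows the basic task of linking records that refer to the same real-world entity. It is not how Match 360 works. The service is described above as machine learning-based, whereas this example swaps in plain string similarity from Python's standard difflib module. The function names, the 0.85 threshold and the sample records are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a name, drop commas and periods, and collapse whitespace."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(records: list, threshold: float = 0.85) -> list:
    """Greedily group records whose names look alike.

    Each record is compared with the first member of every existing
    cluster and joins the first cluster that meets the threshold;
    otherwise it starts a new cluster.
    """
    clusters = []
    for record in records:
        for cluster in clusters:
            if similarity(record, cluster[0]) >= threshold:
                cluster.append(record)
                break
        else:
            clusters.append([record])
    return clusters


if __name__ == "__main__":
    sample = ["Acme Corp.", "ACME Corp", "Globex Inc"]
    for group in link_records(sample):
        print(group)
```

Greedy clustering against a single representative keeps the example short. Real matching systems typically compare several attributes (name, address, identifiers) and tune thresholds against labeled data.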

IBM SPSS Modeler, IBM Decision Optimization and Hadoop Execution Engine services are also included as part of IBM Cloud Pak for Data 4.0. These capabilities complement the IBM Watson Studio services already within the base and enable users such as business analysts and citizen data scientists to participate in building AI solutions.

AutoAI is enhanced to support relational data sources and generate exportable Python code, enabling data scientists to review and update models generated through AutoAI. This is a significant differentiator compared to the AutoML capabilities of competitors, where the generated model is more of a black box.

Complementary capabilities are also released on IBM Cloud Pak for Data as a Service, including IBM DataStage and IBM Data Virtualization. Now available fully managed, DataStage helps enable the building of modern data integration pipelines, and the Data Virtualization capability helps to share data across the organization in near real-time, connecting governed data to your AI and ML tools.

Finally, IBM Cloud Pak for Data 4.0 includes several platform enhancements, the most notable of which is the addition of Red Hat OpenShift Operators. These help to automate the provisioning, scaling, patching and upgrades of IBM Cloud Pak for Data. First-time installs are significantly simplified, decreasing the cost of implementation, while seamless upgrades reduce the upgrade process from weeks to hours. Also beginning in 4.0, IBM Cloud Pak for Data is built on a common IBM Cloud Pak platform, enabling standardized Identity and Access Management and seamless navigation across all of the IBM Cloud Paks.

Data is a huge competitive advantage for companies and, when combined with AI, has the power to drive business transformation. IBM Cloud Pak for Data enables just that, with the potential to be 10x faster due to new built-in automation.

Learn more about the latest version of IBM Cloud Pak for Data by signing up for the Data Fabric Deep Dive webinar or by registering for a free trial.
