Introducing IBM Cloud Pak for Data Express

You’re probably already building a data fabric.

Many businesses, regardless of size, can benefit from a data fabric. In fact, many of you have started building one, whether you’ve planned to or not. Data fabric architectures have wide-ranging functionality that includes the ability to track, cleanse, integrate, curate, share, protect, explore, analyze and model data. Data fabric architectures can be as advanced as production AI or begin as simply as getting a handle on an organization’s available data assets.

This variation of purpose and scale is why so many organizations already have some part of a fabric in their IT plan, but it creates the core challenge that smaller businesses face: how can you quickly start solving problems without creating an integration nightmare in the future?

Large businesses can afford a top-down approach, building out a full platform all at once: they have a billion problems, so they want a platform that does a billion things. This type of platform-centric buildout may not be possible for smaller organizations, both because of upfront costs and because large implementations lengthen time to value. However, building point solutions independently can lead to pain down the road, as disjointed systems take effort to integrate and even more effort to extend.

Organizations seek a solution that starts by addressing a current need, helps deliver value to users and then easily weaves in another piece of the fabric in an open, modular way. That solution is called IBM Cloud Pak for Data Express.

What is IBM Cloud Pak for Data Express?

IBM Cloud Pak for Data Express is a set of three pre-built, pre-sized offerings designed to address problems in cataloging, analyzing and integrating data. The IBM Cloud Pak for Data Express offerings give you a choice of three popular data fabric starting points: IBM Data Governance Express for a data catalog, IBM ELT Pushdown Express for data pipelines or IBM Data Science and MLOps Express for analytics and modeling. Each provides pre-sized, pre-selected services designed to address a current data fabric need.

All three solutions are built on the IBM Cloud Pak framework and the Red Hat OpenShift Container Platform. This allows them to be deployed on-premises or to hybrid cloud environments, to grow to accommodate more users and more data and to be extended with new functionality while maintaining a consistent operational environment for your data.

What’s in the boxes?

Data Governance Express

Many organizations struggle to make use of data because they don’t know that it’s there. Small groups work in silos and share spreadsheets over email. Decisions are made on stale or incomplete data, and time and money are wasted cleaning up the resulting mistakes. A central catalog that allows users to discover and share data can help avoid these issues and be a key pillar in a data governance strategy. Data Governance Express augments its catalog by discovering and classifying data assets with automation and enforcing your organization’s security controls to help protect sensitive data from prying eyes.

ELT Pushdown Express

ELT Pushdown Express flips the last two steps of traditional ETL (Extract, Transform, Load), first loading raw data into a data warehouse and then transforming it into a finished state that’s ready for reporting and analytics. This can allow businesses to reuse their warehouse investment—along with its computing power and security model—to power their data pipelines. No SQL skills are typically needed because the tool allows users to specify a pipeline visually using hundreds of prebuilt operations and then automatically converts it into SQL.
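To make the load-then-transform order concrete, here is a minimal sketch of the general ELT pattern. It is not how ELT Pushdown Express is implemented: it assumes an in-memory SQLite database as a stand-in for a data warehouse, and the table names, column names and `run_elt` function are hypothetical, chosen only for illustration.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows first, then transform them inside the database with SQL."""
    # Assumption: in-memory SQLite stands in for a real data warehouse.
    conn = sqlite3.connect(":memory:")

    # Load: land the raw data untouched, including messy values.
    conn.execute("CREATE TABLE raw_orders (region TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)

    # Transform: the cleanup and aggregation run as SQL in the database
    # engine itself, rather than in a separate processing tier.
    conn.execute(
        """
        CREATE TABLE orders_by_region AS
        SELECT UPPER(TRIM(region)) AS region,
               SUM(CAST(amount AS REAL)) AS total
        FROM raw_orders
        WHERE TRIM(amount) <> ''
        GROUP BY UPPER(TRIM(region))
        """
    )

    result = conn.execute(
        "SELECT region, total FROM orders_by_region ORDER BY region"
    ).fetchall()
    conn.close()
    return result


# Hypothetical raw input with inconsistent spacing, casing and a blank amount.
RAW_ROWS = [(" east", "10.5"), ("East ", "4.5"), ("west", "7"), ("west", "")]
```

Calling `run_elt(RAW_ROWS)` returns one cleaned, aggregated row per region. The point of the sketch is the ordering: the raw rows are stored first, and the transformation is expressed as SQL that the database executes, which is the step a visual pipeline tool would generate on the user's behalf.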

Data Science and MLOps Express

Data science covers everything from simple exploration and visualization through to modeling and AI training. MLOps (machine learning operations) covers capabilities that help an organization use models in production. Promoting models from development and tracking statistics to help mitigate bias are both examples of MLOps capabilities. Our users were clear that they wanted both sets of functionality in a single package, so this Express offering weaves together open source and IBM technology to help address both halves.

Value in the present and possibilities for the future

Having these and many more capabilities share the same modular foundation helps mitigate the challenges of building multiple independent point solutions. Instead of integration nightmares, you can reap economies of scale. Happy with your data catalog but want to enhance your analytics? Mix in some data science services. Happy with your AI progress, but wish your data scientists didn’t spend so much time hunting for assets? Bring them a self-service data catalog.

A catalog of IBM, open source and third-party capabilities is available to add on to your Express system, allowing you to tailor your data environment to the needs of your organization. Grow at your own pace and extend as needed—we’ll be here to help.

Start the journey today

You shouldn’t need to sacrifice the flexibility of your data architecture to make progress today. IBM Cloud Pak for Data Express was announced on January 17, 2023, and is planned for availability on February 28, 2023. Reach out to IBM to learn more about how IBM Cloud Pak for Data Express can help your business.

Authors

Midhat Shahid

VP, Product Management, Data Fabric & Cloud Pak for Data

Vince Pasquantonio

Vice President of Development, Data Portfolio - Hybrid Data Management and watsonx.data

IBM