Overview of IBM Cloud Pak for Data

IBM Cloud Pak® for Data is a cloud-native solution that enables you to put your data to work quickly and efficiently.

Your enterprise has lots of data. You need to use your data to generate meaningful insights that can help you avoid problems and reach your goals.

But your data is useless if you can't trust it or access it. Cloud Pak for Data lets you do both by enabling you to connect to your data, govern it, find it, and use it for analysis. Cloud Pak for Data also enables all of your data users to collaborate from a single, unified interface that supports many services that are designed to work together.

Cloud Pak for Data fosters productivity by enabling users to find existing data or to request access to data. With modern tools that facilitate analytics and remove barriers to collaboration, users can spend less time finding data and more time using it effectively.

And with Cloud Pak for Data, your IT department doesn't need to deploy multiple applications on disparate systems and then try to figure out how to get them to connect.

Run anywhere

Cloud Pak for Data can run on your Red Hat® OpenShift® cluster, whether it's behind your firewall or on the cloud.
On the cloud
If you have an OpenShift deployment on IBM® Cloud, AWS, Microsoft Azure, or Google Cloud, you can deploy Cloud Pak for Data on your cluster.
On premises
Prefer to keep your deployment behind a firewall? You can run Cloud Pak for Data on your private, on-premises cluster.

If most of your enterprise data lives behind your firewall, it makes sense to put the applications that access your data behind your firewall too, so that you reduce the risk of accidentally exposing your data.

The Cloud Pak for Data data fabric

A data fabric architecture enables your enterprise to unlock the value of your data in a hybrid multicloud data landscape. Moving to a data fabric architecture transforms the way that your enterprise integrates, governs, and uses data for analytics, data science, customer master data, and compliance.

With a data fabric, you get a secure and consistent way to access data from disparate sources. You can eliminate inefficient, repetitive, and manual data access and integration processes. A data fabric architecture bridges the gap between the sources and provides business-ready data to support your company's needs. You can work with data from various types of sources across a hybrid multicloud landscape, while you keep that data secure and trusted with the full breadth of integrated data management capabilities.

The following image shows a data fabric with various data sources:

[Image: a data fabric and various data sources, including on premises, analytics/BI, data lakes, data warehouses, and cloud data stores.]

Your data engineers need tools to prepare, transform, and virtualize data. Your data quality analysts need tools to measure the quality of the data. Your governance team needs tools to control, protect, and enrich your data. Your data consumers, such as business analysts and data scientists, need tools to collaboratively develop insights and models. With the Cloud Pak for Data platform of integrated tools, your teams can work together efficiently to use your data to improve your business.

For more information about the data fabric solution, see Data fabric use cases.

Ready for AI

To be competitive and successful, your enterprise must leverage the power of artificial intelligence.

Cloud Pak for Data helps you climb the AI ladder by providing a suite of services that support you in your journey to AI.

Collect
Cloud Pak for Data helps you connect to your data, no matter where it lives. Cloud Pak for Data includes a Connections page that lists connections that can be used by multiple services. Some services support additional data sources that you can connect to from the service. The platform makes it simple to access your data.
Organize
The Watson™ Knowledge Catalog service helps you organize your data through data classification and governance. With the Watson Knowledge Catalog service, you can develop an information architecture that is on-point and ready to keep up with the scale of your data.
Analyze
Cloud Pak for Data also includes numerous analytics services that can help you generate scalable insight on demand. For example, with Cloud Pak for Data you can use:
  • Cognos® Dashboards, which enables you to create stunning dashboards to quickly visualize data.
  • SPSS® Modeler (premium service), which enables you to create flows to prepare and blend data, build and manage models, and visualize the results.
Infuse
With Cloud Pak for Data, you can make AI a part of your standard operating procedure, whether you want to build smarter apps with premium Watson services, deploy machine learning models into production at scale with Watson Machine Learning, or infuse your AI with trust and transparency with Watson OpenScale.

There are many more services that you can install on Cloud Pak for Data. For a complete list, see Services.

With Cloud Pak for Data, raw data becomes trusted data that you can analyze to gain insights and maximize business outcomes.

Support for your data lifecycle

Your data isn't static. Your machine learning models shouldn't be static either. As data is added to your on-premises and cloud data sources, you need to continually test and tune your machine learning models to ensure that they give you valuable insight. You also need to make sure that you're working with high-quality data. That's where the data governance services and the data integration and preparation services that you can install on Cloud Pak for Data come in.

You know the old adage: Garbage in, garbage out. If your data is poor, your results aren't meaningful. By bringing data stewards and data engineers together with your data scientists, you can ensure that your data is ready for analysis.

Additionally, you can ensure that any analytics assets that your data scientists create, such as models, notebooks, and Shiny apps, are included in a data catalog so that they can be governed and maintained like any other data assets in your enterprise.

With Cloud Pak for Data, you can continuously discover new, valuable insights as data is added to your ecosystem.

Modern and modular

Cloud Pak for Data provides a modern data and analytics architecture that is elastic, scalable, and reliable. The end-to-end platform means that you can spend less time managing your data and more time using it to grow your business.

You can choose which services you install on Cloud Pak for Data so that you can use your resources wisely. Whether you want to modernize your data landscape, generate real-time insights to drive business transformations, or deliver exceptional, AI-augmented customer experiences, Cloud Pak for Data has a solution that can propel your business forward.

If you want to become a data-driven enterprise, Cloud Pak for Data should be at the center of your data and analytics ecosystem.

Choose the right edition for your needs

There are two editions of Cloud Pak for Data that you can choose from:
  • Enterprise Edition
  • Standard Edition
    Standard Edition places limits on the number of virtual processor cores (VPCs) that you can have in your cluster. For specific information about the limits, contact IBM Sales.