Cloud deployment environments

You can choose to deploy IBM® Cloud Pak for Data in the environment that suits your business needs.

Cloud Pak for Data can be deployed in various private cloud and public cloud environments. The following list describes each deployment environment in more detail.


On premises, private cloud

If you want to ensure that your environment is running securely behind your firewall, or you have an existing on-premises Red Hat® OpenShift® Container Platform cluster, you can deploy Cloud Pak for Data in your own private cloud.

Red Hat OpenShift
You must deploy an instance of Red Hat OpenShift Container Platform on your cluster.
Cluster architecture
Cloud Pak for Data is deployed on a multi-node cluster. Although you can deploy Cloud Pak for Data on a 3-node cluster for development or proof of concept environments, it is recommended that you deploy your production environment on a larger, highly available cluster with multiple dedicated master and worker nodes. This configuration provides better performance, better cluster stability, and increased ease of scaling the cluster to support workload growth. The specific requirements for a production-level cluster are identified in System requirements. For more information, see Architecture for IBM Cloud Pak for Data.
Services
Special requirements for services
Some services require AVX instructions and some services require dedicated nodes. For more information, see Hardware requirements.
Supported storage
You can use one or more of the following types of storage in your cluster:
  • Red Hat OpenShift Container Storage
  • IBM Spectrum® Fusion
  • IBM Spectrum Scale Container Native
  • NFS
  • Portworx
Installation documentation
See Installing IBM Cloud Pak for Data.

IBM Cloud Pak for Data System

If you don't have an existing on-premises cluster and you want to get up and running quickly, consider IBM Cloud Pak for Data System. IBM Cloud Pak for Data System is an all-in-one cloud-native data and AI platform in a box that provides a preconfigured, governed, and secure environment to collect, organize, and analyze data.

Red Hat OpenShift
Red Hat OpenShift is already deployed on your IBM Cloud Pak for Data System.
Cluster architecture
The system lets you flexibly expand or reduce storage and compute with plug-and-play nodes. For more information, see Cloud Pak for Data System overview.
Services
You can decide which services you want to install. For a list of available services, see Services.

The storage that you use determines the services that you can run. For more information, see Storage requirements.

For information on installing services on IBM Cloud Pak for Data System, see the IBM Cloud Pak for Data System documentation.

Supported storage
OpenShift Container Storage is installed for you.
Installation documentation
Cloud Pak for Data is already installed on your IBM Cloud Pak for Data System hardware. However, you must install the appropriate services for your needs. For more information, see Installing services in the IBM Cloud Pak for Data System.

IBM Cloud

If you already use IBM Cloud to run business-critical applications, or if you don't want to set up and manage your own hardware, you can deploy Cloud Pak for Data on IBM Cloud.

Red Hat OpenShift
Cloud Pak for Data includes entitlement to the Red Hat OpenShift Container Platform. You can optionally apply this entitlement when you create your cluster.
You can deploy Cloud Pak for Data on the following Red Hat OpenShift environments:
Managed OpenShift (Recommended)
To use managed OpenShift, you must deploy an instance of Red Hat OpenShift Container Platform on IBM Cloud.
Your cluster must meet one of the following criteria:
  • Classic infrastructure single zone (IBM Cloud File Storage)
  • Classic infrastructure single or multi zone (Portworx Enterprise storage)
  • Virtual Private Cloud (VPC) Gen2 single or multi zone (Portworx Enterprise storage)

For additional prerequisites, see Getting started with IBM Cloud Pak for Data.

Restriction: If you choose Classic infrastructure, the following services are supported only on Bare Metal clusters with extra local storage for software-defined storage (SDS). These services also require Portworx Enterprise storage. If you plan to deploy any of these services, ensure that you choose this infrastructure when you configure your worker pool.
  • Voice Gateway
  • Watson™ Assistant
  • Watson Discovery
  • Watson Knowledge Studio
  • Watson Speech to Text
  • Watson Text to Speech
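The restriction above can be expressed as a simple pre-deployment check. The following Python sketch is illustrative only: the function name and its parameters are hypothetical and do not correspond to any IBM Cloud API. Only the service names and the two conditions (Bare Metal with extra local storage for SDS, and Portworx Enterprise storage) come from the restriction itself.

```python
# Services that, on Classic infrastructure, are supported only on Bare Metal
# clusters with extra local storage for SDS and Portworx Enterprise storage
# (taken from the restriction above).
RESTRICTED_SERVICES = frozenset({
    "Voice Gateway",
    "Watson Assistant",
    "Watson Discovery",
    "Watson Knowledge Studio",
    "Watson Speech to Text",
    "Watson Text to Speech",
})


def classic_restriction_violations(planned_services, bare_metal_with_sds, storage):
    """Return the planned services that this Classic infrastructure cluster cannot run.

    All parameters are hypothetical inputs for this sketch:
    planned_services    -- iterable of service names that you intend to install
    bare_metal_with_sds -- True if the worker pool is Bare Metal with extra local storage for SDS
    storage             -- name of the storage in use, for example "Portworx Enterprise"
    """
    if bare_metal_with_sds and storage == "Portworx Enterprise":
        return []  # both conditions of the restriction are satisfied
    return sorted(s for s in planned_services if s in RESTRICTED_SERVICES)
```

An empty result means that none of the planned services is blocked by this restriction. It does not validate any other prerequisite.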
Self-managed OpenShift
To use self-managed OpenShift, contact IBM Software Support.
Cluster architecture
At a minimum, your cluster must contain 3 master nodes and 3 worker nodes. You can deploy additional worker nodes to support your workload. For details, see Adding worker nodes and zones to clusters.
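As a minimal sketch, the stated minimum can be checked before you plan an installation. The function and constant names below are hypothetical. How you obtain the node counts is outside the scope of this example.

```python
MIN_MASTER_NODES = 3  # minimum stated above
MIN_WORKER_NODES = 3  # minimum stated above


def meets_minimum_topology(master_nodes, worker_nodes):
    """Return True if the node counts satisfy the stated minimum.

    Additional worker nodes beyond the minimum are allowed.
    """
    return master_nodes >= MIN_MASTER_NODES and worker_nodes >= MIN_WORKER_NODES
```

This check covers only the minimum node counts. Sizing for your actual workload is a separate decision.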
Services
After you install Cloud Pak for Data, you can decide which services you want to install. For a list of available services, see Services.

The storage that you use determines the services that you can run. For more information, see Storage requirements.

You can optionally deploy the following services automatically as part of your Cloud Pak for Data installation:
  • Scheduling service (Required if you plan to install Watson Machine Learning Accelerator. Recommended for all installations because it enables you to enforce resource quotas.)
  • Analytics Engine Powered by Apache Spark
  • Cognos® Dashboards
  • Data Refinery (Installed with Watson Knowledge Catalog or Watson Studio.)
  • Data Virtualization
  • Db2® Data Gate
  • Db2 Data Management Console
  • Db2 Warehouse
  • RStudio® Server with R 3.6
  • Watson Knowledge Catalog
  • Watson Machine Learning
  • Watson Machine Learning Accelerator
  • Watson OpenScale
  • Watson Studio
Restriction: The Virtual Data Pipeline service is not supported on IBM Cloud.
Supported storage
The storage that is supported depends on your Red Hat OpenShift environment:
Managed OpenShift
If you are running managed OpenShift, the following storage is supported:
  • IBM Cloud File Storage
    For use on the following Red Hat OpenShift Container Platform deployments:
    • Classic infrastructure single zone

    For information about which services support IBM Cloud File Storage, see Storage requirements.

  • Portworx Enterprise
    For use on the following Red Hat OpenShift Container Platform deployments:
    • Classic infrastructure single or multi zone
    • Virtual Private Cloud (VPC) Gen2 single or multi zone

    Some services require Portworx Enterprise storage to run on IBM Cloud. You can install the Portworx Enterprise service on IBM Cloud.
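The storage options for managed OpenShift listed above can be summarized as a lookup table. The following Python sketch is an illustration, not an official compatibility matrix: the keys ("classic", "vpc-gen2", "single", "multi") and the function name are hypothetical labels chosen for this example, and the values restate the lists above.

```python
# (infrastructure, zone layout) -> supported storage for managed OpenShift
# on IBM Cloud, restating the lists above. The key labels are hypothetical.
MANAGED_OPENSHIFT_STORAGE = {
    ("classic", "single"): ["IBM Cloud File Storage", "Portworx Enterprise"],
    ("classic", "multi"): ["Portworx Enterprise"],
    ("vpc-gen2", "single"): ["Portworx Enterprise"],
    ("vpc-gen2", "multi"): ["Portworx Enterprise"],
}


def supported_storage(infrastructure, zones):
    """Return the storage options for a combination, or an empty list if it is not listed."""
    return list(MANAGED_OPENSHIFT_STORAGE.get((infrastructure, zones), []))
```

For example, supported_storage("classic", "multi") returns only Portworx Enterprise, because IBM Cloud File Storage is listed for Classic infrastructure single zone only.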

Self-managed OpenShift
If you are running self-managed OpenShift, the following storage is supported:
Installation documentation
See Getting started with IBM Cloud Pak for Data.

Amazon Web Services (AWS)

If you already use AWS and you don't want to set up and manage your own hardware, you can deploy Cloud Pak for Data on AWS.

Red Hat OpenShift
You can deploy Cloud Pak for Data on the following Red Hat OpenShift environments:
Managed OpenShift
For details, see Red Hat OpenShift Service on AWS (ROSA).
Self-managed OpenShift on AWS Infrastructure as a service
Cloud Pak for Data includes entitlement to the Red Hat OpenShift Container Platform.
You must deploy an instance of Red Hat OpenShift Container Platform on your cluster.
Services
After you install Cloud Pak for Data, you can decide which services you want to install. For a list of available services, see Services.

The storage that you use determines the services that you can run. For more information, see Storage requirements.

Supported storage
If you are running managed OpenShift, the following storage is supported:
  • OpenShift Data Foundation as a Service

    OpenShift Data Foundation as a Service is available to IBM Cloud Pak customers who want to deploy on ROSA. OpenShift Data Foundation as a Service is a limited availability offering that comes with support license agreements. Contact your IBM representative for assistance.

If you are running self-managed OpenShift, the following storage is supported:
  • Red Hat OpenShift Container Storage
  • NFS
  • Portworx
Installation documentation
See Installing IBM Cloud Pak for Data.

Microsoft Azure infrastructure as a service

If you already use Microsoft Azure and you don't want to set up and manage your own hardware, you can deploy Cloud Pak for Data on Azure.

Red Hat OpenShift
Cloud Pak for Data includes entitlement to the Red Hat OpenShift Container Platform.
You must install a self-managed Red Hat OpenShift Container Platform cluster.

For more information, see:

Services
After you install Cloud Pak for Data, you can decide which services you want to install. For a list of available services, see Services.

The storage that you use determines the services that you can run. For more information, see Storage requirements.

Supported storage
If you are running self-managed OpenShift, the following storage is supported:
  • OpenShift Container Storage
  • NFS
  • Portworx
Installation documentation
See Installing IBM Cloud Pak for Data.


Google Cloud

If you already use Google Cloud and you don't want to set up and manage your own hardware, you can deploy Cloud Pak for Data on Google Cloud.

Red Hat OpenShift
Cloud Pak for Data includes entitlement to the Red Hat OpenShift Container Platform.
You must install a self-managed Red Hat OpenShift Container Platform cluster.

For more information, see:

Cluster architecture
Cloud Pak for Data is deployed on a multi-node cluster. Although you can deploy Cloud Pak for Data on a 3-node cluster for development or proof of concept environments, it is recommended that you deploy your production environment on a larger, highly available cluster with multiple dedicated master and worker nodes. This configuration provides better performance, better cluster stability, and increased ease of scaling the cluster to support workload growth. The specific requirements for a production-level cluster are identified in System requirements. For more information, see Architecture for IBM Cloud Pak for Data.
Services
After you install Cloud Pak for Data, you can decide which services you want to install. For a list of available services, see Services.

The storage that you use determines the services that you can run. For more information, see Storage requirements.

Supported storage
If you are running self-managed OpenShift, the following storage is supported:
  • OpenShift Container Storage
  • NFS
  • Portworx
Installation documentation
See Installing IBM Cloud Pak for Data.


In addition to Cloud Pak for Data software, IBM offers IBM Cloud Pak® for Data as a Service. IBM Cloud Pak for Data as a Service might be right for you if you already use IBM Cloud to run business-critical applications and you don't want to set up and manage your own deployment of Cloud Pak for Data. IBM Cloud Pak for Data as a Service differs from the Cloud Pak for Data software in several ways. For details, see the IBM Cloud Pak for Data as a Service documentation.