Storage requirements

Before you install IBM® Cloud Pak for Data, review the storage requirements for the control plane, the shared cluster components, and the services that you plan to install.

Cloud Pak for Data platform storage requirements

A Cloud Pak for Data deployment requires several types of storage:

Required storage Details
Container image registry
Depending on your environment, you might need to store images in a private container registry rather than pulling them directly from the IBM Entitled Registry. For details, see Mirroring images to your container registry.

If you use a private container registry, you must have sufficient space for the Cloud Pak for Data control plane images and the images for the services that you plan to install.

Sizing
A minimum of 300 GB of storage space in the container registry.
Local storage for container images
Each node on your cluster must have local storage for the container images that are running on that node.
Storage location
The container images are stored in the root file system on the nodes.

On Red Hat® OpenShift® Container Platform Version 4.6, local copies of the images are stored in /var/lib/containers.

Sizing
A minimum of 300 GB of storage space per node.
Persistent storage for services
The Cloud Pak for Data control plane and services store data in persistent storage.
Supported storage types
The platform supports several different types of persistent storage:
  • Red Hat OpenShift Container Storage
    • Version: 4.6 or later fixes
    • Available in the IBM Storage Suite for IBM Cloud® Paks.
  • IBM Spectrum® Fusion
    • Version: 2.1.2 or later fixes
  • IBM Spectrum Scale Container Native storage
    • IBM Spectrum Scale Container Native Storage Access Version: 5.1.1.3 or later fixes
    • Container Storage Interface Version: 2.3.0 or later fixes
    • Available in the IBM Storage Suite for IBM Cloud Paks.
  • Network File System (NFS)
    • Version: 4
  • Portworx
    • Version: 2.7.0 or later fixes
  • IBM Cloud File Storage
    • Version: Not applicable
When you plan your environment, ensure that you review the storage types that are supported by the other software that you must install:
Tip: The preceding storage options have been evaluated by IBM. You can run the Cloud Pak for Data storage validation tool to assess storage that is provided by other vendors. However, this tool does not guarantee support for other types of storage. You can use other storage environments at your own risk.
Sizing
The minimum amount of storage depends on the type of storage that you plan to use. For details, see Storage considerations.

As a general rule, Cloud Pak for Data with all services installed can use up to 700 GB of storage space. Review the Storage considerations to ensure that you have sufficient storage space available for user data based on the type of storage that you select. You can add capacity depending on your user data volume requirements.

If the common core services are installed on your cluster, 30–100 GB of storage is allocated to projects. The amount of storage depends on the version of Cloud Pak for Data that you installed. For details on managing this storage, see
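
The per-node minimum for local image storage described above can be checked with a short script. The following is a minimal sketch, not an IBM tool. It assumes that 1 GB means 10^9 bytes, that the 300 GB minimum refers to the total size of the file system that holds the images, and that the images are under /var/lib/containers as on Red Hat OpenShift Container Platform Version 4.6. The function and constant names are hypothetical.

```python
import os
import shutil

GB = 10**9  # assumption: decimal gigabytes
MIN_NODE_IMAGE_STORAGE_GB = 300  # per-node minimum stated in this topic
IMAGE_DIR = "/var/lib/containers"  # image location on OpenShift 4.6


def meets_minimum(capacity_bytes, minimum_gb=MIN_NODE_IMAGE_STORAGE_GB):
    """Return True if the capacity is at least minimum_gb gigabytes."""
    return capacity_bytes >= minimum_gb * GB


def image_storage_capacity(path=IMAGE_DIR):
    """Return the total size, in bytes, of the file system that holds path.

    Falls back to the root file system if path does not exist,
    for example when the script is run outside a cluster node.
    """
    target = path if os.path.exists(path) else "/"
    return shutil.disk_usage(target).total


if __name__ == "__main__":
    capacity = image_storage_capacity()
    status = "OK" if meets_minimum(capacity) else "below the 300 GB minimum"
    print(f"{capacity / GB:.0f} GB file system for container images: {status}")
```

Run the script on each node. To check the 300 GB container registry minimum instead, pass the registry capacity in bytes to meets_minimum.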

Cloud Pak for Data control plane persistent storage requirements

The Cloud Pak for Data control plane supports all of the shared persistent storage types that are supported by the platform. When you install the control plane you must specify the appropriate storage class:

Storage type, storage classes, and access modes:

OpenShift Container Storage
  • Primary storage class: ocs-storagecluster-cephfs or an equivalent storage class. Access mode: ReadWriteMany (RWX)
  • Storage class for metadata storage: ocs-storagecluster-ceph-rbd or an equivalent storage class. Access mode: ReadWriteOnce (RWO)

For details, see the OpenShift Container Storage documentation.

IBM Spectrum storage (IBM Spectrum Fusion and IBM Spectrum Scale Container Native)
  • Storage class: ibm-spectrum-scale-sc or an equivalent storage class. Access mode: ReadWriteMany (RWX)

For details, see the IBM Spectrum Scale Container Storage Interface Driver documentation.

NFS
  • Storage class: managed-nfs-storage or an equivalent storage class. Specify a storage class that supports ReadWriteMany (RWX) access.

Portworx
  • Primary storage class: portworx-shared-gp3 or an equivalent storage class. Access mode: ReadWriteMany (RWX)
  • Storage class for metadata storage: portworx-metastoredb-sc or an equivalent storage class. Access mode: ReadWriteOnce (RWO)

For details, see Creating Portworx storage classes.

IBM Cloud File Storage (IBM Cloud deployments only)
  • Primary storage class: ibmc-file-gold-gid, ibm-file-custom-gold-gid, or an equivalent storage class. Access mode: ReadWriteMany (RWX). For details, see Storing data on classic IBM Cloud File Storage.
  • Storage class for metadata storage: ibmc-block-gold or an equivalent storage class. Access mode: ReadWriteOnce (RWO). For details, see Storing data on classic IBM Cloud Block Storage.
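
When you script an installation, it can help to keep the control plane storage classes in one lookup structure. The following is a minimal sketch that transcribes the table above into a Python dictionary. The dictionary layout, the role names ("primary" and "metadata"), and the function name are assumptions made for illustration. They are not part of any Cloud Pak for Data interface, and an equivalent storage class can replace any of the names shown.

```python
# Control plane storage classes and access modes, transcribed from this topic.
# Each entry maps a role to (storage class name, access mode).
CONTROL_PLANE_STORAGE = {
    "OpenShift Container Storage": {
        "primary": ("ocs-storagecluster-cephfs", "RWX"),
        "metadata": ("ocs-storagecluster-ceph-rbd", "RWO"),
    },
    "IBM Spectrum": {"primary": ("ibm-spectrum-scale-sc", "RWX")},
    "NFS": {"primary": ("managed-nfs-storage", "RWX")},
    "Portworx": {
        "primary": ("portworx-shared-gp3", "RWX"),
        "metadata": ("portworx-metastoredb-sc", "RWO"),
    },
    # ibm-file-custom-gold-gid is the listed alternative primary class.
    "IBM Cloud File Storage": {
        "primary": ("ibmc-file-gold-gid", "RWX"),
        "metadata": ("ibmc-block-gold", "RWO"),
    },
}


def storage_classes_for(storage_type):
    """Return {role: (storage_class, access_mode)} for a storage type."""
    try:
        return CONTROL_PLANE_STORAGE[storage_type]
    except KeyError:
        raise ValueError(f"Unsupported storage type: {storage_type!r}") from None


if __name__ == "__main__":
    for role, (name, mode) in storage_classes_for("Portworx").items():
        print(f"{role}: {name} ({mode})")
```

Storage types that list a single storage class, such as NFS, have only a "primary" entry in this sketch.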

Shared cluster component persistent storage requirements

The following table indicates which types of storage are supported by each of the shared cluster components.

Note: IBM Spectrum storage includes both IBM Spectrum Fusion and IBM Spectrum Scale Container Native unless otherwise indicated.
Service OpenShift Container Storage IBM Spectrum NFS IBM Cloud File Storage Portworx Other Notes
IBM Cloud Pak® foundational services   See Storage options in the IBM Cloud Pak foundational services documentation.
Important: IBM Cloud Pak foundational services supports more versions of Red Hat OpenShift Container Platform than Cloud Pak for Data. Ensure that you review the options for the versions that are supported by both IBM Cloud Pak foundational services and Cloud Pak for Data.
Scheduling service             Starting with Version 1.3.0, the scheduling service does not require persistent storage.
Common core services   Supported storage classes:
  • OpenShift Container Storage:
    • ocs-storagecluster-cephfs
    • ocs-storagecluster-ceph-rbd
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx:
    • portworx-shared-gp3
    • portworx-couchdb-sc
    • portworx-elastic-sc
    • portworx-gp3-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid

Service persistent storage requirements

The following table indicates which types of storage are supported by each service.

Note: IBM Spectrum storage includes both IBM Spectrum Fusion and IBM Spectrum Scale Container Native unless otherwise indicated.
Service OpenShift Container Storage IBM Spectrum NFS IBM Cloud File Storage Portworx Other Notes
Anaconda Repository for IBM Cloud Pak for Data Not applicable Not applicable Not applicable Not applicable Not applicable Not applicable This service is not installed on your Red Hat OpenShift Container Platform cluster. For details, see the Anaconda documentation.
Analytics Engine Powered by Apache Spark   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Cognos® Analytics   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Cognos Dashboards   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Data Privacy     Data Privacy leverages the storage that is provisioned when you install Watson™ Knowledge Catalog.
Data Refinery  
Data Refinery leverages the storage that is provisioned when you install Watson Knowledge Catalog or Watson Studio.
Data Virtualization   Supported storage classes:
Note: If you are upgrading from Cloud Pak for Data Version 3.5 or if you are installing Data Virtualization on Version 4.0, the recommended storage class on OpenShift Container Storage is ocs-storagecluster-ceph-rbd.

If you already installed Data Virtualization on Version 4.0 with the ocs-storagecluster-cephfs storage class, do not change it.

  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-db2-rwx-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
DataStage®   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Db2® Supported storage classes:
  • OpenShift Container Storage:
    • For system data and backup data: ocs-storagecluster-cephfs (RWX)
    • For user data, transaction logs, and table space data: ocs-storagecluster-ceph-rbd (RWO with 4K sector size)
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx:
    • For system data and backup data: portworx-db2-rwx-sc (RWX)
    • For user data, transaction logs, and table space data: portworx-db2-rwo-sc (RWO with 4K block size)
  • IBM Cloud File Storage: ibmc-file-gold-gid
  • IBM Spectrum Scale:
    • For system data and backup data: ibm-spectrum-scale-csi (RWX)
    • For user data, transaction logs, and table space data: ibm-spectrum-scale-csi (RWO with 4K sector size)
Db2 also supports:
  • Dell EMC Isilon
  • Local storage
Db2 Big SQL   Supported storage classes:
Note: If you are upgrading from Cloud Pak for Data Version 3.5 or if you are installing Db2 Big SQL on Version 4.0, the recommended storage class on OpenShift Container Storage is ocs-storagecluster-ceph-rbd.

If you already installed Db2 Big SQL on Version 4.0 with the ocs-storagecluster-cephfs storage class, do not change it.

  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-db2-rwx-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Db2 Data Gate   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-db2-rwx-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid
Db2 Data Management Console   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid
Db2 Event Store   Storage information is not currently available. Contact IBM Support for information.
Db2 Warehouse Supported storage classes:
  • OpenShift Container Storage:
    • For system data and backup data: ocs-storagecluster-cephfs (RWX)
    • For user data, transaction logs, and table space data: ocs-storagecluster-ceph-rbd (RWO with 4K sector size)
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx:
    • For system data and backup data: portworx-db2-rwx-sc (RWX)
    • For user data, transaction logs, and table space data: portworx-db2-rwo-sc (RWO with 4K block size)
  • IBM Cloud File Storage: ibmc-file-gold-gid
  • IBM Spectrum Scale:
    • For system data and backup data: ibm-spectrum-scale-csi (RWX)
    • For user data, transaction logs, and table space data: ibm-spectrum-scale-csi (RWO with 4K sector size)
Db2 Warehouse also supports:
  • Dell EMC Isilon
  • Local storage
Decision Optimization  
Decision Optimization leverages the storage that is provisioned when you install Watson Studio.
EDB Postgres       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • IBM Spectrum: ibm-spectrum-scale-sc
  • Portworx: portworx-db-gp
Execution Engine for Apache Hadoop     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
Financial Services Workbench   Storage information is not currently available.
Guardium® External S-TAP®       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
Informix®       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-informix-sc
IBM Match 360 with Watson   Supported storage classes:
  • OpenShift Container Storage:
    • For foundation data: ocs-storagecluster-ceph-rbd
    • For shared volumes: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx:
    • portworx-shared-gp3
    • portworx-elastic-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid
MongoDB       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • IBM Spectrum: ibm-spectrum-scale-sc
  • Portworx: portworx-db-gp
Open Data for Industries       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • NFS: managed-nfs-storage
  • IBM Cloud File Storage: ibmc-file-gold-gid
OpenPages®     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
Planning Analytics     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
Product Master   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid
RStudio® Server with R 3.6   Supported storage classes:
RStudio Server with R 3.6 leverages the storage that is provisioned when you install Watson Studio.
SPSS® Modeler  

SPSS Modeler requires persistent storage for accessing Watson Studio project data.

Supported storage classes:
SPSS Modeler leverages the storage that is provisioned when you install Watson Studio.
Virtual Data Pipeline   Storage information is not currently available.
Voice Gateway ✓*   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • * IBM Spectrum (IBM Spectrum Scale Container Native only): ibm-spectrum-scale-sc
  • Portworx: portworx-shared-gp3
Watson Assistant ✓*     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • * IBM Spectrum (IBM Spectrum Scale Container Native only): ibm-spectrum-scale-sc
  • Portworx: portworx-watson-assistant-sc
  • IBM Cloud Block Storage: ibmc-block-gold
Watson Assistant for Voice Interaction   Watson Assistant for Voice Interaction consists of the following services:
  • Voice Gateway
  • Watson Assistant
  • Watson Speech to Text
  • Watson Text to Speech

Refer to the system requirements for the services that you plan to install.

Watson Discovery ✓*       Supported storage classes:
  • IBM Cloud Block Storage: ibmc-block-gold
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • * IBM Spectrum (IBM Spectrum Scale Container Native only): ibm-spectrum-scale-sc
  • Portworx: portworx-db-gp2-sc
Watson Knowledge Catalog   Supported storage classes:
  • OpenShift Container Storage:
    • ocs-storagecluster-cephfs (specified during installation)
    • ocs-storagecluster-ceph-rbd
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx:
    • portworx-shared-gp3 (specified during installation)
    • portworx-cassandra-sc
    • portworx-couchdb-sc
    • portworx-db2-rwo-sc
    • portworx-elastic-sc
    • portworx-metastoredb-sc
    • portworx-gp3-sc
    • portworx-kafka-sc
    • portworx-solr-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Watson Knowledge Studio ✓*     Supported storage classes:
  • OpenShift Container Storage:
    • RWO: ocs-storagecluster-ceph-rbd
    • RWX: ocs-storagecluster-cephfs
  • * IBM Spectrum (IBM Spectrum Scale Container Native only):
    • RWO: ibm-spectrum-scale-sc
    • RWX: ibm-spectrum-scale-sc
  • NFS:
    • RWO: managed-nfs-storage
    • RWX: managed-nfs-storage
  • Portworx:
    • RWO: portworx-db-gp3-sc
    • RWX: portworx-shared-gp3
  • IBM Cloud Block Storage:
    • RWO: ibmc-block-gold
    • RWX: Not applicable
Watson Machine Learning   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid
Watson Machine Learning Accelerator     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
The storage class must support ReadWriteMany (RWX) access.
Watson OpenScale   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Watson Speech to Text ✓*       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • * IBM Spectrum (IBM Spectrum Scale Container Native only): ibm-spectrum-scale-sc
  • Portworx: portworx-shared-gp3
Watson Studio   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • IBM Spectrum: ibm-spectrum-scale-sc
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Watson Studio Runtimes   Supported storage classes:
The Watson Studio Runtimes leverage the storage that is provisioned when you install Watson Studio.
Watson Text to Speech ✓*       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • * IBM Spectrum (IBM Spectrum Scale Container Native only): ibm-spectrum-scale-sc
  • Portworx: portworx-shared-gp3
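
Because not every service lists a storage class for every storage type, it is worth checking your planned services against your planned storage before you install. The following is a minimal sketch of such a check. It covers only four services copied from the table above, and the data structure and function name are hypothetical. Extend the dictionary with the other services that you plan to install.

```python
# Subset of the service table in this topic:
# service -> storage types for which a storage class is listed.
SERVICE_STORAGE = {
    "Watson Studio": {
        "OpenShift Container Storage", "IBM Spectrum", "NFS",
        "Portworx", "IBM Cloud File Storage",
    },
    "EDB Postgres": {"OpenShift Container Storage", "IBM Spectrum", "Portworx"},
    "Informix": {"OpenShift Container Storage", "NFS", "Portworx"},
    "Open Data for Industries": {
        "OpenShift Container Storage", "NFS", "IBM Cloud File Storage",
    },
}


def unsupported_services(planned_services, storage_type):
    """Return the planned services that list no storage class for
    storage_type, sorted by name."""
    return sorted(
        service
        for service in planned_services
        if storage_type not in SERVICE_STORAGE[service]
    )


if __name__ == "__main__":
    planned = ["Watson Studio", "EDB Postgres", "Informix"]
    for storage_type in ("NFS", "Portworx"):
        missing = unsupported_services(planned, storage_type)
        print(f"{storage_type}: {missing or 'all planned services listed'}")
```

An empty result means that every planned service in the dictionary lists a storage class for that storage type. It does not replace a review of the notes for each service.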