Storage requirements

Before you install IBM® Cloud Pak for Data, review the storage requirements for the control plane, the shared cluster components, and the services that you plan to install.

Cloud Pak for Data platform storage requirements

A Cloud Pak for Data deployment requires several types of storage:

Required storage Details
Container image registry
Depending on your environment, you might need to store images in a private container registry rather than pulling them directly from the IBM Entitled Registry. For details, see Mirroring images to your container registry.

If you use a private container registry, you must have sufficient space for the Cloud Pak for Data control plane images and the images for the services that you plan to install.

Sizing
A minimum of 300 GB of storage space in the container registry.
Local storage for container images
Each node on your cluster must have local storage for the container images that are running on that node.
Storage location
The container images are stored in the root file system on the nodes.

On Red Hat® OpenShift® Container Platform Version 4.6, local copies of the images are stored in /var/lib/containers.

Sizing
A minimum of 300 GB of storage space per node.
Shared persistent storage for services
The Cloud Pak for Data control plane and services store data in shared persistent storage.
Supported storage types
The platform supports several different types of shared storage:
  • Red Hat OpenShift Container Storage (Version: 4.6 or later fixes)
  • Network File System (NFS) (Version: 4)
  • Portworx (Version: 2.7.0 or later fixes)
  • IBM Cloud File Storage (Version: Not applicable)
When you plan your environment, ensure that you review the storage types that are supported by the other software that you must install.
Sizing
The minimum amount of storage depends on the type of storage that you plan to use. For details, see Storage considerations.

As a general rule, Cloud Pak for Data with all services installed can use up to 700 GB of storage space. Review the Storage considerations to ensure that you have sufficient storage space available for user data based on the type of storage that you select. You can add capacity depending on your user data volume requirements.
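
The per-node minimum for local image storage described above can be checked with a short script. The following is a minimal sketch, not an IBM-provided tool: it assumes that "300 GB" means decimal gigabytes and that the requirement applies to the total size of the file system, and the function names are hypothetical.

```python
import shutil

# Assumption: "GB" in the requirement means decimal gigabytes (10**9 bytes).
BYTES_PER_GB = 10**9
# From the sizing requirement above: a minimum of 300 GB per node.
MINIMUM_NODE_STORAGE_GB = 300


def meets_minimum(total_bytes: int, minimum_gb: int = MINIMUM_NODE_STORAGE_GB) -> bool:
    """Return True if a file system of total_bytes satisfies the minimum size."""
    return total_bytes >= minimum_gb * BYTES_PER_GB


def check_path(path: str, minimum_gb: int = MINIMUM_NODE_STORAGE_GB) -> bool:
    """Check the file system that holds path (for example, /var/lib/containers)."""
    usage = shutil.disk_usage(path)  # named tuple of total, used, free (in bytes)
    return meets_minimum(usage.total, minimum_gb)


# Illustrative call with a hard-coded size rather than a real node.
print(meets_minimum(500 * BYTES_PER_GB))
```

On a node, you would call check_path("/var/lib/containers"). If you interpret the requirement as free space rather than total space, compare usage.free instead.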

Cloud Pak for Data control plane persistent storage requirements

The Cloud Pak for Data control plane supports all of the shared persistent storage types that are supported by the platform. When you install the control plane, you must specify the appropriate storage class:

OpenShift Container Storage
  • Storage class: ocs-storagecluster-cephfs or an equivalent storage class.
  • Access mode: ReadWriteMany (RWX)
For details, see the OpenShift Container Storage documentation.

NFS
  • Storage class: managed-nfs-storage or an equivalent storage class.
  • Access mode: Specify a storage class that supports ReadWriteMany (RWX) access.

Portworx
  • Storage class: portworx-shared-gp3 or an equivalent storage class.
  • Access mode: ReadWriteMany (RWX)
For details, see Creating Portworx storage classes.

IBM Cloud File Storage (IBM Cloud deployments only.)
  • Supported storage classes: ibmc-file-gold-gid or ibm-file-custom-gold-gid
  • Access mode: ReadWriteMany (RWX)
For details, see Storing data on classic IBM Cloud File Storage.
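
One way to confirm that a storage class can provision ReadWriteMany volumes is to create a small test claim against it. The following sketch builds a standard Kubernetes PersistentVolumeClaim manifest as JSON. It is an illustration only, not part of the Cloud Pak for Data installation procedure; the claim name and the 1Gi size are hypothetical values.

```python
import json


def build_pvc(name: str, storage_class: str, size: str = "1Gi",
              access_mode: str = "ReadWriteMany") -> dict:
    """Build a PersistentVolumeClaim manifest for the given storage class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical claim name; the storage class comes from the list above.
pvc = build_pvc("example-rwx-claim", "ocs-storagecluster-cephfs")
print(json.dumps(pvc, indent=2))
```

The printed manifest can be saved to a file and applied with oc apply -f, which accepts JSON as well as YAML. If the claim binds, the storage class supports the requested access mode.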

Shared cluster component persistent storage requirements

Service OpenShift Container Storage NFS Portworx IBM Cloud File Storage Other Notes
IBM Cloud Pak® foundational services   See Storage options in the IBM Cloud Pak foundational services documentation.
Scheduling service   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Common core services   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • NFS: managed-nfs-storage
  • Portworx:
    • portworx-couchdb-sc
    • portworx-elastic-sc
    • portworx-gp3-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
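
Because some components need more than one storage class (for example, common core services on Portworx), it can help to compare the required classes with what exists on the cluster. The following sketch parses text in the form printed by oc get storageclass -o name and reports the classes that are missing. The sample output is invented for illustration and the function names are hypothetical.

```python
from typing import Iterable, List, Set


def parse_storage_class_names(oc_output: str) -> Set[str]:
    """Extract class names from lines such as 'storageclass.storage.k8s.io/NAME'."""
    names = set()
    for line in oc_output.splitlines():
        line = line.strip()
        if line:
            # Keep only the part after the last slash (the class name).
            names.add(line.rsplit("/", 1)[-1])
    return names


def missing_classes(required: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the required storage classes that are not available, sorted by name."""
    return sorted(set(required) - set(available))


# Portworx storage classes listed above for common core services.
COMMON_CORE_PORTWORX = ["portworx-couchdb-sc", "portworx-elastic-sc", "portworx-gp3-sc"]

# Invented sample output; on a real cluster, capture the command output instead.
sample_output = """storageclass.storage.k8s.io/portworx-couchdb-sc
storageclass.storage.k8s.io/portworx-shared-gp3
"""

available = parse_storage_class_names(sample_output)
print(missing_classes(COMMON_CORE_PORTWORX, available))
```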

Service persistent storage requirements

Service OpenShift Container Storage NFS Portworx IBM Cloud File Storage Other Notes
Anaconda Repository for IBM Cloud Pak for Data Not applicable Not applicable Not applicable Not applicable Not applicable This service is not installed on your Red Hat OpenShift Container Platform cluster. For details, see the Installation requirements in the Anaconda documentation.
Analytics Engine Powered by Apache Spark   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Cognos® Analytics   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Cognos Dashboards   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Data Refinery  
Data Refinery uses the storage that is provisioned when you install Watson™ Knowledge Catalog or Watson Studio.
Data Virtualization   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-db2-rwx-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
DataStage®   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx:
    • portworx-shared-gp3 (specified during installation)
    • portworx-cassandra-sc
    • portworx-db2-rwo-sc
    • portworx-kafka-sc
    • portworx-solr-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Db2® Supported storage classes:
  • OpenShift Container Storage:
    • For system data and backup data: ocs-storagecluster-cephfs (RWX)
    • For user data: ocs-storagecluster-ceph-rbd (RWO with 4K sector size)
  • NFS: managed-nfs-storage
  • Portworx:
    • For system data and backup data: portworx-db2-rwx-sc (RWX)
    • For user data: portworx-db2-rwo-sc (RWO with 4K block size)
  • IBM Cloud File Storage: ibmc-file-gold-gid
  • IBM Spectrum® Scale:
    • For system data and backup data: ibm-spectrum-scale-csi (RWX)
    • For user data: ibm-spectrum-scale-csi (RWO with 4K sector size)
Db2 also supports local storage.
Db2 Big SQL   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-db2-rwx-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Db2 Data Gate   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-db2-rwx-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid
Db2 Data Management Console   Other: IBM Spectrum Scale   Supported storage classes:
  • OpenShift Container Storage:
    • For system data and backup data: ocs-storagecluster-cephfs (RWX)
    • For user data: ocs-storagecluster-ceph-rbd (RWO with 4K sector size)
  • NFS: managed-nfs-storage
  • Portworx:
    • For system data and backup data: portworx-db2-rwx-sc (RWX)
    • For user data: portworx-db2-rwo-sc (RWO with 4K block size)
  • IBM Cloud File Storage: ibmc-file-gold-gid
  • IBM Spectrum Scale:
    • For system data and backup data: ibm-spectrum-scale-csi (RWX)
    • For user data: ibm-spectrum-scale-csi (RWO with 4K sector size)
Db2 Event Store   Storage information is not currently available for OpenShift Container Storage, NFS, Portworx, or IBM Cloud File Storage. Contact IBM Support for information.
Db2 Warehouse Supported storage classes:
  • OpenShift Container Storage:
    • For system data and backup data: ocs-storagecluster-cephfs (RWX)
    • For user data: ocs-storagecluster-ceph-rbd (RWO with 4K sector size)
  • NFS: managed-nfs-storage
  • Portworx:
    • For system data and backup data: portworx-db2-rwx-sc (RWX)
    • For user data: portworx-db2-rwo-sc (RWO with 4K block size)
  • IBM Cloud File Storage: ibmc-file-gold-gid
  • IBM Spectrum Scale:
    • For system data and backup data: ibm-spectrum-scale-csi (RWX)
    • For user data: ibm-spectrum-scale-csi (RWO with 4K sector size)
Db2 Warehouse also supports local storage.
Decision Optimization  
Decision Optimization uses the storage that is provisioned when you install Watson Studio.
EDB Postgres       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • Portworx: portworx-db-gp
Execution Engine for Apache Hadoop     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
Financial Services Workbench   Storage information is not currently available for OpenShift Container Storage, NFS, Portworx, or IBM Cloud File Storage.
IBM Match 360 with Watson   Supported storage classes:
  • OpenShift Container Storage:
    • For foundation data: ocs-storagecluster-ceph-rbd
    • For shared volumes: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid
If you use OpenShift Container Storage, the persistence storage class must support block storage. Because block storage supports only ReadWriteOnce (RWO) access, you must also define a storage class for shared volumes (shared_persistence) that supports ReadWriteMany (RWX) access, such as ocs-storagecluster-cephfs.
Jupyter Notebooks with Python 3.7 for GPU  
Jupyter Notebooks with Python 3.7 for GPU uses the storage that is provisioned when you install Watson Studio.
Jupyter Notebooks with R 3.6  
Jupyter Notebooks with R 3.6 uses the storage that is provisioned when you install Watson Studio.
MongoDB       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • Portworx: portworx-db-gp
OpenPages®     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
Planning Analytics     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
Product Master   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid
RStudio® Server with R 3.6   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
SPSS® Modeler   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid
Virtual Data Pipeline   Storage information is not currently available for OpenShift Container Storage, NFS, Portworx, or IBM Cloud File Storage.
Voice Gateway   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • Portworx: portworx-shared-gp3
Watson Assistant     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • Portworx: portworx-watson-assistant-sc
Watson Assistant for Voice Interaction           Watson Assistant for Voice Interaction consists of the following services:
  • Voice Gateway
  • Watson Assistant
  • Watson Speech to Text
  • Watson Text to Speech

Refer to the system requirements for the services that you plan to install.

Watson Discovery       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-ceph-rbd
  • Portworx: portworx-db-gp2-sc
Watson Knowledge Catalog   Supported storage classes:
  • OpenShift Container Storage:
    • ocs-storagecluster-cephfs (specified during installation)
    • ocs-storagecluster-ceph-rbd
  • NFS: managed-nfs-storage
  • Portworx:
    • portworx-shared-gp3 (specified during installation)
    • portworx-cassandra-sc
    • portworx-couchdb-sc
    • portworx-db2-rwo-sc
    • portworx-elastic-sc
    • portworx-metastoredb-sc
    • portworx-gp3-sc
    • portworx-kafka-sc
    • portworx-solr-sc
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Watson Machine Learning   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid
Watson Machine Learning Accelerator     Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp
Watson OpenScale   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Watson Speech to Text       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • Portworx: portworx-shared-gp3
Watson Studio   Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • NFS: managed-nfs-storage
  • Portworx: portworx-shared-gp3
  • IBM Cloud File Storage: ibmc-file-gold-gid or ibm-file-custom-gold-gid
Watson Text to Speech       Supported storage classes:
  • OpenShift Container Storage: ocs-storagecluster-cephfs
  • Portworx: portworx-shared-gp3
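
The Db2 and Db2 Warehouse entries above split storage by data kind: system and backup data use an RWX class, and user data uses an RWO class. The following sketch encodes those values as a lookup table so that the right class can be selected programmatically. The class names and access modes are taken from the entries above; the dictionary keys and the function name are hypothetical.

```python
from typing import Dict, Tuple

# (storage class, access mode) by storage type and data kind, as listed above.
DB2_STORAGE_CLASSES: Dict[str, Dict[str, Tuple[str, str]]] = {
    "OpenShift Container Storage": {
        "system_and_backup": ("ocs-storagecluster-cephfs", "RWX"),
        "user": ("ocs-storagecluster-ceph-rbd", "RWO"),
    },
    "Portworx": {
        "system_and_backup": ("portworx-db2-rwx-sc", "RWX"),
        "user": ("portworx-db2-rwo-sc", "RWO"),
    },
    "IBM Spectrum Scale": {
        "system_and_backup": ("ibm-spectrum-scale-csi", "RWX"),
        "user": ("ibm-spectrum-scale-csi", "RWO"),
    },
}


def db2_storage_class(storage_type: str, data_kind: str) -> Tuple[str, str]:
    """Return the (storage class, access mode) pair for a storage type and data kind."""
    try:
        return DB2_STORAGE_CLASSES[storage_type][data_kind]
    except KeyError:
        raise ValueError(
            f"No entry for storage type {storage_type!r} and data kind {data_kind!r}"
        ) from None


print(db2_storage_class("Portworx", "user"))
```

NFS and IBM Cloud File Storage are omitted from the table because the entries above list a single class for each (managed-nfs-storage and ibmc-file-gold-gid) without a split by data kind.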