System requirements for IBM Cloud Pak for Data

Before you install Cloud Pak for Data on OpenShift®, ensure that your environment meets the following requirements.

Assembly versions

The following versions of the Cloud Pak for Data control plane assembly are available. Use the following table to determine which versions of Red Hat® OpenShift Container Platform each assembly can run on and which versions of the Cloud Pak for Data command-line interface (cpd-cli) you can use to install the assembly.

Cloud Pak for Data assembly version 3.5.1
  • OpenShift 3.11: Compatible. Supported versions of the cpd-cli: 3.5.1, 3.5.2, 3.5.3, 3.5.4, 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.5: Compatible. Supported versions of the cpd-cli: 3.5.1, 3.5.2, 3.5.3, 3.5.4, 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.6: Compatible. Supported versions of the cpd-cli: 3.5.2, 3.5.3, 3.5.4, 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.8: Not compatible.
Cloud Pak for Data assembly version 3.5.2
  • OpenShift 3.11: Compatible. Supported versions of the cpd-cli: 3.5.1, 3.5.2, 3.5.3, 3.5.4, 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.5: Compatible. Supported versions of the cpd-cli: 3.5.1, 3.5.2, 3.5.3, 3.5.4, 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.6: Compatible. Supported versions of the cpd-cli: 3.5.2, 3.5.3, 3.5.4, 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.8: Not compatible.
Cloud Pak for Data assembly version 3.5.3 (See the Important note after this table.)
  • OpenShift 3.11: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.5: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.6: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.8: Not compatible.
Cloud Pak for Data assembly version 3.5.4 (See the Important note after this table.)
  • OpenShift 3.11: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.5: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.6: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.8: Not compatible.
Cloud Pak for Data assembly version 3.5.5 (See the Important note after this table.)
  • OpenShift 3.11: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.5: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.6: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.8: Not compatible.
Cloud Pak for Data assembly version 3.5.6 (See the Important note after this table.)
  • OpenShift 3.11: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.5: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.6: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.8: Not compatible.
Cloud Pak for Data assembly version 3.5.7 (See the Important note after this table.)
  • OpenShift 3.11: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.5: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.6: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10
  • OpenShift 4.8: Compatible. Supported versions of the cpd-cli: 3.5.6, 3.5.7, 3.5.8, 3.5.9, 3.5.10

For information about service assemblies, see System requirements for services.

Important: (x86_64 clusters only.) If you are upgrading from Cloud Pak for Data Version 3.0.1, you must install cpd-3.0.1-lite-patch-8 before you upgrade to Version 3.5.3 or later refreshes of the Cloud Pak for Data control plane. For details on installing this patch, see:

Hardware requirements

You can install Cloud Pak for Data on a Red Hat OpenShift Container Platform cluster. For information about the supported versions of Red Hat OpenShift Container Platform, see Software requirements.

The following requirements are the minimum recommendations for a small, stable deployment of Cloud Pak for Data. Use the minimum recommended configuration as a starting point for your cluster configuration. If you use fewer resources, you are likely to encounter stability problems. (If you are sizing a cluster for a proof of concept or for very limited use, you can use the guidance in Limited-use configurations for Cloud Pak for Data.)

The size of your cluster depends on:
  • The required components that you need to install
  • The services that you plan to install

    The sizing requirements for services are available in System requirements for services. If you only install a few services with small vCPU and memory requirements, you might not need additional resources. However, if you plan to install multiple services or services with large footprints, add the appropriate amount of vCPU and memory to the minimum recommendations below.

    Important: Some services require AVX instruction sets. Review the System requirements for services to determine whether you need CPUs that support AVX instruction sets.
  • The types of workloads that you plan to run

    For example, if you plan to run complex analytics workloads in addition to other resource-intensive workloads, such as ETL jobs, expect reduced concurrency levels unless you add computing power to your cluster.

    Because workloads vary based on a number of factors, use measurements from running real workloads with realistic data to size your cluster.

Important: Work with your IBM Sales representative to size your cluster.

It is strongly recommended that you deploy Cloud Pak for Data on a highly available cluster.

The following configuration has been tested and validated by IBM. However, Red Hat OpenShift Container Platform supports other configurations. If the configuration in the following table does not work in your environment, you can adapt the configuration based on the guidance in the Red Hat OpenShift documentation. (Links to the relevant Red Hat OpenShift documentation are available in Software requirements.) In general, Cloud Pak for Data is primarily concerned with the resources that are available on your worker nodes.

Master + infra
  • Hardware: x86-64, z14 or later (4.5 or later), or ppc64le (4.5 or later)
  • Number of servers: 3 master (for high availability) and 3 infrastructure on the same 3 nodes
  • Minimum available vCPU: 8 vCPU per node
  • Minimum memory: 32 GB RAM per node
Worker/compute
  • Hardware: x86-64, z14 or later (4.5 or later), or ppc64le (4.5 or later)
  • Number of servers: 3+ worker/compute nodes
  • Minimum available vCPU: 16 vCPU per node
  • Minimum memory: 64 GB RAM per node (minimum); 128 GB RAM per node (recommended)
Load balancer
  • Hardware: x86-64, z14 or later (4.5 or later), or ppc64le (4.5 or later)
  • Number of servers: 2 load balancer nodes
  • Minimum available vCPU: 4 vCPU per node
  • Minimum memory: 4 GB RAM per node

Add another 4 GB of RAM and 100 GB of root storage for access restrictions and security control.

See Network topology requirements.
Load balancer
A load balancer is required when using three master nodes. The load balancer distributes the traffic load of the master and proxy nodes, securely isolates the master and compute node IP addresses, and facilitates external communication, including accessing the management console and API or making other requests to the master and proxy nodes.
Bastion node
OpenShift 4.5 and 4.6 clusters: If you are installing in a restricted network, set up one internet-facing machine as a bastion node to download the software images. For details, see:
Restriction: On POWER hardware the maximum supported configuration for each worker node is:
  • 160 vCPU
  • 512 GB RAM

Cluster node settings

The time on all of the nodes must be synchronized within 500 ms.
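
As a rough illustration of the 500 ms requirement, the following sketch computes the spread between node clock offsets. It assumes that you have already collected one offset per node, in milliseconds relative to a common reference, as two-column input (node name, offset). The input format and the function names max_clock_spread_ms and clocks_within_limit are hypothetical, and how you collect the offsets depends on the time service that your nodes use.

```shell
# Hypothetical helper: reads "node offset_ms" pairs on stdin and prints the
# difference between the largest and smallest offset, in milliseconds.
max_clock_spread_ms() {
    awk 'NF >= 2 {
             v = $2 + 0
             if (!seen || v < min) min = v
             if (!seen || v > max) max = v
             seen = 1
         }
         END { if (seen) print max - min; else print 0 }'
}

# Hypothetical helper: succeeds when the spread is within the 500 ms limit.
clocks_within_limit() {
    spread=$(max_clock_spread_ms)
    awk -v s="$spread" 'BEGIN { exit !(s + 0 <= 500) }'
}
```

For example, printf 'worker-0 -120\nworker-1 35\nworker-2 410\n' | max_clock_spread_ms prints 530, which exceeds the limit, so clocks_within_limit fails for the same input.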

To ensure that services can run correctly on your cluster, ensure that your nodes meet the requirements specified in Changing required node settings. You must change the node settings before you install Cloud Pak for Data.

Storage requirements

A Cloud Pak for Data deployment requires several types of storage:
Container image registry
The container image registry stores the container images for the Cloud Pak for Data control plane and services.
Supported configurations

You can use either the Red Hat OpenShift internal registry or an external registry server. (If you are installing Cloud Pak for Data on an air-gapped cluster, you must use an external registry server.)

The registry must be accessible from all of the nodes in the cluster and all of the nodes must have permission to push to and pull from the container image registry.

See Setting up your registry server for details.

Sizing
A minimum of 300 GB of storage space.
Local storage for container images
Each node on your cluster must have local storage for the container images that are running on that node.
Storage location
The container images are stored in the root file system on the nodes. The exact location of the storage depends on the version of Red Hat OpenShift that you're running:
  • On Version 3.11, the location depends on the container runtime:
    • If you are using Docker, the local copies of images are stored in /var/lib/docker.
    • If you are using CRI-O, the local copies of images are stored in /var/lib/containers.
  • On Version 4.5 and Version 4.6, local copies of the images are stored in /var/lib/containers.

See Container runtime for information about Docker and CRI-O.

Sizing
A minimum of 300 GB of storage space per node.
Shared persistent storage for services
The Cloud Pak for Data control plane and services store data in shared persistent storage.
Supported storage types
The platform supports several different types of shared storage:
Red Hat OpenShift Container Storage
Version: 4.5 or later fixes

Available in the IBM Storage Suite for IBM Cloud® Paks.

IBM Spectrum® Scale Container Native
Version: 5.1.0.3 or later fixes

Available in the IBM Storage Suite for IBM Cloud Paks.

Network File System (NFS)
Version: 4
Portworx
Version:
  • 2.5.0.1 or later is required for Red Hat OpenShift Version 3.11
  • 2.6.2 or later is required for Red Hat OpenShift Version 4.5 and 4.6
IBM Cloud File Storage
Version: Not applicable
When you plan your environment, ensure that you review the storage types that are supported by the other software that you must install:
Sizing
The minimum amount of storage depends on the type of storage that you plan to use. For details, see Storage considerations.

As a general rule, Cloud Pak for Data with all services installed can use up to 700 GB of storage space. Review the Storage considerations to ensure that you have sufficient storage space available for user data based on the type of storage that you select. You can add additional capacity depending on your user data volume requirements.

The Cloud Pak for Data control plane supports all of the shared persistent storage types that are supported by the platform. When you install the control plane you must specify the appropriate storage class:
OpenShift Container Storage
Required storage class:
  • ocs-storagecluster-cephfs
IBM Spectrum Scale Container Native
Required storage class:
  • ibm-spectrum-scale-sc
NFS
Specify a storage class with ReadWriteMany (RWX) access.
Portworx
Required storage class:
  • portworx-shared-gp3
IBM Cloud File Storage
IBM Cloud deployments only.
Supported storage classes:
  • ibmc-file-gold-gid
  • ibm-file-custom-gold-gid
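
The mapping above can be sketched as a small lookup that you might use in an installation script. This is only an illustration: the function name required_storage_class and the short storage type labels (ocs, spectrum-scale, portworx) are hypothetical. NFS and IBM Cloud File Storage are left out because the text does not list a single required class for them.

```shell
# Hypothetical helper: maps a storage type to the storage class that is
# listed above as required for the control plane.
required_storage_class() {
    case "$1" in
        ocs)            echo "ocs-storagecluster-cephfs" ;;
        spectrum-scale) echo "ibm-spectrum-scale-sc" ;;
        portworx)       echo "portworx-shared-gp3" ;;
        *)              return 1 ;;   # no single required class for this type
    esac
}

# Example (requires a logged-in oc session): confirm that the class exists.
# sc=$(required_storage_class portworx) && oc get storageclass "$sc"
```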

Disk requirements

To prepare your storage disks, ensure that you have good I/O performance, and prepare the disks for encryption.

I/O performance

When I/O performance is insufficient, services that are handling a heavy load can perform poorly or make the cluster unstable. For example, functions can fail with timeouts. The following I/O performance requirements are based on repeated workloads that test performance on the platform and that were validated in various cloud environments. The current requirements are based on the performance of writing data to representative storage locations using two block sizes (4 KB and 1 GB). These tests use the dd command-line utility. Use the MB/sec metric from the tests and ensure that your test result is comparable to or better than the targets.

To ensure that the storage partition has good disk I/O performance, run the following tests.

Disk latency test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=4096 count=1000 oflag=dsync

The value must be comparable to or better than: 2.5 MB/s.

Disk throughput test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=1G count=1 oflag=dsync

The value must be comparable to or better than: 209 MB/s.

Dynamic variations in workloads, access patterns in your environment, and the impact of the network on accessing your volumes can impact results. Repeat these tests at different times to understand the I/O performance patterns. For details, see Testing I/O performance for IBM Cloud Pak for Data.
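
If you record the MB/s values from repeated runs, a small helper can make the comparison against the targets consistent. The following sketch assumes that you pass in the measured value yourself; the function names meets_target and report_result are hypothetical. It applies a strict greater-than-or-equal check, which is narrower than "comparable to or better than", so treat a near miss as a prompt to rerun the test rather than as a failure.

```shell
# Targets from the tests above, in MB/s.
LATENCY_TARGET_MBS=2.5
THROUGHPUT_TARGET_MBS=209

# meets_target MEASURED TARGET: succeeds when MEASURED >= TARGET.
meets_target() {
    awk -v m="$1" -v t="$2" 'BEGIN { exit !(m + 0 >= t + 0) }'
}

# report_result LABEL MEASURED TARGET: prints a one-line verdict.
report_result() {
    if meets_target "$2" "$3"; then
        echo "$1: $2 MB/s meets the $3 MB/s target"
    else
        echo "$1: $2 MB/s is below the $3 MB/s target"
    fi
}
```

For example, report_result "Disk latency test" 3.1 "$LATENCY_TARGET_MBS" reports that the measured value meets the target.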

Some storage types might have more stringent I/O requirements. For details, see Storage considerations.

Note: If your storage volumes are remote, network speed can be a key factor in your I/O performance. For good I/O performance, ensure that you have sufficient network speed, as described in Storage considerations.
Encryption with Linux® Unified Key Setup
To ensure that your data within Cloud Pak for Data is stored securely, you can encrypt your disks. If you use Linux Unified Key Setup-on-disk-format (LUKS) for this purpose, then you must enable LUKS and format the disks with XFS before you install Cloud Pak for Data.

Software requirements

You must have the following software to install Cloud Pak for Data:

Red Hat OpenShift cluster
Important: Cloud Pak for Data includes entitlement to both the Red Hat OpenShift Container Platform and Red Hat Enterprise Linux. You can download Red Hat OpenShift either from IBM Passport Advantage® or directly from the Red Hat Customer Portal. See the Cloud Pak for Data readme on IBM Passport Advantage for details:
  • For Cloud Pak for Data Enterprise Edition, see part number CC8CLEN.
  • For Cloud Pak for Data Standard Edition, see part number CC8CNEN.
If you are deploying Cloud Pak for Data using a Quick Start or Terraform, a supported version of Red Hat OpenShift Container Platform is installed as part of the deployment. For details, see Cloud deployment environments.
The following versions of Red Hat OpenShift Container Platform are supported. (Cloud Pak for Data supports the same operating system requirements as Red Hat OpenShift Container Platform.)
Version 3.11.188 or later fixes
  • Learn more: For details, see the Red Hat OpenShift documentation.
  • Cluster sizing guidance: Refer to the Cloud Pak for Data sizing recommendations as you configure your cluster.
Version 4.5 or later fixes
  • Learn more: For details, see the Red Hat OpenShift documentation.
  • Cluster sizing guidance: Refer to the Cloud Pak for Data sizing recommendations as you configure your cluster.
Version 4.6.1 or later fixes (Refresh 2 or later)
  • Learn more: For details, see the Red Hat OpenShift documentation.
  • Cluster sizing guidance: Refer to the Cloud Pak for Data sizing recommendations as you configure your cluster.
Version 4.8.0 or later fixes
  • Learn more: For details, see the Red Hat OpenShift documentation.
  • Cluster sizing guidance: Refer to the Cloud Pak for Data sizing recommendations as you configure your cluster.
Container runtime
Your Red Hat OpenShift Container Platform cluster must include a container runtime. Ensure that you have the appropriate runtime for your environment:
CRI-O version 1.13 or later (recommended)
  • OpenShift versions: 3.11, 4.5, 4.6
  • Notes: Ensure that you adjust the CRI-O container settings as specified in Changing required node settings.
Docker with the overlay2 driver
  • OpenShift version: 3.11
  • Notes:
    • If you use Docker, you must use NFS storage.
    • Ensure that you adjust the Docker container settings as specified in Changing required node settings.
    • For the overlay2 driver, you must enable d_type=true and ftype=1. Otherwise, the installation fails. See Use the OverlayFS storage driver for details. After you install OpenShift, enter docker info and verify that the output shows Storage Driver: overlay2 and Supports d_type: true.
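
The docker info verification can be scripted along the following lines. This is a minimal sketch: the function name docker_overlay2_ok is hypothetical, and it only searches the docker info output for the two settings named above.

```shell
# Hypothetical helper: reads `docker info` output on stdin and succeeds only
# when both required settings are present.
docker_overlay2_ok() {
    info=$(cat)
    printf '%s\n' "$info" | grep -q 'Storage Driver: overlay2' &&
        printf '%s\n' "$info" | grep -q 'Supports d_type: true'
}

# On a node that runs Docker:
# docker info 2>/dev/null | docker_overlay2_ok && echo "overlay2 settings look correct"
```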

Kubernetes Incubator Metrics Server
If you want to gather usage metrics for your pods and nodes, you must install Kubernetes Incubator Metrics Server.
Important: This software is required if you want to use the platform management features in Cloud Pak for Data.

Installation node

You can run the Cloud Pak for Data installation from a workstation that can connect to the OpenShift cluster. The workstation does not have to be part of the cluster. However, you must be able to run oc login from the workstation to connect to the cluster.
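
Before you start, you can confirm that the oc client is available on the workstation. The following sketch uses a hypothetical helper named have_command. It does not log in for you, because the cluster address and credentials are specific to your environment.

```shell
# Hypothetical helper: succeeds when the named command is on the PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command oc; then
    echo "oc found: run 'oc login' and then 'oc whoami' to confirm the connection"
else
    echo "oc not found: install the OpenShift CLI before you continue"
fi
```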

You can run the installation from the following types of machines:
  • Mac OS
  • Linux
  • IBM Power® Systems
  • IBM Z
Ensure that the workstation has the appropriate version of the Red Hat OpenShift Container Platform CLI:

In addition, if you plan to run air-gapped installations, ensure that the installation node has sufficient storage space in the temporary directory. A minimum of 300 GB of storage space is recommended. If you don't have sufficient storage space, you might encounter problems when downloading the images to the installation node.

Tip: If the default temporary directory doesn't have sufficient space, you can use the following command to change the temporary directory to a directory with more space:
export TMPDIR=directory_with_sufficient_space

This change only applies to your current command-line session.
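
To check whether the temporary directory has the recommended 300 GB available before you download images, you can use a check like the following. This is a sketch under stated assumptions: the function name has_free_space_gb is hypothetical, the temporary directory is assumed to default to /tmp when TMPDIR is not set, and the parsing assumes that the file system name in the df output contains no spaces.

```shell
# Hypothetical helper: succeeds when DIR has at least MIN_GB gigabytes available.
# Usage: has_free_space_gb DIR MIN_GB
has_free_space_gb() {
    # df -P prints POSIX-format output; -k reports sizes in 1024-byte blocks.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    awk -v a="$avail_kb" -v g="$2" 'BEGIN { exit !(a + 0 >= g * 1024 * 1024) }'
}

tmp_dir="${TMPDIR:-/tmp}"
if has_free_space_gb "$tmp_dir" 300; then
    echo "$tmp_dir has at least 300 GB available"
else
    echo "$tmp_dir has less than 300 GB available; consider setting TMPDIR"
fi
```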

For details on the installation node, see Preparing your installation node.

Security requirements

For details on security, see Security on Cloud Pak for Data.

Supported web browsers

  • Mozilla Firefox (recommended) - Version 69 and higher
  • Google Chrome - Version 80 and higher