System requirements for IBM Cloud Pak for Data

Before you install Cloud Pak for Data on OpenShift®, ensure that your environment meets the following requirements.

Minimum hardware requirements

The following minimum recommendations are for demos and proof-of-concept deployments of Cloud Pak for Data.

The time on all of the nodes must be synchronized within 500 ms.
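One way to spot-check the time synchronization limit on a node is to read the clock offset that chrony reports. The following is a minimal sketch, assuming that the nodes run chronyd and synchronize against the same time source. The function names chrony_offset and within_sync_limit are illustrative and are not part of Cloud Pak for Data. The offset to the time source is only a proxy for the offset between nodes, so the default limit is set to 0.25 seconds (half of 500 ms) as an assumption.

```shell
# Print the clock offset in seconds from `chronyc tracking` output read on stdin.
# chronyc reports the value as a non-negative number followed by "slow" or "fast".
chrony_offset() {
    awk -F': ' '/^System time/ { split($2, parts, " "); print parts[1] }'
}

# Succeed when the offset (seconds) is below the limit.
# The default of 0.25 s is an assumption: two nodes that are each within
# 250 ms of a common source are within 500 ms of each other.
within_sync_limit() {
    awk -v offset="$1" -v limit="${2:-0.25}" 'BEGIN { exit !(offset + 0 < limit + 0) }'
}

# Example use on a node (not run here):
#   within_sync_limit "$(chronyc tracking | chrony_offset)" && echo "clock offset OK"
```

Run the check on every node, because a single node with a large offset is enough to break the requirement.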

Minimum requirements for Red Hat® OpenShift Container Platform Version 3.11.188 or later fixes:

Node role | Hardware | Number of servers | Available vCPU | Memory
Master + infra | x86-64 | 1 master and 1 infrastructure on the same node | 8 vCPU | 32 GB RAM
Worker/compute | x86-64 | 2 worker/compute nodes for NFS; 3 worker/compute nodes for Portworx | 16 vCPU | 64 GB RAM

Minimum requirements for Red Hat OpenShift Container Platform Version 4.3.18 or later fixes:

Node role | Hardware | Number of servers | Available vCPU | Memory
Master + infra | x86-64, ppc64le | 3 master (for high availability) and 3 infrastructure on the same 3 nodes | 4 vCPU | 16 GB RAM
Worker/compute | x86-64, ppc64le | 3 worker/compute nodes | 16 vCPU | 64 GB RAM

Additional cores and nodes should be added depending on the services you need (see System requirements for services for details).

Restriction: On POWER hardware, the maximum supported configuration for each worker node is:
  • 160 vCPUs
  • 512 GB RAM

Production level hardware requirements

When you size your system, think about the types of workloads you plan to run. For example, if you plan to run complex analytics workloads in addition to other resource-intensive workloads, such as ETL jobs, you can expect reduced concurrency levels. Use the minimum configurations provided as a starting point and refine your configuration as needed. Because workloads vary based on a number of factors, use measurements from running real workloads with realistic data when you make your adjustments.

The following minimum recommendations are for small production usage (or proof-of-concept for larger workloads) with Cloud Pak for Data. High availability is always recommended.

The time on all of the nodes must be synchronized within 500 ms.

Node role | Hardware | Number of servers | Available vCPU | Memory
Master + infra | x86-64, ppc64le | 3 master (for high availability) and 3 infrastructure on the same 3 nodes | 8 vCPU | 32 GB RAM
Worker/compute | x86-64, ppc64le | 3+ worker/compute nodes | 16 vCPU | 64 GB RAM (minimum), 128 GB RAM (recommended)
Load balancer | x86-64, ppc64le | 1 load balancer node | 4 vCPU | 4 GB RAM (add another 4 GB of RAM and 100 GB of root storage for access restrictions and security control)

Additional cores and nodes should be added depending on the services you need (see System requirements for services for details).

Restriction: On POWER hardware (Red Hat OpenShift Container Platform Version 4.3 only), the maximum supported configuration for each worker node is:
  • 160 vCPUs
  • 512 GB RAM
Auto-AI requirement: To run Auto-AI experiments, the processor must support the AVX2 instruction set. Otherwise, the experiment run fails.
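
The AVX2 requirement can be checked on a Linux x86-64 node by looking for the avx2 flag in /proc/cpuinfo. This is a minimal sketch under that assumption. The function name has_avx2 is illustrative, and the optional file argument exists only so that the check can be tried against a saved copy of the CPU information.

```shell
# Succeed when the CPU flags in the given cpuinfo file include avx2.
# Defaults to /proc/cpuinfo, which lists CPU flags on Linux x86-64 systems.
has_avx2() {
    grep -qw 'avx2' "${1:-/proc/cpuinfo}"
}

# Example use on a worker node (not run here):
#   has_avx2 || echo "AVX2 not reported by this processor"
```

The word match (-w) keeps a processor that reports only avx from passing the check.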

A load balancer is required when you use three master nodes. The load balancer distributes the traffic load of the master and proxy nodes, securely isolates the master and compute node IP addresses, and facilitates external communication, including access to the management console and API and other requests to the master and proxy nodes.

See Planning your installation for Red Hat OpenShift Container Platform 3.11 or OpenShift Container Platform 4.3 installation overview for details.

Node settings

To ensure that services can run correctly on your cluster, ensure that your nodes meet the requirements specified in Changing required node settings.

You must change the node settings before you install Cloud Pak for Data.

Storage requirements

You need an additional 200 GB of free space in the root file system on all of the nodes. See Storage considerations for the supported storage types and requirements.

Cloud Pak for Data with all services installed can use up to 700 GB of storage space, leaving up to 300 GB of storage space available for user data. You can add additional capacity depending on your user data volume requirements.
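
The free-space requirement on the root file system can be checked on each node with df. The following is a minimal sketch. The function names free_gib and has_free_gib are illustrative, and the sketch assumes that reporting whole GiB (1024-based) is close enough to the GB figure stated above.

```shell
# Print the free space of the file system that holds the given path, in whole GiB.
# Uses POSIX df output (-P) in 1024-byte blocks (-k); column 4 is "Available".
free_gib() {
    df -Pk "${1:-/}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed when the path has at least the required number of GiB free (default 200).
has_free_gib() {
    [ "$(free_gib "$1")" -ge "${2:-200}" ]
}

# Example use on a node (not run here):
#   has_free_gib / 200 || echo "less than 200 GiB free in the root file system"
```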

Disk requirements

To prepare your storage disks, ensure that you have good I/O performance, and prepare the disks for encryption.

I/O performance
To ensure that the storage partition has good disk I/O performance, run the disk latency test and the disk throughput test:
Disk latency test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=4096 count=1000 oflag=dsync

The result must be better than or comparable to: 4096000 bytes (4.1 MB, 3.9 MiB) copied, 1.5625 s, 2.5 MB/s

Disk throughput test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=1G count=1 oflag=dsync

The result must be better than or comparable to: 1073741824 bytes (1.1 GB) copied, 5.14444 s, 209 MB/s

See Testing I/O performance for IBM Cloud Pak for Data for details.

Some storage types might have more stringent I/O requirements. For details, see Storage considerations.
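
The comparison against the reference results for the disk latency and disk throughput tests can be scripted. The following is a minimal sketch, assuming GNU dd, which prints a summary line of the form shown above on standard error. The function names dd_seconds and meets_baseline are illustrative, and the sketch treats "comparable" strictly as "no slower than the reference time", which is an assumption.

```shell
# Extract the elapsed seconds from a GNU dd summary line such as
# "4096000 bytes (4.1 MB, 3.9 MiB) copied, 1.5625 s, 2.5 MB/s".
dd_seconds() {
    printf '%s\n' "$1" | awk -F', ' '{ split($(NF - 1), parts, " "); print parts[1] }'
}

# Succeed when the measured time is no slower than the baseline time (both in seconds).
meets_baseline() {
    awk -v measured="$1" -v baseline="$2" 'BEGIN { exit !(measured + 0 <= baseline + 0) }'
}

# Example use for the latency test (not run here; /PVC_mount_path is the
# placeholder used in the commands above):
#   summary=$(LC_ALL=C dd if=/dev/zero of=/PVC_mount_path/testfile \
#       bs=4096 count=1000 oflag=dsync 2>&1 | tail -n 1)
#   meets_baseline "$(dd_seconds "$summary")" 1.5625 && echo "latency OK"
```

Because both tests write a fixed amount of data, comparing elapsed time is equivalent to comparing the reported throughput.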

Encryption with Linux® Unified Key Setup
To ensure that your data within Cloud Pak for Data is stored securely, you can encrypt your disks. If you use Linux Unified Key Setup-on-disk-format (LUKS) for this purpose, then you must enable LUKS and format the disks with XFS before you install Cloud Pak for Data.
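
The LUKS and XFS preparation can be outlined as three commands. The following is a minimal sketch that only prints the commands, because running them destroys any data on the disk. The function name luks_xfs_plan, the device /dev/sdX, and the mapper name cpd_data are hypothetical placeholders. The exact options that your environment needs (for example, key handling) are not covered here.

```shell
# Print (do not run) the commands that encrypt a disk with LUKS and format it with XFS.
# $1 is the block device and $2 is the device-mapper name; both are placeholders.
luks_xfs_plan() {
    device="$1"
    name="$2"
    printf 'cryptsetup luksFormat %s\n' "$device"       # initialize the LUKS container
    printf 'cryptsetup open %s %s\n' "$device" "$name"  # unlock it as /dev/mapper/<name>
    printf 'mkfs.xfs /dev/mapper/%s\n' "$name"          # create the XFS file system
}

# Example use (prints the plan only):
#   luks_xfs_plan /dev/sdX cpd_data
```

Review the printed commands before you run them as root, and confirm that the device name points at the intended disk.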

Software requirements

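  • A Red Hat Enterprise Linux cluster with Red Hat OpenShift Container Platform installed on it, at one of the versions listed in the hardware requirements (Version 3.11.188 or later fixes, or Version 4.3.18 or later fixes).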
    Important: Cloud Pak for Data includes entitlement to both the Red Hat OpenShift Container Platform and Red Hat Enterprise Linux. You can download Red Hat OpenShift either from IBM Passport Advantage® or directly from the Red Hat Customer Portal. See the Cloud Pak for Data readme on IBM Passport Advantage for details:
    • For Cloud Pak for Data Enterprise Edition, see part number CC62LEN.
    • For Cloud Pak for Data Standard Edition, see part number CC62VEN.
  • The OpenShift cluster should include:
    • The appropriate container runtime for your environment:
      Container runtime | OpenShift version | Storage type | Storage requirement | Notes
      CRI-O version 1.13 or later (recommended) | 4.3, 3.11 | NFS, IBM Cloud File Storage, Portworx | 200 GB | Thread info
      Docker with the overlay2 driver | 3.11 | NFS | 200 GB | Overlay info
      Attention: For the overlay2 driver, you must enable d_type=true and ftype=1; otherwise, the installation fails. See Use the OverlayFS storage driver for details. After you install OpenShift, run docker info and verify that the output includes Storage Driver: overlay2 and Supports d_type: true.
      Requirement: To ensure proper operation of services, configure the OpenShift cluster default thread count to permit containers with 12000 pids on every compute node. To change the default settings on Docker, append --default-pids-limit=12000 to OPTIONS= in the /etc/sysconfig/docker file. For CRI-O, add (or edit if the line already exists) pids_limit = 12000 under the [crio.runtime] section in the /etc/crio/crio.conf file.
    • Kubernetes Incubator Metrics Server to gather usage metrics for pods and nodes.
  • A registry to store images on, with a minimum of 300 GB of storage space. The registry must be accessible from all of the nodes in the cluster. Additionally, all of the nodes must have permission to push to and pull from the registry. See Setting up your registry server for details.
  • A Linux or macOS client workstation from which to run the installation. The workstation does not have to be part of the OpenShift cluster, but it must be able to run oc login against the cluster. If your OpenShift cluster is air-gapped, ensure that this workstation has CLI 3.11 or CLI 4.3 installed so that it can download the required packages from the hosted IBM registry. You then copy the installation package onto a machine that can run oc login against the cluster.
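
The container runtime settings described in the Attention and Requirement notes above can be verified on a compute node. The following is a minimal sketch. The function names crio_pids_limit and docker_overlay_ok are illustrative. The sketch assumes that pids_limit appears on a line of its own in /etc/crio/crio.conf, as the Requirement note describes, and that docker info prints the Storage Driver and Supports d_type lines quoted in the Attention note.

```shell
# Print the pids_limit value found in a CRI-O configuration file.
# Defaults to /etc/crio/crio.conf; the expected value is 12000.
crio_pids_limit() {
    awk -F'=' '/^[ \t]*pids_limit[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2 }' \
        "${1:-/etc/crio/crio.conf}"
}

# Succeed when `docker info` output on stdin reports overlay2 with d_type support.
docker_overlay_ok() {
    awk '/Storage Driver: overlay2/ { driver = 1 }
         /Supports d_type: true/    { dtype = 1 }
         END { exit !(driver && dtype) }'
}

# Example use on a compute node (not run here):
#   [ "$(crio_pids_limit)" = "12000" ] || echo "pids_limit is not 12000"
#   docker info 2>/dev/null | docker_overlay_ok || echo "overlay2 with d_type not confirmed"
```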

Cloud Pak for Data supports the same operating system requirements as OpenShift. See operating system requirements 3.11 or installation and update 4.3 for details.

Security requirements

Supported web browsers

  • Mozilla Firefox (recommended) - Version 54 and higher
  • Google Chrome - Version 60 and higher