System requirements for IBM Cloud Pak for Data
Before you install Cloud Pak for Data on OpenShift®, ensure that your environment meets the following requirements.
- Assembly versions
- Hardware requirements
- Cluster node settings
- Storage requirements
- Disk requirements
- Software requirements
- Installation node
- Security requirements
- Supported web browsers
Assembly versions
The following versions of the Cloud Pak for Data control plane assembly are available. Use the following table to determine which versions of Red Hat® OpenShift Container Platform each assembly can run on and which version of the Cloud Pak for Data command-line interface (cpd-cli) you can use to install the assembly.
Cloud Pak for Data assembly version | OpenShift 3.11 | OpenShift 4.5 | OpenShift 4.6 | OpenShift 4.8 |
---|---|---|---|---|
3.5.1 | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Not compatible. |
3.5.2 | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Not compatible. |
3.5.3 (See the Important note after this table.) | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Not compatible. |
3.5.4 (See the Important note after this table.) | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Not compatible. |
3.5.5 (See the Important note after this table.) | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Not compatible. |
3.5.6 (See the Important note after this table.) | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Not compatible. |
3.5.7 (See the Important note after this table.) | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: | Compatible. Supported version of the cpd-cli: |
For information about service assemblies, see System requirements for services.
Important: Install cpd-3.0.1-lite-patch-8 before you upgrade to Version 3.5.3 or later refreshes of the Cloud Pak for Data control plane. For details on installing this patch, see:
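The compatibility table above can be encoded as a small lookup for use in pre-installation scripts. The following is a minimal sketch, not an IBM-provided tool: the function name `cpd_assembly_compatible` and its return codes are hypothetical, and it covers only the assembly and OpenShift versions that appear in the table.

```shell
# Hypothetical helper that mirrors the compatibility table above.
# Usage: cpd_assembly_compatible ASSEMBLY_VERSION OPENSHIFT_VERSION
# Returns 0 if the table lists the pair as compatible,
#         1 if the table lists it as not compatible,
#         2 if either version does not appear in the table.
cpd_assembly_compatible() {
  local assembly="$1" openshift="$2"
  case "$assembly" in
    3.5.1|3.5.2|3.5.3|3.5.4|3.5.5|3.5.6)
      case "$openshift" in
        3.11|4.5|4.6) return 0 ;;
        4.8)          return 1 ;;
        *)            return 2 ;;
      esac ;;
    3.5.7)
      case "$openshift" in
        3.11|4.5|4.6|4.8) return 0 ;;
        *)                return 2 ;;
      esac ;;
    *) return 2 ;;
  esac
}
```

For example, `cpd_assembly_compatible 3.5.7 4.8 && echo "compatible"` prints `compatible`, while the same call with `3.5.6` returns a non-zero status.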
Hardware requirements
You can install Cloud Pak for Data on a Red Hat OpenShift Container Platform cluster. For information about the supported versions of Red Hat OpenShift Container Platform, see Software requirements.
The following requirements are the minimum recommendations for a small, stable deployment of Cloud Pak for Data. Use the minimum recommended configuration as a starting point for your cluster configuration. If you use fewer resources, you are likely to encounter stability problems. (If you are sizing a cluster for a proof of concept or for very limited use, you can use the guidance in Limited-use configurations for Cloud Pak for Data.)
The resources that you need beyond the minimum depend on the following factors:
- The required components that you need to install
- The services that you plan to install
The sizing requirements for services are available in System requirements for services. If you install only a few services with small vCPU and memory requirements, you might not need additional resources. However, if you plan to install multiple services or services with large footprints, add the appropriate amount of vCPU and memory to the minimum recommendations below.
Important: Some services require AVX instruction sets. Review the System requirements for services to determine whether you need CPUs that support AVX instruction sets.
- The types of workloads that you plan to run
For example, if you plan to run complex analytics workloads in addition to other resource-intensive workloads, such as ETL jobs, you can expect reduced concurrency levels unless you add computing power to your cluster.
Because workloads vary based on a number of factors, use measurements from running real workloads with realistic data to size your cluster.
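The "add service requirements to the minimum" arithmetic can be sketched as follows. This is an illustration only: the function names are hypothetical, and the per-service vCPU figures in the usage note are made-up placeholders, not values from System requirements for services.

```shell
# total_vcpu BASE EXTRA...
# Adds the vCPU needs of each planned service (EXTRA...) to a baseline.
total_vcpu() {
  local total="$1"
  shift
  local extra
  for extra in "$@"; do
    total=$((total + extra))
  done
  echo "$total"
}

# worker_nodes_needed TOTAL_VCPU VCPU_PER_NODE
# Rounds up to the number of worker nodes that provide TOTAL_VCPU.
worker_nodes_needed() {
  echo $(( ($1 + $2 - 1) / $2 ))
}
```

With the minimum of three worker nodes at 16 vCPU each (48 vCPU) and two hypothetical services that need 8 and 16 vCPU, `total_vcpu 48 8 16` prints `72`, and `worker_nodes_needed 72 16` prints `5`. Treat the result as a starting point and confirm it with measurements from real workloads.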
It is strongly recommended that you deploy Cloud Pak for Data on a highly available cluster.
The following configuration has been tested and validated by IBM. However, Red Hat OpenShift Container Platform supports other configurations. If the configuration in the following table does not work in your environment, you can adapt the configuration based on the guidance in the Red Hat OpenShift documentation. (Links to the relevant Red Hat OpenShift documentation are available in Software requirements.) In general, Cloud Pak for Data is primarily concerned with the resources that are available on your worker nodes.
Node role | Hardware | Number of servers | Minimum available vCPU | Minimum memory |
---|---|---|---|---|
Master + infra | | 3 master (for high availability) and 3 infrastructure on the same 3 nodes | 8 vCPU per node | 32 GB RAM per node |
Worker/compute | | 3+ worker/compute nodes | 16 vCPU per node | |
Load balancer | | 2 load balancer nodes | 4 vCPU per node | 4 GB RAM per node. Add another 4 GB of RAM and 100 GB of root storage for access restrictions and security control. See Network topology requirements. |
- Load balancer
- A load balancer is required when using three master nodes. The load balancer distributes the traffic load of the master and proxy nodes, securely isolates the master and compute node IP addresses, and facilitates external communication, including accessing the management console and API or making other requests to the master and proxy nodes.
- Bastion node
- OpenShift 4.5 and 4.6 clusters: If you are installing in a restricted network, set up one internet-facing machine as a bastion node to download the software images. For details, see:
- Version 4.5: Mirroring images for a disconnected installation
- Version 4.6: Mirroring images for a disconnected installation
- 160 vCPU
- 512 GB RAM
Cluster node settings
The time on all of the nodes must be synchronized within 500 ms.
To ensure that services can run correctly on your cluster, ensure that your nodes meet the requirements specified in Changing required node settings. You must change the node settings before you install Cloud Pak for Data.
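One way to check the 500 ms synchronization requirement is to collect each node's clock offset from a common reference and confirm that the spread between the fastest and slowest clocks stays within the limit. The sketch below assumes that you have already gathered the offsets in milliseconds by whatever means your time service provides; the function name `clock_spread_ok` is hypothetical.

```shell
# clock_spread_ok OFFSET_MS...
# Takes one clock offset per node, in milliseconds relative to a common
# reference (values can be negative or fractional).
# Returns 0 if the largest and smallest offsets differ by at most 500 ms,
#         1 if the spread is larger, and 2 if no offsets were given.
clock_spread_ok() {
  [ "$#" -gt 0 ] || return 2
  printf '%s\n' "$@" | awk -v limit=500 '
    NR == 1       { min = $1 + 0; max = $1 + 0 }
    ($1 + 0) < min { min = $1 + 0 }
    ($1 + 0) > max { max = $1 + 0 }
    END           { exit !((max - min) <= limit) }
  '
}
```

For example, `clock_spread_ok 12 -40 310` succeeds because the spread is 350 ms, whereas `clock_spread_ok -300 250` fails because the spread is 550 ms.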
Storage requirements
Required storage | Details |
---|---|
Container image registry | The container image registry stores the container images for the Cloud Pak for Data control plane and services. |
Local storage for container images | Each node on your cluster must have local storage for the container images that are running on that node. |
Shared persistent storage for services | The Cloud Pak for Data control plane and services store data in shared persistent storage. |
- OpenShift Container Storage
- Required storage class: ocs-storagecluster-cephfs
- IBM Spectrum Scale Container Native
- Required storage class: ibm-spectrum-scale-sc
- NFS
- Specify a storage class with ReadWriteMany (RWX) access.
- Portworx
- Required storage class: portworx-shared-gp3
- IBM Cloud File Storage
- IBM Cloud deployments only. Supported storage classes: ibmc-file-gold-gid, ibm-file-custom-gold-gid
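To see which of the named storage classes above already exist on a cluster, you can filter the cluster's storage class list. The following sketch makes two assumptions: the helper name `list_known_cpd_storage_classes` is hypothetical, and the input has one storage class per line, either as a bare name or in the `type/name` form that `oc get storageclass -o name` prints. NFS is not covered because it has no fixed class name.

```shell
# list_known_cpd_storage_classes
# Reads storage class names from stdin (bare names or "type/name") and
# prints only the ones that this section names as required or supported.
# Returns non-zero if none of them are present.
list_known_cpd_storage_classes() {
  sed 's|.*/||' | grep -x \
    -e 'ocs-storagecluster-cephfs' \
    -e 'ibm-spectrum-scale-sc' \
    -e 'portworx-shared-gp3' \
    -e 'ibmc-file-gold-gid' \
    -e 'ibm-file-custom-gold-gid'
}
```

A typical invocation would be `oc get storageclass -o name | list_known_cpd_storage_classes`. A match confirms only that the class exists, not that the underlying storage meets the I/O requirements.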
Disk requirements
To prepare your storage disks, ensure that you have good I/O performance, and prepare the disks for encryption.
- I/O performance
- When I/O performance is insufficient, services that are handling a heavy load can perform poorly or destabilize the cluster, for example through functional failures with timeouts. The following I/O performance requirements are based on repeated workloads that test performance on the platform and that were validated in various cloud environments. The current requirements are based on the performance of writing data to representative storage locations using two block sizes (4 KB and 1 GB). These tests use the dd command-line utility. Use the MB/sec metric from the tests and ensure that your test result is comparable to or better than the targets.
To ensure that the storage partition has good disk I/O performance, run the following tests.
- Disk latency test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=4096 count=1000 oflag=dsync
The value must be comparable to or better than: 2.5 MB/s.
- Disk throughput test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=1G count=1 oflag=dsync
The value must be comparable to or better than: 209 MB/s.
Dynamic variations in workloads, access patterns in your environment, and the impact of the network on accessing your volumes can impact results. Repeat these tests at different times to understand the I/O performance patterns. For details, see Testing I/O performance for IBM Cloud Pak for Data.
Some storage types might have more stringent I/O requirements. For details, see Storage considerations.
Note: If your storage volumes are remote, network speed can be a key factor in your I/O performance. For good I/O performance, ensure that you have sufficient network speed, as described in Storage considerations.
- Encryption with Linux® Unified Key Setup
- To ensure that your data within Cloud Pak for Data is stored securely, you can encrypt your disks. If you use Linux Unified Key Setup-on-disk-format (LUKS) for this purpose, then you must enable LUKS and format the disks with XFS before you install Cloud Pak for Data.
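To compare the results of the dd tests in the I/O performance section against the 2.5 MB/s and 209 MB/s targets, you can extract the rate from the dd summary line. This is a minimal sketch with hypothetical function names. It assumes the GNU coreutils summary format (`... copied, 1.5 s, 2.7 MB/s`) with a period as the decimal separator, so run dd with `LC_ALL=C`. It also applies a strict "greater than or equal" check, whereas the requirement is only "comparable to or better than".

```shell
# dd_rate_mbs
# Reads dd's summary output from stdin and prints the transfer rate in
# MB/s with two decimals. Fails if no summary line or unit is recognized.
dd_rate_mbs() {
  awk '/copied/ {
    value = $(NF - 1); unit = $NF
    if (unit == "kB/s")      value /= 1000
    else if (unit == "GB/s") value *= 1000
    else if (unit != "MB/s") exit 1
    printf "%.2f\n", value
    found = 1
  }
  END { exit !found }'
}

# meets_target RATE_MBS TARGET_MBS
# Returns 0 if RATE_MBS is at least TARGET_MBS.
meets_target() {
  awk -v rate="$1" -v target="$2" 'BEGIN { exit !(rate + 0 >= target + 0) }'
}
```

Because dd writes its summary to standard error, redirect it when you pipe the output, for example: `LC_ALL=C dd if=/dev/zero of=/PVC_mount_path/testfile bs=4096 count=1000 oflag=dsync 2>&1 | dd_rate_mbs`. Pass the printed rate to `meets_target` together with `2.5` for the latency test or `209` for the throughput test.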
Software requirements
You must have the following software to install Cloud Pak for Data:
- Red Hat OpenShift cluster
Important: Cloud Pak for Data includes entitlement to both the Red Hat OpenShift Container Platform and Red Hat Enterprise Linux. You can download Red Hat OpenShift either from IBM Passport Advantage® or directly from the Red Hat Customer Portal. See the Cloud Pak for Data readme on IBM Passport Advantage for details:
- For Cloud Pak for Data Enterprise Edition, see part number CC8CLEN.
- For Cloud Pak for Data Standard Edition, see part number CC8CNEN.
- Container runtime
- Your Red Hat OpenShift Container Platform cluster must include a container runtime. Ensure that you have the appropriate runtime for your environment:

Container runtime | OpenShift version | Notes |
---|---|---|
CRI-O version 1.13 or later (recommended) | 3.11, 4.5, 4.6 | Ensure that you adjust the CRI-O container settings as specified in Changing required node settings. |
Docker with the overlay2 driver | 3.11 | If you use Docker, you must use NFS storage. Ensure that you adjust the Docker container settings as specified in Changing required node settings. For the overlay2 driver, you must enable d_type=true and ftype=1 or else the installation will fail. See Use the OverlayFS storage driver for details. After you install OpenShift, enter docker info and verify that Storage Driver: overlay2 and Supports d_type: true. |

- Kubernetes Incubator Metrics Server
- If you want to gather usage metrics for your pods and nodes, you must install Kubernetes Incubator Metrics Server.
Important: This software is required if you want to use the platform management features in Cloud Pak for Data.
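For the Docker runtime described above, the `docker info` verification can be scripted. The sketch below uses a hypothetical function name and checks only for the two lines that the Docker notes ask you to verify. It does not check the ftype=1 setting of the backing XFS file system.

```shell
# docker_overlay2_ok
# Reads `docker info` output from stdin and succeeds only if it reports
# both "Storage Driver: overlay2" and "Supports d_type: true".
docker_overlay2_ok() {
  local info
  info="$(cat)"
  printf '%s\n' "$info" |
    grep -Eq '^[[:space:]]*Storage Driver:[[:space:]]*overlay2[[:space:]]*$' &&
  printf '%s\n' "$info" |
    grep -Eq '^[[:space:]]*Supports d_type:[[:space:]]*true[[:space:]]*$'
}
```

On a node that runs Docker, you might call it as `docker info 2>/dev/null | docker_overlay2_ok && echo "overlay2 settings look correct"`.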
Installation node
You can run the Cloud Pak for Data installation from a workstation that can connect to the OpenShift cluster. The workstation does not have to be part of the cluster. However, you must be able to run oc login from the workstation to connect to the cluster.
The installation node can run on the following platforms:
- Mac OS
- Linux
- IBM Power® Systems
- IBM Z
In addition, if you plan to run air-gapped installations, ensure that the installation node has sufficient storage space in the temporary directory. A minimum of 300 GB of storage space is recommended. If you don't have sufficient storage space, you might encounter problems when downloading the images to the installation node.
If the default temporary directory does not have enough space, set the TMPDIR environment variable to a directory that does:
export TMPDIR=directory_with_sufficient_space
This change only applies to your current command-line session.
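Before you start an air-gapped download, you can check whether the temporary directory has the recommended 300 GB available. The following sketch uses hypothetical function names and relies on the POSIX `df -Pk` output layout, where the fourth column of the second line is the available space in 1024-byte blocks. It interprets "300 GB" as 300 × 1024 × 1024 KiB, which is an assumption on the conservative side.

```shell
# available_kb DIR
# Prints the space available on the file system that holds DIR, in KiB.
# Assumes that the file system name contains no spaces.
available_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# enough_space AVAILABLE_KB [MIN_GB]
# Returns 0 if AVAILABLE_KB covers MIN_GB (default: 300).
enough_space() {
  local min_gb="${2:-300}"
  [ "$1" -ge $((min_gb * 1024 * 1024)) ]
}
```

For example: `enough_space "$(available_kb "${TMPDIR:-/tmp}")" || echo "Less than 300 GB free in the temporary directory"`.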
For details on the installation node, see Preparing your installation node.
Security requirements
- You must have a cluster-admin account to set up the environment for Cloud Pak for Data. For details, see the Red Hat OpenShift documentation:
- OpenShift 3.11: Cluster Roles and Local Roles
- OpenShift 4.5: Working with projects
- OpenShift 4.6: Working with projects
- The cluster-admin user must grant the cpd-admin-role to the project administration user that will install Cloud Pak for Data in the corresponding OpenShift project. For details, see the Red Hat OpenShift documentation:
- OpenShift 3.11: Projects and Users
- OpenShift 4.5: Working with projects
- OpenShift 4.6: Working with projects
For details on security, see Security on Cloud Pak for Data.
Supported web browsers
- Mozilla Firefox (recommended) - Version 69 and higher
- Google Chrome - Version 80 and higher