System requirements for IBM Cloud Pak for Data
Before you install Cloud Pak for Data on OpenShift®, ensure that your environment meets the following requirements.
- Minimum hardware requirements
- Production level hardware requirements
- Node settings
- Storage requirements
- Disk requirements
- Software requirements
- Security requirements
- Supported web browsers
Minimum hardware requirements
The following minimum recommendations are for demos and proof-of-concept with Cloud Pak for Data.
The time on all of the nodes must be synchronized within 500 ms.
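One way to spot-check the time requirement on a single node is sketched below. It assumes the node runs chronyd and that the `chronyc tracking` output contains a `System time` line. The helper name `offset_within_ms` is hypothetical. The check measures the offset from the node's NTP source, which is only a proxy for the node-to-node difference.

```shell
#!/bin/sh
# Hypothetical helper: reads `chronyc tracking` output on stdin and succeeds
# only if the reported system clock offset is below LIMIT_MS milliseconds.
offset_within_ms() {
    limit_ms=$1
    awk -v limit="$limit_ms" '
        /^System time/ {
            # Assumed layout: "System time : <seconds> seconds fast|slow of NTP time"
            offset_ms = $4 * 1000
            found = 1
        }
        # Fail when no offset line was seen or the offset is too large.
        END { exit !(found && offset_ms < limit + 0) }
    '
}

# Usage on a node that runs chronyd:
#   chronyc tracking | offset_within_ms 500 && echo "clock offset OK"
```

Because every node is compared with its own time source, a stricter limit than 500 (for example 250) gives more margin when the goal is a 500 ms bound between any two nodes.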
Minimum requirements for Red Hat® OpenShift Container Platform Version 3.11.188 or later fixes:
| Node role | Hardware | Number of servers | Available vCPU | Memory |
|---|---|---|---|---|
| Master + infra | x86-64 | 1 master and 1 infrastructure on the same node | 8 vCPU | 32 GB RAM |
| Worker/compute | x86-64 | 2 worker/compute nodes for NFS; 3 worker/compute nodes for Portworx | 16 vCPU | 64 GB RAM |
Minimum requirements for Red Hat OpenShift Container Platform Version 4.3.18 or later fixes:
| Node role | Hardware | Number of servers | Available vCPU | Memory |
|---|---|---|---|---|
| Master + infra | x86-64, ppc64le | 3 master (for high availability) and 3 infrastructure on the same 3 nodes | 4 vCPU | 16 GB RAM |
| Worker/compute | x86-64, ppc64le | 3 worker/compute nodes | 16 vCPU | 64 GB RAM |
Additional cores and nodes should be added depending on the services you need (see System requirements for services for details).
- 160 vCPUs
- 512 GB RAM
Production level hardware requirements
When you size your system, think about the types of workloads you plan to run. For example, if you plan to run complex analytics workloads in addition to other resource-intensive workloads, such as ETL jobs, you can expect reduced concurrency levels. Use the minimum configurations provided as a starting point and refine your configuration as needed. Because workloads vary based on a number of factors, use measurements from running real workloads with realistic data when you make your adjustments.
The following minimum recommendations are for small production usage (or proof-of-concept for larger workloads) with Cloud Pak for Data. High availability is always recommended.
The time on all of the nodes must be synchronized within 500ms.
| Node role | Hardware | Number of servers | Available vCPU | Memory |
|---|---|---|---|---|
| Master + infra | x86-64, ppc64le | 3 master (for high availability) and 3 infrastructure on the same 3 nodes | 8 vCPU | 32 GB RAM |
| Worker/compute | x86-64, ppc64le | 3+ worker/compute nodes | 16 vCPU | 64 GB RAM (minimum), 128 GB RAM (recommended) |
| Load balancer | x86-64, ppc64le | 1 load balancer node | 4 vCPU | 4 GB RAM (add another 4 GB of RAM and 100 GB of root storage for access restrictions and security control) |
Additional cores and nodes should be added depending on the services you need (see System requirements for services for details).
- 160 vCPUs
- 512 GB RAM
A load balancer is required when using three master nodes. The load balancer distributes the traffic load of the master and proxy nodes, securely isolates the master and compute node IP addresses, and facilitates external communication, including accessing the management console and API or making other requests to the master and proxy nodes.
See Planning your installation for Red Hat OpenShift Container Platform 3.11 or OpenShift Container Platform 4.3 installation overview for details.
Node settings
To ensure that services can run correctly on your cluster, ensure that your nodes meet the requirements specified in Changing required node settings.
You must change the node settings before you install Cloud Pak for Data.
Storage requirements
You need an additional 200 GB of free space in the root file system on all of the nodes. See Storage considerations for the supported storage types and requirements.
Cloud Pak for Data with all services installed can use up to 700 GB of storage space, leaving up to 300 GB of storage space available for user data. You can add additional capacity depending on your user data volume requirements.
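The free-space requirement for the root file system can be checked on each node with a short script. The following is a minimal sketch, not an official check: the helper name `has_free_gb` is hypothetical, and it treats 1 GB as 1024 × 1024 KB, which is an assumption about how the requirement is measured.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file system that holds PATH has at
# least REQUIRED_GB available. Uses POSIX `df -Pk` (1024-byte blocks).
has_free_gb() {
    path=$1
    required_gb=$2
    # Column 4 of the second output line is the available space in KB.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((required_gb * 1024 * 1024)) ]
}

if has_free_gb / 200; then
    echo "root file system: at least 200 GB free"
else
    echo "root file system: less than 200 GB free"
fi
```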
Disk requirements
To prepare your storage disks, ensure that you have good I/O performance, and prepare the disks for encryption.
- I/O performance
- To ensure that the storage partition has good disk I/O performance, run the disk latency test and the disk throughput test:
- Disk latency test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=4096 count=1000 oflag=dsync
The value must be better or comparable to: 4096000 bytes (4.1 MB, 3.9 MiB) copied, 1.5625 s, 2.5 MB/s
- Disk throughput test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=1G count=1 oflag=dsync
The value must be better or comparable to: 1073741824 bytes (1.1 GB) copied, 5.14444 s, 209 MB/s
See Testing I/O performance for IBM Cloud Pak for Data for details.
Some storage types might have more stringent I/O requirements. For details, see Storage considerations.
- Encryption with Linux® Unified Key Setup
- To ensure that your data within Cloud Pak for Data is stored securely, you can encrypt your disks. If you use Linux Unified Key Setup-on-disk-format (LUKS) for this purpose, then you must enable LUKS and format the disks with XFS before you install Cloud Pak for Data.
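The results of the I/O performance tests above can be compared with the reference figures in a script. The sketch below assumes GNU dd, which writes a summary line of the form `<bytes> bytes (...) copied, <seconds> s, <rate>` to standard error. The helper names `dd_mb_per_s` and `meets_rate` are hypothetical, and reading "better or comparable" as "greater than or equal" is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: reads dd's summary line on stdin and prints the
# throughput in MB/s (10^6 bytes per second), recomputed from bytes and seconds.
dd_mb_per_s() {
    awk '/ copied, / {
        bytes = $1
        secs = 0
        # The elapsed time is the field immediately before the "s," token.
        for (i = 2; i <= NF; i++) if ($i == "s,") secs = $(i - 1)
        if (secs > 0) printf "%.1f\n", bytes / secs / 1000000
    }'
}

# Hypothetical helper: succeeds if MEASURED is at least REFERENCE (both in MB/s).
meets_rate() {
    awk -v m="$1" -v r="$2" 'BEGIN { exit !(m + 0 >= r + 0) }'
}

# Usage, with /PVC_mount_path standing in for the real mount point:
#   rate=$(dd if=/dev/zero of=/PVC_mount_path/testfile bs=4096 count=1000 oflag=dsync 2>&1 | dd_mb_per_s)
#   meets_rate "$rate" 2.5 && echo "latency test OK"
```

The rate is recomputed from the byte count and elapsed time rather than read from the last field, so the unit is always the same regardless of how dd abbreviates it. A result slightly under the reference might still count as comparable, so treat a failure from `meets_rate` as a prompt to look at the numbers rather than as a hard verdict.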
Software requirements
- A Red Hat Enterprise Linux cluster with Red Hat OpenShift installed on it, either:
- Red Hat OpenShift Container Platform Version 3.11.188 or later fixes for on-premises. See minimum hardware requirements and production level hardware requirements for details.
- Red Hat OpenShift Container Platform Version 4.3.18 or later fixes for on-premises. See minimum hardware requirements and master node sizing for details.
- IBM Cloud. See Getting started with IBM Cloud Pak for Data for details.
- Microsoft Azure or Amazon Web Services: Red Hat OpenShift Version 3.11, either as the self-managed option or installed by you on your own VMs. You can also see IBM Cloud Pak for Data: Deployments for guidance.
Important: Cloud Pak for Data includes entitlement to both the Red Hat OpenShift Container Platform and Red Hat Enterprise Linux. You can download Red Hat OpenShift either from IBM Passport Advantage® or directly from the Red Hat Customer Portal. See the Cloud Pak for Data readme on IBM Passport Advantage for details:
- For Cloud Pak for Data Enterprise Edition, see part number CC62LEN.
- For Cloud Pak for Data Standard Edition, see part number CC62VEN.
- The OpenShift cluster should include:
- Ensure that you have the appropriate container runtime for your environment:
| Container runtime | OpenShift version | Storage type | Storage requirement | Notes |
|---|---|---|---|---|
| CRI-O version 1.13 or later (recommended) | 4.3, 3.11 | NFS, IBM Cloud File Storage, Portworx | 200 GB | Thread info |
| Docker with the overlay2 driver | 3.11 | NFS | 200 GB | Overlay info |
Attention: For the overlay2 driver, you must enable d_type=true and ftype=1 or else the installation will fail. See Use the OverlayFS storage driver for details. After you install OpenShift, enter docker info and verify that Storage Driver: overlay2 and Supports d_type: true.
Requirement: To ensure proper operation of services, configure the OpenShift cluster default thread count to permit containers with 12000 pids on every compute node. To change the default settings on Docker, append --default-pids-limit=12000 to OPTIONS= in the /etc/sysconfig/docker file. For CRI-O, add (or edit if the line already exists) pids_limit = 12000 under the [crio.runtime] section in the /etc/crio/crio.conf file.
- Kubernetes Incubator Metrics Server to gather usage metrics for pods and nodes.
- A registry to store images on, with a minimum of 300 GB of storage space. The registry must be accessible from all of the nodes in the cluster. Additionally, all of the nodes must have permission to push to and pull from the registry. See Setting up your registry server for details.
- A Linux or Mac OS client workstation to perform the installation from. The workstation does not have to be part of the OpenShift cluster, but it must be able to run oc login against the cluster. If your OpenShift cluster is air-gapped, ensure that this workstation has CLI 3.11 or CLI 4.3 installed so that it can download the required packages from the hosted IBM registry. You then copy the installation package onto a machine that can run oc login against the cluster.
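The CRI-O thread count requirement described above can be verified on a compute node by reading the value back from the configuration file. The following is a minimal sketch under the assumption that /etc/crio/crio.conf uses the usual `[section]` and `key = value` layout. The helper name `crio_pids_limit` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the pids_limit value found under the
# [crio.runtime] section of a CRI-O configuration file (nothing if unset).
crio_pids_limit() {
    conf=$1
    awk '
        # Track whether the current section header is exactly [crio.runtime].
        /^[ \t]*\[/ { in_runtime = ($0 ~ /^[ \t]*\[crio\.runtime\][ \t]*$/) }
        in_runtime && /^[ \t]*pids_limit[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")      # drop the key and the equals sign
            sub(/[ \t]*(#.*)?$/, "")      # drop trailing spaces and comments
            print
            exit
        }
    ' "$conf"
}

# Usage on a compute node:
#   limit=$(crio_pids_limit /etc/crio/crio.conf)
#   [ "${limit:-0}" -ge 12000 ] && echo "pids_limit OK" || echo "pids_limit too low or unset"
```

Sections with longer names that begin with `crio.runtime.` are deliberately ignored, so a `pids_limit` line that was added in the wrong place is reported as unset.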
Cloud Pak for Data supports the same operating system requirements as OpenShift. See operating system requirements 3.11 or installation and update 4.3 for details.
Security requirements
- You must have a cluster-admin account to set up the environment for Cloud Pak for Data. See Cluster Roles and Local Roles 3.11 or Projects 4.3.
- The cluster-admin user must grant the cpd-admin-role to the project administration user that will install Cloud Pak for Data in the corresponding OpenShift project. See Projects and Users 3.11 or Projects 4.3.
Supported web browsers
- Mozilla Firefox (recommended) - Version 54 and higher
- Google Chrome - Version 60 and higher