Instructions on how to install and configure OpenShift Data Foundation (ODF) on Red Hat OpenShift on IBM Cloud and prepare it for an IBM Cloud Pak for Data installation.
OpenShift Data Foundation is a highly available storage solution that consists of several open-source operators and technologies like Ceph, NooBaa and Rook. These operators allow you to provision and manage file, block and object storage for your containerized workloads in Red Hat® OpenShift® on IBM Cloud® clusters. Unlike other storage solutions where you might need to configure separate drivers and operators for each type of storage, ODF is a unified solution capable of adapting or scaling to your storage needs. You can also deploy ODF on any OCP cluster.
Architecture overview
The documentation has a detailed architecture overview for OpenShift Data Foundation.
How does OpenShift Data Foundation work?
OpenShift Data Foundation (ODF) uses storage volumes in multiples of three and replicates your app data across these volumes. The underlying storage volumes that you use for ODF depend on your cluster type:
- For IBM Cloud VPC clusters, the storage volumes are dynamically provisioned block storage for VPC devices.
- For bare metal Classic clusters, the storage volumes are local disks on your bare metal worker nodes.
- For IBM Cloud Satellite clusters, the storage volumes are either local disks on your worker nodes, or you can dynamically provision disks by using a compatible block storage driver.
ODF uses these devices to create a virtualized storage layer, where your app data is replicated for high availability. Because ODF abstracts your underlying storage, you can use ODF to create file, block or object storage claims from the same underlying raw block storage.
For a full overview of the features and benefits, see OpenShift Data Foundation.
Step-by-step instructions
In this step-by-step guide, we will show you how to install and configure OpenShift Data Foundation (ODF) on Red Hat OpenShift on IBM Cloud and prepare it for an IBM Cloud Pak for Data installation.
1. OpenShift Data Foundation Installation
Prerequisites
For this tutorial, we will not demonstrate how to provision and configure a Red Hat OpenShift on IBM Cloud cluster. Before starting, you’ll need to install the required CLIs on your computer (ibmcloud and oc) or use IBM Cloud Shell in your browser.
- Provision a Red Hat OpenShift on IBM Cloud 4.8 (ROKS) cluster (instructions here).
- Create a separate worker pool in this cluster just for the ODF installation. ODF needs at least three worker nodes with 16 vCPU x 64 GB each. The default worker pool will be used by Cloud Pak for Data. Having a dedicated worker pool for ODF makes it easier to resize (up and down) and update workers in the application pool without compromising the ODF installation.
Installation
- Install the OpenShift Data Foundation (ODF) add-on. In the Overview section of the OpenShift cluster in the IBM Cloud Portal, click OpenShift Data Foundation Install:
- In the Install OpenShift Data Foundation panel, enter the configuration parameters that you want to use for your ODF deployment and click Install:
- osdSize: Enter the size of the block storage for VPC devices that you want to provision for the OSD pods. The default size is 250Gi.
- osdStorageClassName: Enter the block storage for VPC storage class that you want to use to dynamically provision storage for the OSD pods. The default storage class is ibmc-vpc-block-metro-10iops-tier.
- osdDevicePaths: Not applicable for VPC clusters. Leave this parameter as-is.
- numOfOsd: Enter the number of block storage device sets that you want to provision for ODF. A numOfOsd value of one provisions one device set, which includes three block storage devices. The devices are provisioned evenly across your worker nodes. For more information, see Understanding ODF.
- workerNodes: Enter the worker nodes where you want to deploy ODF. You must have at least three worker nodes. The default setting is all. To deploy ODF only on nodes of ODF’s worker pool, enter the IP addresses of the worker nodes in a comma-separated list without spaces. For example: 10.242.0.32,10.242.0.33,10.242.0.34
- ocsUpgrade: For initial deployment, leave this setting as false. The default setting is false.
- clusterEncryption: The default setting is false.
- Wait a few minutes for the add-on deployment to complete. When the deployment is complete, the add-on status is Normal – Add-on Ready.
- Verify your installation. Access your Red Hat OpenShift cluster.
- Run the following command to verify that the ODF pods are running in the openshift-storage namespace (project). At this point, 3 x 250GB (data) and 3 x 50GB (monitoring) block storage volumes are also provisioned:
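A verification sketch (the openshift-storage namespace is created by the add-on):

```shell
# ODF components (OSD, MON, NooBaa and Rook operator pods) run in openshift-storage
oc get pods -n openshift-storage

# The dynamically provisioned block volumes that back the OSD and MON pods
oc get pvc -n openshift-storage
```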
2. Create OpenShift taint for ODF’s worker pool
To install IBM Cloud Pak for Data without having its pods scheduled onto OpenShift Data Foundation’s (ODF’s) worker pool, you need to restrict scheduling by using OpenShift taint and toleration configurations. Set Kubernetes taints for all the worker nodes in the ODF worker pool. Taints prevent pods without matching tolerations from running on those worker nodes. To learn more, see the Kubernetes documentation on taints and tolerations.
Setting taints at the worker-pool level means that all new worker nodes (after an upgrade, for example) also receive the same taint configuration.
- In IBM Cloud CLI or Shell, log in to the Red Hat OpenShift on IBM Cloud cluster by following the access instructions.
- In IBM Cloud CLI or Shell, run the following commands:
Syntax:
ibmcloud oc worker-pool taint set --worker-pool WORKER_POOL --cluster CLUSTER --taint KEY=VALUE:EFFECT [--taint KEY=VALUE2:EFFECT] [-f]
Example:
ibmcloud oc worker-pool taint set --worker-pool odf --cluster roks-odf-test-cguarany --taint node.ocs.openshift.io/storage=true:NoSchedule
- After setting the custom taint for the ODF worker pool, confirm that the taints are set on each of the worker nodes by getting the private IP address of the worker node (`ibmcloud oc worker ls -c <cluster_name_or_ID>`) and running `oc describe node <worker_private_IP>`.
Example:
ibmcloud oc worker ls -c roks-odf-test-cguarany
oc describe node 10.242.0.33
- The Taints section of the describe output must contain the following information: node.ocs.openshift.io/storage=true:NoSchedule
3. Increase ODF’s storage sizing (scale)
It’s possible to scale the OpenShift Data Foundation (ODF) configuration by increasing the numOfOsd setting. When you increase the number of OSDs, ODF provisions that number of additional disks, each of the same osdSize capacity in GB, across the worker nodes in your ODF cluster. However, the total storage that is available to your applications is equal to osdSize multiplied by numOfOsd.
The following is an example of ODF storage distribution:
Edit and update the ocscluster CRD to increase numOfOsd. Run oc edit ocscluster and change numOfOsd from 1 to 2 in order to have 500Gi of storage capacity available to Cloud Pak for Data:
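The sizing math can be sanity-checked directly: the usable capacity seen by applications equals osdSize times numOfOsd, while the raw block storage provisioned is three times that, because each device set holds three replicated devices:

```shell
# usable capacity seen by applications = osdSize * numOfOsd
osd_size_gi=250
num_of_osd=2
echo "usable: $(( osd_size_gi * num_of_osd ))Gi"       # prints usable: 500Gi

# raw block storage provisioned = 3 devices per device set
echo "raw: $(( osd_size_gi * num_of_osd * 3 ))Gi"      # prints raw: 1500Gi
```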
4. Install Cloud Pak for Data using ODF
Now we’ll show you, step-by-step, how to install IBM Cloud Pak for Data using OpenShift Data Foundation (ODF). IBM’s official installation documentation is available in IBM Docs.
Prerequisites
Before starting, you’ll need to install the required oc CLI (compatible with your OpenShift version) and the cloudctl CLI on a bastion computer or in IBM Cloud Shell (in IBM Cloud Shell, the oc CLI is already installed).
- In IBM Cloud Shell, install the latest release for cloudctl using wget:
- Run tar -xvf cloudctl-linux-amd64.tar.gz to extract the downloaded archive.
- Finish the cloudctl configuration by performing the following steps and then check the version:
ln -s cloudctl-linux-amd64 cloudctl
export PATH=$PWD:$PATH
echo $PATH
cloudctl version
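Under the assumption that cloudctl is fetched from the IBM/cloud-pak-cli GitHub releases (the asset name matches the tar step above), the download step can be sketched as:

```shell
# Download the latest cloudctl release for Linux x86_64
wget https://github.com/IBM/cloud-pak-cli/releases/latest/download/cloudctl-linux-amd64.tar.gz
```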
If you are using a bastion computer to install Cloud Pak for Data, you must also install oc. The steps are the same as above for cloudctl and will not be covered in this guide, because we are using IBM Cloud Shell and it already has oc installed.
To install the oc CLI on your bastion computer, you can find the latest oc version for OpenShift v4.8 by using the following command:
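One way to do this, assuming the public OpenShift clients mirror, is to query the stable-4.8 release metadata:

```shell
# Shows the latest client version published in the stable-4.8 channel
curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/stable-4.8/release.txt | grep -m1 "Name:"
# The matching client tarball is openshift-client-linux.tar.gz in the same directory
```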
Installation
- In IBM Cloud Shell, log in to the Red Hat OpenShift on IBM Cloud cluster by following the instructions on this site.
- Create the yaml files used to install the Cloud Pak for Data control plane:
- Create a yaml file for Catalog Source — IBM Operator Catalog:
- Create a yaml file to configure the Operator Group to use the namespace ibm-common-services:
- Create a yaml file for the IBM Common Services Operator:
- Create a yaml file for the Cloud Pak for Data platform operator:
- Create a yaml file for the Cloud Pak for Data service. This yaml file is configured to install Cloud Pak for Data version 4.0.7 using the ODF storage classes ocs-storagecluster-cephfs and ocs-storagecluster-ceph-rbd in the OpenShift namespace (project) cp4d. If you want to install in another namespace, you only need to adjust the namespace parameter in the yaml file:
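As a sketch, such an Ibmcpd custom resource for this configuration might look like the following; the field names follow the 4.0.x Ibmcpd API and the resource name is an assumption, so verify both against the IBM docs for your exact version:

```yaml
apiVersion: cpd.ibm.com/v1
kind: Ibmcpd
metadata:
  name: ibmcpd-cr          # assumed name for this sketch
  namespace: cp4d          # change this to install in another namespace
spec:
  license:
    accept: true
    license: Enterprise
  storageClass: ocs-storagecluster-cephfs
  zenCoreMetadbStorageClass: ocs-storagecluster-ceph-rbd
  version: "4.0.7"
```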
- Apply the yaml files to install all of the operators needed for the IBM Cloud Pak for Data control plane:
- Apply the yaml file to install Catalog Source – IBM Operator Catalog:
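The manifest applied here is the commonly published IBM Operator Catalog CatalogSource (the file name ibm-operator-catalog.yaml is an assumption of this sketch):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ibm-operator-catalog
  namespace: openshift-marketplace
spec:
  displayName: IBM Operator Catalog
  publisher: IBM
  sourceType: grpc
  image: icr.io/cpopen/ibm-operator-catalog:latest
  updateStrategy:
    registryPoll:
      interval: 45m
```

Apply it with oc apply -f ibm-operator-catalog.yaml.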
You can check IBM Operator Catalog status by performing the following commands to see if it is READY and get the AGE of the catalog source:
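A checking sketch; the READY state appears in the connection state reported by describe:

```shell
# AGE of the catalog source
oc get catalogsource -n openshift-marketplace

# Connection state should report READY
oc describe catalogsource ibm-operator-catalog -n openshift-marketplace | grep "Last Observed State"
```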
- Apply the yaml file to create the Operator Group:
- Use cloudctl to save and launch CASE to install catalog for IBM Common Services.
- Create a directory to save packages:
- Set package parameters and use cloudctl to save CASE — define the version you want to install:
GITHUBURL=https://github.com/IBM/cloud-pak/raw/master/repo/case
CASENAME=ibm-cp-common-services
CASEINVENTORY=ibmCommonServiceOperatorSetup
CASEVERSION=1.12.3
CASEARCHIVE=${CASENAME}-${CASEVERSION}.tgz
cloudctl case save --case $GITHUBURL/${CASEARCHIVE} --outputdir case
- Use cloudctl to launch CASE to install the IBMCS Operator and check it:
Check the IBMCS Operator catalog source installation by performing the following command:
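The launch and verification can be sketched as follows; the install-catalog action name and the openshift-marketplace namespace are assumptions based on common CASE inventory conventions, so verify them in the CASE documentation:

```shell
# Launch the CASE inventory action that installs the catalog source
cloudctl case launch \
  --case case/${CASEARCHIVE} \
  --inventory ${CASEINVENTORY} \
  --namespace openshift-marketplace \
  --action install-catalog

# Verify that the IBM Common Services catalog source appears
oc get catalogsource -n openshift-marketplace
```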
- Use cloudctl to save and launch CASE to install catalog for the Cloud Pak for Data Platform Operator:
- Set package parameters and use cloudctl to save CASE — define the version you want to install:
GITHUBURL=https://github.com/IBM/cloud-pak/raw/master/repo/case
CASENAME=ibm-cp-datacore
CASEINVENTORY=cpdPlatformOperator
CASEVERSION=2.0.12
CASEARCHIVE=${CASENAME}-${CASEVERSION}.tgz
cloudctl case save --case $GITHUBURL/${CASEARCHIVE} --outputdir case
- Use cloudctl to launch CASE to install the Cloud Pak for Data Platform Operator and check it:
Check the Cloud Pak for Data Platform Operator source installation by performing the following command and looking for cpd-platform catalog:
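A sketch of the launch and check; as before, the install-catalog action name and the openshift-marketplace namespace are assumptions to verify against the CASE documentation:

```shell
# Launch the CASE inventory action for the cpdPlatformOperator catalog
cloudctl case launch \
  --case case/${CASEARCHIVE} \
  --inventory ${CASEINVENTORY} \
  --namespace openshift-marketplace \
  --action install-catalog

# Look for the cpd-platform catalog in the output
oc get catalogsource -n openshift-marketplace
```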
- Apply the yaml file to create the csv (ClusterServiceVersion) for the IBM Common Services Operator:
You can check the status of the csv for IBM Common Services by running the following command and checking that the PHASE is listed as Succeeded:
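A minimal check, assuming the operators were installed into the ibm-common-services namespace:

```shell
# PHASE should read Succeeded for the installed operators
oc get csv -n ibm-common-services
```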
- Apply the yaml file to create the Cloud Pak for Data Operator:
You can check the status of the installation in the OpenShift Console by going to Administrator view > Operators > Installed Operators > Status of Cloud Pak for Data Platform Operator:
When the installation completes, you will see Succeeded in the OpenShift Console, and you can use the following command:
- Install the IBM Cloud Pak for Data control plane service:
- Apply the yaml file to create the Cloud Pak for Data service.
- Change the project to the namespace that you defined for your Cloud Pak for Data installation:
- Apply the yaml file to start installing the Cloud Pak for Data service (this can take approximately 30 minutes or more):
- You can use the oc get events command to check the status of the installation.
- Alternatively, you can use the oc get pods command. When the installation is finished, it will not have any pods shown.
- You can also check the zenStatus of zenservices. When the installation finishes, the zenStatus will show Completed.
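Assuming the default ZenService name lite-cr and the cp4d namespace used in this tutorial, the zenStatus can be read directly:

```shell
# Prints Completed once the service installation finishes
oc get zenservice lite-cr -n cp4d -o jsonpath='{.status.zenStatus}{"\n"}'
```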
- You now have IBM Cloud Pak for Data installed using ODF. To access Cloud Pak for Data, you first need to retrieve the URL by running the following command:
- You also need to retrieve your initial password for user admin (in this case, the initial password for user admin is password):
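Both values can be retrieved with oc; the route name cpd, the secret name admin-user-details and the cp4d namespace follow the 4.0.x defaults assumed in this tutorial:

```shell
# URL of the Cloud Pak for Data web client
oc get route cpd -n cp4d -o jsonpath='{.spec.host}{"\n"}'

# Initial password for the admin user
oc extract secret/admin-user-details -n cp4d --keys=initial_admin_password --to=-
```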
- In your favorite browser, go to https://{your_CP4D_url} and log in with admin and the password:
- Then, access the IBM Cloud Pak for Data:
- Use the hamburger menu to view all Cloud Pak for Data services in the Catalog that you can install and use on your IBM Cloud Pak for Data platform by accessing Menu Services > Services Catalog:
5. Update ODF’s OpenShift worker nodes
When necessary, follow these steps to update or upgrade the OpenShift ODF worker nodes to keep ODF storage working properly. To update your VPC worker nodes that use OpenShift Data Foundation, you must cordon, drain and replace each worker node individually. If you deployed OpenShift Data Foundation to a subset of worker nodes in your cluster, then after you replace a worker node, you must edit the ocscluster resource to include the new worker node. The detailed process can be found in the documentation.
- List your worker nodes by using oc get nodes and determine which worker nodes you want to update.
- Cordon the node (for example, 10.242.0.32). Cordoning the node prevents any pods from being scheduled on it. Run oc adm cordon 10.242.0.32.
- Drain the node to remove all the pods. When you drain the worker node, the pods move to the other worker nodes, ensuring there is no downtime, and draining respects the pod disruption budget. Run oc adm drain 10.242.0.32 --force --delete-local-data --ignore-daemonsets.
- Wait until the draining finishes, then replace/update the worker node. When you replace a worker node in VPC Gen 2, you get a new worker node with the latest patch updates:
- List your worker nodes by using oc get nodes and determine which is the new worker node that needs to be included in the ocscluster.
- Edit and update the ocscluster CRD to include the new node. Run oc edit ocscluster and replace the old node with the new one in the workerNodes list. After saving, the ocscluster will automatically install the necessary pods on the new node.
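The replace step itself is done with the ibmcloud CLI; the worker ID placeholder below comes from the ibmcloud oc worker ls output:

```shell
# Find the worker ID that corresponds to the cordoned node
ibmcloud oc worker ls -c roks-odf-test-cguarany

# Replace the worker; --update re-creates it at the latest patch level
ibmcloud oc worker replace -c roks-odf-test-cguarany -w <worker_ID> --update
```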
Learn more
To learn more about Red Hat OpenShift on IBM Cloud and IBM Cloud Pak for Data, check out the links below:
- Red Hat OpenShift on IBM Cloud product page
- IBM Cloud Pak for Data product page
- Getting started with Virtual Private Cloud (VPC)
- Virtual private cloud architecture
- Creating Red Hat OpenShift on IBM Cloud clusters
- OpenShift Data Foundation Architecture
- Deploying OpenShift Data Foundation on VPC clusters
- Kubernetes Taints and Tolerations
- IBM Cloud Pak for Data docs
- Use cases: IBM Cloud Pak for Data