OpenShift Data Foundation is a highly available storage solution that consists of several open-source operators and technologies, such as Ceph, NooBaa and Rook. These operators allow you to provision and manage file, block and object storage for your containerized workloads in Red Hat® OpenShift® on IBM Cloud® clusters. Unlike storage solutions that require separate drivers and operators for each type of storage, ODF is a unified solution that adapts and scales to your storage needs. You can also deploy ODF on any OpenShift Container Platform (OCP) cluster.
The IBM Cloud documentation includes a detailed architecture overview for OpenShift Data Foundation.
OpenShift Data Foundation (ODF) uses storage volumes in multiples of three and replicates your app data across these volumes. The underlying storage volumes that you use for ODF depend on your cluster type:
ODF uses these devices to create a virtualized storage layer, where your app data is replicated for high availability. Because ODF abstracts your underlying storage, you can use ODF to create file, block or object storage claims from the same underlying raw block storage.
For a full overview of the features and benefits, see OpenShift Data Foundation.
In this step-by-step guide, we will show you how to install and configure OpenShift Data Foundation (ODF) on Red Hat OpenShift on IBM Cloud and prepare it for IBM Cloud Pak® for Data installation.
For this tutorial, we will not demonstrate how to provision and configure a Red Hat OpenShift on IBM Cloud cluster. Before starting, you’ll need to install the required CLIs (ibmcloud and oc) on your computer or use IBM Cloud Shell in your browser.
1. Install the OpenShift Data Foundation (ODF) add-on. In the Overview section of the OpenShift cluster in the IBM Cloud portal, click OpenShift Data Foundation Install:
2. In the Install OpenShift Data Foundation panel, enter the configuration parameters that you want to use for your ODF deployment and click Install:
3. Wait a few minutes for the add-on deployment to complete. When the deployment is complete, the add-on status is Normal – Add-on Ready.
4. Verify your installation. Access your Red Hat OpenShift cluster.
5. Run the following command to verify that the ODF pods are running in the openshift-storage namespace/project. At this point, 3 x 250GB (data) and 3 x 50GB (monitoring) block storage volumes are also provisioned:
oc get pods -n openshift-storage -o wide
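To spot pods that are not yet healthy in that listing, the output can be filtered. The following sketch defines a small helper; the sample input piped into it below is illustrative only, not real cluster output:

```shell
# Sketch: a small filter over "oc get pods" output that prints pods whose
# STATUS column is not Running or Completed.
not_ready() { awk 'NR>1 && $3 !~ /Running|Completed/ {print $1}'; }

# On a live cluster, you would pipe the real listing through the helper:
#   oc get pods -n openshift-storage | not_ready
printf 'NAME READY STATUS RESTARTS AGE\nrook-ceph-osd-0 1/1 Running 0 5m\nnoobaa-core-0 0/1 Pending 0 5m\n' | not_ready
```

With the sample input above, only the Pending pod name is printed.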
To install IBM Cloud Pak for Data without its pods being scheduled onto OpenShift Data Foundation’s (ODF’s) worker pool, use OpenShift’s taint and toleration mechanism. Set Kubernetes taints on all the worker nodes in the ODF worker pool; taints prevent pods without matching tolerations from running on those worker nodes. To learn more, see the Kubernetes documentation on taints and tolerations.
Setting taints on the ODF worker pool means that all new worker nodes (after an upgrade, for example) also receive the same taint configuration.
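As a minimal sketch of how the two halves fit together: with a taint such as dedicated=odf:NoSchedule on the ODF worker pool (a hypothetical key/value, adjust to your environment), Cloud Pak for Data pods without a matching toleration are repelled from those nodes, while pods that do carry the toleration can land there. The snippet below only writes an example toleration fragment for a pod spec:

```shell
# Sketch only: write a pod-spec toleration snippet matching a hypothetical
# taint "dedicated=odf:NoSchedule". Key, value and effect must match the
# taint you actually set on the ODF worker pool.
cat > odf-toleration-snippet.yaml <<EOF
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "odf"
  effect: "NoSchedule"
EOF
cat odf-toleration-snippet.yaml
```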
It’s possible to scale the OpenShift Data Foundation (ODF) configuration by increasing the numOfOsd setting. When you increase the number of OSDs, ODF provisions that number of disks, each of osdSize capacity in GB, in each of the worker nodes of your ODF cluster. The total storage available to your applications is equal to osdSize multiplied by numOfOsd.
The following is an example of ODF storage distribution:
Edit and update the ocscluster custom resource to increase numOfOsd. Run oc edit ocscluster and change numOfOsd from 1 to 2 in order to have 500Gi of storage capacity available to Cloud Pak for Data:
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: ocs.ibm.io/v1
kind: OcsCluster
metadata:
  creationTimestamp: "2022-03-29T19:38:26Z"
  finalizers:
  - finalizer.ocs.ibm.io
  generation: 3
  name: ocscluster-auto
  resourceVersion: "3621968"
  uid: 263df534-d471-4d24-864c-a1e20f77a2c1
spec:
  autoDiscoverDevices: false
  billingType: advanced
  clusterEncryption: false
  numOfOsd: 2
  ocsUpgrade: false
  osdDevicePaths:
  - ""
  osdSize: 250Gi
  osdStorageClassName: ibmc-vpc-block-metro-10iops-tier
  workerNodes:
  - 10.242.0.32
  - 10.242.0.33
  - 10.242.0.34
status:
  storageClusterStatus: Ready
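The capacity arithmetic can be sanity-checked with a one-liner (the values are this tutorial’s example values):

```shell
# Illustrative arithmetic only: total capacity available to applications
# equals osdSize multiplied by numOfOsd (250Gi x 2 = 500Gi in this tutorial).
OSD_SIZE_GI=250
NUM_OF_OSD=2
echo "Usable capacity: $((OSD_SIZE_GI * NUM_OF_OSD))Gi"
# prints: Usable capacity: 500Gi
```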
Now we’ll show you, step by step, how to install IBM Cloud Pak for Data using OpenShift Data Foundation (ODF). IBM’s official installation documentation is available on IBM Docs.
Before starting, you’ll need to install the oc CLI compatible with your OpenShift version and the cloudctl CLI on a bastion computer, or use IBM Cloud Shell (where the oc CLI is already installed).
1. In IBM Cloud Shell, install the latest release of cloudctl using wget:

wget https://github.com/IBM/cloud-pak-cli/releases/latest/download/cloudctl-linux-amd64.tar.gz
2. Run tar -xvf cloudctl-linux-amd64.tar.gz to extract the downloaded archive.
3. Finish the cloudctl configuration by performing the following steps and then check the version:
ln -s cloudctl-linux-amd64 cloudctl
export PATH=$PWD:$PATH
echo $PATH
cloudctl version
If you are using a bastion computer to install Cloud Pak for Data, you must also install oc. The steps are the same as those shown above for cloudctl and are not covered in this guide, because we are using IBM Cloud Shell, which already has oc installed.
To install the oc CLI on your bastion computer, download the latest oc release for OpenShift v4.8 with the following command:
wget -O oc.4.8.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/latest-4.8/openshift-client-linux.tar.gz

4. Create the yaml files that define the IBM Operator Catalog source, the OperatorGroup and the Subscriptions. First, the IBM Operator Catalog source:

cat > catalogsource.yaml <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ibm-operator-catalog
  namespace: openshift-marketplace
spec:
  displayName: "IBM Operator Catalog"
  publisher: IBM
  sourceType: grpc
  image: icr.io/cpopen/ibm-operator-catalog:latest
  updateStrategy:
    registryPoll:
      interval: 45m
EOF
Next, create the OperatorGroup and Subscriptions for ibm-common-services:
cat > operatorgroup.ibm-common-services.yaml <<EOF
apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: ibm-common-services
spec:
  targetNamespaces:
  - ibm-common-services
EOF
cat > subscription.ibm-common-services.yaml <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ibm-common-service-operator
  namespace: ibm-common-services
spec:
  channel: v3
  installPlanApproval: Automatic
  name: ibm-common-service-operator
  source: opencloud-operators
  sourceNamespace: openshift-marketplace
EOF
cat > subscription.cpd-operator.yaml <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cpd-operator
  namespace: ibm-common-services
spec:
  channel: v2.0
  installPlanApproval: Automatic
  name: cpd-platform-operator
  source: cpd-platform
  sourceNamespace: openshift-marketplace
EOF
cat << EOF > ibmcpd-cr.yaml
apiVersion: cpd.ibm.com/v1
kind: Ibmcpd
metadata:
  name: ibmcpd-cr
  namespace: cp4d
spec:
  license:
    accept: true
    license: Enterprise
  storageClass: ocs-storagecluster-cephfs
  zenCoreMetadbStorageClass: ocs-storagecluster-ceph-rbd
  cloudpakfordata: true
  iamIntegration: false
  generateAdminPassword: false
  cert_manager_enabled: true
  version: "4.0.7"
EOF
You can check IBM Operator Catalog status by performing the following commands to see if it is READY and get the AGE of the catalog source:
GITHUBURL=https://github.com/IBM/cloud-pak/raw/master/repo/case
CASENAME=ibm-cp-common-services
CASEINVENTORY=ibmCommonServiceOperatorSetup
CASEVERSION=1.12.3
CASEARCHIVE=${CASENAME}-${CASEVERSION}.tgz
cloudctl case save --case $GITHUBURL/${CASEARCHIVE} --outputdir case
Check the IBMCS Operator catalog source installation by performing the following command:
oc get catsrc -n openshift-marketplace
5. Apply the yaml file to create the csv for the IBM Common Services Operator:
oc apply -f subscription.ibm-common-services.yaml

You can check the csv status for IBM Common Services by performing the following command and checking that the PHASE is listed as Succeeded:
oc get csv -n ibm-common-services

6. Apply the yaml file to create the Cloud Pak for Data Operator:
oc apply -f subscription.cpd-operator.yaml

You can check the status of the installation in the OpenShift console by going to Administrator view > Operators > Installed Operators > Status of Cloud Pak for Data Platform Operator:
When the installation completes, you will see Succeeded in the OpenShift console, and you can confirm it with the following command:
oc get csv -n ibm-common-services
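Instead of re-running the check by hand, a small polling helper can wait until the phase reaches Succeeded. This is a sketch under assumptions: the jsonpath expression and CSV selection in the comment are illustrative and may need adjusting for your cluster:

```shell
# Sketch: poll a command until it reports the CSV phase "Succeeded".
# The command string is passed in, so the helper itself needs no cluster.
wait_for_phase() {
  cmd="$1"
  i=0
  while [ "$i" -lt 60 ]; do                 # up to ~60 minutes at 60s intervals
    phase=$(eval "$cmd" 2>/dev/null)
    if [ "$phase" = "Succeeded" ]; then
      echo "Succeeded"
      return 0
    fi
    i=$((i + 1))
    sleep 60
  done
  echo "timed out waiting for Succeeded" >&2
  return 1
}

# On a live cluster (illustrative jsonpath -- adjust the CSV name/index):
#   wait_for_phase "oc get csv -n ibm-common-services -o jsonpath='{.items[0].status.phase}'"
```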
7. Install the IBM Cloud Pak for Data control plane service.

Change the project to the namespace you defined for your Cloud Pak for Data installation and apply the ibmcpd-cr.yaml file you created earlier:

oc project cp4d
oc apply -f ibmcpd-cr.yaml
Monitor the installation progress with the following commands:

oc get events -A -w
watch 'oc get pod -A | grep -Ev "1/1|2/2|3/3|4/4|5/5|6/6|7/7|Complete"'
Check the ZenService status; the control plane installation is complete when zenStatus reports Completed:

oc get zenservice lite-cr -n cp4d -o jsonpath="{.status.zenStatus}"
Use your favorite browser, navigate to https://{your_CP4D_url} and log in with your admin username and password:
When necessary, follow these steps to update or upgrade the ODF worker nodes to keep ODF storage working properly. To update VPC worker nodes that use OpenShift Data Foundation, you must cordon, drain and replace each worker node individually. If you deployed OpenShift Data Foundation to a subset of worker nodes in your cluster, then after you replace a worker node, you must edit the ocscluster resource to include the new worker node. The detailed process is described in the IBM Cloud documentation.
1. List your worker nodes by using oc get nodes and determine which worker nodes you want to update:
cguarany@cloudshell:~$ oc get nodes
NAME          STATUS   ROLES           AGE   VERSION
10.242.0.32   Ready    master,worker   1d    v1.21.8+ee73ea2
10.242.0.33   Ready    master,worker   1d    v1.21.8+ee73ea2
10.242.0.34   Ready    master,worker   1d    v1.21.8+ee73ea2
10.242.0.40   Ready    master,worker   1d    v1.21.8+ee73ea2
10.242.0.41   Ready    master,worker   1d    v1.21.8+ee73ea2
2. Cordon the node (for example, 10.242.0.32). Cordoning prevents any new pods from being scheduled on this node. Run oc adm cordon 10.242.0.32.
3. Drain the node to remove all the pods. When you drain the worker node, the pods move to the other worker nodes, ensuring there is no downtime; draining also respects the pod disruption budget. Run oc adm drain 10.242.0.32 --force --delete-local-data --ignore-daemonsets.
4. Wait until the draining finishes, then replace/update the worker node. When you replace a worker node in VPC Gen 2, you get a new worker node with the latest patch updates:
5. List your worker nodes by using oc get nodes and determine which is the new worker node that needs to be included in the ocscluster.
6. Edit and update the ocscluster custom resource to include the new node. Run oc edit ocscluster and replace the old node with the new one. After you save, ocscluster automatically installs the necessary pods on the new node:
Before:

workerNodes:
- 10.242.0.32
- 10.242.0.33
- 10.242.0.34

After:

workerNodes:
- 10.242.0.33
- 10.242.0.34
- 10.242.0.39
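The workerNodes edit can also be scripted against a locally saved copy of the spec. A sketch (the file name and IP addresses are this tutorial’s example values; the heredoc stands in for a spec you would save from the live CR):

```shell
# Sketch: swap the removed worker's IP for its replacement in a saved copy
# of the ocscluster spec, then re-apply it (e.g. with oc apply -f).
cat > ocscluster-workers.yaml <<EOF
workerNodes:
- 10.242.0.32
- 10.242.0.33
- 10.242.0.34
EOF

OLD_NODE="10.242.0.32"
NEW_NODE="10.242.0.39"
sed -i.bak "s/${OLD_NODE}/${NEW_NODE}/" ocscluster-workers.yaml
cat ocscluster-workers.yaml
```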
To learn more about Red Hat OpenShift on IBM Cloud and IBM Cloud Pak for Data, check out the links below: