Optional: Configuring dedicated OpenShift worker nodes for Data Virtualization

You can provision Data Virtualization on dedicated Red Hat® OpenShift® worker nodes.

About this task

Perform this advanced configuration only when it is necessary. Constraining Data Virtualization to a specific set of nodes can complicate activities such as upgrades and maintenance at the OpenShift level. In general, it is recommended that you allow Data Virtualization to be scheduled across all available worker nodes.

Nodes that can run pods from other services

To run Data Virtualization instance pods on specific OpenShift worker nodes that can also run pods from other services, complete the following steps.

  1. Log in to Red Hat OpenShift Container Platform as an instance administrator:
    oc login ${OCP_URL}
  2. Retrieve the names of the worker nodes that you want to dedicate to Data Virtualization:
    oc get nodes
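    On most OpenShift clusters, worker nodes carry the node-role.kubernetes.io/worker label, so you can list only the worker nodes by filtering on that label, for example:
    oc get nodes -l node-role.kubernetes.io/worker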
  3. Choose one of the following options to label the nodes. If you are running Db2U with elevated privileges, the podAntiAffinity rules are disabled in the Data Virtualization engine pods and the head and worker pods can be scheduled on the same node.
    • Label the nodes with the podAntiAffinity rule enabled:
      Label at least two nodes because the head pod and worker pod cannot be scheduled on the same node due to the default podAntiAffinity rules on the head and worker pods.
      Run the following commands:
      oc label node 'node1-name' icp4data=dv 
      oc label node 'node2-name' icp4data=dv
    • Label the node with the podAntiAffinity rule disabled:
      If you want to schedule all Data Virtualization pods on one node, the podAntiAffinity rules must be disabled to allow the head and worker pods to be scheduled on the same node.
      You need to label only one node. Run the following command:
      oc label node 'node1-name' icp4data=dv
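    After you label the nodes, you can confirm that the label was applied by listing only the nodes that carry it:
    oc get nodes -l icp4data=dv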
  4. Change to the project where the Cloud Pak for Data control plane is installed:
    oc project ${PROJECT_CPD_INST_OPERANDS}
    This command uses an environment variable so that you can run the command exactly as written. For information about sourcing environment variables, see Setting up installation environment variables.
  5. Add the nodeAffinity field to the pod specification so that the pods are assigned to nodes that have the label icp4data=dv. Edit the Db2 Big SQL CR to add the details from the example spec section:
    oc edit bigsql db2u-dv
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: icp4data
                operator: In
                values:
                - dv
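    After you save the CR, you can confirm that the affinity settings are present, for example:
    oc get bigsql db2u-dv -o jsonpath='{.spec.affinity}'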
  6. Wait for the Data Virtualization pods to restart. If the pods do not restart, run the following command to restart them:
    oc delete pods -n ${PROJECT_CPD_INST_OPERANDS} -l formation_id=db2u-dv
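    To monitor the pods while they restart, you can watch them until they return to the Running state, for example:
    oc get pods -n ${PROJECT_CPD_INST_OPERANDS} -l formation_id=db2u-dv --watch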
  7. Run the following command to confirm that the pods are created on the right nodes:
    oc get pods -n ${PROJECT_CPD_INST_OPERANDS} -l formation_id=db2u-dv -o wide
    All Data Virtualization service instance pods are listed and you can verify the values in the NODE column.
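    If you prefer a shorter listing, you can show only the pod names and the nodes that they run on, for example:
    oc get pods -n ${PROJECT_CPD_INST_OPERANDS} -l formation_id=db2u-dv -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName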
Note: When you add nodeAffinity to schedule pods on specific nodes, the pods might not be schedulable if the labeled nodes do not meet the Data Virtualization instance provisioning requirements for cluster resources (CPU or memory).

Nodes that are dedicated only to Data Virtualization

To run Data Virtualization instance pods on specific OpenShift worker nodes that are dedicated only to Data Virtualization, complete the following steps. Pods from other services cannot run on these nodes.

  1. Log in to Red Hat OpenShift Container Platform as an instance administrator:
    oc login ${OCP_URL}
  2. Retrieve the names of the worker nodes that you want to dedicate to Data Virtualization:
    oc get nodes
  3. Label and taint the nodes. Tainted nodes repel pods that do not tolerate their taints, so the scheduler tries to place only Data Virtualization pods on these nodes. Pods that already exist on the nodes are not evicted, and the scheduler avoids scheduling newly created pods that do not tolerate the taints on the nodes.

    Choose one of the following options. If you are running Db2U with elevated privileges, the podAntiAffinity rules are disabled in the Data Virtualization engine pods and the head and worker pods can be scheduled on the same node.

    • Label and taint the nodes with the podAntiAffinity rule enabled:
      Label and taint at least two nodes because the head pod and worker pod cannot be scheduled on the same node due to the default podAntiAffinity rules on the head and worker pods. Run the following commands:
      oc label node 'node1-name' icp4data=dv
      oc label node 'node2-name' icp4data=dv
      oc adm taint node 'node1-name' icp4data=dv:PreferNoSchedule
      oc adm taint node 'node2-name' icp4data=dv:PreferNoSchedule
    • Label and taint the node with the podAntiAffinity rule disabled:
      If you want to schedule all Data Virtualization pods on one node, the podAntiAffinity rules must be disabled to allow the head and worker pods to be scheduled on the same node.
      You need to label and taint only one node. Run the following commands:
      oc label node 'node1-name' icp4data=dv
      oc adm taint node 'node1-name' icp4data=dv:PreferNoSchedule
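    After you apply the taints, you can confirm that they are set on each node, for example:
    oc get node 'node1-name' -o jsonpath='{.spec.taints}'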
  4. Change to the project where the Cloud Pak for Data control plane is installed:
    oc project ${PROJECT_CPD_INST_OPERANDS}
    This command uses an environment variable so that you can run the command exactly as written. For information about sourcing environment variables, see Setting up installation environment variables.
  5. Add the nodeAffinity field and tolerations to the pod specification in the Db2 Big SQL CR so that the pods are assigned to the labeled and tainted nodes. Edit the CR to add the details from the example spec section:
    oc edit bigsql db2u-dv
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: icp4data
                operator: In
                values:
                - dv
      tolerations:
      - key: "icp4data"
        operator: "Equal"
        value: "dv"
        effect: "PreferNoSchedule"    
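    After you save the CR, you can confirm that the tolerations are present, for example:
    oc get bigsql db2u-dv -o jsonpath='{.spec.tolerations}'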
  6. Wait for the Data Virtualization pods to restart. If the pods do not restart, run the following command to restart them:
    oc delete pods -n ${PROJECT_CPD_INST_OPERANDS} -l formation_id=db2u-dv
  7. Run the following command to confirm that the pods are created on the right nodes:
    oc get pods -n ${PROJECT_CPD_INST_OPERANDS} -l formation_id=db2u-dv -o wide
    All Data Virtualization service instance pods are listed and you can verify the values in the NODE column.
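    Because the nodes are dedicated to Data Virtualization, you can also check which pods from any project run on a tainted node, for example:
    oc get pods --all-namespaces -o wide --field-selector spec.nodeName='node1-name'
    Pods that were already running on the node before you applied the taint are not evicted, so pods from other services might still appear in this list until they are rescheduled.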
Note: When you add tolerations to schedule pods on specific nodes, the pods might not be schedulable if the labeled and tainted nodes do not meet the Data Virtualization instance provisioning requirements for cluster resources (CPU or memory).