IBM Support

Watson Studio Local node affinity for GPU & CPU images

How To


Summary

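This document describes how to configure node affinity on a Watson Studio Local cluster so that GPU-based runtimes are scheduled on GPU compute nodes and non-GPU runtimes are scheduled on non-GPU compute nodes.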

Environment

Prerequisite

Watson Studio Local (WSL) 1.2.3.1 patch-02

Steps

Preinstall Steps

Back up the configmap
kubectl get configmap -n dsx runtimes-def-configmap -o yaml > configmap.backup.yaml
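
Before editing anything, it may be worth confirming that the backup file was actually written. The following is a minimal sketch that assumes the file name configmap.backup.yaml from the command above; the function name check_backup is hypothetical and is not part of Watson Studio Local or kubectl.

```shell
# Sketch: sanity-check the configmap backup before making changes.
# check_backup is a hypothetical helper name, not part of WSL or kubectl.
check_backup() {
  local file="${1:-configmap.backup.yaml}"
  # The file must exist, be non-empty, and mention the configmap name.
  [ -s "$file" ] && grep -q 'runtimes-def-configmap' "$file"
}

# Usage:
#   check_backup && echo "backup looks OK" || echo "backup missing or empty"
```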

Identify the compute nodes.

kubectl get nodes --show-labels | grep is_compute=true

Identify which compute nodes have GPUs and which do not, then label each node accordingly.

For each GPU node, run the following command:

kubectl label no <node name> is_gpu=true

For each non-GPU node, run the following command:

kubectl label no <node name> is_gpu=false
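
On clusters with many compute nodes, the two labeling commands above could be wrapped in a small loop. This is only a sketch: the helper name label_is_gpu and the KUBECTL variable are hypothetical, the node names in the usage lines are placeholders, and --overwrite is added so that a node that already carries an is_gpu label can be relabeled.

```shell
# Sketch: label several nodes with is_gpu=true or is_gpu=false in one call.
# Set KUBECTL="echo kubectl" to preview the commands without running them.
KUBECTL="${KUBECTL:-kubectl}"

# label_is_gpu is a hypothetical helper: $1 is true or false, the rest are node names.
label_is_gpu() {
  local value="$1"; shift
  local node
  for node in "$@"; do
    # --overwrite replaces an existing is_gpu label instead of failing.
    $KUBECTL label node "$node" "is_gpu=${value}" --overwrite
  done
}

# Usage (node names are placeholders):
#   label_is_gpu true  gpu-node-1 gpu-node-2
#   label_is_gpu false cpu-node-1
#   kubectl get nodes -L is_gpu     # show the is_gpu label as a column
```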

Edit the runtime definitions configmap
kubectl -n dsx edit configmap runtimes-def-configmap

List of GPU-based runtime types

  1. dsx-scripted-ml-gpu-python3
  2. Jupyter-gpu-py35

Add the following snippet to the definition of each GPU-based runtime type listed above:

"nodeAffinity": {
    "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [
            {
                "matchExpressions": [
                    {
                        "key": "is_gpu",
                        "operator": "In",
                        "values": ["true"]
                    }
                ]
            }
        ]
    }
},

List of non-GPU based runtime types

  1. dsx-scripted-ml-python2
  2. dsx-scripted-ml-python3
  3. jupyter-py35
  4. Jupyter
  5. python27-script-as-a-service
  6. python35-script-as-a-service
  7. R-script-as-a-service
  8. Rstudio
  9. Rstudio-worker
  10. Shaper
  11. Zeppelin

Add the following snippet to the definition of each non-GPU runtime type listed above:

"nodeAffinity": {
    "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [
            {
                "matchExpressions": [
                    {
                        "key": "is_gpu",
                        "operator": "In",
                        "values": ["false"]
                    }
                ]
            }
        ]
    }
},
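
After saving the configmap, a quick occurrence count can help catch a runtime type that was missed. The sketch below simply counts how often the string is_gpu appears in text read from standard input; the helper name count_is_gpu is hypothetical, and the expected total assumes exactly one snippet per runtime type and no other mention of is_gpu in the configmap.

```shell
# Sketch: count occurrences of the is_gpu key in text read from stdin.
# count_is_gpu is a hypothetical helper name.
count_is_gpu() {
  # grep -o prints each match on its own line, so wc -l counts matches
  # even when several of them share one line of input.
  grep -o 'is_gpu' | wc -l | tr -d ' '
}

# Usage:
#   kubectl -n dsx get configmap runtimes-def-configmap -o yaml | count_is_gpu
# With one snippet in each of the 2 GPU-based and 11 non-GPU runtime types,
# the count should be 13 under the assumptions stated above.
```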

For each runtime type listed in the previous sections, run the following commands:

  1. View any deployments still running with the old GPU image:
    kubectl get deployment -n dsx -l type=<runtime type>
  2. Delete those deployments, if any exist:
    kubectl delete deployment -n dsx -l type=<runtime type>
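
Because the same two commands are repeated for every runtime type, they could be scripted as a loop. The following sketch copies the runtime type names from the lists in this document and assumes that the type label on each deployment matches those names exactly, including case; the helper name for_each_runtime and the KUBECTL variable are hypothetical.

```shell
# Sketch: view or delete deployments for every runtime type in this document.
# Set KUBECTL="echo kubectl" to preview the commands without running them.
KUBECTL="${KUBECTL:-kubectl}"

# Runtime type names copied from the GPU and non-GPU lists above.
RUNTIME_TYPES="dsx-scripted-ml-gpu-python3 Jupyter-gpu-py35
dsx-scripted-ml-python2 dsx-scripted-ml-python3 jupyter-py35 Jupyter
python27-script-as-a-service python35-script-as-a-service
R-script-as-a-service Rstudio Rstudio-worker Shaper Zeppelin"

# for_each_runtime is a hypothetical helper; $1 is "get" or "delete".
for_each_runtime() {
  local action="$1" rt
  # Unquoted on purpose: split the list on whitespace (bash/sh behavior).
  for rt in $RUNTIME_TYPES; do
    $KUBECTL "$action" deployment -n dsx -l "type=${rt}"
  done
}

# Usage:
#   for_each_runtime get      # review what is running first
#   for_each_runtime delete   # then remove the old deployments
#   kubectl get pods -n dsx -o wide   # NODE column shows where pods are placed
```

Running the get pass before the delete pass keeps the review step from the numbered list above, so nothing is removed without being seen first.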

To roll back the changes

  1. Run
    kubectl -n dsx edit configmap runtimes-def-configmap
  2. Remove the nodeAffinity section added to the runtime types
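
To check that a rollback removed everything, the live configmap can be compared with the backup taken in the preinstall steps. This is a sketch that assumes the backup file name configmap.backup.yaml used earlier; the helper name diff_against_backup and the KUBECTL variable are hypothetical.

```shell
# Sketch: compare the live configmap with the backup taken earlier.
KUBECTL="${KUBECTL:-kubectl}"

# diff_against_backup is a hypothetical helper name; $1 is the backup file.
diff_against_backup() {
  local backup="${1:-configmap.backup.yaml}"
  # Lines prefixed with ">" exist only in the live configmap.
  $KUBECTL -n dsx get configmap runtimes-def-configmap -o yaml | diff "$backup" -
}

# Usage:
#   diff_against_backup
# Differences limited to metadata fields such as resourceVersion are expected
# after an edit; remaining nodeAffinity lines suggest an incomplete rollback.
```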

Document Location

Worldwide


Document Information

Modified date:
23 November 2020

UID

ibm16371804