Improving GPU utilization with resource bin packing

In Kubernetes, resource bin packing is the process of scheduling pods onto nodes to maximize your resource utilization and minimize the number of nodes required. This is especially useful when you have a limited number of nodes that can support specific workloads, such as GPU-based workloads.

If you plan to install services that require GPUs, you can enable bin packing through the node scoring configuration of the scheduling service.

Who needs to complete this task?

A cluster administrator must complete this task.

When do you need to complete this task?

This task applies only if you plan to install services that require GPUs.

Before you begin

To use node scoring, the scheduling service must be installed.

To check whether the scheduling service is installed, run the following command:
    oc get scheduling -A
  • If the scheduling service is installed, the command returns information about the project where the scheduling service is installed and the version that is installed.
  • If the scheduling service is not installed, the command returns an empty response.

    To install the scheduling service, see Installing shared cluster components for IBM Software Hub.
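If you want to script this check, the following is a minimal sketch rather than an official procedure. It assumes that the oc CLI is on your PATH and that you are logged in to the cluster. The function name scheduling_installed is hypothetical. It relies on the behavior described above: the command prints nothing when the scheduling service is not installed.

```shell
# Hypothetical helper: succeeds when `oc get scheduling -A` lists at least one resource.
scheduling_installed() {
  # --no-headers drops the column header row, so any remaining output is a resource.
  # Errors (for example, an unknown resource type) are treated as "not installed".
  [ -n "$(oc get scheduling -A --no-headers 2>/dev/null)" ]
}
```

For example, you might run `scheduling_installed || echo "Install the scheduling service first"` before you continue with this task.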

Best practice: You can run many of the commands in this task exactly as written if you set up environment variables for your installation. For instructions, see Setting up installation environment variables.

Ensure that you source the environment variables before you run the commands in this task.
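The commands in this task reference the PROJECT_SCHEDULING_SERVICE environment variable. As a sketch, you can confirm that the variable is set before you run them. The function name check_scheduling_project is hypothetical and is not part of the product.

```shell
# Hypothetical guard: fails early if the scheduling project variable is unset or empty.
check_scheduling_project() {
  if [ -z "${PROJECT_SCHEDULING_SERVICE:-}" ]; then
    echo "PROJECT_SCHEDULING_SERVICE is not set; source your environment variables first." >&2
    return 1
  fi
  echo "Using scheduling service project: ${PROJECT_SCHEDULING_SERVICE}"
}
```

If the variable is empty, the `-n ${PROJECT_SCHEDULING_SERVICE}` argument in the later commands expands to nothing, so a check like this can catch the problem before any ConfigMap is deleted.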

About this task

To enable bin packing, set the nodePreference setting in the scheduler configuration to MoreGPURequest. When bin packing is enabled, the scheduling service schedules GPU-based pods onto as few GPU nodes as possible.

Procedure

  1. Print the contents of the ibm-cpd-scheduler-scheduler ConfigMap to a file called sched-config.yaml:
    oc get cm ibm-cpd-scheduler-scheduler \
    -n ${PROJECT_SCHEDULING_SERVICE} \
    -o yaml > sched-config.yaml
  2. Open the sched-config.yaml file in a text editor.
  3. In the data section, locate the scheduler.yaml section:
    data:
      scheduler.yaml: |-
  4. Edit the nodePreference configuration in the scheduler.yaml section to specify MoreGPURequest:
        nodePreference: MoreGPURequest

    Ensure that the configuration is correctly indented:

    data:
      scheduler.yaml: |-
        nodePreference: MoreGPURequest
  5. Save your changes to the sched-config.yaml file.
  6. Delete the ibm-cpd-scheduler-scheduler ConfigMap:
    oc delete cm ibm-cpd-scheduler-scheduler \
    -n ${PROJECT_SCHEDULING_SERVICE}
  7. Create the ibm-cpd-scheduler-scheduler ConfigMap from the sched-config.yaml file:
    oc create -f sched-config.yaml
  8. Restart the scheduling service to pick up the updated ibm-cpd-scheduler-scheduler ConfigMap:
    oc rollout restart deploy ibm-cpd-scheduler-scheduler \
    -n ${PROJECT_SCHEDULING_SERVICE}
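If you prefer not to edit the file by hand, steps 2 through 5 can be approximated with a small script. The following is a minimal sketch, not the documented method. It assumes that the exported sched-config.yaml file already contains a nodePreference line inside the scheduler.yaml section, and it rewrites every line that begins with nodePreference: while preserving the indentation. The function name set_more_gpu_request is hypothetical.

```shell
# Hypothetical helper: set nodePreference to MoreGPURequest in an exported ConfigMap file.
# Assumption: the file already has a "nodePreference:" line; its indentation is kept as-is.
set_more_gpu_request() {
  file="$1"
  if ! grep -q '^[[:space:]]*nodePreference:' "$file"; then
    echo "No nodePreference line found in $file; edit the file manually." >&2
    return 1
  fi
  # Write to a temporary file first so the original survives a failed edit.
  sed 's/^\([[:space:]]*nodePreference:\).*/\1 MoreGPURequest/' "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
}
```

You would call it as `set_more_gpu_request sched-config.yaml` and then review the file before you delete and re-create the ConfigMap. If the function reports that no nodePreference line exists, follow the manual steps instead.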

Results

The scheduling service now uses bin packing to maximize GPU utilization and minimize the number of GPU nodes required.
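To confirm that the change is in place, you can inspect the live ConfigMap. The following sketch assumes that PROJECT_SCHEDULING_SERVICE is set and that the setting appears in the ConfigMap exactly as shown in the procedure. The function name bin_packing_enabled is hypothetical.

```shell
# Hypothetical check: succeeds when the live ConfigMap contains the MoreGPURequest setting.
bin_packing_enabled() {
  oc get cm ibm-cpd-scheduler-scheduler \
    -n "${PROJECT_SCHEDULING_SERVICE}" \
    -o yaml | grep -q 'nodePreference: MoreGPURequest'
}
```

For example, `bin_packing_enabled && echo "Bin packing is configured"`. This check confirms only that the ConfigMap contains the setting. To wait for the restarted deployment to become ready, you can also run `oc rollout status deploy ibm-cpd-scheduler-scheduler -n ${PROJECT_SCHEDULING_SERVICE}`.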