Creating a deployment with attached GPU resources
Learn how to create a container with attached GPU resources.
- For an overview of Nvidia GPU support in IBM® Cloud Private, see Nvidia GPU support.
- Ensure that GPU drivers are installed on worker nodes. You can run the `kubectl describe nodes` command to verify that `nvidia.com/gpu` is listed as an available resource.
IBM Cloud Private offers built-in GPU support for the images in the nvidia-docker project. To specify GPU resources, your deployments must specify images from the nvidia-docker project.
- For Linux® on Power® (ppc64le) nodes, use the images in the nvidia/cuda-ppc64le Docker Hub repository or images that you derive from them.
- For Linux® (x86_64) nodes, use the images in the nvidia/cuda Docker Hub repository or images that you derive from them.
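Before you deploy, you can confirm that the nodes actually advertise the `nvidia.com/gpu` resource. The following sketch filters a captured sample of `kubectl describe nodes` output; the node capacity values shown are hypothetical, and on a live cluster you would pipe the real command output instead.

```shell
# Filter node capacity output for the nvidia.com/gpu resource.
# A captured sample stands in for live `kubectl describe nodes` output,
# so this sketch runs without a cluster; the values are hypothetical.
sample='Capacity:
 cpu:             8
 memory:          32939956Ki
 nvidia.com/gpu:  1'

# On a real cluster: kubectl describe nodes | grep 'nvidia.com/gpu'
printf '%s\n' "$sample" | grep 'nvidia.com/gpu'
```

If the `nvidia.com/gpu` line is missing or reports `0`, the drivers or the device plugin are not set up on that node.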
Two methods are available for creating a deployment from the management console: you can enter the parameter values in the Create Deployment window, or paste a YAML file into the Create resource window.
Required user type or access level: Cluster administrator or team administrator
Before you begin, ensure that the nodes are ready for deployment. For more information, see Configuring a GPU worker node.
Known issues and limitations
IBM® Z (s390x) nodes do not support GPUs.
If you run IBM Cloud Private in a mixed environment that has Linux® (x86_64), Linux® on Power® (ppc64le), and IBM® Z (s390x) nodes, the nvidia-device-plugin DaemonSet runs only on the Linux® (x86_64) and Linux® on Power® (ppc64le) cluster nodes.
Creating a deployment with GPU resources that are attached by using the Create Deployment window
- From the navigation menu, click Workloads > Deployments > Create Deployment.
- From Container settings, specify the number of GPUs that are requested for the deployment. Ensure that this value is a positive integer.
- Enter all the other parameter values that are needed for your deployment.
- Click Create.
Creating a deployment with GPU resources attached by using the kubectl CLI
- Create a `gpu-demo.yaml` file. This sample `gpu-demo.yaml` file creates a container deployment with a single attached GPU resource. The sample deployment uses the `nvidia/cuda:8.0-runtime` image, which is a `nvidia-docker` image for Linux® systems. You can obtain this image from the nvidia/cuda Docker Hub repository. For Power® Systems, use one of the `nvidia/cuda-ppc64le` images that are available in the nvidia/cuda-ppc64le Docker Hub repository.

  ```yaml
  apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    name: gpu-demo
  spec:
    replicas: 1
    template:
      metadata:
        labels:
          run: gpu-demo
      spec:
        containers:
        - name: gpu-demo
          image: nvidia/cuda:8.0-runtime
          command: ["/bin/sh", "-c"]
          args: ["nvidia-smi && tail -f /dev/null"]
          resources:
            limits:
              nvidia.com/gpu: 1
  ```
- Install the `kubectl` command line interface. See Accessing your cluster from the Kubernetes CLI (kubectl).
- Update the ClusterImagePolicy `ibmcloud-default-cluster-image-policy` to allow creating deployments from the Docker registry `docker.io/nvidia/*`.

  ```
  # kubectl edit ClusterImagePolicy ibmcloud-default-cluster-image-policy
  clusterimagepolicy.securityenforcement.admission.cloud.ibm.com/ibmcloud-default-cluster-image-policy edited
  ```

  The updated ClusterImagePolicy resembles the following code:

  ```
  # kubectl get ClusterImagePolicy ibmcloud-default-cluster-image-policy -o yaml
  apiVersion: securityenforcement.admission.cloud.ibm.com/v1beta1
  kind: ClusterImagePolicy
  metadata:
    name: ibmcloud-default-cluster-image-policy
    ...
  spec:
    repositories:
    - name: mycluster.icp:8500/*
    ...
    - name: docker.io/nvidia/*
  ```
- Create the GPU demo deployment:

  ```
  # kubectl apply -f gpu-demo.yaml
  deployment.extensions/gpu-demo created
  # kubectl get deployment
  NAME       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
  gpu-demo   1         1         1            0           4m
  ```
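The AVAILABLE column in the `kubectl get deployment` output tells you when the rollout finishes. This sketch parses that column with `awk` from a captured sample of the output above; on a live cluster, `kubectl rollout status deployment/gpu-demo` is the more direct check.

```shell
# Parse the AVAILABLE column for the gpu-demo deployment.
# A captured sample stands in for live `kubectl get deployment` output.
sample='NAME       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
gpu-demo   1         1         1            0           4m'

available=$(printf '%s\n' "$sample" | awk '$1 == "gpu-demo" {print $5}')
if [ "$available" -ge 1 ]; then
  echo "rollout complete"
else
  echo "still rolling out: $available available"
fi
# → still rolling out: 0 available
```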
Check whether the GPU resource is detected inside the container
- To view a list of running containers, run this command:

  ```
  kubectl get pods
  ```

  From the returned output, you can locate the `gpu-demo` deployment.
- Access the logs for the `gpu-demo` deployment. For example:

  ```
  kubectl logs gpu-demo-3638364752-zkqel
  ```

  The output resembles the following code:

  ```
  Tue Feb  7 08:38:11 2017
  +------------------------------------------------------+
  | NVIDIA-SMI 352.63     Driver Version: 352.63         |
  |-------------------------------+----------------------+----------------------+
  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
  |===============================+======================+======================|
  |   0  GeForce GT 730      Off  | 0000:01:00.0     N/A |                  N/A |
  | 36%   41C    P8    N/A /  N/A |      4MiB /  1023MiB |     N/A      Default |
  +-------------------------------+----------------------+----------------------+
  +-----------------------------------------------------------------------------+
  | Processes:                                                       GPU Memory |
  |  GPU       PID  Type  Process name                               Usage      |
  |=============================================================================|
  |    0            Not Supported                                               |
  +-----------------------------------------------------------------------------+
  ```

  After the deployment completes, a new deployment is displayed on the Deployments page. The DESIRED, CURRENT, READY, and AVAILABLE columns all display the same value, which is the number of pods or replicas that you specified during the deployment.
- Click the deployment name to view detailed information about the deployment. Review the deployment properties and ensure that they are accurate.
- To access your deployment from the internet, you must expose your deployment as a service. See Creating services.
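The GPU check in the verification steps can also be scripted. This sketch extracts the driver version from the `nvidia-smi` banner that appears in the pod logs; a captured log line stands in for live `kubectl logs <pod-name>` output, so it runs without a cluster.

```shell
# Extract the driver version from the nvidia-smi banner in the pod logs.
# A captured log line stands in for `kubectl logs <pod-name>` output.
log_line='| NVIDIA-SMI 352.63     Driver Version: 352.63         |'

printf '%s\n' "$log_line" | sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p'
# → 352.63
```

An empty result means `nvidia-smi` never printed its banner, which usually indicates that the container did not receive a GPU.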