Deploying Kubernetes jobs in LSF Connector for Kubernetes
LSF Connector for Kubernetes enables new options for jobs that are run through Kubernetes. These options are needed for running High Performance Computing (HPC) applications and large AI jobs.
The options extend the pod specification with annotations to define which scheduling and placement policies to use. To enable these features, the pod specification must have the schedulerName parameter set to lsf, for example:
spec.template.spec.schedulerName: lsf
The connector provides the following new features for Kubernetes jobs:
- Job priority
- Application profiles
- Fair sharing of resources
- Parallel jobs
- GPU management
The following table shows the new annotations along with the equivalent LSF job submission option.
| Pod spec field | Description | LSF job submission option |
|---|---|---|
| spec.template.metadata.name | A name to assign to the job. | Job name (-J) |
| spec.template.metadata.annotations.lsf.ibm.com/dependency | A job that must complete before this job is dispatched. | Job dependency (-w) |
| spec.template.metadata.annotations.lsf.ibm.com/project | A project name to assign to the job. | Project name (-P) |
| spec.template.metadata.annotations.lsf.ibm.com/application | An application profile to use. | Application profile (-app) |
| spec.template.metadata.annotations.lsf.ibm.com/gpu | The GPU requirements for the job. | GPU requirement (-gpu) |
| spec.template.metadata.annotations.lsf.ibm.com/queue | The name of the job queue in which to run the job. | Queue (-q) |
| spec.template.metadata.annotations.lsf.ibm.com/jobGroup | A job group to assign to the job. | Job group (-g) |
| spec.template.metadata.annotations.lsf.ibm.com/fairshareGroup | The fair share group to which to assign the job. | Fair share group (-G) |
| spec.template.metadata.annotations.lsf.ibm.com/user | The user to run the application as, and for accounting. | Job submission user |
| spec.template.metadata.annotations.lsf.ibm.com/serviceClass | The service class to apply to the job. | Service class (-sla) |
| spec.template.metadata.annotations.lsf.ibm.com/reservation | The advance reservation in which to run the job. | Advance reservation (-U) |
| spec.template.spec.containers[].resources.requests.memory | The amount of memory to reserve for the job. | Memory reservation (-R "rusage[mem=...]") |
| spec.template.spec.schedulerName | Set to lsf. | N/A |
For more information about the annotations and their meanings, refer to IBM Spectrum LSF cluster management essentials.
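As a sketch of GPU management from the table above, a job can request GPUs through the lsf.ibm.com/gpu annotation, whose value uses the same syntax as the LSF -gpu submission option. The job name and the annotation value shown here (for example, "num=2:mode=exclusive_process") are illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-job-001
spec:
  template:
    metadata:
      name: gpu-job-001
      annotations:
        # Illustrative value: requests 2 GPUs in exclusive process mode,
        # using the same syntax as the LSF -gpu submission option
        lsf.ibm.com/gpu: "num=2:mode=exclusive_process"
    spec:
      schedulerName: lsf
      containers:
      - name: gputest
        image: ubuntu
        command: ["sleep", "60"]
      restartPolicy: Never
```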
The following example shows a basic job specification that directs scheduling to LSF:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-001
spec:
  template:
    metadata:
      name: myjob-001
    spec:
      schedulerName: lsf  # This directs scheduling to the LSF scheduler
      containers:
      - name: ubuntutest
        image: ubuntu
        command: ["sleep", "60"]
        resources:
          requests:
            memory: 5Gi
      restartPolicy: Never
```

This example enables Kubernetes to use lsf as the job scheduler. The LSF job scheduler can then apply its policies to choose when and where the job will run.
The following example adds annotations that pass more scheduling information to LSF:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-001
spec:
  template:
    metadata:
      name: myjob-001
      # The following annotations provide additional scheduling
      # information to better place the pods on the worker nodes.
      # NOTE: Some annotations require additional LSF configuration.
      annotations:
        lsf.ibm.com/project: "big-project-1000"
        lsf.ibm.com/queue: "normal"
        lsf.ibm.com/jobGroup: "/my-group"
        lsf.ibm.com/fairshareGroup: "gold"
    spec:
      schedulerName: lsf  # This directs scheduling to the LSF scheduler
      containers:
      - name: ubuntutest
        image: ubuntu
        command: ["sleep", "60"]
      restartPolicy: Never
```

The annotations provide the LSF scheduler with more information about the job and how it should be run.
The following example runs the pod as a specific user:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-uid1003-0002
spec:
  template:
    metadata:
      name: myjob-uid1003-0002
    spec:
      schedulerName: lsf
      containers:
      - name: ubuntutest
        image: ubuntu
        command: ["id"]
      restartPolicy: Never
      securityContext:
        runAsUser: 1003
        fsGroup: 100
        runAsGroup: 1001
```

The pod is run as UID 1003 and produces the following output:

```
uid=1003(billy) gid=0(root) groups=0(root),1001(users)
```
Note the GID and supplementary groups in the output, and ensure that you limit who can create pods. Alternatively, you can use LSF application profiles so that the administrator can predefine the pod specification file.
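As a sketch of the application-profile approach, a job can reference a predefined profile through the lsf.ibm.com/application annotation (equivalent to the -app submission option). The profile name "dockerapp" below is illustrative and must already be configured by the LSF administrator:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-app-001
spec:
  template:
    metadata:
      name: myjob-app-001
      annotations:
        # Illustrative profile name; the profile must be predefined
        # by the LSF administrator (equivalent to bsub -app)
        lsf.ibm.com/application: "dockerapp"
    spec:
      schedulerName: lsf
      containers:
      - name: ubuntutest
        image: ubuntu
        command: ["sleep", "60"]
      restartPolicy: Never
```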
For further information and examples, refer to https://github.com/IBMSpectrumComputing/lsf-kubernetes.