Configuring LSF to run NVIDIA Docker jobs
Configure the NVIDIA Docker application profile or queue in LSF to run NVIDIA Docker jobs.
About this task
If you are using the NVIDIA Docker integration, you need to configure separate application profiles or queues to run NVIDIA Docker jobs.
- To prepare data for the container as a pre-execution or post-execution operation, put this data into a directory that is mounted to a job container.
- To customize the internal job container, modify the starter scripts to prepare the appropriate environment.
Procedure
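Edit the lsb.applications or lsb.queues file and define the CONTAINER parameter for the application profile or queue that runs NVIDIA Docker jobs.
If this parameter is specified in both files, the parameter value in the lsb.applications file overrides the value in the lsb.queues file.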
CONTAINER=nvidia-docker[image(image_name) options(docker_run_options)]
In the following examples, LSF uses the ubuntu image to run the job in the Docker container.
- For sequential jobs:
CONTAINER=nvidia-docker[image(ubuntu) options(--rm)]
The container for the job is removed after the job is done, which is enabled with the docker run --rm option.
- For parallel jobs:
CONTAINER=nvidia-docker[image(ubuntu) options(--rm --net=host --ipc=host --runtime=nvidia -v /path/to/my/passwd:/etc/passwd)]
This configuration uses the following docker run options:
- --rm: The container for the job is removed after the job is done.
- --net=host: LSF needs the host network for launching parallel tasks.
- -v: Mounts the passwd file into the container, because LSF needs the user ID and user name for launching parallel tasks.
- --runtime=nvidia: You must specify this option if the container image is using NVIDIA Docker, version 2.0.
Note: The passwd file must be in the standard format for UNIX and Linux password files, such as the following format:
user1:x:10001:10001:::
user2:x:10002:10002:::
For more details, refer to the CONTAINER parameter in the lsb.applications file or lsb.queues file.
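As a minimal sketch of how the sequential-job example might look in the lsb.applications file, the following stanza defines an application profile. The profile name nvdockerapp and the description text are hypothetical, and a real profile may need additional parameters that depend on your site and LSF version.

```
# lsb.applications (sketch; NAME and DESCRIPTION are hypothetical values)
Begin Application
NAME        = nvdockerapp
DESCRIPTION = Runs jobs in NVIDIA Docker containers
# Sequential-job form from the example above; --rm removes the container when the job is done
CONTAINER   = nvidia-docker[image(ubuntu) options(--rm)]
End Application
```

After the configuration is reloaded with badmin reconfig, a job could be submitted to this hypothetical profile with bsub -app nvdockerapp followed by the job command.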