-n

Submits a parallel job and specifies the number of tasks in the job.

Categories

resource

Synopsis

bsub -n min_tasks[,max_tasks]

Description

The number of tasks determines how many slots are allocated to the job. Usually, the number of slots assigned to a job equals the number of tasks specified; for example, one task is allocated one slot. Some slots or processors might be on the same multiprocessor host.

You can specify a minimum and maximum number of tasks. For example, this job requests a minimum of 4 tasks and can launch up to 6 tasks:

bsub -n 4,6 a.out

The job can start once enough slots or processors are available for the minimum number of tasks specified. If you do not specify a maximum number of tasks, the number you specify represents the exact number of tasks to launch.

If PARALLEL_SCHED_BY_SLOT=Y in lsb.params, this option specifies the number of slots required to run the job, not the number of processors.

When used with the -R option and a compound resource requirement, the number of slots in the compound resource requirement must be compatible with the minimum and maximum tasks specified.

A job is rejected if it has fewer tasks than the minimum TASKLIMIT defined for the queue or application profile to which it is submitted, or more tasks than the maximum TASKLIMIT. If the job specifies minimum and maximum tasks, the maximum tasks requested cannot be less than the minimum TASKLIMIT, and the minimum tasks requested cannot be more than the maximum TASKLIMIT.

For example, if the queue defines TASKLIMIT=4 8:
  • bsub -n 6 is accepted because it requests a number of tasks within the range of TASKLIMIT.
  • bsub -n 9 is rejected because it requests more tasks than the TASKLIMIT allows.
  • bsub -n 1 is rejected because it requests fewer tasks than the TASKLIMIT allows.
  • bsub -n 6,10 is accepted because the minimum value 6 is within the range of the TASKLIMIT setting.
  • bsub -n 1,6 is accepted because the maximum value 6 is within the range of the TASKLIMIT setting.
  • bsub -n 10,16 is rejected because its range is outside the range of TASKLIMIT.
  • bsub -n 1,3 is rejected because its range is outside the range of TASKLIMIT.

See the TASKLIMIT parameter in lsb.queues and lsb.applications for more information.
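
The acceptance rule for TASKLIMIT can be sketched as a small shell function. This is only an illustration of the rule as described in this section, not how LSF implements it. The function name check_tasklimit is hypothetical, and the sketch assumes the two-value form of TASKLIMIT (minimum and maximum) used in the example.

```shell
# Hypothetical helper, not part of LSF: mimics the TASKLIMIT check
# described in this section for a two-value setting (minimum, maximum).
# Usage: check_tasklimit LIMIT_MIN LIMIT_MAX N_SPEC
# N_SPEC is the value given to -n, either "tasks" or "min,max".
check_tasklimit() {
    limit_min=$1
    limit_max=$2
    spec=$3
    case $spec in
        *,*) req_min=${spec%%,*}; req_max=${spec##*,} ;;
        *)   req_min=$spec; req_max=$spec ;;
    esac
    # Rejected when the whole request lies below or above the limit range.
    if [ "$req_max" -lt "$limit_min" ] || [ "$req_min" -gt "$limit_max" ]; then
        echo rejected
    else
        echo accepted
    fi
}

# Replay the requests from the TASKLIMIT=4 8 example.
for spec in 6 9 1 6,10 1,6 10,16 1,3; do
    echo "bsub -n $spec: $(check_tasklimit 4 8 "$spec")"
done
```

A single value is treated as a range whose minimum and maximum are equal, so one comparison covers both forms of the -n argument.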

If JOB_SIZE_LIST is defined in lsb.applications or lsb.queues and a job is submitted to a queue or an application profile with a job size list, the job submission must request a single job size (number of tasks) rather than a minimum and maximum value, and the requested job size must be one of the values defined in JOB_SIZE_LIST. Otherwise, LSF rejects the job submission. If a job submission does not include a job size request, LSF assigns the default job size to the submission request. JOB_SIZE_LIST overrides any TASKLIMIT parameters defined at the same level.

For example, if the application profile or queue defines JOB_SIZE_LIST=4 2 10 6 8:
  • bsub -n 6 is accepted because it requests a job size that is in JOB_SIZE_LIST.
  • bsub -n 9 is rejected because it requests a job size that is not in JOB_SIZE_LIST.
  • bsub -n 2,8 is rejected because you cannot request a range of job sizes when JOB_SIZE_LIST is defined.
  • bsub without specifying a job size (using -n or -R) is accepted and the job submission is assigned a job size request of 4 (the default value).

See the JOB_SIZE_LIST parameter in lsb.applications(5) or lsb.queues(5) for more information.
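
The JOB_SIZE_LIST rules can be sketched in the same way. The function name check_job_size is hypothetical, and the sketch assumes that the default job size is the first value in the list, which matches the default of 4 in the example. It illustrates the rules in this section only and is not LSF's own validation code.

```shell
# Hypothetical helper, not part of LSF.
# Usage: check_job_size "LIST" [N_SPEC]
# LIST is the JOB_SIZE_LIST value; N_SPEC is the value given to -n, if any.
check_job_size() {
    list=$1
    spec=${2:-}
    # No size requested: assume the first value in the list is the default.
    if [ -z "$spec" ]; then
        set -- $list
        echo "accepted with default size $1"
        return
    fi
    # A "min,max" range is never allowed with a job size list.
    case $spec in
        *,*) echo "rejected: range not allowed"; return ;;
    esac
    # A single size must match one of the listed values.
    for size in $list; do
        if [ "$size" = "$spec" ]; then
            echo "accepted"
            return
        fi
    done
    echo "rejected: size not in list"
}

# Replay the requests from the JOB_SIZE_LIST=4 2 10 6 8 example.
list="4 2 10 6 8"
check_job_size "$list" 6
check_job_size "$list" 9
check_job_size "$list" 2,8
check_job_size "$list"
```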

In the LSF multicluster capability environment, if a queue exports jobs to remote clusters (see the SNDJOBS_TO parameter in lsb.queues), then the task limit (TASKLIMIT) is not imposed on jobs submitted to this queue.

Once the required number of processors is available, the job is dispatched to the first host selected. The list of selected host names for the job is specified in the environment variables LSB_HOSTS and LSB_MCPU_HOSTS. The job itself is expected to start parallel components on these hosts and establish communication among them, optionally using RES.
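
A job script can inspect these variables to see what it was given. The following sketch assumes that LSB_MCPU_HOSTS holds alternating host names and slot counts (for example, "hostA 2 hostB 2"). The function name summarize_mcpu_hosts is hypothetical, and the fallback value is sample data so that the sketch also runs outside LSF.

```shell
# Hypothetical helper: print one "host slots" line for each entry in an
# LSB_MCPU_HOSTS-style string, followed by the total slot count.
summarize_mcpu_hosts() {
    total=0
    # Split the string into positional parameters: host count host count ...
    set -- $1
    while [ "$#" -ge 2 ]; do
        echo "$1 $2"
        total=$((total + $2))
        shift 2
    done
    echo "total $total"
}

# Inside a job, LSF sets LSB_MCPU_HOSTS. The fallback after ":-" is
# sample data used only when the variable is not set.
summarize_mcpu_hosts "${LSB_MCPU_HOSTS:-hostA 2 hostB 2}"
```

For a job submitted with a minimum and maximum number of tasks, the total shows how many slots were actually allocated.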

Specify first execution host candidates using the -m option when you want to ensure that a host has the required resources or runtime environment to handle processes that run on the first execution host.

If you specify one or more first execution host candidates, LSF looks for a first execution host that satisfies the resource requirements. If the first execution host does not have enough processors or job slots to run the entire job, LSF looks for additional hosts.
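
In the -m host list, a first execution host candidate is marked by adding an exclamation mark (!) after the host name. The sketch below builds such a list and prints the resulting command instead of running it. The function name first_host_arg and the host names hostA, hostB, and hostC are placeholders for illustration.

```shell
# Hypothetical helper: build the value for -m, marking the first host
# as a first execution host candidate with a trailing "!".
first_host_arg() {
    first=$1
    shift
    printf '%s!' "$first"
    for host in "$@"; do
        printf ' %s' "$host"
    done
    printf '\n'
}

# Print (rather than run) the resulting submission command.
echo "bsub -n 4 -m \"$(first_host_arg hostA hostB hostC)\" a.out"
```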

Examples

bsub -n 2 -R "span[ptile=1]" -network "protocol=mpi,lapi: type=sn_all: instances=2: usage=shared" poe /home/user1/mpi_prog

For this job running on hostA and hostB, each task reserves 8 windows (2*2*2): 2 protocols, 2 instances, and 2 networks. If enough network windows are available, other network jobs with usage=shared can run on hostA and hostB because the networks used by this job are shared.
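
The window count in this example is the product of the number of protocols, instances, and networks. As a minimal sketch of that arithmetic, using a hypothetical function name windows_per_task:

```shell
# Hypothetical helper: network windows reserved per task.
# Usage: windows_per_task PROTOCOLS INSTANCES NETWORKS
windows_per_task() {
    echo $(( $1 * $2 * $3 ))
}

# Values from the example: 2 protocols (mpi, lapi), 2 instances, 2 networks.
windows_per_task 2 2 2
```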