IBM Support

How to set a limit on the number of started jobs for a user

Question & Answer


Question

LSF can set limits on the number of jobs in lsb.queues or lsb.resources, but these limits are static: you must reconfigure the cluster for a change to take effect. You need a way to set a limit on the number of jobs for a specific user that can be changed at any time.

Answer

Started jobs are jobs in the "RUN", "SSUSP", and "USUSP" states. You can use the job group feature to limit them. The steps are as follows:

1. Create a job group

userA@HostA-170: bgadd /user1

Job group </user1> was added.

userA@HostA-172: bjgroup

GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER

/user1 0 0 0 0 0 0 () 0/- userA

2. Set a limit for the job group

userA@HostA-173: bgmod -L 10 /user1

Job group /user1 is modified.

userA@HostA-176: bjgroup -s /user1

GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER

/user1 0 0 0 0 0 0 () 0/10 userA

3. Submit jobs to the job group so that the limit applies to them

(1) In this example, 11 jobs are submitted and only 10 can run at a time

userA@HostA-181: repeat 11 bsub -g /user1 sleep 1000

Job <2701> is submitted to default queue <normal>.

Job <2702> is submitted to default queue <normal>.

Job <2703> is submitted to default queue <normal>.

Job <2704> is submitted to default queue <normal>.

Job <2705> is submitted to default queue <normal>.

Job <2706> is submitted to default queue <normal>.

Job <2707> is submitted to default queue <normal>.

Job <2708> is submitted to default queue <normal>.

Job <2709> is submitted to default queue <normal>.

Job <2710> is submitted to default queue <normal>.

Job <2711> is submitted to default queue <normal>.

userA@HostA-182: bjobs

JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME

2701 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2702 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2703 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2704 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2705 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2706 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2707 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2708 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2709 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2710 userA RUN normal HostA HostB sleep 1000 Aug 26 15:23

2711 userA PEND normal HostA sleep 1000 Aug 26 15:23
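The `repeat` command used above is a csh/tcsh built-in and is not available in sh or bash. A minimal sketch of an equivalent loop for those shells follows. The function name `submit_n` and the `BSUB` override variable are hypothetical helpers introduced here for illustration (the override lets you dry-run the loop with `BSUB=echo`); only `bsub -g` comes from the steps above.

```shell
#!/bin/sh
# Submit the same command N times to a job group.
# BSUB is an assumed override for dry runs; it defaults to the real bsub.
submit_n() {
    count=$1
    group=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        # Same form as in the transcript: bsub -g <group> <command>
        ${BSUB:-bsub} -g "$group" "$@"
        i=$((i + 1))
    done
}

# Example (requires an LSF cluster):
# submit_n 11 /user1 sleep 1000
```

With `BSUB=echo submit_n 3 /user1 sleep 1000`, the function prints the three argument lists it would pass to bsub instead of submitting anything.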

(2) The remaining job pends because the job group has reached its limit.

userA@HostA-183: bjobs -lp 2711

Job <2711>, User <userA>, Project <default>, Job Group </user1>, Status <PEND>,

Queue <normal>, Command <sleep 1000>

Fri Aug 26 15:23:09: Submitted from host <HostA>, CWD </tmp>;

PENDING REASONS:

The specified job group has reached its job limit;

userA@HostA-186: bjgroup -s /user1

GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER

/user1 11 1 10 0 0 0 () 10/10 userA
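If you want to check the limit from a script, the JLIMIT column can be split into its two parts. The sketch below assumes the column layout shown in the output above (a header row starting with GROUP_NAME, and JLIMIT formatted as used/limit); the function name `jlimit_of` is a hypothetical helper, and the layout may differ in other LSF versions.

```shell
#!/bin/sh
# Read `bjgroup -s` output on stdin and print "<used> <limit>" for one group.
# The JLIMIT column is located by name in the header row, not by position.
jlimit_of() {
    awk -v g="$1" '
        $1 == "GROUP_NAME" {
            for (i = 1; i <= NF; i++) if ($i == "JLIMIT") col = i
            next
        }
        col && $1 == g {
            split($col, a, "/")   # "10/10" becomes a[1]="10", a[2]="10"
            print a[1], a[2]
        }
    '
}

# Example (requires an LSF cluster):
# bjgroup -s /user1 | jlimit_of /user1
```

For the output shown above, this prints `10 10`. For a group with no limit set (JLIMIT `0/-`), it prints `0 -`.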

4. You can modify the limit at any time with the command

bgmod -L <n> <group name>
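When the limit is changed from a script, it can help to check the new value before calling bgmod. The following is a minimal sketch, not an official tool: `set_group_limit` and the `BGMOD` override variable are hypothetical names introduced for illustration, and the only LSF command used is the `bgmod -L` form shown above.

```shell
#!/bin/sh
# Change the job limit of a job group after validating the new value.
# BGMOD is an assumed override for dry runs; it defaults to the real bgmod.
set_group_limit() {
    limit=$1
    group=$2
    case $limit in
        ''|*[!0-9]*)
            # Reject empty or non-numeric values before calling bgmod
            echo "limit must be a non-negative integer: $limit" >&2
            return 1
            ;;
    esac
    ${BGMOD:-bgmod} -L "$limit" "$group"
}

# Example (requires an LSF cluster):
# set_group_limit 20 /user1
```

Running `BGMOD=echo set_group_limit 20 /user1` prints the arguments that would be passed to bgmod without changing the job group.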


Document Information

Modified date:
17 June 2018

UID

isg3T1024288