Question & Answer
Question
LSF can limit the number of jobs in lsb.queues or lsb.resources, but these limits are static: you must reconfigure the cluster for a change to take effect. You need a way to limit the number of started jobs for a specific user without reconfiguring the cluster.
Answer
Started jobs are jobs in the "RUN", "SSUSP", or "USUSP" state. You can limit them with the job group feature. The steps are as follows:
1. Create a job group
userA@HostA-170: bgadd /user1
Job group </user1> was added.
userA@HostA-172: bjgroup
GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER
/user1 0 0 0 0 0 0 () 0/- userA
2. Set a limit for the job group
userA@HostA-173: bgmod -L 10 /user1
Job group /user1 is modified.
userA@HostA-176: bjgroup -s /user1
GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER
/user1 0 0 0 0 0 0 () 0/10 userA
3. The number of started jobs is limited when you submit the jobs to the job group
(1) Submit 11 jobs; only 10 jobs can run at a time
userA@HostA-181: repeat 11 bsub -g /user1 sleep 1000
Job <2701> is submitted to default queue <normal>.
Job <2702> is submitted to default queue <normal>.
Job <2703> is submitted to default queue <normal>.
Job <2704> is submitted to default queue <normal>.
Job <2705> is submitted to default queue <normal>.
Job <2706> is submitted to default queue <normal>.
Job <2707> is submitted to default queue <normal>.
Job <2708> is submitted to default queue <normal>.
Job <2709> is submitted to default queue <normal>.
Job <2710> is submitted to default queue <normal>.
Job <2711> is submitted to default queue <normal>.
userA@HostA-182: bjobs
JOBID   USER    STAT  QUEUE   FROM_HOST   EXEC_HOST   JOB_NAME     SUBMIT_TIME
2701    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2702    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2703    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2704    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2705    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2706    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2707    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2708    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2709    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2710    userA   RUN   normal  HostA       HostB       sleep 1000   Aug 26 15:23
2711    userA   PEND  normal  HostA                   sleep 1000   Aug 26 15:23
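The repeat command used in the submission above is a csh/tcsh builtin and is not available in sh or bash. A minimal sketch of a POSIX sh equivalent follows; the function name repeat_cmd is hypothetical and is not part of LSF.

```shell
#!/bin/sh
# repeat_cmd COUNT COMMAND [ARGS...]
# Hypothetical helper (not an LSF command): run COMMAND COUNT times,
# stopping at the first failure and returning that command's exit status.
# Note: plain sh has no local variables, so count and i are global.
repeat_cmd() {
    count="$1"
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" || return
        i=$((i + 1))
    done
}
```

With this helper, the submission would be written as: repeat_cmd 11 bsub -g /user1 sleep 1000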
(2) The remaining jobs pend because the job group has reached its limit.
userA@HostA-183: bjobs -lp 2711
Job <2711>, User <userA>, Project <default>, Job Group </user1>, Status <PEND>,
Queue <normal>, Command <sleep 1000>
Fri Aug 26 15:23:09: Submitted from host <HostA>, CWD </tmp>;
PENDING REASONS:
The specified job group has reached its job limit;
userA@HostA-186: bjgroup -s /user1
GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER
/user1 11 1 10 0 0 0 () 10/10 userA
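To check the limit from a script, the JLIMIT column can be extracted from the bjgroup -s output. The sketch below assumes the layout shown in the sample output above, where JLIMIT is the second-to-last field and has the form used/limit; the function name jlimit_usage is hypothetical.

```shell
#!/bin/sh
# jlimit_usage GROUP
# Hypothetical helper: read `bjgroup -s` output on stdin and print
# "<used> <limit>" for the row whose GROUP_NAME equals GROUP.
# Assumes JLIMIT is the second-to-last column, formatted as used/limit.
# A limit of "-" means no limit is set on the group.
jlimit_usage() {
    group="$1"
    awk -v g="$group" '$1 == g { split($(NF - 1), a, "/"); print a[1], a[2] }'
}
```

Example: bjgroup -s /user1 | jlimit_usage /user1 prints the used and limit values for the group on one line. If the group is not listed, nothing is printed.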
4. You can modify the limit at any time with the following command:
bgmod -L <n> <group name>
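If the limit is changed from a script, it can help to validate the value before calling bgmod. The following is a sketch only: the function names is_valid_limit and set_group_limit are hypothetical, and the check assumes that a limit is a non-negative integer.

```shell
#!/bin/sh
# is_valid_limit VALUE
# Hypothetical helper: succeed only if VALUE is a non-empty string of digits.
is_valid_limit() {
    case "$1" in
        '' | *[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# set_group_limit LIMIT GROUP
# Hypothetical wrapper: validate LIMIT, then run `bgmod -L LIMIT GROUP`.
# Returns 2 without calling bgmod if LIMIT is not a non-negative integer.
set_group_limit() {
    limit="$1"
    group="$2"
    if ! is_valid_limit "$limit"; then
        echo "limit must be a non-negative integer: $limit" >&2
        return 2
    fi
    bgmod -L "$limit" "$group"
}
```

Example: set_group_limit 20 /user1 raises the limit of the job group /user1 to 20 started jobs.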
Document Information
Modified date:
17 June 2018
UID
isg3T1024288