Please advise me on how to configure the scheduler so that a parallel job is placed only on nodes of the "same" type.
Historically, our HPC cluster comprises several types of nodes: the nodes of one type are identical because they were bought at the same time, and the same goes for the other types.
What now seems to us a nice optimization is to have each job run on identical (CPU, MEM) nodes, though not on some specific set of nodes. In other words, the scheduler can pick any type of node, but all the nodes for a given job must be of that same type. I've seen exactly this implemented in Moab with its "node sets".
How do we achieve this with LL? I've read the guide on classes and multiclusters, but they don't seem to address the requirement.
Has anyone tackled this problem already?
Thanks for reading)
7 replies Latest Post - 2011-11-06T09:03:59Z by kostty
Pinned topic LL to allocate from homogeneous nodes
Updated on 2011-11-06T09:03:59Z by kostty
Re: LL to allocate from homogeneous nodes
2011-10-31T16:57:36Z, in response to kostty
Do you mean that the scheduler should identify the nodes for a job only the first time and then assign the same nodes (not other nodes with the same characteristics) when the job is submitted a second time?
What do you think should happen if the originally assigned nodes are busy? Should the job idle until these nodes become available?
What is the value of this feature if the nodes selected each time by the scheduler have the characteristics that the job needs?
Re: LL to allocate from homogeneous nodes
2011-10-31T17:29:45Z, in response to kostty
Perhaps I read too much into your requirement. Please take a look at the "Pool" keyword.
When configuring the old machines in machine or machine_group stanzas, add this keyword to the stanza:
pool_list = 1
The new machines can similarly belong to
pool_list = 2
Any new machines that have similar HW configuration to the old machines can be in a hybrid pool like
pool_list = 1 2
If you want a job to use only the machines in the old pool, specify
# @ requirements = (Pool == 1)
If you want a job to use only the machines in the new pool, specify
# @ requirements = (Pool == 2)
If you don't care where the job runs, don't specify a requirement or specify
# @ requirements = ((Pool == 1) || (Pool == 2))
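When job command files are generated by a script, the requirement lines above can be built from a list of pool numbers. The following is only a minimal Python sketch; the helper name `pool_requirements` is made up for illustration and is not part of LoadLeveler.

```python
def pool_requirements(pools):
    """Build a '# @ requirements' line that matches any of the given pools.

    Hypothetical helper for generating job command files; it only formats
    text in the same shape as the examples in this reply.
    """
    if not pools:
        # No pools given: emit nothing, i.e. no restriction on where the job runs.
        return ""
    terms = [f"(Pool == {p})" for p in pools]
    if len(terms) == 1:
        expr = terms[0]
    else:
        expr = "(" + " || ".join(terms) + ")"
    return f"# @ requirements = {expr}"


print(pool_requirements([1]))
print(pool_requirements([1, 2]))
```

Called with `[1]` it yields the single-pool line, and with `[1, 2]` the OR-ed form shown above.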
Re: LL to allocate from homogeneous nodes
2011-11-03T09:52:08Z, in response to kmarthi
Hi kmarthi, and thanks for replying. Your second guess was correct.
So pools really do solve the problem of dividing a cluster into groups by configuration, and one can easily keep jobs away from unwanted pools.
If I don't care which pool the nodes come from, I just omit the pool requirement.
But what do we do to get a job to run in a single pool, but not a specific pool? That is, with pools 1, 2 and 3, the job has to be able to run in any single one of those three: only in the 1st, or only in the 2nd, or only in the 3rd.
Just setting requirements to ((Pool==1)||(Pool==2)||(Pool==3)) allows the job to be executed on any of the nodes, e.g. 2 nodes from the 1st pool, 1 node from the 2nd and so on. And that is not what we expect LL to do.
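To make the difference concrete, here is a small Python sketch of the two placement behaviours. It is an illustration only, not LoadLeveler code: the free-node table, the node names and both function names are made up, and a real scheduler considers far more than free-node counts.

```python
def mixed_allocation(free, needed):
    """What an OR-ed Pool requirement permits: nodes drawn from several pools."""
    chosen = []
    for pool in sorted(free):
        for node in free[pool]:
            if len(chosen) == needed:
                return chosen
            chosen.append((pool, node))
    return chosen if len(chosen) == needed else None


def single_pool_allocation(free, needed):
    """The behaviour asked for: any pool will do, but all nodes come from one pool."""
    for pool in sorted(free):
        if len(free[pool]) >= needed:
            return [(pool, node) for node in free[pool][:needed]]
    return None  # no single pool can hold the whole job right now


# Hypothetical snapshot of free nodes per pool.
free = {1: ["a1", "a2"], 2: ["b1"], 3: ["c1", "c2", "c3"]}

print(mixed_allocation(free, 3))        # spans pools 1 and 2
print(single_pool_allocation(free, 3))  # all three nodes from pool 3
```

With this snapshot a three-node job lands on two nodes from pool 1 plus one from pool 2 in the first case, and entirely in pool 3 in the second.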
Re: LL to allocate from homogeneous nodes
2011-11-03T11:51:57Z, in response to kostty
In the current releases of LoadLeveler, there is no way to pick machines from a certain group without specifying a Pool or Feature keyword.
Scheduling by machine groups was recently introduced (LL 4.1 and LL 5.1).
Options that allow the user to pick machines from pre-configured groups are likely to be available in LL 5.2 in 1H 2012.
Re: LL to allocate from homogeneous nodes
2011-11-05T11:34:46Z, in response to kmarthi
Maybe I should clarify once again before closing the topic.
Is there a possibility to pick machines not from a certain group, but from a single group out of many? Will it be released soon?
Pools and Features do their work well, but the Requirements mechanism (when "OR" is used) won't let anyone choose just one of the multiple available pools (or machine groups, or machines with a specific feature, whatever). It seems something else is needed to specify that we would like the job to run on like machines.
If that was clear enough, could you now please confirm whether LL4 or LL5 is capable of that, or whether we have to switch to another scheduler?
Re: LL to allocate from homogeneous nodes
2011-11-05T13:44:17Z, in response to kostty
> Is there a possibility to pick machines not from a certain group, but from a single group out of many? Will it be released soon?
In the current releases of LL (4.1 and 5.1) the scheduler cannot pick machines from a single group out of many. It will become available in LL 5.2 in the first half of 2012.