Topic
  • 7 replies
  • Latest Post - ‏2011-11-06T09:03:59Z by kostty
kostty
4 Posts

Pinned topic LL to allocate from homogeneous nodes

‏2011-10-20T07:42:12Z |
Hi all,

Please advise me on how to configure the scheduler so that a parallel job is placed only on nodes of the same type.

For historical reasons, our HPC cluster comprises several types of nodes. Nodes of one type are identical because they were bought in one batch, and the same holds for each of the other types.

An optimization that now looks attractive to us is to have each job run on nodes that are identical in CPU and memory, without tying it to any specific set of nodes. In other words, the scheduler may pick any node type, but all the nodes for a given job must be of that one type. I have seen exactly this implemented in Moab with its "node sets".

How do we achieve this with LL? I have read the guide sections on classes and multiclusters, but they do not seem to address this requirement.
Has anyone tackled this problem already?

Thanks for reading.
Updated on 2011-11-06T09:03:59Z by kostty
  • kmarthi
    8 Posts

    Re: LL to allocate from homogeneous nodes

    ‏2011-10-31T16:57:36Z  
    Do you mean that the scheduler should identify the nodes for a job only the first time and then assign the same nodes (not other nodes with the same characteristics) when the job is submitted a second time?

    What do you think should happen if the originally assigned nodes are busy? Should the job idle until these nodes become available?

    What is the value of this feature if the nodes selected each time by the scheduler have the characteristics that the job needs?
  • kmarthi
    8 Posts

    Re: LL to allocate from homogeneous nodes

    ‏2011-10-31T17:29:45Z  
    Perhaps I read too much into your requirement. Please take a look at the "Pool" keyword.

    When configuring the old machines in machine or machine_group stanzas, add this keyword to the stanza:
    pool_list = 1

    The new machines can similarly belong to
    pool_list = 2

    Any new machines whose hardware configuration is similar to that of the old machines can belong to both pools:
    pool_list = 1 2

    If you want a job to use only the machines in the old pool, specify
    # @ requirements = (Pool == 1)

    If you want a job to use only the machines in the new pool, specify
    # @ requirements = (Pool == 2)

    If you don't care where the job runs, don't specify a requirement or specify
    # @ requirements = ((Pool == 1) || (Pool == 2))
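
    As a rough illustration of how these requirements select machines, here is a minimal Python sketch. It is not LoadLeveler code: the machine names and the `matches_pool` and `eligible` helpers are hypothetical, and the only behaviour assumed is the one described above, namely that a machine satisfies `Pool == N` when N appears in its pool_list.

```python
# Hypothetical model of pool membership; machine names are invented.
# Each machine maps to the set of pool numbers in its pool_list.
POOL_LISTS = {
    "old01": {1},
    "old02": {1},
    "new01": {2},
    "new02": {1, 2},  # hybrid machine: pool_list = 1 2
}


def matches_pool(machine, pool):
    """Return True if `pool` appears in the machine's pool_list."""
    return pool in POOL_LISTS[machine]


def eligible(pools):
    """Machines satisfying (Pool == p1) || (Pool == p2) || ..."""
    return sorted(m for m in POOL_LISTS if any(matches_pool(m, p) for p in pools))


print(eligible([1]))     # machines usable with (Pool == 1)
print(eligible([2]))     # machines usable with (Pool == 2)
print(eligible([1, 2]))  # machines usable with the OR of both
```

    In this model the hybrid machine shows up for both single-pool requirements, which is the point of listing it in two pools.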
  • kostty
    4 Posts

    Re: LL to allocate from homogeneous nodes

    ‏2011-11-03T09:52:08Z  
    Hi kmarthi,

    and thanks for replying. Your second guess was correct.

    So pools solve the problem of dividing a cluster into groups by configuration, and one can easily keep jobs away from unwanted pools.

    If I don't care which pool the nodes come from, I just omit the pool requirement.

    But how do we get a job to run within a single pool without naming a specific pool? With pools 1, 2 and 3, the job should be able to run in any one of the three: only in the 1st, or only in the 2nd, or only in the 3rd.

    Simply setting requirements to ((Pool==1)||(Pool==2)||(Pool==3)) allows the job to be executed on any mix of nodes, e.g. 2 nodes from the 1st pool, 1 node from the 2nd and so on. That is not what we want LL to do.
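
    To make the difference concrete, here is a small Python sketch (not LoadLeveler code). The free-node table and both helper functions are hypothetical; the sketch only contrasts accepting nodes through an OR over pools with requiring that all nodes come from one pool.

```python
# Hypothetical free nodes per pool; node names are invented for illustration.
FREE_NODES = {
    1: ["a1", "a2"],
    2: ["b1"],
    3: ["c1", "c2", "c3"],
}


def or_allocation(pools, count):
    """Model of the OR requirement: any free node in any listed pool is
    acceptable, so the result may mix pools."""
    candidates = [(p, n) for p in pools for n in FREE_NODES[p]]
    return candidates[:count] if len(candidates) >= count else None


def single_pool_allocation(pools, count):
    """Desired behaviour: all nodes come from one pool, whichever of the
    listed pools has enough free nodes."""
    for p in pools:
        if len(FREE_NODES[p]) >= count:
            return [(p, n) for n in FREE_NODES[p][:count]]
    return None


print(or_allocation([1, 2, 3], 3))           # spans pools 1 and 2
print(single_pool_allocation([1, 2, 3], 3))  # stays inside pool 3
```

    The second function is the behaviour being asked for: the job does not name a pool, yet never straddles two of them.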
  • kmarthi
    8 Posts

    Re: LL to allocate from homogeneous nodes

    ‏2011-11-03T11:51:57Z  
    In the current releases of LoadLeveler, there is no way to pick machines from a certain group without specifying a Pool or Feature keyword.
    Scheduling by machine groups was introduced recently (LL 4.1 and LL 5.1).
    Options that allow a user to pick machines from pre-configured groups are likely to be available in LL 5.2 in 1H 2012.
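
    Until then, one possible stopgap outside the scheduler is a submit-time wrapper that chooses one pool and writes it into the job command file. The Python sketch below is only an illustration under stated assumptions: the free-node counts are supplied by hand (how to obtain them from LL is not shown), and `choose_pool` and `requirements_line` are hypothetical helpers. Only the `# @ requirements = (Pool == N)` form comes from this thread.

```python
def choose_pool(free_counts, nodes_needed):
    """Pick the smallest pool that can still hold the whole job.

    free_counts: dict of pool number -> free node count (hand-supplied).
    Returns a pool number, or None if no single pool is large enough.
    """
    fitting = [(free, pool) for pool, free in free_counts.items()
               if free >= nodes_needed]
    return min(fitting)[1] if fitting else None


def requirements_line(pool):
    """Format the job command file directive for one specific pool."""
    return f"# @ requirements = (Pool == {pool})"


pool = choose_pool({1: 2, 2: 8, 3: 4}, nodes_needed=3)
if pool is not None:
    print(requirements_line(pool))
```

    Because the choice is made at submit time, the job stays bound to that pool even if another pool frees up first.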
  • kostty
    4 Posts

    Re: LL to allocate from homogeneous nodes

    ‏2011-11-05T11:34:46Z  
    Maybe I should clarify once more before closing the topic.

    Is it possible to pick machines not from a certain group, but from a single group out of many? Will that be released soon?

    Pools and Features do their job well, but the requirements expression (when "OR" is used) does not let one restrict a job to just one of several available pools (or machine groups, or machines with a specific feature, whatever). It seems something else is needed to specify that we want the job to run on like machines.

    If that was clear enough, could you please confirm whether LL4 or LL5 is capable of this, or whether we have to switch to another scheduler?
  • kmarthi
    8 Posts

    Re: LL to allocate from homogeneous nodes

    ‏2011-11-05T13:44:17Z  
    Is there a possibility to pick machines not from a certain group, but from a single group out of many? Will it be released soon?
    In the current releases of LL (4.1 and 5.1) the scheduler cannot pick machines from a single group out of many. It will become available in LL 5.2 in the first half of 2012.
  • kostty
    4 Posts

    Re: LL to allocate from homogeneous nodes

    ‏2011-11-06T09:03:59Z  
    Ok, kmarthi, thank you!

    Let's wait and hope it is released.