Topic
  • 3 replies
  • Latest Post - ‏2008-07-16T18:29:23Z by ezhong
SystemAdmin
46 Posts

Pinned topic LoadLeveler 3.4 and Fair Share

‏2007-07-10T18:14:19Z |
Hi!

We started testing the new fair share feature that was recently added to LL. The configuration offers a lot of flexibility, and it looks really promising!

A curiosity: why is it based on user time rather than on wall clock time? All the scheduling in LoadL is wall clock based, except fair share. There are pros and cons to doing it this way, and I'd like to hear from the developers what the intention behind it was.

Or could we hope for a tunable in the future that would allow fair share to use a different metric (wall clock, or even a user-defined metric)? Or is this a first step toward jobs being able to request user time rather than wall clock time in the near future?

Thanks,

Luc.
  • ezhong
    11 Posts

    Re: LoadLeveler 3.4 and Fair Share

    ‏2007-07-13T15:21:20Z  
    Hi Luc,

    > Hi!
    >
    > We started testing the new fair share feature that
    > was recently added to LL. The configuration offers a
    > lot of flexibility, and it looks really promising!
    >

    Thank you. :)

    > A curiosity: why is it based on user time rather
    > than on wall clock time? All the scheduling in LoadL
    > is wall clock based, except fair share. There are
    > pros and cons to doing it this way, and I'd like to
    > hear from the developers what the intention behind
    > it was.

    Wall clock limit is one of the many factors considered in LoadLeveler scheduling. It's certainly a very important and basic one.

    The elapsed wall clock time a job uses can fluctuate a lot depending on whether other jobs or daemons share the resources, which is outside the job's control. CPU usage should, in general, be more consistent from run to run.
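To illustrate the difference between the two metrics, here is a minimal Python sketch (not LoadLeveler code) that measures both the wall clock time and the CPU time of the same process. The helper name `measure` and the two toy workloads are assumptions made up for this example; the sleep merely stands in for time spent waiting on a shared resource.

```python
import time


def measure(work):
    """Return (wall_seconds, cpu_seconds) consumed while running work()."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time of this process only
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def cpu_bound():
    # Hypothetical workload that keeps the CPU busy.
    sum(i * i for i in range(200_000))


def mostly_waiting():
    # Hypothetical workload that mostly waits; waiting adds wall clock
    # time but almost no CPU time.
    time.sleep(0.2)


wall_busy, cpu_busy = measure(cpu_bound)
wall_idle, cpu_idle = measure(mostly_waiting)
print(f"cpu_bound:      wall={wall_busy:.3f}s cpu={cpu_busy:.3f}s")
print(f"mostly_waiting: wall={wall_idle:.3f}s cpu={cpu_idle:.3f}s")
```

For the waiting workload the wall clock figure is dominated by the delay while the CPU figure stays small, which is the kind of gap that contention from other jobs or daemons can introduce.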

    >
    > Or could we hope for a tunable in the future that
    > would allow fair share to use a different metric
    > (wall clock, or even a user-defined metric)? Or is
    > this a first step toward jobs being able to request
    > user time rather than wall clock time in the near
    > future?
    >

    We understand that fair share scheduling based on CPU usage may not work well if a customer site always uses dedicated machines to run jobs that may not be CPU intensive. On an IBM Blue Gene machine, each compute node can run only one job at a time. Fair share scheduling for Blue Gene is therefore based on the number of compute nodes and the elapsed wall clock time a job uses.

    We'd like to work with our customers to understand their needs and enhance the fair share scheduling function to meet them.
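The two accounting styles mentioned above can be compared with a small Python sketch. This is only an illustration under stated assumptions, not LoadLeveler's actual accounting: the `JobRecord` type, both functions, and the sample numbers are hypothetical, and "nodes times wall clock seconds" is one plausible reading of "based on the number of compute nodes and the elapsed wall clock time".

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical accounting record for one finished job."""
    nodes: int           # compute nodes held by the job
    wall_seconds: float  # elapsed wall clock time
    cpu_seconds: float   # total CPU time consumed across all nodes


def cpu_usage(job: JobRecord) -> float:
    # CPU-based accounting: charge only the CPU time actually consumed.
    return job.cpu_seconds


def node_time_usage(job: JobRecord) -> float:
    # Assumed node-time accounting: charge every node held, for as long
    # as it was held, whether or not the CPUs were busy.
    return job.nodes * job.wall_seconds


# A made-up job that holds 4 dedicated nodes for an hour but is not
# CPU intensive.
io_heavy = JobRecord(nodes=4, wall_seconds=3600.0, cpu_seconds=1200.0)

print(cpu_usage(io_heavy))        # 1200.0
print(node_time_usage(io_heavy))  # 14400.0
```

On dedicated nodes the CPU-based figure understates how much of the machine the job kept away from others, which is the concern raised for sites that run jobs that are not CPU intensive.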

    > Thanks,
    >
    > Luc.

    Thanks,
    Enci
  • michael-t
    28 Posts

    Re: LoadLeveler 3.4 and Fair Share

    ‏2008-07-09T23:00:29Z  
    Hello,

    We are also considering the best way to use Fair Share at our installation.

    One problem is that we do not know how UserUsedShares and GroupUsedShares are updated. I know they are based on total CPU consumption (T_cpu), but how is T_cpu converted to share units?

    Another is that we are trying to use FS while at the same time NOT letting old jobs stay postponed indefinitely, so we will have to include QDate in SYSPRIO, right?
    SYSPRIO: W_1 * $(UserRemainingShares) + W_2 * QDate + W_3 * Other_factors

    But QDate increases monotonically (+1 per second), and if we are not careful it can quickly throw any setting out of scale.
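The scaling concern can be shown with a short Python sketch. It is not LoadLeveler configuration syntax and does not reproduce how LoadLeveler defines QDate or its sign; `age_seconds` is a hypothetical stand-in for a time term that grows by one per second, and `MAX_SHARES`, `HORIZON`, and `balanced_time_weight` are assumptions introduced only for this example.

```python
def priority(remaining_shares, age_seconds, w_shares, w_time):
    """Weighted sum of a share term and a time term (illustrative only)."""
    return w_shares * remaining_shares + w_time * age_seconds


MAX_SHARES = 100          # assumed upper bound of the share term
HORIZON = 7 * 24 * 3600   # assumed: after one week, age should match full shares


def balanced_time_weight(w_shares, max_shares, horizon_seconds):
    # Choose w_time so that the time term reaches the largest possible
    # share term only after horizon_seconds, instead of within minutes.
    return w_shares * max_shares / horizon_seconds


# Naive weights: one hour of waiting already outweighs a full share balance.
print(priority(0, 3600, 1, 1) > priority(MAX_SHARES, 0, 1, 1))   # True

# Scaled weight: shares dominate at first, age catches up at the horizon.
w_time = balanced_time_weight(1, MAX_SHARES, HORIZON)
print(priority(0, 3600, 1, w_time) < priority(MAX_SHARES, 0, 1, w_time))  # True
```

The point of the sketch is only that the weight on a per-second term has to be chosen relative to the range of the share term and the time scale over which old jobs should catch up.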

    Michael
  • ezhong
    11 Posts

    Re: LoadLeveler 3.4 and Fair Share

    ‏2008-07-16T18:29:23Z  
    I assume the questions have been answered by replies posted earlier today.