michael-t
28 Posts

Details of Fair-Share Algorithm on LL 3.4+

‏2008-07-09T21:44:12Z |
I have a few basic questions on how LL (v3.4+) is handling Fair-Share policy.

1) The only resource observed is the total CPU time consumed by all running processes of a job step, right?

2) Assume total_shares is FAIR_SHARE_TOTAL_SHARES = 1000 and a user has fair_shares = 10;

How is CPU time converted to "share units"?
  • michael-t
    28 Posts

    Details of Fair-Share Algorithm on LL 3.4+ (Cont'd)

    ‏2008-07-09T21:50:20Z  
    More Qs:

    When are the used shares of users computed? At the end of the job step? If NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL is 0, are shares still calculated at the end of the job step?

    If NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL > 0, are used shares recalculated every NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL seconds as well?

    thanks
    Michael
  • HPC_Central
    8 Posts

    Re: Details of Fair-Share Algorithm on LL 3.4+

    ‏2008-07-15T22:15:50Z  
    Hi Michael,

    I forwarded your questions to the LL team. Someone from the team will get back to you on these questions.

    HPC central
  • michael-t
    28 Posts

    Re: Details of Fair-Share Algorithm on LL 3.4+

    ‏2008-07-15T22:42:52Z  
    Thanks ... I couldn't find any info on this.

    Michael
  • ezhong
    11 Posts

    Re: Details of Fair-Share Algorithm on LL 3.4+ (Cont'd)

    ‏2008-07-16T16:24:44Z  
    Hi Michael,

    I have a few basic questions on how LL (v3.4+) is handling Fair-Share policy.

    1) The only resource observed is the total CPU time consumed by all running processes of a job step, right?

    That is right for non-Blue Gene jobs.

    2) Assume total_shares is FAIR_SHARE_TOTAL_SHARES = 1000 and a user has fair_shares = 10;

    How is CPU time converted to "share units"?

    In LoadLeveler Fair Share scheduling, both the available resources and the recorded resource usage decay over time. As a result, the total CPU resources of a given cluster, summed from the distant past to the present, is a finite amount. This total amount of resources divided by the total number of shares (1000 here) gives the per-share resource value.

    When a LoadLeveler job finishes, its resource usage is added to the accumulated resource usage of the same user, with decay taken into account. Fair Share resource usage is kept for each user and group that has run a job in LoadLeveler. When the used shares of a user or group are needed, the accumulated Fair Share resource usage divided by the per-share resource value gives the used share value. Note that the result is rounded down to a whole number: 0.9 share counts as 0 shares and 1.1 shares counts as 1 share.
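
As a rough illustration of the conversion described above, here is a minimal Python sketch. It is not LoadLeveler code. It assumes exponential decay with a hypothetical rate `decay_rate` (per second) and a cluster of `num_cpus` CPUs that has always been available; the function names and the example numbers are made up for illustration only.

```python
import math


def per_share_value(num_cpus, decay_rate, total_shares):
    """CPU-seconds represented by one share (assumed decay model).

    With exponential decay, the cluster's capacity summed from the
    distant past to now is the integral of num_cpus * exp(-decay_rate * age)
    over all ages, which is the finite value num_cpus / decay_rate.
    """
    total_resources = num_cpus / decay_rate
    return total_resources / total_shares


def decayed_usage(jobs, now, decay_rate):
    """Accumulated CPU-seconds of finished jobs, each decayed by its age.

    jobs is a list of (end_time, cpu_seconds) pairs for one user or group.
    """
    return sum(cpu * math.exp(-decay_rate * (now - end)) for end, cpu in jobs)


def used_shares(jobs, now, num_cpus, decay_rate, total_shares):
    """Used shares as a whole number; fractions are dropped (0.9 -> 0, 1.1 -> 1)."""
    value = per_share_value(num_cpus, decay_rate, total_shares)
    return math.floor(decayed_usage(jobs, now, decay_rate) / value)


# Hypothetical numbers: 100 CPUs, decay rate 0.001/s, 1000 total shares,
# so one share corresponds to roughly 100 decayed CPU-seconds.
print(used_shares([(0.0, 190.0)], 0.0, 100, 0.001, 1000))
```

Under these assumptions a user with fair_shares = 10 has used up the allocation once the decayed usage reaches about 10 times the per-share value. The actual decay behaviour in LoadLeveler is governed by its own configuration, so treat the decay formula here as a stand-in.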

    When are the used shares of users computed? At the end of the job step? If NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL is 0, are shares still calculated at the end of the job step?

    The used shares of users are computed when needed, for example when you issue the llfs command. They are not calculated automatically at the end of the job step.

    It is crucial in LoadLeveler Fair Share scheduling to set NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL to a value greater than 0. Otherwise, LoadLeveler has no chance to adjust job priorities based on resource usage, which defeats the purpose of Fair Share scheduling.
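
The "computed when needed" behaviour can be pictured as a per-user record that only stores decayed usage and turns it into shares on request. The sketch below is a hypothetical model, not a LoadLeveler data structure: the class `FairShareAccount`, its methods, and the exponential decay rate are all assumptions, and times are assumed to be non-decreasing.

```python
import math


class FairShareAccount:
    """Hypothetical usage record for one user or group."""

    def __init__(self, decay_rate):
        self.decay_rate = decay_rate  # assumed exponential decay rate, per second
        self.usage = 0.0              # decayed CPU-seconds as of last_update
        self.last_update = 0.0        # time of the last decay step

    def _decay_to(self, now):
        # Bring the stored usage forward to time `now` (now >= last_update).
        self.usage *= math.exp(-self.decay_rate * (now - self.last_update))
        self.last_update = now

    def job_finished(self, now, cpu_seconds):
        # Only the usage is accumulated here; no share value is computed.
        self._decay_to(now)
        self.usage += cpu_seconds

    def used_shares(self, now, per_share):
        # Shares are derived on demand, e.g. for a query or a priority update.
        self._decay_to(now)
        return math.floor(self.usage / per_share)


acct = FairShareAccount(decay_rate=0.001)
acct.job_finished(now=0.0, cpu_seconds=250.0)
print(acct.used_shares(now=0.0, per_share=100.0))     # 2.5 shares, reported as 2
print(acct.used_shares(now=1000.0, per_share=100.0))  # usage has decayed below 1 share
```

In this model nothing happens to the share count when a job ends; it changes only when something asks for it, which is why a periodic recalculation is needed for priorities to follow usage.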

    If NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL > 0, are used shares recalculated every NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL seconds as well?

    Yes.

    Please feel free to ask if you have more questions.

    Regards,
    Enci Zhong
    LoadLeveler Development