I'm sure you are all aware of the WLM_KILL feature introduced in AIX 5.3, the long-awaited real memory limit enforcement feature. I think we've found cases where it is useless.
Say a job reserves all the available memory on a node. AIX uses about 6-7 GB of memory (too much, but that's another topic), so we tell LL the maximum is 57 GB out of 64 GB. The job has problems and tries to exceed this 57 GB limit. The AIX page stealer kicks in before the real memory limit is reached, which makes the limit unattainable because real memory is being paged out. Virtual memory then keeps growing until the node dies. wlmstat -M shows this very clearly. We think setting the WLM virtual memory limit equal to the WLM real memory limit would prevent this scenario, but we can't find an obvious way to set that limit because LL doesn't enforce ConsumableVirtualMemory (i.e. ENFORCE_RESOURCE_USAGE won't accept the ConsumableVirtualMemory keyword).
Does this make sense? Is there another way to properly limit physical memory usage?
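To make the scenario concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not of how AIX or WLM actually work internally: the `simulate` function, the single `steal_threshold_gb` value standing in for the page stealer, and the 55 GB and 32 GB figures are all assumptions made up for illustration.

```python
def simulate(real_limit_gb, virtual_limit_gb, steal_threshold_gb,
             paging_space_gb, step_gb=1):
    """Toy model of a job that keeps allocating memory.

    Hypothetical simplification: the page stealer is modelled as a single
    threshold above which resident pages are moved to paging space.
    Returns (outcome, virtual_gb_at_end).
    """
    resident = 0
    paged_out = 0
    while True:
        resident += step_gb
        # Anything above the stealer threshold gets paged out.
        if resident > steal_threshold_gb:
            paged_out += resident - steal_threshold_gb
            resident = steal_threshold_gb
        virtual = resident + paged_out
        if resident >= real_limit_gb:
            return ("killed: real memory limit", virtual)
        if virtual_limit_gb is not None and virtual >= virtual_limit_gb:
            return ("killed: virtual memory limit", virtual)
        if paged_out >= paging_space_gb:
            return ("node out of paging space", virtual)


# Real limit only (57 GB), stealer assumed to start at 55 GB, 32 GB paging space.
print(simulate(57, None, 55, 32))
# Same node, with a virtual limit equal to the real limit.
print(simulate(57, 57, 55, 32))
```

In the first call the resident size never reaches 57 GB, so the real memory limit never fires and the job runs the node out of paging space. In the second call the virtual limit stops the job as soon as resident plus paged-out memory reaches 57 GB.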
Pinned topic LoadLeveler virtual memory limit (or lack thereof).
Updated on 2008-04-11T19:22:39Z by lcham
Re: LoadLeveler virtual memory limit (or lack thereof).
2008-04-11T19:22:39Z
This is the accepted answer.
Hi Alain,
Sorry for the late response. This is really a question of how to configure WLM enforcement for your environment, so I wanted to verify with the AIX WLM team before responding.
First, it seems that to hit the absolute (real) memory limit, the job must be able to consume that much real memory without paging, so that WLM sees the limit being hit. In your case paging had already started, so the real memory limit was never hit. Virtual memory enforcement counts real memory, paging space, and large pages together. So if you set a virtual memory limit of 57 GB, the process will be killed when the total that WLM calculates for real memory, paging space, and large pages reaches that value. You might therefore want the virtual memory limit to be higher than the absolute memory limit, to leave room for paging.
Therefore, use the real memory limit to prevent paging but make sure it is an attainable limit. If some paging is allowed, then use the absolute virtual memory limit.
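As a rough illustration of the accounting described above, the Python sketch below adds up real memory, paging space, and large pages and checks the total against the two limits. The `Usage` class, the function names, and the 4 GB paging allowance are hypothetical; this is not a WLM or LoadLeveler interface, just the arithmetic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Usage:
    """Hypothetical snapshot of one job's memory use, in GB."""
    real_gb: float
    paging_gb: float
    large_page_gb: float = 0.0

    @property
    def virtual_gb(self) -> float:
        # Virtual usage as described in the answer: real + paging + large pages.
        return self.real_gb + self.paging_gb + self.large_page_gb


def virtual_limit(real_limit_gb: float, paging_allowance_gb: float) -> float:
    """Virtual limit = real limit plus however much paging you will tolerate."""
    return real_limit_gb + paging_allowance_gb


def limit_hit(usage: Usage, real_limit_gb: float,
              virtual_limit_gb: float) -> Optional[str]:
    """Return which limit the snapshot reaches first, or None if neither."""
    if usage.real_gb >= real_limit_gb:
        return "real"
    if usage.virtual_gb >= virtual_limit_gb:
        return "virtual"
    return None


vlimit = virtual_limit(57, 4)  # 4 GB of tolerated paging is an assumed value
print(limit_hit(Usage(real_gb=55, paging_gb=5), 57, vlimit))
print(limit_hit(Usage(real_gb=55, paging_gb=6), 57, vlimit))
```

With an allowance of 0 GB the virtual limit equals the real limit, which matches the setting proposed in the original question. A larger allowance lets the job page a little before it is killed.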