Configuring host-level memory usage limits

Configure a memory usage limit for all processes created by PEM (on Windows) or the sub-PEM (on Linux®), together with their descendant processes, on each compute host. The limit ensures that the total memory consumed by these processes (virtual memory on Windows, physical memory on Linux) does not exceed the total memory of the host.

Before you begin

There are no prerequisites for configuring host-level memory usage limits on Windows hosts. However, if you are using Linux, note these requirements:
  • When IBM® Spectrum Symphony is installed in simplified Workload Execution Mode (WEM), do not run the egosetsudoers.sh script. If you do, when the total memory usage of the processes in the host-level cgroup reaches the memory limit, the cgroup only prevents further memory allocation and does not kill those processes.
  • Memory and swap accounting must be enabled in the Linux kernel.
  • Linux control groups (cgroups) must be installed on the compute host (see Control groups (cgroups) for host-level memory usage limits on Linux).
  • Linux swap partitions must not be enabled on the host.
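
As a rough aid for the Linux requirements above, the following sketch checks two of them from the standard kernel files /proc/cgroups and /proc/swaps. The function names are hypothetical and are not part of IBM Spectrum Symphony, and the parsing assumes the cgroup v1 column layout of /proc/cgroups (name, hierarchy, number of cgroups, enabled). It does not check whether swap accounting is enabled.

```shell
#!/bin/sh
# Hypothetical helper functions; not shipped with IBM Spectrum Symphony.

# Succeeds when the "memory" controller is marked as enabled in a
# /proc/cgroups-style file (assumed columns: name hierarchy num_cgroups enabled).
memory_cgroup_enabled() {
    awk '$1 == "memory" && $4 == 1 { found = 1 } END { exit !found }' "$1"
}

# Succeeds when a /proc/swaps-style file lists no active swap area.
# The first line of the file is a header and is skipped.
swap_disabled() {
    awk 'NR > 1 && NF > 0 { n++ } END { exit n > 0 }' "$1"
}

# Example usage on a Linux compute host:
#   memory_cgroup_enabled /proc/cgroups && echo "memory cgroup controller enabled"
#   swap_disabled /proc/swaps && echo "no active swap"
```

Both functions take the file path as an argument so that they can be tried against a copy of the file before running them on the host.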

Procedure

The memory usage limit is set at the host level; therefore, you can configure different usage limits on each compute host.

  1. Log on to the compute host on which you want to enforce a memory usage limit.
  2. If IBM Spectrum Symphony for Linux is installed in simplified Workload Execution Mode (WEM):
    1. Ensure that the compute host is started, and log on as a user with root privileges (such as root).
    2. Clean up and create the host cgroup:
      egosh user logon -u Admin -x Admin
      symcgroup.sh -d
      symcgroup.sh
  3. Edit the local ego.conf file (Installation_top\kernel\conf\ego.conf on Windows, and $EGO_TOP/kernel/conf/ego.conf on Linux) to add the EGO_HOST_RESOURCE_USAGE_LIMIT parameter (see the ego.conf reference). The valid value is greater than 0 and less than 1, and specifies the memory limit as a fraction of the host's total memory (for example, 0.8 for 80%). On Windows, the limit applies to virtual memory; on Linux, it applies to physical memory. For example:
    • Example for Windows:
      EGO_HOST_RESOURCE_USAGE_LIMIT=MEM[VIRTUAL_PERCENTAGE=0.8]

      When the total virtual memory on this host is 8 GB, this configuration ensures that the virtual memory usage of all processes created by PEM and its descendant processes does not exceed 80% of the host’s total virtual memory (6.4 GB).

    • Example for Linux:
      EGO_HOST_RESOURCE_USAGE_LIMIT=MEM[PHYSICAL_PERCENTAGE=0.8]
      When the total physical memory on this host is 8 GB, this configuration ensures that the physical memory usage of all processes created by the sub-PEM and its descendant processes does not exceed 80% of the host’s total physical memory (6.4 GB).
      Note: Linux cgroups require the memory limit value to be a multiple of the memory page size (4 KB for Linux x86_64 by default). Therefore, the memory limit value set by IBM Spectrum Symphony might differ from the value that takes effect in the cgroup. For example, when IBM Spectrum Symphony sets the host memory limit to 8193 (2*4K+1) bytes, the cgroup automatically changes the value to 8192 (2*4K) bytes.
  4. Make the memory usage limit take effect on the compute host:
    1. Kill all processes.
    2. Start the host:
      egosh ego start
  5. Check the pem.log.${hostname} log (within the Installation_top\kernel\log directory on Windows, and the $EGO_TOP/kernel/log directory on Linux). Look for the following message to confirm that the memory usage limit was set up successfully:
    Memory usage limit (MEM) is configured for the EGO_HOST_RESOURCE_USAGE_LIMIT parameter.

    The log level on compute hosts, by default, is LOG_NOTICE. At this level, only failed operations are logged to the PEM log. For detailed information, set the log level to LOG_INFO or LOG_DEBUG (see the egosh debug pemon command in debug).
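
The arithmetic behind the percentage limit and the page-size note in step 3 can be sketched as follows. This is only an illustration: the function names are hypothetical, the percentage is passed as a whole number (80 rather than 0.8) to stay within integer shell arithmetic, 8 GB is assumed to mean 8 GiB, and the exact way IBM Spectrum Symphony computes the value is not described here. The rounding direction follows the 8193 to 8192 example in the note.

```shell
#!/bin/sh
# Hypothetical helper functions; not shipped with IBM Spectrum Symphony.

# percent_of TOTAL_BYTES PERCENT
# Prints PERCENT percent of TOTAL_BYTES, truncated to a whole number of bytes.
percent_of() {
    echo $(( $1 * $2 / 100 ))
}

# page_align BYTES PAGE_SIZE
# Prints BYTES rounded down to a multiple of PAGE_SIZE.
page_align() {
    echo $(( $1 / $2 * $2 ))
}

# Example usage (on a real host, the page size is reported by: getconf PAGESIZE):
#   page_align 8193 4096                              # prints 8192
#   page_align "$(percent_of 8589934592 80)" 4096     # prints 6871945216
```

With a 4 KB page size, the limit that takes effect in the cgroup can therefore be up to 4095 bytes lower than the value that was requested.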

What to do next

  1. Submit application workload.
  2. Verify that the host-level memory usage limit you set is working, and note:
    • For Windows, if the sum of the memory usage of PEM's child processes reaches the memory usage limit, the system does not allocate additional memory. Monitored processes are not terminated.
    • For Linux, if the sum of the memory usage of the sub-PEM and its child processes reaches the memory usage limit, some monitored processes are terminated by the out-of-memory killer of the host-level cgroup. Check the /var/log/messages log to see which processes were terminated:
      grep -i 'killed process' /var/log/messages

      If a process is terminated, application error handling might place the compute host in the blocked host list (see Host blocking).
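
To make the grep output above easier to scan on Linux, the following sketch reduces each matching line to the process ID and process name. The function name is hypothetical, and the pattern assumes the usual kernel message form "Killed process <pid> (<name>)"; lines in other forms are ignored.

```shell
#!/bin/sh
# Hypothetical helper function; not shipped with IBM Spectrum Symphony.

# list_killed LOGFILE
# Prints "<pid> <name>" for each "Killed process <pid> (<name>)" line in LOGFILE.
list_killed() {
    grep -i 'killed process' "$1" |
        sed -n 's/.*[Kk]illed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Example usage on a Linux compute host (reading the log usually needs root):
#   list_killed /var/log/messages
```

The process names can then be compared with the application's service processes to see which workload was affected.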