Out-of-memory errors on Power systems with many vCPUs

Your installation fails, and Memory cgroup out of memory errors appear in the kernel log (dmesg), when you install IBM® Cloud Private with management and Vulnerability Advisor services on a Power® system that has a large number of virtual CPUs. One example of such a system is a POWER9™ AC922 with more than 100 virtual CPUs.

Symptoms

The installation fails when you install IBM Cloud Private on a Power system that has a large number of virtual CPUs. You might also see Memory cgroup out of memory errors in the kernel log, either during or after the installation.
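To confirm the symptom, you can search the kernel log for the message text. The following Python sketch is one way to do that. It is an illustration only and not part of the product. The function names are hypothetical, and it assumes that the dmesg command is available and that the current user is allowed to read the kernel log (on some systems this requires root).

```python
import os
import subprocess

# Message text taken from the symptom description above.
OOM_MARKER = "Memory cgroup out of memory"


def read_kernel_log():
    """Return the kernel log as text by running dmesg (might require root)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


def find_cgroup_oom_lines(log_text):
    """Return the log lines that contain the cgroup out-of-memory message."""
    return [line for line in log_text.splitlines() if OOM_MARKER in line]


def summarize(log_text):
    """Return (logical CPU count seen by the OS, number of matching lines)."""
    return os.cpu_count(), len(find_cgroup_oom_lines(log_text))
```

For example, calling summarize(read_kernel_log()) on the affected node returns the number of logical CPUs that the operating system reports and the number of matching kernel log lines. A high CPU count together with a nonzero match count is consistent with the problem that is described here.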

Causes

The default memory limit that is specified for some of the pods is not sufficient on a system that has a large number of virtual CPUs.

Resolving the problem

When you install IBM Cloud Private with management and Vulnerability Advisor services on a Power system that has a large number of virtual CPUs, increase the default memory limits.

Add the following content to the config.yaml file:

logging:
  logstash:
    memoryLimit: "2000Mi"
  elasticsearch:
    client:
      memoryLimit: "2000Mi"
    master:
      memoryLimit: "2000Mi"

helm-api:
  auditService:
    resources:
      limits:
        memory: "256Mi"

helm-repo:
  auditService:
    resources:
      limits:
        memory: "256Mi"

mgmt-repo:
  auditService:
    resources:
      limits:
        memory: "256Mi"