Changing required node settings

Some services that run on IBM® Cloud Pak for Data require specific settings on the nodes in the cluster. To ensure that the cluster has the required settings for these services, an operating system administrator with root privileges must review and adjust the settings on the appropriate nodes in the cluster.

Required permissions

To adjust these settings, you must be an operating system administrator with root privileges.

Load balancer timeout settings

To prevent connections from being closed before processes complete, you might need to adjust the timeout settings on your load balancer node. If you are using HAProxy, the load balancer node is usually the OpenShift® cluster public node. The recommended timeout is at least 5 minutes. In some situations, you might need to set the timeout even higher. For more information, see Processes time out before completing.

This setting is required if you plan to install the Watson™ Knowledge Catalog service or the OpenPages® service. However, this setting is also recommended if you are working with large data sets or you have slower network speeds.

The following steps assume that you are using HAProxy. If you are using a different load balancer, see the documentation for your load balancer.

On premises or private cloud

  1. On the load balancer node, check the HAProxy timeout settings in the /etc/haproxy/haproxy.cfg file.
    The recommended values are at least:
    timeout client          300s 
    timeout server          300s 
  2. If the timeout values are less than 300 seconds (5 minutes), update the values:
    • To change the timeout client setting, enter the following command:
      sed -i -e "/timeout client/s/ [0-9].*/ 5m/" /etc/haproxy/haproxy.cfg
    • To change the timeout server setting, enter the following command:
      sed -i -e "/timeout server/s/ [0-9].*/ 5m/" /etc/haproxy/haproxy.cfg
  3. Run the following command to apply the changes that you made to the HAProxy configuration:
    systemctl restart haproxy
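
After HAProxy restarts, you can optionally confirm that the new values are in place. For example, the following command (a minimal check, assuming the default configuration path) prints the current timeout lines:
  grep -E "timeout (client|server)" /etc/haproxy/haproxy.cfg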

On IBM Cloud

If you are setting HAProxy timeout settings for Cloud Pak for Data on IBM Cloud, you can configure route timeouts using the oc annotate command.

  1. Use the following command to set the server-side timeout for the HAProxy route to 360 seconds:
    oc annotate route zen-cpd --overwrite haproxy.router.openshift.io/timeout=360s

    If you don't provide the units, ms is the default.

  2. Optionally, customize other route-specific settings. For more information, see Route-specific annotations.
Note: On a Virtual Private Cloud (VPC) Gen2 cluster, the load balancer timeout is set to 30s by default. If you use the annotate command to set a timeout value greater than 50s, the timeout is capped at 50s; you cannot customize the timeout value to be greater than 50s. The server might time out during long-running transactions. For more information, see Connection timeouts.
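
To confirm that the annotation was applied, you can inspect the route. For example (a minimal check, assuming the zen-cpd route is in the current project):
  oc get route zen-cpd -o yaml | grep haproxy.router.openshift.io/timeout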

CRI-O container settings

To ensure that services can run correctly, you must adjust values in the CRI-O container settings to specify the maximum number of processes and the maximum number of open file descriptors.

Note: If you install Cloud Pak for Data on IBM Cloud, the CRI-O container settings are automatically applied to your cluster as part of the installation. You do not need to manually change these settings.
These settings are required if you are using the CRI-O container runtime. Follow the instructions for your version of Red Hat® OpenShift:
Configuring CRI-O container settings on Red Hat OpenShift version 4.5 or 4.6

To change CRI-O settings, you modify the contents of the crio.conf file and pass those updates to your nodes as a machine config.

  1. Obtain a copy of the existing crio.conf file from a worker node. For example, run the following command, replacing $node with one of the worker nodes. You can list the worker nodes by using the oc get nodes command.
    scp core@$node:/etc/crio/crio.conf /tmp/crio.conf

    If the crio.conf file doesn't exist in the path /etc/crio/crio.conf, use the path /etc/crio/crio.conf.d/00-default instead.

    If you don't have access by using the scp command, ask your cluster administrator for the crio.conf file.

    Make sure that you obtain the latest version of the crio.conf file. You can verify that the file is the latest version by running the oc get mcp command and verifying that the worker node is not being updated (UPDATING = False).

  2. In the crio.conf file, make the following changes in the [crio.runtime] section (uncomment the lines if necessary):
    • To set the maximum number of open files, change the default_ulimits setting to at least 66560, as follows:
      ...
      [crio.runtime]
      default_ulimits = [
              "nofile=66560:66560"
      ]
      ...
    • To set the maximum number of processes, change the pids_limit setting to at least 12288, as follows:
      ...
      # Maximum number of processes allowed in a container.
      pids_limit = 12288
      ...
      
  3. Create a machineconfig object YAML file, as follows, and apply it.
    Note: If you are using Cloud Pak for Data on Red Hat OpenShift version 4.6, the ignition version is 3.1.0 and the machineconfig object YAML file must not include the filesystem parameter. If you are using Cloud Pak for Data on Red Hat OpenShift version 4.5, the ignition version is 2.2.0 and the machineconfig object YAML file must include filesystem: root.
    • Red Hat OpenShift version 4.6
      cat << EOF | oc apply -f -
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: worker
        name: 99-worker-cp4d-crio-conf
      spec:
        config:
          ignition:
            version: 3.1.0
          storage:
            files:
            - contents:
                source: data:text/plain;charset=utf-8;base64,$(cat /tmp/crio.conf | base64 -w0)
              mode: 0644
              overwrite: true
              path: /etc/crio/crio.conf
      EOF
      
    • Red Hat OpenShift version 4.5
      cat << EOF | oc apply -f -
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: worker
        name: 99-worker-cp4d-crio-conf
      spec:
        config:
          ignition:
            version: 2.2.0
          storage:
            files:
            - contents:
                source: data:text/plain;charset=utf-8;base64,$(cat /tmp/crio.conf | base64 -w0)
              filesystem: root
              mode: 0644
              overwrite: true
              path: /etc/crio/crio.conf
      EOF
      
  4. Monitor all of the nodes to ensure that the changes are applied by using the following command:
    watch oc get nodes
    You can also use the following command to confirm that the MachineConfig sync is complete:
    watch oc get mcp
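
After the machine config finishes rolling out, you can optionally spot-check that the new values reached a worker node. For example (a sketch; replace <node> with one of your worker nodes, and adjust the path if you used /etc/crio/crio.conf.d/00-default):
  oc debug node/<node> -- chroot /host grep -E "pids_limit|nofile" /etc/crio/crio.conf
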
Configuring CRI-O container settings on Red Hat OpenShift version 3.11

On each worker node in the cluster, perform the following steps.

  1. In the crio.conf file, make the following changes in the [crio.runtime] section (uncomment the lines if necessary):
    • To set the maximum number of open files, change the default_ulimits setting to at least 66560, as follows:
      ...
      [crio.runtime]
      default_ulimits = [
              "nofile=66560:66560"
      ]
      ...
    • To set the maximum number of processes, change the pids_limit setting to at least 12288, as follows:
      ...
      # Maximum number of processes allowed in a container.
      pids_limit = 12288
      ...
      
  2. Run the following command to apply the changes that you made to the /etc/crio/crio.conf file:
    systemctl restart crio
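
You can optionally confirm that CRI-O restarted cleanly and that the new values are active on the node, for example:
  systemctl status crio --no-pager
  grep -E "pids_limit|nofile" /etc/crio/crio.conf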

For more information, see Configuring CRI-O.

Docker container settings

To ensure that services can run correctly, you must adjust the maximum number of processes and the maximum number of open files in the Docker container settings.

Note: You do not need to adjust these settings on IBM Cloud.

These settings are required if you are using the Docker container runtime on Red Hat OpenShift version 3.11.

Note: Docker is not supported on Red Hat OpenShift versions 4.5 and 4.6.

On each worker node in the cluster, perform the following steps.

  1. Check the maximum number of open files setting by running the following command:
    ulimit -n

    The recommended value is at least 66560.

    1. If the ulimit value is less than 66560, edit or append the following setting in the OPTIONS line in the /etc/sysconfig/docker file:
      OPTIONS=' --default-ulimit nofile=66560'
  2. Check the maximum number of processes setting by running the following command:
    ulimit -u

    The recommended value is at least 12288.

    1. If the ulimit value is less than 12288, edit or append the following setting in the OPTIONS line in the /etc/sysconfig/docker file:
      OPTIONS=' --default-pids-limit=12288'
  3. Run the following command to apply the changes that you made to the /etc/sysconfig/docker file:
    systemctl restart docker
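
If you need to raise both values, the two flags can be combined on a single OPTIONS line. The following line is a sketch; merge the flags with any options that are already present in your /etc/sysconfig/docker file, then restart Docker as shown in the previous step:
  OPTIONS=' --default-ulimit nofile=66560 --default-pids-limit=12288'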

Kernel parameter settings

To ensure that certain microservices can run correctly, you must verify the kernel parameters. These settings are required for all deployments; however, the appropriate values depend on the machine RAM size and the OS page size. The following steps assume that you have worker nodes with 64 GB of RAM on an x86 platform with a 4 KB OS page size. If the worker nodes have 128 GB of RAM each, you must double the values for kernel.shmmax and kernel.shmall.

  • Virtual memory limit (vm.max_map_count)
  • Message limits (kernel.msgmax, kernel.msgmnb, and kernel.msgmni)
  • Shared memory limits (kernel.shmmax, kernel.shmall, and kernel.shmmni)
    The following settings are recommended:
    • kernel.shmmni: 256 * <size of RAM in GB>
    • kernel.shmmax: <size of RAM in bytes>
    • kernel.shmall: 2 * <size of RAM in the default OS system page size>

    The default OS system page size on Power Systems is 64 KB. Take this OS page size into account when you set the value for kernel.shmall. For more information, see Modifying kernel parameters (Linux®) in Kernel parameters for Db2 database server installation (Linux and UNIX). A worked example for 64 GB worker nodes follows this list.

  • Semaphore limits (kernel.sem)

    As of Red Hat Enterprise Linux version 7.8 and Red Hat Enterprise Linux version 8.1, the kernel.shmmni and kernel.msgmni settings, and the SEMMNI value (the fourth field in kernel.sem), must be set to 32768. If the boot parameter ipcmni_extend is specified, the maximum value is 8388608 and the minimum value is 32768. Use 256 * <size of RAM in GB> to calculate possible values for kernel.shmmni and SEMMNI. Use 1024 * <size of RAM in GB> to calculate a possible value for kernel.msgmni. For more information, see On RHEL servers, changing the semaphore value fails with a message "setting key "kernel.sem": Numerical result out of range".

    • The kernel.sem value for SEMMNS must be 1024000 for the Watson Knowledge Catalog service.
    • The kernel.sem value for SEMOPM must be 100 for the Data Virtualization service.
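
As a worked example (assuming worker nodes with 64 GB of RAM and a 4 KB OS page size, as in the steps that follow), the shared memory values are calculated as follows:
  kernel.shmmax = RAM in bytes          = 64 * 1024^3              = 68719476736
  kernel.shmall = 2 * (RAM / page size) = 2 * (68719476736 / 4096) = 33554432
  kernel.shmmni = 256 * RAM in GB       = 256 * 64                 = 16384 (set to 32768 on RHEL 7.8, 8.1, or later)
These values match the settings that are used in the profiles and sysctl entries later in this section.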

For Red Hat OpenShift version 4.5 or 4.6

On Red Hat OpenShift, you can use the Node Tuning Operator to manage node-level profiles. For more information, see Using the Node Tuning Operator.

Note: The following steps affect all services and all worker nodes on the cluster. You might need to manage node-level profiles for each worker node in the cluster based on the services that are installed. You can limit node tuning to specific nodes. For more information, see Managing nodes.
  1. Create a YAML file, 42-cp4d.yaml, with the following content. If your current settings are less than the recommendations, adjust the settings in your YAML file. This step assumes that you have worker nodes with 64 GB of RAM.
    apiVersion: tuned.openshift.io/v1
    kind: Tuned
    metadata:
      name: cp4d-wkc-ipc
      namespace: openshift-cluster-node-tuning-operator
    spec:
      profile:
      - name: cp4d-wkc-ipc
        data: |
          [main]
          summary=Tune IPC Kernel parameters on OpenShift Worker Nodes running WKC Pods
          [sysctl]
          kernel.shmall = 33554432
          kernel.shmmax = 68719476736
          kernel.shmmni = 32768
          kernel.sem = 250 1024000 100 32768
          kernel.msgmax = 65536
          kernel.msgmnb = 65536
          kernel.msgmni = 32768
          vm.max_map_count = 262144
      recommend:
      - match:
        - label: node-role.kubernetes.io/worker
        priority: 10
        profile: cp4d-wkc-ipc

    On IBM Power Systems, create a YAML file, 42-cp4d.yaml, with the following content, which adjusts kernel.shmall for the 64 KB OS system page size on Power Systems.

    apiVersion: tuned.openshift.io/v1
    kind: Tuned
    metadata:
      name: cp4d-wkc-ipc
      namespace: openshift-cluster-node-tuning-operator
    spec:
      profile:
      - name: cp4d-wkc-ipc
        data: |
          [main]
          summary=Tune IPC Kernel parameters on OpenShift Worker Nodes running WKC Pods
          [sysctl]
          kernel.shmall = 2097152
          kernel.shmmax = 68719476736
          kernel.shmmni = 32768
          kernel.sem = 250 1024000 100 32768
          kernel.msgmax = 65536
          kernel.msgmnb = 65536
          kernel.msgmni = 32768
          vm.max_map_count = 262144
      recommend:
      - match:
        - label: node-role.kubernetes.io/worker
        priority: 10
        profile: cp4d-wkc-ipc
  2. Run the following command to apply the changes:
    oc create -f 42-cp4d.yaml
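
After the Node Tuning Operator applies the profile, you can optionally spot-check the values on a worker node. For example (a sketch; replace <node> with one of your worker nodes):
  oc debug node/<node> -- chroot /host sysctl kernel.sem kernel.shmmax vm.max_map_count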

For Red Hat OpenShift version 3.11

On each worker node in the cluster, perform the following steps.

  1. Create a conf file for the Cloud Pak for Data kernel settings in the /etc/sysctl.d directory.

    The files are processed in alphabetical order. To ensure that the settings from the Cloud Pak for Data conf file are retained after the sysctl files are processed, give the conf file a name that ensures that it is processed at the end. For example, name the file /etc/sysctl.d/42-cp4d.conf.

  2. Check the virtual memory limit setting by running the following command:
    sysctl -a 2>/dev/null | grep vm.max | grep -v next_id
    The recommended value is at least:
    vm.max_map_count = 262144
    1. If the vm.max_map_count value is less than 262144, add the following entry in the /etc/sysctl.d/42-cp4d.conf file:
      vm.max_map_count = 262144
  3. Check the message limit settings by running the following command:
    sysctl -a 2>/dev/null | grep kernel.msg | grep -v next_id
    The recommended values are at least:
    kernel.msgmax = 65536
    kernel.msgmnb = 65536
    kernel.msgmni = 32768
    1. If the kernel.msgmax value is less than 65536, add the following entry in the /etc/sysctl.d/42-cp4d.conf file:
      kernel.msgmax = 65536
    2. If the kernel.msgmnb value is less than 65536, add the following entry in the /etc/sysctl.d/42-cp4d.conf file:
      kernel.msgmnb = 65536
    3. If the kernel.msgmni value is less than 32768, add the following entry in the /etc/sysctl.d/42-cp4d.conf file:
      kernel.msgmni = 32768
  4. Check the shared memory limit settings by running the following command:
    sysctl -a 2>/dev/null | grep kernel.shm | grep -v next_id | grep -v shm_rmid_forced
    The recommended values are at least:
    kernel.shmmax = 68719476736
    kernel.shmall = 33554432
    kernel.shmmni = 32768
    1. If the kernel.shmmax value is less than 68719476736, add the following entry in the /etc/sysctl.d/42-cp4d.conf file:
      kernel.shmmax = 68719476736
    2. If the kernel.shmall value is less than 33554432, add the following entry in the /etc/sysctl.d/42-cp4d.conf file:
      kernel.shmall = 33554432
    3. If the kernel.shmmni value is less than 32768, add the following entry in the /etc/sysctl.d/42-cp4d.conf file:
      kernel.shmmni = 32768
  5. Check the semaphore limit settings by running the following command:
    sysctl -a 2>/dev/null | grep kernel.sem | grep -v next_id
    The recommended values are at least:
    kernel.sem = 250 1024000 100 32768
    Specifically:
    • The max semaphores per array must be at least 250.
    • The max semaphores system wide must be at least 1024000.
    • The max ops per semop call must be at least 100.
    • The max number of arrays must be at least 32768.
    1. If any of the semaphore limit settings are less than the minimum requirements, add the following entry in the /etc/sysctl.d/42-cp4d.conf file:
      kernel.sem = 250 1024000 100 32768
  6. Run the following command to apply the changes that you made to the /etc/sysctl.d/42-cp4d.conf:
    sysctl -p /etc/sysctl.d/42-cp4d.conf
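
To confirm that the values are active, you can query them again, for example:
  sysctl vm.max_map_count kernel.msgmax kernel.msgmnb kernel.msgmni kernel.shmmax kernel.shmall kernel.shmmni kernel.sem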

Power settings

On Power Systems, you must complete the following steps to change the simultaneous multithreading (SMT) settings and to set the kernel argument slub_max_order to 0 on both small core, Kernel-based Virtual Machine (KVM) capable systems and big core, PowerVM® capable systems.

Note: PowerVM capable systems include L922, E950, E980, S922.

Kernel-based Virtual Machine (KVM) capable systems include LC922, IC922, AC922.

Power SMT and slub_max_order settings

  1. Label all small core KVM capable worker nodes that are not running Db2® Warehouse workloads with SMT=2. For example:
    oc label node <node> SMT=2 --overwrite
  2. Label all small core KVM capable worker nodes that are running Db2 Warehouse workloads with SMT=4. For example:
    oc label node <node> SMT=4 --overwrite
  3. Label all big core PowerVM capable worker nodes that are not running Db2 Warehouse workloads with SMT=4. For example:
    oc label node <node> SMT=4 --overwrite
  4. Label all big core PowerVM capable worker nodes that are running Db2 Warehouse workloads with SMT=8. For example:
    oc label node <node> SMT=8 --overwrite
  5. Create a YAML file, smt.yaml, with the following content:
    Note: The ignition version value changes based on the Red Hat OpenShift version number.
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      labels:
        machineconfiguration.openshift.io/role: worker
      name: 99-worker-smt
    spec:
      kernelArguments:
      - slub_max_order=0
      config:
        ignition:
          version: 2.2.0
        storage:
          files:
          - contents:
              source: data:text/plain;charset=utf-8;base64,IyEvYmluL2Jhc2gKZXhwb3J0IFBBVEg9L3Jvb3QvLmxvY2FsL2Jpbjovcm9vdC9iaW46L3NiaW46L2JpbjovdXNyL2xvY2FsL3NiaW46L3Vzci9sb2NhbC9iaW46L3Vzci9zYmluOi91c3IvYmluCmV4cG9ydCBLVUJFQ09ORklHPS92YXIvbGliL2t1YmVsZXQva3ViZWNvbmZpZwpDT1JFUFM9JCgvYmluL2xzY3B1IHwgL2Jpbi9hd2sgLUY6ICcgJDEgfiAvXkNvcmVcKHNcKSBwZXIgc29ja2V0JC8ge3ByaW50ICQyfSd8L2Jpbi94YXJncykKU09DS0VUUz0kKC9iaW4vbHNjcHUgfCAvYmluL2F3ayAtRjogJyAkMSB+IC9eU29ja2V0XChzXCkkLyB7cHJpbnQgJDJ9J3wvYmluL3hhcmdzKQpsZXQgVE9UQUxDT1JFUz0kQ09SRVBTKiRTT0NLRVRTCk1BWFRIUkVBRFM9JCgvYmluL2xzY3B1IHwgL2Jpbi9hd2sgLUY6ICcgJDEgfiAvXkNQVVwoc1wpJC8ge3ByaW50ICQyfSd8L2Jpbi94YXJncykKbGV0IE1BWFNNVD0kTUFYVEhSRUFEUy8kVE9UQUxDT1JFUwpDVVJSRU5UU01UPSQoL2Jpbi9sc2NwdSB8IC9iaW4vYXdrIC1GOiAnICQxIH4gL15UaHJlYWRcKHNcKSBwZXIgY29yZSQvIHtwcmludCAkMn0nfC9iaW4veGFyZ3MpCgpTTVRMQUJFTD0kKC9iaW4vb2MgZ2V0IG5vZGUgJEhPU1ROQU1FIC1MIFNNVCAtLW5vLWhlYWRlcnMgfC9iaW4vYXdrICd7cHJpbnQgJDZ9JykKaWYgW1sgLW4gJFNNVExBQkVMIF1dCiAgdGhlbgogICAgY2FzZSAkU01UTEFCRUwgaW4KICAgICAgMSkgVEFSR0VUU01UPTEKICAgIDs7CiAgICAgIDIpIFRBUkdFVFNNVD0yCiAgICA7OwogICAgICA0KSBUQVJHRVRTTVQ9NAogICAgOzsKICAgICAgOCkgVEFSR0VUU01UPTgKICAgIDs7CiAgICAgICopIFRBUkdFVFNNVD0kQ1VSUkVOVFNNVCA7IGVjaG8gIlNNVCB2YWx1ZSBtdXN0IGJlIDEsIDIsIDQsIG9yIDggYW5kIHNtYWxsZXIgdGhhbiBNYXhpbXVtIFNNVC4iCiAgICA7OwogICAgZXNhYwogIGVsc2UKICAgIFRBUkdFVFNNVD0kTUFYU01UCmZpCgp0b3VjaCAvcnVuL21hY2hpbmUtY29uZmlnLWRhZW1vbi1mb3JjZQoKQ1VSUkVOVFNNVD0kKC9iaW4vbHNjcHUgfCAvYmluL2F3ayAtRjogJyAkMSB+IC9eVGhyZWFkXChzXCkgcGVyIGNvcmUkLyB7cHJpbnQgJDJ9J3wvYmluL3hhcmdzKQoKaWYgW1sgJENVUlJFTlRTTVQgLW5lICRUQVJHRVRTTVQgXV0KICB0aGVuCiAgICBJTklUT05USFJFQUQ9MAogICAgSU5JVE9GRlRIUkVBRD0kVEFSR0VUU01UCiAgICBpZiBbWyAkTUFYU01UIC1nZSAkVEFSR0VUU01UIF1dCiAgICAgIHRoZW4KICAgICAgICB3aGlsZSBbWyAkSU5JVE9OVEhSRUFEIC1sdCAkTUFYVEhSRUFEUyBdXQogICAgICAgIGRvCiAgICAgICAgICBPTlRIUkVBRD0kSU5JVE9OVEhSRUFECiAgICAgICAgICBPRkZUSFJFQUQ9JElOSVRPRkZUSFJFQUQKCiAgICAgICAgICB3aGlsZSBbWyAkT05USFJFQUQgLWx0ICRPRkZUSFJFQUQgXV0KICAgICAgICAgIGRvCiAgICAgICAgICAgIC9iaW4vZWNobyAxID4gL3N5cy9kZXZpY2VzL3N5c3RlbS9jcHUvY3B1JE9OVEhSRUFEL29ubGluZQogICAgICAgICAgICBsZXQgT05USFJFQUQ9JE9OVEhSRUFEKzEKICAgICAgICAgIGRvbmUKICAgICAgICAgIGxldCBJTklUT05USFJFQUQ9JElOSVRPTlRIUkVBRCskTUFYU01UCiAgICAgICAgICB3aGlsZSBbWyAkT0ZGVEhSRUFEIC1sdCAkSU5JVE9OVEhSRUFEIF1dCiAgICAgICAgICBkbwogICAgICAgICAgICAvYmluL2VjaG8gMCA+IC9zeXMvZGV2aWNlcy9zeXN0ZW0vY3B1L2NwdSRPRkZUSFJFQUQvb25saW5lCiAgICAgICAgICAgIGxldCBPRkZUSFJFQUQ9JE9GRlRIUkVBRCsxCiAgICAgICAgICBkb25lCiAgICAgICAgICBsZXQgSU5JVE9GRlRIUkVBRD0kSU5JVE9GRlRIUkVBRCskTUFYU01UCiAgICAgICAgZG9uZQogICAgICBlbHNlCiAgICAgICAgZWNobyAiVGFyZ2V0IFNNVCBtdXN0IGJlIHNtYWxsZXIgb3IgZXF1YWwgdGhhbiBNYXhpbXVtIFNNVCBzdXBwb3J0ZWQiCiAgICBmaQpmaQo=
              verification: {}
            filesystem: root
            mode: 0755
            overwrite: true
            path: /usr/local/bin/powersmt
        systemd:
          units:
            - name: smt.service
              enabled: true
              contents: |
                [Unit]
                Description=Set SMT
                After=network-online.target
                Before=crio.service
                [Service]
                Type=oneshot
                RemainAfterExit=yes
                ExecStart=/usr/local/bin/powersmt
                [Install]
                WantedBy=multi-user.target
    
  6. Run the following command to apply the changes:
    Note: You must ensure that the cluster master nodes (or control plane) are in Ready status before you issue this command.
    oc create -f smt.yaml

    Your worker nodes will perform a rolling reboot to update the kernel argument slub_max_order and to set the labeled SMT level.
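
After the nodes come back, you can optionally confirm the SMT level on a labeled worker node. For example (a sketch; replace <node> with one of your worker nodes):
  oc debug node/<node> -- chroot /host lscpu | grep "Thread(s) per core"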