Changing required node settings
Some services that run on IBM® Cloud Pak for
Data require specific settings on the nodes in
the cluster. To ensure that the cluster has the required settings for these services, an operating
system administrator with root
privileges must review and adjust the settings on
the appropriate nodes in the cluster.
For more information, see Using MachineConfig objects to configure nodes in the Red Hat® OpenShift® Container Platform documentation.
Node settings for services
The following table shows the services that require changes to specific node settings, with links to instructions for changing the settings:
Node settings | Services that require changes to the setting | Environments | Instructions |
---|---|---|---|
HAProxy timeout settings for the load balancer | Db2 Data Gate, OpenPages, Watson Discovery, Watson Knowledge Catalog, Watson Speech | All environments | Load balancer timeout settings |
CRI-O container settings | | All environments except IBM Cloud | CRI-O container settings |
Kernel parameter settings | | All environments | Kernel parameter settings |
Power settings | | | Power settings |
GPU settings | | All environments | |
Load balancer timeout settings
To prevent connections from being closed before processes complete, you might need to adjust the timeout settings on your load balancer node. The following services require changes to the load balancer timeout settings:
- Db2 Data Gate
- OpenPages
- Watson Discovery
- Watson Knowledge Catalog
- Watson Speech
This setting is also recommended if you are working with large data sets or you have slower network speeds. For example, you might need to increase this value if you receive a timeout or failure when you upload a large file.
The following procedures show how to change the timeout settings if you are using HAProxy. If you are using a load balancer other than HAProxy, see the documentation for your load balancer for information about how to configure the timeout settings.
If you are using HAProxy, the load balancer node is the OpenShift cluster public node.
Changing timeout settings in on-premises and private cloud deployments
- On the load balancer node, check the HAProxy timeout settings in the
/etc/haproxy/haproxy.cfg file. The recommended minimum values are as follows:
- Db2 Data Gate
  timeout client 7500s
  timeout server 7500s
- OpenPages
  timeout client 300s
  timeout server 300s
- Watson Discovery
  timeout client 300s
  timeout server 300s
- Watson Knowledge Catalog
  timeout client 300s
  timeout server 300s
- Watson Speech
  timeout client 1800s
  timeout server 1800s
- If necessary, change the timeout values by running the following commands. The example commands set the value to 5m (5 minutes); adjust the value so that it meets the recommended minimums for the services that you plan to use.
  - To change the timeout client setting, enter the following command:
    sed -i -e "/timeout client/s/ [0-9].*/ 5m/" /etc/haproxy/haproxy.cfg
  - To change the timeout server setting, enter the following command:
    sed -i -e "/timeout server/s/ [0-9].*/ 5m/" /etc/haproxy/haproxy.cfg
- Run the following command to apply the changes that you made to the HAProxy
configuration:
systemctl restart haproxy
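Optionally, confirm the values that HAProxy is now using and that the service restarted cleanly. This is a quick check, assuming the default /etc/haproxy/haproxy.cfg location that is used above:
grep -E "timeout (client|server)" /etc/haproxy/haproxy.cfg
systemctl is-active haproxy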
Changing timeout settings on IBM Cloud
If you are setting HAProxy timeout settings for Cloud Pak for Data on IBM Cloud, you can configure route timeouts by using the
oc annotate
command.
- Use the following command to set the server-side timeout for the HAProxy route to 360
seconds:
oc annotate route zen-cpd --overwrite haproxy.router.openshift.io/timeout=360s
If you don't provide the units, ms is the default.
- Optionally, customize other route-specific settings. For more information, see Route-specific annotations.
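To confirm that the annotation was applied to the route, you can inspect it; for example (assuming the zen-cpd route from the previous step):
oc get route zen-cpd -o yaml | grep haproxy.router.openshift.io/timeout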
Note: You can use the annotate command to set the timeout value to a maximum of 50s. If you need to set the timeout value higher than 50s, open a support ticket with the Load Balance Service team. The server might time out during long-running transactions. For more information, see Connection timeouts.
CRI-O container settings
To ensure that services can run correctly, you must adjust values in the CRI-O container settings to specify the maximum number of processes and the maximum number of open file descriptors.
These settings are required for the CRI-O container runtime on the OpenShift Container Platform.
To change CRI-O settings, you modify the
contents of the crio.conf
file and pass those updates to your nodes as a machine
config.
- Obtain a copy of the existing crio.conf file from a worker node. For
example, run the following command, replacing $node with one of the worker nodes.
You can obtain the worker nodes by using the oc get nodes command.
  scp core@$node:/etc/crio/crio.conf /tmp/crio.conf
If the crio.conf file doesn't exist in the path /etc/crio/crio.conf, use the path /etc/crio/crio.conf.d/00-default instead.
If you don't have access by using the scp command, ask your cluster administrator for the crio.conf file.
Make sure that you obtain the latest version of the crio.conf file.
- In the crio.conf file, make the following changes in the [crio.runtime] section (uncomment the lines if necessary):
  - To set the maximum number of open files, change the default_ulimits setting to at least 66560, as follows.
    Note: When you set the default_ulimits parameter in the crio.conf file, make sure that the ulimit -n settings in the /etc/security/limits.conf files on the worker machines also are set to at least 66560.
    ...
    [crio.runtime]
    default_ulimits = [
      "nofile=66560:66560"
    ]
    ...
  - To set the maximum number of processes, change the pids_limit setting to at least 12288, as follows.
    ...
    # Maximum number of processes allowed in a container.
    pids_limit = 12288
    ...
- Create a machineconfig object YAML file, as follows, and apply it.
  Note: If you are using Cloud Pak for Data on OpenShift Container Platform version 4.6, the ignition version is 3.1.0. If you are using Cloud Pak for Data on OpenShift Container Platform version 4.8, change the ignition version to 3.2.0.
  Note: On Mac OS systems, remove -w0 at the end of the source value so that you do not receive an error when you apply the machineconfig object YAML file.

cat << EOF | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-cp4d-crio-conf
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,$(cat /tmp/crio.conf | base64 -w0)
        mode: 0644
        overwrite: true
        path: /etc/crio/crio.conf
EOF
- Monitor all of the nodes to ensure that the changes are applied, by using the following
command:
watch oc get nodes
You can also use the following command to confirm that the MachineConfig sync is complete:
watch oc get mcp
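After the machine config pool finishes updating, you can optionally spot-check a worker node to confirm that the new values are in place. This is a sketch that assumes you have cluster-admin access and the /etc/crio/crio.conf path that is used above; replace $node with a worker node name:
oc debug node/$node -- chroot /host grep -E "pids_limit|default_ulimits" /etc/crio/crio.conf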
Kernel parameter settings
Enabling unsafe sysctls in on-premises and private cloud deployments
You must update the kubelet configuration to allow Db2U to make unsafe sysctl calls so that Db2 can manage the required memory settings.
required memory settings. - Update all of the nodes to use a custom
KubletConfig
:cat << EOF | oc apply -f - apiVersion: machineconfiguration.openshift.io/v1 kind: KubeletConfig metadata: name: db2u-kubelet spec: machineConfigPoolSelector: matchLabels: db2u-kubelet: sysctl kubeletConfig: allowedUnsafeSysctls: - "kernel.msg*" - "kernel.shm*" - "kernel.sem" EOF
- Update the label on the machineconfigpool:
oc label machineconfigpool worker db2u-kubelet=sysctl
- Wait for the cluster to restart and then run the following command to verify that the machineconfigpool is updated:
oc get machineconfigpool
The command should return output with the following format:
NAME     CONFIG   UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   master   True      False      False      3              3                   3                     0                      139m
worker   worker   False     True       False      5              1                   1                     0                      139m
Wait until all of the worker nodes are updated and ready.
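If you want to confirm that the kubelet on the worker nodes now allows the unsafe sysctls, one way is to inspect the rendered kubelet configuration from a debug pod. This is a sketch that assumes the configuration file is at /etc/kubernetes/kubelet.conf, the default location on Red Hat Enterprise Linux CoreOS nodes; replace <node> with a worker node name:
oc debug node/<node> -- chroot /host grep -A3 allowedUnsafeSysctls /etc/kubernetes/kubelet.conf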
Changing kernel parameter settings on IBM Cloud
If you install Cloud Pak for Data and services on IBM Cloud without using the IBM Cloud Catalog, you must manually change the kernel parameter settings by applying a custom Kubernetes daemon set. For more information, see Modifying default worker node settings to optimize performance in the IBM Cloud documentation. Update the values in the daemon set based on the recommended settings for Cloud Pak for Data. For more information, see Kernel parameter requirements (Linux).
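The IBM Cloud documentation that is linked above provides the supported daemon set. As an illustration only, the general pattern is a privileged init container that runs sysctl -w on each worker node, as in the following sketch; the cp4d-kernel-settings name and the vm.max_map_count value are placeholders that you must replace with the settings from Kernel parameter requirements (Linux):
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cp4d-kernel-settings        # placeholder name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: cp4d-kernel-settings
  template:
    metadata:
      labels:
        name: cp4d-kernel-settings
    spec:
      initContainers:
      - name: set-kernel-parameters
        image: alpine:3.18
        # Placeholder parameter and value; use the documented requirements.
        command: ["sh", "-c", "sysctl -w vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9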
Changing kernel parameter settings in environments that do not support kubelet settings
In some environments, you might not be able to use kubelet settings to change kernel parameters. For example, you might not have permission to create KubeletConfig and MachineConfig objects because of security restrictions. In these cases, you can verify the kernel parameters to ensure that certain services can run correctly, and you can use the Red Hat OpenShift Node Tuning Operator to set the correct kernel parameters. For more information, see Using the Red Hat OpenShift Node Tuning Operator to set kernel parameters.
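As an illustration only, a Tuned custom resource for the Node Tuning Operator generally takes the following shape. The cpd-kernel-settings name, the kernel.shmmni key, and its value are placeholders; use the parameters and values from Kernel parameter requirements (Linux):
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: cpd-kernel-settings          # placeholder name
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - name: cpd-kernel-settings
    data: |
      [main]
      summary=Kernel parameter settings for Cloud Pak for Data
      include=openshift-node
      [sysctl]
      # Placeholder parameter and value; use the documented requirements.
      kernel.shmmni=32768
  recommend:
  - match:
    - label: node-role.kubernetes.io/worker
    priority: 20
    profile: cpd-kernel-settings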
Power settings
On IBM Power Systems, you label each worker node with the simultaneous multithreading (SMT) level that you want, and then apply a MachineConfig object that sets the labeled SMT level on the node.
Note: Set the slub_max_order kernel argument to 0 only if your OpenShift Container Platform version is earlier than 4.8. Remove the kernel argument setting from the YAML file if your OpenShift Container Platform version is 4.8 or later.
- Label all small core KVM-capable worker nodes that are not running Db2 Warehouse workloads with SMT=2. For example:
oc label node <node> SMT=2 --overwrite
- Label all small core KVM-capable worker nodes that are running Db2 Warehouse workloads with SMT=4. For example:
oc label node <node> SMT=4 --overwrite
- Label all big core PowerVM-capable worker nodes that are not running Db2 Warehouse workloads with SMT=4. For example:
oc label node <node> SMT=4 --overwrite
- Label all big core PowerVM-capable worker nodes that are running Db2 Warehouse workloads with SMT=8. For example:
oc label node <node> SMT=8 --overwrite
- Create a YAML file, smt.yaml, with the following
content:
apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: labels: machineconfiguration.openshift.io/role: worker name: 99-worker-smt spec: kernelArguments: - slub_max_order=0 config: ignition: version: 3.1.0 storage: files: - contents: source: data:text/plain;charset=utf-8;base64,IyEvYmluL2Jhc2gKZXhwb3J0IFBBVEg9L3Jvb3QvLmxvY2FsL2Jpbjovcm9vdC9iaW46L3NiaW46L2JpbjovdXNyL2xvY2FsL3NiaW46L3Vzci9sb2NhbC9iaW46L3Vzci9zYmluOi91c3IvYmluCmV4cG9ydCBLVUJFQ09ORklHPS92YXIvbGliL2t1YmVsZXQva3ViZWNvbmZpZwpDT1JFUFM9JCgvYmluL2xzY3B1IHwgL2Jpbi9hd2sgLUY6ICcgJDEgfiAvXkNvcmVcKHNcKSBwZXIgc29ja2V0JC8ge3ByaW50ICQyfSd8L2Jpbi94YXJncykKU09DS0VUUz0kKC9iaW4vbHNjcHUgfCAvYmluL2F3ayAtRjogJyAkMSB+IC9eU29ja2V0XChzXCkkLyB7cHJpbnQgJDJ9J3wvYmluL3hhcmdzKQpsZXQgVE9UQUxDT1JFUz0kQ09SRVBTKiRTT0NLRVRTCk1BWFRIUkVBRFM9JCgvYmluL2xzY3B1IHwgL2Jpbi9hd2sgLUY6ICcgJDEgfiAvXkNQVVwoc1wpJC8ge3ByaW50ICQyfSd8L2Jpbi94YXJncykKbGV0IE1BWFNNVD0kTUFYVEhSRUFEUy8kVE9UQUxDT1JFUwpDVVJSRU5UU01UPSQoL2Jpbi9sc2NwdSB8IC9iaW4vYXdrIC1GOiAnICQxIH4gL15UaHJlYWRcKHNcKSBwZXIgY29yZSQvIHtwcmludCAkMn0nfC9iaW4veGFyZ3MpCgpTTVRMQUJFTD0kKC9iaW4vb2MgZ2V0IG5vZGUgJEhPU1ROQU1FIC1MIFNNVCAtLW5vLWhlYWRlcnMgfC9iaW4vYXdrICd7cHJpbnQgJDZ9JykKaWYgW1sgLW4gJFNNVExBQkVMIF1dCiAgdGhlbgogICAgY2FzZSAkU01UTEFCRUwgaW4KICAgICAgMSkgVEFSR0VUU01UPTEKICAgIDs7CiAgICAgIDIpIFRBUkdFVFNNVD0yCiAgICA7OwogICAgICA0KSBUQVJHRVRTTVQ9NAogICAgOzsKICAgICAgOCkgVEFSR0VUU01UPTgKICAgIDs7CiAgICAgICopIFRBUkdFVFNNVD0kQ1VSUkVOVFNNVCA7IGVjaG8gIlNNVCB2YWx1ZSBtdXN0IGJlIDEsIDIsIDQsIG9yIDggYW5kIHNtYWxsZXIgdGhhbiBNYXhpbXVtIFNNVC4iCiAgICA7OwogICAgZXNhYwogIGVsc2UKICAgIFRBUkdFVFNNVD0kTUFYU01UCmZpCgpDVVJSRU5UU01UPSQoL2Jpbi9sc2NwdSB8IC9iaW4vYXdrIC1GOiAnICQxIH4gL15UaHJlYWRcKHNcKSBwZXIgY29yZSQvIHtwcmludCAkMn0nfC9iaW4veGFyZ3MpCgppZiBbWyAkQ1VSUkVOVFNNVCAtbmUgJFRBUkdFVFNNVCBdXQogIHRoZW4KICAgIElOSVRPTlRIUkVBRD0wCiAgICBJTklUT0ZGVEhSRUFEPSRUQVJHRVRTTVQKICAgIGlmIFtbICRNQVhTTVQgLWdlICRUQVJHRVRTTVQgXV0KICAgICAgdGhlbgogICAgICAgIHdoaWxlIFtbICRJTklUT05USFJFQUQgLWx0ICRNQVhUSFJFQURTIF1dCiAgICAgICAgZG8KICAgICAgICAgIE9OVEhSRUFEPSRJTklUT05USFJFQUQKICAgICAgICAgIE9GRlRIUkVBRD0kSU5JVE9GRlRIUkVBRAoKICAgICAgICAgIHdoaWxlIFtbICRPTlRIUkVBRCAtbHQgJE9GRlRIUkVBRCBdXQogICAgICAgICAgZG8KICAgICAgICAgICAgL2Jpbi9lY2hvIDEgPiAvc3lzL2RldmljZXMvc3lzdGVtL2NwdS9jcHUkT05USFJFQUQvb25saW5lCiAgICAgICAgICAgIGxldCBPTlRIUkVBRD0kT05USFJFQUQrMQogICAgICAgICAgZG9uZQogICAgICAgICAgbGV0IElOSVRPTlRIUkVBRD0kSU5JVE9OVEhSRUFEKyRNQVhTTVQKICAgICAgICAgIHdoaWxlIFtbICRPRkZUSFJFQUQgLWx0ICRJTklUT05USFJFQUQgXV0KICAgICAgICAgIGRvCiAgICAgICAgICAgIC9iaW4vZWNobyAwID4gL3N5cy9kZXZpY2VzL3N5c3RlbS9jcHUvY3B1JE9GRlRIUkVBRC9vbmxpbmUKICAgICAgICAgICAgbGV0IE9GRlRIUkVBRD0kT0ZGVEhSRUFEKzEKICAgICAgICAgIGRvbmUKICAgICAgICAgIGxldCBJTklUT0ZGVEhSRUFEPSRJTklUT0ZGVEhSRUFEKyRNQVhTTVQKICAgICAgICBkb25lCiAgICAgIGVsc2UKICAgICAgICBlY2hvICJUYXJnZXQgU01UIG11c3QgYmUgc21hbGxlciBvciBlcXVhbCB0aGFuIE1heGltdW0gU01UIHN1cHBvcnRlZCIKICAgIGZpCmZp verification: {} filesystem: root mode: 0755 overwrite: true path: /usr/local/bin/powersmt systemd: units: - name: smt.service enabled: true contents: | [Unit] Description=Set SMT After=network-online.target Before= crio.service [Service] Type=oneshot RemainAfterExit=yes ExecStart=/usr/local/bin/powersmt [Install] WantedBy=multi-user.target
- Run the oc create command to apply the changes.
  Note: You must ensure that the cluster master nodes (or control plane) are in Ready status before you issue this command.
  oc create -f smt.yaml
  Your worker nodes perform a rolling reboot to update the slub_max_order kernel argument and to set the labeled SMT level.
  Note:
  - All the worker nodes are rebooted after the command is issued. The slub_max_order=0 kernel argument and the specified SMT level are applied to all the worker nodes after the reboot completes. The SMT level on the worker nodes that are not labeled is set to the default value.
  - After this process is done, if the SMT level on a particular worker node needs to be changed, you must label that worker node with the desired SMT level and manually reboot it. You can verify the labels and the resulting SMT level by using the commands that are shown after this procedure.
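The following commands are one way to confirm the result; this is a sketch that assumes cluster-admin access. The first command lists the SMT label on each node, and the second shows the active threads per core on a node (replace <node> with the node name):
oc get nodes -L SMT
oc debug node/<node> -- chroot /host lscpu | grep "Thread(s) per core"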