ping command fails in Db2 cluster

Important: IBM Cloud Pak for Data Version 4.5 will reach end of support (EOS) on 31 July, 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.5 reaches end of support. For more information, see Upgrading IBM Software Hub in the IBM Software Hub Version 5.1 documentation.

After you deploy Db2 in a db2ucluster, the ping command fails. This issue applies to IBM Cloud Pak for Data 4.5.1 and 4.5.2 on Red Hat® OpenShift® Kubernetes Service (ROKS). You might need to add the net.ipv4.ping_group_range kernel parameter to the Db2 StatefulSet.

Symptoms

  • A Db2 pod remains in a 0/1 Ready state.
  • When you run the oc logs -f <db2u_pod> command on a Db2 pod with a -db2u-0 suffix (or with any of the -db2u-0, -db2u-1, -db2u-(n+1) suffixes in Db2® Warehouse MPP), the pod log contains the message ping: socket: Operation not permitted:
    
    + echo 'Command failed. Attempt 120/120:'
    Command failed. Attempt 120/120:
    + sleep 5
    + true
    + timeout 1 ping -c 1 c-db2oltp-db2u-0.c-db2oltp-db2u-internal
    ping: socket: Operation not permitted
    + [[ 120 -lt 120 ]]
    + fail 'The command has failed after 120 attempts.'
    + echo The command has failed after 120 attempts.
    The command has failed after 120 attempts.
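
The log check above can be scripted. The following is a minimal shell sketch, not part of the documented procedure. The has_ping_failure function name is hypothetical, and the pod name in the usage comment is taken from the example log.

```shell
# Succeed (exit status 0) when the log text on stdin contains the ping failure
# message described in the symptoms.
has_ping_failure() {
  grep -q 'ping: socket: Operation not permitted'
}

# Hypothetical usage against a live cluster (adjust the pod name and namespace):
#   oc logs c-db2oltp-db2u-0 -n "${namespace}" | has_ping_failure && echo "affected"
```

The function only inspects text, so it can also be run against a saved copy of the pod log.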

Resolving the problem

The respective StatefulSet must be patched by completing the following steps:
  1. Find the respective db2ucluster resource name and create a db2ucluster environment variable:
    oc get db2ucluster --all-namespaces
    db2ucluster=db2ucluster_resource_name
  2. Configure namespace and statefulset variables:
    namespace=$(oc get db2ucluster --all-namespaces | grep "${db2ucluster}" | awk '{print $1}')
    statefulset=$(oc get sts -n "${namespace}" | grep "${db2ucluster}-db2u" | awk '{print $1}')
  3. To determine whether the db2ucluster deployment is set to restricted: true, run the following command:
    oc get db2ucluster ${db2ucluster} -n ${namespace} -o yaml | grep -i restricted

    If the command returns restricted: true, run the following command:
    oc exec -it ${statefulset}-0 -n ${namespace} -- /bin/bash -c "touch /db2u/tmp/.pause_probe"

    If the command returns restricted: false or returns nothing, proceed to the next step.


  4. Wait for the formation status of the Db2 StatefulSet to become OK before proceeding to the next step. To obtain the status, run the following command:
    oc get formations.db2u.databases.ibm.com ${db2ucluster} -n ${namespace} -o go-template='{{range .status.components}}{{printf "%s,%s,%s\n" .kind .name .status.state}}{{end}}' | column -s, -t
    Look for the following result:
    StatefulSet ${db2ucluster}-db2u OK
  5. There are two different commands to patch the StatefulSet. To determine which command to use, run the following command:
    oc get sts ${statefulset} -n ${namespace} -o yaml | grep sysctl

    If the command does not return any output, run the following patch command:
    oc patch sts ${statefulset} -n ${namespace} -p '{"spec": {"template":{"spec":{"securityContext":{"sysctls":[{"name": "net.ipv4.ping_group_range","value": "0 2147483647"}]}}}}}'

    If the command returns output such as sysctls: and nothing else, run the following patch command:
    oc patch sts ${statefulset} -n ${namespace} --type json -p '[{"op":"add","path":"/spec/template/spec/securityContext/sysctls/-","value":{"name":"net.ipv4.ping_group_range","value":"0 2147483647"}}]'

    The db2u-0 pod enters the Terminating state and restarts. Wait for the db2u-0 pod to reach the 1/1 Running state before proceeding to the next step.

  6. To confirm that the db2u-0 pod completed its entry point, check that the end of the pod log includes a timestamp block similar to the following:
    
    =========================================
    ### STARTTIME=20220817154656 ###"
    ### ENDTIME=20220817155051     ###"
    ### TIMEDIFF=395 s ###"
    =========================================
    + exit 0
  7. With the pod now in the 1/1 Running state, examine the StatefulSet YAML by running the following command:
    oc get sts ${statefulset} -n ${namespace} -o yaml | grep "net.ipv4.ping_group_range"
    Look for the following result:
    - name: net.ipv4.ping_group_range
    The StatefulSet definition now includes the following:

    sysctls:
    - name: net.ipv4.ping_group_range
      value: 0 2147483647

The db2u pods under the StatefulSet resource should now proceed beyond the initial ping command failure.
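
The formation check and the choice between the two patch commands can be sketched as shell functions. This is an illustration under stated assumptions, not the documented procedure. The function names are hypothetical. The "present" outcome, which skips patching when net.ipv4.ping_group_range is already listed, is an added safeguard that the steps above do not describe.

```shell
# Succeed when the formation output on stdin (the kind, name, and state columns
# printed by the formations command) reports StatefulSet <db2ucluster>-db2u as OK.
# Argument: <db2ucluster resource name>.
statefulset_formation_ok() {
  awk -v name="$1-db2u" '
    $1 == "StatefulSet" && $2 == name && $3 == "OK" { found = 1 }
    END { exit !found }
  '
}

# Read StatefulSet YAML on stdin and print which patch form applies:
#   merge   - no sysctl entry exists, so the whole sysctls list is supplied
#   json    - a sysctls list exists, so the parameter is appended to it
#   present - the parameter is already set (assumption: nothing to do)
choose_patch_type() {
  local yaml
  yaml=$(cat)
  case "${yaml}" in
    *net.ipv4.ping_group_range*) echo present ;;
    *sysctl*) echo json ;;
    *) echo merge ;;
  esac
}

# Apply the matching patch command from the steps above.
# Arguments: <statefulset> <namespace>.
patch_ping_group_range() {
  local sts="$1" ns="$2" kind
  kind=$(oc get sts "${sts}" -n "${ns}" -o yaml | choose_patch_type)
  case "${kind}" in
    merge)
      oc patch sts "${sts}" -n "${ns}" -p '{"spec": {"template":{"spec":{"securityContext":{"sysctls":[{"name": "net.ipv4.ping_group_range","value": "0 2147483647"}]}}}}}'
      ;;
    json)
      oc patch sts "${sts}" -n "${ns}" --type json -p '[{"op":"add","path":"/spec/template/spec/securityContext/sysctls/-","value":{"name":"net.ipv4.ping_group_range","value":"0 2147483647"}}]'
      ;;
    present)
      echo "net.ipv4.ping_group_range is already set on ${sts}"
      ;;
  esac
}

# Hypothetical usage, with the variables from the earlier steps:
#   patch_ping_group_range "${statefulset}" "${namespace}"
```

Keeping the decision in choose_patch_type separate from the oc calls makes it possible to test the logic against saved YAML before touching the cluster.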

Additional steps for InfoSphere Information Server (IIS) and Watson Knowledge Catalog (WKC) deployments

If your db2ucluster resource was created from an IIS or WKC custom resource, the extra databases might be missing. WKC pod c-db2oltp-wkc-db2u-0 might contain only the LINEAGE database and IIS pod c-db2oltp-iis-db2u-0 might contain only the XMETA database.

Symptoms
The pod with the wkc-db2u-init prefix might remain in the Running or Error state and never reach the Completed state. The pod with the iis-db2u-backup-restore-job prefix might remain in the Error state.

To recover and provision the missing databases, complete the following steps:
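  1. Edit the db2aaservice-databases-watch-cm ConfigMap and remove every "addDatabaseTriggered":true entry. To open the ConfigMap for editing, run the following command:
    oc edit cm db2aaservice-databases-watch-cm
    For example, the following content: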
    
    [{"addDatabase":true,"addDatabaseTriggered":true,"databaseConfigurationOverrides":
    {"db-cfg":{"BGDB":{"AUTHN_CACHE_USERS":"10","DFT_EXTENT_SZ":"8","LOGARCHMETH1":"OFF",
    "LOGFILSIZ":"10240","LOGSECOND":"128"},"ILGDB":{"DFT_EXTENT_SZ":"256","LOGARCHMETH1":"OFF",
    "LOGFILSIZ":"10240","LOGSECOND":"100"},"LINEAGE":{"DFT_EXTENT_SZ":"256","LOGARCHMETH1":"OFF"},
    "WFDB":{"DFT_EXTENT_SZ":"8","LOGARCHMETH1":"OFF","LOGSECOND":"100"}}},"dbname":
    "LINEAGE,BGDB,ILGDB,WFDB","dbtype":"db2oltp","instanceId":"db2oltp-wkc","namespace":"wkc"},
    {"addDatabase":true,"addDatabaseTriggered":true,"databaseConfigurationOverrides":
    {"db-cfg":{"DSODB":{"LOGARCHMETH1":"OFF","LOGFILSIZ":"1000","LOGPRIMARY":"50",
    "LOGSECOND":"200"},"IADB":{"LOGARCHMETH1":"OFF","LOGFILSIZ":"15000","LOGPRIMARY":"50",
    "LOGSECOND":"200"},"XMETA":{"LOGARCHMETH1":"OFF","LOGFILSIZ":"10000","LOGPRIMARY":"50",
    "LOGSECOND":"200"}}},"dbname":"XMETA,IADB,DSODB","dbtype":"db2oltp","instanceId":
    "db2oltp-iis","namespace":"wkc"}]
    becomes
    [{"addDatabase":true, "databaseConfigurationOverrides":{"db-cfg":{"BGDB":
    {"AUTHN_CACHE_USERS":"10","DFT_EXTENT_SZ":"8","LOGARCHMETH1":"OFF","LOGFILSIZ":"10240",
    "LOGSECOND":"128"},"ILGDB":{"DFT_EXTENT_SZ":"256","LOGARCHMETH1":"OFF","LOGFILSIZ":"10240",
    "LOGSECOND":"100"},"LINEAGE":{"DFT_EXTENT_SZ":"256","LOGARCHMETH1":"OFF"},"WFDB":
    {"DFT_EXTENT_SZ":"8","LOGARCHMETH1":"OFF","LOGSECOND":"100"}}},"dbname":"LINEAGE,BGDB,
    ILGDB,WFDB","dbtype":"db2oltp","instanceId":"db2oltp-wkc","namespace":"wkc"},{"addDatabase":true,
    "databaseConfigurationOverrides":{"db-cfg":{"DSODB":{"LOGARCHMETH1":"OFF","LOGFILSIZ":"1000",
    "LOGPRIMARY":"50","LOGSECOND":"200"},"IADB":{"LOGARCHMETH1":"OFF","LOGFILSIZ":"15000",
    "LOGPRIMARY":"50","LOGSECOND":"200"},"XMETA":{"LOGARCHMETH1":"OFF","LOGFILSIZ":"10000",
    "LOGPRIMARY":"50","LOGSECOND":"200"}}},"dbname":"XMETA,IADB,DSODB","dbtype":"db2oltp",
    "instanceId":"db2oltp-iis","namespace":"wkc"}]

    The missing databases are now provisioned, and the wkc-db2u-init pod enters the Completed state after provisioning finishes.

  2. If the iis-db2u-backup-restore-job pod persists in the Error state, restart the job by completing the following steps:
    1. Delete the job:
      oc delete job -n ${namespace} iis-db2u-backup-restore-job
    2. Restart the ibm-cpd-iis-operator operator by deleting its pod. To find the pod and its namespace, run the following command:
      oc get po -A -l app.kubernetes.io/name=ibm-cpd-iis-operator
      The command returns the pod and its namespace, similar to the following example result:
      
      NAMESPACE             NAME                                    READY   STATUS    RESTARTS   AGE
      ibm-common-services   ibm-cpd-iis-operator-5c788487cc-ncpz6   1/1     Running   0          119m
      To delete the pod that is returned in the example, run the following command. Only the pod is deleted, not the namespace:
      oc delete pod ibm-cpd-iis-operator-5c788487cc-ncpz6 -n ibm-common-services
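
The two recovery steps above can be sketched as text filters. This is a hedged illustration only. Both function names are hypothetical, and the first filter assumes the compact JSON layout shown in the example, with no whitespace inside the "addDatabaseTriggered":true member. The documented way to change the ConfigMap remains oc edit.

```shell
# Remove every "addDatabaseTriggered":true member from compact JSON on stdin,
# whether the member is followed or preceded by a comma.
strip_add_database_triggered() {
  sed -e 's/"addDatabaseTriggered":true,//g' \
      -e 's/,"addDatabaseTriggered":true//g'
}

# Print "<pod> <namespace>" for each row of "oc get po -A" table output on
# stdin, skipping the header line.
operator_pods() {
  awk 'NR > 1 { print $2, $1 }'
}

# Hypothetical usage against a live cluster:
#   oc get po -A -l app.kubernetes.io/name=ibm-cpd-iis-operator | operator_pods |
#     while read -r pod ns; do oc delete pod "${pod}" -n "${ns}"; done
```

The first filter can be used to preview the edited JSON before pasting it into the editor that oc edit opens.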

Important: The patch to the StatefulSet does not persist. If a later change to the db2ucluster resource takes effect, the same issue recurs and you must rerun the patch command.