IBM Support

Cloud Pak for Security: Cases Having Slow Performance with Gateway Timeout Errors

Troubleshooting


Problem

Navigating to the Cases application in Cloud Pak for Security (CP4S) results in longer-than-usual loading times, and gateway timeout errors intermittently appear on the page.

Symptom

[Image: screenshot of the symptom, image-20221107130556-1]

Cause

The error occurs because the Cases application Postgres pod does not have enough resources to handle the workload that it receives.

Diagnosing The Problem

Use the oc client to check the Cases Postgres CPU limit on the cluster. If the command reports a CPU limit of less than 2 CPU cores, gateway timeout errors are expected under production loads:
oc get po -lname=isc-cases-postgres -o yaml | grep -A6 -i resources
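
The grep output still has to be read by eye, and Kubernetes reports CPU either in cores ("2") or in millicores ("500m"). The following is a minimal sketch of how the 2-core check could be scripted. The function names are hypothetical, and the jsonpath expression assumes that the first container of the first matching pod is the database container.

```shell
#!/bin/sh
# Convert a Kubernetes CPU quantity ("500m", "2", "1.5") to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) printf '%s\n' "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
  esac
}

# Succeed (exit 0) when the quantity is below the 2-core threshold.
cpu_limit_too_low() {
  [ "$(cpu_to_millicores "$1")" -lt 2000 ]
}

# Hypothetical wrapper around the oc client.
# Assumption: container index 0 of the first pod is the database container.
check_cases_postgres_cpu() {
  limit=$(oc get po -l name=isc-cases-postgres \
    -o jsonpath='{.items[0].spec.containers[0].resources.limits.cpu}')
  if [ -z "$limit" ]; then
    echo "No CPU limit is set on the pod"
  elif cpu_limit_too_low "$limit"; then
    echo "CPU limit $limit is below 2 cores; timeouts are expected under load"
  else
    echo "CPU limit $limit is at least 2 cores"
  fi
}
```

After sourcing the functions in a shell that is logged in to the cluster, running check_cases_postgres_cpu prints one of the three messages.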

Resolving The Problem

The resolution requires the pgo client, which runs commands against the Crunchy Data Postgres operator. The pgo client can be run from inside the cases-operator pod, but it requires several environment variables and client certificates to be set up first, which is why the commands in steps 3 and 4 are long.
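
The commands in steps 3 and 4 differ only in the pgo command that they run; the setup and cleanup lines are identical. As an illustration only, the shared part could be factored into a helper like the sketch below. The names build_pgo_script and run_pgo are hypothetical and not part of the product, and the jsonpath lookup assumes that a single cases-operator pod is running.

```shell
#!/bin/sh
# Hypothetical helper: print the script that runs inside the cases-operator
# pod, with the pgo command to run passed in as the first argument.
build_pgo_script() {
  # Quoted heredoc: nothing is expanded locally, the text is emitted as-is.
  cat <<'SETUP'
export PGO_DIR=/tmp/pgo
export PGOUSER=$PGO_DIR/pgouser
export PGO_CA_CERT="$PGO_DIR/client.crt"
export PGO_CLIENT_CERT="$PGO_DIR/client.crt"
export PGO_CLIENT_KEY="$PGO_DIR/client.pem"
export PGO_APISERVER_URL="https://postgres-operator:8443"
mkdir -p $PGO_DIR
echo $(oc get secrets/pgo-user --template={{.data.username}} | base64 --decode):$(oc get secrets/pgo-user --template={{.data.password}} | base64 --decode) > $PGOUSER
oc cp $(oc get pods -lname=postgres-operator -o=name | sed "s/pod\///"):/tmp/server.key $PGO_CLIENT_KEY -c apiserver
oc cp $(oc get pods -lname=postgres-operator -o=name | sed "s/pod\///"):/tmp/server.crt $PGO_CLIENT_CERT -c apiserver
SETUP
  printf '%s\n' "$1"
  # Remove the temporary credentials after the pgo command has run.
  printf '%s\n' 'rm -rf $PGO_DIR'
}

# Hypothetical wrapper: run the generated script in the cases-operator pod.
# Assumption: exactly one pod matches the isc-cases-operator label.
run_pgo() {
  operator_pod=$(oc get pods -l name=isc-cases-operator \
    -o jsonpath='{.items[0].metadata.name}')
  oc exec -t "$operator_pod" -- bash -c "$(build_pgo_script "$1")"
}
```

With these definitions, run_pgo "pgo version" would correspond to step 3. Printing the script first with build_pgo_script "pgo version" makes it possible to review exactly what will run in the pod.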

  1. Check the current resources on the Postgres pod:
    oc get po -lname=isc-cases-postgres -o yaml | grep -A6 -i resources
            
  2. Edit the Cases custom resource (CR). The following command sets the Postgres CPU limit to 4 CPU cores:
    oc patch cases cases --type='json' -p='[{"op": "replace", "path": "/spec/cases/postgres/provision/crunchydata/primary/resources/limits/cpu", "value":"4"}]'
            
    The following command sets the Postgres CPU request to 2 CPU cores:
    oc patch cases cases --type='json' -p='[{"op": "replace", "path": "/spec/cases/postgres/provision/crunchydata/primary/resources/requests/cpu", "value":"2"}]'
            
  3. Get the crunchy Postgres version to test that the pgo client works (note: the cases operator pod must be running):
    oc exec -t  $( oc get pods -l name=isc-cases-operator | grep operator | awk '{print $1;}') -- bash -c '
           export PGO_DIR=/tmp/pgo
           export PGOUSER=$PGO_DIR/pgouser
           export PGO_CA_CERT="$PGO_DIR/client.crt"
           export PGO_CLIENT_CERT="$PGO_DIR/client.crt"
           export PGO_CLIENT_KEY="$PGO_DIR/client.pem"
           export PGO_APISERVER_URL="https://postgres-operator:8443"
           mkdir -p $PGO_DIR
           echo $(oc get secrets/pgo-user --template={{.data.username}} | base64 --decode):$(oc get secrets/pgo-user --template={{.data.password}} | base64 --decode) > $PGOUSER
           oc cp $(oc get pods -lname=postgres-operator -o=name | sed "s/pod\///"):/tmp/server.key $PGO_CLIENT_KEY -c apiserver
           oc cp $(oc get pods -lname=postgres-operator -o=name | sed "s/pod\///"):/tmp/server.crt $PGO_CLIENT_CERT -c apiserver
           pgo version
           
           rm -rf $PGO_DIR
           '
            
    NOTE: The output from this command is expected to be 4.5.1 or 4.5.2 for the crunchy Postgres version.

  4. Delete the crunchy Postgres pgcluster custom resource without deleting the PVC:
    oc exec -t  $( oc get pods -l name=isc-cases-operator | grep operator | awk '{print $1;}') -- bash -c '
           export PGO_DIR=/tmp/pgo
           export PGOUSER=$PGO_DIR/pgouser
           export PGO_CA_CERT="$PGO_DIR/client.crt"
           export PGO_CLIENT_CERT="$PGO_DIR/client.crt"
           export PGO_CLIENT_KEY="$PGO_DIR/client.pem"
           export PGO_APISERVER_URL="https://postgres-operator:8443"
           mkdir -p $PGO_DIR
           echo $(oc get secrets/pgo-user --template={{.data.username}} | base64 --decode):$(oc get secrets/pgo-user --template={{.data.password}} | base64 --decode) > $PGOUSER
           oc cp $(oc get pods -lname=postgres-operator -o=name | sed "s/pod\///"):/tmp/server.key $PGO_CLIENT_KEY -c apiserver
           oc cp $(oc get pods -lname=postgres-operator -o=name | sed "s/pod\///"):/tmp/server.crt $PGO_CLIENT_CERT -c apiserver
           pgo delete cluster isc-cases-postgres -n cp4s --keep-data --keep-backups --no-prompt
           
           rm -rf $PGO_DIR
           '
            
    NOTE: The cases operator re-creates the pgcluster resource by applying the settings from the cases custom resource.

  5. Wait for the Postgres pods to restart:
    oc get po -lname=isc-cases-postgres -w
            
  6. Verify that the Postgres pod resources are updated:
    oc get po -lname=isc-cases-postgres -o yaml | grep -A6 -i resources
            
  7. Wait for the cases-application pod to be ready (8 of 8 containers):
    oc get po -lname=isc-cases-application -w
            
  8. The Case management page is now available.
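
Steps 5 and 7 rely on watching the READY column (for example 8/8) until all containers are up. The sketch below shows one way that check could be scripted instead of watched. The function names and the 10-second polling interval are assumptions, and the loop inspects only the first pod that matches the label.

```shell
#!/bin/sh
# Succeed when a READY value such as "8/8" shows every container ready.
all_containers_ready() {
  case "$1" in
    */*) ;;
    *) return 1 ;;   # empty or malformed value: treat as not ready
  esac
  ready="${1%%/*}"
  total="${1##*/}"
  [ "$total" -gt 0 ] && [ "$ready" -eq "$total" ]
}

# Hypothetical polling loop; the 10-second interval is an arbitrary choice.
# The READY value is the second column of `oc get po --no-headers`.
wait_for_ready() {
  label="$1"
  until all_containers_ready "$(oc get po -l "$label" --no-headers | awk 'NR == 1 { print $2 }')"; do
    sleep 10
  done
  echo "Pod with label $label is ready"
}
```

For example, wait_for_ready name=isc-cases-postgres followed by wait_for_ready name=isc-cases-application would block until each pod reports all containers ready. The built-in oc wait --for=condition=Ready pod -l name=isc-cases-application --timeout=600s is an alternative that avoids a hand-written loop.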

Document Location

Worldwide

Line of Business: Security Software
Business Unit: IBM Software w/o TPS
Product: IBM Cloud Pak for Security
ARM Category: Cases
Platform: Platform Independent
Version: 1.7.1; 1.7.2; 1.8.0

Document Information

Modified date:
07 November 2022

UID

ibm16571723