IBM Support

Db2 SORTHEAP errors when running a report in Guardium Insights

Troubleshooting


Problem

When you are running queries that involve sorts or aggregations (for example, sum and count), the Db2 back end runs out of sort heap memory allocation. If the sort heap is exhausted, these Db2 error codes appear in the logs for the reports-runner pods on the cluster:
 
SQLCODE=-955, SQLSTATE=57011, SQLERRMC=3

Symptom

To check for these messages, run this command in the Red Hat OpenShift command-line interface while you are using the namespace of the Guardium Insights deployment:
 
oc logs AAAAAA-reports-runner-XXXXXXXXXX-YYYYY
Where XXXXXXXXXX-YYYYY are the unique identifiers for the reports-runner pods and AAAAAA is the cluster resource.
Additionally, run the following commands from the Openshift command-line interface to see how many agents are allocated to your queries:
oc rsh c-staging-db2-db2u-0
su - db2inst1
db2 list applications
The last column in the output is the agent count. Large agent count numbers can be an indication of this issue. If you take the agent count and divide it by 8, the result should not be significantly higher than the number of processor cores that you have.
When running the oc logs command, be sure the run the list applications command at the same time as the queries are running:

Cause

If there are many CPU cores allocated to a small amount of memory, or if Db2 is not running on dedicated Red Hat OpenShift nodes, Db2 may allocate too many agents to query execution. This known bug is related to running Db2 in containers.

Resolving The Problem

  1. Ensure that the Db2 pods are running on dedicated Red Hat OpenShift nodes by enabling the tainting of data nodes using the taint (-t) option of the preInstall.sh installation script. There are additional considerations related to tainting that may require a larger cluster. It is recommended that you use the feature with caution.
  2. For the nodes that the Db2 pods are running on, ensure that at least a 1 CPU core to 8-16 GB memory allocation ratio is specified. This should be at the node level, not by adjusting the limit/request parameters for the Db2 pods.
  3. If the environment has fewer than six worker nodes, or you are unable to ensure the recommended memory allocation ratio, a Db2 configuration parameter can be modified to inform Db2 of how much concurrency to allow in query execution. To change the configuration, run these commands in the Red Hat OpenShift command line:
     
    oc rsh c-staging-db2-db2u-0
    su - db2inst1
    db2 connect to bludb
    db2 update db cfg using DFT_DEGREE XXXX immediate

    Where XXXX is the amount of memory allocated to the Db2 pods, divided by 16 and rounded up to the nearest whole number. The change requires no downtime and can be modified easily, so you may want to experiment with other values as well.

Document Location

Worldwide

[{"Type":"MASTER","Line of Business":{"code":"LOB24","label":"Security Software"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSNH7ER","label":"IBM Security Guardium Insights for IBM Cloud Pak for Security"},"ARM Category":[{"code":"a8m0z0000001erNAAQ","label":"REPORTS"}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]

Document Information

Modified date:
19 April 2022

UID

ibm16573087