IBM Support

QRadar: Troubleshooting disk space issues on the / partition due to large files in /transient/monitor

Troubleshooting


Problem

The root (/) partition on a QRadar host can exceed 90% utilization because of large files in /transient/monitor. High utilization can cause problems such as disk space check failures during software upgrades and deployment configuration errors.

Symptom

Lack of free space in the "/" partition can cause the following symptoms:
[tomcat.tomcat] /console/JSON-RPC/QRadar.scheduleDeployment QRadar.scheduleDeployment] com.q1labs.configservices.util.ConfigServicesUtil: 
[INFO] [-/--] Deployment is blocked due to critical disk space issue
By default, the QRadar disk sentry check runs every 60 seconds and looks for high disk usage across the partitions. When a partition goes beyond the critical warning threshold, an alert is triggered for administrators to investigate.

NOTE: When a partition reaches the maximum threshold, system services are stopped to keep the partition from filling up completely and to prevent further issues. A maximum threshold notification is sent to the UI and is also logged in /var/log/qradar.log.
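To confirm from the command line that deployments were blocked, the log can be searched for the message shown in the symptom above. The following is a minimal sketch, not an IBM-provided tool: the function name count_disk_blocks is hypothetical, and it assumes the message text appears in the log exactly as quoted in this document.

```shell
#!/bin/bash
# Count log lines that report a deployment blocked by low disk space.
# $1 is the log file; it defaults to the path named in this document.
count_disk_blocks() {
    local log="${1:-/var/log/qradar.log}"
    # -F matches the message as a literal string; -c prints the match count.
    grep -cF "Deployment is blocked due to critical disk space issue" "$log"
}
```

Calling count_disk_blocks with no argument reads /var/log/qradar.log. A nonzero count suggests that the disk space check has blocked at least one deployment.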

Diagnosing The Problem

The first step in diagnosing the issue is to identify which partition is experiencing the problem.
1. Run the df -Th command to list the partitions and their usage. If the / partition is more than 90% utilized, move to the next step.
df -Th
2. Find the files on the "/" partition that consume a lot of disk space. The -exec form handles file names that contain spaces and runs nothing when no file matches.
find / -xdev -type f -size +100M -exec ls -lhSr {} +
3. Check whether the output lists files under /transient/monitor, as in this example:
-rw-r--r-- 1 root    root 1.1G Nov  7 20:15 /transient/monitor/07112024/ecs-ec_Nov07.log
-rw-r--r-- 1 root    root 1.3G Nov  6 23:59 /transient/monitor/06112024/ecs-ec-ingress_Nov06.log
-rw-r--r-- 1 root    root 1.8G Nov  7 20:15 /transient/monitor/07112024/ecs-ec-ingress_Nov07.log
These files are created by the monitor_script.sh script, which writes its output under the /transient/monitor directory. The script monitors a wide variety of Linux and QRadar processes.
Once you confirm that large files in /transient/monitor are the cause, continue with the resolution steps.
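The diagnostic steps above can be sketched as a few shell helpers. This is an illustration only: the function names are hypothetical, the 90% figure is the one used in step 1, and sort -h assumes GNU coreutils, which is an assumption about the host.

```shell
#!/bin/bash
# Extract the Use% value (without the % sign) from `df -P` output on stdin.
# df -P keeps each filesystem on one line, so the value is always field 5.
parse_use_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if the filesystem that holds path $1 is used above $2 percent.
is_over_threshold() {
    local used
    used=$(df -P "$1" | parse_use_percent)
    [ "$used" -gt "$2" ]
}

# Print the ten largest directories under $1 (default: /transient/monitor),
# staying on one filesystem (-x), with the largest entry last.
largest_monitor_dirs() {
    du -xh "${1:-/transient/monitor}" 2>/dev/null | sort -h | tail -n 10
}
```

For example, `is_over_threshold / 90 && largest_monitor_dirs` prints the directory sizes only when the root partition is above 90%.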

Resolving The Problem

1. Stop monitor_script.sh. Find its process ID (PID) with the first command, then kill the process with the second command, replacing <PID> with the value you found.
ps -eaf | grep -i monitor_script   # find the PID
kill -9 <PID>
2. Move or remove the large files.
mkdir -p /store/IBM_Support/
mv -v /full/path/to/<file> /store/IBM_Support/
Example:
mv -v /transient/monitor/07112024/ecs-ec_Nov07.log /store/IBM_Support/
Move the log files that already exist under /transient/monitor to /store/IBM_Support/, or remove them if they are no longer required.
3. Verify that the partition now has more free space.
df -Th /
Output Example:
Filesystem                Type  Size  Used Avail Use% Mounted on
/dev/mapper/rootrhel-root xfs    13G  4.1G  8.5G  33% /
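Steps 1 and 2 can be sketched as two shell helpers. This is a hedged illustration, not part of the IBM procedure: the function names are hypothetical, pgrep is used in place of ps and grep so that the search does not match itself, and the -t and -n options of mv assume GNU coreutils.

```shell
#!/bin/bash
# List the PIDs of processes whose command line mentions monitor_script.sh.
list_monitor_pids() {
    pgrep -f 'monitor_script\.sh'
}

# Move every *.log file under $1 (default: /transient/monitor), including
# files in dated subdirectories, into $2 (default: /store/IBM_Support).
move_monitor_logs() {
    local src="${1:-/transient/monitor}"
    local dest="${2:-/store/IBM_Support}"
    mkdir -p "$dest" || return 1
    # -n: never overwrite a file that already exists in the destination.
    find "$src" -xdev -type f -name '*.log' -exec mv -vn -t "$dest" {} +
}
```

Running move_monitor_logs with no arguments uses the paths from this document. Because all files land in a single directory, a file whose name already exists in the destination is left in place rather than overwritten.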
Result
The "/" partition no longer has disk space constraints. If usage reached the point where critical services were stopped, administrators must restart the services in the proper order with the following commands, and then wait 5 minutes:
 
IMPORTANT: While the QRadar core services restart, the QRadar UI, event processing, and the database are unavailable to all users. Administrators with strict outage policies are advised to complete the next step during a scheduled maintenance window for their organization.
 
systemctl stop hostcontext
systemctl stop tomcat
systemctl restart hostservices
systemctl start tomcat
systemctl start hostcontext
If the partition usage does not decrease or if the services do not start properly, administrators can contact QRadar Support for assistance.
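The restart sequence above can be wrapped so that it stops at the first command that fails. This is a minimal sketch under stated assumptions: the function name restart_core_services and the SYSTEMCTL override variable are hypothetical and exist only so the order can be previewed without touching real services.

```shell
#!/bin/bash
# Run the restart sequence from this document in order, stopping at the
# first command that fails. Set SYSTEMCTL=echo to preview the commands.
restart_core_services() {
    local systemctl="${SYSTEMCTL:-systemctl}"
    "$systemctl" stop hostcontext &&
    "$systemctl" stop tomcat &&
    "$systemctl" restart hostservices &&
    "$systemctl" start tomcat &&
    "$systemctl" start hostcontext
}
```

For example, `SYSTEMCTL=echo restart_core_services` prints the five steps in order without running them. After a real run, the 5-minute wait and the outage warning described above still apply.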
NOTE: monitor_script.sh does not run automatically. It monitors QRadar processes only after it is enabled manually, which should be done when suggested by IBM Support.

Document Location

Worldwide


Document Information

Modified date:
21 November 2024

UID

ibm17175329