IBM Support

QRadar: How to identify and remove large search data files from /transient/ariel_proxy.ariel_proxy_server/data/ directory

Question & Answer


Question

What troubleshooting steps can help resolve high disk usage on the /transient partition caused by large search data files?

Cause

The /transient partition in QRadar version 7.4 or 7.5 is the location that stores ariel cursors for searches and generated reports data.

By default, the QRadar disk sentry check runs every 60 seconds and looks for high disk usage across the transient partition. If partition usage exceeds 95%, QRadar critical services stop running.

To check the disk usage, use the du command:

du -sh /transient/* 2> /dev/null | egrep 'G|M'

In this scenario, investigation shows that /transient/ariel_proxy.ariel_proxy_server/data/ is the directory responsible for filling the /transient partition. Whether the directory holds a few large files or many small files, further investigation is needed. For example:

16G /transient/ariel_dr_backup
52G /transient/ariel_proxy.ariel_proxy_server <- largest directory in this example
3.9M /transient/assetprofiler.assetprofiler
11G /transient/monitor
68M /transient/spillover
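
The 95% threshold described above can also be checked from the command line. The following is a minimal sketch, not part of QRadar: check_usage is a hypothetical helper name, and it only reads the "Use%" column that df reports for a mount point. It does not reproduce the logic of the disk sentry itself.

```shell
#!/usr/bin/env bash
# check_usage is a hypothetical helper, not a QRadar command.
# Usage: check_usage MOUNT_POINT [THRESHOLD_PERCENT]
# Returns 0 if usage is at or below the threshold, 1 if it is above,
# and 2 if the usage could not be read.
check_usage() {
    local mount_point="$1"
    local threshold="${2:-95}"   # 95 matches the limit described in this document
    local used

    # df -P prints one line per file system; column 5 of line 2 is "Use%".
    used=$(df -P "$mount_point" 2> /dev/null | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    [ -n "$used" ] || return 2

    echo "$mount_point is ${used}% full (threshold ${threshold}%)"
    [ "$used" -le "$threshold" ]
}

# Example invocation on a QRadar host:
#   check_usage /transient 95 || echo "WARNING: /transient is above the threshold"
```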

For a quick way to identify the files that are responsible for filling up the partition, follow these steps:

  1. To find the largest files, you can run:
    find /transient -xdev -type f -size +200M -exec ls -lh {} +
    The file size option can be adjusted to narrow the results. In this example, the find command displays all files on the transient partition that are larger than 200 MB; files up to that size are considered normal.
  2. If you don't see any large files, the usage might come from a combination of many smaller files. Individual files can also grow to more than one terabyte. In the directory that is responsible for filling the transient partition, run the ls command to display all files sorted by size, from smallest to largest.

    1. Use the cd command to navigate to the directory:

      cd /transient/ariel_proxy.ariel_proxy_server/data/
    2. Type the command:

      ls -lhSr
  3. To get a file count for this directory, type:
    ls -1 | wc -l
  4. To find the users who created searches, use this AQL query:
    1. Log in to the QRadar UI.
    2. Click the Log Activity tab.
    3. Click Advanced Search.
    4. Add this statement to the search box.
      select qidname(qid) as 'Event', count(*) as 'Count', username as 'Username', 
      DATEFORMAT(deviceTime,'dd-MM-yyyy HH:mm:ss') as 'Time', UTF8(payload) as 'Payload' from events where 
      LOGSOURCENAME(logsourceid) like 'SIM Audit%' and (QIDNAME(qid) = 'Search Executed') Group by 
      Username ORDER BY 'Time' DESC ;
    5. Click Search.
    6. The advanced search displays users who created searches.

      Results
      If searches are not purged properly, they can build up over time and fill the partition. If there is a potential issue, the directory listing contains hundreds of thousands of files. After you identify the data files that cause the space issue, check the details of each file to see when it was generated. You can then match it against the Manage Search Results page of the log or flow viewer in the UI to determine whether a search has a large cursor. A large cursor is often the result of a user query that returned too much data. Examples are a search that returns all data for the last 180 days with no additional filters, or an app that creates a query with wide search criteria.
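
The size and count checks in the preceding steps can be combined into a per-search summary. The following is a minimal sketch under stated assumptions: summarize_searches is a hypothetical helper name, the files are assumed to be named SEARCH_ID.EXTENSION (as in the .data, .map, .off, .meta, and .desc files described in this document), and GNU find is assumed to be available for the -printf option.

```shell
#!/usr/bin/env bash
# summarize_searches is a hypothetical helper, not a QRadar command.
# Usage: summarize_searches [DIRECTORY] [TOP_N]
# Prints "TOTAL_BYTES FILE_COUNT SEARCH_ID" for the TOP_N largest searches.
summarize_searches() {
    local dir="${1:-/transient/ariel_proxy.ariel_proxy_server/data}"
    local top_n="${2:-10}"

    # %s is the file size in bytes and %f is the file name without the path.
    find "$dir" -maxdepth 1 -type f -printf '%s %f\n' |
        awk '{
                 id = $2
                 sub(/\.[^.]*$/, "", id)   # strip the extension to get the search ID
                 total[id] += $1
                 count[id]++
             }
             END {
                 for (id in total)
                     printf "%d %d %s\n", total[id], count[id], id
             }' |
        sort -rn |
        head -n "$top_n"
}

# Example invocation on a QRadar host:
#   summarize_searches /transient/ariel_proxy.ariel_proxy_server/data 20
```

Grouping by search ID shows the full footprint of each search, which a per-file listing can hide when the space is spread across several related files.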

Answer

Warning: Restarting ariel_proxy_server might stop running searches and reset the ariel queues. Always schedule a maintenance window for service restarts.

After you identify the search result files in the transient directory, you can delete them by using the rm command. For example:

rm 24fb58f4-d9f8-45a5-9de1-2e899a54b567.data

Additionally, the .map, .off, .meta, and .desc files with the same search ID as the .data file can be removed.
After you remove enough search data files to reduce the used space, restart the ariel_proxy_server service:

systemctl restart ariel_proxy_server

You can verify the transient partition free space:

df -h
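
The removal of a search's files described above can be sketched as a small helper. This is an illustration under stated assumptions, not an IBM-provided tool: remove_search_files is a hypothetical name, and the helper only prints what it would remove unless the third argument is the word delete. The service restart and the free space check are still done separately, as shown above.

```shell
#!/usr/bin/env bash
# remove_search_files is a hypothetical helper, not a QRadar command.
# Usage: remove_search_files SEARCH_ID [DIRECTORY] [delete]
# Without "delete" as the third argument, it performs a dry run.
remove_search_files() {
    local search_id="$1"
    local dir="${2:-/transient/ariel_proxy.ariel_proxy_server/data}"
    local mode="${3:-dry-run}"
    local ext file

    if [ -z "$search_id" ]; then
        echo "usage: remove_search_files SEARCH_ID [DIRECTORY] [delete]" >&2
        return 2
    fi

    # The extensions listed in this document for a single search ID.
    for ext in data map off meta desc; do
        file="$dir/$search_id.$ext"
        [ -f "$file" ] || continue
        if [ "$mode" = "delete" ]; then
            rm -v -- "$file"
        else
            echo "would remove: $file"
        fi
    done
}

# Example invocation on a QRadar host (dry run first, then delete):
#   remove_search_files 24fb58f4-d9f8-45a5-9de1-2e899a54b567
#   remove_search_files 24fb58f4-d9f8-45a5-9de1-2e899a54b567 /transient/ariel_proxy.ariel_proxy_server/data delete
```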

Reducing the Search Results Retention Period can help reduce the amount of data stored in the /transient partition if the setting is higher than the one-day default.

To do this:

  1. Log in to the QRadar UI as an admin user.
  2. Click the Admin tab.
  3. Click System Settings.
  4. Click Advanced.
  5. Click Search Results Retention Period.
  6. Adjust the Search Results Retention Period according to your organization's retention policy if it is set to more than 1 day.
    Note: The default setting is 1 day.


    Results 
    QRadar critical services continue to run properly. 


Document Information

Modified date:
16 December 2022

UID

ibm10882072