Purging data to resolve a full disk when the GUI is down
Learn how to identify a full disk, and what to do about it.
About this task
Two areas on a Guardium appliance can fill up, and either one can cause the GUI to stop:
- The internal database
- The filesystem itself (usually the /var partition)
Auto stop services: By default, the appliance stops services, including the GUI and the sniffer, when
the database or the filesystem reaches 90% full. An internal 'nanny' process checks the status every 5
minutes and takes action. You can check the current setting in the CLI:
xxx.xxx.xxx.com> show auto_stop_services_when_full
See Configuration and control CLI commands.
Important notes for auto stop services:
- If auto_stop_services_when_full is switched off, the system can fill to 100%, which prevents all access to the system.
- Never set auto_stop_services_when_full to off, except temporarily in the specific circumstance described in the answer section.
- You must stop inspection-core before setting auto_stop_services_when_full to off. Stopping inspection-core prevents the system from filling any further.
- If you attempt to restart stopped services before the space issue is resolved, the services
stop again after 5 minutes. The filesystem and database usage keep increasing in that time.
Command to restart stopped services:
restart stopped_services
Warning: Do not use this command until you are sure that space has been recovered.
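The warning above can be expressed as a simple pre-check. The following is a minimal sketch, not part of the appliance: the function name safe_to_restart and its parameters are hypothetical, and it assumes that you have already read the database and /var usage percentages by hand, using the diagnostic steps in this topic. The 90% value is the default auto stop level stated above.

```python
# Hypothetical helper: not a Guardium command or API.
AUTO_STOP_PERCENT = 90  # default auto stop level described in this topic


def safe_to_restart(db_used_percent, var_used_percent, limit=AUTO_STOP_PERCENT):
    """Return True only if both usage figures are below the auto stop level.

    If either figure is at or above the level, the nanny process is expected
    to stop the services again on its next 5-minute check.
    """
    return db_used_percent < limit and var_used_percent < limit


if __name__ == "__main__":
    # Database still at 93%, /var at 65%: space has not been recovered.
    print(safe_to_restart(93, 65))
    # Both figures are below the auto stop level.
    print(safe_to_restart(60, 65))
```

Being below the level is only the minimum condition. If usage is still close to 90%, the services are likely to stop again soon after they are restarted.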
Diagnosing the problem
Internal database: As user cli, check whether the internal database is full with this command:
support show db-status used %
If the result is 90% or more, auto stop services should already have stopped the GUI. The database
can show more than 100% used. This happens when the database files consume more than the size limit
defined on the system (50% of disk space for collectors, 75% for aggregators). It can occur if system
services are not stopped when the database reaches 90%, or if they are restarted manually.
Internal filesystem: To check whether the /var partition (filesystem) is 90% full or more, run a
must gather from the CLI:
support must_gather system_db_info
Use fileserver to check the df -k output in the system_output.txt file. You can view the file in
fileserver at must_gather/system_logs/system_output.txt, or extract it from the
system.<datetime>.tgz file once you have downloaded it.
Inside the system_output.txt file you can find the detail. In this example, /var is only 65% full:
==========2016-11-30 08:36:09 ... Output of df command:==========
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda3 10154020 2272668 7357232 24% /
/dev/sda2 28571320 17384504 9712052 65% /var
/dev/sda1 505604 33476 446024 7% /boot
tmpfs 6169768 0 6169768 0% /dev/shm
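If you have several system_output.txt files to review, the df section can be checked with a short script. The following is a minimal sketch that you run on your own workstation, not on the appliance. The function names parse_df and full_partitions are hypothetical, and the script assumes that the df rows have the six-column layout shown in the example above.

```python
def parse_df(text):
    """Return {mount_point: capacity_percent} from df -k style output."""
    usage = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 6 fields, with a capacity column such as "65%".
        # The header row and the "Output of df command" banner do not match.
        if len(fields) == 6 and fields[4].endswith("%") and fields[4][:-1].isdigit():
            usage[fields[5]] = int(fields[4][:-1])
    return usage


def full_partitions(usage, threshold=90):
    """Return the mount points whose usage is at or above the threshold."""
    return sorted(mount for mount, pct in usage.items() if pct >= threshold)


# The df output from the example in this topic.
SAMPLE = """\
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda3 10154020 2272668 7357232 24% /
/dev/sda2 28571320 17384504 9712052 65% /var
/dev/sda1 505604 33476 446024 7% /boot
tmpfs 6169768 0 6169768 0% /dev/shm
"""

if __name__ == "__main__":
    usage = parse_df(SAMPLE)
    print("/var usage:", usage["/var"])
    print("At or above 90%:", full_partitions(usage))
```

With the sample above, no partition reaches the 90% level, so the list of full partitions is empty.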
Before the database or the filesystem fills to the "auto stop" level, you should receive warnings
in the system log (messages file). To check the latest messages file, run a must_gather command and
look inside the compressed file that it creates:
support must_gather system_db_info
Sample messages for filesystem space problems
In this example, the messages file shows that the filesystem is full (DB space might also be full):
Nov 23 12:00:13 xxx nanny:[2986]: Nanny is awake.
Nov 23 12:00:13 xxx nanny:[2986]: DB parameters - status 2 db warn level 75 db critical level 90 db auto stop 1.
Nov 23 12:00:13 xxx nanny:[2986]: It is in critical ..Used space on your system is almost full(currently at 93%). Please use CLI command 'show filesystem usage' to see which directories take too much space to target your clean up.
Nov 23 12:00:13 xxx nanny:[2986]: Email has been sent to admin (admin@admin.com) on the out-of-space issue.
Nov 23 12:00:13 xxx nanny:[2986]: Stopping Guardium Services until used space on your system has been cleaned up.
This example shows both the DB and the filesystem (/var partition) nearly full, before services are
automatically stopped:
Nov 23 14:13:12 xxx nanny:[10070]: TURBINE DB is configured after nap
Nov 23 14:13:12 xxx nanny:[10070]: Nanny is awake.
Nov 23 14:13:12 xxx nanny:[10070]: DB parameters - status 1 db warn level 75 db critical level 90 db auto stop 1.
Nov 23 14:13:12 xxx nanny:[10070]: Used space on your system is filling up (currently at 88%). Please use CLI command 'show filesystem usage' to see which directories take too much space to target your clean up.
Nov 23 14:13:12 xxx nanny:[10070]: Email has been sent to admin (admin@admin.com) on the out-of-space issue.
Nov 23 14:13:12 xxx nanny:[10070]: A partition is rapidly filling up. Partition /dev/sda2 (/var) on xxx is on 88 percent usage. Doing preventive cleaning.
Nov 23 14:13:13 xxx root: 64 bit big mem 24554360 limit is 12277180
Nov 23 14:13:13 xxx nanny:[15110]: Hunting version 35, every 300, for more than 12277180 kb.
Nov 23 14:13:13 xxx nanny:[15110]: Also checking tomcat.
Nov 23 14:13:13 xxx nanny:[15110]: Nanny set memory limit to 12277180
Nov 23 14:13:13 xxx nanny:[15110]: TURBINE DB Already configured before nap
Nov 23 14:13:13 xxx nanny:[15110]: Going for my initial nap.
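A long messages file can be scanned for these nanny entries with a short script. The following is a minimal sketch, run on a workstation against a messages file extracted from the must_gather output. The function name scan_messages is hypothetical, and the patterns are taken only from the sample messages in this topic, so the wording in other versions might differ.

```python
import re

# Pattern based on the sample nanny messages shown in this topic.
USAGE_RE = re.compile(r"currently at (\d+)%")


def scan_messages(lines):
    """Return (usage_percents, services_stopped) from nanny log lines."""
    usages = []
    stopped = False
    for line in lines:
        if "nanny" not in line:
            continue  # ignore entries from other processes
        match = USAGE_RE.search(line)
        if match:
            usages.append(int(match.group(1)))
        if "Stopping Guardium Services" in line:
            stopped = True
    return usages, stopped


# Lines copied from the first sample in this topic (filesystem full).
FULL_SAMPLE = [
    "Nov 23 12:00:13 xxx nanny:[2986]: Nanny is awake.",
    "Nov 23 12:00:13 xxx nanny:[2986]: It is in critical ..Used space on your "
    "system is almost full(currently at 93%). Please use CLI command "
    "'show filesystem usage' to see which directories take too much space "
    "to target your clean up.",
    "Nov 23 12:00:13 xxx nanny:[2986]: Stopping Guardium Services until used "
    "space on your system has been cleaned up.",
]

if __name__ == "__main__":
    usages, stopped = scan_messages(FULL_SAMPLE)
    print("Reported usage:", usages)
    print("Services stopped:", stopped)
```

To scan a real file, pass an open file object instead of the sample list, for example scan_messages(open("messages")), where the file name is whatever you extracted locally.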