Self Monitoring

The Guardium solution monitors itself to minimize disruptions and correct problems automatically whenever possible.

Guardium uses a three-pronged approach to ensure that it is available, is functioning properly, and has not been tampered with, and to alert users to problems:
  • Reports: Whether textual or graphical, reports are at the core of the Guardium® solution. By using Guardium's Query-Report Builder, a user can effectively report on any of the self-monitoring data collected through associated domains and entities. Many of the predefined reports can be enhanced through more detailed effort to provide higher levels of granularity. Use the domain VA Tests to report on tests that are available for security assessments.
  • Alerts: In addition to building reports, a user can define alerts against those reports by setting thresholds that indicate an exception or a policy rule violation. Alerts can be evaluated in real time or through historical analysis, and they can notify users through SMTP, SNMP, syslog, or a custom Java™ class.
  • Self-Monitoring Utility: Guardium runs an internal self-monitoring daemon (always running) on collectors and aggregators. It wakes up every 5 minutes and scans the system, checking components for optimal configuration and operational effectiveness and making repairs when necessary. For example, if the utility finds the Web Server down, it first validates a complete shutdown of the service, restarts the service, and then alerts an administrative user.
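As a rough illustration of the check, restart, and alert cycle described for the Self-Monitoring Utility, the following Python sketch models one scan pass. It is not Guardium code: the `Service` class, `scan_once`, `run_forever`, and the callables they take are hypothetical names introduced only for this example, and the real utility's internals are not documented here.

```python
import time
from dataclasses import dataclass
from typing import Callable, List

CHECK_INTERVAL_SECONDS = 5 * 60  # the utility wakes up every 5 minutes


@dataclass
class Service:
    """Hypothetical handle for one monitored component (not a Guardium API)."""
    name: str
    is_up: Callable[[], bool]
    stop: Callable[[], None]
    start: Callable[[], None]


def scan_once(services: List[Service], alert: Callable[[str], None]) -> List[str]:
    """Run one scan and return the names of the services that were restarted."""
    restarted = []
    for svc in services:
        if svc.is_up():
            continue
        svc.stop()   # first make sure the service is completely shut down
        svc.start()  # then restart it
        alert(f"{svc.name} was down and has been restarted")  # finally notify
        restarted.append(svc.name)
    return restarted


def run_forever(services: List[Service], alert: Callable[[str], None]) -> None:
    """Repeat the scan at the documented 5-minute interval."""
    while True:
        scan_once(services, alert)
        time.sleep(CHECK_INTERVAL_SECONDS)
```

Keeping a single pass (`scan_once`) separate from the timing loop makes the stop, start, alert ordering easy to exercise with fake services.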

Components Monitored

Table 1. Components monitored
Components How to access

System

Disk space(%full)

Manage > System View > System Monitor

Alert: You can use the Queries and Correlation Alerts, utilizing the Sniffer Buffer domain and Sniffer Buffer Usage entity to create alerts

DB sizes and files on disk (/var)
Alerts are sent when the system predicts that a DB size or files on disk (/var) will reach 70% in the next 14 days. Alerts detail the predicted size and the largest tables or files. Alerts are also shown in the deployment health dashboard of the central manager.
To configure or disable these alerts, see Health analyzer APIs.

CPU Load

Uptime and Reboots

Memory Usage

Monitoring Engine (sniffer) - Status: up/down/stuck/overloaded

CPU Usage

Memory Usage

Overload and delays (queues)

Reports > Guardium Operational Reports > Buffer Usage Monitor

Alert: You can use the Queries and Correlation Alerts, utilizing the Sniffer Buffer domain and Sniffer Buffer Usage entity to create alerts

Failed Logins

Manage > System View > System Monitor

Alert: You can use the Queries and Correlation Alerts, utilizing the Guardium Login domain and Guardium Users Login entity to create alerts

Lost requests

Manage > Reports > Activity Monitoring > Dropped Requests

Alert: You can use the Queries and Correlation Alerts, utilizing the Exceptions domain and Exceptions entity to create alerts

Change in data patterns

Reports > Real-time Operational Reports > Values Changed

Alert: See Viewing an Audit Process Definition for alert: Data Source Changes - alert on any data source changes

Packets rates

Request rates

Ignored data

Reports > Guardium Operational Reports > Buffer Usage Monitor

Alert: You can use the Queries and Correlation Alerts, utilizing the Sniffer Buffer domain and Sniffer Buffer Usage entity to create alerts

Scheduled Jobs Exceptions

Reports > Guardium Operational Reports > Scheduled Job Exceptions, or see Predefined admin Reports

Alert: You can use the Queries and Correlation Alerts, utilizing the Exceptions domain and Exception Type entity to create alerts.

Audit processes status

Reports > Guardium Operational Reports > Number of Active Audit Processes, or see Predefined admin Reports

Alert: You can use the Queries and Correlation Alerts, utilizing the Audit Process domain and Audit Process entity to create alerts

Inspection Engine Changes

Reports > Activity Monitoring > S-TAP Configuration Change History

Alert: See Viewing an Audit Process Definition for alert: Inspection Engines and S-TAP - alert on any activity related to inspection engine and S-TAP configuration

Guardium Users Activity - Login/logout

Reports > Guardium Operational Reports > Logins to Guardium, or see Predefined admin Reports

Alert: You can use the Queries and Correlation Alerts, utilizing the Guardium Login domain and SQL Guard Login entity to create alerts

Failed Logins

Reports > Guardium Operational Reports > Logins to Guardium, or see Predefined admin Reports

Alert: See Viewing an Audit Process Definition for alert: Failed Logins To Guardium - alert if there are more than 5 failed logins in the last 11 minutes. Alternatively, select Tools > Report Building and choose Guardium Logins from the Report Title drop-down. See Reports for additional information.

User Activity Audit Trail

Reports > Guardium Operational Reports > User Activity Audit Trail, or see Predefined admin Reports

Alert: You can use the Queries and Correlation Alerts, utilizing the Guardium Activity domain and SQL Guard User Activity Audit entity to create alerts

Note: User activity includes instances where a user changes to the root shell, which provides a log of their root activity.

Creation/Deletion of Users/Roles

Reports > Guardium Operational Reports > User Activity Audit Trail, or see Predefined admin Reports

Alert: See Viewing an Audit Process Definition for alert: Guardium - Add/Remove Users - alert on any Addition or Removal of Guardium User

Permissions monitoring

Reports > Guardium Operational Reports > Guardium Users, Guardium Roles, or Guardium Applications

Alert: You can use the Queries and Correlation Alerts, utilizing the Application domain and Application Data entity to create alerts

S-TAP® Info (Central Manager)

Report: See S-TAP Reports. On a Central Manager, an additional report, S-TAP Info, is available. This report monitors the S-TAPs of the entire environment. It is populated by uploading data from remote sources with the Custom Table Builder, which gives a consolidated view of S-TAPs on the Central Manager.

S-TAP Info is a predefined custom domain that contains the S-TAP Info entity. Like the entitlement domain, it cannot be modified.
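The table above says that alerts are sent when the system predicts that a DB size or the files on disk (/var) will reach 70% in the next 14 days. How Guardium makes that prediction is not described here. One simple way to picture it, purely as an assumption, is a straight-line fit through recent usage samples that is extrapolated forward. The function names and the sample format below are hypothetical.

```python
from typing import List, Optional, Tuple

THRESHOLD_PERCENT = 70.0  # alert level named in the table
HORIZON_DAYS = 14         # look-ahead window named in the table

Sample = Tuple[float, float]  # (day number, percent used); assumed format


def fit_line(samples: List[Sample]) -> Tuple[float, float]:
    """Least-squares line through the samples, returned as (slope, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_threshold(samples: List[Sample],
                         threshold: float = THRESHOLD_PERCENT) -> Optional[float]:
    """Days after the latest sample until the fitted line reaches the threshold.

    Returns 0.0 if the fitted usage is already at or above the threshold,
    and None if usage is flat or shrinking.
    """
    slope, intercept = fit_line(samples)
    last_day = max(x for x, _ in samples)
    current = slope * last_day + intercept
    if current >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - current) / slope


def should_alert(samples: List[Sample]) -> bool:
    """True when the threshold is predicted to be reached within the horizon."""
    days = days_until_threshold(samples)
    return days is not None and days <= HORIZON_DAYS
```

For example, usage growing by 2 points per day from 50% to 56% is projected to reach 70% in about 7 days, so `should_alert` returns `True`. Growth of 1 point per day from 50% to 53% is projected to take about 17 days, so no alert is raised yet.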

Guardium nanny process

The Guardium nanny is an internal process that monitors key components and critical resources within the Guardium system to ensure their availability and reliability. The nanny raises alerts when potential problems emerge. Nanny alerts go to syslog and can be forwarded or sent as email to the administrator. In some cases, the nanny can take remedial action.

The monitored resources and components include:

  • Web service monitoring - service port (default 8443) is not responding or the Tomcat service is not up
    • syslog message
    • mail admin
    • restarts the web service
  • Inspection Engine activity - snif is overloaded, not responding, or has failed
    • syslog message
    • mail admin
    • mail guardium support (optional)
    • tries to fix by restarting the snif under certain conditions
    • tries to respawn snif if process dies
  • Disk space utilization - alerts when > 75% on the critical partitions
    • syslog message
    • alert admin
    • performs preventive action by cleaning temporary files when over 95%
  • Failed login (ssh) to the appliance - checks for ssh daemon's messages and alerts on failed ssh login attempts
    • mail admin (it's already in syslog)
  • Monitor internal database (TURBINE) - verify service is up, status, and capacity utilization monitoring
    • syslog message
    • mail admin
    • restart service
  • File System utilization - every five minutes, Nanny.pl checks file system usage at /var. It raises a warning alert when usage exceeds 75% and a critical alert, with services stopped, when usage exceeds 90%.
    • syslog message
    • alert admin
    • Admin clean-up required, using CLI commands: show filesystem usage, clear filesystem dir, and restart stopped_services
  • Remote syslog - You can send test messages to rsyslog to verify that it is communicating with Guardium. To enable and configure the rsyslog test, use the API command modify_guard_param. To run the test, use the CLI command show remotelog status.
    • If the test message is successful, the response is success.
    • If the test message is unsuccessful, Guardium restarts the rsyslogd and checks with a test message. If the test message is successful, the response is success. If this test is unsuccessful, Guardium sends an alert to the admin user.
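The /var thresholds listed above under File System utilization (warning above 75%, critical with services stopped above 90%) can be sketched as a small classifier. This is an illustration only, not the logic of Nanny.pl. The function names are hypothetical, and `shutil.disk_usage` may report a slightly different percentage than the appliance's own tooling.

```python
import shutil
from typing import List

WARNING_PERCENT = 75.0   # warning alert above this level
CRITICAL_PERCENT = 90.0  # critical alert and services stopped above this level


def percent_used(path: str = "/var") -> float:
    """Percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def classify(percent: float) -> str:
    """Map a usage percentage to 'ok', 'warning', or 'critical'."""
    if percent > CRITICAL_PERCENT:
        return "critical"
    if percent > WARNING_PERCENT:
        return "warning"
    return "ok"


def actions_for(level: str) -> List[str]:
    """Actions taken from the list above for each level (labels are illustrative)."""
    if level == "critical":
        return ["syslog message", "alert admin", "stop services"]
    if level == "warning":
        return ["syslog message", "alert admin"]
    return []
```

A caller might evaluate `actions_for(classify(percent_used("/var")))` on each five-minute pass. After a critical event, the clean-up itself remains a manual step for the administrator, using the CLI commands named in the list.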