Self Monitoring
The Guardium solution monitors itself to minimize disruptions and correct problems automatically whenever possible.
- Reports: Whether textual or graphical, reports are at the core of the Guardium® solution. By using Guardium's Query-Report Builder, a user can report on any of the self-monitoring data collected through associated domains and entities. Many of the predefined reports can be customized to provide finer granularity. Use the domain VA Tests to report on tests that are available for security assessments.
- Alerts: In addition to building reports, a user can define an alert against those reports through defined thresholds, indicating an exception or policy rule violation. These alerts can either be real-time or determined through historical analysis. These alerts can then trigger notification to users through SMTP, SNMP, syslog, or a custom Java™ class.
- Self-Monitoring Utility: Guardium runs an internal, always-on self-monitoring daemon on collectors and aggregators. It wakes up every 5 minutes and scans the system, checking components for optimal configuration and operational effectiveness, and makes repairs when necessary. For example, if the utility finds the web server down, it first validates a complete shutdown of the service, restarts the service, and then alerts an administrative user.
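The check-and-repair cycle described for the self-monitoring utility can be pictured with a small sketch. This is not Guardium's implementation: the `Component` class, its `is_healthy`, `stop`, and `start` hooks, and the `run_scan` function are hypothetical names chosen for illustration. Only the 5-minute interval and the shutdown, restart, alert order come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List

# Interval taken from the text: the utility wakes up every 5 minutes.
SCAN_INTERVAL_SECONDS = 5 * 60


@dataclass
class Component:
    """A monitored component with hypothetical health-check and repair hooks."""
    name: str
    is_healthy: Callable[[], bool]
    stop: Callable[[], None]
    start: Callable[[], None]


def run_scan(components: List[Component], notify: Callable[[str], None]) -> List[str]:
    """Run one scan pass and return the names of components that were restarted."""
    restarted = []
    for component in components:
        if component.is_healthy():
            continue
        # Same order as the documented example: make sure the service is
        # fully shut down, restart it, then alert an administrative user.
        component.stop()
        component.start()
        notify(f"{component.name} was down and has been restarted")
        restarted.append(component.name)
    return restarted


if __name__ == "__main__":
    # Simulated web server that starts in the "down" state.
    state = {"up": False}
    web = Component(
        name="Web Server",
        is_healthy=lambda: state["up"],
        stop=lambda: state.update(up=False),
        start=lambda: state.update(up=True),
    )
    # A real daemon would repeat this call every SCAN_INTERVAL_SECONDS.
    print(run_scan([web], notify=print))
```

Passing the notification function in as a parameter keeps the scan logic independent of the delivery channel, whether that is email, syslog, or something else.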
Components Monitored
Components | How to access |
---|---|
System Disk space (% full) | Manage > System View > System Monitor. Alert: You can use the Queries and Correlation Alerts, utilizing the Sniffer Buffer domain and Sniffer Buffer Usage entity to create alerts |
DB sizes and files on disk (/var) | Alerts are sent when the system predicts that a DB size or files on disk (/var) will reach 70% in the next 14 days. Alerts detail the predicted size and the largest tables or files. Alerts are also shown in the deployment health dashboard of the central manager. You can configure and disable the alerts; see Health analyzer APIs. |
CPU Load; Uptime and Reboots; Memory Usage; Monitoring Engine (sniffer) - Status: up/down/stuck/overloaded; CPU Usage; Memory Usage; Overload and delays (queues) | Reports > Guardium Operational Reports > Buff Usage Monitor. Alert: You can use the Queries and Correlation Alerts, utilizing the Sniffer Buffer domain and Sniffer Buffer Usage entity to create alerts |
Failed Logins | Manage > System View > System Monitor. Alert: You can use the Queries and Correlation Alerts, utilizing the Guardium Login domain and Guardium Users Login entity to create alerts |
Lost requests | Manage > Reports > Activity Monitoring > Dropped Requests. Alert: You can use the Queries and Correlation Alerts, utilizing the Exceptions domain and Exceptions entity to create alerts |
Change in data patterns | Reports > Real-time Operational Reports > Values Changed. Alert: See Viewing an Audit Process Definition for alert: Data Source Changes - alert on any data source changes |
Packet rates; Request rates; Ignored data | Reports > Guardium Operational Reports > Buffer Usage Monitor. Alert: You can use the Queries and Correlation Alerts, utilizing the Sniffer Buffer domain and Sniffer Buffer Usage entity to create alerts |
Scheduled Jobs Exceptions | Reports > Guardium Operational Reports > Scheduled Job Exceptions, or see Predefined admin Reports. Alert: You can use the Queries and Correlation Alerts, utilizing the Exceptions domain and Exception Type entity to create alerts |
Audit processes status | Reports > Guardium Operational Reports > Number of Active Audit Processes, or see Predefined admin Reports. Alert: You can use the Queries and Correlation Alerts, utilizing the Audit Process domain and Audit Process entity to create alerts |
Inspection Engine Changes | Reports > Activity Monitoring > S-TAP Configuration Change History. Alert: See Viewing an Audit Process Definition for alert: Inspection Engines and S-TAP - alert on any activity related to inspection engine and S-TAP configuration |
Guardium Users Activity - Login/logout | Reports > Guardium Operational Reports > Logins to Guardium, or see Predefined admin Reports. Alert: You can use the Queries and Correlation Alerts, utilizing the Guardium Login domain and SQL Guard Login entity to create alerts |
Failed Logins | Reports > Guardium Operational Reports > Logins to Guardium, or see Predefined admin Reports. Alert: See Viewing an Audit Process Definition for alert: Failed Logins To Guardium - alert if have more than 5 failed logins in the last 11 minutes, or select Tools > Report Building > drop-down Report Title: Guardium Logins. See Reports for additional information |
User Activity Audit Trail | Reports > Guardium Operational Reports > User Activity Audit Trail, or see Predefined admin Reports. Alert: You can use the Queries and Correlation Alerts, utilizing the Guardium Activity domain and SQL Guard User Activity Audit entity to create alerts. Note: User activity includes those instances where a user changes to the root shell, providing a log of their root activity. |
Creation/Deletion of Users/Roles | Reports > Guardium Operational Reports > User Activity Audit Trail, or see Predefined admin Reports. Alert: See Viewing an Audit Process Definition for alert: Guardium - Add/Remove Users - alert on any Addition or Removal of Guardium User |
Permissions monitoring | Reports > Guardium Operational Reports > Guardium Users, Guardium Roles, or Guardium Applications. Alert: You can use the Queries and Correlation Alerts, utilizing the Application domain and Application Data entity to create alerts |
S-TAP® Info (Central Manager) | Report: See S-TAP Reports. On a Central Manager, an additional report, S-TAP Info, is available. This report monitors S-TAPs of the entire environment. Upload this data using the Custom Table Builder. This report is the result of uploading data using remote sources on a Central Manager and using that data to see a consolidated view of S-TAPs. S-TAP info is a predefined custom domain which contains the S-TAP Info entity and is not modifiable like the entitlement domain. |
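Several alerts in the table are threshold rules evaluated over a time window, for example the Failed Logins To Guardium alert (more than 5 failed logins in the last 11 minutes). The sketch below only illustrates that kind of rule. It is not how Guardium evaluates correlation alerts, and the `should_alert` function and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Iterable


def should_alert(
    failed_logins: Iterable[datetime],
    now: datetime,
    window: timedelta = timedelta(minutes=11),  # window length from the table
    threshold: int = 5,                         # "more than 5" from the table
) -> bool:
    """Return True when more than `threshold` failures fall inside the window ending at `now`."""
    window_start = now - window
    recent = sum(1 for ts in failed_logins if window_start < ts <= now)
    return recent > threshold


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, 0)
    # Six failures, one per minute, all inside the 11-minute window.
    attempts = [now - timedelta(minutes=m) for m in range(1, 7)]
    print(should_alert(attempts, now))      # six failures exceed the threshold
    print(should_alert(attempts[:5], now))  # exactly five does not
```

The comparison is strict (`>`), so exactly five failures in the window does not raise the alert, matching the "more than 5" wording.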
Guardium nanny process
The Guardium nanny is an internal process that monitors key components and critical resources within the Guardium system to help keep them available and reliable. The nanny raises alerts when potential problems emerge. Nanny alerts go to syslog, and can also be forwarded and sent as emails to the administrator. In some cases, the nanny can take remedial action.
The monitored resources and components include:
- Web service monitoring - service port (default 8443) not responding or tomcat service is not up
  - syslog message
  - mail admin
  - will issue restarts of the web service
- Inspection Engine activity - snif overloaded, not responding, or failure
  - syslog message
  - mail admin
  - mail guardium support (optional)
  - tries to fix by restarting the snif under certain conditions
  - tries to respawn snif if process dies
- Diskspace utilization - alerts when > 75% on the critical partitions
  - syslog message
  - alert admin
  - performs preventive action by cleaning temporary files when over 95%
- Failed login (ssh) to the appliance - checks for ssh daemon's messages and alerts on failed ssh login attempts
  - mail admin (it's already in syslog)
- Monitor internal database (TURBINE) - verify service is up, status, and capacity utilization monitoring
  - syslog message
  - mail admin
  - restart service
- File System utilization - every five minutes, Nanny.pl checks file system at /var, warning alert when > 75% in the /var directory, critical alert and services stopped when > 90% in /var directory
  - syslog message
  - alert admin
  - Admin clean-up required, using CLI commands: show filesystem usage, clear filesystem dir, and restart stopped_services
- Remote syslog - you can send test messages to the rsyslog to verify that it is communicating with Guardium. To enable and configure the rsyslog test, use the API command modify_guard_param. To run the test, use the CLI command show remotelog status.
  - If the test message is successful, the response is success.
  - If the test message is unsuccessful, Guardium restarts the rsyslogd and checks with a test message. If the test message is successful, the response is success. If this test is unsuccessful, Guardium sends an alert to the admin user.
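The disk-space thresholds listed above map to actions in a simple way, which the following sketch shows. It is an illustration only, not the Nanny.pl logic: the function names and the returned action strings are hypothetical, and only the percentages (75% and 95% for critical partitions, 75% and 90% for /var) come from the list. `shutil.disk_usage` is the Python standard-library call used to read usage.

```python
import shutil
from typing import List


def percent_used(path: str) -> float:
    """Return the percentage of the file system at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def critical_partition_actions(percent: float) -> List[str]:
    """Actions for a critical partition: alert above 75%, clean up above 95%."""
    actions: List[str] = []
    if percent > 75:
        actions += ["syslog message", "alert admin"]
    if percent > 95:
        actions.append("clean temporary files")
    return actions


def var_actions(percent: float) -> List[str]:
    """Actions for /var: warning above 75%, critical alert and service stop above 90%."""
    if percent > 90:
        return ["critical alert", "stop services"]
    if percent > 75:
        return ["warning alert"]
    return []


if __name__ == "__main__":
    # "." is used so the example runs anywhere; the nanny checks /var.
    used = percent_used(".")
    print(f"{used:.1f}% used:", var_actions(used))
```

After services are stopped at the critical level, recovery is manual, using the CLI commands named in the list (show filesystem usage, clear filesystem dir, and restart stopped_services).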