Deployment health dashboard
Learn more about how the deployment health dashboard presents data from your entire Guardium deployment.
Data availability
Data source | Information type | Trigger criteria | Data latency | Data purge interval |
---|---|---|---|---|
System resources | System configuration, such as CPU cores, system memory, /var disk capacity | System does not meet minimum requirements | Updated whenever the user-interface server is started or restarted | Not applicable |
Unit utilization | Unit utilization data such as sniffer restarts, MySQL disk usage, and CPU load | Value exceeds unit utilization thresholds | Updated within 1 - 2 hours, based on the recommended configuration. For more information, see Configuring unit utilization data processing. | Unit utilization data is purged after 60 days. Sniffer buffer usage data is purged after 14 days. |
System self-monitoring | MySQL disk usage and system disk usage | Usage meets or exceeds default thresholds (75% for high severity, 90% for critical severity) | Updated every 5 - 10 minutes. For high-severity issues, if the same event occurs multiple times in a 15-minute period, the timestamp is updated to reflect the most recent instance. If the same event occurs after a 15-minute interval, a new entry is created with the most recent timestamp. For critical issues, every instance of an event is created with a unique timestamp. | High-severity issues are purged after 7 days. Critical issues are never purged. |
Correlation alerts | Triggered correlation alerts | An alert threshold is reached | Updated based on the alert notification frequency. For more information, see Correlation Alerts. | Data is purged after 7 days |
- Only data from systems that are running Guardium V10.1.2 and later is included on the deployment health dashboard.
- When you change the host name of a system, preexisting data that is associated with the original host name is no longer displayed on the deployment health dashboard.
- When a primary central manager transfers data to a backup central manager during a failover scenario, up to 30 minutes of data is unavailable to the deployment health dashboard.
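The system self-monitoring rules in the table above (75% and 90% thresholds, and the 15-minute handling of repeated high-severity events) can be sketched as follows. This is an illustrative model only, not Guardium code: the `Event` class, `classify`, and `record` are hypothetical names, and the sketch assumes that the 15-minute window is measured from the existing entry's current timestamp, which the source does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

HIGH_THRESHOLD = 75.0       # percent, default high-severity threshold
CRITICAL_THRESHOLD = 90.0   # percent, default critical threshold
DEDUP_WINDOW = timedelta(minutes=15)


@dataclass
class Event:
    """Hypothetical record of one self-monitoring issue."""
    system: str
    metric: str
    severity: str
    timestamp: datetime


def classify(usage_percent: float) -> Optional[str]:
    """Return 'critical', 'high', or None when usage is below both thresholds."""
    if usage_percent >= CRITICAL_THRESHOLD:
        return "critical"
    if usage_percent >= HIGH_THRESHOLD:
        return "high"
    return None


def record(events: List[Event], system: str, metric: str,
           usage_percent: float, now: datetime) -> None:
    """Store a reading, collapsing repeated high-severity events."""
    severity = classify(usage_percent)
    if severity is None:
        return
    if severity == "high":
        # Find the most recent high-severity entry for the same system and metric.
        for event in reversed(events):
            if (event.system, event.metric, event.severity) == (system, metric, "high"):
                if now - event.timestamp <= DEDUP_WINDOW:
                    # Same event within the window: only refresh the timestamp.
                    event.timestamp = now
                    return
                break  # Older than the window: fall through and add a new entry.
    # Critical events, and high-severity events outside the window, get new entries.
    events.append(Event(system, metric, severity, now))
```

In this model, two high-severity readings 10 minutes apart leave a single entry with the later timestamp, while every critical reading produces its own entry.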
Data presentation
Data source / Tile name | Resource requirements | Unit utilization issues | Unit utilization timecharts | Alerts (by category, name, severity, or system) | Events | High severity | Critical |
---|---|---|---|---|---|---|---|
System resources | Yes | | | | | Yes | |
Unit utilization | | Yes | Yes | | Yes | Yes | |
System self-monitoring | | | | | Yes | Yes (when usage meets or exceeds the 75% threshold) | Yes (when usage meets or exceeds the 90% threshold) |
Correlation alerts | | | | Yes | Yes | Yes | |
The following tiles are displayed by default: alerts by name, critical issues, events timeline, high severity issues, and unit utilization issues.
Dashboard filter
The dashboard filter allows quick filtering of the data based on Guardium systems, issue severity, and time period. Filter settings affect the data displayed on the entire dashboard unless noted otherwise.
The Guardium systems filter allows filtering the dashboard by unit type or by groups defined at .
- Outstanding or unresolved critical issues are displayed on the dashboard regardless of the Severity filter setting.
- For the unit utilization issues tile, the dashboard Severity filter is based on the overall unit utilization severity. For more information about how unit utilization severity is assigned, see Unit utilization issues.
The time filter determines the range of data that is displayed on the dashboard. Default settings allow time periods from 1 hour to 3 weeks, but custom time periods are also supported. The time filter does not apply to critical issues: critical issues are always displayed, regardless of the time filter setting.
Use the Add chart menu to add tiles to the dashboard or replace default tiles that you previously removed.
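The interaction between the filter settings and critical issues can be sketched as a small function. This is a simplified model under stated assumptions, not the dashboard's implementation: the `Issue` class and `apply_filter` are hypothetical names, the Guardium systems filter is assumed to still apply to critical issues, and resolved critical issues are assumed to follow the normal Severity and time filters.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Set


@dataclass
class Issue:
    """Hypothetical representation of one dashboard issue."""
    system: str
    severity: str          # for example "critical" or "high"
    timestamp: datetime
    resolved: bool = False


def apply_filter(issues: Iterable[Issue], systems: Set[str],
                 severities: Set[str], since: datetime) -> List[Issue]:
    """Return the issues that stay visible under the given filter settings."""
    visible = []
    for issue in issues:
        # Assumption: the systems filter applies to every issue.
        if issue.system not in systems:
            continue
        # Unresolved critical issues bypass the Severity and time filters.
        if issue.severity == "critical" and not issue.resolved:
            visible.append(issue)
            continue
        if issue.severity in severities and issue.timestamp >= since:
            visible.append(issue)
    return visible
```

With a one-day time filter and only high severity selected, a 30-day-old unresolved critical issue on a selected system is still returned, while a 30-day-old high-severity issue is not.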
Dashboard summary
- The Critical and High counts are not affected by adding or removing tiles from the dashboard.
- The counts on the dashboard summary bar reflect the dashboard filter settings.
Alerts by category, name, severity, or system
The deployment health dashboard supports several tiles based on Guardium correlation alerts: Alerts by category, Alerts by name, Alerts by severity, and Alerts by system. Add correlation alert tiles to the dashboard by using the Add chart menu.
Correlation alerts must be explicitly configured for inclusion on the deployment health dashboard. For information about configuring alerts for the dashboard, see Configuring a central manager for the deployment health views.
Resource requirements
The resource requirements tile indicates whether systems in a Guardium deployment meet the minimum hardware requirements for CPU, memory, and /var disk capacity. Any system resource that does not meet the minimum requirement is designated as a high-severity issue and displayed on both the resource requirements tile and the high severity issues tile.
Use the Include healthy systems check box on the details view of the tile to include all available data for the systems and time frame that are indicated on the dashboard filter bar. By including all available data, the Include healthy systems check box overrides the Severity setting of the overall dashboard filter. Systems without any detected health issues are excluded by default.
- System resource issues are not displayed in the Events timeline because they are not associated with a specific timestamp.
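The resource requirements check can be sketched as a comparison of each reported resource against a minimum. The numbers in `MINIMUMS` are placeholders chosen only for illustration; they are not Guardium's documented minimum requirements, and `resource_issues` is a hypothetical helper name.

```python
from typing import Dict, List

# Placeholder minimums for illustration only; NOT Guardium's documented values.
MINIMUMS: Dict[str, float] = {
    "cpu_cores": 4,
    "memory_gb": 24,
    "var_disk_gb": 300,
}


def resource_issues(system: str, resources: Dict[str, float]) -> List[dict]:
    """Return one high-severity issue per resource that is below its minimum."""
    issues = []
    for name, minimum in MINIMUMS.items():
        value = resources.get(name)
        if value is not None and value < minimum:
            # No timestamp is attached, which mirrors why these issues
            # do not appear in the Events timeline.
            issues.append({
                "system": system,
                "resource": name,
                "value": value,
                "minimum": minimum,
                "severity": "high",
            })
    return issues
```

A system that reports enough CPU cores and memory but too little /var disk capacity yields a single high-severity issue for `var_disk_gb`.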
Unit utilization issues
The unit utilization issues tile displays issues based on unit utilization thresholds. The issues that are displayed on the tile represent individual metrics that exceed their respective thresholds. The overall severity is assigned based on the highest severity issue that is found in all available metrics for an individual system in a specified time period. For more information about unit utilization thresholds, see Unit Utilization Level.
- The Period start time indicates the hourly period into which the CM buffer usage monitor data is rolled up, for example periods starting at 13:00, 12:00, and 11:00.
- The Timestamp indicates when the unit utilization levels data is added to the deployment health dashboard, either based on the unit utilization levels schedule or by using run once now.
Period start | Timestamp |
---|---|
13:00 | 14:40 |
12:00 | 13:40 |
11:00 | 12:40 |
Use the Include healthy systems check box on the details view of the tile to include all available data for the systems and time frame that are indicated on the dashboard filter bar. By including all available data, the Include healthy systems check box overrides the Severity setting of the overall dashboard filter. Systems without any detected health issues are excluded by default.
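The two ideas in this section, an overall severity taken from the worst individual metric and data rolled up into hourly periods, can be sketched as follows. The severity labels in `SEVERITY_ORDER` and both function names are assumptions made for illustration; the actual levels are described in Unit Utilization Level.

```python
from datetime import datetime
from typing import Dict

# Hypothetical severity labels, ordered from least to most severe.
SEVERITY_ORDER = ["none", "low", "medium", "high"]


def overall_severity(metric_severities: Dict[str, str]) -> str:
    """Return the highest severity found across all metrics for one system."""
    return max(metric_severities.values(),
               key=SEVERITY_ORDER.index, default="none")


def period_start(sample_time: datetime) -> datetime:
    """Truncate a sample time to the start of its hourly period."""
    return sample_time.replace(minute=0, second=0, microsecond=0)
```

For example, a system with a low CPU load severity and a high MySQL disk usage severity has an overall severity of high, and a sample taken at 13:27 belongs to the period that starts at 13:00.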
Unit utilization timecharts
Unit utilization timecharts allow the observation of trends in unit utilization data over time. Unit utilization timecharts can be configured to show multiple unit utilization metrics for a single Guardium system or to show a single unit utilization metric for multiple Guardium systems.
- The x-axis represents the period start time.
- When multiple metrics are being charted and the values for the metrics are in the same range, one y-axis is drawn. For example, both MySQL disk usage and /var disk usage are expressed as percentages and are drawn with the same y-axis.
- When multiple metrics are being charted and the values of the metrics are not similar, two y-axes are drawn. For example, MySQL disk usage is expressed as a percentage and flat log requests is expressed as an integer, so two y-axes are drawn: one displaying percentages and one displaying integers.
- If the value of a metric falls outside the range of a y-axis, that value is displayed at the bottom of the chart. This behavior accommodates scenarios where different metrics are expressed with similar units but significantly different values: for example, integers in the range of thousands versus millions.
  Tip: Create multiple time charts when values are in significantly different ranges.
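The y-axis rules above can be modeled by grouping the charted metrics by unit, so that metrics with the same unit share an axis. This is a minimal sketch, not the dashboard's charting logic: `assign_axes`, the metric names, and the unit labels are hypothetical.

```python
from typing import Dict, List


def assign_axes(metric_units: Dict[str, str]) -> Dict[str, List[str]]:
    """Group metrics by unit; each distinct unit is drawn on its own y-axis."""
    axes: Dict[str, List[str]] = {}
    for metric, unit in metric_units.items():
        axes.setdefault(unit, []).append(metric)
    return axes


# Two percentage metrics share one axis.
same_unit = assign_axes({
    "mysql_disk_usage": "percent",
    "var_disk_usage": "percent",
})

# A percentage metric and an integer metric need two axes.
mixed_units = assign_axes({
    "mysql_disk_usage": "percent",
    "flat_log_requests": "integer",
})
```

Here `same_unit` contains one axis and `mixed_units` contains two, matching the MySQL disk usage and flat log requests examples in the list.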