Template event rules
Event management consists of creating rules that define the conditions to monitor and the actions to take when a condition is detected. The event manager uses these rules to determine its monitoring scope and its behavior when a rule is triggered. Creating event rules can be a complex process because you must define each condition clearly enough for the event manager to detect it, and you must define the actions to take when a match occurs.
To ease the process of creating event rules, the system supplies template event rules that you can copy and tailor for your environment. The template event rules define a set of common conditions to monitor, with actions that are based on the type or effect of the condition. The template event rules are not enabled by default, and you cannot change or delete them. You can, however, copy them as starter rules for more customized rules in your environment.
As a best practice, begin by copying and using the template rules. If you are familiar with event management and the operational characteristics of your Netezza Performance Server system, you can also create your own rules to monitor conditions that are important to you. To display the template event rules, use the nzevent show -template command.
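As a rough sketch of that workflow, the following shell snippet prints the nzevent show -template command mentioned above and then assembles a command that copies one template into an editable rule. The copy options (-useTemplate, -name, -newName, -on, -dst), the new rule name, and the email address are assumptions for illustration; confirm the exact syntax with the nzevent help on your system. The helpers only print the commands so that you can review them before running anything.

```shell
#!/bin/sh
# Sketch only. `nzevent show -template` comes from the text above; the
# `copy` options below are assumed and should be checked on your system.

# Print the command that lists the supplied template event rules.
show_templates_cmd() {
  printf '%s\n' 'nzevent show -template'
}

# Print (do not run) a command that copies a template into an editable rule.
# $1 = template name, $2 = new rule name, $3 = notification email address
copy_template_cmd() {
  printf 'nzevent copy -useTemplate -name %s -newName %s -on yes -dst %s\n' \
    "$1" "$2" "$3"
}

show_templates_cmd
copy_template_cmd Disk80PercentFull MyDisk80PercentFull admin@example.com
```

Printing the command instead of running it keeps the sketch safe to try on a workstation that has no nzevent CLI installed.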
Template event rule name | Description |
---|---|
Disk80PercentFull | Notifies you when a disk is more than 80 percent full. For more information, see Disk space threshold notification. |
Disk90PercentFull | Notifies you when a disk is more than 90 percent full. For more information, see Disk space threshold notification. |
HardwareNeedsAttention | Notifies you when the system detects a condition that can impact the hardware. For more information, see Hardware needs attention. |
HardwareRestarted | Notifies you when a hardware component successfully restarts. For more information, see Hardware restarted. |
HardwareServiceRequested | Notifies you of the failure of a hardware component, which most likely requires a service call, hardware replacement, or both. For more information, see Hardware service requested. |
HistCaptureEvent | Notifies you if there is a problem that prevents history-data files from being written to the staging area. |
HistLoadEvent | Notifies you if there is a problem that prevents the external tables that contain history data from being loaded to the history database. |
NPSNoLongerOnline | Notifies you when the system goes from the online state to another state. For more information, see System state changes. |
RegenFault | Notifies you when the system cannot set up a data slice regeneration. |
RunAwayQuery | Notifies you when a query exceeds a timeout limit. For more information, see Runaway query notification. |
SpuCore | Notifies you when the system detects that a SPU process has restarted and resulted in a core file. For more information, see SPU cores event. |
SystemOnline | Notifies you when the system is online. For more information, see System state changes. |
SystemStuckInState | Notifies you when the system is stuck in the Pausing Now state for longer than the timeout that is specified by sysmgr.pausingStateTimeout (420 seconds). For more information, see System state changes. |
TransactionLimitEvent | Sends an email notification when the number of outstanding transaction objects exceeds 90 percent of the available objects. For more information, see Transaction limits event. |
Netezza Performance Server might add new event types to monitor conditions on the system. These event types might not be available as templates, which means you must manually add a rule to enable them. For a description of more event types that can assist you with monitoring and managing the system, see Event types reference.
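For an event type that has no template, the rule must be added by hand. The following sketch only assembles such a command as text. The nzevent add subcommand and its options (-name, -eventType, -on, -notifyType, -dst, -msg), the rule name, and the email address are assumptions here, and EVENT_TYPE is a placeholder that you would replace with a name from the Event types reference.

```shell
#!/bin/sh
# Sketch only. The `add` subcommand and every option shown are assumed;
# verify them against the nzevent help on your system.

# Print (do not run) a command that adds a rule for a given event type.
# $1 = rule name, $2 = event type, $3 = email address, $4 = message text
add_rule_cmd() {
  printf "nzevent add -name %s -eventType %s -on yes -notifyType email -dst %s -msg '%s'\n" \
    "$1" "$2" "$3" "$4"
}

# EVENT_TYPE is a placeholder, not a real event type name.
add_rule_cmd MyRule EVENT_TYPE admin@example.com 'Event detected on the system.'
```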
Template name | Type | Notify | Severity | Effect | Action |
---|---|---|---|---|---|
Disk80PercentFull, Disk90PercentFull | hwDiskFull (Notice) | Admins, DBAs | Moderate to Serious | Full disk prevents some operations. | Reclaim space or remove unwanted databases or older data. For more information, see Disk space threshold notification. |
HardwareNeedsAttention | hwNeedsAttention | Admins, NPS® | Moderate | Possible change or issue that can start to affect performance. | Investigate and identify whether more assistance is required from Support. For more information, see Hardware needs attention. |
HardwareRestarted | hwRestarted (Notice) | Admins, NPS | Moderate | Any query or data load in progress is lost. | Investigate whether the cause is hardware or software. Check for SPU cores. For more information, see Hardware restarted. |
HardwareServiceRequested | hwServiceRequested (Warning) | Admins, NPS | Moderate to Serious | Any query or work in progress is lost. Disk failures initiate a regeneration. | Contact Netezza Performance Server Support. For more information, see Hardware service requested. |
HistCaptureEvent | histCaptureEvent | Admins, NPS | Moderate to Serious | The history-data collection process (alcapp) is unable to save captured history data in the staging area; alcapp stops collecting new data. | The staging area has reached the configured size threshold, or there is no available disk space in /nz/data. Either increase the threshold or free up disk space by deleting old files. |
HistLoadEvent | histLoadEvent | Admins, NPS | Moderate to Serious | The history-data loader process (alcloader) is unable to load history data into the history database; new history data is not available in reports until it can be loaded. | The history configuration might have changed, the history database might have been deleted, or there might be a session connection error. |
NPSNoLongerOnline, SystemOnline | sysStateChanged (Information) | Admins, NPS, DBAs | Varies | Availability status. | Depends on the current state. For more information, see System state changes. |
RegenFault | regenFault | Admins, NPS | Critical | Might prevent user data from being regenerated. | Contact Netezza Performance Server Support. For more information, see Regeneration errors. |
RunAwayQuery | runawayQuery (Notice) | Admins, DBAs | Moderate | Can consume resources that are needed for operations. | Determine whether to allow the query to run, and manage the workload. For more information, see Runaway query notification. |
SpuCore | spuCore | Admins, NPS | Moderate | A SPU core file was created. | The system created a SPU core file. See SPU cores event. |
SystemStuckInState | systemStuckInState (Information) | Admins, NPS | Moderate | A system is stuck in the Pausing Now state. | Contact Netezza Performance Server Support. See System state. |
TransactionLimitEvent | transactionLimitEvent | Admins, NPS | Serious | New transactions are blocked if the limit is reached. | Stop some existing sessions that might be old and require cleanup, or stop and start the Netezza Performance Server system to close all existing transactions. |