Template event rules

Event management consists of creating rules that define the conditions to monitor and the actions to take when a condition is detected. The event manager uses these rules to determine what it monitors and how it responds when a rule is triggered. Creating event rules can be a complex process because you must define each condition precisely enough for the event manager to detect it, and you must define the actions to take when a match occurs.

To ease the process of creating event rules, the system supplies template event rules that you can copy and tailor for your system. The template event rules define a set of common conditions to monitor, with actions that are based on the type or effect of the condition. The template event rules are not enabled by default, and you cannot change or delete them. Instead, you copy them as starter rules and customize the copies for your environment.

As a best practice, begin by copying the template rules and using the copies. If you are familiar with event management and the operational characteristics of your Netezza Performance Server system, you can also create your own rules to monitor conditions that are important to you. To display the template event rules, use the nzevent show -template command.
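
As a rough illustration of the copy-and-tailor workflow, the following Python sketch assembles nzevent command lines as argument lists without running them. The show -template form comes from the text above. The copy options (-useTemplate, -name, -newName, -on, -dst) are assumptions that you should verify against the nzevent command reference for your release, and the function names, the new rule name, and the email address are hypothetical.

```python
import shlex


def show_templates_command():
    """Return the argument list for listing the template event rules."""
    return ["nzevent", "show", "-template"]


def copy_template_command(template, new_name, destination, enable=True):
    """Build (but do not run) an nzevent command that copies a template rule.

    The option names below are assumptions; check them against the nzevent
    reference for your release before running the command.
    """
    if not template or not new_name:
        raise ValueError("template and new_name must be non-empty")
    return [
        "nzevent", "copy", "-useTemplate",
        "-name", template,          # template rule to copy
        "-newName", new_name,       # name of the customized copy
        "-on", "yes" if enable else "no",
        "-dst", destination,        # notification destination (hypothetical)
    ]


if __name__ == "__main__":
    # Print the commands so that they can be reviewed before anyone runs them.
    print(shlex.join(show_templates_command()))
    print(shlex.join(copy_template_command(
        "NPSNoLongerOnline", "MyNPSNoLongerOnline", "dba@example.com")))
```

Building the argument list separately from running it makes it easy to review the command before it reaches the system.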

Note: Release 5.0.x introduced new template events for the IBM® Netezza® 100, IBM Netezza 1000, and later systems. Previous event template rules specific to the z-series platform do not apply to the new models and were replaced by similar, new events.
The following table lists the predefined template event rules.
Table 1. Template event rules
Template event rule name | Description
Disk80PercentFull | Notifies you when a disk is more than 80 percent full. For more information, see Disk space threshold notification.
Disk90PercentFull | Notifies you when a disk is more than 90 percent full. For more information, see Disk space threshold notification.
HardwareNeedsAttention | Notifies you when the system detects a condition that can impact the hardware. For more information, see Hardware needs attention.
HardwareRestarted | Notifies you when a hardware component successfully restarts. For more information, see Hardware restarted.
HardwareServiceRequested | Notifies you of the failure of a hardware component, which most likely requires a service call, hardware replacement, or both. For more information, see Hardware service requested.
HistCaptureEvent | Notifies you if there is a problem that prevents history-data files from being written to the staging area.
HistLoadEvent | Notifies you if there is a problem that prevents the external tables that contain history data from being loaded to the history database.
NPSNoLongerOnline | Notifies you when the system goes from the online state to another state. For more information, see System state changes.
RegenFault | Notifies you when the system cannot set up a data slice regeneration.
RunAwayQuery | Notifies you when a query exceeds a timeout limit. For more information, see Runaway query notification.
SpuCore | Notifies you when the system detects that a SPU process has restarted and resulted in a core file. For more information, see SPU cores event.
SystemOnline | Notifies you when the system is online. For more information, see System state changes.
SystemStuckInState | Notifies you when the system is stuck in the Pausing Now state for longer than the timeout specified by the sysmgr.pausingStateTimeout setting (420 seconds). For more information, see System state changes.
TransactionLimitEvent | Sends an email notification when the number of outstanding transaction objects exceeds 90 percent of the available objects. For more information, see Transaction limits event.
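
The two disk templates differ only in their thresholds. The following Python sketch is a simplified illustration of which disk template would apply at a given usage level. It is not how the event manager evaluates rules, and the function name and the percentage input are hypothetical.

```python
# Thresholds taken from the template descriptions above:
# "more than 80 percent full" and "more than 90 percent full".
DISK_TEMPLATES = (
    ("Disk80PercentFull", 80.0),
    ("Disk90PercentFull", 90.0),
)


def matching_disk_templates(percent_full):
    """Return the names of the disk templates whose threshold is exceeded."""
    if not 0.0 <= percent_full <= 100.0:
        raise ValueError("percent_full must be between 0 and 100")
    # "More than" is a strict comparison, so exactly 80 percent matches nothing.
    return [name for name, threshold in DISK_TEMPLATES if percent_full > threshold]
```

For example, a disk at 85 percent matches only Disk80PercentFull, while a disk at 95 percent matches both templates.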

Netezza Performance Server might add new event types to monitor conditions on the system. These event types might not be available as templates, which means you must manually add a rule to enable them. For a description of more event types that can assist you with monitoring and managing the system, see Event types reference.

The action to take for an event often depends on the type of event (its effect on the system operations or performance). The following table lists some of the predefined template events and their corresponding effects and actions.
Table 2. Netezza Performance Server template event rules
Template name | Type | Notify | Severity | Effect | Action
Disk80PercentFull, Disk90PercentFull | hwDiskFull (Notice) | Admins, DBAs | Moderate to Serious | Full disk prevents some operations. | Reclaim space or remove unwanted databases or older data. For more information, see Disk space threshold notification.
HardwareNeedsAttention | hwNeedsAttention | Admins, NPS® | Moderate | Possible change or issue that can start to affect performance. | Investigate and identify whether more assistance is required from Support. For more information, see Hardware needs attention.
HardwareRestarted | hwRestarted (Notice) | Admins, NPS | Moderate | Any query or data load in progress is lost. | Investigate whether the cause is hardware or software. Check for SPU cores. For more information, see Hardware restarted.
HardwareServiceRequested | hwServiceRequested (Warning) | Admins, NPS | Moderate to Serious | Any query or work in progress is lost. Disk failures initiate a regeneration. | Contact Netezza Performance Server Support. For more information, see Hardware service requested.
HistCaptureEvent | histCaptureEvent | Admins, NPS | Moderate to Serious | The history-data collection process (alcapp) is unable to save captured history data in the staging area; alcapp stops collecting new data. | The size of the staging area reached the configured size threshold, or there is no available disk space in /nz/data. Either increase the size threshold or free up disk space by deleting old files.
HistLoadEvent | histLoadEvent | Admins, NPS | Moderate to Serious | The history-data loader process (alcloader) is unable to load history data into the history database; new history data is not available in reports until it can be loaded. | The history configuration might have been changed, the history database might have been deleted, or there might be a session connection error.
NPSNoLongerOnline, SystemOnline | sysStateChanged (Information) | Admins, NPS, DBAs | Varies | Availability status. | Depends on the current state. For more information, see System state changes.
RegenFault | regenFault | Admins, NPS | Critical | Might prevent user data from being regenerated. | Contact Netezza Performance Server Support. For more information, see Regeneration errors.
RunAwayQuery | runawayQuery (Notice) | Admins, DBAs | Moderate | Can consume resources that are needed for operations. | Determine whether to allow the query to continue running, and manage the workload. For more information, see Runaway query notification.
SpuCore | spuCore | Admins, NPS | Moderate | A SPU core file was created. | The system created a SPU core file. See SPU cores event.
SystemStuckInState | systemStuckInState (Information) | Admins, NPS | Moderate | A system is stuck in the Pausing Now state. | Contact Netezza Performance Server Support. See System state.
TransactionLimitEvent | transactionLimitEvent | Admins, NPS | Serious | New transactions are blocked if the limit is reached. | Stop some existing sessions that might be old and require cleanup, or stop and restart the Netezza Performance Server system to close all existing transactions.