Template event rules

Event management consists of creating rules that define the conditions to monitor and the actions to take when a condition is detected. The event manager uses these rules to define its monitoring scope and its behavior when a rule is triggered. Creating event rules can be a complex process because you must define each condition clearly enough for the event manager to detect it, and you must define the actions to take when a match occurs.

To ease the process of creating event rules, template event rules are supplied that you can copy and tailor for your system. The template event rules define a set of common conditions to monitor, with actions that are based on the type or effect of the condition. The template event rules are not enabled by default, and you cannot change or delete them. Instead, you copy them as starter rules for more customized rules in your environment.

As a best practice, begin by copying and using the template rules. If you are familiar with event management and the operational characteristics of your Netezza Performance Server system, you can also create your own rules to monitor conditions that are important to you. You can display the template event rules by using the nzevent show -template command.
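The copy-and-tailor workflow can be sketched as a small Python helper that assembles the command lines without running them. Only nzevent show -template comes from this section. The copy subcommand and its -useTemplate, -name, -newName, -on, and -dst options, the new rule name, and the email address are assumptions made for illustration; check them against the nzevent command reference for your release before use.

```python
import shlex

# Template rule names as listed in Table 1 of this section.
TEMPLATE_RULES = {
    "Disk80PercentFull", "Disk90PercentFull", "HardwareNeedsAttention",
    "HardwareRestarted", "HardwareServiceRequested", "HistCaptureEvent",
    "HistLoadEvent", "NPSNoLongerOnline", "RegenFault", "RunAwayQuery",
    "SpuCore", "SystemOnline", "SystemStuckInState", "TransactionLimitEvent",
}


def show_templates_command():
    """Return the argument list that displays the template event rules."""
    return ["nzevent", "show", "-template"]


def copy_template_command(template, new_name, email):
    """Return an argument list that copies a template into an enabled rule.

    ASSUMPTION: the 'copy' subcommand and the -useTemplate, -name, -newName,
    -on, and -dst options are not described in this section; verify them
    in the nzevent command reference.
    """
    if template not in TEMPLATE_RULES:
        raise ValueError(f"unknown template event rule: {template}")
    return [
        "nzevent", "copy", "-useTemplate",
        "-name", template,
        "-newName", new_name,   # hypothetical name for the copied rule
        "-on", "yes",           # templates are not enabled by default
        "-dst", email,          # hypothetical notification address
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    print(shlex.join(show_templates_command()))
    print(shlex.join(copy_template_command(
        "Disk80PercentFull", "MyDisk80PercentFull", "dba@example.com")))
```

Building the argument list separately from running it lets you review the exact command before it reaches a production host, and the name check catches a mistyped template name early.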

Note: Release 5.0.x introduced new template events for the IBM® Netezza® 100, IBM Netezza 1000, and later systems. Previous event template rules specific to the z-series platform do not apply to the new models and were replaced by similar, new events.

The following table lists the predefined template event rules.

Table 1. Template event rules
Template event rule name	Description
Disk80PercentFull	Notifies you when a disk is more than 80 percent full. For more information, see Disk space threshold notification.
Disk90PercentFull	Notifies you when a disk is more than 90 percent full. For more information, see Disk space threshold notification.
HardwareNeedsAttention	Notifies you when the system detects a condition that can impact the hardware. For more information, see Hardware needs attention.
HardwareRestarted	Notifies you when a hardware component successfully restarts. For more information, see Hardware restarted.
HardwareServiceRequested	Notifies you of the failure of a hardware component, which most likely requires a service call, hardware replacement, or both. For more information, see Hardware service requested.
HistCaptureEvent	Notifies you if there is a problem that prevents history-data files from being written to the staging area.
HistLoadEvent	Notifies you if there is a problem that prevents the external tables that contain history data from being loaded to the history database.
NPSNoLongerOnline	Notifies you when the system goes from the online state to another state. For more information, see System state changes.
RegenFault	Notifies you when the system cannot set up a data slice regeneration.
RunAwayQuery	Notifies you when a query exceeds a timeout limit. For more information, see Runaway query notification.
SpuCore	Notifies you when the system detects that a SPU process has restarted and resulted in a core file. For more information, see SPU cores event.
SystemOnline	Notifies you when the system is online. For more information, see System state changes.
SystemStuckInState	Notifies you when the system is stuck in the Pausing Now state for longer than the timeout that is specified by the sysmgr.pausingStateTimeout setting (420 seconds). For more information, see System state changes.
TransactionLimitEvent	Sends an email notification when the number of outstanding transaction objects exceeds 90 percent of the available objects. For more information, see Transaction limits event.

Netezza Performance Server might add new event types to monitor conditions on the system. These event types might not be available as templates, which means you must manually add a rule to enable them. For a description of more event types that can assist you with monitoring and managing the system, see Event types reference.

The action to take for an event often depends on the type of event (its effect on the system operations or performance). The following table lists some of the predefined template events and their corresponding effects and actions.

Table 2. Netezza Performance Server template event rules
Template name	Type	Notify	Severity	Effect	Action
Disk80PercentFull, Disk90PercentFull	hwDiskFull (Notice)	Admins, DBAs	Moderate to Serious	Full disk prevents some operations.	Reclaim space or remove unwanted databases or older data. For more information, see Disk space threshold notification.
HardwareNeedsAttention	hwNeedsAttention	Admins, NPS®	Moderate	Possible change or issue that can start to affect performance.	Investigate and identify whether more assistance is required from Support. For more information, see Hardware needs attention.
HardwareRestarted	hwRestarted (Notice)	Admins, NPS	Moderate	Any query or data load in progress is lost.	Investigate whether the cause is hardware or software. Check for SPU cores. For more information, see Hardware restarted.
HardwareServiceRequested	hwServiceRequested (Warning)	Admins, NPS	Moderate to Serious	Any query or work in progress is lost. Disk failures initiate a regeneration.	Contact Netezza Performance Server Support. For more information, see Hardware service requested.
HistCaptureEvent	histCaptureEvent	Admins, NPS	Moderate to Serious	The history-data collection process (alcapp) is unable to save captured history data in the staging area; alcapp stops collecting new data.	The staging area has reached the configured size threshold, or there is no available disk space in /nz/data. Either increase the threshold or free up disk space by deleting old files.
HistLoadEvent	histLoadEvent	Admins, NPS	Moderate to Serious	The history-data loader process (alcloader) is unable to load history data into the history database; new history data is not available in reports until it can be loaded.	The history configuration might be changed, the history database might be deleted, or there might be some session connection error.
NPSNoLongerOnline, SystemOnline	sysStateChanged (Information)	Admins, NPS, DBAs	Varies	Availability status.	Depends on the current state. For more information, see System state changes.
RegenFault	regenFault	Admins, NPS	Critical	Might prevent user data from being regenerated.	Contact Netezza Performance Server Support. For more information, see Regeneration errors.
RunAwayQuery	runawayQuery (Notice)	Admins, DBAs	Moderate	Can consume resources that are needed for operations.	Determine whether to allow the query to continue running, and manage the workload. For more information, see Runaway query notification.
SpuCore	spuCore	Admins, NPS	Moderate	A SPU core file was created.	The system created a SPU core file. See SPU cores event.
SystemStuckInState	systemStuckInState (Information)	Admins, NPS	Moderate	A system is stuck in the Pausing Now state.	Contact Netezza Performance Server Support. See System state changes.
TransactionLimitEvent	transactionLimitEvent	Admins, NPS	Serious	New transactions are blocked if the limit is reached.	Stop some existing sessions that might be old and require cleanup, or stop and restart the Netezza Performance Server system to close all existing transactions.