Error notification

The Error Notification object class specifies the conditions and actions to be taken when errors are recorded in the system error log. The user specifies these conditions and actions in an Error Notification object.

Each time an error is logged, the error notification daemon determines whether the error log entry matches the selection criteria of any Error Notification object. If matches exist, the daemon runs the programmed action, also called a notify method, for each matched object.

The Error Notification object class is located in the /etc/objrepos/errnotify file. Error Notification objects are added to the object class by using Object Data Manager (ODM) commands. Only processes running with root user authority can add objects to the Error Notification object class. Error Notification objects contain the following descriptors:

en_alertflg
Identifies whether the error can be alerted. This descriptor is provided for use by alert agents associated with network management applications that use the SNA Alert Architecture. The valid en_alertflg descriptor values are:
TRUE
can be alerted
FALSE
cannot be alerted
en_class
Identifies the class of the error log entries to match. The valid en_class descriptor values are:
H
Hardware Error class
S
Software Error class
O
Messages from the errlogger command
U
Undetermined
en_crcid
Specifies the error identifier associated with a particular error. An error identifier can be any numeric value that is valid as a Predefined Attribute (PdAt) object class attribute value. The errpt command displays error identifiers as hexadecimal. For example, to select an entry that the errpt command displays with IDENTIFIER: 67581038, specify en_crcid = 0x67581038.
en_dup
If set, identifies whether duplicate errors as defined by the kernel should be matched. The valid en_dup descriptor values are:
TRUE
Error is a duplicate.
FALSE
Error is not a duplicate.
en_err64
If set, identifies whether errors from a 64-bit or 32-bit environment should be matched. The valid en_err64 descriptor values are:
TRUE
Error is from a 64-bit environment.
FALSE
Error is from a 32-bit environment.
en_label
Specifies the label associated with a particular error identifier as defined in the output of the errpt -t command.
en_method
Specifies a user-programmable action, such as a shell script or command string, to be run when an error matching the selection criteria of this Error Notification object is logged. The error notification daemon uses the sh -c command to execute the notify method.

The error notification daemon automatically expands the following keywords as arguments to the notify method:

$1
Sequence number from the error log entry
$2
Error ID from the error log entry
$3
Class from the error log entry
$4
Type from the error log entry
$5
Alert flags value from the error log entry
$6
Resource name from the error log entry
$7
Resource type from the error log entry
$8
Resource class from the error log entry
$9
Error label from the error log entry
en_name
Uniquely identifies the object. This unique name is used when removing the object.
en_persistenceflg
Designates whether the Error Notification object should be automatically removed when the system is restarted. For example, to avoid erroneous signaling, Error Notification objects containing methods that send a signal to another process should not persist across system restarts. The receiving process and its process ID do not persist across system restarts.

The creator of the Error Notification object is responsible for removing the object at the appropriate time. If the creating process terminates without removing the Error Notification object, the en_persistenceflg descriptor ensures that the obsolete object is removed when the system is restarted.

The valid en_persistenceflg descriptor values are:

0
non-persistent (removed at boot time)
1
persistent (persists through boot)
en_pid
Specifies a process ID (PID) for use in identifying the Error Notification object. Objects that have a PID specified should have the en_persistenceflg descriptor set to 0.
en_rclass
Identifies the class of the failing resource. For the hardware error class, the resource class is the device class. The resource class is not applicable for the software error class.
en_resource
Identifies the name of the failing resource. For the hardware error class, a resource name is the device name.
en_rtype
Identifies the type of the failing resource. For the hardware error class, a resource type is the device type by which a resource is known in the devices object class.
en_symptom
Enables notification of an error accompanied by a symptom string when set to TRUE.
en_type
Identifies the severity of error log entries to match. The valid en_type descriptor values are:
INFO
Informational
PEND
Impending loss of availability
PERM
Permanent
PERF
Unacceptable performance degradation
TEMP
Temporary
UNKN
Unknown
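
A notify method script can consume the keywords listed under en_method as ordinary shell arguments. The following is a minimal sketch, assuming the en_method string passes all nine keywords in their documented order, for example en_method = "/usr/local/bin/notify_fmt $1 $2 $3 $4 $5 $6 $7 $8 $9". The script path and the function name format_notification are hypothetical and are not part of the system.

```shell
#!/bin/sh
# Hypothetical notify method helper; not part of AIX.
# Assumes the nine en_method keywords are passed in their documented order.
format_notification() {
    if [ "$#" -ne 9 ]; then
        echo "usage: format_notification seq id class type alert rname rtype rclass label" >&2
        return 2
    fi
    # $1 sequence number, $2 error ID, $3 class, $4 type, $5 alert flags,
    # $6 resource name, $7 resource type, $8 resource class, $9 error label
    printf 'seq=%s id=%s class=%s type=%s alert=%s resource=%s rtype=%s rclass=%s label=%s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9"
}

# When run as a script with arguments, format them.
# With no arguments (for example, when sourced) only the function is defined.
if [ "$#" -gt 0 ]; then
    format_notification "$@"
fi
```

The one-line output could be appended to a log file or piped to mail in the same way as the examples that follow. A method that needs only the formatted log entry can pass $1 alone and call errpt -a -l with it.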

Examples

  1. To create a notify method that mails a formatted error entry to root each time a disk error of type PERM is logged, create a file called /tmp/en_sample.add containing the following Error Notification object:
    errnotify:
        en_name = "sample"
        en_persistenceflg = 0
        en_class = "H"
        en_type = "PERM"
        en_rclass = "disk"
        en_method = "errpt -a -l $1 | mail -s 'Disk Error' root"
    To add the object to the Error Notification object class, type:
    odmadd /tmp/en_sample.add
    The odmadd command adds the Error Notification object contained in /tmp/en_sample.add to the errnotify file.
  2. To verify that the Error Notification object was added to the object class, type:
    odmget -q"en_name='sample'" errnotify
    The odmget command locates the Error Notification object within the errnotify file that has an en_name value of "sample" and displays the object. The following output is returned:
    errnotify:
        en_pid = 0
        en_name = "sample"
        en_persistenceflg = 0
        en_label = ""
        en_crcid = 0
        en_class = "H"
        en_type = "PERM"
        en_alertflg = ""
        en_resource = ""
        en_rtype = ""
        en_rclass = "disk"
        en_method = "errpt -a -l $1 | mail -s 'Disk Error' root"
  3. To delete the sample Error Notification object from the Error Notification object class, type:
    odmdelete -q"en_name='sample'" -o errnotify
    The odmdelete command locates the Error Notification object within the errnotify file that has an en_name value of "sample" and removes it from the Error Notification object class.
  4. To send an email to root when a duplicate error occurs, create a file called /tmp/en_sample.add containing the following error notification stanza:
    errnotify:
        en_name = "errdupxmp"
        en_persistenceflg = 1
        en_dup = "TRUE"
        en_method = "/usr/lib/dupmethod $1"

    Create the /usr/lib/dupmethod script as follows:

    #!/bin/sh
    # email root when a duplicate error is logged. 
    # We currently don't clear the duplicate from the log.
    # 
    # Input: 
    #   $1 contains the error log sequence number.
    #
    # Use errpt to generate the body of this email. 
    /usr/bin/errpt -a -l "$1" | /usr/bin/mail -s "Duplicate Error Logged" root >/dev/null
    
    # Now delete that error (currently not done)
    #/usr/bin/errclear -l$1 0 
    exit $?
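
The examples above select entries by class, type, resource class, or duplicate status. To select a single error by its identifier, apply the en_crcid rule when generating the stanza: prefix the hexadecimal value that the errpt command displays with 0x. The following is a minimal sketch. The function name crcid_stanza and the object name and method path in the usage note are hypothetical, and the stanza is only printed, not added to the ODM.

```shell
#!/bin/sh
# Hypothetical helper: prints an errnotify stanza that matches one error identifier.
#   $1  object name (en_name)
#   $2  identifier as displayed by errpt, for example 67581038
#   $3  notify method command string
crcid_stanza() {
    # Accept only hexadecimal digits; reject empty input or a value that already has a prefix.
    case $2 in
        ''|*[!0-9A-Fa-f]*)
            echo "crcid_stanza: identifier must be hexadecimal digits" >&2
            return 1
            ;;
    esac
    printf 'errnotify:\n'
    printf '        en_name = "%s"\n' "$1"
    printf '        en_persistenceflg = 0\n'
    printf '        en_crcid = 0x%s\n' "$2"
    printf '        en_method = "%s"\n' "$3"
}
```

For example, crcid_stanza idsample 67581038 '/usr/local/bin/notify_method $1' > /tmp/en_sample.add writes a stanza that can then be added with odmadd as in example 1. The single quotes keep the shell from expanding $1, so the keyword reaches the stanza unchanged.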