Hardware service requested

Restriction: Do not aggregate this event.

It is important to be notified when a hardware component fails so that Support can notify service technicians who can replace or repair the component. For devices such as disks, a hardware failure causes the system to bring a spare disk online. After an activation period, the spare disk transparently replaces the failed disk. Even so, you must replace the failed disk with a healthy disk to restore the system to normal operation with its full complement of spares.

In other cases, such as SPU failures, the system reroutes the work of the failed SPU to the other available SPUs. The system performance is affected because the healthy resources take on extra workload. Again, it is critical to obtain service to replace the faulty component and restore the system to its normal performance.

If you enable the event rule HardwareServiceRequested, the system generates a notification when there is a hardware failure and service technicians might be required to replace or repair components.

The following is the syntax for the event rule HardwareServiceRequested:
-name 'HardwareServiceRequested' -on no -eventType hwServiceRequested 
-eventArgsExpr '' -notifyType email -dst 'you@company.com' -ccDst '' 
-msg 'NPS system $HOST - Service requested for $hwType $hwId at 
$eventTimestamp $eventSource.' -bodyText 
'$notifyMsg\n\nlocation:$location\nerror 
string:$errString\ndevSerial:$devSerial\nevent source:$eventSource\n'
-eventAggrCount 0

The following table lists the arguments to the HardwareServiceRequested event rule.

Table 1. HardwareServiceRequested event rule

| Arguments | Description | Example |
|-----------|-------------|---------|
| hwType | The type of hardware affected | spu, disk, pwr, fan, mm |
| hwId | The hardware ID of the component that reports a problem | 1013 |
| location | A string that describes the physical location of the component | |
| errString | Specifies more information about the error or condition that triggered the event. If the failed component is not inventoried, it is specified in this string. | |
| devSerial | Specifies the serial number of the component, or Unknown if the component has no serial number. | 601S496A2012 |

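
To illustrate how the arguments above fill in the rule's -msg text, the following Python sketch expands the same $-style placeholders with the standard library's string.Template. This is only an approximation of the substitution for illustration, not the event manager's own implementation. The function name render_subject and all sample values other than the hwType and hwId examples from the table are hypothetical.

```python
from string import Template

# Message text copied from the -msg argument of the rule syntax.
MSG = Template(
    "NPS system $HOST - Service requested for $hwType $hwId "
    "at $eventTimestamp $eventSource."
)

# Hypothetical event values for illustration only. hwType and hwId use
# the example values from the arguments table; the rest are made up.
sample_event = {
    "HOST": "nzhost1",
    "hwType": "disk",
    "hwId": "1013",
    "eventTimestamp": "2012-04-05 19:52:41 EDT",
    "eventSource": "sysmgr",
}


def render_subject(event):
    """Expand each $name placeholder with the matching event value."""
    # substitute() raises KeyError if a placeholder has no value,
    # which makes a missing argument easy to spot.
    return MSG.substitute(event)


if __name__ == "__main__":
    print(render_subject(sample_event))
```
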
For source disks used in a disk regeneration to a spare disk, the HardwareServiceRequested event also notifies you when regeneration encounters a read sector error on the source disk. The event helps you to identify when a regeneration requires some attention to address possible issues on the source and newly created mirror disks. The error messages in the event notification and in the sysmgr.log and eventmgr.log files contain information about the bad sector, as in the following example:
2012-04-05 19:52:41.637742 EDT Info: received & processing event type 
= hwServiceRequested, event args = 'hwType=disk, hwId=1073, 
location=Logical Name:'spa1.diskEncl2.disk1' Logical Location:'1st 
rack, 2nd disk enclosure, disk in Row 1/Column 1', errString=disk md: 
md2 sector: 2051 partition type: DATA  table: 201328, 
devSerial=9QJ2FMKN00009838VVR9...
The errString value contains more information about the sector that had a read error:
  • The md value specifies the RAID device on the SPU that encountered the issue.
  • The sector value specifies which sector in the device has the read error.
  • The partition type specifies whether the partition is a user data (DATA) or SYSTEM partition.
  • The table value specifies the table ID of the user table that is affected by the bad sector.
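
If you collect these notifications in a script, the four fields described above can be pulled out of the errString with a regular expression. The following Python sketch assumes the errString always follows the layout shown in the log example (md, sector, partition type, then table). That layout is inferred from the single example and is not a documented format. The function name parse_read_sector_error is hypothetical.

```python
import re
from typing import Optional

# Pattern inferred from the one log example in this topic; other
# errString layouts (an assumption) simply do not match.
READ_SECTOR_ERROR = re.compile(
    r"md:\s*(?P<md>\S+)\s+"
    r"sector:\s*(?P<sector>\d+)\s+"
    r"partition type:\s*(?P<partition_type>\S+)\s+"
    r"table:\s*(?P<table>\d+)"
)


def parse_read_sector_error(err_string: str) -> Optional[dict]:
    """Return the md, sector, partition type, and table fields,
    or None if the string does not look like a read sector error."""
    match = READ_SECTOR_ERROR.search(err_string)
    if match is None:
        return None
    fields = match.groupdict()
    # Sector and table ID are numeric in the example, so convert them.
    fields["sector"] = int(fields["sector"])
    fields["table"] = int(fields["table"])
    return fields


if __name__ == "__main__":
    example = "disk md: md2 sector: 2051 partition type: DATA  table: 201328"
    print(parse_read_sector_error(example))
```

A None result means the string did not match the assumed layout, so a caller should fall back to logging the raw errString rather than discarding the event.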

If the system notifies you of a read sector error, contact Netezza Performance Server Support for assistance with troubleshooting and resolving the problems.