You can use the MODE command to control the recording or monitoring
of hard machine-check interruptions.
MODE {PD}[,INTERVAL={nnnnn}][,RECORD[=nnn ]][,CPU={x  }]
     {SD}           {300  }         |=ALL         {ALL}
     {IV}                           |=25
     {TC}                           |=16
     {PT}                           |=5
     {CC}                           |=10
     {PS}                           |=20
     {AD}
     {SL}
     {SC}
     {SS}
     {IC}
     {CO}
     {CS}
The parameters are:
- PD
- Instruction-processing damage machine checks are to be monitored
in the specified mode.
- SD
- System damage machine checks are to be monitored in the specified
mode.
- IV
- Machine checks indicating invalid PSW or registers are to be
monitored in the specified mode.
- TC
- Machine checks indicating TOD clock damage are to be monitored
in the specified mode.
- PT
- Machine checks indicating processor timer damage are to be monitored
in the specified mode.
- CC
- Machine checks indicating clock comparator damage are to be
monitored in the specified mode.
- PS
- Machine checks indicating primary clock synchronization are
to be monitored in the specified mode.
- AD
- Machine checks indicating ETR attachment are to be monitored
in the specified mode.
- SL
- Machine checks indicating switch to local synchronization are
to be monitored in the specified mode.
- SC
- Machine checks indicating ETR synchronization checks are to
be monitored in the specified mode.
- SS
- Machine checks indicating STP synchronization checks are to
be monitored in the specified mode.
- IC
- Machine checks indicating STP island condition are to be monitored
in the specified mode.
- CO
- Machine checks indicating STP configuration change are to be
monitored in the specified mode.
- CS
- Machine checks indicating STP clock source error condition are
to be monitored in the specified mode.
- INTERVAL=nnnnn
- This parameter is used together with the RECORD=nnn parameter.
It defines the length, in seconds, of the interval over which hard
machine-check interruptions are counted. If the interval elapses
before the number of interruptions given by RECORD occurs for the
specified type on the specified processor, the count for that type
is reset to zero and counting starts again. If that number of hard
machine-check interruptions does occur within the interval, the
system either performs a timer-related recovery action or invokes
alternate CPU recovery (ACR) to take the failing processor offline.
If the INTERVAL parameter is omitted, INTERVAL=300 is assumed.
- RECORD=nnn
- The number (1 to 999) of hard machine checks of the specified
type that must occur on the specified processor within the interval
before the system acts. When that number is reached, the system
either performs a timer-related recovery action or invokes alternate
CPU recovery (ACR) to take the failing processor offline. Until that
number is reached, every interruption of that type on that processor
is recorded on the logrec data set. If no number is specified or if
the RECORD parameter is omitted, the system uses the following
default settings:
- RECORD=16 for PD
- RECORD=25 for SL
- RECORD=20 for SC
- RECORD=10 for SS, IC, CO, and CS
- RECORD=5 for all others
- RECORD=ALL
- All hard machine-check interruptions of the specified type
occurring on the specified processor are to be recorded on the
logrec data set. The system no longer monitors the frequency of
hard machine-check interruptions of that type on that processor.
- CPU=x
- The address (0, 1, 2, 3...) of the processor to be monitored
in the specified mode. If the CPU parameter is omitted, CPU=ALL is
assumed.
- CPU=ALL
- All processors in the system are to be monitored in the specified
mode.
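The way INTERVAL and RECORD interact can be illustrated with a small Python model. This is only a sketch and not the system's implementation: ThresholdMonitor and default_record are hypothetical names, and the sketch assumes that the interval starts at the first counted interruption, which the parameter descriptions do not state.

```python
from dataclasses import dataclass
from typing import Optional

# Default RECORD values per machine-check type, as listed in the
# RECORD=nnn description. Types not named here default to 5.
DEFAULT_RECORD = {"PD": 16, "SL": 25, "SC": 20,
                  "SS": 10, "IC": 10, "CO": 10, "CS": 10}
DEFAULT_INTERVAL = 300  # seconds, used when INTERVAL is omitted


def default_record(check_type: str) -> int:
    """Return the documented default RECORD value for a type."""
    return DEFAULT_RECORD.get(check_type.upper(), 5)


@dataclass
class ThresholdMonitor:
    """Illustrative counter for one check type on one processor."""
    record: int
    interval: int = DEFAULT_INTERVAL
    count: int = 0
    window_start: Optional[float] = None

    def on_machine_check(self, now: float) -> bool:
        """Count one interruption at time `now` (in seconds).

        Returns True when `record` interruptions have occurred within
        `interval` seconds, the point at which the system would take
        a recovery action.
        """
        # Assumption: the interval opens at the first counted
        # interruption and is restarted once it has elapsed.
        if (self.window_start is None
                or now - self.window_start >= self.interval):
            self.count = 0
            self.window_start = now
        self.count += 1
        return self.count >= self.record
```

For instance, ThresholdMonitor(record=default_record("SD")) models the defaults for system damage machine checks: five interruptions within 300 seconds.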
Example 1:
Monitor instruction-processing-damage machine-check interruptions
on processor 0. If seven of these interruptions occur in 600 seconds
on processor 0, invoke ACR to take processor 0 offline.
mode pd,record=7,interval=600,cpu=0
Example 2:
Record on the logrec data set all machine-check interruptions indicating
invalid PSW or registers, but do not monitor them for any processor
in the system.
MODE IV,CPU=ALL,RECORD=ALL
Example 3:
Monitor the frequency of system damage machine-check interruptions
on all processors, using the default values of five for the RECORD=
parameter and 300 for the INTERVAL= parameter. After five system
damage machine checks have occurred on a given processor within five
minutes (300 seconds), invoke ACR to take that processor offline.
mode sd
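The three examples can also be produced by a small helper that assembles a MODE command string from its parameters. This is a sketch under stated assumptions: build_mode_command is a hypothetical name and not a system-provided interface, it emits keywords in the order of the syntax diagram, and it checks only the RECORD range of 1 to 999 given in the parameter description. No upper limit is assumed for INTERVAL or for the processor address.

```python
from typing import Optional, Union

# Machine-check type keywords accepted as the first MODE operand.
CHECK_TYPES = ("PD", "SD", "IV", "TC", "PT", "CC", "PS",
               "AD", "SL", "SC", "SS", "IC", "CO", "CS")


def build_mode_command(check_type: str,
                       interval: Optional[int] = None,
                       record: Union[int, str, None] = None,
                       cpu: Union[int, str, None] = None) -> str:
    """Assemble a MODE command string (hypothetical helper)."""
    kind = check_type.upper()
    if kind not in CHECK_TYPES:
        raise ValueError(f"unknown machine-check type: {check_type!r}")
    parts = [kind]

    if interval is not None:
        if interval < 0:
            raise ValueError("INTERVAL must not be negative")
        parts.append(f"INTERVAL={interval}")

    if record is not None:
        if isinstance(record, str):
            if record.upper() != "ALL":
                raise ValueError("RECORD must be 1 to 999 or ALL")
            parts.append("RECORD=ALL")
        elif 1 <= record <= 999:
            parts.append(f"RECORD={record}")
        else:
            raise ValueError("RECORD must be 1 to 999 or ALL")

    if cpu is not None:
        if isinstance(cpu, str):
            if cpu.upper() != "ALL":
                raise ValueError("CPU must be an address or ALL")
            parts.append("CPU=ALL")
        elif cpu >= 0:
            parts.append(f"CPU={cpu}")
        else:
            raise ValueError("CPU must be an address or ALL")

    return "MODE " + ",".join(parts)


print(build_mode_command("PD", interval=600, record=7, cpu=0))
# MODE PD,INTERVAL=600,RECORD=7,CPU=0
print(build_mode_command("IV", record="ALL", cpu="ALL"))
# MODE IV,RECORD=ALL,CPU=ALL
print(build_mode_command("SD"))
# MODE SD
```

Omitted parameters are left out of the string, so the defaults described above apply when the command is entered.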