Situations

This chapter describes the predefined situations of the product.

Overview of situations

You can use the predefined situations shipped with IBM Z OMEGAMON AI for Storage as-is or modify them to meet your requirements. If you choose to modify a predefined situation, first make a copy to ensure a fallback, if necessary. You can also create your own situations using the attributes provided by IBM Z OMEGAMON AI for Storage.
Note: Do not modify the product-provided situations. If you want to modify a product-provided situation, copy the situation, modify the copy, and rename the copy.
 

Definition of a predefined situation

A situation is a logical expression involving one or more system conditions. IBM Z OMEGAMON AI for Storage uses situations to monitor the systems in your network. To improve the speed with which you begin using IBM Z OMEGAMON AI for Storage, the product provides situations that check for system conditions common to many enterprises. You can examine and, if necessary, change the conditions or values being monitored to those best suited to your enterprise. Be sure to start the situations that you want to run in your environment.
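As a rough illustration of a situation as a logical expression, the following Python sketch models each condition as an attribute, a comparison operator, and a value, and treats the situation as true when every condition holds. The operator mnemonics (EQ, NE, GT, GE, LT, LE) are the ones used in the formulas later in this chapter. The `Condition` class, the `all_hold` function, and the dictionary that stands in for a row of attribute data are hypothetical names chosen for this example and are not part of the product.

```python
import operator
from dataclasses import dataclass
from typing import Any, Mapping, Sequence

# Comparison mnemonics as they appear in the situation formulas.
OPS = {
    "EQ": operator.eq,
    "NE": operator.ne,
    "GT": operator.gt,
    "GE": operator.ge,
    "LT": operator.lt,
    "LE": operator.le,
}


@dataclass(frozen=True)
class Condition:
    """One 'VALUE attribute OP value' term (hypothetical model)."""
    attribute: str
    op: str
    value: Any

    def holds(self, row: Mapping[str, Any]) -> bool:
        # Compare the sampled attribute value with the condition value.
        return OPS[self.op](row[self.attribute], self.value)


def all_hold(conditions: Sequence[Condition], row: Mapping[str, Any]) -> bool:
    """Conditions joined by AND: true only when every condition holds."""
    return all(c.holds(row) for c in conditions)


# Two conditions joined by AND, in the style of
# "Function_Status EQ Held AND Function EQ Backup".
held_backup = [
    Condition("Function_Status", "EQ", "Held"),
    Condition("Function", "EQ", "Backup"),
]
```

Calling `all_hold(held_backup, row)` with a sampled row then answers whether the expression is true for that row.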

Using situations

You manage situations from the Tivoli® management portal using the Situation editor. Using the Situation editor, you can perform the following tasks:

  • Create a situation
  • Save a situation
  • Display a situation
  • Edit a situation
  • Start, stop, or delete a situation
  • Investigate the situation event workspace for a situation

When you open the Situation editor, the left frame initially lists the situations associated with the Navigator item you selected. When you click a situation name or create a new situation, the right frame of the Situation editor opens to provide the following information about the situation and allow you to further define that situation:

Condition
View, add to, and edit the condition being tested.
Distribution
View the systems to which the situation is assigned and assign the situation to systems.
Expert advice
Write comments or instructions to be read in the situation event workspace.
Action
Specify a command to be sent to the system.

You can also specify a Storage Toolkit request to be run when a situation becomes true if IBM Z OMEGAMON AI for Storage is installed and a storage table is enabled for Storage Toolkit commands.

Until
Reset a true situation when another situation becomes true or a specified time interval elapses.

Predefined situation descriptions

The following predefined situations are included in the IBM Z OMEGAMON AI for Storage product.

KS3_Applic_Resp_Time_Critical

If VALUE S3_Application_Monitoring.High_Dataset_MSR GE 50
Monitors the response time components to determine the reason for a poor response time when an application is accessing a dataset and the response time is greater than the critical threshold. Also examine the volume for over-utilization, cache settings, and the response time components at the volume level.

KS3_Applic_Resp_Time_Warning

If VALUE S3_Application_Monitoring.High_Dataset_MSR GE 40 AND
VALUE S3_Application_Monitoring.High_Dataset_MSR LT 50
Monitors the response time components to determine the reason for a poor response time when an application is accessing a dataset and the response time is greater than the warning threshold. Also examine the volume for over-utilization, cache settings, and the response time components at the volume level.
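The critical and warning formulas above split the High_Dataset_MSR value into two adjacent, non-overlapping bands. A minimal Python sketch of that banding follows. The function name and the returned labels are hypothetical and only restate the thresholds shown in the two formulas.

```python
def classify_dataset_msr(high_dataset_msr: float) -> str:
    """Map a High_Dataset_MSR value to the band whose formula it satisfies."""
    if high_dataset_msr >= 50:
        # KS3_Applic_Resp_Time_Critical: GE 50
        return "critical"
    if 40 <= high_dataset_msr < 50:
        # KS3_Applic_Resp_Time_Warning: GE 40 AND LT 50
        return "warning"
    # Neither formula is true.
    return "none"
```

Because the warning formula stops at LT 50 and the critical formula starts at GE 50, a single value never satisfies both.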

KS3_Cachecu_Cache_Stat_Critical

If VALUE S3_Cache_Control_Unit.Cache_Status NE Active
Monitors for the condition where caching is not active for the control unit. Use the SETCACHE command to activate caching, if appropriate.

KS3_Cachecu_DFW_Retry_Critical

If VALUE S3_Cache_Control_Unit.DFW_Retry_Percent GE 2
Monitors for the condition where the percent of DASD fast write attempts that cannot be satisfied because of a shortage of available nonvolatile storage (NVS) space exceeds the critical threshold. Check for pinned NVS and correct the problem if NVS is pinned. Otherwise, if the impact on performance is not acceptable, you need to move a volume or dataset to another cache control unit or add NVS to this control unit.

KS3_Cachecu_DFW_Retry_Warning

If VALUE S3_Cache_Control_Unit.DFW_Retry_Percent GE 1 AND
VALUE S3_Cache_Control_Unit.DFW_Retry_Percent LT 2
Monitors for the condition where the percent of DASD fast write attempts that cannot be satisfied because of a shortage of available nonvolatile storage (NVS) space has exceeded the warning threshold. Check for pinned NVS and correct the problem if NVS is pinned. Otherwise, if the impact on performance is not acceptable, move a volume or dataset to another cache control unit or add NVS to this control unit.

KS3_Cachecu_Inact_Vols_Critical

If VALUE S3_Cache_Control_Unit.Deactivated_Volumes GE 15
Monitors for the condition where the number of deactivated volumes on the control unit exceeds the critical threshold. You can use the SETCACHE command to activate caching on the volumes, if necessary.

KS3_Cachecu_Inact_Vols_Warning

If VALUE S3_Cache_Control_Unit.Deactivated_Volumes GE 10 AND
VALUE S3_Cache_Control_Unit.Deactivated_Volumes LT 15
Monitors for the condition where the number of deactivated volumes on the control unit exceeds the warning threshold. You can use the SETCACHE command to activate caching on the volumes, if necessary.

KS3_Cachecu_NVS_Stat_Critical

If VALUE S3_Cache_Control_Unit.NVS_Status NE Active
Monitors for the condition where nonvolatile storage is not active for the control unit. All writes to volumes on the control unit are written directly to the hard disk drive. Use the SETCACHE command to activate NVS (nonvolatile storage), if appropriate.

KS3_Cachecu_Read_HitP_Critical

If VALUE S3_Cache_Control_Unit.Read_Hit_Percent LE 50 AND
VALUE S3_Cache_Control_Unit.Read_Hit_Percent GT 0
Monitors for the condition where the percent of read I/O requests resolved from cache has fallen below the critical threshold. If performance is a problem, look for volumes with a low read hit percent and consider moving them to another control unit to balance the load. This condition can be caused by cache-unfriendly applications or a shortage of cache.

KS3_Cachecu_Read_HitP_Warning

If VALUE S3_Cache_Control_Unit.Read_Hit_Percent LE 60 AND
VALUE S3_Cache_Control_Unit.Read_Hit_Percent GT 50
Monitors for the condition where the percent of read I/O requests resolved from cache has fallen below the warning threshold. If performance is a problem, look for volumes with a low read hit percent and consider moving them to another control unit to balance the load. This condition can be caused by cache-unfriendly applications or a shortage of cache.
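For the read hit percent situations, lower values are worse, and the critical formula also requires the value to be greater than 0. The following Python sketch restates the two formulas above; the function name and labels are hypothetical. Note that, with the formulas as written, a read hit percent of exactly 0 satisfies neither situation.

```python
def classify_cu_read_hit_percent(read_hit_percent: float) -> str:
    """Map a control unit Read_Hit_Percent value to the matching band."""
    if 0 < read_hit_percent <= 50:
        # KS3_Cachecu_Read_HitP_Critical: LE 50 AND GT 0
        return "critical"
    if 50 < read_hit_percent <= 60:
        # KS3_Cachecu_Read_HitP_Warning: LE 60 AND GT 50
        return "warning"
    # Values above 60, and a value of exactly 0, match neither formula.
    return "none"
```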

KS3_Cachecu_Trk_Dstg_Critical

If VALUE S3_Cache_Control_Unit.Track_Destaging_Rate GE 70
Monitors for the condition where the rate at which tracks are being removed from cache and written to DASD exceeds the critical threshold. If performance is being impacted, you need to migrate datasets or volumes to another cache control unit. An alternative is to increase the cache capacity.

KS3_Cachecu_Trk_Dstg_Warning

If VALUE S3_Cache_Control_Unit.Track_Destaging_Rate GE 50 AND
VALUE S3_Cache_Control_Unit.Track_Destaging_Rate LT 70
Monitors for the condition where the rate at which tracks are being removed from cache and written to DASD exceeds the warning threshold. If performance is being impacted, you need to migrate datasets or volumes to another cache control unit. An alternative is to increase the cache capacity.

KS3_Cachecu_Trk_Stag_Critical

If VALUE S3_Cache_Control_Unit.Track_Staging_Rate GE 70
Monitors for the condition where the movement of tracks from the physical device to cache has exceeded the critical threshold. If performance is impacted, you might need to move the logical volume that is causing the excessive activity or to move datasets on the logical volume.

KS3_Cachecu_Trk_Stag_Warning

If VALUE S3_Cache_Control_Unit.Track_Staging_Rate GE 50 AND
VALUE S3_Cache_Control_Unit.Track_Staging_Rate LT 70
Monitors for the condition where the movement of tracks from the physical device to cache has exceeded the warning threshold. If performance is impacted, you might need to move the logical volume that is causing the excessive activity or to move datasets on the logical volume.

KS3_Cachecu_Write_HitP_Critical

If VALUE S3_Cache_Control_Unit.Write_Hit_Percent LE 45 AND
VALUE S3_Cache_Control_Unit.Write_Hit_Percent GE 0
Monitors for the condition where the percent of DASD/Cache fast write commands that were successfully processed without accessing the volume is below the critical threshold. If performance is impacted, you might need to move a volume or dataset to another control unit to balance the workload.

KS3_Cachecu_Write_HitP_Warning

If VALUE S3_Cache_Control_Unit.Write_Hit_Percent LE 50 AND
VALUE S3_Cache_Control_Unit.Write_Hit_Percent GT 45
Monitors for the condition where the percent of DASD/Cache fast write commands that were successfully processed without accessing the volume is below the warning level. If performance is impacted, you might need to move a volume or dataset to another control unit to balance the workload.

KS3_Channel_Busy_Pct_Critical

If VALUE S3_Channel_Path.Complex_Percent_Utilized GE 85
Monitors high response time for I/O requests to volumes being serviced by the channel due to over utilization of that channel. You might need to balance the workload between channels by moving volumes or datasets.

KS3_Channel_Busy_Pct_Warning

If VALUE S3_Channel_Path.Complex_Percent_Utilized GE 70 AND
VALUE S3_Channel_Path.Complex_Percent_Utilized LT 85
Monitors high response time for I/O requests to volumes being serviced by the channel due to over utilization of that channel. You might need to balance the workload between channels by moving volumes or datasets.

KS3_FICON_Port_Frame_Pacing_W

Situation is true if the port is not a switch (if port is connected to a CU or channel path) and the average frame pacing is greater than 100.

This situation is written against the S3_FICON_Director_Ports attribute group.

KS3_FICON_Switch_Frame_Pacing_W

Situation is true if the port is not a switch (if port is connected to a CU or channel path) and the average frame pacing is greater than 100.

This situation is written against the S3_FICON_Director_Ports attribute group.

KS3_HSM_Backup_Held_Critical

If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND
VALUE S3_HSM_Function_Summary.Function EQ Backup
Monitors the HSM backup function to see if it is being held. If the hold is inadvertent, issue the HSM RELEASE BACKUP command to allow the backup function to continue processing.

KS3_HSM_Backup_Queue_Critical

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Backup
Monitors the HSM backup queue for a condition where the number of backup requests waiting exceeds the critical threshold. If the number of backup tasks is not at the maximum, issue the HSM SETSYS MAXBACKUPTASKS command to increase the number of backup tasks, thus increasing the processing rate. Keep in mind that the number of available backup volumes serves as a constraint on the number of active backup tasks.

KS3_HSM_Backup_Queue_Warning

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND
VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Backup
Monitors the HSM backup queue for a condition where the number of backup requests waiting exceeds the warning threshold. If the number of backup tasks is not at the maximum, issue the HSM SETSYS MAXBACKUPTASKS command to increase the number of backup tasks, thus increasing the processing rate. Keep in mind that the number of available backup volumes serves as a constraint on the number of active backup tasks.
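The two backup queue formulas above combine a threshold on Waiting_Requests with a filter on the Function attribute, so rows for other HSM functions never trigger them. The sketch below illustrates that combination in Python. The `FunctionSummaryRow` class and the function name are hypothetical stand-ins for a row of the S3_HSM_Function_Summary attribute group.

```python
from dataclasses import dataclass


@dataclass
class FunctionSummaryRow:
    """Hypothetical stand-in for one S3_HSM_Function_Summary row."""
    function: str
    waiting_requests: int


def backup_queue_severity(row: FunctionSummaryRow) -> str:
    """Apply the KS3_HSM_Backup_Queue_* thresholds to a single row."""
    if row.function != "Backup":
        # Both formulas require Function EQ Backup.
        return "none"
    if row.waiting_requests >= 50:
        # Critical: Waiting_Requests GE 50
        return "critical"
    if 15 <= row.waiting_requests < 50:
        # Warning: Waiting_Requests GE 15 AND LT 50
        return "warning"
    return "none"
```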

KS3_HSM_CRQ_Element_Full_Warn

If VALUE S3_HSM_CRQplex.Element_Percent_Full GT 80
Monitors the percentage of elements on the Common Recall Queue that are currently in use. HSM throttles the use of the CRQ when the percent used reaches 95%. To expand the CRQ structure, issue the SETXCF START,ALTER command.

KS3_HSM_CRQ_Entry_Full_Warning

If VALUE S3_HSM_Cross_System_CRQplex.Entry_Percent_Full GT 80
Monitors the percentage of entries on the Common Recall Queue that are currently in use. HSM throttles the use of the CRQ when the percent used reaches 95%. To expand the CRQ structure, issue the SETXCF START,ALTER command.

KS3_HSM_CRQ_Host_Critical

If VALUE S3_HSM_Cross_System_CRQ_Hosts.HSM_Host_CRQ_State NE Connected
AND VALUE S3_HSM_Cross_System_CRQ_Hosts.CRQplex_Base_Name NE n/a
Monitors the state of the host with regard to the Common Recall Queue. To connect an HSM host to the CRQ, issue the HSM SETSYS command.

KS3_HSM_CRQ_Host_Disconn_Crit

If VALUE S3_HSM_Cross_System_CRQplex.HSM_Hosts_Not_Connected GT 0
Monitors the number of HSM hosts currently not connected to the Common Recall Queue.

KS3_HSM_CRQ_Host_Held_Critical

If VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Held EQ Yes
Monitors the commonqueue status for this host. This condition can occur if the HOLD COMMONQUEUE command has been issued. To resolve this condition, issue a RELEASE COMMONQUEUE command.

KS3_HSM_CRQ_Host_Place_Crit

If VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Place_Held EQ Internal OR
VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Place_Held EQ External OR
VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Place_Held EQ Both
Monitors the commonqueue status for this host and whether requests can be placed on the common recall queue. This condition can occur if the HOLD COMMONQUEUE(RECALL(PLACEMENT)) command has been issued or inferred because a HOLD COMMONQUEUE or HOLD COMMONQUEUE(RECALL) was issued. To resolve this condition, issue a RELEASE COMMONQUEUE(RECALL(PLACEMENT)) command.

KS3_HSM_CRQ_Host_Recall_Crit

If VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Held EQ Internal OR
VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Held EQ External OR
VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Held EQ Both
Monitors the commonqueue status for this host and whether requests can be recalled from the common recall queue. This condition can occur if the HOLD COMMONQUEUE(RECALL) command has been issued or inferred because a HOLD COMMONQUEUE was issued. To resolve this condition, issue a RELEASE COMMONQUEUE(RECALL) command.

KS3_HSM_CRQ_Host_Select_Crit

If VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Select_Held EQ Internal OR
VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Select_Held EQ External OR
VALUE S3_Cross_System_HSM_CRQ_Hosts.Host_CRQ_Recall_Select_Held EQ Both
Monitors the commonqueue status for this host and whether requests can be pulled from the common recall queue. This condition can occur if the HOLD COMMONQUEUE(RECALL(SELECT)) command has been issued or inferred because a HOLD COMMONQUEUE or HOLD COMMONQUEUE(RECALL) was issued. To resolve this condition, issue a RELEASE COMMONQUEUE(RECALL(SELECT)) command.
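The Place, Recall, and Select situations above each compare one hold attribute with the same three values joined by OR. That pattern amounts to a set membership test, sketched here in Python. The names `HELD_VALUES` and `is_held` are hypothetical, and the example value "No" used in the usage note is only an assumed placeholder for a value that is not one of the three listed.

```python
# The three values tested with EQ ... OR ... in the formulas above.
HELD_VALUES = frozenset({"Internal", "External", "Both"})


def is_held(hold_attribute_value: str) -> bool:
    """True when the attribute equals Internal, External, or Both."""
    return hold_attribute_value in HELD_VALUES
```

For example, `is_held("External")` is true, while `is_held("No")` is false because "No" is not one of the three tested values.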

KS3_HSM_Dump_Held_Critical

If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND
VALUE S3_HSM_Function_Summary.Function EQ Dump
Monitors the HSM dump function to see if it is being held. If the hold is inadvertent, issue the HSM RELEASE DUMP command to allow dump processing to continue.

KS3_HSM_Dump_Queue_Critical

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Dump
Monitors the HSM dump queue for a condition where the number of dump requests waiting exceeds the critical threshold. If the number of dump tasks is not at the maximum, use the HSM SETSYS MAXDUMPTASKS command to increase the number of dump tasks, thus increasing the processing rate. Keep in mind that the number of available tape drives serves as a constraint on the number of active dump tasks.

KS3_HSM_Dump_Queue_Warning

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND
VALUE S3_HSM_Function_Summary.Function EQ Dump AND
VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50
Monitors the HSM dump queue for a condition where the number of dump requests waiting exceeds the warning threshold. If the number of dump tasks is not at the maximum, use the HSM SETSYS MAXDUMPTASKS command to increase the number of dump tasks, thus increasing the processing rate. Keep in mind that the number of available tape drives serves as a constraint on the number of active dump tasks.

KS3_HSM_Inactive_Host_Warning

If VALUE S3_HSM_Status.Inactive_HSM_Hosts GT 0
Monitors when an inactive HSM host has been detected. The event workspace for this situation has a link to the DFSMShsm Host Details workspace.

KS3_HSM_Migrate_Held_Critical

If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND
VALUE S3_HSM_Function_Summary.Function EQ Migration
Monitors the migrate function to see if it is being held. If the hold on the function is inadvertent, issue the HSM RELEASE MIGRATION command to allow migration to continue.

KS3_HSM_Migrate_Queue_Critical

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Migration
Monitors the HSM migration queue for a condition where the number of migration requests waiting exceeds the critical threshold. If the number of migrate tasks is not at the maximum, use the HSM SETSYS MAXMIGRATIONTASKS command to increase the number of migration tasks, thus increasing the processing rate. Note that this affects only those migrations requested by automatic functions. Only one task is available to process command migration requests.

KS3_HSM_Migrate_Queue_Warning

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND
VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Migration
Monitors the HSM migration queue for a condition where the number of migration requests waiting exceeds the warning threshold. If the number of migrate tasks is not at the maximum, use the HSM SETSYS MAXMIGRATIONTASKS command to increase the number of migration tasks, thus increasing the processing rate. Note that this affects only those migrations requested by automatic functions. Only one task is available to process command migration requests.

KS3_HSM_Recall_Held_Critical

If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND
VALUE S3_HSM_Function_Summary.Function EQ Recall
Monitors the recall function to see if it is being held. If the hold on the function is inadvertent, issue the HSM RELEASE RECALL command to allow recalls to resume.

KS3_HSM_Recall_Queue_Critical

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Recall
Monitors the HSM recall queue for a condition where the number of recall requests waiting exceeds the critical threshold. If the number of recall tasks is not at the maximum, use the HSM SETSYS MAXRECALLTASKS command to increase the number of recall tasks, thus increasing the processing rate.

KS3_HSM_Recall_Queue_Warning

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND
VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Recall
Monitors the HSM recall queue for a condition where the number of recall requests waiting exceeds the warning threshold. If the number of recall tasks is not at the maximum, use the HSM SETSYS MAXRECALLTASKS command to increase the number of recall tasks, thus increasing the processing rate.

KS3_HSM_Recovery_Held_Critical

If VALUE S3_HSM_Function_Summary.Function_Status EQ Held AND
VALUE S3_HSM_Function_Summary.Function EQ Recovery
Monitors the recovery function to see if it is being held. If the hold on the function is inadvertent, issue the HSM RELEASE RECOVER command to allow the recovery function to resume.

KS3_HSM_Recovery_Queue_Critical

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Recovery
Monitors the HSM recovery queue for a condition where the number of recover requests waiting exceeds the critical threshold. If the number of recovery tasks is not at the maximum, use the HSM SETSYS MAXDSRECOVERTASKS command to increase the number of recover tasks, thus increasing the processing rate. Keep in mind that the number of backup tape cartridges serves as a constraint on the number of active recovery tasks.

KS3_HSM_Recovery_Queue_Warning

If VALUE S3_HSM_Function_Summary.Waiting_Requests GE 15 AND
VALUE S3_HSM_Function_Summary.Waiting_Requests LT 50 AND
VALUE S3_HSM_Function_Summary.Function EQ Recovery
Monitors the HSM recovery queue for a condition where the number of recover requests waiting exceeds the warning threshold. If the number of recovery tasks is not at the maximum, use the HSM SETSYS MAXDSRECOVERTASKS command to increase the number of recover tasks, thus increasing the processing rate. Keep in mind that the number of backup tape cartridges serves as a constraint on the number of active recovery tasks.

KS3_HSM_Status_Inactive_Crit

If VALUE S3_HSM_Status.HSM_Status EQ InActive
Monitors the status of the HSM. If status is not active, restart HSM.

KS3_LCU_Av_Delay_Q_Critical

If VALUE S3_Logical_Control_Unit.Average_Delay_Queue GE 0.500
Monitors for the condition where the average number of requests queued to devices assigned to a logical control unit due to busy conditions on physical paths has exceeded the critical threshold. If performance is impacted, you might be able to balance the workload across multiple LCUs by moving a volume or dataset. Otherwise, you need to add physical paths to the LCU.

KS3_LCU_Av_Delay_Q_Warning

If VALUE S3_Logical_Control_Unit.Average_Delay_Queue GE 0.2 AND
VALUE S3_Logical_Control_Unit.Average_Delay_Queue LT 0.500
Monitors for the condition where the average number of requests queued to devices assigned to a logical control unit due to busy conditions on physical paths has exceeded the warning threshold. If performance is impacted, you might be able to balance the workload across multiple LCUs by moving a volume or dataset. Otherwise, you need to add physical paths to the LCU.

KS3_LCU_Cont_Rate_Critical

If VALUE S3_Logical_Control_Unit.Contention_Rate GE 1.001
Monitors for the condition where the rate at which I/O requests are being queued to devices on a logical control unit (LCU) due to busy conditions on physical paths has exceeded the critical threshold. If performance is impacted, you need to migrate volumes or datasets to another LCU. Otherwise, you need to add physical paths to the LCU.

KS3_LCU_Cont_Rate_Warning

If VALUE S3_Logical_Control_Unit.Contention_Rate GE 0.2 AND
VALUE S3_Logical_Control_Unit.Contention_Rate LT 1.001
Monitors for the condition where the rate at which I/O requests are being queued to devices on a logical control unit (LCU) due to busy conditions on physical paths has exceeded the warning threshold. If performance is impacted, you need to migrate volumes or datasets to another LCU. Otherwise, you need to add physical paths to the LCU.

KS3_LCU_IO_Rate_Sec_Critical

If VALUE S3_Logical_Control_Unit.Channel_Path_I/O_Rate GE 600
Monitors for the condition where the I/O rate per second to volumes in the logical control unit (LCU) has exceeded the critical threshold. If performance is impacted, you need to balance the workload across multiple LCUs by moving volumes or datasets.

KS3_LCU_IO_Rate_Sec_Warning

If VALUE S3_Logical_Control_Unit.Channel_Path_I/O_Rate GE 200 AND
VALUE S3_Logical_Control_Unit.Channel_Path_I/O_Rate LT 600
Monitors for the condition where the I/O rate per second to volumes in the logical control unit (LCU) has exceeded the warning threshold. If performance is impacted, you need to balance the workload across multiple LCUs by moving volumes or datasets.

KS3_RLS_Accelerated_Mode_Warn

If VALUE S3_RLS_Buffer_LSU_Summary.BMF_Accelerated_Mode_Pct GT 75
Monitors for the warning condition where the BMF Accelerated Mode Pct is greater than 75.

KS3_RLS_Dataset_Avg_Resp_Time_C

If VALUE S3_RLS_Dataset_Group_Details.Average_Response_Time GT 3
Monitors for the critical condition where the Average Response Time is greater than 3.

KS3_RLS_Dataset_Avg_Resp_Time_W

If VALUE S3_RLS_Dataset_Group_Details.Average_Response_Time GT 2 AND
VALUE S3_RLS_Dataset_Group_Details.Average_Response_Time LT 3
Monitors for the warning condition where the Average Response Time is between 2 and 3.
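Unlike the GE/LT pairs used elsewhere in this chapter, the RLS response time formulas above use strict comparisons on both sides of the boundary (GT 3 for critical, LT 3 for warning). The Python sketch below restates them to show the consequence: with the formulas as written, a value of exactly 3 satisfies neither situation. The function name and labels are hypothetical.

```python
def classify_rls_dataset_response_time(average_response_time: float) -> str:
    """Map an Average_Response_Time value to the matching RLS dataset band."""
    if average_response_time > 3:
        # KS3_RLS_Dataset_Avg_Resp_Time_C: GT 3
        return "critical"
    if 2 < average_response_time < 3:
        # KS3_RLS_Dataset_Avg_Resp_Time_W: GT 2 AND LT 3
        return "warning"
    # Values of 2 or less, and a value of exactly 3, match neither formula.
    return "none"
```

If that boundary matters in your environment, one option is to adjust the comparison in a copy of the situation, in line with the guidance on copying product-provided situations.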

KS3_RLS_DSG_Max_Avg_Resp_Time_C

If VALUE S3_RLS_Dataset_Group_Summary.Max_Average_Response_Time GT 3
Monitors for the critical condition where the Max Average Response Time is greater than 3.

KS3_RLS_DSG_Max_Avg_Resp_Time_W

If VALUE S3_RLS_Dataset_Group_Summary.Max_Average_Response_Time GT 2 AND
VALUE S3_RLS_Dataset_Group_Summary.Max_Average_Response_Time LT 3
Monitors for the warning condition where the Max Average Response Time is between 2 and 3.

KS3_RLS_Panic_Mode_Critical

If VALUE S3_RLS_Buffer_LSU_Summary.BMF_Panic_Mode_Pct GT 75
Monitors for the critical condition where the BMF Panic Mode Pct is greater than 75.

KS3_RLS_Panic_Mode_Warning

If VALUE S3_RLS_Buffer_LSU_Summary.BMF_Panic_Mode_Pct GT 50 AND
VALUE S3_RLS_Buffer_LSU_Summary.BMF_Panic_Mode_Pct LT 75
Monitors for the warning condition where the BMF Panic Mode Pct is between 50 and 75.

KS3_RLS_StorCls_Avg_Resp_Time_C

If VALUE S3_RLS_Storage_Class.Average_Response_Time GT 3
Monitors for the critical condition where the Average Response Time is greater than 3.

KS3_RLS_StorCls_Avg_Resp_Time_W

If VALUE S3_RLS_Storage_Class.Average_Response_Time GT 2 AND
VALUE S3_RLS_Storage_Class.Average_Response_Time LT 3
Monitors for the warning condition where the Average Response Time is between 2 and 3.

KS3_RMM_CDS_Backup_Critical

If VALUE S3_RMM_Control_Dataset.Days_Since_Last_Backup GT 3
The number of days since the last backup of the DFSMSrmm CDS or Journal exceeded the critical threshold.

KS3_RMM_CDS_Backup_Warning

If VALUE S3_RMM_Control_Dataset.Days_Since_Last_Backup GT 1 AND
VALUE S3_RMM_Control_Dataset.Days_Since_Last_Backup LE 3
The number of days since the last backup of the DFSMSrmm CDS or Journal exceeded the warning threshold.

KS3_RMM_CDS_Space_Critical

If VALUE S3_RMM_Control_Dataset.RMM_Percent_Used GT 90
The percentage of space used by the DFSMSrmm CDS or Journal is greater than the critical threshold.

KS3_RMM_CDS_Space_Warning

If VALUE S3_RMM_Control_Dataset.RMM_Percent_Used GE 80 AND
VALUE S3_RMM_Control_Dataset.RMM_Percent_Used LE 90
The percentage of space used by the DFSMSrmm CDS or Journal is greater than the warning threshold.

KS3_RMM_Exit_Status_Critical

If ( ( VALUE S3_RMM_Config.EDGUX200_Status NE Enabled ) OR 
( VALUE S3_RMM_Config.EDGUX100_Status NE Enabled ) )
The DFSMSrmm EDGUX100 or EDGUX200 exit is not Enabled.
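The formula above joins two NE tests with OR, so the situation is true as soon as either exit is in any state other than Enabled. A minimal Python sketch of that logic follows. The function name is hypothetical, and "Disabled" in the usage note is only an assumed example of a status value other than Enabled.

```python
def rmm_exit_status_critical(edgux100_status: str, edgux200_status: str) -> bool:
    """True when either exit status is anything other than Enabled."""
    return edgux200_status != "Enabled" or edgux100_status != "Enabled"
```

For example, `rmm_exit_status_critical("Enabled", "Disabled")` is true, and the result is false only when both arguments are "Enabled".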

KS3_RMM_Journal_Status_Critical

If VALUE S3_RMM_Config.Journal_Status NE Enabled
The DFSMSrmm Journal is either Disabled or Locked. DFSMSrmm does not allow further updates to the journal until BACKUP is run to back up the DFSMSrmm control dataset and to clear the journal. If the Journal is Locked, DFSMSrmm fails any requests that result in an update to the DFSMSrmm control dataset. Message EDG2103D might also have been issued to the DFSMSrmm operator console.

KS3_RMM_Operating_Mode_Warning

If VALUE S3_RMM_Config.Operating_Mode NE Protect
DFSMSrmm is not operating in Protect mode. Certain actions that should be rejected are permitted if DFSMSrmm is not operating in protect mode, for example, attempting to read a scratch tape volume.

KS3_RMM_Scratch_Tape_Critical

If VALUE S3_RMM_Summary.Type EQ 0 AND 
VALUE S3_RMM_Summary.Scratch_Volumes LT 100
The number of Scratch volumes is below the critical threshold.

KS3_RMM_Scratch_Tape_Warning

If VALUE S3_RMM_Summary.Type EQ 0 AND 
VALUE S3_RMM_Summary.Scratch_Volumes LT 200 AND 
VALUE S3_RMM_Summary.Scratch_Volumes GE 100
The number of Scratch volumes is below the warning threshold.

KS3_RMM_Inactive_Critical

If VALUE S3_RMM_Config.Subsystem_Status EQ Inactive
The DFSMSrmm subsystem is inactive.

KS3_Stg_Toolkit_Result_Critical

If VALUE S3_Storage_Toolkit_Result_Summary.Return_Code GT 4
The batch job submitted by the Storage Toolkit to execute a command or user-defined JCL returns a value greater than 4, or the Storage Toolkit encountered an error while attempting to process a command or user-defined JCL. A value that is greater than 4, and is not specific to the Storage Toolkit, typically denotes that a command failed to complete. If you elected to save the results of the batch job, go to the Storage Toolkit Result Detail workspace to determine whether the error requires further attention.

KS3_Stg_Toolkit_Result_Warning

If VALUE S3_Storage_Toolkit_Result_Summary.Return_Code EQ 4
The batch job submitted by the Storage Toolkit to execute a command or user-defined JCL returns the value 4. A value of 4 typically denotes a warning. If you elected to save the results of the batch job, go to the Storage Toolkit Result Detail workspace to determine whether the warning requires further attention.
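The two Storage Toolkit result formulas above depend only on the Return_Code attribute. The following Python sketch restates them; the function name and labels are hypothetical.

```python
def classify_toolkit_return_code(return_code: int) -> str:
    """Map a Storage Toolkit Return_Code to the matching situation band."""
    if return_code > 4:
        # KS3_Stg_Toolkit_Result_Critical: GT 4
        return "critical"
    if return_code == 4:
        # KS3_Stg_Toolkit_Result_Warning: EQ 4
        return "warning"
    # Return codes below 4 match neither formula.
    return "none"
```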

KS3_Storage_Gr_Pct_Free_Crit

If VALUE S3_Volume_Group_Summary.Free_Space_Percent LT 5.0 AND
VALUE S3_Volume_Group_Summary.Group_Type EQ SMSGROUP AND
VALUE S3_Volume_Group_Summary.Free_Space_Percent GE 0.0
Monitors the percentage of free space available for allocation in the storage group and detects when free space has dropped below the critical threshold. To prevent allocation failures, you might have to either add one or more logical volumes to the storage group, or to move datasets off of the logical volumes in the storage group.

KS3_Storage_Gr_Pct_Free_Warning

If VALUE S3_Volume_Group_Summary.Free_Space_Percent LT 10.0 AND
VALUE S3_Volume_Group_Summary.Group_Type EQ SMSGROUP AND
VALUE S3_Volume_Group_Summary.Free_Space_Percent GE 5.0
Monitors the percentage of free space available for allocation in the storage group and detects when free space has dropped below the warning threshold. In order to prevent allocation failures, you might have to either add one or more logical volumes to the storage group, or to migrate datasets off of the logical volumes in the storage group.
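The storage group free space formulas above apply only to rows whose Group_Type is SMSGROUP, and the critical formula also sets a lower bound of 0.0. A Python sketch of the combined logic follows, with a hypothetical function name and labels. With the formulas as written, a negative Free_Space_Percent value satisfies neither situation.

```python
def classify_storage_group_free_space(group_type: str,
                                      free_space_percent: float) -> str:
    """Apply the KS3_Storage_Gr_Pct_Free_* formulas to one group row."""
    if group_type != "SMSGROUP":
        # Both formulas require Group_Type EQ SMSGROUP.
        return "none"
    if 0.0 <= free_space_percent < 5.0:
        # Critical: LT 5.0 AND GE 0.0
        return "critical"
    if 5.0 <= free_space_percent < 10.0:
        # Warning: LT 10.0 AND GE 5.0
        return "warning"
    return "none"
```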

KS3_TDS_Array_Degraded_Crit

If VALUE S3_TotalStorageDS_Array.RAID_Degraded EQ Yes
Monitors the arrays in a TotalStorageDS storage facility for a degraded condition where one or more arrays need rebuilding.

KS3_TDS_Array_Prob_Crit

If VALUE S3_TotalStorageDS_Configuration.Number_of_arrays_with_problems GT 0
Monitors for the condition where the number of arrays in the TotalStorageDS storage facility running degraded, throttled, or with an RPM exception exceeds the threshold. The RAID Degraded condition indicates that one or more DDMs in the array need rebuilding. The DDM Throttling condition indicates that a near-line DDM in the array is throttling performance due to temperature or workload. The RPM Exception condition indicates that a DDM with a slower RPM than the normal array DDMs is a member of the array as a result of a sparing action.

KS3_TDS_Array_RPM_Crit

If VALUE S3_TotalStorageDS_Array.RPM_Exception EQ Yes
Monitors the arrays in a TotalStorageDS for a condition where a DDM with a slower RPM than the normal array DDMs is a member of the array as a result of a sparing action.

KS3_TDS_Array_Throttled_Crit

If VALUE S3_TotalStorageDS_Array.DDM_Throttling EQ Yes
Monitors the arrays in a TotalStorageDS for a condition where the array is throttling performance due to overload or temperature.

KS3_TDS_ExtPool_Array_Prob_Crit

If VALUE S3_TotalStorageDS_Extent_Pool.Number_of_arrays_with_problems GT 0
Monitors for the condition where the number of arrays in the extent pool running degraded, throttled, or with an RPM exception exceeds the threshold. The RAID Degraded condition indicates that one or more DDMs in the array need rebuilding. The DDM Throttling condition indicates that a near-line DDM in the array is throttling performance due to temperature or workload. The RPM Exception condition indicates that a DDM with a slower RPM than the normal array DDMs is a member of the array as a result of a sparing action.

KS3_TDS_Rank_Array_Prob_Crit

If VALUE S3_TotalStorageDS_Rank.Number_of_arrays_with_problems GT 0
Monitors for the condition where the number of arrays in the rank running degraded, throttled, or with an RPM exception exceeds the threshold. The RAID Degraded condition indicates that one or more DDMs in the array need rebuilding. The DDM Throttling condition indicates that a near-line DDM in the array is throttling performance due to temperature or workload. The RPM Exception condition indicates that a DDM with a slower RPM than the normal array DDMs is a member of the array as a result of a sparing action.

KS3_Vol_Cache_DFW_Retry_Critical

If VALUE S3_Cache_Devices.DFW_Retry_Percent GE 2 AND
VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the percentage of DASD fast write attempts for a volume that cannot be satisfied due to a shortage of available nonvolatile storage (NVS) space exceeded the critical threshold. Check for pinned NVS and correct the problem if NVS is pinned. Otherwise, if the impact on performance is not acceptable, move a volume or dataset to another cache control unit or add NVS to this control unit.

KS3_Vol_Cache_DFW_Retry_Warning

If VALUE S3_Cache_Devices.DFW_Retry_Percent GE 1 AND
VALUE S3_Cache_Devices.DFW_Retry_Percent LT 2 AND
VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the percentage of DASD fast write attempts for a volume that cannot be satisfied due to a shortage of available nonvolatile storage (NVS) space exceeded the warning threshold. Check for pinned NVS and correct the problem if NVS is pinned. Otherwise, if the impact on performance is not acceptable, move a volume or dataset to another cache control unit or add NVS to this control unit.
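
The critical and warning formulas for DFW retry form two non-overlapping bands, both gated by a minimum I/O count. The following Python sketch shows that banding under the thresholds listed above; the function name and the None return value for "no situation" are illustrative choices, not product behavior.

```python
def dfw_retry_severity(dfw_retry_percent, io_count):
    """Classify a sample using the thresholds from the two DFW retry formulas."""
    if io_count < 25:
        # Both formulas require I/O_Count GE 25.
        return None
    if dfw_retry_percent >= 2:
        # KS3_Vol_Cache_DFW_Retry_Critical: DFW_Retry_Percent GE 2
        return "critical"
    if dfw_retry_percent >= 1:
        # KS3_Vol_Cache_DFW_Retry_Warning: GE 1 AND LT 2
        return "warning"
    return None


for sample in [(2.5, 100), (1.5, 100), (0.5, 100), (3.0, 10)]:
    print(sample, dfw_retry_severity(*sample))
```

Because the warning formula carries an explicit upper bound (LT 2), a single sample never matches both situations.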

KS3_Vol_Cache_Read_HitP_Critical

If VALUE S3_Cache_Devices.Read_Hit_Percent LE 45 AND
VALUE S3_Cache_Devices.Read_Hit_Percent GE 0 AND
VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the cache read hit percent is below the critical threshold. If performance is impacted, determine the reason for the low read hit percent. Common problems are cache-unfriendly applications and over-utilization of the control unit.

KS3_Vol_Cache_Read_HitP_Warning

If VALUE S3_Cache_Devices.Read_Hit_Percent LE 55 AND
VALUE S3_Cache_Devices.Read_Hit_Percent GT 45 AND
VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the cache read hit percent is below the warning threshold. If performance is impacted, determine the reason for the low read hit percent. Common problems are cache-unfriendly applications and over-utilization of the control unit.
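
The formula lines in this chapter can be read mechanically: each VALUE term compares one attribute to a literal, and the terms are joined with AND. The following Python sketch evaluates such a formula against a sample row. It is a reading aid only, not the product's evaluator: it handles numeric literals and AND only, and the evaluate function and the row dictionary are hypothetical.

```python
import operator
import re

# Comparison keywords as they appear in the formulas.
OPS = {
    "EQ": operator.eq,
    "NE": operator.ne,
    "GT": operator.gt,
    "GE": operator.ge,
    "LT": operator.lt,
    "LE": operator.le,
}

# One term: VALUE <table.attribute> <operator> <literal>
TERM = re.compile(r"VALUE\s+(\S+)\s+(EQ|NE|GT|GE|LT|LE)\s+(\S+)")


def evaluate(formula, row):
    """Return True when every AND-ed numeric term holds for the sample row."""
    terms = TERM.findall(formula)
    if not terms:
        raise ValueError("no VALUE terms found")
    return all(OPS[op](row[attr], float(literal)) for attr, op, literal in terms)


READ_HIT_WARNING = (
    "VALUE S3_Cache_Devices.Read_Hit_Percent LE 55 AND "
    "VALUE S3_Cache_Devices.Read_Hit_Percent GT 45 AND "
    "VALUE S3_Cache_Devices.I/O_Count GE 25"
)

row = {
    "S3_Cache_Devices.Read_Hit_Percent": 50,
    "S3_Cache_Devices.I/O_Count": 100,
}
print(evaluate(READ_HIT_WARNING, row))
```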

KS3_Vol_Cache_Writ_HitP_Critical

If VALUE S3_Cache_Devices.Write_Hit_Percent LE 20 AND
VALUE S3_Cache_Devices.Write_Hit_Percent GE 0 AND
VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the cache write hit percent for a volume is below the critical threshold. Check the status of the nonvolatile storage in the cache control unit. You can move volumes or datasets to balance the workload.

KS3_Vol_Cache_Writ_HitP_Warning

If VALUE S3_Cache_Devices.Write_Hit_Percent LE 30 AND
VALUE S3_Cache_Devices.Write_Hit_Percent GT 20 AND
VALUE S3_Cache_Devices.I/O_Count GE 25
Monitors for the condition where the cache write hit percent for a volume is below the warning threshold. Check the status of the nonvolatile storage in the cache control unit. You can move volumes or datasets to balance the workload.

KS3_Vol_Disabled_VTOC_Critical

If VALUE S3_DASD_Volume_Space.VTOC_Index_Status EQ Disabled
Monitors for the condition where a VTOC index has been disabled. This condition can degrade performance on the volume. Enable the VTOC index.

KS3_Vol_EAV_Fragment_Index_Crit

If VALUE S3_DASD_Volume_Space.Extended_Address_Volume EQ Yes AND
VALUE S3_DASD_Volume_Space.Track_Managed_Fragmentation_Index GE 850
The fragmentation index in the track-managed area of an Extended Address Volume exceeds the critical threshold.

KS3_Vol_EAV_Fragment_Index_Warn

If VALUE S3_DASD_Volume_Space.Extended_Address_Volume EQ Yes AND
VALUE S3_DASD_Volume_Space.Track_Managed_Fragmentation_Index GE 650 AND
VALUE S3_DASD_Volume_Space.Track_Managed_Fragmentation_Index LT 850
The fragmentation index in the track-managed area of an Extended Address Volume exceeds the warning threshold.

KS3_Vol_EAV_Free_Space_Pct_Crit

If VALUE S3_DASD_Volume_Space.Track_Managed_Percent_Free LE 5.0 AND
VALUE S3_DASD_Volume_Space.Track_Managed_Percent_Free GE 0.0 AND
VALUE S3_DASD_Volume_Space.Extended_Address_Volume EQ Yes
The percentage of free space in the track-managed area of an Extended Address Volume is below the critical threshold.

KS3_Vol_EAV_Free_Space_Pct_Warn

If VALUE S3_DASD_Volume_Space.Track_Managed_Percent_Free LE 10.0 AND
VALUE S3_DASD_Volume_Space.Track_Managed_Percent_Free GT 5.0 AND
VALUE S3_DASD_Volume_Space.Extended_Address_Volume EQ Yes
The percentage of free space in the track-managed area of an Extended Address Volume is below the warning threshold.

KS3_Vol_Fragment_Index_Critical

If VALUE S3_DASD_Volume_Space.Fragmentation_Index GE 850
Monitors for the condition where a volume has a fragmentation index that exceeds the critical threshold. Defragment the volume so that free extents are combined to help prevent dataset allocation failures.

KS3_Vol_Fragment_Index_Warning

If VALUE S3_DASD_Volume_Space.Fragmentation_Index GE 650 AND
VALUE S3_DASD_Volume_Space.Fragmentation_Index LT 850
Monitors for the condition where a volume has a fragmentation index that exceeds the warning threshold. Defragment the volume so that free extents are combined to help prevent dataset allocation failures.

KS3_Vol_Free_Space_Pct_Critical

If VALUE S3_DASD_Volume_Space.Percent_Free_Space LE 5 AND
VALUE S3_DASD_Volume_Space.Percent_Free_Space GE 0
Monitors for the condition where the percentage of free space on a volume is below the critical threshold. If datasets on the volume require more space, then either migrate some datasets to another volume or release space from datasets that might be over-allocated.

KS3_Vol_Free_Space_Pct_Warning

If VALUE S3_DASD_Volume_Space.Percent_Free_Space LE 10 AND
VALUE S3_DASD_Volume_Space.Percent_Free_Space GT 5
Monitors for the condition where the percentage of free space on a volume is below the warning threshold. If datasets on the volume require more space, then either migrate some datasets to another volume or release space from datasets that might be over-allocated.
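
The two volume free-space formulas each carry a lower bound as well as an upper bound, so a value below zero matches neither situation. The following Python sketch shows the resulting bands; the function name is hypothetical, and the sketch makes no assumption about why a negative value might be reported.

```python
def free_space_severity(percent_free):
    """Classify a volume using the bounds from the two free-space formulas."""
    if 0 <= percent_free <= 5:
        # KS3_Vol_Free_Space_Pct_Critical: LE 5 AND GE 0
        return "critical"
    if 5 < percent_free <= 10:
        # KS3_Vol_Free_Space_Pct_Warning: LE 10 AND GT 5
        return "warning"
    return None


for value in [3, 5, 7.5, 10, 10.1, -1]:
    print(value, free_space_severity(value))
```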

KS3_Vol_Perf_Resp_Time_Critical

If VALUE S3_DASD_Volume_Performance.Response_Time GE 55 AND
VALUE S3_DASD_Volume_Performance.I/O_Count GE 25
Monitors for the condition where response time for the volume exceeds the critical threshold. Look at the volume to see whether high utilization is a problem. If so, you might need to migrate datasets from the volume to reduce utilization. Also check the cache status of the volume. Look at the components of I/O to determine where the time is being spent and address the problem accordingly.

KS3_Vol_Perf_Resp_Time_Warning

If VALUE S3_DASD_Volume_Performance.Response_Time GE 35 AND
VALUE S3_DASD_Volume_Performance.Response_Time LT 55 AND
VALUE S3_DASD_Volume_Performance.I/O_Count GE 25
Monitors for the condition where response time for the volume exceeds the warning threshold. Look at the volume to see whether high utilization is a problem. If so, you can migrate datasets from the volume to reduce utilization. Also check the cache status of the volume. Look at the components of I/O to determine where the time is being spent and address the problem accordingly.

KS3_VTS_Disconnect_Time_Crit

If VALUE S3_VTS_Overview.Virtual_Disconnect_Time GE 500
Monitors for the condition where the logical control unit disconnect time for the virtual tape server exceeds the critical threshold. This condition is often an indication that the tape volume cache capacity is being exceeded.

KS3_VTS_Host_GB_Warning

If VALUE S3_VTS_Overview.Host_Channel_Activity_GB GE 18
Monitors for the condition where the activity between the MVS system and the virtual tape server on the host channels reaches or exceeds the warning threshold of 18 GB over the hour interval. This condition can be an indication that the virtual tape server is being overloaded.

KS3_VTS_Pct_Copy_Throt_Warn

If VTSTPVOLC.PCTCPT GT 50
Monitors for the condition where copy is the predominant reason for throttling.

KS3_VTS_Pct_Wr_Over_Throt_Warn

If VTSTPVOLC.PCTWROT GT 50
Monitors for the condition where write overrun is the predominant reason for throttling.

KS3_VTS_Recall_Pct_Warning

If VALUE S3_VTS_Overview.Volume_Recall_Percent GE 20
Monitors for the condition where the percent of virtual tape mounts that required a physical tape mount to be satisfied exceeded the warning threshold. This condition can lead to unacceptably large virtual mount times. If so, then investigate the reason for the recalls. If rescheduling or removing the application workload is not possible, you need to increase the cache capacity of the VTS.

KS3_VTS_Virt_MtPend_Av_Warning

If VALUE S3_VTS_Overview.Average_Virtual_Mount_Pend_Time GE 300
Monitors for the condition where the average number of seconds required to satisfy a virtual mount in the virtual tape subsystem exceeded the warning threshold. If this condition persists, then further study is required to determine the cause of the elongated mount times. The condition might be due to VTS-hostile applications or to a shortage of VTS resources.

KS3_VTS_Virt_MtPend_Mx_Warning

If VALUE S3_VTS_Overview.Maximum_Virtual_Mount_Pend_Time GE 900
Monitors for the condition where the maximum number of seconds required to satisfy a virtual mount in the virtual tape subsystem exceeded the warning threshold. If this condition persists, then further study is required to determine the cause of the elongated mount times. The condition might be due to VTS-hostile applications or to a shortage of VTS resources.