Configuring thresholds for heuristic host blocking
Configure the thresholds that determine when hosts are blocked in your cluster.
Procedure
- Open the application profile in an XML editor.
- In the SOAM > RetriedTaskAndBlockedHostCtrl section of the application profile, configure the following
parameters:
- heuristicHostBlockMonitorPeriod: Specifies the period, in minutes, over which the SSM monitors the number of task failures (for Symphony workload) and the task failure factor (for MapReduce workload) on each host. For example, if the monitoring period is set to 180 minutes (3 hours), the SSM monitors task failures in the last 3 hours on each host. The value must be a positive integer, expressed in minutes. The default value is 180 minutes. A value of 0 resets the monitoring period to its default value. Data that is older than heuristicHostBlockMonitorPeriod is deleted.
- heuristicHostFailureFactorThreshold: Specifies the number of task failures allowed per slot on a host for Symphony workload and the task failure factor allowed per slot on a host for MapReduce workload. The value must be a positive integer. The default value is 3 for Symphony workload and 1 for MapReduce workload. A value of 0 resets the threshold to its default value.
- heuristicHostFailureFactorPercentThreshold (Applies only to MapReduce workload): Specifies the percentage by which a host's task failures must exceed the average number of task failures for all hosts assigned to the application before the host is blocked. For example, if the threshold is set to 0.5 (50%), a host's failure count must exceed the average of all hosts assigned to the application by 50% before the host is blocked. The value must be greater than 0. The default value is 0.5, which is equivalent to 50%. A value of 0 resets the threshold to its default value.
- heuristicHostBlockJoinPeriod (Applies only to MapReduce workload): Specifies the duration (in minutes) of a host’s failure factor that must be counted into the average value after the host joins the application manager (SSM).
Note: For MapReduce applications, if you defined these settings as variables in SOAM > SSM > osTypes > osType > env, the values in SOAM > RetriedTaskAndBlockedHostCtrl take precedence.
- Save your changes and register the application using the soamreg command.
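The arithmetic behind heuristicHostFailureFactorPercentThreshold, and the rule that a value of 0 resets a parameter to its default, can be illustrated with a short Python sketch. It is not Symphony code: the function names and the sample host counts are hypothetical, and it models only the percentage comparison as described above, not how the SSM combines this check with the other thresholds.

```python
from statistics import mean

# Default quoted in the parameter description above (0.5 is equivalent to 50%).
DEFAULT_PERCENT_THRESHOLD = 0.5


def effective_value(configured, default):
    """Return the value in effect: 0 resets the parameter to its default.

    Hypothetical helper; negative values are rejected because the
    documented parameters must be positive.
    """
    if configured < 0:
        raise ValueError("threshold values must not be negative")
    return default if configured == 0 else configured


def exceeds_percent_threshold(host_failures, all_host_failures, percent_threshold):
    """Check whether one host's failures exceed the average by the threshold.

    all_host_failures holds the failure counts of every host assigned to
    the application, including the host being checked (an assumption
    based on the wording "all hosts assigned to the application").
    """
    threshold = effective_value(percent_threshold, DEFAULT_PERCENT_THRESHOLD)
    average = mean(all_host_failures)
    # "Exceeds the average by 50%" is read here as strictly greater
    # than average * 1.5.
    return host_failures > average * (1 + threshold)


if __name__ == "__main__":
    # Hypothetical failure counts collected over the monitoring period.
    failures = {"hostA": 2, "hostB": 2, "hostC": 8}
    counts = list(failures.values())
    for host, count in failures.items():
        print(host, exceeds_percent_threshold(count, counts, 0.5))
```

With these sample counts the average is 4, so under a threshold of 0.5 a host needs more than 6 failures to meet the percentage criterion; only hostC does.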