Configuring thresholds for heuristic host blocking

Configure the thresholds that determine when hosts are blocked in your cluster.

Procedure

  1. Open the application profile in an XML editor.
  2. In the SOAM > RetriedTaskAndBlockedHostCtrl section of the application profile, configure the following parameters:
    • heuristicHostBlockMonitorPeriod: Specifies the period, in minutes, over which the SSM monitors the number of task failures (Symphony workload) or the task failure factor (MapReduce workload) on each host. For example, if the monitoring period is set to 180 minutes (3 hours), the SSM monitors task failures in the last 3 hours on each host. The value must be a positive integer. The default value is 180 minutes. A value of 0 resets the monitoring period to its default value. Data that is older than heuristicHostBlockMonitorPeriod is deleted.
    • heuristicHostFailureFactorThreshold: Specifies the number of task failures allowed per slot on a host for Symphony workload and the task failure factor allowed per slot on a host for MapReduce workload. The value must be a positive integer. The default value is 3 for Symphony workload and 1 for MapReduce workload. A value of 0 resets the threshold to its default value.
    • heuristicHostFailureFactorPercentThreshold (Applies only to MapReduce workload): Specifies by how much, as a fraction of the average, the number of task failures on a host must exceed the average number of task failures for all hosts assigned to the application before the host is blocked. For example, if the threshold is set to 0.5 (50%), a host is blocked only when its number of task failures exceeds the average for all hosts assigned to the application by 50%. The value must be greater than 0. The default value is 0.5, which is equivalent to 50%. A value of 0 resets the threshold to its default value.
    • heuristicHostBlockJoinPeriod (Applies only to MapReduce workload): Specifies the duration (in minutes) of a host’s failure factor that must be counted into the average value after the host joins the application manager (SSM).
    Note: For MapReduce applications, if you defined these settings as variables in SOAM > SSM > osTypes > osType > env, the values in SOAM > RetriedTaskAndBlockedHostCtrl take precedence.
  3. Save your changes and register the application using the soamreg command.
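
The percentage comparison described for heuristicHostFailureFactorPercentThreshold can be illustrated with a short calculation. The following Python sketch is only an illustration of the arithmetic in the 0.5 (50%) example above. It is not the SSM implementation, and the function name, host names, and failure counts are hypothetical.

```python
# Illustrative sketch only: mirrors the "exceeds the average by 50%" example.
# The function name, host names, and counts are hypothetical, not product APIs.

def exceeds_percent_threshold(host_failures, all_host_failures, percent_threshold=0.5):
    """Return True if host_failures exceeds the average of all_host_failures
    by more than percent_threshold (0.5 means 50%)."""
    average = sum(all_host_failures) / len(all_host_failures)
    limit = average * (1 + percent_threshold)
    return host_failures > limit


# Hypothetical task failure counts for the hosts assigned to one application.
failures = {"hostA": 2, "hostB": 3, "hostC": 7}

# Collect the hosts whose counts exceed the average by more than 50%.
candidates = [
    host
    for host, count in failures.items()
    if exceeds_percent_threshold(count, list(failures.values()))
]
print(candidates)
```

With these sample counts the average is 4 failures, so the limit at a threshold of 0.5 is 6 failures, and only hostC (7 failures) exceeds it. The real blocking decision also depends on the other parameters in this section, such as heuristicHostFailureFactorThreshold and heuristicHostBlockMonitorPeriod, which this sketch does not model.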