Defining step timeout behavior

Updated 1/17/14, 4:41 AM by DaraMurphy

When you create an Automation Plan, you can define timeout settings for each step in the plan. These settings, known as step timeout behavior, let you handle unresponsive endpoints when the plan runs by controlling how long the system waits for an action to complete on the endpoints. If the length of time that you specify elapses and the action for the step is still in a wait state, the system automatically stops the action. The system then either continues with the next step or stops the Automation Plan altogether, depending on the options that you select when you configure the step timeout behavior. By default, step timeout behavior is not set, which means that the step waits indefinitely for all endpoints to report a success or failure result.

How it works

The step timeout feature lets you define a period of time for the system to wait while it processes a step on endpoints. When the timeout period that you set elapses, the system times out the action on unresponsive endpoints.

A timeout is different from a failure. A timeout state can be any state other than success or failure. Any of the following states, or no state at all, is regarded as a timeout if the endpoint is still reporting that status when the timeout period that you specified expires:
  • RUNNING
  • EVALUATING
  • WAITING
  • POSTPONED
  • PENDING_DOWNLOADS
  • PENDING_RESTART
  • PENDING_MESSAGE
  • PENDING_LOGIN
  • <Not reported>
However, after a timeout occurs, the action might still complete successfully on the endpoints on which it has timed out. The timeout is simply reflecting the status of the action at the point at which the timeout period elapses.
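The classification rule above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the state names come from the list above, but the strings SUCCESS and FAILURE and the function name classify_at_expiry are assumptions made for the example.

```python
# States listed above; None stands in for "<Not reported>".
LISTED_TIMEOUT_STATES = [
    "RUNNING", "EVALUATING", "WAITING", "POSTPONED",
    "PENDING_DOWNLOADS", "PENDING_RESTART", "PENDING_MESSAGE",
    "PENDING_LOGIN", None,
]

# Hypothetical names for the two final results; the real strings may differ.
SUCCESS = "SUCCESS"
FAILURE = "FAILURE"


def classify_at_expiry(status):
    """Classify the status an endpoint reports when the timeout period expires."""
    if status == SUCCESS:
        return "success"
    if status == FAILURE:
        return "failure"
    # Any other state, or no report at all, counts as a timeout.
    return "timeout"


if __name__ == "__main__":
    for state in LISTED_TIMEOUT_STATES:
        print(state, "->", classify_at_expiry(state))
```

The result describes the endpoint only at the moment of expiry. An endpoint classified as a timeout here might still finish the action later.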

As described in Defining step failure behavior settings, failure steps run only when a step action fails on one or more endpoints. Therefore, if a step succeeds on some endpoints and times out on the others, any failure step associated with the step does not run. Timed out targets are never targeted by failure steps, because the system cannot determine whether a timed out action has a failure to remediate. If a timeout occurs on a step that has an associated failure step with a FAILED ONLY targeting policy, only the endpoints that reported failure are targeted. If the targeting policy is set to ALL, all endpoints that did not time out are targeted by the failure step.

When a step has a combination of failed and timed out targets, the step failure behavior settings are processed rather than the timeout settings. Whether endpoints are included in or excluded from future steps is determined by the step result: failed targets are processed according to the step failure behavior settings, and timed out targets are processed according to the step timeout behavior. If the step has a failure step defined and the targeting policy of that failure step is FAILED ONLY, only the targets that reported a failure are targeted by the failure step. Regardless of the targeting policy, timed out endpoints are not targeted by the failure step in this scenario.
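The targeting rules for failure steps can be summarized with a small Python sketch. It assumes each endpoint has already been given one of three results (success, failure, or timeout). The function name failure_step_targets and the dictionary layout are hypothetical and chosen only for illustration.

```python
def failure_step_targets(results, policy):
    """Return the endpoints that a failure step would target.

    results maps an endpoint name to "success", "failure", or "timeout".
    policy is the targeting policy of the failure step: "FAILED ONLY" or "ALL".
    """
    failed = [name for name, result in results.items() if result == "failure"]
    if not failed:
        # A failure step runs only if at least one endpoint reported a failure.
        return []
    if policy == "FAILED ONLY":
        return failed
    if policy == "ALL":
        # Timed out endpoints are never targeted by a failure step.
        return [name for name, result in results.items() if result != "timeout"]
    raise ValueError("unknown targeting policy: " + repr(policy))


if __name__ == "__main__":
    results = {"host-a": "success", "host-b": "failure", "host-c": "timeout"}
    print(failure_step_targets(results, "FAILED ONLY"))
    print(failure_step_targets(results, "ALL"))
```

In this example, host-c is left out under both policies because it timed out.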

You can also control what happens after the timeout period elapses: the Automation Plan either stops or continues. If you choose to continue the Automation Plan, you can continue it on all endpoints or only on the endpoints on which the step was successful. This setting is processed only if the timeout period elapses and no failures have been reported.

When you add a step to your Automation Plan, you can define the step timeout behavior by moving to the Settings tab and selecting the Enable Step Timeout check box. To specify the length of time for the timeout, enter a time in the Step Timeout field. Define whether you want to continue or stop the Automation Plan by choosing one of the following options.
Table 1. Defining step timeout behavior
Option | Description
Stop Automation Plan | Select this option to stop the Automation Plan if there are one or more timed out targets. If one or more failures are also reported, the system first runs any associated failure step, according to the step failure behavior. If there are no failures, no failure step runs. If there is a failure on one or more endpoints, the choice that you make here is superseded by the step failure behavior settings.
Continue Automation Plan | Select this option to move on to the next step in the Automation Plan if there are one or more timed out targets. If one or more failures are also reported, the system first runs any associated failure step. If there are no failures, no failure step runs. If there is a failure on one or more endpoints, the choice that you make here is superseded by the step failure behavior settings.
If you want to continue the Automation Plan, decide if you want the endpoints on which the step timed out to be included or excluded from future steps. If you select Include in Future Steps, the Automation Plan continues on all endpoints. If you choose Exclude from Future Steps, the endpoints on which the step timed out are excluded from all future steps in the Automation Plan.
Table 2. Defining timeout targets
Option | Description
Include in Future Steps | Select this option if you want the endpoints on which the step timed out to be included in future steps in the Automation Plan.
Exclude from Future Steps | Select this option if you want the endpoints on which the step timed out to be excluded from all future steps.
As mentioned above, a timeout is not the same result as a failure. When a step fails, the step action has run, but not successfully. When a step times out, the step action might not have run at all, for example because the target is powered off. Therefore, a timeout is managed differently from a failure. The option that you select to include or exclude targets on which the step timed out applies only to those targets. For endpoints on which the step fails, the step failure behavior settings control whether those endpoints are included in future steps in the Automation Plan. If a step times out on some targets and no endpoint has reported a failure for that step by the time the timeout period elapses, a failure step, if set, does not run for that step.
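The options in Table 1 and Table 2 combine into a simple decision, sketched here in Python. The sketch covers only the case where no failures were reported, because the step failure behavior settings take over otherwise. The function name targets_after_timeout and the option strings "stop", "continue", "include", and "exclude" are hypothetical stand-ins for the check boxes and options described above.

```python
def targets_after_timeout(results, action, timed_out_targets="include"):
    """Return the endpoints for the next step, or None if the plan stops.

    results maps an endpoint name to "success" or "timeout".
    action is "stop" or "continue" (Table 1).
    timed_out_targets is "include" or "exclude" (Table 2).
    """
    if any(result == "failure" for result in results.values()):
        raise ValueError("failures reported: step failure behavior settings apply")
    if all(result != "timeout" for result in results.values()):
        # Nothing timed out, so the timeout settings are not processed.
        return list(results)
    if action == "stop":
        return None
    if action != "continue":
        raise ValueError("unknown action: " + repr(action))
    if timed_out_targets == "include":
        return list(results)
    if timed_out_targets == "exclude":
        return [name for name, result in results.items() if result != "timeout"]
    raise ValueError("unknown option: " + repr(timed_out_targets))


if __name__ == "__main__":
    results = {"host-a": "success", "host-b": "timeout"}
    print(targets_after_timeout(results, "stop"))
    print(targets_after_timeout(results, "continue", "include"))
    print(targets_after_timeout(results, "continue", "exclude"))
```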

If you open a legacy Automation Plan that was created before the step timeout feature was introduced, the default settings are applied to that Automation Plan, so no timeout is set.

Tracking timed-out Automation Plan actions

To view actions that timed out, go to the Automation Plan Action Status dashboard. Timed out actions have a status of Timed out. If an action times out and you have configured the Automation Plan to continue, the status displayed is Timed out but continued. The following screen shows an example of a timed out action in the Automation Plan Action Status dashboard.

Figure: The Automation Plan Action Status dashboard, with the Timed out status shown in the Status column.

The Automation Plan Action Status dashboard shows the status of the step at the point at which the timeout period that you set elapses. The action might subsequently succeed or fail on endpoints on which the step timed out, but the dashboard continues to display the timeout status. A timed out action normally requires investigation, and because you configured timeout settings, the timeout status is treated as the primary status in Server Automation.

There might be a lag between the timeout period that you specify and when the timeout actually occurs, because the timeout check is completed once per Automation Plan queue poll operation. For example, if you specify a timeout period of 20 minutes, the timeout might occur slightly later than 20 minutes after the step action starts.
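The size of this lag can be estimated with a short calculation, sketched here in Python. The sketch assumes that the queue is polled at a fixed interval and that polls are counted from the moment the step action starts. Neither the interval nor that alignment is stated in this topic, so the function detected_timeout_minutes and its parameters are purely illustrative.

```python
import math


def detected_timeout_minutes(timeout_minutes, poll_interval_minutes):
    """Return the time of the first poll at or after the timeout deadline.

    Assumes fixed-interval polling that starts when the step action starts.
    """
    polls_needed = math.ceil(timeout_minutes / poll_interval_minutes)
    return polls_needed * poll_interval_minutes


if __name__ == "__main__":
    # With a hypothetical 3-minute poll interval, a 20-minute timeout
    # is first noticed at the poll that runs at 21 minutes.
    print(detected_timeout_minutes(20, 3))
```

Under these assumptions the delay is always shorter than one poll interval.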

Note also that the step is not processed until the timeout period that you specify elapses. For example, if you specify a timeout period of 20 minutes and the step fails immediately on some endpoints but times out on one endpoint after the 20 minutes elapse, the step is not processed until the full 20 minutes have elapsed.