StopCommand

  1. If the StopCommand was able to stop the resource, it should return 0 to indicate that the resource was stopped properly and should go offline within the next few seconds.
  2. There is no mechanism to handle a failed StopCommand in IBM Tivoli System Automation for Multiplatforms. The StopCommand may indicate a failing stop of the application by returning a value other than 0, but this will not trigger an automation action.
    If the resource does not reach the OpState Offline after a certain amount of time, IBM Tivoli System Automation for Multiplatforms issues a reset operation against the resource, which results in a second invocation of the StopCommand. The StopCommand script can detect this second invocation by checking the SA_RESET environment variable, which is set to 1 in that case. A StopCommand can honor this, as in the following example:
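    #!/bin/sh
    # A sample stop/reset automation script for the lpd application
    # SA_RESET is quoted so that the test also works when the variable is unset
    if [ "$SA_RESET" = 1 ]; then
      # Reset invocation: force-kill the application
      killall -9 lpd
      exit $?
    else
      # First invocation: try a regular stop
      /etc/init.d/lpd stop
      exit $?
    fi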
    If a second execution of the StopCommand is not desired, the script could just exit, for example:
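    #!/bin/sh
    # A sample stop/reset automation script for the lpd application
    # SA_RESET is quoted so that the test also works when the variable is unset
    if [ "$SA_RESET" = 1 ]; then
      # Reset invocation: do nothing and report success
      exit 0
    else
      /etc/init.d/lpd stop
      exit $?
    fi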

    By evaluating the SA_RESET environment variable, a StopCommand can implement an enhanced stop behavior. If the second StopCommand also fails to bring the operational state of the resource to Offline, the OpState of the resource is set to Stuck Online. At this point, operator intervention is required to stop the resource. After the resource has finally stopped and the OpState has changed to Offline, IBM Tivoli System Automation for Multiplatforms will control this resource again.

  3. The variable SA_RESET is set to 1 when one of the following applies:
    • The StopCommand is executed the second time.
    • The StopCommand is invoked due to a resource reset using resetrsrc.
    • A resource is reset after an unsuccessful start attempt. This happens if the StartCommand returned 0 (successful) but the MonitorCommand does not report the resource as Online. In this case, System Automation for Multiplatforms initiates a reset (StopCommand with SA_RESET=1) before executing the next start attempt.
  4. If the StopCommand of a resource is not able to complete within the StopCommandTimeout, the resource manager will kill the command and treat the stop like a failing StopCommand as described under item 2.
  5. A non-zero return code of the StopCommand or reset operation will result in the resource being set to the automation state 'Problem'. This situation can only be recovered by a successful execution of resetrsrc against this resource. Since the execution of resetrsrc against a resource triggers the StopCommand of the resource, the StopCommand needs to run successfully as well (that is, it must deliver a return code of zero). In general, it is recommended that the StopCommand of a resource return a non-zero return code only if there is a real problem stopping the resource. In all other situations, the StopCommand should return zero in order to prevent the resource from ending up in the 'Problem' state.
  6. If the StopCommand was valid when the resource was defined but is later removed or unavailable (for example, because of a missing NFS mount), the stop procedure is treated like a failing StopCommand as described under item 2.
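
The points in items 2, 4, and 5 can be combined in one script. The following is a minimal sketch, not a product-supplied script: the names APP_NAME, STOP_CMD, WAIT_SECS and all function names are hypothetical, and it assumes that pgrep and pkill are available on the node. It checks SA_RESET, waits for a bounded time so that the script can finish before the StopCommandTimeout expires, and returns non-zero only when the application is still running at the end.

```shell
#!/bin/sh
# Hypothetical stop/reset script sketch. APP_NAME, STOP_CMD, and WAIT_SECS are
# illustrative placeholders, not names defined by the product.
APP_NAME="${APP_NAME:-lpd}"
STOP_CMD="${STOP_CMD:-/etc/init.d/lpd stop}"
WAIT_SECS="${WAIT_SECS:-10}"   # assumption: chosen well below the StopCommandTimeout

# True if a process with exactly this name exists (assumes pgrep is installed).
app_running() {
    pgrep -x "$APP_NAME" >/dev/null 2>&1
}

# Regular stop, used on the first invocation.
graceful_stop() {
    $STOP_CMD
}

# Forced stop, used when SA_RESET is 1 (assumes pkill is installed).
force_kill() {
    pkill -9 -x "$APP_NAME"
}

# Poll once per second; give up after WAIT_SECS seconds.
wait_for_exit() {
    i=0
    while app_running; do
        [ "$i" -ge "$WAIT_SECS" ] && return 1
        sleep 1
        i=$((i + 1))
    done
    return 0
}

stop_resource() {
    # Nothing to stop: report success so the resource does not enter 'Problem'.
    app_running || return 0
    if [ "${SA_RESET:-0}" = 1 ]; then
        force_kill
    else
        graceful_stop
    fi
    # The result depends only on whether the application is really gone.
    wait_for_exit
}

# In a real StopCommand script the last lines would be:
#   stop_resource
#   exit $?
```

In this sketch the return codes of the stop and kill commands are deliberately ignored. Only an application that is still running after the wait produces a non-zero return code, which follows the recommendation in item 5.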