Enabling automatic failover

To enable automatic failover, configure one or more backup engines, and set the related global options using the optman command, so that when the active master becomes unavailable, a long-term switchmgr operation is triggered.

Ensure that the master domain manager and the backup master domain managers were installed using the same user (UID) and group (GID).
If you upgraded from Version 9.5 or 9.5 Fix Pack 1, the automatic failover feature is disabled, but you can enable it by following the steps in this task. For a fresh installation of Version 9.5 Fix Pack 2 or later, automatic failover is enabled by default, and any backup master domain manager installed and configured with Fix Pack 2 is an eligible backup. If you subsequently disabled the feature, use this procedure to re-enable it. You can also use this procedure to define a list of preferred eligible backups, which excludes any backup master domain managers that you do not want considered, and to configure a separate list of potential backups for the event processor.
  1. Ensure the local option, mm resolve master, in the localopts file, is set to no on both the master domain manager and on all eligible backup master domain managers.
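
The check in step 1 can be scripted on each workstation. The following is a minimal sketch, assuming that localopts stores the option as a line of the form "mm resolve master = value" and that comment lines start with #. The function name check_mm_resolve_master is hypothetical, and the path to localopts is supplied by the caller.

```shell
#!/bin/sh
# Sketch: report whether "mm resolve master" is set to "no" in a localopts file.
# The file path is passed by the caller; no product-specific location is assumed.
check_mm_resolve_master() {
    localopts_file=$1
    # Take the last uncommented "mm resolve master" line and extract its value.
    value=$(grep -i '^[[:space:]]*mm resolve master' "$localopts_file" \
        | tail -n 1 \
        | sed 's/^[^=]*=[[:space:]]*//; s/[[:space:]]*$//' \
        | tr '[:upper:]' '[:lower:]')
    if [ "$value" = "no" ]; then
        echo "OK: mm resolve master = no"
        return 0
    fi
    echo "CHECK: mm resolve master is '${value:-unset}' in $localopts_file"
    return 1
}
```

Run the function against the localopts file on the master domain manager and on every eligible backup master domain manager. A nonzero return code indicates a workstation that still needs the option changed.
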
  2. Optional. Define a list of potential backups for the master domain manager and the event manager.
    1. Update the global option, workstationMasterListInAutomaticFailover, on the master domain manager to specify the workstations to be considered as eligible backups for the master domain manager. Set the value to a comma-separated list of workstations, with your preferred choices first. The list includes the current master domain manager. If no workstations are specified in this list, the first backup master domain manager to detect that the master is down performs the switch.
    2. Specify potential backups for the event processor by editing the value for the workstationEventMgrListInAutomaticFailover global option. Add a list of workstations, separated by commas, starting with your preferred choices at the top of the list. The list includes the current event manager workstation.
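
Because the order of the workstations expresses preference, it can help to assemble the list explicitly before changing the option. The following is a minimal sketch that only prints the optman chg command for review and does not run it. The function name build_failover_list_cmd and the workstation names BKM1, BKM2, and MDM are hypothetical.

```shell
#!/bin/sh
# Sketch: print (do not run) the optman chg command for a preference-ordered list.
# Arguments: the global option name, followed by the workstations in order.
build_failover_list_cmd() {
    option=$1
    shift
    # Join the remaining arguments with commas, preserving their order.
    list=$(IFS=,; printf '%s' "$*")
    printf 'optman chg %s=%s\n' "$option" "$list"
}
```

For example, build_failover_list_cmd workstationMasterListInAutomaticFailover BKM1 BKM2 MDM prints a command that lists BKM1 as the first choice. Review the printed command, then run it on the master domain manager. The same function can be used with workstationEventMgrListInAutomaticFailover.
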
  3. Set the following global options to "yes": enAutomaticFailover | af and enAutomaticFailoverActions | aa using the optman chg command. For example:
    optman chg af=yes
    optman chg aa=yes
  4. Restart WebSphere Application Server Liberty Base.
    Important: Complete each of the remaining steps only if the configuration it describes is not already present in your environment.
  5. If not already present on the master domain manager, create a new workstation with the following specifications:
    • Type: Extended Agent
    • Access method: unixlocl
    • Host: $MASTER
    For example, if you create a workstation named MDM_XA with these specifications, the workstation definition is as follows:
    CPUNAME MDM_XA
      DESCRIPTION "Workload Scheduler Virtual Master"
      OS OTHER
      NODE mdm_xa TCPADDR 31111
      FOR MAESTRO HOST $MASTER ACCESS "unixlocl"
        TYPE X-AGENT
        AUTOLINK OFF
        BEHINDFIREWALL OFF
        FULLSTATUS OFF
    END
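
If you create the workstation from the command line, you can generate the definition text from a few parameters and then load it with composer. The following is a minimal sketch that reproduces the definition shown above. The function name make_xa_definition and the output file name are hypothetical, and the node name and port are whatever values apply in your environment.

```shell
#!/bin/sh
# Sketch: write an extended agent workstation definition to standard output.
# Arguments: workstation name, node name, TCP port (all supplied by the caller).
make_xa_definition() {
    ws_name=$1
    node=$2
    port=$3
    # \$MASTER is escaped so that the literal keyword $MASTER is written.
    cat <<EOF
CPUNAME $ws_name
  DESCRIPTION "Workload Scheduler Virtual Master"
  OS OTHER
  NODE $node TCPADDR $port
  FOR MAESTRO HOST \$MASTER ACCESS "unixlocl"
    TYPE X-AGENT
    AUTOLINK OFF
    BEHINDFIREWALL OFF
    FULLSTATUS OFF
END
EOF
}
```

For example, make_xa_definition MDM_XA mdm_xa 31111 > mdm_xa.txt writes the definition to a file, which you can then review and load with composer add mdm_xa.txt.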
    
  6. Set the FINAL and FINALPOSTREPORTS job streams on the master domain manager to "draft". Draft job streams are not added to the preproduction plan.
    composer mod js=FINAL
    composer mod js=FINALPOSTREPORTS
    For example, the following is an extract from the definition for the FINAL job stream:
    SCHEDULE MDM#FINAL
    DESCRIPTION "Added by composer."
    DRAFT
    ON RUNCYCLE RC1 "FREQ=DAILY;"
    AT 2359
    CARRYFORWARD
    FOLLOWS MDM#FINAL.SWITCHPLAN PREVIOUS
    :
    
    The following example is an extract from the definition for the FINALPOSTREPORTS job stream:
    SCHEDULE MDM#FINALPOSTREPORTS
    DESCRIPTION "Added by composer."
    DRAFT
    ON RUNCYCLE RC1 "FREQ=DAILY;"
    SCHEDTIME 2359
    CARRYFORWARD
    FOLLOWS MDM_XA#FINAL.SWITCHPLAN PREVIOUS
    :
    
  7. If not already present, make the following changes to the Sfinal file:
    1. Create a backup of the Sfinal file. For example:
      cp /<TWA_home>/TWS/Sfinal /<TWA_home>/TWS/Sfinal.orig
    2. Add the new extended agent workstation to the FINAL and FINALPOSTREPORTS job stream definitions.
    3. Substitute the SCRIPTNAME keyword with DOCOMMAND in all of the jobs defined in the FINAL and FINALPOSTREPORTS job streams.
    4. Ensure that the paths to the scripts launched by the jobs in the FINAL and FINALPOSTREPORTS job streams use the UNISONHOME variable.
    5. Submit the composer add Sfinal command to generate the FINAL and FINALPOSTREPORTS job streams on the new extended agent workstation if they do not already exist.
    6. Verify that the new extended agent workstation has been added to the Sfinal file. The following example is an extract of the modified Sfinal file containing the addition of the MDM_XA extended agent workstation, the substitution of the SCRIPTNAME keyword with DOCOMMAND in all jobs defined in FINAL and FINALPOSTREPORTS job stream definitions, and the use of the UNISONHOME variable in place of the path to the scripts:
      FINAL:
      SCHEDULE MDM_XA#FINAL ON EVERYDAY
               AT 2359
               CARRYFORWARD
      FOLLOWS MDM_XA#FINAL.SWITCHPLAN  PREVIOUS
      ...
      ...
      ...
               STARTAPPSERVER DOCOMMAND
      "${UNISONHOME}/../appservertools/startAppServer.sh"
               STREAMLOGON wa95ids
               RECOVERY CONTINUE
               MAKEPLAN DOCOMMAND "${UNISONHOME}/MakePlan"
               STREAMLOGON wa95ids
               RCCONDSUCC "(RC=0) OR (RC=4)"
               FOLLOWS STARTAPPSERVER
               SWITCHPLAN DOCOMMAND "${UNISONHOME}/SwitchPlan"
               STREAMLOGON wa95ids
               FOLLOWS MAKEPLAN
      ...
      ...
      ...
      END
      FINALPOSTREPORTS:
      SCHEDULE  MDM_XA#FINALPOSTREPORTS ON EVERYDAY
               SCHEDTIME 2359
               CARRYFORWARD
      FOLLOWS MDM_XA#FINAL.SWITCHPLAN  PREVIOUS
      ...
      ...
      ...
               CHECKSYNC DOCOMMAND "${UNISONHOME}/bin/planman checksync"
               STREAMLOGON wa95ids
               RECOVERY CONTINUE
               CREATEPOSTREPORTS DOCOMMAND "${UNISONHOME}/CreatePostReports"
               STREAMLOGON wa95ids
               RECOVERY CONTINUE
               UPDATESTATS DOCOMMAND "${UNISONHOME}/UpdateStats"
               STREAMLOGON wa95ids
               RECOVERY CONTINUE
               FOLLOWS CHECKSYNC
      ...
      ...
      ...
      END
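
The mechanical part of these Sfinal changes can be scripted. The following is a minimal sketch, assuming that the existing Sfinal file prefixes job streams and dependencies with the master workstation name followed by #. The function name convert_sfinal is hypothetical. The sketch creates the backup copy, replaces SCRIPTNAME with DOCOMMAND, and moves the definitions to the extended agent workstation. It does not rewrite script paths to use UNISONHOME, so review those by hand.

```shell
#!/bin/sh
# Sketch: back up a Sfinal file, then rewrite SCRIPTNAME to DOCOMMAND and
# move job streams and dependencies from one workstation to another.
convert_sfinal() {
    sfinal=$1      # path to the Sfinal file
    old_ws=$2      # current workstation name, for example MDM (hypothetical)
    new_ws=$3      # extended agent workstation, for example MDM_XA
    # Refuse to overwrite an existing backup, so a second run is harmless.
    if [ -e "$sfinal.orig" ]; then
        echo "Backup $sfinal.orig already exists; nothing changed." >&2
        return 1
    fi
    cp -p "$sfinal" "$sfinal.orig" || return 1
    tmp=$(mktemp) || return 1
    # Match the workstation name only at line start or after a delimiter,
    # so that names that merely end in the old name are left alone.
    if sed -e 's/SCRIPTNAME/DOCOMMAND/g' \
           -e "s/^${old_ws}#/${new_ws}#/" \
           -e "s/\([^A-Za-z0-9_-]\)${old_ws}#/\1${new_ws}#/g" \
           "$sfinal.orig" > "$tmp"; then
        cat "$tmp" > "$sfinal"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

For example, convert_sfinal /<TWA_home>/TWS/Sfinal MDM MDM_XA leaves the original content in Sfinal.orig. Compare the two files before you submit the composer add Sfinal command.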
      
  8. Submit the composer add Sfinal command, and then verify that the FINAL and FINALPOSTREPORTS job streams and the related jobs are correctly defined on the extended agent workstation. The following is an example of the correct output:
    ...
    ...
    ...
    /
    -add Sfinal
    AWSJCL003I The command "add" completed successfully on object "jd=MDM_XA#STARTAPPSERVER".
    AWSJCL003I The command "add" completed successfully on object "jd=MDM_XA#MAKEPLAN".
    AWSJCL003I The command "add" completed successfully on object "jd=MDM_XA#SWITCHPLAN".
    AWSJCL003I The command "add" completed successfully on object "js=MDM_XA#FINAL".
    AWSJCL003I The command "add" completed successfully on object "jd=MDM_XA#CHECKSYNC".
    AWSJCL003I The command "add" completed successfully on object "jd=MDM_XA#CREATEPOSTREPORTS".
    AWSJCL003I The command "add" completed successfully on object "jd=MDM_XA#UPDATESTATS".
    AWSJCL003I The command "add" completed successfully on object "js=MDM_XA#FINALPOSTREPORTS".
    AWSBIA090I For file "Sfinal": errors 0, warnings 0.
    AWSBIA288I Total objects updated: 8
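
If you save the composer output to a file, the verification can be automated. The following is a minimal sketch that relies only on the message texts shown in the example output above. The function name check_composer_add and the log file name are hypothetical.

```shell
#!/bin/sh
# Sketch: summarize a saved "composer add Sfinal" log.
# Prints the number of objects added and succeeds only if the log
# reports zero errors and zero warnings for the Sfinal file.
check_composer_add() {
    log_file=$1
    # grep -c exits nonzero when the count is 0, so ignore its status here.
    added=$(grep -c 'AWSJCL003I The command "add" completed successfully' "$log_file" || true)
    echo "objects added: $added"
    grep -q 'AWSBIA090I For file "Sfinal": errors 0, warnings 0\.' "$log_file"
}
```

For example, run composer add Sfinal > add.log 2>&1, and then check_composer_add add.log. A nonzero return code means that the summary message is missing or reports errors or warnings, and the log needs to be read in full.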
    
  9. Compare the two copies of the FINAL and FINALPOSTREPORTS job streams and make any necessary changes to those on the extended agent workstation, for example, the job stream submit time, run cycles, or any other custom changes to personalize the schedule.
  10. Submit JnextPlan with the -noremove option to update the plan with the new extended agent workstation:
    JnextPlan -for 0000 -noremove
  11. If JnextPlan runs correctly, proceed to delete the FINAL and FINALPOSTREPORTS job streams previously set to "draft" on the master domain manager.
    composer del FINALPOSTREPORTS
    composer del FINAL
    
  12. Delete the FINAL and FINALPOSTREPORTS job streams from the plan as follows:
    conman "canc FINALPOSTREPORTS"
    conman "canc FINAL"
    
  13. Modify the new job stream definitions for the FINAL and FINALPOSTREPORTS job streams, setting the limit to "0":
    SCHEDULE MDM_XA#FINAL
    DESCRIPTION "Added by composer."
    ON RUNCYCLE RC1 "FREQ=DAILY;"
    AT 2359
    CARRYFORWARD
    FOLLOWS MDM_XA#FINAL.SWITCHPLAN PREVIOUS
    LIMIT 0
    :
    MDM_XA#STARTAPPSERVER
    
  14. Submit the FINAL job stream first, and then the FINALPOSTREPORTS job stream, into the current plan:
    conman sbs MDM_XA#FINAL
    conman sbs MDM_XA#FINALPOSTREPORTS
    
  15. Verify that the start time and date for the FINAL and FINALPOSTREPORTS job streams are correct by submitting the conman showschedules command.
  16. Reset the value of the limit job stream keyword for the FINAL and FINALPOSTREPORTS job streams, both in the database and in the plan.
    conman "limit MDM_XA#FINAL ;10"
    conman "limit MDM_XA#FINALPOSTREPORTS ;10"
    
    Both job streams should be in WAITING (HOLD internal status), awaiting execution time.
  17. Archived plans, forecast and trial plans are stored on the master domain manager where the plans run. To make these plans available on the backup master domain manager, either store the plan in a single shared folder, or create a job that synchronizes the plans between the master domain manager and the backup master domain manager.
To enable the new master to access the plans that ran on the original master (the current plan is visible because it is synchronized with the backup), configure a job that copies the plans from the original master to the new master.
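
The synchronization job mentioned above could run a script similar to the following minimal sketch. It assumes that both the source and destination directories are reachable as local paths from the workstation where the job runs, for example through a mounted shared file system. The function name sync_plans is hypothetical, and the directories that hold archived, forecast, and trial plans in your installation must be supplied by the caller.

```shell
#!/bin/sh
# Sketch: copy plan files that are missing or newer on the source side.
# Arguments: source directory, destination directory (both caller-supplied).
sync_plans() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    for f in "$src_dir"/*; do
        # Skip subdirectories and the unexpanded pattern of an empty directory.
        [ -f "$f" ] || continue
        target=$dest_dir/$(basename "$f")
        # -nt is supported by common shells (bash, dash, ksh), although it
        # is not strictly part of POSIX test.
        if [ ! -e "$target" ] || [ "$f" -nt "$target" ]; then
            cp -p "$f" "$target" || return 1
        fi
    done
}
```

Files that already exist at the destination with the same or a later modification time are left untouched, so the job can run repeatedly. If the two workstations do not share a file system, a remote copy tool would be needed instead of cp.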

After an automatic failover, to return service to the original master, you must perform a manual switch. See Manually switching the master.