Recovering the plan from the latest archived plan

You can recover a corrupted plan on a master domain manager using the latest archived plan, but only if you completed the required configuration steps before the file corruption occurred.

The following procedure recovers a corrupted plan using ResetPlan. The plan is recovered using the latest archived plan and any events logged throughout the day are written to a new event message file. The last archived Symphony file is copied into the current Symphony file and then JnextPlan is run to apply the events from the evtlog.msg file.
Restriction: You can perform the recovery procedure only if you completed the configuration steps described in step 1 before the file corruption occurred.
Note:
    • Some IBM® Workload Scheduler events that were triggered before you applied the recovery procedure might be triggered again after the recovery procedure has completed. This limitation concerns events that are not managed through a message queue, for example, UNTIL, DEADLINE, and MAXDUR.
    • Jobs in the USERJOBS job stream are not subject to resource control. As a result, you must check and adjust the affected resources manually.
    • Prompts in the recovered plan might have a prompt number different from the prompt number in the original plan. To prevent mismatches, prompt reply events are not recovered.
  1. Complete the following configuration steps so that you can use the recovery procedure in the future if it becomes necessary:
    1. In the localopts file, add the following attribute and value: bm log events = ON.
    2. Optionally, customize the path where IBM Workload Scheduler creates the evtlog.msg event file by setting the bm log events path property in the localopts file. If you do not modify this setting, the evtlog.msg event file is created in the following default location: <TWA_INST_DIR>/TWS.
    3. Stop and start all IBM Workload Scheduler processes or run JnextPlan to create the evtlog.msg file.
    4. If necessary, you can configure the maximum size of both the evtlog.msg and Intercom.msg event files as follows:
      evtsize -c evtlog.msg 500000000
      evtsize -c Intercom.msg 550000000 
      Note: The default size of these event files is 10 MB. When the maximum size is reached, events are no longer logged to these files and the recovery procedure is unable to recover them and any that follow. Moreover, the following BATCHMAN warning is logged to <TWA_INST_DIR>/TWS/stdlist/traces/YYYYMMDD_TWSMERGE.log:
      13:11:51 18.10.2012|BATCHMAN:+ WARNING:Error writing in evtlog:
        AWSDEC003I End of file on events file.
      13:11:51 18.10.2012|BATCHMAN:* 
      13:11:51 18.10.2012|BATCHMAN:* AWSBHT160E The EvtLog message file is full,
        events will not be logged until a new Symphony is produced. Recovery with
        event reapply is no more possible until that time.
      13:11:51 18.10.2012|BATCHMAN:*
      If you encounter this problem, increase the size of the evtlog.msg and Intercom.msg event files.
      Consider that for 80,000 jobs and a Symphony file of size 40 MB, the evtlog.msg file is approximately 70 MB in size.
      Important: Always set the maximum size of Intercom.msg to a value greater than the maximum size of evtlog.msg.
      In the <TWA_INST_DIR>/TWS/stdlist/traces/YYYYMMDD_TWSMERGE.log trace file, the BATCHMAN process logs an informational line containing the expected size of the evtlog.msg queue. For example:
      19:02:06 14.10.2012|BATCHMAN:INFO:0.25 MB of events to log during this 
        batchman run
      If Intercom.msg reaches the maximum size during the recovery procedure, batchman stops.
    5. If the file system where evtlog.msg resides runs out of space, a BATCHMAN warning is logged to <TWA_INST_DIR>/TWS/stdlist/traces/YYYYMMDD_TWSMERGE.log as follows:
      13:10:36 16.10.2012|BATCHMAN:+ WARNING:Error writing in evtlog: 
        AWSDEC002E An internal error has occurred. 
      The following UNIX system error occurred on an events file: 
       "No space left on device" at line = 3517.
  2. Complete the recovery procedure:
    1. Ensure the IBM Workload Scheduler processes are stopped. Run conman stop to stop them.
    2. Run the planman showinfo command and make a note of the information it returns. You need it when you run JnextPlan later in this procedure.
    3. Run ResetPlan. The corrupted Symphony file is archived in the schedlog folder.
    4. Copy the second-to-last Symphony file archived in the schedlog folder over the current Symphony file. Do not use the most recent archived file, which is the corrupted one. For example, on UNIX, submit the following command:
      cp -p /opt/ibm/TWA/TWS/schedlog/MYYYYMMDDhhmm /opt/ibm/TWA/TWS/Symphony
    5. Run JnextPlan as follows using the information retrieved when you ran planman showinfo:
      JnextPlan -from MM/DD/YYYY hhmm TZ Timezone -for hhhmm
      where:
      -from
      Production plan start time of last extension.
      -for
      Production plan time extension.
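
The choice of archive file in the copy step can be sketched as a small shell helper. This is only an illustration, not part of the product: the function name second_latest_archive is hypothetical, and it assumes that every archived plan in schedlog is named with the MYYYYMMDDhhmm pattern shown above, so that sorting by name is the same as sorting by archive time.

```shell
# Hypothetical helper (not an IBM Workload Scheduler command): print the name
# of the second-most-recent archived plan found in the directory given as $1.
# Assumption: archives are named M followed by 12 digits (MYYYYMMDDhhmm).
second_latest_archive() {
    dir=$1
    pattern='^M[0-9]\{12\}$'
    # Count the files whose names match the archive pattern.
    count=$(ls "$dir" 2>/dev/null | grep "$pattern" | wc -l)
    if [ "$count" -lt 2 ]; then
        echo "need at least two archived plans in $dir" >&2
        return 1
    fi
    # Name order equals time order, so the last entry is the corrupted plan
    # archived by ResetPlan and the entry before it is the last good plan.
    ls "$dir" | grep "$pattern" | sort | tail -n 2 | head -n 1
}

# Example use, with the installation path taken from the cp example above:
#   SCHEDLOG=/opt/ibm/TWA/TWS/schedlog
#   cp -p "$SCHEDLOG/$(second_latest_archive "$SCHEDLOG")" /opt/ibm/TWA/TWS/Symphony
```

Check the selected file name by eye before copying, because other files in schedlog that happen to match the pattern would also be considered.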

When running this procedure, consider that the run number, that is, the total number of times the plan was generated, is automatically increased by one.

Job stream instances that have already completed successfully at the time this procedure is run are not included in the recovered plan.

After you have performed the recovery procedure, the workstation limit is set to 0 and the evtlog.msg queue is cleared with each successive run of JnextPlan.
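
Because the recovery works only if the configuration steps were completed beforehand, it can be useful to check them periodically. The following is a minimal sketch under stated assumptions: the helper names event_logging_enabled and sizes_consistent are hypothetical, the localopts file is assumed to hold one "name = value" option per line with # starting a comment, and the size values are the ones you passed to evtsize (the sketch does not read them from the event files).

```shell
# Hypothetical helper: succeed only if the localopts file given as $1
# contains an active (uncommented) "bm log events = ON" line.
# The separate "bm log events path" option does not match this pattern.
event_logging_enabled() {
    grep -Eiq '^[[:space:]]*bm[[:space:]]+log[[:space:]]+events[[:space:]]*=[[:space:]]*on[[:space:]]*$' "$1"
}

# Hypothetical helper: succeed only if the Intercom.msg maximum size ($2)
# is greater than the evtlog.msg maximum size ($1), both in bytes.
sizes_consistent() {
    [ "$2" -gt "$1" ]
}

# Example use, with the sizes from the evtsize example in step 1:
#   event_logging_enabled /path/to/localopts || echo "event logging is off"
#   sizes_consistent 500000000 550000000 || echo "Intercom.msg limit too small"
```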