IBM Workload Scheduler network communication

IBM Workload Scheduler uses the TCP/IP protocol for network communication. The node name and the port number used to establish the TCP/IP connection are set for each workstation in its workstation definition. Refer to Workstation definition for additional details.

IBM Workload Scheduler uses a store-and-forward technology to maintain consistency and fault tolerance across the network: messages are queued in message files while the connection is not active. When TCP/IP communication is established between systems, IBM Workload Scheduler provides bi-directional communication between workstations using links. Links are controlled by the autolink flag set in the workstation definition, and by the link and unlink commands issued from the conman command-line program.

When a link is opened, messages are passed between two workstations. When a link is closed, the sending workstation stores messages in a local message file and sends them to the destination workstation as soon as the link is re-opened.
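
The store-and-forward behavior of a link can be illustrated with a minimal Python sketch. This is a toy model, not the product's implementation: the `Link` class, its method names, and the in-memory queue standing in for the local message file are all assumptions made for illustration.

```python
from collections import deque


class Link:
    """Toy model of a link from a sending workstation to one destination."""

    def __init__(self):
        self.is_open = False
        self.message_file = deque()  # stands in for the local message file
        self.delivered = []          # what the destination has received so far

    def send(self, message):
        # Queue first so that ordering is preserved, then forward if possible.
        self.message_file.append(message)
        if self.is_open:
            self._flush()

    def open(self):
        # Re-opening the link forwards everything stored while it was closed.
        self.is_open = True
        self._flush()

    def close(self):
        self.is_open = False

    def _flush(self):
        while self.message_file:
            self.delivered.append(self.message_file.popleft())


link = Link()
link.send("job1 started")    # link closed: the message is stored locally
link.send("job1 completed")
queued_while_closed = list(link.message_file)
link.open()                  # link re-opened: stored messages are forwarded
link.send("job2 started")    # link open: the message passes straight through
```

Queuing every message before forwarding keeps delivery in the original order, whether or not the link was closed in between.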

Two types of communication take place in the IBM Workload Scheduler environment: connection initialization, and scheduling event delivery in the form of change-of-state messages during the processing period. Both types are explained in detail below.

Connection initialization and two-way communication setup
These are the steps involved in establishing a two-way IBM Workload Scheduler link between a domain manager and a remote fault-tolerant agent:
  1. On the domain manager, the mailman process reads the host name, TCP/IP address, and port number of the fault-tolerant agent from the Symphony file.
  2. The mailman process on the domain manager establishes a TCP/IP connection to the netman process on the fault-tolerant agent using the information obtained from the Symphony file.
  3. The netman process on the fault-tolerant agent determines that the request is coming from the mailman process on the domain manager, and creates a new writer process to handle the incoming connection.
  4. The mailman process on the domain manager is now connected to the writer process on the fault-tolerant agent. The writer process on the fault-tolerant agent communicates the current run number of its copy of the Symphony file to the mailman process on the domain manager. This run number is the identifier used by IBM Workload Scheduler to recognize each Symphony file generated by JnextPlan. This step is necessary for the domain manager to check if the current plan has already been sent to the fault-tolerant agent.
  5. The mailman process on the domain manager compares its Symphony file run number with the run number of the Symphony file on the fault-tolerant agent. If the run numbers are different, the mailman process on the domain manager sends the latest copy of the Symphony file to the writer process on the fault-tolerant agent.
  6. When the current Symphony file is in place on the fault-tolerant agent, the mailman process on the domain manager sends a start command to the fault-tolerant agent.
  7. The netman process on the fault-tolerant agent starts the local mailman process. At this point a one-way communication link is established from the domain manager to the fault-tolerant agent.
  8. The mailman process on the fault-tolerant agent reads the host name, TCP/IP address, and port number of the domain manager from the Symphony file and uses them to establish the uplink back to the netman process on the domain manager.
  9. The netman process on the domain manager determines that the request is coming from the mailman process on the fault-tolerant agent, and creates a new writer process to handle the incoming connection. The mailman process on the fault-tolerant agent is now connected to the writer on the domain manager and a full two-way communication link is established. As a result of this, the writer process on the domain manager writes messages received from the fault-tolerant agent into the Mailbox.msg file on the domain manager, and the writer process on the fault-tolerant agent writes messages from the domain manager into the Mailbox.msg file on the fault-tolerant agent.
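
The handshake in the steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `Workstation` class, its fields, and the `open_two_way_link` function are hypothetical names, no real network connection is made, and the Symphony file is reduced to its run number.

```python
from dataclasses import dataclass, field


@dataclass
class Workstation:
    name: str
    run_number: int               # run number of the local Symphony file
    mailman_running: bool = False
    writers_for: list = field(default_factory=list)  # peers served by a local writer


def open_two_way_link(dm, fta):
    """Simulate the link setup and return the actions taken, in order."""
    log = []
    # Steps 1-3: mailman on the domain manager connects to netman on the
    # agent, which creates a writer process for the incoming connection.
    fta.writers_for.append(dm.name)
    log.append("writer created on " + fta.name)
    # Steps 4-5: the run numbers are compared; the Symphony file is sent
    # only if the agent's copy differs from the domain manager's.
    if fta.run_number != dm.run_number:
        fta.run_number = dm.run_number
        log.append("Symphony sent to " + fta.name)
    # Steps 6-7: the start command makes netman start the local mailman.
    fta.mailman_running = True
    log.append("mailman started on " + fta.name)
    # Steps 8-9: mailman on the agent connects back to netman on the domain
    # manager, which creates a writer there; the link is now two-way.
    dm.writers_for.append(fta.name)
    log.append("writer created on " + dm.name)
    return log


dm = Workstation("DM1", run_number=42, mailman_running=True)
stale_fta = Workstation("FTA1", run_number=41)
fresh_fta = Workstation("FTA2", run_number=42)
stale_log = open_two_way_link(dm, stale_fta)
fresh_log = open_two_way_link(dm, fresh_fta)
```

In this sketch only `FTA1`, whose run number differs from that of the domain manager, receives a new copy of the Symphony file; `FTA2` skips the transfer and goes straight to the start command.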
Job processing and scheduling event delivery, in the form of change-of-state messages, performed locally by the fault-tolerant agent during the processing day
During the production period, the Symphony file present on the fault-tolerant agent is read and updated with the state change information about jobs that are run locally by the IBM Workload Scheduler workstation processes. These are the steps that are performed locally on the fault-tolerant agent to read and update the Symphony file and to process jobs:
  1. The batchman process reads a record in the Symphony file that states that job1 is to be launched on the workstation.
  2. The batchman process writes in the Courier.msg file that job1 has to start.
  3. The jobman process reads this information in the Courier.msg file, starts job1, and writes in the Mailbox.msg file that job1 started with its process_id and timestamp.
  4. The mailman process reads this information in its Mailbox.msg file, and sends a message that job1 started with its process_id and timestamp, to both the Mailbox.msg file on the domain manager and the local Intercom.msg file on the fault-tolerant agent.
  5. The batchman process on the fault-tolerant agent reads the message in the Intercom.msg file and updates the local copy of the Symphony file.
  6. When job1 completes processing, the jobman process writes in the Mailbox.msg file that job1 completed.
  7. The mailman process reads the information in the Mailbox.msg file, and sends a message that job1 completed to both the Mailbox.msg file on the domain manager and the local Intercom.msg file on the fault-tolerant agent.
  8. The batchman process on the fault-tolerant agent reads the message in the Intercom.msg file, updates the local copy of the Symphony file, and determines the next job that has to be run.
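
The message flow in the steps above can be traced with a short Python simulation. It is a sketch only: in-memory queues stand in for the message files, the function names and the state labels (`READY`, `STARTED`, `COMPLETED`) are hypothetical, and the process_id and timestamp carried by the real messages are omitted.

```python
from collections import deque

# In-memory stand-ins for the message files named in the steps above.
courier = deque()      # Courier.msg: batchman to jobman
mailbox = deque()      # Mailbox.msg on the fault-tolerant agent
intercom = deque()     # Intercom.msg: mailman to batchman
dm_mailbox = deque()   # Mailbox.msg on the domain manager

symphony = {"job1": "READY"}  # local Symphony copy; state labels are hypothetical


def batchman_launch(job):
    # Steps 1-2: batchman finds a job to launch and writes it to Courier.msg.
    courier.append(job)


def jobman_start():
    # Step 3: jobman starts the job and reports the state change.
    job = courier.popleft()
    mailbox.append((job, "STARTED"))
    return job


def jobman_complete(job):
    # Step 6: jobman reports that the job completed.
    mailbox.append((job, "COMPLETED"))


def mailman_forward():
    # Steps 4 and 7: mailman sends each message to the domain manager
    # and to the local Intercom.msg file.
    while mailbox:
        message = mailbox.popleft()
        dm_mailbox.append(message)
        intercom.append(message)


def batchman_update():
    # Steps 5 and 8: batchman applies the state changes to the Symphony file.
    while intercom:
        job, state = intercom.popleft()
        symphony[job] = state


batchman_launch("job1")
running_job = jobman_start()
mailman_forward()
batchman_update()
state_after_start = symphony["job1"]

jobman_complete(running_job)
mailman_forward()
batchman_update()
state_after_completion = symphony["job1"]
```

Each state change reaches both the domain manager's mailbox and the local Symphony copy, so the agent can keep processing from its own plan while the domain manager stays informed.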
For information on how to tune job processing on a workstation, refer to the IBM Workload Scheduler: Administration Guide.