IBM Workload Scheduler network communication
IBM Workload Scheduler uses the TCP/IP protocol for network communication. The node name and the port number used to establish the TCP/IP connection are set for each workstation in its workstation definition. Refer to Workstation definition for additional details.
A store-and-forward technology is used by IBM Workload Scheduler to maintain consistency and fault-tolerance at all times across the network by queuing messages in message files while the connection is not active. When TCP/IP communication is established between systems, IBM Workload Scheduler provides bi-directional communication between workstations using links. Links are controlled by the autolink flag set in the Workstation definition, and by the link and unlink commands issued from the conman command-line program.
While a link is open, messages are passed between the two workstations. While a link is closed, the sending workstation stores messages in a local message file and sends them to the destination workstation as soon as the link is reopened.
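The store-and-forward behavior described above can be illustrated with a minimal Python sketch. It is not IBM Workload Scheduler code: the class `StoreAndForwardLink` and its attributes are hypothetical, an in-memory queue stands in for the local message file, and a plain list stands in for the destination workstation.

```python
from collections import deque


class StoreAndForwardLink:
    """Toy model of a link that queues messages while it is closed."""

    def __init__(self):
        self.is_open = False
        self.pending = deque()  # stands in for the local message file
        self.delivered = []     # stands in for the destination workstation

    def send(self, message):
        # Queue first so ordering is preserved, then flush if the link is open.
        self.pending.append(message)
        if self.is_open:
            self._flush()

    def link(self):
        # Opening the link forwards everything queued while it was closed.
        self.is_open = True
        self._flush()

    def unlink(self):
        self.is_open = False

    def _flush(self):
        while self.pending:
            self.delivered.append(self.pending.popleft())
```

In this sketch, messages sent while the link is closed accumulate in `pending` and reach `delivered` in their original order once `link()` is called.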
Two types of communication take place in the IBM Workload Scheduler environment: connection initialization, and scheduling event delivery in the form of change-of-state messages during the processing period. Both are explained in detail below.
- Connection initialization and two-way communication setup
- These are the steps involved in establishing a two-way IBM Workload Scheduler link between a domain manager and a remote fault-tolerant agent:
- On the domain manager, the mailman process reads the host name, TCP/IP address, and port number of the fault-tolerant agent from the Symphony file.
- The mailman process on the domain manager establishes a TCP/IP connection to the netman process on the fault-tolerant agent using the information obtained from the Symphony file.
- The netman process on the fault-tolerant agent determines that the request is coming from the mailman process on the domain manager, and creates a new writer process to handle the incoming connection.
- The mailman process on the domain manager is now connected to the writer process on the fault-tolerant agent. The writer process on the fault-tolerant agent communicates the current run number of its copy of the Symphony file to the mailman process on the domain manager. This run number is the identifier used by IBM Workload Scheduler to recognize each Symphony file generated by JnextPlan. This step is necessary for the domain manager to check if the current plan has already been sent to the fault-tolerant agent.
- The mailman process on the domain manager compares its Symphony file run number with the run number of the Symphony file on the fault-tolerant agent. If the run numbers are different, the mailman process on the domain manager sends the latest copy of the Symphony file to the writer process on the fault-tolerant agent.
- When the current Symphony file is in place on the fault-tolerant agent, the mailman process on the domain manager sends a start command to the fault-tolerant agent.
- The netman process on the fault-tolerant agent starts the local mailman process. At this point a one-way communication link is established from the domain manager to the fault-tolerant agent.
- The mailman process on the fault-tolerant agent reads the host name, TCP/IP address, and port number of the domain manager from the Symphony file and uses them to establish the uplink back to the netman process on the domain manager.
- The netman process on the domain manager determines that the request is coming from the mailman process on the fault-tolerant agent, and creates a new writer process to handle the incoming connection. The mailman process on the fault-tolerant agent is now connected to the writer on the domain manager and a full two-way communication link is established. As a result, the writer process on the domain manager writes messages received from the fault-tolerant agent into the Mailbox.msg file on the domain manager, and the writer process on the fault-tolerant agent writes messages from the domain manager into the Mailbox.msg file on the fault-tolerant agent.
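The link setup steps above can be sketched as a small Python simulation. This is an illustrative model only, not product code: the `Workstation` class, the `initialize_link` function, and the event strings are all hypothetical, and the Symphony file is reduced to its run number.

```python
from dataclasses import dataclass


@dataclass
class Workstation:
    name: str
    run_number: int               # run number of the local Symphony copy
    mailman_running: bool = False


def initialize_link(dm: Workstation, fta: Workstation) -> list:
    """Walk through the link setup and return the events in order."""
    events = ["dm mailman connects to fta netman",
              "fta netman creates writer"]

    # The writer on the agent reports its run number; the domain manager
    # sends its Symphony file only if the two run numbers differ.
    if fta.run_number != dm.run_number:
        fta.run_number = dm.run_number
        events.append("dm mailman sends Symphony")

    # The start command makes netman on the agent launch the local mailman.
    events.append("dm mailman sends start")
    fta.mailman_running = True
    events.append("one-way link established")

    # The agent's mailman connects back, completing the two-way link.
    events.append("fta mailman connects to dm netman")
    events.append("dm netman creates writer")
    events.append("two-way link established")
    return events
```

Calling `initialize_link` a second time with the same two workstations omits the "dm mailman sends Symphony" event, because the run numbers already match.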
- Job processing and scheduling event delivery in the form of change of state messages during the processing day performed locally by the fault-tolerant agent
- During the production period, the Symphony file present on the fault-tolerant agent is read and updated with the state change information about jobs that are run locally by the IBM Workload Scheduler workstation processes. These are the steps that are performed locally on the fault-tolerant agent to read and update the Symphony file and to process jobs:
- The batchman process reads a record in the Symphony file that states that job1 is to be launched on the workstation.
- The batchman process writes in the Courier.msg file that job1 has to start.
- The jobman process reads this information in the Courier.msg file, starts job1, and writes in the Mailbox.msg file that job1 started with its process_id and timestamp.
- The mailman process reads this information in its Mailbox.msg file, and sends a message that job1 started with its process_id and timestamp, to both the Mailbox.msg file on the domain manager and the local Intercom.msg file on the fault-tolerant agent.
- The batchman process on the fault-tolerant agent reads the message in the Intercom.msg file and updates the local copy of the Symphony file.
- When job job1 completes processing, the jobman process updates the Mailbox.msg file with the information that says that job1 completed.
- The mailman process reads the information in the Mailbox.msg file, and sends a message that job1 completed to both the Mailbox.msg file on the domain manager and the local Intercom.msg file on the fault-tolerant agent.
- The batchman process on the fault-tolerant agent reads the message in the Intercom.msg file, updates the local copy of the Symphony file, and determines the next job that has to be run.
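The message flow in the steps above can be traced with a short Python sketch. It is a simplified model under stated assumptions, not the product's implementation: in-memory queues stand in for the Courier.msg, Mailbox.msg, and Intercom.msg files, the Symphony file is reduced to a job-to-state mapping, the state labels READY, STARTED, and COMPLETED are placeholders rather than real job statuses, and the process_id and timestamp are omitted. The function name `run_job_locally` is hypothetical.

```python
from collections import deque


def run_job_locally(job):
    """Trace one job through the local message files on the agent."""
    symphony = {job: "READY"}   # local Symphony copy (placeholder state)
    courier = deque()           # Courier.msg: batchman -> jobman
    mailbox = deque()           # Mailbox.msg: jobman -> mailman
    intercom = deque()          # Intercom.msg: mailman -> batchman
    dm_mailbox = []             # Mailbox.msg on the domain manager

    # batchman finds the job in Symphony and asks jobman to launch it.
    courier.append(job)

    # jobman reads the request and reports the start, then the completion.
    launched = courier.popleft()
    for state in ("STARTED", "COMPLETED"):
        mailbox.append((launched, state))

        # mailman forwards each message to the domain manager and to batchman.
        message = mailbox.popleft()
        dm_mailbox.append(message)
        intercom.append(message)

        # batchman reads the message and updates the local Symphony copy.
        name, new_state = intercom.popleft()
        symphony[name] = new_state

    return symphony, dm_mailbox
```

For `run_job_locally("job1")`, the returned Symphony mapping shows job1 in its final placeholder state, and the domain manager mailbox holds the start message followed by the completion message.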