Improving throughput in high-latency networks

Aggregating messages prior to putting them on the wire can improve throughput in high-latency networks such as WANs. This technique enables a IBM® Spectrum Symphony client to maximize utilization of the network connection between itself and the session manager. This feature is supported on all supported operating systems.

About message aggregation and best practices

Before describing how message aggregation can improve throughput for a WAN connection, it is helpful to know how task data is submitted in IBM Spectrum Symphony's default model.

This feature is not recommended for recoverable clients since message aggregation would require complex application coding to handle a client that terminates abnormally.

The reason that it is not recommended to use message aggregation with recoverable clients is that due to message aggregation, the number of outstanding sends to the session manager is greater than 1. As a result, there is no way to know how many outstanding sends were successfully received or are in the process of being handled by the session manager.

A recoverable client is a client that can tolerate an abnormal termination of its execution and is able to recover and continue to process workload. Recovery of such a client usually involves it being restarted and given enough context to allow it to connect and open an existing session that previously contained its workload. For this type of client, it is usually recommended to set the discardResultsOnDelivery attribute to false in the application profile to allow for a simplified recovery procedure.

Default behavior without message aggregation

Although the underlying communication channel between the client and the session manager is asynchronous, the data submission protocol is synchronous. Here is the sequence of events when a client wants to send task data to the session manager:
  1. Client sends task data to the API layer.
  2. API layer serializes the data and submits it to the underlying communication layer.
  3. API layer blocks the client's submission thread.
  4. Data is transferred by the communication layer to the session manager. The session manager replies with an acknowledgment upon successful receipt of the data.
  5. API layer returns a Task Input Handle to the client and unblocks the client's thread. The client's thread is then free to submit more input.
Note: If the session's recoverable attribute is set to true, the API waits for confirmation of successful data persistence from the session manager before returning a Task Input Handle to the client.
If the session's recoverable attribute is set to true, the API waits for confirmation of successful data persistence from the session manager before returning a Task Input Handle to the client

Behavior with message aggregation

The message aggregation feature enables the submission of work in a non-blocking mode. To better understand the interaction between the client and the session manager when using message aggregation, look at the sequence of events:
The message aggregation feature enables the submission of work in a non-blocking mode. To better understand the interaction between the client and the session manager when using message aggregation, look at the sequence of events:
  1. Client sends task data to the API layer.
  2. API layer serializes the data and submits it to the underlying communication layer.
  3. API layer returns a Task Input Handle to the client and program execution returns to the client's submission thread. The client repeats the data submission process until all the data is submitted.
  4. Client waits with a collection of Task Input Handles until the session manager confirms that all the data has been received.
    Note: The task ID is only set in the Task Input Handle after the message has been successfully delivered to the session manager, i.e., the client has received confirmation from the session manager.

If the session's recoverable attribute is set to true, the API waits for confirmation of successful data persistence from the session manager.

When deploying message aggregation, the client's sendTaskInput() call becomes non-blocking and returns with a valid Task Input Handle object as soon as the data to be sent is queued for dispatch in the underlying communication layer. Once the call returns, the client can use the returned Task Input Handle to synchronize and/or query the state of the submission to the session manager. Since the communication layer dispatches as many queued items as it can fit into a dispatch unit, it means that data is aggregated transparently to the client code.

Although operations between the client and the session manager are asynchronous, they are still serialized and have no priority associated with them. This means that the order in which the operations are dispatched is always preserved. As a consequence, if a "direct fetch" or "close" operation is issued immediately after the client sends many tasks in rapid succession (as is the case with message aggregation), it is unlikely that the task queue will be empty. The "direct fetch" or "close" operation will be queued accordingly and dispatched in due time. As a result, the client may experience a delay for the operation in the queue to complete if there are many other pending operations at the time of the call.

Dispatch unit aggregation

In WAN networks, since the packet-level turnaround time is longer than for local network connections, the connection can be better utilized by packing as much data as possible into a packet before sending it. This is achieved by setting the TCP_NODELAY attribute in the $EGO_CONFDIR/../../eservice/esc/conf/services/sd.xml configuration file.
Messaging queuing architecture

Client API

The message aggregation feature can only be accessed through the client API.

Client requirements

To enable message aggregation, the client application must do the following:
  1. Create a session using the appropriate session flag to inform the API of the client's intention to send task data in an overlapped manner.The session flag is a member of the SessionCreationAttributes class. The following list shows how the session flag is set for overlapped sending in each supported language:
    C++
    .setSessionFlags(Session::SendOverlapped)
    Java™
    .setSessionFlags(Session.SEND_OVERLAPPED)
    C#.NET:
    .SessionFlags=SessionFlags.SendOverlapped
  2. Perform the send operation using sendTaskInput().
  3. Since the send operation may be pending, the client must keep track of the returned Task Input Handle for verification or synchronization at a later time.
  4. Client must query the Task Input Handle to evaluate whether the data has been successfully submitted to the session manager. The query may take the form of issuing a waitForSubmissionComplete() call or the client may poll for an update in the status. The waitForSubmissionComplete() call has the following possible outcomes:
    • Data was successfully submitted so the waitForSubmissionComplete() call returns with a success code.
    • Data was unsuccessfully submitted so the waitForSubmissionComplete() call returns with a failure code, at which point the client must acquire the exception from the Task Input Handle to find out the reason for failure. If the optional parameter of throwOnSubmissionFailure is set to true, the method throws the exception directly. The default behavior is not to throw an exception.
    • The waitForSubmissionComplete() call timed out and returns to the client.
    • The waitForSubmissionComplete() call may also throw an exception if there is an internal error while performing the operation.
    For more information about the waitForSubmissionComplete() method of the TaskInputHandle class, refer to the API reference documentation.

Configurable TCP connection attributes

When configuring a remote client using large latency WAN connections, performance may be improved by setting TCP_NODELAY=0 and applying other appropriate TCP settings (based on user-specific network parameters to optimize TCP package throughput over the connection). Refer to Configuring TCP Connections.