Payload replication
Replication is the process of creating copies of a message that is added to Global Mailbox in one data center. The copies are created in the other data centers.
A message consists of metadata and a payload. Apache Cassandra and the replication server are the two components in Global Mailbox that take care of replication. Cassandra replicates the message metadata and, if the payload size is less than a set threshold, the payload as well. The replication server replicates the payload if its size is more than the set threshold.
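The size-based split between the two components can be sketched as follows. This is a minimal illustration, not product code: the names `Replicator` and `payload_replicator` and the threshold value are hypothetical, and the treatment of a payload whose size is exactly equal to the threshold is an assumption, because the text covers only the "less than" and "more than" cases.

```python
from enum import Enum


class Replicator(Enum):
    CASSANDRA = "cassandra"
    REPLICATION_SERVER = "replication server"


def payload_replicator(payload_size: int, threshold: int) -> Replicator:
    """Pick the component that replicates a payload of the given size in bytes."""
    if payload_size < 0 or threshold < 0:
        raise ValueError("sizes must be non-negative")
    # Below the threshold: the payload is replicated by Cassandra,
    # which also replicates the message metadata.
    if payload_size < threshold:
        return Replicator.CASSANDRA
    # Assumption: a payload exactly at the threshold is handled like a
    # larger one. The documentation does not state this case.
    return Replicator.REPLICATION_SERVER


if __name__ == "__main__":
    threshold = 64 * 1024  # hypothetical value, not a product default
    for size in (1_024, 10 * 1024 * 1024):
        print(size, "->", payload_replicator(size, threshold).value)
```

In either case the metadata is replicated by Cassandra; only the route taken by the payload changes.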
Replication not only creates copies of a message, but also synchronizes the copies, so that a change made to a message in one data center is reflected in the other data centers.
Replication is required to ensure reliability and fault tolerance. The data that enters the system must be replicated on multiple nodes and data centers, so that it can be retrieved when required, even if one of the data centers or nodes is not available. In an active-active data center configuration, replication plays an important role in maintaining data consistency.
Replication types
A Global Mailbox system can be configured for either immediate (synchronous) or delayed (asynchronous) replication. In immediate replication, receipt of a message is not acknowledged until the message is replicated in another data center. In delayed replication, receipt of a message is acknowledged before the message is replicated.
Immediate replication takes longer to complete, because the data must be replicated to another data center before the system responds to the trading partner. As a result, immediate replication might affect performance to some extent.
Delayed replication (default) does not require the data to be replicated to another data center before the system responds to a trading partner, resulting in quicker responses and better performance. However, data across the data centers might be less consistent. For example, if an outage occurs in a data center before the delayed replication completes, the data becomes inconsistent, and a trading partner user might not be able to view the acknowledged message in the other data center. You must thoroughly understand the business requirements before you configure the replication type.
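The difference between the two types comes down to the order of acknowledgement and replication. The sketch below models only that ordering. The function `receive_message` and its `replicate` callback are hypothetical stand-ins, and the way a failed replication is reported is an assumption made for illustration.

```python
from typing import Callable, List


def receive_message(immediate: bool, replicate: Callable[[], bool]) -> List[str]:
    """Return the ordered steps taken for one uploaded message.

    `replicate` is a hypothetical callback that copies the message to
    another data center and returns True on success.
    """
    steps = ["stored locally"]
    if immediate:
        # Immediate: the acknowledgement waits for a remote copy.
        if not replicate():
            steps.append("replication failed, not acknowledged")
            return steps
        steps.append("replicated")
        steps.append("acknowledged")
    else:
        # Delayed: acknowledge first, replicate afterwards.
        steps.append("acknowledged")
        steps.append("replicated" if replicate() else "replication pending")
    return steps


if __name__ == "__main__":
    print(receive_message(immediate=True, replicate=lambda: True))
    # Simulates an outage before delayed replication completes.
    print(receive_message(immediate=False, replicate=lambda: False))
```

The second call shows the inconsistency risk of delayed replication: the message is already acknowledged to the trading partner while no other data center holds a copy.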
Immediate replication flow in the Global Mailbox
The following diagram shows the flow for immediate replication in the Global Mailbox system:
- A trading partner uploads a file by using a supported protocol (FTP, SFTP, Connect:Direct®, or myFileGateway). Important: The system waits for replication to complete before acknowledging receipt of the message.
- The protocol adapter determines the size of the payload and does one of the following tasks:
  - If the payload size is less than the specified inline payload size threshold, a blob is created in Cassandra.
  - If the payload size is more than the specified inline payload size threshold, segments of the payload are created on the shared disk.
- The replication processes in the other data centers run at regular intervals, looking for unreplicated segments. Each process also determines whether the payload is in Cassandra or on the shared disk, and does one of the following tasks, based on the location of the payload:
  - Checks if Cassandra replicated the blob to at least one data center. If so, Global Mailbox in the receiving data center is notified and immediate replication is complete.
  - Checks if the replication server replicated the payload segments to at least one data center. If so, Global Mailbox in the receiving data center is notified and immediate replication is complete.
- Simultaneously, the replication process does one of the following tasks, based on the location of the payload:
  - Checks if Cassandra replicated the blob to all data centers. If so, an event is raised on the queue for all messages that have a matching event rule.
  - Checks if the replication server replicated the payload segments to all data centers. If so, an event is raised on the queue for all messages that have a matching event rule.
- WebSphere® MQ accepts events on the queue.
- Sterling B2B Integrator pulls events from the queue at regular intervals and executes the business process or contract for each event.
- Depending on whether processing was successful, the event processing status is set to Processed or Failed in the Global Mailbox management user interface.
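The two checks in the immediate flow use different conditions: a copy in at least one other data center completes immediate replication, while a copy in all data centers is needed before an event is raised. A minimal sketch of that distinction follows. The function `replication_checks` and its parameters are hypothetical, it assumes a deployment with at least two data centers, and it leaves out the event-rule match.

```python
from typing import Set, Tuple


def replication_checks(origin: str, all_dcs: Set[str], holders: Set[str]) -> Tuple[bool, bool]:
    """Return (immediate_complete, raise_event) for one message.

    origin   -- data center that received the upload
    all_dcs  -- every data center in the deployment (assumed to be two or more)
    holders  -- data centers that currently hold a copy of the payload
    """
    others = all_dcs - {origin}
    replicated_to = holders & others
    # Immediate replication is complete once at least one other
    # data center holds a copy.
    immediate_complete = len(replicated_to) >= 1
    # The event is raised only when every other data center holds a copy.
    raise_event = replicated_to == others
    return immediate_complete, raise_event


if __name__ == "__main__":
    dcs = {"dc1", "dc2", "dc3"}
    print(replication_checks("dc1", dcs, {"dc1", "dc2"}))
    print(replication_checks("dc1", dcs, {"dc1", "dc2", "dc3"}))
```

With three data centers, the trading partner can therefore be acknowledged after the first remote copy exists, while event processing waits for the last one.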
Delayed replication flow in the Global Mailbox
The following diagram shows the flow for delayed payload replication in the Global Mailbox system:
- A trading partner uploads a file by using a supported protocol (FTP, SFTP, Connect:Direct, or myFileGateway).
- The protocol adapter determines the size of the payload and does one of the following tasks:
  - If the payload size is less than the specified inline payload size threshold, a blob is created in Cassandra.
  - If the payload size is more than the specified inline payload size threshold, segments of the payload are created on the shared disk.
- Receipt of the message is acknowledged to the trading partner.
- The replication processes in the other data centers run at regular intervals, looking for unreplicated segments. Each process also determines whether the payload is in Cassandra or on the shared disk, and does one of the following tasks, based on the location of the payload:
  - Checks if Cassandra replicated the blob to all data centers. If so, an event is raised on the queue for all messages that have a matching event rule.
  - Checks if the replication server replicated the payload segments to all data centers. If so, an event is raised on the queue for all messages that have a matching event rule.
- WebSphere MQ accepts events on the queue.
- Sterling B2B Integrator pulls events from the queue at regular intervals and executes the business process or contract for each event.
- Depending on whether processing was successful, the event processing status is set to Processed or Failed in the Global Mailbox management user interface.
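The last three steps of both flows (the event is queued, pulled at an interval, and given a status) can be sketched as a single polling pass. This is an illustration only: Python's in-process `queue.Queue` stands in for WebSphere MQ, and `drain_events` and the `process` callback are hypothetical names for the work that Sterling B2B Integrator performs.

```python
import queue
from typing import Callable, Dict


def drain_events(events: "queue.Queue[str]", process: Callable[[str], None]) -> Dict[str, str]:
    """Pull every queued event once and record its processing status."""
    status: Dict[str, str] = {}
    while True:
        try:
            message_id = events.get_nowait()
        except queue.Empty:
            break  # nothing left for this polling interval
        try:
            process(message_id)  # stands in for the business process or contract
            status[message_id] = "Processed"
        except Exception:
            status[message_id] = "Failed"
    return status


if __name__ == "__main__":
    q: "queue.Queue[str]" = queue.Queue()
    q.put("msg-1")
    q.put("msg-2")

    def business_process(message_id: str) -> None:
        if message_id == "msg-2":
            raise RuntimeError("simulated processing error")

    print(drain_events(q, business_process))
```

A failure in one event does not stop the pass; each event gets its own Processed or Failed status, which mirrors the per-event status described in the final step.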