Split-brain in Global Mailbox
In general terms, split-brain arises when two or more data centers cannot connect and communicate with each other. Split-brain is also called a network partition.
In most cases, split-brain can lead to loss of availability and to data inconsistency. In Global Mailbox, split-brain happens when the Cassandra nodes in one data center cannot communicate with the Cassandra nodes in another data center. The communication is lost primarily because of network issues.
Cassandra servers in all data centers work together to provide a consistent view of data. They do this by communicating over the network to replicate data as quickly as possible, so that the data is kept as up to date as possible on all servers. If the servers in one data center cannot communicate with the servers in the other data centers, operations and updates might be performed on out-of-date data, and conflicting operations might be performed on each side of the partition. For example, a user might remove a user's mailbox permission in data center 1, while someone else adds permissions for that user in data center 2. Because of the network partition, neither data center sees the other's change, and the data is out of sync.
After the partition is resolved, Cassandra attempts to resolve any conflicts on the objects that were created, updated, or deleted during the partition. It chooses the change that has the latest time stamp. The following rules apply:
- When data centers cannot communicate and transactions happen on the same object in more than one data center, the changes are called conflicting changes. When the connection is restored and the conflicting Cassandra changes are resolved, the record with the latest time stamp always overwrites any previous version of the object.
- When you delete a mailbox, the deletion takes precedence over later changes to that mailbox. For example, if the mailbox /acme_mbx is deleted first in data center 1, and a submailbox /acme_mbx/acme_user is created in data center 2 after the deletion, both /acme_mbx and /acme_mbx/acme_user are deleted after the merge.
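As an illustration only, the following Python sketch models the two merge rules described above: the change with the latest time stamp wins, and a mailbox deletion also removes submailboxes that were created afterward. The names Version, resolve, and surviving_mailboxes are hypothetical and are not part of Global Mailbox or Cassandra. The integer time stamps are made-up values, and the tie-breaking choice is arbitrary.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One data center's version of an object (hypothetical model)."""
    value: str
    timestamp: int  # write time; a larger number means a later write


def resolve(a: Version, b: Version) -> Version:
    """Keep the version with the latest time stamp.

    On a tie this sketch arbitrarily keeps the first argument.
    """
    return a if a.timestamp >= b.timestamp else b


def surviving_mailboxes(created: set[str], deleted: set[str]) -> set[str]:
    """Return the mailbox paths that remain after the merge.

    A deleted mailbox removes itself and every path under it,
    regardless of when the submailbox was created.
    """
    def is_removed(path: str) -> bool:
        return any(path == d or path.startswith(d + "/") for d in deleted)

    return {path for path in created if not is_removed(path)}


# Conflicting permission changes made on each side of the partition.
dc1 = Version("permission removed", timestamp=100)
dc2 = Version("permission added", timestamp=105)
print(resolve(dc1, dc2).value)  # permission added

# /acme_mbx deleted in data center 1, submailbox created in data center 2.
created = {"/acme_mbx", "/acme_mbx/acme_user", "/other_mbx"}
print(sorted(surviving_mailboxes(created, {"/acme_mbx"})))  # ['/other_mbx']
```

The prefix check appends "/" before comparing, so a sibling such as /acme_mbx2 is not treated as a submailbox of /acme_mbx.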
What you cannot do during split-brain
During split-brain, you can only disable or enable an event rule. You cannot perform the following operations:
- Create new event rules
- Rename event rules
- Delete event rules
- Update event rules
- Create a routing rule with a Global Mailbox producer partner
- Convert a producer partner that has a routing channel associated with the producer role
- Delete a routing rule that has a Global Mailbox partner
- Delete a partner that has a routing channel associated with the producer role
- Update a routing rule that has a Global Mailbox producer partner
Global Mailbox behavior during split-brain
- Duplicate messages
- During split-brain, duplicate messages might be uploaded across data centers even though allowDuplicates=false. This is because the system cannot reach the other data center to verify the message. However, upload of duplicate messages within the local data center is not allowed if allowDuplicates=false.
- Extraction count
- The extraction count can be lost if the extraction criteria of a message are changed in another data center during split-brain. For example, during split-brain, if a trading partner extracts a message in data center 1, and later an administrator changes the extraction criteria for the same message in data center 2, the extraction count is lost after the data centers are merged. The later update by the administrator takes precedence over the extraction count that was recorded when the trading partner extracted the message.
- Extraction criteria
- An update to the extraction criteria of a message can be lost if the message is extracted in another data center during split-brain after the criteria were changed. For example, during split-brain, if an administrator changes the extraction criteria for a message in data center 1, and later a trading partner extracts the same message in data center 2, the changes made by the administrator are lost after split-brain is resolved. The latest change is the extraction of the message by the trading partner, so it takes precedence.
- Orphaned mailboxes
- Creating or deleting a mailbox during split-brain can leave behind orphaned mailboxes that cannot be cleaned up. If a submailbox is created in data center 1 and one of its parent mailboxes is deleted later in data center 2, then after the merge, the deleted mailbox and the submailboxes under it are not displayed. In addition, a user cannot create a submailbox with the same name.
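The extraction count and extraction criteria behaviors above both follow from the rule that the latest change overwrites the earlier one. The Python sketch below shows this under a simplifying assumption: each message is modeled as a single record that is replaced as a whole by the later write. MessageRecord, its fields (max_extractions stands in for the extraction criteria), and merge are hypothetical names for illustration, not Global Mailbox or Cassandra APIs, and the time stamps are made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageRecord:
    """Hypothetical per-message record as seen by one data center."""
    extract_count: int    # how many times the message was extracted
    max_extractions: int  # stand-in for the extraction criteria
    updated_at: int       # time of the last write to this record


def merge(a: MessageRecord, b: MessageRecord) -> MessageRecord:
    """Whole-record merge: the later write replaces the earlier one."""
    return a if a.updated_at >= b.updated_at else b


# Case 1: a partner extracts in data center 1 (t=20), then an
# administrator changes the criteria in data center 2 (t=30).
dc1 = MessageRecord(extract_count=1, max_extractions=1, updated_at=20)
dc2 = MessageRecord(extract_count=0, max_extractions=5, updated_at=30)
merged = merge(dc1, dc2)
print(merged.extract_count, merged.max_extractions)  # 0 5 (count lost)

# Case 2: an administrator changes the criteria in data center 1 (t=20),
# then a partner extracts in data center 2 (t=30).
dc1 = MessageRecord(extract_count=0, max_extractions=5, updated_at=20)
dc2 = MessageRecord(extract_count=1, max_extractions=1, updated_at=30)
merged = merge(dc1, dc2)
print(merged.extract_count, merged.max_extractions)  # 1 1 (criteria update lost)
```

In both cases the record written earlier is discarded entirely, which is why a change to one field in one data center can erase a change to a different field made in the other.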