Active-active capability

IBM FlashSystem is an active-active dual controller system. Active-active is a term that describes the I/O processing mechanism in a storage controller. Assume that a controller is made of a pair of controller nodes. An active-active controller pair can process I/O for a specific volume through either node. The opposite mechanism is an active-passive controller system, in which only one node processes I/O for a specific volume. If a passive node receives I/O, it forwards the I/O to the active node for processing. IBM FlashSystem controllers use an active-active processing mechanism.
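The difference between the two mechanisms can be illustrated with a minimal sketch. It is a toy model and not FlashSystem code: the `Node` class and the `active_active` and `active_passive` functions are hypothetical names that are introduced only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical controller node that records the I/O it processes."""
    name: str
    processed: list = field(default_factory=list)

    def process(self, io: str) -> str:
        self.processed.append(io)
        return self.name


def active_active(receiver: Node, io: str) -> str:
    """Active-active: the node that receives the I/O processes it."""
    return receiver.process(io)


def active_passive(receiver: Node, active: Node, io: str) -> str:
    """Active-passive: only the active node processes the I/O.

    A passive receiver forwards the I/O to the active node. If the
    receiver is the active node, nothing is forwarded.
    """
    return active.process(io)


node1, node2 = Node("node1"), Node("node2")

# Active-active: each I/O is processed by the node that received it.
aa_handlers = [active_active(node1, "read A"), active_active(node2, "read B")]

# Active-passive with node1 active: I/O sent to node2 is forwarded to node1.
ap_handlers = [
    active_passive(node1, node1, "read A"),
    active_passive(node2, node1, "read B"),
]
```

In this sketch, `aa_handlers` names both nodes, while `ap_handlers` names only the active node, regardless of which node received the I/O.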

In addition to the processing mechanism, a path model provides logical connections between a host system and the storage controller nodes. A path can be classified as optimized (preferred) or non-optimized (non-preferred). The path model is independent of the I/O processing mechanism.

As shown in Figure 1, the system can be installed and configured to present optimized paths from all nodes, as defined by the standard SCSI multipath terminology. Configuring all paths as optimized means that either node can process the I/O operation in a symmetric manner. For read operations, I/O is processed entirely on the node that receives the I/O. For write operations, the node that receives the I/O coordinates it, including any required cache mirroring. To ensure data consistency, only one node handles any subsequent destage operations.

Figure 1. Active-active with all paths optimized

Enhancements available with FlashSystem

When a new IBM FlashSystem is installed, the default configuration uses both optimized and non-optimized paths, as shown in Figure 2.

The default optimized and non-optimized path configuration provides the following performance advantages:

  • An increased probability of a read cache hit, because as many read requests as possible for a specific volume are sent over the optimized paths and are therefore received by the same node. Read cache is not mirrored and is stored only on the node that receives the read request.
  • The read-ahead or cache prefetch algorithm is more likely to detect a sequential read stream if the same node receives all read I/O for a specific volume.
  • Each volume is allocated to a preferred node by using a round-robin algorithm. The preferred node presents optimized paths to that volume, and the other node presents non-optimized paths. This allocation balances the workload evenly across nodes.
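The round-robin allocation of preferred nodes can be sketched as follows. This is a minimal illustration under stated assumptions and not the product's allocation code: the function names, the node names, and the volume names are hypothetical, and the sketch assumes that the node that is not preferred for a volume presents the non-optimized paths.

```python
from collections import Counter
from itertools import cycle


def assign_preferred_nodes(volumes, nodes):
    """Assign each volume a preferred node in round-robin order."""
    node_cycle = cycle(nodes)
    return {volume: next(node_cycle) for volume in volumes}


def path_state(volume, node, preferred):
    """Return the path state that a node presents for a volume.

    Assumption: the preferred node presents optimized paths and the
    other node presents non-optimized paths.
    """
    return "optimized" if preferred[volume] == node else "non-optimized"


nodes = ["node1", "node2"]
volumes = [f"vol{i}" for i in range(6)]

preferred = assign_preferred_nodes(volumes, nodes)

# Count how many volumes each node is preferred for.
balance = Counter(preferred.values())
```

With six volumes and two nodes, each node is the preferred node for three volumes, which is the even balance that the round-robin allocation aims for.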

Figure 2. Active-active with default optimized and non-optimized paths

Active-active I/O flow

To further demonstrate the nature of active-active behavior, consider Figures 3 - 6, which show the I/O flow for read and write operations through a FlashSystem.

Consider a read that is issued to a volume. Regardless of whether the system is using optimized paths, if the read is sent through a path to Node 1, the entire read operation is completed by Node 1 alone.

Figure 3. Active-active Reads through Node 1

Similarly, as shown in Figure 4, the same read is issued to the same volume, but this time through a path that is provided by Node 2. The entire read operation is completed by Node 2 alone. This behavior is the standard definition of an active-active controller.

Figure 4. Active-active read through Node 2
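Because each node completes a read on its own and read cache is not mirrored, the node that receives a read affects the chance of a cache hit. The following toy model illustrates the effect. It is a sketch only: the `ReadCacheNode` class, the `replay` function, and the block pattern are hypothetical, and the cache is an unbounded set with no eviction or prefetch.

```python
class ReadCacheNode:
    """Hypothetical node with a private, non-mirrored read cache."""

    def __init__(self, name):
        self.name = name
        self.read_cache = set()
        self.hits = 0
        self.misses = 0

    def read(self, volume, block):
        """Complete the read on this node alone."""
        key = (volume, block)
        if key in self.read_cache:
            self.hits += 1
        else:
            # Cache miss: the data is fetched from back-end storage
            # and kept only in this node's read cache.
            self.misses += 1
            self.read_cache.add(key)
        return self.name


def replay(blocks, route):
    """Replay reads of one volume and return the total cache hits.

    `route` maps the position of a read in the sequence to the name
    of the node that receives it.
    """
    nodes = {name: ReadCacheNode(name) for name in ("node1", "node2")}
    for position, block in enumerate(blocks):
        nodes[route(position)].read("vol0", block)
    return sum(node.hits for node in nodes.values())


# The same three blocks are read twice.
blocks = [0, 1, 2, 0, 1, 2]

# All reads go to one node, as with optimized paths to a preferred node.
hits_one_node = replay(blocks, lambda position: "node1")

# Reads alternate between the two nodes.
hits_alternating = replay(
    blocks, lambda position: "node1" if position % 2 == 0 else "node2"
)
```

In this contrived pattern, sending every read to the same node yields three cache hits, while alternating between the nodes yields none, because each repeated block arrives at the node that did not cache it. Real workloads are less extreme, but the direction of the effect is the same.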

Figure 5 shows the write I/O flow. In this case, the volume is written through a path to Node 1. For each volume, the system automatically designates one node as the owner node. In this example, Node 1 is the owner node. The owner node performs the eventual destage operation and, in this example, happens to match the optimized and non-optimized path assignment that is in use.

Figure 5. Active-active - Write to Node 1

In Figure 5, steps 1 - 4 show the steps that complete a write operation: the initial write, the cache mirror operation, and the acknowledgments back to the host. Steps A - C show the steps that asynchronously complete the write destage.

Note: A storage controller must have a mechanism, as described in the preceding example, that ensures that only one node conducts the destage operation. If such a mechanism is not implemented, data corruption can occur.

Figure 6 shows the write I/O flow for the same I/O when the operation is submitted through Node 2. Node 1 is still the owner of the volume. Therefore, Node 1 completes the destage operation. However, all other processing is managed by Node 2.

Figure 6. Active-active - Write to Node 2

In both cases, the number and nature of the required internal operations are identical. The same write data ends up in cache on both nodes, and one node completes destage of the write data to disk.
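The symmetry of the two write flows can be checked with a small model. This is a simplified sketch and not the product's implementation: the `Node` class and the `write` and `run` functions are hypothetical, the operation names do not correspond one-to-one to the numbered steps in the figures, and the destage runs inline rather than asynchronously.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical controller node with a write cache."""
    name: str
    write_cache: dict = field(default_factory=dict)


def write(receiver, partner, owner, disk, block, data):
    """Process one write and return a log of (operation, node name)."""
    log = []
    # The receiving node caches the write and coordinates the operation.
    receiver.write_cache[block] = data
    log.append(("receive write", receiver.name))
    # The write data is mirrored into the partner node's cache.
    partner.write_cache[block] = data
    log.append(("mirror cache", partner.name))
    # The receiving node acknowledges the write to the host.
    log.append(("acknowledge host", receiver.name))
    # Only the owner node destages the data, whichever node received it.
    disk[block] = owner.write_cache[block]
    log.append(("destage", owner.name))
    return log


def run(receiver_name):
    """Write one block through the named node. Node 1 owns the volume."""
    node1, node2 = Node("node1"), Node("node2")
    nodes = {"node1": node1, "node2": node2}
    receiver = nodes[receiver_name]
    partner = node2 if receiver is node1 else node1
    disk = {}
    log = write(receiver, partner, owner=node1, disk=disk, block=7, data=b"abc")
    return log, node1.write_cache, node2.write_cache, disk


log_via_1, cache1_a, cache2_a, disk_a = run("node1")
log_via_2, cache1_b, cache2_b, disk_b = run("node2")

# The kinds of operations, ignoring which node performed them.
ops_via_1 = [operation for operation, _ in log_via_1]
ops_via_2 = [operation for operation, _ in log_via_2]
```

In both runs the list of operations is the same, both caches hold the write data, and the destage entry names Node 1. Only the node that receives, mirrors from, and acknowledges the write changes.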

As explained in the preceding sections, active-active processing is always present on IBM FlashSystem. In addition, the system can be configured to use optimized and non-optimized paths. These two concepts do not interfere with each other. Therefore, the path model should not be confused with the active-active I/O processing mechanism.