The Autonomic Ownership Takeover Manager
The Autonomic Ownership Takeover Manager (AOTM) is a method by which either Read-Only Takeover (ROT) or Write-Only Takeover (WOT) can be automatically enabled against a failed TS7700 Cluster through internal negotiation methods. AOTM is optionally configurable.
When communication between two or more TS7700 clusters is disrupted, the local and remote clusters can no longer negotiate ownership of virtual volumes. In this scenario, ownership takeover, or human intervention, is sometimes used to establish temporary access to data resources. Ownership takeover occurs when an operator, knowing that one cluster is in a failed state, physically intervenes to obtain permission to access that cluster's data. When network problems cause the communication failure, ownership takeover is not the correct way to reestablish access.
Before AOTM intervenes to allow a working cluster access to data from a remote cluster, it must determine whether the remote cluster is inaccessible because the cluster itself has failed or because the Grid network has failed. AOTM does this by sending a status message across a network that connects the TSSCs associated with the clusters. This network is referred to as the TSSC Grid Network and is illustrated by the following figures.


If a working cluster in a TS7700 Grid cannot process transactions with a remote cluster because communication with that cluster has been lost, the local cluster starts an AOTM grace period timer. When the grace period that you configured expires, the AOTM cluster outage detection process starts. The local cluster communicates with the local TSSC, which forwards a request to the remote TSSC. The remote TSSC then attempts to communicate with the remote cluster. The configured takeover mode is enabled only when the remote TSSC's response comes back and confirms that the remote cluster has failed. Once the mode is enabled, data that is owned by the failed cluster can be accessed in that takeover mode.
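
The sequence above (grace period, then a request relayed through the TSSCs, then a decision) can be sketched in a few lines of Python. This is only an illustrative model, not TS7700 code: `AotmConfig`, `ProbeResult`, `evaluate_takeover`, and the `remote_tssc_probe` callback are hypothetical names introduced here, and the probe simply stands in for the local TSSC to remote TSSC to remote cluster request.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class TakeoverMode(Enum):
    ROT = "ROT"
    WOT = "WOT"


class ProbeResult(Enum):
    """Hypothetical outcomes of the request relayed through the TSSCs."""
    CLUSTER_FAILED = "remote TSSC confirms the remote cluster has failed"
    CLUSTER_REACHABLE = "remote TSSC reached the remote cluster"
    NO_RESPONSE = "no answer came back from the remote TSSC"


@dataclass(frozen=True)
class AotmConfig:
    grace_period_s: float   # assumed name for the user-configured grace period
    mode: TakeoverMode      # takeover mode to enable on a confirmed failure


def evaluate_takeover(
    seconds_since_contact_lost: float,
    config: AotmConfig,
    remote_tssc_probe: Callable[[], ProbeResult],
) -> Optional[TakeoverMode]:
    """Return the takeover mode to enable, or None if takeover stays disabled."""
    # Outage detection does not start until the grace period has expired.
    if seconds_since_contact_lost < config.grace_period_s:
        return None
    # Ask the remote TSSC (through the local TSSC) about the remote cluster.
    result = remote_tssc_probe()
    # Takeover is enabled only on a returned, positive confirmation of failure.
    if result is ProbeResult.CLUSTER_FAILED:
        return config.mode
    return None
```

In this model a missing response is treated the same as a reachable cluster: neither one enables takeover, which mirrors the requirement that the remote TSSC request must return and agree that the cluster has failed.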
Conditions required for takeover
When AOTM is enabled on multiple systems in a TS7700 Grid environment and one TS7700 Cluster fails, takeover occurs if all TSSC system consoles attached to the TS7700 Clusters remain in communication with one another.
Remote cluster appearance | Remote cluster actual state | Does present third peer recognize cluster as down? | Status of Grid links | Status of links between local TSSC and local cluster | Status of links between remote TSSC and remote cluster | Status of links between TSSCs | Notes |
---|---|---|---|---|---|---|---|
Down | Down | Yes | Not applicable | Connected | Connected | Connected | |
Down | Down | Yes | Not applicable | Connected | Down | Connected | |
Down | Online | Not present | Down | Connected | Down | Connected | Cluster is assumed down since last network path is down. |
Down | Offline | Yes | Not applicable | Connected | Not applicable | Connected | |

Remote cluster appearance | Remote cluster actual state | Does present third peer recognize cluster as down? | Status of Grid links | Status of links between local TSSC and local cluster | Status of links between remote TSSC and remote cluster | Status of links between TSSCs | Notes |
---|---|---|---|---|---|---|---|
Down | Down | Yes/not present | Not applicable | Connected | Not applicable | Down | |
Down | Down | Yes/not present | Not applicable | Down | Not applicable | Not applicable | |
Down | Online | No | Down | Not applicable | Not applicable | Not applicable | Third cluster prevents takeover |
Down | Online | Not present | Down | Connected | Down | Connected | Cluster is assumed down since last network path is down. |
Down | Online | Not present | Down | Connected | Connected | Connected | |
Down | Offline | Yes | Not applicable | Down | Not applicable | Down | Takeover not enabled if TSSC is not present or accessible |
Down | Offline | Yes | Not applicable | Connected | Not applicable | Down | Takeover not enabled if TSSC is not present or accessible |
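
One way to read the link-status columns in the tables above is as a single predicate over what the local cluster can observe. The sketch below is an interpretation under stated assumptions, not IBM's implementation: `Observation` and `takeover_allowed` are hypothetical names, and the rule encoded is inferred from the table notes (a present third peer that still sees the cluster blocks takeover, and takeover is not enabled when a TSSC is not present or accessible).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    local_tssc_link_up: bool            # local TSSC <-> local cluster
    tssc_to_tssc_link_up: bool          # local TSSC <-> remote TSSC
    remote_tssc_reaches_cluster: bool   # remote TSSC got an answer from the remote cluster
    third_peer_sees_down: Optional[bool] = None  # None means no third peer is present


def takeover_allowed(obs: Observation) -> bool:
    """Return True if, under the assumed rule, the configured takeover mode is enabled."""
    # A present third peer that still recognizes the cluster prevents takeover.
    if obs.third_peer_sees_down is False:
        return False
    # The whole TSSC path must be usable to ask about the remote cluster.
    if not (obs.local_tssc_link_up and obs.tssc_to_tssc_link_up):
        return False
    # Takeover only if the remote TSSC could not reach the remote cluster.
    return not obs.remote_tssc_reaches_cluster


# Cluster down, all TSSC links connected, third peer agrees.
print(takeover_allowed(Observation(True, True, False, True)))
# Link between the TSSCs is down, so the failure cannot be confirmed.
print(takeover_allowed(Observation(True, False, False, True)))
# A third peer still sees the cluster online.
print(takeover_allowed(Observation(True, True, False, False)))
```

Note that the predicate never looks at the remote cluster's actual state, only at what can be observed through the TSSC path. That is why a cluster that is online but whose last network path is down is assumed to be down.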
Differences between forms of takeover
AOTM is not the only form of ownership takeover that can be employed in the event of a system failure. The following descriptions of Service Ownership Takeover (SOT), Read Only Takeover/Write Only Takeover (ROT / WOT), and AOTM are provided to avoid confusion when discussing options for ownership takeover.
- SOT
- SOT is activated during normal operating conditions prior to bringing a system offline for upgrade, maintenance, or relocation purposes. A TS7700 Cluster in the SOT state surrenders ownership of all its data and other TS7700 Clusters in the Grid may access and mount its virtual volumes.
- ROT / WOT
- ROT / WOT is employed when a TS7700 Cluster is in the failed state and cannot be placed in SOT, or Service mode. In this state, the virtual volumes that belong to the failed TS7700 Cluster cannot be accessed or modified. You must use the TS7700 Management Interface from an active TS7700 Cluster in the Grid to establish ROT / WOT for the failed TS7700 Cluster.
- AOTM
- AOTM represents an automation of ROT / WOT. AOTM makes it unnecessary for a user to physically intervene through the TS7700 Management Interface to access data on a failed TS7700 Cluster. AOTM limits the amount of time that virtual volumes from a failed TS7700 Cluster are inaccessible and reduces opportunities for human error while establishing ROT / WOT.