Flashes (Alerts)
Abstract
A split-brain condition in an SVC stretched cluster can cause I/O timeouts in SDDPCM while it waits for the SVC nodes to recover from the condition.
Content
SVC storage can occasionally encounter a fault scenario called a split-brain condition, in which the two nodes of an SVC cluster I/O group (iogrp) fail to communicate with each other for some time. In a stretched cluster environment, the SVC takes approximately 95 to 120 seconds to recover from a split-brain condition. After this period, the expectation is that one of the nodes performs a target reset and the other node takes over ownership of the devices. During this time, any I/O to the storage is either queued, which causes a timeout, or encounters a no device response error.
If timeout and no device response errors are reported because of a split-brain condition, I/O failures can result, especially when the SVC takes more than 120 seconds to recover from the condition.
Not every timeout indicates a split-brain condition. Analysis by IBM Support is needed to verify that the condition occurred.
Resolution
A new ODM attribute, svc_sb_ttl (“SVC time to live”), is available starting with the 2.6.5.0 release of SDDPCM.
This hdisk attribute specifies how long an I/O remains active, measured from the time it was first issued, if a split-brain condition occurs.
The attribute is intended for SVC stretched cluster configurations. In all other configurations, the recommended value for svc_sb_ttl is zero.
Conditions
When SDDPCM detects that all good paths have been tried at least once, and the I/O encountered errors caused by timeout or no device response (both of which are indications of split-brain), the I/O is not failed immediately but is retried on all paths.
This gives the storage enough time to recover from the split-brain condition so that the I/O can complete successfully.
To enable this part of the algorithm, the value of svc_sb_ttl has to be greater than rw_timeout.
If the value of svc_sb_ttl is not greater than rw_timeout, the rw_timeout algorithm behaves as it would without the svc_sb_ttl tunable.
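The conditions above can be read as a small decision rule. The following Python sketch is only an illustrative model of that rule, not SDDPCM's actual implementation. The names HdiskSettings, ttl_window_enabled, and retry_under_ttl, the error labels, and the sample value of 30 seconds for rw_timeout are all assumptions made for the example. The function models the decision at the point where all good paths have already been tried at least once.

```python
from dataclasses import dataclass

# Error classes that the flash names as possible indications of split-brain.
# The string labels are hypothetical, chosen only for this sketch.
SPLIT_BRAIN_ERRORS = frozenset({"timeout", "no_device_response"})


@dataclass(frozen=True)
class HdiskSettings:
    """Per-hdisk tunables, in seconds (hypothetical container type)."""
    rw_timeout: int
    svc_sb_ttl: int = 0  # 0 is the default and leaves the feature off


def ttl_window_enabled(settings: HdiskSettings) -> bool:
    """The extended retry window applies only if svc_sb_ttl > rw_timeout."""
    return settings.svc_sb_ttl > settings.rw_timeout


def retry_under_ttl(settings: HdiskSettings, error: str, elapsed_s: float) -> bool:
    """Model the decision made after all good paths were tried at least once.

    Returns True if the I/O would be retried on all paths because of
    svc_sb_ttl, and False if the pre-existing rw_timeout handling applies.
    elapsed_s is the time since the I/O was first issued.
    """
    if not ttl_window_enabled(settings):
        return False  # behaves as it would without the tunable
    if error not in SPLIT_BRAIN_ERRORS:
        return False  # only timeout / no device response qualify
    return elapsed_s < settings.svc_sb_ttl


if __name__ == "__main__":
    stretched = HdiskSettings(rw_timeout=30, svc_sb_ttl=180)  # sample values
    for elapsed in (60, 120, 200):
        print(elapsed, retry_under_ttl(stretched, "timeout", elapsed))
```

In this model, a time to live of 180 seconds keeps an I/O eligible for retry through the 95 to 120 second recovery window described earlier, while a value of 0 never enables the extra retries.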
Recommended Fix
Attention:
Set the svc_sb_ttl attribute only in an SVC stretched cluster environment. In all other cases, retain the default attribute value, which is 0.
Syntax
pcmpath set device num1 [ num2 ] svc_sb_ttl t
Parameters
num1 [ num2 ]
When only num1 is specified, the command applies to the hdisk specified by num1.
When two device logical numbers are specified, the command applies to all devices whose logical numbers fall within the range defined by the two numbers.
t
The range of supported values for the time to live attribute of SVC is 0-240 seconds.
Example:
If you enter pcmpath set device 2 10 svc_sb_ttl 180, the time to live attribute of hdisk2 to hdisk10 is immediately changed to 180 seconds.
Note: The value of svc_sb_ttl must be greater than rw_timeout so that the SVC has additional time to recover from a split-brain scenario.
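As a pre-check before running the command, the range and the rw_timeout rule can be validated in a small script. The Python sketch below only builds the command string and does not run pcmpath. The function name build_set_ttl_command and the sample rw_timeout value of 30 seconds are assumptions for illustration; the actual rw_timeout of each hdisk has to be read from the system.

```python
from typing import Optional

# Supported range for svc_sb_ttl, in seconds, as stated in this flash.
TTL_MIN = 0
TTL_MAX = 240


def build_set_ttl_command(ttl: int, rw_timeout: int,
                          num1: int, num2: Optional[int] = None) -> str:
    """Validate the inputs and return the pcmpath command line as a string.

    Hypothetical helper: nothing is executed, the string is only returned.
    """
    if not TTL_MIN <= ttl <= TTL_MAX:
        raise ValueError(f"svc_sb_ttl must be {TTL_MIN}-{TTL_MAX} seconds, got {ttl}")
    # 0 is the default (feature off); any other value must exceed rw_timeout
    # for the extra recovery time to take effect.
    if ttl != 0 and ttl <= rw_timeout:
        raise ValueError(
            f"svc_sb_ttl ({ttl}) must be greater than rw_timeout ({rw_timeout})"
        )
    devices = str(num1) if num2 is None else f"{num1} {num2}"
    return f"pcmpath set device {devices} svc_sb_ttl {ttl}"


if __name__ == "__main__":
    # Same device range and value as the example above; rw_timeout is a sample.
    print(build_set_ttl_command(180, rw_timeout=30, num1=2, num2=10))
```

Rejecting a nonzero value that does not exceed rw_timeout is a choice made for this sketch, so that a setting with no effect is caught before it is applied.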
Contact the IBM Support Center if assistance is needed.
Document Information
Modified date:
25 September 2022
UID
ssg1S1004522