IBM Support

SDDPCM can report I/O failure during SVC split brain recovery period

Flashes (Alerts)


Abstract

When a split-brain condition occurs in an SVC stretched cluster, SDDPCM can report I/O timeouts while it waits for the SVC nodes to recover from the condition.

Content

SVC storage can encounter a fault scenario called a split-brain condition, in which the two nodes of an SVC cluster I/O group (iogrp) cannot communicate with each other for some time. In a stretched cluster environment, SVC takes approximately 95 to 120 seconds to recover from a split-brain condition. After this period, one of the nodes is expected to perform a target reset while the other takes over ownership of the devices. During the recovery period, any I/O to the storage is either queued, which causes a timeout, or encounters a no-device-response error.

If timeout and no-device-response errors are reported because of a split-brain condition, I/O failures can result, especially when SVC takes more than 120 seconds to recover from the condition.


Not every timeout indicates a split-brain condition. Analysis by IBM Support is needed to verify that the condition occurred.

Resolution

A new ODM attribute, svc_sb_ttl (“SVC time to live”), is available starting with the 2.6.5.0 release of SDDPCM.
This hdisk attribute specifies how long, in seconds, an I/O remains active after it is first issued if a split-brain condition occurs.

This attribute is intended for SVC stretched cluster configurations. For all other configurations, the recommended value of svc_sb_ttl is zero.

Conditions

When SDDPCM detects that all good paths have been tried at least once, and the I/O encountered errors caused by timeout or no device response (both indications of split-brain), the I/O is not failed immediately but is retried on all paths.

This gives the storage time to recover from the split-brain condition so that the I/O can complete successfully.

To enable this part of the algorithm, the value of svc_sb_ttl has to be greater than rw_timeout.

If the value of svc_sb_ttl is less than rw_timeout, the rw_timeout algorithm behaves as it would without the svc_sb_ttl tunable.
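
The interaction between the two attributes can be modeled roughly as follows. This is a minimal sketch of the behavior described in this section, not the SDDPCM implementation: the function name, the error labels, and the elapsed-time parameter are assumptions made for illustration only.

```python
# Illustrative model only. Names and structure are assumptions,
# not SDDPCM internals.

# Error kinds that the text treats as indications of split-brain.
SPLIT_BRAIN_ERRORS = {"timeout", "no_device_response"}


def split_brain_retry_applies(error, all_paths_tried, elapsed_s,
                              svc_sb_ttl, rw_timeout):
    """Return True if the I/O would be retried on all paths
    instead of being failed immediately."""
    # The extended retry is enabled only when svc_sb_ttl is
    # greater than rw_timeout (a value of 0 therefore disables it).
    if svc_sb_ttl <= rw_timeout:
        return False
    # Only timeout and no-device-response errors qualify.
    if error not in SPLIT_BRAIN_ERRORS:
        return False
    # The condition is evaluated once all good paths were tried.
    if not all_paths_tried:
        return False
    # The I/O stays active until svc_sb_ttl seconds have passed
    # since it was first issued.
    return elapsed_s < svc_sb_ttl
```

Under this model, an I/O that timed out 100 seconds after it was first issued, with svc_sb_ttl set to 180 and an assumed rw_timeout of 30, is still retried. The same I/O at 200 seconds is failed.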

Recommended Fix

Attention: Set the svc_sb_ttl attribute only for a stretched cluster environment of SVC. In other cases, retain the default attribute value, which is 0.

Syntax

pcmpath set device num1 [ num2 ] svc_sb_ttl t

Parameters
num1 [ num2 ]


When only num1 is specified, the command applies to the hdisk specified by num1.

When two device logical numbers are entered, the command applies to all devices whose logical numbers fall within the range defined by the two numbers.

t

The range of supported values for the time to live attribute of SVC is 0-240 seconds.

Example:

If you enter pcmpath set device 2 10 svc_sb_ttl 180, the time to live attribute of hdisk2 to hdisk10 is immediately changed to 180 seconds.

Note: The value of svc_sb_ttl must be greater than rw_timeout so that SVC has additional time to recover from a split-brain scenario.
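
The checks above can be applied before the command is run. The following is a minimal sketch, assuming a hypothetical helper named build_set_ttl_command and assuming that the caller already knows the rw_timeout value of the devices (for example, from the hdisk attributes). It only assembles the pcmpath argument list using the syntax shown in this document and does not execute anything.

```python
# Hypothetical helper for illustration; not part of SDDPCM.

def build_set_ttl_command(ttl, rw_timeout, num1, num2=None):
    """Validate ttl and return the pcmpath argument list."""
    # The supported range for svc_sb_ttl is 0-240 seconds.
    if not 0 <= ttl <= 240:
        raise ValueError("svc_sb_ttl must be between 0 and 240 seconds")
    # 0 is the default and leaves the feature disabled. Any other
    # value must be greater than rw_timeout to have an effect.
    if ttl != 0 and ttl <= rw_timeout:
        raise ValueError("svc_sb_ttl must be greater than rw_timeout")
    # One device number targets a single hdisk; two define a range.
    devices = [str(num1)] if num2 is None else [str(num1), str(num2)]
    return ["pcmpath", "set", "device", *devices, "svc_sb_ttl", str(ttl)]


# Reproduces the example above for hdisk2 to hdisk10,
# with an assumed rw_timeout of 30 seconds.
print(" ".join(build_set_ttl_command(180, 30, 2, 10)))
```

Rejecting a nonzero value that does not exceed rw_timeout is a design choice of this sketch. Per the Conditions section, such a value would be accepted but would leave the rw_timeout behavior unchanged.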


Contact the IBM Support Center if assistance is needed.

Product: System Storage Multipath Subsystem Device Driver
Platform: AIX 6.1/7.1
Version: 2.6.3.1; 2.6.3.2; 2.6.4.0

Document Information

Modified date:
25 September 2022

UID

ssg1S1004522