Concurrent software upgrade on IBM storage (SAN Volume Controller, FlashSystem) is a common task, but how can you ensure that your hosts can handle it? Here we list some of the known items that can affect a host's ability to successfully fail over (or fail back) the paths to the volumes on storage.
The following technote can be read first to understand the upgrade process on storage:
Spectrum Virtualize Family of Products Upgrades - Frequently Asked Questions and Pre-Upgrade Checklist
https://www.ibm.com/support/pages/node/688931
Note: This document assumes that each host port has multipath access to both nodes in the I/O group of the storage where the volumes reside.
For all operating systems
- Use IBM SSIC to verify that the host environment is compatible with the new storage code level to be installed:
IBM System Storage Interoperation Center (SSIC)
- Host mappings: verify whether any volumes on storage are mapped to multiple hosts that use different SCSI IDs
Some operating systems use the SCSI ID (also called the LUN ID) for volume handling. Cluster systems in particular take the SCSI ID into account for resource management and failover. Issues have been seen with ESX and MSCS cluster systems, and also when the volumes are connected for NPIV usage.
The following reference for ESX explains why the SCSI ID is also important for volume handling in newer ESX versions:
- Application settings
While it is generally advisable to set a 60-second disk timeout on operating systems, the applications running on the host also need to take the disk timeout into account. Configure each application with a timeout equal to or greater than the disk timeout of the operating system. This prevents the application from overreacting to otherwise recoverable SAN or path issues.
- Metro Mirror and Global Mirror relationships
When you update software on a system that has primary or secondary volumes of running Metro Mirror or Global Mirror relationships, write performance might be degraded on the primary volumes. Global Mirror relationships can be automatically stopped with one or more errors with error code 1920. You might want to proactively stop such relationships or consistency groups or the partnership before you update the software to avoid the write performance degradation, and restart the relationships after the update completes.
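The host mappings check in this section can be scripted. The following is a minimal sketch, not an IBM tool: it assumes that you already collected the host-to-volume mappings (for example, from the output of a command such as lshostvdiskmap) into (host, SCSI ID, volume) tuples. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def find_inconsistent_scsi_ids(mappings):
    """Return {volume: {host: scsi_id}} for volumes that are mapped to
    more than one host with SCSI IDs that are not all identical."""
    per_volume = defaultdict(dict)
    for host, scsi_id, volume in mappings:
        per_volume[volume][host] = scsi_id
    return {
        volume: hosts
        for volume, hosts in per_volume.items()
        if len(hosts) > 1 and len(set(hosts.values())) > 1
    }


# Hypothetical sample data: (host name, SCSI ID, volume name)
sample = [
    ("esx01", 0, "datastore_a"),
    ("esx02", 0, "datastore_a"),
    ("esx01", 1, "datastore_b"),
    ("esx02", 5, "datastore_b"),
    ("win01", 0, "sql_data"),
]

if __name__ == "__main__":
    for volume, hosts in find_inconsistent_scsi_ids(sample).items():
        print(f"{volume}: SCSI IDs differ between hosts: {hosts}")
```

In the sample data, only datastore_b is reported, because it is mapped to two hosts with different SCSI IDs. Volumes mapped to a single host are ignored.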
Here are the most common configuration options to verify per operating system:
VMware
- Verify that all VMware ESXi settings are set to the recommended values, as per the IBM Documentation portal.
Steps 3 and 4 are the most important for successful concurrent maintenance.
(Link is for SVC at 8.5.0.x, modify for your specific product and version upgrade target)
- If the host hardware platform is Cisco UCS, review the following flash for exposure:
flash
MS Windows
- Set the proper O/S disk timeout
Using regedit.exe, modify the key:
HKEY_LOCAL_MACHINE > System > CurrentControlSet > Services > Disk > TimeOutValue.
Set the value data to 0x3c (hexadecimal) or 60 (decimal) and reboot the OS for the change to take effect.
- Microsoft MPIO driver is the recommended multipath driver (IBM SDD is no longer supported)
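The disk timeout value can also be read without opening regedit.exe. The following is a minimal sketch that uses the winreg module from the Python standard library. The function names are hypothetical, and the check simply compares the value with the 60 seconds named above.

```python
import sys

EXPECTED_TIMEOUT_SECONDS = 60  # 0x3c, the value named in this document
DISK_KEY = r"SYSTEM\CurrentControlSet\Services\Disk"


def timeout_matches(value, expected=EXPECTED_TIMEOUT_SECONDS):
    """True when the registry value equals the expected timeout."""
    return value is not None and value == expected


def read_disk_timeout():
    """Read TimeOutValue from the registry; None if it is not set."""
    import winreg  # available on Windows only, so imported lazily

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, DISK_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "TimeOutValue")
            return int(value)
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    if sys.platform == "win32":
        current = read_disk_timeout()
        state = "OK" if timeout_matches(current) else "CHECK"
        print(f"TimeOutValue = {current} ({state})")
    else:
        print("Not a Windows host; nothing to check.")
```

The sketch only reads the value. Changing it, and the reboot that follows, is left as a manual step.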
AIX
- Verify rw_timeout and application timeout settings
- Recommended Multi-path Driver
- Verify host exposure to APAR HU01894:
Linux
- If the host is running RHEL6, RHEL7, or SLES12, make sure that the scsi_mod.inq_timeout parameter is set to 70 seconds. Otherwise, these operating systems cannot regain previously failed paths, such as after a system update or when a node is manually rebooted.
- For all Linux distributions, verify the following settings for proper handling of concurrent maintenance on storage:
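Separately, for the scsi_mod.inq_timeout item above, the value currently in effect can be read from sysfs. The following is a minimal sketch, assuming that the running kernel exposes the parameter at /sys/module/scsi_mod/parameters/inq_timeout. The function names are hypothetical.

```python
from pathlib import Path

INQ_TIMEOUT_PATH = Path("/sys/module/scsi_mod/parameters/inq_timeout")
EXPECTED_SECONDS = 70  # the value named in this document


def read_inq_timeout(path=INQ_TIMEOUT_PATH):
    """Return the inquiry timeout in seconds, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def inq_timeout_ok(value, expected=EXPECTED_SECONDS):
    """True when the running value equals the expected timeout."""
    return value is not None and value == expected


if __name__ == "__main__":
    current = read_inq_timeout()
    state = "OK" if inq_timeout_ok(current) else "CHECK"
    print(f"scsi_mod.inq_timeout = {current} ({state})")
```

The sketch reports the running value only. How the parameter is made persistent (for example, on the kernel command line) depends on the distribution and is not covered here.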
Oracle Solaris
- sd_io_time: This parameter specifies the timeout value for disk operations. Add the following line to the /etc/system file to set the sd_io_time parameter for the system LUNs:
set sd:sd_io_time=0x78
- sd_retry_count: This parameter specifies the retry count for disk operations. Add the following line to the /etc/system file to set the sd_retry_count parameter for the system LUNs:
set sd:sd_retry_count=5
Note: The sd_retry_count parameter applies to Solaris versions 8 and 9 only.
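The sd settings can be verified by parsing the /etc/system file. The following is a minimal sketch with a hypothetical function name and an inline sample instead of the real file. It decodes hexadecimal values, so 0x78 is reported as 120.

```python
def parse_sd_settings(text):
    """Collect `set sd:<name>=<value>` entries from /etc/system-style text.

    Values are returned as integers; a 0x prefix is decoded as hexadecimal.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Lines that start with '*' are comments in /etc/system.
        if not line or line.startswith("*") or not line.startswith("set "):
            continue
        name, sep, value = line[len("set "):].strip().partition("=")
        name = name.strip()
        if not sep or not name.startswith("sd:"):
            continue
        try:
            settings[name[len("sd:"):]] = int(value.strip(), 0)
        except ValueError:
            continue  # skip values this sketch cannot interpret
    return settings


# Inline sample that mirrors the two lines shown in this section.
sample = """* IBM storage host settings
set sd:sd_io_time=0x78
set sd:sd_retry_count=5
"""

if __name__ == "__main__":
    for name, value in parse_sd_settings(sample).items():
        print(f"{name} = {value}")
```

To check a real host, pass the contents of /etc/system to parse_sd_settings instead of the sample string.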
More settings here:
https://www.ibm.com/docs/en/sanvolumecontroller/8.5.x?topic=csos-setting-oracle-host-parameters-use-nmp-veritas-dmp-1
More recommended host settings are specified in the Host attachment sections of the IBM Documentation portal.
(Link is for SVC at 8.5.0.x, modify for your specific product and version upgrade target)