How To
Summary
Customers can take proactive steps to limit their exposure to a known Broadcom (formerly Brocade) Fibre Channel switch defect. The defect affects only AIX hosts and VIO servers that use Spectrum Virtualize storage products.
Objective
A documented Broadcom (formerly Brocade) switch firmware defect causes some AIX LPARs to lose all access to storage during node reboots on Spectrum Virtualize storage arrays. Customers can take proactive steps to limit their exposure.
The purpose of this article is to document those proactive steps.
Environment
AIX (all versions)
VIOS/PowerVM (all versions)
4 Gb, 8 Gb, or 16 Gb Fibre Channel adapters
Virtual (NPIV) adapters
Broadcom (formerly Brocade) Fibre Channel switches (all switch firmware versions)
Spectrum Virtualize Storage products (such as SVC, V7K, FS9100)
Steps
Before taking any of the options below, assess your past experience. How have storage-side microcode upgrades and other storage-side maintenance activities gone in the past?
How many opportunities to encounter this problem have there been? For an 8-node SVC cluster, a single microcode upgrade provides 16 opportunities.
If you have performed storage-side microcode upgrades in the past without encountering an issue, you are unlikely to encounter one in the future.
If you have suffered host-side issues during storage-side maintenance in the past, the following steps can limit that exposure:
1. Minimize the amount of I/O being driven at the time of storage maintenance.
2. If possible, shut down non-critical AIX hosts/applications to further reduce the number of hosts sharing the same storage.
3. If a DR site is available, fail hosts over to the DR site, do maintenance on the primary site, then fail back.
4. For any maintenance that will impact only ONE SVC node (such as a hardware replacement), manually remove the host-side paths to that node before the maintenance with the rmpath command. After the maintenance, bring those paths back online with cfgmgr. Take care to remove the correct paths.
Performing maintenance during off-production or low-utilization periods is always a good idea and a general recommendation, so option 1 is the most important.
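Because option 4 depends on removing exactly the right paths, it can help to generate the rmpath commands for review before running anything. The following is a minimal dry-run sketch in Python, not an IBM-supplied tool. It assumes the path list was captured with lspath -F "name parent connection status", that each connection value has the form "target WWPN,LUN id", and that the WWPNs of the node under maintenance are already known. The sample disk names, adapter names, and WWPNs are hypothetical. The script only prints commands; check the rmpath flags against the rmpath documentation for your AIX level before use.

```python
# Dry-run sketch: build rmpath commands for paths that lead to one storage node.
# Nothing here runs a command; it only prints what an administrator could review.

def parse_lspath(output):
    """Parse lines of 'name parent connection status' into dictionaries."""
    paths = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        name, parent, connection, status = fields
        paths.append({"name": name, "parent": parent,
                      "connection": connection, "status": status})
    return paths


def rmpath_commands(paths, node_wwpns):
    """Return rmpath commands for paths whose target WWPN belongs to the node."""
    wanted = {w.lower() for w in node_wwpns}
    commands = []
    for p in paths:
        # Assumption: connection has the form '<target WWPN>,<LUN id>'.
        wwpn = p["connection"].split(",")[0].lower()
        if wwpn in wanted:
            commands.append(
                f'rmpath -l {p["name"]} -p {p["parent"]} -w "{p["connection"]}"'
            )
    return commands


# Hypothetical sample of: lspath -F "name parent connection status"
SAMPLE = """\
hdisk2 fscsi0 500507680140aaaa,0 Enabled
hdisk2 fscsi0 500507680140bbbb,0 Enabled
hdisk2 fscsi1 500507680130aaaa,0 Enabled
hdisk2 fscsi1 500507680130bbbb,0 Enabled
"""

# Hypothetical WWPNs of the single node going into maintenance.
NODE_WWPNS = ["500507680140aaaa", "500507680130aaaa"]

if __name__ == "__main__":
    for cmd in rmpath_commands(parse_lspath(SAMPLE), NODE_WWPNS):
        print(cmd)
    print("cfgmgr  # after maintenance, to rediscover the removed paths")
```

Reviewing the printed list against the node's known WWPNs before running it keeps the paths to the partner node untouched, which is the point of the caution in option 4.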
Additional Information
See the related URL below for the Broadcom defect details.
Related Information
Document Location
Worldwide
Product Synonym
AIX; SVC; V7K; FS9100; Spectrum Virtualize
Document Information
Modified date:
03 May 2021
UID
ibm11100067