IBM Support

Drive firmware levels MS24, MS34, MS36 and MS39 can lead to detected data loss following system or drive code maintenance

Flashes (Alerts)


Abstract

An issue has been discovered where data on certain Tier 1 SAS SSDs running firmware levels MS24, MS34, MS36 or MS39 is not kept refreshed and can degrade over time. Following any system maintenance such as upgrades, or certain configuration activities, an array scrub will be initiated which can discover the degraded data. If multiple drives in the same array are affected, then this can cause a detected data loss.

Customers with drives using these firmware levels should contact IBM Support, who can determine whether the system is exposed and provide an appropriate action plan for safely removing the exposure. Upgrading the drive firmware without first having a system healthcheck is not recommended.

These drives can be identified by running the command “lsdrive [drive_id]” from the CLI and looking for MS24, MS34, MS36 or MS39 in the resulting output.

Content

Table of affected drive models:

Feature                         Impacted firmware levels   Product ID
2.5” SAS Tier 1 SSD - 1.92TB    MS24                       MZILS1T9HEJH
2.5” SAS Tier 1 SSD - 3.84TB    MS24                       MZILS3T8HMLH
2.5” SAS Tier 1 SSD - 1.9TB     MS34, MS36, MS39           MZILT15THMLA
2.5” SAS Tier 1 SSD - 3.8TB     MS34, MS36, MS39           MZILT1T9HAJQ
2.5” SAS Tier 1 SSD - 7.6TB     MS34, MS36, MS39           MZILT3T8HALS
2.5” SAS Tier 1 SSD - 15TB      MS34, MS36, MS39           MZILT7T6HMLA

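
As an illustration only, the table above can be expressed as a small lookup for checking a drive's product ID and firmware level together. This is a minimal sketch: the names AFFECTED and is_affected are hypothetical and not part of any IBM tool, and the pairings are transcribed from the table exactly as published.

```python
# Affected firmware levels per product ID, transcribed from the table above.
# AFFECTED and is_affected are illustrative names, not part of any IBM tooling.
AFFECTED = {
    "MZILS1T9HEJH": {"MS24"},
    "MZILS3T8HMLH": {"MS24"},
    "MZILT15THMLA": {"MS34", "MS36", "MS39"},
    "MZILT1T9HAJQ": {"MS34", "MS36", "MS39"},
    "MZILT3T8HALS": {"MS34", "MS36", "MS39"},
    "MZILT7T6HMLA": {"MS34", "MS36", "MS39"},
}


def is_affected(product_id, firmware_level):
    """Return True if this product ID / firmware level pair appears in the table."""
    levels = AFFECTED.get(product_id.strip().upper(), set())
    return firmware_level.strip().upper() in levels
```

For example, is_affected("MZILS1T9HEJH", "MS24") is true, while the same drive at MS2A is not flagged. A positive result only indicates that IBM Support should be contacted; it does not replace a healthcheck.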
In most situations where a system contains drives using MS24 firmware, support should be contacted and a "Snap Type 4: Standard logs plus new statesaves" obtained for a healthcheck, which will determine the best way to mitigate the exposure.
There are two exceptions where a healthcheck is not necessary. In these cases, the drive microcode should be updated to MS2A or later, or MS3E or later depending on model, which can be found in the latest drive firmware package available at IBM Fix Central:
•    If a system code update has been applied within the last 6 months
•    Systems containing a small number of these drives - a single drive in each RAID5 array, or two or fewer drives in each RAID6 array
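
The two exceptions above can be sketched as a simple decision function. This is a hedged illustration, not an IBM procedure: ArrayInfo and healthcheck_needed are hypothetical names, "6 months" is approximated as 183 days, and any array type other than RAID5 or RAID6 that contains an affected drive is conservatively treated as needing a healthcheck because the notice does not cover it.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ArrayInfo:
    raid_level: str       # e.g. "raid5" or "raid6" (hypothetical labels)
    affected_drives: int  # drives in this array at an affected firmware level


# Maximum affected drives per array for the second exception.
EXCEPTION_LIMITS = {"raid5": 1, "raid6": 2}


def healthcheck_needed(arrays, last_code_update, today):
    """Return False only when one of the two documented exceptions applies."""
    # Exception 1: a system code update within the last 6 months
    # (approximated here as 183 days, which is an assumption).
    if last_code_update is not None:
        if today - last_code_update <= timedelta(days=183):
            return False
    # Exception 2: every array holds only a small number of affected drives.
    for array in arrays:
        if array.affected_drives == 0:
            continue
        limit = EXCEPTION_LIMITS.get(array.raid_level.lower())
        # Array types not named in the notice are treated conservatively.
        if limit is None or array.affected_drives > limit:
            return True
    return False
```

When in doubt, the safer reading is to treat a healthcheck as required and contact IBM Support.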
Where a healthcheck is necessary, it is important to avoid any maintenance or configuration activity on a system until support confirms that there is no risk. This includes:
•    System code installation
•    Drive firmware installation
•    Array rebuild
•    Array parameter alteration
The cause of this exposure is the combination of two failures. The first is in drive firmware level MS24, which does not refresh degrading data before the voltages drop below a readable level.
The second is the background RAID parity scrub stalling after 200 days in which no configuration activity is performed on an array. This scrub has the side effect of keeping data refreshed, so the exposure exists only where the parity scrub has stalled. APAR HU02277 prevents the parity scrub from stalling. The fix is included in 8.3.1.3 and will also be included in future PTFs for other code streams.
Restarting the scrub after a long period in which it has not been running can result in drives being rejected due to an excessive number of media errors. This can then result in data loss as further areas of degraded data are discovered on arrays that no longer have redundancy.
IBM Support can identify the state of the parity scrub from livedumps and supply an ‘ifix’ system code which allows the scrub to restart without risk of rejecting drives. This allows all degraded data to be restored from RAID parity.

Determining the drive firmware level using the GUI

  • Open Pools -> Internal Storage.
  • Right click on one of the table headers (e.g. Use) and check the “Firmware Level” box.
  • The firmware level is now added to the table.
Alternatively, the following CLI command can be used: "lsdrive -gui | grep MS24"
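
The command above matches only MS24. If the drive listing has been saved to a text file, a small script can pick out lines for all four affected levels. This is a minimal sketch under stated assumptions: affected_lines is a hypothetical helper, the exact column layout of the listing is not assumed, and the firmware level is expected to appear as a separate token on each drive's line.

```python
import re

# Firmware levels named in this notice.
AFFECTED_LEVELS = ("MS24", "MS34", "MS36", "MS39")

# Match a level only as a whole token, so that MS2A or XMS24 do not match.
_PATTERN = re.compile(r"\b(" + "|".join(AFFECTED_LEVELS) + r")\b")


def affected_lines(listing):
    """Return the lines of a saved drive listing that mention an affected level."""
    return [line for line in listing.splitlines() if _PATTERN.search(line)]
```

Passing the saved listing text to affected_lines returns only the lines that mention MS24, MS34, MS36 or MS39, which can then be cross-checked against the table of affected drive models.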


Document Information

Modified date: 21 April 2023

UID: ibm16380846