
IBM Spectrum Scale Active File Management (AFM) and AFM Asynchronous Disaster Recovery (DR)

Flashes (Alerts)


Abstract

IBM has identified certain situations with respect to Active File Management (AFM) and AFM Asynchronous Disaster Recovery (DR) in IBM Spectrum Scale that may result in undetected data corruption:

- AFM may intermittently read files from the home cluster incorrectly, which could result in undetected data corruption due to Direct IO usage.
- AFM may have undetected data corruption when eviction and read operations run in parallel on the same file.
- AFM cache may incorrectly read a file from the home cluster due to the incorrect calculation of the file sparseness information, potentially resulting in undetected data corruption.
- If parallel IO is enabled, AFM and AFM Asynchronous DR may experience undetected data corruption with failover, resync and changeSecondary commands.
- AFM Asynchronous DR failback may read HSM migrated files from the acting AFM Primary cluster (originally the AFM Secondary cluster) as sparse files, potentially causing the AFM cache to return incorrect data (all zeros) to an application on a read.

Content

IBM has identified certain situations with respect to Active File Management (AFM) and AFM Asynchronous Disaster Recovery (DR) in IBM Spectrum Scale that may result in undetected data corruption:

1. AFM may intermittently read files from the home cluster incorrectly, which could result in undetected data corruption due to Direct IO usage.


    Problem Summary:
    As a result of Direct IO usage, undetected data corruption may occur while reading a file from the home cluster. Applications may fail after reading the file, or the corruption may go entirely undetected.

    Users affected:
    Users may be affected when using AFM caching (all modes) running IBM Spectrum Scale V4.2.0.0 thru 4.2.0.4.

    Recommendations:
    - Any affected users should apply an efix for APAR IV87388 for their level of code by contacting IBM Service.
    - If you believe that your GPFS file system may be affected by this issue, please contact IBM Service as soon as possible for further guidance and assistance.


2. In the event of manual eviction and read operations running in parallel on the same file, AFM may experience undetected data corruption.

    Problem Summary:
    Undetected data corruption may occur when manual eviction and read operations run in parallel on the same file. If a manual eviction is started on a file while the application or AFM prefetch has already begun reading that file from home, the eviction clears all of the data blocks allocated to the file, and the read process then incorrectly caches the file as a sparse file, so the application sees zeros on a read. This situation does not occur during AFM auto-eviction, because auto-eviction selects files using a least-recently-used (LRU) approach.
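    The race described above can be sketched conceptually as follows. This is not GPFS code; the function and block layout are hypothetical, purely to show how an eviction that clears allocated blocks mid-read makes the reader treat the file as sparse and cache zeros.

```python
def read_with_eviction_race(file_blocks, evict_during_read=True):
    """Simulate a cache read racing with a manual eviction (illustrative only).

    file_blocks: mutable list of data blocks; None models an unallocated block.
    Returns the blocks as cached by the read path.
    """
    cached = []
    for i in range(len(file_blocks)):
        if evict_during_read and i == 1:
            # Manual eviction fires mid-read and clears ALL allocated
            # data blocks of the file, as described in the flash.
            for j in range(len(file_blocks)):
                file_blocks[j] = None
        # The read path treats an unallocated block as a sparse hole and
        # caches zeros instead of re-fetching the block from home.
        block = file_blocks[i]
        cached.append(block if block is not None else b"\x00" * 4)
    return cached
```

    With the eviction interleaved, only blocks read before the eviction survive; the rest are cached as zeros, which is exactly the "application displaying zeros on a read" symptom.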

    Users affected:
    Users may be affected when both of the following conditions are met:
    1. AFM caching running GPFS V3.5.0.11 thru 3.5.0.31, V4.1.0.0 thru 4.1.0.8, or IBM Spectrum Scale V4.1.1.0 thru V4.1.1.8, or V4.2.0.0 thru V4.2.0.4; and
    2. The initiation of a manual eviction command on a file concurrent with the AFM gateway node reading the same file from the home cluster.

    Recommendations:
    - Any user meeting both conditions should apply an efix for their level of code by contacting IBM Service:
    V3.5.0.0 thru 3.5.0.31, apply 3.5.0.32 or contact IBM Service for APAR IV87370
    V4.1.0.0 thru 4.1.0.8, apply APAR IV87371
    V4.1.1.0 thru 4.1.1.8, apply APAR IV87372
    V4.2.0.0 thru 4.2.0.4, apply APAR IV87368
    - If you believe that your GPFS file system may be affected by this issue, please contact IBM Service as soon as possible for further guidance and assistance.

3. AFM cache may incorrectly read a file from the home cluster due to the incorrect calculation of the file sparseness information, potentially resulting in undetected data corruption.

    Problem Summary:
    Before reading a file, AFM queries the file's sparseness information from the home cluster so that it can read exactly the same number of blocks and reproduce the file as a sparse file in the cache. If the file's metadata has not yet been committed to disk at the home cluster, recent disk address changes may not be reflected in the indirect block when the cache queries the sparseness information, so incorrect sparseness information may be returned. If the file is larger than afmReadSparseThreshold (default 128MB), the incorrect sparseness information causes the AFM cache to read the file as a sparse file even though the file is not sparse at the home cluster. This situation may occur when the cache starts reading the file immediately after the file was written at home and home is running GPFS with AFM enabled.
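    The role of the threshold can be sketched as follows. This is a simplified, hypothetical model (the real AFM read path differs); it only illustrates why files above afmReadSparseThreshold are exposed to stale sparseness information while smaller files are not.

```python
AFM_READ_SPARSE_THRESHOLD = 128 * 1024 * 1024  # default 128MB, per this flash

def plan_cache_read(file_size, home_sparse_map):
    """Decide which (offset, length) extents to fetch from home (illustrative).

    home_sparse_map: data extents reported by the home cluster. If home's
    metadata was not yet committed, this map may be empty or incomplete
    even for a fully written file.
    """
    if file_size <= AFM_READ_SPARSE_THRESHOLD:
        # Below the threshold the whole file is read, so stale
        # sparseness information cannot cause missing data.
        return [(0, file_size)]
    # Above the threshold, only the reported data extents are fetched;
    # everything else becomes holes (zeros) in the cached copy.
    return list(home_sparse_map)
```

    With an empty (stale) sparse map and a file above the threshold, no extents are fetched and the cached file is entirely holes, matching the corruption described above.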

    Users affected:
    Users may be affected when all of the following conditions are met:
    1. IBM Spectrum Scale (GPFS) V4.1.0.0 thru 4.1.0.8, V4.1.1.0 thru 4.1.1.8, or V4.2.0.0 thru 4.2.0.4 is running;
    2. The home cluster is running GPFS and home is enabled for AFM (the mmafmconfig command was executed); and
    3. File size exceeds afmReadSparseThreshold (default 128MB).

    Recommendations:
    - Any customer meeting these conditions should apply an efix for their level of code by contacting IBM Service:
    V4.1.0.0 thru 4.1.0.8, apply APAR IV87384
    V4.1.1.0 thru 4.1.1.8, apply APAR IV87385
    V4.2.0.0 thru 4.2.0.4, apply APAR IV87383
    - If you believe that your GPFS file system may be affected by this issue, please contact IBM Service as soon as possible for further guidance and assistance.

4. When parallel IO is enabled, the use of the failover or resync commands (AFM caching modes) or the changeSecondary command (AFM DR mode) may result in undetected data corruption.

    Problem Summary:
    When parallel IO is enabled, the master gateway node splits the write request into multiple chunks and assigns the work of writing the file to multiple gateway nodes. After each write, the modification time (mtime) of the home file is updated with the cache file's modification time. AFM uses the file modification time to verify whether the file has changed between cache and home. Because the modification times then match between cache and home (or between primary and secondary) when the next chunk is processed, that write chunk is incorrectly dropped. This situation affects only the failover, resync, and changeSecondary paths with parallel IO enabled.
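    The flawed check can be sketched as follows. The function and its arguments are hypothetical, for illustration only; the sketch models how copying the mtime to home after the first chunk makes every later chunk look like a no-op.

```python
def flush_chunks(chunks, cache_mtime, home_mtime):
    """Model the flawed parallel-IO chunk flush (illustrative only).

    Each chunk should be written, but after a write the home mtime is
    updated to the cache mtime, so subsequent chunks see matching mtimes
    ("file unchanged") and are dropped.
    """
    written = []
    for chunk in chunks:
        if home_mtime == cache_mtime:
            continue  # mtimes match: file looks unchanged, chunk dropped
        written.append(chunk)
        home_mtime = cache_mtime  # home mtime updated after each write
    return written
```

    For a file split into several chunks (i.e. larger than afmParallelWriteThreshold), only the first chunk survives; the remainder of the file is silently never transferred.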

    Users affected:
    Users may be affected when all of the following conditions are met:
    1. IBM Spectrum Scale V4.1.1.4 thru 4.1.1.7 or V4.2.0.0 thru 4.2.0.3 is running;
    2. Failover or resync commands (AFM caching modes) or the changeSecondary command (AFM DR mode) with parallel IO enabled;
    3. File size exceeds afmParallelWriteThreshold (default 1GB).

    Recommendations:
    - Any customer meeting all of the conditions should apply an efix for their level of code by contacting IBM Service:
    V4.1.1.4 thru 4.1.1.7, apply 4.1.1.8 or contact IBM Service for APAR IV85385
    V4.2.0.0 thru 4.2.0.3, apply 4.2.0.4 or contact IBM Service for APAR IV86161
    - If you believe that your GPFS file system may be affected by this issue, please contact IBM Service as soon as possible for further guidance and assistance.

5. AFM Asynchronous DR failback may read HSM migrated files from the acting AFM Primary cluster (originally the AFM Secondary cluster) as sparse files. This situation may result in the AFM cache returning incorrect data (all zeros) to an application on a read.


    Problem Summary:
    As a result of a missing read on HSM migrated files at the acting Primary cluster, upon failback the original AFM DR Primary cluster may incorrectly read the migrated file from the acting Primary cluster as a fully sparse file.

    Users affected:
    Users may be affected when both of these conditions are met:
    1. IBM Spectrum Scale v4.2.0.0 thru 4.2.0.4 is running at the AFM Primary cluster; and
    2. HSM migration is enabled on the AFM Secondary cluster.

    Recommendations:
    - Any customer planning to use AFM DR with the affected IBM Spectrum Scale V4.2 code levels (V4.2.0.0 thru 4.2.0.4) should refrain from enabling HSM migration on the Secondary side of an AFM DR fileset relationship until an efix for APAR IV87373 for their level of code has been applied; contact IBM Service to obtain the efix.
    - If you believe that your GPFS file system may be affected by this issue, please contact IBM Service as soon as possible for further guidance and assistance.


Document Information

Modified date:
25 September 2022

UID

isg3T1024249