IBM Support

IJ50563: STALE DATA OF FILES WITH NUMBER OF DATA BLOCKS MORE THAN 5000 MIGHT NOT BE REPAIRED DURING DISK START PROCESS WITH MMCHDISK

APAR status

  • Closed as program error.

Error description

  • In a file system with replication configured, if disk failures
    on one replica disk cause missed updates to some data blocks of
    a large file (more than 5000 data blocks), the resulting stale
    replicas might not be repaired when helper nodes take part in
    the repair.
    

Local fix

  • Specify only the file system manager (fs mgr) node as the
    participant node for the mmchdisk command.
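
The workaround can be sketched as the following shell snippet. It is a minimal illustration, not IBM-provided tooling: the file system name fs1 and the node name node1 are hypothetical placeholders, and it assumes that the mmchdisk -N option is how the participant nodes are restricted and that mmlsmgr reports the current file system manager node. Check both against the command reference for your release. The function only prints the command, so nothing changes until you run the printed line yourself.

```shell
#!/bin/sh
# Sketch only: "fs1" and "node1" are hypothetical placeholders.
# Find the current file system manager first, for example with:
#   mmlsmgr fs1

# Print an mmchdisk command that applies "start" to all disks of a
# file system (-a) while naming a single participant node
# (assumed: the -N option).
build_start_cmd() {
    device=$1   # file system device name
    fsmgr=$2    # file system manager node
    printf 'mmchdisk %s start -a -N %s\n' "$device" "$fsmgr"
}

# Dry run: show the command instead of executing it.
build_start_cmd fs1 node1
```

Replace fs1 and node1 with your own file system and its manager node before running the printed command.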
    

Problem summary

  • In a file system with replication configured, if disk failures
    on one replica disk cause missed updates to some data blocks of
    a large file (more than 5000 data blocks), the resulting stale
    replicas might not be repaired when helper nodes take part in
    the repair.
    

Problem conclusion

  • This problem is fixed in 5.1.9.3.
    For all Spectrum Scale APARs and their respective fixes, see:
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    Benefits of the solution:
    Prevents stale replicas of large files from being left
    unrepaired.
    
    Work Around:
    Specify only the file system manager (fs mgr) node as the
    participant node for the mmchdisk command.
    
    Problem trigger:
    I/O errors on a disk cause it to be marked as "down", further
    write failures then occur on a large file with more than 5000
    data blocks, and the down disk is subsequently started with
    multiple participant nodes.
    
    Symptom:
    Replica mismatch
    
    Platforms affected:
    All Operating Systems
    
    Functional Area affected:
    Scale Users
    
    Customer Impact:
    Critical
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ50563

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    519

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2024-03-21

  • Closed date

    2024-03-21

  • Last modified date

    2024-03-21

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels


Document Information

Modified date:
04 April 2024