IBM Support

IJ57339: POTENTIAL UNDETECTED DATA LOSS AFTER POWERHA FAILOVER, DURING JF

APAR status

  • Closed as program error.

Error description

  • **************************************************************
    * USERS AFFECTED:
    * Systems running PowerHA 7.2TL10 with
    * any of the following filesets at or between the given levels:
    * MIN          MAX          FILESET
    * 7.2.10.0     7.2.10.0     cluster.es.server.events
      **************************************************************
    * ERROR DESCRIPTION:
    * When a PowerHA node acquires a jfs2 filesystem that had not
    * been cleanly unmounted, such as after a crash or halt, the
    * filesystem log may not be replayed when fsck is executed on
    * the filesystem.
    * This may result in undetected data loss as the filesystem is
    * brought to a consistent state without replaying the log.
      **************************************************************
    * RECOMMENDATION:
    * Install APAR IJ57339.
    * Prior to fix availability, an interim fix is available from
    * either
    * ftp://aix.software.ibm.com/aix/ifixes/ij57270/
    * https://aix.software.ibm.com/aix/ifixes/ij57270/
    * Installation of the ifix does not require a reboot.
      **************************************************************
    

Local fix

Problem summary

  • Under specific recovery conditions, logredo may be skipped for
    filesystems with MOUNTGUARD enabled, resulting in fsck executing
    without prior log replay. This can lead to discarding of the
    jfslog and potential filesystem inconsistencies.
    APAR IJ57270 addresses and resolves this issue.
    
    APAR Description
    
    When a filesystem protected by MOUNTGUARD is acquired following
    an unclean unmount, PowerHA is expected to run logredo before
    invoking fsck to ensure that journaled metadata operations are
    replayed.
    
    However, it was observed that if all PowerHA cluster nodes are
    active and the filesystem is not mounted on any node, the
    recovery logic incorrectly determines that a log replay should
    not occur, because the existing code requires at least one node
    to be down before initiating logredo.
    
    As a result, logredo does not execute when it should, and
    cl_activate_fs.sh proceeds to run fsck without the logredo step.
    This causes pending journal entries to be discarded, leading to
    potential filesystem inconsistencies during mount.
    This behavior affects filesystem integrity and can cause
    unexpected repair operations or incomplete metadata recovery.
    
    Affected PowerHA Releases
    The following PowerHA versions are impacted and require APAR
    IJ57270:
    
    7.2.6 GA -> 7.2.6 SP4
    7.2.7 GA -> 7.2.7 SP3
    7.2.8 GA -> 7.2.8 SP3
    7.2.9 GA -> 7.2.9 SP1
    7.2.10 GA
    
    Customers running any of the above releases should apply the
    APAR fix to ensure correct filesystem recovery behavior under
    MOUNTGUARD.
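
The affected-release list above can be expressed as a small lookup. The following is a minimal sketch, not an IBM tool: the function name `is_listed_as_affected` and the table name `AFFECTED_MAX_SP` are hypothetical, service pack 0 is assumed to stand for GA, and the check does not know whether the interim fix is already installed.

```python
# Highest affected service pack per release, copied from the list above.
# 0 means only the GA level is listed. Names here are illustrative assumptions.
AFFECTED_MAX_SP = {
    "7.2.6": 4,
    "7.2.7": 3,
    "7.2.8": 3,
    "7.2.9": 1,
    "7.2.10": 0,
}


def is_listed_as_affected(release: str, service_pack: int) -> bool:
    """Return True if the release/SP pair falls inside a listed range.

    A False result only means the level is not in this APAR's list;
    it says nothing about levels the APAR does not mention.
    """
    max_sp = AFFECTED_MAX_SP.get(release)
    if max_sp is None:
        return False
    return 0 <= service_pack <= max_sp
```

For example, `is_listed_as_affected("7.2.8", 2)` is true under this table, while `is_listed_as_affected("7.2.8", 4)` is not.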
    
    Root Cause
    The issue was introduced as a side effect of APAR IJ36577,
    "HA FILESYSTEMS MAY BE DUALLY MOUNTED EVEN WITH MOUNTGUARD=YES".
    
    That APAR added enhanced mount validation logic, including a
    stricter node down condition.
    This inadvertently prevented logredo from being triggered during
    recovery when all nodes were up, even though the filesystem was
    not mounted anywhere.
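
The difference between the two conditions can be modeled as a pair of predicates. This is a simplified sketch of the decision as described in this APAR, not the actual PowerHA script logic: `ClusterState` and both function names are hypothetical, and the real checks in cl_activate_fs.sh are assumed to be more involved.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ClusterState:
    """Hypothetical snapshot of the cluster for one filesystem."""
    nodes_up: Dict[str, bool]                             # node name -> node is active
    mounted_on: Set[str] = field(default_factory=set)     # nodes with the filesystem mounted


def _mounted_on_active_node(state: ClusterState) -> bool:
    return any(state.nodes_up.get(node, False) for node in state.mounted_on)


def replay_needed_with_node_down_check(state: ClusterState, unclean: bool) -> bool:
    """Model of the flawed condition: also requires at least one node to be down."""
    any_node_down = not all(state.nodes_up.values())
    return unclean and any_node_down and not _mounted_on_active_node(state)


def replay_needed_mount_check_only(state: ClusterState, unclean: bool) -> bool:
    """Model of the corrected condition: only checks that no active node has it mounted."""
    return unclean and not _mounted_on_active_node(state)
```

With all nodes up and the filesystem mounted nowhere after an unclean unmount, the first predicate returns False (log replay is skipped) while the second returns True.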
    

Problem conclusion

  • APAR IJ57270 introduces the following corrections:
    
    1. Removal of the "Node Down" Dependency
    The recovery logic has been modified to eliminate the
    requirement that at least one node must be down to trigger
    logredo.
    The script now correctly checks only whether the filesystem is
    not mounted on any active node.
    
    2. Guaranteed logredo Execution
    The updated logic ensures that logredo will run whenever needed
    after an unclean unmount, regardless of overall cluster state.
    
    3. Update to cl_activate_fs.sh
    Enhancements were made to ensure that fsck is invoked with the
    correct logredo option, guaranteeing:
    Replay of pending journal operations
    Prevention of premature jfslog discard
    Maximum filesystem consistency during mount
    
    4. Safe Re-Execution of logredo
    The updated logic permits safe rerunning of logredo, providing
    added assurance during recovery operations.
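
Why the ordering matters can be illustrated with a toy model. The sketch below is not JFS2: `ToyFilesystem`, its methods, and `recover` are hypothetical names, and the only behaviors modeled are the ones this APAR describes, namely that a log replay applies pending journal entries, that a consistency check without prior replay discards them, and that replaying twice is harmless.

```python
class ToyFilesystem:
    """Toy model: metadata plus a journal of updates not yet applied."""

    def __init__(self):
        self.metadata = {}
        self.journal = []

    def log_update(self, key, value):
        # Record an update in the journal without applying it yet.
        self.journal.append((key, value))

    def logredo(self):
        # Apply every pending journal entry, then empty the journal.
        # A second call finds an empty journal and changes nothing.
        for key, value in self.journal:
            self.metadata[key] = value
        self.journal.clear()

    def fsck(self):
        # Reach a consistent state by dropping whatever is still pending.
        # Returns the number of journal entries that were discarded.
        discarded = len(self.journal)
        self.journal.clear()
        return discarded


def recover(fs, replay_log):
    """Run the recovery sequence; returns how many journal entries were lost."""
    if replay_log:
        fs.logredo()
        fs.logredo()  # re-running is safe in this model
    return fs.fsck()
```

In this model, `recover(fs, replay_log=True)` loses nothing, while `recover(fs, replay_log=False)` discards every pending entry without reporting an error, which mirrors the undetected data loss described above.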
    
    Impact and Recommendation
    This APAR is classified as critical because skipping logredo can
    lead to:
    Incomplete journal replay
    Metadata inconsistencies
    Filesystem repair loops
    Unexpected fsck behaviors during resource group activation
    
    IBM strongly recommends applying APAR IJ57270 on all affected
    PowerHA environments to ensure predictable and safe filesystem
    recovery.
    
    Additional Notes
    
    This issue occurs only when recovering filesystems protected by
    MOUNTGUARD following an unclean unmount event.
    The APAR does not affect standard mounts or clean unmount
    operations.
    Customers using automated failover or resource group activation
    paths should apply the fix proactively to avoid these filesystem
    integrity risks.
    

Temporary fix

  •   *********
      * HIPER *
      *********
    

Comments

APAR Information

  • APAR number

    IJ57339

  • Reported component name

    POWERHA SYSMIR

  • Reported component ID

    5765H3900

  • Reported release

    721

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2026-02-10

  • Closed date

    2026-03-03

  • Last modified date

    2026-03-03

  • APAR is sysrouted FROM one or more of the following:

    IJ57270

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    POWERHA SYSMIR

  • Fixed component ID

    5765H3900

Document Information

Modified date:
03 March 2026