IBM Support

IJ27285: AFM: LS HUNG FOR SOME "ACTIVE" FILESETS AFTER HOME RECONNECT

APAR status

  • Closed as program error.

Error description

  • After an AFM home reconnect, some filesets at the AFM cache
    are left in a bad state that can cause access to hang,
    although mmafmctl still reports them as Active.
    
    Reported in:
    Spectrum Scale 5.0.4.3
    
    Known Impact:
    Fileset access hangs; fileset status is reported incorrectly.
    
    Verification steps:
    
    (1) mmafmctl shows fileset1 as Active
    
    # mmafmctl gpfs1 getstate
    Fileset Name  Fileset Target          Cache State  Gateway Node     Queue Length  Queue numExec
    ------------  --------------          -----------  ------------     ------------  -------------
    fileset1      gpfs:///gpfs1/fileset1  Active       afmgw1.gpfs.net  0             110016
    
    (2) On gateway node afmgw1, run "mmfsadm dump afm" as shown
    below. Although the fileset status shows Normal, Mounted,
    and Active, the "CTL -1" value indicates the issue that is
    fixed by this APAR.
    
    # mmfsadm dump afm fset fileset1
    
    ............
    Fileset: fileset1 655 (AFM)
      mode: independent-writer queue: Normal   MDS: <c0n4> QMem 0 CTL -1
      home:  homeServer:  proto: gpfs port: 0   lastCmd: 16
      handler: Mounted Dirty       refCount: 1
      queueTransfer: state: Idle senderVerified: 0 receiverVerified: 1 terminate: 0 psnapWait: 0
      remoteAttrs: AsyncLookups 0 tsfindinode: success 0 failed 0 totalTime 0.0 avgTime 0.000000 maxTime 0.0
      queue: delay 15 QLen 0+0 flushThds 0 maxFlushThds 4 numExec 108273 qfs 0 iwo 0 err 0
      handlerCreateTime : 2020-08-09_18:04:32.462+1000 numCreateSnaps : 0 InflightAsyncLookups 0
      lastReplayTime : 2020-08-12_09:04:22.462+1000 lastSyncTime : 2020-08-12_09:04:22.462+1000
      i/o: readBuf: 33554432 writeBuf: 2097152 sparseReadThresh: 134217728  pReadThreads 1
      i/o: pReadChunkSize 134217728 pReadThresh: 0 >> Disabled << pWriteThresh: 0 >> Disabled <<
      i/o: prefetchThresh 0 (Prefetch)
      iw: afmIwTakeoverTime 0
      Priority Queue:  Empty (state: Active)
      Normal Queue:  Empty (state: Active)
    
    
    Recovery action:
    Restart the fileset (AFM below is the file system device name):
    # mmafmctl AFM stop -j fileset1
    # mmafmctl AFM start -j fileset1
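
The verification and recovery steps above can be sketched as a small shell filter. This is a minimal illustration, not an IBM-provided tool: the function names stuck_filesets and print_restart_commands are hypothetical, the filter only looks for the literal "CTL -1" marker shown in this APAR, and it reads saved "mmfsadm dump afm" text on standard input rather than calling mmfsadm itself. The restart commands are printed for review, not run.

```shell
#!/bin/sh
# Hypothetical helper: list filesets whose dump entry contains "CTL -1".
# Reads "mmfsadm dump afm" text on stdin; works whether or not the
# "CTL -1" token sits on the same line as the "mode:" field.
stuck_filesets() {
    awk '
        # Remember the fileset that the following lines belong to.
        $1 == "Fileset:" { fset = $2 }
        {
            # Look for the two adjacent tokens "CTL" and "-1".
            for (i = 1; i < NF; i++)
                if ($i == "CTL" && $(i + 1) == "-1" && fset != "")
                    print fset
        }
    '
}

# Hypothetical helper: print (do not run) the stop/start pair for each
# fileset name read on stdin. $1 is the file system device name.
print_restart_commands() {
    while read -r fset; do
        echo "mmafmctl $1 stop -j $fset"
        echo "mmafmctl $1 start -j $fset"
    done
}
```

On the gateway node, one possible use is "mmfsadm dump afm fset fileset1 | stuck_filesets | print_restart_commands gpfs1", where gpfs1 is the device name from the getstate example. Review the printed commands before running them.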
    

Local fix

Problem summary

  • The remote mount is not responsive, so
    the control file could not be set
    (CTL -1). As a result, lookups return
    E_STALE and are requeued with
    E_RESTART every time.
    

Problem conclusion

  • Benefits of the solution:
    The fileset is not mounted until the control file is found.
    
    Work around:
    N/A
    
    Problem trigger:
    An ls command is run on the fileset root.
    
    Symptom: The ls command hangs and never returns.
    
    Platforms affected: All OS environments
    
    Functional Area affected: AFM
    
    Customer Impact: Suggested
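
Because the trigger is a plain ls on the fileset root, a check for the symptom is safer with a time limit than with a bare ls. The sketch below is an assumption-laden illustration, not part of the fix: the function name probe is hypothetical, it relies on the GNU coreutils timeout command (exit status 124 on expiry), and a process blocked inside the file system may not react to the termination signal, in which case the probe itself can stall.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a time limit and report the outcome.
# $1 = limit in seconds, remaining arguments = command to run.
probe() {
    limit=$1
    shift
    # GNU timeout exits with status 124 when the limit expires.
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out"
    elif [ "$status" -eq 0 ]; then
        echo "returned"
    else
        echo "failed ($status)"
    fi
}
```

For example, "probe 10 ls /path/to/fileset/root" (the path is a placeholder) prints "timed out" if ls does not return within 10 seconds.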
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ27285

  • Reported component name

    SPEC SCALE DME

  • Reported component ID

    5737F34AP

  • Reported release

    504

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-08-26

  • Closed date

    2020-08-28

  • Last modified date

    2020-08-28

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IJ27810

Fix information

  • Fixed component name

    SPEC SCALE DME

  • Fixed component ID

    5737F34AP

Applicable component levels


Document Information

Modified date:
10 September 2020