APAR status
Closed as program error.
Error description
After AFM Home reconnects, some filesets at the AFM Cache are left in a bad state that can cause access to hang, although mmafmctl still reports them as Active.

Reported in: Spectrum Scale 5.0.4.3

Known Impact: Fileset access hung, wrong status report.

Verification steps:

(1) mmafmctl shows that fileset1 is Active:

    # mmafmctl gpfs1 getstate
    Fileset Name  Fileset Target          Cache State  Gateway Node     Queue Length  Queue numExec
    ------------  --------------          -----------  ------------     ------------  -------------
    fileset1      gpfs:///gpfs1/fileset1  Active       afmgw1.gpfs.net  0             110016

(2) On gateway node afmgw1, run "mmfsadm dump afm" as shown below. Although the fileset status shows Normal, Mounted and Active, the "CTL -1" value indicates the issue that is fixed by this APAR.

    # mmfsadm dump afm fset fileset1
    ............
    Fileset: fileset1 655 (AFM) mode: independent-writer queue: Normal MDS: <c0n4> QMem 0 CTL -1
      home: homeServer: proto: gpfs port: 0 lastCmd: 16
      handler: Mounted Dirty refCount: 1
      queueTransfer: state: Idle senderVerified: 0 receiverVerified: 1 terminate: 0 psnapWait: 0
      remoteAttrs: AsyncLookups 0 tsfindinode: success 0 failed 0 totalTime 0.0 avgTime 0.000000 maxTime 0.0
      queue: delay 15 QLen 0+0 flushThds 0 maxFlushThds 4 numExec 108273 qfs 0 iwo 0 err 0
      handlerCreateTime : 2020-08-09_18:04:32.462+1000 numCreateSnaps : 0 InflightAsyncLookups 0
      lastReplayTime : 2020-08-12_09:04:22.462+1000
      lastSyncTime : 2020-08-12_09:04:22.462+1000
      i/o: readBuf: 33554432 writeBuf: 2097152 sparseReadThresh: 134217728 pReadThreads 1
      i/o: pReadChunkSize 134217728 pReadThresh: 0 >> Disabled << pWriteThresh: 0 >> Disabled <<
      i/o: prefetchThresh 0 (Prefetch)
      iw: afmIwTakeoverTime 0
      Priority Queue: Empty (state: Active)
      Normal Queue: Empty (state: Active)

Recovery action: Restart the fileset by:

    # mmafmctl AFM stop -j fileset1
    # mmafmctl AFM start -j fileset1
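The check in verification step (2) could be scripted when many filesets need to be inspected. The following is a minimal Python sketch, not an IBM-provided tool. It assumes the dump text contains "Fileset: <name>" followed later by a "CTL <value>" field, as in the output quoted in this APAR. The function name filesets_with_bad_ctl and the SAMPLE string are hypothetical and used only for illustration, and the layout of the dump on other releases may differ.

```python
import re

# Matches either the start of a fileset section ("Fileset: <name>")
# or a CTL field ("CTL <integer>"). Layout assumed from the APAR text.
_TOKEN = re.compile(r"Fileset:\s+(?P<name>\S+)|\bCTL\s+(?P<ctl>-?\d+)")

# Excerpt of the dump quoted in this APAR, used as sample input.
SAMPLE = (
    "Fileset: fileset1 655 (AFM) mode: independent-writer "
    "queue: Normal MDS: <c0n4> QMem 0 CTL -1\n"
    "  home: homeServer: proto: gpfs port: 0 lastCmd: 16 "
    "handler: Mounted Dirty refCount: 1\n"
)


def filesets_with_bad_ctl(dump_text):
    """Return names of filesets whose dump section reports CTL -1.

    Only the value -1 is flagged, because that is the only value this
    APAR describes; what other values mean is not covered here.
    """
    bad = []
    current = None  # name of the fileset section being scanned
    for match in _TOKEN.finditer(dump_text):
        if match.group("name") is not None:
            current = match.group("name")
        elif current is not None and int(match.group("ctl")) == -1:
            if current not in bad:
                bad.append(current)
    return bad
```

In practice, the output of "mmfsadm dump afm" captured on the gateway node would be passed in as dump_text, and any fileset names returned would be candidates for the stop/start recovery action described above.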
Local fix
Problem summary
The remote mount is unresponsive, so the attempt to set the control file fails (-1). As a result, E_STALE is returned and the lookup is requeued with E_RESTART every time.
Problem conclusion
Benefits of the solution: Don't mount the fileset until the control file is found.
Work around: N/A
Problem trigger: ls command is executed on fileset root.
Symptom: ls command hung and failed to return.
Platforms affected: All OS environments
Functional Area affected: AFM
Customer Impact: Suggested
Temporary fix
Comments
APAR Information
APAR number
IJ27285
Reported component name
SPEC SCALE DME
Reported component ID
5737F34AP
Reported release
504
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-08-26
Closed date
2020-08-28
Last modified date
2020-08-28
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
IJ27810
Fix information
Fixed component name
SPEC SCALE DME
Fixed component ID
5737F34AP
Applicable component levels
Document Information
Modified date:
10 September 2020