APAR status
Closed as program error.
Error description
Under certain conditions, the mmfsd daemon grows in size to the point where it is no longer able to allocate memory and is forced to shut down. The source of this unchecked growth is the lists of saved unacknowledged RPC replies to other nodes in the cluster. One particular case where this has been seen is an application making fsync() calls at high frequency (more than 200,000 fsyncs in 30 seconds). Each fsync() causes GPFS to send RPCs to every node that holds a token for the file being synced, even if it is only a read-only token. Those nodes each send a reply to that RPC, but the sender of the RPC has no seqno to match to the reply, so the reply is never acknowledged.

While there are mechanisms in place to periodically clean up this list of replies, if the list grows too big too fast it is not kept in check and can continue to grow into the hundreds of millions of entries before mmfsd is finally forced to exit. The messages in /var/adm/ras/mmfs.log.latest preceding the shutdown may look like these:

   [W] ReadMap: Cannot open map file /usr/lpp/mmfs/bin/mmfsd, not enough memory
   [E] processStart: fork: err 12
   [W] ReadMap: Cannot open map file /usr/lpp/mmfs/bin/mmfsd, not enough memory
   [N] Restarting mmsdrserv
   [E] processStart: fork: err 12
   [E] Cannot allocate memory
   [X] The mmfs daemon is shutting down abnormally.
   [N] mmfsd is shutting down.
   [N] Reason for shutdown: LOGSHUTDOWN called

The root cause, the list of unacknowledged replies, can be seen with the 'mmfsadm dump tscomm' command. The entry for one or more connected nodes will have a long list of unacknowledged replies:

   <c0n3> 10.1.1.3/0 (gpfsnode3)
     sndbuf 47520 rcvbuf 4194304
     authEnabled 1 securityEnabled 0 sameSubnet 1
     in_conn 0 need_notify 0 reconnEnabled 1 reconnecting 0 reconnected 0
     reconnCheckdup 0 reconnConnecting 0 resending 0 disconnecting 0
     shutting 0 idleCount 0 reconnects 0
     rdmaConnInProgress 0 rdmaConnDone 0 rdmaVsendEnabled 0
     rdmaVsendOkay 0 rdmaCMEnabled 0 n_rw 0
     handlerCount 1 inboundCount 0 connRetryCount 0 sentBytes 0
     thread 0 sendState initial
     Messages being serviced pool 14:
       msg_id 668409134 thread 7601 age 2.120 fileMsgSyncFile ran into a deleted object
     unacknowledged replies:
       msg_id 690078333 seq 45551 resent 0 msg_type 1 'fileMsgSyncFile'
       msg_id 690078357 seq 45552 resent 0 msg_type 1 'fileMsgSyncFile'
       msg_id 690078366 seq 45553 resent 0 msg_type 1 'fileMsgSyncFile'
       msg_id 690078372 seq 45554 resent 0 msg_type 1 'fileMsgSyncFile'
       msg_id 690078386 seq 45555 resent 0 msg_type 1 'fileMsgSyncFile'
       ...potentially millions of these...

Recovery action: If this growth in the lists of unacknowledged replies, and thus in the mmfsd process, is observed, restart GPFS on either node involved: the node with the long list, or the node those replies were sent to. Either restart clears the list.

Reported in: Spectrum Scale 4.2.3.4 on RHEL 7
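To watch whether the saved-reply lists are growing, the per-reply lines in the dump can be counted over time. A minimal sketch in shell, assuming the line format shown above (the grep pattern matches the "seq ... resent ..." reply lines and may need adjusting for your release):

   # Count saved unacknowledged replies once a minute; a steadily
   # climbing count indicates this problem.
   while sleep 60; do
       printf '%s ' "$(date)"
       /usr/lpp/mmfs/bin/mmfsadm dump tscomm | grep -c 'seq [0-9]* resent'
   done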
Local fix
If possible, identify the source of the VFS calls and reduce their frequency. When the message type is 'fileMsgSyncFile', the RPCs result from fsync() calls. Use a tool such as 'lsof' to determine which processes have files open in the GPFS filesystems, then run 'strace -p <pid>' against those running processes to see whether they are making fsync() or other offending VFS calls; a sketch of this diagnosis follows. Reducing the frequency of the calls that cause these RPC messages to be sent and replied to keeps the list of unacknowledged replies from growing so large and prevents mmfsd from having to exit.
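A minimal sketch of that diagnosis, assuming a GPFS filesystem mounted at /gpfs/fs0 (a placeholder; substitute your own mount point):

   # List processes with files open under the GPFS mount point.
   lsof /gpfs/fs0

   # Trace only the sync-related system calls of one candidate process
   # (replace <pid> with a process ID reported by lsof).
   strace -f -p <pid> -e trace=fsync,fdatasync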
Problem summary
RPCs that are sent to multiple nodes do not carry an ack_seqno. The replies to such RPCs therefore accumulate in the saved-reply list until they are acknowledged by later RPCs that do carry an ack_seqno. If these replies are generated fast enough (more than 65535 within about 5 seconds, roughly 13,000 per second), the 16-bit seqno overflows and wraps around. The saved-reply acknowledgement method and the reply cleanup thread then no longer work properly, the saved-reply list grows larger and larger, and the daemon eventually runs out of memory.
Problem conclusion
Enhance the saved-reply acknowledgement method to handle seqno overflow, i.e. wraparound of the 16-bit sequence number.
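For illustration only (this is not GPFS source code), the failure mode and a wraparound-tolerant, serial-number-style comparison of the kind such a fix requires can be sketched with shell arithmetic:

   # 16-bit seqno wraparound: after 65535 the counter returns to 0.
   old=65530; new=4        # 'new' was issued after 'old' but has wrapped

   # A naive comparison concludes 'new' is older than 'old', so replies
   # saved under seqnos between the two are never acknowledged:
   echo $(( new > old ))                        # prints 0 (wrong)

   # Comparing the 16-bit masked difference instead tolerates the wrap:
   echo $(( ((new - old) & 0xFFFF) < 32768 ))   # prints 1 (new is ahead)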
Temporary fix
Comments
APAR Information
APAR number
IJ05921
Reported component name
SPECTRUM SCALE
Reported component ID
5725Q01AP
Reported release
423
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-04-20
Closed date
2018-05-07
Last modified date
2018-05-07
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
IJ06242
Fix information
Fixed component name
SPECTRUM SCALE
Fixed component ID
5725Q01AP
Applicable component levels
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"STXKQY","label":"IBM Spectrum Scale"},"Component":"","ARM Category":[],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"423","Edition":"","Line of Business":{"code":"LOB26","label":"Storage"}},{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSFKCN","label":"General Parallel File System"},"Component":"","ARM Category":[],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"423","Edition":"","Line of Business":{"code":"","label":""}}]
Document Information
Modified date:
07 May 2018