IBM Support

IJ05921: MMFSD CANNOT ALLOCATE MEMORY DUE TO UNACKNOWLEDGED RPC REPLIES


APAR status

  • Closed as program error.

Error description

  • Under certain conditions, the mmfsd daemon grows in size
    to the point where it can no longer allocate memory and is
    forced to shut down. The source of this unchecked memory
    growth is the lists of saved, unacknowledged RPC replies
    to other nodes in the cluster.
    
    One particular case where this has been seen is an
    application issuing fsync() calls at high frequency
    (>200,000 fsyncs in 30 seconds). Each fsync() causes GPFS
    to send RPCs to every node that holds a token for the file
    being synced, even a read-only token. Those nodes then
    send a reply to each RPC, but the sender of the RPC has no
    seqno to match against the reply, so the reply is never
    acknowledged.
    
    While there are mechanisms in place to periodically clean
    up this list of replies, if the list grows too big, too
    fast, it is not kept in check and can reach hundreds of
    millions of entries before the mmfsd is finally forced to
    exit.
    
    The messages in /var/adm/ras/mmfs.log.latest preceding
    the shutdown may look like these:
    
    [W] ReadMap: Cannot open map file
    /usr/lpp/mmfs/bin/mmfsd, not enough memory
    [E] processStart: fork: err 12
    [W] ReadMap: Cannot open map file
    /usr/lpp/mmfs/bin/mmfsd, not enough memory
    [N] Restarting mmsdrserv
    [E] processStart: fork: err 12
    [E] Cannot allocate memory
    [X] The mmfs daemon is shutting down abnormally.
    [N] mmfsd is shutting down.
    [N] Reason for shutdown: LOGSHUTDOWN called
    
    The root cause, the list of unacknowledged replies, can be
    seen with the 'mmfsadm dump tscomm' command. The entry for
    one or more connected nodes will have a long list of
    unacknowledged replies:
    
      <c0n3> 10.1.1.3/0 (gpfsnode3)
        sndbuf 47520 rcvbuf 4194304 authEnabled 1
    securityEnabled 0 sameSubnet 1
        in_conn 0 need_notify 0 reconnEnabled 1
        reconnecting 0 reconnected 0 reconnCheckdup 0
    reconnConnecting 0 resending 0
        disconnecting 0 shutting 0 idleCount 0 reconnects 0
        rdmaConnInProgress 0 rdmaConnDone 0 rdmaVsendEnabled
    0 rdmaVsendOkay 0 rdmaCMEnabled 0
        n_rw 0 handlerCount 1 inboundCount 0 connRetryCount 0
        sentBytes 0 thread 0 sendState initial
        Messages being serviced pool 14:
          msg_id 668409134  thread 7601  age 2.120
    fileMsgSyncFile
          ran into a deleted object
        unacknowledged replies:
          msg_id 690078333 seq 45551 resent 0 msg_type 1
    'fileMsgSyncFile'
          msg_id 690078357 seq 45552 resent 0 msg_type 1
    'fileMsgSyncFile'
          msg_id 690078366 seq 45553 resent 0 msg_type 1
    'fileMsgSyncFile'
          msg_id 690078372 seq 45554 resent 0 msg_type 1
    'fileMsgSyncFile'
          msg_id 690078386 seq 45555 resent 0 msg_type 1
    'fileMsgSyncFile'
          ...potentially millions of these...
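
    To gauge how large the lists have grown, the dump output
    can be summarized with a short awk filter. This is a
    sketch: it assumes the '<c0nX>' header and
    'unacknowledged replies:' layout shown above, which may
    differ between GPFS releases.

```shell
# Count saved, unacknowledged replies per peer from 'mmfsadm dump tscomm'.
# Assumes one "<c0nX> ..." header line per connection, followed later by
# "unacknowledged replies:" and one "msg_id ... seq ..." line per entry.
mmfsadm dump tscomm | awk '
  /^ *<c[0-9]+n[0-9]+>/ {                 # start of a new connection entry
    if (node != "") print count, node
    node = $1; count = 0; counting = 0
  }
  /unacknowledged replies:/ { counting = 1; next }
  counting && /msg_id .* seq / { count++ }
  END { if (node != "") print count, node }
' | sort -rn | head
```

    Peers whose counts run into the millions identify the
    connections driving the memory growth.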
    
    
    Recovery action:
    
    If this growth in the lists of unacknowledged replies, and
    thus in the mmfsd process, is observed, restart GPFS on
    either end of the affected connection: the node holding
    the long list, or the node the replies were sent to.
    Either restart clears the list.
    
    Reported in:
    Spectrum Scale 4.2.3.4 on RHEL 7
    

Local fix

  • If possible, identify the source of the VFS calls in an
    effort to reduce their frequency. In the case where the
    message is 'fileMsgSyncFile', those result from fsync()
    calls. Use tools like 'lsof' to determine which processes
    have files open in the GPFS filesystems, and then use
    'strace -p <pid>' against those running processes to see
    if they are making the fsync() or other offending VFS
    calls.
    
    By reducing the frequency of the calls that lead to
    these RPC messages being sent and replied to, you will
    prevent the list of unacknowledged replies from growing
    so large, and prevent the mmfsd from having to exit.
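
    For example (a sketch: the mount point /gpfs and PID 12345
    are placeholders for your filesystem and the suspect
    process):

```shell
# List processes with files open under a GPFS mount (substitute your mount
# point for /gpfs):
lsof /gpfs 2>/dev/null | awk 'NR > 1 { print $2, $1 }' | sort -u

# Then count the fsync()/fdatasync() calls one suspect process makes
# over 30 seconds (-c prints a per-syscall summary on exit):
timeout 30 strace -f -p 12345 -e trace=fsync,fdatasync -c
```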
    

Problem summary

  • RPCs that are sent to multiple nodes carry no ack_seqno.
    The replies to such RPCs are saved and are only discarded
    when a later RPC that does carry an ack_seqno acknowledges
    them. If these RPCs arrive fast enough (on the order of
    65535 in 5 seconds), the 16-bit seqno overflows quickly;
    the saved-reply acknowledgement method and the reply
    cleanup thread then stop working properly, and the saved
    reply list grows larger and larger until the daemon runs
    out of memory.
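
    The wraparound failure can be illustrated with 16-bit
    serial-number arithmetic. This is a sketch, not GPFS
    source: a plain '>' comparison misorders seqnos after the
    counter wraps, while an RFC 1982-style serial comparison
    does not.

```shell
# uint16 seqnos wrap at 65536; compare a pre-wrap and a post-wrap value.
a=65530   # an older seqno, sent before the wrap
b=10      # a newer seqno, issued after the counter wrapped

# Naive comparison: b looks older than a, so entries up to a are
# never considered acknowledged.
[ "$b" -gt "$a" ] && echo "naive: b is newer" || echo "naive: b looks older (wrong)"

# Serial-number comparison: b is newer if (b - a) mod 2^16 lies in (0, 2^15).
d=$(( (b - a) & 0xFFFF ))
[ "$d" -gt 0 ] && [ "$d" -lt 32768 ] && echo "serial: b is newer (correct)"
```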
    

Problem conclusion

  • Enhance the saved-reply acknowledgement method to handle
    seqno overflow.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ05921

  • Reported component name

    SPECTRUM SCALE

  • Reported component ID

    5725Q01AP

  • Reported release

    423

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-04-20

  • Closed date

    2018-05-07

  • Last modified date

    2018-05-07

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IJ06242

Fix information

  • Fixed component name

    SPECTRUM SCALE

  • Fixed component ID

    5725Q01AP

Applicable component levels

