IBM Support

IJ50068: RESOLVING NFS-GANESHA CRASH AT _GET_GSH_EXPORT_REF AND DEC_STATE_T_REF

APAR status

  • Closed as program error.

Error description

  • This APAR addresses two NFS-Ganesha issues that can cause the
    daemon to crash. The gdb backtraces of the two crashes follow:
    (gdb) bt
    #0 0x00007fff88239a68 in raise ()
    #1 0x00007fff8881ffb8 in crash_handler (signo=11, info=0x7ffb42abbe48, ctx=0x7ffb42abb0d0)
    #3 0x00007fff888da5f4 in atomic_add_int64_t (augend=0x148, addend=1)
    #4 0x00007fff888da658 in atomic_inc_int64_t (var=0x148)
    #5 0x00007fff888de44c in _get_gsh_export_ref (a_export=0x0)
    #6 0x00007fff8888c6c0 in release_lock_owner (owner=0x7ffef94a1cc0)
    #7 0x00007fff88923e9c in nfs4_op_release_lockowner (op=0x7ffef922be60, data=0x7ffef954d290, resp=0x7ffef8629c30)
    #8 0x00007fff888fb810 in process_one_op (data=0x7ffef954d290, status=0x7ffb42abcdf4)
    #9 0x00007fff888fcc9c in nfs4_Compound (arg=0x7ffef95eec38, req=0x7ffef95ee410, res=0x7ffef8ce4b40)
    #10 0x00007fff88819130 in nfs_rpc_process_request (reqdata=0x7ffef95ee410, retry=false)
    #11 0x00007fff88819864 in nfs_rpc_valid_NFS (req=0x7ffef95ee410)
    #12 0x00007fff88750618 in svc_vc_decode (req=0x7ffef95ee410)
    #13 0x00007fff8874a8f4 in svc_request (xprt=0x7fff30039ca0, xdrs=0x7ffef95eb400)
    #14 0x00007fff887504ac in svc_vc_recv (xprt=0x7fff30039ca0)
    #15 0x00007fff8874a82c in svc_rqst_xprt_task_recv (wpe=0x7fff30039ed8)
    #16 0x00007fff8874b858 in svc_rqst_epoll_loop (wpe=0x10041cc5cb0)
    #17 0x00007fff8875b22c in work_pool_thread (arg=0x7ffdcd1047d0)
    #18 0x00007fff88229678 in start_thread ()
    #19 0x00007fff880d8938 in clone ()
    
    or
    
    (gdb) bt
    #0 0x00007f96f58d9b8f in raise ()
    #1 0x00007f96f75c6633 in crash_handler (signo=11, info=0x7f96ad9fc9b0, ctx=0x7f96ad9fc880)
    #3 dec_nfs4_state_ref (state=0x7f9640465440)
    #4 0x00007f96f76762f9 in dec_state_t_ref (state=0x7f9640465440)
    #5 0x00007f96f767640c in nfs4_op_free_stateid (op=0x7f8dec12fba0, data=0x7f8dec1992b0, resp=0x7f8dec04ce70)
    #6 0x00007f96f766dbae in process_one_op (data=0x7f8dec1992b0, status=0x7f96ad9fe128)
    #7 0x00007f96f766ee80 in nfs4_Compound (arg=0x7f8dec110ab8, req=0x7f8dec110290, res=0x7f8dec5b7db0)
    #8 0x00007f96f75c17db in nfs_rpc_process_request (reqdata=0x7f8dec110290, retry=false)
    #9 0x00007f96f75c1cf1 in nfs_rpc_valid_NFS (req=0x7f8dec110290)
    #10 0x00007f96f733edfd in svc_vc_decode (req=0x7f8dec110290)
    #11 0x00007f96f733ac61 in svc_request (xprt=0x7f95d00c4a60, xdrs=0x7f8dec18dd00)
    #12 0x00007f96f733ed06 in svc_vc_recv (xprt=0x7f95d00c4a60)
    #13 0x00007f96f733abe1 in svc_rqst_xprt_task_recv (wpe=0x7f95d00c4c98)
    #14 0x00007f96f73462f6 in work_pool_thread (arg=0x7f8ddc0cc2f0)
    #15 0x00007f96f58cf1ca in start_thread ()
    #16 0x00007f96f5119e73 in clone ()
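
When triaging a core dump, it can help to check mechanically whether a backtrace resembles either of the two crashes above. The following is a minimal sketch and not an IBM-provided tool: the names `FRAME_RE`, `SIGNATURES`, `frame_functions`, and `matches_ij50068`, and the matching rule itself, are assumptions made for illustration and are derived only from the two backtraces in this APAR. A match suggests, but does not prove, that a crash is the one described here.

```python
import re

# Matches a gdb frame such as
#   "#5 0x00007fff888de44c in _get_gsh_export_ref (a_export=0x0)"
# and also a frame printed without an address, such as
#   "#3 dec_nfs4_state_ref (state=0x7f9640465440)".
# Whitespace (including a wrapped line) is tolerated before the "(".
FRAME_RE = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_]\w*)\s*\("
)

# Hypothetical signatures taken from this APAR's backtraces:
# (function on the faulting path, NFSv4 operation handler that called it).
SIGNATURES = (
    ("_get_gsh_export_ref", "nfs4_op_release_lockowner"),
    ("dec_state_t_ref", "nfs4_op_free_stateid"),
)


def frame_functions(backtrace: str) -> list[str]:
    """Return the function names in frame order, innermost frame first."""
    return [m.group(2) for m in FRAME_RE.finditer(backtrace)]


def matches_ij50068(backtrace: str) -> bool:
    """Return True if crash_handler is present and one signature pair
    appears with the faulting function below its operation handler."""
    funcs = frame_functions(backtrace)
    if "crash_handler" not in funcs:
        return False
    for inner, outer in SIGNATURES:
        if inner in funcs and outer in funcs:
            if funcs.index(inner) < funcs.index(outer):
                return True
    return False
```

To use it, pass the text printed by the gdb `bt` command as a single string to `matches_ij50068`.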
    

Local fix

Problem summary

  • This APAR addresses two NFS-Ganesha issues, each of which causes
    the daemon to crash with a segmentation fault (signal 11).
    In the first, _get_gsh_export_ref is called with a NULL export
    (a_export=0x0) from release_lock_owner while
    nfs4_op_release_lockowner is being processed. In the second,
    the fault occurs in dec_nfs4_state_ref, called through
    dec_state_t_ref from nfs4_op_free_stateid.
    The full gdb backtraces for both crashes are listed under
    Error description.
    

Problem conclusion

  • This problem is fixed in 5.1.9.3.
    To see all Spectrum Scale APARs and their respective fix
    solutions, refer to:
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    Benefits of the solution:
    The code has been modified to address the crashes.
    
    Work Around:
    None
    
    Problem trigger:
    The crash can occur when an NFSv4 client accesses and deletes
    the same file at the same time from different processes or
    threads, which exposes a race condition in NFS-Ganesha.
    
    Symptom:
    Abend/Crash
    
    Platforms affected:
    Linux Only
    
    Functional Area affected:
    NFS-Ganesha crash followed by CES-IP failover.
    
    Customer Impact:
    High Importance
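
The access pattern described under Problem trigger can be sketched as follows. This is a minimal illustration of concurrent access and deletion of one file, not a confirmed reproducer for this APAR. The function names, the file name `ij50068_race.tmp`, and the `directory` argument are hypothetical, and the use of a byte-range lock is an assumption prompted by the lock-owner and stateid functions in the backtraces. The source states only that the file is accessed and deleted simultaneously.

```python
import fcntl
import os
import threading
import time


def access_loop(path, stop, counts):
    """Repeatedly open, lock, read, and close the file until told to stop."""
    while not stop.is_set():
        try:
            with open(path, "rb") as f:
                # Assumption: a shared byte-range lock, so that on an NFSv4
                # mount the client would normally hold lock state on the server.
                fcntl.lockf(f, fcntl.LOCK_SH)
                f.read()
                fcntl.lockf(f, fcntl.LOCK_UN)
            counts["access"] += 1
        except OSError:
            # The file was deleted (or its handle went stale) mid-access.
            counts["missed"] += 1


def delete_loop(path, stop, counts):
    """Repeatedly create and then delete the same file until told to stop."""
    while not stop.is_set():
        try:
            with open(path, "wb") as f:
                f.write(b"x" * 4096)
            os.unlink(path)
            counts["delete"] += 1
        except OSError:
            pass


def run(directory, seconds=5.0):
    """Run both loops concurrently against one file in `directory`."""
    path = os.path.join(directory, "ij50068_race.tmp")
    stop = threading.Event()
    # Each key is updated by exactly one thread, so no lock is needed.
    counts = {"access": 0, "missed": 0, "delete": 0}
    threads = [
        threading.Thread(target=access_loop, args=(path, stop, counts)),
        threading.Thread(target=delete_loop, args=(path, stop, counts)),
    ]
    for t in threads:
        t.start()
    time.sleep(seconds)
    stop.set()
    for t in threads:
        t.join()
    if os.path.exists(path):
        os.unlink(path)
    return counts
```

Calling `run(directory)` with a directory on an NFSv4 mount of a test system exercises the pattern for five seconds and returns the three counters. The sketch uses threads in one process; the same loops could equally run in separate processes.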
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ50068

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    519

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2024-02-15

  • Closed date

    2024-02-22

  • Last modified date

    2024-02-22

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels

Document Information

Modified date:
22 February 2024