IBM Support

IJ43262: SAVEBASE CAN HANG WHEN AIX_STDBUFSZ IS SET APPLIES TO AIX 7200-05

APAR status

  • Closed as program error.

Error description

  • With large ODM data, the following sequence can hang intermittently:
    export AIX_STDBUFSZ=64K
    savebase
    
    proctree shows that savebase forked a child compress process.
    Both savebase and compress hang in write():
    procstack <savebase_pid>
    6029872: savebase
    0xd01223ac  write(??, ??, ??) + 0x1cc
    0x10001c9c  IPRA.$piped_compress(??, ??) + 0x25c
    0x100029f4  IPRA.$compress_and_write(??) + 0xb4
    0x10000690  main(??, ??) + 0x150
    
    procstack <compress_pid>
    6554324: /usr/bin/compress
    0xd01223ac  write(??, ??, ??) + 0x1cc
    0xd0120bbc  _xwrite@AF19_9(??, ??, ??, ??, ??) + 0x5c
    0xd01200c8  _xflsbuf(??) + 0xc8
    0xd01207d8  __flsbuf(??, ??) + 0x98
    0x10000c10  IPRA.$output(??) + 0x330
    0x1000186c  compress() + 0x2ec
    0x10001eb8  do_stdin() + 0x2f8
    0x10002c90  main(??, ??) + 0xc30
    

Local fix

  • Unset AIX_STDBUFSZ before running savebase or any command
    that might call savebase as its final step.
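
    The workaround can also be scoped to a single invocation rather than
    the whole login session. The following is a minimal sketch, not an IBM
    tool: the helper names env_without_stdbufsz and run_savebase are
    hypothetical, and it assumes savebase is on PATH and is run with
    sufficient privileges.

```python
import os
import subprocess


def env_without_stdbufsz(base_env=None):
    """Return a copy of the environment with AIX_STDBUFSZ removed."""
    env = dict(os.environ if base_env is None else base_env)
    # pop() with a default does nothing if the variable is not set.
    env.pop("AIX_STDBUFSZ", None)
    return env


def run_savebase(command=("savebase",)):
    """Run savebase (or a wrapper that ends in savebase) with the
    cleaned environment and return its exit status.

    Assumption: 'savebase' is the command named in this APAR.
    """
    result = subprocess.run(list(command), env=env_without_stdbufsz(),
                            check=False)
    return result.returncode
```

    The caller's own environment is left untouched, so other programs that
    rely on AIX_STDBUFSZ keep their setting.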
    

Problem summary

  • Both savebase and compress can block in write() when the
    pipes between them are full. Because each process is blocked
    writing, neither can read to free space in the pipes, so
    neither makes progress.
    proctree shows that savebase forked a child compress process.
    Both savebase and compress hang in write():
    procstack <savebase_pid>
    6029872: savebase
    0xd01223ac  write(??, ??, ??) + 0x1cc
    0x10001c9c  IPRA.$piped_compress(??, ??) + 0x25c
    0x100029f4  IPRA.$compress_and_write(??) + 0xb4
    0x10000690  main(??, ??) + 0x150
    
    procstack <compress_pid>
    6554324: /usr/bin/compress
    0xd01223ac  write(??, ??, ??) + 0x1cc
    0xd0120bbc  _xwrite@AF19_9(??, ??, ??, ??, ??) + 0x5c
    0xd01200c8  _xflsbuf(??) + 0xc8
    0xd01207d8  __flsbuf(??, ??) + 0x98
    0x10000c10  IPRA.$output(??) + 0x330
    0x1000186c  compress() + 0x2ec
    0x10001eb8  do_stdin() + 0x2f8
    0x10002c90  main(??, ??) + 0xc30
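
    The underlying condition is that a pipe holds only a bounded amount of
    data, and a blocking write() stalls once that limit is reached and
    nobody is reading. The sketch below illustrates the limit on a generic
    POSIX system; it is not savebase code, and fill_pipe_nonblocking is a
    hypothetical name. It uses a non-blocking descriptor so that the point
    at which a blocking write would hang shows up as an error instead.

```python
import os


def fill_pipe_nonblocking(chunk_size=4096):
    """Write to a pipe that nobody reads until the kernel refuses more.

    Returns the number of bytes accepted before the pipe was full.
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, b"x" * chunk_size)
            except BlockingIOError:
                # A blocking descriptor would stall here indefinitely,
                # which is the state both stacks above are stuck in.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

    The value returned is platform dependent, so no particular capacity is
    assumed here.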
    

Problem conclusion

  • Change the code to use non-blocking writes together with poll(),
    so that it can both read and write as poll events allow.
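
    The general technique can be sketched as follows. This is an
    illustration of poll-driven, non-blocking I/O around a child filter
    under stated assumptions, not the actual savebase change:
    pump_through_filter and its parameters are hypothetical names, and any
    filter that reads stdin and writes stdout can stand in for compress.

```python
import os
import select
import subprocess


def pump_through_filter(data, argv, chunk=65536):
    """Feed data to a child filter and collect its output without deadlock."""
    # bufsize=0 gives raw pipe objects, so only os.read/os.write touch them.
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, bufsize=0)
    in_fd, out_fd = proc.stdin.fileno(), proc.stdout.fileno()
    os.set_blocking(in_fd, False)
    os.set_blocking(out_fd, False)

    poller = select.poll()
    poller.register(out_fd, select.POLLIN)
    stdin_open = len(data) > 0
    if stdin_open:
        poller.register(in_fd, select.POLLOUT)
    else:
        proc.stdin.close()

    bad = select.POLLERR | select.POLLHUP | select.POLLNVAL
    view, sent, output = memoryview(data), 0, bytearray()
    stdout_open = True
    while stdout_open:
        for fd, event in poller.poll():
            if fd == in_fd:
                if event & select.POLLOUT:
                    try:
                        # Write only what the pipe accepts right now.
                        sent += os.write(in_fd, view[sent:sent + chunk])
                    except BlockingIOError:
                        pass
                    except BrokenPipeError:
                        sent = len(view)  # child stopped reading
                if sent >= len(view) or event & bad:
                    poller.unregister(in_fd)
                    proc.stdin.close()  # signals end of input to the child
                    stdin_open = False
            elif event & (select.POLLIN | select.POLLHUP):
                try:
                    block = os.read(out_fd, chunk)
                except BlockingIOError:
                    continue
                if block:
                    output += block
                else:
                    # Empty read: the child closed its output.
                    poller.unregister(out_fd)
                    stdout_open = False
            else:
                poller.unregister(out_fd)
                stdout_open = False
    if stdin_open:
        proc.stdin.close()
    proc.stdout.close()
    return proc.wait(), bytes(output)
```

    Because the parent drains the child's output whenever it is readable,
    the child can always finish its own write() and return to reading, so
    the two processes do not end up blocked on each other.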
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ43262

  • Reported component name

    AIX V7.2

  • Reported component ID

    5765CD200

  • Reported release

    720

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2022-09-26

  • Closed date

    2022-09-26

  • Last modified date

    2023-03-13

  • APAR is sysrouted FROM one or more of the following:

    IJ38959

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    AIX V7.2

  • Fixed component ID

    5765CD200

Applicable component levels


Document Information

Modified date:
14 March 2023