APAR status
Closed as program error.
Error description
With large ODM data, the following sequence might hang intermittently:

    export AIX_STDBUFSZ=64K
    savebase

proctree shows that savebase forked a child compress process. Both savebase and compress hang:

procstack <savebase_pid>
6029872: savebase
0xd01223ac  write(??, ??, ??) + 0x1cc
0x10001c9c  IPRA.$piped_compress(??, ??) + 0x25c
0x100029f4  IPRA.$compress_and_write(??) + 0xb4
0x10000690  main(??, ??) + 0x150

procstack <compress_pid>
6554324: /usr/bin/compress
0xd01223ac  write(??, ??, ??) + 0x1cc
0xd0120bbc  _xwrite@AF19_9(??, ??, ??, ??, ??) + 0x5c
0xd01200c8  _xflsbuf(??) + 0xc8
0xd01207d8  __flsbuf(??, ??) + 0x98
0x10000c10  IPRA.$output(??) + 0x330
0x1000186c  compress() + 0x2ec
0x10001eb8  do_stdin() + 0x2f8
0x10002c90  main(??, ??) + 0xc30
Local fix
Unset AIX_STDBUFSZ before calling savebase, or before calling any command that might invoke savebase when it finishes.
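For callers that launch savebase from a script, the local fix can be sketched as a small wrapper that drops AIX_STDBUFSZ from the child environment only. This is an illustrative sketch, not part of the APAR: the helper names env_without_stdbufsz and run_savebase are hypothetical, and savebase is invoked with no arguments because the APAR does not specify any.

```python
import os
import subprocess


def env_without_stdbufsz(env=None):
    """Return a copy of the environment with AIX_STDBUFSZ removed."""
    base = os.environ if env is None else env
    return {key: value for key, value in base.items() if key != "AIX_STDBUFSZ"}


def run_savebase(argv=("savebase",)):
    """Run savebase (hypothetical wrapper) without AIX_STDBUFSZ set.

    The same wrapper could be used for any command that ends by calling
    savebase, since child processes inherit the scrubbed environment.
    """
    result = subprocess.run(list(argv), env=env_without_stdbufsz(), check=False)
    return result.returncode
```

The calling shell keeps its own AIX_STDBUFSZ setting; only the launched command and its children run without it.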
Problem summary
Both savebase and its child compress process can become stuck in write() when the pipes between them are full. Because each process is blocked in a write, neither one can read to free space in the pipes, so no overall progress is made. The proctree and procstack symptoms are those listed under Error description.
Problem conclusion
The code was changed to use non-blocking writes together with poll, so that it can both read and write according to the poll events.
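The general technique named above (non-blocking descriptors driven by poll) can be sketched as follows. This is not the savebase source, which is not shown in the APAR; the function name pump_through_filter, its parameters, and the 64 KiB chunk size are assumptions made for illustration. The parent writes only when the child's input pipe has room and drains the child's output whenever data is available, so neither side stays blocked in write.

```python
import os
import select
import subprocess


def pump_through_filter(argv, data, chunk=65536):
    """Send data through a filter process and return (output, exit_status)."""
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, bufsize=0
    )
    in_fd = proc.stdin.fileno()    # parent writes here
    out_fd = proc.stdout.fileno()  # parent reads here
    os.set_blocking(in_fd, False)
    os.set_blocking(out_fd, False)

    poller = select.poll()
    poller.register(out_fd, select.POLLIN)
    poller.register(in_fd, select.POLLOUT)

    view = memoryview(data)
    sent = 0
    out = bytearray()
    writing = True
    reading = True

    while reading:
        # Once all input is delivered, close the pipe so the child sees EOF.
        if writing and sent >= len(view):
            poller.unregister(in_fd)
            proc.stdin.close()
            writing = False

        for fd, _event in poller.poll():
            if fd == out_fd:
                try:
                    block = os.read(out_fd, chunk)
                except BlockingIOError:
                    continue
                if block:
                    out += block
                else:
                    reading = False  # EOF: child closed its output
            elif writing:
                try:
                    # A non-blocking write may be partial; track the offset.
                    sent += os.write(in_fd, view[sent:sent + chunk])
                except BlockingIOError:
                    pass  # pipe is full; poll again
                except BrokenPipeError:
                    sent = len(view)  # child stopped reading; stop writing

    proc.stdout.close()
    if writing:
        proc.stdin.close()
    return bytes(out), proc.wait()
```

A call such as pump_through_filter(["/usr/bin/compress"], payload) would feed payload to compress and collect the compressed stream, whatever the size of payload relative to the pipe buffers.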
Temporary fix
Comments
APAR Information
APAR number
IJ43262
Reported component name
AIX V7.2
Reported component ID
5765CD200
Reported release
720
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-09-26
Closed date
2022-09-26
Last modified date
2023-03-13
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
AIX V7.2
Fixed component ID
5765CD200
Applicable component levels
Document Information
Modified date:
14 March 2023