A fix is available
APAR status
Closed as program error.
Error description
With a stack like the following:
(4)> f pvthread+0E4900
STACK:
[000235E4]vm_gethx+000024 (00000000041DE424 [??])
[000899B4]v_sidpno+000034 (??, ??, ??, ??)
[00089654]v_patch_caller_bla+000094 (??)
[000892D8].patch_caller_glue+00005C ()
[041DE424]efc_dump+000104 (??, ??, ??)
[04246A54]efsc_dump_start+000034 (??)
[04247754]efsc_dump+0001F4 (??, ??, ??, ??, ??, ??)
[00247EA0]devdump+000120 (??, ??, ??, ??, ??, ??)
[00014F50].kernel_add_gate_cstack+000030 ()
[04309780]scsidisk_dump+000220 (??, ??, ??, ??, ??, ??)
[00247EA0]devdump+000120 (??, ??, ??, ??, ??, ??)
[00014F50].kernel_add_gate_cstack+000030 ()
[F1000000C02E21E8]base_dump+0001A0 (F1000A0027926400, F1000A0027E29000)
[F1000000C02C2308]PowerPlatformBottomDispatch+00019C (F1000A0027926000, F1000A0101022F00)
[F1000000C02C25DC]PowerSyncIoBottomDispatch+0000A0 (F1000A0027926000, F1000A0027E29000, F1000A0101022F00)
[F1000000C02C2740]PowerBottomDispatchPirp+000094 (F1000A0027926800, F1000A0027E29000)
[F1000000C02C34D8]PowerDispatchX+000454 (F1000A00278EC500, F1000A0027E29000, 0000000000000000)
[00014D70].hkey_legacy_gate+00004C ()
[F1000000C03626E4]MpxDispatchDown+000110 (F1000A0027E29610)
[F1000000C0362ACC]MpxDispatchGuts+00013C (F1000A0027E29610)
[F1000000C0363500]MpxDispatch+0009B8 (F1000A00278E5A00, F1000A0027E29000)
[F1000000C02C3380]PowerDispatchX+0002FC (F1000A00278E5A00, F1000A0027E29000, 0000000000000000)
[00014D70].hkey_legacy_gate+00004C ()
[F1000000C037D830]GpxDispatchDown+000030 (F1000A002784DF00, F1000A0027E29000)
[00014D70].hkey_legacy_gate+00004C ()
[F1000000C03A8AC0]VluDispatch+0000DC (F1000A002784DF00, F1000A0027E29000)
[F1000000C037ED24]GpxDispatch+000070 (F1000A00278EB900, F1000A0027E29000)
[F1000000C02C3380]PowerDispatchX+0002FC (F1000A00278EB900, F1000A0027E29000, 0000000000000000)
[00014D70].hkey_legacy_gate+00004C ()
[F1000000C037D830]GpxDispatchDown+000030 (F1000A002784DF80, F1000A0027E29000)
[00014D70].hkey_legacy_gate+00004C ()
[F1000000C0393870]XcryptDispatchGuts+000274 (F1000A002784DF80, F1000A0027E29000, 0000000100000001)
[F1000000C0393AA4]XcryptDispatch+000100 (F1000A002784DF80, F1000A0027E29000)
[F1000000C037ED24]GpxDispatch+000070 (F1000A00278ECA00, F1000A0027E29000)
[F1000000C02C3380]PowerDispatchX+0002FC (F1000A00278ECA00, F1000A0027E29000, 0000000000000000)
[00014D70].hkey_legacy_gate+00004C ()
[F1000000C037ED50]GpxDispatch+00009C (F1000A00278EC400, F1000A0027E29000)
[F1000000C02C3380]PowerDispatchX+0002FC (F1000A00278EC400, F1000A0027E29000, 0000000000000000)
[00014D70].hkey_legacy_gate+00004C ()
[F1000000C03AF278]safe_dump+000278 (8000002600000041, 0000000000000000, 0000000200000002, F1000A0109E312C8, 0000000000000000, 0000000000000000)
[00014D70].hkey_legacy_gate+00004C ()
[00247EA0]devdump+000120 (??, ??, ??, ??, ??, ??)
[00014F50].kernel_add_gate_cstack+000030 ()
[F1000000C0478520]dodiskdump+00005C (0000000000000000)
[F1000000C04788BC]callbackfunc+0000B0 (0000000000000000, 0000000000000000)
[00014D70].hkey_legacy_gate+00004C ()
[002EF848]halt_display_excp+0000A8 (??, ??, ??, ??)
[0013A1C0]rmgr_halt_system_epilog+000060 (??)
[0013AA1C]rmgr_halt_system+0000BC (??, ??)
[003DCCD0]legacy_recovery_manager+000170 (??)
[003DCAF0]recovery_manager+000130 (??, ??, ??)
[00147480]state_save_ret+000578 ()
____ Exception (F000000030081600) ____
iar : 0000000000663BBC msr : 8000000000009032 cr : 22004240 lr : 0000000000663B90
ctr : 0000000000000000 xer : 00000012 mq : 2028D7C0 asr : FFFFFFFFFFFFFFFF
amr : FFFCBF3FFFFFFFFF
r0 : 0000000000000000 r1 : F0000000300812A0 r2 : 0000000002B654A8
r3 : FFFCFF3FFFFFFFFF r4 : 0000000000000002 r5 : 0000000000000003
r6 : F1000E0000183808 r7 : 0000000000000000 r8 : 0000000400000000
r9 : 0000000004800001 r10 : 0000000004800001 r11 : FFFCBF3FFFFFFFFC
r12 : 0000000000663B90 r13 : F1000A0109EC2000 r14 : 0000000000000000
r15 : 0000000000000000 r16 : 0000000000000000 r17 : 0000000000000000
r18 : 0000000000000000 r19 : 0000000000000000 r20 : 0000000000000000
r21 : 0000000000000000 r22 : 00000000F0234CE8 r23 : 0000000000001800
r24 : 0000000000000024 r25 : 00000000F0234CE8 r26 : 0000000000001800
r27 : 00000000F022FBF0 r28 : 000000000000000C r29 : 0000000000001800
r30 : 000000000000000C r31 : F000000030081538
prev 0000000000000000 stackfix 0000000000000000 int_ticks 0000
cfar 000000000059F7F4
kjmpbuf 0000000000000000 excbranch 0000000000000000 no_pfault 00
intpri 0B backt 00 flags 00
hw_fru_id 00000001 hw_cpu_id 00000009
fpscr 0000000000000000 fpscrx 00000000 fpowner 01
fpeu 01 fpinfo 00 alloc F000
o_iar 0000000000663BBC o_toc 0000000002B654A8
o_arg1 FFFCFF3FFFFFFFFF o_vaddr F1000E0000183830
krlockp 0000000000000000 rmgrwa F1000815B012EE20
amrstackhigh F000000030069FF0 amrstacklow F000000030069000
amrstackcur F000000030069FF0 amrstackfix 0000000000000000
kstackhigh 0000000000000000 kstacksize 00000000
frrstart 700DFEED00000000 frrend 700DFEED00000000
frrcur 700DFEED00000000 frrstatic 0000
kjmpfrroff 0000 frrovcnt 0000 frrbarrcnt 0000
frrmask 00 callrmgr 00
Except :
excp_type 0000010E EXCEPT_SKEY
orgea F1000E0000183830 dsisr 0000000000200000 bit set: DSISR_SKEY
vmh 000000000E000510 curea F1000E0000183830 pftyp 4000000000000106
[00663BBC]mapfile+00009C (FFFCFF3FFFFFFFFF, 0000000000000002, 0000000000000003 [??])
[005C2868]_shmat64+0000E8 (0000000C0000000C, 0000000000000000, 0000180000001800)
[005C72F8]shmat+000058 (0000000C0000000C, 0000000000000000, 0000180000001800)
[00003850]ovlya_addr_sc_flih_main+000130 ()
[kdb_get_virtual_memory] no real storage @ 20281FC8
[D053C9A8]D053C9A8 ()
[kdb_read_mem] no real storage @ FFFFFFFFFFF9370
Local fix
N/A
Problem summary
If the dump device is on a disk connected to the FC HBA, writing the dump to disk may sometimes fail because the stack usage limit is overrun. The stack trace indicates that scsidisk_dump consumes about 0.5 KB of stack for a dk_cmd instance, which can be avoided. Other components in the stack with similarly high stack consumption may also need to be fixed to ensure that the dump is written successfully. One such component is recovery_manager, which has considerably higher stack usage.
Problem conclusion
scsidisk_dump has been modified so that it no longer defines an instance of the dk_cmd structure on the stack, which was consuming over 500 bytes. One of the unused static structures from the diskinfo is used instead. This reduces stack consumption by only about 0.5 KB. Other components in the stack with similarly high stack consumption may also need to be fixed to ensure that the dump is written successfully.
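The general technique can be illustrated with a minimal C sketch. This is not the AIX source: struct cmd_sketch, struct diskinfo_sketch, the spare_cmd and spare_in_use fields, and both functions are hypothetical names chosen for illustration, and the structure layout is only assumed to be a little over 500 bytes, as the APAR describes for dk_cmd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a large command structure (just over 500 bytes). */
struct cmd_sketch {
    unsigned char cdb[16];
    unsigned char sense[256];
    unsigned char reserved[240];
    size_t length;
};

/* Hypothetical per-disk information block that already owns a spare command. */
struct diskinfo_sketch {
    struct cmd_sketch spare_cmd;
    int spare_in_use;
};

/* Before: the command is a local variable, so each call grows the
 * stack frame by at least sizeof(struct cmd_sketch). */
size_t dump_write_on_stack(const void *buf, size_t len)
{
    struct cmd_sketch cmd;

    (void)buf;                      /* payload is not used in this sketch */
    memset(&cmd, 0, sizeof cmd);
    cmd.length = len;
    return cmd.length;
}

/* After: the command lives in preallocated per-disk storage, so the
 * frame holds only a pointer to it. */
size_t dump_write_preallocated(struct diskinfo_sketch *disk,
                               const void *buf, size_t len)
{
    struct cmd_sketch *cmd = &disk->spare_cmd;

    (void)buf;                      /* payload is not used in this sketch */
    assert(!disk->spare_in_use);    /* the spare must not be shared */
    disk->spare_in_use = 1;
    memset(cmd, 0, sizeof *cmd);
    cmd->length = len;
    disk->spare_in_use = 0;
    return cmd->length;
}
```

The trade-off is that preallocated storage is shared state, so the caller has to guarantee that nothing else is using the spare structure while the dump path holds it.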
Temporary fix
Comments
APAR Information
APAR number
IV26467
Reported component name
AIX V7.1
Reported component ID
5765H4000
Reported release
710
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Submitted date
2012-08-18
Closed date
2012-08-18
Last modified date
2013-03-26
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
AIX V7.1
Fixed component ID
5765H4000
Applicable component levels
R710 PSY U852825
UP12/12/07 I 1000
PTF to Fileset Mapping
U852825 devices.fcp.disk.rte 7.1.0.20
Document Information
Modified date:
26 March 2013