IBM Support

IV26467: STACK OVERFLOW CRASH - PROMPTED BY SCSIDISK_DUMP ROUTINE APPLIES TO AIX 7100-00

A fix is available


APAR status

  • Closed as program error.

Error description

  • With a stack like the following:
    
    (4)> f
    pvthread+0E4900 STACK:
    [000235E4]vm_gethx+000024 (00000000041DE424 [??])
    [000899B4]v_sidpno+000034 (??, ??, ??, ??)
    [00089654]v_patch_caller_bla+000094 (??)
    [000892D8].patch_caller_glue+00005C ()
    [041DE424]efc_dump+000104 (??, ??, ??)
    [04246A54]efsc_dump_start+000034 (??)
    [04247754]efsc_dump+0001F4 (??, ??, ??, ??, ??, ??)
    [00247EA0]devdump+000120 (??, ??, ??, ??, ??, ??)
    [00014F50].kernel_add_gate_cstack+000030 ()
    [04309780]scsidisk_dump+000220 (??, ??, ??, ??, ??, ??)
    [00247EA0]devdump+000120 (??, ??, ??, ??, ??, ??)
    [00014F50].kernel_add_gate_cstack+000030 ()
    [F1000000C02E21E8]base_dump+0001A0 (F1000A0027926400,
    F1000A0027E29000)
    [F1000000C02C2308]PowerPlatformBottomDispatch+00019C
    (F1000A0027926000, F1000A0101022F00)
    [F1000000C02C25DC]PowerSyncIoBottomDispatch+0000A0
    (F1000A0027926000, F1000A0027E29000,
       F1000A0101022F00)
    [F1000000C02C2740]PowerBottomDispatchPirp+000094
    (F1000A0027926800, F1000A0027E29000)
    [F1000000C02C34D8]PowerDispatchX+000454
    (F1000A00278EC500, F1000A0027E29000,
       0000000000000000)
    [00014D70].hkey_legacy_gate+00004C ()
    [F1000000C03626E4]MpxDispatchDown+000110
    (F1000A0027E29610)
    [F1000000C0362ACC]MpxDispatchGuts+00013C
    (F1000A0027E29610)
    [F1000000C0363500]MpxDispatch+0009B8 (F1000A00278E5A00,
    F1000A0027E29000)
    [F1000000C02C3380]PowerDispatchX+0002FC
    (F1000A00278E5A00, F1000A0027E29000,
       0000000000000000)
    [00014D70].hkey_legacy_gate+00004C ()
    [F1000000C037D830]GpxDispatchDown+000030
    (F1000A002784DF00, F1000A0027E29000)
    [00014D70].hkey_legacy_gate+00004C ()
    [F1000000C03A8AC0]VluDispatch+0000DC (F1000A002784DF00,
    F1000A0027E29000)
    [F1000000C037ED24]GpxDispatch+000070 (F1000A00278EB900,
    F1000A0027E29000)
    [F1000000C02C3380]PowerDispatchX+0002FC
    (F1000A00278EB900, F1000A0027E29000,
       0000000000000000)
    [00014D70].hkey_legacy_gate+00004C ()
    [F1000000C037D830]GpxDispatchDown+000030
    (F1000A002784DF80, F1000A0027E29000)
    [00014D70].hkey_legacy_gate+00004C ()
    [F1000000C0393870]XcryptDispatchGuts+000274
    (F1000A002784DF80, F1000A0027E29000,
       0000000100000001)
    [F1000000C0393AA4]XcryptDispatch+000100
    (F1000A002784DF80, F1000A0027E29000)
    [F1000000C037ED24]GpxDispatch+000070 (F1000A00278ECA00,
    F1000A0027E29000)
    [F1000000C02C3380]PowerDispatchX+0002FC
    (F1000A00278ECA00, F1000A0027E29000,
       0000000000000000)
    [00014D70].hkey_legacy_gate+00004C ()
    [F1000000C037ED50]GpxDispatch+00009C (F1000A00278EC400,
    F1000A0027E29000)
    [F1000000C02C3380]PowerDispatchX+0002FC
    (F1000A00278EC400, F1000A0027E29000,
       0000000000000000)
    [00014D70].hkey_legacy_gate+00004C ()
    [F1000000C03AF278]safe_dump+000278 (8000002600000041,
    0000000000000000,
       0000000200000002, F1000A0109E312C8, 0000000000000000,
    0000000000000000)
    [00014D70].hkey_legacy_gate+00004C ()
    [00247EA0]devdump+000120 (??, ??, ??, ??, ??, ??)
    [00014F50].kernel_add_gate_cstack+000030 ()
    [F1000000C0478520]dodiskdump+00005C (0000000000000000)
    [F1000000C04788BC]callbackfunc+0000B0 (0000000000000000,
    0000000000000000)
    [00014D70].hkey_legacy_gate+00004C ()
    [002EF848]halt_display_excp+0000A8 (??, ??, ??, ??)
    [0013A1C0]rmgr_halt_system_epilog+000060 (??)
    [0013AA1C]rmgr_halt_system+0000BC (??, ??)
    [003DCCD0]legacy_recovery_manager+000170 (??)
    [003DCAF0]recovery_manager+000130 (??, ??, ??)
    [00147480]state_save_ret+000578 ()
    ____ Exception (F000000030081600) ____
    iar   : 0000000000663BBC  msr   : 8000000000009032  cr
    : 22004240
    lr    : 0000000000663B90  ctr   : 0000000000000000  xer
    : 00000012
    mq    : 2028D7C0  asr   : FFFFFFFFFFFFFFFF  amr   :
    FFFCBF3FFFFFFFFF
    r0  : 0000000000000000  r1  : F0000000300812A0  r2  :
    0000000002B654A8
    r3  : FFFCFF3FFFFFFFFF  r4  : 0000000000000002  r5  :
    0000000000000003
    r6  : F1000E0000183808  r7  : 0000000000000000  r8  :
    0000000400000000
    r9  : 0000000004800001  r10 : 0000000004800001  r11 :
    FFFCBF3FFFFFFFFC
    r12 : 0000000000663B90  r13 : F1000A0109EC2000  r14 :
    0000000000000000
    r15 : 0000000000000000  r16 : 0000000000000000  r17 :
    0000000000000000
    r18 : 0000000000000000  r19 : 0000000000000000  r20 :
    0000000000000000
    r21 : 0000000000000000  r22 : 00000000F0234CE8  r23 :
    0000000000001800
    r24 : 0000000000000024  r25 : 00000000F0234CE8  r26 :
    0000000000001800
    r27 : 00000000F022FBF0  r28 : 000000000000000C  r29 :
    0000000000001800
    r30 : 000000000000000C  r31 : F000000030081538
    
    prev      0000000000000000 stackfix  0000000000000000
    int_ticks 0000
    cfar      000000000059F7F4
    kjmpbuf   0000000000000000 excbranch 0000000000000000
    no_pfault 00
    intpri    0B               backt     00
    flags     00
    hw_fru_id 00000001         hw_cpu_id 00000009
    fpscr     0000000000000000 fpscrx    00000000
    fpowner   01
    fpeu      01               fpinfo    00
    alloc     F000
    o_iar     0000000000663BBC o_toc     0000000002B654A8
    o_arg1    FFFCFF3FFFFFFFFF o_vaddr   F1000E0000183830
    krlockp   0000000000000000 rmgrwa    F1000815B012EE20
    amrstackhigh  F000000030069FF0 amrstacklow
    F000000030069000
    amrstackcur   F000000030069FF0 amrstackfix
    0000000000000000
    kstackhigh    0000000000000000 kstacksize    00000000
    frrstart  700DFEED00000000 frrend    700DFEED00000000
    frrcur    700DFEED00000000 frrstatic 0000 kjmpfrroff 0000
    frrovcnt  0000 frrbarrcnt 0000 frrmask 00 callrmgr 00
    Except :
    excp_type 0000010E  EXCEPT_SKEY
     orgea F1000E0000183830 dsisr 0000000000200000  bit set:
    DSISR_SKEY
     vmh   000000000E000510 curea F1000E0000183830 pftyp
    4000000000000106
    [00663BBC]mapfile+00009C (FFFCFF3FFFFFFFFF,
    0000000000000002,
       0000000000000003 [??])
    [005C2868]_shmat64+0000E8 (0000000C0000000C,
    0000000000000000,
       0000180000001800)
    [005C72F8]shmat+000058 (0000000C0000000C,
    0000000000000000,
       0000180000001800)
    [00003850]ovlya_addr_sc_flih_main+000130 ()
    [kdb_get_virtual_memory] no real storage @ 20281FC8
    [D053C9A8]D053C9A8 ()
    [kdb_read_mem] no real storage @ FFFFFFFFFFF9370
    

Local fix

  • N/A
    

Problem summary

  • If the dump device is on a disk connected to the FC HBA,
    writing the dump to disk may sometimes fail because the
    stack usage limit is overrun. The stack trace indicates
    that scsidisk_dump consumes about 0.5 KB of stack for a
    dk_cmd instance, which can be avoided.
    
    Other components in the call chain with similarly high
    stack consumption may also need to be fixed to ensure that
    the dump is written successfully. One such component is
    recovery_manager, which uses significantly more stack.
    

Problem conclusion

  • scsidisk_dump has been modified so that it no longer defines
    an instance of the dk_cmd structure on the stack, which was
    consuming over 500 bytes of space. One of the unused static
    structures from the diskinfo is used instead.
    
    This decreases stack consumption by only about 0.5 KB.
    Other components in the call chain with similarly high
    stack consumption may also need to be fixed to ensure that
    the dump is written successfully.
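
    The change described above can be illustrated with a minimal C
    sketch. This is not the AIX driver source: struct demo_cmd,
    struct demo_diskinfo, and both functions are hypothetical
    stand-ins, and the real dk_cmd layout is not given in this APAR
    beyond its size of over 500 bytes. The sketch only contrasts a
    large structure declared as a local variable with the same
    structure reused from preallocated per-disk storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the driver's command structure.
 * Only its approximate size (over 500 bytes) comes from the APAR. */
struct demo_cmd {
    unsigned char cdb[16];
    unsigned char payload[512];
    unsigned int  length;
};

/* Hypothetical per-disk state holding a preallocated command slot. */
struct demo_diskinfo {
    struct demo_cmd dump_cmd;
};

/* Clear the command and copy in at most sizeof(payload) bytes. */
static size_t fill_cmd(struct demo_cmd *cmd, const void *buf, size_t len)
{
    memset(cmd, 0, sizeof *cmd);
    if (len > sizeof cmd->payload)
        len = sizeof cmd->payload;
    memcpy(cmd->payload, buf, len);
    cmd->length = (unsigned int)len;
    return len;
}

/* Before: the command is a local variable, so every call adds
 * roughly 0.5 KB to the current stack frame. */
size_t dump_write_on_stack(const void *buf, size_t len)
{
    struct demo_cmd cmd;

    assert(buf != NULL);
    return fill_cmd(&cmd, buf, len);
}

/* After: the command lives in the per-disk structure, so the frame
 * holds only a pointer. This assumes nothing else uses the slot
 * while the dump is in progress. */
size_t dump_write_preallocated(struct demo_diskinfo *dp,
                               const void *buf, size_t len)
{
    assert(dp != NULL && buf != NULL);
    return fill_cmd(&dp->dump_cmd, buf, len);
}
```

    The trade-off is that the preallocated slot is shared state, so
    this approach is only safe where a single caller owns it for the
    duration of the operation.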
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV26467

  • Reported component name

    AIX V7.1

  • Reported component ID

    5765H4000

  • Reported release

    710

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2012-08-18

  • Closed date

    2012-08-18

  • Last modified date

    2013-03-26

  • APAR is sysrouted FROM one or more of the following:

    IV17896

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    AIX V7.1

  • Fixed component ID

    5765H4000

Applicable component levels

  • R710 PSY U852825

       UP12/12/07 I 1000

PTF to Fileset Mapping


Document Information

Modified date:
26 March 2013