IBM Support

IV65589: VIOS CRASHED IN EMUL_SCHEDULE APPLIES TO AIX 6100-08

APAR status

  • Closed as program error.

Error description

  • VIOS crashed with the following stack:
    
    
    CPU 1 CSA F00000002FF47600 at time of crash, error code for LEDs:30000000
    pvthread+003500 STACK:
    [F1000000C0313784]vscsi_host:emul_schedule+000044 (700DFEED00007262, F1000A00E0E04A08 [??])
    [F1000000C030DF20]vscsi_host:emul_no_dev_scheduler+000020 (??, ??, ??)
    [F1000000C0310C1C]vscsi_host:emul_schedule_queue+0001BC (??, ??, ??)
    [F1000000C03226FC]vscsi_host:target_edasd_iodone+000F1C (??)
    [00014D70].hkey_legacy_gate+00004C ()
    [00262D20]iodone+000340 (??)
    [F1000000C031A5E8]vscsi_host:hdasd_issue_open+000168 (??)
    [F1000000C032BD9C]vscsi_host:target_kproc+00015C (??, ??, ??)
    [00014D70].hkey_legacy_gate+00004C ()
    
    
    (1)> print dev_info F1000A002B64BE00
    struct dev_info {
        struct q_manage {
            struct q_manage *next   = 0x0000000000000000;
            struct q_manage *prev   = 0x0000000000000000;
        } global_dlist;
        uint64_t eye        = 0x56626C6B44657600;
        uint64_t *dev_trace = 0x0000000000000000;
        void *dd_private    = 0x0000000000000000;
        struct task_set_t *task_set = 0xF1000A002B614600;
        unknown num_act_cmds        = 0x00000000;
        unknown dev_trc_index       = 0x00000000;
        unknown flags       = 0x0000;
        uchar device_type   = 0x08;
        uchar pad[0]        = 0x00;
        uchar pad[1]        = 0x00;
        uchar pad[2]        = 0x00;
        uchar pad[3]        = 0x00;
        uchar pad[4]        = 0x00;
        union Simple_lock {
            simple_lock_data _slock = 0x0000000000000000;
            struct lock_data_instrumented *_slockp  = 0x0000000000000000;
        } dev_lock;
        struct q_manage *lun_list   = 0x0000000000000000;
        void *emulation_priv        = 0x0000000000000000;
        void (*scheduler)() = 0xF1000000C03AD158;
        struct cmd_elem *(*abort_io)()      = 0xF1000000C03AD170;
        long (*wait_done)() = 0xF1000000C03AD188;
        void (*error_devstrat)()    = 0x0000000000000000;
        void (*delete_lun)()        = 0xF1000000C03AD1A0;
        void (*initiator_login)()   = 0xF1000000C03AD1B8;
        void (*unit_attention)()    = 0xF1000000C03AD1B8;
        long (*need_thread)()       = 0xF1000000C03AD1E8;
        void (*set_vtd)()   = 0xF1000000C03AD1D0;
        void (*get_vtd)()   = 0xF1000000C03AD1D0;
        long (*dev_dep_fun)()       = 0xF1000000C03ACAF8;
        void (*vbsd_init)() = 0x0000000000000000;
        long (*finish_init_lun)()   = 0x0000000000000000;
        unknown use_count   = 0x00000000;
        unknown blk_size    = 0x00000000;
        unknown num_blks    = 0x0000000000000000;
        dev_t blck_devno    = 0x0000000000000000;
        struct lun_reserve {
            unknown reserved        = 0x00000000;
            dev_t adpt_devno        = 0x0000000000000000;
        } reserve;
        char *copy_buf      = 0x0000000000000000;
        ulong copy_buf_len  = 0x0000000000000000;
        struct cdt *cdt     = 0x0000000000000000;
        void *ras_cb        = 0x700DFEED00007262;    <---------------
    
    
    
         | 000000                           PDEF     emul_schedule
        0|                                  PROC     lun,cmd,gr3,gr4
        0| 008180 std      FBE1FFF8   1     ST8      #stack(gr1,-8)=gr31
    ...
      697| 0081A0 ld       EB830028   1     L8       gr28=(*)_vadapter_lu._vadapter_lu.lun(gr3,40)
    ...
      703| 0081B8 ld       E87C0100   1     L8       gr3=(*)dev_info.dev_info.ras_cb(gr28,256)
    ...
      703| 0081C4 lha      A8030018   1     L2A      gr0=(*)Crasr_block.rasr_block.rrb_trace_privlevel(gr3,24)    <-----------
    ...
    
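    gr3 is loaded with dev_info.ras_cb, which is
    0x700DFEED00007262. This is not a valid pointer, so the
    load of rrb_trace_privlevel through gr3 (marked in the
    listing) fails. In other dev_info structures, ras_cb is
    set to a valid value. We need to understand why this
    dev_info is not properly initialized.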
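
The arithmetic that ties the stack trace to the listing can be checked with a short sketch. It uses only values copied from the dump above; the variable names are illustrative, and it assumes that the listing offsets and the "+000044" stack offset are both measured from the first instruction of emul_schedule.

```python
# Values copied from the stack trace, kdb output and listing above.
FUNC_START = 0x008180               # listing offset of the first emul_schedule instruction
STACK_OFFSET = 0x000044             # from "emul_schedule+000044" in the stack
DEV_INFO = 0xF1000A002B64BE00       # address given to "print dev_info"
RAS_CB_DISP = 256                   # displacement in "ld ... ras_cb(gr28,256)"
RAS_CB = 0x700DFEED00007262         # value of dev_info.ras_cb in the dump
PRIVLEVEL_DISP = 24                 # displacement in "lha ... (gr3,24)"
EYE = 0x56626C6B44657600            # dev_info.eye

# Listing offset of the instruction that was executing at the crash.
fault_listing_offset = FUNC_START + STACK_OFFSET

# Address of the ras_cb field inside this dev_info.
ras_cb_field = DEV_INFO + RAS_CB_DISP

# Effective address the lha instruction tries to read.
fault_address = RAS_CB + PRIVLEVEL_DISP

# The eye catcher is printable ASCII padded with a trailing zero byte.
eye_text = EYE.to_bytes(8, "big").rstrip(b"\x00").decode("ascii")

print(f"faulting listing offset: {fault_listing_offset:06X}")
print(f"ras_cb field address:    {ras_cb_field:016X}")
print(f"address read by lha:     {fault_address:016X}")
print(f"eye catcher:             {eye_text}")
```

The computed listing offset, 0081C4, is the lha line marked in the listing, which is consistent with the crash happening on the dereference of ras_cb.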
    

Local fix

Problem summary

  • When a crash is caused by this problem, emul_schedule_queue
    is always somewhere in the stack, though the function at the
    top of the stack may vary.  If a listing is used to see
    precisely where in the code the crash occurred, the failing
    instruction is one that references the dkstat structure.
    

Problem conclusion

  • Fixed a problem that corrupts command queues.
    

Temporary fix

Comments

  • 6100-08 - use AIX APAR IV65589
    6100-09 - use AIX APAR IV65788
    

APAR Information

  • APAR number

    IV65589

  • Reported component name

    VIRTUAL I/O SER

  • Reported component ID

    5765G3400

  • Reported release

    220

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2014-10-07

  • Closed date

    2014-11-21

  • Last modified date

    2015-09-29

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    U870211

Fix information

  • Fixed component name

    VIRTUAL I/O SER

  • Fixed component ID

    5765G3400

Applicable component levels

  • Business Unit

    Software (BU029)

  • Product

    PowerVM VIOS Standard Edition (SSAVPM)

  • Platform

    Platform Independent (PF025)

  • Version

    220

Document Information

Modified date:
05 August 2024