IBM Support

IV11780: DEADLOCK IN HD_GS_VGSA_MERGE AND HD_SA_ONEREV APPLIES TO AIX 6100-07

A fix is available

APAR status

  • Closed as program error.

Error description

  • A customer may encounter a deadlock in a concurrent VG when
    HACMP processes an LVM_IO_FAIL error.
    
    (0)> th -lk
                    SLOT NAME     STATE    TID PRI   RQ CPUID  CL  WCHAN
    
    pvthread+01E600  486 gsclvmd  SLEEP 1E60037 03C    0        0  F1000A02013A0C68 slist_table+000940
    pvthread+02E200  738 cl_vgsa_ SLEEP 2E20071 03C    2        0  F1000A020039CE78
    
    (0)> f 486
    pvthread+01E600 STACK:
    [00535AE8]slock+000488 (0000000000009024, 8000000000009032 [??])
    [00009558].simple_lock+000058 ()
    [045A10F8]hd_dev2vg+000058 (??, ??)
    [0455ED84]hd_gs_vgsa_merge+000044 (??, ??)
    [045520BC]hd_lvm_config+000F7C (??, ??, ??, ??)
    [0455C4B4]hd_cfg+0001D4 (??, ??, ??, ??)
    [00003850]ovlya_addr_sc_flih_main+000130 ()
    [kdb_get_virtual_memory] no real storage @ 30326A68
    [D0FD761C]D0FD761C ()
    [kdb_read_mem] no real storage @ FFFFFFFFFFF92D0
    
    (0)> f 738
    pvthread+02E200 STACK:
    [000D57F0]e_block_thread+000290 ()
    [000D6448]e_sleep_thread+0000E8 (??, ??, ??)
    [00014F50].kernel_add_gate_cstack+000030 ()
    [045A9558]hd_sa_onerev+0000F8 (??, ??, ??, ??)
    [045A8D1C]hd_sa_config+001B3C (??, ??, ??)
    [0456F38C]hd_ioctl+002F2C (??, ??, ??, ??, ??, ??)
    [00556D00]rdevioctl+0000C0 (??, ??, ??, ??, ??, ??)
    [007279C0]spec_ioctl+000080 (??, ??, ??, ??, ??, ??)
    [005873F0]vnop_ioctl+000050 (??, ??, ??, ??, ??, ??)
    [0059C3FC]vno_ioctl+00009C (??, ??, ??, ??, ??)
    [00664DF8]common_ioctl+0000F8 (??, ??, ??, ??)
    [00003850]ovlya_addr_sc_flih_main+000130 ()
    [kdb_get_virtual_memory] no real storage @ 2FF22848
    [D01310D4]D01310D4 ()
    [kdb_read_mem] no real storage @ FFFFFFFFFFF92D0
    

Local fix

  • Only a reboot clears the deadlock.
    

Problem summary

  • A concurrent VG in a PowerHA environment may become deadlocked
    if both nodes experience I/O failures for the VG at the same
    time, and only with very specific timing.
    

Problem conclusion

  • Do not wait for a particular lock in the VGSA_ONEREV code path,
    so that it cannot deadlock with an active VGSA merge thread.
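
The APAR does not show the actual code change. As a rough illustration of the general idea only (try the lock once and defer the work instead of sleeping on it), a minimal Python sketch follows. The names `vg_lock` and `onerev_update` are hypothetical stand-ins and are not taken from the AIX LVM source.

```python
import threading

# Hypothetical stand-in for the lock that a merge thread may be holding.
# This is not the real AIX LVM lock; it only illustrates the pattern.
vg_lock = threading.Lock()


def onerev_update(apply_update):
    """Attempt the update without ever blocking on vg_lock.

    Returns True if the update ran, False if the lock was busy and the
    caller should retry later.
    """
    # Non-blocking acquire: returns immediately instead of sleeping.
    if not vg_lock.acquire(blocking=False):
        return False
    try:
        apply_update()
        return True
    finally:
        vg_lock.release()
```

Because the caller never sleeps while waiting for the lock, it cannot become one half of a wait cycle with the thread that holds it. The cost is that the caller must handle the busy case, for example by retrying later.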
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV11780

  • Reported component name

    AIX 610 STD EDI

  • Reported component ID

    5765G6200

  • Reported release

    610

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2011-12-06

  • Closed date

    2011-12-06

  • Last modified date

    2013-02-27

  • APAR is sysrouted FROM one or more of the following:

    IV11021

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    AIX 610 STD EDI

  • Fixed component ID

    5765G6200

Applicable component levels

  • R610 PSY U839508

       UP12/05/11 I 1000

PTF to Fileset Mapping


Document Information

Modified date:
27 February 2013