A fix is available
APAR status
Closed as program error.
Error description
If a cluster is removed while some member nodes are unaware of the removal (for example, because they are offline or stopped), and the cluster is sufficiently large (more than 16 nodes), the unaware node(s) may crash with the following stack trace:
pvthread+000200 STACK:
WARNING: bad IAR: 00000000, display stack from LR: F1000000C044A9B4
F1000000C044A9B4 get_node_state_from_repos+000114 (0000000000000000, 000000000001000A ??)
F1000000C044D4A8 gossip_xmt+0006C8 ()
F1000000C044DC8C init_gossip_timer+00014C ()
00014D70 .hkey_legacy_gate+00004C ()
000F9090 clock+000330 (??)
002E8CB8 i_softmod+0005D8 ()
001B7BD4 flih_util+000260 ()
____ Exception (F00000002FF47600) ____
iar : 00000000000D00FC  msr : 8000000000009032  cr : 24008224
lr : 0000000000097D84  ctr : 0000000000000000  xer : 00000000
mq : 00000000  asr : 00000000BE42D001  amr : FFFCFF3FFFFFFFFF
r0 : 0000000000000000  r1 : 0FFFFFFFF3FFFDA0  r2 : 0000000002CD62D0
r3 : 0000000000000000  r4 : 0000000000000000  r5 : 800000001C949120
r6 : 800000001C9564F8  r7 : 0000000000000000  r8 : 0000000000000000
r9 : 0000000000000040  r10 : 0000000000000000  r11 : 00000000001A0F13
r12 : 0000000000000000  r13 : F1000A00E00B0C00  r14 : 00000000000034E0
r15 : 0000000000000000  r16 : 0000000000000000  r17 : 0000000000000000
r18 : 0000000001827F46  r19 : 000000003B9ACA00  r20 : 0000000000000001
r21 : 0000000000000000  r22 : 0000000002D40C24  r23 : 0000000000000000
r24 : F1000F0A10000278  r25 : 00000000024D6500  r26 : 00000000024D64FE
Local fix
Problem summary
If a cluster is removed while some member nodes are unaware of the removal (for example, because they are offline or stopped), and the cluster is sufficiently large (more than 16 nodes), the unaware node(s) may crash with the following stack trace:
pvthread+000200 STACK:
WARNING: bad IAR: 00000000, display stack from LR: F1000000C044A9B4
F1000000C044A9B4 get_node_state_from_repos+000114 (0000000000000000, 000000000001000A ??)
F1000000C044D4A8 gossip_xmt+0006C8 ()
F1000000C044DC8C init_gossip_timer+00014C ()
00014D70 .hkey_legacy_gate+00004C ()
000F9090 clock+000330 (??)
002E8CB8 i_softmod+0005D8 ()
001B7BD4 flih_util+000260 ()
____ Exception (F00000002FF47600) ____
iar : 00000000000D00FC  msr : 8000000000009032  cr : 24008224
lr : 0000000000097D84  ctr : 0000000000000000  xer : 00000000
mq : 00000000  asr : 00000000BE42D001  amr : FFFCFF3FFFFFFFFF
r0 : 0000000000000000  r1 : 0FFFFFFFF3FFFDA0  r2 : 0000000002CD62D0
r3 : 0000000000000000  r4 : 0000000000000000  r5 : 800000001C949120
r6 : 800000001C9564F8  r7 : 0000000000000000  r8 : 0000000000000000
r9 : 0000000000000040  r10 : 0000000000000000  r11 : 00000000001A0F13
r12 : 0000000000000000  r13 : F1000A00E00B0C00  r14 : 00000000000034E0
r15 : 0000000000000000  r16 : 0000000000000000  r17 : 0000000000000000
r18 : 0000000001827F46  r19 : 000000003B9ACA00  r20 : 0000000000000001
r21 : 0000000000000000  r22 : 0000000002D40C24  r23 : 0000000000000000
r24 : F1000F0A10000278  r25 : 00000000024D6500  r26 : 00000000024D64FE
Problem conclusion
The logic in the cluster kernel extension remove procedure was adjusted to properly handle these cases.
Temporary fix
Comments
APAR Information
APAR number
IV49276
Reported component name
AIX 610 STD EDI
Reported component ID
5765G6200
Reported release
610
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Submitted date
2013-09-17
Closed date
2013-09-17
Last modified date
2014-02-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
AIX 610 STD EDI
Fixed component ID
5765G6200
Applicable component levels
R610 PSY U859947
UP14/02/17 I 1000
PTF to Fileset Mapping
U859947 bos.cluster.rte 6.1.8.17
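The mapping above ties the fix to bos.cluster.rte at level 6.1.8.17. As a rough illustration only, the following Python sketch compares an installed fileset level against that level. It assumes the level uses the usual four-part V.R.M.F form and that later levels within the same 6.1 release also carry the fix; neither assumption is stated in this APAR. The function names parse_vrmf and has_fix are hypothetical helpers, and the installed level would have to be read separately, for example from the output of `lslpp -L bos.cluster.rte`.

```python
from typing import Tuple

# Fileset level taken from the "PTF to Fileset Mapping" entry above.
FIXED_LEVEL = "6.1.8.17"


def parse_vrmf(level: str) -> Tuple[int, int, int, int]:
    """Split a 'V.R.M.F' fileset level into four integers (hypothetical helper)."""
    parts = level.strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a V.R.M.F level: {level!r}")
    v, r, m, f = (int(p) for p in parts)
    return (v, r, m, f)


def has_fix(installed_level: str, fixed_level: str = FIXED_LEVEL) -> bool:
    """Return True if installed_level is at or above fixed_level.

    Assumption: levels are only comparable within the same version and
    release (6.1 here), and later levels in that release include the fix.
    """
    installed = parse_vrmf(installed_level)
    fixed = parse_vrmf(fixed_level)
    if installed[:2] != fixed[:2]:
        raise ValueError("levels belong to different releases")
    # Tuples compare numerically field by field, so 6.1.8.2 < 6.1.8.17.
    return installed >= fixed
```

Comparing integer tuples rather than raw strings matters here: as strings, "6.1.8.2" would sort after "6.1.8.17", which would wrongly report the older level as fixed.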
Document Information
Modified date:
17 February 2014