APAR status
Closed as program error.
Error description
After a reboot of a STOPPED node, a manual startup of clconfd can result in a deadlock that prevents the node from starting CAA cluster services. The syslog shows repeated messages such as:
Sep 18 16:00:46 aix1aiib19p caa:info unix: kcluster_lock.c wait_on_node_bringup 480 Active node count = 0 rc = -1 code = 455
Sep 18 16:00:46 aix1aiib19p caa:err|error unix: kcluster_lock.c get_storage_nodecnt 105 NDD_NODE_CNT returned rc=16
Sep 18 16:00:47 aix1aiib19p caa:err|error unix: kcluster_lock.c get_storage_nodecnt 105 NDD_NODE_CNT returned rc=16
Sep 18 16:00:47 aix1aiib19p caa:err|error unix: kcluster_lock.c get_storage_nodecnt 105 NDD_NODE_CNT returned rc=16
Sep 18 16:00:48 aix1aiib19p caa:info unix: kcluster_lock.c get_storage_nodecnt 132 NDD_NODE_CNT rc 16 count -1 line 113
Sep 18 16:00:48 aix1aiib19p caa:info unix: kcluster_lock.c count_active_nodes 393 rc -1 code 314 num_nodes_active 0 up_node_cnt 0 db_node_cnt 0
Sep 18 16:00:48 aix1aiib19p caa:info unix: kcluster_lock.c count_active_nodes 395 num_local_nodes_active 0 local_up_node_cnt 0 local_db_node_cnt 0
Sep 18 16:00:48 aix1aiib19p caa:info unix: kcluster_lock.c wait_on_node_bringup 480 Active node count = 0 rc = -1 code = 455
Sep 18 16:00:48 aix1aiib19p caa:err|error unix: kcluster_lock.c get_storage_nodecnt 105 NDD_NODE_CNT returned rc=16
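One way to check whether a node is showing this signature is to count the NDD_NODE_CNT rc=16 errors in the syslog output. The following is a minimal sketch, not an IBM-supplied tool: the function name count_caa_lock_errors is hypothetical, and the log file path is passed in as an argument because the syslog destination depends on the local syslog configuration.

```shell
# Hypothetical helper: count the deadlock-signature messages in a syslog file.
# Usage: count_caa_lock_errors /path/to/syslog/output
count_caa_lock_errors() {
    # grep -c prints the number of matching lines (0 when there are none).
    grep -c 'get_storage_nodecnt.*NDD_NODE_CNT returned rc=16' "$1"
}
```

A count that keeps growing while the node fails to start CAA cluster services suggests the condition described here, but it is only an indicator and not a definitive diagnosis.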
Local fix
Stop clconfd with "stopsrc -s clconfd" or "kill -9 clconfd_pid" (where clconfd_pid is the process ID of clconfd), then run "clmgr online node START_CAA=yes".
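The local fix can be sketched as a small shell function. This is an illustration under stated assumptions, not an official procedure: the function name caa_local_fix and the RUN switch are hypothetical, and the two commands are copied as written in this APAR. By default the function only prints the commands so they can be reviewed first.

```shell
# Hypothetical wrapper around the documented local fix.
# Default: print the commands. Set RUN=1 to execute them on the affected node.
caa_local_fix() {
    for cmd in "stopsrc -s clconfd" "clmgr online node START_CAA=yes"; do
        if [ "${RUN:-0}" = "1" ]; then
            # Stop at the first command that fails.
            sh -c "$cmd" || return 1
        else
            printf '%s\n' "$cmd"
        fi
    done
}
```

Running caa_local_fix with no settings prints the two commands in order; RUN=1 caa_local_fix executes them. If stopsrc does not stop the daemon, the APAR gives kill -9 on the clconfd process ID as the alternative.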
Problem summary
An operation to start a CAA node will fail because it cannot obtain a lock.
Problem conclusion
Do not attempt to JOIN a STOPPED node to the CAA cluster.
Temporary fix
Comments
7100-04 - use AIX APAR IJ09741
7100-05 - use AIX APAR IJ09764
7200-01 - use AIX APAR IJ13807
7200-02 - use AIX APAR IJ13519
7200-03 - use AIX APAR IJ12372
APAR Information
APAR number
IJ09741
Reported component name
AIX V7.1
Reported component ID
5765H4000
Reported release
710
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-09-27
Closed date
2018-09-28
Last modified date
2019-11-14
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
IJ09764 IJ09871 IJ12372 IJ13519 IJ13807
Fix information
Fixed component name
AIX V7.1
Fixed component ID
5765H4000
Applicable component levels
R710 PSY U879895
UP19/07/15 I 1000
Document Information
Modified date:
20 April 2022