After a PowerHA failover removes a service IP alias address from one cluster node and activates it on another cluster node, clients sometimes cannot connect to this IP address, and network services provided by this node are not reachable.
Clients cannot connect to the service IP alias and may display connection timeout errors.
Established TCP sessions can hang for some time.
When an alias IP address is removed from the IP interface layer, TCP sockets in ESTABLISHED state that use the removed IP as their local address can survive. If such a socket has data in its send buffer, it goes into retransmit state, and the TCP retransmit timeout closes the socket only after about 10 minutes. In a PowerHA cluster, this can confuse the external Ethernet switch and lead to port flapping (MAC address flapping) between the original port and the new port.
IBM PowerHA cluster nodes with AIX operating system using IP alias addresses as service IP labels.
Log in to the system from which the alias IP address was just removed.
The command "netstat -an" shows TCP sessions in "ESTABLISHED" state with the removed IP alias address as the "local" IP address.
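On a busy node the full "netstat -an" listing can be long, so a small filter may help to spot the surviving sessions. The following is a minimal sketch, not part of the original procedure: the function name established_on_ip is hypothetical, 192.0.2.10 is a placeholder for the removed service IP alias, and the filter assumes the usual AIX column layout in which the local address is the fourth column and the port follows the address after a final dot.

```shell
# Hypothetical helper: print ESTABLISHED TCP sessions whose local address is
# the given IP. Reads "netstat -an" output on stdin.
# Assumed column layout: proto recv-q send-q local foreign state,
# with the port appended to the address after a final dot.
established_on_ip() {
    awk -v ip="$1" '
        $1 ~ /^tcp/ && $NF == "ESTABLISHED" {
            local_addr = $4
            sub(/\.[0-9]+$/, "", local_addr)   # strip the trailing .port
            if (local_addr == ip) print        # exact match on the address
        }'
}

# Example (192.0.2.10 is a placeholder for the removed service IP alias):
#   netstat -an | established_on_ip 192.0.2.10
```

The port is stripped before the comparison so that an address such as 192.0.2.100 is not reported when searching for 192.0.2.10. Any line printed for the removed alias indicates a session that survived the removal.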
APAR IJ14259 introduced a new network option "ip_ifdelete_no_retrans".
To activate the fix for this problem, two network options must be enabled:
# no -p -o ip_ifdelete_notify=1
# no -p -o ip_ifdelete_no_retrans=1
With the fix activated, surviving TCP sessions are canceled.
This prevents packets from being sent to the external network with the removed IP alias address as the source address.
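After setting the options, it can be useful to confirm that both are in effect on every cluster node. The following is a minimal sketch, not part of the original procedure: the function name ifdelete_fix_active is hypothetical, and it assumes that "no -a" prints one "name = value" line per network option.

```shell
# Hypothetical helper: read "no -a" style output ("name = value" lines) on
# stdin and succeed only if both tunables named in this document are set to 1.
ifdelete_fix_active() {
    awk '
        $1 == "ip_ifdelete_notify"     && $2 == "=" { notify = $3 }
        $1 == "ip_ifdelete_no_retrans" && $2 == "=" { noretrans = $3 }
        END { exit !(notify == 1 && noretrans == 1) }'
}

# Example:
#   no -a | ifdelete_fix_active && echo "fix active" || echo "fix NOT active"
```

The check fails if either option is missing from the output, which would be expected on a system where the option is not available.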
SUPPORT:
If additional assistance is required after completing all of the instructions provided in this document, please follow the step-by-step instructions below to contact IBM to open a case for software under warranty or with an active and valid support contract. The technical support specialist assigned to your case will confirm that you have completed these steps.
a. Document and/or take screen shots of all symptoms, errors, and/or messages that might have occurred
b. Capture any logs or data relevant to the situation.
c. Contact IBM to open a case:
-For electronic support, please visit the IBM Support Community:
https://www.ibm.com/mysupport
-If you require telephone support, please visit the web page:
https://www.ibm.com/planetwide/
d. Provide a good description of your issue and reference this technote
e. Upload all of the details and data to your case
-You can attach files to your case in the IBM Support Community
-Or Upload data to IBM testcase server analysis:
http://www.ibm.com/support/docview.wss?uid=ibm10733581