APAR status
Closed as program error.
Error description
**************************************************************
* USERS AFFECTED:
* Systems running the AIX 6100-09 Technology Level
* or VIOS 2.2.5.x level
* with ios.sea at the 6.1.9.200 or 6.1.9.201 level.
**************************************************************
* ERROR DESCRIPTION:
* This issue only affects VIOS 2.2.5 using Shared Ethernet
* Adapter (SEA) with a primary and backup in HA mode (either
* auto or sharing).
*
* When the primary VIOS is rebooted, the SEA may be placed in
* an UNHEALTHY state due to the link going down and up during
* reboot.
* It may stay this way for 10 minutes after rebooting.
* If, during this time, the other VIOS providing the SEA is
* also rebooted, network connectivity through the SEA will be
* lost, because the UNHEALTHY SEA will not take over as
* PRIMARY.
*
* To avoid the problem, after rebooting the primary VIOS, wait
* until the state of its SEA adapter no longer shows UNHEALTHY
* state before rebooting the partner VIOS.
* You can check the state of the SEA adapter by using
* 'entstat -all' on the SEA. For example:
*
* $ entstat -all entX | grep State
* State: UNHEALTHY
* ....
*
* The fix below can also be installed to both VIOSes to avoid
* any issue. This fix changes the link checking part of
* health check to no longer cause SEA to go into UNHEALTHY
* state during reboot. It also reduces the default health
* time delay to 60 seconds.
**************************************************************
* RECOMMENDATION:
* Install APAR IV97991.
* Prior to fix availability, an interim fix is available from
* either
* ftp://aix.software.ibm.com/aix/ifixes/iv97991/
* https://aix.software.ibm.com/aix/ifixes/iv97991/
* Installation of the ifix requires a reboot.
**************************************************************
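The wait-before-rebooting workaround described above could be scripted roughly as follows. This is only a sketch under stated assumptions, not an IBM-supplied tool: the function names (sea_state, sea_is_healthy, wait_for_sea), the 900-second default timeout, and the 30-second poll interval are all hypothetical, and the script assumes a shell where functions and awk are available (for example a root shell on the VIOS rather than the restricted padmin shell). It relies only on the "State:" line shown in the entstat example.

```shell
#!/bin/sh
# Sketch only: helper names, timeout and poll interval are assumptions,
# not part of APAR IV97991 or any IBM tooling.

# Print the SEA state found in `entstat -all` output read from stdin.
# Only a line whose first field is exactly "State:" is used, so other
# lines that merely contain the word "State" are ignored.
sea_state() {
    awk '$1 == "State:" && !seen { print $2; seen = 1 }'
}

# Succeed (exit status 0) when a state was found and it is not UNHEALTHY.
sea_is_healthy() {
    state=$(sea_state)
    [ -n "$state" ] && [ "$state" != "UNHEALTHY" ]
}

# Poll the SEA until it leaves the UNHEALTHY state or the timeout expires.
#   $1 = SEA device name (entX in the example above)
#   $2 = timeout in seconds (assumed default: 900)
#   $3 = poll interval in seconds (assumed default: 30)
wait_for_sea() {
    adapter=$1
    timeout=${2:-900}
    interval=${3:-30}
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if entstat -all "$adapter" | sea_is_healthy; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1
}
```

After rebooting the primary VIOS, one might run `wait_for_sea entX` there and reboot the partner VIOS only if it returns success. Separating the parsing (sea_state) from the polling keeps the state check usable on saved entstat output as well.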
Local fix
Problem summary
Given a pair of VIOS LPARs (2.2.5.x and up) with matching SEAs in HA mode (ha_mode set to auto or sharing) and one node in the UNHEALTHY state, if the healthy node is rebooted or loses link, the UNHEALTHY node will not assume the PRIMARY state.
In the field, a customer reboots the primary LPAR, waits until it is back up, and then reboots the backup LPAR. Unbeknownst to the customer, the primary LPAR has gone into the UNHEALTHY state because its link came up slightly delayed. When the backup LPAR is shut down, the primary LPAR does not take over and become PRIMARY as it did before the upgrade.
Problem conclusion
The code was changed to disable the link check as part of the health check, to reduce the default value of the health_check attribute to 60 seconds, and to reduce its minimum value to 1 second.
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
IV97991
Reported component name
VIRTUAL I/O SER
Reported component ID
5765G3400
Reported release
220
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2017-07-12
Closed date
2017-08-09
Last modified date
2017-10-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
IV99666 IV99667
Fix information
Fixed component name
VIRTUAL I/O SER
Fixed component ID
5765G3400
Applicable component levels
R220 PSY U870065
UP17/10/17 I 1000
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
04 September 2024