When both SEAs are on standby
KEEPING EVERYONE ON STANDBY
I recently saw an unusual configuration for two Shared Ethernet Adapters, and it took me a while to understand why either of the SEAs was working at all.

First, a little background. (Actually, a LOT of background.) When you have Shared Ethernet Adapters set up in failover mode, all external network traffic goes through one SEA. The second SEA, on the other VIO server, is there for redundancy in case the first one goes down for some reason (network disconnected, VIOS reboot). The one with the highest trunk priority usually takes the traffic. For example, on VIOS1 you may have a virtual adapter with trunk priority 1, and on VIOS2 one with trunk priority 2. Let's call the SEA on VIOS1 "SEA1". SEA2 is on VIOS2 with trunk priority 2. There is a control channel adapter on each SEA which allows SEA2 to check that SEA1 still has external network access. If it does, SEA2 just sits there patiently, waiting for SEA1 to fail so it can take over (have you ever worked with someone like that?).

Ordinarily, you'd have SEA1 as the active adapter, as shown in the output of the VIOS entstat command run as the user padmin. The same command on SEA2 shows that the adapter on VIOS2 is not Active for external traffic.

Now, to test that SEA2 would actually work in the event of a reboot of VIOS1 or a network failure on SEA1, you could unplug SEA1's network cables, disable its switch ports or reboot VIOS1. Or you could simply set the High Availability Mode on SEA1 to standby (once again, as the padmin user):

chdev -dev ent6 -attr ha_mode=standby

A new entstat on SEA1 now ought to show this:

High Availability Mode: Standby
and SEA2 would automatically become active, even though it's priority 2. You'd see that in VIOS2's error report via the VIOS command errlog (use errlog -ls for a long listing):

LABEL: SEAHA_PRIMARY
On VIOS1 you'd see an equivalent "BECOME BACKUP" message in the error report. (If you didn't use the errlog command, and chose oem_setup_env followed by the AIX errpt command, take five marks off your scorecard for today's class.)

That's how it's supposed to work. And when VIOS1 comes up again (if you'd rebooted it), or you set the high availability mode on SEA1 back to auto, it ought to become the primary Ethernet adapter once again:

chdev -dev ent6 -attr ha_mode=auto

with the Active showing "True" again on SEA1.

Who's the boss?

So setting the ha_mode to standby is a signal to the SEA to hand over control to the alternate SEA. As this document on SEA network attributes explains:

"Typically, a Shared Ethernet Adapter in a failover setup is operating in auto mode, and the primary adapter is decided based on which adapter has the highest priority (lowest numerical value)."

(The "lowest numerical value" means the highest priority. So priority 1 is higher priority, because 1 is a lower number than 2. I hope that's clear, class.)

If they're both in auto mode, as they usually would be, then changing the priority 2 SEA to standby ought to have no effect, since priority 1 should already have the traffic, as indicated by Active: True from the entstat command.

Now, what if both SEAs had their ha_mode set to standby? What would happen? I've seen exactly this situation, and it puzzled me for a while. SEA1 was set to standby, effectively relinquishing its Active status to SEA2, but SEA2 was in no position to take over, since it, too, was set to standby. Clearly, this was not an SEA failover configuration that was going to work, and yet traffic was going through SEA1. In fact, the output of the entstat command on SEA1 showed this:

High Availability Mode: Standby
I was puzzled at first, because I thought that neither SEA was in a position to stake a claim for hosting external network traffic. But ha_mode=standby really means this:

"A shared Ethernet device can be forced into the standby mode, where it will behave as the backup device as long as it can detect the presence of a functional primary."

It was easy enough to figure out what was happening. SEA1 had been created first (assuming no reboots of either VIOS) and it, by default, became the Active SEA. When SEA2 was set up with a valid control channel, it never attempted to take over from SEA1, even though SEA1 was in standby mode. Why? Because SEA2 was also sitting there like a loaded gun, but with no "ha_mode=auto" to pull its trigger. "After you." "No, I insist, after you." Basically, the two SEAs were too polite to take over control from each other, and the first one to be set up remained the boss by default.

I don't know why they were both set to standby, but the configuration worked effectively as if there were only a single SEA: SEA1. Failover was never going to happen.

Fixing it would be very easy: set the ha_mode on each SEA to auto. If you do it on SEA2 first, then it ought to take control and become the primary adapter, but as soon as you make the change on SEA1, that will become the primary (active) adapter because it is priority 1.

Resources

If you want more information about setting up SEAs, have a look at the Advanced Power Virtualization Best Practices Redpaper and the PowerVM Virtualization Introduction and Configuration Redbook (in draft format at the time of writing this post).
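
For anyone who prefers to see the logic spelled out, here is a minimal Python sketch of the behaviour described in this post. It is a toy model only, not the real SEA failover algorithm: the Sea class, the pick_active function and the idea of an "incumbent" are all names I've made up for illustration. The assumption is simply that an SEA in auto mode is willing to be primary, an SEA in standby never claims the role, and if nobody is willing then whoever already has the traffic keeps it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sea:
    """Hypothetical stand-in for one Shared Ethernet Adapter."""
    name: str
    priority: int  # trunk priority: 1 outranks 2
    ha_mode: str   # "auto" or "standby"


def pick_active(seas: List[Sea], incumbent: Optional[str] = None) -> Optional[str]:
    """Toy rule for which SEA carries external traffic.

    Assumption, not the real SEA algorithm: an SEA in auto mode is willing
    to be primary, an SEA in standby never claims the role, and if nobody
    is willing the current holder (the incumbent) simply keeps it.
    """
    willing = [s for s in seas if s.ha_mode == "auto"]
    if willing:
        # Highest priority means the lowest numerical value.
        return min(willing, key=lambda s: s.priority).name
    return incumbent


def walkthrough() -> List[str]:
    """Replay the both-on-standby situation and the two-step fix."""
    sea1 = Sea("SEA1", priority=1, ha_mode="standby")
    sea2 = Sea("SEA2", priority=2, ha_mode="standby")
    steps = []

    # Both on standby; SEA1 was set up first, so it already holds the traffic.
    active = pick_active([sea1, sea2], incumbent="SEA1")
    steps.append(active)

    # Fix, step 1: set SEA2 to auto. It is the only willing SEA, so it takes over.
    sea2.ha_mode = "auto"
    active = pick_active([sea1, sea2], incumbent=active)
    steps.append(active)

    # Fix, step 2: set SEA1 to auto. Priority 1 beats priority 2.
    sea1.ha_mode = "auto"
    active = pick_active([sea1, sea2], incumbent=active)
    steps.append(active)
    return steps


if __name__ == "__main__":
    print(walkthrough())  # ['SEA1', 'SEA2', 'SEA1']
```

The three results line up with the story: SEA1 stays the boss while both are on standby, SEA2 takes control once it alone is set to auto, and SEA1 reclaims the traffic as soon as it is back in auto mode. The model deliberately ignores adapter or network failures, so it says nothing about what a real SEA does when the primary actually goes down.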