Today I thought about using heartbeat over IP alias on an LPAR that has virtual Ethernet adapter(s).
I came to the conclusion that there is no point in configuring heartbeat over IP alias in this case.
In the past we all used physical Ethernet adapters on our systems running HACMP.
As we did not want an adapter (or switch) outage to cause a failover, we put 2 physical adapters in the system/LPAR.
That way HACMP could run a swap_adapter event when an Ethernet card stopped working, and did not have to cause a failover. It simply moved the service IP from the failing card to the second card.
However, in order to be able to move a (service) IP from one card to the other, both cards had to be in the same IP subnet, of course.
As a result there was no single, well-defined route for packets that had to be sent to a destination IP in that subnet. There were 2 interfaces that could be used, and AIX used both alternately.
While this does not cause any problems with normal traffic, it does cause problems with heartbeating over IP.
Suppose you have nodes A and B in a cluster, with 192.168.1.1 and 192.168.1.2 on the NICs of node A and 192.168.1.3 and 192.168.1.4 on the NICs of node B.
Now HACMP wants to know that 192.168.1.1 is up and so wants to send a packet to, say, 192.168.1.2.
If node A had only 192.168.1.1 as an interface and not 192.168.1.2, HACMP could be sure that the packet would be sent through 192.168.1.1.
But with 2 adapters HACMP can't know which interface the packet will be sent through. The packets will be sent alternately through 192.168.1.1 and 192.168.1.2 (causing 50% heartbeat loss).
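To make the 50% figure concrete, here is a minimal Python sketch of the situation above. It is only a toy model, not how AIX or HACMP actually pick a route: the interface names en0/en1 and the /24 netmask are assumptions (the post does not state them), and the strict alternation is a simplification of "AIX used both alternately".

```python
import ipaddress

# Hypothetical interface names; the /24 netmask is an assumption.
NODE_A = {
    "en0": ipaddress.ip_interface("192.168.1.1/24"),
    "en1": ipaddress.ip_interface("192.168.1.2/24"),
}

def candidates(interfaces, dest):
    """Return the names of all interfaces whose subnet contains dest."""
    dest = ipaddress.ip_address(dest)
    return [name for name, iface in sorted(interfaces.items())
            if dest in iface.network]

def simulate(interfaces, dest, intended, n):
    """Send n heartbeats to dest, alternating over all matching interfaces.

    Returns the fraction that left through the intended interface.
    """
    names = candidates(interfaces, dest)
    hits = sum(1 for i in range(n) if names[i % len(names)] == intended)
    return hits / n

# Both NICs match the destination, so the outgoing interface is ambiguous.
print(candidates(NODE_A, "192.168.1.3"))           # ['en0', 'en1']
# Only half of the heartbeats meant to test en0 really leave through en0.
print(simulate(NODE_A, "192.168.1.3", "en0", 100))  # 0.5
```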
That's clearly not acceptable, and so some smart guy in HACMP development had the idea to simply configure an additional IP network on those interfaces, so that HACMP could determine which interface would be used simply by looking at the destination IP.
So we configured (for example) the IP 10.0.0.1 on the Ethernet interface that already had 192.168.1.1, and 10.1.0.1 (another IP subnet) on the Ethernet interface that already had 192.168.1.2 (and on the other node 10.0.0.2 on the first and 10.1.0.2 on the second card).
Now when HACMP sent a heartbeat to 10.0.0.2, it could rely on the first interface (having IPs 10.0.0.1 and 192.168.1.1) being used (and on the second NIC being used when sending heartbeats to 10.1.0.2). Problem solved.
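The effect of the alias subnets can be sketched the same way. Again this is just an illustration under assumptions: en0/en1 are made-up interface names and all netmasks are assumed to be /24, which the post does not specify. The point is only that each heartbeat destination now matches exactly one interface.

```python
import ipaddress

# Hypothetical interface names; all /24 netmasks are assumptions.
NODE_A = {
    "en0": ["192.168.1.1/24", "10.0.0.1/24"],
    "en1": ["192.168.1.2/24", "10.1.0.1/24"],
}

def outgoing_interfaces(node, dest):
    """Return every interface that has an address in the subnet of dest."""
    dest = ipaddress.ip_address(dest)
    return sorted(
        name for name, addrs in node.items()
        if any(dest in ipaddress.ip_interface(a).network for a in addrs)
    )

# Heartbeat targets on node B, one per alias subnet: exactly one match each.
print(outgoing_interfaces(NODE_A, "10.0.0.2"))     # ['en0']
print(outgoing_interfaces(NODE_A, "10.1.0.2"))     # ['en1']
# The shared service subnet is still ambiguous, as before.
print(outgoing_interfaces(NODE_A, "192.168.1.3"))  # ['en0', 'en1']
```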
These days, however, we have shared Ethernet adapters that already have a failover mechanism between 2 Ethernet cards. So we don't configure 2 NICs in the same IP subnet on the HACMP node anymore.
And so we no longer have the problem (2 NICs in the same IP subnet on the node) that heartbeating over IP alias solves.
So there is no point in using heartbeat over IP alias anymore.
Do you agree? Or am I missing something?
2 replies Latest Post - 2012-08-27T17:26:54Z by j.gann
Topic: Heartbeat over ip-alias pointless when using shared ethernet adapter?
Holgervk 0100002CRS 10 Posts ACCEPTED ANSWER
Re: Heartbeat over ip-alias pointless when using shared ethernet adapter? 2012-08-21T16:46:20Z in response to Holgervk
Sorry,
>Now HACMP wants to know that 192.168.1.1 is up and so wants to send a packet to say 192.168.1.2.
should read:
>Now HACMP wants to know that 192.168.1.1 is up and so wants to send a packet to say 192.168.1.3.
(There is no point in heartbeating your own interfaces, as that traffic would go through lo0.)
j.gann 270000SSYT 18 Posts ACCEPTED ANSWER
Re: Heartbeat over ip-alias pointless when using shared ethernet adapter? 2012-08-27T17:26:54Z in response to Holgervk
Hello,
I agree there is no point in having multiple virtual interfaces per node per network if these are backed by the same SEA failover pair.
And that's independent of what type of heartbeat you choose (heartbeat over IP aliases or over base addresses).
The multiple interfaces per node per network have/had their use with physical adapters: they enable a node to decide whether a network failure is caused by a failure of its own adapter, a failure of a foreign adapter, or a global network failure.
Virtual networks have special semantics, and it took HACMP some time to adapt to these after the release of POWER5 virtualization. You have surely come across netmon.cf.
PowerHA 7 (i.e. Cluster Aware AIX) redesigned heartbeating to use IP multicast, initially had no intention of using a netmon.cf, and...
now gets the functionality with APAR IV14422.