IBM Support

10/100 EtherLink Server Adapter by 3Com - Load Balancing Information

Troubleshooting


Problem

Load balancing with the 10/100 EtherLink Server Adapter by 3Com. Affected configurations: systems configured with the 10/100 EtherLink Server Adapter by 3Com network interface card.

Resolving The Problem

Symptom

Load Balancing with the 10/100 EtherLink Server Adapter by 3Com

Affected Configurations

  • Systems configured with the 10/100 EtherLink Server Adapter by 3Com network interface card

Solution

What is Hashing?

Hashing is an algorithmic function that determines which NIC to use. It performs a calculation on specific address information and returns a number that corresponds to the NIC to use. The term "hashing" here refers to the frequent, repeated use of this algorithm.

Transmit Load Balancing

How is data/traffic distributed across NICs from the server when in a load-balanced group?
In the server’s transmit direction, when a client initiates a session to the server, a decision must be made about which of the grouped NICs will carry that client session. 3Com uses an advanced hashing algorithm to arrive at this decision.

With four server NICs installed and configured in a group, the hash function uses the TCP Source Port, the TCP Destination Port, and the Destination (client) IP address to determine the NIC to use. The algorithm returns a number between one and four, and that number identifies the NIC to use for that connection.
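The selection step described above can be sketched as follows. This is only an illustration: the actual 3Com hash is not documented here, so CRC32 is used as a stand-in, and the function name select_nic and the sample ports and address are hypothetical.

```python
import ipaddress
import zlib


def select_nic(src_port: int, dst_port: int, client_ip: str, nic_count: int = 4) -> int:
    """Return a NIC number in 1..nic_count for one TCP connection.

    The inputs match the fields named in the text: TCP source port,
    TCP destination port, and the destination (client) IP address.
    """
    key = (
        src_port.to_bytes(2, "big")
        + dst_port.to_bytes(2, "big")
        + ipaddress.IPv4Address(client_ip).packed
    )
    # CRC32 is an assumed stand-in for the undocumented 3Com hash.
    return zlib.crc32(key) % nic_count + 1


# The same connection always maps to the same NIC.
nic = select_nic(80, 49152, "10.1.1.20")
print(f"connection uses NIC {nic}")
```

Because the result depends only on the connection's ports and the client address, every packet of one connection leaves on the same NIC, while different connections can land on different NICs.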

3Com's solution uses a dynamic function to keep the sessions across all of the server NICs in a group as balanced and efficient as possible. If a grouped NIC is detected as not in use, the algorithm changes dynamically to ensure its use and the subsequent even distribution of data traffic across all grouped NICs. This solution does not guarantee that an equal number of bytes is transmitted on every NIC, so utilization can be uneven at any given moment, but statistically, in a large network, all the NICs will be evenly utilized.

Example: A client sends a TCP/IP packet to the server. When the server sends a response back to the client, the load balancing module intercepts the packet, performs a hashing function on the TCP ports and the destination IP address contained in the packet and then uses the result of the hashing function to select which NIC to transmit the packet on.

Transmit Load Balancing across a Router or L3 Switch

Transmit Load Balancing uses destination IP address as part of the hashing function to select which NIC to transmit the packet on. For clients situated across a router or L3 switch, the destination IP address contained in the packet is the client's IP address while the destination MAC address is the Router's MAC address. Since 3Com's load balancing algorithm looks at the destination IP address and not the destination MAC address, traffic going to clients situated across a Router/L3 switch will also be load balanced.
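A small sketch of why hashing on the destination IP address matters for routed clients. It compares an IP-based selection with a hypothetical MAC-based one. CRC32 again stands in for the undocumented 3Com hash, and the router MAC address and the client subnet are made-up values.

```python
import ipaddress
import zlib

ROUTER_MAC = bytes.fromhex("00105a000001")  # hypothetical router MAC address


def nic_by_ip(client_ip: str, nic_count: int = 4) -> int:
    """Select a NIC from the destination IP address (the approach in the text)."""
    return zlib.crc32(ipaddress.IPv4Address(client_ip).packed) % nic_count + 1


def nic_by_mac(dst_mac: bytes, nic_count: int = 4) -> int:
    """Select a NIC from the destination MAC address (shown only for contrast)."""
    return zlib.crc32(dst_mac) % nic_count + 1


# Clients behind the router: distinct IP addresses, one shared next-hop MAC.
remote_clients = [f"192.168.5.{host}" for host in range(1, 65)]

ip_choices = {nic_by_ip(ip) for ip in remote_clients}
mac_choices = {nic_by_mac(ROUTER_MAC) for _ in remote_clients}

print("NICs used when hashing on IP: ", sorted(ip_choices))
print("NICs used when hashing on MAC:", sorted(mac_choices))
```

Every routed packet carries the router's MAC address, so a MAC-based choice would send all of that traffic out of one NIC. The client IP addresses still differ, which is what lets the traffic spread across the group.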

Transmit Load Balancing in Clustering Environments

As described above, transmit load balancing uses a simplified method for balancing network traffic. This simplified method does not add latency that could cause client timeouts in clustering environments, so traffic is still balanced in these environments.

Resilient Server Links and Transmit Load Balancing

If one of the connections to the server’s grouped NICs fails, because of either faulty or unplugged cabling, the steps taken depend on which connection failed. If a secondary NIC's cable is disconnected, the load balancing software simply runs the hashing function again and re-assigns clients to NICs. No other action is needed from either the server or the clients. If the primary NIC's connection fails, the load balancing software lets the secondary NIC assume the properties of the original primary and thus become the new primary. The properties exchanged are the MAC address and the multicast group memberships. Again, the algorithm runs another hash routine to re-assign the clients to the remaining NICs. Periodically, the failed connection is tested to determine whether it has been re-established. If it has, that NIC is added back to the group, and the hashing algorithm starts using it immediately.
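The failover behavior above can be modeled roughly as follows. This is a simplified sketch, not the driver's implementation: the NicGroup class and its method names are hypothetical, CRC32 stands in for the real hash, multicast group handling is omitted, and the sketch assumes at least one NIC stays up.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class NicGroup:
    """Hypothetical model of a load-balanced group; index 0 starts as primary."""
    macs: list                                   # one MAC address string per NIC
    primary: int = 0
    failed: set = field(default_factory=set)

    def active(self) -> list:
        return [i for i in range(len(self.macs)) if i not in self.failed]

    def link_down(self, nic: int) -> None:
        self.failed.add(nic)
        if nic == self.primary:
            # A secondary assumes the primary's properties: the MAC
            # addresses are exchanged and the secondary becomes primary.
            new_primary = self.active()[0]
            self.macs[new_primary], self.macs[nic] = self.macs[nic], self.macs[new_primary]
            self.primary = new_primary

    def link_up(self, nic: int) -> None:
        # Called when the periodic link test finds the connection restored.
        self.failed.discard(nic)

    def select(self, client_key: bytes) -> int:
        # Re-hash over the NICs that are currently usable.
        nics = self.active()
        return nics[zlib.crc32(client_key) % len(nics)]


group = NicGroup(macs=["00:10:5a:00:00:01", "00:10:5a:00:00:02",
                       "00:10:5a:00:00:03", "00:10:5a:00:00:04"])
group.link_down(0)                               # primary cable unplugged
nic_after_failure = group.select(b"10.1.1.20:49152")
group.link_up(0)                                 # link restored, NIC rejoins
```

Clients are re-assigned simply because select always hashes over the NICs that are up at the time of the call.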

Receive Load Balancing

How is data/traffic distributed across NICs to the server when in a load-balanced group?
In the server’s receive direction, load balancing is achieved by using a connection-based algorithm. This function works whether the client connects to the server or the server connects to the client, and each server NIC is used on a connection-by-connection basis.

Introducing Connection Steering IP Address

Connection steering describes how the server communicates with connected clients and how it determines which of the grouped NICs to use during its load-balanced session with each client. The connection steering IP address is a configurable parameter that lets the network administrator set the server IP address that a client initially receives. This is needed because, if the server always sent its true IP address and MAC address to every client upon connection initialization, every client would receive the same NIC’s MAC address and there would be no receive load balancing.

An example configuration:
If the server’s IP address is 10.1.1.1 with a subnet mask of 255.0.0.0, and the connection steering IP address is configured with a fourth byte of .253, the clients receive 10.0.0.253 as the IP address for the server. Similarly, a server address of 10.1.1.1 with a subnet mask of 255.255.0.0 produces a connection steering IP address (CSIP) of 10.1.0.253.
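The two examples above are consistent with taking the network portion of the server address and setting the last byte to the configured value. A minimal sketch under that assumption (the rule is inferred from the examples, and the function name connection_steering_ip is hypothetical):

```python
import ipaddress


def connection_steering_ip(server_ip: str, netmask: str, last_byte: int) -> str:
    """Derive the CSIP as the network address plus the configured fourth byte."""
    # strict=False lets us pass a host address together with its mask.
    network = ipaddress.IPv4Network(f"{server_ip}/{netmask}", strict=False)
    return str(network.network_address + last_byte)


print(connection_steering_ip("10.1.1.1", "255.0.0.0", 253))    # 10.0.0.253
print(connection_steering_ip("10.1.1.1", "255.255.0.0", 253))  # 10.1.0.253
```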

When a client initiates the connection

When a client initiates the connection to the server, the client sends an ARP Request to the server's IP address. The server updates its ARP cache with the respective client's IP and MAC Address and responds to the client. These responses to clients are transmitted by the receive load balancing solution in a round robin fashion, so that all the NICs in the group have an equal number of clients associated to them.
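The round-robin assignment can be illustrated with a short sketch. The ArpResponder class, the NIC labels, and the client addresses are hypothetical, and the sketch assumes each client sends exactly one ARP request.

```python
from collections import Counter
from itertools import cycle


class ArpResponder:
    """Hypothetical model: answer each client ARP request from the next NIC in turn."""

    def __init__(self, nic_macs):
        self._next_nic = cycle(nic_macs)
        self.arp_cache = {}     # server side: client IP -> client MAC
        self.assignment = {}    # client IP -> server NIC MAC sent in the reply

    def handle_arp_request(self, client_ip: str, client_mac: str) -> str:
        # The server learns the client's addresses from the request.
        self.arp_cache[client_ip] = client_mac
        # The reply advertises the next NIC in round-robin order.
        nic_mac = next(self._next_nic)
        self.assignment[client_ip] = nic_mac
        return nic_mac


responder = ArpResponder(["nic1-mac", "nic2-mac", "nic3-mac", "nic4-mac"])
for host in range(20, 28):
    responder.handle_arp_request(f"10.1.1.{host}", f"client-mac-{host}")

per_nic = Counter(responder.assignment.values())
print(per_nic)
```

With eight clients and four NICs, each NIC ends up advertised to two clients, so each NIC receives traffic from an equal share of the clients.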

When a server initiates the connection

When the server initiates connections to clients, it sends an ARP Request to the respective client's IP address. This ARP from the server's protocol stack is intercepted by the receive load balancing function, and the Connection Steering IP address is inserted in the ARP Source IP location. This packet is then sent out to the client. The client in turn updates its ARP cache and sends a directed ARP response to the server. The receive load balancing function intercepts this directed ARP response, updates the Target IP address of the ARP packet with the server's true IP address, and indicates it to the ARP Protocol. The receive load balancing function then generates an unsolicited directed ARP response to the client with the server's true IP address and determines which NIC to use for that client connection. The client then updates its ARP cache with the server's true IP address and the MAC address of the server NIC that transmitted the unsolicited ARP response.
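The exchange above can be traced step by step with a small model. This is an illustrative sketch only: the Arp record, all addresses, and the choice of NIC are hypothetical, and the sketch assumes the initial request carries the primary NIC's MAC address, which the text does not state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Arp:
    op: str            # "request" or "reply"
    sender_ip: str
    sender_mac: str
    target_ip: str


SERVER_IP = "10.1.1.1"                  # server's true IP address
STEERING_IP = "10.0.0.253"              # configured connection steering IP
PRIMARY_MAC = "00:10:5a:00:00:01"       # assumed source MAC of the request
CHOSEN_NIC_MAC = "00:10:5a:00:00:03"    # NIC picked for this client
CLIENT_IP, CLIENT_MAC = "10.1.1.20", "00:a0:24:00:00:20"

client_cache = {}                       # client's ARP cache: IP -> MAC

# 1. The protocol stack emits a request; the load balancing function
#    replaces the source IP with the connection steering IP.
stack_request = Arp("request", SERVER_IP, PRIMARY_MAC, CLIENT_IP)
wire_request = replace(stack_request, sender_ip=STEERING_IP)

# 2. The client caches the sender and sends a directed reply.
client_cache[wire_request.sender_ip] = wire_request.sender_mac
client_reply = Arp("reply", CLIENT_IP, CLIENT_MAC, wire_request.sender_ip)

# 3. The reply is intercepted and its target IP is restored to the true
#    server IP before it is handed to the ARP protocol.
stack_reply = replace(client_reply, target_ip=SERVER_IP)

# 4. An unsolicited directed reply carries the true IP from the chosen NIC.
unsolicited = Arp("reply", SERVER_IP, CHOSEN_NIC_MAC, CLIENT_IP)
client_cache[unsolicited.sender_ip] = unsolicited.sender_mac

print(client_cache)
```

In this model the client never associates the server's true IP address with the MAC address used for the initial request. It learns the true IP address only together with the MAC address of the NIC chosen for its connection.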

Receive Load Balancing across a Router or L3 Switch

When the TCP packets are destined to other subnets, crossing a router, the ARP responses are futile, since the router serializes the traffic and will respond to and generate its own ARPs. So, in this situation alone, ARP responses for new connections are not transmitted. Since these packets are not being passed to the server, the algorithm never learns the client destination MAC address and cannot load balance the received packets. For this reason, it is recommended that you use transmit load balancing if you require load balancing for clients situated across a router or L3 switch.

Receive Load Balancing in Clustering Environments

The receive load balancing algorithm depends on ARP packet negotiation and IP steering, which results in a longer setup process because of the driver processing added on top of the clustering service processing. In some situations, client timeouts may result if a client application cannot recover from the delayed ARP update. For these reasons, it is recommended that you use transmit load balancing in clustering environments.

Resilient Server Links and Receive Load Balancing

If one of the connections fails, the following actions take place regardless of whether the failed connection belonged to the primary NIC or a secondary NIC. First, the receive load balancing function sends out a gratuitous ARP from the primary NIC (or from the new primary NIC if the primary failed). This causes all clients to direct their transmit traffic (the server's receive traffic) to the primary NIC. This effectively disables receive load balancing, because all traffic now goes to a single NIC.
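The effect of the gratuitous ARP on the clients can be sketched as follows. The function name, the NIC labels, and the client addresses are hypothetical; the sketch only models each client's ARP cache entry for the server.

```python
SERVER_IP = "10.1.1.1"

# Before the failure: each client's ARP cache maps the server IP to the
# NIC it was assigned during connection setup.
client_caches = {
    "10.1.1.20": {SERVER_IP: "nic1-mac"},
    "10.1.1.21": {SERVER_IP: "nic2-mac"},
    "10.1.1.22": {SERVER_IP: "nic3-mac"},
    "10.1.1.23": {SERVER_IP: "nic4-mac"},
}


def gratuitous_arp(caches: dict, server_ip: str, primary_mac: str) -> None:
    """Model a gratuitous ARP: every client overwrites its entry for the server."""
    for cache in caches.values():
        cache[server_ip] = primary_mac


# A link fails, so the primary NIC announces itself to all clients.
gratuitous_arp(client_caches, SERVER_IP, "nic1-mac")

nics_receiving = {cache[SERVER_IP] for cache in client_caches.values()}
print(nics_receiving)
```

After the announcement every client addresses the server through the same MAC address, which is why receive traffic collapses onto a single NIC.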

Summary

In summary, it is highly recommended that you use transmit load balancing when clients are situated across routers or L3 switches or in clustering environments.

Document Location

Worldwide

Operating System

System x Hardware Options: All operating systems listed


Document Information

Modified date:
28 January 2019

UID

ibm1MIGR-4TQVCT