In Version 10.5 Fix
Pack 5 and later fix packs, you can configure the network settings
of hosts on a remote direct memory access (RDMA) protocol over Converged
Ethernet (RoCE) network with IP support. A
RoCE network with IP support is characterized by the presence of a
network interface on hosts that can transmit and receive both TCP/IP
and RDMA data. To configure the network settings, you must check for
the required uDAPL software, create a network interface, associate
interconnect net names with IP addresses, and add required entries
to the Direct Access Transport (DAT) configuration file.
Before you begin
The steps in this topic configure the
network settings of hosts on a RoCE network that has IP support.
This topic is specific to configurations with these adapters: EC3A,
EC3B. If you are configuring the network settings of hosts on a RoCE
network without IP support, see the topic
Configuring
network settings on a RoCE network without NIC support.
Ensure
that you complete the following tasks:
About this task
You must perform these steps on each host, or LPAR, that you
want to participate in the DB2 pureScale instance.
Cluster caching facilities (CFs) and members support multiple communication
adapter ports to help DB2 pureScale environments
scale and to help with high availability. One communication adapter
port for each CF or member is all that is required, though it is recommended
to use more adapter ports to increase bandwidth, add redundancy, and
allow the use of multiple switches. This topic guides you through
the installation and setup of User Direct Access Programming
Library (uDAPL) and 40 Gigabit Ethernet
on AIX® hosts,
and through the configuration of IP addresses.
Restrictions
- Administrative access is required on all DB2 member and CF hosts.
Procedure
- Log in as root.
- Modify the /etc/rc.tcpip file and
add these routes:
route add <switch1 IP> <switch1 IP> -if <interface 1>
route add <switch2 IP> <switch2 IP> -if <interface 2>
The <switch IP> values are the IP addresses that are assigned to the
switches. Each one must reside in the same IP subnet as the IP addresses
used on the hosts.
Here, <switch1 IP> is assigned to the first switch
and <interface 1> is a network interface on the
host that maps to a cable on the adapter that is plugged into this
same switch. <switch2 IP> is assigned to the second
switch and <interface 2> maps to a cable on the
adapter that is plugged into the second switch.
For more information
on cabling of two switch configurations, see Network
topology configuration support for DB2 pureScale environments.
For more information on how to assign IP addresses to switches, see Configuring
switch failover for a DB2 pureScale environment on a RoCE network
(AIX®).
For example:
route add 10.1.1.24 10.1.1.24 -if en1
route add 10.1.2.23 10.1.2.23 -if en2
Here, 10.1.1.24 is
assigned to switch 1 and en1 maps to a cable on the adapter
that is plugged into this same switch. 10.1.2.23 is assigned to switch
2 and en2 maps to a cable on the adapter that is plugged
into the second switch.
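As a minimal sketch, the route entries can be generated from pairs of switch IP address and interface name before you add them to /etc/rc.tcpip. The function name route_lines is a hypothetical helper, not part of AIX or DB2, and the sample values are the ones from the example in this step.

```shell
#!/bin/sh
# Hypothetical helper: print one "route add" line for each
# "<switch IP> <interface>" pair read from standard input.
route_lines() {
    while read -r switch_ip ifname; do
        # Skip blank lines so stray whitespace does not produce a broken entry.
        [ -n "$switch_ip" ] || continue
        printf 'route add %s %s -if %s\n' "$switch_ip" "$switch_ip" "$ifname"
    done
}

# Sample values taken from the example in this step.
route_lines <<'EOF'
10.1.1.24 en1
10.1.2.23 en2
EOF
```

Review the printed lines before you append them to /etc/rc.tcpip as root. The sketch only prints text and does not change the routing table.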
- Verify that your system has the file sets
installed that are required to use uDAPL. To verify that uDAPL is
installed correctly, run the following command, which is shown with
sample output:
lslpp -l bos.mp64 udapl.rte ofed.core.rte devices.ethernet.mlx.diag devices.ethernet.mlx.rte
  Fileset                      Level  State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  bos.mp64                  7.1.3.30  APPLIED    Base Operating System 64-bit
                                                 Multiprocessor Runtime
  devices.ethernet.mlx.diag
                            7.1.3.30  APPLIED    RoCE Converged Network Adapter
                                                 Diagnostics
  devices.ethernet.mlx.rte  7.1.3.30  APPLIED    RoCE Converged Network Adapter
  ofed.core.rte             7.1.3.30  APPLIED    OFED Core Runtime Environment
  udapl.rte                 7.1.3.30  APPLIED    uDAPL
Path: /etc/objrepos
  bos.mp64                  7.1.3.30  APPLIED    Base Operating System 64-bit
                                                 Multiprocessor Runtime
  devices.ethernet.mlx.rte  7.1.3.30  APPLIED    RoCE Converged Network Adapter
  ofed.core.rte             7.1.3.0   COMMITTED  OFED Core Runtime Environment
  udapl.rte                 7.1.3.30  APPLIED    uDAPL
The command output varies depending on the version, the technology
level, and the service pack level.
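The fileset check can be scripted. The following sketch reads lslpp -l style output on standard input and lists any required fileset that does not appear in it. The function name missing_filesets and the REQUIRED variable are illustrative assumptions. The parsing assumes that each fileset name is the first field on its line, as in the sample output.

```shell
#!/bin/sh
# Filesets that this step lists as required for uDAPL over RoCE.
REQUIRED="bos.mp64 udapl.rte ofed.core.rte devices.ethernet.mlx.diag devices.ethernet.mlx.rte"

# Hypothetical helper: read "lslpp -l" style output on standard input and
# print every required fileset whose name never appears as the first field.
missing_filesets() {
    awk -v req="$REQUIRED" '
        { seen[$1] = 1 }                      # remember the first field of every line
        END {
            n = split(req, names, " ")
            for (i = 1; i <= n; i++)
                if (!(names[i] in seen)) print names[i]
        }'
}

# Demonstration with abbreviated sample output in which three filesets are absent.
missing_filesets <<'EOF'
  bos.mp64                  7.1.3.30  APPLIED    Base Operating System 64-bit
  ofed.core.rte             7.1.3.30  APPLIED    OFED Core Runtime Environment
EOF
```

On a host, the helper would typically be fed from the real command, for example lslpp -l $REQUIRED 2>/dev/null | missing_filesets. Empty output means that every listed fileset was found. The sketch does not check fileset levels or states.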
- Verify that the RoCE adapters exist (ent1 and ent2 in this example):
root@p8svt21:/> lsdev -C | grep "RoCE Converged Network Adapter"
ent1 Available 00-00-00 RoCE Converged Network Adapter
ent2 Available 00-00-01 RoCE Converged Network Adapter
- Configure the 40GE network interfaces.
- To configure IP addresses, run the smitty inet command.
smitty inet
- Select "Change / Show Characteristics of a Network Interface".
- Select the adapter:
en1 00-00-00 Standard Ethernet Network Interface
en2 00-00-01 Standard Ethernet Network Interface
- Assign an IP address and netmask to the adapter, and change the
current state to UP:
Change / Show a Standard Ethernet Interface
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Network Interface Name en1
INTERNET ADDRESS (dotted decimal) [10.1.1.1]
Network MASK (hexadecimal or dotted decimal) [255.255.255.0]
Current STATE up +
Use Address Resolution Protocol (ARP)? yes +
BROADCAST ADDRESS (dotted decimal) []
Interface Specific Network Options
('NULL' will unset the option)
rfc1323 []
tcp_mssdflt []
tcp_nodelay []
tcp_recvspace []
tcp_sendspace []
Apply change to DATABASE only no +
- Verify the state of the network interfaces that you created:
root@p8svt21:/> lsdev -C | grep " Standard Ethernet"
en0 Available Standard Ethernet Network Interface
en1 Available 00-00-00 Standard Ethernet Network Interface
Note: In
the previous example, the en1 interface on the ent1 adapter is in the 10.1.1.0/24
subnet. To enable multiple communication adapter ports on the cluster
caching facility (CF) or member, repeat steps 4 - 6 for each communication
adapter port on each adapter. Each communication adapter port of a
host or LPAR must be on a different subnet. Repeat steps 4 - 6 on
the secondary CF such that each network interface shares the subnet
of the corresponding interface on the primary CF. Repeat steps 4 -
6 on each member. The interface number increases for each successive
network interface. For example, en1 is the first interface and the
next interface is en2.
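The rule that each communication adapter port of a host must be on a different subnet can be checked with a short script. This sketch computes the network address of each interface from its IP address and dotted decimal netmask and reports any two interfaces that share a subnet. The function name check_subnets and the input format of one "<interface> <IP address> <netmask>" line per port are assumptions made for illustration. The sketch compares network addresses only, so it assumes that the interfaces use consistent netmasks.

```shell
#!/bin/sh
# Hypothetical helper: read "<interface> <IP address> <netmask>" lines on
# standard input and report any two interfaces that share a subnet.
# Returns 0 when all subnets differ and 1 when a conflict is found.
check_subnets() {
    awk '
        # Bitwise AND of two octets, written out because POSIX awk has no and().
        function band(a, b,    r, bit) {
            a = a + 0; b = b + 0; r = 0
            for (bit = 128; bit >= 1; bit = bit / 2) {
                if (a >= bit && b >= bit) r += bit
                if (a >= bit) a -= bit
                if (b >= bit) b -= bit
            }
            return r
        }
        NF == 3 {
            split($2, ip, "."); split($3, mask, ".")
            net = band(ip[1], mask[1]) "." band(ip[2], mask[2]) "." \
                  band(ip[3], mask[3]) "." band(ip[4], mask[4])
            if (net in owner) {
                printf "CONFLICT %s and %s share subnet %s\n", owner[net], $1, net
                bad = 1
            } else {
                owner[net] = $1
            }
        }
        END { exit bad + 0 }'
}

# Demonstration with two ports of one host on different /24 subnets.
check_subnets <<'EOF' && echo "all interfaces are on different subnets"
en1 10.1.1.1 255.255.255.0
en2 10.1.2.1 255.255.255.0
EOF
```

Run the check once per host with the addresses that you assigned in smitty. The check is not meant for comparing interfaces across hosts, where corresponding interfaces are expected to share a subnet.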
- Update the /etc/hosts file on each of the hosts so that
for each host in the planned DB2 pureScale environment,
the file includes all the IP addresses of all the communication adapter
ports for all hosts in the planned environment. The /etc/hosts
file must have this format: <IP_Address> <fully_qualified_name> <short_name>.
All hosts in the cluster must have the same /etc/hosts format.
For
example, in a planned
DB2 pureScale environment
with multiple communication adapter ports on the CFs and four members,
the /etc/hosts configuration file might resemble the following file:
10.1.1.1 cf1-en1.example.com cf1-en1
10.1.2.1 cf1-en2.example.com cf1-en2
10.1.3.1 cf1-en3.example.com cf1-en3
10.1.4.1 cf1-en4.example.com cf1-en4
10.1.1.2 cf2-en1.example.com cf2-en1
10.1.2.2 cf2-en2.example.com cf2-en2
10.1.3.2 cf2-en3.example.com cf2-en3
10.1.4.2 cf2-en4.example.com cf2-en4
10.1.1.3 member1-en1.example.com member1-en1
10.1.2.3 member1-en2.example.com member1-en2
10.1.1.4 member2-en1.example.com member2-en1
10.1.2.4 member2-en2.example.com member2-en2
10.1.1.5 member3-en1.example.com member3-en1
10.1.2.5 member3-en2.example.com member3-en2
10.1.1.6 member4-en1.example.com member4-en1
10.1.2.6 member4-en2.example.com member4-en2
Note: In a four member environment that uses only one communication
adapter port for each CF and member, the file would look similar to
the previous example, but contain only the first IP address of each
of the CFs and members in the previous example.
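Because all hosts must use the same /etc/hosts format, it can help to check the interconnect entries mechanically. The following sketch flags lines that do not have exactly three fields in the order IP address, fully qualified name, short name. The function name check_hosts_format is hypothetical. The sketch assumes that a fully qualified name contains a dot and a short name does not, which is a reading of the example rather than a documented rule.

```shell
#!/bin/sh
# Hypothetical helper: read hosts-file lines on standard input and print
# every entry that is not "<IP_Address> <fully_qualified_name> <short_name>".
# Returns 0 when all entries match and 1 otherwise.
check_hosts_format() {
    awk '
        /^[ \t]*#/ { next }                   # ignore comment lines
        NF == 0    { next }                   # ignore blank lines
        {
            ok = (NF == 3 &&
                  $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ &&
                  index($2, ".") > 0 &&       # assumption: FQDN contains a dot
                  index($3, ".") == 0)        # assumption: short name has no dot
            if (!ok) { printf "line %d: %s\n", NR, $0; bad = 1 }
        }
        END { exit bad + 0 }'
}

# Demonstration with two entries from the example file.
check_hosts_format <<'EOF' && echo "format looks consistent"
10.1.1.1 cf1-en1.example.com cf1-en1
10.1.2.1 cf1-en2.example.com cf1-en2
EOF
```

Pass only the interconnect entries to the helper. Other lines in a real /etc/hosts file, such as the loopback entry, might not follow the three-field layout and would be reported.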
- Ensure that each line of the /etc/dat.conf file uses the following format:
hca<number> u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "<netname> 1 <network interface>" " "
- The <number> is an incremental number starting
at 0 for the first line.
- The <netname> is the host name of the private
network to be used for the DB2 cluster interconnect that was defined
in step 7.
- The <network interface> is the name of the Ethernet
network interface, for example en1.
In the case of a CF or member, the /etc/dat.conf would resemble
the following example:
hca0 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "cf1-en1 1 en1" " "
hca1 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "cf1-en2 1 en2" " "
hca2 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "cf1-en3 1 en3" " "
hca3 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "cf1-en4 1 en4" " "
In
this example, cf1-en1 is a netname for the pureScale cluster interconnect
in the /etc/hosts file, and en1 is the network
interface that is associated with this netname.
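The entries follow a fixed pattern, so they can be generated from a list of netname and interface pairs. This is a minimal sketch that prints lines in the format described in this step, numbering the hca entries from 0. The function name dat_conf_lines and the two-column input format are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "<netname> <network interface>" pairs on standard
# input and print one /etc/dat.conf line per pair, numbered from hca0.
dat_conf_lines() {
    awk '
        NF == 2 {
            printf "hca%d u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 \"%s 1 %s\" \" \"\n", n++, $1, $2
        }'
}

# Sample pairs for the first two ports of the CF in the example.
dat_conf_lines <<'EOF'
cf1-en1 en1
cf1-en2 en2
EOF
```

Compare the generated lines with the existing /etc/dat.conf before you replace anything. The sketch writes to standard output only.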
- Verify the state of the ports and the connectivity of the network interfaces.
Use the entstat -d <device> command to check the physical port
state. Verify that the links are up. This check applies only to the
ports and interfaces that were previously identified in /etc/dat.conf:
entstat -d ent1 | grep -i "port link"
Physical Port Link Status: Up
Logical Port Link Status: Up
entstat -d ent2 | grep -i "port link"
Physical Port Link Status: Up
Logical Port Link Status: Up
Ping from each new Ethernet
interface to every other new interface in the cluster that is in
the same IP subnet to make sure that they are reachable. For example:
ping -I <source IP> <destination IP>
Ping
the gateways on the switches to ensure that the switches are reachable.
For example:
ping <switch IP>
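The link check can be automated by parsing the entstat output. The following sketch succeeds only when every "Port Link Status" line in its input reports Up. The function name links_up is hypothetical, and the parsing assumes the line format shown in the sample output of this step.

```shell
#!/bin/sh
# Hypothetical helper: read "entstat -d <device>" style output on standard
# input. Returns 0 only if at least one "Port Link Status" line is present
# and every such line reports "Up".
links_up() {
    awk -F': *' '
        /Port Link Status/ { total++; if ($2 ~ /^Up/) up++ }
        END { exit !(total > 0 && up == total) }'
}

# Demonstration with the sample output from this step.
if links_up <<'EOF'
Physical Port Link Status: Up
Logical Port Link Status: Up
EOF
then
    echo "links up"
else
    echo "link problem"
fi
```

On a host, the helper would typically be fed from the real command, for example entstat -d ent1 | links_up, once for each device that is identified in /etc/dat.conf.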