A remote direct memory access (RDMA) over Converged Ethernet (RoCE) network with IP support is characterized by the presence of a network interface on hosts that can transmit and receive both TCP/IP and RDMA data at the same time. To configure the network settings, you must check for the required uDAPL software, create a network interface, associate interconnect net names with IP addresses, and add required entries to the Direct Access Transport (DAT) configuration file.
Before you begin
The steps in this topic configure the network settings of hosts on a RoCE network that have IP support. This topic is specific to configurations with these adapters: EC3A, EC3B, EC2M, EC2N, EC37, EC38, EC3L, EC3M, EC2S, and EC2R. If you are configuring the network settings of hosts on a RoCE network without IP support, see the topic "Configuring the network settings of hosts in a Db2 pureScale environment on a RoCE network without IP support (AIX)". Ensure that you complete the following tasks:
About this task
You must perform these steps on each host, or LPAR, that you want to participate in the Db2 pureScale instance.
Cluster caching facilities (CFs) and members support multiple communication adapter ports to help Db2 pureScale environments scale and to help with high availability. Each CF or member requires only one communication adapter port, but using more adapter ports is recommended to increase bandwidth, add redundancy, and allow the use of multiple switches. This topic guides you through installing and setting up User Direct Access Programming Library (uDAPL) and 40 Gigabit Ethernet on AIX® hosts, and through configuring IP addresses.
Restrictions
- Administrative access is required on all Db2 member and CF hosts.
Procedure
- Log in as root.
- Ensure that all AIX fixes that are listed in the installation prerequisites are installed at this time.
- Verify that your system has the file sets installed that are required to use uDAPL. Run the following command, which is shown with sample output:
$ lslpp -l bos.mp64 udapl.rte ofed.core.rte devices.ethernet.mlx.diag devices.ethernet.mlx.rte devices.ethernet.mlxc.rte
  Fileset                    Level     State       Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  bos.mp64                   7.2.2.18  APPLIED     Base Operating System 64-bit
                                                   Multiprocessor Runtime
  devices.ethernet.mlx.diag  7.2.0.0   COMMITTED   RoCE Converged Network Adapter
                                                   Diagnostics
  devices.ethernet.mlx.rte   7.2.2.16  APPLIED     RoCE Converged Network Adapter
  devices.ethernet.mlxc.rte
                             7.2.2.16  APPLIED     MLXC RoCE Adapter Software
                                       EFIXLOCKED
  ofed.core.rte              7.2.2.15  COMMITTED   OFED Core Runtime Environment
  udapl.rte                  7.2.2.0   APPLIED     uDAPL
Path: /etc/objrepos
  bos.mp64                   7.2.2.18  APPLIED     Base Operating System 64-bit
                                                   Multiprocessor Runtime
  devices.ethernet.mlx.rte   7.2.2.16  APPLIED     RoCE Converged Network Adapter
  devices.ethernet.mlxc.rte
                             7.2.2.16  APPLIED     MLXC RoCE Adapter Software
                                       EFIXLOCKED
  ofed.core.rte              7.2.2.15  COMMITTED   OFED Core Runtime Environment
  udapl.rte                  7.2.0.0   COMMITTED   uDAPL
The command output varies depending on the version, the technology level, and the service pack
level.
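The fileset check in this step can also be scripted. The following is a minimal sketch, assuming a POSIX shell with awk; the variable REQUIRED and the function missing_filesets are hypothetical names, not part of AIX or Db2. The function reads lslpp -l style output on standard input instead of calling lslpp itself, so it can be tried on saved output.

```shell
#!/bin/sh
# Required filesets, copied from the lslpp command in this step.
REQUIRED="bos.mp64 udapl.rte ofed.core.rte devices.ethernet.mlx.diag devices.ethernet.mlx.rte devices.ethernet.mlxc.rte"

# missing_filesets (hypothetical helper): reads "lslpp -l" style output on
# stdin and prints each required fileset that never appears as the first
# field of any line.
missing_filesets() {
    awk -v req="$REQUIRED" '
        BEGIN { n = split(req, want, " ") }
        { seen[$1] = 1 }
        END {
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }'
}

# Possible use on a host (prints nothing when every fileset is listed):
#   lslpp -l $REQUIRED | missing_filesets
```

An empty result suggests that every required fileset is listed. The sketch does not compare levels or states, which still need to be checked against the installation prerequisites.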
- Verify that the RoCE adapters (for example, ent1 and ent2) exist:
root@p8svt21:/> lsdev -C | grep "RoCE Converged Network Adapter"
ent1 Available 00-00-00 RoCE Converged Network Adapter
ent2 Available 00-00-01 RoCE Converged Network Adapter
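The adapter names can be pulled out of this listing for use in later steps. The following is a minimal sketch, assuming a POSIX shell with awk and sed; roce_adapters and roce_interface are hypothetical helper names, and the entN to enN mapping follows the naming used in the examples of this topic.

```shell
#!/bin/sh
# roce_adapters (hypothetical helper): reads "lsdev -C" style output on stdin
# and prints the name of each RoCE adapter whose state is Available.
roce_adapters() {
    awk '/RoCE Converged Network Adapter/ && $2 == "Available" { print $1 }'
}

# roce_interface (hypothetical helper): maps an adapter name such as ent1 to
# the interface name en1, following the naming in this topic's examples.
roce_interface() {
    printf '%s\n' "$1" | sed 's/^ent/en/'
}

# Possible use on a host:
#   lsdev -C | roce_adapters
```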
- Starting from V11.1.4.4, this step is no longer required because the adapter port liveliness test has been enhanced and automated. Some restrictions apply; refer to technote#0733765 for the restrictions.
Modify the /etc/rc.tcpip file and add these routes:
route add <switch1 IP> <switch1 IP> -if <interface 1>
route add <switch2 IP> <switch2 IP> -if <interface 2>
For example,
route add 10.1.1.24 10.1.1.24 -if en1
route add 10.1.2.23 10.1.2.23 -if en2
Here, 10.1.1.24 is assigned to switch 1, and en1 maps to the adapter port whose cable is plugged into that switch. 10.1.2.23 is assigned to switch 2, and en2 maps to the adapter port whose cable is plugged into the second switch.
Also run the same route add commands on the command line so that they take effect on the running host.
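When a host has several adapter ports, the route lines can be generated rather than typed. The following is a minimal sketch, assuming a POSIX shell; route_lines is a hypothetical helper, and the "switch_ip:interface" argument form is an assumption made for this illustration. It only prints the lines and does not modify /etc/rc.tcpip or the routing table.

```shell
#!/bin/sh
# route_lines (hypothetical helper): for each "switch_ip:interface" argument,
# prints a "route add" line in the form shown in this step.
route_lines() {
    for pair in "$@"; do
        ip=${pair%%:*}       # text before the first colon
        ifname=${pair#*:}    # text after the first colon
        printf 'route add %s %s -if %s\n' "$ip" "$ip" "$ifname"
    done
}

# Possible use, with the addresses from the example in this step:
#   route_lines 10.1.1.24:en1 10.1.2.23:en2
```

The printed lines can then be reviewed before they are added to /etc/rc.tcpip and run on the host.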
- Configure the RoCE network interfaces.
- To configure IP addresses, run the smitty inet command.
smitty inet
- Select "Change / Show Characteristics of a Network Interface".
- Select the adapter:
en1 00-00-00 Standard Ethernet Network Interface
en2 00-00-01 Standard Ethernet Network Interface
- Assign an IP address and netmask to the adapter, and change the current state to up:
Change / Show a Standard Ethernet Interface

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                  [Entry Fields]
  Network Interface Name                           en1
  INTERNET ADDRESS (dotted decimal)               [10.1.1.1]
  Network MASK (hexadecimal or dotted decimal)    [255.255.255.0]
  Current STATE                                    up              +
  Use Address Resolution Protocol (ARP)?           yes             +
  BROADCAST ADDRESS (dotted decimal)              []
  Interface Specific Network Options
    ('NULL' will unset the option)
    rfc1323                                       []
    tcp_mssdflt                                   []
    tcp_nodelay                                   []
    tcp_recvspace                                 []
    tcp_sendspace                                 []
  Apply change to DATABASE only                    no              +
- Verify the state of the network interfaces that you created:
root@p8svt21:/> lsdev -C | grep " Standard Ethernet"
en0 Available Standard Ethernet Network Interface
en1 Available 00-00-00 Standard Ethernet Network Interface
Note: In the previous example, the en1 interface on the ent1 adapter is in the 10.1.1.0/24 subnet. To enable multiple communication adapter ports on the cluster caching facility (CF) or member, repeat steps 4 - 6 for each communication adapter port on each adapter. Each communication adapter port of a host or LPAR must be on a different subnet. Repeat steps 4 - 6 on the secondary CF such that each network interface shares the subnet of the corresponding interface on the primary CF. Repeat steps 4 - 6 on each member. The interface number increases with each successive network interface. For example, en1 is the first interface and the next interface is en2.
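The rule that each communication adapter port of a host must be on a different subnet can be checked before the addresses are assigned. The following is a minimal sketch, assuming a POSIX shell and dotted-decimal netmasks; subnet_of and duplicate_subnets are hypothetical helpers, and the "address netmask" input format is an assumption made for this illustration.

```shell
#!/bin/sh
# subnet_of (hypothetical helper): prints the network address for an IPv4
# address and a dotted-decimal netmask by ANDing them octet by octet.
subnet_of() {
    (
        IFS=.
        set -- $1 $2    # eight octets: four of the address, four of the mask
        echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
    )
}

# duplicate_subnets (hypothetical helper): reads one "address netmask" pair
# per line for a single host and prints every subnet that occurs more than once.
duplicate_subnets() {
    while read -r addr mask; do
        [ -n "$addr" ] || continue    # skip blank lines
        subnet_of "$addr" "$mask"
    done | sort | uniq -d
}

# Possible use for one host with two ports:
#   printf '10.1.1.1 255.255.255.0\n10.1.2.1 255.255.255.0\n' | duplicate_subnets
```

An empty result means that no two ports of the host share a subnet. The same subnet_of helper can be used to confirm that an interface on the secondary CF is in the same subnet as the corresponding interface on the primary CF.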
- Update the /etc/hosts file on each host in the planned Db2 pureScale environment so that the file includes the IP addresses of all the communication adapter ports of all hosts in the planned environment.
The /etc/hosts file must have this format: <IP_Address> <fully_qualified_name> <short_name>. All hosts in the cluster must have the same /etc/hosts format.
For example, in a planned Db2 pureScale environment with multiple communication adapter ports on the CFs and four members, the /etc/hosts configuration file might resemble the following file:
10.1.1.1 cf1-en1.example.com cf1-en1
10.1.2.1 cf1-en2.example.com cf1-en2
10.1.3.1 cf1-en3.example.com cf1-en3
10.1.4.1 cf1-en4.example.com cf1-en4
10.1.1.2 cf2-en1.example.com cf2-en1
10.1.2.2 cf2-en2.example.com cf2-en2
10.1.3.2 cf2-en3.example.com cf2-en3
10.1.4.2 cf2-en4.example.com cf2-en4
10.1.1.3 member1-en1.example.com member1-en1
10.1.2.3 member1-en2.example.com member1-en2
10.1.1.4 member2-en1.example.com member2-en1
10.1.2.4 member2-en2.example.com member2-en2
10.1.1.5 member3-en1.example.com member3-en1
10.1.2.5 member3-en2.example.com member3-en2
10.1.1.6 member4-en1.example.com member4-en1
10.1.2.6 member4-en2.example.com member4-en2
Note: In a four-member environment that uses only one communication adapter port for each CF and member, the file would look similar to the previous example, but would contain only the first IP address of each of the CFs and members in the previous example.
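Because every host must use the same three-field format, a quick format check on the interconnect entries can catch a missing short name. The following is a minimal sketch, assuming a POSIX shell with awk; check_hosts_format is a hypothetical helper. It checks only the field count and that the first field looks like a dotted-decimal address, and it does not verify that the names resolve.

```shell
#!/bin/sh
# check_hosts_format (hypothetical helper): reads hosts-file lines on stdin,
# skips comments and blank lines, and prints every remaining line that does
# not consist of a dotted-decimal address followed by exactly two names.
check_hosts_format() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }
        NF != 3 || $1 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            print "line " NR ": " $0
        }'
}

# Possible use on the interconnect entries of a host:
#   grep -e '-en' /etc/hosts | check_hosts_format
```

No output means that every checked line has the <IP_Address> <fully_qualified_name> <short_name> shape.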
- Ensure that the /etc/dat.conf file has the following format:
<interface adapter name> u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "<netname> 1 <network interface>" " "
- The <interface adapter name> string cannot be more than 19 characters long.
- The <netname> is the host name of the private network to be used for the Db2 cluster interconnect that was defined in step 7.
- The <network interface> name is the Ethernet adapter name.
In the case of a CF or member, the /etc/dat.conf would resemble the following example:
hca0 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "cf1-en1 1 en1" " "
hca1 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "cf1-en2 1 en2" " "
hca2 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "cf1-en3 1 en3" " "
hca3 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "cf1-en4 1 en4" " "
In this example, cf1-en1 is a netname for the pureScale cluster interconnect in the /etc/hosts file, and en1 is the network interface that is associated with this netname.
Note: The /etc/dat.conf file must contain only entries for the adapters that are in the local host. The sample /etc/dat.conf file that is installed by default typically contains irrelevant entries. To avoid unnecessary processing of the file, make the following changes:
- Move all the Db2 pureScale cluster-related adapter entries to the top of the file.
- Comment out the irrelevant entries or remove them from the file.
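The /etc/dat.conf entries follow a fixed pattern, so they can be generated from the adapter name, netname, and network interface. The following is a minimal sketch, assuming a POSIX shell; dat_conf_line is a hypothetical helper that only prints an entry in the format shown in this step and enforces the 19-character limit on the interface adapter name. It does not write to /etc/dat.conf.

```shell
#!/bin/sh
# dat_conf_line (hypothetical helper): prints one /etc/dat.conf entry.
# Arguments: <interface adapter name> <netname> <network interface>.
# Returns 1 if the interface adapter name is longer than 19 characters.
dat_conf_line() {
    if [ "${#1}" -gt 19 ]; then
        echo "adapter name '$1' is longer than 19 characters" >&2
        return 1
    fi
    printf '%s u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2_ofed.a(shr_64.o) IBM.2.0 "%s 1 %s" " "\n' \
        "$1" "$2" "$3"
}

# Possible use, reproducing the first two entries of the example:
#   dat_conf_line hca0 cf1-en1 en1
#   dat_conf_line hca1 cf1-en2 en2
```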
- Verify the state of the ports and the connectivity of the network interfaces.
Use the entstat -d <device> command to check the physical port state, and verify that the links are up. This check applies only to the ports and interfaces that were previously identified in /etc/dat.conf:
entstat -d ent1 | grep -i "port link"
Physical Port Link Status: Up
Logical Port Link Status: Up
entstat -d ent2 | grep -i "port link"
Physical Port Link Status: Up
Logical Port Link Status: Up
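The link check can be turned into a pass or fail test for each adapter. The following is a minimal sketch, assuming a POSIX shell with awk and output lines in the form shown above; links_up is a hypothetical helper that reads entstat -d style output on standard input.

```shell
#!/bin/sh
# links_up (hypothetical helper): reads "entstat -d <device>" style output on
# stdin and succeeds only if both the physical and the logical port link
# status lines are present and report Up.
links_up() {
    awk '
        /Physical Port Link Status:/ { phys = $NF }
        /Logical Port Link Status:/  { logi = $NF }
        END { exit !(phys == "Up" && logi == "Up") }'
}

# Possible use on a host:
#   entstat -d ent1 | links_up || echo "ent1: link is not up"
```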
Ping from each new Ethernet interface to every other new interface in the cluster that is in the same IP subnet to make sure that they are reachable. For example,
ping -S <source IP> <destination IP>
Ping the gateways on the switches to
ensure that the switches are reachable. For example,
ping <switch IP>
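With several hosts on a subnet, the list of ping commands grows quickly. The following is a minimal sketch, assuming a POSIX shell; ping_pairs is a hypothetical helper that prints, but does not run, one ping -S command for every ordered pair of addresses that it is given. It assumes that the caller passes only addresses that belong to the same IP subnet.

```shell
#!/bin/sh
# ping_pairs (hypothetical helper): given the addresses of all interfaces in
# one IP subnet, prints a "ping -S <source IP> <destination IP>" line for
# every ordered pair of distinct addresses.
ping_pairs() {
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] || printf 'ping -S %s %s\n' "$src" "$dst"
        done
    done
}

# Possible use for the 10.1.1.0/24 addresses of the two CFs in the example:
#   ping_pairs 10.1.1.1 10.1.1.2
```

Each printed command must be run on the host that owns its source address, because the source address has to be configured on the local interface.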