Configuring the network settings of hosts in a Db2 pureScale environment on an RoCE network without IP support (AIX)
A remote direct memory access (RDMA) over Converged Ethernet (RoCE) network without IP support is characterized by an RoCE special device file (and the absence of a network interface) on hosts which can only transmit and receive RDMA data. To configure the network settings, you must install required uDAPL software and configure ICM, associate interconnect netnames with pseudo IP addresses, and add required entries to the Direct Access Transport (DAT) configuration file.
Before you begin
The steps in this topic are to configure the network settings of hosts on an RoCE network that does not have network interface card IP support. This topic is specific to configurations with these adapters: EC26, EC27, EC28, EC29, EC30. If you are configuring the network settings of hosts on an RoCE network with IP support, see topic Configuring network settings on an RoCE network with IP support.
- Ensure you have created your Db2 pureScale Feature installation plan. Your installation plan helps ensure that your system meets the prerequisites and that you have performed the preinstallation tasks.
- Read about supported network topologies for Db2 pureScale environments in Network topology configuration support for Db2 pureScale environments.
- Ensure that your setup conforms to a supported Db2 pureScale environment listed in the Installation prerequisites for Db2 pureScale Feature (AIX).
- Ensure that the uDAPL level on each host matches the level that the installation prerequisites specify for your AIX technology level (TL).
About this task
You must perform these steps on each host, or LPAR, you want to participate in the Db2 pureScale instance.
Cluster caching facilities (CFs) and members support multiple communication adapter ports to help Db2 pureScale environments scale and to help with high availability.
One communication adapter port for each CF or member is all that is required, though it is recommended to use more adapter ports to increase bandwidth, add redundancy, and allow the use of multiple switches. This topic guides you through the installation and setup of User Direct Access Programming Library (uDAPL) on AIX® hosts and configuring IP addresses.
- Log in as root.
- Ensure that any AIX fixes listed in the installation prerequisites are installed at this time.
- If the file /etc/dat.conf was previously set up with the desired values, save the existing copy of dat.conf.
- Verify that your system has the correct uDAPL and RoCE network file sets.
To verify uDAPL is installed correctly, run the following command, shown with sample output:
The command output varies depending on version, technology level, and service pack level.
$ lslpp -l bos.mp64 devices.chrp.IBM.lhca.rte devices.common.IBM.ib.rte devices.pciex.b3154a63.rte devices.pciex.b315506714101604.rte udapl.rte
  Fileset                             Level           State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  bos.mp64                            220.127.116.11  APPLIED    Base Operating System 64-bit Multiprocessor Runtime
  devices.chrp.IBM.lhca.rte           18.104.22.168   APPLIED    Infiniband Logical HCA Runtime Environment
  devices.common.IBM.ib.rte           22.214.171.124  APPLIED    Infiniband Common Runtime Environment
  devices.pciex.b3154a63.rte          126.96.36.199   APPLIED    4X PCI-E DDR Infiniband Device Driver
  devices.pciex.b315506714101604.rte  188.8.131.52    COMMITTED  RoCE Host Bus Adapter (b315506714101604)
  udapl.rte                           184.108.40.206  APPLIED    uDAPL
Path: /etc/objrepos
  bos.mp64                            220.127.116.11  APPLIED    Base Operating System 64-bit Multiprocessor Runtime
  devices.chrp.IBM.lhca.rte           18.104.22.168   COMMITTED  Infiniband Logical HCA Runtime Environment
  devices.common.IBM.ib.rte           22.214.171.124  APPLIED    Infiniband Common Runtime Environment
  devices.pciex.b3154a63.rte          126.96.36.199   APPLIED    4X PCI-E DDR Infiniband Device Driver
  devices.pciex.b315506714101604.rte  188.8.131.52    COMMITTED  RoCE Host Bus Adapter (b315506714101604)
  udapl.rte                           184.108.40.206  APPLIED    uDAPL
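The fileset check above can be automated when many hosts or LPARs need to be verified. The following is a minimal sketch, not an IBM-provided tool: it assumes that the text of `lslpp -l` has already been captured (for example, to a file), that each fileset row has the name, level, and state as its first three whitespace-separated fields, and that the list of required filesets is the one shown in the sample command. The function name `missing_filesets` is hypothetical.

```python
# Filesets named in the sample lslpp command above.
REQUIRED_FILESETS = [
    "bos.mp64",
    "devices.chrp.IBM.lhca.rte",
    "devices.common.IBM.ib.rte",
    "devices.pciex.b3154a63.rte",
    "devices.pciex.b315506714101604.rte",
    "udapl.rte",
]


def missing_filesets(lslpp_output, required=REQUIRED_FILESETS):
    """Return the required filesets that never appear as APPLIED or COMMITTED."""
    installed = set()
    for line in lslpp_output.splitlines():
        fields = line.split()
        # Assumed row layout: <fileset> <level> <state> <description...>
        if len(fields) >= 3 and fields[2] in ("APPLIED", "COMMITTED"):
            installed.add(fields[0])
    return [name for name in required if name not in installed]
```

An empty list means that every required fileset was reported in the captured output. The sketch does not compare levels, because the required level depends on the technology level and service pack in the installation prerequisites.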
- If any of the filesets in the previous step were newly installed or updated, reboot the system.
- Configure the RoCE subsystem and set IP addresses:
- Configure the RoCE network subsystem in this substep only if an RoCE network was never set up before on the host. Run the smitty icm command:
- Select Add an InfiniBand Communication Manager
- Press Enter and wait for the command to complete
- Exit by pressing Esc+0
Infiniband Communication Manager Device Name        icm
Minimum Request Retries
Maximum Request Retries
Minimum Response Time (msec)
Maximum Response Time (msec)
Maximum Number of HCA's
Maximum Number of Users
Maximum Number of Work Requests
Maximum Number of Service ID's
Maximum Number of Connections
Maximum Number of Records Per Request
Maximum Queued Exception Notifications Per User
Number of MAD buffers per HCA
- Reboot the systems by running the following command on
- You must associate each interconnect netname for a member or CF that will be selected during installation with an IPv4 pseudo IP address in /etc/hosts. Each interconnect netname is associated with an RoCE communication adapter port through the Direct Access Transport (DAT) configuration file in the next step. The pseudo IP address is used only for resolving the netname and for uDAPL purposes; it is not pingable. Each pseudo IP address must be unique.
Update the /etc/hosts file on each of the hosts so that for each host in the planned Db2 pureScale environment, the file includes all the pseudo IP addresses of interconnect netnames in the planned environment. The /etc/hosts file must have this format: <IP_Address> <fully_qualified_name> <short_name>. All hosts in the cluster must have the same /etc/hosts format. For example, in a planned Db2 pureScale environment with multiple communication adapter ports on the CFs and four members, the /etc/hosts configuration file might resemble the following file:
10.222.1.1 cf1-en1.example.com cf1-en1
10.222.2.1 cf1-en2.example.com cf1-en2
10.222.3.1 cf1-en3.example.com cf1-en3
10.222.4.1 cf1-en4.example.com cf1-en4
10.222.1.2 cf2-en1.example.com cf2-en1
10.222.2.2 cf2-en2.example.com cf2-en2
10.222.3.2 cf2-en3.example.com cf2-en3
10.222.4.2 cf2-en4.example.com cf2-en4
10.222.1.101 member1-en1.example.com member1-en1
10.222.2.101 member1-en2.example.com member1-en2
10.222.1.102 member2-en1.example.com member2-en1
10.222.2.102 member2-en2.example.com member2-en2
10.222.1.103 member3-en1.example.com member3-en1
10.222.2.103 member3-en2.example.com member3-en2
10.222.1.104 member4-en1.example.com member4-en1
10.222.2.104 member4-en2.example.com member4-en2
Note: The pseudo IP addresses of each netname for the CF and member must have a different third octet. All pseudo IP addresses of members must have the same third octet, which is the same as the third octet for the pseudo IP address associated with the first communication adapter port of each of the CFs and members. In the previous example, the third octet is 1.
None of the host names in the example above are associated with regular Ethernet adapters. These host names are set up only for resolving the netnames and for uDAPL purposes. They are not pingable.
In a four member environment that uses only one communication adapter port for each CF and member, the file would look similar to the previous example, but contain only the first pseudo IP address of each of the CFs and members in the previous example. Here is an example of this:
10.222.1.1 cf1-en1.example.com cf1-en1
10.222.1.2 cf2-en1.example.com cf2-en1
10.222.1.101 member1-en1.example.com member1-en1
10.222.1.102 member2-en1.example.com member2-en1
10.222.1.103 member3-en1.example.com member3-en1
10.222.1.104 member4-en1.example.com member4-en1
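Because the same interconnect entries must appear on every host, a small check run before distributing them can catch formatting slips. The sketch below is one possible approach, not part of Db2: it validates only the interconnect lines (not a complete /etc/hosts file, which usually contains other entries), checks the three-field format and the uniqueness of the pseudo IP addresses described above, and additionally assumes that the short name equals the first label of the fully qualified name, as it does in the examples. The function name `check_hosts_entries` is hypothetical.

```python
import ipaddress
from collections import Counter


def check_hosts_entries(text):
    """Return a list of problems found in interconnect /etc/hosts lines."""
    problems = []
    addresses = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3:
            problems.append(f"line {lineno}: expected 3 fields, found {len(fields)}")
            continue
        address, fqdn, short = fields
        try:
            ipaddress.IPv4Address(address)
        except ipaddress.AddressValueError:
            problems.append(f"line {lineno}: {address} is not an IPv4 address")
            continue
        # Assumption: the short name is the first label of the qualified name.
        if fqdn.split(".", 1)[0] != short:
            problems.append(f"line {lineno}: short name {short} does not match {fqdn}")
        addresses.append(address)
    for address, count in Counter(addresses).items():
        if count > 1:
            problems.append(f"{address} is used {count} times; each pseudo IP address must be unique")
    return problems
```

An empty list means that no problems were found. A missing dot in a qualified name, such as `member3-en1example.com`, is reported as a short-name mismatch.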
- If the Direct Access Transport (DAT) configuration file /etc/dat.conf was previously saved, verify that the contents are still equivalent. If they are not, replace the current dat.conf with the saved copy. If the dat.conf file was not previously set up, edit the dat.conf file on each host to add a line that associates each interconnect netname with a uDAPL device and an RoCE adapter port.
The /etc/dat.conf file must only contain entries for the adapters that are in the local host. The sample /etc/dat.conf file that is installed by default typically contains irrelevant entries. To avoid unnecessary processing of the file, make the following changes:
- Move all the Db2 pureScale cluster-related adapter entries to the top of the file.
- Comment out the irrelevant entries or remove them from the file.
<interface adapter name> u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/roce0 1 hostname-en1" " "
- The <interface adapter name> string cannot be more than 19 characters long.
- The name within quotes ("/dev/roce0 1 hostname-en1") is the platform-specific string. This string consists of:
  - Adapter special file ( /dev/roce0 )
  - Port number ( 1 or 2 )
  - The interconnect netname for the member or CF that will run on this host.
The following format is also supported:
hca0 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/roce0 1 10.10.11.131" " "
Where 10.10.11.131 is the pseudo IP address corresponding to the netname.
Note: If you are receiving a communication error between the member and CF, it is likely that the system attempted to communicate with an adapter interface that is not set up correctly in the Direct Access Transport (DAT) configuration file for the adapter port.
In the case of a CF or member that uses two communication adapters, each communication adapter having 2 ports, the /etc/dat.conf would resemble the following example:
hca0 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/roce0 1 cf1-en1" " "
hca1 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/roce0 2 cf1-en2" " "
hca2 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/roce1 1 cf1-en3" " "
hca3 u2.0 nonthreadsafe default /usr/lib/libdapl/libdapl2.a(shr_64.o) IBM.1.1 "/dev/roce1 2 cf1-en4" " "
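Since every dat.conf entry follows the same pattern, the lines can be generated rather than typed by hand. The following sketch simply fills in the format shown in this step and enforces the two stated constraints (an interface adapter name of at most 19 characters and a port number of 1 or 2). The helper name `dat_conf_line` and the naming scheme in the usage loop are illustrative assumptions; the fixed fields are copied from the examples in this topic.

```python
# Library path and provider fields copied from the dat.conf format above.
DAT_LIBRARY = "/usr/lib/libdapl/libdapl2.a(shr_64.o)"


def dat_conf_line(interface_name, device, port, netname):
    """Build one /etc/dat.conf entry for an RoCE adapter port."""
    if len(interface_name) > 19:
        raise ValueError("interface adapter name cannot be more than 19 characters")
    if port not in (1, 2):
        raise ValueError("port number must be 1 or 2")
    return (f'{interface_name} u2.0 nonthreadsafe default {DAT_LIBRARY} '
            f'IBM.1.1 "{device} {port} {netname}" " "')


# Usage: two adapters with two ports each on a host that runs cf1.
ports = [("/dev/roce0", 1), ("/dev/roce0", 2), ("/dev/roce1", 1), ("/dev/roce1", 2)]
for index, (device, port) in enumerate(ports):
    print(dat_conf_line(f"hca{index}", device, port, f"cf1-en{index + 1}"))
```

The usage loop prints the same four entries as the two-adapter example. Passing a pseudo IP address instead of a netname as the last argument produces the alternative supported format.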
- Verify the RoCE network subsystem. Verify that the RoCE network components are in the Available state. For example, the system output of the following command, run on a host, verifies that all devices are available:
# lsdev -C | grep -E "Infiniband|PCIE RDMA"
icm Available Infiniband Communication Manager
roce0 Available 02-00 PCIE RDMA over Converged Ethernet RoCE Adapter (b315506714101604)
To check the state, use the ibstat -v command. Verify that the ports are active and the links are up. This check applies only to the port and interface that were previously identified in /etc/dat.conf (by default port 1 on roce0):
-------------------------------------------------------------------------------
ETHERNET PORT 1 INFORMATION (roce0)
-------------------------------------------------------------------------------
Link State: UP
Link Speed: 10G XFI
Link MTU: 9600
Hardware Address: 00:02:c9:4b:97:b8
GIDS (up to 3 GIDs):
GID0 :00:00:00:00:00:00:00:00:00:00:00:02:c9:4b:97:b8
GID1 :00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
GID2 :00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
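The link-state check can also be scripted against captured `ibstat -v` output. The sketch below is a rough illustration that relies only on the two patterns visible in the sample output, the `ETHERNET PORT n INFORMATION (device)` header and the `Link State:` line; output from other adapter types or AIX levels may differ, so treat both patterns as assumptions. The function name `port_link_states` is hypothetical.

```python
import re


def port_link_states(ibstat_output):
    """Map (device, port number) to the link state reported for that port."""
    states = {}
    current = None
    for line in ibstat_output.splitlines():
        # Assumed header format, taken from the sample output in this topic.
        header = re.search(r"ETHERNET PORT (\d+) INFORMATION \((\w+)\)", line)
        if header:
            current = (header.group(2), int(header.group(1)))
            continue
        state = re.search(r"Link State:\s*(\S+)", line)
        if state and current is not None:
            states[current] = state.group(1)
    return states
```

For the sample output, looking up `("roce0", 1)` in the returned mapping gives `UP`. Only the ports that are referenced in /etc/dat.conf need to be checked.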
- Ensure Global Pause (IEEE 802.3x) is enabled on the switches connected to the adapters. For details see: Switch configuration on an RoCE network (AIX).