Configuring the network settings of hosts for a Db2 pureScale environment on an InfiniBand network (Linux)
As described in the network topology tables and diagrams, configure the communication adapter ports in pairs, so that the devices with the same device ID (for example, ib0) are on the same subnet.
Before you begin
- Ensure you have created your Db2 pureScale Feature installation plan. Your installation plan helps ensure that your system meets the prerequisites and that you have performed the preinstallation tasks.
- Ensure you have read about supported network topologies for Db2 pureScale environments in Network topology configuration support for Db2 pureScale environments.
Administrative access is required on all Db2 member and CF hosts.
About this task
Procedure
- Log in as root.
- Configure OpenFabrics Enterprise Distribution (OFED) software.
- OFED configuration details for SLES 11 systems. SLES 12 and later do not require the following steps:
- For SLES:
- Configure the SLES online updates to include the maintenance repository for your version of SLES.
- Install the following packages from the maintenance repository:
compat-dapl compat-dapl-32bit dapl-32bit dapl-doc dapl ibutils-32bit ibutils infiniband-diags libcxgb3-rdmav2-32bit libcxgb3-rdmav2 libibcm libibcm-32bit libibcommon1 libibcommon1-32bit libibmad5 libibmad5-32bit libibumad3 libibumad3-32bit libibverbs libibverbs-32bit libipathverbs libipathverbs-32bit libmlx4-rdmav2 libmlx4-rdmav2-32bit libmthca-rdmav2 libmthca-rdmav2-32bit libnes-rdmav2 librdmacm librdmacm-32bit libsdp-32bit libsdp mpi-selector mstflint ofed ofed-doc ofed-kmp-default opensm-32bit opensm ibvexdmtools qlvnictools sdpnetstat srptools
- Verify that each of the OFED packages is installed by using the rpm -qa command.
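The package check above can be scripted. The following is a minimal sketch, assuming the output of rpm -qa has been captured as text; the function names rpm_name and missing_packages are hypothetical helpers, not part of any Db2 or OFED tooling, and the version strings in the usage comment are illustrative only.

```python
def rpm_name(entry):
    """Strip version and release from an 'rpm -qa' entry.

    Entries have the form name-version-release[.arch]; neither the version
    nor the release contains a hyphen, so the name is everything before
    the last two hyphens.
    """
    return entry.rsplit("-", 2)[0]


def missing_packages(rpm_qa_output, required):
    """Return the required package names that are absent from the rpm -qa output."""
    installed = {
        rpm_name(line.strip())
        for line in rpm_qa_output.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in installed]


# Usage (illustrative data, not taken from a real system):
# missing_packages("dapl-2.0.25-5.2.el6.x86_64\n", ["dapl", "opensm"])
```

An empty result means every name in the required list was found; any names returned still need to be installed from the maintenance repository.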
- OFED configuration details for RHEL systems. On RHEL, the required InfiniBand software is installed as a group installation of the "InfiniBand Support" package. Perform the following as root to install the package:
yum groupinstall "InfiniBand Support"
Note: For the yum command to work, local repositories must first be created, either from Red Hat Network (RHN) or from the DVD ISO images. After the repository is set up, the yum command knows where to find the target packages. Registering with RHN is the recommended mechanism for accessing the latest kernel updates and fixes, and it is recommended that you set up the repository on every RHEL system. If the repository cannot be set up with RHN, it can be set up by using the ISO images that come with the RHEL DVD media. This procedure is required only on systems that cannot be registered with RHN. The following example shows how to set up the repository by using the RHEL 5.7 ISO image.
- Copy the file RHEL5.7-20100922.1-Server-x86_64-DVD1.iso from the DVD to a temporary directory on the target system, /tmp/iso
# cd /tmp/iso
# ls -rlt
total 3354472
-rw-r--r-- 1 root root 3431618560 Jan 10 20:13 RHEL5.7-20100922.1-Server-x86_64-DVD1.iso
- Mount the ISO image so that its contents are accessible. The mount point, /mnt/iso, must already exist.
mount -o loop /tmp/iso/RHEL5.7-20100922.1-Server-x86_64-DVD1.iso /mnt/iso/
- Create a repository.
# cd repodata/
# ls -rlt
total 76180
-rw-r--r-- 1 root root  8032315 Jan 17 12:59 primary.xml.gz
-rw-r--r-- 1 root root 51522840 Jan 17 12:59 other.xml.gz
-rw-r--r-- 1 root root 18346363 Jan 17 12:59 filelists.xml.gz
-rw-r--r-- 1 root root      951 Jan 17 12:59 repomd.xml
# cd ..
# cd repodata/
- Define a local repository for the ISO image by creating the file /etc/yum.repos.d/my.repo:
# cat my.repo
[my.repo]
name=Redhat LTC
baseurl=file:///mnt/iso
gpgcheck=0
enabled=1
- The previous steps complete the creation of the local repository to point to /mnt/iso as the source.
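Before running yum, the repository definition can be sanity-checked. The following is a minimal sketch, assuming the file content has been read into a string; check_local_repo is a hypothetical helper, and it verifies only the fields shown in the example above (section name, a file:// baseurl, and the enabled flag).

```python
import configparser


def check_local_repo(repo_text, section="my.repo"):
    """Return a list of problems found in a yum .repo definition (empty if none)."""
    parser = configparser.ConfigParser()
    parser.read_string(repo_text)
    if not parser.has_section(section):
        return [f"missing section [{section}]"]

    repo = parser[section]
    problems = []
    # The local ISO repository in this procedure is addressed with a file:// URL.
    if not repo.get("baseurl", "").startswith("file://"):
        problems.append("baseurl does not point at a local file:// path")
    # yum treats a repository without an 'enabled' key as enabled.
    if repo.get("enabled", "1") != "1":
        problems.append("repository is not enabled")
    return problems
```

The check is deliberately narrow: it does not confirm that the mount point exists or that the repodata directory is present.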
- Issue the relevant yum command to perform the installation of the required packages. Sample output for a successful installation:
[root@coralxib42 ~]# yum groupinstall 'Infiniband Support'
Loaded plugins: product-id, refresh-packagekit, rhnplugin, subscription-manager
Updating Red Hat repositories.
4/4
Setting up Group Process
Resolving Dependencies
--> Running transaction check
---> Package dapl.x86_64 0:2.0.25-5.2.el6 will be installed
---> Package ibsim.x86_64 0:0.5-4.el6 will be installed
---> Package ibutils.x86_64 0:1.5.4-3.el6 will be installed
--> Processing Dependency: libosmcomp.so.3(OSMCOMP_2.3)(64bit) for package: ibutils-1.5.4-3.el6.x86_64
--> Processing Dependency: libosmvendor.so.3(OSMVENDOR_2.0)(64bit) for package: ibutils-1.5.4-3.el6.x86_64
--> Processing Dependency: libopensm.so.2(OPENSM_1.5)(64bit) for package: ibutils-1.5.4-3.el6.x86_64
--> Processing Dependency: tk for package: ibutils-1.5.4-3.el6.x86_64
--> Processing Dependency: libosmcomp.so.3()(64bit) for package: ibutils-1.5.4-3.el6.x86_64
--> Processing Dependency: libosmvendor.so.3()(64bit) for package: ibutils-1.5.4-3.el6.x86_64
--> Processing Dependency: libopensm.so.2()(64bit) for package: ibutils-1.5.4-3.el6.x86_64
--> Processing Dependency: libibdmcom.so.1()(64bit) for package: ibutils-1.5.4-3.el6.x86_64
---> Package libcxgb3.x86_64 0:1.3.0-1.el6 will be installed
---> Package libibcm.x86_64 0:1.0.5-2.el6 will be installed
---> Package libibmad.x86_64 0:1.3.4-1.el6 will be installed
---> Package libibumad.x86_64 0:1.3.4-1.el6 will be installed
---> Package libibverbs.x86_64 0:1.1.4-4.el6 will be installed
---> Package libibverbs-utils.x86_64 0:1.1.4-4.el6 will be installed
---> Package libipathverbs.x86_64 0:1.2-2.el6 will be installed
---> Package libmlx4.x86_64 0:1.0.1-8.el6 will be installed
---> Package libmthca.x86_64 0:1.0.5-7.el6 will be installed
---> Package libnes.x86_64 0:1.1.1-1.el6 will be installed
---> Package librdmacm.x86_64 0:1.0.10-2.el6 will be installed
---> Package librdmacm-utils.x86_64 0:1.0.10-2.el6 will be installed
---> Package rdma.noarch 0:1.0-9.el6 will be installed
---> Package rds-tools.x86_64 0:2.0.4-3.el6 will be installed
--> Running transaction check
---> Package ibutils-libs.x86_64 0:1.5.4-3.el6 will be installed
---> Package opensm-libs.x86_64 0:3.3.5-1.el6 will be installed
---> Package tk.x86_64 1:8.5.7-5.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved
====================================================================================
Package            Arch     Version          Repository             Size
====================================================================================
Installing:
dapl               x86_64   2.0.25-5.2.el6   rhel-x86_64-server-6   143 k
ibsim              x86_64   0.5-4.el6        rhel-x86_64-server-6    55 k
ibutils            x86_64   1.5.4-3.el6      rhel-x86_64-server-6   1.0 M
libcxgb3           x86_64   1.3.0-1.el6      rhel-x86_64-server-6    16 k
libibcm            x86_64   1.0.5-2.el6      rhel-x86_64-server-6    19 k
libibmad           x86_64   1.3.4-1.el6      rhel-x86_64-server-6    52 k
libibumad          x86_64   1.3.4-1.el6      rhel-x86_64-server-6    55 k
libibverbs         x86_64   1.1.4-4.el6      rhel-x86_64-server-6    44 k
libibverbs-utils   x86_64   1.1.4-4.el6      rhel-x86_64-server-6    34 k
libipathverbs      x86_64   1.2-2.el6        rhel-x86_64-server-6    13 k
libmlx4            x86_64   1.0.1-8.el6      rhel-x86_64-server-6    27 k
libmthca           x86_64   1.0.5-7.el6      rhel-x86_64-server-6    33 k
libnes             x86_64   1.1.1-1.el6      rhel-x86_64-server-6    15 k
librdmacm          x86_64   1.0.10-2.el6     rhel-x86_64-server-6    22 k
librdmacm-utils    x86_64   1.0.10-2.el6     rhel-x86_64-server-6    27 k
rdma               noarch   1.0-9.el6        rhel-x86_64-server-6    16 k
rds-tools          x86_64   2.0.4-3.el6      rhel-x86_64-server-6    55 k
Installing for dependencies:
ibutils-libs       x86_64   1.5.4-3.el6      rhel-x86_64-server-6   924 k
opensm-libs        x86_64   3.3.5-1.el6      rhel-x86_64-server-6    53 k
tk                 x86_64   1:8.5.7-5.el6    rhel-x86_64-server-6   1.4 M

Transaction Summary
=====================================================================================
Install 20 Package(s)
Total download size: 4.0 M
Installed size: 0
Is this ok [y/N]:
- DAT configuration file details for SLES and RHEL systems:
- On SLES, edit the Direct Access Transport (DAT) configuration file, /etc/dat.conf, to have a line for each of the communication adapter ports.
- On RHEL, the DAT configuration file is located in /etc/rdma/dat.conf and it is updated by the group installation of the "InfiniBand Support" package.
Ensure that the file has the following format:
<interface adapter name> u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "<network interface> 0" " "
- The <interface adapter name> string cannot be more than 19 characters long.
- The <network interface> name is the name of the network interface for the adapter port (for example, ib0).
cat /etc/dat.conf
ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""
ofa-v2-ib1 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib1 0" ""
ofa-v2-ib2 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib2 0" ""
ofa-v2-ib3 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib3 0" ""
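The format rules for DAT entries can be checked mechanically. The following is a minimal sketch, assuming each entry has the eight whitespace-separated fields shown in the format above, with the last two fields quoted; check_dat_line is a hypothetical helper and covers only the constraints stated in this topic.

```python
import shlex

EXPECTED_FIELDS = 8  # field count of the format shown in this topic


def check_dat_line(line):
    """Return a list of problems found in one dat.conf entry (empty if none)."""
    # shlex honors the double quotes, so "ib0 0" and "" each count as one field.
    fields = shlex.split(line)
    if len(fields) != EXPECTED_FIELDS:
        return [f"expected {EXPECTED_FIELDS} fields, found {len(fields)}"]

    problems = []
    adapter_name = fields[0]
    interface_params = fields[6]
    if len(adapter_name) > 19:
        problems.append(f"adapter name {adapter_name!r} is longer than 19 characters")
    parts = interface_params.split()
    if len(parts) != 2 or not parts[1].isdigit():
        problems.append(
            f"interface parameters {interface_params!r} are not '<network interface> <port>'"
        )
    return problems
```

Running this over every non-comment line of the file gives a quick indication of which entry to inspect first when DAT errors are reported.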
Note: If you are receiving DAT_INTERNAL_ERR communication errors, it is likely that the system attempted to communicate with an adapter interface that is not set up correctly in the Direct Access Transport (DAT) configuration file for the adapter port.
- Edit the network configuration files to configure a static IP address for each communication adapter port interface. The following file listings show the network adapter configuration for the CFs (hosts cf1 and cf2) and the members (member1, member2, member3, and member4). Edit the network configuration files on each host so that the first communication adapter port listed on each host is on the same subnet as the other hosts. If configuring multiple communication adapter ports on the CFs, pair the additional communication adapter ports on the CFs so that each DEVICE on the secondary CF is on the same subnetwork as the DEVICE on the primary CF with the same ID. The network configuration files are located in /etc/sysconfig/network on SLES and in /etc/sysconfig/network-scripts on RHEL. The following example is for SLES.
ssh cf1 cat /etc/sysconfig/network/ifcfg-ib0
DEVICE=ib0
BOOTPROTO='static'
IPADDR='10.222.0.1'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh cf1 cat /etc/sysconfig/network/ifcfg-ib1
DEVICE=ib1
BOOTPROTO='static'
IPADDR='10.222.1.1'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh cf1 cat /etc/sysconfig/network/ifcfg-ib2
DEVICE=ib2
BOOTPROTO='static'
IPADDR='10.222.2.1'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh cf1 cat /etc/sysconfig/network/ifcfg-ib3
DEVICE=ib3
BOOTPROTO='static'
IPADDR='10.222.3.1'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh cf2 cat /etc/sysconfig/network/ifcfg-ib0
DEVICE=ib0
BOOTPROTO='static'
IPADDR='10.222.0.2'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh cf2 cat /etc/sysconfig/network/ifcfg-ib1
DEVICE=ib1
BOOTPROTO='static'
IPADDR='10.222.1.2'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh cf2 cat /etc/sysconfig/network/ifcfg-ib2
DEVICE=ib2
BOOTPROTO='static'
IPADDR='10.222.2.2'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh cf2 cat /etc/sysconfig/network/ifcfg-ib3
DEVICE=ib3
BOOTPROTO='static'
IPADDR='10.222.3.2'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh member1 cat /etc/sysconfig/network/ifcfg-ib0
DEVICE=ib0
BOOTPROTO='static'
IPADDR='10.222.0.101'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh member2 cat /etc/sysconfig/network/ifcfg-ib0
DEVICE=ib0
BOOTPROTO='static'
IPADDR='10.222.0.102'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh member3 cat /etc/sysconfig/network/ifcfg-ib0
DEVICE=ib0
BOOTPROTO='static'
IPADDR='10.222.0.103'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

ssh member4 cat /etc/sysconfig/network/ifcfg-ib0
DEVICE=ib0
BOOTPROTO='static'
IPADDR='10.222.0.104'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'
Note:
- For simplicity, the IP addresses used in the previous example use the 255.255.255.0 subnetwork mask (NETMASK) so that the third and fourth IP segments can match the numbers of the interface devices and host names. With this subnetwork mask, the IP addresses for CFs are formatted like 10.222.interface-id-device-number.CF-hostname-suffix and the IP addresses for members are formatted like 10.222.interface-id-device-number.10member-hostname-suffix.
- The first communication adapter port on each CF host is on the same subnet as the members.
- Each communication adapter port on a CF or member is on a distinct subnet.
- Communication adapter ports with the same interface DEVICE name on the primary and secondary CFs share the same subnet.
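The three subnet rules in the note above can be checked against a planned address layout before any configuration file is edited. The following is a minimal sketch using the Python standard ipaddress module; the function names and the host-to-{device: ip} dictionary structure are assumptions made for illustration, with the first dictionary entry standing for the first adapter port.

```python
import ipaddress


def subnet(ip, netmask="255.255.255.0"):
    """Return the IP network that an address belongs to under the given mask."""
    return ipaddress.ip_interface(f"{ip}/{netmask}").network


def check_layout(cfs, members, netmask="255.255.255.0"):
    """Return a list of violated subnet rules (empty if the layout is consistent).

    cfs and members map a host name to a non-empty {device: ip} dict.
    """
    nets = {
        host: {dev: subnet(ip, netmask) for dev, ip in ports.items()}
        for host, ports in {**cfs, **members}.items()
    }
    problems = []

    # Each adapter port on a CF or member must be on a distinct subnet.
    for host, ports in nets.items():
        if len(set(ports.values())) != len(ports):
            problems.append(f"{host}: two ports share a subnet")

    # Ports with the same DEVICE name on the CFs must share a subnet.
    for dev in sorted({dev for cf in cfs for dev in nets[cf]}):
        if len({nets[cf][dev] for cf in cfs if dev in nets[cf]}) > 1:
            problems.append(f"{dev}: CFs are on different subnets")

    # The first port on each CF must be on the same subnet as each member's first port.
    def first(host):
        return next(iter(nets[host].values()))

    for cf in cfs:
        for member in members:
            if first(member) != first(cf):
                problems.append(f"{member}: first port is not on the subnet of {cf}")
    return problems
```

With the addresses from the example listings (cf1 and cf2 on 10.222.0.x through 10.222.3.x, members on 10.222.0.10x), the function returns an empty list.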
- If configuring multiple communication adapter ports on members, use the same IP subnet for each adapter interface device on the second host as was used for the adapter interface with the same device ID on the other hosts, so that matching devices are on the same IP subnets. All members must be on an IP subnet used by the CF adapter interfaces. For example:
cat /etc/sysconfig/network/ifcfg-ib0
DEVICE=ib0
BOOTPROTO='static'
IPADDR='10.1.1.161'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

cat /etc/sysconfig/network/ifcfg-ib1
DEVICE=ib1
BOOTPROTO='static'
IPADDR='10.1.2.161'
NETMASK='255.255.255.0'
STARTMODE='onboot'
WIRELESS='no'

The resulting IP subnets are:
- The 10.1.1 subnet has the ib0 device from all members and all CFs.
- The 10.1.2 subnet has the ib1 device from all members and all CFs.
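To see which subnet a given interface file places a device on, the IPADDR and NETMASK values can be combined directly. The following is a minimal sketch, assuming the simple KEY='value' layout shown in the listings in this topic; parse_ifcfg and ifcfg_subnet are hypothetical helper names.

```python
import ipaddress


def parse_ifcfg(text):
    """Parse KEY=value lines from an ifcfg file into a dict, dropping quotes."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def ifcfg_subnet(text):
    """Return (device name, IP network) for the interface described by the file."""
    cfg = parse_ifcfg(text)
    network = ipaddress.ip_interface(f"{cfg['IPADDR']}/{cfg['NETMASK']}").network
    return cfg["DEVICE"], network
```

Applying ifcfg_subnet to the ifcfg-ib0 and ifcfg-ib1 files of every host and grouping the results by network reproduces the subnet summary in the two bullets above.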
- For BladeCenter deployments only, enable the subnet manager service (OpenSM) on all hosts in the Db2 pureScale environment.
To enable the subnet manager service, run the following commands on each host to start the service and have it start after a reboot:
chkconfig opensmd on
service opensmd start
- Update the /etc/hosts file on each of the hosts so that for each host in the planned Db2 pureScale environment, the file includes all the IP addresses of all the communication adapter ports for all hosts in the planned environment.
The /etc/hosts file must have this format: <IP_Address> <fully_qualified_name> <short_name>. All hosts in the cluster must have the same /etc/hosts format.
For example, in a planned Db2 pureScale environment with multiple communication adapter ports on the CFs with four members, the /etc/hosts configuration file might resemble the following file:
10.222.0.1 cf1-ib0.example.com cf1-ib0
10.222.1.1 cf1-ib1.example.com cf1-ib1
10.222.2.1 cf1-ib2.example.com cf1-ib2
10.222.3.1 cf1-ib3.example.com cf1-ib3
10.222.0.2 cf2-ib0.example.com cf2-ib0
10.222.1.2 cf2-ib1.example.com cf2-ib1
10.222.2.2 cf2-ib2.example.com cf2-ib2
10.222.3.2 cf2-ib3.example.com cf2-ib3
10.222.0.101 member1-ib0.example.com member1-ib0
10.222.1.101 member1-ib1.example.com member1-ib1
10.222.0.102 member2-ib0.example.com member2-ib0
10.222.1.102 member2-ib1.example.com member2-ib1
10.222.0.103 member3-ib0.example.com member3-ib0
10.222.1.103 member3-ib1.example.com member3-ib1
10.222.0.104 member4-ib0.example.com member4-ib0
10.222.1.104 member4-ib1.example.com member4-ib1
Note: In a four-member environment that uses a single communication adapter port for each CF and member, the file would look similar to the previous example, but contain only the first IP address of each of the CFs in the previous example.
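Because every host must carry the same entries in the same format, it can help to generate the lines from one shared description of the environment. The following is a minimal sketch, assuming the host-device naming convention used in the example (for instance cf1-ib0) and the example.com domain; hosts_lines is a hypothetical helper.

```python
def hosts_lines(ports, domain="example.com"):
    """Build '<IP_Address> <fully_qualified_name> <short_name>' lines.

    ports maps a host name to a {device: ip} dict.
    """
    lines = []
    for host, devices in ports.items():
        for device, ip in devices.items():
            short_name = f"{host}-{device}"
            lines.append(f"{ip} {short_name}.{domain} {short_name}")
    return lines


# Usage:
# print("\n".join(hosts_lines({"cf1": {"ib0": "10.222.0.1", "ib1": "10.222.1.1"}})))
```

Generating the block once and distributing it to all hosts avoids format differences between the /etc/hosts files in the cluster.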
- Restart the service for the InfiniBand subsystem.
On SLES:
service openibd restart
On RHEL:
service rdma restart
- Verify the InfiniBand subsystem.
- Verify that the ports are active and the links are up.
Use the ibstat -v command or the ibstatus command to list the state of the adapters. This check applies to the ports and interfaces that were previously identified in /etc/dat.conf.
ibstatus
Infiniband device 'mlx4_0' port 1 status:
    default gid: fe80:0000:0000:0000:0002:c903:0007:eafb
    base lid: 0x2
    sm lid: 0x1
    state: 4: ACTIVE
    phys state: 5: LinkUp
    rate: 20 Gb/sec (4X DDR)

Infiniband device 'mlx4_0' port 2 status:
    default gid: fe80:0000:0000:0000:0002:c903:0007:eafc
    base lid: 0x3
    sm lid: 0x1
    state: 4: ACTIVE
    phys state: 5: LinkUp
    rate: 20 Gb/sec (4X DDR)
Note: Port 1 in the example output of the ibstatus command on Linux® corresponds to port 0 in the dat.conf file:
ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""
Verify that the state field value is ACTIVE and the phys state field reports that the link is up (LinkUp).
- Ensure the destination IP is resolvable. For example, enter the following:
# ip -resolve neigh
coralxib44-ib3 dev ib3 lladdr 80:00:00:49:fe:80:00:00:00:00:00:00:00:02:c9:03:00:0e:9d:5e REACHABLE
coralxib42.torolab.ibm.com dev bond0 lladdr 00:1a:64:c9:d1:e8 REACHABLE
coralxib42-ib0 dev ib0 lladdr 80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:07:ea:5f REACHABLE
coralxib44-ib0 dev ib0 lladdr 80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:07:eb:13 REACHABLE
9.26.120.1 dev bond0 lladdr 00:00:0c:07:ac:01 REACHABLE
coralxib43.torolab.ibm.com dev bond0 lladdr 00:1a:64:c9:cc:d4 REACHABLE
coralxib44-ib2 dev ib2 lladdr 80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:0e:9d:5d REACHABLE
coralxib44.torolab.ibm.com dev bond0 lladdr 00:1a:64:c9:d5:24 REACHABLE
coralxib44-ib1 dev ib1 lladdr 80:00:00:49:fe:80:00:00:00:00:00:00:00:02:c9:03:00:07:eb:14 REACHABLE
coralxib43-ib0 dev ib0 lladdr 80:14:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:07:ea:07 REACHABLE

# arp -an
? (10.1.4.144) at 80:00:00:49:fe:80:00:00:00 [infiniband] on ib3
? (9.26.120.241) at 00:1a:64:c9:d1:e8 [ether] on bond0
? (10.1.1.142) at 80:00:00:48:fe:80:00:00:00 [infiniband] on ib0
? (10.1.1.144) at 80:00:00:48:fe:80:00:00:00 [infiniband] on ib0
? (9.26.120.1) at 00:00:0c:07:ac:01 [ether] on bond0
? (9.26.120.103) at 00:1a:64:c9:cc:d4 [ether] on bond0
? (10.1.2.144) at 80:00:00:48:fe:80:00:00:00 [infiniband] on ib2
? (9.26.120.104) at 00:1a:64:c9:d5:24 [ether] on bond0
? (10.1.3.144) at 80:00:00:49:fe:80:00:00:00 [infiniband] on ib1
? (10.1.1.143) at 80:14:00:48:fe:80:00:00:00 [infiniband] on ib0
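The port-state check in this verification step (state ACTIVE, phys state LinkUp) can be automated by parsing captured ibstatus output. The following is a minimal sketch, assuming the output layout shown in the ibstatus example in this topic; parse_ibstatus and failing_ports are hypothetical helper names.

```python
import re

HEADER = re.compile(r"Infiniband device '([^']+)' port (\d+) status:")


def parse_ibstatus(output):
    """Map (device, port) to a dict of the status fields reported for that port."""
    ports = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = (header.group(1), int(header.group(2)))
            ports[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only: 'state: 4: ACTIVE' -> ('state', '4: ACTIVE').
            key, _, value = line.partition(":")
            ports[current][key.strip()] = value.strip()
    return ports


def failing_ports(output):
    """Return the (device, port) pairs that are not ACTIVE with the link up."""
    return [
        port
        for port, fields in parse_ibstatus(output).items()
        if not fields.get("state", "").endswith("ACTIVE")
        or not fields.get("phys state", "").endswith("LinkUp")
    ]
```

An empty list from failing_ports means every port listed in the output passed the check. Remember that ibstatus numbers ports from 1, while the dat.conf entries number them from 0.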
What to do next
Modify the kernel parameters of hosts that you plan to include in the Db2 pureScale environment.