Verifying uDAPL configurations for connectivity issues on Linux
uDAPL connectivity issues are most commonly caused by misconfiguration. You can verify uDAPL configurations to ensure that the members can communicate with the CF.
Before you begin
List the installed OFED packages by running the
rpm -qa | grep ofed
command.
Procedure
Use the following steps to verify your uDAPL configurations:
- Examine the physical port states by running the
ibstat -v
command. Ensure that the State is Active, and the Physical State is LinkUp as shown in the following example:
    CA 'mthca0'
        CA type: MT25208 (MT23108 compat mode)
        Number of ports: 2
        Firmware version: 4.7.400
        Hardware version: a0
        Node GUID: 0x0005ad00000c03d0
        System image GUID: 0x0005ad00000c03d3
        Port 1:
            State: Active
            Physical state: LinkUp
            Rate: 10
            Base lid: 16
            LMC: 0
            SM lid: 2
            Capability mask: 0x02510a68
            Port GUID: 0x0005ad00000c03d1
        Port 2:
            State: Down
            Physical state: Polling
            Rate: 10
            Base lid: 0
            LMC: 0
            SM lid: 0
            Capability mask: 0x02510a68
            Port GUID: 0x0005ad00000c03d2
If the port State is not Active, check the cable for connectivity.
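When a host has several adapters or ports, scanning the output by eye gets tedious. The following is a minimal sketch of a filter that reads ibstat -v style text and reports any port that is not Active and LinkUp. The function name check_ib_ports is hypothetical, and the sketch assumes the output layout shown in the example above.

```shell
# check_ib_ports: hypothetical helper, not part of OFED.
# Reads ibstat -v style text on stdin and prints one line for every port
# whose State is not Active or whose Physical state is not LinkUp.
check_ib_ports() {
    awk '
        # Adapter header line; \047 is the single-quote character.
        /^CA \047/ { gsub(/\047/, "", $2); ca = $2 }
        # Port header line such as "Port 1:"; drop the trailing colon.
        /^[ \t]*Port [0-9]+:/ { port = $2; sub(/:$/, "", port) }
        /^[ \t]*State:/ { state = $2 }
        /^[ \t]*Physical state:/ {
            phys = $3
            if (state != "Active" || phys != "LinkUp")
                printf "%s port %s: State=%s Physical=%s\n", ca, port, state, phys
        }
    '
}

# Typical use on a host with OFED installed:
#   ibstat -v | check_ib_ports
```

No output means that every port in the input is Active and LinkUp. A port that is intentionally unused, such as Port 2 in the example, is still reported, so read the result with the cabling plan in mind.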
- On the CF hosts, verify that the IP address associated
with the IB ports matches the IP addresses used for the net names
for the CF entry in the db2nodes.cfg file.
- View the IP address that is associated with the IB ports
on the CF host by running the
ifconfig -a
command. The IP address is the value of the inet addr
field as shown:
    coralxib20:/home/svtdbm3 >ifconfig -a
    ib0       Link encap:UNSPEC  HWaddr 80-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
              inet addr:10.1.1.120  Bcast:10.1.1.255  Mask:255.255.255.0
              inet6 addr: fe80::205:ad00:c:3d1/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
              RX packets:18672 errors:0 dropped:0 overruns:0 frame:0
              TX packets:544 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:256
              RX bytes:2198980 (2.0 Mb)  TX bytes:76566 (74.7 Kb)
In the output, ib0 is the interface name. The status is UP, and the IP address is 10.1.1.120. It is important to ensure that the interface status is up.
- Ensure that the network names for the CF in the db2nodes.cfg
file match the IP addresses of the IB ports that the CF is
intended to use. You must also ensure that the name can be pinged, and is reachable from all hosts on the cluster.
From each member host, run a ping command against the network names that are associated with the CF entry in the db2nodes.cfg file. Observe the IP address returned. The IP address must match the IP address that is associated with the IB port configuration at the CF host, as in the
ifconfig -a
output.
Note: When you ping an IP address on a different subnet, the pings are unsuccessful. This occurs when multiple interfaces are defined for the CF and each interface has a different subnet mask. In this case, from the member, ping the target IP address on the CF host that has the same subnet mask as the interface on the member host.
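The netname check can be partly scripted. The sketch below makes several assumptions: that each db2nodes.cfg line ends with the keyword CF or MEMBER, that the netname is the fourth column, and that multiple netnames are comma-separated. The function names cf_netnames and resolve_netnames are hypothetical. Adjust the column number if your file is laid out differently.

```shell
# cf_netnames: hypothetical helper.
# Reads db2nodes.cfg text on stdin and prints each CF netname on its own line.
# Assumes the netname is column 4 and that the last column is CF for CF entries.
cf_netnames() {
    awk '$NF == "CF" {
        n = split($4, names, ",")
        for (i = 1; i <= n; i++) print names[i]
    }'
}

# resolve_netnames: hypothetical helper.
# Reads names on stdin and prints "name address" pairs, so that the addresses
# can be compared with the inet addr values from ifconfig -a on the CF host.
resolve_netnames() {
    while read -r name; do
        addr=$(getent hosts "$name" | awk '{ print $1; exit }')
        printf '%s %s\n' "$name" "${addr:-UNRESOLVED}"
    done
}

# Typical use on each member host (the file location depends on your instance):
#   cf_netnames < /path/to/db2nodes.cfg | resolve_netnames
#   cf_netnames < /path/to/db2nodes.cfg | while read -r n; do ping -c 1 "$n"; done
```

Name resolution alone does not prove reachability, which is why the second usage line still runs ping against each name.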
- Verify that the uDAPL interface is configured in the /etc/dat.conf file
on all hosts, and that the right adapter port value is used. Since Db2® pureScale® uses uDAPL 2.0, look for the first entry that has
u2.0
in the second column with the matching interface name and port number. On Linux®, the adapter port value is not used, and is "0". The following entry is similar to what you might find in the /etc/dat.conf file on SLES, or the /etc/rdma/dat.conf file on RHEL:
    ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""
In the output, ofa-v2-ib0 is the unique transport device name for the uDAPL interface. The u2.0 indicates that the entry is for a uDAPL 2.0 application. You must ensure that the libdaplofa.so.2 file exists, because it is the uDAPL shared library. The "ib0 0" value is the uDAPL provider-specific instance data. In this case, the adapter is ib0, and the port is "0", since it is not used.
If the CF is configured with multiple interfaces by using multiple netnames in the db2nodes.cfg file, you must ensure that all the interfaces are defined in the dat.conf file.
Note: The /etc/dat.conf file must only contain entries for the adapters that are in the local host. The sample /etc/dat.conf file that is installed by default typically contains irrelevant entries. To avoid unnecessary processing of the file, make the following changes:
- Move all the Db2 pureScale cluster-related adapter entries to the top of the file.
- Comment out the irrelevant entries or remove them from the file.
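The "first entry with u2.0 in the second column" rule can be expressed as a small filter. This is a sketch under the assumption that entries are whitespace-separated as in the sample line, so that the quoted instance data begins in the seventh field. The function name first_u20_entry is hypothetical.

```shell
# first_u20_entry: hypothetical helper.
# Reads dat.conf text on stdin and prints the first uncommented uDAPL 2.0
# entry whose instance data starts with the interface name given as $1.
first_u20_entry() {
    awk -v iface="$1" '
        # Skip commented-out entries.
        /^[ \t]*#/ { next }
        # Field 7 is the opening quote plus the adapter name, for example "ib0
        $2 == "u2.0" && $7 == "\"" iface { print; exit }
    '
}

# Typical use (use /etc/rdma/dat.conf on RHEL):
#   first_u20_entry ib0 < /etc/dat.conf
# The fifth column of the printed entry names the shared library; one way to
# confirm that it is known to the dynamic loader:
#   ldconfig -p | grep libdaplofa.so.2
```

If the function prints nothing, no active uDAPL 2.0 entry exists for that interface on the host.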
- Ensure that the port value specified on the client connect
request matches the port value that the CF listens on. You must ensure that the CF port values are the same in the /etc/services files for all hosts in the cluster.
- To determine the port value that is used for the CF,
look in the CF diagnostic log file. In the cfdiag_<timestamp>.<id>.log file, look for the value that is associated with the
CA Port[0]
field as part of the prolog information at the beginning of the log file.
- To determine the port value that is used by the member on the connect request, look for the
PsOpen event in the Db2 member diagnostic log
(db2diag.log) file. Look for the value of the caport
field.
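To compare the CF port values in /etc/services across hosts, it helps to reduce each file to just the CF entries in a stable order. The sketch below assumes that the CF service names start with the prefix DB2CF_, which might not match your system, so the prefix can be overridden. The function name cf_service_ports is hypothetical.

```shell
# cf_service_ports: hypothetical helper.
# Reads /etc/services text on stdin and prints "service port/protocol" for
# every entry whose name starts with the given prefix, sorted for comparison.
# The default prefix DB2CF_ is an assumption; pass another one as $1 if needed.
cf_service_ports() {
    prefix=${1:-DB2CF_}
    awk -v p="$prefix" 'index($1, p) == 1 { print $1, $2 }' | LC_ALL=C sort
}

# Typical use: run on every host and compare the results, for example
#   cf_service_ports < /etc/services | cksum
#
# The log fields named in the steps above can be located with grep. The exact
# layout of those log lines is not shown here, so inspect the matches by eye:
#   grep 'CA Port\[0\]' cfdiag_*.log
#   grep -i 'caport' db2diag.log
```

Identical checksums on all hosts indicate that the selected entries agree. A mismatch points to the host whose /etc/services file needs attention.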