Product Documentation
Abstract
Things to look for when using SMC-R (RoCE) or SMC-D on z/OS Communications Server
Content
Diagnosing problems with SMC-R and SMC-D
Wireshark support for SMC-R is available in Wireshark Release 2.0 and later. The Wireshark 2.0 release notes at the following URL confirm that SMC-R support is included (see section 2.3, New Protocol Support):
https://www.wireshark.org/docs/relnotes/wireshark-2.0.0.html
Note: Links to publications listed below will be added after z/OS GA.
SMC-R problems are often related to switch configuration, physical network ID (PNetID) configuration, and other configuration issues.
Common problems with SMC-R communication include the following:
- Switch configuration issues
- Physical network ID configuration issues
- No associated subnet mask
- PFID status remains STARTING
- Problem with SMC-R interaction with security function
The SMCReason field of the Netstat ALL/-A report and the SMCR field of the Netstat DEvlinks/-d report provide information that is related to SMC-R problems. For a complete list of SMCReason codes in the Netstat ALL/-A report and the SMCR Disabled reasons in the Netstat DEvlinks/-d report, see z/OS V2R1 Communications Server IP System Administrator's Commands.
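When a Netstat ALL/-A report has been saved as text, the SMCReason values can be collected with a short script. The following is a minimal sketch, not an IBM-provided tool. It assumes each reason appears on a line shaped like "SMCReason: 00005013 - RDMA CONNECTIVITY FAILURE", which is modeled on the codes quoted in this document; the function name find_smc_reasons is hypothetical, and padding short codes to eight digits is a normalization choice made here for illustration.

```python
import re

# Assumed line shape (an assumption, modeled on the codes quoted in this document):
#   SMCReason: 00005013 - RDMA CONNECTIVITY FAILURE
SMC_REASON_RE = re.compile(r"SMCReason:\s*([0-9A-Fa-f]{4,8})\s*-?\s*(.*\S)?")


def find_smc_reasons(report_text):
    """Return a list of (code, description) pairs, one per SMCReason line."""
    reasons = []
    for line in report_text.splitlines():
        match = SMC_REASON_RE.search(line)
        if match:
            # Pad short forms such as 521E to eight hex digits so that
            # codes written in either style compare equal.
            code = match.group(1).upper().zfill(8)
            description = (match.group(2) or "").strip()
            reasons.append((code, description))
    return reasons
```

The returned codes can then be looked up in the SMCReason list in the IP System Administrator's Commands publication.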
Switch configuration issues
RDMA processing requires standard 10 GbE switch support, and distance limitations exist. Enable the global pause frame (a standard Ethernet switch feature for Ethernet flow control that is described in the IEEE 802.3x standard) on the switch.
When the SMCReason field of the Netstat ALL/-A report is 00005013 - RDMA CONNECTIVITY FAILURE, SMC-R was not able to complete the Link Confirm flow, which usually indicates a switch configuration issue. The Link Confirm message is the first data sent over the RoCE fabric. Check for the following issues:
- If you are using VLANs, verify that the VLAN configuration on the RoCE switch ports is consistent with the VLAN configuration on the OSD switch ports.
For example, the OSD switch ports might be configured properly with no VLAN ID or the default VLAN ID, but the RoCE switch ports have a different VLAN ID configured, such as trunk mode with VLAN IDs 400 and 500.
- Verify that your cable is plugged into the correct port on the RoCE Express feature and into the correct port on the switch.
For example, perhaps the cable is plugged into the correct port on the RoCE Express feature but it is plugged into the wrong port on the switch, or maybe the cable is plugged into the correct port on the switch but the wrong port on the RoCE Express feature.
- Verify that the MTU value configured on the switch is large enough to support your configured MTU size for this interface.
Hint: Enable jumbo frame support on the RoCE switch ports (when using 2K MTU).
- If multiple switches are in use, verify that the switch uplinks are configured properly.
For more information about configuring VLANs, see VLANID considerations in z/OS Communications Server IP Configuration Guide.
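The VLAN consistency check described above can be expressed as a comparison of two sets of VLAN IDs. The following is a minimal sketch under the assumption that you have already gathered the VLAN IDs configured on the OSD switch ports and on the RoCE switch ports by some other means; the function name vlan_mismatch and the result keys are hypothetical.

```python
def vlan_mismatch(osd_vlans, roce_vlans):
    """Return the VLAN IDs configured on one group of switch ports but not the other.

    Both arguments are iterables of VLAN IDs; an empty iterable means that
    no VLAN ID is configured on those ports.
    """
    osd, roce = set(osd_vlans), set(roce_vlans)
    return {
        "only_on_osd": sorted(osd - roce),
        "only_on_roce": sorted(roce - osd),
    }


def vlans_consistent(osd_vlans, roce_vlans):
    """True when both groups of switch ports carry exactly the same VLAN IDs."""
    result = vlan_mismatch(osd_vlans, roce_vlans)
    return not result["only_on_osd"] and not result["only_on_roce"]
```

For the example given earlier (no VLAN ID on the OSD switch ports, trunk mode with VLAN IDs 400 and 500 on the RoCE switch ports), vlan_mismatch([], [400, 500]) reports both IDs under only_on_roce.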
Physical network ID configuration issues
The TCP/IP stack must be able to determine which physical network is connected to a particular 10GbE RoCE Express interface, so that the 10GbE RoCE Express interface can be associated (associated RNIC interface) with the SMC-R capable IPAQENET or IPAQENET6 interfaces that connect to that same physical network.
Use the Netstat DEvlinks/-d and D NET,TRL,TRLE=xxxx commands to verify the physical network ID (PNetID) value on the OSD interfaces and the 10GbE RoCE Express interfaces.
- If the Netstat DEvlinks/-d report for your OSD interface indicates SMCR: DISABLED (NO PNETID), ensure that you configured the PNetID value on the correct OSD port in the HCD definitions.
- If you receive message EZD2028I with reason PNETID IS NOT CONFIGURED during 10GbE RoCE Express interface activation, ensure that you configured the PNetID value on the correct 10GbE RoCE Express port in the HCD definitions.
- If the Netstat DEvlinks/-d report for your OSD interface indicates SMCR: Yes and your 10GbE RoCE Express interfaces initialized successfully, verify that the PNetID value of the OSD interface matches the PNetID value of the intended 10GbE RoCE Express interfaces.
In the HCD definitions, the Physical network ID 1 value is for port 1 on 10GbE RoCE Express features and port 0 on OSD adapters, and the Physical network ID 2 value is for port 2 on 10GbE RoCE Express features and port 1 on OSD adapters. The Physical network ID 3 and Physical network ID 4 values are not used.
For more information about configuring PNetIDs, see Physical network considerations in z/OS V2R1 Communications Server IP Configuration Guide.
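The relationship between the HCD Physical network ID fields and the adapter ports is easy to get wrong, so it can help to write it down as a lookup table. The following is a minimal sketch of the mapping described above; the names PNETID_SLOT_TO_PORT, port_for_pnetid_slot, and pnetids_match, and the adapter labels "ROCE" and "OSD", are hypothetical and chosen only for illustration.

```python
# HCD "Physical network ID N" field -> port number, per adapter type,
# as described in this section. Fields 3 and 4 are not used.
PNETID_SLOT_TO_PORT = {
    "ROCE": {1: 1, 2: 2},  # 10GbE RoCE Express feature
    "OSD": {1: 0, 2: 1},   # OSD adapter
}


def port_for_pnetid_slot(adapter_type, slot):
    """Return the port that the given Physical network ID field applies to."""
    try:
        return PNETID_SLOT_TO_PORT[adapter_type][slot]
    except KeyError:
        raise ValueError(
            f"no port for adapter type {adapter_type!r}, PNetID field {slot!r}"
        ) from None


def pnetids_match(osd_pnetid, roce_pnetid):
    """True when both PNetID values are configured and identical."""
    if not osd_pnetid or not roce_pnetid:
        return False  # a missing PNetID can never match
    return osd_pnetid.strip() == roce_pnetid.strip()
```

The PNetID values themselves still have to be read from the Netstat DEvlinks/-d and D NET,TRL,TRLE=xxxx output; this sketch only captures the comparison and the field-to-port mapping.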
No associated subnet mask
SMC-R is used only between peers whose IPv4 interfaces have the same subnet value or whose IPv6 interfaces have at least one prefix in common.
- For IPv4, when a subnet mask value is not configured for the OSD interface, the SMCR field of the Netstat DEvlinks/-d report is DISABLED (NO SUBNET MASK).
- For IPv4, you might also see that the SMCReason code in the Netstat ALL/-A report is 521E PEER SUBNET/PREFIX MISMATCH.
- For IPv6, the SMCReason code in the Netstat ALL/-A report is 521E PEER SUBNET/PREFIX MISMATCH.
For information about associating your interfaces with the appropriate subnet or prefix, see Configuring Shared Memory Communications – RDMA in z/OS V2R1 Communications Server IP Configuration Guide.
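The eligibility rule stated at the start of this section (same IPv4 subnet, or at least one IPv6 prefix in common) can be checked by hand with the Python standard library ipaddress module. The following is a minimal sketch of that rule for planning purposes, not a description of how the TCP/IP stack implements the check; the function names and the sample addresses in the usage note are hypothetical.

```python
import ipaddress


def same_ipv4_subnet(local_if, peer_if):
    """True when two IPv4 interfaces are in the same subnet.

    Each argument is an 'address/prefixlen' or 'address/netmask' string,
    for example '10.1.1.5/24' or '10.1.1.5/255.255.255.0'.
    """
    local = ipaddress.ip_interface(local_if)
    peer = ipaddress.ip_interface(peer_if)
    return local.version == 4 and peer.version == 4 and local.network == peer.network


def share_ipv6_prefix(local_prefixes, peer_prefixes):
    """True when the two lists of IPv6 prefixes have at least one prefix in common."""
    local = {ipaddress.ip_network(p, strict=False) for p in local_prefixes}
    peer = {ipaddress.ip_network(p, strict=False) for p in peer_prefixes}
    return bool(local & peer)
```

For example, same_ipv4_subnet("10.1.1.5/24", "10.1.2.9/24") is False, which corresponds to the situation in which a PEER SUBNET/PREFIX MISMATCH reason would be expected.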
PFID status remains STARTING
The PFIDSTATUS field is the RNIC interface PFID status. The following list describes the possible status values:
- READY
READY indicates that the initialization sequence with the PFID is complete and the PFID is now ready.
- NOT ACTIVE
NOT ACTIVE indicates that the PFID was never started or was stopped after it was started.
- STARTING
STARTING indicates that a START of the PFID was issued and TCP/IP sent an activation request to the Data Link Control (DLC) layer. While the status is STARTING, z/OS Communications Server has not yet received a port state change event from the RoCE Express adapter indicating that the port is active. Until the port state change event is received, the PFIDSTATUS remains in STARTING state.
If the PFIDSTATUS field does not change from STARTING to READY, take the following actions:
- Check that your cables are connected properly.
- Verify that the switch ports are enabled.
- If the RoCE adapters are hard-wired to each other, the STARTING status is expected until the partner side has started the RNIC interface.
- Verify that the optical cable used for the RoCE adapter is not damaged.
Problem with SMC-R interaction with security function
Generally, security functions that require TCP/IP to examine TCP packets cannot be used with SMC-R communications because data that is sent over SMC-R links is not converted into TCP packets. For more information, see Security functions in z/OS V2R1 Communications Server IP Configuration Guide.
Recommended Maintenance
See Info APAR II14751
Document Information
Modified date: 17 June 2018
UID: swg27039578