Networking on z/OS


TCP/IP problem determination


If TCP/IP abends, a dump should be produced, and what you discover in the dump directs your search for the source of the problem. Most TCP/IP problems, however, have more subtle symptoms, and those symptoms can vary widely, particularly in a load-balancing and sysplex environment. This topic focuses on the more basic problems that you might encounter.

In most instances, IP problems are reported as one of the following:
Connectivity problems
The target host cannot be contacted over the network.
Response time problems
The host is not responding in a timely fashion.
Performance problems
The data is not moving at the desired or expected rate. This is also called a throughput problem and is usually associated with bulk data transfer.

The difficult part is identifying the source of such a problem. For example, does the problem lie in the TCP/IP address space, the target application, the target host itself, or an intermediate host (router) somewhere in between? Could it be switching equipment? Or could the problem begin at the workstation or host that is attempting the connection?

The following suggestions can help you to narrow down the source of the problem.

Network messages

The primary location to check for messages, whether issued by TCP/IP or by an IP application, is the z/OS system log. Messages might also appear in the SYSPRINT, SYSERR, SYSERROR, and SYSDEBUG data sets. The DD statements for these data sets usually direct messages to the job log, but check your JCL to be sure. As noted earlier, TCP/IP messages begin with the EZ prefix. The z/OS Communications Server IP messages manuals (all four volumes) explain in detail why a message might have been issued.

Note: The standard TCP/IP applications that are included in z/OS Communications Server also issue EZ-prefixed messages. FTP, TN3270, telnet, SMTP, and many other applications write EZ messages that can be used for diagnostic information.

A message might state exactly what the problem is, or it might at least indicate where the system programmer should focus attention. A sample TCP/IP message is shown in Figure 1.

Figure 1. TCP/IP TN3270 message
 EZZ6035I TELNET DEBUG CLIENT IPADDR..PORT
 168.214.219.106..4319
 CONN: 0000A427   LU: TESTTRLB  MOD: EZBTTRCV
 RCODE: 1001-01 Client disconnected from the connection.
 PARM1: 00000000 PARM2: 00000000 PARM3: 00000000

This message is actually issued by the TN3270 server application and is a straightforward indication that a TN3270 client has disconnected from the TN3270 server.
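
When many such messages have to be reviewed, it can help to pull the fields out of the message text programmatically. The following is a minimal sketch in Python that parses the message layout shown in Figure 1. The function name and the dictionary keys are illustrative assumptions, not part of z/OS Communications Server, and the sketch assumes an IPv4 client address formatted exactly as in the figure.

```python
import re

# Message text copied from Figure 1.
SAMPLE = """\
EZZ6035I TELNET DEBUG CLIENT IPADDR..PORT
168.214.219.106..4319
CONN: 0000A427   LU: TESTTRLB  MOD: EZBTTRCV
RCODE: 1001-01 Client disconnected from the connection.
PARM1: 00000000 PARM2: 00000000 PARM3: 00000000
"""


def parse_ezz6035i(text):
    """Return a dict of the fields found in an EZZ6035I message block.

    Hypothetical helper: the key names are chosen for this sketch only.
    """
    fields = {}
    lines = [line.strip() for line in text.strip().splitlines()]
    # The first token of the first line is the message identifier.
    fields["msgid"] = lines[0].split()[0]
    for line in lines[1:]:
        # The client address and port are separated by two dots.
        m = re.fullmatch(r"(\d{1,3}(?:\.\d{1,3}){3})\.\.(\d+)", line)
        if m:
            fields["client_ip"] = m.group(1)
            fields["client_port"] = int(m.group(2))
            continue
        # RCODE carries a code followed by free-form text.
        m = re.match(r"RCODE:\s+(\S+)\s+(.*)", line)
        if m:
            fields["rcode"] = m.group(1)
            fields["rcode_text"] = m.group(2)
            continue
        # The remaining lines hold KEY: VALUE pairs separated by blanks.
        for key, value in re.findall(r"(\w+):\s+(\S+)", line):
            fields[key.lower()] = value
    return fields


if __name__ == "__main__":
    for name, value in parse_ezz6035i(SAMPLE).items():
        print(f"{name}: {value}")
```

Grouping the parsed results by LU name or client address can show whether disconnections cluster around one client or are spread across the network.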

Determining TCP/IP server or client address space problems

Sometimes the problem manifests itself when a TCP/IP server or client address space (application) stops processing, loops, or slows down. In such a situation, take the following actions:
  1. Obtain an SVC dump of TCP/IP or the looping TCP/IP application by issuing the DUMP command from the z/OS system console. If the loop is disabled, the z/OS system console is not available for input, so take a stand-alone dump instead.
  2. If the application itself issued any error codes or messages, keep them available because sometimes these messages contain return or reason code details that are system-related rather than application-related.
  3. Obtain the appropriate portion of the z/OS system console log.
  4. Obtain the job log from the started procedure.
  5. Obtain the LOGREC output.

Diagnosing network problems

The following basic sequence can assist in diagnosing a network-based problem:
  1. Test and verify the TCP/IP address space configuration using NETSTAT commands.
  2. Test connectivity to remote hosts using the PING and TRACERTE commands.
  3. Obtain a TCP/IP packet trace (component SYSTCPDA).

The packet trace can be especially useful for determining where delays or response failures occur. By examining time stamps, you can determine whether a delay is at the z/OS end of the connection or somewhere else on the network.
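
The time stamp reasoning can be sketched as follows. This Python example assumes that the packets of one connection have already been extracted from the formatted trace into a list of (time stamp, direction) pairs, where the direction is "in" or "out" as seen by the z/OS host. That list format, the function name, and the sample values are hypothetical and serve only to illustrate the idea. Time spent between an inbound packet and the next outbound packet is attributed to the z/OS end; time spent between an outbound packet and the next inbound packet is attributed to the network or the remote host.

```python
from datetime import datetime

# Time stamp layout as it appears in the formatted packet trace summary.
TIMESTAMP_FORMAT = "%Y/%m/%d %H:%M:%S.%f"


def attribute_gaps(packets):
    """Split the elapsed time of a connection into two buckets.

    packets: list of (timestamp_string, direction) in capture order,
             with direction "in" or "out" relative to the z/OS host.
    Returns (local_seconds, elsewhere_seconds).
    """
    local = 0.0
    elsewhere = 0.0
    for (t0, d0), (t1, d1) in zip(packets, packets[1:]):
        start = datetime.strptime(t0, TIMESTAMP_FORMAT)
        end = datetime.strptime(t1, TIMESTAMP_FORMAT)
        gap = (end - start).total_seconds()
        if d0 == "in" and d1 == "out":
            local += gap        # the host was holding the request
        elif d0 == "out" and d1 == "in":
            elsewhere += gap    # waiting on the network or remote host
    return local, elsewhere


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real trace.
    sample = [
        ("2005/09/20 09:47:25.000000", "in"),
        ("2005/09/20 09:47:25.002000", "out"),
        ("2005/09/20 09:47:25.302000", "in"),
        ("2005/09/20 09:47:25.303000", "out"),
    ]
    local, elsewhere = attribute_gaps(sample)
    print(f"z/OS end: {local:.6f} s, elsewhere: {elsewhere:.6f} s")
```

This simple split ignores consecutive packets that travel in the same direction, so it is only a first approximation for request and response traffic.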

Mainframe packet trace

The component SYSTCPDA trace is one of the starting points for diagnosing IP-based problems. The trace is written unformatted to a CTRACE data set, and the data can subsequently be formatted using IPCS. There are many filtering and report generation options available. One of the most commonly used report options is a connection summary option called SESSION(DETAIL). A sample is shown in Figure 2.

Figure 2. Excerpt of packet trace session output
  Local Ip Address:                           198.162.245.166      
  Remote Ip Address:                            142.178.32.65      
                                                                   
                                                                   
 Host:                                 Local,          Remote      
  Client or Server:                   SERVER,          CLIENT      
  Port:                                   21,            3930      
  Application:                           ftp,                      
  Link speed (parm):                      10,              10 Megabits/s
                                                                   
 Connection:                                                       
  First timestamp:                 2005/09/20 09:47:25.592108      
  Last timestamp:                  2005/09/20 09:47:25.808723      
  Duration:                                   00:00:00.216615      
  Average Round-Trip-Time:                              0.019 sec  
  Final Round-Trip-Time:                                0.312 sec  
  Final state:                         CLOSED (PASSIVE CLOSE)      
  Out-of-order timestamps:                                  0      
                                                                   
 Data Quantity & Throughput:         Inbound,        Outbound  
  Application data bytes:                130,             123      
  Sequence number delta:                 132,             124      
  Total bytes Sent:                      130,             123      
  Bytes retransmitted:                     0,               0 
  Throughput:                           0.75,           0.709 Kilobytes/s
  Bandwidth utilization:               0.06%,           0.05%      
  Delay ACK Threshold:                   200,             200 ms   
  Minimum Ack Time:                 0.038900,        0.000000      
  Average Ack Time:                 0.039707,        0.000000      
  Maximum Ack Time:                 0.040515,        0.000000      
                                                                   
 Data Segment Stats:                 Inbound,        Outbound      
  Number of data segments:                 1,               2      
  Maximum segment size:                 4056,            1460      
  Largest segment size:                  130,              63      
  Average segment size:                  130,              61      
  Smallest segment size:                 130,              60      
  Segments/window:                       1.0,             1.0      
  Average bytes/window:                  130,              61      
  Most bytes/window:                     130,              63      
                                                                   
 Window Stats:                       Inbound,        Outbound      
  Number of windows:                       1,               2      
  Maximum window size:                 65536,           65536      
  Largest window advertised:           65535,           65406      
  Average window advertised:           52415,           54526      
  Smallest window advertised:              0,           32768      
  Window scale factor:                     0,               0      
  Window frequency:                   0.0001,          0.0001 Windows/s 

The SESSION(DETAIL) output gives an at-a-glance summary of the connection that was traced. It begins with the basics, such as IP addresses and port numbers, and then continues on with all other measurable aspects of a TCP/IP connection. The connection in Figure 2 was of very short duration and only a few bytes were exchanged. Considering that it is connected to port 21 on the mainframe, the bytes exchanged would presumably be FTP commands flowing along the control connection of an FTP session.
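
As an illustration of how the summary can be checked or post-processed, the following Python sketch reads a few lines copied from Figure 2 and recomputes the connection duration from the first and last time stamps. The function names are assumptions made for this example, and the parser handles only the simple "label: value" and "label: value, value" layouts visible in the figure.

```python
from datetime import datetime

# A few lines copied from Figure 2.
EXCERPT = """\
  First timestamp:                 2005/09/20 09:47:25.592108
  Last timestamp:                  2005/09/20 09:47:25.808723
  Duration:                                   00:00:00.216615
  Application data bytes:                130,             123
  Bytes retransmitted:                     0,               0
"""


def parse_session_excerpt(text):
    """Return (single, paired) dicts keyed by the label before the colon.

    single holds one-value fields; paired holds two-column fields as
    (first_column, second_column) string tuples.
    """
    single, paired = {}, {}
    for line in text.splitlines():
        # Split at the first colon only; time stamps contain more colons.
        label, sep, value = line.strip().partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # section heading or blank line
        if "," in value:
            first, second = (v.strip() for v in value.split(",", 1))
            paired[label] = (first, second)
        else:
            single[label] = value
    return single, paired


def elapsed_seconds(single):
    """Recompute the duration from the first and last time stamps."""
    fmt = "%Y/%m/%d %H:%M:%S.%f"
    first = datetime.strptime(single["First timestamp"], fmt)
    last = datetime.strptime(single["Last timestamp"], fmt)
    return (last - first).total_seconds()


if __name__ == "__main__":
    single, paired = parse_session_excerpt(EXCERPT)
    print("Reported duration:", single["Duration"])
    print("Recomputed seconds:", elapsed_seconds(single))
    print("Application data bytes:", paired["Application data bytes"])
```

The recomputed value of 0.216615 seconds agrees with the Duration line in Figure 2.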

Much of the other information in a packet trace requires sophisticated knowledge of the TCP and IP protocols.

LAN-based tracing

When the problem appears to be outside the mainframe, a sniffer (LAN) trace may be appropriate. A sniffer can run a trace at the remote end of the connection while a packet trace is running on the mainframe. By comparing the two traces, the location of the problem can be pinpointed more accurately. For example, by comparing timestamps, a response time problem can be confirmed as being on the mainframe, on the remote host, or on the network in between.
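
One way to make this comparison without synchronized clocks is to measure the request-to-response interval separately within each trace and then subtract. The Python sketch below assumes that the remote host is the client, and that the same request and response packets have been identified in both traces. The function name, argument names, and sample values are hypothetical.

```python
def split_response_time(remote_request_sent, remote_response_seen,
                        host_request_seen, host_response_sent):
    """Split one request/response exchange into host and network time.

    Each pair of arguments is in seconds on the clock of the trace it
    came from, so the two clocks do not need to agree with each other.
    """
    # Interval observed by the sniffer at the remote end.
    total = remote_response_seen - remote_request_sent
    # Interval observed by the packet trace on the mainframe.
    host = host_response_sent - host_request_seen
    # What remains was spent in transit in both directions.
    network = total - host
    return {"total": total, "host": host, "network": network}


if __name__ == "__main__":
    # Hypothetical values: the two traces use unrelated clock bases.
    result = split_response_time(
        remote_request_sent=10.000, remote_response_seen=10.450,
        host_request_seen=503.120, host_response_sent=503.150,
    )
    for part, seconds in result.items():
        print(f"{part}: {seconds:.3f} s")
```

In this made-up example most of the interval lies outside the mainframe. A negative network value would suggest that the packets were not matched correctly between the two traces.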

There are many tools that perform packet sniffing, network monitoring, and protocol analysis. Two of the most common are the Sniffer tool from Network General and Ethereal, which is open source software released under the GNU General Public License.





Copyright IBM Corporation 1990, 2010