PTF test for PI38376: A TCP connection can use the wrong maximum segment size (MSS) on V2R1
Part One
A test role here at Communications Server for z/OS involves more than just testing the latest and greatest code. Customers run into issues and fixes need to be made available for them, but not before they are internally tested. Today I describe my experience testing this PTF as a still relatively new member of the z/OS System test team. There were plenty of frustrations along the way, but fortunately I still have all of my hair and had the chance to learn some new things.
Gathering Information - I know some stuff, maybe?
To start, I look at the PTF record and corresponding web pages in our internal source control tool to gather some initial information. I end up with a Notepad++ document full of haphazardly pasted notes from various resources to sift through and make some sense of. Fortunately there are a ton of details, which makes for a happy tester. I'll leave the majority of the nitty-gritty out and summarize the situation:
- A distributed DVIPA (DRVIPA) is defined for at least two systems in a sysplex: one being the primary distributor and the other a backup.
- If the backup stack is started before the primary distributor and takes over the DVIPA, an implicit (host) route from the backup to the primary distributor is created for that DRVIPA with an MTU size of 576 during the SYN stage of a TCP handshake. In the failing case the multipath routing algorithm is used, which chooses the smallest MTU value among all possible routes to the DRVIPA but ignores the default host route. Although OMPROUTE uses OSPF to advertise host routes with larger MTU sizes, the MTU for this particular route remains "stuck", resulting in an MSS (maximum segment size) of 536 for outbound TCP connection setup requests. No bueno.
"The problem occurs when an implicit host route for the DRVIPA is generated with the default MTU 576 instead of 65535 on a backup system. This is accomplished by starting the backup system first before the distributor."
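To make the numbers concrete, here is a minimal sketch of the arithmetic behind the symptom. It assumes IPv4 and TCP headers without options (20 bytes each), and the 1500 value is purely an illustrative stand-in for a larger MTU learned through OSPF. The function names are my own and not part of any z/OS interface.

```python
# Assumption: IPv4 and TCP headers carry no options, so each is 20 bytes.
IPV4_HEADER_BYTES = 20
TCP_HEADER_BYTES = 20


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def smallest_mtu(route_mtus: list[int]) -> int:
    """Mimic the described behavior: take the smallest MTU among the routes."""
    return min(route_mtus)


# Hypothetical route set: the "stuck" host route (576) plus one larger
# OSPF-learned route (1500 is an illustrative value, not from the PTF).
candidate_mtus = [576, 1500]

print(mss_for_mtu(smallest_mtu(candidate_mtus)))
```

As long as the 576 route stays in the candidate set, the result is 536 no matter how large the other MTUs are, which matches the MSS the customer saw.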
To make matters a bit more convoluted, the conglomerate of notes informs me that I will not be able to view the incorrect MTU with a simple netstat route display. Instead, I'll have to dump the TCP/IP stack after recreating the scenario and scour through the raw memory. Staring at hexadecimal. Looking for something called an "RTE" in something else called an "RTOP."
Additionally, the customer's error description included steps to recreate the error:
1. Define the DRVIPA to be used
   - The backup definition must be defined without the MOVEABLE IMMED parameter
2. Start TCPIP on a backup system without OMPROUTE
3. Start and stop the primary distributor without OMPROUTE to force the DRVIPA takeover on the backup stack
   - At this point the host route for the DRVIPA with the MTU of 576 is created on the backup stack
4. Restart the primary distributor with OMPROUTE to take back the DRVIPA
5. Start OMPROUTE on the backup so OSPF host routes will be learned from the distributor
   - At this point the MTU value set at 576 will get "stuck"
6. If a connection is established from the backup to the DRVIPA on the distributor, a netstat display on this connection will show the MSS set to 536
7. Dump the TCP/IP address space
   - Examine the dump to find the MTU value of 576 from RTOP in RTE
So far, I know our environment has, at least, bits and pieces of this customer configuration. The SVT environment has DVIPAs defined with their corresponding VIPADISTRIBUTE and VIPABACKUP definitions. Since these DVIPA definitions were built to be highly customizable to suit a customer's needs, the number of possible options and parameters, combined with the need to get the syntax right, can be overwhelming at times. For this reason my preferred method is to work from an example - there are already so many different kinds of configuration files saved over the years in our test environment that there is likely one I can use as a template for this test.
As a tester, however, I could have saved a decent amount of time if all of the test information I needed were in a single location, instead of a number of separate records/web pages. I had to dig around to find useful pieces of information, and the first place that I looked (the PTF test record directly assigned to me) did not contain detailed error recreation instructions.
Research - I figure out some stuff
Terms and concepts
From the information I've gathered so far, I need to define some acronyms and understand some concepts not previously encountered.
- RTOP: A Google search didn't come up with anything, and neither did the two 'Terminology' bots on Sametime, so I went to the V2R1 Knowledge Center. A search gave me RTOPTS, the run-time options for Language Environment, which doesn't seem related, so I asked a more experienced tester, who wasn't familiar with the acronym either. Luckily I was eventually able to find someone who was familiar with it, and it turns out that RTOP is an identifier (an "eye catcher" as we call it) for a control block representing a group of routes to a given IP address destination in a dump of the TCP/IP address space. Internal stuff, so that's why it wasn't publicly searchable.
- RTE: Another hopeless Google search, but 'RTE', it turns out, is related to a TCPIPCS ROUTE report in IPCS. All of that happens to be in the Knowledge Center. Looking at a sample TCPIPCS ROUTE report gives me more clues - it looks like RTE is just a shortened name for the "Route" field in the report. RTOP has something to do with this report, but I don't quite see where it fits in just yet. Eventually, the same person who explained what RTOP was explained that RTE is an eye catcher for the route control block in a memory dump; there can be multiple RTE entries for a single RTOP.
- MOVEABLE IMMED parameter for VIPABACKUP definitions: sticking to the Knowledge Center for this one, the MOVEABLE IMMEDIATE parameter refers to the behavior of a DVIPA in the case of stack takebacks. So, if the stack that owns the DVIPA goes down, transfers ownership of that DVIPA to the defined backup stack, and then comes back up, the original stack will regain control of the DVIPA and receive all new connections for it. In this test scenario, the customer does not have MOVEABLE IMMEDIATE defined, so I will need to ensure that it is also not defined on my test systems.
- Implicit host route/routing, and how to spot it: I thought way too hard about this one and should have just asked someone right away because, as it turns out, it is simply a host route or a route to an IP address on the HOME list (signified by the H flag in a netstat display). It appears that the term "implicit" is older language exclusive to the mainframe crowd. That happens quite a bit around here.
- Multipath, including where it's defined and how to tell that it is enabled: What I want to know is where and how multipath is defined on a stack or router, or at least how to tell that it is being used. Back at the Knowledge Center I came across the Routing section with an OSPF overview. Just as the name suggests, multipath allows for a routing table to contain multiple routes ("paths") to a destination. There is also an IPCONFIG MULTIPATH or NOMULTIPATH statement for the TCP/IP profile, which either enables or disables multipath routing for outbound traffic, respectively. When testing I will need to verify that the MULTIPATH statement is configured for the TCP/IP stack I will use; I can also confirm it is being used by looking at a routing table netstat display. If multiple routes exist for a single destination, multipath is in use, at least for inbound traffic.
- How to examine a dump of the TCP/IP address space to find the IP routing table that should show the DRVIPA implicit host route with MTU 576: I know this part is specific to our environment and what we already have set up, but to find the MTU value buried in raw hex could be a formidable undertaking.
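Since RTOP and RTE are eye catchers, one way to picture "scouring raw memory" is a plain byte search. The sketch below is only an illustration under two assumptions of mine: that the eye catchers are stored as EBCDIC text (Python's cp037 codec) and that the dump is available as a flat byte string. The function name and sample buffer are hypothetical, and the sketch makes no claim about the real control block layout or where the MTU sits relative to the eye catcher. In practice the TCPIPCS ROUTE report in IPCS is the supported way to look at these blocks.

```python
def find_eye_catchers(dump: bytes, name: str, codec: str = "cp037") -> list[int]:
    """Return every offset at which the encoded eye catcher appears."""
    needle = name.encode(codec)  # assumption: eye catchers are EBCDIC text
    offsets = []
    start = dump.find(needle)
    while start != -1:
        offsets.append(start)
        start = dump.find(needle, start + 1)
    return offsets


# Hypothetical stand-in for dump storage: zero padding around two eye catchers.
sample_dump = (
    bytes(8)
    + "RTOP".encode("cp037")
    + bytes(12)
    + "RTE".encode("cp037")
    + bytes(5)
)

print(find_eye_catchers(sample_dump, "RTOP"))
print(find_eye_catchers(sample_dump, "RTE"))
```

A hit only tells me where a block might start. Reading the MTU out of it still requires the control block mapping, which I don't have at this point.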
Test procedure
The combined PTF records already provide a lot of information that will save me a significant amount of time by outlining steps for recreation with a good amount of detail - high-quality testing relies on the presence of as much relevant information as possible. The challenge at this point, however, is determining how I will mimic the customer's configuration and recreation steps in our own shared test environment.
For this test I will need at least two systems in the same sysplex: one as the primary distributor of the DRVIPA and one as a backup. Likewise, I'll need a method to "stop the distributor", which can be done in a few different ways, such as forcing the stack to leave the sysplex, using a VARY DEACTIVATE command on the DRVIPA itself, or stopping the stack entirely. I'll need to define this DRVIPA similar to the customer's configuration and be able to access and modify TCP/IP profiles. Finally, I'll need to be able to stop and start OMPROUTE on either stack.
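Two of the preconditions can be checked by reading the profile text before anything is started: MULTIPATH should be configured, and the VIPABACKUP definition should not carry MOVEABLE IMMED. The sketch below is a rough pre-flight check, not a profile parser. It assumes each statement sits on a single line and that ";" starts a comment, and the sample profile fragment, address, and function names are hypothetical examples of mine rather than the customer's configuration.

```python
import re


def strip_comments(profile_text: str) -> str:
    """Drop everything after ';' on each line (assumed comment marker)."""
    return "\n".join(line.split(";", 1)[0] for line in profile_text.splitlines())


def multipath_enabled(profile_text: str) -> bool:
    """True if an IPCONFIG line contains MULTIPATH (NOMULTIPATH does not match)."""
    text = strip_comments(profile_text).upper()
    return re.search(r"\bIPCONFIG\b[^\n]*\bMULTIPATH\b", text) is not None


def backups_with_moveable_immed(profile_text: str) -> list[str]:
    """Return VIPABACKUP lines that specify MOVEABLE IMMED or IMMEDIATE."""
    text = strip_comments(profile_text).upper()
    return [
        line.strip()
        for line in text.splitlines()
        if re.search(r"\bVIPABACKUP\b", line)
        and re.search(r"\bMOVEABLE\s+IMMED", line)
    ]


# Hypothetical backup-stack profile fragment for illustration only.
SAMPLE_PROFILE = """\
; backup stack for the DRVIPA test
IPCONFIG MULTIPATH
VIPADYNAMIC
  VIPABACKUP 100 10.1.1.1
ENDVIPADYNAMIC
"""

print(multipath_enabled(SAMPLE_PROFILE))
print(backups_with_moveable_immed(SAMPLE_PROFILE))
```

A real profile can split statements across lines or pull in other members, so a check like this is only a sanity pass before the netstat displays confirm what the stack actually loaded.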
At this point I think it's time to bring up some systems and see all of this for myself.
Stay tuned for Part Two where I dive in and actually do some testing!