From time to time (sometimes every day - the support business is a capricious one) I need to see what's really going on in the fibre. For that reason we have a couple of tracers which can be sent to the EMEA countries. Some IBM organizations in some countries even have their own tracers. For SAN support we use the XGIGs from JDSU (originally from Finisar). Usually I trace if the problem is somehow protocol-related and cannot be solved with the RAS packages of the switches and the devices. Or if the RAS information from one device contradicts that of another. Or if every support team (internal and external) points at the others. Or if something totally strange happens and nobody can deal with it. Maybe we trace a little too often, because by now other vendors sometimes say things like "Oh, you also have IBM gear in your environment? Let them trace it!".
So what's this tracing all about?
To put it simply, you connect the tracer inline on the link and it just records all the traffic. Of course you can filter it and let it trace only the interesting part of the frames. I do not care about the actual data, but the FCP and SCSI header info is precious information. An 8 Gbps link generates a lot of data, and the tracer's memory is very limited. So you want to be sure to trace exactly what you need - not more, not less. The tracing is done by IBM customer engineers. We make sure to have a suitable number of trained CEs in every region. I hosted some of the trainings myself, and imho it's definitely worth it. The analysis is done afterwards. I personally like it, because it frees me from being "bound" to the RAS packages alone. I can really see what happens.
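To get a feeling for why "not more, not less" matters, here is a rough back-of-the-envelope sketch. The buffer size and the truncation length are purely hypothetical values chosen for illustration, not Xgig specifications. The link rate is the nominal figure of roughly 800 MB/s per direction for 8 Gbps Fibre Channel. The sketch assumes a fully loaded link carrying maximum-size frames and ignores any per-frame bookkeeping the tracer adds.

```python
# Back-of-the-envelope estimate: how long until the trace memory is full?
# BUFFER_BYTES and KEPT_BYTES are hypothetical, not real tracer specifications.

LINK_BYTES_PER_SEC = 800e6    # nominal 8 Gbps FC data rate, one direction
BUFFER_BYTES = 2 * 1024**3    # hypothetical: 2 GiB of capture memory per channel
FULL_FRAME_BYTES = 2148       # max FC frame: SOF 4 + header 24 + payload 2112 + CRC 4 + EOF 4
KEPT_BYTES = 64               # hypothetical: keep only the first 64 bytes of each frame


def seconds_until_full(buffer_bytes, link_bytes_per_sec, kept_fraction=1.0):
    """Time until the buffer is full if only kept_fraction of the traffic is stored."""
    if not 0 < kept_fraction <= 1:
        raise ValueError("kept_fraction must be in (0, 1]")
    return buffer_bytes / (link_bytes_per_sec * kept_fraction)


# Recording everything vs. recording only the start of each frame.
full = seconds_until_full(BUFFER_BYTES, LINK_BYTES_PER_SEC)
truncated = seconds_until_full(
    BUFFER_BYTES, LINK_BYTES_PER_SEC, KEPT_BYTES / FULL_FRAME_BYTES
)

print(f"recording full frames:      {full:.1f} s")
print(f"recording truncated frames: {truncated:.1f} s")
```

Under these assumptions, recording full frames fills the memory within a few seconds, while keeping only the start of each frame stretches the window by a factor of about 30. Real traffic is rarely this uniform, so treat the numbers as an order of magnitude only.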
Although the whole topic is pretty straightforward, tracers seem to be mystical devices to those unfamiliar with them. Over time I have come across several "urban legends" that sometimes impede troubleshooting a lot:
- "What info? You should see that in the trace!" - Often I get no additional information with a trace (e.g. one consisting of 8 trace files from different channels), which slows down the analysis enormously. I need at least a layout showing exactly where the tracer was connected. I need to know how it was configured and whether the problem really happened during the trace, and I need the data collection of the switch and the devices to compare what I see against the RAS packages. Please help me to help you! :o)
- "We can't put this link down. Is it important where to plug in the tracer?" - Yes, of course it is. As described above, it just records the traffic that enters the tracer. Nothing more. There are no tiny little photon-based nano robots swarming out through the fibres and collecting data. Really. If you plug it in somewhere else, I won't see the problem.
- "Thank you so much for introducing a tracer in our environment. It solved the problem. It has to stay." - No, the tracer did not solve the problem by itself. If the problem somehow vanished when the tracer was cabled in, then a simple portdisable/portenable should have helped as well. The tracers are needed frequently and can't stay in the environment until the end of days.
These were just some of the rumors and statements I have heard in the past. To summarize, please keep in mind:
A tracer is not a magical device. It just records traffic.