Well, a crazy workday kept me from blogging yesterday. I was, however, reminded of an important piece of troubleshooting applications, and even database instances. What was that piece? Never get hung up on a single test box or a single test instance. The reason may be obvious: if you get hung up on a single instance or box, you can miss the actual problem.
Take yesterday, for example. I was helping a customer with a box that had recently been migrated to 11.50.FC5. Their app was crashing every time the engine came online, and in the process it was crashing the Informix engine as well. Now, as a support engineer, you tend to focus on the assertion failure file and the shared memory dump, just as an application developer would focus on debug logs and a core file. Well, to make a long story short, after trying to identify the problem, I finally asked them to test on a separate box that had 11.50.FC5, if they had one. They did have another test box, and when they tested their application there, it did not crash and worked as expected. It turned out there was no problem with Informix or the application; the original test box had significant issues all its own, due to an unforeseen accident that neither the developers nor I had originally been aware of.
It's so easy these days, in this "who's to blame" society, to forget that conditions exist where no one is to blame. Accidents happen, and it's what we do to identify and correct the issue, accidental or not, that helps make our application, and ourselves, successful.