Stories from the field: Performance diagnostics or performance problem?
I just got off the phone with a DBA who started using DB2 Performance Expert (PE) a couple of months ago. This customer, whom I will call “Milo,” after my cat, called to tell me they were finally able to diagnose a pesky performance problem that had been hiding for all these months.
Let me back up to give you some history. When I first talked to Milo, they were having performance problems on all their environments. Before using PE, they usually ran a series of scripts to diagnose performance problems. These scripts helped them collect performance metrics for each partition. To add to the complexity, they had a bunch of different environments to monitor. This meant that Milo would end up with reports all over the place, in different systems and in different directory structures. Then, once he had downloaded them to his workstation, he would need to print them out.
Milo's desk was covered with printouts. He had stacks of printouts on the left, stacks of printouts on the right, and more stacked on his bookcase. Milo's baseball bobbleheads were holding up printouts because Milo had run out of room in his cube. Each area of his cube represented a different environment. Milo even joked that he was drowning in his own printouts, and that the performance reports seemed to multiply whenever you left them alone. We both got a good chuckle out of that.
Fast forward to life after getting PE. With PE, Milo was easily able to pinpoint the performance problems in their large warehousing environments. PE was able to monitor all their different environments, which helped them diagnose a ton of problems because they could see all the performance metrics and how they correlated. For example, PE allowed them to view each partition and compare the partitions.
However, there was one sticky problem in their large warehouse environment (environment A) that they were unable to diagnose. Once they began using PE, that slowdown didn’t show up for months. Then one day, that pesky performance problem finally revealed itself again. Apparently one of the DBAs ran the old performance scripts on environment A, at which point the problem reappeared and the team could finally work on isolating it.
Milo told me, laughing, “We couldn't figure it out... the slowdown happened almost like clockwork. We had our scripts scheduled to capture the performance data and somehow it just happened. We used PE to diagnose the problem, only to discover it was our own poorly written performance scripts that caused the problem.” You see, environment A was built on smaller UNIX boxes, with less memory and other resources. When they ran the performance scripts, they caused the system to run out of memory, thus impacting their DB2 system. Since PE was able to show them what else was using resources on the system outside of DB2, they were able to see the problem immediately. Their scripts didn't check OS resources, but PE does.
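If you do keep home-grown collection scripts around, one cheap safeguard is to have them look at OS memory before they start. Here is a minimal sketch of that idea, not Milo's scripts and not how PE works. It assumes a Linux-style /proc/meminfo layout (other UNIX flavors expose this differently), and the function names, the 10% reserve, and the sample numbers are all made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Assumption: lines look like 'MemTotal:  4194304 kB' (Linux layout).
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def safe_to_collect(meminfo, needed_kb, min_free_fraction=0.10):
    """Return True if a collector needing `needed_kb` would still leave
    at least `min_free_fraction` of total memory available afterwards.

    The 10% default is an arbitrary illustrative threshold.
    """
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return available - needed_kb >= total * min_free_fraction


# Hypothetical snapshot of a small box: 4 GB total, 512 MB available.
SAMPLE = """\
MemTotal:        4194304 kB
MemAvailable:     524288 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    # A collector that wants 256 MB would eat into the reserve here.
    if safe_to_collect(info, needed_kb=262144):
        print("OK to run the collection script")
    else:
        print("Skipping collection: not enough memory headroom")
```

On a real Linux system you would pass in the contents of /proc/meminfo instead of SAMPLE. A check like this would not replace a monitor that watches the whole system, but it keeps the diagnostic script from becoming the problem on a smaller box.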
The person who wrote the original scripts has been gone for a long time. The scripts ran fine and never caused performance problems on the other, larger environments, so there was never any reason to examine them. Ooops :-)
Cheers - Alice Ma