This blog experiment seems to be working. The entries are getting around 100 visits, and the number is growing: good enough to keep at it. I have found that writing the entries has given me the opportunity to clarify and express my thoughts. This entry is a case in point.
We are deploying a BAO solution for the level 3 support organizations in our IBM India Software Labs. That deployment provides a case study in how to integrate two concepts I introduced in earlier blogs. This entry is longer than the others. I hope you find it worth the wait and effort to read.
In those previous entries, I discussed two frameworks for reasoning about measures: GQM(AD) and Outcome and Controls.
These frameworks address different aspects of the problem of using measures to achieve business goals by measuring the right things and taking actions to respond to the measurements. In fact, these frameworks fit together hand in glove.
Recall that level 3 support teams provide fixes to defects found in delivered code. Each of the teams deals with an ongoing series of change requests (aka APARs, PMRs). An organizational goal is to reduce the time and cost to complete these requests. To achieve the goal, the teams are adopting some Rational-supported practices and supporting tools. So the questions that need to be answered are:
1. What is the trend over time of the time to complete change requests?
2. What is the trend over time of the cost to complete change requests?
3. In each case, how would I know that some improvement action resulted in a significant improvement in the trend?
Now comes the hard part: determining the measures that answer the questions. The change requests arrive somewhat unpredictably. Each goes through the fix and release process and presumably gets released in a patch or point release. So at any given time there is a population of currently open and recently closed requests. The measures that answer the questions are time trends of some statistic on some population of change requests.
Each change request requires a different amount of time and effort to complete. So to measure whether the outcome is being achieved, one must reason statistically: defining populations of requests, building the statistical distribution of, say, time to complete for that population, and defining the outcome statistic for the distribution. So we need to do two things to define the measure:
1. Specify the population of requests for each point on the trend line
2. Specify the statistics on that population
To keep it simple (at least as simple as possible), let's form the population by choosing the set of change requests closed in some previous period, say the previous month or quarter. To choose a statistic, one needs to look at the data and pick the statistic that best answers the question. Most people assume the mean of the time (or cost) to complete is the best choice. However, that choice is appropriate only when the histogram of the time to complete is centered on a mean, as is common in normally distributed data.
One of the advantages of working in IBM is that we have lots of useful data. Inspection of some APAR time-to-complete data from one of our teams in the IBM Software Lab in India shows the distribution is not centered on a mean, and so reduction of the mean time to complete is not the best measure of improvement.
We have looked at literally tens of thousands of data points for time to complete of change requests across all of IBM and have found the same distribution. For the statistics-savvy among you, it appears to be a Pareto distribution, but statistical analysis carried out by Sergey Zeltyn of IBM Research’s Haifa lab shows that this distribution does not fit any standard distribution well. A possible explanation is that the time required to fix the defects is Pareto distributed, but since the resources available to fix them are limited, the actual time to complete is not pure Pareto. In any case, a practical way to proceed is to choose a simple (non-parametric) measure: the width of the head, i.e. the time it takes to complete 80% of the requests.
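To make the difference between the mean and the width of the head concrete, here is a minimal Python sketch. The function name head_width, the nearest-rank rule for picking the 80% point, and the ten sample values are all my own assumptions for illustration; they are not real APAR data, and this is not necessarily how the measure is computed in the deployment.

```python
from statistics import mean

def head_width(times, percent=80):
    """Smallest observed time within which `percent`% of the requests completed.

    Uses the nearest-rank percentile rule (an assumption; other rules exist).
    """
    if not times:
        raise ValueError("need at least one completed request")
    ordered = sorted(times)
    # Integer ceiling of percent * n / 100, giving a 1-based rank.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[rank - 1]

# Hypothetical days-to-complete for ten closed requests.
# Two long-running requests form the tail of the distribution.
days_to_complete = [2, 3, 3, 4, 5, 6, 8, 10, 60, 120]

print(mean(days_to_complete))        # pulled upward by the two tail requests
print(head_width(days_to_complete))  # days within which 80% of requests closed
```

In this made-up sample the mean is dominated by the two tail requests, while the width of the head describes what happened to the bulk of the requests.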
So with this analysis in place, the organization decides to specify the goal precisely, such as a 15% reduction in the time and cost to complete 80% of the requests closed each month. So the outcome measures are the time and the cost it took to complete 80% of the requests closed each month.
Having chosen these measures, we are ready to identify the data sources and instrument the measures. So far so good. But wait, we still need to answer question 3.
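As a rough sketch of how the time measure could be instrumented, the Python below groups requests by the month in which they closed, takes the 80% point of each month's time to complete, and compares two months against a 15% reduction target. The record layout (opened and closed dates), the function names, the nearest-rank rule, and the sample data are hypothetical; the real data sources will look different.

```python
from collections import defaultdict
from datetime import date, timedelta

def monthly_head_width(requests, percent=80):
    """Map (year, month) of closing to the days within which `percent`% of
    that month's closed requests were completed (nearest-rank rule)."""
    by_month = defaultdict(list)
    for opened, closed in requests:
        by_month[(closed.year, closed.month)].append((closed - opened).days)
    trend = {}
    for month, days in sorted(by_month.items()):
        days.sort()
        rank = (percent * len(days) + 99) // 100  # integer ceiling, 1-based
        trend[month] = days[rank - 1]
    return trend

def meets_goal(baseline, current, reduction=0.15):
    """True if `current` is at least `reduction` below `baseline`."""
    return current <= baseline * (1 - reduction)

# Hypothetical requests: (opened, closed) date pairs, five per month.
jan_close, feb_close = date(2010, 1, 31), date(2010, 2, 28)
requests = (
    [(jan_close - timedelta(days=d), jan_close) for d in (2, 4, 6, 20, 90)]
    + [(feb_close - timedelta(days=d), feb_close) for d in (1, 3, 5, 16, 70)]
)

trend = monthly_head_width(requests)
print(trend)
print(meets_goal(trend[(2010, 1)], trend[(2010, 2)]))
```

The cost measure would follow the same pattern, with a cost figure per request in place of the elapsed days. Note that this only compares two points on the trend line; it says nothing yet about whether the change is statistically significant.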
As I mentioned, in order to improve the outcome measure and achieve the goals, the lab teams have agreed to adopt appropriate Rational practices and tools to automate certain processes. The practices were selected using the Rational MCIF Value Tractability Trees (a development causal analysis method). Adopting and maturing the practices and their automations are the controls. Some control examples are automating the regression test and build process, and adopting a stricter unit test discipline to reduce time lost in broken builds. These are control mechanisms with associated control measures such as time-to-build, regression test time-to-complete, percent of code unit-tested, and a self-assessment by the team of their adoption of testing and build practices.
To answer question 3, we need statistical analytics to determine if the changes in the control measures have had a significant impact on the outcome measures. Our Research staff has settled on those analytics, but I will discuss that in a later entry. This entry is already too long.
This case study is both reasonably straightforward and far from trivial. It does show, as promised, that GQM(AD) and Outcome and Controls work together. I leave you all with a thought problem. How would you apply the pattern to teams developing new features for existing applications?