Folks who have heard me present will recognize the following discussion as a variation of an example I have used to explain the importance of variance in software and system estimates. Imagine this time that you are a development organization manager given the following artificial opportunity. You can agree to the following deal: have your teams, at your own expense, develop some applications, each meeting a given set of requirements. The client really wants the applications, will accept them if they are acceptable, and is perfectly willing to be consulted throughout the projects. Here is the catch: if you deliver a project on time, in 12 months, you will receive $1M for that application. If you are a day late, you get nothing. You have to decide whether to take the deal.
Let's suppose you take the projects to your estimators and they tell you the estimated time to complete is 11 months and the estimated cost to complete is $750K for each of the projects. So you stand to make an estimated $250K per project. You staff up as much as you can and take on three projects, looking forward to your bonus. Was this a good deal?
Those who have read The Flaw of Averages by Sam Savage, illustrated by Jeff Danziger, already know the answer. Those who haven’t read the book should. This book nicely captures the sort of statistical reasoning that underlies IBM Rational’s approach to business analytics and optimizations (found in the RTC agile planner and the ROI calculations in Focal Point). Some key rules:
- Uncertain quantities are captured by curves called distributions (e.g. the bell-shaped curve of a normal distribution).
- Most distributions for uncertain quantities are not normal, bell-shaped curves, i.e. normal distributions are abnormal.
- Calculating with averages in any case yields the wrong answer, with business-critical effects. Rather, one should calculate with the distributions. This is done with Monte Carlo methods.
Back to the example: The time to complete is an uncertain quantity and so must be described by a distribution. Often, the estimate returned by the estimator is the mean of that distribution. The distribution may be pretty wide and so may look like Figure 1 of the attached document. (I have had bad luck trying to embed figures in the blog, so I have put the figures in this attachment.) Note that 40% of the distribution lies beyond 12 months.
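To get a feel for how wide such a distribution has to be, here is a minimal sketch. It assumes the schedule distribution is normal with its mean at the 11-month estimate (an assumption made for illustration only) and asks what standard deviation would put 40% of the probability beyond the 12-month deadline. The function name `implied_sigma` is hypothetical.

```python
from statistics import NormalDist

def implied_sigma(mean_months, deadline_months, p_late):
    """Standard deviation of a normal distribution with the given mean
    that puts probability p_late beyond the deadline."""
    # z is the distance from mean to deadline in standard deviations.
    z = NormalDist().inv_cdf(1.0 - p_late)
    return (deadline_months - mean_months) / z

# Assumed inputs: 11-month mean, 12-month deadline, 40% chance of being late.
sigma = implied_sigma(11.0, 12.0, 0.40)

# Sanity check: rebuild the distribution and recompute the late probability.
schedule = NormalDist(mu=11.0, sigma=sigma)
p_late = 1.0 - schedule.cdf(12.0)

print(f"implied sigma: {sigma:.2f} months")
print(f"P(late): {p_late:.2f}")
```

Under these assumptions the implied standard deviation comes out at roughly four months, which is what "pretty wide" means for an 11-month estimate.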
Assuming the $750K cost to complete estimate is dead on, let's apply some simple high school probability to get the distribution of profit (see Figure 2):
· The chance of succeeding at all three projects and getting $3M in revenue is (0.6)^3 = 0.216
· The chance of succeeding at exactly two projects and getting $2M in revenue is 3(0.6)^2(0.4) = 0.432
· The chance of succeeding at exactly one project and getting $1M in revenue is 3(0.6)(0.4)^2 = 0.288
· The chance you will fail at all three projects, yielding no revenue, is (0.4)^3 = 0.064
The weighted average of the distribution of revenues is
(0.216)($3M) + (0.432)($2M) + (0.288)($1M) + (0.064)($0) = $1.8M
So the expected outcome of your 3 × $750K = $2.25M expense is a loss of $450K.
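The arithmetic above can be checked with a short sketch. It assumes the three projects succeed or fail independently, each with the same on-time probability, which is what the high school calculation implicitly does. The function names are hypothetical.

```python
from math import comb

def revenue_distribution(n_projects, p_on_time, payment):
    """Map each possible total revenue to its probability, assuming the
    projects succeed or fail independently (a binomial model)."""
    return {
        k * payment: comb(n_projects, k)
        * p_on_time**k
        * (1 - p_on_time) ** (n_projects - k)
        for k in range(n_projects + 1)
    }

def expected_profit(n_projects, p_on_time, payment, cost_per_project):
    """Probability-weighted revenue minus the fixed total cost."""
    dist = revenue_distribution(n_projects, p_on_time, payment)
    expected_revenue = sum(rev * prob for rev, prob in dist.items())
    return expected_revenue - n_projects * cost_per_project

dist = revenue_distribution(3, 0.6, 1_000_000)
for revenue, prob in sorted(dist.items(), reverse=True):
    print(f"revenue {revenue:>9,}: probability {prob:.3f}")

profit = expected_profit(3, 0.6, 1_000_000, 750_000)
print(f"expected profit: {profit:,.0f}")
```

Because the on-time probability is a parameter, the same two functions can be reused to see how sensitive the deal is to that one number.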
But wait, it is worse. The distribution is probably not normal. Programs are more likely to be late than early, and so their distributions are skewed to the right. In a distribution like this, the most likely value (the mode) is less than the 50% point (the median). So if the estimate reflects the most likely completion time, then, as shown in Figure 3, it is possible to have an estimate of 11 months and a likelihood of failure of 50%. The revenue distribution is given in Figure 4. In this case, the weighted average of the distribution of revenues is
(0.125)($3M) + (0.375)($2M) + (0.375)($1M) + (0.125)($0) = $1.5M
In this case the expected loss is $750K.
But wait, it is still worse. The cost to complete is also uncertain. To keep things as simple as possible, let's suppose the cost to complete for each of the projects is described by three values: the best case is $700K, the likely case is $750K, and the worst case is $1M. Computing the expected profit in this case requires using these values as parameters for a triangular distribution (see Figure 5) and then applying Monte Carlo methods to get the distribution that describes the profit. The result is shown in Figure 6. Briefly, in this case:
· The most likely outcome is a loss of $945K
· There is a 90% certainty of losing at least $805K
· There is a 10% chance of losing more than $1.1M
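For readers who want to experiment, here is a minimal Monte Carlo sketch of this kind of calculation. It is not the model behind Figure 6. It assumes three independent projects, a 50% chance of delivering each on time, and a per-project cost drawn from a triangular distribution with the three values above, so its percentiles need not match the figures quoted. The function name `simulate_profit` and the fixed seed are illustrative choices.

```python
import random

def simulate_profit(n_trials, n_projects=3, p_on_time=0.5,
                    payment=1_000_000, cost_low=700_000,
                    cost_mode=750_000, cost_high=1_000_000, seed=0):
    """Return one simulated total profit per trial.

    Assumptions (not from the original model): projects are independent,
    cost and schedule are independent, and costs are triangular."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    profits = []
    for _ in range(n_trials):
        profit = 0.0
        for _ in range(n_projects):
            # random.triangular takes (low, high, mode) in that order.
            cost = rng.triangular(cost_low, cost_high, cost_mode)
            revenue = payment if rng.random() < p_on_time else 0
            profit += revenue - cost
        profits.append(profit)
    return profits

profits = simulate_profit(100_000)
mean_profit = sum(profits) / len(profits)
p_loss = sum(p < 0 for p in profits) / len(profits)

print(f"mean profit: {mean_profit:,.0f}")
print(f"chance of losing money: {p_loss:.1%}")
```

Under these assumptions the mean cost per project is ($700K + $750K + $1M) / 3, about $817K, so the simulated mean profit should land near a loss of $950K. The point of running the simulation rather than doing that arithmetic is that it gives the whole distribution of outcomes, not just one number.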
So taking this deal is at best career limiting!
Notice that by ignoring the rules, one is tempted to make a bad deal. Applying each of the rules with more discipline shows how bad the deal is. The moral of all this is that making business decisions based on calculations of averages can lead to disastrous outcomes.
This moral needs to be taken to heart by our industry. Far too often, managers faced with funding projects or making business commitments insist, “Just give me the number.” What they need is a distribution; the number they are given is likely to be an average. Decisions based on that number will likely go sour. No wonder software and system business outcomes rarely delight their stakeholders. The good news is that there are robust, proven techniques to avoid the flaw of averages.