Murray Cantor
Risks of what?
Yesterday, I was at the Conference on System Engineering Research (CSER) held this year at Stevens Institute. I sat through a talk which stimulated my curmudgeon tendencies. In the spirit of hopefully generating some contraversy, I will not hold back.
The talk was about an expertsystem based engineering risk management system. Essentially, the authors got a set of experts together to identify catagories of risks (people, delivery, product ...), risks in the categories, and a method for identifying level of risk and their consequence and then summing the products of the levels and the consequences. The end is the total amount of category risk. Looking at the output is supposed to give you insight of the overall program risk and the contributing risks. My problem is that I cannot parse the last sentence. In fact I do not understand terms like "program risk" and say "people risk". There may be a clash of cultures here; to many those terms seem reasonable. My argument starts here: One can ask 'What is my risk of going over budget?' or 'what is my risk of missing the delivery date?' The answers to these sort of questions are answered using stardard business analytics. See, for example, Mun's text on risk analytsis that defines risk as statistical uncertainty of a quantity that matters. For example, 'time to complete' is a quantity that does matter to a project. The uncertainty in making the date can be measuresd as the variance (or standard deviation) of the estimate of the timetocomplete. (Note, for the math aware, timetocomplete is what the statisticians call a continuous random variable.) So the answer to the question, 'what is my schedule risk?' has an unambiguous, quantified answer. What is 'my people risk' has no such answer. In fact, 'people risk' is not a concept defined in business analytics. Of course, it does make sense to ask what contributes to the schedule risk. One might fear that the inability to staff the project contributes to the schedule risk. Fair enough. In my mind, that does not make staffing a 'risk', but say a schedule risk factor. I am not sure why I am so adamant about this, but I am. It could be that I believe that the less precise use and measurement of risk is holding our industry back. Anyone want to comment or defend the socalled risk management practice underlying the talk I found so annoying ? 
Measurement Principles
In a conversation with a development lab productivity team, I was reminded that the first challenges software and system organizations face when starting an improvement program is 'what to measure'. In particular, some organizations start with what is easy to measure with the understandable thought "we need to start somewhere". I have found this approach tends not get traction. Over the years I have settled on some principles that seem to apply:
I suspect there some other principles that should be added. But these are a good start. In a later blog, I will discuss levels of measurement. Comments? 
Arithmetic with Random VariablesThis entry is a followon to my most recent entry. The idea is that random variables are are the way describe the uncertain quantities that arise in managing development efforts. They are a natural extension of the fixed variables we all grew up with. In fact, a fixed variable in a random variable that has probability one of taking a given value and probability zero of taking any other value. In this entry, I explain how you can (with computer assistance) calculate with random variables. Suppose you want to add two random variables, v1 and v2. This need might arise if you have two serialized tasks in a Gantt chart, each described by a distribution as explain in the previous entry and you would like to know how long it would take to complete both of them, i.e. the sum of their durations. How would you proceed? First note the sum would be another random variable. Therefore what you need is the probability distribution of the sum. There is no formula for that distribution, but there is an effective, commonly used numerical approach, known as Monte Carlo simulation. The idea behind Monte Carlo simulation is to use a pseudo random number generator take a sample value of v1 and a sample value of v2 and then add them. For more detail, follow this link. Note that the values are selected according to the probability distributions of each of the variables. The more likely values are taken more often. Now save that sum and do the same thing many times, say 100,000 times, and store each of the sums. For each of the sums, you can compute its probability by looking at its frequency in the collection of saved sums (some sums are more frequent than others) and divide by the number of samples (actually you have to round the sums to get the counts). What you get is an approximation of the distribution of the sums. Lets look at an example, if v1 has a triangular distribution with L = 3, E = 4, H=7, as shown in Figure 2 and v2 has a triangular distribution with L = 1, E = 6, H= 7 as seen figure 3. The distribution of the sum, found using the Monte Carlo simulator in Focal Point, is given by figure 3. First note the sum is not another triangular distribution. It is
smoother. This is to be expected from the mathematics of probability. On
the other hand, the distribution of the sum makes sense. For example,
we would expect the most likely value of the sum to be 10, the sum of
the two most likely values, but the simulation found 9.98. The
discrepancy is due to chance and would diminish with more samples. Also
note the probability is zero below 4, the sum of the lows, and above 14,
the sums of the highs. For fun, here is the distribution of the product for the variables: The reader can check if this looks sensible. Note also that the product does not have a triangular distribution. The peak is much smoother. So random variables can be used in place of fixed variables in any computation. So they have all of the utility of fixed variables and enable us to express uncertainty. They may seem foreign at first, but they are worth the trouble to learn. Like anything else, they become intuitive after a while. 
Economics of QualityI have the honor of giving one of the keynotes at the Conseg2011 conference this February in Bangalore. I have chosen a large, perhaps overly ambitious topic: "The Economics of Quality". Here is my conference proceedings document. My goals in preparing the paper and presentation is to make the case is that quality is fundamentally an economics concern and to suggest an overall approach for reasoning about when the software has sufficient quality for shipping. For those who have read some of my earlier entries, you will see have different my thinking on the topic differs from those who take a technical debt approach. Anyhow, the brief proceedings paper is very high level and there is considerable work to filling in the details and validating the approach. This paper really is the beginning of a program that I believe, when carried out, will have great benefit to our industry. So, I would like to hear from anyone who has similar interests and perspectives. There must be some existing relevant research. Perhaps we find enough likeminded folks to build a community exploring the topic. 
The Flaw of Averages in Software and Systems
mcantor@us.ibm.com
Tags:
software_estimates
flaw_of_averages
system_estimates
2 Comments
4,220 Views
Folks who have heard me present will recognize the following discussion as a variation of what I have used as an example to explain the importance of variance in software and system estimates. Imagine this time you are a development organization manager given the following artificial opportunity. You can agree to the following deal: Have the teams at your own expense develop some application, each meeting a given set of requirements. The client really wants the applications and will accept them if acceptable and perfectly will to be consulted throughout the projects. Here is the catch: if you deliver the projects on time in 12 months, you will receive $1M per application. If you are a day late, you get nothing. You have to decide whether to take the deal. Lets suppose you take the projects to your estimators and they tell you the estimated time to complete is 11 months and the estimated cost to complete is $750K for each of the projects. So you stand to make an estimated $250K per project. So you staff up as much as you take on three projects looking forward to your bonus. Was this a good deal? Those who have read The Flaw of Averages by Sam Savage and Dan Denziger already know the answer. Those who haven’t read the book should. This book nicely captures the sort of statistical reasoning that underlies IBM Rational’s approach to business analytics and optimizations (found in the RTC agile planner and the ROI calculations in Focal Point). Some key rules:
Back to the example: The time to complete is an uncertain quantity and so must be described by a distribution. Often, the estimate returned by the estimator is the mean of that distribution. The distribution may be pretty wide and so may look like Figure 1 of the attached document. (I have had bad luck trying to embed figures in the blog and I have put the figures in this this attachment.) Note that 40% of the distribution lies beyond 12 months. Assuming the $750K cost to complete estimate is dead on, lets apply some simple high school probability to get the distribution of profit (See Figure 2): · The chance of succeeding at all three projects and getting $3M is revenue.is (0.6)^{3}=0.216, · The chance of succeeding at exactly two projects and getting $2M in revenue is 3(0.6)^{2}(0.4)=0.432 · The chance of succeeding at exactly one project and getting $1M in revenue is 3(0.6)(0.4)^{2}=0.288 · ^{ }The chance you will fail at all three projects yielding no revenue is (0.4)^{3}=0.064. The weighted average of the distribution of revenues is (0.216)($3M) + (0.432)($2M) + (0.288)($1M) + (0.064)($0) = $1.8M So the likely outcome of your (3)$750K = $2.25M expense is a loss of $450K. But wait, it is worse. The distribution is probably not normal. Programs are more likely to late than early and so are skewed to the right. In this case the average (i.e. the mean) is less than the 50% point. So, as shown in Figure 3, it is possible to have the estimate of 11 months and the likelihood of failure is 50%. The revenue distribution is given in Figure 4. In this case, the weighted average of the distribution of revenues is (0.125)($3M) + (0.375)($2M) + (0.375)($1M) + (0.125)($0) = $1.5M In this the expected loss is $750K. But wait, it is still worse. The cost to complete is also uncertain. To keep things as simple as possible, lets suppose the cost to complete for each of the projects is described by three values: best case is $700K, the likely case is $750K, and the worse case is $1M To compute the expect profit in this case requires using this values as parameters for a triangular distribution (see Figure 5) and then apply Monte Carlo methods to do the calculation to get the distribution that describes the profit. The result is shown in Figure 6. Briefly in this case: · The most likely outcome is a loss of $945K · There is a 90% certainty of losing at least $805K · There is a 10% chance of losing more than $1.1M So taking this deal is at best career limiting! Notice by ignoring the rules, one is tempted to make a bad deal. Applying each of the rules with more discipline shows how bad the deal is. The moral of all this is that making business decisions based on calculations of averages can lead to disastrous outcomes. This moral needs to be taken to heart by our industry. Far too often, managers when faced with making funding projects or business commitments insist, “Just give me the number.” What they need is a distribution; the number they are given is likely to be an average. Decisions based on the number will likely go sour. No wonder the software and system business outcomes rarely delight their stakeholders. The good news is that there are robust, proven techniques to avoid the flaw of averages. 
What were they Thinking?
First, some personal disclosure: In the late 1980’s, I
worked for a while at Shell Research, developing seismic modeling and data
imaging algorithms. (See
this link.) While there I received training on oil exploration. I
remain awed by the passion, expertise, daring, and discipline of the engineers,
scientists, technicians, and skilled laborers who take responsibility for
providing the hydrocarbons we completely rely on.
Oil exploration is remarkably costly and risky. Even then in the late 1980’s, it was not uncommon to spend $1B on an exploration well, hoping to find oil based on the seismic data only to find it dry. At Shell, I was on a team that developed, for the time, a highly computeintensive algorithm for imaging seismic data captured in complex subsurfaces. They literally bought us a Cray since running the algorithm might make a marginal difference in the success rate of exploration drilling. Hence, I am not an oil industry basher, far from it. So I have been watching the BP, Deepwater Horizon gulf catastrophe with great interest. In this entry, I will share what I have gleaned from various news sources. (I have found the Wall Street Coverage very credible). So here is my net: Recall, the blowout occurred shortly after capping an exploration well (a will drilled solely to confirm the presence of a oil resevior). The depth of the well, reported 18,000 ft, is no big deal. The depth of the water, 5000 ft, is far from the record of around 8000 ft. So the well itself was routine for the industry. So what happened? BP had drilled many of these wells. In fact, ironically the blowout occurred while BP executives were celebrating their safety record. However, over the last few years have become profitable by building a very costconscious culture. Such a culture is likely to cut corners, repeatedly taking small risks in business operations. Such behavior may be rational if you believe the total liability is bounded. There is reason to believe BP’s liability is ‘capped’ at $75M. This culture seemed to be at work on the oil platform:
I bet that BP managers routinely made the same decisions for years with no adverse outcomes. They were probably rewarded for this behavior. Such a culture makes such disasters inevitable over time. A great case study of such cultures is found in Diane Vaugh’s The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA. She studies the NASA culture that led to the decision to launch the shuttle that exploded on launch killing ‘the first teacher in space’ while tens of thousands of school children watched on television. The managers overruled the engineers who advised them that the temperature was out of spec for a launch. As she explains, the managers had gotten away with taking similar risks in the past and had decided to bow to political pressures and approve the launch. So what they were thinking is something like, “These risks are no big deal and the savings matter.” A key moral is that over time the unlikely becomes inevitable. Further, experience and past data lead to exactly the wrong behavior: they reward the risk taking, not the caution. Many have made exactly the same point about behavior of the financial firms during the financial meltdown. Internalizing this moral and acting accordingly is key to our industry. Increasing, we will be building life critical, economic critical systems that are very complex, will operate over long periods, and whose failure could be catastrophic. There is no turning away from this inevitability. So, we all need to understand that cost savings must be balanced by a clear understanding of the overall risks of failure, their consequences and the real return of investment in failure avoidance. This of course takes some math. In particular, thinking about averages is not useful. That is topic of my next entry. 
Quantum Project Management
Today, April 1, seems like a good day to bring forward an important new idea. In fact, I think this may be the next big thing.
One of the wellunderstood problems with software development project management is that it is often impossible to completely specify the complete work breakdown with certainty. The longer the project and the more innovative the project, the more uncertain the work breakdown items. This is addressed in iterative, agile planning by identifying the summary work items and then adding detail as the project evolves. Another source of uncertainty is the dependency between the summary items. This uncertainty in turn makes critical path analysis for such programs problematic. In fact there is a whole ensemble of project critical paths, each with some likelihood. For the physics literate, this ensemble of paths is much like Feynman Path Integrals in quantum theory. The math is pretty hairy (see this elementary description). Fortunately, as Feyman also pointed out, one can simulate quantum mechanics with quantum computers. I am no expert in quantum computing, but even so I have a proposal: Quantum Informed Projects (QuIPs). The idea is to represent work items as QItems using QBits from quantum computing.. Then we can represent the project as a set of entangled QItems and using a suitablly large quantum computer to calculate the wave function for the critical path. My understanding is that we do not yet have large enough enough quantum computers to make this practical. However, the same is true for implementing other useful quantum algorithms (see this example). So we can start by building algorthms. There is no time like the present (not accounting for the quantum uncertainty of measuing time) So on this special day, lets turn our attention to QuiPs. 
Toyota's Dilemma
I have mentioned in the first posting, I am still getting the hang of blogging. I guess one use of blogs is to share what on my mind while staying in the neighborhood of the topic of analytics. So, I have been putting a lot of thought to Toyota's diemma about how to deal with the reports of dangerous acceleration in their cars. The recent reports of Prius incidents (see this article in the New York Times) confirmed some of my earlier suspicions and hence this blog.
First, I need to come clean; all I know is from news accounts. I have had no contact with Toyota or any IBMers working with Toyota. Further, I need to say the the opinions here, in my opinion not controversial, are my own and do not reflect any IBM position. So what do we know:
Now, say there is one chance in a million miles of driving of the latent defects manifesting. They may be impossible to find with standard testing and will inevitably happen every so often to drivers. This is the standard insight that with large volumes unlikely events become inevitable. So with Toyota's large sales, they may be the victim of their success. The avionics community has developed a discipline around safetycritical software. There are design and model testing methods to validate that the embedded software is good enough to stake people's lives on the code running correctly. (There is a good article is the latest Communications of the ACM on model checking for avionis) It seems Toyota and the entire auto industry needs to adopt these safetycritical disciplines going forward. The cost of these practices is overshadowed by the costs of costs of the highly publicized incidents, the suits, and other liability. 
Embracing UncertaintyAn IBM colleague, Jim Densmore, has the phrase 'Embrace Uncertainty' below his email signature. I know Jim well enough to believe I know what he means. In fact, I used to give a presentation with the that title. The idea is that uncertainty is an aspect of the world as it is. If you constantly deny uncertainty and expect to be able to predict the future as if it were certain, all sorts of disappoints ensue. If, instead, you understand and accept the uncertainty, you are more in harmony with the world. Uncertainty is not your enemy, it just is a part of life. What brought this mind is that one of the world's leading poker players, Annie Duke, used the phrase in the latest episode of Radiolab, one of my favorite NPR programs. She explains how embracing uncertainty is the key to being a great poker player, not gimmicks like reading 'tells'. Follow the link and check it out. Of course, readers of the blog know that I believe this applies to our domain. Developing software is systems is fraught with uncertainty. One should embrace it and learn to manage it by applying the right analytics.

Eliciting 90% confidence Triangular Distributions
mcantor@us.ibm.com
Tags:
anything
triangular
estimations
distributions
measure
how
to
1 Comment
5,280 Views
In a previous entry, I discussed triangular distributions. I pointed out that they arose from the practices in Hubbard's How to Measure Anything. When making an estimate, one asks for the high, low and expected values of a quantity. These are used to to define a probability distribution by interpreting the results to mean that there is zero probability that the value is less than the low or greater than the high and the mode of the distribution is at the expected. You get a distribution like the figure below. A reader, Blaine Bateman, president of EAF LLC, found the elicitation question too restraining. After all, saying there is no probability of a value being below the low may call for an unreasonable level of certainty. He prefers asking the asking the question, "Give me low and high values that in which you are 90% confidence." This raises an interesting (at least to me) mathematical question, "Can you specify the triangle given the mode and the 90% of the distribution". That is, can you specify the triangle given points c and d rather than a and b. Blaine has a couple of solutions based a choice of addition assumptions you need to make to get a unique solution. His paper is found following this link. I recommend it, It is a nice piece of work.
