Yesterday, I was at the Conference on System Engineering Research (CSER), held this year at Stevens Institute. I sat through a talk which stimulated my curmudgeon tendencies. In the spirit of hopefully generating some controversy, I will not hold back.

The talk was about an expert-system based engineering risk management system. Essentially, the authors got a set of experts together to identify categories of risk (people, delivery, product, ...), risks within each category, and a method for assigning each risk a level and a consequence, then summing the products of the levels and the consequences. The result is a total amount of risk per category. Looking at the output is supposed to give you insight into the overall program risk and the contributing risks.

My problem is that I cannot parse that last sentence. In fact, I do not understand terms like "program risk" or, say, "people risk". There may be a clash of cultures here; to many, those terms seem reasonable.

My argument starts here: One can ask, 'What is my risk of going over budget?' or 'What is my risk of missing the delivery date?' Questions of this sort are answered using standard business analytics. See, for example, Mun's text on risk analysis, which defines risk as the statistical uncertainty of a quantity that matters. For example, 'time to complete' is a quantity that matters to a project. The uncertainty in making the date can be measured as the variance (or standard deviation) of the estimate of the time-to-complete. (Note, for the math-aware: time-to-complete is what statisticians call a continuous random variable.) So the question 'What is my schedule risk?' has an unambiguous, quantified answer. 'What is my people risk?' has no such answer. In fact, 'people risk' is not a concept defined in business analytics.
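To make the definition concrete, here is a minimal sketch in Python. The numbers are hypothetical sampled estimates of time-to-complete (as might come out of a Monte Carlo simulation of the plan); the point is only that "schedule risk" reduces to an ordinary statistical quantity:

```python
import statistics

# Hypothetical sampled estimates of time-to-complete, in weeks
# (e.g., draws from a Monte Carlo simulation of the project plan).
estimates = [48, 52, 55, 50, 61, 47, 58, 53, 49, 57]

mean_ttc = statistics.mean(estimates)   # expected time-to-complete
risk = statistics.stdev(estimates)      # schedule risk as the standard deviation

print(f"expected completion: {mean_ttc:.1f} weeks, schedule risk (std dev): {risk:.1f} weeks")
```

Nothing here depends on expert scoring scales; the risk is just the spread of a measurable quantity.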

Of course, it does make sense to ask what contributes to the schedule risk. One might fear that the inability to staff the project contributes to the schedule risk. Fair enough. In my mind, that does not make staffing a 'risk', but rather a schedule risk factor.

I am not sure why I am so adamant about this, but I am. It could be that I believe the imprecise use and measurement of risk is holding our industry back.

Anyone want to comment on or defend the so-called risk management practice underlying the talk I found so annoying?

## Comments (6)

**1. santosg** commented:

Murray,

I think this is due to different views and perspectives on what "risk" means here (the "clash of cultures" you refer to). To a great extent this is due to the ambiguity of the term itself and how it is used.

A variation of this second class of approaches to risk goes beyond uncertainty (variance) alone and couples it with consequence... i.e., a volatile portfolio is not risky if its returns have little probability of ending up below a given benchmark/goal.

I wonder if most of the decisions involving software relate more to the latter than to the former. Is the software world represented by unbounded, unknown distributions?

**2. mcantor@us.ibm.com** commented:

Thanks, santosg. I really did want to start a dialog. Your defense of some of the common thinking is challenging and has forced me to think further.

**3. mcantor@us.ibm.com** commented:

One more thought. Given the distribution of, say, the time-to-complete (taken as a random variable) and a fixed delivery date, one can compute the area of the distribution that lies past the deadline. That area is the likelihood of missing the date. One could then define the schedule risk as that likelihood. So if 80% of the estimate's distribution lies past the delivery date, the project is 80% likely to be late.
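The tail-area computation in comment 3 can be sketched in a few lines, under the assumption (mine, not the comment's) that the time-to-complete estimate follows a normal distribution with a known mean and standard deviation:

```python
import math

def prob_late(mean, std, deadline):
    """P(time_to_complete > deadline), assuming a normal model of the estimate."""
    z = (deadline - mean) / std
    normal_cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # P(T <= deadline)
    return 1.0 - normal_cdf                                  # area past the deadline

# Illustrative numbers: expected 120 days of work against a 100-day deadline.
print(f"chance of being late: {prob_late(120, 10, 100):.0%}")  # ~98%
```

With a skewed or empirical distribution you would integrate (or count Monte Carlo samples) past the deadline instead, but the idea is the same: schedule risk as a probability of an event that matters.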

**4. santosg** commented:

Agreed, Murray (I'm one of those converted in favor of a stricter notion of risk :).

Saying that "this project has 80% probability of being late" is a very straightforward and useful way of assessing its risk. Although this is based on underlying computations of variance, it is a much more compelling and intuitive way of operationalizing risk than saying "this project has x variability in time-to-complete" (i.e., translating the variance into a probability statement about an event that matters is a more appealing characterization).

The other side of this, though, is the confidence that we can give consumers of the reliability/validity of that probability statement. How confident are we when we tell you that the probability of being late is 80%? If it is based on an approximation to an unknown distribution (a distribution whose parameters are unknown)... how "good enough" are those approximations? How "good" will the decisions driven by those approximations be?

Experimentally, it would require replicating the exact conditions of a project multiple times to see if the actual outcomes match the probability statement. The problem, it seems to me, is that replicating the exact conditions surrounding a project could be as difficult as replicating the conditions surrounding the financial markets at a particular point in time (this is where the equity portfolio extrapolations can be relevant). This is important, because we would need to be prepared to answer a corporate executive asking us, "How much can I bet on your probability estimation being well calibrated?" (i.e., my decision to do something would need to be based not only on the probability of the project being late, but on the accuracy of that probability). However, I should look at Hubbard's writings, as it looks like there are ways around this.

**5. ClayEW** commented:

Hello all:

**6. santosg** commented:

Good points, Clay.

Thinking about your "Mitigation" point, and taking it back to the talk at the Stevens Institute conference that caused Murray's dismay... there may be a salvageable aspect in their approach: the concept of breaking down the sources of risk. Assuming the right conceptualization of risk (e.g., "risk of going over budget" or "risk of missing the delivery date" instead of the ambiguous "people risk" or "program risk")... the notion of breaking down the sources of that risk is useful. This may have been their ultimate goal (though perhaps approached in a misguided way).

It would be a legitimate effort to figure out the sources of risk... the sources of variability. This would amount to an analysis of variance, where we could tell people what % of the uncertainty is due to which sources. Ultimately this would translate into the (more consumable)... "your chances of delivering late would decrease from 80% to 40% if you manage to staff these 10 pieces of work with the right skill", or "your chances of running over budget would go from 60% to 30% if you cut in half the cost rate of a third of your staff. That chance could be cut another 15% further if you improve your quality by just 5%".
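A statement like "staffing the work drops the chance of being late from 80% to 40%" can be produced by comparing scenarios under a simulation. Here is a minimal Monte Carlo sketch; the scenario parameters (means, spreads, deadline) are entirely illustrative, and a normal model of time-to-complete is my assumption:

```python
import random

random.seed(1)

def p_late(mean, std, deadline, trials=100_000):
    """Monte Carlo estimate of P(time_to_complete > deadline),
    assuming a normal model of the time-to-complete estimate."""
    late = sum(1 for _ in range(trials) if random.gauss(mean, std) > deadline)
    return late / trials

# Hypothetical scenarios against a 52-week deadline:
baseline = p_late(mean=58, std=6, deadline=52)  # current (understaffed) plan
staffed = p_late(mean=50, std=4, deadline=52)   # plan with the key roles filled

print(f"late risk: baseline {baseline:.0%}, fully staffed {staffed:.0%}")
```

The difference between the two probabilities is exactly the kind of consumable, decision-ready number the comment describes, and the same comparison extends to cost-rate or quality scenarios.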

This could get even more interesting in the multivariate, inter-related case. Like: "Your chance of delivering late is 80%, and the chance of running over budget is 60%. If you decrease the chance of being late by 10%, you would decrease the chance of running over budget by 10%. However, if you decrease the probability of running over budget, you increase the chance of delivering late by 25%."