Yesterday, I was at the Conference on System Engineering Research (CSER) held this year at Stevens Institute. I sat through a talk which stimulated my curmudgeon tendencies. In the spirit of hopefully generating some contraversy, I will not hold back.
The talk was about an expert-system based engineering risk management system. Essentially, the authors got a set of experts together to identify catagories of risks (people, delivery, product ...), risks in the categories, and a method for identifying level of risk and their consequence and then summing the products of the levels and the consequences. The end is the total amount of category risk. Looking at the output is supposed to give you insight of the overall program risk and the contributing risks.
My problem is that I cannot parse the last sentence. In fact I do not understand terms like "program risk" and say "people risk". There may be a clash of cultures here; to many those terms seem reasonable.
My argument starts here: One can ask 'What is my risk of going over budget?' or 'what is my risk of missing the delivery date?' The answers to these sort of questions are answered using stardard business analytics. See, for example, Mun's text on risk analytsis that defines risk as statistical uncertainty of a quantity that matters. For example, 'time to complete' is a quantity that does matter to a project. The uncertainty in making the date can be measuresd as the variance (or standard deviation) of the estimate of the time-to-complete. (Note, for the math aware, time-to-complete is what the statisticians call a continuous random variable.) So the answer to the question, 'what is my schedule risk?' has an unambiguous, quantified answer. What is 'my people risk' has no such answer. In fact, 'people risk' is not a concept defined in business analytics.
Of course, it does make sense to ask what contributes to the schedule risk. One might fear that the inability to staff the project contributes to the schedule risk. Fair enough. In my mind, that does not make staffing a 'risk', but say a schedule risk factor.
I am not sure why I am so adamant about this, but I am. It could be that I believe that the less precise use and measurement of risk is holding our industry back.
Anyone want to comment or defend the so-called risk management practice underlying the talk I found so annoying ?
It should not be a surprise that I have been following the BP oil spill with much interest. In fact, as I starting typing this entry, I was watching the grilling of the BP CEO, Tony Heyward, by Congress. Rep. Stupak is focusing on the BP’s risk management.
Some of you have read my earlier posting on my thoughts of the BP decision process that led to the Deepwater Horizon blowout. So far, information uncovered since that posting is remarkably consistent with my earlier suppositions. In this entry I would like to step back a bit and discuss what broader lessons might be learned from the incident. While it is all too easy to fall into BP bashing, I would rather use this moment to reflect more deeply on risk taking and creating value. (BTW, some of you might now that my signature slogan is ‘Take risks, add value’.)
In our industry we create value primarily through the efficient delivery of innovation. Delivering innovation, by definition, requires investing in efforts without initial full knowledge of the effort required and the value of the delivery. This incomplete information results in uncertainties in the cost, effort, schedule of the projects and the value of the delivered software and system, i.e. cost, schedule and value risk.
Deciding to drill an oil well also entails investing in an effort with uncertain costs and value. In this case, the structure of the subsystem and productivity of the well cannot be know with certainty before drilling. As I pointed out in an earlier blog, a good definition of risk is uncertainty in some quantifiable measure that matters to the business. So in both our industry and oil drilling we deliberately assume risk to deliver value.
So, what can we learn from the BP incident? Briefly, one creates value by genuinely managing risk. One creates the semblance of value for a while by ignoring risk.
Assuming risk, investing in uncertain projects, provides the opportunity for creating value. That value is actually realized by investing in activities that reduce the risk. The model that shows the relationship is described in this entry. So, reducing risk has economic value, but reducing risk takes investment. In the end, the quality risk management is measured with a return on investment calculation. This in turn requires a means to quantify and in fact monetize risk.
I wonder what was there risk management approach was followed by BP. A recent Wall Street Journal article suggested they used a risk map approach – building a diagram with one axis a score of the ‘likelihood of the risk’ and the other a score of the ‘severity of a failure’. So with this method, they would score the risk of a blowout as very low (based on past history) with a very high consequence. So, such a risk needs to be ‘mitigated’. (Some actually multiply the scores to get to some absolute risk measure.) Their mitigation was the installation of a blow-out preventer. They could then confidently report they have executed their risk management plan. Note these scores are at best notionally quantified and not monetized.
Paraphrasing my good colleague, Grady Booch (speaking of certain architecture frameworks), risk maps is the semblance of risk management. As pointed out by Douglis Hubbard in The Failure of Risk Management (and in an earlier rant in this blog), this sort of risk management is not only common, but dangerous: It is a sort of business common failure mode that leads to bad outcomes. Also, Hubbard points out, useful risk management entails quantification and calculation using probability distributions and Monte Carlo analysis. I would add that since risk management in the end is about business outcomes, risks need to be monetized as well as quantified. I am willing to bet a good bottle of wine that BP did no such thing. Any takers? The business common failure mode was over-reliance on the preventers, even though there are several studies showing they are far from ‘failsafe’.
Further, it appears BP assumed risk by consistently taking the cheaper, if riskier. design and procedure alternative, the one with greater uncertainty in the outcome, even when the cost of an undesired, if unlikely, outcome was possibly catastrophic. The laundry list of such decisions is long; some outlined in Congressman Waxman’s letter to Tony Hayward. CEO’s of Shell and Exxon testified before congress that their companies would have used a different, more costly designs and followed more rigorous procedures. According the congressional and journalistic reports, this behavior is BP standard operating procedure. So BP assumed risk by drilling wells but did not invest in reducing the risk.
For quite a while they got away with the approach of assuming but really reducing risk, and appeared to be creating value as reflected in stock and value and dividends to the investors. The BP management raised the stock price from around $40/share in 2003 to a peak of around of $74/share prior to the Deepwater Horizon incident. At this writing the stock is trading at $32/share and the current dividend has been cancelled. Investors might rightly wonder if there is another latent disaster and so discount the apparent future profitability with the likelihood of unknown liabilities. The total loss of stockholder value is over $100B, which is in the ballpark of the eventual liability of BP. So, whatever approach BP used to manage risk failed.
BTW, some may recognize this same pattern in the management of financial firms that participated in the subprime mortgage market. In that case, they ‘mitigated risk’ by relying on the ratings agencies. Those who actually built monetized models of the risk realized there was a great opportunity to bet against the subprime mortgage lenders and made huge fortunes (See, e.g. The Big Short: Inside the Doomsday Machine by Michael Lewis .).
Readers of the blog will notice a recurrent theme is some of the postings. It is essential that we assume and manage risk. To repeat a favorite quote, “One cannot manage what one does not measure.” The risk map, score methods, while common are insufficient to the needs of our industry; they do not measure, nor really manage risk. We as a discipline need to step up to quantifying, monetizing, and working off risk in order to be succeed as drivers of innovation. We need to step up to the mathematical approach found in the Douglas and Dan Savage’s (see this posting) texts.
I came to this same realization probably a decade ago. I held off at first because I had not deep enough understanding of how to proceed, and I knew I would encounter great skepticism. I tested the waters in 2005 and posted my first paper on the subject in 2006. I indeed received a great deal of skepticism and resistance, but enough acceptance to go forward. I have learned some important lessons from all that. In my next blog, I will share my experiences of bringing more mathematical thinking to risk management for SSD.
firstname.lastname@example.org 110000CH6X Tags:  net management ppm portfolio roi npv return investment software of on present value 2,554 Views
In an earlier blog entry, I mentioned my article Calculation and Improving the ROI of Software and System Programs. I am pleased to announce to that it has been published in the September 2011 issue of the Communications of the ACM.
email@example.com 110000CH6X Tags:  management software development_analytics agile andes development project 2,365 Views
In my last blog, I laid out a vision of how a project lead and her stakeholders might use the predictive analytics to drive to better project outcomes. As I mentioned in the entry, IBM Rational is work on such a tool. A demonstration of this tool is found here: Agile Development Analytics Demo. The video was created by Peri Tarr, the lead architect on the project.
Some of you might notice that the terminology and development process described in the demo is at odds with your understanding of Agile. We do understand that and currently we are working on making a robust tool that accommodates a wide range of processes for what might some might call 'pure Agile' to various hybrids we are discovering in the market place.
In the next blog, I will explain more on how the tool works.