Calculate your return on investment for software and systems

The term "return on investment" (ROI) is frequently used to describe the benefit derived from investments in software and systems or other business investments. Different business questions call for different kinds of ROI: Have we received a good return on the investments to date? Should we continue to invest in the project? What will be the total ROI over the life of the software or system? This article provides the ROI calculations that answer each of these questions.

Murray Cantor, IBM Distinguished Engineer, IBM

As a leader in the IBM Rational field services group, Murray Cantor promotes and extends Rational best practices, and works closely with customers on innovative ways to build and deliver systems more efficiently. Currently, he leads the evolution of a new engagement model for transforming software development organizations, as well as Rational Unified Process for Systems Engineering® (RUP-SE®). The latter methodology is critical for organizations working at the leading edge of large-scale hardware and software system development. He also focuses on how to integrate IBM Rational field capabilities with those of other IBM brands.

He has been named Distinguished Engineer both for his contributions to RUP-SE and for his successes with client enterprise transformations. A well-known thought leader, he is a sought-after keynote speaker at industry events, has published two books and numerous papers, and plays a key role on standards committees relating to UML and RUP.

Murray Cantor received his Ph.D. in mathematics from the University of California at Berkeley in 1973.



15 May 2012


Introduction

We often speak of the return on investment of a software, system, or IT project as the chief justification for proceeding with the effort. The term is sometimes used only notionally: we might say that the ROI of certain software will be better efficiency, without stating how efficiency is measured. Decisions need precise definitions of ROI. It turns out there is more than one, and each suits a different kind of decision. This article describes the ideas behind the calculations. The detailed formulas are in Appendix 3.


Calculating the future

Someone once said, "It is impossible to predict the future, but that is our job." Those who are responsible for reasoning about the value of future investments need to work with incomplete information. For example, it is impossible to know for certain what the future revenue of a new product will be. Nevertheless, you need that revenue for computing the expected ROI of bringing that product to market. Fortunately, there is a way forward.

The sections that follow present several types of ROI with their associated calculations. When using the various equations that follow, you can use random variables (see Appendices 1 and 2) in place of the fixed values. It is common in modern business analysis to use random variables with triangular distributions, as described in Appendix 1.

Notes:

  • If you are already familiar with the concepts of random variables, read on. If you are not, you will find it helpful to read the appendices first.
  • You might also find it useful to read the introduction to finding the value of ongoing development efforts found in my previous article, "Calculating and Improving your Return on Investment of Software and System Programs." Communications of the Association for Computing Machinery (Digital Edition), September, 2011. This article provides the explicit formula for the ideas in that earlier article.

To-date ROI and To-go ROI

To get started, all of the kinds of ROI are based on the same core concept: the return on investment, generally speaking, is the ratio of the change of value to the cost of the investment. In this formula, V0 is some initial value, V1 is the value at some later date, and I is the money spent in the meantime:

$$\mathrm{ROI} = \frac{V_1 - V_0}{I}$$

It is the application of this equation that varies with the kind of asset.

The simplest example of ROI is in reasoning about some capital asset, such as a share of stock that you buy at one price and sell at another. There are two easily understood values used in the computation: the purchase price (pp) and sales price (sp), both of which are values set by the market. The ROI in this case is the ratio of the change of the price over the cost of purchasing the stock. In this case:

$$\mathrm{ROI} = \frac{sp - pp}{pp}$$
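As a minimal sketch of the two calculations above, the following Python functions compute the general ratio and the stock case. The function names and the prices in the example are hypothetical, chosen only for illustration.

```python
def roi(v0: float, v1: float, invested: float) -> float:
    """Return (v1 - v0) / invested: the change in value per unit of money spent."""
    if invested == 0:
        raise ValueError("invested must be non-zero")
    return (v1 - v0) / invested


def stock_roi(purchase_price: float, sale_price: float) -> float:
    """ROI of buying a share at purchase_price and selling it at sale_price."""
    # For a share of stock, the initial value and the money spent
    # are both the purchase price.
    return roi(v0=purchase_price, v1=sale_price, invested=purchase_price)


if __name__ == "__main__":
    # Hypothetical prices: buy at 40.00, sell at 50.00.
    print(f"{stock_roi(40.0, 50.0):.2%}")  # prints 25.00%
```

Keeping the general ratio separate from the stock case mirrors the text: the equation stays the same, and only the choice of V0, V1, and I changes with the kind of asset.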

Even in this case, there could be variations. Let's suppose that an investor, being a professional, has a couple of key questions:

To-date
Have I made a good investment? (What would be the ROI if I were to sell today?)
 
To-go
Should I invest in the asset? (What would be the ROI if I bought some of this asset today?)
 

The first question is retrospective, because it addresses how well the investor made decisions. The answer might lead to a change in investment strategy. The second question is part of implementing the investment strategy. Of course, answering the two questions requires different ROI calculations:

To-date
Here, V1 is today's value (the proceeds that would accrue from selling all holdings at the current price) plus whatever benefits the investor has received to date, such as dividends. V0 and I are both the sum of the costs of all of the investments in the asset.
 
To-go
Here, V1 is the estimated proceeds from the sale of the asset at some set future date, V0 is the initial cost of the asset, and I is the sum of what you expect to spend on the investment: the initial cost plus future payments.
 

Important:
The cases are almost completely independent. The previous costs (already spent) that are used for the to-date case have no place in the to-go case.
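The two investor calculations can be sketched in Python as follows. This is an illustration under the definitions given above, not a prescribed implementation; the function names and all of the amounts are hypothetical.

```python
def to_date_roi(current_value, benefits_to_date, costs_to_date):
    """To-date ROI: what the return would be if the holdings were sold today."""
    total_cost = sum(costs_to_date)                 # V0 and I: everything spent so far
    v1 = current_value + sum(benefits_to_date)      # V1: today's value plus dividends etc.
    return (v1 - total_cost) / total_cost


def to_go_roi(expected_proceeds, initial_cost, future_payments):
    """To-go ROI: the return expected from buying today and selling at a set future date."""
    expected_spend = initial_cost + sum(future_payments)   # I: initial cost plus future payments
    return (expected_proceeds - initial_cost) / expected_spend


if __name__ == "__main__":
    # Hypothetical history: two purchases, two dividends, holdings now worth 1700.
    print(to_date_roi(1700.0, [60.0, 40.0], [1000.0, 500.0]))
    # Hypothetical plan: buy in at 1700, pay 300 more later, expect to sell for 2300.
    print(to_go_roi(2300.0, 1700.0, [300.0]))
```

Notice that `to_go_roi` takes no argument for money already spent, which reflects the point that previous costs have no place in the to-go case.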

Going back to the first equation, in investment analysis (IA), the To-date ROI is the random variable using that equation, where

  • V1 = NPVtoday + sum of actual benefits to date
  • V0, I = the sum of the costs to date. This assumes that the NPV at the onset of the project is zero.

Notice that I is based on actual expenditures. In most cases, it is reasonable to set NPVprogram_onset to zero.

For IA programs, "end of life" is when all costs and benefits end. At that point, the value is zero. More generally, an IA investment depreciates after delivery.

In the To-go ROI, all of the past costs and benefits are ignored. All that matters is the discounted future costs and benefits. We can apply the base equation to get this formula:

$$\text{To-go ROI} = \frac{NPV_{today}}{\text{discounted future costs}}$$

The NPV and future cost calculations involve future values captured as random variables in IA, so this formula uses the IA Monte Carlo engine (see Appendix 2).
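To make the idea concrete, here is a rough Python sketch of a To-go ROI simulation. It is not the IA Monte Carlo engine. It assumes that the To-go ROI is taken as the NPV (discounted future benefits minus discounted future costs) divided by the discounted future costs, that every future value has a triangular distribution, and that a single discount rate applies. The function name, the rate, and all of the figures are hypothetical.

```python
import random
import statistics


def simulate_to_go_roi(benefits, costs, rate, samples=20_000, seed=1):
    """Sample the To-go ROI, assumed here to be NPV / discounted future costs.

    benefits, costs: one (low, likely, high) tuple per future period 1, 2, ...
    rate: a single per-period discount rate (an assumption of this sketch).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(samples):
        pv_benefits = 0.0
        pv_costs = 0.0
        for t, (b, c) in enumerate(zip(benefits, costs), start=1):
            discount = (1.0 + rate) ** t
            # random.triangular takes (low, high, mode), in that order.
            pv_benefits += rng.triangular(b[0], b[2], b[1]) / discount
            pv_costs += rng.triangular(c[0], c[2], c[1]) / discount
        results.append((pv_benefits - pv_costs) / pv_costs)
    return results


if __name__ == "__main__":
    # Hypothetical two-period forecast.
    future_benefits = [(50, 80, 100), (150, 200, 260)]
    future_costs = [(90, 100, 130), (40, 50, 70)]
    roi_samples = simulate_to_go_roi(future_benefits, future_costs, rate=0.10)
    deciles = statistics.quantiles(roi_samples, n=10)
    print("10th percentile:", round(deciles[0], 3))
    print("median:", round(statistics.median(roi_samples), 3))
    print("90th percentile:", round(deciles[-1], 3))
```

Because the result is a random variable, the sketch reports a range (10th percentile, median, 90th percentile) and not a single number.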


Total ROI

Given that IA contains both actual and forecasted costs and benefits, other useful ROI calculations are possible. For example, you might decide to invest in a program if there is sufficient expected return over the total life of the investment. In this case, you would compute the forecasted To-date ROI at the end of the program. Because it has no value at the end of the program, by setting the NPVprogram_onset to zero, we find:

mathematical equation

At the true end of the program, the terms are all actuals. Before then, you might want to forecast the Total ROI, where the terms are a mix of discounted future values and actuals. In this case, the Total ROI is a random variable, found by using the Monte Carlo engine in IA.


Reference dates

ROI predictions are generally calculated as of today. However, because IA contains the full lifecycle of the costs and benefits of the investment -- the past values as actuals, and the future values as random variables -- it is possible to set any reference date for the calculations. That is, you can forecast the distributions of NPV, as well as the To-go ROI and To-date ROI at any future date. The date of delivery is an example. You would forecast the values when the product benefits would start. This might be a good calculation for comparing two investments with different delivery dates.

Note:
The Total ROI is the forecasted To-date ROI with program end as the reference date.


In summary

The Monte Carlo simulation in Appendix 2 was computed in IBM® Rational® Focal Point, Version 6.5.1.

Notionally, return on investment (ROI) is the ratio of benefit to expense. Anyone who has limited funds would want to use those funds to maximize the ratio. Even for simple investments, there is more than one flavor of ROI, each used to answer a different question. This article introduces three of the most useful:

  • To-date: What return have I gotten for the investment I have made?
  • To-go: What return can I expect from future investments?
  • Total: At the end of the program, what ROI can I expect from all of the investments?

The detailed formulas are in Appendix 3.


Acknowledgement

This article took considerable care in preparation. Many thanks to Jim Densmore of IBM for his edits, suggestions, and challenges. Without his help, it could not have been written.


Appendix 1. Random variables

Suppose that you are uncertain about the value that you want to use. For example, the sales volume in a future period of an undelivered product might be important, but no one can be certain of the actual value. In modern business analytics, it is common practice to specify such uncertain quantities as random variables. What follows is a brief explanation of their use. (A much more extensive treatment is found in Douglas Hubbard's book, How to Measure Anything: Finding the Value of Intangibles in Business (2nd ed.), Wiley, 2010.)

Given that we are not 100% certain of a future value v, the next best thing is to specify that v can be any value within a range. For example:

a ≤ v ≤ b

By this we are saying the probability is zero that v is less than a or greater than b (in some cases, we can let a equal -∞ or b equal ∞). We are also saying that the probability is one that v lies between a and b. We can go further and suppose that some values for v are more likely than others. In that case, we can specify the likelihood of each possible value of v. Therefore, we will have a curve that gives, for each possible value of v, the probability of v taking that value. Thus, a random variable is a quantity described by a curve that gives, for each value in a range, the probability of it taking that value. The curve is called the probability distribution of the random variable.

An important property of these distributions is that, because a random variable must take some value, the sum of the probabilities of the values must equal one.

For example, when we capture the best case (H), worst case (L), and most likely value (E) of the future sales volume, we can specify its random variable mathematically with a distribution that looks like Figure 1.

Figure 1. A triangular distribution for a random variable
Triangle with L at left, E at the peak, H at right

The height of the curve at any point along the scale represents the probability of the random variable taking that value. Hence, we have chosen a distribution with no probability below L or above H, with a peak at E. The heights are chosen so that the area of the triangle (sum of all the probabilities) is 1. In this case, the probability of v taking a value near L or H is small and the probability of taking a value near E is relatively high.

Of course, the shape of the distribution can be any curve, given that the area under that curve is 1.
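The following Python sketch writes out the height of a triangular distribution and checks numerically that the area under it is 1. The function names are hypothetical, and the values L = 3, E = 4, H = 7 are the ones used for Figure 2 in Appendix 2.

```python
def triangular_pdf(x, low, mode, high):
    """Height of the triangular distribution with worst case low, peak mode, best case high."""
    if x < low or x > high:
        return 0.0                       # no probability outside [low, high]
    peak = 2.0 / (high - low)            # height at the mode, so that the area is 1
    if x < mode:
        return peak * (x - low) / (mode - low)
    if x > mode:
        return peak * (high - x) / (high - mode)
    return peak


def area_under(low, mode, high, steps=10_000):
    """Approximate the area under the curve with the midpoint rule."""
    width = (high - low) / steps
    heights = (triangular_pdf(low + (i + 0.5) * width, low, mode, high)
               for i in range(steps))
    return sum(heights) * width


if __name__ == "__main__":
    print(triangular_pdf(4, 3, 4, 7))    # height at the peak E
    print(area_under(3, 4, 7))           # very close to 1
```

The peak height 2 / (H - L) follows from the area of a triangle: one half of the base (H - L) times the height must equal 1.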

In summary, a random variable is a quantity that can take any value in a range. However, some values are more probable than others. So a random variable is specified by the function that assigns a probability to each value. This function is called the probability distribution of the random variable.


Appendix 2. Calculating with random variables: Monte Carlo simulation

Suppose that you want to add two random variables, v1 and v2. How would you proceed? First note that the sum would be another random variable. Therefore, what you would need is the probability distribution of the sum. There is no simple general formula for that distribution, but there is an effective, commonly used numerical approach known as the Monte Carlo simulation.

The idea behind the Monte Carlo simulation is to use a random number generator to take a sample value of v1 and a sample value of v2 and then add them. The values are selected according to the probability distributions of each of the variables, so the more likely values are taken more often. Now save that sum and do the same thing many times, say 100,000 times, and store each of the sums. For each of the sums, you can compute the probability by looking at its frequency in the collection of saved sums (some sums are more frequent than others) and dividing by the number of samples (in practice, you have to round the sums to get the counts). What you get is an approximation of the distribution of the sums.
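The procedure just described can be sketched in a few lines of Python. This is a minimal illustration, not the simulator used for the figures in this article; the function name, the bin width used for rounding, and the two sample distributions are assumptions.

```python
import random
from collections import Counter


def simulate_sum_distribution(draw_v1, draw_v2, samples=100_000, bin_width=0.1):
    """Approximate the distribution of v1 + v2 by repeated sampling.

    draw_v1, draw_v2: zero-argument functions that each return one sample.
    Returns a dict that maps each rounded sum to its estimated probability.
    """
    counts = Counter()
    for _ in range(samples):
        total = draw_v1() + draw_v2()
        counts[round(total / bin_width)] += 1      # round the sum so it can be counted
    # Frequency divided by the number of samples approximates the probability.
    return {index * bin_width: count / samples
            for index, count in sorted(counts.items())}


if __name__ == "__main__":
    rng = random.Random(42)                        # fixed seed for repeatable runs
    distribution = simulate_sum_distribution(
        lambda: rng.triangular(0, 10, 5),          # arguments are (low, high, mode)
        lambda: rng.uniform(0, 2),
    )
    most_likely = max(distribution, key=distribution.get)
    print(f"most frequent rounded sum: {most_likely:.1f}")
    print(f"probabilities add up to: {sum(distribution.values()):.6f}")
```

The bin width controls the rounding mentioned above: narrower bins give a finer curve but need more samples to stay smooth.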

Let's look at an example where v1 has a triangular distribution with L = 3, E = 4, H = 7, as shown in Figure 2, and v2 has a triangular distribution with L = 1, E = 6, H = 7, as Figure 3 shows.

Figure 2. Triangular distribution graph
L = 3, E = 4, H = 7

Figure 3. Triangular distribution graph
L = 1, E = 6, H = 7

Figure 4 gives the distribution of the sum of the two random variables shown in Figures 2 and 3, which you can find by using a Monte Carlo simulator. This was found by using 100,000 samples and computed in IBM Rational Focal Point.

Figure 4. The simulated distribution of the sum (from 100,000 samples)
Distribution curve, with peak at 9.80

First, notice that the sum is not another triangular distribution, but somewhat closer to a normal distribution, a bell-shaped curve. This is to be expected from the mathematics of probability (in particular, the central limit theorem). The peak (mode) of the simulated distribution is 9.80. The distribution of the sum makes sense. For example, we might expect the most likely value of the sum to be near 10, the sum of the two most likely values, and the simulation found 9.80. The two need not agree exactly: sampling noise and rounding affect the location of the simulated peak, and for skewed distributions the mode of a sum is not necessarily the sum of the modes. Also, notice that the probability approaches zero below 4, the sum of the lows (not specifically called out in Figure 4, but apparent), and also approaches zero above 14, the sum of the highs.
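Readers without access to a commercial simulator can try the same example with Python's standard library. The sketch below is not the Focal Point calculation; the function name and the seed are arbitrary choices. It checks two properties that do not depend on chance: every sampled sum lies between 4 and 14, and the average of the sums settles near 28/3 (about 9.33), because the mean of a triangular distribution is (L + E + H) / 3, which is 14/3 for each of the two variables.

```python
import random


def simulate_example(samples=100_000, seed=2012):
    """Sample v1 + v2 for v1 ~ triangular(3, 4, 7) and v2 ~ triangular(1, 6, 7)."""
    rng = random.Random(seed)
    # random.triangular takes (low, high, mode), in that order.
    return [rng.triangular(3, 7, 4) + rng.triangular(1, 7, 6)
            for _ in range(samples)]


if __name__ == "__main__":
    sums = simulate_example()
    print(f"smallest sum: {min(sums):.2f}")
    print(f"largest sum:  {max(sums):.2f}")
    print(f"average sum:  {sum(sums) / len(sums):.3f}")
```

The average and the peak are different summaries of the same curve, so the average near 9.33 does not contradict a peak near 9.8.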

Finally, fixed and random variables are easily combined. You can treat a fixed variable as a random variable that takes a single value with probability of one and the probability of all other values as zero.
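One way to express this in code is to give fixed and random quantities the same interface, so that a calculation does not need to know which is which. The following Python sketch is only an illustration; the helper names and the cost figures are hypothetical.

```python
import random


def fixed(value):
    """A 'random variable' that takes a single value with probability one."""
    return lambda rng: value


def triangular(low, likely, high):
    """A random variable with a triangular distribution."""
    return lambda rng: rng.triangular(low, high, likely)   # (low, high, mode)


def sample_sum(variables, rng):
    """Draw one sample from each variable and add the samples."""
    return sum(draw(rng) for draw in variables)


if __name__ == "__main__":
    rng = random.Random(7)
    license_fee = fixed(100.0)                 # known exactly
    integration_effort = triangular(3.0, 4.0, 7.0)
    totals = [sample_sum([license_fee, integration_effort], rng)
              for _ in range(10_000)]
    print(f"total cost ranges from {min(totals):.2f} to {max(totals):.2f}")
```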


Appendix 3. The ROI formulas

Suppose that, in our program, there are T time intervals, NB identified benefits, and NC identified costs. Note that each cost and benefit is a time series. Then:

  • For 0 ≤ t ≤ T and 1 ≤ n ≤ NB, let $B_{n,t}$ = the value of the nth benefit at interval t,
  • For 0 ≤ t ≤ T and 1 ≤ m ≤ NC, let $C_{m,t}$ = the value of the mth cost at interval t.

Before proceeding, there is an important point to make: All of the time series are revised throughout the lifecycle, so they are time-dependent. The time series are each captured as a series of snapshots. Therefore, as more information comes in, the random variables should be updated throughout the lifecycle. As time passes, estimates convert to actuals, and the future values are updated. In practice, each term of the cost and benefits time series is also time-dependent. This is discussed in more detail in the sections on formulas, To-date and To-go ROI and Total ROI.

Bk(s) and Cl(s) are the snapshots of the benefits and cost streams. To avoid clutter, we will drop the snapshot variable unless necessary. We need more notations:

  • Let r = the reference period for the calculation, the current period or some specified future period
  • For 1 ≤ n ≤ NB, let rbn = the discount rate of the benefit Bn
  • For 1 ≤ m ≤ NC, let rcm = the discount rate of the cost Cm
  • For a given period t and reference period r, let the sum of all the discounted benefits at t with respect to r be:

mathematical equation

  • Similarly, for a given period t and reference period r, let the sum of all of the discounted costs at t with respect to r be:

mathematical equation

Notice that the terms of the time series might be random variables, fixed variables, or both. In any case, they can be summed using Monte Carlo simulation, as needed.
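For readers who prefer code to notation, here is a small Python sketch of a discounted sum at one period. It assumes the conventional compound discounting rule, in which a value at period t is divided by (1 + rate) raised to the power (t - r); that rule, the function name, and the figures are assumptions of this sketch. The values are fixed numbers here; with random variables, the same arithmetic would be applied to each Monte Carlo sample.

```python
def discounted_total(series, rates, t, r):
    """Sum the period-t values of several time series, discounted to reference period r.

    series: one list per benefit (or cost), indexed by period.
    rates:  one discount rate per series, in the same order.
    """
    return sum(values[t] / (1.0 + rate) ** (t - r)
               for values, rate in zip(series, rates))


if __name__ == "__main__":
    # Two hypothetical benefit streams over periods 0, 1, and 2.
    benefits = [[0.0, 110.0, 0.0],
                [0.0, 55.0, 121.0]]
    benefit_rates = [0.10, 0.10]
    for period in range(3):
        total = discounted_total(benefits, benefit_rates, t=period, r=0)
        print(f"period {period}: {total:.2f}")
```

Allowing one rate per series follows the notation above, where each benefit and each cost has its own discount rate.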

In this notation, we can define the net present value NPV at period r to be:

mathematical equation

And then:

mathematical equation

Notice that for any two periods, s and t:

mathematical equation

In this calculation, TodateROIs is the special case ROIs,0. In this case, the Bj,s and Cj,s are generally actuals.

Finally, TotalROI is ROIT,0.

By the definition of end of life, NPVT = 0, since there are no costs or benefits remaining. Also, in most cases, NPV0 is close to 0, since all of the costs are in the future. Thus, by setting NPV0 to 0, we get:

mathematical equation
