Comment lines: Making good decisions through business analytics

In order to make good decisions, you need to understand what’s going on around you. But it can take a lot of work to make sense of the vast and diverse data that is available to you, and even more work to understand how it can help you to make decisions. Fortunately, IBM® has done a lot of this work for you. Top-tier statistical and optimization packages like IBM SPSS® and IBM ILOG® CPLEX® show IBM’s continuing commitment to business analytics technology solutions that help you use and understand the data that supports your business. With unique levels of accessibility to this data within reach, the next important task for IT services everywhere is putting this data to good use. This content is part of the IBM WebSphere Developer Technical Journal.


Tony Efremenko, Certified Master IT Specialist, IBM

Tony Efremenko is a Certified Master IT Specialist with IBM, with over 24 years of experience in the software industry.



17 July 2013


"Computer, what do I do next?"

As a kid, I wanted to be an astronaut. Life was simple then, but simplicity had consequences: I believe I estimated pi to be “about 3” longer than any post-bronze age culture ever did. In other words, you wouldn’t have wanted me working out the trajectory for the next Mars explorer. But once I got a calculator, I would check my math every time, and my estimated area of circles improved, along with my potential future as an astronaut.

As I’ve gotten older and the problems have gotten harder, my expectation that technology could help me sort through complexity has gone up as well. I learned to generate and measure data. But deciding what to do next with my information, well, that involved a lot of work. All the activity of my systems and their interaction with the Internet generated lots of numbers, but the complicated task of making sense of them was left entirely to me. It’s one thing to be able to make the right decision; that is, a decision which, in hindsight, turns out to be the correct answer. But I wasn’t even sure I was making a good decision, one where I chose the path that aligned with the largest number of data points.

Wouldn’t it be great if my software could at least help me make informed decisions about what to do next?


The power of analytics and optimization

IBM Business Analytics and Optimization software brings the power of scientific modeling, statistical analysis, and optimization (along with many other analytic tools) to bear on real-world problems. It’s designed to provide insight into the huge volumes of data generated not only by what we think of as true “Big Data” activities, but also by the normal, workaday problems we all encounter.

Business analytics means that I can analyze a problem to determine the relationships that already exist as part of the data. It means, for example, that I can do an analysis of what a product or service should cost, based on market factors. I can use optimization packages to tell the maximum or minimum amount of something I can produce using certain inputs and constraints. I can finally understand what all that data is trying to tell me.

Big Data is certainly a huge driving force for the need, but the result for us all is the accessibility of these tools. IBM delivers many of these solutions through a cloud delivery approach, which means the footprint for this powerful software in your data center is reduced, without diminishing the value or power of the solutions delivered. This is coupled with run time packaging that enables you to embed the necessary APIs in your system, making them a part of your system’s capabilities.

For years now, IBM has been at the forefront of analytics in all aspects of its software. You see statistical analytics in our performance evaluation software. You can even use our diagnostic software, such as IBM Support Assistant, to tell you where to look for memory leaks based on the tool’s deep analysis, or evaluate the performance statistics from IBM WebSphere® Application Server to help find bottlenecks.

But the offerings in business analytics bring the promise of deep statistical analysis to full fruition. Using the best mathematical algorithms, developed through years of research, they can help you make decisions that are backed by the data you use to run your business. This quantitative analysis approach brings a unique dimension to decision support that enables making a “good” decision based on statistical modeling methods and best-in-class optimization methods.


Simple example: Capacity planning

Maybe you think business analytics is mainly for finance solutions, or maybe you think your application is too small to apply this kind of approach. But just like any tool, once you see the value and learn to apply its power, you will find all kinds of uses for it – just like a calculator for checking your math – but for decisions derived from your business data.

So then, what kind of things can business analytics bring to my corner of the world?

As a software architect, I have a common problem: capacity planning. Let’s suppose I’m about to deploy a new system. I have a mix of hardware. What’s the relationship of cost to my throughput? And given what the hardware is costing me, what’s the best mix to get the most throughput for the least cost? This simple example will give you the idea of the kind of power behind this software.

I’m pretty sure that faster hardware has a greater cost, but how much? I’ll look at the data to tell me.

Assume I have a spreadsheet of cost and throughput statistics for a set of computers for some imaginary data center. Suppose I’m savvy enough about statistics to know I need a certain sample size to get a meaningful result, and so I have about 30 entries. Let’s also suppose that even though I believe cost and CPU clock speed are correlated, I’m not really interested in that for this simple sample. The data looks something like this:

Table 1
Server entry    Clock speed    Throughput    Cost
1               4250           4011          600
2               3500           2037          300
3               2100           1010          200

This goes on for 30 entries. When finished, it looks like throughput rises with cost, but by how much?

I’ll crack open my IBM SPSS statistical package, and do a linear regression to see how much CPU clock speed and cost affect my throughput.

Figure 1. SPSS Statistical Data Editor showing linear regression
Figure 2. Detail on SPSS Data Editor showing linear regression selection dialog
Figure 3. SPSS Data Editor showing linear regression output report

From our coefficient of determination (R²) value of .993 (Figure 3), we see there is definitely a relationship between throughput and the two predictors, clock speed and cost: the closer R² is to 1, the more of the variation in throughput the linear model explains.

Even better, the data tells us how that relationship holds up (Figure 4).

Figure 4. SPSS Data Editor showing further down on linear regression output report

The coefficients table tells me that each additional dollar of cost is associated with 7.305 additional units of throughput, with clock speed held constant.
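To make the slope and R² less abstract, here is a minimal sketch in plain Python. It is not the SPSS procedure: it fits throughput against cost only (no clock speed), and it uses just the three sample rows as read from Table 1 (costs 600, 300, 200; throughputs 4011, 2037, 1010), so it will not reproduce the report's figures exactly. The function name `fit_line` is made up for this illustration.

```python
# Three sample rows as read from Table 1 (assumed reading of the table):
# cost in dollars and throughput in units for each server entry.
costs = [600.0, 300.0, 200.0]
throughputs = [4011.0, 2037.0, 1010.0]


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Hypothetical helper for
    illustration; SPSS fits a two-predictor model on about 30 rows.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single predictor, R^2 is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(costs, throughputs)
print(f"throughput = {intercept:.1f} + {slope:.3f} * cost")
print(f"R^2 = {r_squared:.3f}")
```

Even on three rows, the slope lands near 7.3 units of throughput per dollar, in the same neighborhood as the SPSS coefficient, which is a useful sanity check on how to read the report.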

You could easily solve this trivial example on a spreadsheet. But what if you had hundreds or thousands of entries, and the relationship wasn’t obvious? Without much work, you could still get the SPSS statistical package to reveal this and many other relationships that are hidden in your data.


Another example: Optimize my cost

Now that I understand how cost figures in, how can I get the best throughput for my servers?

For this example, I’ll use IBM ILOG CPLEX Optimizer to help me decide what mix of computers to use for the best cost.

Suppose my three servers look something like this:

Table 2
              High    Middle    Low
Throughput    5000    2000      1000
Cost          600     300       200

Now, suppose I need at least 30000 throughput units to meet the demands of my application. But I can only get 5 of the “High” throughput servers. What mix of computers should I get to give me the least cost?

For this example, I’ll use the command line interface. From my costs, I know that:

  • My high level server (call it x1) costs about $600 and gives me 5000 units of throughput.
  • My middle server (x2) costs $300 and gives me 2000 units of throughput.
  • My low end (x3) costs $200 and gives me 1000 units of throughput.
  • I want to minimize cost, get at least 30000 units of throughput, but can only obtain 5 of the high end (x1) servers.

Here’s how the problem looks:

  • Total cost equation: 600x1 + 300x2 + 200x3 = total cost. This is what I want to minimize.
  • My decision variables are x1, x2 and x3.
  • My constraints are that x1, x2 and x3 must be whole numbers greater than or equal to 0, and 5000 x1 + 2000 x2 + 1000 x3 >= 30000. Also, x1 must be less than or equal to 5.

In the CPLEX Optimizer, it looks like Listing 1 and Figures 5 and 6.

Listing 1
enter example
minimize 600 x1 + 300 x2 + 200 x3
subject to 5000 x1 + 2000 x2 + 1000 x3 >= 30000
bounds
x1 <= 5
0 <= x1
0 <= x2
0 <= x3

Integer
x1 x2 x3
end
Figure 5. IBM ILOG CPLEX Optimizer showing command line interface
Figure 6. IBM ILOG CPLEX Optimizer command line interface showing optimize command

Then, we say “optimize” (Figure 7).

Figure 7. Results from IBM ILOG CPLEX Optimizer optimize command

The results (Figure 7) show that the optimal cost is $3800.00. So how many servers do I need? Run this command and see the results in Figure 8:

>display solution variables x1-x3

Figure 8. IBM ILOG CPLEX Optimizer “display solution variables” command

I need to buy 5 high end servers, 2 middle tier, and 1 low end server.
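A problem this small can also be cross-checked by brute force. The sketch below is plain Python, not CPLEX and not the CPLEX API: it simply tries every whole-number mix of servers and keeps the cheapest one that meets the throughput target. The names `SERVERS` and `cheapest_mix` are made up for this illustration; the numbers come from Table 2 and the constraints above.

```python
from itertools import product

# Throughput units and dollar cost per server, taken from Table 2.
SERVERS = {
    "high": (5000, 600),
    "middle": (2000, 300),
    "low": (1000, 200),
}


def cheapest_mix(demand, max_high):
    """Return (cost, high, middle, low) for the cheapest mix meeting demand.

    Plain enumeration over whole-number server counts; this is a
    cross-check, not how CPLEX solves the model.
    """
    def cap(name):
        # Enough servers of this type to cover the demand alone;
        # buying more of it can never lower the cost.
        throughput, _ = SERVERS[name]
        return -(-demand // throughput)  # ceiling division

    best = None
    counts = product(
        range(min(max_high, cap("high")) + 1),
        range(cap("middle") + 1),
        range(cap("low") + 1),
    )
    for high, middle, low in counts:
        mix = {"high": high, "middle": middle, "low": low}
        throughput = sum(SERVERS[n][0] * k for n, k in mix.items())
        if throughput < demand:
            continue  # does not meet the throughput constraint
        cost = sum(SERVERS[n][1] * k for n, k in mix.items())
        candidate = (cost, high, middle, low)
        if best is None or candidate < best:
            best = candidate
    return best


print(cheapest_mix(demand=30000, max_high=5))  # (3800, 5, 2, 1)
```

Enumeration only works because there are three server types and a few thousand combinations. With more variables or constraints the search space explodes, which is where an optimizer earns its keep.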

This example shows the optimization using the command line interface. But CPLEX has an extensive API suite and tooling studio that enables you to add the CPLEX capabilities to your system. So as the data changes, you still get the optimized result.


Conclusion

I always use good tools to make my life simpler. With business analytics software, I now have the tools to help me make good decisions on what to do next based on what’s in my data. With cloud delivery for the tools and run time APIs, this power is now more accessible to me than ever. IBM Business Analytics software is the key to unlock what your data has been trying to tell you.


Acknowledgements

I would like to thank my professors at Carnegie Mellon University Tepper School of Business, including Professor Michael Trick, Senior Associate Dean, Professor Francois Margot, Professor of Operations Research, and Professor Gerard Cornuejols, IBM University Professor of Operations Research, for unlocking the promise of statistical analysis and optimization.
