Democratize and Optimize with IBM Cloud and Watson Studio

Empowering the whole team

Here in the IBM Cloud Garage, we transform business as part of a cultural movement—our focus is on people and our willingness to work in new ways. The Garage Method empowers all kinds of people in the organization, regardless of role, to speak up and contribute to the new process. In workshops with our clients, our design-thinking approach gives the whole team the ability to contribute by considering new ways to approach a business problem. Of course, we don’t squash the experience of the experts, but we do supplement their knowledge with experience from the whole team, which makes for a richer solution. The Garage Method is a great way to tap the institutional knowledge of an organization, as we use the combined knowledge of everyone on the team to achieve a business objective.

To this foundation of people working in agile, collaborative ways, we add the best tools to help them be effective. IBM Cloud offers a wide range of tools that I can spin up in seconds, often for free, to prove out what a solution needs. Giving best-in-class tools to a motivated, agile team creates an elite group of business problem solvers, ready to take aim at any business problem.

One way we put this approach to work is by using data science to gain insights. The Garage Method already takes a hypothesis-based approach to an application or process, so it’s a natural extension to apply the same approach to the data as well. We work closely with many service teams to pull in data experts. Within IBM, we have access to hundreds of data science experts who can help ensure that the conclusions we draw are accurate.

Just as applications can benefit from modernization to take advantage of new capabilities, the data that they generate can benefit as well. I’m used to offloading an extract of data from my enterprise applications to a spreadsheet or analytics package, and that’s worked well in the past. But there are so many new capabilities and disciplines around data science out there; I want to take advantage of the capabilities that a cloud-native approach to data mining offers me.

Take care of the little things (and the big things)

In the past, the barrier to entry to using specialized tools was rather high. If I wanted something out of the ordinary, I’d have to install and configure the analytics software and extract the data to my computer. I also needed specialized software to transform the data, which meant more installation and configuration, all before I even got started. By the time I got to the insights, my investment was big. It made sense that I saved that effort for the big decisions.

With IBM Cloud, the barrier to entry is very low. In fact, it is low enough that I can seek out insights for the “smaller” decisions I make as part of my daily work, not just the big, case-study-style insights that affect my whole industry. The discipline I apply to the small things also carries over: with cloud-native scaling at my fingertips, I can scale up the same approaches when I do go after the big decisions.

With Watson Studio, I’ve got everything I need to do a good job end to end. I can test out my ideas on the cloud subscription. With IBM Cloud Private and IBM Cloud Private for Data, I can use the same notebooks and models on sensitive data that I keep on premises, which fulfills the promise of hybrid cloud. Regardless of where I run, I’ve got access to Spark and Jupyter notebooks, along with many tools for the extraction, transformation, and loading of data, plus governance and data catalogs to help me with teaming and organization.

Don’t just improve—optimize!

Here’s a great example of what I mean: optimization. Optimization technology is among my favorite tools because I think that the ability of a model to predict the optimal value for a target is the epitome of statistical decision-making. In my mind, it fulfills the promise that a quantitative approach to problem-solving makes.

As an IT architect, my solutions lend themselves to the kind of approaches provided by linear programming and optimization. But I’m not alone. Regardless of your business role, there are hundreds of applications that even a simple optimization model can improve. It also doesn’t hurt that it’s a lot of fun to see the optimal solution to a problem.

Example: Microservices throughput

Let me give you an example that’s relevant to IT problems: microservices. Let’s say that I have a set of microservices that provide a useful function for me, say, scanning code for vulnerabilities. That is, if I pass in a unit of code (a snippet), the microservice checks its security. Let’s say I built one myself, but I also bought three as part of an “as-a-service” subscription from some vendors.

Now let’s say I’ve got a big code release on a deadline, so I have a lot of code to check all at once; however, I want to minimize my cost.

Suppose Service 1 is from a high-cost vendor, but it gives good throughput. Service 2 is less costly, but I have to tweak the interfaces a lot (maintenance). Service 3 is moderate cost, but I bought a long-term contract with it, so I have to use it. Service 4 (my service) doesn’t cost as much, but its maintenance and throughput are middling.

The table shows this:

Service           Cost   Maintenance   Throughput
1                 17     2             3
2                 12     3             4
3                 8      4             5
4 (my service)    7      5             5

My objective is to minimize my cost, so I derive the objective function by reading down the “Cost” column, where each variable is the number of code units sent to that service:

  • Minimize cost = 17 * service1 + 12 * service2 + 8 * service3 + 7 * service4

My constraints come from reading down the other columns.

My maintenance constraint is as follows (read down the “Maintenance” column):

  • Maintenance >= 2 * service1 + 3 * service2 + 4 * service3 + 5 * service4

My throughput constraint is as follows (read down the “Throughput” column):

  • Throughput >= 3 * service1 + 4 * service2 + 5 * service3 + 5 * service4

Also, remember I have to buy 400 units as part of a contract for Service 3:

  • service3 >= 400

My total is 1,500 code units:

  • service1 + service2 + service3 + service4 = 1500
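To make the model concrete, here is a minimal Python sketch (not the notebook's code) that encodes the table and checks a candidate plan against the constraints above. The post does not state numeric limits for maintenance or throughput, so `maintenance_limit`, `throughput_limit`, and the `candidate` plan are hypothetical values chosen only for illustration; the helper names are assumptions as well.

```python
# Per-unit figures copied from the table in the post.
COST = {"service1": 17, "service2": 12, "service3": 8, "service4": 7}
MAINTENANCE = {"service1": 2, "service2": 3, "service3": 4, "service4": 5}
THROUGHPUT = {"service1": 3, "service2": 4, "service3": 5, "service4": 5}

TOTAL_UNITS = 1500       # total code units, from the post
SERVICE3_MINIMUM = 400   # contract minimum for Service 3, from the post


def weighted_sum(weights, units):
    """Read down one table column: sum of weight * units per service."""
    return sum(weights[name] * units[name] for name in weights)


def total_cost(units):
    """Objective function: the value the optimizer tries to minimize."""
    return weighted_sum(COST, units)


def is_feasible(units, maintenance_limit, throughput_limit):
    """Check a plan against every constraint, in the direction the post writes them."""
    return (
        all(value >= 0 for value in units.values())
        and sum(units.values()) == TOTAL_UNITS
        and units["service3"] >= SERVICE3_MINIMUM
        and weighted_sum(MAINTENANCE, units) <= maintenance_limit
        and weighted_sum(THROUGHPUT, units) <= throughput_limit
    )


# Hypothetical plan and limits, for illustration only (not from the post).
candidate = {"service1": 500, "service2": 200, "service3": 400, "service4": 400}
print(total_cost(candidate))
print(is_feasible(candidate, maintenance_limit=5500, throughput_limit=6500))
```

A checker like this does not find the optimum by itself; it only scores one plan. The optimizer's job is to search all feasible plans for the one with the lowest `total_cost`.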

So what’s the minimum cost?

The tools

OK, so what do I need to solve this problem? Two things: Watson Studio and Decision Optimization, both running in the cloud.

Watson Studio is available here: https://dataplatform.cloud.ibm.com/

And you can sign up for Decision Optimization here: https://www.ibm.com/us-en/marketplace/decision-optimization-cloud. Get your API key and endpoint, and you can plug it into the solution.

After getting those, I plugged my API key and endpoint into my Watson Studio Spark/Python workspace. The notebook is checked in here: https://github.com/tonye/TonyDemocratizeOptimize/

Here is the solution:

My minimum cost is $18,400. Interestingly, it also tells me that I shouldn’t use Service 2 at all.
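As a quick sanity check on that figure, the sketch below back-solves the split between Service 1 and Service 4 from the two stated facts (a total cost of 18,400 and no use of Service 2). It assumes Service 3 sits exactly at its 400-unit contract minimum, which the post does not confirm, so treat the resulting split as an inference rather than the notebook's output. The function name is hypothetical.

```python
# Per-unit costs from the table in the post.
COST = {"service1": 17, "service2": 12, "service3": 8, "service4": 7}
TOTAL_UNITS = 1500          # from the post
REPORTED_MIN_COST = 18400   # from the post


def back_solve_split(service2=0, service3=400):
    """Solve for service1 and service4 given the reported total cost.

    service2=0 is reported in the post; service3=400 is an ASSUMPTION
    (the contract minimum), not a value taken from the notebook.
    """
    remaining_units = TOTAL_UNITS - service2 - service3
    remaining_cost = (
        REPORTED_MIN_COST
        - COST["service2"] * service2
        - COST["service3"] * service3
    )
    # Two equations, two unknowns:
    #   service1 + service4 = remaining_units
    #   17 * service1 + 7 * service4 = remaining_cost
    service1 = (remaining_cost - COST["service4"] * remaining_units) / (
        COST["service1"] - COST["service4"]
    )
    service4 = remaining_units - service1
    return {
        "service1": service1,
        "service2": service2,
        "service3": service3,
        "service4": service4,
    }


split = back_solve_split()
print(split)
# Recompute the cost from the split to confirm it matches the reported total.
print(sum(COST[name] * split[name] for name in COST))
```

Under that assumption, the numbers are internally consistent: the split uses all 1,500 units and reproduces the reported cost.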

Conclusion

Does everyone have to know optimization? No, but chances are that someone on your team is eager to put their quantitative knowledge to work. It’s a wonderful capability, and it is extremely easy to access in the IBM Cloud. This kind of capability democratizes access to data as never before, so that you can optimize your decisions, as we do in the IBM Cloud Garage.

Executive Architect, Cloud Garage
