Python Is Not C
I have been using Python a lot lately for various data science projects. Python is known for its ease of use. Someone with coding experience can use it effectively in a few days. This sounds good, but there may be an issue if you program in Python as you would in another language, say C. Let me give an example based on my own experience. I have a strong background with imperative languages like C and C++. I also have substantial experience with oldies but goodies like Lisp and Prolog. I also used Java, JavaScript,... [More]
Tags: analytics python programming 
No, The TSP Isn't NP Complete
Two recent blog posts discussing the Traveling Salesman Problem (TSP) led me to write this post. The two blog posts are What is Operations Research by Graham Kendall, and I’ve Been Everywhere (Optimally…) by Rob Jefferson. Both are worth reading (I wish I had written them). These posts share two interesting properties: both discuss the TSP, and both make a slight mistake about the TSP. The same mistake occurs regularly in blog posts and even books. The mistake is... [More]
Tags: optimization analytics np 
The Role Of Data Science
I am sure I'll get flamed for this post, given how hyped data science is. Let me first say that I do not pretend to define what data science is; others, probably more qualified than me, have done it well. For instance, I like this definition from Dawen Peng, as it speaks to an Operations Research person like me. I will rather focus on the role data science can have for business. What I see the most is data scientists analyzing data, then publishing reports on the insights they found in the data. Just browse over... [More]
Tags: analytics big_data data_science 
NP Or Not NP? That Is The Question
A recent blog entry on TSP and NP completeness made me write the long overdue entry I wanted to write about the complexity of optimization problems. It comes into play when customers ask this simple question: My problem takes too long to solve, what can I do? I'm pretty sure most optimization professionals have heard this question at least once. I already blogged about it in my It Is Too Slow entry without actually answering it (clever, isn't it?). Here are various ways to answer it depending on your own agenda. As an employee of one of the largest... [More]
Tags: analytics complexity optimization 
We must show the pain before we can propose the cure
Part of my job is to inject optimization in IBM Analytics solutions. During one of the discussions with solution teams we argued about a fairly general issue that can prevent prescriptive analytics adoption. I think it is worth sharing. Specifically, one colleague presented the following analytics classification. I said that we should rather use the one below (I discussed it in Prescriptive vs Predictive Analytics Explained), where the question prescriptive analytics answers is "What should I do about it?... [More]
Tags: analytics optimization 
Practical Guidelines for Solving Difficult Mixed Integer Programs
Update on Sept 30 2013: Ed Klotz has co-authored a paper with Alexandra M. Newman on that very topic, worth a read. They also have a related paper on Practical guidelines for solving difficult linear programs. Update on Sept 6 2013: slides and replay for Ed Klotz's presentation are available in our developerWorks community. Ed Klotz will present at our next virtual user group webinar. Ed is the worldwide expert on how to tune CPLEX and reformulate models in order to get better... [More]

Installing IBM Containers Extension on Boot2Docker
You want to deploy Docker containers on IBM Bluemix? You have a Windows workstation? If you answered yes to both questions, then this post is for you. The Docker service for IBM Bluemix is the IBM Containers service. This service is currently in beta test and I was lucky enough to get access to it. You can register for the beta on the IBM Bluemix home page shown above. In order to use this service we must first install the IBM Containers Extension (ice) on our local Docker host. For Windows machines, the... [More]
Tags: boot2docker bluemix windows docker containers 
Computing The Really Optimal Tour Across The USA On The Cloud With Python
When Randy Olson's Computing the optimal road trip across the U.S. resulted in articles in the Washington Post, NY Daily News, Daily Mail, People Magazine, NY Times, NPR, and many other outlets, the mathematical optimization community was surprised, almost shocked. It was surprised for a couple of reasons. First, the road trip computed by Randy Olson was not optimal, i.e., there is a shorter tour. The first to publish the shorter tour was Bill Cook in... [More]
Tags: cloud analytics optimization python 
Actionable Insights
It is good practice to eat your own food. I should be no exception. In my post on the role of data science I was blaming data scientists who left business users without any clue about how to use the insights they produce. I should do the same, and help businesses use the advice I gave in that post: the role of data science is to enable data-based decision making. What does it mean in practice for a business? It means that data scientists should not only provide interesting insights, but they also should care... [More]
Tags: big_data analytics decision data_science optimization 
Memory Locality
How can Java code be 85x slower than C++ code solving the same problem? This post tries to answer this question. Why am I asking this question in the first place? It all started with a seemingly simple exercise. We were working on a large-scale analytics (aka big data) project and had trouble agreeing on what results a particular analysis should return. I decided to write C++ code for it, and a colleague decided to use Java. The goal was to use two completely independent implementations for cross... [More]
Tags: analytics graphs high_performance 
Solving the hardest Sudoku, part 1
Do you know the hardest Sudoku problem? Do you know the best way to solve it? Before answering these questions, let me remind you of what the Sudoku puzzle is about in case you haven't read a newspaper in the last decade (adapted from Wikipedia): The objective is to fill a 9×9 grid with digits so that the digits in each column, each row, and each of the nine 3×3 subgrids that compose the grid (also called "blocks") are pairwise different. The puzzle setter provides a partially completed grid, which... [More]
Tags: constraint_programming mathematical_optimization optimization analytics sudoku mathematics 
Benchmarking Is Tricky
We benchmark all the time. Why? There are two main reasons. First, our customers keep asking for performance improvements, as they apply CPLEX to larger and more complex problems. We therefore need to make sure newer releases of CPLEX are faster for our customers. The only way to know is to benchmark our code. Second, marketing people like to be able to claim speedups in their messaging. Indeed, this boils down to a simple number that can be reused everywhere. The latter is a nice side effect,... [More]
Tags: analytics cplex benchmark 
Proactive Analytics
Why blog again about optimization and analytics? Because the current way of having optimization be part of analytics is a bit misleading. Let me first say I assume that optimization is part of analytics here. Granted, a previous post of mine supported a different view, but the idea that mathematical optimization is part of the broader category of analytics is gaining momentum. For instance, the INFORMS society is pushing for it with its... [More]
Tags: analytics optimization 
Issues Are Not Where One Think They Are
Where are the issues when one tries to use optimization to improve business? They may not lie where one thinks. My former colleague Laurent Perron (now at Google) splits the average time spent on optimization projects as follows in his CP 2011 invited talk: 50% getting the right problem with the right people; 25% getting clean data; 5% solving the problem; 20% reporting the results and explaining the implications. One could argue about the exact split, but the broad picture is true as far as I can tell from my experience. I would... [More]
Tags: optimization analytics graphics 
What Is The Solution When There Is No Solution?
Optimization is like a Ferrari: when you drive it correctly you can achieve incredible performance. But you must understand what it can do and what it can't do, or you will crash. The same is true for optimization. I'm starting a series of posts on various pitfalls that people using optimization can fall into. This is the "what it can't do" part. This will complement posts where I brag about the value of optimization, which are centered around what it can do. Today's topic is about the difference between an exact answer and a useful... [More]
Tags: modeling infeasibility optimization overconstrained 