Python Is Not C
Update on December 8, 2015. An updated version of this post is available at Python Is Not C: Take Two. I have been using Python a lot lately for various data science projects. Python is known for its ease of use. Someone with coding experience can use it effectively in a few days. This sounds good, but there may be an issue if you program in Python as you would in another language, say C. Let me give an example based on my own experience. I have a strong background with imperative languages like C and C++. I... [More]
Tags: analytics python programming
How To Make Python Run As Fast As Julia
Julia vs Python. Should we ditch Python and other languages in favor of Julia for technical computing? That's certainly a thought that comes to mind when one looks at the benchmarks on http://julialang.org/. Python and other high-level languages are way behind in terms of speed. The first question that came to my mind was different, however: did the Julia team write the Python benchmarks in the best way for Python? My take on this kind of cross-language comparison is that the benchmarks should be defined by tasks to... [More]
Tags: numpy python numba julia cython
Python Is Not C: Take Two
When I wrote Python Is Not C six months ago, I did not imagine that it would be my most popular post ever, with more than 67k views. The conclusion of that post reads: The lesson is clear: do not write Python code as you would do in C. Use numpy array operations rather than iterate on arrays. For me it meant a mental shift. Given that the Python ecosystem is rapidly evolving, I decided to revisit this conclusion using the performance improvement tools that I discuss in my previous post. Let me briefly introduce... [More]
Tags: numba python numpy nearest_neighbors scipy sklearn
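To make the quoted advice concrete, here is a minimal sketch contrasting the two styles. It is not the code from the post: the function names and the sample data are hypothetical, and the squared-distance task is only an assumption chosen because it is a typical building block of a nearest-neighbor search.

```python
import numpy as np


def squared_distances_loop(points, query):
    """C-style version: iterate over every point and every coordinate."""
    result = []
    for p in points:
        total = 0.0
        for a, b in zip(p, query):
            total += (a - b) ** 2
        result.append(total)
    return result


def squared_distances_numpy(points, query):
    """Array version: one broadcast subtraction, then a sum along each row."""
    diff = points - query
    return (diff * diff).sum(axis=1)


# Hypothetical sample data: three 2D points and a query at the origin.
points = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
query = np.array([0.0, 0.0])

loop_result = squared_distances_loop(points, query)
numpy_result = squared_distances_numpy(points, query)
```

Both functions return the same values. The array version hands the inner loops to numpy's compiled code, which is the mental shift the quoted conclusion refers to.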
Free CPLEX Software For Academics
Here is an update of my previous post on this topic. IBM ILOG CPLEX Optimization Studio (CPLEX) is free for academics thanks to the IBM Academic Initiative. If you are not a faculty member, you may be interested in the free trials available on our developerWorks site. IBM Academic Initiative (AI) is a global program that faculty members, research professionals at accredited institutions, and qualifying members of standards organizations can join. Members can get full versions of a large... [More]
Tags: free cplex academic
No, The TSP Isn't NP Complete
Two recent blog posts discussing the Traveling Salesman Problem (TSP) led me to write this post. The two blog posts are What is Operations Research by Graham Kendall, and I’ve Been Everywhere (Optimally…) by Rob Jefferson. Both are worth reading (I wish I had written them). These posts share two interesting properties: both discuss the TSP, and both make a slight mistake about the TSP. The same mistake occurs regularly in blog posts and even books. The mistake is... [More]
Tags: optimization analytics np
The Role Of Data Science
I am sure I'll get flamed for this post, given how hyped data science is. Let me first say that I do not pretend to define what data science is; others, probably more qualified than me, have done it well. For instance, I like this definition from Dawen Peng, as it speaks to an Operations Research person like me. I will rather focus on the role data science can have for business. What I see the most is data scientists analyzing data, then publishing reports on insights they found in the data. Just browse over... [More]
Tags: analytics big_data data_science
How To Quickly Compute The Mandelbrot Set In Python
Introduction. My Christmas Gift was about creating nice images of the Mandelbrot set. A comment on reddit made me write this sequel. The comment suggests that I should use a vectorized version of the code rather than the sequential one I am using. I take this excellent suggestion as an excuse to review several ways of computing the Mandelbrot set in Python using vectorized code and GPU computing. I will specifically have a look at NumPy, NumExpr, Numba, Cython, TensorFlow, PyOpenCL, and... [More]
Tags: python pycuda pyopencl math fractals gpu dataviz opencl mandelbrot
A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization
How fast can compiled Python be compared to, say, C? You'd be surprised by the answer. The study below contradicts the common wisdom that you cannot get close to C for matrix-oriented computation. A good example of a study supporting the common wisdom is Sebastian F. Walter's Speed comparision Numba vs C vs pure Python at the example of the LU factorization. He has shown that Numba, a recent compiler that can be used with Python, is between 2x and... [More]
Tags: julia numpy c gcc cython python numba scipy
Deploy IPython Notebooks With Docker On Bluemix In Minutes
Are you interested in deploying Docker containers in IBM Bluemix? Are you developing these containers on a Windows workstation with Boot2Docker? If you answered yes to both, then this post is of interest. Furthermore, if you are using IPython notebooks, then this post is definitely worth a read! The Docker container service on IBM Bluemix is IBM Containers. This service is currently in beta test and I was lucky enough to get access to it. You can register for the beta on the Bluemix home page shown above. In... [More]
Tags: bluemix boot2docker python ipython docker
How Zara Really Grew Into the World's Largest Fashion Retailer
The New York Times recently published an interesting paper on How Zara Grew Into the World’s Largest Fashion Retailer. The paper describes the Fast Fashion business model that fuels Zara's growth. What the paper doesn't say is that mathematical optimization played a key role in enabling this business model. More precisely, Zara worked with MIT and UCLA on several business problems. There are a few publications; I pasted their abstracts below. Clearance Pricing Optimization for a Fast-Fashion Retailer. Fast-fashion retailers such as Zara... [More]
Tags: customer analytics zara retail optimization
The Analytics Maturity Model
Update on Sept 21, 2015. An improved version of this model is presented in Analytics Maturity Models. Analytics can be defined in many ways, but what matters is the purpose of analytics. Most definitions agree on the following: analytics is used to gain insights from data in order to make better decisions. See for instance the INFORMS definition: Analytics is defined as the scientific process of transforming data into insight for making better decisions. Some speak of actionable insights to stress the purpose of such... [More]
Tags: optimization analytics
Simulation And Optimization Are Not The Same
Selling optimization to happy users of simulation technology can be a tough nut to crack. Here is an example I find quite effective at opening eyes. Before diving into it let me start with a disclaimer. I am not trying to show that optimization is superior to simulation, nor am I trying to undermine the value of simulation. I simply want to make clear that simulation and optimization are two different things, each with its own value. There are cases where optimization is a better fit, as shown below. There are also cases where simulation... [More]
Tags: simulation optimization analytics
My Christmas Gift: Mandelbrot Set Computation In Python
My mother likes fractals for their strange beauty. I decided to give her a simple way to generate beautiful fractal images of the Mandelbrot set. As you may guess, I'll be using Python for this. There is Python code available for this on the web, but it is either slow or does not produce nice images. Hence my own attempt at it. The code used here is available in a notebook on github or on nbviewer. I explore various ways to speed up that code in How To Quickly Compute The Mandelbrot Set In Python. In... [More]
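For readers who have not seen the computation, here is a minimal sketch of the standard escape-time method in plain Python. It is not the notebook code mentioned above: the function names, the grid bounds, and the iteration budget are all hypothetical choices made for illustration.

```python
def escape_time(c, max_iter=50):
    """Return the iteration at which z -> z*z + c leaves the disk |z| <= 2,
    or max_iter if it stays inside for the whole iteration budget."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def mandelbrot_grid(xmin, xmax, ymin, ymax, width, height, max_iter=50):
    """Evaluate escape_time on a width x height grid of complex points.

    Assumes width and height are both at least 2.
    """
    rows = []
    for j in range(height):
        y = ymin + (ymax - ymin) * j / (height - 1)
        row = []
        for i in range(width):
            x = xmin + (xmax - xmin) * i / (width - 1)
            row.append(escape_time(complex(x, y), max_iter))
        rows.append(row)
    return rows


# Hypothetical tiny grid; a real image would use hundreds of pixels per side.
grid = mandelbrot_grid(-2.0, 0.5, -1.25, 1.25, 5, 5)
```

Each entry of `grid` can be mapped to a color to produce an image. The two nested loops over pixels are what makes this sequential version slow at realistic image sizes.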

NP Or Not NP? That Is The Question
A recent blog entry on TSP and NP completeness made me write the long overdue entry I wanted to write about the complexity of optimization problems. It comes into play when customers ask this simple question: My problem takes too long to solve, what can I do? I'm pretty sure most optimization professionals have heard this question at least once. I already blogged about it in my It Is Too Slow entry without actually answering it (clever, isn't it?). Here are various ways to answer it depending on your own agenda. As an employee of one of the largest... [More]
Tags: analytics complexity optimization
The Orange Juice Algorithm
Update on May 20. A recent Network World paper discloses that Coca-Cola is indeed using our optimization software for the orange juice application I originally described in the blog entry below. A nice, recent article in Bloomberg Businessweek describes a very interesting use of mathematical optimization at Coke. Optimization is used to ensure that their Minute Maid and Simply Orange orange juices always taste the same. This paper caused some buzz because it is said that a problem of up to one quintillion... [More]
Tags: analytics optimization solution modeling