Tagged with:
analytics

## Machine Learning Algorithm != Learning Machine
How easy is it to build a learning machine? Shouldn't one just hire some Machine Learning PhDs and have them run their algorithms? Well, this is most probably a good idea, but it won't be enough. I'll try to explain why in this blog entry. Before answering our questions, let's define what we are dealing with. A Learning Machine is a machine (a piece of software, a web site, a mobile app, a robot, pick your favorite) that performs a task, and that gets better and better as it performs it. In recent years,... [More]
Tags: bigdata machine_learning datascience analytics

## Installing XGBoost For Anaconda on Windows
XGBoost is a recent implementation of Boosted Trees. It is a machine learning algorithm that yields great results in recent Kaggle competitions. I decided to install it on my computers to give it a try. Installation on OSX was straightforward using these instructions (as a matter of fact, reality is a bit more complex, see the update at the bottom of this post). Installation on Windows was not as straightforward. I am sharing what worked for me in case it might help others. I describe how to install for... [More]
Tags: xgboost python machine-learning analytics datascience

## Installing PyCUDA On Anaconda For Windows
PyCUDA is a great library if you want to use GPU computing with NVIDIA chips. If you want a more portable approach or if you have ATI chips instead of NVIDIA, then you might consider PyOpenCL instead of PyCUDA. I provided instructions on how to install PyOpenCL on Anaconda for Windows in a previous entry. Installing PyCUDA on Anaconda for Windows can be tricky. Here is what you can do; it worked fine for me. I am using the latest Anaconda distribution with Python 3.5 in it.... [More]
Tags: python big_data pycuda machine_learning anaconda analytics

## Perception Matters
I enjoyed reading the following from Dear Mona, Which Is The Fastest Check-Out Lane At The Grocery Store? (You should read it all, as it provides an interesting crash course on queuing theory in practice): After airline passengers wouldn’t stop complaining about the time they spent at baggage claim (even when more staff were added and wait times fell) a Houston airport simply moved the arrival gates so that passengers spent more of their “wait” time walking to... [More]
Tags: analytics psychology optimization

## Analysts Views On Optimization
According to leading analyst firms, the corporate world should invest more in advanced analytics in general, and optimization in particular. Here are a few examples. Last week at the IBM Insight conference, Forrester's Mike Gualtieri presented with my colleague Eric Mazeran an interesting view on prescriptive analytics. Here is one of their slides. I like it because it proposes a comprehensive view of where optimization lies within an end-to-end flow from data to actions. For those who never read my blog before, mathematical... [More]
Tags: prescriptive analytics optimization

## Using Jupyter Docker Stack To Run R Notebooks
I'm a Python fan but I am cognizant of R being more popular than Python among data scientists. The combination of the Python scientific stack, Jupyter notebooks, and Docker makes it easy to deploy cloud data science services; see for instance Deploy IPython Notebooks With Docker On Bluemix In Minutes. Can we do the same with R? I decided to test a combination of R, Jupyter notebooks, and Docker. To make an unbiased test, I further decided to use R code not written by me (which probably is a good thing... [More]
Tags: notebook docker analytics jupyter rstats

## Solving Sudoku In Python With DOcplex On DOcloud
Sudoku is a great example to introduce prescriptive analytics: it is well known, and it is not trivial to solve manually. I will use the docplex Python API to implement a web application that solves Sudoku problems. The code is available in a notebook on GitHub and nbviewer. More information on docplex can be found here. DOcplex can be installed via pip like any other Python package: `!pip install docplex`. Once installed, we can use it to create arbitrary math programming models. These models can either be solved using our... [More]
Tags: cloud optimization docplex docloud python analytics sudoku
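
The excerpt above stops before showing the docplex model, so the following is only a solver-independent sketch, not the post's code and not docplex. It uses plain backtracking search to enforce the three Sudoku rules (each digit once per row, per column, and per 3x3 box) that a math programming model would state as constraints. The function names `is_allowed` and `solve` and the sample puzzle are illustrative assumptions.

```python
def is_allowed(grid, r, c, d):
    """Check that digit d can go at (r, c) without breaking a Sudoku rule."""
    if any(grid[r][j] == d for j in range(9)):   # row rule
        return False
    if any(grid[i][c] == d for i in range(9)):   # column rule
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)          # top-left corner of the 3x3 box
    return all(grid[i][j] != d
               for i in range(br, br + 3)
               for j in range(bc, bc + 3))       # box rule


def solve(grid):
    """Fill the zeros of grid in place; return True if a solution was found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in range(1, 10):
                    if is_allowed(grid, r, c, d):
                        grid[r][c] = d
                        if solve(grid):
                            return True
                        grid[r][c] = 0           # undo and try the next digit
                return False                     # no digit fits this cell
    return True                                  # no empty cell left


# Sample puzzle (0 marks an empty cell); chosen for illustration only.
puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = solve(puzzle)
for row in puzzle:
    print(" ".join(str(d) for d in row))
```

A math programming formulation would typically express the same rules declaratively and leave the search to the solver, which is the approach the post takes with docplex.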

## Analytics Landscape
A great way to explain the value of analytics is to speak about the analytics maturity model. This model contains two pieces. First, analytics is a two-step process: insights are generated from data, then decisions are made based on these insights. Second, we distinguish four maturity levels, depending on how much of the analytics process is automated: descriptive, diagnostic, predictive, and prescriptive. Descriptive Analytics answers: What happened? What is happening now? It makes data visible to human decision... [More]
Tags: big_data data_science analytics

## Optimizing Car (And Cyclist) Speed
What is the optimal way to adjust one's car speed in order to minimize fuel consumption (or CO2 emission) while meeting a desired travel time? The answer to that question came to me after I wrote my last blog entry on Predicting Cyclist Speed. In that post I explained how an endurance cyclist, Dave Haase, was using his power. He wasn't using constant power as most cyclists do. This made me think about what would be the best strategy. Use constant power, or use something closer to what Dave was doing? After... [More]
Tags: green math sustainable optimization analytics

## Predicting Cyclist Speed
I have been the 'data scientist' on the IBM team that helped Dave Haase run the Race Across America (RAAM) this year. This project exemplified quite a few of the classical tips of data science documented in The Inconvenient Truth About Data Science: Data is never clean. You will spend most of your time cleaning and preparing data. 95% of tasks do not require deep learning. In 90% of cases generalized linear regression will do the trick. Big Data is just a tool. You should embrace the Bayesian... [More]
Tags: data_science optimization python analytics

## Prescriptive Analytics Is Easier And More Profitable Than Predictive Analytics
When you hear about algorithms these days, chances are that you hear about machine learning or predictive analytics. (Some make a distinction between machine learning and predictive analytics, but the distinction is not material for this post. I'll use both interchangeably here.) A quick search returns recent discussions in the news of machine learning algorithms: Using Algorithms to Determine Character, When Algorithms Discriminate,... [More]
Tags: optimization predictive prescriptive analytics

## How Does Cognitive Computing Relate To Analytics?
Readers of this blog are familiar with the analytics maturity model that includes several analytics levels: descriptive, predictive, and prescriptive. Presenting this model sometimes triggers a very interesting question: where would Watson fit? If you've missed recent IBM history, here is a refresher. Watson stands for IBM's offerings for Cognitive Computing. It has its roots in the Jeopardy Watson supercomputer that won the Jeopardy game a few years ago. Watson now includes several... [More]
Tags: cognitive analytics

## Modeling Cyclist Power
With the Tour de France nearing its end, and with some controversy about the power developed by some racers, I thought it would be timely to share some work I did in a recent project called Analytics For The Perfect Race. Part of that project required the capacity to forecast the pace at which the cyclist would move. For that we needed to build a physical model of the cyclist. More precisely, we needed to build a model that relates the power of the cyclist to his actual speed on the... [More]
Tags: analytics cycling python
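
The excerpt above does not show the model the post builds, so the following is only a minimal sketch of what a power-to-speed relation can look like. It assumes the standard steady-state balance of forces on a bicycle (gravity, rolling resistance, aerodynamic drag) with no wind. The function names and every parameter value (mass, drag area `cda`, rolling coefficient `crr`, air density `rho`, drivetrain `efficiency`) are illustrative assumptions, not values from the project.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def power_required(speed, grade=0.0, mass=80.0, cda=0.32, crr=0.005,
                   rho=1.225, efficiency=0.97):
    """Return the rider power (W) needed to hold `speed` (m/s) with no wind.

    All default values are illustrative assumptions, not measured data.
    """
    theta = math.atan(grade)                      # slope angle from rise/run
    f_gravity = mass * G * math.sin(theta)        # weight component along the road
    f_rolling = mass * G * crr * math.cos(theta)  # tire rolling resistance
    f_drag = 0.5 * rho * cda * speed ** 2         # aerodynamic drag
    return (f_gravity + f_rolling + f_drag) * speed / efficiency


def speed_from_power(power, grade=0.0, max_speed=40.0, tol=1e-9, **params):
    """Invert power_required by bisection: speed (m/s) sustained by `power` (W)."""
    if power <= 0:
        raise ValueError("power must be positive")
    lo, hi = 0.0, max_speed
    if power_required(hi, grade, **params) < power:
        raise ValueError("power exceeds what max_speed requires")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power_required(mid, grade, **params) < power:
            lo = mid   # mid is too slow for this power
        else:
            hi = mid   # mid needs at least this much power
    return 0.5 * (lo + hi)


flat = speed_from_power(250.0)               # flat road
climb = speed_from_power(250.0, grade=0.06)  # 6% climb
print(f"250 W: {flat * 3.6:.1f} km/h on the flat, {climb * 3.6:.1f} km/h on a 6% climb")
```

Bisection is used here because the power equation is cubic in speed and has a single positive root for any positive power, so a bracketing search is simple and robust. A forecast of pace along a route would apply such an inversion segment by segment with each segment's grade.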

## Where Is Operations Research In Social Media?
Michael Trick's State of Operations Research Blogging discusses the fact that OR blogging is vanishing while Twitter activity around OR is increasing. As an example of the latter, look for #ismp2015 on Twitter and you'll get the most detailed journal of an OR conference I have ever seen. Mike further links the blogging decline to the disappearance of Google Reader. I do think that there is another reason for the decrease in OR blogging. There is definitely a trend where OR is being... [More]
Tags: optimization analytics data_science

## Python Is Not C
Update on December 8, 2015. An updated version of this post is available at Python Is Not C: Take Two. I have been using Python a lot lately, for various data science projects. Python is known for its ease of use. Someone with coding experience can use it effectively in a few days. This sounds good but there may be an issue if you program in Python as you would in another language, say C. Let me give an example based on my own experience. I have a strong background with imperative languages like C and C++. I... [More]
Tags: analytics python programming