with Tags:
machine_learning

IBM Machine Learning
We announced IBM Machine Learning last week; see here and here for event replays. I was interviewed as part of the launch. A good write-up of what I said can be found on Silicon Angle. You can find the video of the interview at the end of this post. This video has been shared on social media under various titles, but the one that had the most impact is: The evolution of #machinelearning: fusing human thought with algorithmic insights. It is probably because the interview contains a discussion of AI potential... [More]
Tags:  machine_learning |
Feature Engineering For Deep Learning
Feature engineering and feature extraction are key, and time-consuming, parts of the machine learning workflow. They are about transforming training data and augmenting it with additional features in order to make machine learning algorithms more effective. According to its promoters, deep learning is changing that. With deep learning, one can start with raw data, as features will be automatically created by the neural network when it learns. For instance, see this excerpt from... [More]
Tags:  machine_learning deep_learning |
Is Python Slow As Molasses?
Python is a popular language for machine learning. It is even the most popular one according to a study of mine recently published here and on KDnuggets. The above study generated quite a few reactions on social media. One that drew my attention reads: I just recently switched to Scala. Somewhat similar to python but with a number of advanced concepts. It's definitely more complex to learn than Java, but from a performance perspective much faster. Although considering that Python is slow as molasses yet is leading... [More]
Tags:  python machine_learning |
Most Popular Posts Of 2016
I wish you, all my readers, your families, and your friends, all the best for 2017. Your renewed interest made me write 34 entries in 2016. This year I focused mostly on Python and Machine Learning, reflecting my role at IBM as the technical lead for machine learning offerings. This shows in the topics of my most popular posts for 2016: "The Most Popular Language For Machine Learning Is ..."; "A Speed Comparison Of C, Julia, Python, Numba, and Cython on LU Factorization"; "Installing XGBoost For Anaconda on Windows"; Tidy Data In... [More]
Tags:  python optimization machine_learning |
The Most Popular Language For Machine Learning Is ...
What programming language should one learn to get a machine learning or data science job? That's the silver bullet question. It is debated in many forums. I could provide here my own answer to it and explain why, but I'd rather look at some data first. After all, this is what machine learners and data scientists should do: look at data, not opinions. So, let's look at some data. I will use the trend search available on indeed.com. It looks for occurrences over time of selected terms in job... [More]
Tags:  deep_learning machine_learning data_science |
Using Python Subprocess To Drive Machine Learning Packages
Many state-of-the-art machine learning algorithms are available as open source software. Many of these packages are designed to be used via a command line interface. I much prefer to use Python, as I can mix many packages together and use a combination of NumPy, Pandas, and Scikit-Learn to orchestrate my machine learning pipelines. I am not alone, and as a result, many open source machine learning packages provide a Python API. Most, but not all. For instance Vowpal Wabbit does not support a... [More]
Tags:  analytics machine_learning python |
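The general idea in the excerpt above, driving a command-line tool from Python, can be sketched with the standard library's subprocess module. This is only a minimal illustration, not the code from the post: the helper name run_tool and the stand-in "tool" (a Python one-liner that averages numbers read from standard input) are assumptions made for the example, and no flags of any real machine learning package are shown.

```python
import subprocess
import sys


def run_tool(args, input_text=None, timeout=60):
    """Run a command-line tool and return its standard output as text.

    Hypothetical helper for illustration. Raises
    subprocess.CalledProcessError if the tool exits with a non-zero code.
    """
    completed = subprocess.run(
        args,                 # list of strings; no shell is involved
        input=input_text,     # optional text sent to the tool's stdin
        capture_output=True,  # collect stdout and stderr
        text=True,            # work with str instead of bytes
        check=True,           # raise on non-zero exit status
        timeout=timeout,      # avoid hanging forever on a stuck tool
    )
    return completed.stdout


# Stand-in for a real command-line ML tool (an assumption for this sketch):
# a Python one-liner that reads numbers from stdin and prints their mean.
fake_tool = [
    sys.executable,
    "-c",
    "import sys; xs = [float(l) for l in sys.stdin if l.strip()]; "
    "print(sum(xs) / len(xs))",
]

output = run_tool(fake_tool, input_text="1.0\n2.0\n3.0\n")
mean_value = float(output.strip())
print(mean_value)
```

Passing the command as a list rather than a single shell string avoids quoting problems, and check=True turns a failed run into a Python exception that the surrounding pipeline can handle.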
The Machine Learning Workflow
I recently gave two talks on the machine learning workflow, discussing pain points within it and how we might address them. The first was at Spark Summit Europe in Brussels, the other at MLConf in San Francisco. You can find videos and slides for each below. The main message is that the machine learning workflow is not that simple. MLConf, San Francisco: That was a great event. I was in very good company with top presenters from a number of prominent companies, as you can... [More]
Tags:  analytics machine_learning spark |
Installing LightGBM on MacOSX with Python wrapper
There is a new kid in machine learning town: LightGBM. It is an implementation of gradient boosted decision trees (GBDT) recently open sourced by Microsoft. GBDT is a family of machine learning algorithms that combine great predictive power with fast training times. Interested readers can find a good introduction on how GBDT works here. Why does LightGBM matter? It matters because it is much faster to train than the reference implementation for GBDT (XGBoost). I learned about LightGBM... [More]
Tags:  analytics machine_learning |
What's New In Machine Learning?
What has changed in Machine Learning in the past 25 years? You may not care about this question. You may not even realize that Machine Learning as a technical and scientific field is older than 25 years. But I do care about this question. I care because I got a PhD in Machine Learning in 1990. I then moved sideways to work on constraint programming and mathematical optimization. I have been back in machine learning for a couple of years, and I did ask myself this: is my PhD still relevant, or has Machine Learning... [More]
Tags:  machine_learning optimization analytics |
Spark Summit Europe
Please join me at the Spark Summit next week (Oct 25-27) in Brussels. This is one of the yearly events where the Spark community gathers. More details can be found at: https://spark-summit.org/eu-2016/ The Meetup I will be talking about Machine Learning with my colleague Nick Pentreath at the meetup we organize right after the summit, on Thursday night. Location details below: Spark and Machine Learning meetup: Brussels. October 27th from 6:30pm to 9:30pm (Brussels time) ... [More]
Tags:  machine_learning analytics spark |
An ode to the analytics grease monkeys (analytics deployment = ROI)
Here is a guest post by my colleague Erick Brethenoux, Director, IBM Analytics Strategy, Decision Management & Initiatives. He provides a new and interesting angle on a very important topic that I have discussed several times here: analytics and data science provide business value only when business actions are taken. I like the way Erick discusses it, and I hope you'll agree with me. Analytics has value only when it is actionable Analytics provide a significant business (i.e., monetary) impact for organizations when analytical... [More]
Tags:  data_science analytics machine_learning |
A Practical Guide to Machine Learning: Understand, Differentiate, and Apply
Co-authored by Rob Thomas (@robdthomas) Machine Learning represents the new frontier in analytics, and is the answer to how many companies can capitalize on the data opportunity. Machine Learning was first defined by Arthur Samuel in 1959 as a “Field of study that gives computers the ability to learn without being explicitly programmed.” Said another way, this is the automation of analytics, so that it can be applied at scale. What is highly manual today (think about an analyst combing thousand-line spreadsheets), becomes... [More]
Tags:  analytics machine_learning |
Be Brave In Machine Learning
There is a lot of confusion about the role of test data in machine learning. The typical outcome is overfitting, a plague that must be avoided at all reasonable cost. The confusion comes from blurring two fundamentally different roles for test data: Model selection. Candidate machine learning models are applied to data that was not used to train them. The model leading to the best predictions is selected. The data used for selecting models is often called validation data. Performance on validation data... [More]
Tags:  machine_learning analytics |
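The model selection role described in the excerpt above can be sketched in a few lines of plain Python. The excerpt is cut off before the second role is named; this sketch assumes it is the usual one, estimating the selected model's performance on test data that played no part in selection. Everything here is illustrative: the synthetic data, the k-nearest-neighbor models, and the split sizes are assumptions, not taken from the post.

```python
import random

random.seed(0)

# Synthetic 1-D regression data: y = x^2 plus noise (purely illustrative).
points = []
for _ in range(300):
    x = random.uniform(-3, 3)
    points.append((x, x * x + random.gauss(0, 1)))

# Three disjoint subsets: train to fit, validation to select, test to report.
train, validation, test = points[:180], points[180:240], points[240:]


def knn_predict(train_set, x, k):
    """Predict y at x as the mean y of the k nearest training points."""
    nearest = sorted(train_set, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train_set, eval_set, k):
    """Mean squared error of the k-NN model on eval_set."""
    errors = [(knn_predict(train_set, x, k) - y) ** 2 for x, y in eval_set]
    return sum(errors) / len(errors)


candidate_ks = [1, 3, 5, 10, 25, 50]

# Role 1, model selection: compare candidates on validation data only.
validation_errors = {k: mse(train, validation, k) for k in candidate_ks}
best_k = min(validation_errors, key=validation_errors.get)

# Role 2 (assumed), final estimate: test data is used once, after selection.
test_error = mse(train, test, best_k)

print("validation MSE per k:", validation_errors)
print("selected k:", best_k)
print("test MSE of selected model:", test_error)
```

The validation error of the selected model tends to be optimistic, because that model was picked for doing well on exactly that data. Keeping the test set out of the selection loop is what makes the final number a fair estimate.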
Analytics For The Perfect Race Across America
Applying analytics to sports is one of the fun parts of my work. I had a great opportunity last year to work as part of an IBM team to help ultra cyclist Dave Haase race across America. Racing across America is quite a challenge: imagine a 3,000+ mile, nonstop race across the USA, with over 110,000 feet of elevation (see pictures below). Cyclists can race as they wish and rest only when they choose to. Last year's winner slept about one hour every 24 hours, for 8 days. Dave Haase finished close... [More]
Tags:  predictive_analytics cycling machine_learning analytics |
Machine Learning As Prescriptive Analytics
I made a mistake about machine learning. Repeatedly. I said, and I wrote, that machine learning and predictive analytics were almost the same. To be more specific, my view was simple: analytics can be divided into four categories, exemplified below (see Analytics Landscape for details). I put machine learning near predictive analytics in this 2D landscape: Of course, I also put optimization as the queen of all analytics technologies, as it yields the best business value. What else would you expect from someone who spent nearly 3... [More]
Tags:  machine_learning analytics optimization |