I had the honor to give a tutorial at a Big Data and Optimization seminar, thanks to an invitation from John Poppelaars. One of the topics I discussed seemed to resonate well, so let me try to explain it here.
The first thing people think of when they hear about Big Data is large data volume. Big Data has dimensions other than volume (see Big Data For Dummies, for instance), but let's focus on large data sets. Is current optimization technology ready for the huge data sets now available (petabyte scale)?
My answer is yes, even though state-of-the-art math programming solvers cannot solve models of petabyte size.
Let us look at an example before I describe the general pattern that, in my opinion, makes big data optimization possible.
I recently discussed retail price optimization. My colleagues at demandTec have developed a series of quite advanced analytics techniques that help major retailers adjust their prices in order to maximize their margins. I won't repeat the details here, but let us look at the general approach that was used. The process has basically two steps:
Training. A predictive model is built using history data. That model predicts sales levels as a function of product prices and additional data. History data can be huge, as it includes all transactions (all sales) in all stores for all products of the retailer over a period of two years or more. For each sale it records the current price of every product sold, and the use of coupons and other promotions. It also includes data about everything that could influence sales levels. To name a few: competitive information (competitor prices for similar products), inventory levels (out-of-stock situations impact sales), weather (hot weather favors drink and ice cream sales), seasonality (Christmas or Black Friday are a bit special), the existence of national marketing campaigns such as TV ads, etc. The net result is a very large volume of data. That data is analyzed in order to produce a function (aka a model) that computes sales levels given input data such as existing prices (own and competitors'), the period of the year, the weather, etc. This function is called price elasticity.
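To make the training step concrete, here is a toy sketch. It is not demandTec's actual technique, and the sales history below is made up; it simply estimates a constant price elasticity from (price, units sold) observations with a log-log least-squares fit:

```python
import math

# Hypothetical historical observations: (price, units_sold).
# Generated here from a known elasticity of -2.0 so we can check the fit.
history = [(p, 1000.0 * p ** -2.0) for p in (1.0, 1.5, 2.0, 2.5, 3.0)]

# Log-log ordinary least squares: log(units) = log(a) + e * log(price),
# so the fitted slope e is the (constant) price elasticity of demand.
xs = [math.log(p) for p, _ in history]
ys = [math.log(u) for _, u in history]
n = len(history)
mx, my = sum(xs) / n, sum(ys) / n
elasticity = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
             sum((x - mx) ** 2 for x in xs)

print(round(elasticity, 3))  # → -2.0, recovering the elasticity used above
```

A real elasticity model is of course far richer: many products, cross-price effects, seasonality, promotions, and so on. But the output has the same character: a small set of fitted coefficients.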
Online. On a regular basis (it could be as often as several times a day for an online retailer), we compute the prices that would lead to the highest margin, or to the highest revenue, depending on what the retailer is looking for. This is where optimization plays a role. The data provided to the optimization piece is twofold: first, the price elasticity model; second, the current conditions, including own prices, competitor prices, current promotions, current ad campaigns, current weather, etc. The good news here is that we no longer deal with huge data sets. The predictive model is the result of some logistic regression, and is usually expressed as a rather small nonlinear function. The current conditions are also a small data set. As a result, the optimization models we have to solve are of a decent size, well within the range of what current solvers can handle.
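As a minimal illustration of the online step, suppose training produced a simple constant-elasticity demand model, units = a * price^e (the coefficients and unit cost below are assumptions, not real retail data). A search over candidate prices then finds the margin-maximizing price:

```python
# Hypothetical fitted demand model from the training step: units = a * price^e.
a, e = 1000.0, -2.0   # assumed coefficients, not real retail data
cost = 1.0            # assumed unit cost

def margin(price):
    """Margin at a given price: (price - cost) * predicted units sold."""
    return (price - cost) * a * price ** e

# Coarse grid search over prices from 1.01 to 9.99 in steps of 0.01.
best = max((margin(c / 100.0), c / 100.0) for c in range(101, 1000))
print(best[1])  # → 2.0, the margin-maximizing price on this grid
```

In a real deployment the model is richer and the optimization is done by a nonlinear solver rather than a grid search, but the shape of the problem is the same: a small fitted model plus a small data set describing the current conditions.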
We can summarize the process graphically as follows.
This is actually quite general. The pattern is a two-step process:
Predictive Analytics. Analyze history data to understand the behavior of the process we are interested in. In the above example, the process is the customer buying process. The understanding is a quantification of how many customers will buy each of the products on sale. This understanding is built by answering two business questions: "What did happen?" and "Why did it happen?"
Prescriptive Analytics. Once such understanding is built, we can also answer a third question: "What will happen?" In turn, when we have a good idea of what might happen, we can plan ahead to provide the best possible business response. In the above example, we could set new prices that maximize the number of sales.
This is a rather flexible pattern that we have seen regularly. We can depict it as follows:
The key point is that the large body of history data is compressed into a small predictive model. That predictive model contains the essence of the history data. It embodies how customers behaved in the above example.
Then we can use this small model, together with small data sets representing the current situation, to compute optimal actions. Since we now deal with reasonably sized data sets, we can use current solvers. There is no need for solvers able to ingest petabytes directly.
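A back-of-the-envelope calculation, with entirely made-up numbers, shows how drastic this compression can be:

```python
# Illustrative only: sizes are invented, not from any actual retailer.
transactions = 2 * 365 * 1000 * 50_000  # days * stores * sales per store per day
bytes_per_transaction = 100             # rough record size
history_bytes = transactions * bytes_per_transaction

model_coefficients = 10_000             # elasticities, seasonality terms, ...
model_bytes = model_coefficients * 8    # one double-precision float each

print(history_bytes // model_bytes)     # ratio of history size to model size
```

Even with these rough figures, the fitted model is tens of millions of times smaller than the history it summarizes, which is why the optimization step never needs to touch the petabytes directly.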
We will see in upcoming posts how the same pattern can be used to address other Big Data dimensions.