Optimization Is Ready For Big Data: Part 3, Variety
JeanFrancoisPuget
A colleague of mine once told me that Big Data should be called "All Data". Indeed, one of the key dimensions of Big Data is variety: applying analytics techniques to all kinds of data. The other dimensions include the volume and velocity of data.
Can optimization be applied to all sorts of data? I'd say yes, even though optimization primarily deals with numerical data. Indeed, optimization has already been applied to a much wider variety of data than common knowledge may suggest. Let's look at a few examples that support this claim.
The first example is about editing videos for YouTube. The material below is based on Matt
We present a novel algorithm for automatically applying constrainable, L1-optimal camera paths to generate stabilized videos by removing undesired motions. Our goal is to compute camera paths that are composed of constant, linear and parabolic segments mimicking the camera motions employed by professional cinematographers. To this end, our algorithm is based on a linear programming framework to minimize the first, second, and third derivatives of the resulting camera path. Our method allows for video stabilization beyond the conventional filtering of camera paths that only suppresses high frequency jitter. We incorporate additional constraints on the path of the camera directly in our algorithm, allowing for stabilized and retargeted videos. Our approach accomplishes this without the need of user interaction or costly 3D reconstruction of the scene, and works as a post-process for videos from any camera or from an online source.
The full paper is available here. One interesting point is that the authors can detect salient features (e.g. a face) and express constraints on how these features should appear in the final video. The video below speaks better than a long text. The video is accessible from this page if you can't see it below.
In this case, optimization is applied to video feeds, but not to the raw video data. Instead, the authors have developed a three-step process:
Optimization is used in step 2. In step 1, the raw data (video frames) is processed by predictive analytics algorithms that infer camera movement coordinates. In step 2, these coordinates are fed into an optimization model that computes smoothed camera movements. All in all, optimization is applied to the video feed thanks to preprocessing by predictive analytics algorithms.
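To make step 2 more concrete, here is a minimal sketch of L1 path smoothing written as a linear program. It is a simplified one-dimensional analogue, not the authors' implementation: the function names, the weights, the `max_dev` bound (a stand-in for the constraint that keeps the crop window inside the frame) and the synthetic jittery path are all assumptions made for illustration. It uses SciPy's `linprog` solver.

```python
import numpy as np
from scipy.optimize import linprog


def path_cost(path, weights=(10.0, 1.0, 100.0)):
    """Weighted L1 norm of the 1st, 2nd and 3rd finite differences of a path."""
    return sum(w * np.abs(np.diff(path, n=k)).sum()
               for k, w in zip((1, 2, 3), weights))


def smooth_path_l1(path, max_dev, weights=(10.0, 1.0, 100.0)):
    """Smooth a 1-D camera path by solving a linear program.

    Variables are the new path p plus one slack e per finite difference,
    with |D p| <= e, so that minimizing sum(w * e) minimizes path_cost(p).
    The new path may deviate from the original by at most max_dev.
    """
    path = np.asarray(path, dtype=float)
    n = len(path)
    # Finite-difference operators of order 1, 2 and 3 as matrices.
    diffs = [np.diff(np.eye(n), n=k, axis=0) for k in (1, 2, 3)]
    sizes = [d.shape[0] for d in diffs]
    n_slack = sum(sizes)

    # Objective: zero cost on p, weighted cost on the slack variables.
    cost = np.concatenate(
        [np.zeros(n)] + [np.full(m, w) for m, w in zip(sizes, weights)]
    )

    # Linearized absolute values:  D p - e <= 0  and  -D p - e <= 0.
    D = np.vstack(diffs)
    I = np.eye(n_slack)
    A_ub = np.vstack([np.hstack([D, -I]), np.hstack([-D, -I])])
    b_ub = np.zeros(2 * n_slack)

    # Stay close to the original path; slack variables are non-negative.
    bounds = [(c - max_dev, c + max_dev) for c in path] + [(0, None)] * n_slack

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]


# Synthetic jittery pan (assumed data): a steady move plus hand shake.
rng = np.random.default_rng(0)
jittery = np.linspace(0.0, 10.0, 40) + rng.normal(0.0, 0.5, 40)
smoothed = smooth_path_l1(jittery, max_dev=2.0)
print(path_cost(jittery), path_cost(smoothed))
```

Because the L1 norm favors sparse derivatives, the resulting path tends to be made of segments where the first, second or third difference is exactly zero, which corresponds to the constant, linear and parabolic segments mentioned in the abstract.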
The same generic pattern of predictive analytics followed by optimization is used in our second example, which is about vehicle routing. We have been working on a project with the city of Lyon to provide routing that takes into account current and predicted traffic. I'll refer to Anal
A third example is retail price optimization. I'll refer to my Retail Price Optimization post for more details, but let me summarize the process here. First, a variety of sales history data is analyzed to produce a predictive model known as price elasticity. This model is then used as input to a price optimization model. The data used to compute price elasticity can be extremely diverse; see for instance the figure below for a list.
Optimization is applied to all these data sources, albeit not directly. Predictive analytics is used to digest all the data into an elasticity model that captures the purchase intents of customers. The elasticity model is then used to build the objective function of the price optimization model.
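As a minimal sketch of how an elasticity model can feed a price optimization, consider a single product with a constant-elasticity demand curve. The demand form, the function names and all the numbers below are assumptions chosen for illustration; a real retail model would cover many products and business constraints.

```python
def demand(price, base_price, base_demand, elasticity):
    """Constant-elasticity demand: a 1% price increase lowers demand
    by about `elasticity` percent (assumed model form)."""
    return base_demand * (price / base_price) ** (-elasticity)


def best_price(candidates, unit_cost, base_price, base_demand, elasticity):
    """Pick the candidate price point that maximizes predicted profit."""
    def profit(price):
        units = demand(price, base_price, base_demand, elasticity)
        return (price - unit_cost) * units

    return max(candidates, key=profit)


# Hypothetical inputs: the elasticity would come from the predictive step.
price_points = [round(5.0 + 0.05 * i, 2) for i in range(301)]  # 5.00 .. 20.00
chosen = best_price(price_points, unit_cost=6.0,
                    base_price=8.0, base_demand=1000.0, elasticity=2.5)
print(chosen)
```

For this particular demand form with elasticity e > 1, profit is maximized at cost × e / (e − 1), so the search over price points can be checked against a closed form. The optimization only ever sees the elasticity number, not the raw sales data behind it.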
Let's recap. Our three examples share a common pattern that can be depicted as follows. Predictive analytics (machine learning, statistics) is used to process raw data in a wide variety of forms. As a result, data variety is abstracted into a small predictive model (camera motion, traffic, and price elasticity in our examples). That predictive model can then be used to output numerical data that optimization can use directly.
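The pattern can be sketched in a few lines on a toy version of the routing example. Everything here is hypothetical: the road network, the observed travel times, and the use of a plain average as a stand-in for a real traffic prediction model. The optimization step is a standard shortest-path computation (Dijkstra's algorithm).

```python
import heapq
from statistics import mean


def predict_travel_times(observations):
    """Predictive step: reduce raw observations to one number per road.
    A plain average stands in for a real traffic prediction model."""
    return {road: mean(times) for road, times in observations.items()}


def shortest_route(travel_time, source, target):
    """Optimization step: Dijkstra's algorithm on predicted travel times."""
    graph = {}
    for (u, v), t in travel_time.items():
        graph.setdefault(u, []).append((v, t))

    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, t in graph.get(node, []):
            cand = dist + t
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (cand, nxt))

    if target not in best:
        raise ValueError("no route from source to target")
    route = [target]
    while route[-1] != source:
        route.append(prev[route[-1]])
    return route[::-1], best[target]


# Hypothetical raw data: observed minutes per directed road segment.
observations = {
    ("A", "B"): [4, 6, 5],
    ("B", "D"): [10, 14, 12],
    ("A", "C"): [7, 9, 8],
    ("C", "D"): [3, 5, 4],
}
predicted = predict_travel_times(observations)
route, minutes = shortest_route(predicted, "A", "D")
print(route, minutes)
```

The optimization step never touches the raw observations. It only sees the small numerical summary produced by the predictive step, which is why the same solver works regardless of where the data originally came from.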
We have seen in previous posts that the exact same pattern can be used to apply optimization to large data volumes and to fast-moving data. In our next post, we'll discuss the consequences of using predicted data as input to optimization.