Price Optimization

Did you know that the price at which you buy your Coke bottle (pick any brand you'd like here) at your nearest retail store was probably set by a process that involved mathematical optimization? If not, then learn how it was probably done.
Let me first say that the idea of optimizing prices isn't really new. The airline industry rolled out techniques called revenue management in the 90s, where the number of seats offered at a given rate is adjusted according to demand. In a nutshell, if a given flight sells well, then the number of cheap seats (economy) is decreased, while the number of expensive seats (business, premium economy) is increased. Numbers are modified the other way if the flight does not sell well. Airlines also adjust prices in each fare class in addition to modifying the number of available seats in each fare class. Recently, the hotel industry has started to apply revenue management techniques borrowed from the airline industry, read for instance how Inte

The retail industry is also using optimization to set prices, although this is less well known. There is an important difference between price optimization for retail and price optimization for the transport industry. In retail, we can assume an unlimited supply as a first approximation, while in transportation the number of seats per plane, or the number of rooms per hotel, is a hard limit. The other difference is that retailers usually focus on margin rather than total revenue.

Price optimization involves two main steps (there is more to it, but we'll focus on these for the sake of clarity):
Single Product Price Elasticity

Price elasticity is the name given to the relationship between price levels and demand. In general, the cheaper a product, the more it sells. Note that this is not true for luxury goods, as a higher price may lead to higher sales in that case. We assume in the rest of this article that we are not in luxury goods retail, and that lower prices mean larger sales volume. This qualitative relationship isn't good enough: we also need to know the rate at which volume increases when price decreases. This rate is what is called price elasticity. A typical elasticity curve is displayed in figure 1. We also display the total revenue, which is the product of sales level and price. At low price levels you saturate the market, and the sales level no longer depends on the price. At very high prices you are out of the market, and the sales level drops to 0. The interesting part is between these two extremes. This is where the optimal price level is.

Figure 1. A price elasticity curve, x axis is price

Estimating price elasticity is an interesting challenge in itself. The Intercontinental Hotel paper cited above describes the way they did it for their industry. They were quite constrained because they wanted to reuse as much as possible of the existing IT systems. In other industries, other practices are used. When possible, a simple way is to offer the same good at different price points and record the sales levels. This is done for instance in the auto insurance industry. Car insurance companies run experiments. During a short period, they'll offer a lower price to a subset of their prospects, a higher price to another subset of prospects, and the current price to all other prospects. They then get 3 data points from which they can compute the average number of sales per offer, and the average revenue per offer, for each price level. Figure 2 provides an example. This example is simplified as it assumes all cars are the same.
Reality is more complex, and the variety of cars has to be taken into account.

Figure 2. Results of Insurance elasticity experiment

Even if we assume a one-product case (e.g. all cars being identical), results may be biased if another insurance company is performing a similar test in the market! Therefore, the way prospects are split into the three groups must be immune to bias introduced by other players on the market.

Single Product Price Optimization

Once we have an elasticity model, we just need to compute the price level that maximizes margin. For a single product, the cost is constant, hence margin and revenue are closely related. For the sake of simplicity we can assume we optimize revenue in that case. We assume we have a curve relating revenue to price, like the one depicted in figure 1. We then have a nonlinear optimization problem to solve to find the price level that maximizes revenue. If we know the elasticity curve through a finite set of data points, then we might just use a brute force approach. We simply evaluate revenue for each price level and select the one that generates the highest revenue. For instance, if we have three data points as in figure 2, we can see that reducing the price by 5% yields the largest average revenue. We have to be careful though, as this conclusion is drawn from a small set of data points. If we had closed one deal less for the 95 price we would get the data shown in figure 3.

Figure 3. Results of Insurance elasticity experiment

In that case, the current price level seems best. So, should we decrease the price by 5% given the data in Figure 2? I'd say no, given that a single deal could change the conclusion. This little example shows that data-driven decisions may be brittle if based on too few data points. This is well known in statistics, where one really emphasizes the risks of small data.
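The brute-force comparison and its fragility can be sketched in a few lines of Python. The numbers below are hypothetical (they are not the data behind figures 2 and 3), and the function names are mine. The interval uses a plain normal approximation for the conversion rate, which is only one of several reasonable choices.

```python
import math

def revenue_per_offer(price, offers, deals):
    """Average revenue per offer made at a given price."""
    return price * deals / offers

def revenue_interval(price, offers, deals, z=1.96):
    """Approximate 95% interval on revenue per offer (normal approximation)."""
    rate = deals / offers
    half_width = z * math.sqrt(rate * (1 - rate) / offers)
    return price * max(0.0, rate - half_width), price * min(1.0, rate + half_width)

def best_price(experiment):
    """Brute force: pick the price with the highest average revenue per offer."""
    return max(experiment, key=lambda p: revenue_per_offer(p, *experiment[p]))

# Hypothetical experiment: price -> (offers made, deals closed).
experiment = {95: (1000, 106), 100: (1000, 100), 105: (1000, 90)}
# Same experiment with a single deal less at the 95 price point.
one_deal_less = {**experiment, 95: (1000, 105)}

for name, data in [("experiment", experiment), ("one deal less", one_deal_less)]:
    print(name, "-> best price:", best_price(data))
    for price, (offers, deals) in data.items():
        low, high = revenue_interval(price, offers, deals)
        print(f"  price {price}: {revenue_per_offer(price, offers, deals):.3f} "
              f"(interval {low:.2f} to {high:.2f})")
```

With these made-up figures the intervals for the 95 and 100 price points overlap widely, which is another way of saying that the experiment does not really separate the two.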
I've seen too many data science applications where the result of an analysis is a set of numbers without any confidence interval or other measure of confidence in the results. Data quality and biases are too often neglected. It is a pity, given that statistical techniques usually come with an indicator of how well the data supports a given conclusion. Read Jon Mount's post and the ones linked from it for more on this topic.

Multi Product Price Elasticity

Let's now look at the retail industry in general, where we can no longer assume that sales levels only depend on the price of the product to be sold. Many factors influence sales levels, including competition and cannibalization. Competition implies that a product's sales level depends not only on the price at which you sell it, but also on the price at which your competitors sell it. The lower the competitor price, the lower your sales (again, we assume we're not in the luxury goods market). Cannibalization means that your customers may buy a product from you other than the one you're considering. There are two flavors of it. The first one is between substitutable products, for instance between 1.5L and 2L Coke bottles. If you lower the price of the 2L bottles, then the volume of 2L sales will increase, but the volume of 1.5L sales is likely to decrease as customers trade 1.5L for 2L bottles. The second one is competition between different product categories. Say your store runs a promotion on TV sets: TV set sales are likely to jump, but consumer spending may stay flat, i.e. consumers spend less on everything else the day they buy a TV set. It means that TV set sales go up at the expense of sales of the rest of the products globally. Besides competition and cannibalization, many other factors influence sales levels. One is seasonality: some goods sell mostly at Christmas time, for instance. Weather can also influence sales: you sell more ice cream when the weather is hot. Ad campaigns also influence sales levels.
One should also take inventory into account: if your store was out of stock then sales levels are zero, whatever the price level! There is more to it, but I guess you get the idea: the sales level for a given product P depends on lots of variables: price level for P in your store and competitor stores, price levels of products substitutable for P, macroeconomic trends (for customer spending), seasonality, ads, inventory, etc. Estimating price elasticity in that case is quite complex. It results in multivariate surfaces rather than a single-variable curve as above. Several companies claim to solve this problem effectively. Let me briefly discuss one I know quite well given it is part of IBM: DemandTec. Their team has a proven big data analytics approach to estimating price elasticity. They create elasticity models from two years of historical data. This data includes all single product sales, with the price at which each sale was made and the timing of the sale. They add to the mix weather, ad campaigns, competitor prices when available, etc. They then use multivariate regressions to estimate how customer spend is split between various products. This results in models that predict the sales level of each and every product a retailer sells. The model is basically a set of nonlinear equations that express the sales level of each product as a function of many variables, including prices of other products in the store, and prices at nearby competing stores.

Multi Product Price Optimization

The multivariate elasticity models can then be fed to a nonlinear optimization solver to find the best price levels. Note that in the retail industry, margin may not be the only component of the objective function. Some products are called image products. These are the products that consumers use to evaluate the price position of a given retailer. You want to have low prices on image products as they are used by customers to decide whether they buy from you or from the competition.
For image products we therefore want to optimize a mix of margin and being cheaper than competition. For non image products you may not need to worry that much about low prices, you can solely focus on margin.
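As an illustration of such a mixed objective, here is a minimal sketch. Everything in it is an assumption made for the example: the linear demand curve, the cost, the competitor price, and the choice of modeling "being cheaper than competition" as a penalty proportional to how far our price sits above the competitor's. Real systems may combine the two goals quite differently.

```python
# Hypothetical single-product setting (all values are illustrative).
UNIT_COST = 1.00
COMPETITOR_PRICE = 1.40

def demand(price):
    """Assumed linear demand: units sold as a function of price."""
    return max(0.0, 200.0 - 100.0 * price)

def margin(price):
    """Margin = (price - cost) * units sold."""
    return (price - UNIT_COST) * demand(price)

def image_score(price, penalty_weight=100.0):
    """Margin minus a penalty for being more expensive than the competitor."""
    return margin(price) - penalty_weight * max(0.0, price - COMPETITOR_PRICE)

# Candidate prices from 1.00 to 2.00 in steps of 0.05.
candidates = [cents / 100 for cents in range(100, 201, 5)]

best_non_image = max(candidates, key=margin)      # margin only
best_image = max(candidates, key=image_score)     # margin and price position

print("non-image product price:", best_non_image)
print("image product price:", best_image)
```

Under these assumptions the image product ends up priced at the competitor's level, while the non-image product is priced at the margin-maximizing level.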
So far we have spoken about price optimization as if it were one problem. As a matter of fact there are many variants of the problem, depending on whether it concerns everyday prices, promotions, or markdowns. Readers interested by the latter topic can look at work done for Zara on mark

I will now focus on a particular price optimization problem, called micro pricing. Here, the problem is to adjust prices in response to competitor price changes. The objective is to maximize the profit from a set of stores. The decisions to be made are the price levels of every article in these stores. Available data include current price levels and elasticity. Given that this kind of price change needs to happen frequently, we restrict price changes to a narrow band, say plus or minus 2%. Since we are only considering small price changes, we can approximate the nonlinear elasticity model by taking its gradient at the point defined by current price levels. It means that price elasticity is described by
These numbers are then used as coefficients for a quadratic objective of the form

$$\sum_{P} e(P) \cdot x_P \cdot (x_P - c_P) + \sum_{P_1,P_2} ce(P_1,P_2) \cdot x_{P_1} \cdot (x_{P_2} - c_{P_2})$$

where $x_P$ is the decision variable representing the new price for product P and $c_P$ is the current price for product P.

Constraints come from various sources.
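To make the quadratic objective above more concrete, here is a minimal sketch for two products. All numbers are hypothetical. I also add a baseline term (current sales times new price), which is my assumption so that the objective reads as total revenue; revenue stands in for profit to keep the example short. A plain grid search over the plus or minus 2% band replaces a real solver here, which is only viable for a toy instance.

```python
from itertools import product

# Hypothetical data for two products A and B.
current_price = {"A": 2.00, "B": 1.50}
current_sales = {"A": 100.0, "B": 120.0}           # assumed baseline volumes
own_elasticity = {"A": -80.0, "B": -40.0}          # e(P): units per unit of own price
cross_elasticity = {("A", "B"): 20.0, ("B", "A"): 15.0}  # ce(P1, P2)

def revenue(new_price):
    """Linearized sales around current prices, multiplied by the new prices."""
    total = 0.0
    for p, x in new_price.items():
        sales = current_sales[p] + own_elasticity[p] * (x - current_price[p])
        for (p1, p2), ce in cross_elasticity.items():
            if p1 == p:
                sales += ce * (new_price[p2] - current_price[p2])
        total += x * sales
    return total

def optimize(band=0.02, steps=20):
    """Grid search over prices within +/- band of the current prices."""
    names = list(current_price)
    grids = [
        [current_price[n] * (1 + band * k / steps) for k in range(-steps, steps + 1)]
        for n in names
    ]
    best = max(product(*grids), key=lambda xs: revenue(dict(zip(names, xs))))
    return dict(zip(names, best))

optimized = optimize()
print("new prices:", optimized)
print("revenue before:", revenue(current_price), "after:", revenue(optimized))
```

With these made-up coefficients the search lowers the price of the more elastic product A and raises the price of B, both to the edge of the allowed band. The grid search also does not care whether the objective is convex, at the cost of scaling very poorly with the number of products.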
Taking everything into account defines a quadratic optimization problem that is amenable to a solver like CPLEX. Note that this problem may be non-convex. Another problem we have been working on is to evaluate the robustness of the optimized price levels with respect to price elasticity. Indeed, price elasticity is a predicted quantity, hence it is uncertain. Depending on how we model this uncertainty, we can use robust optimization or stochastic optimization techniques here.

Conclusion

Price optimization is used much more pervasively than one might think. Current techniques are quite sophisticated, and I have only scratched the surface of the topic in this post. However, I hope I did convey the close interplay between predictive analytics (elasticity estimation) and optimization. This interplay is not specific to price optimization. Indeed, we are seeing more and more applications requiring this combination. I'll blog about some more in the near future.
