Correlating sales data with weather and traffic data can help retailers predict patterns of demand, enabling them to outsmart their rivals—but the sheer volume of data makes this a major challenge.
Brightlight rapidly deployed an IBM Analytics solution that brings the power of Hadoop to a sports retailer. The highly scalable solution simplifies and accelerates complex queries on huge data sets.
Predicts demand patterns, helping to optimize inventory and promotions
6 times faster data loading and ultra-fast, index-free queries accelerate insight
Scalable architecture can be extended easily as data sets grow
Business challenge story
Out-thinking the other team
Retailers’ POS systems gather vast amounts of data that can reveal seasonal sales trends by brand, product line, location and more. Unlocking the insights in this data can be hugely valuable. For example, knowing that there is typically a sales peak for a certain product, in a certain set of locations and at a certain time of the year, helps managers ensure that enough stock is in stores to meet demand and avoid missed sales.
In a fast-paced sector where trends change rapidly, turning mountains of POS data into actionable insight fast enough to drive smarter decision-making is not easy. Facing this challenge, a major sporting goods retailer in the U.S. called in expert consultants from the Brightlight Business Analytics division of Sirius Computer Solutions, an IBM Premier Business Partner, to help turn its data into insight.
To add to the challenge, the retailer wanted to go beyond sales data to encompass external data in the form of weather reports, traffic and road-closure data, and various other kinds of geo-tagged data. The goal was to cross-reference POS data—broken down by location and time—with these external data sets to gain a deeper understanding of how external conditions impact sales.
As a simple example, a hot summer may increase sales of lightweight running gear, while a wet fall could drive the sale of waterproof cycling apparel. Equally, a major construction project on a particular city block might cause traffic problems, reducing footfall in a nearby store and ultimately impacting its sales.
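The cross-referencing described above can be pictured as a join on location and date. The following is a minimal sketch in Python, not the retailer's actual pipeline: the store IDs, product lines, weather labels and figures are all invented for illustration, and the function name `units_by_condition` is hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical POS rows: (store_id, sale_date, product_line, units_sold).
pos_rows = [
    ("store_01", date(2015, 7, 1), "running", 40),
    ("store_01", date(2015, 7, 2), "running", 12),
    ("store_02", date(2015, 7, 1), "cycling_rainwear", 5),
    ("store_02", date(2015, 7, 2), "cycling_rainwear", 30),
]

# Hypothetical external data: weather condition keyed by (store_id, date).
weather = {
    ("store_01", date(2015, 7, 1)): "hot",
    ("store_01", date(2015, 7, 2)): "mild",
    ("store_02", date(2015, 7, 1)): "mild",
    ("store_02", date(2015, 7, 2)): "wet",
}


def units_by_condition(pos_rows, weather):
    """Total units sold per (product_line, weather condition)."""
    totals = defaultdict(int)
    for store_id, sale_date, product_line, units in pos_rows:
        # Join each sale to the weather at the same store on the same day.
        condition = weather.get((store_id, sale_date))
        if condition is None:
            continue  # no weather record for this store and day
        totals[(product_line, condition)] += units
    return dict(totals)


result = units_by_condition(pos_rows, weather)
for (product_line, condition), units in sorted(result.items()):
    print(product_line, condition, units)
```

At the scale described in this story the same join would run inside the analytics platform rather than in application code, but the shape of the question is the same: group sales by the external condition that applied at that place and time.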
David Birmingham, Principal Solutions Architect at Brightlight, comments: “Our client realized that if they could understand how these external factors affected their historical sales, they could improve their ability to predict future sales. If you know that a particular product line is going to be popular in a given set of weather conditions, you can ensure you have it in stock in good time—and ahead of the competition.”
Making rocket science simple
Building on its client’s existing data analytics landscape, which is based on IBM PureData System for Analytics, Brightlight deployed IBM BigInsights to provide a Hadoop environment for storing both historical POS transactions and external data on weather and geo-coding. In the future, the client will also add data on road closures and construction projects to Hadoop.
Birmingham comments: “Our client sees significant value in analyzing historical sales precisely because sports apparel is quite a predictable business. For example, each new season’s NFL team uniforms will appear at the same time each year, and major events like the Super Bowl also happen at known times. This means that you can usefully compare sales data from the same stores over the years, and it also implies that there is strong first-mover advantage in the market.
“In short, a retailer that can get the right product into its stores or launch the right promotions at the right time—based on an analysis of past performance and current conditions—can beat its competitors in securing a customer’s one-off annual purchase of a team shirt. Even being a day or two ahead of the competition can make a huge difference.”
The solution deployed by Brightlight takes advantage of IBM Fluid Query to provide integration between the PureData and the BigInsights Hadoop environments. “Fluid Query enables us to bring over parallel streams of data from Hadoop via simple SQL queries, and setting it up was really easy,” says Birmingham. “We had experience using another vendor’s technology and it took weeks. With BigInsights, the process was smooth as silk and took only an hour or so.”
He continues: “Getting data in and out of Hadoop can feel like rocket science, because you need to know a lot of parameters about your data before you can perform a Hadoop query. Fluid Query enables transparent communication between databases, so you can set up a standard query on PureData and have it run seamlessly on your Hadoop landscape.
“This is great news for companies that want to work with data sets that grow rapidly over time, such as weather or stock price data. With IBM’s approach, you can not only keep data in a massively scalable Hadoop environment, but also make it fast and easy to query.”
Winning the game with rapid insight
Gaining insights from data is only half the battle; in an increasingly fast-paced world, retailers need to deliver those insights to decision-makers as rapidly as possible. A key benefit of the new analytics environment is that it cuts data loading time from two hours to just 20 minutes, a six-fold improvement in performance.
Says Birmingham, “The solution avoids high latency in extracting, transforming and loading data, so analysts can get their hands on data faster and deliver near real-time insight. It also dramatically simplifies dealing with Hadoop, and that’s a major benefit with large-scale systems. Whenever you can simplify, you make it easier to ensure scalability and stability, so that the environment can grow in line with requirements and always be there to support the business.”
Today, the largest table in the retailer’s BigInsights Hadoop store has some 68 billion rows, yet most queries still run in less than a second.
“Combining PureData and BigInsights is a great way to analyze large volumes of data rapidly,” comments Birmingham. “Both technologies feature parallelization of query workload, and that’s a good match for the pseudo-parallelization of Hadoop. Because PureData does not use indexes, its query performance advantages over traditional relational databases grow as data volumes increase.
“It also provides a ‘zone map’ which keeps track of the physical location of new blocks of data as they arrive. When you run a query, the solution ignores any locations where it knows the data will not be found. We know of a 150-billion-row table over 25 TB in size that returns 95 percent of queries in less than five seconds. For a traditional relational database, it would take longer than that just to handle the index scans.”
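The zone-map idea Birmingham describes can be illustrated with a small sketch. It assumes a simplified model in which each block of rows records the minimum and maximum value of one column, so a range query can skip any block whose range cannot contain a match. This is an illustration of the general technique, not PureData's implementation; the `Block` class, the function names and the data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    rows: list    # (sale_day, amount) tuples stored together
    min_day: int  # smallest sale_day in this block
    max_day: int  # largest sale_day in this block


def build_blocks(rows, block_size):
    """Split rows into fixed-size blocks and record each block's day range."""
    blocks = []
    for start in range(0, len(rows), block_size):
        chunk = rows[start:start + block_size]
        days = [day for day, _ in chunk]
        blocks.append(Block(chunk, min(days), max(days)))
    return blocks


def query_range(blocks, lo, hi):
    """Return rows with lo <= sale_day <= hi and the number of blocks read."""
    matches, scanned = [], 0
    for block in blocks:
        # Skip blocks whose recorded range cannot overlap the query range.
        if block.max_day < lo or block.min_day > hi:
            continue
        scanned += 1
        matches.extend(row for row in block.rows if lo <= row[0] <= hi)
    return matches, scanned


# Invented data: 100 rows arriving in day order, stored 10 rows per block.
rows = [(day, day * 10) for day in range(1, 101)]
blocks = build_blocks(rows, block_size=10)
matches, scanned = query_range(blocks, 42, 47)
print(len(blocks), scanned, len(matches))
```

Because the rows arrive in day order, each block covers a narrow range and most blocks are skipped without being read. No index has to be built or maintained; the only bookkeeping is the per-block range recorded as data is loaded.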
Armed with the ability to perform high-speed queries on massive data sets, the sports retailer is now able to see new opportunities to out-maneuver rivals. As it builds up its store of historical data not only on sales but also on external factors such as weather and traffic, the retailer will improve its ability to predict the best times and places to launch new products, open new stores or run sales events.
Birmingham concludes: “Combining technologies from the IBM Analytics portfolio with our internal expertise and experience in deploying business intelligence solutions has enabled us to transform our clients’ capabilities.”
About Sirius Brightlight Business Analytics
Brightlight Business Analytics, a division of Sirius Computer Solutions, provides technology-agnostic insight and end-to-end solutions for business intelligence and big data analytics. Brightlight helps its clients make better strategic decisions and maintain their competitive advantage.
Take the next step
IBM Analytics offers one of the world's deepest and broadest portfolios of analytics platforms and domain and industry solutions, delivering new value to businesses, governments and individuals. For more information about how IBM Analytics helps to transform industries and professions with data, visit ibm.com/analytics/us/en/. Follow us on Twitter at @IBMAnalytics, read our blog at ibmbigdatahub.com and join the conversation at #IBMAnalytics.