Business challenge

To attract and retain loyal customers, this retailer needs rapid insight into sales and inventory data across thousands of stores—but its existing data architecture was struggling to keep up with demand.


The retailer is using a big data governance and integration solution from IBM to deliver high-quality data to its analytics environment faster than ever—ensuring a seamless flow of insight to the business.



insight helps decision-makers predict and respond to changing customer needs

24 hours

to perform customer affinity analyses that took up to 20 days in the past


governance boosts confidence in data and facilitates change management

Business challenge story

Dealing with explosive growth in data

One of IBM’s clients was experiencing continuous, explosive growth in its retail division, with new customers, new product lines and new stores opening all the time. Business was booming—and so were data volumes, which created unprecedented challenges around analytics and reporting.

Like all retailers, the company’s goals are straightforward: making sure that its stores are always stocked with the right products for the right customer base, and that it can make the right offers at the right time. However, when operating on a global scale with more than 1,000 stores, achieving these goals becomes very challenging.

The sheer volume of data generated on a daily basis meant that the company was struggling to gain insight fast enough or frequently enough to make the best operational decisions. Some tasks, such as customer affinity analysis, were taking 14 to 20 days to complete.

The retailer also wanted to start analyzing product data at the size, color and style level, and correlate it with customer behavior and purchase patterns to make more accurate predictions around stock levels and product sales. Getting down to that level of granularity with its existing data warehouse architecture was almost impossible.

Part of the problem was the difficulty of building and maintaining efficient extract, transform and load (ETL) processes to get the data from the company’s source systems into its data warehouse. Relying on hundreds of individually maintained ETL scripts also raised issues around data quality, data governance and data lineage—particularly when the business needed to make changes to its underlying systems.

For example, modifying a single field in a single database could have a knock-on effect on the entire analytics landscape. And although the IT team always ran impact analyses to make sure that such modifications would not adversely affect application performance or report accuracy, it could take up to three weeks to complete these assessments.

It was clear that the retailer needed to find a better approach to keep up with the demands of its growing business.

By using IBM BigIntegrate for Hadoop, we can run data processing tasks that previously took up to 20 days in just 24 hours.

Spokesperson, leading European retailer

Transformation story

Ultra-rapid, efficient data analysis

The retailer decided to build an entirely new analytics architecture that would help solve its big data problems. The idea was to build a modular solution, composed of “blocks” that would enable it to collect large volumes of data easily, process and analyze streams of data in near real time, and provide the flexibility to use the best tool for each analytics task.

Hadoop was the obvious option, because it is a general-purpose big data architecture that would enable the company to use different engines for different jobs. For example, for batch processing on huge volumes of structured data, Hadoop offers tools like Hive, while for real-time stream processing, it has tools like Spark Streaming.

The company believed that Hadoop’s design meant that it would be able to scale much more cost-effectively than its existing data warehouse appliance. The aim was to use Hadoop initially to augment the existing infrastructure, and eventually to replace it altogether.

However, Hadoop itself was only part of the answer: the company also needed to find a way to integrate the new platform into its complex existing ETL architecture, and to strengthen its capabilities around data quality, governance and lineage.

The retailer’s IT team looked at a number of information integration and governance tools, and ran extensive tests to see how they coped with its requirements. Several vendors proposed good technical solutions, but the IT team were more convinced by IBM’s ability to offer a comprehensive combination of technology, support, training, and consultancy.

IBM proposed a big data integration solution based on IBM BigIntegrate and IBM BigQuality—providing a scalable platform for transforming and integrating the company’s data into the new Hadoop environment, as well as a rich set of data quality, profiling, cleansing and monitoring capabilities.

IBM also proposed adopting IBM InfoSphere® Information Governance Catalog, which would ultimately help the company leverage the solution’s data lineage capabilities to build a single, trusted catalog of all its data assets across the entire business.

“We were the first company worldwide to choose this combination of IBM tools for Hadoop,” states a company spokesperson. “The IBM team worked closely with our business intelligence team to build a viable business case and deliver top-notch training, which gave us great confidence and peace of mind.”

A team from IBM Analytics Services helped the client deploy IBM InfoSphere BigIntegrate and BigQuality, including components such as Information Governance Dashboard with IBM Cognos Analytics, and the Data Stewardship Center with IBM Business Process Manager. The IBM team even flew in a senior member of the BigIntegrate development team from the US to assist during the first months of implementation, and to provide advice and training during the later stages of the project.

Results story

Putting swift insight in the hands of decision-makers

The IBM platform’s comprehensive data quality, data lineage, and data governance capabilities help the retailer trace its data from its source systems all the way through to its final reports. The tools allow the company to specify sophisticated sets of business rules and filters that control and shape how data moves through its new analytics environment, as well as monitoring exceptions, and notifying administrators of any problems. This provides a solid foundation for data governance, as well as the capacity to trace all data back to its source.

Today, the company can quickly and efficiently manage and integrate immense volumes of data related to millions of items across thousands of stores. Once the data is in its Hadoop cluster, the company can select the right engines to process it quickly and efficiently, unlocking precious insights into customer purchasing patterns, inventory levels, and product sales.

“BigIntegrate contributes to improved ETL performance, which in turn contributes to the overall speed of analytics,” comments a member of the company’s project team. “We can get data into Hadoop in much greater volumes than our existing data warehouse could handle, and we can process it much faster. Customer affinity analysis that previously took up to 20 days now runs in just 24 hours, and daily stock position calculations that used to take a full day are now ready in four hours.”

As a result, the company can obtain information faster, more frequently, and at a more granular level—which gives executives the insight they need to take more timely and effective action.

Calculating the affinity between customers and products helps the retailer design more relevant offers and coupons, enticing customers to come back for more after a successful purchase. Similarly, with intra-day insight into inventory positions, it is easier to keep the right products in stock at each store to meet customer demand.
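To make the idea of customer-product affinity concrete, here is a minimal sketch in Python. It is not the retailer's method or anything taken from the IBM tooling: it assumes affinity is measured as a simple lift score (how much more often a customer buys a product than the average customer does), and the function name, field names and sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def customer_product_affinity(purchases):
    """Return {customer: {product: lift}} for (customer_id, product_id) pairs.

    Assumption: lift = P(product | customer) / P(product).
    A value above 1.0 means the customer buys the product more often
    than the customer base as a whole.
    """
    purchases = list(purchases)
    total = len(purchases)
    product_counts = Counter(product for _, product in purchases)

    # Count purchases per customer and product.
    per_customer = defaultdict(Counter)
    for customer, product in purchases:
        per_customer[customer][product] += 1

    affinity = {}
    for customer, counts in per_customer.items():
        n = sum(counts.values())
        affinity[customer] = {
            product: (k / n) / (product_counts[product] / total)
            for product, k in counts.items()
        }
    return affinity


# Hypothetical sample data: six purchases by two customers.
sample = [
    ("alice", "jeans"), ("alice", "jeans"), ("alice", "scarf"),
    ("bob", "scarf"), ("bob", "scarf"), ("bob", "boots"),
]
scores = customer_product_affinity(sample)
```

In this toy data set, a product with a score above 1.0 for a given customer would be a candidate for a targeted offer or coupon. At the scale described here (millions of items across thousands of stores), the same calculation would run as a distributed job rather than in memory.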

For the IT team, the ability to build and update ETL jobs in BigIntegrate, rather than writing and maintaining dozens of complex PL/SQL scripts, is a significant advantage. Over the long term, the greater transparency and powerful data lineage capabilities of BigIntegrate are expected to deliver considerable time-savings.

Above all, when the team makes a change to one of its source systems, it can now assess the impact on all of its ETL jobs and analytics applications in minutes. This will save huge amounts of time on impact analyses, and enable greater flexibility and agility in introducing new features or integrating new systems and data.
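The kind of impact analysis described above can be pictured as a walk over a lineage graph. The sketch below is an illustration only, not the way IBM InfoSphere Information Governance Catalog works internally. It assumes lineage is held as a mapping from each asset to the assets that read from it, and every asset name in it is hypothetical.

```python
from collections import deque


def downstream_impact(lineage, changed):
    """Return every asset reachable downstream of `changed`.

    `lineage` maps an asset name to the assets that consume it
    (an assumed, simplified representation of data lineage).
    """
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for consumer in lineage.get(node, ()):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return seen


# Hypothetical lineage: source field -> ETL job -> warehouse table -> reports.
lineage = {
    "pos.sales.item_color": ["etl.load_sales"],
    "etl.load_sales": ["dw.sales_fact"],
    "dw.sales_fact": ["report.affinity", "report.stock_position"],
    "crm.customer.email": ["etl.load_customers"],
}
impacted = downstream_impact(lineage, "pos.sales.item_color")
```

With the lineage recorded in one place, answering "what breaks if this field changes?" becomes a graph traversal instead of a manual review of hundreds of scripts, which is consistent with the drop from weeks to minutes reported here.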

Thanks to the IBM technology, business insight is now readily available at the touch of a button, helping decision-makers supercharge their sales and marketing strategies. Ultimately, this means that the company can keep the right products on its shelves at all its stores, and offer them at the right prices to keep customers happy and loyal to the brand.

Leading European Retailer

Headquartered in Europe, this IBM client is a multinational corporation that manages a diversified portfolio of businesses in retail, financial services and other sectors. The company employs approximately 40,000 people, and operates in 60 countries globally.

Take the next step

IBM offers a comprehensive, scalable Unified Governance and Integration platform and solutions (available on premises, on cloud and in hybrid environments) that deliver trusted data for insights and compliance to businesses, governments and individuals. Follow us on Twitter at @IBMAnalytics and join the conversation at #IBMUGI.
