To enhance decision-making for core processes such as resource allocation and project management, Deloitte Canada needs timely, precise insight into staff deployment and operational performance.
During a move to a new ERP platform, Deloitte Canada seized the opportunity to build an enterprise data hub, which unites information from many systems in a single data lake for fast, easy analysis.
Enables new types of cross-functional analysis by uniting structured and unstructured data sources
8x faster speed-to-insight, with average response times of less than 100 ms for data lake queries
Reduces analytics development costs by providing easy access to a comprehensive catalog of data
Business challenge story
Steering a booming business
Deloitte Canada is one of the country’s largest professional services organizations, employing approximately 10,000 people. To keep its extensive business running smoothly, the company must ensure that its highly skilled practitioners are deployed as effectively as possible on client projects. To make optimal decisions regarding resource allocation, managers require a clear view of how staff are deployed, how the business is performing financially, and how well its operations are aligned to clients’ needs.
Deloitte wanted to provide timely, detailed insight to guide decision-making, but the company’s existing information management systems had not been designed to handle the enormous volume and variety of data that its day-to-day operations now generate. In particular, the company knew that it would need to be able to leverage both structured and unstructured sources of data in order to meet the demands of today’s business environment.
Raj Ramani, Director of Information Management at Deloitte Canada, explains: “Previously, we relied on an enterprise data warehouse for business intelligence. The system just wasn’t built to address today’s challenges around transforming fast-moving and varied data into useful insight in real time.
“When the business decided to move to a new ERP platform, we knew that the moment for change had come. The ERP implementation required us to rewrite most of the interfaces to our data warehouse anyway, so it was a perfect time to make the leap to a new analytics platform.”
Adopting a new approach
Deloitte Canada’s Information Management team began gathering requirements for the new analytics solution, and presented its findings to several leading vendors. Raj Ramani recalls: “We evaluated each vendor’s proposal against the requirements posed by our diverse user groups. We shortlisted IBM and one other vendor, and decided to run detailed proofs-of-concept [PoCs] for both solutions. We decided to proceed with IBM after the PoC revealed that their proposal could outperform the other vendor’s offering in several key areas.
“The implementation was complex and had many moving parts, because we weren’t just migrating to a new analytics solution: we also needed to integrate with the new ERP platform. The IBM solutions performed well during test cycles, in particular during performance testing; and when we went live, it was clear that we had made the right choice.”
Deloitte Canada has created a data lake built on Hadoop, which is fed by structured and unstructured data ingested from numerous source systems at remarkable speed. Most data is ingested in a few hours or via real-time replication—a significant step forward compared to the weekly batch jobs used to update its older data warehouse.
The solution acts as an enterprise data hub—a comprehensive enterprise repository for all kinds of business data, which both acts as an analytics platform in its own right, and feeds data into the data warehouse for traditional reporting.
The enterprise data hub approach enables business users to incorporate data from a much broader range of sources into their analyses, giving them a new level of insight to help answer questions more quickly and accurately.
“The Hadoop architecture made sense to us because of the scale at which we operate and the variety of data sources that we manage,” explains Raj Ramani. “The amount of data we want to ingest is growing, and we wanted to stay ahead of that trend.
“At the same time, effective governance is vital to prevent the data lake from becoming a swamp of unfindable, unverifiable information. Before we decide to pull data from a new source into the hub, we carefully analyze it with our architecture and delivery teams. Once we’re ready to integrate a new source, we don’t just select a few attributes to import, as you would with a traditional data warehouse; instead, the data lake gives us the capacity to ingest everything.
“The advantage is that as business requirements evolve, we don’t have to go back and build new extract, transform and load [ETL] scripts every time someone requests a new piece of information. We’ve already got all the raw data at our disposal, just waiting to be turned into insight. That’s going to dramatically lower the cost and accelerate the delivery of new analytics and reporting projects in future.”
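The contrast between selecting attributes up front and ingesting everything can be illustrated with a minimal Python sketch. It is only an illustration of the general idea, not Deloitte Canada's implementation: the record fields, sample values and function names are all hypothetical.

```python
import json

# Hypothetical raw records as they might arrive from a source system.
RAW_RECORDS = [
    '{"employee": "A", "project": "P1", "hours": 6.5, "office": "Toronto"}',
    '{"employee": "B", "project": "P1", "hours": 8.0, "office": "Montreal"}',
    '{"employee": "A", "project": "P2", "hours": 1.5, "office": "Toronto"}',
]


def warehouse_load(raw_records, attributes):
    """Warehouse-style ETL: keep only the attributes chosen up front."""
    return [{k: json.loads(r)[k] for k in attributes} for r in raw_records]


def lake_ingest(raw_records):
    """Lake-style ingest: store every record unchanged."""
    return list(raw_records)


def lake_query(lake, attributes):
    """Pick the attributes at query time instead of at load time."""
    return [{k: json.loads(r)[k] for k in attributes} for r in lake]


warehouse = warehouse_load(RAW_RECORDS, ["employee", "hours"])
lake = lake_ingest(RAW_RECORDS)

# A new requirement arrives: total hours by office.
# The warehouse rows no longer carry "office", so the load job would need
# to be rewritten; the lake still holds the raw data and can answer directly.
hours_by_office = {}
for row in lake_query(lake, ["office", "hours"]):
    hours_by_office[row["office"]] = hours_by_office.get(row["office"], 0) + row["hours"]
```

In this sketch the new question is answered from data already in the lake, whereas the warehouse-style load would first need a revised extract to bring in the missing attribute.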
For data integration between source systems, the data lake and the data warehouse, Deloitte Canada relies on IBM InfoSphere® DataStage®. Raj Ramani adds: “IBM InfoSphere DataStage continues to cope well with the massive volumes of data we transfer internally, as well as help us manage the quality and consistency of data before it enters the data lake. Its support for parallel processing is the key to optimizing ETL performance—for example, we have some complex jobs that run on 37 parallel threads, and usually complete in two minutes.”
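The idea of partitioned, parallel ETL can be sketched in a few lines of Python. This is not DataStage code and does not use any IBM API: the cleaning rule, function names and sample rows are hypothetical, and Python threads are used only to show the partition-and-merge pattern rather than to demonstrate real throughput gains.

```python
from concurrent.futures import ThreadPoolExecutor


def clean(record):
    """Hypothetical quality rule: trim and normalize the name field."""
    return {"id": record["id"], "name": record["name"].strip().title()}


def transform_partition(partition):
    """Apply the cleaning rule to every record in one partition."""
    return [clean(r) for r in partition]


def partition_rows(rows, n):
    """Split rows into n round-robin partitions."""
    return [rows[i::n] for i in range(n)]


def parallel_transform(rows, workers=4):
    """Transform each partition on its own worker, then merge the results."""
    partitions = partition_rows(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(transform_partition, partitions))
    merged = [r for part in results for r in part]
    return sorted(merged, key=lambda r: r["id"])


rows = [{"id": i, "name": f"  user {i} "} for i in range(10)]
cleaned = parallel_transform(rows)
```

Each partition is processed independently, which is what allows the work to be spread across many workers; the `workers` parameter here stands in for the degree of parallelism configured for a job.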
Deloitte Canada uses IBM Db2® Big SQL to run high-performance queries against structured data-sets in its Hadoop cluster, and IBM Cognos® Analytics provides some of its front-end reporting capabilities. For disaster recovery purposes, Deloitte Canada also uses IBM BigReplicate to replicate its production Hadoop cluster to a Hadoop cluster at its secondary data center.
Raj Ramani remarks: “The Big SQL engine was the most mature of the SQL-over-Hadoop platforms we looked at, and that’s primarily because Big SQL is built on tried-and-tested IBM Db2 technology. Hadoop has huge potential, but the technology is in pretty early stages from a broad enterprise adoption and support perspective. Using Big SQL as our core engine gave us confidence that we’d be able to succeed with a Hadoop data lake as an enterprise platform.”
He adds: “In practice, the performance is great. Our whole landscape has changed so much that it’s hard to do an apples-to-apples comparison with our old data warehouse, but the bottom line is that data volumes have increased, yet performance has improved. You only get that kind of result when you have a mature SQL engine like Big SQL.”
Richer data for decision-making
Today, business users at Deloitte Canada can make decisions based on data that is moving faster through the company’s ecosystem, giving them a more current operational picture. What’s more, the company is establishing new ways to make its rich enterprise data more readily available to users across all functions.
“We have built a web service on top of our data lake, enabling people to query various data-sets via the Big SQL engine,” says Raj Ramani. “The other IT teams are amazed by how quickly the service runs, considering how much data it pulls. On average, queries run within 100 milliseconds—eight times faster than the performance goal we set at the beginning of the project. We have some queries that scan multi-million-row tables and still return results in 200 milliseconds. It just goes to show what good technology combined with clever engineering can achieve.”
The new platform makes it possible to deliver new analytics applications extremely rapidly, to meet even the most immediate, short-term business requirements. For example, when the launch of the company’s new ERP platform required the company’s employees to learn a new set of allocation codes to track the time they spent on project work, the Deloitte Canada team was quickly able to launch a web service that enabled its employees to look up the new codes with just a few clicks.
Raj Ramani comments: “The client code lookup app was a small, simple project, but it might have taken us weeks to deliver it with our old architecture—in fact, it might not have been worth doing at all. The enterprise data hub cuts out so much of the time and effort we used to spend simply getting the data into our analytics tools, and helps us be much more agile in solution delivery.”
Next, Deloitte Canada plans to migrate its Hadoop environment to Hortonworks Data Platform, and to augment its enterprise data hub with new capabilities for data cataloging, discovery and preparation. This should make the data lake easier to govern, and pave the way towards true self-service data science. In addition, the company plans to take advantage of other IBM InfoSphere solutions to manage data quality in its business intelligence environment.
Raj Ramani concludes: “IBM Analytics solutions have helped us raise the bar for information management. We are excited to see where the future will take us.”
About Deloitte Canada
Deloitte LLP, one of Canada’s leading professional services firms, provides audit, tax, consulting and financial advisory services to a wide range of Canadian and international clients. Deloitte LLP is the Canadian member firm of Deloitte Touche Tohmatsu Limited—a network of member firms, each of which is a legally separate and independent entity. Headquartered in Toronto, Deloitte Canada employs nearly 10,000 people and operates 58 locations nationwide.
Take the next step
IBM Analytics offers one of the world's deepest and broadest portfolios of analytics platform, domain and industry solutions, delivering new value to businesses, governments and individuals. For more information about how IBM Analytics helps to transform industries and professions with data, visit ibm.com/analytics. Follow us on Twitter at @IBMAnalytics and on our blog at ibmbigdatahub.com, and join the conversation at #IBMAnalytics.