
Approaches and Considerations for Becoming a Data-Driven Enterprise


Working with data has a long history in IT and has seen a constant stream of innovation and change over the past decades. In this third article of the “Data-Driven Enterprise” series, we take a closer look at organizational and methodical approaches for implementing new data architectures and capabilities, and at how they relate to the goals, requirements, and challenges described in Parts 1 and 2.

Enterprise IT functions have continuously undergone drastic changes in the past. The organizational approaches to working with data are no exception. Central IT teams or centers of excellence were home to database administrators, data analysts, data engineers, and many other related roles. Alongside these roles, technologies such as databases, enterprise data warehouses, and data lakes have been implemented and managed centrally.

Requests for new data for analytics, machine learning, or other changes often took weeks or months to deliver. Centralized approaches were also often applied to incubate new technologies and roles, but they failed to scale initial successes across the enterprise and across multiple business units and domains.

Where central teams and IT functions could not serve line-of-business requirements quickly enough, shadow IT developed. Mostly invisible to central policies and constraints, these decentralized efforts often succeeded with quick wins but struggled with governance, compliance, and continuous operations. Nevertheless, the need to understand data in context, based on domain knowledge, has more recently caused a shift to decentralized approaches for data projects across many enterprises. So, is a decentralized approach the best cultural and organizational approach for a data-driven enterprise? And what culture is necessary to succeed as a data-driven enterprise anyway?

Culture of Data-Driven Enterprises

“You understand it, you own it.”

To explain a major shift in data culture, let’s look at software development as an analogy. Before the rise of DevOps, software and applications were developed as monoliths by several deeply specialized roles covering database development, backend service development, frontend development, testing, application deployment, and operations. Long release cycles and complex dependencies were the consequence.

With the introduction of DevOps/DevSecOps, responsibility across major parts of the end-to-end lifecycle of applications was shifted back to the product owners and corresponding development teams. “You build it, you run it” became the mantra describing the new mix of central and decentral responsibilities. Cloud technologies, APIs, and platform services were a necessary prerequisite. Central IT teams did not completely lose their relevance, however: they still influence or set the IT strategy, establish technology frameworks and platforms, and provide the end-to-end governance, observability, and monitoring capabilities required for enterprise compliance and security.

Over the last few years, we have witnessed the same changes being introduced in data management, analytics, and machine learning. Previously, organizations relied on monolithic data architectures and a strong division of labor with a highly specialized workforce. Modern approaches like DataOps, MLOps, and data products, however, stress the importance of embracing more horizontal responsibilities and decentralized application ownership, and therefore also decentralized data ownership.

This means that individual teams are responsible for building and maintaining their own data products, in contrast to centralized approaches where a lack of domain-centricity can result in a loss of knowledge and insight.

“You understand it, you own it” could be the new mantra for data-driven enterprises. The individual business unit that produces or maintains relevant data within its business processes exposes that data as a product and owns its lifecycle, not just for use within the unit but also for others to consume.

Intrinsic Motivation to Share vs. Incentivization

The goal of sharing data across business units or even enterprise boundaries creates a motivational challenge for many enterprises. Why should one invest in maintaining or extending a data asset if mostly others benefit from the effort?

Ownership typically starts with the business unit that generates or processes data from within their business domain, but for data-driven enterprises it is essential to motivate business units to share their data products. A data marketplace can help facilitate this by allowing domain data product owners to publish and offer their data products, while consumers can pull data products into their services or aggregated data products.
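
To make the marketplace idea concrete, the following minimal sketch (in Python) models how domain owners might publish data products and how consumers discover them. The class and method names (DataProduct, Marketplace, publish, search) and the example values are illustrative assumptions, not a specific product’s API.

# Minimal sketch of a data marketplace interface; all names are illustrative
# assumptions, not a specific vendor API.
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    name: str                 # e.g. "customer-churn-scores"
    owner_domain: str         # business unit that owns the product
    description: str
    endpoint: str             # where consumers pull the data from
    tags: list = field(default_factory=list)

class Marketplace:
    def __init__(self):
        self._catalog = {}

    def publish(self, product: DataProduct):
        """Domain data product owners register their products here."""
        self._catalog[product.name] = product

    def search(self, tag: str):
        """Consumers discover products by tag and pull them into their own services."""
        return [p for p in self._catalog.values() if tag in p.tags]

# Usage: the sales domain publishes a product, an analytics team discovers it.
mp = Marketplace()
mp.publish(DataProduct("customer-churn-scores", "sales",
                       "Monthly churn propensity per customer",
                       "s3://sales/churn/", ["churn", "customer"]))
print([p.name for p in mp.search("customer")])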

Motivating data owners to share can be based on either intrinsic or extrinsic motivational factors. Strong intrinsic motivational factors in this context are competence and purpose. In one case, a company started to report regularly on the quality of data products (based on defined criteria and measurements). Teams that owned and produced the data products with the highest quality were recognized in a regular report. This demonstration of competence, both in their business domain and in creating high-quality data products, was sufficient to motivate others to invest in their data assets.
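
As an illustration of how such quality reporting could be measured, the following minimal sketch computes a simple quality score from completeness and freshness. The specific criteria, weights, and thresholds are assumptions for the example, not the criteria used by the company mentioned above.

# Minimal sketch of a data quality score based on completeness and freshness;
# metric choice, weights, and thresholds are illustrative assumptions.
from datetime import datetime, timedelta, timezone

def quality_score(records, last_updated, required_fields, max_age_days=7):
    # Completeness: share of records with all required fields populated.
    complete = sum(all(r.get(f) not in (None, "") for f in required_fields)
                   for r in records)
    completeness = complete / len(records) if records else 0.0
    # Freshness: 1.0 if updated within max_age_days, decaying linearly afterwards.
    age_days = (datetime.now(timezone.utc) - last_updated).days
    freshness = max(0.0, 1.0 - max(0, age_days - max_age_days) / max_age_days)
    return round(0.7 * completeness + 0.3 * freshness, 2)

# Example: two records, one incomplete, last updated ten days ago.
records = [{"id": 1, "email": "a@x.com"}, {"id": 2, "email": None}]
updated = datetime.now(timezone.utc) - timedelta(days=10)
print(quality_score(records, updated, ["id", "email"]))  # completeness 0.5, reduced freshness

A report could then rank data products by score and credit the owning teams, which is the kind of recognition described above.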

Purpose is another strong motivator and companies reporting on the influence of specific data products on their business outcomes or sustainability footprint saw an increase in investment in data assets. 

But intrinsic motivation will not necessarily be an easy solution for many enterprises today. In particular, publicly listed companies with strong external forces focusing on business growth, revenue, and profit often require extrinsic motivation. Sharing data across business unit boundaries then needs to be compensated via cross-charging models or similar approaches. Tracking the usage of data and either defining an internal cost for using data or assessing the positive impact on revenue or profit is a requirement. Data platforms need to provide the means to support such cross-charging models, but also data contracts that specify the terms under which data is provided or consumed. Data usage can then be used to negotiate a form of compensation for the data product provider, e.g., a monetary reward, virtual credits, or additional budget for the business unit.
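
The following minimal sketch illustrates how usage tracking against a data contract could feed a simple cross-charging calculation. The contract fields and the flat per-query price are illustrative assumptions; in practice, a data platform would record usage automatically and the compensation terms would be part of the data contract itself.

# Minimal sketch of usage-based cross-charging against a data contract;
# contract fields and the flat per-query price are illustrative assumptions.
from dataclasses import dataclass
from collections import Counter

@dataclass
class DataContract:
    product: str
    provider_unit: str        # business unit that owns the data product
    terms: str                # e.g. allowed purposes, SLAs, retention
    price_per_query: float    # internal credits charged per consumption event

usage = Counter()             # (product, consumer_unit) -> number of queries

def record_usage(contract: DataContract, consumer_unit: str, queries: int = 1):
    usage[(contract.product, consumer_unit)] += queries

def chargeback(contract: DataContract):
    """Credits owed to the provider unit by each consuming unit."""
    return {consumer: n * contract.price_per_query
            for (product, consumer), n in usage.items()
            if product == contract.product}

contract = DataContract("customer-churn-scores", "sales",
                        "analytics only, 30-day retention", 0.05)
record_usage(contract, "marketing", 200)
print(chargeback(contract))   # {'marketing': 10.0}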

Organizational Considerations

Balance of decentralized ownership and centralized governance

At the technical, cultural, and organizational levels, organizations can pick from a variety of architectures and approaches, ranging from a centralized data lake to a Data Mesh of decentralized data products. However, neither fully centralized nor fully decentralized approaches are feasible; a balanced combination based upon the underlying corporate strategy is required.

Part 2 described requirements for making the best use of domain knowledge in decentralized business teams while, at the same time, breaking silos and making data available across the enterprise. To support decentralized requirements, Data Mesh principles can be used. Data Mesh is a socio-technological approach that focuses on decentralized ownership of data products combined with a self-serve data platform to empower cross-domain usage of data. This allows for greater speed, agility, and resilience in data-driven decision-making.

However, regulatory compliance and governance are ineffective if fully decentralized. To address this, data governance and compliance policies must be established centrally, with federated representation from the domains. The central center of competence should include experts for compliance, data governance, and data engineering. This ensures that data is compliant with regulations and organizational policies while still allowing for flexibility and innovation within individual domains.

To align decentralized data ownership with centralized data governance, a data strategy needs to be established. Ideally, such a data strategy combines use cases from within and across business domains with an enterprise data blueprint for architecture, technology, governance, and compliance management. A balanced approach allows individual business units to make specific tool decisions and integrate them while still maintaining a consistent overall strategy.

Figure 1 – Data Strategy and the Balance of Central vs. Decentral Ownership

How to become Data-Driven?

Keeping the cultural and organizational aspects above in mind, how do you set an entire enterprise on the path to becoming data-driven? Where do you start?

We have seen enterprises starting with technology. Picking a data platform or individual technology building blocks is important but won’t get you there on its own. Empowering a central IT or Chief Data Office to define and implement all required technology and hoping for adoption is not an optimal approach either. As with the cultural and organizational aspects, it requires a balance between central governance and technology leadership on the one hand and decentralized data ownership with the flexibility to choose alternative technological building blocks on the other.

 

Establish a Data Strategy

Becoming data-driven is a journey, and as with any journey, the most important first step is to define a direction of travel. What is the North Star against which each step along the way can be validated to confirm it is moving in the right direction? For a data-driven enterprise, this North Star is the Data Strategy.

Smaller companies can successfully establish a data strategy without implementing a central function to own it. Most enterprises, however, benefit from implementing a Chief Data Office or a similar responsibility at the enterprise level to define and own the Data Strategy. A solely centrally owned Data Strategy, on the other hand, may hinder widespread adoption across all business units. Therefore, business unit leaders need to identify data strategy contributors for their division or domain. Again, balance is required for successful data democratization.

What’s in a Data Strategy?

A Data Strategy is a deliberate plan for moving from the current state towards a vision for the business and its use of data. All subsequent steps need to be considered at the enterprise level, at the individual business unit level, or both. Establishing a data strategy requires business and technology leaders to collaborate and align closely at both the enterprise and line-of-business levels.

The first and most important step is to understand the Business Vision and Goals and what role data will play in achieving them. Typically, these goals fall into increasing revenue, cutting cost, improving customer experience, or creating new revenue streams. A data strategy should also identify the stakeholders who will benefit most and establish them as sponsors or as part of a steering committee.

Once the overarching business goals and dependencies have been captured, the key next step is identifying the most compelling Use Cases to pursue. For each use case, the desired outcome as well as the biggest challenges currently preventing the business from achieving these goals need to be evaluated. Individual use cases then need to be carefully assessed in terms of their impact on business goals versus the effort required to establish and maintain them. Based on this assessment, use cases need to be classified and prioritized. Picking “low hanging fruits” initially helps to establish trust with stakeholders and show immediate results. These early wins also help gain additional insights and experience while implementing the organizational and technological solutions needed to deliver upon the larger data strategy.
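
As a simple illustration of such an assessment, the following sketch scores hypothetical use cases by impact versus effort and classifies them. The 1-5 scale, the thresholds, and the example use cases are assumptions for illustration only.

# Minimal sketch of use case prioritization by impact vs. effort;
# scale, thresholds, and example use cases are illustrative assumptions.
use_cases = [
    {"name": "churn prediction",        "impact": 4, "effort": 2},
    {"name": "real-time fraud scoring", "impact": 5, "effort": 5},
    {"name": "report automation",       "impact": 2, "effort": 1},
]

def classify(uc):
    if uc["impact"] >= 3 and uc["effort"] <= 2:
        return "low hanging fruit"      # start here to build trust quickly
    if uc["impact"] >= 4:
        return "strategic bet"          # high impact, plan as a larger initiative
    return "backlog"

# Rank by impact-to-effort ratio and print the classification for each use case.
for uc in sorted(use_cases, key=lambda u: u["impact"] / u["effort"], reverse=True):
    print(uc["name"], "->", classify(uc))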

Use cases must be owned within lines of business and key stakeholders for individual use cases should own their (decentralized) share of the data strategy. The Chief Data Office can provide subject matter experts to establish trust and collaboration between enterprise and individual business unit level. 

Low hanging fruits can only be the starting point though. Data-driven enterprises also need to address the tough problems with large business impact. It is important to set up challenges and ask which as-yet-untapped solutions data could enable, if the right data assets and the right tools to take advantage of them were available.

Finally, central teams and decentral stakeholders need to establish a common framework to measure success and the processes required to observe progress. Successful data-driven enterprises continuously measure and monitor progress to ensure they are meeting their data-driven objectives, using key performance indicators (KPIs) to track progress and make data-driven adjustments where necessary.
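
A minimal sketch of such KPI monitoring could look as follows; the KPI names and target values are purely illustrative assumptions.

# Minimal sketch of tracking data strategy KPIs against targets.
kpis = [
    {"name": "data products published",           "target": 20, "actual": 12, "higher_is_better": True},
    {"name": "days to fulfil a data request",     "target": 5,  "actual": 9,  "higher_is_better": False},
    {"name": "share of governed data assets (%)", "target": 80, "actual": 65, "higher_is_better": True},
]

for k in kpis:
    met = (k["actual"] >= k["target"]) if k["higher_is_better"] else (k["actual"] <= k["target"])
    print(f'{k["name"]}: {k["actual"]} vs. target {k["target"]} ->',
          "on track" if met else "needs attention")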

Business and IT Alignment

The core elements of a Data Strategy take business goals and the role of data into consideration. But working with data at scale will always require technology and tools. Based upon the initial data strategy, enterprises need to align business and IT and understand what technology is currently available. Additionally, they need to discuss and agree on missing technology, including cutting-edge technologies, to be considered and potentially introduced. Based on what technologies are available and which need to be added, enterprises can then define a plan of action for using data, analytics, artificial intelligence, and other technologies to achieve the outlined business outcomes.

Modern Data Architecture

Data-driven enterprises need to establish a target technology blueprint, a target operating model, and a roadmap for implementation and maintenance. Different use cases and business goals will require different technological approaches and tools. A modern data architecture needs to address the key requirements defined in the data strategy, with specific focus on the requirements and known challenges outlined in Part 1. Part 4 will identify and explain key building blocks and technologies of a modern data platform. Part 5 will describe a modern data platform from an IBM point of view in more detail. Subsequent articles will deep dive into specific aspects of modern data architectures based upon IBM products and technologies.

At a high level, modern data architectures usually implement three major layers (see the sketch after this list):

  • Data Stores to process and store data and manage access to data. Data stores are the sources and destinations for data products and APIs.
  • Data Fabric & AI Lifecycle platform to support automated data integration, governance, processing and management of data and data science artifacts. 
  • Data Insights and Applications tools and blueprints for integrating insights and models derived from data into reports, applications, and solutions.
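
The following minimal sketch illustrates how these three layers could interact, with a governed data product flowing from a data store through a fabric-style catalog into a simple insights report. All class and method names are illustrative assumptions, not a specific vendor API.

# Minimal sketch of the three-layer interaction; names are illustrative assumptions.
class DataStore:                        # layer 1: stores and serves data
    def __init__(self, name): self.name, self.tables = name, {}
    def read(self, table): return self.tables.get(table, [])

class DataFabric:                       # layer 2: integration, governance, metadata
    def __init__(self): self.catalog = {}
    def register(self, product, store, table, policy):
        self.catalog[product] = {"store": store, "table": table, "policy": policy}
    def consume(self, product):
        entry = self.catalog[product]
        rows = entry["store"].read(entry["table"])
        return [entry["policy"](r) for r in rows]   # apply governance on access

def insights_report(fabric, product):   # layer 3: insights and applications
    rows = fabric.consume(product)
    return {"product": product, "row_count": len(rows),
            "total_amount": sum(r["amount"] for r in rows)}

# Usage: a governed data product flows from a store through the fabric to a report.
store = DataStore("sales-warehouse")
store.tables["orders"] = [{"customer": "C1", "amount": 120}, {"customer": "C2", "amount": 80}]
fabric = DataFabric()
fabric.register("order-facts", store, "orders", policy=lambda r: {**r, "customer": "***"})
print(insights_report(fabric, "order-facts"))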

Enterprise Data Topology

Experience shows that it is sometimes easier to choose an existing project as a starting point. Enterprises might tie their initial steps into application modernization and digital transformation initiatives. They need to ensure that, as applications are modernized and/or moved to cloud technologies, the data these applications manage is considered as part of the overall data strategy. Where will this data be placed? How can it be acquired, integrated, and governed for data-driven use cases? What benefits can be drawn from it?

But existing initiatives can only be a starting point. Enterprises need to take an inventory of their data and establish a data topology across the enterprise that describes the classification, clustering, and management of data. A data topology makes it possible to pinpoint data silos and to identify the need for data architecture or technology updates.
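
A data topology inventory can start out very simply. The following sketch flags candidate silos from a small, hypothetical inventory; the attributes captured per asset (domain, platform, catalogued, consumers) are assumptions for illustration.

# Minimal sketch of a data topology inventory and silo detection;
# the example assets and captured attributes are illustrative assumptions.
inventory = [
    {"asset": "crm.contacts",   "domain": "sales",   "platform": "on-prem DB", "catalogued": True,  "consumers": ["marketing"]},
    {"asset": "erp.invoices",   "domain": "finance", "platform": "cloud DWH",  "catalogued": True,  "consumers": []},
    {"asset": "excel.forecast", "domain": "sales",   "platform": "file share", "catalogued": False, "consumers": []},
]

# A candidate silo: an asset that is not catalogued or has no cross-domain consumers.
silos = [a["asset"] for a in inventory if not a["catalogued"] or not a["consumers"]]
print("candidate data silos:", silos)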

Policies and Controls

Based upon the data topology, enterprises need to establish a baseline for data governance and compliance. Which data elements are subject to regulatory requirements and therefore require data governance policies and controls (e.g., names, addresses, social security numbers, and other PII data)? Regulatory requirements for data can be generic in nature, such as the handling of personally identifiable information. For most enterprises, additional geography- or industry-specific regulations need to be considered as well. With the rise of machine learning and artificial intelligence, regulations for AI have also been established. The European Union has proposed the EU Artificial Intelligence Act, which defines the need to classify AI models based upon identified risks and requires enterprises to implement corresponding measures for those models.
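
To show how such policies and controls can be tied to classified data elements, the following minimal sketch masks fields labeled as PII before data is shared. The classification labels and the masking rule are illustrative assumptions, not a reference to a specific governance tool.

# Minimal sketch of attaching a control (masking) to classified data elements;
# labels and masking rule are illustrative assumptions.
CLASSIFICATION = {
    "name":            "PII",
    "address":         "PII",
    "social_security": "PII-sensitive",
    "order_total":     "internal",
}

def apply_policy(record):
    """Mask PII before data leaves the owning domain; pass other fields through."""
    masked = {}
    for field_name, value in record.items():
        label = CLASSIFICATION.get(field_name, "unclassified")
        masked[field_name] = "***" if label.startswith("PII") else value
    return masked

print(apply_policy({"name": "Jane Doe", "social_security": "123-45-6789", "order_total": 42.0}))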

For compliance with regulatory requirements, enterprises need to balance the central coordination of policies and controls with decentral data ownership. They need to line up individual projects, outline data governance requirements and policies (quality, privacy, compliance, and security), and support data owners in implementing compliance as early as possible in the data management lifecycle.

People & Skills

As discussed at the beginning of this article, data-driven enterprises need to implement a balanced approach that decentralizes data ownership while supporting data owners with centrally provided technologies, policies, and controls.

To implement this balance, data-driven enterprises need critical skills within central Chief Data Offices or centers of excellence, but they also require data engineering, data science, and other skills in the lines of business to own and maintain data products.

Effectively assessing current skills and defining future skill requirements is a key element of a data strategy. With a clear view of existing and needed skills, enterprises can also lay out the political landscape of supporters versus inhibitors, as well as the required enablement efforts.

To improve data literacy and further grow the required skills across all business units, experience shows that data advocates, deployed across or responsible for domains and business units, have a positive impact. Additionally, business units need to establish product owners or product champions to help others embrace data ownership.

Technology matters here as well. A knowledge catalog and active metadata can improve the productivity of data teams across the enterprise. The Chief Data Office ideally supports the implementation of a common nomenclature and helps get everyone on the same page. With the right skills and appropriate coordination, data-driven enterprises can realize much better operational efficiency.

Experience Delivery Approach 

For implementing and refining both the data strategy and individual use cases, iterative approaches are needed in which stakeholders can “experience” the outcome quickly. Experience-driven approaches are critical to success because they are active, iterative, and user-centric, focusing on user needs and feedback. In contrast, long-running “black box” initiatives are often rigid, less adaptive, and fail to address specific user needs.

Methodology

Short iterations, like sprints, are an important part of the user-focused and fail-fast experience journey. Following proven methods like Design Thinking or Lean Startup ensures outcome orientation while giving enough flexibility and supporting creative solutioning. These methods make it possible to build a trusted foundation and to incrementally build minimum viable products (MVPs) based on actual user needs.

People

User needs are best covered by a multidisciplinary team that co-creates during the MVP design and build phases. Having a team that represents both business and IT ensures business value as well as operability.

Regular feedback loops, such as weekly playbacks, are critical, as they allow the performance and usefulness of the product to be continuously monitored and adjusted.

To maximize the success of the experience approach, the organization as a whole should also consider and define how the three pillars of culture, organization, and data strategy will operate in the future. This fosters a strong connection between the different departments and enables them to share a common vision and direction for becoming a data-driven enterprise.

IBM Client Engineering (CE) provides experience delivery and thus speed-to-value and innovation for clients. The talent CE brings to the table is an investment by IBM for clients to co-create and co-execute on customer-facing business and technical challenges. Clients bring their business and technology context, sponsorship, subject matter experts, and data, while IBM CE brings a deeply skilled multidisciplinary squad, technical accelerators, proven methods, and a memorable experience.

The positive impact of experience delivery approaches on establishing and delivering upon a data strategy is best demonstrated with success stories. One such project was delivered by IBM Client Engineering together with the higher regional court in Stuttgart. The IBM CE team co-created and deployed an IBM mass trial assistant for case processing in diesel exhaust proceedings. The AI-supported system is used for document processing and is intended to relieve the burden on all parties involved.

A dedicated, future article will describe the IBM Client Engineering approach and success stories in more detail. 

Becoming data-driven requires cultural, organizational, technological, and methodological change as well as close alignment with business goals. In this article we focused on culture, organization, and approaches to consider. Part 4 will provide a summary of key technological building blocks of a data-driven enterprise.

 

