Definition

What is data governance?

Data governance consists of policies, processes and an organizational structure to support enterprise data management. The structure of a data governance program provides understanding, security and trust around an organization’s data among its stakeholders, especially as companies scale and accumulate more data sources and assets. As data volumes grow rapidly, companies need to choose appropriate big data environments for storage and access, such as data lakes, and design a data architecture that governs those sources, integrates them and makes them available across the organization. This data integration becomes increasingly important as it affects the workflows and decision-making of various teams.

Data governance is essential to an organization’s overall strategy for data management and as part of a complete DataOps practice. It helps you to know what data you have, where that data resides and how that data can be used. Data governance lays the foundation for business-ready data through the adherence to defined rules and processes to accelerate analytics and growth initiatives.

Data governance and IBM

A data governance platform with an integrated data catalog can help your organization find, curate, analyze, prepare and share data to support your AI initiatives. IBM data governance solutions help to ensure that the data pipeline is ready to help catalog, protect and govern sensitive data and to trace data lineage.

Why IBM

IBM Watson Knowledge Catalog

Activate business-ready data for AI and analytics with a data catalog that’s backed by active metadata and policy management. Help your colleagues find data to curate, categorize, govern, analyze and use.

Dive deeper on data governance

Benefits of data governance

Stakeholders can achieve cross-organization success with strong data governance practices that enable deeper insights while protecting data.

Better data security and compliance

Different types of data may be subject to different permissions or rules, especially if the data contains personally identifiable information (PII). Data governance practices can help promote security and compliance, reducing the risk of breaches and fines and protecting customer trust. They help an organization know what PII exists and where it resides and, through policy and metadata management, can help automate compliance.
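As a rough illustration of how policy and metadata can drive PII handling, the following minimal sketch tags columns by name and masks the tagged values. The patterns, category names and helper functions are hypothetical and exist only for this example; a real program would rely on its own policies and a catalog's classification features.

```python
import re

# Hypothetical policy: column-name patterns that mark a field as PII.
PII_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "national_id": re.compile(r"ssn|national[-_]?id", re.IGNORECASE),
}


def classify_columns(columns):
    """Map each column name to the first matching PII category, or None."""
    tags = {}
    for name in columns:
        tags[name] = next(
            (cat for cat, pattern in PII_PATTERNS.items() if pattern.search(name)),
            None,
        )
    return tags


def mask_row(row, tags):
    """Replace values in PII-tagged columns with a fixed placeholder."""
    return {key: ("***" if tags.get(key) else value) for key, value in row.items()}


tags = classify_columns(["customer_id", "email_address", "mobile_phone", "order_total"])
masked = mask_row(
    {
        "customer_id": 17,
        "email_address": "a@example.com",
        "mobile_phone": "555-0100",
        "order_total": 42.5,
    },
    tags,
)
print(tags)
print(masked)
```

Name-based matching is only a starting point; it misses PII stored in generically named columns, which is one reason content-based profiling is often used alongside it.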

Privacy regulation continues to expand, led by sweeping rules such as the European Union’s General Data Protection Regulation (GDPR), which protects the personal data of individuals in the EU, particularly online. There are also industry-specific and regional regulations, such as the US Health Insurance Portability and Accountability Act (HIPAA), which protects patients’ personal health information. Spurred by the rise in data-driven marketing and increasingly remote work, compliance regulations are becoming more prevalent as customers grow more aware of their data rights and as companies address increasing reputational risks.

Improved data quality

Business intelligence tools are only as good as the data that feeds them. If the underlying data has not been cleansed and managed properly, users are less able to make informed business decisions. In addition, data is often pulled from a variety of sources, and inconsistencies between them can hinder analytics and other critical projects. Data governance helps connect information across systems and identify meaningful relationships, so that an organization gets the most out of its data and critical data doesn’t get left behind.
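To make the idea of cross-source inconsistencies concrete, here is a minimal sketch that compares records sharing a key in two systems and reports the fields that disagree. The source names (`crm`, `billing`), the sample records and the normalization rule are assumptions made for this example.

```python
def normalize(value):
    """Trim and lowercase strings so cosmetic differences are ignored."""
    return value.strip().lower() if isinstance(value, str) else value


def find_conflicts(source_a, source_b):
    """List (key, field, value_a, value_b) for shared fields that differ."""
    conflicts = []
    for key in sorted(source_a.keys() & source_b.keys()):
        rec_a, rec_b = source_a[key], source_b[key]
        for field in sorted(rec_a.keys() & rec_b.keys()):
            if normalize(rec_a[field]) != normalize(rec_b[field]):
                conflicts.append((key, field, rec_a[field], rec_b[field]))
    return conflicts


# Hypothetical extracts from two systems, keyed by customer ID.
crm = {
    1: {"email": "Ana@Example.com ", "country": "PT"},
    2: {"email": "bo@example.com", "country": "US"},
}
billing = {
    1: {"email": "ana@example.com", "country": "PT"},
    2: {"email": "bo@example.com", "country": "CA"},
}

conflicts = find_conflicts(crm, billing)
print(conflicts)
```

Deciding which system wins when values conflict is a governance question rather than a coding one, and would typically fall to the relevant data owner.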

Accelerated automation

With practices and tools to maintain data organization and quality, analytics teams can start to innovate and automate specific tasks and processes with machine learning algorithms. For example, customer data can be fed into models to determine how prospects should be prioritized within the sales pipeline. Because a data governance practice helps ensure that customer data is accurate and protected, teams can achieve greater growth and more targeted selling.

Roles around data governance

Three roles are key to the practice of data governance. Together they ensure that standards are created and maintained over time, supporting compliance, security, data quality and automation goals.

Chief data officer

Executive sponsors, such as chief data officers, signal the importance of a data governance program by making it an organizational priority. These individuals are instrumental in the development of a cross-functional council, which usually draws members from various business units to represent the needs and concerns of different disciplines or product portfolios. This council serves as a forum to communicate new data governance initiatives and assign responsibilities to achieve agreed-upon timelines and outcomes.

Data owners

These individuals are responsible for the state of the data. They are usually designated by the type of data that they manage, such as customer or financial data, and their role is to maintain data accuracy and usability. Common tasks include troubleshooting data issues, approving data definitions and providing data recommendations, particularly as they relate to regulatory requirements.

Data stewards

These individuals are subject matter experts (SMEs) in their data domain, influencing data policies and championing data governance across the organization. Because they can communicate the importance of specific data points for business processes or decisions, they can also influence the structure of database tables to ensure that the right data is surfaced for reporting purposes. Overall, data stewardship helps keep stakeholders accountable for their role in maintaining data quality.

Data governance framework

Data governance practices have increased in adoption over the years, especially with the growth in digital transformation projects. For data governance initiatives to achieve successful outcomes, they should include a number of components, such as:

Data standards

Data dictionaries, taxonomies and business glossaries should be developed to provide clarity around business and data definitions. This documentation reduces confusion in conversations, particularly ones involving metrics and reporting. It also gives stakeholders visibility into the data architecture, enabling teams to innovate on their own to automate processes for their discipline.
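One way to picture a data dictionary is as a machine-readable set of field definitions that records can be checked against. The sketch below assumes a hypothetical `FieldDefinition` structure and three invented fields; it is not a standard format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """One data dictionary entry (an illustrative structure)."""
    name: str
    dtype: type
    description: str
    required: bool = True


# Hypothetical dictionary for a customer data set.
DATA_DICTIONARY = {
    d.name: d
    for d in (
        FieldDefinition("customer_id", int, "Unique identifier assigned at signup"),
        FieldDefinition("signup_date", str, "ISO 8601 date the account was created"),
        FieldDefinition("segment", str, "Marketing segment label", required=False),
    )
}


def validate(record):
    """Return a list of ways the record departs from the dictionary."""
    errors = []
    for name, definition in DATA_DICTIONARY.items():
        if name not in record:
            if definition.required:
                errors.append(f"missing required field: {name}")
        elif not isinstance(record[name], definition.dtype):
            errors.append(f"{name}: expected {definition.dtype.__name__}")
    for name in record:
        if name not in DATA_DICTIONARY:
            errors.append(f"undocumented field: {name}")
    return errors


errors = validate({"customer_id": "17", "signup_date": "2023-04-01", "region": "EMEA"})
print(errors)
```

Flagging undocumented fields, not just invalid ones, keeps the dictionary and the data from drifting apart as new attributes are added.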

Data processes and organizational structure

Data governance processes provide transparency to end users around how data is processed within an organization. This can include data refresh cadences, PII restrictions, regulatory data policies or even something as simple as data access. This type of documentation also supports organizational structure by clarifying the responsibilities of different roles as they pertain to the management and maintenance of data.

Technology

Different data governance tools, such as metadata management platforms, support the processes and standards around data. These tools can store and secure information about the data that an organization manages. This can include documentation on business definitions, data logs, data owners and database information (such as database and table names, server locations and data types). It can also feed into self-service data analytics tools, allowing analysts to query and visualize different data sets for reporting or innovation projects.
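The kind of information such a tool holds can be sketched as a small in-memory catalog. The `TableMetadata` and `Catalog` classes, along with the server and team names, are hypothetical stand-ins for illustration and do not represent any particular product's API.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """Descriptive metadata for one table; all field names are illustrative."""
    database: str
    table: str
    server: str
    owner: str
    columns: dict = field(default_factory=dict)  # column name -> data type


class Catalog:
    """In-memory stand-in for a metadata management platform."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Key by (database, table) so registering again updates the entry.
        self._entries[(entry.database, entry.table)] = entry

    def tables_with_column(self, column):
        """Return qualified names of tables that contain the given column."""
        return sorted(
            f"{e.database}.{e.table}"
            for e in self._entries.values()
            if column in e.columns
        )

    def owned_by(self, owner):
        """Return qualified names of tables assigned to the given owner."""
        return sorted(
            f"{e.database}.{e.table}"
            for e in self._entries.values()
            if e.owner == owner
        )


catalog = Catalog()
catalog.register(TableMetadata(
    "sales", "customers", "db01.internal", "crm-team",
    {"customer_id": "INTEGER", "email": "VARCHAR"},
))
catalog.register(TableMetadata(
    "finance", "invoices", "db02.internal", "finance-team",
    {"invoice_id": "INTEGER", "customer_id": "INTEGER"},
))

shared = catalog.tables_with_column("customer_id")
print(shared)
print(catalog.owned_by("crm-team"))
```

A lookup like `tables_with_column` is what lets an analyst discover where a given attribute lives before querying it in a self-service tool.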

You might also be interested in

IBM Cloud Pak® for Data

A flexible multicloud data platform that integrates your data, whether on premises or on cloud, and helps to keep it more secure at its source.

IBM® DataStage®

A highly scalable data integration tool used to design, develop and run jobs that move and transform data, deployable on premises and on any cloud.

IBM InfoSphere Advanced Data Preparation

Software that delivers self-service access to data. Begin data analysis more quickly with automated transformation.
