New to data governance? Five best practices to get started

IBM launches Watson Knowledge Catalog Academy to help users navigate data governance best practices

3 minute read | September 9, 2020

With so many different definitions and methodologies, data governance and cataloging can be overwhelming. Where do you begin? How do you develop best practices for building out your data governance framework? IBM Watson Knowledge Catalog Academy is a new resource designed to address the most pressing questions about managing enterprise data and AI model governance, quality and collaboration. Getting these practices right is crucial, especially as rapidly changing business conditions reconfigure the landscape of work.

As organizations transition from on-premises to hybrid or remote working arrangements, data governance is a growing concern. Recent research from Gartner suggests that strong data governance policies and practices can help organizations overcome tremendous data protection and privacy challenges as workers connect and share data from distributed environments.

IBM suggests you consider the following five points as you build a solid foundation for data governance expertise.

Data governance or data management?

Is there a difference? Yes, although the two concepts are closely related.

Data governance refers to enterprise-level management of data availability, relevance, usability, integrity and security. Data governance helps organizations manage institutional knowledge by defining data owners, business terms, rules, policies and processes throughout the entire data lineage.

Data management is the technical implementation of data governance. It is a comprehensive method for defining and managing enterprise data. Data management policies and procedures ensure that data is collected and organized properly, using tools and techniques to mask, encrypt, profile and define data.
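
To make "mask" and "profile" concrete, here is a minimal Python sketch of what such techniques might look like at their simplest. It is an illustration only, not how Watson Knowledge Catalog or any IBM product implements them. The function names (mask_email, pseudonymize, profile_column) and the salt value are assumptions invented for this example.

```python
import hashlib
from collections import Counter


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        # Not a recognizable email address: hide it completely.
        return "***"
    return local[0] + "*" * (len(local) - 1) + "@" + domain


def pseudonymize(value: str, salt: str) -> str:
    """Replace a value with a short, repeatable token (a salted SHA-256 prefix)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def profile_column(values: list) -> dict:
    """Collect basic profiling statistics for one column of values."""
    non_null = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "most_common": Counter(non_null).most_common(1)[0][0] if non_null else None,
    }


if __name__ == "__main__":
    print(mask_email("alice@example.com"))
    print(pseudonymize("alice@example.com", salt="demo-salt"))  # hypothetical salt
    print(profile_column(["US", "DE", "US", None, ""]))
```

Masking keeps a value readable enough for support or testing, while pseudonymizing swaps it for a repeatable token so that records can still be joined. Which one applies to a given field is a governance decision; the code only carries it out.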

Understanding different sources and types of information assets

Whether an organization is large or small, if its enterprise data is not well understood, it cannot be fully protected and used. An information asset is a body of information defined and managed as a single unit so that it can be understood, shared, protected and used effectively. Examples of such information assets include personally identifiable information (PII), intellectual property, financial information and any other information critical to company operations. Identifying the different data sources and the appropriate role-based access for each, regardless of where that data lives, is also important. These sources are not limited to traditional structured data, typically housed in relational databases. They also include unstructured data such as emails, blogs and other web content.

Measuring organizational maturity

To better assess the strengths and needs of the organization, enterprises must understand the state of their data governance maturity. Rather than relying on spreadsheets, tribal knowledge or hand coding, enterprises should catalog data assets by capturing metadata, assigning policies to data classes, assessing and scoring data quality and leveraging tools for data integration. Once data governance maturity has been assessed, teams can move toward improving data governance capabilities across the entire enterprise.
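
The cataloging steps above (capture metadata, assign policies to data classes, score data quality) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Watson Knowledge Catalog data model or API. The CatalogEntry structure, the policy names, and the use of completeness as the only quality measure are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a data class to the governance policy that applies to it.
POLICIES_BY_DATA_CLASS = {
    "email": "mask-before-sharing",
    "customer_id": "restricted-access",
}


@dataclass
class CatalogEntry:
    """Metadata captured for one data asset."""
    name: str
    owner: str
    data_classes: dict  # column name -> data class
    quality_score: float = 0.0
    policies: dict = field(default_factory=dict)  # column name -> policy


def completeness_score(rows: list, columns: list) -> float:
    """Share of cells that are populated, from 0.0 to 1.0."""
    total = len(rows) * len(columns)
    if total == 0:
        return 0.0
    filled = sum(
        1 for row in rows for col in columns if row.get(col) not in (None, "")
    )
    return filled / total


def catalog_asset(name: str, owner: str, rows: list, data_classes: dict) -> CatalogEntry:
    """Build a catalog entry: record metadata, assign policies, score quality."""
    entry = CatalogEntry(name=name, owner=owner, data_classes=dict(data_classes))
    entry.quality_score = completeness_score(rows, list(data_classes))
    entry.policies = {
        col: POLICIES_BY_DATA_CLASS.get(cls, "no-policy-assigned")
        for col, cls in data_classes.items()
    }
    return entry


if __name__ == "__main__":
    rows = [
        {"id": "1", "email": "a@example.com"},
        {"id": "2", "email": ""},
    ]
    entry = catalog_asset(
        "customers", "sales-ops", rows, {"id": "customer_id", "email": "email"}
    )
    print(entry.quality_score)
    print(entry.policies)
```

Real quality scoring usually combines several dimensions, such as validity, uniqueness and timeliness. Completeness is used here only because it is the easiest to show.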

Power of a data catalog

Many enterprises struggle to manage their data due to a lack of a reliable end-to-end solution on an integrated platform. A modern data catalog operates as the single source of trust that can organize and govern all the metadata shared across the enterprise to allow for easy collaboration.

Gartner research notes that “demand for data catalogs is soaring as organizations continue to struggle with finding, inventorying and analyzing vastly distributed and diverse data assets.” Using AI and machine learning to support data cataloging can become a core part of a data governance best-practices regimen.

With a robust data catalog, enterprises can locate and classify information at scale, unlock the hidden value in their data, improve data visibility and better enforce data governance policies. A catalog also enables developers and data scientists to analyze and prepare enterprise data for artificial intelligence (AI) applications.

Dive deeper: Explore the ebook “A comprehensive guide for the modern data catalog.”

Best practices for a sound governance foundation

To realize business value and increase efficiency among stakeholders, enterprises must understand how to integrate the principles of data governance and management within the end-to-end platform of a data catalog. Each of these five principles articulates how enterprises can build a strong governance foundation. When an organization strives to improve efficiency and promote collaboration across lines of business, the first step should be to build a robust business taxonomy, concentrating on the meaning of business definitions and developing actionable milestones.

For a deeper look at best practices for delivering an end-to-end, business-ready foundation, read “Data governance: The importance of a modern machine learning knowledge catalog.”

Next steps

Continue building your understanding of the core concepts of data governance, data cataloging, and Watson Knowledge Catalog. Explore the WKC Academy today.

To learn more about Watson Knowledge Catalog, visit www.ibm.com/watson-knowledge-catalog.

Watson Knowledge Catalog is now included in the base of IBM Cloud Pak for Data. Learn more about our unified data and AI platform by visiting our website and reading our newsletter.