August 9, 2017 | Written by: Jay Limburn
Categorized: Data Analytics
Data governance is rarely seen as a glamorous topic, and even the mere mention of the ‘G’ word often inspires groans and yawns from non-specialists. But are they missing a trick? It’s possible that the failure to appreciate data governance comes from a lack of understanding about the value it can deliver, and just how important it is to future success.
Today, we’re going to attempt to address that gap in understanding. First, let’s define our terms: by data governance, we’re referring to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise. A sound data governance program includes a defined set of procedures, a plan to execute those procedures, and people who are responsible for putting that plan into action. This might sound like a lot of work without much payoff—but the truth is that data governance plays a key role in ensuring that data is used to its full potential.
In our previous blog, “How smart catalogs can turn the big data flood into an ocean of opportunity”, we touched on the overlap between cataloging and data governance, and suggested that both are vital to a successful data strategy. Now we’re going to explore how governance and cataloging can combine to help you evolve your organization into a truly data-driven enterprise.
Confronting the dangers of being reactive, not proactive
Let’s look at some of the most common approaches to data governance. Many data governance plans are created or updated in response to emerging regulations, such as the upcoming General Data Protection Regulation (GDPR). These types of plans typically place a strong focus on avoiding non-compliance, and little else.
Developing a policy in this way often means hiring an army of compliance officers and data stewards, writing reams of documents around data handling, and hurriedly deploying a range of niche technology solutions to shore up the process as and where needed.
The result? A fragmented set of data governance tools, and a high probability that data governance professionals gain a reputation for being the ‘tax inspectors’ of the data world—focusing on enforcing restrictions, rather than helping the business achieve its goals. In consequence, knowledge workers are limited rather than enabled in the effective use of data to support everyday decision-making.
Embracing a new data governance perspective
It doesn’t need to be this way. By flipping the focus of data governance from restricting the use of data to enabling access, sharing and reuse, organizations will start to realize the positive value that good governance delivers.
Already, chief data officers are tasked with driving better use of data throughout their organizations, on top of their typical responsibilities around information security and compliance. By embracing the right data governance strategy, they can hit both these targets: giving users the confidence to use and share data freely, while still keeping the organization and its data appropriately protected.
Enabling governance for enforcement and insight
How can you put this new approach into action? The first step is to extend your definition of governance. The organization needs to understand that governance is not just about ensuring enforcement; it’s also about the reliable delivery of insight.
In fact, these two aspects of the role go hand-in-hand. A good governance framework will ensure that all data shared within your platform is automatically protected and used in line with the company’s governance guidelines. At the same time, the protections offered by the framework can help give users the confidence to discover data anywhere within your organization, knowing that whatever they can access, they can use. Similarly, they will be encouraged to contribute their own data, safe in the knowledge that it will be shared appropriately and won’t be leaked or misused.
Thus, governance strategies designed around enforcement can also enhance data usage and increase insight.
One platform for all data
End-to-end governance that empowers rather than restricts the user is a big step towards becoming a data-driven organization. IBM Watson Data Platform is designed to ease the move towards governance models that enable the business in its use of data, rather than restricting it. It will do so through its upcoming Data Catalog solution (currently in beta), which will deliver value in three key areas:
Automated enforcement and classification: By providing real-time classification and enforcement of governance policies, Data Catalog will ensure data is appropriately organized—and if necessary, that sensitive information is masked, hidden or protected—whenever it is accessed, edited or moved.
Monitoring and analytics: A Data Catalog governance dashboard will offer immediate insight into the status of an organization’s governance program, allowing chief data officers and other managers to track their progress towards becoming a data-driven business.
Dynamic metadata: Data Catalog will feature dynamic management of the metadata around how data should be used and managed. This enables an intelligent search functionality, powering more effective analytics and making the most valuable information easier for users to find and use.
Embedded as shared components within the fabric of future releases of Data Catalog, each of these three capabilities will be fronted by APIs that enable seamless integration with other tools.
But what does it mean in practice?
Let’s consider one possible use case for Watson Data Platform’s Data Catalog solution, and how the embedded data governance capabilities could have a huge impact in real-life terms.
Imagine you are a business analyst in the North American offices of a financial institution. You’re working on a potentially sensitive data asset, and are considering whether to share it within your company’s enterprise catalog. Many of your colleagues would find elements of the dataset useful—they could use it to help identify locations where large numbers of new bank accounts are being opened, for example. But because the dataset includes customers’ Social Security numbers, you cannot share it in its entirety. Plus, teams in Europe aren’t allowed to see any information on North American customers, so you need to make sure they don’t even know this data exists.
With a traditional data platform, your options would be limited: if you wanted to share the dataset, you’d need to invest large amounts of time and effort in cleaning and masking it to avoid violating your data governance policies. The likelihood is that you’d decide sharing the data simply isn’t worth your time, and your company would miss out on a host of insights.
But fortunately, your organization has recently rolled out Data Catalog. When you add the data to the enterprise catalog, it is automatically classified as containing sensitive information. The solution also immediately redacts or masks any data that users are not allowed to see, in line with your organization’s data governance policies. Your colleagues in Europe will not even be aware of the data’s existence, while your North American team will be able to find and access it instantly to enrich their analysis.
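To make the scenario a little more concrete, here is a minimal Python sketch of the three behaviors described above: classifying an asset as sensitive, masking what users should not see, and hiding the asset from other regions. It is an illustration only and does not use the Data Catalog APIs. The `Asset` class, the region codes, the SSN pattern and every function name are assumptions invented for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern for US Social Security numbers (illustration only).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class Asset:
    """A hypothetical catalog entry; not a real Data Catalog type."""
    name: str
    region: str  # region the customer data belongs to
    rows: list[dict[str, str]]
    sensitive_columns: set[str] = field(default_factory=set)


def classify(asset: Asset) -> Asset:
    """Mark every column in which any value looks like an SSN."""
    for row in asset.rows:
        for column, value in row.items():
            if SSN_PATTERN.search(value):
                asset.sensitive_columns.add(column)
    return asset


def visible_assets(catalog: list[Asset], user_region: str) -> list[Asset]:
    """Assumed policy: users only discover assets from their own region."""
    return [asset for asset in catalog if asset.region == user_region]


def masked_rows(asset: Asset) -> list[dict[str, str]]:
    """Return copies of the rows with sensitive columns masked."""
    return [
        {
            column: "***-**-****" if column in asset.sensitive_columns else value
            for column, value in row.items()
        }
        for row in asset.rows
    ]


# The analyst adds the dataset; classification happens on the way in.
accounts = classify(Asset(
    name="new_accounts",
    region="NA",
    rows=[
        {"city": "Austin", "ssn": "123-45-6789"},
        {"city": "Denver", "ssn": "987-65-4321"},
    ],
))
catalog = [accounts]

na_view = [masked_rows(asset) for asset in visible_assets(catalog, "NA")]
eu_view = visible_assets(catalog, "EU")
```

In this sketch, North American users see the city column with the SSN column masked, while the European view is simply empty, so those users never learn that the asset exists. The stored rows are left unmodified; masking is applied only when the data is read.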
Best of all, you too are able to dive into the platform and extract data that will enhance your own work, confident that by definition, if you can see it, you’re allowed to use it. In short: everyone can be more confident, more productive, and more collaborative in their use of data.
The message is clear
It is time to start thinking about data governance in a new way. IBM Data Catalog is being built from the ground up around that idea: governance that delivers insight to your business, not just enforcement. Learn more about the Data Catalog solution and sign up for more information about the beta.