Manage privacy and compliance of sensitive data

Chapter 05

As a digital enterprise, your business likely stores and manages sensitive data sets, including customer and employee personally identifiable information (PII) and intellectual property. Because sensitive data can make up a significant portion of your data landscape, it can and should be used to power your AI models. However, it’s critical to identify and protect that data and to adhere to relevant regulatory requirements. Failure to do so can result in hefty fines, lost business and reputational damage.

The average cost of a data breach is USD 3.86 million.1

Many organizations struggle with identifying and managing sensitive customer data. Research by IBM and the Ponemon Institute found that 80% of data breaches in 2020 included customer PII. Out of all the types of data exposed in these breaches, customer PII was also the costliest to the businesses studied.2


81% of consumers would stop engaging with a brand online following a data breach.2

To manage data privacy effectively, organizations need full visibility into data located across their hybrid cloud environment, at every stage of the data and AI lifecycle. Establishing trusted data pipelines lets your teams share data safely across the enterprise ecosystem, which shortens the time to compliance, innovation and AI deployment.

A holistic data privacy and security framework helps businesses avoid a piecemeal approach in which data from disparate sources is secured with disparate point solutions. Such a framework enables enterprises to discover, audit and govern sensitive data. With a data catalog, an essential capability in a data fabric, organizations can automate the detection and governance of sensitive information, simplifying the effort to manage privacy and compliance.

Identify sensitive data
The first step to managing data privacy and regulatory compliance is to know what data you’re working with.

Consider the following questions:

  • Does your team know where your customers’ private data is located?
  • Do you have control of who can access it?
  • How ready is your team to respond to customers’ data access requests?

A modern data catalog automates metadata curation, including the detection and classification of sensitive data. It also automates core data governance services: data lineage (a real-time visualization of the data’s complete journey across the lifecycle), policy and rules enforcement, and reference data management.

By automating the ingestion and classification of sensitive data, a data catalog gives you critical visibility of your sensitive data, which is step one in managing privacy and compliance.
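
To make the idea of automated classification concrete, here is a minimal Python sketch of pattern-based PII detection over tabular columns. It is an illustration only: the pattern set, the `classify_column` and `classify_table` helpers, and the 80% match threshold are all assumptions made for this example, and a production data catalog would rely on far richer classifiers than three regular expressions.

```python
import re

# Hypothetical, deliberately simple patterns chosen for this sketch.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+"),
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "phone": re.compile(
        r"(\+?\d{1,3}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}"
    ),
}


def classify_column(values, threshold=0.8):
    """Return the PII labels whose pattern fully matches at least
    `threshold` of the non-empty values in a column."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return set()
    labels = set()
    for label, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.fullmatch(v))
        if hits / len(non_empty) >= threshold:
            labels.add(label)
    return labels


def classify_table(table):
    """Map each column name to its detected PII labels,
    omitting columns where nothing was detected."""
    detected = {}
    for name, values in table.items():
        labels = classify_column(values)
        if labels:
            detected[name] = labels
    return detected


# Made-up sample data for illustration.
sample = {
    "contact": ["ana@example.com", "bo@example.org"],
    "tax_id": ["123-45-6789", "987-65-4321"],
    "city": ["Lisbon", "Oslo"],
}
flagged = classify_table(sample)
```

Matching on a share of values rather than on every value lets the sketch tolerate a few malformed entries in an otherwise sensitive column. The resulting labels are the kind of metadata a catalog could attach to each column.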


Anonymize sensitive data for AI
Once you’ve identified sensitive data, it must be anonymized before it can be used safely for AI. In a data fabric, automated privacy features help anonymize PII and other confidential information using the best-fit technique for a given data set, such as encryption, tokenization, masking or statistical noise. A data fabric also automatically enforces an organization’s data protection rules. These automated privacy capabilities help reduce risk without excluding sensitive data from AI projects.
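
Three of the techniques named above can be sketched in a few lines of Python using only the standard library. This is a simplified illustration, not how any particular data fabric implements them. The function names are hypothetical, the "tokenization" shown is a keyed hash (HMAC-SHA256) rather than a vault-backed token service, and the noise function draws from a Laplace distribution, one common choice for statistical noise. Encryption is left out because it needs a third-party library.

```python
import hashlib
import hmac
import random


def mask_email(email):
    """Masking: keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


def tokenize(value, key):
    """Keyed-hash stand-in for tokenization: the same input and key always
    give the same token, so joins across tables still work."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def add_noise(value, scale, rng=None):
    """Statistical noise: add a Laplace(0, scale) sample to a numeric value.
    The difference of two exponential samples is Laplace distributed."""
    rng = rng or random
    return value + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


# Example usage with made-up values; the key is a placeholder, not a real secret.
SECRET_KEY = b"replace-with-a-managed-secret"
masked = mask_email("ana@example.com")
token = tokenize("123-45-6789", SECRET_KEY)
noisy_age = add_noise(42.0, scale=1.0, rng=random.Random(7))
```

Which technique fits depends on how the data will be used: masking keeps values human-readable, deterministic tokens preserve joins and group-bys, and noise keeps aggregate statistics approximately intact while obscuring individual values.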

Stay compliant and audit-ready
As regulatory requirements evolve and add complexity to data management, businesses need a solution that helps ensure compliance and streamlines the auditing process. A data fabric’s automated metadata and governance layer speeds time to compliance by automating routine, manual governance activities. It provides a PII taxonomy for regulatory standards such as the General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA), as well as industry reference data to support data mapping for compliance.
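
As a rough illustration of how a PII taxonomy can support audits, the Python sketch below joins classified columns with a label-to-regulation mapping to produce a simple report. Everything in it is hypothetical: the `compliance_report` helper, the catalog layout and the taxonomy entries are placeholders for this example and are not an authoritative reading of either regulation.

```python
# Hypothetical taxonomy: regulations each PII class is assumed to fall under.
# Placeholder values for illustration only, not legal guidance.
PII_TAXONOMY = {
    "email": {"GDPR", "CCPA"},
    "us_ssn": {"GDPR", "CCPA"},
}

# Hypothetical catalog metadata: (dataset, column) -> detected PII labels.
CATALOG = {
    ("crm", "contact"): {"email"},
    ("hr", "tax_id"): {"us_ssn"},
    ("crm", "city"): set(),
}


def compliance_report(catalog, taxonomy):
    """Return sorted (dataset, column, label, regulation) rows for every
    classified column whose label appears in the taxonomy."""
    rows = []
    for (dataset, column), labels in catalog.items():
        for label in labels:
            for regulation in taxonomy.get(label, ()):
                rows.append((dataset, column, label, regulation))
    return sorted(rows)


report = compliance_report(CATALOG, PII_TAXONOMY)
```

Keeping the taxonomy as data rather than code means it can be updated as regulations change without touching the reporting logic.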

By helping businesses identify and anonymize sensitive data and simplify compliance, a data catalog is central to building a strong foundation for AI using a data fabric architecture.

80% of data breaches in 2020 included customer PII, which can cost businesses financially and reputationally.

1 Cost of a Data Breach Report 2021, IBM Security and the Ponemon Institute, 2021.