AI helps companies meet new data protection challenges

4 minute read | July 30, 2019

In an ideal world, rules should be based on principles—on what’s right, not what’s easy. In Europe, a good example of that maxim in action is the General Data Protection Regulation (GDPR), a set of rules adopted in 2016 designed to protect privacy and personal data for citizens living in the European Union (EU) and the European Economic Area (EEA).

The basic principle behind the GDPR is that companies controlling customers’ personal data have a legal obligation to take all necessary measures to protect it. The biggest contrast with the previous data privacy rules relates to the treatment of “incidental” personal data that lurks in the inner recesses of documents, emails and all other kinds of files saved in different parts of the company, any of which can expose citizens to privacy risks. While the old rules cut companies some slack about these shards of personal data, the GDPR holds them to a kind of zero-tolerance standard, saying best efforts aren’t good enough.

Many needles … in many haystacks

Needless to say, upholding these principles is anything but easy for the companies that have to follow the GDPR. They need to take whatever steps are necessary to document what the personal data is and where it resides, and to have legal justification for having it. Think finding a needle in a haystack is hard? GDPR is more like finding many needles in many haystacks, and not missing any. The capabilities of traditional document management processes—the kind most companies have—don’t come close to addressing this super-stringent requirement.

Manually scanning through all this unstructured data can seem like an almost impossible job. But in reality, it’s just really, really time-consuming and prone to mistakes and oversights. And that makes it a nearly perfect use case for applying AI technology.

A new framework for managing personal data

The company we founded to do this, Aigine, is predicated on the idea that AI—when combined with smart workflows and the right organizational development—can transform how personal data is managed and protected.

Our business model is to use AI to filter through a company’s unstructured data, not only to find and catalog personal information but also to provide an intelligent framework for managing it in the context of GDPR requirements. As we prepared our offering, known as the Aigine Unstructured Data Inventory Engine (UDIE), we looked for an AI solution that was both powerful and adaptable. The latter factor is important because the rules on what constitutes personal data are always subject to change through legislation or court rulings. The ease with which IBM Watson lets us reprogram the AI algorithms to reflect these kinds of changes really stood out from the rest of the market.

Augmented intelligence in action

The solution we brought to market is based on a suite of IBM Watson AI services—Discovery, Natural Language Classifier and Natural Language Understanding—connected via APIs to the IBM StoredIQ solution, which is used for unstructured data assessment. The UDIE solution starts off by using AI models to filter through all files and then identify and cordon off those that contain personal information, creating, in effect, a privacy data inventory.
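To make the idea of a privacy data inventory concrete, here is a minimal sketch. It is not the UDIE implementation and does not call any IBM Watson service: simple regular expressions stand in for the trained AI models, and the names `DETECTORS`, `InventoryEntry` and `build_inventory` are hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass, field

# Assumption: two toy regex detectors stand in for trained AI models.
DETECTORS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+\d{1,3}[ -]?\d{2,4}(?:[ -]?\d{2,4}){1,3}"),
}


@dataclass
class InventoryEntry:
    """One file that contains personal data, with what was found in it."""
    path: str
    findings: dict = field(default_factory=dict)  # category -> matched strings


def build_inventory(documents):
    """Scan every document and keep only those with at least one finding."""
    inventory = []
    for path, text in documents.items():
        findings = {}
        for category, pattern in DETECTORS.items():
            matches = pattern.findall(text)
            if matches:
                findings[category] = matches
        if findings:
            inventory.append(InventoryEntry(path, findings))
    return inventory


# Hypothetical sample files (path -> extracted text).
docs = {
    "notes/meeting.txt": "Action items: ship the release on Friday.",
    "mail/offer.txt": "Contact anna.berg@example.com or +46 70 123 45 67.",
}
inventory = build_inventory(docs)
for entry in inventory:
    print(entry.path, sorted(entry.findings))
```

Only the second file ends up in the inventory, because the first contains nothing the detectors recognize. A production system would swap the regular expressions for models that understand context, which is where the AI services come in.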

Managing the personal data within the inventory is the job of IBM StoredIQ, which is deployed as an instance on the customer’s on-premises network. The solution’s algorithms automatically determine which person or people within the organization are best suited to make the decisions for specific data elements based on their usage context, whether that is marketing data or billing data. The solution then uses intelligent workflows to alert and direct these internal expert reviewers to that task. Next, using the solution’s annotation engine, the reviewers perform the critical function of defining how each data element is being used, how the company is protecting it and what the legal basis is for maintaining the data.
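The routing step can be pictured with a small sketch. It is only an illustration under stated assumptions, not how IBM StoredIQ or UDIE assigns work: the reviewer names, the `REVIEWERS` table and the `assign_reviewers` function are all hypothetical, and a fixed lookup table replaces whatever logic the real solution uses to infer usage context.

```python
from collections import defaultdict

# Assumption: a fixed table maps each usage context to a responsible reviewer.
REVIEWERS = {"marketing": "marketing-lead", "billing": "finance-lead"}
# Assumption: anything without a known context goes to a fallback reviewer.
DEFAULT_REVIEWER = "data-protection-officer"


def assign_reviewers(elements):
    """Group data element ids into one review queue per reviewer."""
    queues = defaultdict(list)
    for element in elements:
        reviewer = REVIEWERS.get(element["context"], DEFAULT_REVIEWER)
        queues[reviewer].append(element["id"])
    return dict(queues)


# Hypothetical data elements taken from a privacy data inventory.
elements = [
    {"id": "e1", "context": "marketing"},
    {"id": "e2", "context": "billing"},
    {"id": "e3", "context": "hr"},
]
queues = assign_reviewers(elements)
for reviewer, ids in queues.items():
    print(reviewer, ids)
```

Each reviewer then sees only the elements they are qualified to annotate, and anything with an unrecognized context falls back to a default owner rather than being dropped.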

And they do it with a big help from AI. Algorithms highlight each instance on screen and—drawing context from an underlying knowledge base—proactively prompt the reviewers with data-driven recommendations for what the legal basis should be. For the enterprise as a whole, what’s going on is an ongoing, collaborative training process all directed at getting better and more efficient at GDPR data governance. That’s the essence of AI.

Lower costs and better governance

When it comes to GDPR, the question isn’t whether to comply (fines can reach €20 million or 4 percent of global annual revenue, whichever is higher) but how, and at what cost. The value proposition for our UDIE solution is that it can reduce the time and labor cost of GDPR compliance by more than 90 percent. It serves as an example of how AI can make huge, almost inconceivably difficult tasks manageable and cost-effective.
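As a quick worked example of the stakes: under the GDPR's upper tier of administrative fines, the ceiling is the greater of the two figures. The sketch below computes that ceiling only; the function name `max_gdpr_fine` and the sample revenues are hypothetical, and actual fines are set case by case by regulators.

```python
FLAT_CAP_EUR = 20_000_000   # fixed ceiling in euros
REVENUE_SHARE_PERCENT = 4   # percent of annual revenue


def max_gdpr_fine(annual_revenue_eur):
    """Return the upper-tier fine ceiling: the greater of the two caps."""
    revenue_cap = annual_revenue_eur * REVENUE_SHARE_PERCENT // 100
    return max(FLAT_CAP_EUR, revenue_cap)


# Hypothetical companies with 100 million and 1 billion euros in annual revenue.
small = max_gdpr_fine(100_000_000)
large = max_gdpr_fine(1_000_000_000)
print(small)  # 20000000: the flat cap applies, since 4 percent is only 4 million
print(large)  # 40000000: 4 percent of revenue exceeds the flat cap
```

For smaller companies the flat €20 million figure is the binding ceiling; once annual revenue passes €500 million, the percentage takes over.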

But there’s also a strategic, forward-looking angle to the solution. The fact that our AI algorithms are continuously improving makes them not only better at the monitoring side of GDPR, but also able to help companies enforce their data privacy policies. In that sense, AI becomes an ideal platform for improving data governance strategies going forward.

Watch Karl-Oskar Brännström discuss how AI can make GDPR compliance more efficient: