New legislation in the Netherlands requires government organizations to proactively publish all the information they hold that is in the public interest, after first redacting the personal data in every document.
To equip organizations to achieve compliance at speed and scale, Migrato teamed up with the IBM Build Lab to augment its natural language processing (NLP) capabilities with IBM® Watson® Natural Language Processing Library for Embed. The new solution allows Migrato to process unstructured data faster and more accurately—empowering it to help its clients rise to the new compliance challenges.
Based in the Netherlands, Migrato specializes in the processing of unstructured content. Using its custom-developed Migrato Intelligent Content Classifier (MICC) tool, the company helps organizations turn unstructured data into actionable business insights.
Oscar Dubbeldam, CEO of Migrato, takes up the story: “Migrato was originally founded to analyze content and migrate content from A to B. For most of our clients, the point of origin is typically a network share, while the destination is some kind of regulated environment—for example, an enterprise content management platform.”
Dubbeldam continues: “Our clients set out on these migration projects for many reasons. These range from decommissioning a costly or end-of-life network share or adopting modern cloud platforms such as Microsoft 365 to meeting regulatory requirements such as the General Data Protection Regulation [GDPR].”
Using its Intelligent Content Classifier software suite, Migrato has helped many organizations to complete their migration projects quickly and efficiently. With new and more stringent regulatory requirements from the Dutch government on the horizon, the company saw an opportunity to further enhance its capabilities.
“Recently, the government of the Netherlands introduced Wetgeving Open Overheid [Open Government Legislation], which requires government bodies to proactively publish all documents that are in the public interest,” explains Dubbeldam. “To comply with the new law, government organizations must first ensure that they remove, anonymize or pseudonymize any personally identifiable information from these documents.”
Migrato was confident that it already had many of the core capabilities required to help government organizations comply with the new legislation. However, the company’s existing NLP engine lacked the performance and scalability to address the increased volume and complexity of document processing workloads.
“Our goal is to process hundreds of thousands of documents automatically and be absolutely certain that the data we have extracted does not contain GDPR-sensitive data such as personal names, bank details and/or social security numbers,” adds Dubbeldam. “As the next step, we looked for a way to deepen our understanding of the content and context of documents.”
By co-creating with IBM Build Lab experts, Migrato rapidly developed a working prototype.
By leveraging embeddable AI, Migrato slashed time to market for its new solution.
Migrato is ready to help government bodies in the Netherlands to address new regulatory requirements.
Following a chance meeting with IBM at an industry event, Migrato quickly realized that IBM Watson offered all the capabilities it needed to address the requirements of the new Open Government Legislation.
“When I first met with IBM, I was intrigued by their focus on process mining and data science,” recalls Dubbeldam. “After taking part in a full-day workshop with IBM, I was convinced that IBM Watson Natural Language Processing Library for Embed was exactly the solution we needed.”
IBM Watson NLP Library for Embed is a containerized library that infuses powerful natural language AI into existing solutions. Built on a combination of open source and IBM Research® NLP algorithms, the solution empowers Migrato to accurately detect and redact sensitive personal information from raw text data.
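To make the idea of detecting and redacting personal data in raw text concrete, here is a minimal rule-based sketch in Python. It is not the IBM Watson NLP Library for Embed API and not Migrato's implementation. The `PATTERNS` table and the `redact` function are hypothetical names chosen for illustration, and the patterns assume a nine-digit Dutch citizen service number and the Dutch IBAN layout.

```python
import re

# Hypothetical patterns for illustration only. Real PII detection has to
# cover names, addresses and context, which simple regexes cannot do.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # Assumed Dutch IBAN layout: NL, 2 check digits, 4 letters, 10 digits.
    "IBAN": re.compile(r"\bNL\d{2}[A-Z]{4}\d{10}\b"),
    # Assumed citizen service number: a standalone run of 9 digits.
    "BSN": re.compile(r"\b\d{9}\b"),
}


def redact(text: str) -> str:
    """Replace every pattern match with a bracketed label such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Mail jan@example.nl or pay NL91ABNA0417164300, BSN 123456782."
print(redact(sample))
```

A rule-based pass like this only catches values with a fixed shape. Personal names, which the legislation also covers, have no such shape, which is where a trained NLP model becomes necessary.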
Working with experts from the IBM Build Lab, Migrato rapidly turned its proof of concept into a full-fledged solution.
Dubbeldam comments: “After co-creating with the IBM Build Lab team for just a couple of hours, we’d completed around 70% of the solution development work—it’s a very effective partnership. The hands-on guidance we received from IBM was extremely valuable, particularly around best practices for embedding specific libraries into our Intelligent Content Classifier application.”
Through its partnership with IBM, Migrato has achieved its goal of optimizing its NLP capabilities. With IBM Watson NLP Library for Embed, the company can process unstructured data with even greater speed and accuracy.
“It only took us around five days to get our new solution into production, which was a great experience,” says Dubbeldam. “In the past, our keyword analyses were based on statistical analysis of all the words used in a document, but in many cases, the most common words in a document aren’t the most important ones. With IBM Watson NLP Library for Embed, we can leverage AI to perform topic keyword extraction, helping us drill down into the meaning of unstructured data.”
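The limitation of frequency-based keyword analysis can be seen in a few lines of Python. This is a sketch of the older, purely statistical approach only, not of Watson's topic keyword extraction. The `top_words` function, its `stopwords` parameter and the sample sentence are hypothetical and exist only for this example.

```python
import re
from collections import Counter


def top_words(text, n=3, stopwords=frozenset()):
    """Return the n most frequent words, optionally skipping stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return [word for word, _ in counts.most_common(n)]


text = "The council approved the permit. The permit covers the harbour."

# Raw counts put a function word first.
print(top_words(text, n=1))

# A hand-made stopword list helps, but still ranks by count, not meaning.
print(top_words(text, n=1, stopwords={"the"}))
```

Filtering stopwords improves the ranking, yet the result still reflects how often a word appears rather than what the document is about, which is the gap Dubbeldam describes.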
With enhanced NLP capabilities powered by IBM Watson, Migrato is ready to help government and for-profit organizations across the Netherlands rise to the challenge of Open Government Legislation.
Dubbeldam elaborates: “Today, the big opportunity for Migrato is to work with government bodies with a proactive publication duty, but we also see that our solution will have wider significance beyond the public sector. Banks and insurance companies also need to identify which of the digital documents in their possession are privacy sensitive—and the Migrato solution with embedded IBM AI technology can help these organizations, too.”
He adds: “We see that taking the time to fully understand the data you hold has the potential to deliver big benefits in the long run. These include reducing the regulatory and reputational risks of a data breach as well as reduced storage costs, higher operational efficiency, and improved employee productivity.”
Today, Migrato is continuing its work with IBM to help make its new solution even more performant, scalable and repeatable—enabling it to rapidly configure the solution to meet the specific requirements of more than 340 municipalities across the Netherlands.
At the same time, Migrato is partnering with IBM Build Lab to develop value-added solutions based on IBM watsonx™. An AI and data platform, IBM watsonx empowers organizations to deploy machine learning models with ease, scale analytics and AI workloads rapidly, and deliver full transparency and explainability for AI data.
“We are very excited by the potential use cases for IBM watsonx,” says Dubbeldam. “For example, suppose we use our AI solution to help a client remove duplicate documents from their repositories, cutting the total number of documents from one million to just 500,000. With IBM watsonx, we can make those documents fully searchable using natural language queries—helping employees leverage the data more effectively than ever.”
Dubbeldam concludes: “Without a doubt, our partnership with IBM will deliver massive value for government and non-government organizations across the Netherlands. As we continue our work with the IBM Build Lab, we’re looking forward to what the future holds.”
Founded in 2014 and headquartered in Strijen in the Netherlands, Migrato helps organizations to analyze, inventory, cleanse and migrate their unstructured content. Using custom-built solutions, the company transforms unstructured data into actionable insights that add value to employees and business processes.
