June 22, 2017 | Written by: David Jenness
For years now, we’ve all been quoting the same PricewaterhouseCoopers study that says that 80% of data is unstructured. I never read the original study and you probably didn’t either, but you’ll be hard-pressed to find anyone who disagrees with it. More importantly, everyone agrees that unstructured data is important. It contains customer information, company knowledge, winning sales narratives, best practices and conversations between subject matter experts.
In short, it’s content.
However, the difficulty of accessing content has made it largely unavailable for today’s data strategies: applying analytics, gaining insight and making better decisions. Now we have a new tool that uses artificial intelligence, machine learning and natural language analysis to crack open data previously hidden inside content for a wide variety of uses.
At IBM, we call it cognitive capture.
Traditional capture was already getting pretty good at finding and extracting data from documents. I joined Datacap in 1998, and for the next 12 years, I saw some pretty amazing development, using templates, “fingerprints,” and zones and rules to locate and grab data. But template-based data extraction relies on knowing the type of document first, building a template and set of rules for it, and then tweaking them until they locate and extract the data. This strategy works for documents that have predictable layouts, such as invoices, tax returns, shipping documents, applications, and medical claims.
But it wasn’t good at free-form layouts, like contracts, correspondence, memos, and the many other unstructured documents that make up the bulk of business data. When Datacap joined IBM, we suddenly had access to a variety of artificial intelligence tools. And then things got interesting.
Cognitive capture doesn’t use templates. Instead, text analytics, natural language understanding, pair matching and other strategies extract data and determine the document type from the data and layout they find. This strategy allows businesses to dramatically widen their use of data extraction from documents and gives us a more powerful tool to bring to bear on that 80% of data that is unstructured, since the majority of it is free-form documents.
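To make the contrast with templates concrete, here is a toy sketch of the pair-matching idea: look for label/value pairs wherever they appear in the text, then guess the document type from what was found. It is only an illustration under stated assumptions, not how IBM Datacap works internally. The label patterns, field names and function names are all hypothetical, and a real system would lean on trained language models rather than a few regular expressions.

```python
import re

# Hypothetical label patterns, for illustration only.
LABELS = {
    "invoice_number": r"invoice\s*(?:no\.?|number|#)",
    "total": r"(?:total|amount\s+due)",
    "effective_date": r"effective\s+date",
}

# A value is a run of word characters and common punctuation that ends
# on a word character, so trailing sentence punctuation is left out.
VALUE = r"\s*[:\-]?\s*([\w$.,/-]*\w)"


def extract_pairs(text):
    """Find label/value pairs anywhere in the text, regardless of position."""
    found = {}
    for field, label in LABELS.items():
        match = re.search(label + VALUE, text, re.IGNORECASE)
        if match:
            found[field] = match.group(1)
    return found


def guess_document_type(fields):
    """Infer a document type from which fields were found (toy rule)."""
    if "invoice_number" in fields and "total" in fields:
        return "invoice"
    if "effective_date" in fields:
        return "contract"
    return "unknown"


letter = (
    "Dear Sir, thank you for your order.\n"
    "Invoice no. 4471 is attached. The amount due: $1,250.00 by Friday."
)
fields = extract_pairs(letter)
doc_type = guess_document_type(fields)
```

Notice that nothing in the sketch depends on where a field sits on the page, which is the point of dropping zones and templates: the same function handles a letter, a memo or a form as long as the label and its value appear near each other.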
IBM, AIIM and Doculabs just recorded a webinar on the topic that introduces IBM Datacap Insight Edition and examines the elements of a successful capture program, including how to apply cognitive capture within one.
View this recording and learn how you can intelligently fuel critical business processes using capture.