How to integrate custom-built annotators into your data pipeline

Key Points:

  • Use annotation models generated with Watson Knowledge Studio to create a custom Watson Discovery Service configuration pipeline.
  • With business logic built into the ground truth, annotations carry meaning pertaining to a business scenario.
  • Apply document-enhancing capabilities and extract information from industry-specific or scientific domains.


Clients are looking to IBM Watson® to satisfy a wide variety of cognitive use cases spanning multiple domains. Extracting insights from these often-complex domains of knowledge, however, can be a challenge given the intricate nature of industry-specific data. For example, we often work with clients trying to discover trends, relations, and patterns within a company’s financial reports. Analysis of these reports requires deep subject-matter expertise, which can be challenging to find and scale across an organization. By deploying Watson Knowledge Studio models to Watson Discovery Service, users can apply the document-enhancing capabilities of Discovery to extract information from industry or scientific domains.

  • Generate a simpler enrichment structure within Discovery Service to enhance query results.
  • Perform targeted relationship extraction, in which relations are surfaced and combined to reveal trends, patterns, and actionable insight.

Let’s explore how it works, using an investment banker persona to address questions around risks and issues across financial companies.

Our user is Henry, an investment banking analyst tasked with compiling pitch books, which are marketing presentations filled with useful investment considerations. Henry fills his with considerations extrapolated from financial documents such as SEC 10K reports.

While analyzing these documents, Henry looks for insights based on three business considerations.

  1. Threat – Which factors create a specific risk, and is there a warning or indication of potential harm?
  2. Entity at Risk – Which part of the company is put at risk by the threat?
  3. Relationship – What is the relation between the threat and the entity at risk, and how is the entity at risk affected?
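Henry’s three considerations can be pictured as a small data model. The following is only a minimal sketch, not how Watson Knowledge Studio stores a type system: the class names, field names, and sample values are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    """A factor that creates a specific risk (hypothetical type name)."""
    text: str


@dataclass(frozen=True)
class EntityAtRisk:
    """The part of the company put at risk by a threat (hypothetical type name)."""
    text: str


@dataclass(frozen=True)
class RiskRelation:
    """Links a threat to an entity at risk and records how it is affected."""
    threat: Threat
    entity: EntityAtRisk
    effect: str

    def summary(self) -> str:
        # Render the relation as a short, readable line for a pitch book note.
        return f"{self.threat.text} -> {self.effect} -> {self.entity.text}"


# Illustrative values only; they do not come from a real report.
example = RiskRelation(
    threat=Threat("supply chain disruption"),
    entity=EntityAtRisk("manufacturing operations"),
    effect="delays",
)
print(example.summary())
```

Keeping the relation as its own record, rather than a free-text note, is what later makes it possible to compare the same kind of risk across many reports.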

Because these documents are dense and lengthy, reading and annotating them is time-consuming and frustrating. Plus, documents are generated at a faster pace than Henry can review them.

Here’s where Watson Discovery Service comes into play. All the SEC 10K reports are collected and ingested into Discovery Service, and default enrichments are applied to extract meaningful insight. Now Henry can quickly and easily explore and discover the entities, keywords, and concepts present in the documents.

From this view, Henry finds relevant entities and keywords within the risk reports, but he can’t determine how they’re related. Enter Watson Knowledge Studio: an annotation model is trained on a subset of the SEC 10K reports to expand and augment the annotations. In this case, the annotation model followed Henry’s business logic, and when applied to a report, the annotated text looked something like this:

“Information system failures, network disruptions and breaches in data security that could have a material adverse effect on our ability to conduct our business.”
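To make the idea of annotating that sentence concrete, here is a rough sketch that records labeled mentions as character spans. The labels (THREAT, RELATION, ENTITY_AT_RISK) and the choice of which phrase gets which label are assumptions made for illustration; a trained Watson Knowledge Studio model would produce its own mentions, and this plain string search is only a stand-in for it.

```python
from typing import List, NamedTuple, Tuple


class Mention(NamedTuple):
    label: str   # hypothetical annotation type
    text: str    # the covered phrase
    begin: int   # start offset in the sentence (inclusive)
    end: int     # end offset in the sentence (exclusive)


SENTENCE = (
    "Information system failures, network disruptions and breaches in "
    "data security that could have a material adverse effect on our "
    "ability to conduct our business."
)

# Assumed labeling of the sentence, for illustration only.
LABELED_PHRASES: List[Tuple[str, str]] = [
    ("THREAT", "Information system failures"),
    ("THREAT", "network disruptions"),
    ("THREAT", "breaches in data security"),
    ("RELATION", "material adverse effect"),
    ("ENTITY_AT_RISK", "ability to conduct our business"),
]


def find_mentions(sentence: str, labeled_phrases: List[Tuple[str, str]]) -> List[Mention]:
    """Locate each labeled phrase in the sentence and record its span."""
    mentions = []
    for label, phrase in labeled_phrases:
        begin = sentence.find(phrase)
        if begin == -1:
            continue  # phrase not present; skip rather than guess a span
        mentions.append(Mention(label, phrase, begin, begin + len(phrase)))
    return mentions


MENTIONS = find_mentions(SENTENCE, LABELED_PHRASES)
for m in MENTIONS:
    print(f"{m.label:<15} [{m.begin}:{m.end}] {m.text}")
```

Storing offsets alongside the label means each annotation can be traced back to the exact text in the report that supports it.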

The model was deployed and integrated within the Discovery Service enrichment pipeline. Henry can now explore the SEC content and discover insights aligned to his business considerations. A simple query to the Discovery Service SEC collection returns the enhanced results.
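Once relations are attached to each document, they can be combined across reports to surface patterns. The sketch below assumes a simplified, hypothetical result shape (a list of dictionaries with "company" and "relations" keys); it is not the actual Discovery Service response format, and the sample data is invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical, simplified query results; not the real response schema.
SAMPLE_RESULTS: List[Dict] = [
    {
        "company": "Company A",
        "relations": [
            {"threat": "network disruptions", "entity_at_risk": "business operations"},
            {"threat": "data security breaches", "entity_at_risk": "reputation"},
        ],
    },
    {
        "company": "Company B",
        "relations": [
            {"threat": "network disruptions", "entity_at_risk": "business operations"},
        ],
    },
]


def count_threat_entity_pairs(results: List[Dict]) -> Counter:
    """Count how many documents mention each (threat, entity at risk) pair."""
    counts: Counter = Counter()
    for doc in results:
        # Use a set so a pair repeated within one document is counted once.
        pairs = {
            (rel["threat"], rel["entity_at_risk"])
            for rel in doc.get("relations", [])
        }
        counts.update(pairs)
    return counts


PAIR_COUNTS = count_threat_entity_pairs(SAMPLE_RESULTS)
TOP_PAIR: Tuple[Tuple[str, str], int] = PAIR_COUNTS.most_common(1)[0]
print(TOP_PAIR)
```

Counting per document rather than per mention keeps one verbose report from dominating the trend.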

As a bonus, Henry can rapidly perform the same analysis for other companies he’s interested in and complete his pitch book in record time.

Discovery Service makes it possible to rapidly build cognitive, cloud-based exploration applications that unlock actionable insights hidden in unstructured data — including your own proprietary data, as well as public and third-party data. You can test the service with our free, 30-day Bluemix trial to see how it can help you extract value from your data.



