How to integrate custom-built annotators into your data pipeline

Key Points:

  • Use annotation models generated with Watson Knowledge Studio to create a custom Watson Discovery Service configuration pipeline.
  • With business logic built into the ground truth, annotations carry meaning pertaining to a business scenario.
  • Apply document-enhancing capabilities and extract information from industry-specific or scientific domains.


Clients are looking to IBM Watson® to satisfy a wide variety of cognitive use cases spanning multiple domains. Extracting insights from these often-complex domains of knowledge, however, can be a challenge given the intricate nature of industry-specific data. For example, we often work with clients trying to discover trends, relations, and patterns within a company’s financial reports. Analyzing these reports requires deep subject-matter expertise, which can be hard to find and to scale across an organization. By deploying Watson Knowledge Studio models to Watson Discovery Service, users can apply the document-enhancing capabilities of Discovery to extract information from industry-specific or scientific domains.

Doing so lets users:

  • Generate a simpler enrichment structure within Discovery Service to enhance query results.
  • Perform targeted relationship extraction, where relations are surfaced and combined to reveal trends, patterns, and actionable insight.

Let’s explore how it works, using an investment banker persona to address questions around risks and issues across financial companies.

Our user is Henry, an investment banking analyst tasked with compiling pitch books, which are marketing presentations filled with useful investment considerations. Henry fills his with considerations extrapolated from financial documents like SEC 10K reports.

While analyzing these documents, Henry looks for insights based on three business considerations.

  1. Threat – Which factors create a specific risk, and is there a warning or indication of potential harm?
  2. Entity at Risk – Which part of the company is put at risk by the threat?
  3. Relationship – How are the threat and the entity at risk related, and how is the entity at risk affected?
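
One way to make these three considerations concrete is to treat them as a small type system for annotations. The sketch below is illustrative only: the class names, field names, labels, and sample phrases are assumptions, not the type system actually built in Watson Knowledge Studio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A span of text labeled with one of Henry's entity types."""
    text: str
    entity_type: str  # hypothetical labels: "THREAT" or "ENTITY_AT_RISK"


@dataclass(frozen=True)
class RiskRelation:
    """Links a threat to the part of the company it puts at risk."""
    threat: Mention
    entity_at_risk: Mention
    relation_type: str  # hypothetical label, e.g. "affects"

    def __post_init__(self):
        # Guard against mentions passed in the wrong slot.
        if self.threat.entity_type != "THREAT":
            raise ValueError("threat must be a THREAT mention")
        if self.entity_at_risk.entity_type != "ENTITY_AT_RISK":
            raise ValueError("entity_at_risk must be an ENTITY_AT_RISK mention")

    def describe(self) -> str:
        return f"{self.threat.text} --{self.relation_type}--> {self.entity_at_risk.text}"


# Sample phrases chosen for illustration only.
example = RiskRelation(
    threat=Mention("network disruptions", "THREAT"),
    entity_at_risk=Mention("ability to conduct our business", "ENTITY_AT_RISK"),
    relation_type="affects",
)
print(example.describe())
```

Keeping the relation as its own record, rather than a field on either mention, mirrors the third consideration: the link between threat and entity at risk is itself something Henry wants to find and query.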

Because these documents are dense and lengthy, reading and annotating them is time-consuming and frustrating. Plus, documents are generated at a faster pace than Henry can review them.

Here’s where Watson Discovery Service comes into play. All the SEC 10K reports are collected and ingested into Discovery Service, and default enrichments are applied to extract meaningful insight. Now Henry can quickly and easily explore and discover the entities, keywords, and concepts present in the documents.

From this view, Henry finds relevant entities and keywords within risk reports, but can’t determine how they’re related. Enter Watson Knowledge Studio, trained using a subset of the SEC 10K reports to expand and augment annotations. In this case, the annotation model followed Henry’s business logic and, when applied to a report, annotated sentences like this one:

“Information system failures, network disruptions and breaches in data security that could have a material adverse effect on our ability to conduct our business.”

The model was deployed and integrated within the Discovery Service enrichment pipeline. Henry can now explore the SEC content and discover insight aligned to his business considerations. A simple query to the Discovery Service SEC collection shows the enhanced results.
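
To give a sense of what integrating a model within the enrichment pipeline can involve, here is a rough sketch of a configuration fragment that points an enrichment at a custom model. The key names, the enrichment name, the configuration name, and the model ID placeholder are all assumptions for illustration; the exact schema depends on the service version and should be checked against the Discovery Service configuration reference.

```python
import json


def build_custom_enrichment(model_id: str, source_field: str = "text") -> dict:
    """Build one enrichment entry that applies a custom model to a field.

    Every key and value here is assumed for illustration.
    """
    if not model_id:
        raise ValueError("model_id is required")
    return {
        "source_field": source_field,
        "destination_field": f"enriched_{source_field}",
        "enrichment": "natural_language_understanding",  # assumed enrichment name
        "options": {
            "features": {
                # The same custom model drives entity and relation extraction.
                "entities": {"model": model_id},
                "relations": {"model": model_id},
            }
        },
    }


config = {
    "name": "sec-10k-risk-config",  # hypothetical configuration name
    "enrichments": [build_custom_enrichment("YOUR_WKS_MODEL_ID")],
}
print(json.dumps(config, indent=2))
```

Building the fragment in code rather than by hand makes it easy to reuse the same model ID across several source fields or collections.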

As a bonus, Henry can rapidly perform the same analysis for other companies he’s interested in and complete his pitch book in record time.
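
Once relations are available for several companies, combining them into a trend is mostly a counting exercise. The sketch below assumes a simplified, hand-written record shape (`company`, `relations`, `threat`, `entity_at_risk`); these names and the sample data are hypothetical and do not reflect the actual format of Discovery Service query results.

```python
from collections import Counter, defaultdict

# Hypothetical records standing in for enriched query results.
results = [
    {"company": "Company A", "relations": [
        {"threat": "network disruptions", "entity_at_risk": "business operations"},
        {"threat": "data breaches", "entity_at_risk": "customer data"},
    ]},
    {"company": "Company B", "relations": [
        {"threat": "data breaches", "entity_at_risk": "customer data"},
    ]},
]


def count_companies_per_threat(docs):
    """Count how many distinct companies report each threat."""
    companies = defaultdict(set)
    for doc in docs:
        for rel in doc["relations"]:
            companies[rel["threat"]].add(doc["company"])
    return Counter({threat: len(names) for threat, names in companies.items()})


counts = count_companies_per_threat(results)
for threat, n in counts.most_common():
    print(f"{threat}: reported by {n} companies")
```

Counting distinct companies, rather than raw mentions, keeps one verbose report from dominating the picture.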

Discovery Service makes it possible to rapidly build cognitive, cloud-based exploration applications that unlock actionable insights hidden in unstructured data — including your own proprietary data, as well as public and third-party data. You can test the service with our free, 30-day Bluemix trial to see how it can help you extract value from your data.



Technical Program Manager - Watson Implementations

