Watson Discovery Features

Embedded NLP

Watson Discovery ships with natural language processing built in. By selecting a few options, you can extract sentiment, entities, concepts, semantic roles, and more.

Relevancy Training

You can train the Discovery service to improve the relevance of query results for your corpus. When you provide Discovery with training data, the service uses Watson machine-learning techniques to find signals in your content and questions. The service then reorders the results so that the most relevant ones appear at the top. As you add more training data, the service orders the results it returns more accurately.

Domain Customization

Discovery can be taught to understand terms that are specific to your domain. Simply use a custom machine learning model built with Watson Knowledge Studio to customize the enrichment of your corpus.

Passage Retrieval

Passage retrieval lets you return to your user the specific passages that meet the search criteria, as defined chunks of text.
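As a rough illustration of the idea only, the sketch below cuts a text into fixed-size word windows and returns the windows that contain the most query terms. The function name `best_passages`, the window size, and the term-count scoring are all assumptions made for this example; they are not how Discovery selects or scores passages.

```python
def best_passages(text, query, size=12, count=2):
    """Return up to `count` word windows that contain the most query terms."""
    words = text.split()
    terms = {t.lower() for t in query.split()}
    scored = []
    for start in range(0, len(words), size):
        chunk = words[start:start + size]
        # Count the words in this window that match a query term,
        # ignoring case and trailing punctuation.
        score = sum(1 for w in chunk if w.lower().strip(".,;:!?") in terms)
        if score:
            scored.append((score, start, " ".join(chunk)))
    # Highest score first; ties keep document order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [passage for _, _, passage in scored[:count]]


sample = ("alpha beta gamma delta refund policy applies here "
          "epsilon zeta eta theta")
print(best_passages(sample, "refund policy", size=4, count=1))
```

The point of the sketch is the shape of the result: short chunks of text rather than whole documents.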

Document Similarity

Document Similarity lets you provide a known document ID. Discovery analyzes that document, identifies its most important aspects, and finds textually similar documents in the collection.
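To make "textually similar" concrete, here is a minimal sketch that ranks the other documents in a small in-memory collection by cosine similarity over word counts. This is a generic textbook technique used for illustration; the names `cosine` and `similar_documents` are hypothetical, and nothing here reflects how Discovery analyzes a document internally.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def similar_documents(doc_id, collection, count=3):
    """Return the IDs of the documents most similar to `doc_id`."""
    vectors = {i: Counter(text.lower().split()) for i, text in collection.items()}
    target = vectors[doc_id]
    scores = [(cosine(target, v), i) for i, v in vectors.items() if i != doc_id]
    # Most similar first; ties are broken by document ID.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:count]]


collection = {
    "a": "solar panel installation guide",
    "b": "guide to solar panel maintenance",
    "c": "quarterly earnings report",
}
print(similar_documents("a", collection, count=1))
```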

Anomaly Detection

Anomaly detection locates unusual data points within a time series and flags them for further review. Example uses include identifying news alerts, detecting events, and finding trends.
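The sketch below shows one simple way to flag unusual points in a time series: compute each point's z-score and report the points that exceed a threshold. The z-score method, the function name `flag_anomalies`, the threshold, and the sample counts are assumptions chosen for illustration; the text above does not say which algorithm Discovery uses.

```python
from statistics import mean, stdev


def flag_anomalies(counts, threshold=2.0):
    """Return the indices of points whose z-score exceeds `threshold`."""
    if len(counts) < 2:
        return []
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return []  # A flat series has no outliers.
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]


# Hypothetical daily article counts with one spike on day 5.
daily_mentions = [12, 14, 11, 13, 12, 95, 13, 12]
print(flag_anomalies(daily_mentions))
```

A spike like the one on day 5 is the kind of point that would be flagged for further review.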

Discovery News

Discovery News is a pre-enriched dataset of news articles that is included with Discovery and updated continuously.

Knowledge Graph (beta)

Knowledge graphs can function as the "knowledge hub" for your company and can be used for enterprise search, summarization, recommendation engines, and other decision-making processes. Knowledge Graph automatically creates custom knowledge graphs from unstructured data by extracting and disambiguating entities and relationships, enriching the relationships using algorithmic techniques, and ranking the results using relevance algorithms.

Element Classification

Element Classification makes it possible to rapidly parse governing documents in order to convert, identify, and classify elements of importance. Using state-of-the-art natural language processing, it extracts the party (who the element refers to), nature (type of element), and category (specific class) from the elements of a document.

Document Deduplication (beta)

If you are querying the Discovery News collection, or your private data collection contains multiple identical (or near-identical) documents, you can exclude them from your query results using document deduplication.
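One common way to detect identical or near-identical documents is to compare their sets of word shingles using Jaccard similarity. The sketch below applies that idea to a list of results; it is a stand-in technique for illustration, not Discovery's deduplication method, and the names `shingles`, `jaccard`, and `deduplicate`, the `text` field, and the 0.8 threshold are all assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping `size`-word sequences in `text`."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(results, threshold=0.8):
    """Keep the first of each group of results whose similarity reaches `threshold`."""
    kept, kept_shingles = [], []
    for doc in results:
        s = shingles(doc["text"])
        if all(jaccard(s, k) < threshold for k in kept_shingles):
            kept.append(doc)
            kept_shingles.append(s)
    return kept


results = [
    {"id": "a", "text": "Markets rallied on Tuesday after the rate decision"},
    {"id": "b", "text": "Markets rallied on Tuesday after the rate decision"},
    {"id": "c", "text": "The city council approved a new transit budget"},
]
print([doc["id"] for doc in deduplicate(results)])
```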

Visual Insights (experimental)

Visual Insights is an experimental feature that you can use to visually explore the connections that Discovery identifies through its understanding of semantic elements, relations, concepts, and more. It lets you learn more about your collections before you use Discovery to create queries. You can then integrate those queries into a new application or an existing solution to point users to the information they need.

Document Segmentation

You can split your Word, PDF, and HTML documents into segments based on HTML heading tags. Once split, each segment is a separate document that is enriched and indexed separately. Because queries return these segments as separate documents, you can use document segmentation to perform aggregations on individual segments of a document and to perform relevancy training on segments instead of whole documents, which improves result reranking.
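The splitting step can be sketched for HTML input as follows. The example starts a new segment at each chosen heading tag; the function name `split_on_headings` and the default choice of `h1` and `h2` are assumptions for illustration, and the regular expression is a simplification that would not handle every real-world HTML document.

```python
import re


def split_on_headings(html, tags=("h1", "h2")):
    """Split `html` into segments, each starting at one of the heading `tags`."""
    # Zero-width lookahead: split just before each opening heading tag
    # without consuming it (requires Python 3.7 or later).
    alternatives = "|".join(re.escape(tag) for tag in tags)
    pattern = re.compile(r"(?=<(?:%s)[\s>])" % alternatives, re.IGNORECASE)
    parts = (part.strip() for part in pattern.split(html))
    return [part for part in parts if part]


page = ("<h1>Intro</h1><p>a</p>"
        "<h2>Setup</h2><p>b</p>"
        "<h3>Note</h3><p>c</p>")
for segment in split_on_headings(page):
    print(segment)
```

With the default tags, the `h3` section stays inside the `h2` segment that precedes it. Each returned segment would then be treated as its own document.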

Query Expansion

You can expand the scope of a query beyond exact matches by uploading a list of query expansion terms using the Discovery API. For example, you can expand a query for "car" to include "automobile" and "motor vehicle". Query expansion terms are usually synonyms, antonyms, or typical misspellings for common terms.
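The effect of an expansion list can be sketched locally. In the example below, the in-memory dictionary `EXPANSIONS` and the function `expand_query` are hypothetical and exist only to show the behavior; they do not represent the upload format or any call of the Discovery API.

```python
# Hypothetical expansion list: each term maps to the terms it expands to.
EXPANSIONS = {
    "car": ["automobile", "motor vehicle"],
}


def expand_query(query, expansions):
    """Return the query terms followed by any expansion terms for each."""
    terms = []
    for word in query.lower().split():
        terms.append(word)
        terms.extend(expansions.get(word, []))
    return terms


print(expand_query("used car", EXPANSIONS))
```

A search that matches any of the returned terms would then find documents that mention "automobile" even though the user typed "car".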