Scenario: Curating content for IBM Watson

Watson™ drives value with expertise. By strengthening the partnership between people and technology, Watson enhances, scales, and accelerates the development of expertise, resulting in more people making better decisions across the organization. Developing expertise requires trusted information for training Watson. Trusted information is relevant, dynamic, defensible, and transparent. IBM® Watson Curator implements the content curation methodology for cognitive computing that is known as the Watson Foundations Method. This methodology ensures that content that is prepared by Watson Curator is consumable by, and relevant for, Watson.

This example scenario uses all four of the roles that IBM Watson Curator supports.
Team Lead
Coordinates and monitors the curation effort for Watson projects; initiates new collections; provides final approval of collections; can restart the curation process on one or more documents; can create ad hoc tasks for the Content Curator.
Content Curator
Oversees the entire content curation process; directs Data Experts to collect content; reviews collected content; directs Domain Experts to classify and enhance content; maintains existing collections. Can optionally contribute content to collections, classify content, add topics to content, and exclude content from the curated document sets in collections.
Data Expert
Finds and collects content that is pertinent to the project needs as defined by the collection purpose, scope, and criteria. Can optionally classify content, add topics to content, and exclude content from the curated document sets in collections.
Domain Expert
Provides quality control of the content in collections; assesses the validity, accuracy, and value of the content that is contributed to collections; classifies content, adds topics to content, and excludes content from the curated document sets in collections based on extensive industry knowledge and expertise.
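The division of responsibilities among the four roles can be sketched as a simple role-to-action mapping. This is an illustrative model only, not the Watson Curator API: the `Role` and `Action` names, the `ALLOWED` table, and the `can` helper are all hypothetical, and the action set is a simplification of the role descriptions above.

```python
from enum import Enum, auto


class Role(Enum):
    TEAM_LEAD = auto()
    CONTENT_CURATOR = auto()
    DATA_EXPERT = auto()
    DOMAIN_EXPERT = auto()


class Action(Enum):
    # Hypothetical action names, condensed from the role descriptions.
    CREATE_COLLECTION = auto()
    APPROVE_COLLECTION = auto()
    RESTART_CURATION = auto()
    CREATE_TASK = auto()
    COLLECT_CONTENT = auto()
    REVIEW_CONTENT = auto()
    CLASSIFY_CONTENT = auto()
    ADD_TOPICS = auto()
    EXCLUDE_CONTENT = auto()


# Classifying, adding topics, and excluding content are available to
# every role except the Team Lead, per the descriptions above.
ENHANCE = frozenset(
    {Action.CLASSIFY_CONTENT, Action.ADD_TOPICS, Action.EXCLUDE_CONTENT}
)

ALLOWED = {
    Role.TEAM_LEAD: frozenset({
        Action.CREATE_COLLECTION,
        Action.APPROVE_COLLECTION,
        Action.RESTART_CURATION,
        Action.CREATE_TASK,
    }),
    Role.CONTENT_CURATOR: frozenset({
        Action.CREATE_TASK,
        Action.REVIEW_CONTENT,
        Action.COLLECT_CONTENT,
    }) | ENHANCE,
    Role.DATA_EXPERT: frozenset({Action.COLLECT_CONTENT}) | ENHANCE,
    Role.DOMAIN_EXPERT: frozenset({Action.REVIEW_CONTENT}) | ENHANCE,
}


def can(role: Role, action: Action) -> bool:
    """Return True if the sketch's table lets this role perform the action."""
    return action in ALLOWED[role]
```

In this sketch only the Team Lead can give final approval, which mirrors the scenario that follows.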

A Watson project owner, who requires information about cancer, logs in to IBM Watson Curator as a Team Lead. He looks through the list of available collections and can filter the collections by name, classification type, or topic. He can also see a summary of each collection and view attributes of, and statistics for, each collection.
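The filtering step can be pictured with a small sketch. It assumes a hypothetical `Collection` record and `filter_collections` helper; neither is part of Watson Curator, and the sample names, classification types, and topics are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Collection:
    # Hypothetical record; the real product exposes more attributes.
    name: str
    classification_type: str
    topics: Set[str] = field(default_factory=set)


def filter_collections(
    collections: List[Collection],
    name: Optional[str] = None,
    classification_type: Optional[str] = None,
    topic: Optional[str] = None,
) -> List[Collection]:
    """Return the collections that match every filter that was supplied."""
    matches = []
    for c in collections:
        # Name filter: case-insensitive substring match.
        if name is not None and name.lower() not in c.name.lower():
            continue
        # Classification type filter: exact match.
        if (classification_type is not None
                and c.classification_type != classification_type):
            continue
        # Topic filter: the collection must carry the topic.
        if topic is not None and topic not in c.topics:
            continue
        matches.append(c)
    return matches


# Made-up sample data for illustration only.
catalog = [
    Collection("Cardiology guidelines", "clinical", {"heart disease"}),
    Collection("Diabetes drug trials", "research", {"diabetes", "drug trials"}),
]

cancer_matches = filter_collections(catalog, topic="cancer")
```

Here `cancer_matches` is empty, which corresponds to the situation in the next paragraph where no existing collection meets the project needs.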

If the Team Lead does not find a collection that meets the information needs of the project, he creates one or more collections. For each collection, the Team Lead documents the purpose, scope, and criteria for the domain of information that the collection represents. For example, the Team Lead creates collections for causes and detection of cancer, cancer research and drug trials, and currently available treatments for cancer.

When the Content Curator receives the project requirements from the Team Lead, the Content Curator creates collection tasks for the Data Experts, who help to determine whether information exists that can be collected electronically. The Content Curator might also attach to each collection any documents that are readily available on the local system and that meet the requirements of the use case.

The Data Experts use IBM StoredIQ Data Workbench to match the collection criteria with existing information sources and, if necessary, request that additional information sources be set up. After filtering the data, the Data Experts collaborate with the Content Curator to refine the content. When the Content Curator is satisfied with the quality of the collected content, the Data Experts copy the content to the repository. The Data Experts can also use IBM StoredIQ Policy Manager to copy incremental document sets to the repository at regular, specified intervals.

Within each collection, the Content Curator reviews the collected content. During the review, he excludes some documents from, and includes other documents in, the curated document set of the collection. For most documents, the Content Curator creates a document curation task for the Domain Experts, who provide industry expertise and quality management of the content.

In response to the document curation tasks, the Domain Experts review the content to determine its relevance to the collections. If they decide to keep the content, they classify it and add topics to it. They can exclude from the curated document set any content that is irrelevant, duplicated, or out of date.

After all of the content in the collections is curated, the Content Curator reviews the content again and marks the collections complete. Completing the collections triggers notifications to the Team Lead, who is responsible for verifying that the collections meet the project requirements. Only after the Team Lead approves the collections is the content made available to Watson.
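The hand-offs in this scenario can be summarized as a small state machine. The sketch below is an interpretation of the narrative, not product behavior: the `Stage` names, the `TRANSITIONS` table, and the `advance` and `available_to_watson` helpers are hypothetical, and optional paths such as restarting curation on individual documents are left out.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names derived from the scenario text.
    DEFINED = "purpose, scope, and criteria documented"
    COLLECTING = "Data Experts collecting content"
    CURATOR_REVIEW = "Content Curator reviewing collected content"
    DOMAIN_CURATION = "Domain Experts classifying and enhancing"
    FINAL_REVIEW = "Content Curator reviewing curated content"
    COMPLETE = "marked complete, awaiting approval"
    APPROVED = "approved by Team Lead"


# (current stage, acting role) -> next stage
TRANSITIONS = {
    (Stage.DEFINED, "Content Curator"): Stage.COLLECTING,
    (Stage.COLLECTING, "Data Expert"): Stage.CURATOR_REVIEW,
    (Stage.CURATOR_REVIEW, "Content Curator"): Stage.DOMAIN_CURATION,
    (Stage.DOMAIN_CURATION, "Domain Expert"): Stage.FINAL_REVIEW,
    (Stage.FINAL_REVIEW, "Content Curator"): Stage.COMPLETE,
    (Stage.COMPLETE, "Team Lead"): Stage.APPROVED,
}


def advance(stage: Stage, role: str) -> Stage:
    """Move a collection to its next stage if this role may do so."""
    try:
        return TRANSITIONS[(stage, role)]
    except KeyError:
        raise PermissionError(
            f"{role} cannot advance a collection from {stage.name}"
        ) from None


def available_to_watson(stage: Stage) -> bool:
    """Content reaches Watson only after Team Lead approval."""
    return stage is Stage.APPROVED


# Walk one collection through the scenario in order.
stage = Stage.DEFINED
for role in ["Content Curator", "Data Expert", "Content Curator",
             "Domain Expert", "Content Curator", "Team Lead"]:
    stage = advance(stage, role)
```

Keeping the approval step as the only transition into `APPROVED` reflects the rule that a completed collection is not yet visible to Watson.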

The following diagram illustrates this scenario and shows the typical workflow of a content curation team.
Figure description: The diagram shows the workflow of a typical content curation team, as described in the preceding text.