Approaches to using Retrieve & Rank

The IBM Watson™ Retrieve and Rank service combines two information retrieval components in a single service: the power of Apache Solr and a sophisticated machine learning capability. This combination provides more relevant results by automatically reranking search results with machine learning algorithms. The purpose of the Retrieve and Rank service is to help you find documents that are more relevant than those you might get with standard information retrieval techniques. The ranker model takes advantage of rich data in your documents to provide more relevant answers to queries.

Simple case

The Retrieve & Rank service finds potential answers, generates numerical features to score those answers, and rank-orders the answers by using a machine learning model that turns the numerical features associated with each answer into a confidence score. In the simplified approach, you submit the user's query and the Retrieve & Rank service returns a ranked set of results. This combines the searching, scoring, and ranking into a single API call. For many applications, this single call gives results that are good enough.
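As a rough illustration of the single-call approach, the sketch below only builds the request URL for one combined search-and-rank query. The base URL, the `fcselect` handler path, the parameter names, and the helper `build_search_and_rank_url` are assumptions made for this example, not details taken from this page; check them against the service's API reference before relying on them.

```python
from urllib.parse import urlencode

# Placeholder host: an assumption for illustration, not a real service address.
BASE_URL = "https://example.invalid/retrieve-and-rank/api/v1"


def build_search_and_rank_url(cluster_id, collection, ranker_id, query, rows=10):
    """Build the URL for one request that searches and reranks in a single call.

    The path layout and the "fcselect" handler name are assumed here.
    """
    params = urlencode({
        "ranker_id": ranker_id,  # which trained ranker reorders the results
        "q": query,              # the user's query, passed through unchanged
        "rows": rows,            # how many ranked results to return
        "wt": "json",            # ask for a JSON response
    })
    return f"{BASE_URL}/solr_clusters/{cluster_id}/solr/{collection}/fcselect?{params}"


if __name__ == "__main__":
    print(build_search_and_rank_url("my_cluster", "my_docs", "my_ranker", "what is lift"))
```

The application would send a GET request to this URL with its service credentials and read the ranked documents from the response, with no scoring logic of its own.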

You can see a demo of this case here.

Advanced case

The Retrieve & Rank Application uses an advanced approach. In some cases you may want to exploit knowledge that is specific to your corpus, domain, or application. In the advanced approach, the search, scoring, and ranking stages are split into separate steps. This gives you finer-grained control over each stage, at the cost of added complexity, and that additional control is what allows you to exploit knowledge specific to your domain. Instead of making a single REST call to have Retrieve & Rank search and return results, the application makes an initial call to the "retrieve" portion of the service to obtain candidate answers. The application then scores each answer by using customized answer scorers that are unique to the use case. In our application's case, these scorers make use of detailed metadata associated with our corpus. The scorers take into account information specific to the corpus that the Retrieve & Rank service would be unable to account for by itself. Once the answers are scored, they are sent to the "rank" portion of the Retrieve & Rank service. The custom answer scores are then considered during the final ranking of answers, and the service returns customized results.
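The middle step of the advanced flow, scoring candidates with custom scorers before ranking, might look like the following minimal sketch. It is not the application's actual code: the two scorers, the candidate fields (`id`, `title`, `author`, `features`), and the CSV layout with an `answer_id` column followed by feature columns are all hypothetical, and the calls to the "retrieve" and "rank" portions of the service are left out.

```python
import csv
import io


def keyword_overlap(query, candidate):
    """Hypothetical scorer: fraction of query words that appear in the title."""
    query_words = set(query.lower().split())
    title_words = set(candidate["title"].lower().split())
    return len(query_words & title_words) / len(query_words) if query_words else 0.0


def has_author(query, candidate):
    """Hypothetical metadata scorer: 1.0 if the document lists an author."""
    return 1.0 if candidate.get("author") else 0.0


SCORERS = [keyword_overlap, has_author]


def score_candidates(query, candidates):
    """Append one custom score per scorer to each candidate's existing features."""
    rows = []
    for cand in candidates:
        custom = [scorer(query, cand) for scorer in SCORERS]
        rows.append([cand["id"]] + list(cand["features"]) + custom)
    return rows


def to_csv(rows, n_base_features):
    """Serialize scored rows as CSV text (assumed input format for the rank step)."""
    header = (["answer_id"]
              + [f"f{i}" for i in range(n_base_features)]
              + [scorer.__name__ for scorer in SCORERS])
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Stand-in for candidates returned by the "retrieve" call.
    candidates = [
        {"id": "d1", "title": "Engine lift basics", "author": "A", "features": [0.5, 0.2]},
        {"id": "d2", "title": "Cooking", "features": [0.1, 0.9]},
    ]
    print(to_csv(score_candidates("engine lift", candidates), n_base_features=2))
```

Keeping each scorer as a plain function of the query and one candidate makes it easy to add or remove corpus-specific signals. A ranker would need to be trained on the same feature columns for the extra scores to affect the final order.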

For more information on how we use custom scorers to achieve better results, see our blog.

Note that sending information to the ranking segment incurs an additional usage cost. For details, see Retrieve & Rank's Pricing page.