The IBM Watson™ Retrieve and Rank service combines two information retrieval components in a single service: Apache Solr search and a sophisticated machine learning capability. This combination provides users with more relevant results by automatically reranking the search results with machine learning algorithms.
The following image shows the process of creating and using the Retrieve and Rank service:
For a step-by-step overview of using the Retrieve and Rank service, see the Tutorial page.
The purpose of the Retrieve and Rank service is to help you find documents that are more relevant than those that you might get with standard information retrieval techniques.
Retrieve: Retrieve is based on Apache Solr. It supports nearly all of the default Solr APIs and improves error handling and resiliency. You can start your solution by first using only the Retrieve features, and then add the ranking component.
Rank: The rank component (ranker) creates a machine-learning model trained on your data. You call the ranker in your runtime queries so that this model boosts the relevance of your results, including for queries that the model has not previously seen.
The service combines several proprietary machine learning techniques, which are known as learning-to-rank algorithms. During training, the ranker uses your training data to choose the best combination of algorithms.
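The retrieve-then-rank flow described above can be sketched in Python. This is an illustration only and does not use the service's API: `CORPUS` stands in for a Solr collection, `retrieve` stands in for a Solr query, and `toy_ranker_score` is a hypothetical stand-in for a trained ranker model. The documents and the scoring rule are invented for the example.

```python
from typing import Callable, Dict, List

Document = Dict[str, str]

# Hypothetical in-memory corpus standing in for a Solr collection.
CORPUS: List[Document] = [
    {"id": "1", "title": "Resetting a router",
     "body": "Hold the reset button to reset the router."},
    {"id": "2", "title": "Router warranty",
     "body": "The router warranty lasts two years."},
    {"id": "3", "title": "Printer setup",
     "body": "Connect the printer to the network."},
]


def retrieve(query: str, corpus: List[Document], rows: int = 10) -> List[Document]:
    """Stage 1 (Retrieve): return documents whose body shares a term with the query.

    A real deployment would send the query to Solr instead.
    """
    terms = set(query.lower().split())
    hits = [
        doc for doc in corpus
        if terms & set(doc["body"].lower().replace(".", "").split())
    ]
    return hits[:rows]


def toy_ranker_score(query: str, doc: Document) -> float:
    """Hypothetical stand-in for a trained ranker: fraction of query terms in the title."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    title_terms = set(doc["title"].lower().split())
    return sum(term in title_terms for term in terms) / len(terms)


def rerank(
    query: str,
    candidates: List[Document],
    score: Callable[[str, Document], float],
) -> List[Document]:
    """Stage 2 (Rank): reorder the retrieved candidates by model score, best first."""
    return sorted(candidates, key=lambda doc: score(query, doc), reverse=True)


if __name__ == "__main__":
    query = "router warranty"
    candidates = retrieve(query, CORPUS)
    for doc in rerank(query, candidates, toy_ranker_score):
        print(doc["id"], doc["title"])
```

The retrieval stage only decides which documents are candidates. The ranking stage then reorders those candidates, which is why you can start with retrieval alone and add ranking later without changing how documents are indexed.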
The core users of the Retrieve and Rank service are customer-facing professionals, such as support staff, contact center agents, and field technicians. These users must find relevant results quickly from large numbers of documents.
The Retrieve and Rank service can return more relevant results than standard information retrieval techniques.
As previously mentioned, the Retrieve part of the Retrieve and Rank service is based on Apache Solr. When you use Retrieve and Rank, you need to be knowledgeable about Solr as well as about the specifics of the Retrieve and Rank service. For example, when Solr passes an error code to the service, the service passes it to your application without modification so that standard Solr clients can correctly parse and act upon it. You therefore need to know about Solr error codes when writing error-handling routines in your Retrieve and Rank application.
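Because Solr errors reach your application unmodified, an error-handling routine can inspect the HTTP status code and the Solr error body directly. The following is a minimal sketch, assuming that the response was requested as JSON (`wt=json`) and follows Solr's usual error layout, in which an `error` object carries `msg` and `code` fields. The `SolrError` class and the `check_solr_response` function are hypothetical names for this example and are not part of Solr or of the Retrieve and Rank service.

```python
import json
from typing import Any, Dict


class SolrError(Exception):
    """Hypothetical exception type carrying the Solr status code and message."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"Solr error {code}: {message}")
        self.code = code
        self.message = message

    @property
    def is_client_error(self) -> bool:
        # 4xx codes usually indicate a problem with the request,
        # such as a malformed query or an unknown field.
        return 400 <= self.code < 500


def check_solr_response(http_status: int, body: str) -> Dict[str, Any]:
    """Return the parsed JSON body, or raise SolrError if the response reports a failure."""
    try:
        payload = json.loads(body)
    except ValueError:
        # The body was not valid JSON (for example, an HTML error page).
        payload = {}
    if not isinstance(payload, dict):
        payload = {}

    error = payload.get("error")
    if http_status >= 400 or error:
        details = error if isinstance(error, dict) else {}
        # Fall back to the HTTP status when the body carries no error code.
        code = int(details.get("code", http_status))
        message = str(details.get("msg", "no error message in response body"))
        raise SolrError(code, message)
    return payload
```

A caller might retry when `is_client_error` is false (a server-side problem) and report the message to the user when it is true, since resending the same malformed request would fail again.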
To learn about Solr, see the Apache Solr Resources page. The page provides links to resources including a quick start tutorial; documentation and books; and forums for discussion, advice, and problems.
A sample application based on the Retrieve and Rank service is discussed at Retrieve and Rank application overview and available to try at Professor Languo. You can also download the application's source files from GitHub. The application enables you to ask questions about standard English usage, with answers processed by a back-end Retrieve and Rank instance from data provided by the Stack Exchange English Language and Usage forum.
Warning: The free cluster that you can create to test the Retrieve and Rank demo application is a single reduced-size unit with a maximum of 50 MB of disk storage and no guaranteed amount of RAM. The free cluster is meant only to run the demonstration application or small proof-of-concept applications. It cannot be used as a unit in a paid Retrieve and Rank cluster, and it is not intended for production use. See Sizing your Retrieve and Rank cluster for more information.
We are always looking to improve and learn from your experience with our services. Find answers to your questions about Watson in our developer communities.