Managing Result Retrieval

The sources that you create in Watson™ Explorer Engine should be capable of retrieving an arbitrary number of results from the search engine that you are connecting to. Larger numbers of results generally increase the probability of retrieving all of the information that a user is looking for, and also improve the quality of the clusters created by Watson Explorer Engine.

The default value for the total number of results requested for each query in a Watson Explorer Engine search application is 200. This value is defined in the Metasearch tab of the Watson Explorer Engine project in which you are creating a particular source, and therefore applies to all sources within that project. Similarly, the number of results displayed on each page of a Watson Explorer Engine search application is a project-wide setting, found in the Main Display Options portion of the project's Overview tab, so that all sources in your search application behave identically.

Because different search engines behave differently, the Watson Explorer Engine administration tool provides a number of per-source settings that you can customize. Each source's detailed form definition settings (visible by selecting the form and clicking edit) include a variety of pagination and result selection parameters. These make it easy to identify any parameters or numeric values that your source can send to the target search engine in order to:

  • define the number of results that a search engine returns on each page of results
  • request additional sets of results
  • retrieve a specific page or set of results
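
As a rough illustration of how these three kinds of parameters fit together, the following Python sketch builds the sequence of request URLs needed to collect a given total number of results from an external search engine. The host name and the parameter names (q, num, start) are hypothetical placeholders, not settings defined by Watson Explorer Engine; substitute whatever names the target search engine actually uses.

```python
from urllib.parse import urlencode


def page_requests(base_url, query, total=200, per_page=20,
                  query_param="q", size_param="num", start_param="start"):
    """Return the URLs needed to collect `total` results, `per_page` at a time.

    The parameter names are assumptions for illustration only; real search
    engines use their own names (and some count pages rather than offsets).
    """
    urls = []
    for start in range(0, total, per_page):
        params = {
            query_param: query,
            # Results per page; the last request asks only for the remainder.
            size_param: min(per_page, total - start),
            # Zero-based offset of the first result in this set.
            start_param: start,
        }
        urls.append(f"{base_url}?{urlencode(params)}")
    return urls


# Hypothetical engine that accepts 50 results per request.
urls = page_requests("https://search.example.com/search", "watson",
                     total=200, per_page=50)
for url in urls:
    print(url)
```

With the default total of 200 results and a page size of 50, the sketch produces four requests, one for each set of results.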
Tip: The initial page of search results returned by a search engine lets you see how many results are displayed per page, but it usually uses default values for the other parameters discussed in this section. When first creating a source, it is a good idea to step through a few pages of results from the external search engine used in your source, in order to identify the CGI parameters that are set in the URL as you change pages, and how they are used.
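
One way to make that comparison less error-prone is to diff the query strings of the URLs you collect while paging. The Python sketch below is a minimal helper for that purpose; the example URLs and their parameter names are invented for illustration and do not correspond to any particular search engine.

```python
from urllib.parse import urlsplit, parse_qs


def changed_params(url_a, url_b):
    """Return {name: (values_in_a, values_in_b)} for query parameters that differ.

    A value of None means the parameter is absent from that URL.
    """
    a = parse_qs(urlsplit(url_a).query, keep_blank_values=True)
    b = parse_qs(urlsplit(url_b).query, keep_blank_values=True)
    diff = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            diff[name] = (a.get(name), b.get(name))
    return diff


# Hypothetical URLs copied from the browser while stepping through results.
page1 = "https://search.example.com/search?q=watson&num=10"
page2 = "https://search.example.com/search?q=watson&num=10&start=10"
page3 = "https://search.example.com/search?q=watson&num=10&start=20"

print(changed_params(page1, page2))
print(changed_params(page2, page3))
```

In this made-up case only the start parameter changes between pages, which suggests that it selects the set of results and that it is absent (defaulted) on the initial page.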