Enable OmniFind to retrieve WebSphere Portal Document Manager content

Configuring Information Integrator Content Edition streaming

Introduction

IBM OmniFind Enterprise Edition is a search product that provides scalable, secure, and high-quality enterprise search. OmniFind Enterprise Edition crawls documents from a supported data source, parses the content to extract text for linguistic analysis, and then builds indexes from that information so that users can search the documents. When search results are returned in a browser, users can click any result in the list to retrieve the corresponding document for viewing.

This tutorial focuses on WebSphere Portal Document Manager as the data source and describes the steps you can perform to enable documents in the search results to be streamed to the user's browser. Throughout the tutorial, references to the Information Integrator Content Edition Portal Document Manager connector refer to the direct mode of the connector.

Figure 1 illustrates a typical flow for retrieving documents.


Figure 1. Document retrieval scenario without streaming
Figure illustrates flow of content from Portal Server to OmniFind Server and back

The user enters search terms in the OmniFind search application and clicks the search button to initiate a search. The search results are retrieved and displayed as links in the results section of the search application. The user can then click any of the links to retrieve the actual document. Retrieving the document is an eight-step process that starts in the search application:

  1. The search application submits a document retrieve request to the search server.
  2. The search server goes to the Information Integrator Content Edition Portal Document Manager connector to retrieve the requested document.
  3. The connector contacts the Information Integrator Content Edition Portal Document Manager proxy that is deployed in the portal server.
  4. The proxy goes to the Portal Document Manager that is running in the WebSphere Portal server to retrieve the document.
  5. The Portal Document Manager retrieves the document and hands it back to the Information Integrator Content Edition proxy.
  6. The proxy sends it back to the Information Integrator Content Edition connector.
  7. The connector gives it to the search server.
  8. The search server sends the document to the search application.

This is a lengthy process. It is also inefficient for large documents, because the document must pass through each of these components on its way back before it finally arrives at the search application for the user to view.
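To make the cost concrete, the eight steps can be modeled as a list of hops. This is only an illustrative sketch: the component names come from the steps above, but the data structure, the `document_bytes_transferred` helper, and the 50 MB document size are assumptions made for the example and are not part of OmniFind or Information Integrator Content Edition.

```python
# Illustrative model only. Component names follow the tutorial;
# the tuple layout and the helper function are hypothetical.
NON_STREAMING_HOPS = [
    # (step, sender, receiver, payload)
    (1, "search application", "search server", "request"),
    (2, "search server", "connector", "request"),
    (3, "connector", "proxy", "request"),
    (4, "proxy", "Portal Document Manager", "request"),
    (5, "Portal Document Manager", "proxy", "document"),
    (6, "proxy", "connector", "document"),
    (7, "connector", "search server", "document"),
    (8, "search server", "search application", "document"),
]


def document_bytes_transferred(hops, document_size):
    """Sum the bytes moved by the hops that carry the document itself."""
    document_hops = [hop for hop in hops if hop[3] == "document"]
    return len(document_hops) * document_size


size = 50 * 1024 * 1024  # assumed 50 MB document
total = document_bytes_transferred(NON_STREAMING_HOPS, size)
print(total // (1024 * 1024), "MB moved between components")  # 200 MB moved between components
```

In this model the document body is handed over four times, so the data moved between components is four times the document size.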

The Information Integrator Content Edition product has a component called HTTP Access Servlet. This servlet is a default implementation for serving native content over HTTP and provides a fast and direct path to native content. To use this function, the OmniFind Enterprise Search application implemented a streaming option for content retrieval that goes through supported Information Integrator Content Edition connectors. When this streaming option is enabled in the search application, document retrieval completely bypasses the OmniFind Enterprise Edition search server. This shortens the document retrieval process from eight steps to six.

The flow diagram for this document retrieval process is illustrated in Figure 2.


Figure 2. Document retrieval scenario with streaming enabled
Figure illustrates flow of content from Portal Server to OmniFind Server and back, with streaming

When streaming is enabled, the steps are as follows:

  1. The OmniFind Enterprise Edition search application contacts the Information Integrator Content Edition Portal Document Manager connector to retrieve the document.
  2. The connector contacts the Information Integrator Content Edition HTTP Access Servlet in the WebSphere Portal server (instead of the Portal Document Manager proxy).
  3. The HTTP Access Servlet goes to the Portal Document Manager to get the document.
  4. The Portal Document Manager retrieves the document and hands it back to the HTTP Access Servlet.
  5. The HTTP Access Servlet streams it back to the Information Integrator Content Edition connector on the OmniFind Enterprise Edition server over HTTP.
  6. The connector sends the document to the search application.
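The difference between the two retrieval paths can be summarized in a small sketch. The component names are taken from the two step lists in this tutorial. The list layout and the `steps` helper are hypothetical and exist only to compare the paths; they do not correspond to any product API.

```python
# Illustrative only. Each list names the components on one retrieval path,
# ordered from the search application to the document source.
NON_STREAMING = [
    "search application",
    "search server",
    "connector",
    "proxy",
    "Portal Document Manager",
]
STREAMING = [
    "search application",
    "connector",
    "HTTP Access Servlet",
    "Portal Document Manager",
]


def steps(components):
    """The request travels down the chain and the document travels back,
    so every link between two neighboring components is crossed twice."""
    return 2 * (len(components) - 1)


# Components on the non-streaming path that the streaming path does not use.
bypassed = [c for c in NON_STREAMING if c not in STREAMING]

print(steps(NON_STREAMING), steps(STREAMING))  # 8 6
print(bypassed)  # ['search server', 'proxy']
```

The sketch reflects the two changes described above: the search server drops out of the path entirely, and the HTTP Access Servlet takes the place of the Portal Document Manager proxy.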

The next two sections of the tutorial provide step-by-step instructions on how to configure this streaming option in the OmniFind Enterprise Edition search application.


