Introduction to Lotus Extended Search

Lauren Wendel, Product Manager, Lotus
Lauren Wendel is product manager for Lotus Extended Search, Lotus Discovery Server, and expertise technologies. Previously, Lauren worked with the Lotus Enterprise Integration team for five years, overseeing the initial releases of Lotus Enterprise Integrator, DECS, ERP Connectors, and the Connector API Toolkit. She has also worked as a developer consultant within the Lotus Business Partner program, and previously, within the 1-2-3 engineering team. Lauren has also managed systems planning at Wells Fargo Bank, Citibank, Duke University School for Executive Education, and Grant Thornton Ltd. She enjoys running the "occasional" marathon and sings with a community chorus.

Summary:  This article presents the architecture and components of Extended Search, discusses features of Release 3.5, and previews the upcoming IBM Lotus Extended Search Release 3.7 coming in mid-2002.

Date:  01 May 2002
Level:  Introductory


Do your users need to search Domino databases in multiple Domino domains, Web and content sources, data indexed by a Microsoft Index Server system, and enterprise data from RDBMS databases? If so, then you should get to know IBM Lotus Extended Search.

IBM Lotus Extended Search is a scalable, server-based technology that searches many content and data sources in parallel and returns query results to a Notes or Web application. Previously known as Domino Extended Search, the product was initially released in 1997 to extend the search capability of Domino applications to multiple back-end data sources, such as DB2, Oracle, and Microsoft SQL Server databases. Since then, we've added many new features as well as support for Web application servers and more operating systems.

In this article, we'll present the technical architecture and components of Extended Search, give an overview of the latest feature additions in Release 3.5, and preview the upcoming IBM Lotus Extended Search Release 3.7 available mid-year 2002.

Extended Search Overview

Extended Search uses a system of interlocking components for managing all aspects of client search requests including:

  • Query string to source language interpretation
  • Data security verification
  • Number of search results desired
  • Results ranking
  • Server scalability and load balancing to support a growing number of users in organizations that require search capabilities

The following diagram shows how client-side and server-side components relate to one another in the Extended Search architecture. In the upper-left corner, you see the two components of user search access: a Web browser and a Lotus Notes client. The remaining components (Brokers, Agents, and so on) constitute the server components of the Extended Search system. We'll look at each component in detail to help you understand how Extended Search works.


Figure 1. Lotus Extended Search architecture

User search "Client-side" components

You can access Extended Search from a Notes or Web application. To help you get started, Extended Search provides the following sample query applications that you can customize.

  • The Web application contains enhanced Hypertext Markup Language (HTML) files with expanded search functionality provided by Extended Search. The sample Web application allows users to search across multiple data sources simultaneously using a Web browser. When you perform a search with a Web browser, Extended Search channels search requests and options, as specified in the HTML file, through an Extended Search server-side Servlet described in the next section.
  • The Notes application shows the same set of Extended Search query options that are available in the Web application sample. Users submit queries, view results, and fetch documents through their Notes client. When you perform a search with a Notes application, Extended Search invokes a set of Extended Search LotusScript Extensions (LSXs) to execute search requests and options on the client. With the Extended Search Notes sample application, the LSXs are installed automatically the first time the user accesses the application. The server can save search results in the same Notes application, so the results can be re-used or shared with other users.

Most organizations customize these sample applications or extend their existing applications to include federated search capabilities. For example, Extended Search supplies a set of custom search tags that you can embed in HTML files. Users can enter query strings, identify various sources to be searched, and set a variety of search and retrieval options. The server saves search results in an HTML file in locations identified by Extended Search replacement variables. The Extended Search tags, which you can embed anywhere within an HTML file, do not interfere with surrounding HTML tags. So a Web designer can control nearly all presentational aspects and the subsequent look and feel of the search application.

Server-side components

Several server-side components work together to execute a search request from a Notes or Web application.

Reflector and Servlet

The Reflector and Servlet dispatch search requests to the Broker on behalf of the Extended Search LSXs or a Web browser. The Reflector and Servlet exist on the machine that hosts your Web application server. Extended Search supports several Web servers, including Lotus Domino, IBM WebSphere, Microsoft IIS, and IBM HTTP Server (Apache).

Extended Search uses the Hypertext Transfer Protocol (HTTP) to invoke the Reflector and the Servlet. The Reflector is a Common Gateway Interface (CGI) program used to service Notes client requests, and the Servlet is a Java program used to service Web browser requests. This two-tiered approach allows the client application to use an industry-standard protocol (HTTP), so the client can use many Web server-related features like SOCKS, proxies, and Secure Sockets Layer (SSL) technology.

The Reflector and Servlet can communicate with an Extended Search server that resides on a machine other than the Web server. This provides added flexibility when planning resource capacity and performance because it's possible to dispatch search requests from primary to secondary Brokers running on different servers.

Broker

The Broker is a central part of the Extended Search server. It receives search requests and acts as a resource coordinator to dispatch the query and return results to the calling client application. To support performance and scalability, a given Extended Search domain can contain multiple Brokers. Each Broker has its own network address and typically performs the following tasks:

  • Validates the user
  • Obtains a list of the data sources available to the user defined within the search application
  • Submits queries to target data sources managed by Extended Search Agents and Links
  • Aggregates and sorts search results to a single set of results, or hitlist, according to the definitions of the search application
  • Retrieves source documents (note that for Web-based searches, the Web browser uses the URL returned in the hitlist to retrieve the document)
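The Broker tasks listed above can be read as a small pipeline. The following Python sketch is illustrative only: `Hit`, `broker_search`, `users`, `app_sources`, and `agents` are hypothetical names chosen for this example and are not part of the Extended Search API. It queries the sources one after another for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Hit:
    source: str       # data source that produced the hit
    title: str
    relevance: float  # higher means more relevant


# Hypothetical agent signature: takes a query string, returns a hitlist.
AgentFn = Callable[[str], List[Hit]]


def broker_search(user: str, query: str, max_hits: int,
                  users: Set[str],
                  app_sources: Dict[str, List[str]],
                  agents: Dict[str, AgentFn]) -> List[Hit]:
    # 1. Validate the user.
    if user not in users:
        raise PermissionError(f"unknown user: {user}")
    # 2. Obtain the data sources available to this user.
    sources = app_sources.get(user, [])
    # 3. Submit the query to the agent behind each source.
    hits: List[Hit] = []
    for name in sources:
        hits.extend(agents[name](query))
    # 4. Aggregate and sort into a single hitlist, best first.
    hits.sort(key=lambda h: h.relevance, reverse=True)
    return hits[:max_hits]
```

In this sketch the agents are plain callables, so a test can substitute stub functions for real data sources.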

Configuration database

The Broker obtains information about the resources it manages from the Extended Search configuration database (CDB). The CDB, managed by IBM DB2 Universal Database, contains information about data sources that have been defined for search. It stores network addresses, field-level usage controls, and the names of the application search programs used to access, search, and fetch documents from each data source. The CDB may also include specific configuration options per data source. For example, it can identify specific security DLLs to invoke for data source search authentication, or "user exit."
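To make the kind of information held per data source more concrete, here is a minimal sketch of one configuration record. The class and field names (`SourceConfig`, `link_program`, `searchable_fields`, `user_exit`) are assumptions made for illustration; they do not reflect the actual CDB schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SourceConfig:
    name: str
    network_address: str
    link_program: str                  # program used to search and fetch
    # Field-level usage control: field name -> may users search it?
    searchable_fields: Dict[str, bool] = field(default_factory=dict)
    user_exit: Optional[str] = None    # optional security routine to invoke


def searchable(cfg: SourceConfig) -> List[str]:
    """Return the field names that this configuration allows users to search."""
    return sorted(f for f, allowed in cfg.searchable_fields.items() if allowed)
```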

You can manage the CDB using the Administration applet to modify Extended Search components. The Administration applet includes search data source Discoverers. These are programs that can query a source server for available applications to search and that can automate the process of loading the CDB with default information about new data sources available on specific servers.

Operations and performance console and Monitor

When you start an Extended Search server and a Broker is launched, Extended Search displays a scrollable console window in the Administration applet. This console window initially contains startup messages and ends with a command prompt. Apart from infrequent messages displayed in the text window, the Extended Search server runs quietly, processing messages and service requests.

The Extended Search Monitor monitors server activities. After you invoke the Monitor for each Extended Search server, you can observe server activity from the Administration applet. With the Monitor, you can adjust and refresh server configurations directly without having to stop and restart the individual server or Extended Search domain, which contains servers, Brokers, Agents, and Links communicating across the network to service aspects of search operations.

The Monitor can run independently of the Broker. You can start and stop the Monitor any number of times without affecting the processing of the Extended Search server. More than one Monitor can connect to the same Extended Search server simultaneously. You can run the Monitor remotely to check the status of the Extended Search server from a location other than the host machine console.

Agents

Extended Search Agents are programs that respond to search and retrieval operations targeted toward a particular data source. When a request for a specific data source type arrives, the Agent loads the appropriate source Link and the translator modules that convert the query string to the source's syntax. The Agent then calls upon these module libraries for query translation and for connect, disconnect, search, and retrieval operations.

Agents sort the results by relevance rank, then truncate the results to the maximum number of hits specified in the original search request application. We recommend that Agents reside on the same machine as the data source, so the Agents can service search requests faster. Agents can also use a data source's remote APIs for access. More than one copy of an Agent can run on a single server to handle concurrent search and retrieval requests. You can dedicate an Agent to a single data source, to a group of sources of a particular type, or to a range of sources that have a mixture of Link requirements.
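The sort-then-truncate step described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the Agent's actual code: a hit is modeled here as a `(relevance, document_id)` tuple, and `rank_and_truncate` is a hypothetical name.

```python
import heapq
from typing import List, Tuple

# A hit is modeled as (relevance, document_id); both fields are illustrative.
Hit = Tuple[float, str]


def rank_and_truncate(hits: List[Hit], max_hits: int) -> List[Hit]:
    """Return the max_hits most relevant hits, best first."""
    if max_hits <= 0:
        return []
    # nlargest returns its result sorted from largest to smallest key.
    return heapq.nlargest(max_hits, hits, key=lambda h: h[0])
```

Using `heapq.nlargest` avoids sorting the full result set when only a small number of hits is requested.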

Data source search Links

An Extended Search Link translates a search and retrieval request sent by a Broker into the native calls of each data source's programming interface. The Link then executes the translated request against each data source. Extended Search applications translate the search request string into an internal representation called Generalized Query Language (GQL). This makes the individual query languages transparent to the user and enables Extended Search to search multiple and varied data sources in parallel.

Here's an example of how Links work: an Agent invokes an already activated Notes Link when it receives a GQL string search request for information in a Notes database. The Notes Link translates the query from the Extended Search Generalized Query Language (GQL) into the Notes native syntax. Then the Link issues the appropriate Notes Object Interface (NOI) calls to the Notes C/C++ API (FTSearch) and gets the results.
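The idea of translating one internal query into several native syntaxes can be sketched as follows. The GQL grammar is not shown in this article, so the query tree below (`Term`, `And`) is a stand-in invented for illustration, and the two output formats are simplified: a Notes-style full-text string and a parameterized SQL WHERE clause.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Term:
    field: str
    text: str


@dataclass
class And:
    parts: List["Query"]


# Hypothetical internal representation; not the real GQL.
Query = Union[Term, And]


def to_fulltext(q: Query) -> str:
    """Render as a Notes-style full-text query string (simplified)."""
    if isinstance(q, Term):
        return f'[{q.field}] CONTAINS "{q.text}"'
    return " AND ".join(f"({to_fulltext(p)})" for p in q.parts)


def to_sql_where(q: Query) -> Tuple[str, List[str]]:
    """Render as a parameterized SQL WHERE clause (simplified).

    Values are passed as parameters; field names are interpolated, so a
    real translator would check them against a list of known columns.
    """
    if isinstance(q, Term):
        return f"{q.field} LIKE ?", [f"%{q.text}%"]
    clauses: List[str] = []
    params: List[str] = []
    for p in q.parts:
        clause, sub = to_sql_where(p)
        clauses.append(f"({clause})")
        params.extend(sub)
    return " AND ".join(clauses), params
```

Because each translator walks the same tree, adding a new data source type in this sketch means adding one more rendering function rather than changing the query representation.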

Each source Link stores its search results in a hitlist. It sends the hitlist back to the Agent to aggregate the results according to the desired maximum, to sort by relevance, and to return the results to the Broker for further processing. The Broker aggregates and sorts the results with hitlists that are returned by other Agents and returns a single hitlist to the requesting user through the search application.
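When each Agent has already sorted its hitlist by relevance, the Broker-side aggregation amounts to a merge of sorted lists. The sketch below assumes that hits are `(relevance, document_id)` tuples and that relevance scores from different sources are comparable; `merge_hitlists` is a hypothetical name, not a product function.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

Hit = Tuple[float, str]  # (relevance, document_id), illustrative only


def merge_hitlists(hitlists: Iterable[List[Hit]], max_hits: int) -> List[Hit]:
    """Merge hitlists that are already sorted best-first into one hitlist."""
    merged = heapq.merge(*hitlists, key=lambda h: h[0], reverse=True)
    # Stop as soon as max_hits results have been produced.
    return list(islice(merged, max(max_hits, 0)))
```

The merge is lazy, so only as many hits as the caller asks for are pulled from the per-Agent lists.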

Extended Search Link types

Users can search and retrieve documents from repositories that include Lotus Notes 4.x and 5.x, Domino.Doc, Lotus Connectors, and Domino R5 Domain Index. They can also search 18 popular News sites and Web search engines like HotBot and Excite. Extended Search also searches file systems (including uncompressed file attachments), LDAP Directories, and other sources like output from Microsoft Index Server and Microsoft Site Server.

An e-mail content search Link searches Domino and Microsoft Exchange mail servers. You can also search for information from ODBC databases and IBM Enterprise Information Portal content and data sources.

Extended Search includes a Web crawler that traverses Web sites on a scheduled basis and outputs results in XML or HTML format. Lotus Business Partner Atlantic Decisions provides a Verity search Link.

Extended Search Link Toolkit

You can create your own Links with the Extended Search Link Toolkit. To create Web site Links, you can use Intelligent Algorithms infoGIST Toolkit to create a custom Web source definition file that searches the Web. Refer to the Extended Search documentation for more information about these toolkits.


Parallel, direct searches of source content

Extended Search does not require a centralized index. Instead, it accesses information directly using search and retrieval methods native to each data source. This design allows Extended Search to provide a distributed, heterogeneous search across many different data repositories through a single, efficient, and easy-to-use point of access.

The approach has several advantages:

  • Search results always provide current information obtained directly from the data source (unless the Web crawler has been used to extract and store source content for search on a scheduled basis)
  • Data remains within its original source database
  • Overall storage requirements are reduced and the cost of re-indexing is eliminated because the data does not have to be duplicated into a centralized index

At the same time, there are advantages to indexing content, especially in situations where a large volume of content will be aggregated for search. Extended Search supports a growing number of indexing technologies to leverage indexed data. Many Extended Search customers today search across indexed and non-indexed content and data sources from their Extended Search-enabled applications.
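The parallel, direct search described in this section can be illustrated with a short Python sketch that sends one query to several sources at the same time and collects the results per source. The function `search_all` and the per-source callables are hypothetical; they stand in for whatever native search method each data source provides.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical per-source search function: query string -> list of results.
SearchFn = Callable[[str], List[str]]


def search_all(query: str, sources: Dict[str, SearchFn]) -> Dict[str, List[str]]:
    """Query every source concurrently and collect the results per source."""
    if not sources:
        return {}
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # Submit all searches first so they run in parallel.
        futures = {name: pool.submit(fn, query) for name, fn in sources.items()}
        # Then wait for each one; an exception in a source is re-raised here.
        return {name: f.result() for name, f in futures.items()}
```

With this shape, the overall response time is governed by the slowest source rather than by the sum of all sources.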


Managing security across content sources

We designed the Extended Search architecture to provide layers of security that protect source content at several levels:

Web server user authentication
Extended Search uses the Web server environment to verify user access to the search applications. The Web server can use private and public key encryption, digital certificates, and passwords to authenticate users and to ensure that only authorized users can access search applications.

Search application level authentication
At the application level, you can specify access controls for individual search applications with the Administration applet. Not only can you restrict access to data sources, but you also have complete control over which fields a user can search and which results that user can view or retrieve. You can even set a "user exit" for a particular search application to invoke a specific security verification routine.

For example, a user exit can communicate with a secure database to confirm the user's identity, and the application can invoke this user exit for each data source searched. A user exit can also verify the user individually or as a member of a defined group, or it can return a different user ID to be used within the Extended Search domain and search application. This feature can prove useful because not all IDs are appropriate for all sources or for all content in a specific source.

Broker level authentication
You can define custom user exits at the Broker level to selectively deny or approve access to individual data sources named in the search request.

Agent level authentication
You can also define custom user exits at the Agent level. Optionally, Extended Search can invoke additional security rules during Agent processing. For example, an Agent level user exit might further validate a user identity prior to processing the search request. Before returning results, the exit might filter the content and any defined constraint values and, if necessary, delete items that fail to pass this post-processing inspection.
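As a rough illustration of the post-processing inspection described above, the following sketch models a user exit as a callback that decides, hit by hit, whether a result may be returned. The names `UserExit`, `apply_post_exit`, and `clearance_exit`, as well as the `label` field, are assumptions for this example; the real user exit interface is defined in the product documentation.

```python
from typing import Callable, Dict, List

Hit = Dict[str, str]
# Hypothetical exit signature: (user, hit) -> True to keep the hit.
UserExit = Callable[[str, Hit], bool]


def apply_post_exit(user: str, hits: List[Hit], exit_fn: UserExit) -> List[Hit]:
    """Keep only the hits that the user exit approves."""
    return [h for h in hits if exit_fn(user, h)]


def clearance_exit(user: str, hit: Hit) -> bool:
    """Example rule: only 'admin' may see hits labeled confidential."""
    return hit.get("label") != "confidential" or user == "admin"
```

Any callable with the same signature could be passed in place of `clearance_exit`, which is what makes the exit a customization point in this sketch.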

Link level authentication
Extended Search works with the individual data sources, each of which may have its own security mechanisms to permit or deny access to content. You can set security for user exit verification at the Link level of search processing.


New features of the current release

The currently available Extended Search Release 3.5 introduced many new search options, performance improvements, and platform support, including:

  • Operating system support for Microsoft Windows NT and 2000, IBM AIX, and Sun Solaris 7 and 8
  • JSP search examples of customizable Web applications
  • EJB enablement to communicate system processing requests with a configuration application
  • Integration with Lotus K-station and WebSphere Portal that includes predefined search portlets to Extended Search sources
  • Support for XML search results
  • Saved queries for shared use and re-application
  • Scheduled queries that run at specific times and that store search results
  • National Language Support (NLS-enablement) to search content and to develop applications in any language
  • Enhanced troubleshooting that includes improved error reporting and additional logging controls
  • Web crawling to retrieve Web pages from selected Web sites for subsequent indexing or searching
  • Performance enhancements


Features coming soon

Extended Search Release 3.7 will be available mid-year 2002 from Lotus and IBM WebSphere Portal 4.1. This upcoming release will include useful new features and enhancements such as:

  • Search data source configuration wizards
  • Support for IBM DB2 7.2 as the configuration data store
  • Internationalized versions of documentation and product interfaces
  • Native Microsoft SQL Server and Microsoft Access search links

In addition, the Extended Search team is currently working toward Release 4.0 of Extended Search, which will include a number of key search sources for IBM WebSphere Portal (Juru Index, Lotus Discovery Server, and Lotus QuickPlace) as well as search integration with Lotus Sametime. We have a number of search option and administration enhancements planned, and we are always interested in your feedback and in your participation in upcoming pre-release beta programs. Please communicate with us through our newly launched Extended Search forum on the Lotus Developer Domain. We also encourage you to review the many existing product information white papers and FAQs on the Extended Search page of the Lotus Web site.

