Understanding the sample search application for WebSphere Information Integrator OmniFind Edition

Organizations can use WebSphere® Information Integrator OmniFind™ Edition to develop highly customized search applications to best fit their unique needs, requirements and mix of information sources. This article provides a technical overview of the WebSphere Information Integrator OmniFind Edition sample search application. The article covers the overall design, important classes and methods, and some basic execution paths.

Dana Morris (damorris@us.ibm.com), Advisory Software Engineer, IBM, Software Group

Dana Morris is a Software Engineer in the IBM Software Group, Search Technology department. He has been actively involved with IBM search technologies for the past 5 years. Reach Dana at damorris@us.ibm.com.



30 June 2005

Introduction

Companies today are faced with an ever-growing abundance of information. The lack of proper systems to access and manage their information assets can cripple an organization. Not being able to find highly relevant information when needed translates into bad decisions, missed opportunities, wasted time, and wasted money.

WebSphere Information Integrator OmniFind Edition (WebSphere II OmniFind Edition) is designed to support enterprise scale search requirements over a wide variety of content sources. WebSphere Information Integrator’s new enterprise search server offers high-quality, scalable, secure, free-form text search. WebSphere II OmniFind Edition provides extensive capabilities for searching diverse collections of business information from a single point of access, delivering highly relevant search results with subsecond response time while scaling to millions of documents and thousands of users.

During the installation of WebSphere® Information Integrator OmniFind™ Edition (WebSphere II OmniFind Edition), a sample search web application is deployed automatically to the local WebSphere® Application Server. The primary use of this pre-deployed application is to validate collections that a WebSphere II OmniFind Edition administrator created for searching. This application also serves as a working sample for customers to use as a basis for their own search interfaces. This article provides an in-depth look at the search web application. Upon completion of the article, you should have enough information to successfully customize the search application to meet the specific needs of your enterprise.

Prerequisites

This article assumes knowledge of the following technologies:

  • Apache Jakarta Struts v1.1
  • Apache Jakarta Struts Tiles
  • Java™ Servlet Specification Level 2.3
  • JavaServer Pages (JSP)
  • IBM Search and Indexing API (SIAPI) version 1.4

For further information about these technologies, please refer to the Resources section later in this article.


Overview

Before exploring the technical details of the sample search application, it is important to understand how end users interact with the application. Figure 1 depicts what the user first sees upon loading the application in a Web browser.

Figure 1. The Search page

As you can see in Figure 1, the page consists of three primary elements:

  • Banner
  • Navigational toolbar
  • Body

The toolbar consists of seven links, with the active link highlighted:

  • Search: Opens the main page for entering queries against one or more collections.
  • Category Tree: Opens a page for searching and browsing specific categories of documents from a single collection.
  • Options: Opens a page where the user can configure options for creating queries and obtaining search results.
  • My Profile: If WebSphere Global Security is enabled, this page allows a user to enter their credentials to be submitted when querying secured collections.
  • Logoff: If WebSphere Global Security is enabled, this link allows the user to log out of the search application.
  • Help: Launches a context-sensitive help page in the DB2 Information Center.
  • About: Launches a page that contains an image with the product name and copyright information.

The Body section of the Search page (see Figure 1) contains only two items with which the user is required to interact: the query entry field and the Search button. Selecting collections is optional. If the user does not select any collections to search, the search application defaults to searching all available collections. The user can limit the scope of the query by selecting one or more specific collections to search. To submit a query, the user enters query terms and clicks the Search button. All other query and search result controls are automatically assigned default values in order to make the user experience as simple as possible.

Category Tree page

If users click the Category Tree link in the toolbar, they are taken to a page that looks very similar to the Search page (see Figure 2). The Category Tree page allows the user to search only a single collection at a time because category trees are assigned on a per-collection basis. Another difference between the Search and Category Tree pages is the presence of the tree control on the bottom left of the page. This tree allows users to browse the documents assigned to each category in the tree, similar to how users navigate folders in Windows® Explorer. If the user selects a category from the tree, the documents that belong to that category are displayed on the right side of the page.

Figure 2. The Category Tree page

Options page

If users want to change how the query is executed or how the results are returned, they click on the Options link in the toolbar (see Figure 3). This page consists of several options which the user can change. After making changes to the page, the user clicks the Apply button. The user's options are validated and then saved. If the user then clicks on the Search link again, the previous search will be re-executed with the new query options.

Figure 3. The Options page

In-depth analysis

The search application was designed with the Apache Jakarta Struts framework. The Struts framework follows the Model-View-Controller (MVC) design pattern. In order to understand the search application's operation, you need to first understand the Struts framework. The following sections explore the design of the WebSphere II OmniFind Edition sample search application in the context of the Struts framework.

Model

The Model layer contains System state or Business logic beans. In our case, the business logic is maintained directly in the SIAPI Query and ResultSet objects. Since the purpose of the search application is to allow end users to interact with the WebSphere II OmniFind Edition search server, the SIAPI classes are used directly in the application to serve as the beans. There are some additional beans and classes that help with user interaction in the page. This article focuses on only one of them, the ISearchHelper interface.

The ISearchHelper interface defines a set of required search page operations to be implemented. The out-of-the-box search application contains an implementation of the ISearchHelper interface that supports SIAPI. By providing an interface class, we can support other variations of the search operations required by the search application. For example, we might want to enhance this search application to search against another search product that does not support the SIAPI interface.
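
To illustrate the idea, here is a minimal, self-contained sketch of how an interface lets the application swap search back ends. The names SearchHelperSketch, InMemorySearchHelper, and SearchHelperDemo, and the method signatures, are hypothetical simplifications made up for this example. The real ISearchHelper interface works with SIAPI types such as Query and ResultSet and defines more operations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for ISearchHelper; not the product interface.
interface SearchHelperSketch {
    // Returns the labels of the collections that can be searched.
    List<String> getCollectionLabels();

    // Runs a query and returns the titles of the matching documents.
    List<String> search(String queryText);
}

// One possible back end: a toy in-memory index used only for illustration.
class InMemorySearchHelper implements SearchHelperSketch {
    private final Map<String, String> documents = new LinkedHashMap<String, String>();

    InMemorySearchHelper() {
        documents.put("Install guide", "how to install the search server");
        documents.put("API reference", "search and indexing api reference");
    }

    public List<String> getCollectionLabels() {
        List<String> labels = new ArrayList<String>();
        labels.add("Sample collection");
        return labels;
    }

    public List<String> search(String queryText) {
        List<String> titles = new ArrayList<String>();
        String needle = queryText.toLowerCase();
        // Match documents whose text contains the query as a substring.
        for (Map.Entry<String, String> entry : documents.entrySet()) {
            if (entry.getValue().contains(needle)) {
                titles.add(entry.getKey());
            }
        }
        return titles;
    }
}

class SearchHelperDemo {
    public static void main(String[] args) {
        // The caller depends only on the interface, so the back end can change.
        SearchHelperSketch helper = new InMemorySearchHelper();
        System.out.println(helper.search("install"));
    }
}
```

Because the calling code refers only to the interface type, a different implementation could be substituted without changing the callers, which is the design choice the paragraph above describes.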

View

The View layer of the application is built using Struts Tiles and JavaServer Pages (JSP). There are several tiles layouts that were created for this application:

  • baseLayout.jsp: Provides the basic layout used by the Search page. Consists of three parts: banner, toolbar, and body.
  • portalBaseLayout.jsp: Provides the basic layout used when the application is executing as a WebSphere Portal portlet. Consists of two parts: toolbar and body.
  • toolbarLayout.jsp: Provides the basic layout used for the toolbar section of baseLayout and portalBaseLayout.
  • treeLayout.jsp: Provides the layout for the category tree used on the category browsing page.

When a user clicks on a particular link in the toolbar, a JSP page is loaded to represent the body section:

  • search.jsp: Provides the main search page.
  • categoryTree.jsp: Provides the page used to search and browse the category tree.
  • options.jsp: Provides the page used to control the query and results options.

The search.jsp and categoryTree.jsp pages comprise several other JSP pages. Because both of these JSP pages share many of the same controls and properties, several other JSP pages were created to contain logic common to both pages so as to avoid code duplication and to make the pages more maintainable. The following additional JSP pages are loaded into the parent JSP page by using the jsp:include tag:

  • commands.jsp: Provides commands for dynamically controlling how the search results are displayed.
  • queryPrompt.jsp: Provides the query entry box and the drop-down or check boxes for selecting the collections to search or browse.
  • messages.jsp: Renders any messages returned in the result set. The messages can be errors, warnings, or informational messages.
  • quickLinks.jsp: Renders any predefined results (quick links) returned in the search results. This page is only used in search.jsp.
  • resultsHeaderFooter.jsp: Displays the number of search results and the controls for navigating the pages of results.
  • searchResults.jsp: Renders the actual search results.
  • spellCorrections.jsp: Renders any spelling suggestions returned. This page is only used by search.jsp.
  • synonymExpansions.jsp: Renders any synonym expansions returned. This page is only used by search.jsp.

Controller components

As explained in the Struts User's guide, the controller's job is to process the user's request and determine what view to use to return to the end user. The WebSphere II OmniFind Edition search application is designed with a single ActionForm and several Struts Action classes. All Action classes extend another class named BaseAction which in turn extends the Struts Action class. Before exploring the Action classes, let's first review what services the BaseAction class provides.

BaseAction was designed to provide most of the logic and helper methods that the other Action classes require to process the users' requests. The motivation for adding an additional parent class was to reduce the complexity and to share common functionality between the Action classes. The added benefit is that it provides a good framework for future extensibility of the application. The BaseAction class provides the following functionality:

  • Interacting with the ActionForm properties.
  • Initializing SIAPI and National Language Support (NLS) services.
  • Providing common actions such as refreshing the collections list, clearing the search results, and processing queries.
  • Providing access to the Apache Commons Logging service.

Each Action class implements the execute method of the Struts Action class. The execute method is the primary entry point whenever a user executes an action that requires them to submit the page. By calling the BaseAction initialize method, the execute method first ensures that the required resources have been loaded; these include the SIAPI services, the list of available searchables (collections), and the NLS-related utilities.

Each Action extracts an ActionForm variable called command that indicates what action the user has invoked on the search application. The possible values for the command property differ depending on which page the user is currently interacting with. Most of these actions will invoke some SIAPI operation against the search server. Some of the possible actions the user could invoke from the Search or Category Tree pages are:

  • clearResults: Clear the current query and its results from the request.
  • refresh: Refresh the list of available searchables (collections).
  • search: A basic search request was submitted.
  • prev: Retrieve the previous page of search results.
  • next: Retrieve the next page of search results.
  • treeNodeSelect: Select a node in the category tree.
  • treeNodeExpand: Expand a particular node in the category tree.

The execute method uses a series of if statements to determine the action to perform and the view to which the response should be forwarded. Each of the commands listed above has a corresponding action method that is called from within the execute method. These action methods perform the necessary steps for executing the user's requested action, and then forward the response to the appropriate view by using the ActionMappings defined in the struts-config.xml file.
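
The dispatch pattern can be sketched without the Struts classes as follows. This is an illustration only: the class name CommandDispatchSketch, the "searchPage" forward name, and the string return values are assumptions that stand in for the real action methods, which return a Struts ActionForward.

```java
// Hypothetical sketch of an if-chain command dispatch; not the product code.
class CommandDispatchSketch {
    // Assumed forward name; the real names are defined in struts-config.xml.
    static final String TARGET_PAGE = "searchPage";

    // Picks the action method that matches the submitted command.
    static String execute(String command) {
        // Constant-first equals() is safe when the command is null.
        if ("clearResults".equals(command)) {
            return clearResultsAction(TARGET_PAGE);
        }
        if ("refresh".equals(command)) {
            return refreshAction(TARGET_PAGE);
        }
        if ("search".equals(command)) {
            return searchAction(TARGET_PAGE);
        }
        // Unknown or missing command: show the page without running an action.
        return "no action -> " + TARGET_PAGE;
    }

    // Stand-ins for the action methods; each reports what it would do.
    static String clearResultsAction(String targetPage) {
        return "clearResults -> " + targetPage;
    }

    static String refreshAction(String targetPage) {
        return "refresh -> " + targetPage;
    }

    static String searchAction(String targetPage) {
        return "search -> " + targetPage;
    }

    public static void main(String[] args) {
        System.out.println(execute("refresh"));
    }
}
```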

As an example, let's take a look at the code for the refresh command (see Listing 1). This method will load the list of collections available from the SIAPI RemoteFederator by calling the loadFederatorCollections method. It will then add any errors to the Struts ActionErrors class and finally forward the response to the appropriate view.

Listing 1. The refreshAction method
protected ActionForward refreshAction(
	ActionMapping mapping,
	ActionForm form,
	ActionErrors errors,
	HttpServletRequest request,
	String targetPage) {

	// load the Remote Federator and its list of assigned collections
	loadFederatorCollections(errors, request);
	if (!errors.isEmpty()) {
		saveErrors(request, errors);
	}

	return mapping.findForward(targetPage);
}

The loadFederatorCollections method (see Listing 2) first extracts the ISearchHelper object from the session. If the helper object is found in the session, the ISearchHelper.loadRemoteFederator method is invoked to load the SIAPI RemoteFederator object from the search server. Next, an array of Collection value objects is created from the array of SIAPI CollectionInfo objects assigned to the RemoteFederator, and this Collection array is stored in the form variable "collections". The queryPrompt.jsp page reads this array to display the collections to the end user.

Listing 2. The loadFederatorCollections method
DynaActionForm theForm = (DynaActionForm) form;
ISearchHelper helper =
	(ISearchHelper) request.getSession().getAttribute(
		"SearchHelper");
if (helper != null) {
	// get the federator from the server
	helper.loadRemoteFederator();
	// get the list of collections from the federator and store 
	// the returned list in the form
	CollectionInfo[] infos = helper.getCollectionInfos();
	Vector collections = new Vector();
	for (int i = 0; i < infos.length; i++) {
		Collection collection =
			new Collection(infos[i].getID(), infos[i].getLabel());
		collections.add(collection);
	}
	Collection[] collectionObjs =
		new Collection[collections.size()];
	collectionObjs =
		(Collection[]) collections.toArray(collectionObjs);
	theForm.set("collections", collectionObjs);
}

There are several other classes used in the search application that provide various functions. All of these classes have comments at the top of the class definition that describe the purpose of the class. Below is a quick overview of the various subpackages and the services they provide:

  • com.ibm.es.searchui.actions: Contains the BaseAction class and all of the other action classes that extend BaseAction.
  • com.ibm.es.searchui.filters: Servlet filter class that sets the request encoding to UTF-8 to support the entry of query strings in all languages. This package is not used when the application is run in WebSphere Portal.
  • com.ibm.es.searchui.helpers: Contains the TreeControl and TreeControlNode wrapper classes, which represent the category tree, and the ISearchHelper interface.
  • com.ibm.es.searchui.helpers.siapi: Contains the SIAPI implementation of the ISearchHelper interface.
  • com.ibm.es.searchui.tags: A series of small Tag libraries that mimic the WebSphere Portal EncodeNamespaceTag when the application is operating in a standard WebSphere Application Server deployment, and that provide formatting support for specific types of display information such as spelling suggestions, synonym expansions, document dates, and document scoring.
  • com.ibm.es.searchui.validation: Provides a single class that is used to validate the user's entry for the "Number of results per page".
  • com.ibm.es.searchui.valueObjects: Contains the ResultLanguage class to store each supported language and its corresponding translated string representation. Example: en_US would be the supported language, "English - United States" might be the translated string displayed to the user.

Now let's examine the typical search execution scenario. For this scenario, assume that the user leaves the default collection selected, enters a simple query string, and then clicks the Search button. First, the request enters the execute method in the SearchAction class. Before processing the search, the SearchValidation class is called to ensure that the user entered a valid value for the number of results per page.
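
A validation step of this kind might look like the following sketch. The class name ResultsPerPageValidationSketch and the accepted range of 1 to 100 are assumptions chosen for illustration. The article does not state the limits that the SearchValidation class enforces.

```java
// Hypothetical sketch of a "results per page" check; not the SearchValidation class.
class ResultsPerPageValidationSketch {
    static final int MIN_RESULTS = 1;   // assumed lower bound
    static final int MAX_RESULTS = 100; // assumed upper bound

    // Returns true only for a whole number inside the assumed range.
    static boolean isValid(String value) {
        if (value == null) {
            return false;
        }
        try {
            int requested = Integer.parseInt(value.trim());
            return requested >= MIN_RESULTS && requested <= MAX_RESULTS;
        } catch (NumberFormatException e) {
            // Non-numeric input such as "abc" or an empty string.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("10"));
        System.out.println(isValid("abc"));
    }
}
```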

Once the request passes validation, the processQueryAction method is called (see Listing 3). This method immediately extracts the ISearchHelper object from the session. If the ISearchHelper object is present, the ActionForm parameters are extracted and stored in a HashMap. This HashMap is passed to the ISearchHelper.buildQueryString method (see Listing 4), which constructs a valid SIAPI query string by appending additional query terms to the user's query terms. For example, if the user wants to limit the results to documents written in Japanese, the query will contain the additional query term "^$language::ja".

Listing 3. BaseAction processQueryAction code snippet
HashMap parameters = getFormValues(form);
parameters.put(
	"categoryId",
	request.getParameter("categoryId"));
parameters.put(
	"siteSearch",
	request.getParameter("siteSearch"));
parameters.put(
	"UserSecurityContent",
	(String) request.getSession().getAttribute(
		"UserSecurityContent"));

// build the query string
String queryString = helper.buildQueryString(parameters);
Query query = helper.createQuery(queryString, parameters, calculatePageControl(form));
if (query != null
	&& query.getText() != null
	&& query.getText().length() > 1) {
	ResultSet results = null;
	// if the user is interacting with the category tree
	// then they can only select one collection and it will
	// be in the collectionSelected property
	if (targetPage.compareToIgnoreCase("categoryTreePage") == 0) {
		String[] selected = new String[] {(String) theForm.get("collectionSelected")};
		results = helper.search(query, selected);
	} else {
		// otherwise, walk the list of collections and see which ones the user
		// selected for searching
		Collection[] collections = getCollections((DynaActionForm) form, request);
		if (collections != null) {
			Vector ids = new Vector();
			for (int i = 0; i < collections.length; i++) {
				if (collections[i].isSelected()) {
					ids.add(collections[i].getID());
				}
			}
			String[] selectedIDs = new String[ids.size()];
			selectedIDs = (String[]) ids.toArray(selectedIDs);
			results = helper.search(query, selectedIDs);
		} else {
			// default to searching all collections in the
			// RemoteFederator
			results = helper.search(query, null);
		}
	}
	storeQueryResponse(results, query, request, form);
}

Listing 4. SiapiHelper buildQueryString code snippet
String query = (String) parameters.get("queryString");
String site = (String) parameters.get("siteSearch");
String categoryId = (String) parameters.get("categoryId");
String[] languages = (String[]) parameters.get("resultLanguages");
String[] sources = (String[]) parameters.get("documentSources");
String[] types = (String[]) parameters.get("documentTypes");
String[] scopes = (String[]) parameters.get("scopes");

if (query != null && query.length() > 1) {
	// check if this is a query to search more from
	// the same site
	if (site != null) {
		query = query + " samegroupas:" + site;
	}

	// if the user selected any scopes to limit
	// the query, add them
	if (scopes != null && scopes.length > 0) {
		for (int i = 0; i < scopes.length; i++) {
			query = query + " ^$scopes::" + scopes[i];
		}
	}

	// return only documents that match the selected languages
	if (languages != null && languages.length > 0) {
		for (int i = 0; i < languages.length; i++) {
			query = query + " ^$language::" + languages[i];
		}
	}

	// return only documents of the specified types
	if (types != null && types.length > 0) {
		for (int i = 0; i < types.length; i++) {
			query = query + " ^$doctype::" + types[i];
		}
	}

	// return only documents that match the specified source (crawler) types
	// examples:  UNIXFS, WINFS, WEB, etc
	if (sources != null && sources.length > 0) {
		for (int i = 0; i < sources.length; i++) {
			query = query + " ^$source::" + sources[i];
		}
	}
}
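
To see what the resulting query string looks like, the following self-contained sketch repeats the same appending logic with plain Java collections. The class name QueryStringSketch and the appendTerms helper are hypothetical, and returning the query unchanged when it is too short is an assumption, because the snippet above does not show that branch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, runnable restatement of the appending logic in Listing 4.
class QueryStringSketch {
    static String buildQueryString(Map<String, Object> parameters) {
        String query = (String) parameters.get("queryString");
        // Mirrors the length check in Listing 4; short queries are left as-is (assumption).
        if (query == null || query.length() <= 1) {
            return query;
        }
        StringBuilder sb = new StringBuilder(query);
        String site = (String) parameters.get("siteSearch");
        if (site != null) {
            sb.append(" samegroupas:").append(site);
        }
        appendTerms(sb, " ^$scopes::", (String[]) parameters.get("scopes"));
        appendTerms(sb, " ^$language::", (String[]) parameters.get("resultLanguages"));
        appendTerms(sb, " ^$doctype::", (String[]) parameters.get("documentTypes"));
        appendTerms(sb, " ^$source::", (String[]) parameters.get("documentSources"));
        return sb.toString();
    }

    // Appends one prefixed term per value; does nothing for a null or empty array.
    static void appendTerms(StringBuilder sb, String prefix, String[] values) {
        if (values == null) {
            return;
        }
        for (int i = 0; i < values.length; i++) {
            sb.append(prefix).append(values[i]);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> parameters = new HashMap<String, Object>();
        parameters.put("queryString", "enterprise search");
        parameters.put("resultLanguages", new String[] {"ja"});
        parameters.put("documentSources", new String[] {"WEB", "WINFS"});
        // prints: enterprise search ^$language::ja ^$source::WEB ^$source::WINFS
        System.out.println(buildQueryString(parameters));
    }
}
```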

After successfully constructing a SIAPI query string, the ISearchHelper.createQuery method (see Listing 5) is invoked to create an actual SIAPI Query object from the query string and to apply the user's other query options. Examples of additional query options are settings such as whether to return spelling suggestions or to include synonym expansions in the query results. With a valid SIAPI Query object, the ISearchHelper.search method is invoked to execute the search request against the search server. The response is then passed to the storeQueryResponse method.

Listing 5. SiapiHelper createQuery code snippet
// create the query that passes in the query string
Query query = searchFactory.createQuery(queryString);
query.setRequestedResultRange(start, range);
// if there was a previous query, then we need to maintain the
// SearchState
if (state != null) {
	query.setSearchState(state);
}

// set the query language
if (queryLanguage != null && queryLanguage.length() > 1) {
	query.setQueryLanguage(queryLanguage);
}

// if there are any security tokens, add them here
// In 8.2 Fixpack 2, this is only relevant for collections that
// contain documents from Domino databases
if (aclConstraints != null && aclConstraints.length() > 0) {
	// if you wanted to also support additional security tokens
	// that were configured when creating crawlers
	// refer to the API guide and the SIAPI Javadoc for more
	// information on the syntax of the setACLConstraints method's
	// query string parameter 
	query.setACLConstraints(
		"@SecurityContext::'" + aclConstraints + "'");
}

// should predefined results (quick links) be returned with the results?
if (predefinedResults.compareToIgnoreCase("Yes") == 0) {
	query.setPredefinedResultsEnabled(true);
} else {
	query.setPredefinedResultsEnabled(false);
}

// should results from the same site be collapsed?
if (siteCollapsing.compareToIgnoreCase("Yes") == 0) {
	query.setSiteCollapsingEnabled(true);
} else {
	query.setSiteCollapsingEnabled(false);
}

// should spelling corrections be returned with the results?
if (spellCorrections.compareToIgnoreCase("Yes") == 0) {
	query.setSpellCorrectionEnabled(true);
} else {
	query.setSpellCorrectionEnabled(false);
}

// should synonym expansions be returned with the results?
if (synonymExpansions.compareToIgnoreCase("Automatic") == 0) {
	query.setSynonymExpansionMode(Query.SYNONYM_EXPANSION_AUTOMATIC);
} else {
	query.setSynonymExpansionMode(Query.SYNONYM_EXPANSION_OFF);
}
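
The option checks in Listing 5 call compareToIgnoreCase on the form value, which throws a NullPointerException if that value is null. One possible null-safe variant is sketched below. The class name QueryOptionSketch and the isEnabled helper are hypothetical and are not part of the sample application.

```java
// Hypothetical helper for turning a "Yes"/"No" form value into a boolean.
class QueryOptionSketch {
    // Calling equalsIgnoreCase on the constant returns false for null
    // instead of throwing a NullPointerException.
    static boolean isEnabled(String formValue) {
        return "Yes".equalsIgnoreCase(formValue);
    }

    public static void main(String[] args) {
        System.out.println(isEnabled("yes")); // prints: true
        System.out.println(isEnabled(null));  // prints: false
    }
}
```

With such a helper, each if/else pair could collapse to a single call, for example passing isEnabled(siteCollapsing) directly to the corresponding setter.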

The storeQueryResponse method (see Listing 6) first stores the SIAPI Query and ResultSet objects in the request so that the View can extract them for rendering to the user. This method also calculates the current results range for display to the user (for example, 21-30). The SiapiMessage objects are extracted from the ResultSet to be displayed to the end user in the messages.jsp file. Finally, the response is forwarded to the appropriate view.

Listing 6. BaseAction storeQueryResponse code snippet
DynaActionForm theForm = (DynaActionForm) form;
// store the query and results in the request
request.setAttribute("results", results);
request.setAttribute("query", query);
		
// determine the page variables to use for display
// to the user
int range =
	query.getFirstRequestedResult() + query.getNumRequestedResults();
if (range > results.getAvailableNumberOfResults()) {
	range = results.getAvailableNumberOfResults();
	request.setAttribute("isLastPage", "true");
} else {
	request.setAttribute("isLastPage", "false");
}
request.setAttribute("displayRange", Integer.toString(range));
request.setAttribute(
	"firstResult",
	Integer.toString(query.getFirstRequestedResult() + 1));

// store any messages from the ResultSet into the messages
// textarea
List messages = results.getMessages();
if (messages != null && messages.size() > 0) {
	StringBuffer sb = new StringBuffer();
	Iterator iter = messages.iterator();
	while (iter.hasNext()) {
		SiapiMessage message = (SiapiMessage) iter.next();
		// load the message in the user's locale
		try {
			sb.append(message.getMessage(true, request.getLocale()));
		} catch (SiapiException e) {
			// failed to load the message, display the error for failing
			// to load the message instead
			sb.append(e.getMessage());
		}
		sb.append("\n");
	}

	// store the messages to the textarea on the messages.jsp
	theForm.set("messages", sb.toString());
}
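
The range arithmetic in Listing 6 can be checked with a small worked example. The sketch below repeats the same calculation with plain integers. The class name DisplayRangeSketch, the describe method, and the output format are hypothetical.

```java
// Hypothetical, runnable restatement of the range calculation in Listing 6.
class DisplayRangeSketch {
    // firstRequested is zero-based, as implied by the "+ 1" in Listing 6.
    static String describe(int firstRequested, int numRequested, int available) {
        int last = firstRequested + numRequested;
        boolean isLastPage = false;
        if (last > available) {
            // Fewer results remain than were requested, so clamp the range.
            last = available;
            isLastPage = true;
        }
        String text = (firstRequested + 1) + "-" + last;
        return isLastPage ? text + " (last page)" : text;
    }

    public static void main(String[] args) {
        System.out.println(describe(20, 10, 45)); // prints: 21-30
        System.out.println(describe(20, 10, 25)); // prints: 21-25 (last page)
    }
}
```

Note that with the strict comparison used here, as in Listing 6, a page that ends exactly on the final available result (for example, 21-30 of 30) is not flagged as the last page.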

Summary

After completing this article you should have a general understanding of the basic design and execution of the IBM WebSphere Information Integrator OmniFind Edition sample search application. The search application is intended to be a fully functional sample application that demonstrates the powerful search capabilities that the product offers. By using this article and the product programming guide (Programming Guide and API Reference for Enterprise Search), you should be able to use the sample search application as a basis for designing your own customized user interface.

Resources

DB2 Information Integrator OmniFind Edition was recently renamed WebSphere Information Integrator OmniFind Edition. You might see references to WebSphere Information Integrator OmniFind Edition on the product Web pages, but the published product documentation and support site still reflect the DB2 brand. The product is also frequently referred to by its functional description: enterprise search.
