Level: Intermediate
Itai Dinur (DINUR@il.ibm.com), Software Engineer, IBM
Ann Hayes (annhayes@us.ibm.com), Software Engineer, IBM
David Konopnicki (DAVIDKO@il.ibm.com), Manager, Search Technologies Development, IBM
Eitan Shapiro (EITANS@il.ibm.com), Software Engineer, IBM
25 Oct 2006
Does your company require specific content metadata or a custom look and feel within search applications? This article describes the IBM® Search and Indexing API (SIAPI), now available with IBM WebSphere® Portal V6.0, along with its design goals and capabilities. With an emphasis on real-life code examples, it walks through the internals of a sample search portlet, explains how to construct custom queries and browse taxonomies, and introduces the new search administration concepts in WebSphere Portal V6.0.
From the IBM WebSphere Developer Technical Journal.
Introduction
The IBM Search and Indexing API (SIAPI) is a Java™ API used to develop search portlets and applications. It enables you to send queries to the collections managed by the Portal Search Engine, retrieve results, and display those results. With SIAPI, you can also browse category trees associated with collections and display the documents that belong to a specific category.
SIAPI is common to both the Portal Search Engine (PSE) and WebSphere Information Integrator OmniFind Edition (OmniFind), IBM's enterprise search product. The implementations differ slightly, as described in the Javadoc documentation (see Resources), but porting search applications from PSE to OmniFind should be straightforward.
This article covers only the search functionality. As of this writing, the indexing capabilities are not yet available.
How the API is organized
SIAPI is organized into several packages:
Management package - used to manage the different SIAPI implementations that have been deployed on a portal; for example, the default PSE SIAPI implementation and the OmniFind implementation. This package is also used to manage the different search services that were created on the WebSphere Portal server (more on this below).
Search package - used to search document collections.
Browse package - used to browse the taxonomy trees associated with the collections.
Scope.admin package - used to create or delete scopes. Scopes are a new feature of WebSphere Portal V6.0. An administrator can define search scopes to enable users to limit search results to specific content locations and specific document types. This enables users to target their searches better. Users can select these scopes from a selection menu provided with the search box in the theme and with the Search Center portlet.
Scope.search package - used to search with scopes.
All the SIAPI packages follow the same organization and contain a factory that is used to create the data objects needed for operation. For example, using the SearchFactory you can create Query objects representing full-text queries that can be submitted to a collection. Aside from the factory, the SIAPI packages contain one or more Service objects that provide the operations. For example, using the SearchService, you can obtain Searchable objects that provide the search method used to search document collections.
The Display package (com.ibm.portal.search.providers.display) is an additional package, which is not part of SIAPI, that is used to format search results in the Portal Search Center.
In the sections that follow, we explore the different packages, explain the underlying concepts, and detail code snippets from the sample search portlet included in this article.
Manage the different SIAPI implementations
As explained above, WebSphere Portal contains several subsystems that implement the SIAPI. Out-of-the-box, WebSphere Portal contains three SIAPI implementations:
Portal Search Engine SIAPI - used to access collections managed by the Portal Search Engine.
Content Management SIAPI - used to access the search collection for Portal Document Manager (PDM) documents.
Generic SIAPI implementation - this implementation abstracts the details of the other implementations installed in the portal. For example, it enables you to access all the collections that exist in a portal, regardless of the SIAPI service in which they are defined.
If OmniFind is installed, a fourth SIAPI implementation is deployed and extends the list above.
Search services
WebSphere Portal V6.0 introduces the notion of a search service: a search back end that can be queried. By default, two search services are installed:
The default Portal Search Service is used to access user-defined collections. The default Portal Search Service implements PSE SIAPI.
The Default Content Model (CM) search service is used to query documents saved in the Portal Document Manager. The CM search service implements CM SIAPI.
You can define several search services using Portal Search remote capabilities; for example, to distribute the search load over several machines. To do this, open the search administration portlet, click Search Services => New Search Service, and choose the type of search service you want to create from the drop-down list.
After creating several remote search services, the configuration of your search services may resemble that shown in Figure 1.
Figure 1: Two PSE search services and the CM search service
In Figure 1, the generic search service abstracts the difference between the SIAPI implementations and services, and enables access to all the collections deployed in a portal. The different SIAPI implementations are depicted in different colors.
Developing a search application
The general tasks involved in developing a search application include:
- Create an entry point for the search application
- Choose the search factory
- Select a search service or use the generic search service
- Work with the searchable object to execute a query
- Browse in a taxonomy (optional)
- Work with search scopes
- Use the display provider to work with search results
These tasks are detailed in the next sections.
1. Create an entry point for the search application
The searchProvider is the programmatic entry point for your portlets to SIAPI. The searchProvider is obtained as a portlet service (for IBM portlets or JSR 168 portlets) or through a context lookup in skins or themes. For example, for JSR 168 portlets, this is done as follows:
Listing 1
// Look up the portlet service home through JNDI
ctx = new javax.naming.InitialContext();
Object home = ctx.lookup(
    "portletservice/com.ibm.portal.portlet.service.search.SearchProvider");
if (home != null)
    portletServiceHome = (PortletServiceHome) home;
...
// Obtain the SearchProvider from the portlet service home
SearchProvider searchProvider = (SearchProvider) portletServiceHome
    .getPortletService(com.ibm.portal.portlet.service.search.SearchProvider.class);
See BasicSiapiPortlet.java for other examples of how to obtain the searchProvider.
2. Choose the search factory
Using the search provider, you can access SIAPI factories. In each package, the SIAPI factory lets you access some functionality, and there is a different factory for each SIAPI implementation. To access a SIAPI factory, you pass the name of its implementation to the searchProvider object. As a next step, you would retrieve the Management Factory with:
searchProvider.getSiapiManagementFactory(
    "com.ibm.lotus.search.siapi.SiapiManagementFactory");
The generic SIAPI implementation contains management, search, browse, scope and display factories. The PSE and the CM SIAPI implementations contain only search and browse factories. The names of the different SIAPI factory implementations are detailed in the SIAPI Javadoc (see Resources).
As shown above, to retrieve a factory from the searchProvider, you need to know the name of the implementation you want. To discover the available factories dynamically, use the management factory and call the method getSearchFactoryInfos, which returns the list of all the search factories that have been deployed. You can then instantiate a particular factory, as shown in BasicSiapiPortlet.java:
FactoryInfo searchFactoryInfo =
    siapiManagementFactory.getSearchFactoryInfo(
        "com.ibm.lotus.search.plugins.pse.PSESearchFactoryImp");
searchFactory =
    siapiManagementFactory.getSearchFactory(searchFactoryInfo);
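The selection step can be sketched without the portal runtime. The code below is not SIAPI code: `FactoryDescriptor` and `FactoryChooser` are hypothetical stand-ins that only model "a list of deployed implementation names" and "an ordered list of names the application would accept". It illustrates one possible policy, which is to take the first preferred implementation that is actually deployed.

```java
import java.util.List;

/** Hypothetical stand-in for the information reported about one deployed factory. */
final class FactoryDescriptor {
    private final String implementationName;

    FactoryDescriptor(String implementationName) {
        this.implementationName = implementationName;
    }

    String getImplementationName() {
        return implementationName;
    }
}

/** Hypothetical helper: picks an implementation name from the deployed ones. */
final class FactoryChooser {
    private FactoryChooser() {
    }

    /**
     * Returns the first name in 'preferred' that also appears in 'deployed',
     * or null when none of the preferred implementations is deployed.
     */
    static String choose(List<FactoryDescriptor> deployed, List<String> preferred) {
        for (String wanted : preferred) {
            for (FactoryDescriptor descriptor : deployed) {
                if (wanted.equals(descriptor.getImplementationName())) {
                    return wanted;
                }
            }
        }
        return null;
    }
}
```

In a real portlet, the chosen name would then be handed to the management factory as in the listing above; a null result is the point at which an application might fall back to the generic implementation.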
3. Select a search service or use the generic search service
As explained above, using the search service management administrator user interface, it is possible to create several search services. Thus, when you work with SIAPI, you have two alternatives:
Use the generic search service. There is only one such service and it abstracts all the other search services that have been defined in the portal. This means that calling the method genericSearchService.getAvailableSearchables() returns a list of all the collections that have been defined in any other search service in your Portal.
In the same way, if you know the name of the collection you want to query, you can call genericSearchService.getSearchable(...your collection name...).
You can retrieve the generic search service as follows:
First, retrieve the SearchFactory:
SearchFactory genericSearchFactory = searchProvider.getSiapiSearchFactory(
    "com.ibm.lotus.search.plugins.provider.core.GenericSearchFactory");
Then you can obtain the unique generic search service from this factory:
SearchService genericSearchService = genericSearchFactory.getSearchService(null);
As an alternative, you can access a particular search service that you created using the search administration UI. To do this, use the SiapiManagementService and call GetRegisteredSearchServices. This call returns a ServiceInfo object containing the service properties needed to instantiate a SearchService that was created through the Search Admin portlet.
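The difference between the two alternatives can be pictured with a small, self-contained model. Nothing here is SIAPI code: `SketchSearchService` and `SketchGenericService` are hypothetical stand-ins, and collections are reduced to plain names. The sketch assumes that the generic view simply concatenates the collections of every registered service.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one search service and the collections it owns. */
final class SketchSearchService {
    private final String name;
    private final List<String> collections;

    SketchSearchService(String name, List<String> collections) {
        this.name = name;
        this.collections = new ArrayList<String>(collections);
    }

    String getName() {
        return name;
    }

    List<String> getCollections() {
        return collections;
    }
}

/** Hypothetical facade that mimics a "generic" view over all registered services. */
final class SketchGenericService {
    // LinkedHashMap keeps the services in registration order
    private final Map<String, SketchSearchService> byName =
            new LinkedHashMap<String, SketchSearchService>();

    void register(SketchSearchService service) {
        byName.put(service.getName(), service);
    }

    /** Alternative 1: every collection of every service, in registration order. */
    List<String> getAvailableCollections() {
        List<String> all = new ArrayList<String>();
        for (SketchSearchService service : byName.values()) {
            all.addAll(service.getCollections());
        }
        return all;
    }

    /** Alternative 2: one specific service, or null if the name is not registered. */
    SketchSearchService getService(String name) {
        return byName.get(name);
    }
}
```

With the first alternative the application never needs to know which service owns a collection; with the second it does, in exchange for being able to target one back end explicitly.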
4. Work with the searchable object to execute a query
A SearchService enables you to obtain searchable objects that can then be used to facilitate your search. A SIAPI query is passed to the search method, which then returns the results. For example, here is how our sample portlet executes queries:
/**
 * Executes the current query on the current collection and
 * sets the results.
 * @param queryString The query string
 * @param fieldID The document field to query
 * @param docType The type of document to return
 * @param categoryID The category to search
 * @param scopeID The search scope to use
 * @throws SiapiException There was a Siapi error
 */
public void doCurrQuery(String queryString,
                        String fieldID,
                        String docType,
                        String categoryID,
                        String scopeID) throws SiapiException {
    // Create a query
    this.currQuery = searchFactory.createQuery(queryString);
    ...
    this.currQuery.setText(currQueryInfo.getQueryText());
    // Set the range of the query
    this.currPage = 0;
    this.setQueryRange();
    this.setReturnedValues();
    this.setReturnedAttributes();
    this.currQuery.setSortKey(BaseQuery.SORT_KEY_RELEVANCE);
    this.currDetail = true;
    this.ScopeID = scopeID;
    // Set the ACL Constraints to user DN
    this.currQuery.setACLConstraints(userID);
    // Execute the query
    this.executeQuery();
}
Then the query is executed in method executeQuery:
resultSet = this.currSearchable.search(currQuery);
By manipulating the Query object, you can modify the searches executed by your end users. For example, you can examine the user profile and add a category condition to the user query in order to return results from a category of documents appropriate for this user. Here is how the QueryInfoBean builds a SIAPI query string:
/**
 * Returns the Siapi query text that corresponds to the parameters
 * @return A Siapi query text
 */
public String getQueryText() {
    String queryText = "";
    // Add condition on the field
    if (null != fieldID && null != text && !text.trim().equals("")) {
        if (!fieldID.equals(BasicSiapiPortlet.WHOLE_DOCUMENT)) {
            // Use + rather than += so that fieldID itself is not modified
            queryText += this.fieldID + ":";
        }
    }
    // Add user query
    if (null != text) {
        queryText += text;
    }
    // Add condition on file type
    if (null != documentType) {
        if (!documentType.equals(BasicSiapiPortlet.ALL_DOC_TYPES)) {
            queryText += " $doctype::" + documentType;
        }
    }
    // Add condition on category
    if (null != categoryID) {
        if (!categoryID.equals(BasicSiapiPortlet.ALL_CATEGORIES)) {
            queryText += " taxonomy::" + categoryID;
        }
    }
    return queryText;
}
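The profile-based restriction mentioned above comes down to string handling once the category is known. The following is a minimal sketch, not code from the sample portlet: `ProfileQuerySketch` and `restrictToCategory` are hypothetical names, and how a category ID is derived from a user profile is left to the application. Only the `taxonomy::` condition syntax is taken from the listing above.

```java
/** Hypothetical helper: appends a category condition chosen for the current user. */
final class ProfileQuerySketch {
    private ProfileQuerySketch() {
    }

    /**
     * @param userQuery  the text typed by the user (may be null)
     * @param categoryId the category selected from the user's profile,
     *                   or null/blank for no restriction
     * @return the query text, with a taxonomy condition appended when a
     *         category is given
     */
    static String restrictToCategory(String userQuery, String categoryId) {
        String base = (userQuery == null) ? "" : userQuery.trim();
        if (categoryId == null || categoryId.trim().isEmpty()) {
            return base;
        }
        String condition = "taxonomy::" + categoryId.trim();
        return base.isEmpty() ? condition : base + " " + condition;
    }
}
```

For instance, `restrictToCategory("websphere portal", "hr_42")` yields `websphere portal taxonomy::hr_42`, which could then be passed to the query's setText method as in doCurrQuery.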
In a similar manner, you can manipulate the Result objects that are returned by a search, hide document attributes, or add new document attributes obtained from another system.
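As an illustration of that kind of post-processing, the sketch below models a result as a plain map of attribute names to values. This is an assumption made for the example: `ResultAttributeSketch` is hypothetical and does not use the SIAPI Result type. It drops the attributes that should stay hidden and merges in attributes fetched from another system, without letting them overwrite what the search engine returned.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical post-processor for the attributes of one search result. */
final class ResultAttributeSketch {
    private ResultAttributeSketch() {
    }

    /**
     * @param attributes attributes returned by the search (name to value)
     * @param hidden     names of attributes that must not be displayed
     * @param extra      attributes obtained from another system
     * @return a new map; the input maps are left untouched
     */
    static Map<String, String> filterAndEnrich(Map<String, String> attributes,
                                               Set<String> hidden,
                                               Map<String, String> extra) {
        Map<String, String> visible = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            if (!hidden.contains(entry.getKey())) {
                visible.put(entry.getKey(), entry.getValue());
            }
        }
        for (Map.Entry<String, String> entry : extra.entrySet()) {
            // Attributes that came back from the search take precedence
            if (!visible.containsKey(entry.getKey())) {
                visible.put(entry.getKey(), entry.getValue());
            }
        }
        return visible;
    }
}
```

Returning a new map keeps the original result untouched, so the same result set can be rendered differently for different users.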
5. Browse in a taxonomy (optional)
You can retrieve a BrowseService object in the same way you retrieve a SearchService object. This service is required to access TaxonomyBrowser objects, which enable you to browse the category tree associated with a collection. A TaxonomyBrowser first gives you the root category of the tree, and then enables you to retrieve its child categories. The following code shows how you would traverse the category tree:
/**
 * Builds a list of all the categories defined in a collection
 * @param collectionID The ID of the collection
 * @return A list of categories
 * @throws SiapiException There was a Siapi error
 */
public Category[] getCategories(String collectionID) throws
        SiapiException {
    TaxonomyBrowser taxonomy = null;
    try {
        taxonomy = browseService.getTaxonomyBrowser(null,
                collectionID, "Default");
    } catch (SiapiException e) {
        …
    }
    // Walk the tree recursively (depth-first) to find all categories
    ArrayList categories = new ArrayList();
    try {
        Category curr = taxonomy.getRootCategory();
        addDescendants(taxonomy, curr, categories);
    } catch (RuntimeException e1) {
        …
    }
    return (Category[]) categories.toArray(new Category[0]);
}
/**
 * Recursively visits the descendants of a category
 * @param taxonomy The Siapi taxonomy browser
 * @param curr The current category
 * @param categories The list of categories being updated
 */
private void addDescendants(TaxonomyBrowser taxonomy, Category curr,
        ArrayList categories) {
    if (null == curr) {
        return;
    }
    categories.add(curr);
    CategoryInfo[] children = curr.getChildren();
    if (null == children) {
        return;
    }
    for (int i = 0; i < children.length; i++) {
        String childID = children[i].getID();
        Category child = taxonomy.getCategory(childID);
        // recursive call
        addDescendants(taxonomy, child, categories);
    }
}
As shown above, to get the list of the documents in a given category, you must add a category condition to a query by using the SIAPI query syntax; this is done by adding taxonomy::category_id to the query text. The category_id is obtained from the Category object.
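The recursive addDescendants method visits categories depth-first. If a level-by-level order is preferred, for example to render a category menu one tier at a time, the same walk can be written iteratively with a queue. The sketch below is self-contained and does not use the SIAPI types: `SketchCategory` and `TaxonomyWalkSketch` are hypothetical stand-ins for a category node and the traversal helper.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for a taxonomy category; not the SIAPI Category type. */
final class SketchCategory {
    private final String id;
    private final List<SketchCategory> children = new ArrayList<SketchCategory>();

    SketchCategory(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }

    List<SketchCategory> getChildren() {
        return children;
    }

    /** Adds a child and returns this node so calls can be chained. */
    SketchCategory add(SketchCategory child) {
        children.add(child);
        return this;
    }
}

final class TaxonomyWalkSketch {
    private TaxonomyWalkSketch() {
    }

    /** Returns the category IDs level by level, starting from the root. */
    static List<String> breadthFirstIds(SketchCategory root) {
        List<String> ids = new ArrayList<String>();
        if (root == null) {
            return ids;
        }
        Deque<SketchCategory> queue = new ArrayDeque<SketchCategory>();
        queue.add(root);
        while (!queue.isEmpty()) {
            // Take the oldest queued category, record it, then queue its children
            SketchCategory current = queue.remove();
            ids.add(current.getId());
            queue.addAll(current.getChildren());
        }
        return ids;
    }
}
```

Besides changing the visiting order, the iterative form avoids deep recursion on very deep category trees.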
6. Work with search scopes
In WebSphere Portal V6.0, the preferred way to query for documents is to use a search scope. You can define search scopes through the search administrative UI and use them to target a search to a particular set of documents. The set of documents is defined by any combination of the following criteria:
- A set of locations, which are collections or particular content sources.
- A set of features defining the source of the documents that should be returned (for example, the Web or WCM sites).
- A full-text query that the documents in the scope must satisfy.
A search scope can be retrieved using the ScopeSearchService: obtain the Scope object by name, and use the search method to query the documents that belong to the scope.
This is shown in BasicSiapiPortlet.java:
resultSet = scopeSearchService.search(appInfo,scope,currQuery);
However, it is also possible to create a scope dynamically (on the fly) -- without registering it to appear in the scopes menu in the UI -- and execute a search on it. Using scopes rather than searchables in your search application has several advantages:
- The scopes API is simpler to use.
- If you change the collection that needs to be searched, you do not have to change your application. You will be able to simply change the definition of your search scope in the Administrative UI.
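To make the scope criteria concrete, here is a small in-memory model of a scope as a document filter. It is only an illustration under stated assumptions: `SketchDocument` and `SketchScope` are hypothetical classes, the three criteria are assumed to be combined with a logical AND, an empty criterion is assumed to mean "no restriction", and the full-text query is reduced to a single required term.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical document with just the properties a scope looks at. */
final class SketchDocument {
    final String location; // collection or content source
    final String source;   // kind of source, for example "web" or "wcm"
    final String text;

    SketchDocument(String location, String source, String text) {
        this.location = location;
        this.source = source;
        this.text = text;
    }
}

/** Hypothetical scope: locations AND sources AND a required term. */
final class SketchScope {
    private final Set<String> locations;
    private final Set<String> sources;
    private final String requiredTerm;

    /** Empty sets and a null term mean "no restriction" for that criterion. */
    SketchScope(Set<String> locations, Set<String> sources, String requiredTerm) {
        this.locations = new HashSet<String>(locations);
        this.sources = new HashSet<String>(sources);
        this.requiredTerm = requiredTerm;
    }

    boolean contains(SketchDocument doc) {
        boolean locationOk = locations.isEmpty() || locations.contains(doc.location);
        boolean sourceOk = sources.isEmpty() || sources.contains(doc.source);
        boolean textOk = requiredTerm == null
                || doc.text.toLowerCase(Locale.ROOT)
                        .contains(requiredTerm.toLowerCase(Locale.ROOT));
        return locationOk && sourceOk && textOk;
    }
}
```

The model also shows why scopes decouple the application from the collections: retargeting a search means constructing the scope with different locations, while the code that calls `contains` stays the same.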
7. Use the display provider to work with search results
Display providers generate the information used to display a search result, depending on its type. Using the DisplayService, you can call getDisplayData to obtain a DisplayData object. Such an object can return the icon that should be presented along with a result (depending on the document type), and the formatted text that displays the result attributes as formatted in the Search Center. Furthermore, if the result is a portlet, the DisplayService returns a PortalDisplayData object used to generate a portal URL for a result:
/**
 * Initializes the link to the result
 * @param result The Siapi result encapsulated
 */
public void setHTMLLink(Result result) {
    String urlString = null;
    if (dispData instanceof PortalDisplayData) {
        urlString = ((PortalDisplayData) dispData).getPortalURL();
    }
    else {
        urlString = result.getDocumentURI();
    }
    // set the link to the URL (document source)
    this.HTMLLink = "<A href=\"" + urlString + "\">";
    // display the document title
    this.HTMLLink += result.getTitle() + "</A>";
}
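The listing concatenates the URL and the title directly into markup. Titles and URLs come from indexed documents, so a production portlet would normally escape them before building the link. The following is a minimal sketch of that step, not code from the sample portlet: `HtmlLinkSketch`, `escapeHtml`, and `link` are hypothetical names, and only the five characters that are special in HTML text and attribute values are handled.

```java
/** Hypothetical helper that builds an escaped HTML link for a search result. */
final class HtmlLinkSketch {
    private HtmlLinkSketch() {
    }

    /** Escapes the characters that are special in HTML text and attribute values. */
    static String escapeHtml(String value) {
        if (value == null) {
            return "";
        }
        StringBuilder escaped = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  escaped.append("&amp;");  break;
                case '<':  escaped.append("&lt;");   break;
                case '>':  escaped.append("&gt;");   break;
                case '"':  escaped.append("&quot;"); break;
                case '\'': escaped.append("&#39;");  break;
                default:   escaped.append(c);
            }
        }
        return escaped.toString();
    }

    /** Builds an anchor element whose href and text are both escaped. */
    static String link(String url, String title) {
        return "<a href=\"" + escapeHtml(url) + "\">" + escapeHtml(title) + "</a>";
    }
}
```

In setHTMLLink, the two concatenations could be replaced by a single call such as `link(urlString, result.getTitle())`.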
Conclusion
In this article, we presented SIAPI, its organization, and its key concepts. We explained how the API is implemented in WebSphere Portal and how developers can use it. In the Resources section you will find detailed Javadoc, together with a sample search portlet that illustrates the use of the APIs described here.
Acknowledgements
The authors wish to thank Andreas Prokoph for his helpful comments.
Download
| Description | Name | Size | Download method |
|---|---|---|---|
| Code sample | BasicSiapiPortlet.war | 42 KB | HTTP |
Resources Learn
Get products and technologies
About the authors
Itai Dinur holds a B.Sc. degree in Computer Science from the Technion, Israel. Itai has over 2 years of experience in software development, specializing in search engine and search application development. Itai currently works at IBM's Research and Development lab in Haifa, Israel.
Ann Hayes is a Software Engineer in the IBM Software Group, Portal Search Technologies department. She has been actively involved with IBM search technologies for the past 4 years.
David Konopnicki manages the Search Technologies department of the IBM Haifa Research Lab, which develops the full-text search engines embedded in IBM Lotus Collaboration products and IBM WebSphere Portal. David has 6 years of experience in research and development of search engines, search applications, databases, and electronic commerce.
Eitan Shapiro holds a B.Sc. degree in Information Systems Engineering from the Technion, Haifa, Israel. He joined IBM in 2005, and he is the chief programmer of the Haifa Search Technologies Team.