A few months ago, I had a discussion with some of my colleagues about the possibility of adding a search facility to our intranet (as a pilot) and then extending it to the corporate Web sites. One of the main goals behind this initiative was to discover what visitors were looking for and adapt the Web content accordingly. Some of us envisioned a completely custom solution (I'm not joking). They suggested building tables on a database server to hold keywords and the address of the page they were related to. The extraction of keywords would have been a manual process, their selection entirely at the mercy of the person performing it. The query on the index would have been performed using standard Structured Query Language (SQL).
But then a different idea emerged. We had everything in hand to implement it: a full Windows environment for all our servers and a Web development tool already in use for internal developments (ColdFusion). The only piece missing was documentation about how to implement it with this tool. A lot of information was available about Active Server Pages (ASP), but resources for ColdFusion were missing. Being the only one on our team with the necessary skills, I rolled up my sleeves and built the ColdFusion equivalent of what existed for ASP, which became our search facility. In the process, I decided to give it back to the community. And here it is.
Before going into the details of how the search facility is built, I'll introduce the tools used throughout this article. Each one has a specific role to play in the final solution: the Rico JavaScript library enhances the user experience when displaying results, ColdFusion MX 7 accesses the index and builds the search results, and Windows Indexing Service provides the indexing infrastructure.
The Rico JavaScript library is an open source Asynchronous JavaScript + XML (Ajax) framework available under the Apache 2.0 License. It emphasizes simplicity by providing a single JavaScript object that adds Ajax support to any HTML page.
To start working with it, all you have to do is include, in the header of the Hypertext Markup Language (HTML) page, the JavaScript libraries that provide the necessary functionalities:
<script src="prototype.js"></script>
<script src="rico.js"></script>
Doing this causes an instance of the Ajax engine to be automatically created and made accessible via an object named ajaxEngine.
In addition to drag-and-drop support and animation effects, this Ajax framework provides the following key characteristics that I use in this article:
- A standard Extensible Markup Language (XML) definition for the base Ajax responses
- The capability to link a response to a specific HTML element
- The automatic update of the innerHTML property of the target HTML element with the content of the response
ColdFusion MX 7 is an application server and Web development framework that can be used to develop high-performance dynamic Web sites. Much of ColdFusion's power comes from its flexible language: the ColdFusion Markup Language (CFML). The language's syntax was modeled after HTML, making it easy to learn. CFML provides numerous tags to encapsulate or extend HTML as well as perform conditional processing, interact with the local file system, perform HTTP or FTP operations, generate PDF files on the fly, connect to all kinds of data created by external applications, and so on. With these built-in capabilities and its ability to integrate with other applications or services, CFML covers most Web development needs out of the box.
Of course, you can use other languages to provide a search page as presented in this article. You can use any Web language or framework, such as PHP, ASP, or Ruby on Rails, to name a few, as long as it can access a Component Object Model (COM) object. That's the only requirement to access the indexed content when working with Windows Indexing Service.
Windows Indexing Service is a base component installed on Windows XP or Windows Server 2003 systems. Its role is to analyze the properties and content of documents on the file system or within a Web server and then to build an indexed catalog that speeds up searches on this data.
You can check its installation in the Add/Remove Windows Components section of the Add/Remove Programs tool, accessible via the Control Panel (see Figure 1):
Figure 1. Add/Remove Windows Components
Once it is installed, you can access and manage this component through the Computer Management console in the Windows Administrative Tools, under the Services and Applications section (see Figure 2):
Figure 2. The Computer Management console
This tool is part of the system and completely integrated, so it's an obvious choice when you're working in a full Windows environment.
If you need cross-platform support, alternatives are available under open source licenses. Examples include Apache Lucene (a high-performance, full-featured text search engine library) and Apache Solr (a search server based on the Lucene search library).
Index the content of an IIS server
Before you begin to code the search facility, you need to activate indexing on your Web server and, if necessary, limit its scope to a subset of files with the help of a catalog.
By default, the indexing system is active, and two catalogs are automatically created:
- System: Through this catalog, all documents of the computer are indexed. You can also query this catalog with the help of Windows' standard search tool.
- Web: Through this catalog, all documents of the Web server are indexed; that is, all documents stored under c:\inetpub\wwwroot.
The Web catalog is used by default if nothing is specified during the search on the content of the Web server. Most of the time, it's more than enough. However, you may want to create additional catalogs if, for instance, you want to provide several search pages on your site, each limited to a subset of the site, or if several Web sites are hosted on the same Microsoft Internet Information Services (IIS) server.
You create a new catalog through the Computer Management console, in the Indexing Service subsection, by choosing New > Catalog from the context menu. You then specify the name of the new catalog as well as its location on disk (see Figure 3):
Figure 3. Add a new catalog
A hidden folder named catalog.wci is automatically created at the specified location and will hold the files generated during the indexing. As soon as this operation is complete, you specify which folders to index or exclude from indexing and associate the catalog with the Web server (on the Tracking tab of the catalog properties) so that the virtual paths (vpath) can be generated properly (see Figure 4):
Figure 4. Catalog properties
Because this article is about building a search page, you need a page on the client side (see Listing 1) to act as the user interface for the search facility. This page presents a basic form to the users on which they can enter search terms. Below that form is an empty area (div) that later displays the search results (see Figure 5):
Figure 5. The search form
Listing 1. The user interface for the search facility (ajax_search.cfm)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <title>Search</title>
    <script src="prototype.js"></script>
    <script src="rico.js"></script>
    <script language="JavaScript">
        var CurrentPage = 1;
        function onLoad() {
            ajaxEngine.registerRequest('getSearchResults','ajax_results.cfm');
            ajaxEngine.registerAjaxElement('SearchResults');
        }
        function getResults(RequestedPage) {
            var SearchString = document.SearchForm.qu.value;
            CurrentPage = RequestedPage;
            ajaxEngine.sendRequest('getSearchResults',
                "qu=" + SearchString,
                "pg=" + RequestedPage);
        }
        function getPreviousResults() {
            getResults(CurrentPage-1);
        }
        function getNextResults() {
            getResults(CurrentPage+1);
        }
    </script>
</head>
<body onLoad="onLoad()">
    <h1>Search</h1>
    <cfform name="SearchForm" method="post" action="#CGI.SCRIPT_NAME#"
        onSubmit="getResults(1); return false;">
        <cfinput type="text" name="qu" size="50" maxlength="100" value="">
        <cfinput type="submit" name="" value="Search">
    </cfform>
    <div id="SearchResults">
    </div>
</body>
</html>
When the page is first loaded (onLoad event), the request handler is registered with the Ajax engine. It associates the URL of the page on the server side (the page that handles the requests) with a name that serves as its identifier on the client side. In addition, because the page provides a div section to be updated by the Ajax engine with the search results, it must be registered as an element:
ajaxEngine.registerRequest('getSearchResults','ajax_results.cfm');
ajaxEngine.registerAjaxElement('SearchResults');
To avoid an overload of information, the display of results is limited to a predefined number of entries; JavaScript functions are provided for forward and backward navigation. Each of these functions generates an Ajax request to call the page on the server side and pass to it two parameters on the URL: qu (the character string to look for) and pg (the number of the page of results to display).
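To make the two parameters concrete, here is a minimal sketch of the URL the server page ends up receiving. The helper buildResultsUrl is hypothetical and is not part of Rico; the library assembles the request itself, and how it encodes values is not shown in this article. The sketch only illustrates the qu and pg parameters described above.

```javascript
// Hypothetical helper (not a Rico function): shows the shape of the URL
// that ajax_results.cfm receives for a given search string and page number.
function buildResultsUrl(page, searchString, requestedPage) {
  // encodeURIComponent keeps spaces, '&', and '#' from breaking the query string.
  return page +
    "?qu=" + encodeURIComponent(searchString) +
    "&pg=" + encodeURIComponent(String(requestedPage));
}

console.log(buildResultsUrl("ajax_results.cfm", "cold fusion", 2));
// prints "ajax_results.cfm?qu=cold%20fusion&pg=2"
```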
On the server side, the page (see Listing 2) retrieves the URL parameters, queries the indexed content, selects the correct subset of results as requested by the client, and returns the results to the browser as a typical ajax-response, packed in XML format. In ColdFusion, I use the cfcontent tag to specify the type of content returned to the browser.
Listing 2. The ajax_results.cfm server page
<cfcontent type="text/xml">
<ajax-response>
<response type="element" id="SearchResults">
<h1>Search results</h1>
<cfscript>
    // Paging settings: results per page and overall cap on retrieved records
    PageSize = 10;
    MaxRecords = 50;
    // Read the search string and requested page from the URL, with defaults
    if (IsDefined("URL.qu")) {
        SearchString = URL.qu;
    }
    else {
        SearchString = "";
    }
    if (IsDefined("URL.pg") and IsNumeric(URL.pg)) {
        RequestedPage = Val(URL.pg);
    }
    else {
        RequestedPage = 1;
    }
</cfscript>
<cfif SearchString eq "">
    <p>Your search did not match any documents.</p>
<cfelse>
    <cfscript>
        // Restrict the search to HTML files (## is an escaped # in CFML strings)
        QueryString = SearchString & " AND ##filename *.htm?";
        ixQuery = CreateObject("COM","ixsso.Query");
        ixQuery.Query = QueryString;
        ixQuery.Columns = "filename,size,rank,characterization,vpath,DocTitle,DocAuthor";
        ixQuery.SortBy = "rank[d], DocTitle";
        ixQuery.MaxRecords = MaxRecords;
        RS = ixQuery.CreateRecordSet("nonsequential");
        if (RS.EOF) {
            WriteOutput("<p>Your search did not match any documents.</p>");
        }
        else {
            RS.PageSize = PageSize;
            PageCount = RS.PageCount;
            // Fall back to the first page when the requested page is out of range
            if ((RequestedPage lt 1) or (RequestedPage gt PageCount)) {
                RequestedPage = 1;
            }
            RS.AbsolutePage = RequestedPage;
            if (RS.RecordCount gt 0) {
                // Summary line and navigation links
                FirstRecordOnPage = RS.AbsolutePosition;
                LastRecordOnPage = FirstRecordOnPage + PageSize - 1;
                if (LastRecordOnPage gt RS.RecordCount) {
                    LastRecordOnPage = RS.RecordCount;
                }
                WriteOutput("<p>");
                WriteOutput("Results <strong>" & FirstRecordOnPage & " - " & LastRecordOnPage);
                WriteOutput("</strong> of <strong>" & RS.RecordCount & "</strong> (page ");
                WriteOutput(RequestedPage & " of " & PageCount & ")");
                if (RequestedPage gt 1) {
                    WriteOutput(" <a href='javascript:getPreviousResults()'>Previous results</a>");
                }
                if ((PageCount gt 1) and (RequestedPage neq PageCount)) {
                    WriteOutput(" <a href='javascript:getNextResults()'>Next results</a>");
                }
                WriteOutput("</p>");
            }
            // Output the records of the requested page only
            while (not (RS.EOF or (RS.AbsolutePage neq RequestedPage))) {
                WriteOutput("<p>");
                // Escape the virtual path as well, so characters such as & keep the XML well formed
                WriteOutput("<a href='" & XmlFormat(RS.Fields.Item("vpath").Value) & "' target='_blank'>");
                WriteOutput(XmlFormat(RS.Fields.Item("DocTitle").Value) & "</a>");
                WriteOutput("<br />");
                WriteOutput(XmlFormat(RS.Fields.Item("characterization").Value));
                WriteOutput("</p>");
                RS.MoveNext();
            }
        }
        // Close the recordset and release the COM objects
        RS.Close();
        ReleaseComObject(RS);
        ReleaseComObject(ixQuery);
    </cfscript>
</cfif>
</response>
</ajax-response>
You can query a Windows Indexing Service catalog either using the Indexing Service query language or using SQL. Each of these languages can be used through several Application Programming Interfaces (APIs). The technique that I use in this article is based on Query Helper, a high-level API that provides an object-based interface for accessing Windows Indexing Service data. It permits you to build a query and submit it, generating an ActiveX Data Object (ADO) Recordset in return.
With ColdFusion, those operations are done as follows:
<cfscript>
    QueryString = SearchString & " AND ##filename *.htm?";
    ixQuery = CreateObject("COM","ixsso.Query");
    ixQuery.Query = QueryString;
    ixQuery.Columns = "filename,size,rank,characterization,vpath,DocTitle,DocAuthor";
    ixQuery.SortBy = "rank[d], DocTitle";
    ixQuery.MaxRecords = MaxRecords;
    RS = ixQuery.CreateRecordSet("nonsequential");
</cfscript>
The most important parameters of this object that are used here are as follows:
- Query: The request, also called a restriction, to submit to Windows Indexing Service. It's a combination of words and parameters that determines which documents are returned as part of the search results. This request can be expressed in several dialects: Dialect 1, Dialect 2, or SQL. In its simplest form, it includes only the word(s) to look for. To refine the search, you can add parameters to limit the scope of the results. For instance, #filename *.htm searches only among HTML files, and #vpath \docs* considers only the docs folder. You can combine such expressions with Boolean operators (AND, OR, NOT).
- Columns: The list of fields to return in the search results. They include the following, among others:
  - filename: The name of the document.
  - rank: A value that specifies the position of the document in the results, based on the frequency of the searched words.
  - characterization: A summary of the document.
  - vpath: The virtual path to access the document.
  - DocTitle: The title of the document.
- SortBy: The sort order of the search results, based on the columns that are specified. The [d] option indicates that one of the fields must be sorted in descending order.
- MaxRecords: The maximum number of records retrieved from the indexed content.
- Dialect: The dialect of the Indexing Service query language. The literal value "1" indicates Dialect 1, and "2" indicates Dialect 2 (the default).
- Catalog: The name of one or more catalogs (separated with a comma) to use to restrict the search. A catalog name is the name associated with a hidden directory on the local computer. It uses a URL-like syntax: query://hostname/indexname. The hostname is the name of the computer where the catalog is located; the indexname is the catalog name on that computer. If no name is supplied for this property, the default catalog on the computer is used (Web, in this situation, because IIS is installed).
The CreateRecordset method executes the request and returns the ADO Recordset object. This recordset is then used to iterate through the search results to display a given subset.
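The way a restriction is assembled from search terms and scope parameters can be sketched as plain string handling. The example below is written in JavaScript for illustration only; in this article the equivalent concatenation happens in CFML on the server. The helper buildRestriction and its option names (filePattern, vpathPattern) are assumptions made for this sketch, and it does no validation of what the user typed.

```javascript
// Hypothetical helper: combines the search terms with optional
// #filename and #vpath restrictions using the AND operator.
function buildRestriction(searchTerms, options) {
  var parts = [searchTerms];
  if (options && options.filePattern) {
    parts.push("#filename " + options.filePattern);
  }
  if (options && options.vpathPattern) {
    parts.push("#vpath " + options.vpathPattern);
  }
  return parts.join(" AND ");
}

// Only HTML files:
console.log(buildRestriction("coldfusion", { filePattern: "*.htm" }));
// Only HTML files in the docs folder:
console.log(buildRestriction("coldfusion", { filePattern: "*.htm", vpathPattern: "\\docs*" }));
```

The first call yields a restriction of the same form as the one in Listing 2; the second adds the #vpath example from the list above.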
Typically, a Rico Ajax response is represented in well-formed XML format, delimited by the <ajax-response></ajax-response> tags. The content of the response is also delimited by <response></response> tags; see Listing 3:
Listing 3. A sample Ajax response
<ajax-response>
    <response type="element" id="SearchResults">
        <h1>Search results</h1>
        <p>Results <strong>1 - 10</strong> of <strong>50</strong>
        (page 1 of 5) <a href='javascript:getNextResults()'>Next results</a></p>
        <p><a href='/cfdocs/htmldocs/introa.htm' target='_blank'></a><br />
        Getting Started Building Blackstone Applications is intended for anyone
        who needs to begin programming in the Macromedia Blackstone development environment.</p>
        <p><a href='/cfdocs/htmldocs/introb.htm' target='_blank'></a><br />
        CFML Reference is your primary ColdFusion Markup Language (CFML) reference.
        Use this book to learn about CFML tags and functions, ColdFusion expressions, and using
        JavaScript objects for WDDX in Macromedia ColdFusion MX.</p>
    </response>
</ajax-response>
A response carries two attributes: type and id. Together they specify that the response is meant to update the content of an HTML element on the page (type="element"), namely the one identified by the given id. The content of the response must be valid XHTML. In the example, the element to update on the search page is a div section whose id attribute (SearchResults) matches the id in the Ajax response.
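As a rough illustration of this envelope, the following sketch wraps an XHTML fragment in the two elements shown in Listing 3. The function wrapAjaxResponse is a hypothetical name introduced here; it assumes the fragment passed in is already well-formed XHTML and that the id contains no characters needing escaping.

```javascript
// Hypothetical helper: wraps an XHTML fragment in the envelope from Listing 3.
// elementId must match the id of the element registered on the client page.
function wrapAjaxResponse(elementId, xhtmlFragment) {
  return "<ajax-response>" +
    '<response type="element" id="' + elementId + '">' +
    xhtmlFragment +
    "</response>" +
    "</ajax-response>";
}

console.log(wrapAjaxResponse("SearchResults", "<h1>Search results</h1>"));
```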
The search results are returned in the form of an ADO Recordset. All you have to do is loop through it to access and display the results (see Figure 6):
Figure 6. Search results displayed
The search results are displayed as a list, which includes the title of the document (DocTitle) as a hyperlink and the summary of the document (characterization).
To make sure data from the catalog is transmitted in a valid XML format, you use the ColdFusion function XmlFormat. Its role is to escape special XML characters in a string so the string can be used as simple text in an XML document.
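To show what this kind of escaping involves, here is a small JavaScript sketch that replaces the five characters XML treats as special. It is an approximation for illustration, not ColdFusion's implementation of XmlFormat, and the name escapeXml is an assumption of this sketch.

```javascript
// Illustrative approximation of XML escaping (not the XmlFormat source).
// The ampersand is replaced first so that the entities added by the
// later replacements are not escaped a second time.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

console.log(escapeXml("Tips & <Tricks>"));
// prints "Tips &amp; &lt;Tricks&gt;"
```

Without this step, a document title containing an ampersand or an angle bracket would make the whole response unparseable on the client.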
Page through the search results
At the search-page level (on the client), the number of the page of results currently displayed is kept in memory (CurrentPage). Two JavaScript functions start a request for the previous page (getPreviousResults) and the next page (getNextResults).
On the server side, you use the PageSize property of the recordset to split the search results into chunks to limit the number of results displayed simultaneously in the browser. After this property is set, you can retrieve the number of pages (PageCount) that are available in the recordset. This way, you can add the necessary navigation links in the XHTML code that is returned to the browser.
As you can see in Listing 2, you also specify which subset (page) of the search results to display with the help of the AbsolutePage property of the Recordset. The value given to this property was passed to the page on the URL.
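The paging arithmetic behind Listing 2 can be sketched independently of ADO. In the article the recordset supplies PageCount and AbsolutePosition itself; the hypothetical pageBounds function below only mirrors that logic in JavaScript, including the fallback to page 1 when the requested page is out of range.

```javascript
// Hypothetical helper mirroring the paging logic of Listing 2.
// recordCount: total results, pageSize: results per page,
// requestedPage: 1-based page number passed on the URL.
function pageBounds(recordCount, pageSize, requestedPage) {
  var pageCount = Math.ceil(recordCount / pageSize);
  // Out-of-range requests fall back to the first page, as in Listing 2.
  var page = (requestedPage < 1 || requestedPage > pageCount) ? 1 : requestedPage;
  var first = (page - 1) * pageSize + 1;
  // The last page may hold fewer than pageSize records.
  var last = Math.min(first + pageSize - 1, recordCount);
  return {
    page: page,
    pageCount: pageCount,
    first: first,
    last: last,
    hasPrevious: page > 1,
    hasNext: page < pageCount
  };
}

var bounds = pageBounds(45, 10, 5);
console.log("Results " + bounds.first + " - " + bounds.last +
  " (page " + bounds.page + " of " + bounds.pageCount + ")");
// prints "Results 41 - 45 (page 5 of 5)"
```

The hasPrevious and hasNext flags correspond to the conditions under which Listing 2 writes the Previous results and Next results links.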
Windows Indexing Service offers a simple means of adding search functionalities to a Web site or intranet, right out of the box. By associating it with Ajax and mature server technologies, you can turn this functionality into an efficient and user-friendly search facility.
| Description | Name | Size | Download method |
|---|---|---|---|
| ColdFusion scripts for this article | wa-aj-rico.zip | 2KB | HTTP |
Learn
- Read about Windows Indexing Service, a base service for Windows to index the content of IIS servers.
- The Rico JavaScript library is an open source Ajax framework.
- Adobe ColdFusion is an application server and development framework.
- Ruby on Rails is an open source Web framework optimized for sustainable productivity.
- Learn about PHP, a general-purpose scripting language suited for Web development.
- Apache Lucene is an open source, high-performance, full-featured text search engine library.
- Learn about Apache Solr, an open source search server based on the Lucene search library.
- The developerWorks Ajax resource center is packed with tools, code, and information to get you started developing slick Ajax applications today.
- With Web 2.0 being a hot area within development circles, you'll find an ever-growing collection of resources in the Web development zone.
Philippe Randour is a software and system engineer with more than 10 years of experience. You can reach Philippe at philippe@randour.net.