The DigestSearch method for Lotus Domino databases

Level: Introductory

Andrei Kouvchinnikov, Principal Domino Developer, Botstation Technologies

31 Jan 2006

This article introduces DigestSearch, an alternative solution for working with IBM Lotus Notes Profile documents and for performing simple, high-speed searches. For searching server-based databases from a Notes client, DigestSearch is twice as fast as any other search method available, outperforming both full-text search and LotusScript's GetDocumentByKey method.

The DigestSearch method is best described as an alternative to LotusScript's View.GetDocumentByKey search method and to Profile documents. Its main purpose is to find one or more documents using a single search word, for example, a Social Security number, a phone number, or the unique sequence number of a stored Structured Query Language (SQL) record.

The main advantage of DigestSearch is that for searching large databases (especially searching server-based databases from a Notes client) it outperforms all other conventional methods with respect to speed. In fact, depending upon the complexity of the search, the DigestSearch method can be up to 20 times faster! And DigestSearch does not need any views for performing searches, so you can reduce your database size by removing unnecessary views. The number of documents in your database does not substantially affect the speed of searches with DigestSearch.

The main disadvantage of DigestSearch is that it can accept only one non-wildcard keyword. In this respect, the method is similar to LotusScript's View.GetDocumentByKey method and to @DBLookup. So, if performance is important to you, and search keywords are predictable, consider looking into the DigestSearch method. Implementing DigestSearch in your databases doesn't require any significant design changes or modifications to existing documents. Although the method still has a bit of room for improvement (especially in the area of indexing and searching for multiple keywords), it can provide significant performance advantages today.

The sample databases discussed in this article, Digestprofile.nsf (the Profile documents database), Testindex.nsf (the Demo Index database), Digest2.nsf (DigestSearch for simple searches), and Demonab.nsf (the demo Domino directory), reside in the file Digest_dbs.zip, which is available for download from the Download section.

In this article, we show two ways that you can use the DigestSearch method:

  • As an efficient replacement of Profile documents and other kinds of temporary storage documents
  • As a way to search a Domino Directory and return personal information for all users in a specified group

This article assumes that you're an experienced Notes/Domino programmer.

Performance

When you compare the search speed of the DigestSearch method and traditional Domino search methods, as shown in the following two tables, you can see that for using a single keyword to find a single document, DigestSearch outperforms all other methods when the database resides on a server and you perform the search from a Notes client. On a local database, only View.GetDocumentByKey is faster. In our performance test, we measured the time required to get an object handle to 100 documents. We measured the results against the Domino Directory search example and simulated a search that consisted of several steps. You can run your own test using the Performance test agent in the Digest Search 2 database (Digest2.nsf).

Note: This performance test is not designed to be a general speed comparison among methods; the results are applicable only to the particular task of searching members of a group.

In the first table, simulated users performed a search from a Notes client where the database is on the server:

Search method               Time in seconds
DigestSearch                2.9
Db.Search                   13.1
Db.FTSearch                 6.1
View.GetDocumentByKey       5.8
@DBLookup                   12.1

The second table displays the results of searching from a Notes client on a local database:

Search method               Time in seconds
DigestSearch                1.2
Db.Search                   9.7
Db.FTSearch                 2.8
View.GetDocumentByKey       0.9
@DBLookup                   1.2

As you can see, when run locally, @DBLookup ties with DigestSearch as the second fastest method; but when run against a server-based database, it is the second slowest. (However, returning a document handle instead of pure text results could have affected the results.) Another interesting fact is that Db.Search is only about 35 percent slower when searching on a server, while the other methods become at least twice as slow.

Note: In this test, we used @DBLookup with the Cache parameter and with several keywords combined into the same query, which isn't always possible for real-world searches.

How DigestSearch works

You implement the DigestSearch method entirely in LotusScript, and its core code takes approximately 30 lines. You can use DigestSearch for background searches, and its syntax and functionality are reminiscent of the View.GetAllDocumentsByKey("searchword", True) method. The main similarity between these two methods is that they both take one keyword as a parameter and return one or more exact matching documents as a result. The differences are that DigestSearch does not require views to perform the search, and it always makes an exact match for a searched word.

You call DigestSearch using this LotusScript command:

Set doc=FindDocByDigestKey(db, "searchword")

The method is called DigestSearch because it uses a digest (one-way hash) value to represent a search word. This key is the hashed search word, a practically unique 32-character hexadecimal string that is in turn used as the Universal ID (UNID) of a document. Thus, DigestSearch does not really search the database; it simply checks whether or not a document with a particular UNID exists.

Here's a simple example of a search flow. When you understand how the flow works, modifications and customizations are easy:

  1. The user searches for the phone number +1 212 12345678 to find the name of the number's owner.
  2. +1 212 12345678 is converted to a digest value with the help of the @Password function. The resulting digest/hash string is 3F915F67F52D35053113AAB40385FE46.
  3. The script checks whether or not a document with UNID 3F915F67F52D35053113AAB40385FE46 exists in the database.
  4. If a document is found, the search is over. The script reads the fields from the document and displays them to the user or simply opens the document in the user interface. If no document is found, the user sees a message stating that no document with that keyword exists.

The following code shows a simplified working @formula example for performing a digest search:

searchword:="+1 212 12345678";
fullname:=@GetDocField(@Middle(@Password(searchword);1;32);"FullName");
@Prompt([Ok];"Result for "+searchword;
@If(@IsError(fullname);"Document "+searchword+" does not exist";
"Fullname: "+fullname))

And this is the LotusScript agent equivalent of the preceding code. It reads the phone number from a field on the current form:

Option Public
Use "DigestSearchLib"
Sub Initialize
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim doc As NotesDocument, curdoc As NotesDocument
    Dim workspace As New NotesUIWorkspace
    Dim uidoc As NotesUIDocument
    Dim searchstring As String

    Set uidoc=workspace.CurrentDocument
    Set curdoc=uidoc.Document ' back-end document behind the open form
    Set db=session.CurrentDatabase

    searchstring=curdoc.inputfield(0) ' Phone number input field on form
    ' Exit if no phone number was supplied
    If searchstring="" Then Exit Sub
    ' Perform DigestSearch and return the document handle
    Set doc=FindDocByDigestKey(db, searchstring)
    If Not doc Is Nothing Then
        ' Populate fields in the current document with information
        ' from the found document
        curdoc.FullName=doc.FirstName(0)+" "+doc.LastName(0)
        curdoc.Address=doc.Address(0)
        curdoc.Job=doc.JobDescription(0)
    Else
        ' No document was found; notify the user
        Msgbox "Info for "+searchstring+" not found"
    End If
End Sub

In the background script library, the following process occurs when this agent runs: The search word is converted to a digest using the @Password method. (This step produces a UNID-compatible 32-character string.)

ev=Evaluate(|@Password("|+skey+|")|)
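
One caveat with building the formula by string concatenation: a search word that contains a double quote or a backslash would break the @Password call. The following is a minimal sketch of one way to guard against that, assuming LotusScript's Replace function (Notes/Domino 6 or later). The function name SafeDigest is hypothetical and is not part of the sample databases:

```lotusscript
Function SafeDigest(skey As String) As String
    ' Hypothetical helper, not part of DigestSearchLib.
    Dim escaped As String
    Dim ev As Variant
    ' Escape backslashes first, then double quotes, so that the search
    ' word stays a valid formula-language string literal.
    escaped = Replace(skey, "\", "\\")
    escaped = Replace(escaped, |"|, |\"|)
    ev = Evaluate(|@Password("| + escaped + |")|)
    ' @Password returns the digest wrapped in parentheses;
    ' keep only the 32 characters inside them.
    SafeDigest = Mid(ev(0), 2, 32)
End Function
```

Because the formula parser removes the escape characters again, the digest should be the same one that @Password computes for the unescaped search word.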

The library checks whether or not a document with that UNID exists:

On Error 4091 Goto wrongiderr4091
Set digestdoc=digestdb.GetDocumentByUNID(Mid(ev(0),2,32))

Note: You must handle possible Invalid universal ID errors, which will occur if the search digest has no corresponding document.

If the lookup did not raise error 4091 (Invalid universal ID), the script passes the handle of the document back to the calling function:

Set FindDocByDigestKey=digestdoc
Exit Function
wrongiderr4091:
Set FindDocByDigestKey=Nothing

The script shows values from the found document to the user:

curdoc.FullName=doc.FirstName(0)+" "+doc.LastName(0)

This code shows the whole DigestSearchLib LotusScript library, which contains the core of the DigestSearch method:

' ---- DigestSearchLib script library start ------
Dim digestdb As NotesDatabase
Dim digestdoc As NotesDocument
Dim lastkey As String
Dim lastgeneratedID As String
Dim lastgeneratedDoc As NotesDocument

Function FindDocByDigestKey(custdb As NotesDatabase, skey As String) _
As NotesDocument ' this is the main function for getting the search result
    Dim unid As String
    Set digestdb=custdb
    On Error Goto errh
    unid=CalculateDigest(skey)
    If Len(unid)<>32 Then
        ' CalculateDigest found no document for this keyword
        Set FindDocByDigestKey=Nothing
        Exit Function
    End If
    ' lastgeneratedDoc is a global variable set by IsDigestKeyTaken
    Set digestdoc=lastgeneratedDoc
    ' return the document handle to the calling agent
    Set FindDocByDigestKey=digestdoc
    Set digestdoc=Nothing
    Exit Function
errh:
    Set FindDocByDigestKey=Nothing ' no document for that keyword was found
    Resume endas
endas:
End Function

Function IsDigestKeyTaken(unid As String) As Boolean
    ' this function checks whether a document for the keyword already exists
    On Error Resume Next
    ' error 4091 means Invalid universal ID
    On Error 4091 Goto wrongiderr4091
    ' clear any handle left over from an earlier search
    Set digestdoc=Nothing
    ' check whether a document with this digest UNID already exists
    Set digestdoc=digestdb.GetDocumentByUNID(unid)
    IsDigestKeyTaken=Not (digestdoc Is Nothing)
    Set lastgeneratedDoc=digestdoc
    Exit Function
wrongiderr4091:
    IsDigestKeyTaken=False
    Resume wrongID
wrongID:
End Function

Function CalculateDigest(skey As String) As String
    ' this function computes the 32-character digest for the keyword
    Dim unid As String
    Dim ev As Variant
    ' calculate the digest for the keyword
    ev=Evaluate(|@Password("|+skey+|")|)
    unid=Mid(ev(0),2,32) ' strip the parentheses around the generated digest
    lastgeneratedID=unid ' remember the digest in a global variable
    If Not IsDigestKeyTaken(unid) Then
        unid="no doc with that digest yet"
    End If
    CalculateDigest=unid
End Function

Sub MakeSearchable(sdoc As NotesDocument)
    ' this sub assigns the last computed digest as the document's
    ' Universal ID; the caller must save the document afterward
    sdoc.UniversalID=lastgeneratedID
End Sub
' ----- script library end ------
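
The listing finds documents, but it does not show how a document gets its digest UNID in the first place. The following is a minimal sketch of one way to create a searchable document with the library's own functions, assuming that the library makes its functions visible to the calling agent (for example, through Option Public). The sub name CreateProfileIfMissing is hypothetical; the field names comment and doccreated are borrowed from the Profile document formula example in this article:

```lotusscript
Use "DigestSearchLib"

Sub CreateProfileIfMissing(db As NotesDatabase, profname As String)
    ' Hypothetical helper, not part of the sample databases.
    Dim doc As NotesDocument
    ' The lookup also leaves the computed digest in the library's
    ' lastgeneratedID variable, which MakeSearchable reads.
    Set doc = FindDocByDigestKey(db, profname)
    If Not doc Is Nothing Then Exit Sub ' a document for this keyword exists
    Set doc = db.CreateDocument
    ' Assign the digest as the UNID before the first save.
    Call MakeSearchable(doc)
    Call doc.ReplaceItemValue("comment", "Created by CreateProfileIfMissing")
    Call doc.ReplaceItemValue("doccreated", CStr(Now))
    Call doc.Save(True, False)
End Sub
```

Calling FindDocByDigestKey first matters in this sketch: MakeSearchable uses whatever digest the library computed last, so the lookup and the UNID assignment must refer to the same keyword.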

Using DigestSearch to work with Profile documents

You have probably noticed that Lotus Notes caches Profile documents. This behavior can be a problem if two or more users are simultaneously modifying the Profile document. Because of caching issues, users get old, outdated values. But you can overcome this problem by using DigestSearch instead of standard GetProfileDocument functionality as shown in the following code snippet:

Set db=session.CurrentDatabase
' Perform the search and return a document handle
Set profiledoc=FindDocByDigestKey(db, "My Profile 1")
If Not profiledoc Is Nothing Then
    MsgBox profiledoc.Field1(0) ' show a field from the profile document
    profiledoc.Field2=CStr(Now) ' update the profile with a new field value
    Call profiledoc.Save(True,False) ' save the profile document
End If

You can even use formula language for accessing the new digest Profile documents as shown in this code:

profname:="My Profile 1";
comment:=@GetDocField(@Middle(@Password(profname);1;32); "comment");
createddate:=
@GetDocField(@Middle(@Password(profname);1;32); "doccreated");
@Prompt([Ok];"Result for "+profname; @If(@IsError(comment);
"Profile "+profname+" does not exist";
"Comment: "+comment+@Char(10)+"Created: "+createddate))

Unfortunately, you can't create new digest Profile documents using the Notes @formula language. You can only search existing documents and modify them (using @SetDocField). The Download section of this article contains a ZIP file that includes a database called "DigestSearch demo for profile docs" (Digestprofile.nsf) that contains the "Modify example with Formula" agent source code and examples. This database also contains both LotusScript and @formula code for you to test (see Figure 1). Click Create profile doc in the Instructions view to create a new profile, and then click Find profile doc to find the profile you've just created.


Figure 1. Action buttons for profile testing

Using DigestSearch to work with simple searches

Working with searches is more complicated than working with Profile documents. The number of Profile documents is often limited; they're created on demand; and they don't depend upon other documents. Such is not the case when you're searching a database containing hundreds of thousands of documents that are linked with each other and with other databases.

To maintain the consistency of existing databases, you can't modify the UNID of the documents in those databases directly. Therefore, you need a special mirror/index database in which to keep index documents for documents in the parent database.

The index database does not need any design elements; it simply contains two types of documents:

  • Reference holder documents
  • Keyword holder documents

One reference holder document (a SourceRefHolder form) exists for each indexed document in the source database. This reference holder document contains two dynamically created fields: SKey and REFUNIDs. You use the SKey field only as a reminder of the UNID of the original document in the source database, not for searching. The REFUNIDs field is a multi-value field that you use to keep track of the keyword holder documents. That is, the field contains the keyword holder documents' UNIDs.

The number of keyword holder documents per source document is the same as the number of search fields in each document. You specify these fields in the Configuration document as shown in Figure 2. Each index document also has a multi-value field, UNIDs, which contains the UNIDs of all documents in the source database that match the keyword assigned to that index document.


Figure 2. The Configuration document

The source code of the search is similar to the Profile document example described earlier in this article. The differences are:

  • For searching, you use the Demo Index database (Testindex.nsf) instead of the current database.
  • You keep track of multiple keywords in a special document.

Assuming that the Demo Index database is updated and contains all the documents from the source database (that is, the Domino Directory), the search produces the same result as if you searched the source database.

There are three main ways to update the Demo Index database with new documents:

  • Running the QuerySave script on documents in the source database
  • Using a scheduled agent
  • Using an add-on server program that transparently performs the update immediately when documents are modified

Note: The Digest Search 2 database (Digest2.nsf) includes an agent called Process database index that creates an index for the documents from the source database.

Example of searching a Domino Directory

Two agents for performing Domino Directory searches are included in the example database Digest Search 2. One agent (Find users by first name) finds all Domino Directory documents in which the first name of the person matches the name you typed in the input box. The other agent (Find group members) finds all Person documents for people in the specified group. You need three databases to run the sample agents as shown in Figure 3:

  • Digest Search 2 (Digest2.nsf)
  • The Demo NAB database (Demonab.nsf); you can also use a copy of your Domino Directory
  • An empty database for holding the index, Demo Index (Testindex.nsf)


Figure 3. Databases for the sample agents

All three of these databases are contained within the ZIP file in the Download section of this article.

In the Digest Search 2 database, you must configure the locations of both the Demo NAB and the Demo Index databases as shown in Figures 2 and 4. After you have configured the paths to the databases, you must populate the index database with information from the Demo NAB. To do so, click Synchronize Index in the Configuration view (see Figure 4).


Figure 4. Action buttons in Configuration view

After indexing is complete, you can perform your search. Simply click "Find persons by first name" or "Find all users in a group." Then type the user's first name or group name and click OK. In our example, the user's first name is John. You would change it to an appropriate name that exists in your directory (see Figure 5).


Figure 5. Search by first name

If everything is configured properly, a window similar to the one shown in Figure 6 appears.


Figure 6. Results of the first name search

When you click "Find all users in a group," the following background events occur:

  1. The script identifies the matching index document (that is, the keyword holder document) for keyword listname=demogroup.
  2. In that index document, the script identifies the UNID of the source document in the source database.
  3. The script reads the Members field from the source document, and for each person in that field, the script performs a new search with the new constructed keyword "fullname="+doc.members(x).
  4. The search loops until all people in the field are processed, and then combines the answer into a text string.
  5. MessageBox displays the resulting text string to the user.
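
The steps above can be sketched roughly as follows. This is not the agent code from Digest2.nsf: the function names FirstSourceDoc and ListGroupMembers are hypothetical, the sketch assumes that DigestSearchLib exposes FindDocByDigestKey to the caller, and it passes keyword values through unchanged because the article does not describe how the indexing agent normalizes case:

```lotusscript
Use "DigestSearchLib"

Function FirstSourceDoc(indexdb As NotesDatabase, sourcedb As NotesDatabase, _
keyword As String) As NotesDocument
    ' Hypothetical helper: keyword holder document -> first source document.
    Dim indexdoc As NotesDocument
    Set indexdoc = FindDocByDigestKey(indexdb, keyword)
    If indexdoc Is Nothing Then Exit Function
    On Error Goto notfound
    ' The UNIDs field lists the matching documents in the source database.
    Set FirstSourceDoc = sourcedb.GetDocumentByUNID(indexdoc.GetItemValue("UNIDs")(0))
    Exit Function
notfound:
    Set FirstSourceDoc = Nothing
    Resume done
done:
End Function

Function ListGroupMembers(indexdb As NotesDatabase, sourcedb As NotesDatabase, _
groupname As String) As String
    Dim groupdoc As NotesDocument
    Dim persondoc As NotesDocument
    Dim members As Variant
    Dim result As String
    Dim x As Integer

    ' Steps 1 and 2: find the group document through the index database.
    Set groupdoc = FirstSourceDoc(indexdb, sourcedb, "listname=" + groupname)
    If groupdoc Is Nothing Then Exit Function

    ' Steps 3 and 4: run one new digest search per member.
    members = groupdoc.GetItemValue("Members")
    For x = LBound(members) To UBound(members)
        Set persondoc = FirstSourceDoc(indexdb, sourcedb, "fullname=" + CStr(members(x)))
        If Not persondoc Is Nothing Then
            result = result + persondoc.GetItemValue("FullName")(0) + Chr(10)
        End If
    Next
    ' Step 5: the caller displays this string with MessageBox.
    ListGroupMembers = result
End Function
```

Each member costs two UNID lookups (one in the index database, one in the source database), which is why the number of documents in either database has little effect on the total time.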

When to (and when not to) use DigestSearch

Use DigestSearch when you need to find one or more documents by a single keyword quickly. Profile documents and documents for the temporary storage of data are the most obvious usage areas. Similarly, use DigestSearch when, for whatever reason, you can't apply traditional search methods, but need a high-performance search solution (preferably for static data, such as in the Domino Directory).

Here are practical examples for when you should consider using DigestSearch:

  • To find a Person document by Social Security number or phone number
  • To find a product by its article number
  • When synchronizing with SQL, to determine whether or not a record already exists in the Notes database by its unique record number
  • To get particular fields from a user's document in the Domino Directory for all members in a particular group

Practical examples of when not to use DigestSearch include:

  • The speed of a search is satisfactory with standard search methods (<1 second).
  • Your database is not large (<5,000 documents).
  • You need to define more than one search keyword (for example, Form="Document" & Status="Active").
  • You need to use more flexible search keywords (for example, @Left(Product;2)="Di", @Created=@Today).
  • The number of documents returned as a search result for a particular keyword is greater than 500.

The DigestSearch method was originally created for improving the speed of synchronization for large numbers of SQL records with a Domino database. Every SQL record has its own Unique Sequence Number, and you can also compute a custom unique number for all combined fields in the record. Every record exists only once, which makes it a perfect situation to which to apply the DigestSearch method. You can avoid caching existing documents (in our case, it was more than 1,000,000 documents) into an array or list, and synchronization time can be reduced from 25 minutes to 10 minutes.

A to-do list for the next DigestSearch version

As we mentioned earlier, DigestSearch does have some room for improvement. Some issues are easy to overcome, while others would require new approaches. Here are some changes that we'd like to see in future versions of the DigestSearch method:

  • The agent that updates the index database with new documents is currently rather slow and is not optimized, which can be fixed by adding a couple of tracking fields to the index documents.
  • Check whether or not implementing incremental indexing is possible.
  • Develop a server add-on that automatically adds new and modified documents from the source database to the index database.
  • Use the same index document to store references for 4,096 source documents instead of having one index document for each source document. (More investigation is required to determine whether this change would affect performance.)
  • Allow several search keywords per search (for example, both last name and city). It's possible to accomplish this functionality by comparing results from two or more separate searches and eliminating non-matching results.
  • Make it possible for several source databases to use the same index database.
  • Consider the possibility of implementing the B-Tree search technique to improve the performance and manageability of the index. (We are currently investigating whether using a customized B-Tree algorithm can make it possible to search with the help of wild cards and to store index documents more compactly. Together with DigestSearch's ability to instantly find a particular keyword, it could become an unbeatable solution.)
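
The multiple-keyword item above, comparing the results of separate searches and eliminating non-matching results, could be sketched as an intersection of two UNIDs fields. This is only an illustration under stated assumptions: the function name CommonUNIDs is hypothetical, and the inputs are assumed to be the UNIDs values read from two keyword holder documents:

```lotusscript
Function CommonUNIDs(first As Variant, second As Variant) As Variant
    ' Hypothetical helper: returns the UNIDs present in both arrays,
    ' or an array holding one empty string when nothing matches.
    Dim seen List As Integer
    Dim result() As String
    Dim count As Integer
    Dim i As Integer

    ReDim result(0)
    ' Remember every UNID from the first search.
    For i = LBound(first) To UBound(first)
        seen(CStr(first(i))) = 1
    Next
    ' Keep only the UNIDs from the second search that were also in the first.
    For i = LBound(second) To UBound(second)
        If IsElement(seen(CStr(second(i)))) Then
            ReDim Preserve result(count)
            result(count) = CStr(second(i))
            count = count + 1
        End If
    Next
    CommonUNIDs = result
End Function
```

A caller would pass in the GetItemValue("UNIDs") results of two index documents, for example the ones found for a last-name keyword and a city keyword, and then open the surviving UNIDs in the source database.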

Speed does matter

Under normal circumstances, DigestSearch gets results faster than regular search methods. If you do not need the luxury of the advanced queries that Db.FTSearch and Db.Search can offer, but rather are looking for an alternative for Profile documents, the GetDocumentByKey method, or @DBLookup, download the sample databases and see if DigestSearch can be useful for you. With the help of the examples included in this article and the sample databases, you can test the functionality of DigestSearch in a matter of minutes and implement it for your own applications simply by changing the Configuration document.






Download

Description                             Name              Size      Download method
Sample applications for this article    digest_dbs.zip    1284KB    HTTP



About the author

Andrei Kouvchinnikov is a certified Principal Domino Developer and Administrator. His experience includes full life cycle development of Lotus Domino applications running on multiple platforms and development of applications for QuickPlace and Sametime. He has been working with the Lotus Domino platform since R4.5 for OS/2. You can reach Andrei at andrei@botstation.com.



