The DigestSearch method is best described as an alternative to LotusScript's View.GetDocumentByKey search method and to Profile documents. Its main purpose is to find one or more documents using a single search word, for example, a Social Security number, a phone number, or the unique sequence number of a stored Structured Query Language (SQL) record.
The main advantage of DigestSearch is that for searching large databases (especially searching server-based databases from a Notes client) it outperforms all other conventional methods with respect to speed. In fact, depending upon the complexity of the search, the DigestSearch method can be up to 20 times faster! And DigestSearch does not need any views for performing searches, so you can reduce your database size by removing unnecessary views. The number of documents in your database does not substantially affect the speed of searches with DigestSearch.
The main disadvantage of DigestSearch is that it can accept only one non-wildcard keyword. In this respect, the method is similar to LotusScript's View.GetDocumentByKey method and to @DBLookup. So, if performance is important to you, and search keywords are predictable, consider looking into the DigestSearch method. Implementing DigestSearch in your databases doesn't require any significant design changes or modifications to existing documents. Although the method still has a bit of room for improvement (especially in the area of indexing and searching for multiple keywords), it can provide significant performance advantages today.
The sample databases discussed in this article, Digestprofile.nsf (the Profile documents database), Testindex.nsf (the Demo Index database), Digest2.nsf (DigestSearch for simple searches), and Demonab.nsf (the demo Domino directory), reside in the file Digest_dbs.zip, which is available for download from the Download section.
In this article, we show two ways that you can use the DigestSearch method:
- As an efficient replacement of Profile documents and other kinds of temporary storage documents
- As a way to search a Domino Directory and return personal information for all users in a specified group
This article assumes that you're an experienced Notes/Domino programmer.
When you compare the search speed of the DigestSearch method and traditional Domino search methods, as shown in the following two tables, you can see that for using a single keyword to find a single document, DigestSearch outperforms all other methods, especially if the database resides on a server and you perform the search from a Notes client. In our performance test, we measured the time required to get an object handle to 100 documents. We measured the results against the Domino Directory search example and simulated a search that consisted of several steps. You can run your own test using the Performance test agent in the Digest Search 2 database (Digest2.nsf).
Note: This performance test is not designed to be a general speed comparison among methods; the results are applicable only to the particular task of searching members of a group.
In the first table, simulated users performed a search from a Notes client where the database is on the server:
|Search method||Time in seconds|
The second table displays the results of searching from a Notes client on a local database:
|Search method||Time in seconds|
As you can see, when run locally, @DBLookup is the second fastest method; but when run against a server-based database, it is the second slowest. (However, returning a document handle instead of pure text results could have affected the results.) Another interesting fact is that Db.Search is only 30 percent slower for searching on a server, while other methods become at least twice as slow.
Note: In this test, we used @DBLookup with the Cache parameter and with several keywords combined into the same query, which isn't always possible for real-world searches.
You implement the DigestSearch method entirely in LotusScript, and its core code takes approximately 30 lines. You can use DigestSearch for background searches, and its syntax and functionality are reminiscent of the View.GetAllDocumentsByKey("searchword", True) method. The main similarity between these two methods is that they both take one keyword as a parameter and return one or more exact matching documents as a result. The differences are that DigestSearch does not require views to perform the search, and it always makes an exact match for a searched word.
You call DigestSearch using this LotusScript command:
Set doc=FastSearchByKey(db, "searchword")
The method is called DigestSearch because it uses a digest (one-way hash) value to represent a search word. The @Password function hashes the search word into a 32-character hexadecimal string, which is in turn used as the Universal ID (UNID) of a document. Thus, DigestSearch does not really search the database; it simply checks whether or not a document with a particular UNID exists.
Here's a simple example of a search flow. When you understand how the flow works, modifications and customizations are easy:
- The user searches for the phone number +1 212 12345678 to find the name of the number's owner.
- +1 212 12345678 is converted to a digest value with the help of the @Password function. The resulting digest/hash string is 3F915F67F52D35053113AAB40385FE46.
- The script checks whether or not a document with UNID 3F915F67F52D35053113AAB40385FE46 exists in the database.
- If a document is found, the search is over. The script reads the fields from the document and displays them to user or simply opens the document in the user interface. If no document was found, the user sees a message stating that no document with such a keyword exists.
The following code shows a simplified working @formula example for performing a digest search:

    searchword := "+1 212 12345678";
    fullname := @GetDocField(@Middle(@Password(searchword); 1; 32); "FullName");
    @Prompt([Ok]; "Result for " + searchword;
        @If(@IsError(fullname);
            "Document " + searchword + " does not exist";
            "Fullname: " + fullname))

The following LotusScript agent performs the equivalent search and copies fields from the found document into the current document:

    Option Public
    Use "DigestSearchLib"

    Sub Initialize
        Dim session As New NotesSession
        Dim workspace As New NotesUIWorkspace
        Dim db As NotesDatabase
        Dim doc As NotesDocument, curdoc As NotesDocument
        Dim uidoc As NotesUIDocument
        Dim searchstring As String

        Set uidoc = workspace.CurrentDocument
        Set curdoc = uidoc.Document
        Set db = session.CurrentDatabase

        searchstring = curdoc.inputfield(0)   ' Phone number input field on form
        ' Exit if no phone number was supplied
        If searchstring = "" Then Exit Sub

        ' Perform DigestSearch and return document handle
        Set doc = FastSearchByKey(db, searchstring)
        If Not doc Is Nothing Then
            ' Populate fields in current document with information
            ' from the found document
            curdoc.FullName = doc.FirstName(0) + " " + doc.LastName(0)
            curdoc.Address = doc.Address(0)
            curdoc.Job = doc.JobDescription(0)
        Else
            ' No document was found, notify user about it
            Msgbox "Info for " + searchstring + " not found"
        End If
    End Sub

In the background script library, the following process occurs when this agent runs: The search word is converted to a digest using the @Password function. (This step produces a UNID-compatible 32-character string.)
The library checks whether or not a document with that UNID exists:
On Error 4091 Goto wrongiderr4091
Set digestdoc= digestdb.GetDocumentByUNID(Mid(ev(0),2,32))
Note: You must handle possible Invalid universal ID errors, which will occur if the search digest has no corresponding document.
If digestdoc did not result in error 4091 Invalid universal ID, the script passes the handle of the document back to the calling function:

    Set FindDocByDigestKey=digestdoc
    Exit Function
    wrongiderr4091:
    Set FindDocByDigestKey=Nothing

The script shows values from the found document to the user:
This code shows the whole DigestSearchLib LotusScript library, which contains the core of the DigestSearch method:

    ' ---- DigestSearchLib script library start ------
    Dim digestdb As NotesDatabase
    Dim digestdoc As NotesDocument
    Dim lastkey As String
    Dim lastgeneratedID As String
    Dim lastgeneratedDoc As NotesDocument

    Function FindDocByDigestKey(custdb As NotesDatabase, skey As String) As NotesDocument
        ' this is main function for getting search result
        Dim unid As String
        Set digestdb=custdb
        On Error Goto errh
        unid=CalculateDigest(skey)
        If Len(unid)<>32 Then
            Set FindDocByDigestKey=Nothing
            Exit Function
        End If
        Set digestdoc = lastgeneratedDoc ' lastgeneratedDoc is a global variable
        'return document handle back to calling agent
        Set FindDocByDigestKey=digestdoc
        Set digestdoc=Nothing
        Exit Function
    errh:
        Set FindDocByDigestKey=Nothing 'no document for that keyword is found
        Resume endas
    endas:
    End Function

    Function IsDigestKeyTaken(unid As String)
        ' this function checks if document for the keyword already exist
        On Error Resume Next
        ' error nr 4091 means invalid Universal ID
        On Error 4091 Goto wrongiderr4091
        ' check if document with unique keyword already exist
        Set digestdoc= digestdb.GetDocumentByUNID(unid)
        IsDigestKeyTaken=True
        Set lastgeneratedDoc=digestdoc
        Exit Function
    wrongiderr4091:
        IsDigestKeyTaken=False
        Resume wrongID
    wrongID:
    End Function

    Function CalculateDigest(skey As String)
        'this function computes 32-character digest for the keyword
        Dim unid As String
        'calculate digest for the keyword
        ev=Evaluate(|@Password("|+skey+|")|)
        unid=Mid(ev(0),2,32) 'strip parentheses around generated digest
        lastgeneratedID=unid 'assign global variable a new value
        If IsDigestKeyTaken(unid)=False Then
            unid="no doc with that digest yet"
        End If
        CalculateDigest=unid
    End Function

    Sub MakeSearchable(sdoc As NotesDocument)
        ' this sub sets the new Universal ID; the caller saves the document
        unid=lastgeneratedID
        sdoc.UniversalID=unid
    End Sub
    ' ----- script library end ------

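The call examples in this article use the name FastSearchByKey, whereas the listing above defines FindDocByDigestKey. The sample databases may already reconcile the two names. If your copy of the library does not, a thin wrapper along the following lines is one possible way to bridge them. This is a sketch, not part of the original library, and it assumes that it is placed inside DigestSearchLib and that the library's routines are visible to callers (for example, through Option Public).

```lotusscript
' Hypothetical alias, not part of the published listing.
' It forwards to FindDocByDigestKey so that calls such as
'   Set doc = FastSearchByKey(db, "searchword")
' resolve to the function the library actually defines.
Function FastSearchByKey(db As NotesDatabase, skey As String) As NotesDocument
    Set FastSearchByKey = FindDocByDigestKey(db, skey)
End Function
```

Because CalculateDigest builds the @Password formula by string concatenation, a search word that contains a double quote would produce an invalid formula. Callers may want to validate or escape the keyword before passing it in.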
You have probably noticed that Lotus Notes caches Profile documents. This behavior can be a problem if two or more users are simultaneously modifying the Profile document. Because of caching issues, users get old, outdated values. But you can overcome this problem by using DigestSearch instead of standard GetProfileDocument functionality as shown in the following code snippet:

    Set db=session.CurrentDatabase
    ' Perform search and return a document handle
    Set profiledoc = FastSearchByKey(db, "My Profile 1")
    If Not profiledoc Is Nothing Then
        MsgBox profiledoc.Field1(0) ' show a field from profile document
        profiledoc.Field2=Cstr(Now) ' update profile with new field
        Call profiledoc.save(True,False) 'save profile document
    End If

You can even use formula language for accessing the new digest Profile documents as shown in this code:

    profname := "My Profile 1";
    comment := @GetDocField(@Middle(@Password(profname); 1; 32); "comment");
    createddate := @GetDocField(@Middle(@Password(profname); 1; 32); "doccreated");
    @Prompt([Ok]; "Result for " + profname;
        @If(@IsError(comment);
            "Profile " + profname + " does not exist";
            "Comment: " + comment + @Char(10) + "Created: " + createddate))

Unfortunately, you can't create new digest Profile documents using the Notes @formula language. You can only search existing documents and modify them (using @SetDocField). The Download section of this article contains a ZIP file that includes a database called "DigestSearch demo for profile docs" (Digestprofile.nsf) that contains the "Modify example with Formula" agent source code and examples. This database also contains both LotusScript and @formula code for you to test (see Figure 1). Click Create profile doc in the Instructions view to create a new profile, and then click Find profile doc to find the profile you've just created.
Figure 1. Action buttons for profile testing
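Because the formula language cannot create a digest Profile document, creation has to happen in LotusScript. The following is a minimal sketch of how that could look using the DigestSearchLib routines. The sub name CreateDigestProfile and the form name "Profile" are assumptions made for illustration; the comment and doccreated fields are the ones read by the formula example. It also assumes that the library's routines are visible to callers (for example, through Option Public).

```lotusscript
Use "DigestSearchLib"

' Hypothetical helper: creates a digest Profile document for profname
' unless one already exists.
Sub CreateDigestProfile(db As NotesDatabase, profname As String)
    Dim newdoc As NotesDocument

    ' The lookup has a useful side effect: it stores the digest of
    ' profname in the library's lastgeneratedID variable, which
    ' MakeSearchable reads afterward.
    If Not (FindDocByDigestKey(db, profname) Is Nothing) Then Exit Sub

    Set newdoc = db.CreateDocument
    Call newdoc.ReplaceItemValue("Form", "Profile")   ' form name is an assumption
    Call newdoc.ReplaceItemValue("comment", "Created by CreateDigestProfile")
    Call newdoc.ReplaceItemValue("doccreated", Cstr(Now))

    Call MakeSearchable(newdoc)     ' assigns the digest as the UNID
    Call newdoc.Save(True, False)   ' MakeSearchable itself does not save
End Sub
```

Note that the UNID is assigned before the first save, so the document is stored under the digest from the start.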
Working with searches is more complicated than working with Profile documents. The number of Profile documents is often limited; they're created on demand; and they don't depend upon other documents. Such is not the case when you're searching a database containing hundreds of thousands of documents that are linked with each other and with other databases.
To maintain the consistency of existing databases, you can't modify the UNID of the documents in those databases directly. Therefore, you need a special mirror/index database in which to keep index documents for documents in the parent database.
The index database does not need any design elements; it simply contains two types of documents:
- Reference holder documents
- Keyword holder documents
One reference holder document (a SourceRefHolder form) exists for each indexed document in the source database. This reference holder document contains two dynamically created fields: SKey and REFUNIDs. You use the SKey field only as a reminder of the UNID of the original document in the source database, not for searching. The REFUNIDs field is a multi-value field that you use to keep track of the keyword holder documents. That is, the field contains the keyword holder documents' UNIDs.
The number of keyword holder documents per source document is the same as the number of search fields in each document. You specify these fields in the Configuration document as shown in Figure 2. Each index document also has a multi-value field, UNIDs, which contains the UNIDs of all documents in the source database that match the keyword assigned to that index document.
Figure 2. The Configuration document
The source code of the search is similar to the Profile document example described earlier in this article. The differences are:
- For searching, you use the Demo Index database (Testindex.nsf) instead of the current database.
- You keep track of multiple keywords in a special document.
Assuming that the Demo Index database is updated and contains all the documents from the source database (that is, the Domino Directory), the search produces the same result as if you searched the source database.
There are three main ways to update the Demo Index database with new documents:
- Running the QuerySave script on documents in the source database
- Using a scheduled agent
- Using an add-on server program that transparently performs the update immediately when documents are modified
Note: The Digest Search 2 database (Digest2.nsf) includes an agent called Process database index that creates an index for the documents from the source database.
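Whichever update mechanism you choose, the core step is the same: for each search field of a source document, find or create the keyword holder document and record the source document's UNID in its UNIDs field. The sketch below illustrates that single step under stated assumptions. The sub name IndexKeyword is hypothetical, the keyword is assumed to follow the "fieldname=value" pattern used by the sample searches, and the reference holder document (SKey and REFUNIDs) is deliberately left out. The Process database index agent in Digest2.nsf may work differently.

```lotusscript
Use "DigestSearchLib"

' Hypothetical helper: records sourceUnid under keyword in the index database.
Sub IndexKeyword(indexdb As NotesDatabase, keyword As String, sourceUnid As String)
    Dim holder As NotesDocument
    Dim item As NotesItem

    ' Look up the keyword holder; this also primes lastgeneratedID
    Set holder = FindDocByDigestKey(indexdb, keyword)

    If holder Is Nothing Then
        ' First source document for this keyword: create the holder
        Set holder = indexdb.CreateDocument
        Call MakeSearchable(holder)   ' UNID = digest of keyword
        Call holder.ReplaceItemValue("UNIDs", sourceUnid)
    Else
        Set item = holder.GetFirstItem("UNIDs")
        If item Is Nothing Then
            Call holder.ReplaceItemValue("UNIDs", sourceUnid)
        Elseif Not item.Contains(sourceUnid) Then
            ' Keyword already indexed: add this source document once
            Call item.AppendToTextList(sourceUnid)
        End If
    End If

    Call holder.Save(True, False)
End Sub
```

A call such as IndexKeyword(indexdb, "fullname=" + persondoc.FullName(0), persondoc.UniversalID) would index one Person document by its full name.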
Two agents for performing Domino Directory searches are included in the example database Digest Search 2. One agent (Find users by first name) finds all Domino Directory documents in which the first name of the person matches the name you typed in the input box. The other agent (Find group members) finds all Person documents for people in the specified group. You need three databases to run the sample agents as shown in Figure 3:
- Digest Search 2 (Digest2.nsf)
- The Demo NAB database (Demonab.nsf); you can also use a copy of your Domino Directory
- An empty database for holding the index, Demo Index (Testindex.nsf)
Figure 3. Databases for the sample agents
All three of these databases are contained within the ZIP file in the Download section of this article.
In the Digest Search 2 database, you must configure the locations of both the Demo NAB and the Demo Index databases as shown in Figures 2 and 4. After you have configured the paths to the databases, you must populate the index database with information from the Demo NAB. To do so, click Synchronize Index in the Configuration view (see Figure 4).
Figure 4. Action buttons in Configuration view
After indexing is complete, you can perform your search. Simply click "Find persons by first name" or "Find all users in a group." Then type the user's first name or group name and click OK. In our example, the user's first name is John. You would change it to an appropriate name that exists in your directory (see Figure 5).
Figure 5. Search by first name
If everything is configured properly, a window similar to the one shown in Figure 6 appears.
Figure 6. Results of the first name search
When you click "Find all users in a group," the following background events occur:
- The script identifies the matching index document (that is, the keyword holder document) for keyword listname=demogroup.
- In that index document, the script identifies the UNID of the source document in the source database.
- The script reads the Members field from the source document, and for each person in that field, the script performs a new search with the new constructed keyword "fullname="+doc.members(x).
- The search loops until all people in the field are processed, and then combines the answer into a text string.
- MessageBox displays the resulting text string to the user.
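The steps above can be sketched in LotusScript as follows. This is an illustrative outline rather than the code of the sample agent: the function names GetSourceDoc and ListGroupMembers are hypothetical, and the sketch assumes that each keyword holder document stores the matching source UNIDs in its UNIDs field and that the library's routines are visible to callers.

```lotusscript
Use "DigestSearchLib"

' Hypothetical helper: returns Nothing instead of raising an error
' when no document with the given UNID exists.
Function GetSourceDoc(db As NotesDatabase, unid As String) As NotesDocument
    On Error Goto notfound
    Set GetSourceDoc = db.GetDocumentByUNID(unid)
    Exit Function
notfound:
    Set GetSourceDoc = Nothing
    Resume done
done:
End Function

' Hypothetical function: returns one full name per line for the group.
Function ListGroupMembers(indexdb As NotesDatabase, sourcedb As NotesDatabase, groupname As String) As String
    Dim holder As NotesDocument
    Dim groupdoc As NotesDocument
    Dim persondoc As NotesDocument
    Dim members As Variant
    Dim result As String

    ' Find the keyword holder document for the group
    Set holder = FindDocByDigestKey(indexdb, "listname=" + groupname)
    If holder Is Nothing Then Exit Function

    ' Follow the stored UNID to the Group document in the source database
    Set groupdoc = GetSourceDoc(sourcedb, Cstr(holder.GetItemValue("UNIDs")(0)))
    If groupdoc Is Nothing Then Exit Function

    ' Run one digest search per entry in the Members field
    members = groupdoc.GetItemValue("Members")
    Forall member In members
        Set holder = FindDocByDigestKey(indexdb, "fullname=" + Cstr(member))
        If Not (holder Is Nothing) Then
            Set persondoc = GetSourceDoc(sourcedb, Cstr(holder.GetItemValue("UNIDs")(0)))
            If Not (persondoc Is Nothing) Then
                result = result + Cstr(persondoc.GetItemValue("FullName")(0)) + Chr(10)
            End If
        End If
    End Forall

    ListGroupMembers = result
End Function
```

The caller can then display the result with Msgbox ListGroupMembers(indexdb, sourcedb, "demogroup").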
Use DigestSearch when you need to find one or more documents by a single keyword quickly. Profile documents and documents for the temporary storage of data are the most obvious usage areas. Similarly, use DigestSearch when, for whatever reason, you can't apply traditional search methods, but need a high-performance search solution (preferably for static data, such as in the Domino Directory).
Here are practical examples for when you should consider using DigestSearch:
- To find a Person document by Social Security number or phone number
- To find a product by its article number
- When synchronizing with SQL, to determine whether or not a record already exists in the Notes database by its unique record number
- To get particular fields from a user's document in the Domino Directory for all members in a particular group
Practical examples of when not to use DigestSearch include:
- The speed of a search is satisfactory with standard search methods (<1 second).
- Your database is not large (<5,000 documents).
- You need to define more than one search keyword (for example, Form="Document" & Status="Active").
- You need to use more flexible search keywords (for example, @Left(Product;2)="Di", @Created=@Today).
- The number of documents returned as a search result for a particular keyword is greater than 500.
The DigestSearch method was originally created for improving the speed of synchronization for large numbers of SQL records with a Domino database. Every SQL record has its own Unique Sequence Number, and you can also compute a custom unique number for all combined fields in the record. Every record exists only once, which makes it a perfect situation to which to apply the DigestSearch method. You can avoid caching existing documents (in our case, it was more than 1,000,000 documents) into an array or list, and synchronization time can be reduced from 25 minutes to 10 minutes.
As we mentioned earlier, DigestSearch does have some room for improvement. Some issues are easy to overcome, while others would require new approaches. Here are some changes that we'd like to see in future versions of the DigestSearch method:
- The agent that updates the index database with new documents is currently rather slow and is not optimized, which can be fixed by adding a couple of tracking fields to the index documents.
- Check whether or not implementing incremental indexing is possible.
- Develop a server add-on that automatically adds new and modified documents from the source database to the index database.
- Use the same index document to store references for 4,096 source documents instead of having one index document for each source document. (More investigation is required to determine whether this change would affect performance.)
- Allow several search keywords per search (for example, both last name and city). It's possible to accomplish this functionality by comparing results from two or more separate searches and eliminating non-matching results.
- Make it possible for several source databases to use the same index database.
- Consider the possibility of implementing the B-Tree search technique to improve the performance and manageability of the index. (We are currently investigating whether using a customized B-Tree algorithm can make it possible to search with the help of wild cards and to store index documents more compactly. Together with DigestSearch's ability to instantly find a particular keyword, it could become an unbeatable solution.)
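The multiple-keyword item in the list above suggests comparing the results of separate searches and eliminating non-matching results. One possible way to do that is to intersect the UNIDs fields of two keyword holder documents. The function below is a sketch of that idea. The name IntersectUnids is hypothetical, and the function is not part of DigestSearchLib.

```lotusscript
' Hypothetical helper: returns the UNIDs that appear in both arrays.
' a and b are expected to be text arrays, such as the values returned by
' NotesDocument.GetItemValue("UNIDs") on two keyword holder documents.
Function IntersectUnids(a As Variant, b As Variant) As Variant
    Dim seen List As Integer
    Dim result() As String
    Dim n As Integer

    ' Start with a single empty element, which mirrors how Notes
    ' represents an empty text item
    Redim result(0)

    ' Remember every UNID from the first result set
    Forall u In a
        seen(Cstr(u)) = 1
    End Forall

    ' Keep only the UNIDs from the second set that were also in the first
    Forall v In b
        If Iselement(seen(Cstr(v))) Then
            Redim Preserve result(n)
            result(n) = Cstr(v)
            n = n + 1
        End If
    End Forall

    IntersectUnids = result
End Function
```

For a search on both last name and city, you would look up the two keyword holder documents separately, pass their UNIDs values to this function, and then open each remaining UNID in the source database. When nothing matches, the result is a one-element array containing an empty string.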
Under normal circumstances, DigestSearch gets results faster than regular search methods. If you do not need the luxury of the advanced queries that FTSearch and Db.Search can offer, but rather are looking for an alternative for Profile documents, the GetDocumentByKey method, or @DBLookup, download the sample databases and see if DigestSearch can be useful for you. With the help of the examples included in this article and the sample databases, you can test the functionality of DigestSearch in a matter of minutes and implement it for your own applications simply by changing the Configuration document.
|Sample applications for this article||digest_dbs.zip||1284KB||HTTP|
The developerWorks Lotus article, "Notes application strategies: Interactive search," explains how to build an interactive search application.
Check out the Botstation Web site, where you can post suggestions and comments for DigestSearch.
Andrei Kouvchinnikov is a certified Principal Domino Developer and Administrator. His experience includes full life cycle development of Lotus Domino applications running on multiple platforms and development of applications for QuickPlace and Sametime. He has been working with the Lotus Domino platform since R4.5 for OS2. You can reach Andrei at firstname.lastname@example.org.