Working with file attachments in the LCLSX
The Lotus Connectors LotusScript Extension (LCLSX) easily enables synchronization of simple fields between Notes databases and relational databases -- text, date, and number fields present no problem. You can even store rich text fields as plain text or store them in a relational BLOB field (though the binary encoding of a rich text field is not usable except by Lotus Notes/Domino).
However, file attachments are a little trickier. Neither Notes native classes nor the LCLSX support direct access to the data in a file attachment. The only way to extract a file attachment from Lotus Notes is to detach it to disk, and the only way to create a file attachment is from a disk file. But the LCLSX can make things easier by doing the detaching and attaching for you automatically as you work with Notes data.
This article presents a step-by-step explanation of a sample application to copy Notes file attachments to and from a relational database. We assume that you already have some experience with the use and terminology of the LCLSX. If not, please refer to the IBM Redbook, Implementing IBM Lotus Enterprise Integrator 6. This book includes a chapter on the LCLSX that gives a good grounding in this technology with many examples.
The general theory of file attachments in the LCLSX
The Notes connector of the LCLSX contains the following properties that are used for handling file attachments:
- LoadFile is a Boolean property that when set to True, adds a special field named FILE to the result set returned by the Select method. In addition, whenever you perform a Fetch on a Notes document, any file attachments in the document are copied to a local file directory, and the FILE field contains their filenames (as a multivalue list). When you update or insert a Notes document, any files listed in the FILE field are attached to the Notes document.
- FilePath is a string value that defines the file path of the directory where attachments are detached when a Notes document is fetched, if LoadFile is True. If left blank, the current directory is used. Do not leave this blank! The current directory usually is the Notes program directory, which contains a lot of files you don't want to accidentally overwrite by detaching a file that happens to have the same name.
- CopyFile is a Boolean property that when set to True adds a special field named LCXFILE to the result set fieldlist returned by the Select method. If you insert or update another Notes document using this fieldlist, it copies all the attachments to the new document.
The first two properties are the ones we use in this article. CopyFile is only useful for transferring attachments between two Notes databases, not between Lotus Notes and relational databases.
So we have a way to get the Notes attachment data into a file. The next step is to get the file contents stored in a binary field in the relational database. To do this, we have to get the binary file data into an LCField object. The tool that lets us do that is the File connector. The File connector accesses data from a file system in a way that makes it look like a relational table. Each file is represented by one row in the result set, and every directory table, no matter which operating system you use, contains the same four columns:
- Filename is the name of the file without the full file path.
- Contents is the data in the file either in binary form or text, depending upon the setting of the Binary property of the connector. For our purposes, we want to use binary format so that the exact contents of the attachment are preserved.
- Timestamp identifies when the file was last modified.
- Size is the size of the file in bytes.
The overall process for export from Lotus Notes is to Fetch data from a Notes document using the Notes connector, which automatically detaches a file for each attachment. Use the File connector to read those files into memory and any relational connector to store the file data into a binary column. To go the other way, you can insert records into a File connector, which creates the files.
Whether importing or exporting, the statements to set up the connections are largely the same. Here are the statements our export sample uses to set up the Notes connector (which are the same whether reading or writing).
Dim lcconNotes As New LCConnection("notes") lcconNotes.server = db.server lcconNotes.database = db.filepath lcconNotes.metadata = "Villain" lcconNotes.FilePath = containing_folder + "\" + foldername lcconNotes.LoadFile = True
This assumes that you're using the Windows operating system because we use "\" as a directory separator. Next, set up the File connector as follows:
Dim lcconFiles As New LCConnection("file") lcconFiles.Database = containing_folder lcconFiles.Metadata = foldername lcconFiles.Binary = True
The final statement specifies that the file data is binary -- the File connector will not try to interpret the bytes as characters. In the sample code, we use a custom class to create a temporary directory which is automatically deleted when execution ends.
The sample application
You can download a sample Notes database containing the code we discuss. The application is the "Dick Tracy Crime Files." Tracy, a famous police detective, is a long-time fan of Lotus products. He created this database to track known criminals. The criminals' mug shots are stored as JPEG file attachments. We need to integrate this Notes data with a relational database where the JPEGs are stored in a binary column.
The Notes Villain documents contain the field ID, a unique key used to relate the records to other relational tables. The table that contains the data we have to update has three columns:
- VillainID, where we should copy the ID field from Lotus Notes. A villain can have multiple mugshot photos, so this column does not contain unique values.
- Filename, containing the original name of the JPEG file. The VillainID and Filename taken together are expected to be unique.
- Filedata, the binary file contents.
In our sample scripts, there's also synchronization of other Notes fields to other tables, but we're only concerned with the file attachments.
Setting up to run the sample scripts
You can run the scripts from a Notes client. They will of course need a relational database of yours to export to and import from. Any database will do -- Microsoft Access will work. The sample scripts automatically create the tables they need if they do not already exist, provided the relational database login you use has this level of access.
As written, the scripts read their connection information from a DECS (or LEI) Administrator database on a Domino server (decsadm.nsf). This database may be local if you prefer. You do not need to run DECS or LEI, but you must have a decsadm.nsf database. You can use the DECS Administrator template (decsadm.ntf) to create it. Or, if you prefer, you can hard code the connection information into a script library. In any case, you want to edit this script library, which is named Customize Connections. In the (Options) section, you find several Const statements. Adjust the values of these parameters to tell the script where to look for the connection information:
- The name YOUR_CONNECTION_DOC contains the name of the Connection document that the script will search for in the DECS Administrator database.
- VILLAINS_MAIN, VILLAINS_CRIMES, and VILLAINS_ATTACHMENTS contain the names of three tables to which the data is exported.
- YOUR_DECS_SERVER is a place where you can put the name of the server where your DECS Administrator database can be found. For local, leave blank.
If you prefer to hard code the relational database type, name, ID, and password, visit each of the modules in this script for instructions on how to customize it.
After you customize this library, create the Connection document that you entered in YOUR_CONNECTION_DOC. You should then be ready to go.
Running the sample scripts
Open the database in the Notes client and go to the Villains \ by ID view. This contains four sample documents. The Actions menu lists numbered agents that you should run in sequence.
1: Create/Clear Tables
This agent uses the Action method with argument LCACTION_TRUNCATE to remove any data that may be in the three relational tables. If the tables do not exist, the script creates them. You may be interested to look at the script to see how you can create tables with the LCLSX, but because that isn't the point of this article, we won't go into detail.
2: Export Villains to RDB
This agent implements the export process discussed previously. Performance is not a big consideration in this script because there are only four documents, but to serve as a good example for your applications where performance may be an issue, we've followed best practices by establishing the relationships between the fields to be copied first, so that the copying happens automatically without data having to be moved from field to field within the main program loop.
This is done by creating fieldlists that contain references to the same field objects. The LCField is a separate object, so it may be referenced by multiple fieldlists, which each may know by a different name. In Figure 1, a rectangle represents a fieldlist element, and an oval is an LCField object. Connecting lines show which fieldlist elements refer to which LCField objects.
Figure 1. Fieldlists
When you do this kind of programming, we suggest you always create a diagram. It may not always be this neat, but is often more complex. By setting up the fieldlists this way, you can read a document from the Notes database, and the data you want to write to two other tables are already in the fieldlists that you insert into the relational database. As you read the Notes record, files are automatically created, which you can read with the File connector -- and the data are already in the fieldlist that you insert into the attachments table in the relational database.
Here's the code that creates these linked fieldlists:
Call lcconNotes.Select(Nothing, 1, flNotes) ' select all docs, init flNotes Call lcconFiles.Select(Nothing, 1, flFiles) ' init flFiles (no files yet) Call flDest.MapName(flNotes, "ID,Name,Location,Body", "VillainID,Name,Hideout,Comments") Set fldID = flDest.Lookup("VillainID") Call flDestAtt.MapName(flFiles, "Filename,Contents", "Filename,Filedata") Call flDestAtt.IncludeField(3, fldID, "VillainID") Call flCrimes.MapName(flNotes, "ID,Crimes", "VillainID,Crime") Set fldCrimes = flNotes.Lookup("Crimes")
The two Select statements create the field objects for the two source fieldlists. We want to read all the Notes documents, so the first Select serves a dual purpose. However, there aren't any files yet in the temporary directory, so the Select on the File connector is just for the purpose of initializing the fieldlist.
The statements that follow create new references to the same fields and insert them into the other fieldlists. Following that, the main loop of the export program is pretty simple:
Do While lcconNotes.Fetch(flNotes) > 0 ' The Fetch has also detached all attachments into ' the temp directory. Call lcconDest.Insert(flDest) ' create main villain record. Call lcconDestCrimes.Insert(flCrimes) ' insert multiple crimes ' rows in one operation. Call lcconFiles.Select(Nothing, 1, Nothing) ' see what files are ' in the temp directory. Do While lcconFiles.Fetch(flFiles) Call lcconDestAtt.Insert(flDestAtt) ' insert one file in attachment ' table. Loop Call lcconFiles.Action(LCACTION_TRUNCATE) ' get rid of the ' attachment files. Loop
The Fetch into fieldlist flNotes has also created files in the temporary directory. The script selects from the File connector to get a list of those files and also to read their contents. Notice that the third argument to the Select statement is Nothing. Ordinarily, we would supply a blank fieldlist so that the fields of the result set can be filled in, but we already have the fieldlist we need to fetch the results, so there's no point in building another one. In fact, the one we have is better than a new one because it's already linked to the output fieldlists.
Some points worth noting about this program:
- We have three LCConnection objects that point to the destination database, one for each table. The connectors for the main and attachment tables are direct to the database, while the connection to the crimes table goes through a Collapse/Expand metaconnector, which automatically converts between a Notes multivalue field and multiple rows in a one-to-many relational table. This last is beyond the scope of this article, but you may want to look at the code to see how it's done.
- The Notes fieldlist contains a FILE field, which contains a list of the full paths of the detached files. However, we don't need to use this field during the export. We fetch the file names and contents directly from disk with the File connector, so we get the same information that way. It would be a different story if the directory contained other files, but we created a new temporary directory specifically so we wouldn't have to worry about that.
- While fetching a Notes document can automatically copy the file attachments onto disk, there's no function to automatically remove the attachments after you're done with them. At the end of the loop, the Action method erases all the files from the temporary directory by truncating the table that contains the directory contents.
- We have mapped the Body rich text field of the Notes document to a binary BLOB field in a relational database. This copies a raw binary image of the rich text field into the BLOB field. This data is in a proprietary format that is not usable except by importing it back into Lotus Notes. Although the icons for the file attachments may appear within the rich text field when you edit the form, attachments are stored separately from rich text -- copying the rich text does not copy the attachments. However, this gives you a way to make backup copies of embedded images and other paraphernalia that appear within the rich text, which you can later restore in Lotus Notes.
- The inner loop fetches however many attachments are found in the temporary directory and inserts each of them into the attachments relational table, using the combination of VillainID and filename to give the record a unique identifier. Internally, Notes prevents duplicate attachment names within the same document by renaming any that are non-unique, so duplicate keys should not be an issue.
- As the main loop executes, a custom function named DebugStr is used to create string descriptions of the data being written to disk. Message boxes display information about each set of records as they're written. The lines to debug output have been omitted from the preceding code so that you can see just the functionality for moving the data.
3. Import from RDB
The next agent does the reverse of the previous agent: It creates new Notes documents with file attachments from the data in the relational database. Because we didn't want to delete our source data records, we created a second Notes form (just like the Villain form, but called Bad Guy), and this is the type of document the import script creates.
We must create the files from the data in the relational databases, so we use a key from the main Villains relational table to select from the attachments relational table. We use that data to create the files on disk by inserting into the File connector. Then we insert into the Notes connector, and this operation grabs the files from disk and automatically creates the Notes file attachments.
However, there's one important difference between using the LoadFile property of the Notes connector to read and using it to write. When reading, we can ignore the FILE field in the Notes fieldlist and see which files are created. But when writing, we have to use the FILE field to tell Lotus Notes the complete filepath of each file we want to attach. It will not simply pick up all the files in the target directory.
The FILE field is special. In addition to the name, it has special virtual field codes associated with it that tell the Notes connector to give it special treatment. This lets you distinguish between a field with file attachment list functionality and a regular field that just happens to be named FILE. There are several special field names used by the Notes connector, and they all have the virtual flag.
NOTE: The virtual field flag should not be confused with the DECS and LEI Virtual Fields activity, which fetches live information from a relational database. They have similar names, but vastly different concepts.
Before it gets into the main loop, the import script constructs a set of linked fieldlists, which is very similar to that shown previously, including the special FILE field. (Diagramming those field relationships is left as an exercise for the reader.) As we saw earlier, the easy way to create a fieldlist that matches your table is to use Select to retrieve a description of the table. This is especially true when special fields, such as multivalue and virtual fields, are in the list. However, why do things the easy way when an instructive example can be created by doing it the hard way? Here's the code to create your own FILE virtual field from scratch:
Set fldFiles = flNotes.Append("FILE", LCTYPE_BINARY)
The field is a binary type, but it is not a BLOB field. Binary fields are also used to store special Notes field values, such as rich text and multivalue fields.
Call fldFiles.SetFormatStream( , , LCSTREAMFMT_TEXT_LIST)
We have assigned the type of binary field, a multivalue Notes text field.
The SetVirtualCode function requires an argument that tells which connector or connection is supposed to give this field special treatment. Different connectors may have different fieldnames that they consider special, so this lets us designate the field as special to Lotus Notes, but not to other connectors. This is one of the ways we can access the unique functionality of the different backends. The flag we assigned here tells Lotus Notes to pay special attention to the field.
The Connector code property of every Notes connection returns the same number value. The Connection code, however, is unique to a single LCConnection object. By setting the virtual code that belongs to the connection rather than the connector, you can load a list of names from a Notes document with one Notes connection (which treats FILE as just a simple multivalue field), but link this field to a fieldlist associated with a second connection, which treats it as the special FILE field and finds those files on disk. The FILE field needs to contain the complete path of each of the files you want to attach, not just the filenames.
The main loop of the import function must construct this list of values for every Villain record it reads from the relational database. Here's that code:
Do While lcconVillains.Fetch(flVillains) > 0 Call lcconCrimes.Select(flVillains, 1, Nothing) ' read thru metaconnector, ' so there's always at most one result. If lcconCrimes.Fetch(flCrimes) = 0 Then fldCrimes.Value = "" ' if they have no crimes, erase value left over ' from previous iteration. End If strFilenames = "" Call lcconAttach.Select(flVillains, 1, Nothing) ' Loop thru list of attachment relational records associated with ' current key, and detach each file to temp dir. Do While lcconAttach.Fetch(flAttach) Call lcconFiles.Insert(flFiles) ' copy the file contents into a ' disk file. Build a delimited string of the full filepaths of the ' files we detach. strFilenames = strFilenames & NEWLINE & dirTemp.fullpath & "\" & fldFilename.Text(0) Loop ' If there are attachments, construct a string that contains a multivalue ' where each value is the full path of one file. If strFilenames <> "" Then lcstrFilenames.Value = Split(Mid$(strFilenames, 2), NEWLINE) Else Call lcstrFilenames.Clear End If Call fldFiles.SetStream(1, lcstrFilenames) ' set FILE multivalue field ' to list of files to attach. Call lcconNotes.Insert(flNotes) Call lcconFiles.Action(LCACTION_TRUNCATE) ' get rid of temp ' attachment files. Loop
The variable lcstrFilenames is an LCStream object. LCStreams are the objects used to store the data for every text and binary LCField object. They aren't used very much, but they're very handy when you need to work directly with the data of a binary field, such as a Notes multivalue. It's possible to simply assign the text of a multivalue field, and the LCStream class splits it into multiple values for you, but it uses a comma as a delimiter, so this only works when the data values don't contain commas.
Note the following:
- The VillainID field in the fieldlist flVillains is set to be a key (LCFIELDF_KEY). This flag is ignored by the Fetch operation that loads the data into the fieldlist, but it makes it possible to use flVillains as the selection criteria argument to a Select to pull up related crimes and attachment records.
- When we select from the two related tables, we supply Nothing as the third argument. We don't need to have the result set fieldlists built for us during the Select because we already have them. And the ones we have are already sharing fields with other fieldlists, so they are better than new ones would be, anyway.
- The expression dirTemp.fullpath refers to a property of a custom class used to create temporary directories and automatically delete them when done. You may find the class useful in other scripts. This property returns the full path of the temporary directory.
- Once again, we use "\" as the directory delimiter, which only works with Windows operating systems.
- You don't have to use the full filepath when you assign the FILE field. You could instead set the Notes current directory (Curdir) to your temporary directory and use just filenames. However, setting Curdir is not very safe because (1) you may have other scripts running in parallel in the same process, and (2) these scripts are also setting Curdir (or assuming that it has its default value).
- Unfortunately, setting the FilePath property of the Notes connector does not make it look for the files in that directory if the FILE value doesn't specify a path.
- This script also uses a Collapse/Expand metaconnector, in this case, to translate from multiple relational rows to a single Notes multivalue field, Crimes.
The preceding scripts are fairly simple. In real life, applications are often more complex than just dumping data from one database to another. More commonly, you want to synchronize information between different data sources.
Of course, if the data movement is one way, you can synchronize by deleting all the data in the destination, and then copying everything from the source -- also known as the lazy programmer's technique. In certain applications, this is okay to do, but there are several factors that may make it a bad idea:
- Increasingly, many databases of all kinds have custom programmed event code that triggers when records are added, deleted, or modified. Deleting a record and then adding a new one just like the one you deleted may trigger unneeded processing with possibly unpleasant results.
- If the destination is Lotus Notes, there are several disadvantages to dumping and recreating all the Notes documents:
- The doclinks you created today will not work tomorrow.
- Deletions leave behind deletion stubs, which inflate the size of the database, increase replication and view indexing times, and generally harm performance, especially when there are 40 times as many of them as there are active documents.
- Unread marks are gone.
- Domino servers, too, can have event-based processing, such as API add-ins and agents that run on new and modified documents.
- Other database types may also suffer performance deficits from unnecessary churn in their data.
To help you navigate these waters, the sample database also contains a sample agent that performs a simple one-way keyed synchronization between two LCConnections. The agent "5. Replicate Pirates" includes a reusable replication engine that reads one record at a time from two result sets and compares the keys and data values to decide whether or not there's a difference between the records, and if so, whether it represents an insert, update, or deletion. Then the next record is read from one or both result sets as appropriate. It's up to you to feed this engine two result sets that are identically sorted, which for some combinations of connectors may require the use of an Order metaconnector. After you've done that, the process is automatic (at least for simple field types).
It's up to you to combine this technique with any file attachment manipulation you may need to do. But while you can use an LCStream object to compare the binary data of two file attachments to see whether or not they are equal, it's a lot faster if you compare timestamps.
This article described how to work with file attachments in LCLSX. We've talked a bit about how LCLSX handles file attachments, and then discussed in detail the sample application we've included to demonstrate file attachment processing. We concluded with some observations about real-world synchronization techniques.
We hope that you found this article useful. If so, we suggest you download the sample we provided and adapt it to your own needs and requirements. Good luck!
- Download the sample code discussed in this article.
- To learn more about the LCLSX, visit the Lotus Enterprise Integration product page.
- In addition, the IBM Redbook, Implementing IBM Lotus Enterprise Integrator 6, includes a chapter on the LCLSX.
- See the Domino Designer help for full descriptions of the methods discussed in this article.