Client and server workflows

This topic describes the document constructor client and server workflows.

Client workflow

During an indexing session, the client application invokes the Collection.setConstructorConfiguration method before starting to index. This method defines the custom data (optionally per collection) that is passed to the constructor.

For each document that must be constructed (as opposed to documents that are indexed directly), the application uses IndexableDocument, which is extended with two additional methods: setConstructorName() and setConstructedDocumentData(). The setConstructorName() method provides the name of the constructor to use for the document. The setConstructedDocumentData() method provides any additional data that the constructor needs, such as the name of a temporary file from which the content is retrieved, a table name, or a column name.

The constructor name and the additional data are optional on IndexableDocument. The client application can specify a default constructor name for a particular collection by using the Collection.createCollection method.
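
The client-side choices described above can be illustrated with a minimal Java sketch. It does not use the product API: IndexableDocumentStub is a hypothetical stand-in, the String parameter types are assumptions, and the constructor names and data string are invented for illustration. Only the method names setConstructorName() and setConstructedDocumentData() come from the text. The fallback to the collection default is one reading of the description, not confirmed behavior.

```java
import java.util.Objects;

public class ClientWorkflowSketch {

    /** Hypothetical stand-in for IndexableDocument; not the real product class. */
    static final class IndexableDocumentStub {
        private final String id;
        private String constructorName;          // optional per document
        private String constructedDocumentData;  // optional per document

        IndexableDocumentStub(String id) {
            this.id = Objects.requireNonNull(id);
        }

        // Method names come from the text; the String parameter types are assumed.
        void setConstructorName(String name) { this.constructorName = name; }
        void setConstructedDocumentData(String data) { this.constructedDocumentData = data; }

        String getId() { return id; }
        String getConstructedDocumentData() { return constructedDocumentData; }

        /** Assumed behavior: use the collection default when the document names no constructor. */
        String effectiveConstructorName(String collectionDefault) {
            return constructorName != null ? constructorName : collectionDefault;
        }
    }

    public static void main(String[] args) {
        // Hypothetical default, standing in for the name defined when the collection is created.
        String collectionDefault = "fileConstructor";

        // Document that names its constructor and passes data to it.
        IndexableDocumentStub explicit = new IndexableDocumentStub("doc-1");
        explicit.setConstructorName("tableConstructor");
        explicit.setConstructedDocumentData("table=ORDERS;column=NOTES");

        // Document that relies on the collection default.
        IndexableDocumentStub implicit = new IndexableDocumentStub("doc-2");

        System.out.println(explicit.getId() + " -> " + explicit.effectiveConstructorName(collectionDefault));
        System.out.println(implicit.getId() + " -> " + implicit.effectiveConstructorName(collectionDefault));
    }
}
```

In this sketch, doc-1 resolves to the constructor it named, and doc-2 resolves to the assumed collection default.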

Server workflow

The constructor is initialized once, when it is started, with registry information from constructors.xml. At run time, it can be updated with collection-specific information from setConstructorConfig.

During indexing, the populateDocuments() method is invoked and receives an array of constructed documents (IConstructedDocument).

The constructor code can invoke the IConstructedDocument methods to retrieve the ID, the constructed document data, and the specified content type of the document. The constructor code returns the content of the document by setting the input stream (InputStream) and the content type. Possible content types are PLAIN_TEXT, XML, HTML, BINARY, or UNKNOWN.

For example, the constructor can provide an InputStream to a binary document on the file system and set the content type to BINARY. Alternatively, the constructor can create a text string or an XML string, provide an InputStream that is backed by a ByteArrayInputStream, and set the content type to PLAIN_TEXT or XML accordingly. If errors occur, the constructor can provide specific error messages for each document by using the addConstructorMessage(IMessage message) method.
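
The server-side steps can be sketched in Java as follows. This is an illustration under stated assumptions, not the product API: ConstructedDocumentStub is a hypothetical stand-in for IConstructedDocument, its accessor and setter names are assumed, and a plain String replaces IMessage. The content type names, the populateDocuments() role, and addConstructorMessage come from the text.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ServerWorkflowSketch {

    /** Content types listed in the text. */
    enum ContentType { PLAIN_TEXT, XML, HTML, BINARY, UNKNOWN }

    /** Hypothetical stand-in for IConstructedDocument; accessor names are assumed. */
    static final class ConstructedDocumentStub {
        private final String id;
        private final String data;
        private InputStream content;
        private ContentType contentType = ContentType.UNKNOWN;
        private final List<String> messages = new ArrayList<>();

        ConstructedDocumentStub(String id, String data) {
            this.id = id;
            this.data = data;
        }

        String getId() { return id; }
        String getConstructedDocumentData() { return data; }
        void setInputStream(InputStream in) { this.content = in; }
        void setContentType(ContentType type) { this.contentType = type; }
        // The text names addConstructorMessage(IMessage); a String stands in for IMessage here.
        void addConstructorMessage(String message) { messages.add(message); }

        InputStream getInputStream() { return content; }
        ContentType getContentType() { return contentType; }
        List<String> getMessages() { return messages; }
    }

    /** Plays the role of populateDocuments(): supplies content for each document in the batch. */
    static void populateDocuments(ConstructedDocumentStub[] documents) {
        for (ConstructedDocumentStub doc : documents) {
            String data = doc.getConstructedDocumentData();
            if (data == null || data.isEmpty()) {
                // Report a per-document error and leave the content unset.
                doc.addConstructorMessage("No constructed document data for " + doc.getId());
                continue;
            }
            // Here the data itself is treated as the text content; a real constructor
            // might use it as a file name, table name, or column name instead.
            byte[] bytes = data.getBytes(StandardCharsets.UTF_8);
            doc.setInputStream(new ByteArrayInputStream(bytes));
            doc.setContentType(ContentType.PLAIN_TEXT);
        }
    }

    public static void main(String[] args) {
        ConstructedDocumentStub[] batch = {
            new ConstructedDocumentStub("doc-1", "hello"),
            new ConstructedDocumentStub("doc-2", null)
        };
        populateDocuments(batch);
        for (ConstructedDocumentStub doc : batch) {
            System.out.println(doc.getId() + " " + doc.getContentType() + " " + doc.getMessages());
        }
    }
}
```

The sketch keeps error reporting per document, so one document without data does not stop the rest of the batch from being populated.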