As discussed throughout this section, Watson Explorer Engine
provides several functions that enable you to enqueue URLs,
enqueue XML data, enqueue delete operations, and enqueue sets of
URLs and delete operations as a single, atomic transactional
unit. Each of the functions that performs these operations
provides a synchronization attribute that indicates the point at
which the crawler should return success for the associated
operation. The possible values for this synchronization
attribute are the following:
- enqueued - after all the child nodes are found to satisfy their crawl conditions, are ready to be submitted to the indexer, and are committed to secondary storage. This is the default synchronization value.
- indexed - after the child nodes have been recorded by the indexer. This synchronization mode forces the indexer to do additional work so that it can reply as promptly as possible.
- indexed-no-sync - after the child nodes have been recorded by the indexer, without forcing the indexer to do additional work. In most cases, this synchronization mode is recommended over the indexed synchronization mode.
- none - immediately after receiving the enqueue.
- to-be-indexed - immediately before the crawled and converted data is ready to be sent to the indexer.
You can tune these synchronization settings for each enqueue
operation to control when you receive notification (and audit
log messages) associated with that enqueue.
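As an illustration of setting the synchronization value per enqueue, the following minimal Python sketch builds a single enqueue node and rejects any value that is not in the list above. The element name crawl-url, the attribute names url and synchronization, and the helper build_crawl_url are assumptions made for this example; check the exact names against the API reference for the enqueue function that you are calling.

```python
import xml.etree.ElementTree as ET

# The five values listed above; "enqueued" is the documented default.
SYNCHRONIZATION_VALUES = (
    "enqueued",
    "indexed",
    "indexed-no-sync",
    "none",
    "to-be-indexed",
)


def build_crawl_url(url, synchronization="enqueued"):
    """Return one enqueue node as an XML string.

    Hypothetical helper: the element and attribute names used here are
    assumptions for illustration, not a confirmed schema.
    """
    if synchronization not in SYNCHRONIZATION_VALUES:
        raise ValueError(f"unknown synchronization value: {synchronization!r}")
    node = ET.Element("crawl-url", {"url": url, "synchronization": synchronization})
    return ET.tostring(node, encoding="unicode")


if __name__ == "__main__":
    # Ask for success only after the indexer has recorded the node,
    # without forcing the indexer to do additional work.
    print(build_crawl_url("http://example.com/doc", "indexed-no-sync"))
```

Validating the value on the client side means that a misspelled synchronization setting fails before the request is sent, rather than surfacing later as a rejected or mis-timed enqueue.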
Note: Use some synchronization settings sparingly, if at all, with potentially complex or long-running enqueue operations. For example, using the indexed or indexed-no-sync synchronization setting on the partial enqueues associated with a single index-atomic operation may cause the enqueue to time out, depending on the amount of data associated with the enqueue and the structure of your application.
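One way to apply the guidance in this note is a client-side check that runs before the partial enqueues of an index-atomic operation are submitted. The sketch below is an assumption about how an application might do this; the function risky_partial_enqueues and the list-of-strings input are hypothetical and are not part of the Watson Explorer Engine API.

```python
# Modes that return success only after the indexer has recorded the nodes.
INDEXER_WAIT_MODES = {"indexed", "indexed-no-sync"}


def risky_partial_enqueues(partial_syncs):
    """Return the positions of partial enqueues that wait on the indexer.

    partial_syncs is a list of synchronization values, one per partial
    enqueue in a single index-atomic operation (hypothetical input shape).
    """
    return [i for i, sync in enumerate(partial_syncs) if sync in INDEXER_WAIT_MODES]


if __name__ == "__main__":
    flagged = risky_partial_enqueues(["enqueued", "indexed", "none", "indexed-no-sync"])
    print(flagged)  # [1, 3]
```

An application could log the flagged positions or switch those partial enqueues to the default enqueued setting, depending on how much data each one carries.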