Downloading FileNet P8 content in bulk using a FileNet Sweep Job

If you have a FileNet P8 repository, you can use Datacap to discover additional information that was not extracted as metadata properties.

You can use the FileNet sweep framework and its bulk processing capabilities to download the document content into a directory. Once these files are in the directory, a Datacap application can be used to ingest the documents into Datacap and extract information that can be exported back into FileNet P8. See the IBM Knowledge Center for more details about handling bulk processing with FileNet sweeps: https://www.ibm.com/support/knowledgecenter/en/SSNW2F_5.2.1/com.ibm.p8.ce.admin.tasks.doc/p8pcc175.htm.

This section describes how to create a Sweep Job that downloads the FileNet P8 documents in a format that Datacap can ingest. If you decide that a Sweep Policy is better suited to this task, you can modify the provided JavaScript to fit the sweep policy framework, as long as you follow the same download file naming convention.

File naming convention

Because the goal is to update the properties of these documents, Datacap expects each downloaded file name to follow a naming convention that concatenates the document item ID and the original content file name for each downloaded content element.

For Datacap to be able to identify the document that is associated with the downloaded content file, the file name must begin with the document item ID. For example:

{FE52C000-EBF1-4797-BF3B-5CF98AFC5854}.TM000001.tif

Datacap parses the file name and identifies the document ID of the file. This allows the Datacap application to update the existing FileNet P8 document with properties that are found during Datacap processing.
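
The naming convention can be illustrated with a minimal JavaScript sketch. The function names below are hypothetical and are not part of the FileNet P8 or Datacap APIs. The sketch assumes that the document ID is a brace-enclosed GUID and that a period separates it from the original content file name, as in the example above. It does not show how Datacap itself parses the name.

```javascript
// Hypothetical helpers for illustration only; not FileNet P8 or Datacap APIs.

// Build the download file name: "<document ID>.<original content file name>".
// Assumption: a period separates the two parts, as in the documented example.
function buildDownloadFileName(documentId, contentFileName) {
  return documentId + "." + contentFileName;
}

// Split a download file name back into the document ID and the original
// content file name. Returns null when the name does not begin with a
// brace-enclosed GUID followed by a period.
function parseDownloadFileName(fileName) {
  var pattern = /^(\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\})\.(.+)$/;
  var match = pattern.exec(fileName);
  if (match === null) {
    return null;
  }
  return { documentId: match[1], contentFileName: match[2] };
}
```

For the example file name, parseDownloadFileName returns the document ID {FE52C000-EBF1-4797-BF3B-5CF98AFC5854} and the content file name TM000001.tif. A name that does not begin with a document ID returns null, which is the case in which the file could not be matched to a document.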

Output directory

In the JavaScript, you also need to specify the download directory. If FileNet P8 and Datacap are not co-located, this must be a common directory or shared drive that both the FileNet P8 and Datacap servers can read from and write to.
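
As a rough sketch of how the download directory and the file name fit together, the following JavaScript joins the two into a full path. The directory value and the function name are hypothetical placeholders, not values from the provided sweep script. Substitute a location that both servers can access.

```javascript
// Hypothetical shared location; replace it with a directory or share that
// both the FileNet P8 and Datacap servers can read from and write to.
var DOWNLOAD_DIR = "/mnt/p8_datacap_share/download";

// Join the download directory and a file name without doubling the separator.
// A backslash separator is used if the directory already contains one
// (for example, a Windows UNC share); otherwise a forward slash is used.
function toDownloadPath(downloadDir, fileName) {
  var separator = downloadDir.indexOf("\\") !== -1 ? "\\" : "/";
  var trimmedDir = downloadDir.replace(/[\\/]+$/, "");
  return trimmedDir + separator + fileName;
}
```

Keeping the directory in a single variable means that only one line changes if the shared location moves.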