Extract Text
This task extracts the text of email attachments and prepares it for full-text indexing.
If text extraction fails, the Extract Text task writes an error notification to the text-search indexing document. Refer to the related topic for a list of possible error strings.
Task summary
| Characteristic | Value |
|---|---|
| Task name | Extract Text |
| Main purpose | Extracts the text of files or email attachments and prepares it for full-text indexing |
| Usable with which source connectors? | Email Connector, SMTP Connector |
| Usable with which target connectors? | IBM® FileNet® P8 Connector |
| When needed? | Required in email archiving task routes when processing attachments that must be full-text indexed |
| Placement in task route | Can appear only after the EC Extract Attachments task in email archiving task routes |
| Produces which metadata? | Task Status, Text Extraction |
Configuration options
File Extension Filter
Define a filter for file extensions.
When you define an exclude filter, the Extract Text task skips files with the listed extensions. If the list is empty, the task processes all files that are passed in.
When you define an include filter, the Extract Text task processes only files with the listed extensions. If the list is empty, the task processes none of the files that are passed in.
For all files that are skipped, the task writes the string "IcmFceWarning:IcmConfigFilteringFile" to the icc_attachment and icc_attachment_text fields of the text-search indexing document, so that you can search for all documents that contain attachments that were not indexed.
To restore the default list of extensions, click Load Default Extension List. To empty the list, click Clear Extension List.
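The include and exclude rules described for the File Extension Filter can be illustrated with a minimal Python sketch. This is not product code: the function names `should_extract` and `indexed_text`, the `mode` parameter, and the case-insensitive comparison of extensions are assumptions made for illustration only. Only the marker string and the empty-list behavior are taken from the description above.

```python
from pathlib import PurePath

# Marker string that the task writes for skipped files (from the description above).
SKIP_MARKER = "IcmFceWarning:IcmConfigFilteringFile"


def should_extract(filename, extensions, mode):
    """Return True if a file would be processed for text indexing.

    mode is "exclude" or "include" (hypothetical parameter names).
    Assumption: extensions are compared case-insensitively, without the dot.
    """
    ext = PurePath(filename).suffix.lower().lstrip(".")
    listed = ext in {e.lower().lstrip(".") for e in extensions}
    if mode == "exclude":
        # Empty exclude list: nothing is listed, so every file is processed.
        return not listed
    if mode == "include":
        # Empty include list: nothing is listed, so no file is processed.
        return listed
    raise ValueError("mode must be 'exclude' or 'include'")


def indexed_text(filename, extensions, mode, extracted_text):
    """Return the value a skipped or processed file would contribute (illustrative)."""
    if should_extract(filename, extensions, mode):
        return extracted_text
    return SKIP_MARKER
```

For example, `should_extract("report.pdf", [], "exclude")` is true and `should_extract("report.pdf", [], "include")` is false, which mirrors the empty-list behavior of the two filter types.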