Troubleshooting
Problem
By default, eDiscovery Analyzer does not index text from PDF documents that were produced by optical character recognition (OCR) software.
Symptom
A PDF document generated by OCR software can be searched in a PDF viewer, but the text is not indexed by eDiscovery Analyzer.
Cause
Text that is generated by OCR software is saved as hidden text in PDFs. This text is not indexed by eDiscovery Analyzer by default.
Resolving The Problem
To index hidden text in PDF documents, change a setting in the Oracle Outside In Search Export filter.
In the configuration file stellent/searchexport.cfg, located under the eDiscovery Analyzer installation directory, change the following lines:
# SCCOPT_XML_SEARCHML_CHAR_ATTRS
(lines skipped)
#hidden yes
hidden no
to:
hidden yes
#hidden no
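If the same change has to be made on several installations, the edit can be scripted. The following is a minimal sketch in Python, not a supported tool. It assumes that the two lines appear exactly as shown above, somewhere after the line that contains SCCOPT_XML_SEARCHML_CHAR_ATTRS. The function name enable_hidden_text and the .bak backup suffix are illustrative choices, not part of the product.

```python
import re
import shutil
import sys
from pathlib import Path

# Comment line that introduces the relevant option in searchexport.cfg.
SECTION_MARKER = "SCCOPT_XML_SEARCHML_CHAR_ATTRS"


def enable_hidden_text(cfg_text: str) -> str:
    """Return cfg_text with 'hidden yes' active and 'hidden no' commented out.

    Only lines that follow the SECTION_MARKER line are touched (assumption:
    the two 'hidden' lines belong to that option, as in the excerpt above).
    """
    out = []
    in_section = False
    for line in cfg_text.splitlines(keepends=True):
        stripped = line.strip()
        if SECTION_MARKER in stripped:
            in_section = True
        elif in_section and re.fullmatch(r"#\s*hidden\s+yes", stripped):
            # Uncomment the 'yes' value.
            line = line.replace(stripped, "hidden yes")
        elif in_section and re.fullmatch(r"hidden\s+no", stripped):
            # Comment out the 'no' value.
            line = line.replace(stripped, "#hidden no")
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    # Usage: python enable_hidden.py <path to stellent/searchexport.cfg>
    cfg_path = Path(sys.argv[1])
    shutil.copy2(cfg_path, cfg_path.with_name(cfg_path.name + ".bak"))  # keep a backup
    cfg_path.write_text(enable_hidden_text(cfg_path.read_text()))
```

The function is idempotent, so running it against a file that has already been changed leaves the file as it is. Review the result against the backup copy before you rely on it.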
Document Information
Modified date:
17 June 2018
UID
swg21432316