You can create a document class subscription for persistent text extraction. After you
create the subscription, text is automatically extracted from any new documents of the
subscribed document class that are added to the object store. However, for existing documents
that were added to the object store before the subscription was created, you need to run a
custom text extraction job sweep.
You need to have the following permissions for all existing documents for which you want to
extract text:
- READ
- WRITE
- LINK
- VIEWCONTENT
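As a rough illustration of the permission requirement above, the following Python sketch reports which of the four required permissions are missing from a set of granted permissions. The permission names come from the list above. The function name `missing_permissions` and the way the granted set is obtained are hypothetical and are not part of any product API.

```python
# Hypothetical pre-flight check. The four names are taken from the
# permission list in this topic; how the granted permissions are looked up
# for a document is assumed and not shown here.
REQUIRED_PERMISSIONS = frozenset({"READ", "WRITE", "LINK", "VIEWCONTENT"})


def missing_permissions(granted):
    """Return the required permissions that are absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# A user who holds only READ and VIEWCONTENT still needs LINK and WRITE.
print(missing_permissions(["READ", "VIEWCONTENT"]))  # prints ['LINK', 'WRITE']
```

An empty result means that all four permissions are present for that document.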
To extract text from existing document content, configure and run a custom job sweep for
text extraction.
- Start the New Custom Job wizard in the Administration Console for
Content Platform Engine (ACCE):
  - In the domain navigation pane, select the object store.
  - In the object store navigation pane, select the folder.
  - Click New.
You can configure the new custom job sweep to extract text for existing documents.
- Enter the name and description for the custom job sweep for text
extraction.
For example, you can name the sweep Custom Text Extraction Job.
- Specify the Custom Sweep subclass as Custom Sweep
Job.
- Specify the Sweep Mode as
Normal.
- Select the Enabled property to enable the custom sweep to run.
Click Next.
- Specify the Target Class as
Document.
- Specify the Filter Expression to filter objects of the selected
target class type.
For example, you can use the filter VersionStatus=1 to select only documents that are
Released.
- Optional: Select Enabled if you want to include
subclasses and if you want to record failures.
- Specify the Sweep Action as Text Extraction Sweep
Action.
- Optional: Specify the start and end dates for the custom job
sweep.
- Complete the wizard.
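The wizard settings above can be summarized as a small Python record, which may be useful as a checklist before you complete the wizard. This is only a sketch: the class `TextExtractionJobSweep`, its field names, and the `problems` method are hypothetical and do not correspond to a product API. The default values are the ones named in the steps above. The check that the end date falls after the start date is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TextExtractionJobSweep:
    """Hypothetical record of the values entered in the New Custom Job wizard."""

    name: str = "Custom Text Extraction Job"
    description: str = ""
    sweep_subclass: str = "Custom Sweep Job"
    sweep_mode: str = "Normal"
    enabled: bool = True
    target_class: str = "Document"
    filter_expression: str = "VersionStatus=1"  # example filter from the steps above
    include_subclasses: bool = False            # optional
    record_failures: bool = False               # optional
    sweep_action: str = "Text Extraction Sweep Action"
    start_date: Optional[datetime] = None       # optional
    end_date: Optional[datetime] = None         # optional

    def problems(self):
        """Return a list of human-readable issues; an empty list means none were found."""
        issues = []
        if not self.name.strip():
            issues.append("name is required")
        if not self.enabled:
            issues.append("sweep is not enabled, so it will not run")
        # Assumption: when both dates are given, the end date must be later.
        if self.start_date and self.end_date and self.end_date <= self.start_date:
            issues.append("end date must be after start date")
        return issues


sweep = TextExtractionJobSweep(description="Extract text for existing documents")
print(sweep.problems())  # prints []
```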