Configuring text extraction for document classes

Enable persistent text extraction for any document class that you want to process with the Content Assistant service.

Configuring text extraction for document classes for CPE V5.5.12 and earlier

You enable text extraction for a document class by creating a subscription that uses the text extraction event action.

About this task

By default, a document class is not enabled for persistent text extraction. To enable it, create a subscription for persistent text extraction on the document class. After you create the subscription, text is automatically extracted from any new documents of the subscribed class that are added to the object store. Documents that were added to the object store before the subscription was created are not processed automatically; to extract text from them, run a custom sweep job.

Procedure

In the Administration Console for Content Platform Engine (ACCE), go to the object store where you want to enable text extraction.
Go to Data Design > Classes > Document and select a document class.
Select the document class that you want to subscribe to the text extraction event action.
For the selected class, click Actions > New subscription.
Enter a subscription name and click Next.
For example, you can enter Document-Text Extraction as the subscription name.
For the subscription behavior, set Scope to Applies to all objects of this class and click Next.
Leave the other properties at their default values.
Select Checkin Event and Update Event as the events that trigger the subscription and click Next.
Select Text Extraction Event Action as the event action for the subscription and click Next.
Specify the values for additional options.
1. Set the initial state to Enable the subscription.
2. Select Include subclasses if you want to enable text extraction for all subclasses of the parent class.
  
  Tip: To enable persistent text extraction for the Document class and all of its subclasses, create the subscription on the Document base class and select Include subclasses.
3. Set Run synchronously as the subscription run mode.
4. Leave other fields blank and click Next.
Click Finish to create the subscription for text extraction.
Repeat the process for all document classes where you want to enable persistent text extraction.
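
When you repeat the procedure for several classes, it can help to record the intended wizard values and check them before you click Finish. The following is a minimal sketch of such a checklist, not a Content Platform Engine API: the names SubscriptionSettings and deviations are hypothetical, and the string values are copied from the steps above.

```python
from dataclasses import dataclass
from typing import List

# Values taken from the wizard steps in this procedure. The class and function
# names are hypothetical; nothing here calls Content Platform Engine.
REQUIRED_SCOPE = "Applies to all objects of this class"
REQUIRED_EVENTS = frozenset({"Checkin Event", "Update Event"})
REQUIRED_ACTION = "Text Extraction Event Action"


@dataclass(frozen=True)
class SubscriptionSettings:
    """Planned wizard values for one document class."""
    name: str
    document_class: str
    scope: str = REQUIRED_SCOPE
    events: frozenset = REQUIRED_EVENTS
    event_action: str = REQUIRED_ACTION
    enabled: bool = True
    include_subclasses: bool = False  # optional in the procedure
    run_synchronously: bool = True


def deviations(s: SubscriptionSettings) -> List[str]:
    """Return every way in which s differs from the values in the procedure."""
    problems = []
    if s.scope != REQUIRED_SCOPE:
        problems.append(f"Scope should be '{REQUIRED_SCOPE}'")
    missing = REQUIRED_EVENTS - set(s.events)
    if missing:
        problems.append("Missing trigger events: " + ", ".join(sorted(missing)))
    if s.event_action != REQUIRED_ACTION:
        problems.append(f"Event action should be '{REQUIRED_ACTION}'")
    if not s.enabled:
        problems.append("Initial state should be 'Enable the subscription'")
    if not s.run_synchronously:
        problems.append("Run mode should be 'Run synchronously'")
    return problems


# The example subscription from the procedure, created on the Document base class.
planned = SubscriptionSettings(
    name="Document-Text Extraction",
    document_class="Document",
    include_subclasses=True,
)
print(deviations(planned))
```

An empty list means that the planned values match the procedure. Include subclasses is not checked because the procedure treats it as optional.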

What to do next

To extract text from documents that were added to the object store before you created the subscription, see the topic Running a custom text extraction sweep for existing documents.