Importing extracted content into Classification Workbench

If you extracted items from IBM® Content Manager by using the Content Extractor, import these items into Classification Workbench by using the import wizard.

About this task

The items that you import are in XML format and are referred to as a content set. A content set is a group of content items that is used to build a knowledge base or to build and analyze a decision plan.

To import extracted content and create a knowledge base:

Procedure

  1. On the computer where Classification Workbench is installed, click Start > Programs > IBM Content Classification 8.8 > Classification Workbench.
  2. On the Open Project window, click New.
  3. On the New Project window, enter a project name, select the knowledge base project type, and click Next.
  4. On the New Project Options window, click Create a project by importing a content set and click Next.
  5. On the Import Content Set window, click XML and click Next.
  6. Navigate to the folder that contains the output from the Content Extractor, such as Classification_Home/ECMTools/extractorOutput, and click Next.
  7. Select the Scan XML data files for fields before importing the content set check box and then click Finish.
  8. After the content is imported, specify options for fields with textual or numeric content. Repeat the following steps for each field that you want IBM Content Classification to use for classifying documents:
    1. On the Field Definitions panel, right-click a field that contains content that you want to classify and select Edit Field.
    2. For Data type, select string if the field contains text, or select number if the field contains numeric data, such as a telephone number.
    3. For Content type, specify the following options:
      • For a field that contains body text, select PlainText.
      • For a field that contains the document title, select DocTitle.
      • For a field that contains the file name of a document, select PlainText.
      • For a field that contains an email address, select Sender.
      • For a field that contains the subject of an email message, select Subject.
      • For a field that contains the body of an email message, select Body.
  9. Define the categories field:
    1. On the Field Definitions panel, right-click the field that represents the classification criteria and select Edit Field.
    2. For Data type, select classification.
    3. Select the Designate as categories field check box and click OK.
  10. Create the knowledge base. Access the Create, Analyze, and Learn wizard by clicking Create, Analyze and Learn on the toolbar.
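
Before you edit field definitions in step 8, it can help to preview which fields the extracted XML files are likely to contain. The following Python script is a minimal sketch for that purpose and is not part of IBM Content Classification. It assumes that each field appears as a leaf element in the XML files; the actual layout of the Content Extractor output is not described in this topic, so adjust the script to match your files. The function name scan_fields and the dictionary CONTENT_TYPE_BY_ROLE are hypothetical names used only for illustration. The dictionary restates the Content type choices from step 8.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path

# Content type choices from step 8, keyed by what the field holds.
# The keys are informal labels for this sketch, not product terms.
CONTENT_TYPE_BY_ROLE = {
    "body text": "PlainText",
    "document title": "DocTitle",
    "file name": "PlainText",
    "email address": "Sender",
    "email subject": "Subject",
    "email body": "Body",
}


def scan_fields(folder):
    """Map each leaf element name to the list of text values found for it.

    Assumption: a field is a leaf XML element (an element with no children).
    Files that are not well-formed XML are skipped.
    """
    fields = {}
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that cannot be parsed
        for elem in root.iter():
            if len(elem) == 0:  # no child elements, so treat it as a field
                value = (elem.text or "").strip()
                fields.setdefault(elem.tag, []).append(value)
    return fields


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the extractor output folder as the first argument.
    for name, values in sorted(scan_fields(sys.argv[1]).items()):
        sample = values[0][:60] if values else ""
        print(f"{name}: {len(values)} value(s), sample: {sample!r}")
```

Run the script with the folder from step 6 as its argument. The output lists each candidate field with the number of values found and a sample value, which you can compare with the fields shown on the Field Definitions panel when you choose data types and content types.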