
Example: Configuring Content Classification to set item attributes and expiration dates when archiving email with Content Collector

This scenario shows how you can configure Content Classification to set an item attribute and the expiration date when Content Collector archives email into IBM® Content Manager.

About this task

ExampleCo. Enterprises, a fictitious insurance company, wants to set up a system to automatically classify and archive all new email. For each email, the company wants to determine the claim type and set an appropriate expiration date. They use Content Classification to analyze the content of each email and identify the type of claim, which is specified in an IBM Content Manager item attribute called ClaimType. They also use a Content Classification decision plan rule to set an appropriate expiration date according to their retention policies. For example, for an email about damage caused by a fire in a residential building, the ClaimType attribute would be set to FireResidential and the expiration date would be set to 7 years.
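
The retention rule in this example can be pictured as a lookup from claim type to retention period. The following Python sketch is illustrative only: it is not how Content Classification evaluates decision plan rules, and the names RETENTION_YEARS and expiration_date are hypothetical. Only the FireResidential value of 7 years comes from the scenario. The sketch also assumes that the retention period is counted from the date the email was received, which the scenario does not specify.

```python
from datetime import date

# Hypothetical retention table. Only FireResidential (7 years) comes from
# the scenario; a real policy would list every claim type.
RETENTION_YEARS = {
    "FireResidential": 7,
}


def expiration_date(claim_type: str, received: date) -> date:
    """Return the expiration date for an email of the given claim type.

    Assumption: retention is counted from the date the email was received.
    Raises KeyError if the claim type has no retention period defined.
    """
    years = RETENTION_YEARS[claim_type]
    try:
        return received.replace(year=received.year + years)
    except ValueError:
        # The email was received on February 29 and the target year
        # is not a leap year, so fall back to February 28.
        return received.replace(year=received.year + years, day=28)


print(expiration_date("FireResidential", date(2024, 3, 15)))  # 2031-03-15
```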

In the following scenario, after Content Collector ingests an email as part of the task route, it passes the document to Content Classification. Content Classification analyzes the document by using the specified decision plan and the knowledge base that it references to determine the appropriate claim type attribute and expiration date. Content Classification then returns the values of the relevant decision plan output fields (ContentManager:SetAttribute:ClaimType and ICC:ExpirationDate) to Content Collector, where the values populate the mapped Content Collector user-defined metadata properties. As the document continues through the Content Collector task route, the Calculate Expiration Date task sets the expiration date based on the value of the ICC:ExpirationDate metadata property, and the CM 8.x Configure Item Types task sets the claim type attribute based on the value of the ContentManager:SetAttribute:ClaimType metadata property.
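
The hand-off described above can be modeled as copying selected decision plan output fields into the mapped metadata properties. The following Python sketch is a conceptual model only and does not use any Content Collector or Content Classification API. The function name populate_metadata, the UnmappedField entry, and the date placeholder are hypothetical; the actual Content Collector internal date format is not shown here.

```python
# Decision plan output fields as Content Classification might return them.
# The values are illustrative; the date string is only a placeholder.
decision_plan_output = {
    "ContentManager:SetAttribute:ClaimType": "FireResidential",
    "ICC:ExpirationDate": "<date in Content Collector internal format>",
    "UnmappedField": "not used by the task route",
}

# Mapping from decision plan output field to user-defined metadata property.
# The scenario uses identical names on both sides for convenience.
FIELD_TO_PROPERTY = {
    "ContentManager:SetAttribute:ClaimType": "ContentManager:SetAttribute:ClaimType",
    "ICC:ExpirationDate": "ICC:ExpirationDate",
}


def populate_metadata(output: dict, mapping: dict) -> dict:
    """Copy each mapped output field into its metadata property.

    Fields without a mapping are dropped; mapped fields that are absent
    from the output are skipped.
    """
    return {
        prop: output[field]
        for field, prop in mapping.items()
        if field in output
    }


metadata = populate_metadata(decision_plan_output, FIELD_TO_PROPERTY)
# Later tasks in the route read these properties:
#   Calculate Expiration Date      reads ICC:ExpirationDate
#   CM 8.x Configure Item Types    reads ContentManager:SetAttribute:ClaimType
print(metadata["ContentManager:SetAttribute:ClaimType"])  # FireResidential
```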

The following scenario assumes that you already completed the following tasks:
  • Installed Content Collector and Content Classification on separate servers.
  • Configured the Content Collector server to work with Content Classification.
  • Defined an IBM Content Manager attribute with the name ClaimType. This attribute must be defined for all item types that will be assigned to the emails.

Procedure

To configure Content Classification to set the claim type attribute and expiration date when archiving email with Content Collector:

  1. Run the Content Extractor to obtain sample content from your IBM Content Manager repository. You can then import the extracted content into Classification Workbench and use it to train a knowledge base or to build and analyze decision plan rules.
  2. In Classification Workbench, build a Content Classification knowledge base and decision plan. In the decision plan, create rules that set the expiration date and the claim type attribute based on the content of the document. For example, use the Set an item attribute in IBM Content Manager action to populate the ContentManager:SetAttribute:ClaimType field. To set the expiration date, create rules that set the ICC:ExpirationDate field, and use the Set a content field to a date for IBM Content Collector action to convert the dates to the Content Collector internal date format.
  3. After you create the decision plan and referenced knowledge base, publish them to the Content Classification server.
  4. For each decision plan output field, create a user-defined metadata property in Content Collector. In the Content Collector Configuration Manager, click Metadata and Lists > User-Defined Metadata and add two metadata properties named ICC:ExpirationDate and ContentManager:SetAttribute:ClaimType. You can choose any names for the metadata properties, but it is convenient to use the same names as in Content Classification.
  5. In the Content Collector Configuration Manager, create a task route by using the Default Archiving (Automatic) task route template.
  6. After the EC Extract Metadata task, add an instance of the EC Prepare Email for Archiving task. To ensure that attachments are available for classification, clear the Save native message files without attachments check box in this new instance of the EC Prepare Email for Archiving task.
  7. After the EC Prepare Email for Archiving task that you added, add the IBM Content Classification task.
  8. Configure the IBM Content Classification task.
    1. In the Server area, specify the host name of the Content Classification server on which the decision plan is running and the port number of the Content Classification listener component.
    2. In the Instance type area, select Decision Plan and click the explore button to retrieve the list of available decision plans. Select the decision plan that you created in step 2.
    3. For the Content field list, click the explore button to retrieve the list of available content fields and then select Document.
    4. Go to the Map Decision Plan Results tab.
    5. In the Metadata source list, select the metadata set that you created in step 4. The mapping table is populated with the metadata properties.
    6. Click a metadata property in the table and then select a decision plan output field in the Decision Plan property list. For example, click the ContentManager:SetAttribute:ClaimType metadata property in the table and then select the ContentManager:SetAttribute:ClaimType field in the Decision Plan property list. Repeat this step for the ICC:ExpirationDate property.
  9. Add the Calculate Expiration Date task after the IBM Content Classification task. Configure the task to set the expiration date based on the value of the ICC:ExpirationDate metadata property.
  10. Configure the CM 8.x Configure Item Types task to set the ClaimType attribute based on the value of the ContentManager:SetAttribute:ClaimType metadata property.
  11. To store a history of the classification decisions, configure the audit task in your task route to log the ICC:ExpirationDate and ContentManager:SetAttribute:ClaimType user-defined metadata properties.
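
Before you run the task route, it can help to confirm that every decision plan output field that a later task depends on has a mapped metadata property. The following Python sketch shows one way to express that check on paper. It does not read the Content Collector configuration, and the function name unmapped_fields and the sample mapping are hypothetical.

```python
def unmapped_fields(required_fields: list, mapping: dict) -> list:
    """Return the required output fields that have no mapped property.

    The result preserves the order of required_fields. A field counts as
    unmapped if it is missing from the mapping or mapped to an empty name.
    """
    return [field for field in required_fields if not mapping.get(field)]


# Output fields that the Calculate Expiration Date and
# CM 8.x Configure Item Types tasks rely on in this scenario.
required = [
    "ContentManager:SetAttribute:ClaimType",
    "ICC:ExpirationDate",
]

# Sample mapping in which the expiration date was forgotten.
sample_mapping = {
    "ContentManager:SetAttribute:ClaimType": "ContentManager:SetAttribute:ClaimType",
}

print(unmapped_fields(required, sample_mapping))  # ['ICC:ExpirationDate']
```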

What to do next

After your system is in production, you can audit the classification performance by using Classification Center to review the emails that were ingested into IBM Content Manager. Use Classification Center periodically to verify that the correct claim type attribute values and expiration dates were set when emails were ingested. If needed, you can reclassify particular emails to help improve the classification of future emails.