This scenario shows how you can configure Content Classification to set an item attribute
and the expiration date when Content Collector archives
email into IBM® Content
Manager.
About this task
ExampleCo. Enterprises, a fictitious insurance company,
wants to set up a system to automatically classify and archive all
new email. For each email, the company wants to determine the claim
type and set an appropriate expiration date. They use Content Classification to analyze the content
of each email and identify the type of claim, which is specified in
an IBM Content
Manager item attribute
called ClaimType. They also use a Content Classification decision plan rule
to set an appropriate expiration date according to their retention
policies. For example, for an email about damage caused by a fire
in a residential building, the ClaimType attribute would be set to
FireResidential and the expiration date would reflect a retention period of 7 years.
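The retention rule in this example can be sketched in a few lines of Python. This is only an illustration of the policy logic, not of how a decision plan rule is written: the RETENTION_YEARS table and the expiration_date function are hypothetical names, only the FireResidential and 7-year pairing comes from the scenario, and counting the period from the date the email was received is an assumption.

```python
from datetime import date

# Hypothetical retention table: claim type -> retention period in years.
# Only the FireResidential entry is taken from the scenario.
RETENTION_YEARS = {"FireResidential": 7}


def expiration_date(claim_type, received):
    """Return the expiration date for a claim type, or None if no policy exists.

    Assumption: the retention period is counted from the received date.
    """
    years = RETENTION_YEARS.get(claim_type)
    if years is None:
        return None
    try:
        return received.replace(year=received.year + years)
    except ValueError:
        # February 29 has no counterpart in a non-leap year; use February 28.
        return received.replace(year=received.year + years, day=28)


print(expiration_date("FireResidential", date(2020, 3, 15)))
```

In the product, the equivalent logic lives in the decision plan rules, which is why the claim type and the expiration date are returned as two separate output fields.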
In
the following scenario, after Content Collector ingests
an email as part of the task route, it passes the document to Content Classification. Content Classification analyzes the document
by using the specified decision plan and referenced knowledge base
to determine the appropriate claim type attribute and expiration date
for the document. Content Classification then
returns the values of the relevant decision plan output fields (ContentManager:SetAttribute:ClaimType
and ICC:ExpirationDate) to Content Collector,
where the values populate the mapped Content Collector user-defined metadata fields.
As the document continues through the Content Collector task route, the Calculate
Expiration Date task sets the expiration date based on the value of
the ICC:ExpirationDate metadata property, and the CM 8.x Configure
Item Types task sets the claim type attribute based on the value of
the ContentManager:SetAttribute:ClaimType metadata property.
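The hand-off described above can be pictured as a small data flow. The following Python sketch is a simplified model, not product code: the field and property names are the ones used in this scenario, while the sample values, the date string format, and the function names populate_metadata and run_downstream_tasks are assumptions made for illustration.

```python
# Output fields returned by the decision plan. Names come from the scenario;
# the values are made up, and the date string is only a placeholder because
# the internal date format is not shown in this topic.
decision_plan_output = {
    "ContentManager:SetAttribute:ClaimType": "FireResidential",
    "ICC:ExpirationDate": "2031-05-14",
}

# Mapping of decision plan output field -> user-defined metadata property.
# The scenario uses identical names on both sides for convenience.
FIELD_TO_PROPERTY = {
    "ContentManager:SetAttribute:ClaimType": "ContentManager:SetAttribute:ClaimType",
    "ICC:ExpirationDate": "ICC:ExpirationDate",
}


def populate_metadata(output, mapping):
    """Copy each mapped output field into its metadata property."""
    return {prop: output[field] for field, prop in mapping.items() if field in output}


def run_downstream_tasks(metadata):
    """Mimic the two later tasks that read the metadata properties."""
    return {
        # Calculate Expiration Date task reads ICC:ExpirationDate.
        "expiration_date": metadata.get("ICC:ExpirationDate"),
        # CM 8.x Configure Item Types task reads the claim type property.
        "ClaimType": metadata.get("ContentManager:SetAttribute:ClaimType"),
    }


metadata = populate_metadata(decision_plan_output, FIELD_TO_PROPERTY)
archived_item = run_downstream_tasks(metadata)
print(archived_item)
```

The point of the model is that Content Classification only supplies values; the task route tasks that run afterward are what apply them to the archived item.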
This scenario assumes that you have already completed the following
tasks:
- Installed Content Collector and Content Classification on separate servers.
- Configured the Content Collector server
to work with Content Classification.
- Defined an IBM Content
Manager attribute
with the name ClaimType. This attribute must be defined for all item
types that will be assigned to the emails.
Procedure
To configure Content Classification to
set the claim type attribute and expiration date when archiving email
with Content Collector:
- Run the Content Extractor to
obtain sample content from your IBM Content
Manager repository. This extracted content can then be imported into Classification Workbench and be used to train
a knowledge base or build and analyze decision plan rules.
- In Classification Workbench,
build a Content Classification knowledge
base and decision plan. In the decision plan, create rules
that set expiration date and claim type attribute based on the content
of the document. For example, use the Set an item attribute
in IBM Content
Manager action
to populate the ContentManager:SetAttribute:ClaimType field. To set
the expiration date, create rules that set the ICC:ExpirationDate
field and use the Set a content field to a date for IBM Content
Collector action to convert
the dates to the Content Collector internal
date format.
- After you create the decision plan and referenced knowledge
base, publish them to the Content Classification server.
- For each decision plan output field, create
a user-defined metadata property in Content Collector. In the Content Collector Configuration Manager,
add two metadata properties
named ICC:ExpirationDate and ContentManager:SetAttribute:ClaimType.
You can choose any names for the metadata properties, but it is convenient
to use the same names as in Content Classification.
- In the Content Collector Configuration
Manager, create a task route by using the Default Archiving (Automatic)
task route template.
- After the EC Extract Metadata task, add an instance of
the EC Prepare Email for Archiving task. To ensure that
attachments are available for classification, clear the Save
native message files without attachments check box in this new instance of the EC
Prepare Email for Archiving task.
- After the EC Prepare Email for Archiving task that you
added, add the IBM Content
Classification task.
- Configure the IBM Content
Classification task.
  - In the Server area, specify the
    host name of the Content Classification server
    on which the decision plan is running and the port number of the Content Classification listener component.
  - In the Instance type area, select Decision
    Plan and click the explore button to retrieve the list
    of available decision plans. Select the decision plan that
    you created in step 2.
  - For the Content field list, click
    the explore button to retrieve the list of available content fields
    and then select Document.
  - Go to the Map Decision Plan Results tab.
  - In the Metadata source list,
    select the metadata set that you created in step 4. The mapping
    table is populated with the metadata properties.
  - Click a metadata property in the table and then select
    a decision plan output field in the Decision Plan property list. For example, click the ContentManager:SetAttribute:ClaimType metadata
    property in the table and then select the ContentManager:SetAttribute:ClaimType field
    in the Decision Plan property list. Repeat
    this step for the ICC:ExpirationDate property.
- Add the Calculate Expiration Date task after the IBM Content
Classification task. Configure
the task to set the expiration date based on the value of the ICC:ExpirationDate
metadata property.
- Configure the CM 8.x Configure Item Types task to set the
ClaimType attribute based on the value of the ContentManager:SetAttribute:ClaimType
metadata property.
- To store a history of the classification decisions, configure
the audit task in your task route to log the ICC:ExpirationDate and
ContentManager:SetAttribute:ClaimType user-defined metadata properties.
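The ordering constraints in the procedure above can be summarized as a simple check. The sketch below is a hypothetical aid for reasoning about the task route, not something Content Collector provides: it assumes the route can be written down as a plain list of task names, and the names REQUIRED_ORDER and order_violations are invented for this example. Where a task appears more than once, only its first occurrence is considered.

```python
# Ordering rules stated in the procedure: (earlier task, later task).
REQUIRED_ORDER = [
    ("EC Extract Metadata", "EC Prepare Email for Archiving"),
    ("EC Prepare Email for Archiving", "IBM Content Classification"),
    ("IBM Content Classification", "Calculate Expiration Date"),
]


def order_violations(route):
    """Return a list of human-readable problems found in a task route."""
    problems = []
    for earlier, later in REQUIRED_ORDER:
        missing = [name for name in (earlier, later) if name not in route]
        if missing:
            problems.extend(f"missing task: {name}" for name in missing)
        elif route.index(earlier) > route.index(later):
            problems.append(f"{later} must come after {earlier}")
    return problems


# Hypothetical route written as an ordered list of task names.
route = [
    "EC Extract Metadata",
    "EC Prepare Email for Archiving",
    "IBM Content Classification",
    "Calculate Expiration Date",
    "CM 8.x Configure Item Types",
]
print(order_violations(route))
```

An empty result means the listed tasks appear in the order that the procedure requires; it says nothing about how each task is configured.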
What to do next
After your system is in production, you can audit the classification
performance by using
Classification Center to
review the emails that were ingested into
IBM Content
Manager. You can use
Classification Center to periodically verify
that the correct claim type attribute values and expiration dates
were set when emails were ingested. If needed, you can reclassify
particular emails to help improve the classification of emails in
the future.
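As a rough picture of what such a periodic check measures, the following Python sketch computes how often the claim type assigned at ingestion matches the value that a reviewer confirmed. It is an illustration only and does not use Classification Center: the record layout, the FireCommercial value, and the function name agreement_by_assigned_type are assumptions.

```python
from collections import Counter

# Hypothetical review records:
# (claim type assigned at ingestion, claim type confirmed by a reviewer).
reviewed = [
    ("FireResidential", "FireResidential"),
    ("FireResidential", "FireCommercial"),
    ("FireResidential", "FireResidential"),
]


def agreement_by_assigned_type(records):
    """Return, per assigned claim type, the share of records the reviewer confirmed."""
    total = Counter()
    confirmed_count = Counter()
    for assigned, confirmed in records:
        total[assigned] += 1
        if assigned == confirmed:
            confirmed_count[assigned] += 1
    return {claim_type: confirmed_count[claim_type] / total[claim_type] for claim_type in total}


print(agreement_by_assigned_type(reviewed))
```

A low share for a particular claim type would suggest that emails of that type are good candidates for review and reclassification.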