Creating rules to find documents that fit different categories
is time-consuming and requires constant, meticulous adjustment. As an
alternative, you can import a classification model that was built from sets
of training documents and use it to find other, similar documents.
About this task
Using previously harvested data, you can create an auto-classification
model.
Procedure
- Determine the categories into which you want the auto-classification
model to classify documents.
- Using IBM® StoredIQ® Data Workbench, create a filter for each category
to capture documents that are representative of the category.
- For each filter, create an infoset. The members of the
resulting infoset become the "training corpus" for the category.
- For each infoset, run a copy action with IBM StoredIQ Data Workbench
to copy its members to a folder that the IBM Content Classification
application can access.
- Use the IBM Content Classification application to create a decision plan and knowledge base by
importing the training corpora that you created.
Note: The IBM StoredIQ auto-classification feature requires a classification model that
consists of one decision plan and at least one knowledge base.
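Before you import the training corpora, it can help to confirm that every category folder actually received documents from the copy action. The following Python sketch is not part of IBM StoredIQ or IBM Content Classification. It assumes a hypothetical layout in which each infoset was copied into its own subfolder, named after the category, under one shared root folder. The function names are illustrative only.

```python
from pathlib import Path


def summarize_training_corpus(root):
    """Return {category_name: document_count} for each subfolder of root.

    Assumption: each category's infoset was copied into its own subfolder
    directly under `root` (a hypothetical layout, not an IBM requirement).
    """
    counts = {}
    for category_dir in sorted(Path(root).iterdir()):
        if category_dir.is_dir():
            # Count regular files at any depth below the category folder.
            counts[category_dir.name] = sum(
                1 for p in category_dir.rglob("*") if p.is_file()
            )
    return counts


def empty_categories(counts):
    """Return the names of categories that have no training documents."""
    return [name for name, n in counts.items() if n == 0]
```

A category that appears in the result of `empty_categories` has no training documents, which suggests that its filter or copy action needs to be checked before the import.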