Use Content Analytics Studio to
easily create and deploy custom text analytics for IBM® Content
Analytics with Enterprise Search applications.
Content Analytics Studio is a complete
development environment for the building, customization, and testing
of dictionaries, rules, and UIMA annotators. This environment eliminates
the need for specialist knowledge of the underlying technologies of
natural language processing or UIMA. Content Analytics Studio enables you to develop
text analysis engines without needing to write any code.
Use Content Analytics Studio and IBM Content
Analytics with Enterprise Search to iteratively develop
custom annotators. After you develop a Content Analytics Studio UIMA pipeline, you
can export it to IBM Content
Analytics with Enterprise Search.
You can then verify the annotation results by exploring facets in
the content analytics miner and enterprise
search application. Based on the results, you can go back to Content Analytics Studio and fine-tune the text
analysis engine. Then export the updated UIMA pipeline to IBM Content
Analytics with Enterprise Search and verify your changes
in the content analytics miner.
You can use
Content Analytics Studio for
the following tasks:
- Build language and domain resources into dictionaries.
- Develop character rules to recognize patterns of characters that
represent specific types of information, such as telephone numbers,
email addresses, or URLs.
- Develop parsing rules to analyze tokens and other UIMA annotations
created by previous annotators, such as dictionary-based and character
rule-based annotators. You can also define parsing rules to create
new annotations by identifying patterns of words.
- Create UIMA annotators that are based on dictionaries and rules.
- Annotate text and browse the contents of each annotation.
- Export a UIMA pipeline as a UIMA PEAR file that can be manually
deployed in IBM Content
Analytics with Enterprise Search.
- Export a UIMA pipeline directly to IBM Content
Analytics with Enterprise Search and create fields and facets
that are mapped to the annotations. The exported UIMA pipeline is
automatically added as the custom annotator stage of the document
processing pipeline in a collection.
- Analyze documents using an IBM Content
Analytics with Enterprise Search pipeline.
Content Analytics Studio can pass
documents in the project to an IBM Content
Analytics with Enterprise Search server
to be annotated by the document processing pipeline associated with
a collection. Content Analytics Studio can
then display the resulting annotations that are returned from the IBM Content
Analytics with Enterprise Search server.
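
The character rules in the task list above recognize patterns such as telephone numbers, email addresses, and URLs. In Content Analytics Studio you define these rules in the tool itself, without writing code. The following Python sketch only illustrates the underlying idea of matching character patterns and recording each match as an annotation with a type and offsets. The names `Annotation`, `CHARACTER_RULES`, and `annotate`, and the deliberately simple patterns, are assumptions made for this illustration and are not part of the product.

```python
import re
from typing import List, NamedTuple


class Annotation(NamedTuple):
    kind: str    # annotation type, for example "Email"
    begin: int   # start offset of the match in the text
    end: int     # end offset of the match (exclusive)
    text: str    # the covered text


# Hypothetical, simplified patterns for illustration only.
# Production rules would need to handle many more formats.
CHARACTER_RULES = {
    "Email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "Url": re.compile(r"https?://[^\s<>\"]+"),
    "Phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def annotate(text: str) -> List[Annotation]:
    """Apply every character rule and return the matches sorted by offset."""
    found = []
    for kind, pattern in CHARACTER_RULES.items():
        for match in pattern.finditer(text):
            found.append(
                Annotation(kind, match.start(), match.end(), match.group())
            )
    return sorted(found, key=lambda a: a.begin)


if __name__ == "__main__":
    sample = (
        "Call 555-123-4567 or mail help@example.com, "
        "see http://example.com/docs for details."
    )
    for annotation in annotate(sample):
        print(annotation)
```

Each result carries a type and character offsets, which is loosely how a rule-based annotator marks spans of text for later stages, such as parsing rules, to build on.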
Before you can export a text analysis engine to
IBM Content
Analytics with Enterprise Search or analyze documents in
IBM Content
Analytics with Enterprise Search, you must configure
Content Analytics Studio to access the
IBM Content
Analytics with Enterprise Search server by creating an
IBM Content
Analytics with Enterprise Search server connection file.
For enterprise search collections: Before
you export a text analysis engine to an IBM Content
Analytics with Enterprise Search enterprise search collection,
enable the document cache to avoid recrawling content. If the document
cache is not enabled, you must run a full recrawl after you export
the text analysis engine to IBM Content
Analytics with Enterprise Search.
For more information about using Content Analytics Studio, see the Content Analytics Studio online help and documentation.