Analyzing a single document

You can analyze sample documents to view the annotations that are generated as you develop and test the UIMA pipeline. Creating a document that contains examples of the text that you want to annotate is a good way to start developing an annotator. For best results, use sample text that represents the patterns and entities that you want the pipeline to identify, such as locations and company names.

About this task

After the analysis is complete, the Outline view displays a list of the annotations that were found in the document. If the text is not annotated correctly, you can modify your linguistic resources to improve the accuracy of the pipeline. After you rebuild your modified resources, the annotations in the Outline view are automatically updated.

For annotations that are generated by a parsing rule, the rule identifier is displayed in the Properties view. To review or edit the parsing rule that generated a particular annotation, open the relevant parsing rules database and search for the rule identifier. If you update the rule, save it and rebuild the parsing rules database.

Tip: By default, all annotation types in a document are displayed in the Outline view. To hide particular annotations in the Outline view and the final output, edit the cleanup stage in the UIMA pipeline configuration file.

Procedure

To analyze a single document:

  1. Open a document in the editor view.
    • To analyze an existing document, open a text file that you pasted, dragged, or imported into the Documents directory of your project.
    • To create a document for analysis, right-click the Documents directory for your project in the Studio Explorer view and click New > File. When you specify a name for the file, ensure that you include the .txt extension. In the editor view, type or paste text into the document.
  2. Right-click the document in the editor view, click Analyze Document, and select a UIMA pipeline configuration file from the Configuration/Annotators directory of your project.
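
If you prefer to prepare sample documents outside the editor, the following is a minimal sketch (not part of the product) that writes a sample text file with the .txt extension. You can then paste, drag, or import the file into the Documents directory of your project. The class name SampleDocumentWriter, the staging directory, the file name, the sample sentences, and the choice of UTF-8 encoding are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper; the class and method names are illustrative only.
public class SampleDocumentWriter {

    // Appends ".txt" when the name does not already end with it (case-insensitive check).
    static String ensureTxtExtension(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".txt") ? fileName : fileName + ".txt";
    }

    // Creates the directory if needed, writes the text as UTF-8 (an assumed encoding),
    // and returns the path of the file that was written.
    static Path writeSample(Path directory, String fileName, String text) throws IOException {
        Files.createDirectories(directory);
        Path target = directory.resolve(ensureTxtExtension(fileName));
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumed staging location; replace it with a directory of your choice.
        Path stagingDir = Paths.get("staging");

        // Invented sample text with the kinds of entities you might want the
        // pipeline to identify, such as company names and locations.
        String text = "Acme Corporation opened a new office in Toronto.\n"
                + "Globex Inc. moved its headquarters from Berlin to Madrid.\n";

        Path written = writeSample(stagingDir, "sample-entities", text);
        System.out.println("Wrote " + written);
    }
}
```

After you import the file, open it in the editor view and analyze it as described in the procedure.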