Expected annotations are not created

When you analyze documents with your UIMA pipeline, some words or sections of text are not annotated as you expect.

Symptoms

Some terms in your analyzed document are not annotated and are not displayed in the list of annotations in the Outline tab.

Resolving the problem

Try one or more of the following possible solutions:

Check the UIMA pipeline configuration: Ensure that the linguistic resource that you expected to annotate the text is included in the appropriate stage of the UIMA pipeline configuration for the language of the analyzed document. Dictionary and character rule files must be included in the lexical analysis stage, and parsing rule files must be included in a parsing rules stage. If your pipeline automatically identifies the document language, determine the language that was identified for your document by checking the value of the Language feature of the DocumentAnnotation annotation. Then, ensure that the linguistic resource is included in the list of files for that language in the UIMA pipeline configuration.
Rebuild the associated linguistic resource: Ensure that the associated linguistic resource was rebuilt after the resource was last modified, such as after you added a dictionary entry or modified a parsing rule. Right-click the database for the linguistic resource and click Build Studio Resource.
Check for longer dictionary annotations over the same text span: For missing dictionary annotations, check whether a longer dictionary annotation was identified over the same text span. For example, in the text "John graduated from London University with a Master's degree", the term "London" from the City dictionary is not annotated separately if the University dictionary contains the entry "London University" and both dictionaries are in the same lexical analysis stage of the pipeline. If you want to generate both annotations, include the dictionaries in separate lexical analysis stages of the pipeline.
Check that the text matches the criteria that were specified for the rule: For missing character rule and parsing rule annotations, check whether the text matches the criteria that were specified for the rule. For parsing rules, check which annotations were created over the particular section of text and compare them with the selection criteria of the rule. If necessary, update the criteria so that the text is matched by the rule.
Modify the tokenization setting: For missing character rule annotations, check whether the text spans more than one token. By default, the character rules in a database do not affect tokenization. For a character rule to match text that spans multiple tokens, the character rule database must be configured to affect tokenization. Right-click the character rules database and click Properties. Click Database Link > Database Build and select Affects tokenization for the Tokenization field. The database must be closed before you can edit the attributes.
Check that your custom annotator uses the correct version of Java™: For missing annotations from a custom annotator, ensure that your custom annotator uses Java 7. Also, check the Content Analytics Studio log file for any error messages. The log file is in the .metadata directory in your Content Analytics Studio workspace.
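
The longest-match behavior described for dictionary annotations can be illustrated with a small, self-contained sketch. It is a simplified model, not the product's matching algorithm: the class LongestMatchSketch, the method runStage, and the "Type:term" output format are hypothetical names chosen for illustration, and the sketch ignores tokenization and word boundaries. It assumes that within one stage only the longest entry at a given position is kept, whereas separate stages each produce their own annotations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LongestMatchSketch {

    /**
     * Simplified model of one lexical analysis stage (hypothetical, for illustration).
     * entries maps a dictionary term to its annotation type. At each position,
     * only the longest matching term is kept.
     */
    public static List<String> runStage(String text, Map<String, String> entries) {
        List<String> annotations = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            String best = null;
            for (String term : entries.keySet()) {
                if (text.startsWith(term, pos)
                        && (best == null || term.length() > best.length())) {
                    best = term;
                }
            }
            if (best != null) {
                annotations.add(entries.get(best) + ":" + best);
                pos += best.length(); // the span is consumed, so shorter terms inside it are skipped
            } else {
                pos++;
            }
        }
        return annotations;
    }

    public static void main(String[] args) {
        String text = "John graduated from London University with a Master's degree";

        Map<String, String> city = new LinkedHashMap<>();
        city.put("London", "City");
        Map<String, String> university = new LinkedHashMap<>();
        university.put("London University", "University");

        // Both dictionaries in the same stage: only the longer entry is annotated.
        Map<String, String> sameStage = new LinkedHashMap<>(city);
        sameStage.putAll(university);
        System.out.println(runStage(text, sameStage));

        // Dictionaries in separate stages: each stage produces its own annotation.
        List<String> separateStages = new ArrayList<>(runStage(text, city));
        separateStages.addAll(runStage(text, university));
        System.out.println(separateStages);
    }
}
```

In this model, the same-stage run yields only the University annotation, and the two-stage run yields both the City and the University annotations, which mirrors the recommendation to place the dictionaries in separate lexical analysis stages.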
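
One way to check which Java level a compiled custom annotator class targets is to read the version fields in the class-file header. The following is a minimal sketch under stated assumptions: the class name ClassVersionCheck and its methods are hypothetical and not part of Content Analytics Studio, and the path to the annotator's .class file is supplied by you on the command line. It relies only on the standard class-file layout, in which major version 51 corresponds to Java 7.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ClassVersionCheck {

    /** Returns the class-file major version, for example 51 for Java 7. */
    public static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (classBytes.length < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a Java class file");
        }
        buf.getShort();                 // minor version, not needed here
        return buf.getShort() & 0xFFFF; // major version as an unsigned value
    }

    /** Maps a class-file major version to a Java release number (51 -> 7, 52 -> 8). */
    public static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        // Java version of the runtime that executes this check.
        System.out.println("Runtime Java version: " + System.getProperty("java.version"));

        // Optional argument: path to a compiled annotator class (supplied by you).
        if (args.length > 0) {
            int major = majorVersion(Files.readAllBytes(Paths.get(args[0])));
            System.out.println(args[0] + " targets Java " + javaRelease(major)
                    + " (class-file major version " + major + ")");
        }
    }
}
```

If the reported release is higher than 7, recompiling the annotator for Java 7 is a reasonable first step before you look for related error messages in the log file.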