Develop and integrate custom UIMA text analysis engines

Gain business insights from unstructured data

developerWorks archives

Stefan Abraham, Simone Daum, and Benjamin Leonhardi

Date archived: August 25, 2016 | First published: August 27, 2009

In the first two articles of this series, you learned about the text analysis capabilities of IBM® InfoSphere® Warehouse, how to use regular expressions and dictionaries to extract information from text, and how to publish the results in a Cognos report. This article describes how to use the Unstructured Information Management Architecture (UIMA) framework to create a custom text annotator and use it in InfoSphere Warehouse. The ability to use UIMA-based annotators in analytic flows is a powerful feature of InfoSphere Warehouse. You can write custom annotators that extract almost any information from text, and you can also use UIMA-based annotators provided by IBM, other companies, and many universities. For example, you can find UIMA annotators that tokenize text and extract concepts such as persons or sentiments.
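To make the idea of an annotator concrete, here is a minimal sketch in plain Java. It deliberately does not use the UIMA API or any InfoSphere Warehouse class: the names `AnnotatorSketch`, `Span`, and `annotate`, and the title-plus-surname pattern, are assumptions chosen only for illustration. It shows the core of what an annotator produces, namely typed spans with begin and end offsets into the document text. In Apache UIMA itself, equivalent logic would typically live in the `process(JCas)` method of a class that extends `JCasAnnotator_ImplBase`, with the spans stored as annotations in the CAS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnotatorSketch {

    /** A typed span over the document text, loosely analogous to a UIMA annotation. */
    public static final class Span {
        public final String type;
        public final int begin; // inclusive character offset
        public final int end;   // exclusive character offset

        public Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        /** Returns the text that this span covers in the given document. */
        public String coveredText(String document) {
            return document.substring(begin, end);
        }
    }

    // Hypothetical rule: a title (Mr., Ms., Dr.) followed by a capitalized word.
    private static final Pattern PERSON =
            Pattern.compile("\\b(?:Mr|Ms|Dr)\\. [A-Z][a-z]+");

    /** Scans the text and records one "Person" span for each pattern match. */
    public static List<Span> annotate(String text) {
        List<Span> spans = new ArrayList<>();
        Matcher matcher = PERSON.matcher(text);
        while (matcher.find()) {
            spans.add(new Span("Person", matcher.start(), matcher.end()));
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Dr. Smith met Ms. Jones.";
        for (Span span : annotate(text)) {
            System.out.println(span.type + " [" + span.begin + ", " + span.end + "): "
                    + span.coveredText(text));
        }
    }
}
```

Keeping offsets rather than copied strings is the design choice that matters here: downstream steps can relate each extracted concept back to its exact position in the source text.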

This content is no longer being updated or maintained. The full article is provided "as is" in a PDF file. Given the rapid evolution of technology, some steps and illustrations may have changed.

