Text analysis in InfoSphere Warehouse, Part 1: Architecture overview and example of information extraction with regular expressions

Gain business insights from unstructured data

From the developerWorks archives

Stefan Abraham, Simone Daum, and Benjamin Leonhardi

Date archived: January 12, 2017 | First published: June 04, 2009

Unstructured information is the largest, most current, and fastest-growing source of information available today. It is found in many sources, such as call center records, repair reports, product reviews, and e-mails. The text analysis features of IBM® InfoSphere™ Warehouse can help you uncover the hidden value in this unstructured data. This series of articles covers the general architecture of the text analysis capabilities of InfoSphere Warehouse and the business opportunities they open up for analyzing unstructured data. Integrating these capabilities with IBM Cognos® reporting lets people across the company use the text analysis results. This first article introduces the basic architecture of the text analysis feature in InfoSphere Warehouse and includes a technical example that shows how to extract concepts from text using regular expressions.

This content is no longer being updated or maintained. The full article is provided "as is" in a PDF file. Given the rapid evolution of technology, some steps and illustrations may have changed.


