Toolkit com.ibm.streams.text 2.3.2

General Information

The Text Toolkit includes operators that extract information from text data.

The Text Toolkit integrates the Text Analytics component of IBM BigInsights Version 4.2.0.0, which provides a system for extracting information from text data. By using the Text Toolkit, a streams processing application can read text data and derive structured information from it according to a set of rules. These rules are defined in extractors, which are programs that extract information from within a text field.

Extractors can be created with the Information Extraction Web Tool in BigInsights Version 4.0 and later, or they can be written manually in AQL. The output of an extractor is annotated text that identifies the specific information that is important to your business. By using the TextExtract operator in your application, you can output this information as tuples on a data stream.

BigInsights Text Analytics includes a set of pre-built extractors that extract mentions of general information such as names, e-mail addresses, and currency amounts from input text. These pre-built extractors are bundled with the Text Toolkit, and the TextExtract operator can load and run them. See the "Examples" section of the TextExtract operator's documentation for an example.

The unit of compilation in the Text Toolkit is a module, which is one or more AQL files in a directory. Modules can have input that is specified at run time in the form of dictionaries and tables. When a module is compiled, the result is a TAM file.
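
As an illustration only, a module could consist of a single AQL file such as the sketch below. The module name contacts, the view names, the dictionary name ProductNames, and the regular expression are all hypothetical and are not part of the toolkit; the sketch assumes the standard Document view with its text field.

```
-- Hypothetical file: contacts/main.aql
-- Every AQL file in the directory "contacts" starts with the same
-- module declaration and is compiled together into one TAM file.
module contacts;

-- Run-time input: the dictionary content is not compiled into the
-- module but is supplied when the extractor is loaded.
create external dictionary ProductNames
allow_empty true
with case insensitive;

-- Simplified rule that marks e-mail addresses in the input text.
create view EmailAddress as
  extract regex /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/
    on D.text as address
  from Document D;

-- Rule that marks mentions of entries from the external dictionary.
create view ProductMention as
  extract dictionary 'ProductNames'
    on D.text as product
  from Document D;

-- Only output views are visible to the application that runs the module.
output view EmailAddress;
output view ProductMention;
```

In this sketch each output view would correspond to one kind of annotation that an application can receive, and the external dictionary is the part of the module whose content can change without recompiling the AQL.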

NOTE: The pre-built sentiment extractors are not supported on IBM Power Systems.

Additional information

BigInsights Text Analytics documentation
Using the Information Extraction Web tool to create extractors
Toolkit structure
Developing and running applications that use the Text Toolkit
createTypes script
Version
2.3.2
Required Product Version
4.2.1.0

Indexes

Namespaces
Operators
Functions
Types

Namespaces

com.ibm.streams.text.analytics
Operators
Functions
Types