Using the Information Extraction Web tool to create extractors


As mentioned earlier, you can use the Information Extraction Web Tool to create your extractors. The web tool is available in two ways:
  • If you have IBM BigInsights installed, open the web tool from the BigInsights home page by clicking "Text Analytics".
  • IBM Streams includes a stand-alone version of the Information Extraction Web Tool in <STREAMS_INSTALL>/etc/text-analytics that runs in its own web server.

To get started with the stand-alone version, follow these instructions:
  1. Unpack the archive into a directory:
    
    `tar -x -f $STREAMS_INSTALL/etc/text-analytics/text-analytics-web-tool.tar.gz -C <target-directory>`
    
  2. Start the server:
    
     cd <target-directory>/bin
     ./text-analytics-web-tooling-start.sh 
    
  3. After the server starts, open http://localhost:9080/TextAnalyticsWeb/html/IEWTApp.html in a browser to access the web tool. To change the default web server port, edit the line -Djetty.port=9080 in <target-directory>/start.ini.
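
The port change described in step 3 can also be scripted. The following is a minimal sketch, assuming that start.ini contains the -Djetty.port=9080 line as quoted above. The helper name set_jetty_port is hypothetical and is not part of the web tool.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the web tool): rewrite the port on
# the -Djetty.port line of a start.ini file.
# Usage: set_jetty_port <path-to-start.ini> <new-port>
set_jetty_port() {
    ini_file=$1
    new_port=$2
    tmp_file="${ini_file}.tmp"
    # Replace only the numeric port value; all other lines pass through unchanged.
    # Write to a temporary file first, because in-place sed flags differ across platforms.
    sed "s/-Djetty\.port=[0-9][0-9]*/-Djetty.port=${new_port}/" "$ini_file" > "$tmp_file" &&
        mv "$tmp_file" "$ini_file"
}

# Example call (substitute your own target directory and port):
# set_jetty_port "<target-directory>/start.ini" 9090
```

The new port presumably takes effect the next time the server is started, and the URL in step 3 must then use the new port in place of 9080.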

When you have created and saved your extractor, you need to export it before you can use it with the TextExtract operator:

  1. In the web tool, select the extractor in the Extractors pane, and choose Export.
  2. Select Export Source under What to Export.
  3. Select "local files" under Where to Export.
  4. Enter the name of the exported zip file, such as "myextractor.zip", in the box labeled Name of File.
  5. Click OK. A zip file containing the extractor will be downloaded to the local file system.
  6. Unpack the contents of the zip file from the previous step into a folder that will be accessible by the Streams runtime, or place it in the "etc" folder of your SPL project.
  7. In your application, you can use the exported extractor by:
    • Setting the moduleSearchPath parameter to the path where you unpacked the extractor in step 6,
    • Setting the outputViews parameter to the name of the extractor you created, as in the following snippet:
      
      type PersonSearchType = rstring name, rstring city;
      stream<PersonSearchType> PersonSearchStream = TextExtract(InputStream) {
        param
          // Directory where the exported extractor was unpacked in step 6
          moduleSearchPath: "etc/exportedAQL";
          outputMode: "multiPort";
          // Name of the extractor that you created in the web tool
          outputViews: "PersonSearch";
          tokenizer: "STANDARD";
      }
      

To learn more about creating extractors in the web tool, consult the web tool documentation.