Creating and deploying a custom global analysis plug-in

You can create plug-ins that apply custom logic in addition to the default global analysis tasks that run during the indexing process.

About this task

Restriction: Custom global analysis is available only for collections that use IBM® InfoSphere® BigInsights. Jaql must be installed on the InfoSphere BigInsights server.

Sample plug-ins are provided in the ES_INSTALL_ROOT/samples/jaql directory.

Procedure

To create and deploy a custom global analysis plug-in:

  1. Develop one or more Jaql scripts to specify the custom global analysis processing to perform.
  2. Create the custom global analysis configuration file. In a text editor, create a file with the name install.jaql. The format of the file is a JSON record, as shown in the following example.
    {
       "name" : "CustomGAJob-TfIdf",
       "description" : "Compute \"TF-IDF\" of noun words for each document",
       "variables" : {
         "jaql.files.list" : [ "./modules/tfidfMain.jaql" ],
         "jars.list" : [ "./lib/es.jaql.example.jar", "./lib/es.jaql.example2.jar" ],
         "es.jaql.path" : "./modules"
       }
    }
    The record has three primary fields.
    name
    The name of the custom global analysis task. This field is required and is used as the ID for monitoring and logging.
    description
    The description of the custom global analysis task. This field is optional. If specified, the description is displayed in the administration console.
    variables
    Contains one or more of the following fields that specify paths to required files. All paths are relative to the install.jaql file.
      jaql.files.list
      An array of paths to the Jaql script files to run.
      jars.list
      An array of paths to the JAR files that are used by the Java user-defined functions (UDFs). If Java UDFs are not used, this entry is not required.
      es.jaql.path
      A string that specifies the directory that contains additional Jaql script files. These Jaql script files contain functions that are imported by the Jaql script files that are specified in the jaql.files.list field.
  3. Add all required files for the plug-in to an archive file that has the .zip file extension. In addition to the Jaql scripts, the archive file must contain the custom global analysis configuration file (install.jaql) and any JAR files that are needed by the Jaql scripts. Save the install.jaql file at the top level of the archive file, as shown in the following example:
    • ./install.jaql
    • ./modules/tfidfMain.jaql
    • ./modules/tfidf.jaql
    • ./modules/icautil.jaql
    • ./lib/es.jaql.example.jar
    • ./lib/es.jaql.example2.jar
  4. To deploy the custom global analysis plug-in, configure a custom global analysis task for a collection in the administration console to specify which fields and facets to pass to the script for analysis. After you configure the task, you must restart the parse and index services for the collection. If you do not set a schedule for custom global analysis, the task starts automatically after all documents are parsed and indexed.
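
Example: checking and packaging the plug-in files

The configuration and packaging rules in steps 2 and 3 can be checked before deployment with a small helper script. The following Python sketch is not part of the product. It assumes that the install.jaql record is strict JSON (as in the example in step 2) and that all plug-in files are in one local directory. The function names check_plugin_dir and build_archive are hypothetical.

```python
import json
import zipfile
from pathlib import Path

CONFIG_NAME = "install.jaql"  # configuration file name from step 2


def check_plugin_dir(plugin_dir):
    """Return a list of problems found in the plug-in directory (empty if none)."""
    plugin_dir = Path(plugin_dir)
    config_file = plugin_dir / CONFIG_NAME
    if not config_file.is_file():
        return [f"{CONFIG_NAME} is missing from {plugin_dir}"]

    # Assumption: the record is strict JSON, as in the documented example.
    config = json.loads(config_file.read_text(encoding="utf-8"))
    variables = config.get("variables", {})
    problems = []

    # "name" is the only field that the documentation marks as required.
    if not config.get("name"):
        problems.append('required field "name" is missing or empty')

    # Paths in jaql.files.list and jars.list are relative to install.jaql.
    listed = list(variables.get("jaql.files.list", []))
    listed += list(variables.get("jars.list", []))
    for rel in listed:
        if not (plugin_dir / rel).is_file():
            problems.append(f"referenced file not found: {rel}")

    module_dir = variables.get("es.jaql.path")
    if module_dir is not None and not (plugin_dir / module_dir).is_dir():
        problems.append(f"es.jaql.path is not a directory: {module_dir}")
    return problems


def build_archive(plugin_dir, zip_path):
    """Create a .zip archive with install.jaql at the top level; return its entry names."""
    plugin_dir, zip_path = Path(plugin_dir), Path(zip_path)
    problems = check_plugin_dir(plugin_dir)
    if problems:
        raise ValueError("; ".join(problems))

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(plugin_dir.rglob("*")):
            # Skip directories and the archive itself if it is inside plugin_dir.
            if path.is_file() and path.resolve() != zip_path.resolve():
                # Entry names are relative to plugin_dir, for example modules/tfidf.jaql.
                archive.write(path, path.relative_to(plugin_dir).as_posix())

    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()
```

For example, build_archive("tfidf-plugin", "tfidf-plugin.zip") returns the entry names of the new archive, or raises ValueError with the list of problems if a referenced script or JAR file is missing. Because entry names are written relative to the plug-in directory, install.jaql is stored at the top level of the archive, as step 3 requires.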