Build Interactively

In the Model tab of the text mining modeling node, you can choose a build mode for your model nuggets. If you choose Build interactively, then an interactive interface opens when you execute the stream. In this interactive workbench, you can:

  • Extract and explore the extraction results, including concepts and typing, to discover the salient ideas in your text data.
  • Use a variety of methods to build and extend categories from concepts, types, TLA patterns, and rules so you can score your documents and records into these categories.
  • Refine your linguistic resources (resource templates, libraries, dictionaries, synonyms, and more) so you can improve your results through an iterative process in which concepts are extracted, examined, and refined.
  • Perform text link analysis (TLA) and use the TLA patterns discovered to build better category model nuggets. The Text Link Analysis node doesn't offer the same exploratory options or modeling capabilities.
  • Generate clusters to discover new relationships and explore relationships between concepts, types, patterns, and categories in the Visualization pane.
  • Generate refined category model nuggets to the Models palette in IBM® SPSS® Modeler and use them in other streams.
Note: You cannot build an interactive model if you are creating an IBM SPSS Collaboration and Deployment Services job.

Use session work (categories, TLA, resources, etc.) from last node update. When you work in an interactive workbench session, you can update the node with session data (extraction parameters, resources, category definitions, etc.). The Use session work option allows you to relaunch the interactive workbench using the saved session data. This option is disabled the first time you use this node, since no session data could have been saved. To learn how to update the node with session data so that you can use this option, see Updating Modeling Nodes and Saving.

If you launch a session with this option, the extraction settings, categories, resources, and any other work from the last time you performed a node update from an interactive workbench session are available when you next launch a session. Because this option uses saved session data, certain content (such as the resources copied from the template below) and other tabs are disabled and ignored. If you launch a session without this option, only the contents of the node as they are currently defined are used, so any previous work you performed in the workbench is not available.

Note: If you change the source node for your stream after extraction results have been cached with the Use session work... option, you will need to run a new extraction once the interactive workbench session is launched if you want to get updated extraction results.

Skip extraction and reuse cached data and results. You can reuse any cached extraction results and data in the interactive workbench session. This option is particularly useful when you want to save time and reuse extraction results rather than waiting for a completely new extraction to be performed when the session is launched. In order to use this option, you must have previously updated this node from within an interactive workbench session and chosen the option to Keep the session work and cache text data with extraction results for reuse. To learn how to update the node with session data so that you can use this option, see Updating Modeling Nodes and Saving.
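The availability rules for these two options can be summarized in a short sketch. This is only an illustrative model of the conditions described above, not product code: NodeState, launch_options, and every field name are hypothetical, and the assumption that skipping extraction also requires saved session data is inferred from the requirement to have updated the node from a session.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical summary of what a modeling node has stored (illustrative only)."""
    has_saved_session: bool = False          # node was updated from a workbench session
    has_cached_extraction: bool = False      # update kept session work AND cached text data/results
    source_changed_since_cache: bool = False # upstream source node changed after caching


def launch_options(state: NodeState) -> dict:
    """Return which launch options make sense for the given (hypothetical) node state."""
    return {
        # Disabled the first time the node is used: nothing has been saved yet.
        "use_session_work": state.has_saved_session,
        # Requires a prior node update that cached the text data and extraction results.
        "skip_extraction": state.has_saved_session and state.has_cached_extraction,
        # Cached results are stale if the source node changed; a new extraction is needed.
        "new_extraction_needed": state.has_cached_extraction
        and state.source_changed_since_cache,
    }


first_use = launch_options(NodeState())
after_update = launch_options(NodeState(has_saved_session=True, has_cached_extraction=True))
```

In this model, a node used for the first time offers neither option, while a node updated with cached extraction results offers both.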

Begin session by. Select the option indicating the view and action you want to take place first upon launching the interactive workbench session. Regardless of the view you start in, you can switch to any view once in the session.

  • Using extraction results to build categories. This option launches the interactive workbench in the Categories and Concepts view and, if applicable, performs an extraction. In this view, you can create categories and generate a category model. You can also switch to another view. See the topic Interactive workbench mode for more information.
  • Exploring text link analysis (TLA) results. This option launches the interactive workbench in the Text Link Analysis view and begins by extracting and identifying relationships between concepts within the text, such as opinions or other links. You must select a template or text analysis package that contains TLA pattern rules in order to use this option and obtain results. If you are working with larger datasets, the TLA extraction can take some time. In this case, you may want to consider using a Sample node upstream. See the topic Exploring Text Link Analysis for more information.
  • Analyzing co-word clusters. This option launches the interactive workbench in the Clusters view and updates any outdated extraction results. In this view, you can perform co-word cluster analysis, which produces a set of clusters. Co-word clustering is a process that begins by assessing the strength of the link value between two concepts based on their co-occurrence in a given record or document and ends with the grouping of strongly linked concepts into clusters. See the topic Interactive workbench mode for more information.
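
To make the idea of co-word clustering concrete, the following minimal sketch counts how often two concepts co-occur in the same record and then groups concepts joined by strong links. It is an illustration only and does not reproduce the product's algorithm: the raw co-occurrence count stands in for the link value, the min_link threshold and the function names are assumptions, and the sample records are invented.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records):
    """Count the number of records in which each unordered concept pair appears together."""
    counts = Counter()
    for concepts in records:
        # Deduplicate and sort so each pair is counted once per record, in a stable order.
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


def cluster_concepts(records, min_link=2):
    """Group concepts connected by links with at least min_link co-occurrences."""
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in cooccurrence_counts(records).items():
        if n >= min_link:  # treat the raw count as the link strength (an assumption)
            parent[find(a)] = find(b)

    groups = {}
    for concept in list(parent):
        groups.setdefault(find(concept), set()).add(concept)
    # Only concepts with at least one strong link ever enter `parent`.
    return sorted(sorted(g) for g in groups.values())


# Invented sample: each record is the list of concepts extracted from one document.
records = [
    ["price", "service"],
    ["price", "service", "delay"],
    ["delay", "refund"],
    ["delay", "refund"],
    ["battery"],
]
clusters = cluster_concepts(records, min_link=2)
```

With this sample, "price" and "service" co-occur twice, as do "delay" and "refund", so each pair forms a cluster, while "battery" has no strong link and stays unclustered.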