Interactive workbench mode

From a text mining modeling node, you can choose to launch an interactive workbench session during stream execution. In this workbench, you can extract key concepts from your text data, build categories, explore text link analysis patterns and clusters, and generate category models. This section describes the workbench interface at a high level, along with the major elements you will work with, including:

  • Extraction results. After an extraction is performed, these are the key words and phrases identified and extracted from your text data, also referred to as concepts. These concepts are grouped into types. Using these concepts and types, you can explore your data as well as create your categories. These are managed in the Categories and Concepts view.
  • Categories. Using descriptors (such as extraction results, patterns, and rules) as a definition, you can manually or automatically create a set of categories to which documents and records are assigned based on whether or not they contain a part of the category definition. These are managed in the Categories and Concepts view.
  • Clusters. A cluster is a grouping of concepts between which links have been discovered, indicating a relationship among them. The concepts are grouped using a complex algorithm that considers, among other factors, how often two concepts appear together compared to how often they appear separately. These are managed in the Clusters view. You can also add the concepts that make up a cluster to categories.
  • Text link analysis patterns. If you have text link analysis (TLA) pattern rules in your linguistic resources or are using a resource template that already has some TLA rules, you can extract patterns from your text data. These patterns can help you uncover interesting relationships between concepts in your data. You can also use these patterns as descriptors in your categories. These are managed in the Text Link Analysis view.
  • Linguistic resources. The extraction process relies on a set of parameters and linguistic definitions to govern how text is extracted and handled. These are managed in the form of templates and libraries in the Resource Editor view.
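
To make the co-occurrence idea behind clusters more concrete, the following minimal Python sketch scores pairs of concepts by how often they appear together relative to how often either appears at all. It is an illustration only, not the algorithm the product uses, which takes other factors into account. The function name cooccurrence_similarity, the Jaccard-style measure, and the sample records are all assumptions made for this example.

```python
from itertools import combinations


def cooccurrence_similarity(records):
    """Score concept pairs with a Jaccard-style co-occurrence measure.

    records: an iterable of sets, each holding the concepts found in one
    document or record (hypothetical input format for this sketch).
    Returns a dict mapping (concept_a, concept_b) to a score in (0, 1].
    """
    concept_counts = {}  # number of records containing each concept
    pair_counts = {}     # number of records containing both concepts

    for concepts in records:
        for concept in concepts:
            concept_counts[concept] = concept_counts.get(concept, 0) + 1
        # Sort so that each pair is always stored under the same key order.
        for a, b in combinations(sorted(concepts), 2):
            pair_counts[(a, b)] = pair_counts.get((a, b), 0) + 1

    scores = {}
    for (a, b), together in pair_counts.items():
        # Records containing at least one of the two concepts.
        either = concept_counts[a] + concept_counts[b] - together
        scores[(a, b)] = together / either
    return scores


# Illustrative data: each set stands for the concepts in one record.
sample_records = [
    {"price", "service"},
    {"price", "service"},
    {"price"},
    {"service", "staff"},
]
sample_scores = cooccurrence_similarity(sample_records)
```

In this sample, "price" and "service" appear together in two of the four records that mention either one, so the pair scores 0.5. Pairs that never appear in the same record, such as "price" and "staff", receive no score at all.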

Potential Interactive Workbench issues

  • Multiple Interactive Workbench sessions can cause sluggish behavior. SPSS® Modeler Text Analytics and SPSS Modeler share a common Java run-time engine when an interactive workbench session is launched. Depending on the number of Interactive Workbench sessions you invoke during an SPSS Modeler session, memory consumption may cause the application to become sluggish, even if you are opening and closing the same session. This effect may be especially pronounced if you are working with large data or have a machine with less than the recommended 4 GB of RAM. If you notice that your machine is slow to respond, save all your work, shut down SPSS Modeler, and relaunch the application. Running SPSS Modeler Text Analytics on a machine with less than the recommended memory, particularly when working with large data sets or for prolonged periods of time, may cause Java to run out of memory and shut down. If you work with large data, it is strongly suggested that you upgrade to the recommended amount of memory or more, or use SPSS Modeler Text Analytics Server.
  • SPSS Modeler Client can run out of memory after multiple SPSS Modeler Text Analytics Interactive Workbench sessions are run without restarting the application. Monitor the memory usage in the status line and, if memory is running low, close and reopen SPSS Modeler Client.