Text Mining node: TextMiningWorkbench

You can use the following parameters to define or update a node through scripting. The node itself is called TextMiningWorkbench.
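
As a minimal sketch of how the properties in Table 1 might be applied from a Python script: the example below assumes that node objects expose a name/value setter in the style of `setPropertyValue(name, value)`. `FakeNode` and `apply_properties` are hypothetical stand-ins so that the example runs outside the product, and the field name and property values are illustrative only.

```python
class FakeNode:
    """Hypothetical stand-in for a node object; it only records property values."""

    def __init__(self, node_type):
        self.node_type = node_type
        self.properties = {}

    def setPropertyValue(self, name, value):
        # Assumed setter style: one property name, one value per call.
        self.properties[name] = value

    def getPropertyValue(self, name):
        return self.properties[name]


def apply_properties(node, settings):
    """Set each scripting property on the node, one call per property."""
    for name, value in settings.items():
        node.setPropertyValue(name, value)
    return node


# Illustrative values only; "Comments" is a made-up field name.
settings = {
    "text": "Comments",
    "method": "ReadText",
    "docType": 0,           # 0 = Full Text
    "encoding": "UTF-8",    # quoted because of the special character
    "unity": 1,             # 1 = Document
    "language": "en",
    "model_output_type": "Model",
}

node = apply_properties(FakeNode("TextMiningWorkbench"), settings)
print(node.getPropertyValue("encoding"))
```

In a real script the node would come from the current stream rather than from `FakeNode`; only the loop over name/value pairs is the point of the sketch.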

Important: You cannot specify a different resource template through scripting. If you need a different template, select it in the node dialog box.
Table 1. Text Mining modeling node scripting properties
| Scripting property | Data type | Property description |
|---|---|---|
| text | field | |
| method | ReadText, ReadPath | |
| docType | integer | Possible values are 0, 1, and 2, where 0 = Full Text, 1 = Structured Text, and 2 = XML. |
| encoding | Automatic, "UTF-8", "UTF-16", "ISO-8859-1", "US-ASCII", "CP850", "EUC-JP", "SHIFT-JIS", "ISO2022-JP" | Values that contain special characters, such as "UTF-8", should be quoted to avoid confusion with a mathematical operator. |
| unity | integer | Possible values are 0 and 1, where 0 = Paragraph and 1 = Document. |
| para_min | integer | |
| para_max | integer | |
| mtag | string | Contains all the mtag settings (from the Settings dialog box for XML files). |
| mclef | string | Contains all the mclef settings (from the Settings dialog box for Structured Text files). |
| partition | field | |
| custom_field | flag | Indicates whether a partition field will be specified. |
| use_model_name | flag | |
| model_name | string | |
| use_partitioned_data | flag | If a partition field is defined, only the training data are used for model building. |
| model_output_type | Interactive, Model | Interactive results in a category model. Model results in a concept model. |
| use_interactive_info | flag | For building interactively in a workbench session only. |
| reuse_extraction_results | flag | For building interactively in a workbench session only. |
| interactive_view | Categories, TLA, Clusters | For building interactively in a workbench session only. |
| extract_top | integer | Used when building a concept model, that is, when model_output_type = Model. |
| use_check_top | flag | |
| check_top | integer | |
| use_uncheck_top | flag | |
| uncheck_top | integer | |
| language | de, en, es, fr, it, ja, nl, pt | |
| frequency_limit | integer | Deprecated in 14.0. |
| concept_count_limit | integer | Limit extraction to concepts with a global frequency of at least this value. |
| fix_punctuation | flag | |
| fix_spelling | flag | |
| spelling_limit | integer | |
| extract_uniterm | flag | |
| extract_nonlinguistic | flag | |
| upper_case | flag | |
| group_names | flag | |
| permutation | integer | Maximum nonfunction word permutation (the default is 3). |
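
Several properties in Table 1 accept only a fixed set of values, so a script can check its settings before applying them. The following is a rough sketch of such a check, not part of the product: `ALLOWED_VALUES` and `find_invalid` are hypothetical names, and the value sets are copied from the table. Properties without a documented value list (for example, `para_min`) are not checked.

```python
# Allowed values copied from Table 1; properties not listed here are skipped.
ALLOWED_VALUES = {
    "method": {"ReadText", "ReadPath"},
    "docType": {0, 1, 2},
    "encoding": {
        "Automatic", "UTF-8", "UTF-16", "ISO-8859-1", "US-ASCII",
        "CP850", "EUC-JP", "SHIFT-JIS", "ISO2022-JP",
    },
    "unity": {0, 1},
    "model_output_type": {"Interactive", "Model"},
    "interactive_view": {"Categories", "TLA", "Clusters"},
    "language": {"de", "en", "es", "fr", "it", "ja", "nl", "pt"},
}


def find_invalid(settings):
    """Return (name, value) pairs whose value is not in the documented set."""
    problems = []
    for name, value in settings.items():
        allowed = ALLOWED_VALUES.get(name)
        if allowed is not None and value not in allowed:
            problems.append((name, value))
    # Sort by property name so the report order is stable.
    return sorted(problems, key=lambda pair: pair[0])


candidate = {"docType": 3, "language": "en", "encoding": "UTF-32", "para_min": 10}
print(find_invalid(candidate))
# prints [('docType', 3), ('encoding', 'UTF-32')]
```

Keeping the check separate from the code that sets properties means a mistyped value, such as an unsupported encoding name, is reported before any node is modified.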