Text Mining node: TextMiningWorkbench
You can use the following parameters to define or update a node through
scripting. The node itself is called TextMiningWorkbench.
Important: It is not possible to specify a different resource template via scripting. If
you need a different template, you must select it in the node dialog box.
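As a rough illustration of defining or updating the node through scripting, the sketch below applies a handful of properties by name. It assumes the node object exposes `setPropertyValue` and `getPropertyValue`, as node objects do in Modeler Python scripting. `StubNode`, `configure_text_mining`, and the field name `Comments` are hypothetical and exist only so that the sketch runs outside Modeler.

```python
class StubNode(object):
    """Hypothetical stand-in for a Modeler node so the sketch runs anywhere."""

    def __init__(self, node_type):
        self.node_type = node_type
        self.properties = {}

    def setPropertyValue(self, name, value):
        # Record the value under the scripting property name.
        self.properties[name] = value

    def getPropertyValue(self, name):
        return self.properties[name]


def configure_text_mining(node, settings):
    """Apply each (property name, value) pair to the node, in sorted name order."""
    for name in sorted(settings):
        node.setPropertyValue(name, settings[name])
    return node


# Inside Modeler, the node would come from the current stream instead of StubNode.
node = StubNode("TextMiningWorkbench")
configure_text_mining(node, {
    "text": "Comments",           # hypothetical field name
    "method": "ReadText",
    "encoding": "UTF-8",          # quoted, as the encoding row requires
    "language": "en",
    "model_output_type": "Model",
    "concept_count_limit": 2,
})
print(node.getPropertyValue("encoding"))
```

Keeping the settings in a dictionary makes it easy to review them in one place before they are applied to the node.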
Scripting properties | Data type | Property description |
---|---|---|
text | field | |
method | ReadText, ReadPath | |
docType | integer | With possible values (0,1,2) where 0 = Full Text, 1 = Structured Text, and 2 = XML |
encoding | Automatic, "UTF-8", "UTF-16", "ISO-8859-1", "US-ASCII", "CP850", "EUC-JP", "SHIFT-JIS", "ISO2022-JP" | Note that values with special characters, such as "UTF-8", should be quoted to avoid confusion with a mathematical operator. |
unity | integer | With possible values (0,1) where 0 = Paragraph and 1 = Document |
para_min | integer | |
para_max | integer | |
mtag | string | Contains all the mtag settings (from Settings dialog box for XML files) |
mclef | string | Contains all the mclef settings (from Settings dialog box for Structured Text files) |
partition | field | |
custom_field | flag | Indicates whether or not a partition field will be specified. |
use_model_name | flag | |
model_name | string | |
use_partitioned_data | flag | If a partition field is defined, only the training data are used for model building. |
model_output_type | Interactive, Model | Interactive results in a category model. Model results in a concept model. |
use_interactive_info | flag | For building interactively in a workbench session only. |
reuse_extraction_results | flag | For building interactively in a workbench session only. |
interactive_view | Categories, TLA, Clusters | For building interactively in a workbench session only. |
extract_top | integer | This parameter is used when model_type = Concept |
use_check_top | flag | |
check_top | integer | |
use_uncheck_top | flag | |
uncheck_top | integer | |
language | de, en, es, fr, it, ja, nl, pt | |
frequency_limit | integer | Deprecated in 14.0. |
concept_count_limit | integer | Limit extraction to concepts with a global frequency of at least this value. |
fix_punctuation | flag | |
fix_spelling | flag | |
spelling_limit | integer | |
extract_uniterm | flag | |
extract_nonlinguistic | flag | |
upper_case | flag | |
group_names | flag | |
permutation | integer | Maximum nonfunction word permutation (the default is 3). |
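Several properties in the table accept only a fixed set of values, and two of them (docType and unity) use integer codes. The following sketch shows one possible way to check a settings dictionary against those documented sets before applying it to a node. The helper `check_settings` and the lookup tables are hypothetical names introduced for illustration, not part of the product's scripting API.

```python
# Integer codes and their meanings, as listed in the table.
DOC_TYPES = {0: "Full Text", 1: "Structured Text", 2: "XML"}
UNITY = {0: "Paragraph", 1: "Document"}

# Properties restricted to an enumerated set of values.
ENUMERATED = {
    "method": ("ReadText", "ReadPath"),
    "model_output_type": ("Interactive", "Model"),
    "interactive_view": ("Categories", "TLA", "Clusters"),
    "language": ("de", "en", "es", "fr", "it", "ja", "nl", "pt"),
    "encoding": ("Automatic", "UTF-8", "UTF-16", "ISO-8859-1", "US-ASCII",
                 "CP850", "EUC-JP", "SHIFT-JIS", "ISO2022-JP"),
    "docType": tuple(DOC_TYPES),
    "unity": tuple(UNITY),
}


def check_settings(settings):
    """Return one error string per value that falls outside its documented set."""
    errors = []
    for name in sorted(settings):
        allowed = ENUMERATED.get(name)
        if allowed is not None and settings[name] not in allowed:
            errors.append("%s: %r is not one of %r" % (name, settings[name], allowed))
    return errors


settings = {"docType": 2, "unity": 3, "language": "en"}
for problem in check_settings(settings):
    print(problem)                       # reports the out-of-range unity value
print(DOC_TYPES[settings["docType"]])    # meaning of the docType code
```

Properties that are not enumerated, such as para_min or model_name, pass through unchecked in this sketch.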