Properties for nodes and flows
You can use scripts to modify the properties of nodes and flows.
Node properties are the specific settings for a node. These are often the same properties that you can access through the user interface, for example, when you double-click a node to open its properties. Flow properties refer to high-level flow operations, such as caching.
Structured properties
Scripts use structured properties in the following ways to increase clarity when parsing:
- To give structure to the names of properties for complex nodes, such as the Mimic and Anonymize nodes.
- To provide a format for specifying multiple properties at once.
The scripts for nodes with tables and other complex interfaces must follow a particular structure to parse correctly. These properties need a name that's more complex than the name for a single identifier. This name is called the key. Structured properties are also sometimes called keyed properties.
For example, with an Anonymize node, each available field on its upstream side is either anonymized or retained. To refer to this information, the Anonymize node stores one item of information per field (whether each field is True or False). To anonymize a field called Age, set the property enable_anonymize, with the key Age, to the value True:
# Anonymize node requires the input fields while setting the values
node.setKeyedPropertyValue("enable_anonymize", "Age", True)
node.setKeyedPropertyValue("transformation", "Age", "Random")
node.setKeyedPropertyValue("set_random_seed", "Age", True)
node.setKeyedPropertyValue("random_seed", "Age", 123)
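When several fields need the same kind of keyed settings, it can help to describe them as data first and then turn them into (property, key, value) triples. The following is a minimal sketch under stated assumptions: `keyed_settings` is a hypothetical helper written for this illustration, not part of the scripting API, and the field name `K` is used only as a sample.

```python
def keyed_settings(field_specs):
    """Flatten [(field, [(property, value), ...]), ...] into
    (property, key, value) triples, preserving the given order.
    Hypothetical helper for illustration only."""
    triples = []
    for field, settings in field_specs:
        for prop, value in settings:
            # The field name is the key of the keyed property.
            triples.append((prop, field, value))
    return triples


# Sample specification: one entry per upstream field.
specs = [
    ("Age", [("enable_anonymize", True), ("transformation", "Random")]),
    ("K", [("enable_anonymize", True)]),
]

triples = keyed_settings(specs)
print(triples)
```

In a node script, each triple could then be passed to the call shown above, for example `node.setKeyedPropertyValue(prop, key, value)` inside a loop over `triples`.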
Setting multiple properties
For many nodes, you can set more than one structured node or flow property at a time, for example:
mimic.setPropertyValue("missing_value_imputation", "True")
mimic.setPropertyValue("missing_value_imputation_strategies",[["Age","Fixed","60"],["K", "mean"]])
This is referred to as a multiset command or set block.
Another advantage that structured properties have is their ability to set several properties on a node before the node is stable. By default, a multiset command sets all properties in the block before taking any action based on an individual property setting.
Syntax for setting properties
You can set properties using the following syntax:
OBJECT.setPropertyValue(PROPERTY, VALUE)
You can retrieve the value of properties using the following syntax:
VARIABLE = OBJECT.getPropertyValue(PROPERTY)
For keyed properties, you can set properties using the following syntax:
OBJECT.setKeyedPropertyValue(PROPERTY, KEY, VALUE)
And you can retrieve the value of properties using the following syntax:
VARIABLE = OBJECT.getKeyedPropertyValue(PROPERTY, KEY)
In each of the code examples, replace OBJECT, PROPERTY, and KEY with the following values:
- OBJECT is a node or output
- PROPERTY is the name of the node property that your expression refers to
- KEY is the key value for keyed properties
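To see how plain and keyed properties differ, the four calls can be tried outside the product with a small stand-in object. This is only a sketch: `StandInNode` is a hypothetical class written for this example, not part of the scripting API. It mimics the call signatures shown above by storing values in dictionaries and performs no validation or type coercion.

```python
class StandInNode(object):
    """Hypothetical stand-in that mimics the four property calls."""

    def __init__(self):
        self._plain = {}  # PROPERTY -> VALUE
        self._keyed = {}  # PROPERTY -> {KEY -> VALUE}

    def setPropertyValue(self, prop, value):
        self._plain[prop] = value

    def getPropertyValue(self, prop):
        return self._plain[prop]

    def setKeyedPropertyValue(self, prop, key, value):
        self._keyed.setdefault(prop, {})[key] = value

    def getKeyedPropertyValue(self, prop, key):
        return self._keyed[prop][key]


node = StandInNode()

# Plain property: one value per property name.
node.setPropertyValue("custom_name", "Hide data")
custom_name = node.getPropertyValue("custom_name")

# Keyed property: one value per (property name, key) pair,
# here one setting per field name.
for field in ["Age", "K"]:
    node.setKeyedPropertyValue("enable_anonymize", field, True)
age_enabled = node.getKeyedPropertyValue("enable_anonymize", "Age")

print(custom_name)
print(age_enabled)
```

The stand-in makes the distinction visible: a plain property holds a single value, while a keyed property holds one value per key.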
Node properties
Each type of node has its own set of properties, and each property has a type. The type for a property might be a general type, such as number, flag, or string. Alternatively, the property reference might specify the range of legal values, such as Discard, PairAndDiscard, and IncludeAsText.
Common node properties
Some properties are common to all nodes, such as name, annotation, and ToolTip, while others are specific to certain types of nodes. The following properties are common to all nodes in Synthetic Data Generator.
| Property name | Data type | Property description |
|---|---|---|
| use_custom_name | flag | |
| name | string | Read-only property that reads the name (either auto or custom) for a node on the canvas. |
| custom_name | string | Specifies a custom name for the node. |
| tooltip | string | |
| annotation | string | |
| keywords | string | Structured slot that specifies a list of keywords associated with the object (for example, ["Keyword1" "Keyword2"]). |
| cache_enabled | flag | |
| node_type | all node names as specified for scripting | Read-only property used to refer to a node by type. For example, instead of referring to a node only by name, such as real_income, you can also specify the type, such as userinputnode or filternode. |
Errors for node properties
For properties that are general types, settings for the properties are coerced to the correct type. An error occurs if they can't be coerced. For properties that have a specific range of legal values, such as Discard or PairAndDiscard, an error occurs if any other value is used.
Flag properties should be read or set by using the values true and false consistently to avoid any confusion. Variations including Off, OFF, off, No, NO, no, n, N, f, F, false, False, FALSE, or 0 are also recognized when setting values. However, they might cause errors when reading property values in some cases. All other values are regarded as true.
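The rule for flag values can be made explicit in a script by normalizing inputs before they are passed to a property setter. The following sketch only illustrates the rule as stated above and is not the product's implementation. `FALSE_SPELLINGS` and `normalize_flag` are hypothetical names introduced for this example.

```python
# Spellings listed above as recognized "false" values when setting a flag.
FALSE_SPELLINGS = set([
    "Off", "OFF", "off", "No", "NO", "no", "n", "N",
    "f", "F", "false", "False", "FALSE", "0",
])


def normalize_flag(value):
    """Return a Python bool following the rule described above:
    listed false spellings map to False, every other value to True.
    Illustrative only."""
    if isinstance(value, bool):
        return value
    return str(value) not in FALSE_SPELLINGS


print(normalize_flag("off"))
print(normalize_flag(0))
print(normalize_flag("true"))
```

Normalizing once, at the point where a value enters the script, keeps later reads and comparisons consistent.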
Flow properties
Flow properties are high-level flow operations that affect an entire flow rather than individual nodes. To reference flow properties, you must set the execution method to use scripts:
stream = sdg.script.stream()
stream.setPropertyValue("execute_method", "Script")
The following flow properties are available for use in scripting.
| Property name | Data type | Property description |
|---|---|---|
| execute_method | Normal, Script | |
| date_format | "DDMMYY", "MMDDYY", "YYMMDD", "YYYYMMDD", "YYYYDDD", DAY, MONTH, "DD-MM-YY", "DD-MM-YYYY", "MM-DD-YY", "MM-DD-YYYY", "DD-MON-YY", "DD-MON-YYYY", "YYYY-MM-DD", "DD.MM.YY", "DD.MM.YYYY", "MM.DD.YYYY", "DD.MON.YY", "DD.MON.YYYY", "DD/MM/YY", "DD/MM/YYYY", "MM/DD/YY", "MM/DD/YYYY", "DD/MON/YY", "DD/MON/YYYY", MON YYYY, q, WK YYYY | |
| date_baseline | number | |
| date_2digit_baseline | number | |
| time_format | "HHMMSS", "HHMM", "MMSS", "HH:MM:SS", "HH:MM", "MM:SS", "(H)H:(M)M:(S)S", "(H)H:(M)M", "(M)M:(S)S", "HH.MM.SS", "HH.MM", "MM.SS", "(H)H.(M)M.(S)S", "(H)H.(M)M", "(M)M.(S)S" | |
| time_rollover | boolean | |
| import_datetime_as_string | boolean | Use to refresh Import nodes automatically upon flow execution. |
| decimal_places | number | |
| decimal_symbol | Default, Period, Comma | |
| angles_in_radians | boolean | |
| use_max_set_size | boolean | |
| max_set_size | number | |
| ruleset_evaluation | Voting, FirstHit | |
| refresh_source_nodes | boolean | Use to refresh Import nodes automatically upon flow execution. |
| script | string | |
| annotation | string | |
| name | string | This property is read-only. If you want to change the name of a flow, you should save it with a different name. |
| nodes | See information that follows. | |
| encoding | SystemDefault, "UTF-8" | |
| stream_rewriting | boolean | |
| stream_rewriting_maximise_sql | boolean | |
| stream_rewriting_optimise_clem_execution | boolean | |
| stream_rewriting_optimise_syntax_execution | boolean | |
| enable_parallelism | boolean | |
| sql_generation | boolean | |
| database_caching | boolean | |
| sql_logging | boolean | |
| sql_generation_logging | boolean | |
| sql_log_native | boolean | |
| sql_log_prettyprint | boolean | |
| record_count_suppress_input | boolean | |
| record_count_feedback_interval | integer | |
| use_stream_auto_create_node_settings | boolean | If true, then flow-specific settings are used, otherwise user preferences are used. |
| create_source_node_from_builders | boolean | If true, when a source builder creates a new source output, and it has no active update links, a new Import node is added. |
| create_source_node_update_links | createEnabled, createDisabled, doNotCreate | Defines the type of link created when an Import node is added automatically. |
| has_coordinate_system | boolean | If true, applies a coordinate system to the entire flow. |
| coordinate_system | string | The name of the selected projected coordinate system. |
| deployment_area | Scoring, None | Choose how you want to deploy the flow. If this value is set to None, no other deployment entries are used. |
| scoring_terminal_node_id | string | Choose the scoring branch in the flow. It can be any terminal node in the flow. |
| scoring_node_id | string | Choose the nugget in the scoring branch. |
Example script with flow properties
You can use the nodes property to create a list of all nodes in the flow and write that list in the flow annotations. The nodes property refers to the nodes in the current flow. The following flow script provides an example:
stream = sdg.script.stream()
annotation = stream.getPropertyValue("annotation")
annotation = annotation + "\n\nThis flow is called \"" + stream.getLabel() + "\" and contains the following nodes:\n"
for node in stream.iterator():
    annotation = annotation + "\n" + node.getTypeName() + " node called \"" + node.getLabel() + "\""
stream.setPropertyValue("annotation", annotation)
The list that is written in the flow annotation looks like this:
This flow is called "druglearn" and contains the following nodes:
import node called "DRUG1n"
anonymize node called "Hide data"
mimic node called "DRUG1n mimic"
generate node called "Synthetic DRUG1n data"
evaluate node called "DRUG1n quality"
export node called "Save to csv"
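The string-building part of the example script can be separated from the flow calls so that it can be checked without a running flow. The following is a minimal sketch, assuming the same text layout as the script above. `build_annotation` is a hypothetical helper, and the sample node list is hard-coded for illustration.

```python
def build_annotation(existing, flow_label, nodes):
    """Append a node listing to an existing annotation.
    nodes is a list of (type_name, label) pairs.
    Hypothetical helper for illustration only."""
    text = (existing + '\n\nThis flow is called "' + flow_label
            + '" and contains the following nodes:\n')
    for type_name, label in nodes:
        text = text + "\n" + type_name + ' node called "' + label + '"'
    return text


# Sample data standing in for the nodes of a real flow.
sample_nodes = [("import", "DRUG1n"), ("anonymize", "Hide data")]
result = build_annotation("", "druglearn", sample_nodes)
print(result)
```

In a flow script, the pairs could be collected with the calls used in the example above, for example `[(n.getTypeName(), n.getLabel()) for n in stream.iterator()]`, and the result written back with `stream.setPropertyValue("annotation", result)`.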