Properties for nodes and flows

You can use scripts to modify properties for nodes and flows.

Node properties are the specific settings for a node. These are often the same properties that you can access through the user interface, for example, when you double-click a node to open its properties. Flow properties refer to high-level flow operations, such as caching.

Structured properties

Scripts use structured properties in the following ways to increase clarity when parsing:

  • To give structure to the names of properties for complex nodes, such as the Mimic and Anonymize nodes.
  • To provide a format for specifying multiple properties at once.

The scripts for nodes with tables and other complex interfaces must follow a particular structure to parse correctly. These properties need a name that's more complex than the name for a single identifier. This name is called the key. Structured properties are also sometimes called keyed properties.

For example, with an Anonymize node, each available field on its upstream side is either anonymized or retained. To refer to this information, the Anonymize node stores one item of information per field (whether each field is True or False). To anonymize a field called Age, set the property enable_anonymize, with the key Age, to the value True:

# Anonymize node requires the input fields while setting the values
node.setKeyedPropertyValue("enable_anonymize", "Age", True)
node.setKeyedPropertyValue("transformation", "Age", "Random")
node.setKeyedPropertyValue("set_random_seed", "Age", True)
node.setKeyedPropertyValue("random_seed", "Age", 123)
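
When several fields need the same settings, you can wrap the keyed calls in a loop. The following is a minimal sketch rather than product code. The `anonymize_fields` helper, the `RecordingNode` stand-in, and the `Sex` field are hypothetical names introduced only so that the example runs outside a flow. In a real script, you would pass the Anonymize node itself. The property names and values are the ones used in the preceding example.

```python
class RecordingNode(object):
    """Hypothetical stand-in for a node; stores keyed values in a dict."""

    def __init__(self):
        self.keyed = {}

    def setKeyedPropertyValue(self, prop, key, value):
        # One value is stored per (property, key) pair.
        self.keyed[(prop, key)] = value

    def getKeyedPropertyValue(self, prop, key):
        return self.keyed[(prop, key)]


def anonymize_fields(node, fields, seed):
    """Apply the same anonymization settings to every field in `fields`."""
    for field in fields:
        node.setKeyedPropertyValue("enable_anonymize", field, True)
        node.setKeyedPropertyValue("transformation", field, "Random")
        node.setKeyedPropertyValue("set_random_seed", field, True)
        node.setKeyedPropertyValue("random_seed", field, seed)


# "Sex" is an assumed second field name, used for illustration only.
node = RecordingNode()
anonymize_fields(node, ["Age", "Sex"], 123)
print(node.getKeyedPropertyValue("transformation", "Sex"))
```

Because the field name is the key, each field keeps its own copy of every setting, which is why the same property name can be set repeatedly without overwriting earlier fields.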

Setting multiple properties

For many nodes, you can assign more than one node or flow structured property at a time, for example:

mimic.setPropertyValue("missing_value_imputation", "True")
mimic.setPropertyValue("missing_value_imputation_strategies",[["Age","Fixed","60"],["K", "mean"]])

This is referred to as a multiset command or set block.

Another advantage that structured properties have is their ability to set several properties on a node before the node is stable. By default, a multiset command sets all properties in the block before taking any action based on an individual property setting.
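
If you prefer to keep related settings together in your own script, one option is to hold them in a list and apply them in order. The following sketch assumes a hypothetical `apply_properties` helper and a hypothetical `CallLog` stand-in that only records calls, so that the example runs outside a flow. The helper groups the calls for readability. It does not change when the node acts on the settings.

```python
class CallLog(object):
    """Hypothetical stand-in for a node; records each call in order."""

    def __init__(self):
        self.calls = []

    def setPropertyValue(self, prop, value):
        self.calls.append((prop, value))


def apply_properties(obj, settings):
    """Set each (property, value) pair on `obj`, preserving the given order."""
    for prop, value in settings:
        obj.setPropertyValue(prop, value)


mimic = CallLog()
apply_properties(mimic, [
    ("missing_value_imputation", "True"),
    ("missing_value_imputation_strategies",
     [["Age", "Fixed", "60"], ["K", "mean"]]),
])
print(len(mimic.calls))
```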

Syntax for setting properties

You can set properties using the following syntax:

OBJECT.setPropertyValue(PROPERTY, VALUE)

You can retrieve the value of properties using the following syntax:

VARIABLE = OBJECT.getPropertyValue(PROPERTY)

For keyed properties, you can set properties using the following syntax:

OBJECT.setKeyedPropertyValue(PROPERTY, KEY, VALUE)

You can retrieve the value of keyed properties using the following syntax:

VARIABLE = OBJECT.getKeyedPropertyValue(PROPERTY, KEY)

In each of the code examples, replace OBJECT, PROPERTY, and KEY with the following values:

  • OBJECT is a node or output
  • PROPERTY is the name of the node property that your expression refers to
  • KEY is the key value for keyed properties
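
As an illustration of the get and set syntax working together, the following sketch copies selected property values from one object to another. The `copy_properties` helper and the `StubObject` stand-in are hypothetical and exist only so that the example runs outside a flow. In a real script, both arguments would be nodes from the flow. The `annotation` and `cache_enabled` names are taken from the common node properties.

```python
class StubObject(object):
    """Hypothetical stand-in for a node or output."""

    def __init__(self, **props):
        self.props = dict(props)

    def setPropertyValue(self, prop, value):
        self.props[prop] = value

    def getPropertyValue(self, prop):
        return self.props[prop]


def copy_properties(source, target, names):
    """Read each named property from `source` and set it on `target`."""
    for name in names:
        target.setPropertyValue(name, source.getPropertyValue(name))


source = StubObject(annotation="Reviewed", cache_enabled=True)
target = StubObject(annotation="", cache_enabled=False)
copy_properties(source, target, ["annotation", "cache_enabled"])
print(target.getPropertyValue("annotation"))
```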

Node properties

Each type of node has its own set of properties, and each property has a type. The type for a property might be a general type, such as number, flag, or string. Alternatively, the property reference might specify the range of legal values, such as Discard, PairAndDiscard, and IncludeAsText.

Common node properties

Some properties are common to all nodes, such as name, annotation, and ToolTip, while others are specific to certain types of nodes. The following properties are common to all nodes in Synthetic Data Generator.

Table 1. Common node properties
Property name | Data type | Property description
use_custom_name | flag |
name | string | Read-only property that reads the name (either auto or custom) for a node on the canvas.
custom_name | string | Specifies a custom name for the node.
tooltip | string |
annotation | string |
keywords | string | Structured slot that specifies a list of keywords associated with the object (for example, ["Keyword1" "Keyword2"]).
cache_enabled | flag |
node_type | all node names as specified for scripting | Read-only property used to refer to a node by type. For example, instead of referring to a node only by name, such as real_income, you can also specify the type, such as userinputnode or filternode.

Errors for node properties

For properties that are general types, settings for the properties are coerced to the correct type. An error occurs if they can't be coerced. For properties that have a specific range of legal values, such as Discard or PairAndDiscard, an error occurs if any other value is used.

Read and set flag properties by using the values true and false consistently to avoid confusion. Variations including Off, OFF, off, No, NO, no, n, N, f, F, false, False, FALSE, or 0 are also recognized when setting values. However, they might cause errors when reading property values in some cases. All other values are regarded as true.
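
One way to keep flag values consistent is to normalize them in your own script before you set a property. The following sketch mirrors the rule described above: the listed spellings count as false and every other value counts as true. The `to_flag` helper and the `FALSE_SPELLINGS` name are hypothetical. This illustrates the documented rule and is not the product's own implementation.

```python
# Spellings that the text above lists as recognized false values.
FALSE_SPELLINGS = set([
    "Off", "OFF", "off", "No", "NO", "no", "n", "N",
    "f", "F", "false", "False", "FALSE", "0",
])


def to_flag(value):
    """Return a Python boolean for a flag-like value."""
    if isinstance(value, bool):
        return value
    # Compare the text form so that the integer 0 matches "0".
    return str(value) not in FALSE_SPELLINGS


print(to_flag("off"))
print(to_flag("Yes"))
```

Passing the result of `to_flag` to a set call means that the script always writes a plain boolean, whatever spelling the input used.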

Flow properties

Flow properties are high-level flow operations that affect an entire flow rather than individual nodes. To reference flow properties, you must set the execution method to use scripts:

stream = sdg.script.stream()
stream.setPropertyValue("execute_method", "Script")

The following flow properties are available for use in scripting.

Table 2. Flow properties for scripting
Property name | Data type | Property description
execute_method | Normal, Script |
date_format | "DDMMYY", "MMDDYY", "YYMMDD", "YYYYMMDD", "YYYYDDD", DAY, MONTH, "DD-MM-YY", "DD-MM-YYYY", "MM-DD-YY", "MM-DD-YYYY", "DD-MON-YY", "DD-MON-YYYY", "YYYY-MM-DD", "DD.MM.YY", "DD.MM.YYYY", "MM.DD.YYYY", "DD.MON.YY", "DD.MON.YYYY", "DD/MM/YY", "DD/MM/YYYY", "MM/DD/YY", "MM/DD/YYYY", "DD/MON/YY", "DD/MON/YYYY", MON YYYY, q, WK YYYY |
date_baseline | number |
date_2digit_baseline | number |
time_format | "HHMMSS", "HHMM", "MMSS", "HH:MM:SS", "HH:MM", "MM:SS", "(H)H:(M)M:(S)S", "(H)H:(M)M", "(M)M:(S)S", "HH.MM.SS", "HH.MM", "MM.SS", "(H)H.(M)M.(S)S", "(H)H.(M)M", "(M)M.(S)S" |
time_rollover | boolean |
import_datetime_as_string | boolean | Use to refresh Import nodes automatically upon flow execution.
decimal_places | number |
decimal_symbol | Default, Period, Comma |
angles_in_radians | boolean |
use_max_set_size | boolean |
max_set_size | number |
ruleset_evaluation | Voting, FirstHit |
refresh_source_nodes | boolean | Use to refresh Import nodes automatically upon flow execution.
script | string |
annotation | string |
name | string | This property is read-only. If you want to change the name of a flow, you should save it with a different name.
nodes | | See information that follows.
encoding | SystemDefault, "UTF-8" |
stream_rewriting | boolean |
stream_rewriting_maximise_sql | boolean |
stream_rewriting_optimise_clem_execution | boolean |
stream_rewriting_optimise_syntax_execution | boolean |
enable_parallelism | boolean |
sql_generation | boolean |
database_caching | boolean |
sql_logging | boolean |
sql_generation_logging | boolean |
sql_log_native | boolean |
sql_log_prettyprint | boolean |
record_count_suppress_input | boolean |
record_count_feedback_interval | integer |
use_stream_auto_create_node_settings | boolean | If true, then flow-specific settings are used, otherwise user preferences are used.
create_source_node_from_builders | boolean | If true, when a source builder creates a new source output, and it has no active update links, a new Import node is added.
create_source_node_update_links | createEnabled, createDisabled, doNotCreate | Defines the type of link created when an Import node is added automatically.
has_coordinate_system | boolean | If true, applies a coordinate system to the entire flow.
coordinate_system | string | The name of the selected projected coordinate system.
deployment_area | Scoring, None | Choose how you want to deploy the flow. If this value is set to None, no other deployment entries are used.
scoring_terminal_node_id | string | Choose the scoring branch in the flow. It can be any terminal node in the flow.
scoring_node_id | string | Choose the nugget in the scoring branch.

Example script with flow properties

You can use flow properties to create a list of all nodes in the flow and write that list to the flow annotation. The nodes property refers to the nodes in the current flow. The following flow script provides an example:

stream = sdg.script.stream()
annotation = stream.getPropertyValue("annotation")

annotation = annotation + "\n\nThis flow is called \"" + stream.getLabel() + "\" and contains the following nodes:\n"

for node in stream.iterator():
    annotation = annotation + "\n" + node.getTypeName() + " node called \"" + node.getLabel() + "\""

stream.setPropertyValue("annotation", annotation)

The list that is written in the flow annotation looks like this:

This flow is called "druglearn" and contains the following nodes:

import node called "DRUG1n"
anonymize node called "Hide data"
mimic node called "DRUG1n mimic"
generate node called "Synthetic DRUG1n data"
evaluate node called "DRUG1n quality"
export node called "Save to csv"
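
To check the text layout without running a flow, you can separate the string building from the flow calls. The following sketch assumes a hypothetical `build_annotation` helper that takes the existing annotation, the flow label, and a list of (type name, label) pairs. In the flow script, those values would come from `stream.getLabel()`, `node.getTypeName()`, and `node.getLabel()`. The sample pairs are taken from the list above.

```python
def build_annotation(existing, flow_label, nodes):
    """Append a flow summary and one line per node to `existing`."""
    text = (existing + "\n\nThis flow is called \"" + flow_label
            + "\" and contains the following nodes:\n")
    for type_name, label in nodes:
        text = text + "\n" + type_name + " node called \"" + label + "\""
    return text


sample_nodes = [("import", "DRUG1n"), ("anonymize", "Hide data")]
result = build_annotation("", "druglearn", sample_nodes)
print(result)
```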