Properties for nodes and flows

You can use a script to modify the properties of nodes and flows.

Node properties are the specific settings for a node. These are often the same properties that you can access through the user interface, for example, when you double-click a node to open its properties. Flow properties refer to high-level flow operations, such as caching.

Structured properties

Scripts use structured properties in the following ways to increase clarity when parsing:

To give structure to the names of properties for complex nodes, such as the Mimic and Anonymize nodes.
To provide a format for specifying multiple properties at once.

The scripts for nodes with tables and other complex interfaces must follow a particular structure to parse correctly. These properties need a name that's more complex than the name for a single identifier. This name is called the key. Structured properties are also sometimes called keyed properties.

For example, with an Anonymize node, each available field on its upstream side is either anonymized or retained. To refer to this information, the Anonymize node stores one item of information per field (whether each field is True or False). To anonymize a field called Age, set the property enable_anonymize, with the key Age, to the value True:

# Anonymize node requires the input fields while setting the values
node.setKeyedPropertyValue("enable_anonymize", "Age", True)
node.setKeyedPropertyValue("transformation", "Age", "Random")
node.setKeyedPropertyValue("set_random_seed", "Age", True)
node.setKeyedPropertyValue("random_seed", "Age", 123)

Setting multiple properties

For many nodes, you can assign more than one node or flow structured property at a time, for example:

mimic.setPropertyValue("missing_value_imputation", "True")
mimic.setPropertyValue("missing_value_imputation_strategies",[["Age","Fixed","60"],["K", "mean"]])

This is referred to as a multiset command or set block.

Another advantage that structured properties have is their ability to set several properties on a node before the node is stable. By default, a multiset command sets all properties in the block before taking any action based on an individual property setting.

Syntax for setting properties

You can set properties using the following syntax:

OBJECT.setPropertyValue(PROPERTY, VALUE)

You can retrieve the value of properties using the following syntax:

VARIABLE = OBJECT.getPropertyValue(PROPERTY)

For keyed properties, you can set properties using the following syntax:

OBJECT.setKeyedPropertyValue(PROPERTY, KEY, VALUE)

And you can retrieve the value of properties using the following syntax:

VARIABLE = OBJECT.getKeyedPropertyValue(PROPERTY, KEY)

In each of the code examples, replace OBJECT, PROPERTY, and KEY with the following values:

OBJECT is a node or output
PROPERTY is the name of the node property that your expression refers to
KEY is the key value for keyed properties
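
The four call shapes above can be tried outside the product with a small stand-in object. The following is a minimal sketch: StubNode is a hypothetical class, not part of the scripting API, and it only stores values in dictionaries so that the calls run. In a real script, OBJECT would be a node or output taken from the flow, and the property names would have to be valid for that node type.

```python
class StubNode:
    """Hypothetical stand-in for a node; it only stores values in dictionaries."""

    def __init__(self):
        self._props = {}
        self._keyed = {}

    def setPropertyValue(self, prop, value):
        self._props[prop] = value

    def getPropertyValue(self, prop):
        return self._props[prop]

    def setKeyedPropertyValue(self, prop, key, value):
        self._keyed[(prop, key)] = value

    def getKeyedPropertyValue(self, prop, key):
        return self._keyed[(prop, key)]


node = StubNode()

# Simple property: OBJECT.setPropertyValue(PROPERTY, VALUE)
node.setPropertyValue("custom_name", "Hide data")
name = node.getPropertyValue("custom_name")

# Keyed property: OBJECT.setKeyedPropertyValue(PROPERTY, KEY, VALUE)
node.setKeyedPropertyValue("enable_anonymize", "Age", True)
age_enabled = node.getKeyedPropertyValue("enable_anonymize", "Age")
```

The stand-in does no type coercion or validation, so it shows only the shape of the calls, not the behavior of a real node.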

Node properties

Each type of node has its own set of properties, and each property has a type. The type for a property might be a general type, such as number, flag, or string. Alternatively, the property reference might specify the range of legal values, such as Discard, PairAndDiscard, and IncludeAsText.

Common node properties

Some properties are common to all nodes, such as name, annotation, and ToolTip, while others are specific to certain types of nodes. The following properties are common to all nodes in Synthetic Data Generator.

Table 1. Common node properties
Property name	Data type	Property description
`use_custom_name`	flag
`name`	string	Read-only property that reads the name (either auto or custom) for a node on the canvas.
`custom_name`	string	Specifies a custom name for the node.
`tooltip`	string
`annotation`	string
`keywords`	string	Structured slot that specifies a list of keywords associated with the object (for example, `["Keyword1" "Keyword2"]`).
`cache_enabled`	flag
`node_type`	all node names as specified for scripting	Read-only property used to refer to a node by type. For example, instead of referring to a node only by name, such as `real_income`, you can also specify the type, such as `userinputnode` or `filternode`.

Errors for node properties

For properties that are general types, settings for the properties are coerced to the correct type. An error occurs if they can't be coerced. For properties that have a specific range of legal values, such as Discard or PairAndDiscard, an error occurs if any other value is used.

Flag properties should be read or set by using the values true and false consistently to avoid any confusion. Variations including Off, OFF, off, No, NO, no, n, N, f, F, false, False, FALSE, or 0 are also recognized when setting values. However, they might cause errors when reading property values in some cases. All other values are regarded as true.
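
One way to keep flag handling consistent is to convert any incoming value to a real boolean before passing it to setPropertyValue. The following is a minimal sketch of that idea. The helper to_flag and the constant FALSE_SPELLINGS are hypothetical names, not part of the scripting API, and the helper mirrors only the list of spellings given in the text above.

```python
# Spellings that the text above lists as recognized "false" values when setting a flag.
FALSE_SPELLINGS = frozenset([
    "Off", "OFF", "off", "No", "NO", "no", "n", "N",
    "f", "F", "false", "False", "FALSE", "0",
])


def to_flag(value):
    """Return a real boolean for a flag value, following the rule described above."""
    if isinstance(value, bool):
        return value
    # Anything outside the listed false spellings is regarded as true.
    return str(value) not in FALSE_SPELLINGS
```

A script could then call, for example, node.setPropertyValue("cache_enabled", to_flag(user_value)), where user_value is whatever the script received from its caller.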

Flow properties

Flow properties are high-level flow operations that affect an entire flow rather than individual nodes. To reference flow properties, you must set the execution method to use scripts:

stream = sdg.script.stream()
stream.setPropertyValue("execute_method", "Script")

The following flow properties are available for use in scripting.

Table 2. Flow properties for scripting
Property name	Data type	Property description
`execute_method`	Normal, Script
`date_format`	"DDMMYY", "MMDDYY", "YYMMDD", "YYYYMMDD", "YYYYDDD", DAY, MONTH, "DD-MM-YY", "DD-MM-YYYY", "MM-DD-YY", "MM-DD-YYYY", "DD-MON-YY", "DD-MON-YYYY", "YYYY-MM-DD", "DD.MM.YY", "DD.MM.YYYY", "MM.DD.YYYY", "DD.MON.YY", "DD.MON.YYYY", "DD/MM/YY", "DD/MM/YYYY", "MM/DD/YY", "MM/DD/YYYY", "DD/MON/YY", "DD/MON/YYYY", MON YYYY, q, WK YYYY
`date_baseline`	number
`date_2digit_baseline`	number
`time_format`	"HHMMSS", "HHMM", "MMSS", "HH:MM:SS", "HH:MM", "MM:SS", "(H)H:(M)M:(S)S", "(H)H:(M)M", "(M)M:(S)S", "HH.MM.SS", "HH.MM", "MM.SS", "(H)H.(M)M.(S)S", "(H)H.(M)M", "(M)M.(S)S"
`time_rollover`	boolean
`import_datetime_as_string`	boolean
`decimal_places`	number
`decimal_symbol`	Default, Period, Comma
`angles_in_radians`	boolean
`use_max_set_size`	boolean
`max_set_size`	number
`ruleset_evaluation`	Voting, FirstHit
`refresh_source_nodes`	boolean	Use to refresh Import nodes automatically upon flow execution.
`script`	string
`annotation`	string
`name`	string	This property is read-only. If you want to change the name of a flow, you should save it with a different name.
`nodes`		See information that follows.
`encoding`	SystemDefault, "UTF-8"
`stream_rewriting`	boolean
`stream_rewriting_maximise_sql`	boolean
`stream_rewriting_optimise_clem_execution`	boolean
`stream_rewriting_optimise_syntax_execution`	boolean
`enable_parallelism`	boolean
`sql_generation`	boolean
`database_caching`	boolean
`sql_logging`	boolean
`sql_generation_logging`	boolean
`sql_log_native`	boolean
`sql_log_prettyprint`	boolean
`record_count_suppress_input`	boolean
`record_count_feedback_interval`	integer
`use_stream_auto_create_node_settings`	boolean	If true, then flow-specific settings are used, otherwise user preferences are used.
`create_source_node_from_builders`	boolean	If true, when a source builder creates a new source output, and it has no active update links, a new Import node is added.
`create_source_node_update_links`	createEnabled, createDisabled, doNotCreate	Defines the type of link created when an Import node is added automatically.
`has_coordinate_system`	boolean	If true, applies a coordinate system to the entire flow.
`coordinate_system`	string	The name of the selected projected coordinate system.
`deployment_area`	Scoring, None	Choose how you want to deploy the flow. If this value is set to None, no other deployment entries are used.
`scoring_terminal_node_id`	string	Choose the scoring branch in the flow. It can be any terminal node in the flow.
`scoring_node_id`	string	Choose the nugget in the scoring branch.
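
Because several flow properties accept only a fixed set of values, a script can check its settings against Table 2 before applying them. The following is a minimal sketch under that assumption. ALLOWED_VALUES and find_invalid_settings are hypothetical names introduced for illustration, and only four of the enumerated properties from the table are covered.

```python
# Allowed values copied from Table 2 for a few enumerated flow properties.
ALLOWED_VALUES = {
    "execute_method": ("Normal", "Script"),
    "decimal_symbol": ("Default", "Period", "Comma"),
    "ruleset_evaluation": ("Voting", "FirstHit"),
    "encoding": ("SystemDefault", "UTF-8"),
}


def find_invalid_settings(settings):
    """Return the sorted names of settings whose value is not listed in ALLOWED_VALUES."""
    invalid = []
    for name, value in settings.items():
        allowed = ALLOWED_VALUES.get(name)
        # Properties that are not in ALLOWED_VALUES are not checked here.
        if allowed is not None and value not in allowed:
            invalid.append(name)
    return sorted(invalid)


settings = {"execute_method": "Script", "decimal_symbol": "Comma", "decimal_places": 3}
problems = find_invalid_settings(settings)

# Inside a flow script, the checked settings could then be applied like this:
# stream = sdg.script.stream()
# for name, value in settings.items():
#     stream.setPropertyValue(name, value)
```

Checking up front keeps all invalid names in one report, instead of stopping at the first error that a setPropertyValue call raises.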

Example script with flow properties

You can use flow properties to create a list of all nodes in the flow and write that list in the flow annotation. The nodes flow property refers to the nodes in the current flow. The following flow script provides an example:

stream = sdg.script.stream()
annotation = stream.getPropertyValue("annotation")

# Append a heading that names the flow
annotation = annotation + "\n\nThis flow is called \"" + stream.getLabel() + "\" and contains the following nodes:\n"

# Append one line for each node in the flow
for node in stream.iterator():
    annotation = annotation + "\n" + node.getTypeName() + " node called \"" + node.getLabel() + "\""

stream.setPropertyValue("annotation", annotation)

The list that is written in the flow annotation looks like this:

This flow is called "druglearn" and contains the following nodes:

import node called "DRUG1n"
anonymize node called "Hide data"
mimic node called "DRUG1n mimic"
generate node called "Synthetic DRUG1n data"
evaluate node called "DRUG1n quality"
export node called "Save to csv"
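
The text-building part of the script can also be separated from the flow calls so that it can be checked on its own. The following is a minimal sketch. build_annotation is a hypothetical helper, and the (type name, label) pairs stand in for the values that node.getTypeName() and node.getLabel() return inside a flow script.

```python
def build_annotation(existing, flow_label, nodes):
    """Append a node listing to an existing annotation string.

    nodes is a list of (type_name, label) pairs.
    """
    lines = [
        existing,
        "",
        'This flow is called "%s" and contains the following nodes:' % flow_label,
        "",
    ]
    for type_name, label in nodes:
        lines.append('%s node called "%s"' % (type_name, label))
    return "\n".join(lines)


text = build_annotation(
    "Notes",
    "druglearn",
    [("import", "DRUG1n"), ("anonymize", "Hide data")],
)
```

In a flow script, the result would be passed to stream.setPropertyValue("annotation", text), with the pairs collected from stream.iterator().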