mergenode properties
The Merge node takes multiple input records and creates a single output record containing some or all of the input fields. It's useful for merging data from different sources, such as internal customer data and purchased demographic data.
Example
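stream = modeler.script.stream()  # the current stream
node = stream.create("merge", "My node")
# assume customerdata and salesdata are configured database import nodes
stream.link(customerdata, node)
stream.link(salesdata, node)
# merge records that have the same value in the key field "id"
node.setPropertyValue("method", "Keys")
node.setPropertyValue("key_fields", ["id"])
node.setPropertyValue("common_keys", True)
# partial outer join; each key is an input tag name as displayed in the node properties
node.setPropertyValue("join", "PartialOuter")
node.setKeyedPropertyValue("outer_join_tag", "2", True)
node.setKeyedPropertyValue("outer_join_tag", "4", True)
# optimize for one input (tag "2") being relatively large compared to the others
node.setPropertyValue("single_large_input", True)
node.setPropertyValue("single_large_input_tag", "2")
# the inputs are already sorted by "id" in ascending order
node.setPropertyValue("use_existing_sort_keys", True)
node.setPropertyValue("existing_sort_keys", [["id", "Ascending"]])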
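The calls above can also be driven from a dictionary of settings. The following is a minimal sketch, not part of the product API: apply_merge_settings and _StandInNode are hypothetical names introduced here, and the stand-in class only records calls so that the sketch can be tried outside Modeler. It relies only on the setPropertyValue and setKeyedPropertyValue calls shown in the example.

```python
def apply_merge_settings(node, settings, outer_join_tags=()):
    """Apply plain properties, then flag each listed tag as an outer-join input."""
    for name, value in settings.items():
        node.setPropertyValue(name, value)
    # outer_join_tag is a keyed property: one flag per input tag
    for tag in outer_join_tags:
        node.setKeyedPropertyValue("outer_join_tag", tag, True)
    return node


class _StandInNode(object):
    """Hypothetical stand-in that records calls; this is not a Modeler class."""

    def __init__(self):
        self.properties = {}
        self.keyed = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value

    def setKeyedPropertyValue(self, name, key, value):
        self.keyed.setdefault(name, {})[key] = value


demo = apply_merge_settings(
    _StandInNode(),
    {"method": "Keys", "key_fields": ["id"], "join": "PartialOuter"},
    outer_join_tags=["2", "4"],
)
```

Inside a Modeler script, the node returned by stream.create("merge", ...) would be passed in place of the stand-in. Keeping the tags in a separate argument reflects that outer_join_tag takes one flag per tag, whereas the other properties take a single value.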
mergenode properties | Data type | Property description |
---|---|---|
method | Order, Keys, Condition, Rankedcondition | Specifies how records are merged. Order merges records in the order they are listed in the data files. Keys merges records that have the same value in one or more key fields. Condition merges records when a specified condition is satisfied. Rankedcondition merges each row pairing in the primary and all secondary data sets, using the ranking expression to sort multiple matches from low to high. |
condition | string | If method is set to Condition, specifies the condition for including or discarding records. |
key_fields | list | |
common_keys | flag | |
join | Inner, FullOuter, PartialOuter, Anti | |
outer_join_tag.n | flag | In this property, n is the tag name as displayed in the node properties. Multiple tag names may be specified, because any number of datasets could contribute incomplete records. |
single_large_input | flag | Specifies whether to use the optimization for one input that is relatively large compared to the other inputs. |
single_large_input_tag | string | Specifies the tag name as displayed in the node properties. The usage of this property differs slightly from the outer_join_tag property (a string rather than a flag) because only one input dataset can be specified. |
use_existing_sort_keys | flag | Specifies whether the inputs are already sorted by one or more key fields. |
existing_sort_keys | [['string', 'Ascending']] or [['string', 'Descending']] | Specifies the fields that are already sorted and the direction in which they are sorted. |
primary_dataset | string | If method is Rankedcondition, selects the primary data set in the merge. This can be considered the left side of an outer join merge. |
rename_duplicate_fields | boolean | If method is Rankedcondition and this is set to Y, and the merged data set contains multiple fields with the same name from different data sources, the tags of the respective data sources are added at the start of the field column headers. |
merge_condition | string | |
ranking_expression | string | |
Num_matches | integer | The number of matches to be returned, based on the merge_condition and ranking_expression. Minimum 1, maximum 100. |
default_sort_order | Ascending, Descending | Specifies whether, by default, records are sorted in ascending or descending order of the sort key values. |
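
To make the Rankedcondition description more concrete, the following plain-Python sketch mimics the behavior as the table words it: each primary row is paired with the secondary rows that satisfy a condition, multiple matches are sorted from low to high by a ranking value, and at most a fixed number of matches is kept. It is an illustration under assumptions, not Modeler's implementation. The function ranked_merge, the dictionary rows, and the sample data are hypothetical; the condition and ranking arguments are Python functions standing in for merge_condition and ranking_expression; and keeping a primary row that has no match is assumed from the "left side of an outer join" wording.

```python
def ranked_merge(primary, secondary, condition, ranking, num_matches=1):
    """For each primary row, keep the lowest-ranked matching secondary rows."""
    if not 1 <= num_matches <= 100:
        raise ValueError("num_matches must be between 1 and 100")
    merged = []
    for p in primary:
        matches = [s for s in secondary if condition(p, s)]
        # sort multiple matches from low to high by the ranking value
        matches.sort(key=lambda s: ranking(p, s))
        if not matches:
            merged.append(dict(p))  # assumption: unmatched primary rows are kept
        for s in matches[:num_matches]:
            row = dict(s)
            row.update(p)  # assumption: on a name clash the primary value wins
            merged.append(row)
    return merged


customers = [{"id": 1, "age": 40}, {"id": 2, "age": 25}]
offers = [
    {"offer": "A", "min_age": 18},
    {"offer": "B", "min_age": 30},
    {"offer": "C", "min_age": 35},
]

result = ranked_merge(
    customers,
    offers,
    condition=lambda p, s: p["age"] >= s["min_age"],
    ranking=lambda p, s: p["age"] - s["min_age"],  # closest threshold ranks lowest
    num_matches=2,
)
```

With this sample data, customer 1 qualifies for all three offers and keeps the two closest ones (C, then B), while customer 2 qualifies only for offer A. The sketch does not model rename_duplicate_fields; it simply lets the primary value overwrite a clashing field.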