dataauditnode properties

The Data Audit node provides a comprehensive first look at the data, including summary statistics, histograms, and distributions for each field, as well as information on outliers, missing values, and extremes. Results are displayed in an easy-to-read matrix that can be sorted and used to generate full-size graphs and data preparation nodes.

Example

# Get the current stream and look up the existing source node by its ID
stream = modeler.script.stream()
sourcenode = stream.findByID("id46WRP1285C")
# Create the Data Audit node and connect it to the source node
node = stream.createAt("dataaudit", "My node", 196, 100)
stream.link(sourcenode, node)
# Audit only the listed fields
node.setPropertyValue("custom_fields", True)
node.setPropertyValue("fields", ["Age", "Na", "K"])
# Graphs and statistics to include in the output matrix
node.setPropertyValue("display_graphs", True)
node.setPropertyValue("basic_stats", True)
node.setPropertyValue("advanced_stats", True)
node.setPropertyValue("median_stats", False)
# Missing-value calculations
node.setPropertyValue("calculate", ["Count", "Breakdown"])
# Outlier and extreme-value detection using the std method
node.setPropertyValue("outlier_detection_method", "std")
node.setPropertyValue("outlier_detection_std_outlier", 1.0)
node.setPropertyValue("outlier_detection_std_extreme", 3.0)
# Send the results to the screen
node.setPropertyValue("output_mode", "Screen")
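The example above selects the std detection method with thresholds of 1.0 and 3.0. The following plain-Python sketch illustrates one way such thresholds could be read, assuming they are counted in standard deviations from the field mean. The function name classify_by_std is hypothetical, and the use of the sample standard deviation is an assumption; this is not the node's actual implementation.

```python
import math


def classify_by_std(values, outlier=1.0, extreme=3.0):
    """Label each value 'normal', 'outlier', or 'extreme'.

    Assumption: thresholds are multiples of the standard deviation,
    measured as distance from the mean of the values.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / float(n)
    # Sample standard deviation (n - 1 divisor); chosen here as an assumption.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / float(n - 1))
    labels = []
    for v in values:
        # With zero spread every value sits on the mean, so nothing is flagged.
        distance = abs(v - mean) / sd if sd > 0 else 0.0
        if distance > extreme:
            labels.append("extreme")
        elif distance > outlier:
            labels.append("outlier")
        else:
            labels.append("normal")
    return labels


# Thresholds taken from the scripting example above.
labels = classify_by_std([10] * 9 + [100], outlier=1.0, extreme=3.0)
```

Lowering the outlier threshold, as the example does with 1.0, flags more values; raising the extreme threshold reserves that label for values far from the mean.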

Table 1. dataauditnode properties
dataauditnode properties       Data type                                                     Property description
custom_fields                  flag
fields                         [field1 … fieldN]
overlay                        field
display_graphs                 flag                                                          Used to turn the display of graphs in the output matrix on or off.
basic_stats                    flag
advanced_stats                 flag
median_stats                   flag
calculate                      Count Breakdown                                               Used to calculate missing values. Select either, both, or neither calculation method.
outlier_detection_method       std iqr                                                       Used to specify the detection method for outliers and extreme values.
outlier_detection_std_outlier  number                                                        If outlier_detection_method is std, specifies the number to use to define outliers.
outlier_detection_std_extreme  number                                                        If outlier_detection_method is std, specifies the number to use to define extreme values.
outlier_detection_iqr_outlier  number                                                        If outlier_detection_method is iqr, specifies the number to use to define outliers.
outlier_detection_iqr_extreme  number                                                        If outlier_detection_method is iqr, specifies the number to use to define extreme values.
use_output_name                flag                                                          Specifies whether a custom output name is used.
output_name                    string                                                        If use_output_name is true, specifies the name to use.
output_mode                    Screen File                                                   Used to specify target location for output generated from the output node.
output_format                  Formatted (.tab) Delimited (.csv) HTML (.html) Output (.cou)  Used to specify the type of output.
paginate_output                flag                                                          When the output_format is HTML, causes the output to be separated into pages.
lines_per_page                 number                                                        When used with paginate_output, specifies the lines per page of output.
full_filename                  string
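
The output and iqr properties listed in the table can be grouped into reusable settings. The sketch below is a minimal illustration, not a definitive recipe: the helper names (html_file_settings, iqr_settings, apply_settings) are hypothetical, the value "HTML" for output_format and the reading of full_filename as the output file path are assumptions, and the thresholds 1.5 and 3.0 are illustrative values only. The only scripting call used is setPropertyValue, as in the example above.

```python
def html_file_settings(path, lines_per_page=50):
    """Property/value pairs for paginated HTML file output."""
    return [
        ("output_mode", "File"),
        ("output_format", "HTML"),      # assumed scripting value for "HTML (.html)"
        ("paginate_output", True),
        ("lines_per_page", lines_per_page),
        ("full_filename", path),        # assumed to hold the output file path
    ]


def iqr_settings(outlier=1.5, extreme=3.0):
    """Property/value pairs that switch outlier detection to the iqr method."""
    return [
        ("outlier_detection_method", "iqr"),
        ("outlier_detection_iqr_outlier", outlier),
        ("outlier_detection_iqr_extreme", extreme),
    ]


def apply_settings(node, settings):
    """Apply each (name, value) pair to any object exposing setPropertyValue."""
    for name, value in settings:
        node.setPropertyValue(name, value)
```

Inside a script, these helpers would be called with the Data Audit node created as in the example, for instance apply_settings(node, iqr_settings() + html_file_settings("C:/output/audit.html")), where the path is a placeholder. Keeping the settings as ordered pairs makes it straightforward to apply the same configuration to several nodes.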