Nodes supporting SQL generation

The following tables show nodes representing data-mining operations that support SQL generation. With the exception of the database modeling nodes, if a node does not appear in these tables, it does not support SQL generation.

You can preview the SQL that is generated before executing it. See the topic Previewing Generated SQL for more information.

Table 1. Sources
Node supporting SQL generation	Notes
Database	This node is used to specify tables and views to be used for further analysis. This node enables entry of SQL queries. Avoid results sets with duplicate column names. See the topic Writing SQL Queries for more information.

Table 2. Record operations
Node supporting SQL generation	Notes
Select	Supports generation only if SQL generation for the select expression itself is supported (see expressions below). If any fields have nulls, SQL generation does not give the same results for discard as are given in native IBM® SPSS® Modeler.
Sample	Simple sampling supports SQL generation to varying degrees depending on the database. See Table 3.
Aggregate	SQL generation support for aggregation depends on the data storage type. See Table 4.
RFM Aggregate	Supports generation except if saving the date of the second or third most recent transactions, or if only including recent transactions. However, including recent transactions does work if the `datetime_date(YEAR,MONTH,DAY)` function is pushed back.
Sort
Merge	No SQL generated for merge by order. Merge by key with full or partial outer join is only supported if the database/driver supports it. Non-matching input fields can be renamed by means of a Filter node, or the Filter tab of a source node. Supports SQL generation for merge by condition. For all types of merge, `SQL_SP_EXISTS` is not supported if inputs originate in different databases.
Append	Supports generation if inputs are unsorted. Note: SQL optimization is only possible when your inputs have the same number of columns.
Distinct	A Distinct node with the (default) mode Create a composite record for each group selected does not support SQL optimization.

Table 3. SQL generation support in the Sample node for simple sampling
Mode	Sample	Max size	Seed	Db2 for z/OS	Db2 for OS/400	Db2 for Win/UNIX	Netezza	Oracle	SQL Server	Teradata
Include	First	n/a		Y	Y	Y	Y	Y	Y	Y
	1-in-n	off		Y	Y	Y	Y	Y		Y
		max		Y	Y	Y	Y	Y		Y
	Random %	off	off	Y		Y	Y	Y		Y
			on	Y		Y		Y
		max	off	Y		Y	Y	Y		Y
			on	Y		Y		Y
Discard	First	off					Y	Y
		max					Y	Y
	1-in-n	off		Y	Y	Y	Y	Y		Y
		max		Y	Y	Y	Y	Y		Y
	Random %	off	off	Y		Y	Y	Y		Y
			on	Y		Y		Y
		max	off	Y		Y	Y	Y		Y
			on	Y		Y		Y

Table 4. SQL generation support in the Aggregate node
Storage	Sum	Mean	Min	Max	SDev	Median	Count	Variance	Percentile
Integer	Y	Y	Y	Y	Y	Y*	Y	Y	Y*
Real	Y	Y	Y	Y	Y	Y*	Y	Y	Y*
Date			Y	Y		Y*	Y		Y*
Time			Y	Y		Y*	Y		Y*
Timestamp			Y	Y		Y*	Y		Y*
String			Y	Y		Y*	Y		Y*

* Median and Percentile are supported on Oracle.

Table 5. Field operations
Node supporting SQL generation	Notes
Type	Supports SQL generation if the Type node is instantiated and no `ABORT` or `WARN` type checking is specified.
Filter
Derive	Supports SQL generation if SQL generated for the derive expression is supported (see expressions below).
Ensemble	Supports SQL generation for Continuous targets. For other targets, supports generation only if the "Highest confidence wins" ensemble method is used.
Filler	Supports SQL generation if the SQL generated for the derive expression is supported (see expressions below).
Anonymize	Supports SQL generation for Continuous targets, and partial SQL generation for Nominal and Flag targets.
Reclassify
Binning	Supports SQL generation if the "Tiles (equal count)" binning method is used and the "Read from Bin Values tab if available" option is selected. Note: Due to differences in the way that bin boundaries are calculated (this is caused by the nature of the distribution of data in bin fields), you might see differences in the binning output when comparing normal stream execution results and SQL pushback results. To avoid this, use the Record count tiling method, and either Add to next or Keep in current tiles to obtain the closest match between the two methods of stream execution.
RFM Analysis	Supports SQL generation if the "Read from Bin Values tab if available" option is selected, but downstream nodes will not support it.
Partition	Supports SQL generation to assign records to partitions.
SetToFlag
Restructure

Table 6. Graphs
Node supporting SQL generation	Notes
Graphboard	SQL generation is supported for the following graph types: Area, 3-D Area, Bar, 3-D Bar, Bar of Counts, Heat map, Pie, 3-D Pie, Pie of Counts. For Histograms, SQL generation is supported for categorical data only. SQL generation is not supported for Animation in Graphboard.
Distribution
Web
Evaluation

For some models, SQL for the model nugget can be generated, pushing back the model scoring stage to the database. The main use of this feature is not to improve performance, but to allow streams containing these nuggets to have their full SQL pushed back. See the topic Viewing SQL for Model Nuggets for more information.

Table 7. Model nuggets
Model nugget supporting SQL generation	Notes
C&R Tree	Supports SQL generation for the single tree option, but not for the boosting, bagging or large dataset options.
QUEST
CHAID
C5.0
Decision List
Linear	Supports SQL generation for the standard model option, but not for the boosting, bagging or large dataset options.
Neural Net	Supports SQL generation for the standard model option (Multilayer Perceptron only), but not for the boosting, bagging or large dataset options.
PCA/Factor
Logistic	Supports SQL generation for Multinomial procedure but not Binomial. For Multinomial, generation is not supported when confidences are selected, unless the target type is Flag.
Generated Rulesets
Auto Classifier	If a User Defined Function (UDF) scoring adapter is enabled, these nuggets support SQL pushback. In addition, If either SQL generation for Continuous targets, or the "Highest confidence wins" ensemble method are used, these nuggets support further pushback downstream.
Auto Numeric

Table 8. Output
Node supporting SQL generation	Notes
Table	Supports generation if SQL generation is supported for highlight expression (see expressions below).
Matrix	Supports generation except if "All numerics" is selected for the Fields option.
Analysis	Supports generation, depending on the options selected.
Transform
Graphs	SQL Pushback is not supported if animation is enabled in a Graphboard node.
Statistics	Supports generation if the Correlate option is not used.
Report
Set Globals

Table 9. Export
Node supporting SQL generation	Notes
Database
Publisher	The published stream will contain generated SQL.