SQL Optimization
For best performance, maximize the amount of SQL that is generated for a stream, so that as much processing as possible exploits the performance and scalability of the database. Only the parts of the stream that cannot be compiled to SQL should be executed within IBM® SPSS® Modeler Server. For more information, see SQL optimization.
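The general idea of pushing work into the database can be illustrated with a small sketch. It is an analogy only and does not show how IBM SPSS Modeler generates SQL: it uses Python's built-in sqlite3 module, and the sales table, its columns, and the function names are all hypothetical. The first function lets the database aggregate and returns one row per group; the second pulls every row out and aggregates in the client, which is the kind of work that would otherwise run outside the database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10.0), ("south", 5.0), ("north", 2.5), ("south", 1.0)],
    )
    conn.commit()
    return conn


def totals_in_database(conn):
    """Aggregate inside the database: only one row per region is transferred."""
    cursor = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    return dict(cursor.fetchall())


def totals_in_client(conn):
    """Aggregate in the client: every row is transferred before any work is done."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


conn = build_demo_db()
in_db = totals_in_database(conn)
in_client = totals_in_client(conn)
print(in_db == in_client)
```

Both functions return the same totals. The difference is where the work happens and how many rows cross the connection, which is what matters once tables are large.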
Uploading File-Based Data
Data that is not stored in a database cannot benefit from SQL optimization. If the data you want to analyze is not already in a database, you can upload it using a Database Output node. You can also use this node to store intermediate data sets from data preparation and the results of deployment.
IBM SPSS Modeler can interface with the external loaders for many common database systems. Several scripts are included with the software and are available (with documentation) in the /scripts subdirectory under your IBM SPSS Modeler installation folder.
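To give a feel for why loading in bulk tends to be faster than inserting records one at a time, here is a minimal sketch using Python's built-in sqlite3 module. It is not the mechanism used by IBM SPSS Modeler or by any external loader. The records table and the function names are hypothetical, and the row-by-row version commits after every insert purely to exaggerate the per-record overhead.

```python
import sqlite3


def make_table():
    """Create an in-memory database with an empty, hypothetical records table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER, value TEXT)")
    return conn


def load_row_by_row(conn, rows):
    """Insert and commit each record separately (one transaction per record)."""
    for row in rows:
        conn.execute("INSERT INTO records VALUES (?, ?)", row)
        conn.commit()


def load_in_bulk(conn, rows):
    """Insert all records in a single batch inside a single transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany("INSERT INTO records VALUES (?, ?)", rows)


def count_rows(conn):
    """Return the number of records currently in the table."""
    return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]


rows = [(i, f"value-{i}") for i in range(1000)]

slow_conn = make_table()
load_row_by_row(slow_conn, rows)

fast_conn = make_table()
load_in_bulk(fast_conn, rows)

print(count_rows(slow_conn), count_rows(fast_conn))
```

Both approaches store the same data. The batch version simply pays the per-statement and per-transaction costs far fewer times.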
The following table shows the potential performance benefit of bulk-loading.
The figures show the elapsed time to export 250,000 records and 21 fields to an Oracle database. The external loader was Oracle’s sqlldr utility.
Export option | Time (in seconds) |
---|---|
Default (ODBC) | 409 |
Bulk-load via ODBC | 52 |
Bulk-load via external loader | 33 |
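
The relative benefit can be worked out directly from these figures. The short sketch below uses only the times and the record count quoted above; the variable names are illustrative. It computes the speedup of each option over the default ODBC export and the approximate throughput in records per second.

```python
RECORDS = 250_000  # number of records exported in the benchmark above

# Elapsed times in seconds, copied from the table above.
timings = {
    "Default (ODBC)": 409,
    "Bulk-load via ODBC": 52,
    "Bulk-load via external loader": 33,
}

baseline = timings["Default (ODBC)"]

# Speedup relative to the default ODBC export.
speedup = {option: baseline / seconds for option, seconds in timings.items()}

# Approximate throughput in records per second.
throughput = {option: RECORDS / seconds for option, seconds in timings.items()}

for option in timings:
    print(f"{option}: {speedup[option]:.1f}x, {throughput[option]:.0f} records/s")
```

On these figures, bulk-loading via ODBC is roughly 8 times faster than the default export, and bulk-loading via the external loader is roughly 12 times faster. These ratios apply to this one benchmark and may differ for other databases, networks, and data shapes.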