Supported data sources for SPSS Modeler

In SPSS Modeler, you can connect to your data no matter where it lives.

Connectors

The following table lists the data sources that you can connect to from SPSS Modeler.

For more information about SQL pushback (such as lists of nodes, CLEM expressions, and operators that support SQL pushback), see SQL optimization.

For a list of databases that support custom SQL queries to pull in data, see Data Asset node.

Connector Read Only Read & Write SQL Pushback Notes
Amazon RDS for MySQL     The Replace the data set option isn't supported for this connection.
Amazon RDS for Oracle  
Amazon RDS for PostgreSQL     The Replace the data set option isn't supported for this connection.
Amazon Redshift  
Amazon S3    
Apache Cassandra    
Apache Derby    
Apache HDFS (formerly known as "Hortonworks HDFS")      
Apache Hive    
Apache Impala  
Box    
Cloud Object Storage  
Cloud Object Storage (infrastructure)  
Cloudant
Cognos Analytics  
DataStax Enterprise      
Db2    
Db2 Big SQL    
Db2 for i    
Db2 for z/OS    
Db2 on Cloud    
Db2 Warehouse    
Dremio      
Dropbox  
Exasol    
FTP (remote file system transfer)  
Generic JDBC     Use the Generic JDBC connection to connect to a data source that doesn't have a defined connection for Cloud Pak for Data.
Google BigQuery     Google BigQuery has these limitations when SQL pushback is enabled:
  • Data streaming isn't used to insert data in a Data Asset Export node
  • Special characters aren't allowed in column names

For more information, see Known issues and limitations for SPSS Modeler.

Google Cloud Storage      
Greenplum    
HDFS via Execution Engine for Hadoop     You can write to an existing data asset, but writing to a new asset isn't currently supported.
Hive via Execution Engine for Hadoop  
HTTP      
IBM Cloud Databases for MySQL    
IBM Cloud Data Engine      
IBM Cloud Databases for MongoDB      
IBM Cloud Databases for PostgreSQL    
IBM Data Virtualization      
IBM watsonx.data Presto    
Impala via Execution Engine for Hadoop      
Informix      
Looker      
MariaDB    
Microsoft Azure Blob Storage      
Microsoft Azure Cosmos DB    
Microsoft Azure Databricks    
Microsoft Azure Data Lake Storage  
Microsoft Azure File Storage      
Microsoft Azure SQL Database  
Microsoft Azure Synapse Analytics    
Microsoft SQL Server    
MinIO      
MongoDB      
MySQL    
Netezza Performance Server    
OData      
Oracle    
Planning Analytics (formerly known as "IBM TM1")     Only the Replace the data set option is supported.
Presto    
PostgreSQL    
Salesforce.com      
SAP ASE      
SAP HANA    
SAP IQ      
SAP OData      
SingleStoreDB      
Snowflake    
SPSS Analytic Server      
Storage volume (formerly known as Mounted volume)   If your data contains a column or row delimiter such as a comma (,), your flow might fail when it tries to write to a storage volume. As a workaround, you can first use a Filler node to replace the delimiters.
Tableau      
Teradata    
Vertica  
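
The note for the Storage volume connector recommends replacing column and row delimiters in the data before writing. The following is a minimal sketch of that transformation in plain Python, not a Filler node or CLEM expression. The function name, the delimiter set, and the replacement character are assumptions chosen for illustration.

```python
def replace_delimiters(value, delimiters=(",", "\n", "\r"), replacement=" "):
    """Replace each delimiter character in a string; pass other types through."""
    if not isinstance(value, str):
        return value
    for delimiter in delimiters:
        value = value.replace(delimiter, replacement)
    return value


# Hypothetical records whose text fields contain column and row delimiters.
rows = [
    {"id": 1, "comment": "fast,reliable"},
    {"id": 2, "comment": "line one\nline two"},
]

# Apply the replacement to every field before the data is written out.
cleaned = [{key: replace_delimiters(val) for key, val in row.items()} for row in rows]
```

In a flow, the equivalent replacement would be configured in the Filler node for the affected fields.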

Data files

In addition to using data from remote data sources or integrated databases, you can use data from files. You can work with data from the following types of files in SPSS Modeler.

Connector Read Only Read & Write Notes
Avro  
CSV, Delimited
Attention: If your .csv file contains any malicious payloads in an input field (in formulas for example), these payloads might be executed.
JSON  
ORC  
Parquet  
SAS  
SAV (SPSS Statistics)    
SHP  
XLS, XLSX (Excel)    
XML  
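
The CSV attention note above concerns fields that a consuming application might interpret as formulas. One common mitigation, sketched here in plain Python as an assumption rather than an SPSS Modeler feature, is to neutralize such fields before the file is shared or imported. The prefix list and function names are illustrative.

```python
import csv
import io

# Leading characters that spreadsheet applications commonly treat as formula starts.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def looks_numeric(cell):
    """Return True for plain numbers such as -5 or +3.2, which are left unchanged."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def neutralize_cell(cell):
    """Prefix a single quote so that the cell is read as text, not as a formula."""
    if cell.startswith(FORMULA_PREFIXES) and not looks_numeric(cell):
        return "'" + cell
    return cell


def sanitize_csv(text):
    """Rewrite CSV text with every formula-like cell neutralized."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([neutralize_cell(cell) for cell in row])
    return out.getvalue()


sample = "name,amount\nalice,=SUM(A1:A2)\nbob,42\n"
sanitized = sanitize_csv(sample)
```

This sketch changes the stored value by adding a quote character, so it suits files that are destined for spreadsheet tools rather than files that must round-trip exactly.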

ODBC drivers

Cloud Pak for Data connections use JDBC drivers. You can also use ODBC drivers to take advantage of SQL optimization and pushback.

Note: ODBC drivers might affect the precision of data. SPSS Modeler usually maintains a precision of 16 significant digits when it uses JDBC drivers, whereas ODBC drivers might round or truncate values. As a result, the same data can produce different values depending on whether it is read through a JDBC driver or an ODBC driver.
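
To illustrate how rounding and truncation can diverge, the following sketch uses Python's decimal module to reduce one value to 16 significant digits and, separately, to truncate it to 10. The 10-digit truncation is a hypothetical stand-in for driver behavior, not a documented ODBC limit, and the helper name is an assumption.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN


def to_significant_digits(value, digits, rounding):
    """Reduce a decimal value to the given number of significant digits."""
    number = Decimal(value)
    if number == 0:
        return number
    # adjusted() is the exponent of the most significant digit.
    quantum = Decimal(1).scaleb(number.adjusted() - digits + 1)
    return number.quantize(quantum, rounding=rounding)


original = "1234.567890123456789"

# Rounded to 16 significant digits, as in the JDBC case described above.
kept16 = to_significant_digits(original, 16, ROUND_HALF_EVEN)

# Truncated to 10 significant digits (hypothetical driver behavior).
truncated = to_significant_digits(original, 10, ROUND_DOWN)

print(kept16)     # 1234.567890123457
print(truncated)  # 1234.567890
```

Comparing the two results shows why totals or joins computed from the same source can differ between driver types.
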
The following ODBC drivers are preinstalled with SPSS Modeler:
  • SPSS Data Access Pack 8.1.1.0
  • Netezza native driver 7.2.1.10
  • Db2 native driver 11.5.4
The following ODBC drivers can be installed through a custom SPSS Modeler image:
  • SAP HANA driver (hanaclient-2.7.26-linux-x64.tar.gz)
  • Exasol driver (EXASOL_ODBC-7.1.4.tar.gz)
  • Teradata driver (TeradataToolsAndUtilitiesBase__linux_x8664.17.20.05.00-1.tar.gz)

For more information, see Building custom images to install ODBC drivers.