Supported data sources
Ways to connect to your data
Use the following list to choose the connection method that best fits your use case.
- Creating project connections
- In general, project connections simplify the process of creating and maintaining connections.
You create the connection and then multiple services can refer to the connection. If you update the
connection, the changes are automatically picked up by the projects that use the connection.
You can create project connections from the Connectivity page. These connections can be used by various services across workspaces. However, the Connectivity page is available only if the Cloud Pak for Data common core services are installed.
Consider creating connections at the project level if the following statements are true:
- The services support project connections.
- The same connection needs to be used by multiple services or instances or across multiple projects.
- You have the appropriate permissions to create project connections.
You must have the Editor or Admin role on the Connectivity page.
Project connections are visible to all users. However, only users with the credentials for the data source can use the connection.
If you don't see the type of data source that you want to connect to, a Cloud Pak for Data administrator can create a custom JDBC connector for the data source. If you are connecting to only one data source and users do not need a repeatable method to connect to it, you can create a Generic JDBC connection.
Not all services support the same types of connections. If you want to use a connection from the Connectivity catalog, the list of connections is filtered based on the types of connections that the service supports. For example, if you are using a connection to add a data source to a project, only connections that are supported for projects are displayed.
- Creating connections in other workspaces
- Create connections in other workspaces that are specific to certain services:
- Data Virtualization. For more information, see Connecting to data sources in Data Virtualization.
- Master data. For more information, see Managing master data by using IBM Match 360.
- Db2 Big SQL. For more information, see Creating a service instance for Db2 Big SQL.
Internet protocols with data sources
You can use Internet Protocol version 6 (IPv6) with supported databases.
When you create a connection, you can specify the hostname or the IP address in IPv6 format to connect to the corresponding host.
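As an illustration of working with IPv6 addresses, the following minimal Python sketch validates a host value and formats it for a `host:port` endpoint string. Wrapping IPv6 literals in square brackets is the general URL convention (RFC 3986); whether a particular connection form or driver expects brackets is an assumption here and is not stated on this page. The function names `format_host_for_url` and `build_endpoint` are hypothetical.

```python
import ipaddress


def format_host_for_url(host: str) -> str:
    """Return the host in a form that is safe to place before ':port'.

    IPv6 literals are compressed and wrapped in square brackets.
    IPv4 literals and hostnames are returned without brackets.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat the value as a hostname.
        return host
    if addr.version == 6:
        return f"[{addr.compressed}]"
    return str(addr)


def build_endpoint(host: str, port: int) -> str:
    """Build a 'host:port' string (hypothetical helper for illustration)."""
    return f"{format_host_for_url(host)}:{port}"


if __name__ == "__main__":
    # 2001:db8::/32 is the address range reserved for documentation.
    print(build_endpoint("2001:db8::1", 50000))
    print(build_endpoint("db.example.com", 50000))
```

Without the brackets, the colons inside an IPv6 address cannot be distinguished from the colon that separates the host from the port.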
Connectors
The following table lists the data sources that you can connect to from Cloud Pak for Data.
Other data sources
An administrator can upload JDBC drivers to enable connections to more data sources.
The Data Virtualization service supports connections that are established by using third-party JDBC drivers.
Data files
In addition to using data from remote data sources or integrated databases, you can use data from files. You can work with data from the following types of files.
Type of data file | Supported in |
---|---|
Avro | DataStage, IBM Knowledge Catalog, SPSS Modeler, Watson Studio |
CSV | DataStage, Decision Optimization, IBM Knowledge Catalog, SPSS Modeler, Data Virtualization, Watson Studio |
JSON | DataStage, Decision Optimization (JSON tabular form), IBM Knowledge Catalog, Data Virtualization, Watson Studio |
Microsoft Excel spreadsheets | DataStage, IBM Knowledge Catalog, SPSS Modeler, Data Virtualization, Watson Studio |
ORC | DataStage, Data Virtualization |
Parquet | DataStage, IBM Knowledge Catalog, Data Virtualization, Watson Studio |
SAS | SPSS Modeler, Watson Studio (Data Refinery) |
SAV | DataStage, SPSS Modeler |
TSV | DataStage, IBM Knowledge Catalog, Data Virtualization, Watson Studio (Data Refinery) |
XML | DataStage, Decision Optimization (XML tabular form), SPSS Modeler |
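
To check quickly which services can work with a given file, the table above can be encoded as a lookup. The following Python sketch is illustrative only: the service lists are copied from the table, but the mapping from file extensions to file types (`EXTENSION_TO_TYPE`) and the function name `services_for` are assumptions that are not part of the product.

```python
from pathlib import Path

# Service support per file type, copied from the table above.
SUPPORTED_IN = {
    "Avro": ["DataStage", "IBM Knowledge Catalog", "SPSS Modeler", "Watson Studio"],
    "CSV": ["DataStage", "Decision Optimization", "IBM Knowledge Catalog",
            "SPSS Modeler", "Data Virtualization", "Watson Studio"],
    "JSON": ["DataStage", "Decision Optimization (JSON tabular form)",
             "IBM Knowledge Catalog", "Data Virtualization", "Watson Studio"],
    "Microsoft Excel spreadsheets": ["DataStage", "IBM Knowledge Catalog",
                                     "SPSS Modeler", "Data Virtualization",
                                     "Watson Studio"],
    "ORC": ["DataStage", "Data Virtualization"],
    "Parquet": ["DataStage", "IBM Knowledge Catalog", "Data Virtualization",
                "Watson Studio"],
    "SAS": ["SPSS Modeler", "Watson Studio (Data Refinery)"],
    "SAV": ["DataStage", "SPSS Modeler"],
    "TSV": ["DataStage", "IBM Knowledge Catalog", "Data Virtualization",
            "Watson Studio (Data Refinery)"],
    "XML": ["DataStage", "Decision Optimization (XML tabular form)",
            "SPSS Modeler"],
}

# Assumed mapping from common file extensions to the file types in the table.
EXTENSION_TO_TYPE = {
    ".avro": "Avro",
    ".csv": "CSV",
    ".json": "JSON",
    ".xls": "Microsoft Excel spreadsheets",
    ".xlsx": "Microsoft Excel spreadsheets",
    ".orc": "ORC",
    ".parquet": "Parquet",
    ".sas7bdat": "SAS",
    ".sav": "SAV",
    ".tsv": "TSV",
    ".xml": "XML",
}


def services_for(filename: str) -> list[str]:
    """Return the services listed for the file's type, or [] if unknown."""
    file_type = EXTENSION_TO_TYPE.get(Path(filename).suffix.lower())
    return list(SUPPORTED_IN.get(file_type, []))


if __name__ == "__main__":
    print(services_for("sales.orc"))
    print(services_for("survey.SAV"))
```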