DataStage connectors

A DataStage® connector is a palette node that provides data connectivity and metadata integration for external data sources, such as relational databases, public cloud storage services, or messaging software.

Connectors for remote data sources

Before you can read data from or write data to a remote data source in DataStage, you must create a project connection asset for the associated DataStage connector. A connection asset contains the information necessary to connect to the data source.

  1. To create the connection asset or the "(optimized)" connection asset, click Add to project > Connection on the project page. For more information, see Adding connections to projects.
  2. Open DataStage and add the associated connector to the canvas. Double-click the connector node on the canvas to open its Details card.
  3. In the Details card, open the Stage tab, go to Properties > Connection, and select the connection.

    Optional: Because the connectors are listed on the DataStage palette, you can build your flow first and add the connection asset later.

The "(optimized)" version of a connection provides better performance and more features, such as before and after SQL statements, sparse lookup, and reject links. However, you cannot use an "(optimized)" connection with other tools. If you already created a connection that is available to other tools (for example, Salesforce.com) and you want to reuse it in DataStage, you can use that connection instead.
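The trade-off described above can be written down as a small decision helper. The following Python sketch is illustrative only: the names Requirements and choose_variant are hypothetical and are not part of DataStage, and the choice to default to the "(optimized)" version when nothing else is required is an assumption based on the performance statement above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """What a flow needs from a connection (hypothetical helper type)."""
    before_after_sql: bool = False
    sparse_lookup: bool = False
    reject_links: bool = False
    shared_with_other_tools: bool = False


def choose_variant(req: Requirements) -> str:
    """Return "optimized" or "standard" for the given requirements."""
    needs_optimized_features = (
        req.before_after_sql or req.sparse_lookup or req.reject_links
    )
    if req.shared_with_other_tools:
        if needs_optimized_features:
            # An "(optimized)" connection cannot be used with other tools,
            # so a single connection asset cannot satisfy both needs.
            raise ValueError(
                "optimized-only features requested for a connection "
                "that must also be shared with other tools"
            )
        return "standard"
    # Assumption: prefer the "(optimized)" version when sharing is not needed.
    return "optimized"


print(choose_variant(Requirements(sparse_lookup=True)))  # prints: optimized
```

When both needs apply, one option is to create two connection assets for the same data source: a standard one for other tools and an "(optimized)" one for DataStage.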

Data sources are also available through the ODBC connection, which is also optimized for DataStage. To use one, select the ODBC connection on the Add connection page, and then select a data source on the Create connection page.

Use the Generic JDBC connection to connect to a data source that has no connector defined for Cloud Pak for Data.
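The guidance above describes three routes to a data source: a dedicated connection (sometimes with an "(optimized)" version), a data source in the ODBC connection, or the Generic JDBC connection when no connector is defined. The Python sketch below shows one way to model that lookup. The lookup tables hold only a few entries copied from the table in this topic, and the function name connection_options and the output labels are illustrative assumptions, not part of DataStage.

```python
# Excerpt only: a few entries from the connection table in this topic.
CONNECTIONS = {"IBM Db2", "Oracle", "PostgreSQL", "Amazon S3"}
OPTIMIZED = {"IBM Db2": "Db2 (optimized)", "Oracle": "Oracle (optimized)"}
ODBC_DATA_SOURCES = {"IBM Db2", "Oracle", "PostgreSQL", "MongoDB", "Impala"}


def connection_options(source):
    """List the ways a data source can be reached, per the excerpt above."""
    options = []
    if source in CONNECTIONS:
        options.append(source)
    if source in OPTIMIZED:
        options.append(OPTIMIZED[source])
    if source in ODBC_DATA_SOURCES:
        options.append(f"ODBC ({source} data source)")
    if source not in CONNECTIONS:
        # No dedicated connector is defined, so Generic JDBC is a candidate.
        options.append("Generic JDBC")
    return options


for name in ("Oracle", "MongoDB", "SQLite"):
    print(name, "->", connection_options(name))
```

A source that is missing from the excerpt, such as the hypothetical "SQLite" entry, falls through to Generic JDBC only.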

| Connection | Optimized version | Available in the ODBC connection |
| Amazon RDS for PostgreSQL | | PostgreSQL data source in ODBC |
| Amazon Redshift | | |
| Amazon S3 (1) | | |
| Apache Cassandra | Apache Cassandra (optimized)* | Apache Cassandra data source in ODBC |
| Apache HDFS | | |
| Apache Hive (1) (2) | | Apache Hive data source in ODBC |
| | Apache Kafka* | |
| Databases for PostgreSQL | | |
| FTP | | |
| Generic JDBC (1) (3) | | |
| Google BigQuery | | |
| Google Cloud Pub/Sub | | |
| Google Cloud Storage | | |
| Greenplum | | Greenplum data source in ODBC |
| HTTP (2) | | |
| IBM Cloud Object Storage (4) | | |
| IBM Data Virtualization | | |
| IBM Data Virtualization Manager for z/OS | | |
| IBM Db2 | Db2 (optimized)* | IBM Db2 data source in ODBC |
| IBM Db2 Big SQL | | IBM Db2 data source in ODBC |
| IBM Db2 Event Store | | IBM Db2 data source in ODBC |
| IBM Db2 for i | | IBM Db2 data source in ODBC |
| IBM Db2 for z/OS | | IBM Db2 data source in ODBC |
| IBM Db2 Hosted | | IBM Db2 data source in ODBC |
| IBM Db2 on Cloud | | IBM Db2 data source in ODBC |
| | | IBM Db2 on iSeries (AS400) data source in ODBC |
| | | IBM Db2 on Linux on System z data source in ODBC |
| IBM Db2 Warehouse | | IBM Db2 data source in ODBC |
| IBM Informix | | IBM Informix data source in ODBC |
| IBM Netezza (PureData System for Analytics) | Netezza (optimized)* | IBM Netezza data source in ODBC |
| | | Impala data source in ODBC |
| Microsoft Azure Blob Storage | | |
| Microsoft Azure Data Lake Store | | |
| Microsoft Azure File Storage | | |
| Microsoft SQL Server | | Microsoft SQL Server data source in ODBC |
| | | MongoDB data source in ODBC |
| MySQL | | MySQL data source in ODBC |
| ODBC (5) | | |
| Oracle | Oracle (optimized)* | Oracle data source in ODBC |
| PostgreSQL | | PostgreSQL data source in ODBC |
| Salesforce.com (2) | Salesforce.com (optimized)* | |
| SAP ASE (2) | | SAP ASE data source in ODBC (6) |
| | | SAP IQ data source in ODBC |
| SAP OData | | |
| Snowflake (1) (7) | | |
| Teradata | | |

* Denotes a project connection that is for DataStage only.
(1) In the Details card, select Use DataStage properties to access the DataStage-specific properties. The DataStage-specific properties provide more features and more granular control of the flow execution, similar to the DataStage "(optimized)" connectors.
(2) Supports source connections only.
(3) Restriction: The CREATE statement is not supported for connecting to a MongoDB database.
(4) You must create the Cloud Object Storage credentials with the Hash-based Message Authentication Code (HMAC) option. See Using HMAC credentials.
(5) Select the ODBC connection on the Add connection page, and then select a data source on the Create connection page.
(6) Supports source and target connections.
(7) For example, for Snowflake, the DataStage properties have explicit options for Create and Append operations.

Other types of connector components

These entries in the palette do not require that you create a connection asset in the project.