Connecting to data sources in Watson Query

Watson Query supports many relational and nonrelational data sources that you can add to your data source environment. Watson Query connects to relational data sources through Java™ Database Connectivity (JDBC) drivers.

Learn how you can connect to your data sources.

Important: Do not create a connection to your Watson Query instance.

Before you begin

  1. If you want to enforce governance for your published objects, set up a governed catalog to publish your assets to. For more information, see Governing virtual data in Watson Query.
  2. Review the list of data sources that are supported for Watson Query. For more information, see Supported data sources in Watson Query.
    Note: Watson Query supports the PEM certificate format only for SSL/TLS-enabled data sources.
  3. For certain data sources, such as Amazon S3, Ceph®, IBM® Cloud Object Storage, Google BigQuery, MinIO, SAP HANA, and Snowflake, you must complete additional steps that are specific to that data source before you add the connection.
  4. For certain data sources, you must enter additional properties to support SSL. For more information, see JDBC drivers default to use the TLSv1.3 protocol.
  5. After a data source is added, any user with virtualize permissions (Watson Query Admin or Engineer roles) can create virtual tables. The user can create virtual tables by using any of the added data sources, no matter which user added the data source. Users with a Watson Query User role must be granted access to the virtualized table or view by using the data request workflow. For more information, see Managing users and roles.
  6. Review limitations and restrictions for data type mapping in Watson Query. For more information, see Supported data sources in Watson Query.
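
Because only the PEM certificate format is supported for SSL/TLS-enabled data sources (see the note in step 2), it can help to check a certificate before you use it. The following is a minimal sketch in Python that uses only the standard library; the function name `looks_like_pem_certificate` is a hypothetical helper and is not part of Watson Query. It checks only that the text has the PEM certificate header and footer with decodable Base64 content between them, assumes a single certificate rather than a chain, and does not verify that the certificate is valid or trusted.

```python
import ssl


def looks_like_pem_certificate(text: str) -> bool:
    """Return True if text is a single PEM-armored certificate.

    Hypothetical helper for illustration only. It checks the
    BEGIN/END CERTIFICATE armor and that the body decodes from Base64.
    It does not validate the certificate contents.
    """
    try:
        # Raises ValueError if the header or footer is missing,
        # or if the Base64 body cannot be decoded.
        der_bytes = ssl.PEM_cert_to_DER_cert(text.strip())
    except ValueError:
        return False
    return len(der_bytes) > 0
```

A certificate in binary DER format fails this check and would need to be converted to PEM first.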

Adding a connection to a data source

To add a connection to a data source in your environment, complete the following steps.

  1. On the navigation menu, click Data > Data virtualization. The service menu opens to the Data sources page by default.

  2. Click Add connection > New connection to see a list of data sources that can be added to Watson Query.

  3. Select the type of data source that you want to connect to and then click Select.

    The type of connection that you create determines the information that you must provide.

    Typically, a connection requires a URL, a hostname, and a port number.

  4. Specify the required information based on the connection that you selected:

    • The connection name and description.
    • The name of the database.
    • The hostname or IP address and the port number of the database, which are required to access the data source.
    • The username and password that allow access to the data source.
      Note: The username and password that are specified here refer to an ID with read-only access to the data source. This ID is required for accessing data from the data source and does not necessarily correspond to a Cloud Pak for Data username or a Watson Query user ID.

      For some data sources, you can use the Cloud Pak for Data credentials to access the data source. To do so, select the corresponding checkbox.

    • For some data sources, you can use secrets from a vault as your credentials. With this option, you select the secrets that contain the appropriate credentials. For example, if you need to specify your username and password, select the secret that contains your username and the secret that contains your password. Watson Query uses the secrets (which are stored in a vault) to authenticate you.

      To use this option, an administrator must enable the vaults feature. For more information, see Enabling vaults for the Cloud Pak for Data web client. Additionally, if you are using secrets from an external vault, you must have the appropriate permissions to connect to external vaults or an administrator must share the appropriate secrets with you. For more information, see Managing secrets and vaults.

    • Any additional properties required to create the connection.
  5. If you want to use SSL to connect to the database, copy the content of the SSL certificate and paste it in the corresponding box.

    Some data sources can use an SSL certificate that is stored as a secret. To use this option, an administrator must enable the vaults feature. For more information, see Enabling vaults for the Cloud Pak for Data web client. Additionally, if you are using secrets from an external vault, you must have the appropriate permissions to connect to external vaults or an administrator must share the appropriate secrets with you. For more information, see Managing secrets and vaults.

  6. Click Create to add the connection to the data source environment.

  7. Optional: Select a remote connector to associate with the data source and click Add to connector.

    For more information, see Installing remote connectors.
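
The values gathered in step 4 can be sanity-checked before you enter them in the form. The following is a minimal sketch in Python; the `ConnectionDetails` class and `find_problems` function are hypothetical names for illustration and are not a Watson Query API. The sketch assumes a connection type that needs a name, database, hostname, port, username, and password. The fields that are actually required depend on the type of connection that you select.

```python
from dataclasses import dataclass


@dataclass
class ConnectionDetails:
    """Hypothetical container for the values entered in step 4."""
    name: str
    database: str
    host: str          # hostname or IP address
    port: int
    username: str      # an ID with read-only access to the data source
    password: str
    description: str = ""


def find_problems(details: ConnectionDetails) -> list[str]:
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    # Text fields assumed to be required for this illustration.
    for field_name in ("name", "database", "host", "username", "password"):
        if not getattr(details, field_name).strip():
            problems.append(f"{field_name} is empty")
    # TCP port numbers range from 1 to 65535.
    if not 1 <= details.port <= 65535:
        problems.append("port must be between 1 and 65535")
    return problems
```

For example, `find_problems(ConnectionDetails("sales", "", "db.example.com", 70000, "reader", "secret"))` reports the empty database name and the out-of-range port. The hostname and credentials shown are placeholders.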

Adding a data source from an existing platform connection

To add a data source from an existing platform connection, complete the following steps.

Important:

Generally, you should edit connections from Watson Query after you add a data source from an existing platform connection. If you edit a connection from the Platform connections page, changes to schema, tables, and other parameters might not be available immediately in Watson Query. You might not see changes for up to ten minutes. Editing a connection from Watson Query allows the connection to synchronize correctly when changes are made and data sources are refreshed.

  1. On the navigation menu, click Data > Data virtualization. The service menu opens to the Data sources page by default.

  2. Click the Add connection drop-down menu and click Existing platform connection to see a list of data sources that can be added to Watson Query.

  3. Select the data source that you want to add and click Add.

  4. Optional: Select a remote connector to associate with the data source and click Add to connector.

    For more information, see Installing remote connectors.
