Connecting to data sources in Watson Query

Watson Query supports many relational and nonrelational data sources that you can add to your data source environment. Watson Query connects to relational data sources by using the Java™ Database Connectivity (JDBC) protocol.

Learn how you can connect to your data sources.

Important: Do not create a connection to your Watson Query instance.
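
For orientation, the hostname, port number, and database name that a relational connection asks for usually correspond to the parts of a JDBC URL. The following Python sketch is illustrative only. The `jdbc_url` helper is hypothetical, and the `jdbc:<subprotocol>://host:port/database` layout is the common form used by drivers such as Db2 and PostgreSQL; other drivers use different layouts. Watson Query collects these values through the connection form that is described later in this topic.

```python
def jdbc_url(subprotocol: str, host: str, port: int, database: str) -> str:
    """Compose a JDBC URL in the common jdbc:<subprotocol>://host:port/database layout.

    Hypothetical helper for illustration; not part of Watson Query.
    """
    if not host:
        raise ValueError("A hostname or IP address is required.")
    if not 1 <= port <= 65535:
        raise ValueError(f"Port {port} is outside the valid range 1-65535.")
    if not database:
        raise ValueError("A database name is required.")
    return f"jdbc:{subprotocol}://{host}:{port}/{database}"


# Example values are placeholders, not real servers.
print(jdbc_url("db2", "db.example.com", 50000, "SALES"))
# jdbc:db2://db.example.com:50000/SALES
print(jdbc_url("postgresql", "10.0.0.12", 5432, "inventory"))
# jdbc:postgresql://10.0.0.12:5432/inventory
```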

Before you begin

  1. If you want to enforce governance for your published objects, set up a governed catalog to publish your assets to. For more information, see Governing virtual data in Watson Query.
  2. If you want to connect to data sources that use Kerberos authentication, you must provide the Kerberos configuration file in Watson Query before you create the connection. For more information, see Enabling Kerberos authentication in Watson Query.
  3. Review the list of data sources that are supported for Watson Query. For more information, see Supported data sources in Watson Query.
    Note: Watson Query supports the PEM certificate format only for SSL/TLS-enabled data sources.
  4. For certain data sources, such as Amazon S3, Ceph®, IBM Cloud Object Storage, Google BigQuery, MinIO, SAP HANA, and Snowflake, you must complete additional steps that are specific to that data source.
  5. After a data source is added, any user with virtualize permissions (Watson Query Admin or Engineer roles) can create virtual tables. The user can create virtual tables by using any of the added data sources, no matter which user added the data source. For more information, see Managing users and roles.
  6. Review limitations and restrictions for data type mapping in Watson Query. For more information, see Supported data sources in Watson Query.
  7. Review Data source connection access restrictions in Watson Query to plan who you want to be able to access the data source connection and what privileges you want them to have.
Note: In Watson Query versions 4.7.0 and 4.7.1, if a connection is removed and needs to be added again, only the connection owner or an Admin user who has the SECADM privilege can add the connection back. The owner or Admin can also transfer ownership to another user, who can then add the removed connection back.
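
Because Watson Query accepts only PEM-format certificates for SSL/TLS-enabled data sources, it can help to check a certificate before you paste its content into the connection form. The following Python sketch is a minimal check under stated assumptions: `looks_like_pem_certificate` is a hypothetical helper, and it verifies only the PEM wrapping (the BEGIN CERTIFICATE and END CERTIFICATE lines around Base64 text), not whether the certificate is valid or trusted.

```python
import base64
import binascii
import re

# A PEM certificate is Base64 text between BEGIN/END CERTIFICATE marker lines.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----\s+"
    r"(?P<body>[A-Za-z0-9+/=\s]+?)"
    r"\s*-----END CERTIFICATE-----"
)


def looks_like_pem_certificate(text: str) -> bool:
    """Return True if text contains a PEM-wrapped block with decodable Base64.

    Hypothetical helper: this checks the encoding only, not the X.509 content.
    """
    match = PEM_BLOCK.search(text)
    if match is None:
        return False
    # Remove line breaks and spaces before decoding the Base64 body.
    body = "".join(match.group("body").split())
    try:
        base64.b64decode(body, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


# Placeholder payload for illustration; this is not a real certificate.
payload = base64.b64encode(b"placeholder bytes, not a real certificate").decode("ascii")
sample = f"-----BEGIN CERTIFICATE-----\n{payload}\n-----END CERTIFICATE-----\n"

print(looks_like_pem_certificate(sample))              # True
print(looks_like_pem_certificate("\x30\x82\x03\x1f"))  # False (binary DER-style bytes)
```

A certificate in another encoding, such as binary DER, fails this check and would need to be converted to PEM first.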

Adding a connection to a data source

To add a connection to a data source in your environment, complete the following steps.

  1. On the navigation menu, click Data > Data virtualization. The service menu opens to the Data sources page by default.

  2. Click Add connection > New connection to see a list of data sources that can be added to Watson Query.

  3. Select the type of data source that you want to connect to and then click Select.

    The type of connection that you create determines the information that you must provide.

    Typically, a connection requires a URL, a hostname, and a port number.

  4. Specify the required information based on the connection that you selected:

    • The connection name and description.
    • The name of the database.
    • The hostname or IP address and port number of the database, which is required to access the connection.
    • The username and password that allow access to the data source.
      Note: The username and password that are specified here refer to an ID with read-only access to the data source. This user is required for accessing data from the data source and does not necessarily correspond to a Cloud Pak for Data username or a Watson Query user ID.

      For some data sources, you can use the Cloud Pak for Data credentials to access the data source. To do so, select the corresponding checkbox.

    • For some data sources, you can use secrets from a vault as your credentials. With this option, you select the secrets that contain the appropriate credentials. For example, if you need to specify your username and password, select the secret that contains your username and the secret that contains your password. Watson Query uses the secrets (which are stored in a vault) to authenticate you.

      If you are using secrets from an external vault, you must have the appropriate permissions to connect to external vaults or an administrator must share the appropriate secrets with you. For more information, see Managing secrets and vaults.

    • For data sources with Kerberos authentication, the service principal name, user principal name, and a keytab file are required to create the connection.
    • Any additional properties required to create the connection.
  5. If you want to use SSL to connect to the database, copy the content of the SSL certificate and paste it in the corresponding box.

    Some data sources can use an SSL certificate that is stored as a secret. If you are using secrets from an external vault, you must have the appropriate permissions to connect to external vaults or an administrator must share the appropriate secrets with you. For more information, see Managing secrets and vaults.

  6. Add collaborators to the data source connection to determine who can access it. On the Add collaborators and add the connection to a remote connector page, the creator is listed as a collaborator. Choose from the following options to add additional collaborators to the connection:
    • Select Skip to create the data source connection without any additional collaborators or remote connectors. This means only the creator of the connection can view and use it.
    • Select Add collaborators > Users and User Groups and select the users and user groups that you want to add as collaborators. Any users that you select, and any users that belong to groups that you select, can access the connection.
    • Select Add collaborators > Roles and select the roles that you want to add as collaborators. You can add the Engineer role, the Admin role, or both. Any users that have the roles that you select can access the connection.

    For more information about collaborators, see Collaborators.

  7. Click Create to add the connection to the data source environment.

  8. Optional: Select a remote connector to associate to the data source and click Add to connector.

    For more information, see Accessing data sources by using remote connectors in Watson Query.

  9. Manage access for the connection to determine what database tasks the collaborators can perform on the connection.
    1. On the Data sources page, click the vertical overflow menu icon and select Manage access. On the Manage access page, you can see the collaborators and their currently assigned privileges.
    2. For each Grantee (user, user group, or role that has access to the data source connection), grant or revoke privileges. You can grant only the privileges that are assigned to you, and you must also have the GRANT privilege. For more information, see Connection privileges.
    3. Optional: Add more collaborators from the Manage access page.
    4. Apply your changes.
  10. Transfer ownership of the data source connection. See Transferring ownership of data sources in Watson Query.
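
The granting rule in the Manage access step (you can grant only privileges that you hold yourself, and only if you also hold the GRANT privilege) can be pictured as a set intersection. The following Python sketch is illustrative only: `grantable_privileges` is a hypothetical function, and `PRIV_A` and `PRIV_B` are placeholder names, not actual Watson Query privilege names. See Connection privileges for the real list.

```python
from __future__ import annotations


def grantable_privileges(granter_holds: set[str], requested: set[str]) -> set[str]:
    """Return the subset of requested privileges that the granter may grant.

    Hypothetical model of the documented rule, not Watson Query code.
    """
    # Without the GRANT privilege, nothing can be granted.
    if "GRANT" not in granter_holds:
        return set()
    # Only privileges that the granter already holds can be passed on.
    return requested & granter_holds


# PRIV_A and PRIV_B are placeholder privilege names.
print(grantable_privileges({"GRANT", "PRIV_A"}, {"PRIV_A", "PRIV_B"}))  # {'PRIV_A'}
print(grantable_privileges({"PRIV_A"}, {"PRIV_A"}))                     # set()
```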

Adding a data source from an existing platform connection

To add a data source from an existing platform connection, complete the following steps.

Important:

Generally, you should edit connections from Watson Query after you add a data source from an existing platform connection. If you edit a connection from the Platform connections page, changes to schema, tables, and other parameters might not be available immediately in Watson Query. You might not see changes for up to ten minutes. Editing a connection from Watson Query allows the connection to synchronize correctly when changes are made and data sources are refreshed.

  1. On the navigation menu, click Data > Data virtualization. The service menu opens to the Data sources page by default.

  2. Click the Add connection drop-down menu and click Existing platform connection to see a list of data sources that can be added to Watson Query.

  3. Select the data source that you want to add and click Add.

  4. Add collaborators to the data source connection to determine who can access it. On the Add collaborators and add the connection to a remote connector page, the creator is listed as a collaborator. Choose from the following options to add additional collaborators to the connection:
    • Select Skip to create the data source connection without any additional collaborators or remote connectors. This means only the creator of the connection can view and use it.
    • Select Add collaborators > Users and User Groups and select the users and user groups that you want to add as collaborators. Any users that you select, and any users that belong to groups that you select, can access the connection.
    • Select Add collaborators > Roles and select the roles that you want to add as collaborators. You can add the Engineer role, the Admin role, or both. Any users that have the roles that you select can access the connection.

    For more information about collaborators, see Collaborators.

  5. Optional: Select a remote connector to associate to the data source and click Add to connector.

    For more information, see Accessing data sources by using remote connectors in Watson Query.

  6. Click Add to add the connection.
    Note: When you add data source connections in Watson Query, you might need to refresh twice on the Virtualize page. The first refresh notification is displayed when new data source connections are added. Click Refresh to reload tables, including those from new connections. After tables reload, a second notification appears. Click Refresh again to update your table list with newly loaded tables.
  7. Manage access for the connection to determine what database tasks the collaborators can perform on the connection.
    1. On the Data sources page, click the vertical overflow menu icon and select Manage access. On the Manage access page, you can see the collaborators and their currently assigned privileges.
    2. For each Grantee (user, user group, or role that has access to the data source connection), grant or revoke privileges. You can grant only the privileges that are assigned to you, and you must also have the GRANT privilege. For more information, see Connection privileges.
    3. Optional: Add more collaborators from the Manage access page.
    4. Apply your changes.
  8. Transfer ownership of the data source connection. See Transferring ownership of data sources in Watson Query.
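
The collaborator options in the steps above determine who can access a connection: the creator, any selected users, any members of selected user groups, and any users who have a selected role. The following Python sketch models that rule under stated assumptions. The `Collaborators` class, the `can_access` function, and the user and group names are hypothetical; only the Engineer and Admin role names come from this topic.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Collaborators:
    """Hypothetical model of the collaborators on one data source connection."""
    creator: str
    users: set[str] = field(default_factory=set)
    groups: set[str] = field(default_factory=set)
    roles: set[str] = field(default_factory=set)


def can_access(user: str, user_groups: set[str], user_roles: set[str],
               collab: Collaborators) -> bool:
    """Return True if the user reaches the connection through any collaborator path."""
    return (
        user == collab.creator                 # the creator is always listed
        or user in collab.users                # added directly as a user
        or bool(user_groups & collab.groups)   # member of an added user group
        or bool(user_roles & collab.roles)     # holds an added role
    )


# Choosing Skip leaves only the creator as a collaborator.
skipped = Collaborators(creator="alice")
print(can_access("alice", set(), set(), skipped))        # True
print(can_access("bob", set(), {"Engineer"}, skipped))   # False

# Adding the Engineer role opens the connection to every user with that role.
shared = Collaborators(creator="alice", groups={"analysts"}, roles={"Engineer"})
print(can_access("bob", set(), {"Engineer"}, shared))    # True
print(can_access("carol", {"analysts"}, set(), shared))  # True
print(can_access("dave", {"sales"}, set(), shared))      # False
```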

Learn more