Virtualizing data with Watson Query

Use Watson Query to create a virtual table to segment or combine data from one or more tables. Watson Query connects multiple data sources into a single self-balancing collection of data sources or databases.

Overview

This service is not available by default. An administrator must install this service on the IBM Cloud Pak® for Data platform. To determine whether the service is installed, open the Services catalog and check whether the service is enabled.

Watson Query has no prerequisite services or service integrations. Watson Query provisions IBM® Db2 Data Management Console if it is not provisioned already. If you want to publish your virtual data to a governed catalog, you must install Watson Knowledge Catalog. For more information, see Watson™ Knowledge Catalog on Cloud Pak for Data.

The most common mechanism for virtualizing data is to create a table view or virtual table.

Figure 1. Connection of multiple data sources into a single collection
The image shows how Watson Query connects multiple data sources into a single self-balancing collection of data sources or databases.

Similar tables from multiple sources can be combined into a single virtual table, which creates a unified definition that contains the columns and data from all participating data sources. These combined tables are referred to as grouped tables. Segmentation is vertical: the virtual table contains either a subset or a superset of the source columns, depending on the columns that you select. You can then run queries against the resulting virtual table the same way that you would query any of the base tables. For more information, see Creating a virtualized table from multiple data sources in Watson Query.
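The idea of a grouped table can be sketched with a small, self-contained example. The following Python snippet does not use Watson Query or any of its APIs. It uses the standard-library sqlite3 module as a stand-in, and the table names, column names, and sample rows (customers_us, customers_eu, customers_all) are hypothetical. It combines two similar tables into one view that keeps only the columns they share, and then queries that view as if it were a base table.

```python
import sqlite3

# Illustration only: SQLite stands in for the virtualization layer.
# All table names, column names, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers_us (id INTEGER, name TEXT, state TEXT);
    CREATE TABLE customers_eu (id INTEGER, name TEXT, country TEXT);
    INSERT INTO customers_us VALUES (1, 'Ada', 'NY'), (2, 'Bo', 'CA');
    INSERT INTO customers_eu VALUES (3, 'Cy', 'DE');

    -- Vertical segmentation: select only the columns that both tables share,
    -- and tag each row with the source it came from.
    CREATE VIEW customers_all AS
        SELECT id, name, 'us' AS source FROM customers_us
        UNION ALL
        SELECT id, name, 'eu' AS source FROM customers_eu;
""")

# Query the combined view the same way as any base table.
rows = conn.execute(
    "SELECT id, name, source FROM customers_all ORDER BY id"
).fetchall()
print(rows)
```

In this sketch the view plays the role of the grouped table: consumers see one definition and do not need to know which underlying table a row came from.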

After you provision the Watson Query service, you can manage users, connect to multiple data sources, create and govern virtual assets, and then consume the virtualized data.

Figure 2. Virtualizing data with the Watson Query service
Connect, Join, Create Views, and Consume are the main actions that are needed for Watson Query.
  1. Connect: Start by connecting to data sources. You can connect to multiple data sources. For more information, see Adding data sources and Supported data sources in Watson Query.
  2. Join, create, and govern: Then, create virtual tables, group tables by schema, associate data with projects, and govern your virtual assets. For more information, see Creating a virtualized table and Governing virtual data in Watson Query.
  3. Consume: Finally, consume virtual tables in projects, dashboards, data catalogs, and other applications. For more information, see Analytics services, Dashboard services, and Data governance services.
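The connect, join, and consume flow above can be illustrated with a minimal sketch. This is an analogy only, not the Watson Query interface: it uses Python's standard-library sqlite3 module, treats an attached in-memory database as a second data source, and uses hypothetical names throughout (warehouse, orders, products, order_lines). Governance is not modeled.

```python
import sqlite3

# Illustration only: SQLite stands in for the service; all names are hypothetical.
conn = sqlite3.connect(":memory:")

# 1. Connect: attach a second in-memory database to act as another data source.
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
conn.executescript("""
    CREATE TABLE main.orders (order_id INTEGER, product_id INTEGER, qty INTEGER);
    CREATE TABLE warehouse.products (product_id INTEGER, title TEXT);
    INSERT INTO main.orders VALUES (10, 1, 2), (11, 2, 5);
    INSERT INTO warehouse.products VALUES (1, 'bolt'), (2, 'nut');
""")

# 2. Join and create: define a view that joins tables across the two sources.
#    SQLite allows only TEMP views to reference more than one attached database.
conn.execute("""
    CREATE TEMP VIEW order_lines AS
    SELECT o.order_id, p.title, o.qty
    FROM main.orders AS o
    JOIN warehouse.products AS p ON p.product_id = o.product_id
""")

# 3. Consume: query the view like a regular table.
order_lines = conn.execute(
    "SELECT order_id, title, qty FROM order_lines ORDER BY order_id"
).fetchall()
print(order_lines)
```

The point of the sketch is the shape of the workflow: the consumer runs one query against one object, while the data stays in its separate sources.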

Learn more

For more information about these tasks, see the following resources. You can also review these tasks in a tutorial on Data Virtualization on IBM Cloud Pak for Data.