Creating a virtualized table from a single data source in Watson Query

Important: IBM Cloud Pak® for Data Version 4.8 will reach end of support (EOS) on 31 July 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.

You can create a virtual table that uses any of the supported data sources.

Remember:

The data requests (Data > Data requests) feature was removed in Cloud Pak for Data Version 4.8.0. Consider workflows instead.

About this task

These steps describe how to virtualize data without business terms. If you want to govern your virtual data, see Governing virtual data.

Procedure

On the navigation menu, click Data > Data virtualization to reveal the service menu.
The service menu opens to the Data sources page by default.
On the service menu, click Virtualization > Virtualize and click the Tables tab.

The list of connections appears in the default Explore view. Click a connection, schema, or table to select it and preview the contents.

The list of available tables in your connections appears in the List view. You can filter the listed tables by adding filters on the Data sources page. Additionally, you can search for tables by name, schema, column, or business term.

Note: When you add data source connections in Watson Query, you might need to refresh the Virtualize page twice. The first refresh notification is displayed when new data source connections are added. Click Refresh to reload tables, including new connections. After tables reload, a second notification is displayed. Click Refresh again to update your table list with newly loaded tables.

The list of available tables includes tables for which read permission is not granted (nonreadable tables).

Note: The Virtualize page automatically excludes tables that reside in system or application schemas of the remote sources. You can override this restriction by specifying a list of schemas to reveal.

Large data sources, for example those with more than 100,000 tables, slow down the loading of tables on the Virtualize page. To reduce the number of tables that a large data source displays in the List view, use the Filter data option on the Data sources page.

For more information, see Troubleshooting virtualization issues in Watson Query.
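
The schema exclusion described in the preceding note can be pictured with a small Python sketch. It is only an illustration of the rule (system schemas are hidden unless they are explicitly revealed) and is not Watson Query code or its API. The `RemoteTable` type, the `visible_tables` function, and the schema names in `SYSTEM_SCHEMAS` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteTable:
    schema: str
    name: str


# Hypothetical examples of system schemas; the actual set depends on the remote source.
SYSTEM_SCHEMAS = {"SYSIBM", "SYSCAT", "INFORMATION_SCHEMA"}


def visible_tables(tables, system_schemas=SYSTEM_SCHEMAS, reveal=()):
    """Return tables outside system schemas, plus tables in explicitly revealed schemas."""
    revealed = {s.upper() for s in reveal}
    hidden = {s.upper() for s in system_schemas} - revealed
    return [t for t in tables if t.schema.upper() not in hidden]


tables = [
    RemoteTable("SALES", "ORDERS"),
    RemoteTable("SYSCAT", "TABLES"),
    RemoteTable("INFORMATION_SCHEMA", "COLUMNS"),
]

# Default behavior: tables in system schemas are excluded.
default_view = visible_tables(tables)

# Override: reveal one system schema by name.
with_override = visible_tables(tables, reveal=["SYSCAT"])
```

In this sketch, `default_view` contains only `SALES.ORDERS`, and `with_override` also contains `SYSCAT.TABLES`.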
If you create a virtual table that includes a nonreadable table, select one of the following options to make your virtual table queryable.
- Option 1: Ensure that the user who is assigned to the data source connection can access the nonreadable table. You can add this access before or after the table is virtualized.
- Option 2: Create a data source connection in Watson Query that accesses the same data source with the credentials of a different user who has read access to the table.
Select the table that you want to virtualize and click Add to cart.
Click View cart to view your selections. From this window, you can also edit the table and schema names or remove a selection from your cart.
Select the appropriate sharing options for the virtualized table.
Select Publish to catalog if you also want to publish to a selected catalog.
A list of available catalogs is shown in the drop-down menu. Each catalog is tagged as Governed or Not governed.
Note: You must have at least one catalog in IBM Knowledge Catalog.
You must have permission to publish to a catalog. An administrator can enforce publishing of all virtual objects to a selected governed catalog. When this setting is enabled, users cannot publish to a catalog of their own choosing.
Specify a schema in the Schema field.
You can also create a schema. The options depend on your role.
- If you have the Watson Query Engineer or User role, leave the Schema field as default to create a schema with your user ID.
- If you have the Watson Query Admin role, leave the Schema field as default to create a schema with your user ID or enter the new schema name in the Schema field.
For more information, see Creating schemas for virtual objects.
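
The role-based rule for creating schemas can be summarized in a short Python sketch. It is a model of the two bullets above, not Watson Query code. The function name `schema_to_create` and the role strings are hypothetical. The sketch also assumes that a user without the Admin role who enters a new schema name is refused, which the text implies but does not state.

```python
from typing import Optional

# Role names shortened from "Watson Query Admin", "Watson Query Engineer", and "Watson Query User".
KNOWN_ROLES = {"Admin", "Engineer", "User"}


def schema_to_create(role: str, user_id: str, requested: Optional[str] = None) -> str:
    """Return the name of the schema that would be created for this role and input."""
    if role not in KNOWN_ROLES:
        raise ValueError(f"unknown role: {role}")
    if not requested:
        # Schema field left as default: every role gets a schema named after the user ID.
        return user_id
    if role == "Admin":
        # Only the Admin role can enter a new schema name.
        return requested
    # Assumption: other roles cannot create a schema with an arbitrary name.
    raise PermissionError(f"role {role} cannot create schema {requested}")
```

For example, `schema_to_create("Engineer", "jdoe")` returns `"jdoe"`, and `schema_to_create("Admin", "admin1", "FINANCE")` returns `"FINANCE"`.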
Click Virtualize to complete the process.
When the status window appears, you can choose to view your virtualized data or to virtualize more data. However, you must wait until virtualization is complete before you navigate away from the page.
Click View virtualized data to see your newly created tables.

Results

If Watson Query and IBM Knowledge Catalog are installed in the same OpenShift® project (namespace), your virtual object is published to the primary catalog.

What to do next

You can collect statistics for your virtual object. For more information, see Collecting statistics in Watson Query.
On the Virtualized data page, you can publish your virtual object to the catalog. For more information, see Publishing virtual data to the catalog.
You can also join multiple virtual tables to create a joined view. For more information, see Creating a join view from multiple tables.