Joining virtual objects by using the Data Virtualization UI

You can join multiple tables from multiple data sources into a single virtual table, which is also known as a join view.

Remember:
The data requests (Data > Data requests) feature was removed in Cloud Pak for Data Version 5.4.0. Consider workflows instead.

Procedure

To create a virtual view from existing virtualized tables, complete the following steps.

  1. On the navigation menu, click Data > Data virtualization to reveal the service menu.
  2. On the service menu, click Virtualization > Virtualized data.
    Your existing virtualized tables are listed.
  3. Select two tables that you want to join and click Join to display the Join virtual objects window.
    Tip: If you prefer, you can click Open in SQL editor to skip the following steps and use the IBM common SQL engine instead. See the SQL Reference for more details on SQL syntax and function compatibility.
  4. Use the graphical join wizard to define a join between two virtualized tables by selecting at least one join key, which is a pair of columns with matching data types from each virtualized table. After specifying the join key, choose the columns you want to include in the join results. If you have identical columns, then only select one, otherwise they will both appear in the result. This step does not copy or move any data, it simply creates a table definition that is a combination of the two tables.
    Joining two tables
    To create a join key by using only the keyboard, follow these steps:
    1. Press Enter to select a column name in a row in Table 1.
    2. Press the Tab key to navigate to a column name in a row in Table 2.
    3. Press Enter on the row in Table 2. The rows are joined.
    Restrictions:
    • If you are joining tables with many rows, the preview of the join might time out after approximately 10 minutes if the data sources are unable to complete the processing of the join.
    • If the columns of the tables that you are joining do not share any common data, the preview of your join view might be empty. You can continue to join your virtual objects; however, the join view might not contain valuable data. Any data that you add to the virtual objects is automatically reflected in the join view.
    • You can join only two tables at one time. To join more than two tables, join two tables and get a view. Then, join the view and the third table to get another view.
    • Db2 privileges (Db2 GRANTS) and authorities that are granted to user groups are not considered when you create views. This limitation is a result of a Db2 limitation on groups.

      To workaround this issue, instead of granting access to specific groups from the Manage access page, you should grant access to the public by selecting All data virtualization users. Then create a new data protection rule to define Allow or Deny rules based on groups.
  5. Click Next. You can use the new table to query the data from both of the base tables.
  6. On the Edit column names page, enter a View name, select a Schema, and edit column names as needed.
  7. Select the appropriate sharing options for the view.
  8. Click Create view to complete the process.

Results

If Data Virtualization and IBM® Knowledge Catalog are installed in the same OpenShift® project (namespace), your virtual object is published to the primary catalog.

What to do next

You can use the virtualized data in a number of different ways. For example, you can use them in a Jupyter Notebook, create new models within the Model Builder, or build charts or graphics on the analytics dashboard.
Note: You cannot apply data masking policies with views. For more information, see Limitations for data masking.