Reviewing and working with the quick scan results (Watson Knowledge Catalog)

After the quick scan finishes analyzing and classifying data, review the discovery results, work on term assignments, and publish assets to one or more catalogs.

Quick scan provides insights about business term assignments, data class assignments, and data quality score. In the discovery job details, you can display information about discovered files, schemas, tables, and columns.

When you review the results, you can edit only term assignments. The remaining results are read-only. To work with the results, you must approve the discovered data assets. They are then loaded into the data quality project that you selected when you started the discovery job, and added to the default catalog, but without the analysis results, such as term and data class assignments and the quality score. You can publish the analysis results from a data quality project.

Required permissions
To access quick scan results for projects in which you are a collaborator, you need the following user permissions:
  • To review and publish results: Access data quality or Manage asset discovery
  • To rerun discovery jobs: Manage asset discovery and Manage data quality
  • To delete discovery jobs: Manage asset discovery
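
The "or" and "and" combinations in the list above can be read as a small rule table. The following is a minimal sketch for illustration only, not a product API: the action keys (`review_and_publish`, `rerun`, `delete`) and the `is_allowed` helper are hypothetical names, while the permission names are taken from the list.

```python
# Illustrative model of the permission rules listed above.
# Each action maps to a list of alternatives; every alternative is a set of
# permissions that must ALL be granted. Any one alternative is sufficient.
# The action keys and this helper are hypothetical, not part of the product.
REQUIRED_PERMISSIONS = {
    "review_and_publish": [{"Access data quality"}, {"Manage asset discovery"}],
    "rerun": [{"Manage asset discovery", "Manage data quality"}],
    "delete": [{"Manage asset discovery"}],
}


def is_allowed(action, granted_permissions):
    """Return True if the granted permissions satisfy one alternative for the action."""
    granted = set(granted_permissions)
    return any(required <= granted for required in REQUIRED_PERMISSIONS[action])


if __name__ == "__main__":
    user = {"Manage asset discovery"}
    for action in REQUIRED_PERMISSIONS:
        print(action, is_allowed(action, user))
```

In this model, a user with only Manage asset discovery can review, publish, and delete, but cannot rerun discovery jobs, because rerunning also requires Manage data quality.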

You must have one of the following roles in the data quality project that you select when starting a quick scan:

  • Data Steward to view discovery jobs and work with the results
  • Business Analyst and Data Operator to view, cancel, delete, or rerun discovery jobs and to review and publish assets and analysis results

To publish assets, you must also have the Admin or the Editor role in the catalog to which you want to publish.

To review and work with the quick scan results, complete the following steps:

  1. Go to Governance > Data discovery. Jobs that are ready for review are listed on the Action required tab.
  2. Find the job that you want to review and click its job ID to display a detailed list of discovered assets.

    The number of discovered schemas or tables might be smaller than the overall number of these objects in the connection. This difference can occur, for example, because the credentials used for discovery do not allow access to all objects or because some schemas are empty.

  3. Review the information.
  4. Depending on the selected asset type, you have several options:

    Columns
    You can directly edit term assignments.

    You can assign business terms to or remove them from individual columns, a selected set of columns, or all columns on the current page at once. To change the term assignment for more than one column, select the columns and click Assign business term or Remove business term. Select one term to assign or remove, and click Update. When you assign a business term, the term is assigned to all selected columns, and any existing term assignments on those columns are overwritten. When you remove a business term, the term is removed from all of the selected columns that had the term assigned.

    To assign business terms to or remove terms from a single column, you can select the column and click Assign business term or Remove business term, or click the edit icon in the respective Assigned business terms, Suggested business terms, or Actions column.

    Files or tables
    You can publish or, optionally, audit assets. These actions are not available for columns.

    To publish assets, select the assets, and click Publish assets. Then, select the catalog to which you want to publish. When you click Publish, the selected assets are shared to this catalog, along with the analysis results, including business term and data class assignments and data quality metadata. If the respective connection does not yet exist in the catalog, the connection is also published. You can publish assets to as many catalogs as you want. However, business term assignments are published only once.

    Optionally, click Audit assets to audit your sensitive data. For more information, see the Auditing your sensitive data with IBM Guardium topic.

    Tables
    You can click the table name to see some basic details about its columns: column name, context, quality, assigned and suggested data classes and business terms. You can edit the term assignments by clicking the edit icon and you can publish the asset.

    For each table, you can click View details to see the analysis results, data quality information, and details about data classes and data types. Work with the tabs to drill down into specific quality details. After you review the information, return to the results to publish the asset from there.

    Schemas
    You can publish assets. Auditing assets is not possible.

    Publishing a schema is equivalent to selecting and then publishing all data assets in that schema. The review status for a schema is derived from the statuses of the individual data assets. For example, it is set to Submitted if one or more of the data assets are submitted for publishing. The status Published is set for the schema only after all data assets are published. Thus, the status of the schema might change if anyone publishes a data asset of this schema directly.

    If any of the schema’s tables are already in Loading status, publishing at the schema level is not possible. If you publish the entire schema after any of the schema’s tables were already published from the list of discovered tables, those tables will be published again together with the not yet published ones.
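
The way a schema's review status follows from its data assets, as described under Schemas in step 4, can be sketched as follows. This is a simplified illustration under stated assumptions, not the product's implementation: the `Status` enum, the `OTHER` placeholder for statuses that this topic does not name, and both function names are hypothetical. Only the Submitted, Published, and Loading rules come from the description.

```python
from enum import Enum


class Status(Enum):
    LOADING = "Loading"
    SUBMITTED = "Submitted"
    PUBLISHED = "Published"
    OTHER = "Other"  # hypothetical placeholder for any status not named in this topic


def derive_schema_status(asset_statuses):
    """Derive the schema status from the statuses of its data assets."""
    statuses = list(asset_statuses)
    # Published only after all data assets in the schema are published.
    if statuses and all(s is Status.PUBLISHED for s in statuses):
        return Status.PUBLISHED
    # Submitted if one or more data assets are submitted for publishing.
    if any(s is Status.SUBMITTED for s in statuses):
        return Status.SUBMITTED
    # Assumption: every other combination is reported as OTHER in this sketch.
    return Status.OTHER


def can_publish_schema(asset_statuses):
    """Schema-level publishing is blocked while any table is in Loading status."""
    return not any(s is Status.LOADING for s in asset_statuses)


if __name__ == "__main__":
    tables = [Status.PUBLISHED, Status.SUBMITTED, Status.PUBLISHED]
    print(derive_schema_status(tables).value)
    print(can_publish_schema(tables))
```

Because the schema status is computed from the individual assets, publishing a single table directly can change the status that is shown for the whole schema.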

After you review a data asset and publish it, the status of the data asset changes to Published and each catalog to which the asset is published is listed. This process can take a while. You might need to refresh the page manually several times before the new status shows. You can add published assets from a catalog to any analytics project. To be able to work with a published asset in a data quality project, you must publish it to the default catalog. The data asset is then synced to the Information assets view, from where you can add the asset to a data quality project.

The job status changes to reviewed after all data assets are published.