What's new and changed in Watson Knowledge Catalog

The Watson™ Knowledge Catalog release and subsequent refreshes can include new features, bug fixes, and security updates. Refreshes appear in reverse chronological order, and only the refreshes that contain updates for Watson Knowledge Catalog are shown.

You can see a list of the new features for the platform and all of the services at What's new in IBM Cloud Pak for Data?

Installing or upgrading Watson Knowledge Catalog

Ready to install or upgrade Watson Knowledge Catalog?

Refresh 6 of Cloud Pak for Data Version 3.5

A new version of Watson Knowledge Catalog was released in May 2021.

Assembly version: 3.5.5

This release includes the following changes:

New features
The 3.5.5 release of Watson Knowledge Catalog includes the following features and updates:
Data discovery
This release includes the following changes to data discovery:
  • Improved usability and performance when editing term assignments in the discovery results view.
  • In the quick scan results interface, you can now assign business terms to more than one asset at once. See Working with quick scan results.
  • Publishing discovery results can now be restricted to users with admin or edit access to the default catalog. See Restricting publishing of discovery results.
Data quality
When you download a data asset from a data quality project, only assigned terms are included.
Bug fixes
Data curation: auto discovery
Issue: Name-based data class matching does not work on empty tables when you run auto discovery.

Resolution: Name-based data class matching now works if you run auto discovery on tables with zero rows.

Data curation: auto term assignment
Issue: The input for training the ML-based term assignment lacks assignments to assets of type 'database view'.

Resolution: If the discovery results of database view are published, those published results are now used to train auto term assignment for other assets.

Data curation: data source support
Issue: Data quality column analysis jobs fail with a Microsoft Azure Data Lake Store data source.

Resolution: Column analysis and data quality analysis now run successfully for a Microsoft Azure Data Lake Store data source that uses JSON, ORC, Avro, or Parquet file formats.

Issue: Data quality rule analysis jobs fail with the Microsoft Azure Data Lake Store connector.

Resolution: Data quality rule analysis jobs now run successfully with Microsoft Azure Data Lake Store connector data sources.

Issue: Apache Kudu SSL connections that are created in platform connections don't work for quick scan and auto discovery.

Resolution: Apache Kudu SSL connections can now be used for quick scan and auto discovery, but the SSL options must be specified directly in the JDBC URL.

Issue: A connection cannot be browsed if it was created with a long JDBC URL that also contains JDBC options in quick scan discovery.

Resolution: A connection that was created with a long JDBC URL that also contains JDBC options can now be browsed successfully in quick scan discovery.

Issue: Quick scan discovery might fail if you are using database drivers that are added in platform connections.

Resolution: Modified the order in which quick scan loads the drivers to the class path so that class path conflicts are avoided and quick scan completes successfully.

Data curation: performance
Issue: When you open a data quality project, the relationships table loads all the data sets at one time. Loading all the data sets can take a long time if the project contains thousands of data sets or more.

Resolution: The process is changed so that only the first 500 data sets are loaded initially. More data sets can then be loaded on demand as needed.
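
The load-on-demand behavior in this resolution can be pictured with a small sketch. It is illustrative only: the page size of 500 comes from the text above, while `load_page`, `iter_pages`, and the in-memory list of data set names are hypothetical stand-ins and not the product's API.

```python
from typing import Iterator, List, Sequence

PAGE_SIZE = 500  # size of the first batch, as stated in the release note


def load_page(data_sets: Sequence[str], offset: int, limit: int = PAGE_SIZE) -> List[str]:
    """Return one page of data sets starting at offset (hypothetical helper)."""
    return list(data_sets[offset:offset + limit])


def iter_pages(data_sets: Sequence[str], limit: int = PAGE_SIZE) -> Iterator[List[str]]:
    """Yield pages lazily; a page is built only when the caller asks for it."""
    offset = 0
    while offset < len(data_sets):
        yield load_page(data_sets, offset, limit)
        offset += limit


# Example: 1,200 made-up data sets are served as pages of 500, 500, and 200.
all_sets = [f"data_set_{i}" for i in range(1200)]
pages = iter_pages(all_sets)
first = next(pages)  # the only page that is loaded when the project opens
```

Because `iter_pages` is a generator, the remaining pages are produced only when the caller requests them, which mirrors the "on demand" wording of the resolution.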

Issue: Creating indexes for quick scan, auto discovery, and data quality analysis jobs can take longer than expected for jobs with thousands of columns.

Resolution: The index creation is now optimized to create indexes only if the indexes do not already exist.
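
As a rough illustration of the "create only if missing" idea, the following sketch uses Python's built-in `sqlite3` module as a stand-in database. The table name, index name, and the `ensure_index` helper are assumptions made for the example; the product's own repository and queries are not shown in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE column_results (data_set TEXT, column_name TEXT)")


def ensure_index(conn: sqlite3.Connection, name: str, table: str, column: str) -> bool:
    """Create the index only if it is missing; return True if it was created."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?", (name,)
    ).fetchone()
    if exists:
        return False  # skip the expensive CREATE INDEX
    # Identifiers are trusted constants in this sketch, not user input.
    conn.execute(f"CREATE INDEX {name} ON {table} ({column})")
    return True


created_first = ensure_index(conn, "idx_results_data_set", "column_results", "data_set")
created_again = ensure_index(conn, "idx_results_data_set", "column_results", "data_set")
```

The second call returns without issuing `CREATE INDEX`, which is the saving that the resolution describes for repeated jobs.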

Issue: Accepting or rejecting term assignments is slow on a system with thousands of terms. 

Resolution: Optimizations were made to accept or reject terms and search for terms while you manage term assignments. The time that it takes to complete these tasks is now significantly reduced.

Issue: Retrieving relationships in a data quality project has poor performance.

Resolution: Queries to retrieve the relationships were optimized.

Issue: Data quality analysis job logic and queries use excessive resources to create and drop tables and create too many indexes.

Resolution: Improved logic and queries to create and drop tables and create indexes, resulting in data quality analysis jobs that run with better performance.

Data curation: quick scan discovery
Issue: For a JDBC data source that uses an SSL connection, the connection URL instead of the database name is displayed for the database name column in the quick scan results view.

Resolution: The database name is displayed for the database column in quick scan results.

Issue: Data class-based term assignment does not work as expected in quick scan discovery.

Resolution: Data class-based term assignment was adjusted so that:
  • All terms that are associated with the assigned and suggested data classes of a column are assigned correctly.
  • The category scope in the configuration is recognized.
  • The "classes with no business meaning" in the configuration are recognized.
Issue: Quick scan fails with the message "Null password is not supported" when you access a data source with SAML authentication.

Resolution: The API key for SAML authentication is now passed correctly during quick scan analysis and the scan completes successfully.

Data curation: relationship analysis
Issue: In a data quality project, table relationships that are defined at the data source are not displayed on the relationships page.

Resolution: Relationships that are defined at the database level are now displayed when the data sets are added to the data quality project.

Data curation: resiliency
Issue: Auto discovery jobs fail intermittently with the error "No IA Compute pods found in metrics file" when dynamic compute pod configuration is used.

Resolution: Added wait logic to deal with intermittent cases where the dynamic compute pod is temporarily unavailable.

Issue: Quick scan analysis fails after the number of Solr pods is scaled up.

Resolution: The analysis code was changed to avoid hard commits after each analyzed data set and do soft commits every 30 seconds, reducing the load on Solr.
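
The change from one commit per data set to a time-based commit can be sketched as a small throttle. This is a simplified model under stated assumptions: `ThrottledCommitter` and its `commit` callback are hypothetical, the 30-second interval is taken from the text above, and no real Solr client call is made.

```python
import time
from typing import Callable, List


class ThrottledCommitter:
    """Issue at most one commit per interval instead of one per data set."""

    def __init__(self, commit: Callable[[], None], interval_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._commit = commit
        self._interval = interval_seconds
        self._clock = clock
        self._last_commit = clock()

    def data_set_analyzed(self) -> None:
        """Call after each analyzed data set; commits only when the interval elapsed."""
        now = self._clock()
        if now - self._last_commit >= self._interval:
            self._commit()
            self._last_commit = now


# Simulated run: one data set finishes every 10 seconds for 100 seconds.
fake_now = [0.0]
commit_times: List[float] = []
committer = ThrottledCommitter(lambda: commit_times.append(fake_now[0]),
                               clock=lambda: fake_now[0])
for second in range(10, 101, 10):
    fake_now[0] = float(second)
    committer.data_set_analyzed()
```

In this simulated run, ten analyzed data sets trigger three commits rather than ten, which is the kind of load reduction that the resolution refers to.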

Issue: A Finley pod gets into a state of constantly restarting if the pod is unable to connect to the XMETA service database.

Resolution: The retry logic was modified to avoid the problem of constantly restarting.

Data curation: usability
Issue: Publication of quick scan results from the schema filter view does not work.

Resolution: You can publish quick scan results from the schema filter view.

Issue: A relationship analysis on a pre-upgrade project cannot be triggered without sampling or prior relationship analysis results.

Resolution: The relationship analysis now triggers after you specify or select the sampling settings.

Issue: In the data classes view of a data quality project, the data classes menu is not wide enough to display the complete data class name. 

Resolution: The data classes menu is now wider to show longer data class names.

Issue: The number of displayed items in the quick scan results and "Explore assets" view shrinks after you select all items, scroll down, and clear items at the end of the page.

Resolution: The number of displayed items does not change after various combinations of selecting all items, scrolling, and clearing items.

Issue: When you access the relationships view in a data quality project that has more than 500 data sets, you are not notified that it might take a while to load all the relationships for all data sets.

Resolution: A notice is displayed that it might take a while to load all the relationships for all data sets and that you can alternatively select specific data sets to load the relationships.

Issue: Term assignments are not displayed in the data quality project data set columns view unless you open the column details first.

Resolution: Term assignments are now displayed in the data quality project data set columns view without the need to open the column details first.

Issue: You cannot open data quality analysis results when the number of columns in the data set is greater than 100.

Resolution: You can now successfully open results for data sets with more than 100 columns.

Governance artifacts: performance
Issue: Initial loading of governance artifacts views is slow.

Resolution: Views for various governance artifact types such as drafts (of any type), reference data, rules, terms, policies, and so forth, now load faster when they load for the first time.

Governance artifacts: usability
Issue: Imported draft policies cannot be deleted after Watson Knowledge Catalog is upgraded from version 3.5.1 if the same policies were already published before the upgrade.

Resolution: The imported draft policies can now be deleted without producing errors.

Issue: Governance rules with long names are not returned in the published rules search list.

Resolution: Updated the search capability for all governance artifacts views to better handle long multi-word artifact names.

Issue: When you add a governance artifact to a category on the Category details page, the search feature does not work.

Resolution: Search now works in the Add artifacts dialog box.

Issue: Business terms that have empty custom attribute values cannot be viewed in the UI.

Resolution: The UI can now handle terms that have empty custom attribute values and displays them properly.

Issue: The Information Governance Catalog traditional application is displayed when lineage for an information asset is viewed.

Resolution: This issue only happened on subsequent attempts to view the lineage details after the first time, and it is now resolved.

Governance artifacts: workflows
Issue: In the task inbox, adding a comment in the Activities panel of a task produces an error message.

Resolution: Comments can now be successfully added without producing an error.

Issue: On the Governance Workflow Drafts overview page, an error occurs when you try to sort on "Workflow Status."

Resolution: You can now successfully sort on "Workflow Status" on the Governance Workflow Drafts overview page.

Issue: If a custom template was uploaded with a step that had no description, the step can't be displayed in the UI.

Resolution: The workflow template can now be loaded even if a step description is missing.

Issue: When you filter by artifact type on the “Workflow configuration” tab, an error is produced that says “Something went wrong. Contact your system administrator.”

Resolution: Filtering by artifact type now works as expected.

Issue: The activities pane does not load when only a single asset is in the workflow task inbox.

Resolution: The activities pane now shows the details for the single asset task.

Platform connections: data source support
Issue: The JDBC connector fails to import metadata when a multi-line comment contains special characters.

Resolution: The JDBC connector can now import metadata when special characters such as ", ', <, >, and & are present in multi-line comments.

Issue: The Progress DataDirect JDBC driver for PostgreSQL returns an empty result set for a getColumns() API call when the version of the PostgreSQL database is 12.0 or higher.

Resolution: The updated IBM PostgreSQL JDBC driver v5.1.4.000284 resolves the issue with empty result sets.

Security fixes

This release includes fixes for the following security issues:

  • CVE-2016-5007
  • CVE-2016-6811
  • CVE-2017-15095
  • CVE-2017-15718
  • CVE-2017-17485
  • CVE-2017-3166
  • CVE-2017-7525
  • CVE-2018-11307
  • CVE-2018-1270
  • CVE-2018-1271
  • CVE-2018-1272
  • CVE-2018-14718
  • CVE-2018-14719
  • CVE-2018-14720
  • CVE-2018-14721
  • CVE-2018-19360
  • CVE-2018-19361
  • CVE-2018-19362
  • CVE-2018-5968
  • CVE-2019-10086
  • CVE-2019-10173
  • CVE-2019-10782
  • CVE-2019-14379
  • CVE-2019-14540
  • CVE-2019-16335
  • CVE-2019-16942
  • CVE-2019-16943
  • CVE-2019-17195
  • CVE-2019-17531
  • CVE-2019-20916
  • CVE-2019-9658
  • CVE-2020-11612
  • CVE-2020-13954
  • CVE-2020-15138
  • CVE-2020-26217
  • CVE-2020-26258
  • CVE-2020-26259
  • CVE-2020-28499
  • CVE-2020-29582
  • CVE-2021-21341
  • CVE-2021-21342
  • CVE-2021-21343
  • CVE-2021-21344
  • CVE-2021-21345
  • CVE-2021-21346
  • CVE-2021-21347
  • CVE-2021-21348
  • CVE-2021-21349
  • CVE-2021-21350
  • CVE-2021-21351
  • CVE-2021-21366
  • CVE-2021-22112
  • CVE-2021-22696
  • CVE-2021-23341
  • CVE-2021-28165
  • CVE-2021-28363
  • CVE-2021-3156
  • CVE-2021-3449
  • CVE-2021-3450

Refresh 4 of Cloud Pak for Data Version 3.5

A new version of Watson Knowledge Catalog was released in March 2021.

Assembly version: 3.5.4

This release includes the following changes:
New features
The 3.5.4 release of Watson Knowledge Catalog includes the following features and updates:
Search column descriptions
You can now use the global search bar to search for terms in the column descriptions of new data assets.
Publish results at schema level
In quick scan results, you can now publish entire schemas instead of selecting the tables of a schema individually for publishing. For details, see Reviewing and working with the quick scan results.
Bug fixes
Data curation: data source support
Issue: Microsoft Azure Data Lake Store connector assets with folders and files that have special characters cannot be viewed in Information Server metadata import and analysis jobs.

Resolution: Microsoft Azure Data Lake Store folders and files with special characters can be successfully imported in IMAM and can be successfully analyzed in data quality and DataStage® jobs.

Issue: Publish and approval of next generation quick scan results fail for a Teradata data source that has a JDBC URL without a database name.

Resolution: Publication of the Teradata data source results is successful whether the JDBC URL has a database name or not.

Data curation: performance
Issue: The addition of published data sets to a data quality project is slow.

Resolution: Reduced query processor usage improves the performance of adding published data sets to a data quality project.

Issue: Scrolling data assets in data quality can be slow due to redundant calls to get the data set review status.

Resolution: The review status is now retrieved only one time so scrolling and general UI performance is improved.
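
Retrieving a value one time and reusing it afterward is a standard caching pattern. The sketch below shows the idea with `functools.lru_cache`; `fetch_review_status` is a hypothetical stand-in for the backend call, and the status values are invented for the example.

```python
from functools import lru_cache

calls = {"count": 0}  # counts how often the simulated backend is hit


def fetch_review_status(data_set_id: str) -> str:
    """Hypothetical stand-in for the call that returns the review status."""
    calls["count"] += 1
    return "reviewed" if data_set_id.endswith("0") else "not reviewed"


@lru_cache(maxsize=None)
def review_status(data_set_id: str) -> str:
    """Return the cached status; the backend is asked once per data set."""
    return fetch_review_status(data_set_id)


# Simulate scrolling past the same five rows three times.
for _ in range(3):
    statuses = [review_status(f"ds{i}") for i in range(5)]
```

After three passes over five rows, the simulated backend was called five times instead of fifteen. A real UI would also need a way to invalidate the cache when a review status changes, which this sketch leaves out.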

Issue: The approval of quick scan results can be slow due to slow Solr indexing of the results.

Resolution: The performance of approving quick scan results is improved by optimizing the underlying Solr index queries.

Issue: Detailed views in data discovery assets tables take a long time to load.

Resolution: UI and API fixes help detailed views in data discovery assets tables load faster.

Issue: When you click a graph in a data quality project dashboard and drill down into data sets, the display of the data asset details is delayed.

Resolution: Individual loaders are added to those graphs so that the UI is more responsive during drill-down.

Issue: Adding data sets to a data quality project can be slow if the data sets are already published.

Resolution: Issues were addressed to improve performance.

Issue: Managing term assignments in a data quality project slows down when thousands of terms are available.

Resolution: Underlying queries were optimized to make accepting and rejecting term assignments in the data quality UI faster.

Issue: The UI takes a long time to update after you submit a quick scan discovery job.

Resolution: The underlying API call was optimized to make the UI update faster after a quick scan discovery job is submitted.

Data curation: resiliency
Issue: Quick scan fails to retrieve the status of analyzed data assets if the Solr service becomes momentarily unavailable.

Resolution: Added retry to make the retrieval of data asset details for quick scan jobs more resilient.
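
A retry of this kind can be sketched in a few lines. The helper below is a generic illustration and not the product's code: `with_retry`, the use of `ConnectionError`, the attempt count, and the linear backoff are all assumptions chosen for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(action: Callable[[], T], attempts: int = 3, delay_seconds: float = 0.0) -> T:
    """Call action, retrying on ConnectionError up to `attempts` times in total."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except ConnectionError:
            if attempt == attempts:
                raise  # give up after the final attempt
            time.sleep(delay_seconds * attempt)  # simple linear backoff
    raise ValueError("attempts must be at least 1")


# Simulated lookup that fails twice before the service is reachable again.
state = {"calls": 0}


def flaky_status_lookup() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("index service momentarily unavailable")
    return "ANALYZED"


result = with_retry(flaky_status_lookup, attempts=5)
```

The lookup succeeds on the third call, so a momentary outage no longer surfaces as a failed job in this model.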

Issue: The UI is sometimes unresponsive in the Data discovery > Assets > Tables > View details action.

Resolution: The UI was made more tolerant for handling data that is retrieved by API.

Data curation: scalability
Issue: A quick scan job that uses multiple quick scan pods shows a status of "Failed" if one pod fails while the other pod is still running and discovering assets.

Resolution: The quick scan job status remains in a state of "active" if one or more pods are still discovering assets. When the pods complete discovery, the overall quick scan job status is set to "ERROR" to indicate that a failure occurred with one or more of the pod subtasks.
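
The status rule in this resolution amounts to a small aggregation over the pod subtasks. The sketch below is one possible reading of it: the pod state names and the "FINISHED" label are hypothetical, while "ACTIVE" and "ERROR" follow the wording of the resolution.

```python
from typing import Iterable

RUNNING, SUCCEEDED, FAILED = "RUNNING", "SUCCEEDED", "FAILED"  # hypothetical pod states


def overall_job_status(pod_states: Iterable[str]) -> str:
    """Derive the job status from the states of its pod subtasks."""
    states = list(pod_states)
    if any(state == RUNNING for state in states):
        return "ACTIVE"  # at least one pod is still discovering assets
    if any(state == FAILED for state in states):
        return "ERROR"   # all pods are done and at least one failed
    return "FINISHED"    # hypothetical label for the all-succeeded case


while_running = overall_job_status([FAILED, RUNNING])
after_completion = overall_job_status([FAILED, SUCCEEDED])
```

The order of the checks matters: the running check comes first so that a single failed pod cannot mark the job as failed while other pods are still working.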

Data curation: usability
Issue: Where failures occur in a quick scan job, the UI doesn't indicate why discovery failed for certain schemas or tables.

Resolution: When you hover over the error status in the quick scan results and discovered assets views, a tooltip displays details about the nature of the failure.

Issue: Some UI elements are not aligned properly in the quick scan results assets view.

Resolution: All UI elements are now properly aligned in the quick scan assets view.

Issue: The data discovery page shows gaps in some browser or monitor configurations.

Resolution: The UI behavior is updated to eliminate this issue.

Issue: Drilling down into an actively running rule causes an exception in the data quality project dashboard.

Resolution: Drilling down into an actively running rule causes no exception in the data quality project dashboard.

Issue: Quick scan fails if a workspace does not previously exist.

Resolution: A workspace is created before an analysis job.

Issue: In a Chrome browser, the presence of many data sets all in one project might cause the screen to go blank.

Resolution: The data set scrolling issue is fixed in the Chrome browser.

Issue: The data quality chart does not match the data quality threshold.

Resolution: Data quality threshold and quality score distribution are in sync.

Issue: When a business term with a long name is assigned to a column, the name is truncated in the data quality columns tab view and additional term assignments cannot be seen.

Resolution: Long names are now displayed in a pop-up window.

Issue: When multiple schemas are selected, behavior in the data discovery UI forces you back to the beginning of the page.

Resolution: Data discovery behavior is updated to allow the selection of multiple schemas smoothly.

Issue: Pagination is not occurring for View asset details columns.

Resolution: The UI is updated to paginate columns correctly.

Issue: The data quality UI shows the data assets threshold incorrectly.

Resolution: The data quality UI is fixed to ensure that data assets display 'conformed' in assets that match the requirement.

Issue: The Data assets tab in the Project details page needs to be first in the tab order.

Resolution: The tab order is changed to Data assets, Dashboard, Data rules, Relationships, and Settings.

Issue: For quick scan jobs that use concurrent quick scan analysis pods, if one of analysis pods has a failure, the discovery UI shows the job as completed with a status of "Failed," even if the other analysis subtasks are still running in the other pod.

Resolution: The job status now remains as "Active" while the other analysis subtasks are still running. After the running analysis is finished, the overall job status will be set to ERROR.

Issue: Quick scan results cannot be reapproved if the approval of the results gets stuck in a "LOADING" state.

Resolution: If approval of the quick scan results cannot proceed, the approval results are set to the "error" state and you can attempt to publish the results again.

Issue: When you view data set details in a data quality project, the details for all tabs are loaded simultaneously, causing a long delay to view any of the data set details.

Resolution: The data set details are now loaded on demand based on accessing the given tab view.

Issue: A quick scan job remains paused after a momentary loss of connectivity to IIS services if the job is in its post-process phase when the connectivity is lost. You have no way to resume the job.

Resolution: The job now resumes after connectivity to IIS services is reestablished.

Issue: Category scope cannot be specified for quick scan analysis.

Resolution: You can configure category scope at a global or project level to restrict the terms that are used for quick scan analysis.

Issue: If you have many cookies under the same domain, you cannot access the data quality UI.

Resolution: The maximum HTTP header size for the UI was increased so that many cookies can be handled, which prevents the HTTP 431 error.

Issue: In the quick scan job results view, the data quality distribution graph total columns metric does not account for columns with no data score.

Resolution: The data quality distribution graph total columns metric now accounts for columns with a data quality score and also total columns (with or without a data score).

Issue: Term assignments cannot be saved on multiple data sets simultaneously.

Resolution: Added the capability to do a bulk save of term assignments in the data quality UI.

Governance artifacts: asset sync
Issue: After an upgrade to Cloud Pak for Data 3.5.4, sync of business terms to the Information Assets view is not working.

Resolution: Assets are now successfully synced to the Information Assets view after upgrading to Cloud Pak for Data 3.5.4.

Governance artifacts: scalability
Issue: Publication of large reference data sets can be slow and blocks UI operations.

Resolution: Publishing reference data sets is now asynchronous so that UI operations are not blocked when large reference data sets are published.

Governance artifacts: usability
Issue: After a parent or dependent data set is added to a reference data set hierarchy, no confirmation appears.

Resolution: A confirmation message appears that indicates that the data set was successfully added.

Issue: User IDs instead of usernames are shown when you select See All in the governance activity pane for artifact modifications.

Resolution: Usernames are now displayed when See All is selected.

Issue: You are unable to assign data classes to imported terms if data class assignments already exist.

Resolution: You can successfully add data classes to imported business terms.

Issue: When terms in governance artifacts are imported, the background screen goes blank.

Resolution: The UI behavior is modified so that the background remains visible while terms are being imported.

Issue: The category path is not displayed consistently for related artifacts.

Resolution: The category path is now added for related artifacts.

Issue: When business artifacts are created, trailing and leading spaces need to be removed to prevent duplicate artifacts with the same name from being published.

Resolution: Trailing and leading spaces are removed to enforce uniqueness in business artifact names.
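
The effect of trimming names before a uniqueness check can be shown with a short sketch. The helper names and the sample artifact names are made up for illustration, and the product may apply further normalization that is not described here.

```python
from typing import Iterable, List, Set


def normalize_name(name: str) -> str:
    """Remove leading and trailing whitespace from an artifact name."""
    return name.strip()


def find_duplicates(names: Iterable[str]) -> List[str]:
    """Return names that collide with an earlier name after trimming."""
    seen: Set[str] = set()
    duplicates: List[str] = []
    for raw in names:
        name = normalize_name(raw)
        if name in seen:
            duplicates.append(name)
        else:
            seen.add(name)
    return duplicates


dupes = find_duplicates(["Customer ID", " Customer ID", "Customer ID  ", "Account"])
```

Without the trimming step, the three "Customer ID" variants would compare as different strings and could all be published.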

Issue: New reference data cannot be assigned to a published reference data class.

Resolution: New reference data can now be assigned to a published data class.

Governance artifact workflows: usability
Issue: The Workflow Status filter option in the Governance Workflows Draft Overview page does not list filter items.

Resolution: Workflow status filter items are now displayed in the filter drop-down list.

Issue: Assets in various workflow states are intermittently not pushed to the next assignee.

Resolution: Assets are now pushed correctly through the various assignees for approval.

Issue: Some tasks disappear from the task inbox.

Resolution: Fixed the UI to ensure that all the tasks that are selected are displayed.

Issue: "Review and Approve task comment" pop-up window says that the comment is optional, but the comment is required. 

Resolution: Removed the 'optional' word from the UI to make the text consistent with the behavior.

Issue: The conditions that are already set in a workflow configuration are not shown as selected in the conditions matrix. The result is that the conditions get removed if other conditions are set for a specific category when you edit the workflow configuration.

Resolution: Conditions that are already set are now shown as selected in the conditions matrix and are kept after you modify the workflow configuration.

Issue: In the workflow tasks inbox, pagination for the display for imported assets does not work correctly.

Resolution: The configured default items per page and the number of items per page are now displayed correctly.

Issue: The governance artifacts activities log shows random "NaN" entries.

Resolution: Parsing of the activity log entries is now improved to prevent the display of "NaN" entries.

Issue: Scrolling of workflow task details in the task inbox view does not work if the task details do not fit into the task details pane view.

Resolution: Scroll bars are now available if you need to scroll to view the rest of the task details.

Governance artifact custom workflows: usability
Issue: The Activate Workflow configuration screen has extraneous data. 

Resolution: The Activate Workflow configuration screen is updated to show relevant data only.

Issue: Creating a workflow HTTP task fails when the URL contains fewer than 20 characters.

Resolution: The workflow HTTP task now handles short URLs.

Platform connections: data source support
Issue: Connection test fails for Microsoft Azure Blob Storage connection.

Resolution: Connection test for Microsoft Azure Blob Storage connection now works.

Platform connections: usability
Issue: Sorting on connection name doesn't work in the Platform Connections view.

Resolution: Connections can now be sorted based on the connection name.

Security fixes

This release includes fixes for the following security issues:

  • CVE-2018-1000873
  • CVE-2019-14893
  • CVE-2019-17267
  • CVE-2019-20330
  • CVE-2019-20477
  • CVE-2020-10029
  • CVE-2020-10672
  • CVE-2020-10968
  • CVE-2020-10969
  • CVE-2020-11111
  • CVE-2020-11113
  • CVE-2020-11619
  • CVE-2020-11620
  • CVE-2020-11868
  • CVE-2020-13817
  • CVE-2020-14343
  • CVE-2020-1747
  • CVE-2020-1971
  • CVE-2020-26137
  • CVE-2020-27814
  • CVE-2020-28241
  • CVE-2020-29573
  • CVE-2020-5398
  • CVE-2020-7788
  • CVE-2020-8840
  • CVE-2020-9546
  • CVE-2020-9547
  • CVE-2020-9548
  • CVE-2021-3156

Refresh 3 of Cloud Pak for Data Version 3.5

A new version of Watson Knowledge Catalog was released in February 2021.

Assembly version: 3.5.3

This release includes the following changes:
New features
The 3.5.3 release of Watson Knowledge Catalog includes the following features and updates:
Additional data source for discovery
Apache Kudu data sources are now supported for automated discovery and quick scan.
Bug fixes
Data curation: auto term assignment
Issue: Auto term assignments that are based on a manually selected data class are not applied during column analysis.

Resolution: If you run column analysis again after you manually select the data class for a specific column, the auto term assignment takes the manually selected data class into account for term assignments.

Issue: Machine learning based auto term assignment service (Finley) might not be able to automatically apply a term in certain situations.

Resolution: Machine learning based auto term assignment service now uses a more stable and faster method to retrieve assets and terms. This method improves the performance and stability of that service.

Data curation: data source support
Issue: Data quality analysis and drill-down fail for an Apache Kudu data source.

Resolution: You can successfully run automated discovery and view the details for an Apache Kudu data source.

Issue: Quick scan analysis and publishing fail with an Apache Kudu data source.

Resolution: You can successfully run quick scan discovery and publish the analysis results to the Watson Knowledge Catalog catalog for an Apache Kudu data source.

Issue: Publishing of quick scan results fails with a loading error status for a Hive data source that uses Knox SSL authentication.

Resolution: Assets now publish successfully to the Watson Knowledge Catalog catalog for a Hive data source that uses Knox SSL authentication.

Issue: Quick scan results show that 0 data assets were analyzed if the Teradata data source JDBC URL does not specify the database name.

Resolution: You can now successfully analyze and publish quick scan results for a Teradata data source if the database is not included in the JDBC URL for the connection.

Data curation: performance
Issue: Concurrent quick scan analysis jobs on large numbers of columns use an excessive amount of memory.

Resolution: The default memory limit in the ia-analysis service was increased to allow for better performance with concurrent quick scan analysis jobs.

Issue: Working with thousands of tables and millions of columns causes system slowness when you work with data quality assets.

Resolution: The XMETA Db2® repository memory footprint was increased and readiness checks were added for greater resilience when you work with large numbers of assets.

Issue: General slowness when you work with data quality projects and view assets within those projects.

Resolution: Loading optimizations and database indexes improve the time that is needed to list and open data projects and to load and view the data assets within a data quality project.

Issue: General slowness in the data quality UI when you navigate data quality projects that contain a large number of assets.

Resolution: Improved handling of the cached assets lists, resulting in better navigation and viewing of the assets within the data quality project.

Issue: When you enter a data quality project, loading all the relationships analysis results might take a long time.

Resolution: Relationships are now loaded based on the selected data set, reducing the time that is needed to open and view a data quality project.

Data curation: resilience
Issue: Quick scan publishing of analysis results does not resume after the InfoSphere® Information Server service is restarted.

Resolution: If the InfoSphere Information Server service restarts, quick scan publishing jobs that are in the 'Submitted' state resume publishing.

Issue: For long-running quick scan analysis or publish jobs, any temporary unavailability of the Solr indexing service might lead to failures.

Resolution: Retry mechanisms were put in place to enable long-running quick scan jobs to be more resilient to temporary Solr unavailability.

Issue: For long-running quick scan publish jobs, any temporary unavailability of the ia-analysis service might lead to failures.

Resolution: Retry mechanisms were put in place to enable long-running quick scan jobs to be more resilient to temporary publish service unavailability.

Issue: A running quick scan job stays in the "running" state even if the backend service becomes unavailable because of a problem.

Resolution: The running quick scan job is set to "failed" if the backend service becomes unavailable.

Data curation: scalability
Issue: Long-running quick scan publish jobs might time out.

Resolution: The default timeout was increased to eight hours so that publishing jobs can run longer before they time out.

Issue: Publishing many assets to a data quality project from quick scan results takes a long time and uses a large amount of memory in the XMETA Db2 repository.

Resolution: Publishing quick scan results to the data quality project now uses less memory and takes less time.

Issue: When you publish data sets that are discovered by quick scan, if the publish fails, the data sets become stuck in the "loading" state and you are unable to attempt republishing.

Resolution: During the reset operation, only data sets with a "loading" status are now retrieved from Solr, instead of all data sets and their associated columns.

Issue: The auto term assignment service (Finley) creates a large number of InfoSphere Information Server sessions, which might lead to reaching the maximum InfoSphere Information Server sessions threshold.

Resolution: The auto term assignment service now manages the InfoSphere Information Server sessions properly so that they are not continuously increasing in number and reaching the maximum sessions threshold.

Data curation: security
Issue: Users without permission to see quick scan job details can still access the job details from bookmarked URLs.

Resolution: Users are now authenticated before they are allowed to view quick scan job results through a direct-access URL.

Data curation: UI performance
Issue: Data quality and discovery UIs are generally slow at loading initial views.

Resolution: Caching and compression were optimized, enabling faster loading of the data quality and discovery UIs.

Data curation: usability
Issue: When you select the check box to clear all options in the view Project settings > Column analysis > Data classes, the view is truncated.

Resolution: When you clear all the options, the proper view display and scrolling capability are retained.

Issue: Displayed schema, table, column, and context names in quick scan results and filter views are truncated.

Resolution: Multi-line display for schema, table, column, and context names now allows the full name to be displayed as wrapped text in quick scan results views and filters.

Issue: It is not apparent whether a quick scan results data set is being published or waiting in the queue to be published.

Resolution: When a data set is submitted for publishing by clicking Approve, the status is changed to "Submitted." When the data set is being published, the status is changed to "Loading."

Issue: A data quality project with a trailing space in the project name shows that no assets are in the project even though the project does contain assets.

Resolution: Leading and trailing spaces are stripped from the project name when the data quality project is saved to prevent a scenario with an invalid project name.
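
As a minimal sketch of this kind of normalization, and not the product's actual code, the following helper trims a name before it is saved. The function name `normalize_project_name` is hypothetical. Note that Python's `str.strip()` removes all leading and trailing whitespace, not only space characters.

```python
def normalize_project_name(name: str) -> str:
    """Return the project name without leading or trailing whitespace."""
    return name.strip()
```

For example, `normalize_project_name("Sales DQ ")` returns `"Sales DQ"`, so the saved name no longer differs from the name that users type when they look up the project.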

Issue: The data quality relationships view has display issues on the Chrome browser when UI elements are selected and more than 50 data assets are in the view.

Resolution: Now when you select the Customize display option, the relationships chart header is displayed properly. When you select one of the data assets, the page header is displayed properly.

Issue: Longer term names are displayed truncated in the quick scan results term assignment dialog box.

Resolution: The maximum display length for term names is now increased to take advantage of available space in the dialog box.

Issue: In the quick scan "Approve Assets" dialog box, the Approve button remains enabled after you click it, making it unclear whether the button needs to be clicked again.

Resolution: After you click Approve, the button becomes disabled so that you are aware that no further action is needed.

Issue: The multi-select capability for adding users to a data quality project does not work correctly after you run multiple searches for users to select.

Resolution: Now, if you search for and select one user, and then search for and select another user, both users are added when Add is selected.

Issue: A quick scan job background task does not stop when you pause the quick scan job.

Resolution: When you pause the quick scan job, the quick scan service pauses the job successfully and frees up job resources.

Data protection rules: usability
Issue: Users without the "Manage data protection rules" permission are able to see the relevant buttons or navigation in the Rule UI to create, edit, or delete data protection rules.

Resolution: The UI elements to create, edit, or delete data protection rules are hidden for users without the "Manage data protection rules" permission.

General: usability
Issue: If you are working in multiple browser tab sessions, you get abruptly logged out of the catalog, governance, or data quality UIs.

Resolution: You can now have multiple browser tab sessions without getting logged out.

Governance artifact workflows: usability
Issue: The workflow task inbox summary field displays content for a different task when two tasks are created in succession and the second task is approved or delivered.

Resolution: After multiple new tasks are created, the summary field displays the correct content for the task that is being approved or delivered.

Issue: Inconsistent display of read-only details in "Custom workflow tasks completed by me" tab views.

Resolution: All fields are displayed in proper non-editable format and no fields have missing values.

Issue: Read-only workflow task details are displayed as disabled (grayed out) UI elements in the workflow task dialog boxes.

Resolution: Read-only task details are now displayed as non-editable values, but the UI elements are not grayed out in the workflow task details dialog boxes.

Issue: Multi-select inconsistencies in workflow task inbox view.

Resolution: When multiple tasks are selected in the task inbox, the selected tasks banner no longer overlaps the selected tasks table, and the proper action items are enabled in the selected tasks table.

Issue: In the steps of some workflows, some fields that are automatically filled with information from previous steps are not passed when you confirm the workflow action. This behavior results in an error. You must edit the necessary fields to avoid the error.

Resolution: The populated values are now saved successfully without having to manually edit them.

Issue: The activity page is not loaded when you view governance assets and the time zone is UTC+00.

Resolution: The activity page now loads as expected when the browser is in that time zone.

Issue: Only the term name for related artifacts of a governance asset is displayed.

Resolution: The full context (category path) for related artifacts of a governance asset is displayed.

Issue: Overdue notifications are not received by assignees for overdue tasks.

Resolution: If a workflow task is overdue for a user, the user now gets an overdue notification through email and a one-time pop-up notification.

Issue: Steward details are missing in the activity page artifact modification details after you add stewards to a governance asset.

Resolution: When you click the 'add stewards' artifact modification entry on the activity page, the artifact modification details show all the stewards that were added.

Issue: Workflow task inbox does not show the assignee who just claimed the task.

Resolution: After you claim the task, the task details show "Claimed by you" rather than "+x assignees."

Issue: You are unable to publish, rename, or edit a draft term name that is a duplicate of a name that already exists.

Resolution: You can rename the duplicate term and publish it successfully after you rename it.

Issue: Intermittent 504 timeout issue when various governance asset details are accessed.

Resolution: Governance asset details views load faster now with improvements in resource handling in the Governance UI service.

Issue: An error occurs when you attempt to add a custom attribute value of type 'text' for a category.

Resolution: You can now add a custom attribute value of type 'text' to categories.

Platform connections: data source support
Issue: The Azure Data Lake Storage connector cannot parse folders and files that contain special characters in their names.

Resolution: Azure Data Lake Storage folders and files that contain special characters can be successfully browsed in Watson projects.

Platform connections: usability
Issue: Your personal credentials become invalidated when properties of a connection are changed.

Resolution: Personal credentials are not affected when other shared properties of the connection are changed, such as a hostname or URL.

Policy rules: security
Issue: An administration collaborator with only the "manage gov categories" platform permission cannot view category collaborators.

Resolution: The "View governance artifacts" platform permission is now added for that user.

Security fixes

This release includes fixes for the following security issues:

  • CVE-2020-7774
  • CVE-2020-10543
  • CVE-2020-10878
  • CVE-2020-12723
  • CVE-2020-24659
  • CVE-2020-8265
  • CVE-2020-8287
  • GHSA-4xcv-9jjx-gfj3
  • GHSA-9v62-24cr-58cx

Refresh 2 of Cloud Pak for Data Version 3.5

A new version of Watson Knowledge Catalog was released in January 2021.

Assembly version: 3.5.2

This release includes the following changes:

New features
  • Quick scan job results now directly show the list of discovered assets. When you drill down into the results of a quick scan job, you are now directly taken to the list of discovered assets.
  • You must install Version 3.5.2 of the Watson Knowledge Catalog service if you want to install the service on Red Hat® OpenShift® 4.6.

    Version 3.5.2 of Watson Knowledge Catalog also includes the following features and updates:
    Support for Microsoft SQL Server
    The synchronization of assets in the default catalog and information assets for Microsoft SQL Server connections is now supported.
Bug fixes
This release includes the following fixes:
  • Issue: Users in time zones behind GMT see their selected date value shown as one day before the actual value.

    Resolution: Users in time zones behind GMT now see their selected date value correctly.

  • Issue: Exporting Custom Attribute definitions and data lineage reports does not work in the Chrome browser.

    Resolution: Exporting Custom Attribute definitions and data lineage reports works in the Chrome browser.

  • Issue: Some radio button fields in custom workflow templates don’t work.

    Resolution: All radio button fields in custom workflow templates work.

  • Issue: Selections in tasks for custom workflows disappear in subsequent steps.

    Resolution: Selections in tasks for custom workflows appear properly in subsequent steps.

  • Issue: URL task fields for custom workflows incorrectly show as editable and cause an error.

    Resolution: URL task fields for custom workflows no longer incorrectly show as editable.

  • Issue: Initial sort fails in the data assets view in a data quality project.

    Resolution: Initial sort is successful in the data assets view in a data quality project.

  • Issue: Quick scan data discovery fails when the run time is more than 13 hours.

    Resolution: Quick scan data discovery now runs successfully for more than 13 hours.

  • Issue: JDBC Connector aborts with the error "Data truncated for column c1 at row 1" when an unsigned bigint data type is written to a MySQL database.

    Resolution: JDBC Connector successfully writes an unsigned bigint data type to a MySQL database.

  • Issue: If you view quick scan results and have access to more than 100 data quality workspaces, an error is produced.

    Resolution: If you have access to more than 100 data quality workspaces, you can successfully view quick scan results.

  • Issue: Improvements are needed for quick scan data discovery when you analyze and publish large data sets.

    Resolution: Quick scan data discovery is improved for analyzing and publishing large data sets.

  • Issue: Explore assets action is blocked during quick scan because statistics are loading.

    Resolution: Quick scan job results now directly show the list of discovered assets without waiting for statistics to finish loading.

  • Issue: Improvements are needed for exploring assets in the data discovery results UI.

    Resolution: Improvements were made for exploring assets in the data discovery results UI.

  • Issue: Improvements are needed for manual term assignment and rejection in quick scan results.

    Resolution: Improvements were made for manual term assignment and rejection in quick scan results.

  • Issue: Improvements are needed for workflow UI fit and finish. Data curation usability enhancements are also needed.

    Resolution: Improvements were made for workflow UI fit and finish. Data curation usability enhancements were also made.

  • Issue: The owner of the table-assets that are synced to the default catalog is shown as an administrator instead of an asset owner.

    Resolution: The owner of the table-assets that are synced to the default catalog is shown properly as an asset owner.

  • Issue: Fixes are needed for data discovery connection management and browsing.

    Resolution: Fixes were made for data discovery connection management and browsing.

  • Issue: Glossary UI enhancements are needed for fit and finish usability for governance artifacts.

    Resolution: Glossary UI enhancements were made for fit and finish usability for governance artifacts.

  • Issue: An error occurs when you access a draft version of a data class with a matching method by using a hierarchical reference data set.

    Resolution: You can successfully access a draft version of a data class with a matching method by using a hierarchical reference data set.

  • Issue: Connections are missing when you create a connection by using the metadata import feature.

    Resolution: Connections appear properly when you create a connection by using the metadata import feature.

  • Issue: Analytic Server connections return errors for certain data sources.

    Resolution: Analytic Server connections operate properly for all data sources.

  • Issue: When you manually assign a term in quick analysis results, a null pointer exception is produced.

    Resolution: When you manually assign a term in quick analysis results, the action is successful.

  • Issue: The HADOOP_OPTS parameter needs to be included in the odf-fast-analyzer service.

    Resolution: The HADOOP_OPTS parameter is included in the odf-fast-analyzer service.

  • Issue: Users with the Data Quality Analyst role do not have the "Access Catalog" permission.

    Resolution: Users with the Data Quality Analyst role now have the "Access Catalog" permission.
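
The release notes do not state the root cause of the date issue for users in time zones behind GMT. One common way that this kind of one-day shift arises, shown here purely as an assumption, is storing a selected date as midnight UTC and then rendering it in a time zone with a negative UTC offset. The function names in this sketch are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def displayed_date_naive(selected: date, utc_offset_hours: int) -> date:
    """Error-prone pattern: store midnight UTC, then render in the local zone."""
    stored = datetime(selected.year, selected.month, selected.day, tzinfo=timezone.utc)
    local = stored.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.date()


def roundtrip_as_calendar_date(selected: date) -> date:
    """Safer pattern: keep the value as a plain calendar date with no zone shift."""
    return date.fromisoformat(selected.isoformat())
```

With a selected date of 15 January 2021 and an offset of -5 hours, `displayed_date_naive` returns 14 January 2021, whereas `roundtrip_as_calendar_date` returns the selected date unchanged.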

Security fixes
This release includes fixes for the following security issues:
  • Prototype Pollution issue in Node.js ini module in catalog, profiling, and metadata import services.
  • Multiple open source, vulnerability analysis, and Twistlock security issues in all services.
  • CVE-2018-1000873
  • CVE-2018-19360
  • CVE-2018-19361
  • CVE-2018-19362
  • CVE-2018-8039
  • CVE-2019-12086
  • CVE-2019-12384
  • CVE-2019-12406
  • CVE-2019-12419
  • CVE-2019-12423
  • CVE-2019-12814
  • CVE-2019-13012
  • CVE-2019-14379
  • CVE-2019-14439
  • CVE-2019-14751
  • CVE-2019-14892
  • CVE-2019-16943
  • CVE-2019-17531
  • CVE-2019-17573
  • CVE-2019-18276
  • CVE-2020-8277
  • CVE-2020-13957
  • CVE-2020-14779
  • CVE-2020-14781
  • CVE-2020-14782
  • CVE-2020-14792
  • CVE-2020-14796
  • CVE-2020-14797
  • CVE-2020-14798
  • CVE-2020-14803
  • CVE-2020-15256
  • CVE-2020-1954
  • CVE-2020-26137
  • CVE-2020-26217
  • CVE-2020-26258
  • CVE-2020-26259
  • GHSA-xgh6-85xh-479p

Initial release of Cloud Pak for Data Version 3.5

A new version of Watson Knowledge Catalog was released as part of Cloud Pak for Data Version 3.5.

Assembly version: 3.5.0

This release includes the following changes:

New features
Reference data set enhancements
You can customize your reference data sets in the following ways:
  • Configure hierarchies between reference data sets and between values within a reference data set.
  • Add custom columns.
  • Create values mappings, or crosswalks, between values of multiple reference data sets in 1:1, n:1, and 1:n relationships.

For details, see Reference data sets.

Catalog enhancements
Catalogs are enhanced in the following ways:
  • Additional information is shown on the new Overview page for assets, such as the asset's path and related assets.
  • More activities are shown on the Activities page for assets.
  • COBOL copybook is now a supported asset type. You can preview the contents of copybooks.
  • You can add more types of assets and metadata to catalogs by coding custom attributes for assets and custom asset types with APIs.
New connections
Watson Knowledge Catalog can connect to:
  • Amazon RDS for MySQL
  • Amazon RDS for PostgreSQL
  • Apache Cassandra
  • Apache Derby
  • Box
  • Elasticsearch
  • HTTP
  • IBM Data Virtualization Manager for z/OS®
  • IBM Db2 Event Store
  • IBM SPSS® Analytic Server
  • MariaDB
  • Microsoft Azure Blob Storage
  • Microsoft Azure Cosmos DB
  • MongoDB
  • SAP HANA
  • Storage volume
In addition, the following connection names have changed:
  • PureData System for Analytics is now called Netezza® (PureData® System for Analytics)

    Your previous settings for the connection remain the same. Only the name for the connection type changed.

New SSL encryption support for connections
The following connections now support SSL encryption in Watson Knowledge Catalog:
  • Amazon Redshift
  • Cloudera Impala
  • IBM Db2 for z/OS
  • IBM Db2 Warehouse
  • IBM Informix®
  • IBM Netezza (PureData System for Analytics)
  • Microsoft Azure SQL Database
  • Microsoft SQL Server
  • Pivotal Greenplum
  • PostgreSQL
  • Sybase
Category roles control governance artifacts
The permissions to view and manage all types of governance artifacts, except for data protection rules, are now controlled by collaborator roles in the categories that are assigned to the artifacts.
To view or manage governance artifacts, users must meet these conditions:
  • Have a user role with one of the following permissions:
    • Access governance artifacts
    • Manage governance categories
  • Be a collaborator in a category

Category collaborators have roles with permissions that control whether they can view artifacts, manage artifacts, manage categories, and manage category collaborators. Subcategories inherit collaborators from their parent categories. Subcategories can have other collaborators, and their collaborators can accumulate more roles. The predefined collaborator, All users, includes everyone with permission to access governance artifacts.

For details, see Categories.
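
The two conditions and the inheritance of collaborators can be sketched as follows. This is an illustrative model only, not the product's API: the `Category` class, the `can_access_artifacts` function, and the sample users are hypothetical, and the predefined All users collaborator is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Platform permissions named in this section; a user needs at least one.
REQUIRED_PERMISSIONS = {"Access governance artifacts", "Manage governance categories"}


@dataclass
class Category:
    name: str
    parent: Optional["Category"] = None
    # Maps a user name to the collaborator roles granted directly in this category.
    collaborators: Dict[str, Set[str]] = field(default_factory=dict)

    def roles_for(self, user: str) -> Set[str]:
        """Roles granted here plus roles inherited from all parent categories."""
        roles = set(self.collaborators.get(user, set()))
        if self.parent is not None:
            roles |= self.parent.roles_for(user)
        return roles


def can_access_artifacts(user: str, permissions: Set[str], category: Category) -> bool:
    """Both conditions must hold: a qualifying permission and a collaborator role."""
    has_permission = bool(REQUIRED_PERMISSIONS & set(permissions))
    return has_permission and bool(category.roles_for(user))
```

In this model, a user who is an Editor in a parent category and an Owner in its subcategory holds both roles in the subcategory, which mirrors how collaborators accumulate roles.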

Changes to user permissions
If you upgraded from Cloud Pak for Data Version 3.0.1, the following user permissions are automatically migrated as part of the upgrade:
  • Users who had the Manage governance categories permission continue to have that permission and also have the Owner role for all top-level categories.
  • Users who had the Manage governance artifacts permission now have the Access governance artifacts permission, the Editor role in all categories, and the new Manage data protection rules permission.
  • All users now have the Access governance artifacts permission. However, when you add new users, the Access governance artifacts permission is not included in all of the predefined roles. It is included in the Administrator, Data Engineer, Data Steward, and Data Quality Analyst roles.
  • All users who were listed as Authors in a governance workflow now have the Access governance artifacts permission and also the Editor role in all categories.
Workflows for governance artifacts support categories
Workflow configurations for governance artifacts now require categories to identify the governance artifacts and users for the workflow:
  • When you create a new workflow configuration for governance artifacts, you must select either one category or all categories as part of the triggering condition for the workflow, along with governance artifact types and events.
  • You no longer specify artifact authors in a workflow configuration. Artifact authors are all users who have permission to edit artifacts in a category that is specified in the workflow configuration.
  • You now specify one or more of these types of assignees to approve and review artifacts: the workflow requestor, users with specified roles in the categories for the workflow, users with the Data Steward role, or selected users.

For details, see Managing workflows for governance artifacts.
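
As a hedged sketch of how such a triggering condition could be modeled, the following class combines a category selection with artifact types and events. The `TriggerCondition` class, its field names, and the sample event names are hypothetical. The sketch also assumes that selecting one category matches only that exact category, which the release notes do not confirm.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TriggerCondition:
    # None stands for "all categories"; otherwise exactly one category name.
    category: Optional[str]
    artifact_types: FrozenSet[str]
    events: FrozenSet[str]

    def matches(self, category: str, artifact_type: str, event: str) -> bool:
        """Return True when the category, artifact type, and event all qualify."""
        category_ok = self.category is None or self.category == category
        return (
            category_ok
            and artifact_type in self.artifact_types
            and event in self.events
        )
```

A configuration for all categories would use `TriggerCondition(None, frozenset({"business term"}), frozenset({"create"}))`, where the artifact type and event strings are placeholders.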

Discovery enhancements
Watson Knowledge Catalog includes the following changes for discovering data:
Automated discovery
The sample size is 1,000 records by default. Changing the sample size requires specific permissions.
Quick scan
With the improved version, you can perform more scalable data discovery with richer analysis results that can be published to one or more catalogs directly from the quick scan results.

For details, see Running a quick scan.

Import metadata from an analytics project
You can use the metadata import asset type to import data assets from a connection so that you can analyze and enrich the assets later.

For details, see Importing metadata.

Import additional artifacts and properties
You can now import reference data sets. When you import a reference data set, you can also import secondary categories, effective dates, and custom attribute values for most artifacts.
For business terms, you can import:
  • Type of terms relationships
  • Assigned data classes
  • Synonyms

For details, see Importing governance artifacts.