Preventive Service Planning
Abstract
The following known issues, limitations, and workarounds apply when you run automated discovery or quick scan on Hive or HDFS connections in Watson Knowledge Catalog for IBM Cloud Pak for Data.
Content
Existing connections in upgrade installations
If you have existing Hive Kerberos connections, a project administrator must complete several steps before and after upgrading the Watson Knowledge Catalog service in IBM Cloud Pak for Data.
Before upgrading
As a project administrator, back up the driver configuration for Kerberos authentication. Log in to the conductor pod (usually the is-en-conductor-0 pod) and create a backup copy of the /opt/IBM/InformationServer/ASBNode/lib/java/JDBCDriverLogin.conf file.
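For example, the backup can be pulled out of the pod with the OpenShift CLI. This is a minimal sketch, assuming the pod is named is-en-conductor-0 and runs in a namespace called zen; adjust both for your deployment:

```sh
# Copy the Kerberos JAAS driver configuration out of the conductor pod.
# Pod name and namespace are assumptions; verify yours with "oc get pods".
oc cp zen/is-en-conductor-0:/opt/IBM/InformationServer/ASBNode/lib/java/JDBCDriverLogin.conf \
  ./JDBCDriverLogin.conf.bak
```

Copying the file off the pod (rather than duplicating it inside the pod) matters here, because the pod's file system can be replaced when the service is upgraded.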
After upgrading
As a project administrator, complete these steps:
- Log in to the conductor pod (usually the is-en-conductor-0 pod) and replace the /opt/IBM/InformationServer/ASBNode/lib/java/JDBCDriverLogin.conf file with the backup copy that you created before the upgrade (a command sketch follows this list).
- For connections that you want to use for automated discovery, complete steps 8 through 12 of the procedure described in Configuring Hive with Kerberos for quality tasks.
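A matching restore sketch, under the same assumptions about pod name and namespace as the backup example above:

```sh
# Copy the saved configuration back into the upgraded conductor pod.
oc cp ./JDBCDriverLogin.conf.bak \
  zen/is-en-conductor-0:/opt/IBM/InformationServer/ASBNode/lib/java/JDBCDriverLogin.conf
```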
New connections
| Data source | Connection type | Authentication type | Quick scan | Automated discovery |
|---|---|---|---|---|
| HDFS | Third party: Apache HDFS | Without additional authentication mechanism | Not supported | Platform connection |
| HDFS | Third party: Apache HDFS | With Knox gateway | Not supported | Platform connection |
| HDFS | Third party: Apache HDFS | With Kerberos | Not supported | Connection created through metadata import with additional configuration. Metadata import must be enabled, and the user setting up the connection must have the Access advanced governance capability and Metadata import permissions. Connector to use: Apache > File connector - HDFS. For details, see Metadata import connectors. |
| Hive | Third party: Apache Hive | Without additional authentication mechanism | Platform connection | Platform connection |
| Hive | Third party: Apache Hive | With Knox gateway | Platform connection | Platform connection |
| Hive | Third party: Apache Hive | With Kerberos | Connection created through metadata import with additional configuration. Metadata import must be enabled, and the user setting up the connection must have the Access advanced governance capability and Metadata import permissions. Connector to use: IBM > JDBC connector. For details about the additional configuration steps, see Configuring Hive with Kerberos for quick scan. | Connection created through metadata import with additional configuration. Metadata import must be enabled, and the user setting up the connection must have the Access advanced governance capability and Metadata import permissions. Connector to use: IBM > JDBC connector. For details, see Configuring Hive with Kerberos for quality tasks. |
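For orientation, a JDBC connection URL for a Kerberos-secured HiveServer2 typically carries the Hive service principal in the connection string. The host, port, database, and realm below are placeholder values for illustration, not values from this technote:

```
# Hypothetical Hive JDBC URL with a Kerberos service principal.
jdbc:hive2://hive-host.example.com:10000/default;principal=hive/hive-host.example.com@EXAMPLE.COM
```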
Considerations when setting up data discovery jobs for Hive or HDFS connections
- Connections created through metadata import are automatically shown in the connection selection list when you create a new discovery job.
- When you create a discovery job for a platform-level Hive connection, you might need to add the connection twice.
- When you set up an automated discovery job for an HDFS connection, browsing the discovery root might not work. As a workaround, select a different connection, then switch back to the original HDFS connection; browsing then works, and you can select the folder to analyze.
[{"Line of Business":{"code":"LOB10","label":"Data and AI"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSHGYS","label":"IBM Cloud Pak for Data"},"ARM Category":[{"code":"a8m50000000ClVnAAK","label":"Organize->Discovery"}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"3.5.0"}]
Document Information
Modified date:
20 November 2020
UID
ibm16366639