IBM Support

Connect to Hive via Apache Knox Gateway within IBM Cloud Pak for Data

How To


Summary

IBM Cloud Pak for Data lets you create a connection to a Hive database that IBM Watson Knowledge Catalog can use to run discovery jobs. This document describes how to set the correct connection parameters when the remote Hive instance must be accessed through Apache Knox Gateway.

Objective

Learn how to set up a connection to Hive via Apache Knox Gateway within Cloud Pak for Data.

Environment

  • IBM Watson Knowledge Catalog add-on service installed.

Steps

Export the certificate on the Apache Knox server
1- Export the certificate from the Knox gateway keystore:
cd /usr/hdp/current/knox-server/data/security/keystores 

keytool -exportcert -alias gateway-identity -file /tmp/exported_ssl_certificate.cer -keystore ./gateway.jks -storepass <password>

 Note: Replace <password> with the Apache Knox master secret.
2- Copy exported_ssl_certificate.cer to /tmp on one of the Cloud Pak for Data cluster nodes.
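Before copying the file, it can help to confirm that the export produced a readable certificate. The sketch below is one way to do that with keytool -printcert. The helper name show_cert is hypothetical and not part of the original procedure, and the sketch assumes keytool is on the PATH of the Knox server.

```shell
# Hypothetical helper (not part of the IBM procedure): print the owner, issuer,
# and validity period of an exported certificate so it can be checked by eye.
show_cert() {
  cert_file="$1"
  if [ ! -f "$cert_file" ]; then
    echo "certificate not found: $cert_file" >&2
    return 1
  fi
  # Assumes keytool is on the PATH. -printcert reads the file directly and
  # does not need the keystore password.
  keytool -printcert -file "$cert_file"
}
```

For example, run show_cert /tmp/exported_ssl_certificate.cer on the Knox server and check that the owner and validity dates match the gateway you expect.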
Import the certificate into a client truststore on the Cloud Pak for Data cluster
1- On a cluster node, log in to the OpenShift cluster with your oc user name and password:
oc login
2- Open a remote shell in the is-en-conductor-0 pod:
oc rsh is-en-conductor-0
3- Copy the exported .cer file into the pod under /user-home/.
4- Import the certificate into a newly created truststore, client_truststore.jks. To avoid losing the truststore when the pod is re-created, store it on a retained persistent volume such as /user-home.
cd /opt/IBM/InformationServer/jdk/bin/

./keytool -importcert -file /user-home/exported_ssl_certificate.cer -keystore /user-home/client_truststore.jks -alias ssl_certificate.cer -storepass <password> -noprompt
 Note: Replace <password> with a password of your choice for the new truststore.
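Steps 3 and 4 can also be driven from the cluster node without an interactive shell. The following is a minimal sketch, assuming the pod name and paths used above. The helper names run, copy_cert_to_pod, and list_truststore and the DRY_RUN variable are hypothetical. oc cp relies on tar being available in the container, so verify that for your pod image before using it.

```shell
# Names taken from the steps above; adjust them for your cluster.
POD=is-en-conductor-0
CERT=/tmp/exported_ssl_certificate.cer
TRUSTSTORE=/user-home/client_truststore.jks
KEYTOOL=/opt/IBM/InformationServer/jdk/bin/keytool

# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Copy the certificate from the cluster node into the pod.
# Run this on the node, not inside the pod.
copy_cert_to_pod() {
  run oc cp "$CERT" "$POD:/user-home/exported_ssl_certificate.cer"
}

# List the truststore entries to confirm that the import succeeded.
# $1 is the truststore password chosen in step 4.
list_truststore() {
  run oc exec "$POD" -- "$KEYTOOL" -list -keystore "$TRUSTSTORE" -storepass "$1"
}
```

Setting DRY_RUN=1 before calling either helper prints the command instead of executing it, which makes it easy to review what would run. Note that the dry-run output of list_truststore includes the password.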
Create the Connection
- Log in to Cloud Pak for Data.
- Create a new connection of a type that supports Hive connectivity and Watson Knowledge Catalog discovery jobs.
- Use the following connection parameters:
Host: <hive_server2_hostname>
Port: 8443
DatabaseName=<db_name>
Parameters
EncryptionMethod=ssl;TrustStore=/user-home/client_truststore.jks;TrustStorePassword=<password>;TransportMode=http;HTTPPath=gateway/default/hive;CryptoProtocolVersion=TLSv1.2
- Provide the user name and password used to authenticate to Apache Knox/Hive.
- Save the connection.
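The Parameters value is a single semicolon-separated string, so a stray space or a missing separator is easy to introduce when typing it by hand. As a sketch, the string can be assembled from its parts and then pasted into the connection form. build_hive_params is a hypothetical helper, and the KNOX_TOPOLOGY variable assumes that the "default" segment of HTTPPath is the Knox topology name.

```shell
# Truststore path created in the import step above.
TRUSTSTORE=/user-home/client_truststore.jks
# Assumption: "default" in HTTPPath=gateway/default/hive is the Knox topology name.
KNOX_TOPOLOGY=default

# Hypothetical helper: print the Parameters string for the connection form.
# $1 is the truststore password.
build_hive_params() {
  printf 'EncryptionMethod=ssl;TrustStore=%s;TrustStorePassword=%s;TransportMode=http;HTTPPath=gateway/%s/hive;CryptoProtocolVersion=TLSv1.2\n' \
    "$TRUSTSTORE" "$1" "$KNOX_TOPOLOGY"
}
```

Calling build_hive_params with the truststore password prints the full string on one line. If your gateway exposes Hive under a different topology, change KNOX_TOPOLOGY before calling the helper.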

Document Location

Worldwide

Product: IBM Cloud Pak for Data
Version: 2.5.0; 3.0.0; 3.0.1
Platform: Platform Independent

Document Information

Modified date:
28 May 2020

UID

ibm16205888