Configuring IBM JDBC Hive driver to connect to the Hive server

You can use the IBM DataDirect JDBC driver to connect to the Hive server.

  • To connect to a data source through the JDBC driver, a JDBC connection URL is required. For the IBM JDBC Hive driver, the connection URL starts with jdbc:ibm:hive, followed by the configuration parameters that are specific to the driver. These parameters vary with the authentication method in use. For more details on the driver configuration parameters, refer to Progress DataDirect JDBC Driver for Hive.

To connect to a Hive server, the Hive server host name and port number are required. In addition, you must specify the authentication method that matches the authentication configured on the Hive server.

The driver supports two authentication methods:
  1. User authentication
  2. Kerberos authentication
Note: The authentication method is controlled by the driver connection option AuthenticationMethod. If this option is set to userIdPassword, user authentication is used; if it is set to kerberos, Kerberos authentication is used. Which method to use is governed primarily by the authentication configured on the Hive server. The default value of this connection option is userIdPassword.

Here is a sample connection URL for user authentication:
jdbc:ibm:hive://<hive_server_host>:<hive_server_thrift_port>;<DatabaseName=value>
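As a minimal sketch, the URL above could be assembled and used from Java as follows. The class and method names (HiveUserAuthExample, userAuthUrl, connect), the host hive01.example.com, the port 10000, and the database name default are illustrative assumptions and do not come from the driver documentation. The sketch also assumes that the driver JAR is on the classpath so that DriverManager can locate it, and that the driver accepts the user name and password through the standard JDBC getConnection call.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class HiveUserAuthExample {

    // Builds jdbc:ibm:hive://<host>:<port>;DatabaseName=<database>
    static String userAuthUrl(String host, int port, String database) {
        return "jdbc:ibm:hive://" + host + ":" + port + ";DatabaseName=" + database;
    }

    // Opens a connection with user authentication. AuthenticationMethod is
    // omitted from the URL because userIdPassword is the default.
    static Connection connect(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical host, port, and database name.
        String url = userAuthUrl("hive01.example.com", 10000, "default");
        System.out.println(url);

        // Only attempt a real connection when credentials are passed in.
        if (args.length == 2) {
            try (Connection connection = connect(url, args[0], args[1])) {
                System.out.println("Connected: " + !connection.isClosed());
            }
        }
    }
}
```

Run without arguments, the program only prints the URL; pass a user name and password as arguments to attempt a connection.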

When Kerberos authentication is used, additional configuration steps need to be performed before the driver can authenticate. The JDBC driver needs a JAAS configuration file (JDBCDriverLogin.conf), which must be placed in the same location as the driver file. By default, the driver is installed under $ISHOME/ASBNode/lib/java, so JDBCDriverLogin.conf should be placed in that directory.

The JAAS configuration file (JDBCDriverLogin.conf) can contain multiple stanzas. When the file defines multiple stanzas, use the connection parameter loginConfigName to select the stanza to use. A stanza can use either the user's credential cache file or the user's keytab file, and the entries in the stanza differ accordingly. The parameters in the file are specific to the IBM JDK. For additional details on writing the JAAS configuration, refer to the JAAS configuration link for IBM JDK.

JDBCDriverLogin.conf with a single stanza
JDBC_DRIVER_01 {
com.ibm.security.auth.module.Krb5LoginModule required
credsType=both
principal="dsadm@IBM.COM"
useKeytab="/home/dsadm/dsadm.keytab";
};
Note: The principal can be any Kerberos user who has access to Hive. The dsadm user is used as an example here.

Here is a sample connection URL for Kerberos authentication when JDBCDriverLogin.conf has a single entry:
jdbc:ibm:hive://<hive_server_host>:<hive_server_thrift_port>;<DatabaseName=value>;AuthenticationMethod=kerberos;ServicePrincipalName=<hive_service_principal>
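The Kerberos URL for the single-stanza case might be built in Java along these lines. This is a sketch under stated assumptions: the class and method names (HiveKerberosExample, kerberosUrl), the host, port, database name, and the service principal hive/hive01.example.com@EXAMPLE.COM are hypothetical placeholders. An actual connection additionally requires the driver JAR, a valid JDBCDriverLogin.conf next to it, and a working Kerberos setup, none of which the sketch verifies.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class HiveKerberosExample {

    // Appends the two Kerberos-related options to the base URL.
    static String kerberosUrl(String host, int port, String database, String servicePrincipal) {
        return "jdbc:ibm:hive://" + host + ":" + port
                + ";DatabaseName=" + database
                + ";AuthenticationMethod=kerberos"
                + ";ServicePrincipalName=" + servicePrincipal;
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical values; replace them with those of your Hive server.
        String url = kerberosUrl("hive01.example.com", 10000, "default",
                "hive/hive01.example.com@EXAMPLE.COM");
        System.out.println(url);

        // Pass "connect" to attempt a real connection. No user name or password
        // is supplied because the credentials come from the JAAS stanza.
        if (args.length == 1 && args[0].equals("connect")) {
            try (Connection connection = DriverManager.getConnection(url)) {
                System.out.println("Connected: " + !connection.isClosed());
            }
        }
    }
}
```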

JDBCDriverLogin.conf with multiple stanzas
JDBC_DRIVER_dsadm_keytab {
com.ibm.security.auth.module.Krb5LoginModule required
credsType=both
principal="dsadm@IBM.COM"
useKeytab="/home/dsadm/dsadm.keytab";
};
JDBC_DRIVER_dsadm_cache {
com.ibm.security.auth.module.Krb5LoginModule required
credsType=initiator
principal="dsadm@IBM.COM"
useCcache="FILE:/home/dsadm/krb5cc_dsadm";
};
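
Before you reference a stanza through loginConfigName, it can be useful to confirm that the file actually defines it. The following sketch does this with the standard JAAS Configuration API. It assumes that the JDK provides the JavaLoginConfig configuration type (as OpenJDK does); the class and method names (JaasStanzaCheck, hasStanza) are hypothetical. The check only parses the file: it does not load the login module class or contact a KDC.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.NoSuchAlgorithmException;
import java.security.URIParameter;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;

final class JaasStanzaCheck {

    // Returns true when the JAAS file defines a stanza with the given name.
    // Caveat: if the file defines a stanza named "other", the JDK may return
    // that stanza as a fallback for names that are not defined.
    static boolean hasStanza(Path jaasFile, String stanzaName) throws NoSuchAlgorithmException {
        Configuration config = Configuration.getInstance(
                "JavaLoginConfig", new URIParameter(jaasFile.toUri()));
        AppConfigurationEntry[] entries = config.getAppConfigurationEntry(stanzaName);
        return entries != null && entries.length > 0;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        if (args.length != 2) {
            System.out.println("usage: JaasStanzaCheck <path to JDBCDriverLogin.conf> <stanza name>");
            return;
        }
        Path jaasFile = Paths.get(args[0]);
        System.out.println(args[1] + " defined: " + hasStanza(jaasFile, args[1]));
    }
}
```

For example, the program could be run with the path to JDBCDriverLogin.conf and the stanza name JDBC_DRIVER_dsadm_keytab as arguments.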

Here is a sample connection URL for Kerberos authentication when JDBCDriverLogin.conf has multiple entries:
jdbc:ibm:hive://<hive_server_host>:<hive_server_thrift_port>;<DatabaseName=value>;AuthenticationMethod=kerberos;ServicePrincipalName=<hive_service_principal>;loginConfigName=JDBC_DRIVER_dsadm_keytab
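Because the option list grows in the multi-stanza case, one possible approach is to collect the options in an ordered map and join them, trimming stray whitespace around keys and values. The sketch below is an assumption-laden illustration: HiveUrlBuilderExample and buildUrl are hypothetical names, and the host, port, database name, and service principal are placeholders. Only the option names and the stanza name JDBC_DRIVER_dsadm_keytab come from the text above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HiveUrlBuilderExample {

    // Joins the options as ;key=value pairs in insertion order and trims
    // whitespace so that no stray blanks end up around the option values.
    static String buildUrl(String host, int port, Map<String, String> options) {
        StringBuilder url = new StringBuilder("jdbc:ibm:hive://")
                .append(host).append(':').append(port);
        for (Map.Entry<String, String> option : options.entrySet()) {
            url.append(';')
               .append(option.getKey().trim())
               .append('=')
               .append(option.getValue().trim());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in the order they were added.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("DatabaseName", "default");                                       // hypothetical
        options.put("AuthenticationMethod", "kerberos");
        options.put("ServicePrincipalName", "hive/hive01.example.com@EXAMPLE.COM");  // hypothetical
        options.put("loginConfigName", "JDBC_DRIVER_dsadm_keytab");

        System.out.println(buildUrl("hive01.example.com", 10000, options));
    }
}
```

Switching to the cache-based stanza then only requires changing the loginConfigName value to JDBC_DRIVER_dsadm_cache.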

Note:
  1. The configuration information provided above is for the IBM JDBC Hive driver (that is, the IBM-branded Progress DataDirect JDBC Driver for Hive). If you use a different driver for Hive, refer to that driver's documentation for configuration information.
  2. For additional connection options on the Progress DataDirect JDBC Driver for Hive, refer to Progress DataDirect JDBC Driver for Hive.