Apache Impala connection
To access your data in Apache Impala, create a connection asset for it.
Apache Impala provides high-performance, low-latency SQL queries on data that is stored in popular Apache Hadoop file formats.
Supported versions
Apache Impala 4.0.
Prerequisites for Kerberos authentication
If you plan to use Kerberos authentication, complete the following requirements:
- Configure the data source for Kerberos authentication. Optional: This connection supports Kerberos SSO with user impersonation, which requires additional configuration.
- Confirm that the service in which you plan to use the connection supports Kerberos. For more information, see Kerberos authentication in Cloud Pak for Data.
- An administrator must complete one of the following sets of setup steps:
  - Kerberos without SSO: Enabling platform connections to use Kerberos authentication
  - Kerberos SSO: Configuration for Kerberos SSO
Create a connection to Apache Impala
To create the connection asset, you need these connection details:
- Database (optional): If you do not enter a database name, you must enter the catalog name, schema name, and the table name in the properties for SQL queries.
- Hostname or IP address
- Port number
- Username and password
- SSL certificate (if required by the database server)
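Before you fill in the connection form, it can help to collect the details above in one place and check that nothing required is missing. The following is a minimal sketch of such a check, not part of the product: the `ImpalaConnectionDetails` class, its field names, and both helper functions are hypothetical names chosen for illustration. It assumes username and password authentication; Kerberos and LDAP setups need different credentials.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImpalaConnectionDetails:
    """Hypothetical container for the details listed above (not a product API)."""
    hostname: str
    port: int
    username: str
    password: str
    database: Optional[str] = None         # optional, per the list above
    ssl_certificate: Optional[str] = None  # only if the database server requires it


def missing_details(details: ImpalaConnectionDetails) -> List[str]:
    """Return the names of required details that are empty or out of range."""
    problems = []
    if not details.hostname.strip():
        problems.append("hostname")
    if not 1 <= details.port <= 65535:  # valid TCP port range
        problems.append("port")
    if not details.username:
        problems.append("username")
    if not details.password:
        problems.append("password")
    return problems


def needs_qualified_names(details: ImpalaConnectionDetails) -> bool:
    """Without a database name, the catalog, schema, and table names
    must be entered in the properties for SQL queries."""
    return not details.database
```

For example, `missing_details(ImpalaConnectionDetails("impala.example.com", 21050, "analyst", "secret"))` returns an empty list, and `needs_qualified_names` returns `True` for the same object because no database name is set. The host name, port, and credentials shown here are placeholder values.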
Authentication method
Select the security mechanism to use to authenticate the user:
- Username and password or Kerberos credentials: Available Kerberos selections depend on whether you select Personal or Shared credentials.
- LDAP: Use an LDAP security mechanism for external authentication.
Note: SPSS Modeler supports only the Username and password authentication method.
Credentials
The credentials setting determines the available authentication methods.
If you select Shared (default), you can use either username and password authentication or Kerberos authentication (without SSO). For more information, see Prerequisites for Kerberos authentication. For Kerberos, you need the following connection details:
- Service principal name (SPN) that is configured for the data source
- User principal name to connect to the Kerberized data source
- The keytab file for the user principal name that is used to authenticate to the Key Distribution Center (KDC)
If you select Personal, you can enter your username and password for the server manually, use secrets from a vault, or use Kerberos authentication. For more information, see Prerequisites for Kerberos authentication. You have two choices for Kerberos:
- Kerberos (without SSO). For Kerberos without SSO, you need the following connection details:
  - Service principal name (SPN) that is configured for the data source
  - User principal name to connect to the Kerberized data source
  - The keytab file for the user principal name that is used to authenticate to the Key Distribution Center (KDC)
- Kerberos SSO. Select Kerberos SSO and enter the Service principal name (SPN) that is configured for the data source.
For Credentials and Certificates, you can use secrets if a vault is configured for the platform and the service supports vaults. For more information, see Using secrets from vaults in connections.
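The rules in this section, which decide the fields you must supply for each combination of credentials setting and authentication method, can be summarized as a lookup table. The following is a minimal sketch under stated assumptions: the keys, field names, and the `required_fields` function are hypothetical and do not correspond to a product API. LDAP is left out because this section does not list its connection details.

```python
from typing import Dict, Tuple

# Field names are illustrative labels for the details listed in this section.
KERBEROS_FIELDS = (
    "service_principal_name",
    "user_principal_name",
    "keytab_file",
)

# (credentials setting, authentication method) -> fields to supply
REQUIRED_FIELDS: Dict[Tuple[str, str], Tuple[str, ...]] = {
    ("shared", "username_password"): ("username", "password"),
    ("shared", "kerberos"): KERBEROS_FIELDS,
    ("personal", "username_password"): ("username", "password"),
    ("personal", "kerberos"): KERBEROS_FIELDS,
    # Kerberos SSO needs only the SPN and is offered only with Personal credentials.
    ("personal", "kerberos_sso"): ("service_principal_name",),
}


def required_fields(credentials: str, method: str) -> Tuple[str, ...]:
    """Return the fields needed for a credentials setting and method.

    Raises ValueError for combinations this section does not describe,
    such as Kerberos SSO with Shared credentials.
    """
    key = (credentials.lower(), method.lower())
    if key not in REQUIRED_FIELDS:
        raise ValueError(
            f"{method!r} is not available with {credentials!r} credentials"
        )
    return REQUIRED_FIELDS[key]
```

With Personal credentials, the username and password can be entered manually or taken from a vault; the sketch treats both cases as the same pair of fields.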
Choose the method for creating a connection based on where you are in the platform:
- In a project
  - Click Assets > New asset > Prepare data > Connect to a data source. See Adding a connection to a project.
- In a catalog
  - Click Add to catalog > Connection. See Adding a connection asset to a catalog.
- In a deployment space
  - Click Import assets > Data access > Connection. See Adding data assets to a deployment space.
- In the Platform assets catalog
  - Click New connection. See Adding platform connections.
Next step: Add data assets from the connection
Federal Information Processing Standards (FIPS) compliance
This connection can be used on a FIPS-enabled cluster (FIPS tolerant); however, it is not FIPS-compliant.
Apache Impala setup
Restriction
You can use this connection only for source data. You cannot write data or export data with this connection.
Running SQL statements
To ensure that your SQL statements run correctly, refer to the Impala SQL Language Reference for the correct syntax.