Connecting to a Presto server
To connect to a Presto Server from a client program or CLI, the following items are required:
- Hostname and port for the Presto server or workstation where the watsonx.data™ developer or software is installed.
- Certificates served by the Presto server to establish trust.
- Authorized user credentials to access the Presto server.
curl -ki https://<presto-hostname>:<presto-portnumber>
nc -v <presto-hostname> <presto-portnumber>
- The Presto server exposes a HTTPs (TCP) port. Therefore, you can use any convenient HTTP or TCP-based utility to ensure that the Presto server can be accessed from your network.
- When you run the
curl
command, the server may return an HTTP/1.1 401 Unauthorized response. This is expected as the server is secured by authentication. - When you run the
nc
command, the server returns a success message.
In watsonx.data, you can connect to the Presto server in multiple ways based on the platform and utilities you are using. See the following sections for more details:
location = s3a://<storage-name>/
Using presto-cli executable (remote) - software
This section provides instructions to connect the Presto server in watsonx.data software edition from a remote
presto-cli
.
-
The Red Hat® OpenShift Project namespace administrator or cluster administrator has exposed a route to the Presto (Java) server. The administrators should refer to Exposing secure route to Presto server for instructions.
-
You have received the hostname and port (by default 443) associated with the specific Red Hat OpenShift route from the administrator.
-
Import the certificate of the Presto server to establish trust. You can import the certificate either by using the cert-mgmt utility or manually. Based on your setup, complete one of the following steps:
Note: Presto servers in watsonx.data use TLS for securing access from clients. Sometimes, the Presto server may be associated with a Self-signed certificate instead of a certificate authority (CA) signed certificate. In such cases, the client programs that connect via the HTTPs (TLS) protocol must be configured to trust the self-signed certificate configured in the Presto Server.- If you are connecting from the watsonx.data utilities, go to
bin/cert-mgmt
and run the following command:bin/cert-mgmt --op=add --host=<hostname of the Presto server> --port=443
You see the following output:/mnt/infra/tls/truststore.jks updated /mnt/infra/tls/cabundle.crt updated
Note: The ibm-lh-client installation includes utilities and packages to connect to Presto engines in watsonx.data software deployments. The client installation includes thebin/cert-mgmt
executable that simplifies fetching the certificate that is used with a Presto server. It stores such certificates in a Java (JKS) format truststore for Java based clients and also in PEM format for use with nonjava clients.The presto-cli and presto-run watsonx.data (ibm-lh-client) utilities are Java-based and automatically use this JKS truststore. JDBC programs (such as those run via the
bin/dev-sandbox
utility) must be configured to explicitly use this JKS truststore. Thecabundle.crt
must be used by non-Java programs, such as Python scripts. - If you are connecting from programs or utilities that run outside the watsonx.data client installation, do one of the
following:
i. From your client machine, point or copy the truststore or certificate file present in the ibm-lh-client installation directory. The certificates are at ibm-lh-client/localstorage/volumes/infra/tls/. In this location, you can find:
truststore.jks
- can be used for Java/JDBC based applications.cabundle.crt
- can be used for non-Java applications.aliases/
- is a subdirectory with individual certificates. For example,aliases/<Presto server hostname>:443.crt
refers to the certificate for a specific Presto server only.ii. Manually import certificates into the truststore. For instructions, see Importing self-signed certificates from a Presto server to a Java truststore.
- If you are connecting from the watsonx.data utilities, go to
-
Authenticate with the Presto server.
./presto --server https://<route_name> --truststore-path /<directory>/<certfilename/truststore>.jks --truststore-password changeit --catalog iceberg_data --user admin --password
Note:-
The default out-of-the-box user for watsonx.data software is ‘admin’ with a generated password that is accessible by the ibm-lakehouse-manage get-cpd-instance-details command after installation.
-
-—truststore-path
and-—truststore-password
arguments are optional if the certificates were imported into the default JDK/JRE trust store already
The Administrator of the watsonx.data instance can configure different Identity Providers to authenticate users or grant them appropriate Admin or User Roles. See the “Managing Users” topic for more details.
In addition to directly using their credentials as username and password properties to connect to the Presto server, users can choose an API Key as well.
-
-
When connecting to the watsonx.data Presto servers, the credential properties to set would then be:
username: username, password: <password>
Or
username: ibmlhapikey, password:<base64 of username:apikey>
Or
username: ibmlhtoken, password:<platform token> or < instance token >
Using Java/JDBC – software
This section provides instructions to connect to a Presto server from a Java JDBC application.
-
the Red Hat OpenShift Project namespace administrator or cluster administrator has exposed the route to the Presto server. The administrators should refer to Exposing secure route to Presto server for instructions.
-
you have the received the hostname and port (by default 443) associated with that specific Red Hat OpenShift route from the administrator.
To connect with watsonx.data Presto server from a Java JDBC application, complete the following steps:
-
If required, complete the instructions in Using presto-cli executable (remote) - software to get the hostname, port, server certificate, authentication, and test the connectivity.
-
Download and install the
presto-jdbc-0.286.jar
(or later) on the client machine. -
Add
presto-jdbc-0.286.jar
(or later) to the class path of your Java application. -
Create a Java application by using JDBC interface. Following is a sample JDBC code.
Note:-
presto_url - Identifies the jdbc URL to the Presto server
-
SSLTrustStorePath
andSSLTrustStorePassword
properties are set to identify the location of the truststore and its password (if necessary). -
Values for user and password properties are the watsonx.data software edition username and password.
import java.sql.*; import java.util.Properties; public class PrestoJdbcSample { public static void main(String[] args) throws Exception { /** * example of fetching the location and credentials needed to connect, from environment variables **/ String username = System.getenv("ENG_USERNAME"); String password = System.getenv("ENG_PASSWORD"); String hostname = System.getenv("ENG_HOST"); String portnumber = System.getenv("ENG_PORT"); String presto_url = "jdbc:presto://" + hostname + ":" + portnumber; String ts_location = System.getenv("TRUSTSTORE"); // example: “ibm-lh-client/localstorage/volumes/infra/tls/truststore.jks"; String ts_password = System.getenv("TRUSTSTORE_PASSWORD"); // example: "changeit"; Connection connection = null; Statement statement = null; ResultSet resultSet = null; try { /** load the Presto JDBC Driver class **/ String driverClass = "com.facebook.presto.jdbc.PrestoDriver"; Class.forName(driverClass); /** Set the connection properties**/ Properties properties = new Properties(); properties.setProperty("user", username); properties.setProperty("password", password); properties.setProperty("SSL", "true"); /** * identify where the java truststore is located. * skip if the default JDK truststore already has the Presto Server's certificate * or if the Presto server has signed certificates **/ properties.setProperty("SSLTrustStorePath", ts_location); properties.setProperty("SSLTrustStorePassword", ts_password); /** Connect **/ connection = DriverManager.getConnection(presto_url, properties); /** Issue a Query **/ String query = "SELECT * FROM tpch.tiny.customer LIMIT 10"; statement = connection.createStatement(); resultSet = statement.executeQuery(query); /** iterate through the results **/ while (resultSet.next()) { String phone = resultSet.getString("phone"); String name = resultSet.getString("name"); System.out.println("phone = " + phone + ", name = " + name); } } catch (Exception e) { e.printStackTrace(); } finally { /** clean up at the end always **/ if (resultSet != null) { resultSet.close(); } if (statement != null) { statement.close(); } if (connection != null) { connection.close(); } } } }
-
Using Python scripts – software
This section provides instructions to connect to a Presto server from a Python client.
-
the Red Hat OpenShift Project namespace administrator or cluster administrator has exposed the route to the Presto Server. The administrators should refer to Exposing secure route to Presto server for instructions.
-
you have the received the hostname and port (by default 443) associated with that specific Red Hat OpenShift route from the administrator.
To connect with watsonx.data Presto server from a Python client, complete the following steps:
-
Install python 3.x (3.10 or later recommended) and
pip3
on your client workstation. -
Install common python modules to connect to Presto.
pip3 install SQLAlchemy 'pyhive[presto]' presto-python-client
-
If required, complete the instructions in Using presto-cli executable (remote) - software to get the hostname, port, server certificate, authentication, and test the connectivity.
-
Start the sandbox container for the registered Presto engine.
ibm-lh-client/bin/dev-sandbox --engine=demo-b
-
In the bash prompt, install the
prestodb
module.export HOME=/tmp pip3 install SQLAlchemy 'pyhive[presto]' presto-python-client
Important: Steps 4 and 5 are required only if you are not using the Prestodb module inibm-lh-client
. Also, currently due to an issue with the Prestodb module inibm-lh-client,
you must complete steps 4 and 5. -
Use the DBAPI interface to query the Presto server. Following is a sample Python script:
Note:-
See how the "ENG_" environment variables are examples of how the relevant Presto server locations can be used with the
dbapi.connect
function call. -
An example of how to point to the location of the Presto server’s certificate file is represented by the
cert_location
variable. -
The
conn._http_session.verify = cert
_location property is used to point to the certificate location (for self-signed cases only).
import os import prestodb username=os.environ["ENG_USERNAME"] password=os.environ["ENG_PASSWORD"] hostname=os.environ["ENG_HOST"] portnumber=os.environ["ENG_PORT"] cert_location='./ibm-lh-client/localstorage/volumes/infra/tls/aliases/' + hostname + ':' + portnumber + '.crt' with prestodb.dbapi.connect( host=hostname, port=portnumber, user=username, catalog='tpch', schema='tiny', http_scheme='https', auth=prestodb.auth.BasicAuthentication(username,password) ) as conn: conn._http_session.verify = cert_location cur = conn.cursor() cur.execute('select * from tpch.tiny.customer limit 10') rows = cur.fetchall() print(rows)
-