Integrating with watsonx.data on IBM Cloud (Analytics Engine powered by Apache Spark)
You can integrate Analytics Engine powered by Apache Spark with watsonx.data on IBM Cloud.
Before you begin
Before you can configure an Analytics Engine powered by Apache Spark instance for watsonx.data on IBM Cloud, you must:
- Provision a watsonx.data instance on IBM Cloud. For more information, see Getting started with watsonx.data.
- Install Cloud Pak for Data with Analytics Engine powered by Apache Spark. For more information, see Installing Analytics Engine powered by Apache Spark.
- Provision an instance of the Analytics Engine powered by Apache Spark service. See Provisioning an instance.
Procedure
Configure your Analytics Engine powered by Apache Spark instance with your watsonx.data instance:
- Generate the access token that you need to set the default configuration of the Analytics Engine powered by Apache Spark instance. See Generating an API authorization token.
- Run the API to set the instance default configuration:
  curl -X PATCH --location \
    --header "Authorization: ZenApiKey ${TOKEN}" \
    --header "Accept: application/json" \
    --header "Content-Type: application/merge-patch+json" \
    --data '{ <CONFIGURATION_DETAILS> }' \
    "https://<CloudPakforData_URL>/v4/analytics_engines/<INSTANCE_ID>/default_configs"
- CONFIGURATION_DETAILS: Copy the following configuration details and substitute the watsonx.data endpoint and credentials:
  - <hms-thrift-endpoint-from-watsonx.data>: The Hive Metastore (HMS) thrift endpoint of your watsonx.data instance, for example thrift://81823aaf-8a88-4bee-a0a1-6e76a42dc833.cfjag3sf0s5o87astjo0.databases.appdomain.cloud:32683
  - <hms-user-from-watsonx.data>: The watsonx.data username, for example ibmlhapikey
  - <hms-password-from-watsonx.data>: The watsonx.data password.

  {
    "spark.sql.catalogImplementation": "hive",
    "spark.driver.extraClassPath": "/opt/ibm/connectors/iceberg-lakehouse/iceberg-3.3.2-1.2.1-hms-4.0.0-shaded.jar",
    "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    "spark.sql.iceberg.vectorization.enabled": "false",
    "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.lakehouse.type": "hive",
    "spark.sql.catalog.lakehouse.uri": "<hms-thrift-endpoint-from-watsonx.data>",
    "spark.hive.metastore.client.auth.mode": "PLAIN",
    "spark.hive.metastore.client.plain.username": "<hms-user-from-watsonx.data>",
    "spark.hive.metastore.client.plain.password": "<hms-password-from-watsonx.data>",
    "spark.hive.metastore.use.SSL": "true",
    "spark.hive.metastore.truststore.type": "JKS",
    "spark.hive.metastore.truststore.path": "file:///opt/ibm/jdk/lib/security/cacerts",
    "spark.hive.metastore.truststore.password": "changeit"
  }
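The procedure above could be scripted roughly as follows. This is a minimal sketch, not an official tool: the function names (render_hms_config, default_configs_url, apply_default_configs) and the config.json file name are hypothetical, and the token is assumed to have been generated already as described in Generating an API authorization token. The sketch sends the configuration object itself as the request body, which is an assumption about how the <CONFIGURATION_DETAILS> placeholder is meant to be filled. It also assumes that the substituted values contain no double quotes or backslashes, because no JSON escaping is performed.

```shell
#!/usr/bin/env bash
# Sketch only: all function and file names below are hypothetical.

# Print the default-configuration JSON with the three watsonx.data values filled in.
# Assumption: the values contain no double quotes or backslashes (no JSON escaping).
render_hms_config() {
  local hms_uri="$1" hms_user="$2" hms_password="$3"
  cat <<EOF
{
  "spark.sql.catalogImplementation": "hive",
  "spark.driver.extraClassPath": "/opt/ibm/connectors/iceberg-lakehouse/iceberg-3.3.2-1.2.1-hms-4.0.0-shaded.jar",
  "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
  "spark.sql.iceberg.vectorization.enabled": "false",
  "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
  "spark.sql.catalog.lakehouse.type": "hive",
  "spark.sql.catalog.lakehouse.uri": "${hms_uri}",
  "spark.hive.metastore.client.auth.mode": "PLAIN",
  "spark.hive.metastore.client.plain.username": "${hms_user}",
  "spark.hive.metastore.client.plain.password": "${hms_password}",
  "spark.hive.metastore.use.SSL": "true",
  "spark.hive.metastore.truststore.type": "JKS",
  "spark.hive.metastore.truststore.path": "file:///opt/ibm/jdk/lib/security/cacerts",
  "spark.hive.metastore.truststore.password": "changeit"
}
EOF
}

# Build the default_configs endpoint URL from the Cloud Pak for Data host and instance ID.
default_configs_url() {
  local host="$1" instance_id="$2"
  printf 'https://%s/v4/analytics_engines/%s/default_configs\n' "$host" "$instance_id"
}

# Send the PATCH request; the payload is read from a file to avoid shell quoting issues.
apply_default_configs() {
  local host="$1" instance_id="$2" token="$3" payload_file="$4"
  curl -X PATCH --location \
    --header "Authorization: ZenApiKey ${token}" \
    --header "Accept: application/json" \
    --header "Content-Type: application/merge-patch+json" \
    --data @"${payload_file}" \
    "$(default_configs_url "$host" "$instance_id")"
}

# Example usage (placeholders must be replaced with your own values):
#   render_hms_config "<hms-thrift-endpoint-from-watsonx.data>" \
#     "<hms-user-from-watsonx.data>" "<hms-password-from-watsonx.data>" > config.json
#   apply_default_configs "<CloudPakforData_URL>" "<INSTANCE_ID>" "${TOKEN}" config.json
```

Writing the payload to a file first lets you inspect the substituted values before anything is sent, and keeps the password out of the curl command line.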