Integrating with watsonx.data on Cloud Pak for Data (Analytics Engine powered by Apache Spark)
You can integrate Analytics Engine powered by Apache Spark with watsonx.data on Cloud Pak for Data.
Before you begin
Before you can configure an Analytics Engine powered by Apache Spark instance for watsonx.data, you must:
- Install the latest version of Cloud Pak for Data with watsonx.data. For more information, see Installing watsonx.data.
- Install the latest version of Cloud Pak for Data with Analytics Engine powered by Apache Spark. For more information, see Installing Analytics Engine powered by Apache Spark.
- Provision an instance of the Analytics Engine powered by Apache Spark service. For more information, see Provisioning an instance.
Configuring an Analytics Engine powered by Apache Spark instance for watsonx.data
To configure an Analytics Engine powered by Apache Spark instance for watsonx.data:
- Configure your Analytics Engine powered by Apache Spark instance with your watsonx.data instance:
  - Generate an access token to set the Analytics Engine powered by Apache Spark instance default configuration. See Generating an API authorization token.
  - Run the API to set the instance default configuration:
curl -X PATCH --location \
     --header "Authorization: ZenApiKey ${TOKEN}" \
     --header "Accept: application/json" \
     --header "Content-Type: application/merge-patch+json" \
     --data '<CONFIGURATION_DETAILS>' \
     "https://<CloudPakforData_URL>/v4/analytics_engines/<INSTANCE_ID>/default_configs"
- CONFIGURATION_DETAILS: Copy the following configuration details and substitute the following values:
  - <hms-thrift-endpoint-from-watsonx.data>: Obtain the HMS endpoint from the watsonx.data instance.
  - <hms-user-from-watsonx.data>: Cloud Pak for Data cluster admin user or a user with metastore admin access. To grant metastore admin access to a user, see Managing access to the Hive Metastore.
  - <hms-password-from-watsonx.data>: Cloud Pak for Data cluster admin password.
In the following configuration, the value of <hms-user-from-watsonx.data> is, for example, admin or metastoreadmin.
{
  "spark.sql.catalogImplementation": "hive",
  "spark.driver.extraClassPath": "/opt/ibm/connectors/iceberg-lakehouse/iceberg-3.3.2-1.2.1-hms-4.0.0-shaded.jar",
  "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
  "spark.sql.iceberg.vectorization.enabled": "false",
  "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
  "spark.sql.catalog.lakehouse.type": "hive",
  "spark.sql.catalog.lakehouse.uri": "thrift://<hms-thrift-endpoint-from-watsonx.data>",
  "spark.hive.metastore.client.auth.mode": "PLAIN",
  "spark.hive.metastore.client.plain.username": "<hms-user-from-watsonx.data>",
  "spark.hive.metastore.client.plain.password": "<hms-password-from-watsonx.data>",
  "spark.hive.metastore.use.SSL": "true",
  "spark.hive.metastore.truststore.type": "JKS",
  "spark.hive.metastore.truststore.path": "file:///opt/ibm/jdk/lib/security/cacerts",
  "spark.hive.metastore.truststore.password": "changeit"
}
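The procedure above can also be scripted. The following is a minimal Python sketch, not part of the product documentation: the function names build_default_config and build_patch_request are hypothetical, while the endpoint path, headers, and property keys are the ones shown in this topic. The sketch only constructs the PATCH request. Sending it, and handling TLS verification for your cluster, is left to the caller.

```python
import json
import urllib.request


def build_default_config(hms_endpoint, hms_user, hms_password):
    """Return the instance default configuration from this topic as a dict.

    hms_endpoint is the HMS thrift endpoint without the thrift:// prefix.
    """
    return {
        "spark.sql.catalogImplementation": "hive",
        "spark.driver.extraClassPath": "/opt/ibm/connectors/iceberg-lakehouse/"
                                       "iceberg-3.3.2-1.2.1-hms-4.0.0-shaded.jar",
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        "spark.sql.iceberg.vectorization.enabled": "false",
        "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
        "spark.sql.catalog.lakehouse.type": "hive",
        "spark.sql.catalog.lakehouse.uri": f"thrift://{hms_endpoint}",
        "spark.hive.metastore.client.auth.mode": "PLAIN",
        "spark.hive.metastore.client.plain.username": hms_user,
        "spark.hive.metastore.client.plain.password": hms_password,
        "spark.hive.metastore.use.SSL": "true",
        "spark.hive.metastore.truststore.type": "JKS",
        "spark.hive.metastore.truststore.path":
            "file:///opt/ibm/jdk/lib/security/cacerts",
        "spark.hive.metastore.truststore.password": "changeit",
    }


def build_patch_request(cpd_url, instance_id, token, config):
    """Build (but do not send) the PATCH request for the default_configs API."""
    url = f"{cpd_url.rstrip('/')}/v4/analytics_engines/{instance_id}/default_configs"
    return urllib.request.Request(
        url,
        data=json.dumps(config).encode("utf-8"),  # serialized JSON body
        method="PATCH",
        headers={
            "Authorization": f"ZenApiKey {token}",
            "Accept": "application/json",
            "Content-Type": "application/merge-patch+json",
        },
    )
```

A caller would pass the result of build_default_config to build_patch_request together with the Cloud Pak for Data URL, the instance ID, and the generated token, and could then submit the returned request with urllib.request.urlopen. Building the body with json.dumps avoids the missing-comma and stray-comment mistakes that are easy to make when editing the JSON by hand.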