Integrating with watsonx.data on IBM Cloud (Analytics Engine powered by Apache Spark)

You can integrate Analytics Engine powered by Apache Spark with watsonx.data on IBM Cloud.

Before you begin

Before you can configure an Analytics Engine powered by Apache Spark instance for watsonx.data on IBM Cloud, you must:

Procedure

Configure your Analytics Engine powered by Apache Spark instance with your watsonx.data instance:

  1. Generate an access token to set the Analytics Engine powered by Apache Spark instance default configuration. See Generating an API authorization token.

  2. Run the following API call to set the instance default configuration:

    curl -X PATCH --location --header "Authorization: ZenApiKey ${TOKEN}" --header "Accept: application/json" --header "Content-Type: application/merge-patch+json" --data '{
    <CONFIGURATION_DETAILS>
    }' "https://<CloudPakforData_URL>/v4/analytics_engines/<INSTANCE_ID>/default_configs"
    
  3. For <CONFIGURATION_DETAILS>, copy the following configuration details and substitute these values:

    • <hms-thrift-endpoint-from-watsonx.data>: The Hive Metastore (HMS) thrift endpoint of your watsonx.data instance, for example thrift://81823aaf-8a88-4bee-a0a1-6e76a42dc833.cfjag3sf0s5o87astjo0.databases.appdomain.cloud:32683.
    • <hms-user-from-watsonx.data>: The watsonx.data username, for example ibmlhapikey.
    • <hms-password-from-watsonx.data>: The watsonx.data password.

    {
    "spark.sql.catalogImplementation": "hive",
    "spark.driver.extraClassPath": "/opt/ibm/connectors/iceberg-lakehouse/iceberg-3.3.2-1.2.1-hms-4.0.0-shaded.jar",
    "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    "spark.sql.iceberg.vectorization.enabled": "false",
    "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.lakehouse.type": "hive",
    "spark.sql.catalog.lakehouse.uri": "<hms-thrift-endpoint-from-watsonx.data>",
    "spark.hive.metastore.client.auth.mode": "PLAIN",
    "spark.hive.metastore.client.plain.username": "<hms-user-from-watsonx.data>",
    "spark.hive.metastore.client.plain.password": "<hms-password-from-watsonx.data>",
    "spark.hive.metastore.use.SSL": "true",
    "spark.hive.metastore.truststore.type": "JKS",
    "spark.hive.metastore.truststore.path": "file:///opt/ibm/jdk/lib/security/cacerts",
    "spark.hive.metastore.truststore.password": "changeit"
    }
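
If you prefer to script the procedure rather than edit the JSON by hand, the following Python sketch shows one possible way to fill in the three placeholders and prepare the same PATCH request as the curl command. It is an illustration, not part of the product. The function names `build_watsonx_data_config` and `build_default_config_request`, the `thrift://` check, and the example host, instance ID, and token values are all assumptions made for this sketch. It also assumes that the request body is the flat JSON object of configuration properties. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_watsonx_data_config(hms_endpoint, hms_user, hms_password):
    """Return the configuration from step 3 with the three placeholders filled in."""
    # Assumption: the endpoint uses the thrift:// scheme, as in the documented example.
    if not hms_endpoint.startswith("thrift://"):
        raise ValueError("expected an HMS endpoint starting with thrift://")
    return {
        "spark.sql.catalogImplementation": "hive",
        "spark.driver.extraClassPath": "/opt/ibm/connectors/iceberg-lakehouse/iceberg-3.3.2-1.2.1-hms-4.0.0-shaded.jar",
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        "spark.sql.iceberg.vectorization.enabled": "false",
        "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
        "spark.sql.catalog.lakehouse.type": "hive",
        "spark.sql.catalog.lakehouse.uri": hms_endpoint,
        "spark.hive.metastore.client.auth.mode": "PLAIN",
        "spark.hive.metastore.client.plain.username": hms_user,
        "spark.hive.metastore.client.plain.password": hms_password,
        "spark.hive.metastore.use.SSL": "true",
        "spark.hive.metastore.truststore.type": "JKS",
        "spark.hive.metastore.truststore.path": "file:///opt/ibm/jdk/lib/security/cacerts",
        "spark.hive.metastore.truststore.password": "changeit",
    }


def build_default_config_request(base_url, instance_id, token, config):
    """Prepare (but do not send) the PATCH request from step 2."""
    url = f"{base_url.rstrip('/')}/v4/analytics_engines/{instance_id}/default_configs"
    # Assumption: the body is the flat JSON object of configuration properties.
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"ZenApiKey {token}",
            "Accept": "application/json",
            "Content-Type": "application/merge-patch+json",
        },
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only; read real credentials
    # from a secure source instead of hard-coding them.
    config = build_watsonx_data_config(
        "thrift://hms.example.com:9083", "ibmlhapikey", "example-password"
    )
    request = build_default_config_request(
        "https://cpd.example.com", "example-instance-id", "example-token", config
    )
    print(request.get_method(), request.full_url)
    # To actually send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it lets you inspect the URL, headers, and body before anything reaches the cluster.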
    
