Configuring Analytics Engine
You can configure an IBM Analytics Engine instance to connect to the IBM® watsonx.data instance. Set the watsonx.data and Spark-related configurations as the default configuration for the IBM Analytics Engine instance.
This topic applies to watsonx.data on IBM Software Hub.
Configuring an Analytics Engine instance by using IBM Cloud console
- Log in to your IBM Cloud account.
- Access the IBM Cloud Resource list.
- Find your Analytics Engine instance and click the instance to see the details.
- Click to view the configuration.
- In the Default Spark configuration section, click Edit.
- Add the following configuration to the Default Spark configuration section:
Parameter values:{ "spark.sql.catalogImplementation": "hive", "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions", "spark.sql.iceberg.vectorization.enabled": "false", "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog", "spark.sql.catalog.lakehouse.type": "hive", "spark.sql.catalog.lakehouse.uri": "<public_IP_address>:<nodeport>", "spark.hive.metastore.client.auth.mode": "PLAIN", "spark.hive.metastore.client.plain.username": "<metastore_admin_user>", "spark.hive.metastore.client.plain.password": "<metastore_admin_password>", "spark.hive.metastore.use.SSL": "true", "spark.hive.metastore.truststore.type": "JKS", "spark.hive.metastore.truststore.path": "file:///home/spark/shared/user-libs/wxd-library/custom/truststore.jks", "spark.hive.metastore.truststore.password": "<trustsore_password>" }
<public_IP_address>:<nodeport>
- public_IP address and nodeport for the watsonx.data instance.<metastore_admin_user>
- watsonx.data cluster metastore-admin user.Note:To submit spark jobs to watsonx.data , you must have Admin or Metastore admin privilege in watsonx.data. The preferred one is Metastore admin. For more information, see Managing access to metastore.
<metastore_admin_password>
- watsonx.data cluster metastore-admin password.<trustsore_password>
-truststore.jks
password.
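With these defaults in place, Spark applications that run on the instance pick up the lakehouse catalog automatically. The following PySpark snippet is a minimal sketch of how an application might verify the connection; the schema and table names (demo_schema, demo_table) are placeholders for objects that already exist in your watsonx.data instance.
from pyspark.sql import SparkSession

# The instance default configuration already defines the Iceberg 'lakehouse'
# catalog, so no additional .config() calls are needed here.
spark = SparkSession.builder.appName("wxd-connection-check").getOrCreate()

# List the schemas that the watsonx.data metastore exposes through the catalog.
spark.sql("SHOW NAMESPACES IN lakehouse").show()

# Query an existing table; 'demo_schema.demo_table' is a placeholder name.
spark.sql("SELECT * FROM lakehouse.demo_schema.demo_table LIMIT 10").show()

spark.stop()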
Configuring an Analytics Engine instance by using Analytics Engine API
- Generate an IAM token to connect to the Analytics Engine API. For more information about how to generate an IAM token, see Granting permissions to users.
- Run the API to set the instance default configuration:
curl -X PATCH --location \
  --header "Authorization: Bearer {IAM_TOKEN}" \
  --header "Accept: application/json" \
  --header "Content-Type: application/merge-patch+json" \
  --data '{ <CONFIGURATION_DETAILS> }' \
  "{BASE_URL}/v3/analytics_engines/{INSTANCE_ID}/default_configs"
Parameter values:
IAM_TOKEN
- The IAM token that you generated for the Analytics Engine API.
INSTANCE_ID
- The Analytics Engine instance ID. For more information, see Obtaining the service endpoints using the IBM Cloud CLI.
CONFIGURATION_DETAILS
- The watsonx.data and Spark configuration to set as the instance default:
{
    "spark.sql.catalogImplementation": "hive",
    "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    "spark.sql.iceberg.vectorization.enabled": "false",
    "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.lakehouse.type": "hive",
    "spark.sql.catalog.lakehouse.uri": "<public_IP_address>:<nodeport>",
    "spark.hive.metastore.client.auth.mode": "PLAIN",
    "spark.hive.metastore.client.plain.username": "<metastore_admin_user>",
    "spark.hive.metastore.client.plain.password": "<metastore_admin_password>",
    "spark.hive.metastore.use.SSL": "true",
    "spark.hive.metastore.truststore.type": "JKS",
    "spark.hive.metastore.truststore.path": "file:///home/spark/shared/user-libs/wxd-library/custom/truststore.jks",
    "spark.hive.metastore.truststore.password": "<truststore_password>"
}
Configuring an Analytics Engine instance by using Analytics Engine CLI
To specify the configuration settings for your IBM Analytics Engine instance from the CLI, run the following command:
ibmcloud analytics-engine-v3 instance default-configs-update [--id INSTANCE_ID] --body BODY
Parameter values:
INSTANCE_ID
- The Analytics Engine instance ID. For more information, see Obtaining the service endpoints.
BODY
- Copy and paste the following configuration information:
{
    "spark.sql.catalogImplementation": "hive",
    "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    "spark.sql.iceberg.vectorization.enabled": "false",
    "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.lakehouse.type": "hive",
    "spark.sql.catalog.lakehouse.uri": "<public_IP_address>:<nodeport>",
    "spark.hive.metastore.client.auth.mode": "PLAIN",
    "spark.hive.metastore.client.plain.username": "<metastore_admin_user>",
    "spark.hive.metastore.client.plain.password": "<metastore_admin_password>",
    "spark.hive.metastore.use.SSL": "true",
    "spark.hive.metastore.truststore.type": "JKS",
    "spark.hive.metastore.truststore.path": "file:///home/spark/shared/user-libs/wxd-library/custom/truststore.jks",
    "spark.hive.metastore.truststore.password": "<truststore_password>"
}
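Because --body expects the whole configuration as a single JSON string, it can be convenient to build that string programmatically. The following Python sketch assembles the body and invokes the CLI through subprocess; it assumes that the ibmcloud CLI with the analytics-engine-v3 plug-in is installed and that you are already logged in, and the placeholder values must be replaced with your own.
import json
import subprocess

instance_id = "<INSTANCE_ID>"  # placeholder: your Analytics Engine instance ID

# Build the configuration body as a dictionary and serialize it to one JSON string.
body = {
    "spark.sql.catalogImplementation": "hive",
    "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.lakehouse.type": "hive",
    "spark.sql.catalog.lakehouse.uri": "<public_IP_address>:<nodeport>",
    # ... add the remaining watsonx.data properties from the configuration above ...
}

# Same command as shown above, with the body passed as a single argument.
subprocess.run(
    [
        "ibmcloud", "analytics-engine-v3", "instance", "default-configs-update",
        "--id", instance_id,
        "--body", json.dumps(body),
    ],
    check=True,
)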
After you configure the Analytics Engine instance, you can submit Spark applications. For more information, see Run Spark use case.