Using IBM® watsonx.data, you can add Spark engines.
You can either provision a native Spark engine or register an external Spark engine. A native Spark
engine is a compute engine that resides within watsonx.data. External Spark engines are engines that
exist in a different environment from the one where watsonx.data is available.
watsonx.data on IBM Software Hub
About this task
To add a Spark engine, complete the following steps.
Procedure
- Log in to the watsonx.data console.
- From the navigation menu, select Infrastructure Manager.
- To add a Spark engine, click Add component and click Next.
- On the Add component page, from the Engines section, select IBM Spark.
- On the Add component - IBM Spark page, configure the following details:
Important: Co-located and self-managed Spark engines are deprecated in the 2.0.2 release
and will not be available from the 2.0.3 release onwards. Use the native Spark engine for Spark use
cases. To start using the native Spark engine, see Native Spark engine.
| Field | Description |
| --- | --- |
| Display name | Enter your compute engine name. |
| Registration mode | Based on your requirement, select one of the following options. Create a native Spark engine: The native Spark engine is a compute engine that resides within watsonx.data. If you select this option, see Provisioning native Spark engine to provision the native Spark engine. Register an external Spark engine: The Spark and watsonx.data instances are located in different clusters. For example, your Spark instance is provisioned on IBM Cloud, and watsonx.data is installed on your computer. Register a co-located Spark engine (deprecated): The Spark and watsonx.data instances are located in the same cluster. |
| Instance | If you selected Register a co-located Spark engine (deprecated) as the Registration mode, select the Spark instance that is co-located with watsonx.data from the list. If you do not have an instance, click Create one to create it. |
| Management method | If you selected Register an external Spark engine as the Registration mode, select the appropriate management method. Fully-managed: Indicates that the Spark instance is owned and managed by IBM Cloud. Self-managed (deprecated): Indicates that the instance is an IBM Analytics Engine Spark instance on a Cloud Pak for Data cluster. |
| Instance API endpoint | If you selected Register an external Spark engine as the Registration mode and Fully-managed as the Management method, enter the IBM Analytics Engine instance endpoint. For more information, see Retrieving service endpoints. |
| API key | If you selected Register an external Spark engine as the Registration mode and Fully-managed as the Management method, enter the API key. |
| Spark jobs V4 endpoint | If you selected Register an external Spark engine as the Registration mode and Self-managed (deprecated) as the Management method, enter the self-managed IBM Analytics Engine endpoint details. |
| ZenApiKey | If you selected Register an external Spark engine as the Registration mode and Self-managed (deprecated) as the Management method, enter the API details for the self-managed instance. |
- Click Create. The engine is provisioned and is displayed on the Infrastructure Manager page.
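
The fields to complete depend on the Registration mode and, for external engines, on the Management method. The following Python sketch is one way to summarize that dependency as a checklist before you open the form. It is an illustration only: the function names and constants are hypothetical, the strings are the console labels from the table in this topic, and nothing here calls a watsonx.data API.

```python
# Hypothetical constants that mirror the console labels described in this topic.
NATIVE = "Create a native Spark engine"
EXTERNAL = "Register an external Spark engine"
CO_LOCATED = "Register a co-located Spark engine (deprecated)"

FULLY_MANAGED = "Fully-managed"
SELF_MANAGED = "Self-managed (deprecated)"


def required_fields(registration_mode, management_method=None):
    """Return the form fields that apply to a given selection.

    This is an assumed helper for planning purposes; it only encodes
    the conditions listed in the Field/Description table.
    """
    fields = ["Display name", "Registration mode"]

    if registration_mode == NATIVE:
        # Remaining settings are covered by "Provisioning native Spark engine".
        return fields

    if registration_mode == CO_LOCATED:
        return fields + ["Instance"]

    if registration_mode == EXTERNAL:
        fields.append("Management method")
        if management_method == FULLY_MANAGED:
            return fields + ["Instance API endpoint", "API key"]
        if management_method == SELF_MANAGED:
            return fields + ["Spark jobs V4 endpoint", "ZenApiKey"]
        raise ValueError("An external Spark engine needs a management method")

    raise ValueError(f"Unknown registration mode: {registration_mode!r}")


def is_deprecated(registration_mode, management_method=None):
    """True for the options marked as deprecated in the 2.0.2 release."""
    if registration_mode == CO_LOCATED:
        return True
    return registration_mode == EXTERNAL and management_method == SELF_MANAGED


if __name__ == "__main__":
    for name in required_fields(EXTERNAL, FULLY_MANAGED):
        print(name)
```

Calling `is_deprecated` before you choose an option is a quick reminder that co-located and self-managed registrations are not available from the 2.0.3 release onwards.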