Using advanced features in Analytics Engine Powered by Apache Spark
Depending on how you configure Analytics Engine Powered by Apache Spark, the instances that you create can take advantage of advanced Spark features that support application deployment and monitoring.
Important: You must configure the Analytics Engine Powered by Apache Spark service to support the Spark advanced features before you create instances. The advanced features are not available in instances that were created before the features were enabled for the service; you must create a new instance after you enable or disable the advanced features.
These advanced features include:
- When a Spark service instance is created, a deployment space is associated with the instance. The deployment space contains all the job runs associated with that particular instance.
- You can check the Spark `application-id`, `job-id`, and the status and duration of a job from the IBM Cloud Pak for Data user interface.
- You can view and download logs from the user interface.
- You can upload the Spark application file or JAR as an asset; it is automatically added to the Spark classpath, which makes it available to the Spark job.
- You can also use the jobs dashboard to view jobs filtered by their status.
Enabling advanced features
Required services: The Common core services must be installed. See Shared cluster components.
Required role: You must be a Red Hat OpenShift administrator or OpenShift project administrator to make changes to the Analytics Engine custom resource (CR).
You can enable the advanced features for subsequently provisioned instances in one of two ways:
- Either by using the following patch command:

  ```
  oc patch AnalyticsEngine <analyticsengine-cr-name> --namespace ${PROJECT_CPD_INSTANCE} --type merge --patch '{"spec": {"serviceConfig":{"sparkAdvEnabled":true}}}'
  ```
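  To confirm that the patch took effect, you can read the property back from the CR. This is a minimal check that uses the standard `oc -o jsonpath` option, with the same `<analyticsengine-cr-name>` placeholder as above:

  ```
  # Read back the flag that the patch set; expected output: true
  oc get AnalyticsEngine <analyticsengine-cr-name> \
    --namespace ${PROJECT_CPD_INSTANCE} \
    -o jsonpath='{.spec.serviceConfig.sparkAdvEnabled}'
  ```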
- Or by carrying out the following steps:
  - Log in to the Cloud Pak for Data cluster.
  - Update the `spec.serviceConfig.sparkAdvEnabled` property in the Analytics Engine CR YAML file that was used to set up Analytics Engine Powered by Apache Spark. An example CR excerpt is shown after these steps.
  - Apply the changes to the existing deployed CR by using the following command:

    ```
    oc apply -f cr.yaml -n ${PROJECT_CPD_INSTANCE}
    ```

  - Wait for the Analytics Engine CR to be in the `Completed` state:

    ```
    oc get analyticsengine -n ${PROJECT_CPD_INSTANCE}
    ```
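For reference, the following is a minimal sketch of what the edited CR YAML (`cr.yaml`) might look like. Only the `spec.serviceConfig.sparkAdvEnabled` property is the setting described in these steps; the `apiVersion` and CR name shown here are illustrative assumptions, and your actual CR contains additional fields that you should leave unchanged.

```yaml
# Hypothetical excerpt of the Analytics Engine CR YAML (cr.yaml).
# Only spec.serviceConfig.sparkAdvEnabled is the documented setting;
# the apiVersion and metadata.name below are illustrative placeholders.
# Edit your existing CR rather than creating a new file from this sketch.
apiVersion: ae.cpd.ibm.com/v1
kind: AnalyticsEngine
metadata:
  name: analyticsengine-sample
spec:
  serviceConfig:
    sparkAdvEnabled: true
```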
The configuration changes might need a few minutes to take effect. All Spark instances that are subsequently created will have the Spark advanced features enabled.
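Because the operator reconciles the change asynchronously, you can poll until the CR reports the `Completed` state. The following is a minimal shell sketch; the `.status.analyticsengineStatus` field name and the `analyticsengine-sample` CR name are assumptions, so verify them on your cluster (for example, with `oc get analyticsengine -n ${PROJECT_CPD_INSTANCE} -o yaml`).

```
# Poll the CR until it reports Completed.
# The status field name and CR name below are assumptions; check the
# CR YAML on your cluster for the exact values.
while true; do
  state=$(oc get AnalyticsEngine analyticsengine-sample \
    -n ${PROJECT_CPD_INSTANCE} \
    -o jsonpath='{.status.analyticsengineStatus}')
  echo "Analytics Engine CR state: ${state}"
  [ "${state}" = "Completed" ] && break
  sleep 30
done
```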
Parent topic: Administering Analytics Engine Powered by Apache Spark