What's new and changed in Analytics Engine powered by Apache Spark

Analytics Engine powered by Apache Spark updates can include new features and fixes. Releases are listed in reverse chronological order so that the latest release is at the beginning of the topic.

You can see a list of the new features for the platform and all of the services at What's new in IBM Software Hub.

IBM Cloud Pak for Data Version 5.1.3

A new version of Analytics Engine powered by Apache Spark was released in April 2025.

This release includes the following changes:

Customer-reported issues fixed in this release
For a list of customer-reported issues that were fixed in this release, see the Fix List for IBM Cloud Pak® for Data on the IBM Support website.

IBM Cloud Pak for Data Version 5.1.2

A new version of Analytics Engine powered by Apache Spark was released in March 2025.

This release includes the following changes:

New features
This release of Analytics Engine powered by Apache Spark includes the following features:
Run applications on Spark 3.5
You can now use Spark version 3.5.4 to run your applications in Analytics Engine powered by Apache Spark. For more information, see Submitting Spark jobs via API.
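As a rough illustration, a job submission that selects the new runtime might use a request body like the one built below. This is a minimal sketch, not the documented API: the field names (`application_details`, `application`, `arguments`, `runtime`, `spark_version`), the version string `"3.5"`, and the application path are assumptions made for illustration. Check Submitting Spark jobs via API for the exact payload that your release expects.

```python
import json


def build_job_payload(application, arguments=None, spark_version="3.5"):
    """Build a job-submission request body as a plain dict.

    All field names here are assumed for illustration; they are not
    confirmed by this topic.
    """
    details = {
        "application": application,
        "arguments": list(arguments or []),
        # Assumption: the runtime is selected with a major.minor string.
        "runtime": {"spark_version": spark_version},
    }
    return {"application_details": details}


# Hypothetical application path and argument, for illustration only.
payload = build_job_payload("/myapp/wordcount.py", ["/myapp/input.txt"])
body = json.dumps(payload, indent=2)
print(body)
```

The sketch only constructs and prints the JSON body. Sending it requires the jobs endpoint and an access token for your Spark instance, which are not shown here.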
Customer-reported issues fixed in this release
For a list of customer-reported issues that were fixed in this release, see the Fix List for IBM Cloud Pak for Data on the IBM Support website.

IBM Cloud Pak for Data Version 5.1.1

A new version of Analytics Engine powered by Apache Spark was released in February 2025.

This release includes the following changes:

New features
This release of Analytics Engine powered by Apache Spark includes the following features:
Spark UI and Spark history server available on remote physical locations
You can view the status of the Spark workloads on remote clusters by using the history server of your Spark instance.
Run Spark workloads interactively on remote physical locations
You can run Spark applications interactively by using the Kernel API on remote clusters.
Customer-reported issues fixed in this release
For a list of customer-reported issues that were fixed in this release, see the Fix List for IBM Cloud Pak for Data on the IBM Support website.

IBM Cloud Pak for Data Version 5.1.0

A new version of Analytics Engine powered by Apache Spark was released in December 2024.

This release includes the following changes:

New features
This release of Analytics Engine powered by Apache Spark includes the following features:
Automatic daily database snapshot backups
Analytics Engine powered by Apache Spark now automatically backs up the metastore database every day. Administrators can restore the database from these snapshots.
Improved flexibility when managing Spark environment variables
When configuring your Spark environment variables, you can now decide whether your changes apply to:
  • All Spark instances and jobs
  • A single instance of the Analytics Engine
  • An individual Spark job
For more information, see Spark jobs API.
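One way to picture the three scopes listed above is as layered settings that are merged before a job runs. The sketch below assumes that a more specific scope overrides a broader one (job over instance, instance over all instances and jobs). This topic does not state the precedence, so treat that ordering, the function name `resolve_env`, and the variable names as hypothetical.

```python
def resolve_env(global_env, instance_env, job_env):
    """Merge environment variables from the broadest scope to the narrowest.

    Assumption: later (more specific) scopes override earlier ones.
    """
    merged = {}
    for scope in (global_env, instance_env, job_env):
        merged.update(scope)
    return merged


# Hypothetical variables, for illustration only.
global_env = {"LOG_LEVEL": "INFO", "DATA_DIR": "/data"}
instance_env = {"LOG_LEVEL": "WARN"}
job_env = {"DATA_DIR": "/data/job42"}

effective = resolve_env(global_env, instance_env, job_env)
print(effective)
```

Under the stated assumption, the job in this example would see `LOG_LEVEL` from the instance scope and `DATA_DIR` from the job scope.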
IBM Power (ppc64le) supports Spark with R 4.3
Spark with R 4.3 is supported on IBM Power (ppc64le) starting in Version 5.1.
Schedule Spark workloads on remote physical locations
You can now install Analytics Engine powered by Apache Spark on a remote physical location so that you can run Spark workloads on remote clusters. This capability is not enabled by default.
Customer-reported issues fixed in this release
For a list of customer-reported issues that were fixed in this release, see the Fix List for IBM Cloud Pak for Data on the IBM Support website.
Deprecated features
The following feature was removed in this release:
Spark 3.3 is removed
Apache Spark 3.3, which was deprecated in an earlier release, has been removed and is no longer available in Analytics Engine powered by Apache Spark. Upgrade your applications to Spark 3.4.