What's new and changed in Execution Engine for Apache Hadoop

The Execution Engine for Apache Hadoop release and subsequent refreshes can include new features, bug fixes, and security updates. Refreshes appear in reverse chronological order, and only the refreshes that contain updates for Execution Engine for Apache Hadoop are shown.

You can see a list of the new features for the platform and all of the services at What's new in IBM® Cloud Pak for Data.

Installing or upgrading Execution Engine for Apache Hadoop

Ready to install or upgrade Execution Engine for Apache Hadoop?

Refresh 2 of Cloud Pak for Data Version 3.5

A new version of Execution Engine for Apache Hadoop was released in January 2021.

Assembly version: 3.5.1

This release includes the following changes:

New features

You must install Version 3.5.1 of the Execution Engine for Apache Hadoop service if you want to install the service on Red Hat® OpenShift® 4.6.

In addition, this release includes the following features and updates:

Support for Cloudera 7.1.x
Execution Engine for Apache Hadoop Version 3.5.1 supports the Cloudera 7.1.x platform.
Bug fixes
  • Issue: RStudio® Server with R 3.6 includes a new version of Sparklyr that doesn't work with the HadoopLibUtils library, which is provided for connecting to remote Hadoop clusters through Livy. The following error occurs: Error: Livy connections now require the Spark version to be specified.

    Resolution: Sparklyr now works with the HadoopLibUtils library as expected.

Initial release of Cloud Pak for Data Version 3.5

A new version of Execution Engine for Apache Hadoop was released as part of Cloud Pak for Data Version 3.5.

Assembly version: 3.5.0

This release includes the following changes:

New features
Integration with IBM Spectrum® Conductor with Spark clusters
IBM Spectrum Conductor with Spark is now supported. You can integrate IBM Spectrum Conductor with Spark and Watson™ Studio by using Jupyter Enterprise Gateway endpoints. Users can open a notebook in Watson Studio to access Jupyter Enterprise Gateway instances that are running on IBM Spectrum Conductor with Spark. For details, see Spectrum environments.
New configurations that allow you to use your own certificates
The new configurations let you customize DSXHI in the following ways:
  • Provide a custom keystore that is used to generate the required .crt file.
  • Provide a custom truststore (CACERTS) to which the DSXHI certificates are added.
  • Choose whether to add the host certificate to the truststore yourself or to have DSXHI add it.

For details, see Installing the Execution Engine for Apache Hadoop service on Apache Hadoop clusters or on Spectrum Conductor clusters.

Support for additional types of security
Execution Engine for Apache Hadoop supports:
  • The JSON Web Token (JWT) to Kerberos delegation token provider, which provides authentication to HiveServer2, HDFS, and Hive Metastore (HMS) resources. For details, see Using delegation token endpoints.
  • Updated versions of Jupyter Enterprise Gateway (2.3) and Knox (1.4).
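
As an illustration of the token side of the first item above, the following minimal Python sketch shows how a JSON Web Token (JWT) is structured (header.payload.signature) and how its claims can be inspected before the token is presented to a delegation token endpoint. It is not the service's implementation: the function name decode_jwt_claims, the sample token, and its claim values are hypothetical, and the sketch reads the claims without verifying the signature.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # base64url encoding in JWTs omits padding, so restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict as unpadded base64url JSON (helper for the sample token)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical, unsigned sample token; a real token is issued by the platform.
sample_token = ".".join([
    _b64url({"alg": "none", "typ": "JWT"}),
    _b64url({"sub": "analyst1", "exp": 1700000000}),
    "signature-placeholder",
])

claims = decode_jwt_claims(sample_token)
print(claims["sub"], claims["exp"])
```

Inspecting the subject (sub) and expiry (exp) claims in this way can help when troubleshooting why a token is rejected. The signature must still be validated by the receiving service.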
Improved validation
The system_check.py scripts were introduced to validate your Hadoop configuration.