Updating instance groups to use a new Spark version

When a new (higher) Apache Spark version becomes available, add it to your system for use with IBM® Spectrum Conductor. You can then update existing instance groups to use the new Spark version.

Before you begin

  • You must be a cluster administrator, consumer administrator, or have the Instance Groups Configure permission.
  • The new Spark version that you want to associate with your existing instance groups must be installed on the system (see Adding Spark versions).
  • The instance group must be in the Registered, Ready, Register Error, or Deploy Error state. If the instance group is running workload, stop the instance group and all associated notebooks before you update it. See Stopping instance groups and Stopping notebooks in an instance group.
  • Upgrading the Spark version is a permanent change to your instance group. After the upgrade, you cannot roll the instance group back to the previous Spark version. As a best practice, before you upgrade a production instance group, back up necessary assets and test application compatibility with the new Spark version:
    1. The new Spark version must be higher than the current Spark version, and it must support all notebooks that are enabled and all data connectors that are configured for the existing instance group.
    2. Upgrading the Spark version for an existing instance group removes all the logs under the current instance group deployment directory. To retain these logs, back them up prior to upgrading the Spark version for the instance group.
    3. The Spark version upgrade can introduce new configuration parameters or new default values for existing parameters. Review the configuration for the new Spark version prior to upgrading an instance group to use it.
    4. If you deploy a notebook with its base data directory configured within the instance group deployment directory, the Spark version upgrade undeploys the current Spark version and removes that directory. Back up this directory to a location outside of the instance group deployment directory prior to upgrading the Spark version for the instance group. After you upgrade the Spark version for the instance group, you can restore this directory.

      To avoid this step during future Spark version upgrades, reconfigure the base data directory for all notebooks to a location outside of the instance group deployment directory. You can complete this at the same time as the Spark version upgrade for the instance group, or as a separate step before or after the upgrade.

    5. Create a test instance group to test your applications on the new Spark version. Either create a new instance group, or copy an existing one to a template and create a new instance group from that template.

      After you test and verify your applications on the new Spark version using the test instance group, you can stop the production instance group and configure it to use the new Spark version.
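
The state, version, and backup prerequisites above can be sketched as a small pre-upgrade helper. This is only an illustration, not part of IBM Spectrum Conductor: the function names and directory layout are hypothetical, the state and version values are assumed to be entered by hand from what the cluster management console shows, and versions are assumed to be dotted numeric strings such as 3.0.1.

```python
import tarfile
from pathlib import Path

# States in which the prerequisites allow a Spark version update.
UPDATABLE_STATES = {"Registered", "Ready", "Register Error", "Deploy Error"}


def version_key(version: str) -> tuple:
    """Turn a dotted version such as '3.0.1' into (3, 0, 1) for comparison."""
    return tuple(int(part) for part in version.split("."))


def can_update(state: str, current_version: str, new_version: str) -> bool:
    """Check the state prerequisite and that the new version is higher."""
    return (state in UPDATABLE_STATES
            and version_key(new_version) > version_key(current_version))


def backup_directory(source: Path, deploy_dir: Path, backup_root: Path) -> Path:
    """Archive `source` into `backup_root`, which must lie outside `deploy_dir`."""
    source = source.resolve()
    deploy_dir = deploy_dir.resolve()
    backup_root = backup_root.resolve()
    # A backup kept inside the deployment directory would be removed by the upgrade.
    if backup_root == deploy_dir or deploy_dir in backup_root.parents:
        raise ValueError("backup location must be outside the deployment directory")
    backup_root.mkdir(parents=True, exist_ok=True)
    archive = backup_root / f"{source.name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store entries under the directory's own name so they restore cleanly.
        tar.add(str(source), arcname=source.name)
    return archive
```

The same `backup_directory` call would serve for both the logs directory and a notebook base data directory, provided each is archived to a location outside the instance group deployment directory.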

Procedure

  1. From the cluster management console, select Workload > Instance Groups.
  2. Select the instance group that you want to update to the new Spark version, and click Configure.
  3. The Spark configuration list shows all Spark versions that are available to use with the instance group. (If newer Spark versions are available but are incompatible with the instance group's notebooks, data connectors, or both, the system displays a message to that effect.)

    Select the Spark version that you want the instance group to use.

  4. The system prompts you with a reminder that upgrading the Spark version for an existing instance group is a permanent change and describes the necessary testing and backup activities you should complete before upgrading. Review this information and select the check box to proceed.
  5. Click Modify Instance Group.

Results

Your instance group is now updated to use the newer Spark version.

What to do next

  1. Start the instance group. See Starting instance groups.
  2. Test your applications on the upgraded Spark version using the test instance group that you created before the upgrade. Once satisfied, stop the production instance group and configure it to use the new Spark version.
  3. If you backed up the base data directory to a location outside of the instance group deployment directory prior to upgrading the Spark version for the instance group, restore this directory for use with the upgraded Spark version.
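
Restoring a backed-up base data directory could look like the following minimal sketch. It assumes the backup is a .tar.gz archive whose entries sit under a single top-level directory; the function name and paths are hypothetical and are not part of IBM Spectrum Conductor.

```python
import tarfile
from pathlib import Path


def restore_directory(archive: Path, target_parent: Path) -> Path:
    """Extract a .tar.gz backup into `target_parent` and return the restored path."""
    target_parent = target_parent.resolve()
    target_parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        members = tar.getmembers()
        if not members:
            raise ValueError("archive is empty")
        # Refuse entries that would be written outside target_parent.
        for member in members:
            dest = (target_parent / member.name).resolve()
            if dest != target_parent and target_parent not in dest.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(target_parent, members=members)
        # The first path component of the first entry is the restored directory.
        top_level = members[0].name.split("/")[0]
    return target_parent / top_level
```

Choosing a `target_parent` outside of the instance group deployment directory would also avoid repeating the backup step in future Spark version upgrades.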