Adding dependent packages

Optionally, add any additional packages that applications in the instance group require to run.

Before you begin

Ensure that you meet the requirements to create an instance group. See Prerequisites for an instance group.

About this task

When you add a dependent package and deploy an instance group, the package is uploaded to the repository and deployed to hosts of the instance group. As a result, the dependent files are already downloaded to the hosts and available when you start the instance group. By default, the Spark package is deployed to all resource groups; a container package with metadata is deployed to all resource groups except for notebook resource groups; and notebook packages are deployed to their own resource groups.
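
The default deployment targets described above can be sketched as a small lookup. This is an illustrative model only, not product code: the `ResourceGroup` class, the `default_targets` function, the package kind strings, and the sample resource group names are all hypothetical, and the sketch assumes that each notebook resource group is dedicated to one named notebook.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ResourceGroup:
    """Hypothetical model of a resource group in an instance group."""
    name: str
    notebook: Optional[str] = None  # set only for a notebook resource group


def default_targets(kind: str, groups: List[ResourceGroup],
                    notebook: Optional[str] = None) -> List[str]:
    """Return the names of the resource groups a package goes to by default."""
    if kind == "spark":
        # The Spark package goes to every resource group.
        return [g.name for g in groups]
    if kind == "container":
        # A container package with metadata skips notebook resource groups.
        return [g.name for g in groups if g.notebook is None]
    if kind == "notebook":
        # A notebook package goes only to that notebook's own resource group.
        return [g.name for g in groups if g.notebook == notebook]
    raise ValueError(f"unknown package kind: {kind!r}")


# Hypothetical sample layout: two compute groups and one notebook group.
groups = [
    ResourceGroup("ComputeHosts"),
    ResourceGroup("GpuHosts"),
    ResourceGroup("NotebookHosts", notebook="Jupyter"),
]
spark_targets = default_targets("spark", groups)
container_targets = default_targets("container", groups)
notebook_targets = default_targets("notebook", groups, notebook="Jupyter")
```

With this sample layout, the Spark package targets all three groups, the container package targets only the two non-notebook groups, and the notebook package targets only its own group.
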
Note:
  • When your cluster is installed to a local file system and the instance groups that share packages use different execution users, you must upload and deploy the package directly from the package repository.
  • When your cluster is installed to a shared file system, do not share packages between instance groups. If you still want to use the same package for multiple instance groups, use different package names and upload the package separately to each instance group. Or, upload and deploy the package directly from the package repository.

For information on adding a package to the repository, see Adding packages to the Service Packages repository or egodeploy.

You can add a package to an instance group either by selecting an existing package from the repository or by uploading one from your computer. As a convenience, you can also create a package from a single file.

Procedure

  • Follow these steps to add a package from the repository:
    1. In the Packages tab, click Add packages from the repository.
    2. Select the package. Packages that are registered to the top-level consumer, its children, and its parent consumers are available for selection. For example, if the top-level consumer test-123 is assigned to the root consumer and has two children, test-abc and test-xyz, you can choose packages that are registered to test-123 (top-level consumer), test-abc and test-xyz (child consumers of test-123), and root (the parent of test-123).
    3. Click Select.
  • Follow these steps to add from your local computer:
    1. Add a package or a file to be created as a package:
      • To add a package, in the Packages tab, drag the service package file to Upload Packages, or click this button and browse to the location of your package, select it, and then click Open.

        The system uploads the package to the repository and deploys it to all hosts in the instance group. If the package uses the same name as an existing repository package, the package in the repository is replaced.

      • To create a single-file package:
        1. Drag the file to Create Single-File Packages, or click this button and browse to the location of your file, select it, and then click Open.

          The system creates a package from the file, uploads it to the package repository, and then deploys it to all hosts in the instance group. If the package uses the same name as an existing repository package, the package in the repository is replaced.

          You can additionally set optional configuration for this single-file package using the remaining steps.

        2. Optional: By default, the system copies the file to $SPARK_HOME/jars on all hosts in the instance group. To change this destination, type a new path. The instance group execution user must have write permission to this path.
        3. Optional: Once the system copies the file to all hosts in the instance group, the default permission for the file is 644 (read-write permission by owner, and read-only for other users). To change this permission, type in a new numeric file permission value.
        4. Optional: By default, Decompress file is clear; you can select it only if the file is in an archive format. To decompress the file at deployment time, after the system copies the file to all hosts in the instance group, select this check box.
        5. Optional: By default, Append the destination path is clear. To automatically append the destination path to the Spark parameters spark.driver.extraClassPath and spark.executor.extraClassPath in your Spark configuration, select this check box.
    2. Optional: By default, Use $DEPLOY_HOME is selected, which indicates that the system uses the value of the $DEPLOY_HOME environment variable as the instance group's deployment directory during package deployment. If you do not want the system to use the environment variable's value for the directory, clear this check box.

      If you write custom service package scripts, and you added your package using the Add packages from the repository or Upload Packages options, you can use the $DEPLOY_HOME environment variable in those scripts.

    3. Optional: By default, Use $SPARK_HOME is selected, which indicates that the system uses the value of the $SPARK_HOME environment variable as the instance group's Spark home directory during package deployment. If you do not want the system to use the environment variable's value for the directory, clear this check box.

      If you write custom service package scripts, and you added your package using the Add packages from the repository or Upload Packages options, you can use the $SPARK_HOME environment variable in those scripts.
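
To make the optional single-file package settings in the procedure above more concrete, the following Python sketch decodes a numeric permission value such as 644 and shows what appending a destination path to the two Spark class path parameters could look like. It is a rough illustration, not the product's implementation: the helper names `describe_permission` and `append_destination` are hypothetical, and the sketch assumes a Linux-style `:` class path separator and that the path is appended exactly as typed.

```python
import stat

# Spark parameters named in the procedure.
CLASSPATH_KEYS = ("spark.driver.extraClassPath", "spark.executor.extraClassPath")


def describe_permission(octal_text: str) -> str:
    """Render a numeric permission such as '644' in ls-style notation."""
    mode = int(octal_text, 8)  # the value is octal, not decimal
    return stat.filemode(stat.S_IFREG | mode)


def append_destination(conf: dict, destination: str) -> dict:
    """Return a copy of conf with destination appended to both class paths.

    Assumption: entries are separated by ':' and duplicates are skipped.
    """
    updated = dict(conf)
    for key in CLASSPATH_KEYS:
        entries = [e for e in updated.get(key, "").split(":") if e]
        if destination not in entries:
            entries.append(destination)
        updated[key] = ":".join(entries)
    return updated


default_mode = describe_permission("644")
conf = append_destination({"spark.driver.extraClassPath": "/opt/libs"},
                          "/opt/extra/jars")
```

Here `default_mode` is `-rw-r--r--`, which matches the default described in the procedure: read-write for the owner and read-only for other users. The paths `/opt/libs` and `/opt/extra/jars` are placeholders.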

What to do next

  1. Optionally, to add data connectors to the instance group, see Adding data connectors.
  2. Create and deploy the instance group.
    • Click Create and Deploy Instance Group to create the instance group and deploy its packages simultaneously. In this case, the new instance group appears on the Instance Groups page in the Ready state. Verify your deployment and then start the instance group.
    • Click Create Only to create the instance group but manually deploy its packages later. In this case, the new instance group appears on the Instance Groups page in the Registered state. When you are ready to deploy packages, deploy the instance group and verify the deployment. Then, start the instance group.
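
The two creation paths above can be summarized as a small state table. This sketch is only a reading aid: the action names and the `next_state` function are hypothetical, the Registered and Ready states come from the text above, and the Started state after starting the instance group is an assumption.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    REGISTERED = "Registered"  # created, packages not yet deployed
    READY = "Ready"            # created and packages deployed
    STARTED = "Started"        # assumed name for a running instance group


# (current state, action) -> resulting state; None means "not created yet".
TRANSITIONS = {
    (None, "create_and_deploy"): State.READY,
    (None, "create_only"): State.REGISTERED,
    (State.REGISTERED, "deploy"): State.READY,
    (State.READY, "start"): State.STARTED,
}


def next_state(current: Optional[State], action: str) -> State:
    """Look up the state that follows an action, or fail if not allowed."""
    try:
        return TRANSITIONS[(current, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} from {current}") from None


# Create Only path: register first, deploy later, then start.
state = next_state(None, "create_only")
state = next_state(state, "deploy")
state = next_state(state, "start")
```

In this model, starting an instance group that is still in the Registered state raises an error, which mirrors the instruction to deploy and verify before starting.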