Cluster properties

Before installing IBM® Spectrum Conductor, familiarize yourself with some key cluster properties.

Cluster administrator

The cluster administrator account must exist on every host in the cluster, with the same password on all hosts. Ideally, the password should not change frequently.

It is recommended that you create a new domain user account (named egoadmin) that is dedicated exclusively to administering and operating the cluster.
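As a quick pre-installation check, a sketch like the following reports whether the suggested egoadmin account already exists on the host it runs on (run it on every host; the account name is the recommendation from this section):

```shell
# Report whether the suggested cluster administrator account exists
# on this host; run the same check on every host in the cluster.
if id egoadmin >/dev/null 2>&1; then
    admin_state="exists"
else
    admin_state="missing"
fi
echo "egoadmin account: $admin_state"
```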

Installation directory

It is recommended that you use the default installation directory: /opt/ibm/spectrumcomputing. The same installation directory must be available on all hosts in the cluster.

The installation directory that you specify is saved in the $EGO_TOP environment variable. When the documentation refers to this environment variable, substitute the full path to the installation directory. The path to the installation directory must not contain spaces.
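For example, a minimal sanity check that a chosen directory meets the no-spaces rule (the path shown is the default from this section; substitute your own):

```shell
# Default installation directory; substitute your own choice.
# The path must not contain spaces.
EGO_TOP=/opt/ibm/spectrumcomputing
case "$EGO_TOP" in
    *" "*) path_ok="no";  echo "ERROR: installation path contains spaces" ;;
    *)     path_ok="yes"; echo "OK: $EGO_TOP" ;;
esac
```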

Ports and the cluster name

The default base port is 7869. IBM Spectrum Conductor requires seven consecutive ports starting from the base port; for example, 7869 to 7875. On all hosts in the cluster, you must have the same set of ports available.
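A quick way to confirm that the seven-port range is free on a host is a loop like the following (a sketch: the /dev/tcp connect test is a bash feature, and a failed connection is treated as a free port):

```shell
# List the seven consecutive ports starting at the base port and
# report whether anything is already listening on each one.
# The /dev/tcp test requires bash; on other shells every port
# reports as free.
BASE_PORT=7869
free_count=0
for p in $(seq "$BASE_PORT" $((BASE_PORT + 6))); do
    if (exec 3<>"/dev/tcp/127.0.0.1/$p") 2>/dev/null; then
        echo "port $p: in use"
    else
        echo "port $p: free"
        free_count=$((free_count + 1))
    fi
done
echo "free ports: $free_count of 7"
```

Run the same check on every host, because the full set of ports must be available cluster-wide.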

The default cluster name is cluster1. To specify your own unique cluster name, specify the value during installation. The cluster name is permanent; you cannot change it after installation. Do not use a valid host name as the cluster name.

Spark workload runs on non-management hosts in your cluster. As a result, the Apache Spark UI and RESTful APIs that are available from Spark applications and the Spark history server must be accessible to your end-users. This access is also required for any notebooks that you configure for use with IBM Spectrum Conductor.

If the hosts and the ports used are not accessible from your client machines, you might encounter errors when you access notebooks and Spark UIs. The management hosts must also be able to access these hosts and the ports used.

Startup and management

Do the following:
  • Enable automatic startup for ease of administration. You need root permission to enable this, as automatic startup is supported only if you installed IBM Spectrum Conductor as root.
  • Enable remote host management to start, stop, and restart other hosts in the cluster from your local host. To enable this, you must be able to run one of the following shells:
    • Remote shell (rsh) from management hosts to compute hosts, using the cluster administrator account.
    • Secure Shell (SSH) across all hosts in the cluster without having to enter a password.

    See your operating system documentation for information about configuring rsh or SSH. Additionally, to enable SSH for IBM Spectrum Conductor, see Enabling secure shell.

  • Grant root permissions to the cluster administrator with the egosetsudoers command. With egosetsudoers, the cluster administrator can start a local host in the cluster, and shut down or restart any host in the cluster from the local host (with remote management). You need root permission to enable this.
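Before you enable remote host management over SSH, you can verify passwordless access from the local host with a loop like the following (host1 and host2 are placeholder names; substitute your own management and compute hosts):

```shell
# Verify passwordless SSH from this host to each cluster host.
# host1 and host2 are placeholders; substitute your own host names.
HOSTS="host1 host2"
failed=0
for h in $HOSTS; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
        echo "$h: passwordless SSH OK"
    else
        echo "$h: passwordless SSH FAILED"
        failed=$((failed + 1))
    fi
done
echo "$failed host(s) failed"
```

BatchMode=yes makes ssh fail immediately instead of prompting for a password, which is exactly the behavior that remote host management requires.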

Failover

You can allow primary host failover. To enable failover, you must create a shared directory that is fully controlled by the cluster administrator and accessible from all management hosts. Dedicate at least two hosts exclusively for cluster management (the primary host and one other management host). You also need at least one compute host to execute instance groups.
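A simple check of the failover prerequisites on a management host might look like this (/share/ego is a hypothetical path; substitute your own shared directory):

```shell
# Hypothetical shared directory for primary host failover; substitute
# your own path. It must be owned by the cluster administrator and
# mounted on every management host.
SHARE_DIR=/share/ego
if [ -d "$SHARE_DIR" ]; then
    share_state="present"
    echo "OK: $SHARE_DIR exists, owned by $(stat -c %U "$SHARE_DIR")"
else
    share_state="missing"
    echo "MISSING: $SHARE_DIR (create it and chown it to the cluster administrator)"
fi
```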

If you choose not to allow failover, the primary host can be used to run instance groups.

Shared file system

You can choose to deploy your cluster to a shared file system, such as IBM Spectrum Scale, for storage and high availability. With a shared file system, you reduce your installation footprint by installing your cluster just once. You also save time when deploying an instance group, because its required packages are deployed only once.

To install your cluster to a shared file system, set the SHARED_FS_INSTALL environment variable during installation. Ensure that your users have access to the shared file system so that they can retrieve files.
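A shared file-system installation might be prepared along these lines. SHARED_FS_INSTALL is the variable named in this section; the other variable names and the installer file name are assumptions to confirm against the installation steps for your version:

```shell
# Run as root on the installation host. SHARED_FS_INSTALL enables the
# shared file-system installation described above; the remaining
# variables and the installer name are illustrative assumptions.
export SHARED_FS_INSTALL=Y
export CLUSTERADMIN=egoadmin     # cluster administrator account
export CLUSTERNAME=mycluster     # permanent cluster name
export BASEPORT=7869             # first of seven consecutive ports
./conductor<version>.bin         # placeholder installer file name
```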

IBM Spectrum Scale
If you have an existing installation of IBM Spectrum Scale, you can use the IBM Spectrum Scale file system for shared storage and high availability. IBM Spectrum Scale provides high availability through advanced clustering technologies, dynamic file system management, and data replication. For more information, refer to IBM Spectrum Scale documentation.
Note: IBM Spectrum Scale is not mandatory for high availability with IBM Spectrum Conductor. You can use any supported shared file system, including NFS mounts to a file system. For a list of supported file systems, see Supported file systems.
Hadoop connector for IBM Spectrum Scale
If you have an IBM Spectrum Scale File Placement Optimizer (FPO) cluster and want to leverage data locality, you can install the Hadoop connector to enable Hadoop over IBM Spectrum Scale. For information on installing the Hadoop connector, see the IBM Spectrum Scale Hadoop Connector wiki.
The recommended Hadoop connector version is 2.7.0-8. When using versions earlier than 2.7.0-8, configure the gpfs.supergroup parameter in your core-site.xml to avoid permission issues with Zeppelin notebooks. This setting adds the user group of the Zeppelin notebook service's execution user as the super group.
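The gpfs.supergroup setting is a standard Hadoop configuration property entry in core-site.xml. A sketch, where the group name egoadmin is only an example of the Zeppelin execution user's group:

```xml
<!-- core-site.xml: add the user group of the Zeppelin notebook
     service's execution user as the super group.
     "egoadmin" is an example group name. -->
<property>
  <name>gpfs.supergroup</name>
  <value>egoadmin</value>
</property>
```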

Database choices

Use a commercial database to store reporting data. Contact your database administrator and set up the external database as soon as possible after installation.

If you choose to enable the non-production database (Derby), choose the primary host or a management host as the database host.

IBM Spectrum Conductor on Linux 64-bit hosts requires a database to support the following functions on the cluster management console:
  • Generate reports.
  • Display the Rack View.
If you do not require these functions, you can manually disable the individual data loaders.

Elastic Stack

IBM Spectrum Conductor integrates Elasticsearch, Logstash, and Beats (the Elastic Stack) to provide data analytics capabilities. If required, modify the default port numbers for these services during installation and ensure that the port numbers are the same on all hosts. You can also choose to change the default directory for log files, which are used to harvest information and support queries.

SSL

By default, web server communication in your cluster is secured with SSL to protect connections between clients and servers. You can optionally disable SSL communication for a non-production environment. Enabling SSL communication is highly recommended when the integrity and confidentiality of data transmission are essential.

For web server communication over SSL, ensure that OpenSSL 1.0.1 or higher is installed on your hosts.
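To confirm the requirement on a host, report the installed OpenSSL version (a sketch; it only prints what is found):

```shell
# Report the installed OpenSSL version; 1.0.1 or later is required
# for web server communication over SSL.
if command -v openssl >/dev/null 2>&1; then
    ssl_version=$(openssl version)
else
    ssl_version="OpenSSL not found"
fi
echo "$ssl_version"
```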

The web servers are accessible on the following default ports:
  • Web server for the cluster management console: 8443 with SSL; 8080 without SSL
  • REST web server, which hosts the RESTful APIs for resource management and package deployment: 8543 with SSL; 8180 without SSL
  • ascd web server, which hosts the RESTful APIs for instance group management: 8643 with SSL; 8280 without SSL
Check for port conflicts to ensure that the specified ports are free. If you disable SSL, ensure that the HTTP ports are not used by any other service.
Important: You must use the same SSL setting for the cluster management console and the RESTful web servers: if you enable SSL for one, enable it for the other; if you disable it for one, disable it for the other. This setting also applies to cloud bursting with host factory. Configure SSL consistently for all of these functions across the cluster; an inconsistent configuration causes errors. When SSL is uniformly enabled, however, you can use different certificates and keys as required.
With SSL enabled, take into account the following important considerations:
  • When you access the cluster management console for the first time, you are prompted about trusting an untrusted website. Ensure that you install the certificate rather than merely accept it. Installing the certificate makes it available to both the cluster management console and the RESTful web servers, which require the same certificate.
    Failure to install the certificate causes the following issues:
    • You are prompted to accept the certificate when you access the API web servers as well. A certificate that is only accepted, rather than installed, eventually expires.
    • You might get errors when you use the Spark workload pages on the cluster management console, because your browser tries to call the corresponding APIs and access is blocked without the certificate.
  • For functional certificates to be generated, ensure that the domain name of the cluster management console and the REST hosts do not start with a number.
  • All hosts in your cluster must use fully qualified domain names (FQDNs). The Spark workload pages connect to the RESTful API servers based on the URL returned by the egosh client view command, which shows the location of each web server.

    Without FQDNs, the cluster management console cannot contact the REST servers, or cookies that are saved when you log in to those REST servers might not be associated with the correct FQDN. In that case, you might be prompted to log in even though you are already logged in to the cluster management console, because the system is asking you to log in to the REST servers.

    FQDNs are required for all functions that use RESTful API servers, including cloud bursting through host factory.
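A host's name can be checked for full qualification with a sketch like this (an FQDN contains at least one dot; run it on every host):

```shell
# Check whether this host reports a fully qualified domain name.
name=$(hostname -f 2>/dev/null || hostname)
case "$name" in
    *.*) fqdn_ok="yes"; echo "OK: $name is fully qualified" ;;
    *)   fqdn_ok="no";  echo "WARNING: $name is not an FQDN" ;;
esac
```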