System requirements and recommendations

When planning your installation, take the hardware requirements into account. IBM® Spectrum Conductor is supported on x86-based servers and IBM PowerLinux servers.

The following tables list the minimum system requirements for running IBM Spectrum Conductor in both evaluation and production environments. You might need additional resources (such as more CPU and RAM) depending on the instance groups that run on the hosts, especially on compute hosts that run workload.

Tip: We suggest that you disable real-time anti-virus software and any defragmentation software. These tools cause poor performance and instability, especially on management hosts, and can create problems if they lock files while scanning them. Instead, schedule virus scanning during cluster downtime.

Hardware requirements for evaluation

Table 1. Minimum hardware requirements for evaluation
Requirement | Management hosts | Compute hosts
CPU power | At least 2.4 GHz | At least 2.4 GHz
RAM | 24 GB | 8 GB
Disk space to install | 12 GB | 12 GB
Additional disk space (for instance group packages, logs, etc.) | Can be 30 GB for a large cluster | 1 GB * N slots + sum of service package sizes (including dependencies)

Hardware requirements for production

Table 2. Minimum hardware requirements for production
Requirement | Management hosts | Compute hosts | Notes
Cores used | 8 | 1 or more | Choose a modern processor with multiple cores. Common clusters use two to eight core machines. When it comes to choosing between faster CPUs or more cores, it is recommended to choose hosts with more cores.
CPU power | At least 2.4 GHz | At least 2.4 GHz |
RAM | 64 GB | 32 GB | In general, the more memory your hosts have, the better performance is.
Disk space to install | 12 GB | 12 GB | Use at least this amount of disk space for installation. The file system must not be mounted with the nosuid option if you want to set setuid to root for the cluster administrator user.
Additional disk space (for instance group packages, logs, etc.) | Can be 30 GB for a large cluster | 1 GB * N slots + sum of service package sizes (including dependencies) | Disk space requirements depend on the number of instance groups and the Spark applications that you run. Long running applications, such as notebooks and streaming applications, can generate huge amounts of data that is stored in Elasticsearch. What your applications log can also increase disk usage. Consider all these factors when estimating disk space requirements for your production cluster. For optimal performance, look at tuning how long to keep application monitoring data based on your needs.
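
As a rough illustration of the compute host formula for additional disk space (1 GB per slot plus the sum of the service package sizes), the following sketch estimates the figure for a single host. The function name, the slot count, and the package sizes are hypothetical and are not part of IBM Spectrum Conductor. The result does not include the space used by application logs or monitoring data.

```python
def compute_host_extra_disk_gb(slots, package_sizes_gb):
    """Estimate additional disk space (in GB) for one compute host.

    Applies the rule of thumb from the requirements tables:
    1 GB per slot + sum of service package sizes (including dependencies).
    """
    if slots < 0:
        raise ValueError("slots must be non-negative")
    if any(size < 0 for size in package_sizes_gb):
        raise ValueError("package sizes must be non-negative")
    return 1.0 * slots + sum(package_sizes_gb)


# Hypothetical host: 16 slots and three service packages (sizes in GB).
estimate = compute_host_extra_disk_gb(16, [2.5, 1.0, 0.5])
print(f"Additional disk space: {estimate:.1f} GB")  # 16 + 4.0 = 20.0 GB
```

Treat the result as a lower bound and add headroom for logs and long-running applications.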

Software requirements

IBM Spectrum Conductor 2.4.1 supports GPU applications on Red Hat Enterprise Linux and Linux on POWER compute hosts, in clusters whose primary hosts run any supported Linux or Linux on POWER operating system.
  • Supported GPU compute compatibility levels: 1.0, 1.1, 1.2, 1.3, 2.0, or later.
  • Supported CUDA APIs:
    • Red Hat Enterprise Linux 6.4 and later supports CUDA 8.0 or later.
    • POWER8 LE with Red Hat Enterprise Linux 7.2 supports CUDA 8.0 or later.

User access of client machines to cluster hosts

Spark workload runs on non-management hosts in your cluster. As a result, the Apache Spark UI and RESTful APIs that are available from Spark applications and the Spark history server must be accessible to your end users. This access is also required for any notebooks that you configure for use with IBM Spectrum Conductor.

If the hosts and the ports used are not accessible from your client machines, you can encounter errors when you access notebooks and Spark UIs. The management hosts must also be able to access these hosts and the ports used.
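
One way to confirm this access before users run into errors is to test TCP connectivity from a client machine (and from a management host) to each host and port. The following is a minimal sketch that uses only the Python standard library. The helper names are hypothetical, and the host names and ports in the usage comment are placeholders: substitute the hosts and ports that your cluster actually uses.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_endpoints(endpoints, timeout=3.0):
    """Return the (host, port) pairs that could not be reached."""
    return [
        (host, port)
        for host, port in endpoints
        if not port_is_reachable(host, port, timeout)
    ]


# Example usage with placeholder hosts and ports (assumptions, not defaults):
#
#   failed = unreachable_endpoints([
#       ("compute1.example.com", 4040),
#       ("compute2.example.com", 18080),
#   ])
#   for host, port in failed:
#       print(f"Cannot reach {host}:{port}")
```

A successful TCP connection shows only that the port is open from that machine. It does not prove that the Spark UI or notebook behind it is working correctly.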

Heap considerations for the Elastic Stack

The default Elasticsearch installation uses a 10 GB heap for the Elasticsearch services and a 4 GB heap for the Logstash service, which fits within the 24 GB of RAM listed in the IBM Spectrum Conductor system requirements. If your hosts have more than 24 GB of memory and you need to increase the heap (for example, for system performance reasons), you can increase the Elasticsearch and Logstash heap sizes in IBM Spectrum Conductor as described in "Tuning the heap sizes for Elasticsearch and Logstash to accommodate heavy load".
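
To make the arithmetic concrete, the following sketch computes how much RAM remains for everything else on a host after the Elasticsearch and Logstash heaps are reserved. The function name is hypothetical, and the sketch assumes that the 10 GB and 4 GB figures are the total heap for each component on the host. It is a planning aid only and does not read or change any IBM Spectrum Conductor configuration.

```python
def ram_left_after_heaps_gb(total_ram_gb, elasticsearch_heap_gb=10, logstash_heap_gb=4):
    """Return the RAM (in GB) left on a host after reserving both heaps.

    Defaults reflect the default heap sizes stated above:
    10 GB for Elasticsearch and 4 GB for Logstash.
    """
    heap_total = elasticsearch_heap_gb + logstash_heap_gb
    if heap_total > total_ram_gb:
        raise ValueError(
            f"Heaps need {heap_total} GB but the host has only {total_ram_gb} GB"
        )
    return total_ram_gb - heap_total


# Default heaps on a host with the 24 GB evaluation minimum.
print(ram_left_after_heaps_gb(24))  # 24 - (10 + 4) = 10
```

If you raise the heap sizes on a larger host, rerun the calculation to check that enough memory remains for the operating system and the other cluster services.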