Installation requirements for IBM Spectrum LSF Suite for Enterprise

Learn about special installation requirements for your IBM Spectrum LSF Suite for Enterprise cluster. The different roles that hosts in your cluster can take are also described.

Installation prerequisites

IBM Spectrum LSF Suite for Enterprise uses an RPM installation program.

The RPM installer for IBM Spectrum LSF Suite for Enterprise deploys a fully functional cluster with minimal user input. It also allows administrators to easily apply updates to the cluster.

The IBM Spectrum LSF Suite for Enterprise RPM installation requires:
  • A computing environment that consists of two or more hosts
  • A host that has at least one network interface to act as the LSF management host
  • A supported operating system that is preinstalled on the LSF management host
  • Compute hosts that can be set to boot over a network

The hosts in your IBM Spectrum LSF Suite for Enterprise cluster must meet the minimum hardware and software requirements.

Host prerequisites

By default, IBM Spectrum LSF Suite for Enterprise is installed locally in the /opt/ibm directory on each host. Alternatively, the binaries and configuration files for hosts in the LSF_Servers and LSF_Clients roles can be placed on a shared directory.
Important: The shared directory cannot be mounted anywhere under /opt/ibm.
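
The mount-point rule above can be checked before installation. The following is a minimal sketch, not part of the installer: the function name is_valid_shared_dir and the example path are hypothetical, and the check compares path strings only, so it does not follow symbolic links.

```shell
# Hypothetical helper: succeed only if the directory is neither
# /opt/ibm itself nor anywhere beneath it.
is_valid_shared_dir() {
    # ${1%/} strips one trailing slash so "/opt/ibm/" is caught too.
    case "${1%/}" in
        /opt/ibm | /opt/ibm/*) return 1 ;;  # rejected: under /opt/ibm
        *) return 0 ;;
    esac
}

# Example (the path is illustrative only):
# is_valid_shared_dir /shared/lsf && echo "shared directory location is acceptable"
```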

For better performance, Elasticsearch components are always installed locally.

To install IBM Spectrum LSF Suite for Enterprise, host roles require the following free space for the installation files:
  • LSF_Master role requires 2 GB in /opt/ibm
  • LSF_Server role requires 1.5 GB in /opt/ibm
  • LSF_Clients role requires 300 MB in /opt/ibm
  • GUI_Hosts role requires 32 GB in /opt/ibm
  • DB_Host role requires 150 MB in the root directory (/)
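
The space requirements above can be verified on a host before you install. The following is a sketch under stated assumptions: the function names are hypothetical, GB and MB are read as binary units (1 GB = 1024 MB), and if the target directory does not exist yet, the nearest existing parent directory is measured instead.

```shell
# Free space needed per role, in KB, taken from the list above.
# Assumption: GB and MB are binary units (1 GB = 1024 MB).
required_kb() {
    case "$1" in
        LSF_Master)  echo $((2 * 1024 * 1024)) ;;
        LSF_Server)  echo $((1536 * 1024)) ;;
        LSF_Clients) echo $((300 * 1024)) ;;
        GUI_Hosts)   echo $((32 * 1024 * 1024)) ;;
        DB_Host)     echo $((150 * 1024)) ;;
        *) echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Directory whose file system must hold the installation files for a role.
install_dir() {
    case "$1" in
        DB_Host) echo / ;;
        *)       echo /opt/ibm ;;
    esac
}

# Available KB on the file system that holds a path (POSIX df output format).
available_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if this host has enough free space for the given role.
has_space_for_role() {
    need=$(required_kb "$1") || return 1
    dir=$(install_dir "$1")
    # Walk up to the nearest existing directory if the path is not there yet.
    while [ ! -e "$dir" ]; do
        dir=$(dirname "$dir")
    done
    have=$(available_kb "$dir")
    [ "$have" -ge "$need" ]
}

# Example:
# has_space_for_role GUI_Hosts || echo "not enough free space for GUI_Hosts"
```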

Large clusters should use SSDs for the database directory /var/lib/mysql and the Elasticsearch directory /opt/ibm/elastic/elasticsearch.

Keep track of the size of the following file systems:
  • /opt/ibm/lsfsuite/lsf/work contains job data.
  • /opt/ibm/lsflogs contains LSF logs.
  • /opt/ibm/elastic/elasticsearch contains Elasticsearch data and logs. This directory can get large.
  • The high-availability shared directory (HA_shared_dir) contains the LSF work and configuration directories.
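
One way to keep track of these directories is a small script that is run periodically. The following sketch is illustrative only: the function names and the 10 GB limit are arbitrary assumptions, and the high-availability shared directory is omitted because its path is site-specific.

```shell
# Size of a directory tree in KB (0 if the directory does not exist).
dir_size_kb() {
    if [ -d "$1" ]; then
        du -sk "$1" 2>/dev/null | awk '{ print $1 }'
    else
        echo 0
    fi
}

# Print each directory whose size exceeds the limit (in KB).
# Usage: report_large_dirs LIMIT_KB DIR...
report_large_dirs() {
    limit=$1
    shift
    for d in "$@"; do
        size=$(dir_size_kb "$d")
        if [ "${size:-0}" -gt "$limit" ]; then
            echo "$d: ${size} KB"
        fi
    done
}

# Example: report any tracked directory larger than 10 GB (arbitrary limit).
# report_large_dirs $((10 * 1024 * 1024)) \
#     /opt/ibm/lsfsuite/lsf/work /opt/ibm/lsflogs /opt/ibm/elastic/elasticsearch
```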

To achieve the highest degree of performance and scalability, use a powerful management host.

LSF has no minimum CPU requirement. On any system that LSF supports, a host with sufficient physical memory can act as the LSF management host. Swap space is normally configured as twice the physical memory. LSF daemons in a cluster on Linux x86-64 use about 488 MB of memory when no jobs are running. Active jobs use most of the memory that LSF requires.


Table 1. LSF management host requirements
Cluster size               Active jobs  Minimum required memory (typical)  Recommended server CPU
Small (<100 hosts)         1,000        1 GB (32 GB)                       Any server CPU
Small (<100 hosts)         10,000       2 GB (32 GB)                       Recent server CPU
Medium (100 - 1000 hosts)  10,000       4 GB (64 GB)                       Multi-core CPU (2 cores)
Medium (100 - 1000 hosts)  50,000       8 GB (64 GB)                       Multi-core CPU (4 cores)
Large (>1000 hosts)        50,000       16 GB (128 GB)                     Multi-core CPU (4 cores)
Large (>1000 hosts)        500,000      32 GB (256 GB)                     Multi-core CPU (8 cores)
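
Table 1 can be read as a lookup from cluster size and active jobs to minimum memory. The following sketch encodes it that way. It rests on two assumptions that the table does not state: each active-jobs figure is treated as the upper bound for its row, and a workload above the largest row for a cluster size is reported as not covered. The function name is hypothetical.

```shell
# Minimum required management host memory in GB, per Table 1.
# Usage: mgmt_min_memory_gb HOSTS ACTIVE_JOBS
# Returns non-zero when the table has no row that covers the workload.
mgmt_min_memory_gb() {
    hosts=$1
    jobs=$2
    if [ "$hosts" -lt 100 ]; then            # Small
        if [ "$jobs" -le 1000 ]; then echo 1
        elif [ "$jobs" -le 10000 ]; then echo 2
        else return 1
        fi
    elif [ "$hosts" -le 1000 ]; then         # Medium
        if [ "$jobs" -le 10000 ]; then echo 4
        elif [ "$jobs" -le 50000 ]; then echo 8
        else return 1
        fi
    else                                     # Large
        if [ "$jobs" -le 50000 ]; then echo 16
        elif [ "$jobs" -le 500000 ]; then echo 32
        else return 1
        fi
    fi
}

# Example: a 500-host cluster with up to 20,000 active jobs.
# mgmt_min_memory_gb 500 20000
```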


Table 2. GUI host requirements
Cluster size               Active jobs  Minimum required memory (typical)  Recommended server CPU
Small (<100 hosts)         1,000        16 GB (32 GB)                      Multi-core CPU (4 cores)
Medium (100 - 1000 hosts)  50,000       32 GB (64 GB)                      Multi-core CPU (8 cores)
Large (>1000 hosts)        500,000      96 GB (256 GB)                     Multi-core CPU (16 cores)

Operating system configuration

Installation on RHEL and CentOS has no operating system configuration prerequisites and does not require an internet connection. Installation on Ubuntu requires Ansible Version 2.2, RPM tools, and an internet connection to download any dependencies that are required during installation.

Correct host name resolution is essential. DNS or hosts files must be accurate and complete. All the hosts that are listed in the lsf-inventory file must be resolvable on all other hosts.
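
Because every host must resolve every other host, it helps to run the same resolution check on each machine. The following is a minimal sketch that assumes the getent command is available (it queries the system resolver, which covers both DNS and hosts files). The function names and example host names are hypothetical, and extracting the names from the lsf-inventory file is left out.

```shell
# Resolve one name or address through the system resolver.
resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# Print every argument that does not resolve.
# Succeeds only if all of them resolve.
unresolved_hosts() {
    status=0
    for h in "$@"; do
        if ! resolve "$h"; then
            echo "$h"
            status=1
        fi
    done
    return $status
}

# Example: run on each host, passing the names listed in lsf-inventory
# (the names below are placeholders).
# unresolved_hosts mgmt1.example.com compute1.example.com
```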

The operating system repository must be configured before installation so that installation dependency packages can be pulled in automatically. Each server must be able to use the OS repository. Passwordless SSH must be set up from the deployer host to every host in the cluster so that Ansible can log in to each host and deploy the cluster.

The primary LSF administrator is lsfadmin. The primary administrator owns the LSF configuration files and log files for job events. If the lsfadmin user does not exist, the installation creates the lsfadmin user account automatically as the default cluster administrator account. The account is created as a system account that does not expire, with a default UID of 495 (for Fix Pack 12 and earlier) or 1000321495 (for Fix Pack 13 and later). The installation can also use an existing lsfadmin account that you created in LDAP or another system.
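
Before installing, you might want to know whether lsfadmin already exists and which UID the installation would otherwise assign. The following sketch uses the standard id command. The function names are hypothetical, and the Fix Pack number is passed in by hand because this example does not try to detect it.

```shell
# Default lsfadmin UID for a given Fix Pack number, per the text above.
default_lsfadmin_uid() {
    if [ "$1" -le 12 ]; then
        echo 495
    else
        echo 1000321495
    fi
}

# Report whether lsfadmin already exists and, if so, its UID.
describe_lsfadmin() {
    if uid=$(id -u lsfadmin 2>/dev/null); then
        echo "lsfadmin exists with UID $uid"
    else
        echo "lsfadmin not found; the installation would create it"
    fi
}

# Example:
# describe_lsfadmin
# echo "default UID for Fix Pack 13: $(default_lsfadmin_uid 13)"
```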

The high-availability (HA) installation requires a shared directory (either NFS or IBM Spectrum Scale). You can also configure an optional external database in the HA configuration.

Run some simple tests to check host name resolution and OS repository availability.
  • The following command returns the IP address of the host:
    host <host_name>
  • The following command returns the host name:
    host <host_IP>
  • Use the ping command with either the host name or host IP address to make sure that the host is reachable.
  • Use the yum command to test the OS repository. The following command returns the list of configured repositories, which must include the OS repository:
    yum repolist
  • Run the yum update command. If you cannot run yum update, list any package that is shipped with the OS distribution instead:
    yum list mariadb
  • Check that the JDBC driver is available:
    yum list mysql-connector-java
  • Make sure that the global profile.d file is sourced by root on all hosts.
  • Check that passwordless SSH is set up from the deployer host to all hosts. If it is not, copy the deployer host's public key to each host:
    ssh-copy-id <host_name>
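
The ssh-copy-id command installs a key but does not confirm that a login without a password works. The following sketch probes each host non-interactively from the deployer host. The function names and example host names are hypothetical. The ssh options used are standard OpenSSH options: BatchMode=yes makes ssh fail instead of prompting, so a host whose host key is not yet trusted is also reported.

```shell
# Try a non-interactive login and run a no-op command on the remote host.
# -n keeps ssh from reading stdin; ConnectTimeout bounds the wait in seconds.
ssh_probe() {
    ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Print every host that cannot be reached without a password prompt.
hosts_without_passwordless_ssh() {
    for h in "$@"; do
        ssh_probe "$h" || echo "$h"
    done
}

# Example: run on the deployer host (the names below are placeholders).
# hosts_without_passwordless_ssh mgmt1.example.com compute1.example.com
```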

Network configuration and host roles

The most basic setup is a single corporate network to which every server connects through one NIC. Each node in the cluster must be able to resolve the host names of both the management and deployment hosts.

The following figure illustrates a basic network configuration:
[Figure: Basic network configuration for LSF Suites]

You can also have a configuration with multiple networks. For example, the LSF servers might be connected to a low-latency network for running MPI workloads over InfiniBand, with a private network for handling LSF workload. LSF clients can be configured on the corporate network so that users can submit jobs from them. The GUI hosts must be accessible on the corporate network. For installation, use the corporate network host name for the GUI hosts.

The following figure shows a more advanced network configuration.
[Figure: Advanced network configuration for LSF Suites]