September 2, 2021 By Chih-Chieh Yang
Ziji Zhang
4 min read

In this post, we demonstrate how to run an HPC workload in an IBM Spectrum LSF cluster on IBM Cloud. 

We assume that you have an IBM Cloud account and that you have already created an IBM Spectrum LSF cluster through the IBM Cloud Catalog. If you haven’t, please follow the instructions in this previous blog post first to create a cluster.

Step 1: Log in to the master node of your cluster

Refer to the tutorial if you don’t know how to get the IP addresses needed for the ssh command:

ssh -J root@ip-jumphost lsfadmin@ip-masternode


Step 2: Navigate to the shared folder

After logging into the master node, you can see a folder named shared. This allows access to the shared NFS of the cluster. All files placed under this folder will be accessible to all the worker nodes. This is an ideal location for placing the application binaries and software libraries so that the jobs we submit can have the same environment for their work.

Let’s create a subfolder demo in the shared folder for everything that we need in this demo:

mkdir -p shared/demo
cd shared/demo


Step 3: Build and install LAMMPS

We choose LAMMPS as an example workload. The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a classical molecular dynamics (MD) code with a focus on materials modeling. It is widely used in many research fields, including materials science, biomedical engineering and chemical engineering.

First, we set the environment variables to include the MPI executables and library shared objects:

export PATH=$PATH:/usr/local/openmpi-4.1.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/openmpi-4.1.0/lib
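The same two export lines are repeated in later steps, so running them more than once in one session appends duplicate entries. As a minimal sketch, assuming a POSIX-style shell, a small helper can append a directory only when it is not already listed. The function name append_once is hypothetical and is not part of LSF or Open MPI:

```shell
# Hypothetical helper: append directory $2 to the colon-separated list stored
# in the variable named $1, unless the directory is already listed there.
append_once() {
    var_name=$1
    dir=$2
    eval "current=\${$var_name:-}"
    case ":$current:" in
        *":$dir:"*) ;;    # already listed, nothing to do
        *)
            if [ -n "$current" ]; then
                eval "export $var_name=\"\$current:\$dir\""
            else
                eval "export $var_name=\"\$dir\""
            fi
            ;;
    esac
}

append_once PATH /usr/local/openmpi-4.1.0/bin
append_once LD_LIBRARY_PATH /usr/local/openmpi-4.1.0/lib
```

Calling the helper a second time with the same arguments leaves both variables unchanged.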

Then, we acquire the latest stable LAMMPS release:

cd ~/shared/demo
git clone -b stable_29Oct2020 

Install the needed packages:

cd lammps/src
make yes-molecule
make yes-rigid
make yes-manybody


sed '/CCFLAGS/s/$/ -std=c++11/' MAKE/Makefile.mpi > MAKE/Makefile.lsf
make -j 16 lsf
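
The sed command above copies MAKE/Makefile.mpi to MAKE/Makefile.lsf while appending -std=c++11 to every line that contains CCFLAGS, and make lsf then builds with the new file. To illustrate the substitution in isolation, the sketch below applies the same expression to a made-up two-line sample. The sample values are assumptions for illustration, not the actual contents of Makefile.mpi:

```shell
# Made-up sample lines standing in for a makefile (not the real Makefile.mpi).
sample='CC = mpicxx
CCFLAGS = -g -O3'

# Same sed expression as above: on lines that match CCFLAGS, append the flag
# at the end of the line; all other lines pass through unchanged.
patched=$(printf '%s\n' "$sample" | sed '/CCFLAGS/s/$/ -std=c++11/')
printf '%s\n' "$patched"
```

Only the CCFLAGS line gains the extra flag; the CC line is left as it was.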

Run an example to make sure everything works:

cd ~/shared/demo/lammps
mpirun -np 2 src/lmp_lsf -in examples/flow/in.flow.pois

You should see no errors in the output.
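
One way to check this without reading the whole output is to save it to a file and search for error lines. The sketch below assumes that LAMMPS error messages begin with the word ERROR at the start of a line. The helper name log_is_clean and the file name flow.out are hypothetical:

```shell
# Hypothetical helper: return 0 if the file exists and has no line starting
# with "ERROR", 1 if such a line is found, and 2 if the file is missing.
log_is_clean() {
    [ -f "$1" ] || return 2
    if grep -q '^ERROR' "$1"; then
        return 1
    fi
    return 0
}

# Possible usage on the master node (not executed here):
#   mpirun -np 2 src/lmp_lsf -in examples/flow/in.flow.pois > flow.out 2>&1
#   log_is_clean flow.out && echo "no errors reported"
```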

Step 4: Build and install mpitrace

mpitrace is a library of tools for analyzing distributed-memory parallel applications written with MPI. It collects detailed information about messaging and also provides a convenient place to enable other performance tools. Here, we build and install it in the shared folder, too.

Make sure the environment variables are set for MPI:

export PATH=$PATH:/usr/local/openmpi-4.1.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/openmpi-4.1.0/lib
cd ~/shared/demo
git clone
cd mpitrace/src

Now the dynamic library $HOME/shared/demo/mpitrace/src/ is accessible to every worker node. When submitting jobs on the HPC cluster, we only need to set it in the LD_PRELOAD variable before launching MPI applications to enable mpitrace profiling. For example:

export LD_PRELOAD=$HOME/shared/demo/mpitrace/src/
<MPI applications>

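An exported LD_PRELOAD stays in effect for every later command in the same shell. As a hedged alternative, the variable can be set for a single command only. The sketch below wraps that pattern in a hypothetical helper, with_preload, which also refuses to run when the library file is missing. The library path is passed in as an argument rather than assumed:

```shell
# Hypothetical wrapper: run a command with LD_PRELOAD set for that command only.
# $1 is the path of the library to preload; the remaining arguments are the
# command line to run.
with_preload() {
    lib=$1
    shift
    if [ ! -f "$lib" ]; then
        echo "preload library not found: $lib" >&2
        return 1
    fi
    LD_PRELOAD="$lib" "$@"
}

# Possible usage (not executed here), with <library> replaced by the path of
# the mpitrace library built in this step:
#   with_preload <library> mpirun -np 2 src/lmp_lsf -in examples/flow/in.flow.pois
```
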

Step 5: Submit an LSF job to run LAMMPS simulation with mpitrace profiling

Now that we have set up the execution environment, it’s time to run a multi-node MPI job. We will use one of the LAMMPS benchmarks, which simulates a Cu metallic solid with the embedded atom method (EAM) potential. There are 32,000 atoms in a box of 20^3 Angstroms, and we run NVE time integration for 100 steps with a time step size of 5.0 fs.

Before we prepare the benchmark and the job script, let’s examine the LSF cluster status by running bhosts. If you used the default setting and the minimum worker node number is 0, you should see something like the following. This indicates that there are two master nodes in the LSF cluster and no available slots for LSF jobs. When you submit a job that requires slots, LSF auto-scaling will provision worker nodes on the fly, so there will be a small delay before your job executes:

[lsfadmin@icgen2host-10-241-128-37 eam]$ bhosts
icgen2host-10-241- closed          -      0      0      0      0      0      0
icgen2host-10-241- closed          -      0      0      0      0      0      0

First, download the benchmark and extract it:

cd ~/shared/demo
tar zxvf bench_eam.tar.gz

Then, we create an LSF script for the job that will run this benchmark:

cd eam
cat << 'EOF' > job.lsf
#BSUB -n 8
#BSUB -R span[ptile=2]
#BSUB -o out.%J
#BSUB -e err.%J
#BSUB -J cu-20

# Make the MPI tools, the LAMMPS binary and the MPI libraries visible to the job
export PATH=$PATH:/usr/local/openmpi-4.1.0/bin:/home/lsfadmin/shared/demo/lammps/src
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/openmpi-4.1.0/lib
# Preload the mpitrace library to enable profiling
export LD_PRELOAD=/home/lsfadmin/shared/demo/mpitrace/src/

cd ~/shared/demo/eam/
mpirun --report-bindings --bind-to core lmp_lsf -in in.eam
EOF

We specify -n 8, which requests eight MPI ranks, and ptile=2, which places two ranks on each worker node. Since the default bx2-4x16 profile grants two cores per worker node, this job requires four worker nodes and fully subscribes them. The script also includes export commands to set up the environment variables, and the LD_PRELOAD setting enables mpitrace profiling.
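
The node count follows from dividing the number of ranks by the ranks per node and rounding up. A minimal sketch of that arithmetic, using the values from the job script (the variable names are ours, for illustration only):

```shell
ranks=8    # value passed to "#BSUB -n"
ptile=2    # ranks per host, from span[ptile=2]

# Ceiling division: hosts needed to place all ranks at ptile ranks per host.
nodes=$(( (ranks + ptile - 1) / ptile ))
echo "worker nodes needed: $nodes"
```

Changing ranks or ptile shows how many worker nodes a different job shape would need.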

Now that everything is there, we submit the job by running the following:

bsub < job.lsf

You can then run bjobs to check the status of the submitted job. As previously mentioned, there can be a delay before enough worker nodes are provisioned. When the workers are ready, bjobs should output something similar to the following:

4       lsfadmi RUN   normal     icgen2host- icgen2host- cu-20      Aug 23 14:30
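
In a script, the job state can be read from the third column of this output. The sketch below assumes the default bjobs layout, where the first field is the job ID and the third is the state. The helper name job_stat is hypothetical, and the sample row is copied from the output above:

```shell
# Hypothetical helper: read default-format bjobs output on stdin and print the
# state (third field) of the row whose job ID (first field) equals $1.
job_stat() {
    awk -v id="$1" '$1 == id { print $3; exit }'
}

# Sample row copied from the output above.
sample='4       lsfadmi RUN   normal     icgen2host- icgen2host- cu-20      Aug 23 14:30'
printf '%s\n' "$sample" | job_stat 4

# On the cluster this could be used as (not executed here):
#   bjobs | job_stat 4
```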

Running bhosts again also shows more hosts. Four extra worker nodes were automatically provisioned by LSF:

icgen2host-10-241- closed          -      0      0      0      0      0      0
icgen2host-10-241- closed          -      0      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
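
To total the slots that are open for jobs, the rows can be filtered by status and summed. The sketch below assumes the default bhosts layout, where the second field is the host status and the fourth is the maximum number of slots. The helper name ok_slots is hypothetical, and the sample rows are copied from the output above:

```shell
# bhosts rows copied from the output above (the header row is not shown there).
bhosts_sample='icgen2host-10-241- closed          -      0      0      0      0      0      0
icgen2host-10-241- closed          -      0      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0'

# Hypothetical helper: sum the fourth field over hosts whose status is "ok".
ok_slots() {
    awk '$2 == "ok" { total += $4 } END { print total + 0 }'
}

printf '%s\n' "$bhosts_sample" | ok_slots
```

For the sample rows this gives eight slots, which matches the eight ranks requested by the job.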

When the job completes, the output from stdout will be in out.<jobid> and errors in err.<jobid>. Since we also enabled mpitrace, we will see some files with names like mpi_profile*. By default, four files are stored: one each from rank 0, the rank with minimum communication time, the rank with maximum communication time and the rank with median communication time. You can use sftp to connect to the master node and download these files to your local machine.
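
Before downloading, it can be convenient to bundle these files into a single archive. A minimal sketch, assuming tar with gzip support is available on the master node. The helper name bundle_results and the archive name in the usage comment are hypothetical:

```shell
# Hypothetical helper: bundle a job's output files into one archive.
# $1 = job directory, $2 = LSF job ID, $3 = absolute path of the archive to create.
# tar reports an error if any of the expected files is missing.
bundle_results() {
    ( cd "$1" && tar czf "$3" out."$2" err."$2" mpi_profile* )
}

# Possible usage on the master node (not executed here), for job ID 4:
#   bundle_results ~/shared/demo/eam 4 ~/cu-20-results.tar.gz
```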


That’s it! You’ve completed the basics of running an HPC workload on an IBM Cloud HPC cluster. We went through the steps for installing HPC software, submitting a job to run a LAMMPS simulation and collecting the results from the job. Please give it a try and refer to IBM Cloud Docs for more detailed instructions. We also encourage you to bring your own HPC workloads to IBM Cloud through the convenient HPC service that IBM provides. Please share your valuable feedback with us.

