How to Run an HPC Workload on an IBM Cloud HPC Cluster


In this post, we demonstrate how to run an HPC workload in an IBM Spectrum LSF cluster on IBM Cloud. 

We assume that you have an IBM Cloud account and that you have already created an IBM Spectrum LSF cluster through the IBM Cloud Catalog. If you haven’t, please follow the instructions in this previous blog post first to create a cluster.

Step 1: Log in to the master node of your cluster

Refer to the tutorial if you don't know how to get the IP addresses needed for the ssh command:

ssh -J root@ip-jumphost lsfadmin@ip-masternode

 

Step 2: Navigate to the shared folder

After logging in to the master node, you will see a folder named shared. It provides access to the cluster’s shared NFS storage: every file placed under this folder is accessible to all worker nodes. This makes it an ideal location for application binaries and software libraries, so that the jobs we submit run in a consistent environment.
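
If you want to confirm that shared really is backed by the cluster’s shared storage, you can check where it is mounted from (the mount source and size will differ in your cluster):

df -h ~/shared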

Let’s create a subfolder demo in the shared folder for everything that we need in this demo:

mkdir -p shared/demo
cd shared/demo

 

Step 3: Build and install LAMMPS

We use LAMMPS as an example workload. The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a classical molecular dynamics (MD) code with a focus on materials modeling, and it is widely used in many research fields, including materials science, biomedical engineering and chemical engineering.

First, we set the environment variables to include the MPI executables and library shared objects:

export PATH=$PATH:/usr/local/openmpi-4.1.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/openmpi-4.1.0/lib
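
You can quickly verify that the Open MPI toolchain is being picked up from the expected location (the exact version string may differ on your image):

which mpirun
mpirun --version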

Then, we acquire the latest stable LAMMPS release:

cd ~/shared/demo
git clone -b stable_29Oct2020 https://github.com/lammps/lammps.git 

Install the needed packages:

cd lammps/src
make yes-molecule
make yes-rigid
make yes-manybody
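
Optionally, you can confirm which packages are now enabled for the build (the exact output format may vary between LAMMPS versions):

make package-status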

Build LAMMPS:

sed '/CCFLAGS/s/$/ -std=c++11/' MAKE/Makefile.mpi > MAKE/Makefile.lsf
make -j 16 lsf
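
The build produces an executable named lmp_lsf (after the makefile suffix) in the src directory; a quick check that it is there:

ls -lh lmp_lsf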

Run an example to make sure everything works:

cd ~/shared/demo/lammps
mpirun -np 2 src/lmp_lsf -in examples/flow/in.flow.pois

You should see no errors in the output.

Step 4: Build and install mpitrace

mpitrace is a library of tools for analyzing distributed-memory parallel applications written with MPI. It collects detailed information about messaging and also provides a convenient place to enable other performance tools. Here, we build and install it in the shared folder, too.

Make sure the environment variables are set for MPI:

export PATH=$PATH:/usr/local/openmpi-4.1.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/openmpi-4.1.0/lib
cd ~/shared/demo
git clone https://github.com/IBM/mpitrace.git
cd mpitrace/src
./configure
make
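
If the build succeeded, the profiling library should now be present in the source directory:

ls -l libmpitrace.so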

Now the dynamic library $HOME/shared/demo/mpitrace/src/libmpitrace.so is accessible from every worker node. When submitting jobs on the HPC cluster, we just need to set it in the LD_PRELOAD variable before running MPI applications to enable mpitrace profiling, for example:

export LD_PRELOAD=$HOME/shared/demo/mpitrace/src/libmpitrace.so
<MPI applications>
unset LD_PRELOAD

 

Step 5: Submit an LSF job to run LAMMPS simulation with mpitrace profiling

Now that we have set up the execution environment, it’s time to run a multi-node MPI job. We will use one of the LAMMPS benchmarks, which simulates a Cu metallic solid with an embedded atom method (EAM) potential. There are 32,000 atoms in a box of 20^3 Angstroms, and we run NVE time integration for 100 steps with a time step size of 5.0 fs.

Before we prepare the benchmark and the job script, let’s examine the LSF cluster status by running bhosts. If you used the default settings and the minimum number of worker nodes is 0, you should see something like the following. It indicates that there are two master nodes in the LSF cluster and no slots available for LSF jobs. When you submit a job that requires slots, LSF auto-scaling will therefore provision worker nodes on the fly, so there will be a small delay before your job executes:

[lsfadmin@icgen2host-10-241-128-37 eam]$ bhosts
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV 
icgen2host-10-241- closed          -      0      0      0      0      0      0
icgen2host-10-241- closed          -      0      0      0      0      0      0

First, download the benchmark and extract it:

cd ~/shared/demo
wget https://www.lammps.org/bench/bench_eam.tar.gz
tar zxvf bench_eam.tar.gz
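
You can list the extracted directory to confirm that the benchmark input script in.eam, used by the job script below, is in place:

ls eam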

Then, we create an LSF script for the job that will run this benchmark. Note that the heredoc delimiter is quoted ('EOF') so that the variable references are written into job.lsf literally and only expanded when the job runs:

cd eam
cat << 'EOF' > job.lsf
#!/bin/bash 
#BSUB -n 8
#BSUB -R span[ptile=2]
#BSUB -o out.%J
#BSUB -e err.%J
#BSUB -J cu-20

export PATH=$PATH:/usr/local/openmpi-4.1.0/bin:/home/lsfadmin/shared/demo/lammps/src
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/openmpi-4.1.0/lib
export LD_PRELOAD=/home/lsfadmin/shared/demo/mpitrace/src/libmpitrace.so

cd ~/shared/demo/eam/
mpirun --report-bindings --bind-to core lmp_lsf -in in.eam
unset LD_PRELOAD
EOF

We specify -n 8, meaning that there will be eight MPI ranks, and ptile=2 means that we want two ranks per worker node. Since the default bx2-4x16 profile provides two cores per worker node, this job needs four worker nodes, and it fully subscribes them. In the script, we also include export commands to set up the environment variables; the LD_PRELOAD setting enables mpitrace profiling.
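
If you use a worker profile with more cores per node, you can scale these numbers accordingly. For example, on a hypothetical profile with four cores per worker node, you could keep four fully subscribed workers by requesting:

#BSUB -n 16
#BSUB -R span[ptile=4]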

Now that everything is there, we submit the job by running the following:

bsub < job.lsf

You can then run bjobs to check the status of the submitted job. As previously mentioned, there can be a delay before enough worker nodes are provisioned. When the workers are ready, bjobs should output something similar to the following:

JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
4       lsfadmi RUN   normal     icgen2host- icgen2host- cu-20      Aug 23 14:30
                                             icgen2host-10-241-128-44
                                             icgen2host-10-241-128-45
                                             icgen2host-10-241-128-45
                                             icgen2host-10-241-128-46
                                             icgen2host-10-241-128-46
                                             icgen2host-10-241-128-43
                                             icgen2host-10-241-128-43

Running bhosts again also shows more hosts; four extra worker nodes were automatically provisioned by LSF:

HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV 
icgen2host-10-241- closed          -      0      0      0      0      0      0
icgen2host-10-241- closed          -      0      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0
icgen2host-10-241- ok              -      2      0      0      0      0      0

When the job completes, the stdout output will be in out.<jobid> and errors in err.<jobid>. Since we also enabled mpitrace, we will see files with names like mpi_profile*. By default, four profiles are stored: one for rank 0 and one each for the ranks with the minimum, maximum and median communication times. You can use sftp to connect to the master node and download these files back to your local machine.
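
As an alternative to sftp, for example, you could fetch the profiles from your local machine with scp through the jump host (substitute your own IP addresses, as in Step 1):

scp -o ProxyJump=root@ip-jumphost "lsfadmin@ip-masternode:shared/demo/eam/mpi_profile*" .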

Conclusions

That’s it! You’ve learned the basics of running an HPC workload on an IBM Cloud HPC cluster. We went through the steps for installing HPC software, submitting a job to run a LAMMPS simulation and collecting the results from the job in an HPC cluster on IBM Cloud. Please give it a try and refer to IBM Cloud Docs for more detailed instructions. We also encourage you to bring your own HPC workloads to IBM Cloud through the convenient HPC service that IBM provides, and to share your valuable feedback with us.
