SGI vendor MPI support

Run your SGI MPI jobs through LSF.

Compiling and linking your MPI program

You must use the SGI C compiler (cc by default). You cannot use mpicc to build your programs.
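For example, an MPI program is compiled and linked against the SGI MPI library as follows (the source file name is a placeholder, and your compiler and linker flags may differ at your site):

cc -o a.out myprog.c -lmpi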

Configuring LSF to work with SGI MPI

To use 32-bit or 64-bit SGI MPI with LSF, set the following parameters in lsf.conf:

  • Set LSF_VPLUGIN to the full path to the MPI library libxmpi.so.

    You can specify multiple paths for LSF_VPLUGIN, separated by colons (:). For example, the following configures both /usr/lib32/libxmpi.so and /usr/lib/libxmpi.so:

    LSF_VPLUGIN="/usr/lib32/libxmpi.so:/usr/lib/libxmpi.so"

For PAM to access the libxmpi.so library, the file permission mode must be 755 (-rwxr-xr-x).
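For example, to set this permission mode on the 32-bit library from the example above:

chmod 755 /usr/lib32/libxmpi.so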

To run a multihost MPI application, you must also enable rsh without a password prompt between hosts:

  • The remote host must be defined in the arrayd configuration (see the sketch after this list).
  • Configure .rhosts so that rsh does not require a password (see the sketch after this list).
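For illustration only, the following sketches show what these entries might look like. The arrayd configuration file location, its exact syntax, and the array name vary by site (consult your SGI Array Services documentation), and hosta, hostb, and user1 are placeholder names.

A sample arrayd configuration entry listing both hosts:

    array mycluster
        machine hosta
        machine hostb

A sample .rhosts file in the user's home directory on each host:

    hosta user1
    hostb user1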

Running jobs

To run a job and have LSF select the host, the command mpirun -np 4 a.out is entered as:

bsub -n 4 pam -mpi -auto_place a.out

To run a single-host job and have LSF select the host, the command mpirun -np 4 a.out is entered as:

bsub -n 4 -R "span[hosts=1]" pam -mpi -auto_place a.out

To run a multihost job (5 processors per host) and have LSF select the hosts, the following command:

mpirun hosta -np 5 a.out: hostb -np 5 a.out

is entered as:

bsub -n 10 -R "span[ptile=5]" pam -mpi -auto_place a.out

Limitations

  • The mbatchd and sbatchd daemons take a few seconds to get the process IDs and process group IDs of the PAM jobs from the SGI MPI components. If you use bstop, bresume, or bkill before this happens, uncontrolled MPI child processes may be left running.
  • A single MPI job cannot run on a heterogeneous architecture. The entire job must run on systems of a single architecture.