Troubleshooting
Problem
Using Platform MPI in Platform HPC Clusters
Resolving The Problem
Platform MPI is a commercial implementation of the MPI-2 standard for developing and running parallel applications on HPC clusters, and it is included in the Platform HPC cluster stack. It is a highly optimized, fast, and feature-rich MPI implementation. This article describes some of its main features and benefits and gives examples of how to use it with Platform LSF.
This article is divided into the following sections:
1. Supported Operating Systems
2. Supported Interconnects and Protocols
3. Features and Benefits
4. Using Platform MPI
1. Load the Platform MPI modulefile
2. Load the Platform MPI modulefile automatically
3. Compile Platform MPI code
4. Submit a Platform MPI job to LSF
5. Additional Information
1. SUPPORTED OPERATING SYSTEMS
- Red Hat Enterprise Linux 4.6, 5.x and 6.x
- SUSE Linux Enterprise Server 10 and 11
- CentOS 5.3
2. SUPPORTED INTERCONNECTS AND PROTOCOLS
- InfiniBand (Linux): OFED (1.1, 1.2, 1.3, 1.4 and 1.5), PSM, uDAPL on x86_64 and Itanium2; SDR, DDR, QDR, ConnectX and ConnectX-2; Mellanox FCA
- GigE (Linux): RDMA, uDAPL, TCP/IP
- InfiniBand (Windows): WinOF 2.x, IBAL, WSD, SDR, DDR, QDR, ConnectX-2
- GigE (Windows): TCP/IP on x86_64
3. FEATURES AND BENEFITS OF PLATFORM MPI
Features
- Platform MPI fully complies with the MPI 2.2 standard, providing dynamic processes, one-sided communications, extended collectives, thread safety and an updated ROMIO
- Complete debugging, diagnostic and profiling tools
- Auto detection of interconnects and dynamic loading of libraries
- Improved shared memory performance, incorporating code and methods from Platform MPI 5.6 (Scali MPI)
- 75% reduction in job startup and shutdown times at scale
- Scalability to 17,000 ranks
- RDMA message progression & coalescing enhancements
- Flexible CPU binding options maximize cache effectiveness and balance applications to minimize latency
- Automated benchmarking of collective operations
- Scheduler agnostic with workload manager integration for Platform LSF, Windows HPC and others
- Run-time selection of interconnects, with no need to re-compile your code (see the example after this list)
- CPU binding features which are well suited for GPU-aware applications
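For example, once an application is built, the interconnect can typically be chosen on the mpirun command line rather than at compile time. The option names below (-TCP and -IBV) come from the HP-MPI heritage of Platform MPI and are shown as an assumption; verify the exact names in the mpirun man page for your release. ./my_app is a placeholder for your application binary.
$ mpirun -TCP -np 4 ./my_app
$ mpirun -IBV -np 4 ./my_app
The first command forces the TCP/IP path and the second forces the InfiniBand verbs path, using the same binary in both cases.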
Benefits
- Applications port easily to other platforms
- Protects ISV software investment
- Technical problems resolved quickly and efficiently by Platform Support
- Reduced latency for best performance
- Performance improves without explicit developer action
- Easy to optimize application performance
4. USING PLATFORM MPI
4.1 Load PMPI Modulefile
By default, no MPI libraries are defined in a Platform HPC user's environment. To add Platform MPI to your environment, you must first load the Platform MPI modulefile:
$ module load PMPI/modulefile
NOTE: For more information about using Environment Modules in your cluster, see article T1014792 .
This will dynamically modify your environment to include Platform MPI. The following environment variables are configured:
MANPATH, LD_LIBRARY_PATH, PATH, MPI_ROOT, MPI_USELSF and MPIHOME.
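You can verify that the modulefile has been loaded and the environment is configured, for example:
$ module list
$ echo $MPI_ROOT
$ which mpicc mpirun
The module list output should include the PMPI modulefile, $MPI_ROOT should point to your Platform MPI installation directory, and mpicc and mpirun should resolve to binaries under that directory.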
4.2 Load Platform MPI Modulefile Automatically
If you want to modify your user environment permanently, so that Platform MPI is automatically loaded when you log in to the cluster, do the following:
1. Open your ~/.bashrc file with your favourite Linux editor (for example, vim)
2. Modify the last line to read "module load PMPI/modulefile"
3. Save the file and re-login to the cluster
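For example, after the edit the end of your ~/.bashrc might contain a line such as the following (the comment is illustrative; only the module load line matters):
# Load Platform MPI into the environment at every login
module load PMPI/modulefile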
If you are the cluster administrator and you wish to make sure Platform MPI is in the environment of every user, you need to modify the kusu-genconfig plugin that generates users' .bashrc files.
1. Open /opt/kusu/lib/plugins/genconfig/bashrc.py file.
2. In the last line, change the word "null" to "PMPI/modulefile".
3. Update the bashrc skel file by running
# kusu-genconfig bashrc > /etc/skel/.bashrc
For all users added after this point, the Platform MPI modulefile will be loaded by default.
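To confirm the change, you can check the regenerated skeleton file for the module load line, for example:
# grep "module load" /etc/skel/.bashrc
The output should reference PMPI/modulefile rather than null.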
4.3 Compile Platform MPI Code
In this example, we compile the included $MPI_ROOT/help/hello_world.c program with the Platform MPI mpicc compiler wrapper:
$ mpicc $MPI_ROOT/help/hello_world.c -o hello_world.exe
NOTE: For a full list of compile options, refer to the mpicc man page as well as the Platform MPI User's Guide, available through the Help menu of the Platform HPC web console.
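If you want to see or influence what mpicc invokes under the hood, HP-MPI-derived wrappers typically accept a -show option (print the underlying compile command without running it) and honour an MPI_CC environment variable for selecting the underlying C compiler. Treat both as assumptions and confirm them in the mpicc man page for your release; icc is used here only as an example of an alternative compiler.
$ mpicc -show $MPI_ROOT/help/hello_world.c -o hello_world.exe
$ MPI_CC=icc mpicc $MPI_ROOT/help/hello_world.c -o hello_world.exe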
4.4 Submit a Platform MPI Job to Platform LSF
To submit your Platform MPI programs, use the mpirun or mpiexec command. Here is an example of submitting the above hello_world.exe program to LSF from the command line:
$ mpirun -np 4 ./hello_world.exe
Job <job_ID> is submitted to default queue <queue_name>.
This command submits the job to LSF and requests 4 job slots. The LSF output and error files are also created automatically; their names follow the format hello_world.exe-<job_ID>.out and hello_world.exe-<job_ID>.err, respectively.
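After submission you can follow the job with standard LSF commands and read the generated output file. The file name pattern below assumes the job ID is appended as described above:
$ bjobs
$ cat hello_world.exe-*.out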
In addition, you can submit Platform MPI jobs by using the bsub command, as with other LSF jobs. In this way, you can specify many more resource requirements for the job. For example, you can submit the same job, but request that the job runs only on host group "hgroupA" and that 1 GB of memory is reserved for it:
$ bsub -n 4 -m hgroupA -R "rusage[mem=1000]" mpirun ./hello_world.exe
Platform MPI is fully integrated with and optimized for LSF clusters, so there is no need to use helper "wrapper" scripts as with other MPI libraries, such as Open MPI and MPICH2.
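When the list of resource requirements grows, it can be convenient to keep them in an LSF job script rather than on the command line. A minimal sketch, assuming a hypothetical file name of mpi_job.lsf that mirrors the bsub example above:
#BSUB -J hello_world           # job name
#BSUB -n 4                     # request 4 job slots
#BSUB -m hgroupA               # run only on host group hgroupA
#BSUB -R "rusage[mem=1000]"    # reserve 1 GB of memory for the job
mpirun ./hello_world.exe
Submit the script with:
$ bsub < mpi_job.lsf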
5. ADDITIONAL INFORMATION
You can find a lot more information about using Platform MPI, including topics on:
- Profiling
- Tuning
- High Availability MPI programming
- Debugging and troubleshooting, and
- Platform MPI FAQs
in the Platform MPI User's Guide. You can find the guide by logging in to the Platform HPC web console and then clicking Help >> Kit Reference >> Platform MPI User Guide.
Document Information
More support for: Platform Cluster Manager
Software version: 3.0
Document number: 673609
Modified date: 05 September 2018
UID: isg3T1014793