IBM Support

"No ACTIVE ports found" reported when using MVAPICH to run MPI jobs in PCM cluster

Troubleshooting


Problem

The following error message appears when running MPI jobs using MVAPICH:

[user@hpcinstaller ~]$ mpirun -np 1 -machinefile machinefile ./hellof
Abort signaled by rank 0: No ACTIVE ports found
MPI process terminated unexpectedly
Exit code -5 signaled from compute-00-00
Killing remote processes...DONE
[user@hpcinstaller ~]$ Signal 15 received.

Resolving The Problem

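MPI jobs compiled with MVAPICH from the Platform HPC kit can run only over an InfiniBand interconnect. The error indicates that MVAPICH could not find an InfiniBand port in the ACTIVE state on the reporting node.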

Verify your InfiniBand network to ensure that there are no issues with it before rerunning your job.

To diagnose your InfiniBand network quickly, use the following two commands:

ibchecknet: performs port/node/errors check on the subnet

The output should show you all your IB devices, including nodes and switches, and you should not see any bad nodes or ports. Also, you should not have any ports with errors beyond thresholds.

Example:

[root@compute-00-00 ~]# ibchecknet

# Checking Ca: nodeguid 0x0002c902002789ac

# Checking Ca: nodeguid 0x0002c9030002847c

## Summary: 3 nodes checked, 0 bad nodes found
## 4 ports checked, 0 bad ports found
## 0 ports have errors beyond threshold

ibcheckstate: performs port state and physical port state check on the subnet

The output is similar to that of the previous command, but more concise. You should not see any bad nodes or any ports with a bad state.

Example:
[root@compute-00-00 ~]# ibcheckstate

## Summary: 3 nodes checked, 0 bad nodes found
## 4 ports checked, 0 ports with bad state found
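
The check described above can be scripted so that a job is rerun only when the summary is clean. The following is a minimal sketch, not part of the product: summary_is_clean is a hypothetical helper name, and it assumes the "##" summary lines are worded as in the examples in this document. It reads the output of ibchecknet or ibcheckstate on standard input and succeeds only if every problem count is zero.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with any kit): reads ibchecknet or
# ibcheckstate output on stdin and returns 0 only if a "##" summary was
# seen and every problem count in it is zero.
# Assumption: summary lines are worded as in the examples in this document.
summary_is_clean() {
    awk '
        /^##/ {
            seen = 1
            for (i = 1; i < NF; i++) {
                if ($i !~ /^[0-9]+$/) continue
                # "<n> bad nodes found" / "<n> bad ports found"
                is_problem = ($(i+1) == "bad")
                # "<n> ports have errors ..." / "<n> ports with bad state found"
                if ($(i+1) == "ports" && ($(i+2) == "have" || $(i+2) == "with"))
                    is_problem = 1
                if (is_problem && $i != 0) bad = 1
            }
        }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# Example use on a node with the InfiniBand diagnostics installed:
#   ibcheckstate | summary_is_clean && mpirun -np 1 -machinefile machinefile ./hellof
```

The helper treats missing summary lines as a failure, so a diagnostic command that prints nothing does not count as a healthy network.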


Document Information

Modified date:
16 September 2018

UID

isg3T1016282