--map-by unit option
- hwthread
- core
- L1cache
- L2cache
- L3cache
- socket
- numa
- board
- node
% mpirun -host hostA:4,hostB:2 -map-by core ...
R0 hostA [BB/../../../../../../..][../../../../../../../..]
R1 hostA [../BB/../../../../../..][../../../../../../../..]
R2 hostA [../../BB/../../../../..][../../../../../../../..]
R3 hostA [../../../BB/../../../..][../../../../../../../..]
R4 hostB [BB/../../../../../../..][../../../../../../../..]
R5 hostB [../BB/../../../../../..][../../../../../../../..]
This is sometimes called a packed or latency binding because it tends to produce the fastest communication between ranks.
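One way to see the effect of a mapping choice is to time communication between neighboring ranks directly. Below is a minimal sketch of such a check in C (the file name pingpong.c and the iteration count are illustrative assumptions): ranks 0 and 1 exchange a one-byte message repeatedly and report the average round-trip latency, which can then be compared across different -map-by settings.

/* pingpong.c: round-trip latency between ranks 0 and 1 (assumes >= 2 ranks) */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, i, iters = 10000;
    char byte = 0;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("average round-trip latency: %g us\n", (t1 - t0) / iters * 1e6);

    MPI_Finalize();
    return 0;
}

Built with mpicc and launched with the same mpirun options as above, this gives a quick check of how much the packed binding helps on a given system.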
% mpirun -host hostA:4,hostB:2 -map-by socket ...
R0 hostA [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R1 hostA [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
R2 hostA [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R3 hostA [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
R4 hostB [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R5 hostB [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
In the preceding examples, -host hostA:4,hostB:2 indicates that the cluster has six slots (spaces in which a process can run). Each rank consumes one slot, and processes are assigned hardware elements by iterating over the specified unit until the available slots are consumed.
The ordering in these examples is implicitly core and socket, respectively, so cores and sockets are iterated over as each rank is assigned. The binding is also implicitly core and socket, respectively, so each rank's final binding is the same element that was chosen in the mapping step.
Mapping policy: BYCORE     Ranking policy: CORE     Binding policy: CORE:IF-SUPPORTED
Mapping policy: BYSOCKET   Ranking policy: SOCKET   Binding policy: SOCKET:IF-SUPPORTED
If no mapping or binding options are specified, Open MPI by default uses --map-by socket for jobs with more than two ranks. This produces the interleaved ordering seen in the preceding socket example.
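The slot counts given with -host can also be kept in a hostfile instead of on the command line. A minimal sketch, assuming a hostfile named myhosts:

hostA slots=4
hostB slots=2

% mpirun -hostfile myhosts -map-by core ...

This is equivalent to the -host hostA:4,hostB:2 form used in the examples above.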
% mpirun -host hostA:4,hostB:2 -map-by socket -rank-by core ...
R0 hostA [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R1 hostA [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R2 hostA [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
R3 hostA [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
R4 hostB [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R5 hostB [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
A common binding pattern involves binding to cores, but spanning those core assignments over all of the available sockets. For example:
% mpirun -host hostA:4,hostB:2 -map-by socket -rank-by core -bind-to core ...
R0 hostA [BB/../../../../../../..][../../../../../../../..]
R1 hostA [../BB/../../../../../..][../../../../../../../..]
R2 hostA [../../../../../../../..][BB/../../../../../../..]
R3 hostA [../../../../../../../..][../BB/../../../../../..]
R4 hostB [BB/../../../../../../..][../../../../../../../..]
R5 hostB [../../../../../../../..][BB/../../../../../../..]
In this example, the final binding unit is smaller than the hardware selection that was made in the mapping step. As a result, the cores within the socket are iterated over for the ranks on the same socket. When the mapping unit and the binding unit differ, the -display-devel-map output can be used to display the mapping output from which the binding was taken. For example, at rank 0, the -display-devel-map output includes:
Locale:  [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
Binding: [BB/../../../../../../..][../../../../../../../..]
The purpose of this binding is to use all the available hardware resources, such as cache and memory bandwidth. It is sometimes called a bandwidth binding and is a good starting point for overall application performance. Cache and memory bandwidth are maximized, and ranks with nearby indexes are placed as close to each other in the hardware as possible while still spanning the available sockets.
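In addition to the mpirun reporting options, each rank can confirm its own binding from inside the application. The following is a minimal sketch in C (the file name report_binding.c is an illustrative assumption, and the affinity query is Linux-specific): every rank prints its host name and the CPUs in its affinity mask, which should correspond to the bracketed maps shown above.

/* report_binding.c: each rank reports the CPUs it is bound to (Linux only) */
#define _GNU_SOURCE
#include <mpi.h>
#include <sched.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, namelen, cpu;
    char host[MPI_MAX_PROCESSOR_NAME];
    cpu_set_t mask;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &namelen);

    CPU_ZERO(&mask);
    sched_getaffinity(0, sizeof(mask), &mask);   /* 0 = the calling process */

    printf("R%d %s bound to CPUs:", rank, host);
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask))
            printf(" %d", cpu);
    printf("\n");

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and run with, for example, -map-by socket -rank-by core -bind-to core, the CPUs listed by each rank should match the bound cores in the maps above.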
% mpirun -host hostA:4,hostB:2 -map-by numa -rank-by core -bind-to core ...
R0 hostA [BB/../../../../../../..][../../../../../../../..]
R1 hostA [../BB/../../../../../..][../../../../../../../..]
R2 hostA [../../../../../../../..][BB/../../../../../../..]
R3 hostA [../../../../../../../..][../BB/../../../../../..]
R4 hostB [BB/../../../../../../..][../../../../../../../..]
R5 hostB [../../../../../../../..][BB/../../../../../../..]
% mpirun -host hostA:4,hostB:2 -map-by node ...
R0 hostA [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R1 hostB [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R2 hostA [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
R3 hostB [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]
R4 hostA [BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../..]
R5 hostA [../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB]