#pragma omp distribute parallel for simd

Purpose

The omp distribute parallel for simd directive distributes the iterations of a loop across the master threads of the teams in the enclosing teams region, further divides each team's share among the threads of that team, and then applies SIMD vectorization to the iterations that each thread executes.
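The two levels of distribution can be pictured with a small model in plain C. This is only an illustration: it assumes a static block split at both the team level and the thread level, whereas the real assignment is chosen by the OpenMP implementation and by any schedule clauses. The names owner_t and owner_of are hypothetical and are not part of OpenMP or the compiler.

```c
/* Illustration only: models which team and which thread within that team
   would execute iteration i, ASSUMING a static block split at both levels.
   The actual mapping is decided by the OpenMP implementation. */
typedef struct {
    int team;    /* index of the team that receives the iteration */
    int thread;  /* index of the thread inside that team */
} owner_t;

owner_t owner_of(int i, int n, int num_teams, int threads_per_team)
{
    /* Iterations per team, rounded up so that every iteration is covered. */
    int per_team = (n + num_teams - 1) / num_teams;
    int team = i / per_team;
    int local = i % per_team;   /* position inside the team's block */

    /* Iterations per thread inside one team, rounded up. */
    int per_thread = (per_team + threads_per_team - 1) / threads_per_team;

    owner_t o = { team, local / per_thread };
    return o;
}
```

Under these assumptions, with 8 iterations, 2 teams, and 4 threads per team, each team's block holds 4 iterations and each thread handles one of them. SIMD vectorization then applies to whatever iterations a thread receives.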

Syntax

                                                .-+---+------.   
                                                | '-,-'      |   
                                                V            |   
>>-#--pragma--omp distribute parallel for simd----+--------+-+-><
                                                  '-clause-'     

>>-for-loops---------------------------------------------------><

Parameters

The omp distribute parallel for simd construct is a composite construct. clause can be any of the clauses that are accepted by the omp distribute or omp parallel for simd directive with identical meanings and restrictions.
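As a sketch of how clauses from the constituent directives can be combined, the following function places a dist_schedule clause (from omp distribute) next to schedule and safelen clauses (from omp parallel for simd) on one directive. The function name scale_array, the chunk size 64, the team count 4, and the safelen value 8 are arbitrary choices made for illustration and are not taken from the compiler documentation.

```c
/* Hypothetical helper: multiplies x[0..n-1] by factor in place. */
void scale_array(float *x, int n, float factor)
{
    /* map and num_teams belong to the enclosing target and teams constructs. */
    #pragma omp target teams map(tofrom: x[0:n]) num_teams(4)
    /* dist_schedule is accepted by omp distribute;
       schedule and safelen are accepted by omp parallel for simd. */
    #pragma omp distribute parallel for simd \
            dist_schedule(static, 64) schedule(static) safelen(8)
    for (int i = 0; i < n; i++) {
        x[i] = x[i] * factor;
    }
}
```

When OpenMP or offloading is not enabled, the pragmas are ignored and the loop runs serially with the same result.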

Usage

The omp distribute parallel for simd directive takes effect only if you specify both the -qsmp and -qoffload compiler options.

Rules

If any specified clause except the collapse clause is applicable to both the omp distribute and omp parallel for simd directives, it is applied twice; the collapse clause is applied only once.

You can specify only loop iteration variables on the linear clause.
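The collapse rule can be illustrated with a perfectly nested loop pair. In this sketch, collapse(2) merges the two loops into a single iteration space of ROWS * COLS iterations once, and that single space is then distributed across teams, threads, and SIMD lanes. The names fill_matrix, ROWS, and COLS are hypothetical.

```c
#define ROWS 4
#define COLS 6

/* Hypothetical helper: sets m[r][c] to r * COLS + c. */
void fill_matrix(int m[ROWS][COLS])
{
    /* Every element is written on the device, so map(from:) is sufficient. */
    #pragma omp target teams map(from: m[0:ROWS])
    /* collapse(2) is applied once to the whole composite construct. */
    #pragma omp distribute parallel for simd collapse(2)
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            m[r][c] = r * COLS + c;
        }
    }
}
```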

Examples

int N = 8;
int a[N];
/* Each directive applies to the construct that follows it;
   C and C++ have no "end" directives. */
#pragma omp target map(to: N) map(tofrom: a)
#pragma omp teams num_teams(2) thread_limit(N/2)
#pragma omp distribute parallel for simd
for (int i = 0; i < N; i++)
{
  a[i] = N;
}

In this example, the target region contains a teams region that consists of two teams. The omp distribute parallel for simd directive assigns the iterations of the closely nested loop to the teams that are actually created, and the master thread of each team receives that team's share. Each share is then divided among the threads of the team as parallel for SIMD chunks, and SIMD vectorization is applied to the iterations within each chunk.
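One way to check the behavior described above is to wrap the loop in a function and count how many elements were written. The following is a minimal sketch; the function name run_example is hypothetical, and N is defined as a macro here so that the array has a fixed size.

```c
#define N 8

/* Hypothetical wrapper around the example loop.
   Returns the number of elements of a that equal N after the loop. */
int run_example(void)
{
    int a[N] = {0};

    #pragma omp target map(tofrom: a)
    #pragma omp teams num_teams(2) thread_limit(N / 2)
    #pragma omp distribute parallel for simd
    for (int i = 0; i < N; i++) {
        a[i] = N;
    }

    /* Verify on the host after the target region has completed. */
    int ok = 0;
    for (int i = 0; i < N; i++) {
        if (a[i] == N) {
            ok++;
        }
    }
    return ok;
}
```

Every iteration writes a distinct element, so run_example() is expected to return N regardless of how the iterations are divided among teams and threads, or whether the region runs on the host instead of a device.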


