#pragma omp for

Purpose

The omp for directive instructs the compiler to distribute loop iterations within the team of threads that encounters this work-sharing construct.

Syntax

                       .-+---+------.             
                       | '-,-'      |             
                       V            |             
>>-#--pragma--omp for----+--------+-+--for-loop----------------><
                         '-clause-'               
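A minimal sketch of the directive in use, assuming a hypothetical helper named scale_array that is not part of the original page. The omp for pragma sits inside a parallel region and is followed directly by the loop whose iterations are shared out. If OpenMP support is not enabled at compile time, the pragmas are ignored and the loop runs serially with the same result.

```c
/* Multiply every element of a[] by factor.
   scale_array is a hypothetical name used only for illustration. */
void scale_array(double *a, int n, double factor)
{
    #pragma omp parallel
    {
        /* The team created by "omp parallel" divides the iterations
           of the loop that follows among its threads. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            a[i] = a[i] * factor;
        }
    }
}
```

Each iteration touches only its own element, so the result does not depend on which thread runs which iteration.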

Parameters

clause is any of the following clauses:

collapse (n)
Allows you to parallelize multiple loops in a nest without introducing nested parallelism.
>>-COLLAPSE--(--n--)-------------------------------------------><

  • Only one collapse clause is allowed on a worksharing for or parallel for pragma.
  • The specified number of loops must be present lexically. That is, none of the loops can be in a called subroutine.
  • The loops must form a rectangular iteration space and the bounds and stride of each loop must be invariant over all the loops.
  • If the loop indices are of different size, the index with the largest size will be used for the collapsed loop.
  • The loops must be perfectly nested; that is, there is no intervening code nor any OpenMP pragma between the loops which are collapsed.
  • The associated for loops must be structured blocks. Their execution must not be terminated by a break statement.
  • If multiple loops are associated with the loop construct, only an iteration of the innermost associated loop may be curtailed by a continue statement. If multiple loops are associated with the loop construct, there must be no branches to any of the loop termination statements except for the innermost associated loop.
Ordered construct
During execution of an iteration of a loop or a loop nest within a loop region, the executing thread must not execute more than one ordered region that binds to the same loop region. As a consequence, if multiple loops are associated with the loop construct by a collapse clause, the ordered construct must be located inside all associated loops.
Lastprivate clause
When a lastprivate clause appears on the pragma that identifies a work-sharing construct, the value of each new list item from the sequentially last iteration of the associated loops is assigned to the original list item, even if a collapse clause is associated with the loop.
Other SMP and performance pragmas
The stream_unroll, unroll, unrollandfuse, and nounrollandfuse pragmas cannot be used for any of the loops associated with the collapse clause loop nest. The independent_loop pragma can be used for any of the loops associated with the collapse clause. independent_loop is not OpenMP specific.
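The collapse rules above can be illustrated with a short sketch. The function name fill_matrix and the row-major layout are assumptions made for this example. The two loops are perfectly nested, their bounds do not depend on each other, and neither is left with a break statement, so collapse(2) can treat the rows * cols iterations as a single iteration space.

```c
/* Fill a rows x cols matrix, stored row-major in m[], so that
   element (i, j) holds i * cols + j.
   fill_matrix is a hypothetical name used only for illustration. */
void fill_matrix(int *m, int rows, int cols)
{
    /* collapse(2) associates both loops with the construct; no code
       appears between the two for statements. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            m[i * cols + j] = i * cols + j;
        }
    }
}
```

Collapsing is mostly useful when the outer loop alone has too few iterations to keep every thread busy.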
private (list)
Declares the scope of the data variables in list to be private to each thread. Data variables in list are separated by commas.
firstprivate (list)
Declares the scope of the data variables in list to be private to each thread. Each new private object is initialized with the value of the original variable, as if there was an implied declaration within the statement block. Data variables in list are separated by commas.
lastprivate (list)
Declares the scope of the data variables in list to be private to each thread. The final value of each variable in list, if assigned, is the value assigned to that variable in the sequentially last iteration of the loop. Variables not assigned a value in that iteration have an indeterminate value. Data variables in list are separated by commas.
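The following sketch shows firstprivate and lastprivate together. The function name last_offset_value and its parameters are hypothetical, and the example assumes n is at least 1 so that a last iteration exists.

```c
/* Add offset to every element of in[], store the results in out[],
   and return the value produced by the sequentially last iteration.
   Assumes n >= 1. The name last_offset_value is hypothetical. */
int last_offset_value(const int *in, int *out, int n, int offset)
{
    int last = -1;
    #pragma omp parallel
    {
        /* firstprivate: each thread's copy of offset starts with the
           caller's value.
           lastprivate: after the loop, last holds the value assigned
           in iteration n - 1, whichever thread executed it. */
        #pragma omp for firstprivate(offset) lastprivate(last)
        for (int i = 0; i < n; i++) {
            out[i] = in[i] + offset;
            last = out[i];
        }
    }
    return last;
}
```

With a plain private clause instead, offset would be uninitialized inside the loop and last would not be copied back.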
reduction (operator: list)
Performs a reduction on all scalar variables in list using the specified operator. Reduction variables in list are separated by commas.
A private copy of each variable in list is created for each thread. At the end of the statement block, the final values of all private copies of the reduction variable are combined in a manner appropriate to the operator, and the result is placed back in the original value of the shared reduction variable. For example, when the max operator is specified, the original reduction variable value combines with the final values of the private copies by using the following expression:
original_reduction_variable = original_reduction_variable < private_copy ?
private_copy : original_reduction_variable; 
Variables specified in the reduction clause must satisfy the following conditions:
  • Must be of a type appropriate to the operator. If the max or min operator is specified, the variables must be one of the following types with or without long, short, signed, or unsigned:
    • _Bool
    • char
    • int
    • float
    • double
  • Must be shared in the enclosing context.
  • Must not be const-qualified.
  • Must not have pointer type.
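As an illustration of the reduction clause, the sketch below computes a sum and a maximum. The function names sum_array and max_array are hypothetical, and max_array assumes the array holds at least one element. Both reduction variables are plain int objects that are shared in the enclosing parallel region, which matches the conditions listed above.

```c
/* Sum of the n elements of a[]. sum_array is a hypothetical name. */
int sum_array(const int *a, int n)
{
    int sum = 0;
    #pragma omp parallel
    {
        /* Each thread adds into its own private copy of sum; the
           copies are combined with + at the end of the loop. */
        #pragma omp for reduction(+: sum)
        for (int i = 0; i < n; i++) {
            sum += a[i];
        }
    }
    return sum;
}

/* Largest of the n elements of a[]; assumes n >= 1.
   max_array is a hypothetical name. */
int max_array(const int *a, int n)
{
    int largest = a[0];
    #pragma omp parallel
    {
        /* The original value of largest takes part in the final
           combination, so a[0] is accounted for. */
        #pragma omp for reduction(max: largest)
        for (int i = 1; i < n; i++) {
            if (a[i] > largest) {
                largest = a[i];
            }
        }
    }
    return largest;
}
```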
ordered
Specify this clause if an ordered construct is present within the dynamic extent of the omp for directive.
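A small sketch of the ordered clause paired with an ordered construct follows. The function name ordered_trace is hypothetical. The squares may be computed concurrently, but the block under omp ordered is executed in loop order by one thread at a time, so the shared position counter needs no further protection.

```c
/* Store the squares 0, 1, 4, ... of the first n integers in trace[]
   in loop order and return how many values were written.
   ordered_trace is a hypothetical name. */
int ordered_trace(int *trace, int n)
{
    int pos = 0;   /* shared; only touched inside the ordered region */
    #pragma omp parallel
    {
        #pragma omp for ordered
        for (int i = 0; i < n; i++) {
            int sq = i * i;        /* may run concurrently */
            #pragma omp ordered
            {
                trace[pos] = sq;   /* runs in iteration order */
                pos++;
            }
        }
    }
    return pos;
}
```

Each iteration executes exactly one ordered region, in line with the rule stated for loop nests.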
schedule (type)
Specifies how iterations of the for loop are divided among available threads. Acceptable values for type are:
auto
With auto, scheduling is delegated to the compiler and runtime system. The compiler and runtime system can choose any possible mapping of iterations to threads (including all possible valid schedules) and these may be different in different loops.
dynamic
Iterations of a loop are divided into chunks of size ceiling(number_of_iterations/number_of_threads).

Chunks are dynamically assigned to active threads on a "first-come, first-do" basis until all work has been assigned.

dynamic,n
As above, except chunks are set to size n. n must be an integral assignment expression of value 1 or greater.
guided
Chunks are made progressively smaller until the default minimum chunk size is reached. The first chunk is of size ceiling(number_of_iterations/number_of_threads). Remaining chunks are of size ceiling(number_of_iterations_left/number_of_threads).

The minimum chunk size is 1.

Chunks are assigned to active threads on a "first-come, first-do" basis until all work has been assigned.

guided,n
As above, except the minimum chunk size is set to n; n must be an integral assignment expression of value 1 or greater.
runtime
Scheduling policy is determined at run time. Use the OMP_SCHEDULE environment variable to set the scheduling type and chunk size.
static
Iterations of a loop are divided into chunks of size ceiling(number_of_iterations/number_of_threads). Each thread is assigned a separate chunk.

This scheduling policy is also known as block scheduling.

static,n
Iterations of a loop are divided into chunks of size n. Each chunk is assigned to a thread in round-robin fashion.

n must be an integral assignment expression of value 1 or greater.

This scheduling policy is also known as block cyclic scheduling.

Note: if n=1, each chunk holds a single iteration, so successive iterations are assigned to successive threads in round-robin fashion. This special case is commonly called cyclic scheduling.
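The chunking rules for static,n and guided,n can be worked through with plain arithmetic. The sketch below is only a model of the descriptions above, not the runtime's actual implementation. All function names are hypothetical, and numbering threads and iterations from 0 is an assumption.

```c
/* Thread (0-based) that receives 0-based iteration i under
   schedule(static, n): chunks of n iterations are dealt out
   round-robin. Model only; the name is hypothetical. */
int static_chunk_owner(int i, int n, int num_threads)
{
    return (i / n) % num_threads;
}

/* Write the successive chunk sizes of a guided schedule into sizes[]
   (at most capacity entries) and return how many were written.
   Each chunk is ceiling(iterations_left / num_threads), but never
   smaller than min_chunk unless fewer iterations remain. */
int guided_chunk_sizes(int iterations, int num_threads, int min_chunk,
                       int *sizes, int capacity)
{
    int left = iterations;
    int count = 0;
    while (left > 0 && count < capacity) {
        int chunk = (left + num_threads - 1) / num_threads;
        if (chunk < min_chunk) chunk = min_chunk;
        if (chunk > left) chunk = left;
        sizes[count++] = chunk;
        left -= chunk;
    }
    return count;
}

/* A loop that names a schedule explicitly. The sum 1 + 2 + ... + n
   is the same under any schedule. */
long sum_to(int n)
{
    long total = 0;
    #pragma omp parallel for schedule(static, 2) reduction(+: total)
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

For 100 iterations, 4 threads, and a minimum chunk size of 1, the guided model yields chunks of 25, 19, 14, 11, and so on down to single iterations.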
nowait
Use this clause to avoid the implied barrier at the end of the for directive. This is useful if you have multiple independent work-sharing sections or iterative loops within a given parallel region. Only one nowait clause can appear on a given for directive.
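A sketch of nowait with two independent loops in one parallel region is shown here. The function name init_two_arrays is hypothetical. Dropping the barrier is safe in this case only because the second loop never reads or writes a[].

```c
/* Initialize a[i] = i and b[i] = 2 * i for i in [0, n).
   init_two_arrays is a hypothetical name. */
void init_two_arrays(int *a, int *b, int n)
{
    #pragma omp parallel
    {
        /* nowait: a thread that finishes its share of this loop
           moves on to the next loop without waiting for the team. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }

        /* This loop keeps its implied barrier, so both arrays are
           complete once the team leaves it. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            b[i] = 2 * i;
        }
    }
}
```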
and where for_loop is a for loop construct with the following canonical shape:
for (init_expr; exit_cond; incr_expr)
 statement
where:
init_expr takes the form:
iv = b
integer-type iv = b
exit_cond takes the form:
iv <= ub
iv <  ub
iv >= ub
iv >  ub
incr_expr takes the form:
++iv
iv++
--iv
iv--
iv += incr 
iv -= incr
iv = iv + incr
iv = incr + iv
iv = iv - incr
and where:
iv
Iteration variable. The iteration variable must be a signed integer not modified anywhere within the for loop. It is implicitly made private for the duration of the for operation. If not specified as lastprivate, the iteration variable will have an indeterminate value after the operation completes.
b, ub, incr
Loop invariant signed integer expressions. No synchronization is performed when evaluating these expressions and evaluated side effects may result in indeterminate values.
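Because b, ub, and incr are loop invariant, the number of iterations of a canonical loop is fixed before the loop starts. The sketch below illustrates this for two of the forms listed above. The function names are hypothetical, and both functions assume incr is greater than 0.

```c
/* Iteration count of: for (iv = b; iv < ub; iv += incr)
   Assumes incr > 0. The name trip_count_lt is hypothetical. */
int trip_count_lt(int b, int ub, int incr)
{
    if (ub <= b) {
        return 0;
    }
    return (ub - b + incr - 1) / incr;
}

/* Count the iterations of a canonical loop that counts down, using
   the "iv > ub" and "iv -= incr" forms. Assumes incr > 0.
   The name count_down_iterations is hypothetical. */
int count_down_iterations(int b, int ub, int incr)
{
    int count = 0;
    #pragma omp parallel for reduction(+: count)
    for (int iv = b; iv > ub; iv -= incr) {
        count++;
    }
    return count;
}
```

A loop whose bound or stride changes inside the body does not fit this shape, because its iteration count cannot be determined on entry.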

Usage

This pragma must appear immediately before the loop or loop block directive to be affected.

Program sections using the omp for pragma must be able to produce a correct result regardless of which thread executes a particular iteration. Similarly, program correctness must not rely on using a particular scheduling algorithm.

The for loop iteration variable is implicitly made private in scope for the duration of loop execution. This variable must not be modified within the body of the for loop. After the loop completes, the value of the iteration variable is indeterminate unless the variable is specified as having a data scope of lastprivate.

An implicit barrier exists at the end of the for loop unless the nowait clause is specified.

Restriction:
  • The for loop must be a structured block, and must not be terminated by a break statement.
  • Values of the loop control expressions must be the same for all iterations of the loop.
  • An omp for directive can accept only one schedule clause.
  • The value of n (chunk size) must be the same for all threads of a parallel region.
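Since the loop cannot be left with a break statement, a search that would normally stop at the first match has to be restructured. One possible approach, sketched below with a hypothetical function named find_first, visits every iteration and keeps the smallest matching index with a min reduction.

```c
/* Return the index of the first element of a[] equal to key, or -1
   if there is none. find_first is a hypothetical name. */
int find_first(const int *a, int n, int key)
{
    int first = n;   /* n stands for "not found" */
    #pragma omp parallel
    {
        /* Every iteration runs; the reduction keeps the smallest
           matching index across all threads. */
        #pragma omp for reduction(min: first)
        for (int i = 0; i < n; i++) {
            if (a[i] == key && i < first) {
                first = i;
            }
        }
    }
    return first == n ? -1 : first;
}
```

The result does not depend on which thread executes which iteration or on the schedule in effect, as the usage notes above require.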

