#pragma unrollandfuse
Purpose
Instructs the compiler to attempt an unroll and fuse operation on
nested for loops.
Note: IBM® Open XL C/C++ for AIX® 17.1.2 still accepts #pragma unrollandfuse but maps it to the Clang pragma #pragma unroll_and_jam. If your program uses #pragma unrollandfuse, it is recommended that you replace it with #pragma unroll_and_jam when you migrate the program to IBM Open XL C/C++ for AIX 17.1.2.
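As a minimal sketch of that migration, the function below carries the replacement pragma in the position where the legacy one would have appeared. The function name multiply_transposed and the array size N are hypothetical. The sketch also assumes that #pragma unroll_and_jam accepts a parenthesized factor in the same way as the legacy directive; check the compiler documentation before relying on that form.

```c
#define N 8   /* hypothetical array size, chosen only for illustration */

/* Legacy spelling:        #pragma unrollandfuse(2)
   Replacement spelling:   #pragma unroll_and_jam(2)
   The pragma is only a hint: the computed values must be the same
   whether or not the compiler applies the transformation. */
void multiply_transposed(int a[N][N], int b[N][N], int c[N][N])
{
    int i, j;

#pragma unroll_and_jam(2)
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            a[j][i] = b[i][j] * c[j][i];
        }
    }
}
```

A compiler that does not recognize the pragma ignores it, so the same source can still be built elsewhere, possibly with a warning.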
Syntax
Parameters
- number
- A loop unrolling factor. The value of number must be a positive integer, given as an integral constant expression or a compile-time constant initialization expression.
If number is not specified, the optimizer determines an appropriate unrolling factor for each nested loop.
Usage
The #pragma unrollandfuse directive applies only to the outer loops of nested for loops that meet the following conditions:
- There must be only one loop counter variable, one increment point for that variable, and one termination variable. These cannot be altered at any point in the loop nest.
- Loops cannot have multiple entry and exit points. The loop termination must be the only means to exit the loop.
- Dependencies in the loop must not be "backwards-looking". For example, a statement such as A[i][j] = A[i-1][j+1] + 4 must not appear within the loop.
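The last condition can be illustrated with a small hand-written experiment. The sketch below is not compiler output: the names original_order, fused_by_two, and fusion_changes_result, and the size N, are hypothetical. It applies a factor-2 unroll and fuse by hand to the backwards-looking statement and compares the result with the original loop order.

```c
#include <string.h>

#define N 5   /* hypothetical size; N - 1 is even, so no remainder loop is needed */

/* Original loop order: row i-1 is complete before row i starts. */
static void original_order(int A[N][N])
{
    for (int i = 1; i < N; i++)
        for (int j = 0; j < N - 1; j++)
            A[i][j] = A[i - 1][j + 1] + 4;
}

/* Hand-written unroll and fuse of the outer loop by 2 (illustration only). */
static void fused_by_two(int A[N][N])
{
    for (int i = 1; i < N; i += 2)
        for (int j = 0; j < N - 1; j++) {
            A[i][j]     = A[i - 1][j + 1] + 4;
            /* Reads A[i][j+1] before this loop nest has updated it. */
            A[i + 1][j] = A[i][j + 1] + 4;
        }
}

/* Returns 1 if the two loop orders produce different arrays, 0 otherwise. */
int fusion_changes_result(void)
{
    int x[N][N], y[N][N];

    memset(x, 0, sizeof x);
    memset(y, 0, sizeof y);
    original_order(x);
    fused_by_two(y);
    return memcmp(x, y, sizeof x) != 0;
}
```

With zero-initialized input, the original order stores 8 in A[2][0], while the fused order stores 4, because the second statement reads row 1 before it has been fully written. This is why such a dependence rules the transformation out.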
For loop unrolling to occur, the #pragma unrollandfuse directive
must precede a for loop. You must not specify #pragma
unrollandfuse for the innermost for loop.
You must not specify #pragma unrollandfuse more than once, or combine the directive with
#pragma nounrollandfuse, #pragma nounroll, or #pragma unroll directives for the same
for loop.
Predefined macros
None.
Examples
In the following example, a #pragma unrollandfuse directive replicates and fuses the body of the loop. This reduces the number of cache misses for array b.
int i, j;
int a[1000][1000];
int b[1000][1000];
int c[1000][1000];
....
#pragma unrollandfuse(2)
for (i=1; i<1000; i++) {
  for (j=1; j<1000; j++) {
    a[j][i] = b[i][j] * c[j][i];
  }
}
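The for loops below show a possible result of applying the #pragma unrollandfuse(2) directive to the loop shown above. Because the original outer loop runs an odd number of iterations (i = 1 to 999), the final iteration is handled separately so that i+1 never indexes past the end of the arrays:
for (i=1; i<999; i=i+2) {
  for (j=1; j<1000; j++) {
    a[j][i] = b[i][j] * c[j][i];
    a[j][i+1] = b[i+1][j] * c[j][i+1];
  }
}
for (j=1; j<1000; j++) {
  a[j][999] = b[999][j] * c[j][999];
}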
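To check a hand transformation of this kind, it helps to run both forms on a small problem and compare the results. The following sketch does that under stated assumptions: N is a small stand-in for 1000, and fill_inputs, reference, unrolled_and_fused, and transformed_matches_reference are hypothetical names. The remainder loop covers the case where the outer iteration count is not a multiple of the unroll factor.

```c
#include <string.h>

#define N 10   /* small stand-in for 1000; the outer loop runs i = 1..9 */

static int b[N][N], c[N][N];

/* Fill the inputs with arbitrary but deterministic values. */
static void fill_inputs(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            b[i][j] = 3 * i + j;
            c[i][j] = i - 2 * j;
        }
}

/* The loop nest exactly as written in the source program. */
static void reference(int a[N][N])
{
    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++)
            a[j][i] = b[i][j] * c[j][i];
}

/* Hand-written factor-2 unroll and fuse with a remainder loop. */
static void unrolled_and_fused(int a[N][N])
{
    int i, j;

    /* Main loop: two outer iterations per pass, while i+1 is in range. */
    for (i = 1; i + 1 < N; i += 2)
        for (j = 1; j < N; j++) {
            a[j][i]     = b[i][j]     * c[j][i];
            a[j][i + 1] = b[i + 1][j] * c[j][i + 1];
        }

    /* Remainder: at most one outer iteration is left over. */
    for (; i < N; i++)
        for (j = 1; j < N; j++)
            a[j][i] = b[i][j] * c[j][i];
}

/* Returns 1 if both versions compute identical arrays, 0 otherwise. */
int transformed_matches_reference(void)
{
    int x[N][N], y[N][N];

    fill_inputs();
    memset(x, 0, sizeof x);
    memset(y, 0, sizeof y);
    reference(x);
    unrolled_and_fused(y);
    return memcmp(x, y, sizeof x) == 0;
}
```

Calling transformed_matches_reference() from a test harness gives a quick regression check when experimenting with different unroll factors.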
You can also specify multiple #pragma unrollandfuse directives
in a nested loop structure.
int i, j, k;
int a[1000][1000];
int b[1000][1000];
int c[1000][1000];
int d[1000][1000];
int e[1000][1000];
....
#pragma unrollandfuse(4)
for (i=1; i<1000; i++) {
  #pragma unrollandfuse(2)
  for (j=1; j<1000; j++) {
    for (k=1; k<1000; k++) {
      a[j][i] = b[i][j] * c[j][i] + d[j][k] * e[i][k];
    }
  }
}
