Developer Guide and Reference

qopt-assume-no-loop-carried-dep, Qopt-assume-no-loop-carried-dep

Lets you set a level of performance tuning for loops by controlling whether the compiler assumes that loops have no loop-carried dependencies.
This content is specific to C++; it does not apply to DPC++.

Syntax

Linux:
-qopt-assume-no-loop-carried-dep[=n]
Windows:
/Qopt-assume-no-loop-carried-dep[=n]
Arguments
n
Is the action for loop-carried dependencies. Possible values are:
0   The compiler does not assume there are no loop-carried dependencies. This is the default if this option is not specified.
1   Tells the compiler to assume there are no loop-carried dependencies for innermost loops. This is the default if the option is used but n is not specified.
2   Tells the compiler to assume there are no loop-carried dependencies for all loop levels.
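As a rough illustration of how the values of n map onto a loop nest, consider the sketch below. The function name scale_rows and its parameters are hypothetical and exist only for this illustration. The comments restate the argument descriptions above; they do not predict what the compiler will actually do with this code.

```cpp
#include <cstddef>

// Hypothetical two-level loop nest, used only to label the loop levels.
void scale_rows(float* out, const float* in, std::size_t rows, std::size_t cols) {
    // Outer loop: covered by the assumption only with n = 2 (all loop levels).
    for (std::size_t i = 0; i < rows; ++i) {
        // Innermost loop: covered with n = 1 and with n = 2.
        for (std::size_t j = 0; j < cols; ++j) {
            out[i * cols + j] += 2.0f * in[i * cols + j];
        }
    }
}
```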
Default
[q or Q]opt-assume-no-loop-carried-dep=0
The compiler does not assume there are no loop-carried dependencies.
Description
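This option lets you set a level of performance tuning for loops.
It is useful for C/C++ applications and benchmarks where pointers and arguments could be aliased. In such code the compiler cannot prove that loop iterations are independent, so it must conservatively assume a loop-carried dependency. When you specify level 1 or level 2, that conservative assumption is dropped for the loops the level covers, so more loops will be vectorized or benefit from loop transformations.
This option is applied to all loops in the file. It does not apply to code outside loops.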
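The effect of aliasing on a loop can be sketched by hand. In the example below, all function names are hypothetical. The second function emulates, in plain C++, a reordering that is valid only when there is no loop-carried dependency: it reads every input before writing any output. It is not a representation of the code the compiler generates. The two versions agree when the buffers are separate and disagree when the output overlaps the input by one element, which is the situation the option asks the compiler to assume does not occur.

```cpp
#include <cstddef>
#include <vector>

// Reference semantics: iterations run one after another, in order.
void add_one_in_order(float* dst, const float* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i] + 1.0f;
    }
}

// Emulates a reordering that is valid only without loop-carried
// dependencies: every input is read before any output is written.
void add_one_loads_first(float* dst, const float* src, std::size_t n) {
    std::vector<float> loaded(src, src + n);  // snapshot of all inputs
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = loaded[i] + 1.0f;
    }
}

// Runs both versions and reports whether they produce the same buffer.
// With overlap = true, dst starts one element after src in the same
// buffer, so iteration i reads what iteration i - 1 wrote.
bool versions_agree(bool overlap) {
    const std::size_t n = 3;
    std::vector<float> x(n + 1, 0.0f), y(n + 1, 0.0f);  // one buffer per run
    std::vector<float> separate(n, 0.0f);               // non-overlapping input
    const float* src_x = overlap ? x.data() : separate.data();
    const float* src_y = overlap ? y.data() : separate.data();
    add_one_in_order(x.data() + 1, src_x, n);
    add_one_loads_first(y.data() + 1, src_y, n);
    return x == y;
}
```

Calling versions_agree(false) and versions_agree(true) from a small main function shows the difference; code that resembles the overlapping case is not a safe candidate for level 1 or level 2.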
IDE Equivalent
None
Alternate Options
None
Examples
The following loop will not be vectorized because of a possible data dependency. Specifying [q or Q]opt-assume-no-loop-carried-dep=1 tells the compiler to assume that no data dependence occurs in this loop, which allows the loop to be vectorized:
void sub (float *A, float *B, int *M) {
  for (int i = 0; i < 10000; i++) {
    A[i] += B[M[i]] + 1;
  }
}
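For a call to this loop to be consistent with the assumption, the elements read through B[M[i]] must not be elements that earlier iterations write through A. The following sketch shows one such caller. The helper run_sub_on_distinct_arrays and the data it fills in are hypothetical, added only for illustration; the body of sub is taken from the example above.

```cpp
#include <vector>

// Loop from the example above.
void sub(float* A, float* B, int* M) {
    for (int i = 0; i < 10000; i++) {
        A[i] += B[M[i]] + 1;
    }
}

// Hypothetical caller: A and B are separate allocations, so a write to A[i]
// can never change a value that a later iteration reads through B[M[i]].
std::vector<float> run_sub_on_distinct_arrays() {
    const int n = 10000;  // matches the trip count hard-coded in sub
    std::vector<float> A(n, 0.0f), B(n);
    std::vector<int> M(n);
    for (int i = 0; i < n; ++i) {
        B[i] = static_cast<float>(i);
        M[i] = n - 1 - i;  // arbitrary in-range index pattern
    }
    sub(A.data(), B.data(), M.data());
    return A;
}
```

If B pointed into the same storage as A, the indices in M would decide whether a dependency exists, which is why the compiler cannot rule one out on its own.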
In the following example, the matrix multiply kernel will not be optimized because a dependency must be assumed at every level of the loop nest. Specifying [q or Q]opt-assume-no-loop-carried-dep=2 will result in loop transformations such as blocking, unroll and jam, and vectorization:
void matmul(double *a, double *b, double *c) {
  int i, j, k;
  int n = 1024;
  for (i = 0; i < 1024; i++) {
    for (j = 0; j < 1024; j++) {
      for (k = 0; k < 1024; k++) {
        c[i * n + j] += a[i * n + k] * b[k * n + j];
      }
    }
  }
}
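Transformations of a whole loop nest change the order in which iterations run, so they preserve results only if no iteration depends on another at any level. The sketch below illustrates this with a hand-written loop interchange standing in for such a transformation. The function names, the size parameter n, and the 2x2 test data are assumptions made for this illustration and are not part of the original example. With separate a, b, and c the two loop orders agree; when c is the same buffer as a, they do not.

```cpp
#include <cstddef>
#include <vector>

// Same update as the kernel above, with the size passed in (an assumption
// made here so that a tiny matrix can be used).
void matmul_ijk(double* a, double* b, double* c, std::size_t n) {
    for (std::size_t i = 0; i < n; i++)
        for (std::size_t j = 0; j < n; j++)
            for (std::size_t k = 0; k < n; k++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

// Hand-written loop interchange (j and k swapped), standing in for the kind
// of reordering a loop transformation performs.
void matmul_ikj(double* a, double* b, double* c, std::size_t n) {
    for (std::size_t i = 0; i < n; i++)
        for (std::size_t k = 0; k < n; k++)
            for (std::size_t j = 0; j < n; j++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

// Runs both loop orders on 2x2 all-ones inputs and reports whether the
// outputs match. With in_place = true, c is the same buffer as a, which
// violates the no-dependency assumption.
bool orders_agree(bool in_place) {
    const std::size_t n = 2;
    std::vector<double> a1(n * n, 1.0), a2(n * n, 1.0), b(n * n, 1.0);
    std::vector<double> c1(n * n, 0.0), c2(n * n, 0.0);
    double* out1 = in_place ? a1.data() : c1.data();
    double* out2 = in_place ? a2.data() : c2.data();
    matmul_ijk(a1.data(), b.data(), out1, n);
    matmul_ikj(a2.data(), b.data(), out2, n);
    return in_place ? (a1 == a2) : (c1 == c2);
}
```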

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.