Dynamic alignment clause for vector pragma/directive

By Xiaoping Duan, Published: 06/12/2018, Last Updated: 06/11/2018

Intel® Compiler 19.0 provides a new dynamic alignment clause for the vector pragma/directive.

C/C++
#pragma vector dynamic_align[(pointer)]
#pragma vector nodynamic_align
 
Fortran
!DIR$ vector dynamic_align[(var)]
!DIR$ vector nodynamic_align
 
With the dynamic_align clause, the compiler generates a peel loop so that accesses through the specified pointer are aligned. If no pointer is specified, the compiler decides automatically which pointer gets aligned loads and stores, or it may not generate a peel loop at all. With the nodynamic_align clause, the compiler does not generate a peel loop.
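To make the idea of a peel loop concrete, the following is a hand-written C++ sketch of the transformation, not the code the compiler actually emits. The helper names peel_count and mul_peeled and the 32-byte alignment target are assumptions chosen for illustration. A few scalar iterations run first until the chosen pointer reaches an aligned address, and the remaining iterations can then use aligned accesses through that pointer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many leading scalar iterations are needed
// before p + n sits on an `align`-byte boundary (capped at len).
std::size_t peel_count(const float* p, std::size_t align, std::size_t len) {
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    std::size_t misalign = static_cast<std::size_t>(addr % align);
    if (misalign == 0) return 0;               // already aligned
    std::size_t bytes = align - misalign;      // distance to the next boundary
    if (bytes % sizeof(float) != 0) return len; // boundary unreachable: stay scalar
    std::size_t n = bytes / sizeof(float);
    return n < len ? n : len;
}

// Hypothetical illustration of peeling for pointer b.
void mul_peeled(float* a, const float* b, const float* c, std::size_t len) {
    const std::size_t kAlign = 32;             // assumed vector width in bytes
    std::size_t peel = peel_count(b, kAlign, len);
    std::size_t i = 0;

    // Peel loop: scalar iterations until b + i is aligned.
    for (; i < peel; ++i)
        a[i] = b[i] * c[i];

    // Main loop: b + i starts on an aligned address here, so a vectorizer
    // could use aligned loads for b; a and c may still be unaligned.
    for (; i < len; ++i)
        a[i] = b[i] * c[i];
}
```

Only one pointer can be guaranteed aligned this way unless the others happen to share the same offset, which is why the clause lets you choose the pointer that matters most.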
 
Consider the following example:
void foo(float * a, float * b, float * c, int len) // source test.cpp
{
 int i;

 for (i = 0;i < len;i++)
        a[i] = b[i]*c[i];
}

Compile it and check the optimization report:


$ icc -fno-alias -c test.cpp -qopt-report=5 -qopt-report-phase=vec
$ cat test.optrpt
......
LOOP BEGIN at test.cpp(5,2)
<Peeled loop for vectorization>
LOOP END
LOOP BEGIN at test.cpp(5,2)
   remark #15388: vectorization support: reference a[i] has aligned access   [ test.cpp(6,2) ]
   remark #15389: vectorization support: reference b[i] has unaligned access   [ test.cpp(6,9) ]
   remark #15388: vectorization support: reference c[i] has aligned access   [ test.cpp(6,14) ]

The compiler generates a peeled loop and automatically chooses "a" and "c" for aligned access.

Now place "#pragma vector dynamic_align(b)" before the loop, re-compile it and check the optimization report:

LOOP BEGIN at test.cpp(6,2)
<Peeled loop for vectorization>
LOOP END
LOOP BEGIN at test.cpp(6,2)
   remark #15389: vectorization support: reference a[i] has unaligned access   [ test.cpp(7,2) ]
   remark #15388: vectorization support: reference b[i] has aligned access   [ test.cpp(7,9) ]
   remark #15389: vectorization support: reference c[i] has unaligned access   [ test.cpp(7,14) ]

The compiler generates a peeled loop and chooses "b" for aligned access.
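For reference, the modified test.cpp for this step would look roughly like the listing below. The exact placement of the pragma is inferred from the report, where the loop moves from line 5 to line 6. Compilers that do not recognize the pragma typically ignore it, possibly with a warning, so the function computes the same result either way.

```cpp
void foo(float * a, float * b, float * c, int len) // source test.cpp
{
 int i;

#pragma vector dynamic_align(b)
 for (i = 0;i < len;i++)
        a[i] = b[i]*c[i];
}
```

The pragma applies only to the loop that immediately follows it.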

Now replace the "dynamic_align(b)" clause with the "nodynamic_align" clause, re-compile it and check the optimization report:

LOOP BEGIN at test.cpp(6,2)
   remark #15389: vectorization support: reference a[i] has unaligned access   [ test.cpp(7,2) ]
   remark #15389: vectorization support: reference b[i] has unaligned access   [ test.cpp(7,9) ]
   remark #15389: vectorization support: reference c[i] has unaligned access   [ test.cpp(7,14) ]

The compiler doesn't generate a peeled loop, and all pointer accesses are unaligned.

Refer to the Intel® Parallel Studio XE 2019 Composer Edition product documentation for additional details.

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804