Vectorization reporting bug / regression

There is a substantial change in vectorization reporting with ifort 13.1 vs 12.1 on the two operating systems I have access to, Linux and Mac OS X.  I assume this is a bug / regression, rather than the intended behavior.

With 12.1:

host% ifort --version
ifort (IFORT) 12.1.3 20120212
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.
host% ifort -O3 -xAVX -vec-report1 vec_report_bug.F90 -c
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(32): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(33): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(34): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(35): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(37): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(38): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(39): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(40): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(42): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(44): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(45): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(46): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(47): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(49): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(51): (col. 1) remark: LOOP WAS VECTORIZED.

With 13.1:

host% ifort --version
ifort (IFORT) 13.1.3 20130607
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.
host% ifort -O3 -xAVX -vec-report1 vec_report_bug.F90 -c
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.
vec_report_bug.F90(31): (col. 1) remark: LOOP WAS VECTORIZED.

Examination of the assembly code shows that both 12.1 and 13.1 are generating vectorized code for all of the lines reported by 12.1.

Attachment: vec-report-bug.f90 (1.88 KB)

Thanks - we'll take a look. Escalated as issue DPD200247457.

Steve - Intel Developer Support

The developers tell me that this is not a bug. What has happened instead is that the 13.1 compiler aggressively fuses the multiple loops, where the 12.1 compiler didn't. If you ask for an optimization report, you see something like this:

High Level Optimizer Report (_WENO6N)
Fusion loop partitions: (loop line numbers)
Fused Loops: ( 49 51 )
Fused Loops: ( 47 49 )
Fused Loops: ( 46 47 )
Fused Loops: ( 45 46 )
Fused Loops: ( 44 45 )
Fused Loops: ( 42 44 )
Fused Loops: ( 40 42 )
Fused Loops: ( 39 40 )
Fused Loops: ( 38 39 )
Fused Loops: ( 37 38 )
Fused Loops: ( 35 37 )
Fused Loops: ( 34 35 )
Fused Loops: ( 33 34 )
Fused Loops: ( 32 33 )
Fused Loops: ( 31 32 )

and then a "loop distribution" optimization splits the fused loop up again. In a future release we plan to better integrate the vectorization report to make this more understandable.
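
To make "fusion" concrete, here is a minimal sketch of what the transformation means at source level. It is not the attached file: the program name, array names, and loop bodies are hypothetical, and the compiler does this internally rather than by rewriting source. Two adjacent loops with the same bounds are merged into a single loop, which may be why a report keyed to source lines no longer shows one remark per original loop.

```fortran
! Hypothetical illustration of loop fusion; not the attached vec_report_bug.F90.
program fusion_sketch
  implicit none
  integer, parameter :: n = 1024
  real :: a(n), b(n), c(n), b_fused(n), c_fused(n)
  integer :: i

  do i = 1, n
     a(i) = real(i)
  end do

  ! As written: two adjacent loops with identical bounds and no
  ! dependence between them, so they are candidates for fusion.
  do i = 1, n
     b(i) = 2.0 * a(i)
  end do
  do i = 1, n
     c(i) = a(i) + 1.0
  end do

  ! Hand-written equivalent of the fused form: one loop, both statements.
  do i = 1, n
     b_fused(i) = 2.0 * a(i)
     c_fused(i) = a(i) + 1.0
  end do

  ! Both forms compute the same values.
  print *, 'fused results match: ', all(b == b_fused) .and. all(c == c_fused)
end program fusion_sketch
```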

Steve - Intel Developer Support

If you wish to prevent fusion at some loop boundaries, you can use the !DIR$ NOFUSION directive. By fusing some loops, the compiler should be able to approach full performance at smaller loop counts with less unrolling, provided there is no store-to-reload misalignment.
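
As a rough sketch of the directive mentioned above, assuming it is placed immediately before the DO loop that should stay separate: the program, arrays, and loop bodies here are made up for illustration. Whether this restores one vectorization remark per line with 13.1 is not confirmed in this thread. Compilers that do not recognize the directive treat it as an ordinary comment.

```fortran
! Hypothetical example of keeping two adjacent loops separate.
program nofusion_sketch
  implicit none
  integer, parameter :: n = 1024
  real :: a(n), b(n), c(n)
  integer :: i

  do i = 1, n
     a(i) = real(i)
  end do

  do i = 1, n
     b(i) = 2.0 * a(i)
  end do

  ! Ask the Intel compiler not to fuse the following loop with its
  ! neighbours; other compilers see only a comment line.
  !DIR$ NOFUSION
  do i = 1, n
     c(i) = a(i) + 1.0
  end do

  print *, sum(b), sum(c)
end program nofusion_sketch
```

Comparing the vectorization report with and without the directive should show whether the loop boundary was preserved.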
