Loop Vectorization Divide By Zero

I recently upgraded to ifort 13.1.3. With the upgrade, additional loops are being vectorized, and in some cases this causes divide-by-zero SIGFPEs. The loop contains a condition that should prevent a divide by zero, but it occurs nonetheless. I've attached a minimal example with a makefile demonstrating the issue. This appears to be an issue only when the -mp1 flag is used (no flag or -fltconsistency works OK).

My setup:
Intel FORTRAN 13.1.3 x86_64
Red Hat Enterprise Linux 6.4
Intel Core i5 650

Thanks for your help.
-Phil

Attachments:
vec.f90 (699 bytes)
makefile.txt (114 bytes)

Your example program has two misspelled identifiers and doesn't build. But I figured it out. You need to use -fp-model strict if you're going to change the floating point environment, otherwise the compiler assumes you're not going to do that and makes additional optimizations.

We recommend not using -mp1 or -fltconsistency - use the -fp-model options instead.

Steve - Intel Developer Support

Use of -mp1 is discouraged; we recommend options such as -fp-model precise or -fp-model strict, or perhaps the two options -prec-div and -prec-sqrt.

For this particular case, newer versions of ifort allow greater vectorization of loops with conditionals such as the one in your example. This sometimes lets the compiler compute all possible outcomes in temporaries and then use a vector mask of T or F values to select which results are assigned to the output. Thus both the THEN and ELSE branches may be computed for every element, resulting in the FP divide by zero that you describe.
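To make the masking effect concrete, here is a rough model in Python rather than Fortran. It is only a sketch: the attached vec.f90 is not reproduced in this thread, so the loop shape, the array values, and the function names `scalar_loop` and `if_converted_loop` are all assumptions. Python raises ZeroDivisionError on a float divide by zero, which stands in here for an unmasked FP trap.

```python
# Toy model of if-conversion. ZeroDivisionError plays the role of SIGFPE.
# Function names and data are hypothetical, not taken from vec.f90.

def scalar_loop(num, den):
    """Guarded loop as written in source: divide only when den != 0."""
    out = []
    for n, d in zip(num, den):
        if d != 0.0:
            out.append(n / d)
        else:
            out.append(0.0)
    return out


def if_converted_loop(num, den):
    """Masked form: evaluate both branches on every lane, then blend."""
    mask = [d != 0.0 for d in den]
    # Speculative step: the division runs on ALL lanes, including den == 0.
    then_vals = [n / d for n, d in zip(num, den)]
    else_vals = [0.0] * len(num)
    return [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]


num = [1.0, 2.0, 3.0, 4.0]
den = [2.0, 0.0, 4.0, 0.0]

print(scalar_loop(num, den))  # the guard keeps this safe

try:
    if_converted_loop(num, den)
except ZeroDivisionError:
    print("trap: the THEN branch was evaluated on a lane the mask would discard")
```

The blend step would have thrown away the bad lanes, but the trap fires before the blend is reached, which is the behaviour described above when FP exceptions are unmasked.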

You can prevent this with -fp-speculation=safe, which stops the compiler from performing unsafe FP speculation. You will see that this prevents the error in your test case. Oh, there were a few typos in the test case, but they were trivial to fix and did not detract from the analysis.
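As an illustration of what exception-safe masked evaluation has to guarantee, the Python sketch below divides on every lane but first swaps zero denominators for a harmless value, so no discarded lane can trap. This is a source-level idiom shown under assumptions (the function name `safe_masked_divide`, the fill value, and the data are hypothetical); it is not a description of the code ifort actually emits under -fp-speculation=safe.

```python
# Toy model of trap-free masked division. ZeroDivisionError would play the
# role of SIGFPE here, but it is never raised. Names and data are hypothetical.

def safe_masked_divide(num, den, fill=0.0):
    """Divide all lanes without ever dividing by zero, then blend by mask."""
    mask = [d != 0.0 for d in den]
    # Inactive lanes get a denominator of 1.0 so the division cannot trap.
    safe_den = [d if m else 1.0 for d, m in zip(den, mask)]
    quotients = [n / d for n, d in zip(num, safe_den)]
    # Inactive lanes take the fill value; their quotient is discarded.
    return [q if m else fill for q, m in zip(quotients, mask)]


print(safe_masked_divide([1.0, 2.0, 3.0, 4.0], [2.0, 0.0, 4.0, 0.0]))
# -> [0.5, 0.0, 0.75, 0.0]
```

The results match the guarded scalar loop, and every lane still goes through the same straight-line sequence of operations, which is what keeps the loop vectorizable.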

Also, if you use -g, you can add -traceback to get more information from the stack traceback.

I would recommend ditching -mp1 and investigating -fp-model. -fp-model precise balances performance with accuracy. I think you may like this option better than the old -mp1. In terms of modern options, -mp1 is roughly equivalent to -prec-div -prec-sqrt.

ron

Thank you both for your helpful replies. And, sorry for the typos!
