Results differ between an application compiled with -O0 and with -O3

Hi,

    I am running some climate models and have hit an annoying problem. If the climate model is compiled in debug mode with -O0 (using intel/12.1.9.293 and openmpi 1.4.3), the results are totally different from those of the same model compiled with -O3.

    Could someone please tell me if this is normal? Should a model compiled in debug mode be used in production runs?

     Many thanks.

Cheers,

Lyndon.

Hi Lyndon

You should not run code compiled with -O0 in production. You need to find the cause of the difference. Very likely it is caused by inconsistencies in floating-point operations. See http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/ for a good introduction to this topic (the PDF file can be downloaded from the bottom of the page). That article covers computation on a single compute node only, however. Parallelization with MPI (and OpenMP) can cause numerical differences too, e.g. when the order of operations in a reduction changes with the optimization level. If you use a recent Intel compiler (13.1 or 14.0 rather than the 12.1 you currently use), OpenMP reductions can be forced to be deterministic by setting the environment variable KMP_DETERMINISTIC_REDUCTION=yes

Thus, as first steps:

1. Try to find a configuration that does not use OpenMP or MPI, if possible. This helps to rule out these parallel models as the cause.

2. Use the options "-O2 -fp-model precise" and check the results. Neither should have a large impact on performance. Next you might also try "-fp-model strict", but this will slow down the code considerably. It would tell you, though, whether the FP operations are the cause of the numerical instability.

3. Compile half of your source files with -O2 and the rest with -O0, and continue recursively until you have found the file causing the difference.

4. For the file found in (3), copy half of the routines to a new file, compile one file with -O2 and the other with -O0, and search for the critical routine in the same way as in (3).
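
The bisection in steps (3) and (4) could be scripted roughly as follows. This is only a sketch in Python: the callback `differs_when_optimized` and the file names are hypothetical, and the real build-and-run step for your model has to be filled in by you. It also assumes that a single file is enough to trigger the difference.

```python
from typing import Callable, List, Sequence


def find_culprit(files: Sequence[str],
                 differs_when_optimized: Callable[[List[str]], bool]) -> str:
    """Bisect `files` to locate one file whose optimization changes the results.

    `differs_when_optimized(subset)` is a hypothetical callback: it should
    rebuild with `subset` at -O2 and everything else at -O0, run the model,
    and return True if the output differs from the all -O0 reference.
    Assumes that one culprit file alone triggers the difference.
    """
    candidates = list(files)
    while len(candidates) > 1:
        half = candidates[:len(candidates) // 2]
        if differs_when_optimized(half):
            candidates = half                      # culprit is in the optimized half
        else:
            candidates = candidates[len(half):]    # culprit is in the other half
    return candidates[0]


# Stand-in for a real build-and-run step: pretend "ocean_mix.f90" is the culprit.
def fake_check(optimized: List[str]) -> bool:
    return "ocean_mix.f90" in optimized


# Made-up file names, for illustration only.
sources = ["atm_core.f90", "ice.f90", "ocean_mix.f90", "io.f90", "land.f90"]
print(find_culprit(sources, fake_check))
```

The same function can be reused for step (4) by passing routine names instead of file names. If several files each contribute to the difference, this simple bisection will find only one of them.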

Once you have found the routine, you may be able to see what is causing the difference. Typically it is code related to a reduction operation where e.g. vectorization changes the order of operations.
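
To illustrate why the order of operations matters, here is a small Python sketch (not compiler output). It sums the same numbers once strictly left to right and once into four partial sums that are combined at the end, which is roughly what a vectorized reduction does. The input values and the lane count of 4 are assumptions chosen to make the effect visible.

```python
import math

# Values of very different magnitude make rounding differences easy to see.
values = [1.0e16, 1.0, -1.0e16, 1.0] * 4


def sequential_sum(xs):
    """Left-to-right accumulation, as a plain scalar loop would do."""
    total = 0.0
    for x in xs:
        total += x
    return total


def lane_sum(xs, lanes=4):
    """Accumulate into `lanes` partial sums, then combine them.

    This mimics the reordering a vectorized reduction may introduce;
    it is a model of the idea, not of any particular compiler.
    """
    partial = [0.0] * lanes
    for i, x in enumerate(xs):
        partial[i % lanes] += x
    return sequential_sum(partial)


print("sequential:", sequential_sum(values))
print("4 lanes:   ", lane_sum(values))
print("exact:     ", math.fsum(values))  # correctly rounded reference sum
```

Both loops are legitimate ways to add the numbers, yet they round differently, so the two results disagree with each other and with the correctly rounded sum from `math.fsum`.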

Heinz 

How different are your results? Our in-house CFD codes running with MPI always produce slightly different results, but they are still accurate to ~1% at worst. Depending on which nodes you get, how busy they are, etc., the order of MPI reductions can definitely affect the noise in your simulations. It is also possible that changing the optimization level is exposing an MPI/parallel programming bug related to message passing (a race condition, etc.) that wasn't triggered in the other case.
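
One way to put a number on "how different" is the largest relative difference between the two runs. The Python sketch below assumes you can load matching output values from both builds into two lists; the sample numbers are made up, and the 1% threshold is just the figure from our codes, not a general rule.

```python
def max_relative_difference(reference, candidate, floor=1e-30):
    """Largest |a - b| / max(|a|, |b|, floor) over paired values.

    `floor` avoids division by zero when both values are 0.
    """
    if len(reference) != len(candidate):
        raise ValueError("both runs must provide the same number of values")
    worst = 0.0
    for a, b in zip(reference, candidate):
        scale = max(abs(a), abs(b), floor)
        worst = max(worst, abs(a - b) / scale)
    return worst


# Made-up sample values standing in for output from the -O0 and -O3 builds.
o0_run = [288.15, 101325.0, 0.0042]
o3_run = [288.16, 101324.0, 0.0042]

diff = max_relative_difference(o0_run, o3_run)
print(f"max relative difference: {diff:.2e}")
print("within 1%" if diff <= 0.01 else "exceeds 1%")
```

For a chaotic system such as a climate model, pointwise differences can grow over a long run even when both builds are correct, so comparing after a few time steps is probably more informative than comparing final states.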

-Zaak
