Wrong results by Intel MPI 4.0.3.006?

jackyjngwn

Hi

I am using Intel MPI 4.0.3.006 to run my application, and found that the generated outputs differ from those generated with an older version of MPI. I also tried MPICH2-1.4 and got the same wrong outputs. Has anyone seen a similar problem, and how can I fix it?

Thanks.

James Tullos (Intel)

Hi Jacky,

Can you please give some more information about which outputs have changed? Which previous version are you using for comparison? Are you using the same compiler? What operating system are you using? Did you really mean 4.0.3.006?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

jackyjngwn

I am using Red Hat Linux 5, and I am comparing the results from MPICH2-1.0.4 with those from Intel MPI 4.0.3.006 (the one with the Hydra process manager).

I used the same compiler, gcc 4.2.4. I also tried gcc 4.6.1 and the Intel compiler 12.1.0, but got the same incorrect outputs.

Thanks.

James Tullos (Intel)

Hi Jacky,

You should be using 4.0.3.008, rather than 4.0.3.006.

What application are you using? Do you have a small reproducer code for this problem?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

jackyjngwn

James,

I think I got this version of Intel MPI as a test package before 4.0.3.008 was released. But I got the same results using 4.0.2.003.

The application I am using is quite complicated, so it's almost impossible for me to produce a small reproducer. Looking at the output, it seems as if some input data were not propagated to certain nodes, or some nodes did not send their results back to the master node. However, I did not get any MPI error message.

Have you ever seen or heard of a similar problem? Or could you suggest a way for me to pinpoint the cause? Someone suggested that it might be caused by confusion between "big endian" and "little endian". How can I check that?

Thanks!

James Tullos (Intel)

Hi Jacky,

I would actually recommend trying the latest version of the Intel® MPI Library, Version 4.1.

Have you tried using the message checking library? Use -check with mpirun to enable this.

Are you using a homogeneous cluster?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
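
On the endianness question raised above: byte order only becomes an issue when the nodes of a cluster differ in architecture, which is why the homogeneity question matters. x86 and x86-64 hosts are all little-endian, so on a cluster made up only of such nodes a byte-order mismatch is unlikely to be the cause. The following is a minimal sketch in C of how one might check the byte order of a host; the function names `host_is_little_endian` and `swap_bytes32` are hypothetical helpers introduced for illustration, not part of MPI or of the application discussed in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 on a big-endian one.
 * It stores a known 32-bit pattern and inspects which byte lands first in memory. */
int host_is_little_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];

    memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x04; /* least significant byte first => little-endian */
}

/* Hypothetical helper: reverses the byte order of a 32-bit value.
 * This is the transformation a value undergoes if raw bytes are exchanged
 * between hosts of opposite byte order without any conversion. */
uint32_t swap_bytes32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

One way to use this would be to call `host_is_little_endian()` on every node (for example, once per rank at startup) and compare the results. If all nodes report the same value, byte order can probably be ruled out and the missing-data symptom has to be traced elsewhere.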
