MPI_Allreduce strange result

FortCpp:

Hi users and developers,

I am getting a strange result from the MPI_Allreduce subroutine in Fortran. Here is the situation:

ALLOCATE(mesh%idx%lxyz_inv(nr(1, 1):nr(2, 1), nr(1, 2):nr(2, 2), nr(1, 3):nr(2, 3)))
mesh%idx%lxyz_inv(:,:,:) = 0
!In a subroutine, the array is first allocated and initialized. The bounds nr(1, ?):nr(2, ?) are all -36:36 for the test run
!...
!...
npoints = product(nr(2, 1:3) - nr(1, 1:3) + 1)
call MPI_Allreduce(MPI_IN_PLACE, mesh%idx%lxyz_inv(nr(1, 1), nr(1, 2), nr(1, 3)), npoints, MPI_INTEGER, MPI_BOR, mpi_world%comm, mpi_err)
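Since the allocated array occupies one contiguous block of memory, passing the address of its first element with count npoints relies on that contiguity; the same in-place reduction can equivalently be written on the whole array (a sketch using the same names as above):

```fortran
! Equivalent form, assuming the array is contiguous (true for an ALLOCATE'd array).
! MPI_IN_PLACE tells MPI to use the receive buffer as the send buffer as well.
call MPI_Allreduce(MPI_IN_PLACE, mesh%idx%lxyz_inv, npoints, &
                   MPI_INTEGER, MPI_BOR, mpi_world%comm, mpi_err)
```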

Something goes wrong here at run time (I just ran it by typing a.exe, not a parallel run). Before the MPI_Allreduce call, the first 48 elements of mesh%idx%lxyz_inv were all zero. But after the call, the first 48 elements became:

0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 1140850688 1 0
0 0 0 0 0 32768
64 4194304 64 896 384 0
-2 37010544 0 0 0 0
0 0 0 0 1078984704 0
0 1 0 1 0 3899248

I noticed that the first strange number, 1140850688, is the value of MPI_IN_PLACE defined in mpif.h. Other than that, I have no clue what is going on.

I cannot reproduce it in a small test program: when I test MPI_Allreduce with an array of the same size in a simple code, it works fine.
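For reference, the simple standalone test I tried was essentially the following sketch (the buffer name is illustrative; the size matches the 73x73x73 test run):

```fortran
program test_allreduce
  implicit none
  include 'mpif.h'
  integer :: ierr
  integer :: buf(389017)   ! 73**3 elements, same count as npoints above
  call MPI_Init(ierr)
  buf = 0
  ! In-place bitwise-OR reduction over the whole buffer
  call MPI_Allreduce(MPI_IN_PLACE, buf, size(buf), MPI_INTEGER, &
                     MPI_BOR, MPI_COMM_WORLD, ierr)
  ! With all-zero input, every element should still be zero afterwards
  call MPI_Finalize(ierr)
end program test_allreduce
```

In this standalone form the buffer stays all zero, as expected.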

I am using Intel Cluster Studio 13, with everything updated, and Intel MPI. I am trying to build a 64-bit program, so I used the headers and libraries in $(I_MPI_ROOT)em64t. I don't know whether that is the cause of this strange result. The build is fine except for a number of linker warnings: "warning LNK4049: locally defined symbol mpifcmb5_ imported", "warning LNK4049: locally defined symbol mpipriv1_ imported", "warning LNK4049: locally defined symbol mpipriv2_ imported", and "warning LNK4049: locally defined symbol mpiprivc_ imported". The OS is Windows 7 64-bit.

Any comment will be appreciated! I can attach my project if necessary. Thanks.

James Tullos (Intel):

Hi,

You should not be seeing those linker warnings. The easiest way for us to investigate is if you can send the project. Either attach it to a post or, if you prefer, send it directly to me.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

FortCpp:

James,

Thanks for your response. I sent you a message with my solution attached.

James Tullos (Intel):

Hi,

I've got the Visual Studio* solution you sent. What version of Visual Studio* are you using? I'm using 2010, and so far I'm up to 2 hours waiting for the solution to load. If you have the Intel® Trace Analyzer and Collector, try linking with VTmc.lib. This is the Correctness Checker library, and it will verify that the MPI calls are correct. It will slow the application down significantly, so I don't recommend using it outside of development. I'll try this myself once I can get the solution loaded. Have you tried this on Linux*?
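On Linux*, assuming the Intel® MPI compiler wrappers are on your PATH, the correctness checker can typically be linked in with the -check_mpi option (a sketch; exact option availability depends on your Intel MPI and Trace Collector versions):

```
# Build with the Intel Trace Collector correctness-checking library linked in
mpiifort -check_mpi -o a.out your_source.f90
```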

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

FortCpp:

Hi James,

I am using the same version, MSVS 2010. There seems to be a small problem somewhere in IVF. I posted it here: http://software.intel.com/en-us/forums/topic/368495

There is a quick fix: disabling the IVF Database. Then everything should load fine. Could you please try again after disabling the Database?

I have never used the Trace Analyzer, so I will learn to use it. I am not sure how far I can get on my own, so I am still hoping you can help me out.

Thanks,

Yonghui

James Tullos (Intel):

Hi Yonghui,

Thank you for the information; that does make the solution load much more quickly. I'll try linking with the Correctness Checker and see if I can find anything.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

FortCpp:

Hi James,

Did you find the problem in the code? I tried removing the call from the source and testing on a single CPU, and it turned out to be OK.

Other than that, I don't have anything new to report.

Thanks,

Yonghui

James Tullos (Intel):

Hi Yonghui,

I have not been able to find anything yet. I'm going to try to put together a smaller reproducer and have our developers look at it.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

FortCpp:

OK. Thanks a lot.

I am looking forward to the result.

Yonghui

James Tullos (Intel):

Hi Yonghui,

The developers have found and corrected the problem.  The fix will be available in our next release.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

FortCpp:

Great! Was it a problem in the Fortran wrapper or in the MPI library itself? Just curious.

Yonghui

James Tullos (Intel):

Hi Yonghui,

The problem was due to an incorrectly exported symbol from MPI.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

James Tullos (Intel):

Hi Yonghui,

The newest version of the Intel® MPI Library, Version 4.1 Update 1, is now available for download from the Intel® Registration Center.  Please download it and verify that it corrects the problem for you.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

lyh03259:

Hi James,

I can't see it in the Intel Registration Center; all I can see is "Intel Cluster Studio". It also doesn't show up in the Intel Software Manager, where I update my software.

Could you give me a download link?

Thanks.

James Tullos (Intel):

Hi Yonghui,

If you look under Intel® Cluster Studio, you should be able to see the individual components. There, you can click on the version number and get to the download link. Please see the attached screenshot.

James.

Attachments:

lyh03259:

Hi James,

I just checked it. The problem is fixed! Thanks a lot.

But it seems there is one more problem that wasn't there before. I am looking into it.

James Tullos (Intel):

Hi Yonghui,

Great! I'm going to go ahead and close the Allreduce issue. Once you know whether there is a new problem, let me know and we'll look into it.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

lyh03259:

Thanks James.
