How to debug an MPI program in VS2010+Intel MPI?

Dear all,

I have an MPI program written in Fortran that runs from the command line with:
mpiexec -n 8 test

Now I want to debug the program in the VS2010 environment with F5. How should I configure the Intel Visual Fortran project to make this work?

Thanks,
Zhanghong Tang

Dear all,

I need to find a memory leak in my program. I tried Intel VTune Amplifier XE 2011, but I don't know how to launch the program through mpiexec. Can I do this with some other tool or method? My computer runs 64-bit Windows 7.

Thanks,
Zhanghong Tang

Hi Zhanghong,

There are two ways to debug an MPI application in Visual Studio. If your edition of Visual Studio supports it, you can use the MPI Cluster Debugger by going to the project Properties, selecting Debugging, and choosing the MPI Cluster Debugger in the "Debugger to launch" drop-down box.

If the MPI Cluster Debugger is not supported in your version of Visual Studio, then you will need to launch your process and attach the Visual Studio debugger while the processes are running.
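One common way to make the attach approach practical is to have a rank spin on a flag right after MPI_INIT, so there is time to attach (Debug > Attach to Process) before the interesting code runs. The following is only a minimal sketch of that idea, not something taken from this thread: the program name and the variable wait_for_debugger are hypothetical, and it assumes the `mpi` module shipped with your MPI library is available.

```fortran
program attach_wait_demo
  use mpi
  implicit none
  integer :: ierr, myid
  ! VOLATILE forces the compiler to re-read the flag on every iteration,
  ! so a change made from the debugger is actually seen by the loop.
  logical, volatile :: wait_for_debugger   ! hypothetical name

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

  ! Hold only the rank you plan to attach to (rank 0 here, as an assumption).
  wait_for_debugger = (myid == 0)
  do while (wait_for_debugger)
     ! Spin until the flag is set to .false. from the debugger.
  end do

  ! The other ranks wait here until the held rank has been released.
  call MPI_BARRIER(MPI_COMM_WORLD, ierr)

  ! ... the real program would continue here ...

  call MPI_FINALIZE(ierr)
end program attach_wait_demo
```

Launched with mpiexec, this sketch hangs on purpose until you attach to the held process, break, set wait_for_debugger to .false. in the Watch or Locals window, and continue. Remove the loop (or guard it with a build option) once debugging is done.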

I would not recommend using Intel VTune Amplifier for finding memory errors. It is intended to find areas for performance improvement. There are other tools you can use for finding program errors.

Intel Inspector is designed to find memory leaks and race conditions. It can be used with an MPI program on the command line. First, set the appropriate environment variables (C:\Program Files (x86)\Intel\Inspector XE\inspxe-vars.bat), then run:

mpiexec -n <number of processes> inspxe-cl -r <result directory> -collect <analysis type> <executable>

You can also use the MPI Correctness Checker by compiling with -check_mpi. This will check your MPI calls for errors at runtime.

Please let me know if you need some additional options or assistance.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Thank you very much for your kind reply. I found that when I open a C++ project, the MPI Cluster Debugger is available, but when the project is an IVF project, I can't find the MPI Cluster Debugger. Is there anything wrong?

I tried to check the memory problem with Intel Inspector, and the following error was displayed:

ID   Problem               Sources        Modules  Object Size  State
P13  Memory leak           calloc_impl.c  DE.exe   532          New

What could cause this problem?

There are also many entries of this kind:

ID   Problem               Sources        Modules  Object Size  State
P4   Kernel resource leak  mytest.f90     DE.exe                New

This kind of error is related to 'open' and 'write' operations in the Fortran code. How can I solve it?

Thanks,
Zhanghong Tang

Hi James,

Could you please tell me how to add the 'check_mpi' option to a VS2010 project? I tried adding
/check_mpi
under Fortran/Command Line, and the following warning was displayed:

ifort: command line warning #10006: ignoring unknown option '/check_mpi'

Thanks,
Zhanghong Tang

Hi Zhanghong,

Questions about the integration with Visual Studio are best asked in the Intel Visual Fortran Compiler for Windows forum. Someone there will be able to help more with the debugger options listed in Visual Studio.

I believe the memory leak shown in Intel Inspector for the Intel MPI Library is a non-issue. I know I have seen it and typically ignore it, with no (noticeable) problems. I will verify this. In general, questions about Inspector should go to the Intel Inspector XE forum, and the non-MPI leak can best be explained there.

To add MPI correctness checking in Visual Studio, add VTmc.lib (found in C:\Program Files (x86)\Intel\Trace Analyzer and Collector\8.0.3.008\lib\impi64) to your project.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Thank you very much for your kind reply.

Do you mean that I only need to add VTmc.lib, without adding the /check_mpi option?

Thanks,
Zhanghong Tang

Hi Zhanghong,

Yes, that library is the correctness checking library. Adding it will add correctness checking (it intercepts the MPI calls at runtime).

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Dear James,

Thank you very much for your kind help. I have successfully linked the library into the program. I ran the program with:

mpiexec -n 16 de1

The following output was displayed:

D:\Users\tang\Debug>mpiexec -n 16 de1

[0] INFO: CHECK LOCAL:EXIT:SIGNAL ON
[0] INFO: CHECK LOCAL:EXIT:BEFORE_MPI_FINALIZE ON
[0] INFO: CHECK LOCAL:MPI:CALL_FAILED ON
[0] INFO: CHECK LOCAL:MEMORY:OVERLAP ON
[0] INFO: CHECK LOCAL:MEMORY:ILLEGAL_MODIFICATION ON
[0] INFO: CHECK LOCAL:MEMORY:INACCESSIBLE ON
[0] INFO: CHECK LOCAL:REQUEST:ILLEGAL_CALL ON
[0] INFO: CHECK LOCAL:REQUEST:NOT_FREED ON
[0] INFO: CHECK LOCAL:REQUEST:PREMATURE_FREE ON
[0] INFO: CHECK LOCAL:DATATYPE:NOT_FREED ON
[0] INFO: CHECK LOCAL:BUFFER:INSUFFICIENT_BUFFER ON
[0] INFO: CHECK GLOBAL:DEADLOCK:HARD ON
[0] INFO: CHECK GLOBAL:DEADLOCK:POTENTIAL ON
[0] INFO: CHECK GLOBAL:DEADLOCK:NO_PROGRESS ON
[0] INFO: CHECK GLOBAL:MSG:DATATYPE:MISMATCH ON
[0] INFO: CHECK GLOBAL:MSG:DATA_TRANSMISSION_CORRUPTED ON
[0] INFO: CHECK GLOBAL:MSG:PENDING ON
[0] INFO: CHECK GLOBAL:COLLECTIVE:DATATYPE:MISMATCH ON
[0] INFO: CHECK GLOBAL:COLLECTIVE:DATA_TRANSMISSION_CORRUPTED ON
[0] INFO: CHECK GLOBAL:COLLECTIVE:OPERATION_MISMATCH ON
[0] INFO: CHECK GLOBAL:COLLECTIVE:SIZE_MISMATCH ON
[0] INFO: CHECK GLOBAL:COLLECTIVE:REDUCTION_OPERATION_MISMATCH ON
[0] INFO: CHECK GLOBAL:COLLECTIVE:ROOT_MISMATCH ON
[0] INFO: CHECK GLOBAL:COLLECTIVE:INVALID_PARAMETER ON
[0] INFO: CHECK GLOBAL:COLLECTIVE:COMM_FREE_MISMATCH ON
[0] INFO: maximum number of errors before aborting: CHECK-MAX-ERRORS 1
[0] INFO: maximum number of reports before aborting: CHECK-MAX-REPORTS 0 (= unlimited)
[0] INFO: maximum number of times each error is reported: CHECK-SUPPRESSION-LIMIT 10
[0] INFO: timeout for deadlock detection: DEADLOCK-TIMEOUT 60s
[0] INFO: timeout for deadlock warning: DEADLOCK-WARNING 300s
[0] INFO: maximum number of reported pending messages: CHECK-MAX-PENDING 20

[0] WARNING: LOCAL:MEMORY:OVERLAP: warning
[0] WARNING: Send and receive buffers overlap at address 0000000000299E80.
[0] WARNING: Control over buffers is about to be transferred to MPI at:
[0] WARNING: MPI_SCATTER(*sendbuf=0x0000000000299e80, sendcount=20, sendtype=MPI_DOUBLE_PRECISION, *recvbuf=0x0000000000299e80, recvcount=20, recvtype=MPI_DOUBLE_PRECISION, root=0, comm=MPI_COMM_WORLD, *ierr=0x000000000029f5e0)
[0] WARNING: MOVEBOUNDNODEINDEX (de1)
[0] WARNING: MOVEBOUNDNODEINDEX (de1)
[0] WARNING: mkl_blas_mc3_ctrsm_run (de1)
[0] WARNING: mkl_pds_blkldl_ooc_pardiso (de1)
[0] WARNING: BaseThreadInitThunk (kernel32)
[0] WARNING: RtlUserThreadStart (ntdll)
[0] WARNING: ()

[0] WARNING: LOCAL:MEMORY:OVERLAP: warning
[0] WARNING: Send and receive buffers overlap at address 000000000029EDC0.
[0] WARNING: Control over buffers is about to be transferred to MPI at:
[0] WARNING: MPI_GATHER(*sendbuf=0x000000000029edc0, sendcount=1, sendtype=MPI_DOUBLE_PRECISION, *recvbuf=0x000000000029edc0, recvcount=1, recvtype=MPI_DOUBLE_PRECISION, root=0, comm=MPI_COMM_WORLD, *ierr=0x000000000029f5e0)
[0] WARNING: ELEMENT_mp_QUAD2DOUTER (de1)
[0] WARNING: MOVEBOUNDNODEINDEX (de1)
[0] WARNING: mkl_blas_mc3_ctrsm_run (de1)
[0] WARNING: mkl_pds_blkldl_ooc_pardiso (de1)
[0] WARNING: BaseThreadInitThunk (kernel32)
[0] WARNING: RtlUserThreadStart (ntdll)
[0] WARNING: ()

[0] WARNING: LOCAL:MEMORY:OVERLAP: warning
[0] WARNING: Send and receive buffers overlap at address 000000000029C680.
[0] WARNING: Control over buffers is about to be transferred to MPI at:
[0] WARNING: MPI_SCATTER(*sendbuf=0x000000000029c680, sendcount=20, sendtype=MPI_DOUBLE_PRECISION, *recvbuf=0x000000000029c680, recvcount=20, recvtype=MPI_DOUBLE_PRECISION, root=0, comm=MPI_COMM_WORLD, *ierr=0x000000000029f5e0)
[0] WARNING: ELEMENT_mp_QUAD2DOUTER (de1)
[0] WARNING: MOVEBOUNDNODEINDEX (de1)
[0] WARNING: mkl_blas_mc3_ctrsm_run (de1)
[0] WARNING: mkl_pds_blkldl_ooc_pardiso (de1)
[0] WARNING: BaseThreadInitThunk (kernel32)
[0] WARNING: RtlUserThreadStart (ntdll)
[0] WARNING: ()

[0] WARNING: LOCAL:MEMORY:OVERLAP: warning
[0] WARNING: Send and receive buffers overlap at address 000000000029EE40.
[0] WARNING: Control over buffers is about to be transferred to MPI at:
[0] WARNING: MPI_GATHER(*sendbuf=0x000000000029ee40, sendcount=1, sendtype=MPI_DOUBLE_PRECISION, *recvbuf=0x000000000029ee40, recvcount=1, recvtype=MPI_DOUBLE_PRECISION, root=0, comm=MPI_COMM_WORLD, *ierr=0x000000000029f5e0)
[0] WARNING: ELEMENT_mp_QUAD2DOUTER (de1)
[0] WARNING: MOVEBOUNDNODEINDEX (de1)
[0] WARNING: mkl_blas_mc3_ctrsm_run (de1)
[0] WARNING: mkl_pds_blkldl_ooc_pardiso (de1)
[0] WARNING: BaseThreadInitThunk (kernel32)
[0] WARNING: RtlUserThreadStart (ntdll)
[0] WARNING: ()

[15] ERROR: LOCAL:EXIT:SIGNAL: fatal error
[15] ERROR: Fatal signal 3 (???) raised.
[15] ERROR: Stack back trace:
[15] ERROR: ELEMENT_mp_MATAB2D (de1)
[15] ERROR: ELEMENT_mp_ADDCK2D (de1)
[15] ERROR: ELEMENT_mp_QUAD2DOUTER (de1)
[15] ERROR: MOVEBOUNDNODEINDEX (de1)
[15] ERROR: mkl_blas_mc3_ctrsm_run (de1)
[15] ERROR: mkl_pds_blkldl_ooc_pardiso (de1)
[15] ERROR: BaseThreadInitThunk (kernel32)
[15] ERROR: RtlUserThreadStart (ntdll)
[15] ERROR: ()
[15] ERROR: After leaving:
[15] ERROR: MPI_BCAST(*buffer=0x0000000000dac038, count=6, datatype=MPI_DOUBLE_PRECISION, root=0, comm=MPI_COMM_WORLD, *ierr=0x000000000019855c->MPI_SUCCESS)
[0] WARNING: starting premature shutdown
[6] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[6] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[6] ERROR: sending remaining 0 of 0 bytes failed: send(): Connection reset by peer.
[6] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[5] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[5] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[5] ERROR: sending remaining 0 of 0 bytes failed: send(): Connection reset by peer.
[5] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[7] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[7] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[7] ERROR: sending remaining 0 of 0 bytes failed: send(): Connection reset by peer.
[7] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[15] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[15] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.
[15] ERROR: sending remaining 0 of 0 bytes failed: send(): Connection reset by peer.
[15] ERROR: receiving remaining 8 of 8 bytes failed: recv(): Connection reset by peer.

job aborted:
rank: node: exit code[: error message]
0: N01: 123
1: N01: 123
2: N01: 123
3: N01: 123
4: N01: -1073741819: process 4 exited without calling finalize
5: N01: 1
6: N01: 1
7: N01: 1
8: N01: 123
9: N01: 123
10: N01: 123
11: N01: 123
12: N01: 123
13: N01: 123
14: N01: 123
15: N01: 1

The related MPI code is as follows:

bstart=myid*blocksize+1
bend=bstart+blocksize-1
recvbuf=>pop(1:DD,bstart:bend)
call MPI_SCATTER (pop,sendcount,MPI_DOUBLE_PRECISION,recvbuf,sendcount,MPI_DOUBLE_PRECISION,0,comm,ierr)
...
sendarr=>val(bstart:bend)
call MPI_Gather (sendarr,blocksize,MPI_DOUBLE_PRECISION,val,blocksize,MPI_DOUBLE_PRECISION,0,comm,ierr)
...
call MPI_Bcast ( KT, nt, MPI_DOUBLE_PRECISION, 0, comm, ierr )

Could you please tell me how to modify the MPI-related code?

Thanks,
Zhanghong Tang

Hi Zhanghong,

According to the warnings, you are using the same address for sending and receiving in the collective operations. While this is supported, it is better done using the MPI_IN_PLACE argument. For MPI_Scatter, replace recvbuf with MPI_IN_PLACE, and for MPI_Gather, replace sendarr with MPI_IN_PLACE. Note that MPI_IN_PLACE is only valid on the root rank for these two calls.
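The MPI_IN_PLACE suggestion above could look roughly like the sketch below. It is an illustration under assumptions, not the poster's actual code: the array sizes DD and blocksize are hypothetical (chosen to match the counts in the checker output), the arrays are filled with dummy data, and MPI_COMM_WORLD stands in for comm. MPI_IN_PLACE is passed only on the root; the other ranks pass an ordinary receive or send buffer.

```fortran
program in_place_demo
  use mpi
  implicit none
  integer, parameter :: DD = 20, blocksize = 1   ! hypothetical sizes
  integer :: ierr, myid, nprocs, bstart, bend, sendcount
  double precision, allocatable :: pop(:,:), val(:)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  allocate(pop(DD, blocksize*nprocs), val(blocksize*nprocs))
  pop = dble(myid)          ! dummy data for the sketch
  val = dble(myid)
  sendcount = DD*blocksize
  bstart = myid*blocksize + 1
  bend   = bstart + blocksize - 1

  ! Scatter: the root keeps its own block where it already is (MPI_IN_PLACE
  ! as recvbuf); every other rank receives into its block of pop.
  if (myid == 0) then
     call MPI_SCATTER(pop, sendcount, MPI_DOUBLE_PRECISION, &
                      MPI_IN_PLACE, sendcount, MPI_DOUBLE_PRECISION, &
                      0, MPI_COMM_WORLD, ierr)
  else
     call MPI_SCATTER(pop, sendcount, MPI_DOUBLE_PRECISION, &
                      pop(:, bstart:bend), sendcount, MPI_DOUBLE_PRECISION, &
                      0, MPI_COMM_WORLD, ierr)
  end if

  ! Gather: the root's contribution is already in val (MPI_IN_PLACE as
  ! sendbuf); every other rank sends its own block.
  if (myid == 0) then
     call MPI_GATHER(MPI_IN_PLACE, blocksize, MPI_DOUBLE_PRECISION, &
                     val, blocksize, MPI_DOUBLE_PRECISION, &
                     0, MPI_COMM_WORLD, ierr)
  else
     call MPI_GATHER(val(bstart:bend), blocksize, MPI_DOUBLE_PRECISION, &
                     val, blocksize, MPI_DOUBLE_PRECISION, &
                     0, MPI_COMM_WORLD, ierr)
  end if

  if (myid == 0) print *, 'gathered val on root: ', val

  deallocate(pop, val)
  call MPI_FINALIZE(ierr)
end program in_place_demo
```

On non-root ranks the send buffer of MPI_SCATTER and the receive buffer of MPI_GATHER are not significant, so the first argument of the scatter and the fourth argument of the gather are effectively ignored there.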

As for the error, that appears to be coming from MKL, sometime after the MPI_Bcast call. If you are using the Intel Math Kernel Library, I would suggest asking on that forum (http://software.intel.com/en-us/forums/intel-math-kernel-library/). It is possible that there is an MPI problem leading to it, but without more information, that's not something I can easily identify.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Thank you very much for the helpful information. I put MPI_IN_PLACE in the code as you suggested and the warning messages disappeared. I also found some bugs after checking the code carefully. However, the following error messages still bother me, and I don't know what could cause them:

[35] ERROR: LOCAL:EXIT:SIGNAL: fatal error
[35] ERROR: Fatal signal 3 (???) raised.
[35] ERROR: Stack back trace:
[35] ERROR: (de)
[35] ERROR: (de)
[35] ERROR: (de)
[35] ERROR: VT_AtExit (de)
[35] ERROR: VT_AtExit (de)
[35] ERROR: BaseThreadInitThunk (kernel32)
[35] ERROR: RtlUserThreadStart (ntdll)
[35] ERROR: ()
[35] ERROR: After leaving:
[35] ERROR: MPI_BCAST(*buffer=0x000000000187bf68, count=6, datatype=MPI_DOUBLE_PRECISION, root=0, comm=MPI_COMM_WORLD, *ierr=0x0000000000a4e25c->MPI_SUCCESS)

Could you please help me to check these problems?

Thanks,
Zhanghong Tang

Hi Zhanghong,

What you are seeing is a fatal error in rank 35. The last MPI call before the error was MPI_BCAST, broadcasting six double-precision values from rank 0. This call was successful. The stack trace shows (as best it can) where the error occurred. Without seeing the source, I can't really identify the problem, but it does not appear to come directly from MPI. Do you have debugging symbols turned on? You should get a better stack trace with debugging on.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Thank you very much for your kind reply. Strangely, the program runs correctly in Debug mode. I also found that VTune can't point out where the problem is really located. For example, it shows the error at line 8000, but when I go to that line I find only comments.

I will continue to check what leads to the problem in Release mode.

Thanks,
Zhanghong Tang

Hi Zhanghong,

VTune isn't really the program you want to use for debugging. It's more for finding places to improve performance of your code. I would recommend using Inspector instead, as it is intended for finding memory leaks and race conditions. If your problem goes away when running in debug, I would start looking for those two problems first.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Sorry to bother you again. There is another problem. I run the program with the following command:

D:\Users\tang\Debug>mpiexec -wdir \\n01\debug -hosts 10 n01 12 n02 12 n03 12 n04 12 n05 12 n06 12 n07 12 n08 12 n09 12 n10 12 \\n01\debug\de

It runs successfully the first time. However, when I run the program again with the same command, many errors like the following are reported from different nodes:

launch failed: CreateProcess(\\n01\debug\de) on 'N01' failed, error 2 - The system cannot find the file specified.

Strangely, after I rebuild the project to regenerate the executable and run the same command again, the program runs successfully. As a result, every time I modify the model (by modifying the parameter files), I have to rebuild the program to make it work. Could you please tell me what could cause this kind of problem?

Thanks,
Zhanghong Tang

Hi Zhanghong,

The error message, as you've probably determined, says that MPI cannot find the executable. If it only works once, I would check the folder after the run to make sure that your executable is still in the folder.

When you say you are changing parameter files, are these files part of what is compiled into the program, or are they read by the program at runtime?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Thank you very much for your kind reply. The executable file is still in that folder. I only change the parameter files, which are text files and, as you said, are read by the program at runtime.

Furthermore, after I modified the parameter files, the executable could run on the local node with the following command without rebuilding the program:
mpiexec -n 24 \\n01\debug\de

Thanks,
Zhanghong Tang

Hi Zhanghong,

Are you changing the names of the parameter files? Is the name hardcoded into the program somewhere? When you recompile, are you changing anything in the program, or just recompiling it as it is?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Thank you very much for your kind reply.

I didn't change the names of parameter files, and the names are hardcoded into the program. I didn't change anything in the program when recompiling the program.

Thanks,
Zhanghong Tang

Hi Zhanghong,

What if you have another executable in the folder (such as the MPI test program) and run it instead? Are you able to run it using the same command multiple times?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Thank you very much for your kind reply. I haven't tested the MPI test program in the folder yet. I will test it later.

For the same command (running the same program with the same parameters several times), I tested and found that sometimes it works and sometimes it fails with the same error message.

Thanks,
Zhanghong Tang

Hi Zhanghong,

I would recommend checking the node hosting the shared folder and making certain it isn't having failures.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
