Instrumenting an Example with Data Type Mismatch

To experiment with the data type mismatch example, copy the datatype_mismatch example directory from the installation directory to your home directory:

$ cp -r <install-dir>/itac_latest/examples/checking/global/collective/datatype_mismatch/ ~
$ cd ~/datatype_mismatch

Then compile and run the MPI_Bcast example located in the directory using the following commands:

$ mpiicc -g MPI_Bcast.c -o MPI_Bcast
$ mpirun -n 4 -check_mpi -genv VT_CHECK_MAX_ERRORS 0 MPI_Bcast

The command lines above use the following flags:

  • -g – generate debugging information in the object file so that reports can refer to source files and line numbers
  • -check_mpi – dynamically link the correctness checker library (VTmc.so)
  • -genv VT_CHECK_MAX_ERRORS 0 – set the maximum number of errors found to unlimited (the default is 1)

After running the application, you get output similar to the following:

...
[0] ERROR: GLOBAL:COLLECTIVE:DATATYPE:MISMATCH: error
[0] ERROR:    Mismatch found in local rank [1] (global rank [1]),
[0] ERROR:    other processes may also be affected.
[0] ERROR:    No problem found in local rank [0] (same as global rank):
[0] ERROR:       MPI_Bcast(*buffer=0x7fff1066e814, count=1, datatype=MPI_INT, root=0, comm=MPI_COMM_WORLD)
[0] ERROR:       main (/checking/global/collective/datatype_mismatch/MPI_Bcast.c:50)
[0] ERROR:    1 elements transferred by peer but 4 expected by
[0] ERROR:    the 3 processes with local ranks [1:3] (same as global ranks):
[0] ERROR:       MPI_Bcast(*buffer=..., count=4, datatype=MPI_CHAR, root=0, comm=MPI_COMM_WORLD)
[0] ERROR:       main (/checking/global/collective/datatype_mismatch/MPI_Bcast.c:53)
[0] INFO: GLOBAL:COLLECTIVE:DATATYPE:MISMATCH: found 1 time (1 error + 0 warnings), 0 reports were suppressed
[0] INFO: Found 1 problem (1 error + 0 warnings), 0 reports were suppressed.

The error messages refer to lines 50 and 53 in the MPI_Bcast.c source file:

...
39 int main (int argc, char **argv)
40 {
41     int rank, size;
42
43     MPI_Init( &argc, &argv );
44     MPI_Comm_size( MPI_COMM_WORLD, &size );
45     MPI_Comm_rank( MPI_COMM_WORLD, &rank );
46
47     /* error: types do not match */
48     if( !rank ) {
49         int send = 0;
50         MPI_Bcast( &send, 1, MPI_INT, 0, MPI_COMM_WORLD );
51     } else {
52         char recv[4];
53         MPI_Bcast( &recv, 4, MPI_CHAR, 0, MPI_COMM_WORLD );
54     }
55
56     MPI_Finalize( );
57
58     return 0;
59 }

The code above passes mismatched data types to MPI_Bcast: the root (rank 0) broadcasts one MPI_INT element, while all other ranks receive four MPI_CHAR elements. On platforms where an int occupies four bytes, the number of transferred bytes is the same, so MPI normally does not detect this issue.

To fix the issue:

  • In line 52, change the receive buffer from a char array to an int.
  • In line 53, change the MPI datatype argument from MPI_CHAR to MPI_INT, and the number of received elements to 1.

52         int recv;
53         MPI_Bcast( &recv, 1, MPI_INT, 0, MPI_COMM_WORLD );
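
Another way to rule out this kind of mismatch is to let every rank execute the same MPI_Bcast call, so that the count and datatype cannot diverge between the root and the receivers. The program below is a minimal sketch of that pattern and is not part of the shipped example: the variable name value and the payload 42 are placeholders chosen for illustration.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    int value = 0;  /* same buffer type on every rank */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Only the root fills in the payload; 42 is an arbitrary placeholder. */
    if (rank == 0) {
        value = 42;
    }

    /* One call site: count and datatype are identical on all ranks. */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("rank %d has value %d\n", rank, value);

    MPI_Finalize();
    return 0;
}
```

This sketch can be compiled and run with the same mpiicc and mpirun commands used for MPI_Bcast.c, substituting the file name.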

To check that you have eliminated the errors reported by the correctness checker, recompile and rerun the application with the same commands. The output should now include the following:

...
[0] INFO: Error checking completed without finding any problems.
...