Why can’t Hotspots Analysis trace the spawned processes of an MPI job?


An MPI job can be run on the local host with the mpiexec command, for example "mpiexec -np 4 program".

You may find that the Hotspots Analysis in VTune™ Amplifier XE 2011 displays your MPI program as only one process, one thread, and one module, even when the "Analyze child processes" option is enabled. Note: the "Analyze system-wide" option is not needed and does not help for Hotspots Analysis.

Here is an example that shows this problem. It calculates Pi using the MPI library; the core of the code looks like this:


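/* Rank 0 broadcasts the number of intervals n to all ranks. */
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

h   = 1.0 / (double) n;
sum = 0.0;
/* Each rank integrates every numprocs-th interval (midpoint rule). */
for (i = myid + 1; i <= n; i += numprocs) {
     x = h * ((double)i - 0.5);
     sum += 4.0 / (1.0 + x*x);
}
mypi = h * sum;

/* Sum the partial results of all ranks into pi on rank 0. */
MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

if (myid == 0)
   printf("pi is approximately %.16f, Error is %.16f\n",
                  pi, fabs(pi - PI25DT));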


Normally you would use the following steps to compile and run your MPI program (note: Intel® MPI Library 4.0.0 and Intel® VTune™ Amplifier XE 2011 are assumed to be already installed on your local machine):

1) export PATH=$PATH:/opt/intel/vtune_amplifier_xe_2011/bin64/
2) source /opt/intel/impi/4.0.0/bin64/mpivars.sh
3) mpicc -g pi.c -o pi.gcc
4) mpdboot
5) amplxe-cl -collect hotspots -r r0001hs -- mpiexec -np 4 ./pi.gcc
6) amplxe-cl -report hotspots -r r0001hs -group-by process

You can also view the results in the GUI by running the command "amplxe-gui". You will find that only the process "python" is displayed.



The reason is that mpiexec doesn't run the MPI program directly. It connects to the MPI mpd daemon through a socket and passes all parameters to it, so the program is started by the daemon and is not a child process of mpiexec.
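The parent/child relationship described above can be checked directly while a job is running. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names parent_pid and is_descendant are hypothetical and are not part of VTune Amplifier XE or the Intel MPI Library.

```shell
# Hypothetical helpers; they assume Linux, where /proc/<pid>/status has a "PPid:" line.

# Print the parent PID of process $1.
parent_pid() {
    [ -r "/proc/$1/status" ] || return 1
    while read -r key value rest; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Return 0 if process $2 appears anywhere in the parent chain of process $1.
is_descendant() {
    desc_pid=$(parent_pid "$1") || return 1
    while [ -n "$desc_pid" ] && [ "$desc_pid" -gt 0 ]; do
        if [ "$desc_pid" -eq "$2" ]; then
            return 0
        fi
        desc_pid=$(parent_pid "$desc_pid") || return 1
    done
    return 1
}
```

Calling is_descendant with the PID of one MPI rank and the PID of the launcher (both can be looked up with ps while the job runs) should fail for a rank started through the mpd daemon, because that rank's parent chain leads to the daemon and not to the launcher that the collector started.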


To solve this, run your MPI program on the local host with the command "mpiexec.hydra" instead of "mpiexec", and add the "-bootstrap fork" option. The MPI processes are then started on the local host through the fork mechanism of the operating system, so they are descendants of the launcher and the collector can follow them. For example: "mpiexec.hydra -bootstrap fork -np 4 program".

So change steps 5 and 6 to:
5b) amplxe-cl -collect hotspots -r r0002hs -- mpiexec.hydra -bootstrap fork -np 4 ./pi.gcc
6b) amplxe-cl -report hotspots -r r0002hs -group-by process
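
The two commands can be wrapped in a small shell function so that the result directory and the number of ranks are given only once. This is just a sketch: the function name profile_local_mpi and the DRY_RUN variable are hypothetical, and the only amplxe-cl and mpiexec.hydra options used are the ones shown in the steps of this article.

```shell
# Hypothetical helper: collect hotspots for a single-node MPI run, then
# print the per-process report. Set DRY_RUN=1 to print the commands
# instead of executing them.
# Usage: profile_local_mpi <result_dir> <ranks> <program> [args...]
profile_local_mpi() {
    result_dir=$1
    nranks=$2
    shift 2
    run=
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run=echo
    fi
    # Start the ranks with fork so they are descendants of the launcher.
    $run amplxe-cl -collect hotspots -r "$result_dir" -- \
        mpiexec.hydra -bootstrap fork -np "$nranks" "$@" || return 1
    $run amplxe-cl -report hotspots -r "$result_dir" -group-by process
}
```

For the example in this article the call would be "profile_local_mpi r0002hs 4 ./pi.gcc". Running it with DRY_RUN=1 first prints both command lines for inspection without starting a collection.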

Now the correct results appear when you open the result with the command "amplxe-gui". Besides the processes "mpiexec.hydra" and "pmi_proxy", the four "pi.gcc" processes are displayed too.


1) This method is also helpful for the Lightweight Hotspots analysis if you do not want to use system-wide analysis.
2) It only works if your MPI program runs on a single local node (multi-core or multi-processor).
