Problem:
A command such as "mpiexec -np 4 program" runs an MPI job on the local host.
When you profile such a job, the Hotspots analysis in VTune™ Amplifier XE 2011 may show your MPI program as only one process, one thread, and one module, even with the "Analyze child processes" option enabled. Note: the "Analyze system-wide" option is not needed for Hotspots analysis and has no effect on it.
The following example shows the problem. It calculates Pi using the MPI library; the code looks like this:
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &myid);
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
h = 1.0 / (double) n;
sum = 0.0;
for (i = myid + 1; i <= n; i += numprocs) {
    x = h * ((double)i - 0.5);
    sum += 4.0 / (1.0 + x*x);
}
mypi = h * sum;
MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
if (myid == 0)
    printf("pi is approximately %.16f, Error is %.16f\n",
           pi, fabs(pi - PI25DT));
MPI_Finalize();
You would normally compile and run the MPI program with the following steps. (Note: this assumes Intel® MPI Library 4.0.0 and Intel® VTune™ Amplifier XE 2011 are already installed on your local machine.)
1) export PATH=$PATH:/opt/intel/vtune_amplifier_xe_2011/bin64/
2) source /opt/intel/impi/4.0.0/bin64/mpivars.sh
3) mpicc -g pi.c -o pi.gcc
4) mpdboot
5) amplxe-cl -collect hotspots -r r0001hs -- mpiexec -np 4 ./pi.gcc
6) amplxe-cl -report hotspots -r r0001hs -group-by process
You can also view the results in the GUI with the "amplxe-gui" command. In either view, only a "python" process is displayed (the mpd-based mpiexec is a Python script); none of the "pi.gcc" processes appear.
Root-cause:
mpiexec does not run the MPI program directly. It connects to the MPI library's mpd daemon through a socket and passes all parameters to it, so the program is not a child process of mpiexec and the "Analyze child processes" option does not reach it.
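One way to check the parent/child relationship described above is to look up the parent PID of a running process. The following is a minimal sketch, assuming Linux with /proc mounted; "parent_of" is a hypothetical helper name, not part of Intel MPI or VTune Amplifier XE. With the mpd-based launcher, one would expect the parent of a "pi.gcc" rank to be an mpd daemon rather than mpiexec.

```shell
#!/bin/sh
# Assumption: Linux with /proc mounted. parent_of is a hypothetical helper.
# Prints the parent PID of the process whose PID is passed as $1.
parent_of() {
    awk '/^PPid:/ { print $2 }' "/proc/$1/status"
}

# Example: report the parent of this shell itself.
echo "shell $$ has parent $(parent_of $$)"

# While an MPI job is running, the same helper can be pointed at one rank:
#   parent_of "$(pgrep -n pi.gcc)"
# and "ps -o comm= -p <that PID>" then names the parent process.
```

If the parent reported for a rank is not the launcher that the collector started, the rank lies outside the process tree that "Analyze child processes" follows.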
Solution:
Run your MPI program on the local host with "mpiexec.hydra" instead of "mpiexec", and add the "-bootstrap fork" option. The MPI processes are then started on the local host through the operating system's fork mechanism. For example: "mpiexec.hydra -bootstrap fork -np 4 program".
So change steps 5 and 6 to:
5b) amplxe-cl -collect hotspots -r r0002hs -- mpiexec.hydra -bootstrap fork -np 4 ./pi.gcc
6b) amplxe-cl -report hotspots -r r0002hs -group-by process
The results are now correct when opened with the "amplxe-gui" command. Besides the "mpiexec.hydra" and "pmi_proxy" processes, four "pi.gcc" processes are displayed.
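If you collect repeatedly with different rank counts or result directories, the command from step 5b can be generated by a small helper. This is only a sketch: "collect_cmd" is a hypothetical function name, and it uses only the options shown in the steps above. It prints the command instead of running it, so you can review it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of VTune Amplifier XE or Intel MPI).
# Prints the step 5b collection command for the given arguments:
#   $1 = result directory, $2 = number of MPI processes, $3 = program
collect_cmd() {
    result_dir=$1
    ranks=$2
    program=$3
    echo "amplxe-cl -collect hotspots -r $result_dir -- mpiexec.hydra -bootstrap fork -np $ranks $program"
}

# Print the command used in this article's example.
collect_cmd r0002hs 4 ./pi.gcc

# To run the collection instead of printing it:
#   eval "$(collect_cmd r0002hs 4 ./pi.gcc)"
```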
Note:
1) This method also helps with Lightweight Hotspots analysis if you do not want to use system-wide analysis.
2) It only works if your MPI program runs on a single local node (with multiple cores or multiple processors).
