MPI run crashes on more than one node

Kind Attn: James Tullos (Intel)

I am facing a similar problem like and

My mpirun crashes with the error message below.

[proxy:0:4@cn004] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:70): assert (!(pollfds[i].revents & ~POLLIN & ~POLLOUT & ~POLLHUP)) failed
[proxy:0:4@cn004] main (./pm/pmiserv/pmip.c:387): demux engine error waiting for event
[mpiexec@hn1] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:101): one of the processes terminated badly; aborting
[mpiexec@hn1] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:18): bootstrap device returned error waiting for completion
[mpiexec@hn1] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:521): bootstrap server returned error waiting for completion
[mpiexec@hn1] main (./ui/mpich/mpiexec.c:548): process manager error waiting for completion

I have also tried the command mpirun -np N -check_mpi ./wrf.exe, which does not produce any output.

Thanks for your help and support.




Hi Sravan,

Are you able to run the provided test program (in the $I_MPI_ROOT/test/ folder, compile any one of the files there)?
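As a rough sketch of that check, the commands below build one of the bundled test sources and launch it. The file name test.c, the mpicc wrapper, and the NP variable are assumptions for illustration and may differ on your installation; how the nodes are selected (hostfile or job scheduler) is left to your existing setup.

```shell
# Sketch: build and launch a bundled Intel MPI test program.
# Assumptions (not from the thread): the C test source is named test.c,
# the compiler wrapper is mpicc, and NP holds the process count.
NP="${NP:-4}"
SRC="${I_MPI_ROOT:-/path/to/impi}/test/test.c"

if command -v mpicc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Compile the test source into a local executable
    mpicc "$SRC" -o ./mpi_test
    # Launch it the same way the failing job is launched
    mpirun -np "$NP" ./mpi_test
    STATUS=$?
else
    echo "mpicc or $SRC not found; adjust the paths for your installation"
    STATUS=skipped
fi
echo "test run status: $STATUS"
```

If the test program fails the same way as wrf.exe when it spans more than one node, the problem is likely in the cluster or MPI setup rather than in the application.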

Using -check_mpi links to the MPI correctness checking library, which checks the MPI calls within your code. It won't help diagnose a problem with getting your code started, which appears to be the case here.
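One way to look at startup separately from the application, sketched here as a suggestion rather than an official procedure, is to launch a non-MPI command such as hostname through mpirun. This exercises only the process launcher. The NP variable is an assumption for illustration.

```shell
# Sketch: check process launch on its own, without any MPI calls.
# Assumption (not from the thread): NP holds the process count.
NP="${NP:-2}"

if command -v mpirun >/dev/null 2>&1; then
    # hostname is not an MPI program, so only the launcher is exercised;
    # each started process prints the name of the node it runs on
    mpirun -np "$NP" hostname
    LAUNCH_STATUS=$?
else
    echo "mpirun not found in PATH"
    LAUNCH_STATUS=skipped
fi
echo "launch check status: $LAUNCH_STATUS"
```

If this already fails across nodes, the application and its MPI calls are not involved yet.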

James Tullos
Technical Consulting Engineer
Intel® Cluster Tools