Could you please provide the full output of your MPI run with the “-genv I_MPI_HYDRA_DEBUG=1” option? Also, please send us the output of “cat $PBS_NODEFILE” after the resources have been allocated.
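For reference, here is a minimal sketch of how both pieces of information could be collected from inside the PBS job script. The mpirun arguments are copied from the original post and the option is written in the form quoted above. The file name nodes.txt and the helper summarize_nodefile are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch of a PBS job-script fragment that gathers the requested diagnostics.
# Assumptions: nodes.txt is an illustrative file name and summarize_nodefile
# is a hypothetical helper, not part of Intel MPI or PBS.

# Print one line per host with the number of slots listed for it.
summarize_nodefile() {
    sort "$1" | uniq -c
}

# Only run inside a job, where PBS has set PBS_NODEFILE.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    # Save the raw allocation so it can be attached to the thread.
    cat "$PBS_NODEFILE" > nodes.txt
    summarize_nodefile "$PBS_NODEFILE"

    # Capture stdout and stderr together so the Hydra debug lines end up in log.txt.
    mpirun -genv I_MPI_HYDRA_DEBUG=1 -n 8 -machinefile "$PBS_NODEFILE" \
        /home/a.c.padilha/bin/vasp.teste.O0.debug.x > log.txt 2>&1
fi
```

The per-host summary makes it easy to check whether the 8 requested ranks actually fit in the slots that PBS handed out.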
I also had problems when trying to use more than one computing node with Intel MPI. These are my previous posts in case you can find some useful information:
The output is in the file log.txt. Even if I redirect my output to a file, I get this message
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
for each of the MPI processes. I looked for this libVTmc.so and found that it is a debugging library, so I believe it is not related to the original problem.
Thanks for your reply Iván, but I could not get the same error message you got in your posts, even though I used exactly the same flags in the mpirun call.
mpirun error "APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)"
Hello, I am using Intel mpirun (for Linux* OS, Version 4.0 Update 3 Build 20110824) to run a program that I compiled on our cluster. We use the PBS queue system (version PBSPro_11.1.0.111761).
When I use
$ mpirun -n 8 -machinefile $PBS_NODEFILE -verbose /home/a.c.padilha/bin/vasp.teste.O0.debug.x
I end up getting these error messages:
[proxy:0:1@n022] got crush from 5, 0
[proxy:0:2@n023] got crush from 5, 0
[proxy:0:2@n023] got crush from 4, 0
[proxy:0:0@n009] got crush from 6, 0
[proxy:0:0@n009] got crush from 9, 0
[proxy:0:0@n009] got crush from 17, 0
[proxy:0:1@n022] got crush from 4, 0
[proxy:0:0@n009] got crush from 10, 0
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)
I have tried calling mpirun with -check_mpi and -env I_MPI_DEBUG=5, but so far I have no clue what is going on. This happens only when I use more than one compute node.
Any help would be very nice.
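To see at a glance which nodes reported failures, the proxy messages quoted above can be tallied per host. This is only a sketch: it assumes the mpirun output was saved to a file, and count_crashes_per_node is a hypothetical helper name rather than anything shipped with Intel MPI.

```shell
#!/bin/sh
# Tally the "got crush" proxy messages per host from a saved mpirun log.
# Assumption: count_crashes_per_node is a hypothetical helper for illustration.
count_crashes_per_node() {
    # Input lines look like: [proxy:0:1@n022] got crush from 5, 0
    # Keep only the host name after '@', then count occurrences per host.
    grep 'got crush from' "$1" \
        | sed 's/^\[proxy:[^@]*@\([^]]*\)].*/\1/' \
        | sort | uniq -c
}

# Usage, with an assumed log file name:
#   count_crashes_per_node log.txt
```

Applied to the messages in this post, the tally covers n009, n022 and n023, so the failures are not confined to a single node.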