Dear all,
I'm trying to run a classic MPI test code on our cluster, and I'm still having trouble with it. I have installed Intel Cluster Studio XE 2013 for Linux and Torque 4.1.3.
If I don't use Torque and run "mpirun -f machine -np 18 ./code" directly, it runs fine (machine is the file listing the nodes). If I use Torque, it runs but only stops when the walltime limit is reached, with the following errors:
=>> PBS: job killed: walltime 143 exceeded limit 120
[mpiexec@node4] HYD_pmcd_pmiserv_send_signal (./pm/pmiserv/pmiserv_cb.c:221): assert (!closed) failed
[mpiexec@node4] ui_cmd_cb (./pm/pmiserv/pmiserv_pmci.c:128): unable to send SIGUSR1 downstream
[mpiexec@node4] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@node4] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:388): error waiting for event
[mpiexec@node4] main (./ui/mpich/mpiexec.c:718): process manager error waiting for completio
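
For context, the Torque case corresponds to a job script along the lines of the sketch below. This is only an illustration, not the exact script used: the job name and the nodes/ppn layout are assumptions (any layout giving 18 slots would do), and the 2-minute walltime is inferred from the "limit 120" in the log above.

```shell
#!/bin/bash
# Hypothetical job name, for illustration only.
#PBS -N mpi_test
# Assumed layout: 3 nodes x 6 cores = 18 MPI ranks.
#PBS -l nodes=3:ppn=6
# 120 seconds, matching the limit reported in the log.
#PBS -l walltime=00:02:00

# Torque starts the job in the home directory; go back to the submit directory.
cd "$PBS_O_WORKDIR"

# Use the node list Torque allocated instead of a hand-written machine file.
mpirun -machinefile "$PBS_NODEFILE" -np 18 ./code
```

Such a script would be submitted with "qsub", and $PBS_NODEFILE then points to the list of nodes Torque assigned to the job.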
Do you have any idea what could be wrong?
Thanks in advance,
M.



