How can I suspend all the processes in an MPI job? I tried to use I_MPI_JOB_SIGNAL_PROPAGATION but it didn't seem to work. I am using Intel MPI 4.0.1.007. Thanks.
Well, I've just checked with 4.0.2 and it works.

    [dk@cl210 ~]$ export I_MPI_JOB_SIGNAL_PROPAGATION=1
    [dk@cl210 ~]$ mpiexec -n 8 IMB-MPI1

In another terminal window:

    [dk@cl210 ~]$ ps ux | grep mpiexec
    dk 13809 0.1 0.0 140860 9876 pts/11 T 12:06 0:00 python /users/dk/impi/4.0.2/intel64/bin/mpiexec -n 8 IMB-MPI1
    [dk@cl210 ~]$ kill -20 13809    (send SIGTSTP)

In the first window you'll see:

    + Stopped mpiexec -n 8 IMB-MPI1

Then, in the second window, type:

    [dk@cl210 ~]$ kill -18 13809    (send SIGCONT)

And IMB continues to run.
Is that not what you see in your case?
Thanks for your reply. I tried what you did and unfortunately it didn't work in my case. Actually, nothing happened when I used "kill -18" in another terminal window. When I used "Ctrl-Z" in the terminal window where the program was running, only the first process was suspended and all the other processes kept running.
Is this because I am using Intel MPI 4.0.1.007? Or is there anything else I need to configure? Thanks.
I've taken a look at the mpiexec code, and you are absolutely right: the documentation and the actual behavior do not match. SIGTSTP and SIGCONT are not propagated to the application. This is easy to change in the code, but I doubt that you'll be able to do it yourself. You can submit a tracker at premier.intel.com and I'll send you a patch for testing.
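Until the launcher propagates the signals, one possible workaround is to signal the application processes directly instead of mpiexec. The following is only a minimal sketch under stated assumptions: all ranks run on the local node, and two background `sleep` processes stand in for the ranks (they are hypothetical placeholders, not part of Intel MPI). It uses SIGSTOP rather than SIGTSTP because SIGSTOP cannot be caught or ignored. For a multi-node job the same commands would have to be run on every node, which is not shown here.

```shell
#!/bin/sh
# Stand-ins for MPI ranks (hypothetical). In a real job these would be the
# PIDs of the application processes, for example collected with
# `pgrep -x IMB-MPI1` on each node (assumption, not tested with Intel MPI).
sleep 60 & p1=$!
sleep 60 & p2=$!
pids="$p1 $p2"

# Print the one-letter state of a process as reported by ps (T = stopped).
state() { ps -o stat= -p "$1" | tr -d ' ' | cut -c1; }

kill -STOP $pids          # suspend every rank
sleep 1
for p in $pids; do echo "$p after SIGSTOP: $(state "$p")"; done

kill -CONT $pids          # resume every rank
sleep 1
for p in $pids; do echo "$p after SIGCONT: $(state "$p")"; done

kill $pids                # clean up the stand-in processes
```

A stopped process shows state T in the ps output, the same state visible for mpiexec in the ps listing earlier in this thread. Whether the MPI ranks tolerate being stopped and resumed this way (for example with respect to communication timeouts) is not something this sketch verifies.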
Thanks for the reply. I tried to submit an issue at premier.intel.com, but the Intel cluster kit is not in my product list. What can I do in that case? Thanks.
What's your product? If it is Cluster Toolkit or Cluster Studio, you should be able to submit a tracker against Intel MPI Library for Linux.
I have submitted the issue. Could you please take a look? Thanks.
Got it. I'll be working on that.