Erroneous [pmi_proxy] <defunct> left behind

Nils M.

My application makes heavy use of MPI_Comm_spawn calls to dynamically create and abandon processes.

I am using Intel(R) MPI Library for Linux* OS, Version 4.1 Update 1 Build 20130522 on a Linux Cluster environment.

Each subsequent call of MPI_Comm_spawn unfortunately leaves a

 [pmi_proxy] <defunct>

process behind, even if the subprocess has finished normally. These processes are only removed when the whole application finishes. They consume almost no resources, but since I make about 2000 MPI_Comm_spawn calls, they can cause a serious and hard-to-detect failure if the OS reaches its file handle limit.
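To make the buildup visible while the application runs, something like the following could be used. It is only a minimal sketch: the function names `count_defunct` and `current_defunct_count` are made up for illustration, and it assumes a Linux `ps` (procps) that reports zombies with a state starting with `Z` and appends `<defunct>` to the command name.

```python
import subprocess


def count_defunct(ps_output: str, name: str = "pmi_proxy") -> int:
    """Count zombie (state Z) processes whose command name equals `name`.

    `ps_output` is expected to hold one process per line in the form
    "<state> <command>", as printed by `ps -e -o stat= -o comm=`.
    """
    count = 0
    for line in ps_output.splitlines():
        fields = line.split(None, 1)  # state first, then the command name
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        state, command = fields
        # Assumption: procps appends " <defunct>" to zombie command names.
        command = command.replace("<defunct>", "").strip()
        if state.startswith("Z") and command == name:
            count += 1
    return count


def current_defunct_count(name: str = "pmi_proxy") -> int:
    """Run `ps` on the local node and count defunct processes named `name`."""
    result = subprocess.run(
        ["ps", "-e", "-o", "stat=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_defunct(result.stdout, name)
```

Calling `current_defunct_count()` periodically on the node that issues the MPI_Comm_spawn calls should show whether the number grows by one per spawn.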

Searching the web turns up related reports on the MPICH bug tracker, namely tickets 670 and 1504 (the spam filter prevents me from posting convenient links), and on the MPICH discussion list:

http://lists.mpich.org/pipermail/discuss/2013-March/000515.html

Could this still be an issue in the Hydra implementation used by Intel MPI?

Thank you very much for your help!

Dmitry Sivkov (Intel)

Hi,

Thank you for the message.

Please submit a ticket for this issue through Intel(R) Premier Support.

--

Dmitry
