mpdboot error: Failed to establish a socket connection with node:53317 (111, 'Connection refused')


Minia Oseguera

Hi, I get this error when I run the following script on a cluster:

-------------- SCRIPT --------------------

#!/bin/bash

# Start mpd daemons on all compute nodes
echo "Shutting down any existing mpd daemon"
mpdallexit

echo "Starting MPI on all nodes"
mpdboot -r ssh -n 8 -f "$HOME/mpd.hosts"

echo "MPI was initialized on the following nodes:"
mpdtrace

-------------- ERROR --------------------

Shutting down any existing mpd daemon

mpdroot: cannot connect to local mpd at: /tmp/mpd2.console_root

probable cause: no mpd daemon on this machine

possible cause: unix socket /tmp/mpd2.console_root has been removed

mpdallexit (__init__ 1470): forked process failed; status=255

Starting MPI on all nodes

mpdboot_gdc-cluster (handle_mpd_output 883): Failed to establish a socket connection with compute-00-00:53317 : (111, 'Connection refused')

mpdboot_gdc-cluster (handle_mpd_output 900): failed to connect to mpd on compute-00-00

MPI was initialized on the following nodes:

mpdroot: cannot connect to local mpd at: /tmp/mpd2.console_root

probable cause: no mpd daemon on this machine

possible cause: unix socket /tmp/mpd2.console_root has been removed

mpdtrace (__init__ 1470): forked process failed; status=255

-----------------------------------------------

Does anyone know why this is happening? Thanks!

Dmitry Kuzmin (Intel)

Hi Minia,

Could you please tell us which MPI library and version you are using?
Please also confirm that you have set up passwordless SSH connections between the nodes.
Try running: 'ssh compute-00-01 hostname'

Regards!
Dmitry
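
The single-node check suggested above can be extended to every node in the hosts file. The following is only a minimal sketch: the function name check_hosts is hypothetical, and it assumes that mpd.hosts lists one host per line, optionally followed by ":ncpus" or further fields, with "#" starting a comment. BatchMode=yes makes ssh fail instead of prompting for a password, so a node without a working passwordless setup is reported as FAIL.

```shell
#!/bin/bash
# Sketch: check that each host in an mpd.hosts-style file accepts
# non-interactive (passwordless) SSH logins.
# Assumed file format: one host per line, optional ":ncpus" suffix,
# "#" starts a comment.

check_hosts() {
    local hosts_file="$1"
    local failed=0
    local line host
    while IFS= read -r line || [ -n "$line" ]; do
        line="${line%%#*}"            # strip comments
        read -r host _ <<< "$line"    # keep the first field only
        host="${host%%:*}"            # drop an optional ":ncpus" suffix
        [ -z "$host" ] && continue    # skip blank lines
        # -n keeps ssh from consuming the hosts file on stdin;
        # BatchMode=yes fails instead of asking for a password.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" hostname >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done < "$hosts_file"
    return "$failed"
}
```

After sourcing the file, call it as: check_hosts "$HOME/mpd.hosts". A FAIL line only shows that a non-interactive ssh login to that node did not succeed within the timeout; it does not by itself explain the 'Connection refused' reported by mpdboot.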
