Coarrays: problems with sample program.

Posted by rudi-gaelzer

My system: Dell Vostro 1500 w/ Intel Core 2 Duo T7300 @ 2.0GHz
OS: Fedora 17 w/ kernel 3.5.3-1.fc17.x86_64
Compiler (ifort -V): Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 13.0.0.079 Build 20120731

Following the tutorial for coarrays, I compiled the hello_image.f90 file:

program hello_image

  write(*,*) "Hello from image ", this_image(), &
              "out of ", num_images()," total images"

end program hello_image

with the command: ifort -coarray hello_image.f90 -o hello_image

Compilation went without trouble, but when I ran the code with $./hello_image, the program hung.

Interrupting the processing with CTRL-C, I got the following error message:

mpiexec_proteus.universe.net (mpiexec 1162): failed to obtain sock from the process manager (mpdman daemon). Please examine the /tmp/mpd2.logfile_rudi log file on each node of the ring.

The log file contains merely:

logfile for mpd with pid 2011
proteus.universe.net_mpdman_1: connection error in connect_lhs call: Connection timed out

I've just installed version 13.  Did I leave something out?  The sample was working with the previous version.

Posted by Steve Lionel (Intel)

Do you have a separate MPI implementation installed? What is the definition of your path environment variable?

Steve
Posted by rudi-gaelzer

Quote:

sblionel wrote:

Do you have a separate MPI implementation installed? What is the definition of your path environment variable?

Thanks for replying. AFAIK, the only implementation of MPI that I have installed is the one that comes with the Intel Fortran Composer XE 2013 bundle. OTOH, the installer kept the previous installation of XE 2011 (sp1.11.339).

The value of PATH on my system is:
===============================================================================
rudi@proteus|~>printenv PATH
/usr/local/intel/composer_xe_2013.0.079/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/mpirt/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/bin/intel64_mic:/usr/local/intel/composer_xe_2013.0.079/debugger/gui/intel64:/usr/lib64/qt-3.3/bin:/usr/local/intel/composer_xe_2013.0.079/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/mpirt/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/bin/intel64_mic:/usr/local/intel/composer_xe_2013.0.079/debugger/gui/intel64:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/sbin:/usr/sbin:/home/rudi/bin:/home/rudi/lib/bin
====================================================================

As you can see, each path relevant to the compiler is somehow duplicated within the variable value. I don't know how that happened, but no path points to the previous (2011) version. Could that be the problem?
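Repeated PATH entries like the ones reported here do not change which executable is found, since the first match wins, but they make the variable hard to read. A minimal sketch of a cleanup, assuming a POSIX shell with tr, awk, and paste available; the function name dedupe_path is hypothetical and not part of the Intel tooling:

```shell
# Hypothetical helper: print a colon-separated list with repeated entries
# removed, keeping the first occurrence of each entry in its original position.
dedupe_path() {
    printf '%s\n' "$1" | tr ':' '\n' | awk '!seen[$0]++' | paste -sd: -
}

# Small demonstration on a sample string (not the real PATH).
dedupe_path "/usr/bin:/bin:/usr/bin"   # prints /usr/bin:/bin
```

To apply it to the current session one could run PATH=$(dedupe_path "$PATH"). This only tidies the variable; nothing in the thread indicates that the duplicates cause the hang.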
Posted by Steve Lionel (Intel)

I think this should work. Let me try some cases on my system and see what I get. What do you use for a "source" command when you start this command session?

Steve
Posted by rudi-gaelzer

Quote:

sblionel wrote:

I think this should work. Let me try some cases on my system and see what I get. What do you use for a "source" command when you start this command session?

Erm, I'm not sure what you mean now... Do you mean, what's the compilation line? That would be:

rudi@proteus|~>ifort -coarray hello_image.f90 -o hello_image

Or do you mean how I set the compiler environment via the compilervars.sh script?
Posted by Steve Lionel (Intel)

Yes - how you use compilervars.sh. I am puzzled that you got duplicate entries in path.

Steve
Posted by Steve Lionel (Intel)

I just tried this on my system and it worked ok. On Linux, it doesn't matter how many other versions of the compiler you have installed since only one is active at any time.

Steve
Posted by rudi-gaelzer

Quote:

sblionel wrote:

Yes - how you use compilervars.sh. I am puzzled that you got duplicate entries in path.

I simply create the file /etc/profile.d/ifort.sh containing the lines:
## Setting up the Intel Composer XE 2013 environment. ##
source /usr/local/intel/bin/compilervars.sh intel64
Posted by rudi-gaelzer

Quote:

sblionel wrote:

I just tried this on my system and it worked ok. On Linux, it doesn't matter how many other versions of the compiler you have installed since only one is active at any time.


I temporarily changed my /etc/profile.d/ifort.sh file to:
## Setting up the Intel Composer XE 2013 environment. ##
#source /usr/local/intel/bin/compilervars.sh intel64
source /usr/local/intel/composer_xe_2011_sp1.11.339/bin/compilervars.sh intel64

and compiled the sample program using the previous version of the compiler.
Got the same problem. Funny now, the previous version used to work with coarrays...

Posted by Steve Lionel (Intel)

I will ask someone who is more familiar with this environment to see if they can offer suggestions. It suggests something is wrong with the way MPI is starting up.

Steve
Posted by Steve Lionel (Intel)

Ok, some suggestions...

First, please show the output of the following commands:

which mpiexec
mpdtrace
printenv LD_LIBRARY_PATH

Next, set the environment variable FOR_COARRAY_DEBUG_STARTUP to 1, run your program, and show me the "Generated MPI command line" that is displayed.

Steve
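The three diagnostic commands requested above can be collected in one step. A minimal sketch, assuming a POSIX shell; the function name collect_mpi_diagnostics is hypothetical, and the guards only avoid errors on machines where the MPI tools are not on PATH:

```shell
# Hypothetical helper that gathers the diagnostics requested in the thread.
collect_mpi_diagnostics() {
    echo "== mpiexec on PATH =="
    command -v mpiexec || echo "mpiexec not found on PATH"

    echo "== mpdtrace =="
    if command -v mpdtrace >/dev/null 2>&1; then
        mpdtrace 2>&1
    else
        echo "mpdtrace not found on PATH"
    fi

    echo "== LD_LIBRARY_PATH =="
    printenv LD_LIBRARY_PATH || echo "(unset)"
}

collect_mpi_diagnostics

# The debug variable named in the thread can be set for a single run, e.g.:
#   FOR_COARRAY_DEBUG_STARTUP=1 ./hello_image
```

The program run is left as a comment because, on the affected system, it hangs until interrupted.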
Posted by rudi-gaelzer

Quote:

sblionel wrote:

Ok, some suggestions...

First, please show the output of the following commands:

which mpiexec
mpdtrace
printenv LD_LIBRARY_PATH

Next, set the environment variable FOR_COARRAY_DEBUG_STARTUP to 1, run your program, and show me the "Generated MPI command line" that is displayed.

Ok, here's the output:
----------------------------------------------------
rudi@proteus|coarray_samples>which mpiexec
/usr/local/intel/composer_xe_2013.0.079/mpirt/bin/intel64/mpiexec
rudi@proteus|coarray_samples>mpdtrace
mpdtrace: cannot connect to local mpd (/tmp/mpd2.console_rudi); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)
rudi@proteus|coarray_samples>printenv LD_LIBRARY_PATH
/usr/local/intel/composer_xe_2013.0.079/compiler/lib/intel64:/opt/intel/mic/coi/host-linux-release/lib:/opt/intel/mic/myo/lib:/usr/local/intel/composer_xe_2013.0.079/mpirt/lib/intel64:/usr/local/intel/composer_xe_2013.0.079/compiler/lib/intel64:/usr/local/intel/composer_xe_2013.0.079/mkl/lib/intel64
rudi@proteus|coarray_samples>FOR_COARRAY_DEBUG_STARTUP=1
rudi@proteus|coarray_samples>export FOR_COARRAY_DEBUG_STARTUP
rudi@proteus|coarray_samples>printenv FOR_COARRAY_DEBUG_STARTUP
1
rudi@proteus|coarray_samples>./hello_image
Generated MPI command line is 'mpd --daemon >/dev/null && mpiexec -genv I_MPI_DEVICE shm -genv I_MPI_FALLBACK disable -n 2 ./hello_image ; mpdallexit'.
^Cmpiexec_proteus.universe.net (mpiexec 1162): failed to obtain sock from the process manager (mpdman daemon). Please examine the /tmp/mpd2.logfile_rudi log file on each node of the ring.
Posted by Steve Lionel (Intel)

What happens if you write a simple "Hello World" program (no coarray stuff and not compiled with -coarray) and do this:

mpd --daemon
mpiexec -n 4 hello

Steve
Posted by rudi-gaelzer

Quote:

sblionel wrote:

What happens if you write a simple "Hello World" program (no coarray stuff and not compiled with -coarray) and do this:

mpd --daemon
mpiexec -n 4 hello

The program still hangs, even with a simple program without coarrays, compiled with standard options. The program is called hello_image_si.f90. I had to interrupt the run with ctrl-c. Here's the output:
---------------------------------------------------------------------------------
rudi@proteus|coarray_samples>ifort hello_image_si.f90 -o hello_image_si
rudi@proteus|coarray_samples>mpd --daemon
rudi@proteus|coarray_samples>mpiexec –n 4 hello_image_si
^Cmpiexec_proteus.universe.net (mpiexec 1162): failed to obtain sock from the process manager (mpdman daemon). Please examine the /tmp/mpd2.logfile_rudi log file on each node of the ring.
Posted by Steve Lionel (Intel)

What does the referenced log file say? Seems there is a general problem getting MPI off the ground on your system. Is it already running some sort of MPI daemon?

Steve
Posted by Steve Lionel (Intel)

I suggest you reboot your system and see if the problem goes away. It looks as if there's a stuck mpd on your system.

Steve
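Before rebooting, it may be worth checking whether a leftover mpd process or its files in /tmp are still present. A minimal sketch, assuming a Linux system with pgrep; the function name list_stale_mpd is hypothetical, and the /tmp/mpd2.* pattern is taken from the file names shown in the error messages in this thread:

```shell
# Hypothetical helper: show mpd-related processes owned by the current user
# and any mpd2.* files left in /tmp.
list_stale_mpd() {
    echo "== processes matching 'mpd' =="
    # -f matches against the full command line, so unrelated programs whose
    # command line contains "mpd" may also be listed.
    pgrep -u "$(id -u)" -fl mpd 2>/dev/null || echo "no matching processes found"

    echo "== /tmp/mpd2.* files =="
    ls -l /tmp/mpd2.* 2>/dev/null || echo "no /tmp/mpd2.* files found"
}

list_stale_mpd
```

If a stale daemon does show up, the mpdallexit command that appears in the generated MPI command line earlier in the thread is the tool's own way of shutting it down; whether that clears this particular hang is not confirmed here.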
