Coarrays: problems with sample program.

My system: Dell Vostro 1500 w/ Intel Core 2 Duo T7300 @ 2.0GHz
OS: Fedora 17 w/ kernel 3.5.3-1.fc17.x86_64
Compiler (ifort -V): Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 13.0.0.079 Build 20120731

Following the tutorial for coarrays, I compiled the hello_image.f90 file:

program hello_image

  write(*,*) "Hello from image ", this_image(), &
              "out of ", num_images()," total images"

end program hello_image

with the command: ifort -coarray hello_image.f90 -o hello_image

I had no trouble with compilation, but when I ran the code with ./hello_image, the program hung.

Interrupting the processing with CTRL-C, I got the following error message:

mpiexec_proteus.universe.net (mpiexec 1162): failed to obtain sock from the process manager (mpdman daemon). Please examine the /tmp/mpd2.logfile_rudi log file on each node of the ring.

The log file contains merely:

logfile for mpd with pid 2011
proteus.universe.net_mpdman_1: connection error in connect_lhs call: Connection timed out

I've just installed version 13.  Did I leave something out?  The sample was working with the previous version.


Do you have a separate MPI implementation installed? What is the definition of your path environment variable?

Steve - Intel Developer Support

Quote:

Steve Lionel wrote:

Do you have a separate MPI implementation installed? What is the definition of your path environment variable?

Thanks for replying. AFAIK, the only implementation of MPI that I have installed is the one that comes with the intel fortran composer XE 2013 bundle. OTOH, the installer maintained the previous installation of XE 2011 (sp1.11.339). PATH's variable value on my system is:

rudi@proteus|~>printenv PATH
/usr/local/intel/composer_xe_2013.0.079/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/mpirt/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/bin/intel64_mic:/usr/local/intel/composer_xe_2013.0.079/debugger/gui/intel64:/usr/lib64/qt-3.3/bin:/usr/local/intel/composer_xe_2013.0.079/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/mpirt/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/bin/intel64:/usr/local/intel/composer_xe_2013.0.079/bin/intel64_mic:/usr/local/intel/composer_xe_2013.0.079/debugger/gui/intel64:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/sbin:/usr/sbin:/home/rudi/bin:/home/rudi/lib/bin

As you can see, each path relevant to the compiler is somehow duplicated within the variable value. Don't know how that happened, but no path points to the previous (2011) version. Could that be the problem?
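
Spotting the repeated entries in a PATH that long is easier with a small helper. The following is only a sketch: the function names path_duplicates and path_dedup are made up for illustration and are not part of any Intel tool.

```shell
#!/bin/sh
# Print every entry of a colon-separated list that occurs more than once.
path_duplicates() {
    printf '%s\n' "$1" | tr ':' '\n' | sort | uniq -d
}

# Rebuild a colon-separated list, keeping only the first occurrence
# of each entry and preserving the original order.
path_dedup() {
    printf '%s\n' "$1" | tr ':' '\n' | awk '!seen[$0]++' | paste -sd: -
}

# Show which entries of the current PATH are repeated.
path_duplicates "$PATH"
```

If the list comes back non-empty, PATH=$(path_dedup "$PATH") would clean up the current session. Duplicate entries are normally harmless by themselves, since the first match wins.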

I think this should work. Let me try some cases on my system and see what I get. What do you use for a "source" command when you start this command session?

Steve - Intel Developer Support

Quote:

Steve Lionel wrote:

I think this should work. Let me try some cases on my system and see what I get. What do you use for a "source" command when you start this command session?

Erm, I'm not sure what you mean now... Do you mean, what's the compilation line? That would be:

rudi@proteus|~>ifort -coarray hello_image.f90 -o hello_image

Or do you mean, how I set the compiler environment via the compilervars.sh script?

Yes - how you use compilervars.sh. I am puzzled that you got duplicate entries in path.

Steve - Intel Developer Support

I just tried this on my system and it worked ok. On Linux, it doesn't matter how many other versions of the compiler you have installed since only one is active at any time.

Steve - Intel Developer Support

Quote:

Steve Lionel wrote:

Yes - how you use compilervars.sh. I am puzzled that you got duplicate entries in path.

I simply create the file: /etc/profile.d/ifort.sh containing the line:

## Setting up the Intel Composer XE 2013 environment.
##
source /usr/local/intel/bin/compilervars.sh intel64

Quote:

Steve Lionel wrote:

I just tried this on my system and it worked ok. On Linux, it doesn't matter how many other versions of the compiler you have installed since only one is active at any time.

I temporarily changed my /etc/profile.d/ifort.sh file to:

## Setting up the Intel Composer XE 2013 environment.
##
#source /usr/local/intel/bin/compilervars.sh intel64
source /usr/local/intel/composer_xe_2011_sp1.11.339/bin/compilervars.sh intel64

and compiled the sample program using the previous version of the compiler. Got the same problem. Funny now, the previous version used to work with coarrays...
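
One plausible explanation for the duplicated PATH entries (an assumption, not something confirmed in this thread) is that the file in /etc/profile.d gets read more than once per session, for example once by the login shell and again by a child shell. A guard like the one below would make the setup idempotent. The function name load_ifort_env and the variable IFORT_ENV_LOADED are hypothetical and are not defined by the Intel scripts.

```shell
# Source a compiler environment script at most once per session.
# $1: path to the script, $2: architecture argument (e.g. intel64).
load_ifort_env() {
    script=$1
    arch=$2
    # IFORT_ENV_LOADED is a made-up marker; it is exported so that
    # child shells inherit it and skip the setup.
    if [ -n "${IFORT_ENV_LOADED:-}" ]; then
        return 0
    fi
    if [ -r "$script" ]; then
        # Passing an argument to "." works in bash, which is what
        # reads /etc/profile.d on a typical Fedora system.
        . "$script" "$arch"
        IFORT_ENV_LOADED=1
        export IFORT_ENV_LOADED
    fi
}

load_ifort_env /usr/local/intel/bin/compilervars.sh intel64
```

This only addresses the duplicated entries; it would not be expected to change the mpd behaviour.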

I will ask someone who is more familiar with this environment to see if they can offer suggestions. It suggests something is wrong with the way MPI is starting up.

Steve - Intel Developer Support

Ok, some suggestions...

First, please show the output of the following commands:

which mpiexec
mpdtrace
printenv LD_LIBRARY_PATH

Next, set the environment variable FOR_COARRAY_DEBUG_STARTUP to 1, run your program, and show me the "Generated MPI command line" that is displayed.

Steve - Intel Developer Support
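
The three commands requested above can be gathered into one report so that the output is easy to paste into a reply. This is a minimal sketch: the function name report is arbitrary, and the fallback messages are there only so the script keeps going when a tool is missing from PATH.

```shell
#!/bin/sh
# Collect the requested MPI diagnostics in one go.
report() {
    echo "== which mpiexec =="
    command -v mpiexec || echo "mpiexec not found on PATH"

    echo "== mpdtrace =="
    if command -v mpdtrace >/dev/null 2>&1; then
        # mpdtrace writes its complaint to stderr when no mpd is reachable
        mpdtrace 2>&1
    else
        echo "mpdtrace not found on PATH"
    fi

    echo "== LD_LIBRARY_PATH =="
    printenv LD_LIBRARY_PATH || echo "(not set)"
}

report
```

For the second step, the variable can also be set for a single run without exporting it: FOR_COARRAY_DEBUG_STARTUP=1 ./hello_image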

Quote:

Steve Lionel wrote:

Ok, some suggestions...

First, please show the output of the following commands:

which mpiexec
mpdtrace
printenv LD_LIBRARY_PATH

Next, set the environment variable FOR_COARRAY_DEBUG_STARTUP to 1, run your program, and show me the "Generated MPI command line" that is displayed.

Ok, here's the output:

rudi@proteus|coarray_samples>which mpiexec
/usr/local/intel/composer_xe_2013.0.079/mpirt/bin/intel64/mpiexec
rudi@proteus|coarray_samples>mpdtrace
mpdtrace: cannot connect to local mpd (/tmp/mpd2.console_rudi); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)
rudi@proteus|coarray_samples>printenv LD_LIBRARY_PATH
/usr/local/intel/composer_xe_2013.0.079/compiler/lib/intel64:/opt/intel/mic/coi/host-linux-release/lib:/opt/intel/mic/myo/lib:/usr/local/intel/composer_xe_2013.0.079/mpirt/lib/intel64:/usr/local/intel/composer_xe_2013.0.079/compiler/lib/intel64:/usr/local/intel/composer_xe_2013.0.079/mkl/lib/intel64
rudi@proteus|coarray_samples>FOR_COARRAY_DEBUG_STARTUP=1
rudi@proteus|coarray_samples>export FOR_COARRAY_DEBUG_STARTUP
rudi@proteus|coarray_samples>printenv FOR_COARRAY_DEBUG_STARTUP
1
rudi@proteus|coarray_samples>./hello_image
Generated MPI command line is 'mpd --daemon >/dev/null && mpiexec -genv I_MPI_DEVICE shm -genv I_MPI_FALLBACK disable -n 2 ./hello_image ; mpdallexit'.
^Cmpiexec_proteus.universe.net (mpiexec 1162): failed to obtain sock from the process manager (mpdman daemon). Please examine the /tmp/mpd2.logfile_rudi log file on each node of the ring.

What happens if you write a simple "Hello World" program (no coarray stuff and not compiled with -coarray) and do this:

mpd --daemon
mpiexec -n 4 hello

Steve - Intel Developer Support

Quote:

Steve Lionel wrote:

What happens if you write a simple "Hello World" program (no coarray stuff and not compiled with -coarray) and do this:

mpd --daemon
mpiexec -n 4 hello

The program still hangs, even with a simple program without coarrays, compiled with standard options. The program is called hello_image_si.f90. I had to interrupt the run with ctrl-c. Here's the output:

rudi@proteus|coarray_samples>ifort hello_image_si.f90 -o hello_image_si
rudi@proteus|coarray_samples>mpd --daemon
rudi@proteus|coarray_samples>mpiexec –n 4 hello_image_si
^Cmpiexec_proteus.universe.net (mpiexec 1162): failed to obtain sock from the process manager (mpdman daemon). Please examine the /tmp/mpd2.logfile_rudi log file on each node of the ring.

What does the referenced log file say? Seems there is a general problem getting MPI off the ground on your system. Is it already running some sort of MPI daemon?

Steve - Intel Developer Support

I suggest you reboot your system and see if the problem goes away. It looks as if there's a stuck mpd on your system.

Steve - Intel Developer Support
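
Before rebooting, it may be worth checking whether an mpd process or its files in /tmp are in fact left over. The sketch below assumes a procps-style ps. The function name find_mpd is made up, and the pattern is a guess at how mpd and mpdman show up in the process list (for instance as mpd, mpd.py or mpdman).

```shell
#!/bin/sh
# Read "PID COMMAND..." lines on stdin and print those whose command
# line mentions mpd, mpd.py, mpdman or mpdman.py as a separate word.
find_mpd() {
    grep -E '(^|[ /])mpd(man)?(\.py)?( |$)' || true
}

# Processes owned by the current user that look like mpd daemons.
ps -u "$(id -un)" -o pid= -o args= | find_mpd

# Console and log files of the kind named in the error messages above.
ls -l /tmp/mpd2.* 2>/dev/null || echo "no /tmp/mpd2.* files"
```

If something turns up, mpdallexit (the command that appears at the end of the generated MPI command line) is the usual way to shut the ring down; killing the listed PID by hand is a fallback.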
