MPICH_PORT_RANGE for Intel MPI?

Hi all:

Our software product now uses the Intel MPI library for parallel computing. Before that, we used MPICH2.

On Windows, you need to add mpiexec, smpd, and the MPI apps (the applications that call MPI functions) to the firewall exception list to make parallel computing work properly. With MPICH2, the following command lets the MPI apps use ports 50000~51000:

mpiexec -env MPICH_PORT_RANGE 50000:51000 MPIApp.exe

Therefore, you can just open ports 50000:51000 in the firewall instead of creating lots of exception items in the list (especially if there are lots of MPI apps in your software product).

My question is: is there an MPICH_PORT_RANGE equivalent parameter for Intel MPI?

I have run Intel MPI with [ -genv MPICH_PORT_RANGE ] on the following simple code, and MPI_Barrier() never returns.

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int cpuid = 0;
    int ncpu = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &cpuid);
    MPI_Comm_size(MPI_COMM_WORLD, &ncpu);

    printf("Before barrier\n");
    fflush(stdout);
    MPI_Barrier(MPI_COMM_WORLD);
    printf("After barrier\n");
    fflush(stdout);

    MPI_Finalize();
    return 0;
}

Thanks very much!
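One quick way to check whether a -genv/-env value actually reaches the remote ranks is to print it from inside the program. The following is only a minimal diagnostic sketch along those lines (not part of the original post); it simply reads MPICH_PORT_RANGE from each rank's environment:

/* Diagnostic sketch: print MPICH_PORT_RANGE as seen by each rank,
 * to confirm that a -genv/-env value really reaches the remote
 * processes before worrying about which ports get used. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank = 0;
    const char *range;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    range = getenv("MPICH_PORT_RANGE");
    printf("rank %d: MPICH_PORT_RANGE=%s\n", rank, range ? range : "(not set)");
    fflush(stdout);

    MPI_Finalize();
    return 0;
}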

Hi Seifer,

Unfortunately you cannot use MPICH_PORT_RANGE.
The firewall needs to let through the socket traffic from both
smpd.exe AND the program itself.
But you can limit the port range used by smpd:

c:\> smpd stop
c:\> set SMPD_PORT_RANGE=50000:51000
c:\> smpd

smpd will then use ports from that range (50000-51000), and I hope this will solve your problem.

Regards!
Dmitry

Hi Dmitry:

Thanks for your help. But after doing those steps, smpd still uses random ports (shown by TcpView on Windows).

By the way, we also bought Intel MPI for Linux. I have two machines installed with CentOS 5.5 32-bit, and I did the following for testing.

(1) Added "-A INPUT -p tcp -m tcp --dport 10000:11000 -j ACCEPT" to /etc/sysconfig/iptables on both machines.

(2) Added MPD_PORT_RANGE=10000:11000 to ~/.mpd.conf

(3) Executed: mpdboot.py -n 2 -f ~/mpdhost.txt -r rsh
    The contents of mpdhost.txt:
    192.168.120.162
    192.168.120.163
    mpdboot seems OK. From netstat, I get:
    tcp 0 0 0.0.0.0:10001         0.0.0.0:*             LISTEN      3505/python
    tcp 0 0 0.0.0.0:10002         0.0.0.0:*             LISTEN      3505/python
    tcp 0 0 0.0.0.0:10000         0.0.0.0:*             LISTEN      6973/python
    tcp 0 0 192.168.120.162:10000 192.168.120.163:42694 ESTABLISHED 6973/python
    tcp 0 0 192.168.120.162:10000 192.168.120.163:42695 ESTABLISHED 6973/python
    After executing mpdtrace.py, I get:
    192.168.120.162
    192.168.120.163

(4) Executed the MPIApp.out: mpiexec.py -machinefile machine.txt -n 2 MPIApp.out
    The contents of machine.txt:
    192.168.120.162
    192.168.120.163
    And I get:
    Assertion failed in file ../../socksm.c at line 2577: (it_plfd->revents & 0x008) == 0
    internal ABORT - process 0
    rank 0 in job 1 192.168.120.162_10000 caused collective abort of all ranks
    exit status of rank 0: return code 1

(5) Did step (4) again after stopping iptables: the MPIApp.out runs just fine...

For Intel MPI (Windows), we may tell our customers to just put mpiexec, smpd, and the MPI apps into the Windows Firewall exception list. For Intel MPI (Linux), is there any way to set the port range used by the MPI apps?

Thanks very much.

Regards,
Seifer
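One way to take MPI out of the picture when debugging the iptables rule is to bind a plain TCP listener to a port inside the opened range on one node and try to reach it from the other node (for example with telnet 192.168.120.162 10500). A minimal POSIX sketch follows; port 10500 is just an arbitrary choice inside the 10000:11000 range from the post:

/* Minimal TCP listener for testing the firewall independently of MPI.
 * Run it on one node, then "telnet <that node> 10500" from the other;
 * if the connection is refused or times out while iptables is on, the
 * firewall rule (not MPI) is the problem. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int srv, cli;
    struct sockaddr_in addr;

    srv = socket(AF_INET, SOCK_STREAM, 0);
    if (srv < 0) { perror("socket"); return 1; }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(10500);           /* any port in 10000:11000 */

    if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }
    if (listen(srv, 1) < 0) { perror("listen"); return 1; }

    printf("listening on port 10500, waiting for one connection...\n");
    cli = accept(srv, NULL, NULL);
    if (cli < 0) { perror("accept"); return 1; }

    printf("connection accepted - the firewall lets this port through\n");
    close(cli);
    close(srv);
    return 0;
}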

Hi Dmitry:
I did step (4) of my last post again with a new command line that includes the MPICH_PORT_RANGE parameter:

mpiexec.py -genv MPICH_PORT_RANGE 10000:11000 -machinefile machine.txt -n 2 MPIApp.out

And everything works fine. ^^

Therefore:
Intel MPI for Windows --> can't use MPICH_PORT_RANGE
Intel MPI for Linux --> MPICH_PORT_RANGE works fine

It would be appreciated if Intel MPI for Windows provided a way to limit the port range of the MPI apps.

regards,
Seifer
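If you want to double-check which local ports the library really binds, rather than trusting the environment variable, one rough Linux-only approach is to walk the process's low file descriptors after the barrier and ask each one for its local address with getsockname(). A sketch of that idea (not something from this thread):

/* Rough Linux-only sketch: after MPI_Init and a barrier, walk the low
 * file descriptors and print the local port of every IPv4 socket the
 * process holds, so you can see whether they fall inside
 * MPICH_PORT_RANGE.  getsockname() simply fails on fds that are not
 * sockets, so those are skipped. */
#include <mpi.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    int rank = 0, fd;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Barrier(MPI_COMM_WORLD);   /* force the tcp connections to exist */

    for (fd = 0; fd < 256; fd++) {
        struct sockaddr_in sa;
        socklen_t len = sizeof(sa);

        memset(&sa, 0, sizeof(sa));
        if (getsockname(fd, (struct sockaddr *)&sa, &len) == 0 &&
            sa.sin_family == AF_INET) {
            printf("rank %d: fd %d bound to local port %d\n",
                   rank, fd, ntohs(sa.sin_port));
        }
    }
    fflush(stdout);

    MPI_Finalize();
    return 0;
}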

Hi Seifer,

Thank you for the update. I was just trying to reproduce your problem...

Yeah, on Linux MPICH_PORT_RANGE IS supported. And it should be supported on Windows as well, but there is either an error or a misunderstanding inside the library, and the application doesn't work.

Pay attention that there are socket connections for the mpd (smpd) daemons and for the internal net module (tcp communication between [s]mpd and the application). They use different environment variables: [S]MPD_PORT_RANGE and MPICH_PORT_RANGE.

On Windows: could you please try to add '-genv I_MPI_PLATFORM 0' and check your test case with the barrier. If it doesn't work, also add '-genv I_MPI_DEBUG 500' and check one more time. I'm not sure that the tcp connection will use ports from the range, but at least it doesn't hang in my experiments.

Regards!
Dmitry

Hi Dmitry:

I set the Windows firewall as follows on both nodes (192.168.120.36 & 192.168.120.11):
(1) smpd.exe is added to the firewall exception list.
(2) mpiexec.exe is added to the firewall exception list.
(3) TCP ports 10000~11000 are opened in the firewall.

[Test1]: One MPI process on each node, with the following command:
mpiexec.exe -genv MPICH_PORT_RANGE 10000:11000 -hosts 2 192.168.120.36 1 192.168.120.11 1 \\192.168.120.36\share\test_intel_mpi.exe
Before barrier
Before barrier
After barrier
After barrier

[Test2]: Two MPI processes on each node, with the following command:
mpiexec.exe -genv MPICH_PORT_RANGE 10000:11000 -hosts 2 192.168.120.36 2 192.168.120.11 2 \\192.168.120.36\share\test_intel_mpi.exe
And no printf output is shown... even "Before barrier" is NOT shown. :( (I've used fflush after printf.)

[Test3]: Same as Test2, with more debug options:
mpiexec.exe -genv I_MPI_PLATFORM 0 -genv I_MPI_DEBUG 500 -genv MPICH_PORT_RANGE 10000:11000 -hosts 2 192.168.120.36 2 192.168.120.11 2 \\192.168.120.36\share\test_intel_mpi.exe
[0] MPI startup(): Intel MPI Library, Version 4.0 Update 1 Build 20100910
[0] MPI startup(): Copyright (C) 2003-2010 Intel Corporation. All rights reserved.
[0] MPI startup(): fabric dapl failed: will try use tcp fabric
[1] MPI startup(): fabric dapl failed: will try use tcp fabric
[2] MPI startup(): fabric dapl failed: will try use tcp fabric
[3] MPI startup(): fabric dapl failed: will try use tcp fabric
[1] MPI startup(): shm and tcp data transfer modes
[0] MPI startup(): shm and tcp data transfer modes
[3] MPI startup(): shm and tcp data transfer modes
[2] MPI startup(): shm and tcp data transfer modes
And no printf output from the MPI app is shown... even "Before barrier" is NOT shown. :(

regards,
Seifer

Hi Seifer,

Does it work with I_MPI_PLATFORM but without I_MPI_DEBUG?

As I wrote before, there are 2 different programs: smpd and mpiexec. You need to set both SMPD_PORT_RANGE (and restart the smpd service) and MPICH_PORT_RANGE. But I'm not sure that MPICH_PORT_RANGE works properly - you can check the ports with TcpView.

The Windows firewall is constantly a headache. We recommend turning it off.
From my point of view, the firewall should be set up and configured on a dedicated computer for external connections, and the internal network should be behind the firewall without restrictions.

BTW: by default smpd listens on port 8678. This number can be changed with the '-port' option.

Regards!
Dmitry

Hi Dmitry:

I have some problems with Intel MPI for Linux. I have 2 machines, 192.168.120.162 (node1) and 192.168.120.163 (node2); both of them open the port range 10000:11000 via the iptables settings, and MPD_PORT_RANGE=10000:11000 is set in ~/.mpd.conf.

I run the following command on 192.168.120.162:
mpdboot.py -n 2 -f ~/mpdhost.txt
The contents of ~/mpdhost.txt:
192.168.120.162
192.168.120.163
mpdboot.py completes successfully.

Now I created a file named ~/machinefile.txt, which contains the following lines:
192.168.120.162
192.168.120.163
and ran the following command on 192.168.120.162 (node1):
mpiexec.py -l -machinefile ~/machinefile.txt -genv MPICH_PORT_RANGE 10000:11000 -n 2 hostname
Everything is OK. The output is:
0: node1
1: node2

Now I modified ~/machinefile.txt to the following (I exchanged the sequence of my nodes):
192.168.120.163
192.168.120.162
Then I ran the same command again on 192.168.120.162 (node1):
mpiexec.py -l -machinefile ~/machinefile.txt -genv MPICH_PORT_RANGE 10000:11000 -n 2 hostname
Then it hangs........ But if iptables is turned off, it never hangs.

My experience is that the node that launches mpiexec.py must be the first node in machinefile.txt if iptables is turned on...
For example: if mpiexec.py is launched on node1, then the contents of machinefile.txt must be
node1
node2
and if mpiexec.py is launched on node2, then the contents of machinefile.txt must be
node2
node1
Otherwise it will just hang... even if the app is only a "hostname"!

Therefore, it seems that for the hanging cases, mpd uses a random port even though MPD_PORT_RANGE is set in ~/.mpd.conf. Is there any workaround? Thanks very much!
(BTW, I still have had no time to try your suggestions for the Windows smpd.)

best regards,
Seifer
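Since the hang seems to depend on which node has to open the connection, it may help to test reachability in both directions without mpd at all: run a simple listener (like the earlier sketch, or nc -l) on a port in the 10000:11000 range on one node, try a plain TCP connect from the other node, and then swap the roles. A small connect-side sketch follows; the address and port on the command line are just whatever you want to test, e.g. 192.168.120.162 10500:

/* Minimal TCP connect test.  Usage (example values):
 *     ./conntest 192.168.120.162 10500
 * Run it from each node towards the other to check that the iptables
 * rules allow connections in BOTH directions. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int fd;
    struct sockaddr_in peer;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <ip> <port>\n", argv[0]);
        return 1;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons((unsigned short)atoi(argv[2]));
    if (inet_pton(AF_INET, argv[1], &peer.sin_addr) != 1) {
        fprintf(stderr, "bad address: %s\n", argv[1]);
        return 1;
    }

    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        perror("connect");   /* refused or timeout: firewall or no listener */
        return 1;
    }

    printf("connected to %s:%s - this direction is open\n", argv[1], argv[2]);
    close(fd);
    return 0;
}

If the connect succeeds in one direction but not the other, the iptables rules on the two nodes are probably not symmetric, which would match the machinefile-ordering behaviour described above.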
