Intel MPI and Active Directory

Hi,

I tried to run HelloWorld with "delegate" selected as the authentication method. I followed the "Intel® MPI Library for Windows OS Reference Manual" to set up Active Directory (enabled delegation for the cluster nodes and users, and registered a Service Principal Name for the cluster nodes).

Details of the cluster are as follows:

headnode: Windows Server 2008 R2 64bit (192.168.120.105)
computenode1: Windows XP SP2 64bit (192.168.120.201)
computenode2: Windows XP SP2 64bit (192.168.120.202)

I ran the following tests using the administrator account:

test 1: hostname
test 2: HelloWorld_IntelMPI.exe (local path)
test 3: HelloWorld_IntelMPI.exe (UNC path)

________________________________________________________________________________
test 1:

mpiexec.exe -delegate -hosts 2 192.168.120.201 192.168.120.202 hostname

computenode1
computenode2

________________________________________________________________________________
test 2:

mpiexec.exe -delegate -hosts 2 192.168.120.201 192.168.120.202 C:\test_delegate\HelloWorld_IntelMPI.exe

Hello by 0 of 2 processer! My hostname: computenode1, MyPID=2668
Hello by 1 of 2 processer! My hostname: computenode2, MyPID=2496

________________________________________________________________________________
test 3:

mpiexec.exe -delegate -hosts 2 192.168.120.201 192.168.120.202 \\192.168.120.105\test_delegate\HelloWorld_IntelMPI.exe

launch failed: CreateProcess(\\192.168.120.105\test_delegate\HelloWorld_IntelMPI.exe) on 'computenode1.test.com' failed, error 5 - Access is denied.
launch failed: CreateProcess(\\192.168.120.105\test_delegate\HelloWorld_IntelMPI.exe) on 'computenode2.test.com' failed, error 5 - Access is denied.

It seems that the account used for delegation is unable to use CreateProcess for an executable with a UNC path?

Regards,
Seifer

Hi Seifer,

What happens if you just run

\\192.168.120.105\test_delegate\HelloWorld_IntelMPI.exe

What about

mpiexec -delegate -n 1 \\192.168.120.105\test_delegate\HelloWorld_IntelMPI.exe

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi Seifer,

I am now able to reproduce this error. I'm putting in a defect report at this time.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi James,

Will this problem be resolved soon in an updated version of Intel MPI 4.0? We have customers that use a Windows domain/Active Directory to manage all accounts. Creating new local accounts on each compute node for parallel computing is not acceptable to them. We hope the problem will be resolved very soon.

Thank you.

Regards,
Seifer

Hi Seifer,

I do not have any information about when this will be corrected. Have you tried using the -map option?

mpiexec.exe -delegate -map z:\\192.168.120.105\test_delegate -hosts 2 192.168.120.201 192.168.120.202 z:\HelloWorld_IntelMPI.exe

This option creates a temporary drive mapping to the share on each of the nodes, runs the job, and disconnects the mapping when the job is completed.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
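
As a rough illustration of how the -map form of the command relates to the original UNC path, here is a minimal Python sketch. The helper names (map_unc_to_drive, build_mpiexec_command) are hypothetical and are not part of Intel MPI; the sketch uses only the mpiexec options already shown in this thread (-delegate, -map, -hosts). It assembles the argument list and does not launch anything.

```python
from pathlib import PureWindowsPath


def map_unc_to_drive(unc_path, drive_letter="Z"):
    """Split a UNC program path into a -map argument and a drive-based path.

    Hypothetical helper for illustration only.
    """
    path = PureWindowsPath(unc_path)
    share = path.drive  # for a UNC path this is \\host\share
    if not share.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {unc_path}")
    # Everything below the share root, e.g. HelloWorld_IntelMPI.exe
    below_share = path.relative_to(path.anchor)
    map_arg = f"{drive_letter}:{share}"
    program = str(PureWindowsPath(f"{drive_letter}:\\") / below_share)
    return map_arg, program


def build_mpiexec_command(unc_program, hosts, drive_letter="Z"):
    """Assemble the argument list in the same shape as the command above."""
    map_arg, program = map_unc_to_drive(unc_program, drive_letter)
    return [
        "mpiexec.exe", "-delegate",
        "-map", map_arg,
        "-hosts", str(len(hosts)), *hosts,
        program,
    ]
```

Calling build_mpiexec_command with the share path from this thread and the two compute node addresses yields the same arguments as the command suggested above, with the program path rewritten to start at the mapped drive.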

Hi James,

I ran HelloWorld_IntelMPI.exe with mpiexec.exe:

mpiexec.exe -delegate -map Z:\\192.168.120.105\test_delegate -hosts 2 192.168.120.201 192.168.120.202 Z:\\HelloWorld_IntelMPI.exe

launch failed: CreateProcess(Z:\\HelloWorld_IntelMPI.exe) on 'ComputeNode1.test.com' failed, error 3 - The system cannot find the path specified.
launch failed: CreateProcess(Z:\\HelloWorld_IntelMPI.exe) on 'computenode2.test.com' failed, error 3 - The system cannot find the path specified.

I also ran hostname with mpiexec.exe:

mpiexec.exe -delegate -map Z:\\192.168.120.105\test_delegate -hosts 2 192.168.120.201 192.168.120.202 hostname

*********** Warning ************
Access to the network resource (\\192.168.120.105\test_delegate) was denied.
*********** Warning ************
ComputeNode1
*********** Warning ************
Access to the network resource (\\192.168.120.105\test_delegate) was denied.
*********** Warning ************
computenode2

Do you have any suggestions about this issue? Thank you.

Regards,
Seifer

Hi Seifer,

Are you able to map the share to a drive manually?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi Seifer,

Do you have your Active Directory* set up for delegation? If so, was this done manually or from the Intel MPI Library installer? What type of account (AD user, local user, local admin, etc.) are you using for the job?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Dear Dr. Tullos,

I have almost the same problem launching a program on a cluster. I have Windows 7 64bit installed on 2 nodes. I used VS2010 + Intel(R) Visual Fortran Compiler XE 12.1.0.233 [Intel(R) 64] + the bundled MKL package to compile my project (Fortran calling a C++ dynamic link library built with VS2010). The Intel MPI version I used is 4.1.

The program works well when running

mpiexec -wdir \\n01\debug\ -n 6 \\n01\debug\test

However, the following error is displayed when running

mpiexec -wdir \\n01\debug\ -hosts 2 n01 6 n02 6 \\n01\debug\test

launch failed: CreateProcess(\\n01\debug\test) on 'N02' failed, error 2 - The system cannot find the file specified.

Could you please help me take a look at it?

Thanks,
Zhanghong Tang

Hi Zhanghong,

Dr. Tullos? Thanks, but I'm not there (yet).

Try running

mpiexec -wdir \\n01\debug\ -n 6 \\n01\debug\test

while you are logged in on n02.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
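
A quick way to see whether a given node can reach the executable at all is to try opening it from that node. The following is a minimal Python sketch under stated assumptions: the function name probe_executable is hypothetical, and the check runs with the credentials of whoever is logged in, which may differ from the credentials mpiexec uses for the launch. A "readable" result therefore does not prove that the MPI launch will succeed, but "not found" or "access denied" narrows the problem down. On Windows, Python normally reports system errors 2 and 3 as FileNotFoundError and error 5 as PermissionError.

```python
def probe_executable(path):
    """Try to open a file for reading and report what happened.

    Hypothetical diagnostic helper; it only reads one byte and never
    executes the file.
    """
    try:
        with open(path, "rb") as handle:
            handle.read(1)
    except FileNotFoundError:
        # Typically corresponds to Windows errors 2 and 3
        return "not found"
    except PermissionError:
        # Typically corresponds to Windows error 5
        return "access denied"
    except OSError as exc:
        return f"other error: {exc}"
    return "readable"
```

Run on n02, a call such as probe_executable(r"\\n01\debug\test") would show whether the share is visible from that node for the current user. Note that the path in this thread has no file extension, so the name passed in must match the file as it actually exists on the share.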

Dear James,

Sorry for reviving such an old thread. I have a cluster with 10 calculation nodes. The names are N01 to N10 and the IP addresses are 10.0.0.1 to 10.0.0.10. Previously, every calculation node had a Windows 7 64bit + Intel MPI 4.1.3.047 environment. I had mapped drive Z: (\\n02\debug) on every node, and the working folder is on Z:. I used the following command to run the program:

mpiexec -wdir "z:\test" -mapall -hosts 10 n01 2 n02 2 n03 2 n04 2 n05 2 n06 2 n07 2 n08 2 n09 2 n10 2 Z:\test\fem

and the program runs OK.

Recently I needed to create a domain on this cluster and add all calculation nodes to it. As a test, I formatted the hard disk of one node and installed Windows Server 2012 on it, and then installed Microsoft HPC Pack 2012 on every node. I also successfully created a domain 'bjut.edu' and added all nodes to it. On every node, I created a domain user 'tang' in the Administrators group (there is also a local user 'tang' in the Administrators group on every node), logged in as that user, and ran the following commands:

mpiexec -remove

mpiexec -register

I registered every node with the domain user name 'tang'.

Now when I run the program with the same command again, the following error is displayed:

launch failed: CreateProcess(\\n02\Debug\fem) on 'N02.bjut.edu' failed, error 2 - The system cannot find the file specified.

Sometimes the following error is displayed instead:

launch failed: CreateProcess(\\n02\Debug\fem) on 'N03.bjut.edu' failed, error 5 - Access is denied.

Could you please help me to take a look at what I missed?

Thanks,

Zhanghong Tang

Dear James,

Sorry for reviving such an old thread. I have a cluster with 10 calculation nodes. The names are N01 to N10 and the IP addresses are 10.0.0.1 to 10.0.0.10. Previously, every calculation node had a Windows 7 64bit + Intel MPI 4.1.3.047 environment. I had mapped drive Z: on every node, and the working folder is on Z:. I used the following command to run the program:

mpiexec -wdir "z:\test" -mapall -hosts 10 n01 2 n02 2 n03 2 n04 2 n05 2 n06 2 n07 2 n08 2 n09 2 n10 2 Z:\fem

and the program runs OK.

Recently I needed to create a domain for this cluster, so I formatted the hard disk of node n01 and reinstalled Windows Server 2012 on that node. I created the domain bjut.edu successfully and added all other calculation nodes to it. Later, I installed Microsoft HPC Pack 2012 on every node and added another domain user, tang, to the Administrators group (the original local user, also named tang, is in the Administrators group as well). I ran the following commands to reset the user name used by mpiexec:

mpiexec -remove

mpiexec -register

and used bjut\tang when setting the password. Then I ran the program with the above command again. The following error is displayed (when logged into one node as a local user, for example, \n02\tang ):

launch failed: CreateProcess(\\n02\Debug\fem) on 'N02.bjut.edu' failed, error 2 - The system cannot find the file specified.

Sometimes the following error is displayed instead:

launch failed: CreateProcess(\\n02\Debug\fem) on 'N03.bjut.edu' failed, error 5 - Access is denied.

If I log in as the domain user and run the program, the following errors are displayed:

Error while connecting to host, A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. (10060)
Connect on sock (host=n01, port=8678) failed, exhaused all end points
Unable to connect to 'n01:8678', sock error: Error = -1
 

Furthermore, I also tested the following command (with node n01, which runs Windows Server 2012, removed) after logging into one node as a local user:

mpiexec -wdir "z:\test" -mapall -hosts 9 n02 2 n03 2 n04 2 n05 2 n06 2 n07 2 n08 2 n09 2 n10 2 Z:\fem

The following error messages are displayed:

Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(658)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(104)..................:
MPID_nem_tcp_post_init(344)..........:
MPID_nem_newtcp_module_connpoll(3102):
gen_cnting_fail_handler(1816)........: connect failed - The semaphore timeout period has expired.
 (errno 121)
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(658)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(104)..................:
MPID_nem_tcp_post_init(344)..........:
MPID_nem_newtcp_module_connpoll(3102):
gen_cnting_fail_handler(1816)........: connect failed - The semaphore timeout period has expired.
 (errno 121)

job aborted:
rank: node: exit code[: error message]
0: n02: 1: process 0 exited without calling finalize
1: n02: 1: process 1 exited without calling finalize
2: n03: 123
3: n03: 123
4: n04: 123
5: n04: 123
6: n05: 123
7: n05: 123
8: n06: 123
9: n06: 123
10: n07: 123
11: n07: 123
12: n08: 123
13: n08: 123
14: n09: 123
15: n09: 123
16: n10: 123
17: n10: 123

Thanks,

Zhanghong Tang
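
For long reports like the one above, it can help to group the "job aborted" lines by node before reading them. This is a minimal Python sketch, assuming the line layout given by the report's own header (rank: node: exit code[: error message]). The function name summarize_abort_report is hypothetical, and the sketch makes no claim about what any particular exit code means.

```python
import re
from collections import defaultdict

# One report line: "<rank>: <node>: <exit code>" with an optional ": <message>"
ABORT_LINE = re.compile(r"^(\d+): (\S+): (-?\d+)(?:: (.*))?$")


def summarize_abort_report(report):
    """Group (rank, exit code, message) tuples by node name.

    Lines that do not match the expected layout, such as the header,
    are ignored.
    """
    by_node = defaultdict(list)
    for line in report.splitlines():
        match = ABORT_LINE.match(line.strip())
        if match:
            rank, node, code, message = match.groups()
            by_node[node].append((int(rank), int(code), message))
    return dict(by_node)
```

Applied to the report in this post, the result would show that only the ranks on n02 carry an error message, while the ranks on the other nodes report just an exit code.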

Hi Zhanghong,

I'd recommend checking the firewall status. If the firewall is enabled, try the same scenario with it disabled.
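
Before changing firewall settings, one could check from one node whether a TCP connection to another node is possible at all. Below is a minimal Python sketch. The function names are hypothetical; port 8678 is simply the port that appears in the "Unable to connect to 'n01:8678'" message earlier in this thread. A successful probe only shows that something accepted a connection on that port at that moment. The MPI processes may use additional ports, so this is a first check rather than proof that the job will run.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures
        return False


def probe_hosts(hosts, port, timeout=3.0):
    """Probe several hosts and return a {host: reachable} mapping."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

For example, probe_hosts(["n01", "n02", "n03"], 8678) run on one of the nodes would return a dictionary showing which of those hosts accepted a connection on that port.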

Dear James,

Thank you very much for your kind reply. It works after I disabled the firewall! On the other hand, is it possible to keep the firewall enabled and add the Intel MPI programs to the firewall exceptions?

Thanks,

Zhanghong Tang
