Segmentation Fault when using mpirun -f option


Whenever I use the mpirun -f option to run MPI programs on the device, I get a segmentation fault.

[user@node ~]$ echo mic0 > mic0.hosts
[user@node ~]$ mpirun -perhost 1 -n 2 -f mic0.hosts  ./hello
Segmentation fault

However, I am able to run it fine with the -host option or even with the -f option as long as there are no mic devices in the file.

[user@node ~]$ echo different_node > mic0.hosts
[user@node ~]$ mpirun  -n 2 -f mic0.hosts  ./hello
CPU: Hello from different_node 1 of 2
CPU: Hello from different_node 0 of 2
[user@node ~]$ mpirun -perhost 1 -n 2 -host mic0  ./hello
MIC: Hello from node-mic0 1 of 2
MIC: Hello from node-mic0 0 of 2

I haven't been very successful in debugging this problem. Does anyone have any suggestions?




Hi Dan,

If the file contains only "mic0", then your two commands are equivalent:

% mpirun -perhost 1 -n 2 -f mic0.hosts  ./hello

% mpirun -n 2 -host mic0  ./hello

This assumes that "hello" is the MIC binary and that it has been transferred to /root on mic0.

Could you verify the content of mic0.hosts contains only mic0?

% cat mic0.hosts

Also, what MPI version and compiler version are you using? Thank you. 
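A plain cat will not reveal hidden characters such as a carriage return or a trailing space. The following is a minimal sketch of a stricter check, assuming a POSIX shell with od and grep available. The variable names hostfile and host_count are illustrative only, and nothing here is specific to Intel MPI.

```shell
# Recreate the hosts file the same way as in the original post.
hostfile=mic0.hosts
echo mic0 > "$hostfile"

# Dump every byte so stray carriage returns or trailing spaces become visible.
od -c "$hostfile"

# Count non-empty lines; a single-host file should report 1.
host_count=$(grep -c . "$hostfile")
echo "hosts listed: $host_count"
```

If the od output shows anything other than the host name followed by a single \n, the file is not identical to what -host mic0 passes on the command line.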




Thank you for your response.

I am using Intel MPI v4.1.3.045.

I construct mic0.hosts using "echo mic0 > mic0.hosts". It does contain only mic0 as confirmed by cat:

$ cat mic0.hosts

If I construct the hosts file the same way but with non-Xeon Phi hosts, then the program executes correctly. I would have expected the -host option and the -f option to be equivalent; however, I observe a segmentation fault with one and proper execution with the other.


I notice that in your original example, you specify -perhost on the command line when you are using the coprocessor but not when you are using the non-coprocessor hosts. Is this just a typo? If not, could you try your example using the -perhost in both cases?
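One way to keep the comparison consistent is to generate all four variants from one place. This is only a sketch: the function name print_variants is hypothetical, the host and file names are taken from the original post, and the commands are printed rather than executed, so it can be reviewed before anything is launched.

```shell
hostfile=mic0.hosts   # hosts file from the original post
host=mic0             # coprocessor host name from the original post

# Print (not run) each launch style with and without -perhost.
print_variants() {
    for perhost in "" "-perhost 1"; do
        # $perhost is intentionally unquoted so the empty case adds no argument.
        echo mpirun $perhost -n 2 -f "$hostfile" ./hello
        echo mpirun $perhost -n 2 -host "$host" ./hello
    done
}

print_variants
```

Removing the echo in front of each mpirun would launch the commands for real; that step assumes the hello binary is already in place on the target host.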

I have not been able to duplicate this behavior. If this is still a problem, please let us know.

Thank you for checking in, Frances. I checked again today and there is no problem with using the -f option, although I haven't updated any of the software involved. It works both with the -perhost option and without it, so it seems I am no longer able to reproduce the problem myself.

I will post here again if the problem resurfaces. Thank you for your help.

