Message Passing Interface

Need help making sense of NBC performance (MPI3)

Hello everyone,

I am fairly new to parallel computing, but I am working on a legacy code that uses real-space domain decomposition for electronic structure calculations. I have spent a while modernizing the main computational kernel to hybrid MPI+OpenMP, and I upgraded the communication pattern to use a nonblocking neighborhood alltoallv for the halo exchange and a nonblocking allreduce for the other communication in the kernel. I have now started to focus on "communication hiding", so that the calculations and communication happen alongside each other.

Can each thread on Xeon Phi be given private data areas in the offload model


I want to calculate a Jacobian matrix, which is a sum of 960 (to keep it simple) 3x3 matrices, by distributing the calculation of these 3x3 matrices to a Xeon Phi card. The calculation of the 3x3 matrices uses a third-party library whose subroutines use an integer vector not only to store parameter values but also to write and read intermediate results. It is therefore necessary for each task to have this integer vector protected from other tasks. Can this be achieved at the physical core level, or even for each thread (each Xeon Phi has 60x4=240 threads)?
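For illustration only, here is a minimal sketch of the "one work vector per thread" idea using plain OpenMP in C. It is not specific to the Xeon Phi offload model, and it assumes the library keeps all of its state in the integer vector passed to it. The names `compute_block`, `sum_blocks`, and `WORK_LEN` are hypothetical stand-ins, not part of any real library.

```c
#include <assert.h>
#include <stddef.h>

#define N_TASKS  960   /* number of 3x3 contributions, from the question */
#define WORK_LEN 64    /* hypothetical length of the library's work vector */

/* Hypothetical stand-in for the third-party routine: it uses iwork both
   for parameters and for intermediate results, then fills a 3x3 block. */
static void compute_block(int task, int *iwork, double out[9])
{
    iwork[0] = task;                      /* parameter value */
    for (int k = 1; k < WORK_LEN; ++k)
        iwork[k] = iwork[k - 1] + 1;      /* intermediate results */
    for (int k = 0; k < 9; ++k)
        out[k] = (double)(iwork[k] % 7);
}

/* Sum all 3x3 blocks into jac (stored as 9 doubles). */
void sum_blocks(double jac[9])
{
    assert(jac != NULL);
    for (int k = 0; k < 9; ++k)
        jac[k] = 0.0;

    #pragma omp parallel
    {
        /* Declared inside the parallel region, so every thread gets
           its own copy; no other thread can touch this scratch space. */
        int    iwork[WORK_LEN];
        double local[9] = {0.0};
        double blk[9];

        #pragma omp for
        for (int t = 0; t < N_TASKS; ++t) {
            compute_block(t, iwork, blk);
            for (int k = 0; k < 9; ++k)
                local[k] += blk[k];
        }

        /* Combine the per-thread partial sums one thread at a time. */
        #pragma omp critical
        for (int k = 0; k < 9; ++k)
            jac[k] += local[k];
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the same code runs serially. Whether this is enough in practice depends on the library not keeping any hidden global state, which the question does not say.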

mpirun with bad hostname hangs with [ssh] <defunct> until Enter is pressed

We have been experiencing hangs with our MPI-based application and our investigation led us to observing the following behaviour of mpirun:

mpirun -n 1 -host <good_hostname> hostname works as expected

mpirun -n 1 -host <bad_hostname> hostname hangs, during which ps shows: 

Varying Intel MPI results using different topologies


I am compiling and running a massive electronic structure program on an NSF supercomputer.  I am compiling with the intel/15.0.2 Fortran compiler and impi/5.0.2, the latest-installed Intel MPI library.

The program has hybrid parallelization (MPI and OpenMP).  When I run the program on a molecule using 4 MPI tasks on a single node (no OpenMP threading anywhere here), I obtain the correct result.

However, when I spread out the 4 tasks on 2 nodes (still 4 total tasks, just 2 on each node), I get what seem to be numerical-/precision-related errors.
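One possible contributor to differences like this, though nothing in the post confirms it is the cause here, is that floating-point addition is not associative, so a reduction that combines partial sums in a different order can return a slightly different result. A small C sketch of the effect (the function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Add the elements from first to last. */
double sum_forward(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Add the same elements from last to first. */
double sum_backward(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double s = 0.0;
    for (size_t i = n; i > 0; --i)
        s += x[i - 1];
    return s;
}
```

With an input such as `{1.0, 1e30, -1e30}`, the two functions disagree under standard IEEE double arithmetic: going forward, the 1.0 is absorbed by 1e30 before the cancellation, while going backward the large terms cancel first and the 1.0 survives. Real runs usually show much smaller discrepancies, so large errors would point to something else.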

Debugging Fortran MPI codes in VS2012 and Intel MPI


Before this I was using VS2008 with ifort 11 and MPICH.

I followed the first method (attaching to a currently running process, with one VS window for all selected MPI processes) from:

It worked, but failed for np >= 4, which seems to be an MPICH problem.

However, using the new setup, I can't get it to work, even with np = 1 or 2. Error is:

Books - Message Passing Interface (MPI)

This book covers essential concepts of the Message Passing Interface (MPI). From this book, the reader will gain insights into utilizing MPI to write portable parallel code. The book covers the following essential elements of MPI: Sending and receiving with MPI_Send and MPI_Recv; Dynamic receiving with MPI_Probe and MPI_Status; A collective communication introduction with MPI_Bcast; Common collectives – MPI_Scatter, MPI_Gather, and MPI_Allgather; and Using MPI_Reduce and MPI_Allreduce for parallel number reduction.
  • Developers
  • Professors
  • Students
  • Linux*
  • C/C++
  • Fortran
  • Beginner
  • Intermediate
  • Message Passing Interface
  • Academic
  • Cluster Computing
  • Code Modernization

Intel® Parallel Studio XE 2016 Cluster Edition Initial Release Readme

    Intel® Parallel Studio XE 2016 Cluster Edition for Linux* and Windows*

    Deliver top application performance and reliability with the Cluster Edition of Intel® Parallel Studio XE 2016. This C++ and Fortran software development suite simplifies designing, building, debugging, and tuning applications that take advantage of scalable MPI, thread, and vector parallel processing to boost application performance.

    Key Features

  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 10
  • Microsoft Windows* 8.x
  • C/C++
  • Fortran
  • Intel® Parallel Studio XE Cluster Edition
  • Message Passing Interface
  • Cluster Computing

MPICH3.14 with Intel C++ compiler on OS X 10.10.3, Error: dyld: Library not loaded: libiomp5.dylib

    Hi friends:

    I installed MPICH with the Intel C++ compiler on OS X. When I compile code linked with the MKL library and run it, I get this error:

    mpicxx -mkl testcode.cpp -o testcode

    mpiexec -n 3 ./testcode

    dyld: Library not loaded: libiomp5.dylib

      Referenced from: /Users/.....

      Reason: image not found


    How can I fix this?



    Seg Fault when using US NFS install of MPI from site in Russia


    One of my team members from Russia is accessing an NFS installation of MPI located at a US site. When this team member runs the simple ring application test.c, she encounters a segmentation fault when running with four processes and one process per node. This does not happen for the team members based at US sites. The seg fault also does not happen when the application is executed on only a single node, the login node.

    The test.c application was compiled by each team member in this way (in a user-specific scratch space in the US NFS allocation) :

    MPI_Comm_spawn strips empty strings from argv

    Hi. I'm using Intel MPI 5 on Windows and have the following problem. I have a C++ application that spawns worker processes using MPI_Comm_spawn() and passes parameters via the argv argument. I have an array of strings that is passed this way; however, one of these strings is empty. When the worker process receives the argv array, the empty string has been removed from the array.
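One possible workaround, sketched below in C under the assumption that only empty strings are dropped, is to make sure no argument is ever empty: the parent prepends a marker character to every string before building argv, and the worker strips it again. The helpers `encode_arg` and `decode_arg` and the marker `ARG_PREFIX` are hypothetical, not part of MPI or Intel MPI.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ARG_PREFIX '='   /* hypothetical marker; any fixed character works */

/* Parent side: return a newly allocated copy of s with the marker in
   front, so that "" becomes "=". The caller frees the result.
   Returns NULL if the allocation fails. */
char *encode_arg(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    char *out = malloc(n + 2);            /* marker + text + terminator */
    if (out == NULL)
        return NULL;
    out[0] = ARG_PREFIX;
    memcpy(out + 1, s, n + 1);            /* copies the terminator too */
    return out;
}

/* Worker side: skip the marker if present and return the original text.
   The result points into s, so it must not be freed separately. */
const char *decode_arg(const char *s)
{
    assert(s != NULL);
    return (s[0] == ARG_PREFIX) ? s + 1 : s;
}
```

The parent would pass the encoded strings in the argv array given to MPI_Comm_spawn(), and the worker would call `decode_arg` on each entry it receives. This sidesteps the behaviour rather than explaining it.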
