Message Passing Interface

Issue with Intel MPI library on Microsoft Azure machines


I am trying to set up RDMA between two Azure VMs using Intel's MPI Library (v5.1.1.109). Both machines can connect to each other remotely over SSH, and using the pingpong utility in the following way, I can get latency numbers without any errors.
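The exact command was not preserved here; for reference, a typical latency check with the PingPong test from the Intel MPI Benchmarks (shipped with Intel MPI) looks something like the following, with `vm1`/`vm2` as placeholder hostnames and the install path adjusted to your setup:

```shell
# Source the Intel MPI environment (adjust the install path as needed),
# then run one rank on each VM and measure point-to-point latency/bandwidth.
source /opt/intel/impi/5.1.1.109/bin64/mpivars.sh
mpirun -hosts vm1,vm2 -ppn 1 -n 2 IMB-MPI1 PingPong
```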

No Cost Options for Intel Parallel Studio XE, Support Yourself, Royalty-Free

Intel® Parallel Studio XE is a very popular product from Intel that includes the Intel® Compilers, the Intel® Performance Libraries, tools for analysis, debugging, and tuning, tools for MPI, and the Intel® MPI Library. Did you know that some of these are available for free? Here is a guide to what is available free from the Intel Parallel Studio XE suites.

Intel® Parallel Studio XE 2016: High Performance for HPC Applications and Big Data Analytics

Intel® Parallel Studio XE 2016, launched on August 25, 2015, is the latest installment in our developer toolkit for high performance computing (HPC) and technical computing applications. This suite of compilers, libraries, debugging facilities, and analysis tools targets Intel® architecture, including support for the latest Intel® Xeon® processors (codenamed Skylake) and Intel® Xeon Phi™ processors (codenamed Knights Landing). Intel® Parallel Studio XE 2016 helps software developers design, build, verify, and tune code in Fortran, C++, C, and Java.

Need help making sense of NBC performance (MPI3)

Hello everyone,

I am fairly new to parallel computing, but I am working on a legacy code that uses real-space domain decomposition for electronic structure calculations. I have spent a while modernizing the main computational kernel to hybrid MPI+OpenMP and upgraded the communication pattern to use a nonblocking neighborhood alltoallv for the halo exchange and a nonblocking allreduce for the kernel's other communication. I have now started to focus on "communication hiding", so that calculation and communication proceed alongside each other.
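Communication hiding with MPI-3 nonblocking collectives generally follows the pattern below. This is a generic 1-D sketch (not the poster's kernel), with illustrative buffer sizes and an MPI-3 library assumed:

```c
/* Sketch: overlap a neighborhood halo exchange and an allreduce
   with independent computation, using MPI-3 nonblocking collectives.
   All names and sizes are illustrative. Build with mpicc, run with
   e.g. mpirun -n 4 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* 1-D periodic Cartesian topology: two neighbors per rank */
    int dims[1] = {0}, periods[1] = {1}, nranks, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);
    MPI_Dims_create(nranks, 1, dims);
    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 1, dims, periods, 0, &cart);
    MPI_Comm_rank(cart, &rank);

    /* one double to/from each neighbor */
    double sendbuf[2] = {rank + 0.1, rank + 0.2}, recvbuf[2];
    int counts[2] = {1, 1}, displs[2] = {0, 1};

    /* start the halo exchange... */
    MPI_Request halo_req;
    MPI_Ineighbor_alltoallv(sendbuf, counts, displs, MPI_DOUBLE,
                            recvbuf, counts, displs, MPI_DOUBLE,
                            cart, &halo_req);

    double interior = rank * 2.0;   /* ...overlap: work needing no halo */

    MPI_Wait(&halo_req, MPI_STATUS_IGNORE);
    double boundary = recvbuf[0] + recvbuf[1];  /* work that needs the halo */

    /* start the reduction, overlap more independent work, then wait */
    double local = interior + boundary, global;
    MPI_Request red_req;
    MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, cart, &red_req);
    /* ...more halo-independent work would go here... */
    MPI_Wait(&red_req, MPI_STATUS_IGNORE);

    if (rank == 0) printf("global = %f\n", global);
    MPI_Finalize();
    return 0;
}
```

Note that actual overlap ("progress") depends on the MPI implementation; periodically calling MPI_Test on the request during the computation phase is a common way to nudge progress along.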

Can each thread on Xeon Phi be given private data areas in the offload model


I want to calculate a Jacobian matrix, which is a sum of 960 (to keep it simple) 3x3 matrices, by distributing the calculation of these 3x3 matrices to a Xeon Phi card. The calculation of the 3x3 matrices uses a third-party library whose subroutines use an integer vector not only for storing parameter values but also for writing and reading intermediate results. It is therefore necessary for each task to have this integer vector protected from other tasks. Can this be achieved at the physical-core level, or even per thread (each Xeon Phi has 60x4 = 240 threads)?
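The usual way to give each thread its own scratch array is OpenMP's `private` clause (or `threadprivate` for copies that persist across parallel regions), and the same clauses should be honored inside an offload region on the coprocessor. A minimal host-side sketch, with `compute_block` standing in for the third-party routine (all names hypothetical, offload pragmas omitted):

```c
/* Sketch: per-thread private workspace for summing 960 3x3 matrices.
   private(iwork, m) gives every OpenMP thread its own copy of the
   integer work vector, so concurrent "library" calls cannot clobber
   each other's intermediate results. */
#include <string.h>

#define NBLK 960   /* number of 3x3 contributions */
#define WLEN 16    /* illustrative workspace length */

/* Stand-in for the third-party subroutine: it writes intermediate
   results into iwork, so concurrent calls must not share iwork. */
static void compute_block(int k, int iwork[WLEN], double m[3][3])
{
    memset(iwork, 0, sizeof(int) * WLEN);
    iwork[0] = k;                             /* "intermediate result" */
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            m[i][j] = (i == j) ? 1.0 : 0.0;   /* each block adds I */
}

void jacobian_sum(double J[3][3])
{
    int iwork[WLEN];
    double m[3][3];
    memset(J, 0, sizeof(double) * 9);

    /* On Xeon Phi this region would sit inside the offload section;
       the key point is the private(iwork, m) clause. */
    #pragma omp parallel private(iwork, m)
    {
        double local[3][3];                   /* per-thread partial sum */
        memset(local, 0, sizeof local);
        #pragma omp for
        for (int k = 0; k < NBLK; k++) {
            compute_block(k, iwork, m);
            for (int i = 0; i < 3; i++)
                for (int j = 0; j < 3; j++)
                    local[i][j] += m[i][j];
        }
        #pragma omp critical                  /* merge partial sums */
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                J[i][j] += local[i][j];
    }
}
```

With the stand-in above, the result is 960 times the identity matrix regardless of thread count; the same structure works serially if OpenMP is disabled.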

mpirun with bad hostname hangs with [ssh] <defunct> until Enter is pressed

We have been experiencing hangs with our MPI-based application and our investigation led us to observing the following behaviour of mpirun:

mpirun -n 1 -host <good_hostname> hostname works as expected

mpirun -n 1 -host <bad_hostname> hostname hangs, during which ps shows: 

Varying Intel MPI results using different topologies


I am compiling and running a massive electronic structure program on an NSF supercomputer.  I am compiling with the intel/15.0.2 Fortran compiler and impi/5.0.2, the latest-installed Intel MPI library.

The program has hybrid parallelization (MPI and OpenMP).  When I run the program on a molecule using 4 MPI tasks on a single node (no OpenMP threading anywhere here), I obtain the correct result.

However, when I spread the 4 tasks across 2 nodes (still 4 tasks total, 2 per node), I get what appear to be numerical/precision-related errors.
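One common explanation (not a diagnosis of this particular program) is that floating-point addition is not associative, and an MPI library may pick a different reduction algorithm, and hence a different combining order, for an intra-node versus inter-node topology. A small self-contained illustration:

```c
/* Floating-point addition is not associative: a parallel reduction's
   result can legitimately change when the combining order changes,
   e.g. when MPI_Allreduce uses a different algorithm on 1 vs 2 nodes.
   Scenario: one rank holds 1e16, four ranks hold 1.0 each. */

double seq_sum(void)              /* strictly rank-by-rank order */
{
    double s = 1e16;
    for (int i = 0; i < 4; i++)
        s += 1.0;                 /* each 1.0 is below 1e16's ulp (2.0) */
    return s;                     /* 1e16: the small terms are lost */
}

double tree_sum(void)             /* pairwise/tree order */
{
    double small = (1.0 + 1.0) + (1.0 + 1.0);
    return 1e16 + small;          /* 1e16 + 4.0: 4.0 is representable */
}
```

Here the two orderings differ by exactly 4.0. If this is the cause, the discrepancy should shrink with higher-precision accumulation; Intel MPI also exposes the I_MPI_ADJUST_* environment variables (e.g. I_MPI_ADJUST_ALLREDUCE) to pin a collective to one algorithm, which can help test the hypothesis.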

Debugging Fortran MPI codes in VS2012 and Intel MPI


Before this I was using VS2008 with ifort 11 and MPICH.

I followed the first method (attaching to a currently running process, with one VS window for all selected MPI processes) from:

It worked, but failed for np >= 4; that seems to be an MPICH problem.

However, using the new setup, I can't get it to work, even with np = 1 or 2. Error is:

Books - Message Passing Interface (MPI)

This book covers essential concepts of the Message Passing Interface (MPI). From this book, the reader will gain insights into utilizing MPI to write portable parallel code. The book covers the following essential elements of MPI: Sending and receiving with MPI_Send and MPI_Recv; Dynamic receiving with MPI_Probe and MPI_Status; A collective communication introduction with MPI_Bcast; Common collectives – MPI_Scatter, MPI_Gather, and MPI_Allgather; and Using MPI_Reduce and MPI_Allreduce for parallel number reduction.
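As a taste of the book's starting point, here is a minimal MPI_Send/MPI_Recv pair (a sketch in the spirit of those chapters, not an excerpt from the book):

```c
/* Minimal blocking send/receive: rank 0 sends an int to rank 1.
   Build with mpicc, run with: mpirun -n 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int token;
    if (rank == 0) {
        token = 42;
        MPI_Send(&token, 1, MPI_INT, 1 /* dest */, 0 /* tag */,
                 MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&token, 1, MPI_INT, 0 /* source */, 0 /* tag */,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", token);
    }
    MPI_Finalize();
    return 0;
}
```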
Intel® Parallel Studio XE 2016 Cluster Edition Initial Release Readme

Deliver top application performance and reliability with the Cluster Edition of Intel® Parallel Studio XE 2016. This C++ and Fortran software development suite simplifies designing, building, debugging, and tuning applications that exploit scalable MPI, thread, and vector parallelism to boost application performance.