Intel® Clusters and HPC Technology

Infiniband and MPI_THREAD_MULTIPLE

Hi!

I would like to use Intel MPI (IMPI) over InfiniBand together with the MPI_THREAD_MULTIPLE mode. I tested this combination with the Intel MPI Benchmark Suite, which mostly runs fine but crashes in the Bcast benchmark.

Is it safe to use IMPI that way?

Edit: Some additional information.
OS: Windows Server 2008 R2
IMPI version: 4.1 Build 08/22/2012
Intel Benchmark Suite version: 3.2.3
InfiniBand hardware: Mellanox ConnectX-2 with OFED providers
Provider used by IMPI: DAPL

everything runs on one core

Hi,

I hope you are doing well. I am trying to build an application that uses multiple CPUs connected via MPI over a network. I am able to compile my code. However, I have an issue for which I seek your help through this forum.

The toolchain I am using is

Compiler XE for applications running on Intel(R) 64, Version 12.1.6.361 Build 20120821
Copyright (C) 1985-2012 Intel Corporation.

I wish to use the Intel compilers with Intel MKL and Intel MPI.

This is the link line that I got from the Link Line Advisor

MPI_Barrier bug with defined communicators.

Hi,

I am working with Intel MPI version 4.1.0.024 and I have found a problem with the MPI_Barrier() function (possibly a bug).

In the attached code I create a new process via the MPI_Comm_spawn function. Then I merge the intercomm and the parent communicator with the MPI_Intercomm_merge function and call MPI_Barrier() on the new communicator.

The problem is some processes don't continue the execution (they remain held in the MPI_Barrier() function).

I have tested the code with other MPI implementations and it works fine.

MPI MPMD fault tolerance support

Hi,

I would really appreciate some help. I would like to know whether Intel MPI supports fault tolerance (run-through stabilisation) for multiple program, multiple data (MPMD) applications.

I have read the Intel MPI fault tolerance documentation. I am running a master-worker application, where the master and worker code are separate and where there is no communication amongst workers. My configure command looks like this:

problem with mpi_comm_accept

I am trying to set up a server-client pair that establishes a connection (after being launched independently) using MPI_Comm_accept and MPI_Comm_connect. The server successfully waits for a connection request using MPI_Comm_accept. The client successfully connects using MPI_Comm_connect. Both the server and the client return without any error, but a negative handle is returned in 'newcomm' to both the server and the client. I launch both using mpiexec and have mpd running.

I cannot figure out what is wrong and it is probably something simple. Any ideas?

SERVER:

IMB-MPI1.4.0.3 fails with Signal 1 hangup errors

We have several new IBM iDataPlex systems. Some of our codes compiled with Intel 12.1 and INTEL-MPI-4.0.3 sometimes fail with this error:

"APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)"

I can consistently replicate this error with the Intel IMB-MPI1.4.0.3 benchmark system on two nodes (32 cores).

The error above happens in the Allgatherv benchmark using 32 processes after the 8192-byte messages (see below).

mpitune critical errors

I'm attempting to use mpitune to get optimized IMPI environment settings for an application. 

I ran it with the following command:

mpitune -d -hf $nodelist -od $pwd -avd min -pm hydra -a \"mpirun -ppn 16 -np 1001 ./myapplication\"
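As a rough check of the process layout that command requests, the arithmetic can be sketched in Python. This assumes ranks are placed in blocks that fill each node up to the `-ppn` value before moving on to the next node; the helper name `node_layout` is hypothetical and is not part of mpitune or mpirun.

```python
def node_layout(np_total, ppn):
    """Return (nodes_used, ranks_on_last_node) for a block placement.

    Assumption: each node receives `ppn` ranks before the next node is used.
    Hypothetical helper for illustration, not an Intel MPI API.
    """
    if np_total <= 0 or ppn <= 0:
        raise ValueError("np_total and ppn must be positive")
    nodes = -(-np_total // ppn)  # ceiling division without floats
    last = np_total - (nodes - 1) * ppn
    return nodes, last


# Values taken from the command line in the post: -ppn 16 -np 1001
nodes, last = node_layout(1001, 16)
print(f"{nodes} nodes, {last} ranks on the last node")
```

Under that assumption, 1001 ranks at 16 per node spread over 63 nodes, with the last node only partly filled. What the launcher does when the host file lists fewer hosts is outside this sketch.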

During the tuning, I got the following critical errors:

intel mpi error

We have a new cluster with a Mellanox FDR InfiniBand interconnect and sometimes get the following error when running Intel MPI:

[15] Abort: Error code in polled desc!at line 2346 in file ../../ofa_init.c

[16] Abort: Error code in polled desc!at line 2346 in file ../../ofa_init.c

[16] Abort: Got FATAL event 3at line 1010 in file ../../ofa_utility.c

[159] Abort: Error code in polled desc!at line 2346 in file ../../ofa_init.c

[0] Abort: Error code in polled desc!at line 2346 in file ../../ofa_init.c
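When many ranks abort at once, grouping the messages by source location can show whether they share one failure site. Below is a minimal Python sketch, assuming each message sits on its own line in the form `[rank] Abort: <message>at line <N> in file <path>`. The function name `group_aborts` is illustrative, and the sample lines are reassembled from the output quoted in the post.

```python
import re
from collections import defaultdict

# Sample lines reassembled from the post, one abort message per line.
LOG = """\
[15] Abort: Error code in polled desc!at line 2346 in file ../../ofa_init.c
[16] Abort: Error code in polled desc!at line 2346 in file ../../ofa_init.c
[16] Abort: Got FATAL event 3at line 1010 in file ../../ofa_utility.c
[159] Abort: Error code in polled desc!at line 2346 in file ../../ofa_init.c
[0] Abort: Error code in polled desc!at line 2346 in file ../../ofa_init.c
"""

# Assumed line shape; the message and "at line" may be run together.
ABORT_RE = re.compile(r"^\[(\d+)\] Abort: (.*?)\s*at line (\d+) in file (\S+)$")


def group_aborts(text):
    """Map (file, line, message) to the list of ranks that reported it."""
    by_site = defaultdict(list)
    for line in text.splitlines():
        match = ABORT_RE.match(line.strip())
        if match:
            rank, message, lineno, fname = match.groups()
            by_site[(fname, int(lineno), message)].append(int(rank))
    return dict(by_site)


for (fname, lineno, message), ranks in group_aborts(LOG).items():
    print(f"{fname}:{lineno} {message!r} ranks={sorted(ranks)}")
```

Lines that do not match the assumed shape are skipped, so output that is still interleaved would need to be separated first.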

How to pin Intel MPI processes within Torque cpusets? Set domain issues

Hi,

I think I have a problem with process pinning with an older version of Intel MPI (4.0.1). The version cannot be changed because it is bundled with the user's application (Accelrys Material Studio) and there are tons of scripts surrounding it. The code works when started interactively, but when run under the Torque batch system, the following messages appear:
