DAPL error with impi 4.1.2


Dear Experts,


After an update of our cluster I started receiving DAPL errors. I compile my Fortran code with Intel MPI (impi) 4.1.2. The errors occur when I run my code on complicated models with large memory demands, and they appear after about 75% of the job has completed. This is rather strange, because my nodes have 64 GB of memory each, which should be more than enough for even the most complicated problems. Moreover, the same inputs ran without any problems before the update.

Cluster Studio Install Error

I am trying to install Cluster Studio on Linux. The install froze, so I had to restart the installation. When I tried to run the installer again, I got the following message:

Another instance of the installation program started by root has been detected. Please quit the other instance and try again.

I thought I had killed the previous installer, but apparently it is still running. Does anyone know what the process that the installer runs as is called?
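One way to look for a leftover process without knowing its exact name is to scan the command lines under /proc for a keyword. The following is only a rough sketch: the helper name `find_processes` and the search keyword "install" are assumptions for illustration, not the actual name of the installer process, and it only works on Linux systems that expose /proc.

```python
import os


def find_processes(keyword):
    """Return (pid, command line) pairs whose command line contains keyword.

    Hypothetical helper: reads /proc/<pid>/cmdline for every running process.
    """
    matches = []
    if not os.path.isdir("/proc"):
        return matches  # not a Linux-style /proc filesystem
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # process exited or is not readable
        # Arguments are NUL-separated; join them with spaces for display.
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        if keyword.lower() in cmdline.lower():
            matches.append((int(entry), cmdline))
    return matches


if __name__ == "__main__":
    # "install" is a guess at a substring of the installer's command line.
    for pid, cmd in find_processes("install"):
        print(pid, cmd)
```

Run as root, this lists candidate PIDs together with their full command lines, which can then be checked by hand before killing anything.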


Intel Micro Benchmark Result Problem with PCI-passthrough via FDR infiniband

Hi everyone, I am still evaluating cluster performance. For now, I have moved on to virtualization with PCI passthrough of FDR InfiniBand on the KVM hypervisor.

My problem is that Sendrecv throughput decreases by half compared with the physical machine. I use 1 rank per node. For example:

Nodes    Bare-metal (MB/s)    PCI-passthrough (MB/s)
2        14,600               13,000
4        14,500               12,000
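For reference, the relative drop for the two rows shown above can be worked out with a few lines of Python. This is just a sketch using the numbers from the table; the names `results` and `relative_drop` are made up for illustration.

```python
# Throughput figures (MB/s) copied from the two table rows above:
# node count -> (bare-metal, PCI-passthrough)
results = {
    2: (14600, 13000),
    4: (14500, 12000),
}


def relative_drop(bare_metal, passthrough):
    """Percentage of bare-metal throughput lost under PCI passthrough."""
    return (bare_metal - passthrough) / bare_metal * 100


drops = {nodes: relative_drop(*pair) for nodes, pair in results.items()}

for nodes, drop in drops.items():
    print(f"{nodes} nodes: {drop:.1f}% lower than bare metal")
```

For these two rows the drop comes out at about 11% on 2 nodes and about 17% on 4 nodes.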

Azure multi-agent failure on MPI_Init

Hey Forum,

I'm trying to run on the Azure cloud using Intel's MPI implementation, but there is a problem. Everything works as expected when run on one Agent (8 processors); however, anything with 2 or more Agents fails in MPI_Init() roughly 25% of the time. The failure is instantaneous (see output below). I was also able to reproduce this crash with a simple point-to-point send between all processors. I'm unable to reproduce the issue on my local system.

