InfiniBand and MPI_THREAD_MULTIPLE

Hi!

I would like to use Intel MPI (IMPI) over InfiniBand together with MPI_THREAD_MULTIPLE. I tested this combination with the Intel MPI Benchmarks (IMB) suite, which mostly runs fine but crashes in the Bcast benchmark.

Is it safe to use IMPI that way?

Edit: Some additional information:
OS: Windows Server 2008 R2
IMPI version: 4.1 Build 08/22/2012
Intel MPI Benchmarks (IMB) version: 3.2.3
InfiniBand hardware: Mellanox ConnectX-2 with OFED providers
Provider used by IMPI: DAPL
 

James Tullos (Intel):

Hi Stefan,

Please send me the output with I_MPI_DEBUG=5.  Are you using FCA?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi James!

Thank you for the quick response!
No, we are not using FCA. You can find the output at http://pastebin.com/nk5K2XRD.

I am mostly concerned about the DAPL transport layer, which (as far as I know) is not thread-safe.

James Tullos (Intel):

Hi Stefan,

I am unable to reproduce this on Linux*.  I do not have access to a Windows* cluster to test this.  I'll check with our developers and get their opinions.  In the meantime, you could try using the -mt_mpi option to link with the multithreaded version of the Intel® MPI Library.  I don't think this will help here, but in general if you are using multithreaded MPI this is good practice.

Also, for future reference, you can attach a file directly to a post here.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi James,

I am linking directly to impimt.lib, which should be equivalent to the -mt_mpi option. Without it, we were not able to get MPI_THREAD_MULTIPLE, because the library defaulted to MPI_THREAD_SINGLE (with which all tests ran without a problem).

James Tullos (Intel):

Hi Stefan,

Thanks for the information.  Can you give me the full details of how you are compiling?  Did you make any changes to IMB?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

We were notified that it is necessary to explicitly link the libmpi_mt library in place of libmpi to support this mode.  There has been some question whether this is a bug (I would certainly think so if it is not documented properly).

Anyway, you should test MPI_Init_thread() by comparing its required and provided arguments; the provided value should reveal when the active library version does not support the requested level.

I am using the provided Visual Studio project to build the IMB executable, following the instructions in the ReadMe.txt. The compiler and linker are the standard Visual C++ compiler and linker. The only changes I have made are linking directly to libmpimt.lib and adding the preprocessor definition "USE_MPI_INIT_THREAD".

USE_MPI_INIT_THREAD causes IMB to call MPI_Init_thread with the requested threading level MPI_THREAD_MULTIPLE, which is granted by the library.

I just ran some additional tests, which had some interesting results. It does not matter which threading level is selected: running IMB with MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED, or MPI_THREAD_MULTIPLE (using either MPI_Init_thread or MPI_Init) with libmpimt over the DAPL transport layer causes it to crash. However, no error occurs if the transport layer is not DAPL (tested with the Socket/TCP fabric). There is also no error if libmpi is used.

I think the problem here really might be in the DAPL transport layer. As far as I know, the uDAPL library is not thread-safe, so that might be the source of the error. This also matches the actual error reported: Assertion failed in file .\dapl_conn_rc.c at line 1128: 0.

James Tullos (Intel):

Tim,

If you link with the single-threaded library but try to initialize with MPI_Init_thread, then the provided threading level returned by MPI_Init_thread will only be MPI_THREAD_SINGLE.  This is something left to the developer to check.  IMB does report the provided threading level, but it does nothing to stop the run if the requested threading level is not met (it does not use threads anyway).

Using -mt_mpi will link to libmpi_mt, as will several other compiler options (-Qopenmp, -Qparallel, -threads, -reentrancy, -reentrancy threaded) if passed through the IMPI compiler scripts.

Stefan,

Thanks for the additional information.  I'll add it to the report; it should help narrow down where the problem is located.  Would it be possible for you to test with the Intel® C Compiler?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

James,

I tested that yesterday but forgot to mention it.
I used the "Use Intel C++" feature in VS2010 to switch over to Intel Composer XE 2013. Nothing changed; the error still occurs at the same place.

James Tullos (Intel):

Stefan,

This should be corrected in Version 5.0 of the Intel® MPI Library, which will be available within the next few weeks.  Once it is available, please test it and verify.

James Tullos (Intel):

Stefan,

Version 5.0 of the Intel® MPI Library is now available.  Please test and verify if the problem is still present.
