Enabling Connectionless DAPL UD in the Intel® MPI Library

What is DAPL UD?

Traditional InfiniBand* support involves MPI message transfer over the Reliable Connection (RC) protocol. While RC is long-standing and rich in functionality, it does have certain drawbacks: because it requires each pair of processes to set up a one-to-one connection at the start of execution, per-process memory consumption can, in the worst case, grow linearly as more MPI ranks are added and the number of pairwise connections grows.

In recent years, the Unreliable Datagram (UD) protocol has emerged as a more memory-efficient alternative to the standard RC transfer. UD implements a connectionless model: each process communicates with any number of peers over a fixed number of UD endpoints, so memory usage stays roughly constant even as more MPI ranks are started.

Availability

There are two aspects to DAPL UD support: availability in the InfiniBand* software stack, and support in the MPI implementation.

The Open Fabrics Enterprise Distribution (OFED™) stack is open source software for high-performance networking applications offering low latencies and high bandwidth. It is developed, distributed, and tested by the Open Fabrics Alliance (OFA) – a committee of industry, academic, and government organizations working to improve and influence RDMA fabric technologies. Support for the DAPL UD extensions is part of OFED 1.4.2 and later. Make sure you have the latest OFED installed on your cluster.
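
If you are unsure which OFED release is installed on your nodes, a quick check is usually possible with the ofed_info utility that ships with OFED (the output format varies somewhat by release):

$ ofed_info -s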

Alternatively, contact your InfiniBand* provider and ask if your cluster’s IB software stack supports DAPL UD.

On the MPI side, the Intel® MPI Library has supported execution over DAPL UD since version 4.0. Make sure you have the latest Intel MPI release installed on your cluster. To download the latest release, log into the Intel® Registration Center or check our website.
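
To confirm which Intel MPI version is active in your environment, you can usually query the runtime directly (the exact option spelling may differ between releases):

$ mpirun -V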

Enabling DAPL UD

To enable usage of DAPL UD with your Intel MPI application, you need to set the following environment variables:

$ export I_MPI_FABRICS=shm:dapl
$ export I_MPI_DAPL_UD=enable

Note that shm:dapl is the default setting for the I_MPI_FABRICS environment variable. It selects the shm device for intra-node communication and the dapl device for communication between nodes.
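
If you prefer not to export these variables in your shell, the same settings can be passed per run through mpirun's -genv option. In this sketch, the process count and application name (./your_app) are placeholders:

$ mpirun -n 64 -genv I_MPI_FABRICS shm:dapl -genv I_MPI_DAPL_UD enable ./your_app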

Selecting the DAPL UD provider

Finally, select a DAPL provider that supports the UD InfiniBand* extensions. While several providers (e.g. scm, ucm) offer this functionality, we recommend the ucm provider, as it offers better scalability and is more suitable for many-core machines. For example, given the following /etc/dat.conf entries:

$ cat /etc/dat.conf
OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
OpenIB-mlx4_0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 1" ""
OpenIB-mlx4_0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 2" ""
ofa-v2-mlx4_0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-mlx4_0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 2" ""
ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""
ofa-v2-mlx4_0-1u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-mlx4_0-2u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx4_0 2" ""

Among the entries above, the DAPL 2.0 scm and ucm providers support the UD extensions; the ucm-based entries are the ones ending in "u" (ofa-v2-mlx4_0-1u and ofa-v2-mlx4_0-2u), which load libdaploucm.so.2. To use the ucm-specific provider, set:

$ export I_MPI_DAPL_UD_PROVIDER=ofa-v2-mlx4_0-1u

Your Intel MPI application will now utilize connectionless communication at runtime.
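
To double-check which fabric and DAPL provider were actually picked up, you can raise the Intel MPI debug level, which prints this information at startup (the exact wording of the output differs between releases; ./your_app is a placeholder):

$ export I_MPI_DEBUG=2
$ mpirun -n 2 ./your_app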


1 comment

dingjun.chencmgl.ca:

I am trying to test the Intel MPI Benchmarks (IMB) 4.0 beta on our Windows PC cluster. Both Intel MPI 5.0 and WinOFED 3.2 are installed on the cluster. When I ran the tests, the following errors always occurred:

C:\Users\dingjun\mpi5tests>mpiexec -configfile config_file

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

The above config_file contains the following content:

-host drmswc4-1 -n 8 -genv I_MPI_FABRICS shm:dapl IMB-MPI1 Exchange

-host drmswc4-2 -n 8 -genv I_MPI_FABRICS shm:dapl IMB-MPI1 Exchange

 

C:\Users\dingjun\mpi5tests>mpiexec -n 4 -env I_MPI_FABRICS shm:dapl IMB-MPI1
dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

dapls_ib_init() NdStartup failed with NTStatus: The specified module could not be found.

job aborted:
rank: node: exit code[: error message]
0: drmswc4-1.cgy.cmgl.ca: 291: process 0 exited without calling finalize
1: drmswc4-1.cgy.cmgl.ca: 291: process 1 exited without calling finalize
2: drmswc4-1.cgy.cmgl.ca: 291: process 2 exited without calling finalize
3: drmswc4-1.cgy.cmgl.ca: 291: process 3 exited without calling finalize

Could you tell me why the above errors occurred? If you are not able to answer this question, could you point me to someone at Intel who can?

By the way, the Intel MPI DAPL option works very well on our Linux cluster; the above problem occurs only on our Windows cluster. In addition, what kind of hardware is used to pass the DAPL over InfiniBand test? We need hardware information such as vendor and model, as well as driver information: is the driver provided by the vendor or open source, and if open source, what is the download link?

I am looking forward to hearing from you and your early response is highly appreciated.

Have a good day.

Dingjun Chen

Office #150, 3553-31 Street NW

Calgary, AB T2L 2K7, Canada
