Experience with various interconnects and DAPL* providers


What DAPL* versions does the Intel® MPI Library support?
The Intel® MPI Library uses the Direct Access Programming Library (DAPL*) as a fabric-independent API to run on fast interconnects like InfiniBand* or Myrinet*. Currently, the Intel MPI Library supports both DAPL* 1.2 and DAPL* 2.0-capable providers. The Intel MPI Library automatically determines the version of the DAPL* standard to which the provider conforms.


How to select one DAPL* provider out of several?
Look at the /etc/dat.conf file or its equivalent on your system. Note that some vendors put this file into a directory other than /etc and may name it differently. Consult your DAPL* provider documentation for details.

Every line of the dat.conf file that does not start with a hash sign ("#") describes a DAPL* provider. The first word on such a line is the provider name to look for. Append a colon (":") and this name to the I_MPI_DEVICE environment variable value when selecting the rdma or rdssm device. For example, if the first word is "OpenIB-cma0", set the I_MPI_DEVICE variable to "rdma:OpenIB-cma0" or "rdssm:OpenIB-cma0".
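
The provider names can be listed and one of them selected like this (a minimal shell sketch, assuming the configuration file is in its default location and contains the "OpenIB-cma0" provider used as an example above):

# list the provider names (the first word of every non-comment line)
grep -v '^#' /etc/dat.conf | awk '{print $1}'

# select the OpenIB-cma0 provider for the rdma device
export I_MPI_DEVICE=rdma:OpenIB-cma0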


How to run MPI programs over InfiniBand*?
To use InfiniBand*, do the following:

  1. Install and configure your InfiniBand hardware. See your InfiniBand hardware vendor's documentation for details.
  2. Install and configure your InfiniBand software, drivers, and DAPL* provider. See your InfiniBand software vendor's documentation for details. Alternatively, if you do not have vendor-supplied InfiniBand DAPL* software, download and install an OpenFabrics* Enterprise Distribution† (OFED*). Use OFED 1.4 or higher.
  3. Test that the installed InfiniBand hardware, software, and DAPL provider work as expected, independent of the Intel MPI Library.
  4. Make sure that the directory containing libdat.so is either listed in /etc/ld.so.conf (and that ldconfig has been run) or is listed in LD_LIBRARY_PATH.
  5. The Intel MPI Library selects the most appropriate fabric combination automatically. Set I_MPI_DEVICE=<device>:<provider> (for example, I_MPI_DEVICE=rdma:OpenIB-cma0) to select InfiniBand explicitly, as shown in the sketch after this list.
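
For example, steps 4 and 5 might look like this (a minimal sketch; "OpenIB-cma0" is the example provider name used above, and $numproc, $application, and $app_args are placeholders for your process count, executable, and arguments):

# check that libdat.so is visible to the dynamic linker
ldconfig -p | grep libdat

# run over InfiniBand with an explicitly selected DAPL provider
mpiexec -np $numproc -env I_MPI_DEVICE rdma:OpenIB-cma0 $application $app_args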

How to run MPI programs over Myrinet*?
To use Myrinet*, do the following:

  1. Install and configure your Myrinet hardware. See your Myrinet hardware vendor's documentation for details.
  2. Install and configure your Myrinet software, drivers, and DAPL* provider. See your Myrinet software vendor's documentation for details. Alternatively, if you do not have vendor-supplied Myrinet DAPL* software, download and install the open source DAPL* provider for Myrinet*†.
  3. Test that the installed Myrinet hardware, software, and DAPL provider work as expected, independent of the Intel MPI Library.
  4. Make sure that the directory containing libdat.so is either listed in /etc/ld.so.conf (and that ldconfig has been run) or is listed in LD_LIBRARY_PATH.
  5. Make sure that an entry for the DAPL provider (libdapl.so) exists, with the correct library path, in /etc/dat.conf.
  6. When executing an MPI program, use the mpiexec -gm or -mx options, or set I_MPI_DEVICE=<device>:<provider> (for example, I_MPI_DEVICE=rdma:GmHca0) to activate Myrinet, as shown in the sketch after this list.
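
For example, steps 5 and 6 might look like this (a minimal sketch; "GmHca0" is the example provider name from the dat.conf sample shown later in this article):

# check that dat.conf contains an entry for the Myrinet DAPL provider
grep GmHca0 /etc/dat.conf

# run over Myrinet with an explicitly selected DAPL provider
mpiexec -np $numproc -env I_MPI_DEVICE rdma:GmHca0 $application $app_args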

 

See the Intel MPI Library Reference Manual for more details.


What does the <provider> field mean in the Intel MPI device specification rdma[:<provider>] or rdssm[:<provider>]?
Use this field to select a particular provider instead of the first valid provider described in the dat.conf file. The default location of the configuration file is /etc/dat.conf.

For example, suppose you have the following dat.conf file:

#
# Generic DAT configuration file
#
# This is a DAPL provider configuration for InfiniBand fast fabric
OpenIB-cma0 u1.2 nonthreadsafe default /opt/ofed/lib64/libdaplcma.so mv_dapl.1.2 "ib0 0" ""
# This is a DAPL provider configuration for Myrinet fast fabric
GmHca0 u1.2 nonthreadsafe default /Myrinet_DAPL_Providers/libdapl.so gm_dapl.1.2 "GmHca0 0" ""

 

Use the following command to utilize the Myrinet* fabric instead of the default InfiniBand* fabric:

mpiexec -np $numproc -env I_MPI_DEVICE rdma:GmHca0 $application $app_args

 

How to verify which device/provider is used for communication?
Set the I_MPI_DEBUG environment variable to 2. The Intel MPI Library will report which device and provider are in use.

For example:

mpiexec -np $numproc -env I_MPI_DEBUG 2 $executable

 

How to specify an alternative path to the DAPL* configuration file dat.conf?
Use the DAT_OVERRIDE environment variable to override the default location of the dat.conf file.
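
For example (the dat.conf path shown here is a placeholder):

export DAT_OVERRIDE=/path/to/my/dat.conf
mpiexec -np $numproc -env I_MPI_DEVICE rdma:OpenIB-cma0 $application $app_args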


I got the ("CMA: unable to open/dev/infiniband/rdma cm") error message while using the Intel MPI Library over OFED*. How do I fix it?
Some of the nodes have not loaded the rdma_cm modules. Run "modprobe rdma_ucm" on the nodes that report the error to load the modules and eliminate this problem.
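
For example, you can load and verify the module on an affected node like this (a minimal sketch; "node001" is a placeholder host name, and loading kernel modules requires root privileges):

# load the RDMA user-space connection manager module
ssh root@node001 modprobe rdma_ucm

# verify that the module is now loaded
ssh root@node001 lsmod | grep rdma_ucm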


Why do I get the error message "librdmacm: kernel ABI version 4 does not match library version 2" while starting the Intel MPI Library over InfiniBand*?
Update the librdmacm library to avoid this issue. Download it from http://www.openfabrics.org† as a separate package or as part of the OFED* distribution.


Why do I get DAT_INTERNAL_ERROR or DAT_INVALID_ADDRESS errors during Intel MPI application startup over InfiniBand*?
Make sure the IP addresses for InfiniBand* (IPoIB) are properly configured. This is required for proper functioning of the OpenSM subnet manager.
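
For example, you can check the IPoIB configuration and connectivity like this (a minimal sketch; "ib0" is the interface name from the dat.conf example above, and "node002" is a placeholder for another node's IPoIB host name or address):

# show the IP configuration of the IPoIB interface
ifconfig ib0

# check IPoIB connectivity to another node
ping -c 3 node002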


How to disable the DAPL* provider version compatibility check at runtime?
Set the I_MPI_CHECK_DAPL_PROVIDER_MISMATCH environment variable to none.
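
For example:

export I_MPI_CHECK_DAPL_PROVIDER_MISMATCH=none

or, equivalently, pass it on the mpiexec command line:

mpiexec -np $numproc -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none $executable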


Operating System:

SUSE* Linux Enterprise Server 10, Red Hat* Enterprise Linux 5.0, SUSE* Linux Enterprise Server 9, Red Hat* Enterprise Linux 4.0

 



4 comments

Gergana S. (Intel):

Thank you for pointing this out, Denis. I've fixed the error.

Regards,
~Gergana

anonymous:

Hi,

Just to point out a typo:
In the answer to the question "How to specify an alternative path to the DAPL* configuration file dat.conf?", the correct environment variable name seems to be "DAT_OVERRIDE" (without the final S):

$ strings /usr/lib64/libdat.so |grep DAT_OVER
DAT_OVERRIDE
DAT Registry: DAT_OVERRIDE, bad filename - %s, aborting
$ rpm -qa | grep dapl
dapl-1.2.14-1.ofed1.4.1
dapl-devel-1.2.14-1.ofed1.4.1
$

Denis

Gergana S. (Intel):

Hi Bert,

I would recommend that you submit your question to the Intel(R) Clusters and HPC Technology forum at http://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/. That's the best place to get your Intel MPI Library questions answered.

Regards,
~Gergana

anonymous:

Hi to All,
we are using sles10sp2, OFED 1.2.5 and intelmpi 3.0.1

When we start a job (torque) with these env variables:
export I_MPI_DEVICE=rdssm:OpenIB-cma
#export I_MPI_DAPL_PROVIDER=OpenIB-cma
export I_MPI_DEBUG=256

and this dat.conf file:
OpenIB-cma u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib0 0" ""
OpenIB-cma-1 u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib1 0" ""
OpenIB-cma-2 u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib2 0" ""
OpenIB-cma-3 u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib3 0" ""
OpenIB-bond u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "bond0 0" ""

we get these messages:
I_MPI: [13] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [13] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [2] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [2] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [14] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [14] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [8] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [4] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [0] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [12] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [6] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [10] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [5] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [15] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [2] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cmaI_MPI: [1] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma

I_MPI: [3] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [13] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [11] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [14] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [9] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [7] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [0] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [0] LIBRARY pinning(): The process is pinned on node052:CPU00I_MPI: [2] MPIDI_CH3_Init():
I_MPI: [0] MPI_Init: The process (pid=5586) started on node052
I_MPI: [1] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [1] LIBRARY pinning(): The process is pinned on node050:CPU00
I_MPI: [1] MPI_Init: The process (pid=10851) started on node050
I_MPI: [4] MPIDI_CH3_Init(): will use rdssm configuration
will use rdssm configurationI_MPI: [3] MPIDI_CH3_Init(): will use rdssm configuration

I_MPI: [2] LIBRARY pinning(): The process is pinned on node052:CPU01
I_MPI: [2] MPI_Init: The process (pid=5579) started on node052
I_MPI: [3] LIBRARY pinning(): The process is pinned on node050:CPU01
I_MPI: [5] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [5] LIBRARY pinning(): The process is pinned on node050:CPU02
I_MPI: [5] MPI_Init: The process (pid=10853) started on node050
I_MPI: [4] LIBRARY pinning(): The process is pinned on node052:CPU02
I_MPI: [4] MPI_Init: The process (pid=5580) started on node052
I_MPI: [3] MPI_Init: The process (pid=10852) started on node050
I_MPI: [6] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [6] LIBRARY pinning(): The process is pinned on node052:CPU03
I_MPI: [6] MPI_Init: The process (pid=5581) started on node052
I_MPI: [7] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [7] LIBRARY pinning(): The process is pinned on node050:CPU03
I_MPI: [7] MPI_Init: The process (pid=10854) started on node050
I_MPI: [8] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [8] LIBRARY pinning(): The process is pinned on node052:CPU04
I_MPI: [8] MPI_Init: The process (pid=5582) started on node052
I_MPI: [10] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [10] LIBRARY pinning(): The process is pinned on node052:CPU05
I_MPI: [10] MPI_Init: The process (pid=5583) started on node052
I_MPI: [9] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [9] LIBRARY pinning(): The process is pinned on node050:CPU04
I_MPI: [9] MPI_Init: The process (pid=10855) started on node050
I_MPI: [11] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [11] LIBRARY pinning(): The process is pinned on node050:CPU05
I_MPI: [11] MPI_Init: The process (pid=10856) started on node050
I_MPI: [12] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [12] LIBRARY pinning(): The process is pinned on node052:CPU06
I_MPI: [12] MPI_Init: The process (pid=5584) started on node052
I_MPI: [15] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [13] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [15] LIBRARY pinning(): The process is pinned on node050:CPU07
I_MPI: [15] MPI_Init: The process (pid=10858) started on node050
I_MPI: [13] LIBRARY pinning(): The process is pinned on node050:CPU06
I_MPI: [13] MPI_Init: The process (pid=10857) started on node050
I_MPI: [14] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [14] LIBRARY pinning(): The process is pinned on node052:CPU07
I_MPI: [14] MPI_Init: The process (pid=5585) started on node052
I_MPI: [5] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 4:inode052 because there is coupling of protocols
I_MPI: [13] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 12:inode052 because there is coupling of protocols
I_MPI: [11] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 10:inode052 because there is coupling of protocols
I_MPI: [1] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 0:inode052 because there is coupling of protocols
I_MPI: [7] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 6:inode052 because there is coupling of protocols
I_MPI: [3] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 2:inode052 because there is coupling of protocols
I_MPI: [15] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 14:inode052 because there is coupling of protocols
I_MPI: [9] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 8:inode052 because there is coupling of protocols
I_MPI: [15] MPIDI_CH3I_RDMA_wait_connect(): [inode050] rejecting CR from 0:inode052 because there is coupling of protocols
I_MPI: [2] MPIDI_CH3I_RDMA_wait_connect(): I_MPI: [6] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 5:inode050 because there is coupling of protocols
[inode052] rejecting CR from 1:inode050 because there is coupling of protocols
I_MPI: [12] MPIDI_CH3I_RDMA_wait_connect(): I_MPI: [8] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 11:inode050 because there is coupling of protocols[inode052] rejecting CR from 7:inode050 because there is coupling of protocols

I_MPI: [14] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 13:inode050 because there is coupling of protocols
I_MPI: [10] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 9:inode050 because there is coupling of protocols
I_MPI: [4] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 3:inode050 because there is coupling of protocols
MPIDI_CH3I_RDMA_wait_connect()

inodeXYZ is the IPoIB address of the node.

Where is my failure?

Thanks,
Bert
