Using Multiple DAPL* Providers with the Intel® MPI Library

By James T.

Published: 09/19/2013   Last Updated: 09/19/2013

Introduction

If your MPI program sends messages of drastically different sizes (for example, some 16-byte messages and some 4-megabyte messages), you want optimum performance at every message size.  This is hard to achieve with a single DAPL* provider: latency dominates the cost of small messages, while bandwidth matters more for large messages, and a provider is often tuned to favor one at the expense of the other.  As of Version 4.1 Update 1, the Intel® MPI Library supports up to three providers for a single rank of your job.

Details

The ideal scenario is to let the automatic provider detection choose the providers for your job.  The automatic detection has additional logic for finding the appropriate adapter on systems with multiple adapters, and can easily assign different adapters to different ranks.  If you want to control the provider selection directly, you can do so with the environment variable I_MPI_DAPL_PROVIDER_LIST.  This takes a comma-separated list of up to three providers with the syntax I_MPI_DAPL_PROVIDER_LIST=<Provider1,Provider2,Provider3> (an example follows the list below), where:

  • Provider1 is used for small messages.  This should be a low-latency provider.
  • Provider2 is used for large messages within a single node.  This is most relevant for jobs involving Intel® Xeon Phi™ coprocessors, or if you use DAPL* for intranode communication.
  • Provider3 is used for large messages between nodes.
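
For example, on a cluster with a Mellanox* InfiniBand* adapter and an Intel® Xeon Phi™ coprocessor, an explicit selection might look like the sketch below.  The provider names are illustrative assumptions: the names actually available on your system come from its /etc/dat.conf and will likely differ.

  # Illustrative only: provider names must match entries in this system's /etc/dat.conf
  export I_MPI_FABRICS=shm:dapl
  export I_MPI_DAPL_PROVIDER_LIST=ofa-v2-mlx4_0-1u,ofa-v2-scif0,ofa-v2-mlx4_0-1
  mpirun -n 64 ./my_app

In this sketch, the first (low-latency) provider carries small messages, the SCIF-based provider carries large intranode transfers involving the coprocessor, and the third provider carries large internode transfers.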

The following environment variables can also take a comma-separated list of values.  The first value is applied to the first provider, the second to the second, and the third to the third.  If only one value is set, it is applied to all providers; if an invalid value is given, all providers fall back to their default values (see the sketch after the list).

  • I_MPI_DAPL_DIRECT_COPY_THRESHOLD (I_MPI_RDMA_EAGER_THRESHOLD, RDMA_IBA_EAGER_THRESHOLD)
  • I_MPI_DAPL_TRANSLATION_CACHE (I_MPI_RDMA_TRANSLATION_CACHE)
  • I_MPI_DAPL_TRANSLATION_CACHE_AVL_TREE
  • I_MPI_DAPL_CONN_EVD_SIZE (I_MPI_RDMA_CONN_EVD_SIZE, I_MPI_CONN_EVD_QLEN)
  • I_MPI_DAPL_RDMA_RNDV_WRITE (I_MPI_RDMA_RNDV_WRITE, I_MPI_USE_RENDEZVOUS_RDMA_WRITE)
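
As a sketch of how the per-provider values line up, the settings below (with illustrative threshold values) give each of the three providers its own direct-copy threshold, while a single translation-cache setting applies to all three:

  # Illustrative thresholds, in bytes, one per provider and in the same order
  # as I_MPI_DAPL_PROVIDER_LIST (small messages, large intranode, large internode)
  export I_MPI_DAPL_DIRECT_COPY_THRESHOLD=16384,32768,65536
  # A single value is applied to every provider
  export I_MPI_DAPL_TRANSLATION_CACHE=enable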

