Using Multiple DAPL* Providers with the Intel® MPI Library

Introduction

If your MPI program sends messages of drastically different sizes (for example, some 16-byte messages and some 4-megabyte messages), you want optimum performance at every message size.  A single DAPL* provider rarely delivers this: latency dominates for small messages, bandwidth matters more for large messages, and providers often trade one for the other.  As of Version 4.1 Update 1, the Intel® MPI Library supports up to three providers for a single rank of your job.

Details

The ideal scenario is to let automatic provider detection choose the providers for your job.  Automatic detection also finds the appropriate adapter on systems with multiple adapters, and can easily set each rank to use a different adapter.  If you want to control provider selection directly, set the environment variable I_MPI_DAPL_PROVIDER_LIST.  It takes a comma-separated list of up to three providers with the syntax I_MPI_DAPL_PROVIDER_LIST=<Provider1,Provider2,Provider3>, where:

  • Provider1 is used for small messages.  This should be a low-latency provider.
  • Provider2 is used for large messages within a single node.  This is most applicable in jobs involving Intel® Xeon Phi™ Coprocessors, or if you are using DAPL* for intranode communications.
  • Provider3 is used for large messages between multiple nodes.
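
As a minimal sketch of the syntax above, the following shell fragment assembles the list from three variables.  The provider names are placeholders assumed for illustration; the names valid on your system are the ones listed in its DAT configuration file (typically /etc/dat.conf).  The helper variable names are hypothetical and are not part of the Intel® MPI Library.

```shell
# Placeholder provider names (assumptions); replace them with entries
# from your system's DAT configuration (typically /etc/dat.conf).
SMALL_MSG_PROVIDER="ofa-v2-mlx4_0-1u"        # Provider1: small messages, low latency
INTRANODE_LARGE_PROVIDER="ofa-v2-scif0"      # Provider2: large messages within a node
INTERNODE_LARGE_PROVIDER="ofa-v2-mcm-1"      # Provider3: large messages between nodes

# Join the three names in order, separated by commas.
export I_MPI_DAPL_PROVIDER_LIST="${SMALL_MSG_PROVIDER},${INTRANODE_LARGE_PROVIDER},${INTERNODE_LARGE_PROVIDER}"

# Show the value that the MPI job will inherit from this shell.
echo "I_MPI_DAPL_PROVIDER_LIST=${I_MPI_DAPL_PROVIDER_LIST}"
```

With the variable exported, launch the job from the same shell as you normally would.  The order of the entries matters, since each position maps to one of the three roles described above.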

The following environment variables can also take a comma-separated list of options.  The first option is applied to the first provider, the second to the second, and the third to the third.  If only one value is set, it is applied to all providers; if an invalid value is given, all providers fall back to their default values.

  • I_MPI_DAPL_DIRECT_COPY_THRESHOLD (I_MPI_RDMA_EAGER_THRESHOLD, RDMA_IBA_EAGER_THRESHOLD)
  • I_MPI_DAPL_TRANSLATION_CACHE (I_MPI_RDMA_TRANSLATION_CACHE)
  • I_MPI_DAPL_TRANSLATION_CACHE_AVL_TREE
  • I_MPI_DAPL_CONN_EVD_SIZE (I_MPI_RDMA_CONN_EVD_SIZE, I_MPI_CONN_EVD_QLEN)
  • I_MPI_DAPL_RDMA_RNDV_WRITE (I_MPI_RDMA_RNDV_WRITE, I_MPI_USE_RENDEZVOUS_RDMA_WRITE)
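
The per-provider behavior described above might be sketched as follows.  The threshold values are assumptions chosen only to illustrate the syntax and are not tuning recommendations; the field count check is a hypothetical convenience and not something the library requires.

```shell
# One value per provider, in the same order as I_MPI_DAPL_PROVIDER_LIST.
# The byte counts are illustrative assumptions, not recommended settings.
export I_MPI_DAPL_DIRECT_COPY_THRESHOLD="8192,65536,262144"

# A single value is applied to all providers.
export I_MPI_DAPL_TRANSLATION_CACHE="1"

# Optional sanity check: count the comma-separated fields so that a list
# with more than three entries is caught before the job is launched.
value_count=$(printf '%s\n' "$I_MPI_DAPL_DIRECT_COPY_THRESHOLD" | awk -F',' '{print NF}')
if [ "$value_count" -gt 3 ]; then
    echo "Too many values: at most three providers are supported" >&2
fi
echo "I_MPI_DAPL_DIRECT_COPY_THRESHOLD has ${value_count} value(s)"
```

Because an invalid value causes all providers to fall back to their defaults, checking the list before launching can save a confusing performance investigation later.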

 
