Build Faster, Scalable High-Performance Computing Code

Master the performance challenges of communication and computation in high-performance applications.

OpenFabrics Interfaces (OFI) Support

This framework is among the most optimized and popular tools for exporting communication services to high-performance computing (HPC) applications. Its key components include APIs, provider libraries, kernel services, daemons, and test applications.

Intel® MPI Library uses OFI to handle all communications, enabling a more streamlined path from application code down to the data communication layer. Tuning for the underlying fabric can now happen at run time through simple environment settings, including network-level features like multirail for increased bandwidth. This support also helps developers deliver optimal performance on extreme-scale systems based on Mellanox InfiniBand* and Intel® Omni-Path Architecture.
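For a concrete sense of the runtime tuning path, here is a minimal sketch (our own illustration, not Intel reference code). FI_PROVIDER is libfabric's standard provider-selection variable and I_MPI_DEBUG is Intel MPI Library's verbosity control; the build and launch lines in the comment are assumptions about a typical Intel MPI installation.

/* ofi_hello.c: minimal MPI check that ranks come up over the intended fabric.
 *
 * Hypothetical build and launch (assumes an Intel MPI environment):
 *   mpicc ofi_hello.c -o ofi_hello
 *   FI_PROVIDER=psm2 I_MPI_DEBUG=5 mpirun -n 2 ./ofi_hello
 *
 * FI_PROVIDER selects the libfabric provider at run time (psm2 targets
 * Intel Omni-Path); I_MPI_DEBUG=5 makes the library report the provider
 * it actually chose at startup.
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    printf("rank %d of %d running on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}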

The result: increased communication throughput, reduced latency, simplified program design, and a common communication infrastructure.


See below for further notes and disclaimers.¹

Scalability

Implementing the high-performance MPI 3.1 standard on multiple fabrics, the library lets you quickly deliver maximum application performance (even if you change or upgrade to new interconnects) without requiring major modifications to the software or operating systems.

  • Scaling verified up to 262,000 processes
  • Thread safety allows you to trace hybrid multithreaded MPI applications for optimal performance on multicore and many-core Intel® architectures
  • Support for multi-endpoint communications lets an application efficiently split data communication among threads, maximizing interconnect utilization (see the threaded sketch after this list)
  • Improved startup scalability through the mpiexec.hydra process manager (Hydra is a process management system for starting parallel jobs; it is designed to work natively with multiple launchers and resource managers, such as ssh, rsh, PBS, Slurm, and SGE)
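The multi-endpoint item above can be made concrete with a small hybrid sketch. The code below uses only the standard MPI threading API (MPI_THREAD_MULTIPLE) plus OpenMP; Intel's multi-endpoint mode additionally involves library-specific controls (for example, the documented I_MPI_THREAD_SPLIT setting), which this sketch does not exercise. Each thread drives its own message stream, distinguished by tag:

/* thread_split.c: hybrid MPI + OpenMP sketch in which each thread exchanges
 * its own messages, so communication can proceed in parallel across threads.
 *
 * Hypothetical build and launch (requires an even number of ranks):
 *   mpicc -fopenmp thread_split.c -o thread_split
 *   mpirun -n 2 ./thread_split
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, size;

    /* Request full thread safety so every thread may call MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size % 2) {
        if (rank == 0) fprintf(stderr, "run with an even number of ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    int peer = rank ^ 1;  /* pair ranks 0<->1, 2<->3, ... */

    #pragma omp parallel num_threads(4)
    {
        int tag = omp_get_thread_num();  /* one tag per thread: independent streams */
        double out = rank + 0.1 * tag, in = 0.0;
        MPI_Sendrecv(&out, 1, MPI_DOUBLE, peer, tag,
                     &in,  1, MPI_DOUBLE, peer, tag,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank %d thread %d exchanged with rank %d\n", rank, tag, peer);
    }

    MPI_Finalize();
    return 0;
}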

Performance & Tuning Utilities

Two additional utilities help you achieve top performance from your applications.

Intel® MPI Benchmarks

This utility performs a set of MPI performance measurements for point-to-point and global communication operations across a range of message sizes. Run all of the supported benchmarks, or specify a single executable on the command line to get results for a particular subset.

The generated benchmark data fully characterizes:

  • Performance of a cluster system, including node performance, network latency, and throughput
  • Efficiency of the MPI implementation
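As an illustration of what a point-to-point measurement involves, the following is a simplified ping-pong loop in the spirit of the suite's PingPong benchmark. It is a sketch only: the shipped IMB executables (such as IMB-MPI1) add warm-up phases, careful repetition control, and cache management that this toy omits.

/* pingpong.c: simplified PingPong-style latency/bandwidth sweep.
 * Run with exactly two active ranks, e.g.: mpirun -n 2 ./pingpong
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;
    const int reps = 1000;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Sweep message sizes from 1 byte to 1 MiB, doubling each step. */
    for (int bytes = 1; bytes <= 1 << 20; bytes <<= 1) {
        char *buf = calloc(bytes, 1);
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        /* Half the round-trip time approximates the one-way message time. */
        double one_way = (MPI_Wtime() - t0) / (2.0 * reps);
        if (rank == 0)
            printf("%8d bytes  %9.2f us  %8.2f MB/s\n",
                   bytes, one_way * 1e6, bytes / one_way / 1e6);
        free(buf);
    }

    MPI_Finalize();
    return 0;
}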

User Guide

mpitune

Sometimes the library’s default parameters don’t deliver the highest performance for a given cluster or application. When that happens, use mpitune to adjust cluster- or application-specific parameters, iterating until you find the combination that performs best.
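One concrete knob of the kind mpitune searches over is collective algorithm selection. As a hedged illustration: the I_MPI_ADJUST_* variables are documented Intel MPI controls for picking collective algorithms, but the timing harness below is our own sketch, not part of mpitune. Rerun it under different settings and keep whichever is fastest.

/* allreduce_probe.c: time MPI_Allreduce, then rerun under different settings,
 * e.g. (hypothetical launch lines, assuming an Intel MPI environment):
 *   I_MPI_ADJUST_ALLREDUCE=1 mpirun -n 64 ./allreduce_probe
 *   I_MPI_ADJUST_ALLREDUCE=2 mpirun -n 64 ./allreduce_probe
 * and compare the reported times: the kind of search mpitune automates.
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    enum { N = 1 << 16, REPS = 200 };
    static double in[N], out[N];
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < N; i++) in[i] = rank + i;

    /* Synchronize, then average over many repetitions for a stable figure. */
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int r = 0; r < REPS; r++)
        MPI_Allreduce(in, out, N, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double avg = (MPI_Wtime() - t0) / REPS;

    if (rank == 0)
        printf("avg MPI_Allreduce time: %.3f ms for %d doubles\n", avg * 1e3, N);

    MPI_Finalize();
    return 0;
}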


User Guides: Windows* | Linux*

Product and Performance Information

1. Performance results are based on testing as of November 8, 2019, and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.

2. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information, see Performance Benchmark Test Disclosure.

3. Testing by Intel as of November 8, 2019. Intel® Xeon® Platinum 8260L processor at 2.40 GHz, 196 GB RAM. Intel® Hyper-Threading Technology is supported but not enabled. Intel® Turbo Boost Technology is enabled. Intel® Omni-Path Host Fabric Interface, Silicon 100 series. IFS: 10.10.0.0.445_1062.4.1_2.10.8. Software: Intel® C and Intel® C++ Compilers 19.0.5, Intel® Math Kernel Library 2016 Update 1. Linux*: CentOS* Linux release 7.7.1908 (Core), kernel 3.10.0-1062.4.1.el7.crt1.x86_64. Compiler flags: '-O3 -no-prec-div -xCORE-AVX512'. Application: SPEC MPI* 2007 v2.0.

4. Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804