Developer Guide


Getting Started with Intel Optimized HPCG

To start working with the benchmark:
  1. On a cluster file system, unpack the Intel Optimized HPCG package to a directory accessible by all nodes. Read and accept the license as indicated in the file included in the package.
  2. Change the directory to
  3. Determine the prebuilt version of the benchmark that is best for your system, or follow the instructions to build a version of the benchmark for your MPI implementation.
  4. Ensure that the Intel® oneAPI Math Kernel Library, Intel C/C++ Compiler, and MPI run-time environments have been set properly. You can do this using the environment scripts that are included in those distributions.
  5. Run the chosen version of the benchmark.
    • The Intel AVX and Intel AVX2 optimized versions perform best with one MPI process per socket and one OpenMP* thread per core, skipping simultaneous multithreading (SMT) threads: set the affinity as KMP_AFFINITY=granularity=fine,compact,1,0. Specifically, for a 128-node cluster with two Intel® Xeon® Processor E5-2697 v4 per node, run the executable as follows:
      #> mpiexec.hydra -n 256 -ppn 2 env OMP_NUM_THREADS=18 KMP_AFFINITY=granularity=fine,compact,1,0 ./bin/xhpcg_avx2 -n192
    • The Intel® Xeon® Phi processor optimized version performs best with four MPI processes per processor and two threads for each processor core, with SMT turned on. Specifically, for a 128-node cluster with one Intel® Xeon® Phi processor 7250 per node, run the executable in this manner:
      #> mpiexec.hydra -n 512 -ppn 4 env OMP_NUM_THREADS=34 KMP_AFFINITY=granularity=fine,compact,1,0 ./bin/xhpcg_knl -n160
  6. When the benchmark completes execution, which usually takes a few minutes, find the YAML file with official results in the current directory. The performance rating of the benchmarked system is in the last section of the file:
    HPCG result is VALID with a GFLOP/s rating of: [GFLOP/s]
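The launch parameters in step 5 follow from simple arithmetic on the cluster layout. The sketch below is illustrative only: `layout` is a hypothetical helper, not part of the HPCG package, and the per-node core counts (36 for two Xeon E5-2697 v4 sockets, 68 for one Xeon Phi 7250) are assumptions taken from the processor specifications rather than from this guide.

```shell
#!/bin/sh
# layout NODES RANKS_PER_NODE CORES_PER_NODE THREADS_PER_CORE
# Prints the -n / -ppn / OMP_NUM_THREADS values for one launch.
# "layout" is a hypothetical helper name, not part of the HPCG package.
layout() {
    total_ranks=$(( $1 * $2 ))             # MPI processes across the cluster
    threads_per_rank=$(( $3 * $4 / $2 ))   # OpenMP threads for each MPI process
    echo "-n $total_ranks -ppn $2 OMP_NUM_THREADS=$threads_per_rank"
}

# Assumption: two 18-core Xeon E5-2697 v4 sockets give 36 cores per node;
# one rank per socket, one thread per core (SMT skipped).
avx2_layout=$(layout 128 2 36 1)

# Assumption: one Xeon Phi 7250 gives 68 cores per node;
# four ranks per processor, two threads per core.
knl_layout=$(layout 128 4 68 2)

echo "AVX2: $avx2_layout"
echo "KNL:  $knl_layout"
```

Under these assumptions the first call yields 256 ranks with 18 threads each, and the second yields 512 ranks with 34 threads each, reflecting the four MPI processes per processor described in step 5. For a different cluster, only the four arguments should need to change.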
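To pull the rating out of the results file from step 6 in a script, one possible approach is to match the summary line quoted above. This is a minimal sketch: `hpcg_rating` is a hypothetical function name, the sample file and the value 12.3456 are made up for demonstration, and the exact name of the YAML file written by the benchmark is not assumed here.

```shell
#!/bin/sh
# Print the GFLOP/s value from the "HPCG result is VALID" line of a results file.
# Prints nothing if the file contains no such line.
hpcg_rating() {
    sed -n 's|^.*HPCG result is VALID with a GFLOP/s rating of:[[:space:]]*||p' "$1" | tail -n 1
}

# Demonstration on a made-up sample file; the value is illustrative only.
sample=$(mktemp)
printf 'other_key: 1\n  HPCG result is VALID with a GFLOP/s rating of: 12.3456\n' > "$sample"
rating=$(hpcg_rating "$sample")
echo "rating: $rating"
rm -f "$sample"
```

In practice, pass the path of the YAML file found in the current directory after the run. An empty result suggests the file has no VALID summary line and should be inspected by hand.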
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the SSE2, SSE3, and SSSE3 instruction sets, among other optimizations.
