Developer Guide

Getting Started with Intel Optimized HPCG

To start working with the benchmark, follow these steps:
  1. On a cluster file system, unpack the Intel Optimized HPCG package to a directory accessible by all nodes. Read and accept the license as indicated in the readme.txt file included in the package.
  2. Change the directory to hpcg/bin.
  3. Determine the prebuilt version of the benchmark that is best for your system, or follow the QUICKSTART instructions to build a version of the benchmark for your MPI implementation.
  4. Ensure that the Intel® oneAPI Math Kernel Library, Intel C/C++ Compiler, and MPI run-time environments have been set properly. You can do this using the scripts vars.sh, compilervars.sh, and mpivars.sh that are included in those distributions.
  5. Run the chosen version of the benchmark.
    • The Intel AVX and Intel AVX2 optimized versions perform best with one MPI process per socket and one OpenMP* thread per core, skipping simultaneous multithreading (SMT) threads. Set the affinity as KMP_AFFINITY=granularity=fine,compact,1,0. For example, for a 128-node cluster with two Intel® Xeon® Processor E5-2697 v4 per node, run the executable as follows:
      #> mpiexec.hydra -n 256 -ppn 2 env OMP_NUM_THREADS=18 KMP_AFFINITY=granularity=fine,compact,1,0 ./bin/xhpcg_avx2 -n192
    • The Intel® Xeon® Phi processor optimized version performs best with four MPI processes per processor and two threads for each processor core, with SMT turned on. Specifically, for a 128-node cluster with one Intel® Xeon® Phi processor 7250 per node, run the executable in this manner:
      #> mpiexec.hydra -n 512 -ppn 4 env OMP_NUM_THREADS=34 KMP_AFFINITY=granularity=fine,compact,1,0 ./bin/xhpcg_knl -n160
  6. When the benchmark completes execution, which usually takes a few minutes, find the YAML file with official results in the current directory. The performance rating of the benchmarked system is in the last section of the file:
    HPCG result is VALID with a GFLOP/s rating of: [GFLOP/s]
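
The process and thread counts in step 5 follow from simple arithmetic on the cluster layout, and the rating in step 6 can be pulled from the results file with a one-line filter. The sketch below illustrates both. The function names hpcg_launch_params and hpcg_rating are hypothetical helpers, not part of the Intel Optimized HPCG package, and the results file name is passed in as an argument because it is not specified here. The core counts used in the examples (18 cores per Intel Xeon Processor E5-2697 v4, 68 cores per Intel Xeon Phi processor 7250) are assumptions taken from the processor specifications.

```shell
#!/bin/sh
# Illustrative helpers only; these names are not part of the HPCG package.

# hpcg_launch_params NODES RANKS_PER_NODE CORES_PER_NODE THREADS_PER_CORE
# Prints "<total MPI ranks> <ranks per node> <OpenMP threads per rank>",
# i.e. the values for mpiexec.hydra -n, -ppn, and OMP_NUM_THREADS.
hpcg_launch_params() {
    nodes=$1
    ppn=$2
    cores=$3
    tpc=$4
    total=$((nodes * ppn))            # value for -n
    threads=$((cores * tpc / ppn))    # value for OMP_NUM_THREADS
    echo "$total $ppn $threads"
}

# hpcg_rating FILE
# Prints whatever follows "GFLOP/s rating of:" on the last matching line
# of the given results file.
hpcg_rating() {
    sed -n 's/.*GFLOP\/s rating of:[[:space:]]*//p' "$1" | tail -n 1
}

# 128 nodes, two 18-core sockets per node, one rank per socket, no SMT threads.
hpcg_launch_params 128 2 36 1    # prints: 256 2 18

# 128 nodes, one 68-core processor per node, four ranks per processor,
# two threads per core.
hpcg_launch_params 128 4 68 2    # prints: 512 4 34
```

The three printed numbers map onto the -n, -ppn, and OMP_NUM_THREADS values of the mpiexec.hydra command line. To read the rating after a run, call hpcg_rating with the path of the YAML results file.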
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.
Notice revision #20201201