The Scalable Heterogeneous Computing Benchmark Suite (SHOC) for Intel® Xeon Phi™

The Scalable Heterogeneous Computing Benchmark Suite (SHOC, https://github.com/vetter/shoc-mic#readme) can be used to measure the performance and stability of coprocessor-based systems. The benchmark has been ported to Intel® Xeon Phi™ coprocessors using the offload programming constructs implemented in the Intel® Compiler, which is available as part of the Intel® Composer XE 2013 package.

You can get more information about the benchmark from https://github.com/vetter/shoc-mic#readme.

Benchmark Download

The Intel Xeon Phi-specific modifications to the benchmark can be downloaded from the git repository https://github.com/vetter/shoc-mic

 

The SHOC benchmark for Intel Xeon Phi consists of the following components. All of these benchmarks, where relevant, report performance numbers both with and without the data transfer overhead.

Level 0 Benchmarks: Measure the 'feeds and speeds' of the coprocessor hardware.

  1. BusSpeedDownload and BusSpeedReadback: These benchmarks measure the data transfer speed between the host and the Intel Xeon Phi coprocessor for various data sizes.
  2. DeviceMemory: Measures the read/write speed to GDDR5 memory from the coprocessor cores.
  3. MaxFlops: Measures the maximum floating-point computation rate for single-precision and double-precision arithmetic. Note: In this version, the compiler optimizes out part of the code, so some of the reported results are incorrect. Other results, such as the MADD8-SP and MADD8-DP numbers, are reliable and give a feel for pure computational speed.
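
To make the Level 0 metrics concrete, the following is a minimal sketch of how a transfer rate and a compute rate can be derived from raw timings, including the "with and without data transfer overhead" distinction. It is not the SHOC implementation: the function names are hypothetical, and a plain host-memory copy stands in for a host-to-coprocessor transfer.

```python
import time


def bandwidth_gb_per_s(num_bytes, seconds):
    """Convert a byte count and an elapsed time into GB/s (1 GB = 1e9 bytes)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / seconds / 1e9


def effective_gflops(flop_count, kernel_seconds, transfer_seconds=0.0):
    """Rate in GFLOP/s; pass transfer_seconds to include data transfer overhead."""
    return flop_count / (kernel_seconds + transfer_seconds) / 1e9


def time_host_copy(num_bytes, copies=10):
    """Time several host-memory copies of a buffer and return seconds per copy.

    Assumption: this only stands in for a real host-to-coprocessor transfer.
    """
    src = bytearray(num_bytes)
    start = time.perf_counter()
    for _ in range(copies):
        dst = bytes(src)  # forces a full copy of the buffer
    elapsed = time.perf_counter() - start
    return elapsed / copies


if __name__ == "__main__":
    # Sweep a few buffer sizes, as the bus speed benchmarks do for transfers.
    for size_kib in (64, 1024, 16384):
        num_bytes = size_kib * 1024
        seconds = time_host_copy(num_bytes)
        rate = bandwidth_gb_per_s(num_bytes, seconds)
        print(f"{size_kib:6d} KiB: {rate:.2f} GB/s")
```

Sweeping several buffer sizes matters because small transfers tend to be dominated by fixed latency, while large transfers approach the sustained bandwidth.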

Level 1 Benchmarks: Measure device performance for low-level compute tasks.

  1. GEMM: Measures the performance of the general matrix-matrix multiply operation on single-precision and double-precision numbers using the Intel® Math Kernel Library on Intel Xeon Phi.
  2. FFT: Measures the performance of fast Fourier transform computations.
  3. MD: Measures the performance of the Lennard-Jones potential computations used in molecular dynamics.
  4. Reduction: Measures the performance of the sum reduction operation on floating-point numbers.
  5. Scan: Measures the performance of the parallel prefix sum of floating-point numbers.
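
For readers unfamiliar with these kernels, here is a small serial reference sketch of what the GEMM, MD, Reduction, and Scan entries compute. It is an illustration only, not the SHOC code: the function names are hypothetical, the GEMM rate uses the conventional 2·m·n·k operation count, the scan is shown in its inclusive form (the text above does not say whether the benchmark uses an inclusive or exclusive scan), and the Lennard-Jones function is the standard 12-6 pair potential in reduced units.

```python
from itertools import accumulate


def gemm_gflops(m, n, k, seconds):
    """GFLOP/s for C = A*B with A (m x k) and B (k x n), counting 2*m*n*k operations."""
    return 2.0 * m * n * k / seconds / 1e9


def sum_reduction(values):
    """Serial sum reduction: combine all elements into a single total."""
    total = 0.0
    for v in values:
        total += v
    return total


def inclusive_scan(values):
    """Inclusive prefix sum: output[i] is the sum of values[0..i]."""
    return list(accumulate(values))


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair potential: 4*eps*((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    print("reduction:", sum_reduction(data))
    print("scan:     ", inclusive_scan(data))
    print("LJ(1.5):  ", lennard_jones(1.5))
    print("GEMM rate:", gemm_gflops(1000, 1000, 1000, 2.0), "GFLOP/s")
```

A serial reference like this is also a convenient correctness check for a parallel version, since the last element of the inclusive scan must equal the reduction result up to floating-point rounding.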

Level 2 Benchmarks: Measure the performance of real application kernels.

  1. S3D: The S3D application is used to simulate combustion processes. This benchmark measures the performance of the 'getrates' kernel, which computes the rates of chemical reactions across a regular 3D grid.
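
The access pattern of such a kernel, an independent rate evaluation at every grid point, can be sketched as follows. This is an illustration under stated assumptions, not the 'getrates' code: the text above does not describe the rate expressions, so a single reaction with a textbook modified Arrhenius rate law is assumed here, and all names and parameter values are hypothetical.

```python
import math

R_GAS = 8.314462618  # molar gas constant in J/(mol*K)


def arrhenius_rate(temperature_k, pre_exp, temp_exponent, activation_energy):
    """Modified Arrhenius rate coefficient: A * T^b * exp(-Ea / (R*T)).

    Assumption: used only as a stand-in for a real reaction mechanism.
    """
    return (pre_exp * temperature_k ** temp_exponent
            * math.exp(-activation_energy / (R_GAS * temperature_k)))


def rates_on_grid(temperature_grid, pre_exp, temp_exponent, activation_energy):
    """Evaluate the rate at every point of a nested-list 3D temperature grid."""
    return [[[arrhenius_rate(t, pre_exp, temp_exponent, activation_energy)
              for t in row]
             for row in plane]
            for plane in temperature_grid]


if __name__ == "__main__":
    # Hypothetical 2 x 2 x 2 grid of temperatures in kelvin.
    grid = [[[800.0, 900.0], [1000.0, 1100.0]],
            [[1200.0, 1300.0], [1400.0, 1500.0]]]
    for plane in rates_on_grid(grid, pre_exp=1.0e6, temp_exponent=0.5,
                               activation_energy=8.0e4):
        print(plane)
```

Because each grid point is computed independently of its neighbors, this kind of kernel maps naturally onto many cores and wide vector units.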

 

For details on compiler optimization, see our Optimization Notice.