This article explains how to obtain, build, and run the miniGhost code on Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors.
miniGhost is a finite difference mini-application that implements a difference stencil across a homogeneous three-dimensional domain.
The kernels that it contains are:
- computation of stencil options,
- inter-process boundary (halo, ghost) exchange,
- global summation of grid values.
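The first and third kernels can be sketched in a few lines. The code below is only an illustration and is not taken from the miniGhost source: the `Grid` type, the function names, and the choice of a 7-point averaging stencil are all assumptions made for the example. The halo exchange is omitted because it requires MPI; here the ghost layer is simply filled with an initial value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical grid: nx*ny*nz interior cells plus one ghost layer on each side.
struct Grid {
    std::size_t nx, ny, nz;
    std::vector<double> data;

    Grid(std::size_t x, std::size_t y, std::size_t z, double init)
        : nx(x), ny(y), nz(z), data((x + 2) * (y + 2) * (z + 2), init) {}

    // Map (i, j, k) to a flat index; interior cells use indices 1..n.
    std::size_t index(std::size_t i, std::size_t j, std::size_t k) const {
        return (k * (ny + 2) + j) * (nx + 2) + i;
    }
    double& at(std::size_t i, std::size_t j, std::size_t k) {
        return data[index(i, j, k)];
    }
    double at(std::size_t i, std::size_t j, std::size_t k) const {
        return data[index(i, j, k)];
    }
};

// Assumed 7-point stencil: each interior cell becomes the average of
// itself and its six face neighbours. Ghost cells are read, never written.
void stencil7(const Grid& in, Grid& out) {
    assert(in.nx == out.nx && in.ny == out.ny && in.nz == out.nz);
    for (std::size_t k = 1; k <= in.nz; ++k)
        for (std::size_t j = 1; j <= in.ny; ++j)
            for (std::size_t i = 1; i <= in.nx; ++i)
                out.at(i, j, k) =
                    (in.at(i, j, k) +
                     in.at(i - 1, j, k) + in.at(i + 1, j, k) +
                     in.at(i, j - 1, k) + in.at(i, j + 1, k) +
                     in.at(i, j, k - 1) + in.at(i, j, k + 1)) / 7.0;
}

// Local part of the global summation: add up the interior cells only.
double interior_sum(const Grid& g) {
    double sum = 0.0;
    for (std::size_t k = 1; k <= g.nz; ++k)
        for (std::size_t j = 1; j <= g.ny; ++j)
            for (std::size_t i = 1; i <= g.nx; ++i)
                sum += g.at(i, j, k);
    return sum;
}
```

In a distributed run, each process would own one such block, refresh its ghost layer from its neighbours before every stencil sweep, and combine the per-process sums with a reduction.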
In High Performance Computing (HPC), parallel programming techniques such as MPI, OpenMP*, one-sided communications, SHMEM, and Fortran coarrays are widely used. This blog is part of a series that introduces these techniques, with a focus on how to use them on the Intel® Xeon Phi™ coprocessor. This first blog discusses the basic use of the hybrid MPI/OpenMP model.
This article describes a parallel merge sort code and explains why it is more scalable than parallel quicksort or parallel samplesort. The code relies on C++11 “move” semantics. The article also points out a scalability trap to watch out for in C++. The attached code has implementations in Intel® Threading Building Blocks (Intel® TBB), Intel® Cilk™ Plus, and OpenMP*.
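To give a feel for how move semantics fit into a merge sort, here is a minimal sketch using OpenMP tasks. It is not the code attached to the article: the function names, the scratch-buffer approach, and the 4096-element task cutoff are assumptions chosen for illustration, and only the two recursive halves run in parallel (the merge step itself is serial). Built without OpenMP support, for example without `-fopenmp` on GCC or Clang, the pragmas are ignored and the sort runs serially.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Sort v[lo, hi) using tmp (same size as v) as scratch space.
template <typename T>
void merge_sort_range(std::vector<T>& v, std::vector<T>& tmp,
                      std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi <= v.size() && tmp.size() == v.size());
    if (hi - lo < 2) return;
    const std::size_t mid = lo + (hi - lo) / 2;

    // The two halves touch disjoint ranges, so one can run as a task.
    // Small ranges are not deferred (assumed cutoff) to limit overhead.
    #pragma omp task shared(v, tmp) if (hi - lo > 4096)
    merge_sort_range(v, tmp, lo, mid);
    merge_sort_range(v, tmp, mid, hi);
    #pragma omp taskwait

    // Merge by moving elements into the scratch buffer instead of copying.
    auto first = v.begin();
    std::merge(std::make_move_iterator(first + lo),
               std::make_move_iterator(first + mid),
               std::make_move_iterator(first + mid),
               std::make_move_iterator(first + hi),
               tmp.begin() + lo);
    // Move the merged run back into place.
    std::move(tmp.begin() + lo, tmp.begin() + hi, first + lo);
}

// Entry point. T must be default-constructible for the scratch buffer.
template <typename T>
void merge_sort(std::vector<T>& v) {
    std::vector<T> tmp(v.size());
    #pragma omp parallel
    #pragma omp single
    merge_sort_range(v, tmp, 0, v.size());
}
```

For element types that own heap memory, such as strings, moving transfers ownership of the buffer rather than duplicating it, which is where the benefit over copying comes from.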
With Intel® VTune™ Amplifier XE 2013 Update 12 and earlier, it was possible to profile OpenMP applications with parallel regions as described in the article Profiling OpenMP* applications with Intel® VTune™ Amplifier XE by Kirill Rogozhin.
This chapter covers topics in vectorization. Vectorization is a form of data-parallel programming in which the processor performs the same operation simultaneously on N data elements of a vector (a one-dimensional array of scalar data objects such as integers or single- or double-precision floating-point values).
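A loop of the following shape is a typical candidate for vectorization. This is a generic sketch rather than an example from the chapter: the function name `saxpy` and the use of the OpenMP `simd` directive are assumptions made for illustration. The directive is a hint that takes effect only when the compiler's OpenMP SIMD support is enabled; otherwise it is ignored and the compiler decides on its own whether to vectorize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compute y[i] = a * x[i] + y[i] for every element.
// Each iteration is independent of the others, so the compiler can
// process several consecutive elements with one vector instruction.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const float* xp = x.data();
    float* yp = y.data();
    const std::size_t n = x.size();

    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The number of elements handled per instruction depends on the vector register width and the element size. For instance, a 512-bit register holds 16 single-precision or 8 double-precision values.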