openmp

    miniGhost on Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors

    Purpose

    This article provides code access, build, and run instructions for the miniGhost code running on Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors.

    Introduction

    miniGhost is a finite difference mini-application that implements a difference stencil across a homogeneous three-dimensional domain.

    It contains the following kernels (a sketch of the stencil step follows the list):
    - computation of the selected stencil option,
    - inter-process boundary (halo, ghost) exchange,
    - global summation of grid values.
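
    To illustrate the first kernel, here is a minimal C++/OpenMP sketch of one sweep of a 3D 7-point averaging stencil over the interior of a grid whose outermost layer holds the ghost (halo) values. The grid layout, the averaging scheme, and the function name are illustrative assumptions for this article, not miniGhost's actual implementation.

        #include <cstddef>
        #include <vector>

        // Minimal sketch (not miniGhost's code): one sweep of a 3D 7-point
        // averaging stencil over the interior of an (nx+2) x (ny+2) x (nz+2)
        // grid whose outermost layer holds the ghost (halo) values.
        void stencil_3d7pt(const std::vector<double>& in, std::vector<double>& out,
                           std::size_t nx, std::size_t ny, std::size_t nz) {
            auto idx = [=](std::size_t i, std::size_t j, std::size_t k) {
                return (k * (ny + 2) + j) * (nx + 2) + i;  // row-major flattening
            };
            #pragma omp parallel for collapse(2)  // thread across the two outer loops
            for (std::size_t k = 1; k <= nz; ++k)
                for (std::size_t j = 1; j <= ny; ++j)
                    for (std::size_t i = 1; i <= nx; ++i)
                        out[idx(i, j, k)] = (in[idx(i, j, k)]
                            + in[idx(i - 1, j, k)] + in[idx(i + 1, j, k)]
                            + in[idx(i, j - 1, k)] + in[idx(i, j + 1, k)]
                            + in[idx(i, j, k - 1)] + in[idx(i, j, k + 1)]) / 7.0;
        }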

    miniGhost is intended primarily for studying the performance characteristics of the BSPMA configuration in computing environments; computations of this kind are widely used across many different scientific algorithms.

    In the BSPMA (bulk synchronous parallel with message aggregation) model, the face data for each variable is aggregated into user-managed buffers. The buffers are then transmitted to (up to) six neighboring processes, and the selected stencil computation is applied to each variable. An alternative model is SVAF (single variable, aggregated faces), but this article discusses only the BSPMA model.

    miniGhost serves as a proxy (or miniapp) for the CTH shock physics code from Sandia.
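
    To make the exchange step concrete, here is a minimal MPI sketch of the BSPMA idea: per-neighbor buffers that already hold the aggregated face data for all variables are exchanged in one non-blocking send/receive pair per neighbor, and the step completes with a bulk-synchronous wait. The neighbor ordering, message tag, and buffer layout are assumptions for illustration, not miniGhost's own code.

        #include <mpi.h>
        #include <vector>

        // Minimal sketch (not miniGhost's code): exchange one aggregated
        // face buffer with each of the (up to) six neighbors. Missing
        // neighbors are expected to be MPI_PROC_NULL.
        void exchange_faces(std::vector<std::vector<double>>& send_buf,
                            std::vector<std::vector<double>>& recv_buf,
                            const int neighbor[6],  // -X,+X,-Y,+Y,-Z,+Z
                            MPI_Comm comm) {
            std::vector<MPI_Request> reqs;
            for (int d = 0; d < 6; ++d) {
                if (neighbor[d] == MPI_PROC_NULL) continue;
                MPI_Request r;
                MPI_Irecv(recv_buf[d].data(), static_cast<int>(recv_buf[d].size()),
                          MPI_DOUBLE, neighbor[d], 0, comm, &r);
                reqs.push_back(r);
                MPI_Isend(send_buf[d].data(), static_cast<int>(send_buf[d].size()),
                          MPI_DOUBLE, neighbor[d], 0, comm, &r);
                reqs.push_back(r);
            }
            // Bulk-synchronous step: all aggregated messages must complete
            // before the stencil is applied to each variable.
            MPI_Waitall(static_cast<int>(reqs.size()), reqs.data(),
                        MPI_STATUSES_IGNORE);
        }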

  • Linux*
  • Server
  • C/C++
  • Intel® Xeon® processors
  • Intel® Xeon Phi™ Coprocessors
  • miniGhost
  • BSPMA
  • MPI
  • openmp
  • Intel® Many Integrated Core Architecture
    Hybrid MPI and OpenMP* Model

    In the High Performance Computing (HPC) area, parallel programming techniques such as MPI, OpenMP*, one-sided communications, shmem, and Fortran coarrays are widely used. This blog is part of a series that introduces the use of these techniques, especially on the Intel® Xeon Phi™ coprocessor. This first blog discusses the main usage of the hybrid MPI/OpenMP model.
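
    As a taste of the model, here is a minimal hybrid sketch, assuming one MPI rank per node (or per coprocessor) with OpenMP threads inside each rank; the build flags and launch commands depend on your MPI implementation and compiler.

        #include <mpi.h>
        #include <omp.h>
        #include <cstdio>

        // Minimal hybrid MPI/OpenMP sketch: MPI between ranks, OpenMP
        // threads within a rank. MPI_THREAD_FUNNELED means only the
        // master thread of each rank will make MPI calls.
        int main(int argc, char* argv[]) {
            int provided = 0, rank = 0;
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            #pragma omp parallel
            {
                std::printf("rank %d: thread %d of %d\n",
                            rank, omp_get_thread_num(), omp_get_num_threads());
            }
            MPI_Finalize();
            return 0;
        }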

    A Parallel Stable Sort Using C++11 for TBB, Cilk Plus, and OpenMP

    This article describes a parallel merge sort code and explains why it is more scalable than parallel quicksort or parallel samplesort. The code relies on C++11 move semantics. It also points out a scalability trap to watch out for in C++. The attached code has implementations in Intel® Threading Building Blocks (Intel® TBB), Intel® Cilk™ Plus, and OpenMP*.
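
    Below is a minimal sketch of the idea in the OpenMP variant, assuming C++11: a task-parallel, stable merge sort that moves elements through a scratch buffer instead of copying them. The cutoff, buffer handling, and function names are illustrative; the article's attached code is more elaborate (for example, it also parallelizes the merge itself).

        #include <algorithm>
        #include <cstddef>
        #include <iterator>
        #include <vector>

        // Minimal sketch: stable merge sort with OpenMP tasks. Elements are
        // moved (C++11 move semantics) through a scratch buffer, not copied.
        template <typename T>
        void merge_sort(T* first, T* last, T* buf) {
            std::ptrdiff_t n = last - first;
            if (n <= 2048) {              // serial cutoff for small ranges
                std::stable_sort(first, last);
                return;
            }
            T* mid = first + n / 2;
            #pragma omp task              // sort the left half as a task
            merge_sort(first, mid, buf);
            merge_sort(mid, last, buf + n / 2);
            #pragma omp taskwait
            // Stable merge into the scratch buffer, moving rather than copying.
            std::merge(std::make_move_iterator(first), std::make_move_iterator(mid),
                       std::make_move_iterator(mid), std::make_move_iterator(last),
                       buf);
            std::move(buf, buf + n, first);
        }

        template <typename T>
        void parallel_stable_sort(std::vector<T>& v) {
            std::vector<T> buf(v.size());  // scratch buffer (T default-constructible)
            #pragma omp parallel
            #pragma omp single             // one thread spawns the task tree
            merge_sort(v.data(), v.data() + v.size(), buf.data());
        }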

  • Developers
  • Professors
  • Students
  • C/C++
  • Intermediate
  • Intel® Cilk™ Plus
  • Intel® Threading Building Blocks
  • parallel
  • Merge Sort
  • Cilk Plus
  • tbb
  • openmp
  • OpenMP*
  • Parallel Computing

    Compiler Methodology for Intel® MIC Architecture

    Vectorization Essentials

    Overview

    This chapter covers topics in vectorization. Vectorization is a form of data-parallel programming in which the processor performs the same operation simultaneously on N data elements of a vector (a one-dimensional array of scalar data objects such as floating-point numbers, integers, or double-precision floating-point numbers).
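
    For example, here is a minimal sketch of a loop the compiler can turn into such vector operations. The omp simd pragma (OpenMP 4.0) requests vectorization explicitly, though a loop this simple is often auto-vectorized without the hint; the function name is illustrative.

        #include <cstddef>

        // Minimal sketch of data-parallel vectorization: the same
        // multiply-add is applied to N vector elements at a time.
        void saxpy(float a, const float* x, float* y, std::size_t n) {
            #pragma omp simd
            for (std::size_t i = 0; i < n; ++i)
                y[i] = a * x[i] + y[i];   // one scalar operation, many data elements
        }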

  • Developers
  • Linux*
  • C/C++
  • Fortran
  • Advanced
  • Intel® C++ Compiler
  • Intel® Fortran Compiler
  • OpenMP*
  • Auto-vectorization
  • Intel® Xeon Phi™ Coprocessor
  • vectorization
  • compiler methodology
  • MIC
  • Intel® Cilk™ Plus
  • openmp
  • Intel® Many Integrated Core Architecture