Intel® HPC Developer Conference 2017

Parallel Programming

I/O Performance at Scale and Storage Class Memory

I/O is a key part of all applications, but is often neglected when performance optimization is considered. Also, as I/O is a shared resource on computational systems, the performance can vary significantly for the same application. New memory technology that can be used for I/O is on the horizon.

Adrian Jackson, EPCC, The University of Edinburgh

Presentation (PDF)

The OpenCL™ Platform in Scientific High-Performance Computing
(32 min)

Unlike other parallel programming models, the OpenCL™ platform provides practical portability across a wide range of architectures, with the added perks of runtime kernel compilation and a library-only design. We show an interdisciplinary OpenCL platform-based workflow and evaluate the OpenCL platform and its implementations from an HPC perspective.

Matthias Noack, Zuse Institute Berlin

Presentation (PDF)

Accelerate Cryo-Electron Microscopy Reconstruction

Accelerate Cryo-Electron Microscopy Reconstruction in RELION with x86 SIMD Instructions
(27 min)

Structural biology is going through a revolution where cryo-electron microscopy (cryo-EM) now determines 3D structures from thousands of noisy images, but it relies on very large computations. This talk presents our work with Intel to accelerate the RELION program with x86 SIMD, Intel® Threading Building Blocks, and Intel® Math Kernel Library to provide outstanding performance.

Erik Lindahl, Stockholm University
Charles Congdon, Intel

Presentation (PDF)

RckT: Scalable, Physically Accurate Spectral Rendering in OSPRay
(27 min)

We discuss RckT, a scalable, physically accurate, spectral rendering system that builds on OSPRay (a high-fidelity visualization framework). RckT is an extensible framework used to implement scalable ray-based rendering techniques for high-performance visual analysis tools across several domains.

Christiaan Gribble, SURVICE Engineering

Presentation (PDF)

Energy-Efficient, Scalable Computing of Extremely Large Electronic Structures
(20 min)

Covering technical details of development and parallelization schemes, this session introduces a practical application that shows the strength of manycore computing with Intel® Xeon Phi™ processors in speed and energy efficiency through a solid comparison to NVIDIA* P100 GPGPU devices.

Hoon Ryu, Intel, Korea Institute of Science and Technology Information

Presentation (PDF)

SWIFT: Task-Based Calculation Plus Task-Based MPI Plus Task-Based I/O Equals Maximum Performance
(26 min)

Traditional large HPC simulation codes rely on MPI or MPI plus OpenMP for their parallelization over clusters of more than 100,000 cores. A task-based parallelism strategy is used in SPH With Interdependent Fine-Grained Tasking, or SWIFT. This open-source cosmological code makes use of vectorization, dynamic scheduling, task-based I/O, and more.

Matthieu Schaller, Institute for Computational Cosmology (ICC), Durham University

Presentation (PDF)

Interactive In-Situ Visualization of LAMMPS Simulations with OSPRay

In this technical lecture, we describe our work instrumenting LAMMPS for interactive in-situ visualization with SENSEI and OSPRay. We show results of our implementation on supercomputers based on the Intel® Xeon Phi™ architecture, such as Stampede 2 at Texas Advanced Computing Center (TACC) and Theta at Argonne National Laboratory (ANL).

Will Usher and Valerio Pascucci, Scientific Computing and Imaging Institute, University of Utah
Silvio Rizzi and Joseph Insley, Argonne National Laboratory

Presentation (Interactive PPT)

High-Fidelity Rendering of Bioscience Data

High-Fidelity Rendering of Bioscience Data Using OSPRay and VR
(19 min)

Depth perception cues are critical for understanding biological structure and function. We present scientific visualizations of bioscience data (cancer biophysics, molecular diffusion, and medical imaging) that have been rendered using OSPRay and VR to gain insights.

Ayat Mohammed, Anne Bowen, and Hadley Vaughn, Texas Advanced Computing Center (TACC)

Presentation (PDF)

Challenges and Opportunities in Using Software-Defined Visualization in MegaMol

We detail the evolution of our previously GPU-centric visualization framework, MegaMol, to take advantage of software-defined visualization and make it ready for in-situ visualization. We show where visualization is a valuable addition to the scientific workflow and talk about interfacing simulation code and MegaMol.

Tobias Rau, VISUS University of Stuttgart

Presentation (Interactive PPT)

Comparison and Analysis of Parallel Tasking Performance

Comparison and Analysis of Parallel Tasking Performance for an Irregular Application
(25 min)

We share a case study on parallel tasking frameworks, comparing their performance using a Fast Multipole Method mini-app, which is used for a parallel task model of computation. The mini-app is ported to Intel® TBB, Intel® Cilk™ Plus, OpenMP 4.0, and OmpSs. See how different methods of synchronization can affect the performance.

Patrick Atkinson and Simon McIntosh-Smith, University of Bristol

Presentation (PDF)

Rendering in Blender* Cycles

Rendering in Blender* Cycles Using Intel® Xeon Phi™ Processors
(35 min)

Learn about basic principles of rendering, Blender* software, and its rendering engine, Cycles. See how we ported this code to Intel® Xeon Phi™ processors and how hybrid parallelization combining OpenMP and MPI was used to utilize multiple nodes for rendering.

Lubomir Riha, IT4Innovations, VSB – Technical University of Ostrava

Presentation (PDF)