Article

Explicit Vector Programming – Best Known Methods

Vectorizing improves performance, and achieving high performance can save power. Introduction to tools for vectorizing compute-intensive processing.
Authored by Last updated on 04/24/2019 - 11:25
Article

A Parallel Stable Sort Using C++11 for TBB, Cilk Plus, and OpenMP

This article describes a parallel merge sort code, and why it is more scalable than parallel quicksort or parallel samplesort. The code relies on the C++11 “move” semantics.

Authored by Last updated on 08/01/2019 - 09:30
Video

Getting Better Performance on Dijkstra’s Shortest Path Graph Algorithm using the Intel® Compiler

We optimized a version of Dijkstra’s shortest path graph algorithm using a combination of Intel® Cilk™ Plus array notation and OpenMP* parallel for.

Authored by Last updated on 03/04/2019 - 13:33
File Wrapper

Parallel Universe Magazine - Issue 18, June 2014

Authored by admin Last updated on 05/16/2019 - 11:39
Article

Efficient Parallelization

This article is part of the Intel® Modern Code Developer Community documentation which supports developers in leveraging application performance in code through a systematic step-by-step optimization framework methodology. This article addresses: Thread level parallelization.
Authored by Ronald W Green (Blackbelt) Last updated on 09/30/2019 - 17:28
Article

Vectorization Essentials

Vectorization essentials to effectively use feature in the Intel® Xeon product family
Authored by admin Last updated on 10/02/2019 - 15:11
Blog post

Slides da palestra sobre Computação Paralela no FISL14

A palestra "Como domar uma fera de 1 TFlop que cabe na palma da sua mão" foi apresentada em 3/7/13, no FISL14, por Luciano Palma - Community Manager da Intel para Servidores e Computação de Alto De

Authored by Luciano Palma (Intel) Last updated on 10/03/2019 - 09:03
Article

Choosing the right threading framework

This is the second article in a series of articles about High Performance Computing with the Intel Xeon Phi.

Authored by Last updated on 10/15/2019 - 16:40
Article

Hybrid Parallelism: A MiniFE* Case Study

This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.
Authored by David M. Last updated on 10/15/2019 - 16:40
Article

Putting Your Data and Code in Order: Data and layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1, to consider parallelism, both vectorization (single instruction multiple data SIMD) as well as shared memory parallelism (threading), and distributed memory computing.
Authored by David M. Last updated on 10/15/2019 - 16:40