Article

A Parallel Stable Sort Using C++11 for TBB, Cilk Plus, and OpenMP

This article describes a parallel merge sort code, and why it is more scalable than parallel quicksort or parallel samplesort. The code relies on the C++11 “move” semantics.

Authored by Last updated on 08/01/2019 - 09:30
Article

Scale-Up Implementation of a Transportation Network Using Ant Colony Optimization (ACO)

In this article an OpenMP* based implementation of the Ant Colony Optimization algorithm was analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016 together with improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks were introduced to achieve efficient scaling across a four-socket Intel® Xeon® processor E7-8890 v4 processor-based system.
Authored by Sunny G. (Intel) Last updated on 07/05/2019 - 19:10
Article

Choosing the right threading framework

This is the second article in a series of articles about High Performance Computing with the Intel Xeon Phi.

Authored by Last updated on 07/06/2019 - 16:30
Blog post

Graduate Intern at Intel - Parallel Ray-Tracing

Ray-tracing is a classic example of an embarrassingly parallel algorithm; since each pixel is typically independent of the rest, theoretically every pixel can be done in parallel (given enough core

Authored by Last updated on 06/14/2017 - 15:37
Blog post

Graduate Intern at Intel - Parallel N-Body

The N-Body problem is a classic example used frequently to demonstrate parallelization and how it improves performance.

Authored by Last updated on 06/14/2017 - 15:46
Blog post

CLRS III: Extension of the Threads

I've got a great wife. For my birthday she got me a copy of the newly updated Introduction to Algorithms, 3rd ed. by Cormen, Leiserson, Rivest, and Stein.

Authored by Clay B. (Blackbelt) Last updated on 02/12/2019 - 13:54
Blog post

Applying Intel® Threading Building Blocks Observers for Thread Affinity on Intel® Xeon Phi™ Coprocessors

In spite of the fact that the Intel® Threading Building Blocks (Intel® TBB) library [1] [2] provides high-level task based parallelism intended to hide sof

Authored by Alex (Intel) Last updated on 08/01/2019 - 09:30
Article

Putting Your Data and Code in Order: Data and layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1, to consider parallelism, both vectorization (single instruction multiple data SIMD) as well as shared memory parallelism (threading), and distributed memory computing.
Authored by David M. Last updated on 07/06/2019 - 16:40
Article

Reciprocal Collision Avoidance and Navigation for Video Games

Collision avoidance and navigation among virtual agents is an important component of modern video games. Recent developments in commodity hardware are allowing large numbers of virtual agents to be incorporated into game levels in increasing numbers
Authored by Adam Lake (Intel) Last updated on 05/03/2019 - 13:29