博客

Graduate Intern at Intel - Parallel Ray-Tracing

Ray-tracing is a classic example of an embarrassingly parallel algorithm; since each pixel is typically independent of the rest, theoretically every pixel can be done in parallel (given enough core

作者: 最后更新时间: 2017/06/14 - 15:37
博客

Graduate Intern at Intel - Parallel N-Body

The N-Body problem is a classic example used frequently to demonstrate parallelization and how it improves performance.

作者: 最后更新时间: 2017/06/14 - 15:46
博客

Parallel Universe Magazine #12: Advanced Vectorization

This blog contains additional content for the article "Advanced Vectorization" from Parallel Universe #12:

作者: 最后更新时间: 2019/07/03 - 20:08
Article

Cilk Plus Solver for a Chess Puzzle or: How I Learned to Love Fast Rejection

Intel® Cilk™ Plus enabled parallelizing a chess puzzle solver with a few changes.
作者: 最后更新时间: 2017/06/07 - 09:12
Article

A DPRNG for Cilk™ Plus?

Continuing my previous post, I describe some of the challenges in implementing DotMix, a determinstic parallel random-number generator (DPRNG) for Intel® Cilk™ Plus.
作者: 最后更新时间: 2019/03/20 - 13:00
Article

Parallel sorts for Cilk Plus

This article describes the parallel sorts in the latest release of “Cilkpub”, an open-source library of utilities for Intel®

作者: 最后更新时间: 2017/06/07 - 10:29
Article

Using Pedigrees in Intel® Cilk™ Plus

Pedigrees are a new feature implemented in Intel Cilk Plus and currently available in Intel® Composer XE 2013. In this post, I explain what pedigrees are, how they work, and how you can use them in Cilk Plus. Pedigrees are a key component used in the implementation of DotMix, a contributed code for a deterministic parallel random-number generator (DPRNG) discussed in my previous post.
作者: 最后更新时间: 2017/10/11 - 11:28
Article

Why is Cilk™ Plus not speeding up my program? (Part 1)

In this article, I discuss some common performance pitfalls in Cilk™ Plus programs that prevent users from seeing speedups in their code, and describe some techniques for avoiding these pitfalls.
作者: 最后更新时间: 2019/02/04 - 10:40
Article

Choosing the right threading framework

This is the second article in a series of articles about High Performance Computing with the Intel Xeon Phi.

作者: 最后更新时间: 2019/07/06 - 16:30
Article

Late-initialization of frame descriptors in Cilk Plus/LLVM

The Intel® Cilk™ Plus C/C++ language extensions support the expression of portable and efficient task and vector parallel programs. Cilk Plus/LLVM is an implementation of these extensions in the Clang frontend for LLVM. In this article we explain one of the optimizations that we have implemented in Cilk Plus/LLVM: late-initialization of frame descriptors[1]. With this explanation, we provide a...
作者: 最后更新时间: 2017/06/07 - 09:11