NUMA-aware assignment of tasks to Cilk++ workers

kadir.akbudak

I have a NUMA system with one thread per core. Threads that process similar data are assigned to the same node so that they can reuse data in that node's large L3 cache. I want threads assigned to the same node to steal each other's jobs first. Only when all jobs on a node have finished should its threads steal jobs assigned to threads on other nodes. How can I implement this with Cilk++?

Barry Tannenbaum (Intel)

First off, I assume you're using Intel® Cilk™ Plus, not Cilk++, which was a product of Cilk Arts, Inc. Cilk Plus is implemented by the Intel Composer XE C/C++ compiler, a branch of the GCC C/C++ compiler, and a branch of LLVM/Clang.

At this time the Cilk Plus scheduler is totally unaware of NUMA (as was the Cilk++ scheduler before it). The scheduler assumes uniform access to memory. Given reasonably sized tasks, it still works quite well on NUMA systems since steals are rare.

We're researching ways to optimize execution on a NUMA system, but do not have anything ready for release yet, nor can we promise when or if that research will ever be productized. We'd certainly be interested in research by others in the area.

    - Barry

Jim Sukha (Intel)

Moreover, having done some experimentation in this area myself, I would add that the benefits of this kind of locality-aware stealing policy are not always as obvious or immediate as one might think. Stealing locally from the same node also tends to lead to stealing smaller chunks of work, which can increase the overall number of successful steals and the steal overhead.

If you really want to experiment, I'd suggest implementing your own work-stealing scheduler using pthreads. If you can get a performance improvement over ordinary Cilk code on a real application, and that improvement is robust across a variety of platforms, then that would provide some evidence of the benefits of NUMA-aware assignment of tasks.

Cheers,

Jim
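
For anyone trying the hand-rolled experiment suggested above, the node-first stealing order from the original question could be sketched roughly as follows. This is only a minimal illustration, not anything from Cilk Plus or from the posters: `WorkerQueue`, `victim_order`, and `try_steal` are hypothetical names, the per-worker deques are protected by plain mutexes rather than a lock-free protocol, and the `node` field is assumed to be filled in by whatever pinning mechanism you use (for example `pthread_setaffinity_np` on Linux), which is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <random>
#include <utility>
#include <vector>

using Task = std::function<void()>;

// Hypothetical per-worker queue: the owner would push/pop at the back,
// thieves take from the front.
struct WorkerQueue {
    int node = 0;            // NUMA node the worker is pinned to (assumed known)
    std::mutex lock;
    std::deque<Task> tasks;
};

// Victims for worker `self`: same-node workers first, then workers on
// other nodes. Each group is shuffled so thieves do not all hit one victim.
std::vector<std::size_t> victim_order(std::size_t self,
                                      const std::vector<int>& node_of,
                                      std::mt19937& rng) {
    assert(self < node_of.size());
    std::vector<std::size_t> local, remote;
    for (std::size_t w = 0; w < node_of.size(); ++w) {
        if (w == self) continue;
        (node_of[w] == node_of[self] ? local : remote).push_back(w);
    }
    std::shuffle(local.begin(), local.end(), rng);
    std::shuffle(remote.begin(), remote.end(), rng);
    local.insert(local.end(), remote.begin(), remote.end());
    return local;
}

// Try each victim in node-first order; on success move the stolen task
// into `out` and return true. Returns false if every queue was empty.
bool try_steal(std::size_t self, std::vector<WorkerQueue>& queues,
               std::mt19937& rng, Task& out) {
    std::vector<int> node_of;
    node_of.reserve(queues.size());
    for (const WorkerQueue& q : queues) node_of.push_back(q.node);

    for (std::size_t v : victim_order(self, node_of, rng)) {
        std::lock_guard<std::mutex> guard(queues[v].lock);
        if (!queues[v].tasks.empty()) {
            out = std::move(queues[v].tasks.front());
            queues[v].tasks.pop_front();
            return true;
        }
    }
    return false;
}
```

A worker loop would call `try_steal` only after its own deque runs dry. Note that this sketch does nothing about the effect described above, where local steals tend to grab smaller chunks of work, so it would still need to be measured against ordinary Cilk code on a real application.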
