Performance comparison between Intel TBB task_list, OpenMP task and parallel for

I am planning to parallelize a hotspot in a project, and I would like your opinion on how parallel for, omp single followed by task, and Intel TBB task_list compare in performance. I am interested in two cases: the ideal case where the number of threads equals the number of computation items, and the case where the number of computation items is much greater than the number of available threads, which exposes scheduling overhead (in order to find the most efficient scheduler). I will also write some sample test programs to evaluate this myself, but I wanted to know whether anybody has already made these evaluations.

Thanks in advance.
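
A minimal sketch of the kind of test program described above, covering only the two OpenMP variants (parallel for, and single followed by task). The `work` function, the helper names, and the item counts are hypothetical placeholders, not taken from the original project, and the Intel TBB task_list variant is left out to keep the sketch dependency-free. Timing uses std::chrono so the code also builds, and runs serially, when OpenMP is not enabled.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical work item: a fixed amount of arithmetic per index.
static double work(int i) {
    double x = 0.0;
    for (int k = 1; k <= 1000; ++k) x += static_cast<double>(i % 7 + 1) / k;
    return x;
}

// Variant 1: worksharing loop; the runtime hands out iterations dynamically.
static double run_parallel_for(std::vector<double>& out) {
    const int n = static_cast<int>(out.size());
    double* data = out.data();
    const auto t0 = std::chrono::steady_clock::now();
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; ++i) data[i] = work(i);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

// Variant 2: one thread creates a task per item; the whole team executes them.
static double run_single_task(std::vector<double>& out) {
    const int n = static_cast<int>(out.size());
    double* data = out.data();
    const auto t0 = std::chrono::steady_clock::now();
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (int i = 0; i < n; ++i) {
                #pragma omp task firstprivate(i) shared(data)
                data[i] = work(i);
            }
        }
        // All tasks are complete at the barrier that ends the parallel region.
    }
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

struct Timings {
    double parallel_for_s;  // seconds spent in the parallel for variant
    double single_task_s;   // seconds spent in the single + task variant
};

// Runs both variants on n items and checks that they computed the same values.
static Timings compare_schedulers(int n) {
    std::vector<double> a(n), b(n);
    Timings t{run_parallel_for(a), run_single_task(b)};
    assert(a == b);
    return t;
}
```

Calling `compare_schedulers` once with n equal to the thread count and once with n much larger than the thread count would cover the two cases in the question. A single run like this is noisy, so repeating each measurement and varying the cost of `work` would be needed before drawing conclusions about scheduling overhead.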

Further information about different barrier algorithms


I'm researching barrier algorithms that use SIMD instructions, and I'm trying to understand in depth the different versions included in the RTL.

I've noticed that there is a new barrier algorithm (hierarchical) since the last time I had a look.

Where could I find a more detailed description of them? Could someone from Intel provide me with further information?


Thank you in advance.

Kind regards.

An interesting and serious topic

Hello there:

I have found an interesting phenomenon which I cannot explain. OK, let's go.

I use "micsmc" to monitor the offload running state of the MIC. The critical code looks like this:

#pragma offload target(mic:0) inout(XXXX) in(XXXX)
#pragma omp parallel for schedule(dynamic)
for (int i = 0; i < num_cluster; i++) // num_cluster ranges from 60 to 300, concentrated in 90~150
{
  // do something....
}


Then I set the environment variables:

export KMP_AFFINITY=compact

can't start because libiomp5md.dll is missing from your computer

Hi all:

I ran into a problem using XE2015 with Visual Studio 2010 on 64-bit Windows 7.

The program compiles successfully, but when I run it in a CMD command window, I get the following error:


Can anyone tell me how to fix this? I am new to this, so please let me know the detailed procedure to solve this problem. Thanks in advance!

Introduction to OpenMP* on YouTube

Tim Mattson (Intel) has authored an extensive series of excellent videos as an introduction to OpenMP*. Not only does he walk through a series of programming exercises in C, he also starts with a background introduction to parallel programming.

Check out the series:

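OpenMP* WORKSHARE now parallelizes with Intel® Fortran Compiler 15.0

Intel® Fortran Compiler 15.0 can now generate multithreaded code for certain instances of OpenMP WORKSHARE and PARALLEL WORKSHARE constructs that contain array assignments. Previously, these were implemented using the OpenMP SINGLE construct, which meant that only single-threaded code was generated.


Statements inside the block of an OMP WORKSHARE construct do not always generate multithreaded code. Some statements are parallelized; others are not, and are executed sequentially inside an OMP SINGLE construct to keep the WORKSHARE semantics correct.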
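
WORKSHARE exists only in Fortran, so the following is a rough C++ analogy rather than what the Intel Fortran compiler actually generates. It sketches the difference between the two ways an array assignment such as a = b + c could be executed inside a parallel region: by a single thread, or divided among the threads of the team. The function names are hypothetical.

```cpp
#include <vector>

// Analogy for the SINGLE-style lowering: one thread performs the whole
// array assignment while the rest of the team waits at the implicit barrier.
void assign_single(std::vector<double>& a,
                   const std::vector<double>& b,
                   const std::vector<double>& c) {
    const long n = static_cast<long>(a.size());
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (long i = 0; i < n; ++i) a[i] = b[i] + c[i];
        }
    }
}

// Analogy for the multithreaded lowering: the elements of the assignment
// are divided among the threads of the team.
void assign_workshared(std::vector<double>& a,
                       const std::vector<double>& b,
                       const std::vector<double>& c) {
    const long n = static_cast<long>(a.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) a[i] = b[i] + c[i];
}
```

Both functions produce the same array; only the second one spreads the element-wise work over several threads.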




  • Developers
  • Professors
  • Apple OS X*
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8
  • Server
  • Fortran
  • Advanced
  • OpenMP*
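
Programming and tuning practices for the OpenMP* library on the Intel® MIC many-core architecture

This article mainly introduces techniques for running and optimizing OpenMP* multithreaded programs on the Intel® MIC many-core architecture, with a detailed discussion centered on the two runtime execution environments, offload and native.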





  • C/C++
  • OpenMP*
  • Intel® Many Integrated Core Architecture