VTune overflow "collection data limit, MB" while Paused?

It seems that VTune Amplifier XE 2016 Update 1 consumes the sample buffer even while collection is paused.

The message "The specified data limit of 500 MB is reached. Data collection is stopped." appears even before I resume collection and am able to start profiling.

I would like to delay sample collection until my application is fully loaded and initialized, but to do so I have to specify a very large data limit of over 1 GB, which makes later processing extremely slow.
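
One way to delay collection from inside the application is the ITT API that ships with VTune, whose `__itt_pause()` and `__itt_resume()` calls control collection at run time. The sketch below is only an illustration. `USE_ITT`, `profiler_pause`, `profiler_resume`, `g_collecting`, and `CollectionScope` are hypothetical names, and whether this avoids the data-limit problem described above is untested. Without `USE_ITT` the wrappers only record the requested state, so the code builds without VTune installed.

```cpp
#include <cassert>

#ifdef USE_ITT              // hypothetical build flag; define it when linking against libittnotify
#include <ittnotify.h>      // ITT API header shipped with VTune
#endif

// Tracks what this sketch last requested; it does not query VTune itself.
static bool g_collecting = false;

// Ask the collector to start (or continue) gathering data.
inline void profiler_resume() {
#ifdef USE_ITT
    __itt_resume();
#endif
    g_collecting = true;
}

// Ask the collector to stop gathering data until the next resume.
inline void profiler_pause() {
#ifdef USE_ITT
    __itt_pause();
#endif
    g_collecting = false;
}

// RAII helper: collection is requested only while an instance is alive.
class CollectionScope {
public:
    CollectionScope() {
        assert(!g_collecting && "nested scopes are not supported in this sketch");
        profiler_resume();
    }
    ~CollectionScope() { profiler_pause(); }
    CollectionScope(const CollectionScope&) = delete;
    CollectionScope& operator=(const CollectionScope&) = delete;
};
```

With the collection started in paused mode, the region of interest would be wrapped as `{ CollectionScope scope; /* workload */ }` once initialization has finished.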


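The Intel® IoT Developer Kit makes it easy to start adding sensors and actuators to your IoT projects. When adding sensors and actuators to a project, you will perform the following basic steps:

  1. Choose the components to add to your project. Some helpful suggestions include:

  2. Connect the sensor component to your development board. The specific steps depend on the component you are using, such as the Grove* Starter Kit Plus.

  3. Write or modify your project code to interact with the component.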

Blurred Screens During Decoding With The Media SDK

Hi everyone.

I am using the Intel Media SDK to decode a 4K video for a live broadcast application.
The video has a frame rate of 19 fps and a bit rate of 10 Mbps; its exact resolution is 4000 x 3000 pixels.

When the application decodes 12 of these videos simultaneously, all of the video screens are blurred.
Using GPU-Z, I see that the GPU load is below 40% and that about 1.2 GB of graphics memory is in use.
When only 10 of these videos are played at the same time, everything looks almost normal.

An always-pulling multiple-in-multiple-out tbb::flow node


I would like to use tbb::flow in the context of a Software Defined Radio (SDR)
application. This seems like a perfect fit, because I need to pipeline complex 

I have read that TBB nodes use a push-pull protocol for communication: the 
sender pushes as long as the receiver is able to accept. If the receiver cannot 
accept because the node is still running, the edge switches to pull mode, with 
the receiver pulling results from the sender.
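
To make the push-pull behaviour described above concrete, here is a minimal single-threaded sketch in plain C++. It does not use TBB. `ToySender`, `ToyReceiver`, and `ToyEdge` are hypothetical names invented for illustration, and the real tbb::flow implementation is considerably more involved.

```cpp
#include <cassert>
#include <deque>

// Toy model of a push-pull edge; not TBB code. All names are hypothetical.
struct ToyReceiver {
    bool busy = false;           // true while the node body is "running"
    std::deque<int> processed;   // items the receiver has accepted

    // Returns false to reject a pushed item while the node is busy.
    bool try_put(int item) {
        if (busy) return false;
        processed.push_back(item);
        return true;
    }
};

struct ToySender {
    std::deque<int> buffer;      // items held back after a rejected push

    // Called by the receiver in pull mode; returns false when nothing is left.
    bool try_get(int& out) {
        if (buffer.empty()) return false;
        out = buffer.front();
        buffer.pop_front();
        return true;
    }
};

struct ToyEdge {
    ToySender& sender;
    ToyReceiver& receiver;
    bool pull_mode = false;      // the edge starts in push mode

    ToyEdge(ToySender& s, ToyReceiver& r) : sender(s), receiver(r) {}

    // Sender side: try to push; on rejection, buffer the item and switch to pull mode.
    void send(int item) {
        if (!pull_mode && receiver.try_put(item)) return;
        sender.buffer.push_back(item);
        pull_mode = true;
    }

    // Receiver side: once free, pull until the sender has nothing left,
    // then return the edge to push mode.
    void receiver_finished() {
        receiver.busy = false;
        int item = 0;
        while (sender.try_get(item)) {
            bool accepted = receiver.try_put(item);
            assert(accepted);    // the receiver is idle here, so it must accept
            (void)accepted;
        }
        pull_mode = false;
    }
};
```

In this model, a push that arrives while `busy` is set is buffered on the sender side, and the buffered items are delivered in order when `receiver_finished()` is called.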

The problem I have is that some algorithms have multiple inputs which are 

Inspector finds data races in libm functions (asin, acos, atan)


I have an OpenMP code that takes 8 instances of a class and uses them in parallel (the class itself is a well-known particle physics simulation code). However, it runs slower on multiple cores than on a single core.

When using Intel Inspector, I find data races in various functions of the class that seem to be related to basic maths functions in libm. See this screenshot for an example involving atan:


Vectorization with SIMD-enabled functions works from functions, not from main()


I have run into a situation that I cannot explain. I have a loop that calls a SIMD-enabled function, and I put #pragma simd before it. The loop vectorizes if it is placed in a separate function, but does not vectorize if it is inside main(). I am using the Intel C++ compiler. Please see the code and vectorization reports below. Can anyone explain what is happening and whether there is a way to work around it?

This is loop-in-main.cc:

Visual Studio 2015 integration


I had Visual Studio 2012 installed on Windows Server 2012 with Intel C/C++ Composer 2013 edition, and it all worked fine for years.

I then recently purchased an upgrade to Intel PS XE 2016 Cluster edition, installed it with VS2012 addon option - all working fine. I can now compile my VS2012 projects using Intel C/C++ 2016. 

I then installed Visual Studio 2015 Community. The installation succeeded (and I remembered to include Visual C++ tools). The problem is now:

VAAPI-based decoder on NUC - Intel(R) Pentium(R) CPU N3700

Hi all,

I am posting this query here because I didn't find a separate forum for VAAPI-based decoders/encoders.

I got a reply from this forum that the Intel Media SDK is not supported on the NUC with the Intel(R) Pentium(R) CPU N3700.

So, I have taken libva 1.6, Kernel 4.2, vaaps-driver  from 


My vainfo:
