I am working on a C++ application that has to process data in real time. The application uses Intel TBB's pipeline pattern for parallel processing of the data. It runs multiple pipelines, each with a single token.

I built the application with ICC and started performance measurements using VTune Amplifier. During the General Exploration analysis, I noticed that VTune Amplifier always reported a high CPI and issues with the final_task_switch() function in the vmlinux module.

Does this mean that the kernel is spending too much time context switching? Can anyone suggest how to tackle this performance-reducing behaviour?

Appreciate your help.

Regards,
Vishal
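To help frame the context-switching question, here is a minimal sketch of how the switch counts could be checked from inside the process, independently of the profiler. It assumes Linux and uses the POSIX `getrusage` call. The names `CtxSwitches`, `read_ctx_switches` and `ctx_switches_during` are hypothetical helpers made up for this illustration and are not part of TBB or the application.

```cpp
#include <sys/resource.h>  // getrusage, RUSAGE_SELF (POSIX)
#include <chrono>
#include <thread>
#include <utility>

// Hypothetical helper type: context-switch counters for the whole process.
struct CtxSwitches {
    long voluntary;    // ru_nvcsw: a thread blocked or yielded the CPU
    long involuntary;  // ru_nivcsw: a thread was preempted by the scheduler
};

// Read the current counters for the calling process (all of its threads).
// Returns {-1, -1} if getrusage fails.
CtxSwitches read_ctx_switches() {
    rusage usage{};
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return {-1, -1};
    }
    return {usage.ru_nvcsw, usage.ru_nivcsw};
}

// Run `work` and return how many context switches happened while it ran.
// `work` stands in for one pipeline run; it is an assumption of this sketch.
template <typename Work>
CtxSwitches ctx_switches_during(Work&& work) {
    const CtxSwitches before = read_ctx_switches();
    std::forward<Work>(work)();
    const CtxSwitches after = read_ctx_switches();
    return {after.voluntary - before.voluntary,
            after.involuntary - before.involuntary};
}
```

Wrapping a pipeline run in `ctx_switches_during` gives a rough count to compare against the profiler's report. A large voluntary count would suggest that threads are blocking or waiting often, while a large involuntary count would suggest that more threads are runnable than there are cores available.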