Intel® Threading Building Blocks

Intel TBB segfaults after updating Ubuntu kernel to 4.13

TBB version: 2018 initial release

I am using the OpenCV library compiled with the TBB threading framework through the Python interface. After updating the Kubuntu 16.04 kernel from 4.10 to 4.13, my code stopped working, with a segfault right at the "import cv2" line. I recompiled OpenCV with TBB under the new kernel and hit the same problem. After recompiling with OpenMP instead of TBB, the problem disappears. 2018 Update 2 seems to have the same problem.

tbb::task priority and kill task


I am working on a socket application. My app receives packets at a very high rate, and I have to process and replay each packet within a specific time (say 100 milliseconds). I add every packet to a queue; a thread picks a packet and executes a tbb::task to process it. I have 16 cores and am not able to process all packets in the given time.

My question is: can I raise a task's priority, or kill a task that has not started within 50 ms and execute a new task instead?

What I am doing in my queue processing thread is:


packet* p=q.pop();
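
One way to approximate "kill a task that has not started in time" without touching the scheduler is to timestamp each packet when it is queued and skip it at dequeue time if it has already waited too long. The sketch below is only an illustration under that assumption: `Packet`, `is_stale` and `drain` are hypothetical names, it uses the standard library only, and the `process` callback stands in for whatever spawns the tbb::task in the real application. It does not cancel work that is already running.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>

using Clock = std::chrono::steady_clock;

// Hypothetical packet type: only the enqueue timestamp matters for this sketch.
struct Packet {
    int id;
    Clock::time_point enqueued;
};

// True when the packet has waited longer than max_wait without being started.
inline bool is_stale(const Packet& p, Clock::time_point now,
                     std::chrono::milliseconds max_wait) {
    return now - p.enqueued > max_wait;
}

// Empties the queue, skipping stale packets and calling process on the rest.
// Returns the number of packets that were dropped.
template <typename Process>
std::size_t drain(std::deque<Packet>& q, Clock::time_point now,
                  std::chrono::milliseconds max_wait, Process process) {
    assert(max_wait.count() >= 0);
    std::size_t dropped = 0;
    while (!q.empty()) {
        Packet p = q.front();
        q.pop_front();
        if (is_stale(p, now, max_wait)) {
            ++dropped;      // too old: do not spend a core on it
        } else {
            process(p);     // the real application would start its task here
        }
    }
    return dropped;
}
```

Passing `now` in explicitly keeps the check deterministic; a live worker loop would call `Clock::now()` just before each check.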

2D prefix scan (summed area table)



I was wondering if anybody had suggestions on how to implement a summed area table with Intel TBB.  The general idea of the algorithm:


1. Given an input, do an independent (inclusive) prefix scan on every row.  Call this Intermediate.

2. Transpose Intermediate, call this IntermediateTranspose.

3. Do step (1) again, only do an inclusive prefix scan on every row of IntermediateTranspose.  Call this OutputTranspose.

4. Transpose from (3) OutputTranspose -> Output.
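
The four steps above can be sketched serially with the standard library alone. This is a reference sketch, not a TBB implementation: `Matrix`, `scan_rows`, `transpose` and `summed_area_table` are names chosen for illustration. Every row scan in steps 1 and 3 is independent of the others, so the loop in `scan_rows` is the natural candidate for something like tbb::parallel_for over rows.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Steps 1 and 3: inclusive prefix scan of every row, in place.
// Rows do not depend on each other, so this loop can be parallelized.
void scan_rows(Matrix& m) {
    for (auto& row : m) {
        std::partial_sum(row.begin(), row.end(), row.begin());
    }
}

// Steps 2 and 4: return the transpose of a rectangular matrix.
Matrix transpose(const Matrix& m) {
    if (m.empty()) return {};
    const std::size_t rows = m.size();
    const std::size_t cols = m[0].size();
    Matrix t(cols, std::vector<long long>(rows));
    for (std::size_t r = 0; r < rows; ++r) {
        assert(m[r].size() == cols);  // all rows must have the same length
        for (std::size_t c = 0; c < cols; ++c) {
            t[c][r] = m[r][c];
        }
    }
    return t;
}

Matrix summed_area_table(Matrix input) {
    scan_rows(input);                                  // 1. Intermediate
    Matrix intermediate_t = transpose(input);          // 2. IntermediateTranspose
    scan_rows(intermediate_t);                         // 3. OutputTranspose
    return transpose(intermediate_t);                  // 4. Output
}
```

For the 2x3 input {{1, 2, 3}, {4, 5, 6}}, the row scans give {{1, 3, 6}, {4, 9, 15}} and the final table is {{1, 3, 6}, {5, 12, 21}}.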


Intel tbb flowgraph speedup

Here is my attempt to benchmark the performance of the Intel TBB flow graph. Here is the setup:

- One broadcast node sending continue_msg to N successor nodes (broadcast_node<continue_msg>)

- Each successor node performs a computation that takes t seconds.

- The total computation time when performed serially is Tserial = N * t

- The ideal computation time if all cores are used is Tpar(ideal) = N * t / C, where C is the number of cores.

- The speedup is defined as Tserial / Tpar(actual)

- I tested the code with gcc 5 on a 16-core PC.
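
The quantities in this setup can be written down as a few small helpers. This is a sketch of the arithmetic only, with hypothetical function names; it uses the conventional definition of speedup (serial time divided by measured parallel time). It also adds one assumption not in the post: since equal-length tasks cannot be split, N tasks on C cores take ceil(N / C) rounds, which matches N * t / C only when N is a multiple of C.

```cpp
#include <cassert>
#include <cmath>

// Serial time: N tasks of t seconds each, one after another.
double t_serial(int n, double t) { return n * t; }

// Ideal parallel time as stated in the setup: N * t / C.
double t_par_ideal(int n, double t, int cores) { return n * t / cores; }

// Lower bound when tasks are indivisible: ceil(N / C) rounds of t seconds.
double t_par_rounds(int n, double t, int cores) {
    return std::ceil(static_cast<double>(n) / cores) * t;
}

// Conventional speedup: serial time divided by measured parallel time.
double speedup(double serial, double parallel) {
    assert(parallel > 0.0);
    return serial / parallel;
}
```

With the illustrative values N = 64, t = 0.5 s and C = 16, the serial time is 32 s, both parallel estimates are 2 s, and the best possible speedup is 16. With N = 20 the two estimates differ: 0.625 s versus 1 s.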

How to use weak_ptr as key in tbb::concurrent_unordered_map?

I am using tbb::concurrent_unordered_map to replace std::map in my program like this:


class KvSubTable;
typedef std::weak_ptr<KvSubTable> KvSubTableId;
std::map<KvSubTableId, int, std::owner_less<KvSubTableId> > mEntryMap;

Now I use tbb::concurrent_unordered_map in place of std::map, but I get some compile errors:
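
The error text is not shown, so the following rests on an assumption about the cause: an unordered container needs a hash function and an equality predicate rather than an ordering, std::owner_less is an ordering, and the standard library provides no std::hash specialization for std::weak_ptr. One possible workaround is a wrapper key that records the object's address while it is still alive and hashes on that. `WeakKey`, `WeakKeyHash` and `WeakKeyEqual` are hypothetical names, and the sketch is demonstrated with std::unordered_map so that it depends on the standard library only; tbb::concurrent_unordered_map takes a hasher and a key-equality type as its third and fourth template parameters in the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_map>

class KvSubTable {};

// Hypothetical wrapper key: keeps the weak_ptr plus the address it pointed to
// at construction time. The address is used for hashing only.
struct WeakKey {
    std::weak_ptr<KvSubTable> ptr;
    const KvSubTable* addr;

    explicit WeakKey(const std::shared_ptr<KvSubTable>& sp)
        : ptr(sp), addr(sp.get()) {
        assert(sp);  // the key must be built from a live object
    }
};

struct WeakKeyHash {
    std::size_t operator()(const WeakKey& k) const {
        return std::hash<const KvSubTable*>()(k.addr);
    }
};

// Two keys are equal when neither owner orders before the other,
// which is the equivalence that std::owner_less induces.
struct WeakKeyEqual {
    bool operator()(const WeakKey& a, const WeakKey& b) const {
        return !a.ptr.owner_before(b.ptr) && !b.ptr.owner_before(a.ptr);
    }
};

using EntryMap = std::unordered_map<WeakKey, int, WeakKeyHash, WeakKeyEqual>;
```

One caveat: the hash is consistent with the equality only if every shared_ptr to a given owner returns the same get(), so this sketch assumes no aliasing shared_ptr is used to build keys.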

concurrent_hash_map: Bad performance compared to std::unordered_map with shared_lock


Recently I found a microbenchmark comparing the performance of different concurrent hash map implementations, and the test results are repeatable on my machine. I was wondering why a simple std::unordered_map guarded by std::shared_lock outperforms tbb::concurrent_hash_map. Is there any pitfall in that KVIntelTBB implementation?
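
For readers unfamiliar with the baseline being compared, a std::unordered_map behind a reader-writer lock typically looks like the sketch below. This is not the benchmark's code: `SharedLockMap` is a hypothetical name, the key and value types are chosen for illustration, and it assumes C++17 for std::shared_mutex.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical baseline: one map, one reader-writer lock around all of it.
class SharedLockMap {
public:
    // Readers take a shared lock, so many lookups can run at the same time.
    bool get(const std::string& key, int& out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }

    // Writers take an exclusive lock, blocking all other readers and writers.
    void put(const std::string& key, int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        map_[key] = value;
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<std::string, int> map_;
};
```

How such a map compares with tbb::concurrent_hash_map depends heavily on the read/write mix and on how long each accessor or lock is held, neither of which is stated in the post.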


Having issues compiling code of A Parallel Stable Sort Using C++11 TBB

Intel has already provided the source code for this, but I think there is an issue in the file "test.cpp" at line 276, where the compiler reports: Error: name followed by "::" must be a class or namespace name

Here is the link to get the source code

Can anyone help me fix this issue?

