Intel® Threading Building Blocks

Stack Overflow

I'm porting an app to TBB. Before I started, everything worked well, but now I'm having problems with stack overflow. How do I go about tracking this down? (The call stack window in VS2008 shows a list of calls, but warns me that "Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll".)

concurrent_queue freezes after 65535 pops

I have a bizarre problem with TBB concurrent queues and would appreciate any help.

I have two tasks running in parallel (triggered in a parallel_for). The first task ("producer") pushes an event pointer to an eventQueue and waits for (pops) an acknowledgement on an ackQueue. The second task ("consumer") pops the eventQueue and pushes an acknowledgement to the ackQueue once it is done. This way I can have multiple consumers subscribing to the same set of events.
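For readers trying to reproduce the handshake described above, here is a minimal sketch in standard C++ only. It does not use TBB: `BlockingQueue`, `Event`, and `run_handshake` are hypothetical names made up for this illustration, and the mutex-based queue is a simple stand-in for a blocking concurrent queue, not the TBB implementation. It only shows the event/acknowledgement round trip, run for more than 65535 iterations.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a blocking concurrent queue (NOT the TBB class).
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }

    // Blocks until an item is available, then removes and returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
};

struct Event { int id; };  // hypothetical event type

// Runs `count` event/ack round trips; returns how many acks matched.
std::size_t run_handshake(std::size_t count) {
    BlockingQueue<const Event*> eventQueue;
    BlockingQueue<int> ackQueue;
    std::size_t acknowledged = 0;

    // Consumer: pop an event pointer, then push an acknowledgement.
    std::thread consumer([&] {
        for (std::size_t i = 0; i < count; ++i) {
            const Event* e = eventQueue.pop();
            ackQueue.push(e->id);
        }
    });

    // Producer: push an event pointer, then wait for its acknowledgement.
    Event event{0};
    for (std::size_t i = 0; i < count; ++i) {
        event.id = static_cast<int>(i);
        eventQueue.push(&event);
        if (ackQueue.pop() == event.id) ++acknowledged;
    }
    consumer.join();
    return acknowledged;
}
```

Calling `run_handshake(70000)` exercises the pattern past the 65535 mark. If this version completes while the TBB version stalls, that would point at the queue or the scheduler interaction rather than at the handshake logic itself.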

scalable_malloc and cache alignment

Hey all,

I'm writing some low-level code (for cache optimized data structures), and it's critical that I know that the pointer returned by scalable_malloc points to the beginning of a cache-line. Is this the default behaviour of scalable_malloc? Is it even possible to ensure that I've actually allocated memory at the beginning of a cache-line?
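Whatever the allocator's default turns out to be, the alignment of a returned pointer can be checked directly, and alignment can be forced by over-allocating. The sketch below uses plain `std::malloc` so it compiles without TBB. The 64-byte line size and the names `kCacheLine`, `is_cache_aligned`, `cache_aligned_alloc`, and `cache_aligned_free` are assumptions for illustration. TBB's allocator headers also declare `scalable_aligned_malloc` and `cache_aligned_allocator`, which may be a better fit if your version provides them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed cache-line size: 64 bytes is common on x86 but not guaranteed.
constexpr std::size_t kCacheLine = 64;

// True if p points at the first byte of a (64-byte) cache line.
bool is_cache_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLine == 0;
}

// Over-allocate, round up to the next line boundary, and stash the raw
// pointer in the slot just before the aligned block so it can be freed.
void* cache_aligned_alloc(std::size_t bytes) {
    void* raw = std::malloc(bytes + kCacheLine + sizeof(void*));
    if (!raw) return nullptr;
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned = (start + kCacheLine - 1) / kCacheLine * kCacheLine;
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from cache_aligned_alloc.
void cache_aligned_free(void* p) {
    if (p) std::free(reinterpret_cast<void**>(p)[-1]);
}
```

The same `is_cache_aligned` check can be applied to a pointer returned by scalable_malloc to see what it does on a given build, though a handful of aligned results does not prove a guarantee.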



TBB Malloc memory consumption always rising in our application


First of all, I must say that I am really enjoying the TBB library. I modified a few of our single-threaded algorithms to take advantage of the parallel_for and parallel_sort constructs, and I am really impressed by the results, especially on 8- and 16-core servers.

I also decided to use TBB Malloc as our default allocator. I read the technical article, as well as the source code, to try to better understand how it works and what we should expect in terms of speedup on many cores, etc.

How to call Core component function/macro of IXP2xxx network processor?

I'm using IXP2350 Network processor. I found that there are some Core component macros as mentioned in "Intel Internet Exchange Architecture Software Building blocks" document. I wonder how to use them? Do I need to include them in source code of MEs as other macros?

Thank you very much for your consideration and help!

Don't Forget - There's a TBB IRC Channel

Just wanted to let everyone know that there is an unofficial IRC channel for TBB at Just join #tbb. We've had some interesting discussions on the channel, however there's been a steady decline of users for whatever reason.

You can find more info here:


Is it ok to add iterators to blocked_range?

I had been using blocked_range a bit, but was getting tired of writing custom functor objects all the time, so I began trying to use boost/foreach and boost/mem_fn. I got compiler errors about a missing iterator, so I looked into the header and found that blocked_range doesn't have iterators. It has a const_iterator, but not an iterator. Also, the const_iterator isn't actually const at all.

So I added to file:blocked_range.h around line:50

//! A range over which to iterate.

/** @ingroup algorithms */


class blocked_range {


efficiency question in TBB

I am testing the efficiency of a TBB program by running the same parallel_for 10,000 times.
The first several runs are always slower than the rest. What could cause this?

Test method:
task_scheduler_init init;

for (int i = 0; i < testcount; i++) {
    tick_count t0 = tick_count::now();
    // <size_t> is assumed here; blocked_range needs a template argument
    parallel_for(blocked_range<size_t>(0, size), ApplyFoo(source, dest, minus), auto_partitioner()); // for auto
    tick_count t1 = tick_count::now();
    sum = sum + (t1 - t0).seconds();
    fprintf(fp, " %d : %.12f\n", i, (t1 - t0).seconds());
} // end for
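One plausible explanation (an assumption, not something confirmed in this thread) is one-time startup cost, such as worker threads being created and caches being cold during the first iterations. A common way to keep that out of the measurement is to run a few untimed warm-up iterations first. The sketch below uses only `std::chrono`; `mean_seconds_after_warmup` and its parameters are hypothetical names, and `work` stands in for the parallel_for call.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Calls `work` `warmup` times without timing, then `runs` times with timing.
// Returns the mean wall-clock seconds per timed call (0.0 if runs == 0).
template <typename Work>
double mean_seconds_after_warmup(Work work, std::size_t warmup, std::size_t runs) {
    using clock = std::chrono::steady_clock;

    // Untimed iterations absorb one-time startup costs.
    for (std::size_t i = 0; i < warmup; ++i) work();

    double sum = 0.0;
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t0 = clock::now();
        work();
        const auto t1 = clock::now();
        sum += std::chrono::duration<double>(t1 - t0).count();
    }
    return runs ? sum / static_cast<double>(runs) : 0.0;
}
```

In the loop above, the lambda passed as `work` would wrap the parallel_for call. Comparing the mean with and without a warm-up phase shows how much of the difference comes from the first few runs.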

Alternative Termination for parallel_do

Flush with success at my first parallel_for, I'm on to bigger and more complex problems. I now have a problem in which a stochastic process is used to fill a data structure. Each call to the process can result in 0, 1, or several additions to the data structure. I need to terminate when the data structure is full. The serial code is a simple do{StochasticProcess();}while (!dataStructure.isFull()); It takes tens of thousands of calls to StochasticProcess to fill the dataStructure, and the calls are independent, so it's a great thing to parallelize.
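As a rough illustration of the termination condition, here is a sketch that runs the do/while loop on several threads and stops them all once the structure reports full. It uses `std::thread` and an atomic flag rather than parallel_do, so it is a swapped-in technique and not a TBB answer. `BoundedStore`, `fill_in_parallel`, and the random "0 to 3 additions per call" process are all hypothetical stand-ins for the poster's dataStructure and StochasticProcess.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <random>
#include <thread>
#include <vector>

// Hypothetical bounded container standing in for the post's dataStructure.
class BoundedStore {
public:
    explicit BoundedStore(std::size_t capacity)
        : capacity_(capacity), full_(capacity == 0) {}

    // Adds one item; returns false (and adds nothing) once the store is full.
    bool add(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.size() >= capacity_) return false;
        items_.push_back(value);
        if (items_.size() == capacity_) full_.store(true);
        return true;
    }

    bool isFull() const { return full_.load(); }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    const std::size_t capacity_;
    std::atomic<bool> full_;
    mutable std::mutex mutex_;
    std::vector<int> items_;
};

// Each worker repeats a made-up stochastic step (0 to 3 additions per call)
// until any worker observes that the store is full.
void fill_in_parallel(BoundedStore& store, unsigned workers) {
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&store, w] {
            std::mt19937 rng(w + 1);  // per-thread generator, fixed seed
            std::uniform_int_distribution<int> additions(0, 3);
            while (!store.isFull()) {
                const int n = additions(rng);
                for (int i = 0; i < n; ++i) {
                    if (!store.add(i)) return;  // filled by someone else
                }
            }
        });
    }
    for (auto& t : pool) t.join();
}
```

The capacity check inside `add` is what keeps the structure from overfilling when several workers race past `isFull()` at the same time; the flag alone only tells them when to stop trying.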
