Thanks for the great TBB library!
I have a timestep-based simulation where I call parallel_for every timestep. To reduce memory allocations I use several memory pools, which are kept in a concurrent_queue (strictly speaking, the queue stores a pointer to each pool). In operator(), every task pops a pool pointer from the queue, performs its operation over the range, and pushes the pointer back onto the queue.
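To make the current setup concrete, here is a minimal sketch of the check-out pattern. It is not my real code: `Pool`, `PoolQueue`, `process_range`, and `run_timestep` are hypothetical names, a mutex-guarded std::queue stands in for the concurrent_queue, and plain std::thread workers stand in for the tasks that parallel_for would create.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real memory pool.
struct Pool {
    std::size_t uses = 0;  // counts how many items were processed with this pool
};

// Stand-in for the concurrent queue of pool pointers:
// a std::queue guarded by a mutex.
class PoolQueue {
public:
    void push(Pool* p) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(p);
    }
    // Returns false if no pool is currently available.
    bool try_pop(Pool*& p) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        p = queue_.front();
        queue_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::queue<Pool*> queue_;
};

// What each task body does: check out a pool, work on the range, return it.
void process_range(PoolQueue& pools, std::size_t begin, std::size_t end) {
    Pool* pool = nullptr;
    while (!pools.try_pop(pool)) std::this_thread::yield();
    for (std::size_t i = begin; i < end; ++i) ++pool->uses;
    pools.push(pool);
}

// Runs one "timestep" over [0, n) with num_workers threads (assumed > 0)
// and returns the total number of items processed across all pools.
std::size_t run_timestep(std::size_t num_workers, std::size_t n) {
    std::vector<Pool> storage(num_workers);
    PoolQueue pools;
    for (Pool& p : storage) pools.push(&p);

    const std::size_t chunk = (n + num_workers - 1) / num_workers;
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < num_workers; ++w) {
        const std::size_t begin = std::min(w * chunk, n);
        const std::size_t end = std::min(begin + chunk, n);
        workers.emplace_back(process_range, std::ref(pools), begin, end);
    }
    for (std::thread& t : workers) t.join();

    std::size_t total = 0;
    for (const Pool& p : storage) total += p.uses;
    return total;
}
```

Every task pays for one pop and one push on the shared queue, which is the overhead I would like to get rid of.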
Recently I heard about thread-local storage, which is easy to use in Visual Studio (via "__declspec( thread )"). In my case, it would be sufficient to have one pool per TBB worker thread. Access to the pools through the concurrent_queue could then be dropped.
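Roughly, what I have in mind is sketched below. This is only an illustration under several assumptions: C++11 `thread_local` is used in place of "__declspec( thread )" to keep it portable, std::thread stands in for the TBB worker threads, and `LocalPool`, `local_pool`, and `run_with_thread_local_pools` are hypothetical names. The thread-local variable is a plain pointer that is filled in lazily on first use, and a shared registry owns the pools so they can be released in one place.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real memory pool.
struct LocalPool {
    std::size_t uses = 0;  // counts how many items were processed with this pool
};

// Registry that owns every pool created so far, so that clean-up
// can happen in one place once the worker threads are done.
std::mutex g_registry_mutex;
std::vector<std::unique_ptr<LocalPool>> g_registry;

// A plain pointer: no constructor or destructor is needed on the
// thread-local variable itself.
thread_local LocalPool* t_pool = nullptr;

// Returns the calling thread's pool, creating and registering it on first use.
LocalPool& local_pool() {
    if (t_pool == nullptr) {
        std::unique_ptr<LocalPool> p(new LocalPool());
        t_pool = p.get();
        std::lock_guard<std::mutex> lock(g_registry_mutex);
        g_registry.push_back(std::move(p));
    }
    return *t_pool;
}

// Task body: no queue access, just the calling thread's own pool.
void process_items(std::size_t count) {
    LocalPool& pool = local_pool();
    for (std::size_t i = 0; i < count; ++i) ++pool.uses;
}

// Starts num_threads workers that each process items_per_thread items.
// Returns {number of pools created, total items processed}, then releases
// the pools. The pointers held by the finished threads are never used again.
std::pair<std::size_t, std::size_t>
run_with_thread_local_pools(std::size_t num_threads, std::size_t items_per_thread) {
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < num_threads; ++w)
        workers.emplace_back(process_items, items_per_thread);
    for (std::thread& t : workers) t.join();

    std::lock_guard<std::mutex> lock(g_registry_mutex);
    std::size_t total = 0;
    for (const std::unique_ptr<LocalPool>& p : g_registry) total += p->uses;
    const std::size_t pools = g_registry.size();
    g_registry.clear();  // clean-up of all pools in one place
    return std::make_pair(pools, total);
}
```

With lazy creation, a thread that never runs a task never creates a pool, but the approach still relies on the set of worker threads staying the same between timesteps.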
But this only makes sense if the TBB thread pool is fixed after initialization. Does this assumption hold? If so, I need to run initialization and clean-up operations on my memory pools. I can't do this in a ctor and dtor, because Visual C++ does not allow objects with a ctor or dtor to be declared thread-local. If I instead use a parallel_for with grainsize 1 over the number of cores (the task scheduler is initialized with the same number), task stealing may happen, so one thread can end up running several iterations while another runs none, and not all memory pools are accessed.
Is there a way to prevent task stealing? What do you think about the scenario in general? Are there alternatives?
Thanks and regards,