I've been reading a lot about caches and multi-core processors, and I still have quite a bit of reading to go.
I have read that moving a thread from one processor to another loses the performance benefit of the data it had already loaded into the first processor's cache. As I understand it, the new affinity_partitioner is designed to try to run tasks on the same core as before when possible.
How much control does TBB really have over which core the tasks get mapped to during execution?
As I have been reading, it seems to me that TBB could benefit from some kernel-level support. I realize TBB is implemented in standard C++, but what's the harm in adding something to the kernel (e.g. Linux) to help performance along?
Are there areas where TBB could be improved with kernel-level support, for instance memory allocation, context switching, and task execution?
These aren't so much questions as just thinking out loud.