Speed bump in parallel_for after ~30s of execution time.

I am using a simple `parallel_for` to do some geometric computations (game context). At first, I get about 30 frames per second. After roughly 30 seconds of execution, however, TBB seems to change its load balancing and performance increases to ~50 fps.

My question is: how can I figure out what changes, so that I can provide an adequate grainsize, partitioner, and/or hints to get the performance boost right from the start? What APIs are available to find out what is happening under the hood?

I've tried monitoring the grainsize in the `parallel_for` callback, but it doesn't change. At least, the value returned by `blocked_range::grainsize()` stays the same. Thank you.

A profiling session:
