Hello! I've been evaluating the TBB allocator to see if it would provide any performance gains for our project, and I discovered something interesting. I wrote a stress test that constantly allocs/frees memory, allocating slightly more than it frees, until the program crashes. The goal is to see how much useful allocation space the allocator can provide to the user in fragment-genic conditions. In addition, a giant 64MB allocation is periodically made and freed, to make the crash occur at a well-defined moment (i.e., when there isn't 64MB of contiguous address space left).
What I found was that the TBB allocator fails out at only around 0.5GB! The default CRT malloc makes it up to ~3.6GB, successfully utilizing most of the space available to the 32bit LARGEADDRESSAWARE process.
My searches on this board uncovered two threads of interest:
The first did not have a repro-case and seemed to be basically unsolved. The second has a lot of comments, and I'm honestly not sure exactly where it landed--but it did seem to be touching on an issue similar to mine.
This problem only occurs on Windows--I performed the same test on Mac OSX and found that scalable_malloc did roughly comparable to CRT malloc (2.8GB vs 3.1GB). This makes me wonder if perhaps I've just exposed a bug on the Windows TBB build? (This is mainly why I'm writing--if this is by design then it may just be that the TBB allocator doesn't meet our needs, which is fine).
The 'tbb_heapfragment.zip' file attached has my repro. If you have a Visual Studio 2010 command prompt, you should be able to unextract it, run the two batch files, and then run the newly-built exe to see the problem (full steps in the readme.txt).