scalable_malloc heap fragmentation on Windows

Hello! I've been evaluating the TBB allocator to see if it would provide any performance gains for our project, and I discovered something interesting. I wrote a stress test that constantly allocates and frees memory, allocating slightly more than it frees, until the program crashes. The goal is to see how much usable allocation space the allocator can provide under fragmentation-prone conditions. In addition, a giant 64MB allocation is periodically made and freed, to make the crash occur at a well-defined moment (i.e., when there isn't 64MB of contiguous address space left).
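
For readers without the attachment, the test described above might look roughly like the sketch below. This is not the attached repro: the function name `run_fragmentation_stress`, the 55/45 allocate/free ratio, the 16 to 4096 byte block sizes, the probe period, and the `cap_bytes` safety limit are all assumptions made for illustration. The original test had no cap and simply ran until the 32-bit process ran out of address space.

```cpp
#include <cstddef>
#include <cstdlib>
#include <random>
#include <vector>

struct StressResult {
    std::size_t live_bytes;  // bytes held in small blocks when the loop stopped
    bool big_alloc_failed;   // true if the periodic large probe allocation failed
};

// alloc_fn / free_fn: the allocator under test (e.g. std::malloc / std::free).
// big_size: size of the periodic probe allocation (64MB in the post).
// cap_bytes: hypothetical safety limit so the sketch terminates on 64-bit hosts.
StressResult run_fragmentation_stress(void* (*alloc_fn)(std::size_t),
                                      void (*free_fn)(void*),
                                      std::size_t big_size,
                                      std::size_t cap_bytes,
                                      unsigned seed = 12345) {
    struct Block { void* ptr; std::size_t size; };
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> size_dist(16, 4096);
    std::uniform_int_distribution<int> op_dist(0, 99);
    std::vector<Block> live;
    std::size_t live_bytes = 0;
    bool big_failed = false;

    for (std::size_t step = 0; live_bytes < cap_bytes; ++step) {
        if (live.empty() || op_dist(rng) < 55) {
            // Allocate slightly more often than we free (assumed 55% vs 45%).
            std::size_t n = size_dist(rng);
            void* p = alloc_fn(n);
            if (!p) break;  // small allocation failed: allocator is exhausted
            live.push_back({p, n});
            live_bytes += n;
        } else {
            // Free a random live block so holes appear throughout the heap.
            std::uniform_int_distribution<std::size_t> pick(0, live.size() - 1);
            std::size_t i = pick(rng);
            free_fn(live[i].ptr);
            live_bytes -= live[i].size;
            live[i] = live.back();
            live.pop_back();
        }
        if (step % 10000 == 0) {
            // Periodic large probe: fails once no contiguous range is left.
            void* big = alloc_fn(big_size);
            if (!big) { big_failed = true; break; }
            free_fn(big);
        }
    }

    StressResult result{live_bytes, big_failed};
    for (const Block& b : live) free_fn(b.ptr);  // clean up before returning
    return result;
}
```

Calling it once with `std::malloc`/`std::free` and once with the allocator being compared, then comparing `live_bytes` at the point where `big_alloc_failed` becomes true, gives the kind of "how far did it get" number quoted in this thread.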

What I found was that the TBB allocator fails at only around 0.5GB! The default CRT malloc makes it up to ~3.6GB, successfully utilizing most of the address space available to the 32-bit LARGEADDRESSAWARE process.

My searches on this board uncovered two threads of interest: 

scalable_malloc fragmentation problem

scalable_malloc fails to allocate memory while there is much memory avaliable.

The first did not have a repro-case and seemed to be basically unsolved. The second has a lot of comments, and I'm honestly not sure exactly where it landed--but it did seem to be touching on an issue similar to mine. 

This problem only occurs on Windows--I performed the same test on Mac OS X and found that scalable_malloc performed roughly comparably to CRT malloc (2.8GB vs 3.1GB). This makes me wonder if perhaps I've just exposed a bug in the Windows TBB build? (This is mainly why I'm writing--if this is by design then it may just be that the TBB allocator doesn't meet our needs, which is fine).

The attached zip file has my repro. If you have a Visual Studio 2010 command prompt, you should be able to extract it, run the two batch files, and then run the newly-built exe to see the problem (full steps in the readme.txt).

Thanks, David.

Attachment: tbb-heapfragment.zip (492.55 KB)

Hello David,

I have a quick question: do you plan to run a 32-bit server application on a 64-bit server?

Check MSDN for LFH, the Low-fragmentation Heap.

Jim Dempsey

Thanks for the responses; we are using a 32-bit server, in large part because it shares lots of code with our clients, which still don't have 64-bit chips as a minimum requirement. Converting the server to 64-bit is certainly a possibility we've talked about--obviously it would make address space congestion much less of a worry!

Of course CRT malloc is using the LFH heap by default, and that is kicking ass--beyond that I'm not quite sure what you mean, Jim. Does TBB create thread-specific private heaps? If so, do I have to twiddle something to tell TBB: "Please use the LFH setting when creating your heaps?"

Hello David,

I've reproduced the issue locally. I hope we will have something in the next update. I'll ask you to try it out when it is available.

Stay tuned:)

The TBB scalable allocator works somewhat independently of the C++ heap manager. If you have chosen to overload operator new and delete (so that all allocations go through the scalable allocator), then you are subject to its fragmentation quirks. Should this present a problem, then consider:

a) not overloading new and delete with TBB scalable allocator, and adding specific scalable allocations for your high flux objects
b) overloading new and delete with TBB scalable allocator, and adding specific non-scalable allocations for your large low-flux objects.

This may reduce the tendency to fragment on a 32-bit system.
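
Option (a) can be sketched with class-level operator new/delete, which routes only the chosen high-flux type to a separate allocator while everything else stays on the default heap. This is an illustrative sketch, not code from this thread: `hot_alloc`, `hot_free`, `g_hot_allocs`, `HighFluxObject`, and `LargeLowFluxBuffer` are hypothetical names, and the stand-ins below wrap `std::malloc`/`std::free` so the example builds without TBB. In a real build they would presumably forward to `scalable_malloc` and `scalable_free` from `tbb/scalable_allocator.h`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-ins for the scalable allocator entry points.
// The counter exists only so the routing can be observed.
static std::size_t g_hot_allocs = 0;

inline void* hot_alloc(std::size_t n) {
    ++g_hot_allocs;
    return std::malloc(n);
}

inline void hot_free(void* p) {
    std::free(p);
}

// High-flux type: allocated and freed constantly, so it opts in to the
// separate allocator through class-level operator new/delete.
struct HighFluxObject {
    int payload[4];

    static void* operator new(std::size_t n) {
        if (void* p = hot_alloc(n)) return p;
        throw std::bad_alloc();
    }
    static void operator delete(void* p) noexcept {
        hot_free(p);
    }
};

// Large, low-flux type: no overload, so it uses the global operator new
// and therefore the default heap.
struct LargeLowFluxBuffer {
    char data[1 << 16];
};
```

With this layout, `new HighFluxObject` goes through `hot_alloc`, while `new LargeLowFluxBuffer` does not. Option (b) is the mirror image: overload the global operators and give the large types class-level overloads that call the default heap.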

Jim Dempsey

Try out our published update 2. I believe it should work better.


Awesome, thanks for looking into it. I'll check it out on Monday and report back.
