TBB's Dynamic Memory Interface Replacement in offloaded code

I would like to try replacing the memory allocator with TBB's scalable memory allocator as detailed here:


I would like to do this for allocations in offload regions. This is on Windows. What I've tried:

1) Adding 

  • tbbmalloc_proxy.lib /INCLUDE:"__TBB_malloc_proxy"

to the link line. This clearly affects only the host allocations.

2) Adding

-ltbbmalloc_proxy -ltbbmalloc

to the offload linker options. I also had to copy the .so files to the MIC and put them in /usr/lib64.

Memory allocation still seems to be slow, although I'm not sure how to definitively tell if I'm actually using the TBB allocator.  Is there anything else that I need to do?


It turns out that the TBB proxy approach works for native Windows and Linux applications but not in offloaded code, so you cannot replace the standard allocator with the TBB allocator via the TBB proxy library in offload regions. I am not very familiar with the offload execution model, but the root cause is that by the time the offload module starts (and the TBB proxy library is loaded), the standard allocator is already loaded, and it is too late to override the malloc/free symbols.

The most reliable way to use the TBB allocator in your offload application is to replace malloc/free with scalable_malloc/scalable_free in your code, e.g.:

#include "tbb/scalable_allocator.h"  /* declares scalable_malloc and scalable_free */
#define malloc(size) scalable_malloc(size)
#define free(ptr) scalable_free(ptr)
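As for telling definitively whether the redirection is in effect, one option is to route the macros through thin counting wrappers first. The following is only a minimal sketch of that idea, not something taken from the TBB documentation: counting_malloc, counting_free, g_alloc_calls, g_free_calls and churn_buffers are hypothetical names made up for illustration, and the wrappers forward to the standard allocator so that the snippet compiles without TBB. In real code the wrapper bodies would call scalable_malloc and scalable_free instead.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counters: incremented on every redirected call. */
static size_t g_alloc_calls = 0;
static size_t g_free_calls = 0;

/* Hypothetical stand-in for scalable_malloc. It forwards to the standard
 * malloc so this sketch builds without TBB; swap in scalable_malloc(size)
 * when TBB is available. */
static void *counting_malloc(size_t size) {
    ++g_alloc_calls;
    return malloc(size);
}

/* Hypothetical stand-in for scalable_free, same idea as above. */
static void counting_free(void *ptr) {
    ++g_free_calls;
    free(ptr);
}

/* The macros are defined AFTER the wrappers, so the wrappers themselves
 * still call the standard functions and do not recurse. */
#define malloc(size) counting_malloc(size)
#define free(ptr) counting_free(ptr)

/* Example workload: allocates and frees n buffers of the given size and
 * returns how many allocations succeeded. Every malloc/free here goes
 * through the macros above. */
static int churn_buffers(int n, size_t bytes) {
    int ok = 0;
    for (int i = 0; i < n; ++i) {
        void *p = malloc(bytes);
        if (p != NULL) {
            ++ok;
            free(p);
        }
    }
    return ok;
}
```

After running the workload, g_alloc_calls and g_free_calls should match the number of allocations the code was expected to make. If they stay at zero, the translation unit doing the allocating was compiled without the macros in scope. Note that macros like these only affect code compiled after their definition; allocations made inside prebuilt libraries are not redirected.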

Regards, Alex


