What is tbbmalloc_proxy?

I noticed that the latest build of TBB includes a new library called tbbmalloc_proxy, and off the top of my head I couldn't imagine what its purpose is.

After spending three minutes looking, I couldn't find any documentation about the new module. I guess it's documented somewhere, but it's not dead easy to find, so I figured posting here might help others who notice it and then google it or come to the forums.

So what is it? What benefit does it have, and when should I use it?

Thanks!

The tbbmalloc_proxy binaries can be used to replace memory allocation routines with the TBB scalable allocator across the whole process.
On Linux, one should use LD_PRELOAD to load the library into the process before any other shared module. It will then redirect every malloc/free etc. call to the corresponding TBB scalable_* function.
On Windows, the effect is the same but the usage model is different; basically, the tbbmalloc_proxy.lib import library should be linked into the main application module. See include/tbb/tbbmalloc_proxy.h in the TBB installation directory for instructions.
This functionality is not yet supported for other operating systems.

Added: I fixed the misprint in the path above.
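
To make the two usage models concrete, here is a minimal sketch rather than an official recipe. The function name allocate_and_count and the USE_TBBMALLOC_PROXY macro are made up for illustration. The Windows branch assumes that including tbb/tbbmalloc_proxy.h in one source file of the main module is what the header's instructions ask for, so check the header itself before relying on it. The Linux command in the comment uses the library names that appear later in this thread; the paths are placeholders.

```cpp
// Sketch: the application code stays the same on both platforms; only the
// way the proxy library gets into the process differs.
//
// Linux (no rebuild needed; adjust the paths to your installation):
//   LD_PRELOAD=/path/to/libtbbmalloc_proxy.so.2:/path/to/libtbbmalloc.so.2 ./app
//
// Windows (assumption: TBB headers and tbbmalloc_proxy.lib are on the include
// and library paths). USE_TBBMALLOC_PROXY is a hypothetical build switch.
#if defined(_WIN32) && defined(USE_TBBMALLOC_PROXY)
#include "tbb/tbbmalloc_proxy.h"
#endif

#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Exercises both C-style and C++-style allocation. With the proxy loaded,
// these calls are expected to be served by the scalable allocator; without
// it, by the default allocator. The return value is the same either way.
std::size_t allocate_and_count(std::size_t blocks, std::size_t block_size) {
    std::size_t bytes_touched = 0;
    std::vector<std::string> strings;          // goes through operator new/delete
    for (std::size_t i = 0; i < blocks; ++i) {
        void* p = std::malloc(block_size);     // goes through malloc/free
        if (p == nullptr) continue;
        std::memset(p, 0xAB, block_size);
        bytes_touched += block_size;
        std::free(p);
        strings.push_back(std::string(block_size, 'x'));
    }
    return bytes_touched + strings.size();
}
```

Calling allocate_and_count(100, 64) from main and timing it with and without LD_PRELOAD is one simple way to compare the two allocators on the same binary.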

Quoting - Alexey Kukanov (Intel)

The tbbmalloc_proxy binaries can be used to replace memory allocation routines with the TBB scalable allocator across the whole process.
On Linux, one should use LD_PRELOAD to load the library into the process before any other shared module. It will then redirect every malloc/free etc. call to the corresponding TBB scalable_* function.
On Windows, the effect is the same but the usage model is different; basically, the tbbmalloc_proxy.lib import library should be linked into the main application module. See /install/tbb/tbbmalloc_proxy.h for instructions.
This functionality is not yet supported for other operating systems.

Thanks! By the way, that's a great feature; I'd kind of been hoping for something like this. Now the allocator should be as easy to use as Hoard or its like.

I am trying to use the new LD_PRELOAD feature of TBB under Linux and I am trying to figure something out. I am using the TBB 2.1 update 3 Commercial Aligned Release (tbb21_015oss). I am only trying to replace the standard memory allocator with the TBB scalable allocator. I am setting LD_PRELOAD to ${TBB_PATH}/lib/libtbbmalloc_proxy.so.2:${TBB_PATH}/lib/libtbbmalloc.so.2, and I believe that I am getting the scalable allocator. However, I noticed that I get significantly better performance if I create an instance of tbb::task_scheduler_init than when I do not. What else is going on here? Tracing through the code, I find that initializing TBB does eventually call initialize_cache_aligned_allocator(). Does that give me a different version of the scalable allocator? Is there a way to get that without creating an instance of tbb::task_scheduler_init and pulling in libtbb.so?

Quoting - mcmark64
I am trying to use the new LD_PRELOAD feature of TBB under Linux and I am trying to figure something out. I am using the TBB 2.1 update 3 Commercial Aligned Release (tbb21_015oss). I am only trying to replace the standard memory allocator with the TBB scalable allocator. I am setting LD_PRELOAD to ${TBB_PATH}/lib/libtbbmalloc_proxy.so.2:${TBB_PATH}/lib/libtbbmalloc.so.2, and I believe that I am getting the scalable allocator. However, I noticed that I get significantly better performance if I create an instance of tbb::task_scheduler_init than when I do not. What else is going on here? Tracing through the code, I find that initializing TBB does eventually call initialize_cache_aligned_allocator(). Does that give me a different version of the scalable allocator? Is there a way to get that without creating an instance of tbb::task_scheduler_init and pulling in libtbb.so?

Frankly, I have no idea why just initializing TBB might improve performance, and knowing nothing about the application, it is hard to guess. TBB initialization won't give you a different version of the allocator, and it won't give you any malloc replacement.

Added: tbb::cache_aligned_allocator and tbb::tbb_allocator are C++ classes that can be used instead of std::allocator, and provide some additional properties as described in the documentation. Internally, both use scalable_malloc & co if available, and regular malloc otherwise; the initialization function you mentioned performs the selection. Since you preload libtbbmalloc.so.2, scalable_malloc is available to use from TBB.
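
The "scalable_malloc if available, regular malloc otherwise" selection can be illustrated with a small sketch. This is not TBB's code: AllocFns, selected_alloc_fns and SelectingAllocator are hypothetical names, and the lookup through dlsym is just one plausible way to do it on Linux or another POSIX system. The only TBB names used are the exported scalable_malloc and scalable_free symbols.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // makes RTLD_DEFAULT visible on glibc
#endif
#include <dlfcn.h>   // on older glibc versions, link with -ldl
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Pair of entry points chosen once, on first use.
struct AllocFns {
    void* (*alloc)(std::size_t);
    void  (*release)(void*);
    bool  scalable;  // true if the TBB entry points were found
};

// Hypothetical helper: look for scalable_malloc/scalable_free among the
// symbols already loaded into the process (for example via LD_PRELOAD) and
// fall back to malloc/free when they are absent.
inline const AllocFns& selected_alloc_fns() {
    static const AllocFns fns = []() -> AllocFns {
        void* m = dlsym(RTLD_DEFAULT, "scalable_malloc");
        void* f = dlsym(RTLD_DEFAULT, "scalable_free");
        if (m != nullptr && f != nullptr) {
            return AllocFns{reinterpret_cast<void* (*)(std::size_t)>(m),
                            reinterpret_cast<void (*)(void*)>(f), true};
        }
        return AllocFns{&std::malloc, &std::free, false};
    }();
    return fns;
}

// Minimal standard-conforming allocator that forwards to the selected pair.
template <typename T>
struct SelectingAllocator {
    using value_type = T;
    SelectingAllocator() = default;
    template <typename U>
    SelectingAllocator(const SelectingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        std::size_t bytes = n * sizeof(T);
        if (bytes == 0) bytes = 1;  // avoid a null result for empty requests
        void* p = selected_alloc_fns().alloc(bytes);
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) noexcept { selected_alloc_fns().release(p); }
};

template <typename T, typename U>
bool operator==(const SelectingAllocator<T>&, const SelectingAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const SelectingAllocator<T>&, const SelectingAllocator<U>&) noexcept { return false; }
```

A container such as std::vector<int, SelectingAllocator<int>> then works whether or not libtbbmalloc is present in the process, and selected_alloc_fns().scalable tells you which pair was picked.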

Also: by default, TBB packages do not have a lib directory, so I assume you somehow ensured that ${TBB_PATH}/lib/libtbbmalloc_proxy.so.2 really points to the desired binary.

The same functionality is now also available on Windows, in the 20090511 development release.
