I was just previewing the Intel Guide for Developing Multithreaded Applications, and the section on Avoiding Heap Contention Among Threads piqued my interest. I definitely recommend that you read it.
It is a simple read: all the information you need right away, without going around in circles, and proof that there is always something new to learn. For me, the new thing was that it is possible to create a lock-free Heap. I consider myself an expert with the Win32 API, but I never really had a reason to use the Heap allocation API, at least not since I started using a multi-core CPU.
The simplest and quickest way to write a program involves heavy use of memory allocation. Calling malloc( ) and new is fast and simple. Most of the time the application only needs to allocate some internal data structure, and using the allocation API is simple and safe. The problem is that there are scenarios in which we need to store large amounts of data; for example, a server may expect 1KB every minute but could get a burst of 100KB in some situations. Sometimes we can prepare buffers in advance, and sometimes the buffer size is unknown and cannot be planned for. Using the global Heap means using a shared resource, which means locking access to it. Locking means that only one thread, and therefore only one core, can work with the Heap at any given time.
The article describes how to avoid that contention by giving each thread its own Heap. The memory can still be shared, but every thread has its own allocation engine. A thread can allocate memory and pass it to another thread, but only the creator may deallocate the buffer. This is good practice regardless of the performance issue. Because each Heap has a single owner, it needs no lock, which is what makes the solution lock-free. The text also demonstrates a good use of the Microsoft Visual C++ extensions for TLS. TLS (Thread Local Storage) is the parallel equivalent of a singleton: one instance per thread instead of one per process. If cout / printf kept their state in TLS, there would be no thread-to-thread collisions when printing.
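As a rough illustration of the one-allocator-per-thread idea, here is a minimal portable C++ sketch. It is not the article's code: it swaps the Win32 Heap for a simple fixed-size bump allocator and the Visual C++ TLS extension for the standard thread_local keyword. The names ThreadArena and local_arena, and the 64 KB capacity, are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread bump allocator. Each instance is used by exactly
// one thread, so its bookkeeping needs no lock.
class ThreadArena {
public:
    explicit ThreadArena(std::size_t capacity) : storage_(capacity), used_(0) {}

    // Returns a block of at least n bytes, or nullptr when the arena is full.
    void* allocate(std::size_t n) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t free_bytes = storage_.size() - used_;
        if (n > free_bytes) return nullptr;
        // Round the request up so the next block stays suitably aligned.
        const std::size_t rounded = (n + align - 1) / align * align;
        if (rounded > free_bytes) return nullptr;
        assert(used_ % align == 0);
        void* p = storage_.data() + used_;
        used_ += rounded;
        return p;
    }

    // Number of bytes handed out so far, including alignment padding.
    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> storage_;
    std::size_t used_;
};

// One arena per thread: thread_local creates a separate instance the first
// time each thread calls this function. The 64 KB capacity is arbitrary.
ThreadArena& local_arena() {
    thread_local ThreadArena arena(64 * 1024);
    return arena;
}
```

Every thread that calls local_arena() works on its own arena, so two threads never touch the same bookkeeping. This sketch never frees individual blocks; the whole arena is released when its thread exits, so a buffer handed to another thread is only valid while the owning thread is alive.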
I would even extend the document and say that it is preferable to use the lock-free Heap whenever possible, enforcing a design in which a buffer must be deleted by the same thread that allocated it. In C++ you can do this by overloading the operators new and delete. This will not break existing code, because the restriction applies only to allocation and deallocation; all threads can still access any buffer in the application's memory.
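One way the "delete on the allocating thread" rule could be enforced is sketched below, under stated assumptions. The class-level operators new and delete store the allocating thread's id in a small header in front of each object and check it on delete. std::malloc stands in for the per-thread Heap here, and the names BlockHeader, ThreadOwned, owner_of and Message are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <thread>

// Hypothetical header placed in front of every block to remember its owner.
// The alignas keeps the object that follows the header suitably aligned.
struct alignas(std::max_align_t) BlockHeader {
    std::thread::id owner;
};

// Hypothetical base class: types derived from it are allocated and freed
// through these operators instead of the global new and delete.
struct ThreadOwned {
    static void* operator new(std::size_t size) {
        // std::malloc is a stand-in for a per-thread heap in this sketch.
        void* raw = std::malloc(sizeof(BlockHeader) + size);
        if (raw == nullptr) throw std::bad_alloc();
        BlockHeader* header = new (raw) BlockHeader{std::this_thread::get_id()};
        return header + 1;  // the object lives right after the header
    }

    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        BlockHeader* header = static_cast<BlockHeader*>(p) - 1;
        // Debug-build check: only the allocating thread may free the block.
        assert(header->owner == std::this_thread::get_id());
        header->~BlockHeader();
        std::free(header);
    }
};

// Returns the id of the thread that allocated the object at p.
inline std::thread::id owner_of(const void* p) {
    return (static_cast<const BlockHeader*>(p) - 1)->owner;
}

// Example payload type; any thread may read or write it, only the owner deletes it.
struct Message : ThreadOwned {
    char payload[120];
};
```

Existing code that writes new Message and delete m keeps compiling unchanged; only a delete issued from a different thread trips the assertion in a debug build.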
Going over the Win32 API for Heap creation, I noticed another flag called HEAP_GENERATE_EXCEPTIONS. If you omit this flag, a failed allocation from that Heap returns NULL instead of raising an out-of-memory exception. Exceptions are particularly problematic in a parallel environment because an exception can kill a thread in the middle of its work while another thread is waiting for it. Generally speaking, an exception breaks the flow of execution unexpectedly, which might leave a file open, a semaphore unreleased, or no one to signal the completion event. In C the following code is OK:
char* ptr = malloc( 120 );
if ( NULL == ptr ) return; // ERROR !
In C++, however, the following will never work as intended:
char* ptr = new char[ 120 ];
if ( NULL == ptr ) return; // never reached: new throws instead of returning NULL
If the allocation fails, new throws an exception (std::bad_alloc) before the pointer is ever checked, so the next line never executes with ptr equal to NULL. If new and malloc are routed through a Heap that you created yourself, the HEAP_GENERATE_EXCEPTIONS flag decides whether a failed allocation raises an exception or returns NULL in both cases; for example, raise in debug mode and return NULL in release.
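For comparison, standard C++ also offers a way to keep the C-style NULL check without touching the Heap flags: the std::nothrow form of new returns a null pointer on failure instead of throwing. The sketch below is only an illustration; the helper names make_buffer and fill_buffer are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: allocates n chars and returns nullptr on failure,
// because the std::nothrow form of new does not throw std::bad_alloc.
char* make_buffer(std::size_t n) {
    return new (std::nothrow) char[n];
}

// Hypothetical helper showing the check in use: returns false when the
// allocation fails, true after the buffer was filled and released.
bool fill_buffer(std::size_t n, char value) {
    char* ptr = make_buffer(n);
    if (nullptr == ptr) return false;  // this check is reachable with nothrow
    for (std::size_t i = 0; i < n; ++i) ptr[i] = value;
    assert(n == 0 || ptr[n - 1] == value);
    delete[] ptr;
    return true;
}
```

With this form the failure path is an ordinary return value, so the thread can release its semaphore or signal its event before giving up.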
The bottom line is that the section on Avoiding Heap Contention Among Threads is well written, and I highly recommend it as good practice even if you have a single-core CPU.
Asaf
