parallel_reduce body objects leaked when cancelled

Hello,

I'm noticing that the body objects created during the execution of parallel_reduce are leaked (i.e. never destroyed) when a body's operator() throws an exception or when tbb::task::self().cancel_group_execution() is called. The documentation does not mention this. Is it a bug, or is it intentional (which seems unlikely)?

I have attached a simple test case that reproduces the behavior. It was built with MSVC 2010 (Win32 Debug), but the behavior was first noticed in an x64 Release build.

Here is sample output that shows that destructors are not called:

New body #0
New body #1 split from #0
New body #2 split from #0
New body #3 split from #0
New body #4 split from #1
Joined body #3 to #0
Deleted body #3
New body #5 split from #2
New body #6 split from #5
New body #7 split from #1
Joined body #6 to #5
Deleted body #6
Joined body #5 to #2
Deleted body #5
Joined body #2 to #0
Deleted body #2
New body #8 split from #7
New body #9 split from #1
Joined body #8 to #7
Deleted body #8
New body #10 split from #9
New body #11 split from #4
New body #12 split from #10
New body #13 split from #11
Deleted body #0
Exception
Created: 14 objects, deleted 6 objects
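
For readers without the attachment, here is a rough sketch of how a body can be instrumented to produce log lines like the ones above. It is not the attached main.cpp. The names CountingBody, split_tag and run_balanced_cycle are made up for illustration, and split_tag is only a local stand-in for tbb::split so the sketch compiles without TBB. It performs one split/join cycle by hand to show what a balanced created/deleted count looks like.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for tbb::split so this compiles without TBB.
struct split_tag {};

struct CountingBody {
    static int created;   // total bodies constructed
    static int deleted;   // total bodies destroyed
    int id;               // sequence number used in the log lines
    long sum;             // partial reduction result

    CountingBody() : id(created++), sum(0) {
        std::printf("New body #%d\n", id);
    }
    // Splitting constructor, the shape parallel_reduce requires of a body.
    CountingBody(CountingBody& other, split_tag) : id(created++), sum(0) {
        std::printf("New body #%d split from #%d\n", id, other.id);
    }
    ~CountingBody() {
        ++deleted;
        std::printf("Deleted body #%d\n", id);
    }
    // Merge the right-hand partial result into this body.
    void join(CountingBody& rhs) {
        sum += rhs.sum;
        std::printf("Joined body #%d to #%d\n", rhs.id, id);
    }
};

int CountingBody::created = 0;
int CountingBody::deleted = 0;

// Simulates one split/join cycle by hand and returns created - deleted,
// which is 0 when every constructed body was also destroyed.
int run_balanced_cycle() {
    CountingBody::created = 0;
    CountingBody::deleted = 0;
    {
        CountingBody root;
        root.sum = 1;
        {
            CountingBody right(root, split_tag());
            right.sum = 2;
            root.join(right);
        }
        assert(root.sum == 3);
    }
    return CountingBody::created - CountingBody::deleted;
}
```

In the leaking run above the same difference would be 14 - 6 = 8, which is what the final "Created/deleted" line reports.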

Attachment: main.cpp (text/x-c++src, 2.45 KB)

Upon further debugging, it appears that parallel_reduce has a resource deallocation bug, specifically in tbb::interface#::internal::finish_reduce when has_right_zombie is true. If the task is cancelled (explicitly or via an exception), the finish_reduce::zombie_space member may contain a constructed body object, but finish_reduce::execute will not be called (due to the cancellation), so the body's destructor is never run.

This simple patch outlines a fix.

--- tbb/parallel_reduce.h Tue Oct 16 16:08:32 2012
+++ tbb/parallel_reduce.h Tue Oct 16 16:08:44 2012
@@ -62,12 +62,18 @@
             my_body(NULL)
         {
         }
+        ~finish_reduce(){
+            if( has_right_zombie )
+                zombie_space.begin()->~Body();
+        }
+
         task* execute() {
             if( has_right_zombie ) {
                 // Right child was stolen.
                 Body* s = zombie_space.begin();
                 my_body->join( *s );
                 s->~Body();
+                has_right_zombie = false;
             }
             if( my_context==1 ) // left child
                 itt_store_word_with_release( static_cast<finish_reduce*>(parent())->my_body, my_body );
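
To illustrate the idea behind the patch outside of TBB, here is a minimal standalone sketch. ZombieHolder, TrackedBody and live_after_scope are hypothetical names, not TBB classes; the holder only imitates the "raw storage plus has_right_zombie flag" arrangement described above. It assumes, as the patch does, that the node's destructor still runs when execute() is skipped because of cancellation.

```cpp
#include <cassert>
#include <new>

// Hypothetical body type that counts live instances.
struct TrackedBody {
    static int live;
    TrackedBody()  { ++live; }
    ~TrackedBody() { --live; }
};
int TrackedBody::live = 0;

// Simplified stand-in for finish_reduce (not the TBB class): raw storage
// plus a flag recording whether a body has been constructed in it.
template <typename Body>
struct ZombieHolder {
    alignas(Body) unsigned char zombie_space[sizeof(Body)];
    bool has_right_zombie;

    ZombieHolder() : has_right_zombie(false) {}

    Body* begin() { return reinterpret_cast<Body*>(zombie_space); }

    // Imitates a stolen right child constructing its body in place.
    void construct_zombie() {
        new (zombie_space) Body();
        has_right_zombie = true;
    }

    // Imitates execute(): destroy the zombie and clear the flag so the
    // destructor below does not destroy it a second time.
    void execute() {
        if (has_right_zombie) {
            begin()->~Body();
            has_right_zombie = false;
        }
    }

    // Imitates the patched destructor: cleans up if execute() never ran.
    ~ZombieHolder() {
        if (has_right_zombie)
            begin()->~Body();
    }
};

// Returns the number of live bodies after a holder goes out of scope.
// run_execute == false models cancellation, where execute() is skipped.
int live_after_scope(bool run_execute) {
    TrackedBody::live = 0;
    {
        ZombieHolder<TrackedBody> node;
        node.construct_zombie();
        if (run_execute)
            node.execute();
    }
    return TrackedBody::live;
}
```

Both live_after_scope(true) and live_after_scope(false) return 0 here. Without the destructor, the cancelled case would leave one body alive, and without resetting the flag in execute(), the normal case would destroy the same body twice.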

Quote:

Darcy Harrison wrote:

Upon further debugging, it appears that parallel_reduce has a resource deallocation bug. Specifically in tbb::interface#::internal::finish_reduce when has_right_zombie is true. If the task is cancelled (explicitly or via an exception) the finish_reduce::zombie_space member may contain a constructed body object but finish_reduce::execute will not be called (due to the cancellation) so the destructor is never run.

This simple patch outlines a fix.

Thank you a lot for the report and suggested bugfix!

Looking at the 4.1 Update 3 release, I see that this bug is still outstanding. Please fix it!

So what about this issue? Its presence in the latest TBB is pretty frustrating.

The fix is on the way, stay tuned :)

--Vladimir
