TBB 4.1: Lockup on closing

binksoftware

I recently updated my development environment from Intel C++ Studio XE 2013 Beta to the release version. The release ships with TBB 4.1, and the application I have been working on for over a year now locks up when closing. This did not happen with previous versions of TBB, up to and including the one shipped with the initial Beta release (I did not update it while it was still in Beta).

It may be worth noting that what makes use of TBB is a COM DLL. When DllMain is called with DLL_PROCESS_ATTACH, a static pointer to a task_scheduler_init is initialized. When DllMain is called with DLL_PROCESS_DETACH, the pointer is deleted. The deletion is where things start to go wrong now.
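The lifetime pattern described above could be sketched roughly as follows. This is only an illustrative model, not the actual DLL code: `SchedulerInit` is a hypothetical stand-in for tbb::task_scheduler_init, `Reason` stands in for the DLL_PROCESS_ATTACH / DLL_PROCESS_DETACH values, and `dll_main_model` stands in for DllMain, so the sketch builds without Windows headers or TBB.

```cpp
#include <cassert>

// Hypothetical stand-in for tbb::task_scheduler_init.
// It only counts live instances; the real class starts and stops the scheduler.
struct SchedulerInit {
    static int live;
    SchedulerInit()  { ++live; }
    ~SchedulerInit() { --live; }
};
int SchedulerInit::live = 0;

// Stand-ins for DLL_PROCESS_ATTACH / DLL_PROCESS_DETACH.
enum class Reason { ProcessAttach, ProcessDetach };

// The static pointer mentioned in the post.
static SchedulerInit* g_scheduler = nullptr;

// Models the two DllMain branches described in the post.
bool dll_main_model(Reason reason) {
    switch (reason) {
    case Reason::ProcessAttach:
        assert(g_scheduler == nullptr);   // attach is expected only once
        g_scheduler = new SchedulerInit;  // scheduler object created at load
        break;
    case Reason::ProcessDetach:
        delete g_scheduler;               // in the report, the hang starts inside this delete
        g_scheduler = nullptr;
        break;
    }
    return true;
}
```

In the reported case, the delete in the detach branch corresponds to the `scalar deleting destructor' frame of the call stack.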

Here's the call stack up to the point where it locks up:

> tbb_debug.dll!tbb::internal::binary_semaphore::V() Line 91 + 0x11 bytes C++
tbb_debug.dll!rml::internal::thread_monitor::notify() Line 186 C++
tbb_debug.dll!tbb::internal::rml::private_worker::start_shutdown() Line 241 + 0xb bytes C++
tbb_debug.dll!tbb::internal::rml::private_server::request_close_connection(bool __formal) Line 186 + 0x11 bytes C++
tbb_debug.dll!tbb::internal::market::release() Line 139 C++
tbb_debug.dll!tbb::internal::market::try_destroy_arena(tbb::internal::market * m, tbb::internal::arena * a, unsigned int aba_epoch, bool master) Line 212 C++
tbb_debug.dll!tbb::internal::arena::on_thread_leaving<1>() Line 343 + 0x13 bytes C++
tbb_debug.dll!tbb::internal::generic_scheduler::cleanup_master() Line 1109 C++
tbb_debug.dll!tbb::internal::governor::terminate_scheduler(tbb::internal::generic_scheduler * s) Line 167 C++
tbb_debug.dll!tbb::task_scheduler_init::terminate() Line 308 + 0x9 bytes C++
dllname.dll!tbb::task_scheduler_init::~task_scheduler_init() Line 111 C++
dllname.dll!tbb::task_scheduler_init::`scalar deleting destructor'() + 0x16 bytes C++

It is also worth noting that the issue does not seem to occur if no TBB algorithm (such as parallel_for) is ever called.

I hope you can shed some light on this serious issue.

Thanks.

binksoftware

TBB 4.0 Update 5 does not have this issue.

Alexandr Konovalov (Intel)

Could you please post the stacks of the other threads at the time of the lockup?

binksoftware

No threads were available other than the Main Thread, which I find rather strange. I would expect a bunch of worker threads to be reported, but they are not there. Same story when debugging with WinDbg. I wonder which change from 4.0 Update 5 to 4.1 may have caused this new behavior.

The only extra information I can provide is the topmost frames of the call stack once the process is already locked up (these come from WinDbg):
ntdll!NtReleaseKeyedEvent+0x15
ntdll!RtlDeleteTimer+0x2ed

I also tried compiling the DLL with TBB 4.1, but using the TBB 4.0 Update 5 DLL (tbb_debug.dll). It is even worse as the application hangs during normal use. It was a long shot, and it had all chances of failing anyway.

I also checked if there were any other threads running when releasing the DLL while using TBB 4.0 Update 5, and it's only the Main Thread running (as with 4.1). No lockups at all.

Any other things I may check?

Anton Malakhov (Intel)

I wonder what is in between NtReleaseKeyedEvent() and binary_semaphore::V(). I bet it is ReleaseSRWLockExclusive(). If so, we have a theory about what changed and what happens in TBB 4.1. Please provide the version of your OS and the full call stack from WinDbg. We need to know what caused the call to NtReleaseKeyedEvent, which is known to block when there is no corresponding wait call.

binksoftware

Here's the call stack, excluding the incorrectly resolved stack frames from the Delphi executable:

WARNING: Stack unwind information not available. Following frames may be wrong.
ntdll!NtReleaseKeyedEvent+0x15
ntdll!RtlDeleteTimer+0x2ed
tbb_debug!tbb::internal::binary_semaphore::V+0x11 [z:\itt\branch_tbb40\tbb\1.0\src\tbb\semaphore.cpp @ 91]
tbb_debug!rml::internal::thread_monitor::notify+0x40 [z:\itt\branch_tbb40\tbb\1.0\src\rml\server\thread_monitor.h @ 186]
tbb_debug!tbb::internal::rml::private_worker::start_shutdown+0x6f [z:\itt\branch_tbb40\tbb\1.0\src\tbb\private_server.cpp @ 241]
tbb_debug!tbb::internal::rml::private_server::request_close_connection+0x37 [z:\itt\branch_tbb40\tbb\1.0\src\tbb\private_server.cpp @ 186]
tbb_debug!tbb::internal::market::release+0x92 [z:\itt\branch_tbb40\tbb\1.0\src\tbb\market.cpp @ 139]
tbb_debug!tbb::internal::market::try_destroy_arena+0xaa [z:\itt\branch_tbb40\tbb\1.0\src\tbb\market.cpp @ 212]
tbb_debug!tbb::internal::arena::on_thread_leaving<1>+0x6c [z:\itt\branch_tbb40\tbb\1.0\src\tbb\arena.h @ 343]
tbb_debug!tbb::internal::generic_scheduler::cleanup_master+0x261 [z:\itt\branch_tbb40\tbb\1.0\src\tbb\scheduler.cpp @ 1109]
tbb_debug!tbb::internal::governor::terminate_scheduler+0x54 [z:\itt\branch_tbb40\tbb\1.0\src\tbb\governor.cpp @ 167]
tbb_debug!tbb::task_scheduler_init::terminate+0xac [z:\itt\branch_tbb40\tbb\1.0\src\tbb\governor.cpp @ 308]
DllName_66ed0000!tbb::task_scheduler_init::~task_scheduler_init+0x1e [c:\program files (x86)\intel\composer xe 2013\tbb\include\tbb\task_scheduler_init.h @ 111]
DllName_66ed0000!tbb::task_scheduler_init::`scalar deleting destructor'+0x16
DllName_66ed0000!DllMain+0x9f [c:\path\src\dllmain.cpp @ 26]
DllName_66ed0000!__DllMainCRTStartup+0xcd [f:\dd\vctools\crt_bld\self_x86\crt\src\crtdll.c @ 512]
DllName_66ed0000!_DllMainCRTStartup+0x21 [f:\dd\vctools\crt_bld\self_x86\crt\src\crtdll.c @ 476]
ntdll!RtlQueryEnvironmentVariable+0x241
ntdll!LdrShutdownProcess+0x141
ntdll!RtlExitUserProcess+0x74
kernel32!ExitProcess+0x15
DelphiExe!_enc$textbss$begin+0x6967

Following the disassembly from the point where the binary semaphore is signaled (V), it calls:
_RtlReleaseSRWLockExclusive (that's what __TBB_release_binsem resolves to)
+_RtlpWakeSRWLock
++_NtReleaseKeyedEvent
+++_NtSetInformationThread

I cannot continue debugging once it reaches _NtSetInformationThread, and if I pause the process then it says it's stuck at _NtReleaseKeyedEvent.

I fear that may not be enough information to solve the issue, though.
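As background for the trace above: a binary semaphore offers two operations, P() to wait for a signal and V() to post one. The stack shows start_shutdown() reaching binary_semaphore::V() through thread_monitor::notify(), presumably to wake a sleeping worker thread. The sketch below is a portable model built on std::mutex and std::condition_variable. It is not TBB's implementation (which, according to the disassembly above, goes through an SRW lock here), and the names BinarySemaphore, try_P and semaphore_demo are made up for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Illustrative binary semaphore; not the TBB class of the same purpose.
class BinarySemaphore {
public:
    // P: block until the semaphore is signaled, then consume the signal.
    void P() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return signaled_; });
        signaled_ = false;
    }

    // V: post the signal and wake one waiter, if there is one.
    // Returns immediately whether or not a thread is waiting.
    void V() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_one();
    }

    // Non-blocking P: consume the signal only if it is already posted.
    bool try_P() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!signaled_) return false;
        signaled_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Single-threaded walk-through: a V() with no waiter is remembered,
// so the next P() returns at once and leaves the semaphore unsignaled.
void semaphore_demo() {
    BinarySemaphore sem;
    sem.V();
    sem.P();
    assert(!sem.try_P());
}
```

In this model V() only sets a flag and notifies, so it returns even when no thread is waiting. According to the discussion above, that is the property the NtReleaseKeyedEvent path may lack.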

Anton Malakhov (Intel)

Another question, does it deadlock in debug configuration only?

binksoftware

Quote:

Anton Malakhov (Intel) wrote:

Another question, does it deadlock in debug configuration only?

I just tested, and it deadlocks in release mode as well.

Anton Malakhov (Intel)

Hi,
we cannot reliably observe the problem in our tests, so we would appreciate your help in making a reproducer.
Meanwhile, we will disable the usage of SRWLocks in the next Update 1 release. We hope this will help until the root cause is identified and fixed.

binksoftware

Hi.

Quote:

Anton Malakhov (Intel) wrote: we cannot reliably observe the problem in our tests, so we would appreciate your help in making a reproducer.

I have added an issue report at the Premier Support website with the same title as this topic, in which there is a way to reproduce it. I would rather not put the link information to download the application here in the forums.

Quote:

Anton Malakhov (Intel) wrote: Meanwhile, we will disable the usage of SRWLocks in the next Update 1 release. We hope this will help until the root cause is identified and fixed.

I will check it once the new version is released.

Regards,
Paúl.

Wooyoung Kim (Intel)

Hi,

Quote:

binksoftware wrote:

I also tried compiling the DLL with TBB 4.1, but using the TBB 4.0 Update 5 DLL (tbb_debug.dll). It is even worse as the application hangs during normal use. It was a long shot, and it had all chances of failing anyway.

I tried to reproduce the hang you mentioned here by compiling the TBB tests with the 4.1 headers and running them with the 4.0U5 DLL.
Unfortunately, I was not able to reproduce any hang. Could you give us a bit more information regarding the nature of your application?
In particular, we would like to know what TBB constructs the application uses. That will help us understand the cause of the problem: whether it is due to the API changes from 4.0U5 to 4.1, or to a bug in the TBB release, in which case the TBB tests need to be extended.

Thank you very much

binksoftware

Hi.

Quote:

Wooyoung Kim (Intel) wrote: I tried to reproduce the hang you mentioned here by compiling the TBB tests with the 4.1 headers and running them with the 4.0U5 DLL.
Unfortunately, I was not able to reproduce any hang. Could you give us a bit more information regarding the nature of your application?
In particular, we would like to know what TBB constructs the application uses. That will help us understand the cause of the problem: whether it is due to the API changes from 4.0U5 to 4.1, or to a bug in the TBB release, in which case the TBB tests need to be extended.

Thank you very much

The program uses tbb::atomic (mostly for reference counting), tbb::spin_mutex for (occasional) locking and the parallel_for and parallel_do template functions. The program also uses the scalable_malloc and scalable_free functions for memory allocation of objects and containers. I don't recall using anything else.
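To make the first two usage patterns concrete, here is a rough sketch using standard-library analogues: std::atomic in place of tbb::atomic for the reference count, and a std::atomic_flag spin lock in place of tbb::spin_mutex. The class names RefCounted and SpinMutex are hypothetical, and this is not the application's code. The parallel_for, parallel_do and scalable_malloc parts are left out because they need TBB itself.

```cpp
#include <atomic>
#include <cassert>

// Reference count kept in an atomic, as the post describes for tbb::atomic.
class RefCounted {
public:
    void add_ref() { count_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when the caller dropped the last reference.
    bool release() {
        int previous = count_.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0);  // releasing more often than acquiring is a bug
        return previous == 1;
    }

    int use_count() const { return count_.load(std::memory_order_acquire); }

private:
    std::atomic<int> count_{1};  // the creator holds the first reference
};

// Minimal spin lock for short, occasional critical sections.
class SpinMutex {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait until the current holder calls unlock()
        }
    }
    bool try_lock() { return !flag_.test_and_set(std::memory_order_acquire); }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

Neither class needs the task scheduler, which fits the earlier observation that the lockup only shows up once an algorithm such as parallel_for has been called.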

Regards,

Paúl.

jimdempseyatthecove

Are you explicitly deleting the task_scheduler_init object (prior to termination/exit of all tasks)?
IOW you have some (exception) condition (or convergence) and you wish to stop execution and choose to do so via "delete" on the task_scheduler_init object?

Jim Dempsey

www.quickthreadprogramming.com
binksoftware

Quote:

jimdempseyatthecove wrote:

Are you explicitly deleting the task_scheduler_init object (prior to termination/exit of all tasks)?
IOW you have some (exception) condition (or convergence) and you wish to stop execution and choose to do so via "delete" on the task_scheduler_init object?

Jim Dempsey

The task_scheduler_init object is explicitly deleted, but only when unloading the COM DLL. No TBB operations are being performed at this point, and all objects that were created using the DLL should have been released. IOW the COM DLL is only released when the program is closed, and only then.

Regards,

Paúl.

Wooyoung Kim (Intel)

Thanks a lot!!
Does your reproducer show this kind of issue, too? I.e., if I run it with the 4.0U5 DLL, does it hang in the middle of execution?
I have looked at the change log and the diffs for those TBB constructs between 4.0U5 and 4.1, but it is not obvious what changes might have caused the issue. If your previous reproducer does not have the issue, would you mind giving us another small reproducer?

Quote:

binksoftware wrote:

Hi.

Quote:

Wooyoung Kim (Intel) wrote: I tried to reproduce the hang you mentioned here by compiling the TBB tests with the 4.1 headers and running them with the 4.0U5 DLL.
Unfortunately, I was not able to reproduce any hang. Could you give us a bit more information regarding the nature of your application?
In particular, we would like to know what TBB constructs the application uses. That will help us understand the cause of the problem: whether it is due to the API changes from 4.0U5 to 4.1, or to a bug in the TBB release, in which case the TBB tests need to be extended.

Thank you very much

The program uses tbb::atomic (mostly for reference counting), tbb::spin_mutex for (occasional) locking and the parallel_for and parallel_do template functions. The program also uses the scalable_malloc and scalable_free functions for memory allocation of objects and containers. I don't recall using anything else.

Regards,

Paúl.

binksoftware

Quote:

Wooyoung Kim (Intel) wrote:

Thanks a lot!!
Does your reproducer show this kind of issue, too? I.e., if I run it with the 4.0U5 DLL, does it hang in the middle of execution?
I have looked at the change log and the diffs for those TBB constructs between 4.0U5 and 4.1, but it is not obvious what changes might have caused the issue. If your previous reproducer does not have the issue, would you mind giving us another small reproducer?

The reproducer I provided in Premier Support is the only reproducer I have (which happens to be the full application). That one was compiled with TBB 4.1, and by just replacing the DLL with that of TBB 4.0U5 I had the early hang. I don't really think swapping DLLs is the way to go, though.

Regards,

Paúl.

binksoftware

In case someone else is interested, Intel reverted the code that introduced the issue. It is available in TBB 4.1 Update 1. It's not part of the release notes, though.

Thanks again for the fix.
