OpenCL calls on Skylake leak handles (Showstopper)

We have a commercial product that runs on an OpenGL framework and uses OpenGL-OpenCL interop for format conversion. We started on Haswell CPUs, where everything worked fine, then ported the product to Skylake with Intel HD 530. Our products showed no issues at first, but we recently noticed a serious problem with newer versions of the display drivers, only on Skylake CPUs.

The issue is that calling clEnqueueAcquireGLObjects and clEnqueueReleaseGLObjects never releases handles. With every call the handle count increases; after 2 hours about 500K handles are taken, and Windows slows down until it freezes.

To confirm this issue, I created a simple app that creates an OpenGL texture, sets up an OpenCL device, and calls the two functions above in an endless loop. Without doing anything else in OpenGL or OpenCL, the number of handles increased rapidly.
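
A loop of this kind can be wrapped in a small probe that reports handle growth per iteration instead of relying on watching the count by eye. The following is a minimal sketch in C, not the actual test app: `leak_probe`, `interop_ctx`, `interop_step`, `process_handle_count`, the `sim_*` helpers and the `LEAK_PROBE_WITH_OPENCL` macro are all hypothetical names. The guarded section assumes Windows, an OpenCL SDK, and a command queue and `cl_mem` that were already created from a shared GL texture. The simulated pair only mimics a leak of one handle per pass so the probe logic can be checked on any machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks: one loop iteration, and a handle-count sampler.
   Both return 0 on success and non-zero on failure. */
typedef int (*probe_step_fn)(void *ctx);
typedef int (*probe_count_fn)(void *ctx, unsigned long *count);

/* Samples the handle count, runs `iterations` steps, samples again, and
   stores the average growth per iteration. Returns 0 on success, -1 on error. */
int leak_probe(size_t iterations, probe_step_fn step, probe_count_fn count,
               void *ctx, double *growth_per_iter)
{
    unsigned long before = 0, after = 0;
    assert(step != NULL && count != NULL && growth_per_iter != NULL);
    if (iterations == 0) return -1;
    if (count(ctx, &before) != 0) return -1;
    for (size_t i = 0; i < iterations; ++i)
        if (step(ctx) != 0) return -1;
    if (count(ctx, &after) != 0) return -1;
    *growth_per_iter = ((double)after - (double)before) / (double)iterations;
    return 0;
}

/* Simulation only (an assumption, not driver behavior): every step
   leaves one extra "handle" behind. */
struct sim_ctx { unsigned long handles; };

int sim_leaky_step(void *p)
{
    ((struct sim_ctx *)p)->handles += 1;
    return 0;
}

int sim_count(void *p, unsigned long *count)
{
    *count = ((struct sim_ctx *)p)->handles;
    return 0;
}

#ifdef LEAK_PROBE_WITH_OPENCL /* hypothetical switch: Windows + OpenCL SDK */
#include <windows.h>
#include <CL/cl.h>
#include <CL/cl_gl.h>

/* Assumes `queue` and `image` were created elsewhere from a context
   that shares the GL texture. */
struct interop_ctx { cl_command_queue queue; cl_mem image; };

/* One pass: acquire the GL object, release it, wait for completion. */
int interop_step(void *p)
{
    struct interop_ctx *c = p;
    if (clEnqueueAcquireGLObjects(c->queue, 1, &c->image, 0, NULL, NULL) != CL_SUCCESS)
        return -1;
    if (clEnqueueReleaseGLObjects(c->queue, 1, &c->image, 0, NULL, NULL) != CL_SUCCESS)
        return -1;
    return clFinish(c->queue) == CL_SUCCESS ? 0 : -1;
}

/* Reads the same per-process handle count that Task Manager shows. */
int process_handle_count(void *p, unsigned long *count)
{
    DWORD n = 0;
    (void)p;
    if (!GetProcessHandleCount(GetCurrentProcess(), &n)) return -1;
    *count = n;
    return 0;
}
#endif
```

With the guarded adapters enabled, a call such as `leak_probe(10000, interop_step, process_handle_count, &ctx, &growth)` would be expected to report growth near zero on an unaffected driver and a clearly positive value on an affected one.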

It is worth mentioning that the OpenCL documentation says we have to call these functions before and after every use of an OpenGL object. With the Intel implementation it seems we can avoid calling them every time: acquire the object once at the beginning and release it after we are done with it. Doing so would fix our problem, since the handle count would not increase with every call. Unfortunately this works on Haswell but not on Skylake; that failure might have the same cause.

Our products are deployed on Windows 10, but to gather more information on this issue I also tested on Windows 7 and saw the same result.

The same code still works fine on older CPUs with the latest drivers and on Skylake with driver version 15407.4279, but all newer driver versions on Skylake have this issue.

This is a showstopper for our product: many customers are already using it and we are worried that they will encounter this issue, and we are also concerned about releasing our new version with this problem.

Thanks for sending this report.  Of course we want to work with you to find an answer quickly.  If you could send your simple reproducer this could help us toward that goal.  If there are no concerns about sharing you can attach the reproducer to this thread. Otherwise, please feel free to send a private message by clicking "send author a message" by my name above this reply.

Hi Jeffrey, 

Thanks for the reply. My test app uses proprietary code that I can't share, but I'm creating a simple app from scratch to share with you as soon as possible.

This is a simple app that creates a GL texture and calls clEnqueueAcquireGLObjects and clEnqueueReleaseGLObjects in a loop.

If you run it on Haswell and watch the number of handles in Task Manager, the count stays constant, but if you run it on Skylake with updated drivers the count increases very rapidly.

 

Attachments:

source.zip (393.28 KB)
binary.zip (66.48 KB)

Thank you for the reproducer.  We are investigating and will get back to you soon.

Issue is reproduced and bug is filed.

 

Thanks Jeffrey,

We are looking forward to the bug fix.

Hi Jeffrey,

We are very close to a hardware product release that could be held back by this issue. Is there a workaround for this issue on Skylake? As stated, on Haswell we are able to acquire the handle once with clEnqueueAcquireGLObjects and hold on to it, but we haven't been able to get that workaround to work on Skylake. If there is no workaround, would it be possible to obtain an early beta driver when this issue is fixed?

Hi Jeffrey,

This is getting more critical for us. Because of the existing issue in the newer drivers we decided to use the old driver, but we are experiencing other issues and lower performance with it.

The dev team is working on a fix now.  I've contacted them to get the latest details.

Thanks a lot, I appreciate it.

 

Hi,

I am able to reproduce this using Madhi's test code on a Haswell system (i7-4790S, HD 4600) with driver version 20.19.15.4703.

So it appears that the issue is not specific to the hardware configuration.

Using Sysinternals Process Explorer, I found that the call to clEnqueueReleaseGLObjects() creates 3 events (NOT cl_events), and the call to clEnqueueAcquireGLObjects() deletes 2 of them on the next pass, leaving an event behind each time.

(Yes, the order is correct.)

Hope this is useful.

Since the bug was reproduced by Intel in August, could you provide us with a workaround for this issue? We have a month to ship our product and we will need a resolution or workaround for this to go out the door with Intel processors.
