I'm having OpenGL performance issues in an application with the latest Sandybridge HD3000 hardware on Ubuntu 12.04. I've attached the glxinfo output and the Xorg log file for this system. I know this same code works under OpenGL on Windows and OS X with the same chipset. The same code has also been tested and works on Ubuntu 10.04 with older G35 chipsets as well as other ATI and NVIDIA cards. The problem I'm seeing is that performance dramatically degrades over time until the Xorg process is taking up ~100% CPU. I'm currently running mesa 8.0.2 as shown in the glxinfo output, but I've also seen this on 10.04 with the HD3000 running mesa 7.10.x.
I've experimented with disabling various OpenGL features to no avail, and I don't see any OpenGL errors occurring. There is one thing, however, that did resolve it. We currently use a shared OpenGL context among up to 48 different rendering contexts (one for each camera we display video for). If I do not use a shared context then the performance does not degrade over time. However, this workaround is not ideal because hiding/showing the different contexts is extremely slow when we change the number of cameras being displayed. Also, it's more efficient to share the contexts, since otherwise I would have to create a texture containing all the fonts per context instead of sharing one.
Any thoughts/suggestions would be greatly appreciated.
On a probably unrelated note, I'm also not getting the performance I would expect from using OpenGL to display the video in Linux. The software actually performs about 50-60% faster if I just use a wxPaintDC to draw wxBitmaps. I'm not doing anything incredibly complex with OpenGL. It's simply a matter of creating a texture for each YUV plane (or one texture for RGB). I use a pixel buffer for texture upload if supported, otherwise just a glTexSubImage2D call. If pixel shaders are supported I use a shader for the YUV->RGB conversion; otherwise I use the Intel IPP to convert from YUV to RGB into a pixel buffer or intermediate buffer. Then I simply use a quad to display the image. It looks like the majority of the time is spent switching rendering contexts, in SwapBuffers (vsync/vblank is disabled), and in texture upload.

I would think the smaller texture uploads of YUV data and conversion to RGB on the GPU should make OpenGL a little faster, or at least comparable. I don't think I'm fill rate limited, since the cameras I'm testing with are low res, and if we're displaying them small enough we resize before sending to the GPU to save bandwidth. In my testing the total upload would be at most 48MB/sec if I were able to display all frames. The rate I'm seeing right now is more like 24MB/sec.
I've attached some glIntercept logs from another system running Windows to give a brief overview of the OpenGL calls, in case anyone more familiar with OpenGL can point out any flaws or possible improvements. One is the simple path where no pixel buffers or shaders are supported, and the other is the more complex path using multitexturing and pixel shaders to do the YUV->BGR conversion. If I can provide more information or any other details please let me know.