DirectX11: big stalls in the driver of an Intel HD 3000

Hi,

I'm currently working on getting an existing DirectX 11 application to run well on a laptop with Intel HD Graphics 3000.
My first attempt gave extremely poor performance. I have read the Sandy Bridge Graphics Developer's Guide, and I have of course planned some well-known tweaks and optimizations.

But when I use the Concurrency Visualizer integrated in Visual Studio 2012, my rendering thread appears to stall at seemingly random points in the frame for more than 10 ms. According to the sampled call stack, the stall is inside the Intel driver, which seems to be waiting on an internal event.

Is this the "normal" behavior of the driver (a flush of the command buffer, or something else)?
If not, are there particular DirectX 11 features known to cause this? Is there a way, using Intel GPA for example, to get more precise information about what goes wrong?
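
To narrow down which API calls the stalls land in, one option is to time suspect calls on the CPU and keep only those above a threshold. The following is a minimal sketch in portable C++, not part of any Intel or Microsoft tool: `StallLog`, `StallRecord`, `record` and `time_call` are hypothetical names, and the 10 ms threshold is simply the stall length mentioned above.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// One measurement that exceeded the threshold (hypothetical helper type).
struct StallRecord {
    std::string label;  // caller-chosen name of the timed call
    double ms;          // wall-clock duration in milliseconds
};

// Collects timed calls whose duration is at or above a threshold.
class StallLog {
public:
    explicit StallLog(double threshold_ms) : threshold_ms_(threshold_ms) {
        assert(threshold_ms > 0.0);
    }

    // Stores the measurement only if it reaches the threshold.
    // Returns true when the measurement was stored as a stall.
    bool record(const std::string& label, double ms) {
        if (ms < threshold_ms_) return false;
        stalls_.push_back({label, ms});
        return true;
    }

    // Runs fn, measures it with a monotonic clock, and records the result.
    // Returns the measured duration in milliseconds.
    template <typename Fn>
    double time_call(const std::string& label, Fn&& fn) {
        const auto start = std::chrono::steady_clock::now();
        std::forward<Fn>(fn)();
        const auto end = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(end - start).count();
        record(label, ms);
        return ms;
    }

    const std::vector<StallRecord>& stalls() const { return stalls_; }

private:
    double threshold_ms_;
    std::vector<StallRecord> stalls_;
};
```

In a renderer, the lambda passed to `time_call` would wrap calls such as `ID3D11DeviceContext::Map` or `IDXGISwapChain::Present`. This only measures CPU-side blocking, and because the driver may defer work, the call that blocks is not necessarily the call that caused the wait.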

Any help, tips or advice is welcome.

 Thanks in advance.


>>>But when I use the Concurrency Visualizer integrated in Visual Studio 2012, my rendering thread appears to stall at seemingly random points in the frame for more than 10 ms. According to the sampled call stack, the stall is inside the Intel driver, which seems to be waiting on an internal event.>>>

Can you post the stalled thread call stack?

Quote:

iliyapolak wrote:

>>>But when I use the Concurrency Visualizer integrated in Visual Studio 2012, my rendering thread appears to stall at seemingly random points in the frame for more than 10 ms. According to the sampled call stack, the stall is inside the Intel driver, which seems to be waiting on an internal event.>>>

Can you post the stalled thread call stack?

Another option is to run your DirectX app under WinDbg and check which threads are hogging the processor. Sadly, the !runaway command is a user-mode command; to fully inspect the problem and find the culprit, you need to combine kernel-mode debugging with user-mode debugging.

Quote:

iliyapolak wrote:

>>>But when I use the Concurrency Visualizer integrated in Visual Studio 2012, my rendering thread appears to stall at seemingly random points in the frame for more than 10 ms. According to the sampled call stack, the stall is inside the Intel driver, which seems to be waiting on an internal event.>>>

Can you post the stalled thread call stack?

Examples of call stacks:

ntoskrnl.exe!SwapContext_PatchXRstor+0x103
ntoskrnl.exe!KiSwapContext+0x7a
ntoskrnl.exe!KiCommitThreadWait+0x1d2
ntoskrnl.exe!KeWaitForMultipleObjects+0x26a
dxgmms1.sys!VidSchWaitForEvents+0x9c
dxgmms1.sys!VidSchWaitForCompletionEvent+0x139
dxgmms1.sys!VIDMM_DMA_POOL::WaitDmaBufferNotBusy+0xcc
dxgmms1.sys!VIDMM_DMA_POOL::AcquireBuffer+0x2a1
dxgkrnl.sys!DXGCONTEXT::Render+0x263
dxgkrnl.sys!DxgkRender+0x3e7
win32k.sys!NtGdiDdDDIRender+0x12
ntoskrnl.exe!KiSystemServiceCopyEnd+0x13
wow64win.dll!ZwGdiDdDDIRender+0xa
wow64win.dll!whNtGdiDdDDIRender+0xf9
wow64.dll!Wow64SystemServiceEx+0xd7
wow64cpu.dll!ServiceNoTurbo+0x2d
wow64.dll!RunCpuSimulation+0xa
wow64.dll!Wow64LdrpInitialize+0x429
ntdll.dll! ?? ::FNODOBFM::`string'+0x6d07
ntdll.dll!LdrInitializeThunk+0xe
gdi32.dll!_NtGdiDdDDIRender@4+0x15
d3d11.dll!NDXGI::CDevice::RenderCB+0x1a9
igd10umd32.dll!0x26bd16
igd10umd32.dll!0x26c3e3
igd10umd32.dll!0x2090fd
igd10umd32.dll!0x2111a3
igd10umd32.dll!0x211539
igd10umd32.dll!0x2085a1
igd10umd32.dll!0x2088c2
igd10umd32.dll!0x1fe092

and

ntoskrnl.exe!SwapContext_PatchXRstor+0x103
ntoskrnl.exe!KiSwapContext+0x7a
ntoskrnl.exe!KiCommitThreadWait+0x1d2
ntoskrnl.exe!KeWaitForMultipleObjects+0x26a
dxgmms1.sys!VidSchWaitForEvents+0x9c
dxgmms1.sys!VidSchWaitForCompletionEvent+0x139
dxgmms1.sys!VIDMM_GLOBAL::xWaitOnDMAReferences+0xa2
dxgmms1.sys!VIDMM_GLOBAL::BeginCPUAccess+0x7e3
dxgmms1.sys!VidMmBeginCPUAccess+0x28
dxgkrnl.sys!DXGDEVICE::Lock+0x287
dxgkrnl.sys!DxgkLock+0x22a
win32k.sys!NtGdiDdDDILock+0x12
ntoskrnl.exe!KiSystemServiceCopyEnd+0x13
wow64win.dll!ZwGdiDdDDILock+0xa
wow64win.dll!whNtGdiDdDDILock+0x76
wow64.dll!Wow64SystemServiceEx+0xd7
wow64cpu.dll!ServiceNoTurbo+0x2d
wow64.dll!RunCpuSimulation+0xa
wow64.dll!Wow64LdrpInitialize+0x429
ntdll.dll! ?? ::FNODOBFM::`string'+0x6d07
ntdll.dll!LdrInitializeThunk+0xe
gdi32.dll!_NtGdiDdDDILock@4+0x15
d3d11.dll!NDXGI::CDevice::LockCB+0x4c
igd10umd32.dll!0x2073dc
igd10umd32.dll!0x22cc3f
igd10umd32.dll!0x20df39
igd10umd32.dll!0x201b24

Thanks for your help.

Obviously your rendering thread entered a synchronous wait state and was swapped out. The main problem is to locate the culprit responsible for the thread's stall. Sadly, I cannot find any documentation for this driver and its functions under 'dxgmms1.sys!VIDMM_GLOBAL'.
Note that dxgmms1.sys is a kernel-mode component, not a user-mode driver: its frames are on the kernel side of ntoskrnl.exe!KiSystemServiceCopyEnd in the stack. Judging by the symbol names, it contains the video memory manager (VidMm) and the GPU scheduler (VidSch) that dxgkrnl.sys calls into.
These three function calls are crucial for understanding the cause of the extended wait:

dxgmms1.sys!VIDMM_GLOBAL::xWaitOnDMAReferences+0xa2
dxgmms1.sys!VIDMM_GLOBAL::BeginCPUAccess+0x7e3
dxgmms1.sys!VidMmBeginCPUAccess+0x28
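
To compare many sampled stacks quickly, a small helper can split each frame into module and symbol and report which dxgkrnl.sys entry point the wait came through. The following is a minimal sketch in portable C++, not anything provided by the driver or the WDK: `Frame`, `parse_frame`, `StallKind`, `classify` and `scheduler_symbols` are hypothetical names. The labels attached to the two paths (`DxgkRender` as a wait for a free DMA buffer, `DxgkLock` as a wait for CPU access to an allocation that still has DMA references) are only a reading of the symbol names in the posted stacks, not documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One parsed stack frame, e.g. "dxgkrnl.sys!DxgkLock+0x22a".
struct Frame {
    std::string module;  // text before '!', e.g. "dxgkrnl.sys"
    std::string symbol;  // text after '!', with any trailing "+0x..." removed
};

// Splits "module!symbol+0xoffset" into module and symbol.
// A line without '!' is returned with an empty module.
Frame parse_frame(const std::string& line) {
    assert(!line.empty());
    Frame f;
    const std::size_t bang = line.find('!');
    if (bang == std::string::npos) {
        f.symbol = line;
        return f;
    }
    f.module = line.substr(0, bang);
    std::string rest = line.substr(bang + 1);
    const std::size_t plus = rest.rfind("+0x");
    if (plus != std::string::npos) rest.erase(plus);
    f.symbol = rest;
    return f;
}

// Assumed interpretation of the two entry points seen in the stacks.
enum class StallKind {
    RenderSubmit,  // entered through DxgkRender (command submission)
    CpuLock,       // entered through DxgkLock (CPU access to an allocation)
    Unknown
};

// Reports which graphics kernel entry point appears in the stack.
StallKind classify(const std::vector<std::string>& stack) {
    for (const std::string& line : stack) {
        const Frame f = parse_frame(line);
        if (f.module != "dxgkrnl.sys") continue;
        if (f.symbol == "DxgkRender") return StallKind::RenderSubmit;
        if (f.symbol == "DxgkLock") return StallKind::CpuLock;
    }
    return StallKind::Unknown;
}

// Returns the dxgmms1.sys symbols in stack order (innermost first).
std::vector<std::string> scheduler_symbols(const std::vector<std::string>& stack) {
    std::vector<std::string> out;
    for (const std::string& line : stack) {
        const Frame f = parse_frame(line);
        if (f.module == "dxgmms1.sys") out.push_back(f.symbol);
    }
    return out;
}
```

Feeding each sampled stack to `classify` and counting the results would show whether the stalls mostly come from submissions or from resource locks, which is a useful thing to know before opening a full trace.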

While going through the WDK display driver documentation, I was not able to find any relevant information about the dxgmms1.sys driver.
Looking at the call stack, I think the cause could be related to the DMA buffer(s). If I'm not wrong, the display miniport driver is responsible for allocating DMA buffers. Moreover, the miniport driver needs to lock the DMA buffer pages, so the issue could be waiting for that lock to be obtained or released.
It is very hard to understand exactly what those functions are doing without putting a breakpoint on one of them and single-stepping through the disassembly.
My advice is to run the GPUView utility (GpuView.exe), which shows statistics about GPU and CPU activity and may let you pinpoint the source of your problem.
Another option is to run your app under WinDbg and issue the !runaway 3 command to see how much user-mode and kernel-mode time each thread has consumed.
Another option is to do kernel-mode debugging of the stalled process.

>>>wow64win.dll!ZwGdiDdDDILock+0xa>>>

Are you running a 32-bit app on 64-bit Windows?

Hi djiz!

Do you have a checked build of dxgkrnl.sys? If you do, you can enable its extended logging, which is displayed when the debugger breaks in.
If you do not, you can still log those errors.
As I stated in my previous post, GPUView is an essential tool in the case of DMA buffer errors.

Quote:

iliyapolak wrote:

>>>wow64win.dll!ZwGdiDdDDILock+0xa>>>

Are you running a 32-bit app on 64-bit Windows?

Hi,
Thanks for your answers.
You're right, I'm running a 32-bit app on 64-bit Windows 7. Could that be related to the stalls?

Quote:

iliyapolak wrote:

Do you have a checked build of dxgkrnl.sys? If you do, you can enable its extended logging, which is displayed when the debugger breaks in.
If you do not, you can still log those errors.

How can I do that?

As for your other suggestion, I tried GPUView briefly a year ago, but at the time I thought it was too complex for my needs. Given the current issue, it is probably a better fit now. I'll give it a deeper try.

>>>Hi,
Thanks for your answers.
You're right, I'm running a 32-bit app on 64-bit Windows 7. Could that be related to the stalls?>>>

It is hard to say whether this is the reason for the thread stalls. WoW64 simply hooks, intercepts and translates your 32-bit system calls. If you could run your app on 32-bit Windows, that would be great, because we could then rule out WoW64 as the cause.

>>>How can I do that?>>> For this, a so-called checked build of Windows is needed. It is not free, but if you can somehow obtain at least a checked dxgkrnl.sys, it could be very helpful in your case.

P.S.

The checked build of Windows is available to MSDN subscribers only.

>>>As for your other suggestion, I tried GPUView briefly a year ago, but at the time I thought it was too complex for my needs. Given the current issue, it is probably a better fit now. I'll give it a deeper try.>>>

I strongly advise you to use the GPUView tool; it can gather information about GPU performance. A more obscure alternative is full-blown kernel debugging.

@djiz

Do you have any updates regarding your stalled application?

Hi,

I've spent some time exploring GPUView. It confirms the big stalls in the rendering thread, and they seem to occur while the thread is in kernel mode (inside the driver).
I didn't get much more information, but I observed that the DMA packet containing the present is queued for a long time in the hardware queue, and the previous DMA packet looks like a huge one. Intuitively, I was expecting many more, much smaller DMA packets.

PS: I've searched for "checked build" in the MSDN subscriber downloads, but without success.

Hi dijz!

The checked build of Windows is not free: to download it you must be an MSDN subscriber, which will cost you $700. For efficient dxgkrnl.sys debugging you need the checked build of the DirectX kernel driver, a special version filled with debugger assertions. This makes driver debugging easier, because the debugger breaks in whenever an assertion is triggered.

Can GPUView provide you with the DMA-related call stack? Can you follow the function that calls the HAL DMA functions? IIRC, DMA buffers are managed by HAL.DLL routines. As I stated earlier, the miniport driver, the one that works with your video hardware, is responsible for DMA allocations. I do not know exactly how the Intel miniport driver accesses DMA functionality. As I told you earlier, one way to dig deep into those DMA-related features is to perform kernel debugging on your stalled thread using the !dma extension command. You can also run Driver Verifier, which will help you test DMA functionality. I strongly advise you to run Driver Verifier and test DMA. Please post the results.

You can find a lot of information about Driver Verifier at this link: http://support.microsoft.com/kb/244617

Did you test your display driver with Driver Verifier?

@dijz

Do you have any updates?

Hi,

Sorry. I was very busy with other things these days, but when I have more time to spend on this, I'll post updates here.
Thanks again. 

Quote:

DZ wrote:

Hi,

Sorry. I was very busy with other things these days, but when I have more time to spend on this, I'll post updates here.
Thanks again. 

OK, I will be waiting for any updates.

By the way, are you a DirectX developer?

 

Hi

Any updates on the status of your problem?

Hi,

Unfortunately, still no updates on this because of other high-priority tasks. I'm sure you know what I mean.
To answer your question: yes, I'm a DirectX developer, but not only that. More generally, I'm a graphics developer. I have also worked a lot on consoles, which is why I'm quite frustrated when working on PC with so many different "opaque" architectures and unknown driver-side or OS-side behavior ;)

>>>Unfortunately, still no updates on this because of other high priority tasks. I'm sure you know what I mean>>>

Yes I understand this pretty well:)

>>>I'm quite frustrated when working on PC with so many different "opaque" architectures and unknown driver-side or OS-side behavior ;)>>>

You mean "Windows". Sadly, Microsoft decided that efficient GDI and DirectX debugging requires installing the checked build, which of course is not free. Moreover, many features are undocumented.

I do not know the programming model of consoles. Do you work with some kind of API (could that be OpenGL)?

 

You're right, I should have said "Windows" instead of "PC", because I've heard that Linux offers much better control, but I've never experienced it myself.
Console graphics APIs are generally platform-specific. Sometimes their interface is "inspired" by OpenGL or DirectX, but you have much more low-level control and a much better understanding of what happens "behind the scenes", because the architecture is fixed. Even when the API seems very close to a PC one, the behavior may be totally different, for example if you have a unified memory architecture.

>>>I've heard that Linux offers much better control, but I've never experienced it myself.>>> This is the case with Intel and ATI graphics hardware. As far as I have heard, Nvidia is reluctant to reveal its hardware architecture to the open source community. I remember being excited when AMD-ATI released the R600 register specification to the general public. If you are interested in getting closer to the GPU hardware, I advise you to follow the reverse engineering of Nvidia drivers done by the developers of the Nouveau project.

Quote:

iliyapolak wrote:
If you are interested in getting closer to the GPU hardware, I advise you to follow the reverse engineering of Nvidia drivers done by the developers of the Nouveau project.

Do you have a link you could share, please?

Quote:

DZ wrote:

Quote:

iliyapolak wrote: If you are interested in getting closer to the GPU hardware, I advise you to follow the reverse engineering of Nvidia drivers done by the developers of the Nouveau project.

Do you have a link you could share, please?

Yes, of course :) I'm posting the link without the http or www (because of the anti-spam filter):

nouveau.freedesktop.org

 

I'm sorry, I just came to listen and learn from this discussion.
