About HW decode + HW vpp on SandyBridge

thorpe2323232 wrote:

Hi,

I am working on a project that utilizes HW decode and HW VPP (to downsize the image) through MSDK, but the app sometimes crashes when I extend it to 9 channels or more.

With HW decode only, everything works fine (up to 16 channels).

My test case is 1920x1080 -> 720x480.

I also tried modifying the decoder sample code in MSDK 3.0 beta to add a VPP stage for downscaling.
The crash can be reproduced there as well.

Could anyone tell me whether there is a limitation when using decode + VPP?

My platform is as follows:
- CPU: Core i3, 3.3 GHz
- GPU: GT1, 850 MHz
- Driver: v8.15.10.2361
- RAM: 2 GB
- OS: Win7 build 7600

Thanks in advance.

admin wrote:

Hi,

Could you please provide some additional information?

- Is your application a 32-bit build?
- Could you share the exact details of your CPU, i.e. is it a Sandy Bridge processor?
- Are you using system memory or D3D memory surfaces?
- Do you mean to say that SW decode works fine but not HW decode?
- Can you supply some more information regarding the crash? Are you getting a Media SDK API error message?

Regards, Petter

thorpe2323232 wrote:

Hi Petter,

1. Yes, the application is 32-bit (both my app and the modified sample_decode).
2. Yes, it is a Sandy Bridge processor; the details are as described in my previous post. Please let me know if you need any more specific information.
3. I am using D3D memory surfaces (if I use HW decode + HW VPP, I am supposed to use D3D surfaces, is that right?).
4. I have not tested SW decode, since my target is HW decode + HW VPP.
Also, as I mentioned, HW decode alone works fine in my test case, but HW decode + HW VPP crashes.
5. In my app, I do not get an MSDK API error message; it crashes in an unexpected Windows kernel module.
In the modified sample_decode, the crash is in the sample itself, and the MSDK API sometimes returns an error code such as MFX_ERR_NULL_PTR, MFX_ERR_MEMORY_ALLOC, or MFX_ERR_MORE_SURFACE.

Thanks.

admin wrote:

Hi,

Thanks for sharing the details about your environment. The issue you are facing is possibly similar to another issue that was reported recently regarding 32-bit multi-threaded decode. We are investigating this issue but currently have no immediate resolution.

As a workaround I suggest trying a 64-bit build. We have not been able to reproduce the multi HW decode issue on 64-bit builds.

As an alternative (if you are bound to 32-bit) you may also try wrapping the CDecodingPipeline.Init() function in a critical section. We have verified that this resolves the multi decode issue, but it naturally also slows down overall application initialization.

Regards, Petter
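
The critical-section workaround described above can be sketched roughly as follows. This is only an illustration using standard C++ rather than the Win32 CRITICAL_SECTION API: `FakePipeline`, `GuardedInit` and `InitChannels` are hypothetical names, and `FakePipeline::Init()` is a stand-in for the sample's CDecodingPipeline.Init(), not the real Media SDK call. The stand-in only counts how many initializations overlap, so the effect of the lock can be observed.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for one decode channel (e.g. a CDecodingPipeline).
// Init() tracks how many initializations run at the same time.
struct FakePipeline {
    static std::atomic<int> active;      // initializations currently in progress
    static std::atomic<int> max_active;  // highest overlap observed so far

    int Init() {
        int now = ++active;
        int prev = max_active.load();
        // Raise max_active to 'now' if this call observed a higher overlap.
        while (now > prev && !max_active.compare_exchange_weak(prev, now)) {}
        std::this_thread::yield();       // give other threads a chance to overlap
        --active;
        return 0;                        // 0 plays the role of "no error" here
    }
};
std::atomic<int> FakePipeline::active{0};
std::atomic<int> FakePipeline::max_active{0};

// One process-wide lock shared by all channels.
std::mutex g_init_mutex;

// Serialize initialization: only one pipeline may be inside Init() at a time.
int GuardedInit(FakePipeline& pipeline) {
    std::lock_guard<std::mutex> lock(g_init_mutex);
    return pipeline.Init();
}

// Start one thread per channel and return the highest overlap seen in Init().
int InitChannels(int channels) {
    assert(channels > 0);
    std::vector<FakePipeline> pipelines(channels);
    std::vector<std::thread> workers;
    for (auto& p : pipelines) {
        workers.emplace_back([&p] { GuardedInit(p); });
    }
    for (auto& t : workers) {
        t.join();
    }
    return FakePipeline::max_active.load();
}
```

Only initialization is serialized in this sketch; once each channel is set up, the decode loops can still run in parallel, which is why the cost is limited to a slower start-up.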

thorpe2323232 wrote:

Hi Petter,

Thanks a lot for the comment.

Do you mean building the application as 64-bit, or running the 32-bit application on a 64-bit OS?
I tried running the 32-bit application on a 64-bit OS (Win 7), but the issue still exists.

Thanks,
Neal

admin wrote:

Hi Neal,

The issue I referred to applies to 32-bit builds (whether they run on a 32-bit or a 64-bit OS). So I suggest trying a 64-bit build instead. You can also explore the critical section approach.

Regards, Petter

thorpe2323232 wrote:

Hi Petter,

Thanks for your feedback.
I built the modified sample_decode as 64-bit and it seems to work.
(Note that the modified sample_decode is based on http://software.intel.com/en-us/forums/showthread.php?t=84292&p=1#155568 provided in your previous post. :))

However, I observed a behavior that I would like your comment on.
When I use more threads to simulate a video wall, system memory usage increases surprisingly; I expected HW decoding not to use much system memory.

The source is HD content: 1920x1080, H.264.
I found that it costs about 100MB of system memory per decoding thread; when I extend to 16 decoding threads, it costs about 800MB.

Is this reasonable? Or is there a way to reduce the system memory consumption?

Thanks.

admin wrote:

Hi,

Could you please check how many surfaces are allocated for each decoder? That will give you a rough estimate of how much memory will be used. Check the surface allocation section of the sample code (QueryIOSurf, which reports the minimum and suggested number of frames).

Note that you can modify the value of the AsyncDepth parameter. With the value set to 1, the decoder/encoder will need fewer surfaces. However, this may impact performance.

Regards, Petter
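
The rough estimate suggested above can be worked out from the surface count alone. The sketch below assumes NV12 surfaces (12 bits per pixel) with width and height rounded up to a multiple of 16; the function names are hypothetical, the surface count is whatever QueryIOSurf reports for your stream, and real driver allocations may use a larger pitch, so treat the result as a lower bound rather than an exact figure.

```cpp
#include <cassert>
#include <cstdint>

// Round 'value' up to the next multiple of 'alignment' (a power of two).
constexpr std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Bytes in one NV12 surface: a full-resolution luma plane plus an
// interleaved chroma plane of half that size (1.5 bytes per pixel).
// Assumption: both dimensions are padded to a multiple of 16.
constexpr std::uint64_t Nv12SurfaceBytes(std::uint64_t width, std::uint64_t height) {
    return AlignUp(width, 16) * AlignUp(height, 16) * 3 / 2;
}

// Total bytes for 'channels' decoders that each hold 'surfaces_per_channel' surfaces.
constexpr std::uint64_t PoolBytes(std::uint64_t width, std::uint64_t height,
                                  std::uint64_t surfaces_per_channel,
                                  std::uint64_t channels) {
    return Nv12SurfaceBytes(width, height) * surfaces_per_channel * channels;
}
```

For the 1920x1080 source discussed here, one surface comes to 1920 x 1088 x 1.5 = 3,133,440 bytes, so every surface removed from each decoder's pool (for example by lowering AsyncDepth) saves about 3 MB per channel.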
