After debugging the filter by attaching the debug target directly to graphedit, I can get the filter to work in HW mode although graphedit definitely shows a leak still.
If I set the project to release mode and attach the filter to the graph in graphedit I get the HRESULT 0x80004005, which is why I thought before that there was a typedef incompatibility. But it's strange that it still works in debug mode?
That's definitely strange. Could it be that one of the configurations wasn't rebuilt after applying the patch, or maybe is linked to a different dispatcher (libmfx.lib)?
Are you debugging your own filter or the original one from \samples folder (if any difference)?
Have you applied the patch for base_encoder? (just to double check :))
Have you sorted out the problem with sessions joining?
Can you check which of the calls inside the constructor generates an error?
I double checked the libmfx.lib dependency, it is correct as well as rebuilding after applying the patch.
I am debugging the one from the samples folder.
I have applied the patch.
Cannot join the session in release mode... however if I attach Visual Studio to the debug target graphedit.exe it works in either release or debug mode... but if it works then I'm not sure how to debug it because it won't work when regsvr32ing it.
I'm trying to figure out which call the constructor is erroring at, however it seems to be difficult to debug such a scenario when debug mode always works.
This makes me think that there is a difference in filters registration.
Maybe it works with VS attached because the filter is not registered after rebuild, so the unmodified binary is picked up. I think the VS filters vcproj files don't have the setting. If you call regsvr32 explicitly, the modified binary is registered, and that one fails.
Instead of debugging you may try removing all Join/DisJoin calls for MFXSessions. And add a SyncOperation call after RunFrameVPPAsync in CBaseEncoder::RunEncode(). The problem is that you have SW MSDK dll having API 1.3 and HW MSDK dll having API version 1.1.
Another option is to replace libmfxsw32.dll from MSDK 2012 with a dll from MSDK 2.0 (has API version 1.1), do you still have this previous release?
Please also remember to apply the first patch for CBaseEncoder::InternalReset which was a partial fix.
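For reference, the "SyncOperation after RunFrameVPPAsync" idea could look roughly like the sketch below. This is not the actual CBaseEncoder::RunEncode() code: MockVPP and MockSession are hypothetical stand-ins so that the snippet compiles without the Media SDK headers, the RunFrameVPPAsync signature is simplified (the real one also takes the input/output surfaces and aux data), and RunVppAndWait is a made-up helper name. The mock also assumes that a wait period of 0 comes back as MFX_WRN_IN_EXECUTION, so a non-zero wait is passed in.

```cpp
// Stand-ins for the Media SDK types (normally from mfxvideo++.h).
typedef int mfxStatus;
typedef unsigned int mfxU32;
typedef void* mfxSyncPoint;
const mfxStatus MFX_ERR_NONE = 0;
const mfxStatus MFX_WRN_IN_EXECUTION = 1;
const mfxStatus MFX_WRN_DEVICE_BUSY = 2;

// Hypothetical mock of MFXVideoVPP: reports "device busy" a few times first.
struct MockVPP {
    int busyCalls;
    int token;
    mfxStatus RunFrameVPPAsync(mfxSyncPoint* syncp) {
        if (busyCalls > 0) { --busyCalls; return MFX_WRN_DEVICE_BUSY; }
        *syncp = &token;   // hand back a non-null sync point
        return MFX_ERR_NONE;
    }
};

// Hypothetical mock of MFXVideoSession: a zero wait never completes.
struct MockSession {
    int syncCalls;
    mfxStatus SyncOperation(mfxSyncPoint syncp, mfxU32 waitMs) {
        ++syncCalls;
        return (syncp != nullptr && waitMs > 0) ? MFX_ERR_NONE
                                                : MFX_WRN_IN_EXECUTION;
    }
};

// Submit one VPP task, then block until it is finished instead of
// relying on joined sessions to order VPP and encode.
mfxStatus RunVppAndWait(MockVPP& vpp, MockSession& session, mfxU32 waitMs) {
    mfxSyncPoint syncp = nullptr;
    mfxStatus sts = vpp.RunFrameVPPAsync(&syncp);
    while (sts == MFX_WRN_DEVICE_BUSY) {
        // Real code would sleep briefly here before retrying.
        sts = vpp.RunFrameVPPAsync(&syncp);
    }
    if (sts != MFX_ERR_NONE) return sts;
    return session.SyncOperation(syncp, waitMs);
}
```

In the real filter the sync would be called on the session that owns the VPP, with the sync point returned by RunFrameVPPAsync.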
Can you confirm what the first partial fix was inside InternalReset, I can't remember what it was after all the changes.
I am now able to run the release filter in graphedit. I was confused when you said replace libmfxsw32... I thought it was a reference in the compiler settings and couldn't find it, so I literally copied the library (made a backup of 3.0's) and replaced the library explicitly.
I am still seeing a memory leak, I will continue investigating... maybe the first patch I am forgetting will do the trick?
Oh, I'm sorry, it was InternalClose:
The condition and the assert before it should be removed - to free surfaces and samples in any case.
Ah yes, the change you made to the baseencoder must have switched the if statement a bit but I see it now. Just tested... however graphedit bloats up in size after every stop... I think I had one with a 30 meg footprint. Is this what you are seeing on your side?
I don't know if it makes a difference but I have been working in 64 bit mode.
I have confused myself. Of course the patch base_encoder.h/cpp contained both changes. No more changes needed.
That's strange you still see the leak. I was testing 32-bit mode but this should not matter.
Couple more points:
1) what is the encoder filter input format - YUY2? Please make sure it's not RGB32 - with this format CSCPlugin will switch on in the pipeline and it contains an additional HW VPP by mistake.
2) I haven't tested with MSDK 2.0 sw lib, so it could happen that there are problems in that lib. Can you try going the first way and removing all Join/Disjoin calls, add a SyncOperation and run with MSDK 2012 sw lib again?
Unfortunately I'll be able to do more testing at my side only tomorrow when I will come to the office.
I'll try the other way around with SyncOperation... the color space being used is UYVY 4:2:2... I believe it's 16 bits.
I'll let you know what I find after making the changes and flipping back to 3.0's mfxlibsw.
After making those changes and switching back I try to add the filter to graphedit however graphedit freezes and becomes unresponsive.
Make sure you have removed the whole cycle:
sts = MFX_WRN_IN_EXECUTION;
while (MFX_WRN_IN_EXECUTION == sts)
sts = m_mfxVPPSession.DisjoinSession();
In SetAcceleration and in destructor.
I had commented out the while loop in the destructor but forgot the actual constructor. I can now add it to the graph, however when I run it the graph spits out "Could not change state. The operation completed successfully HRESULT = 0x00000000", which I believe means it was successful.
I'll continue looking through the filter.
Ok, I will try tomorrow to test your version of HW library and not joined set-up.
Sounds good, I'll await to hear what you find.
Please let me know how the leak behaves.
That makes more sense, I had it set to 0 for a wait period. It now runs and I ran the graph several times... the last time graphedit was at 237,000 and after pushing stop it went up to 253,000.
I have a feeling we're very close though, I can feel it! What else can I try?
Oh, good to hear that at least we managed to make the patch run. But it's really frustrating that you still see the leak.
Today I was checking with the public GFX driver that most likely is the one you have (BTW - can you please tell the exact driver version you are using?) and I saw "leaks" only on first several iterations of Play-Stop and then memory level stabilized. But I was using a very simple graph: Cam - YUY2 - Encoder-Dump
I will debug the code tomorrow to check where these first leaks come from.
Could you compare 2 versions of the filter - the original and the one with the patch - by the dynamics of memory over several (10 should be enough) iterations of Play-Stop? This should help me understand if we are on the same page regarding what we observe.
Could you also try building the minimum graph - e.g. use Dump filter instead of Muxer+FileWriter or Decoder+Renderer?
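One possible way to judge such a Play-Stop comparison from the numbers alone is sketched below: record the process memory after each Stop and check whether the last few readings stay within a small band. HasStabilized, the window size and the tolerance are all hypothetical choices, not something taken from the filter or from graphedit.

```cpp
#include <cstddef>
#include <vector>

// Returns true if the last `window` memory readings (in K) differ from
// each other by no more than toleranceK, i.e. the level has flattened out.
bool HasStabilized(const std::vector<long>& samplesK,
                   std::size_t window, long toleranceK) {
    if (window < 2 || samplesK.size() < window) return false;
    const std::size_t first = samplesK.size() - window;
    long lo = samplesK[first];
    long hi = samplesK[first];
    for (std::size_t i = first; i < samplesK.size(); ++i) {
        if (samplesK[i] < lo) lo = samplesK[i];
        if (samplesK[i] > hi) hi = samplesK[i];
    }
    return (hi - lo) <= toleranceK;
}
```

A series that grows for the first few iterations and then stays flat would pass, while one that climbs by a similar amount on every Stop would not.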
I'll keep testing...
The driver version is 64x 22.214.171.124
I'll keep you posted with what I find.
I ran through the iterations... and I do notice it stabilizing... I'm going to build the filter into our application and run a suite of tests such as 500 recordings in 10 second intervals, this will give me a better analysis. Keep you posted, thanks for all your dedication and hard work! :)
Great to hear that! I'll wait for your results.
BTW - for the graphics driver version - it should be something like 126.96.36.1999, can you check through DeviceManager->DisplayAdapter->Intel HD Graphics?
Sorry I clicked on the details of the installer... the actual version number is 188.8.131.529
Much better results... however we found a small bug in our code that kept us from doing the full suite of tests but will continue to do the tests after we squash it.
What we did find... out of 500 recordings we had stopped it at 302 because we wanted to go up to 200 recordings in 5 second intervals.
What we see after 302 recordings is 890,000K being utilized. Much better than before, seeing how that was a new high score at about 1/4 of the previous memory consumption. However it would be nice if the application stabilized at around 20,000K.
After we set out to do 2000 recordings, we tested it on 2 machines for the night. Coming in this morning I found that one of the machines did not have HW turned on, so obviously the system crashed and blue-screened.
The other machine got to 268 recordings at 1,125,000K and this is where we found a bug in our communications... the command being sent to turn the graph on was not being sent because of a socket buffer issue.
I'll keep you posted with more details later today.
Overall, much better so far... but still a small leak that allows the system to consume memory gradually.
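To put a rough number on "gradually", the growth per recording can be estimated from the figures above. The sketch below is plain arithmetic; GrowthPerRecordingK is a hypothetical helper, and the 20,000K baseline is an assumption borrowed from the level the application was hoped to settle at, not a measured starting value.

```cpp
// Average memory growth per recording, in the same K units as the readings.
// baselineK is an assumed starting level, not a measured one.
double GrowthPerRecordingK(double finalK, double baselineK, int recordings) {
    if (recordings <= 0) return 0.0;   // nothing recorded, nothing to average
    return (finalK - baselineK) / recordings;
}
```

With 890,000K after 302 recordings this comes out at a bit under 3,000K per recording, and with 1,125,000K after 268 recordings at a little over 4,000K per recording, assuming that 20,000K baseline in both cases.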
Thanks for the good news. At this point, can you confirm by experiment that the remaining memory consumption is due to MSDK filter/library? Would it be possible to run your automated test with SW MSDK library to check consumption?
BTW, could you explain the usage model of multiple Play-Stop in your application? If the idea is to have chosen time frames in the output stream then probably Play-Pause would be sufficient. If separate output streams are needed, then it depends on the file writer and whether it is able to write to a new file each time. And for this scenario the graph could be destroyed and created each time, though this could introduce bigger latency.
I will see if I can find a machine to run the automated test with just graphedit and a basic setup using just the software library.
We cannot do just play pause and there are different resolutions, hertz, and other criteria needed to create the stream each time. I currently dispose of the graph and recreate it with new formats each time depending on the format selected in the system. From what I can tell the latency is very small when waiting... about 2 milliseconds, give or take.
I see now. Sounds like a good workaround considering the 2ms latency.
Anyways, I will keep investigating the leak at my side too.
There is a new driver available: http://downloadcenter.intel.com/SearchResult.aspx?lang=eng&ProductFamily=Graphics&ProductLine=Processor+graphics&ProductProduct=2nd+Generation+Intel+Core+Processors+with+Intel+HD+Graphics+3000%2f2000&ProdId=3319&LineId=3310&FamilyId=39
Its HW MSDK has API 1.3. I was originally experimenting with that driver. The fix for the issue you reported will likely be integrated in one of the future 1.3 libraries anyway. Could you try upgrading the driver and test the patch on it? Since the HW dll API version is the same 1.3 as the MSDK 2012 sw dll, you can use the original patch with joined sessions. It should be more effective.
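The choice between the two set-ups discussed in this thread can be summed up in a small sketch. ApiVersion is a simplified stand-in for the SDK's mfxVersion, and ChooseSetup and the enum names are hypothetical; the rule itself (join only when the SW and HW libraries report the same API version) is just what this thread suggests, not a statement from the SDK documentation.

```cpp
// Simplified stand-in for mfxVersion (Major.Minor API version of a library).
struct ApiVersion {
    unsigned short Major;
    unsigned short Minor;
};

enum SessionSetup {
    JOINED_SESSIONS,             // original patch: VPP session joined to encoder session
    SEPARATE_SESSIONS_WITH_SYNC  // no Join/Disjoin, explicit SyncOperation after VPP
};

// Pick the set-up based on whether the SW and HW libraries agree on the API version.
SessionSetup ChooseSetup(const ApiVersion& sw, const ApiVersion& hw) {
    const bool same = (sw.Major == hw.Major) && (sw.Minor == hw.Minor);
    return same ? JOINED_SESSIONS : SEPARATE_SESSIONS_WITH_SYNC;
}
```

With the older driver (HW API 1.1 against SW API 1.3) this picks the separate-sessions variant, and with the new driver (1.3 on both sides) the joined one.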
This is perfect... I'll start testing once again and let you know the outcome.
Quick question about the drivers... are they valid for Windows Embedded?
Nope, sorry. Only Win7/Vista.
Just tested the driver... was able to install it on an embedded system. Tested with old encoder as well as the newly patched one. The old one still leaks... however the patched one works very well... our tests concluded over 500 recordings without the system crashing so we should be in good shape for now.
Things are looking good :)
Thanks for your support Nina, cheers!
Such a good piece of news just before I go to sleep :-) Thank you for cooperation and good luck!
I see an issue with the patched encoder... I've just ruled everything else out that would cause this new scenario and swapping the intel encoder back to its original version confirms my results.
When encoding with the new patched version to a file... the time stamp is wrong... and when playing the video back it plays at twice the speed. Something we did affected the file directly for playback.
Any thoughts as to what it could have been? Hopefully there's a quick fix. As of right now with this fix... no file can be played back with accurate results.
I will investigate. Sure there must be a fix for this. I'll get back to you as soon as I find the root cause.
I'm not able to reproduce the problem. Don't see any difference so far. But I use the MSDK Muxer so it can be different from your setup. I will check timestamps directly tomorrow.
Meanwhile - could you check timestamps values at your side, after VPPFrameAsync in old and new code? The patch doesn't touch timestamps in any way. The only difference is that HW VPP is replaced by SW VPP. Could be that they calculate differently.
Please also double check your code for other changes.
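A simple way to compare the timestamps of the old and new code is to look at the average distance between consecutive values and relate it to the frame rate. The sketch below does that; AverageDelta and PlaybackSpeedFactor are hypothetical helper names, and the tick rate is a parameter because it depends on where the values are read (Media SDK frame timestamps are in 90 kHz units, DirectShow sample times in 100 ns units).

```cpp
#include <cstddef>
#include <vector>

// Average distance between consecutive timestamps, in input units.
double AverageDelta(const std::vector<long long>& ts) {
    if (ts.size() < 2) return 0.0;
    return static_cast<double>(ts.back() - ts.front()) /
           static_cast<double>(ts.size() - 1);
}

// Expected frame spacing divided by the observed spacing.
// About 1.0 means normal speed; about 2.0 means timestamps advance at
// half the expected step, so the file would play back twice as fast.
double PlaybackSpeedFactor(const std::vector<long long>& ts,
                           double fps, double ticksPerSecond) {
    const double actual = AverageDelta(ts);
    if (actual <= 0.0 || fps <= 0.0) return 0.0;
    return (ticksPerSecond / fps) / actual;
}
```

For example, at an assumed 30 fps in 90 kHz units the expected step is 3000 ticks, so a log showing steps of 1500 would point at the timestamps rather than at the muxer or player.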
To fix the playback issue, I looked through some of your changes in base_encoder.cpp and the while loop in SetAcceleration to disjoin the vpp session seems to be causing it... commenting out the loop fixes the playback but then it eats up the memory again... I'll continue to diagnose.
Commenting out just the MFX_THREAD_WAIT in that loop will fix the playback issue.... however still a memory issue.
I also debugged the video source a bit... it says it's UYVY, however it almost looks like the encoder is pulling it in as RGB32. Thoughts?
This is really strange. THREAD_WAIT and that loop should not be affecting timestamps in any way. Having RGB32 as the encoder filter input can lead to a memory problem if CSCPlugin is invoked (in case of bottom-up RGB32). Can you check whether it is invoked or not?
Just to confirm I have all the RIGHT changes in my encoder base file... can you send your modified copy so that I can compare it with my own?
Sure, I'll send you by e-mail.