I'm writing a server application that records and merges video from several IP cameras, and encodes the result using the Intel Media SDK. For this I use a modified version of the sample encoder. However, the application sporadically freezes the entire system completely, in such a way that nothing echoes to the console and nothing is printed to any system log. Worse, the server becomes unresponsive and has to be rebooted by physically power-cycling it. We have two such servers, each controlling 11 cameras; the merged frames are 1920x480 at 50 fps.
There is also a "lighter" failure mode where we get error code -17 (DEVICE_FAILURE), in which case we can abort and reset the server. This is not as bad as the full freeze, but still not acceptable long term.
We are currently using Media SDK 1.6. We've successfully run the modified encoder; I've essentially just replaced the parts that read and allocate from file with allocation from frames in memory. It usually crashes within the first 50,000 or so frames; the shortest run I believe was just 5,000-6,000 frames, the longest 266,000.
What we have tried so far to locate the error:
* Removed the lines in the large server program that initialized, sent frames to, stopped, and closed the encoder (four lines in total), and ran the entire app for well over a million frames. No problems.
* Ran the modified encoder class in a minimal program that created frames by drawing simple patterns (moving lines, expanding circles, etc.). It seems we can run it more or less indefinitely, although when run on both servers we once got very different file sizes (about 14.4 GB on one server, 12.5 GB on the other).
* Ran the full program, but instead of removing the encoder lines we fed it the same generated frames as in the test program. Although this appeared more stable, it still resulted in several -17 errors.
So to try to formulate some questions:
What are potential causes of the -17 error? The docs don't say anything more detailed than "you are screwed, shut down the encoder". One idea I had was that the frames I feed it could be malformed. We do a lot of processing on CUDA, and the final merged frame resides in CUDA memory, so I convert to NV12 directly before extracting, using the BT.709 coefficients from http://en.wikipedia.org/wiki/YUV. However, the FOURCC site has a long text about how such transforms are wrong: Y has to be in [16,235], etc. Are there any such requirements that I've missed? On successful runs the resulting videos look great, with perhaps slightly too strong colours.
I'm currently running the encoder class in a separate thread. The thread responsible for calling into CUDA and retrieving the result places NV12 frames in a thread-safe queue using preallocated (thread-safe) memory, and the encoder thread pops them and sends them to the encoder. There are several more threads (one per camera, among others); could that be a problem? If so, would it help to run the encoder as a separate process instead?
When I run the standalone test program it uses nowhere near the same resources (memory, CPU) as the full server application. Could this affect the encoder's behaviour?
Has anyone else experienced similar issues?
I'll supply my modified class and test programs if it helps; however, as of yet I can't reproduce the error with them. I can't supply the source for the entire program as easily, and you'd need a fairly specific hardware configuration with lots of cameras adhering to particular APIs, so it probably wouldn't be useful anyway, in addition to the problem being barely reproducible there either.