I have just upgraded my rather old code from IPP 5.3 to 7.1.1. It turned out to be a huge job due to the API changes, but that's OK, it happens. My code also ended up much smaller and cleaner, as many features I previously had to emulate are now in the sample code.
My problem now is that I am getting very high CPU usage and very slow frame rates, the two being, of course, closely related. First, I am using the "max slice size" option, as I am trying to send RFC 3984 compliant packetisation mode zero RTP packets. Second, I am encoding from a y4m file to minimise any possible interactions with cameras etc. Finally, I am using constant bit rate set to 2Mbps. My test code (effectively) takes a YUV420P frame from the file, feeds it through the codec, splits the output into separate NALUs (one per RTP packet) by searching for the start codes, and finishes by throwing away the result. The build is using VS2012, 32 bit.
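For reference, the splitting step is roughly equivalent to the sketch below. This is not my actual code, just a minimal illustration of scanning an Annex B byte stream for 00 00 01 start codes; the names `NaluSpan` and `splitAnnexB` are made up for this post and are not part of IPP or its samples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One NAL unit located inside an Annex B buffer (start code excluded).
// Hypothetical helper type, not an IPP structure.
struct NaluSpan {
    std::size_t offset;  // index of the first NALU byte (the NAL header)
    std::size_t size;    // bytes up to the next start code or end of buffer
};

// Drop zero bytes before 'end'. These are either the leading zero of a
// 4-byte start code or trailing padding; a NALU never ends in 0x00.
static std::size_t trimTrailingZeros(const std::uint8_t* data,
                                     std::size_t start, std::size_t end)
{
    while (end > start && data[end - 1] == 0) --end;
    return end;
}

// Scan for 00 00 01. A 4-byte start code (00 00 00 01) contains the
// 3-byte one, so a single pattern covers both forms.
std::vector<NaluSpan> splitAnnexB(const std::uint8_t* data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    std::vector<NaluSpan> out;
    bool open = false;       // true once the first start code has been seen
    std::size_t start = 0;   // first byte of the NALU currently being scanned
    std::size_t i = 0;
    while (i + 3 <= len) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            if (open) {
                const std::size_t end = trimTrailingZeros(data, start, i);
                if (end > start) out.push_back({start, end - start});
            }
            start = i + 3;   // NALU begins right after the start code
            open = true;
            i += 3;
        } else {
            ++i;
        }
    }
    if (open) {              // last NALU runs to the end of the buffer
        const std::size_t end = trimTrailingZeros(data, start, len);
        if (end > start) out.push_back({start, end - start});
    }
    return out;
}
```

Each returned span then becomes the payload of one RTP packet, which is why the encoder has to keep every slice under the maximum payload size in the first place. The scan is a single linear pass over the output, so I would not expect it to account for the CPU usage.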
A CIF sized image maxes out the single thread and yields 10fps. HD720 is about 4fps and HD1080 is about 2fps.
That is seriously non-linear for a start. The HD stuff, I can sort of understand eating CPU for breakfast, but really, CIF should be able to do 30fps, with time to spare. Even in one thread.
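To put rough numbers on the non-linearity, here is a small sketch that converts the frame rates above into luma pixels encoded per second. It assumes the standard frame sizes (CIF 352x288, HD720 1280x720, HD1080 1920x1080); `pixelsPerSecond` is just an illustrative helper.

```cpp
#include <cassert>

// Luma pixels pushed through the encoder per second at a given
// resolution and measured frame rate.
inline double pixelsPerSecond(int width, int height, double fps)
{
    assert(width > 0 && height > 0 && fps >= 0.0);
    return static_cast<double>(width) * height * fps;
}

// With the figures quoted above:
//   CIF    352 x 288  at 10 fps -> 1,013,760 pixels/s
//   HD720  1280 x 720 at  4 fps -> 3,686,400 pixels/s
//   HD1080 1920 x 1080 at 2 fps -> 4,147,200 pixels/s
```

So the HD cases process roughly 3.6 to 4.1 times as many pixels per second as CIF on the same thread. If the cost were purely per pixel, CIF at the HD1080 throughput would come out at about 40fps.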
I have played with the num_slices and m_iThreads parameters, as well as resolutions and CBR bit rates, and nothing seems to make much difference.
Can anyone think of something I am doing wrong?
Oh yeah, this is on a relatively old i7, but I got 10 times this performance with my old code and IPP 5.3, two years ago.