How my H264 transcoding may achieve better performance

I've designed a module which processes an FLV stream in order to transcode its H.264 video stream.

In the search for better performance, I've started from the code found in sample_decode, and modified it to fit the needs of the original project.

When examining the performance, however, I found that the decoding does not outperform the original solution, which is based on FFmpeg using SSE2, MMX, etc.

I've spent some time analyzing the factors which might contribute to the lack of performance, but I would also appreciate advice from experienced fellow designers. I think the most detrimental factor is that the stream arrives from the network (i.e. it is not as abundantly available as when reading it from a file). Also, the transcoding application does not render anything: its task is merely to generate the transcoded stream, which is sent to other modules for further processing.
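One way to check whether waiting on the network, rather than the decoder itself, dominates the run time is to time the two phases separately. The following is only a minimal sketch under stated assumptions: `PhaseTimer` and its members are hypothetical names made up for illustration and are not part of the Media SDK, sample_decode, or FFmpeg.

```cpp
#include <chrono>

// Hypothetical helper (not from the Media SDK or FFmpeg): accumulates
// wall-clock time spent in two phases so they can be compared.
struct PhaseTimer {
    using Clock = std::chrono::steady_clock;

    double wait_s = 0.0;    // seconds spent blocked waiting for input data
    double decode_s = 0.0;  // seconds spent inside the decode step
    long frames = 0;        // number of timed decode steps

    // Run f() and add its duration to the input-wait total.
    template <class F>
    void timed_wait(F&& f) { wait_s += run(f); }

    // Run f() and add its duration to the decode total; counts one frame.
    template <class F>
    void timed_decode(F&& f) { decode_s += run(f); ++frames; }

    // Frames per second counting decode time only.
    double decode_fps() const {
        return decode_s > 0.0 ? frames / decode_s : 0.0;
    }

    // Frames per second counting both waiting and decoding.
    double overall_fps() const {
        const double total = wait_s + decode_s;
        return total > 0.0 ? frames / total : 0.0;
    }

private:
    // Measure how long a single call to f() takes, in seconds.
    template <class F>
    static double run(F&& f) {
        const auto t0 = Clock::now();
        f();
        return std::chrono::duration<double>(Clock::now() - t0).count();
    }
};
```

The idea would be to wrap the network read in `timed_wait` and the decode call, together with whatever synchronization is needed to obtain the finished frame, in `timed_decode`. If `decode_fps()` comes out much higher than `overall_fps()`, the input side is the likely bottleneck; if the two are close, the decode path itself deserves the attention.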

I've also run the GPA analysis toolset, and would appreciate an expert's insight into what the results mean.

The details of my machine (generated by the GPA tool):

Windows 7, 64-bit DEP enabled
Num Processors: 8
Memory: 8079MB
System BIOS: LENOVO 8BET46WW (1.26 ) (06/22/2011)
Video BIOS: Hardware Version 0.0
Driver 0:
     Device: Intel(R) HD Graphics Family
     Provider: Intel Corporation
     Date: 3-6-2011
     Version: 8.15.10.2321
     VendorId: 8086
     ProductId: 126 (Intel® HD Graphics 3000)
     Stepping: 9
     Supports GPA Instrumentation
GPA install directory: C:\Program Files\Intel\GPA\2012 R5\
GPA version: 12.5.187105
Current user is in Administrators group: YES
Current GPA 2012 R5 (12.5.187105)

The GPA log is coming soon.

How many FPS are you getting? My guess, though, is that you are right: it's a network performance issue.

Hi,

The sample applications (like sample_decode) are not optimized for performance, as their intent is to demonstrate the use of the API. There is a new tutorial that discusses performance optimizations, which you may find valuable:

http://software.intel.com/en-us/articles/intel-media-sdk-tutorial

-Tony

Thanks for the answers. I'm attaching the performance log in case it reveals some more details...

Attachments: myapplog.zip (2.2 MB)

@andy4us

I don't have an FPS problem, as I am transcoding a prerecorded stream and am not really under real-time constraints. So the answer is: I don't drop any frames.
