post AVX killer apps here


I was wondering which applications will profit most from the AVX extensions. There are some AES, fp and integer math operations. Please post algorithms, academic papers, forum discussions, benchmarks.

Besides the newly integrated media content creation engine, I'm wondering in particular how much the world's best H.264 encoder
http://compression.ru/video/codec_comparison/h264_2010/
will profit from the AVX extension.

There are ongoing discussions
http://forum.doom9.org/showthread.php?t=157714
http://forum.doom9.org/showthread.php?t=156761
http://x264dev.multimedia.cx/

and I'm interested in every resource you know of. Thanks.


All applications with a massive amount of computation should be able to benefit from wider SIMD units to some degree, e.g. (medical) image processing, finite element methods, Monte Carlo simulations, ...

Yes, it's possible for many applications with a certain degree of SIMD affinity to exploit the AVX instruction set. However this doesn't make it a killer app if it doesn't have a tangible impact.

Example: transcoding videos for one-time watching on a mobile phone such as Apple's iPhone.

Yes, you may accelerate the transcoding process by means of the AVX instruction set. However, it will most likely be slower than the integrated hardware AVC encoder of Sandy Bridge (400 fps):
http://www.anandtech.com/show/3922/intels-sandy-bridge-architecture-exposed/6
as well as slower than pure GPGPU transcoders such as
http://www.badaboomit.com/

It seems the AVX instruction set has little impact (read: not a killer app) on transcoding videos for mobile devices, although the transcoding algorithms can make use of SIMD.

But maybe there's a use for the AVX instruction set in HIGH QUALITY transcodes (likely NOT for mobile devices)? Intel has a write-up of SSE4-tuned SAD implementations:
http://software.intel.com/en-us/articles/motion-estimation-with-intel-streaming-simd-extensions-4-intel-sse4/
This suggests SSE4 is useful for motion estimation algorithms based on SAD.
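
For anyone who hasn't seen SAD (sum of absolute differences) written out, here is a minimal sketch for one 16x16 block of 8-bit pixels. Note that it is not the SSE4 code from the Intel article: the SIMD path uses the older SSE2 PSADBW instruction via `_mm_sad_epu8`, and it falls back to plain C when SSE2 is not available. The function names `sad16x16` and `sad16x16_scalar` and the stride parameters are my own choices for illustration and are not taken from x264 or from Intel's write-up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics, including _mm_sad_epu8 (PSADBW) */
#endif

/* Scalar reference: SAD over a 16x16 block of 8-bit pixels.
   The strides are the distances in bytes between consecutive rows. */
unsigned sad16x16_scalar(const uint8_t *cur, ptrdiff_t cur_stride,
                         const uint8_t *ref, ptrdiff_t ref_stride)
{
    assert(cur != NULL && ref != NULL);
    unsigned sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += (unsigned)abs((int)cur[x] - (int)ref[x]);
        cur += cur_stride;
        ref += ref_stride;
    }
    return sum;
}

/* Same result, but one PSADBW per row when SSE2 is available. */
unsigned sad16x16(const uint8_t *cur, ptrdiff_t cur_stride,
                  const uint8_t *ref, ptrdiff_t ref_stride)
{
#if defined(__SSE2__)
    assert(cur != NULL && ref != NULL);
    __m128i acc = _mm_setzero_si128();
    for (int y = 0; y < 16; y++) {
        /* Unaligned loads: each one fetches a full row of 16 pixels. */
        __m128i a = _mm_loadu_si128((const __m128i *)cur);
        __m128i b = _mm_loadu_si128((const __m128i *)ref);
        /* PSADBW leaves two partial sums, one in each 64-bit lane. */
        acc = _mm_add_epi64(acc, _mm_sad_epu8(a, b));
        cur += cur_stride;
        ref += ref_stride;
    }
    /* Add the low lane and the high lane; the total is at most 16*16*255. */
    return (unsigned)_mm_cvtsi128_si32(acc) +
           (unsigned)_mm_cvtsi128_si32(_mm_srli_si128(acc, 8));
#else
    return sad16x16_scalar(cur, cur_stride, ref, ref_stride);
#endif
}
```

An exhaustive motion search would call something like this once per candidate position, which is why a faster SAD only pays off if the encoder actually spends its time there.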

But it turns out these "simple" and exhaustive algorithms are not used by any of the more quality oriented transcoding options of e.g. x264
http://www.digital-digest.com/articles/x264_options_page6.html
New instruction sets only make operations faster if the actual underlying algorithms can make use of them. With SAD and SSE4 this hasn't shown up in practice, and any performance improvements were due to other changes, e.g.
http://x264dev.multimedia.cx/archives/51

However, since AVX not only brings new instructions but also widens the data path from 128-bit SSE to 256-bit AVX, it looks very promising for high quality H.264 encoding.
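
To make the wider data path concrete, here is a rough sketch of a loop that handles eight single-precision floats per iteration in 256-bit registers, where SSE would handle four in 128-bit registers. It assumes float data and a compiler that defines `__AVX__` (e.g. gcc or clang with `-mavx`); without that it compiles to the plain scalar loop. The function name `saxpy` is just the conventional name for this operation, not something from x264 or IPP.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__AVX__)
#include <immintrin.h> /* AVX intrinsics: __m256, _mm256_add_ps, ... */
#endif

/* y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(float a, const float *x, float *y, size_t n)
{
    assert(x != NULL && y != NULL);
    size_t i = 0;
#if defined(__AVX__)
    const __m256 va = _mm256_set1_ps(a); /* broadcast a into all 8 lanes */
    for (; i + 8 <= n; i += 8) {
        __m256 vx = _mm256_loadu_ps(x + i); /* 8 floats = 256 bits */
        __m256 vy = _mm256_loadu_ps(y + i);
        vy = _mm256_add_ps(_mm256_mul_ps(va, vx), vy);
        _mm256_storeu_ps(y + i, vy);
    }
#endif
    /* Scalar tail (or the whole loop when AVX is not enabled). */
    for (; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Whether this doubles throughput in practice depends on memory bandwidth and on how much of the real workload looks like this loop.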

I don't know much about the domain of medical image processing. Are those applications optimized for SSE at all?
And how about Monte Carlo simulations? If the practically relevant ones are only deployed on supercomputers, then AVX won't help.

Even the Intel Performance Primitives don't seem to be AVX-optimized yet, are they?
http://software.intel.com/en-us/articles/intel-ipp/#whats-new

Still looking for killer apps...

Actually I wasn't aware that AVX is basically floating point only, which makes it pretty useless for x264:
http://doom10.org/index.php?topic=514.msg3536#msg3536
However, some of the video encoding/decoding capabilities might turn out to be VERY useful for x264. Intel and the x264 devs are working together to get this out. The interesting stuff is in the IRC log:
http://doom10.org/index.php?topic=717.msg4585#msg4585
But this is completely unrelated to AVX (though related to Sandy Bridge in general).
