AVX2 application released

bronxzv wrote:

As part of the normal updates of our pure software realtime 3D engine

http://www.inartis.com/default.aspx

we have already included a new path for Haswell targets (using FMA and AVX2 instructions).

Thanks to the early support in the Intel compiler and the SDE, I was able to port and validate the code using FMA and the 256-bit packed integer instructions very quickly. A cool feature of the Intel C++ compiler is that legacy code using MUL + ADD intrinsics (such as _mm256_mul_ps / _mm256_add_ps) is compiled to FMA instructions wherever possible when built with the "/QxCORE-AVX2" flag. It's a great time saver, and we can keep exactly the same source code for all paths (legacy SSE and AVX, and the new FMA + AVX2). Also, since we use wrapper classes around intrinsics, the source code is still very readable. For example,

res = a*x + b*y + c;

is far more readable than if we had to introduce FMA functions such as

res = madd(a,x,madd(b,y,c));
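To make the comparison concrete, here is a minimal sketch of what such a wrapper class might look like. The type name F8, the helper forms_agree, and the scalar fallback are my assumptions for illustration, not the engine's actual code; only the _mm256_* intrinsics are real. The operator form compiles to plain MUL + ADD intrinsics, and whether those get fused into FMA instructions depends on the compiler and flags, as described above.

```cpp
#include <cassert>
#ifdef __AVX__
#include <immintrin.h>
#endif

// Hypothetical wrapper around eight packed floats (the name F8 is illustrative).
struct F8 {
#ifdef __AVX__
    __m256 v;
    explicit F8(__m256 r) : v(r) {}
    explicit F8(float s) : v(_mm256_set1_ps(s)) {}             // broadcast a scalar
    void store(float* out) const { _mm256_storeu_ps(out, v); }
#else
    float v[8];  // scalar fallback so the sketch also builds without AVX enabled
    explicit F8(float s) { for (float& e : v) e = s; }
    void store(float* out) const { for (int i = 0; i < 8; ++i) out[i] = v[i]; }
#endif
};

inline F8 operator*(const F8& a, const F8& b) {
#ifdef __AVX__
    return F8(_mm256_mul_ps(a.v, b.v));
#else
    F8 r(0.0f);
    for (int i = 0; i < 8; ++i) r.v[i] = a.v[i] * b.v[i];
    return r;
#endif
}

inline F8 operator+(const F8& a, const F8& b) {
#ifdef __AVX__
    return F8(_mm256_add_ps(a.v, b.v));
#else
    F8 r(0.0f);
    for (int i = 0; i < 8; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
#endif
}

// Explicit multiply-add, a*b + c. Uses the FMA intrinsic only when the
// compiler advertises FMA support; otherwise falls back to mul + add.
inline F8 madd(const F8& a, const F8& b, const F8& c) {
#if defined(__AVX__) && defined(__FMA__)
    return F8(_mm256_fmadd_ps(a.v, b.v, c.v));
#else
    return a * b + c;
#endif
}

// Evaluates both spellings on exactly representable inputs, so fused and
// unfused arithmetic give the same result, and compares every lane.
// Usage: assert(forms_agree());
inline bool forms_agree() {
    F8 a(2.0f), x(3.0f), b(4.0f), y(5.0f), c(1.0f);
    float r1[8], r2[8];
    (a * x + b * y + c).store(r1);            // operator form
    madd(a, x, madd(b, y, c)).store(r2);      // explicit FMA form
    for (int i = 0; i < 8; ++i)
        if (r1[i] != 27.0f || r2[i] != 27.0f) return false;
    return true;
}
```

With operators, the same expression serves every code path, while the madd spelling would have to be written out by hand wherever a fused operation is wanted.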

More optimization opportunities remain, for example using any-to-any permute and gather. I suppose I'll wait for the real chips for these.

Max Locktyukhin (Intel) wrote:

Thank you for your effort and the feedback, it is noticed with great pleasure!
