In my last blog, I introduced the concept of vectorization, which is parallelism across data elements in a regi
While writing the sample code for this post, I was amazed at how little effort it took to get a performance improvement of more than 20 times.
Hello and welcome to my blog. This is my first blog post.
This blog contains additional content for the article "Advanced Vectorization" from Parallel Universe #12:
In this blog I’ll try to show how to convert SSE4.2 assembly to AVX2 (using the schemes from the blog