If you recall where we left off in yesterday's post, we compiled a test program with gcc and saw this code for the 'working' part of a loop. (Yes, I will get to the Intel C++ compiler in the next post, but for now I'll stick with what I've got so we can take baby steps.)
ASM? You mean assembly language? I haven't looked at that since my senior project! How arcane! And compilers are so smart these days, why should I care?
I used to feel the same way, albeit with a latent desire to learn it, much as I wish I knew Latin. Then one day I found myself out of options on my SIMD code generation project. The compilers were great, but making progress was like building a ship in a bottle. I was playing a game I know you've played too: "Let's Guess What the Compiler Will Do"!
In January 2000, Intel published an optimized matrix library (4D single-precision matrix and vector classes) for use with the Pentium® III Streaming SIMD (Single Instruction, Multiple Data) Extensions, or SSE, in an article on www.gamasutra.com.
In my last blog, I introduced the concept of vectorization, which is parallelism across data elements in a register inside a single CPU core. It's a topic that I am very excited about this year, and in this blog I will expand on the subject to address what types of applications can take advantage of vectorization.
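To make "parallelism across data elements in a register" concrete, here is a minimal sketch in C. It is my own illustration, not code from any particular library: the function names `add_scalar` and `add_simd` are made up for this example, and the vector path assumes an x86 compiler that provides the SSE intrinsics in `<xmmintrin.h>`. Where SSE is not available, `add_simd` simply falls back to the scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_add_ps, ... */
#define HAVE_SSE 1
#endif

/* Scalar version: one float addition per loop iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Vector version (hypothetical name): four float additions per
 * iteration when SSE is available, using one 128-bit register
 * per operand. */
void add_simd(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    assert(a && b && out);
#ifdef HAVE_SSE
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* load 4 floats, unaligned OK */
        __m128 vb = _mm_loadu_ps(b + i);
        /* a single instruction produces all 4 sums */
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* leftover elements, or the whole array when SSE is unavailable */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Both functions compute the same result; the difference is only in how many elements each instruction touches. A good compiler may well turn `add_scalar` into something like `add_simd` on its own, which is exactly the guessing game that makes it worth knowing what the vector form looks like.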