One of my performance focus areas for this year is vectorization.
I've known this day was coming, but when I saw Knights Corner clearly sustaining a teraflop per second (DGEMM, across a wide range of block sizes), I was surprised by my own emotional reaction.
Real results for many-core processors illustrate the power of a familiar configuration (SMP) even when reduced to a single chip.
Experimental Cloud-based Ray Tracing Using Intel® MIC Architecture for Highly Parallel Visual Processing
The cloud is a game-changing factor in computing. Companies are offering a service in which the game itself runs on servers in the cloud. The server processes user interactions from the game client and sends back a compressed, rendered image to the user.
Intel Vectorization Toolkit: 4. Get Advice Using the Intel Compiler GAP Report and Toolkit Resources
Download OpenCL* Device Fission for CPU Performance [PDF 762KB]