Learn how to accurately measure events of short duration using the Enhanced Timer.
Avoid performance penalties associated with excessive software prefetching. Prefetch instructions are not completely free in terms of bus cycles, machine cycles, and other resources, even though they require minimal clocks and memory bandwidth. Excessive prefetching can degrade performance through issue penalties in the front end of the machine, resource contention in the memory subsystem, or both. The effect can be severe when the target loop is small, issue-bound, or both.
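One common way to limit prefetch overhead is to issue a single prefetch per cache line rather than one per element. The sketch below illustrates that idea under stated assumptions: a 64-byte cache line, 4-byte floats, and a GCC- or Clang-compatible compiler that provides `__builtin_prefetch` (on x86 the `_mm_prefetch` intrinsic from `<xmmintrin.h>` plays the same role). The function name `sum_with_prefetch` and the `lines_ahead` parameter are hypothetical and chosen only for illustration; a suitable prefetch distance has to be found by measurement on the target processor.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines and 4-byte floats. */
#define FLOATS_PER_LINE 16

/* Sums n floats, issuing at most one prefetch per cache line.
 * lines_ahead is the prefetch distance in cache lines (0 disables prefetching). */
float sum_with_prefetch(const float *data, size_t n, size_t lines_ahead)
{
    assert(data != NULL || n == 0);

    float total = 0.0f;
    size_t ahead = lines_ahead * FLOATS_PER_LINE;

    for (size_t base = 0; base < n; base += FLOATS_PER_LINE) {
        /* One prefetch per line, and only while the target is inside the array. */
        if (lines_ahead != 0 && base + ahead < n) {
            __builtin_prefetch(&data[base + ahead], 0, 3);
        }

        /* Consume the current line without any further prefetch instructions. */
        size_t end = (base + FLOATS_PER_LINE < n) ? base + FLOATS_PER_LINE : n;
        for (size_t i = base; i < end; ++i) {
            total += data[i];
        }
    }
    return total;
}
```

Calling `sum_with_prefetch(data, n, 0)` gives a baseline without prefetching, which makes it straightforward to check whether the prefetches help or hurt for a given loop.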
Detect characteristics of 32-bit Intel® architecture processors. Code that can take advantage of certain Intel® NetBurst™ microarchitecture performance-enhancing innovations, including Streaming SIMD Extensions (SSE) technology and Hyper-Threading Technology, must detect processor support for these technologies.
Rearrange (deswizzle) data from SoA (Structure of Arrays) format to AoS (Array of Structures) format. The deswizzle operation takes values held as xxxx, yyyy, zzzz and stores them in memory as xyz.
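The rearrangement can be illustrated with a scalar reference version, shown below as a sketch rather than the optimized SIMD routine the how-to refers to. It assumes three separate float arrays as the SoA input; the `vec3` type and the `deswizzle` function name are hypothetical. A SIMD implementation would produce the same memory layout using shuffle and unpack instructions.

```c
#include <assert.h>
#include <stddef.h>

/* One AoS element: the x, y, z components stored contiguously. */
typedef struct {
    float x;
    float y;
    float z;
} vec3;

/* Converts SoA input (xs, ys, zs) into AoS output (out), element by element. */
void deswizzle(const float *xs, const float *ys, const float *zs,
               vec3 *out, size_t count)
{
    assert(count == 0 || (xs != NULL && ys != NULL && zs != NULL && out != NULL));

    for (size_t i = 0; i < count; ++i) {
        out[i].x = xs[i];
        out[i].y = ys[i];
        out[i].z = zs[i];
    }
}
```

A scalar version like this is also useful as a correctness reference when validating a vectorized deswizzle.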
Implement the application-programming model for SSE3 Instructions. The application-programming environment for using SSE3 instructions is unchanged from that provided for Streaming SIMD Extensions (SSE) and Streaming SIMD Extensions 2 (SSE2).