Vectorize for Better Performance
Vectorization is the use of Single Instruction Multiple Data (SIMD) instructions (such as Intel® Advanced Vector Extensions and Intel® Advanced Vector Extensions 512) to operate on multiple data elements in parallel within a single CPU core. This can greatly increase performance by reducing loop overhead and making better use of the multiple math units in each core.
- Use Roofline Analysis to see performance "headroom" and to co-optimize memory and compute.
- Find loops that will benefit the most from better vectorization.
- Identify where it is safe to force compiler vectorization.
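As a rough illustration of what "safe to force" means, the sketch below contrasts a loop whose iterations are independent with one that carries a dependency between iterations. The function names `saxpy` and `prefix_sum` are hypothetical and chosen only for this example. It assumes a C99 compiler; the `#pragma omp simd` directive is standard OpenMP and typically takes effect only when OpenMP SIMD support is enabled (for example with `-fopenmp-simd` in GCC and Clang), otherwise it is ignored.

```c
#include <stddef.h>

/* Scaled vector add: y[i] = a * x[i] + y[i].
 * Each iteration touches only its own element, so iterations are independent.
 * `restrict` promises the compiler that x and y do not overlap, which removes
 * an assumed dependency that often blocks automatic vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    /* Asks the compiler to vectorize this loop without proving safety itself.
     * This is only correct because the iterations really are independent. */
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Counter-example: each iteration reads the value written by the previous
 * one (a loop-carried dependency). Forcing SIMD on this loop as written
 * could produce wrong results, so it is left as a plain scalar loop. */
void prefix_sum(size_t n, float *v)
{
    for (size_t i = 1; i < n; ++i) {
        v[i] += v[i - 1];
    }
}
```

Calling `saxpy(4, 2.0f, x, y)` on two non-overlapping four-element arrays updates `y` in place. If the two pointers could alias, the `restrict` promise would be broken and the forced vectorization would no longer be safe.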
Visualize actual performance against hardware-imposed performance ceilings (rooflines), such as memory bandwidth and compute capacity, which together provide a roadmap of potential optimization steps. This analysis highlights the loops that have the most headroom for improvement, allowing you to focus on the areas that deliver the biggest performance payoff.
Use the Roofline chart to answer key questions, including:
- Does the application make optimal use of the current memory and compute resources?
- If not, what bottlenecks are limiting performance?
- What are the best candidates for optimization?
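The questions above can be made concrete with the basic roofline formula from the original model: attainable performance is the smaller of the peak compute rate and the product of arithmetic intensity (FLOPs per byte moved) and peak memory bandwidth. The following is a minimal sketch of that arithmetic, not the tool's implementation. The function names and any peak values passed to them are hypothetical.

```c
/* Attainable performance in GFLOP/s under the basic roofline model:
 *   min(peak compute, arithmetic intensity * peak memory bandwidth). */
double roofline_gflops(double peak_gflops, double peak_gbytes_per_s,
                       double flops_per_byte)
{
    double memory_bound = flops_per_byte * peak_gbytes_per_s;
    return memory_bound < peak_gflops ? memory_bound : peak_gflops;
}

/* Arithmetic intensity at which the memory roof meets the compute roof.
 * Loops below this intensity are limited by memory bandwidth, loops above
 * it by compute capacity. */
double ridge_point(double peak_gflops, double peak_gbytes_per_s)
{
    return peak_gflops / peak_gbytes_per_s;
}

/* Fraction of the attainable ceiling a loop has not yet reached
 * (0.0 means the loop already sits on its roof). */
double headroom(double measured_gflops, double attainable_gflops)
{
    return 1.0 - measured_gflops / attainable_gflops;
}
```

For example, on a hypothetical machine with a 100 GFLOP/s compute peak and 50 GB/s of memory bandwidth, a loop with an arithmetic intensity of 0.5 FLOPs per byte is capped at 25 GFLOP/s by memory bandwidth, so raising its intensity (for instance through better data reuse) matters more than adding vector width. A loop at 4 FLOPs per byte sits past the ridge point of 2 and is capped by the 100 GFLOP/s compute roof instead.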
1 Roofline modeling was first proposed in 2009 by University of California at Berkeley researchers Samuel Williams, Andrew Waterman, and David Patterson in "Roofline: An Insightful Visual Performance Model for Multicore Architectures."
For the Cache-Aware Roofline Model (CARM), see Aleksandar Ilic, Frederico Pratas, and Leonel Sousa, "Cache-Aware Roofline Model: Upgrading the Loft," IEEE Computer Architecture Letters, volume 13, number 1 (January 2014), 21–24.