Introduction to Microarchitectural Optimization for Itanium® Processors

March 9, 2009 9:00 PM PDT



Introduction

This guide introduces software developers to the systematic use of Itanium® processor performance monitoring events to analyze the execution efficiency of their applications with the Intel® VTune™ Performance Analyzer. It gives a general overview of microarchitectural optimization using the performance counters, the analyzer, and the Intel® compilers. The intended readers are software developers who work on performance-critical applications, software development tools, device drivers, and operating systems.

The Itanium processor family relies on explicit scheduling to achieve the highest performance. Consequently, an essential part of software development for these processors is writing code in high-level languages and building it with the most advanced version of the compilers. This guide therefore focuses on optimizing code written in high-level languages. Optimizations developed for the Itanium processor should carry over to future generations, requiring at most a recompilation to incorporate future architectural advances. The microarchitectural optimization methodology applies equally well to hand-coded assembler, but hand-coded assembler should be avoided because a compiler cannot reschedule it for future architectures.

This edition of the guide discusses how to overcome the execution inefficiencies caused by a poor match between an algorithm and its data structures when data is read.
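As a rough illustration of such a mismatch, consider a loop that reads only one field of each record. The C sketch below is not taken from the guide; the `Record` type, its field names, and both function names are hypothetical, and the sizes are chosen only to make the effect easy to see. In the first layout each key is stored next to data the loop never uses, so most of the bytes fetched from memory are wasted. In the second layout the keys are stored contiguously, so every byte fetched is useful.

```c
#include <stddef.h>

/* Hypothetical record: one frequently read field stored next to
 * rarely used data. With typical alignment this struct is 64 bytes. */
typedef struct {
    double key;          /* the only field the loops below read */
    char   payload[56];  /* unrelated data stored alongside the key */
} Record;

/* Array-of-structures read: each 8-byte key load steps over
 * 56 bytes of payload that the algorithm never touches. */
double sum_keys_aos(const Record *recs, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += recs[i].key;
    }
    return sum;
}

/* Same algorithm over keys stored contiguously in their own array:
 * consecutive loads read adjacent memory. */
double sum_keys_soa(const double *keys, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += keys[i];
    }
    return sum;
}
```

Both functions compute the same result. Whether the second layout is actually faster for a given application depends on the data sizes and access patterns involved, which is the kind of question the performance monitoring events are meant to answer.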