One of my performance focus areas for this year is vectorization. I am excited to start creating more content and spreading the word about this technology, which has been somewhat underappreciated in the past. To kick things off, I am launching a blog series and a 1-hour overview webinar.
First, information about the webinar.
As part of my focus on software performance, I also support and consult on implementing scalable parallelism in applications. There are many reasons to implement parallelism and many methods for doing it, but this blog is not about either of those things. It is about the performance advantages of one particular way of implementing parallelism, one that, luckily, is supported by several of the available programming models.
Prepare applications for optimization on the Intel® Itanium® processor family. The first step toward high-performance code on Itanium-based systems is to get the code ported or written to run correctly in the 64-bit environment. It is not uncommon for code that functions correctly in a 32-bit environment to have latent bugs that are exposed when the code is moved to a 64-bit environment.
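One common class of latent 64-bit bug, offered here as an illustration rather than something taken from the article, is storing a pointer in an `int`. On typical 32-bit targets both are 4 bytes wide, so the code appears to work; on LP64 targets a pointer is 8 bytes and the value is truncated. The sketch below uses the standard `uintptr_t` type from `<stdint.h>`; the function names `roundtrip_uintptr` and `int_can_hold_pointer` are hypothetical helpers invented for this example.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: round-trips a pointer through an integer type
 * that is wide enough to hold it (uintptr_t), which is the portable
 * alternative to casting a pointer to int. */
void *roundtrip_uintptr(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* wide enough for a pointer */
    return (void *)bits;
}

/* Hypothetical helper: returns 1 if unsigned int is at least as wide
 * as a pointer on this platform, 0 otherwise. A result of 0 means any
 * code that stores pointers in int will truncate them. */
int int_can_hold_pointer(void)
{
    return sizeof(unsigned int) >= sizeof(void *);
}
```

Checking `int_can_hold_pointer()` on both the 32-bit and 64-bit build is one quick way to see whether pointer-in-`int` code is at risk on a given target.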
Measure the time a program and its functions take to execute as part of the diagnosis phase of performance optimization. Such measurements are extremely valuable as a simple means to become familiar with how an application behaves during execution.
Use either the Linux time command or the clock function in the C library, and profile the application during compilation. The time command is used as follows:
It gives the following information:
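As a rough sketch of the two approaches: running `time ./app` from a shell (where `./app` is a hypothetical program name) typically reports real (wall-clock), user, and sys times. The in-program alternative with `clock` might look like the following. The workload `sum_to` and the wrapper `time_sum_to` are made-up names for illustration, and note that `clock` measures processor time used by the program rather than wall-clock time.

```c
#include <time.h>

/* Hypothetical workload: sums the integers 0 .. n-1. */
unsigned long long sum_to(unsigned long long n)
{
    unsigned long long total = 0;
    for (unsigned long long i = 0; i < n; ++i)
        total += i;
    return total;
}

/* Hypothetical wrapper: runs sum_to(n) once, stores its result in *out
 * (if out is non-NULL), and returns the processor time consumed by the
 * call, in seconds, as measured by clock(). */
double time_sum_to(unsigned long long n, unsigned long long *out)
{
    clock_t start = clock();
    unsigned long long result = sum_to(n);
    clock_t end = clock();

    if (out)
        *out = result;

    /* CLOCKS_PER_SEC converts clock ticks to seconds. */
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A caller would invoke `time_sum_to` with a problem size large enough for the elapsed time to exceed the resolution of `clock`, and print the returned value. Short calls may legitimately report 0 seconds.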
Identify the root cause of a back-end processor bubble on the Intel® Itanium® processor. A separate item, How to Identify Back-End Bubbles on 64-Bit Intel® Architecture, shows how to use the Intel® VTune™ Performance Analyzer to identify a bubble. In order to resolve this performance issue, the root cause of the bubble must be determined.
Identify a processor back-end bubble on the Intel® Itanium® processor. A 'bubble' is defined as any delay in the processor. The 'back end' is the place where instructions are retired when they are complete. There are five main causes of bubbles in the Itanium 2 processor: