Intel® Architecture Code Analyzer


Last Modified On :   April 1, 2009 2:30 PM PDT



Product Overview

The Intel® Architecture Code Analyzer helps you quickly analyze code that uses Intel® Advanced Vector Extensions (Intel® AVX) before processors with these instructions are actually available. This analysis lets you experiment with Intel AVX code and compare it to an implementation based on Intel® Streaming SIMD Extensions (Intel® SSE).

Features and Benefits

For a given kernel (Intel AVX or legacy code), Intel® Architecture Code Analyzer analysis includes:

  • Identifying the binding of the kernel instructions to the processor ports under ideal front-end, out-of-order engine and memory hierarchy conditions.
  • Identifying the number of cycles for which each instruction binds the ports.
  • Performing static analysis of throughput and latency cycle counts.

The Intel® Architecture Code Analyzer does not assume a specific Intel processor that implements the Intel AVX instruction set. Instead, it models the ports, functional units, first level cache latencies, and instruction throughputs and latencies of a possible hardware implementation.

Among other things the modeled processor has:

  • One divide unit attached to port 0.
  • Two 128-bit load ports (2 and 3), each with an Address Generation Unit (AGU) attached to it.
  • One 128-bit store port (port 4).
  • First level cache latencies in a range between 5 and 8 cycles.

Intel® Architecture Code Analyzer is a command line tool with ASCII output. It handles a single basic block that is marked for analysis within an executable, a shared library, or an object file.

Intel® Architecture Code Analyzer output presents:

  • The throughput and latency of the analyzed instruction block
  • The bottleneck resource: front-end, port #, or the divider unit
  • A port binding cycle summary
  • A detailed report on the port binding of each instruction and the number of cycles the port was bound.

Technical Requirements

Intel® Architecture Code Analyzer has been developed and validated on the Microsoft® Windows* XP operating system running on an Intel Core 2 Duo processor. It is expected to run on other Microsoft operating systems and Intel hardware as well, although these configurations have not been validated.

To generate a binary file with Intel AVX instructions, you can use either the latest open source YASM tool with Intel AVX support or the Intel Compiler Beta, which already supports Intel AVX (see the FAQ section for details).

To execute your Intel AVX code functionally, you can use iSDE, another Intel tool posted on whatif.intel.com that supports Intel AVX.

Frequently Asked Questions

Q1: Is there a Linux version of the tool?

A1: Yes. Starting with version 1.1, a Linux version of the tool is also available.

Q2: Is there a well-defined interface to this technology for integration into other tools?

A2: Yes, but there are no plans to release this interface at this time. Please contact us to discuss your requirements.

Q3: How accurate is the tool?

A3: The tool estimates the performance of a kernel, assuming it is a loop body executed for multiple iterations. It ignores potential performance limiters within modern processors and therefore gives an optimistic assessment of code performance. You will need to rerun your code once an Intel® processor with Intel® AVX support becomes available to measure its true performance on that hardware.

Q4: Is there a version of Intel Compiler available that supports Intel AVX?

A4: The 11.1 version of the Intel Compiler that supports the Intel AVX instructions is currently in Beta. You may apply to participate in this 11.1 Beta program by email to beta_request. In the email please include your name, email address, company, and reason for your request.

Please visit the Intel® Architecture Code Analyzer Forum and share your thoughts.  Questions about Intel® AVX and CPU instructions can be posted to the Intel® AVX and CPU Instructions Forum.

Primary Technical Contacts

Tal Uliel has been working on Intel® AVX since he started at Intel in summer 2007. Tal has developed many kernels written in Intel AVX and Intel SSE and analyzed their performance. He then went on to develop the Intel® Architecture Code Analyzer to get quick feedback while optimizing his code.

Release Notes for 1.1.3
The following features were added for 1.1.3

  • Fixed a bug where using the -o option produced truncated output.
  • Fixed the IACA_UD_BYTES definition in iacaMarks.h to include {}.

Please take a look at the updated User's Guide (PDF) for more info.

Release Notes for 1.1.2
The following features were added for 1.1.2

  • Intel® Architecture Code Analyzer now supports adding START and END marks in code compiled with the Visual C++ compiler (64-bit). See iacaMarks.h.
  • Intel® Architecture Code Analyzer now supports multiple block analysis. You can direct the tool to analyze the nth block that is delimited with analyzer marks. When used with n=0, all marked blocks in the file are analyzed and the output contains a separate report per block.

Release Notes for 1.1.1
The following features were added for 1.1.1 

  • Fixed incorrect identification of Intel® AVX zero idiom instructions
  • Fixed a crash of the analyzer on empty code blocks (containing only zero idiom instructions or unsupported instructions)
  • Fixed the analyzer's -arch nehalem option to treat AES and PCLMUL instructions as illegal. These instructions aren't supported on Intel® microarchitecture, codenamed Nehalem.
  • Changed analyzer marks to abort if the binary is executed. To deactivate the marks when building for execution, #define IACA_MARKS_OFF or use the -DIACA_MARKS_OFF option on the compiler command line. Binaries with active marks should be used for analysis only.

Release Notes for 1.1
The following features were added for 1.1 

  • Intel® Architecture Code Analyzer is now hosted on Linux* operating systems, in addition to Windows* operating systems. Both IA-32 and Intel® 64 operating systems are supported.
  • Intel® Architecture Code Analyzer now supports two existing Intel® microarchitectures, codenamed Nehalem and Westmere
  • Two critical path types are detected:
    • DATA_DEPENDENCY critical path (similar to previous releases - reflects instruction data dependencies only)
    • PERFORMANCE critical path (new - reflects port conflicts and front-end pressure, as well)

Release Notes for 1.0.2
The following features were added for 1.0.2

  • Ignoring the pop ebx / push ebx pair that Intel® Architecture Code Analyzer marks add to IA-32 code
  • Fixed misclassification of rcp / rsqrt as divider operations

Release Notes for 1.0.1
The following features were added for 1.0.1

  • Graceful handling of unsupported instructions: they are quietly ignored in the analysis of the marked block and do not affect the throughput and latency calculations.
  • A few unsupported instructions are now supported, e.g. CMOV instruction family
  • Intel® AVX to Intel® SSE code switch detection. The performance penalty associated with such code switch is noted but not accounted for.