ISA Extensions Intel AVX

ISA Extensions

Intel’s Instruction Set Architecture (ISA) continues to evolve and expand in functionality, enrich user experience, and create synergy across industries.

Intel® Advanced Vector Extensions (Intel® AVX)

The need for greater computing performance continues to grow across industry segments. To support rising demand and evolving usage models, we continue our history of innovation with the Intel® Advanced Vector Extensions (Intel® AVX) in products today.

Intel® AVX is a 256-bit instruction set extension to Intel® SSE, designed for applications that are floating-point (FP) intensive. It was released in early 2011 as part of the second-generation Intel® Core™ processor family and is present in platforms ranging from notebooks to servers. Intel AVX improves performance through wider vectors, a new extensible syntax, and rich functionality. Intel AVX2 was released in 2013 with the fourth-generation Intel® Core™ processor family and further extends vector processing capability across floating-point and integer data domains. This results in higher performance and more efficient data management across a wide range of applications, such as image and audio/video processing, scientific simulations, financial analytics, and 3D modeling and analysis.

 

Intel® Advanced Vector Extensions 512 (Intel® AVX-512)

In the future, some new products will feature a significant leap to 512-bit SIMD support. Programs can pack eight double-precision or sixteen single-precision floating-point numbers within the 512-bit vectors, as well as eight 64-bit or sixteen 32-bit integers. A single instruction can therefore process twice as many data elements as Intel AVX/AVX2 and four times as many as Intel SSE.
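The element counts quoted above follow from dividing the register width by the element width. The following is a minimal sketch in plain C that only does this arithmetic; the names `lanes`, `SSE_BITS`, `AVX_BITS`, and `AVX512_BITS` are hypothetical and are not part of any Intel API.

```c
#include <assert.h>
#include <stddef.h>

/* Vector register widths, in bits, for the extensions named in the text.
   These constant names are illustrative assumptions, not Intel identifiers. */
enum { SSE_BITS = 128, AVX_BITS = 256, AVX512_BITS = 512 };

/* Number of elements of width element_bits that fit in one vector register
   of width vector_bits. */
size_t lanes(size_t vector_bits, size_t element_bits)
{
    assert(element_bits > 0 && vector_bits % element_bits == 0);
    return vector_bits / element_bits;
}
```

For example, `lanes(AVX512_BITS, 64)` evaluates to 8 and `lanes(AVX512_BITS, 32)` to 16, which is twice the corresponding Intel AVX counts and four times the Intel SSE counts.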

Intel AVX-512 instructions are important because they open up higher performance capabilities for the most demanding computational tasks. They offer the highest degree of compiler support, thanks to an unprecedented richness in the design of their capabilities.

Intel AVX-512 features include 32 vector registers, each 512 bits wide, and eight dedicated mask registers. Intel AVX-512 is a flexible instruction set that includes support for broadcast, embedded masking to enable predication, embedded floating-point rounding control, embedded floating-point fault suppression, scatter instructions, high-speed math instructions, and compact representation of large displacement values.
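To make broadcast and mask-based predication concrete, here is a scalar model in portable C. It is a sketch of the semantics only, not AVX-512 code: the function names `broadcast`, `masked_add_merge`, and `masked_add_zero` are hypothetical, and a 16-bit integer stands in for a mask register covering sixteen 32-bit lanes. On real hardware, comparable operations are exposed through compiler intrinsics such as `_mm512_mask_add_ps` (merge-masking) and `_mm512_maskz_add_ps` (zero-masking).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* sixteen 32-bit lanes in one 512-bit vector */

/* Broadcast: copy one scalar into every lane of the vector. */
void broadcast(float dst[LANES], float value)
{
    assert(dst != NULL);
    for (int i = 0; i < LANES; i++)
        dst[i] = value;
}

/* Merge-masking: lanes whose mask bit is 1 receive a[i] + b[i];
   lanes whose mask bit is 0 keep the value already stored in dst. */
void masked_add_merge(float dst[LANES], uint16_t mask,
                      const float a[LANES], const float b[LANES])
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (int i = 0; i < LANES; i++)
        if ((mask >> i) & 1u)
            dst[i] = a[i] + b[i];
}

/* Zero-masking: lanes whose mask bit is 1 receive a[i] + b[i];
   lanes whose mask bit is 0 are set to zero. */
void masked_add_zero(float dst[LANES], uint16_t mask,
                     const float a[LANES], const float b[LANES])
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (int i = 0; i < LANES; i++)
        dst[i] = ((mask >> i) & 1u) ? a[i] + b[i] : 0.0f;
}
```

Each mask bit controls one lane, so a loop body guarded by a condition can be vectorized by computing the condition into a mask and applying the operation only where the bit is set.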

Intel AVX-512 offers a level of compatibility with Intel AVX that is stronger than in prior transitions to new SIMD widths. Unlike Intel SSE and Intel AVX, which cannot be mixed without performance penalties, Intel AVX and Intel AVX-512 instructions can be mixed without penalty. Intel AVX registers YMM0–YMM15 map into Intel AVX-512 registers ZMM0–ZMM15 (in x86-64 mode), very much like Intel SSE registers map into Intel AVX registers. Therefore, in processors with Intel AVX-512 support, Intel AVX and Intel AVX2 instructions operate on the lower 128 or 256 bits of the first 16 ZMM registers.
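The register mapping can be pictured as narrower views onto the low bytes of one wide register. The sketch below models a single ZMM register as 64 bytes in plain C. The type `zmm_model` and the functions `read_low` and `write_low_vex` are hypothetical names for illustration. The zeroing of the upper bytes on a narrower write is an assumption drawn from the documented behavior of VEX-encoded instructions, not something stated in the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Register view sizes in bytes: 128-bit XMM, 256-bit YMM, 512-bit ZMM. */
enum { XMM_BYTES = 16, YMM_BYTES = 32, ZMM_BYTES = 64 };

/* Hypothetical model of one 512-bit register as 64 raw bytes. */
typedef struct { uint8_t bytes[ZMM_BYTES]; } zmm_model;

/* Reading XMMn or YMMn returns the low 16 or 32 bytes of ZMMn. */
void read_low(const zmm_model *reg, uint8_t *out, size_t nbytes)
{
    assert(reg != NULL && out != NULL && nbytes <= ZMM_BYTES);
    memcpy(out, reg->bytes, nbytes);
}

/* Model of an AVX/AVX2 (VEX-encoded) write: the low nbytes are written
   and the remaining upper bytes of the register are cleared (assumption
   noted in the lead-in). */
void write_low_vex(zmm_model *reg, const uint8_t *src, size_t nbytes)
{
    assert(reg != NULL && src != NULL && nbytes <= ZMM_BYTES);
    memcpy(reg->bytes, src, nbytes);
    memset(reg->bytes + nbytes, 0, ZMM_BYTES - nbytes);
}
```

In this model, writing 32 bytes through `write_low_vex` and then reading 16 bytes through `read_low` returns the first half of what was written, mirroring how XMM, YMM, and ZMM names refer to overlapping storage.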

More information about Intel AVX-512 instructions can be found in the blog "AVX-512 Instructions". The instructions are documented in the Intel® Architecture Instruction Set Extensions Programming Reference (PDF) (see the "Get Started" tab on this page).

Compiling for the Intel® Xeon Phi™ processor and the Intel® AVX-512 ISA This document briefly gives an overview of the Intel® Advanced Vector Extensions 512 (Intel® AVX-512) and shows different ways to build an application for the Intel® Xeon Phi™ processor x200 using the Intel® compiler.
Celebrating a Decade of Parallel Programming with Intel® Threading Building Blocks (Intel® TBB) This year marks the tenth anniversary of Intel® Threading Building Blocks (Intel® TBB). It’s a good time to look back over those 10 years to see where Intel TBB started, how far it’s come, and how successful it’s been in addressing the needs of developers.
Migrating Applications from Knights Corner to Knights Landing Self-Boot Platforms While there are many different programming models for the Intel® Xeon Phi™ coprocessor (code-named Knights Corner (KNC)), this paper lists the more prevalent KNC programming models and further discusses some of the necessary changes to port and optimize KNC models for the Intel® Xeon Phi™ processor...
Vectorize or Performance Dies: Tune for the Latest Intel® AVX SIMD Instructions―Even Without the Latest Hardware Vectorization is critical to achieving the full performance potential of modern processors. Intel® Advisor’s vectorization advisor prioritizes loops for vectorization, gives you crucial optimization data (like data dependencies, trip counts, and memory access patterns), and now helps you...

Webcast: Parallel computing on Intel® Architecture Regulatory pressures, which increase the amount of required computing, and the need to improve operational efficiency and competition, which require faster computing, are among the drivers that incent financial institutions to pursue significant improvements in computational efficiencies. As...
Can You Write a Vectorized Reduction Operation? I can. And if you read this post, you will be able to write one, too. (Might be a cool party trick or a sucker bet to make a little cash.)
Fast Computation of Huffman Codes The generation of Huffman codes is used in many applications, among them the DEFLATE compression algorithm. The classical way to compute these codes uses a heap data structure. This approach is fairly efficient, but traditional software implementations contain lots of branches that are data-...
Intel® C++ Compiler 17.0 Release Notes This page provides links to the current Release Notes for the Intel® C++ Compiler 17.0 component of Intel® Parallel Studio XE 2017 for Windows*, Linux* and OS X*.  To get product updates, log in to the Intel® Software Development Products Registration Center. For questions or technical support,...
Intel® IPP Functions Optimized for Intel® Advanced Vector Extensions 2 (Intel® AVX2) List of Intel IPP functions optimized for processor code name Haswell and Skylake
Whatever the Weather: The Intel Five Step Framework for Code Modernization Weather forecasting is a crucial aspect of modern life, enabling efficient planning and logistics, while also protecting life and property through timely warnings of severe conditions. But accurate, long-range weather prediction is extremely complex, often involving enormous data sets and requiring...