ISA Extensions

Intel’s Instruction Set Architecture (ISA) continues to evolve, expanding functionality, enriching the user experience, and creating synergy across industries.

Intel® Advanced Vector Extensions (Intel® AVX)

The need for greater computing performance continues to grow across industry segments. To support rising demand and evolving usage models, we continue our history of innovation with Intel® Advanced Vector Extensions (Intel® AVX), available in products today.

Intel® AVX is a 256-bit instruction set extension to Intel® SSE, designed for applications that are floating-point (FP) intensive. It was released in early 2011 as part of the second-generation Intel® Core™ processor family and is present in platforms ranging from notebooks to servers. Intel AVX improves performance through wider vectors, a new extensible syntax, and rich functionality. Intel AVX2 was released in 2013 with the fourth-generation Intel® Core™ processor family and further extends vector processing capability across floating-point and integer data domains. The result is higher performance and more efficient data management across a wide range of applications, such as image and audio/video processing, scientific simulations, financial analytics, and 3D modeling and analysis.
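
As a rough illustration of what "wider vectors" means in practice, the sketch below adds two float arrays eight elements at a time using 256-bit Intel AVX intrinsics. It is a minimal sketch, not a reference implementation: the names add_f32, add_f32_avx, and AVX_SKETCH_X86 are hypothetical, and the code assumes a GCC- or Clang-compatible compiler (it relies on the target function attribute and __builtin_cpu_supports). On non-x86 targets, or on processors without Intel AVX, it falls back to a plain scalar loop.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define AVX_SKETCH_X86 1 /* hypothetical macro: x86 intrinsics are available */
#endif

#ifdef AVX_SKETCH_X86
/* Compiled with AVX enabled for this one function (GCC/Clang extension). */
__attribute__((target("avx")))
static void add_f32_avx(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    /* Each 256-bit register holds eight 32-bit floats. */
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    /* Scalar tail for the remaining 0..7 elements. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
#endif

/* Element-wise out[i] = a[i] + b[i], using AVX when the CPU reports it. */
void add_f32(const float *a, const float *b, float *out, size_t n)
{
#ifdef AVX_SKETCH_X86
    if (__builtin_cpu_supports("avx")) {
        add_f32_avx(a, b, out, n);
        return;
    }
#endif
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Calling add_f32(a, b, out, n) gives the same result on either path; the runtime check only decides whether eight additions are issued per instruction or one.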

 

Intel® Advanced Vector Extensions 512 (Intel® AVX-512)

In the future, some new products will feature a significant leap to 512-bit SIMD support. Programs can pack eight double-precision or sixteen single-precision floating-point numbers, or eight 64-bit or sixteen 32-bit integers, within a 512-bit vector. This allows a single instruction to process twice as many data elements as Intel AVX/AVX2 and four times as many as Intel SSE.
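
The element counts above follow directly from dividing the register width by the element width. The short sketch below spells out that arithmetic; the function name lanes and the enum constants are hypothetical names chosen for illustration only.

```c
#include <assert.h>

/* Register widths in bits for each extension discussed in the text. */
enum { SSE_BITS = 128, AVX_BITS = 256, AVX512_BITS = 512 };

/* Number of elements of element_bits width that fit in one vector register. */
int lanes(int vector_bits, int element_bits)
{
    assert(element_bits > 0 && vector_bits % element_bits == 0);
    return vector_bits / element_bits;
}
```

For example, lanes(AVX512_BITS, 64) is 8 and lanes(AVX512_BITS, 32) is 16, while the same calls with AVX_BITS give 4 and 8, and with SSE_BITS give 2 and 4.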

Intel AVX-512 instructions are important because they open up higher performance capabilities for the most demanding computational tasks. Intel AVX-512 instructions offer the highest degree of compiler support by including an unprecedented level of richness in the design of the instruction capabilities.

Intel AVX-512 features include 32 vector registers, each 512 bits wide, and eight dedicated mask registers. Intel AVX-512 is a flexible instruction set that includes support for broadcast, embedded masking to enable predication, embedded floating-point rounding control, embedded floating-point fault suppression, scatter instructions, high-speed math instructions, and compact representation of large displacement values.
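
To make "embedded masking to enable predication" more concrete, the sketch below models a masked 16-lane single-precision add in plain scalar C. It is a conceptual model only, not Intel AVX-512 code: the function names and the LANES constant are hypothetical, and each bit of the 16-bit mask stands in for one lane of a mask register. The two variants correspond to the merge-masking and zero-masking forms, which the intrinsics _mm512_mask_add_ps and _mm512_maskz_add_ps expose on real hardware.

```c
#include <stdint.h>

enum { LANES = 16 }; /* sixteen 32-bit floats per 512-bit vector */

/* Merge-masking: lanes whose mask bit is 0 keep the old destination value. */
void masked_add_merge(float dst[LANES], const float a[LANES],
                      const float b[LANES], uint16_t mask)
{
    for (int i = 0; i < LANES; ++i) {
        if ((mask >> i) & 1u)
            dst[i] = a[i] + b[i];
    }
}

/* Zero-masking: lanes whose mask bit is 0 are set to zero. */
void masked_add_zero(float dst[LANES], const float a[LANES],
                     const float b[LANES], uint16_t mask)
{
    for (int i = 0; i < LANES; ++i)
        dst[i] = ((mask >> i) & 1u) ? a[i] + b[i] : 0.0f;
}
```

With mask 0x0005, only lanes 0 and 2 receive a sum; the other lanes either keep their previous contents (merge) or become zero. On hardware the whole operation is a single instruction, so a conditional per element does not require a branch.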

Intel AVX-512 offers a level of compatibility with Intel AVX that is stronger than in prior transitions to new SIMD widths. Unlike Intel SSE and Intel AVX, which cannot be mixed without performance penalties, Intel AVX and Intel AVX-512 instructions can be mixed without penalty. Intel AVX registers YMM0–YMM15 map into Intel AVX-512 registers ZMM0–ZMM15 (in x86-64 mode), very much like Intel SSE registers map into Intel AVX registers. Therefore, in processors with Intel AVX-512 support, Intel AVX and Intel AVX2 instructions operate on the lower 128 or 256 bits of the first 16 ZMM registers.
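
The register mapping can be pictured as one 512-bit storage location viewed at three widths. The sketch below is a conceptual model under that assumption, not a description of the hardware: vec_reg_model and write_ymm are hypothetical names, and floats stand in for raw register bits. It also reflects one detail not stated in the text above, namely that VEX-encoded Intel AVX instructions zero the bits above the width they write.

```c
#include <string.h>

/* One register viewed at three widths; all views share the low lanes. */
typedef union {
    float zmm[16]; /* full 512-bit view */
    float ymm[8];  /* low 256 bits      */
    float xmm[4];  /* low 128 bits      */
} vec_reg_model;

/* Model a 256-bit AVX write: fill the low eight lanes, zero the rest. */
void write_ymm(vec_reg_model *r, const float v[8])
{
    memcpy(r->ymm, v, sizeof r->ymm);
    for (int i = 8; i < 16; ++i)
        r->zmm[i] = 0.0f;
}
```

After write_ymm, reading r->xmm[3], r->ymm[3], or r->zmm[3] yields the same value, which is the sense in which YMM registers "map into" ZMM registers.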

More information about Intel AVX-512 instructions can be found in the blog "AVX-512 Instructions". The instructions are documented in the Intel® Architecture Instruction Set Extensions Programming Reference (PDF) (see the "Get Started" tab on this page).
