558 Search Results

Best of Modern Code for April

More vectorization, optimization and transformations for you this month.

Top Ten Intel® Software Developer Stories for April

Build a custom classifier to identify leukemia. We also learn how Intel and Facebook are collaborating to boost PyTorch* CPU performance.

Code Sample: Intel® AVX512-Deep Learning Boost: Intrinsic Functions

How developers can use intrinsic functions to take advantage of the new Intel® AVX512-Deep Learning Boost (Intel® AVX512-DL Boost) instructions.

Second Generation Intel® Xeon® Processor Scalable Family Technical Overview

New features and enhancements available in the second generation Intel® Xeon® processor Scalable family and how developers can take advantage of them

Using the Intel® MPI Library in Google Cloud Platform*

Published on March 28, 2019 By Fabio B.

This article describes how to download and install the Intel® MPI Library on Google Cloud Platform* (GCP), which lets you run MPI workloads on the cloud service provider. It also highlights the technique and the process to build a...

Cache Blocking Techniques

Published on November 7, 2013, updated March 26, 2019 By AmandaS

Cache Blocking Techniques Overview

An important class of algorithmic changes involves blocking data structures to fit in cache. By organizing data memory accesses, one can load the cache with a small subset of a much larger data set. The idea is then to work on this block of data...
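As a minimal sketch of the blocking idea described in this snippet, the C function below transposes a square matrix one tile at a time so that each tile of the source and destination can stay in cache while it is being worked on. The function name `transpose_blocked`, the helper `min_size`, and the tile size `BLOCK` are hypothetical choices made for illustration; they are not taken from the article, and a suitable tile size depends on the cache sizes of the target processor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge length; tune it for the target cache. */
#define BLOCK 32

/* Returns the smaller of two sizes (used to clip tiles at the matrix edge). */
static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Transposes the n x n row-major matrix src into dst, visiting the data
 * in BLOCK x BLOCK tiles instead of sweeping whole rows and columns. */
void transpose_blocked(size_t n, const double *src, double *dst)
{
    assert(src != dst); /* this sketch is out-of-place only */

    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_size(ii + BLOCK, n);
        for (size_t jj = 0; jj < n; jj += BLOCK) {
            size_t j_end = min_size(jj + BLOCK, n);
            /* Work on one tile: both the rows of src and the rows of dst
             * touched here form a small subset of the full matrices. */
            for (size_t i = ii; i < i_end; ++i) {
                for (size_t j = jj; j < j_end; ++j) {
                    dst[j * n + i] = src[i * n + j];
                }
            }
        }
    }
}
```

The two outer loops pick a tile and the two inner loops do all the work on that tile before moving on, which is the general pattern; the same structure applies to other loop nests where a block of data is reused.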

Memory Layout Transformations

Published on November 25, 2013, updated March 26, 2019 By AmandaS

Memory Layout Transformations Overview

This chapter examines a useful user code transformation: moving from data organized in an Array of Structures (AoS) to an organization of Structure of Arrays (SoA). This transformation allows the compiler to access data more efficiently on the processor....
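The AoS-to-SoA transformation can be sketched in C as follows. The types `point_aos` and `points_soa`, the capacity `N_POINTS`, and the functions `aos_to_soa` and `sum_x` are hypothetical names introduced for this illustration and do not come from the chapter. The point of the sketch is that in the SoA layout each field is stored contiguously, so a loop over one field reads memory with unit stride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed capacity for the SoA container. */
#define N_POINTS 1024

/* Array of Structures: the x, y, z of one point sit next to each other. */
struct point_aos {
    float x, y, z;
};

/* Structure of Arrays: all x values are contiguous, then all y, then all z. */
struct points_soa {
    float x[N_POINTS];
    float y[N_POINTS];
    float z[N_POINTS];
};

/* Copies n points from the AoS layout into the SoA layout. */
void aos_to_soa(const struct point_aos *in, struct points_soa *out, size_t n)
{
    assert(n <= N_POINTS);
    for (size_t i = 0; i < n; ++i) {
        out->x[i] = in[i].x;
        out->y[i] = in[i].y;
        out->z[i] = in[i].z;
    }
}

/* Sums the x field; in SoA form this loop walks one contiguous array. */
float sum_x(const struct points_soa *p, size_t n)
{
    assert(n <= N_POINTS);
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        total += p->x[i];
    }
    return total;
}
```

In the AoS layout the same sum would step over the unused `y` and `z` fields of every element, which is the kind of strided access the transformation is meant to avoid.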

Vectorization Toolkit

Published on May 14, 2012, updated March 25, 2019 By AmandaS

A toolkit that gives 6 Steps to Increase Performance Through Vectorization in Your Application

Vectorization and Optimization Reports

Published on September 6, 2012, updated March 25, 2019 By Ronald W Green

Optimization reports from the Intel® compilers guide the developer with optimization details

Getting the Most out of your Intel® Compiler with the New Optimization Reports

Published on October 8, 2014, updated March 25, 2019 By Martyn Corden

Intel compiler optimization reports guide the developer to performance improvements

Optimizing Applications using Intel® Compiler for Intel® Xeon Processors

Published on October 13, 2015, updated March 25, 2019 By AmandaS

The key to performance measurement is two-fold: know exactly what you are measuring, and collect your baseline data. Next, profile your application and identify a specific and realistic performance goal based on the profiling data. Follow these steps to optimize your software.

Vectorization Essentials

Published on December 6, 2013, updated March 22, 2019

Vectorization essentials for effectively using features of the Intel® Xeon product family

Memorable Persistent Memory Articles from Intel for March

This month: handle memory errors and extend memory capacity.

Best of Modern Code for March

Learn to pinpoint issues and speed up your application with Intel® Advisor this month.

Random Number Function Vectorization

Published on September 7, 2012, updated March 8, 2019

Auto-vectorization of random number functions is supported

Avoid Manual Loop Unrolling

Published on September 9, 2012, updated March 8, 2019

Generate efficient vectorized code when a loop structure is not manually unrolled

Top Ten Intel® Software Developer Stories for March

See how deep learning is used to match jobs with candidates and how computer vision can identify flowers. Find more in this month's top stories.

Utilizing Full Vectors and Use of Option -qopt-assume-safe-padding

Published on September 7, 2012, updated March 6, 2019

Vectorization Essentials: Efficient vectorization involves making full use of the vector-hardware in the kernel-vector loop.

Outer Loop Vectorization

Published on September 7, 2012, updated March 5, 2019

Vectorization Essentials: Vectorizing the outer loop can be profitable

Fortran Array Data and Arguments and Vectorization

Published on September 6, 2012, updated March 4, 2019

Examples of vectorizing Fortran applications

Common Vectorization Tips

Published on October 7, 2013, updated March 4, 2019 By AmandaS

Get tips for common vectorization functions, such as handling user-defined function calls inside vector loops.

Requirements for Vectorizable Loops

Published on August 2, 2012, updated March 4, 2019 By Martyn Corden

Vectorization is one of many optimizations that are enabled by default in the latest Intel compilers. In order to be vectorized, loops must obey certain conditions, listed below. Some additional ways to help the compiler to vectorize loops are described.

Performance essentials using OpenMP* 4.0 vectorization with C/C++

Last updated: March 1, 2019. Video length: 55 min

http://intel.com/software/products. This webinar teaches you what vectorization is and why you should care about it as a software developer.

Episode 4.1 - SIMD Parallelism and Intrinsics

Part 1: SIMD Parallelism and Intrinsics

Last updated: February 28, 2019. Video length: 6 min

A discussion of expressing data parallelism.

Episode 4.2 Automatic Vectorization and Array Notation

Part 2: Automatic Vectorization and Array Notation

Last updated: February 28, 2019. Video length: 6 min

Automatic Vectorization and Cilk Plus Array Notation.

Episode 4.3 Vector Dependence, Pointer Disambiguation and SIMD-Enabled Functions

Part 3: Vector Dependence, Pointer Disambiguation, and SIMD-Enabled Functions

Last updated: February 28, 2019. Video length: 8 min

Some of the potential problems with vectorization.

Episode 4.4 Thread Parallelism and OpenMP*

Part 4: Thread Parallelism and OpenMP*

Last updated: February 28, 2019. Video length: 3 min

OpenMP* thread parallelism.

Episode 4.5 Parallel Loops, Private and Shared Variables, Scheduling

Part 5: Parallel Loops, Private and Shared Variables, Scheduling

Last updated: February 28, 2019. Video length: 4 min

Private/shared variables, parallel loops, scheduling.

Episode 4.6 Fork-Join Model OpenMP* Tasks

Part 6: Fork-Join Model OpenMP* Tasks

Last updated: February 28, 2019. Video length: 3 min

Fork-Join parallelism

Episode 4.7 Race Conditions and Mutexes

Part 7: Race Conditions and Mutexes

Last updated: February 28, 2019. Video length: 6 min

Race Conditions, Mutexes.

Episode 4.8 Parallel Reduction

Part 8: Parallel Reduction

Last updated: February 28, 2019. Video length: 4 min

Parallel reduction in loops.

Episode 4.9 Distributed-memory Parallelism and MPI

Part 9: Distributed-Memory Parallelism and MPI

Last updated: February 28, 2019. Video length: 6 min

Message Passing Interface (MPI).

Episode 5.1 - Optimization roadmap

Part 1: Optimization Roadmap

Last updated: February 28, 2019. Video length: 4 min

Optimization steps required for performance.

Episode 5.2 - Scalar Tuning and General Optimization

Part 2: Scalar Tuning and General Optimization

Last updated: February 28, 2019. Video length: 10 min

Optimization of scalar arithmetic.

Episode 5.3 - Optimization of Vectorization- Data Structures

Part 3: Optimization of Vectorization-Data Structures

Last updated: February 28, 2019. Video length: 6 min

Data structure for better vectorization.
