See how the new Intel® Advanced Vector Extensions 512CD and the Intel AVX512F subsets (available in the Intel® Xeon Phi processor and in future Intel Xeon processors) lets the compiler automatically generate vector code with no changes to the code.
This is an exercise in performance optimization on heterogeneous Intel architecture systems based on multi-core processors and manycore (MIC) coprocessors.
Many Faces of Parallelism: Porting Programs to the Intel® Many Integrated Core Architecture
Exercise in performance optimization on Intel Architecture, including Intel® Xeon Phi™ processors.
This is a "cheatsheet" comparing the Fortran and C++ offload directives and functions in the context of programming for the Intel® Xeon Phi™ coprocessor
How to implement and optimize a three-dimension isotropic kernel with finite differences to run on Intel® Xeon® and Intel® Xeon Phi™ processors.
This recipe describes how to get, build, and run the GROMACS* code on Intel® Xeon® and Intel® Xeon Phi™ processors for better performance on a single node.
In this paper, we look at one of the successful efforts, pioneered by Barone-Adesi and Whaley, and apply the high performance parallel computing entailed in the modern microprocessors to create a program that can exceed our expectation for high performance with a suitable numerical result.
This document is designed to help users get started writing code and running MPI applications using the Intel® MPI Library on a development platform that includes the Intel® Xeon Phi™ processor.
The sample demonstrates how to implement efficient median filter with OpenCL™ standard. This implementation relies on auto-vectorization performed by Intel® SDK for OpenCL Applications compiler.