Article

Code Sample: Exploring MPI for Python* on Intel® Xeon Phi™ Processor

Learn how to write an MPI program in Python* and take advantage of Intel® multicore architectures using OpenMP threads and Intel® AVX-512 instructions.
Authored by Nguyen, Loc Q (Intel) Last updated on 07/06/2019 - 16:30
Article

Optimization Techniques for the Intel® MIC Architecture: Part 1 of 3

Part one of this three-part series focuses on thread parallelism and race conditions, and discusses using mutexes in OpenMP* to resolve race conditions.
Authored by Mike P. (Intel) Last updated on 10/01/2019 - 12:34
Article

Books - High Performance Parallelism Pearls

A look into the contents of the two "Pearls" books, edited by James Reinders and Jim Jeffers. These books contain a collection of examples of code modernization.
Authored by Mike P. (Intel) Last updated on 09/30/2019 - 17:28
Article

Developer Access Program for the Intel® Xeon Phi™ Processor (Codenamed "Knights Landing")

In anticipation of the general availability of the Intel® Xeon Phi™ Processor (codenamed Knights Landing), Intel is bringing the Developer Access Program (DAP) to market. DAP is an early access program that lets developers worldwide purchase a system based on the Intel Xeon Phi Processor.
Authored by Mike P. (Intel) Last updated on 10/01/2019 - 12:34
Article

Books - High Performance Parallelism Pearls

A look into the contents of the two "Pearls" books, edited by James Reinders and Jim Jeffers. These books contain a collection of examples of code modernization.
Authored by Mike P. (Intel) Last updated on 09/30/2019 - 17:30
Article

Offload over Fabric Tutorial for the Intel® Xeon Phi™ Processor

This tutorial shows how to install Offload over Fabric (OoF) software on a 2nd generation Intel® Xeon Phi™ processor, configure the hardware, test the basic configuration, and enable OoF.
Authored by Nguyen, Loc Q (Intel) Last updated on 09/30/2019 - 17:28
Article

Putting Your Data and Code in Order: Data and Layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on the concepts discussed in Part 1 to cover parallelism, including both vectorization (single instruction, multiple data, or SIMD) and shared memory parallelism (threading), as well as distributed memory computing.
Authored by David M. Last updated on 07/06/2019 - 16:40
Article

Caffe* Optimized for Intel® Architecture: Applying Modern Code Techniques

This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
Last updated on 07/06/2019 - 16:40
Article

Classical Molecular Dynamics Simulations with LAMMPS Optimized for Knights Landing

LAMMPS is an open-source software package that simulates classical molecular dynamics. Its support for many energy models and simulation options has made it a versatile and popular choice. It was first developed at Sandia National Laboratories to use large-scale parallel computation.
Authored by WILLIAM B. (Intel) Last updated on 03/21/2019 - 12:00
Article

Code Sample: Optimizing Binarized Neural Networks on Intel® Xeon® Scalable Processors

In the previous article, we discussed the performance and accuracy of Binarized Neural Networks (BNN). We also introduced a BNN coded from scratch in the Wolfram Language. The key component of this neural network is matrix multiplication.
Authored by Yash Akhauri Last updated on 03/21/2019 - 12:40