Blog post

Introduction to OpenMP* on YouTube*

Tim Mattson (Intel) has authored an extensive series of excellent videos as an introduction to OpenMP*.

Authored by Mike P. (Intel) Last updated on 07/04/2019 - 19:51
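
As a flavor of the material that an OpenMP introduction covers (this sketch is not taken from the videos; the loop and variable names are illustrative), a minimal parallel-for program in C looks like this. Compile with OpenMP enabled, e.g. gcc -O2 -fopenmp:

#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N], b[N], c[N];

    /* Initialize the input arrays serially. */
    for (int i = 0; i < N; i++) {
        a[i] = i * 0.5;
        b[i] = i * 2.0;
    }

    /* Distribute the loop iterations across the available threads. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[N-1] = %f (computed with up to %d threads)\n",
           c[N - 1], omp_get_max_threads());
    return 0;
}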
Blog post

Something Old to be New Again?

In the past couple of years I've noticed a trend of "re-inventing" technology or re-branding old ideas and concepts from previous computing generations.

Authored by Clay B. (Blackbelt) Last updated on 03/05/2019 - 23:48
Article

Caffe* Optimized for Intel® Architecture: Applying Modern Code Techniques

This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
Authored by Last updated on 10/15/2019 - 15:30
Blog post

The Unfairness of Good Syntax

The unfairness of good syntax - bad syntax is a problem; good syntax is not a solution.
Authored by Last updated on 07/04/2019 - 11:17
Article

Case Study: Optimized Code for Neural Cell Simulations

One of the Intel® Modern Code Developer Challenge winners, Daniel Falguera, describes many of the optimizations he implemented and why some didn't work.
Authored by Last updated on 10/03/2019 - 07:55
Article

Parallel Programming Books

Use these parallel programming resources and books with your Intel® Xeon® processor and Intel® Xeon Phi™ processor family.
Authored by Mike P. (Intel) Last updated on 09/30/2019 - 17:28
Article

Performance of Classic Matrix Multiplication Algorithm on Intel® Xeon Phi™ Processor System

Matrix multiplication (MM) of two matrices is one of the most fundamental operations in linear algebra. The algorithm for MM is very simple and can easily be implemented in any programming language (the classic form is sketched below). This paper shows that performance improves significantly when different optimization techniques are applied.
Authored by Last updated on 10/15/2019 - 15:30
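
For reference, the classic (naïve) triple-loop form of matrix multiplication that such a study typically uses as its baseline looks roughly like the following C sketch; the function name and row-major layout are illustrative, and none of the paper's optimizations are applied here:

/* Classic O(n^3) matrix multiplication: C = A * B for square n x n
 * matrices stored in row-major order. No blocking, vectorization,
 * or threading is applied. */
void matmul_classic(int n, const double *A, const double *B, double *C)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
    }
}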
Article

Hybrid Parallelism: A MiniFE* Case Study

This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.
Authored by David M. Last updated on 10/15/2019 - 16:40
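
To illustrate the general idea of hybrid parallelism (this is not the MiniFE* code itself; the problem size and variable names are illustrative), a minimal MPI + OpenMP sketch in C has each MPI rank run an OpenMP team over its share of the work:

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    /* Request a threading level that allows OpenMP inside each rank. */
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const int n = 1000000;      /* illustrative global problem size      */
    int chunk = n / nranks;     /* each rank owns a contiguous chunk;    */
                                /* any remainder is ignored for brevity  */
    double local_sum = 0.0, global_sum = 0.0;

    /* Threads within a rank share that rank's chunk of the loop. */
    #pragma omp parallel for reduction(+ : local_sum)
    for (int i = rank * chunk; i < (rank + 1) * chunk; i++)
        local_sum += (double)i;

    /* Ranks combine their partial results with MPI. */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %f\n", global_sum);

    MPI_Finalize();
    return 0;
}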
Blog post

Vectorized Reduction 2: Let the Compiler do that Voodoo that it do so well

As I mentioned in my previous post about writing vectorized reduction code with Intel vector intrinsics, that part of the code was just the finishing touch on a loop computing the squared difference of complex values (a minimal version of such a loop is sketched below).
Authored by Clay B. (Blackbelt) Last updated on 12/12/2018 - 18:08
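
A reduction over squared differences of complex values can be written as a plain loop that the compiler vectorizes on its own; this minimal C sketch (illustrative names, not the post's exact code) stores the complex data as separate real and imaginary arrays:

#include <stddef.h>

/* Sum of squared differences between two arrays of complex values,
 * stored as separate real and imaginary parts. Written as a simple
 * reduction loop so the compiler can vectorize it; the OpenMP simd
 * pragma is an optional hint (compile with -fopenmp-simd or
 * -qopenmp-simd to enable it). */
double sum_sq_diff(size_t n,
                   const float *re_a, const float *im_a,
                   const float *re_b, const float *im_b)
{
    double total = 0.0;

    #pragma omp simd reduction(+ : total)
    for (size_t i = 0; i < n; i++) {
        float dre = re_a[i] - re_b[i];
        float dim = im_a[i] - im_b[i];
        total += (double)(dre * dre + dim * dim);
    }
    return total;
}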