Article

Fast Gathering-based SpMxV for Linear Feature Extraction

This algorithm can be used to improve sparse matrix-vector and matrix-matrix multiplication in any numerical computation. Many High Performance Computing applications involve semi-sparse matrix computation. Semi-sparse matrices are also very common in the low-level engines of popular perceptual computing workloads, especially speech and facial recognition....
Last updated on 12/12/2018 - 18:00
Article

Modern Memory Subsystems Benefits for Data Base Codes, Linear Algebra Codes, Big Data, and Enterprise Storage

This article describes and contrasts the advantages of different types of memory, including Multi-Channel DRAM (MCDRAM) and High-Bandwidth Memory (HBM), the future 3D XPoint™ memory devices, and Intel® Omni-Path Fabric (Intel® OP Fabric).
Last updated on 09/30/2019 - 17:28
Article

Parallel Programming Books

Use these parallel programming resources and books with your Intel® Xeon® processor and Intel® Xeon Phi™ processor family.
Authored by Mike P. (Intel) Last updated on 09/30/2019 - 17:28
Article

Free access to Intel® Compilers, Performance libraries, Analysis tools and more...

Intel® Parallel Studio XE is a very popular product from Intel that includes the Intel® Compilers, Intel® Performance Libraries, tools for analysis, debugging and tuning, tools for MPI and the Intel® MPI Library. Did you know that some of these are available for free? Here is a guide to “what is available free” from the Intel Parallel Studio XE suites.
Authored by admin Last updated on 09/30/2019 - 17:28
Article

Caffe* Optimized for Intel® Architecture: Applying Modern Code Techniques

This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
Last updated on 10/15/2019 - 15:30
Article

What is Code Modernization?

Modern high performance computers are built with a combination of resources including:

Authored by Mike P. (Intel) Last updated on 10/15/2019 - 15:30
Article

Improve Performance with Vectorization

This article focuses on the steps to improve software performance with vectorization. Included are examples of full applications along with some simpler cases to illustrate the steps to vectorization.
Authored by David M. Last updated on 10/15/2019 - 15:30
Article

Recognize and Measure Vectorization Performance

Get a background on vectorization and learn different techniques to evaluate its effectiveness.
Authored by David M. Last updated on 10/15/2019 - 15:30
Article

Hybrid Parallelism: A MiniFE* Case Study

This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.
Authored by David M. Last updated on 10/15/2019 - 16:40
Article

Putting Your Data and Code in Order: Data and layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on the concepts discussed in Part 1 to consider parallelism, covering both vectorization (single instruction, multiple data, or SIMD) and shared-memory parallelism (threading), as well as distributed-memory computing.
Authored by David M. Last updated on 10/15/2019 - 16:40