Article

Superscalar Programming 101 (Matrix Multiply) Part 1 of 5

Part one of a five-part series, this article teaches a methodology to interpret statistics gathered during test runs and use those interpretations to improve parallel code.
Authored by jimdempseyatthecove (Blackbelt) Last updated on 07/04/2019 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 2 of 5

By Jim Dempsey. In my last article we left off with

Authored by jimdempseyatthecove (Blackbelt) Last updated on 07/04/2019 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 3 of 5

By Jim Dempsey

Authored by jimdempseyatthecove (Blackbelt) Last updated on 07/04/2019 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 4 of 5

In the last installment (Part 3) we saw the effects of the QuickThread Parallel Tag Team method of Matrix Multiplica

Authored by jimdempseyatthecove (Blackbelt) Last updated on 07/04/2019 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 5 of 5

In part 4 we saw the effects of the QuickThread Parallel Tag Team Transpose method of Matrix Multiplication performe

Authored by jimdempseyatthecove (Blackbelt) Last updated on 07/04/2019 - 22:00
Article

Using Intel® AVX without Writing AVX

Intel® AVX is a new 256-bit instruction set extension to Intel® Streaming SIMD Extensions, designed for floating-point-intensive applications. This paper discusses options for integrating Intel® AVX into an application through the use of intrinsics.
Authored by richard-hubbard (Intel) Last updated on 03/05/2019 - 22:08
Article

Improving Averaging Filter Performance Using Intel® Cilk™ Plus

Intel® Cilk™ Plus is an extension to the C and C++ languages to support data and task parallelism. It provides three new keywords to i

Authored by Anoop M. (Intel) Last updated on 12/12/2018 - 18:00
Article

Explicit Vector Programming – Best Known Methods

Vectorizing improves performance, and achieving high performance can save power. This article introduces tools for vectorizing compute-intensive processing.
Last updated on 04/24/2019 - 11:25
Article

Efficient Parallelization

Efficient Parallelization documentation

Compiler Methodology for Intel® Many Integrated Core Architecture

Authored by Ronald W Green (Blackbelt) Last updated on 09/30/2019 - 17:30
Article

Efficient Parallelization

This article is part of the Intel® Modern Code Developer Community documentation, which helps developers improve application performance through a systematic, step-by-step optimization framework. This article addresses thread-level parallelization.
Authored by Ronald W Green (Blackbelt) Last updated on 09/30/2019 - 17:28