Article

Superscalar Programming 101 (Matrix Multiply) Part 1 of 5

Part one of a five-part series, this article teaches a methodology to interpret statistics gathered during test runs and use those interpretations to improve parallel code.
Author: jimdempseyatthecove (Blackbelt) Last updated: 2019/07/04 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 2 of 5

By Jim Dempsey. In my last article we left off with

Author: jimdempseyatthecove (Blackbelt) Last updated: 2019/07/04 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 3 of 5

By Jim Dempsey

Author: jimdempseyatthecove (Blackbelt) Last updated: 2019/07/04 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 4 of 5

In the last installment (Part 3) we saw the effects of the QuickThread Parallel Tag Team method of Matrix Multiplica

Author: jimdempseyatthecove (Blackbelt) Last updated: 2019/07/04 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 5 of 5

In part 4 we saw the effects of the QuickThread Parallel Tag Team Transpose method of Matrix Multiplication performe

Author: jimdempseyatthecove (Blackbelt) Last updated: 2019/07/04 - 22:00
Article

Improving Averaging Filter Performance Using Intel® Cilk™ Plus

Intel® Cilk™ Plus is an extension to the C and C++ languages to support data and task parallelism. It provides three new keywords to i

Author: Anoop M. (Intel) Last updated: 2018/12/12 - 18:00
Article

Vectorizing Loops with Calls to User-Defined External Functions

Introduction

Author: Anoop M. (Intel) Last updated: 2018/12/12 - 18:00
Article

Vectorization Essentials

Vectorization essentials for making effective use of features in the Intel® Xeon product family
Author: Admin Last updated: 2019/10/02 - 15:11
Article

Putting Your Data and Code in Order: Data and layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1 to consider vectorization (single instruction, multiple data, or SIMD), shared memory parallelism (threading), and distributed memory computing.
Author: David M. Last updated: 2019/10/15 - 16:40