博客

最快线程间数据交换算法,有效避免锁竞争 -- TwoQueues

处理多线程数据共享问题注意的几个要点:

1、锁竞争:尽量减少锁竞争的时间和次数。

2、内存:尽量是使用已分配内存,减少内存分配和释放的次数。尽量是用连续内存,减少共享占用的内存量。

多线程数据交换简单方案A:

定义一个list,再所有操作list的地方进行加锁和解锁。

简单模拟代码:

作者: 最后更新时间: 2019/07/04 - 21:30
Article

Measuring performance in HPC

This is the first article in a series of articles about High Performance Computing with the Intel® Xeon Phi™ coprocessor.

作者: 最后更新时间: 2019/07/06 - 16:10
Article

Vectorizing Loops with Calls to User-Defined External Functions

Introduction

作者: Anoop M. (Intel) 最后更新时间: 2018/12/12 - 18:00
Article

Putting Your Data and Code in Order: Data and layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1, to consider parallelism, both vectorization (single instruction multiple data SIMD) as well as shared memory parallelism (threading), and distributed memory computing.
作者: David M. 最后更新时间: 2019/07/06 - 16:40
Article

Peel the Onion (Optimization Techniques)

This paper is a more formal response to an Intel® Developer Zone forum posting. See: (https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-architectures/topic/590710).
作者: jimdempseyatthecove (Blackbelt) 最后更新时间: 2018/12/12 - 18:00
Article

Приводим данные и код в порядок: данные и разметка, часть 2

In this pair of articles on performance and memory covers basic concepts to provide guidance to developers seeking to improve software performance. This paper expands on concepts discussed in Part 1, to consider parallelism, both vectorization (single instruction multiple data SIMD) as well as shared memory parallelism (threading), and distributed memory computing.
作者: David M. 最后更新时间: 2019/07/06 - 16:40
博客

Reduce Boilerplate Code in Parallelized Loops with C++11 Lambda Expressions

Parallelize loops with Intel® Threading Building Blocks using Intel® C++ Compiler for lambda expressions.
作者: gaston-hillar (Blackbelt) 最后更新时间: 2018/12/12 - 18:00
Article

Recognize and Measure Vectorization Performance

Get a background on vectorization and learn different techniques to evaluate its effectiveness.
作者: David M. 最后更新时间: 2019/07/06 - 16:40
Article

Чистим лук (но не плачем): методики оптимизации

Эта статья представляет собой формализованный ответ на публикацию на форуме Intel® Developer Zone. См.: (https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-architectures/topic/590710).
作者: 最后更新时间: 2018/12/12 - 18:00
博客

Debug Intel® Transactional Synchronization Extensions

If printf or fprintf functions cause transaction aborts, use Intel® Processor Trace as a work-around.
作者: Roman Dementiev (Intel) 最后更新时间: 2019/07/04 - 17:00