Blog post

The Fastest Algorithm for Exchanging Data Between Threads, Effectively Avoiding Lock Contention -- TwoQueues

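Key points to keep in mind when sharing data between threads:

1. Lock contention: minimize both how long locks are contended and how often.

2. Memory: reuse already-allocated memory where possible to reduce the number of allocations and frees. Prefer contiguous memory, and reduce the amount of memory taken up by shared data.

Simple Scheme A for multithreaded data exchange:

Define a list, and lock and unlock around every place that operates on the list.

Simple simulation code: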
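
The original code is not included in this excerpt. What follows is only a minimal sketch of Scheme A as described above, assuming C++11 with `std::list` and `std::mutex`. The names `LockedList` and `run_scheme_a_demo` are hypothetical and do not come from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper for Scheme A: one std::list guarded by one mutex.
// Every operation on the list takes the lock first.
class LockedList {
public:
    // Producer side: lock, append one item, unlock (via lock_guard).
    void push(int value) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(value);
    }

    // Consumer side: lock, remove the front item if there is one.
    bool try_pop(int& out) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::list<int> items_;
};

// Starts `producers` threads that each push `per_thread` items, then
// drains the list on the calling thread and returns how many were popped.
std::size_t run_scheme_a_demo(int producers, int per_thread) {
    LockedList shared;
    std::vector<std::thread> workers;
    for (int p = 0; p < producers; ++p) {
        workers.emplace_back([&shared, per_thread] {
            for (int i = 0; i < per_thread; ++i) shared.push(i);
        });
    }
    for (auto& w : workers) w.join();

    assert(shared.size() == static_cast<std::size_t>(producers) * per_thread);

    std::size_t popped = 0;
    int value = 0;
    while (shared.try_pop(value)) ++popped;
    return popped;
}
```

In this sketch every `push` both contends for the single mutex and allocates a list node, which are the two costs named in the key points above.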

Authored by Last updated on 07/04/2019 - 21:30
Article

Using OpenCL™ 2.0 Read-Write Images

While image convolution is not as effective with the new Read-Write images functionality, any image processing technique that needs to be done in place may benefit from Read-Write images. One example of a process that could be used effectively is image composition. In OpenCL 1.2 and earlier, images were qualified with the “__read_only” and “__write_only” qualifiers. In OpenCL 2.0, images can...
Authored by Last updated on 05/31/2019 - 14:20
Article

Optimizing Data Layout with SIMD Data Layout Templates

Financial service customers need to improve financial algorithmic performance for models such as Monte Carlo, Black-Scholes, and others. SIMD programming can speed up these workloads. In this paper, we perform data layout optimizations using two approaches on a Black-Scholes workload for European options valuation from the open source QuantLib library.
Authored by Nimisha R. (Intel) Last updated on 12/12/2018 - 18:00
Article

Improving SIMD Efficiency in Animation with SIMD Data Layout Templates and Data Preprocessing

In this paper, we walk through a 3D Animation algorithm example and describe some techniques and methodologies that may benefit your next vectorization endeavors. We also integrate the algorithm with SIMD Data Layout Templates (SDLT), which is a feature of Intel® C++ Compiler, to improve data layout and SIMD efficiency. Includes code sample.
Authored by Last updated on 03/25/2019 - 11:40
Article

Archived - Introduction to Autonomous Navigation for Augmented Reality

This article provides an introduction to autonomous navigation and its use in augmented reality applications, with a focus on agents that move and navigate. Autonomous agents are entities that act independently using artificial intelligence, which defines the operational parameters and rules by which the agent must abide. The agent responds dynamically in real time to its environment, so even a...
Authored by admin Last updated on 12/28/2018 - 16:36
Article

Streaming RGB and Depth Data with LibRealSense and OpenCV

This article shows how to use LibRealSense and OpenCV to stream RGB and depth data. By the end, you will have a starting point: a code base you can build on to create your own LibRealSense / OpenCV applications.
Authored by Rick Blacker (Intel) Last updated on 01/18/2018 - 16:13
Article

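Managing Deep Learning Networks with Caffe* Optimized for Intel® Architecture

How to optimize Caffe* for Intel® architecture, train deep network models, and deploy networks.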
Authored by Andres Rodriguez (Intel) Last updated on 03/11/2019 - 13:17
Article

What to Do When Auto-Vectorization Fails?

This article completes an analysis of a problem erroneously reported on the Intel® Developer Zone forum: Vectorization failed because of unsigned integer? It provides a more detailed examination, showing that the unsigned integer type does not affect compiler vectorization, and describes what methodology to use when a modern C/C++ compiler fails to auto-vectorize for-loops.
Authored by Last updated on 07/05/2019 - 14:46
Article

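Using Modern C++ Techniques to Enhance Multi-Core Optimization

Multi-core processors are now common in PCs, and core counts keep growing, so software engineers must adapt. By learning how to handle potential performance bottlenecks and concurrency issues, engineers can future-proof their code so that it seamlessly handles the additional cores added to consumer systems.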
Authored by Last updated on 08/02/2018 - 00:18
Article

Putting Your Data and Code in Order: Data and Layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1 to consider parallelism, including both vectorization (single instruction, multiple data, or SIMD) and shared memory parallelism (threading), as well as distributed memory computing.
Authored by David M. Last updated on 10/15/2019 - 16:40