Video

Episode 4.8: Parallel Reduction

This episode discusses parallel reduction in OpenMP for loops.

Author: tianhui s. Last updated: 2017/02/02 - 11:05
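
For readers unfamiliar with the topic of this episode, here is a minimal sketch of a sum reduction in an OpenMP for loop. It is not taken from the video; the function name parallel_sum is hypothetical. The reduction(+:total) clause gives each thread a private partial sum and combines the partial sums when the loop ends. If the code is compiled without OpenMP support, the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

/* Hypothetical helper: sums n doubles.
 * With OpenMP enabled, each thread accumulates a private copy of
 * `total`; the copies are added together after the loop. */
double parallel_sum(const double *x, size_t n) {
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++) {
        total += x[i];
    }
    return total;
}
```

Without the reduction clause, every thread would update the shared variable total at the same time, which is a data race. Note that floating-point results may differ slightly between runs because the order of additions is not fixed.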
Article

2012 Cloud Computing and Its Core Technology, CCF YOCSEF-Intel Workshop

Author: Admin. Last updated: 2012/10/26 - 12:30
Article

Loop Modifications to Enhance Data-Parallel Performance

When confronted with nested loops, the granularity of the computations that are assigned to threads will directly affect performance. Loop transformations such as splitting and merging nested loops can make parallelization easier and more productive.
Author: Admin. Last updated: 2017/01/26 - 00:49
Article

Using Intel® Data Analytics Acceleration Library on Matlab*

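Intel® Data Analytics Acceleration Library (Intel® DAAL) is a high-performance library that provides a rich set of algorithms, ranging from the most basic descriptive statistics for datasets to more advanced data mining and machine learning algorithms. It helps developers build highly optimized big data algorithms with little effort.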

Author: Ying H. (Intel) Last updated: 2017/04/08 - 08:48
Article

Putting Your Data and Code in Order: Data and Layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1 to consider parallelism, covering both vectorization (single instruction, multiple data, or SIMD) and shared memory parallelism (threading), as well as distributed memory computing.
Author: David M. Last updated: 2016/10/09 - 03:01
Blog

Thread Parallelism: Concepts and Usage

An Intro to Multi-Level Parallelism for High-Performance Computing by Clay Breshears | Life Sciences Software Architect, Intel
Author: Clay B. Last updated: 2016/06/13 - 15:03
Article

Training and Deploying Deep Learning Networks with Caffe* Optimized for Intel® Architecture

Caffe* is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC). Caffe optimized for Intel architecture is currently integrated with the latest release of Intel® Math Kernel Library (Intel® MKL) 2017, which is optimized for Advanced Vector Extensions 2 (AVX2) and AVX-512 instructions. These instructions are supported in Intel® Xeon® and Intel® Xeon Phi™ processors (among others). This...
Author: Andres R. (Intel) Last updated: 2017/03/06 - 19:32
Article

Fast Computation of Fletcher Checksums

Checksums are widely used for checking the integrity of data in applications such as storage and networking. We present fast methods of computing checksums on Intel® processors. Instead of computing the checksum of the input with a traditional linear method, we describe a faster method to split the data into a number of interleaved parallel streams, compute the checksum on these segments in...
Author: James Guilford (Intel) Last updated: 2016/05/20 - 16:58
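
The abstract above mentions splitting the input into interleaved parallel streams. The following is a rough sketch of that idea for Fletcher-16, not the article's actual implementation: the function names, the choice of four streams, and the combining step are assumptions made for illustration. Each stream keeps its own pair of running sums, so the four updates inside the loop do not depend on one another and can be vectorized or overlapped. For brevity the sketch accumulates in 64-bit integers and reduces modulo 255 only at the end, so it assumes inputs of at most a few megabytes; a production version would reduce periodically.

```c
#include <stddef.h>
#include <stdint.h>

enum { STREAMS = 4 };  /* assumed stream count, chosen for illustration */

/* Reference version: every step depends on the previous one. */
uint16_t fletcher16_serial(const uint8_t *data, size_t n) {
    uint32_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < n; i++) {
        sum1 = (sum1 + data[i]) % 255;
        sum2 = (sum2 + sum1) % 255;
    }
    return (uint16_t)((sum2 << 8) | sum1);
}

/* Interleaved version: byte i goes to stream i % STREAMS. */
uint16_t fletcher16_interleaved(const uint8_t *data, size_t n) {
    uint64_t a[STREAMS] = {0}, b[STREAMS] = {0};
    size_t full = n - n % STREAMS;  /* largest multiple of STREAMS */

    for (size_t i = 0; i < full; i += STREAMS) {
        for (unsigned k = 0; k < STREAMS; k++) {
            a[k] += data[i + k];  /* per-stream first sum  */
            b[k] += a[k];         /* per-stream second sum */
        }
    }

    /* Combine: in the serial sum2, byte i of the prefix has weight
     * (full - i). For i = k + STREAMS*j this equals
     * STREAMS * (local weight in stream k) - k. */
    uint64_t s1 = 0, s2 = 0;
    for (unsigned k = 0; k < STREAMS; k++) {
        s1 += a[k];
        s2 += STREAMS * b[k] - k * a[k];
    }
    uint32_t sum1 = (uint32_t)(s1 % 255);
    uint32_t sum2 = (uint32_t)(s2 % 255);

    /* Leftover bytes (fewer than STREAMS) are handled serially. */
    for (size_t i = full; i < n; i++) {
        sum1 = (sum1 + data[i]) % 255;
        sum2 = (sum2 + sum1) % 255;
    }
    return (uint16_t)((sum2 << 8) | sum1);
}
```

Both functions return the same value for the same input; for the five bytes "abcde" that value is 0xC8F0.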
Article

Recipe: Caffe* Optimized for Deep Learning on the Intel® Xeon Phi™ Processor x200

The deep learning framework Caffe* has been optimized for Intel® Xeon Phi™ processors. This article provides detailed instructions on how to compile and run Caffe* optimized for Intel® architecture to obtain the best performance on Intel Xeon Phi processors.
Author: Vamsi Sripathi (Intel) Last updated: 2016/12/28 - 22:50
Article

Expose Parallelism by Avoiding or Removing Artificial Dependencies

Many applications and algorithms contain serial optimizations that inadvertently introduce data dependencies and inhibit parallelism. One can often remove such dependences through simple transforms, or even avoid them altogether through.
Author: Admin. Last updated: 2016/10/11 - 20:14
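
One common kind of artificial dependence described in the abstract above is an index that is incremented from one iteration to the next. The sketch below is an illustrative example, not code from the article; the function names and the doubling operation are hypothetical. The first version carries the write position wr across iterations, which forces serial execution. The second computes the position directly from the loop counter, so iterations are independent (assuming stride is at least 1, so that no two iterations write the same element).

```c
#include <stddef.h>

/* Serial optimization: wr is carried from iteration to iteration. */
void scatter_serial(double *out, const double *in, size_t n, size_t stride) {
    size_t wr = 0;
    for (size_t i = 0; i < n; i++) {
        out[wr] = 2.0 * in[i];
        wr += stride;  /* loop-carried dependence on wr */
    }
}

/* Transformed: the index is a function of i only, so the loop can
 * be parallelized. Without OpenMP the pragma is simply ignored. */
void scatter_independent(double *out, const double *in, size_t n, size_t stride) {
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        out[(size_t)i * stride] = 2.0 * in[i];
    }
}
```

The transformed loop does one extra multiplication per iteration, which is the usual trade: a little redundant arithmetic in exchange for independent iterations.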
For more complete information about compiler optimizations, see the Optimization Notice.