Putting Your Data and Code in Order: Optimization and Memory - Part 1

This series of two articles discusses how data and memory layout affect performance and suggests specific steps to improve software performance. The basic steps shown in these two articles can yield significant performance gains. These two articles are designed at an intermediate level. It is assumed the reader desires to optimize software performance using common C, C++ and Fortran* programming...
Author: David M. Last updated: 2018/12/12 - 18:00


An Intro to Multi-Level Parallelism for High-Performance Computing by Clay Breshears | Life Sciences Software Architect, Intel
Author: Clay B. (Blackbelt) Last updated: 2018/12/12 - 18:08

Intel® Parallel Studio XE 2017: Overview and New Features

Intel® Parallel Studio 2017 introduces several exciting features along with a small number of new products.

Author: Wei Du (Intel) Last updated: 2019/01/28 - 00:29

Tuning for the Latest AVX SIMD Instructions, Even Without the Latest Hardware


Author: Wei Du (Intel) Last updated: 2019/01/28 - 00:20

Improving SIMD Efficiency in Animation with SIMD Data Layout Templates and Data Preprocessing

In this paper, we walk through a 3D Animation algorithm example and describe some techniques and methodologies that may benefit your next vectorization endeavors. We also integrate the algorithm with SIMD Data Layout Templates (SDLT), which is a feature of Intel® C++ Compiler, to improve data layout and SIMD efficiency. Includes code sample.
Author: (not listed) Last updated: 2019/03/25 - 11:40

Tencent* Uses Machine Learning in an In-Game Purchase Recommendation System on Intel® Xeon® Processors

To enhance the online gaming user experience, Tencent uses an in-game purchase recommendation system that employs machine learning to help users decide what equipment to buy within their games. Tencent's machine learning engine uses DGEMM extensively in its module to compute the coefficients for the logistic regression algorithm.
Author: Nguyen, Khang T (Intel) Last updated: 2018/12/12 - 18:00


This article completes an analysis of a problem erroneously reported on the Intel® Developer Zone forum: "Vectorization failed because of unsigned integer?" It provides a more detailed examination showing that the unsigned integer type does not prevent compiler vectorization, and it describes a methodology to use when a modern C/C++ compiler fails to auto-vectorize for-loops.
Author: (not listed) Last updated: 2019/07/05 - 14:46



Author: (not listed) Last updated: 2018/12/12 - 18:08

Intel® Data Analytics Acceleration Library

The Intel® Data Analytics Acceleration Library (Intel® DAAL) helps speed big data analytics by providing highly optimized algorithmic building blocks for all data analysis stages (Pre-processing, Transformation, Analysis, Modeling, Validation, and Decision Making) for offline, streaming and distributed analytics usages. It’s designed for use with popular data platforms including Hadoop*, Spark*,...
Author: James R. (Blackbelt) Last updated: 2019/08/27 - 13:50

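Better Concurrency and SIMD on the HIROMB-BOOS-Model 3D Ocean Code

By taking advantage of the Intel® Xeon Phi™ coprocessor, the authors of Chapter 3 of "High Performance Parallelism Pearls" were able to improve and modernize their code and "achieve great scaling, vectorization, bandwidth utilization, and performance per watt."

Author: Karthik Raman (Intel) Last updated: 2019/09/30 - 17:30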