Article

Putting Your Data and Code in Order: Optimization and Memory, Part 1

This series of two articles discusses how data and memory layout affect performance and suggests specific steps to improve software performance. The basic steps shown in these two articles can yield significant performance gains. These two articles are designed at an intermediate level. It is assumed the reader desires to optimize software performance using common C, C++ and Fortran* programming...
Author: David M. Last updated: 2018/12/12 - 18:00
Blog

Thread Parallelism: Concepts and Usage

An Intro to Multi-Level Parallelism for High-Performance Computing by Clay Breshears | Life Sciences Software Architect, Intel
Author: Clay B. (Blackbelt) Last updated: 2018/12/12 - 18:08
Video

Intel® Parallel Studio XE 2017: Overview and New Features

Intel® Parallel Studio 2017 introduces several exciting features and a handful of new products.

Author: Wei Du (Intel) Last updated: 2019/01/28 - 00:29
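Video

Tuning for the Latest AVX SIMD Instructions, Even Without the Latest Hardware

Vectorization is critical to unlocking the full potential of modern processors.

Author: Wei Du (Intel) Last updated: 2019/01/28 - 00:20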
Article

Improving SIMD Efficiency in Animation with SIMD Data Layout Templates and Data Preprocessing

In this paper, we walk through a 3D Animation algorithm example and describe some techniques and methodologies that may benefit your next vectorization endeavors. We also integrate the algorithm with SIMD Data Layout Templates (SDLT), which is a feature of Intel® C++ Compiler, to improve data layout and SIMD efficiency. Includes code sample.
Last updated: 2019/03/25 - 11:40
Article

Tencent* Uses Machine Learning for an In-Game Purchase Recommendation System on Intel® Xeon® Processors

To enhance the online gaming user experience, Tencent uses an in-game purchase recommendation system that employs machine learning to help users decide what equipment to buy within their games. Tencent's machine learning engine uses DGEMM extensively in its module to compute the coefficients for the logistic regression machine learning algorithm.
Author: Nguyen, Khang T (Intel) Last updated: 2018/12/12 - 18:00
Article

What to Do When Auto-Vectorization Fails?

This article completes the analysis of a problem erroneously reported on the Intel® Developer Zone forum: "Vectorization failed because of unsigned integer?" It provides a more detailed examination showing that the unsigned integer type does not prevent compiler vectorization, and it describes a methodology to use when a modern C/C++ compiler fails to auto-vectorize for-loops.
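Last updated: 2019/07/05 - 14:46
Blog

Rainbows, Unicorns, and Performance Portability

An old Jewish folk tale tells of a poor man who sought advice from a rabbi: his family was large, his house was small, and he felt crowded. The rabbi told him to bring a goat into the house and come back in a month. The poor man was puzzled but did not argue, and he moved a goat into the house. A month later, the rabbi told him to take the goat out and come back in a week. Sure enough, a week later the poor man thanked the rabbi; he felt much better, because his home no longer seemed so crowded.

Last updated: 2018/12/12 - 18:08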
Blog

Intel® Data Analytics Acceleration Library

The Intel® Data Analytics Acceleration Library (Intel® DAAL) helps speed big data analytics by providing highly optimized algorithmic building blocks for all data analysis stages (Pre-processing, Transformation, Analysis, Modeling, Validation, and Decision Making) for offline, streaming and distributed analytics usages. It’s designed for use with popular data platforms including Hadoop*, Spark*,...
Author: James R. (Blackbelt) Last updated: 2019/08/27 - 13:50
Article

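Better Concurrency and SIMD on the HIROMB-BOOS-Model 3D Ocean Code

By taking advantage of the Intel® Xeon Phi™ coprocessor, the authors of Chapter 3 of “High Performance Parallelism Pearls” were able to improve and modernize their code and “achieve great scaling, vectorization, bandwidth utilization and performance per watt.”

Author: Karthik Raman (Intel) Last updated: 2019/09/30 - 17:30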