Blog

Visual Studio 2010 Built-in CPU Acceleration

While writing the sample code for this post, I was amazed to see how simple it was to reach a more than 20-fold performance improvement with so little effort.

Last updated: 12.12.2018 - 18:00
Article

OpenMP* and the Intel® IPP Library

How to configure OpenMP in the Intel IPP library to maximize multi-threaded performance of the Intel IPP primitives.
Last updated: 31.07.2019 - 14:30
Article

Improving the Compute Performance of Video Processing Software Using Intel® Advanced Vector Extensions (Intel® AVX) Instructions

This paper describes a case study in which AVX instructions are used to enhance the performance of a de-saturation algorithm (a common video filter). The case study takes the algorithm from a non-SIMD state to AVX-based SIMD.
Last updated: 10.07.2019 - 16:54
Article

PAOS - Packed Array Of Structures

by Jim Dempsey

Author: jimdempseyatthecove (Blackbelt) Last updated: 28.12.2018 - 11:03
Article

Fast Gathering-based SpMxV for Linear Feature Extraction

This algorithm can be used to improve sparse matrix-vector and matrix-matrix multiplication in any numerical computation. Many High Performance Computing applications involve semi-sparse matrix computation. Semi-sparse matrices are also very common in the low-level engines of popular perceptual computing workloads, especially speech and facial recognition....
Last updated: 12.12.2018 - 18:00
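
For orientation, the operation named in this teaser can be sketched as a plain sparse matrix-vector product. This is a generic baseline over the common CSR (compressed sparse row) layout, not the article's gathering-based algorithm, which the teaser does not describe. The function name `spmv_csr` and its parameter names are hypothetical, chosen only for this illustration.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Multiply a CSR-encoded sparse matrix by a dense vector x.

    values  : non-zero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row r occupies values[row_ptr[r]:row_ptr[r + 1]]
    """
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            # x[col_idx[k]] is an indexed load (a "gather") from the vector.
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Dense equivalent of the matrix below:
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 5, 0]]
result = spmv_csr([1, 2, 3, 4, 5], [0, 2, 2, 0, 1], [0, 2, 3, 5], [1, 2, 3])
```

The indexed loads from `x` are the irregular memory accesses that make sparse products harder to vectorize than dense ones.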
Blog

Three Pieces of Advice for Code Modernization Success

What three code modernization techniques would I suggest to help a programmer improve the execution performance of her code? With too many specific things to choose from, these are three recommendations for any programmer anywhere and anytime.
Author: Clay B. (Blackbelt) Last updated: 12.12.2018 - 18:08
Article

Putting Your Data and Code in Order: Optimization and Memory – Part 1

This series of two articles discusses how data and memory layout affect performance and suggests specific steps to improve software performance. The basic steps shown in these two articles can yield significant performance gains. These two articles are designed at an intermediate level. It is assumed the reader desires to optimize software performance using common C, C++ and Fortran* programming...
Author: David M. Last updated: 12.12.2018 - 18:00
Article

Putting Your Data and Code in Order: Optimization and Memory – Part 1 (Chinese version)

This series of two articles discusses how data and memory layout affect performance and suggests specific steps to improve software performance. The basic steps shown in these two articles can yield significant performance gains. These two articles are designed at an intermediate level. It is assumed the reader desires to optimize software performance using common C, C++ and Fortran* programming...
Author: David M. Last updated: 12.12.2018 - 18:00
Article

Putting Your Data and Code in Order: Optimization and Memory – Part 1 (Russian version)

This series of two articles discusses how data and memory layout affect performance and suggests specific steps to improve software performance. The basic steps shown in these two articles can yield significant performance gains. These two articles are designed at an intermediate level. It is assumed the reader desires to optimize software performance using common C, C++ and Fortran* programming...
Last updated: 12.12.2018 - 18:00
Article

Fast Computation of Huffman Codes

The generation of Huffman codes is used in many applications, among them the DEFLATE compression algorithm. The classical way to compute these codes uses a heap data structure. This approach is fairly efficient, but traditional software implementations contain lots of branches that are data-dependent and thus hard for general-purpose CPU hardware to predict. On modern processors with deep...
Author: James Guilford (Intel) Last updated: 09.07.2019 - 16:09
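
The classical heap-based approach mentioned in this teaser can be sketched as follows. This is a minimal illustration of the textbook method, not the faster technique the article goes on to present. The function name `huffman_code_lengths` is hypothetical, and the sketch returns only code lengths (which is what a format such as DEFLATE stores), without enforcing any maximum length.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} using a min-heap of subtrees."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # Assumption: a lone symbol gets a 1-bit code by convention.
        return {next(iter(freqs)): 1}

    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tiebreak), [sym]) for sym, weight in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)

    while len(heap) > 1:
        # Merge the two lightest subtrees; which ones get picked depends on
        # the input data, hence the hard-to-predict branches.
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        merged = syms1 + syms2
        for sym in merged:
            lengths[sym] += 1  # every symbol under the new node moves one level deeper
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return lengths


lengths = huffman_code_lengths({"a": 5, "b": 2, "c": 1, "d": 1})
# lengths == {"a": 1, "b": 2, "c": 3, "d": 3}
```

Tracking symbol lists per subtree keeps the sketch short at the cost of extra copying. A production implementation would typically build parent links and derive the depths in a separate pass.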