Article

OpenMP* and the Intel® IPP Library

How to configure OpenMP in the Intel IPP library to maximize multi-threaded performance of the Intel IPP primitives.
Last updated on 31/07/2019 - 14:30
Article

Improving Averaging Filter Performance Using Intel® Cilk™ Plus

Intel® Cilk™ Plus is an extension to the C and C++ languages to support data and task parallelism.  It provides three new keywords to i

Created by Anoop M. (Intel). Last updated on 12/12/2018 - 18:00
Article

Vectorizing Loops with Calls to User-Defined External Functions

Introduction

Created by Anoop M. (Intel). Last updated on 12/12/2018 - 18:00
Article

Fast Gathering-based SpMxV for Linear Feature Extraction

This algorithm can be used to improve sparse matrix-vector and matrix-matrix multiplication in any numerical computation. Many High Performance Computing applications involve semi-sparse matrix computation. Semi-sparse matrices are also very common in popular perceptual computing low-level engines, especially speech and facial recognition...
Last updated on 12/12/2018 - 18:00
Article

Peel the Onion (Optimization Techniques)

This paper is a more formal response to an Intel® Developer Zone forum posting. See https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-architectures/topic/590710.
Created by jimdempseyatthecove (Blackbelt). Last updated on 12/12/2018 - 18:00
Article

Fast Computation of Huffman Codes

The generation of Huffman codes is used in many applications, among them the DEFLATE compression algorithm. The classical way to compute these codes uses a heap data structure. This approach is fairly efficient, but traditional software implementations contain lots of branches that are data-dependent and thus hard for general-purpose CPU hardware to predict. On modern processors with deep...
Created by James Guilford (Intel). Last updated on 09/07/2019 - 16:09
Article

Implementing a Masked SVML-like Function Explicitly in User-Defined Way

The Intel® Compiler provides SIMD intrinsics APIs for short vector math library (SVML) and starting with Intel® Advanced Vector Extensions

Last updated on 16/07/2019 - 08:37
Article

Maximize TensorFlow* Performance on CPU: Considerations and Recommendations for Inference Workloads

This article describes performance considerations for CPU inference using Intel® Optimization for TensorFlow*.
Created by Nathan Greeneltch (Intel). Last updated on 31/07/2019 - 12:11
Article

Debugging Intel® Xeon Phi™ Applications on Linux* Host

Read details about a debug solution for Intel® Many Integrated Core Architecture (Intel® MIC) that can debug applications running on an Intel® Xeon Phi™ coprocessor.
Last updated on 15/10/2019 - 15:30
Article

Caffe* Optimized for Intel® Architecture: Applying Modern Code Techniques

This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
Last updated on 15/10/2019 - 15:30