OpenMP* and the Intel® IPP Library
How to configure OpenMP in the Intel IPP library to maximize multi-threaded performance of the Intel IPP primitives.
Improving Averaging Filter Performance Using Intel® Cilk™ Plus
Intel® Cilk™ Plus is an extension to the C and C++ languages to support data and task parallelism. It provides three new keywords to i
Fast Gathering-based SpMxV for Linear Feature Extraction
This algorithm can be used to improve sparse matrix-vector and matrix-matrix multiplication in any numerical computation. As we know, there are lots of applications involving semi-sparse matrix computation in High Performance Computing. Additionally, in popular perceptual computing low-level engines, especially speech and facial recognition, semi-sparse matrices are found to be very common....
Peel the Onion (Optimization Techniques)
This paper is a more formal response to an Intel® Developer Zone forum posting. See: (https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-architectures/topic/590710).
Fast Computation of Huffman Codes
The generation of Huffman codes is used in many applications, among them the DEFLATE compression algorithm. The classical way to compute these codes uses a heap data structure. This approach is fairly efficient, but traditional software implementations contain lots of branches that are data-dependent and thus hard for general-purpose CPU hardware to predict. On modern processors with deep...
Implementing a Masked SVML-like Function Explicitly in User-Defined Way
The Intel® Compiler provides SIMD intrinsics APIs for short vector math library (SVML) and starting with Intel® Advanced Vector Extensions
Maximize TensorFlow* Performance on CPU: Considerations and Recommendations for Inference Workloads
This article will describe performance considerations for CPU inference using Intel® Optimization for TensorFlow*.
Debugging Intel® Xeon Phi™ Applications on Linux* Host
Read details about a debug solution for Intel® Many Integrated Core Architecture (Intel® MIC) that can debug applications running on an Intel® Xeon Phi™ coprocessor.
Caffe* Optimized for Intel® Architecture: Applying Modern Code Techniques
This paper demonstrates a special version of Caffe*, a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC), that is optimized for Intel® architecture.