Fast Gathering-based SpMxV for Linear Feature Extraction

This algorithm can be used to improve sparse matrix-vector and matrix-matrix multiplication in any numerical computation. As we know, there are lots of applications involving semi-sparse matrix computation in High Performance Computing. Additionally, in popular perceptual computing low-level engines, especially speech and facial recognition, semi-sparse matrices are found to be very common....
作者: 最后更新时间: 2018/12/12 - 18:00

PGO: Let It Go (PHP)

We can hope that companies like Intel® will come along with a faster processor. (And this does tend to happen every year). Or we can improve our compilers to produce better machine code. Or we can analyze our own code and change it to run more optimally. For PHP, we do all three: We partner with the processor architects to improve the way they execute PHP; we look for changes we can make to the...
作者: David S. (Blackbelt) 最后更新时间: 2019/07/03 - 20:08

Reduce Boilerplate Code in Parallelized Loops with C++11 Lambda Expressions

Parallelize loops with Intel® Threading Building Blocks using Intel® C++ Compiler for lambda expressions.
作者: gaston-hillar (Blackbelt) 最后更新时间: 2018/12/12 - 18:00

Debug Intel® Transactional Synchronization Extensions

If printf or fprintf functions cause transaction aborts, use Intel® Processor Trace as a work-around.
作者: Roman Dementiev (Intel) 最后更新时间: 2019/07/04 - 17:00

Cells in the Cloud: Scaling a Biological Simulator to the Cloud

Hello, fellow developers! I am Konstantinos, and I currently work at CERN as a research intern for this summer.

作者: Konstantinos Kanellis 最后更新时间: 2018/12/12 - 18:00

Free access to Intel® Compilers, Performance libraries, Analysis tools and more...

Intel® Parallel Studio XE is a very popular product from Intel that includes the Intel® Compilers, Intel® Performance Libraries, tools for analysis, debugging and tuning, tools for MPI and the Intel® MPI Library. Did you know that some of these are available for free? Here is a guide to “what is available free” from the Intel Parallel Studio XE suites.
作者: 管理 最后更新时间: 2019/09/30 - 17:28

Caffe* Optimized for Intel® Architecture: Applying Modern Code Techniques

This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
作者: 最后更新时间: 2019/10/15 - 15:30

Recognize and Measure Vectorization Performance

Get a background on vectorization and learn different techniques to evaluate its effectiveness.
作者: David M. 最后更新时间: 2019/10/15 - 15:30

Hybrid Parallelism: A MiniFE* Case Study

This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.
作者: David M. 最后更新时间: 2019/10/15 - 16:40

Optimizing OpenStack* Swift* Performance with PyPy*

Isolate and address challenges, understand different solutions, and learn best-known methods associated with adopting a PyPy* just-in-time interpreter for a leading cloud co

作者: Peter X. Wang (Intel) 最后更新时间: 2019/10/15 - 16:40