Fast Gathering-based SpMxV for Linear Feature Extraction
This algorithm can be used to improve sparse matrix-vector and matrix-matrix multiplication in any numerical computation. Many High Performance Computing applications involve semi-sparse matrix computation. Semi-sparse matrices are also very common in popular low-level perceptual computing engines, especially speech and facial recognition...
Modern Memory Subsystems Benefits for Data Base Codes, Linear Algebra Codes, Big Data, and Enterprise Storage
This article describes and contrasts the advantages of different types of memory, including Multi-Channel DRAM (MCDRAM) and High-Bandwidth Memory (HBM), future 3D XPoint™ memory devices, and Intel® Omni-Path Fabric (Intel® OP Fabric).
Parallel Programming Books
Use these parallel programming resources and books with your Intel® Xeon® processor and Intel® Xeon Phi™ processor family.
Free access to Intel® Compilers, Performance libraries, Analysis tools and more...
Intel® Parallel Studio XE is a very popular product from Intel that includes the Intel® Compilers, Intel® Performance Libraries, tools for analysis, debugging and tuning, tools for MPI and the Intel® MPI Library. Did you know that some of these are available for free? Here is a guide to “what is available free” from the Intel Parallel Studio XE suites.
Caffe* Optimized for Intel® Architecture: Applying Modern Code Techniques
This paper demonstrates a special version of Caffe*, a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC), that is optimized for Intel® architecture.
What is Code Modernization?
Modern high performance computers are built with a combination of resources including:
Improve Performance with Vectorization
This article focuses on the steps to improve software performance with vectorization. Included are examples of full applications along with some simpler cases to illustrate the steps to vectorization.
Recognize and Measure Vectorization Performance
Get a background on vectorization and learn different techniques to evaluate its effectiveness.
Hybrid Parallelism: A MiniFE* Case Study
This case study examines the situation where the problem decomposition is the same for threading as it is for Message Passing Interface* (MPI); that is, the threading parallelism is elevated to the same level as MPI parallelism.
Putting Your Data and Code in Order: Data and layout - Part 2
Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1 to consider parallelism, covering both vectorization (single instruction, multiple data, or SIMD) and shared memory parallelism (threading), as well as distributed memory computing.