Article

Algorithms to vectorize load groups in x86

Learn about the algorithms used to achieve vectorization in GCC 5.0.
Author: Evgeny Stupachenko (Intel) Last updated: 2018/12/12 - 18:00
Article

Fast Gathering-based SpMxV for Linear Feature Extraction

This algorithm can be used to improve sparse matrix-vector and matrix-matrix multiplication in any numerical computation. As we know, there are lots of applications involving semi-sparse matrix computation in High Performance Computing. Additionally, in popular perceptual computing low-level engines, especially speech and facial recognition, semi-sparse matrices are found to be very common....
Last updated: 2018/12/12 - 18:00
Blog

The New Parallel Universe Magazine is Out: All About Vectorization

Parallel Universe is Intel's quarterly magazine that explores inroads and innovations in software development. The new issue takes a deep dive into the subject of vectorization and what it can do for you. Our first feature article looks at the SIMD directives for explicit vector programming now available in OpenMP. The second article walks you through Vectorization Advisor, a new tool in the...
Author: Sally Sams (Intel) Last updated: 2018/12/31 - 15:00
Blog

Three Pieces of Advice for Code Modernization Success

What three code modernization techniques would I suggest to help a programmer improve the execution performance of her code? With too many specific things to choose from, these are three recommendations for any programmer anywhere and anytime.
Author: Clay B. (Blackbelt) Last updated: 2018/12/12 - 18:08
Article

Putting Your Data and Code in Order: Optimization and Memory – Part 1

This series of two articles discusses how data and memory layout affect performance and suggests specific steps to improve software performance. The basic steps shown in these two articles can yield significant performance gains. These two articles are designed at an intermediate level. It is assumed the reader desires to optimize software performance using common C, C++ and Fortran* programming...
Author: David M. Last updated: 2018/12/12 - 18:00
Blog

What is Thread Parallelism, and How Do I Put It to Use?

An Intro to Multi-Level Parallelism for High-Performance Computing by Clay Breshears | Life Sciences Software Architect, Intel
Author: Clay B. (Blackbelt) Last updated: 2018/12/12 - 18:08
Blog

Five Big Insights from the Student Winner of the Intel® Modern Code Developer Challenge

As Shared by Mathieu Gravey, Grand-Prize Winner of the Intel Modern Code Developer Challenge
Author: Mathieu Gravey Last updated: 2018/12/12 - 18:08
Blog

Can You Write a Vectorized Reduction Operation?

I can. And if you read this post, you will be able to write one, too. (Might be a cool party trick or a sucker bet to make a little cash.)
Author: Clay B. (Blackbelt) Last updated: 2018/12/12 - 18:08
Blog

Vectorized Reduction 2: Let the Compiler do that Voodoo that it do so well

As I mentioned in my previous post about writing vectorized reduction code with Intel vector intrinsics, that part of the code was just the finishing touch on a loop computing the squared difference of complex values.
Author: Clay B. (Blackbelt) Last updated: 2018/12/12 - 18:08