Article

Vectorization: Writing C/C++ code in VECTOR Format

Authored by Mukkaysh Srivastav Last updated on 12/12/2018 - 23:28
Article

Automatic Parallelization with Intel® Compilers

With automatic parallelization, the compiler detects loops that can be safely and efficiently executed in parallel and generates multithreaded code.
Authored by admin Last updated on 12/12/2018 - 18:08
Article

Parallelism in the Intel® Math Kernel Library

The Intel® Math Kernel Library (Intel® MKL) contains a large collection of functions that can benefit math-intensive applications.
Authored by admin Last updated on 12/12/2018 - 18:00
Article

Predicting and Measuring Parallel Performance

The success of parallelization is typically quantified by measuring the speedup of the parallel version relative to the serial version. It is also useful to compare that speedup relative to the upper limit of the potential speedup.
Authored by admin Last updated on 12/12/2018 - 18:00
Article

Loop Modifications to Enhance Data-Parallel Performance

When confronted with nested loops, the granularity of the computations that are assigned to threads will directly affect performance. Loop transformations such as splitting and merging nested loops can make parallelization easier and more productive.
Authored by admin Last updated on 12/12/2018 - 18:08
Article

Granularity and Parallel Performance

One key to attaining good parallel performance is choosing the right granularity for the application. Granularity is the amount of real work in the parallel task. If granularity is too fine, then performance can suffer from communication overhead.
Authored by admin Last updated on 12/12/2018 - 18:00
Article

Load Balance and Parallel Performance

Load balancing an application workload among threads is critical to performance. The key objective for load balancing is to minimize idle time on threads.
Authored by admin Last updated on 12/12/2018 - 18:00
Article

Expose Parallelism by Avoiding or Removing Artificial Dependencies

Many applications and algorithms contain serial optimizations that inadvertently introduce data dependencies and inhibit parallelism. One can often remove such dependencies through simple transforms, or even avoid them altogether through ...
Authored by admin Last updated on 12/12/2018 - 18:00
Article

Using Tasks Instead of Threads

Tasks are a lightweight alternative to threads that provide faster startup and shutdown times, better load balancing, an efficient use of available resources, and a higher level of abstraction.
Authored by admin Last updated on 12/31/2018 - 15:00
Article

Exploiting Data Parallelism in Ordered Data Streams

This article identifies some of these challenges and illustrates strategies for addressing them while maintaining parallel performance.
Authored by admin Last updated on 12/12/2018 - 23:28