Article

Automatic Parallelization with Intel® Compilers

With automatic parallelization, the compiler detects loops that can be safely and efficiently executed in parallel and generates multithreaded code.
Author: admin. Last updated: 04/07/2019 - 21:33
Article

Loop Modifications to Enhance Data-Parallel Performance

With nested loops, the granularity of the computations assigned to threads directly affects performance. Loop transformations such as splitting and merging nested loops can make parallelization easier and more productive.
Author: admin. Last updated: 05/07/2019 - 14:47
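As a rough illustration of merging nested loops, the sketch below flattens an i/j loop nest into one index range and hands each worker a contiguous slice of it. The function name `scale_merged` and its parameters are hypothetical, chosen only for this example and not taken from the article. Python threads are used to show the index transformation; in CPython they would not be expected to speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_merged(matrix, factor, workers=4):
    """Scale every element, treating the nested i/j loops as one flat loop.

    Assumes a non-empty rectangular matrix (illustrative sketch only).
    """
    rows, cols = len(matrix), len(matrix[0])
    total = rows * cols                       # size of the merged iteration space
    result = [[0] * cols for _ in range(rows)]

    def work(start, stop):
        for k in range(start, stop):
            i, j = divmod(k, cols)            # recover the original loop indices
            result[i][j] = matrix[i][j] * factor

    # One contiguous chunk of the flat range per worker (ceiling division).
    step = -(-total // workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(work, s, min(s + step, total))
                   for s in range(0, total, step)]
        for f in futures:
            f.result()                        # re-raise any worker exception
    return result
```

With a 2 x 3 matrix and four workers, parallelizing only the outer loop would offer at most two units of work, while the merged loop offers six iterations to distribute.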
Article

Granularity and Parallel Performance

One key to attaining good parallel performance is choosing the right granularity for the application. Granularity is the amount of real work in the parallel task. If granularity is too fine, then performance can suffer from communication overhead.
Author: admin. Last updated: 05/07/2019 - 19:52
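The granularity trade-off can be sketched with a chunk size parameter: the same total work is divided into many small tasks or a few large ones. The helper `chunked_sum_of_squares` below is a hypothetical example, not code from the article. It returns the task count alongside the result so the effect of the chunk size on the number of scheduled tasks is visible.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked_sum_of_squares(n, chunk_size, workers=4):
    """Sum k*k for k in range(n), submitting one task per chunk.

    Returns (total, number_of_tasks). Smaller chunks mean finer granularity
    and more tasks, each carrying its own scheduling overhead.
    """
    def work(start):
        stop = min(start + chunk_size, n)
        return sum(k * k for k in range(start, stop))

    starts = range(0, n, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(work, starts))
    return sum(partials), len(partials)
```

Calling it with `chunk_size=1` for `n=1000` creates 1000 tasks, while `chunk_size=250` creates 4 tasks that compute the same total.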
Article

Superscalar Programming 101 (Matrix Multiply) Part 1 of 5

Part one of a five-part series, this article teaches a methodology to interpret statistics gathered during test runs and use those interpretations to improve parallel code.
Author: jimdempseyatthecove (Blackbelt). Last updated: 04/07/2019 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 2 of 5

By Jim Dempsey. In my last article we left off with

Author: jimdempseyatthecove (Blackbelt). Last updated: 04/07/2019 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 3 of 5

By Jim Dempsey

Author: jimdempseyatthecove (Blackbelt). Last updated: 04/07/2019 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 4 of 5

In the last installment (Part 3) we saw the effects of the QuickThread Parallel Tag Team method of Matrix Multiplica

Author: jimdempseyatthecove (Blackbelt). Last updated: 04/07/2019 - 22:00
Article

Superscalar programming 101 (Matrix Multiply) Part 5 of 5

In part 4 we saw the effects of the QuickThread Parallel Tag Team Transpose method of Matrix Multiplication performe

Author: jimdempseyatthecove (Blackbelt). Last updated: 04/07/2019 - 22:00
Article

Step-by-Step Application Performance Tuning with Intel Compilers

A step-by-step introduction to application performance tuning using the Intel® Compilers version 13 for IA-32 and Intel® 64 processors, which are included with Intel® Parallel Studio XE 2013.
Author: Martyn Corden (Intel). Last updated: 10/05/2019 - 08:30
Article

Don't Use the OpenMP* THREADPRIVATE 'Compatibility' Option when Everything is Compiled by Intel

The Intel C++ and Fortran compilers for Windows* and Linux* provide 'legacy' and 'compatibility' implementations of the OpenMP THREADPRIVATE directive. The 'compatibility' option should not be used when everything is compiled by Intel compilers.
Author: Kenneth Craft (Intel). Last updated: 08/07/2019 - 15:12