Article

OpenMP* and the Intel® IPP Library

How to configure OpenMP in the Intel IPP library to maximize multi-threaded performance of the Intel IPP primitives.
Authored by Last updated on 07/31/2019 - 14:30
Article

The Three Stages of Preparation for Parallel Software Optimization

Improving performance in parallel software requires a structured approach that makes good use of development resources and delivers good results quickly.

Authored by aaron-tersteeg (Intel) Last updated on 07/05/2019 - 10:15
Article

Using Windows* Registry Hooks to Invoke Intel® VTune™ Amplifier XE to Profile Windows* Services

The article describes how to profile Windows* services by launching them from Intel® VTune™ Amplifier. This trick is useful when attaching to a process is not applicable.
Authored by Kirill R. (Intel) Last updated on 07/06/2019 - 11:17
Blog post

Considerations for tuning Your Intel® Xeon Linux*/Apache* Server

In this blog, I will discuss a list of useful Linux commands to run, to tune your system for running the Apache web service, as well as some Apache tuning configuration changes which have shown to

Authored by Thai Le (Intel) Last updated on 07/06/2019 - 17:00
Blog post

Exploring Intel® Transactional Synchronization Extensions with Intel® Software Development Emulator

Intel® Transactional Synchronization Extensions (Intel® TSX) is perhaps one of the most non-trivial extensions of instruction set architecture introduced in the 4th generation Intel® Cor

Authored by Roman Dementiev (Intel) Last updated on 07/06/2019 - 17:00
Article

A Matrix Multiplication Routine that Updates Only the Upper or Lower Triangular Part of the Result Matrix

Background

Intel® MKL provides the general-purpose BLAS* matrix multiply routines ?GEMM, defined as follows:

Authored by Zhang, Zhang (Intel) Last updated on 07/12/2019 - 14:46
Blog post

Optimization of Data Read/Write in a Parallel Application

(This work was done by Vivek Lingegowda during his internship at Intel.)

Authored by Last updated on 07/04/2019 - 17:40
Article

Workshop: Optimizing OpenCL applications for Intel® Xeon Phi™ Coprocessor

The Intel® Xeon Phi™ Coprocessor is designed for highly parallel, high performance demanding applications.

Authored by Arik Narkis (Intel) Last updated on 07/06/2019 - 16:30
Blog post

Modern Locking

Most multi-threaded software uses locking. Lock optimization has traditionally aimed to reduce lock contention, that is, to make the critical regions smaller.

Authored by Andreas Kleen (Intel) Last updated on 07/04/2019 - 19:18
Blog post

Intel® Transactional Synchronization Extensions (Intel® TSX) profiling with Linux perf

Intel® TSX exposes a speculative execution mode to the programmer to improve locking performance. Tuning speculation relies heavily on a PMU profiler.

Authored by Andreas Kleen (Intel) Last updated on 07/04/2019 - 17:00