Intel Inspector XE 2011 automatically finds memory errors, deadlocks and conditions that could lead to deadlocks, and data races. This article discusses some specific issues associated with debugging multithreaded applications.
Intel® Parallel Studio
Loop Modifications to Enhance Data-Parallel Performance
When confronted with nested loops, the granularity of the computations that are assigned to threads will directly affect performance. Loop transformations such as splitting and merging nested loops can make parallelization easier and more productive.
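As a rough illustration of merging nested loops, the sketch below compares parallelizing only the outer loop with collapsing both loops into a single index space. The function names (`cell`, `outer_only`, `merged`) and the per-element work are hypothetical and do not come from the article. Python threads are used only to show the decomposition, and a real speedup would depend on the runtime and the workload.

```python
from concurrent.futures import ThreadPoolExecutor


def cell(i, j):
    # Hypothetical per-element work standing in for the loop body.
    return i * j


def outer_only(rows, cols, workers=4):
    # Parallelize the outer loop only: at most `rows` tasks exist,
    # so with rows < workers some threads have nothing to do.
    def row(i):
        return [cell(i, j) for j in range(cols)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(row, range(rows)))


def merged(rows, cols, workers=4):
    # Merge the two loops into one loop over rows * cols iterations,
    # which exposes more independent units of work to the pool.
    def one(k):
        i, j = divmod(k, cols)  # recover the original (i, j) indices
        return cell(i, j)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        flat = list(pool.map(one, range(rows * cols)))
    # Reshape the flat result back into rows.
    return [flat[i * cols:(i + 1) * cols] for i in range(rows)]
```

With `rows=2` and `workers=4`, the outer-only version can keep at most two threads busy, while the merged version offers six iterations to distribute. Both return the same result.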
IPP Texture Compression Functions
IPP Texture Compression, Texture Compression, IPP 6.1, new in IPP, DirectX, DXTC
The Serial On-Ramp to the Multicore Highway: Preparing to Parallelize Code
This article contrasts optimizing on the fly while coding with the way performance experts approach performance improvement. It explains how they systematically prepare their code for optimization and how they carry out the optimization process.
Granularity and Parallel Performance
One key to attaining good parallel performance is choosing the right granularity for the application. Granularity is the amount of real work in the parallel task. If granularity is too fine, then performance can suffer from communication overhead.
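The effect of granularity can be sketched by making the task size an explicit parameter. In the example below, `chunked_sum` and `chunk_size` are hypothetical names chosen for illustration. The function splits a reduction into tasks of a given size and reports how many tasks were created, since each task carries its own scheduling and communication cost.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked_sum(values, chunk_size, workers=4):
    # Split the input into tasks of `chunk_size` elements each.
    chunks = [values[s:s + chunk_size]
              for s in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per chunk; each task sums its own slice.
        partials = list(pool.map(sum, chunks))
    # Return the total and the number of tasks that were scheduled.
    return sum(partials), len(chunks)
```

For 100 values, `chunk_size=1` schedules 100 tiny tasks, while `chunk_size=25` schedules four larger ones for the same result. The fine-grained version pays the per-task overhead 25 times as often.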
Intel Guide for Developing Multithreaded Applications
The Intel® Guide for Developing Multithreaded Applications covers topics ranging from general advice applicable to any multithreading method to usage guidelines for Intel® software products to API-specific issues.
Avoiding Heap Contention Among Threads
Avoiding Heap Contention Among Threads (PDF 256KB)
Abstract
Allocating memory from the system heap can be an expensive operation due to a lock used by system runtime libraries to synchronize access to the heap. Contention on this lock can limit the performance benefits from multithreading. To solve this problem, apply an allocation strategy that avoids using shared locks, or use third party heap managers.
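One allocation strategy that avoids a shared lock is to give each thread its own reusable buffer. The sketch below shows the idea with thread-local storage. The helper name `scratch` is hypothetical, and Python's allocator behaves differently from a C runtime heap, so this shows only the pattern, not the article's measurements or method.

```python
import threading

# Each thread sees its own independent attributes on this object.
_local = threading.local()


def scratch(size):
    # Return this thread's private buffer, growing it only when needed.
    buf = getattr(_local, "buf", None)
    if buf is None or len(buf) < size:
        buf = bytearray(size)  # allocate once per thread (or on growth)
        _local.buf = buf
    return buf
```

Repeated calls from the same thread reuse one buffer instead of allocating each time, and different threads never share a buffer, so no lock is needed to hand it out.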
Using Intel® AVX without Writing AVX
Intel® AVX is a new 256-bit instruction set extension to Intel® Streaming SIMD Extensions, designed for floating-point-intensive applications. This paper discusses options to integrate Intel® AVX into an application via use of intrinsics.
