May 12, 2008 9:00 PM PDT
Parallel programs with multiple threads must use synchronization techniques to ensure correct operation. Generally, synchronization operations use shared synchronization variables and "spin-wait" loops that repeatedly check the values of those variables. Starting with the Intel® Pentium® 4 and Intel® Xeon® processors, the Intel® IA-32 architecture provides a new instruction, PAUSE, to address the performance issues associated with spin loops. This application note addresses two important optimization issues for multi-threaded computations on high-speed processors: spin loops and shared-data management. Specifically, the optimizations are the use of the PAUSE instruction in spin-wait loops and the placement of shared and non-shared data on different 128-byte cache lines. Intel strongly recommends adopting the PAUSE instruction in spin-wait loops as soon as possible, since it is backward compatible with all earlier IA-32 processors. This document describes in detail the recommended changes and the reasons behind them.
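The two recommendations can be illustrated with a minimal C++ sketch. It is not taken from the application note: the names `cpu_relax`, `SpinLock`, `PaddedCounter`, `SharedState`, and `kLineSize` are hypothetical, and the code assumes a C++11 compiler that provides the `_mm_pause()` intrinsic on x86 (on other targets the sketch falls back to a no-op). PAUSE is encoded as `REP NOP`, which earlier IA-32 processors execute as a plain `NOP`; that encoding is what makes the instruction backward compatible.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>  // _mm_pause(), which emits the PAUSE instruction
static inline void cpu_relax() { _mm_pause(); }
#else
// Assumption: on non-x86 targets this sketch simply does nothing here.
static inline void cpu_relax() {}
#endif

// Alignment granularity recommended by the note: 128 bytes.
constexpr std::size_t kLineSize = 128;

// Hypothetical spin lock. The alignment keeps the shared synchronization
// variable on a 128-byte block of its own.
struct alignas(kLineSize) SpinLock {
    std::atomic<bool> locked{false};

    void lock() {
        for (;;) {
            // Try to take the lock with one atomic exchange.
            if (!locked.exchange(true, std::memory_order_acquire)) return;
            // Spin-wait on plain loads, issuing PAUSE on every iteration.
            while (locked.load(std::memory_order_relaxed)) cpu_relax();
        }
    }

    // Single attempt; returns true if the lock was acquired.
    bool try_lock() { return !locked.exchange(true, std::memory_order_acquire); }

    void unlock() { locked.store(false, std::memory_order_release); }
};

// Hypothetical per-thread datum, padded so that two threads' private data
// never share a 128-byte block with each other or with the lock.
struct alignas(kLineSize) PaddedCounter {
    std::uint64_t value = 0;
};

struct SharedState {
    SpinLock lock;                // shared: written by every thread
    PaddedCounter per_thread[2];  // non-shared: one slot per thread
};
```

In use, each thread would update only its own `per_thread` slot and call `lock()`/`unlock()` around accesses to data that is truly shared. Over-aligning the types trades some memory for the guarantee that shared and non-shared data land in different 128-byte blocks.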
Click here to read the article [PDF]
