Problem : Intel® Parallel Inspector reports a data race on an atomic construct (Win32 Interlocked… functions, OpenMP* #pragma omp atomic, or variables declared as Intel® Threading Building Blocks atomic<T>).
Root Cause : For atomics, Intel Parallel Inspector checks whether all accesses to a variable are atomic. If two accesses occur at the same time and they are not both atomic, it reports a data race.
Resolution : Either place every access (reads and writes) to the variable in an atomic construct, or suppress the warning with the Intel Parallel Inspector suppression feature.
More Information :
In Intel Threading Building Blocks, all operations (including reads) on objects of type atomic<T> are implicitly atomic. For more information, see section 6.2 of the Intel TBB Reference Manual: http://software.intel.com/content/dam/develop/external/us/en/documents/301114-157214.pdf. If Intel® Parallel Inspector reports a data race on such a variable, you can safely suppress it in Intel Parallel Inspector.
The Win32 documentation for Interlocked variable access states that atomicity is guaranteed only with respect to other Interlocked functions. Additionally, the variable must be aligned. For more information, see the documentation for your specific Interlocked function, for example: http://msdn.microsoft.com/en-us/library/ms683614(VS.85).aspx
Likewise, #pragma omp atomic is atomic only with respect to other #pragma omp atomic operations. See Section 2.8.5 of the OpenMP specification at http://software.intel.com/content/dam/develop/external/us/en/documents/spec30-157214.pdf
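As an illustration of placing every concurrent access in an atomic construct, here is a minimal OpenMP sketch. The function name count_events and its parameters are hypothetical. It also assumes a compiler that supports the read clause of the atomic construct, which was added in OpenMP 3.1 and is not part of the 3.0 specification linked above. With an OpenMP 3.0 compiler only the update form is available, so an atomic read cannot be written this way.

```cpp
#include <cassert>

// Hypothetical helper: increments a shared counter n times across OpenMP
// threads and reads it back concurrently. Build with the compiler's OpenMP
// switch; without it the pragmas are ignored and the loop runs serially.
int count_events(int n, int* out_of_range) {
    int counter = 0;  // shared by all threads in the parallel region
    int bad = 0;      // number of reads that fell outside [1, n]

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        // Every concurrent write goes through an atomic construct.
        #pragma omp atomic
        counter += 1;

        // The concurrent read is atomic as well (OpenMP 3.1 read clause).
        int seen;
        #pragma omp atomic read
        seen = counter;

        if (seen < 1 || seen > n) {
            #pragma omp atomic
            bad += 1;
        }
    }

    // The parallel region has ended (implicit barrier), so these plain
    // accesses no longer overlap with any other thread.
    *out_of_range = bad;
    return counter;
}
```

Calling count_events(1000, &bad) should return 1000 and leave bad at 0. If the atomic read were replaced with a plain read inside the loop, the accesses would no longer all be atomic, which is the pattern described under Root Cause.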
That’s the theory; now for something more practical. If you declare a variable that is a basic data type (no larger than 32 bits on an IA-32 architecture machine or 64 bits on an Intel® 64 architecture machine), aligned, and declared volatile, most C/C++ compilers will use a single atomic machine instruction to read the variable, even when it is read outside an atomic construct. Section 7.1.1 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, System Programming Guide, Part 1 guarantees atomicity for reads of such variables, and for some additional operations on unaligned data and larger data sizes. In these cases you can safely suppress the data race using Intel Parallel Inspector’s suppression feature. Note, however, that your code may not be portable to another set of compiler switches, another compiler, or another architecture.
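Before suppressing a report on such a variable, it can help to check the size and alignment conditions in code. The following is a rough sketch only. The names g_ready, WidePair, and fits_single_access are hypothetical, and using sizeof(void*) as a stand-in for the 32-bit or 64-bit limit is an assumption. Passing the check does not guarantee that a particular compiler emits a single instruction for the read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical shared flag: a basic type, naturally aligned, declared volatile.
volatile std::int32_t g_ready = 0;

// Hypothetical 16-byte type, wider than a pointer on IA-32 and Intel 64.
struct WidePair {
    std::int64_t first;
    std::int64_t second;
};

// Returns true when an object of type T at address p is no wider than a
// pointer and its address is a multiple of its own size. This checks only
// the size and alignment preconditions; it inspects nothing about the
// instructions the compiler actually generates.
template <typename T>
bool fits_single_access(const volatile T* p) {
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p);
    return sizeof(T) <= sizeof(void*) && address % sizeof(T) == 0;
}
```

Here fits_single_access(&g_ready) holds, while a WidePair object fails the size test, so reads of it outside an atomic construct should not be treated as safe to suppress.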
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804