• 2019 Update 4
  • 03/20/2019
  • Public Content

__private Memory

__private memory that is allocated to registers is typically very efficient to access. If the private memory does not fit in registers, however, performance can be very poor. Because each work-item has its own spill space for __private memory, there is no locality for __private memory accesses, and each work-item frequently touches a unique cache line on every access to __private memory. For this reason, accesses to __private data that has not been allocated to registers are very slow. In most cases, the compiler can map statically indexed private arrays into registers. In some cases it can also map dynamically indexed private arrays into registers, but the performance of such code will be slightly lower than that of code accessing statically indexed private arrays. As such, a common optimization is to modify code to ensure that private arrays are statically indexed.
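The difference between dynamic and static indexing can be sketched as below. This is a minimal illustration and not code from the guide: the function names, the array size, and the squared values are hypothetical, and the functions are written as plain helpers so that they compile both as C99 and as OpenCL C. Whether a particular compiler keeps either version entirely in registers depends on the device and compiler version.

```c
/* Sketch only: hypothetical helper functions, valid as C99 and OpenCL C.
   When called from a kernel, the local array v[] is __private memory. */

/* Dynamic indexing: the subscript depends on a run-time value, so the
   compiler may be unable to keep v[] entirely in registers. */
int pick_dynamic(int a, int b, int c, int d, int sel)
{
    int v[4] = { a * a, b * b, c * c, d * d };  /* private array */
    return v[sel & 3];                          /* run-time index */
}

/* Static indexing: every subscript is a compile-time constant. The
   run-time choice is made by comparing sel, not by indexing with it. */
int pick_static(int a, int b, int c, int d, int sel)
{
    int v[4] = { a * a, b * b, c * c, d * d };  /* private array */
    int s = sel & 3;
    int r = v[0];
    r = (s == 1) ? v[1] : r;
    r = (s == 2) ? v[2] : r;
    r = (s == 3) ? v[3] : r;
    return r;
}
```

Both functions return the same value for any input. The second one trades a single indexed load for a short chain of selects, which is usually worthwhile only for small arrays.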
