I'm wondering how the private memory of Intel HD Graphics works. In the "Intel® SDK for OpenCL* Applications 2013 - Optimization Guide for Windows* OS" I read this recommendation about private memory:
Since each work-item has its own __private memory, there is no locality for __private memory accesses, and each work-item frequently accesses a unique cache line for every access to __private memory. For this reason, accesses to __private memory are very slow, and you should avoid indexed private memory if possible.
But I don't really understand why I should avoid indexed private memory. Can anyone tell me more about this, or explain the recommendation?
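To make concrete what I understand by "indexed private memory", here is a minimal sketch. It is plain C rather than OpenCL C so that it compiles on its own; in a real kernel the local array `taps` would live in __private memory. The function names, the array size of 4, and the `idx` input are all hypothetical and only serve the illustration. My assumption is that the first form is what the guide warns about, because the index is only known at run time, while the second form uses scalars that could stay in registers.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel body. `taps` plays the role of a
   per-work-item (__private) array, and it is indexed with a value that
   is only known at run time. */
float sum_indexed(const float *in, size_t n, const int *idx)
{
    float taps[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (size_t i = 0; i < n; ++i) {
        int k = idx[i] & 3;   /* run-time index into the private array */
        taps[k] += in[i];
    }
    return taps[0] + taps[1] + taps[2] + taps[3];
}

/* Same result without any array indexing: four scalars and selects.
   Assumption: this is the kind of rewrite the guide has in mind. */
float sum_scalars(const float *in, size_t n, const int *idx)
{
    float t0 = 0.0f, t1 = 0.0f, t2 = 0.0f, t3 = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        int k = idx[i] & 3;
        float v = in[i];
        t0 += (k == 0) ? v : 0.0f;
        t1 += (k == 1) ? v : 0.0f;
        t2 += (k == 2) ? v : 0.0f;
        t3 += (k == 3) ? v : 0.0f;
    }
    return t0 + t1 + t2 + t3;
}
```

Is the first form the pattern the guide means, and would the second form really avoid the slow path on Intel HD Graphics?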