questions about memory to cache mapping

刘 驰.'s picture

I notice that Xeon Phi has a large coherent L2 cache, and I'd like to understand it in more detail. My question: if a core's local L2 cache is full and another core's L2 cache is not, can the core with the full cache use the free space in the other core's L2?

P.S. Is there more documentation about memory-to-cache mapping on Xeon Phi?

Sumedh Naik (Intel)'s picture

As I understand it, if the local L2 cache is full, some data will be evicted regardless of whether another core's L2 has free space.

You can find out more about the caches in the Software Developer's Guide: 

http://download-software.intel.com/sites/default/files/article/334766/in...

Tim Prince's picture

The Answers... post which Robert Reed made yesterday may be useful, as well as this:

http://software.intel.com/en-us/articles/intel-xeon-phi-coprocessor-code...

Setting thread affinities so that threads accessing distinct blocks of memory are pinned to different cores increases the amount of memory that can effectively be held in cache. Read access across cores is fairly effective, but if a thread writes to a cache line resident on a different core, additional events are triggered and may be counted in the VTune General category.
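To put rough numbers on the affinity point, here is a back-of-the-envelope sketch in Python. It is a model, not a measurement: the 512 KB per-core L2 size is an assumption not stated in this thread, and `distinct_bytes_cached` is a made-up helper name.

```python
# Back-of-the-envelope model of aggregate L2 coverage, not a measurement.
L2_BYTES_PER_CORE = 512 * 1024  # assumption: 512 KB of L2 per core


def distinct_bytes_cached(num_cores, working_set_bytes, shared):
    """Upper bound on distinct bytes resident across all L2 caches.

    shared=True:  every core reads the same working set, so each L2
                  ends up holding copies of the same lines.
    shared=False: each core is pinned to its own disjoint block of
                  working_set_bytes.
    """
    per_core = min(working_set_bytes, L2_BYTES_PER_CORE)
    return per_core if shared else per_core * num_cores


# Four cores, each touching 1 MiB of data.
same_data = distinct_bytes_cached(4, 1 << 20, shared=True)
own_blocks = distinct_bytes_cached(4, 1 << 20, shared=False)
```

With these assumed numbers, four threads reading the same data keep at most 512 KB of distinct data in cache, while four threads pinned to different cores and working on disjoint blocks keep up to 2 MB.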

James Cownie (Intel)'s picture

This article http://software.intel.com/en-us/forums/topic/373346 is also relevant, I think.

刘 驰.'s picture

It is a good post. Thank you!

刘 驰.'s picture

Thank you so much, I really appreciate it.

Another question:

When a core accesses its L2 cache and misses, an address request is sent on the address ring to the tag directories. The memory addresses are uniformly distributed amongst the tag directories on the ring to provide a smooth traffic characteristic on the ring. If the requested data block is found in another core’s L2 cache, a forwarding request is sent to that core’s L2 over the address ring and the request block is subsequently forwarded on the data block ring. If the requested data is not found in any caches, a memory address is sent from the tag directory to the memory controller. (http://software.intel.com/en-us/articles/intel-xeon-phi-coprocessor-code...)
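The miss path described in that passage can be sketched as a toy Python model. Everything specific here is an assumption for illustration: the number of tag directories, the 64-byte line size, and the modulo mapping (the real address hash is not given in the quoted text). The names `home_directory` and `service_l2_miss` are hypothetical.

```python
NUM_TAG_DIRECTORIES = 8  # hypothetical count, for illustration only
LINE_BYTES = 64          # assumption: 64-byte cache lines


def home_directory(address):
    """Pick the tag directory responsible for an address.

    A plain modulo on the line number stands in for the real hash,
    which only needs to spread addresses evenly over the directories.
    """
    return (address // LINE_BYTES) % NUM_TAG_DIRECTORIES


def service_l2_miss(requester, address, l2_contents):
    """Return (tag directory consulted, source of the data) for one L2 miss.

    l2_contents maps core id -> set of line numbers held in that core's L2.
    Scanning it stands in for the directory's record of which core owns
    which line.
    """
    line = address // LINE_BYTES
    directory = home_directory(address)
    for core, lines in l2_contents.items():
        if core != requester and line in lines:
            # Another core's L2 holds the line: it forwards the data.
            return directory, f"forwarded from core {core}"
    # No cache holds the line: the directory asks the memory controller.
    return directory, "fetched from memory"


caches = {0: set(), 1: {5}, 2: set()}
hit_elsewhere = service_l2_miss(0, 5 * LINE_BYTES, caches)
miss_everywhere = service_l2_miss(0, 9 * LINE_BYTES, caches)
```

In this sketch core 0 misses on line 5, which core 1 holds, so the data is forwarded from core 1. Line 9 is in no cache, so the request goes on to memory.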

Is this the only situation in which the local L2 cache accesses another core's L2 cache, that is, when the requested data block is found in that L2 cache? Or are there other situations?

刘 驰.'s picture

Does the L2 cache on Xeon Phi follow a private caching policy, a shared caching policy, or a hybrid of the two?

Sumedh Naik (Intel)'s picture

The local L2 cache never really accesses another core's cache. The other core places the requested data onto the ring, and the requesting (local) core then reads it from the ring. This is the general idea of data forwarding in a cache coherence protocol; on Xeon Phi the lookup goes through the tag directories rather than a broadcast snoop. I am unsure whether there are any other scenarios where data forwarding can occur.

Also, regarding your second question: the L2 cache in the Intel Xeon Phi coprocessor is a distributed cache, not a shared cache. The following post by James Cownie in this thread may make things clearer:

Quote:

The way to think of the machine is that each core has its own cache, and that all those caches are maintained coherent, not that there is one, large, shared L2 cache. (People familiar with Xeon, which does have a large, shared, L3 cache sometimes say that the Intel(r) Xeon Phi(tm) coprocessor "doesn't have a last-level cache", which is clearly wrong by definition :-), but they are right that there is no shared last level cache).
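A minimal Python sketch of that "each core has its own cache" picture follows. It shows that a full private cache evicts its own lines even while a neighbouring cache sits empty. LRU replacement and the two-line capacity are assumptions chosen to keep the example small; the thread does not state the actual replacement policy, and `PrivateL2` is a hypothetical name.

```python
from collections import OrderedDict


class PrivateL2:
    """Toy private cache with LRU replacement (policy assumed, not sourced)."""

    def __init__(self, capacity_lines):
        self.capacity = capacity_lines
        self.lines = OrderedDict()  # line number -> True, oldest first

    def access(self, line):
        """Touch a line; return the evicted line, or None if nothing was evicted."""
        if line in self.lines:
            self.lines.move_to_end(line)  # mark as most recently used
            return None
        evicted = None
        if len(self.lines) >= self.capacity:
            # Full: evict from THIS cache, never spill into a neighbour.
            evicted, _ = self.lines.popitem(last=False)
        self.lines[line] = True
        return evicted


core0 = PrivateL2(capacity_lines=2)
core1 = PrivateL2(capacity_lines=2)  # stays empty throughout

evictions = [core0.access(line) for line in (10, 11, 12)]
```

Here the third access on core 0 evicts line 10 even though core 1's cache is completely unused, which matches the answer given earlier in the thread.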
