The last level cache is actually split evenly across the 6 cores. So while each core can load from the entire 12 MB range, its requests will only be cached in its own slice.
Most caches, including the LLC, are set associative. This means that when an address is mapped to a cache line, there are several locations it can be written to, as opposed to direct mapping, which has only one location per cache line.
You can read more about cache associativity on Wikipedia: CPU cache
I am sorry, I don't think I stated my question clearly enough. Let me ask it again.
If the last level cache can be accessed in its entirety by all 6 cores, it also means that on each core, the whole address space (we neglect physical address holes here) can use the last level cache.
BUT on the Xeon 5650, because the size of the last level cache is not a power of 2, if we simply divide the address by the cache size, the division is not exact for every address. The addresses that do not divide evenly should be able to use the last level cache as well. So here is my question: how are these addresses mapped to the last level cache? If the remainder is used directly as the cache index, it is unavoidable that some cache sets service more accesses than others. Therefore, accesses to the last level cache are not evenly distributed.
Am I correct? Or is there some additional design in the cache?
Quoting zhangyihere: If the last level cache can be accessed in its entirety by all 6 cores, it also means that on each core, the whole address space (we neglect physical address holes here) can use the last level cache.
Each core's access to the portions of the Last Level cache is different. Each core "owns" a part of the LLC into which its reads are brought, i.e. if a line is not in the LLC it will be read from memory, sent to the core, AND written into the portion of the LLC assigned to this particular core. If another core then reads this same line, it will be able to access it in the first core's LLC portion.
This means that if a request from a core needs to replace a line in the LLC, it will only replace lines in the portion of the LLC allocated to this core and not to another core.
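As a rough illustration of the behaviour described above, here is a toy model, not a description of the real hardware. The class name `SlicedLLC`, the LRU replacement policy, and treating each slice as one fully associative pool are all assumptions made to keep the sketch short.

```python
from collections import OrderedDict


class SlicedLLC:
    """Toy model: one slice per core; a miss fills only the requester's slice."""

    def __init__(self, num_cores, lines_per_slice):
        # Each slice is an LRU-ordered map of line address -> True (assumed policy).
        self.slices = [OrderedDict() for _ in range(num_cores)]
        self.lines_per_slice = lines_per_slice

    def read(self, core, line_addr):
        # Any core can hit on a line held in any core's slice.
        for owner, cache_slice in enumerate(self.slices):
            if line_addr in cache_slice:
                cache_slice.move_to_end(line_addr)  # mark as most recently used
                return ("hit", owner)
        # Miss: fill the requesting core's slice, evicting only from that slice.
        own = self.slices[core]
        if len(own) >= self.lines_per_slice:
            own.popitem(last=False)  # drop the least recently used line
        own[line_addr] = True
        return ("miss", core)


llc = SlicedLLC(num_cores=6, lines_per_slice=2)
print(llc.read(0, 0x100))  # core 0 misses and fills its own slice
print(llc.read(1, 0x100))  # core 1 finds the line in core 0's slice
```

In this model, however many lines core 1 goes on to read, it only ever evicts lines from its own slice; the line that core 0 brought in stays where it is.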
Quoting zhangyihere: BUT on the Xeon 5650, because the size of the last level cache is not a power of 2, if we simply divide the address by the cache size, the division is not exact for every address. The addresses that do not divide evenly should be able to use the last level cache as well. So here is my question: how are these addresses mapped to the last level cache? If the remainder is used directly as the cache index, it is unavoidable that some cache sets service more accesses than others. Therefore, accesses to the last level cache are not evenly distributed.
The LLC is set associative. What this means is that each address from the physical address space will map into exactly one position in the LLC; however, each position has several slots that can store several memory lines that have all mapped to this same position.
For example, imagine addresses X1, X2, X3 all map to position A in the LLC, and imagine position A has 2 slots. A read to X1 will bring it to [position A, slot 1]. A later read to X2 will bring X2 to [position A, slot 2]. If X3 is read later, then the LLC will need to decide whether to evict X1 or X2, since X3 can only be written to position A (slot 1 or 2).
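The X1/X2/X3 example can be sketched in a few lines. This is only an illustration: the helper `access` is a made-up name, and least-recently-used eviction is an assumption, since the post only says the LLC has to pick X1 or X2.

```python
from collections import OrderedDict


def access(cache_set, ways, addr):
    """Read addr through one set; return the evicted address, or None."""
    if addr in cache_set:
        cache_set.move_to_end(addr)  # hit: mark as most recently used
        return None
    victim = None
    if len(cache_set) >= ways:
        # All slots are full: evict the least recently used line (assumed policy).
        victim, _ = cache_set.popitem(last=False)
    cache_set[addr] = True
    return victim


position_a = OrderedDict()  # the slots of "position A"
evictions = [access(position_a, 2, x) for x in ("X1", "X2", "X3")]
print(evictions)         # [None, None, 'X1']
print(list(position_a))  # ['X2', 'X3']
```

With LRU, reading X3 pushes out X1; a different replacement policy could just as well choose X2.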
In this example this is called a 2-way set associative cache. In general you can have an N-way set associative cache. The number of physical addresses that can map to each position is equal to (ADDRESS_SPACE / (CACHE_SIZE / SET_ASSOCIATIVITY_DEGREE))
The operator is a DIV, so they don't need to be perfect multiples in general, although in practice the value of (CACHE_SIZE / SET_ASSOCIATIVITY_DEGREE) is what needs to be a perfect divisor of the ADDRESS_SPACE.
Regarding the segregation of the LLC across the cores, the key lies in understanding how the sets are distributed across the cores. Each portion of the LLC allocated to a core will have slots that represent all the possible positions that a physical address can map to.
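To put numbers on the formula, here is a small worked calculation. The 12 MB cache size comes from this thread; the 16-way associativity and the 40-bit physical address space are assumed values chosen for illustration, not figures stated here.

```python
CACHE_SIZE = 12 * 1024 * 1024   # 12 MB LLC, from the thread
SET_ASSOCIATIVITY_DEGREE = 16   # assumption: 16-way
ADDRESS_SPACE = 2 ** 40         # assumption: 40-bit physical addresses

# Bytes covered by one way, i.e. CACHE_SIZE / SET_ASSOCIATIVITY_DEGREE.
way_size = CACHE_SIZE // SET_ASSOCIATIVITY_DEGREE

# DIV gives the addresses per position; the remainder shows whether
# way_size is a perfect divisor of the address space.
per_position, leftover = divmod(ADDRESS_SPACE, way_size)

print(way_size)      # 786432
print(per_position)  # 1398101
print(leftover)      # 262144
```

Under these assumptions the remainder is non-zero, so 768 KB is not a perfect divisor of the address space. If the position were chosen by a plain modulo, some positions would be the target of one more address than others, out of roughly 1.4 million each.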