Software Tuning, Performance Optimization & Platform Monitoring

Cache eviction policy of newer Intel CPUs

Hi everybody,

From Intel's processor optimization manual, I know that in Sandy Bridge the L1 and L2 caches are private to each core, while the L3 cache is shared by all the cores. But what is the eviction policy? For example, can data remain in an L2 cache if another core causes it to be evicted from L3? What about this policy in other architectures, e.g., in a Core 2 such as the Q8200?

Different speed with Intel Processor Identification Utility

Hi All,

I am not really sure if this topic fits here, but I hope a moderator can move it to the right forum.

I purchased this new Samsung laptop yesterday with a Core i3-3120M CPU @ 2.50GHz. This is what I see when I right-click the My Computer icon on the desktop. But when I check with the 'Intel Processor Identification Utility' software, it says that the reported speed is 1.19 GHz while the expected speed is 2.50GHz.

verifying first-touch memory allocation

Is anyone aware of a basic tool for verifying first-touch memory allocation on a NUMA platform such as Xeon EP?

According to usual expectation, pinning of MPI processes to a single CPU should result in this happening automatically (barring running out of memory, etc.), unless a non-NUMA BIOS option has been selected.

Likewise, OpenMP where data are initialized by a parallel data access scheme consistent with the way they will be used should result in allocation local to the CPU, rather than on remote memory.
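The OpenMP expectation above can be sketched as follows. This is a minimal illustration, not a verification tool: the function names `alloc_first_touch` and `sum_same_schedule` are hypothetical, and it assumes an operating system with a first-touch page placement policy and threads that stay pinned to their cores. The `#ifdef _OPENMP` guards only let the file compile cleanly when OpenMP is disabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate n doubles and initialize them in parallel.
 * Assumption: under a first-touch policy, each page is placed on the
 * NUMA node of the thread that writes it first, so the initialization
 * loop uses the same static schedule as the later compute loop. */
double *alloc_first_touch(size_t n)
{
    assert(n > 0);
    double *a = malloc(n * sizeof *a);  /* no element is written yet */
    if (a == NULL)
        return NULL;
#ifdef _OPENMP
#pragma omp parallel for schedule(static)
#endif
    for (long i = 0; i < (long)n; i++)
        a[i] = 0.0;                     /* first write to each element */
    return a;
}

/* Compute loop with the same static schedule: each thread reads the
 * index range it initialized above. */
double sum_same_schedule(const double *a, size_t n)
{
    double s = 0.0;
#ifdef _OPENMP
#pragma omp parallel for schedule(static) reduction(+:s)
#endif
    for (long i = 0; i < (long)n; i++)
        s += a[i];
    return s;
}
```

The sketch does not check where the pages actually landed. On Linux, per-page placement of a running process can be inspected through /proc/<pid>/numa_maps, which is one possible starting point for the verification asked about here.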

Notes about Loop-Blocking Optimization Technique to increase performance of processing

[ Note 1 ]

Loop-Blocking Optimization Technique is well described in the Intel Software Development Manual and the Intel C++ Compiler User and Reference Guides. After extensive testing I can say that it is very important to select the right block size for the last for-loop, and that its optimal size depends on the size of the CPU's L1 cache line.
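As an illustration of the technique, here is a minimal sketch of a blocked matrix transpose. It is not the code used in the testing described above; the function names and the `block` parameter are assumptions made for this example. With 4-byte floats and a 64-byte cache line, a block of 16 elements covers one line, which is only a plausible starting value to measure against, not a tuned result.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: writes to dst with stride n, which touches a
 * different cache line on almost every store for large n. */
void transpose_naive(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Blocked version: processes block x block tiles so the lines touched
 * in src and dst are reused before they are evicted.
 * 'block' is the tunable block size (hypothetical parameter). */
void transpose_blocked(const float *src, float *dst, size_t n, size_t block)
{
    assert(block > 0);
    for (size_t ii = 0; ii < n; ii += block) {
        size_t i_end = (ii + block < n) ? ii + block : n;
        for (size_t jj = 0; jj < n; jj += block) {
            size_t j_end = (jj + block < n) ? jj + block : n;
            /* Innermost loops stay inside one tile. */
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Both functions produce the same output for any positive block size, including sizes that do not divide n, so the block size can be varied freely while timing.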


PCM output - why does core utilization over a time interval show lower numbers?


I'm running PCM 2.5 on an Intel Xeon E5-2670. Can anybody please explain the text marked in BOLD in the PCM output below? Also, can you please explain:

1. The difference between core residency and package residency in the below PCM output.

2. If C1 represents core 1 (physical) in the PCM report, where is the information related to cores C4 and C5? (There are 8 physical cores.)

Intel Performance Counter Monitor - Can't access PCI configuration space


I'm trying to run the latest PCM 2.5 on a machine with an Intel(R) Xeon(R) CPU E5-2670 (Sandy Bridge), but PCM gives the message below and does not show any memory read/write access in the report.

Can not access SNB-EP (Jaketown) PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.
You must be root to access these SNB-EP counters in PCM

Can you please tell me what I am missing here?

Time based cache eviction

Hi everyone,

Can someone please say whether the Xeon E5-2670 has a cache eviction logic which operates solely based on time, that is, suppose we don't try to load any new memory into the processor, will data residing in any cache level, which is older than a certain time, still be evicted?

Also, does anyone know if there exists a document which provides any sort of details on the factors affecting cache management for the above processor?
