Software Tuning, Performance Optimization & Platform Monitoring

"Access to Intel(r) Performance Counter Monitor has denied (no MSR or PCI CFG space access)" message shown with pcm-memory.x

Hello, I am Jerry.

I am trying to run pcm-memory.x and pcm-power.x on Windows with the WinRing0X64.dll and .sys driver files instead of a recompiled msr.sys.

pcm-win.x runs fine, but the pcm-memory and pcm-power tools fail to run.

They always show the messages below.

For pcm-memory.x:


C:\IntelPerformanceCounterMonitor-V2.9\PCM-Memory_Win\Debug>pcm-memory.exe 5
DEBUG: Setting Ctrl+C done.

How to use the perf_event_open() function exposed by the Linux kernel to obtain "Intel Ivy Bridge-EP IMC0-7 uncore" events

Dear All,

I want to use the perf_event_open function to read the iMC performance monitoring events, but I don't know how to set up the parameters. I tried to set them up, but the results I get are inconsistent with the results obtained from the perf commands. The function is

int perf_event_open(struct perf_event_attr *attr,
                           pid_t pid, int cpu, int group_fd,
                           unsigned long flags);

For this event
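As a rough sketch of how the parameters for an uncore iMC counter are usually put together: the PMU type is not one of the fixed PERF_TYPE_* constants but is read from sysfs, the config packs the event code and unit mask, and because uncore PMUs are not tied to a task the call uses pid = -1 with an explicit cpu. The helper names below are hypothetical, and the bit layout (event in config bits 0-7, umask in bits 8-15) is an assumption that should be checked against the `format` directory of the PMU on the target machine.

```python
from pathlib import Path

# Standard sysfs location where the kernel lists its PMUs.
SYSFS_PMU_ROOT = Path("/sys/bus/event_source/devices")


def encode_uncore_config(event, umask):
    """Pack event code and unit mask into perf_event_attr.config.

    Assumes event occupies config bits 0-7 and umask bits 8-15.
    """
    if not 0 <= event <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("event and umask must each fit in 8 bits")
    return event | (umask << 8)


def read_pmu_type(pmu_name, root=SYSFS_PMU_ROOT):
    """Read the dynamic PMU type id, e.g. for 'uncore_imc_0'."""
    return int((Path(root) / pmu_name / "type").read_text().strip())


def uncore_open_args(pmu_name, event, umask, cpu, root=SYSFS_PMU_ROOT):
    """Collect the values to pass to perf_event_open() for one uncore event."""
    return {
        "type": read_pmu_type(pmu_name, root),         # perf_event_attr.type
        "config": encode_uncore_config(event, umask),  # perf_event_attr.config
        "pid": -1,       # uncore counters are system-wide, not per task
        "cpu": cpu,      # a CPU on the socket that owns the iMC
        "group_fd": -1,  # no event group
        "flags": 0,
    }
```

For example, `uncore_open_args("uncore_imc_0", 0x04, 0x03, cpu=0)` would describe a read CAS count on the first memory channel, assuming event 0x04 and umask 0x03 are the right codes for this processor; verify them in the uncore performance monitoring guide. Comparing the resulting values with what `perf stat -vv` reports for the same event is one way to find a mismatch.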

How to profile the communication between Host and Mic during Offload Mode

I need to know exactly how much data is transferred between the host and the MIC, and when, while an offload application is running.

Unlike Nvidia, Intel doesn't provide a tool like NV Profiler.

So I thought VTune Amplifier XE might do the job. Unfortunately, I found that amplxe analyzes only the MIC's performance when I profile an offload application.

So I have come to the forum for help. Can anyone help me?

Q on TLB, Cache and Memory Timings

I'm putting together a lecture on paging for my operating systems class, and our textbook (Silberschatz) gives an overly simplistic example of calculating the effective access time for memory.

I have been trying to gather some more up-to-date information so that students will see the relative impact of different parts of the hardware, but I am doing a poor job. I have no idea what the average TLB access time on Broadwell is. I just need something in the ballpark. With the larger L2 TLB I am guessing that the hit rate is over 99% on average.
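For reference, the textbook-style calculation can be sketched as a small function. The timings used with it are placeholders chosen for easy arithmetic, not Broadwell measurements, and `walk_accesses` (the number of memory accesses a page-table walk costs on a TLB miss) is a parameter added here so the same formula covers both the single-level textbook case and a four-level x86-64 page table.

```python
def effective_access_time(hit_ratio, tlb_ns, mem_ns, walk_accesses=1):
    """Effective memory access time in ns for a simple TLB model.

    Hit:  one TLB lookup plus one memory access.
    Miss: one TLB lookup, `walk_accesses` memory accesses to walk the
          page table, then the memory access itself.
    Caches and page-walk caches are ignored in this model.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    hit_cost = tlb_ns + mem_ns
    miss_cost = tlb_ns + (walk_accesses + 1) * mem_ns
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


if __name__ == "__main__":
    # Placeholder numbers: 1 ns TLB lookup, 100 ns memory access.
    for ratio in (0.90, 0.99, 0.999):
        one_level = effective_access_time(ratio, 1, 100, walk_accesses=1)
        four_level = effective_access_time(ratio, 1, 100, walk_accesses=4)
        print(f"hit ratio {ratio}: {one_level:.2f} ns vs {four_level:.2f} ns")
```

Even in this crude model, the gap between the one-level and four-level results shrinks quickly as the hit ratio approaches 1, which is the relative effect the lecture is trying to show.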

MD5 optimization to support Amazon s3 integrity check generation

Amazon S3 uses MD5 for an integrity check on its data (note: this is a non-security function, so MD5 is OK; I have no security concerns about using MD5 for an integrity check). Since this is their choice, I now compute MD5 in software, with no hardware assist, for the on-premises data I move to and from Amazon using Intel servers.
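As a baseline for what such a software path looks like, here is a minimal sketch that streams data through MD5 in fixed-size chunks and returns the digest base64-encoded, which is the form the Content-MD5 header carries. The function name and the 1 MiB chunk size are illustrative choices, not anything prescribed by S3.

```python
import base64
import hashlib
import io


def content_md5(stream, chunk_size=1 << 20):
    """Return the base64-encoded MD5 digest of a binary stream.

    The stream is read in chunks so that large objects do not need to
    fit in memory. MD5 is used here only as an integrity check.
    """
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


if __name__ == "__main__":
    print(content_md5(io.BytesIO(b"example payload")))
```

Timing this function over a large file gives a single-stream software throughput figure to compare any optimized implementation against.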

Impact of "RdCode" on remote CPU via QPI

Hi All,

I am now working on Intel's HARP system (CPU+FPGA, connected by QPI).

After some reading, I roughly understand the impact of "RdData" on the remote CPU via QPI.

However, the FPGA can also issue "RdCode" to the CPU via QPI, and I do not know the exact steps the CPU goes through. Thanks.



Bug in PCM CSV formatting


While monitoring with Intel PCM I noticed a minor bug in the CSV output format: the topmost header is not shifted to align correctly with the second header and the values that follow.

The bug is simply a missing semicolon in the header section that separates the Core X (Socket Y) labels. E.g.

Core44 (Socket 1);;;;;;;;;;;;;;;;;Core45 (Socket 1)

This has 16 semicolons, resulting in 15 spaces when exported to something like Excel, but the actual values for a given core have 17 entries, so this should ideally contain 17 semicolons.
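The alignment rule can be illustrated with a small sketch. This is not PCM's code; the helper names are hypothetical. It relies only on the fact that a group label followed by N separators makes the next label start N columns later, so N has to equal the number of per-core value columns.

```python
def build_group_header(labels, columns_per_group, sep=";"):
    """Build the top header row: each label spans `columns_per_group` columns."""
    return "".join(label + sep * columns_per_group for label in labels)


def group_start_columns(header, sep=";"):
    """Return the column index at which each non-empty header cell starts."""
    return [i for i, cell in enumerate(header.split(sep)) if cell]


if __name__ == "__main__":
    labels = ["Core44 (Socket 1)", "Core45 (Socket 1)"]
    aligned = build_group_header(labels, 17)
    print(group_start_columns(aligned))
```

With 17 separators per group the second label starts at column 17, directly above the first of that core's 17 values. With 16 it starts one column early, which matches the misalignment described above.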


Memory bandwidth on Core i3 (Westmere/Clarkdale)

I'm trying to measure memory bandwidth on a Core i3 530 CPU with the pcm.exe and pcm-memory.exe utilities (v2.9), but it seems to be impossible. Here is what I got:

from pcm.exe:


DEBUG: Setting Ctrl+C done.

 Intel(r) Performance Counter Monitor V2.9 (2015-08-07 10:23:17 +0200 ID=721d9e3)

 Copyright (c) 2009-2015 Intel Corporation

Starting MSR service failed with error 2 Trying to load winring0.dll/winring0.sys driver...
Using winring0.dll/winring0.sys driver.

[PCM] Intel CPU performance counters to get real core utilization % with HT

Hi all,
This is my first post on the Intel forum, but I have read more or less all the threads regarding PCM and core utilization with PC.
I still have doubts and questions about how to use PCM effectively to get the real core utilization % when HT is active.
