Intel® Developer Zone:
Platform Monitoring

Welcome to the Intel Platform Monitoring Community!

Here you will find information on performance monitoring, software tuning, and platform monitoring. Performance monitoring covers a variety of topics, including an introduction to monitoring and software tuning methodologies, as well as software optimization techniques and best known methods (BKMs) for novice and more advanced users.

For developers, programming reference manuals are available with the latest information describing the hardware interface of the Performance Monitoring Unit (PMU) of Intel microprocessors, including core and uncore monitoring resources, as well as the definitive source of information on the performance events that may be monitored.

Platform monitoring includes machine monitoring topics, such as monitoring CPU cores, graphics processors, and other system coprocessors, as well as metering and quality of service.

Memory and Cache Profiling Erratum on Intel® Xeon® processor E5 family
By Angela Schmid (Intel), posted 05/06/2013 (0 comments)
Audience: Anyone collecting event-based performance data on a platform based on the Intel® Xeon® processor E5 family. A performance monitoring unit (PMU) erratum on the Intel® Xeon® processor E5 family affects the events used for memory and cache profiling. To collect data on the events...
Intel Architecture and Processor Identification With CPUID Model and Family Numbers
By Hussam Mousa (Intel), posted 06/15/2012 (11 comments)
Summary of recent Intel processors' CPUID values, with model and family numbers linked to the architecture codename and processor codename as well as to brand names and models. The summary covers mainline IA x86 and x64 processors at 90nm, 65nm, 45nm, and 32nm.
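As a rough illustration of how the family and model numbers in such a summary relate to the raw CPUID value, the following sketch decodes the processor signature returned in EAX by CPUID leaf 1. The function name and the sample values are assumptions chosen for illustration; the bit layout follows the usual description in the Intel Software Developer's Manual.

```python
def decode_cpuid_signature(eax: int) -> dict:
    """Split a CPUID leaf 1 EAX value into display family, model, and stepping."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family

    # The extended model is only used for base family 6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return {"family": family, "model": model, "stepping": stepping}


# Sample signature values (illustrative inputs, not taken from the article).
for signature in (0x000306C3, 0x000206D7):
    info = decode_cpuid_signature(signature)
    print(f"{signature:#010x}: family {info['family']:#x}, "
          f"model {info['model']:#x}, stepping {info['stepping']}")
```

The decoded family and model pair is what tables of this kind typically use as the lookup key for the architecture codename.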
Platform Monitoring Basics
Posted 09/15/2010 (0 comments)
Introduction: This is a living document that will expand as need and request dictate. Its purpose is to help establish a baseline understanding of the terms, concepts, and capabilities used in Platform Monitoring. Performance Monitoring Terminolog...
Performance Optimization and Platform Monitoring
Posted 09/15/2010 (2 comments)
Introduction: This discussion covers some of the needs and implications that drive the optimization and management of a platform. We describe opportunities to influence the ultimate performance of a computer system at the architectural, platform, and software levels, and provide a rationale fo...
Dissecting STREAM benchmark with Intel® Performance Counter Monitor
By Roman Dementiev (Intel), posted on 11/23/10 (8 comments)
Intel® Performance Counter Monitor (Intel® PCM) is an API and a set of tools that should help developers to understand how their applications utilize the underlying compute platform. In this blog I will explain how to instrument the well-known STREAM benchmark with library functions of Intel® PCM...
Run to Run variability on an Intel(R) Xeon(R) CPU E3-1240 v3
By Animesh J. (0 replies)
Hi, I have been struggling to get reproducible results in a very simple matrix multiplication code. I see variability of more than 10% from run to run. I have run the perf stat command to monitor the runs.

1024x1024
Performance counter stats for 'taskset 0x1 MM_binaries/MM_tiled_1024':

    198,505,412,302 cycles                    #    0.000 GHz                     [57.14%]
    283,932,630,578 instructions              #    1.43  insns per cycle         [71.42%]
    111,811,914,939 L1-dcache-loads                                              [71.43%]
      1,091,387,355 L1-dcache-load-misses     #    0.98% of all L1-dcache hits   [71.43%]
      1,088,112,128 r504f2e                                                      [71.44%]
        543,287,591 r50412e                                                      [71.43%]
                  0 LLC-prefetches                                               [57.14%]

       71.139608212 seconds time elapsed

Performance counter stats for ...
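The derived figures that perf stat prints next to the raw counts can be reproduced by hand, which is a useful sanity check when comparing runs. The sketch below recomputes them from the numbers quoted in the post; the helper name is made up for illustration. Note that the bracketed percentages in perf stat output indicate the fraction of the run during which each event was actually counted, so the totals shown are scaled estimates rather than exact counts.

```python
def parse_count(text: str) -> int:
    """Convert a perf-style count such as '1,091,387,355' to an integer."""
    return int(text.replace(",", ""))


# Raw values copied from the perf stat output quoted in the post.
cycles = parse_count("198,505,412,302")
instructions = parse_count("283,932,630,578")
l1_loads = parse_count("111,811,914,939")
l1_load_misses = parse_count("1,091,387,355")
elapsed_seconds = 71.139608212

# Instructions per cycle, as reported in the "insns per cycle" column.
ipc = instructions / cycles

# L1 data cache load miss ratio, as a percentage of all L1 loads.
l1_miss_percent = 100.0 * l1_load_misses / l1_loads

# Average cycle rate over the wall-clock time of the run.
effective_ghz = cycles / elapsed_seconds / 1e9

print(f"IPC: {ipc:.2f}")
print(f"L1-dcache load miss ratio: {l1_miss_percent:.2f}%")
print(f"Cycles per second of elapsed time: {effective_ghz:.2f} GHz")
```

Computing the same three values for each run makes it easier to see whether the variability shows up in the instruction count, in the cycle count, or only in elapsed time.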
Xeon E5 26xx v3 energy monitor error
By Roberto R. (5 replies)
I am developing an app for an Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz. I need to access the cores' energy consumption, and I am doing it using the RAPL feature. I have access to the DRAM and package energy consumption, but when I try to access the core consumption (PP0) I always read zero from MSR_PP0_ENERGY_STATUS. I am using the MSR specification from the Software Developer Manual, table 35.25. As I said, I can read the package and DRAM energy consumption, but PP0 always returns 0. From the manual I understood this feature is present in my processor. What could be wrong? I tried my program on an Intel(R) Core(TM) i7-4702MQ CPU @ 2.20GHz and it works perfectly, but on the server it does not work. Thanks in advance.
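For readers unfamiliar with RAPL, the sketch below shows how raw energy-status readings are usually converted to joules. It does not explain the zero PP0 reading described in the post. The MSR addresses are the ones commonly documented in the Software Developer Manual; the helper names, the sample register values, and the use of the Linux /dev/cpu/N/msr interface (which needs the msr kernel module and root privileges) are assumptions made for illustration.

```python
import os
import struct

# MSR addresses as commonly documented for RAPL.
MSR_RAPL_POWER_UNIT = 0x606
MSR_PKG_ENERGY_STATUS = 0x611
MSR_DRAM_ENERGY_STATUS = 0x619
MSR_PP0_ENERGY_STATUS = 0x639


def read_msr(cpu: int, register: int) -> int:
    """Read a 64-bit MSR through the Linux msr driver (assumed to be loaded)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, register))[0]
    finally:
        os.close(fd)


def energy_unit_joules(power_unit_msr: int) -> float:
    """Energy status unit: 1 / 2**ESU joules, with ESU in bits 12:8."""
    esu = (power_unit_msr >> 8) & 0x1F
    return 1.0 / (1 << esu)


def energy_delta_joules(before: int, after: int, unit: float) -> float:
    """Energy consumed between two readings of a 32-bit wrapping counter."""
    delta = ((after & 0xFFFFFFFF) - (before & 0xFFFFFFFF)) & 0xFFFFFFFF
    return delta * unit


# Hypothetical sample values, so the arithmetic can be checked without MSR access.
sample_power_unit = 0x000A0E03
unit = energy_unit_joules(sample_power_unit)
print(f"energy unit: {unit * 1e6:.2f} microjoules")
print(f"delta: {energy_delta_joules(0xFFFFFFF0, 0x00000010, unit):.9f} J")
```

On a real system, the two readings would come from calls such as read_msr(0, MSR_PP0_ENERGY_STATUS) taken some interval apart. A counter that always reads zero yields a zero delta regardless of the unit, so comparing PP0 against the package and DRAM counters with the same code can help confirm that the conversion itself is not the problem.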
Instrument all LD/ST instructions for a given snippet of code
By Jithin Parayil T. (1 reply)
Hi, Currently, I measure the L2 and L3 cache hit ratio and misses for a C++ code snippet using the Intel PCM library. Can the PCM library be used to collect additional information as described below? I would like to instrument all load and store instructions executed by a snippet of C++ code. Following is the information I would like to collect for each such instruction:
* Virtual memory address being accessed for the data
* Was the access a cache hit or not? If it was a hit, which level of cache did it hit in?
If the PCM library does not support this, is there an alternative option to collect this info? I'm working on an IvyBridge machine: Intel(R) Xeon(R) CPU E5-2697 v2, 2 sockets with 12 cores per socket (Hyperthreading is enabled, hence 24 logical cores/socket). Thanks, Jithin
Do Intel CPUs have options to speed up IPC?
By Jog L. (2 replies)
Hello, Maybe this is a silly question, but still: do Intel CPUs have any options to make interprocess communication faster? I currently use mmapped regions, but maybe something even better could be done at the CPU level. My feeling is that it cannot be used in any way because the kernel organizes things, but it would be amazing still. I am on Linux, with kernel 3.19. Any pointers on how to make things go faster are of interest to me.
Inline assembly to generate most heat on SB-E
By CommanderLake10
I'm curious as to which __asm instructions would generate the most heat on an SB-E for stability testing. With prime95 I can get the CPU package power to just over 130 W, but experimenting with my own AVX assembly I can't get more than 100 W out of it.
Intel PCM, measuring RAM activity? (data, not energy)
By Jeremie Lagraviere (2 replies)
Hi everyone, I have a simple question about Intel PCM: is it possible to measure RAM data activity in the code? Same question for any of the ready-made tools provided in the Intel PCM package. All I have seen so far is RAM energy measurement. Thanks in advance for your help :) --Jeremie.
Error building IntelPerformanceCounterMonitorV2.8 in Visual Studio 2013
By kmathew8
Hello, I encounter the following error when I compile IntelPerformanceCounterMonitorV2.8 using Visual Studio 2013.

1>------ Build started: Project: PCM-Service, Configuration: Release Win32 ------
1>  utils.cpp
1>c:\intelperformancecountermonitorv2.8\utils.h(60): error C3861: 'YieldProcessor': identifier not found
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

Has anyone encountered this issue? Thanks
CPU/GPU ring programming
By Samy C. (1 reply)
Hello, I am new on this forum. I am starting to design C language software based on CPU+GPU programming. The idea is to offload processing to the GPU when the tasks become heavy for some part of the software. I read the Gen8 paper that talks about CPU+GPU shared ring memory, and I am looking for information on how to program an application to use these features. Could you recommend some APIs, debugging tools, and cache layer (L1, L2, L3) tools to visualize and investigate how the software behaves on these layers? I currently use a MacBook Air (Device Intel(R) Core(TM) i5-4260U CPU @ 1.40GHz supports OpenCL 1.2; Device HD Graphics 5000 supports OpenCL 1.2). Sorry for such a beginner question.
Using Intel® GPA to Check Power Usage

Brad Hill of Intel talks about using Intel GPA to check application power usage. Learn how to use the GPA tool to analyze the power consumption of graphics- and CPU-intensive applications.

Intel® Graphics Performance Analyzers 2012 R5

Paul Lindberg talks about the Intel® Graphics Performance Analyzers 2012 R5 releases, and gives a preview of what will be coming in 2013 for GPA.

Software Performance Monitoring

Highlights from the Community Manager

On Jan 5, 2011, Intel launched the 2nd Generation Intel® Core™ processor family (formerly code-named Sandy Bridge) for laptops and PCs. The new processors have a revolutionary new architecture that combines the computing “brain,” or microprocessor, with a graphics engine on the same die for the very first time. New features include Intel® Insider™, Intel® Quick Sync Video, and a new version of the company's award-winning Intel® Wireless Display (WiDi), which now adds 1080p HD and content protection for those wishing to beam premium HD content from their laptop screen to their TV.

Stay connected. Visit often. We will be posting the PMU programming guides and updated tools to give you the latest information on the new Intel microarchitecture innovations.