Intel® Developer Zone:
Platform Monitoring

Welcome to the Intel Platform Monitoring Community!

Here you will find information on performance monitoring, software tuning, and platform monitoring. Performance monitoring covers a variety of topics, including an introduction to monitoring and software tuning methodologies, as well as software optimization techniques and best known methods (BKMs) for novice and more advanced users.

For developers, programming reference manuals are available with the latest information describing the hardware interface of the Performance Monitoring Unit (PMU) of Intel microprocessors, including core and uncore monitoring resources, as well as the definitive source of information on the performance events that may be monitored.

Platform monitoring includes machine monitoring topics such as monitoring of CPU cores, graphics processors, and other system coprocessors, as well as metering and quality of service.

Memory and Cache Profiling Erratum on Intel® Xeon® processor E5 family
By Angela Schmid (Intel) Posted 05/06/2013 0
Audience: Anyone collecting event-based performance data on a platform based on the Intel® Xeon® processor E5 family. There is a performance monitoring unit (PMU) erratum on the Intel® Xeon® processor E5 family that affects the events used for memory and cache profiling. To collect data on the events...
Intel Architecture and Processor Identification With CPUID Model and Family Numbers
By Hussam Mousa (Intel) Posted 06/15/2012 11
Summary of recent Intel processors' CPUID values, with model and family numbers linked to the architecture codename and processor codename as well as their brand names and models. The summary covers mainline IA x86 and x64 90nm, 65nm, 45nm, and 32nm processors.
Platform Monitoring Basics
Posted 09/15/2010 0
Introduction: This is an organic document, meaning that it will expand as need and request dictate. The purpose is to help establish a baseline understanding of the terms used in Platform Monitoring, the concepts described, and the uses and capabilities covered. Performance Monitoring Terminolog...
Performance Optimization and Platform Monitoring
Posted 09/15/2010 2
Introduction: This discussion covers some of the needs and implications that drive one to optimize and manage one’s platform. In this, we disclose opportunities to influence the ultimate performance of a computer system at the architectural, platform and software levels, and provide a rationale fo...


Dissecting STREAM benchmark with Intel® Performance Counter Monitor
By Roman Dementiev (Intel) Posted 11/23/2010 6
Intel® Performance Counter Monitor (Intel® PCM) is an API and a set of tools that should help developers to understand how their applications utilize the underlying compute platform. In this blog I will explain how to instrument the well-known STREAM benchmark with library functions of Intel® PCM...
Intel® VTune™ Amplifier XE for Linux
By thangam@cmmacs.ernet.in 2
Hi, our customer is trying to evaluate Intel® VTune™ Amplifier XE for Linux. During activation, at Step 3, the installer shows:
Step 3 of 7 | Activation > Remote Offline Activation
In order to complete the offline activation process, you will need to use a system that is connected to the Internet.
1. On the connected system, go to:
2. Enter this product activation code on the above web page: PMG2-GTNZ-2GGW-7EC3-MNQ3-P4V6-DMF
3. Save the product unlock code that is displayed on the above web page.
4. Choose option 1 below and enter the product unlock code to complete the activation process.
We went to that URL and tried to register with the product key, but we get the message below:
Sorry, evaluation serial numbers may not be activated using this Remote Activation Page. Please ...
Single-threaded memory performance for dual socket Xeon E5-* systems
By Thomas B. 5
Hi all, I had originally asked this question in a separate Intel community forum, but it was suggested that I repost here. There is also a stackoverflow question from another user, linked in the other posting, that provides more details on a specific test platform. To summarize the core question/observation: When benchmarking the memory performance of (pinned) single-threaded operations on large buffers (larger than the last level of cache), we observe substantially lower copy bandwidth on dual-socket E5-26XX and E5-26XX v2 Xeon systems than on other systems tested, including older Westmere systems, i7 CPUs, etc. This result can be seen using CacheBench, as shown in the stackoverflow posting. I realize that the aggregate bandwidth numbers can be increased substantially by using multiple threads pinned to eac...
What causes the retired instructions to increase?
By Shiv_Inside 1
This is a re-post from stack-overflow: I have a 496*O(N^3) loop. I am performing a blocking optimization technique where I'm operating on 2 images at a time instead of 1. In raw terms, I am unrolling the outer loop. By the way, I'm using an Intel Xeon X5365 machine that has 8 cores, a 3GHz clock, 1333MHz bus frequency, 8MB of shared L2 (4 MB shared between every 2 cores), 32KB L1-I, and 32KB L1-D. The non-unrolled version of the code is shown below:
for(imageNo =0; imageNo<496;imageNo++){
  for (unsigned int k=0; k<256; k++) {
    double z = O_L + (double)k * R_L;
    for (unsigned int j=0; j<256; j++) {
      double y = O_L + (double)j * R_L;
      for (unsigned int i=0; i<256; i++) {
        double x[1] = {O_L + (double)i * R_L} ;
        double w_n = (A_n[2] * x[0] + A_n[5] * y + A_n[8] * z + A_n[11]) ;
        double u_n = ((A_n[0] * x[0] ...
By Huazhe Z. 2
Hi guys, I have been using RAPL for a while. However, I have not figured out how to stop RAPL from accessing the PMU so that I can use PCM. Could anyone give me some ideas on that? Thanks!
SIMD optimisation
By kiran N. 0
Hi, I am currently looking to port an H.264 decoder (video codec) from the SSE4.2 to the AVX2 instruction set. Are there any benchmarking numbers for video codecs using AVX2 (intrinsics or assembly code)? Please let me know, so that I know what results to expect before I start working on it. Thanks in advance. Regards, Kiran.
Varying CPU usage despite the same test pattern
By Oleg A. 11
Hello all, We've got very strange behavior when testing IP packet forwarding performance on a Sandy Bridge platform (Supermicro X9DRH with the latest BIOS) running the Linux kernel. This is a two-socket E5-2690 CPU system. Using a different PC, we generate DDoS-like traffic at a rate of about 4.5 million packets per second. Traffic is received by two Intel 82599 NICs and forwarded using the second port of one of these NICs. All load is evenly distributed between the two nodes, so the SI usage of each of the 32 CPUs is virtually equal. Now the strangest part. A few moments after pktgen starts on the traffic generator PC, average CPU usage on the SB system goes to 30-35%. No packet drops, no rx_missed_errors, no rx_no_dma_resources. Very nice. But CPU usage then starts decreasing gradually. After about 10 seconds we see ~15% average among all CPUs. Still no packet drops, the same RX rate as in the beginning, and the RX packet count is equal to the TX packet count. After some time we see that average usage starts to go up. Peaked at ini...
Trouble getting useful information from the PCM
By Kirk S. 1
Hi everybody. I am a novice when it comes to using Intel performance monitoring products, so I am not sure how to properly phrase this question. I am attempting to use the PCM to test some code a colleague and I developed for large-scale numerical calculations in a supercomputing environment. We want to investigate some performance characteristics of our methods compared to competitors. The problem is that I am getting some nonsense information from the PCM, and I do not know why.
=== Relevant hardware and OS information ===
Computer: MacBook Pro with Intel i5 processor
OS: OpenSUSE 13.1 running within a VirtualBox
PCM Version: 2.0b
When I run our code with calls to the PCM, I get the following information returned:
getIPC(before_sstate1,after_sstate1) // Returns -1
getL3CacheHitRatio(before_sstate1,after_sstate1) // Returns 1
getL3CacheMisses(before_sstate1,after_sstate1) // Returns 0
getBytesReadFromMC(before_sstate1,after_sstate1) // Returns 0
Where before_sstate1 and after_sstate1 ar...
Have a beeping problem with new ram memory.
By pinger d. 4
I think it's about the performance of the computer. I have a beeping problem with new RAM (4 GB): beeping, and no display. Motherboard: According to this, my desktop board should accept the memory stick, but in practice it does not. RAM sticks: I have two of these RAM sticks and both are new. I don't think they are broken. I put just one in slot A, not two in slots A and B, because the information said the maximum is 4 GB. The PC now works with 2x1 GB RAM sticks. 1333 MHz memory, default (5-5-5-15). Any suggestions, questions, or solutions to the problem? Someone wrote: "And you are right, it supports DDR3-1066/800 - not 1333. It was just pure chance that your 1 GB 1333 MHz modules worked - sometimes higher frequency RAM will downclock and run - that does not mean that all 2 GB 1333 MHz should work. This is why you are hearing the beep and having other problems. Please change over t...


Using Intel® GPA to Check Power Usage

Brad Hill of Intel talks about using Intel GPA to check application power usage. Learn how to use the GPA tool to analyze the power consumption of graphics- and CPU-intensive applications.

Intel® Graphics Performance Analyzers 2012 R5

Paul Lindberg talks about the Intel® Graphics Performance Analyzers 2012 R5 releases, and gives a preview of what will be coming in 2013 for GPA.

Software Performance Monitoring



Highlights from the Community Manager

On Jan 5, 2011, Intel launched the 2nd Generation Intel® Core™ processor family (formerly code-named Sandy Bridge) for laptops and PCs. The new processors have a revolutionary new architecture that combines the computing “brain,” or microprocessor, with a graphics engine on the same die for the very first time. New features include Intel® Insider™, Intel® Quick Sync Video, and a new version of the company's award-winning Intel® Wireless Display (WiDi), which now adds 1080p HD and content protection for those wishing to beam premium HD content from their laptop screen to their TV.

Stay connected. Visit often. We will be posting the PMU programming guides and updated tools to give you the latest information on the new Intel microarchitecture innovations.