Software Tuning, Performance Optimization & Platform Monitoring

Intel® VTune™ Amplifier XE for Linux


Our customer is trying to evaluate Intel® VTune™ Amplifier XE for Linux.

During activation, at Step 3:

Step 3 of 7 | Activation > Remote Offline Activation


In order to complete the offline activation process, you will need to use a system that is connected to the Internet.

Single-threaded memory performance for dual socket Xeon E5-* systems

Hi all,

I had originally asked this question in a separate Intel community forum, but it was suggested that I repost here. There is also a Stack Overflow question from another user, linked in the other posting, that provides more details on a specific test platform.

SIMD optimisation


I am currently looking to port an H.264 decoder (video codec) from the SSE4.2 to the AVX2 instruction set. Are there any benchmark numbers for video codecs using AVX2 (instruction set or assembly code)? Please let me know, so that I know what results to expect before I start working on it.

Thanks in advance



Trouble getting useful information from the PCM

Hi everybody. I am a novice when it comes to using Intel performance monitoring products, so I am not sure how to properly phrase this question.

I am attempting to use the PCM to test some code a colleague and I developed for large-scale numerical calculations in a supercomputing environment. We want to investigate some performance characteristics of our methods compared to competitors. The problem is that I am getting some nonsense information from the PCM, and I do not know why.

Have a beeping problem with new RAM.

I think it's about the performance of the computer.

I have a beeping problem with new RAM (4 GB).

Beeping, and no display.

Motherboard:

According to this, my desktop board should accept the memory stick, but in practice it does not.

RAM sticks:

Breaking down data access by cache and memory levels


I'm trying to analyze and compare two simple geometric multi-grid kernels using performance counters to see how much of the data (in Bytes) is coming from L1, L2, LLC and the DRAM for each implementation.

I realize that getting an accurate count is extremely difficult with so many different things going on underneath (prefetching, instructions, cache lines, etc.), so I am trying to get at least a *rough estimate*.
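As a rough illustration of the kind of estimate meant here, the sketch below converts per-level cache-line transfer counts into byte volumes. It assumes 64-byte cache lines, and the level names and sample counts are hypothetical placeholders, not real counter names or measurements. Whether prefetched lines are included depends on the events actually used, so treat the result as an approximation at best.

```python
CACHE_LINE_BYTES = 64  # assumed cache-line size; verify for the target CPU


def bytes_per_level(line_counts, line_bytes=CACHE_LINE_BYTES):
    """Convert cache-line transfer counts per level into byte volumes."""
    return {level: count * line_bytes for level, count in line_counts.items()}


def share_per_level(line_counts):
    """Return each level's fraction of all counted line transfers."""
    total = sum(line_counts.values())
    if total == 0:
        return {level: 0.0 for level in line_counts}
    return {level: count / total for level, count in line_counts.items()}


if __name__ == "__main__":
    # Hypothetical counts of lines delivered from each level (not measured).
    sample = {"L2": 200_000, "LLC": 50_000, "DRAM": 10_000}
    shares = share_per_level(sample)
    for level, nbytes in bytes_per_level(sample).items():
        print(f"{level}: {nbytes / 1e6:.2f} MB ({shares[level]:.1%} of lines)")
```

Data served directly from L1 does not show up as line transfers; estimating it would need load/store counts multiplied by an assumed access width, which is an even rougher approximation.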

I'm using LIKWID to analyze my code and I was hoping to get what I need using the following counters:


Memory Bandwidth on 2 socket Xeon E5-2670 without uncore counters


I need to measure memory bandwidth in a data center where each node is a 2-socket Xeon E5-2670.

I know it can be measured with uncore performance counters (iMC performance monitoring, CAS_COUNT) as described in the Intel Xeon E5-2600 Product Family Uncore Performance Monitoring Guide, but when I look in /sys/bus/event_source/devices/ there are no uncore counters... (I guess this is because it runs an old Linux kernel, 3.0, but unfortunately I cannot change this, nor can I be root.)
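The check described above can be scripted so it is easy to repeat across nodes. This is only a minimal sketch: it assumes that kernels with uncore support expose PMU devices whose names start with "uncore" (for example uncore_imc_0) under /sys/bus/event_source/devices, and the helper name list_uncore_pmus is made up for illustration. It needs no root access, since it only lists a directory.

```python
import os

SYSFS_PMU_DIR = "/sys/bus/event_source/devices"


def list_uncore_pmus(pmu_dir=SYSFS_PMU_DIR):
    """Return the sorted names of PMU devices whose name starts with 'uncore'.

    An empty list means either the directory is missing or unreadable,
    or the kernel exposes no uncore PMUs there.
    """
    try:
        entries = os.listdir(pmu_dir)
    except OSError:
        return []
    return sorted(name for name in entries if name.startswith("uncore"))


if __name__ == "__main__":
    pmus = list_uncore_pmus()
    if pmus:
        print("Uncore PMUs found:", ", ".join(pmus))
    else:
        print("No uncore PMUs visible under", SYSFS_PMU_DIR)
```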
