Software Tuning, Performance Optimization & Platform Monitoring

What port(s) are referred to by "Cycles of Port X Utilized"?

The General Exploration analysis in VTune 2015 reports several columns that refer to port utilization, for example "Cycles of 1 Port Utilized". The documentation on these columns is, well, less than helpful. What are the ports? What do they do? If code is heavily using the ports, what can be done about it?

Event for speculatively executed instructions

Hi All,

We need to measure, on Intel machines, the instructions that are executed speculatively but never committed. We want to count how many instructions are discarded over a period of time, to see how well speculative execution is working. We checked Intel's Performance Monitoring Events manual but can't figure out which event to monitor. Could you please tell us where to look, or what name the event goes by?
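
One common way to estimate this, sketched here under assumptions, is to compare how many micro-ops were issued with how many actually retired. On Sandy Bridge through Skylake cores the relevant counters are usually UOPS_ISSUED.ANY and UOPS_RETIRED.RETIRE_SLOTS; note that they count micro-ops rather than instructions, and their availability on a given CPU should be checked against the event list for that model. The function name and the counter readings below are hypothetical.

```python
def discarded_uop_fraction(uops_issued: int, uops_retired: int) -> float:
    """Fraction of issued micro-ops that never retired (were squashed)."""
    if uops_issued <= 0:
        raise ValueError("uops_issued must be positive")
    # Clamp at zero: the two counters are sampled at slightly different
    # pipeline stages, so small negative differences are measurement noise.
    discarded = max(uops_issued - uops_retired, 0)
    return discarded / uops_issued


# Hypothetical readings of UOPS_ISSUED.ANY and UOPS_RETIRED.RETIRE_SLOTS
# taken over the same sampling interval.
issued, retired = 1_200_000, 1_000_000
fraction = discarded_uop_fraction(issued, retired)
print(f"discarded: {issued - retired} uops ({fraction:.1%} of issued)")
```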

Precision lost when compiled with -xAVX or -xHOST

Hi all,

My program's output consists of double and floating-point values. When I compile without the -xAVX or -xHOST options, the results are correct, but most of the loops aren't vectorized. When I use -xAVX or -xHOST, most of the loops are vectorized and performance improves, but precision is lost. When I run the same program on a larger dataset, this small precision loss results in wrong output. I've even tried the -fp-model precise/strict options along with -xHOST, but I still get wrong output.
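
A likely contributor, though not necessarily the cause in this particular program, is that vectorizing a reduction changes the order in which values are added, and floating-point addition is not associative. The sketch below imitates a 4-lane vectorized sum in plain Python and compares it with a sequential sum; the function names and the input data are illustrative assumptions.

```python
import math


def sum_sequential(values):
    """Add the values strictly left to right, like a scalar loop."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_lanes(values, lanes=4):
    """Imitate a vectorized reduction: one partial sum per lane,
    combined only at the end."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total


# Large and small magnitudes mixed together, so rounding depends on order.
data = [1e16, 1.0, -1e16, 1.0] * 4

print("sequential :", sum_sequential(data))
print("4 lanes    :", sum_lanes(data))
print("math.fsum  :", math.fsum(data))  # correctly rounded reference
```

Both orderings are valid IEEE 754 evaluations of the same expression; they simply round at different points, which is why neither needs to match the correctly rounded reference.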

Floating-point exceptions

Hi, I want to understand the IEEE 754 format (I chose the addition operation as an example), but I have a problem: the standard does not describe precisely how the status bits are formed. For example, I found an algorithm that extends the mantissa of the result by three bits to the right, which lets you detect an inexact result (rounding mode: to nearest). I decided to simulate the algorithm and compare the results with my Intel i7 processor (Control Register "cwr"), and I get different results.
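
When debugging a simulation like this, it can help to have an independent reference for when an addition should be flagged inexact. The sketch below does not implement the three-extra-bits algorithm; instead it uses exact rational arithmetic to decide whether the rounded double-precision sum differs from the mathematically exact sum, which is the condition under which the inexact flag is raised. The function name is hypothetical, and double precision with round-to-nearest is assumed.

```python
import math
from fractions import Fraction


def add_is_inexact(a: float, b: float) -> bool:
    """Return True if a + b, computed in double precision, had to be rounded."""
    if not (math.isfinite(a) and math.isfinite(b)):
        raise ValueError("only finite operands are handled in this sketch")
    rounded = a + b
    if math.isinf(rounded):
        # Overflow of finite operands also signals inexact.
        return True
    # Fraction(float) converts exactly, so this comparison has no rounding.
    return Fraction(a) + Fraction(b) != Fraction(rounded)


print(add_is_inexact(0.5, 0.25))        # both fit in the result's mantissa
print(add_is_inexact(1.0, 2.0 ** -60))  # small operand is lost to rounding
```

Comparing this reference against the simulated status bit for the same operand pairs should show whether the mismatch comes from the simulation or from how the hardware flags are being read.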

Xeon E5 MSR_PP1_ENERGY_STATUS read/write Error


I am using the above utility to determine the power consumption of a Xeon E5 chip. When I execute the above code on my machine, the output is:


Found Haswell CPU
Checking core #0
Power units = 0.125W
Energy units = 0.00006104J
Time units = 0.00097656s
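
For context, the three unit values printed above are normally decoded from bit fields of the MSR_RAPL_POWER_UNIT register, each field being an exponent n that yields a unit of 1/2^n. The sketch below shows that decoding; the raw value is a hypothetical one chosen to reproduce the numbers above, not a value read from this machine, and the function name is an assumption.

```python
def decode_rapl_units(raw: int):
    """Decode power, energy and time units from MSR_RAPL_POWER_UNIT."""
    power_w = 1.0 / (1 << (raw & 0xF))             # bits 3:0
    energy_j = 1.0 / (1 << ((raw >> 8) & 0x1F))    # bits 12:8
    time_s = 1.0 / (1 << ((raw >> 16) & 0xF))      # bits 19:16
    return power_w, energy_j, time_s


raw = 0x000A0E03  # hypothetical register contents matching the output above
power_w, energy_j, time_s = decode_rapl_units(raw)
print(f"Power units = {power_w:.3f}W")
print(f"Energy units = {energy_j:.8f}J")
print(f"Time units = {time_s:.8f}s")
```

An energy status counter is then converted to joules by multiplying its raw count by the energy unit.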

-g option slows down execution

Hi all.

Has anyone experienced problems with execution time when using the icc -g option in order to analyze the source code's behavior inside VTune?

I also have some trouble generating the vectorization report when compiling with icc -g. My *.optrpt file is generated empty if I use icc -g ...


Thanks in advance,

Fred. L. Cabral

