Intel Core i7 processor uncore event availability in PTU

Dear Performance Tuning Experts,

Is it possible to collect sampling data for uncore performance events using PTU, for example UNC_QMC_NORMAL_READS.CH0?

If the answer is yes, how would one go about it in the PTU framework?

As far as I understand, PTU can count some uncore events using the OFFCORE_RESPONSE_0.REQUEST.RESPONSE counter with an appropriate REQUEST.RESPONSE encoding, but a lot of uncore counters do not fall into that category.

Thanks very much,
--Rasa

Hi rasa,
Unfortunately, the latest published version does not support any of the UNC_* events.
Try to state your goal in general terms. There is probably a different way to achieve it.
K

Quoting - Konstantin Lupach (Intel)

Hi Konstantin,

Thank you for your prompt reply! I would like to measure the following quantities:

1) Read bandwidth on each memory channel (and aggregate)
2) Write bandwidth on each memory channel (and aggregate)
3) Read bandwidth on IOH QPI
4) Write bandwidth on IOH QPI
5) IOH QPI utilization
6) Intersocket QPI traffic from socket 0 to socket 1
7) Intersocket QPI traffic from socket 1 to socket 0
8) Intersocket QPI utilization

Thank you very much in advance for your advice!

Regards,
--Rasa

Hi again.
Currently, for external users we can only suggest using perfmon2.
These capabilities will most likely come to PTU too, but a public release that includes them will not happen before the end of the year.
K

Quoting - Konstantin Lupach (Intel)

I'm looking to perform the same measurements, as I'm working on a 4-port ixgbe 10GigE solution and am having trouble reaching line speed on all 4 ports simultaneously. My initial calculations show I should have plenty of bandwidth, but I believe certain links between the bridges and/or CPUs are over-utilized. Changing the QPI frequency appears to make the difference between 33 Gb/s and the actual line speed of 40 Gb/s, which this application requires. Either way, we should be able to reach 40 Gb/s at the lower frequency.

If there is an "internal" tool available, what might it be called, so that I might request it via QUAD under my company's NDA and Intel support arrangements?

I have sent the question to the tool owner and to the people who can clarify whether this tool can be provided under NDA.

Quoting - Konstantin Lupach (Intel)

Luke,
Please submit a support case at premier.intel.com under VTune and request that it be assigned to Dave Levinthal.
d

Rasa,
We have now made available a process to measure memory bandwidth using the uncore events. They can only be counted, not sampled. Only the 2 events needed for precise memory bandwidth are supported. Please see this article:
http://software.intel.com/en-us/articles/how-do-i-measure-memory-bandwidth-on-an-intel-core-i7-or-xeon-5500-series-platform-using-intel-vtune-performance-analyzer/
Thanks,
Shannon

Quoting - Shannon Cepeda (Intel)

If you want a few more, just edit the pmn.xml file and add the following. This only works in counting mode (see Shannon's note). If you have problems, you are on your own. DO NOT ask for help :-)
d

0x100A0
0x2C
0x7
0x50
0x40000
0
35,36,37,38,39,40,41,42
UNC_IMC_NORMAL_READS.ANY
IMC normal read requests
pmn.chm
100000
0
1
A0

0x100A3
0x2F
0x7
0x50
0x40000
0
35,36,37,38,39,40,41,42
UNC_IMC_WRITES.FULL.ANY
IMC full cache line writes
pmn.chm
100000
0
1
A0

0x100B3
0x20
0x1
0x50
0x20000
0
35,36,37,38,39,40,41,42
UNC_QHL_REQUESTS.IOH_READS
Quickpath Home Logic IOH read requests
pmn.chm
100000
0
1
A0

0x100B3
0x20
0x2
0x50
0x20000
0
35,36,37,38,39,40,41,42
UNC_QHL_REQUESTS.IOH_WRITES
Quickpath Home Logic IOH write requests
pmn.chm
100000
0
1
A0

0x100B3
0x20
0x10
0x50
0x20000
0
35,36,37,38,39,40,41,42
UNC_QHL_REQUESTS.LOCAL_READS
Quickpath Home Logic local read requests
pmn.chm
100000
0
1
A0

0x100B3
0x20
0x20
0x50
0x20000
0
35,36,37,38,39,40,41,42
UNC_QHL_REQUESTS.LOCAL_WRITES
Quickpath Home Logic local write requests
pmn.chm
100000
0
1
A0

0x100B3
0x20
0x4
0x50
0x20000
0
35,36,37,38,39,40,41,42
UNC_QHL_REQUESTS.REMOTE_READS
Quickpath Home Logic remote read requests
pmn.chm
100000
0
1
A0

0x100B3
0x20
0x8
0x50
0x20000
0
35,36,37,38,39,40,41,42
UNC_QHL_REQUESTS.REMOTE_WRITES
Quickpath Home Logic remote write requests
pmn.chm
100000
0
1
A0

Many thanks for sharing this information!
Is it reasonable to assume that a UNC_QHL_REQUESTS.IOH_READS / WRITES event corresponds to a 64-byte transfer?

Thanks a lot in advance,
--Rasa

Quoting - rasa

I believe so, but I have never tested it. I am unsure of what happens for uncacheable data.
d
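
If the 64-byte assumption above holds (it is untested, as noted), a raw event count converts to bandwidth with one multiplication and one division. The sketch below is only an illustration: the counts and the 2-second interval are made-up values, and the function name is hypothetical.

```python
# Assumption from the thread (not verified): one UNC_QHL_REQUESTS event
# corresponds to one 64-byte cache-line transfer.
CACHE_LINE_BYTES = 64


def bandwidth_bytes_per_sec(event_count, elapsed_seconds,
                            bytes_per_event=CACHE_LINE_BYTES):
    """Convert a raw event count into bytes per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return event_count * bytes_per_event / elapsed_seconds


# Made-up counts for a hypothetical 2-second counting run.
example_counts = {
    "UNC_QHL_REQUESTS.IOH_READS": 1_500_000,
    "UNC_QHL_REQUESTS.IOH_WRITES": 500_000,
}

for name, count in example_counts.items():
    rate = bandwidth_bytes_per_sec(count, elapsed_seconds=2.0)
    print(f"{name}: {rate / 1e6:.1f} MB/s")
```

Uncacheable accesses may not move a full cache line, so treat the result as an estimate.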

Hi,

I'm trying to perform these same measurements, but I'm using a Core i7-820QM (aka Clarksfield) processor. I've installed an evaluation version of VTune, and I've followed the posted instructions (at http://software.intel.com/en-us/articles/how-do-i-measure-memory-bandwidth-on-an-intel-core-i7-or-xeon-5500-series-platform-using-intel-vtune-performance-analyzer/) to install and configure SEP to count uncore events, but when I run the "uncore" script, I receive the following error:

"Illegal Event: UNC_IMC_WRITES.FULL.ANY"

I tried modifying the script to specify other events to count, but each returns the same "Illegal Event" error for whatever event type I specify. Ideally, I'd like to monitor the QPI IOH counters described in this thread.

What does this error mean? Is there anything I need to configure differently to run this script on a Clarksfield?

Thanks for any help,
Eric G.

Earlier in this thread there is a piece of XML with multiple uncore events, including UNC_IMC_WRITES.FULL.ANY. Make sure that your pmn.xml file(s) contain all the events you need.

Quoting - Konstantin Lupach (Intel)

Konstantin,

Thank you for the reply. Right now I am unable to monitor ANY events, as all are returning the "Illegal Event" error, though these events are present in the pmn.xml file. For example, in the simplest case, if I use the unmodified version of the pmn file and script included in the "win_measurebw.zip" file, I see the "Illegal Event" error for the UNC_IMC_WRITES.FULL.ANY event type. I also tried adding the other events using the XML included in this discussion thread, but none work (all return "Illegal Event").

Any other ideas?

Thanks,
Eric

I would stop trying to make VTune work and instead try the approach that worked for the people in the messages above.
Would that be acceptable?

In any case, uncore events are collected in counting mode only. You will get a simple text file, and VTune is not needed to analyze it.

Alternatively, ask this question in the VTune forum. There are dedicated people there who know the tool well.

Quoting - Konstantin Lupach (Intel)

I am not trying to use VTune. I am following the instructions (exactly as described) in this article:

http://software.intel.com/en-us/articles/how-do-i-measure-memory-bandwid...

Perhaps my original question was confusing; I am not trying to use VTune directly. I was under the impression that VTune was required to use PTU to count uncore events, but I guess this is not the case. The article mentions VTune several times, so it's a little confusing.

Stated another way:

When I try to follow the procedure outlined in this discussion thread, I get "Illegal Event" errors. Does anyone have suggestions as to what might cause these errors? I am using a Mobile Core i7 processor.

Thanks,
Eric

I see. It seems you have several versions of pmn.xml and they conflict with each other.
They can be in the VTune directory, the PTU directory, and the package for uncore measurements.
I have never seen or used the package itself, but I know the software it uses well.
The package has some scripts that manipulate PATH/LD_LIBRARY_PATH.

To fix your problem, either:
make sure that the directory containing the right pmn.xml (with the needed events) comes before any other directory that also contains one, in both environment variables,
or
replace all the copies of pmn.xml you have (do not forget to back them up first) with the version that contains the events you need. The only disadvantage of this simpler variant is that you have to remember that UNC_* events cannot really be used in VTune and PTU. Just do not select them when configuring any EBS collection, and that is it.

Some explanation of the suggested fixes:
If you are working on Windows, be aware that VTune for Windows adds its directories to the system PATH variable, so it might create the conflict. If you have both PTU and the uncore measurement package, you can rename the VTune directory or uninstall VTune to reduce the chance of conflicts.
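
One way to check for the conflict described above is to list every pmn.xml reachable through PATH and LD_LIBRARY_PATH, in search order, and see which copies lack the events you need. This is a rough sketch, not part of any Intel package: the function names are hypothetical, it assumes the first copy in search order is the one that gets used, and it looks for event names with a plain substring match rather than parsing the XML.

```python
import os
from pathlib import Path


def find_event_files(path_value, filename="pmn.xml"):
    """Return every copy of `filename` in a PATH-style string, in search order."""
    hits = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue
        candidate = Path(entry) / filename
        if candidate.is_file():
            hits.append(candidate)
    return hits


def missing_events(event_file, wanted):
    """Return the wanted event names that do not appear in the file's text."""
    text = Path(event_file).read_text(errors="replace")
    return [name for name in wanted if name not in text]


if __name__ == "__main__":
    wanted = ["UNC_IMC_WRITES.FULL.ANY", "UNC_IMC_NORMAL_READS.ANY"]
    for var in ("PATH", "LD_LIBRARY_PATH"):
        for found in find_event_files(os.environ.get(var, "")):
            print(var, found, "missing:", missing_events(found, wanted))
```

If the first file printed for a variable reports missing events, that copy is the likely source of the "Illegal Event" error.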

Quoting - Konstantin Lupach (Intel)

Konstantin,

Thank you again for your assistance. I tried your suggestions, but unfortunately I am still getting the "Illegal Event" error when trying to count uncore events using the instructions in this thread. I've tried running the script on both XP (32-bit) and Windows 7 (64-bit), and both return the same error. I uninstalled VTune and removed all occurrences of the "pmn.xml" file, just to eliminate any ambiguity.

If you have any other suggestions, I would appreciate them. This functionality would be very helpful for an issue I am trying to analyze.

Thanks,
Eric

Sorry for the delay in answering; I have been extremely busy with PTU 4.0.

Eric, unfortunately I cannot find the uncore packages using the instructions from the article. If you give me direct links, I will repeat it all myself. I do believe the problem is connected with the file you are currently using not having the needed events. Try to find the events you use in the pmn.xml you think you are using.
You may also try asking the people above who succeeded in using this package for help.

Konstantin,
I sent you a private message with a link to the Windows "uncore" script and pmn file. If you have time to see if you can get this working on your end, I'd appreciate it.
Has anyone successfully used this script on Windows?
Thanks,
Eric

I tried this on Win7 EM64T.
I used all the default script values, including the PTU directory.
That was not on purpose; I simply unzipped PTU to c:.
I installed the PTU driver as administrator.

win_measurebw.zip was unzipped to c: and to a DFS share, and it worked from both.
The only case in which I was able to reproduce the issue was when I ran uncore.bat with a UNC path as the current directory (\\host\share).

You can simplify this script by removing all the unnecessary prompts.
Just know where your file with the uncore events is, and be sure to add its directory to the beginning of the PATH variable (on Windows) before running the sep command that uses these events.
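
The PATH ordering described above can be sketched as follows. This is an illustrative helper under stated assumptions, not something shipped with the package: the function name and the example directory are hypothetical, and the actual sep command line is left out because it comes from the uncore script.

```python
import os


def env_with_dir_first(event_dir, base_env=None, var="PATH"):
    """Return a copy of the environment with `event_dir` first in `var`.

    Any later occurrence of the same directory is dropped so that it
    appears exactly once, at the front of the search order.
    """
    env = dict(os.environ if base_env is None else base_env)
    current = env.get(var, "")
    parts = [p for p in current.split(os.pathsep) if p and p != event_dir]
    env[var] = os.pathsep.join([event_dir] + parts)
    return env


if __name__ == "__main__":
    # Hypothetical directory holding the pmn.xml with the UNC_* events.
    env = env_with_dir_first(r"C:\win_measurebw")
    print(env["PATH"].split(os.pathsep)[0])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the collection, which leaves the system-wide PATH untouched.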

Another option is to add all the needed uncore events to the event file in the original PTU location (back up the original file before modifying it, just in case). Just make sure you do not misuse them: only the usage illustrated by this script will work for UNC_* events.

Hope this helps.

P.S. thank you for giving the direct link. It did help me.

Hi Shannon,

Are there any plans to support uncore counters in future versions of VTune Performance Analyzer or Intel Performance Tuning Utility?

Would it be possible to create a similar workaround to measure the bandwidth on the QPI links?

Thanks a lot!

Hi, it is not a good idea to write to Shannon on this forum. As far as I know, she supports VTune, so you would do better to ask this question on the VTune forum, where the article was published. VTune is a supported product, so there is a better chance of getting a timely answer there.

I have an answer to your question from David Levinthal. The definitions of the events he refers to can be found earlier in this thread.

They can measure the QPI traffic with the events I gave them.

Reads and writes per source:
UNC_QHL_REQUESTS.IOH_READS
UNC_QHL_REQUESTS.IOH_WRITES
UNC_QHL_REQUESTS.REMOTE_READS (includes RFO and NT store)
UNC_QHL_REQUESTS.REMOTE_WRITES (includes NT stores)
UNC_QHL_REQUESTS.LOCAL_READS (includes RFO and NT store)
UNC_QHL_REQUESTS.LOCAL_WRITES (no NT stores)

Precise totals can be measured in the IMC, but cannot be broken down per source:
UNC_IMC_NORMAL_READS.ANY (or by channel, includes RFO)
UNC_IMC_WRITES.FULL.ANY (or by channel, includes NT stores)

The first 4 measure QPI bandwidth.
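
As a rough illustration of that grouping, the sketch below sums the first four events into a QPI figure and the two LOCAL events into a local figure. It assumes 64 bytes per request, which was described earlier in the thread as believed but untested; the function name and the sample counts are hypothetical.

```python
# Per the note above, the first four events measure QPI bandwidth.
QPI_EVENTS = (
    "UNC_QHL_REQUESTS.IOH_READS",
    "UNC_QHL_REQUESTS.IOH_WRITES",
    "UNC_QHL_REQUESTS.REMOTE_READS",
    "UNC_QHL_REQUESTS.REMOTE_WRITES",
)
LOCAL_EVENTS = (
    "UNC_QHL_REQUESTS.LOCAL_READS",
    "UNC_QHL_REQUESTS.LOCAL_WRITES",
)
BYTES_PER_REQUEST = 64  # assumption: one request moves one cache line


def traffic_summary(counts, elapsed_seconds,
                    bytes_per_request=BYTES_PER_REQUEST):
    """Group per-source request counts into QPI and local byte rates."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")

    def rate(names):
        total = sum(counts.get(name, 0) for name in names)
        return total * bytes_per_request / elapsed_seconds

    return {
        "qpi_bytes_per_sec": rate(QPI_EVENTS),
        "local_bytes_per_sec": rate(LOCAL_EVENTS),
    }


if __name__ == "__main__":
    # Made-up counts from a hypothetical 1-second counting run.
    sample = {
        "UNC_QHL_REQUESTS.IOH_READS": 100,
        "UNC_QHL_REQUESTS.REMOTE_WRITES": 100,
        "UNC_QHL_REQUESTS.LOCAL_READS": 50,
    }
    print(traffic_summary(sample, elapsed_seconds=1.0))
```

Events missing from the input are treated as zero, so a partial counting run still produces a summary.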

Hope this helps.

Regarding future versions.

As far as I know, VTune will continue to support UNC_* events only in the form described in Shannon's article.

The next version of PTU will not support them at all; however, it will still be possible to use the workaround from Shannon's article and collect UNC_* events using the current public version of PTU (3.2U1) or VTune.

If somebody feels that this ability is vital for their job and/or their organization's business, I would suggest writing about it on the VTune forum. The VTune forum is supported by technical support engineers who have some influence on the feature set of VTune and its successors. Most likely, VTune's successors will support UNC_* events. When and how exactly this support is introduced depends on demand: the more customers ask for it, the sooner it happens.

Rasa,

You might try the measurement tool presented in this article: Intel Performance Counter Monitor - A better way to measure CPU utilization. It should be able to estimate some of the quantities you have mentioned. It should also work for the Intel Xeon 7500 processor series (codenamed Nehalem-EX), which has a different uncore from the Core i7 and Xeon 5600/5500 processor series.

Roman

Hi,

I've tried the instructions listed in this article for a Westmere processor using the uncore.bat script. When I try to run the script, it tells me that UNC_IMC_WRITES.FULL.ANY and UNC_IMC_NORMAL_READS.ANY are invalid events. How do I get this to work with Westmere?

Thanks!
Don
