Measuring the UOPS dispatched down LSD (Loop Stream Detector)

I am trying to measure the uops delivered from the Loop Stream Detector (LSD) on my Sandy Bridge processor, but I don't see any documentation of a PMC event for this. Is there a method I can use to determine the number of uops delivered to the UopQ from the LSD? And if the LSD is inside the UopQ, then it isn't really delivering uops to the UopQ, right?

PMC 79 lets me measure the uops delivered from the uop cache with umask=0x08, from the legacy decode unit (ILD) with umask=0x04, and from microcode (MS) with umask=0x30. But if you can't determine how many come from the LSD, you cannot account for all uops delivered to the UopQ. I ask because in simple copy/read/write tests I'm seeing a large number of retired uops that these sources don't account for, and I want to identify the percentage of uops delivered to the UopQ from each source.

Thanks,
perfwise
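For readers who want to reproduce the breakdown described above, here is a minimal sketch in Python. It assumes the counters are collected with Linux perf, whose raw event strings on Intel place the umask in bits 8-15 and the event code in bits 0-7. The helper names (`raw_event`, `source_shares`) and the source labels are made up for illustration; only the event 0x79 umask values come from the post.

```python
def raw_event(event, umask):
    """Format an event code and umask as a perf raw event string (rUUEE)."""
    return "r{:04x}".format((umask << 8) | event)


# Event 0x79 umasks as quoted in the post; the dictionary keys are
# illustrative labels, not official event names.
IDQ_SOURCES = {
    "uop_cache": raw_event(0x79, 0x08),
    "legacy_decode": raw_event(0x79, 0x04),
    "microcode": raw_event(0x79, 0x30),
}


def source_shares(counts):
    """Return each source's percentage of the summed counts.

    `counts` maps a source label to its measured uop count. The shares
    only cover the sources you pass in, so uops from an unmeasured
    source (such as the LSD) are not represented.
    """
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}
```

The strings in `IDQ_SOURCES` could be passed to something like `perf stat -e`, and the resulting counts fed to `source_shares`. Whether those counts add up to the retired uop total is exactly the open question in this thread.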


Further, if I measure the uops delivered from the IDQ using B.3.7.2 here:
http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html
I get a different number from that measured by PMC 9C or by PMC 0E. PMC 9C measures the number of uops delivered from the UopQ to the Renamer/Resource allocation table (RAT). PMC 0E measures the number of uops issued from the RAT to the scheduler, correct? I'm trying to determine the number of uops provided by the LSD because in some cases the number issued/retired varies significantly from that provided by the IDQ.

Thanks,
perfwise

According to what I heard, there was no plan to validate Loop Stream Detector events for Sandy Bridge, so this leaves you on your own as to what you can learn from them.

You may be able to get the number indirectly by taking the uops issued to the scheduler, PMC 0x0E, and subtracting the number provided to the UopQ from the IDQ, PMC 0x9C. Does this make sense?
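The subtraction suggested here can be sketched as follows. This is only an illustration of the arithmetic, assuming the two counters mean what this thread takes them to mean; the function and parameter names are hypothetical, and the result is an unvalidated residual rather than a true LSD count.

```python
def estimate_lsd_uops(uops_issued, uops_from_idq):
    """Residual of issued uops (PMC 0x0E) over IDQ-delivered uops (PMC 0x9C).

    The result is left signed on purpose: a negative value means the
    IDQ-side counter exceeded the issued count, which points to
    overcounting rather than to LSD activity.
    """
    return uops_issued - uops_from_idq


def lsd_fraction(uops_issued, uops_from_idq):
    """Residual as a fraction of issued uops, or 0.0 if nothing was issued."""
    if uops_issued == 0:
        return 0.0
    return estimate_lsd_uops(uops_issued, uops_from_idq) / uops_issued
```

With made-up counts of 1000 issued and 600 from the IDQ, the residual is 400 uops, or a fraction of 0.4. As with any difference of two counters, the result inherits the error of both.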

It tells you nothing about the distribution of uops provided per cycle by the LSD. I am also measuring the LSD uops, but the counts sometimes do not make a great deal of sense, and there is definitely overcounting taking place.

In a separate thread I also find, as does Pat @ Intel, that the number of uops provided from the UopQ to the RAT is greatly overstated, especially in cases where the L1D miss rate is high. Any ideas why?

Perfwise
