  • nicolas.wang, March 10, 2009 9:25 PM PDT
    Different DTLB in different views

    What is the difference between the DTLB values (or those of any other event name) shown in the different views (process, thread, module, hotspots)?
    Please point me to the relevant document if possible. I've checked the VTune manual and this forum but didn't find an explanation.
    Thanks,
    Yu Wang

    srimks, March 12, 2009 10:56 AM PDT
    Re: Different DTLB in different views

    I doubt your DTLB question is answered by the VTune book (VTune Performance Analyzer Essentials, James Reinders) or by any VTune manual or article to date. Intel has discussed VTune usability for Itanium in detail, but I doubt Intel has published comparable documents covering VTune on IA-32, IA-64, or Intel 64 Xeon processors.

    Please refer to "Introduction to Microarchitectural Software Optimization for Itanium Processors" http://cache-www.intel.com/cd/00/00/21/93/219348_software_optimization.pdf to understand VTune on Itanium with respect to DTLB.

    I hope it answers your question to some extent.

    ~BR

    nicolas.wang, March 12, 2009 7:23 PM PDT
    Re: Different DTLB in different views

    Thanks, Srimks. Your reply reminded me to search my old downloads from Intel. In the IA32/IA64 manual, I did find some introduction to DTLB. I'll double-check against the Itanium version.

    By the way, yesterday I read two of your posts. One is the CPI thread in this forum; I'm wondering whether you have found the answer and a solution. I am also seeing a high CPI (1.6~1.8) while observing very low DTLB/ITLB counts, so I'm wondering what that means. Your post gives some interesting directions for reading. The other is the question about the article you referenced in the CPI post. I also raised a question about that paper and hope Intel's Malladi can take a second to reply :)

    In case someone else at Intel also has the answer, I'd like to paste it here too.
    "
    I'm curious what the difference is between BUS_TRANS_ANY.ALL_AGENTS and BUS_DRDY_CLOCKS.ALL_AGENTS. I have also seen an Intel paper that uses the latter for the FSB calculation. Could we say that with the latter event, the new ratio represents the Data Bus Utilization Ratio rather than the Bus Utilization Ratio? Does the difference come from "instruction communication"?
    "



    srimks, March 12, 2009 11:51 PM PDT
    Re: Different DTLB in different views

    Thanks Nicolas.

    I am not an Intel employee; I simply use VTune to profile my application as needed and then explore on my own. I did refer to the book "VTune Performance Analyzer Essentials" by James Reinders and to David Levinthal's articles ( http://assets.devx.com/goparallel/18027.pdf ) on VTune. Both help to some extent in understanding VTune usability, but they don't cover processor-specific EBS events and analysis. Maybe in the future Intel will publish such documents for its VTune users.

    Here is what I understand the difference to be:

    BUS_DRDY_CLOCKS.ALL_AGENTS - This event counts the number of bus cycles during which the DRDY ( Data Ready ) signal is asserted on the bus. The DRDY signal is asserted when data is sent on the bus. With the 'ALL_AGENTS' mask this event counts the number of bus cycles during which any bus agent sends data on the bus. This includes all data reads and writes on the bus. It counts bus transactions initiated by any agent on the bus. In systems where each processor is attached to a different bus, each core counts only events it sees on its own bus.

    BUS_TRANS_ANY.ALL_AGENTS - This event counts all bus transactions, including memory transactions, I/O transactions (non-memory-mapped), deferred transaction completions, and other less frequent transactions (such as interrupts). It counts activity initiated by any agent on the bus. In systems where each processor is attached to a different bus, the count reflects only the activity for the bus on which the processor resides.
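
    As a rough illustration of how the two events could feed the two ratios asked about, here is a minimal Python sketch. It assumes both counts are normalized by the number of bus clock cycles in the sampling interval (CPU_CLK_UNHALTED.BUS is assumed here as that count) and that each transaction occupies the bus for about 2 bus clocks. Both assumptions, the function name, and the sample counts are illustrative and not taken from this thread; please check them against Intel's tuning documentation for your processor.

```python
def bus_ratios(bus_trans_any, bus_drdy_clocks, bus_clocks, clocks_per_transaction=2):
    """Return (bus utilization %, data bus utilization %) from raw event counts.

    bus_trans_any   -- BUS_TRANS_ANY.ALL_AGENTS count (transactions)
    bus_drdy_clocks -- BUS_DRDY_CLOCKS.ALL_AGENTS count (bus cycles with DRDY asserted)
    bus_clocks      -- bus clock cycles in the interval (assumed: CPU_CLK_UNHALTED.BUS)
    clocks_per_transaction -- assumed bus clocks occupied per transaction
    """
    if bus_clocks <= 0:
        raise ValueError("bus_clocks must be positive")
    # Transactions are converted to bus cycles before normalizing.
    bus_util = bus_trans_any * clocks_per_transaction / bus_clocks * 100.0
    # DRDY clocks are already a cycle count, so they are normalized directly.
    data_bus_util = bus_drdy_clocks / bus_clocks * 100.0
    return bus_util, data_bus_util


# Made-up sample counts, for illustration only.
bus_util, data_bus_util = bus_ratios(1_000_000, 1_500_000, 10_000_000)
print(f"Bus utilization:      {bus_util:.1f}%")
print(f"Data bus utilization: {data_bus_util:.1f}%")
```

    Under these assumptions, the first ratio reflects how busy the bus is with transactions of any kind, while the second reflects only the cycles in which data is actually being transferred.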


    ~BR




