  • jimdempseyatthecove, June 24, 2009 10:20 AM PDT
    determination of PREFETCH support

    In the Intel Architecture Software Developer's Manual, the description of PREFETCHn states that some processors may simply ignore this instruction (treat it as a NOP). The CPUID tables do not seem to report whether the CPU honors PREFETCH or ignores it. Is there a programmatic way of determining this (other than running a benchmark test at application initialization)?

    The reason I ask is that in a test program on a Q6600, PREFETCHn (all variations of n) slows down the program, whereas replacing PREFETCHn FutureAddress with

          trash = *FutureAddress; // copy aligned __int64
          foo = expressionWithDoubleUsingCurrentAddress;

    gives a marginal speedup.

    Note that the integer load will eventually stall waiting for the read, whereas PREFETCHn will (or at least should) not introduce a stall waiting for memory.
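
A minimal sketch of the two variants being compared, for readers who want to reproduce the experiment. It assumes GCC or Clang, whose `__builtin_prefetch` typically compiles to a PREFETCHn instruction on x86. The function names, the `LOOKAHEAD` distance, and the summation workload are hypothetical stand-ins, not the original test program. The pre-touch uses a one-byte load rather than the aligned __int64 load above, since touching any byte is enough to request the cache line.

```c
#include <stddef.h>

/* Hypothetical look-ahead distance in elements; the best value is
   workload- and CPU-specific and has to be found by measurement. */
#define LOOKAHEAD 64

/* Variant A: software prefetch hint. The hint never faults and the
   hardware is free to drop it. */
double sum_with_prefetch(const double *data, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (i + LOOKAHEAD < n)
            __builtin_prefetch(&data[i + LOOKAHEAD], 0 /* read */, 3 /* high temporal locality */);
        sum += data[i];   /* work on the current address */
    }
    return sum;
}

/* Variant B: explicit pre-touch. A real load is issued to the future
   address; unlike a hint it must complete, so it can eventually stall. */
double sum_with_pretouch(const double *data, size_t n)
{
    double sum = 0.0;
    volatile char trash = 0;   /* volatile so the load is not optimized away */
    for (size_t i = 0; i < n; ++i) {
        if (i + LOOKAHEAD < n)
            trash = *(const volatile char *)&data[i + LOOKAHEAD];
        sum += data[i];   /* work on the current address */
    }
    (void)trash;
    return sum;
}
```

Both functions return the same sum; only their memory access pattern differs. Timing them over an array much larger than the last-level cache is one way to see which variant a given CPU favors.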

    Jim Dempsey




    Blog: The Parallel Void
    www.quickthreadprogramming.com

    Shih Kuo (Intel), June 29, 2009 8:26 AM PDT
    Re: determination of PREFETCH support


    Hi Jim

    A prefetch hint that does not result in a fetch (i.e., one that is dropped) is not the same as a NOP, per se. For example, out-of-order (OOO) hardware has to honor a NOP as an instruction, not a hint, and retire it, but it has more latitude in how it treats a hint.

    Setting aside my nitpicking about the ISA definition, the thrust of your question is really whether the implementation-specific behavior of a prefetch hint should have an architecturally defined behavior that is reported via the CPUID instruction.

    Whether a prefetch hint is issued sufficiently ahead of the subsequent reference depends heavily on workload characteristics and on specific implementation techniques. The same is true of your example of pre-touching the memory location of a subsequent reference. Whether to rely on software prefetch hints or to use explicit pre-touches to trigger the hardware prefetchers is not a question with easy answers or universally applicable solutions.

    My personal take on asking CPUID to provide additional definitions about the implementation-specific nature of prefetch hints tends to be negative.

    For example, some CPU generations always dropped a prefetch hint if it requested an address beyond a page boundary. If this behavior had been codified in some hypothetical CPUID flag, it would not have been feasible to later allow a prefetch hint to be honored for fetching data across a page boundary. The latter was implemented in later generations and improves the software's ability to handle TLB misses.
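
To make the page-boundary case concrete, here is a small sketch of a check that tells whether a prefetch target lies on the same page as the address currently being worked on. The 4 KiB page size is an assumption (large pages would change it), and `same_page` is a hypothetical helper name, not part of any Intel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Systems using large pages need a different value. */
#define ASSUMED_PAGE_SIZE 4096u

/* Returns true when both addresses fall within the same page, i.e. a
   prefetch of `future` issued while working at `current` does not
   cross a page boundary. */
bool same_page(const void *current, const void *future)
{
    uintptr_t current_page = (uintptr_t)current / ASSUMED_PAGE_SIZE;
    uintptr_t future_page  = (uintptr_t)future  / ASSUMED_PAGE_SIZE;
    return current_page == future_page;
}
```

On an implementation that drops cross-page hints, prefetches for which this check is false would have no effect, which is one reason a fixed look-ahead distance can behave differently from one CPU generation to the next.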

    My experience has been that issuing prefetch hints alters the load uop scheduling. Using software prefetch extensively implies taking suitable precautions to account for the fact that different CPU implementations (CPUID family/model) can exhibit different performance characteristics. In that sense, asking for new CPUID flags that report implementation-specific hardware behavior is not really different from using CPUID family and model combinations to dispatch code paths tuned to a specific implementation.
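
The family/model dispatch mentioned above can be sketched as follows. The decoding of the CPUID leaf 1 EAX value follows the documented display-family and display-model rules. The `cpu_signature` type, the `code_path` enum, and the choice to avoid software prefetch on family 6, model 15 (the Q6600's family/model) are hypothetical and for illustration only; a real table would be built from measurements.

```c
#include <stdint.h>

typedef struct {
    unsigned family;
    unsigned model;
    unsigned stepping;
} cpu_signature;

/* Decode the version information returned in EAX by CPUID leaf 1. */
cpu_signature decode_cpuid_signature(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned base_model  = (eax >> 4) & 0xF;
    unsigned ext_family  = (eax >> 20) & 0xFF;
    unsigned ext_model   = (eax >> 16) & 0xF;

    cpu_signature s;
    s.stepping = eax & 0xF;
    /* The extended family is added only when the base family is 0xF. */
    s.family = (base_family == 0xF) ? base_family + ext_family : base_family;
    /* The extended model is prepended only for base family 6 or 0xF. */
    s.model = (base_family == 0x6 || base_family == 0xF)
                  ? (ext_model << 4) | base_model
                  : base_model;
    return s;
}

typedef enum { PATH_GENERIC, PATH_NO_SW_PREFETCH } code_path;

/* Hypothetical policy: pick the variant without software prefetch on
   family 6, model 15, where this thread reports a slowdown in one test. */
code_path choose_code_path(cpu_signature s)
{
    if (s.family == 6 && s.model == 15)
        return PATH_NO_SW_PREFETCH;
    return PATH_GENERIC;
}
```

With GCC or Clang on x86, the EAX value can be obtained via `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>` and passed to `decode_cpuid_signature` once at startup.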

    This may not be what you want to hear, but sometimes the fun of programming is experimentation :)

    Shihjong


