Streaming SIMD Extensions anyone?

I have just read about Streaming SIMD Extensions (SSE) on Intel CPUs.

Are these usable within the CVF compiler?

Thanks, TimH

Message Edited by on 12-09-2005 10:10 AM

No. The only SSE instructions CVF uses are those for data prefetch.


Steve - Intel Developer Support

What is the Data-Prefetch?


Prefetch is a way for the compiler to tell the processor "In a little while, I'm going to touch this particular memory address, so why don't you start loading it into the cache for me if it's not already there." This reduces the effect of memory latency and can give a 10-20% boost in performance for some applications.

The compiler looks at memory reference patterns, for example, stepping through an array, and automatically inserts prefetch instructions ahead of when the data will be used, improving the chance that the data will be in the cache when needed.

This is a big help on Pentium III, but not so much on Pentium 4 where the processor itself tries to predict memory use patterns and does its own prefetching. We found, for example, that applying a Pentium III prefetch model to Pentium 4 actually made performance worse! CVF 6.6 uses a more appropriate memory system model for Pentium 4, resulting in fewer prefetch instructions issued, and better performance.

We were able to add this to CVF 6.5 because our optimizer already knew how to do prefetching for Alpha, so it was just a matter of tuning the memory model.
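
To make the idea concrete, here is a rough hand-written sketch in C of what prefetching ahead of an array traversal looks like. It is only an illustration, not CVF's actual output: it assumes GCC or Clang, whose `__builtin_prefetch` hint is typically lowered to an SSE prefetch instruction on x86, and both the function name `sum_with_prefetch` and the distance `PREFETCH_DISTANCE` are made up for this example.

```c
#include <stddef.h>

/* Hypothetical tuning constant: how many elements ahead to prefetch.
   A real compiler derives this from its model of the memory system. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that a[i + PREFETCH_DISTANCE] will be read soon. */
double sum_with_prefetch(const double *a, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 3 = high temporal locality. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += a[i];
    }
    return total;
}
```

The prefetch is only a hint and does not change the result. Choosing the distance is the tuning problem described above: too many or badly placed prefetches can cost more than they save.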


Steve - Intel Developer Support

Thanks, that is a useful answer.

1. Are you saying that I don't need to do anything to utilize pre-fetch as long as the exe is run on a P4?
Do I need to, at least, turn on a P4 switch at compile time?

2. Do you anticipate the ability to utilize this feature on Intel Fxx anytime soon?


In CVF, you have to compile with /arch:pn4 for Pentium 4, or with /arch:host if you will run the program on the same computer you are compiling on.

Intel Fortran has /QX and /QAX switches to select the architecture. The /QAX variant generates code that detects the running CPU type at run time and dispatches to the appropriate code path.
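
For readers curious what "detect and dispatch" means in practice, here is a small hand-written sketch in C. It is not what the Intel compiler generates; it assumes GCC or Clang on x86 for the `__builtin_cpu_supports` check and falls back to the generic routine elsewhere. All function names (`sum_generic`, `sum_tuned`, `select_sum`, `dispatched_sum`) are hypothetical.

```c
#include <stddef.h>

typedef double (*sum_fn)(const double *, size_t);

/* Baseline version that runs on any CPU. */
static double sum_generic(const double *a, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Stand-in for a version tuned for a newer CPU; here it just
   accumulates two elements per iteration in separate partial sums. */
static double sum_tuned(const double *a, size_t n)
{
    double t0 = 0.0, t1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        t0 += a[i];
        t1 += a[i + 1];
    }
    if (i < n)
        t0 += a[i];
    return t0 + t1;
}

/* Pick an implementation based on the CPU the program is running on. */
sum_fn select_sum(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return sum_tuned;
#endif
    return sum_generic;
}

/* Entry point the rest of the program calls. */
double dispatched_sum(const double *a, size_t n)
{
    return select_sum()(a, n);
}
```

The caller only ever sees `dispatched_sum`; which routine actually runs is decided on the machine executing the program, not on the machine that compiled it.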


Steve - Intel Developer Support