Passing some memory boundary values to 'prefetcht*' instructions

What happens if some memory boundary values, like 0x00000000 or 0xFFFFFFFF, are passed to 'prefetcht*' instructions on a 32-bit platform?

Here is an example:

...
_mm_prefetch( ( const char * )0xFFFFFFFF, _MM_HINT_T0 );
...

Best regards,
Sergey


Hello Sergey,
I did a quick test on a 64-bit Windows platform and the exceptions don't get delivered back to my program.
I did a loop of:
xor rax,rax
prefetcht0 [rax]
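
For anyone who wants to try the same loop from C, here is a minimal sketch. It assumes an x86 compiler that provides <xmmintrin.h>; the function name time_prefetch and the use of clock() for timing are my own choices for illustration, not part of the original test. The address is only prefetched, never dereferenced.

```c
#include <time.h>
#include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_T0 */

/* Hypothetical helper: issue `iterations` prefetcht0 hints for the same
   address and return the elapsed CPU time in seconds. The address is never
   read or written, only passed to the prefetch hint. */
static double time_prefetch(const char *addr, long iterations)
{
    clock_t start = clock();
    for (long i = 0; i < iterations; ++i)
        _mm_prefetch(addr, _MM_HINT_T0);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_prefetch( ( const char * )0, n ) and then time_prefetch( buffer, n ) with a valid buffer, for some large n, gives two timings to compare. The ratio will depend on the processor, so treat any numbers as indicative only.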

I think the exceptions get squished (not delivered back to the app) but there is a penalty.
The performance seems to be about equal to a trip to memory per reference on my "processor formerly codenamed Westmere"-based laptop.
I will check with others after the holidays.
Pat

Quoting Patrick Fay (Intel)...
I will check with others after the holidays.
Pat

Thank you, Patrick, and Merry Christmas!

PS: It would be good to see more technical details later...

Merry Christmas and happy holidays to you too!

>>xor rax,rax
>>prefetcht0 [rax]
>>
>>I think the exceptions get squished (not delivered back to the app) but there is a penalty.

Hi Patrick, I would simply like to follow up. You mentioned some exceptions. Do you mean
some internal CPU exceptions, or some exceptions from an operating system?

Could you provide a few more technical details, please?

Best regards,
Sergey

Best Reply

Hello Sergey,
Sorry for the delay.

The prefetch* instructions will not raise an exception for an invalid address but there will be a slight performance penalty.
The penalty varies on existing processors, but it derives from the need to walk the page tables in order to determine that a given address is invalid.
Note that the SDM instruction set reference does not list the #PF (page fault, i.e. invalid address) exception for the prefetch* instructions, indicating that these instructions don't raise #PF.
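
As a small illustration of the point above, the following sketch prefetches the lowest and highest addresses a pointer can hold, which on a 32-bit platform are the 0x00000000 and 0xFFFFFFFF values from the original question. It assumes an x86 compiler that provides <xmmintrin.h>; prefetch_boundaries is a hypothetical name chosen for this example. If the behaviour is as described, the function simply returns without any exception reaching the program.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_T0 */

/* Hypothetical helper: issue a prefetcht0 hint for address 0 and for the
   highest representable address. Neither address is dereferenced.
   Returns the number of prefetch hints issued. */
static int prefetch_boundaries(void)
{
    const uintptr_t boundaries[] = { (uintptr_t)0, UINTPTR_MAX };
    int issued = 0;

    for (size_t i = 0; i < sizeof boundaries / sizeof boundaries[0]; ++i) {
        _mm_prefetch((const char *)boundaries[i], _MM_HINT_T0);
        ++issued;
    }
    return issued;
}
```

Replacing the prefetch with an actual load from either address would be expected to fault, which is the difference between a hint and a real memory access.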

I hope this helps,
Pat

Just to add:

This is because an invalid address will not be present in the TLB (a very fast buffer for virtual-to-physical address translation).
So the TLB miss causes a page-table walk, and hence the penalty.
