rep; nop / _asm pause

On Xeon processors or with Hyper-Threading Technology, what is the cycle time for rep; nop or _asm pause? The documentation hints that it can be anywhere from the cost of a nop to some definite value. So what is it?

Thanks in advance

We are forwarding your question to our engineering contacts and will let you know how they respond.

Regards,

Lexi S.

Intel Software Network Support

http://www.intel.com/software

Message Edited by intel.software.network.support on 12-02-2005 01:14 PM

Here is the response we received from our Application Engineers:

A NOP instruction can take between 0.4 and 0.5 clocks, and a PAUSE instruction can consume 38 to 40 clocks. Please refer to the whitepaper on how to measure the latency and throughput of various instructions. The REPE prefix comes in various flavors, and the latency and throughput of each of them vary. Please also see the sample code below, which measures the average clocks per instruction.

#include <stdio.h>

/* Read the 64-bit time-stamp counter into x (32-bit MSVC inline assembly).
   CPUID serializes execution so that earlier instructions finish before RDTSC. */
#define ReadTSC( x ) { __asm cpuid \
                       __asm rdtsc \
                       __asm mov dword ptr x,eax \
                       __asm mov dword ptr x+4,edx }

#define LOOP_COUNT 160000
#define REPEAT_25( x ) x x x x x x x x x x x x x x x x x x x x x x x x x
#define REPEAT_100(x) REPEAT_25(x) REPEAT_25(x) REPEAT_25(x) REPEAT_25(x)
#define REPEAT_1000(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x) REPEAT_100(x)
/* Total instructions timed: LOOP_COUNT iterations of 1000 instructions each. */
#define FACTOR ((double)LOOP_COUNT*1000.0)

#define CLOCKSPERINSTRUCTION(start,end) (((double)(end)-(double)(start))/(FACTOR))

int main(int argc,char **argv)
{
    __int64 start, end;
    int i;

    /* Time LOOP_COUNT x 1000 NOP instructions. */
    ReadTSC(start);
    for (i=0; i<LOOP_COUNT; i++)
    {
        REPEAT_1000(__asm { nop};)
    }
    ReadTSC(end);

    printf("nop: clocks per instruction %4.2f\n",CLOCKSPERINSTRUCTION(start,end));

    /* Time LOOP_COUNT x 1000 PAUSE instructions. */
    ReadTSC(start);
    for (i=0; i<LOOP_COUNT; i++)
    {
        REPEAT_1000(__asm { pause};)
    }
    ReadTSC(end);

    printf("pause: clocks per instruction %4.2f\n",CLOCKSPERINSTRUCTION(start,end));

    return 0;
}
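
The sample above relies on 32-bit MSVC inline assembly, which 64-bit MSVC, GCC, and Clang do not accept. As a rough sketch only, the same measurement could be written with the `__rdtsc` and `_mm_pause` compiler intrinsics. The helper name `ticks_per_pause` and the ten-way unrolling are assumptions made for illustration and are not part of the engineers' response. This version also omits the CPUID serialization, and on processors whose time-stamp counter runs at a constant rate it reports counter ticks rather than core clocks, so the numbers may differ from the figures quoted above.

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>      /* __rdtsc and _mm_pause on MSVC */
#else
#include <x86intrin.h>   /* __rdtsc and _mm_pause on GCC and Clang (x86 only) */
#endif

/* Hypothetical helper (not from the original post): returns the average
   number of time-stamp-counter ticks per PAUSE instruction. */
static double ticks_per_pause(uint32_t iterations)
{
    uint64_t start, end;
    uint32_t i;

    if (iterations == 0)
        return 0.0;

    start = __rdtsc();
    for (i = 0; i < iterations; i++) {
        /* Ten PAUSEs per pass keep the loop overhead small relative to the work. */
        _mm_pause(); _mm_pause(); _mm_pause(); _mm_pause(); _mm_pause();
        _mm_pause(); _mm_pause(); _mm_pause(); _mm_pause(); _mm_pause();
    }
    end = __rdtsc();

    /* Total PAUSE instructions executed is iterations * 10. */
    return (double)(end - start) / ((double)iterations * 10.0);
}
```

Calling `ticks_per_pause(160000)` several times and comparing the results gives a feel for run-to-run variation on a given machine.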

Regards,

Lexi S.

Thank you. The documentation states that the results vary, as does the answer. I'll try this on our various 8x, 16x, and 32x machines and see what the variance is.

Thanks again.