64K aliasing totally bogus?

My app has some areas that show a very high performance impact from 64K aliasing in VTune. After studying the problem, I put together some test applications to try to understand why I would be experiencing 64K aliasing. The code below shows a 64K aliasing performance impact of about 80, which is 40x the recommended top end. However, it touches only two 2K L1 cache regions that are separated by exactly 8MB + 32K, so there should be no possibility of any 64K aliasing events at all. Can you explain why this is throwing so many events?

void Test1()
{
    char *a = (char *)VirtualAlloc( 0, 16 * 1024 * 1024 + 0x8000, MEM_COMMIT, PAGE_READWRITE );

    DWORD dwStart = timeGetTime();

    float *src = (float *)a;
    float *dest = (float *)(a + 8 * 1024 * 1024 + 0x8000);   // src + 8MB + 32K

    __asm
    {
        mov esi, src
        mov edi, dest
        mov edx, 819200             // outer loop count
        add esi, 0x800              // point at the end of each 2K region
        add edi, 0x800
    outer:
        mov ecx, 0x800
        neg ecx                     // ecx runs from -0x800 up to 0
    inner:
        movaps xmm0, xmmword ptr [esi+ecx]   // load 16 bytes from src
        movaps xmmword ptr [edi+ecx], xmm0   // store 16 bytes to dest
        add ecx, 0x10
        jnz inner
        dec edx
        jnz outer
    }
    printf( "Elapsed time = %lu ms\n", (unsigned long)(timeGetTime() - dwStart) );

    VirtualFree( a, 0, MEM_RELEASE );
}

Message Edited by dneufeld on 10-09-2004 06:29 PM