How to accurately measure CPU frequency in user-mode code?


Igor Levicki wrote:

With all the performance counters and MSRs available in Intel CPUs, it still seems impossible to accurately measure CPU frequency in user-mode code:

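#include <windows.h>
#include <mmsystem.h>
#include <stdio.h>

#pragma comment(lib, "winmm.lib")

typedef unsigned __int64 u64;

/* Read the time-stamp counter. Naked function with inline asm: 32-bit MSVC only.
   RDTSC leaves its result in EDX:EAX, which is also where a u64 is returned. */
__declspec(naked) u64 readtsc(void)
{
	__asm	{
		rdtsc
		ret
	}
}

double GetCPUFrequency(void)
{
	/* static: totals accumulate across calls, so the result is a running average */
	static	u64		r0 = 0, f0 = 0;
	u64		r1, r2;
	DWORD		t0, t1;
	int		i;

	timeBeginPeriod(1);	/* request 1 ms timer resolution */
	for (i = 0; i < 3; i++) {
		/* spin for about 20 ms, keeping the last TSC value as the start sample */
		t0 = timeGetTime();
		t1 = t0;
		while (t1 - t0 < 20) {
			t1 = timeGetTime();
			r1 = readtsc();
		}
		/* spin for about 40 ms more, keeping the last TSC value as the end sample */
		t0 = t1;
		while (t1 - t0 < 40) {
			t1 = timeGetTime();
			r2 = readtsc();
		}
		r0 += r2 - r1;		/* TSC ticks elapsed */
		f0 += (t1 - t0);	/* milliseconds elapsed */
	}
	timeEndPeriod(1);

	/* ticks per millisecond divided by 1000 gives MHz */
	return ((double)r0 / (double)f0) / 1000.0;
}

int main(int argc, char* argv[])
{
	for (;;) {
		printf("Frequency : %.2f MHz\r", GetCPUFrequency());
		Sleep(1000);
	}
	return 0;
}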

The above code returns 3400 MHz for a Core i7 2600K overclocked to 4000 MHz. Tools such as CPU-Z and AIDA64 are capable of measuring frequency accurately but they use device drivers to execute ring 0 code which has access to MSRs.

My question is why Intel CPU engineers did not provide user-mode instructions (kind of like RDTSC/RDTSCP) for APERF and MPERF MSRs, but instead left those accessible only from ring 0? What were they thinking?

Finally, is there any way to work around this?

-- Regards, Igor Levicki. If you find my post helpful, please rate it and/or select it as a best answer where applicable. Thank you.
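
For readers unfamiliar with the two MSRs named in the question: IA32_MPERF (0xE7) counts at a fixed reference rate and IA32_APERF (0xE8) counts at the actual core clock, both only while the core is not idle. Below is a minimal sketch of the arithmetic that a ring 0 helper could apply to two samples of each counter. The function name `effective_mhz` and its parameters are hypothetical, and the reference frequency is assumed to be known from elsewhere (for example 3400 MHz for the i7 2600K discussed here).

```c
#include <stdint.h>

/* MSR addresses, readable only with RDMSR in ring 0. */
#define IA32_MPERF 0xE7u   /* ticks at the fixed reference rate */
#define IA32_APERF 0xE8u   /* ticks at the actual core clock    */

/* Average effective core frequency in MHz over a sampling interval.
 * base_mhz    : frequency at which MPERF ticks (assumed known by the caller)
 * aperf_delta : difference between two reads of IA32_APERF
 * mperf_delta : difference between two reads of IA32_MPERF */
double effective_mhz(double base_mhz, uint64_t aperf_delta, uint64_t mperf_delta)
{
    if (mperf_delta == 0)
        return 0.0;   /* core was idle for the whole interval, or no time elapsed */
    return base_mhz * ((double)aperf_delta / (double)mperf_delta);
}
```

With a 3400 MHz reference and an APERF delta that is 40/34 of the MPERF delta, this yields about 4000 MHz, matching a 40x multiplier on a 100 MHz BCLK.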
iliyapolak wrote:

Maybe your measurements are inaccurate because of context switching and thread blocking induced by the operating system. Try raising the priority to above normal and setting the affinity mask to a preferred CPU.

Igor Levicki wrote:

No, that is not the reason.

The measured frequency (3400 MHz) is the default CPU frequency (34x is the default multiplier).

The CPU is overclocked to 4000 MHz by setting its turbo multiplier to 40x in the BIOS.

What I measure is probably affected by power saving (SpeedStep, etc.).

iliyapolak wrote:

Can you write a simple driver that contains only a driver entry function and an add-device function, and read the MSRs with inline assembly? AFAIK you can pass the MSR value to user-mode code via an IRP.

Igor Levicki wrote:

Yes I can, but the driver needs to be digitally signed to load on Windows 7 x64, and I do not have a certificate because I can't buy a cheap one in Serbia and those I can buy are too expensive.

Yes, I can use an open-source signed driver too, but that is not a solution to the main problem -- the inability to find out the current CPU frequency in user mode. That is a problem plaguing Windows and Linux user-mode applications, and nothing smart has been done so far to resolve it.

iliyapolak wrote:

Patchguard is blocking unsigned driver installation on 64-bit machines. AFAIK this protection has been broken and there is an option to disable Patchguard. Read the uninformed.org site; they have info on disabling Patchguard. Also try the software called Driver Signature Enforcement Overrider. It seems that this tool will help you install unsigned drivers.

Igor Levicki wrote:

That is cool, but I can do that only on my computer, not on a customer's :)

iliyapolak wrote:

Can you install "Driver Signature Enforcement Overrider" on the customer's machine?

jimdempseyatthecove wrote:

Igor,

Have you verified that timeGetTime is actually returning milliseconds since boot on an overclocked system?
(IOW, is it assuming a non-overclocked system?)

Jim Dempsey

www.quickthreadprogramming.com
Igor Levicki wrote:

Jim, that is not relevant because I do not use absolute time, just the period, and AFAIK timeGetTime() is returning monotonically incrementing time.

The only way I know to get the frequency is to read the APIC and the current multiplier, both of which require kernel-mode code.

In my opinion Intel should have created a way for user programs to get this information. Perhaps it is time to ask for a CPUFREQ instruction to be added to the x86 ISA.

iliyapolak wrote:

Or we can patch the microcode and set the RDMSR CPL to 3. :)

Igor Levicki wrote:

That still wouldn't solve the problem of reading BCLK.

iliyapolak wrote:

I think that the only solution for now is to write a simple driver, read the MSR and send the values to a userland app. Regarding Windows 7 64-bit, you can try to use the software I mentioned in my previous posts.

jimdempseyatthecove wrote:

>> that is not relevant because I do not use absolute time, just the period and AFAIK timeGetTime() is returning monotonically incrementing time.

return (r0 / f0) / 1000.0;

Where:

r0 = 3-sum (delta rdtsc over 40-40+ ms interval as returned by timeGetTime())
f0 = 3-sum (delta timeGetTime() over 40-40+ ms interval)

If timeGetTime() is itself counting at the accelerated (overclocked) rate, the ratio r0 / f0 will not change and the overclock will go unobserved.

You really want f0 = 120 ms of wall clock time.

Without seeing the code for timeGetTime(), it is not conclusive whether the ms is a "virtual ms" computed under the assumption of a fixed clock rate that is not overclocked.

My i7 2600K system is not overclocked, but it would be a relatively easy test for you to run on your overclocked system: verify that 10,000 ms as reported by timeGetTime() == 10 seconds of wall clock time.

Jim Dempsey

www.quickthreadprogramming.com
jimdempseyatthecove wrote:

By the way,

On Windows 7 x64 I looked at QueryPerformanceFrequency on my system and it returns 3,331,259 ticks/second. I thought that this would be the FSB frequency. On my older Q6600 XP x64 it was the FSB frequency. Since the system has Turbo Boost (or, as I prefer to call it, overheat protection slow-down), I cannot say whether the FSB fluctuates with Turbo Boost, so the system may have a different means of coming up with a (somewhat) constant high-precision frequency.

You could use a ratio of your RDTSC vs QueryPerformanceFrequency without using a driver (at least on Windows).

Jim Dempsey

www.quickthreadprogramming.com
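
A minimal sketch of the ratio Jim suggests, assuming MSVC (or a compatible compiler) on Windows for the measuring part. `tsc_mhz` and `measure_tsc_mhz` are hypothetical names; `QueryPerformanceCounter`, `QueryPerformanceFrequency`, `Sleep` and the `__rdtsc` intrinsic are the real APIs. Note that this measures the rate of the time-stamp counter, which equals the core clock only on CPUs whose TSC follows the core clock.

```c
#include <stdint.h>

/* Convert a TSC delta and a performance-counter delta into MHz.
 * qpc_freq is the performance-counter rate in ticks per second. */
double tsc_mhz(uint64_t tsc_delta, uint64_t qpc_delta, uint64_t qpc_freq)
{
    double seconds;
    if (qpc_delta == 0 || qpc_freq == 0)
        return 0.0;                       /* avoid division by zero */
    seconds = (double)qpc_delta / (double)qpc_freq;
    return (double)tsc_delta / seconds / 1e6;
}

#ifdef _WIN32
#include <windows.h>
#include <intrin.h>

/* Hypothetical helper: sample both counters around a sleep of sleep_ms. */
double measure_tsc_mhz(DWORD sleep_ms)
{
    LARGE_INTEGER freq, q0, q1;
    unsigned __int64 t0, t1;

    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&q0);
    t0 = __rdtsc();
    Sleep(sleep_ms);
    QueryPerformanceCounter(&q1);
    t1 = __rdtsc();

    return tsc_mhz(t1 - t0,
                   (uint64_t)(q1.QuadPart - q0.QuadPart),
                   (uint64_t)freq.QuadPart);
}
#endif
```

Calling `measure_tsc_mhz(100)` from `main` would print-ready return the TSC rate averaged over roughly 100 ms; a longer sleep reduces the relative error from the two back-to-back counter reads.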
Igor Levicki wrote:

Jim,

timeGetTime() is the Windows multimedia timer API, so my bet is that it is pretty fixed.

Turbo Boost does not affect the FSB, it affects the multiplier.

There is no FSB in the i7; there is BCLK, and it is 100 MHz.

QueryPerformanceFrequency() most likely returns HPET timer ticks.

You are free to experiment with that code (and overclocking and power management). I'd be grateful if you can make it work :)

andysem wrote:

I think public OS APIs are the best option for this. I haven't checked, but on Windows you can try using WMI, and on Linux /proc/cpuinfo seems to report the current CPU frequency. You can also see whether /sys/devices/system/cpu/cpu0/cpufreq/* suits you, although on my machine reading these files requires root privileges.
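
The sysfs route can be sketched as follows, assuming a Linux kernel with the cpufreq subsystem enabled. The file `scaling_cur_freq` (value in kHz) is one of the entries under the directory named above; the helper name `read_cpufreq_mhz` is hypothetical. Whether the reported value reflects turbo multipliers depends on the cpufreq driver in use, so treat the result as the kernel's view rather than a measurement.

```c
#include <stdio.h>

/* Read a cpufreq sysfs file containing a frequency in kHz.
 * Returns the value in MHz, or -1.0 if the file cannot be opened or parsed
 * (for example when the entry is missing or needs root privileges). */
double read_cpufreq_mhz(const char *path)
{
    FILE *f = fopen(path, "r");
    unsigned long khz = 0;
    int ok;

    if (f == NULL)
        return -1.0;
    ok = fscanf(f, "%lu", &khz);
    fclose(f);
    return (ok == 1) ? (double)khz / 1000.0 : -1.0;
}
```

A caller would pass a path such as "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq" and fall back to another method when the function returns -1.0.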

iliyapolak wrote:

Igor!
As andysem wrote in his post, you can use WMI to obtain accurate info about the CPU frequency.

Igor Levicki wrote:

I just tested this, but WMI also reports an incorrect frequency -- it is not aware of the Turbo Boost multiplier being set to 40x and it shows 3400 MHz instead of 4000 MHz.

-- Regards, Igor Levicki. If you find my post helpful, please rate it and/or select it as a best answer where applicable. Thank you.

New CPUs have a "constant timestamp counter frequency" feature. This means that the counter read by the rdtsc instruction doesn't change its frequency when the CPU cores are overclocked or downclocked by Turbo Boost. It also means that you cannot detect the current CPU frequency by comparing rdtsc progress to HPET progress.
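
Whether a given CPU advertises this feature can be checked from user mode with CPUID. Here is a minimal sketch, assuming an x86 build with MSVC (`__cpuid` from `<intrin.h>`) or GCC/Clang (`__get_cpuid` from `<cpuid.h>`): bit 8 of EDX in leaf 0x80000007 is the invariant TSC flag. The function name `has_invariant_tsc` is hypothetical.

```c
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#endif

/* Returns 1 if the CPU advertises an invariant TSC, 0 if it does not,
 * and -1 if the leaf is unavailable or the build target is not x86. */
int has_invariant_tsc(void)
{
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4];
    __cpuid(regs, (int)0x80000000u);            /* highest extended leaf */
    if ((unsigned int)regs[0] < 0x80000007u)
        return -1;
    __cpuid(regs, (int)0x80000007u);
    return (regs[3] >> 8) & 1;                  /* EDX bit 8 */
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return -1;                              /* leaf not supported */
    return (int)((edx >> 8) & 1u);              /* EDX bit 8 */
#else
    return -1;
#endif
}
```

A result of 1 means a TSC-based measurement like the one at the top of this thread reports the counter's fixed rate rather than the current core clock.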
