Possible rdtsc bug
I'm running the following program on a Dell Inspiron Mini 10 (Atom Z520).
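#include <stdio.h>
#include <windows.h>

int main(void)
{
    while (1)
    {
        unsigned __int64 time;
        /* rdtsc returns the low 32 bits in eax and the high 32 bits in edx */
        _asm rdtsc
        _asm mov dword ptr [ time + 0 ], eax
        _asm mov dword ptr [ time + 4 ], edx
        printf("\n%I64X", time);
        Sleep(1000);
    }
}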
stdout:
=======
674EE7002254
674F2EB961BE
674F2BE098F4 *
674F46D3B240
674F7557DCCC
674F7100E0CE *
674F8B98AC14
674F9FC81058
674FB05DBC10
674FBE4559B4
674FDE1093EE
674FF73CB000
675022167C70
67504F43C644
* Time is going backwards.
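The check for "time going backwards" can also be done mechanically on a captured log. The following is a minimal sketch, not part of the original test program: the function name `count_backward_steps` and the array `logged_tsc` are hypothetical, and the array simply holds the first seven values from the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the samples that are smaller than the sample logged just before them. */
size_t count_backward_steps(const uint64_t *samples, size_t count)
{
    size_t backward = 0;
    assert(samples != NULL || count == 0);
    for (size_t i = 1; i < count; ++i) {
        if (samples[i] < samples[i - 1])
            ++backward;
    }
    return backward;
}

/* First seven values from the log above (hypothetical array name). */
const uint64_t logged_tsc[] = {
    0x674EE7002254ULL, 0x674F2EB961BEULL, 0x674F2BE098F4ULL,
    0x674F46D3B240ULL, 0x674F7557DCCCULL, 0x674F7100E0CEULL,
    0x674F8B98AC14ULL
};
```

Run over these seven samples, the function reports two backward steps, matching the two lines marked with `*`.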
Re: Possible rdtsc bug
The speed of the tsc seems to depend on CPU workload: I changed the Sleep(1000) in the loop to a workload that keeps CPU usage at 100% (I used an AES256 encryption function, not that this should matter). The left column is the rdtsc value (hex). The right column is the difference from the previous value (in decimal). The long-term rate of increase is constant and seems valid. An error of ~0x1B300000 seems to appear and disappear over time.

68A0A3168292  1328875720
68A0F25BF742  1329951920
68A141A16990  1329951310
68A175B39492   873605890
68A1C4FA2B9C  1330026250
68A22F74D828  1786424460
68A27EBCEABA  1330123410
68A2B2D12956   873741980
68A31D49DA30  1786294490
68A36C9278AE  1330159230
68A3BBD50F80  1329764050
68A40B1BA9E6  1330027110
68A45A61EE20  1330005050
68A4A9A884B2  1330026130
68A4F8EEDEFE  1330010700
68A52D017404   873633030
68A5977D3552  1786495310
68A5E6C497C2  1330078320
68A6360BDAF2  1330070320
68A66A1B18BC   873414090
68A6B9642454  1330187160
68A708A8473E  1329865450
68A77323005E  1786427680
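To make the pattern in the difference column easier to spot, each interval can be compared against the expected count for one second. This is only a sketch under stated assumptions: the nominal value of 1330000000 ticks (from the 1330 MHz CPU speed) and the tolerance of roughly half the ~0x1B300000 error are assumptions, and `classify_interval` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions, not measurements: roughly one second between samples at the
   nominal 1330 MHz, and a tolerance of about half the ~0x1B300000 error. */
#define NOMINAL_DELTA 1330000000LL
#define TOLERANCE      228000000LL

/* Returns -1 if the interval is short, +1 if it is long, 0 if it is nominal. */
int classify_interval(uint64_t previous, uint64_t current)
{
    assert(current >= previous);
    int64_t deviation = (int64_t)(current - previous) - NOMINAL_DELTA;
    if (deviation < -TOLERANCE)
        return -1;
    if (deviation > TOLERANCE)
        return 1;
    return 0;
}
```

With the values listed in the post, the step from 68A141A16990 to 68A175B39492 is classified as short, and the step from 68A1C4FA2B9C to 68A22F74D828 as long.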
Re: Possible rdtsc bug
Hi Mart,
since your report seems to point at a possible assembly or even microcode issue, I assume that you are not reporting a problem with the software development tool suites for the Intel(R) Atom(TM) Processor.
I forwarded your sighting to some colleagues in the hardware performance teams. I'll get back to you as soon as I find out more.
Thanks, Rob
Re: Possible rdtsc bug
Hi Rob,
I was directed here by Sergio from online chat support. I'm hoping to be redirected to the right forum/place... thanks!
Martin
Re: Possible rdtsc bug
Hi Martin,
I had a few more exchanges with our Intel(R) Atom(TM) Processor core performance team. They are treating this as a possible hardware sighting and are tracking it and trying to reproduce it.
Could you try to provide a bit more input as to which exact CPU power state the sleep setting on your system relates to? Would this be C4? Is it possible for you to provide us with the exact CPU chip ID? Which microcode version or updates and which BIOS version are running on the system?
We were not able to reproduce your sighting on a standard Z510 with a gcc compiled binary running for several hours (with and without the 1 second delay).
Knowing the exact nature of the power mode switch and the underlying hardware details may be key.
I know you provided us with the basic code snippet to reproduce - since it doesn't show up on our verification systems - do you think you could provide us with the binary you use for testing?
Thanks, Rob
Re: Possible rdtsc bug
Hi Robert,
I use the BIOS provided by Dell. They provided this file to install it: TigerA03.exe. This is the first page of the BIOS Utility (F2 when booting):

Dell Inc. Phoenix SecureCore(tm) Setup utility
BIOS Version: A03
CPU Type: Intel(R) Atom(TM) CPU Z520
CPU Speed: 1330 MHz
CPU Cache Size: 512 KB
CPU ID: 106C2
Product Name: Inspiron 1010
I'm not sure how to change the power mode. In the BIOS, there is a setting called "Intel(R) SpeedStep(TM) Technology". I tried both the Enabled and Disabled modes and got the same behavior. If this is not what you meant, please direct me to the appropriate place.
When I run the cpuid instruction with parameter 1, I get: eax = 0x00106C2 ebx = 0x0020800
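For reference, the cpuid(1) registers can be decoded as follows. This is a small sketch with hypothetical names (`CpuSignature`, `decode_signature`, `initial_apic_id`); the bit layout follows the usual CPUID leaf 1 convention, where eax carries the family/model/stepping signature and ebx bits 31..24 carry the initial APIC ID.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned family;
    unsigned model;
    unsigned stepping;
} CpuSignature;

/* Decode the processor signature returned in eax by cpuid leaf 1. */
CpuSignature decode_signature(uint32_t eax)
{
    CpuSignature sig;
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned base_model  = (eax >> 4) & 0xF;
    unsigned ext_family  = (eax >> 20) & 0xFF;
    unsigned ext_model   = (eax >> 16) & 0xF;

    sig.stepping = eax & 0xF;
    /* The extended family is only added when the base family is 0xF. */
    sig.family = (base_family == 0xF) ? base_family + ext_family : base_family;
    /* The extended model is only used for base families 0x6 and 0xF. */
    sig.model = (base_family == 0x6 || base_family == 0xF)
                    ? ((ext_model << 4) | base_model)
                    : base_model;
    return sig;
}

/* Bits 31..24 of ebx from cpuid leaf 1 hold the initial APIC ID. */
unsigned initial_apic_id(uint32_t ebx)
{
    return ebx >> 24;
}
```

Applied to the values above, eax decodes to family 6, model 0x1C, stepping 2, and the ebx value shown has an initial APIC ID of 0.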
Unfortunately, I only have one Inspiron Mini 1010 here, so I can't check if this is a manufacturing defect or a design flaw. I've ordered two more, but the delivery is scheduled for the second week of June. These newer Inspiron 1011 come with a different processor: the Atom N270 1.6 GHz.
How can I send you my application?
Thanks, Martin
Re: Possible rdtsc bug
Hi Martin,
the feedback I got from our hardware team by now is that this is an "old" known issue that is fixed in patch 20A.
Let me check whether they have some insight for you where you can get that patch as well.
Rob
Re: Possible rdtsc bug
Hi Martin,
our hardware and firmware folks confirm that what you most likely need is a BIOS update patch to version 20A or above. Dell should hopefully be able to provide this patch to you.
If you would like, I can email you a little check utility that verifies the BIOS version on your system. I assume the email address in your profile would work?
Thanks, Rob
Re: Possible rdtsc bug
OK. I'll contact Dell. My email address should work fine. Please send me this utility.
Thank you very much for your help. Martin
Re: Possible rdtsc bug
Hi Robert, I contacted Dell, and they're going to "escalate" this forum thread to their BIOS group. They can't give me a timeline for the update, so I'll tag this thread as resolved, with many thanks to you. When (if?) the BIOS group delivers a new version, I'll tell you if this fixed my problem.
Thanks again, Martin
Re: Possible rdtsc bug
My Dell BIOS does not support disabling Hyper-Threading for some reason, so I could not try your suggestion directly. However, using the APIC field from the cpuid instruction, I was able to see what is going on. The third value listed is the ebx result of the cpuid(1) instruction (which is run right before rdtsc). It's now obvious that Windows is scheduling the thread on two logical CPUs (for some reason, it does this after some boots and not after others).
rdtsc (hex)  difference (dec)  cpuid(1) ebx
54891B575A   1321055690        01020800
54D860301A   1329912000        01020800
55432A36CA   1791624880        00020800
5592708AEA   1330009120        00020800
55E1B6DE56   1330008940        00020800
5630FD4D9C   1330016070        00020800
5680438B3C   1330003360        00020800
56CF8AD7C2   1330072710        00020800
571ED04B00   1329951550        00020800
5752935E0A    868422410        01020800
57BD5D0A92   1791601800        00020800
57F11FDCD8    868405830        01020800
5840664192   1330013370        01020800
588FAC893C   1330005930        01020800
58DEF3C1EA   1330067630        01020800
The tsc counters of both hyper-threaded CPUs are increasing in lock step. However, the tsc of CPU 1 lags behind that of CPU 0 by ~461552170 clocks. This behavior does not occur on Pentium 4 processors with Hyper-Threading, so profiling works there even with Hyper-Threading enabled and Windows scheduling the threads on different processors over time. I looked at the cycle difference between the processors on the Pentium 4 and it is very, very small. Perhaps there is only one counter(?).
There are several workarounds that I can use to get around this problem with the Atom.
1. The easiest is to restrict Windows to a single processor. This can be done from the msconfig application, or manually in the boot.ini. This will reduce performance somewhat. I tried it, and this fixes the problem, as one would expect.
2. I could also measure the difference between the two processors' timestamps (it seems to be constant after boot), and apply the constant correction of ~461552170 clocks to CPU 1 (identified using cpuid). There are a few serialization issues here which are not fun, but they can be addressed either in a statistical fashion (i.e. checking cpuid before and after rdtsc) or with proper serialization in a driver.
If solution 2 could be implemented in the rdtsc microcode, this would be the simplest solution for customers, as all serialization issues would be solved, and programs that work on Pentium 4 HT would work on Atom HT without modification.
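The constant-correction workaround could look roughly like the sketch below. It is not a tested fix: the names `CPU1_TSC_LAG`, `apic_id_of` and `corrected_tsc` are hypothetical, the offset is the value quoted above and would have to be re-measured after each boot, and the sketch assumes that the ebx value was read on the same logical CPU as the rdtsc value, which is exactly the serialization problem mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: offset measured once after boot; this is the value quoted in the post. */
#define CPU1_TSC_LAG 461552170ULL

/* Bits 31..24 of ebx from cpuid leaf 1 hold the initial APIC ID. */
unsigned apic_id_of(uint32_t cpuid1_ebx)
{
    return cpuid1_ebx >> 24;
}

/* Bring a raw reading from either logical CPU onto CPU 0's timeline. */
uint64_t corrected_tsc(uint64_t raw_tsc, uint32_t cpuid1_ebx)
{
    unsigned apic_id = apic_id_of(cpuid1_ebx);
    /* The sketch only covers the two logical CPUs seen in the log. */
    assert(apic_id <= 1);
    return (apic_id == 1) ? raw_tsc + CPU1_TSC_LAG : raw_tsc;
}
```

Applied to the log above, the interval from 54D860301A (read on CPU 1) to 55432A36CA (read on CPU 0) shrinks from 1791624880 to 1330072710 clocks, which is in line with the intervals measured on a single CPU.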