* [RFT] Please test rdtsc on various x86-64 hardware (app included)
@ 2011-04-18 19:32 Andrew Lutomirski
  2011-04-18 20:16 ` Linus Torvalds
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Andrew Lutomirski @ 2011-04-18 19:32 UTC (permalink / raw)
  To: linux-kernel, Ingo Molnar, Linus Torvalds, Andi Kleen, x86

Hi all-

I'd appreciate some help testing rdtsc's ordering wrt memory on
various hardware.  You can download evil-clock-test code at:

https://gitorious.org/linux-test-utils/linux-clock-tests/blobs/raw/master/evil-clock-test.cc

or pull from:

git://gitorious.org/linux-test-utils/linux-clock-tests.git

or see it online at:

https://gitorious.org/linux-test-utils/linux-clock-tests

No kernel patches required.  If you have an old glibc then timing_test
will fail to build.  You can ignore that problem, because I'm only
really interested in what evil-clock-test says.

On Sandy Bridge, you'll see something like:

$ ./evil-clock-test
CPU vendor   : GenuineIntel
CPU model    : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
CPU stepping : 7
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 68 with 78370992 samples
Load3 test passed: margin 71 with 12740250 samples
Load test passed : margin 60 with 17743461 samples
Store test failed as expected: worst error 3316 with 14666029 samples
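
For readers unfamiliar with the clock names in that output, here is a minimal sketch of what an "lfence;rdtsc" clock (and the "mfence;rdtsc" variant chosen on non-Intel CPUs) might look like. It assumes GCC or Clang on x86-64 and the intrinsics from <x86intrin.h>; the function bodies are illustrative and are not copied from evil-clock-test.cc.

```cpp
#include <cstdint>
#include <x86intrin.h>  // GCC/Clang on x86-64: _mm_lfence(), _mm_mfence(), __rdtsc()

// Hypothetical helpers for illustration only. The names mirror the clock
// names printed by the tool, but this is not the tool's own code.

// LFENCE does not execute until earlier instructions have completed
// locally, so the following RDTSC is not executed ahead of earlier loads.
static inline std::uint64_t lfence_rdtsc()
{
    _mm_lfence();
    return __rdtsc();
}

// Same idea with MFENCE, the fence the tool reports using on non-Intel CPUs.
static inline std::uint64_t mfence_rdtsc()
{
    _mm_mfence();
    return __rdtsc();
}
```

Neither fence orders RDTSC against instructions that come after it, so this only constrains the timestamp relative to earlier work.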

I've tested on Sandy Bridge, Allendale (i.e. Pentium Dual-Core),
Bloomfield, and C2D.  I don't have any AMD machines with usable TSCs,
and I haven't tested on systems with multiple packages.  (If you're
feeling adventurous, you can play with the -p option.  If you give it
two CPU numbers, comma-separated, which live on different packages,
maybe you'll learn something interesting.  It might also be
interesting to try evil-clock-test -3 -p a,b,c where c is on a
different package from a and b.)

(Oddly enough, the test *passes* on my C2D box, even though the kernel
thinks that my TSC halts in idle.  This is with a fair amount of time
spent in C6 and even after a suspend/resume cycle.  I'm not sure
what's going on there.)

For those of you who really care about this stuff, the 'store test'
will *fail* on most Intel systems.  IMO that's OK, since fixing it
would slow everything down and since I don't think it deserves to
pass, even though it looks like the tsc is warping.

--Andy


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 19:32 [RFT] Please test rdtsc on various x86-64 hardware (app included) Andrew Lutomirski
@ 2011-04-18 20:16 ` Linus Torvalds
  2011-04-18 20:37   ` Ingo Molnar
  2011-04-18 20:23 ` Colin Walters
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2011-04-18 20:16 UTC (permalink / raw)
  To: Andrew Lutomirski; +Cc: linux-kernel, Ingo Molnar, Andi Kleen, x86

On Mon, Apr 18, 2011 at 12:32 PM, Andrew Lutomirski <luto@mit.edu> wrote:
>
> I've tested on Sandy Bridge, Allendale (i.e. Pentium Dual-Core),
> Bloomfield, and C2D.

Arrandale:

  CPU vendor   : GenuineIntel
  CPU model    : Intel(R) Core(TM) i5 CPU         670  @ 3.47GHz
  CPU stepping : 2
  TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
  Using lfence_rdtsc because you have an Intel CPU
  Will test the "lfence;rdtsc" clock.
  Now test passed  : margin 98 with 71696816 samples
  Load3 test passed: margin 72 with 8926478 samples
  Load test passed : margin 88 with 14660759 samples
  Store test failed as expected: worst error 764 with 10784245 samples

but that wasn't very surprising since you already tested the
micro-architectures just around it.

On a dual-socket X5550 (master.kernel.org: 16 threads total: 4 cores
per socket, with HT):

  CPU vendor   : GenuineIntel
  CPU model    : Intel(R) Xeon(R) CPU           X5550  @ 2.67GHz
  CPU stepping : 5
  TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
  Using lfence_rdtsc because you have an Intel CPU
  Will test the "lfence;rdtsc" clock.
  Now test passed  : margin 166 with 38875272 samples
  Load3 test passed: margin 235 with 2972261 samples
  Load test passed : margin 119 with 4409699 samples
  Store test failed as expected: worst error 2710 with 3764216 samples

Some opteron love would be good, but I don't have access to any right here.

                     Linus


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 19:32 [RFT] Please test rdtsc on various x86-64 hardware (app included) Andrew Lutomirski
  2011-04-18 20:16 ` Linus Torvalds
@ 2011-04-18 20:23 ` Colin Walters
  2011-04-18 20:27   ` Andrew Lutomirski
  2011-04-18 22:10 ` Andi Kleen
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Colin Walters @ 2011-04-18 20:23 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: linux-kernel, Ingo Molnar, Linus Torvalds, Andi Kleen, x86

On Mon, Apr 18, 2011 at 3:32 PM, Andrew Lutomirski <luto@mit.edu> wrote:
> Hi all-
>
> I'd appreciate some help testing rdtsc's ordering wrt memory on
> various hardware.  You can download evil-clock-test code at:
>
> https://gitorious.org/linux-test-utils/linux-clock-tests/blobs/raw/master/evil-clock-test.cc

Hmm...the first time I ran it, it started OK, then printed over and over:

  ERROR!  Time1 went back by 2380216472
  ERROR!  Time1 went back by 2380216080
  ERROR!  Time1 went back by 2380215704
  ERROR!  Time1 went back by 2380215320

and the original output was lost in the terminal emulator history.
After piping it to tee a second time, of course it worked and didn't
print any errors =)

CPU vendor   : GenuineIntel
CPU model    : Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
CPU stepping : 10
TSC flags    : tsc constant_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 160 with 28911200 samples
Load3 test passed: margin 144 with 2493322 samples
Load test passed : margin 120 with 4929138 samples
Store test failed as expected: worst error 2184 with 4409828 samples

What's interesting is it seems unpredictable when running it whether
it will error out =/ Here's the start of a failing trace:

CPU vendor   : GenuineIntel
CPU model    : Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
CPU stepping : 10
TSC flags    : tsc constant_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test failed  : worst error 2399269920 with 28704328 samples
  ERROR!  Time1 went back by 2399197568
  ERROR!  Time1 went back by 2399196984


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 20:23 ` Colin Walters
@ 2011-04-18 20:27   ` Andrew Lutomirski
  2011-04-18 20:35     ` Colin Walters
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Lutomirski @ 2011-04-18 20:27 UTC (permalink / raw)
  To: Colin Walters; +Cc: linux-kernel, Ingo Molnar, Linus Torvalds, Andi Kleen, x86

On Mon, Apr 18, 2011 at 4:23 PM, Colin Walters <walters@verbum.org> wrote:
> On Mon, Apr 18, 2011 at 3:32 PM, Andrew Lutomirski <luto@mit.edu> wrote:
>> Hi all-
>>
>> I'd appreciate some help testing rdtsc's ordering wrt memory on
>> various hardware.  You can download evil-clock-test code at:
>>
>> https://gitorious.org/linux-test-utils/linux-clock-tests/blobs/raw/master/evil-clock-test.cc
>
> Hmm...the first time I ran it, it started OK, then printed over and over:
>
>  ERROR!  Time1 went back by 2380216472
>  ERROR!  Time1 went back by 2380216080
>  ERROR!  Time1 went back by 2380215704
>  ERROR!  Time1 went back by 2380215320

Well, crap.  Can you run:
 dmesg | grep -i tsc

There are two possible explanations:
1. Your TSCs are out of sync, and whether the test notices or not
depends on which CPUs the scheduler puts the threads on.
2. I have a dumb bug that makes it malfunction.  I used to have some
of those but I thought I fixed them.

Thanks,
Andy

>
> and the original output was lost in the terminal emulator history.
> After piping it to tee a second time, of course it worked and didn't
> print any errors =)
>
> CPU vendor   : GenuineIntel
> CPU model    : Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
> CPU stepping : 10
> TSC flags    : tsc constant_tsc
> Using lfence_rdtsc because you have an Intel CPU
> Will test the "lfence;rdtsc" clock.
> Now test passed  : margin 160 with 28911200 samples
> Load3 test passed: margin 144 with 2493322 samples
> Load test passed : margin 120 with 4929138 samples
> Store test failed as expected: worst error 2184 with 4409828 samples
>
> What's interesting is it seems unpredictable when running it whether
> it will error out =/ Here's the start of a failing trace:
>
> CPU vendor   : GenuineIntel
> CPU model    : Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
> CPU stepping : 10
> TSC flags    : tsc constant_tsc
> Using lfence_rdtsc because you have an Intel CPU
> Will test the "lfence;rdtsc" clock.
> Now test failed  : worst error 2399269920 with 28704328 samples
>  ERROR!  Time1 went back by 2399197568
>  ERROR!  Time1 went back by 2399196984
>


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 20:27   ` Andrew Lutomirski
@ 2011-04-18 20:35     ` Colin Walters
  2011-04-18 20:39       ` Andrew Lutomirski
  0 siblings, 1 reply; 15+ messages in thread
From: Colin Walters @ 2011-04-18 20:35 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: linux-kernel, Ingo Molnar, Linus Torvalds, Andi Kleen, x86

On Mon, Apr 18, 2011 at 4:27 PM, Andrew Lutomirski <luto@mit.edu> wrote:

> Well, crap.  Can you run:
>  dmesg | grep -i tsc

# dmesg|grep -i tsc
[    0.000000] Fast TSC calibration using PIT
[    0.098999] TSC synchronization [CPU#0 -> CPU#1]:
[    0.098999] Measured 2399269672 cycles TSC warp between CPUs, turning off TSC clock.
[    0.098999] Marking TSC unstable due to check_tsc_sync_source failed

> There are two possible explanations:
> 1. Your tscs are out of sync, and whether the test notices or not
> depends on which cpus the scheduler sticks the threads on.

Looks like that's the case?   But for what you want to do in kernel,
the kernel already did this test and so would know to not use the TSC
for vgettimeofday(), right?  (I only sort of followed the clock
discussion earlier but I found it quite interesting, so decided to run
the test).


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 20:16 ` Linus Torvalds
@ 2011-04-18 20:37   ` Ingo Molnar
  2011-04-19  0:38     ` Markus Trippelsdorf
  0 siblings, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2011-04-18 20:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Lutomirski, linux-kernel, Andi Kleen, x86


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Some opteron love would be good, but I don't have access to any right here.

Here's one:

 CPU vendor   : AuthenticAMD
 CPU model    : Quad-Core AMD Opteron(tm) Processor 8356
 CPU stepping : 3
 TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
 Using mfence_rdtsc because you don't have an Intel CPU
 Will test the "mfence;rdtsc" clock.
 Now test passed  : margin 252 with 7198256 samples
 Load3 test passed: margin 536 with 1833844 samples
 Load test passed : margin 219 with 3097882 samples
 Store test passed: margin 250 with 3553454 samples

Thanks,

	Ingo


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 20:35     ` Colin Walters
@ 2011-04-18 20:39       ` Andrew Lutomirski
  0 siblings, 0 replies; 15+ messages in thread
From: Andrew Lutomirski @ 2011-04-18 20:39 UTC (permalink / raw)
  To: Colin Walters; +Cc: linux-kernel, Ingo Molnar, Linus Torvalds, Andi Kleen, x86

On Mon, Apr 18, 2011 at 4:35 PM, Colin Walters <walters@verbum.org> wrote:
> On Mon, Apr 18, 2011 at 4:27 PM, Andrew Lutomirski <luto@mit.edu> wrote:
>
>> Well, crap.  Can you run:
>>  dmesg | grep -i tsc
>
> # dmesg|grep -i tsc
> [    0.000000] Fast TSC calibration using PIT
> [    0.098999] TSC synchronization [CPU#0 -> CPU#1]:
> [    0.098999] Measured 2399269672 cycles TSC warp between CPUs, turning off TSC clock.
> [    0.098999] Marking TSC unstable due to check_tsc_sync_source failed
>
>> There are two possible explanations:
>> 1. Your tscs are out of sync, and whether the test notices or not
>> depends on which cpus the scheduler sticks the threads on.
>
> Looks like that's the case?   But for what you want to do in kernel,
> the kernel already did this test and so would know to not use the TSC
> for vgettimeofday(), right?  (I only sort of followed the clock
> discussion earlier but I found it quite interesting, so decided to run
> the test).
>

Yes, the kernel won't run this code at all on your system.  That being
said, you have the constant_tsc flag and the error my tool measured is
pretty close to the error the kernel measured at boot, so it might be
interesting for the kernel to learn how to synchronize the TSCs
itself.  This is possible, albeit awkward, on newish Intel CPUs.  It's
possible but *really* awkward on older CPUs.
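
To make "pretty close" concrete, here is a small worked calculation using the two numbers reported for the Q9400 box earlier in the thread. It is only arithmetic on the posted values. The helper names are made up, and the 2.66 GHz tick rate is an assumption taken from the CPU model string, not a measured TSC frequency.

```cpp
#include <cstdint>

// Values copied from the messages above (Q9400 system).
constexpr std::uint64_t kToolWorstError = 2399269920ULL;  // "Now test failed: worst error"
constexpr std::uint64_t kKernelWarp     = 2399269672ULL;  // dmesg "Measured ... cycles TSC warp"

// Assumption: the TSC ticks at about the nominal 2.66 GHz from the model string.
constexpr double kAssumedTscHz = 2.66e9;

// Absolute difference between two cycle counts.
constexpr std::uint64_t cycle_difference(std::uint64_t a, std::uint64_t b)
{
    return a > b ? a - b : b - a;
}

// Convert a cycle count to seconds at the given tick rate.
constexpr double cycles_to_seconds(std::uint64_t cycles, double hz)
{
    return static_cast<double>(cycles) / hz;
}
```

With these inputs, cycle_difference(kToolWorstError, kKernelWarp) is 248 cycles, and cycles_to_seconds(kKernelWarp, kAssumedTscHz) comes out at roughly 0.9 s, so the two measurements agree to within a few hundred cycles on an offset of most of a second.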

--Andy


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 19:32 [RFT] Please test rdtsc on various x86-64 hardware (app included) Andrew Lutomirski
  2011-04-18 20:16 ` Linus Torvalds
  2011-04-18 20:23 ` Colin Walters
@ 2011-04-18 22:10 ` Andi Kleen
  2011-04-19  2:15   ` Andrew Lutomirski
  2011-04-19  0:49 ` Mihai Donțu
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2011-04-18 22:10 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: linux-kernel, Ingo Molnar, Linus Torvalds, Andi Kleen, x86

I tested it on a dual Westmere-EP and a Quad Westmere-EX.
Seems to pass everywhere.

However, I kept the default MAX_THREADS of 4.  Shouldn't that be
increased for the large systems?

I suspect the tests as written didn't really use the large
systems.

Dual:

CPU vendor   : GenuineIntel
CPU model    : Intel(R) Xeon(R) CPU           E5640  ...
CPU stepping : ..
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test failed  : worst error 48 with 48647928 samples
Load3 test passed: margin 368 with 1013805 samples
Load test passed : margin 368 with 4142659 samples
Store test failed as expected: worst error 2940 with 4298473 samples

Quad:

CPU vendor   : GenuineIntel
CPU model    : Intel(R) Xeon(R) CPU E7 ...
CPU stepping : ..
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 208 with 11997136 samples
Load3 test passed: margin 812 with 98710 samples
Load test passed : margin 208 with 1259451 samples
Store test failed as expected: worst error 9284 with 1086616 samples


-Andi


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 20:37   ` Ingo Molnar
@ 2011-04-19  0:38     ` Markus Trippelsdorf
  0 siblings, 0 replies; 15+ messages in thread
From: Markus Trippelsdorf @ 2011-04-19  0:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Andrew Lutomirski, linux-kernel, Andi Kleen, x86

On 2011.04.18 at 22:37 +0200, Ingo Molnar wrote:
> 
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > Some opteron love would be good, but I don't have access to any right here.
> 
> Here's one:

And another:

CPU vendor   : AuthenticAMD
CPU model    : AMD Phenom(tm) II X4 955 Processor
CPU stepping : 2
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using mfence_rdtsc because you don't have an Intel CPU
Will test the "mfence;rdtsc" clock.
Now test passed  : margin 208 with 8769512 samples
Load3 test passed: margin 339 with 4880373 samples
Load test passed : margin 193 with 7189144 samples
Store test passed: margin 192 with 7149930 samples

-- 
Markus


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 19:32 [RFT] Please test rdtsc on various x86-64 hardware (app included) Andrew Lutomirski
                   ` (2 preceding siblings ...)
  2011-04-18 22:10 ` Andi Kleen
@ 2011-04-19  0:49 ` Mihai Donțu
  2011-04-19  8:10 ` Frank Kingswood
  2011-04-22 14:33 ` Jan Ceuleers
  5 siblings, 0 replies; 15+ messages in thread
From: Mihai Donțu @ 2011-04-19  0:49 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: linux-kernel, Ingo Molnar, Linus Torvalds, Andi Kleen, x86

On Mon, 18 Apr 2011 15:32:52 -0400 Andrew Lutomirski <luto@mit.edu> wrote:
> Hi all-
> 
> I'd appreciate some help testing rdtsc's ordering wrt memory on
> various hardware.  You can download evil-clock-test code at:
> 
> https://gitorious.org/linux-test-utils/linux-clock-tests/blobs/raw/master/evil-clock-test.cc

On my laptop:
$ dmesg | grep -i tsc
[    0.000000] Fast TSC calibration using PIT
[    0.099991] TSC synchronization [CPU#0 -> CPU#1]:
[    0.099991] Measured 3156958390 cycles TSC warp between CPUs, turning off TSC clock.
[    0.099991] Marking TSC unstable due to check_tsc_sync_source failed

$ ./a.out
CPU vendor   : GenuineIntel
CPU model    : Intel(R) Core(TM)2 CPU         T5500  @ 1.66GHz
CPU stepping : 6
TSC flags    : tsc constant_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 200 with 17193024 samples
Load3 test passed: margin 23270 with 45 samples
Load test passed : margin 70 with 3704403 samples
Store test failed as expected: worst error 50 with 3535212 samples

On a colleague's laptop:
$ dmesg | grep -i tsc
[    0.000000] Fast TSC calibration using PIT
[    0.339747] Marking TSC unstable due to TSC halts in idle

$ ./a.out 
CPU vendor   : GenuineIntel
CPU model    : Intel(R) Core(TM)2 Duo CPU     T9300  @ 2.50GHz
CPU stepping : 6
TSC flags    : tsc constant_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 87 with 44323872 samples
Load3 test passed: margin 32062 with 28 samples
Load test passed : margin 62 with 14113177 samples
Store test failed as expected: worst error 850 with 12444429 samples

One of the servers I have around (2 x 6 core + HT):
$ dmesg | grep -i tsc
[    0.000000] Fast TSC calibration using PIT
[    0.398130] checking TSC synchronization [CPU#0 -> CPU#1]: passed.
[    0.577634] checking TSC synchronization [CPU#0 -> CPU#2]: passed.
[    0.757168] checking TSC synchronization [CPU#0 -> CPU#3]: passed.
[    0.936733] checking TSC synchronization [CPU#0 -> CPU#4]: passed.
[    1.116321] checking TSC synchronization [CPU#0 -> CPU#5]: passed.
[    1.295821] checking TSC synchronization [CPU#0 -> CPU#6]: passed.
[    1.475418] checking TSC synchronization [CPU#0 -> CPU#7]: passed.
[    1.654917] checking TSC synchronization [CPU#0 -> CPU#8]: passed.
[    1.834550] checking TSC synchronization [CPU#0 -> CPU#9]: passed.
[    2.014061] checking TSC synchronization [CPU#0 -> CPU#10]: passed.
[    2.193601] checking TSC synchronization [CPU#0 -> CPU#11]: passed.
[    2.373170] checking TSC synchronization [CPU#0 -> CPU#12]: passed.
[    2.552713] checking TSC synchronization [CPU#0 -> CPU#13]: passed.
[    2.732212] checking TSC synchronization [CPU#0 -> CPU#14]: passed.
[    2.911760] checking TSC synchronization [CPU#0 -> CPU#15]: passed.
[    3.091288] checking TSC synchronization [CPU#0 -> CPU#16]: passed.
[    3.270920] checking TSC synchronization [CPU#0 -> CPU#17]: passed.
[    3.450454] checking TSC synchronization [CPU#0 -> CPU#18]: passed.
[    3.629995] checking TSC synchronization [CPU#0 -> CPU#19]: passed.
[    3.809492] checking TSC synchronization [CPU#0 -> CPU#20]: passed.
[    3.989045] checking TSC synchronization [CPU#0 -> CPU#21]: passed.
[    4.168647] checking TSC synchronization [CPU#0 -> CPU#22]: passed.
[    4.348183] checking TSC synchronization [CPU#0 -> CPU#23]: passed.
[    4.577658] Switching to clocksource tsc

$ ./a.out
CPU vendor   : GenuineIntel
CPU model    : Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
CPU stepping : 2
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 233 with 32792352 samples
Load3 test passed: margin 193 with 1912536 samples
Load test passed : margin 85 with 4347111 samples
Store test failed as expected: worst error 1996 with 3920055 samples

-- 
Mihai Donțu


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 22:10 ` Andi Kleen
@ 2011-04-19  2:15   ` Andrew Lutomirski
  0 siblings, 0 replies; 15+ messages in thread
From: Andrew Lutomirski @ 2011-04-19  2:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, Ingo Molnar, Linus Torvalds, x86

On Mon, Apr 18, 2011 at 6:10 PM, Andi Kleen <andi@firstfloor.org> wrote:
> I tested it on a dual Westmere-EP and a Quad Westmere-EX.
> Seems to pass everywhere.

I see a failure below...

>
> However I kept the default MAX_THREADS 4. Shouldn't that be
> increased for the large systems?

MAX_THREADS is just the size of a data structure -- the tests use
either two or three threads.   However...

>
> I suspect the tests as written didn't really use the large
> systems.
>
> Dual:
>
> CPU vendor   : GenuineIntel
> CPU model    : Intel(R) Xeon(R) CPU           E5640  ...
> CPU stepping : ..
> TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
> Using lfence_rdtsc because you have an Intel CPU
> Will test the "lfence;rdtsc" clock.
> Now test failed  : worst error 48 with 48647928 samples

...that's not so good.  The "now test" is quite simple and really
shouldn't fail unless lfence doesn't work or the TSCs are out of sync.

Can you run now_test_all_pairs.sh from the same repository?  If you
have two packages that are a little out of sync, that should show it.
(It'll take a minute or so.)

I was thinking about how the BIOS or the OS would go about syncing the
TSCs on different CPUs and it's not so obvious.  The problem is that
AFAICT you can't add an offset to a TSC; you have to reprogram the
whole thing.  That means that the time it takes for the wrmsr to
finish shows up as a somewhat unknown error.  If you're off by, say,
70 cycles, the now test will catch it if it ends up on the right CPUs.
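
As a rough illustration of why a 70-cycle offset could be visible, and why it matters which CPUs the threads land on, here is a toy model. It is not the real now test. It assumes one thread publishes a timestamp, a second thread on another CPU reads its own TSC some number of cycles later, and the test fails if the second reading is smaller. The function names and latency values are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Toy model only, not code from evil-clock-test.cc.
// latencies:     true elapsed cycles between the writer's and reader's reads
// reader_offset: reader's TSC minus writer's TSC (negative = reader is behind)
// Returns the smallest apparent elapsed time over all samples.
std::int64_t now_margin(const std::vector<std::int64_t>& latencies,
                        std::int64_t reader_offset)
{
    std::int64_t margin = std::numeric_limits<std::int64_t>::max();
    for (std::int64_t latency : latencies)
        margin = std::min(margin, latency + reader_offset);
    return margin;
}

// The modelled test passes if time never appears to go backwards.
bool now_test_passes(const std::vector<std::int64_t>& latencies,
                     std::int64_t reader_offset)
{
    return !latencies.empty() && now_margin(latencies, reader_offset) >= 0;
}
```

With example latencies of 68, 90 and 150 cycles, an offset of -70 gives a margin of -2 and the modelled test fails, while an offset of +70 gives a margin of 138 and it passes. That is one possible reading of "if it ends up on the right CPUs": under this model the skew is only caught when the reading thread sits on the CPU whose TSC is behind.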

FWIW, I can't reproduce this on a dual-package Xeon E5520.

--Andy


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 19:32 [RFT] Please test rdtsc on various x86-64 hardware (app included) Andrew Lutomirski
                   ` (3 preceding siblings ...)
  2011-04-19  0:49 ` Mihai Donțu
@ 2011-04-19  8:10 ` Frank Kingswood
  2011-04-22 14:33 ` Jan Ceuleers
  5 siblings, 0 replies; 15+ messages in thread
From: Frank Kingswood @ 2011-04-19  8:10 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-kernel, Ingo Molnar, Linus Torvalds, Andi Kleen, x86

On 18/04/11 20:32, Andrew Lutomirski wrote:
> Hi all-
>
> I'd appreciate some help testing rdtsc's ordering wrt memory on
> various hardware.  You can download evil-clock-test code at:
>
> https://gitorious.org/linux-test-utils/linux-clock-tests/blobs/raw/master/evil-clock-test.cc

A low power core2 server:

CPU vendor   : GenuineIntel
CPU model    : Intel(R) Core(TM)2 Duo CPU     T8100  @ 2.10GHz
CPU stepping : 6
TSC flags    : tsc constant_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 84 with 39213496 samples
Load3 test passed: margin 20947164 with 15 samples
Load test passed : margin 52 with 12466437 samples
Store test failed as expected: worst error 1397 with 11151528 samples

That margin number seemed a bit high, rerunning gives:

CPU vendor   : GenuineIntel
CPU model    : Intel(R) Core(TM)2 Duo CPU     T8100  @ 2.10GHz
CPU stepping : 6
TSC flags    : tsc constant_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 84 with 39201440 samples
Load3 test passed: margin 7991 with 15 samples
Load test passed : margin 52 with 12482714 samples
Store test failed as expected: worst error 2457 with 11129294 samples

Frank



* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-18 19:32 [RFT] Please test rdtsc on various x86-64 hardware (app included) Andrew Lutomirski
                   ` (4 preceding siblings ...)
  2011-04-19  8:10 ` Frank Kingswood
@ 2011-04-22 14:33 ` Jan Ceuleers
  5 siblings, 0 replies; 15+ messages in thread
From: Jan Ceuleers @ 2011-04-22 14:33 UTC (permalink / raw)
  To: Andrew Lutomirski; +Cc: linux-kernel, x86

On 18/04/11 21:32, Andrew Lutomirski wrote:
> Hi all-
>
> I'd appreciate some help testing rdtsc's ordering wrt memory on
> various hardware.

$ ./evil-clock-test
CPU vendor   : GenuineIntel
CPU model    : Intel(R) Core(TM) i3 CPU         540  @ 3.07GHz
CPU stepping : 2
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using lfence_rdtsc because you have an Intel CPU
Will test the "lfence;rdtsc" clock.
Now test passed  : margin 96 with 64336568 samples
Load3 test passed: margin 72 with 7731224 samples
Load test passed : margin 88 with 13575597 samples
Store test failed as expected: worst error 892 with 9605781 samples

Jan


* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
  2011-04-19  5:06 George Spelvin
@ 2011-04-19  6:09 ` George Spelvin
  0 siblings, 0 replies; 15+ messages in thread
From: George Spelvin @ 2011-04-19  6:09 UTC (permalink / raw)
  To: luto; +Cc: linux-kernel, linux

A bit more info on those failing tests (common headers snipped):

$ ./evil-clock-test -v -v -v
CPU vendor   : AuthenticAMD
CPU model    : AMD Phenom(tm) 9850 Quad-Core Processor
CPU stepping : 3
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using mfence_rdtsc because you don't have an Intel CPU
Will test the "mfence;rdtsc" clock.
Running now test...
Now test passed  : margin 168 with 15176312 samples
Running load3 test...
Load3 test got no data
Running load test...
  Failed 5/615154 times with worst error 4
  Failed 3/618034 times with worst error 4
  Failed 4/618552 times with worst error 7
  Failed 2/618457 times with worst error 7
  Failed 1/618518 times with worst error 4
  Failed 2/616455 times with worst error 4
  Failed 1/617990 times with worst error 4
Load test failed : worst error 7 with 4323160 samples
Running store test...
  Failed 3/614883 times with worst error 4
  Failed 3/616023 times with worst error 4
  Failed 6/613633 times with worst error 4
Store test failed: worst error 4 with 1844539 samples

Running now test...
Now test passed  : margin 168 with 15125496 samples
Running load3 test...
Load3 test got no data
Running load test...
  Failed 5/603344 times with worst error 4
  Failed 3/609628 times with worst error 4
Load test failed : worst error 4 with 1212972 samples
Running store test...
  Failed 2/610838 times with worst error 3
  Failed 4/611400 times with worst error 4
Store test failed: worst error 4 with 1222238 samples

Running now test...
Now test passed  : margin 169 with 15136216 samples
Running load3 test...
Load3 test got no data
Running load test...
Load test got no data
Running store test...
  Failed 1/617293 times with worst error 4
Store test failed: worst error 4 with 617293 samples

Running now test...
Now test passed  : margin 170 with 15128832 samples
Running load3 test...
Load3 test got no data
Running load test...
Load test got no data
Running store test...
Store test got no data

Running now test...
Now test passed  : margin 170 with 15039712 samples
Running load3 test...
Load3 test got no data
Running load test...
  Failed 4/617259 times with worst error 4
  Failed 4/621375 times with worst error 4
Load test failed : worst error 4 with 1238634 samples
Running store test...
Store test got no data

Now test passed  : margin 169 with 14977856 samples
Running load3 test...
Load3 test got no data
Running load test...
  Passed with margin 2 (613724 samples)
  Failed 7/610443 times with worst error 30
Load test failed : worst error 30 with 1224167 samples
Running store test...
Store test got no data

Running now test...
Now test passed  : margin 170 with 15131920 samples
Running load3 test...
Load3 test got no data
Running load test...
  Failed 3/607735 times with worst error 4
  Failed 4/614570 times with worst error 4
  Failed 1/614801 times with worst error 4
  Failed 5/614483 times with worst error 4
  Failed 4/614106 times with worst error 4
  Failed 1/614649 times with worst error 4
  Failed 5/614351 times with worst error 8
  Failed 4/614453 times with worst error 4
  Failed 4/614643 times with worst error 4
Load test failed : worst error 8 with 5523791 samples
Running store test...
  Failed 5/613788 times with worst error 4
  Failed 1/614285 times with worst error 4
Store test failed: worst error 4 with 1228073 samples



* Re: [RFT] Please test rdtsc on various x86-64 hardware (app included)
@ 2011-04-19  5:06 George Spelvin
  2011-04-19  6:09 ` George Spelvin
  0 siblings, 1 reply; 15+ messages in thread
From: George Spelvin @ 2011-04-19  5:06 UTC (permalink / raw)
  To: luto; +Cc: linux, linux-kernel

I'm not doing so well:

$ ./evil-clock-test
CPU vendor   : AuthenticAMD
CPU model    : AMD Phenom(tm) 9850 Quad-Core Processor
CPU stepping : 3
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using mfence_rdtsc because you don't have an Intel CPU
Will test the "mfence;rdtsc" clock.
Now test passed  : margin 168 with 15744696 samples
Load3 test got no data
Load test got no data
Store test failed: worst error 10 with 617583 samples

$ ./evil-clock-test
CPU vendor   : AuthenticAMD
CPU model    : AMD Phenom(tm) 9850 Quad-Core Processor
CPU stepping : 3
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using mfence_rdtsc because you don't have an Intel CPU
Will test the "mfence;rdtsc" clock.
Now test passed  : margin 168 with 15666016 samples
Load3 test got no data
Load test got no data
Store test got no data

$ ./evil-clock-test
CPU vendor   : AuthenticAMD
CPU model    : AMD Phenom(tm) 9850 Quad-Core Processor
CPU stepping : 3
TSC flags    : tsc rdtscp constant_tsc nonstop_tsc
Using mfence_rdtsc because you don't have an Intel CPU
Will test the "mfence;rdtsc" clock.
Now test passed  : margin 168 with 15981720 samples
Load3 test got no data
Load test got no data
Store test got no data

Compiled from git revision 8f7d7a62, "Add script to test all pairs."
Running as root makes no difference.

Other programs:

$ ./now_test_all_pairs.sh
0,1: Now test passed  : margin 168 with 15007304 samples
0,2: Now test passed  : margin 168 with 15288360 samples
0,3: Now test passed  : margin 168 with 15539776 samples
1,0: Now test passed  : margin 168 with 14890744 samples
1,2: Now test passed  : margin 168 with 15266128 samples
1,3: Now test passed  : margin 168 with 15322480 samples
2,0: Now test passed  : margin 168 with 14978320 samples
2,1: Now test passed  : margin 168 with 15109872 samples
2,3: Now test passed  : margin 168 with 15207616 samples
3,0: Now test passed  : margin 168 with 15043208 samples
3,1: Now test passed  : margin 168 with 14686352 samples
3,2: Now test passed  : margin 168 with 14945840 samples

$ ./timing_test
Usage: time <Miters> <mode> [POSIX clock id]

Clocks are:
  0 (CLOCK_REALTIME)  resolution = 0.000000001
  1 (CLOCK_MONOTONIC)  resolution = 0.000000001
  5 (CLOCK_REALTIME_COARSE)  resolution = 0.010000000
  6 (CLOCK_MONOTONIC_COARSE)  resolution = 0.010000000

System is 2.6.38.2 with 8 GB ECC RAM.  ATI RD790/SB600 chipset.

Hope this helps!
