* [RFC] Exposing TSC "reliability" to userland
@ 2010-05-03 20:21 Dan Magenheimer
  2010-05-04 23:16 ` Venkatesh Pallipadi
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Magenheimer @ 2010-05-03 20:21 UTC (permalink / raw)
  To: linux-kernel, x86; +Cc: venki, mingo

In a patch posted late last year by Venki: 

http://lkml.org/lkml/2009/12/17/360

it was noted that some systems that specify the "Invariant TSC"
bit in CPUID (on recent processors) are sadly not guaranteed to
have synchronized TSCs.  As a result, Ingo's check_tsc_warp() is
executed; if the warp test passes, the kernel uses TSC
as clocksource and, if it doesn't pass, the kernel marks
the TSC as unstable and chooses a different clocksource.

Whether the kernel deems TSC to be reliable or not is a very
useful piece of information to userland, e.g. to certain
enterprise apps such as the Oracle DB, some JVMs, etc.  If
TSC IS reliable, rdtsc can be used by many of these
enterprise applications in many situations in place of a
gettimeofday call.  Rdtsc can be much faster even than
a vsyscall, and it is certainly much faster when,
for one reason or another, vsyscall is not enabled.
This can make a huge performance difference in real
benchmarks when timestamps are frequently taken (a 10%
benchmark performance improvement was measured using
rdtsc instead of the gettimeofday syscall).
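
To make the comparison concrete, here is a minimal sketch of the two
timestamp paths an application would choose between.  It assumes an
x86 machine and a GCC or Clang toolchain that provides __rdtsc() in
<x86intrin.h>; the helper names read_tsc() and read_gtod_us() are
made up for illustration.  The raw TSC value is in cycles, not time,
and is only comparable across CPUs if the TSCs are synchronized.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <x86intrin.h>   /* __rdtsc(); assumption: x86 with GCC or Clang */

/* Hypothetical helper: read the raw cycle counter.
 * The instruction is not serializing, so its ordering relative to
 * nearby instructions is not guaranteed. */
static inline uint64_t read_tsc(void)
{
    return __rdtsc();
}

/* Hypothetical helper: wall-clock time in microseconds via gettimeofday(). */
static inline uint64_t read_gtod_us(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (uint64_t)tv.tv_sec * 1000000u + (uint64_t)tv.tv_usec;
}
```

An application would take read_tsc() deltas on its hot path and fall
back to read_gtod_us() when the TSC cannot be trusted.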

Running a warp test in userland is not nearly as accurate
as the warp test run by the kernel.  So it makes sense to expose
the results of the kernel warp test to userland, maybe
through /sys/devices/system/clocksource/tsc_reliable

Comments?

* Re: [RFC] Exposing TSC "reliability" to userland
  2010-05-03 20:21 [RFC] Exposing TSC "reliability" to userland Dan Magenheimer
@ 2010-05-04 23:16 ` Venkatesh Pallipadi
  2010-05-04 23:51   ` Dan Magenheimer
  0 siblings, 1 reply; 3+ messages in thread
From: Venkatesh Pallipadi @ 2010-05-04 23:16 UTC (permalink / raw)
  To: Dan Magenheimer; +Cc: linux-kernel, x86, mingo

On Mon, May 3, 2010 at 1:21 PM, Dan Magenheimer
<dan.magenheimer@oracle.com> wrote:
>
> In a patch posted late last year by Venki:
>
> http://lkml.org/lkml/2009/12/17/360
>
> it was noted that some systems that specify the "Invariant TSC"
> bit in CPUID (on recent processors) are sadly not guaranteed to
> have synchronized TSCs.  As a result, Ingo's check_tsc_warp() is
> executed; if the warp test passes, the kernel uses TSC
> as clocksource and, if it doesn't pass, the kernel marks
> the TSC as unstable and chooses a different clocksource.
>
> Whether the kernel deems TSC to be reliable or not is a very
> useful piece of information to userland, e.g. to certain
> enterprise apps such as the Oracle DB, some JVMs, etc.  If
> TSC IS reliable, rdtsc can be used by many of these
> enterprise applications in many situations in place of a
> gettimeofday call.  Rdtsc can be much faster even than
> a vsyscall and it is certainly much much faster when,
> for one reason or another, vsyscall is not enabled.
> This can make a huge performance difference in real
> benchmarks when timestamps are frequently taken (10%
> benchmark performance improvement was measured using
> rdtsc vs gettimeofday syscall).
>
> Running a warp test in userland is not nearly as accurate
> as the warp test run by the kernel.  So it makes sense to expose
> the results of the kernel warp test to userland, maybe
> through /sys/devices/system/clocksource/tsc_reliable
>
> Comments?

[ Sorry if this is a duplicate. I had messed up my mail client format setting ]

One option is to remove tsc from
/sys/devices/system/clocksource/clocksource*/available_clocksource
when it is detected as unstable.

That should already be happening with NOHZ or HIGHRES selected. But
it should be simple to add some code to do this always.
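
A rough sketch of what the userland side of that check might look
like, assuming the list is a single line of whitespace-separated
names and that the instance directory is clocksource0 (the thread
only writes clocksource*).  Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if `name` appears as a whole whitespace-separated
 * token in `list`, so "tsc" does not match a longer name. */
bool clocksource_listed(const char *list, const char *name)
{
    size_t n = strlen(name);
    const char *p = list;

    if (n == 0)
        return false;
    while (*p) {
        p += strspn(p, " \t\n");              /* skip separators */
        size_t len = strcspn(p, " \t\n");     /* length of this token */
        if (len == n && strncmp(p, name, n) == 0)
            return true;
        p += len;
    }
    return false;
}

/* Assumption: clocksource0 is the instance to inspect. */
bool tsc_still_available(void)
{
    char buf[256] = "";
    FILE *f = fopen("/sys/devices/system/clocksource/clocksource0/"
                    "available_clocksource", "r");

    if (!f)
        return false;
    if (!fgets(buf, sizeof buf, f))
        buf[0] = '\0';
    fclose(f);
    return clocksource_listed(buf, "tsc");
}
```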

Would that work?

Thanks,
Venki


* RE: [RFC] Exposing TSC "reliability" to userland
  2010-05-04 23:16 ` Venkatesh Pallipadi
@ 2010-05-04 23:51   ` Dan Magenheimer
  0 siblings, 0 replies; 3+ messages in thread
From: Dan Magenheimer @ 2010-05-04 23:51 UTC (permalink / raw)
  To: Venkatesh Pallipadi; +Cc: linux-kernel, x86, mingo

> > Running a warp test in userland is not nearly as accurate
> > as the warp test run by the kernel.  So it makes sense to expose
> > the results of the kernel warp test to userland, maybe
> > through /sys/devices/system/clocksource/tsc_reliable
> >
> > Comments?
> 
> [ Sorry if this is a duplicate. I had messed up my mail client format
> setting ]
> 
> One option is to remove tsc from
> /sys/devices/system/clocksource/clocksource*/available_clocksource
> when it is detected as unstable.
> 
> That should already be happening with NOHZ or HIGHRES selected. But
> it should be simple to add some code to do this always.
> 
> Would that work?

Hi Venki --

In some offlist discussion, a similar solution was suggested:
If /sys/devices/system/clocksource/clocksource*/current_clocksource
is "tsc" AND the "Invariant TSC" CPUID bit is set, then "reliable TSC"
can be assumed.
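
As a sketch only, that two-part test could be written as follows.  It
assumes x86 with GCC or Clang (for __get_cpuid() in <cpuid.h>) and
that clocksource0 is the instance to read; the function names are
invented for this example.  The Invariant TSC flag is bit 8 of EDX in
CPUID leaf 0x80000007.

```c
#include <cpuid.h>     /* __get_cpuid(); assumption: GCC/Clang on x86 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* CPUID.80000007H:EDX bit 8 is the "Invariant TSC" flag. */
bool edx_has_invariant_tsc(unsigned int edx)
{
    return (edx >> 8) & 1u;
}

bool cpu_has_invariant_tsc(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 if the leaf is not supported. */
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return false;
    return edx_has_invariant_tsc(edx);
}

/* True if the sysfs value, up to an optional newline, is exactly "tsc". */
bool name_is_tsc(const char *value)
{
    size_t len = strcspn(value, "\n");

    return len == 3 && strncmp(value, "tsc", 3) == 0;
}

/* Assumption: clocksource0 is the instance to inspect. */
bool current_clocksource_is_tsc(void)
{
    char buf[64] = "";
    FILE *f = fopen("/sys/devices/system/clocksource/clocksource0/"
                    "current_clocksource", "r");

    if (!f)
        return false;
    if (!fgets(buf, sizeof buf, f))
        buf[0] = '\0';
    fclose(f);
    return name_is_tsc(buf);
}

/* The heuristic described above: both conditions must hold. */
bool tsc_probably_reliable(void)
{
    return current_clocksource_is_tsc() && cpu_has_invariant_tsc();
}
```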

BUT, exposing the information explicitly from the kernel would be
more comforting than requiring userland to reverse-engineer some
combination of kernel tests that might change over time.  If the
kernel determines TSC is reliable, that seems like it should be
good enough for userland.

AND it was also pointed out that userland usage of TSC is almost
useless unless some reliable, reasonably precise frequency is also
known.  A possible solution to this is to expose:

/sys/devices/system/clocksource/clocksource*/clocksource_mult and
/sys/devices/system/clocksource/clocksource*/clocksource_shift

(or some other more TSC-specific name) and provide the same
mult/shift values the kernel uses for clocksource_cyc2ns().
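
For reference, the conversion is a fixed-point multiply:
ns = (cycles * mult) >> shift.  A minimal userland sketch, assuming
the two values were read from files like the ones proposed above
(which do not exist today); tsc_cycles_to_ns() and hz_to_mult() are
illustrative names, not kernel interfaces.

```c
#include <stdint.h>

/* Convert a cycle count to nanoseconds with a mult/shift pair.
 * The 64-bit product can overflow for large counts, so pass a
 * delta between two TSC reads rather than an absolute value. */
uint64_t tsc_cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    return (cycles * mult) >> shift;
}

/* Derive a mult for a given frequency and shift, rounding to
 * nearest: mult = (10^9 << shift) / hz.  hz must be nonzero. */
uint32_t hz_to_mult(uint64_t hz, uint32_t shift)
{
    uint64_t tmp = (uint64_t)1000000000 << shift;

    tmp += hz / 2;
    return (uint32_t)(tmp / hz);
}
```

With a 2 GHz counter and shift = 10, hz_to_mult() gives 512, and a
delta of 2000 cycles converts to 1000 ns.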

By the way, excuse my ignorance, but is there ever a clocksourceN
where N is not zero?

Hope things are going well in google-land!

Thanks,
Dan
