linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/9] sched_clock fixes
@ 2017-04-21 14:57 Peter Zijlstra
  2017-04-21 14:57 ` [PATCH 1/9] x86/tsc: Provide tsc=unstable boot parameter Peter Zijlstra
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Peter Zijlstra @ 2017-04-21 14:57 UTC
  To: tglx, mingo
  Cc: linux-kernel, ville.syrjala, daniel.lezcano, rafael.j.wysocki,
	marta.lofstedt, martin.peres, pasha.tatashin, peterz,
	daniel.vetter

Hi,

These patches were inspired by (and hopefully fix) two independent bug reports on
Core2 machines.

I never could quite reproduce one, but my Core2 machine no longer switches to
stable sched_clock and therefore no longer tickles the problematic stable ->
unstable transition either.

Before I dug up my Core2 machine, I tried emulating TSC wreckage by poking
random values into the TSC MSR from userspace. Behaviour in that case is
improved as well.

People have to realize that if we manage to boot with TSC 'stable' (both
sched_clock and clocksource) and we later find out we were mistaken (we observe
a TSC wobble) the clocks that are derived from it _will_ have had an observable
hiccup. This is fundamentally unfixable.

If you own a machine where the BIOS tries to hide SMI latencies by rewinding
TSC (yes, this is a thing), the very best we can do is mark TSC unstable with a
boot parameter.

For example, this is me writing a stupid value into the TSC:

[   46.745082] random: crng init done
[18443029775.010069] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[18443029775.023141] clocksource:                       'hpet' wd_now: 3ebec538 wd_last: 3e486ec9 mask: ffffffff
[18443029775.034214] clocksource:                       'tsc' cs_now: 5025acce9 cs_last: 24dc3bd21c88ee mask: ffffffffffffffff
[18443029775.046651] tsc: Marking TSC unstable due to clocksource watchdog
[18443029775.054211] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[18443029775.064434] sched_clock: Marking unstable (70569005835, -17833788)<-(-3714295689546517, -2965802361)
[   70.573700] clocksource: Switched to clocksource hpet
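
The watchdog numbers in the log above can be checked by hand. A minimal sketch in Python, assuming the usual "subtract, then mask to the counter width" delta; masked_delta() is a name made up for this illustration and not a kernel function, and the readings are copied from the log:

```python
# Illustrative only: masked_delta() is a hypothetical helper, not a kernel API.
# The counter readings and masks are copied from the watchdog messages above.

def masked_delta(now: int, last: int, mask: int) -> int:
    """Cycles elapsed between two counter reads, modulo the counter width."""
    return (now - last) & mask


HPET_MASK = 0xffffffff
TSC_MASK = 0xffffffffffffffff

# 'hpet' wd_now / wd_last
wd_delta = masked_delta(0x3ebec538, 0x3e486ec9, HPET_MASK)

# 'tsc' cs_now / cs_last
cs_delta = masked_delta(0x5025acce9, 0x24dc3bd21c88ee, TSC_MASK)

# cs_now is smaller than cs_last, so the unsigned delta wraps around.
tsc_went_backwards = 0x5025acce9 < 0x24dc3bd21c88ee

print(f"hpet delta: {wd_delta} cycles")
print(f"tsc  delta: {cs_delta} cycles (went backwards: {tsc_went_backwards})")
```

The HPET advanced by a modest number of cycles while the TSC reading went backwards, so its masked delta ends up close to 2^64. That is the kind of disagreement the "skew is too large" message reports.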

With some trace_printk()s (not included) I could tell that the wobble
occurred at 69.965474. The clock now resumes where it 'should' have been.

But an unfortunate scheduling event could have resulted in one task
having seen a runtime of ~584 years with 'obvious' effects. Similar
jumps can also be observed from userspace GTOD usage.
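
For anyone wondering where ~584 years comes from: it is roughly what a small negative delta looks like when read as an unsigned 64-bit nanosecond count. A back-of-the-envelope sketch in Python, where the 1 ms backwards step is a made-up example and the helper names are hypothetical:

```python
# Back-of-the-envelope check of the "~584 years" figure.
# Assumption: the runtime delta is an unsigned 64-bit nanosecond count.

NSEC_PER_SEC = 10**9
SEC_PER_YEAR = 365.25 * 24 * 3600
U64_SPAN = 2**64


def as_u64(ns: int) -> int:
    """Reinterpret a (possibly negative) nanosecond delta as unsigned 64-bit."""
    return ns % U64_SPAN


def ns_to_years(ns: int) -> float:
    """Convert a nanosecond count to years of 365.25 days."""
    return ns / NSEC_PER_SEC / SEC_PER_YEAR


# Hypothetical case: the clock steps back by 1 ms between two reads.
bogus_runtime_ns = as_u64(-1_000_000)
bogus_runtime_years = ns_to_years(bogus_runtime_ns)

# Third value from the "Marking unstable" log line, assumed here to be in ns.
wrapped_log_value_s = as_u64(-3714295689546517) / NSEC_PER_SEC

print(f"bogus runtime: ~{bogus_runtime_years:.1f} years")
print(f"wrapped log value: ~{wrapped_log_value_s:.2f} s")
```

The wrapped log value comes out within a few seconds of the [18443029775.x] timestamps in the log, which is at least consistent with a negative nanosecond value showing up as a huge unsigned one. This is a reading of the numbers and not something the log states.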


Thread overview: 13+ messages
2017-04-21 14:57 [PATCH 0/9] sched_clock fixes Peter Zijlstra
2017-04-21 14:57 ` [PATCH 1/9] x86/tsc: Provide tsc=unstable boot parameter Peter Zijlstra
2017-04-21 14:57 ` [PATCH 2/9] x86,tsc: Feed refined TSC calibration into sched_clock Peter Zijlstra
2017-04-21 14:57 ` [PATCH 3/9] sched/clock: Initialize all per-cpu state before switching (back) to unstable Peter Zijlstra
2017-04-21 14:58 ` [PATCH 4/9] x86/tsc,sched/clock,clocksource: Use clocksource watchdog to provide stable sync points Peter Zijlstra
2017-04-21 14:58 ` [PATCH 5/9] sched/clock: Remove unused argument to sched_clock_idle_wakeup_event() Peter Zijlstra
2017-04-21 14:58 ` [PATCH 6/9] sched/clock: Remove watchdog touching Peter Zijlstra
2017-04-21 15:28   ` Peter Zijlstra
2017-04-21 14:58 ` [PATCH 8/9] sched/clock: Use late_initcall() instead of sched_init_smp() Peter Zijlstra
2017-04-21 14:58 ` [PATCH 9/9] sched/clock: Print a warning recommending tsc=unstable Peter Zijlstra
2017-04-25  9:31 ` [PATCH 0/9] sched_clock fixes Lofstedt, Marta
2017-04-25 13:44   ` Peter Zijlstra
2017-04-26  6:41     ` Lofstedt, Marta
