[patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
@ 2016-12-13 13:14 Thomas Gleixner
  2016-12-13 13:14 ` [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume Thomas Gleixner
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-13 13:14 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Roland Scheidegger, Kevin Stanton, Allen Hung

Roland reported interesting TSC ADJUST register wreckage on his DELL
machine, which seems to populate that MSR with a random number generator.

Deeper investagation into fixing this wreckage unearthed another special
feature which is designed by Intel: Negative TSC adjuste values cause
interrupt storms on the TSC deadline timer. Further details in patch 2/2

Thanks

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume
  2016-12-13 13:14 [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Thomas Gleixner
@ 2016-12-13 13:14 ` Thomas Gleixner
  2016-12-13 13:22   ` Peter Zijlstra
  2016-12-15 10:52   ` [tip:x86/timers] " tip-bot for Thomas Gleixner
  2016-12-13 13:14 ` [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero Thomas Gleixner
  2016-12-13 16:34 ` [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Roland Scheidegger
  2 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-13 13:14 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Roland Scheidegger, Kevin Stanton, Allen Hung

[-- Attachment #1: x86-tsc--Validate-TSC_ADJUST-after-resume.patch --]
[-- Type: text/plain, Size: 3313 bytes --]

Some 'feature' BIOSes fiddle with the TSC_ADJUST register during
suspend/resume which renders the TSC unusable.

Add sanity checks into the resume path and restore the
original value if it was adjusted.

Reported-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/tsc.h |    4 ++--
 arch/x86/kernel/process.c  |    2 +-
 arch/x86/kernel/tsc.c      |    6 ++++++
 arch/x86/kernel/tsc_sync.c |    6 +++---
 arch/x86/power/cpu.c       |    1 +
 5 files changed, 13 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -47,12 +47,12 @@ extern int tsc_clocksource_reliable;
  */
 #ifdef CONFIG_X86_TSC
 extern bool tsc_store_and_check_tsc_adjust(void);
-extern void tsc_verify_tsc_adjust(void);
+extern void tsc_verify_tsc_adjust(bool resume);
 extern void check_tsc_sync_source(int cpu);
 extern void check_tsc_sync_target(void);
 #else
 static inline bool tsc_store_and_check_tsc_adjust(void) { return false; }
-static inline void tsc_verify_tsc_adjust(void) { }
+static inline void tsc_verify_tsc_adjust(bool resume) { }
 static inline void check_tsc_sync_source(int cpu) { }
 static inline void check_tsc_sync_target(void) { }
 #endif
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -277,7 +277,7 @@ void exit_idle(void)
 
 void arch_cpu_idle_enter(void)
 {
-	tsc_verify_tsc_adjust();
+	tsc_verify_tsc_adjust(false);
 	local_touch_nmi();
 	enter_idle();
 }
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1080,6 +1080,11 @@ static void detect_art(void)
 
 static struct clocksource clocksource_tsc;
 
+static void tsc_resume(struct clocksource *cs)
+{
+	tsc_verify_tsc_adjust(true);
+}
+
 /*
  * We used to compare the TSC to the cycle_last value in the clocksource
  * structure to avoid a nasty time-warp. This can be observed in a
@@ -1112,6 +1117,7 @@ static struct clocksource clocksource_ts
 	.flags                  = CLOCK_SOURCE_IS_CONTINUOUS |
 				  CLOCK_SOURCE_MUST_VERIFY,
 	.archdata               = { .vclock_mode = VCLOCK_TSC },
+	.resume			= tsc_resume,
 };
 
 void mark_tsc_unstable(char *reason)
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -30,7 +30,7 @@ struct tsc_adjust {
 
 static DEFINE_PER_CPU(struct tsc_adjust, tsc_adjust);
 
-void tsc_verify_tsc_adjust(void)
+void tsc_verify_tsc_adjust(bool resume)
 {
 	struct tsc_adjust *adj = this_cpu_ptr(&tsc_adjust);
 	s64 curval;
@@ -39,7 +39,7 @@ void tsc_verify_tsc_adjust(void)
 		return;
 
 	/* Rate limit the MSR check */
-	if (time_before(jiffies, adj->nextcheck))
+	if (!resume && time_before(jiffies, adj->nextcheck))
 		return;
 
 	adj->nextcheck = jiffies + HZ;
@@ -51,7 +51,7 @@ void tsc_verify_tsc_adjust(void)
 	/* Restore the original value */
 	wrmsrl(MSR_IA32_TSC_ADJUST, adj->adjusted);
 
-	if (!adj->warned) {
+	if (!adj->warned || resume) {
 		pr_warn(FW_BUG "TSC ADJUST differs: CPU%u %lld --> %lld. Restoring\n",
 			smp_processor_id(), adj->adjusted, curval);
 		adj->warned = true;
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -256,6 +256,7 @@ static void notrace __restore_processor_
 	mtrr_bp_restore();
 	perf_restore_debug_store();
 	msr_restore_context(ctxt);
+	tsc_verify_tsc_adjust(true);
 }
 
 /* Needed by apm.c */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume
  2016-12-13 13:14 ` [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume Thomas Gleixner
@ 2016-12-13 13:22   ` Peter Zijlstra
  2016-12-13 13:23     ` Thomas Gleixner
  2016-12-15 10:52   ` [tip:x86/timers] " tip-bot for Thomas Gleixner
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2016-12-13 13:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Borislav Petkov, Bruce Schlobohm, Roland Scheidegger,
	Kevin Stanton, Allen Hung

On Tue, Dec 13, 2016 at 01:14:17PM -0000, Thomas Gleixner wrote:
> --- a/arch/x86/power/cpu.c
> +++ b/arch/x86/power/cpu.c
> @@ -256,6 +256,7 @@ static void notrace __restore_processor_
>  	mtrr_bp_restore();
>  	perf_restore_debug_store();
>  	msr_restore_context(ctxt);
> +	tsc_verify_tsc_adjust(true);
>  }

Should we do that sooner, as in before calling
restore_sched_clock_state() ? Otherwise we recompute the sched_clock
deltas vs the wrecked TSC and then fix it up through the ADJUST,
wrecking our sched clock again.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume
  2016-12-13 13:22   ` Peter Zijlstra
@ 2016-12-13 13:23     ` Thomas Gleixner
  0 siblings, 0 replies; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-13 13:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, x86, Borislav Petkov, Bruce Schlobohm, Roland Scheidegger,
	Kevin Stanton, Allen Hung

On Tue, 13 Dec 2016, Peter Zijlstra wrote:

> On Tue, Dec 13, 2016 at 01:14:17PM -0000, Thomas Gleixner wrote:
> > --- a/arch/x86/power/cpu.c
> > +++ b/arch/x86/power/cpu.c
> > @@ -256,6 +256,7 @@ static void notrace __restore_processor_
> >  	mtrr_bp_restore();
> >  	perf_restore_debug_store();
> >  	msr_restore_context(ctxt);
> > +	tsc_verify_tsc_adjust(true);
> >  }
> 
> Should we do that sooner, as in before calling
> restore_sched_clock_state() ? Otherwise we recompute the sched_clock
> deltas vs the wrecked TSC and then fix it up through the ADJUST,
> wrecking our sched clock again.

Ah. Indeed. Will fix. That needs some thought on the secondary CPUs as
well...

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [tip:x86/timers] x86/tsc: Validate TSC_ADJUST after resume
  2016-12-13 13:14 ` [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume Thomas Gleixner
  2016-12-13 13:22   ` Peter Zijlstra
@ 2016-12-15 10:52   ` tip-bot for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-12-15 10:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, bruce.schlobohm, hpa, tglx, bp, rscheidegger_lists,
	mingo, kevin.b.stanton, allen_hung, peterz

Commit-ID:  6a369583178d0b89c2c3919c4456ee22fee0f249
Gitweb:     http://git.kernel.org/tip/6a369583178d0b89c2c3919c4456ee22fee0f249
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Dec 2016 13:14:17 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 15 Dec 2016 11:44:29 +0100

x86/tsc: Validate TSC_ADJUST after resume

Some 'feature' BIOSes fiddle with the TSC_ADJUST register during
suspend/resume which renders the TSC unusable.

Add sanity checks into the resume path and restore the
original value if it was adjusted.

Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Allen Hung <allen_hung@dell.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161213131211.317654500@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/tsc.h | 4 ++--
 arch/x86/kernel/process.c  | 2 +-
 arch/x86/kernel/tsc.c      | 6 ++++++
 arch/x86/kernel/tsc_sync.c | 6 +++---
 arch/x86/power/cpu.c       | 1 +
 5 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index c054eaa..372ad0c 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -47,12 +47,12 @@ extern int tsc_clocksource_reliable;
  */
 #ifdef CONFIG_X86_TSC
 extern bool tsc_store_and_check_tsc_adjust(void);
-extern void tsc_verify_tsc_adjust(void);
+extern void tsc_verify_tsc_adjust(bool resume);
 extern void check_tsc_sync_source(int cpu);
 extern void check_tsc_sync_target(void);
 #else
 static inline bool tsc_store_and_check_tsc_adjust(void) { return false; }
-static inline void tsc_verify_tsc_adjust(void) { }
+static inline void tsc_verify_tsc_adjust(bool resume) { }
 static inline void check_tsc_sync_source(int cpu) { }
 static inline void check_tsc_sync_target(void) { }
 #endif
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 4fe5dc8..a67e0f0 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -277,7 +277,7 @@ void exit_idle(void)
 
 void arch_cpu_idle_enter(void)
 {
-	tsc_verify_tsc_adjust();
+	tsc_verify_tsc_adjust(false);
 	local_touch_nmi();
 	enter_idle();
 }
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 2bb8de4..bfb541a 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1080,6 +1080,11 @@ static void detect_art(void)
 
 static struct clocksource clocksource_tsc;
 
+static void tsc_resume(struct clocksource *cs)
+{
+	tsc_verify_tsc_adjust(true);
+}
+
 /*
  * We used to compare the TSC to the cycle_last value in the clocksource
  * structure to avoid a nasty time-warp. This can be observed in a
@@ -1112,6 +1117,7 @@ static struct clocksource clocksource_tsc = {
 	.flags                  = CLOCK_SOURCE_IS_CONTINUOUS |
 				  CLOCK_SOURCE_MUST_VERIFY,
 	.archdata               = { .vclock_mode = VCLOCK_TSC },
+	.resume			= tsc_resume,
 };
 
 void mark_tsc_unstable(char *reason)
diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index a75f696..94f2ce5 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -30,7 +30,7 @@ struct tsc_adjust {
 
 static DEFINE_PER_CPU(struct tsc_adjust, tsc_adjust);
 
-void tsc_verify_tsc_adjust(void)
+void tsc_verify_tsc_adjust(bool resume)
 {
 	struct tsc_adjust *adj = this_cpu_ptr(&tsc_adjust);
 	s64 curval;
@@ -39,7 +39,7 @@ void tsc_verify_tsc_adjust(void)
 		return;
 
 	/* Rate limit the MSR check */
-	if (time_before(jiffies, adj->nextcheck))
+	if (!resume && time_before(jiffies, adj->nextcheck))
 		return;
 
 	adj->nextcheck = jiffies + HZ;
@@ -51,7 +51,7 @@ void tsc_verify_tsc_adjust(void)
 	/* Restore the original value */
 	wrmsrl(MSR_IA32_TSC_ADJUST, adj->adjusted);
 
-	if (!adj->warned) {
+	if (!adj->warned || resume) {
 		pr_warn(FW_BUG "TSC ADJUST differs: CPU%u %lld --> %lld. Restoring\n",
 			smp_processor_id(), adj->adjusted, curval);
 		adj->warned = true;
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 53cace2..66ade16 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -252,6 +252,7 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
 	fix_processor_context();
 
 	do_fpu_end();
+	tsc_verify_tsc_adjust(true);
 	x86_platform.restore_sched_clock_state();
 	mtrr_bp_restore();
 	perf_restore_debug_store();

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero
  2016-12-13 13:14 [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Thomas Gleixner
  2016-12-13 13:14 ` [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume Thomas Gleixner
@ 2016-12-13 13:14 ` Thomas Gleixner
  2016-12-13 13:43   ` Peter Zijlstra
                     ` (2 more replies)
  2016-12-13 16:34 ` [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Roland Scheidegger
  2 siblings, 3 replies; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-13 13:14 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Roland Scheidegger, Kevin Stanton, Allen Hung

[-- Attachment #1: x86-tsc--Force-TSC_ADJUST-register-to-zero.patch --]
[-- Type: text/plain, Size: 7452 bytes --]

Roland reported that his DELL T5810 sports a value add BIOS which
completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
random negative TSC_ADJUST values, different on all CPUs. That renders the
TSC useless because the sycnchronization check fails.

Roland tested the new TSC_ADJUST mechanism. While it manages to readjust
the TSCs he needs to disable the TSC deadline timer, otherwise the machine
just stops booting.

Deeper investigation unearthed that the TSC deadline timer is sensitive to
the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an
interrupt storm caused by the TSC deadline timer.

This does not make any sense and it's hard to imagine what kind of hardware
wreckage is behind that misfeature, but it's reliably reproducible on other
systems which have TSC_ADJUST and TSC deadline timer.

While it would be understandable that a big enough negative value which
moves the resulting TSC readout into the negative space could have the
described effect, this happens even with a adjust value of -1, which keeps
the TSC readout definitely in the positive space. The compare register for
the TSC deadline timer is set to a positive value larger than the TSC, but
despite not having reached the deadline the interrupt is raised
immediately. If this happens on the boot CPU, then the machine dies
silently because this setup happens before the NMI watchdog is armed.

Further experiments showed that any other adjustment of TSC_ADJUST works as
expected as long as it stays in the positive range. The direction of the
adjustment has no influence either.

Yet another proof for the theory that timers are designed by janitors and
the underlying (obviously undocumented) mechanisms which allow BIOSes to
wreckage them are considered a feature. Well done Intel - NOT!

To address this wreckage add the following sanity measures:

- If the TSC_ADJUST value on the boot cpu is not 0, set it to 0

- If the TSC_ADJUST value on any cpu is negative, set it to 0

- Prevent the cross package synchronization mechanism from setting negative
  TSC_ADJUST values.

Reported-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/tsc.h |    4 +--
 arch/x86/kernel/tsc.c      |    2 -
 arch/x86/kernel/tsc_sync.c |   55 +++++++++++++++++++++++++++++++--------------
 3 files changed, 42 insertions(+), 19 deletions(-)

--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -46,12 +46,12 @@ extern int tsc_clocksource_reliable;
  * all CPUs/cores:
  */
 #ifdef CONFIG_X86_TSC
-extern bool tsc_store_and_check_tsc_adjust(void);
+extern bool tsc_store_and_check_tsc_adjust(bool bootcpu);
 extern void tsc_verify_tsc_adjust(bool resume);
 extern void check_tsc_sync_source(int cpu);
 extern void check_tsc_sync_target(void);
 #else
-static inline bool tsc_store_and_check_tsc_adjust(void) { return false; }
+static inline bool tsc_store_and_check_tsc_adjust(bool bootcpu) { return false; }
 static inline void tsc_verify_tsc_adjust(bool resume) { }
 static inline void check_tsc_sync_source(int cpu) { }
 static inline void check_tsc_sync_target(void) { }
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1386,7 +1386,7 @@ void __init tsc_init(void)
 	if (unsynchronized_tsc())
 		mark_tsc_unstable("TSCs unsynchronized");
 	else
-		tsc_store_and_check_tsc_adjust();
+		tsc_store_and_check_tsc_adjust(true);
 
 	check_system_tsc_reliable();
 
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -58,8 +58,33 @@ void tsc_verify_tsc_adjust(bool resume)
 	}
 }
 
+static void tsc_sanitize_first_cpu(struct tsc_adjust *cur, s64 bootval,
+				   unsigned int cpu, bool bootcpu)
+{
+	/*
+	 * First online CPU in a package stores the boot value in the
+	 * adjustment value. This value might change later via the sync
+	 * mechanism. If that fails we still can yell about boot values not
+	 * being consistent.
+	 *
+	 * On the boot cpu we just force set the ADJUST value to 0 if it's
+	 * non zero. We don't do that on non boot cpus because physical
+	 * hotplug should have set the ADJUST register to a value > 0 so
+	 * the TSC is in sync with the already running cpus.
+	 *
+	 * But we always force positive ADJUST values. Otherwise the TSC
+	 * deadline timer creates an interrupt storm. Sigh!
+	 */
+	if ((bootcpu && bootval != 0) || (!bootcpu && bootval < 0)) {
+		pr_warn("TSC ADJUST: CPU%u: %lld force to 0\n", cpu, bootval);
+		wrmsrl(MSR_IA32_TSC_ADJUST, 0);
+		bootval = 0;
+	}
+	cur->adjusted = bootval;
+}
+
 #ifndef CONFIG_SMP
-bool __init tsc_store_and_check_tsc_adjust(void)
+bool __init tsc_store_and_check_tsc_adjust(bool bootcpu)
 {
 	struct tsc_adjust *cur = this_cpu_ptr(&tsc_adjust);
 	s64 bootval;
@@ -69,9 +94,8 @@ bool __init tsc_store_and_check_tsc_adju
 
 	rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
 	cur->bootval = bootval;
-	cur->adjusted = bootval;
 	cur->nextcheck = jiffies + HZ;
-	pr_info("TSC ADJUST: Boot CPU0: %lld\n", bootval);
+	tsc_sanitize_first_cpu(cur, bootval, smp_processor_id(), bootcpu);
 	return false;
 }
 
@@ -80,7 +104,7 @@ bool __init tsc_store_and_check_tsc_adju
 /*
  * Store and check the TSC ADJUST MSR if available
  */
-bool tsc_store_and_check_tsc_adjust(void)
+bool tsc_store_and_check_tsc_adjust(bool bootcpu)
 {
 	struct tsc_adjust *ref, *cur = this_cpu_ptr(&tsc_adjust);
 	unsigned int refcpu, cpu = smp_processor_id();
@@ -98,22 +122,16 @@ bool tsc_store_and_check_tsc_adjust(void
 	/*
 	 * Check whether this CPU is the first in a package to come up. In
 	 * this case do not check the boot value against another package
-	 * because the package might have been physically hotplugged, where
-	 * TSC_ADJUST is expected to be different. When called on the boot
-	 * CPU topology_core_cpumask() might not be available yet.
+	 * because the new package might have been physically hotplugged,
+	 * where TSC_ADJUST is expected to be different. When called on the
+	 * boot CPU topology_core_cpumask() might not be available yet.
 	 */
 	mask = topology_core_cpumask(cpu);
 	refcpu = mask ? cpumask_any_but(mask, cpu) : nr_cpu_ids;
 
 	if (refcpu >= nr_cpu_ids) {
-		/*
-		 * First online CPU in a package stores the boot value in
-		 * the adjustment value. This value might change later via
-		 * the sync mechanism. If that fails we still can yell
-		 * about boot values not being consistent.
-		 */
-		cur->adjusted = bootval;
-		pr_info_once("TSC ADJUST: Boot CPU%u: %lld\n", cpu,  bootval);
+		tsc_sanitize_first_cpu(cur, bootval, smp_processor_id(),
+				       bootcpu);
 		return false;
 	}
 
@@ -366,7 +384,7 @@ void check_tsc_sync_target(void)
 	 * Store, verify and sanitize the TSC adjust register. If
 	 * successful skip the test.
 	 */
-	if (tsc_store_and_check_tsc_adjust()) {
+	if (tsc_store_and_check_tsc_adjust(false)) {
 		atomic_inc(&skip_test);
 		return;
 	}
@@ -429,8 +447,13 @@ void check_tsc_sync_target(void)
 	 * that the warp is not longer detectable when the observed warp
 	 * value is used. In the worst case the adjustment needs to go
 	 * through a 3rd run for fine tuning.
+	 *
+	 * But we must make sure that the value doesn't become negative
+	 * otherwise TSC deadline timer will create an interrupt storm.
 	 */
 	cur->adjusted += cur_max_warp;
+	if (cur->adjusted < 0)
+		cur->adjusted = 0;
 
 	pr_warn("TSC ADJUST compensate: CPU%u observed %lld warp. Adjust: %lld\n",
 		cpu, cur_max_warp, cur->adjusted);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero
  2016-12-13 13:14 ` [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero Thomas Gleixner
@ 2016-12-13 13:43   ` Peter Zijlstra
  2016-12-13 15:49     ` Thomas Gleixner
  2016-12-15 10:53   ` [tip:x86/timers] " tip-bot for Thomas Gleixner
  2016-12-16 11:46   ` [patch 2/2] " Thomas Gleixner
  2 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2016-12-13 13:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Borislav Petkov, Bruce Schlobohm, Roland Scheidegger,
	Kevin Stanton, Allen Hung

On Tue, Dec 13, 2016 at 01:14:17PM -0000, Thomas Gleixner wrote:
> @@ -429,8 +447,13 @@ void check_tsc_sync_target(void)
>  	 * that the warp is not longer detectable when the observed warp
>  	 * value is used. In the worst case the adjustment needs to go
>  	 * through a 3rd run for fine tuning.
> +	 *
> +	 * But we must make sure that the value doesn't become negative
> +	 * otherwise TSC deadline timer will create an interrupt storm.
>  	 */
>  	cur->adjusted += cur_max_warp;
> +	if (cur->adjusted < 0)
> +		cur->adjusted = 0;
>  

So depending on how horrid we want to make this we could work around
this by increasing the ADJUST of the other CPUs by the amount we're
negative.

But yes, yuck.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero
  2016-12-13 13:43   ` Peter Zijlstra
@ 2016-12-13 15:49     ` Thomas Gleixner
  0 siblings, 0 replies; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-13 15:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, x86, Borislav Petkov, Bruce Schlobohm, Roland Scheidegger,
	Kevin Stanton, Allen Hung

On Tue, 13 Dec 2016, Peter Zijlstra wrote:
> On Tue, Dec 13, 2016 at 01:14:17PM -0000, Thomas Gleixner wrote:
> > @@ -429,8 +447,13 @@ void check_tsc_sync_target(void)
> >  	 * that the warp is not longer detectable when the observed warp
> >  	 * value is used. In the worst case the adjustment needs to go
> >  	 * through a 3rd run for fine tuning.
> > +	 *
> > +	 * But we must make sure that the value doesn't become negative
> > +	 * otherwise TSC deadline timer will create an interrupt storm.
> >  	 */
> >  	cur->adjusted += cur_max_warp;
> > +	if (cur->adjusted < 0)
> > +		cur->adjusted = 0;
> >  
> 
> So depending on how horrid we want to make this we could work around
> this by increasing the ADJUST of the other CPUs by the amount we're
> negative.

That means to increase all CPUs in the package of the other CPU by that
amount ....

> But yes, yuck.

Indeed.

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [tip:x86/timers] x86/tsc: Force TSC_ADJUST register to value >= zero
  2016-12-13 13:14 ` [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero Thomas Gleixner
  2016-12-13 13:43   ` Peter Zijlstra
@ 2016-12-15 10:53   ` tip-bot for Thomas Gleixner
  2016-12-16 11:46   ` [patch 2/2] " Thomas Gleixner
  2 siblings, 0 replies; 22+ messages in thread
From: tip-bot for Thomas Gleixner @ 2016-12-15 10:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, bp, bruce.schlobohm, mingo, tglx, allen_hung,
	rscheidegger_lists, peterz, hpa, kevin.b.stanton

Commit-ID:  5bae156241e05d25171b18ee43e49f103c3f8097
Gitweb:     http://git.kernel.org/tip/5bae156241e05d25171b18ee43e49f103c3f8097
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Dec 2016 13:14:17 +0000
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 15 Dec 2016 11:44:29 +0100

x86/tsc: Force TSC_ADJUST register to value >= zero

Roland reported that his DELL T5810 sports a value add BIOS which
completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
random negative TSC_ADJUST values, different on all CPUs. That renders the
TSC useless because the sycnchronization check fails.

Roland tested the new TSC_ADJUST mechanism. While it manages to readjust
the TSCs he needs to disable the TSC deadline timer, otherwise the machine
just stops booting.

Deeper investigation unearthed that the TSC deadline timer is sensitive to
the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an
interrupt storm caused by the TSC deadline timer.

This does not make any sense and it's hard to imagine what kind of hardware
wreckage is behind that misfeature, but it's reliably reproducible on other
systems which have TSC_ADJUST and TSC deadline timer.

While it would be understandable that a big enough negative value which
moves the resulting TSC readout into the negative space could have the
described effect, this happens even with a adjust value of -1, which keeps
the TSC readout definitely in the positive space. The compare register for
the TSC deadline timer is set to a positive value larger than the TSC, but
despite not having reached the deadline the interrupt is raised
immediately. If this happens on the boot CPU, then the machine dies
silently because this setup happens before the NMI watchdog is armed.

Further experiments showed that any other adjustment of TSC_ADJUST works as
expected as long as it stays in the positive range. The direction of the
adjustment has no influence either. See the lkml link for further analysis.

Yet another proof for the theory that timers are designed by janitors and
the underlying (obviously undocumented) mechanisms which allow BIOSes to
wreckage them are considered a feature. Well done Intel - NOT!

To address this wreckage add the following sanity measures:

- If the TSC_ADJUST value on the boot cpu is not 0, set it to 0

- If the TSC_ADJUST value on any cpu is negative, set it to 0

- Prevent the cross package synchronization mechanism from setting negative
  TSC_ADJUST values.

Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Allen Hung <allen_hung@dell.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/tsc.h |  4 ++--
 arch/x86/kernel/tsc.c      |  2 +-
 arch/x86/kernel/tsc_sync.c | 55 ++++++++++++++++++++++++++++++++--------------
 3 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 372ad0c..abb1fdc 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -46,12 +46,12 @@ extern int tsc_clocksource_reliable;
  * all CPUs/cores:
  */
 #ifdef CONFIG_X86_TSC
-extern bool tsc_store_and_check_tsc_adjust(void);
+extern bool tsc_store_and_check_tsc_adjust(bool bootcpu);
 extern void tsc_verify_tsc_adjust(bool resume);
 extern void check_tsc_sync_source(int cpu);
 extern void check_tsc_sync_target(void);
 #else
-static inline bool tsc_store_and_check_tsc_adjust(void) { return false; }
+static inline bool tsc_store_and_check_tsc_adjust(bool bootcpu) { return false; }
 static inline void tsc_verify_tsc_adjust(bool resume) { }
 static inline void check_tsc_sync_source(int cpu) { }
 static inline void check_tsc_sync_target(void) { }
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index bfb541a..0aed75a 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1386,7 +1386,7 @@ void __init tsc_init(void)
 	if (unsynchronized_tsc())
 		mark_tsc_unstable("TSCs unsynchronized");
 	else
-		tsc_store_and_check_tsc_adjust();
+		tsc_store_and_check_tsc_adjust(true);
 
 	check_system_tsc_reliable();
 
diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index 94f2ce5..9151f0c 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -58,8 +58,33 @@ void tsc_verify_tsc_adjust(bool resume)
 	}
 }
 
+static void tsc_sanitize_first_cpu(struct tsc_adjust *cur, s64 bootval,
+				   unsigned int cpu, bool bootcpu)
+{
+	/*
+	 * First online CPU in a package stores the boot value in the
+	 * adjustment value. This value might change later via the sync
+	 * mechanism. If that fails we still can yell about boot values not
+	 * being consistent.
+	 *
+	 * On the boot cpu we just force set the ADJUST value to 0 if it's
+	 * non zero. We don't do that on non boot cpus because physical
+	 * hotplug should have set the ADJUST register to a value > 0 so
+	 * the TSC is in sync with the already running cpus.
+	 *
+	 * But we always force positive ADJUST values. Otherwise the TSC
+	 * deadline timer creates an interrupt storm. Sigh!
+	 */
+	if ((bootcpu && bootval != 0) || (!bootcpu && bootval < 0)) {
+		pr_warn("TSC ADJUST: CPU%u: %lld force to 0\n", cpu, bootval);
+		wrmsrl(MSR_IA32_TSC_ADJUST, 0);
+		bootval = 0;
+	}
+	cur->adjusted = bootval;
+}
+
 #ifndef CONFIG_SMP
-bool __init tsc_store_and_check_tsc_adjust(void)
+bool __init tsc_store_and_check_tsc_adjust(bool bootcpu)
 {
 	struct tsc_adjust *cur = this_cpu_ptr(&tsc_adjust);
 	s64 bootval;
@@ -69,9 +94,8 @@ bool __init tsc_store_and_check_tsc_adjust(void)
 
 	rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
 	cur->bootval = bootval;
-	cur->adjusted = bootval;
 	cur->nextcheck = jiffies + HZ;
-	pr_info("TSC ADJUST: Boot CPU0: %lld\n", bootval);
+	tsc_sanitize_first_cpu(cur, bootval, smp_processor_id(), bootcpu);
 	return false;
 }
 
@@ -80,7 +104,7 @@ bool __init tsc_store_and_check_tsc_adjust(void)
 /*
  * Store and check the TSC ADJUST MSR if available
  */
-bool tsc_store_and_check_tsc_adjust(void)
+bool tsc_store_and_check_tsc_adjust(bool bootcpu)
 {
 	struct tsc_adjust *ref, *cur = this_cpu_ptr(&tsc_adjust);
 	unsigned int refcpu, cpu = smp_processor_id();
@@ -98,22 +122,16 @@ bool tsc_store_and_check_tsc_adjust(void)
 	/*
 	 * Check whether this CPU is the first in a package to come up. In
 	 * this case do not check the boot value against another package
-	 * because the package might have been physically hotplugged, where
-	 * TSC_ADJUST is expected to be different. When called on the boot
-	 * CPU topology_core_cpumask() might not be available yet.
+	 * because the new package might have been physically hotplugged,
+	 * where TSC_ADJUST is expected to be different. When called on the
+	 * boot CPU topology_core_cpumask() might not be available yet.
 	 */
 	mask = topology_core_cpumask(cpu);
 	refcpu = mask ? cpumask_any_but(mask, cpu) : nr_cpu_ids;
 
 	if (refcpu >= nr_cpu_ids) {
-		/*
-		 * First online CPU in a package stores the boot value in
-		 * the adjustment value. This value might change later via
-		 * the sync mechanism. If that fails we still can yell
-		 * about boot values not being consistent.
-		 */
-		cur->adjusted = bootval;
-		pr_info_once("TSC ADJUST: Boot CPU%u: %lld\n", cpu,  bootval);
+		tsc_sanitize_first_cpu(cur, bootval, smp_processor_id(),
+				       bootcpu);
 		return false;
 	}
 
@@ -366,7 +384,7 @@ void check_tsc_sync_target(void)
 	 * Store, verify and sanitize the TSC adjust register. If
 	 * successful skip the test.
 	 */
-	if (tsc_store_and_check_tsc_adjust()) {
+	if (tsc_store_and_check_tsc_adjust(false)) {
 		atomic_inc(&skip_test);
 		return;
 	}
@@ -429,8 +447,13 @@ retry:
 	 * that the warp is not longer detectable when the observed warp
 	 * value is used. In the worst case the adjustment needs to go
 	 * through a 3rd run for fine tuning.
+	 *
+	 * But we must make sure that the value doesn't become negative
+	 * otherwise TSC deadline timer will create an interrupt storm.
 	 */
 	cur->adjusted += cur_max_warp;
+	if (cur->adjusted < 0)
+		cur->adjusted = 0;
 
 	pr_warn("TSC ADJUST compensate: CPU%u observed %lld warp. Adjust: %lld\n",
 		cpu, cur_max_warp, cur->adjusted);

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero
  2016-12-13 13:14 ` [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero Thomas Gleixner
  2016-12-13 13:43   ` Peter Zijlstra
  2016-12-15 10:53   ` [tip:x86/timers] " tip-bot for Thomas Gleixner
@ 2016-12-16 11:46   ` Thomas Gleixner
  2016-12-16 11:52     ` Ingo Molnar
  2016-12-16 13:33     ` Thomas Gleixner
  2 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-16 11:46 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Roland Scheidegger, Kevin Stanton, Allen Hung, stable

On Tue, 13 Dec 2016, Thomas Gleixner wrote:
> Roland reported that his DELL T5810 sports a value add BIOS which
> completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
> random negative TSC_ADJUST values, different on all CPUs. That renders the
> TSC useless because the sycnchronization check fails.

While everyone assumed that this is the usual DELL squirmware problem, I
have to say it's not.

Just got my hands on a Skylake based Lenovo S510 box and it shows the same
feature:

TSC ADJUST: CPU0: -10123656703215
    	    CPU1: -10123656796701
	    CPU2: -10123656797460
	    CPU3: -10123656798366

Which causes the TSC to be out of sync on a stock upstream kernel and the
TSC deadline timer wreckage is happening on that machine as well.

I'm pretty sure, that this well thought out feature to 'hide power on time'
from TSC has not been independently 'invented' by DELL and Lenovo BIOS
tinkerers.

I rather have the impression that this is an advisory or feature kit from
some other entity. Whoever came up with this misfeature at Intel and/or
Microsoft (sorry, I could not come up with any other suspects) should be
promoted to run the 'Linux on feature-plagued systems' hot line.

As this seems to be more wide spread than we thought initially, we have to
think about a solution for stable kernels, especially 4.9. And distros will
have to think about that as well....

We have two options:

1) Disable TSC deadline timer by default and force users with sane machines
   to enable it on the kernel command line.

   Upside:   Very small patch

   Downside: Degrades existing setups on sane machines, keeps TSC unusable
      	     on affected machines. We have no idea what other hidden side
      	     effects the TSC_ADJUST tinkering has. If there are any, they
      	     ain't be nice ones.

2) Push the whole TSC_ADJUST sanitizing machinery into stable

   Upside:   Does not affect sane machines and gives a benefit to users of
   	     affected machines

   Downside: Rather large patch, but not that risky either. Needs a few
	     eyes and good test coverage though

Thoughts?

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero
  2016-12-16 11:46   ` [patch 2/2] " Thomas Gleixner
@ 2016-12-16 11:52     ` Ingo Molnar
  2016-12-16 11:53       ` Thomas Gleixner
  2016-12-16 13:33     ` Thomas Gleixner
  1 sibling, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2016-12-16 11:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Roland Scheidegger, Kevin Stanton, Allen Hung, stable


* Thomas Gleixner <tglx@linutronix.de> wrote:

> We have two options:
> 
> 1) Disable TSC deadline timer by default and force users with sane machines
>    to enable it on the kernel command line.
> 
>    Upside:   Very small patch
>    
>    Downside: Degrades existing setups on sane machines, keeps TSC unusable
>       	     on affected machines. We have no idea what other hidden side
>       	     effects the TSC_ADJUST tinkering has. If there are any, they
>       	     ain't be nice ones.
> 
> 2) Push the whole TSC_ADJUST sanitizing machinery into stable
> 
>    Upside:   Does not affect sane machines and gives a benefit to users of
>    	     affected machines
> 
>    Downside: Rather large patch, but not that risky either. Needs a few
> 	     eyes and good test coverage though
> 
> Thoughts?

I'd go for #2, because #1 is essentially turning it off for almost everyone.

We can still do #1 and push it back to -stable as well if #2 fails.

But I'd suggest we delay the stable backporting until it's been upstream a bit.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero
  2016-12-16 11:52     ` Ingo Molnar
@ 2016-12-16 11:53       ` Thomas Gleixner
  0 siblings, 0 replies; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-16 11:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Roland Scheidegger, Kevin Stanton, Allen Hung, stable

On Fri, 16 Dec 2016, Ingo Molnar wrote:
> * Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > We have two options:
> > 
> > 1) Disable TSC deadline timer by default and force users with sane machines
> >    to enable it on the kernel command line.
> > 
> >    Upside:   Very small patch
> >    
> >    Downside: Degrades existing setups on sane machines, keeps TSC unusable
> >       	     on affected machines. We have no idea what other hidden side
> >       	     effects the TSC_ADJUST tinkering has. If there are any, they
> >       	     ain't be nice ones.
> > 
> > 2) Push the whole TSC_ADJUST sanitizing machinery into stable
> > 
> >    Upside:   Does not affect sane machines and gives a benefit to users of
> >    	     affected machines
> > 
> >    Downside: Rather large patch, but not that risky either. Needs a few
> > 	     eyes and good test coverage though
> > 
> > Thoughts?
> 
> I'd go for #2, because #1 is essentially turning it off for almost everyone.
> 
> We can still do #1 and push it back to -stable as well if #2 fails.
> 
> But I'd suggest we delay the stable backporting until it's been upstream a bit.

I agree. None of these patches is tagged stable. I just wanted to mention
it so it can be discussed before distros/stable users are swamped with
failure reports.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero
  2016-12-16 11:46   ` [patch 2/2] " Thomas Gleixner
  2016-12-16 11:52     ` Ingo Molnar
@ 2016-12-16 13:33     ` Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-16 13:33 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Roland Scheidegger, Kevin Stanton, Allen Hung, stable

On Fri, 16 Dec 2016, Thomas Gleixner wrote:
> On Tue, 13 Dec 2016, Thomas Gleixner wrote:
> > Roland reported that his DELL T5810 sports a value add BIOS which
> > completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
> > random negative TSC_ADJUST values, different on all CPUs. That renders the
> > TSC useless because the sycnchronization check fails.
> 
> While everyone assumed that this is the usual DELL squirmware problem, I
> have to say it's not.
> 
> Just got my hands on a Skylake based Lenovo S510 box and it shows the same
> feature:
> 
> TSC ADJUST: CPU0: -10123656703215
>     	    CPU1: -10123656796701
> 	    CPU2: -10123656797460
> 	    CPU3: -10123656798366
> 
> Which causes the TSC to be out of sync on a stock upstream kernel and the
> TSC deadline timer wreckage is happening on that machine as well.
> 
> I'm pretty sure, that this well thought out feature to 'hide power on time'
> from TSC has not been independently 'invented' by DELL and Lenovo BIOS
> tinkerers.
> 
> I rather have the impression that this is an advisory or feature kit from
> some other entity. Whoever came up with this misfeature at Intel and/or
> Microsoft (sorry, I could not come up with any other suspects) should be
> promoted to run the 'Linux on feature-plagued systems' hot line.

Just to add another data point here.

On cold boot the TSC_ADJUST value on that LENOVO machine is: -24534293,
which is about 9ms.

So assumed that the SDM is correct in this point and the counter starts at
0 after power on, then 9ms later might be right in that magic blob which
does the low level bringup of CPUs. That comes from the CPU vendor and runs
_BEFORE_ the system vendor BIOS can create havoc.

Dealing with timers on x86 feels like a Sisyphean task.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-13 13:14 [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Thomas Gleixner
  2016-12-13 13:14 ` [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume Thomas Gleixner
  2016-12-13 13:14 ` [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero Thomas Gleixner
@ 2016-12-13 16:34 ` Roland Scheidegger
  2016-12-13 16:46   ` Thomas Gleixner
  2 siblings, 1 reply; 22+ messages in thread
From: Roland Scheidegger @ 2016-12-13 16:34 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Kevin Stanton, Allen Hung

Am 13.12.2016 um 14:14 schrieb Thomas Gleixner:
> Roland reported interesting TSC ADJUST register wreckage on his DELL
> machine, which seems to populate that MSR with a random number generator.

FWIW, I thought about the actual values some more and I don't actually
think they are all that random any more: the behavior is consistent with
the bios trying to zero the TSC of all cpus. If I understand this right,
writing a zero to TSC would cause somewhat small negative values in the
TSC_ADJ register at boot time, and larger negative values at suspend
time (at least if the TSC just stops when suspended and isn't reset) -
exactly what I'm seeing.
(And of course the different TSC_ADJ values would be because the bios is
writing TSC without any thoughts of synchronization, just one cpu after
another).

> 
> Deeper investagation into fixing this wreckage unearthed another special
> feature which is designed by Intel: Negative TSC adjuste values cause
> interrupt storms on the TSC deadline timer. Further details in patch 2/2

This actually looks like quite a serious hw bug to me, shouldn't there
be an errata for such a bug?

And I still don't quite understand why the lockup doesn't happen after a
warm boot, there must be something different there...

(I didn't have the chance to test the patch yet.)

Roland

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-13 16:34 ` [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Roland Scheidegger
@ 2016-12-13 16:46   ` Thomas Gleixner
  2016-12-14  1:36     ` Roland Scheidegger
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-13 16:46 UTC (permalink / raw)
  To: Roland Scheidegger
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Kevin Stanton, Allen Hung

On Tue, 13 Dec 2016, Roland Scheidegger wrote:

> Am 13.12.2016 um 14:14 schrieb Thomas Gleixner:
> > Roland reported interesting TSC ADJUST register wreckage on his DELL
> > machine, which seems to populate that MSR with a random number generator.
> 
> FWIW, I thought about the actual values some more and I don't actually
> think they are all that random any more: the behavior is consistent with
> the bios trying to zero the TSC of all cpus. If I understand this right,
> writing a zero to TSC would cause somewhat small negative values in the
> TSC_ADJ register at boot time, and larger negative values at suspend
> time (at least if the TSC just stops when suspended and isn't reset) -
> exactly what I'm seeing.
> (And of course the different TSC_ADJ values would be because the bios is
> writing TSC without any thoughts of synchronization, just one cpu after
> another).

Yeah, that might be. Still it looks like random nonsense and definitely the
BIOS developers did not follow the secrit boot protocol.

> > Deeper investagation into fixing this wreckage unearthed another special
> > feature which is designed by Intel: Negative TSC adjuste values cause
> > interrupt storms on the TSC deadline timer. Further details in patch 2/2
> 
> This actually looks like quite a serious hw bug to me, shouldn't there
> be an errata for such a bug?
> 
> And I still don't quite understand why the lockup doesn't happen after a
> warm boot, there must be something different there...

What are the adjust values after a warm boot?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-13 16:46   ` Thomas Gleixner
@ 2016-12-14  1:36     ` Roland Scheidegger
  2016-12-14  7:31       ` Thomas Gleixner
  0 siblings, 1 reply; 22+ messages in thread
From: Roland Scheidegger @ 2016-12-14  1:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Kevin Stanton, Allen Hung

Am 13.12.2016 um 17:46 schrieb Thomas Gleixner:
> On Tue, 13 Dec 2016, Roland Scheidegger wrote:
> 
>> Am 13.12.2016 um 14:14 schrieb Thomas Gleixner:
>>> Roland reported interesting TSC ADJUST register wreckage on his DELL
>>> machine, which seems to populate that MSR with a random number generator.
>>
>> FWIW, I thought about the actual values some more and I don't actually
>> think they are all that random any more: the behavior is consistent with
>> the bios trying to zero the TSC of all cpus. If I understand this right,
>> writing a zero to TSC would cause somewhat small negative values in the
>> TSC_ADJ register at boot time, and larger negative values at suspend
>> time (at least if the TSC just stops when suspended and isn't reset) -
>> exactly what I'm seeing.
>> (And of course the different TSC_ADJ values would be because the bios is
>> writing TSC without any thoughts of synchronization, just one cpu after
>> another).
> 
> Yeah, that might be. Still it looks like random nonsense and definitely the
> BIOS developers did not follow the secrit boot protocol.
> 
>>> Deeper investagation into fixing this wreckage unearthed another special
>>> feature which is designed by Intel: Negative TSC adjuste values cause
>>> interrupt storms on the TSC deadline timer. Further details in patch 2/2
>>
>> This actually looks like quite a serious hw bug to me, shouldn't there
>> be an errata for such a bug?
>>
>> And I still don't quite understand why the lockup doesn't happen after a
>> warm boot, there must be something different there...
> 
> What are the adjust values after a warm boot?
> 

So, after cold boot with a kernel which doesn't adjust TSCs, then warm
boot I got:
[    0.000000] TSC ADJUST: CPU0: -602358264300 176072418728
[    0.000000] TSC ADJUST: Boot CPU0: -602358264300
[    0.172245] TSC ADJUST: CPU1: -602360207584 176587932558
[    0.172245] TSC ADJUST differs: Reference CPU0: -602358264300 CPU1:
-602360207584
[    0.172246] TSC ADJUST synchronize: Reference CPU0: -602358264300
CPU1: -602360207584
[    0.252663] TSC ADJUST: CPU2: -602359000822 176828627154
[    0.252663] TSC ADJUST differs: Reference CPU0: -602358264300 CPU2:
-602359000822
[    0.252664] TSC ADJUST synchronize: Reference CPU0: -602358264300
CPU2: -602359000822
[    0.337014] TSC ADJUST: CPU3: -602360177680 177081093132
[    0.337014] TSC ADJUST differs: Reference CPU0: -602358264300 CPU3:
-602360177680
[    0.337015] TSC ADJUST synchronize: Reference CPU0: -602358264300
CPU3: -602360177680

and so on.

Albeit after another reboot (some minutes later), it actually straight
locked up again:

TSC ADJUST: CPU1: -8257481427958 165112676430
TSC ADJUST differs: Reference CPU0: -8257479484330 CPU1: -8257481427958
TSC ADJUST synchronize: Reference CPU0: -8257479484330 CPU1: -8254781427958
TSC target sync skip
...
smpboot: Target CPU is online

So, actually I thought the TSC would get reset too on warm boot, but
clearly looks like that isn't the case...
But I don't know what's the difference between first and second reboot -
the adjust values have just more magnitude, but otherwise even the
direction of the adjustments and everything looks all the same (just
like cold boot, which also looks all the same to me).

Roland

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-14  1:36     ` Roland Scheidegger
@ 2016-12-14  7:31       ` Thomas Gleixner
  2016-12-14 20:59         ` Thomas Gleixner
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-14  7:31 UTC (permalink / raw)
  To: Roland Scheidegger
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Kevin Stanton, Allen Hung

On Wed, 14 Dec 2016, Roland Scheidegger wrote:
> Am 13.12.2016 um 17:46 schrieb Thomas Gleixner:
> > What are the adjust values after a warm boot?
>
> So, after cold boot with a kernel which doesn't adjust TSCs, then warm
> boot I got:
> [    0.000000] TSC ADJUST: CPU0: -602358264300 176072418728
> [    0.000000] TSC ADJUST: Boot CPU0: -602358264300
> [    0.172245] TSC ADJUST: CPU1: -602360207584 176587932558
> [    0.172245] TSC ADJUST differs: Reference CPU0: -602358264300 CPU1:
> -602360207584
> [    0.172246] TSC ADJUST synchronize: Reference CPU0: -602358264300
> CPU1: -602360207584
> [    0.252663] TSC ADJUST: CPU2: -602359000822 176828627154
> [    0.252663] TSC ADJUST differs: Reference CPU0: -602358264300 CPU2:
> -602359000822
> [    0.252664] TSC ADJUST synchronize: Reference CPU0: -602358264300
> CPU2: -602359000822
> [    0.337014] TSC ADJUST: CPU3: -602360177680 177081093132
> [    0.337014] TSC ADJUST differs: Reference CPU0: -602358264300 CPU3:
> -602360177680
> [    0.337015] TSC ADJUST synchronize: Reference CPU0: -602358264300
> CPU3: -602360177680
> 
> and so on.
> 
> Albeit after another reboot (some minutes later), it actually straight
> locked up again:
> 
> TSC ADJUST: CPU1: -8257481427958 165112676430
> TSC ADJUST differs: Reference CPU0: -8257479484330 CPU1: -8257481427958
> TSC ADJUST synchronize: Reference CPU0: -8257479484330 CPU1: -8254781427958
> TSC target sync skip
> ...
> smpboot: Target CPU is online
> 
> So, actually I thought the TSC would get reset too on warm boot, but
> clearly looks like that isn't the case...
> But I don't know what's the difference between first and second reboot -
> the adjust values have just more magnitude, but otherwise even the
> direction of the adjustments and everything looks all the same (just
> like cold boot, which also looks all the same to me).

I haven't found a pattern for the lockups yet and we have to wait for Intel
to provide useful information about that issue. All we know so far is that
negative adjust values are dangerous.

Could you test the two patches on top of tip x86/timers branch so we can
make progress with that whole disaster while waiting for Intel to come
forth with a proper explanation?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-14  7:31       ` Thomas Gleixner
@ 2016-12-14 20:59         ` Thomas Gleixner
  2016-12-14 21:40           ` Thomas Gleixner
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-14 20:59 UTC (permalink / raw)
  To: Roland Scheidegger
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Kevin Stanton, Allen Hung

On Wed, 14 Dec 2016, Thomas Gleixner wrote:

> On Wed, 14 Dec 2016, Roland Scheidegger wrote:
> > Am 13.12.2016 um 17:46 schrieb Thomas Gleixner:
> > > What are the adjust values after a warm boot?
> >
> > So, after cold boot with a kernel which doesn't adjust TSCs, then warm
> > boot I got:
> > [    0.000000] TSC ADJUST: CPU0: -602358264300 176072418728
> > [    0.000000] TSC ADJUST: Boot CPU0: -602358264300
> > [    0.172245] TSC ADJUST: CPU1: -602360207584 176587932558
> > [    0.172245] TSC ADJUST differs: Reference CPU0: -602358264300 CPU1:
> > -602360207584
> > [    0.172246] TSC ADJUST synchronize: Reference CPU0: -602358264300
> > CPU1: -602360207584
> > [    0.252663] TSC ADJUST: CPU2: -602359000822 176828627154
> > [    0.252663] TSC ADJUST differs: Reference CPU0: -602358264300 CPU2:
> > -602359000822
> > [    0.252664] TSC ADJUST synchronize: Reference CPU0: -602358264300
> > CPU2: -602359000822
> > [    0.337014] TSC ADJUST: CPU3: -602360177680 177081093132
> > [    0.337014] TSC ADJUST differs: Reference CPU0: -602358264300 CPU3:
> > -602360177680
> > [    0.337015] TSC ADJUST synchronize: Reference CPU0: -602358264300
> > CPU3: -602360177680
> > 
> > and so on.
> > 
> > Albeit after another reboot (some minutes later), it actually straight
> > locked up again:
> > 
> > TSC ADJUST: CPU1: -8257481427958 165112676430
> > TSC ADJUST differs: Reference CPU0: -8257479484330 CPU1: -8257481427958
> > TSC ADJUST synchronize: Reference CPU0: -8257479484330 CPU1: -8254781427958
> > TSC target sync skip
> > ...
> > smpboot: Target CPU is online
> > 
> > So, actually I thought the TSC would get reset too on warm boot, but
> > clearly looks like that isn't the case...
> > But I don't know what's the difference between first and second reboot -
> > the adjust values have just more magnitude, but otherwise even the
> > direction of the adjustments and everything looks all the same (just
> > like cold boot, which also looks all the same to me).
> 
> I haven't found a pattern for the lockups yet and we have to wait for Intel
> to provide useful information about that issue. All we know so far is that
> negative adjust values are dangerous.

Did some futher investigation. The values which cause the interrupt storms
have very clear identifiable points which reliably reproduce:

Positive space, results in timer not firing anymore - at least not in a
time frame you are willing to wait for.

     0x0000 0000 8000 0000

Negative space, results in an interrupt storm.

     0xffff ffff 0000 0000
     0xffff fffe 0000 0000
     0xffff fffd 0000 0000
     0xffff fffc 0000 0000
     0xffff fffb 0000 0000
     ....

These points are independent of the underlying counter value (cold boot,
warm boot) and even reproduce after hours of power on reliably.

And looking at the values makes me wonder about 32bit vs. 64bit wreckage
combined with sign expansion done wrong. Im really impressed!

In the negative space there is something else going on which is dependent
on the counter value. Right after cold boot the space is closer to zero
than after hours of power on.

So the approach of forbidding negative values is definitely not wrong.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-14 20:59         ` Thomas Gleixner
@ 2016-12-14 21:40           ` Thomas Gleixner
  2016-12-14 22:54             ` Roland Scheidegger
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-14 21:40 UTC (permalink / raw)
  To: Roland Scheidegger
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Kevin Stanton, Allen Hung

On Wed, 14 Dec 2016, Thomas Gleixner wrote:
> Positive space, results in timer not firing anymore - at least not in a
> time frame you are willing to wait for.
> 
>      0x0000 0000 8000 0000
> 
> Negative space, results in an interrupt storm.
> 
>      0xffff ffff 0000 0000
>      0xffff fffe 0000 0000
>      0xffff fffd 0000 0000
>      0xffff fffc 0000 0000
>      0xffff fffb 0000 0000
>      ....
> 
> These points are independent of the underlying counter value (cold boot,
> warm boot) and even reproduce after hours of power on reliably.
> 
> And looking at the values makes me wonder about 32bit vs. 64bit wreckage
> combined with sign expansion done wrong. Im really impressed!

And the whole mess stems from the fact that the deadline is not as one
would expect simply compared against the sum of the counter and the adjust
MSR.

No, they subtract the adjust value from the MSR when you write the deadline
and latch the result to compare it against the counter.

So when the following happens:

   ADJUST	= 0
   RDTSC	= 10000000 
   DEADLINE	= 11000000

   ADJUST	=  1000000

   INTERRUPT
   RDTSC	= 12000000

   DEADLINE	= 13000000

   ADJUST	=        0

   INTERRUPT
   RDTSC	= 12000000

So depending on the direction of the adjustment the timer fires late or
early.

Combined with that math wreckage this is a complete disaster. And of course
nothing is documented anywhere and the SDM is outright wrong:

10.5.4.1 TSC-Deadline Mode

  The processor generates a timer interrupt when the value of time-stamp
  counter is greater than or equal to that of IA32_TSC_DEADLINE. It then
  disarms the timer and clear the IA32_TSC_DEADLINE MSR. (Both the time-stamp
  counter and the IA32_TSC_DEADLINE MSR are 64-bit unsigned integers.)

See the example above. 1200000 is neither equal nor greater than 1300000, at
least not in my universe.

I serioulsy doubt that Intel manages it to design at least ONE functional
non broken timer before I retire.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-14 21:40           ` Thomas Gleixner
@ 2016-12-14 22:54             ` Roland Scheidegger
  2016-12-15  9:31               ` Thomas Gleixner
  0 siblings, 1 reply; 22+ messages in thread
From: Roland Scheidegger @ 2016-12-14 22:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Kevin Stanton, Allen Hung

Am 14.12.2016 um 22:40 schrieb Thomas Gleixner:
> On Wed, 14 Dec 2016, Thomas Gleixner wrote:
>> Positive space, results in timer not firing anymore - at least not in a
>> time frame you are willing to wait for.
>>
>>      0x0000 0000 8000 0000
>>
>> Negative space, results in an interrupt storm.
>>
>>      0xffff ffff 0000 0000
>>      0xffff fffe 0000 0000
>>      0xffff fffd 0000 0000
>>      0xffff fffc 0000 0000
>>      0xffff fffb 0000 0000
>>      ....
>>
>> These points are independent of the underlying counter value (cold boot,
>> warm boot) and even reproduce after hours of power on reliably.
>>
>> And looking at the values makes me wonder about 32bit vs. 64bit wreckage
>> combined with sign expansion done wrong. Im really impressed!
> 
> And the whole mess stems from the fact that the deadline is not as one
> would expect simply compared against the sum of the counter and the adjust
> MSR.
Why would it be compared against the sum? As far as I can tell the adjust
value should never come into play when using deadline timer (other than
indirectly because the TSC would change).
(And I'd guess better avoid an armed deadline timer while changing
TSC_ADJ...)

In any case, I've tested the two patches on top of x86-timers and they
work just fine - all TSC_ADJ values get set to zero both on boot and
resume, no lockups, and tsc clocksource active, with some whining in the
log of course :-).
So,
Tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch>

> No, they subtract the adjust value from the MSR when you write the deadline
> and latch the result to compare it against the counter.
> 
> So when the following happens:
> 
>    ADJUST	= 0
>    RDTSC	= 10000000 
>    DEADLINE	= 11000000
> 
>    ADJUST	=  1000000
> 
>    INTERRUPT
>    RDTSC	= 12000000
> 
>    DEADLINE	= 13000000
> 
>    ADJUST	=        0
> 
>    INTERRUPT
>    RDTSC	= 12000000
> 
> So depending on the direction of the adjustment the timer fires late or
> early.
> 
> Combined with that math wreckage this is a complete disaster. And of course
> nothing is documented anywhere and the SDM is outright wrong:
> 
> 10.5.4.1 TSC-Deadline Mode
> 
>   The processor generates a timer interrupt when the value of time-stamp
>   counter is greater than or equal to that of IA32_TSC_DEADLINE. It then
>   disarms the timer and clear the IA32_TSC_DEADLINE MSR. (Both the time-stamp
>   counter and the IA32_TSC_DEADLINE MSR are 64-bit unsigned integers.)
> 
> See the example above. 1200000 is neither equal nor greater than 1300000, at
> least not in my universe.
> 
> I serioulsy doubt that Intel manages it to design at least ONE functional
> non broken timer before I retire.
> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-14 22:54             ` Roland Scheidegger
@ 2016-12-15  9:31               ` Thomas Gleixner
  2017-01-26 23:40                 ` Stanton, Kevin B
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2016-12-15  9:31 UTC (permalink / raw)
  To: Roland Scheidegger
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Bruce Schlobohm,
	Kevin Stanton, Allen Hung

On Wed, 14 Dec 2016, Roland Scheidegger wrote:
> Am 14.12.2016 um 22:40 schrieb Thomas Gleixner:
> > And the whole mess stems from the fact that the deadline is not as one
> > would expect simply compared against the sum of the counter and the adjust
> > MSR.
> Why would it be compared against the sum? As far as I can tell the adjust
> value should never come into play when using deadline timer (other than
> indirectly because the TSC would change).

See the SDM. It suggests that the deadline is compared to the TSC value.

I don't care how it is implemented, but I very much care about it being
documented in the way it is implemented, which is obviously not the case.

And even if the adjust value is subtracted once when the timer is armed the
whole thing should not blow up due to 32bit/sign extension bugs or whatever
they decided to cobble together.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm
  2016-12-15  9:31               ` Thomas Gleixner
@ 2017-01-26 23:40                 ` Stanton, Kevin B
  0 siblings, 0 replies; 22+ messages in thread
From: Stanton, Kevin B @ 2017-01-26 23:40 UTC (permalink / raw)
  To: Thomas Gleixner, Roland Scheidegger
  Cc: LKML, x86, Peter Zijlstra, Borislav Petkov, Schlobohm, Bruce,
	Allen Hung, Sharon, Barak, Segev, Gil

On Thurs, 15 Dec 2016, Thomas Gleixner wrote:
>See the SDM. It suggests that the deadline is compared to the TSC value.
>I don't care how it is implemented, but I very much care about it being
>documented in the way it is implemented, which is obviously not the case.

The TSC_DEADLINE behavior definitely diverges from the SDM and we've root caused the issue and are finalizing a fix that addresses the problem. Glad that a software workaround was also possible.  We also plan to expand our documentation on what good behavior of a BIOS is as it relates to adjusting TSC at boot (in almost every circumstance you shouldn't do it).  The official errata will be forthcoming. Thank you for helping to identify this.

Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-01-26 23:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-13 13:14 [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Thomas Gleixner
2016-12-13 13:14 ` [patch 1/2] x86/tsc: Validate TSC_ADJUST after resume Thomas Gleixner
2016-12-13 13:22   ` Peter Zijlstra
2016-12-13 13:23     ` Thomas Gleixner
2016-12-15 10:52   ` [tip:x86/timers] " tip-bot for Thomas Gleixner
2016-12-13 13:14 ` [patch 2/2] x86/tsc: Force TSC_ADJUST register to value >= zero Thomas Gleixner
2016-12-13 13:43   ` Peter Zijlstra
2016-12-13 15:49     ` Thomas Gleixner
2016-12-15 10:53   ` [tip:x86/timers] " tip-bot for Thomas Gleixner
2016-12-16 11:46   ` [patch 2/2] " Thomas Gleixner
2016-12-16 11:52     ` Ingo Molnar
2016-12-16 11:53       ` Thomas Gleixner
2016-12-16 13:33     ` Thomas Gleixner
2016-12-13 16:34 ` [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm Roland Scheidegger
2016-12-13 16:46   ` Thomas Gleixner
2016-12-14  1:36     ` Roland Scheidegger
2016-12-14  7:31       ` Thomas Gleixner
2016-12-14 20:59         ` Thomas Gleixner
2016-12-14 21:40           ` Thomas Gleixner
2016-12-14 22:54             ` Roland Scheidegger
2016-12-15  9:31               ` Thomas Gleixner
2017-01-26 23:40                 ` Stanton, Kevin B

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).