[patch] x86, tsc: fix SMI induced variation in quick_pit_calibrate()

From: Suresh Siddha @ 2012-01-16 20:15 UTC
  To: Linus Torvalds, Ingo Molnar, Thomas Gleixner, H. Peter Anvin
  Cc: linux-kernel, asit.k.mallick

Linus, we are seeing NTP failures on a big cluster as a result of large
variation in the calibrated TSC values. Our debugging showed that the cause
is indeed the SMI and its effect on quick PIT calibration. The appended
patch helps fix it. It survived a weekend of boot tests without any
failures.

thanks,
suresh
---
From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: x86, tsc: fix SMI induced variation in quick_pit_calibrate()

pit_expect_msb() wrongly returns success in the following SMI scenario:

a. pit_verify_msb() has not yet seen the MSB transition.

b. We are close to the MSB transition, though, and take an SMI immediately
   after returning from the pit_verify_msb() call that didn't see the MSB
   transition. The PIT MSB transition happens somewhere during SMI execution.

c. We return from the SMI, note down the 'tsc', see the PIT MSB change now, and
   exit the loop to calculate d1/d2. Instead of noting the TSC at the MSB
   transition, we are way off because of the SMI. And as the SMI happened
   between the pit_verify_msb() call and the point where 'tsc' is recorded in
   the for loop, d1/d2 will be small and quick_pit_calibrate() will not notice
   this error.
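
The scenario above can be reproduced in a small user-space simulation. This is
only a sketch and not kernel code: the simulated clock, the cost constants and
every sim_* name are hypothetical. Only the loop shape is taken from the
pre-patch pit_expect_msb().

```c
#include <stdint.h>

#define PIT_READ_COST	1000	/* assumed cycles per simulated PIT read */
#define MSB_FLIP_AT	20500	/* assumed cycle of the MSB transition */
#define SMI_LEN		100000	/* assumed cycles stolen by one SMI */

static uint64_t now;		/* simulated TSC */
static uint64_t smi_at;		/* cycle at which the SMI fires, 0 = never */

struct sim_result {
	int ok;			/* pre-patch return value */
	uint64_t delta;		/* what *deltap would hold */
	uint64_t tsc_error;	/* recorded tsc minus true transition time */
};

/* Advance simulated time; a pending SMI inside the window stalls us. */
static void advance(uint64_t cycles)
{
	uint64_t end = now + cycles;

	if (smi_at && smi_at >= now && smi_at < end) {
		end += SMI_LEN;
		smi_at = 0;
	}
	now = end;
}

/* Sample the MSB first, then pay for the slow port I/O. */
static int sim_pit_verify_msb(void)
{
	int unchanged = now < MSB_FLIP_AT;

	advance(PIT_READ_COST);
	return unchanged;
}

static uint64_t sim_get_cycles(void)
{
	advance(1);
	return now;
}

/* Same loop shape as the pre-patch pit_expect_msb(). */
static struct sim_result sim_pit_expect_msb(uint64_t smi_cycle)
{
	struct sim_result r;
	uint64_t tsc = 0;
	int count;

	now = 0;
	smi_at = smi_cycle;
	for (count = 0; count < 50000; count++) {
		if (!sim_pit_verify_msb())
			break;
		tsc = sim_get_cycles();
	}
	r.delta = sim_get_cycles() - tsc;
	r.tsc_error = tsc - MSB_FLIP_AT;
	r.ok = count > 5;
	return r;
}
```

With these assumed constants, sim_pit_expect_msb(0) (no SMI) and
sim_pit_expect_msb(20200) (SMI right after the last successful MSB check) both
report success with the same small delta, but in the second case the recorded
tsc is late by roughly the SMI length.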

Depending on whether SMI disturbance happens while computing d1 or d2, we will
see the TSC calibrated value smaller or bigger than the expected value. As a
result, in a cluster we were seeing a variation of approximately +/- 20MHz in
the calibrated values, resulting in NTP failures.
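
To get a feel for the magnitude, here is a rough sketch of how a misplaced TSC
sample turns into a frequency error. It assumes the conversion used by
quick_pit_calibrate() is cycles * PIT_TICK_RATE / (i * 256 * 1000), with
PIT_TICK_RATE being 1193182 Hz. The function name, the 2GHz TSC, the iteration
count and the 40000-cycle SMI offset are all made up for illustration.

```c
#include <stdint.h>

#define PIT_TICK_RATE	1193182ULL	/* PIT input clock in Hz */

/*
 * Hypothetical helper: convert a TSC delta measured across i PIT MSB
 * steps (i * 256 PIT ticks) into kHz.
 */
static uint64_t sketch_quick_pit_khz(uint64_t delta_cycles, unsigned int i)
{
	return delta_cycles * PIT_TICK_RATE / ((uint64_t)i * 256 * 1000);
}
```

For an assumed 2GHz TSC and i = 10, the undisturbed delta is about 4291047
cycles and the helper returns 2000000 kHz. Adding 40000 cycles of SMI skew to
the delta pushes the result up by a bit under 1%, which is in the same range
as the variation described above. Skew on the other end of the window lowers
it by the same amount.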

  [ As far as the SMI source is concerned, this is a periodic SMI that gets
    disabled after ACPI is enabled by the OS. But the TSC calibration happens
    before ACPI is enabled. ]

Fix this by comparing the returned delta (which is supposed to capture the MSB
transition and represents d1/d2 in quick_pit_calibrate()) with the delta
between the previous two TSC reads in the for loop. If the two are similar
(the previous one is at most twice the returned delta), then the returned tsc
was captured close to the MSB transition. Otherwise we return failure and fall
back to slow PIT calibration.
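
The comparison can be written as a standalone predicate. This is a sketch
only: expect_msb_ok() is a hypothetical name, and the sample values in the
note below are invented. The condition itself is the one in the diff.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the new return condition:
 *   count   - number of loop iterations completed
 *   tsc_old - TSC read one iteration before the last
 *   tsc     - last TSC read inside the loop
 *   delta   - TSC read after the loop minus tsc (the value in *deltap)
 */
static int expect_msb_ok(int count, uint64_t tsc_old, uint64_t tsc,
			 uint64_t delta)
{
	/* (tsc - tsc_old): gap between the last two TSC reads in the loop. */
	return count > 5 && (tsc - tsc_old <= 2 * delta);
}
```

For example, with a delta of 1001 cycles, a previous gap of 1001 cycles
(tsc_old = 20020, tsc = 21021) is accepted, while a previous gap of 101001
cycles (tsc_old = 20020, tsc = 121021), as an SMI in that window would
produce, is rejected.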

Any SMI-induced disturbance in the returned delta (d1/d2) itself is already
caught by our requirement that the error be less than 500ppm.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---
 arch/x86/kernel/tsc.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index c0dd5b6..27a1311 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -290,11 +290,12 @@ static inline int pit_verify_msb(unsigned char val)
 static inline int pit_expect_msb(unsigned char val, u64 *tscp, unsigned long *deltap)
 {
 	int count;
-	u64 tsc = 0;
+	u64 tsc = 0, tsc_old;
 
 	for (count = 0; count < 50000; count++) {
 		if (!pit_verify_msb(val))
 			break;
+		tsc_old = tsc;
 		tsc = get_cycles();
 	}
 	*deltap = get_cycles() - tsc;
@@ -304,7 +305,7 @@ static inline int pit_expect_msb(unsigned char val, u64 *tscp, unsigned long *de
 	 * We require _some_ success, but the quality control
 	 * will be based on the error terms on the TSC values.
 	 */
-	return count > 5;
+	return count > 5 && (tsc - tsc_old <= 2 * (*deltap));
 }
 
 /*


