* [PATCH] tsc: make calibration refinement more robust
@ 2018-11-01 15:12 Daniel Vacek
  2018-11-01 15:34 ` Thomas Gleixner
  2018-11-05 17:10 ` [PATCH v2] x86/tsc: " Daniel Vacek
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Vacek @ 2018-11-01 15:12 UTC (permalink / raw)
  To: x86
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	linux-kernel, Daniel Vacek

The threshold in tsc_read_refs() is a constant which may favor slower CPUs
but may not be optimal for a simple read of the reference on faster ones.
Hence make it proportional to tsc_khz to compensate for this. The threshold
guards against any disturbance such as IRQs, NMIs, SMIs or CPU stealing by
the host on guest systems, so rename it accordingly and fix the comments as
well.

Also, on some systems there is noticeable DMI bus contention at some point
during boot which keeps the readout failing (observed in roughly one out of
~300 boots when testing). In that case, also retry the second readout
instead of simply bailing out unrefined. Usually the readout comes back
fast and without any issues a second later.

Signed-off-by: Daniel Vacek <neelx@redhat.com>
---
 arch/x86/kernel/tsc.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 6d5dc5dabfd7..7812cd7bd45b 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -297,11 +297,11 @@ static int __init tsc_setup(char *str)
 
 __setup("tsc=", tsc_setup);
 
-#define MAX_RETRIES     5
-#define SMI_TRESHOLD    50000
+#define MAX_RETRIES	5
+#define TSC_THRESHOLD	(tsc_khz >> 5)
 
 /*
- * Read TSC and the reference counters. Take care of SMI disturbance
+ * Read TSC and the reference counters. Take care of any disturbances
  */
 static u64 tsc_read_refs(u64 *p, int hpet)
 {
@@ -315,7 +315,7 @@ static u64 tsc_read_refs(u64 *p, int hpet)
 		else
 			*p = acpi_pm_read_early();
 		t2 = get_cycles();
-		if ((t2 - t1) < SMI_TRESHOLD)
+		if ((t2 - t1) < TSC_THRESHOLD)
 			return t2;
 	}
 	return ULLONG_MAX;
@@ -1268,7 +1268,7 @@ struct system_counterval_t convert_art_ns_to_tsc(u64 art_ns)
  */
 static void tsc_refine_calibration_work(struct work_struct *work)
 {
-	static u64 tsc_start = -1, ref_start;
+	static u64 tsc_start = ULLONG_MAX, ref_start;
 	static int hpet;
 	u64 tsc_stop, ref_stop, delta;
 	unsigned long freq;
@@ -1283,14 +1283,15 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	 * delayed the first time we expire. So set the workqueue
 	 * again once we know timers are working.
 	 */
-	if (tsc_start == -1) {
+	if (tsc_start == ULLONG_MAX) {
+restart:
 		/*
 		 * Only set hpet once, to avoid mixing hardware
 		 * if the hpet becomes enabled later.
 		 */
 		hpet = is_hpet_enabled();
-		schedule_delayed_work(&tsc_irqwork, HZ);
 		tsc_start = tsc_read_refs(&ref_start, hpet);
+		schedule_delayed_work(&tsc_irqwork, HZ);
 		return;
 	}
 
@@ -1300,9 +1301,9 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	if (ref_start == ref_stop)
 		goto out;
 
-	/* Check, whether the sampling was disturbed by an SMI */
-	if (tsc_start == ULLONG_MAX || tsc_stop == ULLONG_MAX)
-		goto out;
+	/* Check, whether the sampling was disturbed */
+	if (tsc_stop == ULLONG_MAX)
+		goto restart;
 
 	delta = tsc_stop - tsc_start;
 	delta *= 1000000LL;
-- 
2.19.1
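
For a sense of scale (not part of the patch, just arithmetic with made-up
example frequencies): the old SMI_TRESHOLD of 50000 cycles allows a window
that shrinks as the TSC gets faster, while tsc_khz >> 5 always corresponds
to roughly 1/32 of a millisecond (~31 us) of wall time. A standalone sketch:

#include <stdio.h>

int main(void)
{
	unsigned long freqs_khz[] = { 1000000, 2500000, 4000000 };	/* 1, 2.5 and 4 GHz examples */
	int i;

	for (i = 0; i < 3; i++) {
		unsigned long khz = freqs_khz[i];
		unsigned long old_us = 50000UL * 1000 / khz;	/* old fixed 50000-cycle window */
		unsigned long new_cycles = khz >> 5;		/* proposed tsc_khz >> 5 */
		unsigned long new_us = new_cycles * 1000 / khz;	/* always ~31 us of wall time */

		printf("%7lu kHz: old window %2lu us, new window %2lu us (%6lu cycles)\n",
		       khz, old_us, new_us, new_cycles);
	}
	return 0;
}

In other words the fixed constant tolerated ~50 us of disturbance on a
1 GHz machine but only ~12 us on a 4 GHz one, whereas the proportional
threshold keeps the tolerance at about 31 us regardless of the clock.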



* Re: [PATCH] tsc: make calibration refinement more robust
  2018-11-01 15:12 [PATCH] tsc: make calibration refinement more robust Daniel Vacek
@ 2018-11-01 15:34 ` Thomas Gleixner
  2018-11-02  6:40   ` Daniel Vacek
  2018-11-05 17:10 ` [PATCH v2] x86/tsc: " Daniel Vacek
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2018-11-01 15:34 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin, LKML, Peter Zijlstra

Daniel,

On Thu, 1 Nov 2018, Daniel Vacek wrote:

Please use 'x86/tsc:' as prefix. git log path/to/file usually gives you a
reasonable hint about prefixes.

> -#define MAX_RETRIES     5
> -#define SMI_TRESHOLD    50000
> +#define MAX_RETRIES	5
> +#define TSC_THRESHOLD	(tsc_khz >> 5)

This breaks pit_hpet_ptimer_calibrate_cpu() because at that point tsc_khz is 0.

Thanks,

	tglx
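
The breakage is visible from the comparison alone: during early calibration
tsc_khz is still 0, so the proposed threshold evaluates to 0 and the
unsigned check in tsc_read_refs() can never succeed, i.e. every sample is
reported as disturbed. A minimal illustration outside the kernel, with
made-up readout values:

#include <stdio.h>

int main(void)
{
	unsigned long long t1 = 1000, t2 = 1450;	/* any real readout has t2 > t1 */
	unsigned long long thresh = 0;			/* tsc_khz >> 5 while tsc_khz == 0 */

	/* An unsigned difference is never below a threshold of zero, so the
	 * retry loop would exhaust MAX_RETRIES and return ULLONG_MAX. */
	if ((t2 - t1) < thresh)
		printf("readout accepted\n");
	else
		printf("readout rejected -> tsc_read_refs() returns ULLONG_MAX\n");
	return 0;
}

With every reference readout rejected, the HPET/PMTIMER cross-check in
pit_hpet_ptimer_calibrate_cpu() can never accept a sample, which is the
breakage pointed out above.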


* Re: [PATCH] tsc: make calibration refinement more robust
  2018-11-01 15:34 ` Thomas Gleixner
@ 2018-11-02  6:40   ` Daniel Vacek
  2018-11-03 10:04     ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vacek @ 2018-11-02  6:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin, LKML, Peter Zijlstra

Hi Thomas,

thanks for checking.

On Thu, Nov 1, 2018 at 4:34 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> Daniel,
>
> On Thu, 1 Nov 2018, Daniel Vacek wrote:
>
> Please use 'x86/tsc:' as prefix. git log path/to/file usually gives you a
> reasonable hint about prefixes.

Oh, sure thing. The dmesg output always prints 'tsc:' - I somehow stuck to it...

>> -#define MAX_RETRIES     5
>> -#define SMI_TRESHOLD    50000
>> +#define MAX_RETRIES  5
>> +#define TSC_THRESHOLD        (tsc_khz >> 5)
>
> This breaks pit_hpet_ptimer_calibrate_cpu() because at that point tsc_khz is 0.

That did not show up with my testing, sorry. I guess
pit_calibrate_tsc() never failed for me. Hmm, actually it looks like
quick_pit_calibrate() does the job for me so
pit_hpet_ptimer_calibrate_cpu() is likely not even called. Would this:

#define TSC_THRESHOLD   (tsc_khz ? tsc_khz >> 5 : 0x20000)

work for you instead? Or alternatively at some point when chasing this
down I used:

#define TSC_THRESHOLD (0x10000 + (tsc_khz >> 6))

The first one seems better though. I can send v2 next week if you like it.

--nX

> Thanks,
>
>         tglx
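
For comparison, the two candidates behave rather differently once tsc_khz
is known: the first only falls back to a fixed 0x20000 cycles while the
frequency is unavailable, whereas the second always keeps a 0x10000-cycle
floor plus a smaller proportional part. A quick sketch with illustrative
frequencies (not values from the thread):

#include <stdio.h>

static unsigned long long variant_a(unsigned long tsc_khz)
{
	return tsc_khz ? tsc_khz >> 5 : 0x20000;	/* fallback only while tsc_khz is unknown */
}

static unsigned long long variant_b(unsigned long tsc_khz)
{
	return 0x10000 + (tsc_khz >> 6);		/* fixed floor plus a smaller slope */
}

int main(void)
{
	unsigned long freqs[] = { 0, 1000000, 4000000 };	/* unknown, 1 GHz, 4 GHz */
	int i;

	for (i = 0; i < 3; i++)
		printf("tsc_khz=%7lu: a=%6llu cycles, b=%6llu cycles\n",
		       freqs[i], variant_a(freqs[i]), variant_b(freqs[i]));
	return 0;
}

The second variant never tightens below 64k cycles even on slow clocks,
which is consistent with the preference for the first variant expressed
above.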


* Re: [PATCH] tsc: make calibration refinement more robust
  2018-11-02  6:40   ` Daniel Vacek
@ 2018-11-03 10:04     ` Thomas Gleixner
  2018-11-05 15:41       ` Daniel Vacek
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2018-11-03 10:04 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin, LKML, Peter Zijlstra

Daniel,

On Fri, 2 Nov 2018, Daniel Vacek wrote:
> On Thu, Nov 1, 2018 at 4:34 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> >> -#define MAX_RETRIES     5
> >> -#define SMI_TRESHOLD    50000
> >> +#define MAX_RETRIES  5
> >> +#define TSC_THRESHOLD        (tsc_khz >> 5)
> >
> > This breaks pit_hpet_ptimer_calibrate_cpu() because at that point tsc_khz is 0.
> 
> That did not show up with my testing, sorry. I guess
> pit_calibrate_tsc() never failed for me. Hmm, actually it looks like
> quick_pit_calibrate() does the job for me so
> pit_hpet_ptimer_calibrate_cpu() is likely not even called.

Right. It's only called when quick calibration fails. Testing does not
replace code inspection :)

> Would this:
> 
> #define TSC_THRESHOLD   (tsc_khz ? tsc_khz >> 5 : 0x20000)
> 
> work for you instead? Or alternatively at some point when chasing this
> down I used:
> 
> #define TSC_THRESHOLD (0x10000 + (tsc_khz >> 6))
> 
> The first one seems better though. I can send v2 next week if you like it.

Can you please avoid hiding the logic in a macro? Just use a local
variable:

	u64 thresh = tsc_khz ? tsc_khz >> 5 : TSC_DEFAULT_THRESHOLD;

and use that in the comparison.

Thanks,

	tglx





* Re: [PATCH] tsc: make calibration refinement more robust
  2018-11-03 10:04     ` Thomas Gleixner
@ 2018-11-05 15:41       ` Daniel Vacek
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vacek @ 2018-11-05 15:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin, LKML, Peter Zijlstra

On Sat, Nov 3, 2018 at 11:04 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> Daniel,
>
> On Fri, 2 Nov 2018, Daniel Vacek wrote:
>> On Thu, Nov 1, 2018 at 4:34 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> >> -#define MAX_RETRIES     5
>> >> -#define SMI_TRESHOLD    50000
>> >> +#define MAX_RETRIES  5
>> >> +#define TSC_THRESHOLD        (tsc_khz >> 5)
>> >
>> > This breaks pit_hpet_ptimer_calibrate_cpu() because at that point tsc_khz is 0.
>>
>> That did not show up with my testing, sorry. I guess
>> pit_calibrate_tsc() never failed for me. Hmm, actually it looks like
>> quick_pit_calibrate() does the job for me so
>> pit_hpet_ptimer_calibrate_cpu() is likely not even called.
>
> Right. It's only called when quick calibration fails. Testing does not
> replace code inspection :)

Agreed. I was not 100% sure about the early init and the order of
execution, as it is dynamically switched via x86_platform.calibrate_cpu
and x86_platform.calibrate_tsc. Thanks again for the review, Thomas.

> Can you please avoid hiding the logic in a macro? Just use a local
> variable:
>
>         u64 thresh = tsc_khz ? tsc_khz >> 5 : TSC_DEFAULT_THRESHOLD;
>
> and use that in the comparison.

Sweet, I'll do that :)

--nX

> Thanks,
>
>         tglx


* [PATCH v2] x86/tsc: make calibration refinement more robust
  2018-11-01 15:12 [PATCH] tsc: make calibration refinement more robust Daniel Vacek
  2018-11-01 15:34 ` Thomas Gleixner
@ 2018-11-05 17:10 ` Daniel Vacek
  2018-11-06 20:57   ` [tip:x86/timers] x86/tsc: Make " tip-bot for Daniel Vacek
  1 sibling, 1 reply; 11+ messages in thread
From: Daniel Vacek @ 2018-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	linux-kernel, Daniel Vacek

The threshold in tsc_read_refs() is a constant which may favor slower CPUs
but may not be optimal for a simple read of the reference on faster ones.
Hence make it proportional to tsc_khz, when available, to compensate for
this. The threshold guards against any disturbance such as IRQs, NMIs,
SMIs or CPU stealing by the host on guest systems, so rename it
accordingly and fix the comments as well.

Also, on some systems there is noticeable DMI bus contention at some point
during boot which keeps the readout failing (observed in roughly one out of
~300 boots when testing). In that case, also retry the second readout
instead of simply bailing out unrefined. Usually the readout comes back
fast and without any issues a second later.

v2: keep using a constant early in boot, when tsc_khz is not yet
    available, as suggested by tglx

Signed-off-by: Daniel Vacek <neelx@redhat.com>
---
 arch/x86/kernel/tsc.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index e9f777bfed40..3fae23834069 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -297,15 +297,16 @@ static int __init tsc_setup(char *str)
 
 __setup("tsc=", tsc_setup);
 
-#define MAX_RETRIES     5
-#define SMI_TRESHOLD    50000
+#define MAX_RETRIES		5
+#define TSC_DEFAULT_THRESHOLD	0x20000
 
 /*
- * Read TSC and the reference counters. Take care of SMI disturbance
+ * Read TSC and the reference counters. Take care of any disturbances
  */
 static u64 tsc_read_refs(u64 *p, int hpet)
 {
 	u64 t1, t2;
+	u64 thresh = tsc_khz ? tsc_khz >> 5 : TSC_DEFAULT_THRESHOLD;
 	int i;
 
 	for (i = 0; i < MAX_RETRIES; i++) {
@@ -315,7 +316,7 @@ static u64 tsc_read_refs(u64 *p, int hpet)
 		else
 			*p = acpi_pm_read_early();
 		t2 = get_cycles();
-		if ((t2 - t1) < SMI_TRESHOLD)
+		if ((t2 - t1) < thresh)
 			return t2;
 	}
 	return ULLONG_MAX;
@@ -703,15 +704,15 @@ static unsigned long pit_hpet_ptimer_calibrate_cpu(void)
 	 * zero. In each wait loop iteration we read the TSC and check
 	 * the delta to the previous read. We keep track of the min
 	 * and max values of that delta. The delta is mostly defined
-	 * by the IO time of the PIT access, so we can detect when a
-	 * SMI/SMM disturbance happened between the two reads. If the
+	 * by the IO time of the PIT access, so we can detect when
+	 * any disturbance happened between the two reads. If the
 	 * maximum time is significantly larger than the minimum time,
 	 * then we discard the result and have another try.
 	 *
 	 * 2) Reference counter. If available we use the HPET or the
 	 * PMTIMER as a reference to check the sanity of that value.
 	 * We use separate TSC readouts and check inside of the
-	 * reference read for a SMI/SMM disturbance. We dicard
+	 * reference read for any possible disturbance. We dicard
 	 * disturbed values here as well. We do that around the PIT
 	 * calibration delay loop as we have to wait for a certain
 	 * amount of time anyway.
@@ -744,7 +745,7 @@ static unsigned long pit_hpet_ptimer_calibrate_cpu(void)
 		if (ref1 == ref2)
 			continue;
 
-		/* Check, whether the sampling was disturbed by an SMI */
+		/* Check, whether the sampling was disturbed */
 		if (tsc1 == ULLONG_MAX || tsc2 == ULLONG_MAX)
 			continue;
 
@@ -1268,7 +1269,7 @@ struct system_counterval_t convert_art_ns_to_tsc(u64 art_ns)
  */
 static void tsc_refine_calibration_work(struct work_struct *work)
 {
-	static u64 tsc_start = -1, ref_start;
+	static u64 tsc_start = ULLONG_MAX, ref_start;
 	static int hpet;
 	u64 tsc_stop, ref_stop, delta;
 	unsigned long freq;
@@ -1283,14 +1284,15 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	 * delayed the first time we expire. So set the workqueue
 	 * again once we know timers are working.
 	 */
-	if (tsc_start == -1) {
+	if (tsc_start == ULLONG_MAX) {
+restart:
 		/*
 		 * Only set hpet once, to avoid mixing hardware
 		 * if the hpet becomes enabled later.
 		 */
 		hpet = is_hpet_enabled();
-		schedule_delayed_work(&tsc_irqwork, HZ);
 		tsc_start = tsc_read_refs(&ref_start, hpet);
+		schedule_delayed_work(&tsc_irqwork, HZ);
 		return;
 	}
 
@@ -1300,9 +1302,9 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	if (ref_start == ref_stop)
 		goto out;
 
-	/* Check, whether the sampling was disturbed by an SMI */
-	if (tsc_start == ULLONG_MAX || tsc_stop == ULLONG_MAX)
-		goto out;
+	/* Check, whether the sampling was disturbed */
+	if (tsc_stop == ULLONG_MAX)
+		goto restart;
 
 	delta = tsc_stop - tsc_start;
 	delta *= 1000000LL;
-- 
2.19.1
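
One thing v2 adds over v1 is the TSC_DEFAULT_THRESHOLD fallback for the
early-boot case. As a rough sanity check (plain arithmetic, not something
stated in the thread), 0x20000 cycles is what the proportional formula
would pick for a TSC of about 4.2 GHz, so the fallback stays at least as
permissive as the scaled threshold for any clock up to roughly that speed:

#include <stdio.h>

int main(void)
{
	unsigned long default_thresh = 0x20000;			/* 131072 cycles */
	unsigned long equivalent_khz = default_thresh << 5;	/* solves tsc_khz >> 5 == 0x20000 */

	printf("0x20000 cycles == tsc_khz >> 5 at tsc_khz = %lu kHz (~%.1f GHz)\n",
	       equivalent_khz, equivalent_khz / 1e6);
	return 0;
}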



* [tip:x86/timers] x86/tsc: Make calibration refinement more robust
  2018-11-05 17:10 ` [PATCH v2] x86/tsc: " Daniel Vacek
@ 2018-11-06 20:57   ` tip-bot for Daniel Vacek
  2019-01-08 13:14     ` Daniel Vacek
  0 siblings, 1 reply; 11+ messages in thread
From: tip-bot for Daniel Vacek @ 2018-11-06 20:57 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: mingo, tglx, linux-kernel, neelx, hpa, bp

Commit-ID:  a786ef152cdcfebc923a67f63c7815806eefcf81
Gitweb:     https://git.kernel.org/tip/a786ef152cdcfebc923a67f63c7815806eefcf81
Author:     Daniel Vacek <neelx@redhat.com>
AuthorDate: Mon, 5 Nov 2018 18:10:40 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 6 Nov 2018 21:53:15 +0100

x86/tsc: Make calibration refinement more robust

The threshold in tsc_read_refs() is a constant which may favor slower CPUs
but may not be optimal for a simple read of the reference on faster ones.

Hence make it proportional to tsc_khz, when available, to compensate for
this. The threshold guards against any disturbance such as IRQs, NMIs,
SMIs or CPU stealing by the host on guest systems, so rename it
accordingly and fix the comments as well.

Also, on some systems there is noticeable DMI bus contention at some point
during boot which keeps the readout failing (observed in roughly one out of
~300 boots when testing). In that case, also retry the second readout
instead of simply bailing out unrefined. Usually the readout comes back
fast and without any issues a second later.

Signed-off-by: Daniel Vacek <neelx@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/1541437840-29293-1-git-send-email-neelx@redhat.com

---
 arch/x86/kernel/tsc.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index e9f777bfed40..3fae23834069 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -297,15 +297,16 @@ static int __init tsc_setup(char *str)
 
 __setup("tsc=", tsc_setup);
 
-#define MAX_RETRIES     5
-#define SMI_TRESHOLD    50000
+#define MAX_RETRIES		5
+#define TSC_DEFAULT_THRESHOLD	0x20000
 
 /*
- * Read TSC and the reference counters. Take care of SMI disturbance
+ * Read TSC and the reference counters. Take care of any disturbances
  */
 static u64 tsc_read_refs(u64 *p, int hpet)
 {
 	u64 t1, t2;
+	u64 thresh = tsc_khz ? tsc_khz >> 5 : TSC_DEFAULT_THRESHOLD;
 	int i;
 
 	for (i = 0; i < MAX_RETRIES; i++) {
@@ -315,7 +316,7 @@ static u64 tsc_read_refs(u64 *p, int hpet)
 		else
 			*p = acpi_pm_read_early();
 		t2 = get_cycles();
-		if ((t2 - t1) < SMI_TRESHOLD)
+		if ((t2 - t1) < thresh)
 			return t2;
 	}
 	return ULLONG_MAX;
@@ -703,15 +704,15 @@ static unsigned long pit_hpet_ptimer_calibrate_cpu(void)
 	 * zero. In each wait loop iteration we read the TSC and check
 	 * the delta to the previous read. We keep track of the min
 	 * and max values of that delta. The delta is mostly defined
-	 * by the IO time of the PIT access, so we can detect when a
-	 * SMI/SMM disturbance happened between the two reads. If the
+	 * by the IO time of the PIT access, so we can detect when
+	 * any disturbance happened between the two reads. If the
 	 * maximum time is significantly larger than the minimum time,
 	 * then we discard the result and have another try.
 	 *
 	 * 2) Reference counter. If available we use the HPET or the
 	 * PMTIMER as a reference to check the sanity of that value.
 	 * We use separate TSC readouts and check inside of the
-	 * reference read for a SMI/SMM disturbance. We dicard
+	 * reference read for any possible disturbance. We dicard
 	 * disturbed values here as well. We do that around the PIT
 	 * calibration delay loop as we have to wait for a certain
 	 * amount of time anyway.
@@ -744,7 +745,7 @@ static unsigned long pit_hpet_ptimer_calibrate_cpu(void)
 		if (ref1 == ref2)
 			continue;
 
-		/* Check, whether the sampling was disturbed by an SMI */
+		/* Check, whether the sampling was disturbed */
 		if (tsc1 == ULLONG_MAX || tsc2 == ULLONG_MAX)
 			continue;
 
@@ -1268,7 +1269,7 @@ static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
  */
 static void tsc_refine_calibration_work(struct work_struct *work)
 {
-	static u64 tsc_start = -1, ref_start;
+	static u64 tsc_start = ULLONG_MAX, ref_start;
 	static int hpet;
 	u64 tsc_stop, ref_stop, delta;
 	unsigned long freq;
@@ -1283,14 +1284,15 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	 * delayed the first time we expire. So set the workqueue
 	 * again once we know timers are working.
 	 */
-	if (tsc_start == -1) {
+	if (tsc_start == ULLONG_MAX) {
+restart:
 		/*
 		 * Only set hpet once, to avoid mixing hardware
 		 * if the hpet becomes enabled later.
 		 */
 		hpet = is_hpet_enabled();
-		schedule_delayed_work(&tsc_irqwork, HZ);
 		tsc_start = tsc_read_refs(&ref_start, hpet);
+		schedule_delayed_work(&tsc_irqwork, HZ);
 		return;
 	}
 
@@ -1300,9 +1302,9 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	if (ref_start == ref_stop)
 		goto out;
 
-	/* Check, whether the sampling was disturbed by an SMI */
-	if (tsc_start == ULLONG_MAX || tsc_stop == ULLONG_MAX)
-		goto out;
+	/* Check, whether the sampling was disturbed */
+	if (tsc_stop == ULLONG_MAX)
+		goto restart;
 
 	delta = tsc_stop - tsc_start;
 	delta *= 1000000LL;


* Re: [tip:x86/timers] x86/tsc: Make calibration refinement more robust
  2018-11-06 20:57   ` [tip:x86/timers] x86/tsc: Make " tip-bot for Daniel Vacek
@ 2019-01-08 13:14     ` Daniel Vacek
  2019-01-11 20:45       ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vacek @ 2019-01-08 13:14 UTC (permalink / raw)
  To: Thomas Gleixner, mingo, H. Peter Anvin
  Cc: linux-tip-commits, open list, Borislav Petkov, Daniel Vacek

Hi Thomas, Ingo, Peter.

I'm wondering, was the x86/timers branch of the tip tree merged into Linus'
tree for v5.0-rc1? Somehow I do not see this patch making it through...

Am I doing something wrong?

--nX

On Tue, Nov 6, 2018 at 9:58 PM tip-bot for Daniel Vacek
<tipbot@zytor.com> wrote:
>
> Commit-ID:  a786ef152cdcfebc923a67f63c7815806eefcf81
> Gitweb:     https://git.kernel.org/tip/a786ef152cdcfebc923a67f63c7815806eefcf81
> Author:     Daniel Vacek <neelx@redhat.com>
> AuthorDate: Mon, 5 Nov 2018 18:10:40 +0100
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Tue, 6 Nov 2018 21:53:15 +0100
>
> x86/tsc: Make calibration refinement more robust
>
> The threshold in tsc_read_refs() is a constant which may favor slower CPUs
> but may not be optimal for a simple read of the reference on faster ones.
>
> Hence make it proportional to tsc_khz, when available, to compensate for
> this. The threshold guards against any disturbance such as IRQs, NMIs,
> SMIs or CPU stealing by the host on guest systems, so rename it
> accordingly and fix the comments as well.
>
> Also, on some systems there is noticeable DMI bus contention at some point
> during boot which keeps the readout failing (observed in roughly one out of
> ~300 boots when testing). In that case, also retry the second readout
> instead of simply bailing out unrefined. Usually the readout comes back
> fast and without any issues a second later.
>
> Signed-off-by: Daniel Vacek <neelx@redhat.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Link: https://lkml.kernel.org/r/1541437840-29293-1-git-send-email-neelx@redhat.com
>
> ---
>  arch/x86/kernel/tsc.c | 30 ++++++++++++++++--------------
>  1 file changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index e9f777bfed40..3fae23834069 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -297,15 +297,16 @@ static int __init tsc_setup(char *str)
>
>  __setup("tsc=", tsc_setup);
>
> -#define MAX_RETRIES     5
> -#define SMI_TRESHOLD    50000
> +#define MAX_RETRIES            5
> +#define TSC_DEFAULT_THRESHOLD  0x20000
>
>  /*
> - * Read TSC and the reference counters. Take care of SMI disturbance
> + * Read TSC and the reference counters. Take care of any disturbances
>   */
>  static u64 tsc_read_refs(u64 *p, int hpet)
>  {
>         u64 t1, t2;
> +       u64 thresh = tsc_khz ? tsc_khz >> 5 : TSC_DEFAULT_THRESHOLD;
>         int i;
>
>         for (i = 0; i < MAX_RETRIES; i++) {
> @@ -315,7 +316,7 @@ static u64 tsc_read_refs(u64 *p, int hpet)
>                 else
>                         *p = acpi_pm_read_early();
>                 t2 = get_cycles();
> -               if ((t2 - t1) < SMI_TRESHOLD)
> +               if ((t2 - t1) < thresh)
>                         return t2;
>         }
>         return ULLONG_MAX;
> @@ -703,15 +704,15 @@ static unsigned long pit_hpet_ptimer_calibrate_cpu(void)
>          * zero. In each wait loop iteration we read the TSC and check
>          * the delta to the previous read. We keep track of the min
>          * and max values of that delta. The delta is mostly defined
> -        * by the IO time of the PIT access, so we can detect when a
> -        * SMI/SMM disturbance happened between the two reads. If the
> +        * by the IO time of the PIT access, so we can detect when
> +        * any disturbance happened between the two reads. If the
>          * maximum time is significantly larger than the minimum time,
>          * then we discard the result and have another try.
>          *
>          * 2) Reference counter. If available we use the HPET or the
>          * PMTIMER as a reference to check the sanity of that value.
>          * We use separate TSC readouts and check inside of the
> -        * reference read for a SMI/SMM disturbance. We dicard
> +        * reference read for any possible disturbance. We dicard
>          * disturbed values here as well. We do that around the PIT
>          * calibration delay loop as we have to wait for a certain
>          * amount of time anyway.
> @@ -744,7 +745,7 @@ static unsigned long pit_hpet_ptimer_calibrate_cpu(void)
>                 if (ref1 == ref2)
>                         continue;
>
> -               /* Check, whether the sampling was disturbed by an SMI */
> +               /* Check, whether the sampling was disturbed */
>                 if (tsc1 == ULLONG_MAX || tsc2 == ULLONG_MAX)
>                         continue;
>
> @@ -1268,7 +1269,7 @@ static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
>   */
>  static void tsc_refine_calibration_work(struct work_struct *work)
>  {
> -       static u64 tsc_start = -1, ref_start;
> +       static u64 tsc_start = ULLONG_MAX, ref_start;
>         static int hpet;
>         u64 tsc_stop, ref_stop, delta;
>         unsigned long freq;
> @@ -1283,14 +1284,15 @@ static void tsc_refine_calibration_work(struct work_struct *work)
>          * delayed the first time we expire. So set the workqueue
>          * again once we know timers are working.
>          */
> -       if (tsc_start == -1) {
> +       if (tsc_start == ULLONG_MAX) {
> +restart:
>                 /*
>                  * Only set hpet once, to avoid mixing hardware
>                  * if the hpet becomes enabled later.
>                  */
>                 hpet = is_hpet_enabled();
> -               schedule_delayed_work(&tsc_irqwork, HZ);
>                 tsc_start = tsc_read_refs(&ref_start, hpet);
> +               schedule_delayed_work(&tsc_irqwork, HZ);
>                 return;
>         }
>
> @@ -1300,9 +1302,9 @@ static void tsc_refine_calibration_work(struct work_struct *work)
>         if (ref_start == ref_stop)
>                 goto out;
>
> -       /* Check, whether the sampling was disturbed by an SMI */
> -       if (tsc_start == ULLONG_MAX || tsc_stop == ULLONG_MAX)
> -               goto out;
> +       /* Check, whether the sampling was disturbed */
> +       if (tsc_stop == ULLONG_MAX)
> +               goto restart;
>
>         delta = tsc_stop - tsc_start;
>         delta *= 1000000LL;


* Re: [tip:x86/timers] x86/tsc: Make calibration refinement more robust
  2019-01-08 13:14     ` Daniel Vacek
@ 2019-01-11 20:45       ` Thomas Gleixner
  2019-01-16 15:53         ` Daniel Vacek
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2019-01-11 20:45 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: mingo, H. Peter Anvin, linux-tip-commits, open list, Borislav Petkov

Daniel,

On Tue, 8 Jan 2019, Daniel Vacek wrote:

> I'm wondering, was the x86/timers branch of the tip tree merged into Linus'
> tree for v5.0-rc1? Somehow I do not see this patch making it through...
> 
> Am I doing something wrong?

No. The branch was not sent to Linus during the merge window due to holiday
season induced oversight.

Thanks,

	tglx


* Re: [tip:x86/timers] x86/tsc: Make calibration refinement more robust
  2019-01-11 20:45       ` Thomas Gleixner
@ 2019-01-16 15:53         ` Daniel Vacek
  2019-01-28 16:00           ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vacek @ 2019-01-16 15:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: mingo, H. Peter Anvin, linux-tip-commits, open list, Borislav Petkov

On Fri, Jan 11, 2019 at 10:03 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> No. The branch was not sent to Linus during the merge window due to holiday
> season induced oversight.

So does it wait for the next merge window now or is it still going to
be sent? This is actually material for the stable branches. I have
confirmations from Hitachi and NEC that this really helps the latest
Skylake-X systems boot reliably.

--nX


* Re: [tip:x86/timers] x86/tsc: Make calibration refinement more robust
  2019-01-16 15:53         ` Daniel Vacek
@ 2019-01-28 16:00           ` Thomas Gleixner
  0 siblings, 0 replies; 11+ messages in thread
From: Thomas Gleixner @ 2019-01-28 16:00 UTC (permalink / raw)
  To: Daniel Vacek
  Cc: mingo, H. Peter Anvin, linux-tip-commits, open list, Borislav Petkov

On Wed, 16 Jan 2019, Daniel Vacek wrote:

> On Fri, Jan 11, 2019 at 10:03 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > No. The branch was not sent to Linus during the merge window due to holiday
> > season induced oversight.
> 
> So does it wait for the next merge window now or is it still going to
> be sent? This is actually material for the stable branches. I have
> confirmations from Hitachi and NEC that this really helps the latest
> Skylake-X systems boot reliably.

It's in 5.0-rc4 now.

Thanks,

	tglx


Thread overview: 11+ messages
2018-11-01 15:12 [PATCH] tsc: make calibration refinement more robust Daniel Vacek
2018-11-01 15:34 ` Thomas Gleixner
2018-11-02  6:40   ` Daniel Vacek
2018-11-03 10:04     ` Thomas Gleixner
2018-11-05 15:41       ` Daniel Vacek
2018-11-05 17:10 ` [PATCH v2] x86/tsc: " Daniel Vacek
2018-11-06 20:57   ` [tip:x86/timers] x86/tsc: Make " tip-bot for Daniel Vacek
2019-01-08 13:14     ` Daniel Vacek
2019-01-11 20:45       ` Thomas Gleixner
2019-01-16 15:53         ` Daniel Vacek
2019-01-28 16:00           ` Thomas Gleixner
