xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] x86/time: TSC calibration improvements
@ 2022-01-12  8:53 Jan Beulich
  2022-01-12  8:55 ` [PATCH 1/2] x86/time: use relative counts in calibration loops Jan Beulich
  2022-01-12  8:56 ` [PATCH 2/2] x86/time: improve TSC / CPU freq calibration accuracy Jan Beulich
  0 siblings, 2 replies; 9+ messages in thread
From: Jan Beulich @ 2022-01-12  8:53 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

While it'll take some time to know whether presumably the 2nd these
changes will help with the originally reported issue, there are
immediate positive effects even beyond that issue, which is why I
didn't want to wait with the submission. There's at least one more
issue to take care of (see the respective remark in patch 2), but
that's still under discussion / consideration.

1: use relative counts in calibration loops
2: improve TSC / CPU freq calibration accuracy

Jan



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] x86/time: use relative counts in calibration loops
  2022-01-12  8:53 [PATCH 0/2] x86/time: TSC calibration improvements Jan Beulich
@ 2022-01-12  8:55 ` Jan Beulich
  2022-01-12 10:18   ` Roger Pau Monné
  2022-01-13  9:37   ` Jan Beulich
  2022-01-12  8:56 ` [PATCH 2/2] x86/time: improve TSC / CPU freq calibration accuracy Jan Beulich
  1 sibling, 2 replies; 9+ messages in thread
From: Jan Beulich @ 2022-01-12  8:55 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Looping until reaching/exceeding a certain value is error prone: If the
target value is close enough to the wrapping point, the loop may not
terminate at all. Switch to using delta values, which then allows to
fold the two loops each into just one.

Fixes: 93340297802b ("x86/time: calibrate TSC against platform timer")
Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -378,7 +378,7 @@ static u64 read_hpet_count(void)
 static int64_t __init init_hpet(struct platform_timesource *pts)
 {
     uint64_t hpet_rate, start;
-    uint32_t count, target;
+    uint32_t count, target, elapsed;
     /*
      * Allow HPET to be setup, but report a frequency of 0 so it's not selected
      * as a timer source. This is required so it can be used in legacy
@@ -451,11 +451,8 @@ static int64_t __init init_hpet(struct p
 
     count = hpet_read32(HPET_COUNTER);
     start = rdtsc_ordered();
-    target = count + CALIBRATE_VALUE(hpet_rate);
-    if ( target < count )
-        while ( hpet_read32(HPET_COUNTER) >= count )
-            continue;
-    while ( hpet_read32(HPET_COUNTER) < target )
+    target = CALIBRATE_VALUE(hpet_rate);
+    while ( (elapsed = hpet_read32(HPET_COUNTER) - count) < target )
         continue;
 
     return (rdtsc_ordered() - start) * CALIBRATE_FRAC;
@@ -493,8 +490,8 @@ static u64 read_pmtimer_count(void)
 
 static s64 __init init_pmtimer(struct platform_timesource *pts)
 {
-    u64 start;
-    u32 count, target, mask;
+    uint64_t start;
+    uint32_t count, target, mask, elapsed;
 
     if ( !pmtmr_ioport || (pmtmr_width != 24 && pmtmr_width != 32) )
         return 0;
@@ -504,11 +501,8 @@ static s64 __init init_pmtimer(struct pl
 
     count = inl(pmtmr_ioport) & mask;
     start = rdtsc_ordered();
-    target = count + CALIBRATE_VALUE(ACPI_PM_FREQUENCY);
-    if ( target < count )
-        while ( (inl(pmtmr_ioport) & mask) >= count )
-            continue;
-    while ( (inl(pmtmr_ioport) & mask) < target )
+    target = CALIBRATE_VALUE(ACPI_PM_FREQUENCY);
+    while ( (elapsed = (inl(pmtmr_ioport) & mask) - count) < target )
         continue;
 
     return (rdtsc_ordered() - start) * CALIBRATE_FRAC;



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/2] x86/time: improve TSC / CPU freq calibration accuracy
  2022-01-12  8:53 [PATCH 0/2] x86/time: TSC calibration improvements Jan Beulich
  2022-01-12  8:55 ` [PATCH 1/2] x86/time: use relative counts in calibration loops Jan Beulich
@ 2022-01-12  8:56 ` Jan Beulich
  2022-01-12 10:53   ` Roger Pau Monné
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2022-01-12  8:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

While the problem report was for extreme errors, even smaller ones would
better be avoided: The calculated period to run calibration loops over
can (and usually will) be shorter than the actual time elapsed between
first and last platform timer and TSC reads. Adjust values returned from
the init functions accordingly.

On a Skylake system I've tested this on accuracy (using HPET) went from
detecting in some cases more than 220kHz too high a value to about
±2kHz. On other systems (or on this system, but with PMTMR) the original
error range was much smaller, with less (in some cases only very little)
improvement.

Reported-by: James Dingwall <james-xen@dingwall.me.uk>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
There's still a time window for the issue to occur between the final
HPET/PMTMR read and the following TSC read. Improving this will be the
subject of yet another patch.

TBD: Accuracy could be slightly further improved by using a (to be
     introduced) rounding variant of muldiv64().
TBD: I'm not entirely sure how useful the conditional is - there
     shouldn't be any inaccuracy from the division when actual equals
     target (upon entry to the conditional), as then the divisor is
     what the original value was just multiplied by. And as per the
     logic in the callers actual can't be smaller than target.
TBD: I'm also no longer sure that the helper function is warranted
     anymore. It started out with more contents, but is now
     effectively only the [conditional] muldiv64() invocation.

I'm afraid I don't see a way to deal with the same issue in init_pit().
In particular the (multiple) specs I have to hand don't make clear
whether the counter would continue counting after having reached zero.
Obviously it wouldn't help to check this on a few systems, as their
behavior could still be implementation specific.

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -287,6 +287,23 @@ static char *freq_string(u64 freq)
     return s;
 }
 
+static uint64_t adjust_elapsed(uint64_t elapsed, uint32_t actual,
+                               uint32_t target)
+{
+    if ( likely(actual > target) )
+    {
+        /*
+         * A (perhaps significant) delay before the last timer read (e.g. due
+         * to a SMI or NMI) can lead to (perhaps severe) inaccuracy if not
+         * accounting for the time elapsed beyond the originally calculated
+         * duration of the calibration interval.
+         */
+        elapsed = muldiv64(elapsed, target, actual);
+    }
+
+    return elapsed * CALIBRATE_FRAC;
+}
+
 /************************************************************
  * PLATFORM TIMER 1: PROGRAMMABLE INTERVAL TIMER (LEGACY PIT)
  */
@@ -455,7 +472,7 @@ static int64_t __init init_hpet(struct p
     while ( (elapsed = hpet_read32(HPET_COUNTER) - count) < target )
         continue;
 
-    return (rdtsc_ordered() - start) * CALIBRATE_FRAC;
+    return adjust_elapsed(rdtsc_ordered() - start, elapsed, target);
 }
 
 static void resume_hpet(struct platform_timesource *pts)
@@ -505,7 +522,7 @@ static s64 __init init_pmtimer(struct pl
     while ( (elapsed = (inl(pmtmr_ioport) & mask) - count) < target )
         continue;
 
-    return (rdtsc_ordered() - start) * CALIBRATE_FRAC;
+    return adjust_elapsed(rdtsc_ordered() - start, elapsed, target);
 }
 
 static struct platform_timesource __initdata plt_pmtimer =



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] x86/time: use relative counts in calibration loops
  2022-01-12  8:55 ` [PATCH 1/2] x86/time: use relative counts in calibration loops Jan Beulich
@ 2022-01-12 10:18   ` Roger Pau Monné
  2022-01-13  9:37   ` Jan Beulich
  1 sibling, 0 replies; 9+ messages in thread
From: Roger Pau Monné @ 2022-01-12 10:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Wed, Jan 12, 2022 at 09:55:17AM +0100, Jan Beulich wrote:
> Looping until reaching/exceeding a certain value is error prone: If the
> target value is close enough to the wrapping point, the loop may not
> terminate at all. Switch to using delta values, which then allows to
> fold the two loops each into just one.
> 
> Fixes: 93340297802b ("x86/time: calibrate TSC against platform timer")
> Reported-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] x86/time: improve TSC / CPU freq calibration accuracy
  2022-01-12  8:56 ` [PATCH 2/2] x86/time: improve TSC / CPU freq calibration accuracy Jan Beulich
@ 2022-01-12 10:53   ` Roger Pau Monné
  2022-01-12 11:32     ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2022-01-12 10:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Wed, Jan 12, 2022 at 09:56:12AM +0100, Jan Beulich wrote:
> While the problem report was for extreme errors, even smaller ones would
> better be avoided: The calculated period to run calibration loops over
> can (and usually will) be shorter than the actual time elapsed between
> first and last platform timer and TSC reads. Adjust values returned from
> the init functions accordingly.
> 
> On a Skylake system I've tested this on accuracy (using HPET) went from
> detecting in some cases more than 220kHz too high a value to about
> ±2kHz. On other systems (or on this system, but with PMTMR) the original
> error range was much smaller, with less (in some cases only very little)
> improvement.
> 
> Reported-by: James Dingwall <james-xen@dingwall.me.uk>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> There's still a time window for the issue to occur between the final
> HPET/PMTMR read and the following TSC read. Improving this will be the
> subject of yet another patch.
> 
> TBD: Accuracy could be slightly further improved by using a (to be
>      introduced) rounding variant of muldiv64().

I'm unsure we care that much about such fine grained accuracy here.

> TBD: I'm not entirely sure how useful the conditional is - there
>      shouldn't be any inaccuracy from the division when actual equals
>      target (upon entry to the conditional), as then the divisor is
>      what the original value was just multiplied by. And as per the
>      logic in the callers actual can't be smaller than target.

Right, it's just overhead to do the muldiv64 if target == actual.

> TBD: I'm also no longer sure that the helper function is warranted
>      anymore. It started out with more contents, but is now
>      effectively only the [conditional] muldiv64() invocation.

Don't have a strong opinion, I'm fine with the helper, or else I would
likely request that the call to muldiv64 is not placed together with
the return in order to avoid overly long lines.

> 
> I'm afraid I don't see a way to deal with the same issue in init_pit().
> In particular the (multiple) specs I have to hand don't make clear
> whether the counter would continue counting after having reached zero.
> Obviously it wouldn't help to check this on a few systems, as their
> behavior could still be implementation specific.

We could likely set the counter to the maximum value it can hold
and then perform reads in a loop (like it's done for HPET or the PM
timers) and stop when start - target is reached. Not a great solution
either.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] x86/time: improve TSC / CPU freq calibration accuracy
  2022-01-12 10:53   ` Roger Pau Monné
@ 2022-01-12 11:32     ` Jan Beulich
  2022-01-17  8:40       ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2022-01-12 11:32 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 12.01.2022 11:53, Roger Pau Monné wrote:
> On Wed, Jan 12, 2022 at 09:56:12AM +0100, Jan Beulich wrote:
>> While the problem report was for extreme errors, even smaller ones would
>> better be avoided: The calculated period to run calibration loops over
>> can (and usually will) be shorter than the actual time elapsed between
>> first and last platform timer and TSC reads. Adjust values returned from
>> the init functions accordingly.
>>
>> On a Skylake system I've tested this on accuracy (using HPET) went from
>> detecting in some cases more than 220kHz too high a value to about
>> ±2kHz. On other systems (or on this system, but with PMTMR) the original
>> error range was much smaller, with less (in some cases only very little)
>> improvement.
>>
>> Reported-by: James Dingwall <james-xen@dingwall.me.uk>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> I'm afraid I don't see a way to deal with the same issue in init_pit().
>> In particular the (multiple) specs I have to hand don't make clear
>> whether the counter would continue counting after having reached zero.
>> Obviously it wouldn't help to check this on a few systems, as their
>> behavior could still be implementation specific.
> 
> We could likely set the counter to the maximum value it can hold
> and then perform reads in a loop (like it's done for HPET or the PM
> timers) and stop when start - target is reached. Not a great solution
> either.

Not the least because reading back the counter from the PIT requires
multiple port operations, i.e. is overall quite a bit slower.

Jan



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] x86/time: use relative counts in calibration loops
  2022-01-12  8:55 ` [PATCH 1/2] x86/time: use relative counts in calibration loops Jan Beulich
  2022-01-12 10:18   ` Roger Pau Monné
@ 2022-01-13  9:37   ` Jan Beulich
  2022-01-13 10:44     ` Roger Pau Monné
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2022-01-13  9:37 UTC (permalink / raw)
  To: xen-devel, Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu

On 12.01.2022 09:55, Jan Beulich wrote:
> @@ -504,11 +501,8 @@ static s64 __init init_pmtimer(struct pl
>  
>      count = inl(pmtmr_ioport) & mask;
>      start = rdtsc_ordered();
> -    target = count + CALIBRATE_VALUE(ACPI_PM_FREQUENCY);
> -    if ( target < count )
> -        while ( (inl(pmtmr_ioport) & mask) >= count )
> -            continue;
> -    while ( (inl(pmtmr_ioport) & mask) < target )
> +    target = CALIBRATE_VALUE(ACPI_PM_FREQUENCY);
> +    while ( (elapsed = (inl(pmtmr_ioport) & mask) - count) < target )

I think this is wrong, and instead needs to be

    while ( (elapsed = (inl(pmtmr_ioport) - count) & mask) < target )

There no similar issue with HPET as there we always have full 32 bits
available.

Roger - you gave your R-b. If you agree, I'd like to retain that with
the fix in place. But I'm not going to commit either variant ahead of
hearing back from you.

Jan

>          continue;
>  
>      return (rdtsc_ordered() - start) * CALIBRATE_FRAC;
> 
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] x86/time: use relative counts in calibration loops
  2022-01-13  9:37   ` Jan Beulich
@ 2022-01-13 10:44     ` Roger Pau Monné
  0 siblings, 0 replies; 9+ messages in thread
From: Roger Pau Monné @ 2022-01-13 10:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Thu, Jan 13, 2022 at 10:37:53AM +0100, Jan Beulich wrote:
> On 12.01.2022 09:55, Jan Beulich wrote:
> > @@ -504,11 +501,8 @@ static s64 __init init_pmtimer(struct pl
> >  
> >      count = inl(pmtmr_ioport) & mask;
> >      start = rdtsc_ordered();
> > -    target = count + CALIBRATE_VALUE(ACPI_PM_FREQUENCY);
> > -    if ( target < count )
> > -        while ( (inl(pmtmr_ioport) & mask) >= count )
> > -            continue;
> > -    while ( (inl(pmtmr_ioport) & mask) < target )
> > +    target = CALIBRATE_VALUE(ACPI_PM_FREQUENCY);
> > +    while ( (elapsed = (inl(pmtmr_ioport) & mask) - count) < target )
> 
> I think this is wrong, and instead needs to be
> 
>     while ( (elapsed = (inl(pmtmr_ioport) - count) & mask) < target )
> 
> There no similar issue with HPET as there we always have full 32 bits
> available.
> 
> Roger - you gave your R-b. If you agree, I'd like to retain that with
> the fix in place. But I'm not going to commit either variant ahead of
> hearing back from you.

Indeed, or else overflows past the mask boundary could make the loop
exit early.

Please keep the R-b.

Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] x86/time: improve TSC / CPU freq calibration accuracy
  2022-01-12 11:32     ` Jan Beulich
@ 2022-01-17  8:40       ` Jan Beulich
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-01-17  8:40 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 12.01.2022 12:32, Jan Beulich wrote:
> On 12.01.2022 11:53, Roger Pau Monné wrote:
>> On Wed, Jan 12, 2022 at 09:56:12AM +0100, Jan Beulich wrote:
>>> I'm afraid I don't see a way to deal with the same issue in init_pit().
>>> In particular the (multiple) specs I have to hand don't make clear
>>> whether the counter would continue counting after having reached zero.
>>> Obviously it wouldn't help to check this on a few systems, as their
>>> behavior could still be implementation specific.
>>
>> We could likely set the counter to the maximum value it can hold
>> and then perform reads in a loop (like it's done for HPET or the PM
>> timers) and stop when start - target is reached. Not a great solution
>> either.
> 
> Not the least because reading back the counter from the PIT requires
> multiple port operations, i.e. is overall quite a bit slower.

What's worse - even is programmed to the maximum value (65536) this
timer rolls over every 55ms; as said elsewhere SMIs have been observed
to take significantly longer. I conclude that PIT simply cannot safely
be used on platforms with such long lasting operations. As a further
consequence I wonder whether we wouldn't better calibrate the APIC
timer against the chosen platform timer rather than hardcoding this to
use the PIT.

Jan



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-01-17  8:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-12  8:53 [PATCH 0/2] x86/time: TSC calibration improvements Jan Beulich
2022-01-12  8:55 ` [PATCH 1/2] x86/time: use relative counts in calibration loops Jan Beulich
2022-01-12 10:18   ` Roger Pau Monné
2022-01-13  9:37   ` Jan Beulich
2022-01-13 10:44     ` Roger Pau Monné
2022-01-12  8:56 ` [PATCH 2/2] x86/time: improve TSC / CPU freq calibration accuracy Jan Beulich
2022-01-12 10:53   ` Roger Pau Monné
2022-01-12 11:32     ` Jan Beulich
2022-01-17  8:40       ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).