* What time is it kvm-clock?
@ 2016-02-24  2:31 Owen Hofmann
  2016-02-24  3:57 ` Marcelo Tosatti
                   ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Owen Hofmann @ 2016-02-24  2:31 UTC (permalink / raw)
  To: KVM General
  Cc: Paolo Bonzini, Marcelo Tosatti, Andy Lutomirski, Peter Hornyack

Specifically, what underlying source of time should be exposed through
kvm-clock and other paravirtual ABIs like the HyperV reference tsc
page?  Recently a couple of threads on kvm-list, along with attempts
to produce reliable behavior from kvm-clock on our systems, have
highlighted a tension between the current implementation of kvm-clock
and potentially diverging goals for paravirt time. Here are a few:

1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html

This question is mostly in regards to kvm-clock in masterclock mode
(with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
expose a source of time that is more 'true' than the underlying TSC?
For example, by passing through NTP correction from the host. For the
current implementation, the answer seems to be... why not both? Once
programmed, kvm-clock or the HyperV TSC page will advance with the TSC
multiplied by the frequency specified by kvm. On the other hand,
KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
are measured against corrected time from the host. A guest reading its
pvclock gets a very different result from a host KVM_GET_CLOCK if the
guest has run long enough for TSC to diverge from NTP time. A VMM
using these ioctls to save and restore clock state can produce wild
time jumps from the guest's perspective.
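
A rough illustration of the mismatch, as a Python sketch. The scaling step
follows the pvclock ABI (shift the TSC delta, multiply by a 32.32
fixed-point factor, add the base time). The 2 GHz TSC, the constant 50 ppm
NTP correction, and the helper name host_corrected_ns are assumptions made
up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PvclockParams:
    tsc_timestamp: int      # guest TSC value at the last update
    system_time_ns: int     # clock value in ns at the last update
    tsc_to_system_mul: int  # 32.32 fixed-point ns-per-tick multiplier
    tsc_shift: int          # shift applied to the TSC delta before scaling


def pvclock_read(p, tsc):
    """Guest-side read: advance the base time by the scaled TSC delta."""
    delta = tsc - p.tsc_timestamp
    if p.tsc_shift >= 0:
        delta <<= p.tsc_shift
    else:
        delta >>= -p.tsc_shift
    return p.system_time_ns + ((delta * p.tsc_to_system_mul) >> 32)


def host_corrected_ns(raw_ns, ppm):
    """Hypothetical host clock that runs `ppm` faster than the raw TSC rate."""
    return raw_ns + raw_ns * ppm // 1_000_000


TSC_HZ = 2_000_000_000  # assumed 2 GHz TSC, so 0.5 ns per tick
params = PvclockParams(tsc_timestamp=0, system_time_ns=0,
                       tsc_to_system_mul=1 << 31, tsc_shift=0)

one_day_tsc = 86_400 * TSC_HZ
guest_ns = pvclock_read(params, one_day_tsc)      # what the guest sees
host_ns = host_corrected_ns(guest_ns, ppm=50)     # what a corrected host clock sees
divergence_ns = host_ns - guest_ns
print(divergence_ns / 1e9, "seconds apart after one day")
```

With these assumed numbers the two views end up 4.32 seconds apart after
one day of guest run time, which is the size of jump a save/restore cycle
could then expose.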

The patches in (2) address this mismatch by plumbing updates to clock
frequency through kvm-clock to the guest. This seems like an important
design choice for kvm-clock, and IMO deserves at least a clear
statement of the goals for this interface, if not some more
discussion. The (later) thread in (3) claims that synchronizing with
host time is *not* a goal of kvm-clock.

To me, kvm-clock and the HyperV TSC page are extremely effective as
simply a more enlightened path to the host TSC. Maintaining a
high-performance path to the TSC in the face of updates is tricky -
see the extended comment in pvclock_update_vm_gtod_copy, or the
discussion on the patchset in (2). Is the cost of auditing that the
path from host gettimeofday update -> kvm -> guest pvclock -> guest
gettimeofday both tracks host time correctly and does not produce any
backwards warps worth the added value, if it exists? As an
alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
function of the last update to kvm-clock or the reference TSC page,
respectively, sounds very straightforward.
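
One way to picture that alternative, as a hedged Python sketch: the host
answers the clock query by extrapolating from the tuple it last published
to the guest, so both sides evaluate the same formula. VmClock and its
methods are hypothetical names, not KVM's implementation; the real ioctls
exchange a struct kvm_clock_data.

```python
def scale_tsc_delta(delta, mul, shift):
    """pvclock-style scaling: shift, then multiply by a 32.32 fixed-point factor."""
    if shift >= 0:
        delta <<= shift
    else:
        delta >>= -shift
    return (delta * mul) >> 32


class VmClock:
    """Hypothetical host-side record of the last kvm-clock update for one VM."""

    def __init__(self, mul, shift):
        self.mul = mul
        self.shift = shift
        self.tsc_timestamp = 0
        self.system_time_ns = 0

    def guest_read(self, tsc):
        # What the guest computes from its pvclock page.
        delta = tsc - self.tsc_timestamp
        return self.system_time_ns + scale_tsc_delta(delta, self.mul, self.shift)

    def get_clock(self, tsc):
        # Stand-in for KVM_GET_CLOCK: same tuple, same formula as the guest.
        return self.guest_read(tsc)

    def set_clock(self, clock_ns, tsc):
        # Stand-in for KVM_SET_CLOCK: rebase the tuple at the current TSC.
        self.tsc_timestamp = tsc
        self.system_time_ns = clock_ns


MUL_2GHZ = 1 << 31  # assumed 2 GHz TSC, 0.5 ns per tick

source = VmClock(MUL_2GHZ, 0)
tsc_at_save = 10 * 2_000_000_000      # ten seconds of guest run time
saved_ns = source.get_clock(tsc_at_save)

dest = VmClock(MUL_2GHZ, 0)
tsc_at_restore = 123_456              # arbitrary guest TSC on the destination
dest.set_clock(saved_ns, tsc_at_restore)

# Difference between the guest's view just after restore and just before save.
jump_ns = dest.guest_read(tsc_at_restore) - source.guest_read(tsc_at_save)
```

Because the saved value comes from the same tuple the guest reads, the
guest sees no jump across save and restore in this model, however far host
corrected time has drifted from the TSC.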

(Outside of masterclock mode, the requirement that the client
synchronizes across cpus for monotonicity smooths over a lot of
complexity - periodically updating kvm-clock to the current time is
simple and works.)

Regardless of my opinion, I think that a clear statement of the design
goals for kvm-clock (and kvm's implementation of the reference TSC
page) would be valuable.


* Re: What time is it kvm-clock?
  2016-02-24  2:31 What time is it kvm-clock? Owen Hofmann
@ 2016-02-24  3:57 ` Marcelo Tosatti
  2016-02-24 17:35   ` Peter Hornyack
  2016-02-24  3:59 ` Marcelo Tosatti
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-24  3:57 UTC (permalink / raw)
  To: Owen Hofmann; +Cc: KVM General, Paolo Bonzini, Andy Lutomirski, Peter Hornyack

On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> Specifically, what underlying source of time should be exposed through
> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> page?  Recently a couple of threads on kvm-list, along with attempts
> to produce reliable behavior from kvm-clock on our systems have
> highlighted a tension between the current implementation of kvm-clock
> and potentially diverging goals for paravirt time. Here are a few:
> 
> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> 
> This question is mostly in regards to kvm-clock in masterclock mode
> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> expose a source of time that is more 'true' than the underlying TSC?
> For example, by passing through NTP correction from the host. For the
> current implementation, the answer seems to be... why not both? Once
> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> multiplied by the frequency specified by kvm. On the other hand,
> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> are measured against corrected time from the host. A guest reading its
> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> guest has run long enough for TSC to diverge from NTP time. A VMM
> using these ioctls to save and restore clock state can produce wild
> time jumps from the guest's perspective.
> 
> The patches in (2) address this mismatch by plumbing updates to clock
> frequency through kvm-clock to the guest. This seems like an important
> design choice for kvm-clock, and IMO deserves at least a clear
> statement of the goals for this interface, if not some more
> discussion. 

Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK? 

The interfaces have been introduced to fix a bug.

> The (later) thread in (3) claims that synchronizing with
> host time is *not* a goal of kvm-clock.

It is not.

> To me, kvm-clock and the HyperV TSC page are extremely effective as
> simply a more enlightened path to the host TSC. Maintaining a
> high-performance path to the TSC in the face of updates is tricky -
> see the extended comment in pvclock_update_vm_gtod_copy, or the
> discussion on the patchset in (2). Is the cost of auditing that the
> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> gettimeofday both tracks host time correctly and does not produce any
> backwards warps worth the added value, if it exists? As an
> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> function of the last update to kvm-clock or the reference TSC page,
> respectively, sounds very straightforward.
>
> (Outside of masterclock mode, the requirement that the client
> synchronizes across cpus for monotonicity smooths over a lot of
> complexity - periodically updating kvm-clock to the current time is
> simple and works.)
> 
> Regardless of my opinion, I think that a clear statement of the design
> goals for kvm-clock (and kvm's implementation of the reference TSC
> page) would be valuable.

Documentation/virtual/kvm/timekeeping.txt



* Re: What time is it kvm-clock?
  2016-02-24  2:31 What time is it kvm-clock? Owen Hofmann
  2016-02-24  3:57 ` Marcelo Tosatti
@ 2016-02-24  3:59 ` Marcelo Tosatti
  2016-02-24 14:14 ` Paolo Bonzini
  2016-02-26 15:04 ` Marcelo Tosatti
  3 siblings, 0 replies; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-24  3:59 UTC (permalink / raw)
  To: Owen Hofmann; +Cc: KVM General, Paolo Bonzini, Andy Lutomirski, Peter Hornyack

On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> Specifically, what underlying source of time should be exposed through
> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> page?  Recently a couple of threads on kvm-list, along with attempts
> to produce reliable behavior from kvm-clock 

Please report your findings if you see issues in kvm-clock. Which
"unreliable behavior"?



* Re: What time is it kvm-clock?
  2016-02-24  2:31 What time is it kvm-clock? Owen Hofmann
  2016-02-24  3:57 ` Marcelo Tosatti
  2016-02-24  3:59 ` Marcelo Tosatti
@ 2016-02-24 14:14 ` Paolo Bonzini
  2016-02-24 16:44   ` Andy Lutomirski
  2016-02-26 15:04 ` Marcelo Tosatti
  3 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2016-02-24 14:14 UTC (permalink / raw)
  To: Owen Hofmann, KVM General
  Cc: Marcelo Tosatti, Andy Lutomirski, Peter Hornyack



On 24/02/2016 03:31, Owen Hofmann wrote:
> Specifically, what underlying source of time should be exposed through
> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> page?  Recently a couple of threads on kvm-list, along with attempts
> to produce reliable behavior from kvm-clock on our systems have
> highlighted a tension between the current implementation of kvm-clock
> and potentially diverging goals for paravirt time. Here are a few:
> 
> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> 
> This question is mostly in regards to kvm-clock in masterclock mode
> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> expose a source of time that is more 'true' than the underlying TSC?
> For example, by passing through NTP correction from the host. For the
> current implementation, the answer seems to be... why not both? Once
> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> multiplied by the frequency specified by kvm. On the other hand,
> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> are measured against corrected time from the host. A guest reading its
> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> guest has run long enough for TSC to diverge from NTP time.

Right, in fact that's why QEMU is not really using KVM_GET_CLOCK
anymore.  In retrospect, the "fix" in QEMU was probably a bad idea.  It
would have been better to fix KVM_GET_CLOCK.

> To me, kvm-clock and the HyperV TSC page are extremely effective as
> simply a more enlightened path to the host TSC. Maintaining a
> high-performance path to the TSC in the face of updates is tricky -
> see the extended comment in pvclock_update_vm_gtod_copy, or the
> discussion on the patchset in (2). Is the cost of auditing that the
> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> gettimeofday both tracks host time correctly and does not produce any
> backwards warps worth the added value, if it exists? As an
> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> function of the last update to kvm-clock or the reference TSC page,
> respectively, sounds very straightforward.

Yes, we could do that too.

I think that vgettsc and do_monotonic_boot also would have to use the
TSC frequency instead of the NTP-adjusted host clock.

> (Outside of masterclock mode, the requirement that the client
> synchronizes across cpus for monotonicity smooths over a lot of
> complexity - periodically updating kvm-clock to the current time is
> simple and works.)
> 
> Regardless of my opinion, I think that a clear statement of the design
> goals for kvm-clock (and kvm's implementation of the reference TSC
> page) would be valuable.

Since we cannot change the past, having kvmclock synchronize with the
host TSC frequency is the only choice we can make.

Paolo


* Re: What time is it kvm-clock?
  2016-02-24 14:14 ` Paolo Bonzini
@ 2016-02-24 16:44   ` Andy Lutomirski
  2016-02-24 17:38     ` Marcelo Tosatti
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2016-02-24 16:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Owen Hofmann, KVM General, Marcelo Tosatti, Peter Hornyack

On Wed, Feb 24, 2016 at 6:14 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 24/02/2016 03:31, Owen Hofmann wrote:
>> Specifically, what underlying source of time should be exposed through
>> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
>> page?  Recently a couple of threads on kvm-list, along with attempts
>> to produce reliable behavior from kvm-clock on our systems have
>> highlighted a tension between the current implementation of kvm-clock
>> and potentially diverging goals for paravirt time. Here are a few:
>>
>> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
>> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
>> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
>>
>> This question is mostly in regards to kvm-clock in masterclock mode
>> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
>> expose a source of time that is more 'true' than the underlying TSC?
>> For example, by passing through NTP correction from the host. For the
>> current implementation, the answer seems to be... why not both? Once
>> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
>> multiplied by the frequency specified by kvm. On the other hand,
>> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
>> are measured against corrected time from the host. A guest reading its
>> pvclock gets a very different result from a host KVM_GET_CLOCK if the
>> guest has run long enough for TSC to diverge from NTP time.
>
> Right, in fact that's why QEMU is not really using KVM_GET_CLOCK
> anymore.  In retrospect, the "fix" in QEMU was probably a bad idea.  It
> would have been better to fix KVM_GET_CLOCK.
>
>> To me, kvm-clock and the HyperV TSC page are extremely effective as
>> simply a more enlightened path to the host TSC. Maintaining a
>> high-performance path to the TSC in the face of updates is tricky -
>> see the extended comment in pvclock_update_vm_gtod_copy, or the
>> discussion on the patchset in (2). Is the cost of auditing that the
>> path from host gettimeofday update -> kvm -> guest pvclock -> guest
>> gettimeofday both tracks host time correctly and does not produce any
>> backwards warps worth the added value, if it exists? As an
>> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
>> function of the last update to kvm-clock or the reference TSC page,
>> respectively, sounds very straightforward.
>
> Yes, we could do that too.
>
> I think that vgettsc and do_monotonic_boot also would have to use the
> TSC frequency instead of the NTP-adjusted host clock.
>
>> (Outside of masterclock mode, the requirement that the client
>> synchronizes across cpus for monotonicity smooths over a lot of
>> complexity - periodically updating kvm-clock to the current time is
>> simple and works.)
>>
>> Regardless of my opinion, I think that a clear statement of the design
>> goals for kvm-clock (and kvm's implementation of the reference TSC
>> page) would be valuable.
>
> Since we cannot change the past, having kvmclock synchronize with the
> host TSC frequency is the only choice we can make.
>

Could we introduce a new kvm-clock or perhaps opt-in mode that:

a) uses hypervisor-supplied IO pages and,

b) synchronizes to host CLOCK_MONOTONIC instead of some bizarre
non-suspend-resume-safe not-really-well-defined hybrid?

--Andy


* Re: What time is it kvm-clock?
  2016-02-24  3:57 ` Marcelo Tosatti
@ 2016-02-24 17:35   ` Peter Hornyack
  2016-02-24 20:17     ` Radim Krčmář
  2016-02-24 23:35     ` Marcelo Tosatti
  0 siblings, 2 replies; 29+ messages in thread
From: Peter Hornyack @ 2016-02-24 17:35 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Owen Hofmann, KVM General, Paolo Bonzini, Andy Lutomirski

On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
>> Specifically, what underlying source of time should be exposed through
>> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
>> page?  Recently a couple of threads on kvm-list, along with attempts
>> to produce reliable behavior from kvm-clock on our systems have
>> highlighted a tension between the current implementation of kvm-clock
>> and potentially diverging goals for paravirt time. Here are a few:
>>
>> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
>> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
>> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
>>
>> This question is mostly in regards to kvm-clock in masterclock mode
>> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
>> expose a source of time that is more 'true' than the underlying TSC?
>> For example, by passing through NTP correction from the host. For the
>> current implementation, the answer seems to be... why not both? Once
>> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
>> multiplied by the frequency specified by kvm. On the other hand,
>> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
>> are measured against corrected time from the host. A guest reading its
>> pvclock gets a very different result from a host KVM_GET_CLOCK if the
>> guest has run long enough for TSC to diverge from NTP time. A VMM
>> using these ioctls to save and restore clock state can produce wild
>> time jumps from the guest's perspective.
>>
>> The patches in (2) address this mismatch by plumbing updates to clock
>> frequency through kvm-clock to the guest. This seems like an important
>> design choice for kvm-clock, and IMO deserves at least a clear
>> statement of the goals for this interface, if not some more
>> discussion.
>
> Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
>
> The interfaces have been introduced to fix a bug.
>
>> The (later) thread in (3) claims that synchronizing with
>> host time is *not* a goal of kvm-clock.
>
> It is not.
>
>> To me, kvm-clock and the HyperV TSC page are extremely effective as
>> simply a more enlightened path to the host TSC. Maintaining a
>> high-performance path to the TSC in the face of updates is tricky -
>> see the extended comment in pvclock_update_vm_gtod_copy, or the
>> discussion on the patchset in (2). Is the cost of auditing that the
>> path from host gettimeofday update -> kvm -> guest pvclock -> guest
>> gettimeofday both tracks host time correctly and does not produce any
>> backwards warps worth the added value, if it exists? As an
>> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
>> function of the last update to kvm-clock or the reference TSC page,
>> respectively, sounds very straightforward.
>>
>> (Outside of masterclock mode, the requirement that the client
>> synchronizes across cpus for monotonicity smooths over a lot of
>> complexity - periodically updating kvm-clock to the current time is
>> simple and works.)
>>
>> Regardless of my opinion, I think that a clear statement of the design
>> goals for kvm-clock (and kvm's implementation of the reference TSC
>> page) would be valuable.
>
> Documentation/virtual/kvm/timekeeping.txt
>

Hi Marcelo,

While I appreciate all of the detail in timekeeping.txt, it is not a
very good reference for what kvm-clock is or how it works. kvm-clock
is only mentioned three times in different places throughout that
document, and nowhere is there a very clear statement of what
kvm-clock is supposed to do or how it does it.

For somebody that does not already have a deep understanding of the
core masterclock code, trying to understand how kvm-clock works is a
real challenge.

Thanks,
Peter


* Re: What time is it kvm-clock?
  2016-02-24 16:44   ` Andy Lutomirski
@ 2016-02-24 17:38     ` Marcelo Tosatti
  2016-02-24 19:38       ` Andy Lutomirski
  0 siblings, 1 reply; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-24 17:38 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Paolo Bonzini, Owen Hofmann, KVM General, Peter Hornyack

On Wed, Feb 24, 2016 at 08:44:40AM -0800, Andy Lutomirski wrote:
> On Wed, Feb 24, 2016 at 6:14 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> >
> > On 24/02/2016 03:31, Owen Hofmann wrote:
> >> Specifically, what underlying source of time should be exposed through
> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> >> page?  Recently a couple of threads on kvm-list, along with attempts
> >> to produce reliable behavior from kvm-clock on our systems have
> >> highlighted a tension between the current implementation of kvm-clock
> >> and potentially diverging goals for paravirt time. Here are a few:
> >>
> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> >>
> >> This question is mostly in regards to kvm-clock in masterclock mode
> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> >> expose a source of time that is more 'true' than the underlying TSC?
> >> For example, by passing through NTP correction from the host. For the
> >> current implementation, the answer seems to be... why not both? Once
> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> >> multiplied by the frequency specified by kvm. On the other hand,
> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> >> are measured against corrected time from the host. A guest reading its
> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> >> guest has run long enough for TSC to diverge from NTP time.
> >
> > Right, in fact that's why QEMU is not really using KVM_GET_CLOCK
> > anymore.  In retrospect, the "fix" in QEMU was probably a bad idea.  It
> > would have been better to fix KVM_GET_CLOCK.
> >
> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
> >> simply a more enlightened path to the host TSC. Maintaining a
> >> high-performance path to the TSC in the face of updates is tricky -
> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
> >> discussion on the patchset in (2). Is the cost of auditing that the
> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> >> gettimeofday both tracks host time correctly and does not produce any
> >> backwards warps worth the added value, if it exists? As an
> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> >> function of the last update to kvm-clock or the reference TSC page,
> >> respectively, sounds very straightforward.
> >
> > Yes, we could do that too.
> >
> > I think that vgettsc and do_monotonic_boot also would have to use the
> > TSC frequency instead of the NTP-adjusted host clock.
> >
> >> (Outside of masterclock mode, the requirement that the client
> >> synchronizes across cpus for monotonicity smooths over a lot of
> >> complexity - periodically updating kvm-clock to the current time is
> >> simple and works.)
> >>
> >> Regardless of my opinion, I think that a clear statement of the design
> >> goals for kvm-clock (and kvm's implementation of the reference TSC
> >> page) would be valuable.
> >
> > Since we cannot change the past, having kvmclock synchronize with the
> > host TSC frequency is the only choice we can make.
> >
> 
> Could we introduce a new kvm-clock or perhaps opt-in mode that:
> 
> a) uses hypervisor-supplied IO pages and,
> 
> b) synchronizes to host CLOCK_MONOTONIC instead of some bizarre
> non-suspend-resume-safe 

Please be accurate. It is suspend safe.

> not-really-well-defined hybrid?
> 
> --Andy

1. What is not well defined? I fail to spot anything 
specific in Owen's e-mail.

2. What is the problem you're trying to solve? 
Is there a visible problem? (Paolo has a pending fix for an issue
that could trigger a time-going-backwards event.)





* Re: What time is it kvm-clock?
  2016-02-24 17:38     ` Marcelo Tosatti
@ 2016-02-24 19:38       ` Andy Lutomirski
  2016-02-24 19:44         ` Paolo Bonzini
  2016-02-24 19:55         ` Owen Hofmann
  0 siblings, 2 replies; 29+ messages in thread
From: Andy Lutomirski @ 2016-02-24 19:38 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Paolo Bonzini, Owen Hofmann, KVM General, Peter Hornyack, Joao Martins

On Wed, Feb 24, 2016 at 9:38 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Wed, Feb 24, 2016 at 08:44:40AM -0800, Andy Lutomirski wrote:
>> On Wed, Feb 24, 2016 at 6:14 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> >
>> >
>> > On 24/02/2016 03:31, Owen Hofmann wrote:
>> >> Specifically, what underlying source of time should be exposed through
>> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
>> >> page?  Recently a couple of threads on kvm-list, along with attempts
>> >> to produce reliable behavior from kvm-clock on our systems have
>> >> highlighted a tension between the current implementation of kvm-clock
>> >> and potentially diverging goals for paravirt time. Here are a few:
>> >>
>> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
>> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
>> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
>> >>
>> >> This question is mostly in regards to kvm-clock in masterclock mode
>> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
>> >> expose a source of time that is more 'true' than the underlying TSC?
>> >> For example, by passing through NTP correction from the host. For the
>> >> current implementation, the answer seems to be... why not both? Once
>> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
>> >> multiplied by the frequency specified by kvm. On the other hand,
>> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
>> >> are measured against corrected time from the host. A guest reading its
>> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
>> >> guest has run long enough for TSC to diverge from NTP time.
>> >
>> > Right, in fact that's why QEMU is not really using KVM_GET_CLOCK
>> > anymore.  In retrospect, the "fix" in QEMU was probably a bad idea.  It
>> > would have been better to fix KVM_GET_CLOCK.
>> >
>> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
>> >> simply a more enlightened path to the host TSC. Maintaining a
>> >> high-performance path to the TSC in the face of updates is tricky -
>> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
>> >> discussion on the patchset in (2). Is the cost of auditing that the
>> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
>> >> gettimeofday both tracks host time correctly and does not produce any
>> >> backwards warps worth the added value, if it exists? As an
>> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
>> >> function of the last update to kvm-clock or the reference TSC page,
>> >> respectively, sounds very straightforward.
>> >
>> > Yes, we could do that too.
>> >
>> > I think that vgettsc and do_monotonic_boot also would have to use the
>> > TSC frequency instead of the NTP-adjusted host clock.
>> >
>> >> (Outside of masterclock mode, the requirement that the client
>> >> synchronizes across cpus for monotonicity smooths over a lot of
>> >> complexity - periodically updating kvm-clock to the current time is
>> >> simple and works.)
>> >>
>> >> Regardless of my opinion, I think that a clear statement of the design
>> >> goals for kvm-clock (and kvm's implementation of the reference TSC
>> >> page) would be valuable.
>> >
>> > Since we cannot change the past, having kvmclock synchronize with the
>> > host TSC frequency is the only choice we can make.
>> >
>>
>> Could we introduce a new kvm-clock or perhaps opt-in mode that:
>>
>> a) uses hypervisor-supplied IO pages and,
>>
>> b) synchronizes to host CLOCK_MONOTONIC instead of some bizarre
>> non-suspend-resume-safe
>
> Please be accurate. It is suspend safe.
>

I'm being accurate enough, I think.  Master clock mode is not suspend
safe.  When I suspend and resume my laptop, the master clock code
determines that it messed up and disables itself.  Unloading and
reloading the kvm modules turns it back on until the next suspend.

I *think* that the underlying issue is that kvm-clock's master clock
tracks something ill-defined instead of exposing a well-defined host
clock.  If the master clock accurately exposed CLOCK_MONOTONIC_RAW or
CLOCK_MONOTONIC (I much prefer the latter), then it would be fine
across suspend/resume.
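
For reference, the two host clocks named here can be compared from
userspace. A small Python sketch, assuming a Linux host where
time.CLOCK_MONOTONIC_RAW is available (it falls back to CLOCK_MONOTONIC
otherwise); the 0.2 s sample interval is arbitrary, and the result is
noisy because the two clocks are not read at the same instant.

```python
import time

# CLOCK_MONOTONIC is slewed by NTP; CLOCK_MONOTONIC_RAW is not.
RAW_ID = getattr(time, "CLOCK_MONOTONIC_RAW", time.CLOCK_MONOTONIC)


def sample():
    """Read both clocks back to back, in nanoseconds."""
    return (time.clock_gettime_ns(time.CLOCK_MONOTONIC),
            time.clock_gettime_ns(RAW_ID))


def rate_difference_ppm(interval_s=0.2):
    """Estimate how much faster CLOCK_MONOTONIC ran than the raw clock, in ppm."""
    mono0, raw0 = sample()
    time.sleep(interval_s)
    mono1, raw1 = sample()
    mono_elapsed = mono1 - mono0
    raw_elapsed = raw1 - raw0
    return (mono_elapsed - raw_elapsed) * 1_000_000 / raw_elapsed


ppm = rate_difference_ppm()
print(f"CLOCK_MONOTONIC vs raw: {ppm:+.1f} ppm over the sample interval")
```

A longer interval gives a steadier estimate of the NTP slew that separates
the two clocks.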

I think that part of the reason that it doesn't accurately export a
host clock is that the worst-case performance of atomic updates to the
pvclock data structures is abysmal due to having the data structures
living in guest memory.  To be able to access and update all relevant
structures during host clock refreshes, the host would need to pin all
the pvclock pages for all running guests.  This could be partially
mitigated by only updating pvclock data for running vcpus and for vcpu
0 for all running guests synchronously and deferring the rest (8k
pinned per host cpu, max), but it would still be a mess.

If someone redefined the interface so that the *host* could allocate
it, then the pages could be shared across all guests and this would be
vastly simpler and faster.

Also, kvm-clock should really coordinate with the core timekeeping
code to handle this sort of time base export rather than hooking into
the host vdso support code.

>> not-really-well-defined hybrid?
>>
>> --Andy
>
> 1. What is not well defined? I fail to spot anything
> specific in Owen's e-mail.

If I start a guest and query kvm-clock, I get a nanosecond count.
AFAIK it is, in fact, ill-defined or at least ill-documented what that
nanosecond count means.

[cc: Joao.  Xen may want to take this stuff into consideration.]

--Andy


* Re: What time is it kvm-clock?
  2016-02-24 19:38       ` Andy Lutomirski
@ 2016-02-24 19:44         ` Paolo Bonzini
  2016-02-24 19:52           ` Andy Lutomirski
  2016-02-24 19:55         ` Owen Hofmann
  1 sibling, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2016-02-24 19:44 UTC (permalink / raw)
  To: Andy Lutomirski, Marcelo Tosatti
  Cc: Owen Hofmann, KVM General, Peter Hornyack, Joao Martins



On 24/02/2016 20:38, Andy Lutomirski wrote:
> If the master clock accurately exposed CLOCK_MONOTONIC_RAW or
> CLOCK_MONOTONIC (I much prefer the latter), then it would be fine
> across suspend/resume.

Here we already have a conflict... Owen says he prefers the master clock
to expose the (stable) TSC, you say you prefer CLOCK_MONOTONIC.

I for one _thought_ CLOCK_MONOTONIC would have been my choice, but I'm
not so sure about it and I'm also not sure it's possible to do it
efficiently.  The mult/shift/offset tuple potentially can change every
tick, and it would be bad to do such an update #vms times per tick (or
worse, #vcpus times per tick).
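
To put rough numbers on that concern, a back-of-the-envelope Python
sketch. The tick rate, VM count, and vCPU count are invented for
illustration and are not measurements.

```python
HZ = 1000          # assumed host tick rate
NUM_VMS = 100      # assumed number of running guests
VCPUS_PER_VM = 8   # assumed vCPUs per guest


def updates_per_second(hz, copies_per_tick):
    """Tuple rewrites per second if every tick touches `copies_per_tick` copies."""
    return hz * copies_per_tick


per_vm = updates_per_second(HZ, NUM_VMS)
per_vcpu = updates_per_second(HZ, NUM_VMS * VCPUS_PER_VM)
print(per_vm, "updates/s with one copy per VM")
print(per_vcpu, "updates/s with one copy per vCPU")
```

Under these assumptions that is 100,000 rewrites per second with one copy
per VM and 800,000 with one copy per vCPU.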

Paolo


* Re: What time is it kvm-clock?
  2016-02-24 19:44         ` Paolo Bonzini
@ 2016-02-24 19:52           ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2016-02-24 19:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Marcelo Tosatti, Owen Hofmann, KVM General, Peter Hornyack, Joao Martins

On Wed, Feb 24, 2016 at 11:44 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 24/02/2016 20:38, Andy Lutomirski wrote:
>> If the master clock accurately exposed CLOCK_MONOTONIC_RAW or
>> CLOCK_MONOTONIC (I much prefer the latter), then it would be fine
>> across suspend/resume.
>
> Here we already have a conflict... Owen says he prefers the master clock
> to expose the (stable) TSC, you say you prefer CLOCK_MONOTONIC.
>
> I for one _thought_ CLOCK_MONOTONIC would have been my choice, but I'm
> not so sure about it and I'm also not sure it's possible to do it
> efficiently.

I think it should be configurable.

>  The mult/shift/offset tuple potentially can change every
> tick, and it would be bad to do such an update #vms times per tick (or
> worse, #vcpus times per tick).

If the interface were sane, it would be a single update per tick that
would update a data structure that all guests would share.  Arguably
we ought to expose all three useful clocks (MONOTONIC, MONOTONIC_RAW,
and REALTIME) in the same page.  (The vclock code already does this for
MONOTONIC and REALTIME.)

Each vm would have a *different* page that had information like the
guest-boot-to-monotonic offset.
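
A rough model of that layout, purely as a sketch: SharedClockPage, VmClockPage and all of their fields are hypothetical names invented for illustration, and no such interface exists in KVM. It shows the host refreshing one shared structure per tick while each VM carries only a constant offset. The versioning needed for lock-free reads is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ClockParams:
    """Scaling for one host clock (hypothetical layout)."""
    base_ns: int   # clock value at base_tsc
    base_tsc: int  # TSC value when base_ns was sampled
    mult: int      # 32.32 fixed-point nanoseconds per TSC tick


@dataclass
class SharedClockPage:
    """One host-owned page, mapped read-only into every guest."""
    clocks: dict = field(default_factory=dict)
    updates: int = 0

    def host_tick(self, name: str, params: ClockParams) -> None:
        # One write per tick, independent of the number of VMs or vCPUs.
        self.clocks[name] = params
        self.updates += 1

    def read_ns(self, name: str, tsc_now: int) -> int:
        p = self.clocks[name]
        return p.base_ns + (((tsc_now - p.base_tsc) * p.mult) >> 32)


@dataclass(frozen=True)
class VmClockPage:
    """Per-VM page: only the state that differs between guests."""
    boot_to_monotonic_ns: int  # host MONOTONIC value when the guest booted

    def guest_monotonic_ns(self, shared: SharedClockPage, tsc_now: int) -> int:
        return shared.read_ns("MONOTONIC", tsc_now) - self.boot_to_monotonic_ns


shared = SharedClockPage()
# mult = 2**32 means exactly 1 ns per tick (an assumed 1 GHz TSC).
shared.host_tick("MONOTONIC", ClockParams(base_ns=10_000, base_tsc=0, mult=1 << 32))
vms = [VmClockPage(boot_to_monotonic_ns=off) for off in (0, 4_000, 9_000)]
readings = [vm.guest_monotonic_ns(shared, tsc_now=500) for vm in vms]
print(readings, shared.updates)
```

Three guests get three different readings from a single host-side update, which is the property the shared page is meant to provide.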

>
> Paolo



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24 19:38       ` Andy Lutomirski
  2016-02-24 19:44         ` Paolo Bonzini
@ 2016-02-24 19:55         ` Owen Hofmann
  2016-02-25 12:22           ` Joao Martins
  2016-02-25 12:22           ` Joao Martins
  1 sibling, 2 replies; 29+ messages in thread
From: Owen Hofmann @ 2016-02-24 19:55 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Marcelo Tosatti, Paolo Bonzini, KVM General, Peter Hornyack,
	Joao Martins

>>> not-really-well-defined hybrid?
>>>
>>> --Andy
>>
>> 1. What is not well defined? I fail to spot anything
>> specific in Owen's e-mail.
>
> If I start a guest and query kvm-clock, I get a nanosecond count.
> AFAIK it is, in fact, ill-defined or at least ill-documented what that
> nanosecond count means.

To try to put the thoughts into specific questions:
- What is the value returned by KVM_GET_CLOCK? How should it be used?
- What is the value returned by a guest read of the kvm-clock
structure? (This is also Andy's question)
To me there are two possibilities for how to answer the second question:
1) kvm-clock is better than the host TSC: it propagates updates to
frequency from the host (== CLOCK_MONOTONIC)
2) kvm-clock is a paravirtual source of truth on the guest TSC:
whether it is stable and its approximate frequency. If the guest needs
to synchronize to an external source of time, it runs NTP. (==
CLOCK_MONOTONIC_RAW)

To me, (1) sounds hard, (2) sounds easy, and it's not clear how much
additional value (1) provides. The recent patches Paolo sent move
kvm-clock in the direction of (1), and it sounds like Andy and I might
have slightly different opinions as well. But mostly I would like some
clarity as to which is the stated goal for kvm-clock, and to have the
implementation pick only one of those options.
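
To give a feel for what separates the two options, here is a small arithmetic sketch of how far an uncorrected clock departs from a corrected one. The 50 ppm error and the one-day interval are assumed figures chosen for illustration, not measurements from any host.

```python
def divergence_ns(elapsed_s: int, freq_error_ppm: int) -> int:
    """Nanoseconds by which an uncorrected clock departs from a corrected
    one after elapsed_s seconds, given a constant frequency error."""
    # elapsed_s * 1e9 ns/s * ppm * 1e-6 simplifies to elapsed_s * ppm * 1000
    return elapsed_s * freq_error_ppm * 1_000


one_day_s = 24 * 60 * 60
print(divergence_ns(one_day_s, 50) / 1e9)  # seconds of divergence after one day
```

Under option (1) the host absorbs this drift on the guest's behalf. Under option (2) the guest sees it and corrects it with its own NTP daemon.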

>>> > Since we cannot change the past, having kvmclock synchronize with the
>>> > host TSC frequency is the only choice we can make.'

I'm not sure I understand what previous decision locks kvm-clock into
the current path. Can you clarify?

On Wed, Feb 24, 2016 at 11:38 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Wed, Feb 24, 2016 at 9:38 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> On Wed, Feb 24, 2016 at 08:44:40AM -0800, Andy Lutomirski wrote:
>>> On Wed, Feb 24, 2016 at 6:14 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> >
>>> >
>>> > On 24/02/2016 03:31, Owen Hofmann wrote:
>>> >> Specifically, what underlying source of time should be exposed through
>>> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
>>> >> page?  Recently a couple of threads on kvm-list, along with attempts
>>> >> to produce reliable behavior from kvm-clock on our systems have
>>> >> highlighted a tension between the current implementation of kvm-clock
>>> >> and potentially diverging goals for paravirt time. Here are a few:
>>> >>
>>> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
>>> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
>>> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
>>> >>
>>> >> This question is mostly in regards to kvm-clock in masterclock mode
>>> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
>>> >> expose a source of time that is more 'true' than the underlying TSC?
>>> >> For example, by passing through NTP correction from the host. For the
>>> >> current implementation, the answer seems to be... why not both? Once
>>> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
>>> >> multiplied by the frequency specified by kvm. On the other hand,
>>> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
>>> >> are measured against corrected time from the host. A guest reading its
>>> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
>>> >> guest has run long enough to for TSC to diverge from NTP time.
>>> >
>>> > Right, in fact that's why QEMU is not really using KVM_GET_CLOCK
>>> > anymore.  In retrospect, the "fix" in QEMU was probably a bad idea.  It
>>> > would have been better to fix KVM_GET_CLOCK.
>>> >
>>> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
>>> >> simply a more enlightened path to the host TSC. Maintaining a
>>> >> high-performance path to the TSC in the face of updates is tricky -
>>> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
>>> >> discussion on the patchset in (2). Is the cost of auditing that the
>>> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
>>> >> gettimeofday both tracks host time correctly and does not produce any
>>> >> backwards warps worth the added value, if it exists? As an
>>> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
>>> >> function of the last update to kvm-clock or the reference TSC page,
>>> >> respectively, sounds very straightforward.
>>> >
>>> > Yes, we could do that too.
>>> >
>>> > I think that vgettsc and do_monotonic_boot also would have to use the
>>> > TSC frequency instead the NTP-adjusted host clock.
>>> >
>>> >> (Outside of masterclock mode, the requirement that the client
>>> >> synchronizes across cpus for montonicity smoothes over a lot of
>>> >> complexity - periodically updating kvm-clock to the current time is
>>> >> simple and works.)
>>> >>
>>> >> Regardless of my opinion, I think that a clear statement of the design
>>> >> goals for kvm-clock (and kvm's implementation of the reference TSC
>>> >> page) would be valuable.
>>> >
>>> > Since we cannot change the past, having kvmclock synchronize with the
>>> > host TSC frequency is the only choice we can make.
>>> >
>>>
>>> Could we introduce a new kvm-clock or perhaps opt-in mode that:
>>>
>>> a) uses hypervisor-supplied IO pages and,
>>>
>>> b) synchronizes to host CLOCK_MONOTONIC instead of some bizarre
>>> non-suspend-resume-safe
>>
>> Please be accurate. It is suspend safe.
>>
>
> I'm being accurate enough, I think.  Master clock mode is not suspend
> safe.  When I suspend and resume my laptop, the master clock code
> determines that it messed up and disables itself.  Unloading and
> reloading the kvm modules turns it back on until the next suspect.
>
> I *think* that the underlying issue is that kvm-clock's master clock
> tracks something ill-defined instead of exposing a well-defined host
> clock.  If the master clock accurately exposed CLOCK_MONOTONIC_RAW or
> CLOCK_MONOTONIC (I much prefer the latter), then it would be fine
> across suspend/resume.
>
> I think that part of the reason that it doesn't accurately export a
> host clock is that the worst-case performance of atomic updates to the
> pvclock data structures is abysmal due to having the data structures
> living in guest memory.  To be able to access and update all relevant
> structures during host clock refreshes, the host would need to pin the
> all pvclock pages for all running guests.  This could be partially
> mitigated by only updating pvclock data for running vcpus and for vcpu
> 0 for all running guests synchronously and deferring the rest (8k
> pinned per host cpu, max), but it would still be a mess.
>
> If someone redefined the interface so that the *host* could allocate
> it, then the pages could be shared across all guests and this would be
> vastly simpler and faster.
>
> Also, kvm-clock should really coordinate with the core timekeeping
> code to handle this sort of time base export rather than hooking into
> the host vdso support code.
>
>>> not-really-well-defined hybrid?
>>>
>>> --Andy
>>
>> 1. What is not well defined? I fail to spot anything
>> specific in Owen's e-mail.
>
> If I start a guest and query kvm-clock, I get a nanosecond count.
> AFAIK it is, in fact, ill-defined or at least ill-documented what that
> nanosecond count means.
>
> [cc: Joao.  Xen may want to take this stuff into consideration.]
>
> --Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24 17:35   ` Peter Hornyack
@ 2016-02-24 20:17     ` Radim Krčmář
  2016-02-24 20:24       ` Andy Lutomirski
  2016-02-24 23:35     ` Marcelo Tosatti
  1 sibling, 1 reply; 29+ messages in thread
From: Radim Krčmář @ 2016-02-24 20:17 UTC (permalink / raw)
  To: Peter Hornyack
  Cc: Marcelo Tosatti, Owen Hofmann, KVM General, Paolo Bonzini,
	Andy Lutomirski

2016-02-24 09:35-0800, Peter Hornyack:
> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
>>> Regardless of my opinion, I think that a clear statement of the design
>>> goals for kvm-clock (and kvm's implementation of the reference TSC
>>> page) would be valuable.
>>
>> Documentation/virtual/kvm/timekeeping.txt
>>
> 
> Hi Marcelo,
> 
> While I appreciate all of the detail in timekeeping.txt, it is not a
> very good reference for what kvm-clock is or how it works. kvm-clock
> is only mentioned three times in different places throughout that
> document, and nowhere is there a very clear statement of what
> kvm-clock is supposed to do or how it does it.
> 
> For somebody that does not already have a deep understanding of the
> core masterclock code, trying to understand how kvm-clock works is a
> real challenge.

I agree.  Having an overview would be very helpful.

Do you find anything incorrect with
 * kvmclock measures the flow of time.
 * time in kvmclock flows at the same rate as host's CLOCK_BOOTTIME.
?

Maybe it would be better to say "best estimate of real time" instead of
"CLOCK_BOOTTIME", so people wouldn't jump to conclusion that
CLOCK_BOOTTIME has something to do with kvmclock ...

Then we could mention migration (why the time becomes imprecise) and
finish by explaining the TSC mechanism (that avoids a vmexit on every
read) and advantages of masterclock.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24 20:17     ` Radim Krčmář
@ 2016-02-24 20:24       ` Andy Lutomirski
  2016-02-24 20:53         ` Radim Krčmář
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2016-02-24 20:24 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Peter Hornyack, Marcelo Tosatti, Owen Hofmann, KVM General,
	Paolo Bonzini

On Wed, Feb 24, 2016 at 12:17 PM, Radim Krčmář <rkrcmar@redhat.com> wrote:
> 2016-02-24 09:35-0800, Peter Hornyack:
>> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>>> On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
>>>> Regardless of my opinion, I think that a clear statement of the design
>>>> goals for kvm-clock (and kvm's implementation of the reference TSC
>>>> page) would be valuable.
>>>
>>> Documentation/virtual/kvm/timekeeping.txt
>>>
>>
>> Hi Marcelo,
>>
>> While I appreciate all of the detail in timekeeping.txt, it is not a
>> very good reference for what kvm-clock is or how it works. kvm-clock
>> is only mentioned three times in different places throughout that
>> document, and nowhere is there a very clear statement of what
>> kvm-clock is supposed to do or how it does it.
>>
>> For somebody that does not already have a deep understanding of the
>> core masterclock code, trying to understand how kvm-clock works is a
>> real challenge.
>
> I agree.  Having an overview would be very helpful.
>
> Do you find anything incorrect with
>  * kvmclock measures the flow of time.
>  * time in kvmclock flows at the same rate as host's CLOCK_BOOTTIME.
> ?

If we could supply CLOCK_REALTIME as well and advertise that fact to
guest userspace (perhaps with a sysctl or similar in the guest to turn
it on), it would be *awesome*.  Guests with access to this feature
could simply not run ntpd/chronyd.

>
> Maybe it would be better to say "best estimate of real time" instead of
> "CLOCK_BOOTTIME", so people wouldn't jump to conclusion that
> CLOCK_BOOTTIME has something to do with kvmclock ...

We still need to define what zero means, if anything.

>
> Then we could mention migration (why the time becomes imprecise) and
> finish by explaining the TSC mechanism (that avoids a vmexit on every
> read) and advantages of masterclock.

We should also explain what masterclock is, aside from being an
implementation detail.  I've read the code and I still don't know.

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24 20:24       ` Andy Lutomirski
@ 2016-02-24 20:53         ` Radim Krčmář
  2016-02-25 11:13           ` Radim Krčmář
  2016-02-25 11:22           ` Marcelo Tosatti
  0 siblings, 2 replies; 29+ messages in thread
From: Radim Krčmář @ 2016-02-24 20:53 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Hornyack, Marcelo Tosatti, Owen Hofmann, KVM General,
	Paolo Bonzini

2016-02-24 12:24-0800, Andy Lutomirski:
> On Wed, Feb 24, 2016 at 12:17 PM, Radim Krčmář <rkrcmar@redhat.com> wrote:
>> 2016-02-24 09:35-0800, Peter Hornyack:
>>> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>>>> On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
>>>>> Regardless of my opinion, I think that a clear statement of the design
>>>>> goals for kvm-clock (and kvm's implementation of the reference TSC
>>>>> page) would be valuable.
>>>>
>>>> Documentation/virtual/kvm/timekeeping.txt
>>>>
>>>
>>> Hi Marcelo,
>>>
>>> While I appreciate all of the detail in timekeeping.txt, it is not a
>>> very good reference for what kvm-clock is or how it works. kvm-clock
>>> is only mentioned three times in different places throughout that
>>> document, and nowhere is there a very clear statement of what
>>> kvm-clock is supposed to do or how it does it.
>>>
>>> For somebody that does not already have a deep understanding of the
>>> core masterclock code, trying to understand how kvm-clock works is a
>>> real challenge.
>>
>> I agree.  Having an overview would be very helpful.
>>
>> Do you find anything incorrect with
>>  * kvmclock measures the flow of time.
>>  * time in kvmclock flows at the same rate as host's CLOCK_BOOTTIME.
>> ?
> 
> If we could supply CLOCK_REALTIME as well and advertise that fact to
> guest userspace (perhaps with a sysctl or similar in the guest to turn
> it on), it would be *awesome*.  Guests with access to this feature
> could simply not run ntpd/chronyd.

I think that pvclock_wall_clock interface is there to do that.
(If pvclock_vcpu_time_info can provide what is claimed above.)

If pvclock_wall_clock version field matches with pvclock_vcpu_time_info,
then the guest can add those two and get CLOCK_REALTIME.
(Based on observations of angry users, the implementation is lacking.)
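
A minimal sketch of that addition, assuming the wall-clock structure holds the wall time at which system_time was zero. The WallClock class and both helper names are illustrative, and the even/odd version check is a simplified model of the retry convention, not the guest kernel's actual code.

```python
from dataclasses import dataclass

NSEC_PER_SEC = 1_000_000_000


@dataclass
class WallClock:
    """Model of pvclock_wall_clock: wall time at which system_time was 0."""
    version: int  # odd while the host is mid-update
    sec: int
    nsec: int


def read_wall_clock_ns(wc: WallClock) -> int:
    """Return a consistent wall-clock base, retrying across host updates."""
    while True:
        before = wc.version
        base = wc.sec * NSEC_PER_SEC + wc.nsec
        # Accept only if no update was in flight and none happened meanwhile.
        if before % 2 == 0 and before == wc.version:
            return base


def guest_realtime_ns(wc: WallClock, system_time_ns: int) -> int:
    """CLOCK_REALTIME estimate: wall-clock base plus kvmclock nanoseconds."""
    return read_wall_clock_ns(wc) + system_time_ns


wc = WallClock(version=2, sec=1_456_000_000, nsec=250)
print(guest_realtime_ns(wc, system_time_ns=3 * NSEC_PER_SEC))
```

The result is only as good as the two inputs: if system_time does not track real time, neither does the sum.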

>> Maybe it would be better to say "best estimate of real time" instead of
>> "CLOCK_BOOTTIME", so people wouldn't jump to conclusion that
>> CLOCK_BOOTTIME has something to do with kvmclock ...
> 
> We still need to define what zero means, if anything.

I think it's better if only the difference between two reads has a
meaning (the number of nanoseconds that passed).  Zero is then an
arbitrary value.

(If we're talking about system_time.)

>> Then we could mention migration (why the time becomes imprecise) and
>> finish by explaining the TSC mechanism (that avoids a vmexit on every
>> read) and advantages of masterclock.
> 
> We should also explain what masterclock is, aside from being an
> implementation detail.  I've read the code and I still don't know.

Yeah, rewriting the code would be a good deed.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24 17:35   ` Peter Hornyack
  2016-02-24 20:17     ` Radim Krčmář
@ 2016-02-24 23:35     ` Marcelo Tosatti
  2016-02-24 23:36       ` Marcelo Tosatti
  2016-02-25  1:19       ` Andy Lutomirski
  1 sibling, 2 replies; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-24 23:35 UTC (permalink / raw)
  To: Peter Hornyack; +Cc: Owen Hofmann, KVM General, Paolo Bonzini, Andy Lutomirski

On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> >> Specifically, what underlying source of time should be exposed through
> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> >> page?  Recently a couple of threads on kvm-list, along with attempts
> >> to produce reliable behavior from kvm-clock on our systems have
> >> highlighted a tension between the current implementation of kvm-clock
> >> and potentially diverging goals for paravirt time. Here are a few:
> >>
> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> >>
> >> This question is mostly in regards to kvm-clock in masterclock mode
> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> >> expose a source of time that is more 'true' than the underlying TSC?
> >> For example, by passing through NTP correction from the host. For the
> >> current implementation, the answer seems to be... why not both? Once
> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> >> multiplied by the frequency specified by kvm. On the other hand,
> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> >> are measured against corrected time from the host. A guest reading its
> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> >> guest has run long enough to for TSC to diverge from NTP time. A VMM
> >> using these ioctls to save and restore clock state can produce wild
> >> time jumps from the guest's perspective.
> >>
> >> The patches in (2) address this mismatch by plumbing updates to clock
> >> frequency through kvm-clock to the guest. This seems like an important
> >> design choice for kvm-clock, and IMO deserves at least a clear
> >> statement of the goals for this interface, if not some more
> >> discussion.
> >
> > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
> >
> > The interfaces have been introduced to fix a bug.
> >
> >> The (later) thread in (3) claims that synchronizing with
> >> host time is *not* a goal of kvm-clock.
> >
> > It is not.
> >
> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
> >> simply a more enlightened path to the host TSC. Maintaining a
> >> high-performance path to the TSC in the face of updates is tricky -
> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
> >> discussion on the patchset in (2). Is the cost of auditing that the
> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> >> gettimeofday both tracks host time correctly and does not produce any
> >> backwards warps worth the added value, if it exists? As an
> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> >> function of the last update to kvm-clock or the reference TSC page,
> >> respectively, sounds very straightforward.
> >>
> >> (Outside of masterclock mode, the requirement that the client
> >> synchronizes across cpus for montonicity smoothes over a lot of
> >> complexity - periodically updating kvm-clock to the current time is
> >> simple and works.)
> >>
> >> Regardless of my opinion, I think that a clear statement of the design
> >> goals for kvm-clock (and kvm's implementation of the reference TSC
> >> page) would be valuable.
> >
> > Documentation/virtual/kvm/timekeeping.txt
> >
> 
> Hi Marcelo,
> 
> While I appreciate all of the detail in timekeeping.txt, it is not a
> very good reference for what kvm-clock is or how it works. kvm-clock
> is only mentioned three times in different places throughout that
> document, and nowhere is there a very clear statement of what
> kvm-clock is supposed to do or how it does it.
> 
> For somebody that does not already have a deep understanding of the
> core masterclock code, trying to understand how kvm-clock works is a

There is no "deep understanding". There is one comment there about
why you can't update system_timestamp + tsc_offset (you have to read
the kvmclock clock read function to understand this sentence) in
parallel in multiple VCPUs, and that's all masterclock is about.

It's called "master" because there must be only one system_timestamp
and not multiple (therefore that's the "master" copy of system_time).
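
A toy model of why independent per-VCPU updates are a problem, offered as one reading of the issue and not a restatement of the x86.c comment. It assumes that guest-side extrapolation from the TSC runs 100 ppm fast relative to the host clock that each snapshot copies. The function names and all numbers are made up for the example.

```python
PPM = 100  # assumed: TSC-scaled time runs 100 ppm fast vs. the host clock


def extrapolate(base_ns: int, since_snapshot_ns: int) -> int:
    """Guest-side read: base plus TSC-derived time, which runs PPM fast."""
    return base_ns + since_snapshot_ns * (1_000_000 + PPM) // 1_000_000


def read(snapshot_at_ns: int, now_ns: int) -> int:
    """Read a vCPU whose snapshot copied the host clock at snapshot_at_ns."""
    return extrapolate(snapshot_at_ns, now_ns - snapshot_at_ns)


t_a, t_b = 2_000_000_000, 2_000_001_000  # vCPU0 reads, then vCPU1 1 us later

# Independent updates: vCPU0 was snapshotted at t=0, vCPU1 at t=1 s.
independent = (read(0, t_a), read(1_000_000_000, t_b))

# Master copy: both vCPUs extrapolate from the same snapshot taken at t=0.
master = (read(0, t_a), read(0, t_b))

print(independent[1] - independent[0], master[1] - master[0])
```

With independent snapshots the later read on vCPU1 comes out behind the earlier read on vCPU0, because vCPU1 has accumulated less of the fast extrapolation. With a single shared snapshot the second read is ahead, as expected.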

> real challenge.
> 
> Thanks,
> Peter

Design goals: provide a reliable clocksource device to Linux guests
so they are able to cope with virtualization problems, namely:

1. Migration to hosts with different TSC frequency.
2. Support for hosts with TSCs that are not stable (whose
counting frequency changes across processor frequency changes).

How: Expose a clock device which counts at 1GHz to guests.
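
One way to picture "counts at 1GHz": for whatever frequency the host TSC has, the host picks a shift and a 32-bit multiplier so that scaled ticks come out as nanoseconds. The sketch below is an illustrative derivation with assumed names (time_scale, ticks_to_ns); KVM has its own helper for this, which is not reproduced here.

```python
from typing import Tuple

NSEC_PER_SEC = 1_000_000_000


def time_scale(tsc_hz: int) -> Tuple[int, int]:
    """Return (shift, mult) such that ns = ((delta << shift) * mult) >> 32.

    A negative shift means a right shift. The shifted tick rate is
    normalized into (1 GHz, 2 GHz] so that mult stays below 2**32.
    """
    shift = 0
    num, den = tsc_hz, 1  # effective tick rate after shifting = num / den
    while num > 2 * NSEC_PER_SEC * den:   # too fast: halve the rate
        den *= 2
        shift -= 1
    while num <= NSEC_PER_SEC * den:      # too slow: double the rate
        num *= 2
        shift += 1
    mult = (NSEC_PER_SEC << 32) * den // num
    return shift, mult


def ticks_to_ns(delta: int, shift: int, mult: int) -> int:
    """Apply the shift and the 32.32 fixed-point multiplier to a TSC delta."""
    delta = delta << shift if shift >= 0 else delta >> -shift
    return (delta * mult) >> 32


# One second's worth of ticks on two assumed hosts maps to about 1e9 ns on both.
for hz in (2_000_000_000, 3_000_000_000):
    shift, mult = time_scale(hz)
    print(hz, shift, mult, ticks_to_ns(hz, shift, mult))
```

After migration to a host with a different TSC frequency, only the published shift and multiplier change; the guest keeps reading nanoseconds.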

pvclock was created for use with Xen; kvmclock is KVM's implementation
of the pvclock interface.

-----

These are the "design properties"; the rest evolved over
time from requirements in the field that were not
recognized when kvmclock was initially "designed".

-----

Evolution of masterclock scheme (bugs uncovered):

Problem: time going backwards as seen by guests.
Solution: Fix in the guest with a pvclock global variable (cmpxchg).

Problem: gettimeofday() performance.
Solution: Use the masterclock scheme (update pvclock areas in sync so that
no time-backwards event is visible to guests; it's well documented in
x86.c, and if something is unclear please try to understand the code / ask
and you/we improve the documentation there).

Problem: get_kernel_ns and the TSC clock get out of sync, and
Hyper-V complains about the difference.

Solution: expose the NTP-corrected TSC frequency so that guests
apply NTP frequency correction to their kvmclock reads on TSC as well.
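
The first fix in the list above, a guest-side global that reads are clamped against, can be sketched as follows. This is a Python model with a lock standing in for the cmpxchg loop; the MonotonicClamp class and the sample values are illustrative assumptions.

```python
import threading


class MonotonicClamp:
    """Global last-value filter: never return less than any earlier read."""

    def __init__(self) -> None:
        self._last = 0
        self._lock = threading.Lock()  # stands in for the cmpxchg loop

    def filter(self, raw_ns: int) -> int:
        with self._lock:
            if raw_ns > self._last:
                self._last = raw_ns
            return self._last


clamp = MonotonicClamp()
raw_reads = [2_000_200_000, 2_000_101_000, 2_000_300_000]  # second is behind
print([clamp.filter(r) for r in raw_reads])
```

Every read on every VCPU touches the same shared variable, which is plausibly the gettimeofday() cost that the masterclock scheme was introduced to avoid.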

---

About the future: agree with Andy that kvmclock should be removed.
So there is a pending work item there: "verify TSC clocksource
is fine for exposing to guests, think about the implications for
management software".
I can write down a list of items that have been fixed
for kvmclock and would have to be checked for the tsc clocksource.

Anyone willing to take that task?

---

About the complaint that "it's not well designed whether NTP correction
should be applied or not". There are two different things:

1) Host clock and guest clocks synchronized.
KVM is not responsible for that, and it can't be, because
Linux exposes a clock which is created in software
and fixed by NTP.

2) NTP frequency correction being applied to kvmclock.

This only means that the frequency of the pvclock reads
in the guest is NTP corrected.

Whether it's necessary: No, it's not strictly necessary because
the clock exposed to the guest is the Linux clock, maintained
by Linux on top of kvmclock (via interrupts and kvmclock reads).

So for KVM-RT, for example, it's fine to have one

    system_timestamp (read at guest initialization).
    uncorrected host TSC value.

Because the guest's clock will be NTP corrected (via sys_adjtimex)
and the guest clock will be synchronized to UTC.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24 23:35     ` Marcelo Tosatti
@ 2016-02-24 23:36       ` Marcelo Tosatti
  2016-02-25  1:19       ` Andy Lutomirski
  1 sibling, 0 replies; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-24 23:36 UTC (permalink / raw)
  To: Peter Hornyack; +Cc: Owen Hofmann, KVM General, Paolo Bonzini, Andy Lutomirski

On Wed, Feb 24, 2016 at 08:35:00PM -0300, Marcelo Tosatti wrote:
> On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
> > On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> > >> Specifically, what underlying source of time should be exposed through
> > >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> > >> page?  Recently a couple of threads on kvm-list, along with attempts
> > >> to produce reliable behavior from kvm-clock on our systems have
> > >> highlighted a tension between the current implementation of kvm-clock
> > >> and potentially diverging goals for paravirt time. Here are a few:
> > >>
> > >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> > >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> > >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> > >>
> > >> This question is mostly in regards to kvm-clock in masterclock mode
> > >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> > >> expose a source of time that is more 'true' than the underlying TSC?
> > >> For example, by passing through NTP correction from the host. For the
> > >> current implementation, the answer seems to be... why not both? Once
> > >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> > >> multiplied by the frequency specified by kvm. On the other hand,
> > >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> > >> are measured against corrected time from the host. A guest reading its
> > >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> > >> guest has run long enough to for TSC to diverge from NTP time. A VMM
> > >> using these ioctls to save and restore clock state can produce wild
> > >> time jumps from the guest's perspective.
> > >>
> > >> The patches in (2) address this mismatch by plumbing updates to clock
> > >> frequency through kvm-clock to the guest. This seems like an important
> > >> design choice for kvm-clock, and IMO deserves at least a clear
> > >> statement of the goals for this interface, if not some more
> > >> discussion.
> > >
> > > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
> > >
> > > The interfaces have been introduced to fix a bug.
> > >
> > >> The (later) thread in (3) claims that synchronizing with
> > >> host time is *not* a goal of kvm-clock.
> > >
> > > It is not.
> > >
> > >> To me, kvm-clock and the HyperV TSC page are extremely effective as
> > >> simply a more enlightened path to the host TSC. Maintaining a
> > >> high-performance path to the TSC in the face of updates is tricky -
> > >> see the extended comment in pvclock_update_vm_gtod_copy, or the
> > >> discussion on the patchset in (2). Is the cost of auditing that the
> > >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> > >> gettimeofday both tracks host time correctly and does not produce any
> > >> backwards warps worth the added value, if it exists? As an
> > >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> > >> function of the last update to kvm-clock or the reference TSC page,
> > >> respectively, sounds very straightforward.
> > >>
> > >> (Outside of masterclock mode, the requirement that the client
> > >> synchronizes across cpus for montonicity smoothes over a lot of
> > >> complexity - periodically updating kvm-clock to the current time is
> > >> simple and works.)
> > >>
> > >> Regardless of my opinion, I think that a clear statement of the design
> > >> goals for kvm-clock (and kvm's implementation of the reference TSC
> > >> page) would be valuable.
> > >
> > > Documentation/virtual/kvm/timekeeping.txt
> > >
> > 
> > Hi Marcelo,
> > 
> > While I appreciate all of the detail in timekeeping.txt, it is not a
> > very good reference for what kvm-clock is or how it works. kvm-clock
> > is only mentioned three times in different places throughout that
> > document, and nowhere is there a very clear statement of what
> > kvm-clock is supposed to do or how it does it.
> > 
> > For somebody that does not already have a deep understanding of the
> > core masterclock code, trying to understand how kvm-clock works is a
> 
> There is no "deep understanding". There is one comment there about 
> why you can't update systemtimestamp + tsc_offset (you have to read
> the kvmclock clock read function to understand this sentence) in
> parallel in multiple VCPUs, and thats all masterclock is about.
> 
> Its called "master" because there must be only one system_timestamp 
> and not multiple (therefore thats the "master" copy of system_time).
> 
> > real challenge.
> > 
> > Thanks,
> > Peter
> 
> Design goals: provide a reliable clocksource device to Linux guests
> so they are able to cope with virtualization problems, namely:
> 
> 1. Migration to hosts with different TSC frequency.
> 2. Support for hosts with TSCs that are not stable (whose
> counting frequency changes across processor frequency changes).
> 
> How: Expose a clockdevice which counts at 1GHz to guests.
> 
> pvclock was created for use with Xen, kvmclock is KVM's implementation
> of pvclock interface.
> 
> -----
> 
> This are the "design properties", the rest have evolved over
> time from requirements on the field which have not been
> realized when initially "designed".
> 
> -----
> 
> Evolution of masterclock scheme (bugs uncovered):
> 
> Problem: time backwards as seen by guests.
> Solution: Fix in guest with pvclock global variable (cmpxchg).
> 
> Problem: gettimeofday() performance
> Solution: Use masterclock scheme (update pvclock areas in sync to avoid
> time backwards event being visible to guests, its well documented in
> x86.c, if something is unclear please try to understand the code / ask
> and you/we improve the documentation there).
> 
> Problem: get_kernel_ns VS TSC clock get out of sync and
> Hyper-V complains about the difference.
> 
> Solution: expose the NTP TSC frequency so that guests
> apply NTP frequency correction to their kvmclock reads on TSC as well.
> 
> ---
> 
> About future: agree with Andy that kvmclock should be removed.
> So there is a pending work item there: "verify TSC clocksource
> is fine for exposing to guests, think about the implications for
> management software".
> I can write down a list of items that have been fixed
> for kvmclock and would have to be check for tsc clocksource.
> 
> Anyone willing to take that task ?
> 
> ---
> 
> About complaint that "its not well designed whether NTP correction
> should be applied or not". There are two different things:
> 
> 1) Host clock and guest clocks synchronized.

Err, I meant "guest clock synced to UTC" (which is the same
as the guest clock being synced to the host clock if the host clock is synced
to UTC).


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24 23:35     ` Marcelo Tosatti
  2016-02-24 23:36       ` Marcelo Tosatti
@ 2016-02-25  1:19       ` Andy Lutomirski
  2016-02-25  3:50         ` Owen Hofmann
                           ` (2 more replies)
  1 sibling, 3 replies; 29+ messages in thread
From: Andy Lutomirski @ 2016-02-25  1:19 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Peter Hornyack, Owen Hofmann, KVM General, Paolo Bonzini

On Wed, Feb 24, 2016 at 3:35 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
>> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
>> >> Specifically, what underlying source of time should be exposed through
>> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
>> >> page?  Recently a couple of threads on kvm-list, along with attempts
>> >> to produce reliable behavior from kvm-clock on our systems have
>> >> highlighted a tension between the current implementation of kvm-clock
>> >> and potentially diverging goals for paravirt time. Here are a few:
>> >>
>> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
>> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
>> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
>> >>
>> >> This question is mostly in regards to kvm-clock in masterclock mode
>> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
>> >> expose a source of time that is more 'true' than the underlying TSC?
>> >> For example, by passing through NTP correction from the host. For the
>> >> current implementation, the answer seems to be... why not both? Once
>> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
>> >> multiplied by the frequency specified by kvm. On the other hand,
>> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
>> >> are measured against corrected time from the host. A guest reading its
>> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
>> >> guest has run long enough to for TSC to diverge from NTP time. A VMM
>> >> using these ioctls to save and restore clock state can produce wild
>> >> time jumps from the guest's perspective.
>> >>
>> >> The patches in (2) address this mismatch by plumbing updates to clock
>> >> frequency through kvm-clock to the guest. This seems like an important
>> >> design choice for kvm-clock, and IMO deserves at least a clear
>> >> statement of the goals for this interface, if not some more
>> >> discussion.
>> >
>> > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
>> >
>> > The interfaces have been introduced to fix a bug.
>> >
>> >> The (later) thread in (3) claims that synchronizing with
>> >> host time is *not* a goal of kvm-clock.
>> >
>> > It is not.
>> >
>> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
>> >> simply a more enlightened path to the host TSC. Maintaining a
>> >> high-performance path to the TSC in the face of updates is tricky -
>> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
>> >> discussion on the patchset in (2). Is the cost of auditing that the
>> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
>> >> gettimeofday both tracks host time correctly and does not produce any
>> >> backwards warps worth the added value, if it exists? As an
>> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
>> >> function of the last update to kvm-clock or the reference TSC page,
>> >> respectively, sounds very straightforward.
>> >>
>> >> (Outside of masterclock mode, the requirement that the client
>> >> synchronizes across cpus for montonicity smoothes over a lot of
>> >> complexity - periodically updating kvm-clock to the current time is
>> >> simple and works.)
>> >>
>> >> Regardless of my opinion, I think that a clear statement of the design
>> >> goals for kvm-clock (and kvm's implementation of the reference TSC
>> >> page) would be valuable.
>> >
>> > Documentation/virtual/kvm/timekeeping.txt
>> >
>>
>> Hi Marcelo,
>>
>> While I appreciate all of the detail in timekeeping.txt, it is not a
>> very good reference for what kvm-clock is or how it works. kvm-clock
>> is only mentioned three times in different places throughout that
>> document, and nowhere is there a very clear statement of what
>> kvm-clock is supposed to do or how it does it.
>>
>> For somebody that does not already have a deep understanding of the
>> core masterclock code, trying to understand how kvm-clock works is a
>
> There is no "deep understanding". There is one comment there about
> why you can't update systemtimestamp + tsc_offset (you have to read
> the kvmclock clock read function to understand this sentence) in
> parallel in multiple VCPUs, and thats all masterclock is about.
>
> Its called "master" because there must be only one system_timestamp
> and not multiple (therefore thats the "master" copy of system_time).
>
>> real challenge.
>>
>> Thanks,
>> Peter
>
> Design goals: provide a reliable clocksource device to Linux guests
> so they are able to cope with virtualization problems, namely:
>
> 1. Migration to hosts with different TSC frequency.
> 2. Support for hosts with TSCs that are not stable (whose
> counting frequency changes across processor frequency changes).
>
> How: Expose a clockdevice which counts at 1GHz to guests.

This still doesn't define how closely it is intended to track 1 GHz or
whether NTP slew is applied.
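
For reference, a minimal sketch of how a guest turns a TSC reading into
nanoseconds under the pvclock scheme. The struct layout and the
shift-then-multiply conversion follow the pvclock ABI; the function
names are made up for illustration, and the version/retry protocol is
left out. How closely the result tracks 1 GHz depends entirely on the
tsc_to_system_mul/tsc_shift pair the host picks, and on whether the host
refreshes that pair when NTP adjusts its own clock. The code assumes a
compiler with unsigned __int128 (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Per-vCPU time record shared by the host (field names follow the pvclock ABI). */
struct pvclock_vcpu_time_info {
	uint32_t version;
	uint32_t pad0;
	uint64_t tsc_timestamp;     /* guest TSC value at the last host update */
	uint64_t system_time;       /* nanoseconds at the last host update */
	uint32_t tsc_to_system_mul; /* multiplier, interpreted as mul / 2^32 */
	int8_t   tsc_shift;         /* power-of-two pre-scale for the TSC delta */
	uint8_t  flags;
	uint8_t  pad[2];
};

/* Hypothetical helper: scale a TSC delta to nanoseconds. */
static uint64_t pvclock_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
	assert(shift > -64 && shift < 64);
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;
	/* 64x32 multiply needs more than 64 bits before the final >> 32. */
	return (uint64_t)(((unsigned __int128)delta * mul) >> 32);
}

/* Hypothetical helper: nanoseconds for a given TSC reading. */
static uint64_t pvclock_read_ns(const struct pvclock_vcpu_time_info *ti,
				uint64_t tsc_now)
{
	uint64_t delta = tsc_now - ti->tsc_timestamp;

	return ti->system_time +
	       pvclock_scale_delta(delta, ti->tsc_to_system_mul, ti->tsc_shift);
}

/* Usage: a 2 GHz TSC needs mul = 0.5 * 2^32, so 2000 ticks are 1000 ns. */
void pvclock_read_example(void)
{
	struct pvclock_vcpu_time_info ti = {0};

	ti.tsc_timestamp = 1000;
	ti.system_time = 500;
	ti.tsc_to_system_mul = 0x80000000u;
	ti.tsc_shift = 0;
	assert(pvclock_read_ns(&ti, 3000) == 1500);
}
```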

> Evolution of masterclock scheme (bugs uncovered):
>
> Problem: time backwards as seen by guests.
> Solution: Fix in guest with pvclock global variable (cmpxchg).

I thought that was only for non-masterclock.
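
A minimal sketch of that guest-side fix, assuming C11 atomics rather
than the kernel's own primitives: every reader publishes its result to
one shared variable with a compare-and-exchange and never returns less
than what another vCPU already returned. The names last_value and
clamp_monotonic are illustrative. The cost is a shared cache line that
every clock read touches, which is what the masterclock scheme tries to
avoid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Largest clock value handed out so far, shared by all vCPUs. */
static _Atomic uint64_t last_value;

/* Return `now`, or the largest value already returned if `now` is older. */
static uint64_t clamp_monotonic(uint64_t now)
{
	uint64_t last = atomic_load(&last_value);

	for (;;) {
		if (now <= last)
			return last;
		/* On failure the CAS reloads `last`, so the loop re-checks it. */
		if (atomic_compare_exchange_weak(&last_value, &last, now))
			return now;
	}
}

/* Usage: an older reading from another vCPU is clamped, a newer one advances. */
void clamp_monotonic_example(void)
{
	assert(clamp_monotonic(100) == 100);
	assert(clamp_monotonic(90) == 100);
	assert(clamp_monotonic(150) == 150);
}
```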

>
> Problem: gettimeofday() performance
> Solution: Use masterclock scheme (update pvclock areas in sync to avoid
> time backwards event being visible to guests, its well documented in
> x86.c, if something is unclear please try to understand the code / ask
> and you/we improve the documentation there).

The actual masterclock host code is long and very difficult to follow.

In 4.5-rc, the vDSO guest code is IMO short and reasonably clear.

>
> Problem: get_kernel_ns VS TSC clock get out of sync and
> Hyper-V complains about the difference.
>
> Solution: expose the NTP TSC frequency so that guests
> apply NTP frequency correction to their kvmclock reads on TSC as well.
>

I don't understand what you mean.

> ---
>
> About future: agree with Andy that kvmclock should be removed.
> So there is a pending work item there: "verify TSC clocksource
> is fine for exposing to guests, think about the implications for
> management software".
> I can write down a list of items that have been fixed
> for kvmclock and would have to be check for tsc clocksource.
>
> Anyone willing to take that task ?
>

How?

On very very new hosts (those that support TSC_ADJUST and tsc
scaling), this should be possible.  The host would ideally tell the
guest what frequency of clock it intends to provide (ideally 1 GHz
exactly) and the guest would use it.  I'm not sure this hardware
exists yet.
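
To make the scaling arithmetic concrete, here is a sketch of a
fixed-point TSC ratio. The 48 fractional bits and the helper names are
assumptions picked for illustration; real hardware defines its own ratio
format. It also shows that "exactly 1 GHz" is only as exact as the ratio
allows: with a 3 GHz host the truncated ratio loses one tick per second.
The code assumes a compiler with unsigned __int128 (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 48 /* assumption: fractional bits in the scaling ratio */

/* Ratio guest_khz / host_khz as a fixed-point number, truncated. */
static uint64_t tsc_ratio(uint32_t guest_khz, uint32_t host_khz)
{
	assert(host_khz != 0);
	return (uint64_t)(((unsigned __int128)guest_khz << FRAC_BITS) / host_khz);
}

/* Guest-visible tick count for a given number of host ticks. */
static uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio)
{
	return (uint64_t)(((unsigned __int128)host_tsc * ratio) >> FRAC_BITS);
}

void tsc_scaling_example(void)
{
	/* 2 GHz host, 1 GHz guest: the ratio is exactly one half. */
	assert(scale_tsc(2000, tsc_ratio(1000000, 2000000)) == 1000);
	/* 3 GHz host: one third is not representable, so a tick is lost. */
	assert(scale_tsc(3000000000ULL, tsc_ratio(1000000, 3000000)) == 999999999ULL);
}
```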

If you enable TSC scaling like this, you may need to supply an ART
(always running timer) adjustment to the guest in case you intend to
pass any ART consumers through to the guest.  Of course, no one
outside Intel has *that* hardware either (AFAIK -- maybe there are
some prototypes floating around).

> ---
>
> About complaint that "its not well designed whether NTP correction
> should be applied or not". There are two different things:
>
> 1) Host clock and guest clocks synchronized.
> KVM is not responsible for that, and it can't, because
> Linux exposes a clock which is created in software
> and fixed by NTP.

I don't understand what you mean.

Of course the guest can run its own NTP daemon or similar adjtimex
caller and cause the guest to stop tracking the host.  But if the host
passed CLOCK_MONOTONIC through, then the guest would, by default,
treat kvm-clock as an exactly 1GHz source and would then expose a
disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
without an NTP client on the guest.

If integration with the POSIX clock core were provided, the guest
would learn to consume the host's CLOCK_REALTIME as well, as long as
the host uses the tsc as its clocksource.

>
> 2) NTP frequency correction being applied to kvmclock.
>
> This only means that the frequency of the pvclock reads
> in the guest are NTP corrected.

If the host applied NTP frequency correction to the guest, then I
would be happy.  Some folks might want this to be optional.

The guest can do additional correction on top if it wants regardless.

--Andy


* Re: What time is it kvm-clock?
  2016-02-25  1:19       ` Andy Lutomirski
@ 2016-02-25  3:50         ` Owen Hofmann
  2016-02-25 12:20           ` Radim Krčmář
  2016-02-25 11:36         ` Radim Krčmář
  2016-02-25 12:12         ` Marcelo Tosatti
  2 siblings, 1 reply; 29+ messages in thread
From: Owen Hofmann @ 2016-02-25  3:50 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Marcelo Tosatti, Peter Hornyack, KVM General, Paolo Bonzini

On Wed, Feb 24, 2016 at 5:19 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Wed, Feb 24, 2016 at 3:35 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
>>> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>>> > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
>>> >> Specifically, what underlying source of time should be exposed through
>>> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
>>> >> page?  Recently a couple of threads on kvm-list, along with attempts
>>> >> to produce reliable behavior from kvm-clock on our systems have
>>> >> highlighted a tension between the current implementation of kvm-clock
>>> >> and potentially diverging goals for paravirt time. Here are a few:
>>> >>
>>> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
>>> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
>>> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
>>> >>
>>> >> This question is mostly in regards to kvm-clock in masterclock mode
>>> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
>>> >> expose a source of time that is more 'true' than the underlying TSC?
>>> >> For example, by passing through NTP correction from the host. For the
>>> >> current implementation, the answer seems to be... why not both? Once
>>> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
>>> >> multiplied by the frequency specified by kvm. On the other hand,
>>> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
>>> >> are measured against corrected time from the host. A guest reading its
>>> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
>>> >> guest has run long enough to for TSC to diverge from NTP time. A VMM
>>> >> using these ioctls to save and restore clock state can produce wild
>>> >> time jumps from the guest's perspective.
>>> >>
>>> >> The patches in (2) address this mismatch by plumbing updates to clock
>>> >> frequency through kvm-clock to the guest. This seems like an important
>>> >> design choice for kvm-clock, and IMO deserves at least a clear
>>> >> statement of the goals for this interface, if not some more
>>> >> discussion.
>>> >
>>> > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
>>> >
>>> > The interfaces have been introduced to fix a bug.
>>> >
>>> >> The (later) thread in (3) claims that synchronizing with
>>> >> host time is *not* a goal of kvm-clock.
>>> >
>>> > It is not.
>>> >
>>> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
>>> >> simply a more enlightened path to the host TSC. Maintaining a
>>> >> high-performance path to the TSC in the face of updates is tricky -
>>> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
>>> >> discussion on the patchset in (2). Is the cost of auditing that the
>>> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
>>> >> gettimeofday both tracks host time correctly and does not produce any
>>> >> backwards warps worth the added value, if it exists? As an
>>> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
>>> >> function of the last update to kvm-clock or the reference TSC page,
>>> >> respectively, sounds very straightforward.
>>> >>
>>> >> (Outside of masterclock mode, the requirement that the client
>>> >> synchronizes across cpus for montonicity smoothes over a lot of
>>> >> complexity - periodically updating kvm-clock to the current time is
>>> >> simple and works.)
>>> >>
>>> >> Regardless of my opinion, I think that a clear statement of the design
>>> >> goals for kvm-clock (and kvm's implementation of the reference TSC
>>> >> page) would be valuable.
>>> >
>>> > Documentation/virtual/kvm/timekeeping.txt
>>> >
>>>
>>> Hi Marcelo,
>>>
>>> While I appreciate all of the detail in timekeeping.txt, it is not a
>>> very good reference for what kvm-clock is or how it works. kvm-clock
>>> is only mentioned three times in different places throughout that
>>> document, and nowhere is there a very clear statement of what
>>> kvm-clock is supposed to do or how it does it.
>>>
>>> For somebody that does not already have a deep understanding of the
>>> core masterclock code, trying to understand how kvm-clock works is a
>>
>> There is no "deep understanding". There is one comment there about
>> why you can't update systemtimestamp + tsc_offset (you have to read
>> the kvmclock clock read function to understand this sentence) in
>> parallel in multiple VCPUs, and thats all masterclock is about.
>>
>> Its called "master" because there must be only one system_timestamp
>> and not multiple (therefore thats the "master" copy of system_time).
>>
>>> real challenge.
>>>
>>> Thanks,
>>> Peter
>>
>> Design goals: provide a reliable clocksource device to Linux guests
>> so they are able to cope with virtualization problems, namely:
>>
>> 1. Migration to hosts with different TSC frequency.
>> 2. Support for hosts with TSCs that are not stable (whose
>> counting frequency changes across processor frequency changes).
>>
>> How: Expose a clockdevice which counts at 1GHz to guests.
>
> This still doesn't define how closely it is intended to track 1 GHz or
> whether NTP slew is applied.
>
>> Evolution of masterclock scheme (bugs uncovered):
>>
>> Problem: time backwards as seen by guests.
>> Solution: Fix in guest with pvclock global variable (cmpxchg).
>
> I thought that was only for non-masterclock.
>
>>
>> Problem: gettimeofday() performance
>> Solution: Use masterclock scheme (update pvclock areas in sync to avoid
>> time backwards event being visible to guests, its well documented in
>> x86.c, if something is unclear please try to understand the code / ask
>> and you/we improve the documentation there).
>
> The actual masterclock host code is long and very difficult to follow.
>
> In 4.5-rc, the vDSO guest code is IMO short and reasonably clear.
>
>>
>> Problem: get_kernel_ns VS TSC clock get out of sync and
>> Hyper-V complains about the difference.
>>
>> Solution: expose the NTP TSC frequency so that guests
>> apply NTP frequency correction to their kvmclock reads on TSC as well.
>>
>
> I don't understand what you mean.
>
>> ---
>>
>> About future: agree with Andy that kvmclock should be removed.
>> So there is a pending work item there: "verify TSC clocksource
>> is fine for exposing to guests, think about the implications for
>> management software".
>> I can write down a list of items that have been fixed
>> for kvmclock and would have to be check for tsc clocksource.
>>
>> Anyone willing to take that task ?
>>


This would be a wonderful goal. But I think that you would want some
extra bits besides just "remove kvmclock":
- Force the guest to consider TSC a high quality clocksource.
- Provide the host's calibrated TSC frequency to the guest.
- Provide an alternative to hardware frequency scaling.
These sound to me like the requirements for pvclock/kvmclock.

>
> How?
>
> On very very new hosts (those that support TSC_ADJUST and tsc
> scaling), this should be possible.  The host would ideally tell the
> guest what frequency of clock it intends to provide (ideally 1 GHz
> exactly) and the guest would use it.  I'm not sure this hardware
> exists yet.
>
> If you enable TSC scaling like this, you may need to supply an ART
> (always running timer) adjustment to the guest in case you intend to
> pass any ART consumers through to the guest.  Of course, no one
> outside Intel has *that* hardware either (AFAIK -- maybe there are
> some prototypes floating around).
>
>> ---
>>
>> About complaint that "its not well designed whether NTP correction
>> should be applied or not". There are two different things:
>>
>> 1) Host clock and guest clocks synchronized.
>> KVM is not responsible for that, and it can't, because
>> Linux exposes a clock which is created in software
>> and fixed by NTP.
>
> I don't understand what you mean.
>
> Of course the guest can run its own NTP daemon or similar adjtimex
> caller and cause the guest to stop tracking the host.  But if the host
> passed CLOCK_MONOTONIC through, then the guest would, by default,
> treat kvm-clock as an exactly 1GHz source and would then expose a
> disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
> without an NTP client on the guest.
>
> If integration with the POSIX clock core were provided, the guest
> would learn to consume the host's CLOCK_REALTIME as well, as long as
> the host uses the tsc as its clocksource.

Your proposal, which I'd describe as a direct passthrough (to the
extent possible) of the host gettimeofday vdso to a kvm guest, sounds
like a much better way to get clock frequency adjustments from the
host to the guest. But I don't know if I can think of a reason to do
this besides "hey you don't have to run ntp". Is there a situation you
have in mind that this helps out?

>
>>
>> 2) NTP frequency correction being applied to kvmclock.
>>
>> This only means that the frequency of the pvclock reads
>> in the guest are NTP corrected.
>
> If the host applied NTP frequency correction to the guest, then I
> would be happy.  Some folks might want this to be optional.
>
> The guest can do additional correction on top if it wants regardless.
>
> --Andy


* Re: What time is it kvm-clock?
  2016-02-24 20:53         ` Radim Krčmář
@ 2016-02-25 11:13           ` Radim Krčmář
  2016-02-25 11:22           ` Marcelo Tosatti
  1 sibling, 0 replies; 29+ messages in thread
From: Radim Krčmář @ 2016-02-25 11:13 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Hornyack, Marcelo Tosatti, Owen Hofmann, KVM General,
	Paolo Bonzini

2016-02-24 21:53+0100, Radim Krčmář:
> 2016-02-24 12:24-0800, Andy Lutomirski:
>> On Wed, Feb 24, 2016 at 12:17 PM, Radim Krčmář <rkrcmar@redhat.com> wrote:
>>> Do you find anything incorrect with
>>>  * kvmclock measures the flow of time.
>>>  * time in kvmclock flows at the same rate as host's CLOCK_BOOTTIME.
>>> ?
>> 
>> If we could supply CLOCK_REALTIME as well and advertise that fact to
>> guest userspace (perhaps with a sysctl or similar in the guest to turn
>> it on), it would be *awesome*.  Guests with access to this feature
>> could simply not run ntpd/chronyd.
> 
> I think that pvclock_wall_clock interface is there to do that.
> (If pvclock_vcpu_time_info can provide what is claimed above.)
> 
> If pvclock_wall_clock version field matches with pvclock_vcpu_time_info,

Correction: versioning schemes are independent.  (pvclock_wall_clock
needs a version because of CPUs that can't atomically access 64 bits.)

> then the guest can add those two and get CLOCK_REALTIME.

pvclock_wall_clock provides CLOCK_REALTIME at the moment the time in
pvclock_vcpu_time_info was 0, so the guest can just add them to get the
current CLOCK_REALTIME.

(The agreement is that CLOCK_REALTIME started in 1970.)
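
A minimal sketch of that addition, assuming the guest has already
computed its current kvmclock value in nanoseconds. The
pvclock_wall_clock layout follows the pvclock ABI; struct realtime and
the function names are made up for illustration. The version loop retries
while the host is mid-update (odd version) or the version changed during
the read. The C11 fences stand in for the barriers a real guest would
use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Wall-clock time at the moment the guest's system time was 0. */
struct pvclock_wall_clock {
	uint32_t version;
	uint32_t sec;
	uint32_t nsec;
};

/* Illustrative result type, not part of the ABI. */
struct realtime {
	uint64_t sec;
	uint32_t nsec;
};

/* Take a consistent snapshot of the wall clock record. */
static void read_wall_clock(const volatile struct pvclock_wall_clock *wc,
			    uint32_t *sec, uint32_t *nsec)
{
	uint32_t v;

	do {
		v = wc->version;
		atomic_thread_fence(memory_order_acquire);
		*sec = wc->sec;
		*nsec = wc->nsec;
		atomic_thread_fence(memory_order_acquire);
	} while ((v & 1) || v != wc->version);
}

/* CLOCK_REALTIME = wall clock at system time 0 + current system time. */
static struct realtime guest_realtime(const volatile struct pvclock_wall_clock *wc,
				      uint64_t system_time_ns)
{
	uint32_t sec, nsec;
	uint64_t total_nsec;
	struct realtime rt;

	read_wall_clock(wc, &sec, &nsec);
	total_nsec = (uint64_t)nsec + system_time_ns % NSEC_PER_SEC;
	rt.sec = (uint64_t)sec + system_time_ns / NSEC_PER_SEC +
		 total_nsec / NSEC_PER_SEC;
	rt.nsec = (uint32_t)(total_nsec % NSEC_PER_SEC);
	return rt;
}

/* Usage: 0.9 s past the boot second plus 2.2 s of uptime carries into seconds. */
void guest_realtime_example(void)
{
	struct pvclock_wall_clock wc = { 2, 1456398780u, 900000000u };
	struct realtime rt = guest_realtime(&wc, 2200000000ULL);

	assert(rt.sec == 1456398783ULL);
	assert(rt.nsec == 100000000u);
}
```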


* Re: What time is it kvm-clock?
  2016-02-24 20:53         ` Radim Krčmář
  2016-02-25 11:13           ` Radim Krčmář
@ 2016-02-25 11:22           ` Marcelo Tosatti
  1 sibling, 0 replies; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-25 11:22 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Andy Lutomirski, Peter Hornyack, Owen Hofmann, KVM General,
	Paolo Bonzini

On Wed, Feb 24, 2016 at 09:53:23PM +0100, Radim Krčmář wrote:
> 2016-02-24 12:24-0800, Andy Lutomirski:
> > On Wed, Feb 24, 2016 at 12:17 PM, Radim Krčmář <rkrcmar@redhat.com> wrote:
> >> 2016-02-24 09:35-0800, Peter Hornyack:
> >>> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >>>> On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> >>>>> Regardless of my opinion, I think that a clear statement of the design
> >>>>> goals for kvm-clock (and kvm's implementation of the reference TSC
> >>>>> page) would be valuable.
> >>>>
> >>>> Documentation/virtual/kvm/timekeeping.txt
> >>>>
> >>>
> >>> Hi Marcelo,
> >>>
> >>> While I appreciate all of the detail in timekeeping.txt, it is not a
> >>> very good reference for what kvm-clock is or how it works. kvm-clock
> >>> is only mentioned three times in different places throughout that
> >>> document, and nowhere is there a very clear statement of what
> >>> kvm-clock is supposed to do or how it does it.
> >>>
> >>> For somebody that does not already have a deep understanding of the
> >>> core masterclock code, trying to understand how kvm-clock works is a
> >>> real challenge.
> >>
> >> I agree.  Having an overview would be very helpful.
> >>
> >> Do you find anything incorrect with
> >>  * kvmclock measures the flow of time.
> >>  * time in kvmclock flows at the same rate as host's CLOCK_BOOTTIME.
> >> ?
> > 
> > If we could supply CLOCK_REALTIME as well and advertise that fact to
> > guest userspace (perhaps with a sysctl or similar in the guest to turn
> > it on), it would be *awesome*.  Guests with access to this feature
> > could simply not run ntpd/chronyd.
> 
> I think that pvclock_wall_clock interface is there to do that.
> (If pvclock_vcpu_time_info can provide what is claimed above.)
> 
> If pvclock_wall_clock version field matches with pvclock_vcpu_time_info,
> then the guest can add those two and get CLOCK_REALTIME.
> (Based on observations of angry users, the implementation lacking.)
> 
> >> Maybe it would be better to say "best estimate of real time" instead of
> >> "CLOCK_BOOTTIME", so people wouldn't jump to conclusion that
> >> CLOCK_BOOTTIME has something to do with kvmclock ...
> > 
> > We still need to define what zero means, if anything.
> 
> I think it's better if only the difference between two reads has a
> meaning (the number of nanoseconds that passed).  Zero is then an
> arbitrary value.
> 
> (If we're talking about system_time.)
> 
> >> Then we could mention migration (why the time becomes imprecise) and
> >> finish by explaining the TSC mechanism (that avoids a vmexit on every
> >> read) and advantages of masterclock.
> > 
> > We should also explain what masterclock is, aside from being an
> > implementation detail.  I've read the code and I still don't know.
> 
> Yeah, rewriting the code would be a good deed.

Please do so.



* Re: What time is it kvm-clock?
  2016-02-25  1:19       ` Andy Lutomirski
  2016-02-25  3:50         ` Owen Hofmann
@ 2016-02-25 11:36         ` Radim Krčmář
  2016-02-25 12:12         ` Marcelo Tosatti
  2 siblings, 0 replies; 29+ messages in thread
From: Radim Krčmář @ 2016-02-25 11:36 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Marcelo Tosatti, Peter Hornyack, Owen Hofmann, KVM General,
	Paolo Bonzini

2016-02-24 17:19-0800, Andy Lutomirski:
> On Wed, Feb 24, 2016 at 3:35 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> About complaint that "its not well designed whether NTP correction
>> should be applied or not". There are two different things:
>>
>> 1) Host clock and guest clocks synchronized.
>> KVM is not responsible for that, and it can't, because
>> Linux exposes a clock which is created in software
>> and fixed by NTP.
> 
> I don't understand what you mean.
> 
> Of course the guest can run its own NTP daemon or similar adjtimex
> caller and cause the guest to stop tracking the host. But if the host
> passed CLOCK_MONOTONIC through, then the guest would, by default,
> treat kvm-clock as an exactly 1GHz source and would then expose a
> disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
> without an NTP client on the guest.

kvmclock is always a 1 GHz clock; it just wasn't (and maybe still isn't)
a good source of 1 GHz.  The guest can't know that it's poor unless it
compares it with better 1 GHz sources, for example via NTP.

KVM can't know what 1 GHz is any better than the host does, and vice
versa, so kvmclock should "accidentally" track CLOCK_BOOTTIME.

But that is related to (2), I think that (1) was mainly about the offset
from CLOCK_REALTIME.

> If integration with the POSIX clock core were provided, the guest
> would learn to consume the host's CLOCK_REALTIME as well, as long as
> the host uses the tsc as its clocksource.

Using TSC in the host allows KVM to provide precise CLOCK_REALTIME, but
nothing prevents us from giving host's CLOCK_REALTIME even if the host
is using hpet/... as the source.

>> 2) NTP frequency correction being applied to kvmclock.
>>
>> This only means that the frequency of the pvclock reads
>> in the guest are NTP corrected.
> 
> If the host applied NTP frequency correction to the guest, then I
> would be happy.  Some folks might want this to be optional.

kvmclock provides time in nanoseconds, so I'd argue that NTP corrections
are mandatory.
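
To put a number on that, a small worked example of how far an
uncorrected clock wanders from true nanoseconds. The 50 ppm figure and
the function name are illustrative assumptions, not measurements of any
host. The code assumes a compiler with __int128 (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Accumulated error, in ns, of a clock running error_ppm fast or slow. */
static int64_t drift_ns(int64_t elapsed_ns, int64_t error_ppm)
{
	/* The product can exceed 64 bits for long intervals, so widen first. */
	return (int64_t)(((__int128)elapsed_ns * error_ppm) / 1000000);
}

void drift_example(void)
{
	const int64_t one_day_ns = 86400LL * 1000000000LL;

	/* 50 ppm over one day is 4.32 seconds. */
	assert(drift_ns(one_day_ns, 50) == 4320000000LL);
}
```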


* Re: What time is it kvm-clock?
  2016-02-25  1:19       ` Andy Lutomirski
  2016-02-25  3:50         ` Owen Hofmann
  2016-02-25 11:36         ` Radim Krčmář
@ 2016-02-25 12:12         ` Marcelo Tosatti
  2 siblings, 0 replies; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-25 12:12 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Peter Hornyack, Owen Hofmann, KVM General, Paolo Bonzini

On Wed, Feb 24, 2016 at 05:19:38PM -0800, Andy Lutomirski wrote:
> On Wed, Feb 24, 2016 at 3:35 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
> >> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >> > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> >> >> Specifically, what underlying source of time should be exposed through
> >> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> >> >> page?  Recently a couple of threads on kvm-list, along with attempts
> >> >> to produce reliable behavior from kvm-clock on our systems have
> >> >> highlighted a tension between the current implementation of kvm-clock
> >> >> and potentially diverging goals for paravirt time. Here are a few:
> >> >>
> >> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> >> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> >> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> >> >>
> >> >> This question is mostly in regards to kvm-clock in masterclock mode
> >> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> >> >> expose a source of time that is more 'true' than the underlying TSC?
> >> >> For example, by passing through NTP correction from the host. For the
> >> >> current implementation, the answer seems to be... why not both? Once
> >> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> >> >> multiplied by the frequency specified by kvm. On the other hand,
> >> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> >> >> are measured against corrected time from the host. A guest reading its
> >> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> >> >> guest has run long enough to for TSC to diverge from NTP time. A VMM
> >> >> using these ioctls to save and restore clock state can produce wild
> >> >> time jumps from the guest's perspective.
> >> >>
> >> >> The patches in (2) address this mismatch by plumbing updates to clock
> >> >> frequency through kvm-clock to the guest. This seems like an important
> >> >> design choice for kvm-clock, and IMO deserves at least a clear
> >> >> statement of the goals for this interface, if not some more
> >> >> discussion.
> >> >
> >> > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
> >> >
> >> > The interfaces have been introduced to fix a bug.
> >> >
> >> >> The (later) thread in (3) claims that synchronizing with
> >> >> host time is *not* a goal of kvm-clock.
> >> >
> >> > It is not.
> >> >
> >> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
> >> >> simply a more enlightened path to the host TSC. Maintaining a
> >> >> high-performance path to the TSC in the face of updates is tricky -
> >> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
> >> >> discussion on the patchset in (2). Is the cost of auditing that the
> >> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> >> >> gettimeofday both tracks host time correctly and does not produce any
> >> >> backwards warps worth the added value, if it exists? As an
> >> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> >> >> function of the last update to kvm-clock or the reference TSC page,
> >> >> respectively, sounds very straightforward.
> >> >>
> >> >> (Outside of masterclock mode, the requirement that the client
> >> >> synchronizes across cpus for montonicity smoothes over a lot of
> >> >> complexity - periodically updating kvm-clock to the current time is
> >> >> simple and works.)
> >> >>
> >> >> Regardless of my opinion, I think that a clear statement of the design
> >> >> goals for kvm-clock (and kvm's implementation of the reference TSC
> >> >> page) would be valuable.
> >> >
> >> > Documentation/virtual/kvm/timekeeping.txt
> >> >
> >>
> >> Hi Marcelo,
> >>
> >> While I appreciate all of the detail in timekeeping.txt, it is not a
> >> very good reference for what kvm-clock is or how it works. kvm-clock
> >> is only mentioned three times in different places throughout that
> >> document, and nowhere is there a very clear statement of what
> >> kvm-clock is supposed to do or how it does it.
> >>
> >> For somebody that does not already have a deep understanding of the
> >> core masterclock code, trying to understand how kvm-clock works is a
> >
> > There is no "deep understanding". There is one comment there about
> > why you can't update systemtimestamp + tsc_offset (you have to read
> > the kvmclock clock read function to understand this sentence) in
> > parallel in multiple VCPUs, and thats all masterclock is about.
> >
> > Its called "master" because there must be only one system_timestamp
> > and not multiple (therefore thats the "master" copy of system_time).
> >
> >> real challenge.
> >>
> >> Thanks,
> >> Peter
> >
> > Design goals: provide a reliable clocksource device to Linux guests
> > so they are able to cope with virtualization problems, namely:
> >
> > 1. Migration to hosts with different TSC frequency.
> > 2. Support for hosts with TSCs that are not stable (whose
> > counting frequency changes across processor frequency changes).
> >
> > How: Expose a clockdevice which counts at 1GHz to guests.
> 
> This still doesn't define how closely it is intended to track 1 GHz or
> whether NTP slew is applied.
> 
> > Evolution of masterclock scheme (bugs uncovered):
> >
> > Problem: time backwards as seen by guests.
> > Solution: Fix in guest with pvclock global variable (cmpxchg).
> 
> I thought that was only for non-masterclock.
> 
> >
> > Problem: gettimeofday() performance
> > Solution: Use masterclock scheme (update pvclock areas in sync to avoid
> > time backwards event being visible to guests, it's well documented in
> > x86.c, if something is unclear please try to understand the code / ask
> > and you/we improve the documentation there).
> 
> The actual masterclock host code is long and very difficult to follow.
> 
> In 4.5-rc, the vDSO guest code is IMO short and reasonably clear.
> 
> >
> > Problem: get_kernel_ns VS TSC clock get out of sync and
> > Hyper-V complains about the difference.
> >
> > Solution: expose the NTP TSC frequency so that guests
> > apply NTP frequency correction to their kvmclock reads on TSC as well.
> >
> 
> I don't understand what you mean.
> 
> > ---
> >
> > About future: agree with Andy that kvmclock should be removed.
> > So there is a pending work item there: "verify TSC clocksource
> > is fine for exposing to guests, think about the implications for
> > management software".
> > I can write down a list of items that have been fixed
> > for kvmclock and would have to be checked for tsc clocksource.
> >
> > Anyone willing to take that task ?
> >
> 
> How?
> 
> On very very new hosts (those that support TSC_ADJUST and tsc
> scaling), this should be possible.  

Exactly, TSC scaling.

> The host would ideally tell the
> guest what frequency of clock it intends to provide (ideally 1 GHz
> exactly) and the guest would use it.  I'm not sure this hardware
> exists yet.
> 
> If you enable TSC scaling like this, you may need to supply an ART
> (always running timer) adjustment to the guest in case you intend to
> pass any ART consumers through to the guest.  Of course, no one
> outside Intel has *that* hardware either (AFAIK -- maybe there are
> some prototypes floating around).
> 
> > ---
> >
> > About the complaint that "it's not well defined whether NTP correction
> > should be applied or not". There are two different things:
> >
> > 1) Host clock and guest clocks synchronized.
> > KVM is not responsible for that, and it can't, because
> > Linux exposes a clock which is created in software
> > and fixed by NTP.
> 
> I don't understand what you mean.
> 
> Of course the guest can run its own NTP daemon or similar adjtimex
> caller and cause the guest to stop tracking the host.  But if the host
> passed CLOCK_MONOTONIC through, then the guest would, by default,
> treat kvm-clock as an exactly 1GHz source and would then expose a
> disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
> without an NTP client on the guest.
> 
> If integration with the POSIX clock core were provided, the guest
> would learn to consume the host's CLOCK_REALTIME as well, as long as
> the host uses the tsc as its clocksource.
> 
> >
> > 2) NTP frequency correction being applied to kvmclock.
> >
> > This only means that the frequency of the pvclock reads
> > in the guest is NTP corrected.
> 
> If the host applied NTP frequency correction to the guest, then I
> would be happy.  Some folks might want this to be optional.
> 
> The guest can do additional correction on top if it wants regardless.
> 
> --Andy

Paolo's track-TSC-offset-multiplier-from-kvmclock-updates should make
enabling masterclock for suspend/resume much simpler.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-25  3:50         ` Owen Hofmann
@ 2016-02-25 12:20           ` Radim Krčmář
  2016-02-26 17:02             ` Andy Lutomirski
  0 siblings, 1 reply; 29+ messages in thread
From: Radim Krčmář @ 2016-02-25 12:20 UTC (permalink / raw)
  To: Owen Hofmann
  Cc: Andy Lutomirski, Marcelo Tosatti, Peter Hornyack, KVM General,
	Paolo Bonzini

2016-02-24 19:50-0800, Owen Hofmann:
> On Wed, Feb 24, 2016 at 5:19 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> Of course the guest can run its own NTP daemon or similar adjtimex
>> caller and cause the guest to stop tracking the host.  But if the host
>> passed CLOCK_MONOTONIC through, then the guest would, by default,
>> treat kvm-clock as an exactly 1GHz source and would then expose a
>> disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
>> without an NTP client on the guest.
>>
>> If integration with the POSIX clock core were provided, the guest
>> would learn to consume the host's CLOCK_REALTIME as well, as long as
>> the host uses the tsc as its clocksource.
> 
> Your proposal, which I'd describe as a direct passthrough (to the
> extent possible) of the host gettimeofday vdso to a kvm guest, sounds
> like a much better way to get clock frequency adjustments from the
> host to the guest. But I don't know if I can think of a reason to do
> this besides "hey you don't have to run ntp". Is there a situation you
> have in mind that this helps out?

Running NTP only on the host is a good reason.
(And probably the only reason I'd call good, because any software that
 passes TSC or CLOCK_MONOTONIC timestamps between hosts needs to handle
 their differences.)

I'd prefer to avoid direct access to host's TSC or CLOCK_MONOTONIC from
the guest, so migrations aren't more complicated than they are now.
Passing host's CLOCK_REALTIME to the guest (which can be done with
pvclock) is enough for any isolated application.  (considering that
guest's pvclock ticks at expected rate, which is host's job.)
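
For reference, a guest-side pvclock read boils down to scaling a TSC
delta and adding it to the last published system time. The following is
a minimal Python sketch of that arithmetic, not the kernel code: the
field names follow the pvclock ABI structure (pvclock_vcpu_time_info),
while the class, the helper name pvclock_read_ns and the sample values
are illustrative assumptions. The version-counter retry loop that a real
reader needs is left out.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1  # emulate u64 wraparound


@dataclass
class PvclockTimeInfo:
    """Subset of the per-vCPU fields a guest reads (illustrative class)."""
    tsc_timestamp: int      # TSC value captured at the last host update
    system_time: int        # nanoseconds published at the last host update
    tsc_to_system_mul: int  # 32-bit fixed-point multiplier (value / 2**32)
    tsc_shift: int          # signed shift applied to the TSC delta first


def pvclock_read_ns(info, tsc_now):
    """Return nanoseconds for the given TSC reading (hypothetical helper)."""
    delta = (tsc_now - info.tsc_timestamp) & MASK64
    if info.tsc_shift >= 0:
        delta = (delta << info.tsc_shift) & MASK64
    else:
        delta >>= -info.tsc_shift
    # Multiply by the 32.32 fixed-point factor, then drop the fraction.
    return info.system_time + ((delta * info.tsc_to_system_mul) >> 32)
```

With a 2 GHz TSC, for example, tsc_shift = 0 and tsc_to_system_mul =
0x80000000 turn 2,000,000,000 cycles into 1,000,000,000 ns. Between host
updates the result depends only on the TSC and on the parameters from
the last update.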

But we need paravirtual interface(s) to precisely sync host and guest
CLOCK_REALTIME even on machines with VMX TSC offset and scaling, so:

Is the pvclock interface unable to fulfill our goals?

Do you want to simplify or extend the pvclock interface?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24 19:55         ` Owen Hofmann
@ 2016-02-25 12:22           ` Joao Martins
  2016-02-25 12:22           ` Joao Martins
  1 sibling, 0 replies; 29+ messages in thread
From: Joao Martins @ 2016-02-25 12:22 UTC (permalink / raw)
  To: Owen Hofmann, Andy Lutomirski
  Cc: Marcelo Tosatti, Paolo Bonzini, KVM General, Peter Hornyack, xen-devel

On 02/24/2016 07:55 PM, Owen Hofmann wrote:
>>>> not-really-well-defined hybrid?
>>>>
>>>> --Andy
>>>
>>> 1. What is not well defined? I fail to spot anything
>>> specific in Owen's e-mail.
>>
>> If I start a guest and query kvm-clock, I get a nanosecond count.
>> AFAIK it is, in fact, ill-defined or at least ill-documented what that
>> nanosecond count means.
> 
> To try to put the thoughts into specific questions:
> - What is the value returned by KVM_GET_CLOCK? How should it be used?
> - What is the value returned by a guest read of the kvm-clock
> structure? (This is also Andy's question)
> To me there are two possibilities for how to answer the second question:
> 1) kvm-clock is better than the host TSC: it propagates updates to
> frequency from the host (== CLOCK_MONOTONIC)
> 2) kvm-clock is a paravirtual source of truth on the guest TSC:
> whether it is stable and its approximate frequency. If the guest needs
> to synchronize to an external source of time, it runs NTP. (==
> CLOCK_MONOTONIC_RAW)
> 
> To me, (1) sounds hard, (2) sounds easy, and it's not clear how much
> additional value (1) provides. The recent patches Paolo sent move
> kvm-clock in the direction of (1), and it sounds like Andy and I might
> have slightly different opinions as well. But mostly I would like some
> clarity as to which is the stated goal for kvm-clock, and to have the
> implementation pick only one of those options.
> 
>>>>> Since we cannot change the past, having kvmclock synchronize with the
>>>>> host TSC frequency is the only choice we can make.
> 
> I'm not sure I understand what previous decision locks kvm-clock into
> the current path. Can you clarify?
> 
> On Wed, Feb 24, 2016 at 11:38 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Wed, Feb 24, 2016 at 9:38 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>>> On Wed, Feb 24, 2016 at 08:44:40AM -0800, Andy Lutomirski wrote:
>>>> On Wed, Feb 24, 2016 at 6:14 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>>
>>>>>
>>>>> On 24/02/2016 03:31, Owen Hofmann wrote:
>>>>>> Specifically, what underlying source of time should be exposed through
>>>>>> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
>>>>>> page?  Recently a couple of threads on kvm-list, along with attempts
>>>>>> to produce reliable behavior from kvm-clock on our systems have
>>>>>> highlighted a tension between the current implementation of kvm-clock
>>>>>> and potentially diverging goals for paravirt time. Here are a few:
>>>>>>
>>>>>> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
>>>>>> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
>>>>>> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
>>>>>>
>>>>>> This question is mostly in regards to kvm-clock in masterclock mode
>>>>>> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
>>>>>> expose a source of time that is more 'true' than the underlying TSC?
>>>>>> For example, by passing through NTP correction from the host. For the
>>>>>> current implementation, the answer seems to be... why not both? Once
>>>>>> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
>>>>>> multiplied by the frequency specified by kvm. On the other hand,
>>>>>> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
>>>>>> are measured against corrected time from the host. A guest reading its
>>>>>> pvclock gets a very different result from a host KVM_GET_CLOCK if the
>>>>>> guest has run long enough for the TSC to diverge from NTP time.
>>>>>
>>>>> Right, in fact that's why QEMU is not really using KVM_GET_CLOCK
>>>>> anymore.  In retrospect, the "fix" in QEMU was probably a bad idea.  It
>>>>> would have been better to fix KVM_GET_CLOCK.
>>>>>
>>>>>> To me, kvm-clock and the HyperV TSC page are extremely effective as
>>>>>> simply a more enlightened path to the host TSC. Maintaining a
>>>>>> high-performance path to the TSC in the face of updates is tricky -
>>>>>> see the extended comment in pvclock_update_vm_gtod_copy, or the
>>>>>> discussion on the patchset in (2). Is the cost of auditing that the
>>>>>> path from host gettimeofday update -> kvm -> guest pvclock -> guest
>>>>>> gettimeofday both tracks host time correctly and does not produce any
>>>>>> backwards warps worth the added value, if it exists? As an
>>>>>> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
>>>>>> function of the last update to kvm-clock or the reference TSC page,
>>>>>> respectively, sounds very straightforward.
>>>>>
>>>>> Yes, we could do that too.
>>>>>
>>>>> I think that vgettsc and do_monotonic_boot also would have to use the
>>>>> TSC frequency instead of the NTP-adjusted host clock.
>>>>>
>>>>>> (Outside of masterclock mode, the requirement that the client
>>>>>> synchronizes across cpus for monotonicity smooths over a lot of
>>>>>> complexity - periodically updating kvm-clock to the current time is
>>>>>> simple and works.)
>>>>>>
>>>>>> Regardless of my opinion, I think that a clear statement of the design
>>>>>> goals for kvm-clock (and kvm's implementation of the reference TSC
>>>>>> page) would be valuable.
>>>>>
>>>>> Since we cannot change the past, having kvmclock synchronize with the
>>>>> host TSC frequency is the only choice we can make.
>>>>>
>>>>
>>>> Could we introduce a new kvm-clock or perhaps opt-in mode that:
>>>>
>>>> a) uses hypervisor-supplied IO pages and,
>>>>
>>>> b) synchronizes to host CLOCK_MONOTONIC instead of some bizarre
>>>> non-suspend-resume-safe
>>>
>>> Please be accurate. It is suspend safe.
>>>
>>
>> I'm being accurate enough, I think.  Master clock mode is not suspend
>> safe.  When I suspend and resume my laptop, the master clock code
>> determines that it messed up and disables itself.  Unloading and
>> reloading the kvm modules turns it back on until the next suspend.
>>
>> I *think* that the underlying issue is that kvm-clock's master clock
>> tracks something ill-defined instead of exposing a well-defined host
>> clock.  If the master clock accurately exposed CLOCK_MONOTONIC_RAW or
>> CLOCK_MONOTONIC (I much prefer the latter), then it would be fine
>> across suspend/resume.
>>
>> I think that part of the reason that it doesn't accurately export a
>> host clock is that the worst-case performance of atomic updates to the
>> pvclock data structures is abysmal due to having the data structures
>> living in guest memory.  To be able to access and update all relevant
>> structures during host clock refreshes, the host would need to pin
>> all the pvclock pages for all running guests.  This could be partially
>> mitigated by only updating pvclock data for running vcpus and for vcpu
>> 0 for all running guests synchronously and deferring the rest (8k
>> pinned per host cpu, max), but it would still be a mess.
>>
>> If someone redefined the interface so that the *host* could allocate
>> it, then the pages could be shared across all guests and this would be
>> vastly simpler and faster.
>>
>> Also, kvm-clock should really coordinate with the core timekeeping
>> code to handle this sort of time base export rather than hooking into
>> the host vdso support code.
>>
>>>> not-really-well-defined hybrid?
>>>>
>>>> --Andy
>>>
>>> 1. What is not well defined? I fail to spot anything
>>> specific in Owen's e-mail.
>>
>> If I start a guest and query kvm-clock, I get a nanosecond count.
>> AFAIK it is, in fact, ill-defined or at least ill-documented what that
>> nanosecond count means.
>>
>> [cc: Joao.  Xen may want to take this stuff into consideration.]
[CC-ing xen-devel folks too]

Joao
>> --Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-24  2:31 What time is it kvm-clock? Owen Hofmann
                   ` (2 preceding siblings ...)
  2016-02-24 14:14 ` Paolo Bonzini
@ 2016-02-26 15:04 ` Marcelo Tosatti
  3 siblings, 0 replies; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-26 15:04 UTC (permalink / raw)
  To: Owen Hofmann; +Cc: KVM General, Paolo Bonzini, Andy Lutomirski, Peter Hornyack

On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> Specifically, what underlying source of time should be exposed through
> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> page?  Recently a couple of threads on kvm-list, along with attempts
> to produce reliable behavior from kvm-clock on our systems have

What is in place are test cases that measure particular kvmclock
issues, such as time-backwards events and maximum offset/frequency
against NTP.

There is no "true" clock; you can only measure one clock against
another (I think Radim raised that point as well). UTC is the
global standard, an average of atomic clocks.
http://tf.nist.gov/general/pdf/1498.pdf

What you'd like to do is measure kvmclock stability with respect to
some parameter. So to improve that situation one could find which
parameters are important (such as whether clock-A should not stop
counting for more than some time units of clock-B; that's the "opposite"
side effect of the bug uncovered by the Hyper-V fixes, the other being
time-backwards events).

(clock-A being kvmclock, clock-B being a GPS clock for example).

One useful activity would be to compare kvmclock in a guest with a GPS
clock (using only the minimum measurements out of many measurements).
I bought a Garmin GPS clock but never got around to enabling the RS-232
connection required to bypass the USB latency. It costs less than US$100.

http://www.lammertbies.nl/comm/info/GPS-time.html
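
A rough sketch of such a comparison follows. It reads clock B between
two reads of clock A and, out of many attempts, keeps the one with the
tightest bracket, which is one possible reading of "only the minimum
measurements". The function names are made up for illustration, and
CLOCK_MONOTONIC and CLOCK_REALTIME merely stand in for kvmclock and the
GPS reference, which this sketch does not talk to.

```python
import time


def sample_offset(read_a, read_b):
    """Read clock B between two reads of clock A.

    Returns (offset_ns, window_ns): B minus the midpoint of the two A
    reads, and how long the bracketing pair of A reads took.
    """
    a0 = read_a()
    b = read_b()
    a1 = read_a()
    return b - (a0 + a1) // 2, a1 - a0


def best_offset(read_a, read_b, samples=1000):
    """Keep the sample with the tightest bracket out of many attempts."""
    best = None
    for _ in range(samples):
        offset, window = sample_offset(read_a, read_b)
        if best is None or window < best[1]:
            best = (offset, window)
    return best


if __name__ == "__main__":
    # Stand-ins only: a real test would read kvmclock in the guest as
    # clock A and the external reference as clock B.
    mono = lambda: time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    real = lambda: time.clock_gettime_ns(time.CLOCK_REALTIME)
    offset, window = best_offset(mono, real)
    print(f"offset {offset} ns, measured within a {window} ns window")
```

Discarding the wide brackets filters out samples that were disturbed by
preemption or interrupt latency; repeating the run later and comparing
the offsets gives an estimate of the relative drift.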


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-25 12:20           ` Radim Krčmář
@ 2016-02-26 17:02             ` Andy Lutomirski
  2016-02-26 19:30               ` Marcelo Tosatti
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2016-02-26 17:02 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Owen Hofmann, Marcelo Tosatti, Peter Hornyack, KVM General,
	Paolo Bonzini

On Thu, Feb 25, 2016 at 4:20 AM, Radim Krčmář <rkrcmar@redhat.com> wrote:
> 2016-02-24 19:50-0800, Owen Hofmann:
>> On Wed, Feb 24, 2016 at 5:19 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> Of course the guest can run its own NTP daemon or similar adjtimex
>>> caller and cause the guest to stop tracking the host.  But if the host
>>> passed CLOCK_MONOTONIC through, then the guest would, by default,
>>> treat kvm-clock as an exactly 1GHz source and would then expose a
>>> disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
>>> without an NTP client on the guest.
>>>
>>> If integration with the POSIX clock core were provided, the guest
>>> would learn to consume the host's CLOCK_REALTIME as well, as long as
>>> the host uses the tsc as its clocksource.
>>
>> Your proposal, which I'd describe as a direct passthrough (to the
>> extent possible) of the host gettimeofday vdso to a kvm guest, sounds
>> like a much better way to get clock frequency adjustments from the
>> host to the guest. But I don't know if I can think of a reason to do
>> this besides "hey you don't have to run ntp". Is there a situation you
>> have in mind that this helps out?
>
> Running NTP only on the host is a good reason.
> (And probably the only reason I'd call good, because any software that
>  passes TSC or CLOCK_MONOTONIC timestamps between hosts needs to handle
>  their differences.)

There are a handful of distributed algorithms that benefit from clocks
with a bounded worst-case synchronization error.  I think that Google
uses some.  If some cloud provider were to provide, say, 10ms max
CLOCK_REALTIME error and pass CLOCK_REALTIME through using kvm-clock,
it could be quite useful.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-26 17:02             ` Andy Lutomirski
@ 2016-02-26 19:30               ` Marcelo Tosatti
  2016-02-27  0:00                 ` Andy Lutomirski
  0 siblings, 1 reply; 29+ messages in thread
From: Marcelo Tosatti @ 2016-02-26 19:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Radim Krčmář,
	Owen Hofmann, Peter Hornyack, KVM General, Paolo Bonzini

On Fri, Feb 26, 2016 at 09:02:16AM -0800, Andy Lutomirski wrote:
> On Thu, Feb 25, 2016 at 4:20 AM, Radim Krčmář <rkrcmar@redhat.com> wrote:
> > 2016-02-24 19:50-0800, Owen Hofmann:
> >> On Wed, Feb 24, 2016 at 5:19 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> >>> Of course the guest can run its own NTP daemon or similar adjtimex
> >>> caller and cause the guest to stop tracking the host.  But if the host
> >>> passed CLOCK_MONOTONIC through, then the guest would, by default,
> >>> treat kvm-clock as an exactly 1GHz source and would then expose a
> >>> disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
> >>> without an NTP client on the guest.
> >>>
> >>> If integration with the POSIX clock core were provided, the guest
> >>> would learn to consume the host's CLOCK_REALTIME as well, as long as
> >>> the host uses the tsc as its clocksource.
> >>
> >> Your proposal, which I'd describe as a direct passthrough (to the
> >> extent possible) of the host gettimeofday vdso to a kvm guest, sounds
> >> like a much better way to get clock frequency adjustments from the
> >> host to the guest. But I don't know if I can think of a reason to do
> >> this besides "hey you don't have to run ntp". Is there a situation you
> >> have in mind that this helps out?
> >
> > Running NTP only on the host is a good reason.
> > (And probably the only reason I'd call good, because any software that
> >  passes TSC or CLOCK_MONOTONIC timestamps between hosts needs to handle
> >  their differences.)
> 
> There are a handful of distributed algorithms that benefit from clocks
> with a bounded worst-case synchronization error.  I think that Google
> uses some.  If some cloud provider were to provide, say, 10ms max
> CLOCK_REALTIME error and pass CLOCK_REALTIME through using kvm-clock,
> it could be quite useful.
> 
> --Andy

Why would you want to do that again?
To fix the suspend/resume problem?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What time is it kvm-clock?
  2016-02-26 19:30               ` Marcelo Tosatti
@ 2016-02-27  0:00                 ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2016-02-27  0:00 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Radim Krčmář,
	Owen Hofmann, Peter Hornyack, KVM General, Paolo Bonzini

On Fri, Feb 26, 2016 at 11:30 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Fri, Feb 26, 2016 at 09:02:16AM -0800, Andy Lutomirski wrote:
>> On Thu, Feb 25, 2016 at 4:20 AM, Radim Krčmář <rkrcmar@redhat.com> wrote:
>> > 2016-02-24 19:50-0800, Owen Hofmann:
>> >> On Wed, Feb 24, 2016 at 5:19 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> >>> Of course the guest can run its own NTP daemon or similar adjtimex
>> >>> caller and cause the guest to stop tracking the host.  But if the host
>> >>> passed CLOCK_MONOTONIC through, then the guest would, by default,
>> >>> treat kvm-clock as an exactly 1GHz source and would then expose a
>> >>> disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
>> >>> without an NTP client on the guest.
>> >>>
>> >>> If integration with the POSIX clock core were provided, the guest
>> >>> would learn to consume the host's CLOCK_REALTIME as well, as long as
>> >>> the host uses the tsc as its clocksource.
>> >>
>> >> Your proposal, which I'd describe as a direct passthrough (to the
>> >> extent possible) of the host gettimeofday vdso to a kvm guest, sounds
>> >> like a much better way to get clock frequency adjustments from the
>> >> host to the guest. But I don't know if I can think of a reason to do
>> >> this besides "hey you don't have to run ntp". Is there a situation you
>> >> have in mind that this helps out?
>> >
>> > Running NTP only on the host is a good reason.
>> > (And probably the only reason I'd call good, because any software that
>> >  passes TSC or CLOCK_MONOTONIC timestamps between hosts needs to handle
>> >  their differences.)
>>
>> There are a handful of distributed algorithms that benefit from clocks
>> with a bounded worst-case synchronization error.  I think that Google
>> uses some.  If some cloud provider were to provide, say, 10ms max
>> CLOCK_REALTIME error and pass CLOCK_REALTIME through using kvm-clock,
>> it could be quite useful.
>>
>> --Andy
>
> Why would you want to do that again?
> To fix the suspend/resume problem?
>

No.  Any clock that matches one of the host POSIX clocks would solve
the suspend/resume problem.

This is for distributed algorithms.  If I know that no participant's
clock is more than 10ms different from mine, then I can take a
timestamp, wait 10ms, and then I know that no participant will
subsequently get an earlier timestamp.
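
The take-a-timestamp-then-wait idea can be sketched in a few lines of
Python. This is only an illustration under the stated assumption that
every participant's clock stays within max_error_ns of the local one;
the function name stamp_then_wait and the injectable now_ns / sleep
parameters are invented here so the logic can be exercised with a fake
clock.

```python
import time


def stamp_then_wait(max_error_ns, now_ns=time.time_ns, sleep=time.sleep):
    """Take a timestamp, then block until the local clock has moved
    max_error_ns past it. Returns the timestamp."""
    stamp = now_ns()
    deadline = stamp + max_error_ns
    while True:
        remaining = deadline - now_ns()
        if remaining <= 0:
            return stamp
        # sleep() takes seconds; loop again in case it returned early.
        sleep(remaining / 1e9)
```

Once stamp_then_wait(10_000_000) returns, the local clock reads at least
stamp + 10ms, so any participant whose clock is within 10ms of it reads
a value no smaller than stamp.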

--Andy


-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2016-02-27  0:01 UTC | newest]

Thread overview: 29+ messages
2016-02-24  2:31 What time is it kvm-clock? Owen Hofmann
2016-02-24  3:57 ` Marcelo Tosatti
2016-02-24 17:35   ` Peter Hornyack
2016-02-24 20:17     ` Radim Krčmář
2016-02-24 20:24       ` Andy Lutomirski
2016-02-24 20:53         ` Radim Krčmář
2016-02-25 11:13           ` Radim Krčmář
2016-02-25 11:22           ` Marcelo Tosatti
2016-02-24 23:35     ` Marcelo Tosatti
2016-02-24 23:36       ` Marcelo Tosatti
2016-02-25  1:19       ` Andy Lutomirski
2016-02-25  3:50         ` Owen Hofmann
2016-02-25 12:20           ` Radim Krčmář
2016-02-26 17:02             ` Andy Lutomirski
2016-02-26 19:30               ` Marcelo Tosatti
2016-02-27  0:00                 ` Andy Lutomirski
2016-02-25 11:36         ` Radim Krčmář
2016-02-25 12:12         ` Marcelo Tosatti
2016-02-24  3:59 ` Marcelo Tosatti
2016-02-24 14:14 ` Paolo Bonzini
2016-02-24 16:44   ` Andy Lutomirski
2016-02-24 17:38     ` Marcelo Tosatti
2016-02-24 19:38       ` Andy Lutomirski
2016-02-24 19:44         ` Paolo Bonzini
2016-02-24 19:52           ` Andy Lutomirski
2016-02-24 19:55         ` Owen Hofmann
2016-02-25 12:22           ` Joao Martins
2016-02-25 12:22           ` Joao Martins
2016-02-26 15:04 ` Marcelo Tosatti
