From: Marcelo Tosatti <mtosatti@redhat.com>
To: Peter Hornyack <peterhornyack@google.com>
Cc: Owen Hofmann <osh@google.com>, KVM General <kvm@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Andy Lutomirski <luto@amacapital.net>
Subject: Re: What time is it kvm-clock?
Date: Wed, 24 Feb 2016 20:35:00 -0300
Message-ID: <20160224233500.GA17304@amt.cnet>
In-Reply-To: <CA+0KQ4PXW90DWw3nkonbvGY5aiYLi4Vg-aYWt5s_q1MXpRKMzA@mail.gmail.com>

On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> >> Specifically, what underlying source of time should be exposed through
> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> >> page?  Recently a couple of threads on kvm-list, along with attempts
> >> to produce reliable behavior from kvm-clock on our systems have
> >> highlighted a tension between the current implementation of kvm-clock
> >> and potentially diverging goals for paravirt time. Here are a few:
> >>
> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> >>
> >> This question is mostly in regards to kvm-clock in masterclock mode
> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> >> expose a source of time that is more 'true' than the underlying TSC?
> >> For example, by passing through NTP correction from the host. For the
> >> current implementation, the answer seems to be... why not both? Once
> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> >> multiplied by the frequency specified by kvm. On the other hand,
> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> >> are measured against corrected time from the host. A guest reading its
> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> >> guest has run long enough for the TSC to diverge from NTP time. A VMM
> >> using these ioctls to save and restore clock state can produce wild
> >> time jumps from the guest's perspective.
> >>
> >> The patches in (2) address this mismatch by plumbing updates to clock
> >> frequency through kvm-clock to the guest. This seems like an important
> >> design choice for kvm-clock, and IMO deserves at least a clear
> >> statement of the goals for this interface, if not some more
> >> discussion.
> >
> > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
> >
> > The interfaces have been introduced to fix a bug.
> >
> >> The (later) thread in (3) claims that synchronizing with
> >> host time is *not* a goal of kvm-clock.
> >
> > It is not.
> >
> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
> >> simply a more enlightened path to the host TSC. Maintaining a
> >> high-performance path to the TSC in the face of updates is tricky -
> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
> >> discussion on the patchset in (2). Is the cost of auditing that the
> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> >> gettimeofday both tracks host time correctly and does not produce any
> >> backwards warps worth the added value, if it exists? As an
> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> >> function of the last update to kvm-clock or the reference TSC page,
> >> respectively, sounds very straightforward.
> >>
> >> (Outside of masterclock mode, the requirement that the client
> >> synchronizes across cpus for monotonicity smooths over a lot of
> >> complexity - periodically updating kvm-clock to the current time is
> >> simple and works.)
> >>
> >> Regardless of my opinion, I think that a clear statement of the design
> >> goals for kvm-clock (and kvm's implementation of the reference TSC
> >> page) would be valuable.
> >
> > Documentation/virtual/kvm/timekeeping.txt
> >
> 
> Hi Marcelo,
> 
> While I appreciate all of the detail in timekeeping.txt, it is not a
> very good reference for what kvm-clock is or how it works. kvm-clock
> is only mentioned three times in different places throughout that
> document, and nowhere is there a very clear statement of what
> kvm-clock is supposed to do or how it does it.
> 
> For somebody that does not already have a deep understanding of the
> core masterclock code, trying to understand how kvm-clock works is a

There is no "deep understanding". There is one comment there about
why you can't update system_timestamp + tsc_offset (you have to read
the kvmclock clock read function to understand this sentence) in
parallel on multiple VCPUs, and that's all masterclock is about.

It's called "master" because there must be only one system_timestamp
and not multiple (therefore that's the "master" copy of system_time).

> real challenge.
> 
> Thanks,
> Peter

Design goals: provide a reliable clocksource device to Linux guests
so they are able to cope with virtualization problems, namely:

1. Migration to hosts with different TSC frequency.
2. Support for hosts with TSCs that are not stable (whose
counting frequency changes across processor frequency changes).

How: Expose a clock device which counts at 1GHz to guests.

pvclock was created for use with Xen; kvmclock is KVM's implementation
of the pvclock interface.
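
For readers without the clock read function at hand, here is a rough
sketch of how a pvclock-style read turns a raw TSC value into
nanoseconds (that is, a clock counting at 1GHz). It is a simplified
illustration, not the kernel code: struct pv_time_info and
pv_clock_read_ns() are made-up names, and the version/seqlock retry
loop and the flags field of the real interface are left out.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down view of one per-VCPU time area. */
struct pv_time_info {
    uint64_t tsc_timestamp;     /* host TSC sampled at the last update */
    uint64_t system_time;       /* nanoseconds at that same instant */
    uint32_t tsc_to_system_mul; /* 32.32 fixed-point ns-per-tick factor */
    int8_t   tsc_shift;         /* power-of-two pre-scale of the delta */
};

/* Scale a TSC delta to nanoseconds: (delta << shift) * mul / 2^32. */
uint64_t pv_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    /* Split the 64x32 multiply so it fits in 64-bit arithmetic. */
    uint64_t hi = delta >> 32;
    uint64_t lo = delta & 0xffffffffu;
    return hi * mul + ((lo * mul) >> 32);
}

/* Nanoseconds now = snapshot time + scaled ticks since the snapshot. */
uint64_t pv_clock_read_ns(const struct pv_time_info *info, uint64_t tsc)
{
    uint64_t delta = tsc - info->tsc_timestamp;
    return info->system_time +
           pv_scale_delta(delta, info->tsc_to_system_mul, info->tsc_shift);
}
```

With a 2GHz TSC the multiplier would be 0x80000000 (0.5 in 32.32 fixed
point) and the shift 0, so 2000 ticks after the snapshot the clock has
advanced by 1000ns. After a migration to a host with a different TSC
frequency only the multiplier and shift change; the guest still sees
nanoseconds.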

-----

These are the "design properties"; the rest evolved over
time from requirements in the field which had not been
recognized when it was initially "designed".

-----

Evolution of masterclock scheme (bugs uncovered):

Problem: time going backwards as seen by guests.
Solution: Fix in the guest with a pvclock global variable (cmpxchg).
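
The idea behind that fix can be sketched roughly as follows: every
reader publishes the value it is about to return into one shared
variable, and never returns less than what is already there. This is an
illustration using C11 atomics, not the guest kernel's code;
last_value and monotonic_clamp() are names picked for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Largest clock value handed out so far, shared by all CPUs. */
_Atomic uint64_t last_value;

/*
 * Take a freshly computed per-CPU clock reading and return a value
 * that is never smaller than any value returned before.
 */
uint64_t monotonic_clamp(uint64_t now)
{
    uint64_t last = atomic_load(&last_value);

    do {
        /* Another CPU already returned a later time: reuse it. */
        if (now <= last)
            return last;
        /* On failure, 'last' is reloaded with the current value. */
    } while (!atomic_compare_exchange_weak(&last_value, &last, now));

    return now;
}
```

The cost is that every clock read on every CPU touches the same cache
line.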

Problem: gettimeofday() performance.
Solution: Use the masterclock scheme (update the pvclock areas in sync,
so that no time-backwards event is visible to guests; it's well
documented in x86.c, and if something is unclear please try to
understand the code / ask, and you/we improve the documentation there).

Problem: get_kernel_ns and the TSC clock get out of sync, and
Hyper-V complains about the difference.

Solution: expose the NTP-corrected TSC frequency so that guests
apply NTP frequency correction to their TSC-based kvmclock reads as well.

---

About the future: I agree with Andy that kvmclock should be removed.
So there is a pending work item there: "verify TSC clocksource
is fine for exposing to guests, think about the implications for
management software".
I can write down a list of items that have been fixed
for kvmclock and would have to be checked for the tsc clocksource.

Anyone willing to take that task?

---

About the complaint that "it's not well designed whether NTP correction
should be applied or not". There are two different things:

1) Host clock and guest clocks synchronized.
KVM is not responsible for that, and it can't be, because
Linux exposes a clock which is created in software
and fixed by NTP.

2) NTP frequency correction being applied to kvmclock.

This only means that the frequency of the pvclock reads
in the guest is NTP corrected.

Whether it's necessary: No, it's not strictly necessary, because
the clock exposed to the guest is the Linux clock, maintained
by Linux on top of kvmclock (via interrupts and kvmclock reads).

So for KVM-RT, for example, it's fine to have one

    system_timestamp (read at guest initialization).
    uncorrected host TSC value.

because the guest's clock will be NTP corrected (via sys_adjtimex)
and the guest clock will be synchronized to UTC.

