From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: What time is it kvm-clock?
Date: Wed, 24 Feb 2016 20:35:00 -0300
Message-ID: <20160224233500.GA17304@amt.cnet>
References: <20160224035753.GA6681@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Owen Hofmann, KVM General, Paolo Bonzini, Andy Lutomirski
To: Peter Hornyack
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:57557 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751949AbcBXXfW (ORCPT ); Wed, 24 Feb 2016 18:35:22 -0500
Content-Disposition: inline
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti wrote:
> > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> >> Specifically, what underlying source of time should be exposed through
> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> >> page? Recently a couple of threads on kvm-list, along with attempts
> >> to produce reliable behavior from kvm-clock on our systems have
> >> highlighted a tension between the current implementation of kvm-clock
> >> and potentially diverging goals for paravirt time. Here are a few:
> >>
> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> >>
> >> This question is mostly in regards to kvm-clock in masterclock mode
> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> >> expose a source of time that is more 'true' than the underlying TSC?
> >> For example, by passing through NTP correction from the host. For the
> >> current implementation, the answer seems to be... why not both?
> >> Once programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> >> multiplied by the frequency specified by kvm. On the other hand,
> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> >> are measured against corrected time from the host. A guest reading its
> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> >> guest has run long enough for TSC to diverge from NTP time. A VMM
> >> using these ioctls to save and restore clock state can produce wild
> >> time jumps from the guest's perspective.
> >>
> >> The patches in (2) address this mismatch by plumbing updates to clock
> >> frequency through kvm-clock to the guest. This seems like an important
> >> design choice for kvm-clock, and IMO deserves at least a clear
> >> statement of the goals for this interface, if not some more
> >> discussion.
> >
> > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
> >
> > The interfaces have been introduced to fix a bug.
> >
> >> The (later) thread in (3) claims that synchronizing with
> >> host time is *not* a goal of kvm-clock.
> >
> > It is not.
> >
> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
> >> simply a more enlightened path to the host TSC. Maintaining a
> >> high-performance path to the TSC in the face of updates is tricky -
> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
> >> discussion on the patchset in (2). Is the cost of auditing that the
> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> >> gettimeofday both tracks host time correctly and does not produce any
> >> backwards warps worth the added value, if it exists? As an
> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> >> function of the last update to kvm-clock or the reference TSC page,
> >> respectively, sounds very straightforward.
> >>
> >> (Outside of masterclock mode, the requirement that the client
> >> synchronizes across cpus for monotonicity smoothes over a lot of
> >> complexity - periodically updating kvm-clock to the current time is
> >> simple and works.)
> >>
> >> Regardless of my opinion, I think that a clear statement of the design
> >> goals for kvm-clock (and kvm's implementation of the reference TSC
> >> page) would be valuable.
> >
> > Documentation/virtual/kvm/timekeeping.txt
> >
> Hi Marcelo,
>
> While I appreciate all of the detail in timekeeping.txt, it is not a
> very good reference for what kvm-clock is or how it works. kvm-clock
> is only mentioned three times in different places throughout that
> document, and nowhere is there a very clear statement of what
> kvm-clock is supposed to do or how it does it.
>
> For somebody that does not already have a deep understanding of the
> core masterclock code, trying to understand how kvm-clock works is a

There is no "deep understanding". There is one comment there about why
you can't update system_timestamp + tsc_offset (you have to read the
kvmclock clock read function to understand this sentence) in parallel
on multiple VCPUs, and that's all masterclock is about.

It's called "master" because there must be only one system_timestamp,
not multiple (therefore that's the "master" copy of system_time).

> real challenge.
>
> Thanks,
> Peter

Design goals: provide a reliable clocksource device to Linux guests so
they are able to cope with virtualization problems, namely:

1. Migration to hosts with a different TSC frequency.
2. Support for hosts with TSCs that are not stable (whose counting
   frequency changes across processor frequency changes).

How: Expose a clock device which counts at 1GHz to guests.

pvclock was created for use with Xen; kvmclock is KVM's implementation
of the pvclock interface.
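
For readers who have not looked at the clock read function: a clock
that "counts at 1GHz" hands out nanoseconds, and the guest derives them
from the TSC. Below is a minimal single-threaded sketch of that
arithmetic. It is not the kernel code. The field names are modeled on
the pvclock time-info structure, but the struct is simplified (no
flags, no padding), the retry loop on the version field is left out,
and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical copy of the per-VCPU time info written by the host. */
struct pvclock_time_info {
    uint32_t version;           /* odd while the host is updating (not checked here) */
    uint64_t tsc_timestamp;     /* TSC value when system_time was sampled */
    uint64_t system_time;       /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul; /* 32.32 fixed-point ns-per-tick multiplier */
    int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* Convert a TSC delta to nanoseconds: ((delta << shift) * mul) >> 32. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    assert(shift > -64 && shift < 64); /* larger shifts would be undefined */

    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    /* Split the multiply so the intermediate product fits in 64 bits. */
    uint64_t hi = delta >> 32;
    uint64_t lo = delta & 0xffffffffu;
    return hi * mul + ((lo * mul) >> 32);
}

/* Nanoseconds "now", given the current TSC value read by the guest. */
static uint64_t pvclock_read_ns(const struct pvclock_time_info *ti,
                                uint64_t tsc_now)
{
    uint64_t delta = tsc_now - ti->tsc_timestamp;
    return ti->system_time +
           scale_delta(delta, ti->tsc_to_system_mul, ti->tsc_shift);
}
```

With a 2GHz TSC, for example, one tick is 0.5ns, so the multiplier is
0x80000000 with a shift of 0, and 2,000,000 ticks after tsc_timestamp
the function returns system_time + 1,000,000. Migrating to a host with
a different TSC frequency only changes the multiplier and shift; the
guest keeps reading nanoseconds.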
-----

The goals and the "how" above are the "design properties"; the rest
evolved over time from requirements in the field which had not been
realized when kvmclock was initially "designed".

-----

Evolution of masterclock scheme (bugs uncovered):

Problem: time backwards as seen by guests.
Solution: Fix in guest with pvclock global variable (cmpxchg).

Problem: gettimeofday() performance.
Solution: Use masterclock scheme (update pvclock areas in sync
to avoid time backwards events being visible to guests; it's well
documented in x86.c. If something is unclear, please try to
understand the code / ask, and you/we improve the documentation
there).

Problem: get_kernel_ns vs. TSC clock get out of sync and Hyper-V
complains about the difference.
Solution: expose the NTP TSC frequency so that guests apply NTP
frequency correction to their kvmclock reads on TSC as well.

---

About the future: I agree with Andy that kvmclock should be removed.
So there is a pending work item there: "verify TSC clocksource is fine
for exposing to guests, think about the implications for management
software".

I can write down a list of items that have been fixed for kvmclock and
would have to be checked for the tsc clocksource. Anyone willing to
take that task?

---

About the complaint that "it's not well designed whether NTP correction
should be applied or not".

There are two different things:

1) Host clock and guest clocks synchronized.

KVM is not responsible for that, and it can't be, because Linux exposes
a clock which is created in software and fixed by NTP.

2) NTP frequency correction being applied to kvmclock.

This only means that the frequency of the pvclock reads in the guest
is NTP corrected.

Whether it's necessary: No, it's not strictly necessary, because the
clock exposed to the guest is the Linux clock, maintained by Linux on
top of kvmclock (via interrupts and kvmclock reads).

So for KVM-RT, for example, it's fine to have one system_timestamp
(read at guest initialization) and the uncorrected host TSC value.
That is because the guest's clock will be NTP corrected (via
sys_adjtimex), so the guest clock will be synchronized to UTC.
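
For reference, the first fix listed above (time going backwards, fixed
in the guest with a global variable and cmpxchg) boils down to clamping
every read against the latest value any CPU has returned. A minimal
sketch using C11 atomics follows. This is an illustration, not the
guest kernel code, and the names last_value, monotonic_clamp and
clamp_example are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Latest timestamp handed out by any CPU (shared by all readers). */
static _Atomic uint64_t last_value;

/*
 * Return a timestamp that never goes backwards across CPUs: if another
 * CPU already reported a later time, return that instead of 'now'.
 */
static uint64_t monotonic_clamp(uint64_t now)
{
    uint64_t last = atomic_load(&last_value);

    do {
        if (now < last)
            return last; /* someone else is ahead of us */
        /* On failure the CAS reloads 'last' and we compare again. */
    } while (!atomic_compare_exchange_weak(&last_value, &last, now));

    return now;
}

/* Usage example: a stale read is replaced by the newest value seen. */
void clamp_example(void)
{
    assert(monotonic_clamp(100) == 100);
    assert(monotonic_clamp(90) == 100);  /* would have gone backwards */
    assert(monotonic_clamp(150) == 150);
}
```

The shared variable is what costs gettimeofday() performance: every
read on every VCPU touches the same cache line. That is the motivation
for the masterclock scheme, where the pvclock areas are updated in sync
so the guest can skip this step.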