From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: Re: kvmclock doesn't work, help?
Date: Thu, 10 Dec 2015 19:33:47 -0200
Message-ID: <20151210213347.GB4836@amt.cnet>
References: <56689A2B.6090500@redhat.com> <5668A76A.7050707@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Paolo Bonzini, kvm list, Radim Krcmar, X86 ML, Alexander Graf
To: Andy Lutomirski
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:57222 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753073AbbLKVTm (ORCPT ); Fri, 11 Dec 2015 16:19:42 -0500
Content-Disposition: inline
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On Wed, Dec 09, 2015 at 02:27:36PM -0800, Andy Lutomirski wrote:
> On Wed, Dec 9, 2015 at 2:12 PM, Paolo Bonzini wrote:
> >
> >
> > On 09/12/2015 22:49, Andy Lutomirski wrote:
> >> On Wed, Dec 9, 2015 at 1:16 PM, Paolo Bonzini wrote:
> >>>
> >>>
> >>> On 09/12/2015 22:10, Andy Lutomirski wrote:
> >>>> Can we please stop making kvmclock more complex? It's a beast right
> >>>> now, and not in a good way. It's far too tangled with the vclock
> >>>> machinery on both the host and guest sides, the pvclock stuff is not
> >>>> well thought out (even in principle in an ABI sense), and it's never
> >>>> been clear to me what problem exactly the kvmclock stuff is supposed
> >>>> to solve.
> >>>
> >>> It's supposed to solve the problem that:
> >>>
> >>> - not all hosts have a working TSC
> >>
> >> Fine, but we don't need any vdso integration for that.
> >
> > Well, you still want a fast time source. That was a given. :)
>
> If the host can't figure out how to give *itself* a fast time source,
> I'd be surprised if KVM can manage to give the guest a fast, reliable
> time source.
>
> >
> >>> - even if they all do, virtual machines can be migrated (or
> >>> saved/restored) to a host with a different TSC frequency
> >>>
> >>> - any MMIO- or PIO-based mechanism to access the current time is orders
> >>> of magnitude slower than the TSC and less precise too.
> >>
> >> Yup. But TSC by itself gets that benefit, too.
> >
> > Yes, the problem is if you want to solve all three of them. The first
> > two are solved by the ACPI PM timer with a decent resolution (70
> > ns---much faster anyway than an I/O port access). The third is solved
> > by TSC. To solve all three, you need kvmclock.
>
> Still confused. Is kvmclock really used in cases where even the host
> can't pull off a working TSC?
>
> >
> >>>> I'm somewhat tempted to suggest that we delete kvmclock entirely and
> >>>> start over. A correctly functioning KVM guest using TSC (i.e.
> >>>> ignoring kvmclock entirely) seems to work rather more reliably and
> >>>> considerably faster than a kvmclock guest.
> >>>
> >>> If all your hosts have a working TSC and you don't do migration or
> >>> save/restore, that's a valid configuration. It's not a good default,
> >>> however.
> >>
> >> Er?
> >>
> >> kvmclock is still really quite slow and buggy.
> >
> > Unless it takes 3-4000 clock cycles for a gettimeofday, which it
> > shouldn't even with vdso disabled, it's definitely not slower than PIO.
> >
> >> And the patch I identified is definitely a problem here:
> >>
> >> [ 136.131241] KVM: disabling fast timing permanently due to inability
> >> to recover from suspend
> >>
> >> I got that on the host with this whitespace-damaged patch:
> >>
> >>         if (backwards_tsc) {
> >>                 u64 delta_cyc = max_tsc - local_tsc;
> >> +               if (!backwards_tsc_observed)
> >> +                       pr_warn("KVM: disabling fast timing
> >> permanently due to inability to recover from suspend\n");
> >>
> >> when I suspended and resumed.
> >>
> >> Can anyone explain what problem
> >> 16a9602158861687c78b6de6dc6a79e6e8a9136f is supposed to solve? On
> >> brief inspection, it just seems to be incorrect. Shouldn't KVM's
> >> normal TSC logic handle that case right? After all, all vcpus should
> >> be paused when we resume from suspend. At worst, we should just need
> >> kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu) on all vcpus. (Actually,
> >> shouldn't we do that regardless of which way the TSC jumped on
> >> suspend/resume? After all, the TSC-to-wall-clock offset is quite
> >> likely to change except on the very small handful of CPUs (if any)
> >> that keep the TSC running in S3 and hibernate.)
> >
> > I don't recall the details of that patch, so Marcelo will have to answer
> > this, or Alex too since he chimed in the original thread. At least it
> > should be made conditional on the existence of a VM at suspend time (and
> > the master clock stuff should be made per VM, as I suggested at
> > https://www.mail-archive.com/kvm@vger.kernel.org/msg102316.html).
> >
> > It would indeed be great if the master clock could be dropped. But I'm
> > definitely missing some of the subtle details. :(
>
> Me, too.
>
> Anyway, see the attached untested patch. Marcelo?
>
> --Andy

Read the last email, about the problem.
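
For reference, a minimal sketch of the guest-side arithmetic behind the
migration point in the quoted list: the host publishes a TSC snapshot
together with a multiplier and a shift, and the guest scales its TSC delta
with them, so moving to a host with a different TSC frequency only changes
the published parameters. This is an illustration under stated assumptions,
not code from this thread or from the kernel: the names pv_time_sample,
pv_scale_delta and pv_read_ns are hypothetical, and the version/sequence
check that protects against reading a half-updated record is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down per-vCPU time record (illustration only). */
struct pv_time_sample {
    uint64_t tsc_timestamp;     /* TSC value when the host wrote the record */
    uint64_t system_time_ns;    /* guest-visible time at that TSC value, in ns */
    uint32_t tsc_to_system_mul; /* 0.32 fixed-point cycles-to-ns multiplier */
    int8_t   tsc_shift;         /* power-of-two pre-scale applied to the delta */
};

/* Scale a TSC delta to nanoseconds: (delta << shift) * mul / 2^32. */
static uint64_t pv_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    assert(shift > -64 && shift < 64);

    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    /*
     * Compute (delta * mul) >> 32 without a 128-bit type by splitting
     * delta into 32-bit halves. The high half contributes hi * mul
     * exactly; the low half contributes the top 32 bits of lo * mul.
     */
    uint64_t hi = delta >> 32;
    uint64_t lo = delta & 0xffffffffu;
    return hi * mul + ((lo * mul) >> 32);
}

/* Guest time = time at the snapshot + scaled cycles elapsed since then. */
static uint64_t pv_read_ns(const struct pv_time_sample *s, uint64_t tsc_now)
{
    uint64_t delta = tsc_now - s->tsc_timestamp;
    return s->system_time_ns +
           pv_scale_delta(delta, s->tsc_to_system_mul, s->tsc_shift);
}
```

With a multiplier of 0x80000000 (0.5) and a shift of 0, which models a
2 GHz TSC, 2000 elapsed cycles add 1000 ns. Modelling a 1 GHz host means
keeping the same multiplier and publishing a shift of 1; the guest code
itself does not change.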