All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Woodhouse <dwmw2@infradead.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Dongli Zhang <dongli.zhang@oracle.com>,
	Joe Jin <joe.jin@oracle.com>,
	x86@kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, pbonzini@redhat.com,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com
Subject: Re: [PATCH RFC 1/1] KVM: x86: add param to update master clock periodically
Date: Mon, 02 Oct 2023 19:16:22 +0100	[thread overview]
Message-ID: <52a3cea2084482fc67e35a0bf37453f84dcd6297.camel@infradead.org> (raw)
In-Reply-To: <ZRrxtagy7vJO5tgU@google.com>

[-- Attachment #1: Type: text/plain, Size: 2446 bytes --]

On Mon, 2023-10-02 at 09:37 -0700, Sean Christopherson wrote:
> On Mon, Oct 02, 2023, David Woodhouse wrote:
> > On Fri, 2023-09-29 at 13:15 -0700, Dongli Zhang wrote:
> > > 
> > > 1. The vcpu->hv_clock (kvmclock) is based on its own mult/shift/equation.
> > > 
> > > 2. The raw monotonic (tsc_clocksource) uses different mult/shift/equation.
> > > 
> > 
> > That just seems wrong. I don't mean that you're incorrect; it seems
> > *morally* wrong.
> > 
> > In a system with X86_FEATURE_CONSTANT_TSC, why would KVM choose to use
> > a *different* mult/shift/equation (your #1) to convert TSC ticks to
> > nanoseconds than the host CLOCK_MONOTONIC_RAW does (your #2).
> > 
> > I understand that KVM can't track the host's CLOCK_MONOTONIC, as it's
> > adjusted by NTP. But CLOCK_MONOTONIC_RAW is supposed to be consistent.
> > 
> > Fix that, and the whole problem goes away, doesn't it?
> > 
> > What am I missing here, that means we can't do that?
> 
> I believe the answer is that "struct pvclock_vcpu_time_info" and its math are
> ABI between KVM and KVM guests.
> 
> Like many of the older bits of KVM, my guess is that KVM's behavior is the product
> of making things kinda sorta work with old hardware, i.e. was probably the least
> awful solution in the days before constant TSCs, but is completely nonsensical on
> modern hardware.

I still don't understand. The ABI and its math are fine. The ABI is just
 "at time X the TSC was Y, and the TSC frequency is Z"

I understand why on older hardware, those values needed to *change*
occasionally when TSC stupidity happened. 

But on newer hardware, surely we can set them precisely *once* when the
VM starts, and never ever have to change them again? Theoretically not
even when we pause the VM, kexec into a new kernel, and resume the VM!

But we *are* having to change it, because apparently
CLOCK_MONOTONIC_RAW is doing something *other* than incrementing at
precisely the frequency of the known and constant TSC.

But *why* is CLOCK_MONOTONIC_RAW doing that? I thought that the whole
point of CLOCK_MONOTONIC_RAW was to be consistent and not adjusted by
NTP etc.? Shouldn't it run at precisely the same rate as the kvmclock,
with no skew at all?

And if CLOCK_MONOTONIC_RAW is not what I thought it was... do we really
have to keep resetting the kvmclock to it at all? On modern hardware
can't the kvmclock be defined by the TSC alone?

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

  parent reply	other threads:[~2023-10-02 18:16 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-26 23:06 [PATCH RFC 1/1] KVM: x86: add param to update master clock periodically Dongli Zhang
2023-09-27  0:29 ` Joe Jin
2023-09-27  0:36   ` Dongli Zhang
2023-09-28 16:18     ` Sean Christopherson
2023-09-29 20:15       ` Dongli Zhang
2023-10-02  8:33         ` David Woodhouse
2023-10-02 16:37           ` Sean Christopherson
2023-10-02 17:17             ` Dongli Zhang
2023-10-02 18:18               ` Sean Christopherson
2023-10-02 21:06                 ` Peter Zijlstra
2023-10-02 21:16                   ` Peter Zijlstra
2023-10-02 18:16             ` David Woodhouse [this message]
2023-10-03  0:53               ` Sean Christopherson
2023-10-03  1:32                 ` Dongli Zhang
2023-10-03  1:49                   ` Sean Christopherson
2023-10-03  2:07                     ` Dongli Zhang
2023-10-03 21:00                       ` Sean Christopherson
2023-10-03  5:54                 ` David Woodhouse
2023-10-04  0:04                   ` Sean Christopherson
2023-10-04 10:01                     ` David Woodhouse
2023-10-04 18:06                       ` Sean Christopherson
2023-10-04 19:13                         ` Dongli Zhang
2023-10-11  0:20                           ` Sean Christopherson
2023-10-11  7:18                             ` David Woodhouse
2023-10-13 18:07                               ` Sean Christopherson
2023-10-13 18:21                                 ` David Woodhouse
2023-10-13 19:02                                   ` Sean Christopherson
2023-10-13 19:12                                     ` David Woodhouse
2023-10-13 20:03                                       ` Sean Christopherson
2023-10-13 20:12                                 ` Dongli Zhang
2023-10-13 23:26                                   ` Sean Christopherson
2023-10-14  9:49                                     ` David Woodhouse
2023-10-16 15:47                                       ` Dongli Zhang
2023-10-16 16:25                                         ` David Woodhouse
2023-10-16 17:04                                           ` Dongli Zhang
2023-10-16 18:49                                           ` Sean Christopherson
2023-10-16 22:04                                             ` Dongli Zhang
2023-10-16 22:48                                               ` Sean Christopherson
2023-10-17 16:18                                                 ` Dongli Zhang
2023-10-03  9:12                 ` David Woodhouse
2023-10-04  0:07                   ` Sean Christopherson
2023-10-04  8:06                     ` David Woodhouse
2023-10-03 14:29                 ` David Woodhouse
2023-10-04  0:10                   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52a3cea2084482fc67e35a0bf37453f84dcd6297.camel@infradead.org \
    --to=dwmw2@infradead.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dongli.zhang@oracle.com \
    --cc=joe.jin@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.