linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Maxim Levitsky <mlevitsk@redhat.com>
To: kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Oliver Upton <oupton@google.com>, Ingo Molnar <mingo@redhat.com>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org (open list),
	Marcelo Tosatti <mtosatti@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Wanpeng Li <wanpengli@tencent.com>,
	Borislav Petkov <bp@alien8.de>, Jim Mattson <jmattson@google.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	linux-doc@vger.kernel.org (open list:DOCUMENTATION),
	Joerg Roedel <joro@8bytes.org>,
	x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)),
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Maxim Levitsky <mlevitsk@redhat.com>
Subject: [PATCH 0/2] RFC: Precise TSC migration
Date: Mon, 30 Nov 2020 15:35:57 +0200	[thread overview]
Message-ID: <20201130133559.233242-1-mlevitsk@redhat.com> (raw)

Hi!

This is the first version of the work to make TSC migration more accurate,
as was defined by Paulo at:
https://www.spinics.net/lists/kvm/msg225525.html

I have a few thoughts about the kvm masterclock synchronization,
which relate to the Paulo's proposal that I implemented.

The idea of masterclock is that when the host TSC is synchronized
(or as kernel call it, stable), and the guest TSC is synchronized as well,
then we can base the kvmclock, on the same pair of
(host time in nsec, host tsc value), for all vCPUs.

This makes the random error in calculation of this value invariant
across vCPUS, and allows the guest to do kvmclock calculation in userspace
(vDSO) since kvmclock parameters are vCPU invariant.

To ensure that the guest tsc is synchronized we currently track host/guest tsc
writes, and enable the master clock only when roughly the same guest's TSC value
was written across all vCPUs.

Recently this was disabled by Paulo and I agree with this, because I think
that we indeed should only make the guest TSC synchronized by default
(including new hotplugged vCPUs) and not do any tsc synchronization beyond that.
(Trying to guess when the guest syncs the TSC can cause more harm that good).

Besides, Linux guests don't sync the TSC via IA32_TSC write,
but rather use IA32_TSC_ADJUST which currently doesn't participate
in the tsc sync heruistics.
And as far as I know, Linux guest is the primary (only?) user of the kvmclock.

I *do think* however that we should redefine KVM_CLOCK_TSC_STABLE
in the documentation to state that it only guarantees invariance if the guest
doesn't mess with its own TSC.

Also I think we should consider enabling the X86_FEATURE_TSC_RELIABLE
in the guest kernel, when kvm is detected to avoid the guest even from trying
to sync TSC on newly hotplugged vCPUs.

(The guest doesn't end up touching TSC_ADJUST usually, but it still might
in some cases due to scheduling of guest vCPUs)

(X86_FEATURE_TSC_RELIABLE short circuits tsc synchronization on CPU hotplug,
and TSC clocksource watchdog, and the later we might want to keep).

For host TSC writes, just as Paulo proposed we can still do the tsc sync,
unless the new code that I implemented is in use.

Few more random notes:

I have a weird feeling about using 'nsec since 1 January 1970'.
Common sense is telling me that a 64 bit value can hold about 580 years,
but still I see that it is more common to use timespec which is a (sec,nsec) pair.

I feel that 'kvm_get_walltime' that I added is a bit of a hack.
Some refactoring might improve things here.

For example making kvm_get_walltime_and_clockread work in non tsc case as well
might make the code cleaner.

Patches to enable this feature in qemu are in process of being sent to
qemu-devel mailing list.

Best regards,
       Maxim Levitsky

Maxim Levitsky (2):
  KVM: x86: implement KVM_SET_TSC_PRECISE/KVM_GET_TSC_PRECISE
  KVM: x86: introduce KVM_X86_QUIRK_TSC_HOST_ACCESS

 Documentation/virt/kvm/api.rst  | 56 +++++++++++++++++++++
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/x86.c              | 88 +++++++++++++++++++++++++++++++--
 include/uapi/linux/kvm.h        | 14 ++++++
 4 files changed, 154 insertions(+), 5 deletions(-)

-- 
2.26.2



             reply	other threads:[~2020-11-30 13:37 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-30 13:35 Maxim Levitsky [this message]
2020-11-30 13:35 ` [PATCH 1/2] KVM: x86: implement KVM_SET_TSC_PRECISE/KVM_GET_TSC_PRECISE Maxim Levitsky
2020-11-30 14:33   ` Paolo Bonzini
2020-11-30 15:58     ` Maxim Levitsky
2020-11-30 17:01       ` Paolo Bonzini
2020-12-01 19:43   ` Thomas Gleixner
2020-12-03 11:11     ` Maxim Levitsky
2020-11-30 13:35 ` [PATCH 2/2] KVM: x86: introduce KVM_X86_QUIRK_TSC_HOST_ACCESS Maxim Levitsky
2020-11-30 13:54   ` Paolo Bonzini
2020-11-30 14:11     ` Maxim Levitsky
2020-11-30 14:15       ` Paolo Bonzini
2020-11-30 15:33         ` Maxim Levitsky
2020-11-30 13:38 ` [PATCH 0/2] RFC: Precise TSC migration (summary) Maxim Levitsky
2020-11-30 16:54 ` [PATCH 0/2] RFC: Precise TSC migration Andy Lutomirski
2020-11-30 16:59   ` Paolo Bonzini
2020-11-30 19:16 ` Marcelo Tosatti
2020-12-01 12:30   ` Maxim Levitsky
2020-12-01 19:48     ` Marcelo Tosatti
2020-12-03 11:39       ` Maxim Levitsky
2020-12-03 20:18         ` Marcelo Tosatti
2020-12-07 13:00           ` Maxim Levitsky
2020-12-01 13:48   ` Thomas Gleixner
2020-12-01 15:02     ` Marcelo Tosatti
2020-12-03 11:51       ` Maxim Levitsky
2020-12-01 14:01   ` Thomas Gleixner
2020-12-01 16:19     ` Andy Lutomirski
2020-12-03 11:57       ` Maxim Levitsky
2020-12-01 19:35 ` Thomas Gleixner
2020-12-03 11:41   ` Paolo Bonzini
2020-12-03 12:47   ` Maxim Levitsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201130133559.233242-1-mlevitsk@redhat.com \
    --to=mlevitsk@redhat.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=oupton@google.com \
    --cc=pbonzini@redhat.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).