[RFC PATCH v2] Cleaning up the KVM clock mess

* [RFC PATCH v2] Cleaning up the KVM clock mess
@ 2024-04-27 11:04 David Woodhouse
  2024-04-27 11:04 ` [PATCH v2 01/15] KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init() David Woodhouse
                   ` (14 more replies)
  0 siblings, 15 replies; 20+ messages in thread
From: David Woodhouse @ 2024-04-27 11:04 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Jonathan Corbet, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Paul Durrant, Shuah Khan, linux-doc,
	linux-kernel, linux-kselftest, Oliver Upton, Marcelo Tosatti,
	jalliste, sveith, zide.chen, Dongli Zhang

Clean up the KVM clock mess somewhat so that it is either based on the guest
TSC ("master clock" mode), or on the host CLOCK_MONOTONIC_RAW in cases where
the TSC isn't usable.

Eliminate the third variant where it was based directly on the *host* TSC,
due to bugs in e.g. __get_kvmclock().

Kill off the last vestiges of the KVM clock being based on CLOCK_MONOTONIC
instead of CLOCK_MONOTONIC_RAW and thus being subject to NTP skew.

Fix up migration support to allow the KVM clock to be saved/restored as an
arithmetic function of the guest TSC, since that's what it actually is in
the *common* case so it can be migrated precisely. Or at least to within
±1 ns which is good enough, as discussed in
https://lore.kernel.org/kvm/c8dca08bf848e663f192de6705bf04aa3966e856.camel@infradead.org

In v2 of this series, TSC synchronization is improved and simplified a bit
too, and we allow masterclock mode to be used even when the guest TSCs are
out of sync, as long as they're running at the same *rate*. The different
*offset* shouldn't matter.

And the kvm_get_time_scale() function annoyed me by being entirely opaque,
so I studied it until my brain hurt and then added some comments.

In v2 I also dropped the commits which were removing the periodic clock
syncs. Those are going to be needed still but *only* for non-masterclock
mode, which I'll do next. Along with ensuring that a masterclock update
while already in masterclock mode doesn't jump the clock, and just does
the same as KVM_SET_CLOCK_GUEST does to preserve it.

Needs a *lot* more testing. I think I'm almost done refactoring the code,
so should focus on building up the tests next.

(I do still hate that we're abusing KVM_GET_CLOCK just to get the tuple
of {host_tsc, CLOCK_REALTIME} without even *caring* about the eponymous
KVM clock. Especially as this information is (a) fundamentally what the
vDSO gettimeofday() exposes to us anyway, (b) using CLOCK_REALTIME not
TAI, (c) not available on other platforms, for example for migrating
the Arm arch counter.)

David Woodhouse (13):
      KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init()
      KVM: x86: Improve accuracy of KVM clock when TSC scaling is in force
      KVM: x86: Explicitly disable TSC scaling without CONSTANT_TSC
      KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration
      KVM: x86: Avoid NTP frequency skew for KVM clock on 32-bit host
      KVM: x86: Fix KVM clock precision in __get_kvmclock()
      KVM: x86: Fix software TSC upscaling in kvm_update_guest_time()
      KVM: x86: Simplify and comment kvm_get_time_scale()
      KVM: x86: Remove implicit rdtsc() from kvm_compute_l1_tsc_offset()
      KVM: x86: Improve synchronization in kvm_synchronize_tsc()
      KVM: x86: Kill cur_tsc_{nsec,offset,write} fields
      KVM: x86: Allow KVM master clock mode when TSCs are offset from each other
      KVM: x86: Factor out kvm_use_master_clock()

Jack Allister (2):
      KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration
      KVM: selftests: Add KVM/PV clock selftest to prove timer correction

 Documentation/virt/kvm/api.rst                    |  37 ++
 Documentation/virt/kvm/devices/vcpu.rst           | 115 +++-
 arch/x86/include/asm/kvm_host.h                   |  15 +-
 arch/x86/include/uapi/asm/kvm.h                   |   6 +
 arch/x86/kvm/svm/svm.c                            |   3 +-
 arch/x86/kvm/vmx/vmx.c                            |   2 +-
 arch/x86/kvm/x86.c                                | 687 +++++++++++++++-------
 arch/x86/kvm/xen.c                                |   4 +-
 include/uapi/linux/kvm.h                          |   3 +
 tools/testing/selftests/kvm/Makefile              |   1 +
 tools/testing/selftests/kvm/x86_64/pvclock_test.c | 192 ++++++
 11 files changed, 822 insertions(+), 243 deletions(-)

^ permalink raw reply	[flat|nested] 20+ messages in thread