From: Oliver Upton <oupton@google.com>
To: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Marc Zyngier <maz@kernel.org>, Peter Shier <pshier@google.com>,
	Jim Mattson <jmattson@google.com>,
	David Matlack <dmatlack@google.com>,
	Ricardo Koller <ricarkol@google.com>,
	Jing Zhang <jingzhangos@google.com>,
	Raghavendra Rao Anata <rananta@google.com>,
	James Morse <james.morse@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Andrew Jones <drjones@redhat.com>, Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Oliver Upton <oupton@google.com>
Subject: [PATCH v7 0/6] KVM: x86: Add idempotent controls for migrating system counter state
Date: Mon, 16 Aug 2021 00:11:24 +0000
Message-ID: <20210816001130.3059564-1-oupton@google.com>

KVM's current means of saving/restoring system counters is plagued with
temporal issues. On x86, we migrate the guest's system counter by-value
through the respective guest's IA32_TSC value. Restoring system counters
by-value is brittle as the state is not idempotent: the host system
counter keeps advancing between the attempted save and restore.
Furthermore, VMMs may wish to transparently live migrate guest VMs,
meaning that they include the elapsed time due to live migration blackout
in the guest system counter view. The VMM thread could be preempted for
any number of reasons (the scheduler, or the L0 hypervisor when running
nested) between the time that it calculates the desired guest counter
value and when KVM actually sets this counter state.

Despite the value-based interface that we present to userspace, KVM
actually has idempotent guest controls by way of the TSC offset.
We can avoid all of the issues associated with a value-based interface
by abstracting these offset controls in a new device attribute. This
series introduces new vCPU device attributes to provide userspace access
to the vCPU's system counter offset.
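
As a rough illustration of the difference (a minimal sketch, not code from
this series; the function names are hypothetical and the guest counter is
simply modeled as host counter plus offset, modulo 2^64):

```c
#include <stdint.h>

/* Assumed model: the guest-visible counter is host counter + offset. */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t offset)
{
	return host_tsc + offset;
}

/*
 * Value-based restore: the resulting offset depends on what the host
 * counter reads at the moment the write lands, so any delay between
 * computing the value and writing it is silently lost.
 */
static uint64_t offset_from_value(uint64_t desired_guest_tsc,
				  uint64_t host_tsc_at_write)
{
	return desired_guest_tsc - host_tsc_at_write;
}
```

With a saved guest value of 5000 taken at host count 1000, the offset is
4000. If the by-value write lands 300 ticks late, the derived offset is
3700 and the guest has lost those 300 ticks. Writing the offset 4000
directly yields the same guest view no matter when, or how many times,
the write happens.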

Patch 1 addresses a possible race in KVM_GET_CLOCK where
use_master_clock is read outside of the pvclock_gtod_sync_lock.

Patch 2 is a cleanup, moving the implementation of KVM_{GET,SET}_CLOCK
into helper methods.

Patch 3 adopts Paolo's suggestion, augmenting the KVM_{GET,SET}_CLOCK
ioctls to provide userspace with a (host_tsc, realtime) instant. This is
essential for a VMM to perform precise migration of the guest's system
counters.
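
One way a VMM might use such an instant is sketched below. This is only an
illustration under stated assumptions, not the interface added by this
series: struct clock_instant, ns_to_ticks() and dest_offset() are
hypothetical names, both hosts are assumed to have synchronized realtime
clocks, and the guest TSC frequency is assumed identical on both sides.

```c
#include <stdint.h>

/* Hypothetical snapshot: host TSC and host realtime taken together. */
struct clock_instant {
	uint64_t host_tsc;
	uint64_t realtime_ns;
};

/* kHz is ticks per millisecond; no overflow handling in this sketch. */
static uint64_t ns_to_ticks(uint64_t ns, uint64_t tsc_khz)
{
	return ns * tsc_khz / 1000000ULL;
}

/*
 * Compute the destination offset so that the guest counter continues
 * from its value at the source instant, advanced by the ticks that
 * elapsed (per realtime) between the two instants.
 */
static uint64_t dest_offset(struct clock_instant src, uint64_t src_offset,
			    struct clock_instant dst, uint64_t tsc_khz)
{
	uint64_t guest_at_src = src.host_tsc + src_offset;
	uint64_t elapsed_ticks =
		ns_to_ticks(dst.realtime_ns - src.realtime_ns, tsc_khz);

	/* Unsigned wraparound is intentional: offsets may be "negative". */
	return guest_at_src + elapsed_ticks - dst.host_tsc;
}
```

Because each instant pairs the host TSC with realtime, the result does not
depend on when the VMM thread gets to run after taking the snapshot.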

Patches 4-5 are preparatory changes for exposing the TSC offset to
userspace. Patch 6 adds a vCPU attribute that gives userspace access
to the TSC offset.

This series was tested with the new KVM selftests for the KVM clock and
system counter offset controls on Haswell hardware. Note that these
tests are mailed as a separate series due to the dependencies in both
x86 and arm64.

Applies cleanly to kvm/queue.

Parent commit: a3e0b8bd99ab ("KVM: MMU: change tracepoints arguments to kvm_page_fault")

v6: https://lore.kernel.org/r/20210804085819.846610-1-oupton@google.com

v6 -> v7:
 - Separated x86, arm64, and selftests into different series
 - Rebased on top of kvm/queue

Oliver Upton (6):
  KVM: x86: Fix potential race in KVM_GET_CLOCK
  KVM: x86: Create helper methods for KVM_{GET,SET}_CLOCK ioctls
  KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK
  KVM: x86: Take the pvclock sync lock behind the tsc_write_lock
  KVM: x86: Refactor tsc synchronization code
  KVM: x86: Expose TSC offset controls to userspace

 Documentation/virt/kvm/api.rst          |  42 ++-
 Documentation/virt/kvm/devices/vcpu.rst |  57 ++++
 Documentation/virt/kvm/locking.rst      |  11 +
 arch/x86/include/asm/kvm_host.h         |   4 +
 arch/x86/include/uapi/asm/kvm.h         |   4 +
 arch/x86/kvm/x86.c                      | 362 +++++++++++++++++-------
 include/uapi/linux/kvm.h                |   7 +-
 7 files changed, 378 insertions(+), 109 deletions(-)

-- 
2.33.0.rc1.237.g0d66db33f3-goog


Thread overview: 20+ messages
2021-08-16  0:11 Oliver Upton [this message]
2021-08-16  0:11 ` [PATCH v7 1/6] KVM: x86: Fix potential race in KVM_GET_CLOCK Oliver Upton
2021-08-19 18:24   ` Marcelo Tosatti
2021-08-20 18:22     ` Oliver Upton
2021-08-16  0:11 ` [PATCH v7 2/6] KVM: x86: Create helper methods for KVM_{GET,SET}_CLOCK ioctls Oliver Upton
2021-08-16  0:11 ` [PATCH v7 3/6] KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK Oliver Upton
2021-08-20 12:46   ` Marcelo Tosatti
2021-09-24  8:30     ` Paolo Bonzini
2021-08-16  0:11 ` [PATCH v7 4/6] KVM: x86: Take the pvclock sync lock behind the tsc_write_lock Oliver Upton
2021-09-02 19:22   ` Sean Christopherson
2021-08-16  0:11 ` [PATCH v7 5/6] KVM: x86: Refactor tsc synchronization code Oliver Upton
2021-09-02 19:21   ` Sean Christopherson
2021-09-02 19:41     ` Oliver Upton
2021-09-24  9:28     ` Paolo Bonzini
2021-08-16  0:11 ` [PATCH v7 6/6] KVM: x86: Expose TSC offset controls to userspace Oliver Upton
2021-08-23 20:56   ` Oliver Upton
2021-08-26 12:48     ` Marcelo Tosatti
2021-08-26 20:27       ` Oliver Upton
2021-09-02 19:23 ` [PATCH v7 0/6] KVM: x86: Add idempotent controls for migrating system counter state Sean Christopherson
2021-09-02 19:45   ` Oliver Upton