From: Maxim Levitsky <mlevitsk@redhat.com>
To: kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Oliver Upton <oupton@google.com>, Ingo Molnar <mingo@redhat.com>,
Sean Christopherson <sean.j.christopherson@intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org (open list),
Marcelo Tosatti <mtosatti@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Wanpeng Li <wanpengli@tencent.com>,
Borislav Petkov <bp@alien8.de>, Jim Mattson <jmattson@google.com>,
"H. Peter Anvin" <hpa@zytor.com>,
linux-doc@vger.kernel.org (open list:DOCUMENTATION),
Joerg Roedel <joro@8bytes.org>,
x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)),
Vitaly Kuznetsov <vkuznets@redhat.com>,
Maxim Levitsky <mlevitsk@redhat.com>
Subject: [PATCH 1/2] KVM: x86: implement KVM_SET_TSC_PRECISE/KVM_GET_TSC_PRECISE
Date: Mon, 30 Nov 2020 15:35:58 +0200
Message-ID: <20201130133559.233242-2-mlevitsk@redhat.com>
In-Reply-To: <20201130133559.233242-1-mlevitsk@redhat.com>

These two new ioctls allow userspace to capture and restore the guest's
TSC state more precisely.

Both ioctls are meant to be used to migrate the guest TSC accurately,
even when there is significant downtime during the migration.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
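
Illustrative note, not part of the patch: a minimal sketch of how a VMM
might call the two ioctls around a migration. It assumes the uapi
definitions proposed below and copies them locally, since they are not
in any released kernel header. tsc_state_save() and tsc_state_restore()
are hypothetical helper names, and vcpu_fd is assumed to be a vCPU file
descriptor obtained from KVM_CREATE_VCPU.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local copies of the uapi additions proposed in this patch (assumption:
 * the installed kernel headers do not provide them). */
#ifndef KVMIO
#define KVMIO 0xAE
#endif

#define KVM_TSC_STATE_TSC_ADJUST_VALID 1

struct kvm_tsc_state {
	uint32_t flags;
	uint64_t nsec;		/* host CLOCK_REALTIME, ns since the epoch */
	uint64_t tsc;		/* guest IA32_TSC */
	uint64_t tsc_adjust;	/* guest IA32_TSC_ADJUST, if flagged valid */
};

#define KVM_SET_TSC_STATE _IOW(KVMIO, 0xc8, struct kvm_tsc_state)
#define KVM_GET_TSC_STATE _IOR(KVMIO, 0xc9, struct kvm_tsc_state)

/* Hypothetical helper: capture the TSC state on the migration source.
 * Returns 0 on success, -1 with errno set on failure. */
static int tsc_state_save(int vcpu_fd, struct kvm_tsc_state *state)
{
	memset(state, 0, sizeof(*state));
	return ioctl(vcpu_fd, KVM_GET_TSC_STATE, state);
}

/* Hypothetical helper: restore the TSC state on the destination.
 * If exact is non-zero the timestamp is zeroed, which asks KVM to
 * restore the saved TSC value without compensating for downtime. */
static int tsc_state_restore(int vcpu_fd, const struct kvm_tsc_state *saved,
			     int exact)
{
	struct kvm_tsc_state state = *saved;

	if (exact)
		state.nsec = 0;
	return ioctl(vcpu_fd, KVM_SET_TSC_STATE, &state);
}
```

A caller would presumably check KVM_CAP_PRECISE_TSC with
KVM_CHECK_EXTENSION first and fall back to the existing MSR-based
save/restore when the capability is absent.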
Documentation/virt/kvm/api.rst | 56 +++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 69 ++++++++++++++++++++++++++++++++++
include/uapi/linux/kvm.h | 14 +++++++
3 files changed, 139 insertions(+)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 70254eaa5229f..2f04aa8ecf119 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4826,6 +4826,62 @@ If a vCPU is in running state while this ioctl is invoked, the vCPU may
experience inconsistent filtering behavior on MSR accesses.
+4.127 KVM_GET_TSC_STATE
+----------------------------
+
+:Capability: KVM_CAP_PRECISE_TSC
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_tsc_state
+:Returns: 0 on success, < 0 on error
+
+::
+
+  #define KVM_TSC_STATE_TSC_ADJUST_VALID 1
+  struct kvm_tsc_state {
+	__u32 flags;
+	__u64 nsec;
+	__u64 tsc;
+	__u64 tsc_adjust;
+  };
+
+flags values for ``struct kvm_tsc_state``:
+
+``KVM_TSC_STATE_TSC_ADJUST_VALID``
+
+  ``tsc_adjust`` contains a valid IA32_TSC_ADJUST value
+
+This ioctl allows user space to read the guest's IA32_TSC and IA32_TSC_ADJUST,
+together with the current value of the host's CLOCK_REALTIME clock in
+nanoseconds since the Unix epoch.
+
+
+4.128 KVM_SET_TSC_STATE
+----------------------------
+
+:Capability: KVM_CAP_PRECISE_TSC
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_tsc_state
+:Returns: 0 on success, < 0 on error
+
+The parameter is the same struct kvm_tsc_state described under KVM_GET_TSC_STATE.
+
+This ioctl reconstructs the guest's IA32_TSC and IA32_TSC_ADJUST values
+from state previously obtained by KVM_GET_TSC_STATE on the same vCPU.
+
+KVM advances the saved guest TSC value by the time that has passed between the
+CLOCK_REALTIME timestamp saved in the struct and the current value of
+CLOCK_REALTIME, and sets the guest's TSC to the result.
+
+IA32_TSC_ADJUST is restored as is if KVM_TSC_STATE_TSC_ADJUST_VALID is set.
+
+It is assumed that either both ioctls will be run on the same machine,
+or that the source and destination machines have synchronized clocks.
+
+As a special case, the timestamp in the struct may be left at zero,
+in which case it is ignored and the TSC is restored exactly.
+
5. The kvm_run structure
========================
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a3fdc16cfd6f3..4f0ae9cb14b8a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2438,6 +2438,21 @@ static bool kvm_get_walltime_and_clockread(struct timespec64 *ts,
return gtod_is_based_on_tsc(do_realtime(ts, tsc_timestamp));
}
+
+
+static void kvm_get_walltime(u64 *walltime_ns, u64 *host_tsc)
+{
+	struct timespec64 ts;
+
+	if (kvm_get_walltime_and_clockread(&ts, host_tsc)) {
+		*walltime_ns = timespec64_to_ns(&ts);
+		return;
+	}
+
+	*host_tsc = rdtsc();
+	*walltime_ns = ktime_get_real_ns();
+}
+
#endif
/*
@@ -3757,6 +3772,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_X86_USER_SPACE_MSR:
case KVM_CAP_X86_MSR_FILTER:
case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
+#ifdef CONFIG_X86_64
+	case KVM_CAP_PRECISE_TSC:
+#endif
r = 1;
break;
case KVM_CAP_SYNC_REGS:
@@ -4999,6 +5017,57 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
case KVM_GET_SUPPORTED_HV_CPUID:
r = kvm_ioctl_get_supported_hv_cpuid(vcpu, argp);
break;
+#ifdef CONFIG_X86_64
+	case KVM_GET_TSC_STATE: {
+		struct kvm_tsc_state __user *user_tsc_state = argp;
+		struct kvm_tsc_state tsc_state;
+		u64 host_tsc;
+
+		memset(&tsc_state, 0, sizeof(tsc_state));
+
+		kvm_get_walltime(&tsc_state.nsec, &host_tsc);
+		tsc_state.tsc = kvm_read_l1_tsc(vcpu, host_tsc);
+
+		if (guest_cpuid_has(vcpu, X86_FEATURE_TSC_ADJUST)) {
+			tsc_state.tsc_adjust = vcpu->arch.ia32_tsc_adjust_msr;
+			tsc_state.flags |= KVM_TSC_STATE_TSC_ADJUST_VALID;
+		}
+
+		r = -EFAULT;
+		if (copy_to_user(user_tsc_state, &tsc_state, sizeof(tsc_state)))
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_SET_TSC_STATE: {
+		struct kvm_tsc_state __user *user_tsc_state = argp;
+		struct kvm_tsc_state tsc_state;
+
+		u64 host_tsc, wall_nsec;
+		s64 diff;
+		u64 new_guest_tsc, new_guest_tsc_offset;
+
+		r = -EFAULT;
+		if (copy_from_user(&tsc_state, user_tsc_state, sizeof(tsc_state)))
+			goto out;
+
+		kvm_get_walltime(&wall_nsec, &host_tsc);
+		diff = wall_nsec - tsc_state.nsec;
+
+		if (diff < 0 || tsc_state.nsec == 0)
+			diff = 0;
+
+		new_guest_tsc = tsc_state.tsc + nsec_to_cycles(vcpu, diff);
+		new_guest_tsc_offset = new_guest_tsc - kvm_scale_tsc(vcpu, host_tsc);
+		kvm_vcpu_write_tsc_offset(vcpu, new_guest_tsc_offset);
+
+		if (tsc_state.flags & KVM_TSC_STATE_TSC_ADJUST_VALID)
+			if (guest_cpuid_has(vcpu, X86_FEATURE_TSC_ADJUST))
+				vcpu->arch.ia32_tsc_adjust_msr = tsc_state.tsc_adjust;
+		r = 0;
+		break;
+	}
+#endif
default:
r = -EINVAL;
}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 886802b8ffba3..ee1bd5e7da964 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1056,6 +1056,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 190
#define KVM_CAP_SYS_HYPERV_CPUID 191
#define KVM_CAP_DIRTY_LOG_RING 192
+#define KVM_CAP_PRECISE_TSC 193
#ifdef KVM_CAP_IRQ_ROUTING
@@ -1169,6 +1170,15 @@ struct kvm_clock_data {
__u32 pad[9];
};
+
+#define KVM_TSC_STATE_TSC_ADJUST_VALID 1
+struct kvm_tsc_state {
+	__u32 flags;
+	__u64 nsec;
+	__u64 tsc;
+	__u64 tsc_adjust;
+};
+
/* For KVM_CAP_SW_TLB */
#define KVM_MMU_FSL_BOOKE_NOHV 0
@@ -1563,6 +1573,10 @@ struct kvm_pv_cmd {
/* Available with KVM_CAP_DIRTY_LOG_RING */
#define KVM_RESET_DIRTY_RINGS _IO(KVMIO, 0xc7)
+/* Available with KVM_CAP_PRECISE_TSC */
+#define KVM_SET_TSC_STATE _IOW(KVMIO, 0xc8, struct kvm_tsc_state)
+#define KVM_GET_TSC_STATE _IOR(KVMIO, 0xc9, struct kvm_tsc_state)
+
/* Secure Encrypted Virtualization command */
enum sev_cmd_id {
/* Guest initialization commands */
--
2.26.2
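
Illustrative note, not part of the patch: the arithmetic performed by
KVM_SET_TSC_STATE above can be sketched in plain userspace C as follows.
All sketch_* names are hypothetical. The conversion assumes a fixed guest
TSC frequency in kHz and plain integer math, whereas KVM's
nsec_to_cycles() uses the vCPU's precomputed scaling values; the offset
helper assumes the host TSC value has already been scaled to the guest
frequency.

```c
#include <stdint.h>

/* Convert nanoseconds to TSC cycles at tsc_khz.
 * cycles = nsec * tsc_khz / 1e6, split into quotient and remainder
 * so that realistic downtimes do not overflow 64 bits. */
static uint64_t sketch_nsec_to_cycles(uint64_t nsec, uint64_t tsc_khz)
{
	uint64_t q = nsec / 1000000u;
	uint64_t r = nsec % 1000000u;

	return q * tsc_khz + (r * tsc_khz) / 1000000u;
}

/* Guest TSC value to install on restore. As in the patch, a zero
 * timestamp or a timestamp in the future means "no compensation". */
static uint64_t sketch_restored_guest_tsc(uint64_t saved_tsc,
					  uint64_t saved_nsec,
					  uint64_t now_nsec,
					  uint64_t tsc_khz)
{
	int64_t diff = (int64_t)(now_nsec - saved_nsec);

	if (diff < 0 || saved_nsec == 0)
		diff = 0;

	return saved_tsc + sketch_nsec_to_cycles((uint64_t)diff, tsc_khz);
}

/* TSC offset such that scaled_host_tsc + offset == new_guest_tsc.
 * The subtraction wraps modulo 2^64, which is the intended behaviour. */
static uint64_t sketch_tsc_offset(uint64_t new_guest_tsc,
				  uint64_t scaled_host_tsc)
{
	return new_guest_tsc - scaled_host_tsc;
}
```

For example, with a 2 GHz guest TSC (tsc_khz = 2000000) and two seconds
of downtime, the restored TSC is the saved value plus 4000000000 cycles.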