kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>,
	Marcelo Tosatti <mtosatti@redhat.com>
Subject: [PATCH 4/4] KVM: x86: hyper-v: Don't touch TSC page values when guest opted for re-enlightenment
Date: Mon, 15 Mar 2021 15:37:06 +0100	[thread overview]
Message-ID: <20210315143706.859293-5-vkuznets@redhat.com> (raw)
In-Reply-To: <20210315143706.859293-1-vkuznets@redhat.com>

When guest opts for re-enlightenment notifications upon migration, it is
in its right to assume that TSC page values never change (as they're only
supposed to change upon migration and the host has to keep things as they
are before it receives confirmation from the guest). This is mostly true
until the guest is migrated somewhere. KVM userspace (e.g. QEMU) will
trigger masterclock update by writing to HV_X64_MSR_REFERENCE_TSC, by
calling KVM_SET_CLOCK,... and as TSC value and kvmclock reading drift
apart (even slightly), the update causes TSC page values to change.

The issue at hand is that when Hyper-V is migrated, it uses stale (cached)
TSC page values to compute the difference between its own clocksource
(provided by KVM) and its guests' TSC pages to program synthetic timers
and in some cases, when TSC page is updated, this puts all stimer
expirations in the past. This, in its turn, causes an interrupt storm
and L2 guests not making much forward progress.

Note, KVM doesn't fully implement re-enlightenment notification. Basically,
the support for reenlightenment MSRs is just a stub and userspace is only
expected to expose the feature when TSC scaling on the expected destination
hosts is available. With TSC scaling, no real re-enlightenment is needed
as TSC frequency doesn't change. With TSC scaling becoming ubiquitous, it
likely makes little sense to fully implement re-enlightenment in KVM.

Prevent TSC page from being updated after migration. In case it's not the
guest who's initiating the change and when TSC page is already enabled,
just keep it as it is: TSC value is supposed to be preserved across
migration and TSC frequency can't change with re-enlightenment enabled.
The guest is doomed anyway if any of this is not true.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 2a8d078b16cb..5889de580e37 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1103,6 +1103,24 @@ void kvm_hv_setup_tsc_page(struct kvm *kvm,
 				    &tsc_seq, sizeof(tsc_seq))))
 		goto out_err;
 
+	/*
+	 * Don't touch TSC page values if the guest has opted for TSC emulation
+	 * after migration. KVM doesn't fully support reenlightenment
+	 * notifications and TSC access emulation and Hyper-V is known to expect
+	 * the values in TSC page to stay constant before TSC access emulation
+	 * is disabled from guest side (HV_X64_MSR_TSC_EMULATION_STATUS).
+	 * Userspace is expected to preserve TSC frequency and guest visible TSC
+	 * value across migration (and prevent it when TSC scaling is
+	 * unsupported).
+	 */
+	if ((hv->hv_tsc_page_status != HV_TSC_PAGE_GUEST_CHANGED) &&
+	    hv->hv_tsc_emulation_control && tsc_seq) {
+		if (kvm_read_guest(kvm, gfn_to_gpa(gfn), &hv->tsc_ref, sizeof(hv->tsc_ref)))
+			goto out_err;
+
+		goto out_unlock;
+	}
+
 	/*
 	 * While we're computing and writing the parameters, force the
 	 * guest to use the time reference count MSR.
-- 
2.30.2


      parent reply	other threads:[~2021-03-15 14:39 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-15 14:37 [PATCH 0/4] KVM: x86: hyper-v: TSC page fixes Vitaly Kuznetsov
2021-03-15 14:37 ` [PATCH 1/4] KVM: x86: hyper-v: Limit guest to writing zero to HV_X64_MSR_TSC_EMULATION_STATUS Vitaly Kuznetsov
2021-03-15 14:37 ` [PATCH 2/4] KVM: x86: hyper-v: Prevent using not-yet-updated TSC page by secondary CPUs Vitaly Kuznetsov
2021-03-15 15:45   ` Paolo Bonzini
2021-03-15 15:55     ` Vitaly Kuznetsov
2021-03-15 16:23       ` Paolo Bonzini
2021-03-16 12:29         ` Vitaly Kuznetsov
2021-03-15 14:37 ` [PATCH 3/4] KVM: x86: hyper-v: Track Hyper-V TSC page status Vitaly Kuznetsov
2021-03-15 15:15   ` Sean Christopherson
2021-03-15 15:34     ` Vitaly Kuznetsov
2021-03-16 12:24       ` Vitaly Kuznetsov
2021-03-16 15:20         ` Sean Christopherson
2021-03-15 14:37 ` Vitaly Kuznetsov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210315143706.859293-5-vkuznets@redhat.com \
    --to=vkuznets@redhat.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).