All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maxim Levitsky <mlevitsk@redhat.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>,
	kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Kechen Lu <kechenl@nvidia.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/5] KVM: x86: hyper-v: Conditionally allow SynIC with APICv/AVIC
Date: Wed, 26 May 2021 12:54:13 +0300	[thread overview]
Message-ID: <2409eb8593804eb879ae6fb961a709ca8c20f329.camel@redhat.com> (raw)
In-Reply-To: <20210518144339.1987982-1-vkuznets@redhat.com>

On Tue, 2021-05-18 at 16:43 +0200, Vitaly Kuznetsov wrote:
> Changes since v1 (Sean):
> - Use common 'enable_apicv' variable for both APICv and AVIC instead of 
>  adding a new hook to 'struct kvm_x86_ops'.
> - Drop unneded CONFIG_X86_LOCAL_APIC checks from VMX/SVM code along the
>  way.
> 
> Original description:
> 
> APICV_INHIBIT_REASON_HYPERV is currently unconditionally forced upon
> SynIC activation as SynIC's AutoEOI is incompatible with APICv/AVIC. It is,
> however, possible to track whether the feature was actually used by the
> guest and only inhibit APICv/AVIC when needed.
> 
> The feature can be tested with QEMU's 'hv-passthrough' debug mode.
> 
> Note, 'avic' kvm-amd module parameter is '0' by default and thus needs to
> be explicitly enabled.
> 
> Vitaly Kuznetsov (5):
>   KVM: SVM: Drop unneeded CONFIG_X86_LOCAL_APIC check for AVIC
>   KVM: VMX: Drop unneeded CONFIG_X86_LOCAL_APIC check from
>     cpu_has_vmx_posted_intr()
>   KVM: x86: Use common 'enable_apicv' variable for both APICv and AVIC
>   KVM: x86: Invert APICv/AVIC enablement check
>   KVM: x86: hyper-v: Deactivate APICv only when AutoEOI feature is in
>     use
> 
>  arch/x86/include/asm/kvm_host.h |  5 ++++-
>  arch/x86/kvm/hyperv.c           | 27 +++++++++++++++++++++------
>  arch/x86/kvm/svm/avic.c         | 16 +++++-----------
>  arch/x86/kvm/svm/svm.c          | 24 +++++++++++++-----------
>  arch/x86/kvm/svm/svm.h          |  2 --
>  arch/x86/kvm/vmx/capabilities.h |  4 +---
>  arch/x86/kvm/vmx/vmx.c          |  2 --
>  arch/x86/kvm/x86.c              |  9 ++++++---
>  8 files changed, 50 insertions(+), 39 deletions(-)
> 

I tested this patch set and this is what I found.

For reference,
First of all, indeed to make AVIC work I need to:
 
1. Disable SVM - I wonder if I can make this on demand
too when the guest actually uses a nested guest or at least
enables nesting in IA32_FEATURE_CONTROL.
I naturally run most of my VMs with nesting enabled,
thus I tend to not have avic enabled due to this.
I'll prepare a patch soon for this.
 
2. Disable x2apic, naturally x2apic can't be used with avic.
In theory we can also disable avic when the guest switches on
the x2apic mode, but in practice the guest will likely to pick the x2apic
when it can.
 
3. (for hyperv) Disable 'hv_vapic', because otherwise hyper-v
uses its own PV APIC msrs which AVIC doesn't support.

This HV enlightment turns on in the CPUID both the 
HV_APIC_ACCESS_AVAILABLE which isn't that bad 
(it only tells that we have the VP assist page),
and HV_APIC_ACCESS_RECOMMENDED which hints the guest
to use HyperV PV APIC MSRS and use PV EOI field in 
the APIC access page, which means that the guest 
won't use the real apic at all.

4. and of course enable SynIC autoeoi deprecation.

Otherwise indeed windows enables autoeoi.

hv-passthrough indeed can't be used to test this
as it both enables autoeoi depreciation and *hv-vapic*. 
I had to use the patch that you posted
in 'About the performance of hyper-v' thread.
 
In addition to that when I don't use the autoeoi depreciation patch,
then the guest indeed enables autoeoi, and this triggers a deadlock.
 
The reason is that kvm_request_apicv_update must not be called with
srcu lock held vcpu->kvm->srcu (there is a warning about that
in kvm_request_apicv_update), but guest msr writes which come
from vcpu thread do hold it.
 
The other place where we disable AVIC on demand is svm_toggle_avic_for_irq_window.
And that code has a hack to drop this lock and take 
it back around the call to kvm_request_apicv_update.
This hack is safe as this code is called only from the vcpu thread.
 
Also for reference the reason for the fact that we need to
disable AVIC on the interrupt window request, or more correctly
why we still need to request interrupt windows with AVIC,
is that the local apic can act sadly as a pass-through device 
for legacy PIC, when one of its LINTn pins is configured in ExtINT mode.
In this mode when such pin is raised, the local apic asks the PIC for
the interrupt vector and then delivers it to the APIC
without touching the IRR/ISR.

The later means that if guest's interrupts are disabled,
such interrupt can't be queued via IRR to VAPIC
but instead the regular interrupt window has to be requested, 
but on AMD, the only way to request interrupt window
is to queue a VIRQ, and intercept its delivery,
a feature that is disabled when AVIC is active.
 
Finally for SynIC this srcu lock drop hack can be extended to this gross hack:
It seems to work though:


diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index bedd9b6cc26a..925b76e7b45e 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -85,7 +85,7 @@ static bool synic_has_vector_auto_eoi(struct kvm_vcpu_hv_synic *synic,
 }
 
 static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
-				int vector)
+				int vector, bool host)
 {
 	struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
 	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
@@ -109,6 +109,9 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
 
 	auto_eoi_new = bitmap_weight(synic->auto_eoi_bitmap, 256);
 
+	if (!host)
+		srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
+
 	/* Hyper-V SynIC auto EOI SINTs are not compatible with APICV */
 	if (!auto_eoi_old && auto_eoi_new) {
 		printk("Synic: inhibiting avic %d %d\n", auto_eoi_old, auto_eoi_new);
@@ -121,6 +124,10 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
 			kvm_request_apicv_update(vcpu->kvm, true,
 						 APICV_INHIBIT_REASON_HYPERV);
 	}
+
+	if (!host)
+		vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+
 }
 
 static int synic_set_sint(struct kvm_vcpu_hv_synic *synic, int sint,
@@ -149,9 +156,9 @@ static int synic_set_sint(struct kvm_vcpu_hv_synic *synic, int sint,
 
 	atomic64_set(&synic->sint[sint], data);
 
-	synic_update_vector(synic, old_vector);
+	synic_update_vector(synic, old_vector, host);
 
-	synic_update_vector(synic, vector);
+	synic_update_vector(synic, vector, host);
 
 	/* Load SynIC vectors into EOI exit bitmap */
 	kvm_make_request(KVM_REQ_SCAN_IOAPIC, hv_synic_to_vcpu(synic));


Assuming that we don't want this gross hack,  
I wonder if we can avoid full blown memslot 
update when we disable avic, but rather have some 
smaller hack like only manually patching its
NPT mapping to have RW permissions instead 
of reserved bits which we use for MMIO. 

The AVIC spec says that NPT is only used to check that
guest has RW permission to the page, 
while the HVA in the NPT entry itself is ignored.

Best regards,
	Maxim Levitsky






  parent reply	other threads:[~2021-05-26  9:54 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-18 14:43 [PATCH v2 0/5] KVM: x86: hyper-v: Conditionally allow SynIC with APICv/AVIC Vitaly Kuznetsov
2021-05-18 14:43 ` [PATCH v2 1/5] KVM: SVM: Drop unneeded CONFIG_X86_LOCAL_APIC check for AVIC Vitaly Kuznetsov
2021-05-18 19:57   ` Sean Christopherson
2021-05-26  9:54   ` Maxim Levitsky
2021-05-18 14:43 ` [PATCH v2 2/5] KVM: VMX: Drop unneeded CONFIG_X86_LOCAL_APIC check from cpu_has_vmx_posted_intr() Vitaly Kuznetsov
2021-05-18 19:57   ` Sean Christopherson
2021-05-26  9:54   ` Maxim Levitsky
2021-05-18 14:43 ` [PATCH v2 3/5] KVM: x86: Use common 'enable_apicv' variable for both APICv and AVIC Vitaly Kuznetsov
2021-05-18 20:39   ` Sean Christopherson
2021-05-19  7:58     ` Vitaly Kuznetsov
2021-05-24 16:18     ` Paolo Bonzini
2021-05-24 16:52       ` Sean Christopherson
2021-05-24 17:02         ` Paolo Bonzini
2021-05-26  9:56   ` Maxim Levitsky
2021-05-26 15:07     ` Sean Christopherson
2021-05-26 15:52       ` Maxim Levitsky
2021-05-27 11:39         ` Paolo Bonzini
2021-05-18 14:43 ` [PATCH v2 4/5] KVM: x86: Invert APICv/AVIC enablement check Vitaly Kuznetsov
2021-05-18 21:05   ` Sean Christopherson
2021-05-26  9:57   ` Maxim Levitsky
2021-05-26 10:40     ` Vitaly Kuznetsov
2021-05-26 11:11       ` Maxim Levitsky
2021-05-18 14:43 ` [PATCH v2 5/5] KVM: x86: hyper-v: Deactivate APICv only when AutoEOI feature is in use Vitaly Kuznetsov
2021-05-24 16:21   ` Paolo Bonzini
2021-05-25  6:23     ` Vitaly Kuznetsov
2021-05-25  7:11       ` Paolo Bonzini
2021-05-26 10:02     ` Maxim Levitsky
2021-05-26 10:01   ` Maxim Levitsky
2021-05-26  9:54 ` Maxim Levitsky [this message]
2021-05-27  8:35   ` [PATCH v2 0/5] KVM: x86: hyper-v: Conditionally allow SynIC with APICv/AVIC Vitaly Kuznetsov
2021-05-27 15:49     ` Maxim Levitsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2409eb8593804eb879ae6fb961a709ca8c20f329.camel@redhat.com \
    --to=mlevitsk@redhat.com \
    --cc=jmattson@google.com \
    --cc=kechenl@nvidia.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.