From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Alejandro Jimenez <alejandro.j.jimenez@oracle.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Li RongQing <lirongqing@baidu.com>,
	Greg Edwards <gedwards@ddn.com>
Subject: [PATCH v5 10/33] KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled
Date: Fri,  6 Jan 2023 01:12:43 +0000
Message-ID: <20230106011306.85230-11-seanjc@google.com>
In-Reply-To: <20230106011306.85230-1-seanjc@google.com>

Free the APIC access page memslot if any vCPU enables x2APIC and SVM's
AVIC is enabled to prevent accesses to the virtual APIC on vCPUs with
x2APIC enabled.  On AMD, if its "hybrid" mode is enabled (AVIC is enabled
when x2APIC is enabled even without x2AVIC support), keeping the APIC
access page memslot results in the guest being able to access the virtual
APIC page as x2APIC is fully emulated by KVM.  I.e. hardware isn't aware
that the guest is operating in x2APIC mode.

Exempt nested SVM's update of APICv state from the new logic as x2APIC
can't be toggled on VM-Exit.  In practice, invoking the x2APIC logic
should be harmless precisely because it should be a glorified nop, but
play it safe to avoid latent bugs, e.g. with dropping the vCPU's SRCU
lock.

Intel doesn't suffer from the same issue as APICv has fully independent
VMCS controls for xAPIC vs. x2APIC virtualization.  Technically, KVM
should provide bus error semantics and not memory semantics for the APIC
page when x2APIC is enabled, but KVM already provides memory semantics in
other scenarios, e.g. if APICv/AVIC is enabled and the APIC is hardware
disabled (via APIC_BASE MSR).

Note, checking apic_access_memslot_enabled without taking locks relies
on it being set during vCPU creation (before kvm_vcpu_reset()).  vCPUs can
race to set the inhibit and delete the memslot, i.e. can get false
positives, but can't get false negatives as apic_access_memslot_enabled
can't be toggled "on" once any vCPU reaches KVM_RUN.

Opportunistically drop the "can" while updating avic_activate_vmcb()'s
comment, i.e. to state that KVM _does_ support the hybrid mode.  Move
the "Note:" down a line to conform to preferred kernel/KVM multi-line
comment style.

Opportunistically update the apicv_update_lock comment, as it isn't
actually used to protect apic_access_memslot_enabled (which is protected
by slots_lock).

Fixes: 0e311d33bfbe ("KVM: SVM: Introduce hybrid-AVIC mode")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
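The lockless check of apic_access_memslot_enabled described in the "Note"
paragraph of the changelog can be illustrated with a minimal userspace
sketch.  This is not KVM code: struct slot_state, slot_alloc(),
slot_inhibit() and demo_state are hypothetical names, a pthread mutex stands
in for slots_lock, and a C11 atomic stands in for the plain bool (the kernel
relies on the SRCU synchronization done by memslot updates instead, which
this sketch does not model).

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the two flags in struct kvm_arch. */
struct slot_state {
	pthread_mutex_t lock;	/* plays the role of kvm->slots_lock */
	atomic_bool enabled;	/* apic_access_memslot_enabled */
	bool inhibited;		/* apic_access_memslot_inhibited, lock-protected */
};

static struct slot_state demo_state = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
};

/* Allocation path: does nothing if the slot exists or was inhibited. */
static bool slot_alloc(struct slot_state *s)
{
	bool allocated = false;

	pthread_mutex_lock(&s->lock);
	if (!atomic_load(&s->enabled) && !s->inhibited) {
		atomic_store(&s->enabled, true);
		allocated = true;
	}
	pthread_mutex_unlock(&s->lock);
	return allocated;
}

/* Inhibit path: cheap lockless check, then recheck under the lock. */
static bool slot_inhibit(struct slot_state *s)
{
	bool deleted = false;

	/* A stale "true" (false positive) is fine, it is rechecked below. */
	if (!atomic_load_explicit(&s->enabled, memory_order_relaxed))
		return false;

	pthread_mutex_lock(&s->lock);
	if (atomic_load(&s->enabled)) {
		/* The real code deletes the memslot here, before the flag. */
		atomic_store(&s->enabled, false);
		/* Sticky: later slot_alloc() calls must not recreate it. */
		s->inhibited = true;
		deleted = true;
	}
	pthread_mutex_unlock(&s->lock);
	return deleted;
}
```

In this sketch only the first slot_inhibit() caller that finds the slot
enabled under the lock performs the deletion; racing callers either bail on
the lockless check or see "enabled" already cleared once they hold the lock.
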
 arch/x86/include/asm/kvm_host.h | 10 +++++----
 arch/x86/kvm/lapic.c            | 38 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/lapic.h            |  1 +
 arch/x86/kvm/svm/avic.c         | 12 +++++------
 arch/x86/kvm/svm/nested.c       |  2 +-
 arch/x86/kvm/svm/svm.c          |  2 ++
 arch/x86/kvm/x86.c              | 27 +++++++++++++++++++++--
 7 files changed, 78 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c70690b2c82d..1d92c148e799 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1249,10 +1249,11 @@ struct kvm_arch {
 	struct kvm_apic_map __rcu *apic_map;
 	atomic_t apic_map_dirty;
 
-	/* Protects apic_access_memslot_enabled and apicv_inhibit_reasons */
-	struct rw_semaphore apicv_update_lock;
-
 	bool apic_access_memslot_enabled;
+	bool apic_access_memslot_inhibited;
+
+	/* Protects apicv_inhibit_reasons */
+	struct rw_semaphore apicv_update_lock;
 	unsigned long apicv_inhibit_reasons;
 
 	gpa_t wall_clock;
@@ -1599,6 +1600,7 @@ struct kvm_x86_ops {
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
 	bool (*check_apicv_inhibit_reasons)(enum kvm_apicv_inhibit reason);
+	bool allow_apicv_in_x2apic_without_x2apic_virtualization;
 	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(int isr);
@@ -1973,7 +1975,7 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 
 bool kvm_apicv_activated(struct kvm *kvm);
 bool kvm_vcpu_apicv_activated(struct kvm_vcpu *vcpu);
-void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
+void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
 void __kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
 				      enum kvm_apicv_inhibit reason, bool set);
 void kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index e73386c26d2c..355ea688df4a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2442,7 +2442,8 @@ int kvm_alloc_apic_access_page(struct kvm *kvm)
 	int ret = 0;
 
 	mutex_lock(&kvm->slots_lock);
-	if (kvm->arch.apic_access_memslot_enabled)
+	if (kvm->arch.apic_access_memslot_enabled ||
+	    kvm->arch.apic_access_memslot_inhibited)
 		goto out;
 
 	hva = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
@@ -2470,6 +2471,41 @@ int kvm_alloc_apic_access_page(struct kvm *kvm)
 }
 EXPORT_SYMBOL_GPL(kvm_alloc_apic_access_page);
 
+void kvm_inhibit_apic_access_page(struct kvm_vcpu *vcpu)
+{
+	struct kvm *kvm = vcpu->kvm;
+
+	if (!kvm->arch.apic_access_memslot_enabled)
+		return;
+
+	kvm_vcpu_srcu_read_unlock(vcpu);
+
+	mutex_lock(&kvm->slots_lock);
+
+	if (kvm->arch.apic_access_memslot_enabled) {
+		__x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, 0, 0);
+		/*
+		 * Clear "enabled" after the memslot is deleted so that a
+		 * different vCPU doesn't get a false negative when checking
+		 * the flag out of slots_lock.  No additional memory barrier is
+		 * needed as modifying memslots requires waiting for other vCPUs
+		 * to drop SRCU (see above), and false positives are ok as the
+		 * flag is rechecked after acquiring slots_lock.
+		 */
+		kvm->arch.apic_access_memslot_enabled = false;
+
+		/*
+		 * Mark the memslot as inhibited to prevent reallocating the
+		 * memslot during vCPU creation, e.g. if a vCPU is hotplugged.
+		 */
+		kvm->arch.apic_access_memslot_inhibited = true;
+	}
+
+	mutex_unlock(&kvm->slots_lock);
+
+	kvm_vcpu_srcu_read_lock(vcpu);
+}
+
 void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 8c6442751dab..df316ede7546 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -113,6 +113,7 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
 int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
 void kvm_apic_update_apicv(struct kvm_vcpu *vcpu);
 int kvm_alloc_apic_access_page(struct kvm *kvm);
+void kvm_inhibit_apic_access_page(struct kvm_vcpu *vcpu);
 
 bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
 		struct kvm_lapic_irq *irq, int *r, struct dest_map *dest_map);
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index ec28ba4c5f1b..0a75993afed6 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -72,12 +72,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
 
 	vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
 
-	/* Note:
-	 * KVM can support hybrid-AVIC mode, where KVM emulates x2APIC
-	 * MSR accesses, while interrupt injection to a running vCPU
-	 * can be achieved using AVIC doorbell. The AVIC hardware still
-	 * accelerate MMIO accesses, but this does not cause any harm
-	 * as the guest is not supposed to access xAPIC mmio when uses x2APIC.
+	/*
+	 * Note: KVM supports hybrid-AVIC mode, where KVM emulates x2APIC MSR
+	 * accesses, while interrupt injection to a running vCPU can be
+	 * achieved using AVIC doorbell.  KVM disables the APIC access page
+	 * (deletes the memslot) if any vCPU has x2APIC enabled, thus enabling
+	 * AVIC in hybrid mode activates only the doorbell mechanism.
 	 */
 	if (apic_x2apic_mode(svm->vcpu.arch.apic) &&
 	    avic_mode == AVIC_MODE_X2) {
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index bc9cd7086fa9..34ac03969f28 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1106,7 +1106,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	 * to benefit from it right away.
 	 */
 	if (kvm_apicv_activated(vcpu->kvm))
-		kvm_vcpu_update_apicv(vcpu);
+		__kvm_vcpu_update_apicv(vcpu);
 
 	return 0;
 }
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 26044e1d2422..7651d665723e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5028,6 +5028,8 @@ static __init int svm_hardware_setup(void)
 		svm_x86_ops.vcpu_blocking = NULL;
 		svm_x86_ops.vcpu_unblocking = NULL;
 		svm_x86_ops.vcpu_get_apicv_inhibit_reasons = NULL;
+	} else if (avic_mode == AVIC_MODE_X1) {
+		svm_x86_ops.allow_apicv_in_x2apic_without_x2apic_virtualization = true;
 	}
 
 	if (vls) {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 39b8dd37bc40..1abe3f1e821c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10045,7 +10045,7 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
 	kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
 }
 
-void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
+void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	bool activate;
@@ -10080,7 +10080,30 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
 	preempt_enable();
 	up_read(&vcpu->kvm->arch.apicv_update_lock);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_update_apicv);
+EXPORT_SYMBOL_GPL(__kvm_vcpu_update_apicv);
+
+static void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
+{
+	if (!lapic_in_kernel(vcpu))
+		return;
+
+	/*
+	 * Due to sharing page tables across vCPUs, the xAPIC memslot must be
+	 * deleted if any vCPU has xAPIC virtualization and x2APIC enabled, but
+	 * hardware doesn't support x2APIC virtualization.  E.g. some AMD CPUs
+	 * support AVIC but not x2APIC.  KVM still allows enabling AVIC in this
+	 * case so that KVM can use the AVIC doorbell to inject interrupts to
+	 * running vCPUs, but KVM must not create SPTEs for the APIC base as
+	 * the vCPU would incorrectly be able to access the vAPIC page via MMIO
+	 * despite being in x2APIC mode.  For simplicity, inhibiting the APIC
+	 * access page is sticky.
+	 */
+	if (apic_x2apic_mode(vcpu->arch.apic) &&
+	    kvm_x86_ops.allow_apicv_in_x2apic_without_x2apic_virtualization)
+		kvm_inhibit_apic_access_page(vcpu);
+
+	__kvm_vcpu_update_apicv(vcpu);
+}
 
 void __kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
 				      enum kvm_apicv_inhibit reason, bool set)
-- 
2.39.0.314.g84b9a713c41-goog

