All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups
@ 2022-08-31  0:34 Sean Christopherson
  2022-08-31  0:34 ` [PATCH 01/19] KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target Sean Christopherson
                   ` (18 more replies)
  0 siblings, 19 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

This started as a simple cleanup, and then I made the mistake of writing
a test to verify my changes to AVIC's handling of logical mode interrupts.

TL;DR: KVM's AVIC and optimized APIC map code doesn't correctly handle
various edge cases that are 100% architecturally legal, but are unlikely
to occur in most real world scenarios.

There are a variety of other fixes, but most of them are non-fatal.

I have tested this heavily with KUT, but I haven't booted Windows and
don't have access to x2AVIC, so additional testing would be much
appreciated.

I'll post my KVM-Unit-Tests later this week (need to write quite a few
changelogs).  The gist of the tests is to target multiple and non-existent
vCPUs in logical mode, and to target multiple vCPUs in physical mode by
aliasing vCPU0 and vCPU1 to the same physical ID.

Sean Christopherson (19):
  KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid
    target
  KVM: SVM: Don't put/load AVIC when setting virtual APIC mode
  Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean
  KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick
  KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not
    ICHR[31:16]
  KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check
  KVM: SVM: Remove redundant cluster calculation that also creates a
    shadow
  KVM: SVM: Drop duplicate calcuation of AVIC/x2AVIC "logical index"
  KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch
  KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU
  KVM: x86: Disable APIC logical map if logical ID covers multiple MDAs
  KVM: x86: Disable APIC logical map if vCPUs are aliased in logical
    mode
  KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs
  KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0
  KVM: x86: Explicitly track all possibilities for APIC map's logical
    modes
  KVM: SVM: Handle multiple logical targets in AVIC kick fastpath
  KVM: SVM: Ignore writes to Remote Read Data on AVIC write traps
  Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on
    a running vcpu"

 Documentation/virt/kvm/x86/errata.rst |  11 ++
 arch/x86/include/asm/kvm_host.h       |  27 ++-
 arch/x86/kvm/lapic.c                  | 100 ++++++++--
 arch/x86/kvm/svm/avic.c               | 273 ++++++++++++++------------
 arch/x86/kvm/svm/svm.c                |   2 +-
 arch/x86/kvm/svm/svm.h                |   9 +-
 6 files changed, 260 insertions(+), 162 deletions(-)


base-commit: 372d07084593dc7a399bf9bee815711b1fb1bcf2
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 01/19] KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31  9:18   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 02/19] KVM: SVM: Don't put/load AVIC when setting virtual APIC mode Sean Christopherson
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Emulate ICR writes on AVIC IPI failures due to invalid targets using the
same logic as failures due to invalid types.  AVIC acceleration fails if
_any_ of the targets are invalid, and crucially VM-Exits before sending
IPIs to targets that _are_ valid.  In logical mode, the destination is a
bitmap, i.e. a single IPI can target multiple logical IDs.  Doing nothing
causes KVM to drop IPIs if at least one target is valid and at least one
target is invalid.

Fixes: 18f40c53e10f ("svm: Add VMEXIT handlers for AVIC")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 6919dee69f18..b1ade555e8d0 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -496,14 +496,18 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
 	trace_kvm_avic_incomplete_ipi(vcpu->vcpu_id, icrh, icrl, id, index);
 
 	switch (id) {
+	case AVIC_IPI_FAILURE_INVALID_TARGET:
 	case AVIC_IPI_FAILURE_INVALID_INT_TYPE:
 		/*
 		 * Emulate IPIs that are not handled by AVIC hardware, which
-		 * only virtualizes Fixed, Edge-Triggered INTRs.  The exit is
-		 * a trap, e.g. ICR holds the correct value and RIP has been
-		 * advanced, KVM is responsible only for emulating the IPI.
-		 * Sadly, hardware may sometimes leave the BUSY flag set, in
-		 * which case KVM needs to emulate the ICR write as well in
+		 * only virtualizes Fixed, Edge-Triggered INTRs, and falls over
+		 * if _any_ targets are invalid, e.g. if the logical mode mask
+		 * is a superset of running vCPUs.
+		 *
+		 * The exit is a trap, e.g. ICR holds the correct value and RIP
+		 * has been advanced, KVM is responsible only for emulating the
+		 * IPI.  Sadly, hardware may sometimes leave the BUSY flag set,
+		 * in which case KVM needs to emulate the ICR write as well in
 		 * order to clear the BUSY flag.
 		 */
 		if (icrl & APIC_ICR_BUSY)
@@ -519,8 +523,6 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
 		 */
 		avic_kick_target_vcpus(vcpu->kvm, apic, icrl, icrh, index);
 		break;
-	case AVIC_IPI_FAILURE_INVALID_TARGET:
-		break;
 	case AVIC_IPI_FAILURE_INVALID_BACKING_PAGE:
 		WARN_ONCE(1, "Invalid backing page\n");
 		break;
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 02/19] KVM: SVM: Don't put/load AVIC when setting virtual APIC mode
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
  2022-08-31  0:34 ` [PATCH 01/19] KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31  9:24   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode" Sean Christopherson
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Move the VMCB updates from avic_refresh_apicv_exec_ctrl() into
avic_set_virtual_apic_mode() and invert the dependency being said
functions to avoid calling avic_vcpu_{load,put}() and
avic_set_pi_irte_mode() when "only" setting the virtual APIC mode.

avic_set_virtual_apic_mode() is invoked from common x86 with preemption
enabled, which makes avic_vcpu_{load,put}() unhappy.  Luckily, calling
those and updating IRTE stuff is unnecessary as the only reason
avic_set_virtual_apic_mode() is called is to handle transitions between
xAPIC and x2APIC that don't also toggle APICv activation.  And if
activation doesn't change, there's no need to fiddle with the physical
APIC ID table or update IRTE.

The "full" refresh is guaranteed to be called if activation changes in
this case as the only call to the "set" path is:

	kvm_vcpu_update_apicv(vcpu);
	static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);

and kvm_vcpu_update_apicv() invokes the refresh if activation changes:

	if (apic->apicv_active == activate)
		goto out;

	apic->apicv_active = activate;
	kvm_apic_update_apicv(vcpu);
	static_call(kvm_x86_refresh_apicv_exec_ctrl)(vcpu);

  WARNING: CPU: 183 PID: 49186 at arch/x86/kvm/svm/avic.c:1081 avic_vcpu_put+0xde/0xf0 [kvm_amd]
  CPU: 183 PID: 49186 Comm: stable Tainted: G           O       6.0.0-smp--fcddbca45f0a-sink #34
  Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022
  RIP: 0010:avic_vcpu_put+0xde/0xf0 [kvm_amd]
   avic_refresh_apicv_exec_ctrl+0x142/0x1c0 [kvm_amd]
   avic_set_virtual_apic_mode+0x5a/0x70 [kvm_amd]
   kvm_lapic_set_base+0x149/0x1a0 [kvm]
   kvm_set_apic_base+0x8f/0xd0 [kvm]
   kvm_set_msr_common+0xa3a/0xdc0 [kvm]
   svm_set_msr+0x364/0x6b0 [kvm_amd]
   __kvm_set_msr+0xb8/0x1c0 [kvm]
   kvm_emulate_wrmsr+0x58/0x1d0 [kvm]
   msr_interception+0x1c/0x30 [kvm_amd]
   svm_invoke_exit_handler+0x31/0x100 [kvm_amd]
   svm_handle_exit+0xfc/0x160 [kvm_amd]
   vcpu_enter_guest+0x21bb/0x23e0 [kvm]
   vcpu_run+0x92/0x450 [kvm]
   kvm_arch_vcpu_ioctl_run+0x43e/0x6e0 [kvm]
   kvm_vcpu_ioctl+0x559/0x620 [kvm]

Fixes: 05c4fe8c1bd9 ("KVM: SVM: Refresh AVIC configuration when changing APIC mode")
Cc: stable@vger.kernel.org
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index b1ade555e8d0..f3a74c8284cb 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -741,18 +741,6 @@ void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 	avic_handle_ldr_update(vcpu);
 }
 
-void avic_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
-{
-	if (!lapic_in_kernel(vcpu) || avic_mode == AVIC_MODE_NONE)
-		return;
-
-	if (kvm_get_apic_mode(vcpu) == LAPIC_MODE_INVALID) {
-		WARN_ONCE(true, "Invalid local APIC state (vcpu_id=%d)", vcpu->vcpu_id);
-		return;
-	}
-	avic_refresh_apicv_exec_ctrl(vcpu);
-}
-
 static int avic_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
 {
 	int ret = 0;
@@ -1094,17 +1082,18 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
 	WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
 }
 
-
-void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
+void avic_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct vmcb *vmcb = svm->vmcb01.ptr;
-	bool activated = kvm_vcpu_apicv_active(vcpu);
+
+	if (!lapic_in_kernel(vcpu) || avic_mode == AVIC_MODE_NONE)
+		return;
 
 	if (!enable_apicv)
 		return;
 
-	if (activated) {
+	if (kvm_vcpu_apicv_active(vcpu)) {
 		/**
 		 * During AVIC temporary deactivation, guest could update
 		 * APIC ID, DFR and LDR registers, which would not be trapped
@@ -1118,6 +1107,16 @@ void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
 		avic_deactivate_vmcb(svm);
 	}
 	vmcb_mark_dirty(vmcb, VMCB_AVIC);
+}
+
+void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
+{
+	bool activated = kvm_vcpu_apicv_active(vcpu);
+
+	if (!enable_apicv)
+		return;
+
+	avic_set_virtual_apic_mode(vcpu);
 
 	if (activated)
 		avic_vcpu_load(vcpu, vcpu->cpu);
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
  2022-08-31  0:34 ` [PATCH 01/19] KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target Sean Christopherson
  2022-08-31  0:34 ` [PATCH 02/19] KVM: SVM: Don't put/load AVIC when setting virtual APIC mode Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31  5:59   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 04/19] KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean Sean Christopherson
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Remove SVM's so called "hybrid-AVIC mode" and reinstate the restriction
where AVIC is disabled if x2APIC is enabled.  The argument that the
"guest is not supposed to access xAPIC mmio when uses x2APIC" is flat out
wrong.  Activating x2APIC completely disables the xAPIC MMIO region,
there is nothing that says the guest must not access that address.

Concretely, KVM-Unit-Test's existing "apic" test fails the subtests that
expect accesses to the APIC base region to not be emulated when x2APIC is
enabled.

Furthermore, allowing the guest to trigger MMIO emulation in a mode where
KVM doesn't expect such emulation to occur is all kinds of dangerous.

Tweak the restriction so that it only inhibits AVIC if x2APIC is actually
enabled instead of inhibiting AVIC is x2APIC is exposed to the guest.

This reverts commit 0e311d33bfbef86da130674e8528cc23e6acfe16.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  6 ++++++
 arch/x86/kvm/svm/avic.c         | 21 ++++++++++-----------
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c96c43c313a..1f51411f3112 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1128,6 +1128,12 @@ enum kvm_apicv_inhibit {
 	 */
 	APICV_INHIBIT_REASON_PIT_REINJ,
 
+	/*
+	 * AVIC is inhibited because the vCPU has x2apic enabled and x2AVIC is
+	 * not supported.
+	 */
+	APICV_INHIBIT_REASON_X2APIC,
+
 	/*
 	 * AVIC is disabled because SEV doesn't support it.
 	 */
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index f3a74c8284cb..1d516d658f9a 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -71,22 +71,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
 	vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
 
 	vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
-
-	/* Note:
-	 * KVM can support hybrid-AVIC mode, where KVM emulates x2APIC
-	 * MSR accesses, while interrupt injection to a running vCPU
-	 * can be achieved using AVIC doorbell. The AVIC hardware still
-	 * accelerate MMIO accesses, but this does not cause any harm
-	 * as the guest is not supposed to access xAPIC mmio when uses x2APIC.
-	 */
-	if (apic_x2apic_mode(svm->vcpu.arch.apic) &&
-	    avic_mode == AVIC_MODE_X2) {
+	if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
 		vmcb->control.int_ctl |= X2APIC_MODE_MASK;
 		vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
 		/* Disabling MSR intercept for x2APIC registers */
 		svm_set_x2apic_msr_interception(svm, false);
 	} else {
-		/* For xAVIC and hybrid-xAVIC modes */
 		vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
 		/* Enabling MSR intercept for x2APIC registers */
 		svm_set_x2apic_msr_interception(svm, true);
@@ -537,6 +527,14 @@ unsigned long avic_vcpu_get_apicv_inhibit_reasons(struct kvm_vcpu *vcpu)
 {
 	if (is_guest_mode(vcpu))
 		return APICV_INHIBIT_REASON_NESTED;
+
+	/*
+	 * AVIC must be disabled if x2AVIC isn't supported and the guest has
+	 * x2APIC enabled.
+	 */
+	if (avic_mode != AVIC_MODE_X2 && apic_x2apic_mode(vcpu->arch.apic))
+		return APICV_INHIBIT_REASON_X2APIC;
+
 	return 0;
 }
 
@@ -993,6 +991,7 @@ bool avic_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason)
 			  BIT(APICV_INHIBIT_REASON_NESTED) |
 			  BIT(APICV_INHIBIT_REASON_IRQWIN) |
 			  BIT(APICV_INHIBIT_REASON_PIT_REINJ) |
+			  BIT(APICV_INHIBIT_REASON_X2APIC) |
 			  BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
 			  BIT(APICV_INHIBIT_REASON_SEV)      |
 			  BIT(APICV_INHIBIT_REASON_APIC_ID_MODIFIED) |
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 04/19] KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (2 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode" Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31  9:36   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick Sean Christopherson
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Replace the "avic_mode" enum with a single bool to track whether or not
x2AVIC is enabled.  KVM already has "apicv_enabled" that tracks if any
flavor of AVIC is enabled, i.e. AVIC_MODE_NONE and AVIC_MODE_X1 are
redundant and unnecessary noise.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 45 +++++++++++++++++++----------------------
 arch/x86/kvm/svm/svm.c  |  2 +-
 arch/x86/kvm/svm/svm.h  |  9 +--------
 3 files changed, 23 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 1d516d658f9a..b59f8ee2671f 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -53,7 +53,7 @@ static DEFINE_HASHTABLE(svm_vm_data_hash, SVM_VM_DATA_HASH_BITS);
 static u32 next_vm_id = 0;
 static bool next_vm_id_wrapped = 0;
 static DEFINE_SPINLOCK(svm_vm_data_hash_lock);
-enum avic_modes avic_mode;
+bool x2avic_enabled;
 
 /*
  * This is a wrapper of struct amd_iommu_ir_data.
@@ -231,8 +231,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
 	u64 *avic_physical_id_table;
 	struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);
 
-	if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
-	    (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
+	if ((!x2avic_enabled && index > AVIC_MAX_PHYSICAL_ID) ||
+	    (index > X2AVIC_MAX_PHYSICAL_ID))
 		return NULL;
 
 	avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
@@ -279,8 +279,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
 	int id = vcpu->vcpu_id;
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
-	    (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
+	if ((!x2avic_enabled && id > AVIC_MAX_PHYSICAL_ID) ||
+	    (id > X2AVIC_MAX_PHYSICAL_ID))
 		return -EINVAL;
 
 	if (!vcpu->arch.apic->regs)
@@ -532,7 +532,7 @@ unsigned long avic_vcpu_get_apicv_inhibit_reasons(struct kvm_vcpu *vcpu)
 	 * AVIC must be disabled if x2AVIC isn't supported and the guest has
 	 * x2APIC enabled.
 	 */
-	if (avic_mode != AVIC_MODE_X2 && apic_x2apic_mode(vcpu->arch.apic))
+	if (!x2avic_enabled && apic_x2apic_mode(vcpu->arch.apic))
 		return APICV_INHIBIT_REASON_X2APIC;
 
 	return 0;
@@ -1086,10 +1086,7 @@ void avic_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct vmcb *vmcb = svm->vmcb01.ptr;
 
-	if (!lapic_in_kernel(vcpu) || avic_mode == AVIC_MODE_NONE)
-		return;
-
-	if (!enable_apicv)
+	if (!lapic_in_kernel(vcpu) || !enable_apicv)
 		return;
 
 	if (kvm_vcpu_apicv_active(vcpu)) {
@@ -1165,32 +1162,32 @@ bool avic_hardware_setup(struct kvm_x86_ops *x86_ops)
 	if (!npt_enabled)
 		return false;
 
+	/* AVIC is a prerequisite for x2AVIC. */
+	if (!boot_cpu_has(X86_FEATURE_AVIC) && !force_avic) {
+		if (boot_cpu_has(X86_FEATURE_X2AVIC)) {
+			pr_warn(FW_BUG "Cannot support x2AVIC due to AVIC is disabled");
+			pr_warn(FW_BUG "Try enable AVIC using force_avic option");
+		}
+		return false;
+	}
+
 	if (boot_cpu_has(X86_FEATURE_AVIC)) {
-		avic_mode = AVIC_MODE_X1;
 		pr_info("AVIC enabled\n");
 	} else if (force_avic) {
 		/*
 		 * Some older systems does not advertise AVIC support.
 		 * See Revision Guide for specific AMD processor for more detail.
 		 */
-		avic_mode = AVIC_MODE_X1;
 		pr_warn("AVIC is not supported in CPUID but force enabled");
 		pr_warn("Your system might crash and burn");
 	}
 
 	/* AVIC is a prerequisite for x2AVIC. */
-	if (boot_cpu_has(X86_FEATURE_X2AVIC)) {
-		if (avic_mode == AVIC_MODE_X1) {
-			avic_mode = AVIC_MODE_X2;
-			pr_info("x2AVIC enabled\n");
-		} else {
-			pr_warn(FW_BUG "Cannot support x2AVIC due to AVIC is disabled");
-			pr_warn(FW_BUG "Try enable AVIC using force_avic option");
-		}
-	}
+	x2avic_enabled = boot_cpu_has(X86_FEATURE_X2AVIC);
+	if (x2avic_enabled)
+		pr_info("x2AVIC enabled\n");
 
-	if (avic_mode != AVIC_MODE_NONE)
-		amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
+	amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
 
-	return !!avic_mode;
+	return true;
 }
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f3813dbacb9f..b25c89069128 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -821,7 +821,7 @@ void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool intercept)
 	if (intercept == svm->x2avic_msrs_intercepted)
 		return;
 
-	if (avic_mode != AVIC_MODE_X2 ||
+	if (!x2avic_enabled ||
 	    !apic_x2apic_mode(svm->vcpu.arch.apic))
 		return;
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..cee79ade400a 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -35,14 +35,7 @@ extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
 extern bool npt_enabled;
 extern int vgif;
 extern bool intercept_smi;
-
-enum avic_modes {
-	AVIC_MODE_NONE = 0,
-	AVIC_MODE_X1,
-	AVIC_MODE_X2,
-};
-
-extern enum avic_modes avic_mode;
+extern bool x2avic_enabled;
 
 /*
  * Clean bits in VMCB.
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (3 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 04/19] KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31  9:38   ` Maxim Levitsky
  2022-09-01  2:56   ` Li,Rongqing
  2022-08-31  0:34 ` [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16] Sean Christopherson
                   ` (13 subsequent siblings)
  18 siblings, 2 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Compute the destination from ICRH using the sender's x2APIC status, not
each (potential) target's x2APIC status.

Fixes: c514d3a348ac ("KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID")
Cc: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index b59f8ee2671f..3ace0f2f52f0 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -441,6 +441,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
 				   u32 icrl, u32 icrh, u32 index)
 {
+	u32 dest = apic_x2apic_mode(source) ? icrh : GET_XAPIC_DEST_FIELD(icrh);
 	unsigned long i;
 	struct kvm_vcpu *vcpu;
 
@@ -456,13 +457,6 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
 	 * since entered the guest will have processed pending IRQs at VMRUN.
 	 */
 	kvm_for_each_vcpu(i, vcpu, kvm) {
-		u32 dest;
-
-		if (apic_x2apic_mode(vcpu->arch.apic))
-			dest = icrh;
-		else
-			dest = GET_XAPIC_DEST_FIELD(icrh);
-
 		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
 					dest, icrl & APIC_DEST_MASK)) {
 			vcpu->arch.apic->irr_pending = true;
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16]
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (4 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31  6:09   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 07/19] KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check Sean Christopherson
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

When attempting a fast kick for x2AVIC, get the destination bitmap from
ICR[15:0], not ICHR[31:16].  The upper 16 bits contain the cluster, the
lower 16 bits hold the bitmap.

Fixes: 603ccef42ce9 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast")
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 3ace0f2f52f0..3c333cd2e752 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -368,7 +368,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 
 		if (apic_x2apic_mode(source)) {
 			/* 16 bit dest mask, 16 bit cluster id */
-			bitmap = dest & 0xFFFF0000;
+			bitmap = dest & 0xFFFF;
 			cluster = (dest >> 16) << 4;
 		} else if (kvm_lapic_get_reg(source, APIC_DFR) == APIC_DFR_FLAT) {
 			/* 8 bit dest mask*/
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 07/19] KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (5 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16] Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31  6:19   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 08/19] KVM: SVM: Remove redundant cluster calculation that also creates a shadow Sean Christopherson
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Use the already-calculated-and-sanity-checked destination bitmap when
processing a fast AVIC kick in logical mode, and drop the logical path's
flawed logic.  The intent of the check is to ensure the bitmap is a power
of two, whereas "icrh != (1 << avic)" effectively checks that the bitmap
is a power of two _and_ the target cluster is '0'.

Note, the flawed check isn't a functional issue, it simply means that KVM
will go down the slow path if the target cluster is non-zero.

Fixes: 8c9e639da435 ("KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 3c333cd2e752..14f567550a1e 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -411,15 +411,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 			 * Instead, calculate physical ID from logical ID in ICRH.
 			 */
 			int cluster = (icrh & 0xffff0000) >> 16;
-			int apic = ffs(icrh & 0xffff) - 1;
-
-			/*
-			 * If the x2APIC logical ID sub-field (i.e. icrh[15:0])
-			 * contains anything but a single bit, we cannot use the
-			 * fast path, because it is limited to a single vCPU.
-			 */
-			if (apic < 0 || icrh != (1 << apic))
-				return -EINVAL;
+			int apic = ffs(bitmap) - 1;
 
 			l1_physical_id = (cluster << 4) + apic;
 		}
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 08/19] KVM: SVM: Remove redundant cluster calculation that also creates a shadow
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (6 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 07/19] KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31 10:19   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 09/19] KVM: SVM: Drop duplicate calcuation of AVIC/x2AVIC "logical index" Sean Christopherson
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Drop the redundant "cluster" calculation and its horrific shadowing in
the x2avic logical mode path.  The "cluster" that is declared at an outer
scope is derived using the exact same calculation and has performed the
left-shift.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 14f567550a1e..8c9cad96008e 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -410,10 +410,9 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 			 * For x2APIC logical mode, cannot leverage the index.
 			 * Instead, calculate physical ID from logical ID in ICRH.
 			 */
-			int cluster = (icrh & 0xffff0000) >> 16;
 			int apic = ffs(bitmap) - 1;
 
-			l1_physical_id = (cluster << 4) + apic;
+			l1_physical_id = cluster + apic;
 		}
 	}
 
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 09/19] KVM: SVM: Drop duplicate calcuation of AVIC/x2AVIC "logical index"
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (7 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 08/19] KVM: SVM: Remove redundant cluster calculation that also creates a shadow Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31  0:34 ` [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch Sean Christopherson
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Drop the duplicate calculation of the logical "index", which ends up
being the same for x2APIC vs. xAPIC: cluster + bit number.

Note, the existing code is a mess and uses ffs(), which is 1-based, and
__ffs(), which is 0-based, for the exact same calculation, i.e. replacing
"ffs(bitmap) - 1" with "__ffs(bitmap)" is intentional.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 8c9cad96008e..05a1cde8175c 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -407,12 +407,14 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 					 AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK;
 		} else {
 			/*
-			 * For x2APIC logical mode, cannot leverage the index.
-			 * Instead, calculate physical ID from logical ID in ICRH.
+			 * For x2APIC, the logical APIC ID is a read-only value
+			 * that is derived from the x2APIC ID, thus the x2APIC
+			 * ID can be found by reversing the calculation (done
+			 * above).  Note, bits 31:20 of the x2APIC ID are not
+			 * propagated to the logical ID, but KVM limits the
+			 * x2APIC ID limited to KVM_MAX_VCPU_IDS.
 			 */
-			int apic = ffs(bitmap) - 1;
-
-			l1_physical_id = cluster + apic;
+			l1_physical_id = logid_index;
 		}
 	}
 
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (8 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 09/19] KVM: SVM: Drop duplicate calcuation of AVIC/x2AVIC "logical index" Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31 10:22   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 11/19] KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU Sean Christopherson
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Document that AVIC is inhibited if any vCPU's APIC ID diverges from its
vCPU ID, i.e. that there's no need to check for a destination match in
the AVIC kick fast path.

Opportunistically tweak comments to remove "guest bug", as that suggests
KVM is punting on error handling, which is not the case.  Targeting a
non-existent vCPU or no vCPUs _may_ be a guest software bug, but whether
or not it's a guest bug is irrelevant.  Such behavior is architecturally
legal and thus needs to faithfully emulated by KVM (and it is).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 05a1cde8175c..3959d4766911 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -380,8 +380,8 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 			cluster = (dest >> 4) << 2;
 		}
 
+		/* Nothing to do if there are no destinations in the cluster. */
 		if (unlikely(!bitmap))
-			/* guest bug: nobody to send the logical interrupt to */
 			return 0;
 
 		if (!is_power_of_2(bitmap))
@@ -399,7 +399,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 			if (WARN_ON_ONCE(index != logid_index))
 				return -EINVAL;
 
-			/* guest bug: non existing/reserved logical destination */
+			/* Nothing to do if the logical destination is invalid. */
 			if (unlikely(!(logid_entry & AVIC_LOGICAL_ID_ENTRY_VALID_MASK)))
 				return 0;
 
@@ -418,9 +418,13 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 		}
 	}
 
+	/*
+	 * KVM inhibits AVIC if any vCPU ID diverges from the vCPUs APIC ID,
+	 * i.e. APIC ID == vCPU ID.  Once again, nothing to do if the target
+	 * vCPU doesn't exist.
+	 */
 	target_vcpu = kvm_get_vcpu_by_id(kvm, l1_physical_id);
 	if (unlikely(!target_vcpu))
-		/* guest bug: non existing vCPU is a target of this IPI*/
 		return 0;
 
 	target_vcpu->arch.apic->irr_pending = true;
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 11/19] KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (9 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31 10:25   ` Maxim Levitsky
  2022-08-31  0:34 ` [PATCH 12/19] KVM: x86: Disable APIC logical map if logical ID covers multiple MDAs Sean Christopherson
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Add a helper to perform the final kick, two instances of the ICR decoding
is one too many.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 3959d4766911..2095ece70712 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -329,6 +329,16 @@ void avic_ring_doorbell(struct kvm_vcpu *vcpu)
 	put_cpu();
 }
 
+
+static void avic_kick_vcpu(struct kvm_vcpu *vcpu, u32 icrl)
+{
+	vcpu->arch.apic->irr_pending = true;
+	svm_complete_interrupt_delivery(vcpu,
+					icrl & APIC_MODE_MASK,
+					icrl & APIC_INT_LEVELTRIG,
+					icrl & APIC_VECTOR_MASK);
+}
+
 /*
  * A fast-path version of avic_kick_target_vcpus(), which attempts to match
  * destination APIC ID to vCPU without looping through all vCPUs.
@@ -427,11 +437,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 	if (unlikely(!target_vcpu))
 		return 0;
 
-	target_vcpu->arch.apic->irr_pending = true;
-	svm_complete_interrupt_delivery(target_vcpu,
-					icrl & APIC_MODE_MASK,
-					icrl & APIC_INT_LEVELTRIG,
-					icrl & APIC_VECTOR_MASK);
+	avic_kick_vcpu(target_vcpu, icrl);
 	return 0;
 }
 
@@ -455,13 +461,8 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
 	 */
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
-					dest, icrl & APIC_DEST_MASK)) {
-			vcpu->arch.apic->irr_pending = true;
-			svm_complete_interrupt_delivery(vcpu,
-							icrl & APIC_MODE_MASK,
-							icrl & APIC_INT_LEVELTRIG,
-							icrl & APIC_VECTOR_MASK);
-		}
+					dest, icrl & APIC_DEST_MASK))
+			avic_kick_vcpu(vcpu, icrl);
 	}
 }
 
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 12/19] KVM: x86: Disable APIC logical map if logical ID covers multiple MDAs
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (10 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 11/19] KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU Sean Christopherson
@ 2022-08-31  0:34 ` Sean Christopherson
  2022-08-31 13:23   ` Maxim Levitsky
  2022-08-31  0:35 ` [PATCH 13/19] KVM: x86: Disable APIC logical map if vCPUs are aliased in logical mode Sean Christopherson
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Disable the optimized APIC logical map if a logical ID covers multiple
MDAs, i.e. if a vCPU has multiple bits set in its ID.  In logical mode,
events match if "ID & MDA != 0", i.e. creating an entry for only the
first bit can cause interrupts to be missed.

Note, creating an entry for every bit is also wrong as KVM would generate
IPIs for every matching bit.  It would be possible to teach KVM to play
nice with this edge case, but it is very much an edge case and probably
not used in any real world OS, i.e. it's not worth optimizing.

Use an impossible value for the "mode" to effectively designate that it's
disabled.  Don't bother adding a dedicated "invalid" value, the mode
handling will be cleaned up in the future and it would take just as much
effort to explain what value is "safe" at this time.

Fixes: 1e08ec4a130e ("KVM: optimize apic interrupt delivery")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/lapic.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9dda989a1cf0..82278acae95b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -300,8 +300,15 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 		if (!kvm_apic_map_get_logical_dest(new, ldr, &cluster, &mask))
 			continue;
 
-		if (mask)
-			cluster[ffs(mask) - 1] = apic;
+		if (!mask)
+			continue;
+
+		if (!is_power_of_2(mask)) {
+			new->mode = KVM_APIC_MODE_XAPIC_FLAT |
+				    KVM_APIC_MODE_XAPIC_CLUSTER;
+			continue;
+		}
+		cluster[ffs(mask) - 1] = apic;
 	}
 out:
 	old = rcu_dereference_protected(kvm->arch.apic_map,
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 13/19] KVM: x86: Disable APIC logical map if vCPUs are aliased in logical mode
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (11 preceding siblings ...)
  2022-08-31  0:34 ` [PATCH 12/19] KVM: x86: Disable APIC logical map if logical ID covers multiple MDAs Sean Christopherson
@ 2022-08-31  0:35 ` Sean Christopherson
  2022-08-31 13:24   ` Maxim Levitsky
  2022-08-31  0:35 ` [PATCH 14/19] KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs Sean Christopherson
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Disable the optimized APIC logical map if multiple vCPUs are aliased to
the same logical ID.  Architecturally, all CPUs whose logical ID matches
the MDA are supposed to receive the interrupt; overwriting existing map
entries can result in missed IPIs.

Fixes: 1e08ec4a130e ("KVM: optimize apic interrupt delivery")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/lapic.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 82278acae95b..d537b51295d6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -303,12 +303,13 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 		if (!mask)
 			continue;
 
-		if (!is_power_of_2(mask)) {
+		ldr = ffs(mask) - 1;
+		if (!is_power_of_2(mask) || cluster[ldr]) {
 			new->mode = KVM_APIC_MODE_XAPIC_FLAT |
 				    KVM_APIC_MODE_XAPIC_CLUSTER;
 			continue;
 		}
-		cluster[ffs(mask) - 1] = apic;
+		cluster[ldr] = apic;
 	}
 out:
 	old = rcu_dereference_protected(kvm->arch.apic_map,
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 14/19] KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (12 preceding siblings ...)
  2022-08-31  0:35 ` [PATCH 13/19] KVM: x86: Disable APIC logical map if vCPUs are aliased in logical mode Sean Christopherson
@ 2022-08-31  0:35 ` Sean Christopherson
  2022-08-31 13:37   ` Maxim Levitsky
  2022-08-31  0:35 ` [PATCH 15/19] KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0 Sean Christopherson
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Apply KVM's hotplug hack if and only if userspace has enabled 32-bit IDs
for x2APIC.  If 32-bit IDs are not enabled, disable the optimized map to
honor x86 architectural behavior if multiple vCPUs shared a physical APIC
ID.  As called out in the changelog that added the hack, all CPUs whose
(possibly truncated) APIC ID matches the target are supposed to receive
the IPI.

  KVM intentionally differs from real hardware, because real hardware
  (Knights Landing) does just "x2apic_id & 0xff" to decide whether to
  accept the interrupt in xAPIC mode and it can deliver one interrupt to
  more than one physical destination, e.g. 0x123 to 0x123 and 0x23.

Applying the hack even when x2APIC is not fully enabled means KVM doesn't
correctly handle scenarios where the guest has aliased xAPIC IDs across
multiple vCPUs, as only the vCPU with the lowest vCPU ID will receive any
interrupts.  It's extremely unlikely any real world guest aliase APIC IDs,
or even modifies APIC IDs, but KVM's behavior is arbitrary, e.g. the
lowest vCPU ID "wins" regardless of which vCPU is "aliasing" and which
vCPU is "normal".

Furthermore, the hack is _not_ guaranteed to work!  The hack works if and
only if the optimized APIC map is successfully allocated.  If the map
allocation fails (unlikely), KVM will fall back to its unoptimized
behavior, which _does_ honor the architectural behavior.

Pivot on 32-bit x2APIC IDs being enabled as that is required to take
advantage of the hotplug hack (see kvm_apic_state_fixup()), i.e. won't
break existing setups unless they are way, way off in the weeds.

And an entry in KVM's errata to document the hack.  Alternatively, KVM
could provide an actual x2APIC quirk and document the hack that way, but
there's unlikely to ever be a use case for disabling the quirk.  Go the
errata route to avoid having to validate a quirk no one cares about.

Fixes: 5bd5db385b3e ("KVM: x86: allow hotplug of VCPU with APIC ID over 0xff")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/x86/errata.rst | 11 ++++++
 arch/x86/kvm/lapic.c                  | 50 ++++++++++++++++++++++-----
 2 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/Documentation/virt/kvm/x86/errata.rst b/Documentation/virt/kvm/x86/errata.rst
index 410e0aa63493..49a05f24747b 100644
--- a/Documentation/virt/kvm/x86/errata.rst
+++ b/Documentation/virt/kvm/x86/errata.rst
@@ -37,3 +37,14 @@ Nested virtualization features
 ------------------------------
 
 TBD
+
+x2APIC
+------
+When KVM_X2APIC_API_USE_32BIT_IDS is enabled, KVM activates a hack/quirk that
+allows sending events to a single vCPU using its x2APIC ID even if the target
+vCPU has legacy xAPIC enabled, e.g. to bring up hotplugged vCPUs via INIT-SIPI
+on VMs with > 255 vCPUs.  A side effect of the quirk is that, if multiple vCPUs
+have the same physical APIC ID, KVM will deliver events targeting that APIC ID
+only to the vCPU with the lowest vCPU ID.  If KVM_X2APIC_API_USE_32BIT_IDS is
+not enabled, KVM follows x86 architecture when processing interrupts (all vCPUs
+matching the target APIC ID receive the interrupt).
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index d537b51295d6..c224b5c7cd92 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -260,10 +260,10 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		struct kvm_lapic *apic = vcpu->arch.apic;
 		struct kvm_lapic **cluster;
+		u32 x2apic_id, physical_id;
 		u16 mask;
 		u32 ldr;
 		u8 xapic_id;
-		u32 x2apic_id;
 
 		if (!kvm_apic_present(vcpu))
 			continue;
@@ -271,16 +271,48 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 		xapic_id = kvm_xapic_id(apic);
 		x2apic_id = kvm_x2apic_id(apic);
 
-		/* Hotplug hack: see kvm_apic_match_physical_addr(), ... */
-		if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
-				x2apic_id <= new->max_apic_id)
-			new->phys_map[x2apic_id] = apic;
 		/*
-		 * ... xAPIC ID of VCPUs with APIC ID > 0xff will wrap-around,
-		 * prevent them from masking VCPUs with APIC ID <= 0xff.
+		 * Apply KVM's hotplug hack if userspace has enable 32-bit APIC
+		 * IDs.  Allow sending events to vCPUs by their x2APIC ID even
+		 * if the target vCPU is in legacy xAPIC mode, and silently
+		 * ignore aliased xAPIC IDs (the x2APIC ID is truncated to 8
+		 * bits, causing IDs > 0xff to wrap and collide).
+		 *
+		 * Honor the architectural (and KVM's non-optimized) behavior
+		 * if userspace has not enabled 32-bit x2APIC IDs.  Each APIC
+		 * is supposed to process messages independently.  If multiple
+		 * vCPUs have the same effective APIC ID, e.g. due to the
+		 * x2APIC wrap or because the guest manually modified its xAPIC
+		 * IDs, events targeting that ID are supposed to be recognized
+		 * by all vCPUs with said ID.
 		 */
-		if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
-			new->phys_map[xapic_id] = apic;
+		if (kvm->arch.x2apic_format) {
+			/* See also kvm_apic_match_physical_addr(). */
+			if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
+			    x2apic_id <= new->max_apic_id)
+				new->phys_map[x2apic_id] = apic;
+
+			if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
+				new->phys_map[xapic_id] = apic;
+		} else {
+			/*
+			 * Disable the optimized map if the physical APIC ID is
+			 * already mapped, i.e. is aliased to multiple vCPUs.
+			 * The optimized map requires a strict 1:1 mapping
+			 * between IDs and vCPUs.
+			 */
+			if (apic_x2apic_mode(apic))
+				physical_id = x2apic_id;
+			else
+				physical_id = xapic_id;
+
+			if (new->phys_map[physical_id]) {
+				kvfree(new);
+				new = NULL;
+				goto out;
+			}
+			new->phys_map[physical_id] = apic;
+		}
 
 		if (!kvm_apic_sw_enabled(apic))
 			continue;
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 15/19] KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (13 preceding siblings ...)
  2022-08-31  0:35 ` [PATCH 14/19] KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs Sean Christopherson
@ 2022-08-31  0:35 ` Sean Christopherson
  2022-08-31 13:41   ` Maxim Levitsky
  2022-08-31  0:35 ` [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes Sean Christopherson
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Explicitly skip the optimized map setup if the vCPU's LDR is '0', i.e. if
the vCPU will never response to logical mode interrupts.  KVM already
skips setup in this case, but relies on kvm_apic_map_get_logical_dest()
to generate mask==0.  KVM still needs the mask=0 check as a non-zero LDR
can yield mask==0 depending on the mode, but explicitly handling the LDR
will make it simpler to clean up the logical mode tracking in the future.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/lapic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index c224b5c7cd92..8209caffe3ab 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -318,10 +318,12 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 			continue;
 
 		ldr = kvm_lapic_get_reg(apic, APIC_LDR);
+		if (!ldr)
+			continue;
 
 		if (apic_x2apic_mode(apic)) {
 			new->mode |= KVM_APIC_MODE_X2APIC;
-		} else if (ldr) {
+		} else {
 			ldr = GET_APIC_LOGICAL_ID(ldr);
 			if (kvm_lapic_get_reg(apic, APIC_DFR) == APIC_DFR_FLAT)
 				new->mode |= KVM_APIC_MODE_XAPIC_FLAT;
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (14 preceding siblings ...)
  2022-08-31  0:35 ` [PATCH 15/19] KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0 Sean Christopherson
@ 2022-08-31  0:35 ` Sean Christopherson
  2022-08-31 13:43   ` Maxim Levitsky
  2022-08-31 18:42   ` Maxim Levitsky
  2022-08-31  0:35 ` [PATCH 17/19] KVM: SVM: Handle multiple logical targets in AVIC kick fastpath Sean Christopherson
                   ` (2 subsequent siblings)
  18 siblings, 2 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Track all possibilities for the optimized APIC map's logical modes
instead of overloading the pseudo-bitmap and treating any "unknown" value
as "invalid".

As documented by the now-stale comment above the mode values, the values
did have meaning when the optimized map was originally added.  That
dependent logical was removed by commit e45115b62f9a ("KVM: x86: use
physical LAPIC array for logical x2APIC"), but the obfuscated behavior
and its comment were left behind.

Opportunistically rename "mode" to "logical_mode", partly to make it
clear that the "disabled" case applies only to the logical map, but also
to prove that there is no lurking code that expects "mode" to be a bitmap.

Functionally, this is a glorified nop.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 21 ++++++++++--------
 arch/x86/kvm/lapic.c            | 38 ++++++++++++++++++++++++---------
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1f51411f3112..0184e64ab555 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -955,19 +955,22 @@ struct kvm_arch_memory_slot {
 };
 
 /*
- * We use as the mode the number of bits allocated in the LDR for the
- * logical processor ID.  It happens that these are all powers of two.
- * This makes it is very easy to detect cases where the APICs are
- * configured for multiple modes; in that case, we cannot use the map and
- * hence cannot use kvm_irq_delivery_to_apic_fast either.
+ * Track the mode of the optimized logical map, as the rules for decoding the
+ * destination vary per mode.  Enabling the optimized logical map requires all
+ * software-enabled local APIs to be in the same mode, each addressable APIC to
+ * be mapped to only one MDA, and each MDA to map to at most one APIC.
  */
-#define KVM_APIC_MODE_XAPIC_CLUSTER          4
-#define KVM_APIC_MODE_XAPIC_FLAT             8
-#define KVM_APIC_MODE_X2APIC                16
+enum kvm_apic_logical_mode {
+	KVM_APIC_MODE_SW_DISABLED,
+	KVM_APIC_MODE_XAPIC_CLUSTER,
+	KVM_APIC_MODE_XAPIC_FLAT,
+	KVM_APIC_MODE_X2APIC,
+	KVM_APIC_MODE_MAP_DISABLED,
+};
 
 struct kvm_apic_map {
 	struct rcu_head rcu;
-	u8 mode;
+	enum kvm_apic_logical_mode logical_mode;
 	u32 max_apic_id;
 	union {
 		struct kvm_lapic *xapic_flat_map[8];
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8209caffe3ab..3b6ef36b3963 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -168,7 +168,12 @@ static bool kvm_use_posted_timer_interrupt(struct kvm_vcpu *vcpu)
 
 static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
 		u32 dest_id, struct kvm_lapic ***cluster, u16 *mask) {
-	switch (map->mode) {
+	switch (map->logical_mode) {
+	case KVM_APIC_MODE_SW_DISABLED:
+		/* Arbitrarily use the flat map so that @cluster isn't NULL. */
+		*cluster = map->xapic_flat_map;
+		*mask = 0;
+		return true;
 	case KVM_APIC_MODE_X2APIC: {
 		u32 offset = (dest_id >> 16) * 16;
 		u32 max_apic_id = map->max_apic_id;
@@ -193,8 +198,10 @@ static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
 		*cluster = map->xapic_cluster_map[(dest_id >> 4) & 0xf];
 		*mask = dest_id & 0xf;
 		return true;
+	case KVM_APIC_MODE_MAP_DISABLED:
+		return false;
 	default:
-		/* Not optimized. */
+		WARN_ON_ONCE(1);
 		return false;
 	}
 }
@@ -256,10 +263,12 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 		goto out;
 
 	new->max_apic_id = max_id;
+	new->logical_mode = KVM_APIC_MODE_SW_DISABLED;
 
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		struct kvm_lapic *apic = vcpu->arch.apic;
 		struct kvm_lapic **cluster;
+		enum kvm_apic_logical_mode logical_mode;
 		u32 x2apic_id, physical_id;
 		u16 mask;
 		u32 ldr;
@@ -314,7 +323,8 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 			new->phys_map[physical_id] = apic;
 		}
 
-		if (!kvm_apic_sw_enabled(apic))
+		if (new->logical_mode == KVM_APIC_MODE_MAP_DISABLED ||
+		    !kvm_apic_sw_enabled(apic))
 			continue;
 
 		ldr = kvm_lapic_get_reg(apic, APIC_LDR);
@@ -322,25 +332,33 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
 			continue;
 
 		if (apic_x2apic_mode(apic)) {
-			new->mode |= KVM_APIC_MODE_X2APIC;
+			logical_mode = KVM_APIC_MODE_X2APIC;
 		} else {
 			ldr = GET_APIC_LOGICAL_ID(ldr);
 			if (kvm_lapic_get_reg(apic, APIC_DFR) == APIC_DFR_FLAT)
-				new->mode |= KVM_APIC_MODE_XAPIC_FLAT;
+				logical_mode = KVM_APIC_MODE_XAPIC_FLAT;
 			else
-				new->mode |= KVM_APIC_MODE_XAPIC_CLUSTER;
+				logical_mode = KVM_APIC_MODE_XAPIC_CLUSTER;
 		}
+		if (new->logical_mode != KVM_APIC_MODE_SW_DISABLED &&
+		    new->logical_mode != logical_mode) {
+			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
+			continue;
+		}
+		new->logical_mode = logical_mode;
 
-		if (!kvm_apic_map_get_logical_dest(new, ldr, &cluster, &mask))
+		if (WARN_ON_ONCE(!kvm_apic_map_get_logical_dest(new, ldr,
+								&cluster, &mask))) {
+			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
 			continue;
+		}
 
 		if (!mask)
 			continue;
 
 		ldr = ffs(mask) - 1;
 		if (!is_power_of_2(mask) || cluster[ldr]) {
-			new->mode = KVM_APIC_MODE_XAPIC_FLAT |
-				    KVM_APIC_MODE_XAPIC_CLUSTER;
+			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
 			continue;
 		}
 		cluster[ldr] = apic;
@@ -993,7 +1011,7 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
 {
 	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
 		if ((irq->dest_id == APIC_BROADCAST &&
-				map->mode != KVM_APIC_MODE_X2APIC))
+		     map->logical_mode != KVM_APIC_MODE_X2APIC))
 			return true;
 		if (irq->dest_id == X2APIC_BROADCAST)
 			return true;
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 17/19] KVM: SVM: Handle multiple logical targets in AVIC kick fastpath
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (15 preceding siblings ...)
  2022-08-31  0:35 ` [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes Sean Christopherson
@ 2022-08-31  0:35 ` Sean Christopherson
  2022-08-31 13:57   ` Maxim Levitsky
  2022-08-31  0:35 ` [PATCH 18/19] KVM: SVM: Ignore writes to Remote Read Data on AVIC write traps Sean Christopherson
  2022-08-31  0:35 ` [PATCH 19/19] Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu" Sean Christopherson
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Iterate over all target logical IDs in the AVIC kick fastpath instead of
bailing if there is more than one target and KVM's optimized APIC map is
enabled for logical mode.  If the optimized map is enabled, all vCPUs are
guaranteed to be mapped 1:1 to a logical ID or effectively have logical
mode disabled, i.e. iterating over the bitmap is guaranteed to kick each
target exactly once.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 126 +++++++++++++++++++++++++---------------
 1 file changed, 79 insertions(+), 47 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 2095ece70712..dad5affe44c1 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -339,6 +339,62 @@ static void avic_kick_vcpu(struct kvm_vcpu *vcpu, u32 icrl)
 					icrl & APIC_VECTOR_MASK);
 }
 
+static void avic_kick_vcpu_by_physical_id(struct kvm *kvm, u32 physical_id,
+					  u32 icrl)
+{
+	/*
+	 * KVM inhibits AVIC if any vCPU ID diverges from the vCPUs APIC ID,
+	 * i.e. APIC ID == vCPU ID.
+	 */
+	struct kvm_vcpu *target_vcpu = kvm_get_vcpu_by_id(kvm, physical_id);
+
+	/* Once again, nothing to do if the target vCPU doesn't exist. */
+	if (unlikely(!target_vcpu))
+		return;
+
+	avic_kick_vcpu(target_vcpu, icrl);
+}
+
+static void avic_kick_vcpu_by_logical_id(struct kvm *kvm, u32 *avic_logical_id_table,
+					 u32 logid_index, u32 icrl)
+{
+	u32 physical_id;
+
+	if (!avic_logical_id_table) {
+		u32 logid_entry = avic_logical_id_table[logid_index];
+
+		/* Nothing to do if the logical destination is invalid. */
+		if (unlikely(!(logid_entry & AVIC_LOGICAL_ID_ENTRY_VALID_MASK)))
+			return;
+
+		physical_id = logid_entry &
+			      AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK;
+	} else {
+		/*
+		 * For x2APIC, the logical APIC ID is a read-only value that is
+		 * derived from the x2APIC ID, thus the x2APIC ID can be found
+		 * by reversing the calculation (stored in logid_index).  Note,
+		 * bits 31:20 of the x2APIC ID aren't propagated to the logical
+		 * ID, but KVM limits the x2APIC ID limited to KVM_MAX_VCPU_IDS.
+		 */
+		physical_id = logid_index;
+	}
+
+	avic_kick_vcpu_by_physical_id(kvm, physical_id, icrl);
+}
+
+static bool is_optimized_logical_map_enabled(struct kvm *kvm)
+{
+	struct kvm_apic_map *map;
+	bool enabled;
+
+	rcu_read_lock();
+	map = rcu_dereference(kvm->arch.apic_map);
+	enabled = map && map->logical_mode != KVM_APIC_MODE_MAP_DISABLED;
+	rcu_read_unlock();
+	return enabled;
+}
+
 /*
  * A fast-path version of avic_kick_target_vcpus(), which attempts to match
  * destination APIC ID to vCPU without looping through all vCPUs.
@@ -346,11 +402,10 @@ static void avic_kick_vcpu(struct kvm_vcpu *vcpu, u32 icrl)
 static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source,
 				       u32 icrl, u32 icrh, u32 index)
 {
-	u32 l1_physical_id, dest;
-	struct kvm_vcpu *target_vcpu;
 	int dest_mode = icrl & APIC_DEST_MASK;
 	int shorthand = icrl & APIC_SHORT_MASK;
 	struct kvm_svm *kvm_svm = to_kvm_svm(kvm);
+	u32 dest;
 
 	if (shorthand != APIC_DEST_NOSHORT)
 		return -EINVAL;
@@ -367,14 +422,14 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 		if (!apic_x2apic_mode(source) && dest == APIC_BROADCAST)
 			return -EINVAL;
 
-		l1_physical_id = dest;
-
-		if (WARN_ON_ONCE(l1_physical_id != index))
+		if (WARN_ON_ONCE(dest != index))
 			return -EINVAL;
 
+		avic_kick_vcpu_by_physical_id(kvm, dest, icrl);
 	} else {
-		u32 bitmap, cluster;
-		int logid_index;
+		u32 *avic_logical_id_table;
+		unsigned long bitmap, i;
+		u32 cluster;
 
 		if (apic_x2apic_mode(source)) {
 			/* 16 bit dest mask, 16 bit cluster id */
@@ -394,50 +449,27 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
 		if (unlikely(!bitmap))
 			return 0;
 
-		if (!is_power_of_2(bitmap))
-			/* multiple logical destinations, use slow path */
+		/*
+		 * Use the slow path if more than one bit is set in the bitmap
+		 * and KVM's optimized logical map is disabled to avoid kicking
+		 * a vCPU multiple times.  If the optimized map is disabled, a
+		 * vCPU _may_ have multiple bits set in its logical ID, i.e.
+		 * may have multiple entries in the logical table.
+		 */
+		if (!is_power_of_2(bitmap) &&
+		    !is_optimized_logical_map_enabled(kvm))
 			return -EINVAL;
 
-		logid_index = cluster + __ffs(bitmap);
-
-		if (!apic_x2apic_mode(source)) {
-			u32 *avic_logical_id_table =
-				page_address(kvm_svm->avic_logical_id_table_page);
-
-			u32 logid_entry = avic_logical_id_table[logid_index];
-
-			if (WARN_ON_ONCE(index != logid_index))
-				return -EINVAL;
-
-			/* Nothing to do if the logical destination is invalid. */
-			if (unlikely(!(logid_entry & AVIC_LOGICAL_ID_ENTRY_VALID_MASK)))
-				return 0;
-
-			l1_physical_id = logid_entry &
-					 AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK;
-		} else {
-			/*
-			 * For x2APIC, the logical APIC ID is a read-only value
-			 * that is derived from the x2APIC ID, thus the x2APIC
-			 * ID can be found by reversing the calculation (done
-			 * above).  Note, bits 31:20 of the x2APIC ID are not
-			 * propagated to the logical ID, but KVM limits the
-			 * x2APIC ID limited to KVM_MAX_VCPU_IDS.
-			 */
-			l1_physical_id = logid_index;
-		}
+		if (apic_x2apic_mode(source))
+			avic_logical_id_table = NULL;
+		else
+			avic_logical_id_table = page_address(kvm_svm->avic_logical_id_table_page);
+
+		for_each_set_bit(i, &bitmap, 16)
+			avic_kick_vcpu_by_logical_id(kvm, avic_logical_id_table,
+						     cluster + i, icrl);
 	}
 
-	/*
-	 * KVM inhibits AVIC if any vCPU ID diverges from the vCPUs APIC ID,
-	 * i.e. APIC ID == vCPU ID.  Once again, nothing to do if the target
-	 * vCPU doesn't exist.
-	 */
-	target_vcpu = kvm_get_vcpu_by_id(kvm, l1_physical_id);
-	if (unlikely(!target_vcpu))
-		return 0;
-
-	avic_kick_vcpu(target_vcpu, icrl);
 	return 0;
 }
 
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 18/19] KVM: SVM: Ignore writes to Remote Read Data on AVIC write traps
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (16 preceding siblings ...)
  2022-08-31  0:35 ` [PATCH 17/19] KVM: SVM: Handle multiple logical targets in AVIC kick fastpath Sean Christopherson
@ 2022-08-31  0:35 ` Sean Christopherson
  2022-08-31 10:40   ` Maxim Levitsky
  2022-08-31  0:35 ` [PATCH 19/19] Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu" Sean Christopherson
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Drop writes to APIC_RRR, a.k.a. Remote Read Data Register, on AVIC
unaccelerated write traps.  The register is read-only and isn't emulated
by KVM.  Sending the register through kvm_apic_write_nodecode() will
result in screaming when x2APIC is enabled due to the unexpected failure
to retrieve the MSR (KVM expects that only "legal" accesses will trap).

Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index dad5affe44c1..b2033a56010c 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -675,6 +675,9 @@ static int avic_unaccel_trap_write(struct kvm_vcpu *vcpu)
 	case APIC_DFR:
 		avic_handle_dfr_update(vcpu);
 		break;
+	case APIC_RRR:
+		/* Ignore writes to Read Remote Data, it's read-only. */
+		return 1;
 	default:
 		break;
 	}
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 19/19] Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu"
  2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
                   ` (17 preceding siblings ...)
  2022-08-31  0:35 ` [PATCH 18/19] KVM: SVM: Ignore writes to Remote Read Data on AVIC write traps Sean Christopherson
@ 2022-08-31  0:35 ` Sean Christopherson
  2022-08-31  6:07   ` Maxim Levitsky
  18 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31  0:35 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky, Li RongQing

Turns out that some warnings exist for good reasons.  Restore the warning
in avic_vcpu_load() that guards against calling avic_vcpu_load() on a
running vCPU now that KVM avoids doing so when switching between x2APIC
and xAPIC.  The entire point of the WARN is to highlight that KVM should
not be reloading an AVIC.

Opportunistically convert the WARN_ON() to WARN_ON_ONCE() to avoid
spamming the kernel if it does fire.

This reverts commit c0caeee65af3944b7b8abbf566e7cc1fae15c775.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index b2033a56010c..3c300113d40b 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -1080,6 +1080,7 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		return;
 
 	entry = READ_ONCE(*(svm->avic_physical_id_cache));
+	WARN_ON_ONCE(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
 
 	entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
 	entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);
-- 
2.37.2.672.g94769d06f0-goog


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31  0:34 ` [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode" Sean Christopherson
@ 2022-08-31  5:59   ` Maxim Levitsky
  2022-08-31  6:45     ` Maxim Levitsky
  2022-08-31 16:19     ` Sean Christopherson
  0 siblings, 2 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  5:59 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Remove SVM's so called "hybrid-AVIC mode" and reinstate the restriction
> where AVIC is disabled if x2APIC is enabled.  The argument that the
> "guest is not supposed to access xAPIC mmio when uses x2APIC" is flat out
> wrong.  Activating x2APIC completely disables the xAPIC MMIO region,
> there is nothing that says the guest must not access that address.
> 
> Concretely, KVM-Unit-Test's existing "apic" test fails the subtests that
> expect accesses to the APIC base region to not be emulated when x2APIC is
> enabled.
> 
> Furthermore, allowing the guest to trigger MMIO emulation in a mode where
> KVM doesn't expect such emulation to occur is all kinds of dangerous.
> 
> Tweak the restriction so that it only inhibits AVIC if x2APIC is actually
> enabled instead of inhibiting AVIC is x2APIC is exposed to the guest.
> 
> This reverts commit 0e311d33bfbef86da130674e8528cc23e6acfe16.

I don't agree with this patch.

When reviewing this code I did note that MMIO is left enabled which is kind of errata on KVM
side, and nobody objected to this.

Keeping AVIC enabled allows to have performance benefits with guests that have to use x2apic
(e.g after migration, or OS requirements) - aside from emulated msr access they get all the
AVIC benefits like doorbell, IOMMU posted interrupts, etc.

What we should do, and I even suggested doing it 
(and what I somewhat assumed that you will ask when the patch was in review),
is to disable the AVIC mmio hole when one of the guest vCPUs is in x2apic mode
which can be done by disabling our APIC private memslot.

Best regards,
	Maxim Levitsky

> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  6 ++++++
>  arch/x86/kvm/svm/avic.c         | 21 ++++++++++-----------
>  2 files changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2c96c43c313a..1f51411f3112 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1128,6 +1128,12 @@ enum kvm_apicv_inhibit {
>  	 */
>  	APICV_INHIBIT_REASON_PIT_REINJ,
>  
> +	/*
> +	 * AVIC is inhibited because the vCPU has x2apic enabled and x2AVIC is
> +	 * not supported.
> +	 */
> +	APICV_INHIBIT_REASON_X2APIC,
> +
>  	/*
>  	 * AVIC is disabled because SEV doesn't support it.
>  	 */
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index f3a74c8284cb..1d516d658f9a 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -71,22 +71,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
>  	vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
>  
>  	vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> -
> -	/* Note:
> -	 * KVM can support hybrid-AVIC mode, where KVM emulates x2APIC
> -	 * MSR accesses, while interrupt injection to a running vCPU
> -	 * can be achieved using AVIC doorbell. The AVIC hardware still
> -	 * accelerate MMIO accesses, but this does not cause any harm
> -	 * as the guest is not supposed to access xAPIC mmio when uses x2APIC.
> -	 */
> -	if (apic_x2apic_mode(svm->vcpu.arch.apic) &&
> -	    avic_mode == AVIC_MODE_X2) {
> +	if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
>  		vmcb->control.int_ctl |= X2APIC_MODE_MASK;
>  		vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
>  		/* Disabling MSR intercept for x2APIC registers */
>  		svm_set_x2apic_msr_interception(svm, false);
>  	} else {
> -		/* For xAVIC and hybrid-xAVIC modes */
>  		vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
>  		/* Enabling MSR intercept for x2APIC registers */
>  		svm_set_x2apic_msr_interception(svm, true);
> @@ -537,6 +527,14 @@ unsigned long avic_vcpu_get_apicv_inhibit_reasons(struct kvm_vcpu *vcpu)
>  {
>  	if (is_guest_mode(vcpu))
>  		return APICV_INHIBIT_REASON_NESTED;
> +
> +	/*
> +	 * AVIC must be disabled if x2AVIC isn't supported and the guest has
> +	 * x2APIC enabled.
> +	 */
> +	if (avic_mode != AVIC_MODE_X2 && apic_x2apic_mode(vcpu->arch.apic))
> +		return APICV_INHIBIT_REASON_X2APIC;
> +
>  	return 0;
>  }
>  
> @@ -993,6 +991,7 @@ bool avic_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason)
>  			  BIT(APICV_INHIBIT_REASON_NESTED) |
>  			  BIT(APICV_INHIBIT_REASON_IRQWIN) |
>  			  BIT(APICV_INHIBIT_REASON_PIT_REINJ) |
> +			  BIT(APICV_INHIBIT_REASON_X2APIC) |
>  			  BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
>  			  BIT(APICV_INHIBIT_REASON_SEV)      |
>  			  BIT(APICV_INHIBIT_REASON_APIC_ID_MODIFIED) |



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 19/19] Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu"
  2022-08-31  0:35 ` [PATCH 19/19] Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu" Sean Christopherson
@ 2022-08-31  6:07   ` Maxim Levitsky
  2022-08-31  7:03     ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  6:07 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Turns out that some warnings exist for good reasons.  Restore the warning
> in avic_vcpu_load() that guards against calling avic_vcpu_load() on a
> running vCPU now that KVM avoids doing so when switching between x2APIC
> and xAPIC.  The entire point of the WARN is to highlight that KVM should
> not be reloading an AVIC.
> 
> Opportunistically convert the WARN_ON() to WARN_ON_ONCE() to avoid
> spamming the kernel if it does fire.
> 
> This reverts commit c0caeee65af3944b7b8abbf566e7cc1fae15c775.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index b2033a56010c..3c300113d40b 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -1080,6 +1080,7 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  		return;
>  
>  	entry = READ_ONCE(*(svm->avic_physical_id_cache));
> +	WARN_ON_ONCE(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
>  
>  	entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
>  	entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);

Note that this warning was removed because it would trigger wheh x2avic code would switch
between xapic and x2apic.

I do agree 100% that this warning is useful.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16]
  2022-08-31  0:34 ` [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16] Sean Christopherson
@ 2022-08-31  6:09   ` Maxim Levitsky
  2022-08-31  9:43     ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  6:09 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> When attempting a fast kick for x2AVIC, get the destination bitmap from
> ICR[15:0], not ICHR[31:16].  The upper 16 bits contain the cluster, the
> lower 16 bits hold the bitmap.
> 
> Fixes: 603ccef42ce9 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast")
> Cc: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 3ace0f2f52f0..3c333cd2e752 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -368,7 +368,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  
>  		if (apic_x2apic_mode(source)) {
>  			/* 16 bit dest mask, 16 bit cluster id */
> -			bitmap = dest & 0xFFFF0000;
> +			bitmap = dest & 0xFFFF;
>  			cluster = (dest >> 16) << 4;
>  		} else if (kvm_lapic_get_reg(source, APIC_DFR) == APIC_DFR_FLAT) {
>  			/* 8 bit dest mask*/

I swear I have seen a patch from Suravee Suthikulpanit fixing this my mistake, I don't know why it was not
accepted upstream.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 07/19] KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check
  2022-08-31  0:34 ` [PATCH 07/19] KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check Sean Christopherson
@ 2022-08-31  6:19   ` Maxim Levitsky
  2022-08-31 16:37     ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  6:19 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Use the already-calculated-and-sanity-checked destination bitmap when
> processing a fast AVIC kick in logical mode, and drop the logical path's
> flawed logic.  The intent of the check is to ensure the bitmap is a power
> of two, whereas "icrh != (1 << avic)" effectively checks that the bitmap
> is a power of two _and_ the target cluster is '0'.
> 
> Note, the flawed check isn't a functional issue, it simply means that KVM
> will go down the slow path if the target cluster is non-zero.
> 
> Fixes: 8c9e639da435 ("KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 3c333cd2e752..14f567550a1e 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -411,15 +411,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  			 * Instead, calculate physical ID from logical ID in ICRH.
>  			 */
>  			int cluster = (icrh & 0xffff0000) >> 16;
> -			int apic = ffs(icrh & 0xffff) - 1;
> -
> -			/*
> -			 * If the x2APIC logical ID sub-field (i.e. icrh[15:0])
> -			 * contains anything but a single bit, we cannot use the
> -			 * fast path, because it is limited to a single vCPU.
> -			 */
> -			if (apic < 0 || icrh != (1 << apic))
> -				return -EINVAL;
> +			int apic = ffs(bitmap) - 1;
>  
>  			l1_physical_id = (cluster << 4) + apic;
>  		}

Oh, I didn't notice this bug. However isn't removing the check is wrong as well?

What if we do have multiple bits set in the bitmap? After you remove this code,
we will set IPI only to APIC which matches the 1st bit, no?
(The fast code only sends IPI to one vCPU)

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31  5:59   ` Maxim Levitsky
@ 2022-08-31  6:45     ` Maxim Levitsky
  2022-08-31 16:29       ` Sean Christopherson
  2022-08-31 16:19     ` Sean Christopherson
  1 sibling, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  6:45 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 08:59 +0300, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > Remove SVM's so called "hybrid-AVIC mode" and reinstate the restriction
> > where AVIC is disabled if x2APIC is enabled.  The argument that the
> > "guest is not supposed to access xAPIC mmio when uses x2APIC" is flat out
> > wrong.  Activating x2APIC completely disables the xAPIC MMIO region,
> > there is nothing that says the guest must not access that address.
> > 
> > Concretely, KVM-Unit-Test's existing "apic" test fails the subtests that
> > expect accesses to the APIC base region to not be emulated when x2APIC is
> > enabled.
> > 
> > Furthermore, allowing the guest to trigger MMIO emulation in a mode where
> > KVM doesn't expect such emulation to occur is all kinds of dangerous.

Also, unless I misunderstood you, the above statement is wrong.

Leaving AVIC on, when vCPU is in x2apic mode cannot trigger extra MMIO emulation,
in fact the opposite - because AVIC is on, writes to 0xFEE00xxx might *not* trigger
MMIO emulation and instead be emulated by AVIC.

Yes, some of these writes can trigger AVIC specific emulation vm exits, but they
are literaly the same as those used by x2avic, and it is really hard to see
why this would be dangerous (assuming that x2avic code works, and avic code
is aware of this 'hybrid' mode).

From the guest point of view, unless the guest pokes at random MMIO area,
the only case when this matters is if the guest maps RAM over the 0xFEE00xxx
(which it of course can, the spec explictly state as you say that when x2apic
is enabled, the mmio is disabled), and then instead of functioning as RAM,
the range will still function as APIC.

However after *our hack* that we did to avoid removing the APIC private memslot,
regardless of APIC inhibit, userspace likely can't even place RAM memslot over the area, regarless
if x2apic is enabled or not, because it will overlap over our private memslot.

Best regards,
	Maxim Levitsky

> > 
> > Tweak the restriction so that it only inhibits AVIC if x2APIC is actually
> > enabled instead of inhibiting AVIC is x2APIC is exposed to the guest.
> > 
> > This reverts commit 0e311d33bfbef86da130674e8528cc23e6acfe16.
> 
> I don't agree with this patch.
> 
> When reviewing this code I did note that MMIO is left enabled which is kind of errata on KVM
> side, and nobody objected to this.
> 
> Keeping AVIC enabled allows to have performance benefits with guests that have to use x2apic
> (e.g after migration, or OS requirements) - aside from emulated msr access they get all the
> AVIC benefits like doorbell, IOMMU posted interrupts, etc.
> 
> What we should do, and I even suggested doing it 
> (and what I somewhat assumed that you will ask when the patch was in review),
> is to disable the AVIC mmio hole when one of the guest vCPUs is in x2apic mode
> which can be done by disabling our APIC private memslot.
> 
> Best regards,
> 	Maxim Levitsky
> 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  6 ++++++
> >  arch/x86/kvm/svm/avic.c         | 21 ++++++++++-----------
> >  2 files changed, 16 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 2c96c43c313a..1f51411f3112 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1128,6 +1128,12 @@ enum kvm_apicv_inhibit {
> >  	 */
> >  	APICV_INHIBIT_REASON_PIT_REINJ,
> >  
> > +	/*
> > +	 * AVIC is inhibited because the vCPU has x2apic enabled and x2AVIC is
> > +	 * not supported.
> > +	 */
> > +	APICV_INHIBIT_REASON_X2APIC,
> > +
> >  	/*
> >  	 * AVIC is disabled because SEV doesn't support it.
> >  	 */
> > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > index f3a74c8284cb..1d516d658f9a 100644
> > --- a/arch/x86/kvm/svm/avic.c
> > +++ b/arch/x86/kvm/svm/avic.c
> > @@ -71,22 +71,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
> >  	vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
> >  
> >  	vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> > -
> > -	/* Note:
> > -	 * KVM can support hybrid-AVIC mode, where KVM emulates x2APIC
> > -	 * MSR accesses, while interrupt injection to a running vCPU
> > -	 * can be achieved using AVIC doorbell. The AVIC hardware still
> > -	 * accelerate MMIO accesses, but this does not cause any harm
> > -	 * as the guest is not supposed to access xAPIC mmio when uses x2APIC.
> > -	 */
> > -	if (apic_x2apic_mode(svm->vcpu.arch.apic) &&
> > -	    avic_mode == AVIC_MODE_X2) {
> > +	if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
> >  		vmcb->control.int_ctl |= X2APIC_MODE_MASK;
> >  		vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
> >  		/* Disabling MSR intercept for x2APIC registers */
> >  		svm_set_x2apic_msr_interception(svm, false);
> >  	} else {
> > -		/* For xAVIC and hybrid-xAVIC modes */
> >  		vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> >  		/* Enabling MSR intercept for x2APIC registers */
> >  		svm_set_x2apic_msr_interception(svm, true);
> > @@ -537,6 +527,14 @@ unsigned long avic_vcpu_get_apicv_inhibit_reasons(struct kvm_vcpu *vcpu)
> >  {
> >  	if (is_guest_mode(vcpu))
> >  		return APICV_INHIBIT_REASON_NESTED;
> > +
> > +	/*
> > +	 * AVIC must be disabled if x2AVIC isn't supported and the guest has
> > +	 * x2APIC enabled.
> > +	 */
> > +	if (avic_mode != AVIC_MODE_X2 && apic_x2apic_mode(vcpu->arch.apic))
> > +		return APICV_INHIBIT_REASON_X2APIC;
> > +
> >  	return 0;
> >  }
> >  
> > @@ -993,6 +991,7 @@ bool avic_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason)
> >  			  BIT(APICV_INHIBIT_REASON_NESTED) |
> >  			  BIT(APICV_INHIBIT_REASON_IRQWIN) |
> >  			  BIT(APICV_INHIBIT_REASON_PIT_REINJ) |
> > +			  BIT(APICV_INHIBIT_REASON_X2APIC) |
> >  			  BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
> >  			  BIT(APICV_INHIBIT_REASON_SEV)      |
> >  			  BIT(APICV_INHIBIT_REASON_APIC_ID_MODIFIED) |



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 19/19] Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu"
  2022-08-31  6:07   ` Maxim Levitsky
@ 2022-08-31  7:03     ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  7:03 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 09:07 +0300, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> > Turns out that some warnings exist for good reasons.  Restore the warning
> > in avic_vcpu_load() that guards against calling avic_vcpu_load() on a
> > running vCPU now that KVM avoids doing so when switching between x2APIC
> > and xAPIC.  The entire point of the WARN is to highlight that KVM should
> > not be reloading an AVIC.
> > 
> > Opportunistically convert the WARN_ON() to WARN_ON_ONCE() to avoid
> > spamming the kernel if it does fire.
> > 
> > This reverts commit c0caeee65af3944b7b8abbf566e7cc1fae15c775.
> > 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/svm/avic.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > index b2033a56010c..3c300113d40b 100644
> > --- a/arch/x86/kvm/svm/avic.c
> > +++ b/arch/x86/kvm/svm/avic.c
> > @@ -1080,6 +1080,7 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >  		return;
> >  
> >  	entry = READ_ONCE(*(svm->avic_physical_id_cache));
> > +	WARN_ON_ONCE(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
> >  
> >  	entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
> >  	entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);
> 
> Note that this warning was removed because it would trigger wheh x2avic code would switch
> between xapic and x2apic.
> 
> I do agree 100% that this warning is useful.

Ah I see that you fixed the apic reloading to not trigger this code,
need more coffee - in this case this patch makes lot of sense.

I'll review it again when I review rest of the patches in the patch series.

Best regards,
	Maxim Levitsky
> 
> Best regards,
> 	Maxim Levitsky



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 01/19] KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target
  2022-08-31  0:34 ` [PATCH 01/19] KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target Sean Christopherson
@ 2022-08-31  9:18   ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  9:18 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Emulate ICR writes on AVIC IPI failures due to invalid targets using the
> same logic as failures due to invalid types.  AVIC acceleration fails if
> _any_ of the targets are invalid
> , and crucially VM-Exits before sending
> IPIs to targets that _are_ valid.  In logical mode, the destination is a
> bitmap, i.e. a single IPI can target multiple logical IDs.  Doing nothing
> causes KVM to drop IPIs if at least one target is valid and at least one
> target is invalid.


This is a corner case, but it does makes sense to fix it.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky

> 
> Fixes: 18f40c53e10f ("svm: Add VMEXIT handlers for AVIC")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 6919dee69f18..b1ade555e8d0 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -496,14 +496,18 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
>  	trace_kvm_avic_incomplete_ipi(vcpu->vcpu_id, icrh, icrl, id, index);
>  
>  	switch (id) {
> +	case AVIC_IPI_FAILURE_INVALID_TARGET:
>  	case AVIC_IPI_FAILURE_INVALID_INT_TYPE:
>  		/*
>  		 * Emulate IPIs that are not handled by AVIC hardware, which
> -		 * only virtualizes Fixed, Edge-Triggered INTRs.  The exit is
> -		 * a trap, e.g. ICR holds the correct value and RIP has been
> -		 * advanced, KVM is responsible only for emulating the IPI.
> -		 * Sadly, hardware may sometimes leave the BUSY flag set, in
> -		 * which case KVM needs to emulate the ICR write as well in
> +		 * only virtualizes Fixed, Edge-Triggered INTRs, and falls over
> +		 * if _any_ targets are invalid, e.g. if the logical mode mask
> +		 * is a superset of running vCPUs.
> +		 *
> +		 * The exit is a trap, e.g. ICR holds the correct value and RIP
> +		 * has been advanced, KVM is responsible only for emulating the
> +		 * IPI.  Sadly, hardware may sometimes leave the BUSY flag set,
> +		 * in which case KVM needs to emulate the ICR write as well in
>  		 * order to clear the BUSY flag.
>  		 */
>  		if (icrl & APIC_ICR_BUSY)
> @@ -519,8 +523,6 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
>  		 */
>  		avic_kick_target_vcpus(vcpu->kvm, apic, icrl, icrh, index);
>  		break;
> -	case AVIC_IPI_FAILURE_INVALID_TARGET:
> -		break;
>  	case AVIC_IPI_FAILURE_INVALID_BACKING_PAGE:
>  		WARN_ONCE(1, "Invalid backing page\n");
>  		break;



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 02/19] KVM: SVM: Don't put/load AVIC when setting virtual APIC mode
  2022-08-31  0:34 ` [PATCH 02/19] KVM: SVM: Don't put/load AVIC when setting virtual APIC mode Sean Christopherson
@ 2022-08-31  9:24   ` Maxim Levitsky
  2022-08-31 16:33     ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  9:24 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Move the VMCB updates from avic_refresh_apicv_exec_ctrl() into
> avic_set_virtual_apic_mode() and invert the dependency being said
> functions to avoid calling avic_vcpu_{load,put}() and
> avic_set_pi_irte_mode() when "only" setting the virtual APIC mode.
> 
> avic_set_virtual_apic_mode() is invoked from common x86 with preemption
> enabled, which makes avic_vcpu_{load,put}() unhappy.  Luckily, calling
> those and updating IRTE stuff is unnecessary as the only reason
> avic_set_virtual_apic_mode() is called is to handle transitions between
> xAPIC and x2APIC that don't also toggle APICv activation.  And if
> activation doesn't change, there's no need to fiddle with the physical
> APIC ID table or update IRTE.
> 
> The "full" refresh is guaranteed to be called if activation changes in
> this case as the only call to the "set" path is:
> 
> 	kvm_vcpu_update_apicv(vcpu);
> 	static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
> 
> and kvm_vcpu_update_apicv() invokes the refresh if activation changes:
> 
> 	if (apic->apicv_active == activate)
> 		goto out;
> 
> 	apic->apicv_active = activate;
> 	kvm_apic_update_apicv(vcpu);
> 	static_call(kvm_x86_refresh_apicv_exec_ctrl)(vcpu);
> 
>   WARNING: CPU: 183 PID: 49186 at arch/x86/kvm/svm/avic.c:1081 avic_vcpu_put+0xde/0xf0 [kvm_amd]
>   CPU: 183 PID: 49186 Comm: stable Tainted: G           O       6.0.0-smp--fcddbca45f0a-sink #34
>   Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022
>   RIP: 0010:avic_vcpu_put+0xde/0xf0 [kvm_amd]
>    avic_refresh_apicv_exec_ctrl+0x142/0x1c0 [kvm_amd]
>    avic_set_virtual_apic_mode+0x5a/0x70 [kvm_amd]
>    kvm_lapic_set_base+0x149/0x1a0 [kvm]
>    kvm_set_apic_base+0x8f/0xd0 [kvm]
>    kvm_set_msr_common+0xa3a/0xdc0 [kvm]
>    svm_set_msr+0x364/0x6b0 [kvm_amd]
>    __kvm_set_msr+0xb8/0x1c0 [kvm]
>    kvm_emulate_wrmsr+0x58/0x1d0 [kvm]
>    msr_interception+0x1c/0x30 [kvm_amd]
>    svm_invoke_exit_handler+0x31/0x100 [kvm_amd]
>    svm_handle_exit+0xfc/0x160 [kvm_amd]
>    vcpu_enter_guest+0x21bb/0x23e0 [kvm]
>    vcpu_run+0x92/0x450 [kvm]
>    kvm_arch_vcpu_ioctl_run+0x43e/0x6e0 [kvm]
>    kvm_vcpu_ioctl+0x559/0x620 [kvm]
> 
> Fixes: 05c4fe8c1bd9 ("KVM: SVM: Refresh AVIC configuration when changing APIC mode")
> Cc: stable@vger.kernel.org
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 31 +++++++++++++++----------------
>  1 file changed, 15 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index b1ade555e8d0..f3a74c8284cb 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -741,18 +741,6 @@ void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu)
>  	avic_handle_ldr_update(vcpu);
>  }
>  
> -void avic_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
> -{
> -	if (!lapic_in_kernel(vcpu) || avic_mode == AVIC_MODE_NONE)
> -		return;
> -
> -	if (kvm_get_apic_mode(vcpu) == LAPIC_MODE_INVALID) {
> -		WARN_ONCE(true, "Invalid local APIC state (vcpu_id=%d)", vcpu->vcpu_id);
> -		return;
> -	}
> -	avic_refresh_apicv_exec_ctrl(vcpu);
> -}
> -
>  static int avic_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
>  {
>  	int ret = 0;
> @@ -1094,17 +1082,18 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
>  	WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
>  }
>  
> -
> -void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> +void avic_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  	struct vmcb *vmcb = svm->vmcb01.ptr;
> -	bool activated = kvm_vcpu_apicv_active(vcpu);
> +
> +	if (!lapic_in_kernel(vcpu) || avic_mode == AVIC_MODE_NONE)
> +		return;
>  
>  	if (!enable_apicv)
>  		return;
>  
> -	if (activated) {
> +	if (kvm_vcpu_apicv_active(vcpu)) {
>  		/**
>  		 * During AVIC temporary deactivation, guest could update
>  		 * APIC ID, DFR and LDR registers, which would not be trapped
> @@ -1118,6 +1107,16 @@ void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
>  		avic_deactivate_vmcb(svm);
>  	}
>  	vmcb_mark_dirty(vmcb, VMCB_AVIC);
> +}
> +
> +void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> +{
> +	bool activated = kvm_vcpu_apicv_active(vcpu);
> +
> +	if (!enable_apicv)
> +		return;
> +
> +	avic_set_virtual_apic_mode(vcpu);

This call is misleading - this will usually be called
when avic mode didn't change - I think we need a better name for
avic_set_virtual_apic_mode.

Other than that this makes sense.

Best regards,
	Maxim Levitsky

>  
>  	if (activated)
>  		avic_vcpu_load(vcpu, vcpu->cpu);



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 04/19] KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean
  2022-08-31  0:34 ` [PATCH 04/19] KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean Sean Christopherson
@ 2022-08-31  9:36   ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  9:36 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Replace the "avic_mode" enum with a single bool to track whether or not
> x2AVIC is enabled.  KVM already has "apicv_enabled" that tracks if any
> flavor of AVIC is enabled, i.e. AVIC_MODE_NONE and AVIC_MODE_X1 are
> redundant and unnecessary noise.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 45 +++++++++++++++++++----------------------
>  arch/x86/kvm/svm/svm.c  |  2 +-
>  arch/x86/kvm/svm/svm.h  |  9 +--------
>  3 files changed, 23 insertions(+), 33 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 1d516d658f9a..b59f8ee2671f 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -53,7 +53,7 @@ static DEFINE_HASHTABLE(svm_vm_data_hash, SVM_VM_DATA_HASH_BITS);
>  static u32 next_vm_id = 0;
>  static bool next_vm_id_wrapped = 0;
>  static DEFINE_SPINLOCK(svm_vm_data_hash_lock);
> -enum avic_modes avic_mode;
> +bool x2avic_enabled;
>  
>  /*
>   * This is a wrapper of struct amd_iommu_ir_data.
> @@ -231,8 +231,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
>  	u64 *avic_physical_id_table;
>  	struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);
>  
> -	if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
> -	    (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
> +	if ((!x2avic_enabled && index > AVIC_MAX_PHYSICAL_ID) ||
> +	    (index > X2AVIC_MAX_PHYSICAL_ID))
>  		return NULL;
>  
>  	avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
> @@ -279,8 +279,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
>  	int id = vcpu->vcpu_id;
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  
> -	if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
> -	    (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
> +	if ((!x2avic_enabled && id > AVIC_MAX_PHYSICAL_ID) ||
> +	    (id > X2AVIC_MAX_PHYSICAL_ID))
>  		return -EINVAL;
>  
>  	if (!vcpu->arch.apic->regs)
> @@ -532,7 +532,7 @@ unsigned long avic_vcpu_get_apicv_inhibit_reasons(struct kvm_vcpu *vcpu)
>  	 * AVIC must be disabled if x2AVIC isn't supported and the guest has
>  	 * x2APIC enabled.
>  	 */
> -	if (avic_mode != AVIC_MODE_X2 && apic_x2apic_mode(vcpu->arch.apic))
> +	if (!x2avic_enabled && apic_x2apic_mode(vcpu->arch.apic))
>  		return APICV_INHIBIT_REASON_X2APIC;
>  
>  	return 0;
> @@ -1086,10 +1086,7 @@ void avic_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  	struct vmcb *vmcb = svm->vmcb01.ptr;
>  
> -	if (!lapic_in_kernel(vcpu) || avic_mode == AVIC_MODE_NONE)
> -		return;
> -
> -	if (!enable_apicv)
> +	if (!lapic_in_kernel(vcpu) || !enable_apicv)
>  		return;
>  
>  	if (kvm_vcpu_apicv_active(vcpu)) {
> @@ -1165,32 +1162,32 @@ bool avic_hardware_setup(struct kvm_x86_ops *x86_ops)
>  	if (!npt_enabled)
>  		return false;
>  
> +	/* AVIC is a prerequisite for x2AVIC. */
> +	if (!boot_cpu_has(X86_FEATURE_AVIC) && !force_avic) {
> +		if (boot_cpu_has(X86_FEATURE_X2AVIC)) {
> +			pr_warn(FW_BUG "Cannot support x2AVIC due to AVIC is disabled");
> +			pr_warn(FW_BUG "Try enable AVIC using force_avic option");
> +		}
> +		return false;
> +	}
> +
>  	if (boot_cpu_has(X86_FEATURE_AVIC)) {
> -		avic_mode = AVIC_MODE_X1;
>  		pr_info("AVIC enabled\n");
>  	} else if (force_avic) {
>  		/*
>  		 * Some older systems does not advertise AVIC support.
>  		 * See Revision Guide for specific AMD processor for more detail.
>  		 */
> -		avic_mode = AVIC_MODE_X1;
>  		pr_warn("AVIC is not supported in CPUID but force enabled");
>  		pr_warn("Your system might crash and burn");
>  	}
>  
>  	/* AVIC is a prerequisite for x2AVIC. */
> -	if (boot_cpu_has(X86_FEATURE_X2AVIC)) {
> -		if (avic_mode == AVIC_MODE_X1) {
> -			avic_mode = AVIC_MODE_X2;
> -			pr_info("x2AVIC enabled\n");
> -		} else {
> -			pr_warn(FW_BUG "Cannot support x2AVIC due to AVIC is disabled");
> -			pr_warn(FW_BUG "Try enable AVIC using force_avic option");
> -		}
> -	}
> +	x2avic_enabled = boot_cpu_has(X86_FEATURE_X2AVIC);
> +	if (x2avic_enabled)
> +		pr_info("x2AVIC enabled\n");
>  
> -	if (avic_mode != AVIC_MODE_NONE)
> -		amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
> +	amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
>  
> -	return !!avic_mode;
> +	return true;
>  }
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index f3813dbacb9f..b25c89069128 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -821,7 +821,7 @@ void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool intercept)
>  	if (intercept == svm->x2avic_msrs_intercepted)
>  		return;
>  
> -	if (avic_mode != AVIC_MODE_X2 ||
> +	if (!x2avic_enabled ||
>  	    !apic_x2apic_mode(svm->vcpu.arch.apic))
>  		return;
>  
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 6a7686bf6900..cee79ade400a 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -35,14 +35,7 @@ extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
>  extern bool npt_enabled;
>  extern int vgif;
>  extern bool intercept_smi;
> -
> -enum avic_modes {
> -	AVIC_MODE_NONE = 0,
> -	AVIC_MODE_X1,
> -	AVIC_MODE_X2,
> -};
> -
> -extern enum avic_modes avic_mode;
> +extern bool x2avic_enabled;
>  
>  /*
>   * Clean bits in VMCB.

Overall looks like an improvement, I would probably do it this way if I were to
implement this code, but the original code is alright as well.


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>


Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick
  2022-08-31  0:34 ` [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick Sean Christopherson
@ 2022-08-31  9:38   ` Maxim Levitsky
  2022-09-01  2:56   ` Li,Rongqing
  1 sibling, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  9:38 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Compute the destination from ICRH using the sender's x2APIC status, not
> each (potential) target's x2APIC status.
> 
> Fixes: c514d3a348ac ("KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID")
> Cc: Li RongQing <lirongqing@baidu.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index b59f8ee2671f..3ace0f2f52f0 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -441,6 +441,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
>  				   u32 icrl, u32 icrh, u32 index)
>  {
> +	u32 dest = apic_x2apic_mode(source) ? icrh : GET_XAPIC_DEST_FIELD(icrh);
>  	unsigned long i;
>  	struct kvm_vcpu *vcpu;
>  
> @@ -456,13 +457,6 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
>  	 * since entered the guest will have processed pending IRQs at VMRUN.
>  	 */
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
> -		u32 dest;
> -
> -		if (apic_x2apic_mode(vcpu->arch.apic))
> -			dest = icrh;
> -		else
> -			dest = GET_XAPIC_DEST_FIELD(icrh);
> -
>  		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
>  					dest, icrl & APIC_DEST_MASK)) {
>  			vcpu->arch.apic->irr_pending = true;

I didn't notice this in a review, makes sense.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16]
  2022-08-31  6:09   ` Maxim Levitsky
@ 2022-08-31  9:43     ` Maxim Levitsky
  2022-08-31 16:35       ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31  9:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 09:09 +0300, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > When attempting a fast kick for x2AVIC, get the destination bitmap from
> > ICR[15:0], not ICHR[31:16].  The upper 16 bits contain the cluster, the
> > lower 16 bits hold the bitmap.
> > 
> > Fixes: 603ccef42ce9 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast")
> > Cc: Maxim Levitsky <mlevitsk@redhat.com>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/svm/avic.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > index 3ace0f2f52f0..3c333cd2e752 100644
> > --- a/arch/x86/kvm/svm/avic.c
> > +++ b/arch/x86/kvm/svm/avic.c
> > @@ -368,7 +368,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
> >  
> >  		if (apic_x2apic_mode(source)) {
> >  			/* 16 bit dest mask, 16 bit cluster id */
> > -			bitmap = dest & 0xFFFF0000;
> > +			bitmap = dest & 0xFFFF;
> >  			cluster = (dest >> 16) << 4;
> >  		} else if (kvm_lapic_get_reg(source, APIC_DFR) == APIC_DFR_FLAT) {
> >  			/* 8 bit dest mask*/
> 
> I swear I have seen a patch from Suravee Suthikulpanit fixing this my mistake, I don't know why it was not
> accepted upstream.

This is the patch, which I guess got forgotten.

https://www.spinics.net/lists/kernel/msg4417427.html

Since it is literaly the same patch, you can just add credit to Suravee Suthikulpanit.

So with the credit added:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 08/19] KVM: SVM: Remove redundant cluster calculation that also creates a shadow
  2022-08-31  0:34 ` [PATCH 08/19] KVM: SVM: Remove redundant cluster calculation that also creates a shadow Sean Christopherson
@ 2022-08-31 10:19   ` Maxim Levitsky
  2022-09-01 20:02     ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 10:19 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Drop the redundant "cluster" calculation and its horrific shadowing in
> the x2avic logical mode path.  The "cluster" that is declared at an outer
> scope is derived using the exact same calculation and has performed the
> left-shift.

Actually I think we should just revert the commit 
'KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible'


I know that the patch that intially introduced the the avic_kick_target_vcpus_fast had
x2apic/x2avic code, and then it was split to avoid adding x2avic code before it was merged,
resulting in this patch to add the x2apic specific code.

But when I fixed most issues of avic_kick_target_vcpus_fast in my 
'KVM: x86: SVM: fix avic_kick_target_vcpus_fast', I added back x2apic code because
it was just natural to do since I had to calculate cluster/bitmap masks anyway.

I expected this patch to be dropped because of this but I guess it was not noticed,
or patches were merged in the wrong order.

This is the reason of shadowing, duplicate calculations/etc.

Patch 9 and 7 of your series can be dropped as well then.

Best regards,
	Maxim Levitsky


> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 14f567550a1e..8c9cad96008e 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -410,10 +410,9 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  			 * For x2APIC logical mode, cannot leverage the index.
>  			 * Instead, calculate physical ID from logical ID in ICRH.
>  			 */
> -			int cluster = (icrh & 0xffff0000) >> 16;
>  			int apic = ffs(bitmap) - 1;
>  
> -			l1_physical_id = (cluster << 4) + apic;
> +			l1_physical_id = cluster + apic;
>  		}
>  	}
>  



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch
  2022-08-31  0:34 ` [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch Sean Christopherson
@ 2022-08-31 10:22   ` Maxim Levitsky
  2022-08-31 16:16     ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 10:22 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Document that AVIC is inhibited if any vCPU's APIC ID diverges from its
> vCPU ID, i.e. that there's no need to check for a destination match in
> the AVIC kick fast path.
> 
> Opportunistically tweak comments to remove "guest bug", as that suggests
> KVM is punting on error handling, which is not the case.  Targeting a
> non-existent vCPU or no vCPUs _may_ be a guest software bug, but whether
> or not it's a guest bug is irrelevant.  Such behavior is architecturally
> legal and thus needs to faithfully emulated by KVM (and it is).

I don't want to pick a fight, but personally these things *are* guest bugs / improper usage of APIC,
and I don't think it is wrong to document them as such.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>


> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 05a1cde8175c..3959d4766911 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -380,8 +380,8 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  			cluster = (dest >> 4) << 2;
>  		}
>  
> +		/* Nothing to do if there are no destinations in the cluster. */
>  		if (unlikely(!bitmap))
> -			/* guest bug: nobody to send the logical interrupt to */
>  			return 0;
>  
>  		if (!is_power_of_2(bitmap))
> @@ -399,7 +399,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  			if (WARN_ON_ONCE(index != logid_index))
>  				return -EINVAL;
>  
> -			/* guest bug: non existing/reserved logical destination */
> +			/* Nothing to do if the logical destination is invalid. */
>  			if (unlikely(!(logid_entry & AVIC_LOGICAL_ID_ENTRY_VALID_MASK)))
>  				return 0;
>  
> @@ -418,9 +418,13 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  		}
>  	}
>  
> +	/*
> +	 * KVM inhibits AVIC if any vCPU ID diverges from the vCPUs APIC ID,
> +	 * i.e. APIC ID == vCPU ID.  Once again, nothing to do if the target
> +	 * vCPU doesn't exist.
> +	 */
>  	target_vcpu = kvm_get_vcpu_by_id(kvm, l1_physical_id);
>  	if (unlikely(!target_vcpu))
> -		/* guest bug: non existing vCPU is a target of this IPI*/
>  		return 0;
>  
>  	target_vcpu->arch.apic->irr_pending = true;



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 11/19] KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU
  2022-08-31  0:34 ` [PATCH 11/19] KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU Sean Christopherson
@ 2022-08-31 10:25   ` Maxim Levitsky
  2022-08-31 15:08     ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 10:25 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Add a helper to perform the final kick, two instances of the ICR decoding
> is one too many.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 3959d4766911..2095ece70712 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -329,6 +329,16 @@ void avic_ring_doorbell(struct kvm_vcpu *vcpu)
>  	put_cpu();
>  }
>  
> +
> +static void avic_kick_vcpu(struct kvm_vcpu *vcpu, u32 icrl)
> +{
> +	vcpu->arch.apic->irr_pending = true;
> +	svm_complete_interrupt_delivery(vcpu,
> +					icrl & APIC_MODE_MASK,
> +					icrl & APIC_INT_LEVELTRIG,
> +					icrl & APIC_VECTOR_MASK);
> +}
> +
>  /*
>   * A fast-path version of avic_kick_target_vcpus(), which attempts to match
>   * destination APIC ID to vCPU without looping through all vCPUs.
> @@ -427,11 +437,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  	if (unlikely(!target_vcpu))
>  		return 0;
>  
> -	target_vcpu->arch.apic->irr_pending = true;
> -	svm_complete_interrupt_delivery(target_vcpu,
> -					icrl & APIC_MODE_MASK,
> -					icrl & APIC_INT_LEVELTRIG,
> -					icrl & APIC_VECTOR_MASK);
> +	avic_kick_vcpu(target_vcpu, icrl);
>  	return 0;
>  }
>  
> @@ -455,13 +461,8 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
>  	 */
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
> -					dest, icrl & APIC_DEST_MASK)) {
> -			vcpu->arch.apic->irr_pending = true;
> -			svm_complete_interrupt_delivery(vcpu,
> -							icrl & APIC_MODE_MASK,
> -							icrl & APIC_INT_LEVELTRIG,
> -							icrl & APIC_VECTOR_MASK);
> -		}
> +					dest, icrl & APIC_DEST_MASK))
> +			avic_kick_vcpu(vcpu, icrl);
>  	}
>  }
>  

I don't know what I think about this, sometimes *minor* code duplication might actually
be a good thing, as it is easier to read the code, but I don't have much against this
as well.

I am not sure if before or after this code is more readable.

But anyway,

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 18/19] KVM: SVM: Ignore writes to Remote Read Data on AVIC write traps
  2022-08-31  0:35 ` [PATCH 18/19] KVM: SVM: Ignore writes to Remote Read Data on AVIC write traps Sean Christopherson
@ 2022-08-31 10:40   ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 10:40 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Drop writes to APIC_RRR, a.k.a. Remote Read Data Register, on AVIC
> unaccelerated write traps.  The register is read-only and isn't emulated
> by KVM.  Sending the register through kvm_apic_write_nodecode() will
> result in screaming when x2APIC is enabled due to the unexpected failure
> to retrieve the MSR (KVM expects that only "legal" accesses will trap).

I wonder about ESR register as well (280H), KVM doesn't seem to support it either,
but allows 0 writes. Anyway:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky

> 
> Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index dad5affe44c1..b2033a56010c 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -675,6 +675,9 @@ static int avic_unaccel_trap_write(struct kvm_vcpu *vcpu)
>  	case APIC_DFR:
>  		avic_handle_dfr_update(vcpu);
>  		break;
> +	case APIC_RRR:
> +		/* Ignore writes to Read Remote Data, it's read-only. */
> +		return 1;
>  	default:
>  		break;
>  	}



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 12/19] KVM: x86: Disable APIC logical map if logical ID covers multiple MDAs
  2022-08-31  0:34 ` [PATCH 12/19] KVM: x86: Disable APIC logical map if logical ID covers multiple MDAs Sean Christopherson
@ 2022-08-31 13:23   ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 13:23 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> Disable the optimized APIC logical map if a logical ID covers multiple
> MDAs, i.e. if a vCPU has multiple bits set in its ID.  In logical mode,
> events match if "ID & MDA != 0", i.e. creating an entry for only the
> first bit can cause interrupts to be missed.
> 
> Note, creating an entry for every bit is also wrong as KVM would generate
> IPIs for every matching bit.  It would be possible to teach KVM to play
> nice with this edge case, but it is very much an edge case and probably
> not used in any real world OS, i.e. it's not worth optimizing.
> 
> Use an impossible value for the "mode" to effectively designate that it's
> disabled.  Don't bother adding a dedicated "invalid" value, the mode
> handling will be cleaned up in the future and it would take just as much
> effort to explain what value is "safe" at this time.
> 
> Fixes: 1e08ec4a130e ("KVM: optimize apic interrupt delivery")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/lapic.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 9dda989a1cf0..82278acae95b 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -300,8 +300,15 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  		if (!kvm_apic_map_get_logical_dest(new, ldr, &cluster, &mask))
>  			continue;
>  
> -		if (mask)
> -			cluster[ffs(mask) - 1] = apic;
> +		if (!mask)
> +			continue;
> +
> +		if (!is_power_of_2(mask)) {
> +			new->mode = KVM_APIC_MODE_XAPIC_FLAT |
> +				    KVM_APIC_MODE_XAPIC_CLUSTER;
> +			continue;
> +		}
> +		cluster[ffs(mask) - 1] = apic;
>  	}
>  out:
>  	old = rcu_dereference_protected(kvm->arch.apic_map,


I was about to complain about the abuse of the invalid mode,
but I see that this is fixed in later patch as it is said in the commit
description, so no complains.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 13/19] KVM: x86: Disable APIC logical map if vCPUs are aliased in logical mode
  2022-08-31  0:35 ` [PATCH 13/19] KVM: x86: Disable APIC logical map if vCPUs are aliased in logical mode Sean Christopherson
@ 2022-08-31 13:24   ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 13:24 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Disable the optimized APIC logical map if multiple vCPUs are aliased to
> the same logical ID.  Architecturally, all CPUs whose logical ID matches
> the MDA are supposed to receive the interrupt; overwriting existing map
> entries can result in missed IPIs.
> 
> Fixes: 1e08ec4a130e ("KVM: optimize apic interrupt delivery")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/lapic.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 82278acae95b..d537b51295d6 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -303,12 +303,13 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  		if (!mask)
>  			continue;
>  
> -		if (!is_power_of_2(mask)) {
> +		ldr = ffs(mask) - 1;
> +		if (!is_power_of_2(mask) || cluster[ldr]) {
>  			new->mode = KVM_APIC_MODE_XAPIC_FLAT |
>  				    KVM_APIC_MODE_XAPIC_CLUSTER;
>  			continue;
>  		}
> -		cluster[ffs(mask) - 1] = apic;
> +		cluster[ldr] = apic;
>  	}
>  out:
>  	old = rcu_dereference_protected(kvm->arch.apic_map,

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 14/19] KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs
  2022-08-31  0:35 ` [PATCH 14/19] KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs Sean Christopherson
@ 2022-08-31 13:37   ` Maxim Levitsky
  2022-08-31 16:41     ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 13:37 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Apply KVM's hotplug hack if and only if userspace has enabled 32-bit IDs
> for x2APIC.  If 32-bit IDs are not enabled, disable the optimized map to
> honor x86 architectural behavior if multiple vCPUs shared a physical APIC
> ID.  As called out in the changelog that added the hack, all CPUs whose
> (possibly truncated) APIC ID matches the target are supposed to receive
> the IPI.
> 
>   KVM intentionally differs from real hardware, because real hardware
>   (Knights Landing) does just "x2apic_id & 0xff" to decide whether to
>   accept the interrupt in xAPIC mode and it can deliver one interrupt to
>   more than one physical destination, e.g. 0x123 to 0x123 and 0x23.
> 
> Applying the hack even when x2APIC is not fully enabled means KVM doesn't
> correctly handle scenarios where the guest has aliased xAPIC IDs across
> multiple vCPUs, as only the vCPU with the lowest vCPU ID will receive any
> interrupts.  It's extremely unlikely any real world guest aliase APIC IDs,
> or even modifies APIC IDs, but KVM's behavior is arbitrary, e.g. the
> lowest vCPU ID "wins" regardless of which vCPU is "aliasing" and which
> vCPU is "normal".
> 
> Furthermore, the hack is _not_ guaranteed to work!  The hack works if and
> only if the optimized APIC map is successfully allocated.  If the map
> allocation fails (unlikely), KVM will fall back to its unoptimized
> behavior, which _does_ honor the architectural behavior.
> 
> Pivot on 32-bit x2APIC IDs being enabled as that is required to take
> advantage of the hotplug hack (see kvm_apic_state_fixup()), i.e. won't
> break existing setups unless they are way, way off in the weeds.
> 
> And an entry in KVM's errata to document the hack.  Alternatively, KVM
> could provide an actual x2APIC quirk and document the hack that way, but
> there's unlikely to ever be a use case for disabling the quirk.  Go the
> errata route to avoid having to validate a quirk no one cares about.
> 
> Fixes: 5bd5db385b3e ("KVM: x86: allow hotplug of VCPU with APIC ID over 0xff")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  Documentation/virt/kvm/x86/errata.rst | 11 ++++++
>  arch/x86/kvm/lapic.c                  | 50 ++++++++++++++++++++++-----
>  2 files changed, 52 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/x86/errata.rst b/Documentation/virt/kvm/x86/errata.rst
> index 410e0aa63493..49a05f24747b 100644
> --- a/Documentation/virt/kvm/x86/errata.rst
> +++ b/Documentation/virt/kvm/x86/errata.rst
> @@ -37,3 +37,14 @@ Nested virtualization features
>  ------------------------------
>  
>  TBD
> +
> +x2APIC
> +------
> +When KVM_X2APIC_API_USE_32BIT_IDS is enabled, KVM activates a hack/quirk that
> +allows sending events to a single vCPU using its x2APIC ID even if the target
> +vCPU has legacy xAPIC enabled, e.g. to bring up hotplugged vCPUs via INIT-SIPI
> +on VMs with > 255 vCPUs.  A side effect of the quirk is that, if multiple vCPUs
> +have the same physical APIC ID, KVM will deliver events targeting that APIC ID
> +only to the vCPU with the lowest vCPU ID.  If KVM_X2APIC_API_USE_32BIT_IDS is
> +not enabled, KVM follows x86 architecture when processing interrupts (all vCPUs
> +matching the target APIC ID receive the interrupt).
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index d537b51295d6..c224b5c7cd92 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -260,10 +260,10 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		struct kvm_lapic *apic = vcpu->arch.apic;
>  		struct kvm_lapic **cluster;
> +		u32 x2apic_id, physical_id;
>  		u16 mask;
>  		u32 ldr;
>  		u8 xapic_id;
> -		u32 x2apic_id;
>  
>  		if (!kvm_apic_present(vcpu))
>  			continue;
> @@ -271,16 +271,48 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  		xapic_id = kvm_xapic_id(apic);
>  		x2apic_id = kvm_x2apic_id(apic);
>  
> -		/* Hotplug hack: see kvm_apic_match_physical_addr(), ... */
> -		if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
> -				x2apic_id <= new->max_apic_id)
> -			new->phys_map[x2apic_id] = apic;
>  		/*
> -		 * ... xAPIC ID of VCPUs with APIC ID > 0xff will wrap-around,
> -		 * prevent them from masking VCPUs with APIC ID <= 0xff.
> +		 * Apply KVM's hotplug hack if userspace has enable 32-bit APIC
> +		 * IDs.  Allow sending events to vCPUs by their x2APIC ID even
> +		 * if the target vCPU is in legacy xAPIC mode, and silently
> +		 * ignore aliased xAPIC IDs (the x2APIC ID is truncated to 8
> +		 * bits, causing IDs > 0xff to wrap and collide).
> +		 *
> +		 * Honor the architectural (and KVM's non-optimized) behavior
> +		 * if userspace has not enabled 32-bit x2APIC IDs.  Each APIC
> +		 * is supposed to process messages independently.  If multiple
> +		 * vCPUs have the same effective APIC ID, e.g. due to the
> +		 * x2APIC wrap or because the guest manually modified its xAPIC
> +		 * IDs, events targeting that ID are supposed to be recognized
> +		 * by all vCPUs with said ID.
>  		 */
> -		if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
> -			new->phys_map[xapic_id] = apic;
> +		if (kvm->arch.x2apic_format) {
> +			/* See also kvm_apic_match_physical_addr(). */
> +			if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
> +			    x2apic_id <= new->max_apic_id)
> +				new->phys_map[x2apic_id] = apic;
> +
> +			if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
> +				new->phys_map[xapic_id] = apic;
> +		} else {
> +			/*
> +			 * Disable the optimized map if the physical APIC ID is
> +			 * already mapped, i.e. is aliased to multiple vCPUs.
> +			 * The optimized map requires a strict 1:1 mapping
> +			 * between IDs and vCPUs.
> +			 */
> +			if (apic_x2apic_mode(apic))
> +				physical_id = x2apic_id;
> +			else
> +				physical_id = xapic_id;
> +
> +			if (new->phys_map[physical_id]) {
> +				kvfree(new);
> +				new = NULL;
> +				goto out;
Why not to use the same  KVM_APIC_MODE_XAPIC_FLAT |  KVM_APIC_MODE_XAPIC_CLUSTER
hack here?

Other than that looks correct but I don't know this area well enough to be 100% sure.

Best regards,
	Maxim Levitsky


> +			}
> +			new->phys_map[physical_id] = apic;
> +		}
>  
>  		if (!kvm_apic_sw_enabled(apic))
>  			continue;



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 15/19] KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0
  2022-08-31  0:35 ` [PATCH 15/19] KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0 Sean Christopherson
@ 2022-08-31 13:41   ` Maxim Levitsky
  2022-08-31 16:47     ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 13:41 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Explicitly skip the optimized map setup if the vCPU's LDR is '0', i.e. if
> the vCPU will never response to logical mode interrupts.  KVM already
> skips setup in this case, but relies on kvm_apic_map_get_logical_dest()
> to generate mask==0.  KVM still needs the mask=0 check as a non-zero LDR
> can yield mask==0 depending on the mode, but explicitly handling the LDR
> will make it simpler to clean up the logical mode tracking in the future.

If I am not mistaken, the commit description is a bit misleading - in this case we just don't add
the vCPU to the map since it is unreachable, but we do continue creating the map.

Best regards,
	MaxiM Levitsky

> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/lapic.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index c224b5c7cd92..8209caffe3ab 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -318,10 +318,12 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  			continue;
>  
>  		ldr = kvm_lapic_get_reg(apic, APIC_LDR);
> +		if (!ldr)
> +			continue;
>  
>  		if (apic_x2apic_mode(apic)) {
>  			new->mode |= KVM_APIC_MODE_X2APIC;
> -		} else if (ldr) {
> +		} else {
>  			ldr = GET_APIC_LOGICAL_ID(ldr);
>  			if (kvm_lapic_get_reg(apic, APIC_DFR) == APIC_DFR_FLAT)
>  				new->mode |= KVM_APIC_MODE_XAPIC_FLAT;



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes
  2022-08-31  0:35 ` [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes Sean Christopherson
@ 2022-08-31 13:43   ` Maxim Levitsky
  2022-08-31 16:56     ` Sean Christopherson
  2022-08-31 18:42   ` Maxim Levitsky
  1 sibling, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 13:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Track all possibilities for the optimized APIC map's logical modes
> instead of overloading the pseudo-bitmap and treating any "unknown" value
> as "invalid".
> 
> As documented by the now-stale comment above the mode values, the values
> did have meaning when the optimized map was originally added.  That
> dependent logical was removed by commit e45115b62f9a ("KVM: x86: use
> physical LAPIC array for logical x2APIC"), but the obfuscated behavior
> and its comment were left behind.
> 
> Opportunistically rename "mode" to "logical_mode", partly to make it
> clear that the "disabled" case applies only to the logical map, but also
> to prove that there is no lurking code that expects "mode" to be a bitmap.
> 
> Functionally, this is a glorified nop.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 21 ++++++++++--------
>  arch/x86/kvm/lapic.c            | 38 ++++++++++++++++++++++++---------
>  2 files changed, 40 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 1f51411f3112..0184e64ab555 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -955,19 +955,22 @@ struct kvm_arch_memory_slot {
>  };
>  
>  /*
> - * We use as the mode the number of bits allocated in the LDR for the
> - * logical processor ID.  It happens that these are all powers of two.
> - * This makes it is very easy to detect cases where the APICs are
> - * configured for multiple modes; in that case, we cannot use the map and
> - * hence cannot use kvm_irq_delivery_to_apic_fast either.
> + * Track the mode of the optimized logical map, as the rules for decoding the
> + * destination vary per mode.  Enabling the optimized logical map requires all
> + * software-enabled local APIs to be in the same mode, each addressable APIC to
> + * be mapped to only one MDA, and each MDA to map to at most one APIC.
>   */
> -#define KVM_APIC_MODE_XAPIC_CLUSTER          4
> -#define KVM_APIC_MODE_XAPIC_FLAT             8
> -#define KVM_APIC_MODE_X2APIC                16
> +enum kvm_apic_logical_mode {
> +	KVM_APIC_MODE_SW_DISABLED,
> +	KVM_APIC_MODE_XAPIC_CLUSTER,
> +	KVM_APIC_MODE_XAPIC_FLAT,
> +	KVM_APIC_MODE_X2APIC,
> +	KVM_APIC_MODE_MAP_DISABLED,
> +};
>  
>  struct kvm_apic_map {
>  	struct rcu_head rcu;
> -	u8 mode;
> +	enum kvm_apic_logical_mode logical_mode;
>  	u32 max_apic_id;
>  	union {
>  		struct kvm_lapic *xapic_flat_map[8];
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 8209caffe3ab..3b6ef36b3963 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -168,7 +168,12 @@ static bool kvm_use_posted_timer_interrupt(struct kvm_vcpu *vcpu)
>  
>  static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
>  		u32 dest_id, struct kvm_lapic ***cluster, u16 *mask) {
> -	switch (map->mode) {
> +	switch (map->logical_mode) {
> +	case KVM_APIC_MODE_SW_DISABLED:
> +		/* Arbitrarily use the flat map so that @cluster isn't NULL. */
> +		*cluster = map->xapic_flat_map;
> +		*mask = 0;
> +		return true;
Could you explain why this is needed? I probably missed something.

>  	case KVM_APIC_MODE_X2APIC: {
>  		u32 offset = (dest_id >> 16) * 16;
>  		u32 max_apic_id = map->max_apic_id;
> @@ -193,8 +198,10 @@ static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
>  		*cluster = map->xapic_cluster_map[(dest_id >> 4) & 0xf];
>  		*mask = dest_id & 0xf;
>  		return true;
> +	case KVM_APIC_MODE_MAP_DISABLED:
> +		return false;
>  	default:
> -		/* Not optimized. */
> +		WARN_ON_ONCE(1);
>  		return false;
>  	}
>  }
> @@ -256,10 +263,12 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  		goto out;
>  
>  	new->max_apic_id = max_id;
> +	new->logical_mode = KVM_APIC_MODE_SW_DISABLED;
>  
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		struct kvm_lapic *apic = vcpu->arch.apic;
>  		struct kvm_lapic **cluster;
> +		enum kvm_apic_logical_mode logical_mode;
>  		u32 x2apic_id, physical_id;
>  		u16 mask;
>  		u32 ldr;
> @@ -314,7 +323,8 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  			new->phys_map[physical_id] = apic;
>  		}
>  
> -		if (!kvm_apic_sw_enabled(apic))
> +		if (new->logical_mode == KVM_APIC_MODE_MAP_DISABLED ||
> +		    !kvm_apic_sw_enabled(apic))
>  			continue;
>  
>  		ldr = kvm_lapic_get_reg(apic, APIC_LDR);
> @@ -322,25 +332,33 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  			continue;
>  
>  		if (apic_x2apic_mode(apic)) {
> -			new->mode |= KVM_APIC_MODE_X2APIC;
> +			logical_mode = KVM_APIC_MODE_X2APIC;
>  		} else {
>  			ldr = GET_APIC_LOGICAL_ID(ldr);
>  			if (kvm_lapic_get_reg(apic, APIC_DFR) == APIC_DFR_FLAT)
> -				new->mode |= KVM_APIC_MODE_XAPIC_FLAT;
> +				logical_mode = KVM_APIC_MODE_XAPIC_FLAT;
>  			else
> -				new->mode |= KVM_APIC_MODE_XAPIC_CLUSTER;
> +				logical_mode = KVM_APIC_MODE_XAPIC_CLUSTER;
>  		}
> +		if (new->logical_mode != KVM_APIC_MODE_SW_DISABLED &&
> +		    new->logical_mode != logical_mode) {
> +			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
> +			continue;
> +		}
> +		new->logical_mode = logical_mode;
>  
> -		if (!kvm_apic_map_get_logical_dest(new, ldr, &cluster, &mask))
> +		if (WARN_ON_ONCE(!kvm_apic_map_get_logical_dest(new, ldr,
> +								&cluster, &mask))) {
> +			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
>  			continue;
> +		}
>  
>  		if (!mask)
>  			continue;
>  
>  		ldr = ffs(mask) - 1;
>  		if (!is_power_of_2(mask) || cluster[ldr]) {
> -			new->mode = KVM_APIC_MODE_XAPIC_FLAT |
> -				    KVM_APIC_MODE_XAPIC_CLUSTER;
> +			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
>  			continue;
>  		}
>  		cluster[ldr] = apic;
> @@ -993,7 +1011,7 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
>  {
>  	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
>  		if ((irq->dest_id == APIC_BROADCAST &&
> -				map->mode != KVM_APIC_MODE_X2APIC))
> +		     map->logical_mode != KVM_APIC_MODE_X2APIC))
>  			return true;
>  		if (irq->dest_id == X2APIC_BROADCAST)
>  			return true;

To be honest I would put that patch first, and then do all the other patches, this
way you would avoid all of the hacks they do and removed here.


Other than that this looks like a very good cleanup.

Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 17/19] KVM: SVM: Handle multiple logical targets in AVIC kick fastpath
  2022-08-31  0:35 ` [PATCH 17/19] KVM: SVM: Handle multiple logical targets in AVIC kick fastpath Sean Christopherson
@ 2022-08-31 13:57   ` Maxim Levitsky
  2022-08-31 18:19     ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 13:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Iterate over all target logical IDs in the AVIC kick fastpath instead of
> bailing if there is more than one target and KVM's optimized APIC map is
> enabled for logical mode.  If the optimized map is enabled, all vCPUs are
> guaranteed to be mapped 1:1 to a logical ID or effectively have logical
> mode disabled, i.e. iterating over the bitmap is guaranteed to kick each
> target exactly once.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 126 +++++++++++++++++++++++++---------------
>  1 file changed, 79 insertions(+), 47 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 2095ece70712..dad5affe44c1 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -339,6 +339,62 @@ static void avic_kick_vcpu(struct kvm_vcpu *vcpu, u32 icrl)
>  					icrl & APIC_VECTOR_MASK);
>  }
>  
> +static void avic_kick_vcpu_by_physical_id(struct kvm *kvm, u32 physical_id,
> +					  u32 icrl)
> +{
> +	/*
> +	 * KVM inhibits AVIC if any vCPU ID diverges from the vCPUs APIC ID,
> +	 * i.e. APIC ID == vCPU ID.
> +	 */
> +	struct kvm_vcpu *target_vcpu = kvm_get_vcpu_by_id(kvm, physical_id);
> +
> +	/* Once again, nothing to do if the target vCPU doesn't exist. */
> +	if (unlikely(!target_vcpu))
> +		return;
> +
> +	avic_kick_vcpu(target_vcpu, icrl);
> +}
> +
> +static void avic_kick_vcpu_by_logical_id(struct kvm *kvm, u32 *avic_logical_id_table,
> +					 u32 logid_index, u32 icrl)
> +{
> +	u32 physical_id;
> +
> +	if (!avic_logical_id_table) {
^ Typo, the '!' shoudn't be there.

> +		u32 logid_entry = avic_logical_id_table[logid_index];
> +
> +		/* Nothing to do if the logical destination is invalid. */
> +		if (unlikely(!(logid_entry & AVIC_LOGICAL_ID_ENTRY_VALID_MASK)))
> +			return;
> +
> +		physical_id = logid_entry &
> +			      AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK;
> +	} else {
> +		/*
> +		 * For x2APIC, the logical APIC ID is a read-only value that is
> +		 * derived from the x2APIC ID, thus the x2APIC ID can be found
> +		 * by reversing the calculation (stored in logid_index).  Note,
> +		 * bits 31:20 of the x2APIC ID aren't propagated to the logical
> +		 * ID, but KVM limits the x2APIC ID limited to KVM_MAX_VCPU_IDS.
> +		 */
> +		physical_id = logid_index;
> +	}
> +
> +	avic_kick_vcpu_by_physical_id(kvm, physical_id, icrl);
> +}

These two functions are a very good cleanup IMHO.

> +
> +static bool is_optimized_logical_map_enabled(struct kvm *kvm)
> +{
> +	struct kvm_apic_map *map;
> +	bool enabled;
> +
> +	rcu_read_lock();
> +	map = rcu_dereference(kvm->arch.apic_map);
> +	enabled = map && map->logical_mode != KVM_APIC_MODE_MAP_DISABLED;
> +	rcu_read_unlock();
> +	return enabled;
> +}

This function doesn't belong to avic, it should be in common KVM code.


> +
>  /*
>   * A fast-path version of avic_kick_target_vcpus(), which attempts to match
>   * destination APIC ID to vCPU without looping through all vCPUs.
> @@ -346,11 +402,10 @@ static void avic_kick_vcpu(struct kvm_vcpu *vcpu, u32 icrl)
>  static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source,
>  				       u32 icrl, u32 icrh, u32 index)
>  {
> -	u32 l1_physical_id, dest;
> -	struct kvm_vcpu *target_vcpu;
>  	int dest_mode = icrl & APIC_DEST_MASK;
>  	int shorthand = icrl & APIC_SHORT_MASK;
>  	struct kvm_svm *kvm_svm = to_kvm_svm(kvm);
> +	u32 dest;
>  
>  	if (shorthand != APIC_DEST_NOSHORT)
>  		return -EINVAL;
> @@ -367,14 +422,14 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  		if (!apic_x2apic_mode(source) && dest == APIC_BROADCAST)
>  			return -EINVAL;
>  
> -		l1_physical_id = dest;
> -
> -		if (WARN_ON_ONCE(l1_physical_id != index))
> +		if (WARN_ON_ONCE(dest != index))
>  			return -EINVAL;
>  
> +		avic_kick_vcpu_by_physical_id(kvm, dest, icrl);
>  	} else {
> -		u32 bitmap, cluster;
> -		int logid_index;
> +		u32 *avic_logical_id_table;
> +		unsigned long bitmap, i;
> +		u32 cluster;
>  
>  		if (apic_x2apic_mode(source)) {
>  			/* 16 bit dest mask, 16 bit cluster id */
> @@ -394,50 +449,27 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
>  		if (unlikely(!bitmap))
>  			return 0;
>  
> -		if (!is_power_of_2(bitmap))
> -			/* multiple logical destinations, use slow path */
> +		/*
> +		 * Use the slow path if more than one bit is set in the bitmap
> +		 * and KVM's optimized logical map is disabled to avoid kicking
> +		 * a vCPU multiple times.  If the optimized map is disabled, a
> +		 * vCPU _may_ have multiple bits set in its logical ID, i.e.
> +		 * may have multiple entries in the logical table.
> +		 */
> +		if (!is_power_of_2(bitmap) &&
> +		    !is_optimized_logical_map_enabled(kvm))
>  			return -EINVAL;


I hate to say it but there is another issue here, which I know about for a while
but haven't gotten yet to fix.

The issue is that AVIC's logical to physical map can't cover all the corner cases
that you discovered - it only supports the sane subset: for each cluster, and for each bit
in the mask, it has a physical apic id - so things like logical ids with multiple bits,
having same logical id for multiple vcpus and so on can't work.

In this case we need to either inhibit AVIC (I support this 100%), or clear
its logical ID map, so all logicical IPIs VM exit, and then they can be emulated.

I haven't studied it formally but the code which rebuilds the AVIC's logical ID map
starts at 'avic_handle_ldr_update'.


Besides that this patch makes sense, and it explains why you removed the logic which
was incorrectly checking for having a single bit in the bitmap, but I still
prefer to revert the patch as I explained there.

Best regards,
	Maxim Levitsky

>  
> -		logid_index = cluster + __ffs(bitmap);
> -
> -		if (!apic_x2apic_mode(source)) {
> -			u32 *avic_logical_id_table =
> -				page_address(kvm_svm->avic_logical_id_table_page);
> -
> -			u32 logid_entry = avic_logical_id_table[logid_index];
> -
> -			if (WARN_ON_ONCE(index != logid_index))
> -				return -EINVAL;
> -
> -			/* Nothing to do if the logical destination is invalid. */
> -			if (unlikely(!(logid_entry & AVIC_LOGICAL_ID_ENTRY_VALID_MASK)))
> -				return 0;
> -
> -			l1_physical_id = logid_entry &
> -					 AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK;
> -		} else {
> -			/*
> -			 * For x2APIC, the logical APIC ID is a read-only value
> -			 * that is derived from the x2APIC ID, thus the x2APIC
> -			 * ID can be found by reversing the calculation (done
> -			 * above).  Note, bits 31:20 of the x2APIC ID are not
> -			 * propagated to the logical ID, but KVM limits the
> -			 * x2APIC ID limited to KVM_MAX_VCPU_IDS.
> -			 */
> -			l1_physical_id = logid_index;
> -		}
> +		if (apic_x2apic_mode(source))
> +			avic_logical_id_table = NULL;
> +		else
> +			avic_logical_id_table = page_address(kvm_svm->avic_logical_id_table_page);
> +
> +		for_each_set_bit(i, &bitmap, 16)
> +			avic_kick_vcpu_by_logical_id(kvm, avic_logical_id_table,
> +						     cluster + i, icrl);
>  	}
>  
> -	/*
> -	 * KVM inhibits AVIC if any vCPU ID diverges from the vCPUs APIC ID,
> -	 * i.e. APIC ID == vCPU ID.  Once again, nothing to do if the target
> -	 * vCPU doesn't exist.
> -	 */
> -	target_vcpu = kvm_get_vcpu_by_id(kvm, l1_physical_id);
> -	if (unlikely(!target_vcpu))
> -		return 0;
> -
> -	avic_kick_vcpu(target_vcpu, icrl);
>  	return 0;
>  }
>  



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 11/19] KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU
  2022-08-31 10:25   ` Maxim Levitsky
@ 2022-08-31 15:08     ` Sean Christopherson
  2022-08-31 18:12       ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 15:08 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > @@ -455,13 +461,8 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
> >  	 */
> >  	kvm_for_each_vcpu(i, vcpu, kvm) {
> >  		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
> > -					dest, icrl & APIC_DEST_MASK)) {
> > -			vcpu->arch.apic->irr_pending = true;
> > -			svm_complete_interrupt_delivery(vcpu,
> > -							icrl & APIC_MODE_MASK,
> > -							icrl & APIC_INT_LEVELTRIG,
> > -							icrl & APIC_VECTOR_MASK);
> > -		}
> > +					dest, icrl & APIC_DEST_MASK))
> > +			avic_kick_vcpu(vcpu, icrl);
> >  	}
> >  }
> >  
> 
> I don't know what I think about this, sometimes *minor* code duplication
> might actually be a good thing, as it is easier to read the code, but I don't
> have much against this as well.
> 
> I am not sure if before or after this code is more readable.

I don't have a strong opinion either.  I think I prefer having the helper, but
have no objection to leaving things as is.  Originally I was thinking there was
going to be a third call site, but that didn't happen.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch
  2022-08-31 10:22   ` Maxim Levitsky
@ 2022-08-31 16:16     ` Sean Christopherson
  2022-08-31 17:49       ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:16 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > Document that AVIC is inhibited if any vCPU's APIC ID diverges from its
> > vCPU ID, i.e. that there's no need to check for a destination match in
> > the AVIC kick fast path.
> > 
> > Opportunistically tweak comments to remove "guest bug", as that suggests
> > KVM is punting on error handling, which is not the case.  Targeting a
> > non-existent vCPU or no vCPUs _may_ be a guest software bug, but whether
> > or not it's a guest bug is irrelevant.  Such behavior is architecturally
> > legal and thus needs to faithfully emulated by KVM (and it is).
> 
> I don't want to pick a fight,

Please don't hesitate to push back, I would much rather discuss points of contention
than have an ongoing, silent battle through patches.

> but personally these things *are* guest bugs / improper usage of APIC, and I
> don't think it is wrong to document them as such.

If the guest is intentionally exercising an edge case, e.g. KUT or selftests, then
from the guest's perspective its code/behavior isn't buggy.

I completely agree that abusing/aliasing logical IDs is improper usage and arguably
out of spec, but the scenarios here are very much in spec, e.g. a bitmap of '0'
isn't expressly forbidden and both Intel and AMD specs very clearly state that
APICs discard interrupt messages if they are not the destination.

But that's somewhat beside the point, as I have no objection to documenting scenarios
that are out-of-spec or undefined.  What I object to is documenting such scenarios as
"guest bugs".  If the CPU/APIC/whatever behavior is undefined, then document it
as undefined.  Saying "guest bug" doesn't help future readers understand what is
architecturally supposed to happen, whereas a comment like

	/*
	 * Logical IDs are architecturally "required" to be unique, i.e. this is
	 * technically undefined behavior.  Disable the optimized logical map so
	 * that KVM is consistent with itself, as the non-optimized matching
	 * logic with accept interrupts on all CPUs with the logical ID.
	 */

anchors KVM's behavior to the specs and explains why KVM does XYZ in response to
undefined behavior.

I feel very strongly about "guest bug" because KVM has a history of disregarding
architectural correctness and using a "good enough" approach.  Simply stating
"guest bug" makes it all the more difficult to differentiate between KVM handling
architecturally undefined behavior, versus KVM deviating from the spec because
someone decided that KVM's partial emulation/virtualziation was "good enough".

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31  5:59   ` Maxim Levitsky
  2022-08-31  6:45     ` Maxim Levitsky
@ 2022-08-31 16:19     ` Sean Christopherson
  2022-08-31 17:47       ` Maxim Levitsky
  1 sibling, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:19 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > Remove SVM's so called "hybrid-AVIC mode" and reinstate the restriction
> > where AVIC is disabled if x2APIC is enabled.  The argument that the
> > "guest is not supposed to access xAPIC mmio when uses x2APIC" is flat out
> > wrong.  Activating x2APIC completely disables the xAPIC MMIO region,
> > there is nothing that says the guest must not access that address.
> > 
> > Concretely, KVM-Unit-Test's existing "apic" test fails the subtests that
> > expect accesses to the APIC base region to not be emulated when x2APIC is
> > enabled.
> > 
> > Furthermore, allowing the guest to trigger MMIO emulation in a mode where
> > KVM doesn't expect such emulation to occur is all kinds of dangerous.
> > 
> > Tweak the restriction so that it only inhibits AVIC if x2APIC is actually
> > enabled instead of inhibiting AVIC is x2APIC is exposed to the guest.
> > 
> > This reverts commit 0e311d33bfbef86da130674e8528cc23e6acfe16.
> 
> I don't agree with this patch.
> 
> When reviewing this code I did note that MMIO is left enabled which is kind
> of errata on KVM side, and nobody objected to this.

I didn't object because I didn't read the patch.  I'm very much objecting now.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31  6:45     ` Maxim Levitsky
@ 2022-08-31 16:29       ` Sean Christopherson
  2022-08-31 17:46         ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:29 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 08:59 +0300, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > > Remove SVM's so called "hybrid-AVIC mode" and reinstate the restriction
> > > where AVIC is disabled if x2APIC is enabled.  The argument that the
> > > "guest is not supposed to access xAPIC mmio when uses x2APIC" is flat out
> > > wrong.  Activating x2APIC completely disables the xAPIC MMIO region,
> > > there is nothing that says the guest must not access that address.
> > > 
> > > Concretely, KVM-Unit-Test's existing "apic" test fails the subtests that
> > > expect accesses to the APIC base region to not be emulated when x2APIC is
> > > enabled.
> > > 
> > > Furthermore, allowing the guest to trigger MMIO emulation in a mode where
> > > KVM doesn't expect such emulation to occur is all kinds of dangerous.
> 
> Also, unless I misunderstood you, the above statement is wrong.
> 
> Leaving AVIC on, when vCPU is in x2apic mode cannot trigger extra MMIO emulation,
> in fact the opposite - because AVIC is on, writes to 0xFEE00xxx might *not* trigger
> MMIO emulation and instead be emulated by AVIC.

That's even worse, because KVM is allowing the guest to exercise hardware logic
that I highly doubt AMD has thoroughly tested.

> Yes, some of these writes can trigger AVIC specific emulation vm exits, but they
> are literaly the same as those used by x2avic, and it is really hard to see
> why this would be dangerous (assuming that x2avic code works, and avic code
> is aware of this 'hybrid' mode).

The APIC_RRR thing triggered the KVM_BUG_ON() in kvm_apic_write_nodecode()
precisely because of the AVIC trap.  At best, this gives a way for the guest to
trigger a WARN_ON_ONCE() and thus panic the host if panic_on_warn=1.  I fixed
the APIC_RRR case because that will be problematic for x2AVIC, but there are
other APIC registers that are unsupported in x2APIC that can trigger the KVM_BUG_ON().

> From the guest point of view, unless the guest pokes at random MMIO area,
> the only case when this matters is if the guest maps RAM over the 0xFEE00xxx
> (which it of course can, the spec explictly state as you say that when x2apic
> is enabled, the mmio is disabled), and then instead of functioning as RAM,
> the range will still function as APIC.

There is no wiggle room here though, KVM is blatantly breaking the architectural
specification.  When x2APIC is enabled, the xAPIC MMIO does not exist.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 02/19] KVM: SVM: Don't put/load AVIC when setting virtual APIC mode
  2022-08-31  9:24   ` Maxim Levitsky
@ 2022-08-31 16:33     ` Sean Christopherson
  0 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:33 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > @@ -1118,6 +1107,16 @@ void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> >  		avic_deactivate_vmcb(svm);
> >  	}
> >  	vmcb_mark_dirty(vmcb, VMCB_AVIC);
> > +}
> > +
> > +void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> > +{
> > +	bool activated = kvm_vcpu_apicv_active(vcpu);
> > +
> > +	if (!enable_apicv)
> > +		return;
> > +
> > +	avic_set_virtual_apic_mode(vcpu);
> 
> This call is misleading - this will usually be called
> when avic mode didn't change - I think we need a better name for
> avic_set_virtual_apic_mode.

I don't disagree, but I'm having trouble coming up with a succinct alternative.
The helper primarily configures the VMCB, but the call to
avic_apicv_post_state_restore() makes avic_refresh_vmcb_controls() undesirable.

Maybe avic_refresh_virtual_apic_mode()?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16]
  2022-08-31  9:43     ` Maxim Levitsky
@ 2022-08-31 16:35       ` Sean Christopherson
  2022-08-31 18:18         ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:35 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 09:09 +0300, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > > When attempting a fast kick for x2AVIC, get the destination bitmap from
> > > ICR[15:0], not ICHR[31:16].  The upper 16 bits contain the cluster, the
> > > lower 16 bits hold the bitmap.
> > > 
> > > Fixes: 603ccef42ce9 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast")
> > > Cc: Maxim Levitsky <mlevitsk@redhat.com>
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >  arch/x86/kvm/svm/avic.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > > index 3ace0f2f52f0..3c333cd2e752 100644
> > > --- a/arch/x86/kvm/svm/avic.c
> > > +++ b/arch/x86/kvm/svm/avic.c
> > > @@ -368,7 +368,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
> > >  
> > >  		if (apic_x2apic_mode(source)) {
> > >  			/* 16 bit dest mask, 16 bit cluster id */
> > > -			bitmap = dest & 0xFFFF0000;
> > > +			bitmap = dest & 0xFFFF;
> > >  			cluster = (dest >> 16) << 4;
> > >  		} else if (kvm_lapic_get_reg(source, APIC_DFR) == APIC_DFR_FLAT) {
> > >  			/* 8 bit dest mask*/
> > 
> > I swear I have seen a patch from Suravee Suthikulpanit fixing this my mistake, I don't know why it was not
> > accepted upstream.
> 
> This is the patch, which I guess got forgotten.
> 
> https://www.spinics.net/lists/kernel/msg4417427.html

Ah, we just missed it, doubt there's anything more than that to the story.

> Since it is literaly the same patch, you can just add credit to Suravee Suthikulpanit.
> 
> So with the credit added:
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

I'll grab Suravee's patch and added your review.  Thanks!

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 07/19] KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check
  2022-08-31  6:19   ` Maxim Levitsky
@ 2022-08-31 16:37     ` Sean Christopherson
  2022-08-31 18:10       ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:37 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > Use the already-calculated-and-sanity-checked destination bitmap when
> > processing a fast AVIC kick in logical mode, and drop the logical path's
> > flawed logic.  The intent of the check is to ensure the bitmap is a power
> > of two, whereas "icrh != (1 << avic)" effectively checks that the bitmap
> > is a power of two _and_ the target cluster is '0'.
> > 
> > Note, the flawed check isn't a functional issue, it simply means that KVM
> > will go down the slow path if the target cluster is non-zero.
> > 
> > Fixes: 8c9e639da435 ("KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible")
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/svm/avic.c | 10 +---------
> >  1 file changed, 1 insertion(+), 9 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > index 3c333cd2e752..14f567550a1e 100644
> > --- a/arch/x86/kvm/svm/avic.c
> > +++ b/arch/x86/kvm/svm/avic.c
> > @@ -411,15 +411,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
> >  			 * Instead, calculate physical ID from logical ID in ICRH.
> >  			 */
> >  			int cluster = (icrh & 0xffff0000) >> 16;
> > -			int apic = ffs(icrh & 0xffff) - 1;
> > -
> > -			/*
> > -			 * If the x2APIC logical ID sub-field (i.e. icrh[15:0])
> > -			 * contains anything but a single bit, we cannot use the
> > -			 * fast path, because it is limited to a single vCPU.
> > -			 */
> > -			if (apic < 0 || icrh != (1 << apic))
> > -				return -EINVAL;
> > +			int apic = ffs(bitmap) - 1;
> >  
> >  			l1_physical_id = (cluster << 4) + apic;
> >  		}
> 
> Oh, I didn't notice this bug. However isn't removing the check is wrong as well?
> 
> What if we do have multiple bits set in the bitmap? After you remove this code,
> we will set IPI only to APIC which matches the 1st bit, no?

The common code (out of sight here) already ensures the bitmap has exactly one bit set:

                if (unlikely(!bitmap))
                        /* guest bug: nobody to send the logical interrupt to */
                        return 0;

                if (!is_power_of_2(bitmap))
                        /* multiple logical destinations, use slow path */
                        return -EINVAL;

                logid_index = cluster + __ffs(bitmap);


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 14/19] KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs
  2022-08-31 13:37   ` Maxim Levitsky
@ 2022-08-31 16:41     ` Sean Christopherson
  2022-08-31 17:51       ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:41 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> > -		if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
> > -			new->phys_map[xapic_id] = apic;
> > +		if (kvm->arch.x2apic_format) {
> > +			/* See also kvm_apic_match_physical_addr(). */
> > +			if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
> > +			    x2apic_id <= new->max_apic_id)
> > +				new->phys_map[x2apic_id] = apic;
> > +
> > +			if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
> > +				new->phys_map[xapic_id] = apic;
> > +		} else {
> > +			/*
> > +			 * Disable the optimized map if the physical APIC ID is
> > +			 * already mapped, i.e. is aliased to multiple vCPUs.
> > +			 * The optimized map requires a strict 1:1 mapping
> > +			 * between IDs and vCPUs.
> > +			 */
> > +			if (apic_x2apic_mode(apic))
> > +				physical_id = x2apic_id;
> > +			else
> > +				physical_id = xapic_id;
> > +
> > +			if (new->phys_map[physical_id]) {
> > +				kvfree(new);
> > +				new = NULL;
> > +				goto out;
> Why not to use the same  KVM_APIC_MODE_XAPIC_FLAT |  KVM_APIC_MODE_XAPIC_CLUSTER
> hack here?

The map's "mode" only covers logical mode (the cleanup patch renames "mode" to
"logical_mode" to make this more clear).  There is no equivalent for dealing with
the physical IDs.  Alternatively, a flag to say "physical map is disabled" could
be added, but KVM already has to cleanly handle a NULL map and in all likelihood
the logical map is also going to be disabled anyways.

Not to mention that APIC performance is unlikely to be a priority for any guest
that triggers this code :-)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 15/19] KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0
  2022-08-31 13:41   ` Maxim Levitsky
@ 2022-08-31 16:47     ` Sean Christopherson
  0 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:47 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> > Explicitly skip the optimized map setup if the vCPU's LDR is '0', i.e. if
> > the vCPU will never response to logical mode interrupts.  KVM already
> > skips setup in this case, but relies on kvm_apic_map_get_logical_dest()
> > to generate mask==0.  KVM still needs the mask=0 check as a non-zero LDR
> > can yield mask==0 depending on the mode, but explicitly handling the LDR
> > will make it simpler to clean up the logical mode tracking in the future.
> 
> If I am not mistaken, the commit description is a bit misleading - in this
> case we just don't add the vCPU to the map since it is unreachable, but we do
> continue creating the map.

I'll reword.  I intended that to mean "skip the optimized setup for the vCPU", but
even that is unnecessarily confusing.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes
  2022-08-31 13:43   ` Maxim Levitsky
@ 2022-08-31 16:56     ` Sean Christopherson
  2022-08-31 17:53       ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 16:56 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 8209caffe3ab..3b6ef36b3963 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -168,7 +168,12 @@ static bool kvm_use_posted_timer_interrupt(struct kvm_vcpu *vcpu)
> >  
> >  static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
> >  		u32 dest_id, struct kvm_lapic ***cluster, u16 *mask) {
> > -	switch (map->mode) {
> > +	switch (map->logical_mode) {
> > +	case KVM_APIC_MODE_SW_DISABLED:
> > +		/* Arbitrarily use the flat map so that @cluster isn't NULL. */
> > +		*cluster = map->xapic_flat_map;
> > +		*mask = 0;
> > +		return true;
> Could you explain why this is needed? I probably missed something.

If all vCPUs leave their APIC software disabled, or leave LDR=0, then the overall
mode will be KVM_APIC_MODE_SW_DISABLED.  In this case, the effective "mask" is '0'
because there are no targets.  And this returns %true because there are no targets,
i.e. there's no need to go down the slow path after kvm_apic_map_get_dest_lapic().

> > @@ -993,7 +1011,7 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
> >  {
> >  	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
> >  		if ((irq->dest_id == APIC_BROADCAST &&
> > -				map->mode != KVM_APIC_MODE_X2APIC))
> > +		     map->logical_mode != KVM_APIC_MODE_X2APIC))
> >  			return true;
> >  		if (irq->dest_id == X2APIC_BROADCAST)
> >  			return true;
> 
> To be honest I would put that patch first, and then do all the other patches,
> this way you would avoid all of the hacks they do and removed here.

I did it this way so that I could test this patch for correctness.  Without the
bug fixes in place it's not really possible to verify this patch is 100% correct.

I completely agree that it would be a lot easier to read/understand/review if
this came first, but I'd rather not sacrifice the ability to easily test this patch.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31 16:29       ` Sean Christopherson
@ 2022-08-31 17:46         ` Maxim Levitsky
  2022-08-31 17:58           ` Jim Mattson
  2022-08-31 19:12           ` Sean Christopherson
  0 siblings, 2 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 17:46 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 16:29 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 08:59 +0300, Maxim Levitsky wrote:
> > > On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > > > Remove SVM's so called "hybrid-AVIC mode" and reinstate the restriction
> > > > where AVIC is disabled if x2APIC is enabled.  The argument that the
> > > > "guest is not supposed to access xAPIC mmio when uses x2APIC" is flat out
> > > > wrong.  Activating x2APIC completely disables the xAPIC MMIO region,
> > > > there is nothing that says the guest must not access that address.
> > > > 
> > > > Concretely, KVM-Unit-Test's existing "apic" test fails the subtests that
> > > > expect accesses to the APIC base region to not be emulated when x2APIC is
> > > > enabled.
> > > > 
> > > > Furthermore, allowing the guest to trigger MMIO emulation in a mode where
> > > > KVM doesn't expect such emulation to occur is all kinds of dangerous.
> > 
> > Also, unless I misunderstood you, the above statement is wrong.
> > 
> > Leaving AVIC on, when vCPU is in x2apic mode cannot trigger extra MMIO emulation,
> > in fact the opposite - because AVIC is on, writes to 0xFEE00xxx might *not* trigger
> > MMIO emulation and instead be emulated by AVIC.
> 
> That's even worse, because KVM is allowing the guest to exercise hardware logic
> that I highly doubt AMD has thoroughly tested.

Harware logic is exactly the same regarless of if KVM uses x2apic mode or not,
and it is better to be prepared for all kind of garbage coming from the guest.

Software logic, I can understand you, there could be registers that trap differently
in avic and x2avic mode, but it should be *very* easy to deal with it, the list
of registers that trap is very short.

> 
> > Yes, some of these writes can trigger AVIC specific emulation vm exits, but they
> > are literaly the same as those used by x2avic, and it is really hard to see
> > why this would be dangerous (assuming that x2avic code works, and avic code
> > is aware of this 'hybrid' mode).
> 
> The APIC_RRR thing triggered the KVM_BUG_ON() in kvm_apic_write_nodecode()
> precisely because of the AVIC trap.  At best, this gives a way for the guest to
> trigger a WARN_ON_ONCE() and thus panic the host if panic_on_warn=1.  I fixed
> the APIC_RRR case because that will be problematic for x2AVIC, but there are
> other APIC registers that are unsupported in x2APIC that can trigger the KVM_BUG_ON().
> 
> > From the guest point of view, unless the guest pokes at random MMIO area,
> > the only case when this matters is if the guest maps RAM over the 0xFEE00xxx
> > (which it of course can, the spec explictly state as you say that when x2apic
> > is enabled, the mmio is disabled), and then instead of functioning as RAM,
> > the range will still function as APIC.
> 
> There is no wiggle room here though, KVM is blatantly breaking the architectural
> specification.  When x2APIC is enabled, the xAPIC MMIO does not exist.

In this case I say that there is no wiggle room for KVM to not allow different APIC bases
on each CPU - the spec 100% allows it, but in KVM it is broken.

If you are really hell bent on not having that MMIO exposed, 
then I say we can just disable the AVIC memslot, and keep AVIC enabled in this case - 
this should make us both happy.


This discussion really makes me feel that you want just to force your opinion on others,
and its not the first time this happens. It is really frustrating to work like that.
It might sound harsh but this is how I feel. Hopefully I am wrong.


Best regards,
	Maxim Levisky


> 



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31 16:19     ` Sean Christopherson
@ 2022-08-31 17:47       ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 17:47 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 16:19 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > > Remove SVM's so called "hybrid-AVIC mode" and reinstate the restriction
> > > where AVIC is disabled if x2APIC is enabled.  The argument that the
> > > "guest is not supposed to access xAPIC mmio when uses x2APIC" is flat out
> > > wrong.  Activating x2APIC completely disables the xAPIC MMIO region,
> > > there is nothing that says the guest must not access that address.
> > > 
> > > Concretely, KVM-Unit-Test's existing "apic" test fails the subtests that
> > > expect accesses to the APIC base region to not be emulated when x2APIC is
> > > enabled.
> > > 
> > > Furthermore, allowing the guest to trigger MMIO emulation in a mode where
> > > KVM doesn't expect such emulation to occur is all kinds of dangerous.
> > > 
> > > Tweak the restriction so that it only inhibits AVIC if x2APIC is actually
> > > enabled instead of inhibiting AVIC is x2APIC is exposed to the guest.
> > > 
> > > This reverts commit 0e311d33bfbef86da130674e8528cc23e6acfe16.
> > 
> > I don't agree with this patch.
> > 
> > When reviewing this code I did note that MMIO is left enabled which is kind
> > of errata on KVM side, and nobody objected to this.
> 
> I didn't object because I didn't read the patch.  I'm very much objecting now.
> 

And I am *very* much objecting to reverting this patch.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch
  2022-08-31 16:16     ` Sean Christopherson
@ 2022-08-31 17:49       ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 17:49 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 16:16 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > > Document that AVIC is inhibited if any vCPU's APIC ID diverges from its
> > > vCPU ID, i.e. that there's no need to check for a destination match in
> > > the AVIC kick fast path.
> > > 
> > > Opportunistically tweak comments to remove "guest bug", as that suggests
> > > KVM is punting on error handling, which is not the case.  Targeting a
> > > non-existent vCPU or no vCPUs _may_ be a guest software bug, but whether
> > > or not it's a guest bug is irrelevant.  Such behavior is architecturally
> > > legal and thus needs to faithfully emulated by KVM (and it is).
> > 
> > I don't want to pick a fight,
> 
> Please don't hesitate to push back, I would much rather discuss points of contention
> than have an ongoing, silent battle through patches.
> 
> > but personally these things *are* guest bugs / improper usage of APIC, and I
> > don't think it is wrong to document them as such.
> 
> If the guest is intentionally exercising an edge case, e.g. KUT or selftests, then
> from the guest's perspective its code/behavior isn't buggy.
> 
> I completely agree that abusing/aliasing logical IDs is improper usage and arguably
> out of spec, but the scenarios here are very much in spec, e.g. a bitmap of '0'
> isn't expressly forbidden and both Intel and AMD specs very clearly state that
> APICs discard interrupt messages if they are not the destination.
> 
> But that's somewhat beside the point, as I have no objection to documenting scenarios
> that are out-of-spec or undefined.  What I object to is documenting such scenarios as
> "guest bugs".  If the CPU/APIC/whatever behavior is undefined, then document it
> as undefined.  Saying "guest bug" doesn't help future readers understand what is
> architecturally supposed to happen, whereas a comment like
> 
> 	/*
> 	 * Logical IDs are architecturally "required" to be unique, i.e. this is
> 	 * technically undefined behavior.  Disable the optimized logical map so
> 	 * that KVM is consistent with itself, as the non-optimized matching
> 	 * logic with accept interrupts on all CPUs with the logical ID.
> 	 */
> 
> anchors KVM's behavior to the specs and explains why KVM does XYZ in response to
> undefined behavior.
> 
> I feel very strongly about "guest bug" because KVM has a history of disregarding
> architectural correctness and using a "good enough" approach.  Simply stating
> "guest bug" makes it all the more difficult to differentiate between KVM handling
> architecturally undefined behavior, versus KVM deviating from the spec because
> someone decided that KVM's partial emulation/virtualziation was "good enough".
> 

All right, I agree with you.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 14/19] KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs
  2022-08-31 16:41     ` Sean Christopherson
@ 2022-08-31 17:51       ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 17:51 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 16:41 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> > > -		if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
> > > -			new->phys_map[xapic_id] = apic;
> > > +		if (kvm->arch.x2apic_format) {
> > > +			/* See also kvm_apic_match_physical_addr(). */
> > > +			if ((apic_x2apic_mode(apic) || x2apic_id > 0xff) &&
> > > +			    x2apic_id <= new->max_apic_id)
> > > +				new->phys_map[x2apic_id] = apic;
> > > +
> > > +			if (!apic_x2apic_mode(apic) && !new->phys_map[xapic_id])
> > > +				new->phys_map[xapic_id] = apic;
> > > +		} else {
> > > +			/*
> > > +			 * Disable the optimized map if the physical APIC ID is
> > > +			 * already mapped, i.e. is aliased to multiple vCPUs.
> > > +			 * The optimized map requires a strict 1:1 mapping
> > > +			 * between IDs and vCPUs.
> > > +			 */
> > > +			if (apic_x2apic_mode(apic))
> > > +				physical_id = x2apic_id;
> > > +			else
> > > +				physical_id = xapic_id;
> > > +
> > > +			if (new->phys_map[physical_id]) {
> > > +				kvfree(new);
> > > +				new = NULL;
> > > +				goto out;
> > Why not to use the same  KVM_APIC_MODE_XAPIC_FLAT |  KVM_APIC_MODE_XAPIC_CLUSTER
> > hack here?
> 
> The map's "mode" only covers logical mode (the cleanup patch renames "mode" to
> "logical_mode" to make this more clear).  There is no equivalent for dealing with
> the physical IDs.  Alternatively, a flag to say "physical map is disabled" could
> be added, but KVM already has to cleanly handle a NULL map and in all likelihood
> the logical map is also going to be disabled anyways.
> 
> Not to mention that APIC performance is unlikely to be a priority for any guest
> that triggers this code :-)
> 

Thanks for the explanation, that makes sense!

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes
  2022-08-31 16:56     ` Sean Christopherson
@ 2022-08-31 17:53       ` Maxim Levitsky
  2022-09-16 18:58         ` Sean Christopherson
  0 siblings, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 17:53 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 16:56 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > index 8209caffe3ab..3b6ef36b3963 100644
> > > --- a/arch/x86/kvm/lapic.c
> > > +++ b/arch/x86/kvm/lapic.c
> > > @@ -168,7 +168,12 @@ static bool kvm_use_posted_timer_interrupt(struct kvm_vcpu *vcpu)
> > >  
> > >  static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
> > >  		u32 dest_id, struct kvm_lapic ***cluster, u16 *mask) {
> > > -	switch (map->mode) {
> > > +	switch (map->logical_mode) {
> > > +	case KVM_APIC_MODE_SW_DISABLED:
> > > +		/* Arbitrarily use the flat map so that @cluster isn't NULL. */
> > > +		*cluster = map->xapic_flat_map;
> > > +		*mask = 0;
> > > +		return true;
> > Could you explain why this is needed? I probably missed something.
> 
> If all vCPUs leave their APIC software disabled, or leave LDR=0, then the overall
> mode will be KVM_APIC_MODE_SW_DISABLED.  In this case, the effective "mask" is '0'
> because there are no targets.  And this returns %true because there are no targets,
> i.e. there's no need to go down the slow path after kvm_apic_map_get_dest_lapic().

I guess this case doesn't need optimization (although maybe some OSes do leave all LDRs to 0,
if they don't use logical addressing, don't know)

Anyway thanks, that makes sense.

> 
> > > @@ -993,7 +1011,7 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
> > >  {
> > >  	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
> > >  		if ((irq->dest_id == APIC_BROADCAST &&
> > > -				map->mode != KVM_APIC_MODE_X2APIC))
> > > +		     map->logical_mode != KVM_APIC_MODE_X2APIC))
> > >  			return true;
> > >  		if (irq->dest_id == X2APIC_BROADCAST)
> > >  			return true;
> > 
> > To be honest I would put that patch first, and then do all the other patches,
> > this way you would avoid all of the hacks they do and removed here.
> 
> I did it this way so that I could test this patch for correctness.  Without the
> bug fixes in place it's not really possible to verify this patch is 100% correct.
> 
> I completely agree that it would be a lot easier to read/understand/review if
> this came first, but I'd rather not sacrifice the ability to easily test this patch.
> 

I am not 100% sure about this, but I won't argue about it, let it be.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31 17:46         ` Maxim Levitsky
@ 2022-08-31 17:58           ` Jim Mattson
  2022-08-31 18:01             ` Maxim Levitsky
  2022-08-31 19:12           ` Sean Christopherson
  1 sibling, 1 reply; 74+ messages in thread
From: Jim Mattson @ 2022-08-31 17:58 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022 at 10:49 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:

> In this case I say that there is no wiggle room for KVM to not allow different APIC bases
> on each CPU - the spec 100% allows it, but in KVM it is broken.

This would actually be my first candidate for
Documentation/virt/kvm/x86/errata.rst!

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31 17:58           ` Jim Mattson
@ 2022-08-31 18:01             ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 18:01 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 10:58 -0700, Jim Mattson wrote:
> On Wed, Aug 31, 2022 at 10:49 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> 
> > In this case I say that there is no wiggle room for KVM to not allow different APIC bases
> > on each CPU - the spec 100% allows it, but in KVM it is broken.
> 
> This would actually be my first candidate for
> Documentation/virt/kvm/x86/errata.rst!
> 
100% agree.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 07/19] KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check
  2022-08-31 16:37     ` Sean Christopherson
@ 2022-08-31 18:10       ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 18:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 16:37 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > > Use the already-calculated-and-sanity-checked destination bitmap when
> > > processing a fast AVIC kick in logical mode, and drop the logical path's
> > > flawed logic.  The intent of the check is to ensure the bitmap is a power
> > > of two, whereas "icrh != (1 << avic)" effectively checks that the bitmap
> > > is a power of two _and_ the target cluster is '0'.
> > > 
> > > Note, the flawed check isn't a functional issue, it simply means that KVM
> > > will go down the slow path if the target cluster is non-zero.
> > > 
> > > Fixes: 8c9e639da435 ("KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible")
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >  arch/x86/kvm/svm/avic.c | 10 +---------
> > >  1 file changed, 1 insertion(+), 9 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > > index 3c333cd2e752..14f567550a1e 100644
> > > --- a/arch/x86/kvm/svm/avic.c
> > > +++ b/arch/x86/kvm/svm/avic.c
> > > @@ -411,15 +411,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
> > >  			 * Instead, calculate physical ID from logical ID in ICRH.
> > >  			 */
> > >  			int cluster = (icrh & 0xffff0000) >> 16;
> > > -			int apic = ffs(icrh & 0xffff) - 1;
> > > -
> > > -			/*
> > > -			 * If the x2APIC logical ID sub-field (i.e. icrh[15:0])
> > > -			 * contains anything but a single bit, we cannot use the
> > > -			 * fast path, because it is limited to a single vCPU.
> > > -			 */
> > > -			if (apic < 0 || icrh != (1 << apic))
> > > -				return -EINVAL;
> > > +			int apic = ffs(bitmap) - 1;
> > >  
> > >  			l1_physical_id = (cluster << 4) + apic;
> > >  		}
> > 
> > Oh, I didn't notice this bug. However isn't removing the check is wrong as well?
> > 
> > What if we do have multiple bits set in the bitmap? After you remove this code,
> > we will set IPI only to APIC which matches the 1st bit, no?
> 
> The common code (out of sight here) already ensures the bitmap has exactly one bit set:
> 
>                 if (unlikely(!bitmap))
>                         /* guest bug: nobody to send the logical interrupt to */
>                         return 0;
> 
>                 if (!is_power_of_2(bitmap))
>                         /* multiple logical destinations, use slow path */
>                         return -EINVAL;
> 
>                 logid_index = cluster + __ffs(bitmap);
> 
Ah right, but again because the patch that added x2apic logic after I already added should not
be added. I vote again to just revert it.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 11/19] KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU
  2022-08-31 15:08     ` Sean Christopherson
@ 2022-08-31 18:12       ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 18:12 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 15:08 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > > @@ -455,13 +461,8 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
> > >  	 */
> > >  	kvm_for_each_vcpu(i, vcpu, kvm) {
> > >  		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
> > > -					dest, icrl & APIC_DEST_MASK)) {
> > > -			vcpu->arch.apic->irr_pending = true;
> > > -			svm_complete_interrupt_delivery(vcpu,
> > > -							icrl & APIC_MODE_MASK,
> > > -							icrl & APIC_INT_LEVELTRIG,
> > > -							icrl & APIC_VECTOR_MASK);
> > > -		}
> > > +					dest, icrl & APIC_DEST_MASK))
> > > +			avic_kick_vcpu(vcpu, icrl);
> > >  	}
> > >  }
> > >  
> > 
> > I don't know what I think about this, sometimes *minor* code duplication
> > might actually be a good thing, as it is easier to read the code, but I don't
> > have much against this as well.
> > 
> > I am not sure if before or after this code is more readable.
> 
> I don't have a strong opinion either.  I think I prefer having the helper, but
> have no objection to leaving things as is.  Originally I was thinking there was
> going to be a third call site, but that didn't happen.
> 

Yep - when something is duplicated 3 times, it is really rare to not want to have a helper,
Anyway I don't have a strong opinion about this either.

I mostly was unsure about the fact that helper receives icrl and not icrh, kind of wierd,
but anyway let it be.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16]
  2022-08-31 16:35       ` Sean Christopherson
@ 2022-08-31 18:18         ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 18:18 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 16:35 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 09:09 +0300, Maxim Levitsky wrote:
> > > On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > > > When attempting a fast kick for x2AVIC, get the destination bitmap from
> > > > ICR[15:0], not ICHR[31:16].  The upper 16 bits contain the cluster, the
> > > > lower 16 bits hold the bitmap.
> > > > 
> > > > Fixes: 603ccef42ce9 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast")
> > > > Cc: Maxim Levitsky <mlevitsk@redhat.com>
> > > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > > ---
> > > >  arch/x86/kvm/svm/avic.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > > > index 3ace0f2f52f0..3c333cd2e752 100644
> > > > --- a/arch/x86/kvm/svm/avic.c
> > > > +++ b/arch/x86/kvm/svm/avic.c
> > > > @@ -368,7 +368,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
> > > >  
> > > >  		if (apic_x2apic_mode(source)) {
> > > >  			/* 16 bit dest mask, 16 bit cluster id */
> > > > -			bitmap = dest & 0xFFFF0000;
> > > > +			bitmap = dest & 0xFFFF;
> > > >  			cluster = (dest >> 16) << 4;
> > > >  		} else if (kvm_lapic_get_reg(source, APIC_DFR) == APIC_DFR_FLAT) {
> > > >  			/* 8 bit dest mask*/
> > > 
> > > I swear I have seen a patch from Suravee Suthikulpanit fixing this my mistake, I don't know why it was not
> > > accepted upstream.
> > 
> > This is the patch, which I guess got forgotten.
> > 
> > https://www.spinics.net/lists/kernel/msg4417427.html
> 
> Ah, we just missed it, doubt there's anything more than that to the story.

100% sure about it.

BTW there another minor bug I need to fix which was pointed to me by Suravee Suthikulpanit,
just so that I don't forget about it:

My code which inhibits APIC when APIC_ID != vcpu_id has a bug in regard that
when we have more that 255 vCPUs, this condition becames always true, but it is
to some extent wrong to inhibit the AVIC always in this case - there is a use case
when the guest uses only 255 vCPUs, then AVIC will work.

The relaxed condition for inhibit should be that APIC_ID == (vcpu_id & 0xFF), and it AVIC
is actually used on vCPU > 255, then it will be inhibited ( I need to check if the code
for this exists).

Best regards,
	Maxim Levitsky



> 
> > Since it is literaly the same patch, you can just add credit to Suravee Suthikulpanit.
> > 
> > So with the credit added:
> > 
> > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> 
> I'll grab Suravee's patch and added your review.  Thanks!
> 



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 17/19] KVM: SVM: Handle multiple logical targets in AVIC kick fastpath
  2022-08-31 13:57   ` Maxim Levitsky
@ 2022-08-31 18:19     ` Sean Christopherson
  2022-08-31 18:25       ` Maxim Levitsky
  0 siblings, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 18:19 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> > +static void avic_kick_vcpu_by_logical_id(struct kvm *kvm, u32 *avic_logical_id_table,
> > +					 u32 logid_index, u32 icrl)
> > +{
> > +	u32 physical_id;
> > +
> > +	if (!avic_logical_id_table) {
> ^ Typo, the '!' shoudn't be there.

Ouch.  I suspect the tests pass because this just ends up routing events through
the slow path.  I try to concoct a testcase to expose this bug.

> > +static bool is_optimized_logical_map_enabled(struct kvm *kvm)
> > +{
> > +	struct kvm_apic_map *map;
> > +	bool enabled;
> > +
> > +	rcu_read_lock();
> > +	map = rcu_dereference(kvm->arch.apic_map);
> > +	enabled = map && map->logical_mode != KVM_APIC_MODE_MAP_DISABLED;
> > +	rcu_read_unlock();
> > +	return enabled;
> > +}
> 
> This function doesn't belong to avic, it should be in common KVM code.

I'll move it.  I'm not expecting any additional users, but I agree it belongs
elsewhere.  Actually, might be a moot point (see below).

> > @@ -394,50 +449,27 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
> >  		if (unlikely(!bitmap))
> >  			return 0;
> >  
> > -		if (!is_power_of_2(bitmap))
> > -			/* multiple logical destinations, use slow path */
> > +		/*
> > +		 * Use the slow path if more than one bit is set in the bitmap
> > +		 * and KVM's optimized logical map is disabled to avoid kicking
> > +		 * a vCPU multiple times.  If the optimized map is disabled, a
> > +		 * vCPU _may_ have multiple bits set in its logical ID, i.e.
> > +		 * may have multiple entries in the logical table.
> > +		 */
> > +		if (!is_power_of_2(bitmap) &&
> > +		    !is_optimized_logical_map_enabled(kvm))
> >  			return -EINVAL;
> 
> 
> I hate to say it but there is another issue here, which I know about for a while
> but haven't gotten yet to fix.
> 
> The issue is that AVIC's logical to physical map can't cover all the corner cases
> that you discovered - it only supports the sane subset: for each cluster, and for each bit
> in the mask, it has a physical apic id - so things like logical ids with multiple bits,
> having same logical id for multiple vcpus and so on can't work.
> 
> In this case we need to either inhibit AVIC (I support this 100%),

I like the idea of inhibiting.

>  or clear its logical ID map, so all logicical IPIs VM exit, and then they
>  can be emulated.
> 
> I haven't studied it formally but the code which rebuilds the AVIC's logical ID map
> starts at 'avic_handle_ldr_update'.

I suspected there are issues here, but the new tests passed (somewhat surprisingly)
so I stopped trying to decipher the AVIC LDR handling.

Eww.  And the VM-Exit trap logic is broken too.  If the guest updates and disables
its LDR, SVM returns immediately and doesn't call into common APIC code, i.e. doesn't
recalc the optimized map.  E.g. if the guest clears its LDR, the optimized map will
be left as is and the vCPU will receive interrupts using its old LDR.

	case APIC_LDR:
		if (avic_handle_ldr_update(vcpu))
			return 0;
		break;

Rather than handling this purely in AVIC code, what if we a key off of
the optimized map being enabled?  E.g. drop the return from avic_handle_ldr_update()
and in the kvm_recalculate_apic_map() do:

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3b6ef36b3963..6e188010b614 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -364,6 +364,11 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
                cluster[ldr] = apic;
        }
 out:
+       if (!new || new->logical_mode == KVM_APIC_MODE_MAP_DISABLED)
+               kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_LOGICAL_MAP_DISABLED);
+       else
+               kvm_clear_apicv_inhibit(kvm, APICV_INHIBIT_REASON_LOGICAL_MAP_DISABLED);
+
        old = rcu_dereference_protected(kvm->arch.apic_map,
                        lockdep_is_held(&kvm->arch.apic_map_lock));
        rcu_assign_pointer(kvm->arch.apic_map, new);

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH 17/19] KVM: SVM: Handle multiple logical targets in AVIC kick fastpath
  2022-08-31 18:19     ` Sean Christopherson
@ 2022-08-31 18:25       ` Maxim Levitsky
  0 siblings, 0 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 18:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 18:19 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> > > +static void avic_kick_vcpu_by_logical_id(struct kvm *kvm, u32 *avic_logical_id_table,
> > > +					 u32 logid_index, u32 icrl)
> > > +{
> > > +	u32 physical_id;
> > > +
> > > +	if (!avic_logical_id_table) {
> > ^ Typo, the '!' shoudn't be there.
> 
> Ouch.  I suspect the tests pass because this just ends up routing events through
> the slow path.  I try to concoct a testcase to expose this bug.
> 
> > > +static bool is_optimized_logical_map_enabled(struct kvm *kvm)
> > > +{
> > > +	struct kvm_apic_map *map;
> > > +	bool enabled;
> > > +
> > > +	rcu_read_lock();
> > > +	map = rcu_dereference(kvm->arch.apic_map);
> > > +	enabled = map && map->logical_mode != KVM_APIC_MODE_MAP_DISABLED;
> > > +	rcu_read_unlock();
> > > +	return enabled;
> > > +}
> > 
> > This function doesn't belong to avic, it should be in common KVM code.
> 
> I'll move it.  I'm not expecting any additional users, but I agree it belongs
> elsewhere.  Actually, might be a moot point (see below).
> 
> > > @@ -394,50 +449,27 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
> > >  		if (unlikely(!bitmap))
> > >  			return 0;
> > >  
> > > -		if (!is_power_of_2(bitmap))
> > > -			/* multiple logical destinations, use slow path */
> > > +		/*
> > > +		 * Use the slow path if more than one bit is set in the bitmap
> > > +		 * and KVM's optimized logical map is disabled to avoid kicking
> > > +		 * a vCPU multiple times.  If the optimized map is disabled, a
> > > +		 * vCPU _may_ have multiple bits set in its logical ID, i.e.
> > > +		 * may have multiple entries in the logical table.
> > > +		 */
> > > +		if (!is_power_of_2(bitmap) &&
> > > +		    !is_optimized_logical_map_enabled(kvm))
> > >  			return -EINVAL;
> > 
> > I hate to say it but there is another issue here, which I know about for a while
> > but haven't gotten yet to fix.
> > 
> > The issue is that AVIC's logical to physical map can't cover all the corner cases
> > that you discovered - it only supports the sane subset: for each cluster, and for each bit
> > in the mask, it has a physical apic id - so things like logical ids with multiple bits,
> > having same logical id for multiple vcpus and so on can't work.
> > 
> > In this case we need to either inhibit AVIC (I support this 100%),
> 
> I like the idea of inhibiting.
> 
> >  or clear its logical ID map, so all logicical IPIs VM exit, and then they
> >  can be emulated.
> > 
> > I haven't studied it formally but the code which rebuilds the AVIC's logical ID map
> > starts at 'avic_handle_ldr_update'.
> 
> I suspected there are issues here, but the new tests passed (somewhat surprisingly)
> so I stopped trying to decipher the AVIC LDR handling.
> 
> Eww.  And the VM-Exit trap logic is broken too.  If the guest updates and disables
> its LDR, SVM returns immediately and doesn't call into common APIC code, i.e. doesn't
> recalc the optimized map.  E.g. if the guest clears its LDR, the optimized map will
> be left as is and the vCPU will receive interrupts using its old LDR.
> 
> 	case APIC_LDR:
> 		if (avic_handle_ldr_update(vcpu))
> 			return 0;
> 		break;
> 
> Rather than handling this purely in AVIC code, what if we a key off of
> the optimized map being enabled?  E.g. drop the return from avic_handle_ldr_update()
> and in the kvm_recalculate_apic_map() do:
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 3b6ef36b3963..6e188010b614 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -364,6 +364,11 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>                 cluster[ldr] = apic;
>         }
>  out:
> +       if (!new || new->logical_mode == KVM_APIC_MODE_MAP_DISABLED)
> +               kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_LOGICAL_MAP_DISABLED);
> +       else
> +               kvm_clear_apicv_inhibit(kvm, APICV_INHIBIT_REASON_LOGICAL_MAP_DISABLED);
> +

This looks very good, it will even work on APICv, because the 'check_apicv_inhibit_reasons'
will not return true for this new reason (APICv IPIv I think doesn't deal with logical destination at all);

Best regards,
	Maxim Levitsky

>         old = rcu_dereference_protected(kvm->arch.apic_map,
>                         lockdep_is_held(&kvm->arch.apic_map_lock));
>         rcu_assign_pointer(kvm->arch.apic_map, new);
> 



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes
  2022-08-31  0:35 ` [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes Sean Christopherson
  2022-08-31 13:43   ` Maxim Levitsky
@ 2022-08-31 18:42   ` Maxim Levitsky
  2022-08-31 19:17     ` Sean Christopherson
  1 sibling, 1 reply; 74+ messages in thread
From: Maxim Levitsky @ 2022-08-31 18:42 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> Track all possibilities for the optimized APIC map's logical modes
> instead of overloading the pseudo-bitmap and treating any "unknown" value
> as "invalid".
> 
> As documented by the now-stale comment above the mode values, the values
> did have meaning when the optimized map was originally added.  That
> dependent logical was removed by commit e45115b62f9a ("KVM: x86: use
> physical LAPIC array for logical x2APIC"), but the obfuscated behavior
> and its comment were left behind.
> 
> Opportunistically rename "mode" to "logical_mode", partly to make it
> clear that the "disabled" case applies only to the logical map, but also
> to prove that there is no lurking code that expects "mode" to be a bitmap.
> 
> Functionally, this is a glorified nop.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 21 ++++++++++--------
>  arch/x86/kvm/lapic.c            | 38 ++++++++++++++++++++++++---------
>  2 files changed, 40 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 1f51411f3112..0184e64ab555 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -955,19 +955,22 @@ struct kvm_arch_memory_slot {
>  };
>  
>  /*
> - * We use as the mode the number of bits allocated in the LDR for the
> - * logical processor ID.  It happens that these are all powers of two.
> - * This makes it is very easy to detect cases where the APICs are
> - * configured for multiple modes; in that case, we cannot use the map and
> - * hence cannot use kvm_irq_delivery_to_apic_fast either.
> + * Track the mode of the optimized logical map, as the rules for decoding the
> + * destination vary per mode.  Enabling the optimized logical map requires all
> + * software-enabled local APIs to be in the same mode, each addressable APIC to
> + * be mapped to only one MDA, and each MDA to map to at most one APIC.
>   */
> -#define KVM_APIC_MODE_XAPIC_CLUSTER          4
> -#define KVM_APIC_MODE_XAPIC_FLAT             8
> -#define KVM_APIC_MODE_X2APIC                16
> +enum kvm_apic_logical_mode {
> +	KVM_APIC_MODE_SW_DISABLED,
> +	KVM_APIC_MODE_XAPIC_CLUSTER,
> +	KVM_APIC_MODE_XAPIC_FLAT,
> +	KVM_APIC_MODE_X2APIC,
> +	KVM_APIC_MODE_MAP_DISABLED,
> +};
>  
>  struct kvm_apic_map {
>  	struct rcu_head rcu;
> -	u8 mode;
> +	enum kvm_apic_logical_mode logical_mode;
>  	u32 max_apic_id;
>  	union {
>  		struct kvm_lapic *xapic_flat_map[8];
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 8209caffe3ab..3b6ef36b3963 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -168,7 +168,12 @@ static bool kvm_use_posted_timer_interrupt(struct kvm_vcpu *vcpu)
>  
>  static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
>  		u32 dest_id, struct kvm_lapic ***cluster, u16 *mask) {
> -	switch (map->mode) {
> +	switch (map->logical_mode) {
> +	case KVM_APIC_MODE_SW_DISABLED:
> +		/* Arbitrarily use the flat map so that @cluster isn't NULL. */
> +		*cluster = map->xapic_flat_map;
> +		*mask = 0;
> +		return true;
>  	case KVM_APIC_MODE_X2APIC: {
>  		u32 offset = (dest_id >> 16) * 16;
>  		u32 max_apic_id = map->max_apic_id;
> @@ -193,8 +198,10 @@ static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
>  		*cluster = map->xapic_cluster_map[(dest_id >> 4) & 0xf];
>  		*mask = dest_id & 0xf;
>  		return true;
> +	case KVM_APIC_MODE_MAP_DISABLED:
> +		return false;
>  	default:
> -		/* Not optimized. */
> +		WARN_ON_ONCE(1);

BTW unless I am mistaken, this warning is guest triggerable, and thus as you say when
'panic_on_warn=1', this will panic the host kernel.


Best regards,
	Maxim Levitsky


>  		return false;
>  	}
>  }
> @@ -256,10 +263,12 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  		goto out;
>  
>  	new->max_apic_id = max_id;
> +	new->logical_mode = KVM_APIC_MODE_SW_DISABLED;
>  
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		struct kvm_lapic *apic = vcpu->arch.apic;
>  		struct kvm_lapic **cluster;
> +		enum kvm_apic_logical_mode logical_mode;
>  		u32 x2apic_id, physical_id;
>  		u16 mask;
>  		u32 ldr;
> @@ -314,7 +323,8 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  			new->phys_map[physical_id] = apic;
>  		}
>  
> -		if (!kvm_apic_sw_enabled(apic))
> +		if (new->logical_mode == KVM_APIC_MODE_MAP_DISABLED ||
> +		    !kvm_apic_sw_enabled(apic))
>  			continue;
>  
>  		ldr = kvm_lapic_get_reg(apic, APIC_LDR);
> @@ -322,25 +332,33 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
>  			continue;
>  
>  		if (apic_x2apic_mode(apic)) {
> -			new->mode |= KVM_APIC_MODE_X2APIC;
> +			logical_mode = KVM_APIC_MODE_X2APIC;
>  		} else {
>  			ldr = GET_APIC_LOGICAL_ID(ldr);
>  			if (kvm_lapic_get_reg(apic, APIC_DFR) == APIC_DFR_FLAT)
> -				new->mode |= KVM_APIC_MODE_XAPIC_FLAT;
> +				logical_mode = KVM_APIC_MODE_XAPIC_FLAT;
>  			else
> -				new->mode |= KVM_APIC_MODE_XAPIC_CLUSTER;
> +				logical_mode = KVM_APIC_MODE_XAPIC_CLUSTER;
>  		}
> +		if (new->logical_mode != KVM_APIC_MODE_SW_DISABLED &&
> +		    new->logical_mode != logical_mode) {
> +			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
> +			continue;
> +		}
> +		new->logical_mode = logical_mode;
>  
> -		if (!kvm_apic_map_get_logical_dest(new, ldr, &cluster, &mask))
> +		if (WARN_ON_ONCE(!kvm_apic_map_get_logical_dest(new, ldr,
> +								&cluster, &mask))) {
> +			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
>  			continue;
> +		}
>  
>  		if (!mask)
>  			continue;
>  
>  		ldr = ffs(mask) - 1;
>  		if (!is_power_of_2(mask) || cluster[ldr]) {
> -			new->mode = KVM_APIC_MODE_XAPIC_FLAT |
> -				    KVM_APIC_MODE_XAPIC_CLUSTER;
> +			new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;
>  			continue;
>  		}
>  		cluster[ldr] = apic;
> @@ -993,7 +1011,7 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
>  {
>  	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
>  		if ((irq->dest_id == APIC_BROADCAST &&
> -				map->mode != KVM_APIC_MODE_X2APIC))
> +		     map->logical_mode != KVM_APIC_MODE_X2APIC))
>  			return true;
>  		if (irq->dest_id == X2APIC_BROADCAST)
>  			return true;



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31 17:46         ` Maxim Levitsky
  2022-08-31 17:58           ` Jim Mattson
@ 2022-08-31 19:12           ` Sean Christopherson
  2022-09-01 10:25             ` Maxim Levitsky
  1 sibling, 1 reply; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 19:12 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 16:29 +0000, Sean Christopherson wrote:
> > On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > > Leaving AVIC on, when vCPU is in x2apic mode cannot trigger extra MMIO emulation,
> > > in fact the opposite - because AVIC is on, writes to 0xFEE00xxx might *not* trigger
> > > MMIO emulation and instead be emulated by AVIC.
> > 
> > That's even worse, because KVM is allowing the guest to exercise hardware logic
> > that I highly doubt AMD has thoroughly tested.
> 
> Harware logic is exactly the same regarless of if KVM uses x2apic mode or not,
> and it is better to be prepared for all kind of garbage coming from the guest.

Right, I got twisted around and was thinking x2APIC enabling of the guest was
reflected in hardware, but it's purely emulated in this mode.

> Software logic, I can understand you, there could be registers that trap differently
> in avic and x2avic mode, but it should be *very* easy to deal with it, the list
> of registers that trap is very short.
> 
> > 
> > > Yes, some of these writes can trigger AVIC specific emulation vm exits, but they
> > > are literaly the same as those used by x2avic, and it is really hard to see
> > > why this would be dangerous (assuming that x2avic code works, and avic code
> > > is aware of this 'hybrid' mode).
> > 
> > The APIC_RRR thing triggered the KVM_BUG_ON() in kvm_apic_write_nodecode()
> > precisely because of the AVIC trap.  At best, this gives a way for the guest to
> > trigger a WARN_ON_ONCE() and thus panic the host if panic_on_warn=1.  I fixed
> > the APIC_RRR case because that will be problematic for x2AVIC, but there are
> > other APIC registers that are unsupported in x2APIC that can trigger the KVM_BUG_ON().

> > > From the guest point of view, unless the guest pokes at random MMIO area,
> > > the only case when this matters is if the guest maps RAM over the 0xFEE00xxx
> > > (which it of course can, the spec explictly state as you say that when x2apic
> > > is enabled, the mmio is disabled), and then instead of functioning as RAM,
> > > the range will still function as APIC.
> > 
> > There is no wiggle room here though, KVM is blatantly breaking the architectural
> > specification.  When x2APIC is enabled, the xAPIC MMIO does not exist.
> 
> In this case I say that there is no wiggle room for KVM to not allow
> different APIC bases on each CPU - the spec 100% allows it, but in KVM it is
> broken.

The difference is that KVM is consistent with respect to itself in that case.
KVM doesn't support APIC base relocation, and never has supported APIC base
relocation.

This hybrid AVIC mode means that the resulting KVM behavior will vary depending
on whether or not AVIC is supported and enabled, whether or not x2AVIC is supported,
and also on internal KVM state.  And we even have tests for this!  E.g. run the APIC
unit test with AVIC disabled and it passes, run it with AVIC enabled and it fails.

> If you are really hell bent on not having that MMIO exposed, then I say we
> can just disable the AVIC memslot, and keep AVIC enabled in this case - this
> should make us both happy.

I don't think that will work though, as I don't think it's possible to tell hardware
not to accelerate AVIC accesses.  I.e. KVM can squash the unaccelerated traps, but
anything that is handled by hardware will still go through.

> This discussion really makes me feel that you want just to force your opinion
> on others, and its not the first time this happens. It is really frustrating
> to work like that.  It might sound harsh but this is how I feel. Hopefully I
> am wrong.

Yes and no.  I am most definitely trying to steer KVM in a direction that I feel
will allow KVM to scale in terms of developers, users, and workloads.  Part of
that direction is to change people's mindset from "good enough" to "implement to
the spec".

KVM's historic "good enough" approach largely worked when KVM had a relatively
small and stable development community, and when KVM was being used to run a
relatively limited set of workloads.  But KVM is being used to run an ever increasing
variety of workloads, and the development community is larger (and I hope that we
can grow it even further).  And there is inevitably attrition, which means that
unless we are extremely diligent in documenting KVM's quirks/errata, knowledge of
KVM's "good enough" shortcuts will eventually be lost.

E.g. it took me literally days to understand KVM's hack for forcing #PF=>#PF=>VM-Exit
instead of #PF=>#PF=>#DF.  That mess would have been avoided if KVM had implemented
to the spec and actually done the right thing back when the bug was first encountered.
Ditto for the x2APIC ID hotplug hack; I stared at that code on at least three
different occassions before I finally understood the full impact of the hack.

And getting KVM to implement to the spec means not deviating from that path when
it's inconvenient to follow, thus the hard line I am drawing.  I am sure we'll
encounter scenarios/features where it's simply impossible for KVM to do the right
thing, e.g. I believe virtualizing VMX's posted interrupts falls into this category.
But IMO those should be very, very rare exceptions.

So, "yes" in the sense that I am openly trying to drag people into alignment with
my "implement to the spec" vision.  But "no" in the sense that I don't think it's
fair to say I am forcing my opinion on others.  I am not saying "NAK" and ending
the discussion.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes
  2022-08-31 18:42   ` Maxim Levitsky
@ 2022-08-31 19:17     ` Sean Christopherson
  0 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-08-31 19:17 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:35 +0000, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 8209caffe3ab..3b6ef36b3963 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -168,7 +168,12 @@ static bool kvm_use_posted_timer_interrupt(struct kvm_vcpu *vcpu)
> >  
> >  static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
> >  		u32 dest_id, struct kvm_lapic ***cluster, u16 *mask) {
> > -	switch (map->mode) {
> > +	switch (map->logical_mode) {
> > +	case KVM_APIC_MODE_SW_DISABLED:
> > +		/* Arbitrarily use the flat map so that @cluster isn't NULL. */
> > +		*cluster = map->xapic_flat_map;
> > +		*mask = 0;
> > +		return true;
> >  	case KVM_APIC_MODE_X2APIC: {
> >  		u32 offset = (dest_id >> 16) * 16;
> >  		u32 max_apic_id = map->max_apic_id;
> > @@ -193,8 +198,10 @@ static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
> >  		*cluster = map->xapic_cluster_map[(dest_id >> 4) & 0xf];
> >  		*mask = dest_id & 0xf;
> >  		return true;
> > +	case KVM_APIC_MODE_MAP_DISABLED:
> > +		return false;
> >  	default:
> > -		/* Not optimized. */
> > +		WARN_ON_ONCE(1);
> 
> BTW unless I am mistaken, this warning is guest triggerable, and thus as you
> say when 'panic_on_warn=1', this will panic the host kernel.

If it's guest triggerable then it's a bug in this patch.  The "default" case was
reachable with the old approach of OR-ing in bits, but the intent of this patch
is to fully enumerate all values of map->logical_mode and make the "default" case
impossible.

And I don't think it's reachable.  The case statements are:

	case KVM_APIC_MODE_SW_DISABLED:
	case KVM_APIC_MODE_X2APIC:
	case KVM_APIC_MODE_XAPIC_FLAT:
	case KVM_APIC_MODE_XAPIC_CLUSTER:
	case KVM_APIC_MODE_MAP_DISABLED:
	default:

which covers all of the possible enum values.

	enum kvm_apic_logical_mode {
		KVM_APIC_MODE_SW_DISABLED,
		KVM_APIC_MODE_XAPIC_CLUSTER,
		KVM_APIC_MODE_XAPIC_FLAT,
		KVM_APIC_MODE_X2APIC,
		KVM_APIC_MODE_MAP_DISABLED,
	};

The map is explicitly initialized to KVM_APIC_MODE_SW_DISABLED (to avoid relying
on KVM_APIC_MODE_SW_DISABLED==0)

	new->logical_mode = KVM_APIC_MODE_SW_DISABLED;

so unless I've missed a "logical_mode |= ..." somewhere, reaching "default" should
be impossible.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick
  2022-08-31  0:34 ` [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick Sean Christopherson
  2022-08-31  9:38   ` Maxim Levitsky
@ 2022-09-01  2:56   ` Li,Rongqing
  1 sibling, 0 replies; 74+ messages in thread
From: Li,Rongqing @ 2022-09-01  2:56 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Suravee Suthikulpanit, Maxim Levitsky

> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Wednesday, August 31, 2022 8:35 AM
> To: Sean Christopherson <seanjc@google.com>; Paolo Bonzini
> <pbonzini@redhat.com>
> Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; Maxim Levitsky <mlevitsk@redhat.com>;
> Li,Rongqing <lirongqing@baidu.com>
> Subject: [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC
> status for AVIC kick
> 
> Compute the destination from ICRH using the sender's x2APIC status, not each
> (potential) target's x2APIC status.
> 
> Fixes: c514d3a348ac ("KVM: SVM: Update avic_kick_target_vcpus to support
> 32-bit APIC ID")
> Cc: Li RongQing <lirongqing@baidu.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/avic.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index
> b59f8ee2671f..3ace0f2f52f0 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -441,6 +441,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm,
> struct kvm_lapic *source  static void avic_kick_target_vcpus(struct kvm *kvm,
> struct kvm_lapic *source,
>  				   u32 icrl, u32 icrh, u32 index)
>  {
> +	u32 dest = apic_x2apic_mode(source) ? icrh :
> +GET_XAPIC_DEST_FIELD(icrh);
>  	unsigned long i;
>  	struct kvm_vcpu *vcpu;
> 
> @@ -456,13 +457,6 @@ static void avic_kick_target_vcpus(struct kvm *kvm,
> struct kvm_lapic *source,
>  	 * since entered the guest will have processed pending IRQs at VMRUN.
>  	 */
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
> -		u32 dest;
> -
> -		if (apic_x2apic_mode(vcpu->arch.apic))
> -			dest = icrh;
> -		else
> -			dest = GET_XAPIC_DEST_FIELD(icrh);
> -
>  		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
>  					dest, icrl & APIC_DEST_MASK)) {
>  			vcpu->arch.apic->irr_pending = true;
> --
> 2.37.2.672.g94769d06f0-goog

Reviewed-by: Li RongQing <lirongqing@baidu.com>

Thanks.

-Li

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-08-31 19:12           ` Sean Christopherson
@ 2022-09-01 10:25             ` Maxim Levitsky
  2022-09-01 11:47               ` Jim Mattson
                                 ` (3 more replies)
  0 siblings, 4 replies; 74+ messages in thread
From: Maxim Levitsky @ 2022-09-01 10:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, 2022-08-31 at 19:12 +0000, Sean Christopherson wrote:
> On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-08-31 at 16:29 +0000, Sean Christopherson wrote:
> > > On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > > > Leaving AVIC on, when vCPU is in x2apic mode cannot trigger extra MMIO emulation,
> > > > in fact the opposite - because AVIC is on, writes to 0xFEE00xxx might *not* trigger
> > > > MMIO emulation and instead be emulated by AVIC.
> > > 
> > > That's even worse, because KVM is allowing the guest to exercise hardware logic
> > > that I highly doubt AMD has thoroughly tested.
> > 
> > Harware logic is exactly the same regarless of if KVM uses x2apic mode or not,
> > and it is better to be prepared for all kind of garbage coming from the guest.
> 
> Right, I got twisted around and was thinking x2APIC enabling of the guest was
> reflected in hardware, but it's purely emulated in this mode.

BTW, I don't know why I didn't thought about it before - it is trivial to plug the possible
emulation bugs in hybrid AVIC mode, assuming that AVIC mmio is still exposed to the guest - 
all we need to do is to refuse to do anything on each of the two AVIC vmexits if the 
AVIC is in this mode, because in this mode, the guest should never touch the MMIO.

So basically just stick teh below code in start of  

'avic_unaccelerated_access_interception()' and in 'avic_incomplete_ipi_interception()'


if (apic_x2apic_mode(vcpu->arch.apic) && avic_mode != AVIC_MODE_X2)
	return 1;

> 
> > Software logic, I can understand you, there could be registers that trap differently
> > in avic and x2avic mode, but it should be *very* easy to deal with it, the list
> > of registers that trap is very short.
> > 
> > > > Yes, some of these writes can trigger AVIC specific emulation vm exits, but they
> > > > are literaly the same as those used by x2avic, and it is really hard to see
> > > > why this would be dangerous (assuming that x2avic code works, and avic code
> > > > is aware of this 'hybrid' mode).
> > > 
> > > The APIC_RRR thing triggered the KVM_BUG_ON() in kvm_apic_write_nodecode()
> > > precisely because of the AVIC trap.  At best, this gives a way for the guest to
> > > trigger a WARN_ON_ONCE() and thus panic the host if panic_on_warn=1.  I fixed
> > > the APIC_RRR case because that will be problematic for x2AVIC, but there are
> > > other APIC registers that are unsupported in x2APIC that can trigger the KVM_BUG_ON().
> > > > From the guest point of view, unless the guest pokes at random MMIO area,
> > > > the only case when this matters is if the guest maps RAM over the 0xFEE00xxx
> > > > (which it of course can, the spec explictly state as you say that when x2apic
> > > > is enabled, the mmio is disabled), and then instead of functioning as RAM,
> > > > the range will still function as APIC.
> > > 
> > > There is no wiggle room here though, KVM is blatantly breaking the architectural
> > > specification.  When x2APIC is enabled, the xAPIC MMIO does not exist.
> > 
> > In this case I say that there is no wiggle room for KVM to not allow
> > different APIC bases on each CPU - the spec 100% allows it, but in KVM it is
> > broken.
> 
> The difference is that KVM is consistent with respect to itself in that case.
> KVM doesn't support APIC base relocation, and never has supported APIC base
> relocation.
> 
> This hybrid AVIC mode means that the resulting KVM behavior will vary depending
> on whether or not AVIC is supported and enabled, whether or not x2AVIC is supported,
> and also on internal KVM state.  And we even have tests for this!  E.g. run the APIC
> unit test with AVIC disabled and it passes, run it with AVIC enabled and it fails.
> 
> > If you are really hell bent on not having that MMIO exposed, then I say we
> > can just disable the AVIC memslot, and keep AVIC enabled in this case - this
> > should make us both happy.
> 
> I don't think that will work though, as I don't think it's possible to tell hardware
> not to accelerate AVIC accesses.  I.e. KVM can squash the unaccelerated traps, but
> anything that is handled by hardware will still go through.

Nope! Remember why do we have that APIC private memslot? It is because APIC's mmio
is accelerated by AVIC/APICv only when both conditions are true 
	1. AVIC/APICv is enabled and is in xapic mode
	2. the MMIO is mapped as read/write in the NPT/EPT (and yes, AVIC/APICv requires NPT/EPT).

For the (2) we have a private memslot that we map there and we enable it when
AVIC/APICv is enabled.



To refresh our memory, we did a hack (I have nothing against it), that we
no longer remove that memslot when AVIC/APICv is inhibited, but instead we zap
its SPTE, and next time the guest tries to access it, despite having a memslot there
we don't re-install the SPTE and instead jump to emulation.

So technically the memslot is always enabled, but the guest can't use it when AVIC/APICv is
inhibited.

However when APIC mode changes to x2apic, I don't think that we zap that SPTE.

This means that when the hardware APIC acceleration is in x2apic mode (e.g x2avic is active),
the hardware doesn't intercept the MMIO anymore, and thus if guest accesses it,
it will access read/write our private APIC memslot, since its SPTE is there, which is wrong
because the userspace didn't map memory there, not to mention that this 'extra' memory won't survive migration.


Thus regardless of avic hybrid mode we need to ensure that if one of vCPUs is in x2apic mode,
that our private memslot SPTE is zapped, and that also the page fault handler won't re-install it.


A side efect of this will be that AVIC in hybrid mode will just work and be 100% to the spec.

> 
> > This discussion really makes me feel that you want just to force your opinion
> > on others, and its not the first time this happens. It is really frustrating
> > to work like that.  It might sound harsh but this is how I feel. Hopefully I
> > am wrong.
> 
> Yes and no.  I am most definitely trying to steer KVM in a direction that I feel
> will allow KVM to scale in terms of developers, users, and workloads.  Part of
> that direction is to change people's mindset from "good enough" to "implement to
> the spec".

Sean, don't understand me wrong - I agree with you mostly.

In fact I implemented support for PDPTR migration to be 100% to the spec,
although I kind of regret it, since PDPTRs looks like not preserved over SMM entries
in hardware thus, it is really not possible to rely on them to be unsync.
Also NPT doesn't support them either.

I also was actually first to notice the exception merging issue, and I actually
worked on fixing it and posted patches upstream. 

It is also a very corner case, and yet I wanted to fix it, just to be 100% up the spec.

I also reviewed the EVENTINJ injection of software interrupts (INTn) when there
is not INTn instruction in the guest, also something that nobody in reality uses,
but I warmely welcomed it and helped with review.

I also fixed many bugs in !NPT, and !EPT cases, also just to make KVM work to the spec.

Etc, etc, etc.

I just know that nothing is absolute, in some rare cases (like different APIC base per cpu),
it is just not feasable to support the spec.

In fact remember that patch series about maximum VMCS field number?
https://lore.kernel.org/kvm/YPmI8x2Qu3ZSS5Bc@google.com/

There was actually a patch series that was fixing it, but you said, just like me,
that it is not worth it, better to have an errata in KVM, since guest should not use
this info anyway. I didn't object to it, and neither I do now, but as you see,
you also sometimes agree that going 100% to the spec is not worth it.


I hope you understand me.

TL;DR - in this case actually the most correct solution is to disable the private apic
memslot that we install over the APIC mmio, when one of the APICs is in x2apic mode.


That will not only make AVIC hybrid mode just work, but also make sure that if the
guest accesses the APIC MMIO anyway it will see no memory there.


Best regards,
	Maxim Levitsky


> 
> KVM's historic "good enough" approach largely worked when KVM had a relatively
> small and stable development community, and when KVM was being used to run a
> relatively limited set of workloads.  But KVM is being used to run an ever increasing
> variety of workloads, and the development community is larger (and I hope that we
> can grow it even further).  And there is inevitably attrition, which means that
> unless we are extremely diligent in documenting KVM's quirks/errata, knowledge of
> KVM's "good enough" shortcuts will eventually be lost.
> 
> E.g. it took me literally days to understand KVM's hack for forcing #PF=>#PF=>VM-Exit
> instead of #PF=>#PF=>#DF.  That mess would have been avoided if KVM had implemented
> to the spec and actually done the right thing back when the bug was first encountered.
> Ditto for the x2APIC ID hotplug hack; I stared at that code on at least three
> different occassions before I finally understood the full impact of the hack.
> 
> And getting KVM to implement to the spec means not deviating from that path when
> it's inconvenient to follow, thus the hard line I am drawing.  I am sure we'll
> encounter scenarios/features where it's simply impossible for KVM to do the right
> thing, e.g. I believe virtualizing VMX's posted interrupts falls into this category.
> But IMO those should be very, very rare exceptions.


> 
> So, "yes" in the sense that I am openly trying to drag people into alignment with
> my "implement to the spec" vision.  But "no" in the sense that I don't think it's
> fair to say I am forcing my opinion on others.  I am not saying "NAK" and ending
> the discussion.
> 



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-09-01 10:25             ` Maxim Levitsky
@ 2022-09-01 11:47               ` Jim Mattson
  2022-09-01 13:29               ` Sean Christopherson
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 74+ messages in thread
From: Jim Mattson @ 2022-09-01 11:47 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Suravee Suthikulpanit, Li RongQing

On Thu, Sep 1, 2022 at 3:26 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:

> I just know that nothing is absolute, in some rare cases (like different APIC base per cpu),
> it is just not feasable to support the spec.

Why isn't this feasible? We supported it when I was at VMware.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-09-01 10:25             ` Maxim Levitsky
  2022-09-01 11:47               ` Jim Mattson
@ 2022-09-01 13:29               ` Sean Christopherson
  2022-09-01 13:53               ` Sean Christopherson
  2022-09-01 15:08               ` Sean Christopherson
  3 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-09-01 13:29 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Thu, Sep 01, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 19:12 +0000, Sean Christopherson wrote:
> > On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > > If you are really hell bent on not having that MMIO exposed, then I say we
> > > can just disable the AVIC memslot, and keep AVIC enabled in this case - this
> > > should make us both happy.
> > 
> > I don't think that will work though, as I don't think it's possible to tell hardware
> > not to accelerate AVIC accesses.  I.e. KVM can squash the unaccelerated traps, but
> > anything that is handled by hardware will still go through.
> 
> Nope! Remember why do we have that APIC private memslot?

Heh, I finally understood what you were suggesting two minutes after I woke up
this morning.   Apparently I needed to sleep on it?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-09-01 10:25             ` Maxim Levitsky
  2022-09-01 11:47               ` Jim Mattson
  2022-09-01 13:29               ` Sean Christopherson
@ 2022-09-01 13:53               ` Sean Christopherson
  2022-09-01 15:08               ` Sean Christopherson
  3 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-09-01 13:53 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Thu, Sep 01, 2022, Maxim Levitsky wrote:
> There was actually a patch series that was fixing it, but you said, just like me,
> that it is not worth it, better to have an errata in KVM, since guest should not use
> this info anyway. I didn't object to it, and neither I do now, but as you see,
> you also sometimes agree that going 100% to the spec is not worth it.
> 
> 
> I hope you understand me.

Yep.

And rereading what I wrote...  I didn't intend to imply that you personally aren't
operating in "good faith" or whatever is the right terminology.  What I was trying
to explain is why I sometimes speak in absolutes.  When there is a bug/regression
and KVM is clearly violating spec, it's not a matter of opinion; KVM is broken and
needs to be fixed.

Of course, one could argue that that's an opinion in and of itself...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode"
  2022-09-01 10:25             ` Maxim Levitsky
                                 ` (2 preceding siblings ...)
  2022-09-01 13:53               ` Sean Christopherson
@ 2022-09-01 15:08               ` Sean Christopherson
  3 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-09-01 15:08 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

Sorry for forking a bunch of different threads, wanted to respond to the other
stuff before you signed off for the day.

On Thu, Sep 01, 2022, Maxim Levitsky wrote:
> However when APIC mode changes to x2apic, I don't think that we zap that
> SPTE.

Hmm, yes. 

> A side efect of this will be that AVIC in hybrid mode will just work and be
> 100% to the spec. 

And I believe we can solve the avic_set_virtual_apic_mode() issue as well.  If
x2APIC is treated as an inhibit for xAPIC but not APICv at large, then setting the
inhibit will zap the SPTEs, and toggling the inhibit will force AVIC to re-assess
its VMCB controls without needing to hook ->set_virtual_apic_mode().

Roughly what I'm thinking.  I'll try to give this a shot today.

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 164b002777cf..0b69acef2bee 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2381,7 +2381,11 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
        if (((old_value ^ value) & X2APIC_ENABLE) && (value & X2APIC_ENABLE))
                kvm_apic_set_x2apic_id(apic, vcpu->vcpu_id);
 
-       if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE)) {
+       if ((old_value ^ value) & X2APIC_ENABLE) {
+               kvm_set_or_clear_apicv_inhibit(vcpu->kvm,
+                                              APICV_INHIBIT_REASON_X2APIC,
+                                              value & X2APIC_ENABLE;
+       } else if ((old_value ^ value) & MSR_IA32_APICBASE_ENABLE) {
                kvm_vcpu_update_apicv(vcpu);
                static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
        }
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 270403610185..76d93f87071f 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4156,7 +4156,7 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
                 * when the AVIC is re-enabled.
                 */
                if (slot && slot->id == APIC_ACCESS_PAGE_PRIVATE_MEMSLOT &&
-                   !kvm_apicv_activated(vcpu->kvm))
+                   !kvm_xapic_apicv_activated(vcpu->kvm))
                        return RET_PF_EMULATE;
        }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1328326acfae..ee381d155adc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9297,11 +9297,24 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, int apicid)
        kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
 }
 
-bool kvm_apicv_activated(struct kvm *kvm)
+bool kvm_xapic_apicv_activated(struct kvm *kvm)
 {
        return (READ_ONCE(kvm->arch.apicv_inhibit_reasons) == 0);
 }
 
+bool kvm_apicv_activated(struct kvm *kvm)
+{
+       unsigned long inhibits = READ_ONCE(kvm->arch.apicv_inhibit_reasons);
+
+       /*
+        * x2APIC only needs to "inhibit" the MMIO region, all other aspects of
+        * APICv can continue to be utilized.
+        */
+       inhibits &= ~APICV_INHIBIT_REASON_X2APIC;
+
+       return !inhibits;
+}
+
 bool kvm_vcpu_apicv_activated(struct kvm_vcpu *vcpu)
 {
        ulong vm_reasons = READ_ONCE(vcpu->kvm->arch.apicv_inhibit_reasons);


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH 08/19] KVM: SVM: Remove redundant cluster calculation that also creates a shadow
  2022-08-31 10:19   ` Maxim Levitsky
@ 2022-09-01 20:02     ` Sean Christopherson
  0 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-09-01 20:02 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > Drop the redundant "cluster" calculation and its horrific shadowing in
> > the x2avic logical mode path.  The "cluster" that is declared at an outer
> > scope is derived using the exact same calculation and has performed the
> > left-shift.
> 
> Actually I think we should just revert the commit 
> 'KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible'
> 
> 
> I know that the patch that intially introduced the the avic_kick_target_vcpus_fast had
> x2apic/x2avic code, and then it was split to avoid adding x2avic code before it was merged,
> resulting in this patch to add the x2apic specific code.
> 
> But when I fixed most issues of avic_kick_target_vcpus_fast in my 
> 'KVM: x86: SVM: fix avic_kick_target_vcpus_fast', I added back x2apic code because
> it was just natural to do since I had to calculate cluster/bitmap masks anyway.
> 
> I expected this patch to be dropped because of this but I guess it was not noticed,
> or patches were merged in the wrong order.
> 
> This is the reason of shadowing, duplicate calculations/etc.

Ooooh, I completely missed that x2AVIC was already supported.  I saw the code, but
between the fixing the first bug and unwinding everything everything else I didn't
see that the end result ended up being a full revert.

So yeah, a full revert is definitely in order.

Thanks!

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes
  2022-08-31 17:53       ` Maxim Levitsky
@ 2022-09-16 18:58         ` Sean Christopherson
  0 siblings, 0 replies; 74+ messages in thread
From: Sean Christopherson @ 2022-09-16 18:58 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, kvm, linux-kernel, Suravee Suthikulpanit, Li RongQing

On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 16:56 +0000, Sean Christopherson wrote:
> > On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> > > > @@ -993,7 +1011,7 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
> > > >  {
> > > >  	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
> > > >  		if ((irq->dest_id == APIC_BROADCAST &&
> > > > -				map->mode != KVM_APIC_MODE_X2APIC))
> > > > +		     map->logical_mode != KVM_APIC_MODE_X2APIC))
> > > >  			return true;
> > > >  		if (irq->dest_id == X2APIC_BROADCAST)
> > > >  			return true;
> > > 
> > > To be honest I would put that patch first, and then do all the other patches,
> > > this way you would avoid all of the hacks they do and removed here.
> > 
> > I did it this way so that I could test this patch for correctness.  Without the
> > bug fixes in place it's not really possible to verify this patch is 100% correct.
> > 
> > I completely agree that it would be a lot easier to read/understand/review if
> > this came first, but I'd rather not sacrifice the ability to easily test this patch.
> > 
> 
> I am not 100% sure about this, but I won't argue about it, let it be.

Whelp, so much for my argument.  I'm going to bite the bullet and move this patch
first so that the fix for logical x2APIC mode[*] doesn't need to pile on the hacks.

[*] https://lore.kernel.org/all/YyTF7SsMjm+pClqh@google.com

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2022-09-16 18:58 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-31  0:34 [PATCH 00/19] KVM: x86: AVIC and local APIC fixes+cleanups Sean Christopherson
2022-08-31  0:34 ` [PATCH 01/19] KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target Sean Christopherson
2022-08-31  9:18   ` Maxim Levitsky
2022-08-31  0:34 ` [PATCH 02/19] KVM: SVM: Don't put/load AVIC when setting virtual APIC mode Sean Christopherson
2022-08-31  9:24   ` Maxim Levitsky
2022-08-31 16:33     ` Sean Christopherson
2022-08-31  0:34 ` [PATCH 03/19] Revert "KVM: SVM: Introduce hybrid-AVIC mode" Sean Christopherson
2022-08-31  5:59   ` Maxim Levitsky
2022-08-31  6:45     ` Maxim Levitsky
2022-08-31 16:29       ` Sean Christopherson
2022-08-31 17:46         ` Maxim Levitsky
2022-08-31 17:58           ` Jim Mattson
2022-08-31 18:01             ` Maxim Levitsky
2022-08-31 19:12           ` Sean Christopherson
2022-09-01 10:25             ` Maxim Levitsky
2022-09-01 11:47               ` Jim Mattson
2022-09-01 13:29               ` Sean Christopherson
2022-09-01 13:53               ` Sean Christopherson
2022-09-01 15:08               ` Sean Christopherson
2022-08-31 16:19     ` Sean Christopherson
2022-08-31 17:47       ` Maxim Levitsky
2022-08-31  0:34 ` [PATCH 04/19] KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean Sean Christopherson
2022-08-31  9:36   ` Maxim Levitsky
2022-08-31  0:34 ` [PATCH 05/19] KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick Sean Christopherson
2022-08-31  9:38   ` Maxim Levitsky
2022-09-01  2:56   ` Li,Rongqing
2022-08-31  0:34 ` [PATCH 06/19] KVM: SVM: Get x2APIC logical dest bitmap from ICRH[15:0], not ICHR[31:16] Sean Christopherson
2022-08-31  6:09   ` Maxim Levitsky
2022-08-31  9:43     ` Maxim Levitsky
2022-08-31 16:35       ` Sean Christopherson
2022-08-31 18:18         ` Maxim Levitsky
2022-08-31  0:34 ` [PATCH 07/19] KVM: SVM: Drop buggy and redundant AVIC "single logical dest" check Sean Christopherson
2022-08-31  6:19   ` Maxim Levitsky
2022-08-31 16:37     ` Sean Christopherson
2022-08-31 18:10       ` Maxim Levitsky
2022-08-31  0:34 ` [PATCH 08/19] KVM: SVM: Remove redundant cluster calculation that also creates a shadow Sean Christopherson
2022-08-31 10:19   ` Maxim Levitsky
2022-09-01 20:02     ` Sean Christopherson
2022-08-31  0:34 ` [PATCH 09/19] KVM: SVM: Drop duplicate calcuation of AVIC/x2AVIC "logical index" Sean Christopherson
2022-08-31  0:34 ` [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch Sean Christopherson
2022-08-31 10:22   ` Maxim Levitsky
2022-08-31 16:16     ` Sean Christopherson
2022-08-31 17:49       ` Maxim Levitsky
2022-08-31  0:34 ` [PATCH 11/19] KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU Sean Christopherson
2022-08-31 10:25   ` Maxim Levitsky
2022-08-31 15:08     ` Sean Christopherson
2022-08-31 18:12       ` Maxim Levitsky
2022-08-31  0:34 ` [PATCH 12/19] KVM: x86: Disable APIC logical map if logical ID covers multiple MDAs Sean Christopherson
2022-08-31 13:23   ` Maxim Levitsky
2022-08-31  0:35 ` [PATCH 13/19] KVM: x86: Disable APIC logical map if vCPUs are aliased in logical mode Sean Christopherson
2022-08-31 13:24   ` Maxim Levitsky
2022-08-31  0:35 ` [PATCH 14/19] KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs Sean Christopherson
2022-08-31 13:37   ` Maxim Levitsky
2022-08-31 16:41     ` Sean Christopherson
2022-08-31 17:51       ` Maxim Levitsky
2022-08-31  0:35 ` [PATCH 15/19] KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0 Sean Christopherson
2022-08-31 13:41   ` Maxim Levitsky
2022-08-31 16:47     ` Sean Christopherson
2022-08-31  0:35 ` [PATCH 16/19] KVM: x86: Explicitly track all possibilities for APIC map's logical modes Sean Christopherson
2022-08-31 13:43   ` Maxim Levitsky
2022-08-31 16:56     ` Sean Christopherson
2022-08-31 17:53       ` Maxim Levitsky
2022-09-16 18:58         ` Sean Christopherson
2022-08-31 18:42   ` Maxim Levitsky
2022-08-31 19:17     ` Sean Christopherson
2022-08-31  0:35 ` [PATCH 17/19] KVM: SVM: Handle multiple logical targets in AVIC kick fastpath Sean Christopherson
2022-08-31 13:57   ` Maxim Levitsky
2022-08-31 18:19     ` Sean Christopherson
2022-08-31 18:25       ` Maxim Levitsky
2022-08-31  0:35 ` [PATCH 18/19] KVM: SVM: Ignore writes to Remote Read Data on AVIC write traps Sean Christopherson
2022-08-31 10:40   ` Maxim Levitsky
2022-08-31  0:35 ` [PATCH 19/19] Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu" Sean Christopherson
2022-08-31  6:07   ` Maxim Levitsky
2022-08-31  7:03     ` Maxim Levitsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.