linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode
@ 2019-11-01 22:41 Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 01/17] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
                   ` (16 more replies)
  0 siblings, 17 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

The 'commit 67034bb9dd5e ("KVM: SVM: Add irqchip_split() checks before
enabling AVIC")' was introduced to fix miscellaneous boot-hang issues
when enable AVIC. This is mainly due to AVIC hardware doest not #vmexit
on write to LAPIC EOI register resulting in-kernel PIC and IOAPIC to
wait and do not inject new interrupts (e.g. PIT, RTC).

This limits AVIC to only work with kernel_irqchip=split mode, which is
not currently enabled by default, and also required user-space to
support split irqchip model, which might not be the case.

The goal of this series is to enable AVIC to work in both irqchip modes,
by allowing AVIC to be deactivated temporarily during runtime, and fallback
to legacy interrupt injection mode (w/ vINTR and interrupt windows)
when needed, and then re-enabled subsequently.

Similar approach is also used to handle Hyper-V SynIC in the
'commit 5c919412fe61 ("kvm/x86: Hyper-V synthetic interrupt controller")',
where APICv is permanently disabled at runtime (currently broken for
AVIC, and fixed by this series). 

This series contains serveral parts:
  * Part 1: patch 1,2
    Code clean up, refactor, and introduce helper funtions

  * Part 2: patch 3 
    Introduce APICv deactivate bits to keep track of APICv state 
    for each vm.
 
  * Part 3: patch 4-9
    Add support for activate/deactivate APICv at runtime

  * Part 4: patch 10-13:
    Add support various cases where APICv needs to be deactivated

  * Part 5: patch 14-16:
    Introduce in-kernel IOAPIC workaround for AVIC EOI

  * Part 6: path 17
    Allow enable AVIC w/ kernel_irqchip=on

Pre-requisite Patch:
  * commit b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
    (https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/
     ?h=next&id=b9c6ff94e43a0ee053e0c1d983fba1ac4953b762)

This series has been tested against v5.3 as following:
  * Booting Linux and Windows Server 2019 VMs upto 240 vcpus
    and FreeBSD upto 128 vcpus w/ qemu option "kernel-irqchip=on"
    and "-no-hpet".
  * Pass-through Intel 10GbE NIC and run netperf in the VM.

Changes from V3: (https://lkml.org/lkml/2019/9/13/871)
(Per Paolo comments) 
  * Replace struct kvm_vcpu with struct kvm in various interfaces
  * Replace KVM_REQ_APICV_ACTIVATE/DEACTIVATE with KVM_REQ_APICV_UPDATE request
  * Replace APICv state enum (introduced in V3) w/ deactivate bits to track APICv state
  * Remove kvm_apicv_eoi_accelerate() (introduced in V3)
  * Deactivate APICv when using PIT re-inject mode
  * Consolidate srcu_read_unlock/lock into svm_request_update_avic()

Suravee Suthikulpanit (17):
  kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter
  kvm: lapic: Introduce APICv update helper function
  kvm: x86: Introduce APICv deactivate bits
  kvm: x86: Add support for activate/de-activate APICv at runtime
  kvm: x86: Add APICv activate/deactivate request trace points
  kvm: x86: svm: Add support to activate/deactivate posted interrupts
  svm: Add support for setup/destroy virutal APIC backing page for AVIC
  kvm: x86: Introduce APICv pre-update hook
  svm: Add support for activate/deactivate AVIC at runtime
  kvm: x86: hyperv: Use APICv update request interface
  svm: Deactivate AVIC when launching guest with nested SVM support
  svm: Temporary deactivate AVIC during ExtINT handling
  kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  kvm: lapic: Clean up APIC predefined macros
  kvm: ioapic: Refactor kvm_ioapic_update_eoi()
  kvm: ioapic: Lazy update IOAPIC EOI
  svm: Allow AVIC with in-kernel irqchip mode

 arch/x86/include/asm/kvm_host.h |  18 ++++-
 arch/x86/kvm/hyperv.c           |   5 +-
 arch/x86/kvm/i8254.c            |  10 +++
 arch/x86/kvm/ioapic.c           | 149 +++++++++++++++++++++++++---------------
 arch/x86/kvm/lapic.c            |  35 ++++++----
 arch/x86/kvm/lapic.h            |   2 +
 arch/x86/kvm/svm.c              | 136 +++++++++++++++++++++++++++++++-----
 arch/x86/kvm/trace.h            |  19 +++++
 arch/x86/kvm/vmx/vmx.c          |   6 +-
 arch/x86/kvm/x86.c              |  61 +++++++++++++---
 10 files changed, 343 insertions(+), 98 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 01/17] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 02/17] kvm: lapic: Introduce APICv update helper function Suthikulpanit, Suravee
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Generally, APICv for all vcpus in the VM are enable/disable in the same
manner. So, get_enable_apicv() should represent APICv status of the VM
instead of each VCPU.

Modify kvm_x86_ops.get_enable_apicv() to take struct kvm as parameter
instead of struct kvm_vcpu.

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/svm.c              | 4 ++--
 arch/x86/kvm/vmx/vmx.c          | 2 +-
 arch/x86/kvm/x86.c              | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index bdc16b0..843799b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1088,7 +1088,7 @@ struct kvm_x86_ops {
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
-	bool (*get_enable_apicv)(struct kvm_vcpu *vcpu);
+	bool (*get_enable_apicv)(struct kvm *kvm);
 	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index e036807..7090306 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5143,9 +5143,9 @@ static void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
 	return;
 }
 
-static bool svm_get_enable_apicv(struct kvm_vcpu *vcpu)
+static bool svm_get_enable_apicv(struct kvm *kvm)
 {
-	return avic && irqchip_split(vcpu->kvm);
+	return avic && irqchip_split(kvm);
 }
 
 static void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c030c96..e4faa00 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3644,7 +3644,7 @@ void pt_update_intercept_for_msr(struct vcpu_vmx *vmx)
 	}
 }
 
-static bool vmx_get_enable_apicv(struct kvm_vcpu *vcpu)
+static bool vmx_get_enable_apicv(struct kvm *kvm)
 {
 	return enable_apicv;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 91602d3..2341f48 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9217,7 +9217,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 		goto fail_free_pio_data;
 
 	if (irqchip_in_kernel(vcpu->kvm)) {
-		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu);
+		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
 		r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
 		if (r < 0)
 			goto fail_mmu_destroy;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 02/17] kvm: lapic: Introduce APICv update helper function
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 01/17] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 03/17] kvm: x86: Introduce APICv deactivate bits Suthikulpanit, Suravee
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Re-factor code into a helper function for setting lapic parameters when
activate/deactivate APICv, and export the function for subsequent usage.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/lapic.c | 22 +++++++++++++++++-----
 arch/x86/kvm/lapic.h |  1 +
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index e904ff0..4654230 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2152,6 +2152,21 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 		pr_warn_once("APIC base relocation is unsupported by KVM");
 }
 
+void kvm_apic_update_apicv(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+
+	if (vcpu->arch.apicv_active) {
+		/* irr_pending is always true when apicv is activated. */
+		apic->irr_pending = true;
+		apic->isr_count = 1;
+	} else {
+		apic->irr_pending = (apic_search_irr(apic) != -1);
+		apic->isr_count = count_vectors(apic->regs + APIC_ISR);
+	}
+}
+EXPORT_SYMBOL_GPL(kvm_apic_update_apicv);
+
 void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
@@ -2194,8 +2209,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 		kvm_lapic_set_reg(apic, APIC_ISR + 0x10 * i, 0);
 		kvm_lapic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
 	}
-	apic->irr_pending = vcpu->arch.apicv_active;
-	apic->isr_count = vcpu->arch.apicv_active ? 1 : 0;
+	kvm_apic_update_apicv(vcpu);
 	apic->highest_isr_cache = -1;
 	update_divide_count(apic);
 	atomic_set(&apic->lapic_timer.pending, 0);
@@ -2454,9 +2468,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
 	update_divide_count(apic);
 	start_apic_timer(apic);
-	apic->irr_pending = true;
-	apic->isr_count = vcpu->arch.apicv_active ?
-				1 : count_vectors(apic->regs + APIC_ISR);
+	kvm_apic_update_apicv(vcpu);
 	apic->highest_isr_cache = -1;
 	if (vcpu->arch.apicv_active) {
 		kvm_x86_ops->apicv_post_state_restore(vcpu);
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 50053d2..36a5271 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -91,6 +91,7 @@ bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
 int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
 		     struct dest_map *dest_map);
 int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
+void kvm_apic_update_apicv(struct kvm_vcpu *vcpu);
 
 bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
 		struct kvm_lapic_irq *irq, int *r, struct dest_map *dest_map);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 03/17] kvm: x86: Introduce APICv deactivate bits
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 01/17] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 02/17] kvm: lapic: Introduce APICv update helper function Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-02  9:51   ` Paolo Bonzini
  2019-11-01 22:41 ` [PATCH v4 04/17] kvm: x86: Add support for activate/de-activate APICv at runtime Suthikulpanit, Suravee
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Currently, after a VM boots with APICv enabled, it could be deactivated
due to various reasons (e.g. Hyper-v synic).

Introduce KVM APICv deactivate bits along with a new variable
struct kvm_arch.apicv_deact_msk to help keep track of why APICv is
deactivated for each VM.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  5 +++++
 arch/x86/kvm/svm.c              |  3 +++
 arch/x86/kvm/vmx/vmx.c          |  4 ++++
 arch/x86/kvm/x86.c              | 22 +++++++++++++++++++++-
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 843799b..1c05363 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -852,6 +852,8 @@ enum kvm_irqchip_mode {
 	KVM_IRQCHIP_SPLIT,        /* created with KVM_CAP_SPLIT_IRQCHIP */
 };
 
+#define APICV_DEACT_BIT_DISABLE    0
+
 struct kvm_arch {
 	unsigned long n_used_mmu_pages;
 	unsigned long n_requested_mmu_pages;
@@ -881,6 +883,7 @@ struct kvm_arch {
 	struct kvm_apic_map *apic_map;
 
 	bool apic_access_page_done;
+	unsigned long apicv_deact_msk;
 
 	gpa_t wall_clock;
 
@@ -1416,6 +1419,8 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 				struct x86_exception *exception);
 
 void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
+bool kvm_apicv_activated(struct kvm *kvm);
+void kvm_apicv_init(struct kvm *kvm, bool enable);
 
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7090306..a0caf66 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1985,6 +1985,9 @@ static int avic_vm_init(struct kvm *kvm)
 	hash_add(svm_vm_data_hash, &kvm_svm->hnode, kvm_svm->avic_vm_id);
 	spin_unlock_irqrestore(&svm_vm_data_hash_lock, flags);
 
+	/* Enable KVM APICv support */
+	kvm_apicv_init(kvm, true);
+
 	return 0;
 
 free_avic:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e4faa00..28b97fb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6775,6 +6775,10 @@ static int vmx_vm_init(struct kvm *kvm)
 			break;
 		}
 	}
+
+	/* Enable KVM APICv support */
+	kvm_apicv_init(kvm, vmx_get_enable_apicv(kvm));
+
 	return 0;
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2341f48..70a70a1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7201,6 +7201,21 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
 	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
 }
 
+bool kvm_apicv_activated(struct kvm *kvm)
+{
+	return (READ_ONCE(kvm->arch.apicv_deact_msk) == 0);
+}
+EXPORT_SYMBOL_GPL(kvm_apicv_activated);
+
+void kvm_apicv_init(struct kvm *kvm, bool enable)
+{
+	if (enable)
+		clear_bit(APICV_DEACT_BIT_DISABLE, &kvm->arch.apicv_deact_msk);
+	else
+		set_bit(APICV_DEACT_BIT_DISABLE, &kvm->arch.apicv_deact_msk);
+}
+EXPORT_SYMBOL_GPL(kvm_apicv_init);
+
 static void kvm_sched_yield(struct kvm *kvm, unsigned long dest_id)
 {
 	struct kvm_vcpu *target = NULL;
@@ -9217,13 +9232,15 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 		goto fail_free_pio_data;
 
 	if (irqchip_in_kernel(vcpu->kvm)) {
-		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
 		r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
 		if (r < 0)
 			goto fail_mmu_destroy;
 	} else
 		static_key_slow_inc(&kvm_no_apic_vcpu);
 
+	if (irqchip_in_kernel(vcpu->kvm) && kvm_apicv_activated(vcpu->kvm))
+		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
+
 	vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4,
 				       GFP_KERNEL_ACCOUNT);
 	if (!vcpu->arch.mce_banks) {
@@ -9322,6 +9339,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	kvm_page_track_init(kvm);
 	kvm_mmu_init_vm(kvm);
 
+	/* Default to APICv disable */
+	kvm_apicv_init(kvm, false);
+
 	if (kvm_x86_ops->vm_init)
 		return kvm_x86_ops->vm_init(kvm);
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 04/17] kvm: x86: Add support for activate/de-activate APICv at runtime
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (2 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 03/17] kvm: x86: Introduce APICv deactivate bits Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-02  9:52   ` Paolo Bonzini
  2019-11-01 22:41 ` [PATCH v4 05/17] kvm: x86: Add APICv activate/deactivate request trace points Suthikulpanit, Suravee
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Certain runtime conditions require APICv to be temporary deactivated.
However, current implementation only support permanently deactivate
APICv at runtime (mainly used when running Hyper-V synic).

In addition, for AMD, when activate / deactivate APICv during runtime,
all vcpus in the VM has to be operating in the same APICv mode, which
requires the requesting (main) vcpu to notify others.

So, introduce the following:
 * A new KVM_REQ_APICV_UPDATE request bit
 * Interfaces to request all vcpus to update (activate/deactivate) APICv
 * Interface to update APICV-related parameters for each vcpu

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  5 +++++
 arch/x86/kvm/x86.c              | 30 ++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1c05363..3b94f42 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -78,6 +78,8 @@
 #define KVM_REQ_HV_STIMER		KVM_ARCH_REQ(22)
 #define KVM_REQ_LOAD_EOI_EXITMAP	KVM_ARCH_REQ(23)
 #define KVM_REQ_GET_VMCS12_PAGES	KVM_ARCH_REQ(24)
+#define KVM_REQ_APICV_UPDATE \
+	KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -1421,6 +1423,9 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
 bool kvm_apicv_activated(struct kvm *kvm);
 void kvm_apicv_init(struct kvm *kvm, bool enable);
+void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
+void kvm_request_apicv_update(struct kvm *kvm, bool activate,
+			      unsigned long bit);
 
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 70a70a1..394695a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -26,6 +26,7 @@
 #include "cpuid.h"
 #include "pmu.h"
 #include "hyperv.h"
+#include "lapic.h"
 
 #include <linux/clocksource.h>
 #include <linux/interrupt.h>
@@ -7730,6 +7731,33 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
 	kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
 }
 
+void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
+{
+	if (!lapic_in_kernel(vcpu))
+		return;
+
+	vcpu->arch.apicv_active = kvm_apicv_activated(vcpu->kvm);
+	kvm_apic_update_apicv(vcpu);
+	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_update_apicv);
+
+void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
+{
+	if (activate) {
+		if (!test_and_clear_bit(bit, &kvm->arch.apicv_deact_msk) ||
+		    !kvm_apicv_activated(kvm))
+			return;
+	} else {
+		if (test_and_set_bit(bit, &kvm->arch.apicv_deact_msk) ||
+		    kvm_apicv_activated(kvm))
+			return;
+	}
+
+	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
+}
+EXPORT_SYMBOL_GPL(kvm_request_apicv_update);
+
 static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 {
 	if (!kvm_apic_present(vcpu))
@@ -7916,6 +7944,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		 */
 		if (kvm_check_request(KVM_REQ_HV_STIMER, vcpu))
 			kvm_hv_process_stimers(vcpu);
+		if (kvm_check_request(KVM_REQ_APICV_UPDATE, vcpu))
+			kvm_vcpu_update_apicv(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 05/17] kvm: x86: Add APICv activate/deactivate request trace points
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (3 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 04/17] kvm: x86: Add support for activate/de-activate APICv at runtime Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 06/17] kvm: x86: svm: Add support to activate/deactivate posted interrupts Suthikulpanit, Suravee
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Add trace points when sending request to activate/deactivate APICv.

Suggested-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/trace.h | 19 +++++++++++++++++++
 arch/x86/kvm/x86.c   |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index b5c831e..3bfc6b5 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1297,6 +1297,25 @@
 		  __entry->vcpu_id, __entry->timer_index)
 );
 
+TRACE_EVENT(kvm_apicv_update_request,
+	    TP_PROTO(bool activate, unsigned long bit),
+	    TP_ARGS(activate, bit),
+
+	TP_STRUCT__entry(
+		__field(bool, activate)
+		__field(unsigned long, bit)
+	),
+
+	TP_fast_assign(
+		__entry->activate = activate;
+		__entry->bit = bit;
+	),
+
+	TP_printk("%s bit=%lu",
+		  __entry->activate ? "activate" : "deactivate",
+		  __entry->bit)
+);
+
 /*
  * Tracepoint for AMD AVIC
  */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 394695a..4fab93e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7754,6 +7754,7 @@ void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
 			return;
 	}
 
+	trace_kvm_apicv_update_request(activate, bit);
 	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
 }
 EXPORT_SYMBOL_GPL(kvm_request_apicv_update);
@@ -10145,3 +10146,4 @@ bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_pi_irte_update);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_unaccelerated_access);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_incomplete_ipi);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_apicv_update_request);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 06/17] kvm: x86: svm: Add support to activate/deactivate posted interrupts
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (4 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 05/17] kvm: x86: Add APICv activate/deactivate request trace points Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 07/17] svm: Add support for setup/destroy virutal APIC backing page for AVIC Suthikulpanit, Suravee
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Introduce interface for activate/deactivate posted interrupts, and
implement SVM hooks to toggle AMD IOMMU guest virtual APIC mode.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a0caf66..b7d0adc 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5159,17 +5159,52 @@ static void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
 {
 }
 
+static int svm_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
+{
+	int ret = 0;
+	unsigned long flags;
+	struct amd_svm_iommu_ir *ir;
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (!kvm_arch_has_assigned_device(vcpu->kvm))
+		return 0;
+
+	/*
+	 * Here, we go through the per-vcpu ir_list to update all existing
+	 * interrupt remapping table entry targeting this vcpu.
+	 */
+	spin_lock_irqsave(&svm->ir_list_lock, flags);
+
+	if (list_empty(&svm->ir_list))
+		goto out;
+
+	list_for_each_entry(ir, &svm->ir_list, node) {
+		if (activate)
+			ret = amd_iommu_activate_guest_mode(ir->data);
+		else
+			ret = amd_iommu_deactivate_guest_mode(ir->data);
+		if (ret)
+			break;
+	}
+out:
+	spin_unlock_irqrestore(&svm->ir_list_lock, flags);
+	return ret;
+}
+
 /* Note: Currently only used by Hyper-V. */
 static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct vmcb *vmcb = svm->vmcb;
+	bool activated = kvm_vcpu_apicv_active(vcpu);
 
-	if (kvm_vcpu_apicv_active(vcpu))
+	if (activated)
 		vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
 	else
 		vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
 	mark_dirty(vmcb, VMCB_AVIC);
+
+	svm_set_pi_irte_mode(vcpu, activated);
 }
 
 static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 07/17] svm: Add support for setup/destroy virutal APIC backing page for AVIC
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (5 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 06/17] kvm: x86: svm: Add support to activate/deactivate posted interrupts Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-04 21:53   ` Roman Kagan
  2019-11-01 22:41 ` [PATCH v4 08/17] kvm: x86: Introduce APICv pre-update hook Suthikulpanit, Suravee
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Re-factor avic_init_access_page() to avic_update_access_page() since
activate/deactivate AVIC requires setting/unsetting the memory region used
for virtual APIC backing page (APIC_ACCESS_PAGE_PRIVATE_MEMSLOT).

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b7d0adc..46842a2 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1668,23 +1668,22 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
  * field of the VMCB. Therefore, we set up the
  * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (4KB) here.
  */
-static int avic_init_access_page(struct kvm_vcpu *vcpu)
+static int avic_update_access_page(struct kvm *kvm, bool activate)
 {
-	struct kvm *kvm = vcpu->kvm;
 	int ret = 0;
 
 	mutex_lock(&kvm->slots_lock);
-	if (kvm->arch.apic_access_page_done)
+	if (kvm->arch.apic_access_page_done == activate)
 		goto out;
 
 	ret = __x86_set_memory_region(kvm,
 				      APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
 				      APIC_DEFAULT_PHYS_BASE,
-				      PAGE_SIZE);
+				      activate ? PAGE_SIZE : 0);
 	if (ret)
 		goto out;
 
-	kvm->arch.apic_access_page_done = true;
+	kvm->arch.apic_access_page_done = activate;
 out:
 	mutex_unlock(&kvm->slots_lock);
 	return ret;
@@ -1697,7 +1696,7 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
 	int id = vcpu->vcpu_id;
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	ret = avic_init_access_page(vcpu);
+	ret = avic_update_access_page(vcpu->kvm, true);
 	if (ret)
 		return ret;
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 08/17] kvm: x86: Introduce APICv pre-update hook
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (6 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 07/17] svm: Add support for setup/destroy virutal APIC backing page for AVIC Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-04 22:05   ` Roman Kagan
  2019-11-01 22:41 ` [PATCH v4 09/17] svm: Add support for activate/deactivate AVIC at runtime Suthikulpanit, Suravee
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

AMD SVM AVIC needs to update APIC backing page mapping before changing
APICv mode. Introduce struct kvm_x86_ops.pre_update_apicv_exec_ctrl
function hook to be called prior KVM APICv update request to each vcpu.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/svm.c              | 6 ++++++
 arch/x86/kvm/x86.c              | 2 ++
 3 files changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3b94f42..f93d347 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1094,6 +1094,7 @@ struct kvm_x86_ops {
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
 	bool (*get_enable_apicv)(struct kvm *kvm);
+	void (*pre_update_apicv_exec_ctrl)(struct kvm *kvm, bool activate);
 	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 46842a2..21203a6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7230,6 +7230,11 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 	return false;
 }
 
+static void svm_pre_update_apicv_exec_ctrl(struct kvm *kvm, bool activate)
+{
+	avic_update_access_page(kvm, activate);
+}
+
 static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -7307,6 +7312,7 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 	.set_virtual_apic_mode = svm_set_virtual_apic_mode,
 	.get_enable_apicv = svm_get_enable_apicv,
 	.refresh_apicv_exec_ctrl = svm_refresh_apicv_exec_ctrl,
+	.pre_update_apicv_exec_ctrl = svm_pre_update_apicv_exec_ctrl,
 	.load_eoi_exitmap = svm_load_eoi_exitmap,
 	.hwapic_irr_update = svm_hwapic_irr_update,
 	.hwapic_isr_update = svm_hwapic_isr_update,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4fab93e..c09ff78 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7755,6 +7755,8 @@ void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
 	}
 
 	trace_kvm_apicv_update_request(activate, bit);
+	if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
+		kvm_x86_ops->pre_update_apicv_exec_ctrl(kvm, activate);
 	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
 }
 EXPORT_SYMBOL_GPL(kvm_request_apicv_update);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 09/17] svm: Add support for activate/deactivate AVIC at runtime
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (7 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 08/17] kvm: x86: Introduce APICv pre-update hook Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 10/17] kvm: x86: hyperv: Use APICv update request interface Suthikulpanit, Suravee
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Add necessary logics for supporting activate/deactivate AVIC at runtime.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 21203a6..5b90458 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -388,6 +388,7 @@ struct amd_svm_iommu_ir {
 static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
 static void svm_complete_interrupts(struct vcpu_svm *svm);
+static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
 
 static int nested_svm_exit_handled(struct vcpu_svm *svm);
 static int nested_svm_intercept(struct vcpu_svm *svm);
@@ -1485,7 +1486,10 @@ static void avic_init_vmcb(struct vcpu_svm *svm)
 	vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
 	vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
 	vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
-	vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+	if (kvm_apicv_activated(svm->vcpu.kvm))
+		vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+	else
+		vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
 }
 
 static void init_vmcb(struct vcpu_svm *svm)
@@ -1696,7 +1700,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
 	int id = vcpu->vcpu_id;
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	ret = avic_update_access_page(vcpu->kvm, true);
+	if (kvm_apicv_activated(vcpu->kvm))
+		ret = avic_update_access_page(vcpu->kvm, true);
 	if (ret)
 		return ret;
 
@@ -2188,7 +2193,8 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
 	/* We initialize this flag to true to make sure that the is_running
 	 * bit would be set the first time the vcpu is loaded.
 	 */
-	svm->avic_is_running = true;
+	if (irqchip_in_kernel(kvm) && kvm_apicv_activated(kvm))
+		svm->avic_is_running = true;
 
 	svm->nested.hsave = page_address(hsave_page);
 
@@ -2325,6 +2331,8 @@ static void svm_vcpu_blocking(struct kvm_vcpu *vcpu)
 
 static void svm_vcpu_unblocking(struct kvm_vcpu *vcpu)
 {
+	if (kvm_check_request(KVM_REQ_APICV_UPDATE, vcpu))
+		kvm_vcpu_update_apicv(vcpu);
 	avic_set_running(vcpu, true);
 }
 
@@ -5190,17 +5198,25 @@ static int svm_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
 	return ret;
 }
 
-/* Note: Currently only used by Hyper-V. */
 static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct vmcb *vmcb = svm->vmcb;
 	bool activated = kvm_vcpu_apicv_active(vcpu);
 
-	if (activated)
+	if (activated) {
+		/**
+		 * During AVIC temporary deactivation, guest could update
+		 * APIC ID, DFR and LDR registers, which would not be trapped
+		 * by avic_unaccelerated_access_interception(). In this case,
+		 * we need to check and update the AVIC logical APIC ID table
+		 * accordingly before re-activating.
+		 */
+		avic_post_state_restore(vcpu);
 		vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
-	else
+	} else {
 		vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+	}
 	mark_dirty(vmcb, VMCB_AVIC);
 
 	svm_set_pi_irte_mode(vcpu, activated);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 10/17] kvm: x86: hyperv: Use APICv update request interface
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (8 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 09/17] svm: Add support for activate/deactivate AVIC at runtime Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 11/17] svm: Deactivate AVIC when launching guest with nested SVM support Suthikulpanit, Suravee
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Since disabling APICv has to be done for all vcpus on AMD-based
system, adopt the newly introduced kvm_request_apicv_update()
interface, and introduce a new APICv deactivate bit for Hyper-V.

Also, remove the kvm_vcpu_deactivate_apicv() since no longer used.

Cc: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/hyperv.c           |  5 +++--
 arch/x86/kvm/x86.c              | 13 -------------
 3 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f93d347..a6475fd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -855,6 +855,7 @@ enum kvm_irqchip_mode {
 };
 
 #define APICV_DEACT_BIT_DISABLE    0
+#define APICV_DEACT_BIT_HYPERV     1
 
 struct kvm_arch {
 	unsigned long n_used_mmu_pages;
@@ -1421,7 +1422,6 @@ gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva,
 gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 				struct x86_exception *exception);
 
-void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
 bool kvm_apicv_activated(struct kvm *kvm);
 void kvm_apicv_init(struct kvm *kvm, bool enable);
 void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index fff790a..aa93b46 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -772,9 +772,10 @@ int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages)
 
 	/*
 	 * Hyper-V SynIC auto EOI SINT's are
-	 * not compatible with APICV, so deactivate APICV
+	 * not compatible with APICV, so request
+	 * to deactivate APICV permanently.
 	 */
-	kvm_vcpu_deactivate_apicv(vcpu);
+	kvm_request_apicv_update(vcpu->kvm, false, APICV_DEACT_BIT_HYPERV);
 	synic->active = true;
 	synic->dont_zero_synic_pages = dont_zero_synic_pages;
 	return 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c09ff78..0aa2833 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7189,19 +7189,6 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
 	kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
 }
 
-void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
-{
-	if (!lapic_in_kernel(vcpu)) {
-		WARN_ON_ONCE(vcpu->arch.apicv_active);
-		return;
-	}
-	if (!vcpu->arch.apicv_active)
-		return;
-
-	vcpu->arch.apicv_active = false;
-	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
-}
-
 bool kvm_apicv_activated(struct kvm *kvm)
 {
 	return (READ_ONCE(kvm->arch.apicv_deact_msk) == 0);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 11/17] svm: Deactivate AVIC when launching guest with nested SVM support
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (9 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 10/17] kvm: x86: hyperv: Use APICv update request interface Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 12/17] svm: Temporary deactivate AVIC during ExtINT handling Suthikulpanit, Suravee
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Since AVIC does not currently work w/ nested virtualization,
deactivate AVIC for the guest if setting CPUID Fn80000001_ECX[SVM]
(i.e. indicate support for SVM, which is needed for nested virtualization).

Suggested-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/svm.c              | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a6475fd..55d6476 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -856,6 +856,7 @@ enum kvm_irqchip_mode {
 
 #define APICV_DEACT_BIT_DISABLE    0
 #define APICV_DEACT_BIT_HYPERV     1
+#define APICV_DEACT_BIT_NESTED     2
 
 struct kvm_arch {
 	unsigned long n_used_mmu_pages;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 5b90458..7f59b1a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5984,6 +5984,14 @@ static void svm_cpuid_update(struct kvm_vcpu *vcpu)
 		return;
 
 	guest_cpuid_clear(vcpu, X86_FEATURE_X2APIC);
+
+	/*
+	 * Currently, AVIC does not work with nested virtualization.
+	 * So, we disable AVIC when cpuid for SVM is set in the L1 guest.
+	 */
+	if (nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))
+		kvm_request_apicv_update(vcpu->kvm, false,
+					 APICV_DEACT_BIT_NESTED);
 }
 
 static void svm_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 12/17] svm: Temporary deactivate AVIC during ExtINT handling
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (10 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 11/17] svm: Deactivate AVIC when launching guest with nested SVM support Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-02 10:01   ` Paolo Bonzini
  2019-11-01 22:41 ` [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode Suthikulpanit, Suravee
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
deactivated and fall back to using legacy interrupt injection via vINTR
and interrupt window.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/svm.c              | 37 ++++++++++++++++++++++++++++++++++---
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 55d6476..fe61269 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -857,6 +857,7 @@ enum kvm_irqchip_mode {
 #define APICV_DEACT_BIT_DISABLE    0
 #define APICV_DEACT_BIT_HYPERV     1
 #define APICV_DEACT_BIT_NESTED     2
+#define APICV_DEACT_BIT_IRQWIN     3
 
 struct kvm_arch {
 	unsigned long n_used_mmu_pages;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7f59b1a..0e7ff04 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -388,6 +388,8 @@ struct amd_svm_iommu_ir {
 static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
 static void svm_complete_interrupts(struct vcpu_svm *svm);
+static void svm_request_update_avic(struct kvm_vcpu *vcpu, bool activate);
+static bool svm_get_enable_apicv(struct kvm *kvm);
 static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
 
 static int nested_svm_exit_handled(struct vcpu_svm *svm);
@@ -4479,6 +4481,15 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
 {
 	kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 	svm_clear_vintr(svm);
+
+	/*
+	 * For AVIC, the only reason to end up here is ExtINTs.
+	 * In this case AVIC was temporarily disabled for
+	 * requesting the IRQ window and we have to re-enable it.
+	 */
+	if (svm_get_enable_apicv(svm->vcpu.kvm))
+		svm_request_update_avic(&svm->vcpu, true);
+
 	svm->vmcb->control.int_ctl &= ~V_IRQ_MASK;
 	mark_dirty(svm->vmcb, VMCB_INTR);
 	++svm->vcpu.stat.irq_window_exits;
@@ -5166,6 +5177,21 @@ static void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
 {
 }
 
+static void svm_request_update_avic(struct kvm_vcpu *vcpu, bool activate)
+{
+	if (!lapic_in_kernel(vcpu))
+		return;
+	/*
+	 * kvm_request_apicv_update() expects a prior read unlock
+	 * on the the kvm->srcu since it subsequently calls read lock
+	 * and re-unlock in __x86_set_memory_region()
+	 * when updating APIC_ACCESS_PAGE_PRIVATE_MEMSLOT.
+	 */
+	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
+	kvm_request_apicv_update(vcpu->kvm, activate, APICV_DEACT_BIT_IRQWIN);
+	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+}
+
 static int svm_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
 {
 	int ret = 0;
@@ -5504,9 +5530,6 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	if (kvm_vcpu_apicv_active(vcpu))
-		return;
-
 	/*
 	 * In case GIF=0 we can't rely on the CPU to tell us when GIF becomes
 	 * 1, because that's a separate STGI/VMRUN intercept.  The next time we
@@ -5516,6 +5539,14 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
 	 * window under the assumption that the hardware will set the GIF.
 	 */
 	if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
+		/*
+		 * IRQ window is not needed when AVIC is enabled,
+		 * unless we have pending ExtINT since it cannot be injected
+		 * via AVIC. In such case, we need to temporarily disable AVIC,
+		 * and fallback to injecting IRQ via V_IRQ.
+		 */
+		if (kvm_vcpu_apicv_active(vcpu))
+			svm_request_update_avic(vcpu, false);
 		svm_set_vintr(svm);
 		svm_inject_irq(svm, 0x0);
 	}
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (11 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 12/17] svm: Temporary deactivate AVIC during ExtINT handling Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-02  9:57   ` Paolo Bonzini
  2019-11-01 22:41 ` [PATCH v4 14/17] kvm: lapic: Clean up APIC predefined macros Suthikulpanit, Suravee
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

AMD SVM AVIC accelerates EOI write and does not trap. This causes
in-kernel PIT re-injection mode to fail since it relies on irq-ack
notifier mechanism. So, APICv is activated only when in-kernel PIT
is in discard mode e.g. w/ qemu option:

  -global kvm-pit.lost_tick_policy=discard

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/i8254.c            | 10 ++++++++++
 arch/x86/kvm/svm.c              |  8 +++++++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fe61269..460f7a4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -858,6 +858,7 @@ enum kvm_irqchip_mode {
 #define APICV_DEACT_BIT_HYPERV     1
 #define APICV_DEACT_BIT_NESTED     2
 #define APICV_DEACT_BIT_IRQWIN     3
+#define APICV_DEACT_BIT_PIT_REINJ  4
 
 struct kvm_arch {
 	unsigned long n_used_mmu_pages;
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 4a6dc54..3f77fda 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -295,12 +295,22 @@ void kvm_pit_set_reinject(struct kvm_pit *pit, bool reinject)
 	if (atomic_read(&ps->reinject) == reinject)
 		return;
 
+	/*
+	 * AMD SVM AVIC accelerates EOI write and does not trap.
+	 * This cause in-kernel PIT re-inject mode to fail
+	 * since it checks ps->irq_ack before kvm_set_irq()
+	 * and relies on the ack notifier to timely queue
+	 * the pt->worker work iterm and reinject the missed tick.
+	 * So, deactivate APICv when PIT is in reinject mode.
+	 */
 	if (reinject) {
+		kvm_request_apicv_update(kvm, false, APICV_DEACT_BIT_PIT_REINJ);
 		/* The initial state is preserved while ps->reinject == 0. */
 		kvm_pit_reset_reinject(pit);
 		kvm_register_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
 		kvm_register_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
 	} else {
+		kvm_request_apicv_update(kvm, true, APICV_DEACT_BIT_PIT_REINJ);
 		kvm_unregister_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
 		kvm_unregister_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
 	}
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0e7ff04..9812feb 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1679,7 +1679,13 @@ static int avic_update_access_page(struct kvm *kvm, bool activate)
 	int ret = 0;
 
 	mutex_lock(&kvm->slots_lock);
-	if (kvm->arch.apic_access_page_done == activate)
+	/*
+	 * During kvm_destroy_vm(), kvm_pit_set_reinject() could trigger
+	 * APICv mode change, which update APIC_ACCESS_PAGE_PRIVATE_MEMSLOT
+	 * memory region. So, we need to ensure that kvm->mm == current->mm.
+	 */
+	if ((kvm->arch.apic_access_page_done == activate) ||
+	    (kvm->mm != current->mm))
 		goto out;
 
 	ret = __x86_set_memory_region(kvm,
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 14/17] kvm: lapic: Clean up APIC predefined macros
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (12 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 15/17] kvm: ioapic: Refactor kvm_ioapic_update_eoi() Suthikulpanit, Suravee
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Move these duplicated predefined macros to the header file so that
it can be re-used in other places.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/lapic.c | 13 +++++--------
 arch/x86/kvm/lapic.h |  1 +
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 4654230..07e154e 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -56,9 +56,6 @@
 #define APIC_VERSION			(0x14UL | ((KVM_APIC_LVT_NUM - 1) << 16))
 #define LAPIC_MMIO_LENGTH		(1 << 12)
 /* followed define is not in apicdef.h */
-#define APIC_SHORT_MASK			0xc0000
-#define APIC_DEST_NOSHORT		0x0
-#define APIC_DEST_MASK			0x800
 #define MAX_APIC_VECTOR			256
 #define APIC_VECTORS_PER_REG		32
 
@@ -575,9 +572,9 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
 	irq.level = (icr & APIC_INT_ASSERT) != 0;
 	irq.trig_mode = icr & APIC_INT_LEVELTRIG;
 
-	if (icr & APIC_DEST_MASK)
+	if (icr & KVM_APIC_DEST_MASK)
 		return -KVM_EINVAL;
-	if (icr & APIC_SHORT_MASK)
+	if (icr & KVM_APIC_SHORT_MASK)
 		return -KVM_EINVAL;
 
 	rcu_read_lock();
@@ -808,7 +805,7 @@ bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
 
 	ASSERT(target);
 	switch (short_hand) {
-	case APIC_DEST_NOSHORT:
+	case KVM_APIC_DEST_NOSHORT:
 		if (dest_mode == APIC_DEST_PHYSICAL)
 			return kvm_apic_match_physical_addr(target, mda);
 		else
@@ -1206,10 +1203,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
 
 	irq.vector = icr_low & APIC_VECTOR_MASK;
 	irq.delivery_mode = icr_low & APIC_MODE_MASK;
-	irq.dest_mode = icr_low & APIC_DEST_MASK;
+	irq.dest_mode = icr_low & KVM_APIC_DEST_MASK;
 	irq.level = (icr_low & APIC_INT_ASSERT) != 0;
 	irq.trig_mode = icr_low & APIC_INT_LEVELTRIG;
-	irq.shorthand = icr_low & APIC_SHORT_MASK;
+	irq.shorthand = icr_low & KVM_APIC_SHORT_MASK;
 	irq.msi_redir_hint = false;
 	if (apic_x2apic_mode(apic))
 		irq.dest_id = icr_high;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 36a5271..ba13c98e 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -10,6 +10,7 @@
 #define KVM_APIC_SIPI		1
 #define KVM_APIC_LVT_NUM	6
 
+#define KVM_APIC_DEST_NOSHORT	0x0
 #define KVM_APIC_SHORT_MASK	0xc0000
 #define KVM_APIC_DEST_MASK	0x800
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 15/17] kvm: ioapic: Refactor kvm_ioapic_update_eoi()
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (13 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 14/17] kvm: lapic: Clean up APIC predefined macros Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 16/17] kvm: ioapic: Lazy update IOAPIC EOI Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 17/17] svm: Allow AVIC with in-kernel irqchip mode Suthikulpanit, Suravee
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Refactor code for handling IOAPIC EOI for subsequent patch.
There is no functional change.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/ioapic.c | 110 +++++++++++++++++++++++++-------------------------
 1 file changed, 56 insertions(+), 54 deletions(-)

diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index d859ae8..c57b7bb 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -151,10 +151,16 @@ static void kvm_rtc_eoi_tracking_restore_all(struct kvm_ioapic *ioapic)
 	    __rtc_irq_eoi_tracking_restore_one(vcpu);
 }
 
-static void rtc_irq_eoi(struct kvm_ioapic *ioapic, struct kvm_vcpu *vcpu)
+static void rtc_irq_eoi(struct kvm_ioapic *ioapic, struct kvm_vcpu *vcpu,
+			int vector)
 {
-	if (test_and_clear_bit(vcpu->vcpu_id,
-			       ioapic->rtc_status.dest_map.map)) {
+	struct dest_map *dest_map = &ioapic->rtc_status.dest_map;
+
+	/* RTC special handling */
+	if (test_bit(vcpu->vcpu_id, dest_map->map) &&
+	    (vector == dest_map->vectors[vcpu->vcpu_id]) &&
+	    (test_and_clear_bit(vcpu->vcpu_id,
+				ioapic->rtc_status.dest_map.map))) {
 		--ioapic->rtc_status.pending_eoi;
 		rtc_status_pending_eoi_check_valid(ioapic);
 	}
@@ -415,72 +421,68 @@ static void kvm_ioapic_eoi_inject_work(struct work_struct *work)
 }
 
 #define IOAPIC_SUCCESSIVE_IRQ_MAX_COUNT 10000
-
-static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu,
-			struct kvm_ioapic *ioapic, int vector, int trigger_mode)
+static void kvm_ioapic_update_eoi_one(struct kvm_vcpu *vcpu,
+				      struct kvm_ioapic *ioapic,
+				      int trigger_mode,
+				      int pin)
 {
-	struct dest_map *dest_map = &ioapic->rtc_status.dest_map;
 	struct kvm_lapic *apic = vcpu->arch.apic;
-	int i;
-
-	/* RTC special handling */
-	if (test_bit(vcpu->vcpu_id, dest_map->map) &&
-	    vector == dest_map->vectors[vcpu->vcpu_id])
-		rtc_irq_eoi(ioapic, vcpu);
-
-	for (i = 0; i < IOAPIC_NUM_PINS; i++) {
-		union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i];
-
-		if (ent->fields.vector != vector)
-			continue;
+	union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[pin];
 
-		/*
-		 * We are dropping lock while calling ack notifiers because ack
-		 * notifier callbacks for assigned devices call into IOAPIC
-		 * recursively. Since remote_irr is cleared only after call
-		 * to notifiers if the same vector will be delivered while lock
-		 * is dropped it will be put into irr and will be delivered
-		 * after ack notifier returns.
-		 */
-		spin_unlock(&ioapic->lock);
-		kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i);
-		spin_lock(&ioapic->lock);
+	/*
+	 * We are dropping lock while calling ack notifiers because ack
+	 * notifier callbacks for assigned devices call into IOAPIC
+	 * recursively. Since remote_irr is cleared only after call
+	 * to notifiers if the same vector will be delivered while lock
+	 * is dropped it will be put into irr and will be delivered
+	 * after ack notifier returns.
+	 */
+	spin_unlock(&ioapic->lock);
+	kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, pin);
+	spin_lock(&ioapic->lock);
 
-		if (trigger_mode != IOAPIC_LEVEL_TRIG ||
-		    kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)
-			continue;
+	if (trigger_mode != IOAPIC_LEVEL_TRIG ||
+	    kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)
+		return;
 
-		ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
-		ent->fields.remote_irr = 0;
-		if (!ent->fields.mask && (ioapic->irr & (1 << i))) {
-			++ioapic->irq_eoi[i];
-			if (ioapic->irq_eoi[i] == IOAPIC_SUCCESSIVE_IRQ_MAX_COUNT) {
-				/*
-				 * Real hardware does not deliver the interrupt
-				 * immediately during eoi broadcast, and this
-				 * lets a buggy guest make slow progress
-				 * even if it does not correctly handle a
-				 * level-triggered interrupt.  Emulate this
-				 * behavior if we detect an interrupt storm.
-				 */
-				schedule_delayed_work(&ioapic->eoi_inject, HZ / 100);
-				ioapic->irq_eoi[i] = 0;
-				trace_kvm_ioapic_delayed_eoi_inj(ent->bits);
-			} else {
-				ioapic_service(ioapic, i, false);
-			}
+	ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
+	ent->fields.remote_irr = 0;
+	if (!ent->fields.mask && (ioapic->irr & (1 << pin))) {
+		++ioapic->irq_eoi[pin];
+		if (ioapic->irq_eoi[pin] == IOAPIC_SUCCESSIVE_IRQ_MAX_COUNT) {
+			/*
+			 * Real hardware does not deliver the interrupt
+			 * immediately during eoi broadcast, and this
+			 * lets a buggy guest make slow progress
+			 * even if it does not correctly handle a
+			 * level-triggered interrupt.  Emulate this
+			 * behavior if we detect an interrupt storm.
+			 */
+			schedule_delayed_work(&ioapic->eoi_inject, HZ / 100);
+			ioapic->irq_eoi[pin] = 0;
+			trace_kvm_ioapic_delayed_eoi_inj(ent->bits);
 		} else {
-			ioapic->irq_eoi[i] = 0;
+			ioapic_service(ioapic, pin, false);
 		}
+	} else {
+		ioapic->irq_eoi[pin] = 0;
 	}
 }
 
 void kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu, int vector, int trigger_mode)
 {
+	int i;
 	struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
 
 	spin_lock(&ioapic->lock);
-	__kvm_ioapic_update_eoi(vcpu, ioapic, vector, trigger_mode);
+	rtc_irq_eoi(ioapic, vcpu, vector);
+	for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+		union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i];
+
+		if (ent->fields.vector != vector)
+			continue;
+		kvm_ioapic_update_eoi_one(vcpu, ioapic, trigger_mode, i);
+	}
 	spin_unlock(&ioapic->lock);
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 16/17] kvm: ioapic: Lazy update IOAPIC EOI
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (14 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 15/17] kvm: ioapic: Refactor kvm_ioapic_update_eoi() Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  2019-11-01 22:41 ` [PATCH v4 17/17] svm: Allow AVIC with in-kernel irqchip mode Suthikulpanit, Suravee
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

In-kernel IOAPIC does not receive EOI with AMD SVM AVIC
since the processor accelerate write to APIC EOI register and
does not trap if the interrupt is edge-triggered.

Workaround this by lazy check for pending APIC EOI at the time when
setting new IOPIC irq, and update IOAPIC EOI if no pending APIC EOI.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/ioapic.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index c57b7bb..6fdd88f 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -48,6 +48,11 @@
 static int ioapic_service(struct kvm_ioapic *vioapic, int irq,
 		bool line_status);
 
+static void kvm_ioapic_update_eoi_one(struct kvm_vcpu *vcpu,
+				      struct kvm_ioapic *ioapic,
+				      int trigger_mode,
+				      int pin);
+
 static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
 					  unsigned long addr,
 					  unsigned long length)
@@ -174,6 +179,31 @@ static bool rtc_irq_check_coalesced(struct kvm_ioapic *ioapic)
 	return false;
 }
 
+static void ioapic_lazy_update_eoi(struct kvm_ioapic *ioapic, int irq)
+{
+	int i;
+	struct kvm_vcpu *vcpu;
+	union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
+
+	kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
+		if (!kvm_apic_match_dest(vcpu, NULL, KVM_APIC_DEST_NOSHORT,
+					 entry->fields.dest_id,
+					 entry->fields.dest_mode) ||
+		    kvm_apic_pending_eoi(vcpu, entry->fields.vector))
+			continue;
+
+		/*
+		 * If no longer has pending EOI in LAPICs, update
+		 * EOI for this vetor.
+		 */
+		rtc_irq_eoi(ioapic, vcpu, entry->fields.vector);
+		kvm_ioapic_update_eoi_one(vcpu, ioapic,
+					  entry->fields.trig_mode,
+					  irq);
+		break;
+	}
+}
+
 static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
 		int irq_level, bool line_status)
 {
@@ -192,6 +222,15 @@ static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
 	}
 
 	/*
+	 * AMD SVM AVIC accelerate EOI write and do not trap,
+	 * in-kernel IOAPIC will not be able to receive the EOI.
+	 * In this case, we do lazy update of the pending EOI when
+	 * trying to set IOAPIC irq.
+	 */
+	if (kvm_apicv_activated(ioapic->kvm))
+		ioapic_lazy_update_eoi(ioapic, irq);
+
+	/*
 	 * Return 0 for coalesced interrupts; for edge-triggered interrupts,
 	 * this only happens if a previous edge has not been delivered due
 	 * do masking.  For level interrupts, the remote_irr field tells
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 17/17] svm: Allow AVIC with in-kernel irqchip mode
  2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (15 preceding siblings ...)
  2019-11-01 22:41 ` [PATCH v4 16/17] kvm: ioapic: Lazy update IOAPIC EOI Suthikulpanit, Suravee
@ 2019-11-01 22:41 ` Suthikulpanit, Suravee
  16 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-01 22:41 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, vkuznets, rkagan, graf, jschoenh,
	karahmed, rimasluk, Grimm, Jon, Suthikulpanit, Suravee

Once run-time AVIC activate/deactivate is supported, and EOI workaround
for AVIC is implemented, we can remove the kernel irqchip split mode
requirement for AVIC.

Hence, remove the check for irqchip split mode when enabling AVIC.

Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9812feb..be9c1d3 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5172,7 +5172,7 @@ static void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
 
 static bool svm_get_enable_apicv(struct kvm *kvm)
 {
-	return avic && irqchip_split(kvm);
+	return avic;
 }
 
 static void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 03/17] kvm: x86: Introduce APICv deactivate bits
  2019-11-01 22:41 ` [PATCH v4 03/17] kvm: x86: Introduce APICv deactivate bits Suthikulpanit, Suravee
@ 2019-11-02  9:51   ` Paolo Bonzini
  0 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2019-11-02  9:51 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: rkrcmar, joro, vkuznets, rkagan, graf, jschoenh, karahmed,
	rimasluk, Grimm, Jon

On 01/11/19 23:41, Suthikulpanit, Suravee wrote:
> +	unsigned long apicv_deact_msk;

No abbrev fld names. :)  I can change this to something like
apicv_inhibit_reasons if there are no other issues (and likewise
s/APICV_DEACT_BIT_/APICV_INHIBIT_REASON_/).

> 
> +bool kvm_apicv_activated(struct kvm *kvm)
> +{
> +	return (READ_ONCE(kvm->arch.apicv_deact_msk) == 0);
> +}

Using READ_ONCE introduces a risk of races.  I'll check during a more
thorough review if it's worth introducing separate kvm_apicv_active and
kvm_apicv_active_nolock functions.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 04/17] kvm: x86: Add support for activate/de-activate APICv at runtime
  2019-11-01 22:41 ` [PATCH v4 04/17] kvm: x86: Add support for activate/de-activate APICv at runtime Suthikulpanit, Suravee
@ 2019-11-02  9:52   ` Paolo Bonzini
  2019-11-04 19:22     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2019-11-02  9:52 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: rkrcmar, joro, vkuznets, rkagan, graf, jschoenh, karahmed,
	rimasluk, Grimm, Jon

On 01/11/19 23:41, Suthikulpanit, Suravee wrote:
> +void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
> +{
> +	if (activate) {
> +		if (!test_and_clear_bit(bit, &kvm->arch.apicv_deact_msk) ||
> +		    !kvm_apicv_activated(kvm))
> +			return;
> +	} else {
> +		if (test_and_set_bit(bit, &kvm->arch.apicv_deact_msk) ||
> +		    kvm_apicv_activated(kvm))
> +			return;
> +	}
> +
> +	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
> +}
> +EXPORT_SYMBOL_GPL(kvm_request_apicv_update);
> +

It's worth documenting the locking requirements of
kvm_request_apicv_update (it can also be negative requirements, such as
"don't hold any lock"), because kvm_make_all_cpus_request is a somewhat
deadlock-prone API.

Again, something I'll check after a more thorough review.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-01 22:41 ` [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode Suthikulpanit, Suravee
@ 2019-11-02  9:57   ` Paolo Bonzini
  2019-11-04 18:54     ` Suthikulpanit, Suravee
  2019-11-04 23:17     ` Roman Kagan
  0 siblings, 2 replies; 36+ messages in thread
From: Paolo Bonzini @ 2019-11-02  9:57 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: rkrcmar, joro, vkuznets, rkagan, graf, jschoenh, karahmed,
	rimasluk, Grimm, Jon

On 01/11/19 23:41, Suthikulpanit, Suravee wrote:
> +	/*
> +	 * AMD SVM AVIC accelerates EOI write and does not trap.
> +	 * This cause in-kernel PIT re-inject mode to fail
> +	 * since it checks ps->irq_ack before kvm_set_irq()
> +	 * and relies on the ack notifier to timely queue
> +	 * the pt->worker work iterm and reinject the missed tick.
> +	 * So, deactivate APICv when PIT is in reinject mode.
> +	 */
>  	if (reinject) {
> +		kvm_request_apicv_update(kvm, false, APICV_DEACT_BIT_PIT_REINJ);
>  		/* The initial state is preserved while ps->reinject == 0. */
>  		kvm_pit_reset_reinject(pit);
>  		kvm_register_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
>  		kvm_register_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
>  	} else {
> +		kvm_request_apicv_update(kvm, true, APICV_DEACT_BIT_PIT_REINJ);
>  		kvm_unregister_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
>  		kvm_unregister_irq_mask_notifier(kvm, 0, &pit->mask_notifier);

This is not too nice for Intel which does support (through the EOI exit
mask) APICv even if PIT reinjection active.

We can work around it by adding a global mask of inhibit reasons that
apply to the vendor, and initializing it as soon as possible in vmx.c/svm.c.

Then kvm_request_apicv_update can ignore reasons that the vendor doesn't
care about.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 12/17] svm: Temporary deactivate AVIC during ExtINT handling
  2019-11-01 22:41 ` [PATCH v4 12/17] svm: Temporary deactivate AVIC during ExtINT handling Suthikulpanit, Suravee
@ 2019-11-02 10:01   ` Paolo Bonzini
  0 siblings, 0 replies; 36+ messages in thread
From: Paolo Bonzini @ 2019-11-02 10:01 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: rkrcmar, joro, vkuznets, rkagan, graf, jschoenh, karahmed,
	rimasluk, Grimm, Jon

On 01/11/19 23:41, Suthikulpanit, Suravee wrote:
> +		/*
> +		 * IRQ window is not needed when AVIC is enabled,
> +		 * unless we have pending ExtINT since it cannot be injected
> +		 * via AVIC. In such case, we need to temporarily disable AVIC,
> +		 * and fallback to injecting IRQ via V_IRQ.
> +		 */
> +		if (kvm_vcpu_apicv_active(vcpu))
> +			svm_request_update_avic(vcpu, false);

This must be pretty heavy-weight on SMP VMs, even if most ExtINT guests
do not need SMP in the guest.  One alternative is to enable/disable
APICv when LVT or IOAPIC registers are written with ExtINT mode.  Not a
blocker, just an idea to consider.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-02  9:57   ` Paolo Bonzini
@ 2019-11-04 18:54     ` Suthikulpanit, Suravee
  2019-11-04 21:49       ` Paolo Bonzini
  2019-11-04 23:17     ` Roman Kagan
  1 sibling, 1 reply; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-04 18:54 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: rkrcmar, joro, vkuznets, rkagan, graf, jschoenh, karahmed,
	rimasluk, Grimm, Jon

Paolo,

Thanks for quick response.

On 11/2/19 4:57 AM, Paolo Bonzini wrote:
> On 01/11/19 23:41, Suthikulpanit, Suravee wrote:
>> +	/*
>> +	 * AMD SVM AVIC accelerates EOI write and does not trap.
>> +	 * This cause in-kernel PIT re-inject mode to fail
>> +	 * since it checks ps->irq_ack before kvm_set_irq()
>> +	 * and relies on the ack notifier to timely queue
>> +	 * the pt->worker work iterm and reinject the missed tick.
>> +	 * So, deactivate APICv when PIT is in reinject mode.
>> +	 */
>>   	if (reinject) {
>> +		kvm_request_apicv_update(kvm, false, APICV_DEACT_BIT_PIT_REINJ);
>>   		/* The initial state is preserved while ps->reinject == 0. */
>>   		kvm_pit_reset_reinject(pit);
>>   		kvm_register_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
>>   		kvm_register_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
>>   	} else {
>> +		kvm_request_apicv_update(kvm, true, APICV_DEACT_BIT_PIT_REINJ);
>>   		kvm_unregister_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
>>   		kvm_unregister_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
> 
> This is not too nice for Intel which does support (through the EOI exit
> mask) APICv even if PIT reinjection active.

I see you point.

> We can work around it by adding a global mask of inhibit reasons that
> apply to the vendor, and initializing it as soon as possible in vmx.c/svm.c.
> 
> Then kvm_request_apicv_update can ignore reasons that the vendor doesn't
> care about.
> 
> Paolo
> 

What about we enhance the pre_update_apivc_exec_ctrl() to also return 
success/fail. In here, the vendor specific code can decide to update 
APICv state or not.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 04/17] kvm: x86: Add support for activate/de-activate APICv at runtime
  2019-11-02  9:52   ` Paolo Bonzini
@ 2019-11-04 19:22     ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 36+ messages in thread
From: Suthikulpanit, Suravee @ 2019-11-04 19:22 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: rkrcmar, joro, vkuznets, rkagan, graf, jschoenh, karahmed,
	rimasluk, Grimm, Jon

Paolo,

On 11/2/19 4:52 AM, Paolo Bonzini wrote:
> On 01/11/19 23:41, Suthikulpanit, Suravee wrote:
>> +void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
>> +{
>> +	if (activate) {
>> +		if (!test_and_clear_bit(bit, &kvm->arch.apicv_deact_msk) ||
>> +		    !kvm_apicv_activated(kvm))
>> +			return;
>> +	} else {
>> +		if (test_and_set_bit(bit, &kvm->arch.apicv_deact_msk) ||
>> +		    kvm_apicv_activated(kvm))
>> +			return;
>> +	}
>> +
>> +	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_request_apicv_update);
>> +
> 
> It's worth documenting the locking requirements of
> kvm_request_apicv_update (it can also be negative requirements, such as
> "don't hold any lock"), because kvm_make_all_cpus_request is a somewhat
> deadlock-prone API.

Currently, I have a comment in the svm_request_update_avic(), where it 
calls kvm_request_apicv_update. I'll move it here instead and enhance 
the comment.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-04 18:54     ` Suthikulpanit, Suravee
@ 2019-11-04 21:49       ` Paolo Bonzini
  2019-11-05  7:05         ` Graf (AWS), Alexander
  0 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2019-11-04 21:49 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: rkrcmar, joro, vkuznets, rkagan, graf, jschoenh, karahmed,
	rimasluk, Grimm, Jon

On 04/11/19 19:54, Suthikulpanit, Suravee wrote:
> I see you point.
> 
>> We can work around it by adding a global mask of inhibit reasons that
>> apply to the vendor, and initializing it as soon as possible in vmx.c/svm.c.
>>
>> Then kvm_request_apicv_update can ignore reasons that the vendor doesn't
>> care about.
>
> What about we enhance the pre_update_apivc_exec_ctrl() to also return 
> success/fail. In here, the vendor specific code can decide to update 
> APICv state or not.

That works for me, too.  Something like return false for deactivate and
true for activate.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 07/17] svm: Add support for setup/destroy virutal APIC backing page for AVIC
  2019-11-01 22:41 ` [PATCH v4 07/17] svm: Add support for setup/destroy virutal APIC backing page for AVIC Suthikulpanit, Suravee
@ 2019-11-04 21:53   ` Roman Kagan
  2019-11-12  0:05     ` Suravee Suthikulpanit
  0 siblings, 1 reply; 36+ messages in thread
From: Roman Kagan @ 2019-11-04 21:53 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: linux-kernel, kvm, pbonzini, rkrcmar, joro, vkuznets, graf,
	jschoenh, karahmed, rimasluk, Grimm, Jon

On Fri, Nov 01, 2019 at 10:41:30PM +0000, Suthikulpanit, Suravee wrote:
> Re-factor avic_init_access_page() to avic_update_access_page() since
> activate/deactivate AVIC requires setting/unsetting the memory region used
> for virtual APIC backing page (APIC_ACCESS_PAGE_PRIVATE_MEMSLOT).

AFAICT the patch actually touches the (de)allocation of the APIC access
page rather than the APIC backing page (or I'm confused in the
nomenclature).

Thanks,
Roman.

> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  arch/x86/kvm/svm.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index b7d0adc..46842a2 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1668,23 +1668,22 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
>   * field of the VMCB. Therefore, we set up the
>   * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (4KB) here.
>   */
> -static int avic_init_access_page(struct kvm_vcpu *vcpu)
> +static int avic_update_access_page(struct kvm *kvm, bool activate)
>  {
> -	struct kvm *kvm = vcpu->kvm;
>  	int ret = 0;
>  
>  	mutex_lock(&kvm->slots_lock);
> -	if (kvm->arch.apic_access_page_done)
> +	if (kvm->arch.apic_access_page_done == activate)
>  		goto out;
>  
>  	ret = __x86_set_memory_region(kvm,
>  				      APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
>  				      APIC_DEFAULT_PHYS_BASE,
> -				      PAGE_SIZE);
> +				      activate ? PAGE_SIZE : 0);
>  	if (ret)
>  		goto out;
>  
> -	kvm->arch.apic_access_page_done = true;
> +	kvm->arch.apic_access_page_done = activate;
>  out:
>  	mutex_unlock(&kvm->slots_lock);
>  	return ret;
> @@ -1697,7 +1696,7 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
>  	int id = vcpu->vcpu_id;
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  
> -	ret = avic_init_access_page(vcpu);
> +	ret = avic_update_access_page(vcpu->kvm, true);
>  	if (ret)
>  		return ret;
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 08/17] kvm: x86: Introduce APICv pre-update hook
  2019-11-01 22:41 ` [PATCH v4 08/17] kvm: x86: Introduce APICv pre-update hook Suthikulpanit, Suravee
@ 2019-11-04 22:05   ` Roman Kagan
  2019-11-12  0:08     ` Suravee Suthikulpanit
  0 siblings, 1 reply; 36+ messages in thread
From: Roman Kagan @ 2019-11-04 22:05 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: linux-kernel, kvm, pbonzini, rkrcmar, joro, vkuznets, graf,
	jschoenh, karahmed, rimasluk, Grimm, Jon

On Fri, Nov 01, 2019 at 10:41:31PM +0000, Suthikulpanit, Suravee wrote:
> AMD SVM AVIC needs to update APIC backing page mapping before changing
> APICv mode. Introduce struct kvm_x86_ops.pre_update_apicv_exec_ctrl
> function hook to be called prior KVM APICv update request to each vcpu.

This again seems to mix up APIC backing page and APIC access page.

And I must be missing something obvious, but why is it necessary to
unmap the APIC access page while AVIC is disabled?  Does keeping it
around stand in the way when working with AVIC disabled?

Thanks,
Roman.

> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 1 +
>  arch/x86/kvm/svm.c              | 6 ++++++
>  arch/x86/kvm/x86.c              | 2 ++
>  3 files changed, 9 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3b94f42..f93d347 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1094,6 +1094,7 @@ struct kvm_x86_ops {
>  	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>  	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>  	bool (*get_enable_apicv)(struct kvm *kvm);
> +	void (*pre_update_apicv_exec_ctrl)(struct kvm *kvm, bool activate);
>  	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
>  	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
>  	void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 46842a2..21203a6 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -7230,6 +7230,11 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
>  	return false;
>  }
>  
> +static void svm_pre_update_apicv_exec_ctrl(struct kvm *kvm, bool activate)
> +{
> +	avic_update_access_page(kvm, activate);
> +}
> +
>  static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
>  	.cpu_has_kvm_support = has_svm,
>  	.disabled_by_bios = is_disabled,
> @@ -7307,6 +7312,7 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
>  	.set_virtual_apic_mode = svm_set_virtual_apic_mode,
>  	.get_enable_apicv = svm_get_enable_apicv,
>  	.refresh_apicv_exec_ctrl = svm_refresh_apicv_exec_ctrl,
> +	.pre_update_apicv_exec_ctrl = svm_pre_update_apicv_exec_ctrl,
>  	.load_eoi_exitmap = svm_load_eoi_exitmap,
>  	.hwapic_irr_update = svm_hwapic_irr_update,
>  	.hwapic_isr_update = svm_hwapic_isr_update,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4fab93e..c09ff78 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7755,6 +7755,8 @@ void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
>  	}
>  
>  	trace_kvm_apicv_update_request(activate, bit);
> +	if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
> +		kvm_x86_ops->pre_update_apicv_exec_ctrl(kvm, activate);
>  	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
>  }
>  EXPORT_SYMBOL_GPL(kvm_request_apicv_update);
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-02  9:57   ` Paolo Bonzini
  2019-11-04 18:54     ` Suthikulpanit, Suravee
@ 2019-11-04 23:17     ` Roman Kagan
  2019-11-05 22:47       ` Paolo Bonzini
  1 sibling, 1 reply; 36+ messages in thread
From: Roman Kagan @ 2019-11-04 23:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Suthikulpanit, Suravee, linux-kernel, kvm, rkrcmar, joro,
	vkuznets, graf, jschoenh, karahmed, rimasluk, Grimm, Jon

On Sat, Nov 02, 2019 at 10:57:47AM +0100, Paolo Bonzini wrote:
> On 01/11/19 23:41, Suthikulpanit, Suravee wrote:
> > +	/*
> > +	 * AMD SVM AVIC accelerates EOI write and does not trap.
> > +	 * This cause in-kernel PIT re-inject mode to fail
> > +	 * since it checks ps->irq_ack before kvm_set_irq()
> > +	 * and relies on the ack notifier to timely queue
> > +	 * the pt->worker work iterm and reinject the missed tick.
> > +	 * So, deactivate APICv when PIT is in reinject mode.
> > +	 */
> >  	if (reinject) {
> > +		kvm_request_apicv_update(kvm, false, APICV_DEACT_BIT_PIT_REINJ);
> >  		/* The initial state is preserved while ps->reinject == 0. */
> >  		kvm_pit_reset_reinject(pit);
> >  		kvm_register_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
> >  		kvm_register_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
> >  	} else {
> > +		kvm_request_apicv_update(kvm, true, APICV_DEACT_BIT_PIT_REINJ);
> >  		kvm_unregister_irq_ack_notifier(kvm, &ps->irq_ack_notifier);
> >  		kvm_unregister_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
> 
> This is not too nice for Intel which does support (through the EOI exit
> mask) APICv even if PIT reinjection active.

Hmm, it's tempting to just make svm_load_eoi_exitmap() disable AVIC when
given a non-empty eoi_exit_bitmap, and enable it back on a clear
eoi_exit_bitmap.  This may remove the need to add special treatment to
PIT etc.

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-04 21:49       ` Paolo Bonzini
@ 2019-11-05  7:05         ` Graf (AWS), Alexander
  2019-11-05  8:40           ` Roman Kagan
  0 siblings, 1 reply; 36+ messages in thread
From: Graf (AWS), Alexander @ 2019-11-05  7:05 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Suthikulpanit, Suravee, linux-kernel, kvm, rkrcmar, joro,
	vkuznets, rkagan, Schoenherr, Jan H.,
	Raslan, KarimAllah, Lukaszewicz, Rimas, Grimm, Jon



> Am 04.11.2019 um 22:50 schrieb Paolo Bonzini <pbonzini@redhat.com>:
> 
> On 04/11/19 19:54, Suthikulpanit, Suravee wrote:
>> I see you point.
>> 
>>> We can work around it by adding a global mask of inhibit reasons that
>>> apply to the vendor, and initializing it as soon as possible in vmx.c/svm.c.
>>> 
>>> Then kvm_request_apicv_update can ignore reasons that the vendor doesn't
>>> care about.
>> 
>> What about we enhance the pre_update_apivc_exec_ctrl() to also return 
>> success/fail. In here, the vendor specific code can decide to update 
>> APICv state or not.
> 
> That works for me, too.  Something like return false for deactivate and
> true for activate.

I'm trying to wrap my head around how that will work with live migration. Do we also need to save/restore the deactivate reasons?

Alex

> 
> Paolo



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-05  7:05         ` Graf (AWS), Alexander
@ 2019-11-05  8:40           ` Roman Kagan
  0 siblings, 0 replies; 36+ messages in thread
From: Roman Kagan @ 2019-11-05  8:40 UTC (permalink / raw)
  To: Graf (AWS), Alexander
  Cc: Paolo Bonzini, Suthikulpanit, Suravee, linux-kernel, kvm,
	rkrcmar, joro, vkuznets, Schoenherr, Jan H.,
	Raslan, KarimAllah, Lukaszewicz, Rimas, Grimm, Jon

On Tue, Nov 05, 2019 at 07:05:57AM +0000, Graf (AWS), Alexander wrote:
> 
> 
> > Am 04.11.2019 um 22:50 schrieb Paolo Bonzini <pbonzini@redhat.com>:
> > 
> > On 04/11/19 19:54, Suthikulpanit, Suravee wrote:
> >> I see you point.
> >> 
> >>> We can work around it by adding a global mask of inhibit reasons that
> >>> apply to the vendor, and initializing it as soon as possible in vmx.c/svm.c.
> >>> 
> >>> Then kvm_request_apicv_update can ignore reasons that the vendor doesn't
> >>> care about.
> >> 
> >> What about we enhance the pre_update_apivc_exec_ctrl() to also return 
> >> success/fail. In here, the vendor specific code can decide to update 
> >> APICv state or not.
> > 
> > That works for me, too.  Something like return false for deactivate and
> > true for activate.
> 
> I'm trying to wrap my head around how that will work with live
> migration. Do we also need to save/restore the deactivate reasons?

Nope, this is all invisible to userland.  The target will deduce the
deactivation reasons on its own from the user-visible setup like PIT
configuration, Hyper-V SynIC, etc.

Roman.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-04 23:17     ` Roman Kagan
@ 2019-11-05 22:47       ` Paolo Bonzini
  2019-11-11 17:37         ` Suravee Suthikulpanit
  0 siblings, 1 reply; 36+ messages in thread
From: Paolo Bonzini @ 2019-11-05 22:47 UTC (permalink / raw)
  To: rkagan, Suthikulpanit, Suravee, linux-kernel, kvm, rkrcmar, joro,
	vkuznets, graf, jschoenh, karahmed, rimasluk, Grimm, Jon

On 05/11/19 00:17, Roman Kagan wrote:
>> This is not too nice for Intel which does support (through the EOI exit
>> mask) APICv even if PIT reinjection active.
> Hmm, it's tempting to just make svm_load_eoi_exitmap() disable AVIC when
> given a non-empty eoi_exit_bitmap, and enable it back on a clear
> eoi_exit_bitmap.  This may remove the need to add special treatment to
> PIT etc.

That is a very nice idea---we can make that a single disable reason,
like APICV_DEACTIVATE_REASON_EOI, and Intel can simply never use it.

Paolo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-05 22:47       ` Paolo Bonzini
@ 2019-11-11 17:37         ` Suravee Suthikulpanit
  2019-11-12 12:22           ` Roman Kagan
  0 siblings, 1 reply; 36+ messages in thread
From: Suravee Suthikulpanit @ 2019-11-11 17:37 UTC (permalink / raw)
  To: Paolo Bonzini, rkagan, linux-kernel, kvm, rkrcmar, joro,
	vkuznets, graf, jschoenh, karahmed, rimasluk, Grimm, Jon

Roman/Paolo

On 11/5/2019 4:47 PM, Paolo Bonzini wrote:
> On 05/11/19 00:17, Roman Kagan wrote:
>>> This is not too nice for Intel which does support (through the EOI exit
>>> mask) APICv even if PIT reinjection active.
>> Hmm, it's tempting to just make svm_load_eoi_exitmap() disable AVIC when
>> given a non-empty eoi_exit_bitmap, and enable it back on a clear
>> eoi_exit_bitmap.  This may remove the need to add special treatment to
>> PIT etc.
> 
> That is a very nice idea---we can make that a single disable reason,
> like APICV_DEACTIVATE_REASON_EOI, and Intel can simply never use it.

I took at look at the svm_load_eoi_exitmap() and it is called via:
     kvm_make_scan_ioapic_request() ->
         KVM_REQ_SCAN_IOAPIC -> vcpu_scan_ioapic() ->
             KVM_REQ_LOAD_EOI_EXITMAP -> vcpu_load_eoi_exitmap()

The kvm_make_scan_ioapic_request() is called from multiple places:

arch/x86/kvm/irq_comm.c:
     * kvm_arch_post_irq_routing_update() : Called from kvm_set_irq_routing()

arch/x86/kvm/ioapic.c:
     * kvm_arch_post_irq_ack_notifier_list_update() : (Un)registering irq ack notifier
     * kvm_set_ioapic() : Setting ioapic irqchip
     * ioapic_mmio_write() -> ioapic_write_indirect()

arch/x86/kvm/lapic.c:
     * recalculate_apic_map()

Most calls would be from ioapic_mmio_write()->ioapic_write_indirect().

In case of AMD AVIC, the svm_load_e::vsoi_exitmap() is called several times, and requesting
APICV (de)activate from here when the eoi_exit_bitmap is set/clear would introduce
large overhead especially with SMP machine. So, for now, let's just disable APICv
when in-kernel PIT is in reinject (delay) mode.

I'll also add the logic to avoid unnecessary overhead for Intel.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 07/17] svm: Add support for setup/destroy virutal APIC backing page for AVIC
  2019-11-04 21:53   ` Roman Kagan
@ 2019-11-12  0:05     ` Suravee Suthikulpanit
  0 siblings, 0 replies; 36+ messages in thread
From: Suravee Suthikulpanit @ 2019-11-12  0:05 UTC (permalink / raw)
  To: rkagan, linux-kernel, kvm, pbonzini, rkrcmar, joro, vkuznets,
	graf, jschoenh, karahmed, rimasluk, Grimm, Jon

Roman,

On 11/4/19 3:53 PM, Roman Kagan wrote:
> On Fri, Nov 01, 2019 at 10:41:30PM +0000, Suthikulpanit, Suravee wrote:
>> Re-factor avic_init_access_page() to avic_update_access_page() since
>> activate/deactivate AVIC requires setting/unsetting the memory region used
>> for virtual APIC backing page (APIC_ACCESS_PAGE_PRIVATE_MEMSLOT).
>
> AFAICT the patch actually touches the (de)allocation of the APIC access
> page rather than the APIC backing page (or I'm confused in the
> nomenclature).

The APIC backing page is allocated during vcpu initialization, while
the APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, is initialized per-vm, and is
used mainly for access permission control of the APIC backing page.

There is a comment in the arch/x86/kvm/svm.c:

  /**
   * Note:
   * AVIC hardware walks the nested page table to check permissions,
   * but does not use the SPA address specified in the leaf page
   * table entry since it uses address in the AVIC_BACKING_PAGE pointer
   * field of the VMCB. Therefore, we set up the
   * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (4KB) here.
   */

When deactivate APICv, we do not destroy the APIC backing page, but
we need to de-allocate the APIC_ACCESS_PAGE_PRIVATE_MEMSLOT.

Thanks,
Suravee

> Thanks,
> Roman.
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 08/17] kvm: x86: Introduce APICv pre-update hook
  2019-11-04 22:05   ` Roman Kagan
@ 2019-11-12  0:08     ` Suravee Suthikulpanit
  2019-11-12 11:12       ` Roman Kagan
  0 siblings, 1 reply; 36+ messages in thread
From: Suravee Suthikulpanit @ 2019-11-12  0:08 UTC (permalink / raw)
  To: rkagan, linux-kernel, kvm, pbonzini, rkrcmar, joro, vkuznets,
	graf, jschoenh, karahmed, rimasluk, Grimm, Jon

Roman,

On 11/4/19 4:05 PM, Roman Kagan wrote:
> On Fri, Nov 01, 2019 at 10:41:31PM +0000, Suthikulpanit, Suravee wrote:
>> AMD SVM AVIC needs to update APIC backing page mapping before changing
>> APICv mode. Introduce struct kvm_x86_ops.pre_update_apicv_exec_ctrl
>> function hook to be called prior KVM APICv update request to each vcpu.
> This again seems to mix up APIC backing page and APIC access page.
> 
> And I must be missing something obvious, but why is it necessary to
> unmap the APIC access page while AVIC is disabled?  Does keeping it
> around stand in the way when working with AVIC disabled?

I have replied to patch 07/17 with explanation.

Yes, keeping the APIC access page while disabling AVIC would cause
the SVM to not function properly.

Thanks,
Suravee

> Thanks,
> Roman.
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 08/17] kvm: x86: Introduce APICv pre-update hook
  2019-11-12  0:08     ` Suravee Suthikulpanit
@ 2019-11-12 11:12       ` Roman Kagan
  0 siblings, 0 replies; 36+ messages in thread
From: Roman Kagan @ 2019-11-12 11:12 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, kvm, pbonzini, rkrcmar, joro, vkuznets, graf,
	jschoenh, karahmed, rimasluk, Grimm, Jon

On Mon, Nov 11, 2019 at 06:08:05PM -0600, Suravee Suthikulpanit wrote:
> On 11/4/19 4:05 PM, Roman Kagan wrote:
> > On Fri, Nov 01, 2019 at 10:41:31PM +0000, Suthikulpanit, Suravee wrote:
> > > AMD SVM AVIC needs to update APIC backing page mapping before changing
> > > APICv mode. Introduce struct kvm_x86_ops.pre_update_apicv_exec_ctrl
> > > function hook to be called prior KVM APICv update request to each vcpu.
> > This again seems to mix up APIC backing page and APIC access page.
> > 
> > And I must be missing something obvious, but why is it necessary to
> > unmap the APIC access page while AVIC is disabled?  Does keeping it
> > around stand in the way when working with AVIC disabled?
> 
> I have replied to patch 07/17 with explanation.
> 
> Yes, keeping the APIC access page while disabling AVIC would cause
> the SVM to not function properly.

I wonder why?  Once AVIC is disabled guest access to this page would
trigger a regular NPT fault vmexit, just as it would with the NPT entry
for this page destroyed, wouldn't it?  So there would be no difference
from the host's POV.  Am I missing something?

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode.
  2019-11-11 17:37         ` Suravee Suthikulpanit
@ 2019-11-12 12:22           ` Roman Kagan
  0 siblings, 0 replies; 36+ messages in thread
From: Roman Kagan @ 2019-11-12 12:22 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: Paolo Bonzini, linux-kernel, kvm, rkrcmar, joro, vkuznets, graf,
	jschoenh, karahmed, rimasluk, Grimm, Jon

On Mon, Nov 11, 2019 at 11:37:53AM -0600, Suravee Suthikulpanit wrote:
> On 11/5/2019 4:47 PM, Paolo Bonzini wrote:
> > On 05/11/19 00:17, Roman Kagan wrote:
> > > > This is not too nice for Intel which does support (through the EOI exit
> > > > mask) APICv even if PIT reinjection active.
> > > Hmm, it's tempting to just make svm_load_eoi_exitmap() disable AVIC when
> > > given a non-empty eoi_exit_bitmap, and enable it back on a clear
> > > eoi_exit_bitmap.  This may remove the need to add special treatment to
> > > PIT etc.
> > 
> > That is a very nice idea---we can make that a single disable reason,
> > like APICV_DEACTIVATE_REASON_EOI, and Intel can simply never use it.
> 
> I took at look at the svm_load_eoi_exitmap() and it is called via:
>     kvm_make_scan_ioapic_request() ->
>         KVM_REQ_SCAN_IOAPIC -> vcpu_scan_ioapic() ->
>             KVM_REQ_LOAD_EOI_EXITMAP -> vcpu_load_eoi_exitmap()
> 
> The kvm_make_scan_ioapic_request() is called from multiple places:
> 
> arch/x86/kvm/irq_comm.c:
>     * kvm_arch_post_irq_routing_update() : Called from kvm_set_irq_routing()
> 
> arch/x86/kvm/ioapic.c:
>     * kvm_arch_post_irq_ack_notifier_list_update() : (Un)registering irq ack notifier
>     * kvm_set_ioapic() : Setting ioapic irqchip
>     * ioapic_mmio_write() -> ioapic_write_indirect()
> 
> arch/x86/kvm/lapic.c:
>     * recalculate_apic_map()
> 
> Most calls would be from ioapic_mmio_write()->ioapic_write_indirect().
> 
> In case of AMD AVIC, the svm_load_e::vsoi_exitmap() is called several times, and requesting
> APICV (de)activate from here when the eoi_exit_bitmap is set/clear would introduce
> large overhead especially with SMP machine.

This doesn't look like a hot path, so I'm not sure it needs to be
optimized for performance.  Especially so since
kvm_make_scan_ioapic_request does kvm_make_all_cpus_request which isn't
particularly fast by definition, and I guess the extra overhead there
won't be noticable.

OTOH introducing extra code paths has its maintenance costs, so sticking
the simple logic in svm_load_eoi_exitmap looks attractive.

Just my 2c,
Roman.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2019-11-12 12:23 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-01 22:41 [PATCH v4 00/17] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 01/17] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 02/17] kvm: lapic: Introduce APICv update helper function Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 03/17] kvm: x86: Introduce APICv deactivate bits Suthikulpanit, Suravee
2019-11-02  9:51   ` Paolo Bonzini
2019-11-01 22:41 ` [PATCH v4 04/17] kvm: x86: Add support for activate/de-activate APICv at runtime Suthikulpanit, Suravee
2019-11-02  9:52   ` Paolo Bonzini
2019-11-04 19:22     ` Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 05/17] kvm: x86: Add APICv activate/deactivate request trace points Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 06/17] kvm: x86: svm: Add support to activate/deactivate posted interrupts Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 07/17] svm: Add support for setup/destroy virutal APIC backing page for AVIC Suthikulpanit, Suravee
2019-11-04 21:53   ` Roman Kagan
2019-11-12  0:05     ` Suravee Suthikulpanit
2019-11-01 22:41 ` [PATCH v4 08/17] kvm: x86: Introduce APICv pre-update hook Suthikulpanit, Suravee
2019-11-04 22:05   ` Roman Kagan
2019-11-12  0:08     ` Suravee Suthikulpanit
2019-11-12 11:12       ` Roman Kagan
2019-11-01 22:41 ` [PATCH v4 09/17] svm: Add support for activate/deactivate AVIC at runtime Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 10/17] kvm: x86: hyperv: Use APICv update request interface Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 11/17] svm: Deactivate AVIC when launching guest with nested SVM support Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 12/17] svm: Temporary deactivate AVIC during ExtINT handling Suthikulpanit, Suravee
2019-11-02 10:01   ` Paolo Bonzini
2019-11-01 22:41 ` [PATCH v4 13/17] kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection mode Suthikulpanit, Suravee
2019-11-02  9:57   ` Paolo Bonzini
2019-11-04 18:54     ` Suthikulpanit, Suravee
2019-11-04 21:49       ` Paolo Bonzini
2019-11-05  7:05         ` Graf (AWS), Alexander
2019-11-05  8:40           ` Roman Kagan
2019-11-04 23:17     ` Roman Kagan
2019-11-05 22:47       ` Paolo Bonzini
2019-11-11 17:37         ` Suravee Suthikulpanit
2019-11-12 12:22           ` Roman Kagan
2019-11-01 22:41 ` [PATCH v4 14/17] kvm: lapic: Clean up APIC predefined macros Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 15/17] kvm: ioapic: Refactor kvm_ioapic_update_eoi() Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 16/17] kvm: ioapic: Lazy update IOAPIC EOI Suthikulpanit, Suravee
2019-11-01 22:41 ` [PATCH v4 17/17] svm: Allow AVIC with in-kernel irqchip mode Suthikulpanit, Suravee

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).