KVM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode
@ 2019-08-15 16:25 Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 01/15] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
                   ` (14 more replies)
  0 siblings, 15 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

The 'commit 67034bb9dd5e ("KVM: SVM: Add irqchip_split() checks before
enabling AVIC")' was introduced to fix miscellaneous boot-hang issues
when enable AVIC. This is mainly due to AVIC hardware doest not #vmexit
on write to LAPIC EOI register resulting in-kernel PIC and IOAPIC to
wait and do not inject new interrupts (e.g. PIT).

This limits AVIC to only work with kernel_irqchip=split mode, which is
not currently enabled by default, and also required user-space to
support split irqchip model, which might not be the case.

The goal of this series is to enable AVIC to work in both irqchip modes.
by allowing AVIC to be deactivated temporary during runtime and fallback
to legacy interrupt injection mode (w/ vINTR and interrupt windows)
when needed, and then re-enabled subseqently.

Similar approach is also used to handle Hyper-V SynIC in the
'commit 5c919412fe61 ("kvm/x86: Hyper-V synthetic interrupt controller")',
where APICv is permanently disabled at runtime (currently broken for
AVIC, and fixed by this series). 

In addition, currently when KVM disables APICv (e.g. running with
kernel_irqchip=on mode on AMD, or due to Hyper-V SyncIC mode), this happens
under the hood and often cause confusion to users. This will be addressed
in part 1 of this series.

This series contains three parts:
  * Part 1: patch 1-4
     Introduce APICv state enum and logic for keeping track of the state
     for each vm, along with debugfs for checking to see if APICv is
     enabled for a particular vm.

  * Part 2: patch 5-11
     Add support for activate/deactivate APICv at runtime

  * Part 3: patch 12-15:
     Provide workaround for AVIC EOI and allow enable AVIC w/
     kernel_irqchip=on

Pre-requisite Patch:
  * commit b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC
    (de-)activation code")
    (https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/
     ?h=next&id=b9c6ff94e43a0ee053e0c1d983fba1ac4953b762)

This series has been tested against v5.2 as following:
  * Booting Linux, FreeBSD, and Windows Server 2019 VMs upto 255 vCPUs
    w/ qemu option "kernel-irqchip=on" and "-no-hpet".
  * Pass-through Intel 10GbE NIC and run netperf in the VM.

Changes from V1: (https://lkml.org/lkml/2019/3/22/1042)
  * Introduce APICv state enumeration
  * Introduce KVM debugfs for APICv state
  * Add synchronization logic for APICv state to prevent potential
    race condition (per Jan's suggestion)
  * Add support for activate/deactivate posted interrupt
    (per Jan's suggestion)
  * Remove callback functions for handling APIC ID, DFR and LDR update
    (per Paolo's suggestion)
  * Add workaround for handling EOI for in-kernel PIT and IOAPIC.

Suravee Suthikulpanit (15):
  kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm
    parameter
  kvm: x86: Introduce KVM APICv state
  kvm: Add arch-specific per-VM debugfs support
  kvm: x86: Add per-VM APICv state debugfs
  kvm: lapic: Introduce APICv update helper function
  kvm: x86: Add support for activate/de-activate APICv at runtime
  kvm: x86: svm: Add support to activate/deactivate posted interrupts
  svm: Add support for setup/destroy virutal APIC backing page for AVIC
  svm: Add support for activate/deactivate AVIC at runtime
  kvm: x86: hyperv: Use APICv deactivate request interface
  svm: Temporary deactivate AVIC during ExtINT handling
  kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC
  kvm: lapic: Clean up APIC predefined macros
  kvm: ioapic: Delay update IOAPIC EOI for RTC
  svm: Allow AVIC with in-kernel irqchip mode

 arch/mips/kvm/mips.c            |   5 ++
 arch/powerpc/kvm/powerpc.c      |   5 ++
 arch/s390/kvm/kvm-s390.c        |   5 ++
 arch/x86/include/asm/kvm_host.h |  26 +++++-
 arch/x86/kvm/debugfs.c          |  27 +++++++
 arch/x86/kvm/hyperv.c           |  12 ++-
 arch/x86/kvm/i8254.c            |  31 +++++--
 arch/x86/kvm/ioapic.c           |  36 ++++++++-
 arch/x86/kvm/lapic.c            |  35 +++++---
 arch/x86/kvm/lapic.h            |   2 +
 arch/x86/kvm/svm.c              | 173 +++++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/vmx/vmx.c          |   2 +-
 arch/x86/kvm/x86.c              |  96 +++++++++++++++++++++-
 include/linux/kvm_host.h        |   1 +
 virt/kvm/arm/arm.c              |   5 ++
 virt/kvm/kvm_main.c             |   2 +-
 16 files changed, 421 insertions(+), 42 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 01/15] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-27  7:15   ` Vitaly Kuznetsov
  2019-08-15 16:25 ` [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state Suthikulpanit, Suravee
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Generally, APICv for all vcpus in the VM are enable/disable in the same
manner. So, get_enable_apicv() should represent APICv status of the VM
instead of each VCPU.

Modify kvm_x86_ops.get_enable_apicv() to take struct kvm as parameter
instead of struct kvm_vcpu.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/svm.c              | 5 +++--
 arch/x86/kvm/vmx/vmx.c          | 2 +-
 arch/x86/kvm/x86.c              | 2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 26d1eb8..56bc702 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1077,7 +1077,7 @@ struct kvm_x86_ops {
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
-	bool (*get_enable_apicv)(struct kvm_vcpu *vcpu);
+	bool (*get_enable_apicv)(struct kvm *kvm);
 	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ccd5aa6..6851bce 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
 static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
 static void svm_complete_interrupts(struct vcpu_svm *svm);
+static bool svm_get_enable_apicv(struct kvm *kvm);
 
 static int nested_svm_exit_handled(struct vcpu_svm *svm);
 static int nested_svm_intercept(struct vcpu_svm *svm);
@@ -5124,9 +5125,9 @@ static void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
 	return;
 }
 
-static bool svm_get_enable_apicv(struct kvm_vcpu *vcpu)
+static bool svm_get_enable_apicv(struct kvm *kvm)
 {
-	return avic && irqchip_split(vcpu->kvm);
+	return avic && irqchip_split(kvm);
 }
 
 static void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d98eac3..18a4b94 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3610,7 +3610,7 @@ void pt_update_intercept_for_msr(struct vcpu_vmx *vmx)
 	}
 }
 
-static bool vmx_get_enable_apicv(struct kvm_vcpu *vcpu)
+static bool vmx_get_enable_apicv(struct kvm *kvm)
 {
 	return enable_apicv;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fafd81d..7daf0dd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9150,7 +9150,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 		goto fail_free_pio_data;
 
 	if (irqchip_in_kernel(vcpu->kvm)) {
-		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu);
+		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
 		r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
 		if (r < 0)
 			goto fail_mmu_destroy;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 01/15] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-19  9:49   ` Alexander Graf
  2019-08-15 16:25 ` [PATCH v2 03/15] kvm: Add arch-specific per-VM debugfs support Suthikulpanit, Suravee
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Currently, after a VM boots with APICv enabled, it could go into
the following states:
  * activated   = VM is running w/ APICv
  * deactivated = VM deactivate APICv (temporary)
  * disabled    = VM deactivate APICv (permanent)

Introduce KVM APICv state enum to help keep track of the APICv states
along with a new variable struct kvm_arch.apicv_state to store
the current state.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h | 11 +++++++++++
 arch/x86/kvm/x86.c              | 14 +++++++++++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 56bc702..04d7066 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -845,6 +845,15 @@ enum kvm_irqchip_mode {
 	KVM_IRQCHIP_SPLIT,        /* created with KVM_CAP_SPLIT_IRQCHIP */
 };
 
+/*
+ * KVM assumes all vcpus in a VM operate in the same mode.
+ */
+enum kvm_apicv_state {
+	APICV_DISABLED,		/* Disabled (such as for Hyper-V case) */
+	APICV_DEACTIVATED,	/* Deactivated tempoerary */
+	APICV_ACTIVATED,	/* Default status when APICV is enabled */
+};
+
 struct kvm_arch {
 	unsigned long n_used_mmu_pages;
 	unsigned long n_requested_mmu_pages;
@@ -873,6 +882,8 @@ struct kvm_arch {
 	struct kvm_apic_map *apic_map;
 
 	bool apic_access_page_done;
+	struct mutex apicv_lock;
+	enum kvm_apicv_state apicv_state;
 
 	gpa_t wall_clock;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7daf0dd..f9c3f63 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4584,6 +4584,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		kvm->arch.irqchip_mode = KVM_IRQCHIP_SPLIT;
 		kvm->arch.nr_reserved_ioapic_pins = cap->args[0];
 		r = 0;
+		if (kvm_x86_ops->get_enable_apicv(kvm))
+			kvm->arch.apicv_state = APICV_ACTIVATED;
 split_irqchip_unlock:
 		mutex_unlock(&kvm->lock);
 		break;
@@ -4701,6 +4703,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		/* Write kvm->irq_routing before enabling irqchip_in_kernel. */
 		smp_wmb();
 		kvm->arch.irqchip_mode = KVM_IRQCHIP_KERNEL;
+		if (kvm_x86_ops->get_enable_apicv(kvm))
+			kvm->arch.apicv_state = APICV_ACTIVATED;
 	create_irqchip_unlock:
 		mutex_unlock(&kvm->lock);
 		break;
@@ -9150,13 +9154,18 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 		goto fail_free_pio_data;
 
 	if (irqchip_in_kernel(vcpu->kvm)) {
-		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
 		r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
 		if (r < 0)
 			goto fail_mmu_destroy;
 	} else
 		static_key_slow_inc(&kvm_no_apic_vcpu);
 
+	mutex_lock(&vcpu->kvm->arch.apicv_lock);
+	if (irqchip_in_kernel(vcpu->kvm) &&
+	    vcpu->kvm->arch.apicv_state == APICV_ACTIVATED)
+		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
+	mutex_unlock(&vcpu->kvm->arch.apicv_lock);
+
 	vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4,
 				       GFP_KERNEL_ACCOUNT);
 	if (!vcpu->arch.mce_banks) {
@@ -9255,6 +9264,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	kvm_page_track_init(kvm);
 	kvm_mmu_init_vm(kvm);
 
+	/* APICV initialization */
+	mutex_init(&kvm->arch.apicv_lock);
+
 	if (kvm_x86_ops->vm_init)
 		return kvm_x86_ops->vm_init(kvm);
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 03/15] kvm: Add arch-specific per-VM debugfs support
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 01/15] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs Suthikulpanit, Suravee
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Introduce per-VM debugfs for providing per-VM debug information.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/mips/kvm/mips.c       | 5 +++++
 arch/powerpc/kvm/powerpc.c | 5 +++++
 arch/s390/kvm/kvm-s390.c   | 5 +++++
 arch/x86/kvm/debugfs.c     | 5 +++++
 include/linux/kvm_host.h   | 1 +
 virt/kvm/arm/arm.c         | 5 +++++
 virt/kvm/kvm_main.c        | 2 +-
 7 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 0369f26..d8325b7 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -160,6 +160,11 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+	return 0;
+}
+
 void kvm_mips_free_vcpus(struct kvm *kvm)
 {
 	unsigned int i;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 6d704ad..44766af 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -462,6 +462,11 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+	return 0;
+}
+
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
 	unsigned int i;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 28ebd64..243b5c4 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2518,6 +2518,11 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+	return 0;
+}
+
 void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	VCPU_EVENT(vcpu, 3, "%s", "free cpu");
diff --git a/arch/x86/kvm/debugfs.c b/arch/x86/kvm/debugfs.c
index 329361b..d62852c 100644
--- a/arch/x86/kvm/debugfs.c
+++ b/arch/x86/kvm/debugfs.c
@@ -48,6 +48,11 @@ static int vcpu_get_tsc_scaling_frac_bits(void *data, u64 *val)
 
 DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_frac_fops, vcpu_get_tsc_scaling_frac_bits, NULL, "%llu\n");
 
+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+	return 0;
+}
+
 int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
 {
 	struct dentry *ret;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d1ad38a..ef9f176 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -862,6 +862,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 
 bool kvm_arch_has_vcpu_debugfs(void);
 int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu);
+int kvm_arch_create_vm_debugfs(struct kvm *kvm);
 
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index bd5c559..8812c55 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -154,6 +154,11 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvm_arch_create_vm_debugfs(struct kvm *kvm)
+{
+	return 0;
+}
+
 vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
 {
 	return VM_FAULT_SIGBUS;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2f2d24a..e39db0c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -620,7 +620,7 @@ static int kvm_create_vm_debugfs(struct kvm *kvm, int fd)
 		debugfs_create_file(p->name, 0644, kvm->debugfs_dentry,
 				    stat_data, stat_fops_per_vm[p->kind]);
 	}
-	return 0;
+	return kvm_arch_create_vm_debugfs(kvm);
 }
 
 static struct kvm *kvm_create_vm(unsigned long type)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (2 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 03/15] kvm: Add arch-specific per-VM debugfs support Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-19  9:57   ` Alexander Graf
  2019-08-15 16:25 ` [PATCH v2 05/15] kvm: lapic: Introduce APICv update helper function Suthikulpanit, Suravee
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Currently, there is no way to tell whether APICv is active
on a particular VM. This often cause confusion since APICv
can be deactivated at runtime.

Introduce a debugfs entry to report APICv state of a VM.
This creates a read-only file:

   /sys/kernel/debug/kvm/70860-14/apicv-state

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/debugfs.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/x86/kvm/debugfs.c b/arch/x86/kvm/debugfs.c
index d62852c..bd9fd25 100644
--- a/arch/x86/kvm/debugfs.c
+++ b/arch/x86/kvm/debugfs.c
@@ -48,8 +48,30 @@ static int vcpu_get_tsc_scaling_frac_bits(void *data, u64 *val)
 
 DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_frac_fops, vcpu_get_tsc_scaling_frac_bits, NULL, "%llu\n");
 
+static int kvm_get_apicv_state(void *data, u64 *val)
+{
+	struct kvm *kvm = (struct kvm *)data;
+
+	mutex_lock(&kvm->arch.apicv_lock);
+	*val = kvm->arch.apicv_state;
+	mutex_unlock(&kvm->arch.apicv_lock);
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(apicv_state_fops, kvm_get_apicv_state, NULL, "%llu\n");
+
 int kvm_arch_create_vm_debugfs(struct kvm *kvm)
 {
+	struct dentry *ret;
+
+	if (kvm_x86_ops->get_enable_apicv(kvm)) {
+		ret = debugfs_create_file("apicv-state", 0444,
+					  kvm->debugfs_dentry,
+					  kvm, &apicv_state_fops);
+		if (!ret)
+			return -ENOMEM;
+	}
+
 	return 0;
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 05/15] kvm: lapic: Introduce APICv update helper function
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (3 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-27  7:22   ` Vitaly Kuznetsov
  2019-08-15 16:25 ` [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime Suthikulpanit, Suravee
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Re-factor code into a helper function for setting lapic parameters when
activate/deactivate APICv, and export the function for subsequent usage.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/lapic.c | 22 +++++++++++++++++-----
 arch/x86/kvm/lapic.h |  1 +
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 4dabc31..90f79ca 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2153,6 +2153,21 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 
 }
 
+void kvm_apic_update_apicv(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+
+	if (vcpu->arch.apicv_active) {
+		/* irr_pending is always true when apicv is activated. */
+		apic->irr_pending = true;
+		apic->isr_count = 1;
+	} else {
+		apic->irr_pending = (apic_search_irr(apic) != -1);
+		apic->isr_count = count_vectors(apic->regs + APIC_ISR);
+	}
+}
+EXPORT_SYMBOL(kvm_apic_update_apicv);
+
 void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
@@ -2197,8 +2212,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 		kvm_lapic_set_reg(apic, APIC_ISR + 0x10 * i, 0);
 		kvm_lapic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
 	}
-	apic->irr_pending = vcpu->arch.apicv_active;
-	apic->isr_count = vcpu->arch.apicv_active ? 1 : 0;
+	kvm_apic_update_apicv(vcpu);
 	apic->highest_isr_cache = -1;
 	update_divide_count(apic);
 	atomic_set(&apic->lapic_timer.pending, 0);
@@ -2468,9 +2482,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
 	update_divide_count(apic);
 	start_apic_timer(apic);
-	apic->irr_pending = true;
-	apic->isr_count = vcpu->arch.apicv_active ?
-				1 : count_vectors(apic->regs + APIC_ISR);
+	kvm_apic_update_apicv(vcpu);
 	apic->highest_isr_cache = -1;
 	if (vcpu->arch.apicv_active) {
 		kvm_x86_ops->apicv_post_state_restore(vcpu);
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index d6d049b..3892d50 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -90,6 +90,7 @@ bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
 int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
 		     struct dest_map *dest_map);
 int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
+void kvm_apic_update_apicv(struct kvm_vcpu *vcpu);
 
 bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
 		struct kvm_lapic_irq *irq, int *r, struct dest_map *dest_map);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (4 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 05/15] kvm: lapic: Introduce APICv update helper function Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-27  7:29   ` Vitaly Kuznetsov
  2019-08-15 16:25 ` [PATCH v2 07/15] kvm: x86: svm: Add support to activate/deactivate posted interrupts Suthikulpanit, Suravee
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Certain runtime conditions require APICv to be temporary deactivated.
However, current implementation only support permanently deactivate
APICv at runtime (mainly used when running Hyper-V guest).

In addtion, for AMD, when activate / deactivate APICv during runtime,
all vcpus in the VM has to be operating in the same APICv mode, which
requires the requesting (main) vcpu to notify others.

So, introduce interfaces to request all vcpus to activate/deactivate
APICv.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  9 +++++
 arch/x86/kvm/x86.c              | 76 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 04d7066..dfb7c3d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -76,6 +76,10 @@
 #define KVM_REQ_HV_STIMER		KVM_ARCH_REQ(22)
 #define KVM_REQ_LOAD_EOI_EXITMAP	KVM_ARCH_REQ(23)
 #define KVM_REQ_GET_VMCS12_PAGES	KVM_ARCH_REQ(24)
+#define KVM_REQ_APICV_ACTIVATE		\
+	KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_APICV_DEACTIVATE	\
+	KVM_ARCH_REQ_FLAGS(26, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -1089,6 +1093,7 @@ struct kvm_x86_ops {
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
 	bool (*get_enable_apicv)(struct kvm *kvm);
+	void (*pre_update_apicv_exec_ctrl)(struct kvm_vcpu *vcpu, bool activate);
 	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
@@ -1552,6 +1557,10 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
 
 void kvm_make_mclock_inprogress_request(struct kvm *kvm);
 void kvm_make_scan_ioapic_request(struct kvm *kvm);
+void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
+void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu);
+void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu);
+void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable);
 
 void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
 				     struct kvm_async_pf *work);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f9c3f63..40a20bf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -26,6 +26,7 @@
 #include "cpuid.h"
 #include "pmu.h"
 #include "hyperv.h"
+#include "lapic.h"
 
 #include <linux/clocksource.h>
 #include <linux/interrupt.h>
@@ -7163,6 +7164,22 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
 	kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
 }
 
+void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu)
+{
+	if (!lapic_in_kernel(vcpu)) {
+		WARN_ON_ONCE(!vcpu->arch.apicv_active);
+		return;
+	}
+	if (vcpu->arch.apicv_active)
+		return;
+
+	vcpu->arch.apicv_active = true;
+	kvm_apic_update_apicv(vcpu);
+
+	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_activate_apicv);
+
 void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
 {
 	if (!lapic_in_kernel(vcpu)) {
@@ -7173,8 +7190,11 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
 		return;
 
 	vcpu->arch.apicv_active = false;
+	kvm_apic_update_apicv(vcpu);
+
 	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
 }
+EXPORT_SYMBOL_GPL(kvm_vcpu_deactivate_apicv);
 
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
@@ -7668,6 +7688,58 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
 	kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
 }
 
+void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu)
+{
+	int i;
+	struct kvm_vcpu *v;
+	struct kvm *kvm = vcpu->kvm;
+
+	mutex_lock(&kvm->arch.apicv_lock);
+	if (kvm->arch.apicv_state != APICV_DEACTIVATED) {
+		mutex_unlock(&kvm->arch.apicv_lock);
+		return;
+	}
+
+	kvm_for_each_vcpu(i, v, kvm)
+		kvm_clear_request(KVM_REQ_APICV_DEACTIVATE, v);
+
+	if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
+		kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, true);
+
+	kvm->arch.apicv_state = APICV_ACTIVATED;
+
+	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_ACTIVATE);
+
+	mutex_unlock(&kvm->arch.apicv_lock);
+}
+EXPORT_SYMBOL_GPL(kvm_make_apicv_activate_request);
+
+void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable)
+{
+	int i;
+	struct kvm_vcpu *v;
+	struct kvm *kvm = vcpu->kvm;
+
+	mutex_lock(&kvm->arch.apicv_lock);
+	if (kvm->arch.apicv_state == APICV_DEACTIVATED) {
+		mutex_unlock(&kvm->arch.apicv_lock);
+		return;
+	}
+
+	kvm_for_each_vcpu(i, v, kvm)
+		kvm_clear_request(KVM_REQ_APICV_ACTIVATE, v);
+
+	if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
+		kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, false);
+
+	kvm->arch.apicv_state = disable ? APICV_DISABLED : APICV_DEACTIVATED;
+
+	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_DEACTIVATE);
+
+	mutex_unlock(&kvm->arch.apicv_lock);
+}
+EXPORT_SYMBOL_GPL(kvm_make_apicv_deactivate_request);
+
 static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 {
 	if (!kvm_apic_present(vcpu))
@@ -7854,6 +7926,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		 */
 		if (kvm_check_request(KVM_REQ_HV_STIMER, vcpu))
 			kvm_hv_process_stimers(vcpu);
+		if (kvm_check_request(KVM_REQ_APICV_ACTIVATE, vcpu))
+			kvm_vcpu_activate_apicv(vcpu);
+		if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
+			kvm_vcpu_deactivate_apicv(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 07/15] kvm: x86: svm: Add support to activate/deactivate posted interrupts
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (5 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 08/15] svm: Add support for setup/destroy virutal APIC backing page for AVIC Suthikulpanit, Suravee
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Introduce interface for activate/deactivate posted interrupts, and
implement SVM hooks to toggle AMD IOMMU guest virtual APIC mode.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  4 ++++
 arch/x86/kvm/svm.c              | 44 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |  6 ++++++
 3 files changed, 54 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index dfb7c3d..0d8544b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1183,6 +1183,10 @@ struct kvm_x86_ops {
 
 	int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
 			      uint32_t guest_irq, bool set);
+
+	int (*activate_pi_irte)(struct kvm_vcpu *vcpu);
+	int (*deactivate_pi_irte)(struct kvm_vcpu *vcpu);
+
 	void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
 
 	int (*set_hv_timer)(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc,
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 6851bce..b674cd0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5374,6 +5374,48 @@ static int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
 	return ret;
 }
 
+static int svm_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
+{
+	int ret = 0;
+	unsigned long flags;
+	struct amd_svm_iommu_ir *ir;
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (!kvm_arch_has_assigned_device(vcpu->kvm))
+		return 0;
+
+	/*
+	 * Here, we go through the per-vcpu ir_list to update all existing
+	 * interrupt remapping table entry targeting this vcpu.
+	 */
+	spin_lock_irqsave(&svm->ir_list_lock, flags);
+
+	if (list_empty(&svm->ir_list))
+		goto out;
+
+	list_for_each_entry(ir, &svm->ir_list, node) {
+		if (activate)
+			ret = amd_iommu_activate_guest_mode(ir->data);
+		else
+			ret = amd_iommu_deactivate_guest_mode(ir->data);
+		if (ret)
+			break;
+	}
+out:
+	spin_unlock_irqrestore(&svm->ir_list_lock, flags);
+	return ret;
+}
+
+static int svm_activate_pi_irte(struct kvm_vcpu *vcpu)
+{
+	return svm_set_pi_irte_mode(vcpu, true);
+}
+
+static int svm_deactivate_pi_irte(struct kvm_vcpu *vcpu)
+{
+	return svm_set_pi_irte_mode(vcpu, false);
+}
+
 static int svm_nmi_allowed(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -7269,6 +7311,8 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 	.pmu_ops = &amd_pmu_ops,
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
 	.update_pi_irte = svm_update_pi_irte,
+	.activate_pi_irte = svm_activate_pi_irte,
+	.deactivate_pi_irte = svm_deactivate_pi_irte,
 	.setup_mce = svm_setup_mce,
 
 	.smi_allowed = svm_smi_allowed,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 40a20bf..5ab1643 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7177,6 +7177,9 @@ void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu)
 	kvm_apic_update_apicv(vcpu);
 
 	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
+
+	if (kvm_x86_ops->activate_pi_irte)
+		kvm_x86_ops->activate_pi_irte(vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_activate_apicv);
 
@@ -7192,6 +7195,9 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
 	vcpu->arch.apicv_active = false;
 	kvm_apic_update_apicv(vcpu);
 
+	if (kvm_x86_ops->deactivate_pi_irte)
+		kvm_x86_ops->deactivate_pi_irte(vcpu);
+
 	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_deactivate_apicv);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 08/15] svm: Add support for setup/destroy virutal APIC backing page for AVIC
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (6 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 07/15] kvm: x86: svm: Add support to activate/deactivate posted interrupts Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 09/15] svm: Add support for activate/deactivate AVIC at runtime Suthikulpanit, Suravee
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Activate/deactivate AVIC requires setting/unsetting the memory region used
for virtual APIC backing page (APIC_ACCESS_PAGE_PRIVATE_MEMSLOT).
So, re-factor avic_init_access_page() to avic_setup_access_page()
and add srcu_read_lock/unlock, which are needed to allow this function
to be called during run-time.

Also, introduce avic_destroy_access_page() to unset the page when
deactivate AVIC.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 48 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b674cd0..47f2439 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1468,7 +1468,9 @@ static u64 svm_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
 static void avic_init_vmcb(struct vcpu_svm *svm)
 {
 	struct vmcb *vmcb = svm->vmcb;
-	struct kvm_svm *kvm_svm = to_kvm_svm(svm->vcpu.kvm);
+	struct kvm *kvm = svm->vcpu.kvm;
+	struct kvm_svm *kvm_svm = to_kvm_svm(kvm);
+
 	phys_addr_t bpa = __sme_set(page_to_phys(svm->avic_backing_page));
 	phys_addr_t lpa = __sme_set(page_to_phys(kvm_svm->avic_logical_id_table_page));
 	phys_addr_t ppa = __sme_set(page_to_phys(kvm_svm->avic_physical_id_table_page));
@@ -1477,7 +1479,13 @@ static void avic_init_vmcb(struct vcpu_svm *svm)
 	vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
 	vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
 	vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
-	vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+
+	mutex_lock(&kvm->arch.apicv_lock);
+	if (kvm->arch.apicv_state == APICV_ACTIVATED)
+		vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+	else
+		vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+	mutex_unlock(&kvm->arch.apicv_lock);
 }
 
 static void init_vmcb(struct vcpu_svm *svm)
@@ -1660,19 +1668,24 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
  * field of the VMCB. Therefore, we set up the
  * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (4KB) here.
  */
-static int avic_init_access_page(struct kvm_vcpu *vcpu)
+static int avic_setup_access_page(struct kvm_vcpu *vcpu, bool init)
 {
 	struct kvm *kvm = vcpu->kvm;
 	int ret = 0;
 
 	mutex_lock(&kvm->slots_lock);
+
 	if (kvm->arch.apic_access_page_done)
 		goto out;
 
+	if (!init)
+		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 	ret = __x86_set_memory_region(kvm,
 				      APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
 				      APIC_DEFAULT_PHYS_BASE,
 				      PAGE_SIZE);
+	if (!init)
+		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 	if (ret)
 		goto out;
 
@@ -1682,14 +1695,39 @@ static int avic_init_access_page(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+static void avic_destroy_access_page(struct kvm_vcpu *vcpu)
+{
+	struct kvm *kvm = vcpu->kvm;
+
+	mutex_lock(&kvm->slots_lock);
+
+	if (!kvm->arch.apic_access_page_done)
+		goto out;
+
+	srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
+	__x86_set_memory_region(kvm,
+				APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
+				APIC_DEFAULT_PHYS_BASE,
+				0);
+	vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
+	kvm->arch.apic_access_page_done = false;
+out:
+	mutex_unlock(&kvm->slots_lock);
+}
+
 static int avic_init_backing_page(struct kvm_vcpu *vcpu)
 {
-	int ret;
+	int ret = 0;
 	u64 *entry, new_entry;
 	int id = vcpu->vcpu_id;
+	struct kvm *kvm = vcpu->kvm;
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	ret = avic_init_access_page(vcpu);
+	mutex_lock(&kvm->arch.apicv_lock);
+	if (kvm->arch.apicv_state == APICV_ACTIVATED)
+		ret = avic_setup_access_page(vcpu, true);
+	mutex_unlock(&kvm->arch.apicv_lock);
+
 	if (ret)
 		return ret;
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 09/15] svm: Add support for activate/deactivate AVIC at runtime
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (7 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 08/15] svm: Add support for setup/destroy virutal APIC backing page for AVIC Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-19 10:28   ` Alexander Graf
  2019-08-15 16:25 ` [PATCH v2 10/15] kvm: x86: hyperv: Use APICv deactivate request interface Suthikulpanit, Suravee
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Add necessary logics for supporting activate/deactivate AVIC at runtime.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 47f2439..cfa4b13 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -385,6 +385,7 @@ struct amd_svm_iommu_ir {
 static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
 static void svm_complete_interrupts(struct vcpu_svm *svm);
 static bool svm_get_enable_apicv(struct kvm *kvm);
+static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
 
 static int nested_svm_exit_handled(struct vcpu_svm *svm);
 static int nested_svm_intercept(struct vcpu_svm *svm);
@@ -2343,6 +2344,10 @@ static void svm_vcpu_blocking(struct kvm_vcpu *vcpu)
 
 static void svm_vcpu_unblocking(struct kvm_vcpu *vcpu)
 {
+	if (kvm_check_request(KVM_REQ_APICV_ACTIVATE, vcpu))
+		kvm_vcpu_activate_apicv(vcpu);
+	if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
+		kvm_vcpu_deactivate_apicv(vcpu);
 	avic_set_running(vcpu, true);
 }
 
@@ -5182,10 +5187,19 @@ static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct vmcb *vmcb = svm->vmcb;
 
-	if (kvm_vcpu_apicv_active(vcpu))
+	if (kvm_vcpu_apicv_active(vcpu)) {
+		/**
+		 * During AVIC temporary deactivation, guest could update
+		 * APIC ID, DFR and LDR registers, which would not be trapped
+		 * by avic_unaccelerated_ccess_interception(). In this case,
+		 * we need to check and update the AVIC logical APIC ID table
+		 * accordingly before re-activating.
+		 */
+		avic_post_state_restore(vcpu);
 		vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
-	else
+	} else {
 		vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+	}
 	mark_dirty(vmcb, VMCB_AVIC);
 }
 
@@ -7229,6 +7243,14 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 	return false;
 }
 
+static void svm_pre_update_apicv_exec_ctrl(struct kvm_vcpu *vcpu, bool activate)
+{
+	if (activate)
+		avic_setup_access_page(vcpu, false);
+	else
+		avic_destroy_access_page(vcpu);
+}
+
 static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -7306,6 +7328,7 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 	.set_virtual_apic_mode = svm_set_virtual_apic_mode,
 	.get_enable_apicv = svm_get_enable_apicv,
 	.refresh_apicv_exec_ctrl = svm_refresh_apicv_exec_ctrl,
+	.pre_update_apicv_exec_ctrl = svm_pre_update_apicv_exec_ctrl,
 	.load_eoi_exitmap = svm_load_eoi_exitmap,
 	.hwapic_irr_update = svm_hwapic_irr_update,
 	.hwapic_isr_update = svm_hwapic_isr_update,
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 10/15] kvm: x86: hyperv: Use APICv deactivate request interface
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (8 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 09/15] svm: Add support for activate/deactivate AVIC at runtime Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-19 10:31   ` Alexander Graf
  2019-08-15 16:25 ` [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling Suthikulpanit, Suravee
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Since disabling APICv has to be done for all vcpus on AMD-based system,
adopt the newly introduced kvm_make_apicv_deactivate_request() intereface.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/hyperv.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index a39e38f..4f71a39 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -772,9 +772,17 @@ int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages)
 
 	/*
 	 * Hyper-V SynIC auto EOI SINT's are
-	 * not compatible with APICV, so deactivate APICV
+	 * not compatible with APICV, so  request
+	 * to deactivate APICV permanently.
+	 *
+	 * Since this requires updating
+	 * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
+	 * also take srcu lock.
 	 */
-	kvm_vcpu_deactivate_apicv(vcpu);
+	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+	kvm_make_apicv_deactivate_request(vcpu, true);
+	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
+
 	synic->active = true;
 	synic->dont_zero_synic_pages = dont_zero_synic_pages;
 	return 0;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (9 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 10/15] kvm: x86: hyperv: Use APICv deactivate request interface Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-19 10:35   ` Alexander Graf
  2019-08-15 16:25 ` [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC Suthikulpanit, Suravee
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
deactivated and fall back to using legacy interrupt injection via vINTR
and interrupt window.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index cfa4b13..4690351 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
 static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
 static void svm_complete_interrupts(struct vcpu_svm *svm);
+static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
 static bool svm_get_enable_apicv(struct kvm *kvm);
 static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
 
@@ -4494,6 +4495,15 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
 {
 	kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 	svm_clear_vintr(svm);
+
+	/*
+	 * For AVIC, the only reason to end up here is ExtINTs.
+	 * In this case AVIC was temporarily disabled for
+	 * requesting the IRQ window and we have to re-enable it.
+	 */
+	if (svm_get_enable_apicv(svm->vcpu.kvm))
+		svm_request_activate_avic(&svm->vcpu);
+
 	svm->vmcb->control.int_ctl &= ~V_IRQ_MASK;
 	mark_dirty(svm->vmcb, VMCB_INTR);
 	++svm->vcpu.stat.irq_window_exits;
@@ -5181,7 +5191,33 @@ static void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
 {
 }
 
-/* Note: Currently only used by Hyper-V. */
+static bool is_avic_active(struct vcpu_svm *svm)
+{
+	return (svm_get_enable_apicv(svm->vcpu.kvm) &&
+		svm->vmcb->control.int_ctl & AVIC_ENABLE_MASK);
+}
+
+static void svm_request_activate_avic(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (!lapic_in_kernel(vcpu) || is_avic_active(svm))
+		return;
+
+	kvm_make_apicv_activate_request(vcpu);
+}
+
+static void svm_request_deactivate_avic(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (!lapic_in_kernel(vcpu) || !is_avic_active(svm))
+		return;
+
+	/* Request temporary deactivate apicv */
+	kvm_make_apicv_deactivate_request(vcpu, false);
+}
+
 static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	if (kvm_vcpu_apicv_active(vcpu))
-		return;
-
 	/*
 	 * In case GIF=0 we can't rely on the CPU to tell us when GIF becomes
 	 * 1, because that's a separate STGI/VMRUN intercept.  The next time we
@@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
 	 * window under the assumption that the hardware will set the GIF.
 	 */
 	if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
+		/*
+		 * IRQ window is not needed when AVIC is enabled,
+		 * unless we have pending ExtINT since it cannot be injected
+		 * via AVIC. In such case, we need to temporarily disable AVIC,
+		 * and fallback to injecting IRQ via V_IRQ.
+		 */
+		if (kvm_vcpu_apicv_active(vcpu))
+			svm_request_deactivate_avic(&svm->vcpu);
 		svm_set_vintr(svm);
 		svm_inject_irq(svm, 0x0);
 	}
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (10 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-19 10:42   ` Alexander Graf
  2019-08-15 16:25 ` [PATCH v2 13/15] kvm: lapic: Clean up APIC predefined macros Suthikulpanit, Suravee
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

ACK notifiers don't work with AMD SVM w/ AVIC when the PIT interrupt
is delivered as edge-triggered fixed interrupt since AMD processors
cannot exit on EOI for these interrupts.

Add code to check LAPIC pending EOI before injecting any pending PIT
interrupt on AMD SVM when AVIC is activated.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/i8254.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 4a6dc54..31c4a9b 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -34,10 +34,12 @@
 
 #include <linux/kvm_host.h>
 #include <linux/slab.h>
+#include <asm/virtext.h>
 
 #include "ioapic.h"
 #include "irq.h"
 #include "i8254.h"
+#include "lapic.h"
 #include "x86.h"
 
 #ifndef CONFIG_X86_64
@@ -236,6 +238,12 @@ static void destroy_pit_timer(struct kvm_pit *pit)
 	kthread_flush_work(&pit->expired);
 }
 
+static inline void kvm_pit_reset_reinject(struct kvm_pit *pit)
+{
+	atomic_set(&pit->pit_state.pending, 0);
+	atomic_set(&pit->pit_state.irq_ack, 1);
+}
+
 static void pit_do_work(struct kthread_work *work)
 {
 	struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
@@ -244,6 +252,23 @@ static void pit_do_work(struct kthread_work *work)
 	int i;
 	struct kvm_kpit_state *ps = &pit->pit_state;
 
+	/*
+	 * Since, AMD SVM AVIC accelerates write access to APIC EOI
+	 * register for edge-trigger interrupts. PIT will not be able
+	 * to receive the IRQ ACK notifier and will always be zero.
+	 * Therefore, we check if any LAPIC EOI pending for vector 0
+	 * and reset irq_ack if no pending.
+	 */
+	if (cpu_has_svm(NULL) && kvm->arch.apicv_state == APICV_ACTIVATED) {
+		int eoi = 0;
+
+		kvm_for_each_vcpu(i, vcpu, kvm)
+			if (kvm_apic_pending_eoi(vcpu, 0))
+				eoi++;
+		if (!eoi)
+			kvm_pit_reset_reinject(pit);
+	}
+
 	if (atomic_read(&ps->reinject) && !atomic_xchg(&ps->irq_ack, 0))
 		return;
 
@@ -281,12 +306,6 @@ static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
 		return HRTIMER_NORESTART;
 }
 
-static inline void kvm_pit_reset_reinject(struct kvm_pit *pit)
-{
-	atomic_set(&pit->pit_state.pending, 0);
-	atomic_set(&pit->pit_state.irq_ack, 1);
-}
-
 void kvm_pit_set_reinject(struct kvm_pit *pit, bool reinject)
 {
 	struct kvm_kpit_state *ps = &pit->pit_state;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 13/15] kvm: lapic: Clean up APIC predefined macros
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (11 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 14/15] kvm: ioapic: Delay update IOAPIC EOI for RTC Suthikulpanit, Suravee
  2019-08-15 16:25 ` [PATCH v2 15/15] svm: Allow AVIC with in-kernel irqchip mode Suthikulpanit, Suravee
  14 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Move these duplicated predefined macros to the header file so that
it can be re-used in other places.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/lapic.c | 13 +++++--------
 arch/x86/kvm/lapic.h |  1 +
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 90f79ca..3f0674b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -59,9 +59,6 @@
 #define APIC_VERSION			(0x14UL | ((KVM_APIC_LVT_NUM - 1) << 16))
 #define LAPIC_MMIO_LENGTH		(1 << 12)
 /* followed define is not in apicdef.h */
-#define APIC_SHORT_MASK			0xc0000
-#define APIC_DEST_NOSHORT		0x0
-#define APIC_DEST_MASK			0x800
 #define MAX_APIC_VECTOR			256
 #define APIC_VECTORS_PER_REG		32
 
@@ -566,9 +563,9 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
 	irq.level = (icr & APIC_INT_ASSERT) != 0;
 	irq.trig_mode = icr & APIC_INT_LEVELTRIG;
 
-	if (icr & APIC_DEST_MASK)
+	if (icr & KVM_APIC_DEST_MASK)
 		return -KVM_EINVAL;
-	if (icr & APIC_SHORT_MASK)
+	if (icr & KVM_APIC_SHORT_MASK)
 		return -KVM_EINVAL;
 
 	rcu_read_lock();
@@ -808,7 +805,7 @@ bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
 
 	ASSERT(target);
 	switch (short_hand) {
-	case APIC_DEST_NOSHORT:
+	case KVM_APIC_DEST_NOSHORT:
 		if (dest_mode == APIC_DEST_PHYSICAL)
 			return kvm_apic_match_physical_addr(target, mda);
 		else
@@ -1211,10 +1208,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
 
 	irq.vector = icr_low & APIC_VECTOR_MASK;
 	irq.delivery_mode = icr_low & APIC_MODE_MASK;
-	irq.dest_mode = icr_low & APIC_DEST_MASK;
+	irq.dest_mode = icr_low & KVM_APIC_DEST_MASK;
 	irq.level = (icr_low & APIC_INT_ASSERT) != 0;
 	irq.trig_mode = icr_low & APIC_INT_LEVELTRIG;
-	irq.shorthand = icr_low & APIC_SHORT_MASK;
+	irq.shorthand = icr_low & KVM_APIC_SHORT_MASK;
 	irq.msi_redir_hint = false;
 	if (apic_x2apic_mode(apic))
 		irq.dest_id = icr_high;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 3892d50..fb02e3f 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -10,6 +10,7 @@
 #define KVM_APIC_SIPI		1
 #define KVM_APIC_LVT_NUM	6
 
+#define KVM_APIC_DEST_NOSHORT	0x0
 #define KVM_APIC_SHORT_MASK	0xc0000
 #define KVM_APIC_DEST_MASK	0x800
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 14/15] kvm: ioapic: Delay update IOAPIC EOI for RTC
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (12 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 13/15] kvm: lapic: Clean up APIC predefined macros Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  2019-08-19 11:00   ` Alexander Graf
  2019-08-15 16:25 ` [PATCH v2 15/15] svm: Allow AVIC with in-kernel irqchip mode Suthikulpanit, Suravee
  14 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

In-kernel IOAPIC does not update RTC pending EOI info with AMD SVM /w AVIC
when interrupt is delivered as edge-triggered since AMD processors
cannot exit on EOI for these interrupts.

Add code to also check LAPIC pending EOI before injecting any new RTC
interrupts on AMD SVM when AVIC is activated.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/ioapic.c | 36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index 1add1bc..45e7bb0 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -39,6 +39,7 @@
 #include <asm/processor.h>
 #include <asm/page.h>
 #include <asm/current.h>
+#include <asm/virtext.h>
 #include <trace/events/kvm.h>
 
 #include "ioapic.h"
@@ -173,6 +174,7 @@ static bool rtc_irq_check_coalesced(struct kvm_ioapic *ioapic)
 	return false;
 }
 
+#define APIC_DEST_NOSHORT		0x0
 static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
 		int irq_level, bool line_status)
 {
@@ -201,10 +203,36 @@ static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
 	 * interrupts lead to time drift in Windows guests.  So we track
 	 * EOI manually for the RTC interrupt.
 	 */
-	if (irq == RTC_GSI && line_status &&
-		rtc_irq_check_coalesced(ioapic)) {
-		ret = 0;
-		goto out;
+	if (irq == RTC_GSI && line_status) {
+		struct kvm *kvm = ioapic->kvm;
+		union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
+
+		/*
+		 * Since, AMD SVM AVIC accelerates write access to APIC EOI
+		 * register for edge-trigger interrupts, IOAPIC will not be
+		 * able to receive the EOI. In this case, we do lazy update
+		 * of the pending EOI when trying to set IOAPIC irq for RTC.
+		 */
+		if (cpu_has_svm(NULL) &&
+		    (kvm->arch.apicv_state == APICV_ACTIVATED) &&
+		    (entry->fields.trig_mode == IOAPIC_EDGE_TRIG)) {
+			int i;
+			struct kvm_vcpu *vcpu;
+
+			kvm_for_each_vcpu(i, vcpu, kvm)
+				if (kvm_apic_match_dest(vcpu, NULL,
+							KVM_APIC_DEST_NOSHORT,
+							entry->fields.dest_id,
+							entry->fields.dest_mode)) {
+					__rtc_irq_eoi_tracking_restore_one(vcpu);
+					break;
+				}
+		}
+
+		if (rtc_irq_check_coalesced(ioapic)) {
+			ret = 0;
+			goto out;
+		}
 	}
 
 	old_irr = ioapic->irr;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 15/15] svm: Allow AVIC with in-kernel irqchip mode
  2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
                   ` (13 preceding siblings ...)
  2019-08-15 16:25 ` [PATCH v2 14/15] kvm: ioapic: Delay update IOAPIC EOI for RTC Suthikulpanit, Suravee
@ 2019-08-15 16:25 ` Suthikulpanit, Suravee
  14 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-15 16:25 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, graf, jschoenh, karahmed, rimasluk,
	Grimm, Jon, Suthikulpanit, Suravee

Once run-time AVIC activate/deactivate, and PIC and IOAPIC EOI workaround
for AVIC is supported, we can remove the kernel irqchip split mode
requirement for AVIC.

Hence, remove the check for irqchip split mode when enabling AVIC.

Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 4690351..04c8392 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5180,7 +5180,7 @@ static void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
 
 static bool svm_get_enable_apicv(struct kvm *kvm)
 {
-	return avic && irqchip_split(kvm);
+	return avic;
 }
 
 static void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state
  2019-08-15 16:25 ` [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state Suthikulpanit, Suravee
@ 2019-08-19  9:49   ` Alexander Graf
  2019-08-26 19:06     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2019-08-19  9:49 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> Currently, after a VM boots with APICv enabled, it could go into
> the following states:
>    * activated   = VM is running w/ APICv
>    * deactivated = VM deactivate APICv (temporary)
>    * disabled    = VM deactivate APICv (permanent)
> 
> Introduce KVM APICv state enum to help keep track of the APICv states
> along with a new variable struct kvm_arch.apicv_state to store
> the current state.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   arch/x86/include/asm/kvm_host.h | 11 +++++++++++
>   arch/x86/kvm/x86.c              | 14 +++++++++++++-
>   2 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 56bc702..04d7066 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -845,6 +845,15 @@ enum kvm_irqchip_mode {
>   	KVM_IRQCHIP_SPLIT,        /* created with KVM_CAP_SPLIT_IRQCHIP */
>   };
>   
> +/*
> + * KVM assumes all vcpus in a VM operate in the same mode.
> + */
> +enum kvm_apicv_state {
> +	APICV_DISABLED,		/* Disabled (such as for Hyper-V case) */
> +	APICV_DEACTIVATED,	/* Deactivated tempoerary */

typo

I'm also not sure the name is 100% obvious. How about something like 
"suspended" or "paused"?

> +	APICV_ACTIVATED,	/* Default status when APICV is enabled */
> +};
> +
>   struct kvm_arch {
>   	unsigned long n_used_mmu_pages;
>   	unsigned long n_requested_mmu_pages;
> @@ -873,6 +882,8 @@ struct kvm_arch {
>   	struct kvm_apic_map *apic_map;
>   
>   	bool apic_access_page_done;
> +	struct mutex apicv_lock;
> +	enum kvm_apicv_state apicv_state;
>   
>   	gpa_t wall_clock;
>   
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7daf0dd..f9c3f63 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4584,6 +4584,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>   		kvm->arch.irqchip_mode = KVM_IRQCHIP_SPLIT;
>   		kvm->arch.nr_reserved_ioapic_pins = cap->args[0];
>   		r = 0;
> +		if (kvm_x86_ops->get_enable_apicv(kvm))
> +			kvm->arch.apicv_state = APICV_ACTIVATED;
>   split_irqchip_unlock:
>   		mutex_unlock(&kvm->lock);
>   		break;
> @@ -4701,6 +4703,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
>   		/* Write kvm->irq_routing before enabling irqchip_in_kernel. */
>   		smp_wmb();
>   		kvm->arch.irqchip_mode = KVM_IRQCHIP_KERNEL;
> +		if (kvm_x86_ops->get_enable_apicv(kvm))
> +			kvm->arch.apicv_state = APICV_ACTIVATED;
>   	create_irqchip_unlock:
>   		mutex_unlock(&kvm->lock);
>   		break;
> @@ -9150,13 +9154,18 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>   		goto fail_free_pio_data;
>   
>   	if (irqchip_in_kernel(vcpu->kvm)) {
> -		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);

Why are you moving this into a locked section?

>   		r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
>   		if (r < 0)
>   			goto fail_mmu_destroy;
>   	} else
>   		static_key_slow_inc(&kvm_no_apic_vcpu);
>   
> +	mutex_lock(&vcpu->kvm->arch.apicv_lock);
> +	if (irqchip_in_kernel(vcpu->kvm) &&
> +	    vcpu->kvm->arch.apicv_state == APICV_ACTIVATED)
> +		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
> +	mutex_unlock(&vcpu->kvm->arch.apicv_lock);
> +
>   	vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4,
>   				       GFP_KERNEL_ACCOUNT);
>   	if (!vcpu->arch.mce_banks) {
> @@ -9255,6 +9264,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>   	kvm_page_track_init(kvm);
>   	kvm_mmu_init_vm(kvm);
>   
> +	/* APICV initialization */
> +	mutex_init(&kvm->arch.apicv_lock);

In fact, the whole lock story is not part of the patch description :).


Alex

> +
>   	if (kvm_x86_ops->vm_init)
>   		return kvm_x86_ops->vm_init(kvm);
>   
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs
  2019-08-15 16:25 ` [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs Suthikulpanit, Suravee
@ 2019-08-19  9:57   ` Alexander Graf
  2019-08-26 19:41     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2019-08-19  9:57 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> Currently, there is no way to tell whether APICv is active
> on a particular VM. This often cause confusion since APICv
> can be deactivated at runtime.
> 
> Introduce a debugfs entry to report APICv state of a VM.
> This creates a read-only file:
> 
>     /sys/kernel/debug/kvm/70860-14/apicv-state
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Shouldn't this first and foremost be a VM ioctl so that user space can 
inquire its own state?


Alex

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 09/15] svm: Add support for activate/deactivate AVIC at runtime
  2019-08-15 16:25 ` [PATCH v2 09/15] svm: Add support for activate/deactivate AVIC at runtime Suthikulpanit, Suravee
@ 2019-08-19 10:28   ` Alexander Graf
  0 siblings, 0 replies; 39+ messages in thread
From: Alexander Graf @ 2019-08-19 10:28 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> Add necessary logics for supporting activate/deactivate AVIC at runtime.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   arch/x86/kvm/svm.c | 27 +++++++++++++++++++++++++--
>   1 file changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 47f2439..cfa4b13 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -385,6 +385,7 @@ struct amd_svm_iommu_ir {
>   static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>   static void svm_complete_interrupts(struct vcpu_svm *svm);
>   static bool svm_get_enable_apicv(struct kvm *kvm);
> +static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
>   
>   static int nested_svm_exit_handled(struct vcpu_svm *svm);
>   static int nested_svm_intercept(struct vcpu_svm *svm);
> @@ -2343,6 +2344,10 @@ static void svm_vcpu_blocking(struct kvm_vcpu *vcpu)
>   
>   static void svm_vcpu_unblocking(struct kvm_vcpu *vcpu)
>   {
> +	if (kvm_check_request(KVM_REQ_APICV_ACTIVATE, vcpu))
> +		kvm_vcpu_activate_apicv(vcpu);
> +	if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
> +		kvm_vcpu_deactivate_apicv(vcpu);
>   	avic_set_running(vcpu, true);
>   }
>   
> @@ -5182,10 +5187,19 @@ static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
>   	struct vcpu_svm *svm = to_svm(vcpu);
>   	struct vmcb *vmcb = svm->vmcb;
>   
> -	if (kvm_vcpu_apicv_active(vcpu))
> +	if (kvm_vcpu_apicv_active(vcpu)) {
> +		/**
> +		 * During AVIC temporary deactivation, guest could update
> +		 * APIC ID, DFR and LDR registers, which would not be trapped
> +		 * by avic_unaccelerated_ccess_interception(). In this case,

typo

Alex

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 10/15] kvm: x86: hyperv: Use APICv deactivate request interface
  2019-08-15 16:25 ` [PATCH v2 10/15] kvm: x86: hyperv: Use APICv deactivate request interface Suthikulpanit, Suravee
@ 2019-08-19 10:31   ` Alexander Graf
  0 siblings, 0 replies; 39+ messages in thread
From: Alexander Graf @ 2019-08-19 10:31 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> Since disabling APICv has to be done for all vcpus on AMD-based system,
> adopt the newly introduced kvm_make_apicv_deactivate_request() intereface.

typo

> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   arch/x86/kvm/hyperv.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index a39e38f..4f71a39 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -772,9 +772,17 @@ int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages)
>   
>   	/*
>   	 * Hyper-V SynIC auto EOI SINT's are
> -	 * not compatible with APICV, so deactivate APICV
> +	 * not compatible with APICV, so  request

double space

> +	 * to deactivate APICV permanently.
> +	 *
> +	 * Since this requires updating
> +	 * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
> +	 * also take srcu lock.
>   	 */
> -	kvm_vcpu_deactivate_apicv(vcpu);
> +	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
> +	kvm_make_apicv_deactivate_request(vcpu, true);
> +	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);

Overall, I'm not terribly happy with the srcu locks. Can't we handle the 
memslot changes outside of the lock region, inside the respective 
request handlers somehow?


Alex

> +
>   	synic->active = true;
>   	synic->dont_zero_synic_pages = dont_zero_synic_pages;
>   	return 0;
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling
  2019-08-15 16:25 ` [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling Suthikulpanit, Suravee
@ 2019-08-19 10:35   ` Alexander Graf
  2019-08-28 15:17     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2019-08-19 10:35 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
> deactivated and fall back to using legacy interrupt injection via vINTR
> and interrupt window.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   arch/x86/kvm/svm.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 45 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index cfa4b13..4690351 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
>   static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>   static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>   static void svm_complete_interrupts(struct vcpu_svm *svm);
> +static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
>   static bool svm_get_enable_apicv(struct kvm *kvm);
>   static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
>   
> @@ -4494,6 +4495,15 @@ static int interrupt_window_interception(struct vcpu_svm *svm)
>   {
>   	kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
>   	svm_clear_vintr(svm);
> +
> +	/*
> +	 * For AVIC, the only reason to end up here is ExtINTs.
> +	 * In this case AVIC was temporarily disabled for
> +	 * requesting the IRQ window and we have to re-enable it.
> +	 */
> +	if (svm_get_enable_apicv(svm->vcpu.kvm))
> +		svm_request_activate_avic(&svm->vcpu);

Would it make sense to add a trace point here and to the other call 
sites, so that it becomes obvious in a trace when and why exactly avic 
was active/inactive?

The trace point could add additional information on the why.

> +
>   	svm->vmcb->control.int_ctl &= ~V_IRQ_MASK;
>   	mark_dirty(svm->vmcb, VMCB_INTR);
>   	++svm->vcpu.stat.irq_window_exits;
> @@ -5181,7 +5191,33 @@ static void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
>   {
>   }
>   
> -/* Note: Currently only used by Hyper-V. */
> +static bool is_avic_active(struct vcpu_svm *svm)
> +{
> +	return (svm_get_enable_apicv(svm->vcpu.kvm) &&
> +		svm->vmcb->control.int_ctl & AVIC_ENABLE_MASK);
> +}
> +
> +static void svm_request_activate_avic(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +
> +	if (!lapic_in_kernel(vcpu) || is_avic_active(svm))
> +		return;
> +
> +	kvm_make_apicv_activate_request(vcpu);
> +}
> +
> +static void svm_request_deactivate_avic(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +
> +	if (!lapic_in_kernel(vcpu) || !is_avic_active(svm))
> +		return;
> +
> +	/* Request temporary deactivate apicv */
> +	kvm_make_apicv_deactivate_request(vcpu, false);
> +}
> +
>   static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
>   {
>   	struct vcpu_svm *svm = to_svm(vcpu);
> @@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
>   {
>   	struct vcpu_svm *svm = to_svm(vcpu);
>   
> -	if (kvm_vcpu_apicv_active(vcpu))
> -		return;
> -
>   	/*
>   	 * In case GIF=0 we can't rely on the CPU to tell us when GIF becomes
>   	 * 1, because that's a separate STGI/VMRUN intercept.  The next time we
> @@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
>   	 * window under the assumption that the hardware will set the GIF.
>   	 */
>   	if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
> +		/*
> +		 * IRQ window is not needed when AVIC is enabled,
> +		 * unless we have pending ExtINT since it cannot be injected
> +		 * via AVIC. In such case, we need to temporarily disable AVIC,
> +		 * and fallback to injecting IRQ via V_IRQ.
> +		 */
> +		if (kvm_vcpu_apicv_active(vcpu))
> +			svm_request_deactivate_avic(&svm->vcpu);

Did you test AVIC with nesting? Did you actually run across this issue 
there?


Alex

>   		svm_set_vintr(svm);
>   		svm_inject_irq(svm, 0x0);
>   	}
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC
  2019-08-15 16:25 ` [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC Suthikulpanit, Suravee
@ 2019-08-19 10:42   ` Alexander Graf
  2019-08-26 20:46     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2019-08-19 10:42 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> ACK notifiers don't work with AMD SVM w/ AVIC when the PIT interrupt
> is delivered as edge-triggered fixed interrupt since AMD processors
> cannot exit on EOI for these interrupts.
> 
> Add code to check LAPIC pending EOI before injecting any pending PIT
> interrupt on AMD SVM when AVIC is activated.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   arch/x86/kvm/i8254.c | 31 +++++++++++++++++++++++++------
>   1 file changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> index 4a6dc54..31c4a9b 100644
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -34,10 +34,12 @@
>   
>   #include <linux/kvm_host.h>
>   #include <linux/slab.h>
> +#include <asm/virtext.h>
>   
>   #include "ioapic.h"
>   #include "irq.h"
>   #include "i8254.h"
> +#include "lapic.h"
>   #include "x86.h"
>   
>   #ifndef CONFIG_X86_64
> @@ -236,6 +238,12 @@ static void destroy_pit_timer(struct kvm_pit *pit)
>   	kthread_flush_work(&pit->expired);
>   }
>   
> +static inline void kvm_pit_reset_reinject(struct kvm_pit *pit)
> +{
> +	atomic_set(&pit->pit_state.pending, 0);
> +	atomic_set(&pit->pit_state.irq_ack, 1);
> +}
> +
>   static void pit_do_work(struct kthread_work *work)
>   {
>   	struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
> @@ -244,6 +252,23 @@ static void pit_do_work(struct kthread_work *work)
>   	int i;
>   	struct kvm_kpit_state *ps = &pit->pit_state;
>   
> +	/*
> +	 * Since, AMD SVM AVIC accelerates write access to APIC EOI
> +	 * register for edge-trigger interrupts. PIT will not be able
> +	 * to receive the IRQ ACK notifier and will always be zero.
> +	 * Therefore, we check if any LAPIC EOI pending for vector 0
> +	 * and reset irq_ack if no pending.
> +	 */
> +	if (cpu_has_svm(NULL) && kvm->arch.apicv_state == APICV_ACTIVATED) {
> +		int eoi = 0;
> +
> +		kvm_for_each_vcpu(i, vcpu, kvm)
> +			if (kvm_apic_pending_eoi(vcpu, 0))
> +				eoi++;
> +		if (!eoi)
> +			kvm_pit_reset_reinject(pit);

In which case would eoi be != 0 when APIC-V is active?


Alex

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 14/15] kvm: ioapic: Delay update IOAPIC EOI for RTC
  2019-08-15 16:25 ` [PATCH v2 14/15] kvm: ioapic: Delay update IOAPIC EOI for RTC Suthikulpanit, Suravee
@ 2019-08-19 11:00   ` Alexander Graf
  2019-09-04  2:06     ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2019-08-19 11:00 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
> In-kernel IOAPIC does not update RTC pending EOI info with AMD SVM /w AVIC
> when interrupt is delivered as edge-triggered since AMD processors
> cannot exit on EOI for these interrupts.
> 
> Add code to also check LAPIC pending EOI before injecting any new RTC
> interrupts on AMD SVM when AVIC is activated.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   arch/x86/kvm/ioapic.c | 36 ++++++++++++++++++++++++++++++++----
>   1 file changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
> index 1add1bc..45e7bb0 100644
> --- a/arch/x86/kvm/ioapic.c
> +++ b/arch/x86/kvm/ioapic.c
> @@ -39,6 +39,7 @@
>   #include <asm/processor.h>
>   #include <asm/page.h>
>   #include <asm/current.h>
> +#include <asm/virtext.h>
>   #include <trace/events/kvm.h>
>   
>   #include "ioapic.h"
> @@ -173,6 +174,7 @@ static bool rtc_irq_check_coalesced(struct kvm_ioapic *ioapic)
>   	return false;
>   }
>   
> +#define APIC_DEST_NOSHORT		0x0
>   static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
>   		int irq_level, bool line_status)
>   {
> @@ -201,10 +203,36 @@ static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
>   	 * interrupts lead to time drift in Windows guests.  So we track
>   	 * EOI manually for the RTC interrupt.
>   	 */
> -	if (irq == RTC_GSI && line_status &&
> -		rtc_irq_check_coalesced(ioapic)) {
> -		ret = 0;
> -		goto out;
> +	if (irq == RTC_GSI && line_status) {
> +		struct kvm *kvm = ioapic->kvm;
> +		union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
> +
> +		/*
> +		 * Since, AMD SVM AVIC accelerates write access to APIC EOI
> +		 * register for edge-trigger interrupts, IOAPIC will not be
> +		 * able to receive the EOI. In this case, we do lazy update
> +		 * of the pending EOI when trying to set IOAPIC irq for RTC.
> +		 */
> +		if (cpu_has_svm(NULL) &&
> +		    (kvm->arch.apicv_state == APICV_ACTIVATED) &&
> +		    (entry->fields.trig_mode == IOAPIC_EDGE_TRIG)) {
> +			int i;
> +			struct kvm_vcpu *vcpu;
> +
> +			kvm_for_each_vcpu(i, vcpu, kvm)
> +				if (kvm_apic_match_dest(vcpu, NULL,
> +							KVM_APIC_DEST_NOSHORT,
> +							entry->fields.dest_id,
> +							entry->fields.dest_mode)) {
> +					__rtc_irq_eoi_tracking_restore_one(vcpu);

I don't understand why this works. This code just means we're injecting 
an EOI on the first CPU that has the vector mapped, right when we're 
setting it to trigger, no?


Alex


> +					break;
> +				}
> +		}
> +
> +		if (rtc_irq_check_coalesced(ioapic)) {
> +			ret = 0;
> +			goto out;
> +		}
>   	}
>   
>   	old_irr = ioapic->irr;
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state
  2019-08-19  9:49   ` Alexander Graf
@ 2019-08-26 19:06     ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-26 19:06 UTC (permalink / raw)
  To: Alexander Graf, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon

Alex,

On 8/19/2019 4:49 AM, Alexander Graf wrote:
> 
> 
> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>> Currently, after a VM boots with APICv enabled, it could go into
>> the following states:
>>    * activated   = VM is running w/ APICv
>>    * deactivated = VM deactivate APICv (temporary)
>>    * disabled    = VM deactivate APICv (permanent)
>>
>> Introduce KVM APICv state enum to help keep track of the APICv states
>> along with a new variable struct kvm_arch.apicv_state to store
>> the current state.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>   arch/x86/include/asm/kvm_host.h | 11 +++++++++++
>>   arch/x86/kvm/x86.c              | 14 +++++++++++++-
>>   2 files changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 56bc702..04d7066 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -845,6 +845,15 @@ enum kvm_irqchip_mode {
>>       KVM_IRQCHIP_SPLIT,        /* created with KVM_CAP_SPLIT_IRQCHIP */
>>   };
>> +/*
>> + * KVM assumes all vcpus in a VM operate in the same mode.
>> + */
>> +enum kvm_apicv_state {
>> +    APICV_DISABLED,        /* Disabled (such as for Hyper-V case) */
>> +    APICV_DEACTIVATED,    /* Deactivated tempoerary */
> 
> typo
> 
> I'm also not sure the name is 100% obvious. How about something like "suspended" or "paused"?

Ok, I'll change it to APICV_SUSPENDED.

>> ...
>> @@ -9150,13 +9154,18 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>>           goto fail_free_pio_data;
>>       if (irqchip_in_kernel(vcpu->kvm)) {
>> -        vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
> 
> Why are you moving this into a locked section?

Since we introduced the apicv_state to track the VM APICv state, which is accessible
by each vcpu initialization code, we need to lock and check the state before setting
the per-vcpu apicv_active.

> 
>>           r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
>>           if (r < 0)
>>               goto fail_mmu_destroy;
>>       } else
>>           static_key_slow_inc(&kvm_no_apic_vcpu);
>> +    mutex_lock(&vcpu->kvm->arch.apicv_lock);
>> +    if (irqchip_in_kernel(vcpu->kvm) &&
>> +        vcpu->kvm->arch.apicv_state == APICV_ACTIVATED)
>> +        vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
>> +    mutex_unlock(&vcpu->kvm->arch.apicv_lock);
>> +
>>       vcpu->arch.mce_banks = kzalloc(KVM_MAX_MCE_BANKS * sizeof(u64) * 4,
>>                          GFP_KERNEL_ACCOUNT);
>>       if (!vcpu->arch.mce_banks) {
>> @@ -9255,6 +9264,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>       kvm_page_track_init(kvm);
>>       kvm_mmu_init_vm(kvm);
>> +    /* APICV initialization */
>> +    mutex_init(&kvm->arch.apicv_lock);
> 
> In fact, the whole lock story is not part of the patch description :).\\

Ok, I'll update the commit log to describe the lock .

Suravee

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs
  2019-08-19  9:57   ` Alexander Graf
@ 2019-08-26 19:41     ` Suthikulpanit, Suravee
  2019-08-27  8:20       ` Alexander Graf
  0 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-26 19:41 UTC (permalink / raw)
  To: Alexander Graf, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon

Alex,

On 8/19/2019 4:57 AM, Alexander Graf wrote:
> 
> 
> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>> Currently, there is no way to tell whether APICv is active
>> on a particular VM. This often cause confusion since APICv
>> can be deactivated at runtime.
>>
>> Introduce a debugfs entry to report APICv state of a VM.
>> This creates a read-only file:
>>
>>     /sys/kernel/debug/kvm/70860-14/apicv-state
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> Shouldn't this first and foremost be a VM ioctl so that user space can inquire its own state?
> 
> 
> Alex

I introduce this mainly for debugging similar to how KVM is currently provides
some per-VCPU information:

     /sys/kernel/debug/kvm/15957-14/vcpu0/
         lapic_timer_advance_ns
         tsc-offset
         tsc-scaling-ratio
         tsc-scaling-ratio-frac-bits

I'm not sure if this needs to be VM ioctl at this point. If this information is
useful for user-space tool to inquire via ioctl, we can also provide it.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC
  2019-08-19 10:42   ` Alexander Graf
@ 2019-08-26 20:46     ` Suthikulpanit, Suravee
  2019-08-27  9:10       ` Alexander Graf
  0 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-26 20:46 UTC (permalink / raw)
  To: Alexander Graf, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon

Alex,

On 8/19/2019 5:42 AM, Alexander Graf wrote:
> 
> 
> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>> ACK notifiers don't work with AMD SVM w/ AVIC when the PIT interrupt
>> is delivered as edge-triggered fixed interrupt since AMD processors
>> cannot exit on EOI for these interrupts.
>>
>> Add code to check LAPIC pending EOI before injecting any pending PIT
>> interrupt on AMD SVM when AVIC is activated.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>   arch/x86/kvm/i8254.c | 31 +++++++++++++++++++++++++------
>>   1 file changed, 25 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
>> index 4a6dc54..31c4a9b 100644
>> --- a/arch/x86/kvm/i8254.c
>> +++ b/arch/x86/kvm/i8254.c
>> @@ -34,10 +34,12 @@
>>   #include <linux/kvm_host.h>
>>   #include <linux/slab.h>
>> +#include <asm/virtext.h>
>>   #include "ioapic.h"
>>   #include "irq.h"
>>   #include "i8254.h"
>> +#include "lapic.h"
>>   #include "x86.h"
>>   #ifndef CONFIG_X86_64
>> @@ -236,6 +238,12 @@ static void destroy_pit_timer(struct kvm_pit *pit)
>>       kthread_flush_work(&pit->expired);
>>   }
>> +static inline void kvm_pit_reset_reinject(struct kvm_pit *pit)
>> +{
>> +    atomic_set(&pit->pit_state.pending, 0);
>> +    atomic_set(&pit->pit_state.irq_ack, 1);
>> +}
>> +
>>   static void pit_do_work(struct kthread_work *work)
>>   {
>>       struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
>> @@ -244,6 +252,23 @@ static void pit_do_work(struct kthread_work *work)
>>       int i;
>>       struct kvm_kpit_state *ps = &pit->pit_state;
>> +    /*
>> +     * Since, AMD SVM AVIC accelerates write access to APIC EOI
>> +     * register for edge-trigger interrupts. PIT will not be able
>> +     * to receive the IRQ ACK notifier and will always be zero.
>> +     * Therefore, we check if any LAPIC EOI pending for vector 0
>> +     * and reset irq_ack if no pending.
>> +     */
>> +    if (cpu_has_svm(NULL) && kvm->arch.apicv_state == APICV_ACTIVATED) {
>> +        int eoi = 0;
>> +
>> +        kvm_for_each_vcpu(i, vcpu, kvm)
>> +            if (kvm_apic_pending_eoi(vcpu, 0))
>> +                eoi++;
>> +        if (!eoi)
>> +            kvm_pit_reset_reinject(pit);
> 
> In which case would eoi be != 0 when APIC-V is active?

That would be the case when guest has not processed and/or still processing the interrupt.
Once the guest writes to APIC EOI register for edge-triggered interrupt for vector 0,
and the AVIC hardware accelerated the access by clearing the highest priority ISR bit,
then the eoi should be zero.

Suravee

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 01/15] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter
  2019-08-15 16:25 ` [PATCH v2 01/15] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
@ 2019-08-27  7:15   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 39+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-27  7:15 UTC (permalink / raw)
  To: Suthikulpanit\, Suravee
  Cc: pbonzini\, rkrcmar\, joro\, graf\, jschoenh\, karahmed\,
	rimasluk\, Grimm\, Jon, Suthikulpanit\, Suravee, linux-kernel\,
	kvm\

"Suthikulpanit, Suravee" <Suravee.Suthikulpanit@amd.com> writes:

> Generally, APICv for all vcpus in the VM are enable/disable in the same
> manner. So, get_enable_apicv() should represent APICv status of the VM
> instead of each VCPU.
>
> Modify kvm_x86_ops.get_enable_apicv() to take struct kvm as parameter
> instead of struct kvm_vcpu.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 2 +-
>  arch/x86/kvm/svm.c              | 5 +++--
>  arch/x86/kvm/vmx/vmx.c          | 2 +-
>  arch/x86/kvm/x86.c              | 2 +-
>  4 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 26d1eb8..56bc702 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1077,7 +1077,7 @@ struct kvm_x86_ops {
>  	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
>  	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>  	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
> -	bool (*get_enable_apicv)(struct kvm_vcpu *vcpu);
> +	bool (*get_enable_apicv)(struct kvm *kvm);
>  	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
>  	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
>  	void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index ccd5aa6..6851bce 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
>  static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>  static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>  static void svm_complete_interrupts(struct vcpu_svm *svm);
> +static bool svm_get_enable_apicv(struct kvm *kvm);

Why is this forward declaration needed [in this patch]?

>  
>  static int nested_svm_exit_handled(struct vcpu_svm *svm);
>  static int nested_svm_intercept(struct vcpu_svm *svm);
> @@ -5124,9 +5125,9 @@ static void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
>  	return;
>  }
>  
> -static bool svm_get_enable_apicv(struct kvm_vcpu *vcpu)
> +static bool svm_get_enable_apicv(struct kvm *kvm)
>  {
> -	return avic && irqchip_split(vcpu->kvm);
> +	return avic && irqchip_split(kvm);
>  }
>  
>  static void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d98eac3..18a4b94 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -3610,7 +3610,7 @@ void pt_update_intercept_for_msr(struct vcpu_vmx *vmx)
>  	}
>  }
>  
> -static bool vmx_get_enable_apicv(struct kvm_vcpu *vcpu)
> +static bool vmx_get_enable_apicv(struct kvm *kvm)
>  {
>  	return enable_apicv;
>  }
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fafd81d..7daf0dd 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9150,7 +9150,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>  		goto fail_free_pio_data;
>  
>  	if (irqchip_in_kernel(vcpu->kvm)) {
> -		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu);
> +		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu->kvm);
>  		r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
>  		if (r < 0)
>  			goto fail_mmu_destroy;

With the above question answered (or declaration moved to the patch
where it's actually needed)

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 05/15] kvm: lapic: Introduce APICv update helper function
  2019-08-15 16:25 ` [PATCH v2 05/15] kvm: lapic: Introduce APICv update helper function Suthikulpanit, Suravee
@ 2019-08-27  7:22   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 39+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-27  7:22 UTC (permalink / raw)
  To: Suthikulpanit\, Suravee
  Cc: pbonzini\, rkrcmar\, joro\, graf\, jschoenh\, karahmed\,
	rimasluk\, Grimm\, Jon, Suthikulpanit\, Suravee, linux-kernel\,
	kvm\

"Suthikulpanit, Suravee" <Suravee.Suthikulpanit@amd.com> writes:

> Re-factor code into a helper function for setting lapic parameters when
> activate/deactivate APICv, and export the function for subsequent usage.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  arch/x86/kvm/lapic.c | 22 +++++++++++++++++-----
>  arch/x86/kvm/lapic.h |  1 +
>  2 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 4dabc31..90f79ca 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2153,6 +2153,21 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
>  
>  }
>  
> +void kvm_apic_update_apicv(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_lapic *apic = vcpu->arch.apic;
> +
> +	if (vcpu->arch.apicv_active) {
> +		/* irr_pending is always true when apicv is activated. */
> +		apic->irr_pending = true;
> +		apic->isr_count = 1;
> +	} else {
> +		apic->irr_pending = (apic_search_irr(apic) != -1);
> +		apic->isr_count = count_vectors(apic->regs + APIC_ISR);
> +	}
> +}
> +EXPORT_SYMBOL(kvm_apic_update_apicv);

EXPORT_SYMBOL_GPL please.

> +
>  void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
>  {
>  	struct kvm_lapic *apic = vcpu->arch.apic;
> @@ -2197,8 +2212,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
>  		kvm_lapic_set_reg(apic, APIC_ISR + 0x10 * i, 0);
>  		kvm_lapic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
>  	}
> -	apic->irr_pending = vcpu->arch.apicv_active;
> -	apic->isr_count = vcpu->arch.apicv_active ? 1 : 0;
> +	kvm_apic_update_apicv(vcpu);
>  	apic->highest_isr_cache = -1;
>  	update_divide_count(apic);
>  	atomic_set(&apic->lapic_timer.pending, 0);
> @@ -2468,9 +2482,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
>  	apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
>  	update_divide_count(apic);
>  	start_apic_timer(apic);
> -	apic->irr_pending = true;
> -	apic->isr_count = vcpu->arch.apicv_active ?
> -				1 : count_vectors(apic->regs + APIC_ISR);
> +	kvm_apic_update_apicv(vcpu);
>  	apic->highest_isr_cache = -1;
>  	if (vcpu->arch.apicv_active) {
>  		kvm_x86_ops->apicv_post_state_restore(vcpu);
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index d6d049b..3892d50 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -90,6 +90,7 @@ bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
>  int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
>  		     struct dest_map *dest_map);
>  int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
> +void kvm_apic_update_apicv(struct kvm_vcpu *vcpu);
>  
>  bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>  		struct kvm_lapic_irq *irq, int *r, struct dest_map *dest_map);

-- 
Vitaly

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime
  2019-08-15 16:25 ` [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime Suthikulpanit, Suravee
@ 2019-08-27  7:29   ` Vitaly Kuznetsov
  2019-08-27  8:05     ` Alexander Graf
  0 siblings, 1 reply; 39+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-27  7:29 UTC (permalink / raw)
  To: Suthikulpanit\, Suravee, linux-kernel\, kvm\
  Cc: pbonzini\, rkrcmar\, joro\, graf\, jschoenh\, karahmed\,
	rimasluk\, Grimm\, Jon, Suthikulpanit\,
	Suravee

"Suthikulpanit, Suravee" <Suravee.Suthikulpanit@amd.com> writes:

> Certain runtime conditions require APICv to be temporary deactivated.
> However, current implementation only support permanently deactivate
> APICv at runtime (mainly used when running Hyper-V guest).
>
> In addtion, for AMD, when activate / deactivate APICv during runtime,
> all vcpus in the VM has to be operating in the same APICv mode, which
> requires the requesting (main) vcpu to notify others.
>
> So, introduce interfaces to request all vcpus to activate/deactivate
> APICv.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  9 +++++
>  arch/x86/kvm/x86.c              | 76 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 85 insertions(+)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 04d7066..dfb7c3d 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -76,6 +76,10 @@
>  #define KVM_REQ_HV_STIMER		KVM_ARCH_REQ(22)
>  #define KVM_REQ_LOAD_EOI_EXITMAP	KVM_ARCH_REQ(23)
>  #define KVM_REQ_GET_VMCS12_PAGES	KVM_ARCH_REQ(24)
> +#define KVM_REQ_APICV_ACTIVATE		\
> +	KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> +#define KVM_REQ_APICV_DEACTIVATE	\
> +	KVM_ARCH_REQ_FLAGS(26, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>  
>  #define CR0_RESERVED_BITS                                               \
>  	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
> @@ -1089,6 +1093,7 @@ struct kvm_x86_ops {
>  	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>  	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>  	bool (*get_enable_apicv)(struct kvm *kvm);
> +	void (*pre_update_apicv_exec_ctrl)(struct kvm_vcpu *vcpu, bool activate);
>  	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
>  	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
>  	void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
> @@ -1552,6 +1557,10 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
>  
>  void kvm_make_mclock_inprogress_request(struct kvm *kvm);
>  void kvm_make_scan_ioapic_request(struct kvm *kvm);
> +void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu);
> +void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu);
> +void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable);
>  
>  void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
>  				     struct kvm_async_pf *work);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f9c3f63..40a20bf 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -26,6 +26,7 @@
>  #include "cpuid.h"
>  #include "pmu.h"
>  #include "hyperv.h"
> +#include "lapic.h"
>  
>  #include <linux/clocksource.h>
>  #include <linux/interrupt.h>
> @@ -7163,6 +7164,22 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
>  	kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
>  }
>  
> +void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu)
> +{
> +	if (!lapic_in_kernel(vcpu)) {
> +		WARN_ON_ONCE(!vcpu->arch.apicv_active);
> +		return;
> +	}
> +	if (vcpu->arch.apicv_active)
> +		return;
> +
> +	vcpu->arch.apicv_active = true;
> +	kvm_apic_update_apicv(vcpu);
> +
> +	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
> +}
> +EXPORT_SYMBOL_GPL(kvm_vcpu_activate_apicv);
> +
>  void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
>  {
>  	if (!lapic_in_kernel(vcpu)) {
> @@ -7173,8 +7190,11 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
>  		return;
>  
>  	vcpu->arch.apicv_active = false;
> +	kvm_apic_update_apicv(vcpu);
> +
>  	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
>  }
> +EXPORT_SYMBOL_GPL(kvm_vcpu_deactivate_apicv);
>  
>  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  {
> @@ -7668,6 +7688,58 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>  	kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>  }
>  
> +void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu)
> +{
> +	int i;
> +	struct kvm_vcpu *v;
> +	struct kvm *kvm = vcpu->kvm;
> +
> +	mutex_lock(&kvm->arch.apicv_lock);
> +	if (kvm->arch.apicv_state != APICV_DEACTIVATED) {
> +		mutex_unlock(&kvm->arch.apicv_lock);
> +		return;
> +	}
> +
> +	kvm_for_each_vcpu(i, v, kvm)
> +		kvm_clear_request(KVM_REQ_APICV_DEACTIVATE, v);
> +
> +	if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
> +		kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, true);
> +
> +	kvm->arch.apicv_state = APICV_ACTIVATED;
> +
> +	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_ACTIVATE);
> +
> +	mutex_unlock(&kvm->arch.apicv_lock);
> +}
> +EXPORT_SYMBOL_GPL(kvm_make_apicv_activate_request);
> +
> +void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable)
> +{
> +	int i;
> +	struct kvm_vcpu *v;
> +	struct kvm *kvm = vcpu->kvm;
> +
> +	mutex_lock(&kvm->arch.apicv_lock);
> +	if (kvm->arch.apicv_state == APICV_DEACTIVATED) {
> +		mutex_unlock(&kvm->arch.apicv_lock);
> +		return;
> +	}
> +
> +	kvm_for_each_vcpu(i, v, kvm)
> +		kvm_clear_request(KVM_REQ_APICV_ACTIVATE, v);

Could you please elaborate on when we need to eat the
KVM_REQ_APICV_ACTIVATE request here (and KVM_REQ_APICV_DEACTIVATE in
kvm_make_apicv_activate_request() respectively)? To me, this looks like
a possible source of hard-to-debug problems in the future.

> +
> +	if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
> +		kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, false);
> +
> +	kvm->arch.apicv_state = disable ? APICV_DISABLED : APICV_DEACTIVATED;
> +
> +	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_DEACTIVATE);
> +
> +	mutex_unlock(&kvm->arch.apicv_lock);
> +}
> +EXPORT_SYMBOL_GPL(kvm_make_apicv_deactivate_request);
> +
>  static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>  {
>  	if (!kvm_apic_present(vcpu))
> @@ -7854,6 +7926,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		 */
>  		if (kvm_check_request(KVM_REQ_HV_STIMER, vcpu))
>  			kvm_hv_process_stimers(vcpu);
> +		if (kvm_check_request(KVM_REQ_APICV_ACTIVATE, vcpu))
> +			kvm_vcpu_activate_apicv(vcpu);
> +		if (kvm_check_request(KVM_REQ_APICV_DEACTIVATE, vcpu))
> +			kvm_vcpu_deactivate_apicv(vcpu);
>  	}
>  
>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {

-- 
Vitaly

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime
  2019-08-27  7:29   ` Vitaly Kuznetsov
@ 2019-08-27  8:05     ` Alexander Graf
  0 siblings, 0 replies; 39+ messages in thread
From: Alexander Graf @ 2019-08-27  8:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 27.08.19 09:29, Vitaly Kuznetsov wrote:
> "Suthikulpanit, Suravee" <Suravee.Suthikulpanit@amd.com> writes:
> 
>> Certain runtime conditions require APICv to be temporary deactivated.
>> However, current implementation only support permanently deactivate
>> APICv at runtime (mainly used when running Hyper-V guest).
>>
>> In addtion, for AMD, when activate / deactivate APICv during runtime,
>> all vcpus in the VM has to be operating in the same APICv mode, which
>> requires the requesting (main) vcpu to notify others.
>>
>> So, introduce interfaces to request all vcpus to activate/deactivate
>> APICv.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>   arch/x86/include/asm/kvm_host.h |  9 +++++
>>   arch/x86/kvm/x86.c              | 76 +++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 85 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 04d7066..dfb7c3d 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -76,6 +76,10 @@
>>   #define KVM_REQ_HV_STIMER		KVM_ARCH_REQ(22)
>>   #define KVM_REQ_LOAD_EOI_EXITMAP	KVM_ARCH_REQ(23)
>>   #define KVM_REQ_GET_VMCS12_PAGES	KVM_ARCH_REQ(24)
>> +#define KVM_REQ_APICV_ACTIVATE		\
>> +	KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>> +#define KVM_REQ_APICV_DEACTIVATE	\
>> +	KVM_ARCH_REQ_FLAGS(26, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>   
>>   #define CR0_RESERVED_BITS                                               \
>>   	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
>> @@ -1089,6 +1093,7 @@ struct kvm_x86_ops {
>>   	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>>   	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>>   	bool (*get_enable_apicv)(struct kvm *kvm);
>> +	void (*pre_update_apicv_exec_ctrl)(struct kvm_vcpu *vcpu, bool activate);
>>   	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
>>   	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
>>   	void (*hwapic_isr_update)(struct kvm_vcpu *vcpu, int isr);
>> @@ -1552,6 +1557,10 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
>>   
>>   void kvm_make_mclock_inprogress_request(struct kvm *kvm);
>>   void kvm_make_scan_ioapic_request(struct kvm *kvm);
>> +void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
>> +void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu);
>> +void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu);
>> +void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable);
>>   
>>   void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
>>   				     struct kvm_async_pf *work);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index f9c3f63..40a20bf 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -26,6 +26,7 @@
>>   #include "cpuid.h"
>>   #include "pmu.h"
>>   #include "hyperv.h"
>> +#include "lapic.h"
>>   
>>   #include <linux/clocksource.h>
>>   #include <linux/interrupt.h>
>> @@ -7163,6 +7164,22 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
>>   	kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
>>   }
>>   
>> +void kvm_vcpu_activate_apicv(struct kvm_vcpu *vcpu)
>> +{
>> +	if (!lapic_in_kernel(vcpu)) {
>> +		WARN_ON_ONCE(!vcpu->arch.apicv_active);
>> +		return;
>> +	}
>> +	if (vcpu->arch.apicv_active)
>> +		return;
>> +
>> +	vcpu->arch.apicv_active = true;
>> +	kvm_apic_update_apicv(vcpu);
>> +
>> +	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_vcpu_activate_apicv);
>> +
>>   void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
>>   {
>>   	if (!lapic_in_kernel(vcpu)) {
>> @@ -7173,8 +7190,11 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
>>   		return;
>>   
>>   	vcpu->arch.apicv_active = false;
>> +	kvm_apic_update_apicv(vcpu);
>> +
>>   	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
>>   }
>> +EXPORT_SYMBOL_GPL(kvm_vcpu_deactivate_apicv);
>>   
>>   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>>   {
>> @@ -7668,6 +7688,58 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>   	kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>   }
>>   
>> +void kvm_make_apicv_activate_request(struct kvm_vcpu *vcpu)
>> +{
>> +	int i;
>> +	struct kvm_vcpu *v;
>> +	struct kvm *kvm = vcpu->kvm;
>> +
>> +	mutex_lock(&kvm->arch.apicv_lock);
>> +	if (kvm->arch.apicv_state != APICV_DEACTIVATED) {
>> +		mutex_unlock(&kvm->arch.apicv_lock);
>> +		return;
>> +	}
>> +
>> +	kvm_for_each_vcpu(i, v, kvm)
>> +		kvm_clear_request(KVM_REQ_APICV_DEACTIVATE, v);
>> +
>> +	if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
>> +		kvm_x86_ops->pre_update_apicv_exec_ctrl(vcpu, true);
>> +
>> +	kvm->arch.apicv_state = APICV_ACTIVATED;
>> +
>> +	kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_ACTIVATE);
>> +
>> +	mutex_unlock(&kvm->arch.apicv_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_make_apicv_activate_request);
>> +
>> +void kvm_make_apicv_deactivate_request(struct kvm_vcpu *vcpu, bool disable)
>> +{
>> +	int i;
>> +	struct kvm_vcpu *v;
>> +	struct kvm *kvm = vcpu->kvm;
>> +
>> +	mutex_lock(&kvm->arch.apicv_lock);
>> +	if (kvm->arch.apicv_state == APICV_DEACTIVATED) {
>> +		mutex_unlock(&kvm->arch.apicv_lock);
>> +		return;
>> +	}
>> +
>> +	kvm_for_each_vcpu(i, v, kvm)
>> +		kvm_clear_request(KVM_REQ_APICV_ACTIVATE, v);
> 
> Could you please elaborate on when we need to eat the
> KVM_REQ_APICV_ACTIVATE request here (and KVM_REQ_APICV_DEACTIVATE in
> kvm_make_apicv_activate_request() respectively)? To me, this looks like
> a possible source of hard-to-debug problems in the future.

	CPU0					CPU1

kvm_make_apicv_activate_request()	...
Handle KVM_REQ_APICV_ACTIVATE		KVM_REQ_APICV_ACTIVATE==1
vcpu_run()				...
kvm_make_apicv_deactivate_request()	...

In that case CPU1 would have both activate and deactivate requests 
pending. You can see the same logic above in the activate() function, 
just reverse.

I agree that it probably needs at least a comment to explain why we have 
it. But when I looked at the code, I could not think of a nicer way to 
implement it either.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs
  2019-08-26 19:41     ` Suthikulpanit, Suravee
@ 2019-08-27  8:20       ` Alexander Graf
  2019-08-28 18:39         ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2019-08-27  8:20 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 26.08.19 21:41, Suthikulpanit, Suravee wrote:
> Alex,
> 
> On 8/19/2019 4:57 AM, Alexander Graf wrote:
>>
>>
>> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>>> Currently, there is no way to tell whether APICv is active
>>> on a particular VM. This often cause confusion since APICv
>>> can be deactivated at runtime.
>>>
>>> Introduce a debugfs entry to report APICv state of a VM.
>>> This creates a read-only file:
>>>
>>>      /sys/kernel/debug/kvm/70860-14/apicv-state
>>>
>>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> Shouldn't this first and foremost be a VM ioctl so that user space can inquire its own state?
>>
>>
>> Alex
> 
> I introduce this mainly for debugging similar to how KVM is currently provides
> some per-VCPU information:
> 
>       /sys/kernel/debug/kvm/15957-14/vcpu0/
>           lapic_timer_advance_ns
>           tsc-offset
>           tsc-scaling-ratio
>           tsc-scaling-ratio-frac-bits
> 
> I'm not sure if this needs to be VM ioctl at this point. If this information is
> useful for user-space tool to inquire via ioctl, we can also provide it.

I'm mostly thinking of something like "info apic" in QEMU which to me 
seems like the natural place for APIC information exposure to a user. 
The problem with debugfs is that it's not accessible to the user that 
created the VM, but only root, right?

That said, I don't feel very strongly here.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC
  2019-08-26 20:46     ` Suthikulpanit, Suravee
@ 2019-08-27  9:10       ` Alexander Graf
  2019-09-13 14:50         ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2019-08-27  9:10 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon



On 26.08.19 22:46, Suthikulpanit, Suravee wrote:
> Alex,
> 
> On 8/19/2019 5:42 AM, Alexander Graf wrote:
>>
>>
>> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>>> ACK notifiers don't work with AMD SVM w/ AVIC when the PIT interrupt
>>> is delivered as edge-triggered fixed interrupt since AMD processors
>>> cannot exit on EOI for these interrupts.
>>>
>>> Add code to check LAPIC pending EOI before injecting any pending PIT
>>> interrupt on AMD SVM when AVIC is activated.
>>>
>>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>> ---
>>>    arch/x86/kvm/i8254.c | 31 +++++++++++++++++++++++++------
>>>    1 file changed, 25 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
>>> index 4a6dc54..31c4a9b 100644
>>> --- a/arch/x86/kvm/i8254.c
>>> +++ b/arch/x86/kvm/i8254.c
>>> @@ -34,10 +34,12 @@
>>>    #include <linux/kvm_host.h>
>>>    #include <linux/slab.h>
>>> +#include <asm/virtext.h>
>>>    #include "ioapic.h"
>>>    #include "irq.h"
>>>    #include "i8254.h"
>>> +#include "lapic.h"
>>>    #include "x86.h"
>>>    #ifndef CONFIG_X86_64
>>> @@ -236,6 +238,12 @@ static void destroy_pit_timer(struct kvm_pit *pit)
>>>        kthread_flush_work(&pit->expired);
>>>    }
>>> +static inline void kvm_pit_reset_reinject(struct kvm_pit *pit)
>>> +{
>>> +    atomic_set(&pit->pit_state.pending, 0);
>>> +    atomic_set(&pit->pit_state.irq_ack, 1);
>>> +}
>>> +
>>>    static void pit_do_work(struct kthread_work *work)
>>>    {
>>>        struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
>>> @@ -244,6 +252,23 @@ static void pit_do_work(struct kthread_work *work)
>>>        int i;
>>>        struct kvm_kpit_state *ps = &pit->pit_state;
>>> +    /*
>>> +     * Since, AMD SVM AVIC accelerates write access to APIC EOI
>>> +     * register for edge-trigger interrupts. PIT will not be able
>>> +     * to receive the IRQ ACK notifier and will always be zero.
>>> +     * Therefore, we check if any LAPIC EOI pending for vector 0
>>> +     * and reset irq_ack if no pending.
>>> +     */
>>> +    if (cpu_has_svm(NULL) && kvm->arch.apicv_state == APICV_ACTIVATED) {
>>> +        int eoi = 0;
>>> +
>>> +        kvm_for_each_vcpu(i, vcpu, kvm)
>>> +            if (kvm_apic_pending_eoi(vcpu, 0))
>>> +                eoi++;
>>> +        if (!eoi)
>>> +            kvm_pit_reset_reinject(pit);
>>
>> In which case would eoi be != 0 when APIC-V is active?
> 
> That would be the case when guest has not processed and/or still processing the interrupt.
> Once the guest writes to APIC EOI register for edge-triggered interrupt for vector 0,
> and the AVIC hardware accelerated the access by clearing the highest priority ISR bit,
> then the eoi should be zero.

Thinking about this a bit more, you're basically saying the irq ack 
notifier never triggers because we don't see the EOI register write, but 
we can determine the state asynchronously.

The irqfd code also uses the ack notifier for level irq reinjection. 
Will that break as well?

Wouldn't it make more sense to try to either maintain the ack notifier 
API or remove it completely if we can't find a way to make it work with 
APIC-V?

So what if we detect that an IRQ vector we're injecting for has an irq 
notifier? If it does, we set up / start:

   * an hrtimer that polls for EOI on that vector
   * a flag so that every vcpu on exit checks for EOI on that vector
   * a direct call from pit_do_work to check on it as well

Each of them would go through a single code path that then calls the 
ack_notifier.

That way we should be able to just maintain the old API and not get into 
unpleasant surprises that only manifest on a tiny faction of systems, right?


Alternatively, feel free to remove the ack logic altogether and move all 
users of it to different mechanisms (check in do_work here, additional 
timer in irqfd probably).


Let's try to be as consistent as possible across different host 
platforms. Otherwise the test matrix just explodes.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling
  2019-08-19 10:35   ` Alexander Graf
@ 2019-08-28 15:17     ` Suthikulpanit, Suravee
  2019-08-28 19:37       ` Graf (AWS), Alexander
  0 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-28 15:17 UTC (permalink / raw)
  To: Alexander Graf, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon

Alex,

On 8/19/19 5:35 AM, Alexander Graf wrote:
> 
> 
> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>> AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
>> deactivated and fall back to using legacy interrupt injection via vINTR
>> and interrupt window.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>   arch/x86/kvm/svm.c | 49 
>> +++++++++++++++++++++++++++++++++++++++++++++----
>>   1 file changed, 45 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index cfa4b13..4690351 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
>>   static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>>   static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>>   static void svm_complete_interrupts(struct vcpu_svm *svm);
>> +static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
>>   static bool svm_get_enable_apicv(struct kvm *kvm);
>>   static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
>> @@ -4494,6 +4495,15 @@ static int interrupt_window_interception(struct 
>> vcpu_svm *svm)
>>   {
>>       kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
>>       svm_clear_vintr(svm);
>> +
>> +    /*
>> +     * For AVIC, the only reason to end up here is ExtINTs.
>> +     * In this case AVIC was temporarily disabled for
>> +     * requesting the IRQ window and we have to re-enable it.
>> +     */
>> +    if (svm_get_enable_apicv(svm->vcpu.kvm))
>> +        svm_request_activate_avic(&svm->vcpu);
> 
> Would it make sense to add a trace point here and to the other call 
> sites, so that it becomes obvious in a trace when and why exactly avic 
> was active/inactive?
> 
> The trace point could add additional information on the why.

Sure, sounds good.

>> ....
>> @@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu 
>> *vcpu)
>>   {
>>       struct vcpu_svm *svm = to_svm(vcpu);
>> -    if (kvm_vcpu_apicv_active(vcpu))
>> -        return;
>> -
>>       /*
>>        * In case GIF=0 we can't rely on the CPU to tell us when GIF 
>> becomes
>>        * 1, because that's a separate STGI/VMRUN intercept.  The next 
>> time we
>> @@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu 
>> *vcpu)
>>        * window under the assumption that the hardware will set the GIF.
>>        */
>>       if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
>> +        /*
>> +         * IRQ window is not needed when AVIC is enabled,
>> +         * unless we have pending ExtINT since it cannot be injected
>> +         * via AVIC. In such case, we need to temporarily disable AVIC,
>> +         * and fallback to injecting IRQ via V_IRQ.
>> +         */
>> +        if (kvm_vcpu_apicv_active(vcpu))
>> +            svm_request_deactivate_avic(&svm->vcpu);
> 
> Did you test AVIC with nesting? Did you actually run across this issue 
> there?

Currently, we have not claimed that AVIC is supported w/ nested VM. 
That's why we have not enabled AVIC by default yet. We will be looking 
more into that next.

Suravee

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs
  2019-08-27  8:20       ` Alexander Graf
@ 2019-08-28 18:39         ` Suthikulpanit, Suravee
  2019-08-28 19:36           ` Graf (AWS), Alexander
  0 siblings, 1 reply; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-08-28 18:39 UTC (permalink / raw)
  To: Alexander Graf, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon

Alex,

On 8/27/19 3:20 AM, Alexander Graf wrote:
> 
> 
> On 26.08.19 21:41, Suthikulpanit, Suravee wrote:
>> Alex,
>>
>> On 8/19/2019 4:57 AM, Alexander Graf wrote:
>>>
>>>
>>> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>>>> Currently, there is no way to tell whether APICv is active
>>>> on a particular VM. This often cause confusion since APICv
>>>> can be deactivated at runtime.
>>>>
>>>> Introduce a debugfs entry to report APICv state of a VM.
>>>> This creates a read-only file:
>>>>
>>>>      /sys/kernel/debug/kvm/70860-14/apicv-state
>>>>
>>>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>>
>>> Shouldn't this first and foremost be a VM ioctl so that user space 
>>> can inquire its own state?
>>>
>>>
>>> Alex
>>
>> I introduce this mainly for debugging similar to how KVM is currently 
>> provides
>> some per-VCPU information:
>>
>>       /sys/kernel/debug/kvm/15957-14/vcpu0/
>>           lapic_timer_advance_ns
>>           tsc-offset
>>           tsc-scaling-ratio
>>           tsc-scaling-ratio-frac-bits
>>
>> I'm not sure if this needs to be VM ioctl at this point. If this 
>> information is
>> useful for user-space tool to inquire via ioctl, we can also provide it.
> 
> I'm mostly thinking of something like "info apic" in QEMU which to me 
> seems like the natural place for APIC information exposure to a user.

I could not find QEMU "info apic". I assume you meant "info lapic", 
which provides information specific to each local APIC (i.e. per-vcpu).

> The problem with debugfs is that it's not accessible to the user that 
> created the VM, but only root, right?

Hm, you are right. I'll also look into also adding ioctl interface then.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs
  2019-08-28 18:39         ` Suthikulpanit, Suravee
@ 2019-08-28 19:36           ` Graf (AWS), Alexander
  0 siblings, 0 replies; 39+ messages in thread
From: Graf (AWS), Alexander @ 2019-08-28 19:36 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: linux-kernel, kvm, pbonzini, rkrcmar, joro, Schoenherr, Jan H.,
	Raslan, KarimAllah, Lukaszewicz, Rimas, Grimm, Jon



> Am 28.08.2019 um 20:41 schrieb Suthikulpanit, Suravee <Suravee.Suthikulpanit@amd.com>:
> 
> Alex,
> 
>> On 8/27/19 3:20 AM, Alexander Graf wrote:
>> 
>> 
>>> On 26.08.19 21:41, Suthikulpanit, Suravee wrote:
>>> Alex,
>>> 
>>>> On 8/19/2019 4:57 AM, Alexander Graf wrote:
>>>> 
>>>> 
>>>>> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>>>>> Currently, there is no way to tell whether APICv is active
>>>>> on a particular VM. This often cause confusion since APICv
>>>>> can be deactivated at runtime.
>>>>> 
>>>>> Introduce a debugfs entry to report APICv state of a VM.
>>>>> This creates a read-only file:
>>>>> 
>>>>>      /sys/kernel/debug/kvm/70860-14/apicv-state
>>>>> 
>>>>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>>> 
>>>> Shouldn't this first and foremost be a VM ioctl so that user space 
>>>> can inquire its own state?
>>>> 
>>>> 
>>>> Alex
>>> 
>>> I introduce this mainly for debugging similar to how KVM is currently 
>>> provides
>>> some per-VCPU information:
>>> 
>>>       /sys/kernel/debug/kvm/15957-14/vcpu0/
>>>           lapic_timer_advance_ns
>>>           tsc-offset
>>>           tsc-scaling-ratio
>>>           tsc-scaling-ratio-frac-bits
>>> 
>>> I'm not sure if this needs to be VM ioctl at this point. If this 
>>> information is
>>> useful for user-space tool to inquire via ioctl, we can also provide it.
>> 
>> I'm mostly thinking of something like "info apic" in QEMU which to me 
>> seems like the natural place for APIC information exposure to a user.
> 
> I could not find QEMU "info apic". I assume you meant "info lapic", 
> which provides information specific to each local APIC (i.e. per-vcpu).

Or maybe even "info kvm" :). I think you get the point.

> 
>> The problem with debugfs is that it's not accessible to the user that 
>> created the VM, but only root, right?
> 
> Hm, you are right. I'll also look into also adding ioctl interface then.

Thanks!

Alex




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling
  2019-08-28 15:17     ` Suthikulpanit, Suravee
@ 2019-08-28 19:37       ` Graf (AWS), Alexander
  2019-09-10 15:46         ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 39+ messages in thread
From: Graf (AWS), Alexander @ 2019-08-28 19:37 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: linux-kernel, kvm, pbonzini, rkrcmar, joro, Schoenherr, Jan H.,
	Raslan, KarimAllah, Lukaszewicz, Rimas, Grimm, Jon



> Am 28.08.2019 um 17:19 schrieb Suthikulpanit, Suravee <Suravee.Suthikulpanit@amd.com>:
> 
> Alex,
> 
>> On 8/19/19 5:35 AM, Alexander Graf wrote:
>> 
>> 
>>> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>>> AMD AVIC does not support ExtINT. Therefore, AVIC must be temporary
>>> deactivated and fall back to using legacy interrupt injection via vINTR
>>> and interrupt window.
>>> 
>>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>> ---
>>>   arch/x86/kvm/svm.c | 49 
>>> +++++++++++++++++++++++++++++++++++++++++++++----
>>>   1 file changed, 45 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>>> index cfa4b13..4690351 100644
>>> --- a/arch/x86/kvm/svm.c
>>> +++ b/arch/x86/kvm/svm.c
>>> @@ -384,6 +384,7 @@ struct amd_svm_iommu_ir {
>>>   static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>>>   static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>>>   static void svm_complete_interrupts(struct vcpu_svm *svm);
>>> +static void svm_request_activate_avic(struct kvm_vcpu *vcpu);
>>>   static bool svm_get_enable_apicv(struct kvm *kvm);
>>>   static inline void avic_post_state_restore(struct kvm_vcpu *vcpu);
>>> @@ -4494,6 +4495,15 @@ static int interrupt_window_interception(struct 
>>> vcpu_svm *svm)
>>>   {
>>>       kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
>>>       svm_clear_vintr(svm);
>>> +
>>> +    /*
>>> +     * For AVIC, the only reason to end up here is ExtINTs.
>>> +     * In this case AVIC was temporarily disabled for
>>> +     * requesting the IRQ window and we have to re-enable it.
>>> +     */
>>> +    if (svm_get_enable_apicv(svm->vcpu.kvm))
>>> +        svm_request_activate_avic(&svm->vcpu);
>> 
>> Would it make sense to add a trace point here and to the other call 
>> sites, so that it becomes obvious in a trace when and why exactly avic 
>> was active/inactive?
>> 
>> The trace point could add additional information on the why.
> 
> Sure, sounds good.
> 
>>> ....
>>> @@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu 
>>> *vcpu)
>>>   {
>>>       struct vcpu_svm *svm = to_svm(vcpu);
>>> -    if (kvm_vcpu_apicv_active(vcpu))
>>> -        return;
>>> -
>>>       /*
>>>        * In case GIF=0 we can't rely on the CPU to tell us when GIF 
>>> becomes
>>>        * 1, because that's a separate STGI/VMRUN intercept.  The next 
>>> time we
>>> @@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu 
>>> *vcpu)
>>>        * window under the assumption that the hardware will set the GIF.
>>>        */
>>>       if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
>>> +        /*
>>> +         * IRQ window is not needed when AVIC is enabled,
>>> +         * unless we have pending ExtINT since it cannot be injected
>>> +         * via AVIC. In such case, we need to temporarily disable AVIC,
>>> +         * and fallback to injecting IRQ via V_IRQ.
>>> +         */
>>> +        if (kvm_vcpu_apicv_active(vcpu))
>>> +            svm_request_deactivate_avic(&svm->vcpu);
>> 
>> Did you test AVIC with nesting? Did you actually run across this issue 
>> there?
> 
> Currently, we have not claimed that AVIC is supported w/ nested VM. 
> That's why we have not enabled AVIC by default yet. We will be looking 
> more into that next.

If it's not supported, please suspend it when we enter a nested guest then? In that case, the above change is also unnecessary, as it only affects nested guests, no?

Alex

> 
> Suravee



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 14/15] kvm: ioapic: Delay update IOAPIC EOI for RTC
  2019-08-19 11:00   ` Alexander Graf
@ 2019-09-04  2:06     ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-09-04  2:06 UTC (permalink / raw)
  To: Alexander Graf, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon

Alex,

On 8/19/19 6:00 AM, Alexander Graf wrote:
> 
> 
> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>> In-kernel IOAPIC does not update RTC pending EOI info with AMD SVM /w 
>> AVIC
>> when interrupt is delivered as edge-triggered since AMD processors
>> cannot exit on EOI for these interrupts.
>>
>> Add code to also check LAPIC pending EOI before injecting any new RTC
>> interrupts on AMD SVM when AVIC is activated.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>   arch/x86/kvm/ioapic.c | 36 ++++++++++++++++++++++++++++++++----
>>   1 file changed, 32 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
>> index 1add1bc..45e7bb0 100644
>> --- a/arch/x86/kvm/ioapic.c
>> +++ b/arch/x86/kvm/ioapic.c
>> @@ -39,6 +39,7 @@
>>   #include <asm/processor.h>
>>   #include <asm/page.h>
>>   #include <asm/current.h>
>> +#include <asm/virtext.h>
>>   #include <trace/events/kvm.h>
>>   #include "ioapic.h"
>> @@ -173,6 +174,7 @@ static bool rtc_irq_check_coalesced(struct 
>> kvm_ioapic *ioapic)
>>       return false;
>>   }
>> +#define APIC_DEST_NOSHORT        0x0
>>   static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
>>           int irq_level, bool line_status)
>>   {
>> @@ -201,10 +203,36 @@ static int ioapic_set_irq(struct kvm_ioapic 
>> *ioapic, unsigned int irq,
>>        * interrupts lead to time drift in Windows guests.  So we track
>>        * EOI manually for the RTC interrupt.
>>        */
>> -    if (irq == RTC_GSI && line_status &&
>> -        rtc_irq_check_coalesced(ioapic)) {
>> -        ret = 0;
>> -        goto out;
>> +    if (irq == RTC_GSI && line_status) {
>> +        struct kvm *kvm = ioapic->kvm;
>> +        union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>> +
>> +        /*
>> +         * Since, AMD SVM AVIC accelerates write access to APIC EOI
>> +         * register for edge-trigger interrupts, IOAPIC will not be
>> +         * able to receive the EOI. In this case, we do lazy update
>> +         * of the pending EOI when trying to set IOAPIC irq for RTC.
>> +         */
>> +        if (cpu_has_svm(NULL) &&
>> +            (kvm->arch.apicv_state == APICV_ACTIVATED) &&
>> +            (entry->fields.trig_mode == IOAPIC_EDGE_TRIG)) {
>> +            int i;
>> +            struct kvm_vcpu *vcpu;
>> +
>> +            kvm_for_each_vcpu(i, vcpu, kvm)
>> +                if (kvm_apic_match_dest(vcpu, NULL,
>> +                            KVM_APIC_DEST_NOSHORT,
>> +                            entry->fields.dest_id,
>> +                            entry->fields.dest_mode)) {
>> +                    __rtc_irq_eoi_tracking_restore_one(vcpu);
> 
> I don't understand why this works. This code just means we're injecting 
> an EOI on the first CPU that has the vector mapped, right when we're 
> setting it to trigger, no?

Actually, this is similar to the approach in patch 12/15, where we check 
if there is any pending EOI for the RTC_GSI on the destination vcpu. 
Otherwise, the __rtc_irq_eoi_tracking_restore_one() should clear IOAPIC 
pending EOI for RTC.

Suravee

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling
  2019-08-28 19:37       ` Graf (AWS), Alexander
@ 2019-09-10 15:46         ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-09-10 15:46 UTC (permalink / raw)
  To: Graf (AWS), Alexander
  Cc: linux-kernel, kvm, pbonzini, rkrcmar, joro, Schoenherr, Jan H.,
	Raslan, KarimAllah, Lukaszewicz, Rimas, Grimm, Jon

Alex,

On 8/28/19 2:37 PM, Graf (AWS), Alexander wrote:
>>>> @@ -5522,9 +5558,6 @@ static void enable_irq_window(struct kvm_vcpu
>>>> *vcpu)
>>>>    {
>>>>        struct vcpu_svm *svm = to_svm(vcpu);
>>>> -    if (kvm_vcpu_apicv_active(vcpu))
>>>> -        return;
>>>> -
>>>>        /*
>>>>         * In case GIF=0 we can't rely on the CPU to tell us when GIF
>>>> becomes
>>>>         * 1, because that's a separate STGI/VMRUN intercept.  The next
>>>> time we
>>>> @@ -5534,6 +5567,14 @@ static void enable_irq_window(struct kvm_vcpu
>>>> *vcpu)
>>>>         * window under the assumption that the hardware will set the GIF.
>>>>         */
>>>>        if ((vgif_enabled(svm) || gif_set(svm)) && nested_svm_intr(svm)) {
>>>> +        /*
>>>> +         * IRQ window is not needed when AVIC is enabled,
>>>> +         * unless we have pending ExtINT since it cannot be injected
>>>> +         * via AVIC. In such case, we need to temporarily disable AVIC,
>>>> +         * and fallback to injecting IRQ via V_IRQ.
>>>> +         */
>>>> +        if (kvm_vcpu_apicv_active(vcpu))
>>>> +            svm_request_deactivate_avic(&svm->vcpu);
>>> Did you test AVIC with nesting? Did you actually run across this issue
>>> there?
>> Currently, we have not claimed that AVIC is supported w/ nested VM.
>> That's why we have not enabled AVIC by default yet. We will be looking
>> more into that next.
> If it's not supported, please suspend it when we enter a nested guest then?

Ok, this makes sense. I'll update this in V3.

> In that case, the above change is also unnecessary, as it only affects nested guests, no?

Actually, the function name nested_svm_intr() is misleading. Here it 
returns true when _NOT_ in guest mode:

/* This function returns true if it is save to enable the irq window */
   static inline bool nested_svm_intr(struct vcpu_svm *svm)
   {
           if (!is_guest_mode(&svm->vcpu))
                   return true;
   ....

So, the logic above does what we want when AVIC is enabled.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC
  2019-08-27  9:10       ` Alexander Graf
@ 2019-09-13 14:50         ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 39+ messages in thread
From: Suthikulpanit, Suravee @ 2019-09-13 14:50 UTC (permalink / raw)
  To: Alexander Graf, linux-kernel, kvm
  Cc: pbonzini, rkrcmar, joro, jschoenh, karahmed, rimasluk, Grimm, Jon

Alex,

On 8/27/19 4:10 AM, Alexander Graf wrote:
> 
> On 26.08.19 22:46, Suthikulpanit, Suravee wrote:
>> Alex,
>>
>> On 8/19/2019 5:42 AM, Alexander Graf wrote:
>>>
>>>
>>> On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
>>>> ACK notifiers don't work with AMD SVM w/ AVIC when the PIT interrupt
>>>> is delivered as edge-triggered fixed interrupt since AMD processors
>>>> cannot exit on EOI for these interrupts.
>>>>
>>>> Add code to check LAPIC pending EOI before injecting any pending PIT
>>>> interrupt on AMD SVM when AVIC is activated.
>>>>
>>>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>>> ---
>>>>    arch/x86/kvm/i8254.c | 31 +++++++++++++++++++++++++------
>>>>    1 file changed, 25 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
>>>> index 4a6dc54..31c4a9b 100644
>>>> --- a/arch/x86/kvm/i8254.c
>>>> +++ b/arch/x86/kvm/i8254.c
>>>> @@ -34,10 +34,12 @@
>>>>    #include <linux/kvm_host.h>
>>>>    #include <linux/slab.h>
>>>> +#include <asm/virtext.h>
>>>>    #include "ioapic.h"
>>>>    #include "irq.h"
>>>>    #include "i8254.h"
>>>> +#include "lapic.h"
>>>>    #include "x86.h"
>>>>    #ifndef CONFIG_X86_64
>>>> @@ -236,6 +238,12 @@ static void destroy_pit_timer(struct kvm_pit *pit)
>>>>        kthread_flush_work(&pit->expired);
>>>>    }
>>>> +static inline void kvm_pit_reset_reinject(struct kvm_pit *pit)
>>>> +{
>>>> +    atomic_set(&pit->pit_state.pending, 0);
>>>> +    atomic_set(&pit->pit_state.irq_ack, 1);
>>>> +}
>>>> +
>>>>    static void pit_do_work(struct kthread_work *work)
>>>>    {
>>>>        struct kvm_pit *pit = container_of(work, struct kvm_pit, expired); 
>>>>
>>>> @@ -244,6 +252,23 @@ static void pit_do_work(struct kthread_work *work)
>>>>        int i;
>>>>        struct kvm_kpit_state *ps = &pit->pit_state;
>>>> +    /*
>>>> +     * Since, AMD SVM AVIC accelerates write access to APIC EOI
>>>> +     * register for edge-trigger interrupts. PIT will not be able
>>>> +     * to receive the IRQ ACK notifier and will always be zero.
>>>> +     * Therefore, we check if any LAPIC EOI pending for vector 0
>>>> +     * and reset irq_ack if no pending.
>>>> +     */
>>>> +    if (cpu_has_svm(NULL) && kvm->arch.apicv_state == APICV_ACTIVATED) { 
>>>>
>>>> +        int eoi = 0;
>>>> +
>>>> +        kvm_for_each_vcpu(i, vcpu, kvm)
>>>> +            if (kvm_apic_pending_eoi(vcpu, 0))
>>>> +                eoi++;
>>>> +        if (!eoi)
>>>> +            kvm_pit_reset_reinject(pit);
>>>
>>> In which case would eoi be != 0 when APIC-V is active?
>>
>> That would be the case when guest has not processed and/or still 
>> processing the interrupt.
>> Once the guest writes to APIC EOI register for edge-triggered 
>> interrupt for vector 0,
>> and the AVIC hardware accelerated the access by clearing the highest 
>> priority ISR bit,
>> then the eoi should be zero.
> 
> Thinking about this a bit more, you're basically saying the irq ack 
> notifier never triggers because we don't see the EOI register write, but 
> we can determine the state asynchronously.

Yes, we should be able to determine this in lazy fashion only when we 
need to inject new interrupts.

> The irqfd code also uses the ack notifier for level irq reinjection. 
> Will that break as well?

IIUC, in case of irqfd, the notifier is only used for the case of 
resampling irqfds, which are special variety of irqfds used to emulate 
level triggered interrupts (see include/linux/kvm_irqfd.h). The AVIC 
workaround is only needed for handling EOI for edge-trigger interrupts.

> Wouldn't it make more sense to try to either maintain the ack notifier 
> API or remove it completely if we can't find a way to make it work with 
> APIC-V?

My understanding is that the ack notifier is needed for KVM to support 
KVM_REINJECT_CONTROL ioctl (mentioned here 
(https://lkml.org/lkml/2019/2/6/133).

> So what if we detect that an IRQ vector we're injecting for has an irq 
> notifier? If it does, we set up / start:
> 
>    * an hrtimer that polls for EOI on that vector
>    * a flag so that every vcpu on exit checks for EOI on that vector
>    * a direct call from pit_do_work to check on it as well
> 
> Each of them would go through a single code path that then calls the 
> ack_notifier.
> 
> That way we should be able to just maintain the old API and not get into 
> unpleasant surprises that only manifest on a tiny faction of systems, 
> right?

Let me send out v3 that will consolidate this into a single code path.

Suravee

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, back to index

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-15 16:25 [PATCH v2 00/15] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 01/15] kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm parameter Suthikulpanit, Suravee
2019-08-27  7:15   ` Vitaly Kuznetsov
2019-08-15 16:25 ` [PATCH v2 02/15] kvm: x86: Introduce KVM APICv state Suthikulpanit, Suravee
2019-08-19  9:49   ` Alexander Graf
2019-08-26 19:06     ` Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 03/15] kvm: Add arch-specific per-VM debugfs support Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 04/15] kvm: x86: Add per-VM APICv state debugfs Suthikulpanit, Suravee
2019-08-19  9:57   ` Alexander Graf
2019-08-26 19:41     ` Suthikulpanit, Suravee
2019-08-27  8:20       ` Alexander Graf
2019-08-28 18:39         ` Suthikulpanit, Suravee
2019-08-28 19:36           ` Graf (AWS), Alexander
2019-08-15 16:25 ` [PATCH v2 05/15] kvm: lapic: Introduce APICv update helper function Suthikulpanit, Suravee
2019-08-27  7:22   ` Vitaly Kuznetsov
2019-08-15 16:25 ` [PATCH v2 06/15] kvm: x86: Add support for activate/de-activate APICv at runtime Suthikulpanit, Suravee
2019-08-27  7:29   ` Vitaly Kuznetsov
2019-08-27  8:05     ` Alexander Graf
2019-08-15 16:25 ` [PATCH v2 07/15] kvm: x86: svm: Add support to activate/deactivate posted interrupts Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 08/15] svm: Add support for setup/destroy virutal APIC backing page for AVIC Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 09/15] svm: Add support for activate/deactivate AVIC at runtime Suthikulpanit, Suravee
2019-08-19 10:28   ` Alexander Graf
2019-08-15 16:25 ` [PATCH v2 10/15] kvm: x86: hyperv: Use APICv deactivate request interface Suthikulpanit, Suravee
2019-08-19 10:31   ` Alexander Graf
2019-08-15 16:25 ` [PATCH v2 11/15] svm: Temporary deactivate AVIC during ExtINT handling Suthikulpanit, Suravee
2019-08-19 10:35   ` Alexander Graf
2019-08-28 15:17     ` Suthikulpanit, Suravee
2019-08-28 19:37       ` Graf (AWS), Alexander
2019-09-10 15:46         ` Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC Suthikulpanit, Suravee
2019-08-19 10:42   ` Alexander Graf
2019-08-26 20:46     ` Suthikulpanit, Suravee
2019-08-27  9:10       ` Alexander Graf
2019-09-13 14:50         ` Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 13/15] kvm: lapic: Clean up APIC predefined macros Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 14/15] kvm: ioapic: Delay update IOAPIC EOI for RTC Suthikulpanit, Suravee
2019-08-19 11:00   ` Alexander Graf
2019-09-04  2:06     ` Suthikulpanit, Suravee
2019-08-15 16:25 ` [PATCH v2 15/15] svm: Allow AVIC with in-kernel irqchip mode Suthikulpanit, Suravee

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git