* [PATCH v5 0/6] KVM: X86: Implement Exit-less IPIs support
@ 2018-07-23  6:39 Wanpeng Li
  2018-07-23  6:39 ` [PATCH v5 1/6] KVM: X86: Add kvm hypervisor init time platform setup callback Wanpeng Li
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Wanpeng Li @ 2018-07-23  6:39 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Vitaly Kuznetsov

Use a hypercall to send IPIs with a single vmexit, instead of one vmexit
per destination in xAPIC/x2APIC physical mode and one vmexit per cluster
in x2APIC cluster mode. An Intel guest can enter x2APIC cluster mode when
interrupt remapping is enabled in QEMU; however, the latest AMD EPYC still
supports only xAPIC mode, which can benefit greatly from exit-less IPIs.
This patchset lets a guest send multicast IPIs, with at most 128
destinations per hypercall in 64-bit mode and 64 destinations per
hypercall in 32-bit mode.

Hardware: Xeon Skylake 2.5GHz, 2 sockets, 40 cores, 80 threads; the VM
has 80 vCPUs. IPI microbenchmark (https://lkml.org/lkml/2017/12/19/141):

x2apic cluster mode, vanilla

 Dry-run:                         0,            2392199 ns
 Self-IPI:                  6907514,           15027589 ns
 Normal IPI:              223910476,          251301666 ns
 Broadcast IPI:                   0,         9282161150 ns
 Broadcast lock:                  0,         8812934104 ns

x2apic cluster mode, pv-ipi 

 Dry-run:                         0,            2449341 ns
 Self-IPI:                  6720360,           15028732 ns
 Normal IPI:              228643307,          255708477 ns
 Broadcast IPI:                   0,         7572293590 ns  => 22% performance boost 
 Broadcast lock:                  0,         8316124651 ns

x2apic physical mode, vanilla

 Dry-run:                         0,            3135933 ns
 Self-IPI:                  8572670,           17901757 ns
 Normal IPI:              226444334,          255421709 ns
 Broadcast IPI:                   0,        19845070887 ns
 Broadcast lock:                  0,        19827383656 ns

x2apic physical mode, pv-ipi

 Dry-run:                         0,            2446381 ns
 Self-IPI:                  6788217,           15021056 ns
 Normal IPI:              219454441,          249583458 ns
 Broadcast IPI:                   0,         7806540019 ns  => 154% performance boost 
 Broadcast lock:                  0,         9143618799 ns

v4 -> v5:
 * update hypercall layout description
 * fix PV IPIs send hypercall loops

v3 -> v4:
 * offset algorithm w/ __uint128_t to scale to higher APIC IDs
 * remove num_possible_cpus limit
 * pass op_64_bit to check bitmap size
 * better describe hypercall layout

v2 -> v3:
 * rename ipi_mask_done to irq_restore_exit, __send_ipi_mask return int 
   instead of bool 
 * fix build errors reported by 0day
 * split patches, no functional change

v1 -> v2:
 * sparse apic id > 128, or any other error, falls back to original apic hooks
 * have two bitmask arguments so that one hypercall handles 128 vCPUs 
 * fix KVM_FEATURE_PV_SEND_IPI doc
 * document hypercall
 * fix NMI selftest fails
 * fix build errors reported by 0day

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>

Wanpeng Li (6):
  KVM: X86: Add kvm hypervisor init time platform setup callback
  KVM: X86: Implement PV IPIs in linux guest
  KVM: X86: Fallback to original apic hooks when bad happens
  KVM: X86: Implement PV IPIs send hypercall
  KVM: X86: Add NMI support to PV IPIs
  KVM: X86: Expose PV_SEND_IPI CPUID feature bit to guest

 Documentation/virtual/kvm/cpuid.txt      |   4 ++
 Documentation/virtual/kvm/hypercalls.txt |  20 ++++++
 arch/x86/include/uapi/asm/kvm_para.h     |   1 +
 arch/x86/kernel/kvm.c                    | 111 +++++++++++++++++++++++++++++++
 arch/x86/kvm/cpuid.c                     |   3 +-
 arch/x86/kvm/x86.c                       |  43 ++++++++++++
 include/uapi/linux/kvm_para.h            |   1 +
 7 files changed, 182 insertions(+), 1 deletion(-)

-- 
2.7.4



* [PATCH v5 1/6] KVM: X86: Add kvm hypervisor init time platform setup callback
  2018-07-23  6:39 [PATCH v5 0/6] KVM: X86: Implement Exit-less IPIs support Wanpeng Li
@ 2018-07-23  6:39 ` Wanpeng Li
  2018-07-23  6:39 ` [PATCH v5 2/6] KVM: X86: Implement PV IPIs in linux guest Wanpeng Li
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2018-07-23  6:39 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Vitaly Kuznetsov

From: Wanpeng Li <wanpengli@tencent.com>

Add a KVM hypervisor init-time platform setup callback, which
will be used to replace the native apic hooks with paravirtual
hooks.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kernel/kvm.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 5b2300b..591bcf2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -624,12 +624,22 @@ static uint32_t __init kvm_detect(void)
 	return kvm_cpuid_base();
 }
 
+static void __init kvm_apic_init(void)
+{
+}
+
+static void __init kvm_init_platform(void)
+{
+	x86_platform.apic_post_init = kvm_apic_init;
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_kvm = {
 	.name			= "KVM",
 	.detect			= kvm_detect,
 	.type			= X86_HYPER_KVM,
 	.init.guest_late_init	= kvm_guest_init,
 	.init.x2apic_available	= kvm_para_available,
+	.init.init_platform	= kvm_init_platform,
 };
 
 static __init int activate_jump_labels(void)
-- 
2.7.4



* [PATCH v5 2/6] KVM: X86: Implement PV IPIs in linux guest
  2018-07-23  6:39 [PATCH v5 0/6] KVM: X86: Implement Exit-less IPIs support Wanpeng Li
  2018-07-23  6:39 ` [PATCH v5 1/6] KVM: X86: Add kvm hypervisor init time platform setup callback Wanpeng Li
@ 2018-07-23  6:39 ` Wanpeng Li
  2018-07-23  6:39 ` [PATCH v5 3/6] KVM: X86: Fallback to original apic hooks when bad happens Wanpeng Li
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2018-07-23  6:39 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Vitaly Kuznetsov

From: Wanpeng Li <wanpengli@tencent.com>

Implement paravirtual apic hooks to enable PV IPIs.

apic->send_IPI_mask
apic->send_IPI_mask_allbutself
apic->send_IPI_allbutself
apic->send_IPI_all

This patch lets a guest send multicast IPIs, with at most 128 destinations
per hypercall in 64-bit mode and 64 destinations per hypercall in 32-bit
mode.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/include/uapi/asm/kvm_para.h |  1 +
 arch/x86/kernel/kvm.c                | 86 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/kvm_para.h        |  1 +
 3 files changed, 88 insertions(+)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 0ede697..19980ec 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -28,6 +28,7 @@
 #define KVM_FEATURE_PV_UNHALT		7
 #define KVM_FEATURE_PV_TLB_FLUSH	9
 #define KVM_FEATURE_ASYNC_PF_VMEXIT	10
+#define KVM_FEATURE_PV_SEND_IPI	11
 
 #define KVM_HINTS_REALTIME      0
 
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 591bcf2..eed6046 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -454,6 +454,88 @@ static void __init sev_map_percpu_data(void)
 }
 
 #ifdef CONFIG_SMP
+static void __send_ipi_mask(const struct cpumask *mask, int vector)
+{
+	unsigned long flags;
+	int cpu, apic_id, min = 0, max = 0;
+#ifdef CONFIG_X86_64
+	__uint128_t ipi_bitmap = 0;
+	int cluster_size = 128;
+#else
+	u64 ipi_bitmap = 0;
+	int cluster_size = 64;
+#endif
+
+	if (cpumask_empty(mask))
+		return;
+
+	local_irq_save(flags);
+
+	for_each_cpu(cpu, mask) {
+		apic_id = per_cpu(x86_cpu_to_apicid, cpu);
+		if (!ipi_bitmap) {
+			min = max = apic_id;
+		} else if (apic_id < min && max - apic_id < cluster_size) {
+			ipi_bitmap <<= min - apic_id;
+			min = apic_id;
+		} else if (apic_id < min + cluster_size) {
+			max = apic_id < max ? max : apic_id;
+		} else {
+			kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
+				(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
+			min = max = apic_id;
+			ipi_bitmap = 0;
+		}
+		__set_bit(apic_id - min, (unsigned long *)&ipi_bitmap);
+	}
+
+	if (ipi_bitmap) {
+		kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
+			(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
+	}
+
+	local_irq_restore(flags);
+}
+
+static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
+{
+	__send_ipi_mask(mask, vector);
+}
+
+static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
+{
+	unsigned int this_cpu = smp_processor_id();
+	struct cpumask new_mask;
+	const struct cpumask *local_mask;
+
+	cpumask_copy(&new_mask, mask);
+	cpumask_clear_cpu(this_cpu, &new_mask);
+	local_mask = &new_mask;
+	__send_ipi_mask(local_mask, vector);
+}
+
+static void kvm_send_ipi_allbutself(int vector)
+{
+	kvm_send_ipi_mask_allbutself(cpu_online_mask, vector);
+}
+
+static void kvm_send_ipi_all(int vector)
+{
+	__send_ipi_mask(cpu_online_mask, vector);
+}
+
+/*
+ * Set the IPI entry points
+ */
+static void kvm_setup_pv_ipi(void)
+{
+	apic->send_IPI_mask = kvm_send_ipi_mask;
+	apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
+	apic->send_IPI_allbutself = kvm_send_ipi_allbutself;
+	apic->send_IPI_all = kvm_send_ipi_all;
+	pr_info("KVM setup pv IPIs\n");
+}
+
 static void __init kvm_smp_prepare_cpus(unsigned int max_cpus)
 {
 	native_smp_prepare_cpus(max_cpus);
@@ -626,6 +708,10 @@ static uint32_t __init kvm_detect(void)
 
 static void __init kvm_apic_init(void)
 {
+#if defined(CONFIG_SMP)
+	if (kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI))
+		kvm_setup_pv_ipi();
+#endif
 }
 
 static void __init kvm_init_platform(void)
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index dcf629d..a98217d 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -26,6 +26,7 @@
 #define KVM_HC_MIPS_EXIT_VM		7
 #define KVM_HC_MIPS_CONSOLE_OUTPUT	8
 #define KVM_HC_CLOCK_PAIRING		9
+#define KVM_HC_SEND_IPI		10
 
 /*
  * hypercalls use architecture specific
-- 
2.7.4



* [PATCH v5 3/6] KVM: X86: Fallback to original apic hooks when bad happens
  2018-07-23  6:39 [PATCH v5 0/6] KVM: X86: Implement Exit-less IPIs support Wanpeng Li
  2018-07-23  6:39 ` [PATCH v5 1/6] KVM: X86: Add kvm hypervisor init time platform setup callback Wanpeng Li
  2018-07-23  6:39 ` [PATCH v5 2/6] KVM: X86: Implement PV IPIs in linux guest Wanpeng Li
@ 2018-07-23  6:39 ` Wanpeng Li
  2018-08-02 13:01   ` Paolo Bonzini
  2018-07-23  6:39 ` [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall Wanpeng Li
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2018-07-23  6:39 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Vitaly Kuznetsov

From: Wanpeng Li <wanpengli@tencent.com>

Fall back to the original apic hooks in the unlikely case that kvm fails
to add the pending IRQ to the lapic.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kernel/kvm.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index eed6046..57eb4a2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -47,6 +47,7 @@
 #include <asm/hypervisor.h>
 #include <asm/kvm_guest.h>
 
+static struct apic orig_apic;
 static int kvmapf = 1;
 
 static int __init parse_no_kvmapf(char *arg)
@@ -454,10 +455,10 @@ static void __init sev_map_percpu_data(void)
 }
 
 #ifdef CONFIG_SMP
-static void __send_ipi_mask(const struct cpumask *mask, int vector)
+static int __send_ipi_mask(const struct cpumask *mask, int vector)
 {
 	unsigned long flags;
-	int cpu, apic_id, min = 0, max = 0;
+	int cpu, apic_id, min = 0, max = 0, ret = 0;
 #ifdef CONFIG_X86_64
 	__uint128_t ipi_bitmap = 0;
 	int cluster_size = 128;
@@ -467,7 +468,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
 #endif
 
 	if (cpumask_empty(mask))
-		return;
+		return 0;
 
 	local_irq_save(flags);
 
@@ -481,7 +482,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
 		} else if (apic_id < min + cluster_size) {
 			max = apic_id < max ? max : apic_id;
 		} else {
-			kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
+			ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
 				(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
 			min = max = apic_id;
 			ipi_bitmap = 0;
@@ -490,11 +491,12 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
 	}
 
 	if (ipi_bitmap) {
-		kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
+		ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
 			(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
 	}
 
 	local_irq_restore(flags);
+	return ret;
 }
 
 static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
@@ -511,7 +513,8 @@ static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
 	cpumask_copy(&new_mask, mask);
 	cpumask_clear_cpu(this_cpu, &new_mask);
 	local_mask = &new_mask;
-	__send_ipi_mask(local_mask, vector);
+	if (__send_ipi_mask(local_mask, vector))
+		orig_apic.send_IPI_mask_allbutself(mask, vector);
 }
 
 static void kvm_send_ipi_allbutself(int vector)
@@ -521,7 +524,8 @@ static void kvm_send_ipi_allbutself(int vector)
 
 static void kvm_send_ipi_all(int vector)
 {
-	__send_ipi_mask(cpu_online_mask, vector);
+	if (__send_ipi_mask(cpu_online_mask, vector))
+		orig_apic.send_IPI_all(vector);
 }
 
 /*
@@ -529,6 +533,8 @@ static void kvm_send_ipi_all(int vector)
  */
 static void kvm_setup_pv_ipi(void)
 {
+	orig_apic = *apic;
+
 	apic->send_IPI_mask = kvm_send_ipi_mask;
 	apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
 	apic->send_IPI_allbutself = kvm_send_ipi_allbutself;
-- 
2.7.4



* [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall
  2018-07-23  6:39 [PATCH v5 0/6] KVM: X86: Implement Exit-less IPIs support Wanpeng Li
                   ` (2 preceding siblings ...)
  2018-07-23  6:39 ` [PATCH v5 3/6] KVM: X86: Fallback to original apic hooks when bad happens Wanpeng Li
@ 2018-07-23  6:39 ` Wanpeng Li
  2018-08-02 13:04   ` Paolo Bonzini
  2018-07-23  6:39 ` [PATCH v5 5/6] KVM: X86: Add NMI support to PV IPIs Wanpeng Li
  2018-07-23  6:39 ` [PATCH v5 6/6] KVM: X86: Expose PV_SEND_IPI CPUID feature bit to guest Wanpeng Li
  5 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2018-07-23  6:39 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Vitaly Kuznetsov

From: Wanpeng Li <wanpengli@tencent.com>

Use a hypercall to send IPIs with a single vmexit, instead of one vmexit
per destination in xAPIC/x2APIC physical mode and one vmexit per cluster
in x2APIC cluster mode. An Intel guest can enter x2APIC cluster mode when
interrupt remapping is enabled in QEMU; however, the latest AMD EPYC still
supports only xAPIC mode, which can benefit greatly from exit-less IPIs.
This patchset lets a guest send multicast IPIs, with at most 128
destinations per hypercall in 64-bit mode and 64 destinations per
hypercall in 32-bit mode.

Hardware: Xeon Skylake 2.5GHz, 2 sockets, 40 cores, 80 threads; the VM
has 80 vCPUs. IPI microbenchmark (https://lkml.org/lkml/2017/12/19/141):

x2apic cluster mode, vanilla

 Dry-run:                         0,            2392199 ns
 Self-IPI:                  6907514,           15027589 ns
 Normal IPI:              223910476,          251301666 ns
 Broadcast IPI:                   0,         9282161150 ns
 Broadcast lock:                  0,         8812934104 ns

x2apic cluster mode, pv-ipi 

 Dry-run:                         0,            2449341 ns
 Self-IPI:                  6720360,           15028732 ns
 Normal IPI:              228643307,          255708477 ns
 Broadcast IPI:                   0,         7572293590 ns  => 22% performance boost 
 Broadcast lock:                  0,         8316124651 ns

x2apic physical mode, vanilla

 Dry-run:                         0,            3135933 ns
 Self-IPI:                  8572670,           17901757 ns
 Normal IPI:              226444334,          255421709 ns
 Broadcast IPI:                   0,        19845070887 ns
 Broadcast lock:                  0,        19827383656 ns

x2apic physical mode, pv-ipi

 Dry-run:                         0,            2446381 ns
 Self-IPI:                  6788217,           15021056 ns
 Normal IPI:              219454441,          249583458 ns
 Broadcast IPI:                   0,         7806540019 ns  => 154% performance boost 
 Broadcast lock:                  0,         9143618799 ns

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 Documentation/virtual/kvm/hypercalls.txt | 20 +++++++++++++++++
 arch/x86/kvm/x86.c                       | 37 ++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
index a890529..9895123 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -121,3 +121,23 @@ compute the CLOCK_REALTIME for its clock, at the same instant.
 
 Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
 or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
+
+6. KVM_HC_SEND_IPI
+------------------------
+Architecture: x86
+Status: active
+Purpose: Hypercall used to send IPIs.
+
+a0: lower part of the bitmap of destination APIC IDs
+a1: higher part of the bitmap of destination APIC IDs
+a2: the lowest APIC ID in bitmap
+a3: APIC ICR
+
+The hypercall lets a guest send multicast IPIs, with at most 128
+destinations per hypercall in 64-bit mode and 64 vCPUs per
+hypercall in 32-bit mode.  The destinations are represented by a
+bitmap contained in the first two arguments (a0 and a1). Bit 0 of
+a0 corresponds to the APIC ID in the third argument (a2), bit 1
+corresponds to the APIC ID a2+1, and so on.
+
+Returns 0 if successfully delivery the IPIs and 1 if discarded.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2b812b3..a43a29f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6691,6 +6691,40 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
 	kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
 }
 
+/*
+ * Return 0 if successfully added and 1 if discarded.
+ */
+static int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
+		unsigned long ipi_bitmap_high, int min, int vector, int op_64_bit)
+{
+	int i;
+	struct kvm_apic_map *map;
+	struct kvm_vcpu *vcpu;
+	struct kvm_lapic_irq irq = {
+		.delivery_mode = APIC_DM_FIXED,
+		.vector = vector,
+	};
+	int cluster_size = op_64_bit ? 64 : 32;
+
+	rcu_read_lock();
+	map = rcu_dereference(kvm->arch.apic_map);
+
+	for_each_set_bit(i, &ipi_bitmap_low, cluster_size) {
+		vcpu = map->phys_map[min + i]->vcpu;
+		if (!kvm_apic_set_irq(vcpu, &irq, NULL))
+			return 1;
+	}
+
+	for_each_set_bit(i, &ipi_bitmap_high, cluster_size) {
+		vcpu = map->phys_map[min + i + cluster_size]->vcpu;
+		if (!kvm_apic_set_irq(vcpu, &irq, NULL))
+			return 1;
+	}
+
+	rcu_read_unlock();
+	return 0;
+}
+
 void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.apicv_active = false;
@@ -6739,6 +6773,9 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 	case KVM_HC_CLOCK_PAIRING:
 		ret = kvm_pv_clock_pairing(vcpu, a0, a1);
 		break;
+	case KVM_HC_SEND_IPI:
+		ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
+		break;
 #endif
 	default:
 		ret = -KVM_ENOSYS;
-- 
2.7.4



* [PATCH v5 5/6] KVM: X86: Add NMI support to PV IPIs
  2018-07-23  6:39 [PATCH v5 0/6] KVM: X86: Implement Exit-less IPIs support Wanpeng Li
                   ` (3 preceding siblings ...)
  2018-07-23  6:39 ` [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall Wanpeng Li
@ 2018-07-23  6:39 ` Wanpeng Li
  2018-07-23  6:39 ` [PATCH v5 6/6] KVM: X86: Expose PV_SEND_IPI CPUID feature bit to guest Wanpeng Li
  5 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2018-07-23  6:39 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Vitaly Kuznetsov

From: Wanpeng Li <wanpengli@tencent.com>

The NMI delivery mode of ICR is used to deliver an NMI to the processor, 
and the vector information is ignored.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kernel/kvm.c | 15 ++++++++++++---
 arch/x86/kvm/x86.c    | 16 +++++++++++-----
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 57eb4a2..3456531 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -458,7 +458,7 @@ static void __init sev_map_percpu_data(void)
 static int __send_ipi_mask(const struct cpumask *mask, int vector)
 {
 	unsigned long flags;
-	int cpu, apic_id, min = 0, max = 0, ret = 0;
+	int cpu, apic_id, min = 0, max = 0, ret = 0, icr = 0;
 #ifdef CONFIG_X86_64
 	__uint128_t ipi_bitmap = 0;
 	int cluster_size = 128;
@@ -472,6 +472,15 @@ static int __send_ipi_mask(const struct cpumask *mask, int vector)
 
 	local_irq_save(flags);
 
+	switch (vector) {
+	default:
+		icr = APIC_DM_FIXED | vector;
+		break;
+	case NMI_VECTOR:
+		icr = APIC_DM_NMI;
+		break;
+	}
+
 	for_each_cpu(cpu, mask) {
 		apic_id = per_cpu(x86_cpu_to_apicid, cpu);
 		if (!ipi_bitmap) {
@@ -483,7 +492,7 @@ static int __send_ipi_mask(const struct cpumask *mask, int vector)
 			max = apic_id < max ? max : apic_id;
 		} else {
 			ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
-				(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
+				(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, icr);
 			min = max = apic_id;
 			ipi_bitmap = 0;
 		}
@@ -492,7 +501,7 @@ static int __send_ipi_mask(const struct cpumask *mask, int vector)
 
 	if (ipi_bitmap) {
 		ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
-			(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
+			(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, icr);
 	}
 
 	local_irq_restore(flags);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a43a29f..c118040 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6695,17 +6695,23 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
  * Return 0 if successfully added and 1 if discarded.
  */
 static int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
-		unsigned long ipi_bitmap_high, int min, int vector, int op_64_bit)
+		unsigned long ipi_bitmap_high, int min, unsigned long icr, int op_64_bit)
 {
 	int i;
 	struct kvm_apic_map *map;
 	struct kvm_vcpu *vcpu;
-	struct kvm_lapic_irq irq = {
-		.delivery_mode = APIC_DM_FIXED,
-		.vector = vector,
-	};
+	struct kvm_lapic_irq irq = {0};
 	int cluster_size = op_64_bit ? 64 : 32;
 
+	switch (icr & APIC_VECTOR_MASK) {
+	default:
+		irq.vector = icr & APIC_VECTOR_MASK;
+		break;
+	case NMI_VECTOR:
+		break;
+	}
+	irq.delivery_mode = icr & APIC_MODE_MASK;
+
 	rcu_read_lock();
 	map = rcu_dereference(kvm->arch.apic_map);
 
-- 
2.7.4



* [PATCH v5 6/6] KVM: X86: Expose PV_SEND_IPI CPUID feature bit to guest
  2018-07-23  6:39 [PATCH v5 0/6] KVM: X86: Implement Exit-less IPIs support Wanpeng Li
                   ` (4 preceding siblings ...)
  2018-07-23  6:39 ` [PATCH v5 5/6] KVM: X86: Add NMI support to PV IPIs Wanpeng Li
@ 2018-07-23  6:39 ` Wanpeng Li
  5 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2018-07-23  6:39 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Radim Krčmář, Vitaly Kuznetsov

From: Wanpeng Li <wanpengli@tencent.com>

Expose the PV_SEND_IPI feature bit to the guest; the guest can check this
feature bit before using paravirtualized IPI sends.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 Documentation/virtual/kvm/cpuid.txt | 4 ++++
 arch/x86/kvm/cpuid.c                | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index ab022dc..97ca194 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -62,6 +62,10 @@ KVM_FEATURE_ASYNC_PF_VMEXIT        ||    10 || paravirtualized async PF VM exit
                                    ||       || can be enabled by setting bit 2
                                    ||       || when writing to msr 0x4b564d02
 ------------------------------------------------------------------------------
+KVM_FEATURE_PV_SEND_IPI            ||    11 || guest checks this feature bit
+                                   ||       || before using paravirtualized
+                                   ||       || send IPIs.
+------------------------------------------------------------------------------
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
                                    ||       || per-cpu warps are expected in
                                    ||       || kvmclock.
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7e042e3..7bcfa61 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -621,7 +621,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			     (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
 			     (1 << KVM_FEATURE_PV_UNHALT) |
 			     (1 << KVM_FEATURE_PV_TLB_FLUSH) |
-			     (1 << KVM_FEATURE_ASYNC_PF_VMEXIT);
+			     (1 << KVM_FEATURE_ASYNC_PF_VMEXIT) |
+			     (1 << KVM_FEATURE_PV_SEND_IPI);
 
 		if (sched_info_on())
 			entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
-- 
2.7.4



* Re: [PATCH v5 3/6] KVM: X86: Fallback to original apic hooks when bad happens
  2018-07-23  6:39 ` [PATCH v5 3/6] KVM: X86: Fallback to original apic hooks when bad happens Wanpeng Li
@ 2018-08-02 13:01   ` Paolo Bonzini
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2018-08-02 13:01 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm
  Cc: Radim Krčmář, Vitaly Kuznetsov

On 23/07/2018 08:39, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> Fallback to original apic hooks when unlikely kvm fails to add the
> pending IRQ to lapic.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
>  arch/x86/kernel/kvm.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index eed6046..57eb4a2 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -47,6 +47,7 @@
>  #include <asm/hypervisor.h>
>  #include <asm/kvm_guest.h>
>  
> +static struct apic orig_apic;
>  static int kvmapf = 1;
>  
>  static int __init parse_no_kvmapf(char *arg)
> @@ -454,10 +455,10 @@ static void __init sev_map_percpu_data(void)
>  }
>  
>  #ifdef CONFIG_SMP
> -static void __send_ipi_mask(const struct cpumask *mask, int vector)
> +static int __send_ipi_mask(const struct cpumask *mask, int vector)
>  {
>  	unsigned long flags;
> -	int cpu, apic_id, min = 0, max = 0;
> +	int cpu, apic_id, min = 0, max = 0, ret = 0;
>  #ifdef CONFIG_X86_64
>  	__uint128_t ipi_bitmap = 0;
>  	int cluster_size = 128;
> @@ -467,7 +468,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
>  #endif
>  
>  	if (cpumask_empty(mask))
> -		return;
> +		return 0;
>  
>  	local_irq_save(flags);
>  
> @@ -481,7 +482,7 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
>  		} else if (apic_id < min + cluster_size) {
>  			max = apic_id < max ? max : apic_id;
>  		} else {
> -			kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
> +			ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
>  				(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
>  			min = max = apic_id;
>  			ipi_bitmap = 0;
> @@ -490,11 +491,12 @@ static void __send_ipi_mask(const struct cpumask *mask, int vector)
>  	}
>  
>  	if (ipi_bitmap) {
> -		kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
> +		ret = kvm_hypercall4(KVM_HC_SEND_IPI, (unsigned long)ipi_bitmap,
>  			(unsigned long)(ipi_bitmap >> BITS_PER_LONG), min, vector);
>  	}
>  
>  	local_irq_restore(flags);
> +	return ret;
>  }
>  
>  static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
> @@ -511,7 +513,8 @@ static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
>  	cpumask_copy(&new_mask, mask);
>  	cpumask_clear_cpu(this_cpu, &new_mask);
>  	local_mask = &new_mask;
> -	__send_ipi_mask(local_mask, vector);
> +	if (__send_ipi_mask(local_mask, vector))
> +		orig_apic.send_IPI_mask_allbutself(mask, vector);
>  }
>  
>  static void kvm_send_ipi_allbutself(int vector)
> @@ -521,7 +524,8 @@ static void kvm_send_ipi_allbutself(int vector)
>  
>  static void kvm_send_ipi_all(int vector)
>  {
> -	__send_ipi_mask(cpu_online_mask, vector);
> +	if (__send_ipi_mask(cpu_online_mask, vector))
> +		orig_apic.send_IPI_all(vector);
>  }
>  
>  /*
> @@ -529,6 +533,8 @@ static void kvm_send_ipi_all(int vector)
>   */
>  static void kvm_setup_pv_ipi(void)
>  {
> +	orig_apic = *apic;
> +
>  	apic->send_IPI_mask = kvm_send_ipi_mask;
>  	apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
>  	apic->send_IPI_allbutself = kvm_send_ipi_allbutself;
> 

Is this actually needed?

Paolo


* Re: [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall
  2018-07-23  6:39 ` [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall Wanpeng Li
@ 2018-08-02 13:04   ` Paolo Bonzini
  2018-08-03  4:09     ` Wanpeng Li
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2018-08-02 13:04 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm
  Cc: Radim Krčmář, Vitaly Kuznetsov

On 23/07/2018 08:39, Wanpeng Li wrote:
> +Returns 0 if successfully delivery the IPIs and 1 if discarded.

I'm changing this to

"Returns the number of CPUs to which the IPIs were delivered successfully"

with an obvious change to x86.c.

Paolo


* Re: [PATCH v5 4/6] KVM: X86: Implement PV IPIs send hypercall
  2018-08-02 13:04   ` Paolo Bonzini
@ 2018-08-03  4:09     ` Wanpeng Li
  0 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2018-08-03  4:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: LKML, kvm, Radim Krcmar, Vitaly Kuznetsov

On Thu, 2 Aug 2018 at 21:04, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 23/07/2018 08:39, Wanpeng Li wrote:
> > +Returns 0 if successfully delivery the IPIs and 1 if discarded.
>
> I'm changing this to
>
> "Returns the number of CPUs to which the IPIs were delivered successfully"
>
> with an obvious change to x86.c.

Thanks Paolo!

Regards,
Wanpeng Li

