linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] KVM: x86: hyperv: PV IPI support for Windows guests
@ 2018-06-22 14:56 Vitaly Kuznetsov
  2018-06-22 14:56 ` [PATCH 1/3] KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number Vitaly Kuznetsov
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Vitaly Kuznetsov @ 2018-06-22 14:56 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Radim Krčmář,
	Roman Kagan, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, linux-kernel

Using a hypercall for sending IPIs is faster because it allows specifying
any number of vCPUs (even > 64 with a sparse CPU set) while the whole
procedure takes only one VMEXIT.

As with PV TLB flush, this allows Windows guests with > 64 vCPUs to boot
on KVM when Hyper-V extensions are enabled.
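
For the record, the guest side of the non-'ex' variant boils down to filling
a 'struct hv_send_ipi' (see patch 2 for the rename) and issuing
HVCALL_SEND_IPI. A minimal sketch, assuming 'input_page' is a
hypervisor-accessible per-CPU input page (the real implementation lives in
arch/x86/hyperv/hv_apic.c):

	static u64 example_send_ipi(void *input_page, u32 vector, u64 cpu_mask)
	{
		struct hv_send_ipi *ipi_arg = input_page;

		ipi_arg->vector = vector;	/* must be in 16..255 */
		ipi_arg->reserved = 0;
		ipi_arg->cpu_mask = cpu_mask;	/* only the first 64 vCPUs */

		return hv_do_hypercall(HVCALL_SEND_IPI, ipi_arg, NULL);
	}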

Vitaly Kuznetsov (3):
  KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number
  x86/hyper-v: rename ipi_arg_{ex,non_ex} structures
  KVM: x86: hyperv: implement PV IPI send hypercalls

 Documentation/virtual/kvm/api.txt  |  10 +++-
 arch/x86/hyperv/hv_apic.c          |  12 ++--
 arch/x86/include/asm/hyperv-tlfs.h |  16 ++---
 arch/x86/kvm/hyperv.c              | 116 +++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/trace.h               |  42 ++++++++++++++
 arch/x86/kvm/x86.c                 |   1 +
 include/uapi/linux/kvm.h           |   1 +
 7 files changed, 184 insertions(+), 14 deletions(-)

-- 
2.14.4



* [PATCH 1/3] KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number
  2018-06-22 14:56 [PATCH 0/3] KVM: x86: hyperv: PV IPI support for Windows guests Vitaly Kuznetsov
@ 2018-06-22 14:56 ` Vitaly Kuznetsov
  2018-06-22 16:58   ` Radim Krčmář
  2018-06-22 14:56 ` [PATCH 2/3] x86/hyper-v: rename ipi_arg_{ex,non_ex} structures Vitaly Kuznetsov
  2018-06-22 14:56 ` [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls Vitaly Kuznetsov
  2 siblings, 1 reply; 10+ messages in thread
From: Vitaly Kuznetsov @ 2018-06-22 14:56 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Radim Krčmář,
	Roman Kagan, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, linux-kernel

KVM_CAP_HYPERV_TLBFLUSH collided with KVM_CAP_S390_BPB; its paragraph
number should now be 8.18.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 Documentation/virtual/kvm/api.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 495b7742ab58..d10944e619d3 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4610,7 +4610,7 @@ This capability indicates that kvm will implement the interfaces to handle
 reset, migration and nested KVM for branch prediction blocking. The stfle
 facility 82 should not be provided to the guest without this capability.
 
-8.14 KVM_CAP_HYPERV_TLBFLUSH
+8.18 KVM_CAP_HYPERV_TLBFLUSH
 
 Architectures: x86
 
-- 
2.14.4



* [PATCH 2/3] x86/hyper-v: rename ipi_arg_{ex,non_ex} structures
  2018-06-22 14:56 [PATCH 0/3] KVM: x86: hyperv: PV IPI support for Windows guests Vitaly Kuznetsov
  2018-06-22 14:56 ` [PATCH 1/3] KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number Vitaly Kuznetsov
@ 2018-06-22 14:56 ` Vitaly Kuznetsov
  2018-06-22 14:56 ` [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls Vitaly Kuznetsov
  2 siblings, 0 replies; 10+ messages in thread
From: Vitaly Kuznetsov @ 2018-06-22 14:56 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Radim Krčmář,
	Roman Kagan, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, linux-kernel

These structures are going to be used by KVM code, so let's make
their names reflect their Hyper-V origin.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/hyperv/hv_apic.c          | 12 ++++++------
 arch/x86/include/asm/hyperv-tlfs.h | 16 +++++++++-------
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index f68855499391..cb17168e6263 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -93,14 +93,14 @@ static void hv_apic_eoi_write(u32 reg, u32 val)
  */
 static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector)
 {
-	struct ipi_arg_ex **arg;
-	struct ipi_arg_ex *ipi_arg;
+	struct hv_send_ipi_ex **arg;
+	struct hv_send_ipi_ex *ipi_arg;
 	unsigned long flags;
 	int nr_bank = 0;
 	int ret = 1;
 
 	local_irq_save(flags);
-	arg = (struct ipi_arg_ex **)this_cpu_ptr(hyperv_pcpu_input_arg);
+	arg = (struct hv_send_ipi_ex **)this_cpu_ptr(hyperv_pcpu_input_arg);
 
 	ipi_arg = *arg;
 	if (unlikely(!ipi_arg))
@@ -128,8 +128,8 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector)
 static bool __send_ipi_mask(const struct cpumask *mask, int vector)
 {
 	int cur_cpu, vcpu;
-	struct ipi_arg_non_ex **arg;
-	struct ipi_arg_non_ex *ipi_arg;
+	struct hv_send_ipi **arg;
+	struct hv_send_ipi *ipi_arg;
 	int ret = 1;
 	unsigned long flags;
 
@@ -146,7 +146,7 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector)
 		return __send_ipi_mask_ex(mask, vector);
 
 	local_irq_save(flags);
-	arg = (struct ipi_arg_non_ex **)this_cpu_ptr(hyperv_pcpu_input_arg);
+	arg = (struct hv_send_ipi **)this_cpu_ptr(hyperv_pcpu_input_arg);
 
 	ipi_arg = *arg;
 	if (unlikely(!ipi_arg))
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index b8c89265baf0..b52c9604b20d 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -723,19 +723,21 @@ struct hv_enlightened_vmcs {
 #define HV_STIMER_AUTOENABLE		(1ULL << 3)
 #define HV_STIMER_SINT(config)		(__u8)(((config) >> 16) & 0x0F)
 
-struct ipi_arg_non_ex {
-	u32 vector;
-	u32 reserved;
-	u64 cpu_mask;
-};
-
 struct hv_vpset {
 	u64 format;
 	u64 valid_bank_mask;
 	u64 bank_contents[];
 };
 
-struct ipi_arg_ex {
+/* HvCallSendSyntheticClusterIpi hypercall */
+struct hv_send_ipi {
+	u32 vector;
+	u32 reserved;
+	u64 cpu_mask;
+};
+
+/* HvCallSendSyntheticClusterIpiEx hypercall */
+struct hv_send_ipi_ex {
 	u32 vector;
 	u32 reserved;
 	struct hv_vpset vp_set;
-- 
2.14.4



* [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls
  2018-06-22 14:56 [PATCH 0/3] KVM: x86: hyperv: PV IPI support for Windows guests Vitaly Kuznetsov
  2018-06-22 14:56 ` [PATCH 1/3] KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number Vitaly Kuznetsov
  2018-06-22 14:56 ` [PATCH 2/3] x86/hyper-v: rename ipi_arg_{ex,non_ex} structures Vitaly Kuznetsov
@ 2018-06-22 14:56 ` Vitaly Kuznetsov
  2018-06-22 19:13   ` Radim Krčmář
  2 siblings, 1 reply; 10+ messages in thread
From: Vitaly Kuznetsov @ 2018-06-22 14:56 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Radim Krčmář,
	Roman Kagan, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, linux-kernel

Using a hypercall for sending IPIs is faster because it allows specifying
any number of vCPUs (even > 64 with a sparse CPU set) while the whole
procedure takes only one VMEXIT.

The current Hyper-V TLFS (v5.0b) claims that the HvCallSendSyntheticClusterIpi
hypercall can't be 'fast' (passing parameters through registers), but this is
apparently not true: Windows always uses it as 'fast', so we need to support
that.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
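A note for reviewers (not part of the commit message): the sparse VP_SET
addressing used below, in a nutshell. The helper name here is made up and
only illustrates what get_sparse_bank_no() plus the BIT_ULL() test do:

	static bool vp_in_sparse_set(u32 vp_index, u64 valid_bank_mask,
				     const u64 *sparse_banks)
	{
		int bank = vp_index / 64;	/* 64-bit bank holding this VP */
		int sbank;			/* its index among the valid banks */

		/* Banks >= 64 can't be represented by the 64-bit valid_bank_mask */
		if (bank >= 64 || !(valid_bank_mask & BIT_ULL(bank)))
			return false;

		/* bank_contents[] only carries banks set in valid_bank_mask */
		sbank = hweight64(valid_bank_mask & (BIT_ULL(bank) - 1));

		return sparse_banks[sbank] & BIT_ULL(vp_index % 64);
	}
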
 Documentation/virtual/kvm/api.txt |   8 +++
 arch/x86/kvm/hyperv.c             | 116 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/trace.h              |  42 ++++++++++++++
 arch/x86/kvm/x86.c                |   1 +
 include/uapi/linux/kvm.h          |   1 +
 5 files changed, 168 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index d10944e619d3..8a8e13c83aab 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4618,3 +4618,11 @@ This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
 hypercalls:
 HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
 HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.
+
+8.19 KVM_CAP_HYPERV_SEND_IPI
+
+Architectures: x86
+
+This capability indicates that KVM supports paravirtualized Hyper-V IPI send
+hypercalls:
+HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index af8caf965baa..aa110b1da103 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1357,6 +1357,108 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
 		((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
 }
 
+static u64 kvm_hv_send_ipi(struct kvm_vcpu *current_vcpu, u64 ingpa, u64 outgpa,
+			   bool ex, bool fast)
+{
+	struct kvm *kvm = current_vcpu->kvm;
+	struct hv_send_ipi_ex send_ipi_ex;
+	struct hv_send_ipi send_ipi;
+	struct kvm_vcpu *vcpu;
+	unsigned long valid_bank_mask = 0;
+	u64 sparse_banks[64];
+	int sparse_banks_len, i;
+	struct kvm_lapic_irq irq = {0};
+	bool all_cpus;
+
+	if (!ex) {
+		if (!fast) {
+			if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi,
+						    sizeof(send_ipi))))
+				return HV_STATUS_INVALID_HYPERCALL_INPUT;
+			sparse_banks[0] = send_ipi.cpu_mask;
+			irq.vector = send_ipi.vector;
+		} else {
+			/* 'reserved' part of hv_send_ipi should be 0 */
+			if (unlikely(ingpa >> 32 != 0))
+				return HV_STATUS_INVALID_HYPERCALL_INPUT;
+			sparse_banks[0] = outgpa;
+			irq.vector = (u32)ingpa;
+		}
+		all_cpus = false;
+
+		trace_kvm_hv_send_ipi(irq.vector, sparse_banks[0]);
+	} else {
+		if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi_ex,
+					    sizeof(send_ipi_ex))))
+			return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+		trace_kvm_hv_send_ipi_ex(send_ipi_ex.vector,
+					 send_ipi_ex.vp_set.format,
+					 send_ipi_ex.vp_set.valid_bank_mask);
+
+		irq.vector = send_ipi_ex.vector;
+		valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
+		sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
+			sizeof(sparse_banks[0]);
+		all_cpus = send_ipi_ex.vp_set.format !=
+			HV_GENERIC_SET_SPARSE_4K;
+
+		if (!sparse_banks_len)
+			goto ret_success;
+
+		if (!all_cpus &&
+		    kvm_read_guest(kvm,
+				   ingpa + offsetof(struct hv_send_ipi_ex,
+						    vp_set.bank_contents),
+				   sparse_banks,
+				   sparse_banks_len))
+			return HV_STATUS_INVALID_HYPERCALL_INPUT;
+	}
+
+	if ((irq.vector < HV_IPI_LOW_VECTOR) ||
+	    (irq.vector > HV_IPI_HIGH_VECTOR))
+		return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+	irq.delivery_mode = APIC_DM_FIXED;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
+		int bank = hv->vp_index / 64, sbank = 0;
+
+		if (!all_cpus) {
+			/* Banks >64 can't be represented */
+			if (bank >= 64)
+				continue;
+
+			/* Non-ex hypercalls can only address first 64 vCPUs */
+			if (!ex && bank)
+				continue;
+
+			if (ex) {
+				/*
+				 * Check if the bank of this vCPU is in the
+				 * sparse set and get the sparse bank number.
+				 */
+				sbank = get_sparse_bank_no(valid_bank_mask,
+							   bank);
+
+				if (sbank < 0)
+					continue;
+			}
+
+			if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
+				continue;
+		}
+
+		/* We fail only when APIC is disabled */
+		if (!kvm_apic_set_irq(vcpu, &irq, NULL))
+			return HV_STATUS_INVALID_HYPERCALL_INPUT;
+	}
+
+ret_success:
+	return HV_STATUS_SUCCESS;
+}
+
 bool kvm_hv_hypercall_enabled(struct kvm *kvm)
 {
 	return READ_ONCE(kvm->arch.hyperv.hv_hypercall) & HV_X64_MSR_HYPERCALL_ENABLE;
@@ -1526,6 +1628,20 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
 		}
 		ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
 		break;
+	case HVCALL_SEND_IPI:
+		if (unlikely(rep)) {
+			ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+			break;
+		}
+		ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, false, fast);
+		break;
+	case HVCALL_SEND_IPI_EX:
+		if (unlikely(fast || rep)) {
+			ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+			break;
+		}
+		ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, true, false);
+		break;
 	default:
 		ret = HV_STATUS_INVALID_HYPERCALL_CODE;
 		break;
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 0f997683404f..0659465a745c 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1418,6 +1418,48 @@ TRACE_EVENT(kvm_hv_flush_tlb_ex,
 		  __entry->valid_bank_mask, __entry->format,
 		  __entry->address_space, __entry->flags)
 );
+
+/*
+ * Tracepoints for kvm_hv_send_ipi.
+ */
+TRACE_EVENT(kvm_hv_send_ipi,
+	TP_PROTO(u32 vector, u64 processor_mask),
+	TP_ARGS(vector, processor_mask),
+
+	TP_STRUCT__entry(
+		__field(u32, vector)
+		__field(u64, processor_mask)
+	),
+
+	TP_fast_assign(
+		__entry->vector = vector;
+		__entry->processor_mask = processor_mask;
+	),
+
+	TP_printk("vector %x processor_mask 0x%llx",
+		  __entry->vector, __entry->processor_mask)
+);
+
+TRACE_EVENT(kvm_hv_send_ipi_ex,
+	TP_PROTO(u32 vector, u64 format, u64 valid_bank_mask),
+	TP_ARGS(vector, format, valid_bank_mask),
+
+	TP_STRUCT__entry(
+		__field(u32, vector)
+		__field(u64, format)
+		__field(u64, valid_bank_mask)
+	),
+
+	TP_fast_assign(
+		__entry->vector = vector;
+		__entry->format = format;
+		__entry->valid_bank_mask = valid_bank_mask;
+	),
+
+	TP_printk("vector %x format %llx valid_bank_mask 0x%llx",
+		  __entry->vector, __entry->format,
+		  __entry->valid_bank_mask)
+);
 #endif /* _TRACE_KVM_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0046aa70205a..1884b66de9c2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2874,6 +2874,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_HYPERV_VP_INDEX:
 	case KVM_CAP_HYPERV_EVENTFD:
 	case KVM_CAP_HYPERV_TLBFLUSH:
+	case KVM_CAP_HYPERV_SEND_IPI:
 	case KVM_CAP_PCI_SEGMENT:
 	case KVM_CAP_DEBUGREGS:
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b6270a3b38e9..adce915f80a5 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -949,6 +949,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_GET_MSR_FEATURES 153
 #define KVM_CAP_HYPERV_EVENTFD 154
 #define KVM_CAP_HYPERV_TLBFLUSH 155
+#define KVM_CAP_HYPERV_SEND_IPI 156
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.14.4



* Re: [PATCH 1/3] KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number
  2018-06-22 14:56 ` [PATCH 1/3] KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number Vitaly Kuznetsov
@ 2018-06-22 16:58   ` Radim Krčmář
  0 siblings, 0 replies; 10+ messages in thread
From: Radim Krčmář @ 2018-06-22 16:58 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Roman Kagan, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, linux-kernel

2018-06-22 16:56+0200, Vitaly Kuznetsov:
> KVM_CAP_HYPERV_TLBFLUSH collided with KVM_CAP_S390_BPB; its paragraph
> number should now be 8.18.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  Documentation/virtual/kvm/api.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 495b7742ab58..d10944e619d3 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -4610,7 +4610,7 @@ This capability indicates that kvm will implement the interfaces to handle
>  reset, migration and nested KVM for branch prediction blocking. The stfle
>  facility 82 should not be provided to the guest without this capability.
>  
> -8.14 KVM_CAP_HYPERV_TLBFLUSH
> +8.18 KVM_CAP_HYPERV_TLBFLUSH

Taking this one early, thanks.


* Re: [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls
  2018-06-22 14:56 ` [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls Vitaly Kuznetsov
@ 2018-06-22 19:13   ` Radim Krčmář
  2018-06-25  9:10     ` Vitaly Kuznetsov
  2018-06-28 14:05     ` Wanpeng Li
  0 siblings, 2 replies; 10+ messages in thread
From: Radim Krčmář @ 2018-06-22 19:13 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Roman Kagan, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, linux-kernel

2018-06-22 16:56+0200, Vitaly Kuznetsov:
> Using a hypercall for sending IPIs is faster because it allows specifying
> any number of vCPUs (even > 64 with a sparse CPU set) while the whole
> procedure takes only one VMEXIT.
> 
> The current Hyper-V TLFS (v5.0b) claims that the HvCallSendSyntheticClusterIpi
> hypercall can't be 'fast' (passing parameters through registers), but this
> is apparently not true: Windows always uses it as 'fast', so we need to
> support that.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> @@ -1357,6 +1357,108 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
>  		((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
>  }
>  
> +static u64 kvm_hv_send_ipi(struct kvm_vcpu *current_vcpu, u64 ingpa, u64 outgpa,
> +			   bool ex, bool fast)
> +{
> +	struct kvm *kvm = current_vcpu->kvm;
> +	struct hv_send_ipi_ex send_ipi_ex;
> +	struct hv_send_ipi send_ipi;
> +	struct kvm_vcpu *vcpu;
> +	unsigned long valid_bank_mask = 0;
> +	u64 sparse_banks[64];
> +	int sparse_banks_len, i;
> +	struct kvm_lapic_irq irq = {0};
> +	bool all_cpus;
> +
> +	if (!ex) {
> +		if (!fast) {
> +			if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi,
> +						    sizeof(send_ipi))))
> +				return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +			sparse_banks[0] = send_ipi.cpu_mask;
> +			irq.vector = send_ipi.vector;
> +		} else {
> +			/* 'reserved' part of hv_send_ipi should be 0 */
> +			if (unlikely(ingpa >> 32 != 0))
> +				return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +			sparse_banks[0] = outgpa;
> +			irq.vector = (u32)ingpa;
> +		}
> +		all_cpus = false;
> +
> +		trace_kvm_hv_send_ipi(irq.vector, sparse_banks[0]);
> +	} else {
> +		if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi_ex,
> +					    sizeof(send_ipi_ex))))
> +			return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +
> +		trace_kvm_hv_send_ipi_ex(send_ipi_ex.vector,
> +					 send_ipi_ex.vp_set.format,
> +					 send_ipi_ex.vp_set.valid_bank_mask);
> +
> +		irq.vector = send_ipi_ex.vector;
> +		valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
> +		sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
> +			sizeof(sparse_banks[0]);
> +		all_cpus = send_ipi_ex.vp_set.format !=
> +			HV_GENERIC_SET_SPARSE_4K;

This would be much better readable as

  send_ipi_ex.vp_set.format == HV_GENERIC_SET_ALL

And if Microsoft ever adds more formats, they won't be all VCPUs, so
we're future-proofing as well.

> +
> +		if (!sparse_banks_len)
> +			goto ret_success;
> +
> +		if (!all_cpus &&
> +		    kvm_read_guest(kvm,
> +				   ingpa + offsetof(struct hv_send_ipi_ex,
> +						    vp_set.bank_contents),
> +				   sparse_banks,
> +				   sparse_banks_len))
> +			return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +	}
> +
> +	if ((irq.vector < HV_IPI_LOW_VECTOR) ||
> +	    (irq.vector > HV_IPI_HIGH_VECTOR))
> +		return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +
> +	irq.delivery_mode = APIC_DM_FIXED;

I'd set this during variable definition.

APIC_DM_FIXED is 0 anyway, and the compiler probably won't optimize it
here due to the function calls with side effects since the definition.
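
Something like this, just to spell the suggestion out:

	struct kvm_lapic_irq irq = {
		.delivery_mode = APIC_DM_FIXED,
	};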

> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
> +		int bank = hv->vp_index / 64, sbank = 0;
> +
> +		if (!all_cpus) {
> +			/* Banks >64 can't be represented */
> +			if (bank >= 64)
> +				continue;
> +
> +			/* Non-ex hypercalls can only address first 64 vCPUs */
> +			if (!ex && bank)
> +				continue;
> +
> +			if (ex) {
> +				/*
> +				 * Check is the bank of this vCPU is in sparse
> +				 * set and get the sparse bank number.
> +				 */
> +				sbank = get_sparse_bank_no(valid_bank_mask,
> +							   bank);
> +
> +				if (sbank < 0)
> +					continue;
> +			}
> +
> +			if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
> +				continue;
> +		}
> +
> +		/* We fail only when APIC is disabled */
> +		if (!kvm_apic_set_irq(vcpu, &irq, NULL))
> +			return HV_STATUS_INVALID_HYPERCALL_INPUT;

Does Windows use this even for 1 VCPU IPI?

I'm thinking we could apply the same optimization we do for LAPIC -- RCU
protected array that maps vp_index to vcpu.
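
Roughly (all names made up, just to sketch the idea):

	static struct kvm_vcpu *hv_vp_index_to_vcpu(struct kvm *kvm, u32 vp_index)
	{
		struct kvm_vcpu **map;
		struct kvm_vcpu *vcpu = NULL;

		rcu_read_lock();
		map = rcu_dereference(kvm->arch.hyperv.vp_idx_to_vcpu);
		if (map && vp_index < KVM_MAX_VCPUS)
			vcpu = map[vp_index];
		rcu_read_unlock();

		/* vCPUs are not freed while the VM is alive */
		return vcpu;
	}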

Thanks.

> +	}
> +
> +ret_success:
> +	return HV_STATUS_SUCCESS;
> +}
> +
>  bool kvm_hv_hypercall_enabled(struct kvm *kvm)
>  {
>  	return READ_ONCE(kvm->arch.hyperv.hv_hypercall) & HV_X64_MSR_HYPERCALL_ENABLE;
> @@ -1526,6 +1628,20 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>  		}
>  		ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
>  		break;
> +	case HVCALL_SEND_IPI:
> +		if (unlikely(rep)) {
> +			ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
> +			break;
> +		}
> +		ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, false, fast);
> +		break;
> +	case HVCALL_SEND_IPI_EX:
> +		if (unlikely(fast || rep)) {

Now I'm getting worried that the ex can be fast as well and we'll be
reading the banks from XMM registers. :)


* Re: [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls
  2018-06-22 19:13   ` Radim Krčmář
@ 2018-06-25  9:10     ` Vitaly Kuznetsov
  2018-06-28 14:05     ` Wanpeng Li
  1 sibling, 0 replies; 10+ messages in thread
From: Vitaly Kuznetsov @ 2018-06-25  9:10 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: kvm, Paolo Bonzini, Roman Kagan, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, linux-kernel

Radim Krčmář <rkrcmar@redhat.com> writes:

> 2018-06-22 16:56+0200, Vitaly Kuznetsov:
>> +
>> +		/* We fail only when APIC is disabled */
>> +		if (!kvm_apic_set_irq(vcpu, &irq, NULL))
>> +			return HV_STATUS_INVALID_HYPERCALL_INPUT;
>
> Does Windows use this even for 1 VCPU IPI?
>

It seems that it does.

> I'm thinking we could apply the same optimization we do for LAPIC -- RCU
> protected array that maps vp_index to vcpu.

Sure, both this and PV TLB flush will benefit.

>
> Thanks.
>
>> +	}
>> +
>> +ret_success:
>> +	return HV_STATUS_SUCCESS;
>> +}
>> +
>>  bool kvm_hv_hypercall_enabled(struct kvm *kvm)
>>  {
>>  	return READ_ONCE(kvm->arch.hyperv.hv_hypercall) & HV_X64_MSR_HYPERCALL_ENABLE;
>> @@ -1526,6 +1628,20 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>>  		}
>>  		ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
>>  		break;
>> +	case HVCALL_SEND_IPI:
>> +		if (unlikely(rep)) {
>> +			ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
>> +			break;
>> +		}
>> +		ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, false, fast);
>> +		break;
>> +	case HVCALL_SEND_IPI_EX:
>> +		if (unlikely(fast || rep)) {
>
> Now I'm getting worried that the ex can be fast as well and we'll be
> reading the banks from XMM registers. :)

Maybe, but currently we don't announce 'parameters through XMM registers'
support in KVM (and neither do we support them for Linux-on-Hyper-V, as
we don't usually use the FPU in the kernel).

-- 
  Vitaly


* Re: [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls
  2018-06-22 19:13   ` Radim Krčmář
  2018-06-25  9:10     ` Vitaly Kuznetsov
@ 2018-06-28 14:05     ` Wanpeng Li
  2018-06-28 15:04       ` Vitaly Kuznetsov
  1 sibling, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2018-06-28 14:05 UTC (permalink / raw)
  To: Radim Krcmar
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Roman Kagan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, LKML

On Sat, 23 Jun 2018 at 03:14, Radim Krčmář <rkrcmar@redhat.com> wrote:
>
> 2018-06-22 16:56+0200, Vitaly Kuznetsov:
> > Using a hypercall for sending IPIs is faster because it allows specifying
> > any number of vCPUs (even > 64 with a sparse CPU set) while the whole
> > procedure takes only one VMEXIT.
> >
> > The current Hyper-V TLFS (v5.0b) claims that the HvCallSendSyntheticClusterIpi
> > hypercall can't be 'fast' (passing parameters through registers), but this
> > is apparently not true: Windows always uses it as 'fast', so we need to
> > support that.
> >
> > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > ---
> > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> > @@ -1357,6 +1357,108 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
> >               ((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
> >  }
> >
> > +static u64 kvm_hv_send_ipi(struct kvm_vcpu *current_vcpu, u64 ingpa, u64 outgpa,
> > +                        bool ex, bool fast)
> > +{
> > +     struct kvm *kvm = current_vcpu->kvm;
> > +     struct hv_send_ipi_ex send_ipi_ex;
> > +     struct hv_send_ipi send_ipi;
> > +     struct kvm_vcpu *vcpu;
> > +     unsigned long valid_bank_mask = 0;
> > +     u64 sparse_banks[64];
> > +     int sparse_banks_len, i;
> > +     struct kvm_lapic_irq irq = {0};
> > +     bool all_cpus;
> > +
> > +     if (!ex) {
> > +             if (!fast) {
> > +                     if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi,
> > +                                                 sizeof(send_ipi))))
> > +                             return HV_STATUS_INVALID_HYPERCALL_INPUT;
> > +                     sparse_banks[0] = send_ipi.cpu_mask;
> > +                     irq.vector = send_ipi.vector;
> > +             } else {
> > +                     /* 'reserved' part of hv_send_ipi should be 0 */
> > +                     if (unlikely(ingpa >> 32 != 0))
> > +                             return HV_STATUS_INVALID_HYPERCALL_INPUT;
> > +                     sparse_banks[0] = outgpa;
> > +                     irq.vector = (u32)ingpa;
> > +             }
> > +             all_cpus = false;
> > +
> > +             trace_kvm_hv_send_ipi(irq.vector, sparse_banks[0]);
> > +     } else {
> > +             if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi_ex,
> > +                                         sizeof(send_ipi_ex))))
> > +                     return HV_STATUS_INVALID_HYPERCALL_INPUT;
> > +
> > +             trace_kvm_hv_send_ipi_ex(send_ipi_ex.vector,
> > +                                      send_ipi_ex.vp_set.format,
> > +                                      send_ipi_ex.vp_set.valid_bank_mask);
> > +
> > +             irq.vector = send_ipi_ex.vector;
> > +             valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
> > +             sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
> > +                     sizeof(sparse_banks[0]);
> > +             all_cpus = send_ipi_ex.vp_set.format !=
> > +                     HV_GENERIC_SET_SPARSE_4K;
>
> This would be much better readable as
>
>   send_ipi_ex.vp_set.format == HV_GENERIC_SET_ALL
>
> And if Microsoft ever adds more formats, they won't be all VCPUs, so
> we're future-proofing as well.
>
> > +
> > +             if (!sparse_banks_len)
> > +                     goto ret_success;
> > +
> > +             if (!all_cpus &&
> > +                 kvm_read_guest(kvm,
> > +                                ingpa + offsetof(struct hv_send_ipi_ex,
> > +                                                 vp_set.bank_contents),
> > +                                sparse_banks,
> > +                                sparse_banks_len))
> > +                     return HV_STATUS_INVALID_HYPERCALL_INPUT;
> > +     }
> > +
> > +     if ((irq.vector < HV_IPI_LOW_VECTOR) ||
> > +         (irq.vector > HV_IPI_HIGH_VECTOR))
> > +             return HV_STATUS_INVALID_HYPERCALL_INPUT;
> > +
> > +     irq.delivery_mode = APIC_DM_FIXED;
>
> I'd set this during variable definition.
>
> APIC_DM_FIXED is 0 anyway, and the compiler probably won't optimize it
> here due to the function calls with side effects since the definition.
>
> > +
> > +     kvm_for_each_vcpu(i, vcpu, kvm) {
> > +             struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
> > +             int bank = hv->vp_index / 64, sbank = 0;
> > +
> > +             if (!all_cpus) {
> > +                     /* Banks >64 can't be represented */
> > +                     if (bank >= 64)
> > +                             continue;
> > +
> > +                     /* Non-ex hypercalls can only address first 64 vCPUs */
> > +                     if (!ex && bank)
> > +                             continue;
> > +
> > +                     if (ex) {
> > +                             /*
> > +                              * Check is the bank of this vCPU is in sparse
> > +                              * set and get the sparse bank number.
> > +                              */
> > +                             sbank = get_sparse_bank_no(valid_bank_mask,
> > +                                                        bank);
> > +
> > +                             if (sbank < 0)
> > +                                     continue;
> > +                     }
> > +
> > +                     if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
> > +                             continue;
> > +             }
> > +
> > +             /* We fail only when APIC is disabled */
> > +             if (!kvm_apic_set_irq(vcpu, &irq, NULL))
> > +                     return HV_STATUS_INVALID_HYPERCALL_INPUT;
>
> Does Windows use this even for 1 VCPU IPI?
>
> I'm thinking we could apply the same optimization we do for LAPIC -- RCU
> protected array that maps vp_index to vcpu.
>
> Thanks.
>
> > +     }
> > +
> > +ret_success:
> > +     return HV_STATUS_SUCCESS;
> > +}
> > +
> >  bool kvm_hv_hypercall_enabled(struct kvm *kvm)
> >  {
> >       return READ_ONCE(kvm->arch.hyperv.hv_hypercall) & HV_X64_MSR_HYPERCALL_ENABLE;
> > @@ -1526,6 +1628,20 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
> >               }
> >               ret = kvm_hv_flush_tlb(vcpu, ingpa, rep_cnt, true);
> >               break;
> > +     case HVCALL_SEND_IPI:
> > +             if (unlikely(rep)) {
> > +                     ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
> > +                     break;
> > +             }
> > +             ret = kvm_hv_send_ipi(vcpu, ingpa, outgpa, false, fast);
> > +             break;
> > +     case HVCALL_SEND_IPI_EX:

Hi Paolo and Radim,

I have already completed the patches for the Linux guest/kvm/qemu with
vCPUs <= 64; however, supporting vCPUs > 64 requires extra complication
similar to the 'ex' hypercalls in Hyper-V. Do you think vCPUs <= 64 is
enough for Linux guests, or should I introduce two hypercalls with the
'ex' logic, as Hyper-V does?

Regards,
Wanpeng Li


* Re: [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls
  2018-06-28 14:05     ` Wanpeng Li
@ 2018-06-28 15:04       ` Vitaly Kuznetsov
  2018-06-28 15:06         ` KY Srinivasan
  0 siblings, 1 reply; 10+ messages in thread
From: Vitaly Kuznetsov @ 2018-06-28 15:04 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Radim Krcmar, kvm, Paolo Bonzini, Roman Kagan, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, LKML

Wanpeng Li <kernellwp@gmail.com> writes:

> Hi Paolo and Radim,
>
> I have already completed the patches for the Linux guest/kvm/qemu with
> vCPUs <= 64; however, supporting vCPUs > 64 requires extra complication
> similar to the 'ex' hypercalls in Hyper-V. Do you think vCPUs <= 64 is
> enough for Linux guests, or should I introduce two hypercalls with the
> 'ex' logic, as Hyper-V does?
>

Neither Paolo nor Radim, but as we already have
#define KVM_MAX_VCPUS 288
on x86 at least, supporting <= 64 vCPUs seems too limiting for any
new functionality.
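
(288 vCPUs already need DIV_ROUND_UP(288, 64) = 5 banks, so a single 64-bit
CPU mask can't address them all; that is exactly what the sparse 'ex' format
is for.)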

-- 
  Vitaly


* RE: [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send hypercalls
  2018-06-28 15:04       ` Vitaly Kuznetsov
@ 2018-06-28 15:06         ` KY Srinivasan
  0 siblings, 0 replies; 10+ messages in thread
From: KY Srinivasan @ 2018-06-28 15:06 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Wanpeng Li
  Cc: Radim Krcmar, kvm, Paolo Bonzini, Roman Kagan, Haiyang Zhang,
	Stephen Hemminger, Michael Kelley (EOSG),
	Mohammed Gamal, Cathy Avery, LKML



> -----Original Message-----
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> Sent: Thursday, June 28, 2018 8:05 AM
> To: Wanpeng Li <kernellwp@gmail.com>
> Cc: Radim Krcmar <rkrcmar@redhat.com>; kvm <kvm@vger.kernel.org>;
> Paolo Bonzini <pbonzini@redhat.com>; Roman Kagan
> <rkagan@virtuozzo.com>; KY Srinivasan <kys@microsoft.com>; Haiyang
> Zhang <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; Michael Kelley (EOSG)
> <Michael.H.Kelley@microsoft.com>; Mohammed Gamal
> <mmorsy@redhat.com>; Cathy Avery <cavery@redhat.com>; LKML <linux-
> kernel@vger.kernel.org>
> Subject: Re: [PATCH 3/3] KVM: x86: hyperv: implement PV IPI send
> hypercalls
> 
> Wanpeng Li <kernellwp@gmail.com> writes:
> 
> > Hi Paolo and Radim,
> >
> > I have already completed the patches for the Linux guest/kvm/qemu with
> > vCPUs <= 64; however, supporting vCPUs > 64 requires extra complication
> > similar to the 'ex' hypercalls in Hyper-V. Do you think vCPUs <= 64 is
> > enough for Linux guests, or should I introduce two hypercalls with the
> > 'ex' logic, as Hyper-V does?
> >
> 
> Neither Paolo nor Radim, but as we already have
> #define KVM_MAX_VCPUS 288
> on x86 at least, supporting <= 64 vCPUs seems too limiting for any
> new functionality.

Agreed.

K. Y
> 
> --
>   Vitaly


