* [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature
@ 2022-04-07 15:56 Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 01/31] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag Vitaly Kuznetsov
                   ` (30 more replies)
  0 siblings, 31 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Changes since v1:
- Move '#include "svm.h"' to PATCH10 to avoid interim build breakage through
 the series.
- Fix crash from nested_vmx_free_vcpu() when nested_release_evmcs() is
 called while 'to_hv_vcpu() == NULL'.

Original description:

Currently, KVM handles HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} requests
by flushing the whole VPID, which is sub-optimal. This series introduces
the required mechanism to make handling of these requests more
fine-grained by flushing individual GVAs only (when requested). On this
foundation, the Hyper-V "Direct Virtual Flush" feature is implemented. The
feature allows L0 to handle Hyper-V TLB flush hypercalls directly at
L0 without the need to reflect the exit to L1. This has at least two
benefits: reflecting the vmexit and the consequent vmenter are avoided,
and L0 has precise information on whether the target vCPU is actually
running (and thus requires a kick).

Vitaly Kuznetsov (31):
  KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag
  KVM: x86: hyper-v: Introduce TLB flush ring
  KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls
    gently
  KVM: x86: hyper-v: Expose support for extended gva ranges for flush
    hypercalls
  KVM: x86: Prepare kvm_hv_flush_tlb() to handle L2's GPAs
  KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in
    kvm_hv_send_ipi()
  KVM: x86: hyper-v: Create a separate ring for Direct TLB flush
  KVM: x86: hyper-v: Use preallocated buffer in 'struct kvm_vcpu_hv'
    instead of on-stack 'sparse_banks'
  KVM: nVMX: Keep track of hv_vm_id/hv_vp_id when eVMCS is in use
  KVM: nSVM: Keep track of Hyper-V hv_vm_id/hv_vp_id
  KVM: x86: Introduce .post_hv_direct_flush() nested hook
  KVM: x86: hyper-v: Introduce kvm_hv_is_tlb_flush_hcall()
  KVM: x86: hyper-v: Direct TLB flush
  KVM: x86: hyper-v: Introduce fast kvm_hv_direct_tlb_flush_exposed()
    check
  x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
  KVM: nVMX: hyper-v: Direct TLB flush
  KVM: x86: KVM_REQ_TLB_FLUSH_CURRENT is a superset of
    KVM_REQ_HV_TLB_FLUSH too
  KVM: nSVM: hyper-v: Direct TLB flush
  KVM: x86: Expose Hyper-V Direct TLB flush feature
  KVM: selftests: add hyperv_svm_test to .gitignore
  KVM: selftests: Better XMM read/write helpers
  KVM: selftests: Hyper-V PV IPI selftest
  KVM: selftests: Make it possible to replace PTEs with __virt_pg_map()
  KVM: selftests: Hyper-V PV TLB flush selftest
  KVM: selftests: Sync 'struct hv_enlightened_vmcs' definition with
    hyperv-tlfs.h
  KVM: selftests: nVMX: Allocate Hyper-V partition assist page
  KVM: selftests: nSVM: Allocate Hyper-V partition assist and VP assist
    pages
  KVM: selftests: Sync 'struct hv_vp_assist_page' definition with
    hyperv-tlfs.h
  KVM: selftests: evmcs_test: Direct TLB flush test
  KVM: selftests: Move Hyper-V VP assist page enablement out of evmcs.h
  KVM: selftests: hyperv_svm_test: Add Direct TLB flush test

 arch/x86/include/asm/hyperv-tlfs.h            |   6 +-
 arch/x86/include/asm/kvm_host.h               |  30 +
 arch/x86/kvm/Makefile                         |   3 +-
 arch/x86/kvm/hyperv.c                         | 305 ++++++++-
 arch/x86/kvm/hyperv.h                         |  55 ++
 arch/x86/kvm/svm/hyperv.c                     |  18 +
 arch/x86/kvm/svm/hyperv.h                     |  37 +
 arch/x86/kvm/svm/nested.c                     |  25 +-
 arch/x86/kvm/trace.h                          |  21 +-
 arch/x86/kvm/vmx/evmcs.c                      |  24 +
 arch/x86/kvm/vmx/evmcs.h                      |   4 +
 arch/x86/kvm/vmx/nested.c                     |  32 +
 arch/x86/kvm/x86.c                            |  15 +-
 arch/x86/kvm/x86.h                            |   1 +
 tools/testing/selftests/kvm/.gitignore        |   3 +
 tools/testing/selftests/kvm/Makefile          |   4 +-
 .../selftests/kvm/include/x86_64/evmcs.h      |  40 +-
 .../selftests/kvm/include/x86_64/hyperv.h     |  35 +
 .../selftests/kvm/include/x86_64/processor.h  |  72 +-
 .../selftests/kvm/include/x86_64/svm_util.h   |  10 +
 .../selftests/kvm/include/x86_64/vmx.h        |   4 +
 .../testing/selftests/kvm/lib/x86_64/hyperv.c |  21 +
 .../selftests/kvm/lib/x86_64/processor.c      |   6 +-
 tools/testing/selftests/kvm/lib/x86_64/svm.c  |  10 +
 tools/testing/selftests/kvm/lib/x86_64/vmx.c  |   7 +
 .../selftests/kvm/max_guest_memory_test.c     |   2 +-
 .../testing/selftests/kvm/x86_64/evmcs_test.c |  53 +-
 .../selftests/kvm/x86_64/hyperv_features.c    |   5 +-
 .../testing/selftests/kvm/x86_64/hyperv_ipi.c | 362 ++++++++++
 .../selftests/kvm/x86_64/hyperv_svm_test.c    |  60 +-
 .../selftests/kvm/x86_64/hyperv_tlb_flush.c   | 647 ++++++++++++++++++
 .../selftests/kvm/x86_64/mmu_role_test.c      |   2 +-
 32 files changed, 1797 insertions(+), 122 deletions(-)
 create mode 100644 arch/x86/kvm/svm/hyperv.c
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/hyperv.c
 create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_ipi.c
 create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c

-- 
2.35.1


* [PATCH v2 01/31] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 18:02   ` Sean Christopherson
  2022-04-07 15:56 ` [PATCH v2 02/31] KVM: x86: hyper-v: Introduce TLB flush ring Vitaly Kuznetsov
                   ` (29 subsequent siblings)
  30 siblings, 1 reply; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

In preparation for implementing fine-grained Hyper-V TLB flush and
Direct TLB flush, resurrect the dedicated KVM_REQ_HV_TLB_FLUSH request
bit. As KVM_REQ_TLB_FLUSH_GUEST is a stronger operation, clear the
KVM_REQ_HV_TLB_FLUSH request in kvm_service_local_tlb_flush_requests()
when KVM_REQ_TLB_FLUSH_GUEST was also requested.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 2 ++
 arch/x86/kvm/hyperv.c           | 4 ++--
 arch/x86/kvm/x86.c              | 7 ++++++-
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3460bcd75bf2..488934fadc3a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -105,6 +105,8 @@
 	KVM_ARCH_REQ_FLAGS(30, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_MMU_FREE_OBSOLETE_ROOTS \
 	KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_HV_TLB_FLUSH \
+	KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e235c1d43f83..b60bad29caf8 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1876,11 +1876,11 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	 * analyze it here, flush TLB regardless of the specified address space.
 	 */
 	if (all_cpus) {
-		kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH_GUEST);
+		kvm_make_all_cpus_request(kvm, KVM_REQ_HV_TLB_FLUSH);
 	} else {
 		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask, vcpu_mask);
 
-		kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST, vcpu_mask);
+		kvm_make_vcpus_request_mask(kvm, KVM_REQ_HV_TLB_FLUSH, vcpu_mask);
 	}
 
 ret_success:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e9647614dc8c..3c54f6804b7b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3341,7 +3341,12 @@ void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu)
 	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
 		kvm_vcpu_flush_tlb_current(vcpu);
 
-	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu))
+	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
+		kvm_vcpu_flush_tlb_guest(vcpu);
+		kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
+	}
+
+	if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
 		kvm_vcpu_flush_tlb_guest(vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_service_local_tlb_flush_requests);
-- 
2.35.1


* [PATCH v2 02/31] KVM: x86: hyper-v: Introduce TLB flush ring
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 01/31] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently Vitaly Kuznetsov
                   ` (28 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

To allow flushing individual GVAs instead of always flushing the whole
VPID, a per-vCPU structure for passing the requests is needed. Introduce
a simple write-locked ring structure holding two types of entries:
an individual GVA (GFN plus up to 4095 following GFNs encoded in the
lower 12 bits) and 'flush all'.

The queuing rule is: if there's not enough space on the ring to queue
the request and still leave at least one free entry for a 'flush all'
entry, queue a 'flush all' entry instead.

The size of the ring is arbitrarily set to '16'.

Note, kvm_hv_flush_tlb() only queues 'flush all' entries for now, so
the functional change is very small, but the infrastructure is
prepared to handle individual GVA flush requests.
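
For illustration, here is a minimal stand-alone C sketch of the queuing
rule described above. The names mirror the structures this patch adds,
but this is not the kernel code (no locking, no memory barriers):

#define RING_SIZE 16

struct entry {
	unsigned long long addr;
	unsigned int flush_all;
};

struct ring {
	int read_idx, write_idx;
	struct entry entries[RING_SIZE];
};

/* Free slots; one slot is sacrificed to tell 'full' from 'empty'. */
static int ring_free(const struct ring *r)
{
	if (r->write_idx >= r->read_idx)
		return RING_SIZE - (r->write_idx - r->read_idx) - 1;
	return r->read_idx - r->write_idx - 1;
}

static void enqueue(struct ring *r, const unsigned long long *gvas, int count)
{
	int free = ring_free(r), i;

	if (!free)
		return;	/* a full ring already ends with a 'flush all' entry */

	/*
	 * Not enough room to queue all entries and still leave one slot
	 * for a future 'flush all': degrade to a single 'flush all'.
	 */
	if (!count || count >= free - 1) {
		r->entries[r->write_idx] = (struct entry){ .flush_all = 1 };
		r->write_idx = (r->write_idx + 1) % RING_SIZE;
		return;
	}

	for (i = 0; i < count; i++) {
		r->entries[r->write_idx] = (struct entry){ .addr = gvas[i] };
		r->write_idx = (r->write_idx + 1) % RING_SIZE;
	}
}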

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 16 +++++++
 arch/x86/kvm/hyperv.c           | 74 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/hyperv.h           | 13 ++++++
 arch/x86/kvm/x86.c              |  7 ++--
 arch/x86/kvm/x86.h              |  1 +
 5 files changed, 108 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 488934fadc3a..15d798fe280d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -583,6 +583,20 @@ struct kvm_vcpu_hv_synic {
 	bool dont_zero_synic_pages;
 };
 
+#define KVM_HV_TLB_FLUSH_RING_SIZE (16)
+
+struct kvm_vcpu_hv_tlbflush_entry {
+	u64 addr;
+	u64 flush_all:1;
+	u64 pad:63;
+};
+
+struct kvm_vcpu_hv_tlbflush_ring {
+	int read_idx, write_idx;
+	spinlock_t write_lock;
+	struct kvm_vcpu_hv_tlbflush_entry entries[KVM_HV_TLB_FLUSH_RING_SIZE];
+};
+
 /* Hyper-V per vcpu emulation context */
 struct kvm_vcpu_hv {
 	struct kvm_vcpu *vcpu;
@@ -602,6 +616,8 @@ struct kvm_vcpu_hv {
 		u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
 		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
 	} cpuid_cache;
+
+	struct kvm_vcpu_hv_tlbflush_ring tlb_flush_ring;
 };
 
 /* Xen HVM per vcpu emulation context */
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index b60bad29caf8..81c44e0eadf9 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -29,6 +29,7 @@
 #include <linux/kvm_host.h>
 #include <linux/highmem.h>
 #include <linux/sched/cputime.h>
+#include <linux/spinlock.h>
 #include <linux/eventfd.h>
 
 #include <asm/apicdef.h>
@@ -954,6 +955,8 @@ static int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
 
 	hv_vcpu->vp_index = vcpu->vcpu_idx;
 
+	spin_lock_init(&hv_vcpu->tlb_flush_ring.write_lock);
+
 	return 0;
 }
 
@@ -1789,6 +1792,65 @@ static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
 			      var_cnt * sizeof(*sparse_banks));
 }
 
+static inline int hv_tlb_flush_ring_free(struct kvm_vcpu_hv *hv_vcpu,
+					 int read_idx, int write_idx)
+{
+	if (write_idx >= read_idx)
+		return KVM_HV_TLB_FLUSH_RING_SIZE - (write_idx - read_idx) - 1;
+
+	return read_idx - write_idx - 1;
+}
+
+static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	int ring_free, write_idx, read_idx;
+	unsigned long flags;
+
+	if (!hv_vcpu)
+		return;
+
+	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
+
+	spin_lock_irqsave(&tlb_flush_ring->write_lock, flags);
+
+	read_idx = READ_ONCE(tlb_flush_ring->read_idx);
+	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
+
+	ring_free = hv_tlb_flush_ring_free(hv_vcpu, read_idx, write_idx);
+	/* Full ring always contains 'flush all' entry */
+	if (!ring_free)
+		goto out_unlock;
+
+	tlb_flush_ring->entries[write_idx].addr = 0;
+	tlb_flush_ring->entries[write_idx].flush_all = 1;
+	/*
+	 * Advance write index only after filling in the entry to
+	 * synchronize with lockless reader.
+	 */
+	smp_wmb();
+	tlb_flush_ring->write_idx = (write_idx + 1) % KVM_HV_TLB_FLUSH_RING_SIZE;
+
+out_unlock:
+	spin_unlock_irqrestore(&tlb_flush_ring->write_lock, flags);
+}
+
+void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+
+	kvm_vcpu_flush_tlb_guest(vcpu);
+
+	if (!hv_vcpu)
+		return;
+
+	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
+
+	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
+}
+
 static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 {
 	struct kvm *kvm = vcpu->kvm;
@@ -1797,6 +1859,8 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
 	u64 valid_bank_mask;
 	u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
+	struct kvm_vcpu *v;
+	unsigned long i;
 	bool all_cpus;
 
 	/*
@@ -1876,10 +1940,20 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	 * analyze it here, flush TLB regardless of the specified address space.
 	 */
 	if (all_cpus) {
+		kvm_for_each_vcpu(i, v, kvm)
+			hv_tlb_flush_ring_enqueue(v);
+
 		kvm_make_all_cpus_request(kvm, KVM_REQ_HV_TLB_FLUSH);
 	} else {
 		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask, vcpu_mask);
 
+		for_each_set_bit(i, vcpu_mask, KVM_MAX_VCPUS) {
+			v = kvm_get_vcpu(kvm, i);
+			if (!v)
+				continue;
+			hv_tlb_flush_ring_enqueue(v);
+		}
+
 		kvm_make_vcpus_request_mask(kvm, KVM_REQ_HV_TLB_FLUSH, vcpu_mask);
 	}
 
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index da2737f2a956..6847caeaaf84 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -147,4 +147,17 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
 int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 		     struct kvm_cpuid_entry2 __user *entries);
 
+
+static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+
+	if (!hv_vcpu)
+		return;
+
+	hv_vcpu->tlb_flush_ring.read_idx = hv_vcpu->tlb_flush_ring.write_idx;
+}
+void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
+
+
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3c54f6804b7b..2074d52b0666 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3305,7 +3305,7 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
 	static_call(kvm_x86_flush_tlb_all)(vcpu);
 }
 
-static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
+void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
 {
 	++vcpu->stat.tlb_flush;
 
@@ -3343,11 +3343,12 @@ void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu)
 
 	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
 		kvm_vcpu_flush_tlb_guest(vcpu);
-		kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
+		if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
+			kvm_hv_vcpu_empty_flush_tlb(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
-		kvm_vcpu_flush_tlb_guest(vcpu);
+		kvm_hv_vcpu_flush_tlb(vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_service_local_tlb_flush_requests);
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index aa86abad914d..ed5c67b5d086 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -58,6 +58,7 @@ static inline unsigned int __shrink_ple_window(unsigned int val,
 
 #define MSR_IA32_CR_PAT_DEFAULT  0x0007040600070406ULL
 
+void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu);
 void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu);
 int kvm_check_nested_events(struct kvm_vcpu *vcpu);
 
-- 
2.35.1


* [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 01/31] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 02/31] KVM: x86: hyper-v: Introduce TLB flush ring Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 17:33   ` Sean Christopherson
  2022-04-07 17:44   ` Sean Christopherson
  2022-04-07 15:56 ` [PATCH v2 04/31] KVM: x86: hyper-v: Expose support for extended gva ranges for flush hypercalls Vitaly Kuznetsov
                   ` (27 subsequent siblings)
  30 siblings, 2 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Currently, HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls are handled
the exact same way as HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE{,EX}: by
flushing the whole VPID, which is sub-optimal. Switch to handling
these requests with 'flush_tlb_gva()' hooks instead. Use the newly
introduced TLB flush ring to queue the requests.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c | 141 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 121 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 81c44e0eadf9..a54d41656f30 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1792,6 +1792,35 @@ static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
 			      var_cnt * sizeof(*sparse_banks));
 }
 
+static int kvm_hv_get_tlbflush_entries(struct kvm *kvm, struct kvm_hv_hcall *hc, u64 entries[],
+				       u32 data_offset, int consumed_xmm_halves)
+{
+	int i;
+
+	if (hc->fast) {
+		/*
+		 * Each XMM holds two entries, but do not count halves that
+		 * have already been consumed.
+		 */
+		if (hc->rep_cnt > (2 * HV_HYPERCALL_MAX_XMM_REGISTERS - consumed_xmm_halves))
+			return -EINVAL;
+
+		for (i = 0; i < hc->rep_cnt; i++) {
+			int j = i + consumed_xmm_halves;
+
+			if (j % 2)
+				entries[i] = sse128_hi(hc->xmm[j / 2]);
+			else
+				entries[i] = sse128_lo(hc->xmm[j / 2]);
+		}
+
+		return 0;
+	}
+
+	return kvm_read_guest(kvm, hc->ingpa + data_offset,
+			      entries, hc->rep_cnt * sizeof(entries[0]));
+}
+
 static inline int hv_tlb_flush_ring_free(struct kvm_vcpu_hv *hv_vcpu,
 					 int read_idx, int write_idx)
 {
@@ -1801,12 +1830,14 @@ static inline int hv_tlb_flush_ring_free(struct kvm_vcpu_hv *hv_vcpu,
 	return read_idx - write_idx - 1;
 }
 
-static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu)
+static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool flush_all,
+				      u64 *entries, int count)
 {
 	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 	int ring_free, write_idx, read_idx;
 	unsigned long flags;
+	int i;
 
 	if (!hv_vcpu)
 		return;
@@ -1823,14 +1854,34 @@ static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu)
 	if (!ring_free)
 		goto out_unlock;
 
-	tlb_flush_ring->entries[write_idx].addr = 0;
-	tlb_flush_ring->entries[write_idx].flush_all = 1;
 	/*
-	 * Advance write index only after filling in the entry to
-	 * synchronize with lockless reader.
+	 * All entries should fit on the ring leaving one free for 'flush all'
+	 * entry in case another request comes in. In case there's not enough
+	 * space, just put 'flush all' entry there.
+	 */
+	if (!count || count >= ring_free - 1 || flush_all) {
+		tlb_flush_ring->entries[write_idx].addr = 0;
+		tlb_flush_ring->entries[write_idx].flush_all = 1;
+		/*
+		 * Advance write index only after filling in the entry to
+		 * synchronize with lockless reader.
+		 */
+		smp_wmb();
+		tlb_flush_ring->write_idx = (write_idx + 1) % KVM_HV_TLB_FLUSH_RING_SIZE;
+		goto out_unlock;
+	}
+
+	for (i = 0; i < count; i++) {
+		tlb_flush_ring->entries[write_idx].addr = entries[i];
+		tlb_flush_ring->entries[write_idx].flush_all = 0;
+		write_idx = (write_idx + 1) % KVM_HV_TLB_FLUSH_RING_SIZE;
+	}
+	/*
+	 * Advance write index only after filling in the entry to synchronize
+	 * with lockless reader.
 	 */
 	smp_wmb();
-	tlb_flush_ring->write_idx = (write_idx + 1) % KVM_HV_TLB_FLUSH_RING_SIZE;
+	tlb_flush_ring->write_idx = write_idx;
 
 out_unlock:
 	spin_unlock_irqrestore(&tlb_flush_ring->write_lock, flags);
@@ -1840,15 +1891,47 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
 {
 	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
-
-	kvm_vcpu_flush_tlb_guest(vcpu);
-
-	if (!hv_vcpu)
+	struct kvm_vcpu_hv_tlbflush_entry *entry;
+	int read_idx, write_idx;
+	u64 address;
+	u32 count;
+	int i, j;
+
+	if (!tdp_enabled || !hv_vcpu) {
+		kvm_vcpu_flush_tlb_guest(vcpu);
 		return;
+	}
 
 	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
+	read_idx = READ_ONCE(tlb_flush_ring->read_idx);
+	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
+
+	/* Pairs with smp_wmb() in hv_tlb_flush_ring_enqueue() */
+	smp_rmb();
 
-	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
+	for (i = read_idx; i != write_idx; i = (i + 1) % KVM_HV_TLB_FLUSH_RING_SIZE) {
+		entry = &tlb_flush_ring->entries[i];
+
+		if (entry->flush_all)
+			goto out_flush_all;
+
+		/*
+		 * Lower 12 bits of 'address' encode the number of additional
+		 * pages to flush.
+		 */
+		address = entry->addr & PAGE_MASK;
+		count = (entry->addr & ~PAGE_MASK) + 1;
+		for (j = 0; j < count; j++)
+			static_call(kvm_x86_flush_tlb_gva)(vcpu, address + j * PAGE_SIZE);
+	}
+	++vcpu->stat.tlb_flush;
+	goto out_empty_ring;
+
+out_flush_all:
+	kvm_vcpu_flush_tlb_guest(vcpu);
+
+out_empty_ring:
+	tlb_flush_ring->read_idx = write_idx;
 }
 
 static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
@@ -1857,12 +1940,13 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	struct hv_tlb_flush_ex flush_ex;
 	struct hv_tlb_flush flush;
 	DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
+	u64 entries[KVM_HV_TLB_FLUSH_RING_SIZE - 2];
 	u64 valid_bank_mask;
 	u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
 	struct kvm_vcpu *v;
 	unsigned long i;
-	bool all_cpus;
-
+	bool all_cpus, all_addr;
+	int data_offset = 0, consumed_xmm_halves = 0;
 	/*
 	 * The Hyper-V TLFS doesn't allow more than 64 sparse banks, e.g. the
 	 * valid mask is a u64.  Fail the build if KVM's max allowed number of
@@ -1877,10 +1961,12 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 			flush.address_space = hc->ingpa;
 			flush.flags = hc->outgpa;
 			flush.processor_mask = sse128_lo(hc->xmm[0]);
+			consumed_xmm_halves = 1;
 		} else {
 			if (unlikely(kvm_read_guest(kvm, hc->ingpa,
 						    &flush, sizeof(flush))))
 				return HV_STATUS_INVALID_HYPERCALL_INPUT;
+			data_offset = sizeof(flush);
 		}
 
 		trace_kvm_hv_flush_tlb(flush.processor_mask,
@@ -1904,10 +1990,12 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 			flush_ex.flags = hc->outgpa;
 			memcpy(&flush_ex.hv_vp_set,
 			       &hc->xmm[0], sizeof(hc->xmm[0]));
+			consumed_xmm_halves = 2;
 		} else {
 			if (unlikely(kvm_read_guest(kvm, hc->ingpa, &flush_ex,
 						    sizeof(flush_ex))))
 				return HV_STATUS_INVALID_HYPERCALL_INPUT;
+			data_offset = sizeof(flush_ex);
 		}
 
 		trace_kvm_hv_flush_tlb_ex(flush_ex.hv_vp_set.valid_bank_mask,
@@ -1923,25 +2011,38 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
 
 		if (all_cpus)
-			goto do_flush;
+			goto read_flush_entries;
 
 		if (!hc->var_cnt)
 			goto ret_success;
 
-		if (kvm_get_sparse_vp_set(kvm, hc, 2, sparse_banks,
-					  offsetof(struct hv_tlb_flush_ex,
-						   hv_vp_set.bank_contents)))
+		if (kvm_get_sparse_vp_set(kvm, hc, consumed_xmm_halves,
+					  sparse_banks, data_offset))
+			return HV_STATUS_INVALID_HYPERCALL_INPUT;
+		data_offset += hc->var_cnt * sizeof(sparse_banks[0]);
+		consumed_xmm_halves += hc->var_cnt;
+	}
+
+read_flush_entries:
+	if (hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE ||
+	    hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX ||
+	    hc->rep_cnt > (KVM_HV_TLB_FLUSH_RING_SIZE - 2)) {
+		all_addr = true;
+	} else {
+		if (kvm_hv_get_tlbflush_entries(kvm, hc, entries,
+						data_offset, consumed_xmm_halves))
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
+		all_addr = false;
 	}
 
-do_flush:
+
 	/*
 	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
 	 * analyze it here, flush TLB regardless of the specified address space.
 	 */
 	if (all_cpus) {
 		kvm_for_each_vcpu(i, v, kvm)
-			hv_tlb_flush_ring_enqueue(v);
+			hv_tlb_flush_ring_enqueue(v, all_addr, entries, hc->rep_cnt);
 
 		kvm_make_all_cpus_request(kvm, KVM_REQ_HV_TLB_FLUSH);
 	} else {
@@ -1951,7 +2052,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 			v = kvm_get_vcpu(kvm, i);
 			if (!v)
 				continue;
-			hv_tlb_flush_ring_enqueue(v);
+			hv_tlb_flush_ring_enqueue(v, all_addr, entries, hc->rep_cnt);
 		}
 
 		kvm_make_vcpus_request_mask(kvm, KVM_REQ_HV_TLB_FLUSH, vcpu_mask);
-- 
2.35.1


* [PATCH v2 04/31] KVM: x86: hyper-v: Expose support for extended gva ranges for flush hypercalls
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 05/31] KVM: x86: Prepare kvm_hv_flush_tlb() to handle L2's GPAs Vitaly Kuznetsov
                   ` (26 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

The extended GVA ranges support bit seems to indicate whether the lower
12 bits of a GVA can be used to specify up to 4095 additional consecutive
GVAs to flush. This is somewhat described in the TLFS.

Previously, KVM was handling HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX}
requests by flushing the whole VPID, so technically, extended GVA
ranges were already supported. Now that such requests are handled more
gently, advertising support for extended ranges starts making sense as
it reduces the size of TLB flush requests.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 2 ++
 arch/x86/kvm/hyperv.c              | 1 +
 2 files changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0a9407dc0859..5225a85c08c3 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -61,6 +61,8 @@
 #define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE		BIT(10)
 /* Support for debug MSRs available */
 #define HV_FEATURE_DEBUG_MSRS_AVAILABLE			BIT(11)
+/* Support for extended gva ranges for flush hypercalls available */
+#define HV_FEATURE_EXT_GVA_RANGES_FLUSH			BIT(14)
 /*
  * Support for returning hypercall output block via XMM
  * registers is available
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index a54d41656f30..75904820aced 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2683,6 +2683,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 			ent->ebx |= HV_DEBUGGING;
 			ent->edx |= HV_X64_GUEST_DEBUGGING_AVAILABLE;
 			ent->edx |= HV_FEATURE_DEBUG_MSRS_AVAILABLE;
+			ent->edx |= HV_FEATURE_EXT_GVA_RANGES_FLUSH;
 
 			/*
 			 * Direct Synthetic timers only make sense with in-kernel
-- 
2.35.1


* [PATCH v2 05/31] KVM: x86: Prepare kvm_hv_flush_tlb() to handle L2's GPAs
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 04/31] KVM: x86: hyper-v: Expose support for extended gva ranges for flush hypercalls Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 06/31] KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in kvm_hv_send_ipi() Vitaly Kuznetsov
                   ` (25 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

To handle Direct TLB flush requests from L2, KVM needs to translate the
specified L2 GPA to an L1 GPA to read hypercall arguments from there.

No functional change as KVM doesn't handle VMCALL/VMMCALL from L2 yet.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 75904820aced..d7bcdf87b90c 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -23,6 +23,7 @@
 #include "ioapic.h"
 #include "cpuid.h"
 #include "hyperv.h"
+#include "mmu.h"
 #include "xen.h"
 
 #include <linux/cpu.h>
@@ -1955,6 +1956,12 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	 */
 	BUILD_BUG_ON(KVM_HV_MAX_SPARSE_VCPU_SET_BITS > 64);
 
+	if (!hc->fast && is_guest_mode(vcpu)) {
+		hc->ingpa = translate_nested_gpa(vcpu, hc->ingpa, 0, NULL);
+		if (unlikely(hc->ingpa == UNMAPPED_GVA))
+			return HV_STATUS_INVALID_HYPERCALL_INPUT;
+	}
+
 	if (hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST ||
 	    hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE) {
 		if (hc->fast) {
-- 
2.35.1


* [PATCH v2 06/31] KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in kvm_hv_send_ipi()
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 05/31] KVM: x86: Prepare kvm_hv_flush_tlb() to handle L2's GPAs Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 17:48   ` Sean Christopherson
  2022-04-07 15:56 ` [PATCH v2 07/31] KVM: x86: hyper-v: Create a separate ring for Direct TLB flush Vitaly Kuznetsov
                   ` (24 subsequent siblings)
  30 siblings, 1 reply; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Get rid of the on-stack allocation of vcpu_mask and optimize
kvm_hv_send_ipi() for a smaller number of vCPUs in the request. When
Hyper-V TLB flush is in use, HvSendSyntheticClusterIpi{,Ex} calls are
not commonly used to send IPIs to a large number of vCPUs (and are
rarely used in general).

Introduce hv_is_vp_in_sparse_set() to directly check whether the
specified VP_ID is present in the sparse vCPU set.
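
To illustrate the sparse set format with a concrete value (a stand-alone
sketch, not code from the patch; the patch iterates set bits where this
uses popcount, which is equivalent): valid_bank_mask has one bit per
64-VP bank and sparse_banks[] only stores banks whose bit is set:

#include <stdint.h>
#include <stdio.h>

/* Slot of 'bank' in the compacted array = number of valid banks below it. */
static int sparse_slot(uint64_t valid_bank_mask, int bank)
{
	return __builtin_popcountll(valid_bank_mask & ((1ULL << bank) - 1));
}

int main(void)
{
	uint64_t valid_bank_mask = 0x5;	/* banks 0 and 2 contain VPs */
	uint64_t sparse_banks[2] = { 0 };
	uint32_t vp_id = 130;		/* bank 2 (130 / 64), bit 2 (130 % 64) */

	sparse_banks[sparse_slot(valid_bank_mask, vp_id / 64)] |=
		1ULL << (vp_id % 64);

	/* Prints "VP 130 lives in slot 1, bit 2" */
	printf("VP %u lives in slot %d, bit %u\n", vp_id,
	       sparse_slot(valid_bank_mask, vp_id / 64), vp_id % 64);
	return 0;
}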

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index d7bcdf87b90c..918642bcdbd0 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1746,6 +1746,23 @@ static void sparse_set_to_vcpu_mask(struct kvm *kvm, u64 *sparse_banks,
 	}
 }
 
+static bool hv_is_vp_in_sparse_set(u32 vp_id, u64 valid_bank_mask, u64 sparse_banks[])
+{
+	int bank, sbank = 0;
+
+	if (!test_bit(vp_id / 64, (unsigned long *)&valid_bank_mask))
+		return false;
+
+	for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
+			 KVM_HV_MAX_SPARSE_VCPU_SET_BITS) {
+		if (bank == vp_id / 64)
+			break;
+		sbank++;
+	}
+
+	return test_bit(vp_id % 64, (unsigned long *)&sparse_banks[sbank]);
+}
+
 struct kvm_hv_hcall {
 	u64 param;
 	u64 ingpa;
@@ -2071,8 +2088,8 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 		((u64)hc->rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
 }
 
-static void kvm_send_ipi_to_many(struct kvm *kvm, u32 vector,
-				 unsigned long *vcpu_bitmap)
+static void kvm_hv_send_ipi_to_many(struct kvm *kvm, u32 vector,
+				    u64 *sparse_banks, u64 valid_bank_mask)
 {
 	struct kvm_lapic_irq irq = {
 		.delivery_mode = APIC_DM_FIXED,
@@ -2082,7 +2099,10 @@ static void kvm_send_ipi_to_many(struct kvm *kvm, u32 vector,
 	unsigned long i;
 
 	kvm_for_each_vcpu(i, vcpu, kvm) {
-		if (vcpu_bitmap && !test_bit(i, vcpu_bitmap))
+		if (sparse_banks &&
+		    !hv_is_vp_in_sparse_set(kvm_hv_get_vpindex(vcpu),
+					    valid_bank_mask,
+					    sparse_banks))
 			continue;
 
 		/* We fail only when APIC is disabled */
@@ -2095,7 +2115,6 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	struct kvm *kvm = vcpu->kvm;
 	struct hv_send_ipi_ex send_ipi_ex;
 	struct hv_send_ipi send_ipi;
-	DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
 	unsigned long valid_bank_mask;
 	u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
 	u32 vector;
@@ -2157,13 +2176,7 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR))
 		return HV_STATUS_INVALID_HYPERCALL_INPUT;
 
-	if (all_cpus) {
-		kvm_send_ipi_to_many(kvm, vector, NULL);
-	} else {
-		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask, vcpu_mask);
-
-		kvm_send_ipi_to_many(kvm, vector, vcpu_mask);
-	}
+	kvm_hv_send_ipi_to_many(kvm, vector, all_cpus ? NULL : sparse_banks, valid_bank_mask);
 
 ret_success:
 	return HV_STATUS_SUCCESS;
-- 
2.35.1


* [PATCH v2 07/31] KVM: x86: hyper-v: Create a separate ring for Direct TLB flush
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 06/31] KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in kvm_hv_send_ipi() Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 17:57   ` Sean Christopherson
  2022-04-07 15:56 ` [PATCH v2 08/31] KVM: x86: hyper-v: Use preallocated buffer in 'struct kvm_vcpu_hv' instead of on-stack 'sparse_banks' Vitaly Kuznetsov
                   ` (23 subsequent siblings)
  30 siblings, 1 reply; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

To handle Direct TLB flush requests from L2, KVM needs to use a
separate ring from regular Hyper-V TLB flush requests: e.g. when a
request to flush something in L2 is made, the target vCPU can
transition from L2 to L1, receive a request to flush a GVA for L1 and
then try to re-enter L2. The first request then needs to be processed
upon re-entering L2. Similarly, requests to flush GVAs in L1 must wait
until L2 exits to L1.

No functional change, as KVM doesn't handle Direct TLB flush
requests from L2 yet.
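
The resulting selection rule is keyed purely on guest mode; the sketch
below simply restates the kvm_hv_get_tlb_flush_ring() helper added to
hyperv.h with a ternary:

static inline struct kvm_vcpu_hv_tlbflush_ring *
hv_pick_flush_ring(struct kvm_vcpu *vcpu, struct kvm_vcpu_hv *hv_vcpu)
{
	/* ring[0]: requests targeting L1; ring[1]: Direct TLB flush for L2 */
	return &hv_vcpu->tlb_flush_ring[is_guest_mode(vcpu) ? 1 : 0];
}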

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/hyperv.c           |  7 ++++---
 arch/x86/kvm/hyperv.h           | 17 ++++++++++++++---
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 15d798fe280d..b8d7c1422da6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -617,7 +617,8 @@ struct kvm_vcpu_hv {
 		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
 	} cpuid_cache;
 
-	struct kvm_vcpu_hv_tlbflush_ring tlb_flush_ring;
+	/* Two rings for regular Hyper-V TLB flush and Direct TLB flush */
+	struct kvm_vcpu_hv_tlbflush_ring tlb_flush_ring[2];
 };
 
 /* Xen HVM per vcpu emulation context */
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 918642bcdbd0..16cbf41b5b7b 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -956,7 +956,8 @@ static int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
 
 	hv_vcpu->vp_index = vcpu->vcpu_idx;
 
-	spin_lock_init(&hv_vcpu->tlb_flush_ring.write_lock);
+	spin_lock_init(&hv_vcpu->tlb_flush_ring[0].write_lock);
+	spin_lock_init(&hv_vcpu->tlb_flush_ring[1].write_lock);
 
 	return 0;
 }
@@ -1860,7 +1861,7 @@ static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool flush_all,
 	if (!hv_vcpu)
 		return;
 
-	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
+	tlb_flush_ring = &hv_vcpu->tlb_flush_ring[0];
 
 	spin_lock_irqsave(&tlb_flush_ring->write_lock, flags);
 
@@ -1920,7 +1921,7 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
 		return;
 	}
 
-	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
+	tlb_flush_ring = kvm_hv_get_tlb_flush_ring(vcpu);
 	read_idx = READ_ONCE(tlb_flush_ring->read_idx);
 	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
 
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index 6847caeaaf84..448877b478ef 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -22,6 +22,7 @@
 #define __ARCH_X86_KVM_HYPERV_H__
 
 #include <linux/kvm_host.h>
+#include "x86.h"
 
 /*
  * The #defines related to the synthetic debugger are required by KDNet, but
@@ -147,15 +148,25 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
 int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 		     struct kvm_cpuid_entry2 __user *entries);
 
+static inline struct kvm_vcpu_hv_tlbflush_ring *kvm_hv_get_tlb_flush_ring(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+
+	if (!is_guest_mode(vcpu))
+		return &hv_vcpu->tlb_flush_ring[0];
+
+	return &hv_vcpu->tlb_flush_ring[1];
+}
 
 static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
 {
-	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
 
-	if (!hv_vcpu)
+	if (!to_hv_vcpu(vcpu))
 		return;
 
-	hv_vcpu->tlb_flush_ring.read_idx = hv_vcpu->tlb_flush_ring.write_idx;
+	tlb_flush_ring = kvm_hv_get_tlb_flush_ring(vcpu);
+	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
 }
 void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
 
-- 
2.35.1


* [PATCH v2 08/31] KVM: x86: hyper-v: Use preallocated buffer in 'struct kvm_vcpu_hv' instead of on-stack 'sparse_banks'
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (6 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 07/31] KVM: x86: hyper-v: Create a separate ring for Direct TLB flush Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 09/31] KVM: nVMX: Keep track of hv_vm_id/hv_vp_id when eVMCS is in use Vitaly Kuznetsov
                   ` (22 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

To make kvm_hv_flush_tlb() ready to handle Direct TLB flush requests,
KVM needs to allow for all 64 sparse vCPU banks regardless of KVM_MAX_VCPUS,
as L1 may use vCPU overcommit for L2. To avoid growing the on-stack
allocation, make 'sparse_banks' part of the per-vCPU 'struct kvm_vcpu_hv',
which is allocated dynamically.

Note: sparse_set_to_vcpu_mask() keeps using on-stack allocation as it
won't be used to handle Direct TLB flush requests.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 3 +++
 arch/x86/kvm/hyperv.c           | 6 ++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b8d7c1422da6..4458abc1d41d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -619,6 +619,9 @@ struct kvm_vcpu_hv {
 
 	/* Two rings for regular Hyper-V TLB flush and Direct TLB flush */
 	struct kvm_vcpu_hv_tlbflush_ring tlb_flush_ring[2];
+
+	/* Preallocated buffer for handling hypercalls passing sparse vCPU set */
+	u64 sparse_banks[64];
 };
 
 /* Xen HVM per vcpu emulation context */
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 16cbf41b5b7b..705c0b739c1b 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1955,13 +1955,14 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
 
 static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 {
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	u64 *sparse_banks = hv_vcpu->sparse_banks;
 	struct kvm *kvm = vcpu->kvm;
 	struct hv_tlb_flush_ex flush_ex;
 	struct hv_tlb_flush flush;
 	DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
 	u64 entries[KVM_HV_TLB_FLUSH_RING_SIZE - 2];
 	u64 valid_bank_mask;
-	u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
 	struct kvm_vcpu *v;
 	unsigned long i;
 	bool all_cpus, all_addr;
@@ -2113,11 +2114,12 @@ static void kvm_hv_send_ipi_to_many(struct kvm *kvm, u32 vector,
 
 static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 {
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	u64 *sparse_banks = hv_vcpu->sparse_banks;
 	struct kvm *kvm = vcpu->kvm;
 	struct hv_send_ipi_ex send_ipi_ex;
 	struct hv_send_ipi send_ipi;
 	unsigned long valid_bank_mask;
-	u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
 	u32 vector;
 	bool all_cpus;
 
-- 
2.35.1


* [PATCH v2 09/31] KVM: nVMX: Keep track of hv_vm_id/hv_vp_id when eVMCS is in use
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (7 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 08/31] KVM: x86: hyper-v: Use preallocated buffer in 'struct kvm_vcpu_hv' instead of on-stack 'sparse_banks' Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 10/31] KVM: nSVM: Keep track of Hyper-V hv_vm_id/hv_vp_id Vitaly Kuznetsov
                   ` (21 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

To handle Direct TLB flush requests from L2, KVM needs to keep track
of L2's VM_ID/VP_IDs, which are set by the L1 hypervisor. The 'Partition
assist page' address is also needed to handle the post-flush exit to L1
upon request.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  6 ++++++
 arch/x86/kvm/vmx/nested.c       | 15 +++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4458abc1d41d..01f094a2208f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -622,6 +622,12 @@ struct kvm_vcpu_hv {
 
 	/* Preallocated buffer for handling hypercalls passing sparse vCPU set */
 	u64 sparse_banks[64];
+
+	struct {
+		u64 pa_page_gpa;
+		u64 vm_id;
+		u32 vp_id;
+	} nested;
 };
 
 /* Xen HVM per vcpu emulation context */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f18744f7ff82..4075cf8d61f4 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -225,6 +225,7 @@ static void vmx_disable_shadow_vmcs(struct vcpu_vmx *vmx)
 
 static inline void nested_release_evmcs(struct kvm_vcpu *vcpu)
 {
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
 	if (evmptr_is_valid(vmx->nested.hv_evmcs_vmptr)) {
@@ -233,6 +234,12 @@ static inline void nested_release_evmcs(struct kvm_vcpu *vcpu)
 	}
 
 	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
+
+	if (hv_vcpu) {
+		hv_vcpu->nested.pa_page_gpa = INVALID_GPA;
+		hv_vcpu->nested.vm_id = 0;
+		hv_vcpu->nested.vp_id = 0;
+	}
 }
 
 static void vmx_sync_vmcs_host_state(struct vcpu_vmx *vmx,
@@ -1592,11 +1599,19 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 {
 	struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12;
 	struct hv_enlightened_vmcs *evmcs = vmx->nested.hv_evmcs;
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(&vmx->vcpu);
 
 	/* HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE */
 	vmcs12->tpr_threshold = evmcs->tpr_threshold;
 	vmcs12->guest_rip = evmcs->guest_rip;
 
+	if (unlikely(!(hv_clean_fields &
+		       HV_VMX_ENLIGHTENED_CLEAN_FIELD_ENLIGHTENMENTSCONTROL))) {
+		hv_vcpu->nested.pa_page_gpa = evmcs->partition_assist_page;
+		hv_vcpu->nested.vm_id = evmcs->hv_vm_id;
+		hv_vcpu->nested.vp_id = evmcs->hv_vp_id;
+	}
+
 	if (unlikely(!(hv_clean_fields &
 		       HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_BASIC))) {
 		vmcs12->guest_rsp = evmcs->guest_rsp;
-- 
2.35.1


* [PATCH v2 10/31] KVM: nSVM: Keep track of Hyper-V hv_vm_id/hv_vp_id
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (8 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 09/31] KVM: nVMX: Keep track of hv_vm_id/hv_vp_id when eVMCS is in use Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 11/31] KVM: x86: Introduce .post_hv_direct_flush() nested hook Vitaly Kuznetsov
                   ` (20 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Similar to nVMX, KVM needs to know L2's VM_ID/VP_ID and the Partition
assist page address to handle Direct TLB flush requests.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/svm/hyperv.h | 16 ++++++++++++++++
 arch/x86/kvm/svm/nested.c |  2 ++
 2 files changed, 18 insertions(+)

diff --git a/arch/x86/kvm/svm/hyperv.h b/arch/x86/kvm/svm/hyperv.h
index 7d6d97968fb9..8cf702fed7e5 100644
--- a/arch/x86/kvm/svm/hyperv.h
+++ b/arch/x86/kvm/svm/hyperv.h
@@ -9,6 +9,7 @@
 #include <asm/mshyperv.h>
 
 #include "../hyperv.h"
+#include "svm.h"
 
 /*
  * Hyper-V uses the software reserved 32 bytes in VMCB
@@ -32,4 +33,19 @@ struct hv_enlightenments {
  */
 #define VMCB_HV_NESTED_ENLIGHTENMENTS VMCB_SW
 
+static inline void nested_svm_hv_update_vm_vp_ids(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct hv_enlightenments *hve =
+		(struct hv_enlightenments *)svm->nested.ctl.reserved_sw;
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+
+	if (!hv_vcpu)
+		return;
+
+	hv_vcpu->nested.pa_page_gpa = hve->partition_assist_page;
+	hv_vcpu->nested.vm_id = hve->hv_vm_id;
+	hv_vcpu->nested.vp_id = hve->hv_vp_id;
+}
+
 #endif /* __ARCH_X86_KVM_SVM_HYPERV_H__ */
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 73b545278f5f..ee75061a7ea3 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -827,6 +827,8 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 
 	svm->nested.nested_run_pending = 1;
 
+	nested_svm_hv_update_vm_vp_ids(vcpu);
+
 	if (enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, true))
 		goto out_exit_err;
 
-- 
2.35.1


* [PATCH v2 11/31] KVM: x86: Introduce .post_hv_direct_flush() nested hook
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (9 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 10/31] KVM: nSVM: Keep track of Hyper-V hv_vm_id/hv_vp_id Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 12/31] KVM: x86: hyper-v: Introduce kvm_hv_is_tlb_flush_hcall() Vitaly Kuznetsov
                   ` (19 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Hyper-V supports injecting a synthetic L2->L1 exit after performing a
Direct TLB flush operation but the procedure is vendor-specific.
Introduce the .post_hv_direct_flush() nested hook for it.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/Makefile           |  3 ++-
 arch/x86/kvm/svm/hyperv.c       | 11 +++++++++++
 arch/x86/kvm/svm/hyperv.h       |  2 ++
 arch/x86/kvm/svm/nested.c       |  1 +
 arch/x86/kvm/vmx/evmcs.c        |  4 ++++
 arch/x86/kvm/vmx/evmcs.h        |  1 +
 arch/x86/kvm/vmx/nested.c       |  1 +
 8 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/kvm/svm/hyperv.c

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 01f094a2208f..7c607307a5d4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1559,6 +1559,7 @@ struct kvm_x86_nested_ops {
 	int (*enable_evmcs)(struct kvm_vcpu *vcpu,
 			    uint16_t *vmcs_version);
 	uint16_t (*get_evmcs_version)(struct kvm_vcpu *vcpu);
+	void (*post_hv_direct_flush)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_x86_init_ops {
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 30f244b64523..b6d53b045692 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -25,7 +25,8 @@ kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
 			   vmx/evmcs.o vmx/nested.o vmx/posted_intr.o
 kvm-intel-$(CONFIG_X86_SGX_KVM)	+= vmx/sgx.o
 
-kvm-amd-y		+= svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o svm/sev.o
+kvm-amd-y		+= svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o \
+			   svm/sev.o svm/hyperv.o
 
 ifdef CONFIG_HYPERV
 kvm-amd-y		+= svm/svm_onhyperv.o
diff --git a/arch/x86/kvm/svm/hyperv.c b/arch/x86/kvm/svm/hyperv.c
new file mode 100644
index 000000000000..0142fde34738
--- /dev/null
+++ b/arch/x86/kvm/svm/hyperv.c
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD SVM specific code for Hyper-V on KVM.
+ *
+ * Copyright 2022 Red Hat, Inc. and/or its affiliates.
+ */
+#include "hyperv.h"
+
+void svm_post_hv_direct_flush(struct kvm_vcpu *vcpu)
+{
+}
diff --git a/arch/x86/kvm/svm/hyperv.h b/arch/x86/kvm/svm/hyperv.h
index 8cf702fed7e5..b3f5df6c6c97 100644
--- a/arch/x86/kvm/svm/hyperv.h
+++ b/arch/x86/kvm/svm/hyperv.h
@@ -48,4 +48,6 @@ static inline void nested_svm_hv_update_vm_vp_ids(struct kvm_vcpu *vcpu)
 	hv_vcpu->nested.vp_id = hve->hv_vp_id;
 }
 
+void svm_post_hv_direct_flush(struct kvm_vcpu *vcpu);
+
 #endif /* __ARCH_X86_KVM_SVM_HYPERV_H__ */
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ee75061a7ea3..8cd008e12350 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1668,4 +1668,5 @@ struct kvm_x86_nested_ops svm_nested_ops = {
 	.get_nested_state_pages = svm_get_nested_state_pages,
 	.get_state = svm_get_nested_state,
 	.set_state = svm_set_nested_state,
+	.post_hv_direct_flush = svm_post_hv_direct_flush,
 };
diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 87e3dc10edf4..1705c4973636 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -437,3 +437,7 @@ int nested_enable_evmcs(struct kvm_vcpu *vcpu,
 
 	return 0;
 }
+
+void vmx_post_hv_direct_flush(struct kvm_vcpu *vcpu)
+{
+}
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index 8d70f9aea94b..8862692a4c5d 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -244,5 +244,6 @@ int nested_enable_evmcs(struct kvm_vcpu *vcpu,
 			uint16_t *vmcs_version);
 void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
 int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
+void vmx_post_hv_direct_flush(struct kvm_vcpu *vcpu);
 
 #endif /* __KVM_X86_VMX_EVMCS_H */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4075cf8d61f4..7dd4104cfdf4 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6827,4 +6827,5 @@ struct kvm_x86_nested_ops vmx_nested_ops = {
 	.write_log_dirty = nested_vmx_write_pml_buffer,
 	.enable_evmcs = nested_enable_evmcs,
 	.get_evmcs_version = nested_get_evmcs_version,
+	.post_hv_direct_flush = vmx_post_hv_direct_flush,
 };
-- 
2.35.1


* [PATCH v2 12/31] KVM: x86: hyper-v: Introduce kvm_hv_is_tlb_flush_hcall()
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (10 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 11/31] KVM: x86: Introduce .post_hv_direct_flush() nested hook Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 18:07   ` Sean Christopherson
  2022-04-07 15:56 ` [PATCH v2 13/31] KVM: x86: hyper-v: Direct TLB flush Vitaly Kuznetsov
                   ` (18 subsequent siblings)
  30 siblings, 1 reply; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

The newly introduced helper checks whether a vCPU is performing a
Hyper-V TLB flush hypercall. This is required to filter out Direct TLB
flush hypercalls from L2 for processing.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index 448877b478ef..3687e1e61e0d 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -168,6 +168,30 @@ static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
 	tlb_flush_ring = kvm_hv_get_tlb_flush_ring(vcpu);
 	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
 }
+
+static inline bool kvm_hv_is_tlb_flush_hcall(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	u16 code;
+
+	if (!hv_vcpu)
+		return false;
+
+#ifdef CONFIG_X86_64
+	if (is_64_bit_hypercall(vcpu)) {
+		code = kvm_rcx_read(vcpu) & 0xffff;
+	} else
+#endif
+	{
+		code = kvm_rax_read(vcpu) & 0xffff;
+	}
+
+	return (code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE ||
+		code == HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST ||
+		code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX ||
+		code == HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX);
+}
+
 void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
 
 
-- 
2.35.1


* [PATCH v2 13/31] KVM: x86: hyper-v: Direct TLB flush
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (11 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 12/31] KVM: x86: hyper-v: Introduce kvm_hv_is_tlb_flush_hcall() Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 18:27   ` Sean Christopherson
  2022-04-07 15:56 ` [PATCH v2 14/31] KVM: x86: hyper-v: Introduce fast kvm_hv_direct_tlb_flush_exposed() check Vitaly Kuznetsov
                   ` (17 subsequent siblings)
  30 siblings, 1 reply; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Handle Direct TLB flush requests from L2 by going through all vCPUs
and checking whether there are vCPUs running the same VM_ID with a
VP_ID specified in the requests. Upon finish, perform a synthetic
vmexit to L1 when 'TlbLockCount' in the partition assist page is
non-zero.

Note, while checking VM_ID/VP_ID of running vCPUs seems to be a bit
racy, we count on the fact that KVM flushes the whole L2 VPID upon
transition. Also, a KVM_REQ_HV_TLB_FLUSH request needs to be made upon
transition between L1 and L2 to make sure all pending requests are
always processed.

Note, nVMX/nSVM code does not handle VMCALL/VMMCALL from L2 yet.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
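Note: 'tlb_lock_count' below is read from the partition assist page;
its (minimal) layout as defined in hyperv-tlfs.h is:

struct hv_partition_assist_pg {
	u32 tlb_lock_count;
};

Per the TLFS, a non-zero count means L1 currently holds its TLB flush
lock, so a synthetic vmexit to L1 is required after the direct flush.
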
 arch/x86/kvm/hyperv.c | 65 ++++++++++++++++++++++++++++++++++++-------
 arch/x86/kvm/trace.h  | 21 ++++++++------
 2 files changed, 68 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 705c0b739c1b..2b12f1b5c992 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -34,6 +34,7 @@
 #include <linux/eventfd.h>
 
 #include <asm/apicdef.h>
+#include <asm/mshyperv.h>
 #include <trace/events/kvm.h>
 
 #include "trace.h"
@@ -1849,8 +1850,8 @@ static inline int hv_tlb_flush_ring_free(struct kvm_vcpu_hv *hv_vcpu,
 	return read_idx - write_idx - 1;
 }
 
-static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool flush_all,
-				      u64 *entries, int count)
+static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool direct,
+				      bool flush_all, u64 *entries, int count)
 {
 	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
@@ -1861,7 +1862,7 @@ static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool flush_all,
 	if (!hv_vcpu)
 		return;
 
-	tlb_flush_ring = &hv_vcpu->tlb_flush_ring[0];
+	tlb_flush_ring = direct ? &hv_vcpu->tlb_flush_ring[1] : &hv_vcpu->tlb_flush_ring[0];
 
 	spin_lock_irqsave(&tlb_flush_ring->write_lock, flags);
 
@@ -1996,7 +1997,8 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 		}
 
 		trace_kvm_hv_flush_tlb(flush.processor_mask,
-				       flush.address_space, flush.flags);
+				       flush.address_space, flush.flags,
+				       is_guest_mode(vcpu));
 
 		valid_bank_mask = BIT_ULL(0);
 		sparse_banks[0] = flush.processor_mask;
@@ -2027,7 +2029,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 		trace_kvm_hv_flush_tlb_ex(flush_ex.hv_vp_set.valid_bank_mask,
 					  flush_ex.hv_vp_set.format,
 					  flush_ex.address_space,
-					  flush_ex.flags);
+					  flush_ex.flags, is_guest_mode(vcpu));
 
 		valid_bank_mask = flush_ex.hv_vp_set.valid_bank_mask;
 		all_cpus = flush_ex.hv_vp_set.format !=
@@ -2066,19 +2068,45 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
 	 * analyze it here, flush TLB regardless of the specified address space.
 	 */
-	if (all_cpus) {
+	if (all_cpus && !is_guest_mode(vcpu)) {
 		kvm_for_each_vcpu(i, v, kvm)
-			hv_tlb_flush_ring_enqueue(v, all_addr, entries, hc->rep_cnt);
+			hv_tlb_flush_ring_enqueue(v, false, all_addr, entries, hc->rep_cnt);
 
 		kvm_make_all_cpus_request(kvm, KVM_REQ_HV_TLB_FLUSH);
-	} else {
+	} else if (!is_guest_mode(vcpu)) {
 		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask, vcpu_mask);
 
 		for_each_set_bit(i, vcpu_mask, KVM_MAX_VCPUS) {
 			v = kvm_get_vcpu(kvm, i);
 			if (!v)
 				continue;
-			hv_tlb_flush_ring_enqueue(v, all_addr, entries, hc->rep_cnt);
+			hv_tlb_flush_ring_enqueue(v, false, all_addr, entries, hc->rep_cnt);
+		}
+
+		kvm_make_vcpus_request_mask(kvm, KVM_REQ_HV_TLB_FLUSH, vcpu_mask);
+	} else {
+		struct kvm_vcpu_hv *hv_v;
+
+		bitmap_zero(vcpu_mask, KVM_MAX_VCPUS);
+
+		kvm_for_each_vcpu(i, v, kvm) {
+			hv_v = to_hv_vcpu(v);
+
+			/*
+			 * TLB is fully flushed on L2 VM change: either by KVM
+			 * (on a eVMPTR switch) or by L1 hypervisor (in case it
+			 * re-purposes the active eVMCS for a different VM/VP).
+			 */
+			if (!hv_v || hv_v->nested.vm_id != hv_vcpu->nested.vm_id)
+				continue;
+
+			if (!all_cpus &&
+			    !hv_is_vp_in_sparse_set(hv_v->nested.vp_id, valid_bank_mask,
+						    sparse_banks))
+				continue;
+
+			__set_bit(i, vcpu_mask);
+			hv_tlb_flush_ring_enqueue(v, true, all_addr, entries, hc->rep_cnt);
 		}
 
 		kvm_make_vcpus_request_mask(kvm, KVM_REQ_HV_TLB_FLUSH, vcpu_mask);
@@ -2266,10 +2294,27 @@ static void kvm_hv_hypercall_set_result(struct kvm_vcpu *vcpu, u64 result)
 
 static int kvm_hv_hypercall_complete(struct kvm_vcpu *vcpu, u64 result)
 {
+	int ret;
+
 	trace_kvm_hv_hypercall_done(result);
 	kvm_hv_hypercall_set_result(vcpu, result);
 	++vcpu->stat.hypercalls;
-	return kvm_skip_emulated_instruction(vcpu);
+	ret = kvm_skip_emulated_instruction(vcpu);
+
+	if (unlikely(hv_result_success(result) && is_guest_mode(vcpu)
+		     && kvm_hv_is_tlb_flush_hcall(vcpu))) {
+		struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+		u32 tlb_lock_count = 0;
+
+		if (unlikely(kvm_read_guest(vcpu->kvm, hv_vcpu->nested.pa_page_gpa,
+					    &tlb_lock_count, sizeof(tlb_lock_count))))
+			kvm_inject_gp(vcpu, 0);
+
+		if (tlb_lock_count)
+			kvm_x86_ops.nested_ops->post_hv_direct_flush(vcpu);
+	}
+
+	return ret;
 }
 
 static int kvm_hv_hypercall_complete_userspace(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index e3a24b8f04be..4241b7c0245e 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1479,38 +1479,41 @@ TRACE_EVENT(kvm_hv_timer_state,
  * Tracepoint for kvm_hv_flush_tlb.
  */
 TRACE_EVENT(kvm_hv_flush_tlb,
-	TP_PROTO(u64 processor_mask, u64 address_space, u64 flags),
-	TP_ARGS(processor_mask, address_space, flags),
+	TP_PROTO(u64 processor_mask, u64 address_space, u64 flags, bool direct),
+	TP_ARGS(processor_mask, address_space, flags, direct),
 
 	TP_STRUCT__entry(
 		__field(u64, processor_mask)
 		__field(u64, address_space)
 		__field(u64, flags)
+		__field(bool, direct)
 	),
 
 	TP_fast_assign(
 		__entry->processor_mask = processor_mask;
 		__entry->address_space = address_space;
 		__entry->flags = flags;
+		__entry->direct = direct;
 	),
 
-	TP_printk("processor_mask 0x%llx address_space 0x%llx flags 0x%llx",
+	TP_printk("processor_mask 0x%llx address_space 0x%llx flags 0x%llx %s",
 		  __entry->processor_mask, __entry->address_space,
-		  __entry->flags)
+		  __entry->flags, __entry->direct ? "(direct)" : "")
 );
 
 /*
  * Tracepoint for kvm_hv_flush_tlb_ex.
  */
 TRACE_EVENT(kvm_hv_flush_tlb_ex,
-	TP_PROTO(u64 valid_bank_mask, u64 format, u64 address_space, u64 flags),
-	TP_ARGS(valid_bank_mask, format, address_space, flags),
+	TP_PROTO(u64 valid_bank_mask, u64 format, u64 address_space, u64 flags, bool direct),
+	TP_ARGS(valid_bank_mask, format, address_space, flags, direct),
 
 	TP_STRUCT__entry(
 		__field(u64, valid_bank_mask)
 		__field(u64, format)
 		__field(u64, address_space)
 		__field(u64, flags)
+		__field(bool, direct)
 	),
 
 	TP_fast_assign(
@@ -1518,12 +1521,14 @@ TRACE_EVENT(kvm_hv_flush_tlb_ex,
 		__entry->format = format;
 		__entry->address_space = address_space;
 		__entry->flags = flags;
+		__entry->direct = direct;
 	),
 
 	TP_printk("valid_bank_mask 0x%llx format 0x%llx "
-		  "address_space 0x%llx flags 0x%llx",
+		  "address_space 0x%llx flags 0x%llx %s",
 		  __entry->valid_bank_mask, __entry->format,
-		  __entry->address_space, __entry->flags)
+		  __entry->address_space, __entry->flags,
+		  __entry->direct ? "(direct)" : "")
 );
 
 /*
-- 
2.35.1



* [PATCH v2 14/31] KVM: x86: hyper-v: Introduce fast kvm_hv_direct_tlb_flush_exposed() check
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (12 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 13/31] KVM: x86: hyper-v: Direct TLB flush Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 15/31] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Vitaly Kuznetsov
                   ` (16 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Introduce a helper to quickly check whether KVM needs to handle a
VMCALL/VMMCALL from L2 in L0 in order to process a Direct TLB flush
request.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
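Usage sketch (illustration only; this is roughly how subsequent patches
in the series combine the helpers on a VMCALL/VMMCALL from L2, together
with the vendor-specific enablement checks):

	/* Cheap CPUID-cache check first, register decoding second. */
	if (kvm_hv_direct_tlb_flush_exposed(vcpu) &&
	    kvm_hv_is_tlb_flush_hcall(vcpu))
		/* handle the hypercall in L0 */;
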
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/hyperv.c           | 6 ++++++
 arch/x86/kvm/hyperv.h           | 7 +++++++
 3 files changed, 14 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7c607307a5d4..3be54b51d6bc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -615,6 +615,7 @@ struct kvm_vcpu_hv {
 		u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
 		u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
 		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
+		u32 nested_features_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
 	} cpuid_cache;
 
 	/* Two rings for regular Hyper-V TLB flush and Direct TLB flush */
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 2b12f1b5c992..f08cf295d750 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2256,6 +2256,12 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
 		hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
 	else
 		hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
+
+	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
+	if (entry)
+		hv_vcpu->cpuid_cache.nested_features_eax = entry->eax;
+	else
+		hv_vcpu->cpuid_cache.nested_features_eax = 0;
 }
 
 int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index 3687e1e61e0d..c4f8064da606 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -169,6 +169,13 @@ static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
 	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
 }
 
+static inline bool kvm_hv_direct_tlb_flush_exposed(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+
+	return hv_vcpu && (hv_vcpu->cpuid_cache.nested_features_eax & HV_X64_NESTED_DIRECT_FLUSH);
+}
+
 static inline bool kvm_hv_is_tlb_flush_hcall(struct kvm_vcpu *vcpu)
 {
 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
-- 
2.35.1



* [PATCH v2 15/31] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (13 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 14/31] KVM: x86: hyper-v: Introduce fast kvm_hv_direct_tlb_flush_exposed() check Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 16/31] KVM: nVMX: hyper-v: Direct TLB flush Vitaly Kuznetsov
                   ` (15 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Section 1.9 of TLFS v6.0b says:

"All structures are padded in such a way that fields are aligned
naturally (that is, an 8-byte field is aligned to an offset of 8 bytes
and so on)".

'struct hv_enlightened_vmcs' has a glitch:

...
        struct {
                u32                nested_flush_hypercall:1; /*   836: 0  4 */
                u32                msr_bitmap:1;         /*   836: 1  4 */
                u32                reserved:30;          /*   836: 2  4 */
        } hv_enlightenments_control;                     /*   836     4 */
        u32                        hv_vp_id;             /*   840     4 */
        u64                        hv_vm_id;             /*   844     8 */
        u64                        partition_assist_page; /*   852     8 */
...

And the observed values in 'partition_assist_page' make no sense at
all. Fix the layout by padding the structure properly.

Fixes: 68d1eb72ee99 ("x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
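A hypothetical compile-time sanity check (illustration only; the offsets
are derived from the pahole output above plus the added padding):

	/* With padding32_2 in place, 8-byte fields are naturally aligned. */
	BUILD_BUG_ON(offsetof(struct hv_enlightened_vmcs, hv_vm_id) != 848);
	BUILD_BUG_ON(offsetof(struct hv_enlightened_vmcs,
			      partition_assist_page) != 856);
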
 arch/x86/include/asm/hyperv-tlfs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 5225a85c08c3..e7ddae8e02c6 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -548,7 +548,7 @@ struct hv_enlightened_vmcs {
 	u64 guest_rip;
 
 	u32 hv_clean_fields;
-	u32 hv_padding_32;
+	u32 padding32_1;
 	u32 hv_synthetic_controls;
 	struct {
 		u32 nested_flush_hypercall:1;
@@ -556,7 +556,7 @@ struct hv_enlightened_vmcs {
 		u32 reserved:30;
 	}  __packed hv_enlightenments_control;
 	u32 hv_vp_id;
-
+	u32 padding32_2;
 	u64 hv_vm_id;
 	u64 partition_assist_page;
 	u64 padding64_4[4];
-- 
2.35.1



* [PATCH v2 16/31] KVM: nVMX: hyper-v: Direct TLB flush
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (14 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 15/31] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 18:47   ` Sean Christopherson
  2022-04-07 15:56 ` [PATCH v2 17/31] KVM: x86: KVM_REQ_TLB_FLUSH_CURRENT is a superset of KVM_REQ_HV_TLB_FLUSH too Vitaly Kuznetsov
                   ` (14 subsequent siblings)
  30 siblings, 1 reply; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Enable Direct TLB flush feature on nVMX when:
- Enlightened VMCS is in use.
- Direct TLB flush flag is enabled in eVMCS.
- Direct TLB flush is enabled in partition assist page.

Perform a synthetic vmexit to L1 after processing a TLB flush call upon
request (HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
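For illustration, the L1-side enablement this patch checks for looks
roughly like this (selftest-style sketch; the 'current_evmcs'/
'current_vp_assist' pointers and 'partition_assist_gpa' are
placeholders):

	/* Opt in via the enlightened VMCS... */
	current_evmcs->hv_enlightenments_control.nested_flush_hypercall = 1;
	current_evmcs->partition_assist_page = partition_assist_gpa;
	/* ...and via the VP assist page. */
	current_vp_assist->nested_control.features.directhypercall = 1;
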
 arch/x86/kvm/vmx/evmcs.c  | 20 ++++++++++++++++++++
 arch/x86/kvm/vmx/evmcs.h  |  3 +++
 arch/x86/kvm/vmx/nested.c | 16 ++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 1705c4973636..cdf7ec5cb64c 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -6,6 +6,7 @@
 #include "../hyperv.h"
 #include "../cpuid.h"
 #include "evmcs.h"
+#include "nested.h"
 #include "vmcs.h"
 #include "vmx.h"
 #include "trace.h"
@@ -438,6 +439,25 @@ int nested_enable_evmcs(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+bool nested_evmcs_direct_flush_enabled(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct hv_enlightened_vmcs *evmcs = vmx->nested.hv_evmcs;
+	struct hv_vp_assist_page assist_page;
+
+	if (!evmcs)
+		return false;
+
+	if (!evmcs->hv_enlightenments_control.nested_flush_hypercall)
+		return false;
+
+	if (unlikely(!kvm_hv_get_assist_page(vcpu, &assist_page)))
+		return false;
+
+	return assist_page.nested_control.features.directhypercall;
+}
+
 void vmx_post_hv_direct_flush(struct kvm_vcpu *vcpu)
 {
+	nested_vmx_vmexit(vcpu, HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH, 0, 0);
 }
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index 8862692a4c5d..ab0949c22d2d 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -65,6 +65,8 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
 #define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
 #define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)
 
+#define HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH 0x10000031
+
 struct evmcs_field {
 	u16 offset;
 	u16 clean_field;
@@ -244,6 +246,7 @@ int nested_enable_evmcs(struct kvm_vcpu *vcpu,
 			uint16_t *vmcs_version);
 void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
 int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
+bool nested_evmcs_direct_flush_enabled(struct kvm_vcpu *vcpu);
 void vmx_post_hv_direct_flush(struct kvm_vcpu *vcpu);
 
 #endif /* __KVM_X86_VMX_EVMCS_H */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 7dd4104cfdf4..d53d0cfe1df1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1171,6 +1171,17 @@ static void nested_vmx_transition_tlb_flush(struct kvm_vcpu *vcpu,
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	/*
+	 * KVM_REQ_HV_TLB_FLUSH flushes entries from either L1's VPID or
+	 * L2's VPID upon request from the guest. Make sure we check for
+	 * pending entries for the case when the request got misplaced (e.g.
+	 * a transition from L2->L1 happened while processing Direct TLB flush
+	 * request or vice versa). kvm_hv_vcpu_flush_tlb() will not flush
+	 * anything if there are no requests in the corresponding buffer.
+	 */
+	if (to_hv_vcpu(vcpu))
+		kvm_make_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
+
 	/*
 	 * If vmcs12 doesn't use VPID, L1 expects linear and combined mappings
 	 * for *all* contexts to be flushed on VM-Enter/VM-Exit, i.e. it's a
@@ -5975,6 +5986,11 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
 		 * Handle L2's bus locks in L0 directly.
 		 */
 		return true;
+	case EXIT_REASON_VMCALL:
+		/* Hyper-V Direct TLB flush hypercall is handled by L0 */
+		return kvm_hv_direct_tlb_flush_exposed(vcpu) &&
+			nested_evmcs_direct_flush_enabled(vcpu) &&
+			kvm_hv_is_tlb_flush_hcall(vcpu);
 	default:
 		break;
 	}
-- 
2.35.1



* [PATCH v2 17/31] KVM: x86: KVM_REQ_TLB_FLUSH_CURRENT is a superset of KVM_REQ_HV_TLB_FLUSH too
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (15 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 16/31] KVM: nVMX: hyper-v: Direct TLB flush Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 18/31] KVM: nSVM: hyper-v: Direct TLB flush Vitaly Kuznetsov
                   ` (13 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

KVM_REQ_TLB_FLUSH_CURRENT is an even stronger operation than
KVM_REQ_TLB_FLUSH_GUEST, so KVM_REQ_HV_TLB_FLUSH does not need to be
processed after it.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/x86.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2074d52b0666..59d19a3f0275 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3338,8 +3338,11 @@ static inline void kvm_vcpu_flush_tlb_current(struct kvm_vcpu *vcpu)
  */
 void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu)
 {
-	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
+	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu)) {
 		kvm_vcpu_flush_tlb_current(vcpu);
+		if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
+			kvm_hv_vcpu_empty_flush_tlb(vcpu);
+	}
 
 	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
 		kvm_vcpu_flush_tlb_guest(vcpu);
-- 
2.35.1



* [PATCH v2 18/31] KVM: nSVM: hyper-v: Direct TLB flush
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (16 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 17/31] KVM: x86: KVM_REQ_TLB_FLUSH_CURRENT is a superset of KVM_REQ_HV_TLB_FLUSH too Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 18:50   ` Sean Christopherson
  2022-04-07 15:56 ` [PATCH v2 19/31] KVM: x86: Expose Hyper-V Direct TLB flush feature Vitaly Kuznetsov
                   ` (12 subsequent siblings)
  30 siblings, 1 reply; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Implement the Hyper-V Direct TLB flush feature for nSVM. The feature
needs to be enabled both in the extended 'nested controls' in the VMCB
and in the partition assist page. According to the TLFS, the synthetic
vmexit to L1 is performed with
- HV_SVM_EXITCODE_ENL exit_code.
- HV_SVM_ENL_EXITCODE_TRAP_AFTER_FLUSH exit_info_1.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
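For illustration, an L1 hypervisor handling this synthetic vmexit would
see the following (sketch, selftest-style; 'vmcb' is the L1-maintained
VMCB for L2):

	if (vmcb->control.exit_code == HV_SVM_EXITCODE_ENL)
		GUEST_ASSERT(vmcb->control.exit_info_1 ==
			     HV_SVM_ENL_EXITCODE_TRAP_AFTER_FLUSH);
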
 arch/x86/kvm/svm/hyperv.c |  7 +++++++
 arch/x86/kvm/svm/hyperv.h | 19 +++++++++++++++++++
 arch/x86/kvm/svm/nested.c | 22 +++++++++++++++++++++-
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/hyperv.c b/arch/x86/kvm/svm/hyperv.c
index 0142fde34738..f3298c70053e 100644
--- a/arch/x86/kvm/svm/hyperv.c
+++ b/arch/x86/kvm/svm/hyperv.c
@@ -8,4 +8,11 @@
 
 void svm_post_hv_direct_flush(struct kvm_vcpu *vcpu)
 {
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	svm->vmcb->control.exit_code = HV_SVM_EXITCODE_ENL;
+	svm->vmcb->control.exit_code_hi = 0;
+	svm->vmcb->control.exit_info_1 = HV_SVM_ENL_EXITCODE_TRAP_AFTER_FLUSH;
+	svm->vmcb->control.exit_info_2 = 0;
+	nested_svm_vmexit(svm);
 }
diff --git a/arch/x86/kvm/svm/hyperv.h b/arch/x86/kvm/svm/hyperv.h
index b3f5df6c6c97..80d12e075b4f 100644
--- a/arch/x86/kvm/svm/hyperv.h
+++ b/arch/x86/kvm/svm/hyperv.h
@@ -33,6 +33,9 @@ struct hv_enlightenments {
  */
 #define VMCB_HV_NESTED_ENLIGHTENMENTS VMCB_SW
 
+#define HV_SVM_EXITCODE_ENL 0xF0000000
+#define HV_SVM_ENL_EXITCODE_TRAP_AFTER_FLUSH   (1)
+
 static inline void nested_svm_hv_update_vm_vp_ids(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -48,6 +51,22 @@ static inline void nested_svm_hv_update_vm_vp_ids(struct kvm_vcpu *vcpu)
 	hv_vcpu->nested.vp_id = hve->hv_vp_id;
 }
 
+static inline bool nested_svm_direct_flush_enabled(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct hv_enlightenments *hve =
+		(struct hv_enlightenments *)svm->nested.ctl.reserved_sw;
+	struct hv_vp_assist_page assist_page;
+
+	if (unlikely(!kvm_hv_get_assist_page(vcpu, &assist_page)))
+		return false;
+
+	if (!hve->hv_enlightenments_control.nested_flush_hypercall)
+		return false;
+
+	return assist_page.nested_control.features.directhypercall;
+}
+
 void svm_post_hv_direct_flush(struct kvm_vcpu *vcpu);
 
 #endif /* __ARCH_X86_KVM_SVM_HYPERV_H__ */
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 8cd008e12350..45bc7921d260 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -170,7 +170,8 @@ void recalc_intercepts(struct vcpu_svm *svm)
 	}
 
 	/* We don't want to see VMMCALLs from a nested guest */
-	vmcb_clr_intercept(c, INTERCEPT_VMMCALL);
+	if (!nested_svm_direct_flush_enabled(&svm->vcpu))
+		vmcb_clr_intercept(c, INTERCEPT_VMMCALL);
 
 	for (i = 0; i < MAX_INTERCEPT; i++)
 		c->intercepts[i] |= g->intercepts[i];
@@ -486,6 +487,17 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm,
 
 static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu)
 {
+	/*
+	 * KVM_REQ_HV_TLB_FLUSH flushes entries from either L1's VPID or
+	 * L2's VPID upon request from the guest. Make sure we check for
+	 * pending entries for the case when the request got misplaced (e.g.
+	 * a transition from L2->L1 happened while processing Direct TLB flush
+	 * request or vice versa). kvm_hv_vcpu_flush_tlb() will not flush
+	 * anything if there are no requests in the corresponding buffer.
+	 */
+	if (to_hv_vcpu(vcpu))
+		kvm_make_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
+
 	/*
 	 * TODO: optimize unconditional TLB flush/MMU sync.  A partial list of
 	 * things to fix before this can be conditional:
@@ -1361,6 +1373,7 @@ static int svm_check_nested_events(struct kvm_vcpu *vcpu)
 int nested_svm_exit_special(struct vcpu_svm *svm)
 {
 	u32 exit_code = svm->vmcb->control.exit_code;
+	struct kvm_vcpu *vcpu = &svm->vcpu;
 
 	switch (exit_code) {
 	case SVM_EXIT_INTR:
@@ -1379,6 +1392,13 @@ int nested_svm_exit_special(struct vcpu_svm *svm)
 			return NESTED_EXIT_HOST;
 		break;
 	}
+	case SVM_EXIT_VMMCALL:
+		/* Hyper-V Direct TLB flush hypercall is handled by L0 */
+		if (kvm_hv_direct_tlb_flush_exposed(vcpu) &&
+		    nested_svm_direct_flush_enabled(vcpu) &&
+		    kvm_hv_is_tlb_flush_hcall(vcpu))
+			return NESTED_EXIT_HOST;
+		break;
 	default:
 		break;
 	}
-- 
2.35.1



* [PATCH v2 19/31] KVM: x86: Expose Hyper-V Direct TLB flush feature
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (17 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 18/31] KVM: nSVM: hyper-v: Direct TLB flush Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 20/31] KVM: selftests: add hyperv_svm_test to .gitignore Vitaly Kuznetsov
                   ` (11 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

With both nSVM and nVMX implementations in place, KVM can export
Hyper-V Direct TLB flush feature to userspace.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index f08cf295d750..7758ed3dc811 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2801,6 +2801,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 
 		case HYPERV_CPUID_NESTED_FEATURES:
 			ent->eax = evmcs_ver;
+			ent->eax |= HV_X64_NESTED_DIRECT_FLUSH;
 			ent->eax |= HV_X64_NESTED_MSR_BITMAP;
 
 			break;
-- 
2.35.1



* [PATCH v2 20/31] KVM: selftests: add hyperv_svm_test to .gitignore
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (18 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 19/31] KVM: x86: Expose Hyper-V Direct TLB flush feature Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 21/31] KVM: selftests: Better XMM read/write helpers Vitaly Kuznetsov
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Add hyperv_svm_test to .gitignore.

Fixes: e67bd7df28a0 ("KVM: selftests: nSVM: Add enlightened MSR-Bitmap selftest")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 1f1b6c978bf7..4256fa526cda 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -22,6 +22,7 @@
 /x86_64/hyperv_clock
 /x86_64/hyperv_cpuid
 /x86_64/hyperv_features
+/x86_64/hyperv_svm_test
 /x86_64/mmio_warning_test
 /x86_64/mmu_role_test
 /x86_64/platform_info_test
-- 
2.35.1



* [PATCH v2 21/31] KVM: selftests: Better XMM read/write helpers
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (19 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 20/31] KVM: selftests: add hyperv_svm_test to .gitignore Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 22/31] KVM: selftests: Hyper-V PV IPI selftest Vitaly Kuznetsov
                   ` (9 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

The set_xmm()/get_xmm() helpers are fairly useless as they only access
64 bits of the 128-bit registers. Moreover, these helpers are not used
anywhere. Borrow _kvm_read_sse_reg()/_kvm_write_sse_reg() from KVM,
limiting them to XMM0-XMM7 for now.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
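Usage sketch (illustration only):

	sse128_t in = {0x1, 0x2, 0x3, 0x4}, out;

	write_sse_reg(6, &in);	/* load XMM6 */
	read_sse_reg(6, &out);	/* and read it back */
	GUEST_ASSERT(sse128_lo(out) == sse128_lo(in) &&
		     sse128_hi(out) == sse128_hi(in));
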
 .../selftests/kvm/include/x86_64/processor.h  | 70 ++++++++++---------
 1 file changed, 36 insertions(+), 34 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 37db341d4cc5..9ad7602a257b 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -296,71 +296,73 @@ static inline void cpuid(uint32_t *eax, uint32_t *ebx,
 	    : "memory");
 }
 
-#define SET_XMM(__var, __xmm) \
-	asm volatile("movq %0, %%"#__xmm : : "r"(__var) : #__xmm)
+typedef u32		__attribute__((vector_size(16))) sse128_t;
+#define __sse128_u	union { sse128_t vec; u64 as_u64[2]; u32 as_u32[4]; }
+#define sse128_lo(x)	({ __sse128_u t; t.vec = x; t.as_u64[0]; })
+#define sse128_hi(x)	({ __sse128_u t; t.vec = x; t.as_u64[1]; })
 
-static inline void set_xmm(int n, unsigned long val)
+static inline void read_sse_reg(int reg, sse128_t *data)
 {
-	switch (n) {
+	switch (reg) {
 	case 0:
-		SET_XMM(val, xmm0);
+		asm("movdqa %%xmm0, %0" : "=m"(*data));
 		break;
 	case 1:
-		SET_XMM(val, xmm1);
+		asm("movdqa %%xmm1, %0" : "=m"(*data));
 		break;
 	case 2:
-		SET_XMM(val, xmm2);
+		asm("movdqa %%xmm2, %0" : "=m"(*data));
 		break;
 	case 3:
-		SET_XMM(val, xmm3);
+		asm("movdqa %%xmm3, %0" : "=m"(*data));
 		break;
 	case 4:
-		SET_XMM(val, xmm4);
+		asm("movdqa %%xmm4, %0" : "=m"(*data));
 		break;
 	case 5:
-		SET_XMM(val, xmm5);
+		asm("movdqa %%xmm5, %0" : "=m"(*data));
 		break;
 	case 6:
-		SET_XMM(val, xmm6);
+		asm("movdqa %%xmm6, %0" : "=m"(*data));
 		break;
 	case 7:
-		SET_XMM(val, xmm7);
+		asm("movdqa %%xmm7, %0" : "=m"(*data));
 		break;
+	default:
+		BUG();
 	}
 }
 
-#define GET_XMM(__xmm)							\
-({									\
-	unsigned long __val;						\
-	asm volatile("movq %%"#__xmm", %0" : "=r"(__val));		\
-	__val;								\
-})
-
-static inline unsigned long get_xmm(int n)
+static inline void write_sse_reg(int reg, const sse128_t *data)
 {
-	assert(n >= 0 && n <= 7);
-
-	switch (n) {
+	switch (reg) {
 	case 0:
-		return GET_XMM(xmm0);
+		asm("movdqa %0, %%xmm0" : : "m"(*data));
+		break;
 	case 1:
-		return GET_XMM(xmm1);
+		asm("movdqa %0, %%xmm1" : : "m"(*data));
+		break;
 	case 2:
-		return GET_XMM(xmm2);
+		asm("movdqa %0, %%xmm2" : : "m"(*data));
+		break;
 	case 3:
-		return GET_XMM(xmm3);
+		asm("movdqa %0, %%xmm3" : : "m"(*data));
+		break;
 	case 4:
-		return GET_XMM(xmm4);
+		asm("movdqa %0, %%xmm4" : : "m"(*data));
+		break;
 	case 5:
-		return GET_XMM(xmm5);
+		asm("movdqa %0, %%xmm5" : : "m"(*data));
+		break;
 	case 6:
-		return GET_XMM(xmm6);
+		asm("movdqa %0, %%xmm6" : : "m"(*data));
+		break;
 	case 7:
-		return GET_XMM(xmm7);
+		asm("movdqa %0, %%xmm7" : : "m"(*data));
+		break;
+	default:
+		BUG();
 	}
-
-	/* never reached */
-	return 0;
 }
 
 static inline void cpu_relax(void)
-- 
2.35.1



* [PATCH v2 22/31] KVM: selftests: Hyper-V PV IPI selftest
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (20 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 21/31] KVM: selftests: Better XMM read/write helpers Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 23/31] KVM: selftests: Make it possible to replace PTEs with __virt_pg_map() Vitaly Kuznetsov
                   ` (8 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Introduce a selftest for Hyper-V PV IPI hypercalls
(HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx).

The test creates one 'sender' vCPU and two 'receiver' vCPUs and then
issues various combinations of send-IPI hypercalls in both 'normal'
and 'fast' (with XMM input where necessary) modes. Later, the test
checks whether IPIs were delivered to the expected destination vCPU(s).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/hyperv.h     |   3 +
 .../selftests/kvm/x86_64/hyperv_features.c    |   5 +-
 .../testing/selftests/kvm/x86_64/hyperv_ipi.c | 362 ++++++++++++++++++
 5 files changed, 369 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_ipi.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 4256fa526cda..143fd0f00c9d 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -22,6 +22,7 @@
 /x86_64/hyperv_clock
 /x86_64/hyperv_cpuid
 /x86_64/hyperv_features
+/x86_64/hyperv_ipi
 /x86_64/hyperv_svm_test
 /x86_64/mmio_warning_test
 /x86_64/mmu_role_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index c9cdbd248727..2e84a8a8c0c9 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -52,6 +52,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/fix_hypercall_test
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_clock
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_features
+TEST_GEN_PROGS_x86_64 += x86_64/hyperv_ipi
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_svm_test
 TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test
 TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/hyperv.h b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
index b66910702c0a..f51d6fab8e93 100644
--- a/tools/testing/selftests/kvm/include/x86_64/hyperv.h
+++ b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
@@ -184,5 +184,8 @@
 
 /* hypercall options */
 #define HV_HYPERCALL_FAST_BIT		BIT(16)
+#define HV_HYPERCALL_VARHEAD_OFFSET	17
+
+#define HYPERV_LINUX_OS_ID ((u64)0x8100 << 48)
 
 #endif /* !SELFTEST_KVM_HYPERV_H */
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
index 672915ce73d8..98c020356925 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
@@ -14,7 +14,6 @@
 #include "hyperv.h"
 
 #define VCPU_ID 0
-#define LINUX_OS_ID ((u64)0x8100 << 48)
 
 extern unsigned char rdmsr_start;
 extern unsigned char rdmsr_end;
@@ -127,7 +126,7 @@ static void guest_hcall(vm_vaddr_t pgs_gpa, struct hcall_data *hcall)
 	int i = 0;
 	u64 res, input, output;
 
-	wrmsr(HV_X64_MSR_GUEST_OS_ID, LINUX_OS_ID);
+	wrmsr(HV_X64_MSR_GUEST_OS_ID, HYPERV_LINUX_OS_ID);
 	wrmsr(HV_X64_MSR_HYPERCALL, pgs_gpa);
 
 	while (hcall->control) {
@@ -230,7 +229,7 @@ static void guest_test_msrs_access(void)
 			 */
 			msr->idx = HV_X64_MSR_GUEST_OS_ID;
 			msr->write = 1;
-			msr->write_val = LINUX_OS_ID;
+			msr->write_val = HYPERV_LINUX_OS_ID;
 			msr->available = 1;
 			break;
 		case 3:
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_ipi.c b/tools/testing/selftests/kvm/x86_64/hyperv_ipi.c
new file mode 100644
index 000000000000..6c697fe7eca4
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_ipi.c
@@ -0,0 +1,362 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hyper-V HvCallSendSyntheticClusterIpi{,Ex} tests
+ *
+ * Copyright (C) 2022, Red Hat, Inc.
+ *
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <pthread.h>
+#include <inttypes.h>
+
+#include "kvm_util.h"
+#include "hyperv.h"
+#include "processor.h"
+#include "test_util.h"
+#include "vmx.h"
+
+#define SENDER_VCPU_ID   1
+#define RECEIVER_VCPU_ID_1 2
+#define RECEIVER_VCPU_ID_2 65
+
+#define IPI_VECTOR	 0xfe
+
+static volatile uint64_t ipis_rcvd[RECEIVER_VCPU_ID_2 + 1];
+
+struct thread_params {
+	struct kvm_vm *vm;
+	uint32_t vcpu_id;
+};
+
+struct hv_vpset {
+	u64 format;
+	u64 valid_bank_mask;
+	u64 bank_contents[2];
+};
+
+enum HV_GENERIC_SET_FORMAT {
+	HV_GENERIC_SET_SPARSE_4K,
+	HV_GENERIC_SET_ALL,
+};
+
+/* HvCallSendSyntheticClusterIpi hypercall */
+struct hv_send_ipi {
+	u32 vector;
+	u32 reserved;
+	u64 cpu_mask;
+};
+
+/* HvCallSendSyntheticClusterIpiEx hypercall */
+struct hv_send_ipi_ex {
+	u32 vector;
+	u32 reserved;
+	struct hv_vpset vp_set;
+};
+
+static inline void hv_init(vm_vaddr_t pgs_gpa)
+{
+	wrmsr(HV_X64_MSR_GUEST_OS_ID, HYPERV_LINUX_OS_ID);
+	wrmsr(HV_X64_MSR_HYPERCALL, pgs_gpa);
+}
+
+static void receiver_code(void *hcall_page, vm_vaddr_t pgs_gpa)
+{
+	x2apic_enable();
+	hv_init(pgs_gpa);
+
+	for (;;)
+		asm volatile("sti; hlt; cli");
+}
+
+static void guest_ipi_handler(struct ex_regs *regs)
+{
+	u32 vcpu_id = rdmsr(HV_X64_MSR_VP_INDEX);
+
+	ipis_rcvd[vcpu_id]++;
+	wrmsr(HV_X64_MSR_EOI, 1);
+}
+
+static inline u64 hypercall(u64 control, vm_vaddr_t arg1, vm_vaddr_t arg2)
+{
+	u64 hv_status;
+
+	asm volatile("mov %3, %%r8\n"
+		     "vmcall"
+		     : "=a" (hv_status),
+		       "+c" (control), "+d" (arg1)
+		     :  "r" (arg2)
+		     : "cc", "memory", "r8", "r9", "r10", "r11");
+
+	return hv_status;
+}
+
+static inline void nop_loop(void)
+{
+	int i;
+
+	for (i = 0; i < 100000000; i++)
+		asm volatile("nop");
+}
+
+static inline void sync_to_xmm(void *data)
+{
+	int i;
+
+	for (i = 0; i < 8; i++)
+		write_sse_reg(i, (sse128_t *)(data + sizeof(sse128_t) * i));
+}
+
+static void sender_guest_code(void *hcall_page, vm_vaddr_t pgs_gpa)
+{
+	struct hv_send_ipi *ipi = (struct hv_send_ipi *)hcall_page;
+	struct hv_send_ipi_ex *ipi_ex = (struct hv_send_ipi_ex *)hcall_page;
+	int stage = 1, ipis_expected[2] = {0};
+	u64 res;
+
+	hv_init(pgs_gpa);
+	GUEST_SYNC(stage++);
+
+	/* 'Slow' HvCallSendSyntheticClusterIpi to RECEIVER_VCPU_ID_1 */
+	ipi->vector = IPI_VECTOR;
+	ipi->cpu_mask = 1 << RECEIVER_VCPU_ID_1;
+	res = hypercall(HVCALL_SEND_IPI, pgs_gpa, pgs_gpa + 4096);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ++ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ipis_expected[1]);
+	GUEST_SYNC(stage++);
+	/* 'Fast' HvCallSendSyntheticClusterIpi to RECEIVER_VCPU_ID_1 */
+	res = hypercall(HVCALL_SEND_IPI | HV_HYPERCALL_FAST_BIT,
+			IPI_VECTOR, 1 << RECEIVER_VCPU_ID_1);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ++ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ipis_expected[1]);
+	GUEST_SYNC(stage++);
+
+	/* 'Slow' HvCallSendSyntheticClusterIpiEx to RECEIVER_VCPU_ID_1 */
+	memset(hcall_page, 0, 4096);
+	ipi_ex->vector = IPI_VECTOR;
+	ipi_ex->vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+	ipi_ex->vp_set.valid_bank_mask = 1 << 0;
+	ipi_ex->vp_set.bank_contents[0] = BIT(RECEIVER_VCPU_ID_1);
+	res = hypercall(HVCALL_SEND_IPI_EX | (1 << HV_HYPERCALL_VARHEAD_OFFSET),
+			pgs_gpa, pgs_gpa + 4096);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ++ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ipis_expected[1]);
+	GUEST_SYNC(stage++);
+	/* 'XMM Fast' HvCallSendSyntheticClusterIpiEx to RECEIVER_VCPU_ID_1 */
+	sync_to_xmm(&ipi_ex->vp_set.valid_bank_mask);
+	res = hypercall(HVCALL_SEND_IPI_EX | HV_HYPERCALL_FAST_BIT |
+			(1 << HV_HYPERCALL_VARHEAD_OFFSET),
+			IPI_VECTOR, HV_GENERIC_SET_SPARSE_4K);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ++ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ipis_expected[1]);
+	GUEST_SYNC(stage++);
+
+	/* 'Slow' HvCallSendSyntheticClusterIpiEx to RECEIVER_VCPU_ID_2 */
+	memset(hcall_page, 0, 4096);
+	ipi_ex->vector = IPI_VECTOR;
+	ipi_ex->vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+	ipi_ex->vp_set.valid_bank_mask = 1 << 1;
+	ipi_ex->vp_set.bank_contents[0] = BIT(RECEIVER_VCPU_ID_2 - 64);
+	res = hypercall(HVCALL_SEND_IPI_EX | (1 << HV_HYPERCALL_VARHEAD_OFFSET),
+			pgs_gpa, pgs_gpa + 4096);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ++ipis_expected[1]);
+	GUEST_SYNC(stage++);
+	/* 'XMM Fast' HvCallSendSyntheticClusterIpiEx to RECEIVER_VCPU_ID_2 */
+	sync_to_xmm(&ipi_ex->vp_set.valid_bank_mask);
+	res = hypercall(HVCALL_SEND_IPI_EX | HV_HYPERCALL_FAST_BIT |
+			(1 << HV_HYPERCALL_VARHEAD_OFFSET),
+			IPI_VECTOR, HV_GENERIC_SET_SPARSE_4K);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ++ipis_expected[1]);
+	GUEST_SYNC(stage++);
+
+	/* 'Slow' HvCallSendSyntheticClusterIpiEx to both RECEIVER_VCPU_ID_{1,2} */
+	memset(hcall_page, 0, 4096);
+	ipi_ex->vector = IPI_VECTOR;
+	ipi_ex->vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+	ipi_ex->vp_set.valid_bank_mask = 1 << 1 | 1;
+	ipi_ex->vp_set.bank_contents[0] = BIT(RECEIVER_VCPU_ID_1);
+	ipi_ex->vp_set.bank_contents[1] = BIT(RECEIVER_VCPU_ID_2 - 64);
+	res = hypercall(HVCALL_SEND_IPI_EX | (2 << HV_HYPERCALL_VARHEAD_OFFSET),
+			pgs_gpa, pgs_gpa + 4096);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ++ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ++ipis_expected[1]);
+	GUEST_SYNC(stage++);
+	/* 'XMM Fast' HvCallSendSyntheticClusterIpiEx to both RECEIVER_VCPU_ID_{1, 2} */
+	sync_to_xmm(&ipi_ex->vp_set.valid_bank_mask);
+	res = hypercall(HVCALL_SEND_IPI_EX | HV_HYPERCALL_FAST_BIT |
+			(2 << HV_HYPERCALL_VARHEAD_OFFSET),
+			IPI_VECTOR, HV_GENERIC_SET_SPARSE_4K);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ++ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ++ipis_expected[1]);
+	GUEST_SYNC(stage++);
+
+	/* 'Slow' HvCallSendSyntheticClusterIpiEx to HV_GENERIC_SET_ALL */
+	memset(hcall_page, 0, 4096);
+	ipi_ex->vector = IPI_VECTOR;
+	ipi_ex->vp_set.format = HV_GENERIC_SET_ALL;
+	res = hypercall(HVCALL_SEND_IPI_EX,
+			pgs_gpa, pgs_gpa + 4096);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ++ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ++ipis_expected[1]);
+	GUEST_SYNC(stage++);
+	/* 'XMM Fast' HvCallSendSyntheticClusterIpiEx to HV_GENERIC_SET_ALL */
+	sync_to_xmm(&ipi_ex->vp_set.valid_bank_mask);
+	res = hypercall(HVCALL_SEND_IPI_EX | HV_HYPERCALL_FAST_BIT,
+			IPI_VECTOR, HV_GENERIC_SET_ALL);
+	GUEST_ASSERT((res & 0xffff) == 0);
+	nop_loop();
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_1] == ++ipis_expected[0]);
+	GUEST_ASSERT(ipis_rcvd[RECEIVER_VCPU_ID_2] == ++ipis_expected[1]);
+	GUEST_SYNC(stage++);
+
+	GUEST_DONE();
+}
+
+static void *vcpu_thread(void *arg)
+{
+	struct thread_params *params = (struct thread_params *)arg;
+	struct ucall uc;
+	int old;
+	int r;
+	unsigned int exit_reason;
+
+	r = pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old);
+	TEST_ASSERT(r == 0,
+		    "pthread_setcanceltype failed on vcpu_id=%u with errno=%d",
+		    params->vcpu_id, r);
+
+	vcpu_run(params->vm, params->vcpu_id);
+	exit_reason = vcpu_state(params->vm, params->vcpu_id)->exit_reason;
+
+	TEST_ASSERT(exit_reason == KVM_EXIT_IO,
+		    "vCPU %u exited with unexpected exit reason %u-%s, expected KVM_EXIT_IO",
+		    params->vcpu_id, exit_reason, exit_reason_str(exit_reason));
+
+	if (get_ucall(params->vm, params->vcpu_id, &uc) == UCALL_ABORT) {
+		TEST_ASSERT(false,
+			    "vCPU %u exited with error: %s.\n",
+			    params->vcpu_id, (const char *)uc.args[0]);
+	}
+
+	return NULL;
+}
+
+static void cancel_join_vcpu_thread(pthread_t thread, uint32_t vcpu_id)
+{
+	void *retval;
+	int r;
+
+	r = pthread_cancel(thread);
+	TEST_ASSERT(r == 0,
+		    "pthread_cancel on vcpu_id=%d failed with errno=%d",
+		    vcpu_id, r);
+
+	r = pthread_join(thread, &retval);
+	TEST_ASSERT(r == 0,
+		    "pthread_join on vcpu_id=%d failed with errno=%d",
+		    vcpu_id, r);
+	TEST_ASSERT(retval == PTHREAD_CANCELED,
+		    "expected retval=%p, got %p", PTHREAD_CANCELED,
+		    retval);
+}
+
+int main(int argc, char *argv[])
+{
+	int r;
+	pthread_t threads[2];
+	struct thread_params params[2];
+	struct kvm_vm *vm;
+	struct kvm_run *run;
+	vm_vaddr_t hcall_page;
+	struct ucall uc;
+	int stage = 1;
+
+	vm = vm_create_default(SENDER_VCPU_ID, 0, sender_guest_code);
+	params[0].vm = vm;
+	params[1].vm = vm;
+
+	/* Hypercall input/output */
+	hcall_page = vm_vaddr_alloc_pages(vm, 2);
+	memset(addr_gva2hva(vm, hcall_page), 0x0, 2 * getpagesize());
+
+	vm_init_descriptor_tables(vm);
+
+	vm_vcpu_add_default(vm, RECEIVER_VCPU_ID_1, receiver_code);
+	vcpu_init_descriptor_tables(vm, RECEIVER_VCPU_ID_1);
+	vcpu_args_set(vm, RECEIVER_VCPU_ID_1, 2, hcall_page, addr_gva2gpa(vm, hcall_page));
+	vcpu_set_msr(vm, RECEIVER_VCPU_ID_1, HV_X64_MSR_VP_INDEX, RECEIVER_VCPU_ID_1);
+	vcpu_set_hv_cpuid(vm, RECEIVER_VCPU_ID_1);
+
+	vm_vcpu_add_default(vm, RECEIVER_VCPU_ID_2, receiver_code);
+	vcpu_init_descriptor_tables(vm, RECEIVER_VCPU_ID_2);
+	vcpu_args_set(vm, RECEIVER_VCPU_ID_2, 2, hcall_page, addr_gva2gpa(vm, hcall_page));
+	vcpu_set_msr(vm, RECEIVER_VCPU_ID_2, HV_X64_MSR_VP_INDEX, RECEIVER_VCPU_ID_2);
+	vcpu_set_hv_cpuid(vm, RECEIVER_VCPU_ID_2);
+
+	vm_install_exception_handler(vm, IPI_VECTOR, guest_ipi_handler);
+
+	vcpu_args_set(vm, SENDER_VCPU_ID, 2, hcall_page, addr_gva2gpa(vm, hcall_page));
+	vcpu_set_hv_cpuid(vm, SENDER_VCPU_ID);
+
+	params[0].vcpu_id = RECEIVER_VCPU_ID_1;
+	r = pthread_create(&threads[0], NULL, vcpu_thread, &params[0]);
+	TEST_ASSERT(r == 0,
+		    "pthread_create halter failed errno=%d", errno);
+
+	params[1].vcpu_id = RECEIVER_VCPU_ID_2;
+	r = pthread_create(&threads[1], NULL, vcpu_thread, &params[1]);
+	TEST_ASSERT(r == 0,
+		    "pthread_create halter failed errno=%d", errno);
+
+	run = vcpu_state(vm, SENDER_VCPU_ID);
+
+	while (true) {
+		r = _vcpu_run(vm, SENDER_VCPU_ID);
+		TEST_ASSERT(!r, "vcpu_run failed: %d\n", r);
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+			    "unexpected exit reason: %u (%s)",
+			    run->exit_reason, exit_reason_str(run->exit_reason));
+
+		switch (get_ucall(vm, SENDER_VCPU_ID, &uc)) {
+		case UCALL_SYNC:
+			TEST_ASSERT(uc.args[1] == stage,
+				    "Unexpected stage: %ld (%d expected)\n",
+				    uc.args[1], stage);
+			break;
+		case UCALL_ABORT:
+			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
+				  __FILE__, uc.args[1]);
+			return 1;
+		case UCALL_DONE:
+			goto done;
+		}
+
+		stage++;
+	}
+done:
+	cancel_join_vcpu_thread(threads[0], RECEIVER_VCPU_ID_1);
+	cancel_join_vcpu_thread(threads[1], RECEIVER_VCPU_ID_2);
+	kvm_vm_free(vm);
+
+	return 0;
+}
-- 
2.35.1



* [PATCH v2 23/31] KVM: selftests: Make it possible to replace PTEs with __virt_pg_map()
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (21 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 22/31] KVM: selftests: Hyper-V PV IPI selftest Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 24/31] KVM: selftests: Hyper-V PV TLB flush selftest Vitaly Kuznetsov
                   ` (7 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

__virt_pg_map() assumes the leaf PTE is not present. This is not
suitable if a test wants to replace an already present PTE. The
Hyper-V PV TLB flush test is going to need exactly that.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
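Usage sketch (illustration only; 'gva'/'new_gpa' are placeholders): the
TLB flush selftest will re-point an already-present 4K PTE like this:

	/* Replace the existing mapping of 'gva' with a different page. */
	__virt_pg_map(vm, gva, new_gpa, X86_PAGE_SIZE_4K, true);
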
 tools/testing/selftests/kvm/include/x86_64/processor.h | 2 +-
 tools/testing/selftests/kvm/lib/x86_64/processor.c     | 6 +++---
 tools/testing/selftests/kvm/max_guest_memory_test.c    | 2 +-
 tools/testing/selftests/kvm/x86_64/mmu_role_test.c     | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 9ad7602a257b..c20b18d05119 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -473,7 +473,7 @@ enum x86_page_size {
 	X86_PAGE_SIZE_1G,
 };
 void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
-		   enum x86_page_size page_size);
+		   enum x86_page_size page_size, bool replace);
 
 /*
  * Basic CPU control in CR0
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 9f000dfb5594..20df3e84d777 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -229,7 +229,7 @@ static struct pageUpperEntry *virt_create_upper_pte(struct kvm_vm *vm,
 }
 
 void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
-		   enum x86_page_size page_size)
+		   enum x86_page_size page_size, bool replace)
 {
 	const uint64_t pg_size = 1ull << ((page_size * 9) + 12);
 	struct pageUpperEntry *pml4e, *pdpe, *pde;
@@ -270,7 +270,7 @@ void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 
 	/* Fill in page table entry. */
 	pte = virt_get_pte(vm, pde->pfn, vaddr, 0);
-	TEST_ASSERT(!pte->present,
+	TEST_ASSERT(replace || !pte->present,
 		    "PTE already present for 4k page at vaddr: 0x%lx\n", vaddr);
 	pte->pfn = paddr >> vm->page_shift;
 	pte->writable = true;
@@ -279,7 +279,7 @@ void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
 
 void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
 {
-	__virt_pg_map(vm, vaddr, paddr, X86_PAGE_SIZE_4K);
+	__virt_pg_map(vm, vaddr, paddr, X86_PAGE_SIZE_4K, false);
 }
 
 static struct pageTableEntry *_vm_get_page_table_entry(struct kvm_vm *vm, int vcpuid,
diff --git a/tools/testing/selftests/kvm/max_guest_memory_test.c b/tools/testing/selftests/kvm/max_guest_memory_test.c
index 3875c4b23a04..437f77633b0e 100644
--- a/tools/testing/selftests/kvm/max_guest_memory_test.c
+++ b/tools/testing/selftests/kvm/max_guest_memory_test.c
@@ -244,7 +244,7 @@ int main(int argc, char *argv[])
 #ifdef __x86_64__
 		/* Identity map memory in the guest using 1gb pages. */
 		for (i = 0; i < slot_size; i += size_1gb)
-			__virt_pg_map(vm, gpa + i, gpa + i, X86_PAGE_SIZE_1G);
+			__virt_pg_map(vm, gpa + i, gpa + i, X86_PAGE_SIZE_1G, false);
 #else
 		for (i = 0; i < slot_size; i += vm_get_page_size(vm))
 			virt_pg_map(vm, gpa + i, gpa + i);
diff --git a/tools/testing/selftests/kvm/x86_64/mmu_role_test.c b/tools/testing/selftests/kvm/x86_64/mmu_role_test.c
index da2325fcad87..e3fdf320b9f4 100644
--- a/tools/testing/selftests/kvm/x86_64/mmu_role_test.c
+++ b/tools/testing/selftests/kvm/x86_64/mmu_role_test.c
@@ -35,7 +35,7 @@ static void mmu_role_test(u32 *cpuid_reg, u32 evil_cpuid_val)
 	run = vcpu_state(vm, VCPU_ID);
 
 	/* Map 1gb page without a backing memlot. */
-	__virt_pg_map(vm, MMIO_GPA, MMIO_GPA, X86_PAGE_SIZE_1G);
+	__virt_pg_map(vm, MMIO_GPA, MMIO_GPA, X86_PAGE_SIZE_1G, false);
 
 	r = _vcpu_run(vm, VCPU_ID);
 
-- 
2.35.1



* [PATCH v2 24/31] KVM: selftests: Hyper-V PV TLB flush selftest
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (22 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 23/31] KVM: selftests: Make it possible to replace PTEs with __virt_pg_map() Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 25/31] KVM: selftests: Sync 'struct hv_enlightened_vmcs' definition with hyperv-tlfs.h Vitaly Kuznetsov
                   ` (6 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Introduce a selftest for Hyper-V PV TLB flush hypercalls
(HvFlushVirtualAddressSpace/HvFlushVirtualAddressSpaceEx,
HvFlushVirtualAddressList/HvFlushVirtualAddressListEx).

The test creates one 'sender' vCPU and two 'worker' vCPUs which busy
loop reading from a certain GVA, checking the observed value. The
sender vCPU drops to the host to swap the data page with another page
filled with a different value; the expectation for the workers is
altered accordingly. Without a TLB flush on the worker vCPUs, they may
continue to observe the old value. To guard against accidental TLB
flushes on the worker vCPUs, the test is repeated 100 times.

Hyper-V TLB flush hypercalls are tested in both 'normal' and 'XMM
fast' modes.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/hyperv.h     |   1 +
 .../selftests/kvm/x86_64/hyperv_tlb_flush.c   | 647 ++++++++++++++++++
 4 files changed, 650 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 143fd0f00c9d..468c07a11e76 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -24,6 +24,7 @@
 /x86_64/hyperv_features
 /x86_64/hyperv_ipi
 /x86_64/hyperv_svm_test
+/x86_64/hyperv_tlb_flush
 /x86_64/mmio_warning_test
 /x86_64/mmu_role_test
 /x86_64/platform_info_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 2e84a8a8c0c9..c3ba9505b368 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -54,6 +54,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_features
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_ipi
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_svm_test
+TEST_GEN_PROGS_x86_64 += x86_64/hyperv_tlb_flush
 TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test
 TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
 TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/hyperv.h b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
index f51d6fab8e93..1e34dd7c5075 100644
--- a/tools/testing/selftests/kvm/include/x86_64/hyperv.h
+++ b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
@@ -185,6 +185,7 @@
 /* hypercall options */
 #define HV_HYPERCALL_FAST_BIT		BIT(16)
 #define HV_HYPERCALL_VARHEAD_OFFSET	17
+#define HV_HYPERCALL_REP_COMP_OFFSET	32
 
 #define HYPERV_LINUX_OS_ID ((u64)0x8100 << 48)
 
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c b/tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c
new file mode 100644
index 000000000000..00bcae45ddd2
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c
@@ -0,0 +1,647 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hyper-V HvFlushVirtualAddress{List,Space}{,Ex} tests
+ *
+ * Copyright (C) 2022, Red Hat, Inc.
+ *
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <pthread.h>
+#include <inttypes.h>
+
+#include "kvm_util.h"
+#include "hyperv.h"
+#include "processor.h"
+#include "test_util.h"
+#include "vmx.h"
+
+#define SENDER_VCPU_ID   1
+#define WORKER_VCPU_ID_1 2
+#define WORKER_VCPU_ID_2 65
+
+#define NTRY 100
+
+struct thread_params {
+	struct kvm_vm *vm;
+	uint32_t vcpu_id;
+};
+
+struct hv_vpset {
+	u64 format;
+	u64 valid_bank_mask;
+	u64 bank_contents[];
+};
+
+enum HV_GENERIC_SET_FORMAT {
+	HV_GENERIC_SET_SPARSE_4K,
+	HV_GENERIC_SET_ALL,
+};
+
+#define HV_FLUSH_ALL_PROCESSORS			BIT(0)
+#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES	BIT(1)
+#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY	BIT(2)
+#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT	BIT(3)
+
+/* HvFlushVirtualAddressSpace, HvFlushVirtualAddressList hypercalls */
+struct hv_tlb_flush {
+	u64 address_space;
+	u64 flags;
+	u64 processor_mask;
+	u64 gva_list[];
+} __packed;
+
+/* HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressListEx hypercalls */
+struct hv_tlb_flush_ex {
+	u64 address_space;
+	u64 flags;
+	struct hv_vpset hv_vp_set;
+	u64 gva_list[];
+} __packed;
+
+static inline void hv_init(vm_vaddr_t pgs_gpa)
+{
+	wrmsr(HV_X64_MSR_GUEST_OS_ID, HYPERV_LINUX_OS_ID);
+	wrmsr(HV_X64_MSR_HYPERCALL, pgs_gpa);
+}
+
+static void worker_code(void *test_pages, vm_vaddr_t pgs_gpa)
+{
+	u32 vcpu_id = rdmsr(HV_X64_MSR_VP_INDEX);
+	unsigned char chr;
+
+	x2apic_enable();
+	hv_init(pgs_gpa);
+
+	for (;;) {
+		chr = READ_ONCE(*(unsigned char *)(test_pages + 4096 * 2 + vcpu_id));
+		if (chr)
+			GUEST_ASSERT(*(unsigned char *)test_pages == chr);
+		asm volatile("nop");
+	}
+}
+
+static inline u64 hypercall(u64 control, vm_vaddr_t arg1, vm_vaddr_t arg2)
+{
+	u64 hv_status;
+
+	asm volatile("mov %3, %%r8\n"
+		     "vmcall"
+		     : "=a" (hv_status),
+		       "+c" (control), "+d" (arg1)
+		     :  "r" (arg2)
+		     : "cc", "memory", "r8", "r9", "r10", "r11");
+
+	return hv_status;
+}
+
+static inline void nop_loop(void)
+{
+	int i;
+
+	for (i = 0; i < 10000000; i++)
+		asm volatile("nop");
+}
+
+static inline void sync_to_xmm(void *data)
+{
+	int i;
+
+	for (i = 0; i < 8; i++)
+		write_sse_reg(i, (sse128_t *)(data + sizeof(sse128_t) * i));
+}
+
+static void set_expected_char(void *addr, unsigned char chr, int vcpu_id)
+{
+	asm volatile("mfence");
+	*(unsigned char *)(addr + 2 * 4096 + vcpu_id) = chr;
+}
+
+static void sender_guest_code(void *hcall_page, void *test_pages, vm_vaddr_t pgs_gpa)
+{
+	struct hv_tlb_flush *flush = (struct hv_tlb_flush *)hcall_page;
+	struct hv_tlb_flush_ex *flush_ex = (struct hv_tlb_flush_ex *)hcall_page;
+	int stage = 1, i;
+	u64 res;
+
+	hv_init(pgs_gpa);
+
+	/* "Slow" hypercalls */
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE for WORKER_VCPU_ID_1 */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush->processor_mask = BIT(WORKER_VCPU_ID_1);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE, pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST for WORKER_VCPU_ID_1 */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush->processor_mask = BIT(WORKER_VCPU_ID_1);
+		flush->gva_list[0] = (u64)test_pages;
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE for HV_FLUSH_ALL_PROCESSORS */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES | HV_FLUSH_ALL_PROCESSORS;
+		flush->processor_mask = 0;
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE, pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST for HV_FLUSH_ALL_PROCESSORS */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES | HV_FLUSH_ALL_PROCESSORS;
+		flush->gva_list[0] = (u64)test_pages;
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX for WORKER_VCPU_ID_2 */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+		flush_ex->hv_vp_set.valid_bank_mask = BIT_ULL(WORKER_VCPU_ID_2 / 64);
+		flush_ex->hv_vp_set.bank_contents[0] = BIT_ULL(WORKER_VCPU_ID_2 % 64);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX |
+				(1 << HV_HYPERCALL_VARHEAD_OFFSET),
+				pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX for WORKER_VCPU_ID_2 */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+		flush_ex->hv_vp_set.valid_bank_mask = BIT_ULL(WORKER_VCPU_ID_2 / 64);
+		flush_ex->hv_vp_set.bank_contents[0] = BIT_ULL(WORKER_VCPU_ID_2 % 64);
+		/* bank_contents and gva_list occupy the same space, thus [1] */
+		flush_ex->gva_list[1] = (u64)test_pages;
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX |
+				(1 << HV_HYPERCALL_VARHEAD_OFFSET) |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX for both vCPUs */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+		flush_ex->hv_vp_set.valid_bank_mask = BIT_ULL(WORKER_VCPU_ID_2 / 64) |
+			BIT_ULL(WORKER_VCPU_ID_1 / 64);
+		flush_ex->hv_vp_set.bank_contents[0] = BIT_ULL(WORKER_VCPU_ID_1 % 64);
+		flush_ex->hv_vp_set.bank_contents[1] = BIT_ULL(WORKER_VCPU_ID_2 % 64);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX |
+				(2 << HV_HYPERCALL_VARHEAD_OFFSET),
+				pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX for both vCPUs */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+		flush_ex->hv_vp_set.valid_bank_mask = BIT_ULL(WORKER_VCPU_ID_1 / 64) |
+			BIT_ULL(WORKER_VCPU_ID_2 / 64);
+		flush_ex->hv_vp_set.bank_contents[0] = BIT_ULL(WORKER_VCPU_ID_1 % 64);
+		flush_ex->hv_vp_set.bank_contents[1] = BIT_ULL(WORKER_VCPU_ID_2 % 64);
+		/* bank_contents and gva_list occupy the same space, thus [2] */
+		flush_ex->gva_list[2] = (u64)test_pages;
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX |
+				(2 << HV_HYPERCALL_VARHEAD_OFFSET) |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX for HV_GENERIC_SET_ALL */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_ALL;
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX,
+				pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX for HV_GENERIC_SET_ALL */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_ALL;
+		flush_ex->gva_list[0] = (u64)test_pages;
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				pgs_gpa, pgs_gpa + 4096);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* "Fast" hypercalls */
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE for WORKER_VCPU_ID_1 */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush->processor_mask = BIT(WORKER_VCPU_ID_1);
+		sync_to_xmm(&flush->processor_mask);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE |
+				HV_HYPERCALL_FAST_BIT, 0x0, HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST for WORKER_VCPU_ID_1 */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush->processor_mask = BIT(WORKER_VCPU_ID_1);
+		flush->gva_list[0] = (u64)test_pages;
+		sync_to_xmm(&flush->processor_mask);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST | HV_HYPERCALL_FAST_BIT |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				0x0, HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE for HV_FLUSH_ALL_PROCESSORS */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		sync_to_xmm(&flush->processor_mask);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE | HV_HYPERCALL_FAST_BIT, 0x0,
+				HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES | HV_FLUSH_ALL_PROCESSORS);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST for HV_FLUSH_ALL_PROCESSORS */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush->gva_list[0] = (u64)test_pages;
+		sync_to_xmm(&flush->processor_mask);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST | HV_HYPERCALL_FAST_BIT |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET), 0x0,
+				HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES | HV_FLUSH_ALL_PROCESSORS);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX for WORKER_VCPU_ID_2 */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+		flush_ex->hv_vp_set.valid_bank_mask = BIT_ULL(WORKER_VCPU_ID_2 / 64);
+		flush_ex->hv_vp_set.bank_contents[0] = BIT_ULL(WORKER_VCPU_ID_2 % 64);
+		sync_to_xmm(&flush_ex->hv_vp_set);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX | HV_HYPERCALL_FAST_BIT |
+				(1 << HV_HYPERCALL_VARHEAD_OFFSET),
+				0x0, HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX for WORKER_VCPU_ID_2 */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+		flush_ex->hv_vp_set.valid_bank_mask = BIT_ULL(WORKER_VCPU_ID_2 / 64);
+		flush_ex->hv_vp_set.bank_contents[0] = BIT_ULL(WORKER_VCPU_ID_2 % 64);
+		/* bank_contents and gva_list occupy the same space, thus [1] */
+		flush_ex->gva_list[1] = (u64)test_pages;
+		sync_to_xmm(&flush_ex->hv_vp_set);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX | HV_HYPERCALL_FAST_BIT |
+				(1 << HV_HYPERCALL_VARHEAD_OFFSET) |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				0x0, HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX for both vCPUs */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+		flush_ex->hv_vp_set.valid_bank_mask = BIT_ULL(WORKER_VCPU_ID_2 / 64) |
+			BIT_ULL(WORKER_VCPU_ID_1 / 64);
+		flush_ex->hv_vp_set.bank_contents[0] = BIT_ULL(WORKER_VCPU_ID_1 % 64);
+		flush_ex->hv_vp_set.bank_contents[1] = BIT_ULL(WORKER_VCPU_ID_2 % 64);
+		sync_to_xmm(&flush_ex->hv_vp_set);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX | HV_HYPERCALL_FAST_BIT |
+				(2 << HV_HYPERCALL_VARHEAD_OFFSET),
+				0x0, HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX for both vCPUs */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+		flush_ex->hv_vp_set.valid_bank_mask = BIT_ULL(WORKER_VCPU_ID_1 / 64) |
+			BIT_ULL(WORKER_VCPU_ID_2 / 64);
+		flush_ex->hv_vp_set.bank_contents[0] = BIT_ULL(WORKER_VCPU_ID_1 % 64);
+		flush_ex->hv_vp_set.bank_contents[1] = BIT_ULL(WORKER_VCPU_ID_2 % 64);
+		/* bank_contents and gva_list occupy the same space, thus [2] */
+		flush_ex->gva_list[2] = (u64)test_pages;
+		sync_to_xmm(&flush_ex->hv_vp_set);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX | HV_HYPERCALL_FAST_BIT |
+				(2 << HV_HYPERCALL_VARHEAD_OFFSET) |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				0x0, HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX for HV_GENERIC_SET_ALL */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_ALL;
+		sync_to_xmm(&flush_ex->hv_vp_set);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX | HV_HYPERCALL_FAST_BIT,
+				0x0, HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	/* HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX for HV_GENERIC_SET_ALL */
+	for (i = 0; i < NTRY; i++) {
+		memset(hcall_page, 0, 4096);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, 0x0, WORKER_VCPU_ID_2);
+		GUEST_SYNC(stage++);
+		flush_ex->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+		flush_ex->hv_vp_set.format = HV_GENERIC_SET_ALL;
+		flush_ex->gva_list[0] = (u64)test_pages;
+		sync_to_xmm(&flush_ex->hv_vp_set);
+		res = hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX | HV_HYPERCALL_FAST_BIT |
+				(1UL << HV_HYPERCALL_REP_COMP_OFFSET),
+				0x0, HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES);
+		GUEST_ASSERT((res & 0xffff) == 0);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_1);
+		set_expected_char(test_pages, i % 2 ? 0x1 : 0x2, WORKER_VCPU_ID_2);
+		nop_loop();
+	}
+
+	GUEST_DONE();
+}
+
+static void *vcpu_thread(void *arg)
+{
+	struct thread_params *params = (struct thread_params *)arg;
+	struct ucall uc;
+	int old;
+	int r;
+	unsigned int exit_reason;
+
+	r = pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old);
+	TEST_ASSERT(r == 0,
+		    "pthread_setcanceltype failed on vcpu_id=%u with errno=%d",
+		    params->vcpu_id, r);
+
+	vcpu_run(params->vm, params->vcpu_id);
+	exit_reason = vcpu_state(params->vm, params->vcpu_id)->exit_reason;
+
+	TEST_ASSERT(exit_reason == KVM_EXIT_IO,
+		    "vCPU %u exited with unexpected exit reason %u-%s, expected KVM_EXIT_IO",
+		    params->vcpu_id, exit_reason, exit_reason_str(exit_reason));
+
+	if (get_ucall(params->vm, params->vcpu_id, &uc) == UCALL_ABORT) {
+		TEST_ASSERT(false,
+			    "vCPU %u exited with error: %s.\n",
+			    params->vcpu_id, (const char *)uc.args[0]);
+	}
+
+	return NULL;
+}
+
+static void cancel_join_vcpu_thread(pthread_t thread, uint32_t vcpu_id)
+{
+	void *retval;
+	int r;
+
+	r = pthread_cancel(thread);
+	TEST_ASSERT(r == 0,
+		    "pthread_cancel on vcpu_id=%d failed with errno=%d",
+		    vcpu_id, r);
+
+	r = pthread_join(thread, &retval);
+	TEST_ASSERT(r == 0,
+		    "pthread_join on vcpu_id=%d failed with errno=%d",
+		    vcpu_id, r);
+	TEST_ASSERT(retval == PTHREAD_CANCELED,
+		    "expected retval=%p, got %p", PTHREAD_CANCELED,
+		    retval);
+}
+
+int main(int argc, char *argv[])
+{
+	int r;
+	pthread_t threads[2];
+	struct thread_params params[2];
+	struct kvm_vm *vm;
+	struct kvm_run *run;
+	vm_vaddr_t hcall_page, test_pages;
+	struct ucall uc;
+	int stage = 1;
+
+	vm = vm_create_default(SENDER_VCPU_ID, 0, sender_guest_code);
+	params[0].vm = vm;
+	params[1].vm = vm;
+
+	/* Hypercall input/output */
+	hcall_page = vm_vaddr_alloc_pages(vm, 2);
+	memset(addr_gva2hva(vm, hcall_page), 0x0, 2 * getpagesize());
+
+	/*
+	 * Test pages: the first one is filled with '0x1's, the second with
+	 * '0x2's and the test will swap their mappings. The third page tracks
+	 * the value each worker vCPU is currently expected to observe.
+	 */
+	test_pages = vm_vaddr_alloc_pages(vm, 3);
+	memset(addr_gva2hva(vm, test_pages), 0x1, 4096);
+	memset(addr_gva2hva(vm, test_pages) + 4096, 0x2, 4096);
+	set_expected_char(addr_gva2hva(vm, test_pages), 0x0, WORKER_VCPU_ID_1);
+	set_expected_char(addr_gva2hva(vm, test_pages), 0x0, WORKER_VCPU_ID_2);
+
+	vm_vcpu_add_default(vm, WORKER_VCPU_ID_1, worker_code);
+	vcpu_args_set(vm, WORKER_VCPU_ID_1, 2, test_pages, addr_gva2gpa(vm, hcall_page));
+	vcpu_set_msr(vm, WORKER_VCPU_ID_1, HV_X64_MSR_VP_INDEX, WORKER_VCPU_ID_1);
+	vcpu_set_hv_cpuid(vm, WORKER_VCPU_ID_1);
+
+	vm_vcpu_add_default(vm, WORKER_VCPU_ID_2, worker_code);
+	vcpu_args_set(vm, WORKER_VCPU_ID_2, 2, test_pages, addr_gva2gpa(vm, hcall_page));
+	vcpu_set_msr(vm, WORKER_VCPU_ID_2, HV_X64_MSR_VP_INDEX, WORKER_VCPU_ID_2);
+	vcpu_set_hv_cpuid(vm, WORKER_VCPU_ID_2);
+
+	vcpu_args_set(vm, SENDER_VCPU_ID, 3, hcall_page, test_pages,
+		      addr_gva2gpa(vm, hcall_page));
+	vcpu_set_hv_cpuid(vm, SENDER_VCPU_ID);
+
+	params[0].vcpu_id = WORKER_VCPU_ID_1;
+	r = pthread_create(&threads[0], NULL, vcpu_thread, &params[0]);
+	TEST_ASSERT(r == 0,
+		    "pthread_create worker failed errno=%d", errno);
+
+	params[1].vcpu_id = WORKER_VCPU_ID_2;
+	r = pthread_create(&threads[1], NULL, vcpu_thread, &params[1]);
+	TEST_ASSERT(r == 0,
+		    "pthread_create worker failed errno=%d", errno);
+
+	run = vcpu_state(vm, SENDER_VCPU_ID);
+
+	while (true) {
+		r = _vcpu_run(vm, SENDER_VCPU_ID);
+		TEST_ASSERT(!r, "vcpu_run failed: %d\n", r);
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+			    "unexpected exit reason: %u (%s)",
+			    run->exit_reason, exit_reason_str(run->exit_reason));
+
+		switch (get_ucall(vm, SENDER_VCPU_ID, &uc)) {
+		case UCALL_SYNC:
+			TEST_ASSERT(uc.args[1] == stage,
+				    "Unexpected stage: %ld (%d expected)\n",
+				    uc.args[1], stage);
+			break;
+		case UCALL_ABORT:
+			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
+				  __FILE__, uc.args[1]);
+			return 1;
+		case UCALL_DONE:
+			return 0;
+		}
+
+		/* Swap test pages */
+		if (stage % 2) {
+			__virt_pg_map(vm, test_pages, addr_gva2gpa(vm, test_pages) + 4096,
+				      X86_PAGE_SIZE_4K, true);
+			__virt_pg_map(vm, test_pages + 4096, addr_gva2gpa(vm, test_pages) - 4096,
+				      X86_PAGE_SIZE_4K, true);
+		} else {
+			__virt_pg_map(vm, test_pages, addr_gva2gpa(vm, test_pages) - 4096,
+				      X86_PAGE_SIZE_4K, true);
+			__virt_pg_map(vm, test_pages + 4096, addr_gva2gpa(vm, test_pages) + 4096,
+				      X86_PAGE_SIZE_4K, true);
+		}
+
+		stage++;
+	}
+
+	cancel_join_vcpu_thread(threads[0], WORKER_VCPU_ID_1);
+	cancel_join_vcpu_thread(threads[1], WORKER_VCPU_ID_2);
+	kvm_vm_free(vm);
+
+	return 0;
+}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 25/31] KVM: selftests: Sync 'struct hv_enlightened_vmcs' definition with hyperv-tlfs.h
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (23 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 24/31] KVM: selftests: Hyper-V PV TLB flush selftest Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 26/31] KVM: selftests: nVMX: Allocate Hyper-V partition assist page Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

The 'struct hv_enlightened_vmcs' definition in selftests is not '__packed',
so we rely on the compiler doing the right padding implicitly. This is
fragile and non-obvious, so it is beneficial to use the same definition
as the kernel.
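
A minimal illustration of the pitfall (hypothetical fields, not the
real structure):

	struct implicit {
		u16 sel;	/* compiler inserts 6 bytes of padding here */
		u64 pat;
	};

	struct explicit {
		u16 sel;
		u16 padding16[3];	/* padding spelled out explicitly... */
		u64 pat;
	} __packed;			/* ...and not left to the compiler */

Both layouts happen to match today, but only the explicit one is
guaranteed to stay in sync with the TLFS-mandated layout.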

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/include/x86_64/evmcs.h | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
index cc5d14a45702..b6067b555110 100644
--- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -41,6 +41,8 @@ struct hv_enlightened_vmcs {
 	u16 host_gs_selector;
 	u16 host_tr_selector;
 
+	u16 padding16_1;
+
 	u64 host_ia32_pat;
 	u64 host_ia32_efer;
 
@@ -159,7 +161,7 @@ struct hv_enlightened_vmcs {
 	u64 ept_pointer;
 
 	u16 virtual_processor_id;
-	u16 padding16[3];
+	u16 padding16_2[3];
 
 	u64 padding64_2[5];
 	u64 guest_physical_address;
@@ -195,15 +197,15 @@ struct hv_enlightened_vmcs {
 	u64 guest_rip;
 
 	u32 hv_clean_fields;
-	u32 hv_padding_32;
+	u32 padding32_1;
 	u32 hv_synthetic_controls;
 	struct {
 		u32 nested_flush_hypercall:1;
 		u32 msr_bitmap:1;
 		u32 reserved:30;
-	} hv_enlightenments_control;
+	}  __packed hv_enlightenments_control;
 	u32 hv_vp_id;
-
+	u32 padding32_2;
 	u64 hv_vm_id;
 	u64 partition_assist_page;
 	u64 padding64_4[4];
@@ -211,7 +213,7 @@ struct hv_enlightened_vmcs {
 	u64 padding64_5[7];
 	u64 xss_exit_bitmap;
 	u64 padding64_6[7];
-};
+} __packed;
 
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE                     0
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_IO_BITMAP                BIT(0)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 26/31] KVM: selftests: nVMX: Allocate Hyper-V partition assist page
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (24 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 25/31] KVM: selftests: Sync 'struct hv_enlightened_vmcs' definition with hyperv-tlfs.h Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 27/31] KVM: selftests: nSVM: Allocate Hyper-V partition assist and VP assist pages Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

In preparation for testing Hyper-V Direct TLB flush hypercalls, allocate
the so-called Partition assist page and link it to 'struct vmx_pages'.
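
A subsequent patch wires the page into the enlightened VMCS, roughly
(a sketch; see the evmcs_test changes later in the series):

	current_evmcs->partition_assist_page = vmx_pages->partition_assist_gpa;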

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/include/x86_64/vmx.h | 4 ++++
 tools/testing/selftests/kvm/lib/x86_64/vmx.c     | 7 +++++++
 2 files changed, 11 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index 583ceb0d1457..f99922ca8259 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -567,6 +567,10 @@ struct vmx_pages {
 	uint64_t enlightened_vmcs_gpa;
 	void *enlightened_vmcs;
 
+	void *partition_assist_hva;
+	uint64_t partition_assist_gpa;
+	void *partition_assist;
+
 	void *eptp_hva;
 	uint64_t eptp_gpa;
 	void *eptp;
diff --git a/tools/testing/selftests/kvm/lib/x86_64/vmx.c b/tools/testing/selftests/kvm/lib/x86_64/vmx.c
index d089d8b850b5..3db21e0e1a8f 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/vmx.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/vmx.c
@@ -124,6 +124,13 @@ vcpu_alloc_vmx(struct kvm_vm *vm, vm_vaddr_t *p_vmx_gva)
 	vmx->enlightened_vmcs_gpa =
 		addr_gva2gpa(vm, (uintptr_t)vmx->enlightened_vmcs);
 
+	/* Set up a region of guest memory for the partition assist page. */
+	vmx->partition_assist = (void *)vm_vaddr_alloc_page(vm);
+	vmx->partition_assist_hva =
+		addr_gva2hva(vm, (uintptr_t)vmx->partition_assist);
+	vmx->partition_assist_gpa =
+		addr_gva2gpa(vm, (uintptr_t)vmx->partition_assist);
+
 	*p_vmx_gva = vmx_gva;
 	return vmx;
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 27/31] KVM: selftests: nSVM: Allocate Hyper-V partition assist and VP assist pages
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (25 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 26/31] KVM: selftests: nVMX: Allocate Hyper-V partition assist page Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 28/31] KVM: selftests: Sync 'struct hv_vp_assist_page' definition with hyperv-tlfs.h Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

In preparation for testing Hyper-V Direct TLB flush hypercalls, allocate
the VP assist and Partition assist pages and link them to 'struct
svm_test_data'.
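
A subsequent patch consumes the new fields roughly as follows (a
sketch; see the hyperv_svm_test changes later in the series):

	enable_vp_assist(svm->vp_assist_gpa, svm->vp_assist);
	hve->partition_assist_page = svm->partition_assist_gpa;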

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/include/x86_64/svm_util.h | 10 ++++++++++
 tools/testing/selftests/kvm/lib/x86_64/svm.c          | 10 ++++++++++
 2 files changed, 20 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
index a25aabd8f5e7..640859b58fd6 100644
--- a/tools/testing/selftests/kvm/include/x86_64/svm_util.h
+++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
@@ -34,6 +34,16 @@ struct svm_test_data {
 	void *msr; /* gva */
 	void *msr_hva;
 	uint64_t msr_gpa;
+
+	/* Hyper-V VP assist page */
+	void *vp_assist; /* gva */
+	void *vp_assist_hva;
+	uint64_t vp_assist_gpa;
+
+	/* Hyper-V Partition assist page */
+	void *partition_assist; /* gva */
+	void *partition_assist_hva;
+	uint64_t partition_assist_gpa;
 };
 
 struct svm_test_data *vcpu_alloc_svm(struct kvm_vm *vm, vm_vaddr_t *p_svm_gva);
diff --git a/tools/testing/selftests/kvm/lib/x86_64/svm.c b/tools/testing/selftests/kvm/lib/x86_64/svm.c
index 736ee4a23df6..c284e8f87f5c 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/svm.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/svm.c
@@ -48,6 +48,16 @@ vcpu_alloc_svm(struct kvm_vm *vm, vm_vaddr_t *p_svm_gva)
 	svm->msr_gpa = addr_gva2gpa(vm, (uintptr_t)svm->msr);
 	memset(svm->msr_hva, 0, getpagesize());
 
+	svm->vp_assist = (void *)vm_vaddr_alloc_page(vm);
+	svm->vp_assist_hva = addr_gva2hva(vm, (uintptr_t)svm->vp_assist);
+	svm->vp_assist_gpa = addr_gva2gpa(vm, (uintptr_t)svm->vp_assist);
+	memset(svm->vp_assist_hva, 0, getpagesize());
+
+	svm->partition_assist = (void *)vm_vaddr_alloc_page(vm);
+	svm->partition_assist_hva = addr_gva2hva(vm, (uintptr_t)svm->partition_assist);
+	svm->partition_assist_gpa = addr_gva2gpa(vm, (uintptr_t)svm->partition_assist);
+	memset(svm->partition_assist_hva, 0, getpagesize());
+
 	*p_svm_gva = svm_gva;
 	return svm;
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 28/31] KVM: selftests: Sync 'struct hv_vp_assist_page' definition with hyperv-tlfs.h
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (26 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 27/31] KVM: selftests: nSVM: Allocate Hyper-V partition assist and VP assist pages Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 29/31] KVM: selftests: evmcs_test: Direct TLB flush test Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

The 'struct hv_vp_assist_page' definition doesn't match the TLFS. Also,
define 'struct hv_nested_enlightenments_control' and use it instead of
an opaque '__u64' array.
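
With the typed definition, enabling the Direct TLB flush enlightenment
later in the series becomes self-describing (a sketch):

	current_vp_assist->nested_control.features.directhypercall = 1;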

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../selftests/kvm/include/x86_64/evmcs.h      | 22 ++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
index b6067b555110..9c965ba73dec 100644
--- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -20,14 +20,26 @@
 
 extern bool enable_evmcs;
 
+struct hv_nested_enlightenments_control {
+	struct {
+		__u32 directhypercall:1;
+		__u32 reserved:31;
+	} features;
+	struct {
+		__u32 reserved;
+	} hypercallControls;
+} __packed;
+
+/* Define virtual processor assist page structure. */
 struct hv_vp_assist_page {
 	__u32 apic_assist;
-	__u32 reserved;
-	__u64 vtl_control[2];
-	__u64 nested_enlightenments_control[2];
-	__u32 enlighten_vmentry;
+	__u32 reserved1;
+	__u64 vtl_control[3];
+	struct hv_nested_enlightenments_control nested_control;
+	__u8 enlighten_vmentry;
+	__u8 reserved2[7];
 	__u64 current_nested_vmcs;
-};
+} __packed;
 
 struct hv_enlightened_vmcs {
 	u32 revision_id;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 29/31] KVM: selftests: evmcs_test: Direct TLB flush test
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (27 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 28/31] KVM: selftests: Sync 'struct hv_vp_assist_page' definition with hyperv-tlfs.h Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 30/31] KVM: selftests: Move Hyper-V VP assist page enablement out of evmcs.h Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 31/31] KVM: selftests: hyperv_svm_test: Add Direct TLB flush test Vitaly Kuznetsov
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Enable Hyper-V Direct TLB flush and check that Hyper-V TLB flush
hypercalls from L2 don't exit to L1 unless 'TlbLockCount' is set in the
Partition assist page.
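
'TlbLockCount' is the first u32 of the Partition assist page, so the
test toggles it directly (a sketch matching the code below):

	*(u32 *)partition_assist = 0;	/* L0 handles the flush, no exit to L1 */
	*(u32 *)partition_assist = 1;	/* the next flush triggers the synthetic
					 * TRAP_AFTER_FLUSH exit to L1
					 */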

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../selftests/kvm/include/x86_64/evmcs.h      |  2 +
 .../testing/selftests/kvm/x86_64/evmcs_test.c | 52 ++++++++++++++++++-
 2 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
index 9c965ba73dec..36c0a67d8602 100644
--- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -252,6 +252,8 @@ struct hv_enlightened_vmcs {
 #define HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_MASK	\
 		(~((1ull << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) - 1))
 
+#define HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH 0x10000031
+
 extern struct hv_enlightened_vmcs *current_evmcs;
 extern struct hv_vp_assist_page *current_vp_assist;
 
diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
index d12e043aa2ee..411c4dbeac09 100644
--- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
@@ -16,6 +16,7 @@
 
 #include "kvm_util.h"
 
+#include "hyperv.h"
 #include "vmx.h"
 
 #define VCPU_ID		5
@@ -49,6 +50,16 @@ static inline void rdmsr_gs_base(void)
 			      "r13", "r14", "r15");
 }
 
+static inline void hypercall(u64 control, vm_vaddr_t arg1, vm_vaddr_t arg2)
+{
+	asm volatile("mov %3, %%r8\n"
+		     "vmcall"
+		     : "+c" (control), "+d" (arg1)
+		     :  "r" (arg2)
+		     : "cc", "memory", "rax", "rbx", "r8", "r9", "r10",
+		       "r11", "r12", "r13", "r14", "r15");
+}
+
 void l2_guest_code(void)
 {
 	GUEST_SYNC(7);
@@ -67,15 +78,27 @@ void l2_guest_code(void)
 	vmcall();
 	rdmsr_gs_base(); /* intercepted */
 
+	/* Direct TLB flush tests */
+	hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE | HV_HYPERCALL_FAST_BIT, 0x0,
+		  HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES | HV_FLUSH_ALL_PROCESSORS);
+	rdmsr_fs_base();
+	hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE | HV_HYPERCALL_FAST_BIT, 0x0,
+		  HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES | HV_FLUSH_ALL_PROCESSORS);
+	/* Make sure we're not issuing the Hyper-V TLB flush hypercall again */
+	__asm__ __volatile__ ("mov $0xdeadbeef, %rcx");
+
 	/* Done, exit to L1 and never come back.  */
 	vmcall();
 }
 
-void guest_code(struct vmx_pages *vmx_pages)
+void guest_code(struct vmx_pages *vmx_pages, vm_vaddr_t pgs_gpa)
 {
 #define L2_GUEST_STACK_SIZE 64
 	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
 
+	wrmsr(HV_X64_MSR_GUEST_OS_ID, HYPERV_LINUX_OS_ID);
+	wrmsr(HV_X64_MSR_HYPERCALL, pgs_gpa);
+
 	x2apic_enable();
 
 	GUEST_SYNC(1);
@@ -105,6 +128,14 @@ void guest_code(struct vmx_pages *vmx_pages)
 	vmwrite(PIN_BASED_VM_EXEC_CONTROL, vmreadz(PIN_BASED_VM_EXEC_CONTROL) |
 		PIN_BASED_NMI_EXITING);
 
+	/* Direct TLB flush setup */
+	current_evmcs->partition_assist_page = vmx_pages->partition_assist_gpa;
+	current_evmcs->hv_enlightenments_control.nested_flush_hypercall = 1;
+	current_evmcs->hv_vm_id = 1;
+	current_evmcs->hv_vp_id = 1;
+	current_vp_assist->nested_control.features.directhypercall = 1;
+	*(u32 *)(vmx_pages->partition_assist) = 0;
+
 	GUEST_ASSERT(!vmlaunch());
 	GUEST_ASSERT(vmptrstz() == vmx_pages->enlightened_vmcs_gpa);
 
@@ -149,6 +180,18 @@ void guest_code(struct vmx_pages *vmx_pages)
 	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_MSR_READ);
 	current_evmcs->guest_rip += 2; /* rdmsr */
 
+	/*
+	 * Direct TLB flush test. The first VMCALL should be handled directly
+	 * by L0, no VMCALL exit to L1 expected.
+	 */
+	GUEST_ASSERT(!vmresume());
+	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_MSR_READ);
+	current_evmcs->guest_rip += 2; /* rdmsr */
+	/* Enable synthetic vmexit */
+	*(u32 *)(vmx_pages->partition_assist) = 1;
+	GUEST_ASSERT(!vmresume());
+	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH);
+
 	GUEST_ASSERT(!vmresume());
 	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
 	GUEST_SYNC(11);
@@ -201,6 +244,7 @@ static void save_restore_vm(struct kvm_vm *vm)
 int main(int argc, char *argv[])
 {
 	vm_vaddr_t vmx_pages_gva = 0;
+	vm_vaddr_t hcall_page;
 
 	struct kvm_vm *vm;
 	struct kvm_run *run;
@@ -217,11 +261,15 @@ int main(int argc, char *argv[])
 		exit(KSFT_SKIP);
 	}
 
+	hcall_page = vm_vaddr_alloc_pages(vm, 1);
+	memset(addr_gva2hva(vm, hcall_page), 0x0,  getpagesize());
+
 	vcpu_set_hv_cpuid(vm, VCPU_ID);
 	vcpu_enable_evmcs(vm, VCPU_ID);
 
 	vcpu_alloc_vmx(vm, &vmx_pages_gva);
-	vcpu_args_set(vm, VCPU_ID, 1, vmx_pages_gva);
+	vcpu_args_set(vm, VCPU_ID, 2, vmx_pages_gva, addr_gva2gpa(vm, hcall_page));
+	vcpu_set_msr(vm, VCPU_ID, HV_X64_MSR_VP_INDEX, VCPU_ID);
 
 	vm_init_descriptor_tables(vm);
 	vcpu_init_descriptor_tables(vm, VCPU_ID);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 30/31] KVM: selftests: Move Hyper-V VP assist page enablement out of evmcs.h
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (28 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 29/31] KVM: selftests: evmcs_test: Direct TLB flush test Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  2022-04-07 15:56 ` [PATCH v2 31/31] KVM: selftests: hyperv_svm_test: Add Direct TLB flush test Vitaly Kuznetsov
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

The Hyper-V VP assist page is not eVMCS specific; it is also used for
enlightened nSVM. Move the code to a vendor-neutral place.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/Makefile          |  2 +-
 .../selftests/kvm/include/x86_64/evmcs.h      | 40 +------------------
 .../selftests/kvm/include/x86_64/hyperv.h     | 31 ++++++++++++++
 .../testing/selftests/kvm/lib/x86_64/hyperv.c | 21 ++++++++++
 .../testing/selftests/kvm/x86_64/evmcs_test.c |  1 +
 5 files changed, 56 insertions(+), 39 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/hyperv.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index c3ba9505b368..f2e6c4594399 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -38,7 +38,7 @@ ifeq ($(ARCH),riscv)
 endif
 
 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
-LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
+LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/hyperv.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
 LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
index 36c0a67d8602..026586b53013 100644
--- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -10,6 +10,7 @@
 #define SELFTEST_KVM_EVMCS_H
 
 #include <stdint.h>
+#include "hyperv.h"
 #include "vmx.h"
 
 #define u16 uint16_t
@@ -20,27 +21,6 @@
 
 extern bool enable_evmcs;
 
-struct hv_nested_enlightenments_control {
-	struct {
-		__u32 directhypercall:1;
-		__u32 reserved:31;
-	} features;
-	struct {
-		__u32 reserved;
-	} hypercallControls;
-} __packed;
-
-/* Define virtual processor assist page structure. */
-struct hv_vp_assist_page {
-	__u32 apic_assist;
-	__u32 reserved1;
-	__u64 vtl_control[3];
-	struct hv_nested_enlightenments_control nested_control;
-	__u8 enlighten_vmentry;
-	__u8 reserved2[7];
-	__u64 current_nested_vmcs;
-} __packed;
-
 struct hv_enlightened_vmcs {
 	u32 revision_id;
 	u32 abort;
@@ -246,31 +226,15 @@ struct hv_enlightened_vmcs {
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_ENLIGHTENMENTSCONTROL    BIT(15)
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL                      0xFFFF
 
-#define HV_X64_MSR_VP_ASSIST_PAGE		0x40000073
-#define HV_X64_MSR_VP_ASSIST_PAGE_ENABLE	0x00000001
-#define HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT	12
-#define HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_MASK	\
-		(~((1ull << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) - 1))
-
 #define HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH 0x10000031
 
 extern struct hv_enlightened_vmcs *current_evmcs;
-extern struct hv_vp_assist_page *current_vp_assist;
 
 int vcpu_enable_evmcs(struct kvm_vm *vm, int vcpu_id);
 
-static inline int enable_vp_assist(uint64_t vp_assist_pa, void *vp_assist)
+static inline void evmcs_enable(void)
 {
-	u64 val = (vp_assist_pa & HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_MASK) |
-		HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
-
-	wrmsr(HV_X64_MSR_VP_ASSIST_PAGE, val);
-
-	current_vp_assist = vp_assist;
-
 	enable_evmcs = true;
-
-	return 0;
 }
 
 static inline int evmcs_vmptrld(uint64_t vmcs_pa, void *vmcs)
diff --git a/tools/testing/selftests/kvm/include/x86_64/hyperv.h b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
index 1e34dd7c5075..095c15fc5381 100644
--- a/tools/testing/selftests/kvm/include/x86_64/hyperv.h
+++ b/tools/testing/selftests/kvm/include/x86_64/hyperv.h
@@ -189,4 +189,35 @@
 
 #define HYPERV_LINUX_OS_ID ((u64)0x8100 << 48)
 
+#define HV_X64_MSR_VP_ASSIST_PAGE		0x40000073
+#define HV_X64_MSR_VP_ASSIST_PAGE_ENABLE	0x00000001
+#define HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT	12
+#define HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_MASK	\
+		(~((1ull << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) - 1))
+
+struct hv_nested_enlightenments_control {
+	struct {
+		__u32 directhypercall:1;
+		__u32 reserved:31;
+	} features;
+	struct {
+		__u32 reserved;
+	} hypercallControls;
+} __packed;
+
+/* Define virtual processor assist page structure. */
+struct hv_vp_assist_page {
+	__u32 apic_assist;
+	__u32 reserved1;
+	__u64 vtl_control[3];
+	struct hv_nested_enlightenments_control nested_control;
+	__u8 enlighten_vmentry;
+	__u8 reserved2[7];
+	__u64 current_nested_vmcs;
+} __packed;
+
+extern struct hv_vp_assist_page *current_vp_assist;
+
+int enable_vp_assist(uint64_t vp_assist_pa, void *vp_assist);
+
 #endif /* !SELFTEST_KVM_HYPERV_H */
diff --git a/tools/testing/selftests/kvm/lib/x86_64/hyperv.c b/tools/testing/selftests/kvm/lib/x86_64/hyperv.c
new file mode 100644
index 000000000000..a8c6b156f92d
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/x86_64/hyperv.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * tools/testing/selftests/kvm/lib/x86_64/hyperv.c
+ *
+ * Copyright (C) 2021, Red Hat Inc.
+ */
+#include <stdint.h>
+#include "processor.h"
+#include "hyperv.h"
+
+int enable_vp_assist(uint64_t vp_assist_pa, void *vp_assist)
+{
+	uint64_t val = (vp_assist_pa & HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_MASK) |
+		HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
+
+	wrmsr(HV_X64_MSR_VP_ASSIST_PAGE, val);
+
+	current_vp_assist = vp_assist;
+
+	return 0;
+}
diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
index 411c4dbeac09..32ec43a8b101 100644
--- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
@@ -105,6 +105,7 @@ void guest_code(struct vmx_pages *vmx_pages, vm_vaddr_t pgs_gpa)
 	GUEST_SYNC(2);
 
 	enable_vp_assist(vmx_pages->vp_assist_gpa, vmx_pages->vp_assist);
+	evmcs_enable();
 
 	GUEST_ASSERT(vmx_pages->vmcs_gpa);
 	GUEST_ASSERT(prepare_for_vmx_operation(vmx_pages));
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 31/31] KVM: selftests: hyperv_svm_test: Add Direct TLB flush test
  2022-04-07 15:56 [PATCH v2 00/31] KVM: x86: hyper-v: Fine-grained TLB flush + Direct TLB flush feature Vitaly Kuznetsov
                   ` (29 preceding siblings ...)
  2022-04-07 15:56 ` [PATCH v2 30/31] KVM: selftests: Move Hyper-V VP assist page enablement out of evmcs.h Vitaly Kuznetsov
@ 2022-04-07 15:56 ` Vitaly Kuznetsov
  30 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-07 15:56 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Enable Hyper-V Direct TLB flush and check that Hyper-V TLB flush
hypercalls from L2 don't exit to L1 unless 'TlbLockCount' is set in the
Partition assist page.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../selftests/kvm/x86_64/hyperv_svm_test.c    | 60 +++++++++++++++++--
 1 file changed, 56 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c b/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c
index 21f5ca9197da..e2849e58582f 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c
@@ -42,11 +42,24 @@ struct hv_enlightenments {
  */
 #define VMCB_HV_NESTED_ENLIGHTENMENTS (1U << 31)
 
+#define HV_SVM_EXITCODE_ENL 0xF0000000
+#define HV_SVM_ENL_EXITCODE_TRAP_AFTER_FLUSH   (1)
+
 static inline void vmmcall(void)
 {
 	__asm__ __volatile__("vmmcall");
 }
 
+static inline void hypercall(u64 control, vm_vaddr_t arg1, vm_vaddr_t arg2)
+{
+	asm volatile("mov %3, %%r8\n"
+		     "vmmcall"
+		     : "+c" (control), "+d" (arg1)
+		     :  "r" (arg2)
+		     : "cc", "memory", "rax", "rbx", "r8", "r9", "r10",
+		       "r11", "r12", "r13", "r14", "r15");
+}
+
 void l2_guest_code(void)
 {
 	GUEST_SYNC(3);
@@ -62,11 +75,21 @@ void l2_guest_code(void)
 
 	GUEST_SYNC(5);
 
+	/* Direct TLB flush tests */
+	hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE | HV_HYPERCALL_FAST_BIT, 0x0,
+		  HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES | HV_FLUSH_ALL_PROCESSORS);
+	rdmsr(MSR_FS_BASE);
+	hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE | HV_HYPERCALL_FAST_BIT, 0x0,
+		  HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES | HV_FLUSH_ALL_PROCESSORS);
+	/* Make sure we're not issuing the Hyper-V TLB flush hypercall again */
+	__asm__ __volatile__ ("mov $0xdeadbeef, %rcx");
+
 	/* Done, exit to L1 and never come back.  */
 	vmmcall();
 }
 
-static void __attribute__((__flatten__)) guest_code(struct svm_test_data *svm)
+static void __attribute__((__flatten__)) guest_code(struct svm_test_data *svm,
+						    vm_vaddr_t pgs_gpa)
 {
 	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
 	struct vmcb *vmcb = svm->vmcb;
@@ -75,13 +98,23 @@ static void __attribute__((__flatten__)) guest_code(struct svm_test_data *svm)
 
 	GUEST_SYNC(1);
 
-	wrmsr(HV_X64_MSR_GUEST_OS_ID, (u64)0x8100 << 48);
+	wrmsr(HV_X64_MSR_GUEST_OS_ID, HYPERV_LINUX_OS_ID);
+	wrmsr(HV_X64_MSR_HYPERCALL, pgs_gpa);
+	enable_vp_assist(svm->vp_assist_gpa, svm->vp_assist);
 
 	GUEST_ASSERT(svm->vmcb_gpa);
 	/* Prepare for L2 execution. */
 	generic_svm_setup(svm, l2_guest_code,
 			  &l2_guest_stack[L2_GUEST_STACK_SIZE]);
 
+	/* Direct TLB flush setup */
+	hve->partition_assist_page = svm->partition_assist_gpa;
+	hve->hv_enlightenments_control.nested_flush_hypercall = 1;
+	hve->hv_vm_id = 1;
+	hve->hv_vp_id = 1;
+	current_vp_assist->nested_control.features.directhypercall = 1;
+	*(u32 *)(svm->partition_assist) = 0;
+
 	GUEST_SYNC(2);
 	run_guest(vmcb, svm->vmcb_gpa);
 	GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_VMMCALL);
@@ -116,6 +149,20 @@ static void __attribute__((__flatten__)) guest_code(struct svm_test_data *svm)
 	GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_MSR);
 	vmcb->save.rip += 2; /* rdmsr */
 
+
+	/*
+	 * Direct TLB flush test. The first VMMCALL should be handled directly
+	 * by L0, no VMMCALL exit to L1 expected.
+	 */
+	run_guest(vmcb, svm->vmcb_gpa);
+	GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_MSR);
+	vmcb->save.rip += 2; /* rdmsr */
+	/* Enable synthetic vmexit */
+	*(u32 *)(svm->partition_assist) = 1;
+	run_guest(vmcb, svm->vmcb_gpa);
+	GUEST_ASSERT(vmcb->control.exit_code == HV_SVM_EXITCODE_ENL);
+	GUEST_ASSERT(vmcb->control.exit_info_1 == HV_SVM_ENL_EXITCODE_TRAP_AFTER_FLUSH);
+
 	run_guest(vmcb, svm->vmcb_gpa);
 	GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_VMMCALL);
 	GUEST_SYNC(6);
@@ -126,7 +173,7 @@ static void __attribute__((__flatten__)) guest_code(struct svm_test_data *svm)
 int main(int argc, char *argv[])
 {
 	vm_vaddr_t nested_gva = 0;
-
+	vm_vaddr_t hcall_page;
 	struct kvm_vm *vm;
 	struct kvm_run *run;
 	struct ucall uc;
@@ -141,7 +188,12 @@ int main(int argc, char *argv[])
 	vcpu_set_hv_cpuid(vm, VCPU_ID);
 	run = vcpu_state(vm, VCPU_ID);
 	vcpu_alloc_svm(vm, &nested_gva);
-	vcpu_args_set(vm, VCPU_ID, 1, nested_gva);
+
+	hcall_page = vm_vaddr_alloc_pages(vm, 1);
+	memset(addr_gva2hva(vm, hcall_page), 0x0,  getpagesize());
+
+	vcpu_args_set(vm, VCPU_ID, 2, nested_gva, addr_gva2gpa(vm, hcall_page));
+	vcpu_set_msr(vm, VCPU_ID, HV_X64_MSR_VP_INDEX, VCPU_ID);
 
 	for (stage = 1;; stage++) {
 		_vcpu_run(vm, VCPU_ID);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently
  2022-04-07 15:56 ` [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently Vitaly Kuznetsov
@ 2022-04-07 17:33   ` Sean Christopherson
  2022-04-07 17:47     ` Sean Christopherson
  2022-04-11 11:15     ` Vitaly Kuznetsov
  2022-04-07 17:44   ` Sean Christopherson
  1 sibling, 2 replies; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 17:33 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 6695 bytes --]

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> Currently, HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls are handled
> the exact same way as HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE{,EX}: by
> flushing the whole VPID and this is sub-optimal. Switch to handling
> these requests with 'flush_tlb_gva()' hooks instead. Use the newly
> introduced TLB flush ring to queue the requests.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.c | 141 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 121 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 81c44e0eadf9..a54d41656f30 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1792,6 +1792,35 @@ static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
>  			      var_cnt * sizeof(*sparse_banks));
>  }
>  
> +static int kvm_hv_get_tlbflush_entries(struct kvm *kvm, struct kvm_hv_hcall *hc, u64 entries[],
> +				       u32 data_offset, int consumed_xmm_halves)

data_offset should be gpa_t, and the order of params should be consistent between
this and kvm_get_sparse_vp_set().

> +{
> +	int i;
> +
> +	if (hc->fast) {
> +		/*
> +		 * Each XMM holds two entries, but do not count halves that
> +		 * have already been consumed.
> +		 */
> +		if (hc->rep_cnt > (2 * HV_HYPERCALL_MAX_XMM_REGISTERS - consumed_xmm_halves))
> +			return -EINVAL;
> +
> +		for (i = 0; i < hc->rep_cnt; i++) {
> +			int j = i + consumed_xmm_halves;
> +
> +			if (j % 2)
> +				entries[i] = sse128_hi(hc->xmm[j / 2]);
> +			else
> +				entries[i] = sse128_lo(hc->xmm[j / 2]);
> +		}
> +
> +		return 0;
> +	}
> +
> +	return kvm_read_guest(kvm, hc->ingpa + data_offset,
> +			      entries, hc->rep_cnt * sizeof(entries[0]));

This is almost verbatim copy+pasted from kvm_get_sparse_vp_set().  If you slot in
the attached patched before this, then this function becomes:

static int kvm_hv_get_tlbflush_entries(struct kvm *kvm, struct kvm_hv_hcall *hc, u64 entries[],
				       int consumed_xmm_halves, gpa_t offset)
{
	return kvm_hv_get_hc_data(kvm, hc, hc->rep_cnt, hc->rep_cnt,
				  entries, consumed_xmm_halves, offset);
}


> +}

...

> @@ -1840,15 +1891,47 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
>  	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> -
> -	kvm_vcpu_flush_tlb_guest(vcpu);
> -
> -	if (!hv_vcpu)
> +	struct kvm_vcpu_hv_tlbflush_entry *entry;
> +	int read_idx, write_idx;
> +	u64 address;
> +	u32 count;
> +	int i, j;
> +
> +	if (!tdp_enabled || !hv_vcpu) {
> +		kvm_vcpu_flush_tlb_guest(vcpu);
>  		return;
> +	}
>  
>  	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
> +	read_idx = READ_ONCE(tlb_flush_ring->read_idx);
> +	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
> +
> +	/* Pairs with smp_wmb() in hv_tlb_flush_ring_enqueue() */
> +	smp_rmb();
>  
> -	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
> +	for (i = read_idx; i != write_idx; i = (i + 1) % KVM_HV_TLB_FLUSH_RING_SIZE) {
> +		entry = &tlb_flush_ring->entries[i];
> +
> +		if (entry->flush_all)
> +			goto out_flush_all;
> +
> +		/*
> +		 * Lower 12 bits of 'address' encode the number of additional
> +		 * pages to flush.
> +		 */
> +		address = entry->addr & PAGE_MASK;
> +		count = (entry->addr & ~PAGE_MASK) + 1;
> +		for (j = 0; j < count; j++)
> +			static_call(kvm_x86_flush_tlb_gva)(vcpu, address + j * PAGE_SIZE);
> +	}
> +	++vcpu->stat.tlb_flush;
> +	goto out_empty_ring;
> +
> +out_flush_all:
> +	kvm_vcpu_flush_tlb_guest(vcpu);
> +
> +out_empty_ring:
> +	tlb_flush_ring->read_idx = write_idx;
>  }
>  
>  static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
> @@ -1857,12 +1940,13 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
>  	struct hv_tlb_flush_ex flush_ex;
>  	struct hv_tlb_flush flush;
>  	DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
> +	u64 entries[KVM_HV_TLB_FLUSH_RING_SIZE - 2];

What's up with the -2?  And given the multitude of things going on in this code,
I'd strongly prefer this be tlbflush_entries.

Actually, if you do:

	u64 __tlbflush_entries[KVM_HV_TLB_FLUSH_RING_SIZE - 2];
	u64 *tlbflush_entries;

and drop all_addr, the code to get entries can be

	if (hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE ||
	    hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX ||
	    hc->rep_cnt > ARRAY_SIZE(__tlbflush_entries)) {
		tlbfluish_entries = NULL;
	} else {
		if (kvm_hv_get_tlbflush_entries(kvm, hc, __tlbflush_entries,
						consumed_xmm_halves, data_offset))
			return HV_STATUS_INVALID_HYPERCALL_INPUT;
		tlbfluish_entries = __tlbflush_entries;
	}

and the calls to queue flushes becomes

			hv_tlb_flush_ring_enqueue(v, tlbflush_entries, hc->rep_cnt);

That way a bug will "just" be a NULL pointer dereference and not consumption of
uninitialized data (though such a bug might be caught by the compiler).

>  	u64 valid_bank_mask;
>  	u64 sparse_banks[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
>  	struct kvm_vcpu *v;
>  	unsigned long i;
> -	bool all_cpus;
> -
> +	bool all_cpus, all_addr;
> +	int data_offset = 0, consumed_xmm_halves = 0;

data_offset should be a gpa_t.

>  	/*
>  	 * The Hyper-V TLFS doesn't allow more than 64 sparse banks, e.g. the
>  	 * valid mask is a u64.  Fail the build if KVM's max allowed number of

...

> +read_flush_entries:
> +	if (hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE ||
> +	    hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX ||
> +	    hc->rep_cnt > (KVM_HV_TLB_FLUSH_RING_SIZE - 2)) {

Rather than duplicate the -2 magic, it's far better to do:

	    hc->rep_cnt > ARRAY_SIZE(entries)) {

> +		all_addr = true;
> +	} else {
> +		if (kvm_hv_get_tlbflush_entries(kvm, hc, entries,
> +						data_offset, consumed_xmm_halves))

As mentioned, the order for this call should match kvm_get_sparse_vp_set().

>  			return HV_STATUS_INVALID_HYPERCALL_INPUT;
> +		all_addr = false;
>  	}
>  
> -do_flush:
> +
>  	/*
>  	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
>  	 * analyze it here, flush TLB regardless of the specified address space.
>  	 */
>  	if (all_cpus) {
>  		kvm_for_each_vcpu(i, v, kvm)
> -			hv_tlb_flush_ring_enqueue(v);
> +			hv_tlb_flush_ring_enqueue(v, all_addr, entries, hc->rep_cnt);
>  
>  		kvm_make_all_cpus_request(kvm, KVM_REQ_HV_TLB_FLUSH);
>  	} else {
> @@ -1951,7 +2052,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
>  			v = kvm_get_vcpu(kvm, i);
>  			if (!v)
>  				continue;
> -			hv_tlb_flush_ring_enqueue(v);
> +			hv_tlb_flush_ring_enqueue(v, all_addr, entries, hc->rep_cnt);
>  		}
>  
>  		kvm_make_vcpus_request_mask(kvm, KVM_REQ_HV_TLB_FLUSH, vcpu_mask);
> -- 
> 2.35.1
> 

[-- Attachment #2: 0001-KVM-x86-hyper-v-Add-helper-to-read-hypercall-data-fo.patch --]
[-- Type: text/x-diff, Size: 4043 bytes --]

From ad6033048d498baba7889ae0e14788c92d4baacb Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 7 Apr 2022 09:52:46 -0700
Subject: [PATCH] KVM: x86: hyper-v: Add helper to read hypercall data for
 array

Move the guts of kvm_get_sparse_vp_set() to a helper so that the code for
reading a guest-provided array can be reused in the future, e.g. for
getting a list of virtual addresses whose TLB entries need to be flushed.

Opportunistically swap the order of the data and XMM adjustment so that
the XMM/gpa offsets are bundled together.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/hyperv.c | 53 +++++++++++++++++++++++++++----------------
 1 file changed, 33 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e4f381b46a28..58e7aff6057a 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1782,38 +1782,51 @@ struct kvm_hv_hcall {
 	sse128_t xmm[HV_HYPERCALL_MAX_XMM_REGISTERS];
 };
 
-static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
-				 int consumed_xmm_halves,
-				 u64 *sparse_banks, gpa_t offset)
+
+static int kvm_hv_get_hc_data(struct kvm *kvm, struct kvm_hv_hcall *hc,
+			      u16 orig_cnt, u16 cnt_cap, u64 *data,
+			      int consumed_xmm_halves, gpa_t offset)
 {
-	u16 var_cnt;
-	int i;
-
-	if (hc->var_cnt > 64)
-		return -EINVAL;
-
-	/* Ignore banks that cannot possibly contain a legal VP index. */
-	var_cnt = min_t(u16, hc->var_cnt, KVM_HV_MAX_SPARSE_VCPU_SET_BITS);
+	/*
+	 * Preserve the original count when ignoring entries via a "cap", KVM
+	 * still needs to validate the guest input (though the non-XMM path
+	 * punts on the checks).
+	 */
+	u16 cnt = min(orig_cnt, cnt_cap);
+	int i, j;
 
 	if (hc->fast) {
 		/*
 		 * Each XMM holds two sparse banks, but do not count halves that
 		 * have already been consumed for hypercall parameters.
 		 */
-		if (hc->var_cnt > 2 * HV_HYPERCALL_MAX_XMM_REGISTERS - consumed_xmm_halves)
+		if (orig_cnt > 2 * HV_HYPERCALL_MAX_XMM_REGISTERS - consumed_xmm_halves)
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
-		for (i = 0; i < var_cnt; i++) {
-			int j = i + consumed_xmm_halves;
+
+		for (i = 0; i < cnt; i++) {
+			j = i + consumed_xmm_halves;
 			if (j % 2)
-				sparse_banks[i] = sse128_hi(hc->xmm[j / 2]);
+				data[i] = sse128_hi(hc->xmm[j / 2]);
 			else
-				sparse_banks[i] = sse128_lo(hc->xmm[j / 2]);
+				data[i] = sse128_lo(hc->xmm[j / 2]);
 		}
 		return 0;
 	}
 
-	return kvm_read_guest(kvm, hc->ingpa + offset, sparse_banks,
-			      var_cnt * sizeof(*sparse_banks));
+	return kvm_read_guest(kvm, hc->ingpa + offset, data,
+			      cnt * sizeof(*data));
+}
+
+static u64 kvm_get_sparse_vp_set(struct kvm *kvm, struct kvm_hv_hcall *hc,
+				 u64 *sparse_banks, int consumed_xmm_halves,
+				 gpa_t offset)
+{
+	if (hc->var_cnt > 64)
+		return -EINVAL;
+
+	/* Cap var_cnt to ignore banks that cannot contain a legal VP index. */
+	return kvm_hv_get_hc_data(kvm, hc, hc->var_cnt, KVM_HV_MAX_SPARSE_VCPU_SET_BITS,
+				  sparse_banks, consumed_xmm_halves, offset);
 }
 
 static inline int hv_tlb_flush_ring_free(struct kvm_vcpu_hv *hv_vcpu,
@@ -1952,7 +1965,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 		if (!hc->var_cnt)
 			goto ret_success;
 
-		if (kvm_get_sparse_vp_set(kvm, hc, 2, sparse_banks,
+		if (kvm_get_sparse_vp_set(kvm, hc, sparse_banks, 2,
 					  offsetof(struct hv_tlb_flush_ex,
 						   hv_vp_set.bank_contents)))
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
@@ -2063,7 +2076,7 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 		if (!hc->var_cnt)
 			goto ret_success;
 
-		if (kvm_get_sparse_vp_set(kvm, hc, 1, sparse_banks,
+		if (kvm_get_sparse_vp_set(kvm, hc, sparse_banks, 1,
 					  offsetof(struct hv_send_ipi_ex,
 						   vp_set.bank_contents)))
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;

base-commit: 9e28f2680fd1606225ab456bb28d30598110a520
-- 
2.35.1.1178.g4f1659d476-goog



* Re: [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently
  2022-04-07 15:56 ` [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently Vitaly Kuznetsov
  2022-04-07 17:33   ` Sean Christopherson
@ 2022-04-07 17:44   ` Sean Christopherson
  2022-04-11 11:31     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 17:44 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> @@ -1840,15 +1891,47 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
>  	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> -
> -	kvm_vcpu_flush_tlb_guest(vcpu);
> -
> -	if (!hv_vcpu)
> +	struct kvm_vcpu_hv_tlbflush_entry *entry;
> +	int read_idx, write_idx;
> +	u64 address;
> +	u32 count;
> +	int i, j;
> +
> +	if (!tdp_enabled || !hv_vcpu) {
> +		kvm_vcpu_flush_tlb_guest(vcpu);
>  		return;
> +	}
>  
>  	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
> +	read_idx = READ_ONCE(tlb_flush_ring->read_idx);
> +	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
> +
> +	/* Pairs with smp_wmb() in hv_tlb_flush_ring_enqueue() */
> +	smp_rmb();
>  
> -	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
> +	for (i = read_idx; i != write_idx; i = (i + 1) % KVM_HV_TLB_FLUSH_RING_SIZE) {
> +		entry = &tlb_flush_ring->entries[i];
> +
> +		if (entry->flush_all)
> +			goto out_flush_all;
> +
> +		/*
> +		 * Lower 12 bits of 'address' encode the number of additional
> +		 * pages to flush.
> +		 */
> +		address = entry->addr & PAGE_MASK;
> +		count = (entry->addr & ~PAGE_MASK) + 1;
> +		for (j = 0; j < count; j++)
> +			static_call(kvm_x86_flush_tlb_gva)(vcpu, address + j * PAGE_SIZE);
> +	}
> +	++vcpu->stat.tlb_flush;
> +	goto out_empty_ring;
> +
> +out_flush_all:
> +	kvm_vcpu_flush_tlb_guest(vcpu);
> +
> +out_empty_ring:
> +	tlb_flush_ring->read_idx = write_idx;

Does this need WRITE_ONCE?  My usual "I suck at memory ordering" disclaimer applies.


* Re: [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently
  2022-04-07 17:33   ` Sean Christopherson
@ 2022-04-07 17:47     ` Sean Christopherson
  2022-04-11 11:15     ` Vitaly Kuznetsov
  1 sibling, 0 replies; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 17:47 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Sean Christopherson wrote:
> > @@ -1857,12 +1940,13 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
> >  	struct hv_tlb_flush_ex flush_ex;
> >  	struct hv_tlb_flush flush;
> >  	DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
> > +	u64 entries[KVM_HV_TLB_FLUSH_RING_SIZE - 2];
> 
> What's up with the -2?  And given the multitude of things going on in this code,
> I'd strongly prefer this be tlbflush_entries.
> 
> Actually, if you do:
> 
> 	u64 __tlbflush_entries[KVM_HV_TLB_FLUSH_RING_SIZE - 2];
> 	u64 *tlbflush_entries;

Looking at future patches, tlb_flush_entries is better for consistency (apply everywhere).

> and drop all_addr, the code to get entries can be
> 
> 	if (hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE ||
> 	    hc->code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX ||
> 	    hc->rep_cnt > ARRAY_SIZE(__tlbflush_entries)) {
> 		tlbfluish_entries = NULL;
> 	} else {
> 		if (kvm_hv_get_tlbflush_entries(kvm, hc, __tlbflush_entries,
> 						consumed_xmm_halves, data_offset))
> 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
> 		tlbfluish_entries = __tlbflush_entries;

Heh, fluish, because TLB entries are somewhat fluid?

> 	}
> 
> and the calls to queue flushes becomes
> 
> 			hv_tlb_flush_ring_enqueue(v, tlbflush_entries, hc->rep_cnt);
> 
> That way a bug will "just" be a NULL pointer dereference and not consumption of
> uninitialized data (though such a bug might be caught by the compiler).


* Re: [PATCH v2 06/31] KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in kvm_hv_send_ipi()
  2022-04-07 15:56 ` [PATCH v2 06/31] KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in kvm_hv_send_ipi() Vitaly Kuznetsov
@ 2022-04-07 17:48   ` Sean Christopherson
  0 siblings, 0 replies; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 17:48 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> Get rid of on-stack allocation of vcpu_mask and optimize kvm_hv_send_ipi()
> for a smaller number of vCPUs in the request. When Hyper-V TLB flush
> is in  use, HvSendSyntheticClusterIpi{,Ex} calls are not commonly used to
> send IPIs to a large number of vCPUs (and are rarely used in general).
> 
> Introduce hv_is_vp_in_sparse_set() to directly check if the specified
> VP_ID is present in sparse vCPU set.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.c | 35 ++++++++++++++++++++++++-----------
>  1 file changed, 24 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index d7bcdf87b90c..918642bcdbd0 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1746,6 +1746,23 @@ static void sparse_set_to_vcpu_mask(struct kvm *kvm, u64 *sparse_banks,
>  	}
>  }
>  
> +static bool hv_is_vp_in_sparse_set(u32 vp_id, u64 valid_bank_mask, u64 sparse_banks[])
> +{
> +	int bank, sbank = 0;
> +
> +	if (!test_bit(vp_id / 64, (unsigned long *)&valid_bank_mask))

'64' really, really, really needs a #define.  I assume this is the same '64' that's
used to check the var_cnt when getting the sparse_banks.
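
E.g. something like the below (strawman name; assuming it really is the same
64, i.e. the width of the u64 valid mask / sparse bank):

	/* Each sparse bank is a u64 and thus covers 64 VPs. */
	#define KVM_HV_VPS_PER_SPARSE_BANK	64

so this becomes 'vp_id / KVM_HV_VPS_PER_SPARSE_BANK', ditto for the var_cnt
check.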

> +		return false;
> +
> +	for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
> +			 KVM_HV_MAX_SPARSE_VCPU_SET_BITS) {
> +		if (bank == vp_id / 64)
> +			break;
> +		sbank++;
> +	}
> +
> +	return test_bit(vp_id % 64, (unsigned long *)&sparse_banks[sbank]);
> +}


* Re: [PATCH v2 07/31] KVM: x86: hyper-v: Create a separate ring for Direct TLB flush
  2022-04-07 15:56 ` [PATCH v2 07/31] KVM: x86: hyper-v: Create a separate ring for Direct TLB flush Vitaly Kuznetsov
@ 2022-04-07 17:57   ` Sean Christopherson
  0 siblings, 0 replies; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 17:57 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> To handle Direct TLB flush requests from L2 KVM needs to use a
> separate ring from regular Hyper-V TLB flush requests: e.g. when a
> request to flush something in L2 is made, the target vCPU can
> transition from L2 to L1, receive a request to flush a GVA for L1 and
> then try to enter L2 back. The first request needs to be processed
> then. Similarly, requests to flush GVAs in L1 must wait until L2
> exits to L1.
> 
> No functional change yet as KVM doesn't handle Direct TLB flush
> requests from L2 yet.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  3 ++-
>  arch/x86/kvm/hyperv.c           |  7 ++++---
>  arch/x86/kvm/hyperv.h           | 17 ++++++++++++++---
>  3 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 15d798fe280d..b8d7c1422da6 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -617,7 +617,8 @@ struct kvm_vcpu_hv {
>  		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
>  	} cpuid_cache;
>  
> -	struct kvm_vcpu_hv_tlbflush_ring tlb_flush_ring;

Probably feedback for a prior patch, but please be consistent in tlbflush vs
tlb_flush.  I prefer the tlb_flush variant.

> +	/* Two rings for regular Hyper-V TLB flush and Direct TLB flush */
> +	struct kvm_vcpu_hv_tlbflush_ring tlb_flush_ring[2];

Use an enum, then the magic numbers go away, e.g.

enum hv_tlb_flush_rings {
	HV_L1_TLB_FLUSH_RING,
	HV_L2_TLB_FLUSH_RING,
	HV_NR_TLB_FLUSH_RINGS,
}

>  };
>  
>  /* Xen HVM per vcpu emulation context */
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 918642bcdbd0..16cbf41b5b7b 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -956,7 +956,8 @@ static int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
>  
>  	hv_vcpu->vp_index = vcpu->vcpu_idx;
>  
> -	spin_lock_init(&hv_vcpu->tlb_flush_ring.write_lock);
> +	spin_lock_init(&hv_vcpu->tlb_flush_ring[0].write_lock);
> +	spin_lock_init(&hv_vcpu->tlb_flush_ring[1].write_lock);

Or

	for (i = 0; i < ARRAY_SIZE(hv_vcpu->tlb_flush_ring); i++)
		spin_lock_init(&hv_vcpu->tlb_flush_ring[i].write_lock);

or replace ARRAY_SIZE() with HV_NR_TLB_FLUSH_RINGS.

>  
>  	return 0;
>  }
> @@ -1860,7 +1861,7 @@ static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool flush_all,
>  	if (!hv_vcpu)
>  		return;
>  
> -	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
> +	tlb_flush_ring = &hv_vcpu->tlb_flush_ring[0];

The [0] is gross, and it only gets worse in future patches that take @direct,
though this is slightly less gross:

	/* Here's a comment explaining why this is hardcoded to L1's ring. */
	tlb_flush_ring = &hv_vcpu->tlb_flush_ring[HV_L1_TLB_FLUSH_RING];

More thoughts in the patch that adds @direct.

>  	spin_lock_irqsave(&tlb_flush_ring->write_lock, flags);
>  
> @@ -1920,7 +1921,7 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
>  		return;
>  	}
>  
> -	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
> +	tlb_flush_ring = kvm_hv_get_tlb_flush_ring(vcpu);
>  	read_idx = READ_ONCE(tlb_flush_ring->read_idx);
>  	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
>  
> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
> index 6847caeaaf84..448877b478ef 100644
> --- a/arch/x86/kvm/hyperv.h
> +++ b/arch/x86/kvm/hyperv.h
> @@ -22,6 +22,7 @@
>  #define __ARCH_X86_KVM_HYPERV_H__
>  
>  #include <linux/kvm_host.h>
> +#include "x86.h"
>  
>  /*
>   * The #defines related to the synthetic debugger are required by KDNet, but
> @@ -147,15 +148,25 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
>  int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>  		     struct kvm_cpuid_entry2 __user *entries);
>  
> +static inline struct kvm_vcpu_hv_tlbflush_ring *kvm_hv_get_tlb_flush_ring(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> +
> +	if (!is_guest_mode(vcpu))
> +		return &hv_vcpu->tlb_flush_ring[0];

Maybe this?

	int i = !is_guest_mode(vcpu) ? HV_L1_TLB_FLUSH_RING :
				       HV_L2_TLB_FLUSH_RING;

	return &hv_vcpu->tlb_flush_ring[i];

Though shouldn't this be a WARN condition as of this patch?  I.e. shouldn't it be
impossible to request a flush for L2 at this point?
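
E.g. a sketch using the enum names suggested earlier in the thread (assuming
nothing can enqueue an L2 flush this early in the series):

	if (WARN_ON_ONCE(is_guest_mode(vcpu)))
		return &hv_vcpu->tlb_flush_ring[HV_L2_TLB_FLUSH_RING];

	return &hv_vcpu->tlb_flush_ring[HV_L1_TLB_FLUSH_RING];

with the WARN dropped once L2 requests actually exist.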

> +
> +	return &hv_vcpu->tlb_flush_ring[1];
> +}
>  
>  static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
>  {
> -	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> +	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
>  
> -	if (!hv_vcpu)
> +	if (!to_hv_vcpu(vcpu))
>  		return;
>  
> -	hv_vcpu->tlb_flush_ring.read_idx = hv_vcpu->tlb_flush_ring.write_idx;
> +	tlb_flush_ring = kvm_hv_get_tlb_flush_ring(vcpu);
> +	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
>  }
>  void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
>  
> -- 
> 2.35.1
> 


* Re: [PATCH v2 01/31] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag
  2022-04-07 15:56 ` [PATCH v2 01/31] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag Vitaly Kuznetsov
@ 2022-04-07 18:02   ` Sean Christopherson
  0 siblings, 0 replies; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 18:02 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e9647614dc8c..3c54f6804b7b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3341,7 +3341,12 @@ void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu)
>  	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
>  		kvm_vcpu_flush_tlb_current(vcpu);
>  
> -	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu))
> +	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
> +		kvm_vcpu_flush_tlb_guest(vcpu);
> +		kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
> +	}
> +
> +	if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
>  		kvm_vcpu_flush_tlb_guest(vcpu);

It'd be slightly more performant to do:

	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
		kvm_vcpu_flush_tlb_guest(vcpu);
		kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
	} else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) {
		kvm_hv_vcpu_flush_tlb(vcpu);
	}

And then when the code becomes


	if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
		kvm_vcpu_flush_tlb_guest(vcpu);
		if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
			kvm_hv_vcpu_empty_flush_tlb(vcpu);
	} else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) {
		kvm_hv_vcpu_flush_tlb(vcpu);
	}

the elif will help unsuspecting readers see that the HV_TLB_FLUSH request is
cleared by kvm_check_request() in the TLB_FLUSH_GUEST path.

The elif could result in having to bail from VM-Entry if the request becomes
pending after the check/clear inside TLB_FLUSH_GUEST, but that should be a very
rare case.

>  }
>  EXPORT_SYMBOL_GPL(kvm_service_local_tlb_flush_requests);
> -- 
> 2.35.1
> 


* Re: [PATCH v2 12/31] KVM: x86: hyper-v: Introduce kvm_hv_is_tlb_flush_hcall()
  2022-04-07 15:56 ` [PATCH v2 12/31] KVM: x86: hyper-v: Introduce kvm_hv_is_tlb_flush_hcall() Vitaly Kuznetsov
@ 2022-04-07 18:07   ` Sean Christopherson
  0 siblings, 0 replies; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 18:07 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> The newly introduced helper checks whether vCPU is performing a
> Hyper-V TLB flush hypercall. This is required to filter out Direct TLB
> flush hypercalls from L2 for processing.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.h | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
> index 448877b478ef..3687e1e61e0d 100644
> --- a/arch/x86/kvm/hyperv.h
> +++ b/arch/x86/kvm/hyperv.h
> @@ -168,6 +168,30 @@ static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
>  	tlb_flush_ring = kvm_hv_get_tlb_flush_ring(vcpu);
>  	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
>  }
> +
> +static inline bool kvm_hv_is_tlb_flush_hcall(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> +	u16 code;
> +
> +	if (!hv_vcpu)
> +		return false;
> +
> +#ifdef CONFIG_X86_64
> +	if (is_64_bit_hypercall(vcpu)) {
> +		code = kvm_rcx_read(vcpu) & 0xffff;
> +	} else
> +#endif
> +	{
> +		code = kvm_rax_read(vcpu) & 0xffff;
> +	}

	if (IS_ENABLED(CONFIG_X86_64) && is_64_bit_hypercall(vcpu))
		code = kvm_rcx_read(vcpu) & 0xffff;
	else
		code = kvm_rax_read(vcpu) & 0xffff;

Though I honestly don't see the point; is_64_bit_hypercall() will do the right
thing.

And is the 0xffff really needed?  An implicit cast should work just fine.  If I'm
overlooking something, an explicit cast would be better, e.g. why not

	code = is_64_bit_hypercall(vcpu) ? kvm_rcx_read(vcpu) :
					   kvm_rax_read(vcpu);

> +
> +	return (code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE ||
> +		code == HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST ||
> +		code == HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX ||
> +		code == HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX);
> +}
> +
>  void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
>  
>  
> -- 
> 2.35.1
> 


* Re: [PATCH v2 13/31] KVM: x86: hyper-v: Direct TLB flush
  2022-04-07 15:56 ` [PATCH v2 13/31] KVM: x86: hyper-v: Direct TLB flush Vitaly Kuznetsov
@ 2022-04-07 18:27   ` Sean Christopherson
  2022-04-14 12:24     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 18:27 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> Handle Direct TLB flush requests from L2 by going through all vCPUs

What is a "Direct TLB flush request" in this context?  I can't tell if "direct"
refers to the MMU being direct, or if it has some other Hyper-V specific meaning.

Ewww, it looks to be Hyper-V terminology.  Now I see that @direct=true is getting
L2's ring, not L1's ring.  That's all kinds of evil.  That confusion goes away with
my suggestion below, but this shortlog and changelog (and the ones for nVMX and
nSVM enabling) absolutely need to clarify "direct" since it conflicts mightily
with KVM's "direct" terminology.

In fact, unless I'm missing a patch where "Direct" doesn't mean "From L2", I vote
to not use the "Direct TLB flush" terminology in any of the shortlogs or changelogs
and only add a footnote to this first changelog to call out that the TLFS (or
wherever this terminology came from) calls these types of flushes "Direct".

> and checking whether there are vCPUs running the same VM_ID with a
> VP_ID specified in the requests. Perform synthetic exit to L2 upon
> finish.
> 
> Note, while checking VM_ID/VP_ID of running vCPUs seem to be a bit
> racy, we count on the fact that KVM flushes the whole L2 VPID upon
> transition. Also, KVM_REQ_HV_TLB_FLUSH request needs to be done upon
> transition between L1 and L2 to make sure all pending requests are
> always processed.
> 
> Note, while nVMX/nSVM code does not handle VMCALL/VMMCALL from L2 yet.

Spurious "while"?  Or is there a missing second half of the note?

> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.c | 65 ++++++++++++++++++++++++++++++++++++-------
>  arch/x86/kvm/trace.h  | 21 ++++++++------
>  2 files changed, 68 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 705c0b739c1b..2b12f1b5c992 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -34,6 +34,7 @@
>  #include <linux/eventfd.h>
>  
>  #include <asm/apicdef.h>
> +#include <asm/mshyperv.h>
>  #include <trace/events/kvm.h>
>  
>  #include "trace.h"
> @@ -1849,8 +1850,8 @@ static inline int hv_tlb_flush_ring_free(struct kvm_vcpu_hv *hv_vcpu,
>  	return read_idx - write_idx - 1;
>  }
>  
> -static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool flush_all,
> -				      u64 *entries, int count)
> +static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool direct,
> +				      bool flush_all, u64 *entries, int count)
>  {
>  	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
>  	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> @@ -1861,7 +1862,7 @@ static void hv_tlb_flush_ring_enqueue(struct kvm_vcpu *vcpu, bool flush_all,
>  	if (!hv_vcpu)
>  		return;
>  
> -	tlb_flush_ring = &hv_vcpu->tlb_flush_ring[0];
> +	tlb_flush_ring = direct ? &hv_vcpu->tlb_flush_ring[1] : &hv_vcpu->tlb_flush_ring[0];

Rather than pass in @direct and open code indexing into the ring array, pass in
the ring; then the magic boolean goes away along with its confusing terminology.

	tlb_flush_ring = kvm_hv_get_tlb_flush_ring(vcpu);

	/*
	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
	 * analyze it here, flush TLB regardless of the specified address space.
	 */
	if (all_cpus && !is_guest_mode(vcpu)) {
		kvm_for_each_vcpu(i, v, kvm)
			hv_tlb_flush_ring_enqueue(v, tlb_flush_ring,
						  tlb_flush_entries, hc->rep_cnt);

		kvm_make_all_cpus_request(kvm, KVM_REQ_HV_TLB_FLUSH);
	} else if (!is_guest_mode(vcpu)) {
		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask, vcpu_mask);

		for_each_set_bit(i, vcpu_mask, KVM_MAX_VCPUS) {
			v = kvm_get_vcpu(kvm, i);
			if (!v)
				continue;
			hv_tlb_flush_ring_enqueue(v, tlb_flush_ring,
						  tlb_flush_entries, hc->rep_cnt);
		}

		kvm_make_vcpus_request_mask(kvm, KVM_REQ_HV_TLB_FLUSH, vcpu_mask);
	} else {
		struct kvm_vcpu_hv *hv_v;

		bitmap_zero(vcpu_mask, KVM_MAX_VCPUS);

		kvm_for_each_vcpu(i, v, kvm) {
			hv_v = to_hv_vcpu(v);

			/*
			 * TLB is fully flushed on L2 VM change: either by KVM
			 * (on a eVMPTR switch) or by L1 hypervisor (in case it
			 * re-purposes the active eVMCS for a different VM/VP).
			 */
			if (!hv_v || hv_v->nested.vm_id != hv_vcpu->nested.vm_id)
				continue;

			if (!all_cpus &&
			    !hv_is_vp_in_sparse_set(hv_v->nested.vp_id, valid_bank_mask,
						    sparse_banks))
				continue;

			__set_bit(i, vcpu_mask);
			hv_tlb_flush_ring_enqueue(v, tlb_flush_ring, tlb_flush_entries, hc->rep_cnt);
		}

		kvm_make_vcpus_request_mask(kvm, KVM_REQ_HV_TLB_FLUSH, vcpu_mask);
	}


>  	spin_lock_irqsave(&tlb_flush_ring->write_lock, flags);
>  

...

>  static int kvm_hv_hypercall_complete_userspace(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index e3a24b8f04be..4241b7c0245e 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -1479,38 +1479,41 @@ TRACE_EVENT(kvm_hv_timer_state,
>   * Tracepoint for kvm_hv_flush_tlb.
>   */
>  TRACE_EVENT(kvm_hv_flush_tlb,
> -	TP_PROTO(u64 processor_mask, u64 address_space, u64 flags),
> -	TP_ARGS(processor_mask, address_space, flags),
> +	TP_PROTO(u64 processor_mask, u64 address_space, u64 flags, bool direct),
> +	TP_ARGS(processor_mask, address_space, flags, direct),

I very strongly prefer direct be guest_mode here, and then print out L1 vs L2 in
the tracepoint itself.  There's no reason to overload "direct". 

>  	TP_STRUCT__entry(
>  		__field(u64, processor_mask)
>  		__field(u64, address_space)
>  		__field(u64, flags)
> +		__field(bool, direct)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->processor_mask = processor_mask;
>  		__entry->address_space = address_space;
>  		__entry->flags = flags;
> +		__entry->direct = direct;
>  	),
>  
> -	TP_printk("processor_mask 0x%llx address_space 0x%llx flags 0x%llx",
> +	TP_printk("processor_mask 0x%llx address_space 0x%llx flags 0x%llx %s",
>  		  __entry->processor_mask, __entry->address_space,
> -		  __entry->flags)
> +		  __entry->flags, __entry->direct ? "(direct)" : "")
>  );
>  
>  /*
>   * Tracepoint for kvm_hv_flush_tlb_ex.
>   */
>  TRACE_EVENT(kvm_hv_flush_tlb_ex,
> -	TP_PROTO(u64 valid_bank_mask, u64 format, u64 address_space, u64 flags),
> -	TP_ARGS(valid_bank_mask, format, address_space, flags),
> +	TP_PROTO(u64 valid_bank_mask, u64 format, u64 address_space, u64 flags, bool direct),
> +	TP_ARGS(valid_bank_mask, format, address_space, flags, direct),
>  
>  	TP_STRUCT__entry(
>  		__field(u64, valid_bank_mask)
>  		__field(u64, format)
>  		__field(u64, address_space)
>  		__field(u64, flags)
> +		__field(bool, direct)
>  	),
>  
>  	TP_fast_assign(
> @@ -1518,12 +1521,14 @@ TRACE_EVENT(kvm_hv_flush_tlb_ex,
>  		__entry->format = format;
>  		__entry->address_space = address_space;
>  		__entry->flags = flags;
> +		__entry->direct = direct;
>  	),
>  
>  	TP_printk("valid_bank_mask 0x%llx format 0x%llx "
> -		  "address_space 0x%llx flags 0x%llx",
> +		  "address_space 0x%llx flags 0x%llx %s",
>  		  __entry->valid_bank_mask, __entry->format,
> -		  __entry->address_space, __entry->flags)
> +		  __entry->address_space, __entry->flags,
> +		  __entry->direct ? "(direct)" : "")
>  );
>  
>  /*
> -- 
> 2.35.1
> 


* Re: [PATCH v2 16/31] KVM: nVMX: hyper-v: Direct TLB flush
  2022-04-07 15:56 ` [PATCH v2 16/31] KVM: nVMX: hyper-v: Direct TLB flush Vitaly Kuznetsov
@ 2022-04-07 18:47   ` Sean Christopherson
  2022-04-11 11:19     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 18:47 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> Enable Direct TLB flush feature on nVMX when:
> - Enlightened VMCS is in use.
> - Direct TLB flush flag is enabled in eVMCS.
> - Direct TLB flush is enabled in partition assist page.

Yeah, KVM definitely needs a different name for "Direct TLB flush".  I don't have
any good ideas offhand, but honestly anything is better than "Direct".

> Perform synthetic vmexit to L1 after processing TLB flush call upon
> request (HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH).
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

...

> diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
> index 8862692a4c5d..ab0949c22d2d 100644
> --- a/arch/x86/kvm/vmx/evmcs.h
> +++ b/arch/x86/kvm/vmx/evmcs.h
> @@ -65,6 +65,8 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
>  #define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
>  #define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)
>  
> +#define HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH 0x10000031

LOL, I guess I have to appreciate the cleverness.  Bit 28 is cleared for all
exits except when using an SMI transfer monitor, and then it's set only if MTF
is pending.

  The remainder of the field (bits 31:28 and bits 26:16) is cleared to 0 (certain
  SMM VM exits may set some of these bits; see Section 31.15.2.3).

  If the SMM VM exit occurred in VMX non-root operation and an MTF VM exit was
  pending, bit 28 of the exit-reason field is set; otherwise, it is cleared.

So despite all appearances, Microsoft didn't actually steal a bit from Intel;
they're just abusing a bit that (a) will never be set so long as the VMM doesn't
use parallel SMM and (b) architecturally can't be set in conjunction with many
exit reasons (everything that's _not_ some form of SMI).
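
Decoded, the magic number is simply (a sketch; the TLFS just gives the raw
constant):

	/* bit 28 ("MTF pending on STM exit") | basic exit reason 0x31 */
	#define HV_VMX_SYNTHETIC_EXIT_REASON_TRAP_AFTER_FLUSH	(BIT(28) | 0x31)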

Can you add a comment note to document this?

/*
 * Note, Hyper-V isn't actually stealing bit 28 from Intel, just abusing it by
 * pairing it with architecturally impossible exit reasons.  Bit 28 is set only
 * on SMI exits to an SMI transfer monitor (STM) and if and only if an MTF VM-Exit
 * is pending.  I.e. it will never be set by hardware for non-SMI exits (there
 * are only three), nor will it ever be set unless the VMM is an STM.
 */

>  struct evmcs_field {
>  	u16 offset;
>  	u16 clean_field;
> @@ -244,6 +246,7 @@ int nested_enable_evmcs(struct kvm_vcpu *vcpu,
>  			uint16_t *vmcs_version);
>  void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
>  int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
> +bool nested_evmcs_direct_flush_enabled(struct kvm_vcpu *vcpu);


* Re: [PATCH v2 18/31] KVM: nSVM: hyper-v: Direct TLB flush
  2022-04-07 15:56 ` [PATCH v2 18/31] KVM: nSVM: hyper-v: Direct TLB flush Vitaly Kuznetsov
@ 2022-04-07 18:50   ` Sean Christopherson
  0 siblings, 0 replies; 47+ messages in thread
From: Sean Christopherson @ 2022-04-07 18:50 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> @@ -486,6 +487,17 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm,
>  
>  static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu)
>  {
> +	/*
> +	 * KVM_REQ_HV_TLB_FLUSH flushes entries from either L1's VPID or

Can you use VP_ID or some variation to avoid "VPID"?  This looks like a copy+paste
from nVMX gone bad and will confuse the heck out of people that are more familiar
with VMX's VPID.

> +	 * L2's VPID upon request from the guest. Make sure we check for
> +	 * pending entries for the case when the request got misplaced (e.g.
> +	 * a transition from L2->L1 happened while processing Direct TLB flush
> +	 * request or vice versa). kvm_hv_vcpu_flush_tlb() will not flush
> +	 * anything if there are no requests in the corresponding buffer.
> +	 */
> +	if (to_hv_vcpu(vcpu))
> +		kvm_make_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
> +
>  	/*
>  	 * TODO: optimize unconditional TLB flush/MMU sync.  A partial list of
>  	 * things to fix before this can be conditional:


* Re: [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently
  2022-04-07 17:33   ` Sean Christopherson
  2022-04-07 17:47     ` Sean Christopherson
@ 2022-04-11 11:15     ` Vitaly Kuznetsov
  1 sibling, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-11 11:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:

...

Thanks a lot for the review! I'll incorporate your feedback into v3.

>>  
>>  static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
>> @@ -1857,12 +1940,13 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
>>  	struct hv_tlb_flush_ex flush_ex;
>>  	struct hv_tlb_flush flush;
>>  	DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
>> +	u64 entries[KVM_HV_TLB_FLUSH_RING_SIZE - 2];
>
> What's up with the -2?

(This should probably be a define or at least a comment somewhere)

Normally, we can only put 'KVM_HV_TLB_FLUSH_RING_SIZE - 1' entries on
the ring as when read_idx == write_idx we perceive this as 'ring is
empty' and not as 'ring is full'. For the TLB flush ring we must always
leave one free entry to put a "flush all" request when we run out of
free space to avoid blocking the writer. I.e. when a request flies in,
we check if we have enough space on the ring to put all the entries and
if not, we just put 'flush all' there. In case 'flush all' is already on
the ring, ignoring the request is safe.

So, long story short, there's no point in fetching more than
'KVM_HV_TLB_FLUSH_RING_SIZE - 2' entries from the guest as we can't
possibly put them all on the ring.
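
As a sketch (not the exact kernel code, which spreads this across the ring
helpers):

	/* read_idx == write_idx means 'empty', so only SIZE - 1 slots are usable */
	int ring_free = (read_idx - write_idx - 1 + KVM_HV_TLB_FLUSH_RING_SIZE) %
			KVM_HV_TLB_FLUSH_RING_SIZE;

	/*
	 * One usable slot is reserved for the 'flush all' fallback, so a
	 * single hypercall can contribute at most SIZE - 2 entries.
	 */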

[snip]

-- 
Vitaly



* Re: [PATCH v2 16/31] KVM: nVMX: hyper-v: Direct TLB flush
  2022-04-07 18:47   ` Sean Christopherson
@ 2022-04-11 11:19     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-11 11:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
>> Enable Direct TLB flush feature on nVMX when:
>> - Enlightened VMCS is in use.
>> - Direct TLB flush flag is enabled in eVMCS.
>> - Direct TLB flush is enabled in partition assist page.
>
> Yeah, KVM definitely needs a different name for "Direct TLB flush".  I don't have
> any good ideas offhand, but honestly anything is better than "Direct".
>

I think we can get away without a name inside KVM; we'll be doing either
'L1 TLB flush' or 'L2 TLB flush'. In QEMU we can still use 'Direct', I
believe, as it matches the TLFS and doesn't collide with KVM's MMU terminology.

-- 
Vitaly



* Re: [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently
  2022-04-07 17:44   ` Sean Christopherson
@ 2022-04-11 11:31     ` Vitaly Kuznetsov
  2022-04-11 20:37       ` Sean Christopherson
  0 siblings, 1 reply; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-11 11:31 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
>> @@ -1840,15 +1891,47 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
>>  {
>>  	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
>>  	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> -
>> -	kvm_vcpu_flush_tlb_guest(vcpu);
>> -
>> -	if (!hv_vcpu)
>> +	struct kvm_vcpu_hv_tlbflush_entry *entry;
>> +	int read_idx, write_idx;
>> +	u64 address;
>> +	u32 count;
>> +	int i, j;
>> +
>> +	if (!tdp_enabled || !hv_vcpu) {
>> +		kvm_vcpu_flush_tlb_guest(vcpu);
>>  		return;
>> +	}
>>  
>>  	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
>> +	read_idx = READ_ONCE(tlb_flush_ring->read_idx);
>> +	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
>> +
>> +	/* Pairs with smp_wmb() in hv_tlb_flush_ring_enqueue() */
>> +	smp_rmb();
>>  
>> -	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
>> +	for (i = read_idx; i != write_idx; i = (i + 1) % KVM_HV_TLB_FLUSH_RING_SIZE) {
>> +		entry = &tlb_flush_ring->entries[i];
>> +
>> +		if (entry->flush_all)
>> +			goto out_flush_all;
>> +
>> +		/*
>> +		 * Lower 12 bits of 'address' encode the number of additional
>> +		 * pages to flush.
>> +		 */
>> +		address = entry->addr & PAGE_MASK;
>> +		count = (entry->addr & ~PAGE_MASK) + 1;
>> +		for (j = 0; j < count; j++)
>> +			static_call(kvm_x86_flush_tlb_gva)(vcpu, address + j * PAGE_SIZE);
>> +	}
>> +	++vcpu->stat.tlb_flush;
>> +	goto out_empty_ring;
>> +
>> +out_flush_all:
>> +	kvm_vcpu_flush_tlb_guest(vcpu);
>> +
>> +out_empty_ring:
>> +	tlb_flush_ring->read_idx = write_idx;
>
> Does this need WRITE_ONCE?  My usual "I suck at memory ordering" disclaimer applies.
>

Same here) I *think* we're fine for 'read_idx' as it shouldn't matter at
which point in this function 'tlb_flush_ring->read_idx' gets modified
(relative to other things, e.g. actual TLB flushes) and there's no
concurency as we only have one reader (the vCPU which needs its TLB
flushed). On the other hand, I'm not against adding WRITE_ONCE() here
even if just to aid an unprepared reader (thinking myself couple years
in the future).

-- 
Vitaly



* Re: [PATCH v2 03/31] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently
  2022-04-11 11:31     ` Vitaly Kuznetsov
@ 2022-04-11 20:37       ` Sean Christopherson
  0 siblings, 0 replies; 47+ messages in thread
From: Sean Christopherson @ 2022-04-11 20:37 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

On Mon, Apr 11, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
> >> @@ -1840,15 +1891,47 @@ void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
> >>  {
> >>  	struct kvm_vcpu_hv_tlbflush_ring *tlb_flush_ring;
> >>  	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> >> -
> >> -	kvm_vcpu_flush_tlb_guest(vcpu);
> >> -
> >> -	if (!hv_vcpu)
> >> +	struct kvm_vcpu_hv_tlbflush_entry *entry;
> >> +	int read_idx, write_idx;
> >> +	u64 address;
> >> +	u32 count;
> >> +	int i, j;
> >> +
> >> +	if (!tdp_enabled || !hv_vcpu) {
> >> +		kvm_vcpu_flush_tlb_guest(vcpu);
> >>  		return;
> >> +	}
> >>  
> >>  	tlb_flush_ring = &hv_vcpu->tlb_flush_ring;
> >> +	read_idx = READ_ONCE(tlb_flush_ring->read_idx);
> >> +	write_idx = READ_ONCE(tlb_flush_ring->write_idx);
> >> +
> >> +	/* Pairs with smp_wmb() in hv_tlb_flush_ring_enqueue() */
> >> +	smp_rmb();
> >>  
> >> -	tlb_flush_ring->read_idx = tlb_flush_ring->write_idx;
> >> +	for (i = read_idx; i != write_idx; i = (i + 1) % KVM_HV_TLB_FLUSH_RING_SIZE) {
> >> +		entry = &tlb_flush_ring->entries[i];
> >> +
> >> +		if (entry->flush_all)
> >> +			goto out_flush_all;
> >> +
> >> +		/*
> >> +		 * Lower 12 bits of 'address' encode the number of additional
> >> +		 * pages to flush.
> >> +		 */
> >> +		address = entry->addr & PAGE_MASK;
> >> +		count = (entry->addr & ~PAGE_MASK) + 1;
> >> +		for (j = 0; j < count; j++)
> >> +			static_call(kvm_x86_flush_tlb_gva)(vcpu, address + j * PAGE_SIZE);
> >> +	}
> >> +	++vcpu->stat.tlb_flush;
> >> +	goto out_empty_ring;
> >> +
> >> +out_flush_all:
> >> +	kvm_vcpu_flush_tlb_guest(vcpu);
> >> +
> >> +out_empty_ring:
> >> +	tlb_flush_ring->read_idx = write_idx;
> >
> > Does this need WRITE_ONCE?  My usual "I suck at memory ordering" disclaimer applies.
> >
> 
> Same here) I *think* we're fine for 'read_idx' as it shouldn't matter at
> which point in this function 'tlb_flush_ring->read_idx' gets modified
> (relative to other things, e.g. actual TLB flushes) and there's no
> concurrency as we only have one reader (the vCPU which needs its TLB
> flushed). On the other hand, I'm not against adding WRITE_ONCE() here
> even if just to aid an unprepared reader (thinking of myself a couple of
> years in the future).

Ah, read_idx == tail and write_idx == head.  I didn't look at the structure very
closely, or maybe not at all :-)  And IIUC, only the vCPU itself ever writes to
tail?  In that case, I would omit the READ_ONCE()/WRITE_ONCE() from both the write to tail here
and the read above, and probably add a brief comment stating that the flush must
be performed on the target vCPU, i.e. must hold vcpu->mutex, and so it's safe for
the compiler to re-read tlb_flush_ring->read_idx in the loop because it cannot
change.
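
Strawman wording for such a comment (adjust as needed):

	/*
	 * The ring is only ever consumed by the vCPU itself, i.e. with
	 * vcpu->mutex held, so there's exactly one reader and read_idx
	 * can't change underneath KVM; plain accesses are fine.
	 */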


* Re: [PATCH v2 13/31] KVM: x86: hyper-v: Direct TLB flush
  2022-04-07 18:27   ` Sean Christopherson
@ 2022-04-14 12:24     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 47+ messages in thread
From: Vitaly Kuznetsov @ 2022-04-14 12:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, Michael Kelley,
	Siddharth Chandrasekaran, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Thu, Apr 07, 2022, Vitaly Kuznetsov wrote:
>> Handle Direct TLB flush requests from L2 by going through all vCPUs
>
> What is a "Direct TLB flush request" in this context?  I can't tell if "direct"
> refers to the MMU being direct, or if it has some other Hyper-V specific meaning.
>
> Ewww, it looks to be Hyper-V terminology.  Now I see that @direct=true is getting
> L2's ring, not L1's ring.  That's all kinds of evil.  That confusion goes away with
> my suggestion below, but this shortlog and changelog (and the ones for nVMX and
> nSVM enabling) absolutely need to clarify "direct" since it conflicts mightily
> with KVM's "direct" terminology.
>
> In fact, unless I'm missing a patch where "Direct" doesn't mean "From L2", I vote
> to not use the "Direct TLB flush" terminology in any of the shortlogs or changelogs
> and only add a footnote to this first changelog to call out that the TLFS (or
> wherever this terminology came from) calls these types of flushes
> "Direct".

In the soon-to-be-sent-out v3 I got rid of 'Direct TLB flush' completely.
Note, in addition to what gets introduced in this series, there are
two other Hyper-V specific places which overload 'direct' already:

- Direct TLB flush for KVM-on-Hyper-V (enable_direct_tlbflush()). I'm
getting rid of it too.

- Direct synthetic timers. 'Direct' in this case means that the timer
signal is delivered via a dedicated IRQ 'directly' and not through a VMBus
message. This stays as I can't think of how we could rename it (or whether
we should in the first place).

-- 
Vitaly

