kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue
@ 2021-06-04 17:25 Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 01/12] KVM: x86: Remove guest mode check from kvm_check_nested_events Jim Mattson
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:25 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson

When the VMCS12 posted interrupt descriptor isn't backed by an L1
memslot, kvm will launch vmcs02 with a stale posted interrupt
descriptor. Before commit 6beb7bd52e48 ("kvm: nVMX: Refactor
nested_get_vmcs12_pages()"), kvm would have silently disabled the
VMCS02 "process posted interrupts" VM-execution control. Both
behaviors are wrong, though the use-after-free is more egregious.

Empirical tests on actual hardware reveal that a posted interrupt
descriptor without any backing memory/device has PCI bus error
semantics (reads return all 1's and writes are discarded). However,
kvm can't tell an unbacked address from an MMIO address. Normally, kvm
would ask userspace for an MMIO completion, but that's complicated for
a posted interrupt descriptor access. There are already a number of
cases where kvm bails out to userspace with KVM_INTERNAL_ERROR via
kvm_handle_memory_failure, so that seems like the best route to take.

It would be relatively easy to invoke kvm_handle_memory_failure at
emulated VM-entry, but that approach would break existing
kvm-unit-tests. Moreover, the issue doesn't really come up until the
vCPU--in virtualized VMX non-root operation--received the posted
interrupt notification vector indicated in its VMCS12.

Sadly, it's really hard to arrange for an exit to userspace from
vmx_complete_nested_posted_interrupt, which is where kvm actually
needs to access the unbacked PID. Initially, I added a new kvm request
for a userspace exit on the next guest entry, but Sean hated that
approach. Based on his suggestion, I added the plumbing to get back
out to userspace in the event of an error in
vmx_complete_nested_posted_interrupt. This works in the case of an
unbacked PID, but it doesn't work quite as well in the case of an
unbacked virtual APIC page (another case where kvm was happy to just
silently ignore the problem and attempt to muddle its way through.) In
that case, this series is an incremental improvement, but it's not a
complete fix.

v1 -> v2:
  05/12: Modified kvm_arch_vcpu_ioctl_get_mpstate() so that it
         returns the error code from kvm_apic_accept_events() if
	 said error code is less than 0.
  07/12: Changed a comment based on Sean's feedback.

Jim Mattson (12):
  KVM: x86: Remove guest mode check from kvm_check_nested_events
  KVM: x86: Wake up a vCPU when kvm_check_nested_events fails
  KVM: nVMX: Add a return code to vmx_complete_nested_posted_interrupt
  KVM: x86: Add a return code to inject_pending_event
  KVM: x86: Add a return code to kvm_apic_accept_events
  KVM: nVMX: Fail on MMIO completion for nested posted interrupts
  KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't
    mappable
  KVM: selftests: Move APIC definitions into a separate file
  KVM: selftests: Hoist APIC functions out of individual tests
  KVM: selftests: Introduce x2APIC register manipulation functions
  KVM: selftests: Introduce prepare_tpr_shadow
  KVM: selftests: Add a test of an unbacked nested PI descriptor

 arch/x86/kvm/lapic.c                          |  11 +-
 arch/x86/kvm/lapic.h                          |   2 +-
 arch/x86/kvm/vmx/nested.c                     |  31 ++-
 arch/x86/kvm/x86.c                            |  59 ++--
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   3 +-
 .../selftests/kvm/include/x86_64/apic.h       |  91 +++++++
 .../selftests/kvm/include/x86_64/processor.h  |  49 +---
 .../selftests/kvm/include/x86_64/vmx.h        |   6 +
 tools/testing/selftests/kvm/lib/x86_64/apic.c |  45 ++++
 tools/testing/selftests/kvm/lib/x86_64/vmx.c  |   8 +
 .../testing/selftests/kvm/x86_64/evmcs_test.c |  11 +-
 .../selftests/kvm/x86_64/set_boot_cpu_id.c    |   6 +-
 tools/testing/selftests/kvm/x86_64/smm_test.c |   4 +-
 .../selftests/kvm/x86_64/vmx_pi_mmio_test.c   | 252 ++++++++++++++++++
 .../selftests/kvm/x86_64/xapic_ipi_test.c     |  59 +---
 16 files changed, 490 insertions(+), 148 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/apic.h
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/apic.c
 create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_pi_mmio_test.c

-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 01/12] KVM: x86: Remove guest mode check from kvm_check_nested_events
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 02/12] KVM: x86: Wake up a vCPU when kvm_check_nested_events fails Jim Mattson
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson

A survey of the callsites reveals that they all ensure the vCPU is in
guest mode before calling kvm_check_nested_events. Remove this dead
code so that the only negative value this function returns (at the
moment) is -EBUSY.

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/x86.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b594275d49b5..882457e92679 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8535,9 +8535,6 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu)
 
 int kvm_check_nested_events(struct kvm_vcpu *vcpu)
 {
-	if (WARN_ON_ONCE(!is_guest_mode(vcpu)))
-		return -EIO;
-
 	if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)) {
 		kvm_x86_ops.nested_ops->triple_fault(vcpu);
 		return 1;
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 02/12] KVM: x86: Wake up a vCPU when kvm_check_nested_events fails
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 01/12] KVM: x86: Remove guest mode check from kvm_check_nested_events Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 03/12] KVM: nVMX: Add a return code to vmx_complete_nested_posted_interrupt Jim Mattson
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson, Oliver Upton

At present, there are two reasons why kvm_check_nested_events may
return a non-zero value:

1) we just emulated a shutdown VM-exit from L2 to L1.
2) we need to perform an immediate VM-exit from vmcs02.

In either case, transition the vCPU to "running."

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 882457e92679..83bc0a5b1aab 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9471,8 +9471,8 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
 
 static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu)
 {
-	if (is_guest_mode(vcpu))
-		kvm_check_nested_events(vcpu);
+	if (is_guest_mode(vcpu) && kvm_check_nested_events(vcpu))
+		return true;
 
 	return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
 		!vcpu->arch.apf.halted);
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 03/12] KVM: nVMX: Add a return code to vmx_complete_nested_posted_interrupt
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 01/12] KVM: x86: Remove guest mode check from kvm_check_nested_events Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 02/12] KVM: x86: Wake up a vCPU when kvm_check_nested_events fails Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 04/12] KVM: x86: Add a return code to inject_pending_event Jim Mattson
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson, Oliver Upton

No functional change intended.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6058a65a6ede..7646e6e561ad 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3682,7 +3682,7 @@ void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu)
 	}
 }
 
-static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
+static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	int max_irr;
@@ -3690,17 +3690,17 @@ static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
 	u16 status;
 
 	if (!vmx->nested.pi_desc || !vmx->nested.pi_pending)
-		return;
+		return 0;
 
 	vmx->nested.pi_pending = false;
 	if (!pi_test_and_clear_on(vmx->nested.pi_desc))
-		return;
+		return 0;
 
 	max_irr = find_last_bit((unsigned long *)vmx->nested.pi_desc->pir, 256);
 	if (max_irr != 256) {
 		vapic_page = vmx->nested.virtual_apic_map.hva;
 		if (!vapic_page)
-			return;
+			return 0;
 
 		__kvm_apic_update_irr(vmx->nested.pi_desc->pir,
 			vapic_page, &max_irr);
@@ -3713,6 +3713,7 @@ static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
 	}
 
 	nested_mark_vmcs12_pages_dirty(vcpu);
+	return 0;
 }
 
 static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu,
@@ -3887,8 +3888,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 	}
 
 no_vmexit:
-	vmx_complete_nested_posted_interrupt(vcpu);
-	return 0;
+	return vmx_complete_nested_posted_interrupt(vcpu);
 }
 
 static u32 vmx_get_preemption_timer_value(struct kvm_vcpu *vcpu)
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 04/12] KVM: x86: Add a return code to inject_pending_event
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (2 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 03/12] KVM: nVMX: Add a return code to vmx_complete_nested_posted_interrupt Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 05/12] KVM: x86: Add a return code to kvm_apic_accept_events Jim Mattson
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson

No functional change intended. At present, 'r' will always be -EBUSY
on a control transfer to the 'out' label.

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/x86.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83bc0a5b1aab..f9b3ea916344 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8550,7 +8550,7 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
 	static_call(kvm_x86_queue_exception)(vcpu);
 }
 
-static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit)
+static int inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit)
 {
 	int r;
 	bool can_inject = true;
@@ -8597,7 +8597,7 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
 	if (is_guest_mode(vcpu)) {
 		r = kvm_check_nested_events(vcpu);
 		if (r < 0)
-			goto busy;
+			goto out;
 	}
 
 	/* try to inject new event if pending */
@@ -8639,7 +8639,7 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
 	if (vcpu->arch.smi_pending) {
 		r = can_inject ? static_call(kvm_x86_smi_allowed)(vcpu, true) : -EBUSY;
 		if (r < 0)
-			goto busy;
+			goto out;
 		if (r) {
 			vcpu->arch.smi_pending = false;
 			++vcpu->arch.smi_count;
@@ -8652,7 +8652,7 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
 	if (vcpu->arch.nmi_pending) {
 		r = can_inject ? static_call(kvm_x86_nmi_allowed)(vcpu, true) : -EBUSY;
 		if (r < 0)
-			goto busy;
+			goto out;
 		if (r) {
 			--vcpu->arch.nmi_pending;
 			vcpu->arch.nmi_injected = true;
@@ -8667,7 +8667,7 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
 	if (kvm_cpu_has_injectable_intr(vcpu)) {
 		r = can_inject ? static_call(kvm_x86_interrupt_allowed)(vcpu, true) : -EBUSY;
 		if (r < 0)
-			goto busy;
+			goto out;
 		if (r) {
 			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu), false);
 			static_call(kvm_x86_set_irq)(vcpu);
@@ -8683,11 +8683,14 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
 		*req_immediate_exit = true;
 
 	WARN_ON(vcpu->arch.exception.pending);
-	return;
+	return 0;
 
-busy:
-	*req_immediate_exit = true;
-	return;
+out:
+	if (r == -EBUSY) {
+		*req_immediate_exit = true;
+		r = 0;
+	}
+	return r;
 }
 
 static void process_nmi(struct kvm_vcpu *vcpu)
@@ -9248,7 +9251,11 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			goto out;
 		}
 
-		inject_pending_event(vcpu, &req_immediate_exit);
+		r = inject_pending_event(vcpu, &req_immediate_exit);
+		if (r < 0) {
+			r = 0;
+			goto out;
+		}
 		if (req_int_win)
 			static_call(kvm_x86_enable_irq_window)(vcpu);
 
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 05/12] KVM: x86: Add a return code to kvm_apic_accept_events
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (3 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 04/12] KVM: x86: Add a return code to inject_pending_event Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 06/12] KVM: nVMX: Fail on MMIO completion for nested posted interrupts Jim Mattson
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson

No functional change intended. At present, the only negative value
returned by kvm_check_nested_events is -EBUSY.

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/lapic.c | 11 ++++++-----
 arch/x86/kvm/lapic.h |  2 +-
 arch/x86/kvm/x86.c   | 25 ++++++++++++++++++++-----
 3 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8120e8614b92..6a315ade6c41 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2864,7 +2864,7 @@ int kvm_lapic_enable_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len)
 	return kvm_gfn_to_hva_cache_init(vcpu->kvm, ghc, addr, new_len);
 }
 
-void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
+int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	u8 sipi_vector;
@@ -2872,7 +2872,7 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 	unsigned long pe;
 
 	if (!lapic_in_kernel(vcpu))
-		return;
+		return 0;
 
 	/*
 	 * Read pending events before calling the check_events
@@ -2880,12 +2880,12 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 	 */
 	pe = smp_load_acquire(&apic->pending_events);
 	if (!pe)
-		return;
+		return 0;
 
 	if (is_guest_mode(vcpu)) {
 		r = kvm_check_nested_events(vcpu);
 		if (r < 0)
-			return;
+			return r == -EBUSY ? 0 : r;
 		/*
 		 * If an event has happened and caused a vmexit,
 		 * we know INITs are latched and therefore
@@ -2906,7 +2906,7 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 		WARN_ON_ONCE(vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED);
 		if (test_bit(KVM_APIC_SIPI, &pe))
 			clear_bit(KVM_APIC_SIPI, &apic->pending_events);
-		return;
+		return 0;
 	}
 
 	if (test_bit(KVM_APIC_INIT, &pe)) {
@@ -2927,6 +2927,7 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		}
 	}
+	return 0;
 }
 
 void kvm_lapic_exit(void)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 997c45a5963a..d7c25d0c1354 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -76,7 +76,7 @@ void kvm_free_lapic(struct kvm_vcpu *vcpu);
 int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
 int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
 int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
-void kvm_apic_accept_events(struct kvm_vcpu *vcpu);
+int kvm_apic_accept_events(struct kvm_vcpu *vcpu);
 void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event);
 u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu);
 void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f9b3ea916344..51d3b9ff4d96 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9245,7 +9245,11 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win ||
 	    kvm_xen_has_interrupt(vcpu)) {
 		++vcpu->stat.req_event;
-		kvm_apic_accept_events(vcpu);
+		r = kvm_apic_accept_events(vcpu);
+		if (r < 0) {
+			r = 0;
+			goto out;
+		}
 		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
 			r = 1;
 			goto out;
@@ -9457,7 +9461,8 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
 			return 1;
 	}
 
-	kvm_apic_accept_events(vcpu);
+	if (kvm_apic_accept_events(vcpu) < 0)
+		return 0;
 	switch(vcpu->arch.mp_state) {
 	case KVM_MP_STATE_HALTED:
 	case KVM_MP_STATE_AP_RESET_HOLD:
@@ -9681,7 +9686,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 			goto out;
 		}
 		kvm_vcpu_block(vcpu);
-		kvm_apic_accept_events(vcpu);
+		if (kvm_apic_accept_events(vcpu) < 0) {
+			r = 0;
+			goto out;
+		}
 		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
 		r = -EAGAIN;
 		if (signal_pending(current)) {
@@ -9883,11 +9891,17 @@ int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
+	int r;
+
 	vcpu_load(vcpu);
 	if (kvm_mpx_supported())
 		kvm_load_guest_fpu(vcpu);
 
-	kvm_apic_accept_events(vcpu);
+	r = kvm_apic_accept_events(vcpu);
+	if (r < 0)
+		goto out;
+	r = 0;
+
 	if ((vcpu->arch.mp_state == KVM_MP_STATE_HALTED ||
 	     vcpu->arch.mp_state == KVM_MP_STATE_AP_RESET_HOLD) &&
 	    vcpu->arch.pv.pv_unhalted)
@@ -9895,10 +9909,11 @@ int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
 	else
 		mp_state->mp_state = vcpu->arch.mp_state;
 
+out:
 	if (kvm_mpx_supported())
 		kvm_put_guest_fpu(vcpu);
 	vcpu_put(vcpu);
-	return 0;
+	return r;
 }
 
 int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 06/12] KVM: nVMX: Fail on MMIO completion for nested posted interrupts
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (4 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 05/12] KVM: x86: Add a return code to kvm_apic_accept_events Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 07/12] KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't mappable Jim Mattson
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson

When the kernel has no mapping for the vmcs02 virtual APIC page,
userspace MMIO completion is necessary to process nested posted
interrupts. This is not a configuration that KVM supports. Rather than
silently ignoring the problem, try to exit to userspace with
KVM_INTERNAL_ERROR.

Note that the event that triggers this error is consumed as a
side-effect of a call to kvm_check_nested_events. On some paths
(notably through kvm_vcpu_check_block), the error is dropped. In any
case, this is an incremental improvement over always ignoring the
error.

Signed-off-by: Jim Mattson <jmattson@google.com>

---
 arch/x86/kvm/vmx/nested.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 7646e6e561ad..706c31821362 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3700,7 +3700,7 @@ static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
 	if (max_irr != 256) {
 		vapic_page = vmx->nested.virtual_apic_map.hva;
 		if (!vapic_page)
-			return 0;
+			goto mmio_needed;
 
 		__kvm_apic_update_irr(vmx->nested.pi_desc->pir,
 			vapic_page, &max_irr);
@@ -3714,6 +3714,10 @@ static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
 
 	nested_mark_vmcs12_pages_dirty(vcpu);
 	return 0;
+
+mmio_needed:
+	kvm_handle_memory_failure(vcpu, X86EMUL_IO_NEEDED, NULL);
+	return -ENXIO;
 }
 
 static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu,
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 07/12] KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't mappable
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (5 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 06/12] KVM: nVMX: Fail on MMIO completion for nested posted interrupts Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 08/12] KVM: selftests: Move APIC definitions into a separate file Jim Mattson
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson

Don't allow posted interrupts to modify a stale posted interrupt
descriptor (including the initial value of 0).

Empirical tests on real hardware reveal that a posted interrupt
descriptor referencing an unbacked address has PCI bus error semantics
(reads as all 1's; writes are ignored). However, kvm can't distinguish
unbacked addresses from device-backed (MMIO) addresses, so it should
really ask userspace for an MMIO completion. That's overly
complicated, so just punt with KVM_INTERNAL_ERROR.

Don't return the error until the posted interrupt descriptor is
actually accessed. We don't want to break the existing kvm-unit-tests
that assume they can launch an L2 VM with a posted interrupt
descriptor that references MMIO space in L1.

Fixes: 6beb7bd52e48 ("kvm: nVMX: Refactor nested_get_vmcs12_pages()")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/vmx/nested.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 706c31821362..632f0abfe154 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3175,6 +3175,15 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
 				offset_in_page(vmcs12->posted_intr_desc_addr));
 			vmcs_write64(POSTED_INTR_DESC_ADDR,
 				     pfn_to_hpa(map->pfn) + offset_in_page(vmcs12->posted_intr_desc_addr));
+		} else {
+			/*
+			 * Defer the KVM_INTERNAL_EXIT until KVM tries to
+			 * access the contents of the VMCS12 posted interrupt
+			 * descriptor. (Note that KVM may do this when it
+			 * should not, per the architectural specification.)
+			 */
+			vmx->nested.pi_desc = NULL;
+			pin_controls_clearbit(vmx, PIN_BASED_POSTED_INTR);
 		}
 	}
 	if (nested_vmx_prepare_msr_bitmap(vcpu, vmcs12))
@@ -3689,10 +3698,14 @@ static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
 	void *vapic_page;
 	u16 status;
 
-	if (!vmx->nested.pi_desc || !vmx->nested.pi_pending)
+	if (!vmx->nested.pi_pending)
 		return 0;
 
+	if (!vmx->nested.pi_desc)
+		goto mmio_needed;
+
 	vmx->nested.pi_pending = false;
+
 	if (!pi_test_and_clear_on(vmx->nested.pi_desc))
 		return 0;
 
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 08/12] KVM: selftests: Move APIC definitions into a separate file
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (6 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 07/12] KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't mappable Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 09/12] KVM: selftests: Hoist APIC functions out of individual tests Jim Mattson
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson, Oliver Upton

Processor.h is a hodgepodge of definitions. Though the local APIC is
technically built into the CPU these days, move the APIC definitions
into a new header file: apic.h.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 .../selftests/kvm/include/x86_64/apic.h       | 58 +++++++++++++++++++
 .../selftests/kvm/include/x86_64/processor.h  | 47 ---------------
 .../selftests/kvm/include/x86_64/vmx.h        |  1 +
 3 files changed, 59 insertions(+), 47 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/apic.h

diff --git a/tools/testing/selftests/kvm/include/x86_64/apic.h b/tools/testing/selftests/kvm/include/x86_64/apic.h
new file mode 100644
index 000000000000..0d0e35c8866b
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/x86_64/apic.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * tools/testing/selftests/kvm/include/x86_64/apic.h
+ *
+ * Copyright (C) 2021, Google LLC.
+ */
+
+#ifndef SELFTEST_KVM_APIC_H
+#define SELFTEST_KVM_APIC_H
+
+#define APIC_DEFAULT_GPA		0xfee00000ULL
+
+/* APIC base address MSR and fields */
+#define MSR_IA32_APICBASE		0x0000001b
+#define MSR_IA32_APICBASE_BSP		(1<<8)
+#define MSR_IA32_APICBASE_EXTD		(1<<10)
+#define MSR_IA32_APICBASE_ENABLE	(1<<11)
+#define MSR_IA32_APICBASE_BASE		(0xfffff<<12)
+#define		GET_APIC_BASE(x)	(((x) >> 12) << 12)
+
+#define APIC_BASE_MSR	0x800
+#define X2APIC_ENABLE	(1UL << 10)
+#define	APIC_ID		0x20
+#define	APIC_LVR	0x30
+#define		GET_APIC_ID_FIELD(x)	(((x) >> 24) & 0xFF)
+#define	APIC_TASKPRI	0x80
+#define	APIC_PROCPRI	0xA0
+#define	APIC_EOI	0xB0
+#define	APIC_SPIV	0xF0
+#define		APIC_SPIV_FOCUS_DISABLED	(1 << 9)
+#define		APIC_SPIV_APIC_ENABLED		(1 << 8)
+#define	APIC_ICR	0x300
+#define		APIC_DEST_SELF		0x40000
+#define		APIC_DEST_ALLINC	0x80000
+#define		APIC_DEST_ALLBUT	0xC0000
+#define		APIC_ICR_RR_MASK	0x30000
+#define		APIC_ICR_RR_INVALID	0x00000
+#define		APIC_ICR_RR_INPROG	0x10000
+#define		APIC_ICR_RR_VALID	0x20000
+#define		APIC_INT_LEVELTRIG	0x08000
+#define		APIC_INT_ASSERT		0x04000
+#define		APIC_ICR_BUSY		0x01000
+#define		APIC_DEST_LOGICAL	0x00800
+#define		APIC_DEST_PHYSICAL	0x00000
+#define		APIC_DM_FIXED		0x00000
+#define		APIC_DM_FIXED_MASK	0x00700
+#define		APIC_DM_LOWEST		0x00100
+#define		APIC_DM_SMI		0x00200
+#define		APIC_DM_REMRD		0x00300
+#define		APIC_DM_NMI		0x00400
+#define		APIC_DM_INIT		0x00500
+#define		APIC_DM_STARTUP		0x00600
+#define		APIC_DM_EXTINT		0x00700
+#define		APIC_VECTOR_MASK	0x000FF
+#define	APIC_ICR2	0x310
+#define		SET_APIC_DEST_FIELD(x)	((x) << 24)
+
+#endif /* SELFTEST_KVM_APIC_H */
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 0b30b4e15c38..a4729d9032ce 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -425,53 +425,6 @@ struct kvm_cpuid2 *vcpu_get_supported_hv_cpuid(struct kvm_vm *vm, uint32_t vcpui
 #define X86_CR0_CD          (1UL<<30) /* Cache Disable */
 #define X86_CR0_PG          (1UL<<31) /* Paging */
 
-#define APIC_DEFAULT_GPA		0xfee00000ULL
-
-/* APIC base address MSR and fields */
-#define MSR_IA32_APICBASE		0x0000001b
-#define MSR_IA32_APICBASE_BSP		(1<<8)
-#define MSR_IA32_APICBASE_EXTD		(1<<10)
-#define MSR_IA32_APICBASE_ENABLE	(1<<11)
-#define MSR_IA32_APICBASE_BASE		(0xfffff<<12)
-#define		GET_APIC_BASE(x)	(((x) >> 12) << 12)
-
-#define APIC_BASE_MSR	0x800
-#define X2APIC_ENABLE	(1UL << 10)
-#define	APIC_ID		0x20
-#define	APIC_LVR	0x30
-#define		GET_APIC_ID_FIELD(x)	(((x) >> 24) & 0xFF)
-#define	APIC_TASKPRI	0x80
-#define	APIC_PROCPRI	0xA0
-#define	APIC_EOI	0xB0
-#define	APIC_SPIV	0xF0
-#define		APIC_SPIV_FOCUS_DISABLED	(1 << 9)
-#define		APIC_SPIV_APIC_ENABLED		(1 << 8)
-#define	APIC_ICR	0x300
-#define		APIC_DEST_SELF		0x40000
-#define		APIC_DEST_ALLINC	0x80000
-#define		APIC_DEST_ALLBUT	0xC0000
-#define		APIC_ICR_RR_MASK	0x30000
-#define		APIC_ICR_RR_INVALID	0x00000
-#define		APIC_ICR_RR_INPROG	0x10000
-#define		APIC_ICR_RR_VALID	0x20000
-#define		APIC_INT_LEVELTRIG	0x08000
-#define		APIC_INT_ASSERT		0x04000
-#define		APIC_ICR_BUSY		0x01000
-#define		APIC_DEST_LOGICAL	0x00800
-#define		APIC_DEST_PHYSICAL	0x00000
-#define		APIC_DM_FIXED		0x00000
-#define		APIC_DM_FIXED_MASK	0x00700
-#define		APIC_DM_LOWEST		0x00100
-#define		APIC_DM_SMI		0x00200
-#define		APIC_DM_REMRD		0x00300
-#define		APIC_DM_NMI		0x00400
-#define		APIC_DM_INIT		0x00500
-#define		APIC_DM_STARTUP		0x00600
-#define		APIC_DM_EXTINT		0x00700
-#define		APIC_VECTOR_MASK	0x000FF
-#define	APIC_ICR2	0x310
-#define		SET_APIC_DEST_FIELD(x)	((x) << 24)
-
 /* VMX_EPT_VPID_CAP bits */
 #define VMX_EPT_VPID_CAP_AD_BITS       (1ULL << 21)
 
diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index 65eb1079a161..516c81d86353 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -10,6 +10,7 @@
 
 #include <stdint.h>
 #include "processor.h"
+#include "apic.h"
 
 /*
  * Definitions of Primary Processor-Based VM-Execution Controls.
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 09/12] KVM: selftests: Hoist APIC functions out of individual tests
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (7 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 08/12] KVM: selftests: Move APIC definitions into a separate file Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 10/12] KVM: selftests: Introduce x2APIC register manipulation functions Jim Mattson
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson, Oliver Upton

Move the APIC functions into the library to encourage code reuse and
to avoid unintended deviations.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/Makefile          |  2 +-
 .../selftests/kvm/include/x86_64/apic.h       | 23 ++++++++
 .../selftests/kvm/include/x86_64/processor.h  |  2 +
 tools/testing/selftests/kvm/lib/x86_64/apic.c | 46 +++++++++++++++
 .../testing/selftests/kvm/x86_64/evmcs_test.c | 11 +---
 .../selftests/kvm/x86_64/set_boot_cpu_id.c    |  6 +-
 .../selftests/kvm/x86_64/xapic_ipi_test.c     | 59 +++----------------
 7 files changed, 83 insertions(+), 66 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/apic.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index daaee1888b12..5e9051dd4336 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -34,7 +34,7 @@ ifeq ($(ARCH),s390)
 endif
 
 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
-LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
+LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
 
diff --git a/tools/testing/selftests/kvm/include/x86_64/apic.h b/tools/testing/selftests/kvm/include/x86_64/apic.h
index 0d0e35c8866b..e5a9fe040a6c 100644
--- a/tools/testing/selftests/kvm/include/x86_64/apic.h
+++ b/tools/testing/selftests/kvm/include/x86_64/apic.h
@@ -8,6 +8,10 @@
 #ifndef SELFTEST_KVM_APIC_H
 #define SELFTEST_KVM_APIC_H
 
+#include <stdint.h>
+
+#include "processor.h"
+
 #define APIC_DEFAULT_GPA		0xfee00000ULL
 
 /* APIC base address MSR and fields */
@@ -55,4 +59,23 @@
 #define	APIC_ICR2	0x310
 #define		SET_APIC_DEST_FIELD(x)	((x) << 24)
 
+void apic_disable(void);
+void xapic_enable(void);
+void x2apic_enable(void);
+
+static inline uint32_t get_bsp_flag(void)
+{
+	return rdmsr(MSR_IA32_APICBASE) & MSR_IA32_APICBASE_BSP;
+}
+
+static inline uint32_t xapic_read_reg(unsigned int reg)
+{
+	return ((volatile uint32_t *)APIC_DEFAULT_GPA)[reg >> 2];
+}
+
+static inline void xapic_write_reg(unsigned int reg, uint32_t val)
+{
+	((volatile uint32_t *)APIC_DEFAULT_GPA)[reg >> 2] = val;
+}
+
 #endif /* SELFTEST_KVM_APIC_H */
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index a4729d9032ce..9a5b47d2d5d6 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -13,6 +13,8 @@
 
 #include <asm/msr-index.h>
 
+#include "../kvm_util.h"
+
 #define X86_EFLAGS_FIXED	 (1u << 1)
 
 #define X86_CR4_VME		(1ul << 0)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/apic.c b/tools/testing/selftests/kvm/lib/x86_64/apic.c
new file mode 100644
index 000000000000..31f318ac67ba
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/x86_64/apic.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * tools/testing/selftests/kvm/lib/x86_64/processor.c
+ *
+ * Copyright (C) 2021, Google LLC.
+ */
+
+#include "apic.h"
+
+void apic_disable(void)
+{
+	wrmsr(MSR_IA32_APICBASE,
+	      rdmsr(MSR_IA32_APICBASE) &
+		~(MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD));
+}
+
+void xapic_enable(void)
+{
+	uint64_t val = rdmsr(MSR_IA32_APICBASE);
+
+	/* Per SDM: to enable xAPIC when in x2APIC must first disable APIC */
+	if (val & MSR_IA32_APICBASE_EXTD) {
+		apic_disable();
+		wrmsr(MSR_IA32_APICBASE,
+		      rdmsr(MSR_IA32_APICBASE) | MSR_IA32_APICBASE_ENABLE);
+	} else if (!(val & MSR_IA32_APICBASE_ENABLE)) {
+		wrmsr(MSR_IA32_APICBASE, val | MSR_IA32_APICBASE_ENABLE);
+	}
+
+	/*
+	 * Per SDM: reset value of spurious interrupt vector register has the
+	 * APIC software enabled bit=0. It must be enabled in addition to the
+	 * enable bit in the MSR.
+	 */
+	val = xapic_read_reg(APIC_SPIV) | APIC_SPIV_APIC_ENABLED;
+	xapic_write_reg(APIC_SPIV, val);
+}
+
+void x2apic_enable(void)
+{
+	uint32_t spiv_reg = APIC_BASE_MSR + (APIC_SPIV >> 4);
+
+	wrmsr(MSR_IA32_APICBASE, rdmsr(MSR_IA32_APICBASE) |
+	      MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD);
+	wrmsr(spiv_reg, rdmsr(spiv_reg) | APIC_SPIV_APIC_ENABLED);
+}
diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
index 63096cea26c6..d058d9e428c6 100644
--- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
@@ -22,15 +22,6 @@
 
 static int ud_count;
 
-void enable_x2apic(void)
-{
-	uint32_t spiv_reg = APIC_BASE_MSR + (APIC_SPIV >> 4);
-
-	wrmsr(MSR_IA32_APICBASE, rdmsr(MSR_IA32_APICBASE) |
-	      MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD);
-	wrmsr(spiv_reg, rdmsr(spiv_reg) | APIC_SPIV_APIC_ENABLED);
-}
-
 static void guest_ud_handler(struct ex_regs *regs)
 {
 	ud_count++;
@@ -59,7 +50,7 @@ void guest_code(struct vmx_pages *vmx_pages)
 #define L2_GUEST_STACK_SIZE 64
 	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
 
-	enable_x2apic();
+	x2apic_enable();
 
 	GUEST_SYNC(1);
 	GUEST_SYNC(2);
diff --git a/tools/testing/selftests/kvm/x86_64/set_boot_cpu_id.c b/tools/testing/selftests/kvm/x86_64/set_boot_cpu_id.c
index 12c558fc8074..5f8dd74d415f 100644
--- a/tools/testing/selftests/kvm/x86_64/set_boot_cpu_id.c
+++ b/tools/testing/selftests/kvm/x86_64/set_boot_cpu_id.c
@@ -14,16 +14,12 @@
 #include "test_util.h"
 #include "kvm_util.h"
 #include "processor.h"
+#include "apic.h"
 
 #define N_VCPU 2
 #define VCPU_ID0 0
 #define VCPU_ID1 1
 
-static uint32_t get_bsp_flag(void)
-{
-	return rdmsr(MSR_IA32_APICBASE) & MSR_IA32_APICBASE_BSP;
-}
-
 static void guest_bsp_vcpu(void *arg)
 {
 	GUEST_SYNC(1);
diff --git a/tools/testing/selftests/kvm/x86_64/xapic_ipi_test.c b/tools/testing/selftests/kvm/x86_64/xapic_ipi_test.c
index 2f964cdc273c..21b22718a9db 100644
--- a/tools/testing/selftests/kvm/x86_64/xapic_ipi_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xapic_ipi_test.c
@@ -42,8 +42,6 @@
 #define HALTER_VCPU_ID 0
 #define SENDER_VCPU_ID 1
 
-volatile uint32_t *apic_base = (volatile uint32_t *)APIC_DEFAULT_GPA;
-
 /*
  * Vector for IPI from sender vCPU to halting vCPU.
  * Value is arbitrary and was chosen for the alternating bit pattern. Any
@@ -86,45 +84,6 @@ struct thread_params {
 	uint64_t *pipis_rcvd; /* host address of ipis_rcvd global */
 };
 
-uint32_t read_apic_reg(uint reg)
-{
-	return apic_base[reg >> 2];
-}
-
-void write_apic_reg(uint reg, uint32_t val)
-{
-	apic_base[reg >> 2] = val;
-}
-
-void disable_apic(void)
-{
-	wrmsr(MSR_IA32_APICBASE,
-	      rdmsr(MSR_IA32_APICBASE) &
-		~(MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD));
-}
-
-void enable_xapic(void)
-{
-	uint64_t val = rdmsr(MSR_IA32_APICBASE);
-
-	/* Per SDM: to enable xAPIC when in x2APIC must first disable APIC */
-	if (val & MSR_IA32_APICBASE_EXTD) {
-		disable_apic();
-		wrmsr(MSR_IA32_APICBASE,
-		      rdmsr(MSR_IA32_APICBASE) | MSR_IA32_APICBASE_ENABLE);
-	} else if (!(val & MSR_IA32_APICBASE_ENABLE)) {
-		wrmsr(MSR_IA32_APICBASE, val | MSR_IA32_APICBASE_ENABLE);
-	}
-
-	/*
-	 * Per SDM: reset value of spurious interrupt vector register has the
-	 * APIC software enabled bit=0. It must be enabled in addition to the
-	 * enable bit in the MSR.
-	 */
-	val = read_apic_reg(APIC_SPIV) | APIC_SPIV_APIC_ENABLED;
-	write_apic_reg(APIC_SPIV, val);
-}
-
 void verify_apic_base_addr(void)
 {
 	uint64_t msr = rdmsr(MSR_IA32_APICBASE);
@@ -136,10 +95,10 @@ void verify_apic_base_addr(void)
 static void halter_guest_code(struct test_data_page *data)
 {
 	verify_apic_base_addr();
-	enable_xapic();
+	xapic_enable();
 
-	data->halter_apic_id = GET_APIC_ID_FIELD(read_apic_reg(APIC_ID));
-	data->halter_lvr = read_apic_reg(APIC_LVR);
+	data->halter_apic_id = GET_APIC_ID_FIELD(xapic_read_reg(APIC_ID));
+	data->halter_lvr = xapic_read_reg(APIC_LVR);
 
 	/*
 	 * Loop forever HLTing and recording halts & wakes. Disable interrupts
@@ -150,8 +109,8 @@ static void halter_guest_code(struct test_data_page *data)
 	 * TPR and PPR for diagnostic purposes in case the test fails.
 	 */
 	for (;;) {
-		data->halter_tpr = read_apic_reg(APIC_TASKPRI);
-		data->halter_ppr = read_apic_reg(APIC_PROCPRI);
+		data->halter_tpr = xapic_read_reg(APIC_TASKPRI);
+		data->halter_ppr = xapic_read_reg(APIC_PROCPRI);
 		data->hlt_count++;
 		asm volatile("sti; hlt; cli");
 		data->wake_count++;
@@ -166,7 +125,7 @@ static void halter_guest_code(struct test_data_page *data)
 static void guest_ipi_handler(struct ex_regs *regs)
 {
 	ipis_rcvd++;
-	write_apic_reg(APIC_EOI, 77);
+	xapic_write_reg(APIC_EOI, 77);
 }
 
 static void sender_guest_code(struct test_data_page *data)
@@ -179,7 +138,7 @@ static void sender_guest_code(struct test_data_page *data)
 	uint64_t tsc_start;
 
 	verify_apic_base_addr();
-	enable_xapic();
+	xapic_enable();
 
 	/*
 	 * Init interrupt command register for sending IPIs
@@ -206,8 +165,8 @@ static void sender_guest_code(struct test_data_page *data)
 		 * First IPI can be sent unconditionally because halter vCPU
 		 * starts earlier.
 		 */
-		write_apic_reg(APIC_ICR2, icr2_val);
-		write_apic_reg(APIC_ICR, icr_val);
+		xapic_write_reg(APIC_ICR2, icr2_val);
+		xapic_write_reg(APIC_ICR, icr_val);
 		data->ipis_sent++;
 
 		/*
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 10/12] KVM: selftests: Introduce x2APIC register manipulation functions
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (8 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 09/12] KVM: selftests: Hoist APIC functions out of individual tests Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 11/12] KVM: selftests: Introduce prepare_tpr_shadow Jim Mattson
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson, Oliver Upton

Standardize reads and writes of the x2APIC MSRs.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/include/x86_64/apic.h | 10 ++++++++++
 tools/testing/selftests/kvm/lib/x86_64/apic.c     |  5 ++---
 tools/testing/selftests/kvm/x86_64/smm_test.c     |  4 ++--
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/apic.h b/tools/testing/selftests/kvm/include/x86_64/apic.h
index e5a9fe040a6c..0be4757f1f20 100644
--- a/tools/testing/selftests/kvm/include/x86_64/apic.h
+++ b/tools/testing/selftests/kvm/include/x86_64/apic.h
@@ -78,4 +78,14 @@ static inline void xapic_write_reg(unsigned int reg, uint32_t val)
 	((volatile uint32_t *)APIC_DEFAULT_GPA)[reg >> 2] = val;
 }
 
+static inline uint64_t x2apic_read_reg(unsigned int reg)
+{
+	return rdmsr(APIC_BASE_MSR + (reg >> 4));
+}
+
+static inline void x2apic_write_reg(unsigned int reg, uint64_t value)
+{
+	wrmsr(APIC_BASE_MSR + (reg >> 4), value);
+}
+
 #endif /* SELFTEST_KVM_APIC_H */
diff --git a/tools/testing/selftests/kvm/lib/x86_64/apic.c b/tools/testing/selftests/kvm/lib/x86_64/apic.c
index 31f318ac67ba..7168e25c194e 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/apic.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/apic.c
@@ -38,9 +38,8 @@ void xapic_enable(void)
 
 void x2apic_enable(void)
 {
-	uint32_t spiv_reg = APIC_BASE_MSR + (APIC_SPIV >> 4);
-
 	wrmsr(MSR_IA32_APICBASE, rdmsr(MSR_IA32_APICBASE) |
 	      MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD);
-	wrmsr(spiv_reg, rdmsr(spiv_reg) | APIC_SPIV_APIC_ENABLED);
+	x2apic_write_reg(APIC_SPIV,
+			 x2apic_read_reg(APIC_SPIV) | APIC_SPIV_APIC_ENABLED);
 }
diff --git a/tools/testing/selftests/kvm/x86_64/smm_test.c b/tools/testing/selftests/kvm/x86_64/smm_test.c
index 613c42c5a9b8..c1f831803ad2 100644
--- a/tools/testing/selftests/kvm/x86_64/smm_test.c
+++ b/tools/testing/selftests/kvm/x86_64/smm_test.c
@@ -55,8 +55,8 @@ static inline void sync_with_host(uint64_t phase)
 
 void self_smi(void)
 {
-	wrmsr(APIC_BASE_MSR + (APIC_ICR >> 4),
-	      APIC_DEST_SELF | APIC_INT_ASSERT | APIC_DM_SMI);
+	x2apic_write_reg(APIC_ICR,
+			 APIC_DEST_SELF | APIC_INT_ASSERT | APIC_DM_SMI);
 }
 
 void guest_code(void *arg)
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 11/12] KVM: selftests: Introduce prepare_tpr_shadow
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (9 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 10/12] KVM: selftests: Introduce x2APIC register manipulation functions Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-04 17:26 ` [PATCH v2 12/12] KVM: selftests: Add a test of an unbacked nested PI descriptor Jim Mattson
  2021-06-10 11:55 ` [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Paolo Bonzini
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson, Oliver Upton

Add support for yet another page to hang from the VMCS12 for nested
VMX testing: the virtual APIC page. This page is necessary for a
VMCS12 to be launched with the "use TPR shadow" VM-execution control
set (except in some oddball circumstances permitted by KVM).

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/include/x86_64/vmx.h | 5 +++++
 tools/testing/selftests/kvm/lib/x86_64/vmx.c     | 8 ++++++++
 2 files changed, 13 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index 516c81d86353..83ccb096b966 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -574,6 +574,10 @@ struct vmx_pages {
 	void *apic_access_hva;
 	uint64_t apic_access_gpa;
 	void *apic_access;
+
+	void *virtual_apic_hva;
+	uint64_t virtual_apic_gpa;
+	void *virtual_apic;
 };
 
 union vmx_basic {
@@ -618,5 +622,6 @@ void prepare_eptp(struct vmx_pages *vmx, struct kvm_vm *vm,
 		  uint32_t eptp_memslot);
 void prepare_virtualize_apic_accesses(struct vmx_pages *vmx, struct kvm_vm *vm,
 				      uint32_t eptp_memslot);
+void prepare_tpr_shadow(struct vmx_pages *vmx, struct kvm_vm *vm);
 
 #endif /* SELFTEST_KVM_VMX_H */
diff --git a/tools/testing/selftests/kvm/lib/x86_64/vmx.c b/tools/testing/selftests/kvm/lib/x86_64/vmx.c
index 2448b30e8efa..1023760d1bf7 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/vmx.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/vmx.c
@@ -551,3 +551,11 @@ void prepare_virtualize_apic_accesses(struct vmx_pages *vmx, struct kvm_vm *vm,
 	vmx->apic_access_hva = addr_gva2hva(vm, (uintptr_t)vmx->apic_access);
 	vmx->apic_access_gpa = addr_gva2gpa(vm, (uintptr_t)vmx->apic_access);
 }
+
+void prepare_tpr_shadow(struct vmx_pages *vmx, struct kvm_vm *vm)
+{
+	vmx->virtual_apic = (void *)vm_vaddr_alloc(vm, getpagesize(),
+						  0x10000, 0, 0);
+	vmx->virtual_apic_hva = addr_gva2hva(vm, (uintptr_t)vmx->virtual_apic);
+	vmx->virtual_apic_gpa = addr_gva2gpa(vm, (uintptr_t)vmx->virtual_apic);
+}
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 12/12] KVM: selftests: Add a test of an unbacked nested PI descriptor
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (10 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 11/12] KVM: selftests: Introduce prepare_tpr_shadow Jim Mattson
@ 2021-06-04 17:26 ` Jim Mattson
  2021-06-10 11:55 ` [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Paolo Bonzini
  12 siblings, 0 replies; 14+ messages in thread
From: Jim Mattson @ 2021-06-04 17:26 UTC (permalink / raw)
  To: kvm, pbonzini; +Cc: Jim Mattson, Oliver Upton

Add a regression test for the unsupported configuration of a VMCS12
posted interrupt descriptor that has no backing memory in L1. KVM
should exit to userspace with KVM_INTERNAL_ERROR rather than just
silently doing something wrong.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/vmx_pi_mmio_test.c   | 252 ++++++++++++++++++
 3 files changed, 254 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_pi_mmio_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 524c857a049c..43fe1bb037ac 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -34,6 +34,7 @@
 /x86_64/xen_vmcall_test
 /x86_64/xss_msr_test
 /x86_64/vmx_pmu_msrs_test
+/x86_64/vmx_pi_mmio_test
 /demand_paging_test
 /dirty_log_test
 /dirty_log_perf_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 5e9051dd4336..d7b7d6e641f0 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -67,6 +67,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/tsc_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_pmu_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test
+TEST_GEN_PROGS_x86_64 += x86_64/vmx_pi_mmio_test
 TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
 TEST_GEN_PROGS_x86_64 += dirty_log_perf_test
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pi_mmio_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pi_mmio_test.c
new file mode 100644
index 000000000000..2246899d8988
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pi_mmio_test.c
@@ -0,0 +1,252 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * vmx_pi_mmio_test
+ *
+ * Copyright (C) 2021, Google LLC.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ *
+ * Test that an L2 vCPU can be launched with an unbacked posted
+ * interrupt descriptor, but that any attempt to send that vCPU its
+ * posted interrupt notification vector will result in an exit to
+ * userspace with KVM_INTERNAL_ERROR.
+ *
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <pthread.h>
+#include <inttypes.h>
+#include <unistd.h>
+
+#include "kvm_util.h"
+#include "processor.h"
+#include "test_util.h"
+#include "vmx.h"
+
+#include "kselftest.h"
+
+#define RECEIVER_VCPU_ID	0
+#define SENDER_VCPU_ID		1
+
+#define L2_GUEST_STACK_SIZE	64
+
+#define TIMEOUT_SECS		10
+
+#define L1_PI_VECTOR		33
+
+static struct kvm_vm *vm;
+
+static bool l2_active;
+
+static void l2_guest_code(void)
+{
+	l2_active = true;
+	__asm__ __volatile__("hlt");
+	/* NOT REACHED */
+}
+
+static void l1_receiver_code(struct vmx_pages *vmx_pages,
+			     unsigned long high_gpa)
+{
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	uint32_t control;
+
+	x2apic_enable();
+
+	GUEST_ASSERT(prepare_for_vmx_operation(vmx_pages));
+	GUEST_ASSERT(load_vmcs(vmx_pages));
+
+	prepare_vmcs(vmx_pages, l2_guest_code,
+		     &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+	control = vmreadz(PIN_BASED_VM_EXEC_CONTROL);
+	control |= PIN_BASED_EXT_INTR_MASK |
+		PIN_BASED_POSTED_INTR;
+	vmwrite(PIN_BASED_VM_EXEC_CONTROL, control);
+
+	control = vmreadz(CPU_BASED_VM_EXEC_CONTROL);
+	control |= CPU_BASED_TPR_SHADOW |
+		CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
+	vmwrite(CPU_BASED_VM_EXEC_CONTROL, control);
+
+	control = vmreadz(SECONDARY_VM_EXEC_CONTROL);
+	control |= SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY;
+	vmwrite(SECONDARY_VM_EXEC_CONTROL, control);
+
+	control = vmreadz(VM_EXIT_CONTROLS);
+	control |= VM_EXIT_ACK_INTR_ON_EXIT;
+	vmwrite(VM_EXIT_CONTROLS, control);
+
+	vmwrite(VIRTUAL_APIC_PAGE_ADDR, vmx_pages->virtual_apic_gpa);
+	vmwrite(POSTED_INTR_NV, L1_PI_VECTOR);
+	vmwrite(POSTED_INTR_DESC_ADDR, high_gpa);
+
+	GUEST_ASSERT(!vmlaunch());
+	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
+
+	GUEST_DONE();
+}
+
+static void l1_sender_code(void *arg)
+{
+	x2apic_enable();
+
+	x2apic_write_reg(APIC_ICR,
+			 APIC_INT_ASSERT | APIC_DEST_PHYSICAL |
+			 APIC_DM_FIXED | L1_PI_VECTOR |
+			 ((uint64_t)RECEIVER_VCPU_ID << 32));
+
+	GUEST_DONE();
+}
+
+static bool vcpu_run_loop(int vcpu_id)
+{
+	volatile struct kvm_run *run = vcpu_state(vm, vcpu_id);
+	bool done = false;
+	struct ucall uc;
+
+	while (!done) {
+		vcpu_run(vm, vcpu_id);
+
+		if (run->exit_reason != KVM_EXIT_IO)
+			break;
+
+		switch (get_ucall(vm, vcpu_id, &uc)) {
+		case UCALL_ABORT:
+			TEST_FAIL("vCPU  %d: %s at %s:%ld", vcpu_id,
+				  (const char *)uc.args[0], __FILE__,
+				  uc.args[1]);
+			/* NOT REACHED */
+		case UCALL_SYNC:
+			break;
+		case UCALL_DONE:
+			done = true;
+			break;
+		default:
+			TEST_FAIL("vCPU %d: Unknown ucall %lu",
+				  vcpu_id, uc.cmd);
+			/* NOT REACHED */
+		}
+	}
+
+	return done;
+}
+
+static void *receiver(void *arg)
+{
+	volatile struct kvm_run *run = vcpu_state(vm, RECEIVER_VCPU_ID);
+	unsigned long high_gpa = *(unsigned long *)arg;
+	vm_vaddr_t vmx_pages_gva;
+	struct vmx_pages *vmx;
+	bool success;
+
+	vmx = vcpu_alloc_vmx(vm, &vmx_pages_gva);
+	prepare_tpr_shadow(vmx, vm);
+	vcpu_args_set(vm, RECEIVER_VCPU_ID, 2, vmx_pages_gva, high_gpa);
+
+	success = vcpu_run_loop(RECEIVER_VCPU_ID);
+	TEST_ASSERT(!success, "Receiver didn't fail as expected.\n");
+	TEST_ASSERT(run->exit_reason ==
+		    KVM_EXIT_INTERNAL_ERROR,
+		    "Exit reason isn't KVM_EXIT_INTERNAL_ERROR: %u (%s).\n",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+	TEST_ASSERT(run->internal.suberror ==
+		    KVM_INTERNAL_ERROR_EMULATION,
+		    "Internal suberror isn't KVM_INTERNAL_ERROR_EMULATION: %u.\n",
+		    run->internal.suberror);
+
+	return NULL;
+}
+
+static void sender(void)
+{
+	volatile struct kvm_run *run = vcpu_state(vm, SENDER_VCPU_ID);
+	bool success;
+
+	success = vcpu_run_loop(SENDER_VCPU_ID);
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+		    "Sender didn't exit with KVM_EXIT_IO: %u (%s).\n",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+	TEST_ASSERT(success, "Sender didn't complete successfully.\n");
+}
+
+void check_constraints(void)
+{
+	uint64_t msr;
+
+	nested_vmx_check_supported();
+
+	msr = kvm_get_feature_msr(MSR_IA32_VMX_PINBASED_CTLS) >> 32;
+	if (!(msr & PIN_BASED_EXT_INTR_MASK)) {
+		print_skip("Cannot enable \"external-interrupt exiting\"");
+		exit(KSFT_SKIP);
+	}
+	if (!(msr & PIN_BASED_POSTED_INTR)) {
+		print_skip("Cannot enable \"process posted interrupts\"");
+		exit(KSFT_SKIP);
+	}
+
+	msr = kvm_get_feature_msr(MSR_IA32_VMX_PROCBASED_CTLS) >> 32;
+	if (!(msr & CPU_BASED_TPR_SHADOW)) {
+		print_skip("Cannot enable \"use TPR shadow\"");
+		exit(KSFT_SKIP);
+	}
+	if (!(msr & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)) {
+		print_skip("Cannot enable \"activate secondary controls\"");
+		exit(KSFT_SKIP);
+	}
+
+	msr = kvm_get_feature_msr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32;
+	if (!(msr & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)) {
+		print_skip("Cannot enable \"virtual-interrupt delivery\"");
+		exit(KSFT_SKIP);
+	}
+
+	msr = kvm_get_feature_msr(MSR_IA32_VMX_EXIT_CTLS) >> 32;
+	if (!(msr & VM_EXIT_ACK_INTR_ON_EXIT)) {
+		print_skip("Cannot enable \"acknowledge interrupt on exit\"");
+		exit(KSFT_SKIP);
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	unsigned int paddr_width;
+	unsigned int vaddr_width;
+	unsigned long high_gpa;
+	pthread_t thread;
+	bool *l2_active_hva;
+	int r;
+
+	kvm_get_cpu_address_width(&paddr_width, &vaddr_width);
+	high_gpa = (1ul << paddr_width) - getpagesize();
+	if ((unsigned long)DEFAULT_GUEST_PHY_PAGES * getpagesize() > high_gpa) {
+		print_skip("No unbacked physical page available");
+		exit(KSFT_SKIP);
+	}
+
+	check_constraints();
+
+	vm = vm_create_default(RECEIVER_VCPU_ID, 0, (void *)l1_receiver_code);
+	vm_vcpu_add_default(vm, SENDER_VCPU_ID, (void *)l1_sender_code);
+	vcpu_set_cpuid(vm, SENDER_VCPU_ID, kvm_get_supported_cpuid());
+
+	r = pthread_create(&thread, NULL, receiver, &high_gpa);
+	TEST_ASSERT(r == 0,
+		    "pthread_create failed errno=%d", errno);
+
+	alarm(TIMEOUT_SECS);
+	l2_active_hva = (bool *)addr_gva2hva(vm, (vm_vaddr_t)&l2_active);
+	while (!*l2_active_hva)
+		pthread_yield();
+
+	sender();
+
+	r = pthread_join(thread, NULL);
+	TEST_ASSERT(r == 0, "pthread_join failed with errno=%d", r);
+
+	kvm_vm_free(vm);
+
+	return 0;
+}
-- 
2.32.0.rc1.229.g3e70b5a671-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue
  2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
                   ` (11 preceding siblings ...)
  2021-06-04 17:26 ` [PATCH v2 12/12] KVM: selftests: Add a test of an unbacked nested PI descriptor Jim Mattson
@ 2021-06-10 11:55 ` Paolo Bonzini
  12 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2021-06-10 11:55 UTC (permalink / raw)
  To: Jim Mattson, kvm

On 04/06/21 19:25, Jim Mattson wrote:
> When the VMCS12 posted interrupt descriptor isn't backed by an L1
> memslot, kvm will launch vmcs02 with a stale posted interrupt
> descriptor. Before commit 6beb7bd52e48 ("kvm: nVMX: Refactor
> nested_get_vmcs12_pages()"), kvm would have silently disabled the
> VMCS02 "process posted interrupts" VM-execution control. Both
> behaviors are wrong, though the use-after-free is more egregious.
> 
> Empirical tests on actual hardware reveal that a posted interrupt
> descriptor without any backing memory/device has PCI bus error
> semantics (reads return all 1's and writes are discarded). However,
> kvm can't tell an unbacked address from an MMIO address. Normally, kvm
> would ask userspace for an MMIO completion, but that's complicated for
> a posted interrupt descriptor access. There are already a number of
> cases where kvm bails out to userspace with KVM_INTERNAL_ERROR via
> kvm_handle_memory_failure, so that seems like the best route to take.
> 
> It would be relatively easy to invoke kvm_handle_memory_failure at
> emulated VM-entry, but that approach would break existing
> kvm-unit-tests. Moreover, the issue doesn't really come up until the
> vCPU--in virtualized VMX non-root operation--received the posted
> interrupt notification vector indicated in its VMCS12.
> 
> Sadly, it's really hard to arrange for an exit to userspace from
> vmx_complete_nested_posted_interrupt, which is where kvm actually
> needs to access the unbacked PID. Initially, I added a new kvm request
> for a userspace exit on the next guest entry, but Sean hated that
> approach. Based on his suggestion, I added the plumbing to get back
> out to userspace in the event of an error in
> vmx_complete_nested_posted_interrupt. This works in the case of an
> unbacked PID, but it doesn't work quite as well in the case of an
> unbacked virtual APIC page (another case where kvm was happy to just
> silently ignore the problem and attempt to muddle its way through.) In
> that case, this series is an incremental improvement, but it's not a
> complete fix.
> 
> v1 -> v2:
>    05/12: Modified kvm_arch_vcpu_ioctl_get_mpstate() so that it
>           returns the error code from kvm_apic_accept_events() if
> 	 said error code is less than 0.
>    07/12: Changed a comment based on Sean's feedback.
> 
> Jim Mattson (12):
>    KVM: x86: Remove guest mode check from kvm_check_nested_events
>    KVM: x86: Wake up a vCPU when kvm_check_nested_events fails
>    KVM: nVMX: Add a return code to vmx_complete_nested_posted_interrupt
>    KVM: x86: Add a return code to inject_pending_event
>    KVM: x86: Add a return code to kvm_apic_accept_events
>    KVM: nVMX: Fail on MMIO completion for nested posted interrupts
>    KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't
>      mappable
>    KVM: selftests: Move APIC definitions into a separate file
>    KVM: selftests: Hoist APIC functions out of individual tests
>    KVM: selftests: Introduce x2APIC register manipulation functions
>    KVM: selftests: Introduce prepare_tpr_shadow
>    KVM: selftests: Add a test of an unbacked nested PI descriptor
> 
>   arch/x86/kvm/lapic.c                          |  11 +-
>   arch/x86/kvm/lapic.h                          |   2 +-
>   arch/x86/kvm/vmx/nested.c                     |  31 ++-
>   arch/x86/kvm/x86.c                            |  59 ++--
>   tools/testing/selftests/kvm/.gitignore        |   1 +
>   tools/testing/selftests/kvm/Makefile          |   3 +-
>   .../selftests/kvm/include/x86_64/apic.h       |  91 +++++++
>   .../selftests/kvm/include/x86_64/processor.h  |  49 +---
>   .../selftests/kvm/include/x86_64/vmx.h        |   6 +
>   tools/testing/selftests/kvm/lib/x86_64/apic.c |  45 ++++
>   tools/testing/selftests/kvm/lib/x86_64/vmx.c  |   8 +
>   .../testing/selftests/kvm/x86_64/evmcs_test.c |  11 +-
>   .../selftests/kvm/x86_64/set_boot_cpu_id.c    |   6 +-
>   tools/testing/selftests/kvm/x86_64/smm_test.c |   4 +-
>   .../selftests/kvm/x86_64/vmx_pi_mmio_test.c   | 252 ++++++++++++++++++
>   .../selftests/kvm/x86_64/xapic_ipi_test.c     |  59 +---
>   16 files changed, 490 insertions(+), 148 deletions(-)
>   create mode 100644 tools/testing/selftests/kvm/include/x86_64/apic.h
>   create mode 100644 tools/testing/selftests/kvm/lib/x86_64/apic.c
>   create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_pi_mmio_test.c
> 

Queued, thanks.  I'll leave out temporarily 2/10/12 until I figure out 
what's going on.

Paolo


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-06-10 11:55 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-04 17:25 [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Jim Mattson
2021-06-04 17:26 ` [PATCH v2 01/12] KVM: x86: Remove guest mode check from kvm_check_nested_events Jim Mattson
2021-06-04 17:26 ` [PATCH v2 02/12] KVM: x86: Wake up a vCPU when kvm_check_nested_events fails Jim Mattson
2021-06-04 17:26 ` [PATCH v2 03/12] KVM: nVMX: Add a return code to vmx_complete_nested_posted_interrupt Jim Mattson
2021-06-04 17:26 ` [PATCH v2 04/12] KVM: x86: Add a return code to inject_pending_event Jim Mattson
2021-06-04 17:26 ` [PATCH v2 05/12] KVM: x86: Add a return code to kvm_apic_accept_events Jim Mattson
2021-06-04 17:26 ` [PATCH v2 06/12] KVM: nVMX: Fail on MMIO completion for nested posted interrupts Jim Mattson
2021-06-04 17:26 ` [PATCH v2 07/12] KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't mappable Jim Mattson
2021-06-04 17:26 ` [PATCH v2 08/12] KVM: selftests: Move APIC definitions into a separate file Jim Mattson
2021-06-04 17:26 ` [PATCH v2 09/12] KVM: selftests: Hoist APIC functions out of individual tests Jim Mattson
2021-06-04 17:26 ` [PATCH v2 10/12] KVM: selftests: Introduce x2APIC register manipulation functions Jim Mattson
2021-06-04 17:26 ` [PATCH v2 11/12] KVM: selftests: Introduce prepare_tpr_shadow Jim Mattson
2021-06-04 17:26 ` [PATCH v2 12/12] KVM: selftests: Add a test of an unbacked nested PI descriptor Jim Mattson
2021-06-10 11:55 ` [PATCH v2 00/12] KVM: nVMX: Fix vmcs02 PID use-after-free issue Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).