kvm.vger.kernel.org archive mirror
* [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call
@ 2021-08-18  8:50 Oliver Upton
  2021-08-18  8:50 ` [PATCH 1/4] KVM: arm64: Fix read-side race on updates to vcpu reset state Oliver Upton
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Oliver Upton @ 2021-08-18  8:50 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Marc Zyngier, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Oliver Upton

The CPU_ON PSCI call requires careful coordination between vCPUs in KVM,
as it allows callers to send a payload (pc, context id) to another vCPU
to start execution. There are a couple of races in the handling of
CPU_ON:

 - KVM uses the kvm->lock to serialize the write-side of a vCPU's reset
   state. However, kvm_vcpu_reset() doesn't take the lock on the
   read side, meaning the vCPU could be reset with interleaved state
   from two separate CPU_ON calls.

 - If a targeted vCPU never enters the guest again (say, the VMM was
   getting ready to migrate), then the reset payload is never actually
   folded into the vCPU's registers. Despite this, the calling vCPU has
   already made the target runnable. Migrating the target vCPU at this
   time will result in execution from its old PC, not execution coming
   out of the reset state at the requested address.

Patch 1 addresses the read-side race in KVM's CPU_ON implementation.

Patch 2 fixes the KVM/VMM race by resetting a vCPU (if requested)
whenever the VMM tries to read out its registers. Gross, but it avoids
exposing the vcpu_reset_state structure through some other UAPI. That is
undesirable, as we really are only trying to paper over the
implementation details of PSCI in KVM.

Patch 3 is unrelated, and is based on my own reading of the PSCI
specification. In short, if you invoke PSCI_ON from AArch64, then you
must set the Aff3 bits. This is impossible if you use the 32 bit
function, since the arguments are only 32 bits. Just return
INVALID_PARAMS to the guest in this case.

This series cleanly applies to kvm-arm/next at the following commit:

ae280335cdb5 ("Merge branch kvm-arm64/mmu/el2-tracking into kvmarm-master/next")

The series was tested with the included KVM selftest on an Ampere Mt.
Jade system. Broken behavior was verified using the same test on
kvm-arm/next, sans this series.

Oliver Upton (4):
  KVM: arm64: Fix read-side race on updates to vcpu reset state
  KVM: arm64: Handle PSCI resets before userspace touches vCPU state
  KVM: arm64: Enforce reserved bits for PSCI target affinities
  selftests: KVM: Introduce psci_cpu_on_test

 arch/arm64/kvm/arm.c                          |   9 ++
 arch/arm64/kvm/psci.c                         |  20 ++-
 arch/arm64/kvm/reset.c                        |  16 ++-
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/aarch64/psci_cpu_on_test.c  | 121 ++++++++++++++++++
 .../selftests/kvm/include/aarch64/processor.h |   3 +
 7 files changed, 162 insertions(+), 9 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/aarch64/psci_cpu_on_test.c

-- 
2.33.0.rc1.237.g0d66db33f3-goog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] KVM: arm64: Fix read-side race on updates to vcpu reset state
  2021-08-18  8:50 [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call Oliver Upton
@ 2021-08-18  8:50 ` Oliver Upton
  2021-08-18 10:06   ` Marc Zyngier
  2021-08-18  8:50 ` [PATCH 2/4] KVM: arm64: Handle PSCI resets before userspace touches vCPU state Oliver Upton
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Oliver Upton @ 2021-08-18  8:50 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Marc Zyngier, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Oliver Upton

KVM correctly serializes writes to a vCPU's reset state; however, since
we do not take the KVM lock on the read side, it is entirely possible to
read state from two different reset requests.

Cure the race for now by taking the KVM lock when reading the
reset_state structure.

Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/arm64/kvm/reset.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 18ffc6ad67b8..3507e64ff8ad 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -210,10 +210,16 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
  */
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 {
+	struct vcpu_reset_state reset_state;
 	int ret;
 	bool loaded;
 	u32 pstate;
 
+	mutex_lock(&vcpu->kvm->lock);
+	memcpy(&reset_state, &vcpu->arch.reset_state, sizeof(reset_state));
+	vcpu->arch.reset_state.reset = false;
+	mutex_unlock(&vcpu->kvm->lock);
+
 	/* Reset PMU outside of the non-preemptible section */
 	kvm_pmu_vcpu_reset(vcpu);
 
@@ -276,8 +282,8 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	 * Additional reset state handling that PSCI may have imposed on us.
 	 * Must be done after all the sys_reg reset.
 	 */
-	if (vcpu->arch.reset_state.reset) {
-		unsigned long target_pc = vcpu->arch.reset_state.pc;
+	if (reset_state.reset) {
+		unsigned long target_pc = reset_state.pc;
 
 		/* Gracefully handle Thumb2 entry point */
 		if (vcpu_mode_is_32bit(vcpu) && (target_pc & 1)) {
@@ -286,13 +292,11 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 		}
 
 		/* Propagate caller endianness */
-		if (vcpu->arch.reset_state.be)
+		if (reset_state.be)
 			kvm_vcpu_set_be(vcpu);
 
 		*vcpu_pc(vcpu) = target_pc;
-		vcpu_set_reg(vcpu, 0, vcpu->arch.reset_state.r0);
-
-		vcpu->arch.reset_state.reset = false;
+		vcpu_set_reg(vcpu, 0, reset_state.r0);
 	}
 
 	/* Reset timer */
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [PATCH 2/4] KVM: arm64: Handle PSCI resets before userspace touches vCPU state
  2021-08-18  8:50 [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call Oliver Upton
  2021-08-18  8:50 ` [PATCH 1/4] KVM: arm64: Fix read-side race on updates to vcpu reset state Oliver Upton
@ 2021-08-18  8:50 ` Oliver Upton
  2021-08-18 10:38   ` Marc Zyngier
  2021-08-18  8:50 ` [PATCH 3/4] KVM: arm64: Enforce reserved bits for PSCI target affinities Oliver Upton
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Oliver Upton @ 2021-08-18  8:50 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Marc Zyngier, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Oliver Upton

The CPU_ON PSCI call takes a payload that KVM uses to configure a
destination vCPU to run. This payload is non-architectural state and not
exposed through any existing UAPI. Effectively, we have a race between
CPU_ON and userspace saving/restoring a guest: if the target vCPU isn't
run again before the VMM saves its state, the requested PC and context
ID are lost. When restored, the target vCPU will be runnable and start
executing at its old PC.

We can avoid this race by making sure the reset payload is serviced
before userspace can access a vCPU's state. This is, of course, a hairy
ugly hack. A benefit of such a hack, though, is that we've managed to
massage the reset state into the architected state, thereby making it
migratable without forcing userspace to play our game with a UAPI
addition.

Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
Signed-off-by: Oliver Upton <oupton@google.com>
---
I really hate this, but my imagination is failing me on any other way to
cure the race without cluing in userspace. Any ideas?

 arch/arm64/kvm/arm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 0de4b41c3706..6b124c29c663 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1216,6 +1216,15 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		if (copy_from_user(&reg, argp, sizeof(reg)))
 			break;
 
+		/*
+		 * Ugly hack: we may owe a reset due to PSCI and not yet have
+		 * serviced it. Prevent userspace from reading/writing state
+		 * that will be clobbered by the eventual handling of the reset
+		 * bit.
+		 */
+		if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
+			kvm_reset_vcpu(vcpu);
+
 		if (ioctl == KVM_SET_ONE_REG)
 			r = kvm_arm_set_reg(vcpu, &reg);
 		else
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [PATCH 3/4] KVM: arm64: Enforce reserved bits for PSCI target affinities
  2021-08-18  8:50 [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call Oliver Upton
  2021-08-18  8:50 ` [PATCH 1/4] KVM: arm64: Fix read-side race on updates to vcpu reset state Oliver Upton
  2021-08-18  8:50 ` [PATCH 2/4] KVM: arm64: Handle PSCI resets before userspace touches vCPU state Oliver Upton
@ 2021-08-18  8:50 ` Oliver Upton
  2021-08-18 11:12   ` Marc Zyngier
  2021-08-18  8:50 ` [PATCH 4/4] selftests: KVM: Introduce psci_cpu_on_test Oliver Upton
  2021-08-18 11:32 ` [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call Marc Zyngier
  4 siblings, 1 reply; 10+ messages in thread
From: Oliver Upton @ 2021-08-18  8:50 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Marc Zyngier, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Oliver Upton

Some calls in PSCI take a target affinity argument, defined to be
bit-compatible with the affinity fields in MPIDR_EL1. All other bits in
the parameter are reserved and must be 0. Return INVALID_PARAMETERS if
the guest incorrectly sets a reserved bit.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/arm64/kvm/psci.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index db4056ecccfd..bb76be01abd2 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -59,6 +59,17 @@ static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
 	kvm_vcpu_kick(vcpu);
 }
 
+static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
+					   unsigned long affinity)
+{
+	unsigned long mask = MPIDR_HWID_BITMASK;
+
+	if (vcpu_mode_is_32bit(vcpu))
+		mask &= ~((u32) 0);
+
+	return !(affinity & ~mask);
+}
+
 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 {
 	struct vcpu_reset_state *reset_state;
@@ -66,9 +77,9 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 	struct kvm_vcpu *vcpu = NULL;
 	unsigned long cpu_id;
 
-	cpu_id = smccc_get_arg1(source_vcpu) & MPIDR_HWID_BITMASK;
-	if (vcpu_mode_is_32bit(source_vcpu))
-		cpu_id &= ~((u32) 0);
+	cpu_id = smccc_get_arg1(source_vcpu);
+	if (!kvm_psci_valid_affinity(source_vcpu, cpu_id))
+		return PSCI_RET_INVALID_PARAMS;
 
 	vcpu = kvm_mpidr_to_vcpu(kvm, cpu_id);
 
@@ -126,6 +137,9 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
 	target_affinity = smccc_get_arg1(vcpu);
 	lowest_affinity_level = smccc_get_arg2(vcpu);
 
+	if (!kvm_psci_valid_affinity(vcpu, target_affinity))
+		return PSCI_RET_INVALID_PARAMS;
+
 	/* Determine target affinity mask */
 	target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
 	if (!target_affinity_mask)
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* [PATCH 4/4] selftests: KVM: Introduce psci_cpu_on_test
  2021-08-18  8:50 [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call Oliver Upton
                   ` (2 preceding siblings ...)
  2021-08-18  8:50 ` [PATCH 3/4] KVM: arm64: Enforce reserved bits for PSCI target affinities Oliver Upton
@ 2021-08-18  8:50 ` Oliver Upton
  2021-08-18 14:42   ` Andrew Jones
  2021-08-18 11:32 ` [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call Marc Zyngier
  4 siblings, 1 reply; 10+ messages in thread
From: Oliver Upton @ 2021-08-18  8:50 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Marc Zyngier, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Oliver Upton

Introduce a test for aarch64 that ensures CPU resets induced by PSCI are
reflected in the target vCPU's state, even if the target is never run
again. This is a regression test for a race between vCPU migration and
PSCI.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/aarch64/psci_cpu_on_test.c  | 121 ++++++++++++++++++
 .../selftests/kvm/include/aarch64/processor.h |   3 +
 4 files changed, 126 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/aarch64/psci_cpu_on_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 0709af0144c8..98053d3afbda 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 /aarch64/debug-exceptions
 /aarch64/get-reg-list
+/aarch64/psci_cpu_on_test
 /aarch64/vgic_init
 /s390x/memop
 /s390x/resets
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 5832f510a16c..5d05801ab816 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -86,6 +86,7 @@ TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
 
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
+TEST_GEN_PROGS_aarch64 += aarch64/psci_cpu_on_test
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
 TEST_GEN_PROGS_aarch64 += demand_paging_test
 TEST_GEN_PROGS_aarch64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/aarch64/psci_cpu_on_test.c b/tools/testing/selftests/kvm/aarch64/psci_cpu_on_test.c
new file mode 100644
index 000000000000..17a6234b4b42
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/psci_cpu_on_test.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * psci_cpu_on_test - Test that the observable state of a vCPU targeted by the
+ * CPU_ON PSCI call matches what the caller requested.
+ *
+ * Copyright (c) 2021 Google LLC.
+ *
+ * This is a regression test for a race between KVM servicing the PSCI call and
+ * userspace reading the vCPU's registers.
+ */
+
+#define _GNU_SOURCE
+
+#include <linux/psci.h>
+
+#include "kvm_util.h"
+#include "processor.h"
+#include "test_util.h"
+
+#define VCPU_ID_SOURCE 0
+#define VCPU_ID_TARGET 1
+
+#define CPU_ON_ENTRY_ADDR 0xfeedf00dul
+#define CPU_ON_CONTEXT_ID 0xdeadc0deul
+
+static uint64_t psci_cpu_on(uint64_t target_cpu, uint64_t entry_addr,
+			    uint64_t context_id)
+{
+	register uint64_t x0 asm("x0") = PSCI_0_2_FN64_CPU_ON;
+	register uint64_t x1 asm("x1") = target_cpu;
+	register uint64_t x2 asm("x2") = entry_addr;
+	register uint64_t x3 asm("x3") = context_id;
+
+	asm("hvc #0"
+	    : "=r"(x0)
+	    : "r"(x0), "r"(x1), "r"(x2), "r"(x3)
+	    : "memory");
+
+	return x0;
+}
+
+static uint64_t psci_affinity_info(uint64_t target_affinity,
+				   uint64_t lowest_affinity_level)
+{
+	register uint64_t x0 asm("x0") = PSCI_0_2_FN64_AFFINITY_INFO;
+	register uint64_t x1 asm("x1") = target_affinity;
+	register uint64_t x2 asm("x2") = lowest_affinity_level;
+
+	asm("hvc #0"
+	    : "=r"(x0)
+	    : "r"(x0), "r"(x1), "r"(x2)
+	    : "memory");
+
+	return x0;
+}
+
+static void guest_main(uint64_t target_cpu)
+{
+	GUEST_ASSERT(!psci_cpu_on(target_cpu, CPU_ON_ENTRY_ADDR, CPU_ON_CONTEXT_ID));
+	uint64_t target_state;
+
+	do {
+		target_state = psci_affinity_info(target_cpu, 0);
+
+		GUEST_ASSERT((target_state == PSCI_0_2_AFFINITY_LEVEL_ON) ||
+			     (target_state == PSCI_0_2_AFFINITY_LEVEL_OFF));
+	} while (target_state != PSCI_0_2_AFFINITY_LEVEL_ON);
+
+	GUEST_DONE();
+}
+
+int main(void)
+{
+	uint64_t target_mpidr, obs_pc, obs_x0;
+	struct kvm_vcpu_init init;
+	struct kvm_vm *vm;
+	struct ucall uc;
+
+	vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
+	kvm_vm_elf_load(vm, program_invocation_name);
+	ucall_init(vm, NULL);
+
+	vm_ioctl(vm, KVM_ARM_PREFERRED_TARGET, &init);
+	init.features[0] |= (1 << KVM_ARM_VCPU_PSCI_0_2);
+
+	aarch64_vcpu_add_default(vm, VCPU_ID_SOURCE, &init, guest_main);
+
+	/*
+	 * make sure the target is already off when executing the test.
+	 */
+	init.features[0] |= (1 << KVM_ARM_VCPU_POWER_OFF);
+	aarch64_vcpu_add_default(vm, VCPU_ID_TARGET, &init, guest_main);
+
+	get_reg(vm, VCPU_ID_TARGET, ARM64_SYS_REG(MPIDR_EL1), &target_mpidr);
+	vcpu_args_set(vm, VCPU_ID_SOURCE, 1, target_mpidr & MPIDR_HWID_BITMASK);
+	vcpu_run(vm, VCPU_ID_SOURCE);
+
+	switch (get_ucall(vm, VCPU_ID_SOURCE, &uc)) {
+	case UCALL_DONE:
+		break;
+	case UCALL_ABORT:
+		TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0], __FILE__,
+			  uc.args[1]);
+		break;
+	default:
+		TEST_FAIL("Unhandled ucall: %lu", uc.cmd);
+	}
+
+	get_reg(vm, VCPU_ID_TARGET, ARM64_CORE_REG(regs.pc), &obs_pc);
+	get_reg(vm, VCPU_ID_TARGET, ARM64_CORE_REG(regs.regs[0]), &obs_x0);
+
+	TEST_ASSERT(obs_pc == CPU_ON_ENTRY_ADDR,
+		    "unexpected target cpu pc: %lx (expected: %lx)",
+		    obs_pc, CPU_ON_ENTRY_ADDR);
+	TEST_ASSERT(obs_x0 == CPU_ON_CONTEXT_ID,
+		    "unexpected target context id: %lx (expected: %lx)",
+		    obs_x0, CPU_ON_CONTEXT_ID);
+
+	kvm_vm_free(vm);
+	return 0;
+}
diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index 27dc5c2e56b9..c0273aefa63d 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -17,6 +17,7 @@
 #define CPACR_EL1               3, 0,  1, 0, 2
 #define TCR_EL1                 3, 0,  2, 0, 2
 #define MAIR_EL1                3, 0, 10, 2, 0
+#define MPIDR_EL1               3, 0,  0, 0, 5
 #define TTBR0_EL1               3, 0,  2, 0, 0
 #define SCTLR_EL1               3, 0,  1, 0, 0
 #define VBAR_EL1                3, 0, 12, 0, 0
@@ -40,6 +41,8 @@
 			  (0xfful << (4 * 8)) | \
 			  (0xbbul << (5 * 8)))
 
+#define MPIDR_HWID_BITMASK (0xff00fffffful)
+
 static inline void get_reg(struct kvm_vm *vm, uint32_t vcpuid, uint64_t id, uint64_t *addr)
 {
 	struct kvm_one_reg reg;
-- 
2.33.0.rc1.237.g0d66db33f3-goog



* Re: [PATCH 1/4] KVM: arm64: Fix read-side race on updates to vcpu reset state
  2021-08-18  8:50 ` [PATCH 1/4] KVM: arm64: Fix read-side race on updates to vcpu reset state Oliver Upton
@ 2021-08-18 10:06   ` Marc Zyngier
  0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2021-08-18 10:06 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose

On Wed, 18 Aug 2021 09:50:44 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> KVM correctly serializes writes to a vCPU's reset state, however since
> we do not take the KVM lock on the read side it is entirely possible to
> read state from two different reset requests.
> 
> Cure the race for now by taking the KVM lock when reading the
> reset_state structure.
> 
> Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/arm64/kvm/reset.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index 18ffc6ad67b8..3507e64ff8ad 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -210,10 +210,16 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
>   */
>  int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  {
> +	struct vcpu_reset_state reset_state;
>  	int ret;
>  	bool loaded;
>  	u32 pstate;
>  
> +	mutex_lock(&vcpu->kvm->lock);
> +	memcpy(&reset_state, &vcpu->arch.reset_state, sizeof(reset_state));

nit: "reset_state = vcpu->arch.reset_state;" should do the trick.

> +	vcpu->arch.reset_state.reset = false;

We should probably need to upgrade this to a WRITE_ONCE() to match the
PSCI side.

> +	mutex_unlock(&vcpu->kvm->lock);
> +
>  	/* Reset PMU outside of the non-preemptible section */
>  	kvm_pmu_vcpu_reset(vcpu);
>  
> @@ -276,8 +282,8 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  	 * Additional reset state handling that PSCI may have imposed on us.
>  	 * Must be done after all the sys_reg reset.
>  	 */
> -	if (vcpu->arch.reset_state.reset) {
> -		unsigned long target_pc = vcpu->arch.reset_state.pc;
> +	if (reset_state.reset) {
> +		unsigned long target_pc = reset_state.pc;
>  
>  		/* Gracefully handle Thumb2 entry point */
>  		if (vcpu_mode_is_32bit(vcpu) && (target_pc & 1)) {
> @@ -286,13 +292,11 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  		}
>  
>  		/* Propagate caller endianness */
> -		if (vcpu->arch.reset_state.be)
> +		if (reset_state.be)
>  			kvm_vcpu_set_be(vcpu);
>  
>  		*vcpu_pc(vcpu) = target_pc;
> -		vcpu_set_reg(vcpu, 0, vcpu->arch.reset_state.r0);
> -
> -		vcpu->arch.reset_state.reset = false;
> +		vcpu_set_reg(vcpu, 0, reset_state.r0);
>  	}
>  
>  	/* Reset timer */

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH 2/4] KVM: arm64: Handle PSCI resets before userspace touches vCPU state
  2021-08-18  8:50 ` [PATCH 2/4] KVM: arm64: Handle PSCI resets before userspace touches vCPU state Oliver Upton
@ 2021-08-18 10:38   ` Marc Zyngier
  0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2021-08-18 10:38 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose

On Wed, 18 Aug 2021 09:50:45 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> The CPU_ON PSCI call takes a payload that KVM uses to configure a
> destination vCPU to run. This payload is non-architectural state and not
> exposed through any existing UAPI. Effectively, we have a race between
> CPU_ON and userspace saving/restoring a guest: if the target vCPU isn't
> run again before the VMM saves its state, the requested PC and context
> ID are lost. When restored, the target vCPU will be runnable and start
> executing at its old PC.
> 
> We can avoid this race by making sure the reset payload is serviced
> before userspace can access a vCPU's state. This is, of course, a hairy
> ugly hack. A benefit of such a hack, though, is that we've managed to
> massage the reset state into the architected state, thereby making it
> migratable without forcing userspace to play our game with a UAPI
> addition.

I don't think it is that bad. In a way, it is similar to the "resync
pending exception state" dance that we do on vcpu exit to userspace.
One thing to note is that it only works because this is done from the
vcpu thread itself.

>
> Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
> I really hate this, but my imagination is failing me on any other way to
> cure the race without cluing in userspace. Any ideas?
> 
>  arch/arm64/kvm/arm.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 0de4b41c3706..6b124c29c663 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1216,6 +1216,15 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  		if (copy_from_user(&reg, argp, sizeof(reg)))
>  			break;
>  
> +		/*
> +		 * ugly hack. We could owe a reset due to PSCI and not yet
> +		 * serviced it. Prevent userspace from reading/writing state
> +		 * that will be clobbered by the eventual handling of the reset
> +		 * bit.

This reads a bit odd. You are taking care of two potential issues in
one go here:
- userspace writes won't be overwritten by a pending reset as they
will take place after said reset
- userspace reads will reflect the state of the freshly reset CPU
instead of some stale state

> +		 */
> +		if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
> +			kvm_reset_vcpu(vcpu);
> +
>  		if (ioctl == KVM_SET_ONE_REG)
>  			r = kvm_arm_set_reg(vcpu, &reg);
>  		else

Otherwise, well spotted.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH 3/4] KVM: arm64: Enforce reserved bits for PSCI target affinities
  2021-08-18  8:50 ` [PATCH 3/4] KVM: arm64: Enforce reserved bits for PSCI target affinities Oliver Upton
@ 2021-08-18 11:12   ` Marc Zyngier
  0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2021-08-18 11:12 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose

On Wed, 18 Aug 2021 09:50:46 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> Some calls in PSCI take a target affinity argument, defined to be
> bit-compatible with the affinity fields in MPIDR_EL1. All other bits in
> the parameter are reserved and must be 0.

For future reference, it may be worth quoting the spec (ARM DEN 0022D,
5.1.4 "CPU_ON").

> Return INVALID_PARAMETERS if the guest incorrectly sets a reserved
> bit.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/arm64/kvm/psci.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index db4056ecccfd..bb76be01abd2 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -59,6 +59,17 @@ static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
>  	kvm_vcpu_kick(vcpu);
>  }
>  
> +static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
> +					   unsigned long affinity)
> +{
> +	unsigned long mask = MPIDR_HWID_BITMASK;
> +
> +	if (vcpu_mode_is_32bit(vcpu))
> +		mask &= ~((u32) 0);

I don't think we need this anymore since 5.7:

- fdc9999e20cd ("KVM: arm64: PSCI: Forbid 64bit functions for 32bit
  guests") guarantees that the guest can't trick KVM into using the
  SMC64 functions.

- with 2890ac993daa ("KVM: arm64: PSCI: Narrow input registers when
  using 32bit functions"), the registers are always narrowed down to
  32bit

Put the two together, and you can't have a 32bit guest issuing a PSCI
operation with crap in the upper 32bits.

> +
> +	return !(affinity & ~mask);

So the whole helper can now be rewritten as

	return !(affinity & ~MPIDR_HWID_BITMASK);

> +}
> +
>  static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>  {
>  	struct vcpu_reset_state *reset_state;
> @@ -66,9 +77,9 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>  	struct kvm_vcpu *vcpu = NULL;
>  	unsigned long cpu_id;
>  
> -	cpu_id = smccc_get_arg1(source_vcpu) & MPIDR_HWID_BITMASK;
> -	if (vcpu_mode_is_32bit(source_vcpu))
> -		cpu_id &= ~((u32) 0);
> +	cpu_id = smccc_get_arg1(source_vcpu);
> +	if (!kvm_psci_valid_affinity(source_vcpu, cpu_id))
> +		return PSCI_RET_INVALID_PARAMS;
>  
>  	vcpu = kvm_mpidr_to_vcpu(kvm, cpu_id);
>  
> @@ -126,6 +137,9 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
>  	target_affinity = smccc_get_arg1(vcpu);
>  	lowest_affinity_level = smccc_get_arg2(vcpu);
>  
> +	if (!kvm_psci_valid_affinity(vcpu, target_affinity))
> +		return PSCI_RET_INVALID_PARAMS;
> +
>  	/* Determine target affinity mask */
>  	target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
>  	if (!target_affinity_mask)

Otherwise, looks good to me.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call
  2021-08-18  8:50 [PATCH 0/4] KVM: arm64: Fix some races in CPU_ON PSCI call Oliver Upton
                   ` (3 preceding siblings ...)
  2021-08-18  8:50 ` [PATCH 4/4] selftests: KVM: Introduce psci_cpu_on_test Oliver Upton
@ 2021-08-18 11:32 ` Marc Zyngier
  4 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2021-08-18 11:32 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Peter Shier, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Drew Jones

[+ Andrew for the selftest part.]

Hi Oliver,

On Wed, 18 Aug 2021 09:50:43 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> The CPU_ON PSCI call requires careful coordination between vCPUs in KVM,
> as it allows callers to send a payload (pc, context id) to another vCPU
> to start execution. There are a couple of races in the handling of
> CPU_ON:
> 
>  - KVM uses the kvm->lock to serialize the write-side of a vCPU's reset
>    state. However, kvm_vcpu_reset() doesn't take the lock on the
>    read side, meaning the vCPU could be reset with interleaved state
>    from two separate CPU_ON calls.
> 
>  - If a targeted vCPU never enters the guest again (say, the VMM was
>    getting ready to migrate), then the reset payload is never actually
>    folded into the vCPU's registers. Despite this, the calling vCPU has
>    already made the target runnable. Migrating the target vCPU at this
>    time will result in execution from its old PC, not execution coming
>    out of the reset state at the requested address.
> 
> Patch 1 addresses the read-side race in KVM's CPU_ON implementation.
> 
> Patch 2 fixes the KVM/VMM race by resetting a vCPU (if requested)
> whenever the VMM tries to read out its registers. Gross, but it avoids
> exposing the vcpu_reset_state structure through some other UAPI. That is
> undesirable, as we really are only trying to paper over the
> implementation details of PSCI in KVM.
> 
> Patch 3 is unrelated, and is based on my own reading of the PSCI
> specification. In short, if you invoke CPU_ON from AArch64, then you
> must set the Aff3 bits. This is impossible if you use the 32-bit
> function, since the arguments are only 32 bits. Just return
> INVALID_PARAMS to the guest in this case.

Overall, this looks pretty good, and I only have a few nits on the
first three patches. I'll let Andrew comment on the selftest, which
looks OK to me at least on the surface.

> 
> This series cleanly applies to kvm-arm/next at the following commit:
> 
> ae280335cdb5 ("Merge branch kvm-arm64/mmu/el2-tracking into kvmarm-master/next")
>

Another nit: In the future, please consider basing your series on a
stable tag (such as v5.14-rc4), as kvmarm/next is a bit of a moving
target (the individual commits are stable, but the merge commits
aren't). Basing something off -next should be reserved for the cases
where you are fixing something that is only broken there.
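For illustration, rooting a series branch at a tag rather than a moving
integration branch, shown in a throwaway repository (the tag and branch
names are stand-ins; in practice `v5.14-rc4` would come from Linus' tree):

```shell
# Toy repo standing in for a kernel checkout with a stable tag available.
git init -q demo
git -C demo -c user.email=kvm@example.com -c user.name="KVM Tester" \
	commit -q --allow-empty -m "upstream history"
git -C demo tag v5.14-rc4		# stands in for the real stable tag

# Root the series branch at the tag, not at a moving branch like kvmarm/next.
git -C demo checkout -q -b psci-fixes v5.14-rc4
git -C demo log --oneline -1
```

Reviewers can then reproduce the exact base with `git checkout v5.14-rc4`.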

Thanks,

	M.

> The series was tested with the included KVM selftest on an Ampere Mt.
> Jade system. Broken behavior was verified using the same test on
> kvm-arm/next, sans this series.
> 
> Oliver Upton (4):
>   KVM: arm64: Fix read-side race on updates to vcpu reset state
>   KVM: arm64: Handle PSCI resets before userspace touches vCPU state
>   KVM: arm64: Enforce reserved bits for PSCI target affinities
>   selftests: KVM: Introduce psci_cpu_on_test
> 
>  arch/arm64/kvm/arm.c                          |   9 ++
>  arch/arm64/kvm/psci.c                         |  20 ++-
>  arch/arm64/kvm/reset.c                        |  16 ++-
>  tools/testing/selftests/kvm/.gitignore        |   1 +
>  tools/testing/selftests/kvm/Makefile          |   1 +
>  .../selftests/kvm/aarch64/psci_cpu_on_test.c  | 121 ++++++++++++++++++
>  .../selftests/kvm/include/aarch64/processor.h |   3 +
>  7 files changed, 162 insertions(+), 9 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/aarch64/psci_cpu_on_test.c
> 

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH 4/4] selftests: KVM: Introduce psci_cpu_on_test
  2021-08-18  8:50 ` [PATCH 4/4] selftests: KVM: Introduce psci_cpu_on_test Oliver Upton
@ 2021-08-18 14:42   ` Andrew Jones
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Jones @ 2021-08-18 14:42 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Marc Zyngier, Peter Shier, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose

On Wed, Aug 18, 2021 at 08:50:47AM +0000, Oliver Upton wrote:
> Introduce a test for aarch64 that ensures CPU resets induced by PSCI are
> reflected in the target vCPU's state, even if the target is never run
> again. This is a regression test for a race between vCPU migration and
> PSCI.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  tools/testing/selftests/kvm/.gitignore        |   1 +
>  tools/testing/selftests/kvm/Makefile          |   1 +
>  .../selftests/kvm/aarch64/psci_cpu_on_test.c  | 121 ++++++++++++++++++
>  .../selftests/kvm/include/aarch64/processor.h |   3 +
>  4 files changed, 126 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/aarch64/psci_cpu_on_test.c
>

Reviewed-by: Andrew Jones <drjones@redhat.com>

Thanks,
drew


