linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace
@ 2021-06-08 15:48 Jean-Philippe Brucker
  2021-06-08 15:48 ` [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch Jean-Philippe Brucker
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-06-08 15:48 UTC (permalink / raw)
  To: maz
  Cc: kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini, corbet,
	james.morse, alexandru.elisei, suzuki.poulose, catalin.marinas,
	will, lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron, Jean-Philippe Brucker

Allow userspace to request handling PSCI calls from guests. Our goal is
to enable a vCPU hot-add solution for Arm where the VMM presents
possible resources to the guest at boot, and controls which vCPUs can be
brought up by allowing or denying PSCI CPU_ON calls. Passing HVC and
PSCI to userspace has been discussed on the list in the context of vCPU
hot-add [1,2] but it can also be useful for implementing other SMCCC and
vendor hypercalls [3,4,5].

Patches 1-3 allow userspace to request WFI to be executed in KVM. That
way the VMM can easily implement the PSCI CPU_SUSPEND function, which is
mandatory from PSCI v0.2 onwards (even if it doesn't have a more useful
implementation than WFI, natively available to the guest).

Patch 4 lets userspace request any HVC that isn't handled by KVM, and
patch 5 lets userspace request PSCI calls, disabling in-kernel PSCI
handling.

I'm focusing on the PSCI bits, but a complete prototype of vCPU hot-add
for arm64 on Linux and QEMU, most of it from Salil and James, is
available at [6].

[1] https://lore.kernel.org/kvmarm/82879258-46a7-a6e9-ee54-fc3692c1cdc3@arm.com/
[2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
    (Followed by KVM forum and Linaro Open discussions)
[3] https://lore.kernel.org/linux-arm-kernel/f56cf420-affc-35f0-2355-801a924b8a35@arm.com/
[4] https://lore.kernel.org/kvm/bf7e83f1-c58e-8d65-edd0-d08f27b8b766@arm.com/
[5] https://lore.kernel.org/kvm/1569338454-26202-2-git-send-email-guoheyi@huawei.com/
[6] https://jpbrucker.net/git/linux/log/?h=cpuhp/devel
    https://jpbrucker.net/git/qemu/log/?h=cpuhp/devel    

Jean-Philippe Brucker (5):
  KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch
  KVM: arm64: Move WFI execution to check_vcpu_requests()
  KVM: arm64: Allow userspace to request WFI
  KVM: arm64: Pass hypercalls to userspace
  KVM: arm64: Pass PSCI calls to userspace

 Documentation/virt/kvm/api.rst      | 46 +++++++++++++++----
 Documentation/virt/kvm/arm/psci.rst |  1 +
 arch/arm64/include/asm/kvm_host.h   | 10 +++-
 include/kvm/arm_hypercalls.h        |  1 +
 include/kvm/arm_psci.h              |  4 ++
 include/uapi/linux/kvm.h            |  3 ++
 arch/arm64/kvm/arm.c                | 71 +++++++++++++++++++++--------
 arch/arm64/kvm/handle_exit.c        |  3 +-
 arch/arm64/kvm/hypercalls.c         | 28 +++++++++++-
 arch/arm64/kvm/psci.c               | 69 ++++++++++++++--------------
 10 files changed, 170 insertions(+), 66 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch
  2021-06-08 15:48 [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace Jean-Philippe Brucker
@ 2021-06-08 15:48 ` Jean-Philippe Brucker
  2021-06-10 15:08   ` Jonathan Cameron
  2021-07-01  9:44   ` Fuad Tabba
  2021-06-08 15:48 ` [RFC PATCH 2/5] KVM: arm64: Move WFI execution to check_vcpu_requests() Jean-Philippe Brucker
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-06-08 15:48 UTC (permalink / raw)
  To: maz
  Cc: kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini, corbet,
	james.morse, alexandru.elisei, suzuki.poulose, catalin.marinas,
	will, lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron, Jean-Philippe Brucker

In order to add a new "suspend" power state, replace power_off with
mp_state in struct kvm_vcpu_arch. Factor the vcpu_off() function while
we're here.

No functional change intended.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 arch/arm64/include/asm/kvm_host.h |  6 ++++--
 arch/arm64/kvm/arm.c              | 29 +++++++++++++++--------------
 arch/arm64/kvm/psci.c             | 19 ++++++-------------
 3 files changed, 25 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7cd7d5c8c4bc..55a04f4d5919 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -340,8 +340,8 @@ struct kvm_vcpu_arch {
 		u32	mdscr_el1;
 	} guest_debug_preserved;
 
-	/* vcpu power-off state */
-	bool power_off;
+	/* vcpu power state (runnable, stopped, halted) */
+	u32 mp_state;
 
 	/* Don't run the guest (internal implementation need) */
 	bool pause;
@@ -720,6 +720,8 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
 			       struct kvm_device_attr *attr);
 int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 			       struct kvm_device_attr *attr);
+void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu);
+bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu);
 
 /* Guest/host FPSIMD coordination helpers */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e720148232a0..bcc24adb9c0a 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -435,21 +435,22 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	vcpu->cpu = -1;
 }
 
-static void vcpu_power_off(struct kvm_vcpu *vcpu)
+void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu)
 {
-	vcpu->arch.power_off = true;
+	vcpu->arch.mp_state = KVM_MP_STATE_STOPPED;
 	kvm_make_request(KVM_REQ_SLEEP, vcpu);
 	kvm_vcpu_kick(vcpu);
 }
 
+bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.mp_state == KVM_MP_STATE_STOPPED;
+}
+
 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
-	if (vcpu->arch.power_off)
-		mp_state->mp_state = KVM_MP_STATE_STOPPED;
-	else
-		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
-
+	mp_state->mp_state = vcpu->arch.mp_state;
 	return 0;
 }
 
@@ -460,10 +461,10 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 
 	switch (mp_state->mp_state) {
 	case KVM_MP_STATE_RUNNABLE:
-		vcpu->arch.power_off = false;
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		break;
 	case KVM_MP_STATE_STOPPED:
-		vcpu_power_off(vcpu);
+		kvm_arm_vcpu_power_off(vcpu);
 		break;
 	default:
 		ret = -EINVAL;
@@ -483,7 +484,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
 {
 	bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF);
 	return ((irq_lines || kvm_vgic_vcpu_pending_irq(v))
-		&& !v->arch.power_off && !v->arch.pause);
+		&& !kvm_arm_vcpu_is_off(v) && !v->arch.pause);
 }
 
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
@@ -643,10 +644,10 @@ static void vcpu_req_sleep(struct kvm_vcpu *vcpu)
 	struct rcuwait *wait = kvm_arch_vcpu_get_wait(vcpu);
 
 	rcuwait_wait_event(wait,
-			   (!vcpu->arch.power_off) &&(!vcpu->arch.pause),
+			   !kvm_arm_vcpu_is_off(vcpu) && !vcpu->arch.pause,
 			   TASK_INTERRUPTIBLE);
 
-	if (vcpu->arch.power_off || vcpu->arch.pause) {
+	if (kvm_arm_vcpu_is_off(vcpu) || vcpu->arch.pause) {
 		/* Awaken to handle a signal, request we sleep again later. */
 		kvm_make_request(KVM_REQ_SLEEP, vcpu);
 	}
@@ -1087,9 +1088,9 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
 	 * Handle the "start in power-off" case.
 	 */
 	if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
-		vcpu_power_off(vcpu);
+		kvm_arm_vcpu_power_off(vcpu);
 	else
-		vcpu->arch.power_off = false;
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 
 	return 0;
 }
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index db4056ecccfd..24b4a2265dbd 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -52,13 +52,6 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 	return PSCI_RET_SUCCESS;
 }
 
-static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
-{
-	vcpu->arch.power_off = true;
-	kvm_make_request(KVM_REQ_SLEEP, vcpu);
-	kvm_vcpu_kick(vcpu);
-}
-
 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 {
 	struct vcpu_reset_state *reset_state;
@@ -78,7 +71,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 	 */
 	if (!vcpu)
 		return PSCI_RET_INVALID_PARAMS;
-	if (!vcpu->arch.power_off) {
+	if (!kvm_arm_vcpu_is_off(vcpu)) {
 		if (kvm_psci_version(source_vcpu, kvm) != KVM_ARM_PSCI_0_1)
 			return PSCI_RET_ALREADY_ON;
 		else
@@ -107,7 +100,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 	 */
 	smp_wmb();
 
-	vcpu->arch.power_off = false;
+	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 	kvm_vcpu_wake_up(vcpu);
 
 	return PSCI_RET_SUCCESS;
@@ -142,7 +135,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
 		mpidr = kvm_vcpu_get_mpidr_aff(tmp);
 		if ((mpidr & target_affinity_mask) == target_affinity) {
 			matching_cpus++;
-			if (!tmp->arch.power_off)
+			if (!kvm_arm_vcpu_is_off(tmp))
 				return PSCI_0_2_AFFINITY_LEVEL_ON;
 		}
 	}
@@ -168,7 +161,7 @@ static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
 	 * re-initialized.
 	 */
 	kvm_for_each_vcpu(i, tmp, vcpu->kvm)
-		tmp->arch.power_off = true;
+		tmp->arch.mp_state = KVM_MP_STATE_STOPPED;
 	kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
 
 	memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
@@ -237,7 +230,7 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 		val = kvm_psci_vcpu_suspend(vcpu);
 		break;
 	case PSCI_0_2_FN_CPU_OFF:
-		kvm_psci_vcpu_off(vcpu);
+		kvm_arm_vcpu_power_off(vcpu);
 		val = PSCI_RET_SUCCESS;
 		break;
 	case PSCI_0_2_FN_CPU_ON:
@@ -350,7 +343,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 
 	switch (psci_fn) {
 	case KVM_PSCI_FN_CPU_OFF:
-		kvm_psci_vcpu_off(vcpu);
+		kvm_arm_vcpu_power_off(vcpu);
 		val = PSCI_RET_SUCCESS;
 		break;
 	case KVM_PSCI_FN_CPU_ON:
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 2/5] KVM: arm64: Move WFI execution to check_vcpu_requests()
  2021-06-08 15:48 [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace Jean-Philippe Brucker
  2021-06-08 15:48 ` [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch Jean-Philippe Brucker
@ 2021-06-08 15:48 ` Jean-Philippe Brucker
  2021-07-01  9:45   ` Fuad Tabba
  2021-06-08 15:48 ` [RFC PATCH 3/5] KVM: arm64: Allow userspace to request WFI Jean-Philippe Brucker
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-06-08 15:48 UTC (permalink / raw)
  To: maz
  Cc: kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini, corbet,
	james.morse, alexandru.elisei, suzuki.poulose, catalin.marinas,
	will, lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron, Jean-Philippe Brucker

Prepare for WFI requests from userspace, by adding a suspend request and
moving the WFI execution into check_vcpu_requests(), next to the
power-off logic.

vcpu->arch.mp_state, previously only RUNNABLE or STOPPED, supports an
additional state HALTED and two new state transitions:

  RUNNABLE -> HALTED	from WFI or PSCI CPU_SUSPEND (same vCPU)
  HALTED -> RUNNABLE	vGIC IRQ, pending timer, signal

There shouldn't be any functional change with this patch, even though
the KVM_GET_MP_STATE ioctl could now in theory return
KVM_MP_STATE_HALTED, which would break some users' mp_state support. In
practice it should not happen because we do not return to userspace with
HALTED state. Both WFI and PSCI CPU_SUSPEND stay in the vCPU run loop
until the suspend request is consumed. It does feel fragile though,
maybe we should explicitly return RUNNABLE in KVM_GET_MP_STATE in place
of HALTED, to prevent future breakage.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 arch/arm64/include/asm/kvm_host.h |  2 ++
 arch/arm64/kvm/arm.c              | 18 ++++++++++++++-
 arch/arm64/kvm/handle_exit.c      |  3 +--
 arch/arm64/kvm/psci.c             | 37 +++++++++++++------------------
 4 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 55a04f4d5919..3ca732feb9a5 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -46,6 +46,7 @@
 #define KVM_REQ_VCPU_RESET	KVM_ARCH_REQ(2)
 #define KVM_REQ_RECORD_STEAL	KVM_ARCH_REQ(3)
 #define KVM_REQ_RELOAD_GICv4	KVM_ARCH_REQ(4)
+#define KVM_REQ_SUSPEND		KVM_ARCH_REQ(5)
 
 #define KVM_DIRTY_LOG_MANUAL_CAPS   (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
 				     KVM_DIRTY_LOG_INITIALLY_SET)
@@ -722,6 +723,7 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 			       struct kvm_device_attr *attr);
 void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu);
 bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu);
+void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu);
 
 /* Guest/host FPSIMD coordination helpers */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index bcc24adb9c0a..d8cbaa0373c7 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -447,6 +447,12 @@ bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu)
 	return vcpu->arch.mp_state == KVM_MP_STATE_STOPPED;
 }
 
+void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
+	kvm_make_request(KVM_REQ_SUSPEND, vcpu);
+}
+
 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state)
 {
@@ -667,6 +673,8 @@ static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
 
 static void check_vcpu_requests(struct kvm_vcpu *vcpu)
 {
+	bool irq_pending;
+
 	if (kvm_request_pending(vcpu)) {
 		if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
 			vcpu_req_sleep(vcpu);
@@ -678,7 +686,7 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
 		 * Clear IRQ_PENDING requests that were made to guarantee
 		 * that a VCPU sees new virtual interrupts.
 		 */
-		kvm_check_request(KVM_REQ_IRQ_PENDING, vcpu);
+		irq_pending = kvm_check_request(KVM_REQ_IRQ_PENDING, vcpu);
 
 		if (kvm_check_request(KVM_REQ_RECORD_STEAL, vcpu))
 			kvm_update_stolen_time(vcpu);
@@ -690,6 +698,14 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
 			vgic_v4_load(vcpu);
 			preempt_enable();
 		}
+
+		if (kvm_check_request(KVM_REQ_SUSPEND, vcpu)) {
+			if (!irq_pending) {
+				kvm_vcpu_block(vcpu);
+				kvm_clear_request(KVM_REQ_UNHALT, vcpu);
+			}
+			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+		}
 	}
 }
 
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 6f48336b1d86..9717df3104cf 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -95,8 +95,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
 	} else {
 		trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
 		vcpu->stat.wfi_exit_stat++;
-		kvm_vcpu_block(vcpu);
-		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
+		kvm_arm_vcpu_suspend(vcpu);
 	}
 
 	kvm_incr_pc(vcpu);
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 24b4a2265dbd..42a307ceb95f 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -31,27 +31,6 @@ static unsigned long psci_affinity_mask(unsigned long affinity_level)
 	return 0;
 }
 
-static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
-{
-	/*
-	 * NOTE: For simplicity, we make VCPU suspend emulation to be
-	 * same-as WFI (Wait-for-interrupt) emulation.
-	 *
-	 * This means for KVM the wakeup events are interrupts and
-	 * this is consistent with intended use of StateID as described
-	 * in section 5.4.1 of PSCI v0.2 specification (ARM DEN 0022A).
-	 *
-	 * Further, we also treat power-down request to be same as
-	 * stand-by request as-per section 5.4.2 clause 3 of PSCI v0.2
-	 * specification (ARM DEN 0022A). This means all suspend states
-	 * for KVM will preserve the register state.
-	 */
-	kvm_vcpu_block(vcpu);
-	kvm_clear_request(KVM_REQ_UNHALT, vcpu);
-
-	return PSCI_RET_SUCCESS;
-}
-
 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 {
 	struct vcpu_reset_state *reset_state;
@@ -227,7 +206,21 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 		break;
 	case PSCI_0_2_FN_CPU_SUSPEND:
 	case PSCI_0_2_FN64_CPU_SUSPEND:
-		val = kvm_psci_vcpu_suspend(vcpu);
+		/*
+		 * NOTE: For simplicity, we make VCPU suspend emulation to be
+		 * same-as WFI (Wait-for-interrupt) emulation.
+		 *
+		 * This means for KVM the wakeup events are interrupts and this
+		 * is consistent with intended use of StateID as described in
+		 * section 5.4.1 of PSCI v0.2 specification (ARM DEN 0022A).
+		 *
+		 * Further, we also treat power-down request to be same as
+		 * stand-by request as-per section 5.4.2 clause 3 of PSCI v0.2
+		 * specification (ARM DEN 0022A). This means all suspend states
+		 * for KVM will preserve the register state.
+		 */
+		kvm_arm_vcpu_suspend(vcpu);
+		val = PSCI_RET_SUCCESS;
 		break;
 	case PSCI_0_2_FN_CPU_OFF:
 		kvm_arm_vcpu_power_off(vcpu);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 3/5] KVM: arm64: Allow userspace to request WFI
  2021-06-08 15:48 [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace Jean-Philippe Brucker
  2021-06-08 15:48 ` [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch Jean-Philippe Brucker
  2021-06-08 15:48 ` [RFC PATCH 2/5] KVM: arm64: Move WFI execution to check_vcpu_requests() Jean-Philippe Brucker
@ 2021-06-08 15:48 ` Jean-Philippe Brucker
  2021-07-01  9:45   ` Fuad Tabba
  2021-06-08 15:48 ` [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace Jean-Philippe Brucker
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-06-08 15:48 UTC (permalink / raw)
  To: maz
  Cc: kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini, corbet,
	james.morse, alexandru.elisei, suzuki.poulose, catalin.marinas,
	will, lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron, Jean-Philippe Brucker

To help userspace implement PSCI CPU_SUSPEND, allow setting the "HALTED"
MP state to request a WFI before returning to the guest.

Userspace won't obtain a HALTED mp_state from a KVM_GET_MP_STATE call
unless they set it themselves. When set by KVM, to handle wfi or
CPU_SUSPEND, it is consumed before returning to userspace.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 Documentation/virt/kvm/api.rst | 15 +++++++++------
 include/uapi/linux/kvm.h       |  1 +
 arch/arm64/kvm/arm.c           | 11 ++++++++++-
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 7fcb2fd38f42..e4fe7fb60d5d 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1416,8 +1416,8 @@ Possible values are:
                                  which has not yet received an INIT signal [x86]
    KVM_MP_STATE_INIT_RECEIVED    the vcpu has received an INIT signal, and is
                                  now ready for a SIPI [x86]
-   KVM_MP_STATE_HALTED           the vcpu has executed a HLT instruction and
-                                 is waiting for an interrupt [x86]
+   KVM_MP_STATE_HALTED           the vcpu has executed a HLT/WFI instruction
+                                 and is waiting for an interrupt [x86,arm64]
    KVM_MP_STATE_SIPI_RECEIVED    the vcpu has just received a SIPI (vector
                                  accessible via KVM_GET_VCPU_EVENTS) [x86]
    KVM_MP_STATE_STOPPED          the vcpu is stopped [s390,arm/arm64]
@@ -1435,8 +1435,9 @@ these architectures.
 For arm/arm64:
 ^^^^^^^^^^^^^^
 
-The only states that are valid are KVM_MP_STATE_STOPPED and
-KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
+Valid states are KVM_MP_STATE_STOPPED and KVM_MP_STATE_RUNNABLE which reflect
+if the vcpu is paused or not. If KVM_CAP_ARM_MP_HALTED is present, state
+KVM_MP_STATE_HALTED is also valid.
 
 4.39 KVM_SET_MP_STATE
 ---------------------
@@ -1457,8 +1458,10 @@ these architectures.
 For arm/arm64:
 ^^^^^^^^^^^^^^
 
-The only states that are valid are KVM_MP_STATE_STOPPED and
-KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
+Valid states are KVM_MP_STATE_STOPPED and KVM_MP_STATE_RUNNABLE which reflect
+if the vcpu should be paused or not. If KVM_CAP_ARM_MP_HALTED is present,
+KVM_MP_STATE_HALTED can be set, to wait for interrupts targeted at the vcpu
+before running it.
 
 4.40 KVM_SET_IDENTITY_MAP_ADDR
 ------------------------------
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 79d9c44d1ad7..06ba64c49737 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1083,6 +1083,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_SGX_ATTRIBUTE 196
 #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
 #define KVM_CAP_PTP_KVM 198
+#define KVM_CAP_ARM_MP_HALTED 199
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index d8cbaa0373c7..d6ad977fea5f 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -207,6 +207,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_VCPU_ATTRIBUTES:
 	case KVM_CAP_PTP_KVM:
+	case KVM_CAP_ARM_MP_HALTED:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
@@ -469,6 +470,9 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 	case KVM_MP_STATE_RUNNABLE:
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		break;
+	case KVM_MP_STATE_HALTED:
+		kvm_arm_vcpu_suspend(vcpu);
+		break;
 	case KVM_MP_STATE_STOPPED:
 		kvm_arm_vcpu_power_off(vcpu);
 		break;
@@ -699,7 +703,12 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
 			preempt_enable();
 		}
 
-		if (kvm_check_request(KVM_REQ_SUSPEND, vcpu)) {
+		/*
+		 * Check mp_state again in case userspace changed their mind
+		 * after requesting suspend.
+		 */
+		if (kvm_check_request(KVM_REQ_SUSPEND, vcpu) &&
+		    vcpu->arch.mp_state == KVM_MP_STATE_HALTED) {
 			if (!irq_pending) {
 				kvm_vcpu_block(vcpu);
 				kvm_clear_request(KVM_REQ_UNHALT, vcpu);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace
  2021-06-08 15:48 [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace Jean-Philippe Brucker
                   ` (2 preceding siblings ...)
  2021-06-08 15:48 ` [RFC PATCH 3/5] KVM: arm64: Allow userspace to request WFI Jean-Philippe Brucker
@ 2021-06-08 15:48 ` Jean-Philippe Brucker
  2021-06-10 16:02   ` Jonathan Cameron
                     ` (2 more replies)
  2021-06-08 15:48 ` [RFC PATCH 5/5] KVM: arm64: Pass PSCI calls " Jean-Philippe Brucker
                   ` (2 subsequent siblings)
  6 siblings, 3 replies; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-06-08 15:48 UTC (permalink / raw)
  To: maz
  Cc: kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini, corbet,
	james.morse, alexandru.elisei, suzuki.poulose, catalin.marinas,
	will, lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron, Jean-Philippe Brucker

Let userspace request to handle all hypercalls that aren't handled by
KVM, by setting the KVM_CAP_ARM_HVC_TO_USER capability.

With the help of another capability, this will allow userspace to handle
PSCI calls.

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---

Notes on this implementation:

* A similar mechanism was proposed for SDEI some time ago [1]. This RFC
  generalizes the idea to all hypercalls, since that was suggested on
  the list [2, 3].

* We're reusing kvm_run.hypercall. I copied x0-x5 into
  kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
  this, because:
  - Most user handlers will need to write results back into the
    registers (x0-x3 for SMCCC), so if we keep this shortcut we should
    go all the way and synchronize them on return to kernel.
  - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
    handling the call.
  - SMCCC uses x0-x16 for parameters.
  x0 does contain the SMCCC function ID and may be useful for fast
  dispatch, we could keep that plus the immediate number.

* Should we add a flag in the kvm_run.hypercall telling whether this is
  HVC or SMC?  Can be added later in those bottom longmode and pad
  fields.

* On top of this we could share with userspace which HVC ranges are
  available and which ones are handled by KVM. That can actually be
  added independently, through a vCPU/VM device attribute (which doesn't
  consume a new ioctl):
  - userspace issues HAS_ATTR ioctl on the VM fd to query whether this
    feature is available.
  - userspace queries the number N of HVC ranges using one GET_ATTR.
  - userspace passes an array of N ranges using another GET_ATTR.
    The array is filled and returned by KVM.

* Untested for AArch32 guests.

[1] https://lore.kernel.org/linux-arm-kernel/20170808164616.25949-12-james.morse@arm.com/
[2] https://lore.kernel.org/linux-arm-kernel/bf7e83f1-c58e-8d65-edd0-d08f27b8b766@arm.com/
[3] https://lore.kernel.org/linux-arm-kernel/f56cf420-affc-35f0-2355-801a924b8a35@arm.com/
---
 Documentation/virt/kvm/api.rst    | 17 +++++++++++++++--
 arch/arm64/include/asm/kvm_host.h |  1 +
 include/kvm/arm_psci.h            |  4 ++++
 include/uapi/linux/kvm.h          |  1 +
 arch/arm64/kvm/arm.c              |  5 +++++
 arch/arm64/kvm/hypercalls.c       | 28 +++++++++++++++++++++++++++-
 6 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e4fe7fb60d5d..3d8c1661e7b2 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5228,8 +5228,12 @@ to the byte array.
 			__u32 pad;
 		} hypercall;
 
-Unused.  This was once used for 'hypercall to userspace'.  To implement
-such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
+On x86 this was once used for 'hypercall to userspace'.  To implement such
+functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
+
+On arm64 it is used for hypercalls, when the KVM_CAP_ARM_HVC_TO_USER capability
+is enabled. 'nr' contains the HVC or SMC immediate. 'args' contains registers
+x0 - x5. The other parameters are unused.
 
 .. note:: KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
 
@@ -6894,3 +6898,12 @@ This capability is always enabled.
 This capability indicates that the KVM virtual PTP service is
 supported in the host. A VMM can check whether the service is
 available to the guest on migration.
+
+8.33 KVM_CAP_ARM_HVC_TO_USER
+----------------------------
+
+:Architecture: arm64
+
+This capability indicates that KVM can pass unhandled hypercalls to userspace,
+if the VMM enables it. Hypercalls are passed with KVM_EXIT_HYPERCALL in
+kvm_run::hypercall.
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3ca732feb9a5..25554ce97045 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -123,6 +123,7 @@ struct kvm_arch {
 	 * supported.
 	 */
 	bool return_nisv_io_abort_to_user;
+	bool hvc_to_user;
 
 	/*
 	 * VM-wide PMU filter, implemented as a bitmap and big enough for
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 5b58bd2fe088..d6b71a48fbb1 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -16,6 +16,10 @@
 
 #define KVM_ARM_PSCI_LATEST	KVM_ARM_PSCI_1_0
 
+#define KVM_PSCI_FN_LAST	KVM_PSCI_FN(3)
+#define PSCI_0_2_FN_LAST	PSCI_0_2_FN(0x3f)
+#define PSCI_0_2_FN64_LAST	PSCI_0_2_FN64(0x3f)
+
 /*
  * We need the KVM pointer independently from the vcpu as we can call
  * this from HYP, and need to apply kern_hyp_va on it...
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 06ba64c49737..aa831986a399 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1084,6 +1084,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
 #define KVM_CAP_PTP_KVM 198
 #define KVM_CAP_ARM_MP_HALTED 199
+#define KVM_CAP_ARM_HVC_TO_USER 200
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index d6ad977fea5f..074197721e97 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -93,6 +93,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		r = 0;
 		kvm->arch.return_nisv_io_abort_to_user = true;
 		break;
+	case KVM_CAP_ARM_HVC_TO_USER:
+		r = 0;
+		kvm->arch.hvc_to_user = true;
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -208,6 +212,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VCPU_ATTRIBUTES:
 	case KVM_CAP_PTP_KVM:
 	case KVM_CAP_ARM_MP_HALTED:
+	case KVM_CAP_ARM_HVC_TO_USER:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 30da78f72b3b..ccc2015eddf9 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -58,6 +58,28 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
 	val[3] = lower_32_bits(cycles);
 }
 
+static int kvm_hvc_user(struct kvm_vcpu *vcpu)
+{
+	int i;
+	struct kvm_run *run = vcpu->run;
+
+	if (!vcpu->kvm->arch.hvc_to_user) {
+		smccc_set_retval(vcpu, SMCCC_RET_NOT_SUPPORTED, 0, 0, 0);
+		return 1;
+	}
+
+	run->exit_reason = KVM_EXIT_HYPERCALL;
+	run->hypercall.nr = kvm_vcpu_hvc_get_imm(vcpu);
+	/* Copy the first parameters for fast access */
+	for (i = 0; i < 6; i++)
+		run->hypercall.args[i] = vcpu_get_reg(vcpu, i);
+	run->hypercall.ret = 0;
+	run->hypercall.longmode = 0;
+	run->hypercall.pad = 0;
+
+	return 0;
+}
+
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 {
 	u32 func_id = smccc_get_function(vcpu);
@@ -139,8 +161,12 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 	case ARM_SMCCC_TRNG_RND32:
 	case ARM_SMCCC_TRNG_RND64:
 		return kvm_trng_call(vcpu);
-	default:
+	case KVM_PSCI_FN_BASE...KVM_PSCI_FN_LAST:
+	case PSCI_0_2_FN_BASE...PSCI_0_2_FN_LAST:
+	case PSCI_0_2_FN64_BASE...PSCI_0_2_FN64_LAST:
 		return kvm_psci_call(vcpu);
+	default:
+		return kvm_hvc_user(vcpu);
 	}
 
 	smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 5/5] KVM: arm64: Pass PSCI calls to userspace
  2021-06-08 15:48 [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace Jean-Philippe Brucker
                   ` (3 preceding siblings ...)
  2021-06-08 15:48 ` [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace Jean-Philippe Brucker
@ 2021-06-08 15:48 ` Jean-Philippe Brucker
  2021-07-01  9:46   ` Fuad Tabba
  2021-07-19 15:29 ` [RFC PATCH 0/5] KVM: arm64: Pass PSCI " Alexandru Elisei
  2021-07-21 17:56 ` Jean-Philippe Brucker
  6 siblings, 1 reply; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-06-08 15:48 UTC (permalink / raw)
  To: maz
  Cc: kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini, corbet,
	james.morse, alexandru.elisei, suzuki.poulose, catalin.marinas,
	will, lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron, Jean-Philippe Brucker

Let userspace request to handle PSCI calls, by setting the new
KVM_CAP_ARM_PSCI_TO_USER capability.

SMCCC probe requires PSCI v1.x. If userspace only implements PSCI v0.2,
the guest won't query SMCCC support through PSCI and won't use the
spectre workarounds. We could hijack PSCI_VERSION and pretend to support
v1.0 if userspace does not, then handle all v1.0 calls ourselves
(including guessing the PSCI feature set implemented by the guest), but
that seems unnecessary. After all the API already allows userspace to
force a version lower than v1.0 using the firmware pseudo-registers.

The KVM_REG_ARM_PSCI_VERSION pseudo-register currently resets to either
v0.1 if userspace doesn't set KVM_ARM_VCPU_PSCI_0_2, or
KVM_ARM_PSCI_LATEST (1.0).

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 Documentation/virt/kvm/api.rst      | 14 ++++++++++++++
 Documentation/virt/kvm/arm/psci.rst |  1 +
 arch/arm64/include/asm/kvm_host.h   |  1 +
 include/kvm/arm_hypercalls.h        |  1 +
 include/uapi/linux/kvm.h            |  1 +
 arch/arm64/kvm/arm.c                | 10 +++++++---
 arch/arm64/kvm/hypercalls.c         |  2 +-
 arch/arm64/kvm/psci.c               | 13 +++++++++++++
 8 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 3d8c1661e7b2..f24eb70e575d 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6907,3 +6907,17 @@ available to the guest on migration.
 This capability indicates that KVM can pass unhandled hypercalls to userspace,
 if the VMM enables it. Hypercalls are passed with KVM_EXIT_HYPERCALL in
 kvm_run::hypercall.
+
+8.34 KVM_CAP_ARM_PSCI_TO_USER
+-----------------------------
+
+:Architectures: arm64
+
+When the VMM enables this capability, all PSCI calls are passed to userspace
+instead of being handled by KVM. Capability KVM_CAP_ARM_HVC_TO_USER must be
+enabled first.
+
+Userspace should support at least PSCI v1.0. Otherwise SMCCC features won't be
+available to the guest. Userspace does not need to handle the SMCCC_VERSION
+parameter for the PSCI_FEATURES function. The KVM_ARM_VCPU_PSCI_0_2 vCPU
+feature should be set even if this capability is enabled.
diff --git a/Documentation/virt/kvm/arm/psci.rst b/Documentation/virt/kvm/arm/psci.rst
index d52c2e83b5b8..110011d1fa3f 100644
--- a/Documentation/virt/kvm/arm/psci.rst
+++ b/Documentation/virt/kvm/arm/psci.rst
@@ -34,6 +34,7 @@ The following register is defined:
   - Allows any PSCI version implemented by KVM and compatible with
     v0.2 to be set with SET_ONE_REG
   - Affects the whole VM (even if the register view is per-vcpu)
+  - Defaults to PSCI 1.0 if userspace enables KVM_CAP_ARM_PSCI_TO_USER.
 
 * KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
     Holds the state of the firmware support to mitigate CVE-2017-5715, as
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 25554ce97045..5d74b769c16d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -124,6 +124,7 @@ struct kvm_arch {
 	 */
 	bool return_nisv_io_abort_to_user;
 	bool hvc_to_user;
+	bool psci_to_user;
 
 	/*
 	 * VM-wide PMU filter, implemented as a bitmap and big enough for
diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
index 0e2509d27910..b66c6a000ef3 100644
--- a/include/kvm/arm_hypercalls.h
+++ b/include/kvm/arm_hypercalls.h
@@ -6,6 +6,7 @@
 
 #include <asm/kvm_emulate.h>
 
+int kvm_hvc_user(struct kvm_vcpu *vcpu);
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
 
 static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index aa831986a399..2b8e55aa7e1e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1085,6 +1085,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PTP_KVM 198
 #define KVM_CAP_ARM_MP_HALTED 199
 #define KVM_CAP_ARM_HVC_TO_USER 200
+#define KVM_CAP_ARM_PSCI_TO_USER 201
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 074197721e97..bc3e63b0b3ad 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -83,7 +83,7 @@ int kvm_arch_check_processor_compat(void *opaque)
 int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			    struct kvm_enable_cap *cap)
 {
-	int r;
+	int r = -EINVAL;
 
 	if (cap->flags)
 		return -EINVAL;
@@ -97,8 +97,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		r = 0;
 		kvm->arch.hvc_to_user = true;
 		break;
-	default:
-		r = -EINVAL;
+	case KVM_CAP_ARM_PSCI_TO_USER:
+		if (kvm->arch.hvc_to_user) {
+			r = 0;
+			kvm->arch.psci_to_user = true;
+		}
 		break;
 	}
 
@@ -213,6 +216,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_PTP_KVM:
 	case KVM_CAP_ARM_MP_HALTED:
 	case KVM_CAP_ARM_HVC_TO_USER:
+	case KVM_CAP_ARM_PSCI_TO_USER:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index ccc2015eddf9..621d5a5b7e48 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -58,7 +58,7 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
 	val[3] = lower_32_bits(cycles);
 }
 
-static int kvm_hvc_user(struct kvm_vcpu *vcpu)
+int kvm_hvc_user(struct kvm_vcpu *vcpu)
 {
 	int i;
 	struct kvm_run *run = vcpu->run;
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 42a307ceb95f..7f44ee527966 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -353,6 +353,16 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static bool kvm_psci_call_is_user(struct kvm_vcpu *vcpu)
+{
+	/* Handle the special case of SMCCC probe through PSCI */
+	if (smccc_get_function(vcpu) == PSCI_1_0_FN_PSCI_FEATURES &&
+	    smccc_get_arg1(vcpu) == ARM_SMCCC_VERSION_FUNC_ID)
+		return false;
+
+	return vcpu->kvm->arch.psci_to_user;
+}
+
 /**
  * kvm_psci_call - handle PSCI call if r0 value is in range
  * @vcpu: Pointer to the VCPU struct
@@ -369,6 +379,9 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
  */
 int kvm_psci_call(struct kvm_vcpu *vcpu)
 {
+	if (kvm_psci_call_is_user(vcpu))
+		return kvm_hvc_user(vcpu);
+
 	switch (kvm_psci_version(vcpu, vcpu->kvm)) {
 	case KVM_ARM_PSCI_1_0:
 		return kvm_psci_1_0_call(vcpu);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch
  2021-06-08 15:48 ` [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch Jean-Philippe Brucker
@ 2021-06-10 15:08   ` Jonathan Cameron
  2021-07-01  9:44   ` Fuad Tabba
  1 sibling, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2021-06-10 15:08 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: maz, kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini,
	corbet, james.morse, alexandru.elisei, suzuki.poulose,
	catalin.marinas, will, lorenzo.pieralisi, salil.mehta,
	shameerali.kolothum.thodi

On Tue,  8 Jun 2021 17:48:02 +0200
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> In order to add a new "suspend" power state, replace power_off with
> mp_state in struct kvm_vcpu_arch. Factor the vcpu_off() function while
> we're here.

Hi Jean-Phillipe,

2 changes, so if you do end up doing a v2 I'd prefer the
factor out of kvm_arm_vcpu_power_off() + possibly introduced
kvm_arm_vcpu_is_off() using the old boolean.
Then the change in how you track the state will be a bit easier to
pick out.

> 
> No functional change intended.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>  arch/arm64/include/asm/kvm_host.h |  6 ++++--
>  arch/arm64/kvm/arm.c              | 29 +++++++++++++++--------------
>  arch/arm64/kvm/psci.c             | 19 ++++++-------------
>  3 files changed, 25 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 7cd7d5c8c4bc..55a04f4d5919 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -340,8 +340,8 @@ struct kvm_vcpu_arch {
>  		u32	mdscr_el1;
>  	} guest_debug_preserved;
>  
> -	/* vcpu power-off state */
> -	bool power_off;
> +	/* vcpu power state (runnable, stopped, halted) */
> +	u32 mp_state;
>  
>  	/* Don't run the guest (internal implementation need) */
>  	bool pause;
> @@ -720,6 +720,8 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>  			       struct kvm_device_attr *attr);
>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  			       struct kvm_device_attr *attr);
> +void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu);
> +bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu);
>  
>  /* Guest/host FPSIMD coordination helpers */
>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e720148232a0..bcc24adb9c0a 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -435,21 +435,22 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  	vcpu->cpu = -1;
>  }
>  
> -static void vcpu_power_off(struct kvm_vcpu *vcpu)
> +void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu)
>  {
> -	vcpu->arch.power_off = true;
> +	vcpu->arch.mp_state = KVM_MP_STATE_STOPPED;
>  	kvm_make_request(KVM_REQ_SLEEP, vcpu);
>  	kvm_vcpu_kick(vcpu);
>  }
>  
> +bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.mp_state == KVM_MP_STATE_STOPPED;
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>  				    struct kvm_mp_state *mp_state)
>  {
> -	if (vcpu->arch.power_off)
> -		mp_state->mp_state = KVM_MP_STATE_STOPPED;
> -	else
> -		mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
> -
> +	mp_state->mp_state = vcpu->arch.mp_state;

Nice to have a blank line here.

>  	return 0;
>  }
>  
> @@ -460,10 +461,10 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>  
>  	switch (mp_state->mp_state) {
>  	case KVM_MP_STATE_RUNNABLE:
> -		vcpu->arch.power_off = false;
> +		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>  		break;
>  	case KVM_MP_STATE_STOPPED:
> -		vcpu_power_off(vcpu);
> +		kvm_arm_vcpu_power_off(vcpu);
>  		break;
>  	default:
>  		ret = -EINVAL;
> @@ -483,7 +484,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
>  {
>  	bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF);
>  	return ((irq_lines || kvm_vgic_vcpu_pending_irq(v))
> -		&& !v->arch.power_off && !v->arch.pause);
> +		&& !kvm_arm_vcpu_is_off(v) && !v->arch.pause);
>  }
>  
>  bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> @@ -643,10 +644,10 @@ static void vcpu_req_sleep(struct kvm_vcpu *vcpu)
>  	struct rcuwait *wait = kvm_arch_vcpu_get_wait(vcpu);
>  
>  	rcuwait_wait_event(wait,
> -			   (!vcpu->arch.power_off) &&(!vcpu->arch.pause),
> +			   !kvm_arm_vcpu_is_off(vcpu) && !vcpu->arch.pause,
>  			   TASK_INTERRUPTIBLE);
>  
> -	if (vcpu->arch.power_off || vcpu->arch.pause) {
> +	if (kvm_arm_vcpu_is_off(vcpu) || vcpu->arch.pause) {
>  		/* Awaken to handle a signal, request we sleep again later. */
>  		kvm_make_request(KVM_REQ_SLEEP, vcpu);
>  	}
> @@ -1087,9 +1088,9 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
>  	 * Handle the "start in power-off" case.
>  	 */
>  	if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
> -		vcpu_power_off(vcpu);
> +		kvm_arm_vcpu_power_off(vcpu);
>  	else
> -		vcpu->arch.power_off = false;
> +		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>  
>  	return 0;
>  }
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index db4056ecccfd..24b4a2265dbd 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -52,13 +52,6 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
>  	return PSCI_RET_SUCCESS;
>  }
>  
> -static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
> -{
> -	vcpu->arch.power_off = true;
> -	kvm_make_request(KVM_REQ_SLEEP, vcpu);
> -	kvm_vcpu_kick(vcpu);
> -}
> -
>  static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>  {
>  	struct vcpu_reset_state *reset_state;
> @@ -78,7 +71,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>  	 */
>  	if (!vcpu)
>  		return PSCI_RET_INVALID_PARAMS;
> -	if (!vcpu->arch.power_off) {
> +	if (!kvm_arm_vcpu_is_off(vcpu)) {
>  		if (kvm_psci_version(source_vcpu, kvm) != KVM_ARM_PSCI_0_1)
>  			return PSCI_RET_ALREADY_ON;
>  		else
> @@ -107,7 +100,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>  	 */
>  	smp_wmb();
>  
> -	vcpu->arch.power_off = false;
> +	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>  	kvm_vcpu_wake_up(vcpu);
>  
>  	return PSCI_RET_SUCCESS;
> @@ -142,7 +135,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
>  		mpidr = kvm_vcpu_get_mpidr_aff(tmp);
>  		if ((mpidr & target_affinity_mask) == target_affinity) {
>  			matching_cpus++;
> -			if (!tmp->arch.power_off)
> +			if (!kvm_arm_vcpu_is_off(tmp))
>  				return PSCI_0_2_AFFINITY_LEVEL_ON;
>  		}
>  	}
> @@ -168,7 +161,7 @@ static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
>  	 * re-initialized.
>  	 */
>  	kvm_for_each_vcpu(i, tmp, vcpu->kvm)
> -		tmp->arch.power_off = true;
> +		tmp->arch.mp_state = KVM_MP_STATE_STOPPED;
>  	kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
>  
>  	memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
> @@ -237,7 +230,7 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
>  		val = kvm_psci_vcpu_suspend(vcpu);
>  		break;
>  	case PSCI_0_2_FN_CPU_OFF:
> -		kvm_psci_vcpu_off(vcpu);
> +		kvm_arm_vcpu_power_off(vcpu);
>  		val = PSCI_RET_SUCCESS;
>  		break;
>  	case PSCI_0_2_FN_CPU_ON:
> @@ -350,7 +343,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
>  
>  	switch (psci_fn) {
>  	case KVM_PSCI_FN_CPU_OFF:
> -		kvm_psci_vcpu_off(vcpu);
> +		kvm_arm_vcpu_power_off(vcpu);
>  		val = PSCI_RET_SUCCESS;
>  		break;
>  	case KVM_PSCI_FN_CPU_ON:


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace
  2021-06-08 15:48 ` [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace Jean-Philippe Brucker
@ 2021-06-10 16:02   ` Jonathan Cameron
  2021-07-01  9:46   ` Fuad Tabba
  2021-07-01  9:54   ` Russell King (Oracle)
  2 siblings, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2021-06-10 16:02 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: maz, kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini,
	corbet, james.morse, alexandru.elisei, suzuki.poulose,
	catalin.marinas, will, lorenzo.pieralisi, salil.mehta,
	shameerali.kolothum.thodi

On Tue,  8 Jun 2021 17:48:05 +0200
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> Let userspace request to handle all hypercalls that aren't handled by
> KVM, by setting the KVM_CAP_ARM_HVC_TO_USER capability.
> 
> With the help of another capability, this will allow userspace to handle
> PSCI calls.
> 
> Suggested-by: James Morse <james.morse@arm.com>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Trivial question inline.


> ---
> 
> Notes on this implementation:
> 
> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
>   generalizes the idea to all hypercalls, since that was suggested on
>   the list [2, 3].
> 
> * We're reusing kvm_run.hypercall. I copied x0-x5 into
>   kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
>   this, because:
>   - Most user handlers will need to write results back into the
>     registers (x0-x3 for SMCCC), so if we keep this shortcut we should
>     go all the way and synchronize them on return to kernel.
>   - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
>     handling the call.
>   - SMCCC uses x0-x16 for parameters.
>   x0 does contain the SMCCC function ID and may be useful for fast
>   dispatch, we could keep that plus the immediate number.
> 
> * Should we add a flag in the kvm_run.hypercall telling whether this is
>   HVC or SMC?  Can be added later in those bottom longmode and pad
>   fields.
> 
> * On top of this we could share with userspace which HVC ranges are
>   available and which ones are handled by KVM. That can actually be
>   added independently, through a vCPU/VM device attribute (which doesn't
>   consume a new ioctl):
>   - userspace issues HAS_ATTR ioctl on the VM fd to query whether this
>     feature is available.
>   - userspace queries the number N of HVC ranges using one GET_ATTR.
>   - userspace passes an array of N ranges using another GET_ATTR.
>     The array is filled and returned by KVM.
> 
> * Untested for AArch32 guests.
> 
> [1] https://lore.kernel.org/linux-arm-kernel/20170808164616.25949-12-james.morse@arm.com/
> [2] https://lore.kernel.org/linux-arm-kernel/bf7e83f1-c58e-8d65-edd0-d08f27b8b766@arm.com/
> [3] https://lore.kernel.org/linux-arm-kernel/f56cf420-affc-35f0-2355-801a924b8a35@arm.com/
> ---
>  Documentation/virt/kvm/api.rst    | 17 +++++++++++++++--
>  arch/arm64/include/asm/kvm_host.h |  1 +
>  include/kvm/arm_psci.h            |  4 ++++
>  include/uapi/linux/kvm.h          |  1 +
>  arch/arm64/kvm/arm.c              |  5 +++++
>  arch/arm64/kvm/hypercalls.c       | 28 +++++++++++++++++++++++++++-
>  6 files changed, 53 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index e4fe7fb60d5d..3d8c1661e7b2 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -5228,8 +5228,12 @@ to the byte array.
>  			__u32 pad;
>  		} hypercall;
>  
> -Unused.  This was once used for 'hypercall to userspace'.  To implement
> -such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
> +On x86 this was once used for 'hypercall to userspace'.  To implement such
> +functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
> +
> +On arm64 it is used for hypercalls, when the KVM_CAP_ARM_HVC_TO_USER capability
> +is enabled. 'nr' contains the HVC or SMC immediate. 'args' contains registers
> +x0 - x5. The other parameters are unused.
>  
>  .. note:: KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
>  
> @@ -6894,3 +6898,12 @@ This capability is always enabled.
>  This capability indicates that the KVM virtual PTP service is
>  supported in the host. A VMM can check whether the service is
>  available to the guest on migration.
> +
> +8.33 KVM_CAP_ARM_HVC_TO_USER
> +----------------------------
> +
> +:Architecture: arm64
> +
> +This capability indicates that KVM can pass unhandled hypercalls to userspace,
> +if the VMM enables it. Hypercalls are passed with KVM_EXIT_HYPERCALL in
> +kvm_run::hypercall.
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 3ca732feb9a5..25554ce97045 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -123,6 +123,7 @@ struct kvm_arch {
>  	 * supported.
>  	 */
>  	bool return_nisv_io_abort_to_user;
> +	bool hvc_to_user;
>  
>  	/*
>  	 * VM-wide PMU filter, implemented as a bitmap and big enough for
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 5b58bd2fe088..d6b71a48fbb1 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -16,6 +16,10 @@
>  
>  #define KVM_ARM_PSCI_LATEST	KVM_ARM_PSCI_1_0
>  
> +#define KVM_PSCI_FN_LAST	KVM_PSCI_FN(3)
> +#define PSCI_0_2_FN_LAST	PSCI_0_2_FN(0x3f)
> +#define PSCI_0_2_FN64_LAST	PSCI_0_2_FN64(0x3f)

Why 3f?  Perhaps a spec reference if appropriate.

> +
>  /*
>   * We need the KVM pointer independently from the vcpu as we can call
>   * this from HYP, and need to apply kern_hyp_va on it...
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 06ba64c49737..aa831986a399 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1084,6 +1084,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
>  #define KVM_CAP_PTP_KVM 198
>  #define KVM_CAP_ARM_MP_HALTED 199
> +#define KVM_CAP_ARM_HVC_TO_USER 200
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index d6ad977fea5f..074197721e97 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -93,6 +93,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		r = 0;
>  		kvm->arch.return_nisv_io_abort_to_user = true;
>  		break;
> +	case KVM_CAP_ARM_HVC_TO_USER:
> +		r = 0;
> +		kvm->arch.hvc_to_user = true;
> +		break;
>  	default:
>  		r = -EINVAL;
>  		break;
> @@ -208,6 +212,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_VCPU_ATTRIBUTES:
>  	case KVM_CAP_PTP_KVM:
>  	case KVM_CAP_ARM_MP_HALTED:
> +	case KVM_CAP_ARM_HVC_TO_USER:
>  		r = 1;
>  		break;
>  	case KVM_CAP_SET_GUEST_DEBUG2:
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index 30da78f72b3b..ccc2015eddf9 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -58,6 +58,28 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
>  	val[3] = lower_32_bits(cycles);
>  }
>  
> +static int kvm_hvc_user(struct kvm_vcpu *vcpu)
> +{
> +	int i;
> +	struct kvm_run *run = vcpu->run;
> +
> +	if (!vcpu->kvm->arch.hvc_to_user) {
> +		smccc_set_retval(vcpu, SMCCC_RET_NOT_SUPPORTED, 0, 0, 0);
> +		return 1;
> +	}
> +
> +	run->exit_reason = KVM_EXIT_HYPERCALL;
> +	run->hypercall.nr = kvm_vcpu_hvc_get_imm(vcpu);
> +	/* Copy the first parameters for fast access */
> +	for (i = 0; i < 6; i++)
> +		run->hypercall.args[i] = vcpu_get_reg(vcpu, i);
> +	run->hypercall.ret = 0;
> +	run->hypercall.longmode = 0;
> +	run->hypercall.pad = 0;
> +
> +	return 0;
> +}
> +
>  int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  {
>  	u32 func_id = smccc_get_function(vcpu);
> @@ -139,8 +161,12 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  	case ARM_SMCCC_TRNG_RND32:
>  	case ARM_SMCCC_TRNG_RND64:
>  		return kvm_trng_call(vcpu);
> -	default:
> +	case KVM_PSCI_FN_BASE...KVM_PSCI_FN_LAST:
> +	case PSCI_0_2_FN_BASE...PSCI_0_2_FN_LAST:
> +	case PSCI_0_2_FN64_BASE...PSCI_0_2_FN64_LAST:
>  		return kvm_psci_call(vcpu);
> +	default:
> +		return kvm_hvc_user(vcpu);
>  	}
>  
>  	smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch
  2021-06-08 15:48 ` [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch Jean-Philippe Brucker
  2021-06-10 15:08   ` Jonathan Cameron
@ 2021-07-01  9:44   ` Fuad Tabba
  1 sibling, 0 replies; 19+ messages in thread
From: Fuad Tabba @ 2021-07-01  9:44 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: maz, salil.mehta, lorenzo.pieralisi, kvm, corbet,
	catalin.marinas, linux-kernel, will, jonathan.cameron, pbonzini,
	kvmarm, linux-arm-kernel

Hi Jean-Philippe,

On Tue, Jun 8, 2021 at 4:54 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> In order to add a new "suspend" power state, replace power_off with
> mp_state in struct kvm_vcpu_arch. Factor the vcpu_off() function while
> we're here.
>
> No functional change intended.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>  arch/arm64/include/asm/kvm_host.h |  6 ++++--
>  arch/arm64/kvm/arm.c              | 29 +++++++++++++++--------------
>  arch/arm64/kvm/psci.c             | 19 ++++++-------------
>  3 files changed, 25 insertions(+), 29 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 7cd7d5c8c4bc..55a04f4d5919 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -340,8 +340,8 @@ struct kvm_vcpu_arch {
>                 u32     mdscr_el1;
>         } guest_debug_preserved;
>
> -       /* vcpu power-off state */
> -       bool power_off;
> +       /* vcpu power state (runnable, stopped, halted) */

Should the comment be, for clarity, something along the lines of
KVM_MP_STATE_(STOPPED, RUNNABLE, HALTED), or maybe "a valid struct
kvm_mp_state", if you think other states might be added in the future?

Thanks,
/fuad


> +       u32 mp_state;
>
>         /* Don't run the guest (internal implementation need) */
>         bool pause;
> @@ -720,6 +720,8 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>                                struct kvm_device_attr *attr);
>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>                                struct kvm_device_attr *attr);
> +void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu);
> +bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu);
>
>  /* Guest/host FPSIMD coordination helpers */
>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e720148232a0..bcc24adb9c0a 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -435,21 +435,22 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>         vcpu->cpu = -1;
>  }
>
> -static void vcpu_power_off(struct kvm_vcpu *vcpu)
> +void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu)
>  {
> -       vcpu->arch.power_off = true;
> +       vcpu->arch.mp_state = KVM_MP_STATE_STOPPED;
>         kvm_make_request(KVM_REQ_SLEEP, vcpu);
>         kvm_vcpu_kick(vcpu);
>  }
>
> +bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu)
> +{
> +       return vcpu->arch.mp_state == KVM_MP_STATE_STOPPED;
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>                                     struct kvm_mp_state *mp_state)
>  {
> -       if (vcpu->arch.power_off)
> -               mp_state->mp_state = KVM_MP_STATE_STOPPED;
> -       else
> -               mp_state->mp_state = KVM_MP_STATE_RUNNABLE;
> -
> +       mp_state->mp_state = vcpu->arch.mp_state;
>         return 0;
>  }
>
> @@ -460,10 +461,10 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>
>         switch (mp_state->mp_state) {
>         case KVM_MP_STATE_RUNNABLE:
> -               vcpu->arch.power_off = false;
> +               vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>                 break;
>         case KVM_MP_STATE_STOPPED:
> -               vcpu_power_off(vcpu);
> +               kvm_arm_vcpu_power_off(vcpu);
>                 break;
>         default:
>                 ret = -EINVAL;
> @@ -483,7 +484,7 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
>  {
>         bool irq_lines = *vcpu_hcr(v) & (HCR_VI | HCR_VF);
>         return ((irq_lines || kvm_vgic_vcpu_pending_irq(v))
> -               && !v->arch.power_off && !v->arch.pause);
> +               && !kvm_arm_vcpu_is_off(v) && !v->arch.pause);
>  }
>
>  bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> @@ -643,10 +644,10 @@ static void vcpu_req_sleep(struct kvm_vcpu *vcpu)
>         struct rcuwait *wait = kvm_arch_vcpu_get_wait(vcpu);
>
>         rcuwait_wait_event(wait,
> -                          (!vcpu->arch.power_off) &&(!vcpu->arch.pause),
> +                          !kvm_arm_vcpu_is_off(vcpu) && !vcpu->arch.pause,
>                            TASK_INTERRUPTIBLE);
>
> -       if (vcpu->arch.power_off || vcpu->arch.pause) {
> +       if (kvm_arm_vcpu_is_off(vcpu) || vcpu->arch.pause) {
>                 /* Awaken to handle a signal, request we sleep again later. */
>                 kvm_make_request(KVM_REQ_SLEEP, vcpu);
>         }
> @@ -1087,9 +1088,9 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
>          * Handle the "start in power-off" case.
>          */
>         if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
> -               vcpu_power_off(vcpu);
> +               kvm_arm_vcpu_power_off(vcpu);
>         else
> -               vcpu->arch.power_off = false;
> +               vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>
>         return 0;
>  }
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index db4056ecccfd..24b4a2265dbd 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -52,13 +52,6 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
>         return PSCI_RET_SUCCESS;
>  }
>
> -static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
> -{
> -       vcpu->arch.power_off = true;
> -       kvm_make_request(KVM_REQ_SLEEP, vcpu);
> -       kvm_vcpu_kick(vcpu);
> -}
> -
>  static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>  {
>         struct vcpu_reset_state *reset_state;
> @@ -78,7 +71,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>          */
>         if (!vcpu)
>                 return PSCI_RET_INVALID_PARAMS;
> -       if (!vcpu->arch.power_off) {
> +       if (!kvm_arm_vcpu_is_off(vcpu)) {
>                 if (kvm_psci_version(source_vcpu, kvm) != KVM_ARM_PSCI_0_1)
>                         return PSCI_RET_ALREADY_ON;
>                 else
> @@ -107,7 +100,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>          */
>         smp_wmb();
>
> -       vcpu->arch.power_off = false;
> +       vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>         kvm_vcpu_wake_up(vcpu);
>
>         return PSCI_RET_SUCCESS;
> @@ -142,7 +135,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
>                 mpidr = kvm_vcpu_get_mpidr_aff(tmp);
>                 if ((mpidr & target_affinity_mask) == target_affinity) {
>                         matching_cpus++;
> -                       if (!tmp->arch.power_off)
> +                       if (!kvm_arm_vcpu_is_off(tmp))
>                                 return PSCI_0_2_AFFINITY_LEVEL_ON;
>                 }
>         }
> @@ -168,7 +161,7 @@ static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
>          * re-initialized.
>          */
>         kvm_for_each_vcpu(i, tmp, vcpu->kvm)
> -               tmp->arch.power_off = true;
> +               tmp->arch.mp_state = KVM_MP_STATE_STOPPED;
>         kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
>
>         memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
> @@ -237,7 +230,7 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
>                 val = kvm_psci_vcpu_suspend(vcpu);
>                 break;
>         case PSCI_0_2_FN_CPU_OFF:
> -               kvm_psci_vcpu_off(vcpu);
> +               kvm_arm_vcpu_power_off(vcpu);
>                 val = PSCI_RET_SUCCESS;
>                 break;
>         case PSCI_0_2_FN_CPU_ON:
> @@ -350,7 +343,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
>
>         switch (psci_fn) {
>         case KVM_PSCI_FN_CPU_OFF:
> -               kvm_psci_vcpu_off(vcpu);
> +               kvm_arm_vcpu_power_off(vcpu);
>                 val = PSCI_RET_SUCCESS;
>                 break;
>         case KVM_PSCI_FN_CPU_ON:
> --
> 2.31.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 2/5] KVM: arm64: Move WFI execution to check_vcpu_requests()
  2021-06-08 15:48 ` [RFC PATCH 2/5] KVM: arm64: Move WFI execution to check_vcpu_requests() Jean-Philippe Brucker
@ 2021-07-01  9:45   ` Fuad Tabba
  0 siblings, 0 replies; 19+ messages in thread
From: Fuad Tabba @ 2021-07-01  9:45 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: maz, salil.mehta, lorenzo.pieralisi, kvm, corbet,
	catalin.marinas, linux-kernel, will, jonathan.cameron, pbonzini,
	kvmarm, linux-arm-kernel

Hi Jean-Philippe,

On Tue, Jun 8, 2021 at 4:54 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> Prepare for WFI requests from userspace, by adding a suspend request and
> moving the WFI execution into check_vcpu_requests(), next to the
> power-off logic.
>
> vcpu->arch.mp_state, previously only RUNNABLE or STOPPED, supports an
> additional state HALTED and two new state transitions:
>
>   RUNNABLE -> HALTED    from WFI or PSCI CPU_SUSPEND (same vCPU)
>   HALTED -> RUNNABLE    vGIC IRQ, pending timer, signal
>
> There shouldn't be any functional change with this patch, even though
> the KVM_GET_MP_STATE ioctl could now in theory return
> KVM_MP_STATE_HALTED, which would break some users' mp_state support. In
> practice it should not happen because we do not return to userspace with
> HALTED state. Both WFI and PSCI CPU_SUSPEND stay in the vCPU run loop
> until the suspend request is consumed. It does feel fragile though,
> maybe we should explicitly return RUNNABLE in KVM_GET_MP_STATE in place
> of HALTED, to prevent future breakage.

It's not really a functional change, but it might introduce some
timing/scheduling changes I think.

Before your changes, the kvm_vcpu_block() would take place at the end
of the vCPU run loop, via handle_exit(). Now it takes place closer to
the beginning, after cond_resched() is called, and not if there is a
KVM_REQ_IRQ_PENDING.

If my observation is correct, would it be good to mention that?

Thanks,
/fuad



>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>  arch/arm64/include/asm/kvm_host.h |  2 ++
>  arch/arm64/kvm/arm.c              | 18 ++++++++++++++-
>  arch/arm64/kvm/handle_exit.c      |  3 +--
>  arch/arm64/kvm/psci.c             | 37 +++++++++++++------------------
>  4 files changed, 35 insertions(+), 25 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 55a04f4d5919..3ca732feb9a5 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -46,6 +46,7 @@
>  #define KVM_REQ_VCPU_RESET     KVM_ARCH_REQ(2)
>  #define KVM_REQ_RECORD_STEAL   KVM_ARCH_REQ(3)
>  #define KVM_REQ_RELOAD_GICv4   KVM_ARCH_REQ(4)
> +#define KVM_REQ_SUSPEND                KVM_ARCH_REQ(5)
>
>  #define KVM_DIRTY_LOG_MANUAL_CAPS   (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
>                                      KVM_DIRTY_LOG_INITIALLY_SET)
> @@ -722,6 +723,7 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>                                struct kvm_device_attr *attr);
>  void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu);
>  bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu);
> +void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu);
>
>  /* Guest/host FPSIMD coordination helpers */
>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index bcc24adb9c0a..d8cbaa0373c7 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -447,6 +447,12 @@ bool kvm_arm_vcpu_is_off(struct kvm_vcpu *vcpu)
>         return vcpu->arch.mp_state == KVM_MP_STATE_STOPPED;
>  }
>
> +void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu)
> +{
> +       vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
> +       kvm_make_request(KVM_REQ_SUSPEND, vcpu);
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
>                                     struct kvm_mp_state *mp_state)
>  {
> @@ -667,6 +673,8 @@ static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
>
>  static void check_vcpu_requests(struct kvm_vcpu *vcpu)
>  {
> +       bool irq_pending;
> +
>         if (kvm_request_pending(vcpu)) {
>                 if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
>                         vcpu_req_sleep(vcpu);
> @@ -678,7 +686,7 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
>                  * Clear IRQ_PENDING requests that were made to guarantee
>                  * that a VCPU sees new virtual interrupts.
>                  */
> -               kvm_check_request(KVM_REQ_IRQ_PENDING, vcpu);
> +               irq_pending = kvm_check_request(KVM_REQ_IRQ_PENDING, vcpu);
>
>                 if (kvm_check_request(KVM_REQ_RECORD_STEAL, vcpu))
>                         kvm_update_stolen_time(vcpu);
> @@ -690,6 +698,14 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
>                         vgic_v4_load(vcpu);
>                         preempt_enable();
>                 }
> +
> +               if (kvm_check_request(KVM_REQ_SUSPEND, vcpu)) {
> +                       if (!irq_pending) {
> +                               kvm_vcpu_block(vcpu);
> +                               kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> +                       }
> +                       vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> +               }
>         }
>  }
>
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 6f48336b1d86..9717df3104cf 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -95,8 +95,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
>         } else {
>                 trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
>                 vcpu->stat.wfi_exit_stat++;
> -               kvm_vcpu_block(vcpu);
> -               kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> +               kvm_arm_vcpu_suspend(vcpu);
>         }
>
>         kvm_incr_pc(vcpu);
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index 24b4a2265dbd..42a307ceb95f 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -31,27 +31,6 @@ static unsigned long psci_affinity_mask(unsigned long affinity_level)
>         return 0;
>  }
>
> -static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
> -{
> -       /*
> -        * NOTE: For simplicity, we make VCPU suspend emulation to be
> -        * same-as WFI (Wait-for-interrupt) emulation.
> -        *
> -        * This means for KVM the wakeup events are interrupts and
> -        * this is consistent with intended use of StateID as described
> -        * in section 5.4.1 of PSCI v0.2 specification (ARM DEN 0022A).
> -        *
> -        * Further, we also treat power-down request to be same as
> -        * stand-by request as-per section 5.4.2 clause 3 of PSCI v0.2
> -        * specification (ARM DEN 0022A). This means all suspend states
> -        * for KVM will preserve the register state.
> -        */
> -       kvm_vcpu_block(vcpu);
> -       kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> -
> -       return PSCI_RET_SUCCESS;
> -}
> -
>  static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>  {
>         struct vcpu_reset_state *reset_state;
> @@ -227,7 +206,21 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
>                 break;
>         case PSCI_0_2_FN_CPU_SUSPEND:
>         case PSCI_0_2_FN64_CPU_SUSPEND:
> -               val = kvm_psci_vcpu_suspend(vcpu);
> +               /*
> +                * NOTE: For simplicity, we make VCPU suspend emulation to be
> +                * same-as WFI (Wait-for-interrupt) emulation.
> +                *
> +                * This means for KVM the wakeup events are interrupts and this
> +                * is consistent with intended use of StateID as described in
> +                * section 5.4.1 of PSCI v0.2 specification (ARM DEN 0022A).
> +                *
> +                * Further, we also treat power-down request to be same as
> +                * stand-by request as-per section 5.4.2 clause 3 of PSCI v0.2
> +                * specification (ARM DEN 0022A). This means all suspend states
> +                * for KVM will preserve the register state.
> +                */
> +               kvm_arm_vcpu_suspend(vcpu);
> +               val = PSCI_RET_SUCCESS;
>                 break;
>         case PSCI_0_2_FN_CPU_OFF:
>                 kvm_arm_vcpu_power_off(vcpu);
> --
> 2.31.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 3/5] KVM: arm64: Allow userspace to request WFI
  2021-06-08 15:48 ` [RFC PATCH 3/5] KVM: arm64: Allow userspace to request WFI Jean-Philippe Brucker
@ 2021-07-01  9:45   ` Fuad Tabba
  0 siblings, 0 replies; 19+ messages in thread
From: Fuad Tabba @ 2021-07-01  9:45 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: maz, salil.mehta, lorenzo.pieralisi, kvm, corbet,
	catalin.marinas, linux-kernel, will, jonathan.cameron, pbonzini,
	kvmarm, linux-arm-kernel

Hi Jean-Philippe,

On Tue, Jun 8, 2021 at 4:54 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> To help userspace implement PSCI CPU_SUSPEND, allow setting the "HALTED"
> MP state to request a WFI before returning to the guest.
>
> Userspace won't obtain a HALTED mp_state from a KVM_GET_MP_STATE call
> unless they set it themselves. When set by KVM, to handle wfi or
> CPU_SUSPEND, it is consumed before returning to userspace.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>  Documentation/virt/kvm/api.rst | 15 +++++++++------
>  include/uapi/linux/kvm.h       |  1 +
>  arch/arm64/kvm/arm.c           | 11 ++++++++++-
>  3 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 7fcb2fd38f42..e4fe7fb60d5d 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -1416,8 +1416,8 @@ Possible values are:
>                                   which has not yet received an INIT signal [x86]
>     KVM_MP_STATE_INIT_RECEIVED    the vcpu has received an INIT signal, and is
>                                   now ready for a SIPI [x86]
> -   KVM_MP_STATE_HALTED           the vcpu has executed a HLT instruction and
> -                                 is waiting for an interrupt [x86]
> +   KVM_MP_STATE_HALTED           the vcpu has executed a HLT/WFI instruction
> +                                 and is waiting for an interrupt [x86,arm64]

Considering that arm64 has a HLT instruction (for debugging), which is
very different from the x86 one, would it be good to clarify that in
the comment. e.g.,

"the vcpu has executed a HLT(x86)/WFI(arm64) instruction"?

Thanks,
/fuad


>     KVM_MP_STATE_SIPI_RECEIVED    the vcpu has just received a SIPI (vector
>                                   accessible via KVM_GET_VCPU_EVENTS) [x86]
>     KVM_MP_STATE_STOPPED          the vcpu is stopped [s390,arm/arm64]
> @@ -1435,8 +1435,9 @@ these architectures.
>  For arm/arm64:
>  ^^^^^^^^^^^^^^
>
> -The only states that are valid are KVM_MP_STATE_STOPPED and
> -KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
> +Valid states are KVM_MP_STATE_STOPPED and KVM_MP_STATE_RUNNABLE which reflect
> +if the vcpu is paused or not. If KVM_CAP_ARM_MP_HALTED is present, state
> +KVM_MP_STATE_HALTED is also valid.
>
>  4.39 KVM_SET_MP_STATE
>  ---------------------
> @@ -1457,8 +1458,10 @@ these architectures.
>  For arm/arm64:
>  ^^^^^^^^^^^^^^
>
> -The only states that are valid are KVM_MP_STATE_STOPPED and
> -KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
> +Valid states are KVM_MP_STATE_STOPPED and KVM_MP_STATE_RUNNABLE which reflect
> +if the vcpu should be paused or not. If KVM_CAP_ARM_MP_HALTED is present,
> +KVM_MP_STATE_HALTED can be set, to wait for interrupts targeted at the vcpu
> +before running it.
>
>  4.40 KVM_SET_IDENTITY_MAP_ADDR
>  ------------------------------
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 79d9c44d1ad7..06ba64c49737 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1083,6 +1083,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_SGX_ATTRIBUTE 196
>  #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
>  #define KVM_CAP_PTP_KVM 198
> +#define KVM_CAP_ARM_MP_HALTED 199
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index d8cbaa0373c7..d6ad977fea5f 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -207,6 +207,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>         case KVM_CAP_SET_GUEST_DEBUG:
>         case KVM_CAP_VCPU_ATTRIBUTES:
>         case KVM_CAP_PTP_KVM:
> +       case KVM_CAP_ARM_MP_HALTED:
>                 r = 1;
>                 break;
>         case KVM_CAP_SET_GUEST_DEBUG2:
> @@ -469,6 +470,9 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>         case KVM_MP_STATE_RUNNABLE:
>                 vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
>                 break;
> +       case KVM_MP_STATE_HALTED:
> +               kvm_arm_vcpu_suspend(vcpu);
> +               break;
>         case KVM_MP_STATE_STOPPED:
>                 kvm_arm_vcpu_power_off(vcpu);
>                 break;
> @@ -699,7 +703,12 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
>                         preempt_enable();
>                 }
>
> -               if (kvm_check_request(KVM_REQ_SUSPEND, vcpu)) {
> +               /*
> +                * Check mp_state again in case userspace changed their mind
> +                * after requesting suspend.
> +                */
> +               if (kvm_check_request(KVM_REQ_SUSPEND, vcpu) &&
> +                   vcpu->arch.mp_state == KVM_MP_STATE_HALTED) {
>                         if (!irq_pending) {
>                                 kvm_vcpu_block(vcpu);
>                                 kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> --
> 2.31.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace
  2021-06-08 15:48 ` [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace Jean-Philippe Brucker
  2021-06-10 16:02   ` Jonathan Cameron
@ 2021-07-01  9:46   ` Fuad Tabba
  2021-07-01  9:54   ` Russell King (Oracle)
  2 siblings, 0 replies; 19+ messages in thread
From: Fuad Tabba @ 2021-07-01  9:46 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: maz, salil.mehta, lorenzo.pieralisi, kvm, corbet,
	catalin.marinas, linux-kernel, will, jonathan.cameron, pbonzini,
	kvmarm, linux-arm-kernel

Hi Jean-Philippe,

On Tue, Jun 8, 2021 at 4:54 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> Let userspace request to handle all hypercalls that aren't handled by
> KVM, by setting the KVM_CAP_ARM_HVC_TO_USER capability.
>
> With the help of another capability, this will allow userspace to handle
> PSCI calls.
>
> Suggested-by: James Morse <james.morse@arm.com>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>
> Notes on this implementation:
>
> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
>   generalizes the idea to all hypercalls, since that was suggested on
>   the list [2, 3].
>
> * We're reusing kvm_run.hypercall. I copied x0-x5 into
>   kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
>   this, because:
>   - Most user handlers will need to write results back into the
>     registers (x0-x3 for SMCCC), so if we keep this shortcut we should
>     go all the way and synchronize them on return to kernel.
>   - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
>     handling the call.
>   - SMCCC uses x0-x16 for parameters.

Another reason, which supports the last item on this list, to consider
removing the copying to kvm_run.hypercall.args is PSCA*, which I think
does use registers beyond x5 for some calls.

* https://developer.arm.com/-/media/Files/pdf/PlatformSecurityArchitecture/Architect/DEN0063-PSA_Firmware_Framework-1.0.0-2.pdf?revision=2d1429fa-4b5b-461a-a60e-4ef3d8f7f4b4

>   x0 does contain the SMCCC function ID and may be useful for fast
>   dispatch, we could keep that plus the immediate number.
>
> * Should we add a flag in the kvm_run.hypercall telling whether this is
>   HVC or SMC?  Can be added later in those bottom longmode and pad
>   fields.

Would it make sense to make this even more fine-grained, where there
would be two capabilities, e.g., one KVM_CAP_ARM_HVC_TO_USER and
another KVM_CAP_ARM_SMC_TO_USER, to select whether to handle the one
or the other (or maybe a parameter of KVM_CAP_ARM_HVC_TO_USER that
selects what subset to handle, e.g., psci). I cannot think of a
use-case to be honest, but considering whether such fine-grained
control is helpful or just unnecessary overkill.

But as you suggest, I think that having a flag to indicate whether
this is an hvc or an smc would be good.

> * On top of this we could share with userspace which HVC ranges are
>   available and which ones are handled by KVM. That can actually be
>   added independently, through a vCPU/VM device attribute (which doesn't
>   consume a new ioctl):
>   - userspace issues HAS_ATTR ioctl on the VM fd to query whether this
>     feature is available.
>   - userspace queries the number N of HVC ranges using one GET_ATTR.
>   - userspace passes an array of N ranges using another GET_ATTR.
>     The array is filled and returned by KVM.
>
> * Untested for AArch32 guests.
>
> [1] https://lore.kernel.org/linux-arm-kernel/20170808164616.25949-12-james.morse@arm.com/
> [2] https://lore.kernel.org/linux-arm-kernel/bf7e83f1-c58e-8d65-edd0-d08f27b8b766@arm.com/
> [3] https://lore.kernel.org/linux-arm-kernel/f56cf420-affc-35f0-2355-801a924b8a35@arm.com/
> ---
>  Documentation/virt/kvm/api.rst    | 17 +++++++++++++++--
>  arch/arm64/include/asm/kvm_host.h |  1 +
>  include/kvm/arm_psci.h            |  4 ++++
>  include/uapi/linux/kvm.h          |  1 +
>  arch/arm64/kvm/arm.c              |  5 +++++
>  arch/arm64/kvm/hypercalls.c       | 28 +++++++++++++++++++++++++++-
>  6 files changed, 53 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index e4fe7fb60d5d..3d8c1661e7b2 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -5228,8 +5228,12 @@ to the byte array.
>                         __u32 pad;
>                 } hypercall;
>
> -Unused.  This was once used for 'hypercall to userspace'.  To implement
> -such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
> +On x86 this was once used for 'hypercall to userspace'.  To implement such
> +functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).

Should the "all except s390" become "all except s390 and arm64"? The
following paragraph clarifies this, but thinking about the person who
might be just skimming through the documentation.

> +On arm64 it is used for hypercalls, when the KVM_CAP_ARM_HVC_TO_USER capability
> +is enabled. 'nr' contains the HVC or SMC immediate. 'args' contains registers
> +x0 - x5. The other parameters are unused.
>
>  .. note:: KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
>
> @@ -6894,3 +6898,12 @@ This capability is always enabled.
>  This capability indicates that the KVM virtual PTP service is
>  supported in the host. A VMM can check whether the service is
>  available to the guest on migration.
> +
> +8.33 KVM_CAP_ARM_HVC_TO_USER
> +----------------------------
> +
> +:Architecture: arm64
> +
> +This capability indicates that KVM can pass unhandled hypercalls to userspace,
> +if the VMM enables it. Hypercalls are passed with KVM_EXIT_HYPERCALL in
> +kvm_run::hypercall.
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 3ca732feb9a5..25554ce97045 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -123,6 +123,7 @@ struct kvm_arch {
>          * supported.
>          */
>         bool return_nisv_io_abort_to_user;
> +       bool hvc_to_user;

Maybe some documentation for this field, similar to
return_nisv_io_abort_to_user above?

>
>         /*
>          * VM-wide PMU filter, implemented as a bitmap and big enough for
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 5b58bd2fe088..d6b71a48fbb1 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -16,6 +16,10 @@
>
>  #define KVM_ARM_PSCI_LATEST    KVM_ARM_PSCI_1_0
>
> +#define KVM_PSCI_FN_LAST       KVM_PSCI_FN(3)
> +#define PSCI_0_2_FN_LAST       PSCI_0_2_FN(0x3f)
> +#define PSCI_0_2_FN64_LAST     PSCI_0_2_FN64(0x3f)

I think it might be clearer to define these where their _BASE
counterparts are, i.e., uapi/asm/kvm.h and uapi/linux/psci.h.

Also, should _LAST be 0x1f rather than 0x3f, or do you want to include SDEI?
https://developer.arm.com/documentation/den0028/latest

> +
>  /*
>   * We need the KVM pointer independently from the vcpu as we can call
>   * this from HYP, and need to apply kern_hyp_va on it...
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 06ba64c49737..aa831986a399 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1084,6 +1084,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
>  #define KVM_CAP_PTP_KVM 198
>  #define KVM_CAP_ARM_MP_HALTED 199
> +#define KVM_CAP_ARM_HVC_TO_USER 200
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index d6ad977fea5f..074197721e97 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -93,6 +93,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>                 r = 0;
>                 kvm->arch.return_nisv_io_abort_to_user = true;
>                 break;
> +       case KVM_CAP_ARM_HVC_TO_USER:
> +               r = 0;
> +               kvm->arch.hvc_to_user = true;
> +               break;
>         default:
>                 r = -EINVAL;
>                 break;
> @@ -208,6 +212,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>         case KVM_CAP_VCPU_ATTRIBUTES:
>         case KVM_CAP_PTP_KVM:
>         case KVM_CAP_ARM_MP_HALTED:
> +       case KVM_CAP_ARM_HVC_TO_USER:
>                 r = 1;
>                 break;
>         case KVM_CAP_SET_GUEST_DEBUG2:
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index 30da78f72b3b..ccc2015eddf9 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -58,6 +58,28 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
>         val[3] = lower_32_bits(cycles);
>  }
>
> +static int kvm_hvc_user(struct kvm_vcpu *vcpu)
> +{
> +       int i;
> +       struct kvm_run *run = vcpu->run;
> +
> +       if (!vcpu->kvm->arch.hvc_to_user) {
> +               smccc_set_retval(vcpu, SMCCC_RET_NOT_SUPPORTED, 0, 0, 0);
> +               return 1;
> +       }
> +
> +       run->exit_reason = KVM_EXIT_HYPERCALL;
> +       run->hypercall.nr = kvm_vcpu_hvc_get_imm(vcpu);
> +       /* Copy the first parameters for fast access */
> +       for (i = 0; i < 6; i++)

nit: ARRAY_SIZE(run->hypercall.args) instead of 6?

> +               run->hypercall.args[i] = vcpu_get_reg(vcpu, i);
> +       run->hypercall.ret = 0;
> +       run->hypercall.longmode = 0;
> +       run->hypercall.pad = 0;
> +
> +       return 0;
> +}
> +
>  int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  {
>         u32 func_id = smccc_get_function(vcpu);
> @@ -139,8 +161,12 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>         case ARM_SMCCC_TRNG_RND32:
>         case ARM_SMCCC_TRNG_RND64:
>                 return kvm_trng_call(vcpu);
> -       default:
> +       case KVM_PSCI_FN_BASE...KVM_PSCI_FN_LAST:
> +       case PSCI_0_2_FN_BASE...PSCI_0_2_FN_LAST:
> +       case PSCI_0_2_FN64_BASE...PSCI_0_2_FN64_LAST:
>                 return kvm_psci_call(vcpu);
> +       default:
> +               return kvm_hvc_user(vcpu);
>         }

I know that SMCCC is a convention, but would it make sense to exclude
other ranges, e.g., not allocated, trusted os-related, or reserved for
future use?

Thanks,
/fuad



>
>         smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
> --
> 2.31.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 5/5] KVM: arm64: Pass PSCI calls to userspace
  2021-06-08 15:48 ` [RFC PATCH 5/5] KVM: arm64: Pass PSCI calls " Jean-Philippe Brucker
@ 2021-07-01  9:46   ` Fuad Tabba
  0 siblings, 0 replies; 19+ messages in thread
From: Fuad Tabba @ 2021-07-01  9:46 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: maz, salil.mehta, lorenzo.pieralisi, kvm, corbet,
	catalin.marinas, linux-kernel, will, jonathan.cameron, pbonzini,
	kvmarm, linux-arm-kernel

Hi Jean-Philippe,


On Tue, Jun 8, 2021 at 4:54 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> Let userspace request to handle PSCI calls, by setting the new
> KVM_CAP_ARM_PSCI_TO_USER capability.
>
> SMCCC probe requires PSCI v1.x. If userspace only implements PSCI v0.2,
> the guest won't query SMCCC support through PSCI and won't use the
> spectre workarounds. We could hijack PSCI_VERSION and pretend to support
> v1.0 if userspace does not, then handle all v1.0 calls ourselves
> (including guessing the PSCI feature set implemented by the guest), but
> that seems unnecessary. After all the API already allows userspace to
> force a version lower than v1.0 using the firmware pseudo-registers.
>
> The KVM_REG_ARM_PSCI_VERSION pseudo-register currently resets to either
> v0.1 if userspace doesn't set KVM_ARM_VCPU_PSCI_0_2, or
> KVM_ARM_PSCI_LATEST (1.0).
>
> Suggested-by: James Morse <james.morse@arm.com>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>  Documentation/virt/kvm/api.rst      | 14 ++++++++++++++
>  Documentation/virt/kvm/arm/psci.rst |  1 +
>  arch/arm64/include/asm/kvm_host.h   |  1 +
>  include/kvm/arm_hypercalls.h        |  1 +
>  include/uapi/linux/kvm.h            |  1 +
>  arch/arm64/kvm/arm.c                | 10 +++++++---
>  arch/arm64/kvm/hypercalls.c         |  2 +-
>  arch/arm64/kvm/psci.c               | 13 +++++++++++++
>  8 files changed, 39 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 3d8c1661e7b2..f24eb70e575d 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6907,3 +6907,17 @@ available to the guest on migration.
>  This capability indicates that KVM can pass unhandled hypercalls to userspace,
>  if the VMM enables it. Hypercalls are passed with KVM_EXIT_HYPERCALL in
>  kvm_run::hypercall.
> +
> +8.34 KVM_CAP_ARM_PSCI_TO_USER
> +-----------------------------
> +
> +:Architectures: arm64
> +
> +When the VMM enables this capability, all PSCI calls are passed to userspace
> +instead of being handled by KVM. Capability KVM_CAP_ARM_HVC_TO_USER must be
> +enabled first.
> +
> +Userspace should support at least PSCI v1.0. Otherwise SMCCC features won't be
> +available to the guest. Userspace does not need to handle the SMCCC_VERSION
> +parameter for the PSCI_FEATURES function. The KVM_ARM_VCPU_PSCI_0_2 vCPU
> +feature should be set even if this capability is enabled.
> diff --git a/Documentation/virt/kvm/arm/psci.rst b/Documentation/virt/kvm/arm/psci.rst
> index d52c2e83b5b8..110011d1fa3f 100644
> --- a/Documentation/virt/kvm/arm/psci.rst
> +++ b/Documentation/virt/kvm/arm/psci.rst
> @@ -34,6 +34,7 @@ The following register is defined:
>    - Allows any PSCI version implemented by KVM and compatible with
>      v0.2 to be set with SET_ONE_REG
>    - Affects the whole VM (even if the register view is per-vcpu)
> +  - Defaults to PSCI 1.0 if userspace enables KVM_CAP_ARM_PSCI_TO_USER.

Should this be "to PSCI 1.x", to match the comment and for future expansion?

>
>  * KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
>      Holds the state of the firmware support to mitigate CVE-2017-5715, as
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 25554ce97045..5d74b769c16d 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -124,6 +124,7 @@ struct kvm_arch {
>          */
>         bool return_nisv_io_abort_to_user;
>         bool hvc_to_user;
> +       bool psci_to_user;
>
>         /*
>          * VM-wide PMU filter, implemented as a bitmap and big enough for
> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> index 0e2509d27910..b66c6a000ef3 100644
> --- a/include/kvm/arm_hypercalls.h
> +++ b/include/kvm/arm_hypercalls.h
> @@ -6,6 +6,7 @@
>
>  #include <asm/kvm_emulate.h>
>
> +int kvm_hvc_user(struct kvm_vcpu *vcpu);
>  int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
>
>  static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index aa831986a399..2b8e55aa7e1e 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1085,6 +1085,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_PTP_KVM 198
>  #define KVM_CAP_ARM_MP_HALTED 199
>  #define KVM_CAP_ARM_HVC_TO_USER 200
> +#define KVM_CAP_ARM_PSCI_TO_USER 201
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 074197721e97..bc3e63b0b3ad 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -83,7 +83,7 @@ int kvm_arch_check_processor_compat(void *opaque)
>  int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>                             struct kvm_enable_cap *cap)
>  {
> -       int r;
> +       int r = -EINVAL;
>
>         if (cap->flags)
>                 return -EINVAL;
> @@ -97,8 +97,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>                 r = 0;
>                 kvm->arch.hvc_to_user = true;
>                 break;
> -       default:
> -               r = -EINVAL;
> +       case KVM_CAP_ARM_PSCI_TO_USER:
> +               if (kvm->arch.hvc_to_user) {
> +                       r = 0;
> +                       kvm->arch.psci_to_user = true;
> +               }

Should this return -EINVAL if hvc_to_user isn't set, rather than
silently not setting psci_to_user, or should it be a parameter of
KVM_CAP_ARM_HVC_TO_USER rather than its own thing?

Thanks,
/fuad


>                 break;
>         }
>
> @@ -213,6 +216,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>         case KVM_CAP_PTP_KVM:
>         case KVM_CAP_ARM_MP_HALTED:
>         case KVM_CAP_ARM_HVC_TO_USER:
> +       case KVM_CAP_ARM_PSCI_TO_USER:
>                 r = 1;
>                 break;
>         case KVM_CAP_SET_GUEST_DEBUG2:
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index ccc2015eddf9..621d5a5b7e48 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -58,7 +58,7 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
>         val[3] = lower_32_bits(cycles);
>  }
>
> -static int kvm_hvc_user(struct kvm_vcpu *vcpu)
> +int kvm_hvc_user(struct kvm_vcpu *vcpu)
>  {
>         int i;
>         struct kvm_run *run = vcpu->run;
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index 42a307ceb95f..7f44ee527966 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -353,6 +353,16 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
>         return 1;
>  }
>
> +static bool kvm_psci_call_is_user(struct kvm_vcpu *vcpu)
> +{
> +       /* Handle the special case of SMCCC probe through PSCI */
> +       if (smccc_get_function(vcpu) == PSCI_1_0_FN_PSCI_FEATURES &&
> +           smccc_get_arg1(vcpu) == ARM_SMCCC_VERSION_FUNC_ID)
> +               return false;
> +
> +       return vcpu->kvm->arch.psci_to_user;
> +}
> +
>  /**
>   * kvm_psci_call - handle PSCI call if r0 value is in range
>   * @vcpu: Pointer to the VCPU struct
> @@ -369,6 +379,9 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
>   */
>  int kvm_psci_call(struct kvm_vcpu *vcpu)
>  {
> +       if (kvm_psci_call_is_user(vcpu))
> +               return kvm_hvc_user(vcpu);
> +
>         switch (kvm_psci_version(vcpu, vcpu->kvm)) {
>         case KVM_ARM_PSCI_1_0:
>                 return kvm_psci_1_0_call(vcpu);
> --
> 2.31.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace
  2021-06-08 15:48 ` [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace Jean-Philippe Brucker
  2021-06-10 16:02   ` Jonathan Cameron
  2021-07-01  9:46   ` Fuad Tabba
@ 2021-07-01  9:54   ` Russell King (Oracle)
  2 siblings, 0 replies; 19+ messages in thread
From: Russell King (Oracle) @ 2021-07-01  9:54 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: maz, kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini,
	corbet, james.morse, alexandru.elisei, suzuki.poulose,
	catalin.marinas, will, lorenzo.pieralisi, salil.mehta,
	shameerali.kolothum.thodi, jonathan.cameron

On Tue, Jun 08, 2021 at 05:48:05PM +0200, Jean-Philippe Brucker wrote:
> * Untested for AArch32 guests.

That really needs testing; you may notice Will Deacon has been
posting patches for aarch32 guests over the last few months, and
there are other users out there who run aarch32 guests on their
aarch64 hardware.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace
  2021-06-08 15:48 [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace Jean-Philippe Brucker
                   ` (4 preceding siblings ...)
  2021-06-08 15:48 ` [RFC PATCH 5/5] KVM: arm64: Pass PSCI calls " Jean-Philippe Brucker
@ 2021-07-19 15:29 ` Alexandru Elisei
  2021-07-19 18:02   ` Jean-Philippe Brucker
  2021-07-21 17:56 ` Jean-Philippe Brucker
  6 siblings, 1 reply; 19+ messages in thread
From: Alexandru Elisei @ 2021-07-19 15:29 UTC (permalink / raw)
  To: Jean-Philippe Brucker, maz
  Cc: kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini, corbet,
	james.morse, suzuki.poulose, catalin.marinas, will,
	lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron

Hi Jean-Philippe,

I'm not really familiar with this part of KVM, and I'm still trying to get my head
around how this works, so please bare with me if I ask silly questions.

This is how I understand this will work:

1. VMM opts in to forward HVC calls not handled by KVM.

2. VMM opts in to forward PSCI calls, other than
PSCI_1_0_FN_PSCI_FEATURES(ARM_SMCCC_VERSION_FUNC_ID).

3. Guest emulates PSCI calls (and all the other HVC calls).

    3.a For CPU_SUSPEND coming from VCPU A, userspace does a
KVM_SET_MP_STATE(KVM_MP_STATE_HALTED) ioctl on the VCPU fd which sets the request
KVM_REQ_SUSPEND.

    3.b The next time the VCPU is run, KVM blocks the VCPU as a result of the
request. kvm_vcpu_block() does a schedule() in a loop until it decides that the
CPU must unblock.

    3.c The VCPU will run as normal after kvm_vcpu_block() returns.

Please correct me if I got something wrong.

I have a few general questions. It doesn't mean there's something wrong with your
approach, I'm just trying to understand it better.

1. Why forwarding PSCI calls to userspace depend on enabling forwarding for other
HVC calls? As I understand from the patches, those handle distinct function IDs.

2. HVC call forwarding to userspace also forwards PSCI functions which are defined
in ARM DEN 0022D, but not (yet) implemented by KVM. What happens if KVM's PSCI
implementation gets support for one of those functions? How does userspace know
that now it also needs to enable PSCI call forwarding to be able to handle that
function?

It looks to me like the boundary between the functions that are forwarded when HVC
call forwarding is enabled and the functions that are forwarded when PSCI call
forwarding is enabled is based on what Linux v5.13 handles. Have you considered
choosing this boundary based on something less arbitrary, like the function types
specified in ARM DEN 0028C, table 2-1?

In my opinion, setting the MP state to HALTED looks like a sensible approach to
implementing PSCI_SUSPEND. I'll take a closer look at the patches after I get a
better understanding about what is going on.

On 6/8/21 4:48 PM, Jean-Philippe Brucker wrote:
> Allow userspace to request handling PSCI calls from guests. Our goal is
> to enable a vCPU hot-add solution for Arm where the VMM presents
> possible resources to the guest at boot, and controls which vCPUs can be
> brought up by allowing or denying PSCI CPU_ON calls. Passing HVC and
> PSCI to userspace has been discussed on the list in the context of vCPU
> hot-add [1,2] but it can also be useful for implementing other SMCCC and
> vendor hypercalls [3,4,5].
>
> Patches 1-3 allow userspace to request WFI to be executed in KVM. That

I don't understand this. KVM, in kvm_vcpu_block(), does not execute an WFI.
PSCI_SUSPEND is documented as being indistinguishable from an WFI from the guest's
point of view, but it's implementation is not architecturally defined.

Thanks,

Alex

> way the VMM can easily implement the PSCI CPU_SUSPEND function, which is
> mandatory from PSCI v0.2 onwards (even if it doesn't have a more useful
> implementation than WFI, natively available to the guest).
>
> Patch 4 lets userspace request any HVC that isn't handled by KVM, and
> patch 5 lets userspace request PSCI calls, disabling in-kernel PSCI
> handling.
>
> I'm focusing on the PSCI bits, but a complete prototype of vCPU hot-add
> for arm64 on Linux and QEMU, most of it from Salil and James, is
> available at [6].
>
> [1] https://lore.kernel.org/kvmarm/82879258-46a7-a6e9-ee54-fc3692c1cdc3@arm.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
>     (Followed by KVM forum and Linaro Open discussions)
> [3] https://lore.kernel.org/linux-arm-kernel/f56cf420-affc-35f0-2355-801a924b8a35@arm.com/
> [4] https://lore.kernel.org/kvm/bf7e83f1-c58e-8d65-edd0-d08f27b8b766@arm.com/
> [5] https://lore.kernel.org/kvm/1569338454-26202-2-git-send-email-guoheyi@huawei.com/
> [6] https://jpbrucker.net/git/linux/log/?h=cpuhp/devel
>     https://jpbrucker.net/git/qemu/log/?h=cpuhp/devel    
>
> Jean-Philippe Brucker (5):
>   KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch
>   KVM: arm64: Move WFI execution to check_vcpu_requests()
>   KVM: arm64: Allow userspace to request WFI
>   KVM: arm64: Pass hypercalls to userspace
>   KVM: arm64: Pass PSCI calls to userspace
>
>  Documentation/virt/kvm/api.rst      | 46 +++++++++++++++----
>  Documentation/virt/kvm/arm/psci.rst |  1 +
>  arch/arm64/include/asm/kvm_host.h   | 10 +++-
>  include/kvm/arm_hypercalls.h        |  1 +
>  include/kvm/arm_psci.h              |  4 ++
>  include/uapi/linux/kvm.h            |  3 ++
>  arch/arm64/kvm/arm.c                | 71 +++++++++++++++++++++--------
>  arch/arm64/kvm/handle_exit.c        |  3 +-
>  arch/arm64/kvm/hypercalls.c         | 28 +++++++++++-
>  arch/arm64/kvm/psci.c               | 69 ++++++++++++++--------------
>  10 files changed, 170 insertions(+), 66 deletions(-)
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace
  2021-07-19 15:29 ` [RFC PATCH 0/5] KVM: arm64: Pass PSCI " Alexandru Elisei
@ 2021-07-19 18:02   ` Jean-Philippe Brucker
  2021-07-19 19:37     ` Oliver Upton
  0 siblings, 1 reply; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-07-19 18:02 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: maz, kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini,
	corbet, james.morse, suzuki.poulose, catalin.marinas, will,
	lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron

Hi Alex,

I'm not planning to resend this work at the moment, because it looks like
vcpu hot-add will go a different way so I don't have a user. But I'll
probably address the feedback so far and park it on some branch, in case
anyone else needs it.

On Mon, Jul 19, 2021 at 04:29:18PM +0100, Alexandru Elisei wrote:
> 1. Why forwarding PSCI calls to userspace depend on enabling forwarding for other
> HVC calls? As I understand from the patches, those handle distinct function IDs.

The HVC cap from patch 4 enables returning from the VCPU_RUN ioctl with
KVM_EXIT_HYPERCALL, for any HVC not handled by KVM. This one should
definitely be improved, either by letting userspace choose the ranges of
HVC it wants, or at least by reporting ranges reserved by KVM to
userspace.

The PSCI cap from patch 5 disables the in-kernel PSCI implementation. As a
result those HVCs are forwarded to userspace.

It was suggested that other users will want to handle HVC calls (SDEI for
example [1]), hence splitting into two capabilities rather than just the
PSCI cap. In v5.14 x86 added KVM_CAP_EXIT_HYPERCALL [2], which lets
userspace receive specific hypercalls. We could reuse that and have PSCI
be one bit of that capability's parameter.

[1] https://lore.kernel.org/linux-arm-kernel/20170808164616.25949-12-james.morse@arm.com/
[2] https://lore.kernel.org/kvm/90778988e1ee01926ff9cac447aacb745f954c8c.1623174621.git.ashish.kalra@amd.com/

> 2. HVC call forwarding to userspace also forwards PSCI functions which are defined
> in ARM DEN 0022D, but not (yet) implemented by KVM. What happens if KVM's PSCI
> implementation gets support for one of those functions? How does userspace know
> that now it also needs to enable PSCI call forwarding to be able to handle that
> function?

We forward the whole PSCI function range, so it's either KVM or userspace.
If KVM manages PSCI and the guest calls an unimplemented function, that
returns directly to the guest without going to userspace.

The concern is valid for any other range, though. If userspace enables the
HVC cap it receives function calls that at some point KVM might need to
handle itself. So we need some negotiation between user and KVM about the
specific HVC ranges that userspace can and will handle.

> It looks to me like the boundary between the functions that are forwarded when HVC
> call forwarding is enabled and the functions that are forwarded when PSCI call
> forwarding is enabled is based on what Linux v5.13 handles. Have you considered
> choosing this boundary based on something less arbitrary, like the function types
> specified in ARM DEN 0028C, table 2-1?

For PSCI I've used the range 0-0x1f as the boundary, which is reserved for
PSCI by SMCCC (table 6-4 in that document).

> 
> In my opinion, setting the MP state to HALTED looks like a sensible approach to
> implementing PSCI_SUSPEND. I'll take a closer look at the patches after I get a
> better understanding about what is going on.
> 
> On 6/8/21 4:48 PM, Jean-Philippe Brucker wrote:
> > Allow userspace to request handling PSCI calls from guests. Our goal is
> > to enable a vCPU hot-add solution for Arm where the VMM presents
> > possible resources to the guest at boot, and controls which vCPUs can be
> > brought up by allowing or denying PSCI CPU_ON calls. Passing HVC and
> > PSCI to userspace has been discussed on the list in the context of vCPU
> > hot-add [1,2] but it can also be useful for implementing other SMCCC and
> > vendor hypercalls [3,4,5].
> >
> > Patches 1-3 allow userspace to request WFI to be executed in KVM. That
> 
> I don't understand this. KVM, in kvm_vcpu_block(), does not execute an WFI.
> PSCI_SUSPEND is documented as being indistinguishable from an WFI from the guest's
> point of view, but it's implementation is not architecturally defined.

Yes that was an oversimplification on my part

Thanks,
Jean

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace
  2021-07-19 18:02   ` Jean-Philippe Brucker
@ 2021-07-19 19:37     ` Oliver Upton
  2021-07-21 17:46       ` Jean-Philippe Brucker
  0 siblings, 1 reply; 19+ messages in thread
From: Oliver Upton @ 2021-07-19 19:37 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Alexandru Elisei, salil.mehta, lorenzo.pieralisi, kvm, corbet,
	maz, linux-kernel, jonathan.cameron, catalin.marinas, pbonzini,
	will, kvmarm, linux-arm-kernel

On Mon, Jul 19, 2021 at 11:02 AM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
> We forward the whole PSCI function range, so it's either KVM or userspace.
> If KVM manages PSCI and the guest calls an unimplemented function, that
> returns directly to the guest without going to userspace.
>
> The concern is valid for any other range, though. If userspace enables the
> HVC cap it receives function calls that at some point KVM might need to
> handle itself. So we need some negotiation between user and KVM about the
> specific HVC ranges that userspace can and will handle.

Are we going to use KVM_CAPs for every interesting HVC range that
userspace may want to trap? I wonder if a more generic interface for
hypercall filtering would have merit to handle the aforementioned
cases, and whatever else a VMM will want to intercept down the line.

For example, x86 has the concept of 'MSR filtering', wherein userspace
can specify a set of registers that it wants to intercept. Doing
something similar for HVCs would avoid the need for a kernel change
each time a VMM wishes to intercept a new hypercall.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace
  2021-07-19 19:37     ` Oliver Upton
@ 2021-07-21 17:46       ` Jean-Philippe Brucker
  0 siblings, 0 replies; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-07-21 17:46 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Alexandru Elisei, salil.mehta, lorenzo.pieralisi, kvm, corbet,
	maz, linux-kernel, jonathan.cameron, catalin.marinas, pbonzini,
	will, kvmarm, linux-arm-kernel

On Mon, Jul 19, 2021 at 12:37:52PM -0700, Oliver Upton wrote:
> On Mon, Jul 19, 2021 at 11:02 AM Jean-Philippe Brucker
> <jean-philippe@linaro.org> wrote:
> > We forward the whole PSCI function range, so it's either KVM or userspace.
> > If KVM manages PSCI and the guest calls an unimplemented function, that
> > returns directly to the guest without going to userspace.
> >
> > The concern is valid for any other range, though. If userspace enables the
> > HVC cap it receives function calls that at some point KVM might need to
> > handle itself. So we need some negotiation between user and KVM about the
> > specific HVC ranges that userspace can and will handle.
> 
> Are we going to use KVM_CAPs for every interesting HVC range that
> userspace may want to trap? I wonder if a more generic interface for
> hypercall filtering would have merit to handle the aforementioned
> cases, and whatever else a VMM will want to intercept down the line.
> 
> For example, x86 has the concept of 'MSR filtering', wherein userspace
> can specify a set of registers that it wants to intercept. Doing
> something similar for HVCs would avoid the need for a kernel change
> each time a VMM wishes to intercept a new hypercall.

Yes we could introduce a VM device group for this:
* User reads attribute KVM_ARM_VM_HVC_NR_SLOTS, which defines the number
  of available HVC ranges.
* User writes attribute KVM_ARM_VM_HVC_SET_RANGE with one range
  struct kvm_arm_hvc_range {
          __u32 slot;
  #define KVM_ARM_HVC_USER (1 << 0) /* Enable range. 0 disables it */
          __u16 flags;
	  __u16 imm;
          __u32 fn_start;
          __u32 fn_end;
  };
* KVM forwards any HVC within this range to userspace.
* If one of the ranges is PSCI functions, disable KVM PSCI.

Since it's more work for KVM to keep track of ranges, I didn't include it
in the RFC, and I'm going to leave it to the next person dealing with this
stuff :)

Thanks,
Jean

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace
  2021-06-08 15:48 [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace Jean-Philippe Brucker
                   ` (5 preceding siblings ...)
  2021-07-19 15:29 ` [RFC PATCH 0/5] KVM: arm64: Pass PSCI " Alexandru Elisei
@ 2021-07-21 17:56 ` Jean-Philippe Brucker
  6 siblings, 0 replies; 19+ messages in thread
From: Jean-Philippe Brucker @ 2021-07-21 17:56 UTC (permalink / raw)
  To: maz
  Cc: kvm, kvmarm, linux-arm-kernel, linux-kernel, pbonzini, corbet,
	james.morse, alexandru.elisei, suzuki.poulose, catalin.marinas,
	will, lorenzo.pieralisi, salil.mehta, shameerali.kolothum.thodi,
	jonathan.cameron, tabba, oupton

On Tue, Jun 08, 2021 at 05:48:01PM +0200, Jean-Philippe Brucker wrote:
> Allow userspace to request handling PSCI calls from guests. Our goal is
> to enable a vCPU hot-add solution for Arm where the VMM presents
> possible resources to the guest at boot, and controls which vCPUs can be
> brought up by allowing or denying PSCI CPU_ON calls.

Since it looks like vCPU hot-add will be implemented differently, I don't
intend to resend this series at the moment. But some of it could be
useful for other projects and to avoid the helpful review effort going to
waste, I fixed it up and will leave it on branch
https://jpbrucker.net/git/linux/log/?h=kvm/psci-to-userspace
It now only uses KVM_CAP_EXIT_HYPERCALL introduced in v5.14.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-07-21 17:56 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-08 15:48 [RFC PATCH 0/5] KVM: arm64: Pass PSCI to userspace Jean-Philippe Brucker
2021-06-08 15:48 ` [RFC PATCH 1/5] KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch Jean-Philippe Brucker
2021-06-10 15:08   ` Jonathan Cameron
2021-07-01  9:44   ` Fuad Tabba
2021-06-08 15:48 ` [RFC PATCH 2/5] KVM: arm64: Move WFI execution to check_vcpu_requests() Jean-Philippe Brucker
2021-07-01  9:45   ` Fuad Tabba
2021-06-08 15:48 ` [RFC PATCH 3/5] KVM: arm64: Allow userspace to request WFI Jean-Philippe Brucker
2021-07-01  9:45   ` Fuad Tabba
2021-06-08 15:48 ` [RFC PATCH 4/5] KVM: arm64: Pass hypercalls to userspace Jean-Philippe Brucker
2021-06-10 16:02   ` Jonathan Cameron
2021-07-01  9:46   ` Fuad Tabba
2021-07-01  9:54   ` Russell King (Oracle)
2021-06-08 15:48 ` [RFC PATCH 5/5] KVM: arm64: Pass PSCI calls " Jean-Philippe Brucker
2021-07-01  9:46   ` Fuad Tabba
2021-07-19 15:29 ` [RFC PATCH 0/5] KVM: arm64: Pass PSCI " Alexandru Elisei
2021-07-19 18:02   ` Jean-Philippe Brucker
2021-07-19 19:37     ` Oliver Upton
2021-07-21 17:46       ` Jean-Philippe Brucker
2021-07-21 17:56 ` Jean-Philippe Brucker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).