From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oupton@google.com>
Cc: kvm@vger.kernel.org, Peter Shier <pshier@google.com>,
kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v2 06/11] KVM: arm64: Add support for SYSTEM_SUSPEND PSCI call
Date: Thu, 30 Sep 2021 13:29:47 +0100 [thread overview]
Message-ID: <877deytfes.wl-maz@kernel.org> (raw)
In-Reply-To: <20210923191610.3814698-7-oupton@google.com>
Hi Oliver,
On Thu, 23 Sep 2021 20:16:05 +0100,
Oliver Upton <oupton@google.com> wrote:
>
> ARM DEN0022D 5.19 "SYSTEM_SUSPEND" describes a PSCI call that may be
> used to request a system be suspended. This is optional for PSCI v1.0
> and to date KVM has elected to not implement the call. However, a
> VMM/operator may wish to provide their guests with the ability to
> suspend/resume, necessitating this PSCI call.
>
> Implement support for SYSTEM_SUSPEND according to the prescribed
> behavior in the specification. Add a new system event exit type,
> KVM_SYSTEM_EVENT_SUSPEND, to notify userspace when a VM has requested a
> system suspend. Make KVM_MP_STATE_HALTED a valid state on arm64.
KVM_MP_STATE_HALTED is a per-CPU state on x86 (it denotes HLT). Does
it make really sense to hijack this for something that is more of a
VM-wide state? I can see that it is tempting to do so as we're using
the WFI semantics (which are close to HLT's, in a twisted kind of
way), but I'm also painfully aware that gluing x86 expectations on
arm64 rarely leads to a palatable result.
> Userspace can set this to request an in-kernel emulation of the suspend.
>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
> Documentation/virt/kvm/api.rst | 6 ++++
> arch/arm64/include/asm/kvm_host.h | 3 ++
> arch/arm64/kvm/arm.c | 8 +++++
> arch/arm64/kvm/psci.c | 60 +++++++++++++++++++++++++++++++
> include/uapi/linux/kvm.h | 2 ++
> 5 files changed, 79 insertions(+)
This patch needs splitting. PSCI interface in one, mpstate stuff in
another, userspace management last.
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index a6729c8cf063..361a57061b8f 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -5656,6 +5656,7 @@ should put the acknowledged interrupt vector into the 'epr' field.
> #define KVM_SYSTEM_EVENT_SHUTDOWN 1
> #define KVM_SYSTEM_EVENT_RESET 2
> #define KVM_SYSTEM_EVENT_CRASH 3
> + #define KVM_SYSTEM_EVENT_SUSPEND 4
> __u32 type;
> __u64 flags;
> } system_event;
> @@ -5680,6 +5681,11 @@ Valid values for 'type' are:
> has requested a crash condition maintenance. Userspace can choose
> to ignore the request, or to gather VM memory core dump and/or
> reset/shutdown of the VM.
> + - KVM_SYSTEM_EVENT_SUSPEND -- the guest has requested that the VM
> + suspends. Userspace is not obliged to honor this, and may call KVM_RUN
> + again. Doing so will cause the guest to resume at its requested entry
> + point. For ARM64, userspace can request in-kernel suspend emulation
> + by setting the vCPU's MP state to KVM_MP_STATE_HALTED.
>
> ::
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 1beda1189a15..441eb6fa7adc 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -137,6 +137,9 @@ struct kvm_arch {
>
> /* Memory Tagging Extension enabled for the guest */
> bool mte_enabled;
> +
> + /* PSCI SYSTEM_SUSPEND call enabled for the guest */
> + bool suspend_enabled;
> };
>
> struct kvm_vcpu_fault_info {
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index f1a375648e25..d875d3bcf3c5 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -101,6 +101,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> }
> mutex_unlock(&kvm->lock);
> break;
> + case KVM_CAP_ARM_SYSTEM_SUSPEND:
> + r = 0;
> + kvm->arch.suspend_enabled = true;
I don't really fancy adding another control here. Given that we now
have the PSCI version being controlled by a firmware pseudo-register,
I'd rather have that there.
And if we do that, I wonder whether there would be any benefit in
bumping the PSCI version to 1.1.
> + break;
> default:
> r = -EINVAL;
> break;
> @@ -215,6 +219,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_SET_GUEST_DEBUG:
> case KVM_CAP_VCPU_ATTRIBUTES:
> case KVM_CAP_PTP_KVM:
> + case KVM_CAP_ARM_SYSTEM_SUSPEND:
> r = 1;
> break;
> case KVM_CAP_SET_GUEST_DEBUG2:
> @@ -470,6 +475,9 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
> int ret = 0;
>
> switch (mp_state->mp_state) {
> + case KVM_MP_STATE_HALTED:
> + kvm_make_request(KVM_REQ_SUSPEND, vcpu);
> + fallthrough;
> case KVM_MP_STATE_RUNNABLE:
> vcpu->arch.power_off = false;
> break;
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index d453666ddb83..cf869f1f8615 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -203,6 +203,46 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
> kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_RESET);
> }
>
> +static int kvm_psci_system_suspend(struct kvm_vcpu *vcpu)
> +{
> + unsigned long entry_addr, context_id;
> + struct kvm *kvm = vcpu->kvm;
> + unsigned long psci_ret = 0;
> + struct kvm_vcpu *tmp;
> + int ret = 0;
> + int i;
> +
> + /*
> + * The SYSTEM_SUSPEND PSCI call requires that all vCPUs (except the
> + * calling vCPU) be in an OFF state, as determined by the
> + * implementation.
> + *
> + * See ARM DEN0022D, 5.19 "SYSTEM_SUSPEND" for more details.
> + */
> + mutex_lock(&kvm->lock);
> + kvm_for_each_vcpu(i, tmp, kvm) {
> + if (tmp != vcpu && !tmp->arch.power_off) {
> + psci_ret = PSCI_RET_DENIED;
> + ret = 1;
> + goto out;
> + }
> + }
> +
> + entry_addr = smccc_get_arg1(vcpu);
> + context_id = smccc_get_arg2(vcpu);
> +
> + kvm_psci_vcpu_request_reset(vcpu, entry_addr, context_id,
> + kvm_vcpu_is_be(vcpu));
So even if userspace doesn't want to honor the suspend request, the
CPU ends up resetting? That's not wrong, but maybe a bit surprising.
> +
> + memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
> + vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SUSPEND;
> + vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
Consider spinning out a helper common to this and kvm_prepare_system_event().
> +out:
> + mutex_unlock(&kvm->lock);
> + smccc_set_retval(vcpu, psci_ret, 0, 0, 0);
> + return ret;
> +}
> +
> static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
> {
> int i;
> @@ -223,6 +263,14 @@ static unsigned long kvm_psci_check_allowed_function(struct kvm_vcpu *vcpu, u32
> if ((fn & PSCI_0_2_64BIT) && vcpu_mode_is_32bit(vcpu))
> return PSCI_RET_NOT_SUPPORTED;
>
> + switch (fn) {
> + case PSCI_1_0_FN_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> + if (!vcpu->kvm->arch.suspend_enabled)
> + return PSCI_RET_NOT_SUPPORTED;
> + break;
> + }
> +
> return 0;
> }
>
> @@ -316,6 +364,10 @@ static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
> unsigned long val;
> int ret = 1;
>
> + val = kvm_psci_check_allowed_function(vcpu, psci_fn);
> + if (val)
> + goto out;
> +
> switch(psci_fn) {
> case PSCI_0_2_FN_PSCI_VERSION:
> val = KVM_ARM_PSCI_1_0;
> @@ -339,6 +391,8 @@ static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
> case PSCI_0_2_FN_SYSTEM_OFF:
> case PSCI_0_2_FN_SYSTEM_RESET:
> case PSCI_1_0_FN_PSCI_FEATURES:
> + case PSCI_1_0_FN_SYSTEM_SUSPEND:
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> case ARM_SMCCC_VERSION_FUNC_ID:
> val = 0;
> break;
> @@ -347,10 +401,16 @@ static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
> break;
> }
> break;
> + case PSCI_1_0_FN_SYSTEM_SUSPEND:
> + kvm_psci_narrow_to_32bit(vcpu);
> + fallthrough;
> + case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> + return kvm_psci_system_suspend(vcpu);
> default:
> return kvm_psci_0_2_call(vcpu);
> }
>
> +out:
> smccc_set_retval(vcpu, val, 0, 0, 0);
> return ret;
> }
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index a067410ebea5..052b0e717b08 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -433,6 +433,7 @@ struct kvm_run {
> #define KVM_SYSTEM_EVENT_SHUTDOWN 1
> #define KVM_SYSTEM_EVENT_RESET 2
> #define KVM_SYSTEM_EVENT_CRASH 3
> +#define KVM_SYSTEM_EVENT_SUSPEND 4
> __u32 type;
> __u64 flags;
> } system_event;
> @@ -1112,6 +1113,7 @@ struct kvm_ppc_resize_hpt {
> #define KVM_CAP_BINARY_STATS_FD 203
> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
> #define KVM_CAP_ARM_MTE 205
> +#define KVM_CAP_ARM_SYSTEM_SUSPEND 206
>
> #ifdef KVM_CAP_IRQ_ROUTING
>
I think there is an additional feature we need, which is to give
control back to userspace every time a wake-up condition occurs (I did
touch on that in [1]). This would give the opportunity to the VMM to
do whatever it needs to perform.
A typical use case would be to implement wake-up from certain
interrupts only (mask non-wake-up IRQs on suspend request, return to
the guest for WFI, wake-up returns to the VMM to restore the previous
interrupt configuration, resume).
Userspace would be free to clear the suspend state and resume the
guest, or to reenter WFI.
Thanks,
M.
[1] https://lkml.kernel.org/kvm/87k0jauurx.wl-maz@kernel.org/
--
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
next prev parent reply other threads:[~2021-09-30 12:29 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-23 19:15 [PATCH v2 00/11] KVM: arm64: Implement PSCI SYSTEM_SUSPEND support Oliver Upton
2021-09-23 19:16 ` [PATCH v2 01/11] KVM: arm64: Drop unused vcpu param to kvm_psci_valid_affinity() Oliver Upton
2021-10-01 3:50 ` Reiji Watanabe
2021-10-05 13:22 ` Andrew Jones
2021-09-23 19:16 ` [PATCH v2 02/11] KVM: arm64: Clean up SMC64 PSCI filtering for AArch32 guests Oliver Upton
2021-10-01 3:56 ` Reiji Watanabe
2021-10-05 13:23 ` Andrew Jones
2021-09-23 19:16 ` [PATCH v2 03/11] KVM: arm64: Encapsulate reset request logic in a helper function Oliver Upton
2021-10-01 6:04 ` Reiji Watanabe
2021-10-01 16:10 ` Oliver Upton
2021-10-05 13:33 ` Andrew Jones
2021-10-05 15:05 ` Oliver Upton
2021-10-05 19:01 ` Andrew Jones
2021-10-13 4:48 ` Reiji Watanabe
2021-10-05 13:35 ` Andrew Jones
2021-09-23 19:16 ` [PATCH v2 04/11] KVM: arm64: Rename the KVM_REQ_SLEEP handler Oliver Upton
2021-10-05 13:34 ` Andrew Jones
2021-09-23 19:16 ` [PATCH v2 05/11] KVM: arm64: Defer WFI emulation as a requested event Oliver Upton
2021-09-30 10:50 ` Marc Zyngier
2021-09-30 17:09 ` Sean Christopherson
2021-09-30 17:32 ` Oliver Upton
2021-09-30 18:08 ` Sean Christopherson
2021-09-30 21:57 ` Oliver Upton
2021-10-01 13:57 ` Marc Zyngier
2021-09-23 19:16 ` [PATCH v2 06/11] KVM: arm64: Add support for SYSTEM_SUSPEND PSCI call Oliver Upton
2021-09-30 12:29 ` Marc Zyngier [this message]
2021-09-30 17:19 ` Sean Christopherson
2021-09-30 17:35 ` Oliver Upton
2021-09-30 17:40 ` Oliver Upton
2021-10-01 14:02 ` Marc Zyngier
2021-10-05 16:02 ` Oliver Upton
2021-09-23 19:16 ` [PATCH v2 07/11] selftests: KVM: Rename psci_cpu_on_test to psci_test Oliver Upton
2021-10-05 13:36 ` Andrew Jones
2021-09-23 19:16 ` [PATCH v2 08/11] selftests: KVM: Create helper for making SMCCC calls Oliver Upton
2021-10-05 13:39 ` Andrew Jones
2021-09-23 19:16 ` [PATCH v2 09/11] selftests: KVM: Use KVM_SET_MP_STATE to power off vCPU in psci_test Oliver Upton
2021-09-23 19:16 ` [PATCH v2 10/11] selftests: KVM: Refactor psci_test to make it amenable to new tests Oliver Upton
2021-10-05 13:45 ` Andrew Jones
2021-10-05 14:54 ` Oliver Upton
2021-10-05 19:05 ` Andrew Jones
2021-09-23 19:16 ` [PATCH v2 11/11] selftests: KVM: Test SYSTEM_SUSPEND PSCI call Oliver Upton
2021-10-05 13:49 ` Andrew Jones
2021-10-05 15:07 ` Oliver Upton
2021-09-23 20:15 ` [PATCH v2 00/11] KVM: arm64: Implement PSCI SYSTEM_SUSPEND support Oliver Upton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=877deytfes.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=oupton@google.com \
--cc=pshier@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).