All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oupton@google.com>
Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
	Peter Shier <pshier@google.com>,
	Ricardo Koller <ricarkol@google.com>,
	Jing Zhang <jingzhangos@google.com>,
	Raghavendra Rao Anata <rananta@google.com>,
	James Morse <james.morse@arm.com>,
	Alexandru Elisei <Alexandru.Elisei@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Andrew Jones <drjones@redhat.com>
Subject: Re: [PATCH 0/6] KVM: arm64: Implement PSCI SYSTEM_SUSPEND support
Date: Tue, 07 Sep 2021 18:43:42 +0100	[thread overview]
Message-ID: <87a6kocmcx.wl-maz@kernel.org> (raw)
In-Reply-To: <CAOQ_QsgOtufyB6_qGAs4fQf6kd81FSMSj44uiVRgoFQWOf3nRA@mail.gmail.com>

On Tue, 07 Sep 2021 17:30:33 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> On Mon, Sep 6, 2021 at 4:12 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > Hi Oliver,
> >
> > On Thu, 19 Aug 2021 23:36:34 +0100,
> > Oliver Upton <oupton@google.com> wrote:
> > >
> > > Certain VMMs/operators may wish to give their guests the ability to
> > > initiate a system suspend that could result in the VM being saved to
> > > persistent storage to be resumed at a later time. The PSCI v1.0
> > > specification describes an SMC, SYSTEM_SUSPEND, that allows a kernel to
> > > request a system suspend. This call is optional for v1.0, and KVM
> > > elected to not support the call in its v1.0 implementation.
> > >
> > > This series adds support for the SYSTEM_SUSPEND PSCI call to KVM/arm64.
> > > Since this is a system-scoped event, KVM cannot quiesce the VM on its
> > > own. We add a new system exit type in this series to clue in userspace
> > > that a suspend was requested. Per the KVM_EXIT_SYSTEM_EVENT ABI, a VMM
> > > that doesn't care about this event can simply resume the guest without
> > > issue (we set up the calling vCPU to come out of reset correctly on next
> > > KVM_RUN).
> >
> > More idle thoughts on this:
> >
> > Although the definition of SYSTEM_SUSPEND is very simple from a PSCI
> > perspective, I don't think it is that simple at the system level,
> > because PSCI is only concerned with the CPU.
> >
> > For example, what is a wake-up event? My first approach would be to
> > consider interrupts to be such events. However, this approach suffers
> > from at least two issues:
> >
> > - How do you define which interrupts are actual wake-up events?
> >   Nothing in the GIC architecture defines what a wake-up is (let alone
> >   a wake-up event).
> 
> Good point.
> 
> One possible implementation of suspend could just be a `WFI` in a
> higher EL. In this case, KVM could emulate WFI wake up events
> according to D1.16.2 in DDI 0487G.a. But I agree, it isn't entirely
> clear what constitutes a wakeup from powered down state.

It isn't, and it is actually IMPDEF (there isn't much in the ARM ARM
in terms of what constitutes a low power state). And even if you
wanted to emulate a WFI in userspace, the problem of interrupts that
have their source in the kernel remains. How to you tell userspace
that such an event has occurred if the vcpu thread isn't in the
kernel?

> 
> > - Assuming you have a way to express the above, how do you handle
> >   wake-ups from interrupts that have their source in the kernel (such
> >   as timers, irqfd sources)?
> 
> I think this could be handled, so long as we allow userspace to
> indicate it has woken a vCPU. Depending on this, in the next KVM_RUN
> we'd say:
> 
> - Some IMP DEF event occurred; I'm waking this CPU now

I'm seeing the problem from the other side. The vcpu has exited to
userspace on a SUSPEND event. How is the kernel supposed to tell
userspace that there is a pending interrupt? To do that, you'd have to
keep the vcpu in the kernel on SUSPEND. Which is *exactly* WFI.

> - I've either chosen to ignore the guest or will defer to KVM's
> suspend implementation
> 
> > How do you cope with directly injected interrupts?
> 
> No expert on this, I'll need to do a bit more reading to give a good
> answer here.
> 
> > It looks to me that your implementation can only work with userspace
> > provided events, which is pretty limited.
> 
> Right. I implemented this from the mindset that userspace may do
> something heavyweight when a guest suspends, like save it to a
> persistent store to resume later on. No matter what we do in KVM, I
> think it's probably best to give userspace the right of first refusal
> to handle the suspension.

Maybe. But if you want to handle wake-up from interrupts to actually
work, you must return to the kernel for the wake-up to occurs.

The problem is that you piggyback on an existing feature (suspend) to
implement something else (opportunistic save/restore?). Oddly enough
the stars don't exactly align! ;-)

I have the feeling that a solution to this problem would be to exit to
userspace with something indicating an *intent* to suspend. At this
stage, userspace can do two things:

- resume the guest: the guest may have been moved to some other
  machine, cold storage, whatever... The important thing is that the
  guest is directly runnable without any extra event

- confirm the suspension by returning to the kernel, which will
  execute a blocking WFI on behalf of the guest

With this, you end-up with something that is works from an interrupt
perspective (even for directly injected interrupts), and you can save
your guest on suspend.

>
> > Other items worth considering: ongoing DMA, state of the caches at
> > suspend time, device state in general All of this really needs to be
> > defined before we can move forward with this feature.
> 
> I believe it is largely up to the caller to get devices in a quiesced
> state appropriate for a system suspend, but PSCI is delightfully vague
> on this topic.

Indeed, it only deals with the CPU. Oh look, another opportunity to
write a new spec! :)

> On the contrary, it is up to KVM's implementation to
> guarantee caches are clean when servicing the guest request.

This last point is pretty unclear to me. If the guest doesn't clean to
the PoC (or even to one of the PoPs) when it calls into suspend,
that's a clear indication that it doesn't care about its data. Why
should KVM be more conservative here? It shouldn't be in the business
of working around guest bugs.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oupton@google.com>
Cc: kvm@vger.kernel.org, Peter Shier <pshier@google.com>,
	kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH 0/6] KVM: arm64: Implement PSCI SYSTEM_SUSPEND support
Date: Tue, 07 Sep 2021 18:43:42 +0100	[thread overview]
Message-ID: <87a6kocmcx.wl-maz@kernel.org> (raw)
In-Reply-To: <CAOQ_QsgOtufyB6_qGAs4fQf6kd81FSMSj44uiVRgoFQWOf3nRA@mail.gmail.com>

On Tue, 07 Sep 2021 17:30:33 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> On Mon, Sep 6, 2021 at 4:12 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > Hi Oliver,
> >
> > On Thu, 19 Aug 2021 23:36:34 +0100,
> > Oliver Upton <oupton@google.com> wrote:
> > >
> > > Certain VMMs/operators may wish to give their guests the ability to
> > > initiate a system suspend that could result in the VM being saved to
> > > persistent storage to be resumed at a later time. The PSCI v1.0
> > > specification describes an SMC, SYSTEM_SUSPEND, that allows a kernel to
> > > request a system suspend. This call is optional for v1.0, and KVM
> > > elected to not support the call in its v1.0 implementation.
> > >
> > > This series adds support for the SYSTEM_SUSPEND PSCI call to KVM/arm64.
> > > Since this is a system-scoped event, KVM cannot quiesce the VM on its
> > > own. We add a new system exit type in this series to clue in userspace
> > > that a suspend was requested. Per the KVM_EXIT_SYSTEM_EVENT ABI, a VMM
> > > that doesn't care about this event can simply resume the guest without
> > > issue (we set up the calling vCPU to come out of reset correctly on next
> > > KVM_RUN).
> >
> > More idle thoughts on this:
> >
> > Although the definition of SYSTEM_SUSPEND is very simple from a PSCI
> > perspective, I don't think it is that simple at the system level,
> > because PSCI is only concerned with the CPU.
> >
> > For example, what is a wake-up event? My first approach would be to
> > consider interrupts to be such events. However, this approach suffers
> > from at least two issues:
> >
> > - How do you define which interrupts are actual wake-up events?
> >   Nothing in the GIC architecture defines what a wake-up is (let alone
> >   a wake-up event).
> 
> Good point.
> 
> One possible implementation of suspend could just be a `WFI` in a
> higher EL. In this case, KVM could emulate WFI wake up events
> according to D1.16.2 in DDI 0487G.a. But I agree, it isn't entirely
> clear what constitutes a wakeup from powered down state.

It isn't, and it is actually IMPDEF (there isn't much in the ARM ARM
in terms of what constitutes a low power state). And even if you
wanted to emulate a WFI in userspace, the problem of interrupts that
have their source in the kernel remains. How to you tell userspace
that such an event has occurred if the vcpu thread isn't in the
kernel?

> 
> > - Assuming you have a way to express the above, how do you handle
> >   wake-ups from interrupts that have their source in the kernel (such
> >   as timers, irqfd sources)?
> 
> I think this could be handled, so long as we allow userspace to
> indicate it has woken a vCPU. Depending on this, in the next KVM_RUN
> we'd say:
> 
> - Some IMP DEF event occurred; I'm waking this CPU now

I'm seeing the problem from the other side. The vcpu has exited to
userspace on a SUSPEND event. How is the kernel supposed to tell
userspace that there is a pending interrupt? To do that, you'd have to
keep the vcpu in the kernel on SUSPEND. Which is *exactly* WFI.

> - I've either chosen to ignore the guest or will defer to KVM's
> suspend implementation
> 
> > How do you cope with directly injected interrupts?
> 
> No expert on this, I'll need to do a bit more reading to give a good
> answer here.
> 
> > It looks to me that your implementation can only work with userspace
> > provided events, which is pretty limited.
> 
> Right. I implemented this from the mindset that userspace may do
> something heavyweight when a guest suspends, like save it to a
> persistent store to resume later on. No matter what we do in KVM, I
> think it's probably best to give userspace the right of first refusal
> to handle the suspension.

Maybe. But if you want to handle wake-up from interrupts to actually
work, you must return to the kernel for the wake-up to occurs.

The problem is that you piggyback on an existing feature (suspend) to
implement something else (opportunistic save/restore?). Oddly enough
the stars don't exactly align! ;-)

I have the feeling that a solution to this problem would be to exit to
userspace with something indicating an *intent* to suspend. At this
stage, userspace can do two things:

- resume the guest: the guest may have been moved to some other
  machine, cold storage, whatever... The important thing is that the
  guest is directly runnable without any extra event

- confirm the suspension by returning to the kernel, which will
  execute a blocking WFI on behalf of the guest

With this, you end-up with something that is works from an interrupt
perspective (even for directly injected interrupts), and you can save
your guest on suspend.

>
> > Other items worth considering: ongoing DMA, state of the caches at
> > suspend time, device state in general All of this really needs to be
> > defined before we can move forward with this feature.
> 
> I believe it is largely up to the caller to get devices in a quiesced
> state appropriate for a system suspend, but PSCI is delightfully vague
> on this topic.

Indeed, it only deals with the CPU. Oh look, another opportunity to
write a new spec! :)

> On the contrary, it is up to KVM's implementation to
> guarantee caches are clean when servicing the guest request.

This last point is pretty unclear to me. If the guest doesn't clean to
the PoC (or even to one of the PoPs) when it calls into suspend,
that's a clear indication that it doesn't care about its data. Why
should KVM be more conservative here? It shouldn't be in the business
of working around guest bugs.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

  reply	other threads:[~2021-09-07 17:43 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-19 22:36 [PATCH 0/6] KVM: arm64: Implement PSCI SYSTEM_SUSPEND support Oliver Upton
2021-08-19 22:36 ` Oliver Upton
2021-08-19 22:36 ` [PATCH 1/6] KVM: arm64: Drop unused vcpu param to kvm_psci_valid_affinity() Oliver Upton
2021-08-19 22:36   ` Oliver Upton
2021-08-19 22:36 ` [PATCH 2/6] KVM: arm64: Clean up SMC64 PSCI filtering for AArch32 guests Oliver Upton
2021-08-19 22:36   ` Oliver Upton
2021-08-19 22:36 ` [PATCH 3/6] KVM: arm64: Encapsulate reset request logic in a helper function Oliver Upton
2021-08-19 22:36   ` Oliver Upton
2021-08-19 22:36 ` [PATCH 4/6] KVM: arm64: Add support for SYSTEM_SUSPEND PSCI call Oliver Upton
2021-08-19 22:36   ` Oliver Upton
2021-08-19 22:36 ` [PATCH 5/6] selftests: KVM: Promote PSCI hypercalls to asm volatile Oliver Upton
2021-08-19 22:36   ` Oliver Upton
2021-08-19 22:36 ` [PATCH 6/6] selftests: KVM: Test SYSTEM_SUSPEND PSCI call Oliver Upton
2021-08-19 22:36   ` Oliver Upton
2021-08-19 23:41 ` [PATCH] Documentation: kvm: Document KVM_SYSTEM_EVENT_SUSPEND exit type Oliver Upton
2021-08-19 23:41   ` Oliver Upton
2021-08-22 19:56 ` [PATCH 0/6] KVM: arm64: Implement PSCI SYSTEM_SUSPEND support Oliver Upton
2021-08-22 19:56   ` Oliver Upton
2021-08-26 10:51   ` Marc Zyngier
2021-08-26 10:51     ` Marc Zyngier
2021-08-26 18:37     ` Oliver Upton
2021-08-26 18:37       ` Oliver Upton
2021-08-27 21:58 ` [RFC kvmtool PATCH 0/2] " Oliver Upton
2021-08-27 21:58   ` Oliver Upton
2021-08-27 21:58   ` [RFC kvmtool PATCH 1/2] TESTONLY: KVM: Update KVM headers Oliver Upton
2021-08-27 21:58     ` Oliver Upton
2021-08-27 21:58   ` [RFC kvmtool PATCH 2/2] arm64: Add support for KVM_CAP_ARM_SYSTEM_SUSPEND Oliver Upton
2021-08-27 21:58     ` Oliver Upton
2021-09-06  9:12 ` [PATCH 0/6] KVM: arm64: Implement PSCI SYSTEM_SUSPEND support Marc Zyngier
2021-09-06  9:12   ` Marc Zyngier
2021-09-07 16:30   ` Oliver Upton
2021-09-07 16:30     ` Oliver Upton
2021-09-07 17:43     ` Marc Zyngier [this message]
2021-09-07 17:43       ` Marc Zyngier
2021-09-07 18:14       ` Oliver Upton
2021-09-07 18:14         ` Oliver Upton
2021-09-21  9:45         ` Marc Zyngier
2021-09-21  9:45           ` Marc Zyngier
2021-09-21 18:22           ` Oliver Upton
2021-09-21 18:22             ` Oliver Upton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a6kocmcx.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=Alexandru.Elisei@arm.com \
    --cc=drjones@redhat.com \
    --cc=james.morse@arm.com \
    --cc=jingzhangos@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=oupton@google.com \
    --cc=pshier@google.com \
    --cc=rananta@google.com \
    --cc=ricarkol@google.com \
    --cc=suzuki.poulose@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.