From: Grzegorz Jaszczyk <jaz@semihalf.com>
To: "Limonciello, Mario" <mario.limonciello@amd.com>,
	Sean Christopherson <seanjc@google.com>
Cc: linux-kernel@vger.kernel.org, Dmytro Maluka <dmy@semihalf.com>,
	Zide Chen <zide.chen@intel.corp-partner.google.com>,
	Peter Fang <peter.fang@intel.corp-partner.google.com>,
	Tomasz Nowicki <tn@semihalf.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<x86@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Len Brown <lenb@kernel.org>, Pavel Machek <pavel@ucw.cz>,
	Ashish Kalra <ashish.kalra@amd.com>,
	Hans de Goede <hdegoede@redhat.com>,
	Sachi King <nakato@nakato.io>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	David Dunn <daviddunn@google.com>,
	Wei Wang <wei.w.wang@intel.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	"open list:KERNEL VIRTUAL MACHINE (KVM)" <kvm@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	"open list:ACPI" <linux-acpi@vger.kernel.org>,
	"open list:HIBERNATION (aka Software Suspend,
	aka swsusp)"  <linux-pm@vger.kernel.org>,
	Dominik Behr <dbehr@google.com>,
	Dmitry Torokhov <dtor@google.com>
Subject: Re: [PATCH 1/2] x86: notify hypervisor about guest entering s2idle state
Date: Thu, 23 Jun 2022 18:50:13 +0200
Message-ID: <CAH76GKNB0V+-Ky6bfhX6Kzudyn6zJW42iSWfRkfbo9C-eKdo-w@mail.gmail.com>
In-Reply-To: <7c428b03-261f-78cb-4ce3-5949ac93f028@amd.com>

On Wed, 22 Jun 2022 at 23:50, Limonciello, Mario
<mario.limonciello@amd.com> wrote:
>
> On 6/22/2022 04:53, Grzegorz Jaszczyk wrote:
> > On Mon, 20 Jun 2022 at 18:32, Limonciello, Mario
> > <mario.limonciello@amd.com> wrote:
> >>
> >> On 6/20/2022 10:43, Grzegorz Jaszczyk wrote:
> >>> On Thu, 16 Jun 2022 at 18:58, Limonciello, Mario
> >>> <mario.limonciello@amd.com> wrote:
> >>>>
> >>>> On 6/16/2022 11:48, Sean Christopherson wrote:
> >>>>> On Wed, Jun 15, 2022, Grzegorz Jaszczyk wrote:
> >>>>>> On Fri, 10 Jun 2022 at 16:30, Sean Christopherson <seanjc@google.com> wrote:
> >>>>>>> MMIO or PIO for the actual exit, there's nothing special about hypercalls.  As for
> >>>>>>> enumerating to the guest that it should do something, why not add a new ACPI_LPS0_*
> >>>>>>> function?  E.g. something like
> >>>>>>>
> >>>>>>> static void s2idle_hypervisor_notify(void)
> >>>>>>> {
> >>>>>>>            if (lps0_dsm_func_mask > 0)
> >>>>>>>                    acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT_HYPERVISOR_NOTIFY,
> >>>>>>>                                            lps0_dsm_func_mask, lps0_dsm_guid);
> >>>>>>> }
> >>>>>>
> >>>>>> Great, thank you for your suggestion! I will try this approach and
> >>>>>> come back. Since this will be the main change in the next version,
> >>>>>> will it be ok for you to add Suggested-by: Sean Christopherson
> >>>>>> <seanjc@google.com> tag?
> >>>>>
> >>>>> If you want, but there's certainly no need to do so.  But I assume you or someone
> >>>>> at Intel will need to get formal approval for adding another ACPI LPS0 function?
> >>>>> I.e. isn't there work to be done outside of the kernel before any patches can be
> >>>>> merged?
> >>>>
> >>>> There are 3 different LPS0 GUIDs in use.  An Intel one, an AMD (legacy)
> >>>> one, and a Microsoft one.  They all have their own specs, and so if this
> >>>> was to be added I think all 3 need to be updated.
> >>>
> >>> Yes this will not be easy to achieve I think.
> >>>
> >>>>
> >>>> As this is Linux-specific hypervisor behavior, I don't know that you
> >>>> would be able to convince Microsoft to update theirs either.
> >>>>
> >>>> How about using s2idle_devops?  There is a prepare() call and a
> >>>> restore() call that is set for each handler.  The only consumer of this
> >>>> ATM I'm aware of is the amd-pmc driver, but it's done like a
> >>>> notification chain so that a bunch of drivers can hook in if they need to.
> >>>>
> >>>> Then you can have this notification path and the associated ACPI device
> >>>> it calls out to be its own driver.
> >>>
> >>> Thank you for your suggestion. Just to be sure that I've understood
> >>> your idea correctly:
> >>> 1) it will require extending acpi_s2idle_dev_ops with something like a
> >>> hypervisor_notify() call, since the existing prepare() is called from
> >>> the end of acpi_s2idle_prepare_late, which is too early, as described
> >>> in one of the previous messages (between acpi_s2idle_prepare_late and
> >>> the place where we use the hypercall there are several places where
> >>> the suspend could be canceled; otherwise we could probably try to trap
> >>> on another acpi_sleep_run_lps0_dsm occurrence from
> >>> acpi_s2idle_prepare_late).
> >>>
> >>
> >> The idea for prepare() was it would be the absolute last thing before
> >> the s2idle loop was run.  You're sure that's too early?  It's basically
> >> the same thing as having a last stage new _DSM call.
> >>
> >> What about adding a new abort() extension to acpi_s2idle_dev_ops?  Then
> >> you could catch the cancelled suspend case still and take corrective
> >> action (if that action is different than what restore() would do).
> >
> > > It will be problematic since the abort/restore notification could
> > > arrive too late, and therefore the whole system will go to suspend
> > > thinking that the guest is in the desired s2idle state. Also, in this
> > > case it would be impossible to prevent races and to actually make sure
> > > whether the guest is suspended or not. We already had a similar
> > > discussion with Sean earlier in this thread about why the notification
> > > has to be sent just before swait_event_exclusive(s2idle_wait_head,
> > > s2idle_state == S2IDLE_STATE_WAKE) and why the VMM has to have control
> > > over guest resumption.
> > >
> > > Nevertheless, if extending acpi_s2idle_dev_ops is possible, why not
> > > extend it with hypervisor_notify() and use it in the same place
> > > where the hypercall is used in this patch? Do you see any issue with
> > > that?
>
> If this needs to be a hypercall and the hypercall needs to go at that
> specific time, I wouldn't bother with extending acpi_s2idle_dev_ops.
> The whole idea there was that this would be less custom and could follow
> a spec.

Just to clarify: it probably doesn't need to be a hypercall. I probably
misled you by copy-pasting a handler name from the current patch while
aiming for the ACPI-like approach that you and Sean suggested. What I
meant is something like:
- extend acpi_s2idle_dev_ops with notify()
- implement a notify() handler for acpi_s2idle_dev_ops in the HYPE0001
driver (without a hypercall):
static void s2idle_notify(void)
{
        acpi_evaluate_dsm(acpi_handle, guid_of_HYPE0001, 0,
                          ACPI_HYPE_NOTIFY, NULL);
}

- register it via acpi_register_lps0_dev() from the HYPE0001 driver
- use it just before swait_event_exclusive(s2idle_wait_head..), as in
the original patch (the name of the function will be different):
static void s2idle_hypervisor_notify(void)
{
        struct acpi_s2idle_dev_ops *handler;
...
        list_for_each_entry(handler, &lps0_s2idle_devops_head, list_node) {
                if (handler->notify)
                        handler->notify();
        }
}

so it will be like:
-> s2idle_enter (just before swait_event_exclusive(s2idle_wait_head,.. )
--> s2idle_hypervisor_notify (as platform_s2idle_ops)
---> notify (as acpi_s2idle_dev_ops)
----> HYPE0001 device driver's notify () routine

It will probably be easier to understand once I actually implement
it. Nevertheless, this way we ensure that:
- the notification will be triggered as the very last step before
actually entering s2idle
- we can trap on MMIO/PIO by implementing a HYPE0001-specific _DSM
method, so this implementation will not become hypervisor-specific and
will also avoid using KVM as a "dumb pipe out to userspace", as Sean
suggested
- we will not have to change the existing Intel/AMD/Microsoft specs (3
different LPS0 GUIDs) but, thanks to HYPE0001's acpi_s2idle_dev_ops
involvement, only care about the new HYPE0001 spec
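
The ordering argument in the first point can be illustrated with a toy model. It is a sketch under stated assumptions, not kernel code: the enum, the `_model` functions and the two boolean parameters are hypothetical, and the "cancel window" is collapsed into a single check, whereas in the real suspend path there are several places where suspend can be canceled.

```c
#include <assert.h>
#include <stdbool.h>

/* What the VMM believes about the guest (hypothetical model). */
enum vmm_view { GUEST_RUNNING, GUEST_IN_S2IDLE };

static enum vmm_view vmm_state = GUEST_RUNNING;

static void hypervisor_notify_model(void) { vmm_state = GUEST_IN_S2IDLE; }

/* Returns true if the guest really reached the s2idle wait.
 * notify_early: notify from the prepare() stage instead of right
 *               before the wait.
 * canceled:     a wakeup aborts the suspend between the two stages. */
static bool suspend_model(bool notify_early, bool canceled)
{
	vmm_state = GUEST_RUNNING;

	/* acpi_s2idle_prepare_late() stage */
	if (notify_early)
		hypervisor_notify_model();

	/* window in which the suspend may still be canceled */
	if (canceled)
		return false;

	/* s2idle_enter(), just before swait_event_exclusive() */
	if (!notify_early)
		hypervisor_notify_model();

	/* by the time the guest waits, the VMM has been told */
	assert(vmm_state == GUEST_IN_S2IDLE);
	return true;
}

/* True when the VMM's belief matches the guest's real state. */
static bool vmm_view_is_consistent(bool notify_early, bool canceled)
{
	bool idle = suspend_model(notify_early, canceled);

	return idle == (vmm_state == GUEST_IN_S2IDLE);
}
```

In this model the only inconsistent combination is an early notification followed by a canceled suspend: the VMM believes the guest is in s2idle while it is still running. Notifying right before the wait avoids that case.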

>
> TBH - given the strong dependency on being the very last command and
> this being all Linux specific (you won't need to do something similar
> with Windows) - I think the way you already did it makes the most sense.
> It seems to me the ACPI device model doesn't really work well for this
> scenario.
>
> >
> >>
> >>> 2) using the newly introduced acpi_s2idle_dev_ops hypervisor_notify()
> >>> call will allow registering a handler from the Intel
> >>> x86/intel/pmc/core.c driver and/or the AMD x86/amd-pmc.c driver.
> >>> Therefore we will need to get only Intel and/or AMD approval for
> >>> extending the ACPI LPS0 _DSM method, correct?
> >>>
> >>
> >> Right now the only thing that hooks prepare()/restore() is the amd-pmc
> >> driver (unless Intel's PMC had a change I didn't catch yet).
> >>
> >> I don't think you should be changing any existing drivers but rather
> >> introduce another platform driver for this specific case.
> >>
> >> So it would be something like this:
> >>
> >> acpi_s2idle_prepare_late
> >> -> prepare()
> >> --> AMD: amd_pmc handler for prepare()
> >> --> Intel: intel_pmc handler for prepare() (conceptual)
> >> --> HYPE0001 device: new driver's prepare() routine
> >>
> >> So the platform driver would match the HYPE0001 device to load, and it
> >> wouldn't do anything other than provide a prepare()/restore() handler
> >> for your case.
> >>
> >> You don't need to change any existing specs.  If anything a new spec to
> >> go with this new ACPI device would be made.  Someone would need to
> >> reserve the ID and such for it, but I think you can mock it up in advance.
> >
> > > Thank you for your explanation. This means that I should register
> > > "HYPE" through https://uefi.org/PNP_ACPI_Registry before introducing
> > > this new driver to Linux.
> > I have no experience with the above, so I wonder who should be
> > responsible for maintaining such ACPI ID since it will not belong to
> > any specific vendor? There is an example of e.g. COREBOOT PROJECT
> > using "BOOT" ACPI ID [1], which seems similar in terms of not
> > specifying any vendor but rather the project as a responsible entity.
> > Maybe you have some recommendations?
>
> Maybe LF could own a namespace and ID?  But I would suggest you make a
> mockup that everything works this way before you go explore too much.

Yeah, sure.

>
> Also make sure Rafael is aligned with your mockup.

Agree.

>
> >
> > > I am also not sure if and where a specification describing such a
> > > device has to be maintained. Since "HYPE0001" will have its own _DSM,
> > > will it be required to document it somewhere, rather than just using
> > > it in the driver and preparing proper ACPI tables for the guest?
> >
> >>
> >>> I wonder if this will be feasible, so I am just thinking out loud: is
> >>> there no other mechanism that could be suggested and used upstream
> >>> so we could notify the hypervisor/VMM about the guest entering the
> >>> s2idle state? Especially since such a _DSM function would be
> >>> introduced only to trap on some fake MMIO/PIO access and would be
> >>> useful only for guest ACPI tables?
> >>>
> >>
> >> Do you need to worry about Microsoft guests using Modern Standby too or
> >> is that out of the scope of your problem set?  I think you'll be a lot
> >> more limited in how this can behave and where you can modify things if so.
> >>
> >
> > I do not need to worry about Microsoft guests.
>
> Makes life a lot easier :)

Agree :) and thank you for all your feedback,
Grzegorz

