All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: "Limonciello, Mario" <mario.limonciello@amd.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Hans de Goede <hdegoede@redhat.com>,
	Mark Gross <mgross@linux.intel.com>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	"open list:X86 PLATFORM DRIVERS" 
	<platform-driver-x86@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"S-k, Shyam-sundar" <Shyam-sundar.S-k@amd.com>,
	"Goswami, Sanket" <Sanket.Goswami@amd.com>
Subject: Re: [PATCH v3 1/5] ACPI / x86: Add support for LPS0 callback handler
Date: Wed, 16 Mar 2022 18:27:11 +0100	[thread overview]
Message-ID: <CAJZ5v0hdbz4ktUQv=2WA99Tqnq-13we4Qk5z7i_OQmA-ZaQkWQ@mail.gmail.com> (raw)
In-Reply-To: <29058b59-80f8-b29a-c54c-b58779c329e0@amd.com>

On Wed, Mar 16, 2022 at 5:43 PM Limonciello, Mario
<mario.limonciello@amd.com> wrote:
>
> >>>> My preference at this point would be to use a notifier chain, unless
> >>>> that's not sufficient for some reason, because it appears to match the
> >>>> notifier usage model.
> >>>
> >>> Well, I'm actually not sure about that.
> >>>
> >>>>> I pointed this out, because making this change would also make 4/5 a bit
> >>>>> cleaner. You are recreating the same struct lps0_callback_handler on
> >>>>> stack twice there, which looked weird to me.
> >>>>>
> >>>>> Note if Rafael wants to stick with the approach from this v3, then
> >>>>> I guess that the approach in 4/5 is fine.
> >>>>>> Rafael - can you please confirm which direction you want to see here
> >>> for this?
> >>>
> >>> So the idea is that the PMC driver's "suspend" method needs to be
> >>> invoked from acpi_s2idle_prepare_late(), so it doesn't interfere with
> >>> the suspend of the other devices in the system and so it can take the
> >>> constraints into account.
> >>
> >> The reason to do nothing (besides a debug level message right now) with the constraints
> >> information is that at least on today's OEM platforms there are some instances constraints
> >> aren't met on Linux that need to be investigated and root caused.  These particular constraints
> >> don't actually cause problems reaching s0ix residency though.
> >
> > Why are they useful at all, then?
> >
> >>>
> >>> What is it going to do, in the future, depending on whether or not the
> >>> constraints are met?
> >>
> >> The idea was that if constraints were met that it would send the OS_HINT as part of
> >> amd_pmc_suspend/amd_pmc_resume, and if they aren't met then skip this step.
> >>
> >> It would effectively block the system from getting s0ix residency unless the constraints
> >> are all met.
> >
> > Why do that?
>
> I guess to both of your above questions this begs a comparison of how
> things work in Windows versus Linux.
>
> Windows Modern Standby has this concept of "SW DRIPS" vs "HW DRIPS".
>  From an end user perspective you close your lid or you click Start
> button and hit sleep and the machine is in sleep.  Whether it's in the
> deepest state is "invisible" to you unless you're running a sleep study.
>
> Windows will at this time requests devices to go into their deepest
> states and will continue to monitor them against the constraints table.
> When the constraints table is matched a uPEP driver is notified (this is
> the _DSM stuff we have in Linux too for "deepest state") and then it can
> do as it pleases.   ON AMD's platform this sends OS_HINT.  OS_HINT is
> meant to be indicate that the OS is done with all it's suspend actions
> (caches flushed, devices in D3 etc) and the system can "try" to enter s0ix.
>
> Windows will then also distinguish between different types of wakeups
> and have different behaviors for them.   There are wakeups that can be
> treated as keep screen off, and then possibly go back into deepest state.
>
> "As soon as the SoC wakes and the platform exits the DRIPS state, the
> CPUs start running code again. However, the screen stays powered off
> unless the interrupt was a result of user input or connecting to a power
> source"
> https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/transitioning-between-idle-and-active-states
>
> "During the Sleep state, specific value-adding SW activity may run"
> https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/modern-standby-states
>
> So from the time that I clicked sleep in the OS, I might NOT get into
> the deepest state (HW DRIPS), or I might have gotten in and out several
> times.  If a device driver failed to put a PCI device in D3 for example
> I would not be able to enter HW DRIPS, but the suspend wouldn't "fail".

Right.  I'm aware of how this is expected to work in the Windows
Modern Standby context.

> Now to contrast this to Linux when I enter suspend all drivers will run
> their various PM callbacks and devices will go into their deepest states.
> * If a driver fails, the suspend actually fails and you get an error
> code to go investigate what happened.
> * If the devices don't get into their deepest state by the time you get
> to the s2idle loop you don't get s0ix residency.
>
> As you can see at least for AMD's platforms OS_HINT is sent too "early".
>   That's why this series exists in the first place.

Right.

> So with all that said; why look at constraints at all if stuff is working?
>  From this design at least on Windows constraints are supposed to be a
> safety guard that you don't start the HW process for s0i3 process "too
> early".
>
> The last commit that is getting reverted in this series is an example of
> what could happen if the process is started prematurely.

Well, it is not "the process started prematurely", but an ordering
issue in a process that's already under way.

The real difference is that Modern Standby is designed as an extension
of the system working state and things can go from "working" to
"sleeping" in a kind of transparent fashion, whereas in Linux the
system is either suspended or working, or going from one of these
states to the other.

From the Modern Standby POV it only makes sense to attempt to enter HW
DRIPS if all of the potentially involved entities are ready, and hence
the constraints.  In Linux, they all are expected to get ready during
transitions from "working" to "suspended".

> >
> >> Given we know some OEM platforms have problems in current generations
> >> with constraints it would probably need to be restricted to this behavior only on a future
> >> SOC that we are confident of all drivers and firmware are doing the right thing.
> >>
> >> By passing the information to amd_pmc we can keep that logic restricting the behavior to
> >> only those platforms that we're confident on that behavior.
> >
> > Honestly, I'm not quite sure why it is a good idea to prevent the
> > platform from attempting to get into S0ix via suspend-to-idle in any
> > case.
> >
> > You know you have to suspend.  You don't know how much time you will
> > be suspended.  The constraints can only tell you what's the
> > lowest-power state you can achieve at this point, but why is it
> > relevant?
>
> Having thought through and said all I did above, I do concede you're
> right with the Linux approach to sleep the constraints really don't add
> a lot of value.  If a device fails to enter it's intended sleep states
> the suspend will "fail".  If the suspend succeeds but the constraints
> table doesn't match, it's just a hint where to focus on problems.

Exactly.

> I appreciate your thoughts and I will drop the constraints passing patch
> in this series.

Thanks!

> With that intent of dropping that would you still like this reworked as
> a notifier chain or keep it as this design?

I would introduce something like

struct acpi_s2idle_dev_ops {
        struct list_head list_node;
        void (*prepare)(struct device *dev);
        void (*restore)(struct device *dev);
};

and let whoever wants to use one of these pass the pointer to it to a
"register" function (that will only do a "list add").  The reason why
I wouldn't allow ->prepare() to fail is that failing suspend at this
point isn't particularly useful (and arguably it isn't really useful
at all, but that's a whole different topic).

This can be a const struct in a driver, so it cannot be modified even
by mistake which reduces the attack surface a bit.

Then, make changes to acpi_s2idle_prepare_late() and
acpi_s2idle_restore_early() like in the $subject patch, except that
the extra locking may not be needed if the "register" function uses
lock_system_sleep() for mutual exclusion.

  reply	other threads:[~2022-03-16 17:27 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-14  5:03 [PATCH v3 1/5] ACPI / x86: Add support for LPS0 callback handler Mario Limonciello
2022-03-14  5:03 ` [PATCH v3 2/5] ACPI / x86: Pass the constraints checking result to LPS0 callback Mario Limonciello
2022-03-14  5:03 ` [PATCH v3 3/5] ACPI / x86: Check LPI constraints by default Mario Limonciello
2022-03-14  5:03 ` [PATCH v3 4/5] platform/x86: amd-pmc: Move to later in the suspend process Mario Limonciello
2022-03-14  5:03 ` [PATCH v3 5/5] platform/x86: amd-pmc: Drop CPU QoS workaround Mario Limonciello
2022-03-14  9:05 ` [PATCH v3 1/5] ACPI / x86: Add support for LPS0 callback handler Hans de Goede
2022-03-14 13:32   ` Limonciello, Mario
2022-03-14 13:37     ` Hans de Goede
2022-03-16 15:02       ` Rafael J. Wysocki
2022-03-16 15:34         ` Rafael J. Wysocki
2022-03-16 15:43           ` Limonciello, Mario
2022-03-16 15:52             ` Rafael J. Wysocki
2022-03-16 16:43               ` Limonciello, Mario
2022-03-16 17:27                 ` Rafael J. Wysocki [this message]
2022-03-15  1:01     ` David E. Box
2022-03-14  9:12 ` Lukas Wunner
2022-03-14 13:28   ` Limonciello, Mario

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJZ5v0hdbz4ktUQv=2WA99Tqnq-13we4Qk5z7i_OQmA-ZaQkWQ@mail.gmail.com' \
    --to=rafael@kernel.org \
    --cc=Sanket.Goswami@amd.com \
    --cc=Shyam-sundar.S-k@amd.com \
    --cc=hdegoede@redhat.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=mario.limonciello@amd.com \
    --cc=mgross@linux.intel.com \
    --cc=platform-driver-x86@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.