From: "Rafael J. Wysocki" <firstname.lastname@example.org>
To: Bjorn Helgaas <email@example.com>
Cc: "Rafael J. Wysocki" <firstname.lastname@example.org>,
Konstantin Kharlamov <email@example.com>,
Lukas Wunner <firstname.lastname@example.org>,
Linux PCI <email@example.com>,
"Rafael J. Wysocki" <firstname.lastname@example.org>,
Andreas Noever <email@example.com>,
Linux PM <firstname.lastname@example.org>
Subject: Re: [PATCH] PCI: don't power-off apple thunderbolt controller on s2idle
Date: Thu, 20 May 2021 13:27:23 +0200 [thread overview]
Message-ID: <CAJZ5v0i+s7Ru5J5Um0tExMGCnK0JGTnc76BYyd58EiYkt6dghQ@mail.gmail.com> (raw)
On Wed, May 19, 2021 at 9:48 PM Bjorn Helgaas <email@example.com> wrote:
> On Wed, May 19, 2021 at 09:12:26PM +0200, Rafael J. Wysocki wrote:
> > On Wed, May 19, 2021 at 7:28 PM Bjorn Helgaas <firstname.lastname@example.org> wrote:
> > > On Fri, May 07, 2021 at 05:08:42PM +0300, Konstantin Kharlamov wrote:
> > > > On Fri, 2021-05-07 at 08:30 -0500, Bjorn Helgaas wrote:
> > > > > On Fri, May 07, 2021 at 12:07:38AM +0200, Lukas Wunner wrote:
> > > > > > On Thu, May 06, 2021 at 04:48:42PM -0500, Bjorn Helgaas wrote:
> > > > > > > On Thu, May 06, 2021 at 08:38:20PM +0300, Konstantin Kharlamov wrote:
> > > > > > > > On Macbook 2013 resuming from s2idle results in external monitor no
> > > > > > > > longer being detected, and dmesg having errors like:
> > > > > > > >
> > > > > > > > pcieport 0000:06:00.0: can't change power state from D3hot to D0
> > > > > > > > (config space inaccessible)
> > > > > > > >
> > > > > > > > and a stacktrace. The reason turned out that the hw that the quirk
> > > > > > > > powers off does not get powered on back on resume.
> > >
> > > > > For example, "With s2idle, the machine isn't suspended via ACPI, so
> > > > > the AML code which powers the controller off isn't executed." AFAICT
> > > > > that isn't actually a required, documented property of s2idle, but
> > > > > rather it reaches into the internal implementation.
> > >
> > > > > The code comment "If suspend mode is s2idle, power won't get restored
> > > > > on resume" is similar. !pm_suspend_via_firmware() tells us that
> > > > > platform firmware won't be invoked. But the connection between *that*
> > > > > and "power won't get restored" is unexplained.
> > > >
> > > > Sorry, I can't comment anything regarding AML and power management
> > > > in general since I am really new to all of this. However, regarding
> > > > the usage of the `pm_suspend_via_firmware()`: yeah, I also think it
> > > > is unclear what this does, and I was thinking about adding a wrapper
> > > > function something like `is_s2idle()` to the suspend.h, which would
> > > > simply call `pm_suspend_via_firmware` internally.
> > >
> > > No, that's not my point at all. I don't really care whether the
> > > interface is called pm_suspend_via_firmware() or is_s2idle().
> > >
> > > What I don't like about this is that it's all just unexplained magic,
> > > as was the original quirk.
> > >
> > > IIUC, the quirk is applied by pci_pm_suspend_noirq() *after* it puts
> > > the device in a low-power state. Here's my uninformed speculation
> > > about what happens:
> > >
> > > - On suspend, pci_pm_suspend_noirq() puts device in low-power state.
> > Right.
> > > My *guess* is this means D0 or D3hot for s2idle, and D3cold for
> > > everything else.
> > This really depends on the ACPI methods involved, but as a rule (and
> > I'm not aware of any exceptions) the target power state of the device
> > for s2idle and "everything else" is the same. If D3cold is possible,
> > it will be D3cold. If not, that will be D3hot etc.
> > The only difference (from the individual device perspective) is that
> > in the ACPI S3 (or S4 if this matters here) case the platform firmware
> > (and that's not AML but something like SMM) may cut power off from it
> > later.
> > > [Do we have sufficient debug to find out what these states are?]
> > I'm not sure what you mean. What states are supported or something else?
> I was curious about what states we actually put this device in for
> s2idle/standby/s2ram/std because I was hoping we could decide based
> on what state s2idle used. But if they all use the same target state,
> that idea won't work.
> > > - pci_pm_suspend_noirq() does pci_fixup_suspend_late fixups,
> > > including quirk_apple_poweroff_thunderbolt().
> > >
> > > - quirk_apple_poweroff_thunderbolt() runs the magic SXIO/SXFP/SXLF
> > > methods, which apparently turn off more power.
> > >
> > > - On resume, pci_pm_resume_noirq() brings the device back to D0.
> > >
> > > If we're resuming from standby, S2RAM, or STD, I speculate the
> > > device is in D3cold, so this involves running AML that seems to
> > > undo whatever SXIO/SXFP/SXLF did.
> > The standby, S2RAM and STD cases are actually different in that respect.
> > In the STD case we get the device from the restore kernel and it most
> > likely has been enumerated by it, so it is in whatever power state the
> > restore kernel puts it into.
> > For standby that most likely is the state in which the device was left
> > during suspend.
> > For S2RAM, it most likely will be something like D0-uninitialized,
> > because the platform firmware initiating the resume transition (which
> > is not AML, but again something like SMM) restores power to the
> > platform (including the majority of devices), but doesn't initialize
> > them as a rule.
> > The STD and S2RAM resume will undo the "magic", standby resume may not.
> > > If we're resuming from s2idle, I speculate the device is in D0 or
> > > D3hot, and we run different AML (or maybe no AML at all), and we
> > It is in whatever power state it was left in during suspend and we run
> > the same AML as in the other cases, modulo the possible power state
> > difference (with respect to the other cases).
> > > *don't* undo the effects of SXIO/SXFP/SXLF, so the device doesn't
> > > work.
> > This is correct.
> > > If the above is anything like what's happening, we should be able to
> > > skip SXIO/SXFP/SXLF based on the current power state of the device.
> > Not really.
> > > E.g., if the device is in D0 or D3hot, we should not use
> > > SXIO/SXFP/SXLF to yank power.
> > >
> > > That would seem more connected to the observable state of the device
> > > than using pm_suspend_via_firmware(), which relies on the connection
> > > between s2idle and PM_SUSPEND_FLAG_FW_SUSPEND (which is not at all
> > > obvious) and the power state of the device while in s2idle (also not
> > > obvious).
> > The problem is related to the fact that in s2idle the platform
> > firmware does not finalize the suspend transition and, consequently,
> > it doesn't initiate the resume transition. Therefore whatever power
> > state the device was left in during suspend must be dealt with during
> > the subsequent resume. Hence, if whatever is done by SXIO/SXFP/SXLF
> > in the suspend path cannot be reversed in the resume path by the
> > kernel (because there is no known method to do that), they should not
> > be invoked. And that's exactly because the platform firmware will not
> > finalize the suspend transition which is indicated by
> > PM_SUSPEND_FLAG_FW_SUSPEND being unset.
> How can we connect "if (!pm_suspend_via_firmware())" in this patch
> with the fact that firmware doesn't finalize suspend (and consequently
> does not reverse things in resume)?
> I don't see any use of pm_suspend_via_firmware() or
> PM_SUSPEND_FLAG_FW_SUSPEND that looks relevant.
First of all, there is a kerneldoc comment next to
pm_suspend_via_firmware() which is relevant. Especially the last
paragraph of that comment applies directly to the case at hand IMV.
Next, if you see how pm_suspend_global_flags get populated, you'll
notice that pm_suspend_clear_flags() is called early in both the
system-wide suspend and hibernation paths and then the ACPI platform
callbacks invoke pm_set_suspend_via_firmware() when the platform is
going to finalize the suspend transition. [In particular, notice the
acpi_state > ACPI_STATE_S1 condition in acpi_suspend_begin(), which is
related to the difference between ACPI S3/S4 and standby that I was
talking about before.] This means that, on a system with ACPI,
PM_SUSPEND_FLAG_FW_SUSPEND is set only if the target system state is
above S1 and in those cases the firmware finalizes the suspend
Of course, s2idle callbacks don't invoke pm_set_suspend_via_firmware()
at all, because the target ACPI system state for s2idle is S0 (the
working state from the ACPI perspective).
> That makes it really hard to figure out how this patch works and how to avoid breaking it
> in the future.
Well, if you don't trust kerneldoc comments, you get to figure out
things yourself which may not be straightforward ...
> Thanks for your patience in explaining all this!
next prev parent reply other threads:[~2021-05-20 11:29 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-06 17:38 [PATCH] PCI: don't power-off apple thunderbolt controller on s2idle Konstantin Kharlamov
2021-05-06 21:48 ` Bjorn Helgaas
2021-05-06 22:07 ` Lukas Wunner
2021-05-07 9:51 ` Rafael J. Wysocki
2021-05-08 8:48 ` Lukas Wunner
2021-05-07 13:30 ` Bjorn Helgaas
2021-05-07 14:08 ` Konstantin Kharlamov
2021-05-12 20:36 ` Konstantin Kharlamov
2021-05-17 19:51 ` PING " Konstantin Kharlamov
2021-05-19 17:28 ` Bjorn Helgaas
2021-05-19 19:12 ` Rafael J. Wysocki
2021-05-19 19:48 ` Bjorn Helgaas
2021-05-20 11:27 ` Rafael J. Wysocki [this message]
2021-05-20 11:54 ` Rafael J. Wysocki
2021-05-20 19:49 ` Bjorn Helgaas
2021-05-20 23:28 ` Konstantin Kharlamov
2021-05-24 6:59 ` Konstantin Kharlamov
2021-05-20 23:55 ` [PATCH v2] PCI: don't call firmware hooks on suspend unless it's fw-controlled Konstantin Kharlamov
2021-05-28 7:39 ` PING: " Konstantin Kharlamov
2021-06-03 8:36 ` PING: " Konstantin Kharlamov
2021-06-03 17:46 ` Bjorn Helgaas
2021-06-04 8:30 ` Konstantin Kharlamov
2021-05-21 9:47 ` [PATCH] PCI: don't power-off apple thunderbolt controller on s2idle Rafael J. Wysocki
2021-05-07 15:02 ` Rafael J. Wysocki
2021-05-08 8:20 ` Lukas Wunner
2021-05-07 9:32 ` Konstantin Kharlamov
2021-05-07 13:07 ` Bjorn Helgaas
2021-05-07 13:48 ` Konstantin Kharlamov
2021-05-20 11:58 ` Rafael J. Wysocki
2021-06-07 23:17 ` Bjorn Helgaas
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).