linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Maximilian Luz <luzmaximilian@gmail.com>
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: Maximilian Luz <luzmaximilian@gmail.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH v2] PCI: Run platform power transition on initial D0 entry
Date: Sun, 14 Mar 2021 01:04:39 +0100	[thread overview]
Message-ID: <20210314000439.3138941-1-luzmaximilian@gmail.com> (raw)

On some devices and platforms, the initial platform (e.g. ACPI) power
state is not in sync with the power state of the PCI device.

This seems like it is, for all intents and purposes, an issue with the
device firmware (e.g. ACPI). On some devices, specifically Microsoft
Surface Books 2 and 3, we encounter ACPI code akin to the following
power resource, corresponding to a PCI device:

    PowerResource (PRP5, 0x00, 0x0000)
    {
        // Initialized to zero, i.e. off. There is no logic for checking
        // the actual state dynamically.
        Name (_STA, Zero)

        Method (_ON, 0, Serialized)
        {
            // ... code omitted ...
            _STA = One
        }

        Method (_OFF, 0, Serialized)
        {
            // ... code omitted ...
            _STA = Zero
        }
    }

This resource is initialized to 'off' and does not have any logic for
checking its actual state, i.e. the state of the corresponding PCI
device. The stored state of this resource can only be changed by running
the (platform/ACPI) power transition functions (i.e. _ON and _OFF).

This means that, at boot, the PCI device power state is out of sync with
the power state of the corresponding ACPI resource.

During initial bring-up of a PCI device, pci_enable_device_flags()
updates its PCI core state (from initially 'unknown') by reading from
its PCI_PM_CTRL register. It does, however, not check if the platform
(here ACPI) state is in sync with/valid for the actual device state and
needs updating.

The later call to pci_set_power_state(..., PCI_D0), which would normally
perform such platform transitions, will evaluate to a no-op if the
stored state has been updated to D0 via this. Thus any such power
resource, required for D0, will never get "officially" turned on.

One could also make the case that this could cause PCI devices requiring
some secondary power resources to not work properly, as no attempt is
ever made at checking that they are in fact synchronized (i.e. turned
on/off as required e.g. by ACPI) for the updated state.

Ultimately, this breaks the first D3cold entry for the discrete GPU on
the aforementioned Surface Books. On transition of the PCI device to
D3cold, the power resource is not properly turned off as it is already
considered off:

    $ echo auto > /sys/bus/pci/devices/0000:02:00.0/power/control

    [  104.430304] ACPI: \_SB_.PCI0.RP05: ACPI: PM: GPE69 enabled for wakeup
    [  104.430310] pcieport 0000:00:1c.4: Wakeup enabled by ACPI
    [  104.444144] ACPI: \_SB_.PCI0.RP05: ACPI: PM: Power state change: D3cold -> D3cold
    [  104.444151] acpi device:3b: Device already in D3cold
    [  104.444154] pcieport 0000:00:1c.4: power state changed by ACPI to D3cold

However, the device still consumes a non-negligible amount of power and
gets warm. A full power-cycle fixes this and results in the device being
properly transitioned to D3cold:

    $ echo on > /sys/bus/pci/devices/0000:02:00.0/power/control

    [  134.258039] ACPI: \_SB_.PCI0.RP05: ACPI: PM: Power state change: D3cold -> D0
    [  134.369140] ACPI: PM: Power resource [PRP5] turned on
    [  134.369157] acpi device:3b: Power state changed to D0
    [  134.369165] pcieport 0000:00:1c.4: power state changed by ACPI to D0
    [  134.369338] pcieport 0000:00:1c.4: Wakeup disabled by ACPI

    $ echo auto > /sys/bus/pci/devices/0000:02:00.0/power/control

    [  310.338597] ACPI: \_SB_.PCI0.RP05: ACPI: PM: GPE69 enabled for wakeup
    [  310.338604] pcieport 0000:00:1c.4: Wakeup enabled by ACPI
    [  310.353841] ACPI: \_SB_.PCI0.RP05: ACPI: PM: Power state change: D0 -> D3cold
    [  310.403879] ACPI: PM: Power resource [PRP5] turned off
    [  310.403894] acpi device:3b: Power state changed to D3cold
    [  310.403901] pcieport 0000:00:1c.4: power state changed by ACPI to D3cold

By replacing pci_set_power_state(..., PCI_DO) with pci_power_up() in
do_pci_enable_device(), we can ensure that the state of power resources
is always checked. This essentially drops the initial (first) check on
the current state of the PCI device and calls platform specific code,
i.e. pci_platform_power_transition() to perform the appropriate platform
transition.

This can be verified by

    $ echo auto > /sys/bus/pci/devices/0000:02:00.0/power/control

    [  152.790448] ACPI: \_SB_.PCI0.RP05: ACPI: PM: GPE69 enabled for wakeup
    [  152.790454] pcieport 0000:00:1c.4: Wakeup enabled by ACPI
    [  152.804252] ACPI: \_SB_.PCI0.RP05: ACPI: PM: Power state change: D0 -> D3cold
    [  152.857548] ACPI: PM: Power resource [PRP5] turned off
    [  152.857563] acpi device:3b: Power state changed to D3cold
    [  152.857570] pcieport 0000:00:1c.4: power state changed by ACPI to D3cold

which yields the expected behavior.

Note that

 a) pci_set_power_state() would call pci_power_up() if the check on the
 current state failed. This means that in case of the updated state not
 being D0, there is no functional change.

 b) The core and platform transitions, i.e. pci_raw_set_power_state()
 and pci_platform_power_transition() (should) both have checks on the
 current state. So in case either state is up to date, the corresponding
 transition should still evaluate to a no-op. The only notable
 difference made by the suggested replacement is pci_resume_bus() being
 called for devices where runtime D3cold is enabled.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
---

Changes in v2:
 - No code changes
 - Improve commit message, add details
 - Add Rafael and linux-pm to CCs

---
 drivers/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 16a17215f633..cc42315210e4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1800,7 +1800,7 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
 	u16 cmd;
 	u8 pin;
 
-	err = pci_set_power_state(dev, PCI_D0);
+	err = pci_power_up(dev);
 	if (err < 0 && err != -EIO)
 		return err;
 
-- 
2.30.2


             reply	other threads:[~2021-03-14  0:05 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-14  0:04 Maximilian Luz [this message]
2021-03-15 15:34 ` [PATCH v2] PCI: Run platform power transition on initial D0 entry Rafael J. Wysocki
2021-03-15 18:28   ` Maximilian Luz
2021-03-16 13:36     ` Rafael J. Wysocki
2021-03-16 14:10       ` Maximilian Luz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210314000439.3138941-1-luzmaximilian@gmail.com \
    --to=luzmaximilian@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).