All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] PCI PM refinements
@ 2016-09-18  3:39 Lukas Wunner
  2016-09-18  3:39 ` [PATCH v2 1/5] PCI: Afford direct-complete to devices with nonstandard PM Lukas Wunner
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Lukas Wunner @ 2016-09-18  3:39 UTC (permalink / raw)
  To: linux-pci
  Cc: linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu, Andreas Noever

Here's the second installment of this series, based on feedback from
Rafael. (Thanks!)

To reiterate, these refinements are meant to speed up and reduce power
consumption when entering and leaving system sleep and when shutting
down the system.  For details, please refer to the cover letter of v1:
http://www.spinics.net/lists/linux-pci/msg53702.html

As usual I've pushed the patches to GitHub in case anyone prefers
reviewing in a GUI:
https://github.com/l1k/linux/commits/pci_pm_refinements_v2


Changes since v1:

* Patch [1/5] now sports an ack by Rafael.

* Patch [2/5]:
  * In acpi_pci_get_power_state(), only check for ACPI_STATE_UNKNOWN
    instead of ACPI_STATE_D0 and ACPI_STATE_D3_COLD.
  * Move the change to pci_update_current_state() to new patch [3/5].

* Patch [3/5]:
  * Instead of solely relying on the platform firmware to report D3cold,
    also probe the vendor ID and assume D3cold if it can't be read.
    This should ensure proper detection of D3cold on pre-ACPI 4.0
    platforms (which will never report anything deeper than D3hot)
    as well as for devices with nonstandard PM mechanisms.
  * The two existing workarounds for D3cold are removed from
    pci_update_current_state(), as explained in the commit message.

* Patch [5/5]:
  * Disable runtime PM on the device to prevent it from being runtime
    resumed during the remainder of the shutdown process.

Thanks,

Lukas


Lukas Wunner (5):
  PCI: Afford direct-complete to devices with nonstandard PM
  PCI: Query platform firmware for device power state
  PCI: Recognize D3cold in pci_update_current_state()
  PCI: Avoid unnecessary resume after direct-complete
  PCI: Avoid unnecessary resume on shutdown

 drivers/pci/pci-acpi.c   | 22 ++++++++++++++++++++
 drivers/pci/pci-driver.c | 24 ++++++++++++++++++++--
 drivers/pci/pci.c        | 52 ++++++++++++++++++++++++++++++++----------------
 drivers/pci/pci.h        |  3 +++
 4 files changed, 82 insertions(+), 19 deletions(-)

-- 
2.9.3


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/5] PCI: Afford direct-complete to devices with nonstandard PM
  2016-09-18  3:39 [PATCH v2 0/5] PCI PM refinements Lukas Wunner
@ 2016-09-18  3:39 ` Lukas Wunner
  2016-09-18  3:39 ` [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown Lukas Wunner
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Lukas Wunner @ 2016-09-18  3:39 UTC (permalink / raw)
  To: linux-pci
  Cc: linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu, Andreas Noever

There are devices not power-manageable by the platform, but still able
to runtime suspend to D3cold with a nonstandard mechanism.  One example
is laptop hybrid graphics where the discrete GPU and its built-in HDA
controller are power-managed either with a _DSM (AMD PowerXpress, Nvidia
Optimus) or a separate gmux controller (MacBook Pro).  Another example
is Thunderbolt on Macs which is power-managed with custom ACPI methods.

When putting the system to sleep, we currently handle such devices
improperly by transitioning them from D3cold to D3hot (the default power
state defined at the top of pci_target_state()).  This wastes energy and
prolongs the suspend sequence (powering up the Thunderbolt controller
takes 2 seconds).

Avoid that by assuming that a nonstandard PM mechanism is at work if the
device is not platform-power-manageable but currently in D3cold.

If the device is wakeup enabled, we might still have to wake it up from
D3cold if PME cannot be signaled from that power state.

The check for devices without PM capability comes before the check for
D3cold since such devices could in theory also be powered down by
nonstandard means and should then be afforded direct-complete as well.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/pci.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index aab9d51..72a9d3a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1959,9 +1959,22 @@ static pci_power_t pci_target_state(struct pci_dev *dev)
 		default:
 			target_state = state;
 		}
-	} else if (!dev->pm_cap) {
+
+		return target_state;
+	}
+
+	if (!dev->pm_cap)
 		target_state = PCI_D0;
-	} else if (device_may_wakeup(&dev->dev)) {
+
+	/*
+	 * If the device is in D3cold even though it's not power-manageable by
+	 * the platform, it may have been powered down by nonstandard means.
+	 * Best to let it slumber.
+	 */
+	if (dev->current_state == PCI_D3cold)
+		target_state = PCI_D3cold;
+
+	if (device_may_wakeup(&dev->dev)) {
 		/*
 		 * Find the deepest state from which the device can generate
 		 * wake-up events, make it the target state and enable device
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/5] PCI: Query platform firmware for device power state
  2016-09-18  3:39 [PATCH v2 0/5] PCI PM refinements Lukas Wunner
                   ` (3 preceding siblings ...)
  2016-09-18  3:39 ` [PATCH v2 3/5] PCI: Recognize D3cold in pci_update_current_state() Lukas Wunner
@ 2016-09-18  3:39 ` Lukas Wunner
  2016-09-24  0:46   ` Rafael J. Wysocki
  2016-09-28 16:54 ` [PATCH v2 0/5] PCI PM refinements Bjorn Helgaas
  5 siblings, 1 reply; 15+ messages in thread
From: Lukas Wunner @ 2016-09-18  3:39 UTC (permalink / raw)
  To: linux-pci
  Cc: linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu, Andreas Noever

Usually the most accurate way to determine a PCI device's power state is
to read its PM Control & Status Register.  There are two cases however
when this is not an option:  If the device doesn't have the PM
capability at all, or if it is in D3cold (in which case its config space
is inaccessible).

In both cases, we can alternatively query the platform firmware for its
opinion on the device's power state.  To facilitate this, augment struct
pci_platform_pm_ops with a ->get_power callback and implement it for
acpi_pci_platform_pm (the only pci_platform_pm_ops existing so far).

It is used by a forthcoming commit to let pci_update_current_state()
recognize D3cold.

Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---

Changes since v1:
* In acpi_pci_get_power_state(), only check for ACPI_STATE_UNKNOWN
  instead of ACPI_STATE_D0 and ACPI_STATE_D3_COLD.
* Move the change to pci_update_current_state() to new patch [3/5].

 drivers/pci/pci-acpi.c | 22 ++++++++++++++++++++++
 drivers/pci/pci.c      | 10 ++++++++--
 drivers/pci/pci.h      |  3 +++
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 9a033e8..d966d47 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -452,6 +452,27 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 	return error;
 }
 
+static pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
+{
+	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
+	static const pci_power_t state_conv[] = {
+		[ACPI_STATE_D0]      = PCI_D0,
+		[ACPI_STATE_D1]      = PCI_D1,
+		[ACPI_STATE_D2]      = PCI_D2,
+		[ACPI_STATE_D3_HOT]  = PCI_D3hot,
+		[ACPI_STATE_D3_COLD] = PCI_D3cold,
+	};
+	int state;
+
+	if (!adev || !acpi_device_power_manageable(adev))
+		return PCI_UNKNOWN;
+
+	if (acpi_device_get_power(adev, &state) || state == ACPI_STATE_UNKNOWN)
+		return PCI_UNKNOWN;
+
+	return state_conv[state];
+}
+
 static bool acpi_pci_can_wakeup(struct pci_dev *dev)
 {
 	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
@@ -534,6 +555,7 @@ static bool acpi_pci_need_resume(struct pci_dev *dev)
 static const struct pci_platform_pm_ops acpi_pci_platform_pm = {
 	.is_manageable = acpi_pci_power_manageable,
 	.set_state = acpi_pci_set_power_state,
+	.get_state = acpi_pci_get_power_state,
 	.choose_state = acpi_pci_choose_state,
 	.sleep_wake = acpi_pci_sleep_wake,
 	.run_wake = acpi_pci_run_wake,
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 72a9d3a..6ea0d2d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -552,8 +552,9 @@ static const struct pci_platform_pm_ops *pci_platform_pm;
 
 int pci_set_platform_pm(const struct pci_platform_pm_ops *ops)
 {
-	if (!ops->is_manageable || !ops->set_state || !ops->choose_state ||
-	    !ops->sleep_wake || !ops->run_wake || !ops->need_resume)
+	if (!ops->is_manageable || !ops->set_state  || !ops->get_state ||
+	    !ops->choose_state  || !ops->sleep_wake || !ops->run_wake  ||
+	    !ops->need_resume)
 		return -EINVAL;
 	pci_platform_pm = ops;
 	return 0;
@@ -570,6 +571,11 @@ static inline int platform_pci_set_power_state(struct pci_dev *dev,
 	return pci_platform_pm ? pci_platform_pm->set_state(dev, t) : -ENOSYS;
 }
 
+static inline pci_power_t platform_pci_get_power_state(struct pci_dev *dev)
+{
+	return pci_platform_pm ? pci_platform_pm->get_state(dev) : PCI_UNKNOWN;
+}
+
 static inline pci_power_t platform_pci_choose_state(struct pci_dev *dev)
 {
 	return pci_platform_pm ?
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 194521b..4518562 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -42,6 +42,8 @@ int pci_probe_reset_function(struct pci_dev *dev);
  *
  * @set_state: invokes the platform firmware to set the device's power state
  *
+ * @get_state: queries the platform firmware for a device's current power state
+ *
  * @choose_state: returns PCI power state of given device preferred by the
  *                platform; to be used during system-wide transitions from a
  *                sleeping state to the working state and vice versa
@@ -62,6 +64,7 @@ int pci_probe_reset_function(struct pci_dev *dev);
 struct pci_platform_pm_ops {
 	bool (*is_manageable)(struct pci_dev *dev);
 	int (*set_state)(struct pci_dev *dev, pci_power_t state);
+	pci_power_t (*get_state)(struct pci_dev *dev);
 	pci_power_t (*choose_state)(struct pci_dev *dev);
 	int (*sleep_wake)(struct pci_dev *dev, bool enable);
 	int (*run_wake)(struct pci_dev *dev, bool enable);
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/5] PCI: Recognize D3cold in pci_update_current_state()
  2016-09-18  3:39 [PATCH v2 0/5] PCI PM refinements Lukas Wunner
                   ` (2 preceding siblings ...)
  2016-09-18  3:39 ` [PATCH v2 4/5] PCI: Avoid unnecessary resume after direct-complete Lukas Wunner
@ 2016-09-18  3:39 ` Lukas Wunner
  2016-09-24  0:46   ` Rafael J. Wysocki
  2016-09-18  3:39 ` [PATCH v2 2/5] PCI: Query platform firmware for device power state Lukas Wunner
  2016-09-28 16:54 ` [PATCH v2 0/5] PCI PM refinements Bjorn Helgaas
  5 siblings, 1 reply; 15+ messages in thread
From: Lukas Wunner @ 2016-09-18  3:39 UTC (permalink / raw)
  To: linux-pci
  Cc: linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu, Andreas Noever

Whenever a device is resumed or its power state is changed using the
platform, its new power state is read from the PM Control & Status
Register and cached in pci_dev->current_state by calling
pci_update_current_state().

If the device is in D3cold, reading from config space typically results
in a fabricated "all ones" response.  But if it's in D3hot, the two bits
representing the power state in the PMCSR are *also* set to 1.  Thus
D3hot and D3cold are not discernible by just reading the PMCSR.

To account for this, pci_update_current_state() uses two workarounds:

- When transitioning to D3cold using pci_platform_power_transition(),
  the new power state is set blindly by pci_update_current_state(),
  i.e. without verifying that the device actually *is* in D3cold.
  This is achieved by setting the "state" argument to PCI_D3cold.
  The "state" argument was originally intended to convey the new state
  in case the device doesn't have the PM capability.  It is *also* used
  to convey the device state if the PM capability is present and the
  new state is D3cold, but this was never explained in the kerneldoc.

- Once the current_state is set to D3cold, further invocations of
  pci_update_current_state() will blindly assume that the device is
  still in D3cold and leave the current_state unmodified.  To get out of
  this impasse, the current_state has to be set directly, typically by
  calling pci_raw_set_power_state() or pci_enable_device().

It would be desirable if pci_update_current_state() could reliably
detect D3cold by itself.  That would allow us to do away with these
workarounds, and it would allow for a smarter, more energy conserving
runtime resume strategy after system sleep:  Currently devices which
utilize direct_complete are mandatorily runtime resumed in their
->complete stage.  This can be avoided if their power state after system
sleep is the same as before, but it requires a mechanism to detect the
power state reliably.

We've just gained the ability to query the platform firmware for its
opinion on the device's power state.  On platforms conforming to
ACPI 4.0 or newer, this allows recognition of D3cold.  Pre-4.0 platforms
lack _PR3 and therefore the deepest power state that will ever be
reported is D3hot, even though the device may actually be in D3cold.
To detect D3cold in those cases, accessibility of the vendor ID in
config space is probed using pci_device_is_present().  This also works
for devices which are not platform-power-manageable at all, but can be
suspended to D3cold using a nonstandard mechanism (e.g. some hybrid
graphics laptops or Thunderbolt on the Mac).

Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---

Changes since v1:
* Instead of solely relying on the platform firmware to report D3cold,
  also probe the vendor ID and assume D3cold if it can't be read.
  This should ensure proper detection of D3cold on pre-ACPI 4.0
  platforms (which will never report anything deeper than D3hot)
  as well as for devices with nonstandard PM mechanisms.
* The two existing workarounds for D3cold are removed from
  pci_update_current_state(), as explained in the commit message.

 drivers/pci/pci.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 6ea0d2d..7d3188b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -707,26 +707,25 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
 }
 
 /**
- * pci_update_current_state - Read PCI power state of given device from its
- *                            PCI PM registers and cache it
+ * pci_update_current_state - Read power state of given device and cache it
  * @dev: PCI device to handle.
  * @state: State to cache in case the device doesn't have the PM capability
+ *
+ * The power state is read from the PMCSR register, which however is
+ * inaccessible in D3cold.  The platform firmware is therefore queried first
+ * to detect accessibility of the register.  In case the platform firmware
+ * reports an incorrect state or the device isn't power manageable by the
+ * platform at all, we try to detect D3cold by testing accessibility of the
+ * vendor ID in config space.
  */
 void pci_update_current_state(struct pci_dev *dev, pci_power_t state)
 {
-	if (dev->pm_cap) {
+	if (platform_pci_get_power_state(dev) == PCI_D3cold ||
+	    !pci_device_is_present(dev)) {
+		dev->current_state = PCI_D3cold;
+	} else if (dev->pm_cap) {
 		u16 pmcsr;
 
-		/*
-		 * Configuration space is not accessible for device in
-		 * D3cold, so just keep or set D3cold for safety
-		 */
-		if (dev->current_state == PCI_D3cold)
-			return;
-		if (state == PCI_D3cold) {
-			dev->current_state = PCI_D3cold;
-			return;
-		}
 		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
 		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
 	} else {
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 4/5] PCI: Avoid unnecessary resume after direct-complete
  2016-09-18  3:39 [PATCH v2 0/5] PCI PM refinements Lukas Wunner
  2016-09-18  3:39 ` [PATCH v2 1/5] PCI: Afford direct-complete to devices with nonstandard PM Lukas Wunner
  2016-09-18  3:39 ` [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown Lukas Wunner
@ 2016-09-18  3:39 ` Lukas Wunner
  2016-09-24  0:47   ` Rafael J. Wysocki
  2016-09-18  3:39 ` [PATCH v2 3/5] PCI: Recognize D3cold in pci_update_current_state() Lukas Wunner
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Lukas Wunner @ 2016-09-18  3:39 UTC (permalink / raw)
  To: linux-pci
  Cc: linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu, Andreas Noever

Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been
reset by firmware") added a runtime resume for devices that were runtime
suspended when the system entered sleep.

The motivation was that devices might be in a reset-power-on state after
waking from system sleep, so their power state as perceived by Linux
(stored in pci_dev->current_state) would no longer reflect reality.
By resuming such devices, we allow them to return to a low-power state
via autosuspend and also bring their current_state in sync with reality.

However for devices that are *not* in a reset-power-on state, doing an
unconditional resume wastes energy.  A more refined approach is called
for which issues a runtime resume only if the power state after
direct-complete is shallower than it was before. To achieve this, update
the device's current_state and compare it to its pre-sleep value.

Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/pci-driver.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index e39a67c..fd4b9c4 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -684,8 +684,19 @@ static int pci_pm_prepare(struct device *dev)
 
 static void pci_pm_complete(struct device *dev)
 {
-	pci_dev_complete_resume(to_pci_dev(dev));
-	pm_complete_with_resume_check(dev);
+	struct pci_dev *pci_dev = to_pci_dev(dev);
+
+	pci_dev_complete_resume(pci_dev);
+	pm_generic_complete(dev);
+
+	/* Resume device if platform firmware has put it in reset-power-on */
+	if (dev->power.direct_complete && pm_resume_via_firmware()) {
+		pci_power_t pre_sleep_state = pci_dev->current_state;
+
+		pci_update_current_state(pci_dev, pci_dev->current_state);
+		if (pci_dev->current_state < pre_sleep_state)
+			pm_request_resume(dev);
+	}
 }
 
 #else /* !CONFIG_PM_SLEEP */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown
  2016-09-18  3:39 [PATCH v2 0/5] PCI PM refinements Lukas Wunner
  2016-09-18  3:39 ` [PATCH v2 1/5] PCI: Afford direct-complete to devices with nonstandard PM Lukas Wunner
@ 2016-09-18  3:39 ` Lukas Wunner
  2016-09-19  9:12   ` Oliver Neukum
  2016-09-18  3:39 ` [PATCH v2 4/5] PCI: Avoid unnecessary resume after direct-complete Lukas Wunner
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Lukas Wunner @ 2016-09-18  3:39 UTC (permalink / raw)
  To: linux-pci
  Cc: linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu, Andreas Noever

We currently perform a mandatory runtime resume of all PCI devices on
->shutdown.  However it is pointless to wake devices only to immediately
power them down afterwards.  (Or have the firmware reset them, in case
of a reboot.)

It seems there are only two cases when a runtime resume is actually
necessary:  If the driver has declared a ->shutdown callback or if kexec
is in progress.

Constrain resume of a device to these cases and let it slumber
otherwise, thereby conserving energy and speeding up shutdown.

To prevent the device from being runtime resumed during the remainder of
the shutdown process, disable runtime PM for it.

Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---

Changes since v1:
* Disable runtime PM on the device to prevent it from being runtime
  resumed during the remainder of the shutdown process.

 drivers/pci/pci-driver.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index fd4b9c4..f21c620 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -459,6 +459,15 @@ static void pci_device_shutdown(struct device *dev)
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct pci_driver *drv = pci_dev->driver;
 
+	/* Fast path for suspended devices */
+	if (pm_runtime_status_suspended(dev) && (!drv || !drv->shutdown) &&
+	    !kexec_in_progress) {
+		pm_runtime_disable(dev);
+		if (pm_runtime_status_suspended(dev))
+			return;
+		pm_runtime_enable(dev);
+	}
+
 	pm_runtime_resume(dev);
 
 	if (drv && drv->shutdown)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown
  2016-09-18  3:39 ` [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown Lukas Wunner
@ 2016-09-19  9:12   ` Oliver Neukum
  2016-09-19 10:14     ` Lukas Wunner
  0 siblings, 1 reply; 15+ messages in thread
From: Oliver Neukum @ 2016-09-19  9:12 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu,
	Andreas Noever

On Sun, 2016-09-18 at 05:39 +0200, Lukas Wunner wrote:
> We currently perform a mandatory runtime resume of all PCI devices on
> ->shutdown.  However it is pointless to wake devices only to immediately
> power them down afterwards.  (Or have the firmware reset them, in case
> of a reboot.)
> 
> It seems there are only two cases when a runtime resume is actually
> necessary:  If the driver has declared a ->shutdown callback or if kexec
> is in progress.
> 
> Constrain resume of a device to these cases and let it slumber
> otherwise, thereby conserving energy and speeding up shutdown.

What happens if you get a wakeup event while going down?
Will it lead to a reboot when a shutdown has been requested?

	Regards
		Oliver



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown
  2016-09-19  9:12   ` Oliver Neukum
@ 2016-09-19 10:14     ` Lukas Wunner
  2016-09-24  0:50       ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Lukas Wunner @ 2016-09-19 10:14 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: linux-pci, linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu,
	Andreas Noever

On Mon, Sep 19, 2016 at 11:12:23AM +0200, Oliver Neukum wrote:
> On Sun, 2016-09-18 at 05:39 +0200, Lukas Wunner wrote:
> > We currently perform a mandatory runtime resume of all PCI devices on
> > ->shutdown.  However it is pointless to wake devices only to immediately
> > power them down afterwards.  (Or have the firmware reset them, in case
> > of a reboot.)
> > 
> > It seems there are only two cases when a runtime resume is actually
> > necessary:  If the driver has declared a ->shutdown callback or if kexec
> > is in progress.
> > 
> > Constrain resume of a device to these cases and let it slumber
> > otherwise, thereby conserving energy and speeding up shutdown.
> 
> What happens if you get a wakeup event while going down?

Hm, I should probably disable that with

	__pci_enable_wake(pci_dev, pci_dev->current_state, true, false);


> Will it lead to a reboot when a shutdown has been requested?

That would depend on the platform, my guess is no, but only someone
with extensive real world experience with such corner cases can answer
this. (Rafael?)

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/5] PCI: Query platform firmware for device power state
  2016-09-18  3:39 ` [PATCH v2 2/5] PCI: Query platform firmware for device power state Lukas Wunner
@ 2016-09-24  0:46   ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2016-09-24  0:46 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: linux-pci, linux-pm, linux-acpi, Peter Wu, Andreas Noever

On Sunday, September 18, 2016 05:39:20 AM Lukas Wunner wrote:
> Usually the most accurate way to determine a PCI device's power state is
> to read its PM Control & Status Register.  There are two cases however
> when this is not an option:  If the device doesn't have the PM
> capability at all, or if it is in D3cold (in which case its config space
> is inaccessible).
> 
> In both cases, we can alternatively query the platform firmware for its
> opinion on the device's power state.  To facilitate this, augment struct
> pci_platform_pm_ops with a ->get_power callback and implement it for
> acpi_pci_platform_pm (the only pci_platform_pm_ops existing so far).
> 
> It is used by a forthcoming commit to let pci_update_current_state()
> recognize D3cold.
> 
> Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
> 
> Changes since v1:
> * In acpi_pci_get_power_state(), only check for ACPI_STATE_UNKNOWN
>   instead of ACPI_STATE_D0 and ACPI_STATE_D3_COLD.
> * Move the change to pci_update_current_state() to new patch [3/5].
> 
>  drivers/pci/pci-acpi.c | 22 ++++++++++++++++++++++
>  drivers/pci/pci.c      | 10 ++++++++--
>  drivers/pci/pci.h      |  3 +++
>  3 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 9a033e8..d966d47 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -452,6 +452,27 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>  	return error;
>  }
>  
> +static pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
> +{
> +	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
> +	static const pci_power_t state_conv[] = {
> +		[ACPI_STATE_D0]      = PCI_D0,
> +		[ACPI_STATE_D1]      = PCI_D1,
> +		[ACPI_STATE_D2]      = PCI_D2,
> +		[ACPI_STATE_D3_HOT]  = PCI_D3hot,
> +		[ACPI_STATE_D3_COLD] = PCI_D3cold,
> +	};
> +	int state;
> +
> +	if (!adev || !acpi_device_power_manageable(adev))
> +		return PCI_UNKNOWN;
> +
> +	if (acpi_device_get_power(adev, &state) || state == ACPI_STATE_UNKNOWN)
> +		return PCI_UNKNOWN;
> +
> +	return state_conv[state];
> +}
> +
>  static bool acpi_pci_can_wakeup(struct pci_dev *dev)
>  {
>  	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
> @@ -534,6 +555,7 @@ static bool acpi_pci_need_resume(struct pci_dev *dev)
>  static const struct pci_platform_pm_ops acpi_pci_platform_pm = {
>  	.is_manageable = acpi_pci_power_manageable,
>  	.set_state = acpi_pci_set_power_state,
> +	.get_state = acpi_pci_get_power_state,
>  	.choose_state = acpi_pci_choose_state,
>  	.sleep_wake = acpi_pci_sleep_wake,
>  	.run_wake = acpi_pci_run_wake,
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 72a9d3a..6ea0d2d 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -552,8 +552,9 @@ static const struct pci_platform_pm_ops *pci_platform_pm;
>  
>  int pci_set_platform_pm(const struct pci_platform_pm_ops *ops)
>  {
> -	if (!ops->is_manageable || !ops->set_state || !ops->choose_state ||
> -	    !ops->sleep_wake || !ops->run_wake || !ops->need_resume)
> +	if (!ops->is_manageable || !ops->set_state  || !ops->get_state ||
> +	    !ops->choose_state  || !ops->sleep_wake || !ops->run_wake  ||
> +	    !ops->need_resume)
>  		return -EINVAL;
>  	pci_platform_pm = ops;
>  	return 0;
> @@ -570,6 +571,11 @@ static inline int platform_pci_set_power_state(struct pci_dev *dev,
>  	return pci_platform_pm ? pci_platform_pm->set_state(dev, t) : -ENOSYS;
>  }
>  
> +static inline pci_power_t platform_pci_get_power_state(struct pci_dev *dev)
> +{
> +	return pci_platform_pm ? pci_platform_pm->get_state(dev) : PCI_UNKNOWN;
> +}
> +
>  static inline pci_power_t platform_pci_choose_state(struct pci_dev *dev)
>  {
>  	return pci_platform_pm ?
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 194521b..4518562 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -42,6 +42,8 @@ int pci_probe_reset_function(struct pci_dev *dev);
>   *
>   * @set_state: invokes the platform firmware to set the device's power state
>   *
> + * @get_state: queries the platform firmware for a device's current power state
> + *
>   * @choose_state: returns PCI power state of given device preferred by the
>   *                platform; to be used during system-wide transitions from a
>   *                sleeping state to the working state and vice versa
> @@ -62,6 +64,7 @@ int pci_probe_reset_function(struct pci_dev *dev);
>  struct pci_platform_pm_ops {
>  	bool (*is_manageable)(struct pci_dev *dev);
>  	int (*set_state)(struct pci_dev *dev, pci_power_t state);
> +	pci_power_t (*get_state)(struct pci_dev *dev);
>  	pci_power_t (*choose_state)(struct pci_dev *dev);
>  	int (*sleep_wake)(struct pci_dev *dev, bool enable);
>  	int (*run_wake)(struct pci_dev *dev, bool enable);
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/5] PCI: Recognize D3cold in pci_update_current_state()
  2016-09-18  3:39 ` [PATCH v2 3/5] PCI: Recognize D3cold in pci_update_current_state() Lukas Wunner
@ 2016-09-24  0:46   ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2016-09-24  0:46 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: linux-pci, linux-pm, linux-acpi, Peter Wu, Andreas Noever

On Sunday, September 18, 2016 05:39:20 AM Lukas Wunner wrote:
> Whenever a device is resumed or its power state is changed using the
> platform, its new power state is read from the PM Control & Status
> Register and cached in pci_dev->current_state by calling
> pci_update_current_state().
> 
> If the device is in D3cold, reading from config space typically results
> in a fabricated "all ones" response.  But if it's in D3hot, the two bits
> representing the power state in the PMCSR are *also* set to 1.  Thus
> D3hot and D3cold are not discernible by just reading the PMCSR.
> 
> To account for this, pci_update_current_state() uses two workarounds:
> 
> - When transitioning to D3cold using pci_platform_power_transition(),
>   the new power state is set blindly by pci_update_current_state(),
>   i.e. without verifying that the device actually *is* in D3cold.
>   This is achieved by setting the "state" argument to PCI_D3cold.
>   The "state" argument was originally intended to convey the new state
>   in case the device doesn't have the PM capability.  It is *also* used
>   to convey the device state if the PM capability is present and the
>   new state is D3cold, but this was never explained in the kerneldoc.
> 
> - Once the current_state is set to D3cold, further invocations of
>   pci_update_current_state() will blindly assume that the device is
>   still in D3cold and leave the current_state unmodified.  To get out of
>   this impasse, the current_state has to be set directly, typically by
>   calling pci_raw_set_power_state() or pci_enable_device().
> 
> It would be desirable if pci_update_current_state() could reliably
> detect D3cold by itself.  That would allow us to do away with these
> workarounds, and it would allow for a smarter, more energy conserving
> runtime resume strategy after system sleep:  Currently devices which
> utilize direct_complete are mandatorily runtime resumed in their
> ->complete stage.  This can be avoided if their power state after system
> sleep is the same as before, but it requires a mechanism to detect the
> power state reliably.
> 
> We've just gained the ability to query the platform firmware for its
> opinion on the device's power state.  On platforms conforming to
> ACPI 4.0 or newer, this allows recognition of D3cold.  Pre-4.0 platforms
> lack _PR3 and therefore the deepest power state that will ever be
> reported is D3hot, even though the device may actually be in D3cold.
> To detect D3cold in those cases, accessibility of the vendor ID in
> config space is probed using pci_device_is_present().  This also works
> for devices which are not platform-power-manageable at all, but can be
> suspended to D3cold using a nonstandard mechanism (e.g. some hybrid
> graphics laptops or Thunderbolt on the Mac).
> 
> Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
> 
> Changes since v1:
> * Instead of solely relying on the platform firmware to report D3cold,
>   also probe the vendor ID and assume D3cold if it can't be read.
>   This should ensure proper detection of D3cold on pre-ACPI 4.0
>   platforms (which will never report anything deeper than D3hot)
>   as well as for devices with nonstandard PM mechanisms.
> * The two existing workarounds for D3cold are removed from
>   pci_update_current_state(), as explained in the commit message.
> 
>  drivers/pci/pci.c | 25 ++++++++++++-------------
>  1 file changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 6ea0d2d..7d3188b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -707,26 +707,25 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
>  }
>  
>  /**
> - * pci_update_current_state - Read PCI power state of given device from its
> - *                            PCI PM registers and cache it
> + * pci_update_current_state - Read power state of given device and cache it
>   * @dev: PCI device to handle.
>   * @state: State to cache in case the device doesn't have the PM capability
> + *
> + * The power state is read from the PMCSR register, which however is
> + * inaccessible in D3cold.  The platform firmware is therefore queried first
> + * to detect accessibility of the register.  In case the platform firmware
> + * reports an incorrect state or the device isn't power manageable by the
> + * platform at all, we try to detect D3cold by testing accessibility of the
> + * vendor ID in config space.
>   */
>  void pci_update_current_state(struct pci_dev *dev, pci_power_t state)
>  {
> -	if (dev->pm_cap) {
> +	if (platform_pci_get_power_state(dev) == PCI_D3cold ||
> +	    !pci_device_is_present(dev)) {
> +		dev->current_state = PCI_D3cold;
> +	} else if (dev->pm_cap) {
>  		u16 pmcsr;
>  
> -		/*
> -		 * Configuration space is not accessible for device in
> -		 * D3cold, so just keep or set D3cold for safety
> -		 */
> -		if (dev->current_state == PCI_D3cold)
> -			return;
> -		if (state == PCI_D3cold) {
> -			dev->current_state = PCI_D3cold;
> -			return;
> -		}
>  		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>  		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
>  	} else {
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/5] PCI: Avoid unnecessary resume after direct-complete
  2016-09-18  3:39 ` [PATCH v2 4/5] PCI: Avoid unnecessary resume after direct-complete Lukas Wunner
@ 2016-09-24  0:47   ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2016-09-24  0:47 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: linux-pci, linux-pm, linux-acpi, Peter Wu, Andreas Noever

On Sunday, September 18, 2016 05:39:20 AM Lukas Wunner wrote:
> Commit 58a1fbbb2ee8 ("PM / PCI / ACPI: Kick devices that might have been
> reset by firmware") added a runtime resume for devices that were runtime
> suspended when the system entered sleep.
> 
> The motivation was that devices might be in a reset-power-on state after
> waking from system sleep, so their power state as perceived by Linux
> (stored in pci_dev->current_state) would no longer reflect reality.
> By resuming such devices, we allow them to return to a low-power state
> via autosuspend and also bring their current_state in sync with reality.
> 
> However for devices that are *not* in a reset-power-on state, doing an
> unconditional resume wastes energy.  A more refined approach is called
> for which issues a runtime resume only if the power state after
> direct-complete is shallower than it was before. To achieve this, update
> the device's current_state and compare it to its pre-sleep value.
> 
> Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
>  drivers/pci/pci-driver.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index e39a67c..fd4b9c4 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -684,8 +684,19 @@ static int pci_pm_prepare(struct device *dev)
>  
>  static void pci_pm_complete(struct device *dev)
>  {
> -	pci_dev_complete_resume(to_pci_dev(dev));
> -	pm_complete_with_resume_check(dev);
> +	struct pci_dev *pci_dev = to_pci_dev(dev);
> +
> +	pci_dev_complete_resume(pci_dev);
> +	pm_generic_complete(dev);
> +
> +	/* Resume device if platform firmware has put it in reset-power-on */
> +	if (dev->power.direct_complete && pm_resume_via_firmware()) {
> +		pci_power_t pre_sleep_state = pci_dev->current_state;
> +
> +		pci_update_current_state(pci_dev, pci_dev->current_state);
> +		if (pci_dev->current_state < pre_sleep_state)
> +			pm_request_resume(dev);
> +	}
>  }
>  
>  #else /* !CONFIG_PM_SLEEP */
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown
  2016-09-19 10:14     ` Lukas Wunner
@ 2016-09-24  0:50       ` Rafael J. Wysocki
  2016-10-05 12:32         ` Lukas Wunner
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2016-09-24  0:50 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Oliver Neukum, linux-pci, linux-pm, linux-acpi, Peter Wu, Andreas Noever

On Monday, September 19, 2016 12:14:02 PM Lukas Wunner wrote:
> On Mon, Sep 19, 2016 at 11:12:23AM +0200, Oliver Neukum wrote:
> > On Sun, 2016-09-18 at 05:39 +0200, Lukas Wunner wrote:
> > > We currently perform a mandatory runtime resume of all PCI devices on
> > > ->shutdown.  However it is pointless to wake devices only to immediately
> > > power them down afterwards.  (Or have the firmware reset them, in case
> > > of a reboot.)
> > > 
> > > It seems there are only two cases when a runtime resume is actually
> > > necessary:  If the driver has declared a ->shutdown callback or if kexec
> > > is in progress.
> > > 
> > > Constrain resume of a device to these cases and let it slumber
> > > otherwise, thereby conserving energy and speeding up shutdown.
> > 
> > What happens if you get a wakeup event while going down?
> 
> Hm, I should probably disable that with
> 
> 	__pci_enable_wake(pci_dev, pci_dev->current_state, true, false);
> 

What if the user wants, say, WoL to power on the system?

> 
> > Will it lead to a reboot when a shutdown has been requested?
> 
> That would depend on the platform, my guess is no, but only someone
> with extensive real world experience with such corner cases can answer
> this. (Rafael?)

That wakeup will be silently dropped on the floor as a rule AFAICS.

But there's another theoretical issue here.

On systems with ACPI some devices may need to be reconfigured for wakeup
from S5 even though they have been configured for signaling wakeup while
in S0.  Those would still need to be resumed.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/5] PCI PM refinements
  2016-09-18  3:39 [PATCH v2 0/5] PCI PM refinements Lukas Wunner
                   ` (4 preceding siblings ...)
  2016-09-18  3:39 ` [PATCH v2 2/5] PCI: Query platform firmware for device power state Lukas Wunner
@ 2016-09-28 16:54 ` Bjorn Helgaas
  2016-09-29 12:11   ` Lukas Wunner
  5 siblings, 1 reply; 15+ messages in thread
From: Bjorn Helgaas @ 2016-09-28 16:54 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu,
	Andreas Noever

On Sun, Sep 18, 2016 at 05:39:20AM +0200, Lukas Wunner wrote:
> Here's the second installment of this series, based on feedback from
> Rafael. (Thanks!)
> 
> To reiterate, these refinements are meant to speed up and reduce power
> consumption when entering and leaving system sleep and when shutting
> down the system.  For details, please refer to the cover letter of v1:
> http://www.spinics.net/lists/linux-pci/msg53702.html
> 
> As usual I've pushed the patches to GitHub in case anyone prefers
> reviewing in a GUI:
> https://github.com/l1k/linux/commits/pci_pm_refinements_v2
> 
> 
> Changes since v1:
> 
> * Patch [1/5] now sports an ack by Rafael.
> 
> * Patch [2/5]:
>   * In acpi_pci_get_power_state(), only check for ACPI_STATE_UNKNOWN
>     instead of ACPI_STATE_D0 and ACPI_STATE_D3_COLD.
>   * Move the change to pci_update_current_state() to new patch [3/5].
> 
> * Patch [3/5]:
>   * Instead of solely relying on the platform firmware to report D3cold,
>     also probe the vendor ID and assume D3cold if it can't be read.
>     This should ensure proper detection of D3cold on pre-ACPI 4.0
>     platforms (which will never report anything deeper than D3hot)
>     as well as for devices with nonstandard PM mechanisms.
>   * The two existing workarounds for D3cold are removed from
>     pci_update_current_state(), as explained in the commit message.
> 
> * Patch [5/5]:
>   * Disable runtime PM on the device to prevent it from being runtime
>     resumed during the remainder of the shutdown process.
> 
> Thanks,
> 
> Lukas
> 
> 
> Lukas Wunner (5):
>   PCI: Afford direct-complete to devices with nonstandard PM
>   PCI: Query platform firmware for device power state
>   PCI: Recognize D3cold in pci_update_current_state()
>   PCI: Avoid unnecessary resume after direct-complete

I applied the patches above with Rafael's acks to pci/pm for v4.9, thanks,
Lukas!  I particularly appreciate your changelogs -- they're very readable,
complete, and consistent in style.

>   PCI: Avoid unnecessary resume on shutdown

Sounds like there's still a little discussion on this, so I haven't applied
this one yet.

>  drivers/pci/pci-acpi.c   | 22 ++++++++++++++++++++
>  drivers/pci/pci-driver.c | 24 ++++++++++++++++++++--
>  drivers/pci/pci.c        | 52 ++++++++++++++++++++++++++++++++----------------
>  drivers/pci/pci.h        |  3 +++
>  4 files changed, 82 insertions(+), 19 deletions(-)
> 
> -- 
> 2.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/5] PCI PM refinements
  2016-09-28 16:54 ` [PATCH v2 0/5] PCI PM refinements Bjorn Helgaas
@ 2016-09-29 12:11   ` Lukas Wunner
  0 siblings, 0 replies; 15+ messages in thread
From: Lukas Wunner @ 2016-09-29 12:11 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, linux-pm, linux-acpi, Rafael J. Wysocki, Peter Wu,
	Andreas Noever

On Wed, Sep 28, 2016 at 11:54:45AM -0500, Bjorn Helgaas wrote:
> On Sun, Sep 18, 2016 at 05:39:20AM +0200, Lukas Wunner wrote:
> >   PCI: Afford direct-complete to devices with nonstandard PM
> >   PCI: Query platform firmware for device power state
> >   PCI: Recognize D3cold in pci_update_current_state()
> >   PCI: Avoid unnecessary resume after direct-complete
> 
> I applied the patches above with Rafael's acks to pci/pm for v4.9, thanks,
> Lukas!  I particularly appreciate your changelogs -- they're very readable,
> complete, and consistent in style.

As a non-native English speaker I cannot imagine a bigger compliment,
thanks so much!


> >   PCI: Avoid unnecessary resume on shutdown
> 
> Sounds like there's still a little discussion on this, so I haven't applied
> this one yet.

Yes, indeed I need more time with this one.

Thanks!

Lukas

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown
  2016-09-24  0:50       ` Rafael J. Wysocki
@ 2016-10-05 12:32         ` Lukas Wunner
  0 siblings, 0 replies; 15+ messages in thread
From: Lukas Wunner @ 2016-10-05 12:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oliver Neukum, linux-pci, linux-pm, linux-acpi, Peter Wu, Andreas Noever

On Sat, Sep 24, 2016 at 02:50:05AM +0200, Rafael J. Wysocki wrote:
> On Monday, September 19, 2016 12:14:02 PM Lukas Wunner wrote:
> > On Mon, Sep 19, 2016 at 11:12:23AM +0200, Oliver Neukum wrote:
> > > On Sun, 2016-09-18 at 05:39 +0200, Lukas Wunner wrote:
> > > > We currently perform a mandatory runtime resume of all PCI devices
> > > > on ->shutdown.  However it is pointless to wake devices only to
> > > > immediately power them down afterwards.  (Or have the firmware
> > > > reset them, in case of a reboot.)
> > > > 
> > > > It seems there are only two cases when a runtime resume is actually
> > > > necessary:  If the driver has declared a ->shutdown callback or if
> > > > kexec is in progress.
> > > > 
> > > > Constrain resume of a device to these cases and let it slumber
> > > > otherwise, thereby conserving energy and speeding up shutdown.
> > > 
> > > What happens if you get a wakeup event while going down?
> > 
> > Hm, I should probably disable that with
> > 
> > 	__pci_enable_wake(pci_dev, pci_dev->current_state, true, false);
> > 
> 
> What if the user wants, say, WoL to power on the system?

When the PCI device is runtime resumed, the function call above
will be executed by pci_pm_runtime_resume(), so WoL will be disabled
as well.

There are a few drivers which re-enable WoL from their ->shutdown
hook, but that isn't impacted by this patch because the fast path
isn't taken if the driver has a ->shutdown hook.


> But there's another theoretical issue here.
> 
> On systems with ACPI some devices may need to be reconfigured for wakeup
> from S5 even though they have been configured for signaling wakeup while
> in S0.  Those would still need to be resumed.

Hm, why does the device have to be resumed for that?  The ACPI 6.0 spec
says for _DSW:

"This control method can only access Operation Regions that are either
always available while in a system working state or that are available
when the Power Resources referenced by the _PRW object are all ON."

Before _DSW is executed, all power resources are turned on, so whatever
_DSW does, the device should be powered up and accessible.  Does OSPM
really have to runtime resume the device for _DSW?


Another potential issue that crossed my mind is I/O virtualization:
If a PCI device is passed through to a guest VM, the VM may enable
PME, put the device in D3, etc.  If the VM is shut down, can anything
go wrong if the fast path is taken for runtime suspended devices?
(To be fair, if the VM crashes and doesn't shut down passed through
devices orderly, we'd be in the same situation.)  PME would be disabled
as described above, and leaving the device in D3hot should be fine as well.
If the host or another guest VM wants to use the PCI device afterwards,
it will resume it to D0 with pci_enable_device() in the driver's ->probe
hook.  Let me know if you find flaws in this line of thought.

Thanks!

Lukas

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-10-05 12:32 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-18  3:39 [PATCH v2 0/5] PCI PM refinements Lukas Wunner
2016-09-18  3:39 ` [PATCH v2 1/5] PCI: Afford direct-complete to devices with nonstandard PM Lukas Wunner
2016-09-18  3:39 ` [PATCH v2 5/5] PCI: Avoid unnecessary resume on shutdown Lukas Wunner
2016-09-19  9:12   ` Oliver Neukum
2016-09-19 10:14     ` Lukas Wunner
2016-09-24  0:50       ` Rafael J. Wysocki
2016-10-05 12:32         ` Lukas Wunner
2016-09-18  3:39 ` [PATCH v2 4/5] PCI: Avoid unnecessary resume after direct-complete Lukas Wunner
2016-09-24  0:47   ` Rafael J. Wysocki
2016-09-18  3:39 ` [PATCH v2 3/5] PCI: Recognize D3cold in pci_update_current_state() Lukas Wunner
2016-09-24  0:46   ` Rafael J. Wysocki
2016-09-18  3:39 ` [PATCH v2 2/5] PCI: Query platform firmware for device power state Lukas Wunner
2016-09-24  0:46   ` Rafael J. Wysocki
2016-09-28 16:54 ` [PATCH v2 0/5] PCI PM refinements Bjorn Helgaas
2016-09-29 12:11   ` Lukas Wunner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.