linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -v4 0/2] PCIe: Add PCIe runtime D3cold support
@ 2012-05-18  1:48 Huang Ying
  2012-05-18  1:48 ` [PATCH -v4 1/2] PCIe: Add runtime PM support to PCIe port Huang Ying
  2012-05-18  1:48 ` [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support Huang Ying
  0 siblings, 2 replies; 14+ messages in thread
From: Huang Ying @ 2012-05-18  1:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ming.m.lin, linux-kernel, linux-pm, Rafael J. Wysocki,
	Huang Ying, Zheng Yan

This patchset is based on the following patch:

[PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep state

Changelog:

v4:

- Minors fixes based on comments from Rafael

v3:

- Drop device.power.power_must_be_on, use pci_dev.no_d3cold instead
- Drop device.power.power_off_user, use pci_dev.d3cold_allowed instead
- Use black list instead of white list in pcie port runtime support per request from Bjorn
- Various fixes based on comments from Rafael

v2:

- Refreshed based on comments from Rafael

[PATCH -v4 1/2] PCIe: Add runtime PM support to PCIe port
[PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support

Best Regards,
Huang Ying

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH -v4 1/2] PCIe: Add runtime PM support to PCIe port
  2012-05-18  1:48 [PATCH -v4 0/2] PCIe: Add PCIe runtime D3cold support Huang Ying
@ 2012-05-18  1:48 ` Huang Ying
  2012-05-21 22:09   ` Rafael J. Wysocki
  2012-05-18  1:48 ` [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support Huang Ying
  1 sibling, 1 reply; 14+ messages in thread
From: Huang Ying @ 2012-05-18  1:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ming.m.lin, linux-kernel, linux-pm, Rafael J. Wysocki,
	Huang Ying, Zheng Yan

From: "Yan, Zheng" <zheng.z.yan@intel.com>

This patch adds runtime PM support to PCIe port.  This is needed by
PCIe D3cold support, where PCIe device without ACPI node may be
powered on/off by PCIe port.

Because runtime suspend is broken for some chipsets, a black list is
used to disable runtime PM support for these chipsets.

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
---
 drivers/pci/pci.c              |   10 ++++++++++
 drivers/pci/pcie/portdrv_pci.c |   24 ++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1517,6 +1517,16 @@ static void pci_pme_list_scan(struct wor
 	if (!list_empty(&pci_pme_list)) {
 		list_for_each_entry_safe(pme_dev, n, &pci_pme_list, list) {
 			if (pme_dev->dev->pme_poll) {
+				struct pci_dev *bridge;
+
+				bridge = pme_dev->dev->bus->self;
+				/*
+				 * If bridge is in low power state, the
+				 * configuration space of subordinate devices
+				 * may be not accessible
+				 */
+				if (bridge && bridge->current_state != PCI_D0)
+					continue;
 				pci_pme_wakeup(pme_dev->dev, NULL);
 			} else {
 				list_del(&pme_dev->list);
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -11,6 +11,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/pm.h>
+#include <linux/pm_runtime.h>
 #include <linux/init.h>
 #include <linux/pcieport_if.h>
 #include <linux/aer.h>
@@ -99,6 +100,15 @@ static int pcie_port_resume_noirq(struct
 	return 0;
 }
 
+#ifdef CONFIG_PM_RUNTIME
+static int pcie_port_runtime_pm(struct device *dev)
+{
+	return 0;
+}
+#else
+#define pcie_port_runtime_pm	NULL
+#endif
+
 static const struct dev_pm_ops pcie_portdrv_pm_ops = {
 	.suspend	= pcie_port_device_suspend,
 	.resume		= pcie_port_device_resume,
@@ -107,6 +117,8 @@ static const struct dev_pm_ops pcie_port
 	.poweroff	= pcie_port_device_suspend,
 	.restore	= pcie_port_device_resume,
 	.resume_noirq	= pcie_port_resume_noirq,
+	.runtime_suspend = pcie_port_runtime_pm,
+	.runtime_resume = pcie_port_runtime_pm,
 };
 
 #define PCIE_PORTDRV_PM_OPS	(&pcie_portdrv_pm_ops)
@@ -117,6 +129,14 @@ static const struct dev_pm_ops pcie_port
 #endif /* !PM */
 
 /*
+ * PCIe port runtime suspend is broken for some chipsets, so use a
+ * black list to disable runtime PM for these chipsets.
+ */
+static const struct pci_device_id port_runtime_pm_black_list[] = {
+	{ /* end: all zeroes */ }
+};
+
+/*
  * pcie_portdrv_probe - Probe PCI-Express port devices
  * @dev: PCI-Express port device being probed
  *
@@ -144,12 +164,16 @@ static int __devinit pcie_portdrv_probe(
 		return status;
 
 	pci_save_state(dev);
+	if (!pci_match_id(port_runtime_pm_black_list, dev))
+		pm_runtime_put_noidle(&dev->dev);
 
 	return 0;
 }
 
 static void pcie_portdrv_remove(struct pci_dev *dev)
 {
+	if (!pci_match_id(port_runtime_pm_black_list, dev))
+		pm_runtime_get_noresume(&dev->dev);
 	pcie_port_device_remove(dev);
 	pci_disable_device(dev);
 }

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-05-18  1:48 [PATCH -v4 0/2] PCIe: Add PCIe runtime D3cold support Huang Ying
  2012-05-18  1:48 ` [PATCH -v4 1/2] PCIe: Add runtime PM support to PCIe port Huang Ying
@ 2012-05-18  1:48 ` Huang Ying
  2012-05-21 22:11   ` Rafael J. Wysocki
  1 sibling, 1 reply; 14+ messages in thread
From: Huang Ying @ 2012-05-18  1:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ming.m.lin, linux-kernel, linux-pm, Rafael J. Wysocki,
	Huang Ying, Zheng Yan

This patch adds runtime D3cold support to PCIe bus.  D3cold is the
deepest power saving state for PCIe devices.  Where the main PCIe link
will be powered off totally, so before the PCIe main link is powered
on again, you can not access the device, even the configuration space,
which is still accessible in D3hot.  Because the main PCIe link is not
available, the PCI PM registers can not be used to put device into/out
of D3cold state, that will be done by platform logic such as ACPI
_PR3.

To support remote wake up in D3cold, aux power is supplied, and a
side-band pin: WAKE# is used to notify remote wake up event, the pin
is usually connected to platform logic such as ACPI GPE.  This is
quite different with other power saving states, where remote wake up
is notified via PME message through the PCIe main link.

PCIe devices in plug-in slot usually has no direct platform logic, for
example, there is usually no ACPI _PR3 for them.  The D3cold support
for these devices can be done via the PCIe port connected to it.  When
power on/off the PCIe port, the corresponding PCIe devices are powered
on/off too.  And the remote wake up event from PCIe device will be
notified to the corresponding PCIe port.

The PCI subsystem D3cold support and corresponding ACPI platform
support is implemented in the patch.  Including the support for PCIe
devices in plug-in slot.

For more information about PCIe D3cold and corresponding ACPI support,
please refer to:

- PCI Express Base Specification Revision 2.0
- Advanced Configuration and Power Interface Specification Revision 5.0

Originally-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 drivers/pci/pci-acpi.c         |   23 +++++++-
 drivers/pci/pci-driver.c       |    3 +
 drivers/pci/pci-sysfs.c        |   29 ++++++++++
 drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
 drivers/pci/pci.h              |    1 
 drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
 include/linux/pci.h            |   16 ++++-
 7 files changed, 205 insertions(+), 21 deletions(-)

--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
 	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
 		return;
 
+	if (pci_dev->current_state == PCI_D3cold) {
+		pci_wakeup_event(pci_dev);
+		pm_runtime_resume(&pci_dev->dev);
+		return;
+	}
+
 	if (!pci_dev->pm_cap || !pci_dev->pme_support
 	     || pci_check_pme_status(pci_dev)) {
 		if (pci_dev->pme_poll)
@@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
 
 static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
 {
-	int acpi_state;
+	int acpi_state, d_max;
 
-	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
-						ACPI_STATE_D3);
+	if (pdev->no_d3cold)
+		d_max = ACPI_STATE_D3_HOT;
+	else
+		d_max = ACPI_STATE_D3_COLD;
+	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL, d_max);
 	if (acpi_state < 0)
 		return PCI_POWER_ERROR;
 
@@ -297,7 +306,13 @@ static void acpi_pci_propagate_run_wake(
 
 static int acpi_pci_run_wake(struct pci_dev *dev, bool enable)
 {
-	if (dev->pme_interrupt)
+	/*
+	 * Per PCI Express Base Specification Revision 2.0 section
+	 * 5.3.3.2 Link Wakeup, platform support is needed for D3cold
+	 * waking up to power on the main link even if there is PME
+	 * support for D3cold
+	 */
+	if (dev->pme_interrupt && !dev->runtime_d3cold)
 		return 0;
 
 	if (!acpi_pm_device_run_wake(&dev->dev, enable))
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1013,10 +1013,13 @@ static int pci_pm_runtime_suspend(struct
 	if (!pm || !pm->runtime_suspend)
 		return -ENOSYS;
 
+	pci_dev->no_d3cold = false;
 	error = pm->runtime_suspend(dev);
 	suspend_report_result(pm->runtime_suspend, error);
 	if (error)
 		return error;
+	if (!pci_dev->d3cold_allowed)
+		pci_dev->no_d3cold = true;
 
 	pci_fixup_device(pci_fixup_suspend, pci_dev);
 
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -27,6 +27,7 @@
 #include <linux/security.h>
 #include <linux/pci-aspm.h>
 #include <linux/slab.h>
+#include <linux/pm_runtime.h>
 #include "pci.h"
 
 static int sysfs_initialized;	/* = 0 */
@@ -377,6 +378,31 @@ dev_bus_rescan_store(struct device *dev,
 
 #endif
 
+#if defined(CONFIG_PM_RUNTIME) && defined(CONFIG_ACPI)
+static ssize_t d3cold_allowed_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	unsigned long val;
+
+	if (strict_strtoul(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	pdev->d3cold_allowed = !!val;
+	pm_runtime_resume(dev);
+
+	return count;
+}
+
+static ssize_t d3cold_allowed_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	return sprintf (buf, "%u\n", pdev->d3cold_allowed);
+}
+#endif
+
 struct device_attribute pci_dev_attrs[] = {
 	__ATTR_RO(resource),
 	__ATTR_RO(vendor),
@@ -401,6 +427,9 @@ struct device_attribute pci_dev_attrs[]
 	__ATTR(remove, (S_IWUSR|S_IWGRP), NULL, remove_store),
 	__ATTR(rescan, (S_IWUSR|S_IWGRP), NULL, dev_rescan_store),
 #endif
+#if defined(CONFIG_PM_RUNTIME) && defined(CONFIG_ACPI)
+	__ATTR(d3cold_allowed, 0644, d3cold_allowed_show, d3cold_allowed_store),
+#endif
 	__ATTR_NULL,
 };
 
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -621,7 +621,8 @@ static int pci_raw_set_power_state(struc
 		dev_info(&dev->dev, "Refused to change power state, "
 			"currently in D%d\n", dev->current_state);
 
-	/* According to section 5.4.1 of the "PCI BUS POWER MANAGEMENT
+	/*
+	 * According to section 5.4.1 of the "PCI BUS POWER MANAGEMENT
 	 * INTERFACE SPECIFICATION, REV. 1.2", a device transitioning
 	 * from D3hot to D0 _may_ perform an internal reset, thereby
 	 * going to "D0 Uninitialized" rather than "D0 Initialized".
@@ -653,6 +654,16 @@ void pci_update_current_state(struct pci
 	if (dev->pm_cap) {
 		u16 pmcsr;
 
+		/*
+		 * Configuration space is not accessible for device in
+		 * D3cold, so just keep or set D3cold for safety
+		 */
+		if (dev->current_state == PCI_D3cold)
+			return;
+		if (state == PCI_D3cold) {
+			dev->current_state = PCI_D3cold;
+			return;
+		}
 		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
 		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
 	} else {
@@ -693,8 +704,50 @@ static int pci_platform_power_transition
  */
 static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state)
 {
-	if (state == PCI_D0)
+	if (state == PCI_D0) {
 		pci_platform_power_transition(dev, PCI_D0);
+		/*
+		 * Mandatory power management transition delays, see
+		 * PCI Express Base Specification Revision 2.0 Section
+		 * 6.6.1: Conventional Reset.  Do not delay for
+		 * devices powered on/off by corresponding bridge,
+		 * because have already delayed for the bridge.
+		 */
+		if (dev->runtime_d3cold) {
+			msleep(dev->d3cold_delay);
+			/*
+			 * When powering on a bridge from D3cold, the
+			 * whole hierarchy may be powered on into
+			 * D0uninitialized state, resume them to give
+			 * them a chance to suspend again
+			 */
+			pci_wakeup_bus(dev->subordinate);
+		}
+	}
+}
+
+/**
+ * __pci_dev_set_current_state - Set current state of a PCI device
+ * @dev: Device to handle
+ * @data: pointer to state to be set
+ */
+static int __pci_dev_set_current_state(struct pci_dev *dev, void *data)
+{
+	pci_power_t state = *(pci_power_t *)data;
+
+	dev->current_state = state;
+	return 0;
+}
+
+/**
+ * __pci_bus_set_current_state - Walk given bus and set current state of devices
+ * @bus: Top bus of the subtree to walk.
+ * @state: state to be set
+ */
+static void __pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state)
+{
+	if (bus)
+		pci_walk_bus(bus, __pci_dev_set_current_state, &state);
 }
 
 /**
@@ -706,8 +759,15 @@ static void __pci_start_power_transition
  */
 int __pci_complete_power_transition(struct pci_dev *dev, pci_power_t state)
 {
-	return state >= PCI_D0 ?
-			pci_platform_power_transition(dev, state) : -EINVAL;
+	int ret;
+
+	if (state < PCI_D0)
+		return -EINVAL;
+	ret = pci_platform_power_transition(dev, state);
+	/* Power off the bridge may power off the whole hierarchy */
+	if (!ret && state == PCI_D3cold)
+		__pci_bus_set_current_state(dev->subordinate, PCI_D3cold);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(__pci_complete_power_transition);
 
@@ -731,8 +791,8 @@ int pci_set_power_state(struct pci_dev *
 	int error;
 
 	/* bound the state we're entering */
-	if (state > PCI_D3hot)
-		state = PCI_D3hot;
+	if (state > PCI_D3cold)
+		state = PCI_D3cold;
 	else if (state < PCI_D0)
 		state = PCI_D0;
 	else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
@@ -747,10 +807,15 @@ int pci_set_power_state(struct pci_dev *
 
 	/* This device is quirked not to be put into D3, so
 	   don't put it in D3 */
-	if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
+	if (state >= PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
 		return 0;
 
-	error = pci_raw_set_power_state(dev, state);
+	/*
+	 * To put device in D3cold, we put device into D3hot in native
+	 * way, then put device into D3cold with platform ops
+	 */
+	error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
+					PCI_D3hot : state);
 
 	if (!__pci_complete_power_transition(dev, state))
 		error = 0;
@@ -1497,6 +1562,28 @@ void pci_pme_wakeup_bus(struct pci_bus *
 }
 
 /**
+ * pci_wakeup - Wake up a PCI device
+ * @dev: Device to handle.
+ * @ign: ignored parameter
+ */
+static int pci_wakeup(struct pci_dev *pci_dev, void *ign)
+{
+	pci_wakeup_event(pci_dev);
+	pm_request_resume(&pci_dev->dev);
+	return 0;
+}
+
+/**
+ * pci_wakeup_bus - Walk given bus and wake up devices on it
+ * @bus: Top bus of the subtree to walk.
+ */
+void pci_wakeup_bus(struct pci_bus *bus)
+{
+	if (bus)
+		pci_walk_bus(bus, pci_wakeup, NULL);
+}
+
+/**
  * pci_pme_capable - check the capability of PCI device to generate PME#
  * @dev: PCI device to handle.
  * @state: PCI state from which device will issue PME#.
@@ -1790,12 +1877,16 @@ int pci_finish_runtime_suspend(struct pc
 	if (target_state == PCI_POWER_ERROR)
 		return -EIO;
 
+	dev->runtime_d3cold = target_state == PCI_D3cold;
+
 	__pci_enable_wake(dev, target_state, true, pci_dev_run_wake(dev));
 
 	error = pci_set_power_state(dev, target_state);
 
-	if (error)
+	if (error) {
 		__pci_enable_wake(dev, target_state, true, false);
+		dev->runtime_d3cold = false;
+	}
 
 	return error;
 }
@@ -1865,6 +1956,7 @@ void pci_pm_init(struct pci_dev *dev)
 
 	dev->pm_cap = pm;
 	dev->d3_delay = PCI_PM_D3_WAIT;
+	dev->d3cold_delay = PCI_PM_D3COLD_WAIT;
 
 	dev->d1_support = false;
 	dev->d2_support = false;
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -70,6 +70,7 @@ extern void pci_update_current_state(str
 extern void pci_disable_enabled_device(struct pci_dev *dev);
 extern int pci_finish_runtime_suspend(struct pci_dev *dev);
 extern int __pci_pme_wakeup(struct pci_dev *dev, void *ign);
+extern void pci_wakeup_bus(struct pci_bus *bus);
 extern void pci_pm_init(struct pci_dev *dev);
 extern void platform_pci_wakeup_init(struct pci_dev *dev);
 extern void pci_allocate_cap_save_buffers(struct pci_dev *dev);
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -101,12 +101,48 @@ static int pcie_port_resume_noirq(struct
 }
 
 #ifdef CONFIG_PM_RUNTIME
-static int pcie_port_runtime_pm(struct device *dev)
+struct d3cold_info {
+	bool no_d3cold;
+	unsigned int d3cold_delay;
+};
+
+static int pci_dev_d3cold_info(struct pci_dev *pdev, void *data)
+{
+	struct d3cold_info *info = data;
+
+	info->d3cold_delay = max_t(unsigned int, pdev->d3cold_delay,
+				   info->d3cold_delay);
+	if (pdev->no_d3cold)
+		info->no_d3cold = true;
+	return 0;
+}
+
+static int pcie_port_runtime_suspend(struct device *dev)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct d3cold_info d3cold_info = {
+		.no_d3cold	= false,
+		.d3cold_delay	= PCI_PM_D3_WAIT,
+	};
+
+	/*
+	 * If any subordinate device disable D3cold, we should not put
+	 * the port into D3cold.  The D3cold delay of port should be
+	 * the max of that of all subordinate devices.
+	 */
+	pci_walk_bus(pdev->subordinate, pci_dev_d3cold_info, &d3cold_info);
+	pdev->no_d3cold = d3cold_info.no_d3cold;
+	pdev->d3cold_delay = d3cold_info.d3cold_delay;
+	return 0;
+}
+
+static int pcie_port_runtime_resume(struct device *dev)
 {
 	return 0;
 }
 #else
-#define pcie_port_runtime_pm	NULL
+#define pcie_port_runtime_suspend	NULL
+#define pcie_port_runtime_resume	NULL
 #endif
 
 static const struct dev_pm_ops pcie_portdrv_pm_ops = {
@@ -117,8 +153,8 @@ static const struct dev_pm_ops pcie_port
 	.poweroff	= pcie_port_device_suspend,
 	.restore	= pcie_port_device_resume,
 	.resume_noirq	= pcie_port_resume_noirq,
-	.runtime_suspend = pcie_port_runtime_pm,
-	.runtime_resume = pcie_port_runtime_pm,
+	.runtime_suspend = pcie_port_runtime_suspend,
+	.runtime_resume = pcie_port_runtime_resume,
 };
 
 #define PCIE_PORTDRV_PM_OPS	(&pcie_portdrv_pm_ops)
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -132,9 +132,10 @@ static inline const char *pci_power_name
 	return pci_power_names[1 + (int) state];
 }
 
-#define PCI_PM_D2_DELAY	200
-#define PCI_PM_D3_WAIT	10
-#define PCI_PM_BUS_WAIT	50
+#define PCI_PM_D2_DELAY		200
+#define PCI_PM_D3_WAIT		10
+#define PCI_PM_D3COLD_WAIT	100
+#define PCI_PM_BUS_WAIT		50
 
 /** The pci_channel state describes connectivity between the CPU and
  *  the pci device.  If some PCI bus between here and the pci device
@@ -278,11 +279,18 @@ struct pci_dev {
 	unsigned int	pme_poll:1;	/* Poll device's PME status bit */
 	unsigned int	d1_support:1;	/* Low power state D1 is supported */
 	unsigned int	d2_support:1;	/* Low power state D2 is supported */
-	unsigned int	no_d1d2:1;	/* Only allow D0 and D3 */
+	unsigned int	no_d1d2:1;	/* D1 and D2 are forbidden */
+	unsigned int	no_d3cold:1;	/* D3cold is forbidden */
+	unsigned int	d3cold_allowed:1;	/* D3cold is allowed by user */
 	unsigned int	mmio_always_on:1;	/* disallow turning off io/mem
 						   decoding during bar sizing */
 	unsigned int	wakeup_prepared:1;
+	unsigned int	runtime_d3cold:1;	/* whether go through runtime
+						   D3cold, not set for devices
+						   powered on/off by the
+						   corresponding bridge */
 	unsigned int	d3_delay;	/* D3->D0 transition time in ms */
+	unsigned int	d3cold_delay;	/* D3cold->D0 transition time in ms */
 
 #ifdef CONFIG_PCIEASPM
 	struct pcie_link_state	*link_state;	/* ASPM link state. */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 1/2] PCIe: Add runtime PM support to PCIe port
  2012-05-18  1:48 ` [PATCH -v4 1/2] PCIe: Add runtime PM support to PCIe port Huang Ying
@ 2012-05-21 22:09   ` Rafael J. Wysocki
  0 siblings, 0 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-05-21 22:09 UTC (permalink / raw)
  To: Huang Ying, Bjorn Helgaas; +Cc: ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Friday, May 18, 2012, Huang Ying wrote:
> From: "Yan, Zheng" <zheng.z.yan@intel.com>
> 
> This patch adds runtime PM support to PCIe port.  This is needed by
> PCIe D3cold support, where PCIe device without ACPI node may be
> powered on/off by PCIe port.
> 
> Because runtime suspend is broken for some chipsets, a black list is
> used to disable runtime PM support for these chipsets.
> 
> Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>

Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>

Thanks,
Rafael


> ---
>  drivers/pci/pci.c              |   10 ++++++++++
>  drivers/pci/pcie/portdrv_pci.c |   24 ++++++++++++++++++++++++
>  2 files changed, 34 insertions(+)
> 
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1517,6 +1517,16 @@ static void pci_pme_list_scan(struct wor
>  	if (!list_empty(&pci_pme_list)) {
>  		list_for_each_entry_safe(pme_dev, n, &pci_pme_list, list) {
>  			if (pme_dev->dev->pme_poll) {
> +				struct pci_dev *bridge;
> +
> +				bridge = pme_dev->dev->bus->self;
> +				/*
> +				 * If bridge is in low power state, the
> +				 * configuration space of subordinate devices
> +				 * may be not accessible
> +				 */
> +				if (bridge && bridge->current_state != PCI_D0)
> +					continue;
>  				pci_pme_wakeup(pme_dev->dev, NULL);
>  			} else {
>  				list_del(&pme_dev->list);
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -11,6 +11,7 @@
>  #include <linux/kernel.h>
>  #include <linux/errno.h>
>  #include <linux/pm.h>
> +#include <linux/pm_runtime.h>
>  #include <linux/init.h>
>  #include <linux/pcieport_if.h>
>  #include <linux/aer.h>
> @@ -99,6 +100,15 @@ static int pcie_port_resume_noirq(struct
>  	return 0;
>  }
>  
> +#ifdef CONFIG_PM_RUNTIME
> +static int pcie_port_runtime_pm(struct device *dev)
> +{
> +	return 0;
> +}
> +#else
> +#define pcie_port_runtime_pm	NULL
> +#endif
> +
>  static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>  	.suspend	= pcie_port_device_suspend,
>  	.resume		= pcie_port_device_resume,
> @@ -107,6 +117,8 @@ static const struct dev_pm_ops pcie_port
>  	.poweroff	= pcie_port_device_suspend,
>  	.restore	= pcie_port_device_resume,
>  	.resume_noirq	= pcie_port_resume_noirq,
> +	.runtime_suspend = pcie_port_runtime_pm,
> +	.runtime_resume = pcie_port_runtime_pm,
>  };
>  
>  #define PCIE_PORTDRV_PM_OPS	(&pcie_portdrv_pm_ops)
> @@ -117,6 +129,14 @@ static const struct dev_pm_ops pcie_port
>  #endif /* !PM */
>  
>  /*
> + * PCIe port runtime suspend is broken for some chipsets, so use a
> + * black list to disable runtime PM for these chipsets.
> + */
> +static const struct pci_device_id port_runtime_pm_black_list[] = {
> +	{ /* end: all zeroes */ }
> +};
> +
> +/*
>   * pcie_portdrv_probe - Probe PCI-Express port devices
>   * @dev: PCI-Express port device being probed
>   *
> @@ -144,12 +164,16 @@ static int __devinit pcie_portdrv_probe(
>  		return status;
>  
>  	pci_save_state(dev);
> +	if (!pci_match_id(port_runtime_pm_black_list, dev))
> +		pm_runtime_put_noidle(&dev->dev);
>  
>  	return 0;
>  }
>  
>  static void pcie_portdrv_remove(struct pci_dev *dev)
>  {
> +	if (!pci_match_id(port_runtime_pm_black_list, dev))
> +		pm_runtime_get_noresume(&dev->dev);
>  	pcie_port_device_remove(dev);
>  	pci_disable_device(dev);
>  }
> 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-05-18  1:48 ` [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support Huang Ying
@ 2012-05-21 22:11   ` Rafael J. Wysocki
  2012-05-30 21:49     ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-05-21 22:11 UTC (permalink / raw)
  To: Huang Ying, Bjorn Helgaas; +Cc: ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Friday, May 18, 2012, Huang Ying wrote:
> This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> deepest power saving state for PCIe devices.  Where the main PCIe link
> will be powered off totally, so before the PCIe main link is powered
> on again, you can not access the device, even the configuration space,
> which is still accessible in D3hot.  Because the main PCIe link is not
> available, the PCI PM registers can not be used to put device into/out
> of D3cold state, that will be done by platform logic such as ACPI
> _PR3.
> 
> To support remote wake up in D3cold, aux power is supplied, and a
> side-band pin: WAKE# is used to notify remote wake up event, the pin
> is usually connected to platform logic such as ACPI GPE.  This is
> quite different with other power saving states, where remote wake up
> is notified via PME message through the PCIe main link.
> 
> PCIe devices in plug-in slot usually has no direct platform logic, for
> example, there is usually no ACPI _PR3 for them.  The D3cold support
> for these devices can be done via the PCIe port connected to it.  When
> power on/off the PCIe port, the corresponding PCIe devices are powered
> on/off too.  And the remote wake up event from PCIe device will be
> notified to the corresponding PCIe port.
> 
> The PCI subsystem D3cold support and corresponding ACPI platform
> support is implemented in the patch.  Including the support for PCIe
> devices in plug-in slot.
> 
> For more information about PCIe D3cold and corresponding ACPI support,
> please refer to:
> 
> - PCI Express Base Specification Revision 2.0
> - Advanced Configuration and Power Interface Specification Revision 5.0
> 
> Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> Signed-off-by: Huang Ying <ying.huang@intel.com>

Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>

Thanks,
Rafael


> ---
>  drivers/pci/pci-acpi.c         |   23 +++++++-
>  drivers/pci/pci-driver.c       |    3 +
>  drivers/pci/pci-sysfs.c        |   29 ++++++++++
>  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
>  drivers/pci/pci.h              |    1 
>  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
>  include/linux/pci.h            |   16 ++++-
>  7 files changed, 205 insertions(+), 21 deletions(-)
> 
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
>  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
>  		return;
>  
> +	if (pci_dev->current_state == PCI_D3cold) {
> +		pci_wakeup_event(pci_dev);
> +		pm_runtime_resume(&pci_dev->dev);
> +		return;
> +	}
> +
>  	if (!pci_dev->pm_cap || !pci_dev->pme_support
>  	     || pci_check_pme_status(pci_dev)) {
>  		if (pci_dev->pme_poll)
> @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
>  
>  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
>  {
> -	int acpi_state;
> +	int acpi_state, d_max;
>  
> -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> -						ACPI_STATE_D3);
> +	if (pdev->no_d3cold)
> +		d_max = ACPI_STATE_D3_HOT;
> +	else
> +		d_max = ACPI_STATE_D3_COLD;
> +	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL, d_max);
>  	if (acpi_state < 0)
>  		return PCI_POWER_ERROR;
>  
> @@ -297,7 +306,13 @@ static void acpi_pci_propagate_run_wake(
>  
>  static int acpi_pci_run_wake(struct pci_dev *dev, bool enable)
>  {
> -	if (dev->pme_interrupt)
> +	/*
> +	 * Per PCI Express Base Specification Revision 2.0 section
> +	 * 5.3.3.2 Link Wakeup, platform support is needed for D3cold
> +	 * waking up to power on the main link even if there is PME
> +	 * support for D3cold
> +	 */
> +	if (dev->pme_interrupt && !dev->runtime_d3cold)
>  		return 0;
>  
>  	if (!acpi_pm_device_run_wake(&dev->dev, enable))
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1013,10 +1013,13 @@ static int pci_pm_runtime_suspend(struct
>  	if (!pm || !pm->runtime_suspend)
>  		return -ENOSYS;
>  
> +	pci_dev->no_d3cold = false;
>  	error = pm->runtime_suspend(dev);
>  	suspend_report_result(pm->runtime_suspend, error);
>  	if (error)
>  		return error;
> +	if (!pci_dev->d3cold_allowed)
> +		pci_dev->no_d3cold = true;
>  
>  	pci_fixup_device(pci_fixup_suspend, pci_dev);
>  
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -27,6 +27,7 @@
>  #include <linux/security.h>
>  #include <linux/pci-aspm.h>
>  #include <linux/slab.h>
> +#include <linux/pm_runtime.h>
>  #include "pci.h"
>  
>  static int sysfs_initialized;	/* = 0 */
> @@ -377,6 +378,31 @@ dev_bus_rescan_store(struct device *dev,
>  
>  #endif
>  
> +#if defined(CONFIG_PM_RUNTIME) && defined(CONFIG_ACPI)
> +static ssize_t d3cold_allowed_store(struct device *dev,
> +				    struct device_attribute *attr,
> +				    const char *buf, size_t count)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	unsigned long val;
> +
> +	if (strict_strtoul(buf, 0, &val) < 0)
> +		return -EINVAL;
> +
> +	pdev->d3cold_allowed = !!val;
> +	pm_runtime_resume(dev);
> +
> +	return count;
> +}
> +
> +static ssize_t d3cold_allowed_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	return sprintf (buf, "%u\n", pdev->d3cold_allowed);
> +}
> +#endif
> +
>  struct device_attribute pci_dev_attrs[] = {
>  	__ATTR_RO(resource),
>  	__ATTR_RO(vendor),
> @@ -401,6 +427,9 @@ struct device_attribute pci_dev_attrs[]
>  	__ATTR(remove, (S_IWUSR|S_IWGRP), NULL, remove_store),
>  	__ATTR(rescan, (S_IWUSR|S_IWGRP), NULL, dev_rescan_store),
>  #endif
> +#if defined(CONFIG_PM_RUNTIME) && defined(CONFIG_ACPI)
> +	__ATTR(d3cold_allowed, 0644, d3cold_allowed_show, d3cold_allowed_store),
> +#endif
>  	__ATTR_NULL,
>  };
>  
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -621,7 +621,8 @@ static int pci_raw_set_power_state(struc
>  		dev_info(&dev->dev, "Refused to change power state, "
>  			"currently in D%d\n", dev->current_state);
>  
> -	/* According to section 5.4.1 of the "PCI BUS POWER MANAGEMENT
> +	/*
> +	 * According to section 5.4.1 of the "PCI BUS POWER MANAGEMENT
>  	 * INTERFACE SPECIFICATION, REV. 1.2", a device transitioning
>  	 * from D3hot to D0 _may_ perform an internal reset, thereby
>  	 * going to "D0 Uninitialized" rather than "D0 Initialized".
> @@ -653,6 +654,16 @@ void pci_update_current_state(struct pci
>  	if (dev->pm_cap) {
>  		u16 pmcsr;
>  
> +		/*
> +		 * Configuration space is not accessible for device in
> +		 * D3cold, so just keep or set D3cold for safety
> +		 */
> +		if (dev->current_state == PCI_D3cold)
> +			return;
> +		if (state == PCI_D3cold) {
> +			dev->current_state = PCI_D3cold;
> +			return;
> +		}
>  		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>  		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
>  	} else {
> @@ -693,8 +704,50 @@ static int pci_platform_power_transition
>   */
>  static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state)
>  {
> -	if (state == PCI_D0)
> +	if (state == PCI_D0) {
>  		pci_platform_power_transition(dev, PCI_D0);
> +		/*
> +		 * Mandatory power management transition delays, see
> +		 * PCI Express Base Specification Revision 2.0 Section
> +		 * 6.6.1: Conventional Reset.  Do not delay for
> +		 * devices powered on/off by corresponding bridge,
> +		 * because have already delayed for the bridge.
> +		 */
> +		if (dev->runtime_d3cold) {
> +			msleep(dev->d3cold_delay);
> +			/*
> +			 * When powering on a bridge from D3cold, the
> +			 * whole hierarchy may be powered on into
> +			 * D0uninitialized state, resume them to give
> +			 * them a chance to suspend again
> +			 */
> +			pci_wakeup_bus(dev->subordinate);
> +		}
> +	}
> +}
> +
> +/**
> + * __pci_dev_set_current_state - Set current state of a PCI device
> + * @dev: Device to handle
> + * @data: pointer to state to be set
> + */
> +static int __pci_dev_set_current_state(struct pci_dev *dev, void *data)
> +{
> +	pci_power_t state = *(pci_power_t *)data;
> +
> +	dev->current_state = state;
> +	return 0;
> +}
> +
> +/**
> + * __pci_bus_set_current_state - Walk given bus and set current state of devices
> + * @bus: Top bus of the subtree to walk.
> + * @state: state to be set
> + */
> +static void __pci_bus_set_current_state(struct pci_bus *bus, pci_power_t state)
> +{
> +	if (bus)
> +		pci_walk_bus(bus, __pci_dev_set_current_state, &state);
>  }
>  
>  /**
> @@ -706,8 +759,15 @@ static void __pci_start_power_transition
>   */
>  int __pci_complete_power_transition(struct pci_dev *dev, pci_power_t state)
>  {
> -	return state >= PCI_D0 ?
> -			pci_platform_power_transition(dev, state) : -EINVAL;
> +	int ret;
> +
> +	if (state < PCI_D0)
> +		return -EINVAL;
> +	ret = pci_platform_power_transition(dev, state);
> +	/* Power off the bridge may power off the whole hierarchy */
> +	if (!ret && state == PCI_D3cold)
> +		__pci_bus_set_current_state(dev->subordinate, PCI_D3cold);
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(__pci_complete_power_transition);
>  
> @@ -731,8 +791,8 @@ int pci_set_power_state(struct pci_dev *
>  	int error;
>  
>  	/* bound the state we're entering */
> -	if (state > PCI_D3hot)
> -		state = PCI_D3hot;
> +	if (state > PCI_D3cold)
> +		state = PCI_D3cold;
>  	else if (state < PCI_D0)
>  		state = PCI_D0;
>  	else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
> @@ -747,10 +807,15 @@ int pci_set_power_state(struct pci_dev *
>  
>  	/* This device is quirked not to be put into D3, so
>  	   don't put it in D3 */
> -	if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> +	if (state >= PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
>  		return 0;
>  
> -	error = pci_raw_set_power_state(dev, state);
> +	/*
> +	 * To put device in D3cold, we put device into D3hot in native
> +	 * way, then put device into D3cold with platform ops
> +	 */
> +	error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
> +					PCI_D3hot : state);
>  
>  	if (!__pci_complete_power_transition(dev, state))
>  		error = 0;
> @@ -1497,6 +1562,28 @@ void pci_pme_wakeup_bus(struct pci_bus *
>  }
>  
>  /**
> + * pci_wakeup - Wake up a PCI device
> + * @dev: Device to handle.
> + * @ign: ignored parameter
> + */
> +static int pci_wakeup(struct pci_dev *pci_dev, void *ign)
> +{
> +	pci_wakeup_event(pci_dev);
> +	pm_request_resume(&pci_dev->dev);
> +	return 0;
> +}
> +
> +/**
> + * pci_wakeup_bus - Walk given bus and wake up devices on it
> + * @bus: Top bus of the subtree to walk.
> + */
> +void pci_wakeup_bus(struct pci_bus *bus)
> +{
> +	if (bus)
> +		pci_walk_bus(bus, pci_wakeup, NULL);
> +}
> +
> +/**
>   * pci_pme_capable - check the capability of PCI device to generate PME#
>   * @dev: PCI device to handle.
>   * @state: PCI state from which device will issue PME#.
> @@ -1790,12 +1877,16 @@ int pci_finish_runtime_suspend(struct pc
>  	if (target_state == PCI_POWER_ERROR)
>  		return -EIO;
>  
> +	dev->runtime_d3cold = target_state == PCI_D3cold;
> +
>  	__pci_enable_wake(dev, target_state, true, pci_dev_run_wake(dev));
>  
>  	error = pci_set_power_state(dev, target_state);
>  
> -	if (error)
> +	if (error) {
>  		__pci_enable_wake(dev, target_state, true, false);
> +		dev->runtime_d3cold = false;
> +	}
>  
>  	return error;
>  }
> @@ -1865,6 +1956,7 @@ void pci_pm_init(struct pci_dev *dev)
>  
>  	dev->pm_cap = pm;
>  	dev->d3_delay = PCI_PM_D3_WAIT;
> +	dev->d3cold_delay = PCI_PM_D3COLD_WAIT;
>  
>  	dev->d1_support = false;
>  	dev->d2_support = false;
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -70,6 +70,7 @@ extern void pci_update_current_state(str
>  extern void pci_disable_enabled_device(struct pci_dev *dev);
>  extern int pci_finish_runtime_suspend(struct pci_dev *dev);
>  extern int __pci_pme_wakeup(struct pci_dev *dev, void *ign);
> +extern void pci_wakeup_bus(struct pci_bus *bus);
>  extern void pci_pm_init(struct pci_dev *dev);
>  extern void platform_pci_wakeup_init(struct pci_dev *dev);
>  extern void pci_allocate_cap_save_buffers(struct pci_dev *dev);
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -101,12 +101,48 @@ static int pcie_port_resume_noirq(struct
>  }
>  
>  #ifdef CONFIG_PM_RUNTIME
> -static int pcie_port_runtime_pm(struct device *dev)
> +struct d3cold_info {
> +	bool no_d3cold;
> +	unsigned int d3cold_delay;
> +};
> +
> +static int pci_dev_d3cold_info(struct pci_dev *pdev, void *data)
> +{
> +	struct d3cold_info *info = data;
> +
> +	info->d3cold_delay = max_t(unsigned int, pdev->d3cold_delay,
> +				   info->d3cold_delay);
> +	if (pdev->no_d3cold)
> +		info->no_d3cold = true;
> +	return 0;
> +}
> +
> +static int pcie_port_runtime_suspend(struct device *dev)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct d3cold_info d3cold_info = {
> +		.no_d3cold	= false,
> +		.d3cold_delay	= PCI_PM_D3_WAIT,
> +	};
> +
> +	/*
> +	 * If any subordinate device disable D3cold, we should not put
> +	 * the port into D3cold.  The D3cold delay of port should be
> +	 * the max of that of all subordinate devices.
> +	 */
> +	pci_walk_bus(pdev->subordinate, pci_dev_d3cold_info, &d3cold_info);
> +	pdev->no_d3cold = d3cold_info.no_d3cold;
> +	pdev->d3cold_delay = d3cold_info.d3cold_delay;
> +	return 0;
> +}
> +
> +static int pcie_port_runtime_resume(struct device *dev)
>  {
>  	return 0;
>  }
>  #else
> -#define pcie_port_runtime_pm	NULL
> +#define pcie_port_runtime_suspend	NULL
> +#define pcie_port_runtime_resume	NULL
>  #endif
>  
>  static const struct dev_pm_ops pcie_portdrv_pm_ops = {
> @@ -117,8 +153,8 @@ static const struct dev_pm_ops pcie_port
>  	.poweroff	= pcie_port_device_suspend,
>  	.restore	= pcie_port_device_resume,
>  	.resume_noirq	= pcie_port_resume_noirq,
> -	.runtime_suspend = pcie_port_runtime_pm,
> -	.runtime_resume = pcie_port_runtime_pm,
> +	.runtime_suspend = pcie_port_runtime_suspend,
> +	.runtime_resume = pcie_port_runtime_resume,
>  };
>  
>  #define PCIE_PORTDRV_PM_OPS	(&pcie_portdrv_pm_ops)
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -132,9 +132,10 @@ static inline const char *pci_power_name
>  	return pci_power_names[1 + (int) state];
>  }
>  
> -#define PCI_PM_D2_DELAY	200
> -#define PCI_PM_D3_WAIT	10
> -#define PCI_PM_BUS_WAIT	50
> +#define PCI_PM_D2_DELAY		200
> +#define PCI_PM_D3_WAIT		10
> +#define PCI_PM_D3COLD_WAIT	100
> +#define PCI_PM_BUS_WAIT		50
>  
>  /** The pci_channel state describes connectivity between the CPU and
>   *  the pci device.  If some PCI bus between here and the pci device
> @@ -278,11 +279,18 @@ struct pci_dev {
>  	unsigned int	pme_poll:1;	/* Poll device's PME status bit */
>  	unsigned int	d1_support:1;	/* Low power state D1 is supported */
>  	unsigned int	d2_support:1;	/* Low power state D2 is supported */
> -	unsigned int	no_d1d2:1;	/* Only allow D0 and D3 */
> +	unsigned int	no_d1d2:1;	/* D1 and D2 are forbidden */
> +	unsigned int	no_d3cold:1;	/* D3cold is forbidden */
> +	unsigned int	d3cold_allowed:1;	/* D3cold is allowed by user */
>  	unsigned int	mmio_always_on:1;	/* disallow turning off io/mem
>  						   decoding during bar sizing */
>  	unsigned int	wakeup_prepared:1;
> +	unsigned int	runtime_d3cold:1;	/* whether go through runtime
> +						   D3cold, not set for devices
> +						   powered on/off by the
> +						   corresponding bridge */
>  	unsigned int	d3_delay;	/* D3->D0 transition time in ms */
> +	unsigned int	d3cold_delay;	/* D3cold->D0 transition time in ms */
>  
>  #ifdef CONFIG_PCIEASPM
>  	struct pcie_link_state	*link_state;	/* ASPM link state. */
> 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-05-21 22:11   ` Rafael J. Wysocki
@ 2012-05-30 21:49     ` Rafael J. Wysocki
  2012-05-31  0:40       ` Huang Ying
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-05-30 21:49 UTC (permalink / raw)
  To: Huang Ying; +Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> On Friday, May 18, 2012, Huang Ying wrote:
> > This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> > deepest power saving state for PCIe devices.  Where the main PCIe link
> > will be powered off totally, so before the PCIe main link is powered
> > on again, you can not access the device, even the configuration space,
> > which is still accessible in D3hot.  Because the main PCIe link is not
> > available, the PCI PM registers can not be used to put device into/out
> > of D3cold state, that will be done by platform logic such as ACPI
> > _PR3.
> > 
> > To support remote wake up in D3cold, aux power is supplied, and a
> > side-band pin: WAKE# is used to notify remote wake up event, the pin
> > is usually connected to platform logic such as ACPI GPE.  This is
> > quite different with other power saving states, where remote wake up
> > is notified via PME message through the PCIe main link.
> > 
> > PCIe devices in plug-in slot usually has no direct platform logic, for
> > example, there is usually no ACPI _PR3 for them.  The D3cold support
> > for these devices can be done via the PCIe port connected to it.  When
> > power on/off the PCIe port, the corresponding PCIe devices are powered
> > on/off too.  And the remote wake up event from PCIe device will be
> > notified to the corresponding PCIe port.
> > 
> > The PCI subsystem D3cold support and corresponding ACPI platform
> > support is implemented in the patch.  Including the support for PCIe
> > devices in plug-in slot.
> > 
> > For more information about PCIe D3cold and corresponding ACPI support,
> > please refer to:
> > 
> > - PCI Express Base Specification Revision 2.0
> > - Advanced Configuration and Power Interface Specification Revision 5.0
> > 
> > Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> > Signed-off-by: Huang Ying <ying.huang@intel.com>
> 
> Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>

On a second thought ...

> > ---
> >  drivers/pci/pci-acpi.c         |   23 +++++++-
> >  drivers/pci/pci-driver.c       |    3 +
> >  drivers/pci/pci-sysfs.c        |   29 ++++++++++
> >  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
> >  drivers/pci/pci.h              |    1 
> >  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
> >  include/linux/pci.h            |   16 ++++-
> >  7 files changed, 205 insertions(+), 21 deletions(-)
> > 
> > --- a/drivers/pci/pci-acpi.c
> > +++ b/drivers/pci/pci-acpi.c
> > @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
> >  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
> >  		return;
> >  
> > +	if (pci_dev->current_state == PCI_D3cold) {
> > +		pci_wakeup_event(pci_dev);
> > +		pm_runtime_resume(&pci_dev->dev);
> > +		return;
> > +	}
> > +
> >  	if (!pci_dev->pm_cap || !pci_dev->pme_support
> >  	     || pci_check_pme_status(pci_dev)) {
> >  		if (pci_dev->pme_poll)
> > @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
> >  
> >  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
> >  {
> > -	int acpi_state;
> > +	int acpi_state, d_max;
> >  
> > -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> > -						ACPI_STATE_D3);

I wonder what tree this patch is against?  The code above doesn't seem to
be present in the current mainline.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-05-30 21:49     ` Rafael J. Wysocki
@ 2012-05-31  0:40       ` Huang Ying
  2012-05-31 19:01         ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Huang Ying @ 2012-05-31  0:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Wed, 2012-05-30 at 23:49 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> > On Friday, May 18, 2012, Huang Ying wrote:
> > > This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> > > deepest power saving state for PCIe devices.  Where the main PCIe link
> > > will be powered off totally, so before the PCIe main link is powered
> > > on again, you can not access the device, even the configuration space,
> > > which is still accessible in D3hot.  Because the main PCIe link is not
> > > available, the PCI PM registers can not be used to put device into/out
> > > of D3cold state, that will be done by platform logic such as ACPI
> > > _PR3.
> > > 
> > > To support remote wake up in D3cold, aux power is supplied, and a
> > > side-band pin: WAKE# is used to notify remote wake up event, the pin
> > > is usually connected to platform logic such as ACPI GPE.  This is
> > > quite different with other power saving states, where remote wake up
> > > is notified via PME message through the PCIe main link.
> > > 
> > > PCIe devices in plug-in slot usually has no direct platform logic, for
> > > example, there is usually no ACPI _PR3 for them.  The D3cold support
> > > for these devices can be done via the PCIe port connected to it.  When
> > > power on/off the PCIe port, the corresponding PCIe devices are powered
> > > on/off too.  And the remote wake up event from PCIe device will be
> > > notified to the corresponding PCIe port.
> > > 
> > > The PCI subsystem D3cold support and corresponding ACPI platform
> > > support is implemented in the patch.  Including the support for PCIe
> > > devices in plug-in slot.
> > > 
> > > For more information about PCIe D3cold and corresponding ACPI support,
> > > please refer to:
> > > 
> > > - PCI Express Base Specification Revision 2.0
> > > - Advanced Configuration and Power Interface Specification Revision 5.0
> > > 
> > > Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > 
> > Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
> 
> On a second thought ...
> 
> > > ---
> > >  drivers/pci/pci-acpi.c         |   23 +++++++-
> > >  drivers/pci/pci-driver.c       |    3 +
> > >  drivers/pci/pci-sysfs.c        |   29 ++++++++++
> > >  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
> > >  drivers/pci/pci.h              |    1 
> > >  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
> > >  include/linux/pci.h            |   16 ++++-
> > >  7 files changed, 205 insertions(+), 21 deletions(-)
> > > 
> > > --- a/drivers/pci/pci-acpi.c
> > > +++ b/drivers/pci/pci-acpi.c
> > > @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
> > >  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
> > >  		return;
> > >  
> > > +	if (pci_dev->current_state == PCI_D3cold) {
> > > +		pci_wakeup_event(pci_dev);
> > > +		pm_runtime_resume(&pci_dev->dev);
> > > +		return;
> > > +	}
> > > +
> > >  	if (!pci_dev->pm_cap || !pci_dev->pme_support
> > >  	     || pci_check_pme_status(pci_dev)) {
> > >  		if (pci_dev->pme_poll)
> > > @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
> > >  
> > >  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
> > >  {
> > > -	int acpi_state;
> > > +	int acpi_state, d_max;
> > >  
> > > -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> > > -						ACPI_STATE_D3);
> 
> I wonder what tree this patch is against?  The code above doesn't seem to
> be present in the current mainline.

As stated in [PATCH -v4 0/2], This patchset is based on:

[PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep
state

Which is sent to LKML separately.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-05-31  0:40       ` Huang Ying
@ 2012-05-31 19:01         ` Rafael J. Wysocki
  2012-06-01  2:03           ` Huang Ying
  2012-06-01  2:25           ` Huang Ying
  0 siblings, 2 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-05-31 19:01 UTC (permalink / raw)
  To: Huang Ying; +Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Thursday, May 31, 2012, Huang Ying wrote:
> On Wed, 2012-05-30 at 23:49 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> > > On Friday, May 18, 2012, Huang Ying wrote:
> > > > This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> > > > deepest power saving state for PCIe devices.  Where the main PCIe link
> > > > will be powered off totally, so before the PCIe main link is powered
> > > > on again, you can not access the device, even the configuration space,
> > > > which is still accessible in D3hot.  Because the main PCIe link is not
> > > > available, the PCI PM registers can not be used to put device into/out
> > > > of D3cold state, that will be done by platform logic such as ACPI
> > > > _PR3.
> > > > 
> > > > To support remote wake up in D3cold, aux power is supplied, and a
> > > > side-band pin: WAKE# is used to notify remote wake up event, the pin
> > > > is usually connected to platform logic such as ACPI GPE.  This is
> > > > quite different with other power saving states, where remote wake up
> > > > is notified via PME message through the PCIe main link.
> > > > 
> > > > PCIe devices in plug-in slot usually has no direct platform logic, for
> > > > example, there is usually no ACPI _PR3 for them.  The D3cold support
> > > > for these devices can be done via the PCIe port connected to it.  When
> > > > power on/off the PCIe port, the corresponding PCIe devices are powered
> > > > on/off too.  And the remote wake up event from PCIe device will be
> > > > notified to the corresponding PCIe port.
> > > > 
> > > > The PCI subsystem D3cold support and corresponding ACPI platform
> > > > support is implemented in the patch.  Including the support for PCIe
> > > > devices in plug-in slot.
> > > > 
> > > > For more information about PCIe D3cold and corresponding ACPI support,
> > > > please refer to:
> > > > 
> > > > - PCI Express Base Specification Revision 2.0
> > > > - Advanced Configuration and Power Interface Specification Revision 5.0
> > > > 
> > > > Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> > > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > > 
> > > Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > On a second thought ...
> > 
> > > > ---
> > > >  drivers/pci/pci-acpi.c         |   23 +++++++-
> > > >  drivers/pci/pci-driver.c       |    3 +
> > > >  drivers/pci/pci-sysfs.c        |   29 ++++++++++
> > > >  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
> > > >  drivers/pci/pci.h              |    1 
> > > >  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
> > > >  include/linux/pci.h            |   16 ++++-
> > > >  7 files changed, 205 insertions(+), 21 deletions(-)
> > > > 
> > > > --- a/drivers/pci/pci-acpi.c
> > > > +++ b/drivers/pci/pci-acpi.c
> > > > @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
> > > >  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
> > > >  		return;
> > > >  
> > > > +	if (pci_dev->current_state == PCI_D3cold) {
> > > > +		pci_wakeup_event(pci_dev);
> > > > +		pm_runtime_resume(&pci_dev->dev);
> > > > +		return;
> > > > +	}
> > > > +
> > > >  	if (!pci_dev->pm_cap || !pci_dev->pme_support
> > > >  	     || pci_check_pme_status(pci_dev)) {
> > > >  		if (pci_dev->pme_poll)
> > > > @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
> > > >  
> > > >  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
> > > >  {
> > > > -	int acpi_state;
> > > > +	int acpi_state, d_max;
> > > >  
> > > > -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> > > > -						ACPI_STATE_D3);
> > 
> > I wonder what tree this patch is against?  The code above doesn't seem to
> > be present in the current mainline.
> 
> As stated in [PATCH -v4 0/2], This patchset is based on:
> 
> [PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep
> state
> 
> Which is sent to LKML separately.

I see.

Can you post the latest version of that patch, please?

Besides, this hunk of the $subject patch:

> @@ -731,8 +791,8 @@ int pci_set_power_state(struct pci_dev *
>       int error;
>  
>       /* bound the state we're entering */
> -     if (state > PCI_D3hot)
> -             state = PCI_D3hot;
> +     if (state > PCI_D3cold)
> +             state = PCI_D3cold;
>       else if (state < PCI_D0)
>               state = PCI_D0;
>       else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
> @@ -747,10 +807,15 @@ int pci_set_power_state(struct pci_dev *
>  
>       /* This device is quirked not to be put into D3, so
>          don't put it in D3 */
> -     if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> +     if (state >= PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
>               return 0;
>  
> -     error = pci_raw_set_power_state(dev, state);
> +     /*
> +      * To put device in D3cold, we put device into D3hot in native
> +      * way, then put device into D3cold with platform ops
> +      */
> +     error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
> +                                     PCI_D3hot : state);
>  
>       if (!__pci_complete_power_transition(dev, state))
>               error = 0;

should be merged separately, because it will affect suspend/hibernation code
paths.  Namely, it will change the behavior in such a way that some devices
put into D3hot previously will be put into D3cold now during system suspend.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-05-31 19:01         ` Rafael J. Wysocki
@ 2012-06-01  2:03           ` Huang Ying
  2012-06-01  2:25           ` Huang Ying
  1 sibling, 0 replies; 14+ messages in thread
From: Huang Ying @ 2012-06-01  2:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Thu, 2012-05-31 at 21:01 +0200, Rafael J. Wysocki wrote:
> On Thursday, May 31, 2012, Huang Ying wrote:
> > On Wed, 2012-05-30 at 23:49 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> > > > On Friday, May 18, 2012, Huang Ying wrote:
> > > > > This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> > > > > deepest power saving state for PCIe devices.  Where the main PCIe link
> > > > > will be powered off totally, so before the PCIe main link is powered
> > > > > on again, you can not access the device, even the configuration space,
> > > > > which is still accessible in D3hot.  Because the main PCIe link is not
> > > > > available, the PCI PM registers can not be used to put device into/out
> > > > > of D3cold state, that will be done by platform logic such as ACPI
> > > > > _PR3.
> > > > > 
> > > > > To support remote wake up in D3cold, aux power is supplied, and a
> > > > > side-band pin: WAKE# is used to notify remote wake up event, the pin
> > > > > is usually connected to platform logic such as ACPI GPE.  This is
> > > > > quite different with other power saving states, where remote wake up
> > > > > is notified via PME message through the PCIe main link.
> > > > > 
> > > > > PCIe devices in plug-in slot usually has no direct platform logic, for
> > > > > example, there is usually no ACPI _PR3 for them.  The D3cold support
> > > > > for these devices can be done via the PCIe port connected to it.  When
> > > > > power on/off the PCIe port, the corresponding PCIe devices are powered
> > > > > on/off too.  And the remote wake up event from PCIe device will be
> > > > > notified to the corresponding PCIe port.
> > > > > 
> > > > > The PCI subsystem D3cold support and corresponding ACPI platform
> > > > > support is implemented in the patch.  Including the support for PCIe
> > > > > devices in plug-in slot.
> > > > > 
> > > > > For more information about PCIe D3cold and corresponding ACPI support,
> > > > > please refer to:
> > > > > 
> > > > > - PCI Express Base Specification Revision 2.0
> > > > > - Advanced Configuration and Power Interface Specification Revision 5.0
> > > > > 
> > > > > Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> > > > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > > > 
> > > > Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > On a second thought ...
> > > 
> > > > > ---
> > > > >  drivers/pci/pci-acpi.c         |   23 +++++++-
> > > > >  drivers/pci/pci-driver.c       |    3 +
> > > > >  drivers/pci/pci-sysfs.c        |   29 ++++++++++
> > > > >  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
> > > > >  drivers/pci/pci.h              |    1 
> > > > >  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
> > > > >  include/linux/pci.h            |   16 ++++-
> > > > >  7 files changed, 205 insertions(+), 21 deletions(-)
> > > > > 
> > > > > --- a/drivers/pci/pci-acpi.c
> > > > > +++ b/drivers/pci/pci-acpi.c
> > > > > @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
> > > > >  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
> > > > >  		return;
> > > > >  
> > > > > +	if (pci_dev->current_state == PCI_D3cold) {
> > > > > +		pci_wakeup_event(pci_dev);
> > > > > +		pm_runtime_resume(&pci_dev->dev);
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > >  	if (!pci_dev->pm_cap || !pci_dev->pme_support
> > > > >  	     || pci_check_pme_status(pci_dev)) {
> > > > >  		if (pci_dev->pme_poll)
> > > > > @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
> > > > >  
> > > > >  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
> > > > >  {
> > > > > -	int acpi_state;
> > > > > +	int acpi_state, d_max;
> > > > >  
> > > > > -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> > > > > -						ACPI_STATE_D3);
> > > 
> > > I wonder what tree this patch is against?  The code above doesn't seem to
> > > be present in the current mainline.
> > 
> > As stated in [PATCH -v4 0/2], This patchset is based on:
> > 
> > [PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep
> > state
> > 
> > Which is sent to LKML separately.
> 
> I see.
> 
> Can you post the latest version of that patch, please?

Sorry if you received duplicated email.  Forget to copy all people.

Here is the patch that this patchset is based on,

Subject: [PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep state

Lower device sleep state can save more power, but has more exit
latency too.  Sometimes, to satisfy some power QoS and other
requirement, we need to constrain the lowest device sleep state.

In this patch, a parameter to specify lowest allowed state for
acpi_pm_device_sleep_state is added.  So that the caller can enforce
the constraint via the parameter.

This is needed by PCIe D3cold support, where the lowest power state
allowed may be D3_HOT instead of default D3_COLD.

Changelog:

v2:
- Minor change per Rafeal's comments

Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 drivers/acpi/sleep.c       |   24 +++++++++++++++++++-----
 drivers/pci/pci-acpi.c     |    3 ++-
 drivers/pnp/pnpacpi/core.c |    4 ++--
 include/acpi/acpi_bus.h    |    6 +++---
 4 files changed, 26 insertions(+), 11 deletions(-)

--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -677,8 +677,9 @@ int acpi_suspend(u32 acpi_state)
  *	@dev: device to examine; its driver model wakeup flags control
  *		whether it should be able to wake up the system
  *	@d_min_p: used to store the upper limit of allowed states range
- *	Return value: preferred power state of the device on success, -ENODEV on
- *		failure (ie. if there's no 'struct acpi_device' for @dev)
+ *	@d_max_in: specify the lowest allowed states
+ *	Return value: preferred power state of the device on success, -ENODEV
+ *	(ie. if there's no 'struct acpi_device' for @dev) or -EINVAL on failure
  *
  *	Find the lowest power (highest number) ACPI device power state that
  *	device @dev can be in while the system is in the sleep state represented
@@ -693,13 +694,15 @@ int acpi_suspend(u32 acpi_state)
  *	via @wake.
  */
 
-int acpi_pm_device_sleep_state(struct device *dev, int *d_min_p)
+int acpi_pm_device_sleep_state(struct device *dev, int *d_min_p, int d_max_in)
 {
 	acpi_handle handle = DEVICE_ACPI_HANDLE(dev);
 	struct acpi_device *adev;
 	char acpi_method[] = "_SxD";
 	unsigned long long d_min, d_max;
 
+	if (d_max_in < ACPI_STATE_D0 || d_max_in > ACPI_STATE_D3)
+		return -EINVAL;
 	if (!handle || ACPI_FAILURE(acpi_bus_get_device(handle, &adev))) {
 		printk(KERN_DEBUG "ACPI handle has no context!\n");
 		return -ENODEV;
@@ -707,8 +710,10 @@ int acpi_pm_device_sleep_state(struct de
 
 	acpi_method[2] = '0' + acpi_target_sleep_state;
 	/*
-	 * If the sleep state is S0, we will return D3, but if the device has
-	 * _S0W, we will use the value from _S0W
+	 * If the sleep state is S0, the lowest limit from ACPI is D3,
+	 * but if the device has _S0W, we will use the value from _S0W
+	 * as the lowest limit from ACPI.  Finally, we will constrain
+	 * the lowest limit with the specified one.
 	 */
 	d_min = ACPI_STATE_D0;
 	d_max = ACPI_STATE_D3;
@@ -752,8 +757,17 @@ int acpi_pm_device_sleep_state(struct de
 		}
 	}
 
+	if (d_max_in < d_min)
+		return -EINVAL;
 	if (d_min_p)
 		*d_min_p = d_min;
+	/* constrain d_max with specified lowest limit (max number) */
+	if (d_max > d_max_in) {
+		for (d_max = d_max_in; d_max > d_min; d_max--) {
+			if (adev->power.states[d_max].flags.valid)
+				break;
+		}
+	}
 	return d_max;
 }
 #endif /* CONFIG_PM */
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -189,7 +189,8 @@ static pci_power_t acpi_pci_choose_state
 {
 	int acpi_state;
 
-	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL);
+	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
+						ACPI_STATE_D3);
 	if (acpi_state < 0)
 		return PCI_POWER_ERROR;
 
--- a/drivers/pnp/pnpacpi/core.c
+++ b/drivers/pnp/pnpacpi/core.c
@@ -170,8 +170,8 @@ static int pnpacpi_suspend(struct pnp_de
 	}
 
 	if (acpi_bus_power_manageable(handle)) {
-		int power_state = acpi_pm_device_sleep_state(&dev->dev, NULL);
-
+		int power_state = acpi_pm_device_sleep_state(&dev->dev, NULL,
+							     ACPI_STATE_D3);
 		if (power_state < 0)
 			power_state = (state.event == PM_EVENT_ON) ?
 					ACPI_STATE_D0 : ACPI_STATE_D3;
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -383,13 +383,13 @@ int acpi_enable_wakeup_device_power(stru
 int acpi_disable_wakeup_device_power(struct acpi_device *dev);
 
 #ifdef CONFIG_PM
-int acpi_pm_device_sleep_state(struct device *, int *);
+int acpi_pm_device_sleep_state(struct device *, int *, int);
 #else
-static inline int acpi_pm_device_sleep_state(struct device *d, int *p)
+static inline int acpi_pm_device_sleep_state(struct device *d, int *p, int m)
 {
 	if (p)
 		*p = ACPI_STATE_D0;
-	return ACPI_STATE_D3;
+	return (m >= ACPI_STATE_D0 && <= ACPI_STATE_D3) ? m : ACPI_STATE_D0;
 }
 #endif
 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-05-31 19:01         ` Rafael J. Wysocki
  2012-06-01  2:03           ` Huang Ying
@ 2012-06-01  2:25           ` Huang Ying
  2012-06-01 23:10             ` Rafael J. Wysocki
  1 sibling, 1 reply; 14+ messages in thread
From: Huang Ying @ 2012-06-01  2:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Thu, 2012-05-31 at 21:01 +0200, Rafael J. Wysocki wrote:
> On Thursday, May 31, 2012, Huang Ying wrote:
> > On Wed, 2012-05-30 at 23:49 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> > > > On Friday, May 18, 2012, Huang Ying wrote:
> > > > > This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> > > > > deepest power saving state for PCIe devices.  Where the main PCIe link
> > > > > will be powered off totally, so before the PCIe main link is powered
> > > > > on again, you can not access the device, even the configuration space,
> > > > > which is still accessible in D3hot.  Because the main PCIe link is not
> > > > > available, the PCI PM registers can not be used to put device into/out
> > > > > of D3cold state, that will be done by platform logic such as ACPI
> > > > > _PR3.
> > > > > 
> > > > > To support remote wake up in D3cold, aux power is supplied, and a
> > > > > side-band pin: WAKE# is used to notify remote wake up event, the pin
> > > > > is usually connected to platform logic such as ACPI GPE.  This is
> > > > > quite different with other power saving states, where remote wake up
> > > > > is notified via PME message through the PCIe main link.
> > > > > 
> > > > > PCIe devices in plug-in slot usually has no direct platform logic, for
> > > > > example, there is usually no ACPI _PR3 for them.  The D3cold support
> > > > > for these devices can be done via the PCIe port connected to it.  When
> > > > > power on/off the PCIe port, the corresponding PCIe devices are powered
> > > > > on/off too.  And the remote wake up event from PCIe device will be
> > > > > notified to the corresponding PCIe port.
> > > > > 
> > > > > The PCI subsystem D3cold support and corresponding ACPI platform
> > > > > support is implemented in the patch.  Including the support for PCIe
> > > > > devices in plug-in slot.
> > > > > 
> > > > > For more information about PCIe D3cold and corresponding ACPI support,
> > > > > please refer to:
> > > > > 
> > > > > - PCI Express Base Specification Revision 2.0
> > > > > - Advanced Configuration and Power Interface Specification Revision 5.0
> > > > > 
> > > > > Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> > > > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > > > 
> > > > Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > On a second thought ...
> > > 
> > > > > ---
> > > > >  drivers/pci/pci-acpi.c         |   23 +++++++-
> > > > >  drivers/pci/pci-driver.c       |    3 +
> > > > >  drivers/pci/pci-sysfs.c        |   29 ++++++++++
> > > > >  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
> > > > >  drivers/pci/pci.h              |    1 
> > > > >  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
> > > > >  include/linux/pci.h            |   16 ++++-
> > > > >  7 files changed, 205 insertions(+), 21 deletions(-)
> > > > > 
> > > > > --- a/drivers/pci/pci-acpi.c
> > > > > +++ b/drivers/pci/pci-acpi.c
> > > > > @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
> > > > >  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
> > > > >  		return;
> > > > >  
> > > > > +	if (pci_dev->current_state == PCI_D3cold) {
> > > > > +		pci_wakeup_event(pci_dev);
> > > > > +		pm_runtime_resume(&pci_dev->dev);
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > >  	if (!pci_dev->pm_cap || !pci_dev->pme_support
> > > > >  	     || pci_check_pme_status(pci_dev)) {
> > > > >  		if (pci_dev->pme_poll)
> > > > > @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
> > > > >  
> > > > >  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
> > > > >  {
> > > > > -	int acpi_state;
> > > > > +	int acpi_state, d_max;
> > > > >  
> > > > > -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> > > > > -						ACPI_STATE_D3);
> > > 
> > > I wonder what tree this patch is against?  The code above doesn't seem to
> > > be present in the current mainline.
> > 
> > As stated in [PATCH -v4 0/2], This patchset is based on:
> > 
> > [PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep
> > state
> > 
> > Which is sent to LKML separately.
> 
> I see.
> 
> Can you post the latest version of that patch, please?
> 
> Besides, this hunk of the $subject patch:
> 
> > @@ -731,8 +791,8 @@ int pci_set_power_state(struct pci_dev *
> >       int error;
> >  
> >       /* bound the state we're entering */
> > -     if (state > PCI_D3hot)
> > -             state = PCI_D3hot;
> > +     if (state > PCI_D3cold)
> > +             state = PCI_D3cold;
> >       else if (state < PCI_D0)
> >               state = PCI_D0;
> >       else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
> > @@ -747,10 +807,15 @@ int pci_set_power_state(struct pci_dev *
> >  
> >       /* This device is quirked not to be put into D3, so
> >          don't put it in D3 */
> > -     if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > +     if (state >= PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> >               return 0;
> >  
> > -     error = pci_raw_set_power_state(dev, state);
> > +     /*
> > +      * To put device in D3cold, we put device into D3hot in native
> > +      * way, then put device into D3cold with platform ops
> > +      */
> > +     error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
> > +                                     PCI_D3hot : state);
> >  
> >       if (!__pci_complete_power_transition(dev, state))
> >               error = 0;
> 
> should be merged separately, because it will affect suspend/hibernation code
> paths.  Namely, it will change the behavior in such a way that some devices
> put into D3hot previously will be put into D3cold now during system suspend.

Yes.  This patch enables both runtime D3cold and D3cold during system
suspend.  How about separate this patch into the following patches?

- Add d3cold disable logic, including flags: no_d3cold, d3cold_allowed,
runtime_d3cold, and disable runtime d3cold (because part of runtime
d3cold support will be enabled by system d3cold support).

- system d3cold support for PCIe port

- system d3cold support in PCI core

- runtime d3cold support for PCIe port

- runtime d3cold support in PCI core

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-06-01  2:25           ` Huang Ying
@ 2012-06-01 23:10             ` Rafael J. Wysocki
  2012-06-05  5:24               ` Huang Ying
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-06-01 23:10 UTC (permalink / raw)
  To: Huang Ying; +Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Friday, June 01, 2012, Huang Ying wrote:
> On Thu, 2012-05-31 at 21:01 +0200, Rafael J. Wysocki wrote:
> > On Thursday, May 31, 2012, Huang Ying wrote:
> > > On Wed, 2012-05-30 at 23:49 +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> > > > > On Friday, May 18, 2012, Huang Ying wrote:
> > > > > > This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> > > > > > deepest power saving state for PCIe devices.  Where the main PCIe link
> > > > > > will be powered off totally, so before the PCIe main link is powered
> > > > > > on again, you can not access the device, even the configuration space,
> > > > > > which is still accessible in D3hot.  Because the main PCIe link is not
> > > > > > available, the PCI PM registers can not be used to put device into/out
> > > > > > of D3cold state, that will be done by platform logic such as ACPI
> > > > > > _PR3.
> > > > > > 
> > > > > > To support remote wake up in D3cold, aux power is supplied, and a
> > > > > > side-band pin: WAKE# is used to notify remote wake up event, the pin
> > > > > > is usually connected to platform logic such as ACPI GPE.  This is
> > > > > > quite different with other power saving states, where remote wake up
> > > > > > is notified via PME message through the PCIe main link.
> > > > > > 
> > > > > > PCIe devices in plug-in slot usually has no direct platform logic, for
> > > > > > example, there is usually no ACPI _PR3 for them.  The D3cold support
> > > > > > for these devices can be done via the PCIe port connected to it.  When
> > > > > > power on/off the PCIe port, the corresponding PCIe devices are powered
> > > > > > on/off too.  And the remote wake up event from PCIe device will be
> > > > > > notified to the corresponding PCIe port.
> > > > > > 
> > > > > > The PCI subsystem D3cold support and corresponding ACPI platform
> > > > > > support is implemented in the patch.  Including the support for PCIe
> > > > > > devices in plug-in slot.
> > > > > > 
> > > > > > For more information about PCIe D3cold and corresponding ACPI support,
> > > > > > please refer to:
> > > > > > 
> > > > > > - PCI Express Base Specification Revision 2.0
> > > > > > - Advanced Configuration and Power Interface Specification Revision 5.0
> > > > > > 
> > > > > > Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> > > > > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > > > > 
> > > > > Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > 
> > > > On a second thought ...
> > > > 
> > > > > > ---
> > > > > >  drivers/pci/pci-acpi.c         |   23 +++++++-
> > > > > >  drivers/pci/pci-driver.c       |    3 +
> > > > > >  drivers/pci/pci-sysfs.c        |   29 ++++++++++
> > > > > >  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
> > > > > >  drivers/pci/pci.h              |    1 
> > > > > >  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
> > > > > >  include/linux/pci.h            |   16 ++++-
> > > > > >  7 files changed, 205 insertions(+), 21 deletions(-)
> > > > > > 
> > > > > > --- a/drivers/pci/pci-acpi.c
> > > > > > +++ b/drivers/pci/pci-acpi.c
> > > > > > @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
> > > > > >  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
> > > > > >  		return;
> > > > > >  
> > > > > > +	if (pci_dev->current_state == PCI_D3cold) {
> > > > > > +		pci_wakeup_event(pci_dev);
> > > > > > +		pm_runtime_resume(&pci_dev->dev);
> > > > > > +		return;
> > > > > > +	}
> > > > > > +
> > > > > >  	if (!pci_dev->pm_cap || !pci_dev->pme_support
> > > > > >  	     || pci_check_pme_status(pci_dev)) {
> > > > > >  		if (pci_dev->pme_poll)
> > > > > > @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
> > > > > >  
> > > > > >  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
> > > > > >  {
> > > > > > -	int acpi_state;
> > > > > > +	int acpi_state, d_max;
> > > > > >  
> > > > > > -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> > > > > > -						ACPI_STATE_D3);
> > > > 
> > > > I wonder what tree this patch is against?  The code above doesn't seem to
> > > > be present in the current mainline.
> > > 
> > > As stated in [PATCH -v4 0/2], This patchset is based on:
> > > 
> > > [PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep
> > > state
> > > 
> > > Which is sent to LKML separately.
> > 
> > I see.
> > 
> > Can you post the latest version of that patch, please?
> > 
> > Besides, this hunk of the $subject patch:
> > 
> > > @@ -731,8 +791,8 @@ int pci_set_power_state(struct pci_dev *
> > >       int error;
> > >  
> > >       /* bound the state we're entering */
> > > -     if (state > PCI_D3hot)
> > > -             state = PCI_D3hot;
> > > +     if (state > PCI_D3cold)
> > > +             state = PCI_D3cold;
> > >       else if (state < PCI_D0)
> > >               state = PCI_D0;
> > >       else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
> > > @@ -747,10 +807,15 @@ int pci_set_power_state(struct pci_dev *
> > >  
> > >       /* This device is quirked not to be put into D3, so
> > >          don't put it in D3 */
> > > -     if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > > +     if (state >= PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > >               return 0;
> > >  
> > > -     error = pci_raw_set_power_state(dev, state);
> > > +     /*
> > > +      * To put device in D3cold, we put device into D3hot in native
> > > +      * way, then put device into D3cold with platform ops
> > > +      */
> > > +     error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
> > > +                                     PCI_D3hot : state);
> > >  
> > >       if (!__pci_complete_power_transition(dev, state))
> > >               error = 0;
> > 
> > should be merged separately, because it will affect suspend/hibernation code
> > paths.  Namely, it will change the behavior in such a way that some devices
> > put into D3hot previously will be put into D3cold now during system suspend.
> 
> Yes.  This patch enables both runtime D3cold and D3cold during system
> suspend.  How about separate this patch into the following patches?
> 
> - Add d3cold disable logic, including flags: no_d3cold, d3cold_allowed,
> runtime_d3cold, and disable runtime d3cold (because part of runtime
> d3cold support will be enabled by system d3cold support).
> 
> - system d3cold support for PCIe port
> 
> - system d3cold support in PCI core
> 
> - runtime d3cold support for PCIe port
> 
> - runtime d3cold support in PCI core

Sounds good in principle.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-06-01 23:10             ` Rafael J. Wysocki
@ 2012-06-05  5:24               ` Huang Ying
  2012-06-06 13:52                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Huang Ying @ 2012-06-05  5:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Sat, 2012-06-02 at 01:10 +0200, Rafael J. Wysocki wrote:
> On Friday, June 01, 2012, Huang Ying wrote:
> > On Thu, 2012-05-31 at 21:01 +0200, Rafael J. Wysocki wrote:
> > > On Thursday, May 31, 2012, Huang Ying wrote:
> > > > On Wed, 2012-05-30 at 23:49 +0200, Rafael J. Wysocki wrote:
> > > > > On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> > > > > > On Friday, May 18, 2012, Huang Ying wrote:
> > > > > > > This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> > > > > > > deepest power saving state for PCIe devices.  Where the main PCIe link
> > > > > > > will be powered off totally, so before the PCIe main link is powered
> > > > > > > on again, you can not access the device, even the configuration space,
> > > > > > > which is still accessible in D3hot.  Because the main PCIe link is not
> > > > > > > available, the PCI PM registers can not be used to put device into/out
> > > > > > > of D3cold state, that will be done by platform logic such as ACPI
> > > > > > > _PR3.
> > > > > > > 
> > > > > > > To support remote wake up in D3cold, aux power is supplied, and a
> > > > > > > side-band pin: WAKE# is used to notify remote wake up event, the pin
> > > > > > > is usually connected to platform logic such as ACPI GPE.  This is
> > > > > > > quite different with other power saving states, where remote wake up
> > > > > > > is notified via PME message through the PCIe main link.
> > > > > > > 
> > > > > > > PCIe devices in plug-in slot usually has no direct platform logic, for
> > > > > > > example, there is usually no ACPI _PR3 for them.  The D3cold support
> > > > > > > for these devices can be done via the PCIe port connected to it.  When
> > > > > > > power on/off the PCIe port, the corresponding PCIe devices are powered
> > > > > > > on/off too.  And the remote wake up event from PCIe device will be
> > > > > > > notified to the corresponding PCIe port.
> > > > > > > 
> > > > > > > The PCI subsystem D3cold support and corresponding ACPI platform
> > > > > > > support is implemented in the patch.  Including the support for PCIe
> > > > > > > devices in plug-in slot.
> > > > > > > 
> > > > > > > For more information about PCIe D3cold and corresponding ACPI support,
> > > > > > > please refer to:
> > > > > > > 
> > > > > > > - PCI Express Base Specification Revision 2.0
> > > > > > > - Advanced Configuration and Power Interface Specification Revision 5.0
> > > > > > > 
> > > > > > > Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> > > > > > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > > > > > 
> > > > > > Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > 
> > > > > On a second thought ...
> > > > > 
> > > > > > > ---
> > > > > > >  drivers/pci/pci-acpi.c         |   23 +++++++-
> > > > > > >  drivers/pci/pci-driver.c       |    3 +
> > > > > > >  drivers/pci/pci-sysfs.c        |   29 ++++++++++
> > > > > > >  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
> > > > > > >  drivers/pci/pci.h              |    1 
> > > > > > >  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
> > > > > > >  include/linux/pci.h            |   16 ++++-
> > > > > > >  7 files changed, 205 insertions(+), 21 deletions(-)
> > > > > > > 
> > > > > > > --- a/drivers/pci/pci-acpi.c
> > > > > > > +++ b/drivers/pci/pci-acpi.c
> > > > > > > @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
> > > > > > >  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
> > > > > > >  		return;
> > > > > > >  
> > > > > > > +	if (pci_dev->current_state == PCI_D3cold) {
> > > > > > > +		pci_wakeup_event(pci_dev);
> > > > > > > +		pm_runtime_resume(&pci_dev->dev);
> > > > > > > +		return;
> > > > > > > +	}
> > > > > > > +
> > > > > > >  	if (!pci_dev->pm_cap || !pci_dev->pme_support
> > > > > > >  	     || pci_check_pme_status(pci_dev)) {
> > > > > > >  		if (pci_dev->pme_poll)
> > > > > > > @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
> > > > > > >  
> > > > > > >  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
> > > > > > >  {
> > > > > > > -	int acpi_state;
> > > > > > > +	int acpi_state, d_max;
> > > > > > >  
> > > > > > > -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> > > > > > > -						ACPI_STATE_D3);
> > > > > 
> > > > > I wonder what tree this patch is against?  The code above doesn't seem to
> > > > > be present in the current mainline.
> > > > 
> > > > As stated in [PATCH -v4 0/2], This patchset is based on:
> > > > 
> > > > [PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep
> > > > state
> > > > 
> > > > Which is sent to LKML separately.
> > > 
> > > I see.
> > > 
> > > Can you post the latest version of that patch, please?
> > > 
> > > Besides, this hunk of the $subject patch:
> > > 
> > > > @@ -731,8 +791,8 @@ int pci_set_power_state(struct pci_dev *
> > > >       int error;
> > > >  
> > > >       /* bound the state we're entering */
> > > > -     if (state > PCI_D3hot)
> > > > -             state = PCI_D3hot;
> > > > +     if (state > PCI_D3cold)
> > > > +             state = PCI_D3cold;
> > > >       else if (state < PCI_D0)
> > > >               state = PCI_D0;
> > > >       else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
> > > > @@ -747,10 +807,15 @@ int pci_set_power_state(struct pci_dev *
> > > >  
> > > >       /* This device is quirked not to be put into D3, so
> > > >          don't put it in D3 */
> > > > -     if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > > > +     if (state >= PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > > >               return 0;
> > > >  
> > > > -     error = pci_raw_set_power_state(dev, state);
> > > > +     /*
> > > > +      * To put device in D3cold, we put device into D3hot in native
> > > > +      * way, then put device into D3cold with platform ops
> > > > +      */
> > > > +     error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
> > > > +                                     PCI_D3hot : state);
> > > >  
> > > >       if (!__pci_complete_power_transition(dev, state))
> > > >               error = 0;
> > > 
> > > should be merged separately, because it will affect suspend/hibernation code
> > > paths.  Namely, it will change the behavior in such a way that some devices
> > > put into D3hot previously will be put into D3cold now during system suspend.
> > 
> > Yes.  This patch enables both runtime D3cold and D3cold during system
> > suspend.  How about separate this patch into the following patches?
> > 
> > - Add d3cold disable logic, including flags: no_d3cold, d3cold_allowed,
> > runtime_d3cold, and disable runtime d3cold (because part of runtime
> > d3cold support will be enabled by system d3cold support).
> > 
> > - system d3cold support for PCIe port
> > 
> > - system d3cold support in PCI core
> > 
> > - runtime d3cold support for PCIe port
> > 
> > - runtime d3cold support in PCI core
> 
> Sounds good in principle.

Thought it again.

If my understanding were correct, in most cases, The value to put
devices into D3cold during system suspend/hibernate may be questionable.
Because after Linux put the devices into lower power state,  the
firmware may put devices into D3cold state before entering system
suspend state.

So, I think maybe we can just constrain the target state to D3hot in
pci_set_power_state() if system suspend/hibernate is ongoing
(dev->runtime_d3cold is not set).  That is something as follow:

/* D3cold during system suspend/hibernate is not supported yet */
if (dev->runtime_d3cold && state >= PCI_D3cold)
	state = PCI_D3cold;
else if (state > PCI_D3hot)
	state = PCI_D3hot;

This way, we can just make some minor change to this patch and maybe
work on D3cold during system suspend/hibernate after some discussion.

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-06-05  5:24               ` Huang Ying
@ 2012-06-06 13:52                 ` Rafael J. Wysocki
  2012-06-07  1:03                   ` Huang Ying
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-06-06 13:52 UTC (permalink / raw)
  To: Huang Ying; +Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Tuesday, June 05, 2012, Huang Ying wrote:
> On Sat, 2012-06-02 at 01:10 +0200, Rafael J. Wysocki wrote:
> > On Friday, June 01, 2012, Huang Ying wrote:
> > > On Thu, 2012-05-31 at 21:01 +0200, Rafael J. Wysocki wrote:
> > > > On Thursday, May 31, 2012, Huang Ying wrote:
> > > > > On Wed, 2012-05-30 at 23:49 +0200, Rafael J. Wysocki wrote:
> > > > > > On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> > > > > > > On Friday, May 18, 2012, Huang Ying wrote:
> > > > > > > > This patch adds runtime D3cold support to PCIe bus.  D3cold is the
> > > > > > > > deepest power saving state for PCIe devices.  Where the main PCIe link
> > > > > > > > will be powered off totally, so before the PCIe main link is powered
> > > > > > > > on again, you can not access the device, even the configuration space,
> > > > > > > > which is still accessible in D3hot.  Because the main PCIe link is not
> > > > > > > > available, the PCI PM registers can not be used to put device into/out
> > > > > > > > of D3cold state, that will be done by platform logic such as ACPI
> > > > > > > > _PR3.
> > > > > > > > 
> > > > > > > > To support remote wake up in D3cold, aux power is supplied, and a
> > > > > > > > side-band pin: WAKE# is used to notify remote wake up event, the pin
> > > > > > > > is usually connected to platform logic such as ACPI GPE.  This is
> > > > > > > > quite different with other power saving states, where remote wake up
> > > > > > > > is notified via PME message through the PCIe main link.
> > > > > > > > 
> > > > > > > > PCIe devices in plug-in slot usually has no direct platform logic, for
> > > > > > > > example, there is usually no ACPI _PR3 for them.  The D3cold support
> > > > > > > > for these devices can be done via the PCIe port connected to it.  When
> > > > > > > > power on/off the PCIe port, the corresponding PCIe devices are powered
> > > > > > > > on/off too.  And the remote wake up event from PCIe device will be
> > > > > > > > notified to the corresponding PCIe port.
> > > > > > > > 
> > > > > > > > The PCI subsystem D3cold support and corresponding ACPI platform
> > > > > > > > support is implemented in the patch.  Including the support for PCIe
> > > > > > > > devices in plug-in slot.
> > > > > > > > 
> > > > > > > > For more information about PCIe D3cold and corresponding ACPI support,
> > > > > > > > please refer to:
> > > > > > > > 
> > > > > > > > - PCI Express Base Specification Revision 2.0
> > > > > > > > - Advanced Configuration and Power Interface Specification Revision 5.0
> > > > > > > > 
> > > > > > > > Originally-by: Zheng Yan <zheng.z.yan@intel.com>
> > > > > > > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > > > > > > 
> > > > > > > Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > > 
> > > > > > On a second thought ...
> > > > > > 
> > > > > > > > ---
> > > > > > > >  drivers/pci/pci-acpi.c         |   23 +++++++-
> > > > > > > >  drivers/pci/pci-driver.c       |    3 +
> > > > > > > >  drivers/pci/pci-sysfs.c        |   29 ++++++++++
> > > > > > > >  drivers/pci/pci.c              |  110 +++++++++++++++++++++++++++++++++++++----
> > > > > > > >  drivers/pci/pci.h              |    1 
> > > > > > > >  drivers/pci/pcie/portdrv_pci.c |   44 ++++++++++++++--
> > > > > > > >  include/linux/pci.h            |   16 ++++-
> > > > > > > >  7 files changed, 205 insertions(+), 21 deletions(-)
> > > > > > > > 
> > > > > > > > --- a/drivers/pci/pci-acpi.c
> > > > > > > > +++ b/drivers/pci/pci-acpi.c
> > > > > > > > @@ -48,6 +48,12 @@ static void pci_acpi_wake_dev(acpi_handl
> > > > > > > >  	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
> > > > > > > >  		return;
> > > > > > > >  
> > > > > > > > +	if (pci_dev->current_state == PCI_D3cold) {
> > > > > > > > +		pci_wakeup_event(pci_dev);
> > > > > > > > +		pm_runtime_resume(&pci_dev->dev);
> > > > > > > > +		return;
> > > > > > > > +	}
> > > > > > > > +
> > > > > > > >  	if (!pci_dev->pm_cap || !pci_dev->pme_support
> > > > > > > >  	     || pci_check_pme_status(pci_dev)) {
> > > > > > > >  		if (pci_dev->pme_poll)
> > > > > > > > @@ -187,10 +193,13 @@ acpi_status pci_acpi_remove_pm_notifier(
> > > > > > > >  
> > > > > > > >  static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
> > > > > > > >  {
> > > > > > > > -	int acpi_state;
> > > > > > > > +	int acpi_state, d_max;
> > > > > > > >  
> > > > > > > > -	acpi_state = acpi_pm_device_sleep_state(&pdev->dev, NULL,
> > > > > > > > -						ACPI_STATE_D3);
> > > > > > 
> > > > > > I wonder what tree this patch is against?  The code above doesn't seem to
> > > > > > be present in the current mainline.
> > > > > 
> > > > > As stated in [PATCH -v4 0/2], This patchset is based on:
> > > > > 
> > > > > [PATCH -v2] ACPI, PM, Specify lowest allowed state for device sleep
> > > > > state
> > > > > 
> > > > > Which is sent to LKML separately.
> > > > 
> > > > I see.
> > > > 
> > > > Can you post the latest version of that patch, please?
> > > > 
> > > > Besides, this hunk of the $subject patch:
> > > > 
> > > > > @@ -731,8 +791,8 @@ int pci_set_power_state(struct pci_dev *
> > > > >       int error;
> > > > >  
> > > > >       /* bound the state we're entering */
> > > > > -     if (state > PCI_D3hot)
> > > > > -             state = PCI_D3hot;
> > > > > +     if (state > PCI_D3cold)
> > > > > +             state = PCI_D3cold;
> > > > >       else if (state < PCI_D0)
> > > > >               state = PCI_D0;
> > > > >       else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
> > > > > @@ -747,10 +807,15 @@ int pci_set_power_state(struct pci_dev *
> > > > >  
> > > > >       /* This device is quirked not to be put into D3, so
> > > > >          don't put it in D3 */
> > > > > -     if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > > > > +     if (state >= PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > > > >               return 0;
> > > > >  
> > > > > -     error = pci_raw_set_power_state(dev, state);
> > > > > +     /*
> > > > > +      * To put device in D3cold, we put device into D3hot in native
> > > > > +      * way, then put device into D3cold with platform ops
> > > > > +      */
> > > > > +     error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
> > > > > +                                     PCI_D3hot : state);
> > > > >  
> > > > >       if (!__pci_complete_power_transition(dev, state))
> > > > >               error = 0;
> > > > 
> > > > should be merged separately, because it will affect suspend/hibernation code
> > > > paths.  Namely, it will change the behavior in such a way that some devices
> > > > put into D3hot previously will be put into D3cold now during system suspend.
> > > 
> > > Yes.  This patch enables both runtime D3cold and D3cold during system
> > > suspend.  How about separate this patch into the following patches?
> > > 
> > > - Add d3cold disable logic, including flags: no_d3cold, d3cold_allowed,
> > > runtime_d3cold, and disable runtime d3cold (because part of runtime
> > > d3cold support will be enabled by system d3cold support).
> > > 
> > > - system d3cold support for PCIe port
> > > 
> > > - system d3cold support in PCI core
> > > 
> > > - runtime d3cold support for PCIe port
> > > 
> > > - runtime d3cold support in PCI core
> > 
> > Sounds good in principle.
> 
> Thought it again.
> 
> If my understanding were correct, in most cases, The value to put
> devices into D3cold during system suspend/hibernate may be questionable.
> Because after Linux put the devices into lower power state,  the
> firmware may put devices into D3cold state before entering system
> suspend state.
> 
> So, I think maybe we can just constrain the target state to D3hot in
> pci_set_power_state() if system suspend/hibernate is ongoing
> (dev->runtime_d3cold is not set).  That is something as follow:
> 
> /* D3cold during system suspend/hibernate is not supported yet */
> if (dev->runtime_d3cold && state >= PCI_D3cold)
> 	state = PCI_D3cold;
> else if (state > PCI_D3hot)
> 	state = PCI_D3hot;
> 
> This way, we can just make some minor change to this patch and maybe
> work on D3cold during system suspend/hibernate after some discussion.

I don't honestly think we should add such ugly checks to pci_set_power_state().

Why don't we ensure that acpi_pm_device_sleep_state() doesn't return 4
(D3cold) if acpi_target_sleep_state is different from 0 instead?

Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support
  2012-06-06 13:52                 ` Rafael J. Wysocki
@ 2012-06-07  1:03                   ` Huang Ying
  0 siblings, 0 replies; 14+ messages in thread
From: Huang Ying @ 2012-06-07  1:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, ming.m.lin, linux-kernel, linux-pm, Zheng Yan

On Wed, 2012-06-06 at 15:52 +0200, Rafael J. Wysocki wrote:
> On Tuesday, June 05, 2012, Huang Ying wrote:
> > On Sat, 2012-06-02 at 01:10 +0200, Rafael J. Wysocki wrote:
> > > On Friday, June 01, 2012, Huang Ying wrote:
> > > > On Thu, 2012-05-31 at 21:01 +0200, Rafael J. Wysocki wrote:
> > > > > On Thursday, May 31, 2012, Huang Ying wrote:
> > > > > > On Wed, 2012-05-30 at 23:49 +0200, Rafael J. Wysocki wrote:
> > > > > > > On Tuesday, May 22, 2012, Rafael J. Wysocki wrote:
> > > > > > > > On Friday, May 18, 2012, Huang Ying wrote:
[snip] 
> > > > > Besides, this hunk of the $subject patch:
> > > > > 
> > > > > > @@ -731,8 +791,8 @@ int pci_set_power_state(struct pci_dev *
> > > > > >       int error;
> > > > > >  
> > > > > >       /* bound the state we're entering */
> > > > > > -     if (state > PCI_D3hot)
> > > > > > -             state = PCI_D3hot;
> > > > > > +     if (state > PCI_D3cold)
> > > > > > +             state = PCI_D3cold;
> > > > > >       else if (state < PCI_D0)
> > > > > >               state = PCI_D0;
> > > > > >       else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
> > > > > > @@ -747,10 +807,15 @@ int pci_set_power_state(struct pci_dev *
> > > > > >  
> > > > > >       /* This device is quirked not to be put into D3, so
> > > > > >          don't put it in D3 */
> > > > > > -     if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > > > > > +     if (state >= PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
> > > > > >               return 0;
> > > > > >  
> > > > > > -     error = pci_raw_set_power_state(dev, state);
> > > > > > +     /*
> > > > > > +      * To put device in D3cold, we put device into D3hot in native
> > > > > > +      * way, then put device into D3cold with platform ops
> > > > > > +      */
> > > > > > +     error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
> > > > > > +                                     PCI_D3hot : state);
> > > > > >  
> > > > > >       if (!__pci_complete_power_transition(dev, state))
> > > > > >               error = 0;
> > > > > 
> > > > > should be merged separately, because it will affect suspend/hibernation code
> > > > > paths.  Namely, it will change the behavior in such a way that some devices
> > > > > put into D3hot previously will be put into D3cold now during system suspend.
> > > > 
> > > > Yes.  This patch enables both runtime D3cold and D3cold during system
> > > > suspend.  How about separate this patch into the following patches?
> > > > 
> > > > - Add d3cold disable logic, including flags: no_d3cold, d3cold_allowed,
> > > > runtime_d3cold, and disable runtime d3cold (because part of runtime
> > > > d3cold support will be enabled by system d3cold support).
> > > > 
> > > > - system d3cold support for PCIe port
> > > > 
> > > > - system d3cold support in PCI core
> > > > 
> > > > - runtime d3cold support for PCIe port
> > > > 
> > > > - runtime d3cold support in PCI core
> > > 
> > > Sounds good in principle.
> > 
> > Thought it again.
> > 
> > If my understanding were correct, in most cases, The value to put
> > devices into D3cold during system suspend/hibernate may be questionable.
> > Because after Linux put the devices into lower power state,  the
> > firmware may put devices into D3cold state before entering system
> > suspend state.
> > 
> > So, I think maybe we can just constrain the target state to D3hot in
> > pci_set_power_state() if system suspend/hibernate is ongoing
> > (dev->runtime_d3cold is not set).  That is something as follow:
> > 
> > /* D3cold during system suspend/hibernate is not supported yet */
> > if (dev->runtime_d3cold && state >= PCI_D3cold)
> > 	state = PCI_D3cold;
> > else if (state > PCI_D3hot)
> > 	state = PCI_D3hot;
> > 
> > This way, we can just make some minor change to this patch and maybe
> > work on D3cold during system suspend/hibernate after some discussion.
> 
> I don't honestly think we should add such ugly checks to pci_set_power_state().
> 
> Why don't we ensure that acpi_pm_device_sleep_state() doesn't return 4
> (D3cold) if acpi_target_sleep_state is different from 0 instead?

We can add that into acpi_target_sleep_state.  But pci_set_power_sate()
is public API and used by many places already, do a quick search in
recent kernel source.

$ grep 'pci_set_power_state' -r . | grep -v PCI_D3hot | grep -v PCI_D0 | grep -v pci_choose_state | wc -l

yields:
55

$ grep 'pci_set_power_state' -r . | grep -v PCI_D3hot | grep -v PCI_D0 | grep -v pci_choose_state | grep -i d3cold

yields:
./drivers/misc/cb710/core.c:            pci_set_power_state(pdev, PCI_D3cold);
./arch/x86/pci/mrst.c:  pci_set_power_state(dev, PCI_D3cold);

Although we can check/fix them one by one.  We need to prevent future
pci_set_power_state to accept PCI_D3cold in some situation.

So I think we need to constrain the parameter of pci_set_power_state
anyway.  Maybe something like below in pci_set_power_state is better
than my previous one.

/* D3cold during system suspend/hibernate is not supported yet */
if (!dev->runtime_d3cold && state > PCI_D3hot)
	state = PCI_D3hot;

if (state > PCI_D3cold)
	state = PCI_D3cold;

Best Regards,
Huang Ying



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-06-07  1:03 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-18  1:48 [PATCH -v4 0/2] PCIe: Add PCIe runtime D3cold support Huang Ying
2012-05-18  1:48 ` [PATCH -v4 1/2] PCIe: Add runtime PM support to PCIe port Huang Ying
2012-05-21 22:09   ` Rafael J. Wysocki
2012-05-18  1:48 ` [PATCH -v4 2/2] PCIe: Add PCIe runtime D3cold support Huang Ying
2012-05-21 22:11   ` Rafael J. Wysocki
2012-05-30 21:49     ` Rafael J. Wysocki
2012-05-31  0:40       ` Huang Ying
2012-05-31 19:01         ` Rafael J. Wysocki
2012-06-01  2:03           ` Huang Ying
2012-06-01  2:25           ` Huang Ying
2012-06-01 23:10             ` Rafael J. Wysocki
2012-06-05  5:24               ` Huang Ying
2012-06-06 13:52                 ` Rafael J. Wysocki
2012-06-07  1:03                   ` Huang Ying

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).