* [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources
@ 2019-06-25 10:29 Mika Westerberg
  2019-06-25 10:29 ` [PATCH v3 1/3] PCI / ACPI: Use cached ACPI device state to get PCI device power state Mika Westerberg
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Mika Westerberg @ 2019-06-25 10:29 UTC
  To: Rafael J. Wysocki, Bjorn Helgaas
  Cc: Len Brown, Lukas Wunner, Keith Busch, Alex Williamson,
	Alexandru Gagniuc, Mika Westerberg, linux-acpi, linux-pci

Hi all,

This is the third iteration of the patch series addressing issues around
sibling PCI devices sharing ACPI power resources.

As a concrete example, in Intel Ice Lake the Thunderbolt controller, PCIe
root ports and xHCI all share the same ACPI power resources. When they are
all in D3hot, the power resources (returned by _PR3) can be turned off,
powering off the whole block. However, there are two issues around this.

Firstly, the PCI core sets the device power state by asking what the real
ACPI power state is. As a result, all but the last device sharing the
power resources are recorded as being in D3hot when the power resources
are turned off. This causes issues if the user runs, for example, 'lspci',
because the device is really in D3cold, so what the user gets back is all
ones (0xffffffff).

Secondly, if any of the devices is runtime resumed, the power resources
are turned on, bringing all other devices sharing the resources to
D0uninitialized and losing their wakeup configuration.

This series aims to fix the two issues by:

  1. Using the ACPI cached power state when PCI devices are transitioned
     into low power states instead of reading back the "real" power state.

  2. Introducing the concept of "_PR0 dependent devices" that get
     runtime resumed whenever their power resource (which they might
     share with other sibling devices) gets turned on.

The series is based on the idea of Rafael J. Wysocki <rafael@kernel.org>.

Previous versions of the series can be found here:

  v2: https://lore.kernel.org/linux-pci/20190618161858.77834-1-mika.westerberg@linux.intel.com/T/#m7a41d0b745400054543324ce84125040dbfed912
  v1: https://www.spinics.net/lists/linux-pci/msg83583.html

Changes from v2:

  * Updated changelog of patch [1/3] according to comments I got. I left
    the D3C power resource and xHCI there because it shows that we can have
    multiple shared power resources.

  * Added link to the discussion around v2.

  * Use adev->flags.power_manageable in patch [2/3].

Mika Westerberg (3):
  PCI / ACPI: Use cached ACPI device state to get PCI device power state
  ACPI / PM: Introduce concept of a _PR0 dependent device
  PCI / ACPI: Add _PR0 dependent devices

 drivers/acpi/power.c    | 135 ++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci-acpi.c  |   5 +-
 include/acpi/acpi_bus.h |   4 ++
 3 files changed, 143 insertions(+), 1 deletion(-)

-- 
2.20.1



* [PATCH v3 1/3] PCI / ACPI: Use cached ACPI device state to get PCI device power state
  2019-06-25 10:29 [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources Mika Westerberg
@ 2019-06-25 10:29 ` Mika Westerberg
  2019-06-25 12:15   ` Rafael J. Wysocki
  2019-06-25 10:29 ` [PATCH v3 2/3] ACPI / PM: Introduce concept of a _PR0 dependent device Mika Westerberg
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Mika Westerberg @ 2019-06-25 10:29 UTC
  To: Rafael J. Wysocki, Bjorn Helgaas
  Cc: Len Brown, Lukas Wunner, Keith Busch, Alex Williamson,
	Alexandru Gagniuc, Mika Westerberg, linux-acpi, linux-pci

The ACPI power state returned by acpi_device_get_power() may depend on
the configuration of ACPI power resources in the system which may change
any time after acpi_device_get_power() has returned, unless the
reference counters of the ACPI power resources in question are set to
prevent that from happening. Thus it is invalid to use acpi_device_get_power()
in acpi_pci_get_power_state() the way it is done now and the value of
the ->power.state field in the corresponding struct acpi_device objects
(which reflects the ACPI power resources reference counting, among other
things) should be used instead.

An example where this becomes an issue is Intel Ice Lake, where the
Thunderbolt controller (NHI), two PCIe root ports (RP0 and RP1) and xHCI
all share the same power resources. The following picture, with power
resources marked with [], shows the topology:

  Host bridge
    |
    +- RP0 ---\
    +- RP1 ---|--+--> [TBT]
    +- NHI --/   |
    |            |
    |            v
    +- xHCI --> [D3C]

Here TBT and D3C are the shared ACPI power resources. The ACPI _PR3
method of each device in question returns either TBT or D3C or both.

Say we runtime suspend first the root ports RP0 and RP1, then NHI. Since
the TBT power resource is still on when the root ports are runtime
suspended, their dev->current_state is set to D3hot. When NHI is runtime
suspended, TBT is finally turned off but the state of the root ports
remains D3hot. When the xHCI is then runtime suspended, D3C gets turned
off as well. The PCI core thus has the power states of these devices
cached in their dev->current_state as follows:

  RP0 -> D3hot
  RP1 -> D3hot
  NHI -> D3cold
  xHCI -> D3cold

If the user now runs lspci, for instance, the result is all 1's, as in
the output below (00:07.0 is the first root port, RP0):

00:07.0 PCI bridge: Intel Corporation Device 8a1d (rev ff) (prog-if ff)
    !!! Unknown header type 7f
    Kernel driver in use: pcieport
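
To make the all-ones symptom concrete, here is a minimal userspace sketch
(not part of the patch) that decodes a configuration header the way the
output above reads. The helper names are made up for illustration; only
the register offsets come from the standard PCI header layout. A device
that does not respond returns 0xff for every byte, so the revision and
prog-if fields read ff, and the header layout, which is the low 7 bits of
the header type byte, reads 7f.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Offsets in the standard PCI configuration header. */
#define PCI_REVISION_ID	0x08
#define PCI_CLASS_PROG	0x09
#define PCI_HEADER_TYPE	0x0e
#define CFG_HDR_LEN	64	/* size of the standard header */

/* Hypothetical helper: fill a buffer the way reads from a powered-off
 * device come back (every byte is 0xff). */
static void cfg_fill_unpowered(uint8_t *cfg, size_t len)
{
	assert(len >= CFG_HDR_LEN);
	memset(cfg, 0xff, len);
}

/* The first dword (vendor and device ID) reading all ones means that no
 * device answered the configuration read. */
static int cfg_is_all_ones(const uint8_t *cfg)
{
	return cfg[0] == 0xff && cfg[1] == 0xff &&
	       cfg[2] == 0xff && cfg[3] == 0xff;
}

/* Bit 7 of the header type is the multi-function flag; the remaining
 * seven bits select the header layout. */
static unsigned int cfg_header_layout(const uint8_t *cfg)
{
	return cfg[PCI_HEADER_TYPE] & 0x7f;
}

/* Print the fields that show up as "ff" and "7f" in the lspci output. */
static void cfg_describe(const uint8_t *cfg)
{
	printf("rev %02x prog-if %02x header type %02x%s\n",
	       cfg[PCI_REVISION_ID], cfg[PCI_CLASS_PROG],
	       cfg_header_layout(cfg),
	       cfg_is_all_ones(cfg) ? " (all ones, not responding)" : "");
}
```

Calling cfg_fill_unpowered() on a 64-byte buffer and passing it to
cfg_describe() shows the same ff/ff/7f pattern as the root port above.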

In short, the hardware state is no longer in sync with the software
state. The exact same thing happens with the PME polling thread, which
ends up bringing the root ports back into D0 after they are runtime
suspended.

For this reason, modify acpi_pci_get_power_state() so that it uses the
ACPI device power state that was cached by the ACPI core. This makes the
PCI device power state match the ACPI device power state regardless of
the state of the shared power resources, which may still be on at this
point.
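
As an illustration only, the following toy userspace model replays the
suspend sequence described above. It is a sketch under assumptions, not
kernel code: all type and function names are hypothetical, the reference
counts are simplified, and it assumes (as one reading of the picture)
that RP0, RP1 and NHI keep both TBT and D3C on while xHCI keeps only D3C
on. "readback" stands for a state inferred from the power resources at
suspend time, "cached" for the state the ACPI core was asked to set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum toy_state { TOY_D0, TOY_D3HOT, TOY_D3COLD };

struct toy_res {
	const char *name;
	int ref_count;		/* resource stays on while non-zero */
};

struct toy_dev {
	const char *name;
	struct toy_res *res[2];	/* resources kept on, NULL-padded */
	enum toy_state cached;	/* state the ACPI core was asked to set */
	enum toy_state readback;/* snapshot inferred from the resources */
};

static struct toy_res tbt = { "TBT", 3 };
static struct toy_res d3c = { "D3C", 4 };

static struct toy_dev devs[] = {
	{ "RP0",  { &tbt, &d3c }, TOY_D0, TOY_D0 },
	{ "RP1",  { &tbt, &d3c }, TOY_D0, TOY_D0 },
	{ "NHI",  { &tbt, &d3c }, TOY_D0, TOY_D0 },
	{ "xHCI", { &d3c, NULL }, TOY_D0, TOY_D0 },
};

/* Any resource being off means the device has really lost power. */
static enum toy_state toy_infer_state(const struct toy_dev *d)
{
	for (int i = 0; i < 2 && d->res[i]; i++)
		if (d->res[i]->ref_count == 0)
			return TOY_D3COLD;
	return TOY_D3HOT;
}

/* Runtime suspend: drop the references, then record both views. */
static void toy_suspend(struct toy_dev *d)
{
	for (int i = 0; i < 2 && d->res[i]; i++) {
		assert(d->res[i]->ref_count > 0);
		d->res[i]->ref_count--;
	}
	d->cached = TOY_D3COLD;
	d->readback = toy_infer_state(d);	/* goes stale later */
}

/* Suspend in the order used in the changelog: RP0, RP1, NHI, xHCI. */
static void toy_suspend_all(void)
{
	for (size_t i = 0; i < sizeof(devs) / sizeof(devs[0]); i++) {
		toy_suspend(&devs[i]);
		printf("%s: readback=%d cached=%d\n", devs[i].name,
		       devs[i].readback, devs[i].cached);
	}
}
```

In this model the readback snapshots of RP0 and RP1 stay at D3hot even
though both resources end up off, while the cached value is D3cold for
all four devices, which is the value the patch makes the PCI core use.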

Link: https://lore.kernel.org/r/20190618161858.77834-2-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 drivers/pci/pci-acpi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 1897847ceb0c..b782acac26c5 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -685,7 +685,8 @@ static pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
 	if (!adev || !acpi_device_power_manageable(adev))
 		return PCI_UNKNOWN;
 
-	if (acpi_device_get_power(adev, &state) || state == ACPI_STATE_UNKNOWN)
+	state = adev->power.state;
+	if (state == ACPI_STATE_UNKNOWN)
 		return PCI_UNKNOWN;
 
 	return state_conv[state];
-- 
2.20.1



* [PATCH v3 2/3] ACPI / PM: Introduce concept of a _PR0 dependent device
  2019-06-25 10:29 [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources Mika Westerberg
  2019-06-25 10:29 ` [PATCH v3 1/3] PCI / ACPI: Use cached ACPI device state to get PCI device power state Mika Westerberg
@ 2019-06-25 10:29 ` Mika Westerberg
  2019-06-25 10:29 ` [PATCH v3 3/3] PCI / ACPI: Add _PR0 dependent devices Mika Westerberg
  2019-06-25 10:35 ` [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources Rafael J. Wysocki
  3 siblings, 0 replies; 7+ messages in thread
From: Mika Westerberg @ 2019-06-25 10:29 UTC
  To: Rafael J. Wysocki, Bjorn Helgaas
  Cc: Len Brown, Lukas Wunner, Keith Busch, Alex Williamson,
	Alexandru Gagniuc, Mika Westerberg, linux-acpi, linux-pci

If otherwise unrelated devices share power resources, turning the
resources on causes all devices sharing them to be powered up as well.
PCI devices then go into the D0uninitialized state, meaning that if they
were configured to trigger wake, that configuration is lost at this
point.

For this reason, introduce the concept of a "_PR0 dependent device" that
can be added to any ACPI device that has power resources. The dependent
device will be included in a list of dependent devices for all power
resources returned by the ACPI device's _PR0 (assuming it has one).
Whenever a power resource having dependent devices is physically turned
on (its _ON method is called), we runtime resume all of them to allow
their driver or, in the case of PCI, the PCI core to re-initialize the
device and its wake configuration.

This adds two functions that can be used to add and remove these
dependent devices. Note that the dependent device does not necessarily
need to share power resources, so this functionality can be used to add
"software dependencies" as well if needed.
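
The bookkeeping can be pictured with the small userspace sketch below. It
is a simplified model and not the code in this patch: the names are
hypothetical, a fixed array replaces the kernel list and mutex, and a
counter stands in for the runtime resume request. It keeps three
behaviours of the patch: a device is only added once, nothing is resumed
when the resource has fewer than two dependents, and otherwise every
dependent is resumed when the resource is turned on.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_MAX_DEPS 8

struct toy_device {
	const char *name;
	int resume_requests;	/* stand-in for runtime resume requests */
};

struct toy_resource {
	const char *name;
	struct toy_device *deps[TOY_MAX_DEPS];
	int ndeps;
};

/* Add @dev once. Returns 0 on success or if it is already present,
 * -1 if the table is full. */
static int toy_add_dependent(struct toy_resource *res,
			     struct toy_device *dev)
{
	for (int i = 0; i < res->ndeps; i++)
		if (res->deps[i] == dev)
			return 0;
	if (res->ndeps == TOY_MAX_DEPS)
		return -1;
	res->deps[res->ndeps++] = dev;
	return 0;
}

/* Remove @dev if present by moving the last entry into its slot. */
static void toy_remove_dependent(struct toy_resource *res,
				 struct toy_device *dev)
{
	for (int i = 0; i < res->ndeps; i++) {
		if (res->deps[i] == dev) {
			assert(res->ndeps > 0);
			res->deps[i] = res->deps[--res->ndeps];
			return;
		}
	}
}

/* Called after the resource has been physically turned on. */
static void toy_power_on(struct toy_resource *res)
{
	/* With zero or one dependent there is no sibling to resume. */
	if (res->ndeps < 2)
		return;
	for (int i = 0; i < res->ndeps; i++) {
		res->deps[i]->resume_requests++;
		printf("%s: resuming because [%s] turned on\n",
		       res->deps[i]->name, res->name);
	}
}
```

With NHI and xHCI both registered against one resource, a single
toy_power_on() bumps the counter of both devices, so the one that did not
initiate the power-on also gets a chance to restore its state.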

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 drivers/acpi/power.c    | 135 ++++++++++++++++++++++++++++++++++++++++
 include/acpi/acpi_bus.h |   4 ++
 2 files changed, 139 insertions(+)

diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c
index a916417b9e70..fe1e7bc91a5e 100644
--- a/drivers/acpi/power.c
+++ b/drivers/acpi/power.c
@@ -42,6 +42,11 @@ ACPI_MODULE_NAME("power");
 #define ACPI_POWER_RESOURCE_STATE_ON	0x01
 #define ACPI_POWER_RESOURCE_STATE_UNKNOWN 0xFF
 
+struct acpi_power_dependent_device {
+	struct device *dev;
+	struct list_head node;
+};
+
 struct acpi_power_resource {
 	struct acpi_device device;
 	struct list_head list_node;
@@ -51,6 +56,7 @@ struct acpi_power_resource {
 	unsigned int ref_count;
 	bool wakeup_enabled;
 	struct mutex resource_lock;
+	struct list_head dependents;
 };
 
 struct acpi_power_resource_entry {
@@ -232,8 +238,121 @@ static int acpi_power_get_list_state(struct list_head *list, int *state)
 	return 0;
 }
 
+static int
+acpi_power_resource_add_dependent(struct acpi_power_resource *resource,
+				  struct device *dev)
+{
+	struct acpi_power_dependent_device *dep;
+	int ret = 0;
+
+	mutex_lock(&resource->resource_lock);
+	list_for_each_entry(dep, &resource->dependents, node) {
+		/* Only add it once */
+		if (dep->dev == dev)
+			goto unlock;
+	}
+
+	dep = kzalloc(sizeof(*dep), GFP_KERNEL);
+	if (!dep) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	dep->dev = dev;
+	list_add_tail(&dep->node, &resource->dependents);
+	dev_dbg(dev, "added power dependency to [%s]\n", resource->name);
+
+unlock:
+	mutex_unlock(&resource->resource_lock);
+	return ret;
+}
+
+static void
+acpi_power_resource_remove_dependent(struct acpi_power_resource *resource,
+				     struct device *dev)
+{
+	struct acpi_power_dependent_device *dep;
+
+	mutex_lock(&resource->resource_lock);
+	list_for_each_entry(dep, &resource->dependents, node) {
+		if (dep->dev == dev) {
+			list_del(&dep->node);
+			kfree(dep);
+			dev_dbg(dev, "removed power dependency to [%s]\n",
+				resource->name);
+			break;
+		}
+	}
+	mutex_unlock(&resource->resource_lock);
+}
+
+/**
+ * acpi_device_power_add_dependent - Add dependent device of this ACPI device
+ * @adev: ACPI device pointer
+ * @dev: Dependent device
+ *
+ * If @adev has non-empty _PR0 the @dev is added as dependent device to all
+ * power resources returned by it. This means that whenever these power
+ * resources are turned _ON the dependent devices get runtime resumed. This
+ * is needed for devices such as PCI to allow its driver to re-initialize
+ * it after it went to D0uninitialized.
+ *
+ * If @adev does not have _PR0 this does nothing.
+ *
+ * Returns %0 in case of success and negative errno otherwise.
+ */
+int acpi_device_power_add_dependent(struct acpi_device *adev,
+				    struct device *dev)
+{
+	struct acpi_power_resource_entry *entry;
+	struct list_head *resources;
+	int ret;
+
+	if (!adev->flags.power_manageable)
+		return 0;
+
+	resources = &adev->power.states[ACPI_STATE_D0].resources;
+	list_for_each_entry(entry, resources, node) {
+		ret = acpi_power_resource_add_dependent(entry->resource, dev);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	list_for_each_entry(entry, resources, node)
+		acpi_power_resource_remove_dependent(entry->resource, dev);
+
+	return ret;
+}
+
+/**
+ * acpi_device_power_remove_dependent - Remove dependent device
+ * @adev: ACPI device pointer
+ * @dev: Dependent device
+ *
+ * Does the opposite of acpi_device_power_add_dependent() and removes the
+ * dependent device if it is found. Can be called to @adev that does not
+ * have _PR0 as well.
+ */
+void acpi_device_power_remove_dependent(struct acpi_device *adev,
+					struct device *dev)
+{
+	struct acpi_power_resource_entry *entry;
+	struct list_head *resources;
+
+	if (!adev->flags.power_manageable)
+		return;
+
+	resources = &adev->power.states[ACPI_STATE_D0].resources;
+	list_for_each_entry_reverse(entry, resources, node)
+		acpi_power_resource_remove_dependent(entry->resource, dev);
+}
+
 static int __acpi_power_on(struct acpi_power_resource *resource)
 {
+	struct acpi_power_dependent_device *dep;
 	acpi_status status = AE_OK;
 
 	status = acpi_evaluate_object(resource->device.handle, "_ON", NULL, NULL);
@@ -243,6 +362,21 @@ static int __acpi_power_on(struct acpi_power_resource *resource)
 	ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Power resource [%s] turned on\n",
 			  resource->name));
 
+	/*
+	 * If there are other dependents on this power resource we need to
+	 * resume them now so that their drivers can re-initialize the
+	 * hardware properly after it went back to D0.
+	 */
+	if (list_empty(&resource->dependents) ||
+	    list_is_singular(&resource->dependents))
+		return 0;
+
+	list_for_each_entry(dep, &resource->dependents, node) {
+		dev_dbg(dep->dev, "runtime resuming because [%s] turned on\n",
+			resource->name);
+		pm_request_resume(dep->dev);
+	}
+
 	return 0;
 }
 
@@ -810,6 +944,7 @@ int acpi_add_power_resource(acpi_handle handle)
 				ACPI_STA_DEFAULT);
 	mutex_init(&resource->resource_lock);
 	INIT_LIST_HEAD(&resource->list_node);
+	INIT_LIST_HEAD(&resource->dependents);
 	resource->name = device->pnp.bus_id;
 	strcpy(acpi_device_name(device), ACPI_POWER_DEVICE_NAME);
 	strcpy(acpi_device_class(device), ACPI_POWER_CLASS);
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 31b6c87d6240..4752ff0a9d9b 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -513,6 +513,10 @@ int acpi_device_fix_up_power(struct acpi_device *device);
 int acpi_bus_update_power(acpi_handle handle, int *state_p);
 int acpi_device_update_power(struct acpi_device *device, int *state_p);
 bool acpi_bus_power_manageable(acpi_handle handle);
+int acpi_device_power_add_dependent(struct acpi_device *adev,
+				    struct device *dev);
+void acpi_device_power_remove_dependent(struct acpi_device *adev,
+					struct device *dev);
 
 #ifdef CONFIG_PM
 bool acpi_bus_can_wakeup(acpi_handle handle);
-- 
2.20.1



* [PATCH v3 3/3] PCI / ACPI: Add _PR0 dependent devices
  2019-06-25 10:29 [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources Mika Westerberg
  2019-06-25 10:29 ` [PATCH v3 1/3] PCI / ACPI: Use cached ACPI device state to get PCI device power state Mika Westerberg
  2019-06-25 10:29 ` [PATCH v3 2/3] ACPI / PM: Introduce concept of a _PR0 dependent device Mika Westerberg
@ 2019-06-25 10:29 ` Mika Westerberg
  2019-06-25 10:35 ` [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources Rafael J. Wysocki
  3 siblings, 0 replies; 7+ messages in thread
From: Mika Westerberg @ 2019-06-25 10:29 UTC
  To: Rafael J. Wysocki, Bjorn Helgaas
  Cc: Len Brown, Lukas Wunner, Keith Busch, Alex Williamson,
	Alexandru Gagniuc, Mika Westerberg, linux-acpi, linux-pci

If otherwise unrelated PCI devices share ACPI power resources, turning
them on causes the devices to enter the D0uninitialized power state,
which may cause problems.

For example, in Intel Ice Lake two root ports (RP0 and RP1), the
Thunderbolt controller (NHI) and the xHCI controller all share power
resources, as can be seen in the topology below where power resources
are marked with []:

  Host bridge
    |
    +- RP0 ---\
    +- RP1 ---|--+--> [TBT]
    +- NHI --/   |
    |            |
    |            v
    +- xHCI --> [D3C]

Consider a situation where all devices sharing the power resources are
in D3cold (the power resources are turned off) and, for example, the
Thunderbolt controller is runtime resumed, so that the power resources
are turned on. This means that the other devices sharing them (RP0, RP1
and xHCI) are transitioned into the D0uninitialized state. If they were
configured to trigger wake (PME) on a certain event, that configuration
gets lost after the reset, so we need to re-initialize them to get
wakeup working as expected again. To do so, we need to runtime resume
all of them to make sure their registers get restored properly before we
can runtime suspend them again.

Since we just added the concept of a "_PR0 dependent device", we can
solve this by calling the relevant add/remove functions when the PCI
device is bound to its ACPI representation. If it has power resources,
the PCI device will be added as a dependent device to them and runtime
resumed whenever they are physically turned on. This should make sure
the PCI core can reconfigure wakes after the device is transitioned into
D0uninitialized.
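
The effect on the wake configuration can be sketched with the toy model
below. It is an illustration under assumptions rather than what the PCI
core actually does: the names are hypothetical, a single boolean stands
in for the PME enable setting held in the device, and "resume and
re-suspend" is collapsed into re-applying the setting that software
remembers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_pci_dev {
	const char *name;
	bool hw_pme_enabled;	/* lives in the device, lost on power loss */
	bool want_pme;		/* software's record of the wake setting */
};

/* Power coming back leaves the function in D0uninitialized, so the
 * setting held in the device is gone. */
static void toy_power_cycle(struct toy_pci_dev *d)
{
	d->hw_pme_enabled = false;
}

/* Runtime resume followed by runtime suspend re-applies the setting. */
static void toy_resume_and_resuspend(struct toy_pci_dev *d)
{
	d->hw_pme_enabled = d->want_pme;
}

/* Turn the shared resource on. When @resume_dependents is true every
 * sharer is resumed afterwards, as with the dependent device list. */
static void toy_shared_power_on(struct toy_pci_dev **sharers, size_t n,
				bool resume_dependents)
{
	assert(sharers != NULL);
	for (size_t i = 0; i < n; i++)
		toy_power_cycle(sharers[i]);
	if (!resume_dependents)
		return;
	for (size_t i = 0; i < n; i++)
		toy_resume_and_resuspend(sharers[i]);
}
```

Without the resume step a sharer that was armed for wake stays disarmed
after a sibling powers the block up; with it, the armed state is back.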

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 drivers/pci/pci-acpi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index b782acac26c5..2abe0eeafb53 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -902,6 +902,7 @@ static void pci_acpi_setup(struct device *dev)
 		device_wakeup_enable(dev);
 
 	acpi_pci_wakeup(pci_dev, false);
+	acpi_device_power_add_dependent(adev, dev);
 }
 
 static void pci_acpi_cleanup(struct device *dev)
@@ -914,6 +915,7 @@ static void pci_acpi_cleanup(struct device *dev)
 
 	pci_acpi_remove_pm_notifier(adev);
 	if (adev->wakeup.flags.valid) {
+		acpi_device_power_remove_dependent(adev, dev);
 		if (pci_dev->bridge_d3)
 			device_wakeup_disable(dev);
 
-- 
2.20.1



* Re: [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources
  2019-06-25 10:29 [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources Mika Westerberg
                   ` (2 preceding siblings ...)
  2019-06-25 10:29 ` [PATCH v3 3/3] PCI / ACPI: Add _PR0 dependent devices Mika Westerberg
@ 2019-06-25 10:35 ` Rafael J. Wysocki
  2019-07-05  9:51   ` Rafael J. Wysocki
  3 siblings, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2019-06-25 10:35 UTC
  To: Mika Westerberg
  Cc: Rafael J. Wysocki, Bjorn Helgaas, Len Brown, Lukas Wunner,
	Keith Busch, Alex Williamson, Alexandru Gagniuc,
	ACPI Devel Maling List, Linux PCI

The whole series looks good to me, thank you!


* Re: [PATCH v3 1/3] PCI / ACPI: Use cached ACPI device state to get PCI device power state
  2019-06-25 10:29 ` [PATCH v3 1/3] PCI / ACPI: Use cached ACPI device state to get PCI device power state Mika Westerberg
@ 2019-06-25 12:15   ` Rafael J. Wysocki
  0 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2019-06-25 12:15 UTC
  To: Mika Westerberg, Bjorn Helgaas
  Cc: Rafael J. Wysocki, Len Brown, Lukas Wunner, Keith Busch,
	Alex Williamson, Alexandru Gagniuc, ACPI Devel Maling List,
	Linux PCI

Note that there are two additional issues related to the one fixed by
this patch that need to be addressed differently.

For details, see

https://patchwork.kernel.org/patch/11015379/
https://patchwork.kernel.org/patch/11015391/


* Re: [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources
  2019-06-25 10:35 ` [PATCH v3 0/3] PCI / ACPI: Handle sibling devices sharing power resources Rafael J. Wysocki
@ 2019-07-05  9:51   ` Rafael J. Wysocki
  0 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2019-07-05  9:51 UTC
  To: Mika Westerberg
  Cc: Bjorn Helgaas, Len Brown, Lukas Wunner, Keith Busch,
	Alex Williamson, Alexandru Gagniuc, ACPI Devel Maling List,
	Linux PCI

On Tuesday, June 25, 2019 12:35:12 PM CEST Rafael J. Wysocki wrote:
> The whole series looks good to me, thank you!
> 

And so it has been applied and queued for 5.3, thanks!




