linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
@ 2019-05-21 16:31 Kai-Heng Feng
  2019-05-21 22:23 ` Bjorn Helgaas
  0 siblings, 1 reply; 18+ messages in thread
From: Kai-Heng Feng @ 2019-05-21 16:31 UTC (permalink / raw)
  To: bhelgaas, rafael.j.wysocki; +Cc: linux-pci, linux-kernel, Kai-Heng Feng

There's an xHC device that doesn't wake when a USB device gets plugged
to its USB port. The driver's own runtime suspend callback was called,
PME signaling was enabled, but it stays at PCI D0.

A PCI device can be runtime suspended to D0 when it supports D0 PME and
its _S0W reports D0. Theoratically this should work, but as [1]
specifies, D0 doesn't have wakeup capability.

To avoid this problematic situation, we should avoid runtime suspend if
D0 is the only state that can wake up the device.

[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/device-working-state-d0

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/pci/pci-driver.c | 5 +++++
 drivers/pci/pci.c        | 2 +-
 include/linux/pci.h      | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index cae630fe6387..15a6310c5d7b 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1251,6 +1251,11 @@ static int pci_pm_runtime_suspend(struct device *dev)
 		return 0;
 	}
 
+	if (pci_target_state(pci_dev, device_can_wakeup(dev)) == PCI_D0) {
+		dev_dbg(dev, "D0 doesn't have wakeup capability\n");
+		return -EBUSY;
+	}
+
 	pci_dev->state_saved = false;
 	if (pm && pm->runtime_suspend) {
 		error = pm->runtime_suspend(dev);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8abc843b1615..ceee6efbbcfe 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2294,7 +2294,7 @@ EXPORT_SYMBOL(pci_wake_from_d3);
  * If the platform can't manage @dev, return the deepest state from which it
  * can generate wake events, based on any available PME info.
  */
-static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
+pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
 {
 	pci_power_t target_state = PCI_D3hot;
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4a5a84d7bdd4..91e8dc4d04aa 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1188,6 +1188,7 @@ bool pci_pme_capable(struct pci_dev *dev, pci_power_t state);
 void pci_pme_active(struct pci_dev *dev, bool enable);
 int pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable);
 int pci_wake_from_d3(struct pci_dev *dev, bool enable);
+pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup);
 int pci_prepare_to_sleep(struct pci_dev *dev);
 int pci_back_from_sleep(struct pci_dev *dev);
 bool pci_dev_run_wake(struct pci_dev *dev);
@@ -1672,6 +1673,8 @@ static inline int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 { return 0; }
 static inline int pci_wake_from_d3(struct pci_dev *dev, bool enable)
 { return 0; }
+pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
+{ return PCI_D0; }
 static inline pci_power_t pci_choose_state(struct pci_dev *dev,
 					   pm_message_t state)
 { return PCI_D0; }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-21 16:31 [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0 Kai-Heng Feng
@ 2019-05-21 22:23 ` Bjorn Helgaas
  2019-05-22  3:42   ` Kai Heng Feng
  0 siblings, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2019-05-21 22:23 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: rafael.j.wysocki, linux-pci, linux-kernel, Mathias Nyman, linux-usb

[+cc Mathias, linux-usb]

On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> There's an xHC device that doesn't wake when a USB device gets plugged
> to its USB port. The driver's own runtime suspend callback was called,
> PME signaling was enabled, but it stays at PCI D0.

s/xHC/xHCI/ ?

This looks like it's fixing a bug?  If so, please include a link to
the bug report, and make sure the bug report has "lspci -vv" output
attached to it.

> A PCI device can be runtime suspended to D0 when it supports D0 PME and
> its _S0W reports D0. Theoratically this should work, but as [1]
> specifies, D0 doesn't have wakeup capability.

s/Theoratically/Theoretically/

What does "runtime suspended to D0" mean?  Is that different from the
regular "device is fully operational" sort of D0?  If so, what
distinguishes "runtime suspended D0" from "normal fully operational
D0"?

> To avoid this problematic situation, we should avoid runtime suspend if
> D0 is the only state that can wake up the device.
> 
> [1] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/device-working-state-d0
> 
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/pci/pci-driver.c | 5 +++++
>  drivers/pci/pci.c        | 2 +-
>  include/linux/pci.h      | 3 +++
>  3 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index cae630fe6387..15a6310c5d7b 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1251,6 +1251,11 @@ static int pci_pm_runtime_suspend(struct device *dev)
>  		return 0;
>  	}
>  
> +	if (pci_target_state(pci_dev, device_can_wakeup(dev)) == PCI_D0) {
> +		dev_dbg(dev, "D0 doesn't have wakeup capability\n");
> +		return -EBUSY;
> +	}
> +
>  	pci_dev->state_saved = false;
>  	if (pm && pm->runtime_suspend) {
>  		error = pm->runtime_suspend(dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8abc843b1615..ceee6efbbcfe 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2294,7 +2294,7 @@ EXPORT_SYMBOL(pci_wake_from_d3);
>   * If the platform can't manage @dev, return the deepest state from which it
>   * can generate wake events, based on any available PME info.
>   */
> -static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
>  {
>  	pci_power_t target_state = PCI_D3hot;
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 4a5a84d7bdd4..91e8dc4d04aa 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1188,6 +1188,7 @@ bool pci_pme_capable(struct pci_dev *dev, pci_power_t state);
>  void pci_pme_active(struct pci_dev *dev, bool enable);
>  int pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable);
>  int pci_wake_from_d3(struct pci_dev *dev, bool enable);
> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup);
>  int pci_prepare_to_sleep(struct pci_dev *dev);
>  int pci_back_from_sleep(struct pci_dev *dev);
>  bool pci_dev_run_wake(struct pci_dev *dev);
> @@ -1672,6 +1673,8 @@ static inline int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>  { return 0; }
>  static inline int pci_wake_from_d3(struct pci_dev *dev, bool enable)
>  { return 0; }
> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
> +{ return PCI_D0; }
>  static inline pci_power_t pci_choose_state(struct pci_dev *dev,
>  					   pm_message_t state)
>  { return PCI_D0; }
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-21 22:23 ` Bjorn Helgaas
@ 2019-05-22  3:42   ` Kai Heng Feng
  2019-05-22 13:48     ` Bjorn Helgaas
  0 siblings, 1 reply; 18+ messages in thread
From: Kai Heng Feng @ 2019-05-22  3:42 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:

> [+cc Mathias, linux-usb]
>
> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
>> There's an xHC device that doesn't wake when a USB device gets plugged
>> to its USB port. The driver's own runtime suspend callback was called,
>> PME signaling was enabled, but it stays at PCI D0.
>
> s/xHC/xHCI/ ?
>
> This looks like it's fixing a bug?  If so, please include a link to
> the bug report, and make sure the bug report has "lspci -vv" output
> attached to it.

Ok, I’ll update this in V2.

>
>> A PCI device can be runtime suspended to D0 when it supports D0 PME and
>> its _S0W reports D0. Theoratically this should work, but as [1]
>> specifies, D0 doesn't have wakeup capability.
>
> s/Theoratically/Theoretically/

Ok.

>
> What does "runtime suspended to D0" mean?

It’s runtime suspended by PCI core, but stays at D0.

>  Is that different from the regular "device is fully operational" sort of D0?

Yes it's different to that.
Because of _S0W reports D0 and the device has D0 PME support, so it’s  
“suspended” to D0:

00:10.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] FCH USB  
XHCI Controller [1022:7914] (rev 20) (prog-if 30 [XHCI])
         Subsystem: Dell FCH USB XHCI Controller [1028:096c]
         Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Interrupt: pin A routed to IRQ 18
         Region 0: Memory at f0668000 (64-bit, non-prefetchable) [size=8K]
         Capabilities: [50] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                 Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
         Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
                 Address: 0000000000000000  Data: 0000
         Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
                 Vector table: BAR=0 offset=00001000
                 PBA: BAR=0 offset=00001080
         Capabilities: [a0] Express (v2) Root Complex Integrated Endpoint, MSI 00
                 DevCap: MaxPayload 128 bytes, PhantFunc 0
                         ExtTag- RBE+
                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 128 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                 DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR+, OBFF Not Supported
                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
         Capabilities: [100 v1] Latency Tolerance Reporting
                 Max snoop latency: 0ns
                 Max no snoop latency: 0ns
         Kernel driver in use: xhci_hcd


PME signaling is correctly enabled:
Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-

> If so, what
> distinguishes "runtime suspended D0" from "normal fully operational
> D0”?

The xHC’s own runtime suspend routine is called, but PCI core’s runtime  
suspend routine decides it should stay at D0.
So it’s technically runtime suspended to D0.

Kai-Heng

>
>> To avoid this problematic situation, we should avoid runtime suspend if
>> D0 is the only state that can wake up the device.
>>
>> [1]  
>> https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/device-working-state-d0
>>
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>> ---
>>  drivers/pci/pci-driver.c | 5 +++++
>>  drivers/pci/pci.c        | 2 +-
>>  include/linux/pci.h      | 3 +++
>>  3 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index cae630fe6387..15a6310c5d7b 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -1251,6 +1251,11 @@ static int pci_pm_runtime_suspend(struct device  
>> *dev)
>>  		return 0;
>>  	}
>>
>> +	if (pci_target_state(pci_dev, device_can_wakeup(dev)) == PCI_D0) {
>> +		dev_dbg(dev, "D0 doesn't have wakeup capability\n");
>> +		return -EBUSY;
>> +	}
>> +
>>  	pci_dev->state_saved = false;
>>  	if (pm && pm->runtime_suspend) {
>>  		error = pm->runtime_suspend(dev);
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 8abc843b1615..ceee6efbbcfe 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -2294,7 +2294,7 @@ EXPORT_SYMBOL(pci_wake_from_d3);
>>   * If the platform can't manage @dev, return the deepest state from which it
>>   * can generate wake events, based on any available PME info.
>>   */
>> -static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
>> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
>>  {
>>  	pci_power_t target_state = PCI_D3hot;
>>
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 4a5a84d7bdd4..91e8dc4d04aa 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -1188,6 +1188,7 @@ bool pci_pme_capable(struct pci_dev *dev,  
>> pci_power_t state);
>>  void pci_pme_active(struct pci_dev *dev, bool enable);
>>  int pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable);
>>  int pci_wake_from_d3(struct pci_dev *dev, bool enable);
>> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup);
>>  int pci_prepare_to_sleep(struct pci_dev *dev);
>>  int pci_back_from_sleep(struct pci_dev *dev);
>>  bool pci_dev_run_wake(struct pci_dev *dev);
>> @@ -1672,6 +1673,8 @@ static inline int pci_set_power_state(struct  
>> pci_dev *dev, pci_power_t state)
>>  { return 0; }
>>  static inline int pci_wake_from_d3(struct pci_dev *dev, bool enable)
>>  { return 0; }
>> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
>> +{ return PCI_D0; }
>>  static inline pci_power_t pci_choose_state(struct pci_dev *dev,
>>  					   pm_message_t state)
>>  { return PCI_D0; }
>> -- 
>> 2.17.1



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-22  3:42   ` Kai Heng Feng
@ 2019-05-22 13:48     ` Bjorn Helgaas
  2019-05-22 15:46       ` Kai Heng Feng
  0 siblings, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2019-05-22 13:48 UTC (permalink / raw)
  To: Kai Heng Feng; +Cc: Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> > > There's an xHC device that doesn't wake when a USB device gets plugged
> > > to its USB port. The driver's own runtime suspend callback was called,
> > > PME signaling was enabled, but it stays at PCI D0.
> > 
> > This looks like it's fixing a bug?  If so, please include a link to
> > the bug report, and make sure the bug report has "lspci -vv" output
> > attached to it.

I see your bug report (https://bugzilla.kernel.org/show_bug.cgi?id=203673)
but it doesn't say anything about what this looks like to a user.
Presumably somebody complained about something not working.  The bug
report should include that information about what isn't working.
Ideally, a user experiencing a problem should be able to google for
the symptoms and find the bug report.

> > > A PCI device can be runtime suspended to D0 when it supports D0 PME and
> > > its _S0W reports D0. Theoratically this should work, but as [1]
> > > specifies, D0 doesn't have wakeup capability.
> > 
> > What does "runtime suspended to D0" mean?

If I understand correctly, Linux normally clears PME-Enable while
devices are in D0, but sets PME-Enable if a device is "runtime
suspended to D0".

I'm not sure I'd describe that as "suspended", since the power
management state is exactly D0 and the only difference is that a PME
interrupt is enabled.  It sounds to me like the xHCI controller is
basically using PME as a hotplug interrupt to say "something happened
on my secondary (USB) interface".  But that's more a question for
Rafael.

And I guess this patch basically means we wouldn't call the driver's
suspend callback if we're merely going to stay at D0, so the driver
would have no idea anything happened.  That might match
Documentation/power/pci.txt better, because it suggests that the
suspend callback is related to putting a device in a low-power state,
and D0 is not a low-power state.

Maybe we should also update Documentation/power/pci.txt to say "PCI
devices ... can be programmed to generate PMEs while in any state
(D0-D3)" instead of "a low-power state (D1-D3)".

Anyway, this is all Rafael's area, so I'll defer to him.

> > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > ---
> > >  drivers/pci/pci-driver.c | 5 +++++
> > >  drivers/pci/pci.c        | 2 +-
> > >  include/linux/pci.h      | 3 +++
> > >  3 files changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > > index cae630fe6387..15a6310c5d7b 100644
> > > --- a/drivers/pci/pci-driver.c
> > > +++ b/drivers/pci/pci-driver.c
> > > @@ -1251,6 +1251,11 @@ static int pci_pm_runtime_suspend(struct
> > > device *dev)
> > >  		return 0;
> > >  	}
> > > 
> > > +	if (pci_target_state(pci_dev, device_can_wakeup(dev)) == PCI_D0) {
> > > +		dev_dbg(dev, "D0 doesn't have wakeup capability\n");
> > > +		return -EBUSY;
> > > +	}
> > > +
> > >  	pci_dev->state_saved = false;
> > >  	if (pm && pm->runtime_suspend) {
> > >  		error = pm->runtime_suspend(dev);
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 8abc843b1615..ceee6efbbcfe 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -2294,7 +2294,7 @@ EXPORT_SYMBOL(pci_wake_from_d3);
> > >   * If the platform can't manage @dev, return the deepest state from which it
> > >   * can generate wake events, based on any available PME info.
> > >   */
> > > -static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
> > > +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
> > >  {
> > >  	pci_power_t target_state = PCI_D3hot;
> > > 
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 4a5a84d7bdd4..91e8dc4d04aa 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -1188,6 +1188,7 @@ bool pci_pme_capable(struct pci_dev *dev,
> > > pci_power_t state);
> > >  void pci_pme_active(struct pci_dev *dev, bool enable);
> > >  int pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable);
> > >  int pci_wake_from_d3(struct pci_dev *dev, bool enable);
> > > +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup);
> > >  int pci_prepare_to_sleep(struct pci_dev *dev);
> > >  int pci_back_from_sleep(struct pci_dev *dev);
> > >  bool pci_dev_run_wake(struct pci_dev *dev);
> > > @@ -1672,6 +1673,8 @@ static inline int pci_set_power_state(struct
> > > pci_dev *dev, pci_power_t state)
> > >  { return 0; }
> > >  static inline int pci_wake_from_d3(struct pci_dev *dev, bool enable)
> > >  { return 0; }
> > > +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
> > > +{ return PCI_D0; }
> > >  static inline pci_power_t pci_choose_state(struct pci_dev *dev,
> > >  					   pm_message_t state)
> > >  { return PCI_D0; }
> > > -- 
> > > 2.17.1
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-22 13:48     ` Bjorn Helgaas
@ 2019-05-22 15:46       ` Kai Heng Feng
  2019-05-22 18:11         ` Bjorn Helgaas
  0 siblings, 1 reply; 18+ messages in thread
From: Kai Heng Feng @ 2019-05-22 15:46 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb



> On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
>> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
>>>> There's an xHC device that doesn't wake when a USB device gets plugged
>>>> to its USB port. The driver's own runtime suspend callback was called,
>>>> PME signaling was enabled, but it stays at PCI D0.
>>> 
>>> This looks like it's fixing a bug?  If so, please include a link to
>>> the bug report, and make sure the bug report has "lspci -vv" output
>>> attached to it.
> 
> I see your bug report (https://bugzilla.kernel.org/show_bug.cgi?id=203673)
> but it doesn't say anything about what this looks like to a user.
> Presumably somebody complained about something not working.  The bug
> report should include that information about what isn't working.
> Ideally, a user experiencing a problem should be able to google for
> the symptoms and find the bug report.

Sorry about that. I’ve added a comment to describe the symptom.

> 
>>>> A PCI device can be runtime suspended to D0 when it supports D0 PME and
>>>> its _S0W reports D0. Theoratically this should work, but as [1]
>>>> specifies, D0 doesn't have wakeup capability.
>>> 
>>> What does "runtime suspended to D0" mean?
> 
> If I understand correctly, Linux normally clears PME-Enable while
> devices are in D0, but sets PME-Enable if a device is "runtime
> suspended to D0”.

Yes, this is what happens here.

> 
> I'm not sure I'd describe that as "suspended", since the power
> management state is exactly D0 and the only difference is that a PME
> interrupt is enabled.  It sounds to me like the xHCI controller is
> basically using PME as a hotplug interrupt to say "something happened
> on my secondary (USB) interface".  But that's more a question for
> Rafael.

Runtime suspend routines are called successfully, so I think it’s still logically suspended.

> 
> And I guess this patch basically means we wouldn't call the driver's
> suspend callback if we're merely going to stay at D0, so the driver
> would have no idea anything happened.  That might match
> Documentation/power/pci.txt better, because it suggests that the
> suspend callback is related to putting a device in a low-power state,
> and D0 is not a low-power state.

Yes, the patch is to let the device stay at D0 and don’t run driver’s own runtime suspend routine.

I guess I’ll just proceed to send a V2 with updated commit message?

Kai-Heng

> 
> Maybe we should also update Documentation/power/pci.txt to say "PCI
> devices ... can be programmed to generate PMEs while in any state
> (D0-D3)" instead of "a low-power state (D1-D3)”.
> 
> Anyway, this is all Rafael's area, so I'll defer to him.
> 
>>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>>> ---
>>>> drivers/pci/pci-driver.c | 5 +++++
>>>> drivers/pci/pci.c        | 2 +-
>>>> include/linux/pci.h      | 3 +++
>>>> 3 files changed, 9 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>>>> index cae630fe6387..15a6310c5d7b 100644
>>>> --- a/drivers/pci/pci-driver.c
>>>> +++ b/drivers/pci/pci-driver.c
>>>> @@ -1251,6 +1251,11 @@ static int pci_pm_runtime_suspend(struct
>>>> device *dev)
>>>> 		return 0;
>>>> 	}
>>>> 
>>>> +	if (pci_target_state(pci_dev, device_can_wakeup(dev)) == PCI_D0) {
>>>> +		dev_dbg(dev, "D0 doesn't have wakeup capability\n");
>>>> +		return -EBUSY;
>>>> +	}
>>>> +
>>>> 	pci_dev->state_saved = false;
>>>> 	if (pm && pm->runtime_suspend) {
>>>> 		error = pm->runtime_suspend(dev);
>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>> index 8abc843b1615..ceee6efbbcfe 100644
>>>> --- a/drivers/pci/pci.c
>>>> +++ b/drivers/pci/pci.c
>>>> @@ -2294,7 +2294,7 @@ EXPORT_SYMBOL(pci_wake_from_d3);
>>>>  * If the platform can't manage @dev, return the deepest state from which it
>>>>  * can generate wake events, based on any available PME info.
>>>>  */
>>>> -static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
>>>> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
>>>> {
>>>> 	pci_power_t target_state = PCI_D3hot;
>>>> 
>>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>>> index 4a5a84d7bdd4..91e8dc4d04aa 100644
>>>> --- a/include/linux/pci.h
>>>> +++ b/include/linux/pci.h
>>>> @@ -1188,6 +1188,7 @@ bool pci_pme_capable(struct pci_dev *dev,
>>>> pci_power_t state);
>>>> void pci_pme_active(struct pci_dev *dev, bool enable);
>>>> int pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable);
>>>> int pci_wake_from_d3(struct pci_dev *dev, bool enable);
>>>> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup);
>>>> int pci_prepare_to_sleep(struct pci_dev *dev);
>>>> int pci_back_from_sleep(struct pci_dev *dev);
>>>> bool pci_dev_run_wake(struct pci_dev *dev);
>>>> @@ -1672,6 +1673,8 @@ static inline int pci_set_power_state(struct
>>>> pci_dev *dev, pci_power_t state)
>>>> { return 0; }
>>>> static inline int pci_wake_from_d3(struct pci_dev *dev, bool enable)
>>>> { return 0; }
>>>> +pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
>>>> +{ return PCI_D0; }
>>>> static inline pci_power_t pci_choose_state(struct pci_dev *dev,
>>>> 					   pm_message_t state)
>>>> { return PCI_D0; }
>>>> -- 
>>>> 2.17.1
>> 
>> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-22 15:46       ` Kai Heng Feng
@ 2019-05-22 18:11         ` Bjorn Helgaas
  2019-05-22 18:39           ` Alan Stern
  0 siblings, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2019-05-22 18:11 UTC (permalink / raw)
  To: Kai Heng Feng; +Cc: Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
> > On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
> >> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> >>>> There's an xHC device that doesn't wake when a USB device gets plugged
> >>>> to its USB port. The driver's own runtime suspend callback was called,
> >>>> PME signaling was enabled, but it stays at PCI D0.

> > ...
> > And I guess this patch basically means we wouldn't call the driver's
> > suspend callback if we're merely going to stay at D0, so the driver
> > would have no idea anything happened.  That might match
> > Documentation/power/pci.txt better, because it suggests that the
> > suspend callback is related to putting a device in a low-power state,
> > and D0 is not a low-power state.
> 
> Yes, the patch is to let the device stay at D0 and don’t run driver’s own
> runtime suspend routine.
> 
> I guess I’ll just proceed to send a V2 with updated commit message?

Now that I understand what "runtime suspended to D0" means, help me
understand what's actually wrong.

The PCI core apparently *does* enable PME when we "suspend to D0".
But somehow calling the xHCI runtime suspend callback makes the driver
unable to notice when the PME is signaled?

Bjorn

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-22 18:11         ` Bjorn Helgaas
@ 2019-05-22 18:39           ` Alan Stern
  2019-05-22 18:53             ` Lukas Wunner
  2019-05-22 20:52             ` Bjorn Helgaas
  0 siblings, 2 replies; 18+ messages in thread
From: Alan Stern @ 2019-05-22 18:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Kai Heng Feng, Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

On Wed, 22 May 2019, Bjorn Helgaas wrote:

> On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
> > > On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
> > >> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> > >>>> There's an xHC device that doesn't wake when a USB device gets plugged
> > >>>> to its USB port. The driver's own runtime suspend callback was called,
> > >>>> PME signaling was enabled, but it stays at PCI D0.
> 
> > > ...
> > > And I guess this patch basically means we wouldn't call the driver's
> > > suspend callback if we're merely going to stay at D0, so the driver
> > > would have no idea anything happened.  That might match
> > > Documentation/power/pci.txt better, because it suggests that the
> > > suspend callback is related to putting a device in a low-power state,
> > > and D0 is not a low-power state.
> > 
> > Yes, the patch is to let the device stay at D0 and don’t run driver’s own
> > runtime suspend routine.
> > 
> > I guess I’ll just proceed to send a V2 with updated commit message?
> 
> Now that I understand what "runtime suspended to D0" means, help me
> understand what's actually wrong.

Kai's point is that the xhci-hcd driver thinks the device is now in 
runtime suspend, because the runtime_suspend method has been executed.  
But in fact the device is still in D0, and as a result, PME signalling 
may not work correctly.

On the other hand, it wasn't clear from the patch description whether
this actually causes a problem on real systems.  The description only
said that the problem was theoretical.

> The PCI core apparently *does* enable PME when we "suspend to D0".
> But somehow calling the xHCI runtime suspend callback makes the driver
> unable to notice when the PME is signaled?

According to Kai, PME signalling doesn't work in D0 -- or at least, it
is _documented_ not to work in D0 -- even though it is enabled and the
device claims to support it.

In any case, I don't really see any point in "runtime suspending" a 
device while leaving it in D0.  We might as well just leave it alone.

Alan Stern


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-22 18:39           ` Alan Stern
@ 2019-05-22 18:53             ` Lukas Wunner
  2019-05-22 19:05               ` Kai Heng Feng
  2019-05-22 20:52             ` Bjorn Helgaas
  1 sibling, 1 reply; 18+ messages in thread
From: Lukas Wunner @ 2019-05-22 18:53 UTC (permalink / raw)
  To: Alan Stern
  Cc: Bjorn Helgaas, Kai Heng Feng, Rafael Wysocki, linux-pci, LKML,
	Mathias Nyman, linux-usb

On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
> According to Kai, PME signalling doesn't work in D0 -- or at least, it
> is _documented_ not to work in D0 -- even though it is enabled and the
> device claims to support it.
> 
> In any case, I don't really see any point in "runtime suspending" a 
> device while leaving it in D0.  We might as well just leave it alone.

There may be devices whose drivers are able to reduce power consumption
through some device-specific means when runtime suspending, even though
the device remains in PCI_D0.  The patch would cause a power regression
for those.

In particular, pci_target_state() returns PCI_D0 if the device lacks the
PM capability.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-22 18:53             ` Lukas Wunner
@ 2019-05-22 19:05               ` Kai Heng Feng
  0 siblings, 0 replies; 18+ messages in thread
From: Kai Heng Feng @ 2019-05-22 19:05 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Alan Stern, Bjorn Helgaas, Rafael Wysocki, linux-pci, LKML,
	Mathias Nyman, linux-usb



> On May 23, 2019, at 2:53 AM, Lukas Wunner <lukas@wunner.de> wrote:
> 
> On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
>> According to Kai, PME signalling doesn't work in D0 -- or at least, it
>> is _documented_ not to work in D0 -- even though it is enabled and the
>> device claims to support it.
>> 
>> In any case, I don't really see any point in "runtime suspending" a 
>> device while leaving it in D0.  We might as well just leave it alone.
> 
> There may be devices whose drivers are able to reduce power consumption
> through some device-specific means when runtime suspending, even though
> the device remains in PCI_D0.  The patch would cause a power regression
> for those.
> 
> In particular, pci_target_state() returns PCI_D0 if the device lacks the
> PM capability.

So an explicit device_can_wakeup() check before calling pci_target_state()
is needed to avoid the case you described.

I’ll add this in patch v2.

Kai-Heng

> 
> Thanks,
> 
> Lukas


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-22 18:39           ` Alan Stern
  2019-05-22 18:53             ` Lukas Wunner
@ 2019-05-22 20:52             ` Bjorn Helgaas
  2019-05-23  4:39               ` Kai-Heng Feng
  1 sibling, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2019-05-22 20:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kai Heng Feng, Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
> On Wed, 22 May 2019, Bjorn Helgaas wrote:
> > On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
> > > > On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
> > > >> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > >>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> > > >>>> There's an xHC device that doesn't wake when a USB device gets plugged
> > > >>>> to its USB port. The driver's own runtime suspend callback was called,
> > > >>>> PME signaling was enabled, but it stays at PCI D0.
> > 
> > > > ...
> > > > And I guess this patch basically means we wouldn't call the driver's
> > > > suspend callback if we're merely going to stay at D0, so the driver
> > > > would have no idea anything happened.  That might match
> > > > Documentation/power/pci.txt better, because it suggests that the
> > > > suspend callback is related to putting a device in a low-power state,
> > > > and D0 is not a low-power state.
> > > 
> > > Yes, the patch is to let the device stay at D0 and don’t run driver’s own
> > > runtime suspend routine.
> > > 
> > > I guess I’ll just proceed to send a V2 with updated commit message?
> > 
> > Now that I understand what "runtime suspended to D0" means, help me
> > understand what's actually wrong.
> 
> Kai's point is that the xhci-hcd driver thinks the device is now in 
> runtime suspend, because the runtime_suspend method has been executed.  
> But in fact the device is still in D0, and as a result, PME signalling 
> may not work correctly.

The device claims to be able to signal PME from D0 (this is from the lspci
in https://bugzilla.kernel.org/show_bug.cgi?id=203673):

  00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) (prog-if 30 [XHCI])
    Capabilities: [50] Power Management version 3
      Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)

From the xHCI spec r1.0, sec 4.15.2.3, it looks like a connect
detected while in D0 should assert PME# if enabled (and WCE is set).

> On the other hand, it wasn't clear from the patch description whether
> this actually causes a problem on real systems.  The description only
> said that the problem was theoretical.

Kai did say nothing happens when hot-adding a USB device, so I think
there really is a problem.  This should be an obvious problem that
lots of people would trip over, so I expect there should be reports in
launchpad, etc.  I'd really like to have those bread crumbs.  Kai, can
you add a complete dmesg log to the bugzilla?  Hints from the log,
like the platform name, can help find related reports.

> > The PCI core apparently *does* enable PME when we "suspend to D0".
> > But somehow calling the xHCI runtime suspend callback makes the
> > driver unable to notice when the PME is signaled?
> 
> According to Kai, PME signalling doesn't work in D0 -- or at least,
> it is _documented_ not to work in D0 -- even though it is enabled
> and the device claims to support it.

I didn't understand this part.  From a PCI perspective, PME signaling
while in D0 is an optional feature and should work if the device
advertises support for it.  If it doesn't work on this device, we
should have a quirk to indicate that.

But I thought Kai said the device *can* signal PME from D0, but for
some reason we don't handle it correctly if we have called the xHCI
suspend callback.

That's the part I don't understand.  Is this an xHCI driver issue?
Should the suspend callback do something different if we're staying in
D0?  I'm not sure the callback even knows what Dx state we're going
to.

Bjorn

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-22 20:52             ` Bjorn Helgaas
@ 2019-05-23  4:39               ` Kai-Heng Feng
  2019-05-27 16:57                 ` Bjorn Helgaas
  0 siblings, 1 reply; 18+ messages in thread
From: Kai-Heng Feng @ 2019-05-23  4:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alan Stern, Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

at 04:52, Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
>> On Wed, 22 May 2019, Bjorn Helgaas wrote:
>>> On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
>>>>> On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>> On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
>>>>>> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>>>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
>>>>>>>> There's an xHC device that doesn't wake when a USB device gets  
>>>>>>>> plugged
>>>>>>>> to its USB port. The driver's own runtime suspend callback was  
>>>>>>>> called,
>>>>>>>> PME signaling was enabled, but it stays at PCI D0.
>>>
>>>>> ...
>>>>> And I guess this patch basically means we wouldn't call the driver's
>>>>> suspend callback if we're merely going to stay at D0, so the driver
>>>>> would have no idea anything happened.  That might match
>>>>> Documentation/power/pci.txt better, because it suggests that the
>>>>> suspend callback is related to putting a device in a low-power state,
>>>>> and D0 is not a low-power state.
>>>>
>>>> Yes, the patch is to let the device stay at D0 and don’t run driver’s  
>>>> own
>>>> runtime suspend routine.
>>>>
>>>> I guess I’ll just proceed to send a V2 with updated commit message?
>>>
>>> Now that I understand what "runtime suspended to D0" means, help me
>>> understand what's actually wrong.
>>
>> Kai's point is that the xhci-hcd driver thinks the device is now in
>> runtime suspend, because the runtime_suspend method has been executed.
>> But in fact the device is still in D0, and as a result, PME signalling
>> may not work correctly.
>
> The device claims to be able to signal PME from D0 (this is from the lspci
> in https://bugzilla.kernel.org/show_bug.cgi?id=203673):
>
>   00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) (prog-if 30 [XHCI])
>     Capabilities: [50] Power Management version 3
>       Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>
> From the xHCI spec r1.0, sec 4.15.2.3, it looks like a connect
> detected while in D0 should assert PME# if enabled (and WCE is set).

I think section 4.15.2.3 is about S3 wake up, no S0 we are discussing here.

>
>> On the other hand, it wasn't clear from the patch description whether
>> this actually causes a problem on real systems.  The description only
>> said that the problem was theoretical.
>
> Kai did say nothing happens when hot-adding a USB device, so I think
> there really is a problem.  This should be an obvious problem that
> lots of people would trip over, so I expect there should be reports in
> launchpad, etc.  I'd really like to have those bread crumbs.  Kai, can
> you add a complete dmesg log to the bugzilla?  Hints from the log,
> like the platform name, can help find related reports.

It’s a platform in development so the name can’t be disclosed.

>
>>> The PCI core apparently *does* enable PME when we "suspend to D0".
>>> But somehow calling the xHCI runtime suspend callback makes the
>>> driver unable to notice when the PME is signaled?
>>
>> According to Kai, PME signalling doesn't work in D0 -- or at least,
>> it is _documented_ not to work in D0 -- even though it is enabled
>> and the device claims to support it.
>
> I didn't understand this part.  From a PCI perspective, PME signaling
> while in D0 is an optional feature and should work if the device
> advertises support for it.  If it doesn't work on this device, we
> should have a quirk to indicate that.

The only document I can find is the "Device Working State D0” from Microsoft.
It says:
"As a best practice, the driver should configure the device to generate  
interrupts only when the device is in D0, and to generate wake signals only  
when the device is in a low-power Dx state.”

Wake-up capability
Not applicable.

Unfortunately PCI spec isn’t publicly available so I can only refer to  
Microsoft document.

>
> But I thought Kai said the device *can* signal PME from D0, but for
> some reason we don't handle it correctly if we have called the xHCI
> suspend callback.

Sorry, what I meant is PME signaling is enabled, i.e.
"Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-“

But no signal was actually regenerated when USB device gets plugged to the  
port.
So there’s no wake up event to let PCI know it should runtime resume the  
device.

>
> That's the part I don't understand.  Is this an xHCI driver issue?
> Should the suspend callback do something different if we're staying in
> D0?  I'm not sure the callback even knows what Dx state we're going
> to.

As there’s no PME signal to wakeup event to signal PCI to runtime resume, I  
don’t think it’s an xHCI bug.

Kai-Heng


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-23  4:39               ` Kai-Heng Feng
@ 2019-05-27 16:57                 ` Bjorn Helgaas
  2019-06-05 11:57                   ` Bjorn Helgaas
  0 siblings, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2019-05-27 16:57 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Alan Stern, Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

On Thu, May 23, 2019 at 12:39:23PM +0800, Kai-Heng Feng wrote:
> at 04:52, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
> > > On Wed, 22 May 2019, Bjorn Helgaas wrote:
> > > > On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
> > > > > > On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
> > > > > > > at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > > > On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> > > > > > > > > There's an xHC device that doesn't wake when
> > > > > > > > > a USB device gets plugged
> > > > > > > > > to its USB port. The driver's own runtime
> > > > > > > > > suspend callback was called,
> > > > > > > > > PME signaling was enabled, but it stays at PCI D0.
> > > > 
> > > > > > ...
> > > > > > And I guess this patch basically means we wouldn't call
> > > > > > the driver's suspend callback if we're merely going to
> > > > > > stay at D0, so the driver would have no idea anything
> > > > > > happened.  That might match Documentation/power/pci.txt
> > > > > > better, because it suggests that the suspend callback is
> > > > > > related to putting a device in a low-power state, and D0
> > > > > > is not a low-power state.
> > > > > 
> > > > > Yes, the patch is to let the device stay at D0 and don’t run
> > > > > driver’s own runtime suspend routine.
> > > > > 
> > > > > I guess I’ll just proceed to send a V2 with updated commit message?
> > > > 
> > > > Now that I understand what "runtime suspended to D0" means, help me
> > > > understand what's actually wrong.
> > > 
> > > Kai's point is that the xhci-hcd driver thinks the device is now
> > > in runtime suspend, because the runtime_suspend method has been
> > > executed.  But in fact the device is still in D0, and as a
> > > result, PME signalling may not work correctly.
> > 
> > The device claims to be able to signal PME from D0 (this is from the lspci
> > in https://bugzilla.kernel.org/show_bug.cgi?id=203673):
> > 
> >   00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) (prog-if 30 [XHCI])
> >     Capabilities: [50] Power Management version 3
> >       Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> > 
> > From the xHCI spec r1.0, sec 4.15.2.3, it looks like a connect
> > detected while in D0 should assert PME# if enabled (and WCE is
> > set).
> 
> I think section 4.15.2.3 is about S3 wake up, no S0 we are
> discussing here.

S0 and S3 are system-level ideas and have no meaning to an individual
PCI device.  The xHC is a PCI device and can't tell whether the system
as a whole is in S0 or S3.  If a PCI device claims to be able to
generate PME while in D0, that applies regardless of the system state.

xHCI r1.0, sec A.1 says "The host controller should be capable of
asserting PME# when in any supported device state."  In sec 4.19.2,
Figure 42 says PME# should be asserted whenever PMCSR.PME_En=1 and
WCE=1 and a connection is detected.

Figure 42 also shows that CSC (Connect Status Change) and related bits
feed into Port Status Change Event Generation.  So I assume the xhci
driver normally detects connect/disconnect via CSC, but the runtime
suspend method makes it use PME# instead?

And the way your patch works is by avoiding that xhci runtime suspend
method, so it *always* uses CSC and never uses PME#?  If that's the
case, we're just papering over a problem without really understanding
it.

I'm wondering if this platform has a firmware defect.  Here's my
thinking.  The xHC is a Root Complex Integrated Endpoint, so its PME
signaling is a little unusual.

The typical scenario is that a PCIe device is below a Root Port.  In
that case, it would send a PME Message upstream to the Root Port.  Per
PCIe r4.0, sec 6.1.6, when configured for native PME support (for ACPI
systems, I assume this means "when firmware has granted PME control to
the OS via _OSC"), the Root Port would generate a normal PCI INTx or
MSI interrupt:

  PCI Express-aware software can enable a mode where the Root Complex
  signals PME via an interrupt. When configured for native PME
  support, a Root Port receives the PME Message and sets the PME
  Status bit in its Root Status register. If software has set the PME
  Interrupt Enable bit in the Root Control register to 1b, the Root
  Port then generates an interrupt.

But on this platform the xHC is a Root Complex Integrated Endpoint, so
there is no Root Port upstream from it, and that mechanism can't be
used.  Per PCIe r4.0, sec 1.3.2.3, RCiEPs signal PME via "the same
mechanism as PCI systems" or via Root Complex Event Collectors:

  An RCiEP must signal PME and error conditions through the same
  mechanisms used on PCI systems. If a Root Complex Event Collector is
  implemented, an RCiEP may optionally signal PME and error conditions
  through a Root Complex Event Collector.

This platform has no Root Complex Event Collectors, so the xHC should
signal PME via the same mechanism as PCI systems, i.e., asserting a
PME# signal.  I think this means the OS cannot use native PCIe PME
control because it doesn't know what interrupt PME# is connected to.
The PCI Firmware Spec r3.2, sec 4.5.1 (also quoted in ACPI v6.2, sec
6.2.11.3), says:

  PCI Express Native Power Management Events control

  The firmware sets this bit to 1 to grant control over control over
  PCI Express native power management event interrupts (PMEs). If
  firmware allows the operating system control of this feature, then
  in the context of the _OSC method, it must ensure that all PMEs are
  routed to root port interrupts as described in the PCI Express Base
  Specification.

This platform cannot route all PMEs to Root Port interrupts because
the xHC RCiEP cannot report PME via a Root Port, so I think its _OSC
method should not grant control of PCIe Native Power Management Events
to the OS, and I think that would mean we have to use the ACPI
mechanism for PME on this platform.

Can you confirm or deny any of this line of reasoning?  I'm wondering
if there's something wrong with the platform's _OSC, so Linux thinks
it can use native PME, but that doesn't work for this device.

> It’s a platform in development so the name can’t be disclosed.

Please attach a complete dmesg log to the bugzilla.  You can remove
identifying details like the platform name, but I want to see the
results of the _OSC negotiation.

> > > According to Kai, PME signalling doesn't work in D0 -- or at
> > > least, it is _documented_ not to work in D0 -- even though it is
> > > enabled and the device claims to support it.
> > 
> > I didn't understand this part.  From a PCI perspective, PME
> > signaling while in D0 is an optional feature and should work if
> > the device advertises support for it.  If it doesn't work on this
> > device, we should have a quirk to indicate that.
> 
> The only document I can find is the "Device Working State D0” from
> Microsoft. It says:
> "As a best practice, the driver should configure the device to
> generate interrupts only when the device is in D0, and to generate
> wake signals only when the device is in a low-power Dx state.”

Without some details about what is common between Windows and Linux, I
don't think a Windows "best practice" for drivers is really applicable
to Linux.  The paragraph you quoted from talks about simplifying
interrupt and wake code paths, not hardware capabilities.

Bjorn

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-05-27 16:57                 ` Bjorn Helgaas
@ 2019-06-05 11:57                   ` Bjorn Helgaas
  2019-07-05  7:02                     ` Kai-Heng Feng
  0 siblings, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2019-06-05 11:57 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Alan Stern, Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

On Mon, May 27, 2019 at 11:57:47AM -0500, Bjorn Helgaas wrote:
> On Thu, May 23, 2019 at 12:39:23PM +0800, Kai-Heng Feng wrote:
> > at 04:52, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
> > > > On Wed, 22 May 2019, Bjorn Helgaas wrote:
> > > > > On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
> > > > > > > On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > > On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
> > > > > > > > at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > > > > On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> > > > > > > > > > There's an xHC device that doesn't wake when
> > > > > > > > > > a USB device gets plugged
> > > > > > > > > > to its USB port. The driver's own runtime
> > > > > > > > > > suspend callback was called,
> > > > > > > > > > PME signaling was enabled, but it stays at PCI D0.
> > > > > 
> > > > > > > ...
> > > > > > > And I guess this patch basically means we wouldn't call
> > > > > > > the driver's suspend callback if we're merely going to
> > > > > > > stay at D0, so the driver would have no idea anything
> > > > > > > happened.  That might match Documentation/power/pci.txt
> > > > > > > better, because it suggests that the suspend callback is
> > > > > > > related to putting a device in a low-power state, and D0
> > > > > > > is not a low-power state.
> > > > > > 
> > > > > > Yes, the patch is to let the device stay at D0 and don’t run
> > > > > > driver’s own runtime suspend routine.
> > > > > > 
> > > > > > I guess I’ll just proceed to send a V2 with updated commit message?
> > > > > 
> > > > > Now that I understand what "runtime suspended to D0" means, help me
> > > > > understand what's actually wrong.
> > > > 
> > > > Kai's point is that the xhci-hcd driver thinks the device is now
> > > > in runtime suspend, because the runtime_suspend method has been
> > > > executed.  But in fact the device is still in D0, and as a
> > > > result, PME signalling may not work correctly.
> > > 
> > > The device claims to be able to signal PME from D0 (this is from the lspci
> > > in https://bugzilla.kernel.org/show_bug.cgi?id=203673):
> > > 
> > >   00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) (prog-if 30 [XHCI])
> > >     Capabilities: [50] Power Management version 3
> > >       Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> > > 
> > > From the xHCI spec r1.0, sec 4.15.2.3, it looks like a connect
> > > detected while in D0 should assert PME# if enabled (and WCE is
> > > set).
> > 
> > I think section 4.15.2.3 is about S3 wake up, no S0 we are
> > discussing here.
> 
> S0 and S3 are system-level ideas and have no meaning to an individual
> PCI device.  The xHC is a PCI device and can't tell whether the system
> as a whole is in S0 or S3.  If a PCI device claims to be able to
> generate PME while in D0, that applies regardless of the system state.
> 
> xHCI r1.0, sec A.1 says "The host controller should be capable of
> asserting PME# when in any supported device state."  In sec 4.19.2,
> Figure 42 says PME# should be asserted whenever PMCSR.PME_En=1 and
> WCE=1 and a connection is detected.
> 
> Figure 42 also shows that CSC (Connect Status Change) and related bits
> feed into Port Status Change Event Generation.  So I assume the xhci
> driver normally detects connect/disconnect via CSC, but the runtime
> suspend method makes it use PME# instead?
> 
> And the way your patch works is by avoiding that xhci runtime suspend
> method, so it *always* uses CSC and never uses PME#?  If that's the
> case, we're just papering over a problem without really understanding
> it.
> 
> I'm wondering if this platform has a firmware defect.  Here's my
> thinking.  The xHC is a Root Complex Integrated Endpoint, so its PME
> signaling is a little unusual.
> 
> The typical scenario is that a PCIe device is below a Root Port.  In
> that case, it would send a PME Message upstream to the Root Port.  Per
> PCIe r4.0, sec 6.1.6, when configured for native PME support (for ACPI
> systems, I assume this means "when firmware has granted PME control to
> the OS via _OSC"), the Root Port would generate a normal PCI INTx or
> MSI interrupt:
> 
>   PCI Express-aware software can enable a mode where the Root Complex
>   signals PME via an interrupt. When configured for native PME
>   support, a Root Port receives the PME Message and sets the PME
>   Status bit in its Root Status register. If software has set the PME
>   Interrupt Enable bit in the Root Control register to 1b, the Root
>   Port then generates an interrupt.
> 
> But on this platform the xHC is a Root Complex Integrated Endpoint, so
> there is no Root Port upstream from it, and that mechanism can't be
> used.  Per PCIe r4.0, sec 1.3.2.3, RCiEPs signal PME via "the same
> mechanism as PCI systems" or via Root Complex Event Collectors:
> 
>   An RCiEP must signal PME and error conditions through the same
>   mechanisms used on PCI systems. If a Root Complex Event Collector is
>   implemented, an RCiEP may optionally signal PME and error conditions
>   through a Root Complex Event Collector.
> 
> This platform has no Root Complex Event Collectors, so the xHC should
> signal PME via the same mechanism as PCI systems, i.e., asserting a
> PME# signal.  I think this means the OS cannot use native PCIe PME
> control because it doesn't know what interrupt PME# is connected to.
> The PCI Firmware Spec r3.2, sec 4.5.1 (also quoted in ACPI v6.2, sec
> 6.2.11.3), says:
> 
>   PCI Express Native Power Management Events control
> 
>   The firmware sets this bit to 1 to grant control over PCI Express
>   native power management event interrupts (PMEs). If firmware
>   allows the operating system control of this feature, then in the
>   context of the _OSC method, it must ensure that all PMEs are
>   routed to root port interrupts as described in the PCI Express
>   Base Specification.
> 
> This platform cannot route all PMEs to Root Port interrupts because
> the xHC RCiEP cannot report PME via a Root Port, so I think its _OSC
> method should not grant control of PCIe Native Power Management Events
> to the OS, and I think that would mean we have to use the ACPI
> mechanism for PME on this platform.
> 
> Can you confirm or deny any of this line of reasoning?  I'm wondering
> if there's something wrong with the platform's _OSC, so Linux thinks
> it can use native PME, but that doesn't work for this device.
> 
> > It’s a platform in development so the name can’t be disclosed.
> 
> Please attach a complete dmesg log to the bugzilla.  You can remove
> identifying details like the platform name, but I want to see the
> results of the _OSC negotiation.

Thanks for the dmesg log
(https://bugzilla.kernel.org/attachment.cgi?id=283109).  It shows:

  acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
  acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug LTR]
  acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]

I think it is incorrect for the platform to give the OS native control
over PME because the OS has no way to know how the RCiEP PMEs are
routed.  But it would be interesting to know how BIOSes on other
platforms with RCiEPs handle this, and I did post a question to the
PCI-SIG to see if there's any guidance there.

Bjorn

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-06-05 11:57                   ` Bjorn Helgaas
@ 2019-07-05  7:02                     ` Kai-Heng Feng
  2019-07-05  9:39                       ` Rafael J. Wysocki
  2019-07-09 13:45                       ` Bjorn Helgaas
  0 siblings, 2 replies; 18+ messages in thread
From: Kai-Heng Feng @ 2019-07-05  7:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alan Stern, Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

at 19:57, Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Mon, May 27, 2019 at 11:57:47AM -0500, Bjorn Helgaas wrote:
>> On Thu, May 23, 2019 at 12:39:23PM +0800, Kai-Heng Feng wrote:
>>> at 04:52, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>> On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
>>>>> On Wed, 22 May 2019, Bjorn Helgaas wrote:
>>>>>> On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
>>>>>>>> On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org>  
>>>>>>>> wrote:
>>>>>>>> On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
>>>>>>>>> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>>>>>>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
>>>>>>>>>>> There's an xHC device that doesn't wake when
>>>>>>>>>>> a USB device gets plugged
>>>>>>>>>>> to its USB port. The driver's own runtime
>>>>>>>>>>> suspend callback was called,
>>>>>>>>>>> PME signaling was enabled, but it stays at PCI D0.
>>>>>>
>>>>>>>> ...
>>>>>>>> And I guess this patch basically means we wouldn't call
>>>>>>>> the driver's suspend callback if we're merely going to
>>>>>>>> stay at D0, so the driver would have no idea anything
>>>>>>>> happened.  That might match Documentation/power/pci.txt
>>>>>>>> better, because it suggests that the suspend callback is
>>>>>>>> related to putting a device in a low-power state, and D0
>>>>>>>> is not a low-power state.
>>>>>>>
>>>>>>> Yes, the patch is to let the device stay at D0 and don’t run
>>>>>>> driver’s own runtime suspend routine.
>>>>>>>
>>>>>>> I guess I’ll just proceed to send a V2 with updated commit message?
>>>>>>
>>>>>> Now that I understand what "runtime suspended to D0" means, help me
>>>>>> understand what's actually wrong.
>>>>>
>>>>> Kai's point is that the xhci-hcd driver thinks the device is now
>>>>> in runtime suspend, because the runtime_suspend method has been
>>>>> executed.  But in fact the device is still in D0, and as a
>>>>> result, PME signalling may not work correctly.
>>>>
>>>> The device claims to be able to signal PME from D0 (this is from the  
>>>> lspci
>>>> in https://bugzilla.kernel.org/show_bug.cgi?id=203673):
>>>>
>>>>   00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) (prog-if 30 [XHCI])
>>>>     Capabilities: [50] Power Management version 3
>>>>       Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>>>
>>>> From the xHCI spec r1.0, sec 4.15.2.3, it looks like a connect
>>>> detected while in D0 should assert PME# if enabled (and WCE is
>>>> set).
>>>
>>> I think section 4.15.2.3 is about S3 wake up, no S0 we are
>>> discussing here.
>>
>> S0 and S3 are system-level ideas and have no meaning to an individual
>> PCI device.  The xHC is a PCI device and can't tell whether the system
>> as a whole is in S0 or S3.  If a PCI device claims to be able to
>> generate PME while in D0, that applies regardless of the system state.
>>
>> xHCI r1.0, sec A.1 says "The host controller should be capable of
>> asserting PME# when in any supported device state."  In sec 4.19.2,
>> Figure 42 says PME# should be asserted whenever PMCSR.PME_En=1 and
>> WCE=1 and a connection is detected.
>>
>> Figure 42 also shows that CSC (Connect Status Change) and related bits
>> feed into Port Status Change Event Generation.  So I assume the xhci
>> driver normally detects connect/disconnect via CSC, but the runtime
>> suspend method makes it use PME# instead?
>>
>> And the way your patch works is by avoiding that xhci runtime suspend
>> method, so it *always* uses CSC and never uses PME#?  If that's the
>> case, we're just papering over a problem without really understanding
>> it.
>>
>> I'm wondering if this platform has a firmware defect.  Here's my
>> thinking.  The xHC is a Root Complex Integrated Endpoint, so its PME
>> signaling is a little unusual.
>>
>> The typical scenario is that a PCIe device is below a Root Port.  In
>> that case, it would send a PME Message upstream to the Root Port.  Per
>> PCIe r4.0, sec 6.1.6, when configured for native PME support (for ACPI
>> systems, I assume this means "when firmware has granted PME control to
>> the OS via _OSC"), the Root Port would generate a normal PCI INTx or
>> MSI interrupt:
>>
>>   PCI Express-aware software can enable a mode where the Root Complex
>>   signals PME via an interrupt. When configured for native PME
>>   support, a Root Port receives the PME Message and sets the PME
>>   Status bit in its Root Status register. If software has set the PME
>>   Interrupt Enable bit in the Root Control register to 1b, the Root
>>   Port then generates an interrupt.
>>
>> But on this platform the xHC is a Root Complex Integrated Endpoint, so
>> there is no Root Port upstream from it, and that mechanism can't be
>> used.  Per PCIe r4.0, sec 1.3.2.3, RCiEPs signal PME via "the same
>> mechanism as PCI systems" or via Root Complex Event Collectors:
>>
>>   An RCiEP must signal PME and error conditions through the same
>>   mechanisms used on PCI systems. If a Root Complex Event Collector is
>>   implemented, an RCiEP may optionally signal PME and error conditions
>>   through a Root Complex Event Collector.
>>
>> This platform has no Root Complex Event Collectors, so the xHC should
>> signal PME via the same mechanism as PCI systems, i.e., asserting a
>> PME# signal.  I think this means the OS cannot use native PCIe PME
>> control because it doesn't know what interrupt PME# is connected to.
>> The PCI Firmware Spec r3.2, sec 4.5.1 (also quoted in ACPI v6.2, sec
>> 6.2.11.3), says:
>>
>>   PCI Express Native Power Management Events control
>>
>>   The firmware sets this bit to 1 to grant control over PCI Express
>>   native power management event interrupts (PMEs). If firmware
>>   allows the operating system control of this feature, then in the
>>   context of the _OSC method, it must ensure that all PMEs are
>>   routed to root port interrupts as described in the PCI Express
>>   Base Specification.
>>
>> This platform cannot route all PMEs to Root Port interrupts because
>> the xHC RCiEP cannot report PME via a Root Port, so I think its _OSC
>> method should not grant control of PCIe Native Power Management Events
>> to the OS, and I think that would mean we have to use the ACPI
>> mechanism for PME on this platform.
>>
>> Can you confirm or deny any of this line of reasoning?  I'm wondering
>> if there's something wrong with the platform's _OSC, so Linux thinks
>> it can use native PME, but that doesn't work for this device.
>>
>>> It’s a platform in development so the name can’t be disclosed.
>>
>> Please attach a complete dmesg log to the bugzilla.  You can remove
>> identifying details like the platform name, but I want to see the
>> results of the _OSC negotiation.
>
> Thanks for the dmesg log
> (https://bugzilla.kernel.org/attachment.cgi?id=283109).  It shows:
>
>   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
>   acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug LTR]
>   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
>
> I think it is incorrect for the platform to give the OS native control
> over PME because the OS has no way to know how the RCiEP PMEs are
> routed.  But it would be interesting to know how BIOSes on other
> platforms with RCiEPs handle this, and I did post a question to the
> PCI-SIG to see if there's any guidance there.

Is there any update from PCI-SIG?

I really think we don’t need wakeup capability in D0 because D0 is a  
working state.
Also, is there any real hardware which depends on D0 PME?

Kai-Heng

>
> Bjorn



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-07-05  7:02                     ` Kai-Heng Feng
@ 2019-07-05  9:39                       ` Rafael J. Wysocki
  2019-07-05 13:51                         ` Kai-Heng Feng
  2019-07-09 13:45                       ` Bjorn Helgaas
  1 sibling, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2019-07-05  9:39 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Bjorn Helgaas, Alan Stern, Rafael Wysocki, linux-pci, LKML,
	Mathias Nyman, linux-usb

On Friday, July 5, 2019 9:02:01 AM CEST Kai-Heng Feng wrote:
> at 19:57, Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> > On Mon, May 27, 2019 at 11:57:47AM -0500, Bjorn Helgaas wrote:
> >> On Thu, May 23, 2019 at 12:39:23PM +0800, Kai-Heng Feng wrote:
> >>> at 04:52, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>> On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
> >>>>> On Wed, 22 May 2019, Bjorn Helgaas wrote:
> >>>>>> On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
> >>>>>>>> On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org>  
> >>>>>>>> wrote:
> >>>>>>>> On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
> >>>>>>>>> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>>>>>>>>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> >>>>>>>>>>> There's an xHC device that doesn't wake when
> >>>>>>>>>>> a USB device gets plugged
> >>>>>>>>>>> to its USB port. The driver's own runtime
> >>>>>>>>>>> suspend callback was called,
> >>>>>>>>>>> PME signaling was enabled, but it stays at PCI D0.
> >>>>>>
> >>>>>>>> ...
> >>>>>>>> And I guess this patch basically means we wouldn't call
> >>>>>>>> the driver's suspend callback if we're merely going to
> >>>>>>>> stay at D0, so the driver would have no idea anything
> >>>>>>>> happened.  That might match Documentation/power/pci.txt
> >>>>>>>> better, because it suggests that the suspend callback is
> >>>>>>>> related to putting a device in a low-power state, and D0
> >>>>>>>> is not a low-power state.
> >>>>>>>
> >>>>>>> Yes, the patch is to let the device stay at D0 and don’t run
> >>>>>>> driver’s own runtime suspend routine.
> >>>>>>>
> >>>>>>> I guess I’ll just proceed to send a V2 with updated commit message?
> >>>>>>
> >>>>>> Now that I understand what "runtime suspended to D0" means, help me
> >>>>>> understand what's actually wrong.
> >>>>>
> >>>>> Kai's point is that the xhci-hcd driver thinks the device is now
> >>>>> in runtime suspend, because the runtime_suspend method has been
> >>>>> executed.  But in fact the device is still in D0, and as a
> >>>>> result, PME signalling may not work correctly.
> >>>>
> >>>> The device claims to be able to signal PME from D0 (this is from the  
> >>>> lspci
> >>>> in https://bugzilla.kernel.org/show_bug.cgi?id=203673):
> >>>>
> >>>>   00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) (prog-if 30 [XHCI])
> >>>>     Capabilities: [50] Power Management version 3
> >>>>       Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> >>>>
> >>>> From the xHCI spec r1.0, sec 4.15.2.3, it looks like a connect
> >>>> detected while in D0 should assert PME# if enabled (and WCE is
> >>>> set).
> >>>
> >>> I think section 4.15.2.3 is about S3 wake up, no S0 we are
> >>> discussing here.
> >>
> >> S0 and S3 are system-level ideas and have no meaning to an individual
> >> PCI device.  The xHC is a PCI device and can't tell whether the system
> >> as a whole is in S0 or S3.  If a PCI device claims to be able to
> >> generate PME while in D0, that applies regardless of the system state.
> >>
> >> xHCI r1.0, sec A.1 says "The host controller should be capable of
> >> asserting PME# when in any supported device state."  In sec 4.19.2,
> >> Figure 42 says PME# should be asserted whenever PMCSR.PME_En=1 and
> >> WCE=1 and a connection is detected.
> >>
> >> Figure 42 also shows that CSC (Connect Status Change) and related bits
> >> feed into Port Status Change Event Generation.  So I assume the xhci
> >> driver normally detects connect/disconnect via CSC, but the runtime
> >> suspend method makes it use PME# instead?
> >>
> >> And the way your patch works is by avoiding that xhci runtime suspend
> >> method, so it *always* uses CSC and never uses PME#?  If that's the
> >> case, we're just papering over a problem without really understanding
> >> it.
> >>
> >> I'm wondering if this platform has a firmware defect.  Here's my
> >> thinking.  The xHC is a Root Complex Integrated Endpoint, so its PME
> >> signaling is a little unusual.
> >>
> >> The typical scenario is that a PCIe device is below a Root Port.  In
> >> that case, it would send a PME Message upstream to the Root Port.  Per
> >> PCIe r4.0, sec 6.1.6, when configured for native PME support (for ACPI
> >> systems, I assume this means "when firmware has granted PME control to
> >> the OS via _OSC"), the Root Port would generate a normal PCI INTx or
> >> MSI interrupt:
> >>
> >>   PCI Express-aware software can enable a mode where the Root Complex
> >>   signals PME via an interrupt. When configured for native PME
> >>   support, a Root Port receives the PME Message and sets the PME
> >>   Status bit in its Root Status register. If software has set the PME
> >>   Interrupt Enable bit in the Root Control register to 1b, the Root
> >>   Port then generates an interrupt.
> >>
> >> But on this platform the xHC is a Root Complex Integrated Endpoint, so
> >> there is no Root Port upstream from it, and that mechanism can't be
> >> used.  Per PCIe r4.0, sec 1.3.2.3, RCiEPs signal PME via "the same
> >> mechanism as PCI systems" or via Root Complex Event Collectors:
> >>
> >>   An RCiEP must signal PME and error conditions through the same
> >>   mechanisms used on PCI systems. If a Root Complex Event Collector is
> >>   implemented, an RCiEP may optionally signal PME and error conditions
> >>   through a Root Complex Event Collector.
> >>
> >> This platform has no Root Complex Event Collectors, so the xHC should
> >> signal PME via the same mechanism as PCI systems, i.e., asserting a
> >> PME# signal.  I think this means the OS cannot use native PCIe PME
> >> control because it doesn't know what interrupt PME# is connected to.
> >> The PCI Firmware Spec r3.2, sec 4.5.1 (also quoted in ACPI v6.2, sec
> >> 6.2.11.3), says:
> >>
> >>   PCI Express Native Power Management Events control
> >>
> >>   The firmware sets this bit to 1 to grant control over PCI Express
> >>   native power management event interrupts (PMEs). If firmware
> >>   allows the operating system control of this feature, then in the
> >>   context of the _OSC method, it must ensure that all PMEs are
> >>   routed to root port interrupts as described in the PCI Express
> >>   Base Specification.
> >>
> >> This platform cannot route all PMEs to Root Port interrupts because
> >> the xHC RCiEP cannot report PME via a Root Port, so I think its _OSC
> >> method should not grant control of PCIe Native Power Management Events
> >> to the OS, and I think that would mean we have to use the ACPI
> >> mechanism for PME on this platform.
> >>
> >> Can you confirm or deny any of this line of reasoning?  I'm wondering
> >> if there's something wrong with the platform's _OSC, so Linux thinks
> >> it can use native PME, but that doesn't work for this device.
> >>
> >>> It’s a platform in development so the name can’t be disclosed.
> >>
> >> Please attach a complete dmesg log to the bugzilla.  You can remove
> >> identifying details like the platform name, but I want to see the
> >> results of the _OSC negotiation.
> >
> > Thanks for the dmesg log
> > (https://bugzilla.kernel.org/attachment.cgi?id=283109).  It shows:
> >
> >   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
> >   acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug LTR]
> >   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
> >
> > I think it is incorrect for the platform to give the OS native control
> > over PME because the OS has no way to know how the RCiEP PMEs are
> > routed.  But it would be interesting to know how BIOSes on other
> > platforms with RCiEPs handle this, and I did post a question to the
> > PCI-SIG to see if there's any guidance there.
> 
> Is there any update from PCI-SIG?
> 
> I really think we don’t need wakeup capability in D0 because D0 is a  
> working state.

Well, in theory, devices may stay in D0 over suspend-to-idle and they may need to
signal wakeup then.  Using PME for that would be kind of handy (if it worked) as it
would allow special handling of in-band IRQs to be avoided in that case.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-07-05  9:39                       ` Rafael J. Wysocki
@ 2019-07-05 13:51                         ` Kai-Heng Feng
  0 siblings, 0 replies; 18+ messages in thread
From: Kai-Heng Feng @ 2019-07-05 13:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, Alan Stern, Rafael Wysocki, linux-pci, LKML,
	Mathias Nyman, linux-usb

at 17:39, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:

> On Friday, July 5, 2019 9:02:01 AM CEST Kai-Heng Feng wrote:
>> at 19:57, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>
>>> On Mon, May 27, 2019 at 11:57:47AM -0500, Bjorn Helgaas wrote:
>>>> On Thu, May 23, 2019 at 12:39:23PM +0800, Kai-Heng Feng wrote:
>>>>> at 04:52, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>>> On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
>>>>>>> On Wed, 22 May 2019, Bjorn Helgaas wrote:
>>>>>>>> On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
>>>>>>>>>> On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@kernel.org>
>>>>>>>>>> wrote:
>>>>>>>>>> On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
>>>>>>>>>>> at 6:23 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>>>>>>>>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
>>>>>>>>>>>>> There's an xHC device that doesn't wake when
>>>>>>>>>>>>> a USB device gets plugged
>>>>>>>>>>>>> to its USB port. The driver's own runtime
>>>>>>>>>>>>> suspend callback was called,
>>>>>>>>>>>>> PME signaling was enabled, but it stays at PCI D0.
>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>> And I guess this patch basically means we wouldn't call
>>>>>>>>>> the driver's suspend callback if we're merely going to
>>>>>>>>>> stay at D0, so the driver would have no idea anything
>>>>>>>>>> happened.  That might match Documentation/power/pci.txt
>>>>>>>>>> better, because it suggests that the suspend callback is
>>>>>>>>>> related to putting a device in a low-power state, and D0
>>>>>>>>>> is not a low-power state.
>>>>>>>>>
>>>>>>>>> Yes, the patch is to let the device stay at D0 and don’t run
>>>>>>>>> driver’s own runtime suspend routine.
>>>>>>>>>
>>>>>>>>> I guess I’ll just proceed to send a V2 with updated commit message?
>>>>>>>>
>>>>>>>> Now that I understand what "runtime suspended to D0" means, help me
>>>>>>>> understand what's actually wrong.
>>>>>>>
>>>>>>> Kai's point is that the xhci-hcd driver thinks the device is now
>>>>>>> in runtime suspend, because the runtime_suspend method has been
>>>>>>> executed.  But in fact the device is still in D0, and as a
>>>>>>> result, PME signalling may not work correctly.
>>>>>>
>>>>>> The device claims to be able to signal PME from D0 (this is from the
>>>>>> lspci
>>>>>> in https://bugzilla.kernel.org/show_bug.cgi?id=203673):
>>>>>>
>>>>>>   00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) (prog-if 30 [XHCI])
>>>>>>     Capabilities: [50] Power Management version 3
>>>>>>       Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>>>>>
>>>>>> From the xHCI spec r1.0, sec 4.15.2.3, it looks like a connect
>>>>>> detected while in D0 should assert PME# if enabled (and WCE is
>>>>>> set).
>>>>>
>>>>> I think section 4.15.2.3 is about S3 wake up, no S0 we are
>>>>> discussing here.
>>>>
>>>> S0 and S3 are system-level ideas and have no meaning to an individual
>>>> PCI device.  The xHC is a PCI device and can't tell whether the system
>>>> as a whole is in S0 or S3.  If a PCI device claims to be able to
>>>> generate PME while in D0, that applies regardless of the system state.
>>>>
>>>> xHCI r1.0, sec A.1 says "The host controller should be capable of
>>>> asserting PME# when in any supported device state."  In sec 4.19.2,
>>>> Figure 42 says PME# should be asserted whenever PMCSR.PME_En=1 and
>>>> WCE=1 and a connection is detected.
>>>>
>>>> Figure 42 also shows that CSC (Connect Status Change) and related bits
>>>> feed into Port Status Change Event Generation.  So I assume the xhci
>>>> driver normally detects connect/disconnect via CSC, but the runtime
>>>> suspend method makes it use PME# instead?
>>>>
>>>> And the way your patch works is by avoiding that xhci runtime suspend
>>>> method, so it *always* uses CSC and never uses PME#?  If that's the
>>>> case, we're just papering over a problem without really understanding
>>>> it.
>>>>
>>>> I'm wondering if this platform has a firmware defect.  Here's my
>>>> thinking.  The xHC is a Root Complex Integrated Endpoint, so its PME
>>>> signaling is a little unusual.
>>>>
>>>> The typical scenario is that a PCIe device is below a Root Port.  In
>>>> that case, it would send a PME Message upstream to the Root Port.  Per
>>>> PCIe r4.0, sec 6.1.6, when configured for native PME support (for ACPI
>>>> systems, I assume this means "when firmware has granted PME control to
>>>> the OS via _OSC"), the Root Port would generate a normal PCI INTx or
>>>> MSI interrupt:
>>>>
>>>>   PCI Express-aware software can enable a mode where the Root Complex
>>>>   signals PME via an interrupt. When configured for native PME
>>>>   support, a Root Port receives the PME Message and sets the PME
>>>>   Status bit in its Root Status register. If software has set the PME
>>>>   Interrupt Enable bit in the Root Control register to 1b, the Root
>>>>   Port then generates an interrupt.
>>>>
>>>> But on this platform the xHC is a Root Complex Integrated Endpoint, so
>>>> there is no Root Port upstream from it, and that mechanism can't be
>>>> used.  Per PCIe r4.0, sec 1.3.2.3, RCiEPs signal PME via "the same
>>>> mechanism as PCI systems" or via Root Complex Event Collectors:
>>>>
>>>>   An RCiEP must signal PME and error conditions through the same
>>>>   mechanisms used on PCI systems. If a Root Complex Event Collector is
>>>>   implemented, an RCiEP may optionally signal PME and error conditions
>>>>   through a Root Complex Event Collector.
>>>>
>>>> This platform has no Root Complex Event Collectors, so the xHC should
>>>> signal PME via the same mechanism as PCI systems, i.e., asserting a
>>>> PME# signal.  I think this means the OS cannot use native PCIe PME
>>>> control because it doesn't know what interrupt PME# is connected to.
>>>> The PCI Firmware Spec r3.2, sec 4.5.1 (also quoted in ACPI v6.2, sec
>>>> 6.2.11.3), says:
>>>>
>>>>   PCI Express Native Power Management Events control
>>>>
>>>>   The firmware sets this bit to 1 to grant control over PCI Express
>>>>   native power management event interrupts (PMEs). If firmware
>>>>   allows the operating system control of this feature, then in the
>>>>   context of the _OSC method, it must ensure that all PMEs are
>>>>   routed to root port interrupts as described in the PCI Express
>>>>   Base Specification.
>>>>
>>>> This platform cannot route all PMEs to Root Port interrupts because
>>>> the xHC RCiEP cannot report PME via a Root Port, so I think its _OSC
>>>> method should not grant control of PCIe Native Power Management Events
>>>> to the OS, and I think that would mean we have to use the ACPI
>>>> mechanism for PME on this platform.
>>>>
>>>> Can you confirm or deny any of this line of reasoning?  I'm wondering
>>>> if there's something wrong with the platform's _OSC, so Linux thinks
>>>> it can use native PME, but that doesn't work for this device.
>>>>
>>>>> It’s a platform in development so the name can’t be disclosed.
>>>>
>>>> Please attach a complete dmesg log to the bugzilla.  You can remove
>>>> identifying details like the platform name, but I want to see the
>>>> results of the _OSC negotiation.
>>>
>>> Thanks for the dmesg log
>>> (https://bugzilla.kernel.org/attachment.cgi?id=283109).  It shows:
>>>
>>>   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
>>>   acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug LTR]
>>>   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
>>>
>>> I think it is incorrect for the platform to give the OS native control
>>> over PME because the OS has no way to know how the RCiEP PMEs are
>>> routed.  But it would be interesting to know how BIOSes on other
>>> platforms with RCiEPs handle this, and I did post a question to the
>>> PCI-SIG to see if there's any guidance there.
>>
>> Is there any update from PCI-SIG?
>>
>> I really think we don’t need wakeup capability in D0 because D0 is a
>> working state.
>
> Well, in theory, devices may stay in D0 over suspend-to-idle and they may  
> need to
> signal wakeup then.  Using PME for that would be kind of handy (if it  
> worked) as it
> would allow special handling of in-band IRQs to be avoided in that case.

That makes sense but doesn’t apply to this case.
This patch only avoids D0 runtime suspend, suspend-to-idle will call  
system-wide suspend routine which still enables D0 PME.

It’ll be great if you can review my v3 patch here:
https://patchwork.kernel.org/patch/10960271/

Kai-Heng


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-07-05  7:02                     ` Kai-Heng Feng
  2019-07-05  9:39                       ` Rafael J. Wysocki
@ 2019-07-09 13:45                       ` Bjorn Helgaas
  2019-09-02 13:47                         ` Kai-Heng Feng
  1 sibling, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2019-07-09 13:45 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Alan Stern, Rafael Wysocki, linux-pci, LKML, Mathias Nyman, linux-usb

On Fri, Jul 05, 2019 at 03:02:01PM +0800, Kai-Heng Feng wrote:
> at 19:57, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Mon, May 27, 2019 at 11:57:47AM -0500, Bjorn Helgaas wrote:

> > > I'm wondering if this platform has a firmware defect.  Here's my
> > > thinking.  The xHC is a Root Complex Integrated Endpoint, so its PME
> > > signaling is a little unusual.
> > > 
> > > The typical scenario is that a PCIe device is below a Root Port.  In
> > > that case, it would send a PME Message upstream to the Root Port.  Per
> > > PCIe r4.0, sec 6.1.6, when configured for native PME support (for ACPI
> > > systems, I assume this means "when firmware has granted PME control to
> > > the OS via _OSC"), the Root Port would generate a normal PCI INTx or
> > > MSI interrupt:
> > > 
> > >   PCI Express-aware software can enable a mode where the Root Complex
> > >   signals PME via an interrupt. When configured for native PME
> > >   support, a Root Port receives the PME Message and sets the PME
> > >   Status bit in its Root Status register. If software has set the PME
> > >   Interrupt Enable bit in the Root Control register to 1b, the Root
> > >   Port then generates an interrupt.
> > > 
> > > But on this platform the xHC is a Root Complex Integrated Endpoint, so
> > > there is no Root Port upstream from it, and that mechanism can't be
> > > used.  Per PCIe r4.0, sec 1.3.2.3, RCiEPs signal PME via "the same
> > > mechanism as PCI systems" or via Root Complex Event Collectors:
> > > 
> > >   An RCiEP must signal PME and error conditions through the same
> > >   mechanisms used on PCI systems. If a Root Complex Event Collector is
> > >   implemented, an RCiEP may optionally signal PME and error conditions
> > >   through a Root Complex Event Collector.
> > > 
> > > This platform has no Root Complex Event Collectors, so the xHC should
> > > signal PME via the same mechanism as PCI systems, i.e., asserting a
> > > PME# signal.  I think this means the OS cannot use native PCIe PME
> > > control because it doesn't know what interrupt PME# is connected to.
> > > The PCI Firmware Spec r3.2, sec 4.5.1 (also quoted in ACPI v6.2, sec
> > > 6.2.11.3), says:
> > > 
> > >   PCI Express Native Power Management Events control
> > > 
> > >   The firmware sets this bit to 1 to grant control over PCI Express
> > >   native power management event interrupts (PMEs). If firmware
> > >   allows the operating system control of this feature, then in the
> > >   context of the _OSC method, it must ensure that all PMEs are
> > >   routed to root port interrupts as described in the PCI Express
> > >   Base Specification.
> > > 
> > > This platform cannot route all PMEs to Root Port interrupts because
> > > the xHC RCiEP cannot report PME via a Root Port, so I think its _OSC
> > > method should not grant control of PCIe Native Power Management Events
> > > to the OS, and I think that would mean we have to use the ACPI
> > > mechanism for PME on this platform.
> > > 
> > > Can you confirm or deny any of this line of reasoning?  I'm wondering
> > > if there's something wrong with the platform's _OSC, so Linux thinks
> > > it can use native PME, but that doesn't work for this device.
> > > 
> > > > It’s a platform in development so the name can’t be disclosed.
> > > 
> > > Please attach a complete dmesg log to the bugzilla.  You can remove
> > > identifying details like the platform name, but I want to see the
> > > results of the _OSC negotiation.
> > 
> > Thanks for the dmesg log
> > (https://bugzilla.kernel.org/attachment.cgi?id=283109).  It shows:
> > 
> >   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
> >   acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug LTR]
> >   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
> > 
> > I think it is incorrect for the platform to give the OS native control
> > over PME because the OS has no way to know how the RCiEP PMEs are
> > routed.  But it would be interesting to know how BIOSes on other
> > platforms with RCiEPs handle this, and I did post a question to the
> > PCI-SIG to see if there's any guidance there.
> 
> Is there any update from PCI-SIG?

Yes, but I did a terrible job asking the question, so we didn't
really get an answer for this situation.  The thread on the forum is
https://forum.pcisig.com/viewtopic.php?f=85&t=1081 (requires PCI-SIG
login, unfortunately).  My question was:

  Given an RCiEP that supports PME, can firmware grant control over
  native power management events to the OS?

  The PCI Firmware spec, r3.2, sec 4.5.1, says:

    PCI Express Native Power Management Events control

    The firmware sets this bit to 1 to grant control over PCI Express
    native power management event interrupts (PMEs). If firmware
    allows the operating system control of this feature, then in the
    context of the _OSC method, it must ensure that all PMEs are
    routed to root port interrupts as described in the PCI Express
    Base Specification.

  I don't think there's a mechanism for RCiEPs to route PMEs to a Root
  Port interrupt.

  PCIe r4.0, sec 1.3.2.3, says:

    An RCiEP must signal PME and error conditions through the same
    mechanisms used on PCI systems. If a Root Complex Event Collector
    is implemented, an RCiEP may optionally signal PME and error
    conditions through a Root Complex Event Collector.

  If the OS can be granted native PME control, how does it learn where
  the RCiEP PME is routed?

And the response from Robert Gough:

  The routing of root complex devices- Root Ports and Root Complex
  Integrated Endpionts- to Event Collectors is described in the Event
  Collector's RCEC Endpoint Association Capability Structure.

  In order for OSPM to process PMEs routed to an Event Collector, the
  source of the PME is found in the PME Requester ID field within the
  Root Status register of the Event Collector, in the same way that
  PME messages from children of Root Ports are serviced.

I just posted this follow-up question:

  Thanks, that clarifies one piece. The PCI Firmware spec, r3.2, sec
  4.5.1, says that if firmware allows OSPM control of PME, all PMEs
  should be routed to Root Port interrupts. Your answer suggests that
  this should be updated to say something like "all PMEs are routed to
  Root Port *or RCEC* interrupts".

  The piece I still don't understand is what happens when firmware
  allows OSPM control of PME in a system with an RCiEP but no RCEC.
  Where are PMEs from the RCiEP routed, and how does OSPM discover
  that? Or is it simply illegal for firmware to allow OSPM control of
  PME in that case?

The system we're looking at doesn't have any RCECs, so I don't think
the Root Complex Event Collector Endpoint Association Capability (what
a mouthful :)) is applicable, but I don't think Linux currently has
any support for it, so I think we're likely to trip over similar
issues on systems that do have RCECs.

It would be good if somebody added support for that capability.

Bjorn

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0
  2019-07-09 13:45                       ` Bjorn Helgaas
@ 2019-09-02 13:47                         ` Kai-Heng Feng
  0 siblings, 0 replies; 18+ messages in thread
From: Kai-Heng Feng @ 2019-09-02 13:47 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alan Stern, Rafael Wysocki, Linux PCI, LKML, Mathias Nyman, linux-usb

at 21:45, Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Fri, Jul 05, 2019 at 03:02:01PM +0800, Kai-Heng Feng wrote:
>> at 19:57, Bjorn Helgaas <helgaas@kernel.org> wrote:
>>> On Mon, May 27, 2019 at 11:57:47AM -0500, Bjorn Helgaas wrote:
>
>>>> I'm wondering if this platform has a firmware defect.  Here's my
>>>> thinking.  The xHC is a Root Complex Integrated Endpoint, so its PME
>>>> signaling is a little unusual.
>>>>
>>>> The typical scenario is that a PCIe device is below a Root Port.  In
>>>> that case, it would send a PME Message upstream to the Root Port.  Per
>>>> PCIe r4.0, sec 6.1.6, when configured for native PME support (for ACPI
>>>> systems, I assume this means "when firmware has granted PME control to
>>>> the OS via _OSC"), the Root Port would generate a normal PCI INTx or
>>>> MSI interrupt:
>>>>
>>>>   PCI Express-aware software can enable a mode where the Root Complex
>>>>   signals PME via an interrupt. When configured for native PME
>>>>   support, a Root Port receives the PME Message and sets the PME
>>>>   Status bit in its Root Status register. If software has set the PME
>>>>   Interrupt Enable bit in the Root Control register to 1b, the Root
>>>>   Port then generates an interrupt.
>>>>
>>>> But on this platform the xHC is a Root Complex Integrated Endpoint, so
>>>> there is no Root Port upstream from it, and that mechanism can't be
>>>> used.  Per PCIe r4.0, sec 1.3.2.3, RCiEPs signal PME via "the same
>>>> mechanism as PCI systems" or via Root Complex Event Collectors:
>>>>
>>>>   An RCiEP must signal PME and error conditions through the same
>>>>   mechanisms used on PCI systems. If a Root Complex Event Collector is
>>>>   implemented, an RCiEP may optionally signal PME and error conditions
>>>>   through a Root Complex Event Collector.
>>>>
>>>> This platform has no Root Complex Event Collectors, so the xHC should
>>>> signal PME via the same mechanism as PCI systems, i.e., asserting a
>>>> PME# signal.  I think this means the OS cannot use native PCIe PME
>>>> control because it doesn't know what interrupt PME# is connected to.
>>>> The PCI Firmware Spec r3.2, sec 4.5.1 (also quoted in ACPI v6.2, sec
>>>> 6.2.11.3), says:
>>>>
>>>>   PCI Express Native Power Management Events control
>>>>
>>>>   The firmware sets this bit to 1 to grant control over PCI Express
>>>>   native power management event interrupts (PMEs). If firmware
>>>>   allows the operating system control of this feature, then in the
>>>>   context of the _OSC method, it must ensure that all PMEs are
>>>>   routed to root port interrupts as described in the PCI Express
>>>>   Base Specification.
>>>>
>>>> This platform cannot route all PMEs to Root Port interrupts because
>>>> the xHC RCiEP cannot report PME via a Root Port, so I think its _OSC
>>>> method should not grant control of PCIe Native Power Management Events
>>>> to the OS, and I think that would mean we have to use the ACPI
>>>> mechanism for PME on this platform.
>>>>
>>>> Can you confirm or deny any of this line of reasoning?  I'm wondering
>>>> if there's something wrong with the platform's _OSC, so Linux thinks
>>>> it can use native PME, but that doesn't work for this device.
>>>>
>>>>> It’s a platform in development so the name can’t be disclosed.
>>>>
>>>> Please attach a complete dmesg log to the bugzilla.  You can remove
>>>> identifying details like the platform name, but I want to see the
>>>> results of the _OSC negotiation.
>>>
>>> Thanks for the dmesg log
>>> (https://bugzilla.kernel.org/attachment.cgi?id=283109).  It shows:
>>>
>>>   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
>>>   acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug LTR]
>>>   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
>>>
>>> I think it is incorrect for the platform to give the OS native control
>>> over PME because the OS has no way to know how the RCiEP PMEs are
>>> routed.  But it would be interesting to know how BIOSes on other
>>> platforms with RCiEPs handle this, and I did post a question to the
>>> PCI-SIG to see if there's any guidance there.
>>
>> Is there any update from PCI-SIG?
>
> Yes, but I did a terrible job asking the question, so we didn't
> really get an answer for this situation.  The thread on the forum is
> https://forum.pcisig.com/viewtopic.php?f=85&t=1081 (requires PCI-SIG
> login, unfortunately).  My question was:
>
>   Given an RCiEP that supports PME, can firmware grant control over
>   native power management events to the OS?
>
>   The PCI Firmware spec, r3.2, sec 4.5.1, says:
>
>     PCI Express Native Power Management Events control
>
>     The firmware sets this bit to 1 to grant control over PCI Express
>     native power management event interrupts (PMEs). If firmware
>     allows the operating system control of this feature, then in the
>     context of the _OSC method, it must ensure that all PMEs are
>     routed to root port interrupts as described in the PCI Express
>     Base Specification.
>
>   I don't think there's a mechanism for RCiEPs to route PMEs to a Root
>   Port interrupt.
>
>   PCIe r4.0, sec 1.3.2.3, says:
>
>     An RCiEP must signal PME and error conditions through the same
>     mechanisms used on PCI systems. If a Root Complex Event Collector
>     is implemented, an RCiEP may optionally signal PME and error
>     conditions through a Root Complex Event Collector.
>
>   If the OS can be granted native PME control, how does it learn where
>   the RCiEP PME is routed?
>
> And the response from Robert Gough:
>
>   The routing of root complex devices- Root Ports and Root Complex
>   Integrated Endpionts- to Event Collectors is described in the Event
>   Collector's RCEC Endpoint Association Capability Structure.
>
>   In order for OSPM to process PMEs routed to an Event Collector, the
>   source of the PME is found in the PME Requester ID field within the
>   Root Status register of the Event Collector, in the same way that
>   PME messages from children of Root Ports are serviced.
>
> I just posted this follow-up question:
>
>   Thanks, that clarifies one piece. The PCI Firmware spec, r3.2, sec
>   4.5.1, says that if firmware allows OSPM control of PME, all PMEs
>   should be routed to Root Port interrupts. Your answer suggests that
>   this should be updated to say something like "all PMEs are routed to
>   Root Port *or RCEC* interrupts".
>
>   The piece I still don't understand is what happens when firmware
>   allows OSPM control of PME in a system with an RCiEP but no RCEC.
>   Where are PMEs from the RCiEP routed, and how does OSPM discover
>   that? Or is it simply illegal for firmware to allow OSPM control of
>   PME in that case?
>
> The system we're looking at doesn't have any RCECs, so I don't think
> the Root Complex Event Collector Endpoint Association Capability (what
> a mouthful :)) is applicable, but I don't think Linux currently has
> any support for it, so I think we're likely to trip over similar
> issues on systems that do have RCECs.
>
> It would be good if somebody added support for that capability.

I just found the same issue on another Stoney Ridge laptop so I’d like to  
bring up this issue again.

The original approach I took is based on the feed back from BIOS team. They  
modified the return value of _S0W method to 0, to prevent the device from  
being runtime suspended.

But since the D0 PME# doesn’t work, I think maybe it’s better to just  
remove D0 PME# from its PM CAP?
I’ll send a patch with quirk to the mailing list.

Kai-Heng

>
> Bjorn



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-09-02 13:47 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-21 16:31 [PATCH] PCI / PM: Don't runtime suspend when device only supports wakeup from D0 Kai-Heng Feng
2019-05-21 22:23 ` Bjorn Helgaas
2019-05-22  3:42   ` Kai Heng Feng
2019-05-22 13:48     ` Bjorn Helgaas
2019-05-22 15:46       ` Kai Heng Feng
2019-05-22 18:11         ` Bjorn Helgaas
2019-05-22 18:39           ` Alan Stern
2019-05-22 18:53             ` Lukas Wunner
2019-05-22 19:05               ` Kai Heng Feng
2019-05-22 20:52             ` Bjorn Helgaas
2019-05-23  4:39               ` Kai-Heng Feng
2019-05-27 16:57                 ` Bjorn Helgaas
2019-06-05 11:57                   ` Bjorn Helgaas
2019-07-05  7:02                     ` Kai-Heng Feng
2019-07-05  9:39                       ` Rafael J. Wysocki
2019-07-05 13:51                         ` Kai-Heng Feng
2019-07-09 13:45                       ` Bjorn Helgaas
2019-09-02 13:47                         ` Kai-Heng Feng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).