[PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications


* [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
@ 2013-03-23 14:33 Rafael J. Wysocki
  2013-03-23 16:22 ` Matthew Garrett
                   ` (3 more replies)
  0 siblings, 4 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-23 14:33 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ACPI Devel Maling List, LKML, Linux PM list, Len Brown,
	Matthew Garrett, Sarah Sharp, Accardi, Kristen C, Huang, Ying,
	linux-pci

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It turns out that _Lxx control methods provided by some BIOSes clear
the PME Status bit of PCI devices they handle, which means that
pci_acpi_wake_dev() cannot really use that bit to check whether or
not the device has signalled wakeup.

For this reason, make pci_acpi_wake_dev() always attempt to resume
the device it is called for regardless of the device's PME Status bit
value (that bit still has to be cleared if set at this point,
though).

Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/pci-acpi.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/pci/pci-acpi.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-acpi.c
+++ linux-pm/drivers/pci/pci-acpi.c
@@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
 		return;
 	}
 
-	if (!pci_dev->pm_cap || !pci_dev->pme_support
-	     || pci_check_pme_status(pci_dev)) {
-		if (pci_dev->pme_poll)
-			pci_dev->pme_poll = false;
+	/* Clear PME Status if set. */
+	if (pci_dev->pme_support)
+		pci_check_pme_status(pci_dev);
 
-		pci_wakeup_event(pci_dev);
-		pm_runtime_resume(&pci_dev->dev);
-	}
+	if (pci_dev->pme_poll)
+		pci_dev->pme_poll = false;
+
+	pci_wakeup_event(pci_dev);
+	pm_runtime_resume(&pci_dev->dev);
 
 	if (pci_dev->subordinate)
 		pci_pme_wakeup_bus(pci_dev->subordinate);



* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-23 14:33 [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
@ 2013-03-23 16:22 ` Matthew Garrett
  2013-03-25 16:45 ` Sarah Sharp
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 61+ messages in thread
From: Matthew Garrett @ 2013-03-23 16:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, ACPI Devel Maling List, LKML, Linux PM list,
	Len Brown, Sarah Sharp, Accardi, Kristen C, Huang, Ying,
	linux-pci

Looks good to me.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-23 14:33 [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
  2013-03-23 16:22 ` Matthew Garrett
@ 2013-03-25 16:45 ` Sarah Sharp
  2013-03-25 22:34   ` Rafael J. Wysocki
  2013-03-28 12:57 ` Rafael J. Wysocki
  2013-03-28 17:10 ` [Resend][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
  3 siblings, 1 reply; 61+ messages in thread
From: Sarah Sharp @ 2013-03-25 16:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, ACPI Devel Maling List, LKML, Linux PM list,
	Len Brown, Matthew Garrett, Accardi, Kristen C, Huang, Ying,
	linux-pci

On Sat, Mar 23, 2013 at 03:33:03PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It turns out that _Lxx control methods provided by some BIOSes clear
> the PME Status bit of PCI devices they handle, which means that
> pci_acpi_wake_dev() cannot really use that bit to check whether or
> not the device has signalled wakeup.
> 
> For this reason, make pci_acpi_wake_dev() always attempt to resume
> the device it is called for regardless of the device's PME Status bit
> value (that bit still has to be cleared if set at this point,
> though).
> 
> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Should this be marked for stable?  I had this issue on 3.7 and 3.8 as
well.

Sarah

> ---
>  drivers/pci/pci-acpi.c |   15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
>  		return;
>  	}
>  
> -	if (!pci_dev->pm_cap || !pci_dev->pme_support
> -	     || pci_check_pme_status(pci_dev)) {
> -		if (pci_dev->pme_poll)
> -			pci_dev->pme_poll = false;
> +	/* Clear PME Status if set. */
> +	if (pci_dev->pme_support)
> +		pci_check_pme_status(pci_dev);
>  
> -		pci_wakeup_event(pci_dev);
> -		pm_runtime_resume(&pci_dev->dev);
> -	}
> +	if (pci_dev->pme_poll)
> +		pci_dev->pme_poll = false;
> +
> +	pci_wakeup_event(pci_dev);
> +	pm_runtime_resume(&pci_dev->dev);
>  
>  	if (pci_dev->subordinate)
>  		pci_pme_wakeup_bus(pci_dev->subordinate);
> 


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-25 16:45 ` Sarah Sharp
@ 2013-03-25 22:34   ` Rafael J. Wysocki
  0 siblings, 0 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-25 22:34 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Bjorn Helgaas, ACPI Devel Maling List, LKML, Linux PM list,
	Len Brown, Matthew Garrett, Accardi, Kristen C, Huang, Ying,
	linux-pci

On Monday, March 25, 2013 09:45:51 AM Sarah Sharp wrote:
> On Sat, Mar 23, 2013 at 03:33:03PM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > It turns out that _Lxx control methods provided by some BIOSes clear
> > the PME Status bit of PCI devices they handle, which means that
> > pci_acpi_wake_dev() cannot really use that bit to check whether or
> > not the device has signalled wakeup.
> > 
> > For this reason, make pci_acpi_wake_dev() always attempt to resume
> > the device it is called for regardless of the device's PME Status bit
> > value (that bit still has to be cleared if set at this point,
> > though).
> > 
> > Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Should this be marked for stable?  I had this issue on 3.7 and 3.8 as
> well.

Yes, it probably should, but that's the maintainer's call.

Thanks,
Rafael


> > ---
> >  drivers/pci/pci-acpi.c |   15 ++++++++-------
> >  1 file changed, 8 insertions(+), 7 deletions(-)
> > 
> > Index: linux-pm/drivers/pci/pci-acpi.c
> > ===================================================================
> > --- linux-pm.orig/drivers/pci/pci-acpi.c
> > +++ linux-pm/drivers/pci/pci-acpi.c
> > @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
> >  		return;
> >  	}
> >  
> > -	if (!pci_dev->pm_cap || !pci_dev->pme_support
> > -	     || pci_check_pme_status(pci_dev)) {
> > -		if (pci_dev->pme_poll)
> > -			pci_dev->pme_poll = false;
> > +	/* Clear PME Status if set. */
> > +	if (pci_dev->pme_support)
> > +		pci_check_pme_status(pci_dev);
> >  
> > -		pci_wakeup_event(pci_dev);
> > -		pm_runtime_resume(&pci_dev->dev);
> > -	}
> > +	if (pci_dev->pme_poll)
> > +		pci_dev->pme_poll = false;
> > +
> > +	pci_wakeup_event(pci_dev);
> > +	pm_runtime_resume(&pci_dev->dev);
> >  
> >  	if (pci_dev->subordinate)
> >  		pci_pme_wakeup_bus(pci_dev->subordinate);
> > 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-23 14:33 [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
  2013-03-23 16:22 ` Matthew Garrett
  2013-03-25 16:45 ` Sarah Sharp
@ 2013-03-28 12:57 ` Rafael J. Wysocki
  2013-03-28 16:21   ` Bjorn Helgaas
  2013-03-28 17:10 ` [Resend][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
  3 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-28 12:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ACPI Devel Maling List, Len Brown, Matthew Garrett, Sarah Sharp

Hi Bjorn,

I wonder what you think about the patch below?

Rafael


On Saturday, March 23, 2013 03:33:03 PM Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It turns out that _Lxx control methods provided by some BIOSes clear
> the PME Status bit of PCI devices they handle, which means that
> pci_acpi_wake_dev() cannot really use that bit to check whether or
> not the device has signalled wakeup.
> 
> For this reason, make pci_acpi_wake_dev() always attempt to resume
> the device it is called for regardless of the device's PME Status bit
> value (that bit still has to be cleared if set at this point,
> though).
> 
> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/pci/pci-acpi.c |   15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
>  		return;
>  	}
>  
> -	if (!pci_dev->pm_cap || !pci_dev->pme_support
> -	     || pci_check_pme_status(pci_dev)) {
> -		if (pci_dev->pme_poll)
> -			pci_dev->pme_poll = false;
> +	/* Clear PME Status if set. */
> +	if (pci_dev->pme_support)
> +		pci_check_pme_status(pci_dev);
>  
> -		pci_wakeup_event(pci_dev);
> -		pm_runtime_resume(&pci_dev->dev);
> -	}
> +	if (pci_dev->pme_poll)
> +		pci_dev->pme_poll = false;
> +
> +	pci_wakeup_event(pci_dev);
> +	pm_runtime_resume(&pci_dev->dev);
>  
>  	if (pci_dev->subordinate)
>  		pci_pme_wakeup_bus(pci_dev->subordinate);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 12:57 ` Rafael J. Wysocki
@ 2013-03-28 16:21   ` Bjorn Helgaas
  2013-03-28 16:41     ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Bjorn Helgaas @ 2013-03-28 16:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, Len Brown, Matthew Garrett, Sarah Sharp

On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> Hi Bjorn,
>
> I wonder what you think about the patch below?

Seems fine to me (I'm trusting your and Matthew's judgment here since
I don't know much about it).  Why don't you resend it with Matthew's
ack and the appropriate stable tags, and I'll put it in.  If you have
a URL for a bugzilla or mailing list report of the original problem,
that would be good, too.  It'd be nice if users and distros could
match problem reports with this solution, but I can't tell what the
user-visible issue was.  I assume that Sarah tested this (or somebody
else reproduced the problem and tested the fix)?

> On Saturday, March 23, 2013 03:33:03 PM Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> It turns out that _Lxx control methods provided by some BIOSes clear
>> the PME Status bit of PCI devices they handle, which means that
>> pci_acpi_wake_dev() cannot really use that bit to check whether or
>> not the device has signalled wakeup.
>>
>> For this reason, make pci_acpi_wake_dev() always attempt to resume
>> the device it is called for regardless of the device's PME Status bit
>> value (that bit still has to be cleared if set at this point,
>> though).
>>
>> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>  drivers/pci/pci-acpi.c |   15 ++++++++-------
>>  1 file changed, 8 insertions(+), 7 deletions(-)
>>
>> Index: linux-pm/drivers/pci/pci-acpi.c
>> ===================================================================
>> --- linux-pm.orig/drivers/pci/pci-acpi.c
>> +++ linux-pm/drivers/pci/pci-acpi.c
>> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
>>               return;
>>       }
>>
>> -     if (!pci_dev->pm_cap || !pci_dev->pme_support
>> -          || pci_check_pme_status(pci_dev)) {
>> -             if (pci_dev->pme_poll)
>> -                     pci_dev->pme_poll = false;
>> +     /* Clear PME Status if set. */
>> +     if (pci_dev->pme_support)
>> +             pci_check_pme_status(pci_dev);
>>
>> -             pci_wakeup_event(pci_dev);
>> -             pm_runtime_resume(&pci_dev->dev);
>> -     }
>> +     if (pci_dev->pme_poll)
>> +             pci_dev->pme_poll = false;
>> +
>> +     pci_wakeup_event(pci_dev);
>> +     pm_runtime_resume(&pci_dev->dev);
>>
>>       if (pci_dev->subordinate)
>>               pci_pme_wakeup_bus(pci_dev->subordinate);
>>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 16:21   ` Bjorn Helgaas
@ 2013-03-28 16:41     ` Rafael J. Wysocki
  2013-03-28 16:46       ` Bjorn Helgaas
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-28 16:41 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ACPI Devel Maling List, Len Brown, Matthew Garrett, Sarah Sharp

On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > Hi Bjorn,
> >
> > I wonder what you think about the patch below?
> 
> Seems fine to me (I'm trusting your and Matthew's judgment here since
> I don't know much about it).  Why don't you resend it with Matthew's
> ack and the appropriate stable tags, and I'll put it in.

I will, thanks!

> If you have
> a URL for a bugzilla or mailing list report of the original problem,
> that would be good, too.  It'd be nice if users and distros could
> match problem reports with this solution, but I can't tell what the
> user-visible issue was.  I assume that Sarah tested this (or somebody
> else reproduced the problem and tested the fix)?

Sarah reported it to me privately and I'm afraid I don't have any pointers
to publicly available mailing list archives etc.

Thanks,
Rafael


> > On Saturday, March 23, 2013 03:33:03 PM Rafael J. Wysocki wrote:
> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>
> >> It turns out that _Lxx control methods provided by some BIOSes clear
> >> the PME Status bit of PCI devices they handle, which means that
> >> pci_acpi_wake_dev() cannot really use that bit to check whether or
> >> not the device has signalled wakeup.
> >>
> >> For this reason, make pci_acpi_wake_dev() always attempt to resume
> >> the device it is called for regardless of the device's PME Status bit
> >> value (that bit still has to be cleared if set at this point,
> >> though).
> >>
> >> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> ---
> >>  drivers/pci/pci-acpi.c |   15 ++++++++-------
> >>  1 file changed, 8 insertions(+), 7 deletions(-)
> >>
> >> Index: linux-pm/drivers/pci/pci-acpi.c
> >> ===================================================================
> >> --- linux-pm.orig/drivers/pci/pci-acpi.c
> >> +++ linux-pm/drivers/pci/pci-acpi.c
> >> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
> >>               return;
> >>       }
> >>
> >> -     if (!pci_dev->pm_cap || !pci_dev->pme_support
> >> -          || pci_check_pme_status(pci_dev)) {
> >> -             if (pci_dev->pme_poll)
> >> -                     pci_dev->pme_poll = false;
> >> +     /* Clear PME Status if set. */
> >> +     if (pci_dev->pme_support)
> >> +             pci_check_pme_status(pci_dev);
> >>
> >> -             pci_wakeup_event(pci_dev);
> >> -             pm_runtime_resume(&pci_dev->dev);
> >> -     }
> >> +     if (pci_dev->pme_poll)
> >> +             pci_dev->pme_poll = false;
> >> +
> >> +     pci_wakeup_event(pci_dev);
> >> +     pm_runtime_resume(&pci_dev->dev);
> >>
> >>       if (pci_dev->subordinate)
> >>               pci_pme_wakeup_bus(pci_dev->subordinate);
> >>
> > --
> > I speak only for myself.
> > Rafael J. Wysocki, Intel Open Source Technology Center.
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 16:41     ` Rafael J. Wysocki
@ 2013-03-28 16:46       ` Bjorn Helgaas
  2013-03-28 16:59         ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Bjorn Helgaas @ 2013-03-28 16:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, Len Brown, Matthew Garrett, Sarah Sharp

On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > Hi Bjorn,
>> >
>> > I wonder what you think about the patch below?
>>
>> Seems fine to me (I'm trusting your and Matthew's judgment here since
>> I don't know much about it).  Why don't you resend it with Matthew's
>> ack and the appropriate stable tags, and I'll put it in.
>
> I will, thanks!
>
>> If you have
>> a URL for a bugzilla or mailing list report of the original problem,
>> that would be good, too.  It'd be nice if users and distros could
>> match problem reports with this solution, but I can't tell what the
>> user-visible issue was.  I assume that Sarah tested this (or somebody
>> else reproduced the problem and tested the fix)?
>
> Sarah reported it to me privately and I'm afraid I don't have any pointers
> to publicly available mailing list archives etc.

Do you at least have a description of how a user could determine
whether he is seeing the problem fixed by this patch?

>> > On Saturday, March 23, 2013 03:33:03 PM Rafael J. Wysocki wrote:
>> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >>
>> >> It turns out that _Lxx control methods provided by some BIOSes clear
>> >> the PME Status bit of PCI devices they handle, which means that
>> >> pci_acpi_wake_dev() cannot really use that bit to check whether or
>> >> not the device has signalled wakeup.
>> >>
>> >> For this reason, make pci_acpi_wake_dev() always attempt to resume
>> >> the device it is called for regardless of the device's PME Status bit
>> >> value (that bit still has to be cleared if set at this point,
>> >> though).
>> >>
>> >> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
>> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >> ---
>> >>  drivers/pci/pci-acpi.c |   15 ++++++++-------
>> >>  1 file changed, 8 insertions(+), 7 deletions(-)
>> >>
>> >> Index: linux-pm/drivers/pci/pci-acpi.c
>> >> ===================================================================
>> >> --- linux-pm.orig/drivers/pci/pci-acpi.c
>> >> +++ linux-pm/drivers/pci/pci-acpi.c
>> >> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
>> >>               return;
>> >>       }
>> >>
>> >> -     if (!pci_dev->pm_cap || !pci_dev->pme_support
>> >> -          || pci_check_pme_status(pci_dev)) {
>> >> -             if (pci_dev->pme_poll)
>> >> -                     pci_dev->pme_poll = false;
>> >> +     /* Clear PME Status if set. */
>> >> +     if (pci_dev->pme_support)
>> >> +             pci_check_pme_status(pci_dev);
>> >>
>> >> -             pci_wakeup_event(pci_dev);
>> >> -             pm_runtime_resume(&pci_dev->dev);
>> >> -     }
>> >> +     if (pci_dev->pme_poll)
>> >> +             pci_dev->pme_poll = false;
>> >> +
>> >> +     pci_wakeup_event(pci_dev);
>> >> +     pm_runtime_resume(&pci_dev->dev);
>> >>
>> >>       if (pci_dev->subordinate)
>> >>               pci_pme_wakeup_bus(pci_dev->subordinate);
>> >>
>> > --
>> > I speak only for myself.
>> > Rafael J. Wysocki, Intel Open Source Technology Center.
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 16:46       ` Bjorn Helgaas
@ 2013-03-28 16:59         ` Rafael J. Wysocki
  2013-03-28 17:26           ` Martin Mokrejs
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-28 16:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ACPI Devel Maling List, Len Brown, Matthew Garrett, Sarah Sharp

On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
> >> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> > Hi Bjorn,
> >> >
> >> > I wonder what you think about the patch below?
> >>
> >> Seems fine to me (I'm trusting your and Matthew's judgment here since
> >> I don't know much about it).  Why don't you resend it with Matthew's
> >> ack and the appropriate stable tags, and I'll put it in.
> >
> > I will, thanks!
> >
> >> If you have
> >> a URL for a bugzilla or mailing list report of the original problem,
> >> that would be good, too.  It'd be nice if users and distros could
> >> match problem reports with this solution, but I can't tell what the
> >> user-visible issue was.  I assume that Sarah tested this (or somebody
> >> else reproduced the problem and tested the fix)?
> >
> > Sarah reported it to me privately and I'm afraid I don't have any pointers
> > to publicly available mailing list archives etc.
> 
> Do you at least have a description of how a user could determine
> whether he is seeing the problem fixed by this patch?

Yeah.  For example, when the problem is visible on a USB controller and that
controller is runtime-suspended, then plugging a new USB device into one
of the controller's ports won't wake the controller up without the patch.

I will put that information into the changelog.

Thanks,
Rafael


> >> > On Saturday, March 23, 2013 03:33:03 PM Rafael J. Wysocki wrote:
> >> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >>
> >> >> It turns out that _Lxx control methods provided by some BIOSes clear
> >> >> the PME Status bit of PCI devices they handle, which means that
> >> >> pci_acpi_wake_dev() cannot really use that bit to check whether or
> >> >> not the device has signalled wakeup.
> >> >>
> >> >> For this reason, make pci_acpi_wake_dev() always attempt to resume
> >> >> the device it is called for regardless of the device's PME Status bit
> >> >> value (that bit still has to be cleared if set at this point,
> >> >> though).
> >> >>
> >> >> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
> >> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >> ---
> >> >>  drivers/pci/pci-acpi.c |   15 ++++++++-------
> >> >>  1 file changed, 8 insertions(+), 7 deletions(-)
> >> >>
> >> >> Index: linux-pm/drivers/pci/pci-acpi.c
> >> >> ===================================================================
> >> >> --- linux-pm.orig/drivers/pci/pci-acpi.c
> >> >> +++ linux-pm/drivers/pci/pci-acpi.c
> >> >> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
> >> >>               return;
> >> >>       }
> >> >>
> >> >> -     if (!pci_dev->pm_cap || !pci_dev->pme_support
> >> >> -          || pci_check_pme_status(pci_dev)) {
> >> >> -             if (pci_dev->pme_poll)
> >> >> -                     pci_dev->pme_poll = false;
> >> >> +     /* Clear PME Status if set. */
> >> >> +     if (pci_dev->pme_support)
> >> >> +             pci_check_pme_status(pci_dev);
> >> >>
> >> >> -             pci_wakeup_event(pci_dev);
> >> >> -             pm_runtime_resume(&pci_dev->dev);
> >> >> -     }
> >> >> +     if (pci_dev->pme_poll)
> >> >> +             pci_dev->pme_poll = false;
> >> >> +
> >> >> +     pci_wakeup_event(pci_dev);
> >> >> +     pm_runtime_resume(&pci_dev->dev);
> >> >>
> >> >>       if (pci_dev->subordinate)
> >> >>               pci_pme_wakeup_bus(pci_dev->subordinate);
> >> >>
> >> > --
> >> > I speak only for myself.
> >> > Rafael J. Wysocki, Intel Open Source Technology Center.
> > --
> > I speak only for myself.
> > Rafael J. Wysocki, Intel Open Source Technology Center.
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* [Resend][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-23 14:33 [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2013-03-28 12:57 ` Rafael J. Wysocki
@ 2013-03-28 17:10 ` Rafael J. Wysocki
  2013-03-28 21:07   ` [Update][PATCH] " Rafael J. Wysocki
  3 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-28 17:10 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ACPI Devel Maling List, LKML, Linux PM list, Len Brown,
	Matthew Garrett, Sarah Sharp, Accardi, Kristen C, Huang, Ying,
	linux-pci

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It turns out that the _Lxx control methods provided by some BIOSes
clear the PME Status bit of PCI devices they handle, which means that
pci_acpi_wake_dev() cannot really use that bit to check whether or
not the device has signalled wakeup.

The symptom of the problem is, for example, that when a PCI USB
controller is affected, then plugging in a new USB device into one of
the controller's ports will not wake up the controller, which should
happen.

For this reason, make pci_acpi_wake_dev() always attempt to resume
the device it is called for regardless of the device's PME Status bit
value (that bit still has to be cleared if set at this point,
though).

Reported-and-tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: 3.7+ <stable@vger.kernel.org>
---
 drivers/pci/pci-acpi.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/pci/pci-acpi.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-acpi.c
+++ linux-pm/drivers/pci/pci-acpi.c
@@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
 		return;
 	}
 
-	if (!pci_dev->pm_cap || !pci_dev->pme_support
-	     || pci_check_pme_status(pci_dev)) {
-		if (pci_dev->pme_poll)
-			pci_dev->pme_poll = false;
+	/* Clear PME Status if set. */
+	if (pci_dev->pme_support)
+		pci_check_pme_status(pci_dev);
 
-		pci_wakeup_event(pci_dev);
-		pm_runtime_resume(&pci_dev->dev);
-	}
+	if (pci_dev->pme_poll)
+		pci_dev->pme_poll = false;
+
+	pci_wakeup_event(pci_dev);
+	pm_runtime_resume(&pci_dev->dev);
 
 	if (pci_dev->subordinate)
 		pci_pme_wakeup_bus(pci_dev->subordinate);


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 16:59         ` Rafael J. Wysocki
@ 2013-03-28 17:26           ` Martin Mokrejs
  2013-03-28 17:49             ` Bjorn Helgaas
  0 siblings, 1 reply; 61+ messages in thread
From: Martin Mokrejs @ 2013-03-28 17:26 UTC (permalink / raw)
  To: Rafael J. Wysocki, Bjorn Helgaas
  Cc: ACPI Devel Maling List, Len Brown, Matthew Garrett, Sarah Sharp



Rafael J. Wysocki wrote:
> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>> Hi Bjorn,
>>>>>
>>>>> I wonder what you think about the patch below?
>>>>
>>>> Seems fine to me (I'm trusting your and Matthew's judgment here since
>>>> I don't know much about it).  Why don't you resend it with Matthew's
>>>> ack and the appropriate stable tags, and I'll put it in.
>>>
>>> I will, thanks!
>>>
>>>> If you have
>>>> a URL for a bugzilla or mailing list report of the original problem,
>>>> that would be good, too.  It'd be nice if users and distros could
>>>> match problem reports with this solution, but I can't tell what the
>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
>>>> else reproduced the problem and tested the fix)?
>>>
>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
>>> to publicly available mailing list archives etc.
>>
>> Do you at least have a description of how a user could determine
>> whether he is seeing the problem fixed by this patch?
> 
> Yeah.  For example, when the problem is visible on a USB controller and that
> controller is runtime-suspended, then plugging a new USB device into one
> of the controller's ports won't wake the controller up without the patch.

Hi,
 I have been wondering for a week or two why nobody answered any of my bug reports,
not even Sarah, who asked for more details. I think the fix is about my report
under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
about my report. But I had better wait for what Sarah says. ;-)

  I would actually have the same comment for the proposed patches in:
Yinghai Lu  "Re: [PATCH] PCI: Remove not needed check in disable aspm link"
Who tested the bug, if anybody? What change (the fix) in lspci output should one
observe?

Thank you,
Martin


> 
> I will put that information into the changelog.
> 
> Thanks,
> Rafael
> 
> 
>>>>> On Saturday, March 23, 2013 03:33:03 PM Rafael J. Wysocki wrote:
>>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>>
>>>>>> It turns out that _Lxx control methods provided by some BIOSes clear
>>>>>> the PME Status bit of PCI devices they handle, which means that
>>>>>> pci_acpi_wake_dev() cannot really use that bit to check whether or
>>>>>> not the device has signalled wakeup.
>>>>>>
>>>>>> For this reason, make pci_acpi_wake_dev() always attempt to resume
>>>>>> the device it is called for regardless of the device's PME Status bit
>>>>>> value (that bit still has to be cleared if set at this point,
>>>>>> though).
>>>>>>
>>>>>> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
>>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>> ---
>>>>>>  drivers/pci/pci-acpi.c |   15 ++++++++-------
>>>>>>  1 file changed, 8 insertions(+), 7 deletions(-)
>>>>>>
>>>>>> Index: linux-pm/drivers/pci/pci-acpi.c
>>>>>> ===================================================================
>>>>>> --- linux-pm.orig/drivers/pci/pci-acpi.c
>>>>>> +++ linux-pm/drivers/pci/pci-acpi.c
>>>>>> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
>>>>>>               return;
>>>>>>       }
>>>>>>
>>>>>> -     if (!pci_dev->pm_cap || !pci_dev->pme_support
>>>>>> -          || pci_check_pme_status(pci_dev)) {
>>>>>> -             if (pci_dev->pme_poll)
>>>>>> -                     pci_dev->pme_poll = false;
>>>>>> +     /* Clear PME Status if set. */
>>>>>> +     if (pci_dev->pme_support)
>>>>>> +             pci_check_pme_status(pci_dev);
>>>>>>
>>>>>> -             pci_wakeup_event(pci_dev);
>>>>>> -             pm_runtime_resume(&pci_dev->dev);
>>>>>> -     }
>>>>>> +     if (pci_dev->pme_poll)
>>>>>> +             pci_dev->pme_poll = false;
>>>>>> +
>>>>>> +     pci_wakeup_event(pci_dev);
>>>>>> +     pm_runtime_resume(&pci_dev->dev);
>>>>>>
>>>>>>       if (pci_dev->subordinate)
>>>>>>               pci_pme_wakeup_bus(pci_dev->subordinate);
>>>>>>
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>> --
>>>>> I speak only for myself.
>>>>> Rafael J. Wysocki, Intel Open Source Technology Center.
>>> --
>>> I speak only for myself.
>>> Rafael J. Wysocki, Intel Open Source Technology Center.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 17:26           ` Martin Mokrejs
@ 2013-03-28 17:49             ` Bjorn Helgaas
  2013-03-28 18:23               ` Sarah Sharp
  2013-03-28 18:31               ` Martin Mokrejs
  0 siblings, 2 replies; 61+ messages in thread
From: Bjorn Helgaas @ 2013-03-28 17:49 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: Rafael J. Wysocki, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
<mmokrejs@fold.natur.cuni.cz> wrote:
>
>
> Rafael J. Wysocki wrote:
>> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
>>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
>>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>>> Hi Bjorn,
>>>>>>
>>>>>> I wonder what you think about the patch below?
>>>>>
>>>>> Seems fine to me (I'm trusting your and Matthew's judgment here since
>>>>> I don't know much about it).  Why don't you resend it with Matthew's
>>>>> ack and the appropriate stable tags, and I'll put it in.
>>>>
>>>> I will, thanks!
>>>>
>>>>> If you have
>>>>> a URL for a bugzilla or mailing list report of the original problem,
>>>>> that would be good, too.  It'd be nice if users and distros could
>>>>> match problem reports with this solution, but I can't tell what the
>>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
>>>>> else reproduced the problem and tested the fix)?
>>>>
>>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
>>>> to publicly available mailing list archives etc.
>>>
>>> Do you at least have a description of how a user could determine
>>> whether he is seeing the problem fixed by this patch?
>>
>> Yeah.  For example, when the problem is visible on a USB controller and that
>> controller is runtime-suspended, then plugging a new USB device into one
>> of the controller's ports won't wake the controller up without the patch.
>
> Hi,
>  I have been wondering for a week or two why nobody answered any of my bug reports,
> not even Sarah, who asked for more details. I think the fix is about my report
> under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
> and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
> about my report. But I had better wait for what Sarah says. ;-)

I haven't forgotten about your hotplug issues, but I've been on
vacation for a week and have been working on the similar issue
reported by Chris Clayton
(https://bugzilla.kernel.org/show_bug.cgi?id=54981) because it seemed
a bit more tractable.  But I'll get back to yours eventually :)
Unfortunately nobody else seems to be jumping in to help, and I can
only do so much by myself.

I haven't been following your XHCI issue at all, but one thing you
might consider is that it's easy for us on the receiving end to be
overwhelmed by the sheer volume of information.  For me personally,
it's more useful to get specific answers to a few questions than it is
for me to sort through a lot of speculation and other data.  In some
cases, "less is more" :)

>   I would actually have the same comment about the patches proposed in:
> Yinghai Lu  "Re: [PATCH] PCI: Remove not needed check in disable aspm link"
> Who tested the bug, if anybody? What change (the fix) in lspci output should one
> observe?

Yeah, that's one of the things I'm trying to sort out right now.  It
*was* tested by Roman, according to the changelog, and I think I can
dig out the user-visible behavior from the bugzilla
(https://bugzilla.kernel.org/show_bug.cgi?id=55211), but I definitely
agree -- that patch needs a lot of tender loving care before I apply
it.

>>>>>> On Saturday, March 23, 2013 03:33:03 PM Rafael J. Wysocki wrote:
>>>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>>>
>>>>>>> It turns out that _Lxx control methods provided by some BIOSes clear
>>>>>>> the PME Status bit of PCI devices they handle, which means that
>>>>>>> pci_acpi_wake_dev() cannot really use that bit to check whether or
>>>>>>> not the device has signalled wakeup.
>>>>>>>
>>>>>>> For this reason, make pci_acpi_wake_dev() always attempt to resume
>>>>>>> the device it is called for regardless of the device's PME Status bit
>>>>>>> value (that bit still has to be cleared if set at this point,
>>>>>>> though).
>>>>>>>
>>>>>>> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
>>>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>>> ---
>>>>>>>  drivers/pci/pci-acpi.c |   15 ++++++++-------
>>>>>>>  1 file changed, 8 insertions(+), 7 deletions(-)
>>>>>>>
>>>>>>> Index: linux-pm/drivers/pci/pci-acpi.c
>>>>>>> ===================================================================
>>>>>>> --- linux-pm.orig/drivers/pci/pci-acpi.c
>>>>>>> +++ linux-pm/drivers/pci/pci-acpi.c
>>>>>>> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
>>>>>>>               return;
>>>>>>>       }
>>>>>>>
>>>>>>> -     if (!pci_dev->pm_cap || !pci_dev->pme_support
>>>>>>> -          || pci_check_pme_status(pci_dev)) {
>>>>>>> -             if (pci_dev->pme_poll)
>>>>>>> -                     pci_dev->pme_poll = false;
>>>>>>> +     /* Clear PME Status if set. */
>>>>>>> +     if (pci_dev->pme_support)
>>>>>>> +             pci_check_pme_status(pci_dev);
>>>>>>>
>>>>>>> -             pci_wakeup_event(pci_dev);
>>>>>>> -             pm_runtime_resume(&pci_dev->dev);
>>>>>>> -     }
>>>>>>> +     if (pci_dev->pme_poll)
>>>>>>> +             pci_dev->pme_poll = false;
>>>>>>> +
>>>>>>> +     pci_wakeup_event(pci_dev);
>>>>>>> +     pm_runtime_resume(&pci_dev->dev);
>>>>>>>
>>>>>>>       if (pci_dev->subordinate)
>>>>>>>               pci_pme_wakeup_bus(pci_dev->subordinate);
>>>>>>>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 17:49             ` Bjorn Helgaas
@ 2013-03-28 18:23               ` Sarah Sharp
  2013-03-28 19:12                 ` Bjorn Helgaas
  2013-03-28 18:31               ` Martin Mokrejs
  1 sibling, 1 reply; 61+ messages in thread
From: Sarah Sharp @ 2013-03-28 18:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Martin Mokrejs, Rafael J. Wysocki, ACPI Devel Maling List,
	Len Brown, Matthew Garrett

On Thu, Mar 28, 2013 at 11:49:05AM -0600, Bjorn Helgaas wrote:
> On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
> <mmokrejs@fold.natur.cuni.cz> wrote:
> >
> >
> > Rafael J. Wysocki wrote:
> >> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
> >>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
> >>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >>>>> If you have
> >>>>> a URL for a bugzilla or mailing list report of the original problem,
> >>>>> that would be good, too.  It'd be nice if users and distros could
> >>>>> match problem reports with this solution, but I can't tell what the
> >>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
> >>>>> else reproduced the problem and tested the fix)?
> >>>>
> >>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
> >>>> to publicly available mailing list archives etc.
> >>>
> >>> Do you at least have a description of how a user could determine
> >>> whether he is seeing the problem fixed by this patch?
> >>
> >> Yeah.  For example, when the problem is visible on a USB controller and that
> >> controller is runtime-suspended, then plugging a new USB device into one
> >> of the controller's ports won't wake the controller up without the patch.
> >
> > Hi,
> >  I have been wondering for a week or two why nobody answered any of my bug reports,
> > not even Sarah, who asked for more details. I think the fix is about my report
> > under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
> > and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
> > about my report. But I had better wait for what Sarah says. ;-)

Actually, it didn't occur to me that your issue might be related at all.
Sure, you can try this patch, on top of Greg's usb-linus branch, and see
if it fixes your issue.

I just reproduced this on an internal machine, and decided to report it
privately to Rafael first, in case the early hardware was just broken.

Bjorn, I see that you're encouraging people to have their bugs and
symptoms in a bug tracker.  I've also been doing that within Intel, in a
private JIRA issue tracker.  I've been discussing if we can duplicate
some bugs or features that don't contain Intel confidential information
to a public JIRA at 01.org.  I don't really want to use
bugzilla.kernel.org because, quite frankly, the interface is archaic,
and in the past I've gotten pushback from other devs about tracking
"someday" features in there.

If you're interested in making bugs and features more traceable, would
it help you for us to file scrubbed bugs/features in a public JIRA
instance?  If so, I'll talk with our admins further.

Sarah Sharp

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 17:49             ` Bjorn Helgaas
  2013-03-28 18:23               ` Sarah Sharp
@ 2013-03-28 18:31               ` Martin Mokrejs
  2013-03-28 21:27                 ` Rafael J. Wysocki
  1 sibling, 1 reply; 61+ messages in thread
From: Martin Mokrejs @ 2013-03-28 18:31 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Rafael J. Wysocki, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

Hi Bjorn,

Bjorn Helgaas wrote:
> On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
> <mmokrejs@fold.natur.cuni.cz> wrote:
>>
>>
>> Rafael J. Wysocki wrote:
>>> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
>>>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
>>>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>>>> Hi Bjorn,
>>>>>>>
>>>>>>> I wonder what you think about the patch below?
>>>>>>
>>>>>> Seems fine to me (I'm trusting your and Matthew's judgment here since
>>>>>> I don't know much about it).  Why don't you resend it with Matthew's
>>>>>> ack and the appropriate stable tags, and I'll put it in.
>>>>>
>>>>> I will, thanks!
>>>>>
>>>>>> If you have
>>>>>> a URL for a bugzilla or mailing list report of the original problem,
>>>>>> that would be good, too.  It'd be nice if users and distros could
>>>>>> match problem reports with this solution, but I can't tell what the
>>>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
>>>>>> else reproduced the problem and tested the fix)?
>>>>>
>>>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
>>>>> to publicly available mailing list archives etc.
>>>>
>>>> Do you at least have a description of how a user could determine
>>>> whether he is seeing the problem fixed by this patch?
>>>
>>> Yeah.  For example, when the problem is visible on a USB controller and that
>>> controller is runtime-suspended, then plugging a new USB device into one
>>> of the controller's ports won't wake the controller up without the patch.
>>
>> Hi,
>>  I have been wondering for a week or two why nobody answered any of my bug reports,
>> not even Sarah, who asked for more details. I think the fix is about my report
>> under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
>> and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
>> about my report. But I had better wait for what Sarah says. ;-)
> 
> I haven't forgotten about your hotplug issues, but I've been on
> vacation for a week and have been working on the similar issue
> reported by Chris Clayton
> (https://bugzilla.kernel.org/show_bug.cgi?id=54981) because it seemed
> a bit more tractable.  But I'll get back to yours eventually :)
> Unfortunately nobody else seems to be jumping in to help, and I can
> only do so much by myself.
> 
> I haven't been following your XHCI issue at all, but one thing you

But please do so now. If we are talking about an existing patch it should be
possible to say whether what I observed is likely to be fixed by the patch.
I will happily discuss then why I lose interrupts in the same way for my
rtl8169 network card and why this PME# stuff happens for me only with 3.8
and not 3.7 (unlike what Sarah claims). I am not arguing that something
else makes 3.7 able to wake up the device and overcome the same bug
while "it" is gone from 3.8. I think this should be an easy task for you,
pci devs. ;-)


> might consider is that it's easy for us on the receiving end to be
> overwhelmed by the sheer volume of information.  For me personally,
> it's more useful to get specific answers to a few questions than it is
> for me to sort through a lot of speculation and other data.  In some
> cases, "less is more" :)

Although I agree in theory, in reality I can only collect data for you and test a
patch. Even when I extracted the bits which I found important into emails, there
was still not much of an answer. And if there is nobody to go through the data,
then it is a waste of time.

But still, each thread was a different bug and I just thought you would pick
any which looks edible. Why Sarah had to fix PCI/ACPI stuff I don't know,
but yes, that one seemed quite clear.


> 
>>   I would actually have the same comment about the patches proposed in:
>> Yinghai Lu  "Re: [PATCH] PCI: Remove not needed check in disable aspm link"
>> Who tested the bug, if anybody? What change (the fix) in lspci output should one
>> observe?
> 
> Yeah, that's one of the things I'm trying to sort out right now.  It
> *was* tested by Roman, according to the changelog, and I think I can
> dig out the user-visible behavior from the bugzilla
> (https://bugzilla.kernel.org/show_bug.cgi?id=55211), but I definitely
> agree -- that patch needs a lot of tender loving care before I apply
> it.

Good. I think all of these relate to the issues I saw, and I believe one
can find the *now described* broken behavior in my test outputs
and explain in detail what went on.

Best,
Martin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 18:23               ` Sarah Sharp
@ 2013-03-28 19:12                 ` Bjorn Helgaas
  2013-03-28 19:42                   ` Martin Mokrejs
  0 siblings, 1 reply; 61+ messages in thread
From: Bjorn Helgaas @ 2013-03-28 19:12 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Martin Mokrejs, Rafael J. Wysocki, ACPI Devel Maling List,
	Len Brown, Matthew Garrett

On Thu, Mar 28, 2013 at 12:23 PM, Sarah Sharp
<sarah.a.sharp@linux.intel.com> wrote:
> On Thu, Mar 28, 2013 at 11:49:05AM -0600, Bjorn Helgaas wrote:
>> On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
>> <mmokrejs@fold.natur.cuni.cz> wrote:
>> >
>> >
>> > Rafael J. Wysocki wrote:
>> >> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
>> >>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
>> >>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >>>>> If you have
>> >>>>> a URL for a bugzilla or mailing list report of the original problem,
>> >>>>> that would be good, too.  It'd be nice if users and distros could
>> >>>>> match problem reports with this solution, but I can't tell what the
>> >>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
>> >>>>> else reproduced the problem and tested the fix)?
>> >>>>
>> >>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
>> >>>> to publicly available mailing list archives etc.
>> >>>
>> >>> Do you at least have a description of how a user could determine
>> >>> whether he is seeing the problem fixed by this patch?
>> >>
>> >> Yeah.  For example, when the problem is visible on a USB controller and that
>> >> controller is runtime-suspended, then plugging a new USB device into one
>> >> of the controller's ports won't wake the controller up without the patch.
>> >
>> > Hi,
>> >  I have been wondering for a week or two why nobody answered any of my bug reports,
>> > not even Sarah, who asked for more details. I think the fix is about my report
>> > under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
>> > and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
>> > about my report. But I had better wait for what Sarah says. ;-)
>
> Actually, it didn't occur to me that your issue might be related at all.
> Sure, you can try this patch, on top of Greg's usb-linus branch, and see
> if it fixes your issue.
>
> I just reproduced this on an internal machine, and decided to report it
> privately to Rafael first, in case the early hardware was just broken.
>
> Bjorn, I see that you're encouraging people to have their bugs and
> symptoms in a bug tracker.  I've also been doing that within Intel, in a
> private JIRA issue tracker.  I've been discussing if we can duplicate
> some bugs or features that don't contain Intel confidential information
> to a public JIRA at 01.org.  I don't really want to use
> bugzilla.kernel.org because, quite frankly, the interface is archaic,
> and in the past I've gotten pushback from other devs about tracking
> "someday" features in there.

My main concern is that often there's more information relevant to a
change than it makes sense to put in the changelog, so I like to
include a URL to that additional info.  I don't really care if that's
for a mailing list archive, a bugzilla, a JIRA instance, etc.  Issue
trackers are more convenient than mailing lists for collecting dmesg
logs, acpidumps, etc.  The archaic bugzilla interface notwithstanding,
I'm not sure it would be an improvement to have a collection of dozens
of issue trackers controlled by random organizations.  I'd rather have
a single place and confidence that it will stick around.

Bjorn

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 19:12                 ` Bjorn Helgaas
@ 2013-03-28 19:42                   ` Martin Mokrejs
  0 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-03-28 19:42 UTC (permalink / raw)
  To: Bjorn Helgaas, Sarah Sharp
  Cc: Rafael J. Wysocki, ACPI Devel Maling List, Len Brown, Matthew Garrett

Bjorn Helgaas wrote:
> On Thu, Mar 28, 2013 at 12:23 PM, Sarah Sharp

>>
>> Bjorn, I see that you're encouraging people to have their bugs and
>> symptoms in a bug tracker.  I've also been doing that within Intel, in a
>> private JIRA issue tracker.  I've been discussing if we can duplicate
>> some bugs or features that don't contain Intel confidential information
>> to a public JIRA at 01.org.  I don't really want to use
>> bugzilla.kernel.org because, quite frankly, the interface is archaic,
>> and in the past I've gotten pushback from other devs about tracking
>> "someday" features in there.
> 
> My main concern is that often there's more information relevant to a
> change than it makes sense to put in the changelog, so I like to
> include a URL to that additional info.  I don't really care if that's
> for a mailing list archive, a bugzilla, a JIRA instance, etc.  Issue
> trackers are more convenient than mailing lists for collecting dmesg
> logs, acpidumps, etc.  The archaic bugzilla interface notwithstanding,
> I'm not sure it would be an improvement to have a collection of dozens
> of issue trackers controlled by random organizations.  I'd rather have
> a single place and confidence that it will stick around.

What's wrong with bugzilla? It's nice and more appealing than Jira. From
a user perspective I always found Jira ugly. Sorry to say that. ;-)

Martin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [Update][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 17:10 ` [Resend][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
@ 2013-03-28 21:07   ` Rafael J. Wysocki
  2013-03-29 15:05     ` Martin Mokrejs
  2013-04-03 22:38     ` Bjorn Helgaas
  0 siblings, 2 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-28 21:07 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: ACPI Devel Maling List, LKML, Linux PM list, Len Brown,
	Matthew Garrett, Sarah Sharp, Accardi, Kristen C, Huang, Ying,
	linux-pci

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: PCI / ACPI: Always resume devices on ACPI wakeup notifications

It turns out that the _Lxx control methods provided by some BIOSes
clear the PME Status bit of PCI devices they handle, which means that
pci_acpi_wake_dev() cannot really use that bit to check whether or
not the device has signalled wakeup.

One symptom of the problem is, for example, that when an affected PCI
USB controller is runtime-suspended, then plugging in a new USB device
into one of the controller's ports will not wake up the controller,
which should happen.

For this reason, make pci_acpi_wake_dev() always attempt to resume
the device it is called for regardless of the device's PME Status bit
value (that bit still has to be cleared if set at this point,
though).

Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: 3.7+ <stable@vger.kernel.org>
---

The changelog in this version is slightly better than in the previous one, IMHO.

Thanks,
Rafael

---
 drivers/pci/pci-acpi.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/pci/pci-acpi.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-acpi.c
+++ linux-pm/drivers/pci/pci-acpi.c
@@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
 		return;
 	}

-	if (!pci_dev->pm_cap || !pci_dev->pme_support
-	     || pci_check_pme_status(pci_dev)) {
-		if (pci_dev->pme_poll)
-			pci_dev->pme_poll = false;
+	/* Clear PME Status if set. */
+	if (pci_dev->pme_support)
+		pci_check_pme_status(pci_dev);

-		pci_wakeup_event(pci_dev);
-		pm_runtime_resume(&pci_dev->dev);
-	}
+	if (pci_dev->pme_poll)
+		pci_dev->pme_poll = false;
+
+	pci_wakeup_event(pci_dev);
+	pm_runtime_resume(&pci_dev->dev);

 	if (pci_dev->subordinate)
 		pci_pme_wakeup_bus(pci_dev->subordinate);
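
The behavioral change made by the patch above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: struct
fake_pci_dev, fake_check_pme_status(), old_should_resume() and
new_should_resume() are hypothetical names that do not exist in the kernel,
the pm_cap and pme_support fields are reduced to booleans, and the model of
pci_check_pme_status() ignores its PME Enable handling. It only shows which
devices would get resumed on an ACPI wakeup notification before and after
the change.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the struct pci_dev fields involved. */
struct fake_pci_dev {
	bool pm_cap;      /* device has a PCI PM capability */
	bool pme_support; /* device is able to signal PME# */
	bool pme_status;  /* PME Status bit at the time of the notification */
};

/*
 * Simplified model of pci_check_pme_status(): report whether the PME
 * Status bit was set and clear it.  (Assumption: PME Enable is ignored.)
 */
static bool fake_check_pme_status(struct fake_pci_dev *dev)
{
	bool was_set = dev->pme_status;

	dev->pme_status = false;
	return was_set;
}

/* Pre-patch condition: resume only if PME Status cannot or does say "yes". */
static bool old_should_resume(struct fake_pci_dev *dev)
{
	return !dev->pm_cap || !dev->pme_support ||
	       fake_check_pme_status(dev);
}

/* Post-patch behavior: clear PME Status if possible, then always resume. */
static bool new_should_resume(struct fake_pci_dev *dev)
{
	if (dev->pme_support)
		fake_check_pme_status(dev);

	return true;
}
```

In this model, a device whose PME Status bit has already been cleared by a
BIOS _Lxx method (pm_cap and pme_support set, pme_status clear) is skipped by
old_should_resume() but resumed by new_should_resume(), which corresponds to
the USB controller symptom described in the changelog.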

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 18:31               ` Martin Mokrejs
@ 2013-03-28 21:27                 ` Rafael J. Wysocki
  2013-03-29  7:41                   ` huang ying
                                     ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-28 21:27 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: Bjorn Helgaas, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

On Thursday, March 28, 2013 07:31:58 PM Martin Mokrejs wrote:
> Hi Bjorn,
> 
> Bjorn Helgaas wrote:
> > On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
> > <mmokrejs@fold.natur.cuni.cz> wrote:
> >>
> >>
> >> Rafael J. Wysocki wrote:
> >>> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
> >>>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >>>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
> >>>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >>>>>>> Hi Bjorn,
> >>>>>>>
> >>>>>>> I wonder what you think about the patch below?
> >>>>>>
> >>>>>> Seems fine to me (I'm trusting your and Matthew's judgment here since
> >>>>>> I don't know much about it).  Why don't you resend it with Matthew's
> >>>>>> ack and the appropriate stable tags, and I'll put it in.
> >>>>>
> >>>>> I will, thanks!
> >>>>>
> >>>>>> If you have
> >>>>>> a URL for a bugzilla or mailing list report of the original problem,
> >>>>>> that would be good, too.  It'd be nice if users and distros could
> >>>>>> match problem reports with this solution, but I can't tell what the
> >>>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
> >>>>>> else reproduced the problem and tested the fix)?
> >>>>>
> >>>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
> >>>>> to publicly available mailing list archives etc.
> >>>>
> >>>> Do you at least have a description of how a user could determine
> >>>> whether he is seeing the problem fixed by this patch?
> >>>
> >>> Yeah.  For example, when the problem is visible on a USB controller and that
> >>> controller is runtime-suspended, then plugging a new USB device into one
> >>> of the controller's ports won't wake the controller up without the patch.
> >>
> >> Hi,
> >>  I have been wondering for a week or two why nobody answered any of my bug reports,
> >> not even Sarah, who asked for more details. I think the fix is about my report
> >> under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
> >> and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
> >> about my report. But I had better wait for what Sarah says. ;-)
> > 
> > I haven't forgotten about your hotplug issues, but I've been on
> > vacation for a week and have been working on the similar issue
> > reported by Chris Clayton
> > (https://bugzilla.kernel.org/show_bug.cgi?id=54981) because it seemed
> > a bit more tractable.  But I'll get back to yours eventually :)
> > Unfortunately nobody else seems to be jumping in to help, and I can
> > only do so much by myself.
> > 
> > I haven't been following your XHCI issue at all, but one thing you
> 
> But please do so now. If we are talking about an existing patch it should be
> possible to say whether what I observed is likely to be fixed by the patch.
> I will happily discuss then why I lose interrupts in the same way for my
> rtl8169 network card and why this PME# stuff happens for me only with 3.8
> and not 3.7 (unlike what Sarah claims). I am not arguing that something
> else makes 3.7 able to wake up the device and overcome the same bug
> while "it" is gone from 3.8. I think this should be an easy task for you,
> pci devs. ;-)

OK, let's try to establish facts.

Does the patch below cause the PCI PM issues you're seeing to go away?

If it doesn't make all of them go away, does it make *some* of them go away?

If that is the case, which of the problems remain after applying it (on top
of Linus' current tree)?

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: PCI / PM: Disable runtime PM of PCIe ports

The runtime PM of PCIe ports turns out to be quite fragile, as in
some cases things work while in some other cases they don't and we
don't seem to have a good way to determine whether or not they are
going to work in advance.

For this reason, avoid enabling runtime PM for PCIe ports by
keeping their runtime PM reference counters always above 0 for the
time being.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/pcie/portdrv_pci.c |    5 -----
 1 file changed, 5 deletions(-)

Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
+++ linux-pm/drivers/pci/pcie/portdrv_pci.c
@@ -225,16 +225,11 @@ static int pcie_portdrv_probe(struct pci
 	 * it by default.
 	 */
 	dev->d3cold_allowed = false;
-	if (!pci_match_id(port_runtime_pm_black_list, dev))
-		pm_runtime_put_noidle(&dev->dev);
-
 	return 0;
 }
 
 static void pcie_portdrv_remove(struct pci_dev *dev)
 {
-	if (!pci_match_id(port_runtime_pm_black_list, dev))
-		pm_runtime_get_noresume(&dev->dev);
 	pcie_port_device_remove(dev);
 	pci_disable_device(dev);
 }

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
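
The effect of keeping the runtime PM reference counter above 0, as the patch
above does for PCIe ports, can be sketched with a small user-space model.
Everything here is an assumption made for illustration: struct fake_dev and
the fake_*() helpers are hypothetical and only mimic the counting done by
pm_runtime_get_noresume() and pm_runtime_put_noidle(); the sketch also
assumes that a reference is already held when the driver's probe routine
runs, and that a device may be runtime-suspended only while its counter is 0.

```c
#include <stdbool.h>

/* Hypothetical model of a device's runtime PM usage counter. */
struct fake_dev {
	int usage_count;
};

/* Mimics pm_runtime_get_noresume(): take a reference, nothing else. */
static void fake_get_noresume(struct fake_dev *dev)
{
	dev->usage_count++;
}

/* Mimics pm_runtime_put_noidle(): drop a reference, nothing else. */
static void fake_put_noidle(struct fake_dev *dev)
{
	if (dev->usage_count > 0)
		dev->usage_count--;
}

/* Assumption: runtime suspend is only possible with no references held. */
static bool fake_may_runtime_suspend(const struct fake_dev *dev)
{
	return dev->usage_count == 0;
}

/*
 * Models binding the port driver.  The reference taken here stands for the
 * one assumed to be held when probe runs.  drop_reference == true models
 * the pre-patch probe path (for ports not on the blacklist), which dropped
 * it; drop_reference == false models the patched probe path.
 */
static void fake_bind(struct fake_dev *dev, bool drop_reference)
{
	fake_get_noresume(dev);

	if (drop_reference)
		fake_put_noidle(dev);
}
```

With drop_reference set, the counter returns to 0 and the modeled port
becomes eligible for runtime suspend; without it, the counter stays at 1 and
the port is never runtime-suspended, which is the intent stated in the
changelog.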

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 21:27                 ` Rafael J. Wysocki
@ 2013-03-29  7:41                   ` huang ying
  2013-03-31  2:29                     ` Martin Mokrejs
  2013-03-30  2:03                   ` Martin Mokrejs
  2013-03-30 22:38                   ` [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports Rafael J. Wysocki
  2 siblings, 1 reply; 61+ messages in thread
From: huang ying @ 2013-03-29  7:41 UTC (permalink / raw)
  To: Martin Mokrejs, Rafael J. Wysocki
  Cc: Bjorn Helgaas, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

[-- Attachment #1: Type: text/plain, Size: 2631 bytes --]

On Fri, Mar 29, 2013 at 5:27 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Thursday, March 28, 2013 07:31:58 PM Martin Mokrejs wrote:
>> Hi Bjorn,
>>
>> Bjorn Helgaas wrote:
>> > On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
>> > <mmokrejs@fold.natur.cuni.cz> wrote:
>> >>
>> >>
>> >> Rafael J. Wysocki wrote:
>> >>> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
>> >>>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >>>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
>> >>>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >>>>>>> Hi Bjorn,
>> >>>>>>>
>> >>>>>>> I wonder what you think about the patch below?
>> >>>>>>
>> >>>>>> Seems fine to me (I'm trusting your and Matthew's judgment here since
>> >>>>>> I don't know much about it).  Why don't you resend it with Matthew's
>> >>>>>> ack and the appropriate stable tags, and I'll put it in.
>> >>>>>
>> >>>>> I will, thanks!
>> >>>>>
>> >>>>>> If you have
>> >>>>>> a URL for a bugzilla or mailing list report of the original problem,
>> >>>>>> that would be good, too.  It'd be nice if users and distros could
>> >>>>>> match problem reports with this solution, but I can't tell what the
>> >>>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
>> >>>>>> else reproduced the problem and tested the fix)?
>> >>>>>
>> >>>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
>> >>>>> to publicly available mailing list archives etc.
>> >>>>
>> >>>> Do you at least have a description of how a user could determine
>> >>>> whether he is seeing the problem fixed by this patch?
>> >>>
>> >>> Yeah.  For example, when the problem is visible on a USB controller and that
>> >>> controller is runtime-suspended, then plugging a new USB device into one
>> >>> of the controller's ports won't wake the controller up without the patch.
>> >>
>> >> Hi,
>> >>  I have been wondering for a week or two why nobody has answered any of my
>> >> bug reports, not even Sarah, who asked for more details. I think the fix is about my report
>> >> under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
>> >> and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
>> >> about my report. But I had better wait for what Sarah says. ;-)

Hi, Martin,

Sorry for the late reply.  I just found your bug report.  It seems to be
related to PCIe port runtime PM support.

Can you try the attached debug patch and send me back the dmesg output?

Sorry, I am using the Gmail web client, so I can only send the patch as an
attachment.

Best Regards,
Huang Ying
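
[Editorial illustration, not part of the message.] The debug patch attached
to this message keeps a bridge in D0 until a wake notification has been seen
for it at least once. A rough userspace model of that gating is sketched
here. All model_* names are hypothetical, and the final choice between D3hot
and D3cold is a deliberate simplification of what acpi_pci_choose_state()
goes on to do.

```c
#include <stdbool.h>

/* Simplified power states for the model only. */
enum model_pci_state { MODEL_PCI_D0, MODEL_PCI_D3HOT, MODEL_PCI_D3COLD };

/* Hypothetical stand-in for the pci_dev / acpi_device pair. */
struct model_port {
	bool is_bridge;		/* device is a PCI bridge / PCIe port */
	bool run_wake_works;	/* a wake notification has been observed */
	bool no_d3cold;		/* D3cold is not allowed for this device */
};

/* Models the wake handler hunk: remember that run-wake has worked once. */
static void model_wake_notify(struct model_port *p)
{
	p->run_wake_works = true;
}

/* Models the choose_state hunk: unverified bridges stay in D0. */
static enum model_pci_state model_choose_state(const struct model_port *p)
{
	if (p->is_bridge && !p->run_wake_works)
		return MODEL_PCI_D0;

	/* Simplification: pick the deepest state the device allows. */
	return p->no_d3cold ? MODEL_PCI_D3HOT : MODEL_PCI_D3COLD;
}
```

With this gating, a port that never delivers a wake notification is never
put into a low-power state, which is what the "run_wake not verified" message
in the patch is meant to reveal in dmesg.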

[-- Attachment #2: port_wake_dbg.patch --]
[-- Type: application/octet-stream, Size: 1707 bytes --]

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index dee5ddd..8d0b909 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -43,10 +43,16 @@ static void pci_acpi_wake_bus(acpi_handle handle, u32 event, void *context)
 static void pci_acpi_wake_dev(acpi_handle handle, u32 event, void *context)
 {
 	struct pci_dev *pci_dev = context;
+	struct acpi_device *adev;
 
 	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
 		return;
 
+	if (!acpi_bus_get_device(handle, &adev)) {
+		adev->wakeup.flags.run_wake_works = true;
+		dev_info(&pci_dev->dev, "run wake works!\n");
+	}
+
 	if (pci_dev->current_state == PCI_D3cold) {
 		pci_wakeup_event(pci_dev);
 		pm_runtime_resume(&pci_dev->dev);
@@ -146,6 +152,15 @@ phys_addr_t acpi_pci_root_get_mcfg_addr(acpi_handle handle)
 static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
 {
 	int acpi_state, d_max;
+	acpi_handle handle = DEVICE_ACPI_HANDLE(&pdev->dev);
+	struct acpi_device *adev;
+
+	if (pci_is_bridge(pdev) && !acpi_bus_get_device(handle, &adev)) {
+		if (!adev->wakeup.flags.run_wake_works) {
+			dev_info(&pdev->dev, "choose state, run_wake not verified\n");
+			return PCI_D0;
+		}
+	}
 
 	if (pdev->no_d3cold)
 		d_max = ACPI_STATE_D3_HOT;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 22ba56e..bc88419 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -245,6 +245,7 @@ struct acpi_device_perf {
 struct acpi_device_wakeup_flags {
 	u8 valid:1;		/* Can successfully enable wakeup? */
 	u8 run_wake:1;		/* Run-Wake GPE devices */
+	u8 run_wake_works:1;	/* Run-Wake works for the device */
 	u8 notifier_present:1;  /* Wake-up notify handler has been installed */
 };
 


* Re: [Update][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 21:07   ` [Update][PATCH] " Rafael J. Wysocki
@ 2013-03-29 15:05     ` Martin Mokrejs
  2013-03-29 16:05       ` Sarah Sharp
  2013-03-29 21:34       ` Rafael J. Wysocki
  2013-04-03 22:38     ` Bjorn Helgaas
  1 sibling, 2 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-03-29 15:05 UTC (permalink / raw)
  To: Rafael J. Wysocki, Bjorn Helgaas
  Cc: ACPI Devel Maling List, LKML, Linux PM list, Len Brown,
	Matthew Garrett, Sarah Sharp, Accardi, Kristen C, Huang, Ying,
	linux-pci

Hi,
  I applied this patch on top of 3.8.3 hoping it would fix my issue from the
thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled",
but unfortunately it is even worse! Now lsusb -v and lsusb -vv do wake up
the xHCI port, but it falls asleep again immediately, more quickly than I am
able to plug a device into the socket. To get a device working in the USB3
socket I need to plug it in first and then run lsusb -vv; only then is it
recognized.

Without the patch, the 'lsusb -vv' woke up the port (PME# disabled happened
on both 1c.4 and 0b:00.0) and I had unlimited time to find some USB device
around and to plug it into the slot.


  I noticed these messages some time after boot (no external USB devices were
connected to the laptop, neither to the USB 2.0 sockets nor to the USB 3.0
sockets), before I started the tests:

[   36.594171] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
[   36.594202] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
[   36.594247] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
[   36.594349] xhci_hcd 0000:0b:00.0: PME# enabled
[   36.703695] r8169 0000:05:00.0 eth0: link down
[   37.098299] microcode: CPU0 updated to revision 0x28, date = 2012-04-24
[   37.098941] microcode: CPU1 updated to revision 0x28, date = 2012-04-24
[   37.098944] perf_event_intel: PEBS enabled due to microcode update
[   38.343029] r8169 0000:05:00.0 eth0: link up
[   39.094944] r8169 0000:05:00.0 eth0: link down
[   41.492768] r8169 0000:05:00.0 eth0: link up
[   62.782910] xhci_hcd 0000:0b:00.0: Poll event ring: 4294943584
[   62.782938] xhci_hcd 0000:0b:00.0: op reg status = 0xffffffff
[   62.782939] xhci_hcd 0000:0b:00.0: HW died, polling stopped.
[   88.754183] pcieport 0000:00:1c.0: PME# enabled
[   88.764182] xhci_hcd 0000:0b:00.0: PME# disabled
[   88.764192] xhci_hcd 0000:0b:00.0: enabling bus mastering
[   88.764206] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
[   88.764242] xhci_hcd 0000:0b:00.0: Port Status Change Event for port 2
[   88.764246] xhci_hcd 0000:0b:00.0: resume root hub
[   88.764259] xhci_hcd 0000:0b:00.0: handle_port_status: starting port polling.
[   88.764276] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
[   88.764281] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0


What does "HW died" mean? Why is 1c.0 here? What is this device actually doing?

00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=03, subordinate=04, sec-latency=0
        I/O behind bridge: 0000f000-00000fff
        Memory behind bridge: fff00000-000fffff
        Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <16us
                        ClockPM- Surprise- LLActRep+ BwNot-
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                        Slot #0, PowerLimit 10.000W; Interlock- NoCompl+
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
                        Changed: MRL- PresDet- LinkState-
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000
        Capabilities: [90] Subsystem: Dell Device 04b3
        Capabilities: [a0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-
        Kernel driver in use: pcieport





Nevertheless, I went on to check whether the USB3 socket still dies after the
first unplug of a device, or no longer does thanks to the patch being tested:

I plugged into the USB3.0 socket a mouse, it worked. Around its unplug I got:

[   94.954779] hub 3-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x100
[   94.954795] hub 3-0:1.0: hub_suspend
[   94.954802] usb usb3: bus auto-suspend, wakeup 1
[   94.954817] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[   94.954835] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
[   94.954857] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
[   94.954898] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
[   94.954983] xhci_hcd 0000:0b:00.0: PME# enabled
[  169.622513] hub 2-1:1.0: state 7 ports 8 chg 0000 evt 0004
[  169.623057] hub 2-1:1.0: port 2, status 0101, change 0001, 12 Mb/s
[  169.777012] hub 2-1:1.0: debounce: port 2: total 100ms stable 100ms status 0x101
[  169.856992] usb 2-1.2: new low-speed USB device number 4 using ehci-pci

and the port was dead, no matter which lsusb options (-v or -vv) I tried. At about
[  169.622513] I plugged the mouse into a USB2.0 socket (I do not know whether that is 1a.0 or 1d.0).

# lspci -tv
-[0000:00]-+-00.0  Intel Corporation 2nd Generation Core Processor Family DRAM Controller
           +-02.0  Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller
           +-16.0  Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1
           +-1a.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2
           +-1b.0  Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller
           +-1c.0-[03-04]--
           +-1c.1-[05-06]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller
           +-1c.3-[09-0a]----00.0  Intel Corporation Centrino Wireless-N 1030 [Rainbow Peak]
           +-1c.4-[0b-0c]----00.0  Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller
           +-1c.7-[11-16]----00.0  Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller
           +-1d.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1
           +-1f.0  Intel Corporation HM67 Express Chipset Family LPC Controller
           +-1f.2  Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller
           \-1f.3  Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller
#


 Why isn't PME# enabled/disabled reported for the USB 2.0 ports of the
same laptop? There I can plug a device in and out many times and dmesg does
not print a single PME# line.
# grep . /sys/bus/pci/devices/*/power/control
/sys/bus/pci/devices/0000:00:00.0/power/control:auto
/sys/bus/pci/devices/0000:00:02.0/power/control:auto
/sys/bus/pci/devices/0000:00:16.0/power/control:auto
/sys/bus/pci/devices/0000:00:1a.0/power/control:auto
/sys/bus/pci/devices/0000:00:1b.0/power/control:auto
/sys/bus/pci/devices/0000:00:1c.0/power/control:auto
/sys/bus/pci/devices/0000:00:1c.1/power/control:auto
/sys/bus/pci/devices/0000:00:1c.3/power/control:auto
/sys/bus/pci/devices/0000:00:1c.4/power/control:auto
/sys/bus/pci/devices/0000:00:1c.7/power/control:auto
/sys/bus/pci/devices/0000:00:1d.0/power/control:auto
/sys/bus/pci/devices/0000:00:1f.0/power/control:auto
/sys/bus/pci/devices/0000:00:1f.2/power/control:auto
/sys/bus/pci/devices/0000:00:1f.3/power/control:auto
/sys/bus/pci/devices/0000:05:00.0/power/control:auto
/sys/bus/pci/devices/0000:09:00.0/power/control:auto
/sys/bus/pci/devices/0000:0b:00.0/power/control:auto
/sys/bus/pci/devices/0000:11:00.0/power/control:auto
# grep . /sys/bus/pci/devices/*/power/runtime_status
/sys/bus/pci/devices/0000:00:00.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:02.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:16.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1a.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1b.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.3/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.4/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1c.7/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1d.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.2/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.3/power/runtime_status:suspended
/sys/bus/pci/devices/0000:05:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:09:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:0b:00.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:11:00.0/power/runtime_status:active
#
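
[Editorial illustration, not part of the message.] The two grep commands
above read the power/control and power/runtime_status attributes from sysfs.
The same check can be sketched in C. read_attr() and pci_runtime_suspended()
are hypothetical helper names; the only thing taken from the output above is
the path layout /sys/bus/pci/devices/<address>/power/runtime_status and the
value "suspended".

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a text attribute file into buf and strip the
 * trailing newline.  Returns 0 on success, -1 on any error.
 */
static int read_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Report whether the PCI device with the given address (for example
 * "0000:0b:00.0") is runtime-suspended.
 * Returns 1 if suspended, 0 if not, -1 if the attribute cannot be read.
 */
static int pci_runtime_suspended(const char *address)
{
	char path[128];
	char status[32];
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/bus/pci/devices/%s/power/runtime_status", address);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	if (read_attr(path, status, sizeof(status)) != 0)
		return -1;

	return strcmp(status, "suspended") == 0;
}
```

On the laptop described here, pci_runtime_suspended("0000:0b:00.0") would be
expected to return 1 at the moment the grep output above was taken, since
that device is listed as suspended.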

If I run lsusb -vv it does (with the problematic patch):

[ 1760.414086] pcieport 0000:00:1c.4: PME# disabled
[ 1760.434314] xhci_hcd 0000:0b:00.0: PME# disabled
[ 1760.434327] xhci_hcd 0000:0b:00.0: enabling bus mastering
[ 1760.434338] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
[ 1760.434360] xhci_hcd 0000:0b:00.0: Port Status Change Event for port 2
[ 1760.434363] xhci_hcd 0000:0b:00.0: resume root hub
[ 1760.434367] xhci_hcd 0000:0b:00.0: handle_port_status: starting port polling.
[ 1760.434378] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
[ 1760.434383] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
[ 1760.434388] usb usb3: usb auto-resume
[ 1760.434407] hub 3-0:1.0: hub_resume
[ 1760.434439] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
[ 1760.434440] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[ 1760.434464] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x202a0
[ 1760.434465] xhci_hcd 0000:0b:00.0: Get port status returned 0x10100
[ 1760.434492] xhci_hcd 0000:0b:00.0: clear port connect change, actual port 1 status  = 0x2a0
[ 1760.434642] usb usb4: usb wakeup-resume
[ 1760.434646] usb usb4: usb auto-resume
[ 1760.434661] hub 4-0:1.0: hub_resume
[ 1760.434683] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
[ 1760.434684] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
[ 1760.434710] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[ 1760.434711] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
[ 1760.434727] hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
[ 1760.434757] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 0 status  = 0xe0002a0
[ 1760.434784] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 1 status  = 0xe0002a0
[ 1760.434791] hub 4-0:1.0: hub_suspend
[ 1760.434796] usb usb4: bus auto-suspend, wakeup 1
[ 1760.434807] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[ 1760.553734] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[ 1760.553751] hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
[ 1760.574793] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
[ 1760.574794] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[ 1760.575300] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[ 1760.575301] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[ 1760.576768] hub 3-0:1.0: hub_suspend
[ 1760.576774] usb usb3: bus auto-suspend, wakeup 1
[ 1760.576789] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[ 1760.576802] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
[ 1760.576817] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
[ 1760.576851] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
[ 1760.576938] xhci_hcd 0000:0b:00.0: PME# enabled
[ 1760.613874] xhci_hcd 0000:0b:00.0: PME# disabled
[ 1760.613884] xhci_hcd 0000:0b:00.0: enabling bus mastering
[ 1760.613895] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
[ 1760.613914] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
[ 1760.613922] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[ 1760.613924] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
[ 1760.613929] usb usb4: usb auto-resume
[ 1760.613945] hub 4-0:1.0: hub_resume
[ 1760.613981] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
[ 1760.613982] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
[ 1760.614010] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[ 1760.614012] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
[ 1760.614038] usb usb3: usb wakeup-resume
[ 1760.614040] usb usb3: usb auto-resume
[ 1760.614059] hub 3-0:1.0: hub_resume
[ 1760.614080] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
[ 1760.614081] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[ 1760.614104] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[ 1760.614105] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[ 1760.614122] hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
[ 1760.614126] hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
[ 1760.614134] hub 3-0:1.0: hub_suspend
[ 1760.614139] usb usb3: bus auto-suspend, wakeup 1
[ 1760.614152] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[ 1760.623621] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[ 1760.646744] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
[ 1760.646746] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
[ 1760.647281] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[ 1760.647283] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
[ 1760.657965] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 0 status  = 0xe0002a0
[ 1760.657992] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 1 status  = 0xe0002a0
[ 1760.658000] hub 4-0:1.0: hub_suspend
[ 1760.658004] usb usb4: bus auto-suspend, wakeup 1
[ 1760.658015] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[ 1760.658027] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
[ 1760.658042] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
[ 1760.658074] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
[ 1760.658159] xhci_hcd 0000:0b:00.0: PME# enabled
[ 1760.683743] pcieport 0000:00:1c.4: PME# enabled



Hope this helps,
Martin




Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: PCI / ACPI: Always resume devices on ACPI wakeup notifications
> 
> It turns out that the _Lxx control methods provided by some BIOSes
> clear the PME Status bit of PCI devices they handle, which means that
> pci_acpi_wake_dev() cannot really use that bit to check whether or
> not the device has signalled wakeup.
> 
> One symptom of the problem is, for example, that when an affected PCI
> USB controller is runtime-suspended, then plugging in a new USB device
> into one of the controller's ports will not wake up the controller,
> which should happen.
> 
> For this reason, make pci_acpi_wake_dev() always attempt to resume
> the device it is called for regardless of the device's PME Status bit
> value (that bit still has to be cleared if set at this point,
> though).
> 
> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Acked-by: Matthew Garrett <mjg59@srcf.ucam.org>
> Cc: 3.7+ <stable@vger.kernel.org>
> ---
> 
> The changelog in this version is slightly better than in the previous one, IMHO.
> 
> Thanks,
> Rafael
> 
> ---
>  drivers/pci/pci-acpi.c |   15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
>  		return;
>  	}
>  
> -	if (!pci_dev->pm_cap || !pci_dev->pme_support
> -	     || pci_check_pme_status(pci_dev)) {
> -		if (pci_dev->pme_poll)
> -			pci_dev->pme_poll = false;
> +	/* Clear PME Status if set. */
> +	if (pci_dev->pme_support)
> +		pci_check_pme_status(pci_dev);
>  
> -		pci_wakeup_event(pci_dev);
> -		pm_runtime_resume(&pci_dev->dev);
> -	}
> +	if (pci_dev->pme_poll)
> +		pci_dev->pme_poll = false;
> +
> +	pci_wakeup_event(pci_dev);
> +	pm_runtime_resume(&pci_dev->dev);
>  
>  	if (pci_dev->subordinate)
>  		pci_pme_wakeup_bus(pci_dev->subordinate);
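
[Editorial illustration, not part of the quoted patch.] The behavioural
difference made by the hunk above can be shown with a small userspace model.
struct model_pci_dev and the model_*/old_*/new_* functions are hypothetical
names; the model only assumes what the changelog states, namely that the
BIOS _Lxx method may already have cleared PME Status by the time the handler
runs.

```c
#include <stdbool.h>

/* Hypothetical model of the few pci_dev fields the wake handler looks at. */
struct model_pci_dev {
	bool pm_cap;		/* device has a PM capability */
	bool pme_support;	/* device can signal PME from some state */
	bool pme_status;	/* PME Status bit as currently readable */
};

/* Models pci_check_pme_status(): report the bit and clear it if set. */
static bool model_check_pme_status(struct model_pci_dev *d)
{
	bool was_set = d->pme_status;

	d->pme_status = false;
	return was_set;
}

/* Old condition: resume only if PME Status is set or cannot be checked. */
static bool old_handler_resumes(struct model_pci_dev *d)
{
	return !d->pm_cap || !d->pme_support || model_check_pme_status(d);
}

/* New behaviour: clear PME Status if supported, then always resume. */
static bool new_handler_resumes(struct model_pci_dev *d)
{
	if (d->pme_support)
		model_check_pme_status(d);
	return true;
}
```

For a PME-capable device whose status bit was already cleared by firmware,
old_handler_resumes() returns false (the wakeup is lost) while
new_handler_resumes() returns true.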


* Re: [Update][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-29 15:05     ` Martin Mokrejs
@ 2013-03-29 16:05       ` Sarah Sharp
  2013-03-29 17:11         ` Martin Mokrejs
  2013-03-29 21:37         ` Rafael J. Wysocki
  2013-03-29 21:34       ` Rafael J. Wysocki
  1 sibling, 2 replies; 61+ messages in thread
From: Sarah Sharp @ 2013-03-29 16:05 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: Rafael J. Wysocki, Bjorn Helgaas, ACPI Devel Maling List, LKML,
	Linux PM list, Len Brown, Matthew Garrett, Accardi, Kristen C,
	Huang, Ying, linux-pci

On Fri, Mar 29, 2013 at 04:05:54PM +0100, Martin Mokrejs wrote:
> [   36.594171] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
> [   36.594202] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> [   36.594247] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
> [   36.594349] xhci_hcd 0000:0b:00.0: PME# enabled
> [   36.703695] r8169 0000:05:00.0 eth0: link down
> [   37.098299] microcode: CPU0 updated to revision 0x28, date = 2012-04-24
> [   37.098941] microcode: CPU1 updated to revision 0x28, date = 2012-04-24
> [   37.098944] perf_event_intel: PEBS enabled due to microcode update
> [   38.343029] r8169 0000:05:00.0 eth0: link up
> [   39.094944] r8169 0000:05:00.0 eth0: link down
> [   41.492768] r8169 0000:05:00.0 eth0: link up
> [   62.782910] xhci_hcd 0000:0b:00.0: Poll event ring: 4294943584
> [   62.782938] xhci_hcd 0000:0b:00.0: op reg status = 0xffffffff
> [   62.782939] xhci_hcd 0000:0b:00.0: HW died, polling stopped.
> [   88.754183] pcieport 0000:00:1c.0: PME# enabled
> [   88.764182] xhci_hcd 0000:0b:00.0: PME# disabled
> [   88.764192] xhci_hcd 0000:0b:00.0: enabling bus mastering
> [   88.764206] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> [   88.764242] xhci_hcd 0000:0b:00.0: Port Status Change Event for port 2
> [   88.764246] xhci_hcd 0000:0b:00.0: resume root hub
> [   88.764259] xhci_hcd 0000:0b:00.0: handle_port_status: starting port polling.
> [   88.764276] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
> [   88.764281] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
> 
> 
> What does "HW died" mean? Why is 1c.0 here? What is this device actually doing?

It's harmless.  The xHCI polling loop to debug the host registers and
rings simply notices that the registers are reading as all ffs.  I
believe that's normal when a PCI device is in D3.  I just haven't had
time to make a patch to disable the polling loop when the host is suspended.

So, for now, ignore the "HW died, polling stopped." messages.

> Nevertheless, I went on to check whether the USB3 socket still dies after the
> first unplug of a device, or no longer does thanks to the patch being tested:
> 
> I plugged into the USB3.0 socket a mouse, it worked. Around its unplug I got:
> 
> [   94.954779] hub 3-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x100
> [   94.954795] hub 3-0:1.0: hub_suspend
> [   94.954802] usb usb3: bus auto-suspend, wakeup 1
> [   94.954817] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> [   94.954835] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
> [   94.954857] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> [   94.954898] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
> [   94.954983] xhci_hcd 0000:0b:00.0: PME# enabled
> [  169.622513] hub 2-1:1.0: state 7 ports 8 chg 0000 evt 0004
> [  169.623057] hub 2-1:1.0: port 2, status 0101, change 0001, 12 Mb/s
> [  169.777012] hub 2-1:1.0: debounce: port 2: total 100ms stable 100ms status 0x101
> [  169.856992] usb 2-1.2: new low-speed USB device number 4 using ehci-pci
> 
> and the port was dead, no matter which lsusb options (-v or -vv) I tried. At about
> [  169.622513] I plugged the mouse into a USB2.0 socket (I do not know whether that is 1a.0 or 1d.0).

All right, I wonder if the USB core/xHCI driver is forgetting to clear a
port status change bit after the device is unplugged.  That can cause
the xHCI host to not give us a port status change event later (and thus
no PME).  Looking at the logs later, it doesn't seem like we do this
though.

> If I run lsusb -vv it does (with the problematic patch):
> 
> [ 1760.414086] pcieport 0000:00:1c.4: PME# disabled
> [ 1760.434314] xhci_hcd 0000:0b:00.0: PME# disabled
> [ 1760.434327] xhci_hcd 0000:0b:00.0: enabling bus mastering
> [ 1760.434338] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> [ 1760.434360] xhci_hcd 0000:0b:00.0: Port Status Change Event for port 2

Ok, so the xHCI driver *is* getting a port status change event, and thus
must have gotten a PME.  So the PCI layer is doing its job.

> [ 1760.434363] xhci_hcd 0000:0b:00.0: resume root hub
> [ 1760.434367] xhci_hcd 0000:0b:00.0: handle_port_status: starting port polling.
> [ 1760.434378] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
> [ 1760.434383] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
> [ 1760.434388] usb usb3: usb auto-resume
> [ 1760.434407] hub 3-0:1.0: hub_resume
> [ 1760.434439] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> [ 1760.434440] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
> [ 1760.434464] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x202a0
> [ 1760.434465] xhci_hcd 0000:0b:00.0: Get port status returned 0x10100
> [ 1760.434492] xhci_hcd 0000:0b:00.0: clear port connect change, actual port 1 status  = 0x2a0

Odd.  The port status shows there's no device connected, but there was a
connect change:

sarah@xanatos:~$ ./decode-port-status 0x202a0
port status = 0x0202a0
 bit  0     (CCS)          0x0, device not connected
 bit  1     (PED)          0x0, port disabled
 bit  3     (OCA)          0x0, no over-current condition
 bit  4     (PR)           0x0, port not in reset
 bits 8:5   (PLS)          0x5, link is in the RxDetect state
 bit  9     (PP)           0x1, port power on
 bits 13:10 (speed)        0x0, Undefined
 bits 15:14 (indicators)   0x0, port indicators are off
 bit  17    (CSC)          0x1, connect change
 bit  18    (PEC)          0x0, no port enable/disable change
 bit  19    (WRC)          0x0, no warm port reset change
 bit  20    (OCC)          0x0, no over-current change
 bit  21    (PRC)          0x0, no port reset change
 bit  22    (PLC)          0x0, no port link change
 bit  23    (CEC)          0x0, no port config error change
 bit  25    (WCE)          0x0, wake on connect disabled
 bit  26    (WDE)          0x0, wake on disconnect disabled
 bit  27    (WOE)          0x0, wake on over-current enable disabled
 bit  30    (DR)           0x0, device is permanently attached

RxDetect is the "I'm looking for a USB device" port state.
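
[Editorial illustration, not part of the message.] The decode-port-status
tool used above is not reproduced in this thread. The following is an
independent minimal sketch of such a decoder, covering only some of the bit
positions that appear in the output above (CCS, PED, PLS, PP, CSC, WCE).
struct portsc_fields and decode_portsc() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed in the decode output above. */
#define PORTSC_CCS		(1u << 0)	/* current connect status */
#define PORTSC_PED		(1u << 1)	/* port enabled/disabled */
#define PORTSC_PLS_SHIFT	5		/* port link state, bits 8:5 */
#define PORTSC_PLS_MASK		0xfu
#define PORTSC_PP		(1u << 9)	/* port power */
#define PORTSC_CSC		(1u << 17)	/* connect status change */
#define PORTSC_WCE		(1u << 25)	/* wake on connect enable */

#define PORTSC_PLS_RXDETECT	5u		/* "looking for a device" */

/* Decoded view of the fields this sketch cares about. */
struct portsc_fields {
	bool connected;
	bool enabled;
	bool powered;
	bool connect_change;
	bool wake_on_connect;
	unsigned int link_state;
};

/* Split a raw port status value into the fields above. */
static struct portsc_fields decode_portsc(uint32_t portsc)
{
	struct portsc_fields f;

	f.connected = (portsc & PORTSC_CCS) != 0;
	f.enabled = (portsc & PORTSC_PED) != 0;
	f.powered = (portsc & PORTSC_PP) != 0;
	f.connect_change = (portsc & PORTSC_CSC) != 0;
	f.wake_on_connect = (portsc & PORTSC_WCE) != 0;
	f.link_state = (portsc >> PORTSC_PLS_SHIFT) & PORTSC_PLS_MASK;
	return f;
}
```

Applied to 0x202a0, this yields a powered port in RxDetect with a connect
change flagged but no device connected, matching the decode above.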

> [ 1760.434642] usb usb4: usb wakeup-resume
> [ 1760.434646] usb usb4: usb auto-resume
> [ 1760.434661] hub 4-0:1.0: hub_resume
> [ 1760.434683] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> [ 1760.434684] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
> [ 1760.434710] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
> [ 1760.434711] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
> [ 1760.434727] hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
> [ 1760.434757] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 0 status  = 0xe0002a0
> [ 1760.434784] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 1 status  = 0xe0002a0
> [ 1760.434791] hub 4-0:1.0: hub_suspend
> [ 1760.434796] usb usb4: bus auto-suspend, wakeup 1
> [ 1760.434807] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> [ 1760.553734] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> [ 1760.553751] hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
> [ 1760.574793] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> [ 1760.574794] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
> [ 1760.575300] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
> [ 1760.575301] xhci_hcd 0000:0b:00.0: Get port status returned 0x100

sarah@xanatos:~$ ./decode-port-status 0x2a0
port status = 0x0002a0
 bit  0     (CCS)          0x0, device not connected
 bit  1     (PED)          0x0, port disabled
 bit  3     (OCA)          0x0, no over-current condition
 bit  4     (PR)           0x0, port not in reset
 bits 8:5   (PLS)          0x5, link is in the RxDetect state
 bit  9     (PP)           0x1, port power on
 bits 13:10 (speed)        0x0, Undefined
 bits 15:14 (indicators)   0x0, port indicators are off
 bit  17    (CSC)          0x0, no connect change
 bit  18    (PEC)          0x0, no port enable/disable change
 bit  19    (WRC)          0x0, no warm port reset change
 bit  20    (OCC)          0x0, no over-current change
 bit  21    (PRC)          0x0, no port reset change
 bit  22    (PLC)          0x0, no port link change
 bit  23    (CEC)          0x0, no port config error change
 bit  25    (WCE)          0x0, wake on connect disabled
 bit  26    (WDE)          0x0, wake on disconnect disabled
 bit  27    (WOE)          0x0, wake on over-current enable disabled
 bit  30    (DR)           0x0, device is permanently attached

Nope, your host really isn't reporting there's a device connected
*at all*.  That's just broken hardware, and there's really nothing
software can do if the hardware isn't reporting connect events, even
with polling.
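
For anyone who wants to repeat the decoding above by hand, here is a minimal Python sketch. The bit positions and field names are copied from the decode-port-status output in this message; the function name `decode_portsc` and the table layout are illustrative assumptions and not the actual script.

```python
# Illustrative sketch only: this is not the decode-port-status script used above.
# Bit positions are taken from the decoded output quoted in this thread.

SINGLE_BITS = {
    "CCS": 0, "PED": 1, "OCA": 3, "PR": 4, "PP": 9,
    "CSC": 17, "PEC": 18, "WRC": 19, "OCC": 20,
    "PRC": 21, "PLC": 22, "CEC": 23,
    "WCE": 25, "WDE": 26, "WOE": 27, "DR": 30,
}

# Multi-bit fields as (lowest bit, width in bits).
FIELDS = {
    "PLS": (5, 4),          # bits 8:5
    "speed": (10, 4),       # bits 13:10
    "indicators": (14, 2),  # bits 15:14
}


def decode_portsc(value):
    """Split a 32-bit port status value into named fields."""
    out = {name: (value >> bit) & 1 for name, bit in SINGLE_BITS.items()}
    for name, (low, width) in FIELDS.items():
        out[name] = (value >> low) & ((1 << width) - 1)
    return out


if __name__ == "__main__":
    # Values seen in the log above.
    for raw in (0x2a0, 0x202a0, 0xe0002a0):
        nonzero = {k: v for k, v in decode_portsc(raw).items() if v}
        print(f"{raw:#010x}: {nonzero}")
```

With the values from the log, 0x2a0 decodes to PP=1 and PLS=5 (RxDetect) with CCS=0, 0x202a0 additionally has CSC=1, and 0xe0002a0 has WCE, WDE and WOE set.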

It also doesn't sound like the other TI redriver bug.  That bug only
affected USB 3.0 ports, and when lsusb was run, we would find the port
in Compliance Mode.  This is the host simply not reporting the USB 2.0
port connect at all.

Maybe if we completely disable PCI runtime PM for your host, we can work
around this bug?

Can you send me the output of `sudo lspci -vvv -n` again?

Sarah Sharp

> [ 1760.576768] hub 3-0:1.0: hub_suspend
> [ 1760.576774] usb usb3: bus auto-suspend, wakeup 1
> [ 1760.576789] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> [ 1760.576802] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
> [ 1760.576817] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> [ 1760.576851] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
> [ 1760.576938] xhci_hcd 0000:0b:00.0: PME# enabled
> [ 1760.613874] xhci_hcd 0000:0b:00.0: PME# disabled
> [ 1760.613884] xhci_hcd 0000:0b:00.0: enabling bus mastering
> [ 1760.613895] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> [ 1760.613914] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
> [ 1760.613922] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> [ 1760.613924] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
> [ 1760.613929] usb usb4: usb auto-resume
> [ 1760.613945] hub 4-0:1.0: hub_resume
> [ 1760.613981] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> [ 1760.613982] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
> [ 1760.614010] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
> [ 1760.614012] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
> [ 1760.614038] usb usb3: usb wakeup-resume
> [ 1760.614040] usb usb3: usb auto-resume
> [ 1760.614059] hub 3-0:1.0: hub_resume
> [ 1760.614080] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> [ 1760.614081] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
> [ 1760.614104] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
> [ 1760.614105] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
> [ 1760.614122] hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
> [ 1760.614126] hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
> [ 1760.614134] hub 3-0:1.0: hub_suspend
> [ 1760.614139] usb usb3: bus auto-suspend, wakeup 1
> [ 1760.614152] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> [ 1760.623621] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> [ 1760.646744] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> [ 1760.646746] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
> [ 1760.647281] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
> [ 1760.647283] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
> [ 1760.657965] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 0 status  = 0xe0002a0
> [ 1760.657992] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 1 status  = 0xe0002a0
> [ 1760.658000] hub 4-0:1.0: hub_suspend
> [ 1760.658004] usb usb4: bus auto-suspend, wakeup 1
> [ 1760.658015] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> [ 1760.658027] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
> [ 1760.658042] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> [ 1760.658074] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
> [ 1760.658159] xhci_hcd 0000:0b:00.0: PME# enabled
> [ 1760.683743] pcieport 0000:00:1c.4: PME# enabled


* Re: [Update][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-29 16:05       ` Sarah Sharp
@ 2013-03-29 17:11         ` Martin Mokrejs
  2013-03-29 18:16           ` Martin Mokrejs
  2013-03-29 21:37         ` Rafael J. Wysocki
  1 sibling, 1 reply; 61+ messages in thread
From: Martin Mokrejs @ 2013-03-29 17:11 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Rafael J. Wysocki, Bjorn Helgaas, ACPI Devel Maling List, LKML,
	Linux PM list, Len Brown, Matthew Garrett, Accardi, Kristen C,
	Huang, Ying, linux-pci

Sarah,
  please let me know if you think the test was skewed by laptop-mode-tools
kicking in, although I believed they were not running while I was on AC power.
I was testing under these conditions:

vostro ~ # grep . /sys/bus/pci/devices/*/power/control
/sys/bus/pci/devices/0000:00:00.0/power/control:auto
/sys/bus/pci/devices/0000:00:02.0/power/control:auto
/sys/bus/pci/devices/0000:00:16.0/power/control:auto
/sys/bus/pci/devices/0000:00:1a.0/power/control:auto
/sys/bus/pci/devices/0000:00:1b.0/power/control:auto
/sys/bus/pci/devices/0000:00:1c.0/power/control:auto
/sys/bus/pci/devices/0000:00:1c.1/power/control:auto
/sys/bus/pci/devices/0000:00:1c.3/power/control:auto
/sys/bus/pci/devices/0000:00:1c.4/power/control:auto
/sys/bus/pci/devices/0000:00:1c.7/power/control:auto
/sys/bus/pci/devices/0000:00:1d.0/power/control:auto
/sys/bus/pci/devices/0000:00:1f.0/power/control:auto
/sys/bus/pci/devices/0000:00:1f.2/power/control:auto
/sys/bus/pci/devices/0000:00:1f.3/power/control:auto
/sys/bus/pci/devices/0000:05:00.0/power/control:auto
/sys/bus/pci/devices/0000:09:00.0/power/control:auto
/sys/bus/pci/devices/0000:0b:00.0/power/control:auto
/sys/bus/pci/devices/0000:11:00.0/power/control:auto
vostro ~ # grep . /sys/bus/pci/devices/*/power/runtime_status
/sys/bus/pci/devices/0000:00:00.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:02.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:16.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1a.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1b.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.3/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.4/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.7/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1d.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.2/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.3/power/runtime_status:suspended
/sys/bus/pci/devices/0000:05:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:09:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:0b:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:11:00.0/power/runtime_status:active
vostro ~ # 
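
The two listings above can also be collected per device in one pass. The sketch below is an illustration only: it assumes the sysfs layout shown in the listings (`power/control` and `power/runtime_status` under each device directory), and the function name `runtime_pm_state` and its `root` parameter are hypothetical.

```python
import os


def runtime_pm_state(root="/sys/bus/pci/devices"):
    """Return {device: (control, runtime_status)} for each entry under root.

    An attribute that cannot be read is reported as None.
    """
    def read(path):
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            return None

    state = {}
    for dev in sorted(os.listdir(root)):
        power = os.path.join(root, dev, "power")
        state[dev] = (read(os.path.join(power, "control")),
                      read(os.path.join(power, "runtime_status")))
    return state


if __name__ == "__main__":
    # Only meaningful on a Linux system that exposes PCI devices in sysfs.
    if os.path.isdir("/sys/bus/pci/devices"):
        for dev, (control, status) in runtime_pm_state().items():
            print(f"{dev} control={control} runtime_status={status}")
```

Having both values side by side makes it easier to spot a device that is set to "auto" and currently suspended, such as 0000:00:1c.0 in the listings above.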

My apologies if that skewed the test, and thanks for your detailed explanations.

I do have a few questions below, however.

Sarah Sharp wrote:
> On Fri, Mar 29, 2013 at 04:05:54PM +0100, Martin Mokrejs wrote:

> 
>> Nevertheless, I went to check whether the USB3 socket still dies after the first unplug of a device,
>> or whether the patch being tested fixes that:
>>
>> I plugged into the USB3.0 socket a mouse, it worked. Around its unplug I got:
>>
>> [   94.954779] hub 3-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x100
>> [   94.954795] hub 3-0:1.0: hub_suspend
>> [   94.954802] usb usb3: bus auto-suspend, wakeup 1
>> [   94.954817] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
>> [   94.954835] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
>> [   94.954857] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
>> [   94.954898] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
>> [   94.954983] xhci_hcd 0000:0b:00.0: PME# enabled
>> [  169.622513] hub 2-1:1.0: state 7 ports 8 chg 0000 evt 0004
>> [  169.623057] hub 2-1:1.0: port 2, status 0101, change 0001, 12 Mb/s
>> [  169.777012] hub 2-1:1.0: debounce: port 2: total 100ms stable 100ms status 0x101
>> [  169.856992] usb 2-1.2: new low-speed USB device number 4 using ehci-pci
>>
>> and the port was dead, no matter what "lsusb -v or -vv" options I tried. At about
>> [  169.622513] I plugged the mouse into a USB2.0 socket (do not know if that is 1a.0 or 1d.0).
> 
> All right, I wonder if the USB core/xHCI driver is forgetting to clear a
> port status change bit after the device is unplugged.  That can cause
> the xHCI host to not give us a port status change event later (and thus
> no PME).  Looking at the logs later, it doesn't seem like we do this
> though.
> 
>> If I run lsusb -vv it does (with the problematic patch):
>>
>> [ 1760.414086] pcieport 0000:00:1c.4: PME# disabled
>> [ 1760.434314] xhci_hcd 0000:0b:00.0: PME# disabled
>> [ 1760.434327] xhci_hcd 0000:0b:00.0: enabling bus mastering
>> [ 1760.434338] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
>> [ 1760.434360] xhci_hcd 0000:0b:00.0: Port Status Change Event for port 2
> 
> Ok, so the xHCI driver *is* getting a port status change event, and thus
> must have gotten a PME.  So the PCI layer is doing its job.
> 
>> [ 1760.434363] xhci_hcd 0000:0b:00.0: resume root hub
>> [ 1760.434367] xhci_hcd 0000:0b:00.0: handle_port_status: starting port polling.
>> [ 1760.434378] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
>> [ 1760.434383] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
>> [ 1760.434388] usb usb3: usb auto-resume
>> [ 1760.434407] hub 3-0:1.0: hub_resume
>> [ 1760.434439] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
>> [ 1760.434440] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
>> [ 1760.434464] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x202a0
>> [ 1760.434465] xhci_hcd 0000:0b:00.0: Get port status returned 0x10100
>> [ 1760.434492] xhci_hcd 0000:0b:00.0: clear port connect change, actual port 1 status  = 0x2a0
> 
> Odd.  The port status shows there's no device connected, but there was a
> connect change:
> 
> sarah@xanatos:~$ ./decode-port-status 0x202a0
> port status = 0x0202a0
>  bit  0     (CCS)          0x0, device not connected
>  bit  1     (PED)          0x0, port disabled
>  bit  3     (OCA)          0x0, no over-current condition
>  bit  4     (PR)           0x0, port not in reset
>  bits 8:5   (PLS)          0x5, link is in the RxDetect state
>  bit  9     (PP)           0x1, port power on
>  bits 13:10 (speed)        0x0, Undefined
>  bits 15:14 (indicators)   0x0, port indicators are off
>  bit  17    (CSC)          0x1, connect change
>  bit  18    (PEC)          0x0, no port enable/disable change
>  bit  19    (WRC)          0x0, no warm port reset change
>  bit  20    (OCC)          0x0, no over-current change
>  bit  21    (PRC)          0x0, no port reset change
>  bit  22    (PLC)          0x0, no port link change
>  bit  23    (CEC)          0x0, no port config error change
>  bit  25    (WCE)          0x0, wake on connect disabled
>  bit  26    (WDE)          0x0, wake on disconnect disabled
>  bit  27    (WOE)          0x0, wake on over-current enable disabled
>  bit  30    (DR)           0x0, device is permanently attached
> 
> RxDetect is the "I'm looking for a USB device" port state.
> 
>> [ 1760.434642] usb usb4: usb wakeup-resume
>> [ 1760.434646] usb usb4: usb auto-resume
>> [ 1760.434661] hub 4-0:1.0: hub_resume
>> [ 1760.434683] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
>> [ 1760.434684] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
>> [ 1760.434710] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
>> [ 1760.434711] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
>> [ 1760.434727] hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
>> [ 1760.434757] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 0 status  = 0xe0002a0
>> [ 1760.434784] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 1 status  = 0xe0002a0
>> [ 1760.434791] hub 4-0:1.0: hub_suspend
>> [ 1760.434796] usb usb4: bus auto-suspend, wakeup 1
>> [ 1760.434807] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
>> [ 1760.553734] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
>> [ 1760.553751] hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
>> [ 1760.574793] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
>> [ 1760.574794] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
>> [ 1760.575300] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
>> [ 1760.575301] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
> 
> sarah@xanatos:~$ ./decode-port-status 0x2a0
> port status = 0x0002a0
>  bit  0     (CCS)          0x0, device not connected
>  bit  1     (PED)          0x0, port disabled
>  bit  3     (OCA)          0x0, no over-current condition
>  bit  4     (PR)           0x0, port not in reset
>  bits 8:5   (PLS)          0x5, link is in the RxDetect state
>  bit  9     (PP)           0x1, port power on
>  bits 13:10 (speed)        0x0, Undefined
>  bits 15:14 (indicators)   0x0, port indicators are off
>  bit  17    (CSC)          0x0, no connect change
>  bit  18    (PEC)          0x0, no port enable/disable change
>  bit  19    (WRC)          0x0, no warm port reset change
>  bit  20    (OCC)          0x0, no over-current change
>  bit  21    (PRC)          0x0, no port reset change
>  bit  22    (PLC)          0x0, no port link change
>  bit  23    (CEC)          0x0, no port config error change
>  bit  25    (WCE)          0x0, wake on connect disabled
>  bit  26    (WDE)          0x0, wake on disconnect disabled
>  bit  27    (WOE)          0x0, wake on over-current enable disabled
>  bit  30    (DR)           0x0, device is permanently attached
> 
> Nope, your host really isn't reporting there's a device connected
> *at all*.  That's just broken hardware, and there's really nothing
> software can do if the hardware isn't reporting connect events, even
> with polling.
> 
> It also doesn't sound like the other TI redriver bug.  That bug only
> affected USB 3.0 ports, and when lsusb was run, we would find the port
> in Compliance Mode.  This is the host simply not reporting the USB 2.0
> port connect at all.
> 
> Maybe if we completely disable PCI runtime PM for your host, we can work
> around this bug?

I am not sure I understand what you mean. The proposed patch makes the
situation worse. To be able to use the xHCI port a second and later time,
I have to plug in a device and run 'lsusb -vv' to get the device detected
before the "port" falls asleep. This is NOT necessary for the SandyBridge
USB2.0 port under the same conditions. (Regarding the disclaimer I made
at the very top of this message: from the other thread, where Ying found that
I had laptop-mode-tools installed, I realized that laptop-mode-tools
fiddled with the xHCI port but NOT with the EHCI port. Please correct me if
I am wrong.) So, please reconsider your above conclusions. Most importantly,
I do not understand "This is the host simply not reporting the USB 2.0
port connect at all." Did you mean "USB 3.0" instead?

Other than that, I am ready to file a bug report at Dell's Pro Support site,
but in my last experience they were quite clueless regarding the broken
"express card PresDet detection". ;-) I bet they could replace the Texas Instruments
USB3.0 card, which is a separate part in the laptop. At least I could get
TI hardware in which the redriver has hopefully been fixed. ;-)
How could I reproduce the issue in Win7, which is also on the laptop? I mean,
how do I change the PM handling to reproduce what I got on Linux under laptop-mode-tools?
;)

> 
> Can you send me the output of `sudo lspci -vvv -n` again?

I will send it after I reboot into a clean state and re-test the behavior with
laptop-mode-tools gone. Maybe the issue will remain anyway.
So far I have effectively been testing with:

echo auto > /sys/bus/pci/devices/0000:0b:00.0/power/control

and without the laptop-mode-tools trickery I should now be testing with

echo on > /sys/bus/pci/devices/0000:0b:00.0/power/control

, right?

Thank you,
Martin


* Re: [Update][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-29 17:11         ` Martin Mokrejs
@ 2013-03-29 18:16           ` Martin Mokrejs
  0 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-03-29 18:16 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Rafael J. Wysocki, Bjorn Helgaas, ACPI Devel Maling List, LKML,
	Linux PM list, Len Brown, Matthew Garrett, Accardi, Kristen C,
	Huang, Ying, linux-pci

[-- Attachment #1: Type: text/plain, Size: 9237 bytes --]

So, I re-tested with the patch on 3.8.3 but without laptop-mode-tools.
The xHCI port works fine provided /sys/bus/pci/devices/0000:0b:00.0/power/control
is set to on and /sys/bus/pci/devices/0000:00:1c.4/power/control is also set to on.
If I set the parent 1c.4 to auto, it gets suspended and the port seems dead until
a device is plugged in and I wake it using lsusb -vv. There must be a bug in Linux
that keeps it from coping with the upstream port 1c.4 being asleep when it wants
to access 0b:00. Or, more likely, that upstream root port should be prevented
from falling asleep, right?


# lspci -tv
-[0000:00]-+-00.0  Intel Corporation 2nd Generation Core Processor Family DRAM Controller
           +-02.0  Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller
           +-16.0  Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1
           +-1a.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2
           +-1b.0  Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller
           +-1c.0-[03-04]--
           +-1c.1-[05-06]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller
           +-1c.3-[09-0a]----00.0  Intel Corporation Centrino Wireless-N 1030 [Rainbow Peak]
           +-1c.4-[0b-0c]----00.0  Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller
           +-1c.7-[11-16]----00.0  Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller
           +-1d.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1
           +-1f.0  Intel Corporation HM67 Express Chipset Family LPC Controller
           +-1f.2  Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller
           \-1f.3  Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller
#
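
The tree above shows 0b:00.0 sitting behind root port 1c.4, which is why the root port's power/control setting matters too. As a rough sketch (the function name `pci_ancestors` is made up for illustration), the upstream ports of a device can be read off its resolved sysfs path, because sysfs nests each PCI device directory under its parent bridge:

```python
import os
import re

# Matches a PCI address such as 0000:00:1c.4 (domain:bus:device.function).
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def pci_ancestors(resolved_path):
    """Return the upstream PCI devices named in a resolved sysfs path,
    nearest parent first. The device itself (last match) is excluded."""
    parts = [p for p in resolved_path.split("/") if PCI_ADDR.match(p)]
    return parts[-2::-1]


if __name__ == "__main__":
    # Only meaningful on a Linux system that has this device.
    link = "/sys/bus/pci/devices/0000:0b:00.0"
    if os.path.exists(link):
        for parent in pci_ancestors(os.path.realpath(link)):
            print(f"/sys/bus/pci/devices/{parent}/power/control")
```

On the machine above this should yield 0000:00:1c.4 for 0000:0b:00.0, matching the lspci -tv output.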



I have attached the lspci -vvv -n output.

Interestingly, the state of the TI xHCI controller had changed by the end
of my tests. I booted with power/control set to on for all devices,
because laptop-mode-tools was uninstalled. I fiddled with the echo commands,
tweaking 1c.4 and 0b:00, but in the end set both back to "on". However,
below is a diff of what changed. I don't know what that means. Maybe it is
because I tried to write '0', 'off', 'none' to the control file? ;-)

 00:1c.4 0604: 8086:1c18 (rev b5) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=0b, subordinate=0c, sec-latency=0
        I/O behind bridge: 0000f000-00000fff
        Memory behind bridge: f7d00000-f7dfffff
        Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #5, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <16us
                        ClockPM- Surprise- LLActRep+ BwNot-
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                        Slot #4, PowerLimit 10.000W; Interlock- NoCompl+
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet- LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
-               RootSta: PME ReqID 0000, PMEStatus- PMEPending-
+               RootSta: PME ReqID 0b00, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000
        Capabilities: [90] Subsystem: 1028:04b3
        Capabilities: [a0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Kernel driver in use: pcieport


 0b:00.0 0c03: 104c:8241 (rev 02) (prog-if 30 [XHCI])
        Subsystem: 1028:04b3
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
        Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                        ClockPM+ Surprise- LLActRep- BwNot-
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
                Vector table: BAR=2 offset=00000000
                PBA: BAR=2 offset=00001000
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
-               CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
+               CESta:  RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
        Kernel driver in use: xhci_hcd
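
On the RootSta change in the diff above: the PME ReqID field is a 16-bit PCIe requester ID, which packs bus (8 bits), device (5 bits) and function (3 bits). A minimal sketch of the unpacking, with hypothetical helper names `requester_id_to_bdf` and `format_bdf`:

```python
def requester_id_to_bdf(req_id):
    """Unpack a 16-bit PCIe requester ID into (bus, device, function)."""
    if not 0 <= req_id <= 0xFFFF:
        raise ValueError("requester ID must fit in 16 bits")
    bus = req_id >> 8
    device = (req_id >> 3) & 0x1F
    function = req_id & 0x7
    return bus, device, function


def format_bdf(req_id):
    """Format a requester ID the way lspci prints addresses, e.g. '0b:00.0'."""
    bus, device, function = requester_id_to_bdf(req_id)
    return f"{bus:02x}:{device:02x}.{function}"


if __name__ == "__main__":
    # ReqID value from the RootSta line in the diff above.
    print(format_bdf(0x0b00))
```

Read that way, ReqID 0b00 corresponds to 0b:00.0, the xHCI controller, which would suggest that root port 1c.4 recorded a PME from it at some point during the tests.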




[-- Attachment #2: lspci_vvvn_initial.txt --]
[-- Type: text/plain, Size: 31820 bytes --]

00:00.0 0600: 8086:0104 (rev 09)
	Subsystem: 1028:04b3
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=0c <?>

00:02.0 0300: 8086:0126 (rev 09) (prog-if 00 [VGA controller])
	Subsystem: 1028:04b3
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 47
	Region 0: Memory at f6800000 (64-bit, non-prefetchable) [size=4M]
	Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Region 4: I/O ports at f000 [size=64]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0300c  Data: 4162
	Capabilities: [d0] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a4] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: i915

00:16.0 0780: 8086:1c3a (rev 04)
	Subsystem: 1028:04b3
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at f7f0a000 (64-bit, non-prefetchable) [size=16]
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [8c] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000

00:1a.0 0c03: 8086:1c2d (rev 05) (prog-if 20 [EHCI])
	Subsystem: 1028:04b3
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at f7f08000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Debug port: BAR=1 offset=00a0
	Capabilities: [98] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: ehci-pci

00:1b.0 0403: 8086:1c20 (rev 05)
	Subsystem: 1028:04b3
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 45
	Region 0: Memory at f7f00000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0300c  Data: 4142
	Capabilities: [70] Express (v1) Root Complex Integrated Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE- FLReset+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed unknown, Width x0, ASPM unknown, Latency L0 <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed unknown, Width x0, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
			Status:	NegoPending- InProgress-
		VC1:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=1 ArbSelect=Fixed TC/VC=22
			Status:	NegoPending- InProgress-
	Capabilities: [130 v1] Root Complex Link
		Desc:	PortNumber=0f ComponentID=00 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=00 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed1c000
	Kernel driver in use: snd_hda_intel

00:1c.0 0604: 8086:1c10 (rev b5) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=03, subordinate=04, sec-latency=0
	I/O behind bridge: 0000f000-00000fff
	Memory behind bridge: fff00000-000fffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <16us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #0, PowerLimit 10.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [90] Subsystem: 1028:04b3
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1c.1 0604: 8086:1c12 (rev b5) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=05, subordinate=06, sec-latency=0
	I/O behind bridge: 0000e000-0000efff
	Memory behind bridge: fff00000-000fffff
	Prefetchable memory behind bridge: 00000000f1100000-00000000f11fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #2, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <16us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #1, PowerLimit 10.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [90] Subsystem: 1028:04b3
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1c.3 0604: 8086:1c16 (rev b5) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=09, subordinate=0a, sec-latency=0
	I/O behind bridge: 0000f000-00000fff
	Memory behind bridge: f7e00000-f7efffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #4, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <16us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #3, PowerLimit 10.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [90] Subsystem: 1028:04b3
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1c.4 0604: 8086:1c18 (rev b5) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=0b, subordinate=0c, sec-latency=0
	I/O behind bridge: 0000f000-00000fff
	Memory behind bridge: f7d00000-f7dfffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #5, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <16us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #4, PowerLimit 10.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [90] Subsystem: 1028:04b3
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1c.7 0604: 8086:1c1e (rev b5) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=11, subordinate=16, sec-latency=0
	I/O behind bridge: 0000c000-0000dfff
	Memory behind bridge: f6c00000-f7cfffff
	Prefetchable memory behind bridge: 00000000f0000000-00000000f10fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #8, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <16us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
			Slot #7, PowerLimit 10.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [90] Subsystem: 1028:04b3
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1d.0 0c03: 8086:1c26 (rev 05) (prog-if 20 [EHCI])
	Subsystem: 1028:04b3
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 23
	Region 0: Memory at f7f07000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Debug port: BAR=1 offset=00a0
	Capabilities: [98] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: ehci-pci

00:1f.0 0601: 8086:1c4b (rev 05)
	Subsystem: 1028:04b3
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=0c <?>
	Kernel driver in use: lpc_ich

00:1f.2 0106: 8086:1c03 (rev 05) (prog-if 01 [AHCI 1.0])
	Subsystem: 1028:04b3
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 40
	Region 0: I/O ports at f0b0 [size=8]
	Region 1: I/O ports at f0a0 [size=4]
	Region 2: I/O ports at f090 [size=8]
	Region 3: I/O ports at f080 [size=4]
	Region 4: I/O ports at f060 [size=32]
	Region 5: Memory at f7f06000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0300c  Data: 4191
	Capabilities: [70] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
	Capabilities: [b0] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: ahci

00:1f.3 0c05: 8086:1c22 (rev 05)
	Subsystem: 1028:04b3
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin C routed to IRQ 18
	Region 0: Memory at f7f05000 (64-bit, non-prefetchable) [size=256]
	Region 4: I/O ports at f040 [size=32]

05:00.0 0200: 10ec:8168 (rev 06)
	Subsystem: 1028:04b3
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 41
	Region 0: I/O ports at e000 [size=256]
	Region 2: Memory at f1104000 (64-bit, prefetchable) [size=4K]
	Region 4: Memory at f1100000 (64-bit, prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0300c  Data: 41a1
	Capabilities: [70] Express (v2) Endpoint, MSI 01
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
		Vector table: BAR=4 offset=00000000
		PBA: BAR=4 offset=00000800
	Capabilities: [d0] Vital Product Data
		Unknown small resource type 00, will not decode more.
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [140 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
	Kernel driver in use: r8169

09:00.0 0280: 8086:008a (rev 34)
	Subsystem: 8086:5325
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 46
	Region 0: Memory at f7e00000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: [c8] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0300c  Data: 4152
	Capabilities: [e0] Express (v1) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 <32us
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Capabilities: [140 v1] Device Serial Number 4c-80-93-ff-ff-15-e6-c7
	Kernel driver in use: iwlwifi

0b:00.0 0c03: 104c:8241 (rev 02) (prog-if 30 [XHCI])
	Subsystem: 1028:04b3
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
	Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00001000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
	Kernel driver in use: xhci_hcd

11:00.0 0180: 1095:3132 (rev 01)
	Subsystem: 1095:3132
	Physical Slot: 1
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 19
	Region 0: Memory at f6c84000 (64-bit, non-prefetchable) [size=128]
	Region 2: Memory at f6c80000 (64-bit, non-prefetchable) [size=16K]
	Region 4: I/O ports at c000 [size=128]
	Expansion ROM at f6c00000 [disabled] [size=512K]
	Capabilities: [54] Power Management version 2
		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [5c] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v1) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Latency L0 unlimited, L1 unlimited
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		AERCap:	First Error Pointer: 14, GenCap+ CGenEn- ChkCap+ ChkEn-
	Kernel driver in use: sata_sil24



* Re: [Update][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-29 15:05     ` Martin Mokrejs
  2013-03-29 16:05       ` Sarah Sharp
@ 2013-03-29 21:34       ` Rafael J. Wysocki
  1 sibling, 0 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-29 21:34 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: Bjorn Helgaas, ACPI Devel Maling List, LKML, Linux PM list,
	Len Brown, Matthew Garrett, Sarah Sharp, Accardi, Kristen C,
	Huang, Ying, linux-pci

On Friday, March 29, 2013 04:05:54 PM Martin Mokrejs wrote:
> Hi,
>   I applied this patch over 3.8.3, hoping it would fix my issue from the
> thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled",
> but unfortunately it is even worse! Now lsusb -v and lsusb -vv do wake up
> the XHCI port, but it falls asleep again immediately, more quickly than I am
> able to plug a device into the socket. To get a device working in the USB3
> socket I need to plug it in and then run lsusb -vv; only then is it recognized.
> 
> Without the patch, the 'lsusb -vv' woke up the port (PME# disabled happened
> on both 1c.4 and 0b:00.0) and I had unlimited time to find some USB device
> around and to plug it into the slot.

Well, using lsusb to work around problems in the PCI subsystem isn't even
*supposed* to work as far as I can tell.

First off, do you use laptop-mode (or something equivalent) to enable runtime
PM for all PCI devices in your system?  If you do, please test things without
it and see if they work then.

Second, do things work after you echo "on" to the xHCI controller's
/sys/devices/.../power/control file?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
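
The second check above (writing "on" to the controller's power/control file)
can also be done from a small program. What follows is only a sketch under
stated assumptions: the function name force_runtime_pm_on() is made up for
illustration, and the example directory /sys/bus/pci/devices/0000:0b:00.0 is
an assumption based on the xHCI controller's PCI address as it appears in the
logs in this thread. The function reads power/control and writes "on" only if
the attribute currently holds something else (normally "auto").

```c
#include <stdio.h>
#include <string.h>

/*
 * Keep a device out of runtime suspend by writing "on" to its
 * power/control attribute, unless it already reads "on".
 *
 * dev_dir is the device's sysfs directory, for example (assumed)
 * "/sys/bus/pci/devices/0000:0b:00.0".
 *
 * Returns 0 on success, -1 on any error.
 */
int force_runtime_pm_on(const char *dev_dir)
{
	char path[512];
	char cur[16] = "";
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/power/control", dev_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	/* Read the current setting ("auto" or "on"). */
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(cur, sizeof(cur), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	cur[strcspn(cur, "\n")] = '\0';

	if (strcmp(cur, "on") == 0)
		return 0;	/* nothing to do */

	/* Equivalent of: echo on > <dev_dir>/power/control */
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs("on\n", f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

Writing to the attribute normally requires root. Writing "auto" back to the
same file lets the device be runtime-suspended again, so the test is easy to
undo.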


* Re: [Update][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-29 16:05       ` Sarah Sharp
  2013-03-29 17:11         ` Martin Mokrejs
@ 2013-03-29 21:37         ` Rafael J. Wysocki
  1 sibling, 0 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-29 21:37 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Martin Mokrejs, Bjorn Helgaas, ACPI Devel Maling List, LKML,
	Linux PM list, Len Brown, Matthew Garrett, Accardi, Kristen C,
	Huang, Ying, linux-pci

On Friday, March 29, 2013 09:05:35 AM Sarah Sharp wrote:
> On Fri, Mar 29, 2013 at 04:05:54PM +0100, Martin Mokrejs wrote:
> > [   36.594171] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
> > [   36.594202] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> > [   36.594247] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
> > [   36.594349] xhci_hcd 0000:0b:00.0: PME# enabled
> > [   36.703695] r8169 0000:05:00.0 eth0: link down
> > [   37.098299] microcode: CPU0 updated to revision 0x28, date = 2012-04-24
> > [   37.098941] microcode: CPU1 updated to revision 0x28, date = 2012-04-24
> > [   37.098944] perf_event_intel: PEBS enabled due to microcode update
> > [   38.343029] r8169 0000:05:00.0 eth0: link up
> > [   39.094944] r8169 0000:05:00.0 eth0: link down
> > [   41.492768] r8169 0000:05:00.0 eth0: link up
> > [   62.782910] xhci_hcd 0000:0b:00.0: Poll event ring: 4294943584
> > [   62.782938] xhci_hcd 0000:0b:00.0: op reg status = 0xffffffff
> > [   62.782939] xhci_hcd 0000:0b:00.0: HW died, polling stopped.
> > [   88.754183] pcieport 0000:00:1c.0: PME# enabled
> > [   88.764182] xhci_hcd 0000:0b:00.0: PME# disabled
> > [   88.764192] xhci_hcd 0000:0b:00.0: enabling bus mastering
> > [   88.764206] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> > [   88.764242] xhci_hcd 0000:0b:00.0: Port Status Change Event for port 2
> > [   88.764246] xhci_hcd 0000:0b:00.0: resume root hub
> > [   88.764259] xhci_hcd 0000:0b:00.0: handle_port_status: starting port polling.
> > [   88.764276] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
> > [   88.764281] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
> > 
> > 
> > What does "HW died" mean? Why is 1c.0 here? What is this device actually doing?
> 
> It's harmless.  The xHCI polling loop to debug the host registers and
> rings simply notices that the registers are reading as all ffs.  I
> believe that's normal when a PCI device is in D3.  I just haven't had
> time to make a patch to disable the polling loop when the host is suspended.
> 
> So, for now, ignore the "HW died, polling stopped." messages.
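
The reasoning in the quoted explanation can be sketched in C. This is not the
xHCI driver's code: the names reg_read_is_all_ones(), classify_status_read()
and the poll_action values are hypothetical and exist only for illustration.
The sketch shows the idea that a register read of all ones (as in
"op reg status = 0xffffffff" in the log) is only worth reporting as a dead
host when the host is not known to be suspended.

```c
#include <stdbool.h>
#include <stdint.h>

/* What a (hypothetical) debug poller should do after one status read. */
enum poll_action {
	POLL_CONTINUE,		/* register looks sane, keep polling */
	POLL_SKIP_SUSPENDED,	/* all ones, but the host is suspended */
	POLL_REPORT_DEAD	/* all ones while the host should be active */
};

/* A 32-bit read that returns all ones typically means nothing answered. */
static inline bool reg_read_is_all_ones(uint32_t val)
{
	return val == UINT32_MAX;
}

/*
 * Classify a status register read.  host_suspended is assumed to be
 * tracked by the caller (for example, set in its suspend path and
 * cleared in its resume path).
 */
enum poll_action classify_status_read(uint32_t status, bool host_suspended)
{
	if (!reg_read_is_all_ones(status))
		return POLL_CONTINUE;

	return host_suspended ? POLL_SKIP_SUSPENDED : POLL_REPORT_DEAD;
}
```

With a check like this, the read at 62.782938 in the log would be classified
as POLL_SKIP_SUSPENDED rather than reported, because the controller had been
runtime-suspended at 36.594247.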
> 
> > Nevertheless, I went to check whether the USB3 socket still dies after the first
> > unplug of a device, or no longer does thanks to the patch being tested:
> > 
> > I plugged a mouse into the USB3.0 socket and it worked. Around the time I unplugged it I got:
> > 
> > [   94.954779] hub 3-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x100
> > [   94.954795] hub 3-0:1.0: hub_suspend
> > [   94.954802] usb usb3: bus auto-suspend, wakeup 1
> > [   94.954817] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> > [   94.954835] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
> > [   94.954857] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> > [   94.954898] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
> > [   94.954983] xhci_hcd 0000:0b:00.0: PME# enabled
> > [  169.622513] hub 2-1:1.0: state 7 ports 8 chg 0000 evt 0004
> > [  169.623057] hub 2-1:1.0: port 2, status 0101, change 0001, 12 Mb/s
> > [  169.777012] hub 2-1:1.0: debounce: port 2: total 100ms stable 100ms status 0x101
> > [  169.856992] usb 2-1.2: new low-speed USB device number 4 using ehci-pci
> > 
> > and the port was dead, no matter what "lsusb -v or -vv" options I tried. At about
> > [  169.622513] I plugged the mouse into a USB2.0 socket (do not know if that is 1a.0 or 1d.0).
> 
> All right, I wonder if the USB core/xHCI driver is forgetting to clear a
> port status change bit after the device is unplugged.  That can cause
> the xHCI host to not give us a port status change event later (and thus
> no PME).  Looking at the logs later, it doesn't seem like we do this
> though.
> 
> > If I run lsusb -vv it does (with the problematic patch):
> > 
> > [ 1760.414086] pcieport 0000:00:1c.4: PME# disabled
> > [ 1760.434314] xhci_hcd 0000:0b:00.0: PME# disabled
> > [ 1760.434327] xhci_hcd 0000:0b:00.0: enabling bus mastering
> > [ 1760.434338] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
> > [ 1760.434360] xhci_hcd 0000:0b:00.0: Port Status Change Event for port 2
> 
> Ok, so the xHCI driver *is* getting a port status change event, and thus
> must have gotten a PME.  So the PCI layer is doing its job.
> 
> > [ 1760.434363] xhci_hcd 0000:0b:00.0: resume root hub
> > [ 1760.434367] xhci_hcd 0000:0b:00.0: handle_port_status: starting port polling.
> > [ 1760.434378] xhci_hcd 0000:0b:00.0: xhci_resume: starting port polling.
> > [ 1760.434383] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
> > [ 1760.434388] usb usb3: usb auto-resume
> > [ 1760.434407] hub 3-0:1.0: hub_resume
> > [ 1760.434439] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> > [ 1760.434440] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
> > [ 1760.434464] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x202a0
> > [ 1760.434465] xhci_hcd 0000:0b:00.0: Get port status returned 0x10100
> > [ 1760.434492] xhci_hcd 0000:0b:00.0: clear port connect change, actual port 1 status  = 0x2a0
> 
> Odd.  The port status shows there's no device connected, but there was a
> connect change:
> 
> sarah@xanatos:~$ ./decode-port-status 0x202a0
> port status = 0x0202a0
>  bit  0     (CCS)          0x0, device not connected
>  bit  1     (PED)          0x0, port disabled
>  bit  3     (OCA)          0x0, no over-current condition
>  bit  4     (PR)           0x0, port not in reset
>  bits 8:5   (PLS)          0x5, link is in the RxDetect state
>  bit  9     (PP)           0x1, port power on
>  bits 13:10 (speed)        0x0, Undefined
>  bits 15:14 (indicators)   0x0, port indicators are off
>  bit  17    (CSC)          0x1, connect change
>  bit  18    (PEC)          0x0, no port enable/disable change
>  bit  19    (WRC)          0x0, no warm port reset change
>  bit  20    (OCC)          0x0, no over-current change
>  bit  21    (PRC)          0x0, no port reset change
>  bit  22    (PLC)          0x0, no port link change
>  bit  23    (CEC)          0x0, no port config error change
>  bit  25    (WCE)          0x0, wake on connect disabled
>  bit  26    (WDE)          0x0, wake on disconnect disabled
>  bit  27    (WOE)          0x0, wake on over-current enable disabled
>  bit  30    (DR)           0x0, device is permanently attached
> 
> RxDetect is the "I'm looking for a USB device" port state.
> 
> > [ 1760.434642] usb usb4: usb wakeup-resume
> > [ 1760.434646] usb usb4: usb auto-resume
> > [ 1760.434661] hub 4-0:1.0: hub_resume
> > [ 1760.434683] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> > [ 1760.434684] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
> > [ 1760.434710] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
> > [ 1760.434711] xhci_hcd 0000:0b:00.0: Get port status returned 0x2a0
> > [ 1760.434727] hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
> > [ 1760.434757] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 0 status  = 0xe0002a0
> > [ 1760.434784] xhci_hcd 0000:0b:00.0: set port remote wake mask, actual port 1 status  = 0xe0002a0
> > [ 1760.434791] hub 4-0:1.0: hub_suspend
> > [ 1760.434796] usb usb4: bus auto-suspend, wakeup 1
> > [ 1760.434807] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> > [ 1760.553734] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
> > [ 1760.553751] hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
> > [ 1760.574793] xhci_hcd 0000:0b:00.0: get port status, actual port 0 status  = 0x2a0
> > [ 1760.574794] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
> > [ 1760.575300] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
> > [ 1760.575301] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
> 
> sarah@xanatos:~$ ./decode-port-status 0x2a0
> port status = 0x0002a0
>  bit  0     (CCS)          0x0, device not connected
>  bit  1     (PED)          0x0, port disabled
>  bit  3     (OCA)          0x0, no over-current condition
>  bit  4     (PR)           0x0, port not in reset
>  bits 8:5   (PLS)          0x5, link is in the RxDetect state
>  bit  9     (PP)           0x1, port power on
>  bits 13:10 (speed)        0x0, Undefined
>  bits 15:14 (indicators)   0x0, port indicators are off
>  bit  17    (CSC)          0x0, no connect change
>  bit  18    (PEC)          0x0, no port enable/disable change
>  bit  19    (WRC)          0x0, no warm port reset change
>  bit  20    (OCC)          0x0, no over-current change
>  bit  21    (PRC)          0x0, no port reset change
>  bit  22    (PLC)          0x0, no port link change
>  bit  23    (CEC)          0x0, no port config error change
>  bit  25    (WCE)          0x0, wake on connect disabled
>  bit  26    (WDE)          0x0, wake on disconnect disabled
>  bit  27    (WOE)          0x0, wake on over-current enable disabled
>  bit  30    (DR)           0x0, device is permanently attached
> 
> Nope, your host really isn't reporting there's a device connected
> *at all*.  That's just broken hardware, and there's really nothing
> software can do if the hardware isn't reporting connect events, even
> with polling.
> 
> It also doesn't sound like the other TI redriver bug.  That bug only
> affected USB 3.0 ports, and when lsusb was run, we would find the port
> in Compliance Mode.  This is the host simply not reporting the USB 2.0
> port connect at all.
> 
> Maybe if we completely disable PCI runtime PM for your host, we can work
> around this bug?

Well, that's what I've just asked Martin to try.
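
For reference, the two register values decoded in the quoted output above (0x202a0 and 0x2a0) can be pulled apart with a few masks. What follows is only a minimal user-space sketch, assuming the bit positions printed by decode-port-status above; the EX_ macros, the struct and the function names are illustrative and are not the xHCI driver's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions copied from the decode-port-status output quoted above.
 * The EX_ names are illustrative, not the xHCI driver's own macros.
 */
#define EX_PORT_CCS        (1u << 0)   /* current connect status */
#define EX_PORT_PED        (1u << 1)   /* port enabled */
#define EX_PORT_PLS_SHIFT  5           /* link state lives in bits 8:5 */
#define EX_PORT_PLS_MASK   0xfu
#define EX_PORT_PP         (1u << 9)   /* port power */
#define EX_PORT_CSC        (1u << 17)  /* connect status change */
#define EX_PLS_RXDETECT    5u          /* "looking for a USB device" */

struct ex_port_state {
	bool connected;
	bool enabled;
	bool powered;
	bool connect_change;
	unsigned int link_state;
};

/* Split a raw port status value into the fields discussed in the thread. */
static struct ex_port_state ex_decode_portsc(uint32_t portsc)
{
	struct ex_port_state s;

	s.connected = (portsc & EX_PORT_CCS) != 0;
	s.enabled = (portsc & EX_PORT_PED) != 0;
	s.powered = (portsc & EX_PORT_PP) != 0;
	s.connect_change = (portsc & EX_PORT_CSC) != 0;
	s.link_state = (portsc >> EX_PORT_PLS_SHIFT) & EX_PORT_PLS_MASK;
	assert(s.link_state <= EX_PORT_PLS_MASK);
	return s;
}

/*
 * True for the odd case seen in the log: a connect change is latched,
 * yet the port reports no device and sits in RxDetect.
 */
static bool ex_change_without_device(uint32_t portsc)
{
	struct ex_port_state s = ex_decode_portsc(portsc);

	return s.connect_change && !s.connected &&
	       s.link_state == EX_PLS_RXDETECT;
}
```

With these helpers, 0x202a0 is the "connect change but nothing connected" case, while 0x2a0 is a powered, idle port with no change pending.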

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 21:27                 ` Rafael J. Wysocki
  2013-03-29  7:41                   ` huang ying
@ 2013-03-30  2:03                   ` Martin Mokrejs
  2013-04-02  5:25                     ` huang ying
  2013-03-30 22:38                   ` [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports Rafael J. Wysocki
  2 siblings, 1 reply; 61+ messages in thread
From: Martin Mokrejs @ 2013-03-30  2:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

Rafael J. Wysocki wrote:
> On Thursday, March 28, 2013 07:31:58 PM Martin Mokrejs wrote:
>> Hi Bjorn,
>>
>> Bjorn Helgaas wrote:
>>> On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>>
>>>>
>>>> Rafael J. Wysocki wrote:
>>>>> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
>>>>>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
>>>>>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>>>>>> Hi Bjorn,
>>>>>>>>>
>>>>>>>>> I wonder what you think about the patch below?
>>>>>>>>
>>>>>>>> Seems fine to me (I'm trusting your and Matthew's judgment here since
>>>>>>>> I don't know much about it).  Why don't you resend it with Matthew's
>>>>>>>> ack and the appropriate stable tags, and I'll put it in.
>>>>>>>
>>>>>>> I will, thanks!
>>>>>>>
>>>>>>>> If you have
>>>>>>>> a URL for a bugzilla or mailing list report of the original problem,
>>>>>>>> that would be good, too.  It'd be nice if users and distros could
>>>>>>>> match problem reports with this solution, but I can't tell what the
>>>>>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
>>>>>>>> else reproduced the problem and tested the fix)?
>>>>>>>
>>>>>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
>>>>>>> to publicly available mailing list archives etc.
>>>>>>
>>>>>> Do you at least have a description of how a user could determine
>>>>>> whether he is seeing the problem fixed by this patch?
>>>>>
>>>>> Yeah.  For example, when the problem is visible on a USB controller and that
>>>>> controller is runtime-suspended, then plugging a new USB device into one
>>>>> of the controller's ports won't wake the controller up without the patch.
>>>>
>>>> Hi,
>>>>  I have been wondering for a week or two why nobody answered any of my bug reports,
>>>> not even Sarah, who asked for more details. I think the fix is about my report
>>>> under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
>>>> and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
>>>> about my report. But I had better wait for what Sarah says. ;-)
>>>
>>> I haven't forgotten about your hotplug issues, but I've been on
>>> vacation for a week and have been working on the similar issue
>>> reported by Chris Clayton
>>> (https://bugzilla.kernel.org/show_bug.cgi?id=54981) because it seemed
>>> a bit more tractable.  But I'll get back to yours eventually :)
>>> Unfortunately nobody else seems to be jumping in to help, and I can
>>> only do so much by myself.
>>>
>>> I haven't been following your XHCI issue at all, but one thing you
>>
>> But please do so now. If we are talking about an existing patch it should be
>> possible to say whether what I observed is likely to be fixed by the patch.
>> I will then happily discuss why I lose interrupts in the same way for my
>> rtl8169 network card and why this PME# stuff happens for me only with 3.8
>> and not 3.7 (unlike what Sarah claims). I am not arguing that something
>> else makes 3.7 able to wake up the device and overcome the same bug
>> while "it" is gone from 3.8. I think this should be an easy task for you,
>> pci devs. ;-)
> 
> OK, let's try to establish facts.
> 
> Does the patch below cause the PCI PM issues you're seeing to go away?

Yes, the PME# enabled-to-disabled switching is gone, because only PME# disabled is allowed now.
I don't think I really tested the same scenario as before, though. Big thanks to Huang Ying
(https://patchwork.kernel.org/patch/2359611/):

# grep . /sys/bus/pci/devices/*/power/control
/sys/bus/pci/devices/0000:00:00.0/power/control:on
/sys/bus/pci/devices/0000:00:02.0/power/control:on
/sys/bus/pci/devices/0000:00:16.0/power/control:on
/sys/bus/pci/devices/0000:00:1a.0/power/control:on
/sys/bus/pci/devices/0000:00:1b.0/power/control:on
/sys/bus/pci/devices/0000:00:1c.0/power/control:on
/sys/bus/pci/devices/0000:00:1c.1/power/control:on
/sys/bus/pci/devices/0000:00:1c.3/power/control:on
/sys/bus/pci/devices/0000:00:1c.4/power/control:on
/sys/bus/pci/devices/0000:00:1c.7/power/control:on
/sys/bus/pci/devices/0000:00:1d.0/power/control:on
/sys/bus/pci/devices/0000:00:1f.0/power/control:on
/sys/bus/pci/devices/0000:00:1f.2/power/control:on
/sys/bus/pci/devices/0000:00:1f.3/power/control:on
/sys/bus/pci/devices/0000:05:00.0/power/control:on
/sys/bus/pci/devices/0000:09:00.0/power/control:on
/sys/bus/pci/devices/0000:0b:00.0/power/control:on
/sys/bus/pci/devices/0000:11:00.0/power/control:on
# grep . /sys/bus/pci/devices/*/power/runtime_status
/sys/bus/pci/devices/0000:00:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:02.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:16.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1a.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1b.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.3/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.4/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.7/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1d.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.2/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.3/power/runtime_status:active
/sys/bus/pci/devices/0000:05:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:09:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:0b:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:11:00.0/power/runtime_status:active
#
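
The same check can be done from a program. A minimal sketch follows, assuming only the standard sysfs convention that power/control holds "auto" when runtime PM is allowed for a device and "on" when it is not; the helper names are hypothetical and the caller supplies the attribute path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style attribute into buf, without the
 * trailing newline.  Returns 0 on success, -1 if the file cannot be
 * opened or is empty.
 */
static int ex_read_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path != NULL && buf != NULL);
	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Returns 1 if the control file says "auto" (runtime PM allowed),
 * 0 if it says "on" (device kept active), -1 on error or unknown value.
 */
static int ex_runtime_pm_allowed(const char *control_path)
{
	char val[16];

	if (ex_read_attr(control_path, val, sizeof(val)) != 0)
		return -1;
	if (strcmp(val, "auto") == 0)
		return 1;
	if (strcmp(val, "on") == 0)
		return 0;
	return -1;
}
```

For the xHCI controller above one would pass "/sys/bus/pci/devices/0000:0b:00.0/power/control"; the listing shows "on" for every device, so the helper would return 0 for each of them.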

> 
> If it doesn't make all of them go away, does it make *some* of them go away?

Yes, repeated inserts and removals of devices into the xHCI slot work fine, with no need
to use "lsusb -vv" to wake up devices.

Aside from some minor USB errors (I won't go into them here), what is important is the fact
that the eSATA card hotplug works well or perfectly. I just sent you and the other pci devs
a much more detailed report under the "Re: 3.9-rc1: pciehp and eSATA card SiI 3132, no XHCI"
thread, although this particular testing was done on 3.8.3.

I think I can stop replying to this thread which is about the patch from Sarah.
My dead XHCI port issue is a power management issue, incidentally also fixed by the
very same patch from Huang Ying. Cool! ;-)

> 
> If that is the case, which of the problems remain after applying it (on top
> of the Linus' current tree)?

Sorry, I used plain 3.8.3 so that we can compare with the patch from Sarah.
Meanwhile I also tested plain 3.8.3 after uninstalling laptop-mode-tools
from my laptop (they cause the power/control values to be set to "auto" instead of
"on"). Sarah, I concluded that your pci-acpi.c patch (https://patchwork.kernel.org/patch/2359531/)
did not break anything on my setup and that the PME# issues associated with
the "dead" xHCI ports were just due to the power-saving issue, namely
laptop-mode-tools setting the "auto" state, which allowed devices to be
suspended. After I manually enabled "auto" for 1c.7 but NOT for 0b:00
(the xHCI controller), the port could still detect an exchanged USB device.

What I will have to redo is the test with plain 3.8.3: I did not manage to
reproduce the "PME# enabled" message, after which the port would appear "dead"
until I ran 'lsusb -vv'. I set the values to "auto" instead of "on",
but that was still not enough, somehow. In the end I set all devices
under /sys/ to "auto", but only some other PCI devices got suspended
while the xHCI port kept working. I will redo the test as I said,
but in brief I don't have a problem with the patch from Sarah, posted
initially under this thread.

Thank you everybody,
Martin


* [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-03-28 21:27                 ` Rafael J. Wysocki
  2013-03-29  7:41                   ` huang ying
  2013-03-30  2:03                   ` Martin Mokrejs
@ 2013-03-30 22:38                   ` Rafael J. Wysocki
  2013-04-01 17:34                     ` Bjorn Helgaas
  2013-04-03 22:34                     ` Bjorn Helgaas
  2 siblings, 2 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-03-30 22:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The runtime PM of PCIe ports turns out to be quite fragile, as in
some cases things work while in some other cases they don't and we
don't seem to have a good way to determine whether or not they are
going to work in advance.

For this reason, avoid enabling runtime PM for PCIe ports by
keeping their runtime PM reference counters always above 0 for the
time being.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

This version also removes the no longer necessary (and empty anyway)
port_runtime_pm_black_list[] table.
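
To illustrate why removing the pm_runtime_put_noidle() call is enough, here is a toy user-space model of a usage counter. It is only a sketch under the assumption that one reference is already held on the port's behalf when probe runs; the ex_ names are hypothetical and this is not the runtime PM core.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a device with a runtime PM usage counter. */
struct ex_dev {
	int usage_count;
	bool suspended;
};

static void ex_get_noresume(struct ex_dev *d)
{
	d->usage_count++;
}

static void ex_put_noidle(struct ex_dev *d)
{
	assert(d->usage_count > 0);
	d->usage_count--;
}

/* Suspend is only possible once nobody holds a reference. */
static bool ex_try_suspend(struct ex_dev *d)
{
	if (d->usage_count > 0)
		return false;
	d->suspended = true;
	return true;
}

/*
 * Model of probe: one reference is assumed to be held when probe runs.
 * The old code dropped it (drop_reference == true); with the patch the
 * reference is simply kept.
 */
static void ex_probe(struct ex_dev *d, bool drop_reference)
{
	ex_get_noresume(d);
	if (drop_reference)
		ex_put_noidle(d);
}
```

In this model a port probed with drop_reference set can be suspended as soon as it goes idle, while a port probed without it keeps a count of 1 and never is. That is also why the matching pm_runtime_get_noresume() in the remove path goes away: there is no dropped reference left to take back.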

Thanks,
Rafael

---
 drivers/pci/pcie/portdrv_pci.c |   13 -------------
 1 file changed, 13 deletions(-)

Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
+++ linux-pm/drivers/pci/pcie/portdrv_pci.c
@@ -185,14 +185,6 @@ static const struct dev_pm_ops pcie_port
 #endif /* !PM */
 
 /*
- * PCIe port runtime suspend is broken for some chipsets, so use a
- * black list to disable runtime PM for these chipsets.
- */
-static const struct pci_device_id port_runtime_pm_black_list[] = {
-	{ /* end: all zeroes */ }
-};
-
-/*
  * pcie_portdrv_probe - Probe PCI-Express port devices
  * @dev: PCI-Express port device being probed
  *
@@ -225,16 +217,11 @@ static int pcie_portdrv_probe(struct pci
 	 * it by default.
 	 */
 	dev->d3cold_allowed = false;
-	if (!pci_match_id(port_runtime_pm_black_list, dev))
-		pm_runtime_put_noidle(&dev->dev);
-
 	return 0;
 }
 
 static void pcie_portdrv_remove(struct pci_dev *dev)
 {
-	if (!pci_match_id(port_runtime_pm_black_list, dev))
-		pm_runtime_get_noresume(&dev->dev);
 	pcie_port_device_remove(dev);
 	pci_disable_device(dev);
 }



* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-29  7:41                   ` huang ying
@ 2013-03-31  2:29                     ` Martin Mokrejs
  0 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-03-31  2:29 UTC (permalink / raw)
  To: huang ying, Rafael J. Wysocki
  Cc: Bjorn Helgaas, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp



huang ying wrote:
> On Fri, Mar 29, 2013 at 5:27 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Thursday, March 28, 2013 07:31:58 PM Martin Mokrejs wrote:
>>> Hi Bjorn,
>>>
>>> Bjorn Helgaas wrote:
>>>> On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
>>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>>>
>>>>>
>>>>> Rafael J. Wysocki wrote:
>>>>>> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
>>>>>>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>>>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
>>>>>>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>>>>>>> Hi Bjorn,
>>>>>>>>>>
>>>>>>>>>> I wonder what you think about the patch below?
>>>>>>>>>
>>>>>>>>> Seems fine to me (I'm trusting your and Matthew's judgment here since
>>>>>>>>> I don't know much about it).  Why don't you resend it with Matthew's
>>>>>>>>> ack and the appropriate stable tags, and I'll put it in.
>>>>>>>>
>>>>>>>> I will, thanks!
>>>>>>>>
>>>>>>>>> If you have
>>>>>>>>> a URL for a bugzilla or mailing list report of the original problem,
>>>>>>>>> that would be good, too.  It'd be nice if users and distros could
>>>>>>>>> match problem reports with this solution, but I can't tell what the
>>>>>>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
>>>>>>>>> else reproduced the problem and tested the fix)?
>>>>>>>>
>>>>>>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
>>>>>>>> to publicly available mailing list archives etc.
>>>>>>>
>>>>>>> Do you at least have a description of how a user could determine
>>>>>>> whether he is seeing the problem fixed by this patch?
>>>>>>
>>>>>> Yeah.  For example, when the problem is visible on a USB controller and that
>>>>>> controller is runtime-suspended, then plugging a new USB device into one
>>>>>> of the controller's ports won't wake the controller up without the patch.
>>>>>
>>>>> Hi,
>>>>>  I have been wondering for a week or two why nobody answered any of my bug reports,
>>>>> not even Sarah, who asked for more details. I think the fix is about my report
>>>>> under thread "Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled"
>>>>> and I really wonder why I wasn't Cc:ed and listed as a reporter, provided it is
>>>>> about my report. But I had better wait for what Sarah says. ;-)
> 
> Hi, Martin,
> 
> Sorry for the late reply.  I just found your bug report.  It seems to be related to
> PCIe port runtime PM support.
> 
> Can you try the attached debug patch?  And send me back the dmesg?
> 
> Sorry, I use the Gmail web client, so I can only send the patch as an attachment.
> 
> Best Regards,
> Huang Ying
> 

Hi Ying,
  sorry, I have not yet gotten to test this patch (sent on 03/29/13 08:41). Is it superseded
by either of your two later patches?

The second was sent on 03/29/13 09:20.

The third, sent on 03/30/13 11:54, I did test a few hours ago; I just have to sum up the
results.

Thank you,
Martin



* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-03-30 22:38                   ` [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports Rafael J. Wysocki
@ 2013-04-01 17:34                     ` Bjorn Helgaas
  2013-04-01 20:51                       ` Rafael J. Wysocki
  2013-04-03 22:34                     ` Bjorn Helgaas
  1 sibling, 1 reply; 61+ messages in thread
From: Bjorn Helgaas @ 2013-04-01 17:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci, Zheng Yan

[+cc Zheng, who added this with 71a83bd727]

On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> The runtime PM of PCIe ports turns out to be quite fragile, as in
> some cases things work while in some other cases they don't and we
> don't seem to have a good way to determine whether or not they are
> going to work in advance.

Do you have any references to problems encountered when enabling
runtime PM for PCIe ports?  That information will be useful to anybody
who wants to take another crack at getting this working.

> For this reason, avoid enabling runtime PM for PCIe ports by
> keeping their runtime PM reference counters always above 0 for the
> time being.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> This version also removes the no longer necessary (and empty anyway)
> port_runtime_pm_black_list[] table.
>
> Thanks,
> Rafael
>
> ---
>  drivers/pci/pcie/portdrv_pci.c |   13 -------------
>  1 file changed, 13 deletions(-)
>
> Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
> +++ linux-pm/drivers/pci/pcie/portdrv_pci.c
> @@ -185,14 +185,6 @@ static const struct dev_pm_ops pcie_port
>  #endif /* !PM */
>
>  /*
> - * PCIe port runtime suspend is broken for some chipsets, so use a
> - * black list to disable runtime PM for these chipsets.
> - */
> -static const struct pci_device_id port_runtime_pm_black_list[] = {
> -       { /* end: all zeroes */ }
> -};
> -
> -/*
>   * pcie_portdrv_probe - Probe PCI-Express port devices
>   * @dev: PCI-Express port device being probed
>   *
> @@ -225,16 +217,11 @@ static int pcie_portdrv_probe(struct pci
>          * it by default.
>          */
>         dev->d3cold_allowed = false;
> -       if (!pci_match_id(port_runtime_pm_black_list, dev))
> -               pm_runtime_put_noidle(&dev->dev);
> -
>         return 0;
>  }
>
>  static void pcie_portdrv_remove(struct pci_dev *dev)
>  {
> -       if (!pci_match_id(port_runtime_pm_black_list, dev))
> -               pm_runtime_get_noresume(&dev->dev);
>         pcie_port_device_remove(dev);
>         pci_disable_device(dev);
>  }
>


* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-04-01 17:34                     ` Bjorn Helgaas
@ 2013-04-01 20:51                       ` Rafael J. Wysocki
  2013-04-01 20:53                         ` Bjorn Helgaas
  2013-04-02  5:28                         ` huang ying
  0 siblings, 2 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-04-01 20:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci, Zheng Yan

On Monday, April 01, 2013 11:34:46 AM Bjorn Helgaas wrote:
> [+cc Zheng, who added this with 71a83bd727]
> 
> On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > The runtime PM of PCIe ports turns out to be quite fragile, as in
> > some cases things work while in some other cases they don't and we
> > don't seem to have a good way to determine whether or not they are
> > going to work in advance.
> 
> Do you have any references to problems encountered when enabling
> runtime PM for PCIe ports?  That information will be useful to anybody
> who wants to take another crack at getting this working.

Well, bug 53811 is one example and problems recently reported by
Martin are another.  Do you want me to dig deeper?

Rafael


> > For this reason, avoid enabling runtime PM for PCIe ports by
> > keeping their runtime PM reference counters always above 0 for the
> > time being.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > This version also removes the no longer necessary (and empty anyway)
> > port_runtime_pm_black_list[] table.
> >
> > Thanks,
> > Rafael
> >
> > ---
> >  drivers/pci/pcie/portdrv_pci.c |   13 -------------
> >  1 file changed, 13 deletions(-)
> >
> > Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
> > ===================================================================
> > --- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
> > +++ linux-pm/drivers/pci/pcie/portdrv_pci.c
> > @@ -185,14 +185,6 @@ static const struct dev_pm_ops pcie_port
> >  #endif /* !PM */
> >
> >  /*
> > - * PCIe port runtime suspend is broken for some chipsets, so use a
> > - * black list to disable runtime PM for these chipsets.
> > - */
> > -static const struct pci_device_id port_runtime_pm_black_list[] = {
> > -       { /* end: all zeroes */ }
> > -};
> > -
> > -/*
> >   * pcie_portdrv_probe - Probe PCI-Express port devices
> >   * @dev: PCI-Express port device being probed
> >   *
> > @@ -225,16 +217,11 @@ static int pcie_portdrv_probe(struct pci
> >          * it by default.
> >          */
> >         dev->d3cold_allowed = false;
> > -       if (!pci_match_id(port_runtime_pm_black_list, dev))
> > -               pm_runtime_put_noidle(&dev->dev);
> > -
> >         return 0;
> >  }
> >
> >  static void pcie_portdrv_remove(struct pci_dev *dev)
> >  {
> > -       if (!pci_match_id(port_runtime_pm_black_list, dev))
> > -               pm_runtime_get_noresume(&dev->dev);
> >         pcie_port_device_remove(dev);
> >         pci_disable_device(dev);
> >  }
> >
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-04-01 20:51                       ` Rafael J. Wysocki
@ 2013-04-01 20:53                         ` Bjorn Helgaas
  2013-04-01 21:24                           ` Rafael J. Wysocki
                                             ` (2 more replies)
  2013-04-02  5:28                         ` huang ying
  1 sibling, 3 replies; 61+ messages in thread
From: Bjorn Helgaas @ 2013-04-01 20:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci, Zheng Yan

On Mon, Apr 1, 2013 at 2:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Monday, April 01, 2013 11:34:46 AM Bjorn Helgaas wrote:
>> [+cc Zheng, who added this with 71a83bd727]
>>
>> On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >
>> > The runtime PM of PCIe ports turns out to be quite fragile, as in
>> > some cases things work while in some other cases they don't and we
>> > don't seem to have a good way to determine whether or not they are
>> > going to work in advance.
>>
>> Do you have any references to problems encountered when enabling
>> runtime PM for PCIe ports?  That information will be useful to anybody
>> who wants to take another crack at getting this working.
>
> Well, bug 53811 is one example and problems recently reported by
> Martin are another.  Do you want me to dig deeper?

OK, I got this one:

  https://bugzilla.kernel.org/show_bug.cgi?id=53811

Martin has reported a lot of problems lately, and I don't know which
are related to runtime PM for PCIe ports.  I was hoping for a couple
URLs to put in the changelog so that when somebody gets the itch to
make this work, they have some useful info to start from.  If you
point me at a specific message, I'll dig up an archive URL for it.

Otherwise, I'm afraid we'll just oscillate between "enable PM, find
bug, disable PM, enable PM, find same bug, disable PM, etc..."

Bjorn

>> > For this reason, avoid enabling runtime PM for PCIe ports by
>> > keeping their runtime PM reference counters always above 0 for the
>> > time being.
>> >
>> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> > ---
>> >
>> > This version also removes the no longer necessary (and empty anyway)
>> > port_runtime_pm_black_list[] table.
>> >
>> > Thanks,
>> > Rafael
>> >
>> > ---
>> >  drivers/pci/pcie/portdrv_pci.c |   13 -------------
>> >  1 file changed, 13 deletions(-)
>> >
>> > Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
>> > ===================================================================
>> > --- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
>> > +++ linux-pm/drivers/pci/pcie/portdrv_pci.c
>> > @@ -185,14 +185,6 @@ static const struct dev_pm_ops pcie_port
>> >  #endif /* !PM */
>> >
>> >  /*
>> > - * PCIe port runtime suspend is broken for some chipsets, so use a
>> > - * black list to disable runtime PM for these chipsets.
>> > - */
>> > -static const struct pci_device_id port_runtime_pm_black_list[] = {
>> > -       { /* end: all zeroes */ }
>> > -};
>> > -
>> > -/*
>> >   * pcie_portdrv_probe - Probe PCI-Express port devices
>> >   * @dev: PCI-Express port device being probed
>> >   *
>> > @@ -225,16 +217,11 @@ static int pcie_portdrv_probe(struct pci
>> >          * it by default.
>> >          */
>> >         dev->d3cold_allowed = false;
>> > -       if (!pci_match_id(port_runtime_pm_black_list, dev))
>> > -               pm_runtime_put_noidle(&dev->dev);
>> > -
>> >         return 0;
>> >  }
>> >
>> >  static void pcie_portdrv_remove(struct pci_dev *dev)
>> >  {
>> > -       if (!pci_match_id(port_runtime_pm_black_list, dev))
>> > -               pm_runtime_get_noresume(&dev->dev);
>> >         pcie_port_device_remove(dev);
>> >         pci_disable_device(dev);
>> >  }
>> >
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-04-01 20:53                         ` Bjorn Helgaas
@ 2013-04-01 21:24                           ` Rafael J. Wysocki
  2013-04-01 23:20                             ` Rafael J. Wysocki
  2013-04-01 21:48                           ` Martin Mokrejs
  2013-04-02  5:34                           ` huang ying
  2 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-04-01 21:24 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci, Zheng Yan

On Monday, April 01, 2013 02:53:12 PM Bjorn Helgaas wrote:
> On Mon, Apr 1, 2013 at 2:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Monday, April 01, 2013 11:34:46 AM Bjorn Helgaas wrote:
> >> [+cc Zheng, who added this with 71a83bd727]
> >>
> >> On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >
> >> > The runtime PM of PCIe ports turns out to be quite fragile, as in
> >> > some cases things work while in some other cases they don't and we
> >> > don't seem to have a good way to determine whether or not they are
> >> > going to work in advance.
> >>
> >> Do you have any references to problems encountered when enabling
> >> runtime PM for PCIe ports?  That information will be useful to anybody
> >> who wants to take another crack at getting this working.
> >
> > Well, bug 53811 is one example and problems recently reported by
> > Martin are another.  Do you want me to dig deeper?
> 
> OK, I got this one:
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=53811
> 
> Martin has reported a lot of problems lately, and I don't know which
> are related to runtime PM for PCIe ports.  I was hoping for a couple
> URLs to put in the changelog so that when somebody gets the itch to
> make this work, they have some useful info to start from.  If you
> point me at a specific message, I'll dig up an archive URL for it.

This is the message in which Martin confirmed that the previous version of
the $subject patch made insert/removal of devices into xHCI ports on his
system work again.

> Otherwise, I'm afraid we'll just oscillate between "enable PM, find
> bug, disable PM, enable PM, find same bug, disable PM, etc..."

That's a valid concern, but I think we have an idea about what kind of problems
the runtime PM of PCIe ports may cause (generally, PME and hotplug
notifications may not work as expected).

Thanks,
Rafael


> >> > For this reason, avoid enabling runtime PM for PCIe ports by
> >> > keeping their runtime PM reference counters always above 0 for the
> >> > time being.
> >> >
> >> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> > ---
> >> >
> >> > This version also removes the no longer necessary (and empty anyway)
> >> > port_runtime_pm_black_list[] table.
> >> >
> >> > Thanks,
> >> > Rafael
> >> >
> >> > ---
> >> >  drivers/pci/pcie/portdrv_pci.c |   13 -------------
> >> >  1 file changed, 13 deletions(-)
> >> >
> >> > Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
> >> > ===================================================================
> >> > --- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
> >> > +++ linux-pm/drivers/pci/pcie/portdrv_pci.c
> >> > @@ -185,14 +185,6 @@ static const struct dev_pm_ops pcie_port
> >> >  #endif /* !PM */
> >> >
> >> >  /*
> >> > - * PCIe port runtime suspend is broken for some chipsets, so use a
> >> > - * black list to disable runtime PM for these chipsets.
> >> > - */
> >> > -static const struct pci_device_id port_runtime_pm_black_list[] = {
> >> > -       { /* end: all zeroes */ }
> >> > -};
> >> > -
> >> > -/*
> >> >   * pcie_portdrv_probe - Probe PCI-Express port devices
> >> >   * @dev: PCI-Express port device being probed
> >> >   *
> >> > @@ -225,16 +217,11 @@ static int pcie_portdrv_probe(struct pci
> >> >          * it by default.
> >> >          */
> >> >         dev->d3cold_allowed = false;
> >> > -       if (!pci_match_id(port_runtime_pm_black_list, dev))
> >> > -               pm_runtime_put_noidle(&dev->dev);
> >> > -
> >> >         return 0;
> >> >  }
> >> >
> >> >  static void pcie_portdrv_remove(struct pci_dev *dev)
> >> >  {
> >> > -       if (!pci_match_id(port_runtime_pm_black_list, dev))
> >> > -               pm_runtime_get_noresume(&dev->dev);
> >> >         pcie_port_device_remove(dev);
> >> >         pci_disable_device(dev);
> >> >  }
> >> >
> > --
> > I speak only for myself.
> > Rafael J. Wysocki, Intel Open Source Technology Center.
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-04-01 20:53                         ` Bjorn Helgaas
  2013-04-01 21:24                           ` Rafael J. Wysocki
@ 2013-04-01 21:48                           ` Martin Mokrejs
  2013-04-02  5:34                           ` huang ying
  2 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-01 21:48 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: ACPI Devel Maling List, Len Brown, Matthew Garrett, Sarah Sharp,
	LKML, linux-pci, Zheng Yan

Bjorn Helgaas wrote:
> On Mon, Apr 1, 2013 at 2:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Monday, April 01, 2013 11:34:46 AM Bjorn Helgaas wrote:
>>> [+cc Zheng, who added this with 71a83bd727]
>>>
>>> On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> The runtime PM of PCIe ports turns out to be quite fragile, as in
>>>> some cases things work while in some other cases they don't and we
>>>> don't seem to have a good way to determine whether or not they are
>>>> going to work in advance.
>>>
>>> Do you have any references to problems encountered when enabling
>>> runtime PM for PCIe ports?  That information will be useful to anybody
>>> who wants to take another crack at getting this working.
>>
>> Well, bug 53811 is one example and problems recently reported by
>> Martin are another.  Do you want me to dig deeper?
> 
> OK, I got this one:
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=53811
> 
> Martin has reported a lot of problems lately, and I don't know which
> are related to runtime PM for PCIe ports.  I was hoping for a couple
> URLs to put in the changelog so that when somebody gets the itch to
> make this work, they have some useful info to start from.  If you
> point me at a specific message, I'll dig up an archive URL for it.

In the thread

Re: 3.8.2: xhci port is dead until pcieport PME# goes to disabled
http://marc.info/?t=136328222600002&r=1&w=2

I reported that if the upstream PCI Express root port 1c.4 of the xHCI controller
at 0b:00 is suspended, the USB3 socket on the laptop appears dead.
Initially I found that 'lsusb -v' rescues the dead socket, which is accompanied
by these lines in the logs:

[ 1445.597641] pcieport 0000:00:1c.4: PME# disabled
[ 1445.617667] xhci_hcd 0000:0b:00.0: PME# disabled

Ying Huang then realized elsewhere that I am running laptop-mode-tools, although
in their config file I set that they should NOT be run when on AC power.
It looks like they enable the 'auto' power mode, as seen in the
/sys/bus/pci/devices/*/power/control files, already upon bootup.
BTW, even worse, if I do /etc/init.d/laptop-mode-tools stop
they restore some initial values. :(( So, if I meanwhile forced
'on' for some device, they will return it back to 'auto' and the device
will immediately suspend. ;-)

Provided I uninstall laptop-mode-tools and make sure all control
files say 'on' (and hence the runtime_status files say 'active'), my
problem with a dead xHCI port is avoided.
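
Forcing every device to 'on', as described above, could be scripted roughly
as follows. This is only a sketch: the force_pci_runtime_on name and its
optional directory argument are made up for illustration (the argument is
there so the function can be tried on a scratch directory first); only the
power/control files and the 'on'/'auto' values come from this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): write 'on' to every
# power/control file under a PCI devices directory, which keeps the
# devices from being runtime-suspended.
# $1: devices directory, defaults to the real sysfs location.
force_pci_runtime_on() {
    root="${1:-/sys/bus/pci/devices}"
    for ctl in "$root"/*/power/control; do
        # Skip the unexpanded glob when no device matches.
        [ -e "$ctl" ] || continue
        # Needs root privileges on a real system.
        echo on > "$ctl" || echo "cannot write $ctl" >&2
    done
}
```

Calling force_pci_runtime_on without arguments acts on the real sysfs tree;
the result can then be checked with
grep . /sys/bus/pci/devices/*/power/control.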

I find it weird that the port suspends only upon USB device
unplug. The port does not suspend by itself if unused.

What is not clear to me is how the kernel is going to handle laptop-mode-tools
enabling power saving on 1c.4. In my naive user view, the kernel does
not realize and does not *check* whether a user tool or a desperate user has
tried to suspend an upstream port while there is something bound to it, and it
does not apply a check for cascaded devices (1c.4 -> 0b:00 and
1c.7 -> 11:00 in my case).
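
To see from user space what state the upstream port of a given device is in,
something like the sketch below might help. It relies on the entries under
/sys/bus/pci/devices being symlinks into the device tree, where a device
directory sits inside the directory of its upstream port. The
upstream_runtime_status name is hypothetical and not taken from any tool
mentioned here.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the runtime PM status
# of the device directly upstream of a PCI device.
# $1: device directory, e.g. /sys/bus/pci/devices/0000:0b:00.0
upstream_runtime_status() {
    # Resolve the symlink to the physical path in the device tree; the
    # cd happens in a subshell, so the caller's directory is unchanged.
    dev=$(cd -P "$1" && pwd -P) || return 1
    parent=$(dirname "$dev")
    if [ -r "$parent/power/runtime_status" ]; then
        echo "$(basename "$parent"): $(cat "$parent/power/runtime_status")"
    else
        echo "$(basename "$parent"): no runtime_status" >&2
        return 1
    fi
}
```

For the xHCI controller discussed here, the call would be
upstream_runtime_status /sys/bus/pci/devices/0000:0b:00.0, which should
report the state of the 1c.4 port.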

I am writing this without a reference, but modprobe of a driver can overcome
a suspended root port. In this particular case I mean my 1c.7 port
and its downstream 11:00 express card device. Off the top of my head
I am not sure whether modprobe overcame both 1c.7 and 11:00 being initially
suspended. I could dig it out from the

Re: 3.9-rc1: pciehp and eSATA card SiI 3132, no XHCI
http://marc.info/?t=136305008800001&r=1&w=2

thread if you want. Or it might be easier for you to test it yourself.

So, for me the issue is not fixed but if you decide to disable runtime
power saving for devices under pcieport I don't mind. Their mishandling
definitely causes my acpiphp hotplug issues under 3.7-3.8 kernels
(3.9-rc not tested) whereas these PM issues do not answer why pciehp
is broken on 3.7-3.9-rc1.

Anyway, this patch may only be good because I would like to use
laptop-mode-tools, and they will for sure put one of the devices into 'auto',
and it will likely fall into suspend.
Martin

> 
> Otherwise, I'm afraid we'll just oscillate between "enable PM, find
> bug, disable PM, enable PM, find same bug, disable PM, etc..."
> 
> Bjorn
> 
>>>> For this reason, avoid enabling runtime PM for PCIe ports by
>>>> keeping their runtime PM reference counters always above 0 for the
>>>> time being.
>>>>
>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>> ---
>>>>
>>>> This version also removes the no longer necessary (and empty anyway)
>>>> port_runtime_pm_black_list[] table.
>>>>
>>>> Thanks,
>>>> Rafael
>>>>
>>>> ---
>>>>  drivers/pci/pcie/portdrv_pci.c |   13 -------------
>>>>  1 file changed, 13 deletions(-)
>>>>
>>>> Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
>>>> ===================================================================
>>>> --- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
>>>> +++ linux-pm/drivers/pci/pcie/portdrv_pci.c
>>>> @@ -185,14 +185,6 @@ static const struct dev_pm_ops pcie_port
>>>>  #endif /* !PM */
>>>>
>>>>  /*
>>>> - * PCIe port runtime suspend is broken for some chipsets, so use a
>>>> - * black list to disable runtime PM for these chipsets.
>>>> - */
>>>> -static const struct pci_device_id port_runtime_pm_black_list[] = {
>>>> -       { /* end: all zeroes */ }
>>>> -};
>>>> -
>>>> -/*
>>>>   * pcie_portdrv_probe - Probe PCI-Express port devices
>>>>   * @dev: PCI-Express port device being probed
>>>>   *
>>>> @@ -225,16 +217,11 @@ static int pcie_portdrv_probe(struct pci
>>>>          * it by default.
>>>>          */
>>>>         dev->d3cold_allowed = false;
>>>> -       if (!pci_match_id(port_runtime_pm_black_list, dev))
>>>> -               pm_runtime_put_noidle(&dev->dev);
>>>> -
>>>>         return 0;
>>>>  }
>>>>
>>>>  static void pcie_portdrv_remove(struct pci_dev *dev)
>>>>  {
>>>> -       if (!pci_match_id(port_runtime_pm_black_list, dev))
>>>> -               pm_runtime_get_noresume(&dev->dev);
>>>>         pcie_port_device_remove(dev);
>>>>         pci_disable_device(dev);
>>>>  }
>>>>
>> --
>> I speak only for myself.
>> Rafael J. Wysocki, Intel Open Source Technology Center.
> 
> 


* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-04-01 21:24                           ` Rafael J. Wysocki
@ 2013-04-01 23:20                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-04-01 23:20 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci, Zheng Yan

On Monday, April 01, 2013 11:24:37 PM Rafael J. Wysocki wrote:
> On Monday, April 01, 2013 02:53:12 PM Bjorn Helgaas wrote:
> > On Mon, Apr 1, 2013 at 2:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > On Monday, April 01, 2013 11:34:46 AM Bjorn Helgaas wrote:
> > >> [+cc Zheng, who added this with 71a83bd727]
> > >>
> > >> On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >> >
> > >> > The runtime PM of PCIe ports turns out to be quite fragile, as in
> > >> > some cases things work while in some other cases they don't and we
> > >> > don't seem to have a good way to determine whether or not they are
> > >> > going to work in advance.
> > >>
> > >> Do you have any references to problems encountered when enabling
> > >> runtime PM for PCIe ports?  That information will be useful to anybody
> > >> who wants to take another crack at getting this working.
> > >
> > > Well, bug 53811 is one example and problems recently reported by
> > > Martin are another.  Do you want me to dig deeper?
> > 
> > OK, I got this one:
> > 
> >   https://bugzilla.kernel.org/show_bug.cgi?id=53811
> > 
> > Martin has reported a lot of problems lately, and I don't know which
> > are related to runtime PM for PCIe ports.  I was hoping for a couple
> > URLs to put in the changelog so that when somebody gets the itch to
> > make this work, they have some useful info to start from.  If you
> > point me at a specific message, I'll dig up an archive URL for it.
> 
> This is the message in which Martin confirmed that the previous version of
> the $subject patch made insert/removal of devices into xHCI ports on his
> system work again.
> 
> > Otherwise, I'm afraid we'll just oscillate between "enable PM, find
> > bug, disable PM, enable PM, find same bug, disable PM, etc..."
> 
> That's a valid concern, but I think we have an idea about what kind of problems
> the runtime PM of PCIe ports may cause (generally, PME and hotplug
> notifications may not work as expected).

I was thinking about this one in particular:

http://marc.info/?l=linux-acpi&m=136460903910718&w=2

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-30  2:03                   ` Martin Mokrejs
@ 2013-04-02  5:25                     ` huang ying
  2013-04-02 15:02                       ` Martin Mokrejs
  0 siblings, 1 reply; 61+ messages in thread
From: huang ying @ 2013-04-02  5:25 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: Rafael J. Wysocki, Bjorn Helgaas, ACPI Devel Maling List,
	Len Brown, Matthew Garrett, Sarah Sharp

[-- Attachment #1: Type: text/plain, Size: 7272 bytes --]

Hi, Martin,

On Sat, Mar 30, 2013 at 10:03 AM, Martin Mokrejs
<mmokrejs@fold.natur.cuni.cz> wrote:
> Rafael J. Wysocki wrote:
> > On Thursday, March 28, 2013 07:31:58 PM Martin Mokrejs wrote:
> >> Hi Bjorn,
> >>
> >> Bjorn Helgaas wrote:
> >>> On Thu, Mar 28, 2013 at 11:26 AM, Martin Mokrejs
> >>> <mmokrejs@fold.natur.cuni.cz> wrote:
> >>>>
> >>>>
> >>>> Rafael J. Wysocki wrote:
> >>>>> On Thursday, March 28, 2013 10:46:10 AM Bjorn Helgaas wrote:
> >>>>>> On Thu, Mar 28, 2013 at 10:41 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >>>>>>> On Thursday, March 28, 2013 10:21:30 AM Bjorn Helgaas wrote:
> >>>>>>>> On Thu, Mar 28, 2013 at 6:57 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >>>>>>>>> Hi Bjorn,
> >>>>>>>>>
> >>>>>>>>> I wonder what you think about the patch below?
> >>>>>>>>
> >>>>>>>> Seems fine to me (I'm trusting your and Matthew's judgment here since
> >>>>>>>> I don't know much about it).  Why don't you resend it with Matthew's
> >>>>>>>> ack and the appropriate stable tags, and I'll put it in.
> >>>>>>>
> >>>>>>> I will, thanks!
> >>>>>>>
> >>>>>>>> If you have
> >>>>>>>> a URL for a bugzilla or mailing list report of the original problem,
> >>>>>>>> that would be good, too.  It'd be nice if users and distros could
> >>>>>>>> match problem reports with this solution, but I can't tell what the
> >>>>>>>> user-visible issue was.  I assume that Sarah tested this (or somebody
> >>>>>>>> else reproduced the problem and tested the fix)?
> >>>>>>>
> >>>>>>> Sarah reported it to me privately and I'm afraid I don't have any pointers
> >>>>>>> to publicly available mailing list archives etc.
> >>>>>>
> >>>>>> Do you at least have a description of how a user could determine
> >>>>>> whether he is seeing the problem fixed by this patch?
> >>>>>
> >>>>> Yeah.  For example, when the problem is visible on a USB controller and that
> >>>>> controller is runtime-suspended, then plugging a new USB device into one
> >>>>> of the controller's ports won't wake the controller up without the patch.
> >>>>
> >>>> Hi,
> >>>>  I have been wondering for a week or two why nobody has answered any of my
> >>>> bug reports, not even Sarah, who asked for more details. I think the fix is
> >>>> about my report under the thread "Re: 3.8.2: xhci port is dead until pcieport
> >>>> PME# goes to disabled" and I really wonder why I wasn't Cc:ed and listed as a
> >>>> reporter, provided it is about my report. But I had better wait for what Sarah says. ;-)
> >>>
> >>> I haven't forgotten about your hotplug issues, but I've been on
> >>> vacation for a week and have been working on the similar issue
> >>> reported by Chris Clayton
> >>> (https://bugzilla.kernel.org/show_bug.cgi?id=54981) because it seemed
> >>> a bit more tractable.  But I'll get back to yours eventually :)
> >>> Unfortunately nobody else seems to be jumping in to help, and I can
> >>> only do so much by myself.
> >>>
> >>> I haven't been following your XHCI issue at all, but one thing you
> >>
> >> But please do so now. If we are talking about an existing patch it should be
> >> possible to say whether what I observed is likely to be fixed by the patch.
> >> I will happily discuss then why I lose interrupts in the same way for my
> >> rtl8169 network card and why this PME# stuff happens for me only with 3.8
> >> and not 3.7 (unlike what Sarah claims). I am not arguing that something
> >> else makes 3.7 be able to wakeup the device and overcome the same bug
> >> while "it" is gone from 3.8. I think this should be an easy task for you,
> >> pci devs. ;-)
> >
> > OK, let's try to establish facts.
> >
> > Does the patch below cause the PCI PM issues you're seeing to go away?
>
> Yes, the PME# enabled to disabled is gone, because only PME# disabled is allowed.
> I don't think I really tested a scenario like before. Big thanks to Huang Ying
> (https://patchwork.kernel.org/patch/2359611/):
>
> # grep . /sys/bus/pci/devices/*/power/control
> /sys/bus/pci/devices/0000:00:00.0/power/control:on
> /sys/bus/pci/devices/0000:00:02.0/power/control:on
> /sys/bus/pci/devices/0000:00:16.0/power/control:on
> /sys/bus/pci/devices/0000:00:1a.0/power/control:on
> /sys/bus/pci/devices/0000:00:1b.0/power/control:on
> /sys/bus/pci/devices/0000:00:1c.0/power/control:on
> /sys/bus/pci/devices/0000:00:1c.1/power/control:on
> /sys/bus/pci/devices/0000:00:1c.3/power/control:on
> /sys/bus/pci/devices/0000:00:1c.4/power/control:on
> /sys/bus/pci/devices/0000:00:1c.7/power/control:on
> /sys/bus/pci/devices/0000:00:1d.0/power/control:on
> /sys/bus/pci/devices/0000:00:1f.0/power/control:on
> /sys/bus/pci/devices/0000:00:1f.2/power/control:on
> /sys/bus/pci/devices/0000:00:1f.3/power/control:on
> /sys/bus/pci/devices/0000:05:00.0/power/control:on
> /sys/bus/pci/devices/0000:09:00.0/power/control:on
> /sys/bus/pci/devices/0000:0b:00.0/power/control:on
> /sys/bus/pci/devices/0000:11:00.0/power/control:on
> # grep . /sys/bus/pci/devices/*/power/runtime_status
> /sys/bus/pci/devices/0000:00:00.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:02.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:16.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1a.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1b.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1c.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1c.3/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1c.4/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1c.7/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1d.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1f.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1f.2/power/runtime_status:active
> /sys/bus/pci/devices/0000:00:1f.3/power/runtime_status:active
> /sys/bus/pci/devices/0000:05:00.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:09:00.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:0b:00.0/power/runtime_status:active
> /sys/bus/pci/devices/0000:11:00.0/power/runtime_status:active
> #
>
> >
> > If it doesn't make all of them go away, does it make *some* of them go away?
>
> Yes, repeated inserts and removals of devices into xHCI slot work fine, no need
> to use "lsusb -vv" to wakeup devices.
>
> Aside from some minor USB errors (won't mess them here) what is important is the fact
> that the eSATA card hotplug works well or perfectly. I just sent to you and other pci devs
> much more detailed report under the "Re: 3.9-rc1: pciehp and eSATA card SiI 3132, no XHCI"
> thread although this particular testing was done on 3.8.3.
>
> I think I can stop replying to this thread which is about the patch from Sarah.
> My dead XHCI port issue is a power management issue, incidentally also fixed by the
> very same patch from Huang Ying. Cool! ;-)

Sorry, which patch do you mean?  Or, to be clearer, could you test
the attached patch for the XHCI dead port issue?

Please test this patch with laptop-mode-tools installed and enabled.  And
before and after the test, please get the PCI devices' runtime status with:

grep . /sys/bus/pci/devices/*/power/runtime_status

And please give me the full dmesg for boot and incremental dmesg for
operations.
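
A before/after comparison of the runtime status could be captured along
these lines. This is a rough sketch only: the snapshot_runtime_status name
and the output file names are assumptions for illustration; the grep
command itself is the one given above, with -H added so that the file name
prefix is kept even when only one file matches.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): save the runtime PM status of
# all PCI devices to a file, one "path:status" line per device.
# $1: output file; $2: devices directory (defaults to the real sysfs path).
snapshot_runtime_status() {
    out="$1"
    root="${2:-/sys/bus/pci/devices}"
    # Same grep as requested above, redirected to a file.
    grep -H . "$root"/*/power/runtime_status > "$out"
}
```

Running snapshot_runtime_status before.txt ahead of the test and
snapshot_runtime_status after.txt once it is done, followed by
diff before.txt after.txt, shows which devices changed state.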

Best Regards,
Huang Ying

[-- Attachment #2: port_dbg.patch --]
[-- Type: application/octet-stream, Size: 5636 bytes --]

---
 drivers/pci/hotplug/pci_hotplug_core.c |    8 ++++++++
 drivers/pci/pci-acpi.c                 |   21 ++++++++++++++++++++-
 drivers/pci/pci.c                      |    1 +
 drivers/pci/pcie/portdrv_pci.c         |   12 +++++++++---
 drivers/pci/slot.c                     |   18 ++++++++++++++++++
 include/acpi/acpi_bus.h                |    1 +
 include/linux/pci.h                    |    1 +
 7 files changed, 58 insertions(+), 4 deletions(-)

--- a/drivers/pci/hotplug/pci_hotplug_core.c
+++ b/drivers/pci/hotplug/pci_hotplug_core.c
@@ -39,6 +39,7 @@
 #include <linux/mutex.h>
 #include <linux/pci.h>
 #include <linux/pci_hotplug.h>
+#include <linux/pm_runtime.h>
 #include <asm/uaccess.h>
 #include "../pci.h"
 
@@ -473,6 +474,9 @@ int __pci_hp_register(struct hotplug_slo
 	dbg("Added slot %s to the list\n", name);
 out:
 	mutex_unlock(&pci_hp_mutex);
+	/* Bridge runtime PM state may be influenced by hotplug */
+	pm_runtime_resume(&bus->self->dev);
+	dev_info(&bus->self->dev, "hotplug slot added!\n");
 	return result;
 }
 
@@ -489,6 +493,7 @@ int pci_hp_deregister(struct hotplug_slo
 {
 	struct hotplug_slot *temp;
 	struct pci_slot *slot;
+	struct pci_bus *bus;
 
 	if (!hotplug)
 		return -ENODEV;
@@ -508,8 +513,11 @@ int pci_hp_deregister(struct hotplug_slo
 
 	hotplug->release(hotplug);
 	slot->hotplug = NULL;
+	bus = slot->bus;
 	pci_destroy_slot(slot);
 	mutex_unlock(&pci_hp_mutex);
+	pm_runtime_resume(&bus->self->dev);
+	dev_info(&bus->self->dev, "hotplug slot removed!\n");
 
 	return 0;
 }
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -43,10 +43,16 @@ static void pci_acpi_wake_bus(acpi_handl
 static void pci_acpi_wake_dev(acpi_handle handle, u32 event, void *context)
 {
 	struct pci_dev *pci_dev = context;
+	struct acpi_device *adev;
 
 	if (event != ACPI_NOTIFY_DEVICE_WAKE || !pci_dev)
 		return;
 
+	if (!acpi_bus_get_device(handle, &adev)) {
+		adev->wakeup.flags.run_wake_works = true;
+		dev_info(&pci_dev->dev, "run wake works!\n");
+	}
+
 	if (pci_dev->current_state == PCI_D3cold) {
 		pci_wakeup_event(pci_dev);
 		pm_runtime_resume(&pci_dev->dev);
@@ -146,6 +152,19 @@ phys_addr_t acpi_pci_root_get_mcfg_addr(
 static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
 {
 	int acpi_state, d_max;
+	acpi_handle handle = DEVICE_ACPI_HANDLE(&pdev->dev);
+	struct acpi_device *adev;
+
+	if (pci_is_bridge(pdev)) {
+		if (acpi_bus_get_device(handle, &adev)) {
+			dev_info(&pdev->dev, "choose state, no ACPI device\n");
+			return PCI_D0;
+		}
+		if (!adev->wakeup.flags.run_wake_works) {
+			dev_info(&pdev->dev, "choose state, run wake not verified\n");
+			return PCI_D0;
+		}
+	}
 
 	if (pdev->no_d3cold)
 		d_max = ACPI_STATE_D3_HOT;
@@ -269,7 +288,7 @@ static int acpi_pci_run_wake(struct pci_
 	 * waking up to power on the main link even if there is PME
 	 * support for D3cold
 	 */
-	if (dev->pme_interrupt && !dev->runtime_d3cold)
+	if (dev->pme_interrupt && !dev->runtime_d3cold && !pci_is_bridge(dev))
 		return 0;
 
 	if (!acpi_pm_device_run_wake(&dev->dev, enable))
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1832,6 +1832,7 @@ int pci_finish_runtime_suspend(struct pc
 	__pci_enable_wake(dev, target_state, true, pci_dev_run_wake(dev));
 
 	error = pci_set_power_state(dev, target_state);
+	dev_info(&dev->dev, "pfrs: target: %d, %d\n", target_state, error);
 
 	if (error) {
 		__pci_enable_wake(dev, target_state, true, false);
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -154,9 +154,15 @@ static int pcie_port_runtime_idle(struct
 	 */
 	pci_walk_bus(pdev->subordinate, pci_dev_pme_poll, &pme_poll);
 	/* Delay for a short while to prevent too frequent suspend/resume */
-	if (!pme_poll)
-		pm_schedule_suspend(dev, 10);
-	return -EBUSY;
+	if (pme_poll)
+		return -EBUSY;
+	if (pci_bus_has_hotplug_slots(pdev->subordinate)) {
+		dev_info(&pdev->dev, "ppri: has hotplug slots, do not suspend!\n");
+		return -EBUSY;
+	}
+	dev_info(&pdev->dev, "ppri: will go suspend, is_hotplug_bridge: %d.\n",
+		 pdev->is_hotplug_bridge);
+	return pm_schedule_suspend(dev, 10);
 }
 #else
 #define pcie_port_runtime_suspend	NULL
--- a/drivers/pci/slot.c
+++ b/drivers/pci/slot.c
@@ -345,6 +345,24 @@ out:
 }
 EXPORT_SYMBOL_GPL(pci_renumber_slot);
 
+bool pci_bus_has_hotplug_slots(struct pci_bus *bus)
+{
+	struct pci_slot *slot;
+	bool has_hotplug_slots = false;
+
+	down_read(&pci_bus_sem);
+	list_for_each_entry(slot, &bus->slots, list) {
+		if (slot->hotplug) {
+			has_hotplug_slots = true;
+			break;
+		}
+	}
+	up_read(&pci_bus_sem);
+
+	return has_hotplug_slots;
+}
+EXPORT_SYMBOL_GPL(pci_bus_has_hotplug_slots);
+
 /**
  * pci_destroy_slot - decrement refcount for physical PCI slot
  * @slot: struct pci_slot to decrement
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -245,6 +245,7 @@ struct acpi_device_perf {
 struct acpi_device_wakeup_flags {
 	u8 valid:1;		/* Can successfully enable wakeup? */
 	u8 run_wake:1;		/* Run-Wake GPE devices */
+	u8 run_wake_works:1;	/* Run-Wake works for the device */
 	u8 notifier_present:1;  /* Wake-up notify handler has been installed */
 };
 
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -722,6 +722,7 @@ struct pci_slot *pci_create_slot(struct
 void pci_destroy_slot(struct pci_slot *slot);
 void pci_renumber_slot(struct pci_slot *slot, int slot_nr);
 int pci_scan_slot(struct pci_bus *bus, int devfn);
+bool pci_bus_has_hotplug_slots(struct pci_bus *bus);
 struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn);
 void pci_device_add(struct pci_dev *dev, struct pci_bus *bus);
 unsigned int pci_scan_child_bus(struct pci_bus *bus);


* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-04-01 20:51                       ` Rafael J. Wysocki
  2013-04-01 20:53                         ` Bjorn Helgaas
@ 2013-04-02  5:28                         ` huang ying
  2013-04-02  5:31                           ` huang ying
  1 sibling, 1 reply; 61+ messages in thread
From: huang ying @ 2013-04-02  5:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci, Zheng Yan

On Tue, Apr 2, 2013 at 4:51 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> On Monday, April 01, 2013 11:34:46 AM Bjorn Helgaas wrote:
> > [+cc Zheng, who added this with 71a83bd727]
> >
> > On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > The runtime PM of PCIe ports turns out to be quite fragile, as in
> > > some cases things work while in some other cases they don't and we
> > > don't seem to have a good way to determine whether or not they are
> > > going to work in advance.
> >
> > Do you have any references to problems encountered when enabling
> > runtime PM for PCIe ports?  That information will be useful to anybody
> > who wants to take another crack at getting this working.
>
> Well, bug 53811 is one example and problems recently reported by
> Martin are another.  Do you want me to dig deeper?

For bug 53811, I have posted a debug patch in bugzilla that disables
runtime PM for PCIe ports with hotplug capability.  It appears that the
patch resolved the issue for the reporter.  I do think that patch can
solve the hotplug issue.

Best Regards,
Huang Ying


* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-04-02  5:28                         ` huang ying
@ 2013-04-02  5:31                           ` huang ying
  0 siblings, 0 replies; 61+ messages in thread
From: huang ying @ 2013-04-02  5:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci, Zheng Yan,
	Huang Ying

On Tue, Apr 2, 2013 at 1:28 PM, huang ying <huang.ying.caritas@gmail.com> wrote:
> On Tue, Apr 2, 2013 at 4:51 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>
>> On Monday, April 01, 2013 11:34:46 AM Bjorn Helgaas wrote:
>> > [+cc Zheng, who added this with 71a83bd727]
>> >
>> > On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> > >
>> > > The runtime PM of PCIe ports turns out to be quite fragile, as in
>> > > some cases things work while in some other cases they don't and we
>> > > don't seem to have a good way to determine whether or not they are
>> > > going to work in advance.
>> >
>> > Do you have any references to problems encountered when enabling
>> > runtime PM for PCIe ports?  That information will be useful to anybody
>> > who wants to take another crack at getting this working.
>>
>> Well, bug 53811 is one example and problems recently reported by
>> Martin are another.  Do you want me to dig deeper?
>
> For bug 53811, I have posted a debug patch in bugzilla that disables
> runtime PM for PCIe ports with hotplug capability.  It appears that the
> patch resolved the issue for the reporter.  I do think that patch can
> solve the hotplug issue.

For Martin's hotplug issue, it appears that a similar patch I sent to
him resolved his hotplug issue too.  As for Martin's "XHCI dead port"
issue, that is, the PME issue, I just sent him another debug patch to
try.  Sorry for the late reply!

Best Regards,
Huang Ying


* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-04-01 20:53                         ` Bjorn Helgaas
  2013-04-01 21:24                           ` Rafael J. Wysocki
  2013-04-01 21:48                           ` Martin Mokrejs
@ 2013-04-02  5:34                           ` huang ying
  2 siblings, 0 replies; 61+ messages in thread
From: huang ying @ 2013-04-02  5:34 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Rafael J. Wysocki, Martin Mokrejs, ACPI Devel Maling List,
	Len Brown, Matthew Garrett, Sarah Sharp, LKML, linux-pci,
	Zheng Yan, Huang Ying

Hi, Bjorn,

On Tue, Apr 2, 2013 at 4:53 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Mon, Apr 1, 2013 at 2:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Monday, April 01, 2013 11:34:46 AM Bjorn Helgaas wrote:
>>> [+cc Zheng, who added this with 71a83bd727]
>>>
>>> On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> >
>>> > The runtime PM of PCIe ports turns out to be quite fragile, as in
>>> > some cases things work while in some other cases they don't and we
>>> > don't seem to have a good way to determine whether or not they are
>>> > going to work in advance.
>>>
>>> Do you have any references to problems encountered when enabling
>>> runtime PM for PCIe ports?  That information will be useful to anybody
>>> who wants to take another crack at getting this working.
>>
>> Well, bug 53811 is one example and problems recently reported by
>> Martin are another.  Do you want me to dig deeper?
>
> OK, I got this one:
>
>   https://bugzilla.kernel.org/show_bug.cgi?id=53811
>
> Martin has reported a lot of problems lately, and I don't know which
> are related to runtime PM for PCIe ports.  I was hoping for a couple
> URLs to put in the changelog so that when somebody gets the itch to
> make this work, they have some useful info to start from.  If you
> point me at a specific message, I'll dig up an archive URL for it.
>
> Otherwise, I'm afraid we'll just oscillate between "enable PM, find
> bug, disable PM, enable PM, find same bug, disable PM, etc..."

Sorry for the late reply!  I am trying to find a way to fix 53811 and
Martin's bug without disabling PM for ports entirely.

Best Regards,
Huang Ying

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02  5:25                     ` huang ying
@ 2013-04-02 15:02                       ` Martin Mokrejs
  2013-04-02 16:08                         ` huang ying
  2013-04-02 16:30                         ` Bjorn Helgaas
  0 siblings, 2 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-02 15:02 UTC (permalink / raw)
  To: huang ying
  Cc: Rafael J. Wysocki, Bjorn Helgaas, ACPI Devel Maling List,
	Len Brown, Matthew Garrett, Sarah Sharp

Hi Ying,

huang ying wrote:
> Hi, Martin,
> 
> On Sat, Mar 30, 2013 at 10:03 AM, Martin Mokrejs
> <mmokrejs@fold.natur.cuni.cz> wrote:
>> Rafael J. Wysocki wrote:

>>> If it doesn't make all of them go away, does it make *some* of them go away?
>>
>> Yes, repeated inserts and removals of devices into xHCI slot work fine, no need
>> to use "lsusb -vv" to wakeup devices.
>>
>> Aside from some minor USB errors (won't mess them here) what is important is the fact
>> that the eSATA card hotplug works well or perfectly. I just sent to you and other pci devs
>> much more detailed report under the "Re: 3.9-rc1: pciehp and eSATA card SiI 3132, no XHCI"
>> thread although this particular testing was done on 3.8.3.
>>
>> I think I can stop replying to this thread which is about the patch from Sarah.
>> My dead XHCI port issue is a power management issue, incidentally also fixed by the
>> very same patch from Huang Ying. Cool! ;-)
> 
> Sorry, which patch do you mean?  Or to be more clear, could you test
> the patch attached? For the XHCI dead port issue?

So I tested your port_dbg.patch on 3.8.3. Or did you want me to do it on 3.8.5?


# lspci -tv
-[0000:00]-+-00.0  Intel Corporation 2nd Generation Core Processor Family DRAM Controller
           +-02.0  Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller
           +-16.0  Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1
           +-1a.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2
           +-1b.0  Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller
           +-1c.0-[03-04]--
           +-1c.1-[05-06]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller
           +-1c.3-[09-0a]----00.0  Intel Corporation Centrino Wireless-N 1030 [Rainbow Peak]
           +-1c.4-[0b-0c]----00.0  Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller
           +-1c.7-[11-16]----00.0  Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller
           +-1d.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1
           +-1f.0  Intel Corporation HM67 Express Chipset Family LPC Controller
           +-1f.2  Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller
           \-1f.3  Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller
#


After a cold boot 1c.4 is active whereas 0b:00 is "suspended".
Attaching a mouse wakes up 0b:00 and the mouse works (I did not try a USB3 device in the
xhci socket in this test). Did you anticipate that?
After the mouse is unplugged 0b:00 falls asleep again, but so does 1c.4.
That makes the xhci port appear dead: it does NOT detect that a device
was plugged back in. Doing echo on > /sys/bus/pci/devices/0000:0b:00.0/power/control
wakes up 0b:00 and correctly also wakes up the upstream 1c.4. The socket then detects
that a device is already plugged in and things start to work.


> 
> Please test this patch with laptop-mode-tool installed and enabled.  And
> before/after test, please get PCI devices runtime status with:
> 
> grep . /sys/bus/pci/devices/*/power/runtime_status

Initial after cold boot, no mouse attached:

/sys/bus/pci/devices/0000:00:00.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:02.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:16.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1a.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1b.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.3/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.4/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.7/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1d.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.2/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.3/power/runtime_status:suspended
/sys/bus/pci/devices/0000:05:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:09:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:0b:00.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:11:00.0/power/runtime_status:active


Dead port with mouse attached:

/sys/bus/pci/devices/0000:00:00.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:02.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:16.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1a.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1b.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.3/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.4/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1c.7/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1d.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.2/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.3/power/runtime_status:suspended
/sys/bus/pci/devices/0000:05:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:09:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:0b:00.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:11:00.0/power/runtime_status:active


Rescued port after "echo on":

/sys/bus/pci/devices/0000:00:00.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:02.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:16.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1a.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1b.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.0/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.3/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.4/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1c.7/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1d.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.0/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.2/power/runtime_status:active
/sys/bus/pci/devices/0000:00:1f.3/power/runtime_status:suspended
/sys/bus/pci/devices/0000:05:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:09:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:0b:00.0/power/runtime_status:active
/sys/bus/pci/devices/0000:11:00.0/power/runtime_status:active
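
Listings like the three above can be compared without reading them line by
line. The following is only a sketch: the function names and the optional
root-directory argument are made up for illustration, and only the
power/runtime_status files themselves come from the output above.

```shell
#!/bin/sh
# Print "<device> <runtime_status>" for every PCI device under a sysfs root.
# The root is a parameter (hypothetical) so the function can be tried on a
# copy of the tree; it defaults to the real location.
pci_runtime_snapshot() {
    root="${1:-/sys/bus/pci/devices}"
    for f in "$root"/*/power/runtime_status; do
        [ -r "$f" ] || continue
        dev="${f%/power/runtime_status}"
        printf '%s %s\n' "${dev##*/}" "$(cat "$f")"
    done
}

# Given two snapshot files, print only the devices whose status changed.
pci_runtime_changes() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && before[$1] != $2 {
             print $1 ": " before[$1] " -> " $2
         }' "$1" "$2"
}
```

Typical use would be "pci_runtime_snapshot > before.txt", then the plug or
unplug, then "pci_runtime_snapshot > after.txt" and
"pci_runtime_changes before.txt after.txt". Between the "Initial" and
"Dead port" listings above, the only device that differs is 0000:00:1c.4
(active -> suspended).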


> 
> And please give me the full dmesg for boot and incremental dmesg for
> operations.


The incremental bits are here; the full dmesg I will send only directly to your email, due to its size.

--- dmesg_initial.txt   2013-04-02 14:36:24.000000000 +0200
+++ dmesg_initial__mouse_attached.txt   2013-04-02 14:37:03.000000000 +0200
@@ -1033,3 +1033,35 @@
 [   41.688341] r8169 0000:05:00.0 eth0: link up
 [   42.796053] r8169 0000:05:00.0 eth0: link down
 [   45.152871] r8169 0000:05:00.0 eth0: link up
+[   98.482665] xhci_hcd 0000:0b:00.0: PME# disabled
+[   98.482676] xhci_hcd 0000:0b:00.0: enabling bus mastering
+[   98.482753] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
+[   98.482822] usb usb3: usb wakeup-resume
+[   98.482827] usb usb3: usb auto-resume
+[   98.482856] hub 3-0:1.0: hub_resume
+[   98.482922] hub 3-0:1.0: port 2: status 0301 change 0001
+[   98.482956] usb usb4: usb wakeup-resume
+[   98.482958] usb usb4: usb auto-resume
+[   98.482972] hub 4-0:1.0: hub_resume
+[   98.483226] hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
+[   98.483284] hub 4-0:1.0: hub_suspend
+[   98.483289] usb usb4: bus auto-suspend, wakeup 1
+[   98.592406] hub 3-0:1.0: state 7 ports 2 chg 0004 evt 0000
+[   98.592456] hub 3-0:1.0: port 2, status 0301, change 0000, 1.5 Mb/s
+[   98.712244] usb 3-2: new low-speed USB device number 2 using xhci_hcd
+[   98.750683] usb 3-2: skipped 1 descriptor after interface
+[   98.753594] usb 3-2: default language 0x0409
+[   98.766647] usb 3-2: udev 2, busnum 3, minor = 257
+[   98.766650] usb 3-2: New USB device found, idVendor=0458, idProduct=0036
+[   98.766652] usb 3-2: New USB device strings: Mfr=2, Product=1, SerialNumber=0
+[   98.766653] usb 3-2: Product: NetScroll + Mini Traveler
+[   98.766654] usb 3-2: Manufacturer: Genius
+[   98.767287] usb 3-2: usb_probe_device
+[   98.767289] usb 3-2: configuration #1 chosen from 1 choice
+[   98.767337] usb 3-2: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
+[   98.770540] usb 3-2: Successful Endpoint Configure command
+[   98.771564] usb 3-2: adding 3-2:1.0 (config #1, interface 0)
+[   98.771792] usbhid 3-2:1.0: usb_probe_interface
+[   98.771793] usbhid 3-2:1.0: usb_probe_interface - got id
+[   98.783875] input: Genius NetScroll + Mini Traveler as /devices/pci0000:00/0000:00:1c.4/0000:0b:00.0/usb3/3-2/3-2:1.0/input/input13
+[   98.785500] hid-generic 0003:0458:0036.0001: input,hidraw0: USB HID v1.10 Mouse [Genius NetScroll + Mini Traveler] on usb-0000:0b:00.0-2/input0


Mouse unplug resulting in a suicide of the xhci socket due to xhci_hcd shutting it down:

--- dmesg_initial__mouse_attached.txt   2013-04-02 14:37:03.000000000 +0200
+++ dmesg_initial__mouse_attached__unplugged.txt        2013-04-02 14:37:48.000000000 +0200
@@ -1065,3 +1065,18 @@
 [   98.771793] usbhid 3-2:1.0: usb_probe_interface - got id
 [   98.783875] input: Genius NetScroll + Mini Traveler as /devices/pci0000:00/0000:00:1c.4/0000:0b:00.0/usb3/3-2/3-2:1.0/input/input13
 [   98.785500] hid-generic 0003:0458:0036.0001: input,hidraw0: USB HID v1.10 Mouse [Genius NetScroll + Mini Traveler] on usb-0000:0b:00.0-2/input0
+[  142.025637] hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0004
+[  142.025722] hub 3-0:1.0: port 2, status 0100, change 0001, 12 Mb/s
+[  142.025725] usb 3-2: USB disconnect, device number 2
+[  142.025726] usb 3-2: unregistering device
+[  142.025728] usb 3-2: unregistering interface 3-2:1.0
+[  142.026303] xhci_hcd 0000:0b:00.0: shutdown urb ffff880405d60a20 ep1in-intr
+[  142.124442] usb 3-2: usb_disable_device nuking all URBs
+[  142.131315] usb 3-2: Successful Endpoint Configure command
+[  142.292672] hub 3-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x100
+[  142.292691] hub 3-0:1.0: hub_suspend
+[  142.292699] usb usb3: bus auto-suspend, wakeup 1
+[  142.292808] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
+[  142.292900] xhci_hcd 0000:0b:00.0: PME# enabled
+[  153.974259] pcieport 0000:00:1c.4: PME# enabled
+[  154.014363] pcieport 0000:00:1c.4: PME# disabled

Re-attaching the same mouse does not wake up the dead 0b:00; only 1c.4 is woken up:

--- dmesg_initial__mouse_attached__unplugged.txt        2013-04-02 14:37:48.000000000 +0200
+++ dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead.txt      2013-04-02 14:38:27.000000000 +0200
@@ -1080,3 +1080,7 @@
 [  142.292900] xhci_hcd 0000:0b:00.0: PME# enabled
 [  153.974259] pcieport 0000:00:1c.4: PME# enabled
 [  154.014363] pcieport 0000:00:1c.4: PME# disabled
+[  154.024237] pcieport 0000:00:1c.4: PME# enabled
+[  192.077120] pcieport 0000:00:1c.4: PME# disabled
+[  192.087074] pcieport 0000:00:1c.4: PME# enabled
+[  192.127475] pcieport 0000:00:1c.4: PME# disabled


Doing echo on > ..../...0b:00.../control rescues the port:

--- dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead.txt      2013-04-02 14:38:27.000000000 +0200
+++ dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead__echo_on_0b:00_wakes_up_port.txt 2013-04-02 14:39:52.000000000 +0200
@@ -1084,3 +1084,37 @@
 [  192.077120] pcieport 0000:00:1c.4: PME# disabled
 [  192.087074] pcieport 0000:00:1c.4: PME# enabled
 [  192.127475] pcieport 0000:00:1c.4: PME# disabled
+[  192.136892] pcieport 0000:00:1c.4: PME# enabled
+[  248.761936] pcieport 0000:00:1c.4: PME# disabled
+[  248.781922] xhci_hcd 0000:0b:00.0: PME# disabled
+[  248.781937] xhci_hcd 0000:0b:00.0: enabling bus mastering
+[  248.782109] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_resume: 0
+[  248.782318] usb usb3: usb wakeup-resume
+[  248.782321] usb usb3: usb auto-resume
+[  248.782340] hub 3-0:1.0: hub_resume
+[  248.782397] hub 3-0:1.0: port 2: status 0301 change 0001
+[  248.782426] usb usb4: usb wakeup-resume
+[  248.782428] usb usb4: usb auto-resume
+[  248.782442] hub 4-0:1.0: hub_resume
+[  248.782496] hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
+[  248.782553] hub 4-0:1.0: hub_suspend
+[  248.782557] usb usb4: bus auto-suspend, wakeup 1
+[  248.891635] hub 3-0:1.0: state 7 ports 2 chg 0004 evt 0000
+[  248.891712] hub 3-0:1.0: port 2, status 0301, change 0000, 1.5 Mb/s
+[  249.011519] usb 3-2: new low-speed USB device number 3 using xhci_hcd
+[  249.049943] usb 3-2: skipped 1 descriptor after interface
+[  249.052853] usb 3-2: default language 0x0409
+[  249.065876] usb 3-2: udev 3, busnum 3, minor = 258
+[  249.065880] usb 3-2: New USB device found, idVendor=0458, idProduct=0036
+[  249.065881] usb 3-2: New USB device strings: Mfr=2, Product=1, SerialNumber=0
+[  249.065883] usb 3-2: Product: NetScroll + Mini Traveler
+[  249.065884] usb 3-2: Manufacturer: Genius
+[  249.066481] usb 3-2: usb_probe_device
+[  249.066483] usb 3-2: configuration #1 chosen from 1 choice
+[  249.066526] usb 3-2: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
+[  249.069656] usb 3-2: Successful Endpoint Configure command
+[  249.070823] usb 3-2: adding 3-2:1.0 (config #1, interface 0)
+[  249.071052] usbhid 3-2:1.0: usb_probe_interface
+[  249.071054] usbhid 3-2:1.0: usb_probe_interface - got id
+[  249.082981] input: Genius NetScroll + Mini Traveler as /devices/pci0000:00/0000:00:1c.4/0000:0b:00.0/usb3/3-2/3-2:1.0/input/input14
+[  249.084093] hid-generic 0003:0458:0036.0002: input,hidraw0: USB HID v1.10 Mouse [Genius NetScroll + Mini Traveler] on usb-0000:0b:00.0-2/input0
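
Incremental diffs like the ones above can be produced by saving the kernel log
after each step and running diff -u against the previous save. A minimal
sketch, in which dmesg_step, DMESG_DIR and LOGCMD are hypothetical names
chosen for illustration (LOGCMD only exists so the function can be tried on
a plain file instead of the real log):

```shell
#!/bin/sh
# Save the current kernel log under a step name and print what was added
# since the previous step.
prev_step=""
dmesg_step() {
    step="$1"
    dir="${DMESG_DIR:-.}"
    # LOGCMD is left unquoted on purpose so it may carry arguments.
    ${LOGCMD:-dmesg} > "$dir/dmesg_$step.txt"
    if [ -n "$prev_step" ]; then
        # diff exits 1 when the logs differ, which is the expected case here.
        diff -u "$dir/dmesg_$prev_step.txt" "$dir/dmesg_$step.txt" || :
    fi
    prev_step="$step"
}
```

For example: "dmesg_step initial", attach the mouse, "dmesg_step mouse_attached",
unplug it, "dmesg_step unplugged", and so on.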




Martin


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 15:02                       ` Martin Mokrejs
@ 2013-04-02 16:08                         ` huang ying
  2013-04-02 16:53                           ` Martin Mokrejs
  2013-04-02 16:30                         ` Bjorn Helgaas
  1 sibling, 1 reply; 61+ messages in thread
From: huang ying @ 2013-04-02 16:08 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: Rafael J. Wysocki, Bjorn Helgaas, ACPI Devel Maling List,
	Len Brown, Matthew Garrett, Sarah Sharp

Hi, Martin,

Thanks for your test!

On Tue, Apr 2, 2013 at 11:02 PM, Martin Mokrejs
<mmokrejs@fold.natur.cuni.cz> wrote:
> Hi Ying,
>
> huang ying wrote:
>> Hi, Martin,
>>
>> On Sat, Mar 30, 2013 at 10:03 AM, Martin Mokrejs
>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>> Rafael J. Wysocki wrote:
>
>>>> If it doesn't make all of them go away, does it make *some* of them go away?
>>>
>>> Yes, repeated inserts and removals of devices into xHCI slot work fine, no need
>>> to use "lsusb -vv" to wakeup devices.
>>>
>>> Aside from some minor USB errors (won't mess them here) what is important is the fact
>>> that the eSATA card hotplug works well or perfectly. I just sent to you and other pci devs
>>> much more detailed report under the "Re: 3.9-rc1: pciehp and eSATA card SiI 3132, no XHCI"
>>> thread although this particular testing was done on 3.8.3.
>>>
>>> I think I can stop replying to this thread which is about the patch from Sarah.
>>> My dead XHCI port issue is a power management issue, incidentally also fixed by the
>>> very same patch from Huang Ying. Cool! ;-)
>>
>> Sorry, which patch do you mean?  Or to be more clear, could you test
>> the patch attached? For the XHCI dead port issue?
>
> So I tested your port_dbg.patch on 3.8.3. Or did you want me to do it on 3.8.5?

I think that is OK, although my patch is against 3.9-rc4.

I don't know why, but none of the debug messages that my patch prints
appear in your dmesg.  For example, with my patch, if the PCIe port
(1c.4) goes into the suspended state, there should be something like
the following in the dmesg:

pcieport 0000:00:1c.4: ppri: will go suspend, is_hotplug_bridge: <0 or 1>

Are you sure you sent me the right dmesg?  And did you use the right patch and kernel?

Best Regards,
Huang Ying

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 15:02                       ` Martin Mokrejs
  2013-04-02 16:08                         ` huang ying
@ 2013-04-02 16:30                         ` Bjorn Helgaas
       [not found]                           ` <515B17D9.6030805@fold.natur.cuni.cz>
                                             ` (2 more replies)
  1 sibling, 3 replies; 61+ messages in thread
From: Bjorn Helgaas @ 2013-04-02 16:30 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: huang ying, Rafael J. Wysocki, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
<mmokrejs@fold.natur.cuni.cz> wrote:
> Hi Ying,
>
> huang ying wrote:

>> And please give me the full dmesg for boot and incremental dmesg for
>> operations.
>
>
> The incremental bits here, the full dmesg will send only directly to your email, due to its size.

Is there a bugzilla for this issue?  Please attach the complete dmesg
there or somewhere similar so we can all benefit.

I think we have two problems that may be relevant to this discussion.

1) The _OSC "PCI Express Capability Structure control" bit.  I don't
think Linux pays attention to whether the BIOS has granted us control
over the capability, so we may do things to it that the BIOS doesn't
expect.

2) acpiphp currently uses the presence of _ADR/_EJ0/_RMV to detect
hotplug slots.  I don't think this is sufficient (see
https://bugzilla.kernel.org/show_bug.cgi?id=54981 for details).
Therefore, I don't think pci_bus_has_hotplug_slots() in port_dbg.patch
can be accurate.  I think it returns "false" for some buses where it
should return "true," such as the ExpressCard slot on Chris Clayton's
system (see bug 54981).

Bjorn

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 16:08                         ` huang ying
@ 2013-04-02 16:53                           ` Martin Mokrejs
  0 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-02 16:53 UTC (permalink / raw)
  To: huang ying
  Cc: Rafael J. Wysocki, Bjorn Helgaas, ACPI Devel Maling List,
	Len Brown, Matthew Garrett, Sarah Sharp



huang ying wrote:
> Hi, Martin,
> 
> Thanks for your test!
> 
> On Tue, Apr 2, 2013 at 11:02 PM, Martin Mokrejs
> <mmokrejs@fold.natur.cuni.cz> wrote:
>> Hi Ying,
>>
>> huang ying wrote:
>>> Hi, Martin,
>>>
>>> On Sat, Mar 30, 2013 at 10:03 AM, Martin Mokrejs
>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>> Rafael J. Wysocki wrote:
>>
>>>>> If it doesn't make all of them go away, does it make *some* of them go away?
>>>>
>>>> Yes, repeated inserts and removals of devices into xHCI slot work fine, no need
>>>> to use "lsusb -vv" to wakeup devices.
>>>>
>>>> Aside from some minor USB errors (won't mess them here) what is important is the fact
>>>> that the eSATA card hotplug works well or perfectly. I just sent to you and other pci devs
>>>> much more detailed report under the "Re: 3.9-rc1: pciehp and eSATA card SiI 3132, no XHCI"
>>>> thread although this particular testing was done on 3.8.3.
>>>>
>>>> I think I can stop replying to this thread which is about the patch from Sarah.
>>>> My dead XHCI port issue is a power management issue, incidentally also fixed by the
>>>> very same patch from Huang Ying. Cool! ;-)
>>>
>>> Sorry, which patch do you mean?  Or to be more clear, could you test
>>> the patch attached? For the XHCI dead port issue?
>>
>> So I tested your port_dbg.patch on 3.8.3. Or did you want me to do it on 3.8.5?
> 
> I think that is OK.  Although my patch is against 3.9-rc4.
> 
> I don't know why, but it appears that there is no any debug messages
> that my patch will print in your dmesg.  For example, in my patch, if
> the PCIe port (1c.4) goes into suspended, there should be something as
> follow in the dmesg:
> 
> pcieport 0000:00:1c.4: ppri: will go suspend, is_hotplug_bridge: <0 or 1>
> 
> Are you sure you send me the right dmesg?  Or you use the right patch or kernel?

Damn, you are right, I forgot to apply it. Consider what I reported as vanilla 3.8.3
behavior.

The patch applies over 3.8.3 now, which proves I never ran the patch command. :((

OK, I will redo it with 3.8.5 so we stay close to 3.8.2, where I initially reported
the xHCI dead port issue. And I will open a bug at bugzilla.kernel.org.


Martin
P.S.: Per Rafael's request in another thread at http://marc.info/?l=linux-pm&m=136491301104336&w=2
we might bring something from that back into linux-pci/acpi. Meanwhile, take that as a brief note
how things are on 3.9-rc5.

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
       [not found]                           ` <515B17D9.6030805@fold.natur.cuni.cz>
@ 2013-04-02 20:55                             ` Martin Mokrejs
  2013-04-02 22:16                               ` Sarah Sharp
                                                 ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-02 20:55 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: huang ying, Rafael J. Wysocki, Len Brown, Matthew Garrett,
	Sarah Sharp, ACPI Devel Maling List, linux-pci, Yinghai Lu,
	Huang Ying

[ +linux-pci and Yinghai, as they have already suffered through many emails on individual
 threads, so one overview email hopefully won't hurt] ;-)

Martin Mokrejs wrote:
> 
> 
> Bjorn Helgaas wrote:
>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>> Hi Ying,
>>>
>>> huang ying wrote:
>>
>>>> And please give me the full dmesg for boot and incremental dmesg for
>>>> operations.
>>>
>>>
>>> The incremental bits here, the full dmesg will send only directly to your email, due to its size.
>>
>> Is there a bugzilla for this issue?  Please attach the complete dmesg
>> there or somewhere similar so we can all benefit.
> 
> I changed my mind. I am attaching the dmesg here but omitting the linux-acpi
> list. After I hear a proposal from Rafael/Bjorn I will open separate bugs.
> I thought that the threads I started so far were enough but yes, dmesg
> files don't pass through list filters so I should move that to bugzilla.
> 
> So far my view of the bugs was:
> 
> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
>   (eSATA-based card)

Fixed by Ying Huang's port_dbg.patch applied over 3.8.5 (it fixes acpiphp hotplug
of eSATA and Firewire cards, but NOT the hotplug of a NEC-based USB3 card, hence
bug 4) below). Now I can continue using laptop-mode-tools.

> 2) xHCI dead due to its suspend - 3.8 series and above

Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
XHCI card *in an express card slot* does not suffer from this suspend issue,
although it is also put into suspend when a device is unplugged.

> 3) pciehp completely broken since about 3.6, still 3.9-rc5

Even on 3.9-rc5 with patch 2368081 and port_dbg.patch from Ying Huang, this is
still broken (the eject of a cold-plugged device from an express card slot).
That results in /proc/interrupts claiming IRQ19 is still used by the driver.
A non-forced, manual 'rmmod sata_sil24' removes IRQ 19 from the listing.
The rmmod also removes the association with sata_sil24 from /proc/iomem, but
the device 11:00 is retained in the file with its memory ranges.
lspci provides, as I have described many times, conflicting information.
Actually, I trust lspci more than the /proc/ files.

> 
> 
> 
> There is one more, which actually brought me into all of this in May 2012 at about
> the 3.2.x kernels:
> 
> 4) Even when the upstream port 1c.7 has its control forced to 'on', hot removal of
>    a USB3 express card is broken; only every second eject is recognized.
>    Is likely related to xhci_hcd having a special privilege to handle IRQ/PM
>    in its own way. In contrast, Firewire and eSATA cards work under same
>    circumstances. I see different sleep states listed as supported by those
>    cards but my bet is that is due to the exceptional xhci_hcd privilege.
>    I briefly repeated that already with 3.9-rc5.

Still broken even with port_dbg.patch applied over 3.8.5. It turns out the unnoticed
ejects and inserts are actually detected, but later, with a delay of 30 sec or so.
Hmm, in my original thread back in 2012 I said a 60 sec delay, but it is likely
still the same problem:
3.2.11: PCI Express card cannot be re-detected withing cca 60sec timeframe

Before I forget, I will sketch several more bugs I hit, all documented
in my postings from the last week or two. I can provide the URLs of those postings,
which are already in the archives, and maybe summarize them in bugzilla, after we agree what
will be worked on and where (email ... bugzilla), under the best matching subject
you will propose.

5) lspci causes wake and suspend of pcieport-handled devices. I fear this is
not good. Maybe it does the same to other pci devices, but the "problem" is
that no other pci drivers report the same type of message. I would like to see
the PME# enabled/disabled messages generated by other drivers as well, ideally by some
upstream, common driver.

6) sata_sil24 sometimes initializes badly under pciehp. Once pciehp is fixed,
I would still like to get the init of sata_sil24 fixed as well.
There are two wrong paths in the driver. One is:

[  899.894862] sata_sil24 0000:11:00.0: version 1.1
[  899.894880] sata_sil24 0000:11:00.0: enabling device (0000 -> 0003)
[  899.985994] sata_sil24 0000:11:00.0: failed to clear port RST
[  900.086097] sata_sil24 0000:11:00.0: failed to clear port RST
[  900.086119] sata_sil24 0000:11:00.0: enabling bus mastering

while the other is:

[  974.021661] pcieport 0000:00:1c.0: PME# disabled
[  974.041697] pcieport 0000:00:1c.7: PME# disabled
[ 1048.450168] sata_sil24 0000:11:00.0: version 1.1
[ 1048.463692] sata_sil24 0000:11:00.0: Refused to change power state, currently in D3
[ 1048.563818] sata_sil24 0000:11:00.0: failed to clear port RST
[ 1048.663935] sata_sil24 0000:11:00.0: failed to clear port RST

Both lead to a broken device and I would prefer the driver to fail to load.
It seems they are at least in part related to early device eject while the
driver did not yet turn down an unused external SATA port.

7) It seems Rafael or Bjorn have a clue why I sometimes see only PME# disabled
or just PME# enabled in dmesg for a particular device, and I am worried about when
it was silently switched to the other state. I would like to hear that this can be
prevented in future by some cross-checks, by design.

8) I don't know whether one can ensure that a driver releases either both
the IRQ and the memory ranges it has allocated, or nothing at all (or an oops happens,
whatever). Maybe something could track what the driver grabbed and make
sure both are released. Even a background scan of the /proc files would be fine.
The disagreement with lspci is not good.

9) In the thread
Re: 3.8.2: stale pci device info for a previously inserted express card
I already showed an example of chimeric entries appearing in 'lspci -vvv' output.
Some data describe the card previously loaded in the Express
Card slot while other data describe the one currently loaded in the slot.
This might help explain why there are lines like these in lspci:

a)
Latency: 0
Latency: 0, Cache Line Size: 64 bytes
or the Latency: line missing altogether

b)
[virtual] Expansion ROM at f6c00000 [disabled] [size=512K]
Expansion ROM at f6c00000 [size=512K]

c)
Region 0: Memory at f6c84000 (64-bit, non-prefetchable) [size=128]
Region 0: Memory at f6c84000 (64-bit, non-prefetchable) [disabled] [size=128]

If the kernel does not give a hint about what is wrong with a device/driver, then
maybe lspci could do a runtime check and give a more useful user-oriented warning.
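
For items 5) and 7), one cheap cross-check from user space is to scan the log
for a device that reports the same PME# state twice in a row. This is only a
sketch: pme_unpaired is a hypothetical helper name, and it assumes the log
lines look like the "PME# enabled" / "PME# disabled" lines quoted in this
thread. It can only see transitions that were actually logged.

```shell
#!/bin/sh
# Read dmesg text on stdin and report every device that logs the same PME#
# state twice in a row, i.e. the opposite transition never showed up.
pme_unpaired() {
    awk '
    / PME# (enabled|disabled)$/ {
        state = $NF              # "enabled" or "disabled"
        dev = $(NF - 2)          # e.g. "0000:00:1c.4:"
        sub(/:$/, "", dev)       # drop the trailing colon
        if ((dev in last) && last[dev] == state)
            print dev ": PME# " state " logged twice in a row"
        last[dev] = state
    }'
}
```

It would be run as "dmesg | pme_unpaired"; no output means every logged
PME# change alternated for each device.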

>>
>> I think we have two problems that may be relevant to this discussion.
>>
>> 1) The _OSC "PCI Express Capability Structure control" bit.  I don't
>> think Linux pays attention to whether the BIOS has granted us control
>> over the capability, so we may do things to it that the BIOS doesn't
>> expect.
>>
>> 2) acpiphp currently uses the presence of _ADR/_EJ0/_RMV to detect
>> hotplug slots.  I don't think this is sufficient (see
>> https://bugzilla.kernel.org/show_bug.cgi?id=54981 for details).
>> Therefore, I don't think pci_bus_has_hotplug_slots() in port_dbg.patch
>> can be accurate.  I think it returns "false" for some buses where it
>> should return "true," such as the ExpressCard slot on Chris Clayton's
>> system (see bug 54981).
> 
> But I do not know whether and how to split the above 4 bugs into maybe more,
> better described bugs. I will likely repeat them with 3.8.5 and 3.9-rc5;
> I got quite skilled at running diff over the last days and weeks. ;-)
> 
> I am waiting for some answers from you before opening bug reports.
> Please tell me how to name them and what data you want to get where.
> After I open them I will try to (re)attach your patches. Ying, do you have an
> update for the port_dbg.patch per Bjorn's comments about pci_bus_has_hotplug_slots()
> being inaccurate? I would gladly wait for an updated patch catching
> more scenarios rather than fewer.

Feel free to comment on the listing of deemed bugs, or add more that you saw in the
logs or diffs yourself (especially those downstream, secondary bugs which will
soon be masked by the hotplug issues being *fixed*). ;)
I am quite optimistic. ;))

The above listings don't contain URLs, but that can all be sorted out in
the respective bugzilla entries.

Thank you,
Martin

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 20:55                             ` Martin Mokrejs
@ 2013-04-02 22:16                               ` Sarah Sharp
  2013-04-03 10:35                                 ` Martin Mokrejs
  2013-04-03  2:34                               ` huang ying
  2013-04-03 12:16                               ` Martin Mokrejs
  2 siblings, 1 reply; 61+ messages in thread
From: Sarah Sharp @ 2013-04-02 22:16 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: Bjorn Helgaas, huang ying, Rafael J. Wysocki, Len Brown,
	Matthew Garrett, ACPI Devel Maling List, linux-pci, Yinghai Lu,
	Huang Ying

On Tue, Apr 02, 2013 at 10:55:02PM +0200, Martin Mokrejs wrote:
> > 2) xHCI dead due to to its suspend - 3.8 series and above
> 
> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
> XHCI card *in an express card slot* does not suffer this suspend issue.
> Although it is being put into suspend if a device is unplugged.

Wait, wait, wait.  Time out.  You have *two* xHCI host controllers?  Are
they different vendors?  Are they exhibiting different broken behaviors?
Please state for each host controller exactly the symptoms you are
seeing (no dmesg or other log files yet, just one paragraph for each
host).

Sarah Sharp

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 16:30                         ` Bjorn Helgaas
       [not found]                           ` <515B17D9.6030805@fold.natur.cuni.cz>
@ 2013-04-02 22:49                           ` Rafael J. Wysocki
  2013-04-02 23:58                             ` Bjorn Helgaas
  2013-04-03  2:04                           ` huang ying
  2 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-04-02 22:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Martin Mokrejs, huang ying, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

On Tuesday, April 02, 2013 10:30:54 AM Bjorn Helgaas wrote:
> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
> <mmokrejs@fold.natur.cuni.cz> wrote:
> > Hi Ying,
> >
> > huang ying wrote:
> 
> >> And please give me the full dmesg for boot and incremental dmesg for
> >> operations.
> >
> >
> > The incremental bits here, the full dmesg will send only directly to your email, due to its size.
> 
> Is there a bugzilla for this issue?  Please attach the complete dmesg
> there or somewhere similar so we can all benefit.
> 
> I think we have two problems that may be relevant to this discussion.
> 
> 1) The _OSC "PCI Express Capability Structure control" bit.  I don't
> think Linux pays attention to whether the BIOS has granted us control
> over the capability, so we may do things to it that the BIOS doesn't
> expect.

Yes, it does, as far as I can say.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 22:49                           ` Rafael J. Wysocki
@ 2013-04-02 23:58                             ` Bjorn Helgaas
  2013-04-03 11:00                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Bjorn Helgaas @ 2013-04-02 23:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Martin Mokrejs, huang ying, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

On Tue, Apr 2, 2013 at 4:49 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Tuesday, April 02, 2013 10:30:54 AM Bjorn Helgaas wrote:
>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>> <mmokrejs@fold.natur.cuni.cz> wrote:
>> > Hi Ying,
>> >
>> > huang ying wrote:
>>
>> >> And please give me the full dmesg for boot and incremental dmesg for
>> >> operations.
>> >
>> >
>> > The incremental bits here, the full dmesg will send only directly to your email, due to its size.
>>
>> Is there a bugzilla for this issue?  Please attach the complete dmesg
>> there or somewhere similar so we can all benefit.
>>
>> I think we have two problems that may be relevant to this discussion.
>>
>> 1) The _OSC "PCI Express Capability Structure control" bit.  I don't
>> think Linux pays attention to whether the BIOS has granted us control
>> over the capability, so we may do things to it that the BIOS doesn't
>> expect.
>
> Yes, it does, as far as I can say.

Let me expand on this to see if we're talking about the same thing.
I'm looking at Tables 6-149 and 6-150 in ACPI spec 5.0, and I
interpret them as saying the OS should not perform any configuration
in the PCI Express, VC, Power Budgeting, AER, or Serial Number
capabilities unless the BIOS grants "PCI Express Capability Structure
control."

I see the code in acpi_pci_root_add() that runs _OSC with
OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL in flags, but I don't see
anything that uses the result to decide whether we can write to the
Device Control, Link Control, Slot Control, etc., registers in the
capability, or to the other extended capabilities.  (ASPM is an
exception -- we *do* call pcie_no_aspm(), and maybe that keeps us from
touching the ASPM part of Link Control.)

I haven't looked deeply enough to identify a problem; it's just
something that worries me.

Bjorn

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 16:30                         ` Bjorn Helgaas
       [not found]                           ` <515B17D9.6030805@fold.natur.cuni.cz>
  2013-04-02 22:49                           ` Rafael J. Wysocki
@ 2013-04-03  2:04                           ` huang ying
  2013-04-03 17:29                             ` Bjorn Helgaas
  2 siblings, 1 reply; 61+ messages in thread
From: huang ying @ 2013-04-03  2:04 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Martin Mokrejs, Rafael J. Wysocki, ACPI Devel Maling List,
	Len Brown, Matthew Garrett, Sarah Sharp

On Wed, Apr 3, 2013 at 12:30 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
> <mmokrejs@fold.natur.cuni.cz> wrote:
>> Hi Ying,
>>
>> huang ying wrote:
>
>>> And please give me the full dmesg for boot and incremental dmesg for
>>> operations.
>>
>>
>> The incremental bits here, the full dmesg will send only directly to your email, due to its size.
>
> Is there a bugzilla for this issue?  Please attach the complete dmesg
> there or somewhere similar so we can all benefit.
>
> I think we have two problems that may be relevant to this discussion.
>
> 1) The _OSC "PCI Express Capability Structure control" bit.  I don't
> think Linux pays attention to whether the BIOS has granted us control
> over the capability, so we may do things to it that the BIOS doesn't
> expect.
>
> 2) acpiphp currently uses the presence of _ADR/_EJ0/_RMV to detect
> hotplug slots.  I don't think this is sufficient (see
> https://bugzilla.kernel.org/show_bug.cgi?id=54981 for details).
> Therefore, I don't think pci_bus_has_hotplug_slots() in port_dbg.patch
> can be accurate.  I think it returns "false" for some buses where it
> should return "true," such as the ExpressCard slot on Chris Clayton's
> system (see bug 54981).

Yes, pci_bus_has_hotplug_slots() is not accurate.  But I still think
it can be used in port runtime PM, because if there are no hotplug
slots registered, hotplug itself cannot work properly, with or
without port runtime PM enabled.  And we should add the necessary
pm_runtime_get_sync/put_sync calls into pci_scan_bus to deal with "rescan".
What do you think about that?

pci_dev->is_hotplug_bridge is not accurate either.  It reports an internal
port of my X220 as a hotplug-able port.  But it appears to err on the side
of reporting more ports rather than fewer, and it reports correctly for the
port in bug 54981.  Do you think that is a good choice for port runtime PM
filtering?

Best Regards,
Huang Ying

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 20:55                             ` Martin Mokrejs
  2013-04-02 22:16                               ` Sarah Sharp
@ 2013-04-03  2:34                               ` huang ying
  2013-04-03 10:39                                 ` Martin Mokrejs
  2013-04-03 12:16                               ` Martin Mokrejs
  2 siblings, 1 reply; 61+ messages in thread
From: huang ying @ 2013-04-03  2:34 UTC (permalink / raw)
  To: Martin Mokrejs
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Matthew Garrett,
	Sarah Sharp, ACPI Devel Maling List, linux-pci, Yinghai Lu,
	Huang Ying

Hi, Martin,

On Wed, Apr 3, 2013 at 4:55 AM, Martin Mokrejs
<mmokrejs@fold.natur.cuni.cz> wrote:
> [ +linux-pci and Yinghai as they suffered already those many emails on individual
>  threads so one overviewing email hopefully won't harm] ;-)
>
> Martin Mokrejs wrote:
>>
>>
>> Bjorn Helgaas wrote:
>>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>> Hi Ying,
>>>>
>>>> huang ying wrote:
>>>
>>>>> And please give me the full dmesg for boot and incremental dmesg for
>>>>> operations.
>>>>
>>>>
>>>> The incremental bits here, the full dmesg will send only directly to your email, due to its size.
>>>
>>> Is there a bugzilla for this issue?  Please attach the complete dmesg
>>> there or somewhere similar so we can all benefit.
>>
>> I changed my mind. I am attaching the dmesg here but omitting linux-acpi
>> list. After I hear a proposal from Rafel/Bjorn I will open separate bugs.
>> I thought that the threads I started so far were enough but yes, dmesg
>> files don't pass through list filters so I should move that to bugzilla.
>>
>> so far my view of the the bugs was:
>>
>> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
>>   (eSATA-based card)
>
> Fixed by Ying Huang port_dbg.patch applied over 3.8.5 (fixes acpiphp hotplug
> of eSATA and Firewire cards, NOT the hotplug of a NEC-based USB3 card -> hence
> the bug 4) below). Now I can continue using laptop-mode-tools.
>
>
>> 2) xHCI dead due to to its suspend - 3.8 series and above
>
> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
> XHCI card *in an express card slot* does not suffer this suspend issue.
> Although it is being put into suspend if a device is unplugged.

I cannot find the dmesg or any other details about this.  Could you
provide some details?  Or did I miss some emails from you?

Best Regards,
Huang Ying

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 22:16                               ` Sarah Sharp
@ 2013-04-03 10:35                                 ` Martin Mokrejs
  0 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-03 10:35 UTC (permalink / raw)
  To: Sarah Sharp
  Cc: Bjorn Helgaas, huang ying, Rafael J. Wysocki, Len Brown,
	Matthew Garrett, ACPI Devel Maling List, linux-pci, Yinghai Lu,
	Huang Ying

Sarah Sharp wrote:
> On Tue, Apr 02, 2013 at 10:55:02PM +0200, Martin Mokrejs wrote:
>>> 2) xHCI dead due to to its suspend - 3.8 series and above
>>
>> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
>> XHCI card *in an express card slot* does not suffer this suspend issue.
>> Although it is being put into suspend if a device is unplugged.
> 
> Wait, wait, wait.  Time out.  You have *two* xHCI host controllers?  Are
> they different vendors?  Are they exhibiting different broken behaviors?
> Please state for each host controller exactly the symptoms you are
> seeing (no dmesg or other log files yet, just one paragraph for each
> host).

The laptop has a Texas Instruments controller, which suffers from the problem that
once it is suspended (0b:00) it does not notice that a new device was plugged
into the socket, so the end USB device gets no power and is dead. A manual wakeup
using echo 'on' > /sys/.../*0b:00/control wakes up the upstream PCIe root port
1c.4 (at least with the patch) and the 0b:00 itself, as intended by the echo
command. That lets the TI controller realize that e.g. a mouse is connected to
the socket, and it picks it up.
What is not clear to me is why the xHCI socket is not dead upon bootup with no
USB devices attached. That also leaves the controller 0b:00 in the suspended state,
but the very first plug-in of e.g. the mouse is picked up and the mouse works.
Upon unplug of the mouse something gets screwed up. We thought that it was due to
the upstream port being suspended, but even with the patch preventing that
(port_dbg.patch) the broken state is entered: the 0b:00 falls asleep, its
runtime_status file says 'suspended' and the socket is dead.
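
The manual wakeup described above can be written as a small helper. This is only
a sketch: the function name force_runtime_on and the choice to pass the device's
sysfs directory as an argument are assumptions made for illustration, and the
directory itself has to be filled in by hand because the path is abbreviated
above. It uses only the power/runtime_status and power/control attributes.

```shell
#!/bin/sh
# force_runtime_on DEVDIR
# DEVDIR is the sysfs directory of one PCI device (hypothetical argument;
# the caller has to supply the real path).
force_runtime_on() {
    dev_dir=$1
    status_file=$dev_dir/power/runtime_status
    control_file=$dev_dir/power/control

    if [ ! -w "$control_file" ]; then
        echo "cannot write $control_file" >&2
        return 1
    fi

    # Show the state before touching anything, e.g. "suspended".
    [ -r "$status_file" ] && echo "before: $(cat "$status_file")"

    # 'on' asks the kernel to resume the device and keep it active,
    # i.e. it is no longer runtime-suspended automatically.
    echo on > "$control_file"

    [ -r "$status_file" ] && echo "after: $(cat "$status_file")"
    return 0
}
```

It would be called as "force_runtime_on <sysfs directory of the controller>".
Writing 'auto' to the same control file afterwards should hand the device back
to runtime PM.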

You may remember that a year ago I started the threads about Express Card
hotplug issues with another, NEC-based USB3 controller I have. To better test
the patch from Ying Huang I also tried what happens with the NEC-based controller.
It works. I have not provided you the logs, although from the debug info Ying added
it seems the code flows in a different way. I had XHCI_DEBUG enabled while no
external USB devices were attached, and because I tested with a USB2 device (the mouse)
the xhci_hcd did not flood the logs too much.

Martin

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-03  2:34                               ` huang ying
@ 2013-04-03 10:39                                 ` Martin Mokrejs
  0 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-03 10:39 UTC (permalink / raw)
  To: huang ying
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Matthew Garrett,
	Sarah Sharp, ACPI Devel Maling List, linux-pci, Yinghai Lu,
	Huang Ying

huang ying wrote:
> Hi, Martin,
> 
> On Wed, Apr 3, 2013 at 4:55 AM, Martin Mokrejs
> <mmokrejs@fold.natur.cuni.cz> wrote:
>> [ +linux-pci and Yinghai as they suffered already those many emails on individual
>>  threads so one overviewing email hopefully won't harm] ;-)
>>
>> Martin Mokrejs wrote:
>>>
>>>
>>> Bjorn Helgaas wrote:
>>>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>>> Hi Ying,
>>>>>
>>>>> huang ying wrote:
>>>>
>>>>>> And please give me the full dmesg for boot and incremental dmesg for
>>>>>> operations.
>>>>>
>>>>>
>>>>> The incremental bits here, the full dmesg will send only directly to your email, due to its size.
>>>>
>>>> Is there a bugzilla for this issue?  Please attach the complete dmesg
>>>> there or somewhere similar so we can all benefit.
>>>
>>> I changed my mind. I am attaching the dmesg here but omitting linux-acpi
>>> list. After I hear a proposal from Rafel/Bjorn I will open separate bugs.
>>> I thought that the threads I started so far were enough but yes, dmesg
>>> files don't pass through list filters so I should move that to bugzilla.
>>>
>>> so far my view of the the bugs was:
>>>
>>> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
>>>   (eSATA-based card)
>>
>> Fixed by Ying Huang port_dbg.patch applied over 3.8.5 (fixes acpiphp hotplug
>> of eSATA and Firewire cards, NOT the hotplug of a NEC-based USB3 card -> hence
>> the bug 4) below). Now I can continue using laptop-mode-tools.
>>
>>
>>> 2) xHCI dead due to to its suspend - 3.8 series and above
>>
>> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
>> XHCI card *in an express card slot* does not suffer this suspend issue.
>> Although it is being put into suspend if a device is unplugged.
> 
> Do not find the dmesg or any other details about this.  Could you
> provide some details?  Or I miss some emails from you?

No, I have not sent them yet. ;-) I was really waiting for answers on how to separate
the bugs, how to name them, which components to pick in bugzilla, etc.


So? ;)

> 
> Best Regards,
> Huang Ying
> 
> 

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 23:58                             ` Bjorn Helgaas
@ 2013-04-03 11:00                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2013-04-03 11:00 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Martin Mokrejs, huang ying, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp

On Tuesday, April 02, 2013 05:58:42 PM Bjorn Helgaas wrote:
> On Tue, Apr 2, 2013 at 4:49 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Tuesday, April 02, 2013 10:30:54 AM Bjorn Helgaas wrote:
> >> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
> >> <mmokrejs@fold.natur.cuni.cz> wrote:
> >> > Hi Ying,
> >> >
> >> > huang ying wrote:
> >>
> >> >> And please give me the full dmesg for boot and incremental dmesg for
> >> >> operations.
> >> >
> >> >
> >> > The incremental bits here, the full dmesg will send only directly to your email, due to its size.
> >>
> >> Is there a bugzilla for this issue?  Please attach the complete dmesg
> >> there or somewhere similar so we can all benefit.
> >>
> >> I think we have two problems that may be relevant to this discussion.
> >>
> >> 1) The _OSC "PCI Express Capability Structure control" bit.  I don't
> >> think Linux pays attention to whether the BIOS has granted us control
> >> over the capability, so we may do things to it that the BIOS doesn't
> >> expect.
> >
> > Yes, it does, as far as I can say.
> 
> Let me expand on this to see if we're talking about the same thing.
> I'm looking at Tables 6-149 and 6-150 in ACPI spec 5.0, and I
> interpret them as saying the OS should not perform any configuration
> in the PCI Express, VC, Power Budgeting, AER, or Serial Number
> capabilities unless the BIOS grants "PCI Express Capability Structure
> control."
> 
> I see the code in acpi_pci_root_add() that runs _OSC with
> OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL in flags, but I don't see
> anything that uses the result to decide whether we can write to the
> Device Control, Link Control, Slot Control, etc., registers in the
> capability, or to the other extended capabilities.  (ASPM is an
> exception -- we *do* call pcie_no_aspm(), and maybe that keeps us from
> touching the ASPM part of Link Control.)
> 
> I haven't looked deeply enough to identify a problem; it's just
> something that worries me.

That actually is more convoluted.  Because we pass
OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL to acpi_pci_osc_control_set() as the
minimum requirement, it won't request any of the other PCIe flags without it.
This causes those flags to be unset in root->osc_control_set and then
*srv_mask from pcie_port_acpi_setup() will always be 0, in which case no native
PCIe port services will be used.

Of course, whether this is sufficient or not is a good question.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-02 20:55                             ` Martin Mokrejs
  2013-04-02 22:16                               ` Sarah Sharp
  2013-04-03  2:34                               ` huang ying
@ 2013-04-03 12:16                               ` Martin Mokrejs
  2013-04-04 11:30                                 ` Huang Ying
  2 siblings, 1 reply; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-03 12:16 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: huang ying, Rafael J. Wysocki, Len Brown, Matthew Garrett,
	Sarah Sharp, ACPI Devel Maling List, linux-pci, Yinghai Lu,
	Huang Ying

Meanwhile, the raw data: http://195.113.57.32/~mmokrejs/tmp/20130402.tar.bz2
(size 468641 bytes)

They were collected by:

# cat ~/bin/collect_runtime_status.sh 
#!/bin/sh
grep . /sys/bus/pci/devices/*/power/runtime_status > runtime_status_"$1".txt
grep . /sys/bus/pci/devices/*/power/control > control_"$1".txt
cat /proc/interrupts > interrupts_"$1".txt
cat /proc/iomem > iomem_"$1".txt
lspci -vvv > lspci_vvv_"$1".txt
dmesg > dmesg_"$1".txt
#

Just do 'ls -latr' to see the ordering of the files as they were created.
The longer the filename, the later in the test process. The names should be
relatively self-explanatory. In any case, from the log files you should see
what really happened and can therefore figure out what the (maybe weird)
long filename really meant.

Sometimes I manually recorded lsusb or dmesg_final.txt, mostly after I did some
extra tests but did not want to record every step with the above 6 files.

In one or two places I added some notes of my own to the COMMENTS file.




I will try to guide you below to where you can study each of the bugs. Mostly,
for each bug you need just one subdirectory to look into; the others just
repeat the same bug under a different kernel version or another patch.
However, for the xHCI dead port issue Sarah will need to diff
two directories, one with the TI-based controller tests, the other with the
NEC-based tests. Especially there, I would do something like:

mkdir -p /tmp/TI /tmp/NEC
cd *TI-based; for f in dmesg*; do cut -c 15- "$f" > /tmp/TI/"$f"; done
cd ../*NEC-based; for f in dmesg*; do cut -c 15- "$f" > /tmp/NEC/"$f"; done

Then it should be easier to poke through files captured at the same test step,
like:

diff -u -w /tmp/TI/dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead.txt \
/tmp/NEC/dmesg_initial__mouse_attached__detached__reattached.txt
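
Note that cut -c 15- relies on the "[seconds.microseconds]" prefix having a fixed
width, which as far as I know holds only while the uptime stays below 100000
seconds. A sketch of a width-independent variant follows; the function names
strip_dmesg_ts and strip_all are made up here, and the pattern assumes the
default dmesg timestamp format seen in the log excerpts in this mail:

```shell
#!/bin/sh
# Remove a leading "[ seconds.microseconds] " prefix from each line on stdin.
# Lines without such a prefix pass through unchanged.
strip_dmesg_ts() {
    # The ']' after the digits is an ordinary character outside a bracket
    # expression; the single space after it is optional.
    sed 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] \{0,1\}//'
}

# strip_all SRC_DIR DST_DIR
# Write timestamp-free copies of SRC_DIR/dmesg* into DST_DIR.
strip_all() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/dmesg*; do
        [ -f "$f" ] || continue
        strip_dmesg_ts < "$f" > "$dst/$(basename "$f")"
    done
}
```

With that, the two loops above would become "strip_all <TI directory> /tmp/TI" and
"strip_all <NEC directory> /tmp/NEC", and the diff commands stay the same.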



Other than that, just diff pairs of files with each other, like:

diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__mouse_attached.txt


Sorry that I sometimes used only a single underscore instead of double underscores
to separate the test steps from each other in the filename.


Martin Mokrejs wrote:
> [ +linux-pci and Yinghai as they suffered already those many emails on individual
>  threads so one overviewing email hopefully won't harm] ;-)
> 
> Martin Mokrejs wrote:
>>
>>
>> Bjorn Helgaas wrote:
>>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>> Hi Ying,
>>>>
>>>> huang ying wrote:
>>>
>>>>> And please give me the full dmesg for boot and incremental dmesg for
>>>>> operations.
>>>>
>>>>
>>>> The incremental bits here, the full dmesg will send only directly to your email, due to its size.
>>>
>>> Is there a bugzilla for this issue?  Please attach the complete dmesg
>>> there or somewhere similar so we can all benefit.
>>
>> I changed my mind. I am attaching the dmesg here but omitting linux-acpi
>> list. After I hear a proposal from Rafel/Bjorn I will open separate bugs.
>> I thought that the threads I started so far were enough but yes, dmesg
>> files don't pass through list filters so I should move that to bugzilla.
>>
>> so far my view of the the bugs was:
>>
>> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
>>   (eSATA-based card)
> 
> Fixed by Ying Huang port_dbg.patch applied over 3.8.5 (fixes acpiphp hotplug
> of eSATA and Firewire cards, NOT the hotplug of a NEC-based USB3 card -> hence
> the bug 4) below). Now I can continue using laptop-mode-tools.

20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_eSATA_testing
20130402/3.8.3-vanilla__with_laptop-mode-tools (with some comments in
                                                COMMENTS file)


>> 2) xHCI dead due to to its suspend - 3.8 series and above
> 
> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
> XHCI card *in an express card slot* does not suffer this suspend issue.
> Although it is being put into suspend if a device is unplugged.

20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based
20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_NEC-based

The same thing, but still without port_dbg.patch:
20130402/3.9-rc5__with_2368081__with-latop-mode-tools_xhci_testing/


>> 3) pciehp completely broken since about 3.6, still 3.9-rc5
> 
> Even 3.9-rc5 with patch 2368081 and port_dbg.patch from Ying Huang this is
> still broken (the eject of a cold plugged device from an express card slot).
> That results in /proc/interrupts claiming IRQ19 is still used by the driver.
> Non-forced but manual 'rmmod sata_sil24' removes the IRQ 19 from the listing.
> The rmmod also removes association with sata_sil24 from the /proc/iomem but
> the device 11:00 is retained in the file with its memory ranges.
> lspci provides, as many times described by me, conflicting information.
> Actually, I trust more lspci than /proc/ files.

Tests with express cards SATA SiI3132 and FireWire VT6315:
20130402/3.9-rc5__with_2368081__and__ying_port-dbg__with-latop-mode-tools_eSATA_testing
20130402/3.9-rc5__with_2368081__and__ying_port-dbg__with-latop-mode-tools_FireWire_testing

A bit more testing, still without port_dbg.patch (it contains more data for you,
so look into it after the above two):
20130402/3.9-rc5__with_2368081__with-latop-mode-tools_eSATA_testing


>> There is one more which actually brought me into all of this in May2012 at about
>> 3.2.x kernels:
>>
>> 4) Even when upstream port 1c.7 is force control to 'on' hot removal of
>>    USB3 express card is broken, only every second eject is recognized.
>>    Is likely related to xhci_hcd having a special privilege to handle IRQ/PM
>>    in its own way. In contrast, Firewire and eSATA cards work under same
>>    circumstances. I see different sleep states listed as supported by those
>>    cards but my bet is that is due to the exceptional xhci_hcd privilege.
>>    I briefly repeated that already with 3.9-rc5.
> 
> Still broken even with port_dbg.patch applied over 3.8.5. Turns out the unnoticed
> ejects and inserts are actually detected, but later, with 30sec delay of so.
> Hmm, in my original thread back in 2012 I said 60sec delay but seems is likely
> still the same problem:
> 3.2.11: PCI Express card cannot be re-detected withing cca 60sec timeframe

20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_NEC-based_eject_testing


> Before I forget, I will sketch several more bugs I hit and are all documented
> in my postings from last week or two. I can provide the URLs to those postings
> already in archives and maybe summarize them in bugzilla, after we agree what
> will be worked on and where (email ... bugzilla), under the best matching subject
> you will propose.
> 
> 
> 5) lspci causes wake and suspend of pcieport handled devices. I fear this is
> not good. Maybe it does the same to other pci devices but the "problem" is
> that no other pci drivers report same type of message. I would like to see
> the PME# enabled/disabled generated by other drivers as well, ideally by some
> upstream, common driver.

At least in some cases, lspci -vv causes 7x these:

lspci -vvv causes 11x same message.
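
A rough way to reproduce such counts is to save dmesg before and after running
lspci and compare the number of matching lines. The sketch below assumes that
"these" refers to the pcieport "PME# enabled" / "PME# disabled" lines from
item 5; the function names count_pme_lines and pme_delta are made up for
illustration:

```shell
#!/bin/sh
# count_pme_lines FILE
# Print how many lines of a saved dmesg mention "PME# enabled" or
# "PME# disabled".
count_pme_lines() {
    # grep -c prints 0 but exits non-zero when nothing matches; keep the 0.
    grep -c -E 'PME# (enabled|disabled)' "$1" || :
}

# pme_delta BEFORE AFTER
# Print how many PME# lines were added between two saved dmesg dumps.
pme_delta() {
    [ -r "$1" ] && [ -r "$2" ] || return 1
    before=$(count_pme_lines "$1")
    after=$(count_pme_lines "$2")
    echo $((after - before))
}
```

For example: dmesg > before.txt; lspci -vv > /dev/null; dmesg > after.txt;
pme_delta before.txt after.txt. This only works as long as the kernel log
buffer does not wrap between the two dumps.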


> 
> 
> 6) sata_sil24 sometimes initializes badly under pciehp. Provided you once fix
> the pciehp and still would like to get the init of sata_sil24 fixed as well.
> The are two wrong paths in the driver. One is:
> 
> [  899.894862] sata_sil24 0000:11:00.0: version 1.1
> [  899.894880] sata_sil24 0000:11:00.0: enabling device (0000 -> 0003)
> [  899.985994] sata_sil24 0000:11:00.0: failed to clear port RST
> [  900.086097] sata_sil24 0000:11:00.0: failed to clear port RST
> [  900.086119] sata_sil24 0000:11:00.0: enabling bus mastering

20130402/3.9-rc5__with_2368081__with-laptop-mode-tools_eSATA_testing/

> 
> while the other is:
> 
> [  974.021661] pcieport 0000:00:1c.0: PME# disabled
> [  974.041697] pcieport 0000:00:1c.7: PME# disabled
> [ 1048.450168] sata_sil24 0000:11:00.0: version 1.1
> [ 1048.463692] sata_sil24 0000:11:00.0: Refused to change power state, currently in D3
> [ 1048.563818] sata_sil24 0000:11:00.0: failed to clear port RST
> [ 1048.663935] sata_sil24 0000:11:00.0: failed to clear port RST

20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_NEC-based_eject_testing






You will come across the bugs below in multiple places in the tar.bz2 archive, but
they were also well described in the past email threads. It does not make sense to
repeat all of that here or there. I suggest you come up with a debug patch to help with
these and then we can dive into more crafted log data.

> 
> Both lead to a broken device and I would prefer the driver to fail to load.
> It seems they are at least in part related to early device eject while the
> driver did not yet turn down an unused external SATA port.
> 
> 
> 7) It seems Rafael or Bjorn have a clue why sometimes I see only PME# disabled
> or just PME# enabled in dmesg for a particular device and I am worried when was
> it silently switched to the other state. I would like to hear this can be prevented
> in future by some cross-checks, by design.
> 
> 
> 8) I don't know whether one can ensure that a driver releases either both
> IRQ and memory ranges it has allocated, or just nothing, or an oops happens,
> whatever. Maybe something could track what the driver grabbed once and make
> sure both are released. even a background scan or /proc files would be fine.
> The disagreement with lspci is not good.
> 
> 
> 9) In the thread 
> Re: 3.8.2: stale pci device info for a previously inserted express card
> I already showed an example that chimeric entries in 'lspci -vvv' output
> can appear. Some data describe the previously loaded card in an Express
> Card Slot while the other the one currently loaded in the slot.
> This might lead to an explanation why are there those lines in lspci like:
> 
> a)
> Latency: 0
> Latency: 0, Cache Line Size: 64 bytes
> or the Latency: line missing altogether
> 
> b)
> [virtual] Expansion ROM at f6c00000 [disabled] [size=512K]
> Expansion ROM at f6c00000 [size=512K]
> 
> c)
> Region 0: Memory at f6c84000 (64-bit, non-prefetchable) [size=128]
> Region 0: Memory at f6c84000 (64-bit, non-prefetchable) [disabled] [size=128]
> 
> 
> If kernel does not give a hint what is wrong with a device/driver then
> maybe lspci do do a runtime check and give some more useful user-oriented warning.


Martin

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-03  2:04                           ` huang ying
@ 2013-04-03 17:29                             ` Bjorn Helgaas
  0 siblings, 0 replies; 61+ messages in thread
From: Bjorn Helgaas @ 2013-04-03 17:29 UTC (permalink / raw)
  To: huang ying
  Cc: Martin Mokrejs, Rafael J. Wysocki, ACPI Devel Maling List,
	Len Brown, Matthew Garrett, Sarah Sharp

On Tue, Apr 2, 2013 at 8:04 PM, huang ying <huang.ying.caritas@gmail.com> wrote:
> On Wed, Apr 3, 2013 at 12:30 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:

>> 2) acpiphp currently uses the presence of _ADR/_EJ0/_RMV to detect
>> hotplug slots.  I don't think this is sufficient (see
>> https://bugzilla.kernel.org/show_bug.cgi?id=54981 for details).
>> Therefore, I don't think pci_bus_has_hotplug_slots() in port_dbg.patch
>> can be accurate.  I think it returns "false" for some buses where it
>> should return "true," such as the ExpressCard slot on Chris Clayton's
>> system (see bug 54981).
>
> Yes. pci_bus_has_hotplug_slots() is not accurate.  But I still think
> it can be used in port runtime PM.  Because if there is no hotplug
> slots registered, the hotplug itself can not work properly, with or
> without port runtime PM enabled.

I'm not sure this is true.  For concreteness, let's talk about Chris
Clayton's machine.  He has a root port 00:1c.3, that leads to an
ExpressCard slot.  We request to use PCIe native hotplug, but BIOS
declines, so we have to use acpiphp.  The SCI that signals a hotplug
event is generated by the 00:1c.3 root port.  Therefore, 00:1c.3 must
remain powered up even when the slot is empty.

So the question really is, "Can we tell that there's a hotplug slot
below 00:1c.3?"  acpiphp currently does not detect this slot because
00:1c.3 does have an _ADR method, but there's no _EJ0 or _RMV method.

If you are suggesting that "acpiphp hotplug doesn't work correctly for
this slot even with 00:1c.3 powered up, so we might as well turn
00:1c.3 off," I completely disagree.  That would mean that even after
we fix acpiphp so it re-enumerates when we receive a Bus Check,
hotplug would still be broken because the powered-off port will not
generate the SCI that triggers the Bus Check.

>  And we should add necessary
> pm_runtime_get_sync/put_sync into pci_scan_bus to deal with "rescan".
> What do you think about that?
>
> pci_dev->is_hotplug_bridge is not accurate too.  It reports a internal
> port of my X220 as a hotplug-able port.  But it appears that it will
> report more instead of less.  It can report correctly for port in bug
> 54981.  Do you think that is a good choice for port runtime PM
> filtering?

pci_dev->is_hotplug_bridge is currently set in these cases:

  1) A quirk for a PLX bridge
  2) The PCIe Slot Capability "Hot-Plug Capable" bit is set
  3) The acpiphp driver thinks there's an ejectable slot below the device

This doesn't look reliable to me at all.  I don't think it's
even possible for acpiphp to deduce from AML whether there are
ejectable slots.  If we're using acpiphp, I feel queasy about looking
at the PCIe Capability to figure out whether slots are present.  I
don't think it's a good idea to mix the PCIe and ACPI worlds that way.
 And ACPI hotplug could be used for other hotplug schemes besides
PCIe, so we can't even count on the PCIe capability existing.

So I disagree that pci_dev->is_hotplug_bridge is set for every device
that may have a hotplug slot below it.

I think it's mainly the acpiphp case where it's hard to tell when
runtime PM is safe; it might be possible to do runtime PM on a port
where we are using pciehp.
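
To make case 2 above concrete: the "Hot-Plug Capable" bit is visible from
userspace in the SltCap line of `lspci -vv`.  The following is only a rough
sketch; the function name is made up for illustration, and it assumes the
usual pciutils layout where each device header starts in column 0 and SltCap
prints "HotPlug+" when the bit is set.  It covers that one case only, so it
inherits the same inaccuracy discussed above (it says nothing about slots
that only acpiphp knows about).

```shell
# list_hotplug_capable (hypothetical helper): read `lspci -vv` output on
# stdin and print the address of every device whose PCIe Slot Capabilities
# line reports the Hot-Plug Capable bit as set.
list_hotplug_capable() {
    awk '
        /^[0-9a-f]/ { dev = $1 }    # device header, e.g. "00:1c.3 PCI bridge: ..."
        /SltCap:/ && index($0, "HotPlug+") { print dev }
    '
}
```

Typical use would be `lspci -vv | list_hotplug_capable`, run as root so
that the capability lists are readable.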

Bjorn

* Re: [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports
  2013-03-30 22:38                   ` [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports Rafael J. Wysocki
  2013-04-01 17:34                     ` Bjorn Helgaas
@ 2013-04-03 22:34                     ` Bjorn Helgaas
  1 sibling, 0 replies; 61+ messages in thread
From: Bjorn Helgaas @ 2013-04-03 22:34 UTC (permalink / raw)
  To: Rafael J. Wysocki, Zheng Yan
  Cc: Martin Mokrejs, ACPI Devel Maling List, Len Brown,
	Matthew Garrett, Sarah Sharp, LKML, linux-pci

On Sat, Mar 30, 2013 at 4:38 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> The runtime PM of PCIe ports turns out to be quite fragile, as in
> some cases things work while in some other cases they don't and we
> don't seem to have a good way to determine whether or not they are
> going to work in advance.
>
> For this reason, avoid enabling runtime PM for PCIe ports by
> keeping their runtime PM reference counters always above 0 for the
> time being.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> This version also removes the no longer necessary (and empty anyway)
> port_runtime_pm_black_list[] table.

I applied this to for-linus for v3.9, and added a stable tag for v3.6+.  Thanks!

> ---
>  drivers/pci/pcie/portdrv_pci.c |   13 -------------
>  1 file changed, 13 deletions(-)
>
> Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
> +++ linux-pm/drivers/pci/pcie/portdrv_pci.c
> @@ -185,14 +185,6 @@ static const struct dev_pm_ops pcie_port
>  #endif /* !PM */
>
>  /*
> - * PCIe port runtime suspend is broken for some chipsets, so use a
> - * black list to disable runtime PM for these chipsets.
> - */
> -static const struct pci_device_id port_runtime_pm_black_list[] = {
> -       { /* end: all zeroes */ }
> -};
> -
> -/*
>   * pcie_portdrv_probe - Probe PCI-Express port devices
>   * @dev: PCI-Express port device being probed
>   *
> @@ -225,16 +217,11 @@ static int pcie_portdrv_probe(struct pci
>          * it by default.
>          */
>         dev->d3cold_allowed = false;
> -       if (!pci_match_id(port_runtime_pm_black_list, dev))
> -               pm_runtime_put_noidle(&dev->dev);
> -
>         return 0;
>  }
>
>  static void pcie_portdrv_remove(struct pci_dev *dev)
>  {
> -       if (!pci_match_id(port_runtime_pm_black_list, dev))
> -               pm_runtime_get_noresume(&dev->dev);
>         pcie_port_device_remove(dev);
>         pci_disable_device(dev);
>  }
>

* Re: [Update][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-03-28 21:07   ` [Update][PATCH] " Rafael J. Wysocki
  2013-03-29 15:05     ` Martin Mokrejs
@ 2013-04-03 22:38     ` Bjorn Helgaas
  1 sibling, 0 replies; 61+ messages in thread
From: Bjorn Helgaas @ 2013-04-03 22:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: ACPI Devel Maling List, LKML, Linux PM list, Len Brown,
	Matthew Garrett, Sarah Sharp, Accardi, Kristen C, Huang, Ying,
	linux-pci

On Thu, Mar 28, 2013 at 3:07 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: PCI / ACPI: Always resume devices on ACPI wakeup notifications
>
> It turns out that the _Lxx control methods provided by some BIOSes
> clear the PME Status bit of PCI devices they handle, which means that
> pci_acpi_wake_dev() cannot really use that bit to check whether or
> not the device has signalled wakeup.
>
> One symptom of the problem is, for example, that when an affected PCI
> USB controller is runtime-suspended, then plugging in a new USB device
> into one of the controller's ports will not wake up the controller,
> which should happen.
>
> For this reason, make pci_acpi_wake_dev() always attempt to resume
> the device it is called for regardless of the device's PME Status bit
> value (that bit still has to be cleared if set at this point,
> though).
>
> Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Acked-by: Matthew Garrett <mjg59@srcf.ucam.org>
> Cc: 3.7+ <stable@vger.kernel.org>
> ---
>
> The changelog in this version is slightly better than in the previous one, IMHO.

I applied this to for-linus for v3.9.  Thanks!

> ---
>  drivers/pci/pci-acpi.c |   15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -53,14 +53,15 @@ static void pci_acpi_wake_dev(acpi_handl
>                 return;
>         }
>
> -       if (!pci_dev->pm_cap || !pci_dev->pme_support
> -            || pci_check_pme_status(pci_dev)) {
> -               if (pci_dev->pme_poll)
> -                       pci_dev->pme_poll = false;
> +       /* Clear PME Status if set. */
> +       if (pci_dev->pme_support)
> +               pci_check_pme_status(pci_dev);
>
> -               pci_wakeup_event(pci_dev);
> -               pm_runtime_resume(&pci_dev->dev);
> -       }
> +       if (pci_dev->pme_poll)
> +               pci_dev->pme_poll = false;
> +
> +       pci_wakeup_event(pci_dev);
> +       pm_runtime_resume(&pci_dev->dev);
>
>         if (pci_dev->subordinate)
>                 pci_pme_wakeup_bus(pci_dev->subordinate);
>

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-03 12:16                               ` Martin Mokrejs
@ 2013-04-04 11:30                                 ` Huang Ying
  2013-04-04 19:19                                   ` Sarah Sharp
  2013-04-05 12:40                                   ` Martin Mokrejs
  0 siblings, 2 replies; 61+ messages in thread
From: Huang Ying @ 2013-04-04 11:30 UTC (permalink / raw)
  To: Martin Mokrejs, Sarah Sharp
  Cc: Bjorn Helgaas, huang ying, Rafael J. Wysocki, Len Brown,
	Matthew Garrett, ACPI Devel Maling List, linux-pci, Yinghai Lu

Hi, Martin,

On Wed, 2013-04-03 at 14:16 +0200, Martin Mokrejs wrote:
> Meanwhile, the raw data: http://195.113.57.32/~mmokrejs/tmp/20130402.tar.bz2
> (size 468641 bytes)

Thanks a lot!  Your information is very complete and clear :)

> They were collected by:
> 
> # cat ~/bin/collect_runtime_status.sh 
> #!/bin/sh
> grep . /sys/bus/pci/devices/*/power/runtime_status > runtime_status_"$1".txt
> grep . /sys/bus/pci/devices/*/power/control > control_"$1".txt
> cat /proc/interrupts > interrupts_"$1".txt
> cat /proc/iomem > iomem_"$1".txt
> lspci -vvv > lspci_vvv_"$1".txt
> dmesg > dmesg_"$1".txt
> #
> 
> Just do 'ls -latr' to see the ordering of the files as they were created.
> The longer the filename, the later in the test process. The names should be
> relatively self-explaining. Definitely, from the log files you should see
> what happened in real and therefore, can figure out what the (maybe weird)
> long filename really meant.
> 
> Sometimes I manually recorded lsusb of dmesg_final.txt, mostly after I did some
> extra tests but but not want to record every step by the above 6 files.
> 
> In one or two places I added some my own notes into COMMENTS file.
> 
> 
> 
> 
> I will try to guide your below where you can study which of the bugs. Mostly,
> for each bug you need just one subdirectory to look into, the other are just
> repeated the same bug under different kernel version or another patch.
> However, Sarah for the xHCI dead port issue will need to compare by diff
> two directories, one with the TI-based controller tests, the other with the
> NEC-based tests. Especially there, I would do something like:
> 
> cd *TI-based; for f in dmesg*; do cut -c 15- $f > /tmp/TI/$f; done
> cd ../*NEC-based; for f in dmesg*; do cut -c 15- $f > /tmp/NEC/$f; done
> 
> Then it should be easier to poke through file captured at the same test step,
> like:
> 
> diff -u -w /tmp/TI/dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead.txt \
> /tmp/NEC/dmesg_initial__mouse_attached__detached__reattached.txt
> 
> 
> 
> Other than that, just diff pairs of files with each other, like:
> 
> diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__mouse_attached.txt
> 
> 
> Sorry that I sometimes used only a single underscore instead of double underscores
> to separate the test steps from each other in the filename.
> 
> 
> Martin Mokrejs wrote:
> > [ +linux-pci and Yinghai as they suffered already those many emails on individual
> >  threads so one overviewing email hopefully won't harm] ;-)
> > 
> > Martin Mokrejs wrote:
> >>
> >>
> >> Bjorn Helgaas wrote:
> >>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
> >>> <mmokrejs@fold.natur.cuni.cz> wrote:
> >>>> Hi Ying,
> >>>>
> >>>> huang ying wrote:
> >>>
> >>>>> And please give me the full dmesg for boot and incremental dmesg for
> >>>>> operations.
> >>>>
> >>>>
> >>>> The incremental bits here, the full dmesg will send only directly to your email, due to its size.
> >>>
> >>> Is there a bugzilla for this issue?  Please attach the complete dmesg
> >>> there or somewhere similar so we can all benefit.
> >>
> >> I changed my mind. I am attaching the dmesg here but omitting linux-acpi
> >> list. After I hear a proposal from Rafel/Bjorn I will open separate bugs.
> >> I thought that the threads I started so far were enough but yes, dmesg
> >> files don't pass through list filters so I should move that to bugzilla.
> >>
> >> so far my view of the the bugs was:
> >>
> >> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
> >>   (eSATA-based card)
> > 
> > Fixed by Ying Huang port_dbg.patch applied over 3.8.5 (fixes acpiphp hotplug
> > of eSATA and Firewire cards, NOT the hotplug of a NEC-based USB3 card -> hence
> > the bug 4) below). Now I can continue using laptop-mode-tools.
> 
> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_eSATA_testing
> 20130402/3.8.3-vanilla__with_laptop-mode-tools (with some comments in
>                                                 COMMENTS file)

Thanks for your testing!

> >> 2) xHCI dead due to to its suspend - 3.8 series and above
> > 
> > Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
> > XHCI card *in an express card slot* does not suffer this suspend issue.
> > Although it is being put into suspend if a device is unplugged.
> 
> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based
> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_NEC-based
> 
> Same thing but yet without the port_dbg.patch:
> 20130402/3.9-rc5__with_2368081__with-latop-mode-tools_xhci_testing/

It appears that TI xHCI dead port issue will present even if the PCIe
port will never go suspended.  So I think this bug is not related to
PCIe port runtime PM but related to USB xHCI.

Do you agree Sarah?

[snip]

Best Regards,
Huang Ying



* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-04 11:30                                 ` Huang Ying
@ 2013-04-04 19:19                                   ` Sarah Sharp
  2013-04-05 12:30                                     ` Martin Mokrejs
  2013-04-05 12:40                                   ` Martin Mokrejs
  1 sibling, 1 reply; 61+ messages in thread
From: Sarah Sharp @ 2013-04-04 19:19 UTC (permalink / raw)
  To: Huang Ying
  Cc: Martin Mokrejs, Bjorn Helgaas, huang ying, Rafael J. Wysocki,
	Len Brown, Matthew Garrett, ACPI Devel Maling List, linux-pci,
	Yinghai Lu

On Thu, Apr 04, 2013 at 07:30:19PM +0800, Huang Ying wrote:
> Hi, Martin,
> 
> On Wed, 2013-04-03 at 14:16 +0200, Martin Mokrejs wrote:
> > >> 2) xHCI dead due to to its suspend - 3.8 series and above
> > > 
> > > Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
> > > XHCI card *in an express card slot* does not suffer this suspend issue.
> > > Although it is being put into suspend if a device is unplugged.
> > 
> > 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based
> > 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_NEC-based
> > 
> > Same thing but yet without the port_dbg.patch:
> > 20130402/3.9-rc5__with_2368081__with-latop-mode-tools_xhci_testing/
> 
> It appears that TI xHCI dead port issue will present even if the PCIe
> port will never go suspended.  So I think this bug is not related to
> PCIe port runtime PM but related to USB xHCI.
> 
> Do you agree Sarah?

No.  The symptoms he described (in another email) were that the port
only becomes "dead" after a USB 2.0 device is removed, and the host was
suspended.  The issue was that the TI host is simply not reporting the
USB device connect, even if it is manually resumed.  The port status
registers do not show a device connect at all.

Martin, can you confirm this by trying this, and sending me dmesg of the
test with CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on:

1. Remove the laptop mode tools
2. Reboot with no USB devices attached to the TI host
3. Make sure the xHCI PCI device's power/control file is set to 'on'
   You will find that file in /sys/bus/pci/devices/.  Use lspci to
   figure out which directory is the xHCI PCI device.
4. Plug in a USB 2.0 device and make sure it works (e.g. wiggle a
   mouse)
5. Unplug the device, replug it, and check to see if it works.

If you have problems, stop here.  Otherwise try:

6. Unplug all USB devices
7. echo 'auto' to the xHCI PCI device's power/control file in
8. echo 'auto' to both xHCI roothubs in /sys/bus/usb/devices/
   (i.e. all usbN directories)
9. Wait a few seconds or so until the xHCI PCI host suspends, meaning the
   power/runtime_status file reads as 'suspended'
10. Plug in the same USB 2.0 device, and check if it works.
11. Unplug the device, and wait until the PCI host is suspended.
12. Replug the device, and check to see if it works.
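
The sysfs parts of steps 3, 7 and 9 could be scripted roughly as follows.
This is only a sketch: the helper names and the SYSFS_PCI variable are made
up for illustration, and the PCI address has to be filled in from lspci.

```shell
# Base directory of the PCI device nodes; overridable so the helpers can be
# tried against a scratch directory instead of the real sysfs.
SYSFS_PCI=${SYSFS_PCI:-/sys/bus/pci/devices}

# set_control <pci-addr> <on|auto>: write the runtime PM policy (steps 3, 7).
set_control() {
    printf '%s\n' "$2" > "$SYSFS_PCI/$1/power/control"
}

# wait_for_status <pci-addr> <status> [tries]: poll power/runtime_status once
# per second until it reads <status> (step 9); fail after [tries] attempts.
wait_for_status() {
    tries=${3:-10}
    while [ "$tries" -gt 0 ]; do
        [ "$(cat "$SYSFS_PCI/$1/power/runtime_status")" = "$2" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

For example, with ADDR set to the xHCI device's full address as it appears
under /sys/bus/pci/devices/:
set_control "$ADDR" auto && wait_for_status "$ADDR" suspended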

Sarah Sharp

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-04 19:19                                   ` Sarah Sharp
@ 2013-04-05 12:30                                     ` Martin Mokrejs
  0 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-05 12:30 UTC (permalink / raw)
  To: Sarah Sharp, Huang Ying
  Cc: Bjorn Helgaas, huang ying, Rafael J. Wysocki, Len Brown,
	Matthew Garrett, ACPI Devel Maling List, linux-pci, Yinghai Lu



Sarah Sharp wrote:
> On Thu, Apr 04, 2013 at 07:30:19PM +0800, Huang Ying wrote:
>> Hi, Martin,
>>
>> On Wed, 2013-04-03 at 14:16 +0200, Martin Mokrejs wrote:
>>>>> 2) xHCI dead due to to its suspend - 3.8 series and above
>>>>
>>>> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
>>>> XHCI card *in an express card slot* does not suffer this suspend issue.
>>>> Although it is being put into suspend if a device is unplugged.
>>>
>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based
>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_NEC-based
>>>
>>> Same thing but yet without the port_dbg.patch:
>>> 20130402/3.9-rc5__with_2368081__with-latop-mode-tools_xhci_testing/
>>
>> It appears that TI xHCI dead port issue will present even if the PCIe
>> port will never go suspended.  So I think this bug is not related to
>> PCIe port runtime PM but related to USB xHCI.
>>
>> Do you agree Sarah?
> 
> No.  The symptoms he described (in another email) were that the port
> only becomes "dead" after a USB 2.0 device is removed, and the host was
> suspended.  The issue was that the TI host is simply not reporting the
> USB device connect, even if it is manually resumed.  The port status
> registers do not show a device connect at all.
> 
> Martin, can you confirm this by trying this, and sending me dmesg of the
> test with CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING turned on:
> 
> 1. Remove the laptop mode tools
> 2. Reboot with no USB devices attached to the TI host
> 3. Make sure the xHCI PCI device's power/control file is set to 'on'
>    You will find that file in /sys/bus/pci/devices/.  Use lspci to
>    figure out which directory is the xHCI PCI device.
> 4. Plug in a USB 2.0 device and make sure it works (e.g. wiggle a
>    mouse)
> 5. Unplug the device, replug it, and check to see if it works.

Works. Actually, I plugged the mouse in and out several times
to show that the *unplug* does not kill the socket.


> If you have problems, stop here.  Otherwise try:
> 
> 6. Unplug all USB devices
> 7. echo 'auto' to the xHCI PCI device's power/control file in

The 0b:00.0 is already suspended after the echo 'auto', but I tried to continue
with step 8. Some default kicks in?


> 8. echo 'auto' to both xHCI roothubs in /sys/bus/usb/devices/
>    (i.e. all usbN directories)

No need, they are already suspended:

# cat /sys/devices/pci0000\:00/0000:00:1c.4/0000:0b:00.0/usb3/power/control
auto
# cat /sys/devices/pci0000\:00/0000:00:1c.4/0000:0b:00.0/usb3/power/runtime_status
suspended
# cat /sys/devices/pci0000\:00/0000:00:1c.4/0000:0b:00.0/usb4/power/runtime_status
suspended
#

> 9. Wait a few seconds or so until the xHCI PCI host suspends, meaning the
>    power/runtime_status file reads as 'suspended'
> 10. Plug in the same USB 2.0 device, and check if it works.

It works.

> 11. Unplug the device, and wait until the PCI host is suspended.

Unplug causes death per dmesg.

[  932.419828] xhci_hcd 0000:0b:00.0: Cached old ring, 1 ring cached
[  932.420240] xhci_hcd 0000:0b:00.0: // Ding dong!
[  932.420342] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[  932.420344] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[  932.454637] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[  932.454638] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[  932.494828] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[  932.494831] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[  932.534856] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[  932.534859] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[  932.574871] xhci_hcd 0000:0b:00.0: get port status, actual port 1 status  = 0x2a0
[  932.574874] xhci_hcd 0000:0b:00.0: Get port status returned 0x100
[  932.574888] hub 3-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x100
[  932.574905] hub 3-0:1.0: hub_suspend
[  932.574912] usb usb3: bus auto-suspend, wakeup 1
[  932.574928] xhci_hcd 0000:0b:00.0: xhci_hub_status_data: stopping port polling.
[  932.574947] xhci_hcd 0000:0b:00.0: xhci_suspend: stopping port polling.
[  932.574974] xhci_hcd 0000:0b:00.0: // Setting command ring address to 0xd6007001
[  932.575026] xhci_hcd 0000:0b:00.0: hcd_pci_runtime_suspend: 0
[  932.575119] xhci_hcd 0000:0b:00.0: PME# enabled
[  932.594863] xhci_hcd 0000:0b:00.0: pfrs: target: 3, 0


> 12. Replug the device, and check to see if it works.

Is dead.

Full logs at:
http://195.113.57.32/~mmokrejs/tmp/20130405.tar.bz2 (unpack, 'ls -latr', diff as you like).
Also .config is in there.

Martin

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-04 11:30                                 ` Huang Ying
  2013-04-04 19:19                                   ` Sarah Sharp
@ 2013-04-05 12:40                                   ` Martin Mokrejs
  2013-04-19 23:49                                     ` Martin Mokrejs
  1 sibling, 1 reply; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-05 12:40 UTC (permalink / raw)
  To: Huang Ying, Sarah Sharp
  Cc: Bjorn Helgaas, huang ying, Rafael J. Wysocki, Len Brown,
	Matthew Garrett, ACPI Devel Maling List, linux-pci, Yinghai Lu



Huang Ying wrote:
> Hi, Martin,
> 
> On Wed, 2013-04-03 at 14:16 +0200, Martin Mokrejs wrote:
>> Meanwhile, the raw data: http://195.113.57.32/~mmokrejs/tmp/20130402.tar.bz2
>> (size 468641 bytes)
> 
> Thanks a lot!  Your information is very complete and clear :)
> 
>> They were collected by:
>>
>> # cat ~/bin/collect_runtime_status.sh 
>> #!/bin/sh
>> grep . /sys/bus/pci/devices/*/power/runtime_status > runtime_status_"$1".txt
>> grep . /sys/bus/pci/devices/*/power/control > control_"$1".txt
>> cat /proc/interrupts > interrupts_"$1".txt
>> cat /proc/iomem > iomem_"$1".txt
>> lspci -vvv > lspci_vvv_"$1".txt
>> dmesg > dmesg_"$1".txt
>> #
>>
>> Just do 'ls -latr' to see the ordering of the files as they were created.
>> The longer the filename, the later in the test process. The names should be
>> relatively self-explaining. Definitely, from the log files you should see
>> what happened in real and therefore, can figure out what the (maybe weird)
>> long filename really meant.
>>
>> Sometimes I manually recorded lsusb of dmesg_final.txt, mostly after I did some
>> extra tests but but not want to record every step by the above 6 files.
>>
>> In one or two places I added some my own notes into COMMENTS file.
>>
>>
>>
>>
>> I will try to guide your below where you can study which of the bugs. Mostly,
>> for each bug you need just one subdirectory to look into, the other are just
>> repeated the same bug under different kernel version or another patch.
>> However, Sarah for the xHCI dead port issue will need to compare by diff
>> two directories, one with the TI-based controller tests, the other with the
>> NEC-based tests. Especially there, I would do something like:
>>
>> cd *TI-based; for f in dmesg*; do cut -c 15- $f > /tmp/TI/$f; done
>> cd ../*NEC-based; for f in dmesg*; do cut -c 15- $f > /tmp/NEC/$f; done
>>
>> Then it should be easier to poke through file captured at the same test step,
>> like:
>>
>> diff -u -w /tmp/TI/dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead.txt \
>> /tmp/NEC/dmesg_initial__mouse_attached__detached__reattached.txt
>>
>>
>>
>> Other than that, just diff pairs of files with each other, like:
>>
>> diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__mouse_attached.txt
>>
>>
>> Sorry that I sometimes used only a single underscore instead of double underscores
>> to separate the test steps from each other in the filename.
>>
>>
>> Martin Mokrejs wrote:
>>> [ +linux-pci and Yinghai as they suffered already those many emails on individual
>>>  threads so one overviewing email hopefully won't harm] ;-)
>>>
>>> Martin Mokrejs wrote:
>>>>
>>>>
>>>> Bjorn Helgaas wrote:
>>>>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>>>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>>>> Hi Ying,
>>>>>>
>>>>>> huang ying wrote:
>>>>>
>>>>>>> And please give me the full dmesg for boot and incremental dmesg for
>>>>>>> operations.
>>>>>>
>>>>>>
>>>>>> The incremental bits here, the full dmesg will send only directly to your email, due to its size.
>>>>>
>>>>> Is there a bugzilla for this issue?  Please attach the complete dmesg
>>>>> there or somewhere similar so we can all benefit.
>>>>
>>>> I changed my mind. I am attaching the dmesg here but omitting linux-acpi
>>>> list. After I hear a proposal from Rafel/Bjorn I will open separate bugs.
>>>> I thought that the threads I started so far were enough but yes, dmesg
>>>> files don't pass through list filters so I should move that to bugzilla.
>>>>
>>>> so far my view of the the bugs was:
>>>>
>>>> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
>>>>   (eSATA-based card)
>>>
>>> Fixed by Ying Huang port_dbg.patch applied over 3.8.5 (fixes acpiphp hotplug
>>> of eSATA and Firewire cards, NOT the hotplug of a NEC-based USB3 card -> hence
>>> the bug 4) below). Now I can continue using laptop-mode-tools.
>>
>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_eSATA_testing
>> 20130402/3.8.3-vanilla__with_laptop-mode-tools (with some comments in
>>                                                 COMMENTS file)
> 
> Thanks for your testing!
> 
>>>> 2) xHCI dead due to to its suspend - 3.8 series and above
>>>
>>> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
>>> XHCI card *in an express card slot* does not suffer this suspend issue.
>>> Although it is being put into suspend if a device is unplugged.
>>
>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based
>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_NEC-based
>>
>> Same thing but yet without the port_dbg.patch:
>> 20130402/3.9-rc5__with_2368081__with-latop-mode-tools_xhci_testing/
> 
> It appears that TI xHCI dead port issue will present even if the PCIe
> port will never go suspended.  So I think this bug is not related to
> PCIe port runtime PM but related to USB xHCI.
> 
> Do you agree Sarah?

Although the 20130405.tar.bz2 dataset confirms what Sarah repeated from our
past findings in the email that should already be in your inbox, one thing is
puzzling:
When I have powersaving enabled at boot with NO USB devices attached to the TI
controller, 0b:00.0 is effectively already in a suspended state by the time
multiuser mode is reached. But, somehow, the very first mouse plug-in works.
Only the unplug causes a more 'aggressive' suspend.
As it seems the upstream 1c.4 is not messing things up here (in the test Sarah
wanted me to do we have all control files 'on' except the end 0b:00.0),
something *else* must be causing the dead port *in conjunction* with the
'suspended' runtime state.
Please double check what I wrote initially about the 20130402.tar.bz2 dataset.
Notably, I would compare lspci outputs from a cold boot state with no devices
attached and suspended 0b:00.0 (the 20130402.tar.bz2 dataset) with the dead port
status in lspci (find any in 20130402.tar.bz2 or now in 20130405.tar.bz2).

Martin

> 
> [snip]
> 
> Best Regards,
> Huang Ying
> 
> 
> 

* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-05 12:40                                   ` Martin Mokrejs
@ 2013-04-19 23:49                                     ` Martin Mokrejs
  2013-04-30 20:47                                       ` Martin Mokrejs
  0 siblings, 1 reply; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-19 23:49 UTC (permalink / raw)
  To: Huang Ying, Sarah Sharp
  Cc: Bjorn Helgaas, huang ying, Rafael J. Wysocki, Len Brown,
	Matthew Garrett, ACPI Devel Maling List, linux-pci, Yinghai Lu

Hi Sarah,
  does anyone have any comments on this thread? I just retried with the 3.8.8
kernel and it is still the same issue. I can set the upstream 1c.4 port to
'auto' and detach the mouse, and 1c.4 does not suspend (due to a recent patch,
I think around 3.8.5).
If I also set its downstream 0b:00 to 'auto' and plug in the mouse, the mouse
works; after I unplug the mouse, 0b:00 goes 'suspended' and the XHCI socket dies.

Here is a comparison of the 'active' state and the 'suspended'-to-death state
(note pcie_aspm=off on my kernel command line):
--- lspci_vvv_initial.txt       2013-04-20 00:16:11.000000000 +0200
+++ lspci_vvv_initial__mouse_attached__detached__attached__1c.4_to_auto__detached__0b:00_to_auto.txt    2013-04-20 00:18:38.000000000 +0200
@@ -484,15 +484,14 @@
 
 0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
        Subsystem: Dell Device 04b3
-       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
+       Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
-       Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
        Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
-               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
+               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
        Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00


If I set 0b:00/control back to 'on', the XHCI socket is rescued.



So, should the TI host be blacklisted so that it is never put into the suspend
state? I already wrote that I don't think it is necessary, but it looks like
nobody looked into the lspci files. So, here is my interpretation:


See another test scenario:

1. When I boot without any devices attached to the TI host (no laptop-mode-tools), the TI host at 0b:00 is active.

2. If I enable powersaving by setting the control files of 1c.4 (just to be sure) and 0b:00 to 'auto',
0b:00 becomes suspended after a while. But it is not dead: if I connect a mouse to the XHCI socket,
it works. But look at what such a 'softly suspended' state looks like:

# diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt
--- lspci_vvv_initial.txt       2013-04-20 01:06:51.000000000 +0200
+++ lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt       2013-04-20 01:08:46.000000000 +0200
@@ -484,15 +484,14 @@
 
 0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
        Subsystem: Dell Device 04b3
-       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
+       Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
-       Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
        Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
-               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
+               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
        Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
#
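
Since the interesting part of each of these diffs is the Power Management
"Status:" line of 0b:00.0, it can also be pulled out of a saved dump directly.
A minimal sketch, assuming the file is plain `lspci -vvv` output where each
device header starts in the first column; the function name pm_status is
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the Power Management "Status: Dx ..." line of one
# device from a saved `lspci -vvv` dump.
pm_status() {
    # $1 = saved lspci -vvv output, $2 = slot as printed by lspci, e.g. 0b:00.0
    awk -v slot="$2" '
        # Device header lines are the only ones starting in column 0.
        /^[0-9a-f]/ { in_dev = (index($0, slot " ") == 1) }
        # The PCI "Status: Cap+ ..." line does not match, only "Status: D0".."D3".
        in_dev && /Status: D[0-3]/ { sub(/^[ \t]+/, ""); print; exit }
    ' "$1"
}
```

Example: pm_status lspci_vvv_initial.txt 0b:00.0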


3. Now, look what happens if I plug in a mouse (it works, as I said) and unplug it, which triggers a deadly,
although reversible, suspend:

# diff -u -w lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt
--- lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt       2013-04-20 01:08:46.000000000 +0200
+++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt   2013-04-20 01:10:06.000000000 +0200
@@ -271,7 +271,7 @@
                        Changed: MRL- PresDet- LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
-               RootSta: PME ReqID 0000, PMEStatus- PMEPending-
+               RootSta: PME ReqID 0b00, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-

4. Interestingly, if I connect a mouse to the socket to show it is "dead" there is a tiny change in lspci:

--- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt   2013-04-20 01:10:06.000000000 +0200
+++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt      2013-04-20 01:10:28.000000000 +0200
@@ -491,7 +491,7 @@
        Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
-               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
+               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
        Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00


5. I said the port 'suspended to death' can be rescued by echo 'on' > .../*0b:00*/control (the mouse was
plugged in during the echo command, so we see not only the PME changes but also the D3 to D0 change, because
the mouse is attached):

# diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues.txt
--- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt      2013-04-20 01:10:28.000000000 +0200
+++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues.txt 2013-04-20 01:12:25.000000000 +0200
@@ -484,14 +484,15 @@
 
 0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
        Subsystem: Dell Device 04b3
-       Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
+       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
+       Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
        Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
-               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
+               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00



6. When I unplug the mouse, the port of course does not die, because the control file is set to 'on'.
I already demonstrated that, but once again, this is what setting 0b:00 to 'auto' does:

# diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto.txt 
--- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached.txt       2013-04-20 01:13:36.000000000 +0200
+++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto.txt        2013-04-20 01:14:41.000000000 +0200
@@ -484,15 +484,14 @@
 
 0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
        Subsystem: Dell Device 04b3
-       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
+       Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
-       Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
        Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
-               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
+               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
        Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
@@ -521,7 +520,7 @@
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
-               CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
+               CESta:  RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00


7. Now, a question to the reader: If I attach the mouse, will it work or not?


# diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto__attached_dead.txt 
--- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto.txt        2013-04-20 01:14:41.000000000 +0200
+++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto__attached_dead.txt 2013-04-20 01:17:59.000000000 +0200
@@ -491,7 +491,7 @@
        Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
-               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
+               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
        Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
#


No, it did not work. The situation in step 7 is the same as in step 4. The diff below is likely benign:

# diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto__attached_dead.txt 
--- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt      2013-04-20 01:10:28.000000000 +0200
+++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto__attached_dead.txt 2013-04-20 01:17:59.000000000 +0200
@@ -520,7 +520,7 @@
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
-               CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
+               CESta:  RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
#
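
Steps 4 and 7 end in the same register state: the device is still in D3 with
PME-Enable+, but PME has flipped to PME+. A minimal sketch of a check for that
combination on a single "Status:" line follows. The function name is
hypothetical, and the check is only a heuristic based on the lspci output shown
above; it does not prove that the port is dead.

```shell
#!/bin/sh
# Hypothetical check: succeed if a Power Management "Status:" line shows the
# device in D3 with the PME Status flag set (the last field reads "PME+").
pme_pending_in_d3() {
    case "$1" in
        *"Status: D3 "*" PME+") return 0 ;;
        *) return 1 ;;
    esac
}
```

The argument is expected to be exactly one line without trailing whitespace,
for example the line printed by a grep over a saved lspci dump.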



Collected data are at http://195.113.57.32/~mmokrejs/tmp/20130420.tar.bz2 (90kB)

Thanks,
Martin


Martin Mokrejs wrote:
> 
> 
> Huang Ying wrote:
>> Hi, Martin,
>>
>> On Wed, 2013-04-03 at 14:16 +0200, Martin Mokrejs wrote:
>>> Meanwhile, the raw data: http://195.113.57.32/~mmokrejs/tmp/20130402.tar.bz2
>>> (size 468641 bytes)
>>
>> Thanks a lot!  Your information is very complete and clear :)
>>
>>> They were collected by:
>>>
>>> # cat ~/bin/collect_runtime_status.sh 
>>> #!/bin/sh
>>> grep . /sys/bus/pci/devices/*/power/runtime_status > runtime_status_"$1".txt
>>> grep . /sys/bus/pci/devices/*/power/control > control_"$1".txt
>>> cat /proc/interrupts > interrupts_"$1".txt
>>> cat /proc/iomem > iomem_"$1".txt
>>> lspci -vvv > lspci_vvv_"$1".txt
>>> dmesg > dmesg_"$1".txt
>>> #
>>>
>>> Just do 'ls -latr' to see the ordering of the files as they were created.
>>> The longer the filename, the later in the test process. The names should be
>>> relatively self-explaining. Definitely, from the log files you should see
>>> what happened in real and therefore, can figure out what the (maybe weird)
>>> long filename really meant.
>>>
>>> Sometimes I manually recorded lsusb of dmesg_final.txt, mostly after I did some
>>> extra tests but but not want to record every step by the above 6 files.
>>>
>>> In one or two places I added some my own notes into COMMENTS file.
>>>
>>>
>>>
>>>
>>> I will try to guide your below where you can study which of the bugs. Mostly,
>>> for each bug you need just one subdirectory to look into, the other are just
>>> repeated the same bug under different kernel version or another patch.
>>> However, Sarah for the xHCI dead port issue will need to compare by diff
>>> two directories, one with the TI-based controller tests, the other with the
>>> NEC-based tests. Especially there, I would do something like:
>>>
>>> cd *TI-based; for f in dmesg*; do cut -c 15- $f > /tmp/TI/$f; done
>>> cd ../*NEC-based; for f in dmesg*; do cut -c 15- $f > /tmp/NEC/$f; done
>>>
>>> Then it should be easier to poke through file captured at the same test step,
>>> like:
>>>
>>> diff -u -w /tmp/TI/dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead.txt \
>>> /tmp/NEC/dmesg_initial__mouse_attached__detached__reattached.txt
>>>
>>>
>>>
>>> Other than that, just diff pairs of files with each other, like:
>>>
>>> diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__mouse_attached.txt
>>>
>>>
>>> Sorry that I sometimes used only a single underscore instead of double underscores
>>> to separate the test steps from each other in the filename.
>>>
>>>
>>> Martin Mokrejs wrote:
>>>> [ +linux-pci and Yinghai as they suffered already those many emails on individual
>>>>  threads so one overviewing email hopefully won't harm] ;-)
>>>>
>>>> Martin Mokrejs wrote:
>>>>>
>>>>>
>>>>> Bjorn Helgaas wrote:
>>>>>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>>>>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>>>>> Hi Ying,
>>>>>>>
>>>>>>> huang ying wrote:
>>>>>>
>>>>>>>> And please give me the full dmesg for boot and incremental dmesg for
>>>>>>>> operations.
>>>>>>>
>>>>>>>
>>>>>>> The incremental bits here, the full dmesg will send only directly to your email, due to its size.
>>>>>>
>>>>>> Is there a bugzilla for this issue?  Please attach the complete dmesg
>>>>>> there or somewhere similar so we can all benefit.
>>>>>
>>>>> I changed my mind. I am attaching the dmesg here but omitting linux-acpi
>>>>> list. After I hear a proposal from Rafel/Bjorn I will open separate bugs.
>>>>> I thought that the threads I started so far were enough but yes, dmesg
>>>>> files don't pass through list filters so I should move that to bugzilla.
>>>>>
>>>>> so far my view of the the bugs was:
>>>>>
>>>>> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
>>>>>   (eSATA-based card)
>>>>
>>>> Fixed by Ying Huang port_dbg.patch applied over 3.8.5 (fixes acpiphp hotplug
>>>> of eSATA and Firewire cards, NOT the hotplug of a NEC-based USB3 card -> hence
>>>> the bug 4) below). Now I can continue using laptop-mode-tools.
>>>
>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_eSATA_testing
>>> 20130402/3.8.3-vanilla__with_laptop-mode-tools (with some comments in
>>>                                                 COMMENTS file)
>>
>> Thanks for your testing!
>>
>>>>> 2) xHCI dead due to to its suspend - 3.8 series and above
>>>>
>>>> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
>>>> XHCI card *in an express card slot* does not suffer this suspend issue.
>>>> Although it is being put into suspend if a device is unplugged.
>>>
>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based
>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_NEC-based
>>>
>>> Same thing but yet without the port_dbg.patch:
>>> 20130402/3.9-rc5__with_2368081__with-latop-mode-tools_xhci_testing/
>>
>> It appears that TI xHCI dead port issue will present even if the PCIe
>> port will never go suspended.  So I think this bug is not related to
>> PCIe port runtime PM but related to USB xHCI.
>>
>> Do you agree Sarah?
> 
> Although I confirmed with 20130405.tar.bz2 dataset what Sarah repeated from our
> past findings in the email which should be just in your your inbox, one thing is
> puzzling:
> When I have powersaving enabled upon bootup with NO USB devices attached to the TI
> controller, effectively while reaching multiuser mode the 0b:00.0 is in a suspend
> state. But, somehow, the very first mouse plugin works. Only the reject causes
> more 'aggressive' suspend.
> As it seems no upstream 1c.4 is messing up here (in the test Sarah wanted me to do
> we have all control files 'on' except the end 0b:00.0) then really still something
> *else* is causing the dead port *in conjunction* with 'suspended' runtime state.
> Please double check what I wrote initially about the 20130402.tar.bz2 dataset.
> Notably, I would compare lspci outputs from a cold boot state with no devices
> attached and suspended 0b:00.0 (the 20130402.tar.bz2 dataset) with the dead port
> status in lspci (find any in 20130402.tar.bz2 or now in 20130405.tar.bz2).
> 
> Martin
> 
>>
>> [snip]
>>
>> Best Regards,
>> Huang Ying


* Re: [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications
  2013-04-19 23:49                                     ` Martin Mokrejs
@ 2013-04-30 20:47                                       ` Martin Mokrejs
  0 siblings, 0 replies; 61+ messages in thread
From: Martin Mokrejs @ 2013-04-30 20:47 UTC (permalink / raw)
  To: Huang Ying, Sarah Sharp
  Cc: Bjorn Helgaas, huang ying, Rafael J. Wysocki, Len Brown,
	Matthew Garrett, ACPI Devel Maling List, linux-pci, Yinghai Lu

Hi,
  would somebody please comment on the behavior of the seemingly dead, suspended xHCI socket?
It is not completely dead: as you can see in step 7, the PME Status bit changes as a
result of the mouse being plugged into the socket. True, the mouse appears dead because
it gets no power, but I believe that is because xhci_hcd is fooled. Although I did the testing
with pcie_aspm=off, I also tried pcie_aspm=native just now, with the same results.
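
To record which pcie_aspm setting a given test run used, the value can be read
back from the kernel command line. A minimal sketch; the function name
cmdline_value is hypothetical, and it only handles plain key=value parameters
separated by spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key=value kernel parameter.
cmdline_value() {
    # $1 = parameter name, $2 = file to read (defaults to /proc/cmdline)
    # If the parameter is given more than once, the last occurrence is printed.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | sed -n "s/^$1=//p" | tail -n 1
}
```

Example: cmdline_value pcie_aspm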

  Either way, the 'suspend to death' is reversible once I force a wakeup of the
0b:00 device (the TI host) with echo on > ...0b:00.0/power/control.

Thanks,
Martin

Martin Mokrejs wrote:
> Hi Sarah,
>   does anyone has any comments to this thread? I just retried with 3.8.8
> kernel and it is still same issue. I can put to 'auto' upstream 1c.4 port,
> detach mouse and the 1c.4 does not suspend (due to a recent patch I think
> around 3.8.5).
> If I set also its downstream 0b:00 to 'auto', plugin mouse ... mouse works,
> after I unplug the mouse the 0b:00 goes 'suspended' and XHCI socket dies.
> 
> Here is comparison of the 'active' state and of the 'suspended' to death
> (note pcie_aspm=off on my kernel command line):
> --- lspci_vvv_initial.txt       2013-04-20 00:16:11.000000000 +0200
> +++ lspci_vvv_initial__mouse_attached__detached__attached__1c.4_to_auto__detached__0b:00_to_auto.txt    2013-04-20 00:18:38.000000000 +0200
> @@ -484,15 +484,14 @@
>  
>  0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
>         Subsystem: Dell Device 04b3
> -       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> +       Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> -       Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
>         Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> -               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> +               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
>         Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [70] Express (v2) Endpoint, MSI 00
> 
> 
> If I put back 0b:00/control to 'on' I rescue the XHCI socket.
> 
> 
> 
> So, should the TI host be blacklisted so that it is never put into suspend
> state? I wrote already that I don't think it is necessary but looks nobody
> looked into the lspci files. So, here is my interpretation:
> 
> 
> See another test scenario:
> 
> 1. When I bootup without any devices attached to the TI host (no laptop-mode-tools), the TI host at 0b:00 is active.
> 
> 2. If I enable powersaving via setting control file to 'auto' of 1c.4 (just to be sure) and 0b:00,
> the 0b:00 goes after a while suspended. But it is not dead, if I connect a mouse to the XHCI socket
> it would work. BUt look how such 'softly suspended' state looks like:
> 
> # diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt
> --- lspci_vvv_initial.txt       2013-04-20 01:06:51.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt       2013-04-20 01:08:46.000000000 +0200
> @@ -484,15 +484,14 @@
>  
>  0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
>         Subsystem: Dell Device 04b3
> -       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> +       Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> -       Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
>         Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> -               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> +               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
>         Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [70] Express (v2) Endpoint, MSI 00
> #
> 
> 
> 3. Now, look what happens if I plugin a mouse (works, as I said, and uplug it, which triggers a deadly suspend,
> although reversible):
> 
> # diff -u -w lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt       2013-04-20 01:08:46.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt   2013-04-20 01:10:06.000000000 +0200
> @@ -271,7 +271,7 @@
>                         Changed: MRL- PresDet- LinkState+
>                 RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
>                 RootCap: CRSVisible-
> -               RootSta: PME ReqID 0000, PMEStatus- PMEPending-
> +               RootSta: PME ReqID 0b00, PMEStatus- PMEPending-
>                 DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
> 
> 4. Interestingly, if I connect a mouse to the socket to show it is "dead" there is a tiny change in lspci:
> 
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt   2013-04-20 01:10:06.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt      2013-04-20 01:10:28.000000000 +0200
> @@ -491,7 +491,7 @@
>         Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> -               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
> +               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
>         Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [70] Express (v2) Endpoint, MSI 00
> 
> 
> 5. I said the port 'suspended to death' can be rescued by echo 'on' > .../*0b:00*/control (the mouse was
> plugged in during the echo command so we see not only PME changes but also D3 to D0 change because the
> mouse is attached):
> 
> # diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__rea
> ttached_but_dead__0b\:00_to_on_rescues.txt 
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt      2013-04-20 01:10:28.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues.txt 2013-04-20 01:12:25.000000000 +0200
> @@ -484,14 +484,15 @@
>  
>  0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
>         Subsystem: Dell Device 04b3
> -       Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> +       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> +       Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
>         Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> -               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
> +               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [70] Express (v2) Endpoint, MSI 00
> 
> 
> 
> 6. When I unplug the mouse of course the port does not die because the control file is set to 'on'.
> I already demonstrated that but once again, setting 0b:00 to 'auto':
> 
> # diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto.txt 
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached.txt       2013-04-20 01:13:36.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto.txt        2013-04-20 01:14:41.000000000 +0200
> @@ -484,15 +484,14 @@
>  
>  0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
>         Subsystem: Dell Device 04b3
> -       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> +       Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> -       Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
>         Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> -               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> +               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
>         Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [70] Express (v2) Endpoint, MSI 00
> @@ -521,7 +520,7 @@
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> -               CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> +               CESta:  RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
> 
> 
> 7. Now, a question to the reader: If I attach the mouse, will it work or not?
> 
> 
> # diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto__attached_dead.txt 
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto.txt        2013-04-20 01:14:41.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto__attached_dead.txt 2013-04-20 01:17:59.000000000 +0200
> @@ -491,7 +491,7 @@
>         Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> -               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
> +               Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
>         Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [70] Express (v2) Endpoint, MSI 00
> #
> 
> 
> No, it did not work. Situation in step 7 is same like in step 4. The diff below is likely benign:
> 
> # diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto__attached_dead.txt 
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt      2013-04-20 01:10:28.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto__attached_dead.txt 2013-04-20 01:17:59.000000000 +0200
> @@ -520,7 +520,7 @@
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> -               CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> +               CESta:  RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
> #
> 
> 
> 
> Collected data are at http://195.113.57.32/~mmokrejs/tmp/20130420.tar.bz2 (90kB)
> 
> Thanks,
> Martin
> 
> 
> Martin Mokrejs wrote:
>>
>>
>> Huang Ying wrote:
>>> Hi, Martin,
>>>
>>> On Wed, 2013-04-03 at 14:16 +0200, Martin Mokrejs wrote:
>>>> Meanwhile, the raw data: http://195.113.57.32/~mmokrejs/tmp/20130402.tar.bz2
>>>> (size 468641 bytes)
>>>
>>> Thanks a lot!  Your information is very complete and clear :)
>>>
>>>> They were collected by:
>>>>
>>>> # cat ~/bin/collect_runtime_status.sh 
>>>> #!/bin/sh
>>>> grep . /sys/bus/pci/devices/*/power/runtime_status > runtime_status_"$1".txt
>>>> grep . /sys/bus/pci/devices/*/power/control > control_"$1".txt
>>>> cat /proc/interrupts > interrupts_"$1".txt
>>>> cat /proc/iomem > iomem_"$1".txt
>>>> lspci -vvv > lspci_vvv_"$1".txt
>>>> dmesg > dmesg_"$1".txt
>>>> #
>>>>
>>>> Just do 'ls -latr' to see the order in which the files were created.
>>>> The longer the filename, the later in the test process. The names should be
>>>> relatively self-explanatory. In any case, from the log files you should see
>>>> what really happened and can therefore figure out what the (maybe weird)
>>>> long filename really meant.
>>>>
>>>> Sometimes I manually recorded lsusb or dmesg_final.txt, mostly after I did some
>>>> extra tests but did not want to record every step with the above 6 files.
>>>>
>>>> In one or two places I added some of my own notes to the COMMENTS file.
>>>>
>>>>
>>>>
>>>>
>>>> I will try to guide you below to where you can study each of the bugs. Mostly,
>>>> for each bug you need just one subdirectory to look into; the others just
>>>> repeat the same bug under a different kernel version or another patch.
>>>> However, Sarah, for the xHCI dead port issue, will need to compare two
>>>> directories by diff, one with the TI-based controller tests, the other with the
>>>> NEC-based tests. Especially there, I would do something like:
>>>>
>>>> mkdir -p /tmp/TI /tmp/NEC
>>>> cd *TI-based; for f in dmesg*; do cut -c 15- "$f" > /tmp/TI/"$f"; done
>>>> cd ../*NEC-based; for f in dmesg*; do cut -c 15- "$f" > /tmp/NEC/"$f"; done
>>>>
>>>> Then it should be easier to poke through files captured at the same test step,
>>>> like:
>>>>
>>>> diff -u -w /tmp/TI/dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead.txt \
>>>> /tmp/NEC/dmesg_initial__mouse_attached__detached__reattached.txt
>>>>
>>>>
>>>>
>>>> Other than that, just diff pairs of files with each other, like:
>>>>
>>>> diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__mouse_attached.txt
>>>>
>>>>
>>>> Sorry that I sometimes used only a single underscore instead of double underscores
>>>> to separate the test steps from each other in the filename.
>>>>
>>>>
>>>> Martin Mokrejs wrote:
>>>>> [ +linux-pci and Yinghai, as they have already suffered through many emails on
>>>>>  individual threads, so one overview email hopefully won't hurt] ;-)
>>>>>
>>>>> Martin Mokrejs wrote:
>>>>>>
>>>>>>
>>>>>> Bjorn Helgaas wrote:
>>>>>>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>>>>>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>>>>>> Hi Ying,
>>>>>>>>
>>>>>>>> huang ying wrote:
>>>>>>>
>>>>>>>>> And please give me the full dmesg for boot and incremental dmesg for
>>>>>>>>> operations.
>>>>>>>>
>>>>>>>>
>>>>>>>> The incremental bits are here; I will send the full dmesg only directly to your email, due to its size.
>>>>>>>
>>>>>>> Is there a bugzilla for this issue?  Please attach the complete dmesg
>>>>>>> there or somewhere similar so we can all benefit.
>>>>>>
>>>>>> I changed my mind. I am attaching the dmesg here but omitting linux-acpi
>>>>>> list. After I hear a proposal from Rafael/Bjorn I will open separate bugs.
>>>>>> I thought that the threads I started so far were enough but yes, dmesg
>>>>>> files don't pass through list filters so I should move that to bugzilla.
>>>>>>
>>>>>> So far my view of the bugs was:
>>>>>>
>>>>>> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
>>>>>>   (eSATA-based card)
>>>>>
>>>>> Fixed by Ying Huang's port_dbg.patch applied over 3.8.5 (fixes acpiphp hotplug
>>>>> of eSATA and Firewire cards, NOT the hotplug of a NEC-based USB3 card -> hence
>>>>> the bug 4) below). Now I can continue using laptop-mode-tools.
>>>>
>>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_eSATA_testing
>>>> 20130402/3.8.3-vanilla__with_laptop-mode-tools (with some comments in
>>>>                                                 COMMENTS file)
>>>
>>> Thanks for your testing!
>>>
>>>>>> 2) xHCI dead due to its suspend - 3.8 series and above
>>>>>
>>>>> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
>>>>> XHCI card *in an express card slot* does not suffer this suspend issue.
>>>>> Although it is being put into suspend if a device is unplugged.
>>>>
>>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based
>>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_NEC-based
>>>>
>>>> Same thing but without the port_dbg.patch:
>>>> 20130402/3.9-rc5__with_2368081__with-latop-mode-tools_xhci_testing/
>>>
>>> It appears that the TI xHCI dead port issue is present even if the PCIe
>>> port never gets suspended.  So I think this bug is not related to
>>> PCIe port runtime PM but to USB xHCI.
>>>
>>> Do you agree, Sarah?
>>
>> Although I confirmed with the 20130405.tar.bz2 dataset what Sarah repeated from our
>> past findings in the email which should be in your inbox, one thing is
>> puzzling:
>> When I have powersaving enabled upon bootup with NO USB devices attached to the TI
>> controller, effectively while reaching multiuser mode the 0b:00.0 is in a suspend
>> state. But, somehow, the very first mouse plugin works. Only the reject causes
>> more 'aggressive' suspend.
>> As it seems no upstream 1c.4 is messing up here (in the test Sarah wanted me to do
>> we have all control files 'on' except the end 0b:00.0) then really still something
>> *else* is causing the dead port *in conjunction* with 'suspended' runtime state.
>> Please double check what I wrote initially about the 20130402.tar.bz2 dataset.
>> Notably, I would compare lspci outputs from a cold boot state with no devices
>> attached and suspended 0b:00.0 (the 20130402.tar.bz2 dataset) with the dead port
>> status in lspci (find any in 20130402.tar.bz2 or now in 20130405.tar.bz2).
>>
>> Martin
>>
>>>
>>> [snip]
>>>
>>> Best Regards,
>>> Huang Ying

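The dmesg comparison described in the quoted message (dropping the timestamp column with `cut -c 15-` before diffing the TI-based and NEC-based captures) could also be wrapped in a small helper. What follows is only a sketch under stated assumptions: the function names `strip_dmesg_timestamps` and `normalise_dmesg_dir` are hypothetical, and the pattern assumes the default `[seconds.microseconds] ` dmesg prefix instead of a fixed column width.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any existing tool.

# Remove a leading "[   12.345678] " kernel timestamp from every line.
# Reads the named files, or stdin when no file is given.
strip_dmesg_timestamps() {
    sed -e 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] //' "$@"
}

# Copy every dmesg* file from directory $1 into directory $2
# with the timestamps stripped, so captures can be diffed step by step.
normalise_dmesg_dir() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/dmesg*; do
        [ -f "$f" ] || continue          # skip when the glob matched nothing
        strip_dmesg_timestamps "$f" > "$dest/$(basename "$f")"
    done
}
```

With the directory names from the 20130402 dataset, one would run `normalise_dmesg_dir 3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based /tmp/TI`, do the same for the NEC-based directory into `/tmp/NEC`, and then `diff -u -w` the files captured at the same test step. Matching the timestamp by pattern instead of by column keeps lines intact if the uptime grows past the usual field width.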


Thread overview: 61+ messages
2013-03-23 14:33 [PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
2013-03-23 16:22 ` Matthew Garrett
2013-03-25 16:45 ` Sarah Sharp
2013-03-25 22:34   ` Rafael J. Wysocki
2013-03-28 12:57 ` Rafael J. Wysocki
2013-03-28 16:21   ` Bjorn Helgaas
2013-03-28 16:41     ` Rafael J. Wysocki
2013-03-28 16:46       ` Bjorn Helgaas
2013-03-28 16:59         ` Rafael J. Wysocki
2013-03-28 17:26           ` Martin Mokrejs
2013-03-28 17:49             ` Bjorn Helgaas
2013-03-28 18:23               ` Sarah Sharp
2013-03-28 19:12                 ` Bjorn Helgaas
2013-03-28 19:42                   ` Martin Mokrejs
2013-03-28 18:31               ` Martin Mokrejs
2013-03-28 21:27                 ` Rafael J. Wysocki
2013-03-29  7:41                   ` huang ying
2013-03-31  2:29                     ` Martin Mokrejs
2013-03-30  2:03                   ` Martin Mokrejs
2013-04-02  5:25                     ` huang ying
2013-04-02 15:02                       ` Martin Mokrejs
2013-04-02 16:08                         ` huang ying
2013-04-02 16:53                           ` Martin Mokrejs
2013-04-02 16:30                         ` Bjorn Helgaas
     [not found]                           ` <515B17D9.6030805@fold.natur.cuni.cz>
2013-04-02 20:55                             ` Martin Mokrejs
2013-04-02 22:16                               ` Sarah Sharp
2013-04-03 10:35                                 ` Martin Mokrejs
2013-04-03  2:34                               ` huang ying
2013-04-03 10:39                                 ` Martin Mokrejs
2013-04-03 12:16                               ` Martin Mokrejs
2013-04-04 11:30                                 ` Huang Ying
2013-04-04 19:19                                   ` Sarah Sharp
2013-04-05 12:30                                     ` Martin Mokrejs
2013-04-05 12:40                                   ` Martin Mokrejs
2013-04-19 23:49                                     ` Martin Mokrejs
2013-04-30 20:47                                       ` Martin Mokrejs
2013-04-02 22:49                           ` Rafael J. Wysocki
2013-04-02 23:58                             ` Bjorn Helgaas
2013-04-03 11:00                               ` Rafael J. Wysocki
2013-04-03  2:04                           ` huang ying
2013-04-03 17:29                             ` Bjorn Helgaas
2013-03-30 22:38                   ` [Update][PATCH] PCI / PM: Disable runtime PM of PCIe ports Rafael J. Wysocki
2013-04-01 17:34                     ` Bjorn Helgaas
2013-04-01 20:51                       ` Rafael J. Wysocki
2013-04-01 20:53                         ` Bjorn Helgaas
2013-04-01 21:24                           ` Rafael J. Wysocki
2013-04-01 23:20                             ` Rafael J. Wysocki
2013-04-01 21:48                           ` Martin Mokrejs
2013-04-02  5:34                           ` huang ying
2013-04-02  5:28                         ` huang ying
2013-04-02  5:31                           ` huang ying
2013-04-03 22:34                     ` Bjorn Helgaas
2013-03-28 17:10 ` [Resend][PATCH] PCI / ACPI: Always resume devices on ACPI wakeup notifications Rafael J. Wysocki
2013-03-28 21:07   ` [Update][PATCH] " Rafael J. Wysocki
2013-03-29 15:05     ` Martin Mokrejs
2013-03-29 16:05       ` Sarah Sharp
2013-03-29 17:11         ` Martin Mokrejs
2013-03-29 18:16           ` Martin Mokrejs
2013-03-29 21:37         ` Rafael J. Wysocki
2013-03-29 21:34       ` Rafael J. Wysocki
2013-04-03 22:38     ` Bjorn Helgaas
