* [PATCH] ASPM: Fix pcie devices with non-pcie children
@ 2012-03-27 14:17 Matthew Garrett
  2012-03-27 16:58 ` Colin Ian King
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Matthew Garrett @ 2012-03-27 14:17 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-kernel, linux-pci, Matthew Garrett, stable

Commit 4949be16822e92a18ea0cc1616319926628092ee changed the behaviour of
pcie_aspm_sanity_check() to always return 0 if aspm is disabled, in order
to avoid cases where we changed ASPM state on pre-PCIe 1.1 devices. This
skipped the secondary function of pcie_aspm_sanity_check which was to avoid
us enabling ASPM on devices that had non-PCIe children, causing us to hit
a BUG_ON later on. Move the aspm_disabled check so we continue to honour
that scenario.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
---
 drivers/pci/pcie/aspm.c |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 86111d9..41e367b 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -521,9 +521,6 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 	int pos;
 	u32 reg32;
 
-	if (aspm_disabled)
-		return 0;
-
 	/*
 	 * Some functions in a slot might not all be PCIe functions,
 	 * very strange. Disable ASPM for the whole slot
@@ -532,6 +529,16 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 		pos = pci_pcie_cap(child);
 		if (!pos)
 			return -EINVAL;
+
+		/*
+		 * If ASPM is disabled then we're not going to change
+		 * the BIOS state. It's safe to continue even if it's a
+		 * pre-1.1 device
+		 */
+
+		if (aspm_disabled)
+			continue;
+
 		/*
 		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
 		 * RBER bit to determine if a function is 1.1 version device
-- 
1.7.7.6
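
The control-flow change in this patch can be modelled outside the kernel. What follows is a minimal sketch, not the kernel code: struct child_fn, its two flags, and both function names are hypothetical stand-ins for the pci_pcie_cap(child) test and the RBER test, and any other conditions the real function applies are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one function on the link's downstream bus. */
struct child_fn {
	bool is_pcie;	/* models pci_pcie_cap(child) returning non-zero */
	bool pre_1_1;	/* models the RBER bit being clear (pre-1.1 device) */
};

/* Model of the flow after 4949be16822e: return before any child is seen. */
static int sanity_check_before(const struct child_fn *c, size_t n,
			       bool aspm_disabled)
{
	size_t i;

	if (aspm_disabled)
		return 0;

	for (i = 0; i < n; i++) {
		if (!c[i].is_pcie)
			return -EINVAL;
		if (c[i].pre_1_1)
			return -EINVAL;
	}
	return 0;
}

/* Model of the flow with this patch: only the pre-1.1 test is skipped. */
static int sanity_check_after(const struct child_fn *c, size_t n,
			      bool aspm_disabled)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* A non-PCIe child is always rejected. */
		if (!c[i].is_pcie)
			return -EINVAL;
		/* BIOS state is left alone, so a pre-1.1 device is acceptable. */
		if (aspm_disabled)
			continue;
		if (c[i].pre_1_1)
			return -EINVAL;
	}
	return 0;
}
```

In this model, a child list containing a non-PCIe function with aspm_disabled set makes sanity_check_before() return 0 and sanity_check_after() return -EINVAL, which is the case the patch restores. A PCIe-only list with a pre-1.1 device still returns 0 when aspm_disabled is set.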



* Re: [PATCH] ASPM: Fix pcie devices with non-pcie children
  2012-03-27 14:17 [PATCH] ASPM: Fix pcie devices with non-pcie children Matthew Garrett
@ 2012-03-27 16:58 ` Colin Ian King
  2012-03-28 21:15 ` Jonathan Nieder
  2012-03-29 16:32 ` [PATCH resend] " Jonathan Nieder
  2 siblings, 0 replies; 10+ messages in thread
From: Colin Ian King @ 2012-03-27 16:58 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: bhelgaas, linux-kernel, linux-pci, stable

On 27/03/12 15:17, Matthew Garrett wrote:
> Commit 4949be16822e92a18ea0cc1616319926628092ee changed the behaviour of
> pcie_aspm_sanity_check() to always return 0 if aspm is disabled, in order
> to avoid cases where we changed ASPM state on pre-PCIe 1.1 devices. This
> skipped the secondary function of pcie_aspm_sanity_check which was to avoid
> us enabling ASPM on devices that had non-PCIe children, causing us to hit
> a BUG_ON later on. Move the aspm_disabled check so we continue to honour
> that scenario.
>
> Signed-off-by: Matthew Garrett <mjg@redhat.com>
> Cc: stable@vger.kernel.org
> ---
>   drivers/pci/pcie/aspm.c |   13 ++++++++++---
>   1 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 86111d9..41e367b 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -521,9 +521,6 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
>   	int pos;
>   	u32 reg32;
>
> -	if (aspm_disabled)
> -		return 0;
> -
>   	/*
>   	 * Some functions in a slot might not all be PCIe functions,
>   	 * very strange. Disable ASPM for the whole slot
> @@ -532,6 +529,16 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
>   		pos = pci_pcie_cap(child);
>   		if (!pos)
>   			return -EINVAL;
> +
> +		/*
> +		 * If ASPM is disabled then we're not going to change
> +		 * the BIOS state. It's safe to continue even if it's a
> +		 * pre-1.1 device
> +		 */
> +
> +		if (aspm_disabled)
> +			continue;
> +
>   		/*
>   		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
>   		 * RBER bit to determine if a function is 1.1 version device

A user has now verified that this fixes the BUG_ON() during boot:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/961482/comments/51

So this does the trick. Thanks, Matthew.


* Re: [PATCH] ASPM: Fix pcie devices with non-pcie children
  2012-03-27 14:17 [PATCH] ASPM: Fix pcie devices with non-pcie children Matthew Garrett
  2012-03-27 16:58 ` Colin Ian King
@ 2012-03-28 21:15 ` Jonathan Nieder
  2012-03-29 16:32 ` [PATCH resend] " Jonathan Nieder
  2 siblings, 0 replies; 10+ messages in thread
From: Jonathan Nieder @ 2012-03-28 21:15 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: bhelgaas, linux-kernel, linux-pci, stable, Colin Ian King, janek

Hi Matthew,

Matthew Garrett wrote:

> Commit 4949be16822e92a18ea0cc1616319926628092ee changed the behaviour of
> pcie_aspm_sanity_check() to always return 0 if aspm is disabled, in order
> to avoid cases where we changed ASPM state on pre-PCIe 1.1 devices. This
> skipped the secondary function of pcie_aspm_sanity_check which was to avoid
> us enabling ASPM on devices that had non-PCIe children, causing us to hit
> a BUG_ON later on. Move the aspm_disabled check so we continue to honour
> that scenario.

janek (cc-ed) never experienced the BUG_ON.  Instead, starting with
v3.3 and v3.2.12 his hard disk using the pata_jmicron driver was not
detected during boot-up, resulting in the message "gave up waiting for
root device" and a failed boot.

Found in

  Debian kernel 3.2.12-1
  Debian kernel 3.3-1~experimental.1
  Upstream 3.3
  Linus's "master" as of 2012-03-28

Based on the thread [1] we blamed 4949be16822.  janek tried the patch
above on top of Linus's "master".  The result:

> Thanks. This patch fixes the problem.

In other words, this gets the pata_jmicron driver to enumerate its
drives again, a positive effect that wasn't even advertised in the
commit message. ;-)  Thanks for writing it.

Sincerely,
Jonathan

[1] http://thread.gmane.org/gmane.linux.kernel/1271264/focus=1271785


* [PATCH resend] ASPM: Fix pcie devices with non-pcie children
  2012-03-27 14:17 [PATCH] ASPM: Fix pcie devices with non-pcie children Matthew Garrett
  2012-03-27 16:58 ` Colin Ian King
  2012-03-28 21:15 ` Jonathan Nieder
@ 2012-03-29 16:32 ` Jonathan Nieder
  2012-03-29 20:46   ` Andrew Morton
  2 siblings, 1 reply; 10+ messages in thread
From: Jonathan Nieder @ 2012-03-29 16:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Garrett, bhelgaas, linux-kernel, linux-pci, stable,
	Romain Francoise, Chris Holland, Colin Ian King, Hatem Masmoudi,
	janek

From: Matthew Garrett <mjg@redhat.com>
Date: Tue, 27 Mar 2012 10:17:41 -0400

Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
Some other systems using the pata_jmicron driver fail to boot because
no disks are detected.  Passing pcie_aspm=force on the kernel command
line works around it.

The cause: commit 4949be16822e ("PCI: ignore pre-1.1 ASPM quirking
when ASPM is disabled") changed the behaviour of
pcie_aspm_sanity_check() to always return 0 if aspm is disabled, in
order to avoid cases where we changed ASPM state on pre-PCIe 1.1
devices.  This skipped the secondary function of
pcie_aspm_sanity_check which was to avoid us enabling ASPM on devices
that had non-PCIe children, causing trouble later on.  Move the
aspm_disabled check so we continue to honour that scenario.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
          http://bugs.debian.org/665420

[jn: with more symptoms in log message]

Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Hi Andrew,

This patch only appeared a couple of days ago[1], but it fixes a
noticeable regression so I would like to make sure the patch becomes
part of mainline and the 3.2.y- and 3.3.y-stable trees soon.  Could
you pick it up for linux-next until it makes its way to the PCI tree?

Regression was introduced between 3.3-rc7 and 3.3 and between 3.2.11
and 3.2.12.  Prevents boot on affected machines, though there is a
workaround.  Details about the symptoms and fix are above.

Thanks,
Jonathan

[1] http://thread.gmane.org/gmane.linux.kernel.pci/14503

 drivers/pci/pcie/aspm.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 4bdef24cd412..b500840a143b 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -508,9 +508,6 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 	int pos;
 	u32 reg32;
 
-	if (aspm_disabled)
-		return 0;
-
 	/*
 	 * Some functions in a slot might not all be PCIe functions,
 	 * very strange. Disable ASPM for the whole slot
@@ -519,6 +516,16 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 		pos = pci_pcie_cap(child);
 		if (!pos)
 			return -EINVAL;
+
+		/*
+		 * If ASPM is disabled then we're not going to change
+		 * the BIOS state. It's safe to continue even if it's a
+		 * pre-1.1 device
+		 */
+
+		if (aspm_disabled)
+			continue;
+
 		/*
 		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
 		 * RBER bit to determine if a function is 1.1 version device
-- 
1.7.10.rc1



* Re: [PATCH resend] ASPM: Fix pcie devices with non-pcie children
  2012-03-29 16:32 ` [PATCH resend] " Jonathan Nieder
@ 2012-03-29 20:46   ` Andrew Morton
  2012-03-29 20:56     ` Matthew Garrett
  2012-03-29 21:02     ` Jonathan Nieder
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2012-03-29 20:46 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Matthew Garrett, bhelgaas, linux-kernel, linux-pci, stable,
	Romain Francoise, Chris Holland, Colin Ian King, Hatem Masmoudi,
	janek, Jesse Barnes

On Thu, 29 Mar 2012 11:32:06 -0500
Jonathan Nieder <jrnieder@gmail.com> wrote:

> From: Matthew Garrett <mjg@redhat.com>
> Date: Tue, 27 Mar 2012 10:17:41 -0400
> 
> Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
> Some other systems using the pata_jmicron driver fail to boot because
> no disks are detected.  Passing pcie_aspm=force on the kernel command
> line works around it.
> 
> The cause: commit 4949be16822e ("PCI: ignore pre-1.1 ASPM quirking
> when ASPM is disabled") changed the behaviour of
> pcie_aspm_sanity_check() to always return 0 if aspm is disabled, in
> order to avoid cases where we changed ASPM state on pre-PCIe 1.1
> devices.  This skipped the secondary function of
> pcie_aspm_sanity_check which was to avoid us enabling ASPM on devices
> that had non-PCIe children, causing trouble later on.  Move the
> aspm_disabled check so we continue to honour that scenario.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
>           http://bugs.debian.org/665420
> 
> [jn: with more symptoms in log message]
> 
> Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
> Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
> Signed-off-by: Matthew Garrett <mjg@redhat.com>
> Cc: stable@vger.kernel.org
> Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
> Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---
> Hi Andrew,
> 
> This patch only appeared a couple of days ago[1], but it fixes a
> noticeable regression so I would like to make sure the patch becomes
> part of mainline and the 3.2.y- and 3.3.y-stable trees soon.  Could
> you pick it up for linux-next until it makes its way to the PCI tree?
> 
> Regression was introduced between 3.3-rc7 and 3.3 and between 3.2.11
> and 3.2.12.  Prevents boot on affected machines, though there is a
> workaround.  Details about the symptoms and fix are above.

Just about the only person who wasn't copied on this email is, umm, the
PCI maintainer!


* Re: [PATCH resend] ASPM: Fix pcie devices with non-pcie children
  2012-03-29 20:46   ` Andrew Morton
@ 2012-03-29 20:56     ` Matthew Garrett
  2012-03-29 20:59       ` Andrew Morton
  2012-03-29 21:02     ` Jonathan Nieder
  1 sibling, 1 reply; 10+ messages in thread
From: Matthew Garrett @ 2012-03-29 20:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jonathan Nieder, bhelgaas, linux-kernel, linux-pci, stable,
	Romain Francoise, Chris Holland, Colin Ian King, Hatem Masmoudi,
	janek, Jesse Barnes

On Thu, Mar 29, 2012 at 01:46:14PM -0700, Andrew Morton wrote:

> Just about the only person who wasn't copied on this email is, umm, the
> PCI maintainer!

Jesse just handed that off to Bjorn…

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH resend] ASPM: Fix pcie devices with non-pcie children
  2012-03-29 20:56     ` Matthew Garrett
@ 2012-03-29 20:59       ` Andrew Morton
  2012-03-29 21:49           ` Bjorn Helgaas
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2012-03-29 20:59 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Jonathan Nieder, bhelgaas, linux-kernel, linux-pci, stable,
	Romain Francoise, Chris Holland, Colin Ian King, Hatem Masmoudi,
	janek, Jesse Barnes

On Thu, 29 Mar 2012 21:56:35 +0100
Matthew Garrett <mjg59@srcf.ucam.org> wrote:

> On Thu, Mar 29, 2012 at 01:46:14PM -0700, Andrew Morton wrote:
> 
> > Just about the only person who wasn't copied on this email is, umm, the
> > PCI maintainer!
> 
> Jesse just handed that off to Bjorn…

Oh.  So he did.   My search of MAINTAINERS turned up the probably wrong

T:      git git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci.git


* Re: [PATCH resend] ASPM: Fix pcie devices with non-pcie children
  2012-03-29 20:46   ` Andrew Morton
  2012-03-29 20:56     ` Matthew Garrett
@ 2012-03-29 21:02     ` Jonathan Nieder
  1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Nieder @ 2012-03-29 21:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Garrett, bhelgaas, linux-kernel, linux-pci, stable,
	Romain Francoise, Chris Holland, Colin Ian King, Hatem Masmoudi,
	janek, Jesse Barnes

Andrew Morton wrote:
> Jonathan Nieder <jrnieder@gmail.com> wrote:

>> From: Matthew Garrett <mjg@redhat.com>
>> Date: Tue, 27 Mar 2012 10:17:41 -0400
[...]
>>            commit 4949be16822e ("PCI: ignore pre-1.1 ASPM quirking
>> when ASPM is disabled") changed the behaviour of
>> pcie_aspm_sanity_check() to always return 0 if aspm is disabled, in
>> order to avoid cases where we changed ASPM state on pre-PCIe 1.1
>> devices.  This skipped the secondary function of
>> pcie_aspm_sanity_check which was to avoid us enabling ASPM on devices
>> that had non-PCIe children, causing trouble later on.
[...]
>>                                                               Could
>> you pick it up for linux-next until it makes its way to the PCI tree?
[...]
> Just about the only person who wasn't copied on this email is, umm, the
> PCI maintainer!

Well spotted.  Thanks for catching it.


* Re: [PATCH resend] ASPM: Fix pcie devices with non-pcie children
  2012-03-29 20:59       ` Andrew Morton
@ 2012-03-29 21:49           ` Bjorn Helgaas
  0 siblings, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2012-03-29 21:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Garrett, Jonathan Nieder, linux-kernel, linux-pci,
	stable, Romain Francoise, Chris Holland, Colin Ian King,
	Hatem Masmoudi, janek, Jesse Barnes

On Thu, Mar 29, 2012 at 2:59 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Thu, 29 Mar 2012 21:56:35 +0100
> Matthew Garrett <mjg59@srcf.ucam.org> wrote:
>
>> On Thu, Mar 29, 2012 at 01:46:14PM -0700, Andrew Morton wrote:
>>
>> > Just about the only person who wasn't copied on this email is, umm, the
>> > PCI maintainer!
>>
>> Jesse just handed that off to Bjorn…
>
> Oh.  So he did.   My search of MAINTAINERS turned up the probably wrong
>
> T:      git git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci.git

Yep, I'll update MAINTAINERS as soon as I set up a kernel.org tree.

The patch itself looks fine to me, so in case anybody wants to pick it
up earlier:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

I do think that ASPM path is disappointingly hard to follow, which
likely contributed to the bug in the first place.
pcie_aspm_sanity_check() is a terrible name for something that returns
0/errno (which is treated as a bool meaning something like "do ASPM on
this device").  And the idea that we save this blacklist information
in the form of "link->aspm_enabled = ASPM_STATE_ALL" is weird.

But obviously, I'm ignorant of ASPM in general.

Bjorn
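
The convention described in this message, an int that is 0 or a negative errno being collapsed into a flag, can be sketched as below. This is an illustrative model only: link_model, check_model(), record_blacklist() and the numeric value given to ASPM_STATE_ALL_MODEL are assumptions made for the example, and the real caller does not appear in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Placeholder value, assumed for the example only. */
#define ASPM_STATE_ALL_MODEL 0x7u

/* Hypothetical stand-in for the per-link state. */
struct link_model {
	unsigned int aspm_enabled;
};

/* Returns 0 or a negative errno, like the function under discussion. */
static int check_model(bool children_ok)
{
	return children_ok ? 0 : -EINVAL;
}

/* Collapses the 0/errno result into a flag and records it on the link. */
static bool record_blacklist(struct link_model *link, bool children_ok)
{
	bool blacklist = check_model(children_ok) != 0;

	if (blacklist)
		link->aspm_enabled = ASPM_STATE_ALL_MODEL;
	return blacklist;
}
```

In this model a non-zero return from the check is what sets the flag, and the flag survives only as a field value on the link.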
