linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off
@ 2017-01-03  8:09 Jan Rüth
  2017-02-03 17:32 ` Bjorn Helgaas
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Rüth @ 2017-01-03  8:09 UTC (permalink / raw)
  To: linux-pci

This patch fixes a null pointer dereference during PCI bus enumeration
when ASPM is off. On an IBM x3850 8664 this bug causes the kernel to
halt, this behavior did not appear in 3.10, so this is a regression.

pcie_aspm_sanity_check should only be called if ASPM is on.

Signed-off-by: Jan Rueth <rueth@comsys.rwth-aachen.de>
---
 drivers/pci/pcie/aspm.c | +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index f981129..e758b56 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -552,11 +552,12 @@ static struct pcie_link_state
*alloc_pcie_link_state(struct pci_dev *pdev)
 void pcie_aspm_init_link_state(struct pci_dev *pdev)
 {
        struct pcie_link_state *link;
-       int blacklist = !!pcie_aspm_sanity_check(pdev);
-
+       int blacklist;
        if (!aspm_support_enabled)
                return;

+       blacklist = !!pcie_aspm_sanity_check(pdev);
+
        if (pdev->link_state)
                return;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off
  2017-01-03  8:09 [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off Jan Rüth
@ 2017-02-03 17:32 ` Bjorn Helgaas
  2017-03-07  0:27   ` Bjorn Helgaas
  0 siblings, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2017-02-03 17:32 UTC (permalink / raw)
  To: Jan Rüth; +Cc: linux-pci

On Tue, Jan 03, 2017 at 09:09:49AM +0100, Jan Rüth wrote:
> This patch fixes a null pointer dereference during PCI bus enumeration
> when ASPM is off. On an IBM x3850 8664 this bug causes the kernel to
> halt, this behavior did not appear in 3.10, so this is a regression.
> 
> pcie_aspm_sanity_check should only be called if ASPM is on.
> 
> Signed-off-by: Jan Rueth <rueth@comsys.rwth-aachen.de>

We have https://bugzilla.kernel.org/show_bug.cgi?id=187731 for this issue.

Per the dmesg attached there, you're booting with "pcie_aspm=off".
This patch fixes the NULL pointer dereference in that case, but I'm
concerned that we may still trip over it if you boot without
"pcie_aspm=off".

I want to make sure we fix both cases.  Can you please try a boot
with this patch but without "pcie_aspm=off"?

> ---
>  drivers/pci/pcie/aspm.c | +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index f981129..e758b56 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -552,11 +552,12 @@ static struct pcie_link_state
> *alloc_pcie_link_state(struct pci_dev *pdev)
>  void pcie_aspm_init_link_state(struct pci_dev *pdev)
>  {
>         struct pcie_link_state *link;
> -       int blacklist = !!pcie_aspm_sanity_check(pdev);
> -
> +       int blacklist;
>         if (!aspm_support_enabled)
>                 return;
> 
> +       blacklist = !!pcie_aspm_sanity_check(pdev);
> +
>         if (pdev->link_state)
>                 return;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off
  2017-02-03 17:32 ` Bjorn Helgaas
@ 2017-03-07  0:27   ` Bjorn Helgaas
  0 siblings, 0 replies; 7+ messages in thread
From: Bjorn Helgaas @ 2017-03-07  0:27 UTC (permalink / raw)
  To: Jan Rüth; +Cc: linux-pci

On Fri, Feb 03, 2017 at 11:32:43AM -0600, Bjorn Helgaas wrote:
> On Tue, Jan 03, 2017 at 09:09:49AM +0100, Jan Rüth wrote:
> > This patch fixes a null pointer dereference during PCI bus enumeration
> > when ASPM is off. On an IBM x3850 8664 this bug causes the kernel to
> > halt, this behavior did not appear in 3.10, so this is a regression.
> > 
> > pcie_aspm_sanity_check should only be called if ASPM is on.
> > 
> > Signed-off-by: Jan Rueth <rueth@comsys.rwth-aachen.de>
> 
> We have https://bugzilla.kernel.org/show_bug.cgi?id=187731 for this issue.
> 
> Per the dmesg attached there, you're booting with "pcie_aspm=off".
> This patch fixes the NULL pointer dereference in that case, but I'm
> concerned that we may still trip over it if you boot without
> "pcie_aspm=off".
> 
> I want to make sure we fix both cases.  Can you please try a boot
> with this patch but without "pcie_aspm=off"?

Ping?  I'd like to get this patch merged but I want to make sure we
fix the problem both with and without "pcie_aspm=off".

Is anybody in a position to test this on an IBM x3850 8664?

> > ---
> >  drivers/pci/pcie/aspm.c | +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index f981129..e758b56 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -552,11 +552,12 @@ static struct pcie_link_state
> > *alloc_pcie_link_state(struct pci_dev *pdev)
> >  void pcie_aspm_init_link_state(struct pci_dev *pdev)
> >  {
> >         struct pcie_link_state *link;
> > -       int blacklist = !!pcie_aspm_sanity_check(pdev);
> > -
> > +       int blacklist;
> >         if (!aspm_support_enabled)
> >                 return;
> > 
> > +       blacklist = !!pcie_aspm_sanity_check(pdev);
> > +
> >         if (pdev->link_state)
> >                 return;
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off
  2017-01-04 11:41   ` Jan Rüth
@ 2017-01-19 23:41     ` Bjorn Helgaas
  0 siblings, 0 replies; 7+ messages in thread
From: Bjorn Helgaas @ 2017-01-19 23:41 UTC (permalink / raw)
  To: Jan Rüth; +Cc: linux-pci

Hi Jan,

On Wed, Jan 04, 2017 at 12:41:36PM +0100, Jan Rüth wrote:
> On 03/01/2017 21:53, Bjorn Helgaas wrote:
> > On Tue, Jan 03, 2017 at 09:04:36AM +0100, Jan Rüth wrote:
> >> This patch fixes a null pointer dereference during PCI bus enumeration
> >> when ASPM is off. On an IBM x3850 8664 this bug causes the kernel to
> >> halt, this behavior did not appear in 3.10, so this is a regression.
> >>
> >> pcie_aspm_sanity_check should only be called if ASPM is on.
> > 
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731
> > 
> >> Signed-off-by: Jan Rueth <rueth@comsys.rwth-aachen.de>
> >> ---
> >>  drivers/pci/pcie/aspm.c | +++--
> >>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> >> index f981129..e758b56 100644
> >> --- a/drivers/pci/pcie/aspm.c
> >> +++ b/drivers/pci/pcie/aspm.c
> >> @@ -552,11 +552,12 @@ static struct pcie_link_state
> >> *alloc_pcie_link_state(struct pci_dev *pdev)
> >>  void pcie_aspm_init_link_state(struct pci_dev *pdev)
> >>  {
> >>         struct pcie_link_state *link;
> >> -       int blacklist = !!pcie_aspm_sanity_check(pdev);
> >> -
> >> +       int blacklist;
> >>         if (!aspm_support_enabled)
> >>                 return;
> >>
> >> +       blacklist = !!pcie_aspm_sanity_check(pdev);
> >> +
> >>         if (pdev->link_state)
> >>                 return;
> > 
> > For some reason this doesn't apply for me (complains about a malformed
> > patch), but I can apply it by hand easily enough.
> > 
> > However, this path is a little obtuse, and I'd like to understand
> > what's going on better.  We currently have:
> > 
> >   pci_scan_slot
> >     if (bus->self && nr)
> >       pcie_aspm_init_link_state(bus->self)
> >         pcie_aspm_sanity_check
> > 	  list_for_each_entry(child, &pdev->subordinate->devices, ...)
> > 
> > I assume the null pointer is pdev->subordinate, so maybe we're calling
> > pcie_aspm_init_link_state() for a bridge where we weren't able to
> > create a child bus ("pdev->subordinate")?
> 
> At the time when I debugged it could trace it back to the loop but I
> can't remember which part actually broke. But I can add some debug
> output and crash it again.

I'm pretty sure the null pointer would be pdev->subordinate, and your
patch certainly avoids dereferencing that as long as you boot with
"pcie_aspm=off".  But I think if you boot with your patch but without
"pcie_aspm=off", we'll probably dereference that null pointer again.

The only way I can see to have a null pdev->subordinate pointer is via
SR-IOV VF devices, and you should have some of those.  But I think we
should see the messages from pci_setup_device() for them, and I don't
see them in the dmesg log in the bugzilla, so I'm still confused about
this.

Why are you using "pcie_aspm=off"?  If it's due to Linux ASPM bugs, I
want to fix those, too.  If it's due to hardware problems on your
platform, maybe we can add some sort of quirk to do this
automatically.

Can you please apply the patches below (you should be able to feed
this whole email to "patch" at once), collect the complete dmesg logs
(both with and without "pcie_aspm=off"), and attach them to the
bugzilla?

Bjorn


commit 50deebe0077acdfafe08501b7da220d256743e36
Author: Jan Rüth <rueth@comsys.rwth-aachen.de>
Date:   Tue Jan 3 14:09:50 2017 -0600

    PCI/ASPM: Avoid ASPM sanity checking when ASPM is off
    
    Previously we called pcie_aspm_sanity_check() even if ASPM was disabled,
    which is unnecessary.  Delay the pcie_aspm_sanity_check() until we know
    that we should do some ASPM configuration.
    
    As a side effect, this avoids a null pointer dereference in
    pcie_aspm_sanity_check(), which we saw on IBM x3850 8664 (see bugzilla
    below).
    
    [bhelgaas: changelog, delay call a little more]
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731
    Signed-off-by: Jan Rüth <rueth@comsys.rwth-aachen.de>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: stable@vger.kernel.org

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 17ac1dce3286..99a15b2e4228 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -565,7 +565,7 @@ static struct pcie_link_state *alloc_pcie_link_state(struct pci_dev *pdev)
 void pcie_aspm_init_link_state(struct pci_dev *pdev)
 {
 	struct pcie_link_state *link;
-	int blacklist = !!pcie_aspm_sanity_check(pdev);
+	int blacklist;
 
 	if (!aspm_support_enabled)
 		return;
@@ -586,6 +586,8 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev)
 	    pdev->bus->self)
 		return;
 
+	blacklist = !!pcie_aspm_sanity_check(pdev);
+
 	down_read(&pci_bus_sem);
 	if (list_empty(&pdev->subordinate->devices))
 		goto out;
@@ -594,6 +596,7 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev)
 	link = alloc_pcie_link_state(pdev);
 	if (!link)
 		goto unlock;
+
 	/*
 	 * Setup initial ASPM state. Note that we need to configure
 	 * upstream links also because capable state of them can be

commit 8cd2325c4360c6aa680dd32d804ae8dcc8f8cc5b
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Jan 19 17:21:10 2017 -0600

    aspm-debug

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 99a15b2e4228..c8695965b1d0 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -567,6 +567,13 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev)
 	struct pcie_link_state *link;
 	int blacklist;
 
+	dev_info(&pdev->dev, "support %d PCIe %d link_state %d secondary %d subordinate %d\n",
+		 aspm_support_enabled ? 1 : 0,
+		 pci_is_pcie(pdev) ? 1 : 0,
+		 pdev->link_state ? 1 : 0,
+		 pdev->has_secondary_link ? 1 : 0,
+		 pdev->subordinate ? 1 : 0);
+
 	if (!aspm_support_enabled)
 		return;
 

commit b89fa357a592615c00c1b98d8c06a3df1d185078
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Jan 19 17:32:38 2017 -0600

    sriov-debug

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 47227820406d..98be6d6b119f 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -129,6 +129,8 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
 	if (!bus)
 		goto failed;
 
+	dev_info(&dev->dev, "add VF %d on %s\n", id, dev_name(&bus->dev));
+
 	virtfn = pci_alloc_dev(bus);
 	if (!virtfn)
 		goto failed0;
@@ -145,6 +147,9 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
 	virtfn->is_virtfn = 1;
 	virtfn->multifunction = 0;
 
+	dev_info(&virtfn->dev, "new VF %d, subordinate %d\n", id,
+		virtfn->subordinate ? 1 : 0);
+
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
 		res = &dev->resource[i + PCI_IOV_RESOURCES];
 		if (!res->parent)

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off
  2017-01-03 20:53 ` Bjorn Helgaas
@ 2017-01-04 11:41   ` Jan Rüth
  2017-01-19 23:41     ` Bjorn Helgaas
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Rüth @ 2017-01-04 11:41 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci



On 03/01/2017 21:53, Bjorn Helgaas wrote:
> On Tue, Jan 03, 2017 at 09:04:36AM +0100, Jan Rüth wrote:
>> This patch fixes a null pointer dereference during PCI bus enumeration
>> when ASPM is off. On an IBM x3850 8664 this bug causes the kernel to
>> halt, this behavior did not appear in 3.10, so this is a regression.
>>
>> pcie_aspm_sanity_check should only be called if ASPM is on.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731
> 
>> Signed-off-by: Jan Rueth <rueth@comsys.rwth-aachen.de>
>> ---
>>  drivers/pci/pcie/aspm.c | +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>> index f981129..e758b56 100644
>> --- a/drivers/pci/pcie/aspm.c
>> +++ b/drivers/pci/pcie/aspm.c
>> @@ -552,11 +552,12 @@ static struct pcie_link_state
>> *alloc_pcie_link_state(struct pci_dev *pdev)
>>  void pcie_aspm_init_link_state(struct pci_dev *pdev)
>>  {
>>         struct pcie_link_state *link;
>> -       int blacklist = !!pcie_aspm_sanity_check(pdev);
>> -
>> +       int blacklist;
>>         if (!aspm_support_enabled)
>>                 return;
>>
>> +       blacklist = !!pcie_aspm_sanity_check(pdev);
>> +
>>         if (pdev->link_state)
>>                 return;
> 
> For some reason this doesn't apply for me (complains about a malformed
> patch), but I can apply it by hand easily enough.
> 
> However, this path is a little obtuse, and I'd like to understand
> what's going on better.  We currently have:
> 
>   pci_scan_slot
>     if (bus->self && nr)
>       pcie_aspm_init_link_state(bus->self)
>         pcie_aspm_sanity_check
> 	  list_for_each_entry(child, &pdev->subordinate->devices, ...)
> 
> I assume the null pointer is pdev->subordinate, so maybe we're calling
> pcie_aspm_init_link_state() for a bridge where we weren't able to
> create a child bus ("pdev->subordinate")?

At the time when I debugged it could trace it back to the loop but I
can't remember which part actually broke. But I can add some debug
output and crash it again.
> 
> I think it's legal to have a bridge device where we haven't set
> pdev->subordinate, but it does seem unusual.
> 
> I don't see a log with the actual null pointer dereference in the
> bugzilla.  Do you have any more details about which device blows up
> here and why it's unusual?

If you need it, I can boot an unpatched kernel again and post the actual
nullpointer dereference. The device in question is an ethernet card:
Intel Corporation 82576
Here is also a lspci -tv if that is of any help:

-+-[0000:19]---00.0-[1a-1d]----00.0-[1b-1d]--+-02.0-[1c]--+-00.0  Intel
Corporation 82576 Gigabit Network Connection
 |                                           |            \-00.1  Intel
Corporation 82576 Gigabit Network Connection
 |                                           \-04.0-[1d]--+-00.0  Intel
Corporation 82576 Gigabit Network Connection
 |                                                        \-00.1  Intel
Corporation 82576 Gigabit Network Connection
 +-[0000:14]---00.0-[15-18]--
 +-[0000:0f]---00.0-[10-13]--
 +-[0000:0a]---00.0-[0b-0e]--
 +-[0000:06]---00.0  IBM Calgary PCI-X Host Bridge
 +-[0000:02]---00.0  IBM Calgary PCI-X Host Bridge
 +-[0000:01]-+-00.0  IBM Calgary PCI-X Host Bridge
 |           +-01.0  Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet
 |           +-01.1  Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet
 |           \-02.0  Adaptec AAC-RAID
 \-[0000:00]-+-00.0  IBM Calgary PCI-X Host Bridge
             +-01.0  Advanced Micro Devices, Inc. [AMD/ATI] RV100
[Radeon 7000 / Radeon VE]
             +-03.0  NEC Corporation OHCI USB Controller
             +-03.1  NEC Corporation OHCI USB Controller
             +-03.2  NEC Corporation uPD72010x USB 2.0 Controller
             +-0f.0  Broadcom CSB6 South Bridge
             +-0f.1  Broadcom CSB6 RAID/IDE Controller
             \-0f.3  Broadcom GCLE-2 Host Bridge

> 
> Part of the reason I'm curious is that I'd like to identify the commit
> that introduced the problem so we can figure out where the fix might
> need to be backported.  I suspected b7206cbf024d ("PCI ASPM: support
> partial aspm enablement") because it moved the
> pcie_aspm_sanity_check() call before the checks for aspm_disabled, but
> that's been around since 2.6 days, and the problem you're seeing
> didn't happen until after 3.10.

The distribution I tested where it ran with a 3.10 kernel was a CentOS
(the machines were once certified for REHL so I figured there might be
something special). I'm not 100% sure anymore but I think I used a
CentOS 6.5 and updated the kernel to 3.10 there and it worked. (I think
I followed this guide:
http://bicofino.io/2014/10/25/install-kernel-3-dot-10-on-centos-6-dot-5/)
So I guess it could be that the patch that introduces the fix was just
not pulled into the CentOS 3.10 kernel?

> 
> Bjorn
> 

Best
 Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off
  2017-01-03  8:04 Jan Rüth
@ 2017-01-03 20:53 ` Bjorn Helgaas
  2017-01-04 11:41   ` Jan Rüth
  0 siblings, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2017-01-03 20:53 UTC (permalink / raw)
  To: Jan Rüth; +Cc: linux-pci

On Tue, Jan 03, 2017 at 09:04:36AM +0100, Jan Rüth wrote:
> This patch fixes a null pointer dereference during PCI bus enumeration
> when ASPM is off. On an IBM x3850 8664 this bug causes the kernel to
> halt, this behavior did not appear in 3.10, so this is a regression.
> 
> pcie_aspm_sanity_check should only be called if ASPM is on.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731

> Signed-off-by: Jan Rueth <rueth@comsys.rwth-aachen.de>
> ---
>  drivers/pci/pcie/aspm.c | +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index f981129..e758b56 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -552,11 +552,12 @@ static struct pcie_link_state
> *alloc_pcie_link_state(struct pci_dev *pdev)
>  void pcie_aspm_init_link_state(struct pci_dev *pdev)
>  {
>         struct pcie_link_state *link;
> -       int blacklist = !!pcie_aspm_sanity_check(pdev);
> -
> +       int blacklist;
>         if (!aspm_support_enabled)
>                 return;
> 
> +       blacklist = !!pcie_aspm_sanity_check(pdev);
> +
>         if (pdev->link_state)
>                 return;

For some reason this doesn't apply for me (complains about a malformed
patch), but I can apply it by hand easily enough.

However, this path is a little obtuse, and I'd like to understand
what's going on better.  We currently have:

  pci_scan_slot
    if (bus->self && nr)
      pcie_aspm_init_link_state(bus->self)
        pcie_aspm_sanity_check
	  list_for_each_entry(child, &pdev->subordinate->devices, ...)

I assume the null pointer is pdev->subordinate, so maybe we're calling
pcie_aspm_init_link_state() for a bridge where we weren't able to
create a child bus ("pdev->subordinate")?

I think it's legal to have a bridge device where we haven't set
pdev->subordinate, but it does seem unusual.

I don't see a log with the actual null pointer dereference in the
bugzilla.  Do you have any more details about which device blows up
here and why it's unusual?

Part of the reason I'm curious is that I'd like to identify the commit
that introduced the problem so we can figure out where the fix might
need to be backported.  I suspected b7206cbf024d ("PCI ASPM: support
partial aspm enablement") because it moved the
pcie_aspm_sanity_check() call before the checks for aspm_disabled, but
that's been around since 2.6 days, and the problem you're seeing
didn't happen until after 3.10.

Bjorn

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off
@ 2017-01-03  8:04 Jan Rüth
  2017-01-03 20:53 ` Bjorn Helgaas
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Rüth @ 2017-01-03  8:04 UTC (permalink / raw)
  To: linux-pci

This patch fixes a null pointer dereference during PCI bus enumeration
when ASPM is off. On an IBM x3850 8664 this bug causes the kernel to
halt, this behavior did not appear in 3.10, so this is a regression.

pcie_aspm_sanity_check should only be called if ASPM is on.

Signed-off-by: Jan Rueth <rueth@comsys.rwth-aachen.de>
---
 drivers/pci/pcie/aspm.c | +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index f981129..e758b56 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -552,11 +552,12 @@ static struct pcie_link_state
*alloc_pcie_link_state(struct pci_dev *pdev)
 void pcie_aspm_init_link_state(struct pci_dev *pdev)
 {
        struct pcie_link_state *link;
-       int blacklist = !!pcie_aspm_sanity_check(pdev);
-
+       int blacklist;
        if (!aspm_support_enabled)
                return;

+       blacklist = !!pcie_aspm_sanity_check(pdev);
+
        if (pdev->link_state)
                return;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-03-07  0:37 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-03  8:09 [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off Jan Rüth
2017-02-03 17:32 ` Bjorn Helgaas
2017-03-07  0:27   ` Bjorn Helgaas
  -- strict thread matches above, loose matches on Subject: below --
2017-01-03  8:04 Jan Rüth
2017-01-03 20:53 ` Bjorn Helgaas
2017-01-04 11:41   ` Jan Rüth
2017-01-19 23:41     ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).