linux-kernel.vger.kernel.org archive mirror
From: Bjorn Helgaas <bhelgaas@google.com>
To: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>,
	Andreas Noever <andreas.noever@gmail.com>,
	Matthew Garrett <mjg59@srcf.ucam.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [3.11.4] Thunderbolt/PCI unplug oops in pci_pme_list_scan
Date: Mon, 18 Nov 2013 18:33:43 -0700
Message-ID: <20131119013343.GA17294@google.com>
In-Reply-To: <20131115115235.GA2281@intel.com>

On Fri, Nov 15, 2013 at 01:52:35PM +0200, Mika Westerberg wrote:
> On Thu, Oct 24, 2013 at 09:33:50PM -0600, Bjorn Helgaas wrote:
> > On Wed, Oct 23, 2013 at 11:53 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> > > On Tue, Oct 22, 2013 at 8:32 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > >> On Thu, Oct 17, 2013 at 7:59 AM, Andreas Noever <andreas.noever@gmail.com> wrote:
> > >>> On Wed, Oct 16, 2013 at 10:21 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > >>>> On Tue, Oct 15, 2013 at 03:44:52AM +0100, Matthew Garrett wrote:
> > >>>>> On Mon, Oct 14, 2013 at 05:50:38PM -0600, Bjorn Helgaas wrote:
> > >>>>> > On Mon, Oct 14, 2013 at 4:47 PM, Andreas Noever <andreas.noever@gmail.com> wrote:
> > >>>>> > > When I unplug the Thunderbolt ethernet adapter on my MacBookPro Linux
> > >>>>> > > crashes a few seconds later. Using
> > >>>>> > > echo 1 > /sys/bus/pci/devices/0000:08:00.0/remove
> > >>>>> > > to remove a bridge two levels above the device triggers the fault immediately:
> > 
> > >>>> We save a pci_dev pointer in the pci_pme_list, which of course has a
> > >>>> longer lifetime than the pci_dev itself, but we don't acquire a reference
> > >>>> on it, so I suspect the pci_dev got released before we got around to
> > >>>> doing the pci_pme_list_scan().
> > >>>>
> > >>>> Andreas, can you try the patch below?  It's against v3.12-rc2, but it
> > >>>> should apply to v3.11, too.
> > >>>
> > >>> I have tested your patch against 3.11 where it solves the problem. Thanks!
> > >>>
> > >>> Unfortunately I could not reproduce the problem in 3.12-rc5. I only
> > >>> get the following warning (and no crash):
> > >>>
> > >>> tg3 0000:0a:00.0: PME# disabled
> > >>> pcieport 0000:09:00.0: PME# disabled
> > >>> pciehp 0000:09:00.0:pcie24: unloading service driver pciehp
> > >>> pci_bus 0000:0a: dev 00, dec refcount to 0
> > >>> pci_bus 0000:0a: dev 00, released physical slot 9
> > >>> ------------[ cut here ]------------
> > >>> WARNING: CPU: 0 PID: 122 at drivers/pci/pci.c:1430
> > >>> pci_disable_device+0x84/0x90()
> > >>> Device pcieport
> > >>> disabling already-disabled device
> > >>> ...
> > 
> > >>> Bisection points to 928bea964827d7824b548c1f8e06eccbbc4d0d7d .
> > >>
> > >> This is "PCI: Delay enabling bridges until they're needed" by Yinghai.
> > >
> > > that double disabling should be addressed by:
> > >
> > > https://lkml.org/lkml/2013/4/25/608
> > >
> > > [PATCH] PCI: Remove duplicate pci_disable_device for pcie port
> > 
> > I'll look at that patch again.  I had some questions about it the
> > first time, but perhaps it makes more sense after 928bea9648 has been
> > applied.
> 
> Bjorn,
> 
> Are there any plans to apply the above patch?
> 
> I'm seeing that warning on all my TBT test machines:
> 
> [  122.914180] pcieport 0000:06:05.0: PME# disabled
> [  122.915386] ------------[ cut here ]------------
> [  122.916513] WARNING: CPU: 0 PID: 1060 at drivers/pci/pci.c:1430 pci_disable_device+0x7c/0x90()
> [  122.917589] Device pcieport
> [  122.917589] disabling already-disabled device

I fixed the changelog (the extra disable was actually added by d899871936,
not by dc5351784e) and put the patch below in my for-linus branch.  I'll
ask Linus to pull it later this week.

Sorry for the delay, and thanks for the reminder.

Bjorn


PCI: Remove duplicate pci_disable_device() from pcie_portdrv_remove()

From: Yinghai Lu <yinghai@kernel.org>

The pcie_portdrv .probe() method calls pci_enable_device() once, in
pcie_port_device_register(), but the .remove() method calls
pci_disable_device() twice, in pcie_port_device_remove() and in
pcie_portdrv_remove().

That causes a "disabling already-disabled device" warning when removing a
PCIe port device.  This happens all the time when removing Thunderbolt
devices, but is also easy to reproduce with, e.g.,
"echo 0000:00:1c.3 > /sys/bus/pci/drivers/pcieport/unbind".

This patch removes the disable from pcie_portdrv_remove().

[bhelgaas: changelog, tag for stable]
Reported-by: David Bulkow <David.Bulkow@stratus.com>
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v2.6.32+
---
 drivers/pci/pcie/portdrv_pci.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index cd1e57e51aa7..0d8fdc48e642 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -223,7 +223,6 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
 static void pcie_portdrv_remove(struct pci_dev *dev)
 {
 	pcie_port_device_remove(dev);
-	pci_disable_device(dev);
 }
 
 static int error_detected_iter(struct device *device, void *data)
