From: Lukas Wunner <lukas@wunner.de>
To: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Keith Busch <keith.busch@intel.com>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Sinan Kaya <okaya@kernel.org>,
	Kai Heng Feng <kai.heng.feng@canonical.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] PCI: pciehp: Prevent deadlock on disconnect
Date: Mon, 5 Aug 2019 13:18:54 +0200
Message-ID: <20190805111854.al5bj3q2gdng5ai6@wunner.de>
In-Reply-To: <20190618125051.2382-2-mika.westerberg@linux.intel.com>

On Tue, Jun 18, 2019 at 03:50:51PM +0300, Mika Westerberg wrote:
> If there is more than one PCIe switch with hotplug downstream ports,
> hot-removing them leads to the following deadlock:
[...]
> What happens here is that the whole hierarchy is runtime resumed and the
> parent PCIe downstream port, who got the hot-remove event, starts
> removing devices below it taking pci_lock_rescan_remove() lock. When the
> child PCIe port is runtime resumed it calls pciehp_check_presence()
> which ends up calling pciehp_card_present() and pciehp_check_link_active().
> Both of these read their parts of PCIe config space by calling helper
> function pcie_capability_read_word(). Now, this function notices that
> the underlying device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND
> with the capability value set to 0. When pciehp gets this value it
> thinks that its child device is also hot-removed and schedules its IRQ
> thread to handle the event.
> 
> The deadlock happens when the child's IRQ thread runs and tries to
> acquire pci_lock_rescan_remove() which is already taken by the parent
> and the parent waits for the child's IRQ thread to finish.
> 
> We can prevent this from happening by checking the return value of
> pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop
> performing any hot-removal activities.

IIUC this patch only avoids the deadlock if the child hotplug port happens
to be runtime suspended when it is surprise removed.  The deadlock isn't
avoided if it is runtime resumed.
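
For reference, the return-value check described in the quoted commit message
can be modelled in plain userspace C roughly as follows.  This is only a
sketch, not the patch itself: struct mock_port and all mock_* functions are
made up for illustration, and only the two constants mirror values from the
kernel headers.  It shows how a failed read differs from a genuinely empty
slot once the return value is looked at.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* These two values mirror the kernel's definitions. */
#define PCIBIOS_SUCCESSFUL        0x00
#define PCIBIOS_DEVICE_NOT_FOUND  0x86
#define PCI_EXP_SLTSTA_PDS        0x0040  /* Presence Detect State */

/* Hypothetical stand-in for a hotplug port; not a kernel structure. */
struct mock_port {
	bool     gone;    /* the port itself has been surprise removed */
	uint16_t sltsta;  /* simulated Slot Status register */
};

/*
 * Assumed behaviour, taken from the description above: reading from a
 * vanished device zeroes the value and returns PCIBIOS_DEVICE_NOT_FOUND.
 */
static int mock_capability_read_word(const struct mock_port *p, uint16_t *val)
{
	if (p->gone) {
		*val = 0;
		return PCIBIOS_DEVICE_NOT_FOUND;
	}
	*val = p->sltsta;
	return PCIBIOS_SUCCESSFUL;
}

/* Returns 1 if the slot is occupied, 0 if empty, -ENODEV if the port is gone. */
static int mock_card_present(const struct mock_port *p)
{
	uint16_t sltsta;
	int ret = mock_capability_read_word(p, &sltsta);

	if (ret == PCIBIOS_DEVICE_NOT_FOUND)
		return -ENODEV;
	return !!(sltsta & PCI_EXP_SLTSTA_PDS);
}

/* Only a readable, empty slot counts as a hot-removal of the child. */
static bool mock_should_schedule_removal(const struct mock_port *p)
{
	return mock_card_present(p) == 0;
}
```

Without the -ENODEV branch, a vanished port and an empty slot both read as 0,
which is the misinterpretation the commit message describes.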

This patch I posted last year should cover both cases:

https://patchwork.kernel.org/patch/10468065/

However, as I've noted in this follow-up to the patch, I don't consider
my solution a proper fix either:

https://patchwork.kernel.org/patch/10468065/#22206721

Rather, the problem should be addressed by unbinding PCI drivers without
holding pci_lock_rescan_remove().
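
To illustrate the lock ordering at issue, here is a single-threaded toy model.
It is not kernel code: struct toy_lock and the toy_* functions are invented
for this sketch, and a failed trylock stands in for "would block forever".
Which steps really have to stay under the lock is an open question that the
model does not answer.

```c
#include <stdbool.h>

/* Toy stand-in for the global rescan/remove lock; not a real mutex. */
struct toy_lock {
	bool held;
};

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

/* The child's IRQ thread can only finish once it gets the lock. */
static bool toy_child_irq_thread(struct toy_lock *l)
{
	if (!toy_trylock(l))
		return false;  /* in the real scenario it would block here */
	toy_unlock(l);
	return true;
}

/* Parent waits for the child while still holding the lock. */
static bool toy_remove_holding_lock(struct toy_lock *l)
{
	bool child_done;

	toy_trylock(l);
	child_done = toy_child_irq_thread(l);  /* unbind waits for this */
	toy_unlock(l);
	return child_done;
}

/* Parent drops the lock before it waits for the child. */
static bool toy_remove_dropping_lock(struct toy_lock *l)
{
	toy_trylock(l);
	/* ... work that genuinely needs the lock would go here ... */
	toy_unlock(l);
	return toy_child_irq_thread(l);  /* unbind happens unlocked */
}
```

In the first variant the child never completes, which corresponds to the
deadlock; in the second it does, because the wait happens after the unlock.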

I'm truly sorry but I haven't been able to make much progress on this
as I got caught up with other things.  Part of the problem is that this
is volunteer work.  Maybe someone's interested in hiring me to work on it?
Resume available on request.  (But I'll get to it sooner or later whether
paid or not, unless someone else beats me to it. :-) )

Thanks,

Lukas
