From: Lukas Wunner <lukas@wunner.de>
To: "Haeuptle, Michael" <michael.haeuptle@hpe.com>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"michaelhaeuptle@gmail.com" <michaelhaeuptle@gmail.com>,
Christoph Hellwig <hch@lst.de>
Subject: Re: Deadlock during PCIe hot remove
Date: Sun, 29 Mar 2020 17:43:52 +0200
Message-ID: <20200329154352.5lxbtlf3464sm4ce@wunner.de>
In-Reply-To: <CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@CS1PR8401MB0728.NAMPRD84.PROD.OUTLOOK.COM>

On Tue, Mar 24, 2020 at 03:21:52PM +0000, Haeuptle, Michael wrote:
> I'm running into a deadlock scenario between the hotplug, pcie and
> vfio_pci driver when removing multiple devices in parallel.
> This is happening on CentOS8 (4.18) with SPDK (spdk.io). I'm using
> the latest pciehp code, the rest is all 4.18.
>
> The sequence that leads to the deadlock is as follows:
>
> The pciehp_ist() takes the reset_lock early in its processing.
> While the pciehp_ist processing is progressing, vfio_pci calls
> pci_try_reset_function() as part of vfio_pci_release or open.
> The pci_try_reset_function() takes the device lock.
>
> Eventually, pci_try_reset_function() calls pci_reset_hotplug_slot()
> which calls pciehp_reset_slot(). The pciehp_reset_slot() tries to
> take the reset_lock but has to wait since it is already taken by
> pciehp_ist().
>
> Eventually pciehp_ist calls pcie_stop_device() which calls
> device_release_driver_internal(). This function also tries to take
> device_lock causing the dead lock.
>
> Here's the kernel stack trace when the deadlock occurs:
>
> [root@localhost ~]# cat /proc/8594/task/8598/stack
> [<0>] pciehp_reset_slot+0xa5/0x220
> [<0>] pci_reset_hotplug_slot.cold.72+0x20/0x36
> [<0>] pci_dev_reset_slot_function+0x72/0x9b
> [<0>] __pci_reset_function_locked+0x15b/0x190
> [<0>] pci_try_reset_function.cold.77+0x9b/0x108
> [<0>] vfio_pci_disable+0x261/0x280
> [<0>] vfio_pci_release+0xcb/0xf0
> [<0>] vfio_device_fops_release+0x1e/0x40
> [<0>] __fput+0xa5/0x1d0
> [<0>] task_work_run+0x8a/0xb0
> [<0>] exit_to_usermode_loop+0xd3/0xe0
> [<0>] do_syscall_64+0xe1/0x100
> [<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
> [<0>] 0xffffffffffffffff

There's something I don't understand here:

The device_lock exists per device.
The reset_lock exists per hotplug-capable downstream port.

Now if I understand correctly, in the stack trace above, the
device_lock of the *downstream port* has been acquired and the
task is then waiting for that port's reset_lock, which is already
held by pciehp_ist().

You're saying that pciehp_ist() is handling removal of the
endpoint device *below* the downstream port, which means that
device_release_driver_internal() tries to acquire the device_lock
of the *endpoint device*. That's a different lock from the one
acquired by vfio_pci_disable() before calling
pci_try_reset_function()!

So I don't quite understand how there can be a deadlock. Could
you instrument the code with a few printk() and dump_stack() calls
to show exactly which device's device_lock is acquired from where?
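
The reasoning above can be written down as a small user-space sketch
(not kernel code) that checks two lock acquisition orders for an ABBA
inversion. The lock labels and the names lock_path, index_of(),
abba_possible() and lock_order_example() are made up for illustration.
The acquisition orders are assumptions taken from the description in
this message; they have not been verified on the affected system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A path lists lock labels in the order in which a task acquires them. */
struct lock_path {
	const char *locks[4];
	size_t n;
};

/* Position of a lock label in a path, or -1 if the path never takes it. */
static int index_of(const struct lock_path *p, const char *name)
{
	for (size_t i = 0; i < p->n; i++)
		if (strcmp(p->locks[i], name) == 0)
			return (int)i;
	return -1;
}

/* True if some pair of locks is taken in opposite order by a and b. */
static bool abba_possible(const struct lock_path *a, const struct lock_path *b)
{
	for (size_t i = 0; i < a->n; i++) {
		for (size_t j = i + 1; j < a->n; j++) {
			int bi = index_of(b, a->locks[i]);
			int bj = index_of(b, a->locks[j]);

			if (bi >= 0 && bj >= 0 && bj < bi)
				return true;
		}
	}
	return false;
}

void lock_order_example(void)
{
	/* Assumed order on the vfio_pci release path. */
	struct lock_path vfio = {
		{ "device_lock(port)", "reset_lock(port)" }, 2
	};
	/* Assumed order in pciehp_ist() when only the endpoint is locked. */
	struct lock_path ist = {
		{ "reset_lock(port)", "device_lock(endpoint)" }, 2
	};
	/* Same path if the parent's device_lock were taken as well. */
	struct lock_path ist_parent = {
		{ "reset_lock(port)", "device_lock(port)",
		  "device_lock(endpoint)" }, 3
	};

	assert(!abba_possible(&vfio, &ist));
	assert(abba_possible(&vfio, &ist_parent));
}
```

Under these assumptions, an inversion only shows up once the removal
path also takes the downstream port's device_lock.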

Note that device_release_driver_internal() also acquires the
parent's device_lock, which would indeed be that of the
downstream port. However, commit 8c97a46af04b ("driver core:
hold dev's parent lock when needed") constrained that to
USB devices, so the parent lock shouldn't be taken for PCI
devices. That commit went into v4.18; please double-check
that you have it in your tree.
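
As a rough user-space model of what that commit changes: the parent's
lock is only taken when the device's bus asks for it. The structures
model_bus and model_device and the function parent_locked_on_release()
are simplified stand-ins invented for this sketch, not the kernel
definitions; the need_parent_lock flag mirrors the per-bus flag of the
same name, and the assumption is that only the USB bus sets it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a bus type; not the kernel's struct bus_type. */
struct model_bus {
	const char *name;
	bool need_parent_lock;
};

/* Simplified stand-in for a device with an optional parent. */
struct model_device {
	const char *name;
	const struct model_bus *bus;
	const struct model_device *parent;
};

/*
 * Returns the parent whose lock would be taken in addition to the
 * device's own lock when its driver is released, or NULL if none.
 */
static const struct model_device *
parent_locked_on_release(const struct model_device *dev)
{
	if (dev->parent && dev->bus->need_parent_lock)
		return dev->parent;
	return NULL;
}

void parent_lock_example(void)
{
	struct model_bus pci = { "pci", false };
	struct model_bus usb = { "usb", true };

	struct model_device port = { "downstream port", &pci, NULL };
	struct model_device endpoint = { "endpoint", &pci, &port };

	struct model_device hub = { "hub", &usb, NULL };
	struct model_device usb_dev = { "usb device", &usb, &hub };

	/* PCI: the downstream port's lock is left alone. */
	assert(parent_locked_on_release(&endpoint) == NULL);
	/* USB: the parent hub's lock is taken too. */
	assert(parent_locked_on_release(&usb_dev) == &hub);
}
```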

Thanks,

Lukas