* Re: [RFC][PATCH] PCI: Avoid PCI device removing/rescanning through sysfs triggers a deadlock
2018-11-05 23:25 [RFC][PATCH] PCI: Avoid PCI device removing/rescanning through sysfs triggers a deadlock Marek Vasut
@ 2018-11-06 8:13 ` Geert Uytterhoeven
2018-11-06 9:45 ` Marek Vasut
2018-11-27 22:55 ` Bjorn Helgaas
1 sibling, 1 reply; 4+ messages in thread
From: Geert Uytterhoeven @ 2018-11-06 8:13 UTC (permalink / raw)
To: Marek Vasut; +Cc: linux-pci, Linux-Renesas, tho.vu.wh
Hi Marek,
Thanks for your patch!
On Tue, Nov 6, 2018 at 12:25 AM Marek Vasut <marek.vasut@gmail.com> wrote:
> From: Tho Vu <tho.vu.wh@rvc.renesas.com>
>
> This patch fixes deadlock warning in removing/rescanning through sysfs
> when CONFIG_PROVE_LOCKING is enabled.
>
> The issue can be reproduced by these steps:
> 1. Enable CONFIG_PROVE_LOCKING via defconfig or menuconfig
> 2. Insert Ethernet card into PCIe CH0 and start up.
> After kernel starting up, execute the following command.
> echo 1 > /sys/class/pci_bus/0000\:00/device/0000\:00\:00.0/remove
> 3. Rescan PCI device by this command
> echo 1 > /sys/class/pci_bus/0000\:00/rescan
>
> The deadlock warnings will occur.
> ============================================
> WARNING: possible recursive locking detected
> 4.14.70-ltsi-yocto-standard #27 Not tainted
> --------------------------------------------
> sh/3402 is trying to acquire lock:
> (kn->count#78){++++}, at: kernfs_remove_by_name_ns+0x50/0xa8
>
> but task is already holding lock:
> (kn->count#78){++++}, at: kernfs_remove_self+0xe0/0x130
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(kn->count#78);
> lock(kn->count#78);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 4 locks held by sh/3402:
> #0: (sb_writers#4){.+.+}, at: vfs_write+0x198/0x1b0
> #1: (&of->mutex){+.+.}, at: kernfs_fop_write+0x108/0x210
> #2: (kn->count#78){++++}, at: kernfs_remove_self+0xe0/0x130
> #3: (pci_rescan_remove_lock){+.+.}, at: pci_lock_rescan_remove+0x1c/0x28
>
> stack backtrace:
> CPU: 3 PID: 3402 Comm: sh Not tainted 4.14.70-ltsi-yocto-standard #27
> Hardware name: Renesas Salvator-X 2nd version board based on r8a7795
> ES3.0+ with 8GiB (4 x 2 GiB) (DT)
> Call trace:
> dump_backtrace+0x0/0x3d8
> show_stack+0x14/0x20
> dump_stack+0xbc/0xf4
> __lock_acquire+0x930/0x18a8
> lock_acquire+0x48/0x68
> __kernfs_remove+0x280/0x2f8
> kernfs_remove_by_name_ns+0x50/0xa8
> remove_files.isra.0+0x38/0x78
> sysfs_remove_group+0x4c/0xa0
> sysfs_remove_groups+0x38/0x60
> device_remove_attrs+0x54/0x78
> device_del+0x1ac/0x308
> pci_remove_bus_device+0x78/0xf8
> pci_remove_bus_device+0x34/0xf8
> pci_stop_and_remove_bus_device_locked+0x24/0x38
> remove_store+0x6c/0x78
> dev_attr_store+0x18/0x28
> sysfs_kf_write+0x4c/0x78
> kernfs_fop_write+0x138/0x210
> __vfs_write+0x18/0x118
> vfs_write+0xa4/0x1b0
> SyS_write+0x48/0xb0
>
> This warning occurs due to a self-deletion attribute using in the sysfs
used
> PCI device directory. This kind of attribute is really tricky,
> it does not allow pci framework drop this attribute until all active
to drop
> .show() and .store() callbacks have finished unless
finished, unless
> sysfs_break_active_protection() is called.
> Hence this patch avoids writing into this attribute triggers a deadlock.
and trigger a deadlock.
>
> Referrence commit 5b55b24cec4c ("scsi: core: Avoid that SCSI device
> removal through sysfs triggers a deadlock")
> of scsi driver
>
> Signed-off-by: Tho Vu <tho.vu.wh@rvc.renesas.com>
You forgot to append your own SoB?
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -470,12 +470,22 @@ static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
> const char *buf, size_t count)
> {
> unsigned long val;
> + struct kernfs_node *kn;
> +
> + kn = sysfs_break_active_protection(&dev->kobj, &attr->attr);
> + WARN_ON_ONCE(!kn);
What's the purpose of the WARN_ON_ONCE? Just copied from the SCSI solution?
Can this ever happen?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC][PATCH] PCI: Avoid PCI device removing/rescanning through sysfs triggers a deadlock
2018-11-05 23:25 [RFC][PATCH] PCI: Avoid PCI device removing/rescanning through sysfs triggers a deadlock Marek Vasut
2018-11-06 8:13 ` Geert Uytterhoeven
@ 2018-11-27 22:55 ` Bjorn Helgaas
1 sibling, 0 replies; 4+ messages in thread
From: Bjorn Helgaas @ 2018-11-27 22:55 UTC (permalink / raw)
To: Marek Vasut
Cc: linux-pci, linux-renesas-soc, Tho Vu, Bart Van Assche, Tejun Heo
[+cc Bart, Tejun]
On Tue, Nov 06, 2018 at 12:25:00AM +0100, Marek Vasut wrote:
> From: Tho Vu <tho.vu.wh@rvc.renesas.com>
>
> This patch fixes deadlock warning in removing/rescanning through sysfs
> when CONFIG_PROVE_LOCKING is enabled.
>
> The issue can be reproduced by these steps:
> 1. Enable CONFIG_PROVE_LOCKING via defconfig or menuconfig
> 2. Insert Ethernet card into PCIe CH0 and start up.
> After kernel starting up, execute the following command.
> echo 1 > /sys/class/pci_bus/0000\:00/device/0000\:00\:00.0/remove
> 3. Rescan PCI device by this command
> echo 1 > /sys/class/pci_bus/0000\:00/rescan
>
> The deadlock warnings will occur.
> ============================================
> WARNING: possible recursive locking detected
> 4.14.70-ltsi-yocto-standard #27 Not tainted
> --------------------------------------------
> sh/3402 is trying to acquire lock:
> (kn->count#78){++++}, at: kernfs_remove_by_name_ns+0x50/0xa8
>
> but task is already holding lock:
> (kn->count#78){++++}, at: kernfs_remove_self+0xe0/0x130
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(kn->count#78);
> lock(kn->count#78);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 4 locks held by sh/3402:
> #0: (sb_writers#4){.+.+}, at: vfs_write+0x198/0x1b0
> #1: (&of->mutex){+.+.}, at: kernfs_fop_write+0x108/0x210
> #2: (kn->count#78){++++}, at: kernfs_remove_self+0xe0/0x130
> #3: (pci_rescan_remove_lock){+.+.}, at: pci_lock_rescan_remove+0x1c/0x28
>
> stack backtrace:
> CPU: 3 PID: 3402 Comm: sh Not tainted 4.14.70-ltsi-yocto-standard #27
> Hardware name: Renesas Salvator-X 2nd version board based on r8a7795
> ES3.0+ with 8GiB (4 x 2 GiB) (DT)
> Call trace:
> dump_backtrace+0x0/0x3d8
> show_stack+0x14/0x20
> dump_stack+0xbc/0xf4
> __lock_acquire+0x930/0x18a8
> lock_acquire+0x48/0x68
> __kernfs_remove+0x280/0x2f8
> kernfs_remove_by_name_ns+0x50/0xa8
> remove_files.isra.0+0x38/0x78
> sysfs_remove_group+0x4c/0xa0
> sysfs_remove_groups+0x38/0x60
> device_remove_attrs+0x54/0x78
> device_del+0x1ac/0x308
> pci_remove_bus_device+0x78/0xf8
> pci_remove_bus_device+0x34/0xf8
> pci_stop_and_remove_bus_device_locked+0x24/0x38
> remove_store+0x6c/0x78
> dev_attr_store+0x18/0x28
> sysfs_kf_write+0x4c/0x78
> kernfs_fop_write+0x138/0x210
> __vfs_write+0x18/0x118
> vfs_write+0xa4/0x1b0
> SyS_write+0x48/0xb0
Indent quoted material. I use two spaces.
> This warning occurs due to a self-deletion attribute using in the sysfs
> PCI device directory. This kind of attribute is really tricky,
> it does not allow pci framework drop this attribute until all active
> .show() and .store() callbacks have finished unless
> sysfs_break_active_protection() is called.
> Hence this patch avoids writing into this attribute triggers a deadlock.
Wrap paragraph correctly or, if this is supposed to be two paragraphs,
insert a blank line between them.
Use "PCI" (not "pci") consistently in English text.
> Referrence commit 5b55b24cec4c ("scsi: core: Avoid that SCSI device
> removal through sysfs triggers a deadlock")
> of scsi driver
s/Referrence/Reference/
5b55b24cec4c doesn't exist. I suppose maybe you mean 0ee223b2e1f6
("scsi: core: Avoid that SCSI device removal through sysfs triggers a
deadlock")?
Wrap paragraph correctly.
Use "SCSI" (not "scsi") consistently in English text.
> Signed-off-by: Tho Vu <tho.vu.wh@rvc.renesas.com>
Missing your Signed-off-by (as Geert pointed out).
> ---
> drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 9ecfe13157c0..d522bd8368d9 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -470,12 +470,22 @@ static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
> const char *buf, size_t count)
> {
> unsigned long val;
> + struct kernfs_node *kn;
> +
> + kn = sysfs_break_active_protection(&dev->kobj, &attr->attr);
I'm awfully wary of interfaces with only a single user. That suggests
that there's something very special about this path, and I doubt it's
*that* special. I grant you that sysfs_break_active_protection() is
very new (added by 2afc9166f79b ("scsi: sysfs: Introduce
sysfs_{un,}break_active_protection()")).
> + WARN_ON_ONCE(!kn);
I don't see the point of the WARN_ON_ONCE() (also pointed out already by
Geert).
> if (kstrtoul(buf, 0, &val) < 0)
> return -EINVAL;
This looks wrong because you may return -EINVAL without calling
sysfs_unbreak_active_protection().
There's no point in calling sysfs_break_active_protection() if either
kstrtoul() fails or val is zero, i.e., you could do:
if (kstrtoul(..., &val) < 0)
return -EINVAL;
if (!val)
return count;
sysfs_break_active_protection(...);
device_remove_file(...);
pci_stop_and_remove_bus_device_locked(...);
sysfs_unbreak_active_protection(...);
> - if (val && device_remove_file_self(dev, attr))
> + if (val) {
> + device_remove_file(dev, attr);
> pci_stop_and_remove_bus_device_locked(to_pci_dev(dev));
We need some story about why we should use device_remove_file()
instead of device_remove_file_self().
We also need an explanation for why other callers of
device_remove_file_self() don't need similar changes. They *look*
similar to this case, so we need to look at nvme_sysfs_delete(),
dcssblk_shared_store(), vfio remove_store(), s390 recover_store(),
etc, and explain why they are correct and this one is incorrect. Or,
if they're all wrong, we need to fix them all.
Is there a way we can avoid this self-deletion situation by doing the
deletion via a workqueue item that is scheduled by remove_store() but
not executed in its context?
> + }
> +
> + if (kn)
> + sysfs_unbreak_active_protection(kn);
> +
> return count;
> }
> static struct device_attribute dev_remove_attr = __ATTR(remove,
> @@ -487,11 +497,15 @@ static ssize_t dev_bus_rescan_store(struct device *dev,
> const char *buf, size_t count)
> {
> unsigned long val;
> + struct kernfs_node *kn;
> struct pci_bus *bus = to_pci_bus(dev);
>
> if (kstrtoul(buf, 0, &val) < 0)
> return -EINVAL;
>
> + kn = sysfs_break_active_protection(&dev->kobj, &attr->attr);
> + WARN_ON_ONCE(!kn);
> +
> if (val) {
> pci_lock_rescan_remove();
> if (!pci_is_root_bus(bus) && list_empty(&bus->devices))
> @@ -500,6 +514,10 @@ static ssize_t dev_bus_rescan_store(struct device *dev,
> pci_rescan_bus(bus);
> pci_unlock_rescan_remove();
> }
> +
> + if (kn)
> + sysfs_unbreak_active_protection(kn);
What's the purpose of this dev_bus_rescan_store() change? There's no
remove here, and the changelog doesn't mention this path.
> return count;
> }
> static DEVICE_ATTR(rescan, (S_IWUSR|S_IWGRP), NULL, dev_bus_rescan_store);
> --
> 2.18.0
>
^ permalink raw reply [flat|nested] 4+ messages in thread