linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drivers/base: use a worker for sysfs unbind
@ 2018-12-10  8:46 Daniel Vetter
  2018-12-10 10:06 ` Greg Kroah-Hartman
  2018-12-13  9:38 ` Rafael J. Wysocki
  0 siblings, 2 replies; 17+ messages in thread
From: Daniel Vetter @ 2018-12-10  8:46 UTC (permalink / raw)
  To: LKML
  Cc: DRI Development, Daniel Vetter, Ramalingam C, Greg Kroah-Hartman,
	Rafael J. Wysocki, Daniel Vetter

Drivers might want to remove some sysfs files, which needs the same
locks and ends up angering lockdep. Relevant snippet of the stack
trace:

  kernfs_remove_by_name_ns+0x3b/0x80
  bus_remove_driver+0x92/0xa0
  acpi_video_unregister+0x24/0x40
  i915_driver_unload+0x42/0x130 [i915]
  i915_pci_remove+0x19/0x30 [i915]
  pci_device_remove+0x36/0xb0
  device_release_driver_internal+0x185/0x250
  unbind_store+0xaf/0x180
  kernfs_fop_write+0x104/0x190

I've stumbled over this because some new patches by Ram connect the
snd-hda-intel unload (where we do use sysfs unbind) with the locking
chains in the i915 unload code (but without creating a new loop),
which upset our CI. But the bug is already there and can be easily
reproduced by unbind i915 directly.

No idea whether this is the correct place to fix this, should at least
get CI happy again. Also not sure whether we should do the same on the
bind side, there we have the additional complication that the current
code forwards the driver load errno.

Note that the bus locking is already done by
device_release_driver_internal (if you give it the parent), so I
dropped that part. Also note that we don't recheck that the device is
still bound by the same driver, but neither does the current code do
that without races. And I figured that's a obscure enough corner case
to not bother.

v2: Use a task work. An entirely async work leads to impressive
fireworks in our CI, notably in the vtcon bind/unbind code. Task work
will be as synchronous as the current code, and so keep all these
preexisting races neatly tugged under the rug.

Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/base/bus.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 8bfd27ec73d6..095c4a140d76 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -17,6 +17,7 @@
 #include <linux/string.h>
 #include <linux/mutex.h>
 #include <linux/sysfs.h>
+#include <linux/task_work.h>
 #include "base.h"
 #include "power/power.h"
 
@@ -174,22 +175,44 @@ static const struct kset_uevent_ops bus_uevent_ops = {
 
 static struct kset *bus_kset;
 
+struct unbind_work {
+	struct callback_head twork;
+	struct device *dev;
+};
+
+void unbind_work_fn(struct callback_head *work)
+{
+	struct unbind_work *unbind_work =
+		container_of(work, struct unbind_work, twork);
+
+	device_release_driver_internal(unbind_work->dev, NULL,
+				       unbind_work->dev->parent);
+	put_device(unbind_work->dev);
+	kfree(unbind_work);
+}
+
 /* Manually detach a device from its associated driver. */
 static ssize_t unbind_store(struct device_driver *drv, const char *buf,
 			    size_t count)
 {
 	struct bus_type *bus = bus_get(drv->bus);
+	struct unbind_work *unbind_work;
 	struct device *dev;
 	int err = -ENODEV;
 
 	dev = bus_find_device_by_name(bus, NULL, buf);
 	if (dev && dev->driver == drv) {
-		if (dev->parent && dev->bus->need_parent_lock)
-			device_lock(dev->parent);
-		device_release_driver(dev);
-		if (dev->parent && dev->bus->need_parent_lock)
-			device_unlock(dev->parent);
-		err = count;
+		unbind_work = kmalloc(sizeof(*unbind_work), GFP_KERNEL);
+		if (unbind_work) {
+			unbind_work->dev = dev;
+			get_device(dev);
+			init_task_work(&unbind_work->twork, unbind_work_fn);
+			task_work_add(current, &unbind_work->twork, true);
+
+			err = count;
+		} else {
+			err = -ENOMEM;
+		}
 	}
 	put_device(dev);
 	bus_put(bus);
-- 
2.20.0.rc1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-10  8:46 [PATCH] drivers/base: use a worker for sysfs unbind Daniel Vetter
@ 2018-12-10 10:06 ` Greg Kroah-Hartman
  2018-12-10 10:18   ` Daniel Vetter
  2018-12-13  9:38 ` Rafael J. Wysocki
  1 sibling, 1 reply; 17+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-10 10:06 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, DRI Development, Ramalingam C, Rafael J. Wysocki, Daniel Vetter

On Mon, Dec 10, 2018 at 09:46:53AM +0100, Daniel Vetter wrote:
> Drivers might want to remove some sysfs files, which needs the same
> locks and ends up angering lockdep. Relevant snippet of the stack
> trace:
> 
>   kernfs_remove_by_name_ns+0x3b/0x80
>   bus_remove_driver+0x92/0xa0
>   acpi_video_unregister+0x24/0x40
>   i915_driver_unload+0x42/0x130 [i915]
>   i915_pci_remove+0x19/0x30 [i915]
>   pci_device_remove+0x36/0xb0
>   device_release_driver_internal+0x185/0x250
>   unbind_store+0xaf/0x180
>   kernfs_fop_write+0x104/0x190
> 
> I've stumbled over this because some new patches by Ram connect the
> snd-hda-intel unload (where we do use sysfs unbind) with the locking
> chains in the i915 unload code (but without creating a new loop),
> which upset our CI. But the bug is already there and can be easily
> reproduced by unbind i915 directly.

This is odd, why wouldn't any driver hit this issue?  And why now since
you say this is triggerable today?

I know scsi was doing some strange things like trying to remove the
device itself from a sysfs callback on the device, which requires it to
just call a different kobject function created just for that type of
thing.  Would that also make sense to do here instead of your workqueue?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-10 10:06 ` Greg Kroah-Hartman
@ 2018-12-10 10:18   ` Daniel Vetter
  2018-12-10 10:20     ` Daniel Vetter
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Vetter @ 2018-12-10 10:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Daniel Vetter, LKML, DRI Development, Ramalingam C,
	Rafael J. Wysocki, Daniel Vetter

On Mon, Dec 10, 2018 at 11:06:34AM +0100, Greg Kroah-Hartman wrote:
> On Mon, Dec 10, 2018 at 09:46:53AM +0100, Daniel Vetter wrote:
> > Drivers might want to remove some sysfs files, which needs the same
> > locks and ends up angering lockdep. Relevant snippet of the stack
> > trace:
> > 
> >   kernfs_remove_by_name_ns+0x3b/0x80
> >   bus_remove_driver+0x92/0xa0
> >   acpi_video_unregister+0x24/0x40
> >   i915_driver_unload+0x42/0x130 [i915]
> >   i915_pci_remove+0x19/0x30 [i915]
> >   pci_device_remove+0x36/0xb0
> >   device_release_driver_internal+0x185/0x250
> >   unbind_store+0xaf/0x180
> >   kernfs_fop_write+0x104/0x190
> > 
> > I've stumbled over this because some new patches by Ram connect the
> > snd-hda-intel unload (where we do use sysfs unbind) with the locking
> > chains in the i915 unload code (but without creating a new loop),
> > which upset our CI. But the bug is already there and can be easily
> > reproduced by unbind i915 directly.
> 
> This is odd, why wouldn't any driver hit this issue?  And why now since
> you say this is triggerable today?

The above backtrace is triggered by unbinding i915 on current upstream
kernels. Note: Will crash later on rather badly in the
fbdev/fbcon/vtconsole hell, but that's separate issue (which can be worked
around by first unbinding fbcon manually through sysfs).

> I know scsi was doing some strange things like trying to remove the
> device itself from a sysfs callback on the device, which requires it to
> just call a different kobject function created just for that type of
> thing.  Would that also make sense to do here instead of your workqueue?

Note how we blow up on unregistering sw device instances supported by i915
in entirely different subsystems. I guess most drivers just have sysfs
files for their own stuff, where this is done as you describe. The problem
is that there's an awful lot of unrelated stuff hanging off i915.

Or maybe acpi_video is busted, and should be using a different function.
You haven't said which one, and I have no idea which one it is ...

And in case the context wasn't clear: This is unbinding the i915 pci
driver which triggers the above lockdep splat recursion.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-10 10:18   ` Daniel Vetter
@ 2018-12-10 10:20     ` Daniel Vetter
  2018-12-12 11:08       ` Daniel Vetter
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Vetter @ 2018-12-10 10:20 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Daniel Vetter, LKML, DRI Development, Ramalingam C,
	Rafael J. Wysocki, Daniel Vetter

On Mon, Dec 10, 2018 at 11:18:32AM +0100, Daniel Vetter wrote:
> On Mon, Dec 10, 2018 at 11:06:34AM +0100, Greg Kroah-Hartman wrote:
> > On Mon, Dec 10, 2018 at 09:46:53AM +0100, Daniel Vetter wrote:
> > > Drivers might want to remove some sysfs files, which needs the same
> > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > trace:
> > > 
> > >   kernfs_remove_by_name_ns+0x3b/0x80
> > >   bus_remove_driver+0x92/0xa0
> > >   acpi_video_unregister+0x24/0x40
> > >   i915_driver_unload+0x42/0x130 [i915]
> > >   i915_pci_remove+0x19/0x30 [i915]
> > >   pci_device_remove+0x36/0xb0
> > >   device_release_driver_internal+0x185/0x250
> > >   unbind_store+0xaf/0x180
> > >   kernfs_fop_write+0x104/0x190
> > > 
> > > I've stumbled over this because some new patches by Ram connect the
> > > snd-hda-intel unload (where we do use sysfs unbind) with the locking
> > > chains in the i915 unload code (but without creating a new loop),
> > > which upset our CI. But the bug is already there and can be easily
> > > reproduced by unbind i915 directly.
> > 
> > This is odd, why wouldn't any driver hit this issue?  And why now since
> > you say this is triggerable today?
> 
> The above backtrace is triggered by unbinding i915 on current upstream
> kernels. Note: Will crash later on rather badly in the
> fbdev/fbcon/vtconsole hell, but that's separate issue (which can be worked
> around by first unbinding fbcon manually through sysfs).
> 
> > I know scsi was doing some strange things like trying to remove the
> > device itself from a sysfs callback on the device, which requires it to
> > just call a different kobject function created just for that type of
> > thing.  Would that also make sense to do here instead of your workqueue?
> 
> Note how we blow up on unregistering sw device instances supported by i915
> in entirely different subsystems. I guess most drivers just have sysfs
> files for their own stuff, where this is done as you describe. The problem
> is that there's an awful lot of unrelated stuff hanging off i915.
> 
> Or maybe acpi_video is busted, and should be using a different function.
> You haven't said which one, and I have no idea which one it is ...
> 
> And in case the context wasn't clear: This is unbinding the i915 pci
> driver which triggers the above lockdep splat recursion.

btw another option for "fixing" this would be to annotate the mutex_lock
in kernfs_remove_by_name_ns as recursive. Which just shuts up lockdep (and
might hide some real bugs), but would get the job done since there's not
actually a deadlock here. Just lockdep being annoyed.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-10 10:20     ` Daniel Vetter
@ 2018-12-12 11:08       ` Daniel Vetter
  2018-12-12 11:19         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Vetter @ 2018-12-12 11:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman, LKML, DRI Development, Ramalingam C,
	Rafael J. Wysocki, Daniel Vetter

On Mon, Dec 10, 2018 at 11:20:58AM +0100, Daniel Vetter wrote:
> On Mon, Dec 10, 2018 at 11:18:32AM +0100, Daniel Vetter wrote:
> > On Mon, Dec 10, 2018 at 11:06:34AM +0100, Greg Kroah-Hartman wrote:
> > > On Mon, Dec 10, 2018 at 09:46:53AM +0100, Daniel Vetter wrote:
> > > > Drivers might want to remove some sysfs files, which needs the same
> > > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > > trace:
> > > > 
> > > >   kernfs_remove_by_name_ns+0x3b/0x80
> > > >   bus_remove_driver+0x92/0xa0
> > > >   acpi_video_unregister+0x24/0x40
> > > >   i915_driver_unload+0x42/0x130 [i915]
> > > >   i915_pci_remove+0x19/0x30 [i915]
> > > >   pci_device_remove+0x36/0xb0
> > > >   device_release_driver_internal+0x185/0x250
> > > >   unbind_store+0xaf/0x180
> > > >   kernfs_fop_write+0x104/0x190
> > > > 
> > > > I've stumbled over this because some new patches by Ram connect the
> > > > snd-hda-intel unload (where we do use sysfs unbind) with the locking
> > > > chains in the i915 unload code (but without creating a new loop),
> > > > which upset our CI. But the bug is already there and can be easily
> > > > reproduced by unbind i915 directly.
> > > 
> > > This is odd, why wouldn't any driver hit this issue?  And why now since
> > > you say this is triggerable today?
> > 
> > The above backtrace is triggered by unbinding i915 on current upstream
> > kernels. Note: Will crash later on rather badly in the
> > fbdev/fbcon/vtconsole hell, but that's separate issue (which can be worked
> > around by first unbinding fbcon manually through sysfs).
> > 
> > > I know scsi was doing some strange things like trying to remove the
> > > device itself from a sysfs callback on the device, which requires it to
> > > just call a different kobject function created just for that type of
> > > thing.  Would that also make sense to do here instead of your workqueue?
> > 
> > Note how we blow up on unregistering sw device instances supported by i915
> > in entirely different subsystems. I guess most drivers just have sysfs
> > files for their own stuff, where this is done as you describe. The problem
> > is that there's an awful lot of unrelated stuff hanging off i915.
> > 
> > Or maybe acpi_video is busted, and should be using a different function.
> > You haven't said which one, and I have no idea which one it is ...
> > 
> > And in case the context wasn't clear: This is unbinding the i915 pci
> > driver which triggers the above lockdep splat recursion.
> 
> btw another option for "fixing" this would be to annotate the mutex_lock
> in kernfs_remove_by_name_ns as recursive. Which just shuts up lockdep (and
> might hide some real bugs), but would get the job done since there's not
> actually a deadlock here. Just lockdep being annoyed.

So what's the pick? I can do the typing, but I don't understand all the
driver core interactions to know what we should be doing here best.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-12 11:08       ` Daniel Vetter
@ 2018-12-12 11:19         ` Greg Kroah-Hartman
  2018-12-12 12:40           ` Daniel Vetter
  0 siblings, 1 reply; 17+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-12 11:19 UTC (permalink / raw)
  To: LKML, DRI Development, Ramalingam C, Rafael J. Wysocki, Daniel Vetter

On Wed, Dec 12, 2018 at 12:08:40PM +0100, Daniel Vetter wrote:
> On Mon, Dec 10, 2018 at 11:20:58AM +0100, Daniel Vetter wrote:
> > On Mon, Dec 10, 2018 at 11:18:32AM +0100, Daniel Vetter wrote:
> > > On Mon, Dec 10, 2018 at 11:06:34AM +0100, Greg Kroah-Hartman wrote:
> > > > On Mon, Dec 10, 2018 at 09:46:53AM +0100, Daniel Vetter wrote:
> > > > > Drivers might want to remove some sysfs files, which needs the same
> > > > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > > > trace:
> > > > > 
> > > > >   kernfs_remove_by_name_ns+0x3b/0x80
> > > > >   bus_remove_driver+0x92/0xa0
> > > > >   acpi_video_unregister+0x24/0x40
> > > > >   i915_driver_unload+0x42/0x130 [i915]
> > > > >   i915_pci_remove+0x19/0x30 [i915]
> > > > >   pci_device_remove+0x36/0xb0
> > > > >   device_release_driver_internal+0x185/0x250
> > > > >   unbind_store+0xaf/0x180
> > > > >   kernfs_fop_write+0x104/0x190
> > > > > 
> > > > > I've stumbled over this because some new patches by Ram connect the
> > > > > snd-hda-intel unload (where we do use sysfs unbind) with the locking
> > > > > chains in the i915 unload code (but without creating a new loop),
> > > > > which upset our CI. But the bug is already there and can be easily
> > > > > reproduced by unbind i915 directly.
> > > > 
> > > > This is odd, why wouldn't any driver hit this issue?  And why now since
> > > > you say this is triggerable today?
> > > 
> > > The above backtrace is triggered by unbinding i915 on current upstream
> > > kernels. Note: Will crash later on rather badly in the
> > > fbdev/fbcon/vtconsole hell, but that's separate issue (which can be worked
> > > around by first unbinding fbcon manually through sysfs).
> > > 
> > > > I know scsi was doing some strange things like trying to remove the
> > > > device itself from a sysfs callback on the device, which requires it to
> > > > just call a different kobject function created just for that type of
> > > > thing.  Would that also make sense to do here instead of your workqueue?
> > > 
> > > Note how we blow up on unregistering sw device instances supported by i915
> > > in entirely different subsystems. I guess most drivers just have sysfs
> > > files for their own stuff, where this is done as you describe. The problem
> > > is that there's an awful lot of unrelated stuff hanging off i915.
> > > 
> > > Or maybe acpi_video is busted, and should be using a different function.
> > > You haven't said which one, and I have no idea which one it is ...
> > > 
> > > And in case the context wasn't clear: This is unbinding the i915 pci
> > > driver which triggers the above lockdep splat recursion.
> > 
> > btw another option for "fixing" this would be to annotate the mutex_lock
> > in kernfs_remove_by_name_ns as recursive. Which just shuts up lockdep (and
> > might hide some real bugs), but would get the job done since there's not
> > actually a deadlock here. Just lockdep being annoyed.
> 
> So what's the pick? I can do the typing, but I don't understand all the
> driver core interactions to know what we should be doing here best.

Sorry for the delay.

Look at sdev_store_delete() in drivers/scsi/scsi_sysfs.c and see if the
logic there makes sense to do here instead.

It still seems odd that removing a sysfs file by writing to a sysfs file
at the same level really invokes lockdep as I would have thought that
this path is well-tested by now.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-12 11:19         ` Greg Kroah-Hartman
@ 2018-12-12 12:40           ` Daniel Vetter
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Vetter @ 2018-12-12 12:40 UTC (permalink / raw)
  To: Greg KH
  Cc: Linux Kernel Mailing List, dri-devel, Ramalingam C,
	Rafael J. Wysocki, Daniel Vetter

On Wed, Dec 12, 2018 at 12:19 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Wed, Dec 12, 2018 at 12:08:40PM +0100, Daniel Vetter wrote:
> > On Mon, Dec 10, 2018 at 11:20:58AM +0100, Daniel Vetter wrote:
> > > On Mon, Dec 10, 2018 at 11:18:32AM +0100, Daniel Vetter wrote:
> > > > On Mon, Dec 10, 2018 at 11:06:34AM +0100, Greg Kroah-Hartman wrote:
> > > > > On Mon, Dec 10, 2018 at 09:46:53AM +0100, Daniel Vetter wrote:
> > > > > > Drivers might want to remove some sysfs files, which needs the same
> > > > > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > > > > trace:
> > > > > >
> > > > > >   kernfs_remove_by_name_ns+0x3b/0x80
> > > > > >   bus_remove_driver+0x92/0xa0
> > > > > >   acpi_video_unregister+0x24/0x40
> > > > > >   i915_driver_unload+0x42/0x130 [i915]
> > > > > >   i915_pci_remove+0x19/0x30 [i915]
> > > > > >   pci_device_remove+0x36/0xb0
> > > > > >   device_release_driver_internal+0x185/0x250
> > > > > >   unbind_store+0xaf/0x180
> > > > > >   kernfs_fop_write+0x104/0x190
> > > > > >
> > > > > > I've stumbled over this because some new patches by Ram connect the
> > > > > > snd-hda-intel unload (where we do use sysfs unbind) with the locking
> > > > > > chains in the i915 unload code (but without creating a new loop),
> > > > > > which upset our CI. But the bug is already there and can be easily
> > > > > > reproduced by unbind i915 directly.
> > > > >
> > > > > This is odd, why wouldn't any driver hit this issue?  And why now since
> > > > > you say this is triggerable today?
> > > >
> > > > The above backtrace is triggered by unbinding i915 on current upstream
> > > > kernels. Note: Will crash later on rather badly in the
> > > > fbdev/fbcon/vtconsole hell, but that's separate issue (which can be worked
> > > > around by first unbinding fbcon manually through sysfs).
> > > >
> > > > > I know scsi was doing some strange things like trying to remove the
> > > > > device itself from a sysfs callback on the device, which requires it to
> > > > > just call a different kobject function created just for that type of
> > > > > thing.  Would that also make sense to do here instead of your workqueue?
> > > >
> > > > Note how we blow up on unregistering sw device instances supported by i915
> > > > in entirely different subsystems. I guess most drivers just have sysfs
> > > > files for their own stuff, where this is done as you describe. The problem
> > > > is that there's an awful lot of unrelated stuff hanging off i915.
> > > >
> > > > Or maybe acpi_video is busted, and should be using a different function.
> > > > You haven't said which one, and I have no idea which one it is ...
> > > >
> > > > And in case the context wasn't clear: This is unbinding the i915 pci
> > > > driver which triggers the above lockdep splat recursion.
> > >
> > > btw another option for "fixing" this would be to annotate the mutex_lock
> > > in kernfs_remove_by_name_ns as recursive. Which just shuts up lockdep (and
> > > might hide some real bugs), but would get the job done since there's not
> > > actually a deadlock here. Just lockdep being annoyed.
> >
> > So what's the pick? I can do the typing, but I don't understand all the
> > driver core interactions to know what we should be doing here best.
>
> Sorry for the delay.
>
> Look at sdev_store_delete() in drivers/scsi/scsi_sysfs.c and see if the
> logic there makes sense to do here instead.

This looks interesting, but it doesn't solve the problem. The issue is
_not_ that we remove the same sysfs file as the one we're writing
into. It's that we're removing an entirely unrelated sysfs file, which
will not cause a deadlock per se, but triggers lockdep because it's in
the same locking class (note how the locking recusion is within one
callchain, this would deadlock right away if it's the same file, but
unloading happily continues).

> It still seems odd that removing a sysfs file by writing to a sysfs file
> at the same level really invokes lockdep as I would have thought that
> this path is well-tested by now.

Iirc has been around forever for gpu drivers. Just never bothered to
fix it, because there's much bigger issues in hotunplug for gpu
drivers. Only reason we use unbind in CI is because it's the simplest
way to get userspace off the snd-hda-intel driver (which needs to be
unloaded before i915, if you want to unload that).
-Daniel

>
> thanks,
>
> greg k-h
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-10  8:46 [PATCH] drivers/base: use a worker for sysfs unbind Daniel Vetter
  2018-12-10 10:06 ` Greg Kroah-Hartman
@ 2018-12-13  9:38 ` Rafael J. Wysocki
  2018-12-13  9:58   ` Daniel Vetter
  1 sibling, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2018-12-13  9:38 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Linux Kernel Mailing List, dri-devel, ramalingam.c,
	Greg Kroah-Hartman, Rafael J. Wysocki, Daniel Vetter

On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Drivers might want to remove some sysfs files, which needs the same
> locks and ends up angering lockdep. Relevant snippet of the stack
> trace:
>
>   kernfs_remove_by_name_ns+0x3b/0x80
>   bus_remove_driver+0x92/0xa0
>   acpi_video_unregister+0x24/0x40
>   i915_driver_unload+0x42/0x130 [i915]
>   i915_pci_remove+0x19/0x30 [i915]
>   pci_device_remove+0x36/0xb0
>   device_release_driver_internal+0x185/0x250
>   unbind_store+0xaf/0x180
>   kernfs_fop_write+0x104/0x190

Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
source of the lockdep unhappiness?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-13  9:38 ` Rafael J. Wysocki
@ 2018-12-13  9:58   ` Daniel Vetter
  2018-12-13 10:23     ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Vetter @ 2018-12-13  9:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Daniel Vetter, Linux Kernel Mailing List, dri-devel,
	ramalingam.c, Greg Kroah-Hartman, Daniel Vetter

On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > Drivers might want to remove some sysfs files, which needs the same
> > locks and ends up angering lockdep. Relevant snippet of the stack
> > trace:
> >
> >   kernfs_remove_by_name_ns+0x3b/0x80
> >   bus_remove_driver+0x92/0xa0
> >   acpi_video_unregister+0x24/0x40
> >   i915_driver_unload+0x42/0x130 [i915]
> >   i915_pci_remove+0x19/0x30 [i915]
> >   pci_device_remove+0x36/0xb0
> >   device_release_driver_internal+0x185/0x250
> >   unbind_store+0xaf/0x180
> >   kernfs_fop_write+0x104/0x190
> 
> Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> source of the lockdep unhappiness?

Yeah I guess I cut out too much of the lockdep splat. It complains about
kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
class. It's ofc not the same lock, so no real deadlock. Getting the
device_release_driver outside of the callchain under kernfs_fop_write,
which this patch does, "fixes" it. For "fixes" = shut up lockdep.

Other options:
- Anotate the recursion with the usual lockdep annotations. Potentially
  results in lockdep not catching real deadlocks (you can still have other
  loops closing the deadlock, maybe through some subsystem/bus lock).

- Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
  that know what they're doing), which should be fine if we refcount
  everything properly (bus, driver & device).

- Also note that probably the same bug exists on the bind sysfs interface,
  but we don't use that, so I don't care :-)

- Most of these issues are never visible in normal usage, since normally
  driver bind/unbind is done from a kthread or model_load/unload, neither
  of which is running in the context of that kernfs mutex kernfs_fop_write
  holds. That's why I think the task work is the best solution, since it
  changes the locking context of the unbind sysfs to match the locking
  context of module unload and hotunplug. Unfortunately that trick doesn't
  work for the bind sysfs file, since that way we can't thread the errno
  value back to userspace.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-13  9:58   ` Daniel Vetter
@ 2018-12-13 10:23     ` Rafael J. Wysocki
  2018-12-13 11:05       ` Rafael J. Wysocki
  2018-12-13 12:36       ` Daniel Vetter
  0 siblings, 2 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2018-12-13 10:23 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux Kernel Mailing List, dri-devel,
	ramalingam.c, Greg Kroah-Hartman, Daniel Vetter
  Cc: Daniel Vetter

On Thu, Dec 13, 2018 at 10:58 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> > On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > Drivers might want to remove some sysfs files, which needs the same
> > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > trace:
> > >
> > >   kernfs_remove_by_name_ns+0x3b/0x80
> > >   bus_remove_driver+0x92/0xa0
> > >   acpi_video_unregister+0x24/0x40
> > >   i915_driver_unload+0x42/0x130 [i915]
> > >   i915_pci_remove+0x19/0x30 [i915]
> > >   pci_device_remove+0x36/0xb0
> > >   device_release_driver_internal+0x185/0x250
> > >   unbind_store+0xaf/0x180
> > >   kernfs_fop_write+0x104/0x190
> >
> > Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> > source of the lockdep unhappiness?
>
> Yeah I guess I cut out too much of the lockdep splat. It complains about
> kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
> class. It's ofc not the same lock, so no real deadlock. Getting the
> device_release_driver outside of the callchain under kernfs_fop_write,
> which this patch does, "fixes" it. For "fixes" = shut up lockdep.

OK, so the problem really is that the operation is started via sysfs
which means that this code is running under a lock already.

Which lock does lockdep complain about, exactly?

> Other options:
> - Anotate the recursion with the usual lockdep annotations. Potentially
>   results in lockdep not catching real deadlocks (you can still have other
>   loops closing the deadlock, maybe through some subsystem/bus lock).
>
> - Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
>   that know what they're doing), which should be fine if we refcount
>   everything properly (bus, driver & device).
>
> - Also note that probably the same bug exists on the bind sysfs interface,
>   but we don't use that, so I don't care :-)
>
> - Most of these issues are never visible in normal usage, since normally
>   driver bind/unbind is done from a kthread or model_load/unload, neither
>   of which is running in the context of that kernfs mutex kernfs_fop_write
>   holds. That's why I think the task work is the best solution, since it
>   changes the locking context of the unbind sysfs to match the locking
>   context of module unload and hotunplug.

I think that using a task work here makes sense.  There is a drawback,
which is that the original sysfs write will not wait for the driver to
actually be released before returning to user space AFAICS, but that
probably isn't a big deal.

Also please note that the patch changes the code flow slightly,
because passing a non-NULL parent pointer to
device_release_driver_internal() potentially has side effects, but
that should not be a big deal either.

> Unfortunately that trick doesn't work for the bind sysfs file, since that way we can't thread the errno value back to userspace.

Right.  That is unless we wait for the operation to complete and check
the error left behind by it.  That should be doable, but somewhat
complicated.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-13 10:23     ` Rafael J. Wysocki
@ 2018-12-13 11:05       ` Rafael J. Wysocki
  2018-12-13 12:36       ` Daniel Vetter
  1 sibling, 0 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2018-12-13 11:05 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Linux Kernel Mailing List, dri-devel, ramalingam.c,
	Greg Kroah-Hartman, Daniel Vetter

On Thu, Dec 13, 2018 at 11:23 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Dec 13, 2018 at 10:58 AM Daniel Vetter <daniel@ffwll.ch> wrote:

[cut]

> >
> > - Most of these issues are never visible in normal usage, since normally
> >   driver bind/unbind is done from a kthread or model_load/unload, neither
> >   of which is running in the context of that kernfs mutex kernfs_fop_write
> >   holds. That's why I think the task work is the best solution, since it
> >   changes the locking context of the unbind sysfs to match the locking
> >   context of module unload and hotunplug.
>
> I think that using a task work here makes sense.  There is a drawback,
> which is that the original sysfs write will not wait for the driver to
> actually be released before returning to user space AFAICS, but that
> probably isn't a big deal.
>
> Also please note that the patch changes the code flow slightly,
> because passing a non-NULL parent pointer to
> device_release_driver_internal() potentially has side effects, but
> that should not be a big deal either.
>
> > Unfortunately that trick doesn't work for the bind sysfs file, since that way we can't thread the errno value back to userspace.
>
> Right.  That is unless we wait for the operation to complete and check
> the error left behind by it.  That should be doable, but somewhat
> complicated.

That said I'm not really sure if propagating the error to user space
in this case should be expected.  The interface could be defined as
asynchronous to begin with a separate way for user space to check the
status if necessary.  Changing that now may not be practical, though.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-13 10:23     ` Rafael J. Wysocki
  2018-12-13 11:05       ` Rafael J. Wysocki
@ 2018-12-13 12:36       ` Daniel Vetter
  2018-12-13 16:18         ` Rafael J. Wysocki
  1 sibling, 1 reply; 17+ messages in thread
From: Daniel Vetter @ 2018-12-13 12:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, dri-devel, Ramalingam C, Greg KH,
	Daniel Vetter

On Thu, Dec 13, 2018 at 11:23 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Dec 13, 2018 at 10:58 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> > > On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > >
> > > > Drivers might want to remove some sysfs files, which needs the same
> > > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > > trace:
> > > >
> > > >   kernfs_remove_by_name_ns+0x3b/0x80
> > > >   bus_remove_driver+0x92/0xa0
> > > >   acpi_video_unregister+0x24/0x40
> > > >   i915_driver_unload+0x42/0x130 [i915]
> > > >   i915_pci_remove+0x19/0x30 [i915]
> > > >   pci_device_remove+0x36/0xb0
> > > >   device_release_driver_internal+0x185/0x250
> > > >   unbind_store+0xaf/0x180
> > > >   kernfs_fop_write+0x104/0x190
> > >
> > > Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> > > source of the lockdep unhappiness?
> >
> > Yeah I guess I cut out too much of the lockdep splat. It complains about
> > kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
> > class. It's ofc not the same lock, so no real deadlock. Getting the
> > device_release_driver outside of the callchain under kernfs_fop_write,
> > which this patch does, "fixes" it. For "fixes" = shut up lockdep.
>
> OK, so the problem really is that the operation is started via sysfs
> which means that this code is running under a lock already.
>
> Which lock does lockdep complain about, exactly?

mutex_lock(&of->mutex);

> > Other options:
> > - Anotate the recursion with the usual lockdep annotations. Potentially
> >   results in lockdep not catching real deadlocks (you can still have other
> >   loops closing the deadlock, maybe through some subsystem/bus lock).
> >
> > - Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
> >   that know what they're doing), which should be fine if we refcount
> >   everything properly (bus, driver & device).
> >
> > - Also note that probably the same bug exists on the bind sysfs interface,
> >   but we don't use that, so I don't care :-)
> >
> > - Most of these issues are never visible in normal usage, since normally
> >   driver bind/unbind is done from a kthread or model_load/unload, neither
> >   of which is running in the context of that kernfs mutex kernfs_fop_write
> >   holds. That's why I think the task work is the best solution, since it
> >   changes the locking context of the unbind sysfs to match the locking
> >   context of module unload and hotunplug.
>
> I think that using a task work here makes sense.  There is a drawback,
> which is that the original sysfs write will not wait for the driver to
> actually be released before returning to user space AFAICS, but that
> probably isn't a big deal.

This would happen with a normal work_struct, which runs on some other
thread eventually. That added asynonchrouns execution uncovered lots
of bugs in our CI (fbcon isn't solid, let's put it that way). Hence
the task work, which will be run before the syscall returns to
userspace, but outside of anything else. Was originally created to
avoid locking inversion on the final fput, where the same "must
complete before returning to userspace, but outside of any other
locking context" issue was causing trouble.

> Also please note that the patch changes the code flow slightly,
> because passing a non-NULL parent pointer to
> device_release_driver_internal() potentially has side effects, but
> that should not be a big deal either.

I can do the old code exactly, but afaict the non-NULL parent just
takes care of the parent bus locking for us, instead of hand-rolling
it in the caller. But if I missed something, I can easily undo that
part.

> > Unfortunately that trick doesn't work for the bind sysfs file, since that way we can't thread the errno value back to userspace.
>
> Right.  That is unless we wait for the operation to complete and check
> the error left behind by it.  That should be doable, but somewhat
> complicated.

For real deadlocks this doesn't fix anything, it just hides it from
lockdep. cross-release lockdep would still complain. If we want to fix
the bind side _and_ keep reporting the errno from the driver's bind
function, then we need to rework kernfs to and add a callback which
doesn't hold the mutex. Should be doable, just a pile more work.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-13 12:36       ` Daniel Vetter
@ 2018-12-13 16:18         ` Rafael J. Wysocki
  2018-12-13 16:25           ` Daniel Vetter
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2018-12-13 16:18 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, dri-devel,
	ramalingam.c, Greg Kroah-Hartman, Daniel Vetter

On Thu, Dec 13, 2018 at 1:36 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Thu, Dec 13, 2018 at 11:23 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Thu, Dec 13, 2018 at 10:58 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> > > > On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > >
> > > > > Drivers might want to remove some sysfs files, which needs the same
> > > > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > > > trace:
> > > > >
> > > > >   kernfs_remove_by_name_ns+0x3b/0x80
> > > > >   bus_remove_driver+0x92/0xa0
> > > > >   acpi_video_unregister+0x24/0x40
> > > > >   i915_driver_unload+0x42/0x130 [i915]
> > > > >   i915_pci_remove+0x19/0x30 [i915]
> > > > >   pci_device_remove+0x36/0xb0
> > > > >   device_release_driver_internal+0x185/0x250
> > > > >   unbind_store+0xaf/0x180
> > > > >   kernfs_fop_write+0x104/0x190
> > > >
> > > > Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> > > > source of the lockdep unhappiness?
> > >
> > > Yeah I guess I cut out too much of the lockdep splat. It complains about
> > > kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
> > > class. It's ofc not the same lock, so no real deadlock. Getting the
> > > device_release_driver outside of the callchain under kernfs_fop_write,
> > > which this patch does, "fixes" it. For "fixes" = shut up lockdep.
> >
> > OK, so the problem really is that the operation is started via sysfs
> > which means that this code is running under a lock already.
> >
> > Which lock does lockdep complain about, exactly?
>
> mutex_lock(&of->mutex);

OK (I thought so)

> > > Other options:
> > > - Anotate the recursion with the usual lockdep annotations. Potentially
> > >   results in lockdep not catching real deadlocks (you can still have other
> > >   loops closing the deadlock, maybe through some subsystem/bus lock).
> > >
> > > - Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
> > >   that know what they're doing), which should be fine if we refcount
> > >   everything properly (bus, driver & device).
> > >
> > > - Also note that probably the same bug exists on the bind sysfs interface,
> > >   but we don't use that, so I don't care :-)
> > >
> > > - Most of these issues are never visible in normal usage, since normally
> > >   driver bind/unbind is done from a kthread or model_load/unload, neither
> > >   of which is running in the context of that kernfs mutex kernfs_fop_write
> > >   holds. That's why I think the task work is the best solution, since it
> > >   changes the locking context of the unbind sysfs to match the locking
> > >   context of module unload and hotunplug.
> >
> > I think that using a task work here makes sense.  There is a drawback,
> > which is that the original sysfs write will not wait for the driver to
> > actually be released before returning to user space AFAICS, but that
> > probably isn't a big deal.
>
> This would happen with a normal work_struct, which runs on some other
> thread eventually. That added asynonchrouns execution uncovered lots
> of bugs in our CI (fbcon isn't solid, let's put it that way). Hence
> the task work, which will be run before the syscall returns to
> userspace, but outside of anything else. Was originally created to
> avoid locking inversion on the final fput, where the same "must
> complete before returning to userspace, but outside of any other
> locking context" issue was causing trouble.

I didn't realize that it would run completely before returning to user
space, thanks for pointing this out.

This isn't an issue then.

> > Also please note that the patch changes the code flow slightly,
> > because passing a non-NULL parent pointer to
> > device_release_driver_internal() potentially has side effects, but
> > that should not be a big deal either.
>
> I can do the old code exactly, but afaict the non-NULL parent just
> takes care of the parent bus locking for us, instead of hand-rolling
> it in the caller. But if I missed something, I can easily undo that
> part.

It is different if device links are present, but I'm not worried about
that case honestly. :-)

> > > Unfortunately that trick doesn't work for the bind sysfs file, since that way we can't thread the errno value back to userspace.
> >
> > Right.  That is unless we wait for the operation to complete and check
> > the error left behind by it.  That should be doable, but somewhat
> > complicated.
>
> For real deadlocks this doesn't fix anything, it just hides it from
> lockdep. cross-release lockdep would still complain. If we want to fix
> the bind side _and_ keep reporting the errno from the driver's bind
> function, then we need to rework kernfs to and add a callback which
> doesn't hold the mutex. Should be doable, just a pile more work.

It should be possible to store the error in a variable and export that
via a separate attribute for user space to inspect.  That would be a
significant I/F change, however.

Cheers,
Rafael

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-13 16:18         ` Rafael J. Wysocki
@ 2018-12-13 16:25           ` Daniel Vetter
  2018-12-13 18:09             ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Vetter @ 2018-12-13 16:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, dri-devel, Ramalingam C, Greg KH,
	Daniel Vetter

On Thu, Dec 13, 2018 at 5:18 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Dec 13, 2018 at 1:36 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > On Thu, Dec 13, 2018 at 11:23 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > On Thu, Dec 13, 2018 at 10:58 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> > > > > On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > >
> > > > > > Drivers might want to remove some sysfs files, which needs the same
> > > > > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > > > > trace:
> > > > > >
> > > > > >   kernfs_remove_by_name_ns+0x3b/0x80
> > > > > >   bus_remove_driver+0x92/0xa0
> > > > > >   acpi_video_unregister+0x24/0x40
> > > > > >   i915_driver_unload+0x42/0x130 [i915]
> > > > > >   i915_pci_remove+0x19/0x30 [i915]
> > > > > >   pci_device_remove+0x36/0xb0
> > > > > >   device_release_driver_internal+0x185/0x250
> > > > > >   unbind_store+0xaf/0x180
> > > > > >   kernfs_fop_write+0x104/0x190
> > > > >
> > > > > Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> > > > > source of the lockdep unhappiness?
> > > >
> > > > Yeah I guess I cut out too much of the lockdep splat. It complains about
> > > > kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
> > > > class. It's ofc not the same lock, so no real deadlock. Getting the
> > > > device_release_driver outside of the callchain under kernfs_fop_write,
> > > > which this patch does, "fixes" it. For "fixes" = shut up lockdep.
> > >
> > > OK, so the problem really is that the operation is started via sysfs
> > > which means that this code is running under a lock already.
> > >
> > > Which lock does lockdep complain about, exactly?
> >
> > mutex_lock(&of->mutex);
>
> OK (I thought so)
>
> > > > Other options:
> > > > - Anotate the recursion with the usual lockdep annotations. Potentially
> > > >   results in lockdep not catching real deadlocks (you can still have other
> > > >   loops closing the deadlock, maybe through some subsystem/bus lock).
> > > >
> > > > - Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
> > > >   that know what they're doing), which should be fine if we refcount
> > > >   everything properly (bus, driver & device).
> > > >
> > > > - Also note that probably the same bug exists on the bind sysfs interface,
> > > >   but we don't use that, so I don't care :-)
> > > >
> > > > - Most of these issues are never visible in normal usage, since normally
> > > >   driver bind/unbind is done from a kthread or model_load/unload, neither
> > > >   of which is running in the context of that kernfs mutex kernfs_fop_write
> > > >   holds. That's why I think the task work is the best solution, since it
> > > >   changes the locking context of the unbind sysfs to match the locking
> > > >   context of module unload and hotunplug.
> > >
> > > I think that using a task work here makes sense.  There is a drawback,
> > > which is that the original sysfs write will not wait for the driver to
> > > actually be released before returning to user space AFAICS, but that
> > > probably isn't a big deal.
> >
> > This would happen with a normal work_struct, which runs on some other
> > thread eventually. That added asynonchrouns execution uncovered lots
> > of bugs in our CI (fbcon isn't solid, let's put it that way). Hence
> > the task work, which will be run before the syscall returns to
> > userspace, but outside of anything else. Was originally created to
> > avoid locking inversion on the final fput, where the same "must
> > complete before returning to userspace, but outside of any other
> > locking context" issue was causing trouble.
>
> I didn't realize that it would run completely before returning to user
> space, thanks for pointing this out.
>
> This isn't an issue then.
>
> > > Also please note that the patch changes the code flow slightly,
> > > because passing a non-NULL parent pointer to
> > > device_release_driver_internal() potentially has side effects, but
> > > that should not be a big deal either.
> >
> > I can do the old code exactly, but afaict the non-NULL parent just
> > takes care of the parent bus locking for us, instead of hand-rolling
> > it in the caller. But if I missed something, I can easily undo that
> > part.
>
> It is different if device links are present, but I'm not worried about
> that case honestly. :-)

What would change with device links? We have some cleanup plans to
remove our usage for early/late s/r hooks with a device link, to make
sure i915 resumes before snd_hda_intel. Digging more into the code I
only see the temporary dropping of the parent's device_lock, but I
have no idea what that even implies ...
-Daniel

>
> > > > Unfortunately that trick doesn't work for the bind sysfs file, since that way we can't thread the errno value back to userspace.
> > >
> > > Right.  That is unless we wait for the operation to complete and check
> > > the error left behind by it.  That should be doable, but somewhat
> > > complicated.
> >
> > For real deadlocks this doesn't fix anything, it just hides it from
> > lockdep. cross-release lockdep would still complain. If we want to fix
> > the bind side _and_ keep reporting the errno from the driver's bind
> > function, then we need to rework kernfs to and add a callback which
> > doesn't hold the mutex. Should be doable, just a pile more work.
>
> It should be possible to store the error in a variable and export that
> via a separate attribute for user space to inspect.  That would be a
> significant I/F change, however.
>
> Cheers,
> Rafael



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-13 16:25           ` Daniel Vetter
@ 2018-12-13 18:09             ` Rafael J. Wysocki
  2018-12-17 19:48               ` Daniel Vetter
  0 siblings, 1 reply; 17+ messages in thread
From: Rafael J. Wysocki @ 2018-12-13 18:09 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, dri-devel,
	ramalingam.c, Greg Kroah-Hartman, Daniel Vetter

On Thu, Dec 13, 2018 at 5:25 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Thu, Dec 13, 2018 at 5:18 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Thu, Dec 13, 2018 at 1:36 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

[cut]

> > > I can do the old code exactly, but afaict the non-NULL parent just
> > > takes care of the parent bus locking for us, instead of hand-rolling
> > > it in the caller. But if I missed something, I can easily undo that
> > > part.
> >
> > It is different if device links are present, but I'm not worried about
> > that case honestly. :-)
>
> What would change with device links? We have some cleanup plans to
> remove our usage for early/late s/r hooks with a device link, to make
> sure i915 resumes before snd_hda_intel. Digging more into the code I
> only see the temporary dropping of the parent's device_lock, but I
> have no idea what that even implies ...

That's just it (which is why I said I was not worried).

Running device_links_unbind_consumers() with the parent lock held may
deadlock if another child of the same parent also is a consumer of the
current device (which really is a corner case), but the current code
has this problem - it goes away with your change.

But dev->bus->need_parent_lock checks are missing in there AFAICS, let
me cut a patch to fix that.

Cheers,
Rafael

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-13 18:09             ` Rafael J. Wysocki
@ 2018-12-17 19:48               ` Daniel Vetter
  2018-12-18  0:03                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Vetter @ 2018-12-17 19:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Daniel Vetter, Linux Kernel Mailing List, dri-devel,
	ramalingam.c, Greg Kroah-Hartman, Daniel Vetter

On Thu, Dec 13, 2018 at 07:09:15PM +0100, Rafael J. Wysocki wrote:
> On Thu, Dec 13, 2018 at 5:25 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > On Thu, Dec 13, 2018 at 5:18 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > On Thu, Dec 13, 2018 at 1:36 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
> [cut]
> 
> > > > I can do the old code exactly, but afaict the non-NULL parent just
> > > > takes care of the parent bus locking for us, instead of hand-rolling
> > > > it in the caller. But if I missed something, I can easily undo that
> > > > part.
> > >
> > > It is different if device links are present, but I'm not worried about
> > > that case honestly. :-)
> >
> > What would change with device links? We have some cleanup plans to
> > remove our usage for early/late s/r hooks with a device link, to make
> > sure i915 resumes before snd_hda_intel. Digging more into the code I
> > only see the temporary dropping of the parent's device_lock, but I
> > have no idea what that even implies ...
> 
> That's just it (which is why I said I was not worried).
> 
> Running device_links_unbind_consumers() with the parent lock held may
> deadlock if another child of the same parent also is a consumer of the
> current device (which really is a corner case), but the current code
> has this problem - it goes away with your change.
> 
> But dev->bus->need_parent_lock checks are missing in there AFAICS, let
> me cut a patch to fix that.

With your patch before this one, are you ok with mine? Or want me to
respin with a different flavour?

btw threading somehow broke apart, Chris Wilson r-b stamped this one on
intel-gfx:

https://patchwork.freedesktop.org/patch/267220/

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] drivers/base: use a worker for sysfs unbind
  2018-12-17 19:48               ` Daniel Vetter
@ 2018-12-18  0:03                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 17+ messages in thread
From: Rafael J. Wysocki @ 2018-12-18  0:03 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux Kernel Mailing List, dri-devel,
	ramalingam.c, Greg Kroah-Hartman, Daniel Vetter
  Cc: Daniel Vetter

On Mon, Dec 17, 2018 at 8:48 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Dec 13, 2018 at 07:09:15PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Dec 13, 2018 at 5:25 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > On Thu, Dec 13, 2018 at 5:18 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > >
> > > > On Thu, Dec 13, 2018 at 1:36 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > [cut]
> >
> > > > > I can do the old code exactly, but afaict the non-NULL parent just
> > > > > takes care of the parent bus locking for us, instead of hand-rolling
> > > > > it in the caller. But if I missed something, I can easily undo that
> > > > > part.
> > > >
> > > > It is different if device links are present, but I'm not worried about
> > > > that case honestly. :-)
> > >
> > > What would change with device links? We have some cleanup plans to
> > > remove our usage for early/late s/r hooks with a device link, to make
> > > sure i915 resumes before snd_hda_intel. Digging more into the code I
> > > only see the temporary dropping of the parent's device_lock, but I
> > > have no idea what that even implies ...
> >
> > That's just it (which is why I said I was not worried).
> >
> > Running device_links_unbind_consumers() with the parent lock held may
> > deadlock if another child of the same parent also is a consumer of the
> > current device (which really is a corner case), but the current code
> > has this problem - it goes away with your change.
> >
> > But dev->bus->need_parent_lock checks are missing in there AFAICS, let
> > me cut a patch to fix that.
>
> With your patch before this one, are you ok with mine? Or want me to
> respin with a different flavour?

Having reconsidered this a bit I'm leaning towards annotating the
locks - if that works.  After all, what is problematic is the lockdep
false-positive and addressing it that way would be most natural IMO.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-12-18  0:03 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-10  8:46 [PATCH] drivers/base: use a worker for sysfs unbind Daniel Vetter
2018-12-10 10:06 ` Greg Kroah-Hartman
2018-12-10 10:18   ` Daniel Vetter
2018-12-10 10:20     ` Daniel Vetter
2018-12-12 11:08       ` Daniel Vetter
2018-12-12 11:19         ` Greg Kroah-Hartman
2018-12-12 12:40           ` Daniel Vetter
2018-12-13  9:38 ` Rafael J. Wysocki
2018-12-13  9:58   ` Daniel Vetter
2018-12-13 10:23     ` Rafael J. Wysocki
2018-12-13 11:05       ` Rafael J. Wysocki
2018-12-13 12:36       ` Daniel Vetter
2018-12-13 16:18         ` Rafael J. Wysocki
2018-12-13 16:25           ` Daniel Vetter
2018-12-13 18:09             ` Rafael J. Wysocki
2018-12-17 19:48               ` Daniel Vetter
2018-12-18  0:03                 ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).