linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] USB: serial: fix sysfs-attribute removal deadlock
@ 2014-04-23  9:32 Johan Hovold
  2014-04-23 14:19 ` Tejun Heo
  0 siblings, 1 reply; 15+ messages in thread
From: Johan Hovold @ 2014-04-23  9:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Alan Stern, linux-usb, linux-kernel, Tejun Heo, Johan Hovold

Fix driver new_id sysfs-attribute removal deadlock by making sure to
not hold any locks that the attribute operations grab when removing the
attribute.

Specifically, usb_serial_deregister holds the table mutex when
deregistering the driver, which includes removing the new_id attribute.
This can lead to a deadlock as writing to new_id increments the
attribute's active count before trying to grab the same mutex in
usb_serial_probe.

The deadlock can easily be triggered by inserting a sleep in
usb_serial_deregister and writing the id of an unbound device to new_id
during module unload.

As the table mutex (in this case) is used to prevent subdriver unload
during probe, it should be sufficient to only hold the lock while
manipulating the usb-serial driver list during deregister. A racing
probe will then either fail to find a matching subdriver or fail to get
the corresponding module reference.

Since v3.15-rc1 this also triggers the following lockdep warning:

======================================================
[ INFO: possible circular locking dependency detected ]
3.15.0-rc2 #123 Tainted: G        W
-------------------------------------------------------
modprobe/190 is trying to acquire lock:
 (s_active#4){++++.+}, at: [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94

but task is already holding lock:
 (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (table_lock){+.+.+.}:
       [<c0075f84>] __lock_acquire+0x1694/0x1ce4
       [<c0076de8>] lock_acquire+0xb4/0x154
       [<c03af3cc>] _raw_spin_lock+0x4c/0x5c
       [<c02bbc24>] usb_store_new_id+0x14c/0x1ac
       [<bf007eb4>] new_id_store+0x68/0x70 [usbserial]
       [<c025f568>] drv_attr_store+0x30/0x3c
       [<c01690e0>] sysfs_kf_write+0x5c/0x60
       [<c01682c0>] kernfs_fop_write+0xd4/0x194
       [<c010881c>] vfs_write+0xbc/0x198
       [<c0108e4c>] SyS_write+0x4c/0xa0
       [<c000f880>] ret_fast_syscall+0x0/0x48

-> #0 (s_active#4){++++.+}:
       [<c03a7a28>] print_circular_bug+0x68/0x2f8
       [<c0076218>] __lock_acquire+0x1928/0x1ce4
       [<c0076de8>] lock_acquire+0xb4/0x154
       [<c0166b70>] __kernfs_remove+0x254/0x310
       [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94
       [<c0169fb8>] remove_files.isra.1+0x48/0x84
       [<c016a2fc>] sysfs_remove_group+0x58/0xac
       [<c016a414>] sysfs_remove_groups+0x34/0x44
       [<c02623b8>] driver_remove_groups+0x1c/0x20
       [<c0260e9c>] bus_remove_driver+0x3c/0xe4
       [<c026235c>] driver_unregister+0x38/0x58
       [<bf007fb4>] usb_serial_bus_deregister+0x84/0x88 [usbserial]
       [<bf004db4>] usb_serial_deregister+0x6c/0x78 [usbserial]
       [<bf005330>] usb_serial_deregister_drivers+0x2c/0x4c [usbserial]
       [<bf016618>] usb_serial_module_exit+0x14/0x1c [sierra]
       [<c009d6cc>] SyS_delete_module+0x184/0x210
       [<c000f880>] ret_fast_syscall+0x0/0x48

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(table_lock);
                               lock(s_active#4);
                               lock(table_lock);
  lock(s_active#4);

 *** DEADLOCK ***

1 lock held by modprobe/190:
 #0:  (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial]

stack backtrace:
CPU: 0 PID: 190 Comm: modprobe Tainted: G        W     3.15.0-rc2 #123
[<c0015e10>] (unwind_backtrace) from [<c0013728>] (show_stack+0x20/0x24)
[<c0013728>] (show_stack) from [<c03a9a54>] (dump_stack+0x24/0x28)
[<c03a9a54>] (dump_stack) from [<c03a7cac>] (print_circular_bug+0x2ec/0x2f8)
[<c03a7cac>] (print_circular_bug) from [<c0076218>] (__lock_acquire+0x1928/0x1ce4)
[<c0076218>] (__lock_acquire) from [<c0076de8>] (lock_acquire+0xb4/0x154)
[<c0076de8>] (lock_acquire) from [<c0166b70>] (__kernfs_remove+0x254/0x310)
[<c0166b70>] (__kernfs_remove) from [<c0167aa0>] (kernfs_remove_by_name_ns+0x4c/0x94)
[<c0167aa0>] (kernfs_remove_by_name_ns) from [<c0169fb8>] (remove_files.isra.1+0x48/0x84)
[<c0169fb8>] (remove_files.isra.1) from [<c016a2fc>] (sysfs_remove_group+0x58/0xac)
[<c016a2fc>] (sysfs_remove_group) from [<c016a414>] (sysfs_remove_groups+0x34/0x44)
[<c016a414>] (sysfs_remove_groups) from [<c02623b8>] (driver_remove_groups+0x1c/0x20)
[<c02623b8>] (driver_remove_groups) from [<c0260e9c>] (bus_remove_driver+0x3c/0xe4)
[<c0260e9c>] (bus_remove_driver) from [<c026235c>] (driver_unregister+0x38/0x58)
[<c026235c>] (driver_unregister) from [<bf007fb4>] (usb_serial_bus_deregister+0x84/0x88 [usbserial])
[<bf007fb4>] (usb_serial_bus_deregister [usbserial]) from [<bf004db4>] (usb_serial_deregister+0x6c/0x78 [usbserial])
[<bf004db4>] (usb_serial_deregister [usbserial]) from [<bf005330>] (usb_serial_deregister_drivers+0x2c/0x4c [usbserial])
[<bf005330>] (usb_serial_deregister_drivers [usbserial]) from [<bf016618>] (usb_serial_module_exit+0x14/0x1c [sierra])
[<bf016618>] (usb_serial_module_exit [sierra]) from [<c009d6cc>] (SyS_delete_module+0x184/0x210)
[<c009d6cc>] (SyS_delete_module) from [<c000f880>] (ret_fast_syscall+0x0/0x48)

Signed-off-by: Johan Hovold <jhovold@gmail.com>
---

I got this new lockdep warning with 3.15-rc, but this dates back to at
least 0daeed381c6a ("USB-BKL: Remove BKL use for usb serial driver
probing").

As far as I can tell, moving driver deregistration out of the table lock
should be fine. Another option would be to get a module reference in
new_id store, although that would still trigger the lockdep warning.

Thanks,
Johan


 drivers/usb/serial/usb-serial.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c
index 81fc0dfcfdcf..6d40d56378d7 100644
--- a/drivers/usb/serial/usb-serial.c
+++ b/drivers/usb/serial/usb-serial.c
@@ -1347,10 +1347,12 @@ static int usb_serial_register(struct usb_serial_driver *driver)
 static void usb_serial_deregister(struct usb_serial_driver *device)
 {
 	pr_info("USB Serial deregistering driver %s\n", device->description);
+
 	mutex_lock(&table_lock);
 	list_del(&device->driver_list);
-	usb_serial_bus_deregister(device);
 	mutex_unlock(&table_lock);
+
+	usb_serial_bus_deregister(device);
 }
 
 /**
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-23  9:32 [PATCH] USB: serial: fix sysfs-attribute removal deadlock Johan Hovold
@ 2014-04-23 14:19 ` Tejun Heo
  2014-04-24  8:29   ` Li Zhong
  0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2014-04-23 14:19 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Greg Kroah-Hartman, Alan Stern, linux-usb, linux-kernel,
	Li Zhong, rafael.j.wysocki

cc'ing Li Zhong who's working on a simliar issue in the following
thread and quoting whole body.

  http://thread.gmane.org/gmane.linux.kernel/1680706

Li, this is another variation of the same problem.  Maybe this can be
covered by your work too?

Thanks.

On Wed, Apr 23, 2014 at 11:32:19AM +0200, Johan Hovold wrote:
> Fix driver new_id sysfs-attribute removal deadlock by making sure to
> not hold any locks that the attribute operations grab when removing the
> attribute.
> 
> Specifically, usb_serial_deregister holds the table mutex when
> deregistering the driver, which includes removing the new_id attribute.
> This can lead to a deadlock as writing to new_id increments the
> attribute's active count before trying to grab the same mutex in
> usb_serial_probe.
> 
> The deadlock can easily be triggered by inserting a sleep in
> usb_serial_deregister and writing the id of an unbound device to new_id
> during module unload.
> 
> As the table mutex (in this case) is used to prevent subdriver unload
> during probe, it should be sufficient to only hold the lock while
> manipulating the usb-serial driver list during deregister. A racing
> probe will then either fail to find a matching subdriver or fail to get
> the corresponding module reference.
> 
> Since v3.15-rc1 this also triggers the following lockdep warning:
> 
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.15.0-rc2 #123 Tainted: G        W
> -------------------------------------------------------
> modprobe/190 is trying to acquire lock:
>  (s_active#4){++++.+}, at: [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94
> 
> but task is already holding lock:
>  (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial]
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (table_lock){+.+.+.}:
>        [<c0075f84>] __lock_acquire+0x1694/0x1ce4
>        [<c0076de8>] lock_acquire+0xb4/0x154
>        [<c03af3cc>] _raw_spin_lock+0x4c/0x5c
>        [<c02bbc24>] usb_store_new_id+0x14c/0x1ac
>        [<bf007eb4>] new_id_store+0x68/0x70 [usbserial]
>        [<c025f568>] drv_attr_store+0x30/0x3c
>        [<c01690e0>] sysfs_kf_write+0x5c/0x60
>        [<c01682c0>] kernfs_fop_write+0xd4/0x194
>        [<c010881c>] vfs_write+0xbc/0x198
>        [<c0108e4c>] SyS_write+0x4c/0xa0
>        [<c000f880>] ret_fast_syscall+0x0/0x48
> 
> -> #0 (s_active#4){++++.+}:
>        [<c03a7a28>] print_circular_bug+0x68/0x2f8
>        [<c0076218>] __lock_acquire+0x1928/0x1ce4
>        [<c0076de8>] lock_acquire+0xb4/0x154
>        [<c0166b70>] __kernfs_remove+0x254/0x310
>        [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94
>        [<c0169fb8>] remove_files.isra.1+0x48/0x84
>        [<c016a2fc>] sysfs_remove_group+0x58/0xac
>        [<c016a414>] sysfs_remove_groups+0x34/0x44
>        [<c02623b8>] driver_remove_groups+0x1c/0x20
>        [<c0260e9c>] bus_remove_driver+0x3c/0xe4
>        [<c026235c>] driver_unregister+0x38/0x58
>        [<bf007fb4>] usb_serial_bus_deregister+0x84/0x88 [usbserial]
>        [<bf004db4>] usb_serial_deregister+0x6c/0x78 [usbserial]
>        [<bf005330>] usb_serial_deregister_drivers+0x2c/0x4c [usbserial]
>        [<bf016618>] usb_serial_module_exit+0x14/0x1c [sierra]
>        [<c009d6cc>] SyS_delete_module+0x184/0x210
>        [<c000f880>] ret_fast_syscall+0x0/0x48
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(table_lock);
>                                lock(s_active#4);
>                                lock(table_lock);
>   lock(s_active#4);
> 
>  *** DEADLOCK ***
> 
> 1 lock held by modprobe/190:
>  #0:  (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial]
> 
> stack backtrace:
> CPU: 0 PID: 190 Comm: modprobe Tainted: G        W     3.15.0-rc2 #123
> [<c0015e10>] (unwind_backtrace) from [<c0013728>] (show_stack+0x20/0x24)
> [<c0013728>] (show_stack) from [<c03a9a54>] (dump_stack+0x24/0x28)
> [<c03a9a54>] (dump_stack) from [<c03a7cac>] (print_circular_bug+0x2ec/0x2f8)
> [<c03a7cac>] (print_circular_bug) from [<c0076218>] (__lock_acquire+0x1928/0x1ce4)
> [<c0076218>] (__lock_acquire) from [<c0076de8>] (lock_acquire+0xb4/0x154)
> [<c0076de8>] (lock_acquire) from [<c0166b70>] (__kernfs_remove+0x254/0x310)
> [<c0166b70>] (__kernfs_remove) from [<c0167aa0>] (kernfs_remove_by_name_ns+0x4c/0x94)
> [<c0167aa0>] (kernfs_remove_by_name_ns) from [<c0169fb8>] (remove_files.isra.1+0x48/0x84)
> [<c0169fb8>] (remove_files.isra.1) from [<c016a2fc>] (sysfs_remove_group+0x58/0xac)
> [<c016a2fc>] (sysfs_remove_group) from [<c016a414>] (sysfs_remove_groups+0x34/0x44)
> [<c016a414>] (sysfs_remove_groups) from [<c02623b8>] (driver_remove_groups+0x1c/0x20)
> [<c02623b8>] (driver_remove_groups) from [<c0260e9c>] (bus_remove_driver+0x3c/0xe4)
> [<c0260e9c>] (bus_remove_driver) from [<c026235c>] (driver_unregister+0x38/0x58)
> [<c026235c>] (driver_unregister) from [<bf007fb4>] (usb_serial_bus_deregister+0x84/0x88 [usbserial])
> [<bf007fb4>] (usb_serial_bus_deregister [usbserial]) from [<bf004db4>] (usb_serial_deregister+0x6c/0x78 [usbserial])
> [<bf004db4>] (usb_serial_deregister [usbserial]) from [<bf005330>] (usb_serial_deregister_drivers+0x2c/0x4c [usbserial])
> [<bf005330>] (usb_serial_deregister_drivers [usbserial]) from [<bf016618>] (usb_serial_module_exit+0x14/0x1c [sierra])
> [<bf016618>] (usb_serial_module_exit [sierra]) from [<c009d6cc>] (SyS_delete_module+0x184/0x210)
> [<c009d6cc>] (SyS_delete_module) from [<c000f880>] (ret_fast_syscall+0x0/0x48)
> 
> Signed-off-by: Johan Hovold <jhovold@gmail.com>
> ---
> 
> I got this new lockdep warning with 3.15-rc, but this dates back to at
> least 0daeed381c6a ("USB-BKL: Remove BKL use for usb serial driver
> probing").
> 
> As far as I can tell, moving driver deregistration out of the table lock
> should be fine. Another option would be to get a module reference in
> new_id store, although that would still trigger the lockdep warning.
> 
> Thanks,
> Johan
> 
> 
>  drivers/usb/serial/usb-serial.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c
> index 81fc0dfcfdcf..6d40d56378d7 100644
> --- a/drivers/usb/serial/usb-serial.c
> +++ b/drivers/usb/serial/usb-serial.c
> @@ -1347,10 +1347,12 @@ static int usb_serial_register(struct usb_serial_driver *driver)
>  static void usb_serial_deregister(struct usb_serial_driver *device)
>  {
>  	pr_info("USB Serial deregistering driver %s\n", device->description);
> +
>  	mutex_lock(&table_lock);
>  	list_del(&device->driver_list);
> -	usb_serial_bus_deregister(device);
>  	mutex_unlock(&table_lock);
> +
> +	usb_serial_bus_deregister(device);
>  }
>  
>  /**
> -- 
> 1.8.3.2
> 

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-23 14:19 ` Tejun Heo
@ 2014-04-24  8:29   ` Li Zhong
  2014-04-24 14:35     ` Tejun Heo
  0 siblings, 1 reply; 15+ messages in thread
From: Li Zhong @ 2014-04-24  8:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Johan Hovold, Greg Kroah-Hartman, Alan Stern, linux-usb,
	linux-kernel, rafael.j.wysocki

On Wed, 2014-04-23 at 10:19 -0400, Tejun Heo wrote:
> cc'ing Li Zhong who's working on a simliar issue in the following
> thread and quoting whole body.
> 
>   http://thread.gmane.org/gmane.linux.kernel/1680706
> 
> Li, this is another variation of the same problem.  Maybe this can be
> covered by your work too?

It seems to me that it is about write something to driver attribute, and
driver unloading. If so, maybe it's not easy to reuse the help functions
created for device attribute, and device removing.

But I guess the idea to break the active protection could still be
applied here:

Maybe we could try_module_get() here (like the other option suggested by
Johan?), and break active protection if we could get the module,
something like below? 

================
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 83e910a..6ce27e0 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -69,9 +69,24 @@ static ssize_t drv_attr_store(struct kobject *kobj, struct attribute *attr,
 	struct driver_attribute *drv_attr = to_drv_attr(attr);
 	struct driver_private *drv_priv = to_driver(kobj);
 	ssize_t ret = -EIO;
+	struct kernfs_node *kn;
+
+	if (!try_module_get(drv_priv->driver->owner))
+		return ret;
+
+	kn = kernfs_find_and_get(kobj->sd, attr->name);
+	if (WARN_ON_ONCE(!kn))
+		return ret;
+
+	kernfs_break_active_protection(kn);
 
 	if (drv_attr->store)
 		ret = drv_attr->store(drv_priv->driver, buf, count);
+
+	kernfs_unbreak_active_protection(kn);
+	kernfs_put(kn);
+
+	module_put(drv_priv->driver->owner);
 	return ret;
 }
 



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-24  8:29   ` Li Zhong
@ 2014-04-24 14:35     ` Tejun Heo
  2014-04-24 14:52       ` Johan Hovold
  2014-04-25  2:15       ` Li Zhong
  0 siblings, 2 replies; 15+ messages in thread
From: Tejun Heo @ 2014-04-24 14:35 UTC (permalink / raw)
  To: Li Zhong
  Cc: Johan Hovold, Greg Kroah-Hartman, Alan Stern, linux-usb,
	linux-kernel, rafael.j.wysocki

On Thu, Apr 24, 2014 at 04:29:15PM +0800, Li Zhong wrote:
> On Wed, 2014-04-23 at 10:19 -0400, Tejun Heo wrote:
> > cc'ing Li Zhong who's working on a simliar issue in the following
> > thread and quoting whole body.
> > 
> >   http://thread.gmane.org/gmane.linux.kernel/1680706
> > 
> > Li, this is another variation of the same problem.  Maybe this can be
> > covered by your work too?
> 
> It seems to me that it is about write something to driver attribute, and
> driver unloading. If so, maybe it's not easy to reuse the help functions
> created for device attribute, and device removing.
> 
> But I guess the idea to break the active protection could still be
> applied here:
> 
> Maybe we could try_module_get() here (like the other option suggested by
> Johan?), and break active protection if we could get the module,
> something like below? 

I don't get why try_module_get() matters here.  We can't call into
->store if the object at hand is already destroyed and the underlying
module can't go away if the target device is still alive.
try_module_get() doesn't actually protect the object.  Why does that
matter?  This is self removal, right?  Can you please take a look at
kernfs_remove_self()?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-24 14:35     ` Tejun Heo
@ 2014-04-24 14:52       ` Johan Hovold
  2014-04-25  2:16         ` Li Zhong
  2014-04-25  2:15       ` Li Zhong
  1 sibling, 1 reply; 15+ messages in thread
From: Johan Hovold @ 2014-04-24 14:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Li Zhong, Johan Hovold, Greg Kroah-Hartman, Alan Stern,
	linux-usb, linux-kernel, rafael.j.wysocki

On Thu, Apr 24, 2014 at 10:35:17AM -0400, Tejun Heo wrote:
> On Thu, Apr 24, 2014 at 04:29:15PM +0800, Li Zhong wrote:
> > On Wed, 2014-04-23 at 10:19 -0400, Tejun Heo wrote:
> > > cc'ing Li Zhong who's working on a simliar issue in the following
> > > thread and quoting whole body.
> > > 
> > >   http://thread.gmane.org/gmane.linux.kernel/1680706
> > > 
> > > Li, this is another variation of the same problem.  Maybe this can be
> > > covered by your work too?
> > 
> > It seems to me that it is about write something to driver attribute, and
> > driver unloading. If so, maybe it's not easy to reuse the help functions
> > created for device attribute, and device removing.
> > 
> > But I guess the idea to break the active protection could still be
> > applied here:
> > 
> > Maybe we could try_module_get() here (like the other option suggested by
> > Johan?), and break active protection if we could get the module,
> > something like below? 
> 
> I don't get why try_module_get() matters here.  We can't call into
> ->store if the object at hand is already destroyed and the underlying
> module can't go away if the target device is still alive.
> try_module_get() doesn't actually protect the object.  Why does that
> matter?  This is self removal, right?  Can you please take a look at
> kernfs_remove_self()?

No, this isn't self removal. The driver-attribute (not device-attribute)
store operation simply grabs a lock that is also held while the driver
is being deregistered at module unload. Taking a reference to the module
in this case will prevent deregistration while store is running.

But it seems like this can be solved for usb-serial by simply not
holding the lock while deregistering.

Johan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-24 14:35     ` Tejun Heo
  2014-04-24 14:52       ` Johan Hovold
@ 2014-04-25  2:15       ` Li Zhong
  2014-04-25 13:54         ` Alan Stern
  1 sibling, 1 reply; 15+ messages in thread
From: Li Zhong @ 2014-04-25  2:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Johan Hovold, Greg Kroah-Hartman, Alan Stern, linux-usb,
	linux-kernel, rafael.j.wysocki

On Thu, 2014-04-24 at 10:35 -0400, Tejun Heo wrote:
> On Thu, Apr 24, 2014 at 04:29:15PM +0800, Li Zhong wrote:
> > On Wed, 2014-04-23 at 10:19 -0400, Tejun Heo wrote:
> > > cc'ing Li Zhong who's working on a simliar issue in the following
> > > thread and quoting whole body.
> > > 
> > >   http://thread.gmane.org/gmane.linux.kernel/1680706
> > > 
> > > Li, this is another variation of the same problem.  Maybe this can be
> > > covered by your work too?
> > 
> > It seems to me that it is about write something to driver attribute, and
> > driver unloading. If so, maybe it's not easy to reuse the help functions
> > created for device attribute, and device removing.
> > 
> > But I guess the idea to break the active protection could still be
> > applied here:
> > 
> > Maybe we could try_module_get() here (like the other option suggested by
> > Johan?), and break active protection if we could get the module,
> > something like below? 
> 
> I don't get why try_module_get() matters here.  We can't call into
> ->store if the object at hand is already destroyed and the underlying
> module can't go away if the target device is still alive.
> try_module_get() doesn't actually protect the object.  Why does that
> matter?  This is self removal, right?  Can you please take a look at
> kernfs_remove_self()?

This is about one process writing something to driver attributes, and
one process trying to unload this driver. 

I think try_module_get() could detect whether the driver is being
unloaded, and if not, prevent it from being unloaded, so it could
protect the object here by not allow the driver to be unloaded.

And if the driver is being unloaded, we could abort the write operation
directly. 

Thanks, Zhong

> 
> Thanks.
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-24 14:52       ` Johan Hovold
@ 2014-04-25  2:16         ` Li Zhong
  2014-04-25 10:15           ` Johan Hovold
  2014-04-25 13:59           ` Alan Stern
  0 siblings, 2 replies; 15+ messages in thread
From: Li Zhong @ 2014-04-25  2:16 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Tejun Heo, Greg Kroah-Hartman, Alan Stern, linux-usb,
	linux-kernel, rafael.j.wysocki

On Thu, 2014-04-24 at 16:52 +0200, Johan Hovold wrote:
> On Thu, Apr 24, 2014 at 10:35:17AM -0400, Tejun Heo wrote:
> > On Thu, Apr 24, 2014 at 04:29:15PM +0800, Li Zhong wrote:
> > > On Wed, 2014-04-23 at 10:19 -0400, Tejun Heo wrote:
> > > > cc'ing Li Zhong who's working on a simliar issue in the following
> > > > thread and quoting whole body.
> > > > 
> > > >   http://thread.gmane.org/gmane.linux.kernel/1680706
> > > > 
> > > > Li, this is another variation of the same problem.  Maybe this can be
> > > > covered by your work too?
> > > 
> > > It seems to me that it is about write something to driver attribute, and
> > > driver unloading. If so, maybe it's not easy to reuse the help functions
> > > created for device attribute, and device removing.
> > > 
> > > But I guess the idea to break the active protection could still be
> > > applied here:
> > > 
> > > Maybe we could try_module_get() here (like the other option suggested by
> > > Johan?), and break active protection if we could get the module,
> > > something like below? 
> > 
> > I don't get why try_module_get() matters here.  We can't call into
> > ->store if the object at hand is already destroyed and the underlying
> > module can't go away if the target device is still alive.
> > try_module_get() doesn't actually protect the object.  Why does that
> > matter?  This is self removal, right?  Can you please take a look at
> > kernfs_remove_self()?
> 
> No, this isn't self removal. The driver-attribute (not device-attribute)
> store operation simply grabs a lock that is also held while the driver
> is being deregistered at module unload. Taking a reference to the module
> in this case will prevent deregistration while store is running.
> 
> But it seems like this can be solved for usb-serial by simply not
> holding the lock while deregistering.

I didn't look carefully about this lock. 

But I'm not sure whether there are such requirements for driver
attributes:

some lock needs be grabbed in the driver attributes store callbacks, and
the same lock also needs to be grabbed during driver unregister. 

If we have such requirements currently or in the future, I think they
could all be solved by breaking active protection after get the module
reference.

Thanks, Zhong

> 
> Johan
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-25  2:16         ` Li Zhong
@ 2014-04-25 10:15           ` Johan Hovold
  2014-04-28  0:39             ` Li Zhong
  2014-04-25 13:59           ` Alan Stern
  1 sibling, 1 reply; 15+ messages in thread
From: Johan Hovold @ 2014-04-25 10:15 UTC (permalink / raw)
  To: Li Zhong
  Cc: Johan Hovold, Tejun Heo, Greg Kroah-Hartman, Alan Stern,
	linux-usb, linux-kernel, rafael.j.wysocki

On Fri, Apr 25, 2014 at 10:16:57AM +0800, Li Zhong wrote:
> On Thu, 2014-04-24 at 16:52 +0200, Johan Hovold wrote:

> > No, this isn't self removal. The driver-attribute (not device-attribute)
> > store operation simply grabs a lock that is also held while the driver
> > is being deregistered at module unload. Taking a reference to the module
> > in this case will prevent deregistration while store is running.
> > 
> > But it seems like this can be solved for usb-serial by simply not
> > holding the lock while deregistering.
> 
> I didn't look carefully about this lock. 
> 
> But I'm not sure whether there are such requirements for driver
> attributes:
> 
> some lock needs be grabbed in the driver attributes store callbacks, and
> the same lock also needs to be grabbed during driver unregister. 
> 
> If we have such requirements currently or in the future, I think they
> could all be solved by breaking active protection after get the module
> reference.

Yes, if we find any such cases, but I don't think it should be done
generally (as in one of your earlier posts).

Active protection is usually exactly what prevents the driver from being
deregistered, and would only need to be broken to avoid any ABBA
deadlocks from being reported by lockdep (the module reference would be
enough to prevent the actual deadlock).

Thanks,
Johan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-25  2:15       ` Li Zhong
@ 2014-04-25 13:54         ` Alan Stern
  2014-04-25 15:13           ` Johan Hovold
  2014-04-28  1:55           ` Li Zhong
  0 siblings, 2 replies; 15+ messages in thread
From: Alan Stern @ 2014-04-25 13:54 UTC (permalink / raw)
  To: Li Zhong
  Cc: Tejun Heo, Johan Hovold, Greg Kroah-Hartman, linux-usb,
	linux-kernel, rafael.j.wysocki

On Fri, 25 Apr 2014, Li Zhong wrote:

> > I don't get why try_module_get() matters here.  We can't call into
> > ->store if the object at hand is already destroyed and the underlying
> > module can't go away if the target device is still alive.
> > try_module_get() doesn't actually protect the object.  Why does that
> > matter?  This is self removal, right?  Can you please take a look at
> > kernfs_remove_self()?
> 
> This is about one process writing something to driver attributes, and
> one process trying to unload this driver. 
> 
> I think try_module_get() could detect whether the driver is being
> unloaded, and if not, prevent it from being unloaded, so it could
> protect the object here by not allow the driver to be unloaded.

That isn't how try_module_get() works.  If the module is being 
unloaded, try_module_get() simply fails.  It does not prevent the 
module from being unloaded -- that's why its name begins with "try".

Alan Stern


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-25  2:16         ` Li Zhong
  2014-04-25 10:15           ` Johan Hovold
@ 2014-04-25 13:59           ` Alan Stern
  2014-04-28  1:58             ` Li Zhong
  1 sibling, 1 reply; 15+ messages in thread
From: Alan Stern @ 2014-04-25 13:59 UTC (permalink / raw)
  To: Li Zhong
  Cc: Johan Hovold, Tejun Heo, Greg Kroah-Hartman, linux-usb,
	linux-kernel, rafael.j.wysocki

On Fri, 25 Apr 2014, Li Zhong wrote:

> > No, this isn't self removal. The driver-attribute (not device-attribute)
> > store operation simply grabs a lock that is also held while the driver
> > is being deregistered at module unload. Taking a reference to the module
> > in this case will prevent deregistration while store is running.
> > 
> > But it seems like this can be solved for usb-serial by simply not
> > holding the lock while deregistering.
> 
> I didn't look carefully about this lock. 
> 
> But I'm not sure whether there are such requirements for driver
> attributes:
> 
> some lock needs be grabbed in the driver attributes store callbacks, and
> the same lock also needs to be grabbed during driver unregister. 

In this case, the lock does _not_ need to be grabbed during driver 
unregister.  The driver grabs the lock, but it doesn't need to.

> If we have such requirements currently or in the future, I think they
> could all be solved by breaking active protection after get the module
> reference.

No!  That would be very bad.

Unloading modules is quite different from unbinding drivers.  After the
driver is unbound, its attribute callback routines can continue to run.  
But after a driver module has been unloaded, its attribute callback 
routines _cannot_ run because they aren't present in memory any more.

If we allowed a module to be unloaded while one of its callbacks was 
running (because active protection was broken), imagine what would 
happen...

Alan Stern


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-25 13:54         ` Alan Stern
@ 2014-04-25 15:13           ` Johan Hovold
  2014-04-28  1:55           ` Li Zhong
  1 sibling, 0 replies; 15+ messages in thread
From: Johan Hovold @ 2014-04-25 15:13 UTC (permalink / raw)
  To: Alan Stern
  Cc: Li Zhong, Tejun Heo, Johan Hovold, Greg Kroah-Hartman, linux-usb,
	linux-kernel, rafael.j.wysocki

On Fri, Apr 25, 2014 at 09:54:33AM -0400, Alan Stern wrote:
> On Fri, 25 Apr 2014, Li Zhong wrote:
> 
> > > I don't get why try_module_get() matters here.  We can't call into
> > > ->store if the object at hand is already destroyed and the underlying
> > > module can't go away if the target device is still alive.
> > > try_module_get() doesn't actually protect the object.  Why does that
> > > matter?  This is self removal, right?  Can you please take a look at
> > > kernfs_remove_self()?
> > 
> > This is about one process writing something to driver attributes, and
> > one process trying to unload this driver. 
> > 
> > I think try_module_get() could detect whether the driver is being
> > unloaded, and if not, prevent it from being unloaded, so it could
> > protect the object here by not allow the driver to be unloaded.
> 
> That isn't how try_module_get() works.  If the module is being 
> unloaded, try_module_get() simply fails.  It does not prevent the 
> module from being unloaded -- that's why its name begins with "try".

Well, to be fair, if the try_module_get() *succeeds* it will prevent the
module from being unloaded (and otherwise the sysfs operation bails
out).

Johan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-25 10:15           ` Johan Hovold
@ 2014-04-28  0:39             ` Li Zhong
  2014-05-02 15:20               ` Tejun Heo
  0 siblings, 1 reply; 15+ messages in thread
From: Li Zhong @ 2014-04-28  0:39 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Tejun Heo, Greg Kroah-Hartman, Alan Stern, linux-usb,
	linux-kernel, rafael.j.wysocki

On Fri, 2014-04-25 at 12:15 +0200, Johan Hovold wrote:
> On Fri, Apr 25, 2014 at 10:16:57AM +0800, Li Zhong wrote:
> > On Thu, 2014-04-24 at 16:52 +0200, Johan Hovold wrote:
> 
> > > No, this isn't self removal. The driver-attribute (not device-attribute)
> > > store operation simply grabs a lock that is also held while the driver
> > > is being deregistered at module unload. Taking a reference to the module
> > > in this case will prevent deregistration while store is running.
> > > 
> > > But it seems like this can be solved for usb-serial by simply not
> > > holding the lock while deregistering.
> > 
> > I didn't look carefully about this lock. 
> > 
> > But I'm not sure whether there are such requirements for driver
> > attributes:
> > 
> > some lock needs be grabbed in the driver attributes store callbacks, and
> > the same lock also needs to be grabbed during driver unregister. 
> > 
> > If we have such requirements currently or in the future, I think they
> > could all be solved by breaking active protection after get the module
> > reference.
> 
> Yes, if we find any such cases, but I don't think it should be done
> generally (as in one of your earlier posts).

OK, maybe we only need to reconsider this only if we see some more
similar lockdep warnings in the future. 

> 
> Active protection is usually exactly what prevents the driver from being
> deregistered, and would only need to be broken to avoid any ABBA
> deadlocks from being reported by lockdep (the module reference would be
> enough to prevent the actual deadlock).

Yes, maybe try to get the module reference is not bad before writing to
driver attributes, as it doesn't make much sense to really call the
callbacks for the driver attribute if the driver is being unload.

And after we get the reference, it is safe for us to break the active.
But if we don't have such real cases(lockdep warnings), we actually
don't need to break it. 

Thanks, Zhong

> 
> Thanks,
> Johan
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-25 13:54         ` Alan Stern
  2014-04-25 15:13           ` Johan Hovold
@ 2014-04-28  1:55           ` Li Zhong
  1 sibling, 0 replies; 15+ messages in thread
From: Li Zhong @ 2014-04-28  1:55 UTC (permalink / raw)
  To: Alan Stern
  Cc: Tejun Heo, Johan Hovold, Greg Kroah-Hartman, linux-usb,
	linux-kernel, rafael.j.wysocki

On Fri, 2014-04-25 at 09:54 -0400, Alan Stern wrote:
> On Fri, 25 Apr 2014, Li Zhong wrote:
> 
> > > I don't get why try_module_get() matters here.  We can't call into
> > > ->store if the object at hand is already destroyed and the underlying
> > > module can't go away if the target device is still alive.
> > > try_module_get() doesn't actually protect the object.  Why does that
> > > matter?  This is self removal, right?  Can you please take a look at
> > > kernfs_remove_self()?
> > 
> > This is about one process writing something to driver attributes, and
> > one process trying to unload this driver. 
> > 
> > I think try_module_get() could detect whether the driver is being
> > unloaded, and if not, prevent it from being unloaded, so it could
> > protect the object here by not allow the driver to be unloaded.
> 
> That isn't how try_module_get() works.  If the module is being 
> unloaded, try_module_get() simply fails.  It does not prevent the 
> module from being unloaded -- that's why its name begins with "try".

Yes, I know that. What I said above is for the case when
try_module_get() detects the driver is NOT being unloaded, and increases
the reference counter. 

Thanks, Zhong
> 
> Alan Stern
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-25 13:59           ` Alan Stern
@ 2014-04-28  1:58             ` Li Zhong
  0 siblings, 0 replies; 15+ messages in thread
From: Li Zhong @ 2014-04-28  1:58 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johan Hovold, Tejun Heo, Greg Kroah-Hartman, linux-usb,
	linux-kernel, rafael.j.wysocki

On Fri, 2014-04-25 at 09:59 -0400, Alan Stern wrote:
> On Fri, 25 Apr 2014, Li Zhong wrote:
> 
> > > No, this isn't self removal. The driver-attribute (not device-attribute)
> > > store operation simply grabs a lock that is also held while the driver
> > > is being deregistered at module unload. Taking a reference to the module
> > > in this case will prevent deregistration while store is running.
> > > 
> > > But it seems like this can be solved for usb-serial by simply not
> > > holding the lock while deregistering.
> > 
> > I didn't look carefully about this lock. 
> > 
> > But I'm not sure whether there are such requirements for driver
> > attributes:
> > 
> > some lock needs be grabbed in the driver attributes store callbacks, and
> > the same lock also needs to be grabbed during driver unregister. 
> 
> In this case, the lock does _not_ need to be grabbed during driver 
> unregister.  The driver grabs the lock, but it doesn't need to.

OK.

> 
> > If we have such requirements currently or in the future, I think they
> > could all be solved by breaking active protection after get the module
> > reference.
> 
> No!  That would be very bad.
> 
> Unloading modules is quite different from unbinding drivers.  After the
> driver is unbound, its attribute callback routines can continue to run.  
> But after a driver module has been unloaded, its attribute callback 
> routines _cannot_ run because they aren't present in memory any more.
> 
> If we allowed a module to be unloaded while one of its callbacks was 
> running (because active protection was broken), imagine what would 
> happen...

I don't think the module could be unloaded after we increased the module
reference counter. 

Thanks, Zhong

> 
> Alan Stern
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
  2014-04-28  0:39             ` Li Zhong
@ 2014-05-02 15:20               ` Tejun Heo
  0 siblings, 0 replies; 15+ messages in thread
From: Tejun Heo @ 2014-05-02 15:20 UTC (permalink / raw)
  To: Li Zhong
  Cc: Johan Hovold, Greg Kroah-Hartman, Alan Stern, linux-usb,
	linux-kernel, rafael.j.wysocki

On Mon, Apr 28, 2014 at 08:39:47AM +0800, Li Zhong wrote:
> Yes, maybe try to get the module reference is not bad before writing to
> driver attributes, as it doesn't make much sense to really call the
> callbacks for the driver attribute if the driver is being unload.

Please don't do that spuriously.  Active protection is the primary
mechanism for that sort of protection and adding spurious things just
make them confusing.

> And after we get the reference, it is safe for us to break the active.
> But if we don't have such real cases(lockdep warnings), we actually
> don't need to break it. 

Yeah, for cases where active protection should be broken, other
measures should be taken to prevent the underlying data structure /
code from going away.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-05-02 15:20 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-23  9:32 [PATCH] USB: serial: fix sysfs-attribute removal deadlock Johan Hovold
2014-04-23 14:19 ` Tejun Heo
2014-04-24  8:29   ` Li Zhong
2014-04-24 14:35     ` Tejun Heo
2014-04-24 14:52       ` Johan Hovold
2014-04-25  2:16         ` Li Zhong
2014-04-25 10:15           ` Johan Hovold
2014-04-28  0:39             ` Li Zhong
2014-05-02 15:20               ` Tejun Heo
2014-04-25 13:59           ` Alan Stern
2014-04-28  1:58             ` Li Zhong
2014-04-25  2:15       ` Li Zhong
2014-04-25 13:54         ` Alan Stern
2014-04-25 15:13           ` Johan Hovold
2014-04-28  1:55           ` Li Zhong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).