Linux-USB Archive on lore.kernel.org
* [PATCH v5] usb: core: Solve race condition in anchor cleanup functions
@ 2020-07-31  5:46 eli.billauer
  2020-07-31 13:57 ` Alan Stern
  2020-08-03  8:38 ` Oliver Neukum
  0 siblings, 2 replies; 3+ messages in thread
From: eli.billauer @ 2020-07-31  5:46 UTC
  To: gregkh, linux-usb; +Cc: hdegoede, oneukum, stern, Eli Billauer

From: Eli Billauer <eli.billauer@gmail.com>

usb_kill_anchored_urbs() is commonly used to cancel all URBs on an
anchor just before releasing resources which the URBs rely on. Callers
therefore rely on no completer callback being made for any URB on the
anchor after the function returns.

However, if this function is called in parallel with
__usb_hcd_giveback_urb() processing a URB on the anchor, the latter may
call the completer callback after usb_kill_anchored_urbs() returns. This
can lead to a kernel panic due to a use-after-free in interrupt context.

The race condition is that __usb_hcd_giveback_urb() first unanchors the
URB and then makes the completer callback. Such a URB is hence invisible
to usb_kill_anchored_urbs(), which sees an empty urb_list on the anchor
and may return before the completer has been called.
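
The window described above can be illustrated with a small userspace
model. This is only a sketch: struct model_anchor and the model_*
functions are hypothetical stand-ins invented for illustration, not
kernel APIs, and the two halves of the giveback are interleaved by hand
rather than run on a second CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_anchor {
	int anchored;          /* number of URBs on the anchor's list */
	int callbacks_pending; /* unanchored URBs whose completer has not run */
};

/* First half of giveback: the URB leaves the anchor before its callback. */
static void model_giveback_unanchor(struct model_anchor *a)
{
	assert(a->anchored > 0);
	a->anchored--;
	a->callbacks_pending++;
}

/* Second half of giveback: the completer callback finally runs. */
static void model_giveback_complete(struct model_anchor *a)
{
	assert(a->callbacks_pending > 0);
	a->callbacks_pending--;
}

/* What the pre-patch loop condition looks at: only the list itself. */
static bool model_list_empty(const struct model_anchor *a)
{
	return a->anchored == 0;
}

/* Returns true if a kill based on list emptiness alone would return early. */
static bool model_kill_returns_too_early(void)
{
	struct model_anchor a = { .anchored = 1, .callbacks_pending = 0 };
	bool early;

	model_giveback_unanchor(&a);   /* giveback has started elsewhere */
	/* The list looks empty, yet a completer callback is outstanding. */
	early = model_list_empty(&a) && a.callbacks_pending > 0;
	model_giveback_complete(&a);   /* callback runs after the "return" */
	return early;
}
```

In this model model_kill_returns_too_early() returns true, which
corresponds to the kill function returning while a callback is still
due.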

Even worse, if the racing completer callback resubmits the URB, it may
remain in the system long after usb_kill_anchored_urbs() returns.

Hence list_empty(&anchor->urb_list), which is used in the existing
while-loop, doesn't reliably ensure that all URBs of the anchor are gone.

A similar problem exists with usb_poison_anchored_urbs() and
usb_scuttle_anchored_urbs().

This patch adds an external do-while loop, which ensures that all URBs
are indeed handled before these three functions return. This change has
no effect at all unless the race condition occurs, in which case the
loop will busy-wait until the racing completer callback has finished.
This is a rare condition, so the CPU waste of this spinning is
negligible.

The additional do-while loop relies on usb_anchor_check_wakeup(), which
returns true iff the anchor list is empty and no
__usb_hcd_giveback_urb() in the system is in the middle of the
unanchor-before-complete phase. It uses the @suspend_wakeups member of
struct usb_anchor for this purpose; that member was introduced in commit
6ec4147e7bdb ("usb-anchor: Delay usb_wait_anchor_empty_timeout wake up
till completion is done") to solve another problem caused by the same
race condition.
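
As a rough illustration of the condition described above, the following
userspace sketch models the two pieces of state the check combines. The
names sketch_anchor, sketch_check_wakeup() and sketch_trace() are
hypothetical, and the model assumes that the in-flight counter is raised
before the URB is unanchored and lowered after the callback has
finished.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two fields the check depends on. */
struct sketch_anchor {
	int urb_count;              /* stands in for anchor->urb_list */
	atomic_int suspend_wakeups; /* givebacks between unanchor and callback end */
};

/* Modeled on the description: empty list AND no giveback in flight. */
static bool sketch_check_wakeup(struct sketch_anchor *a)
{
	return atomic_load(&a->suspend_wakeups) == 0 && a->urb_count == 0;
}

/* Walks one URB through giveback and records the check at each step. */
static int sketch_trace(void)
{
	struct sketch_anchor a = { .urb_count = 1 };
	int bits = 0;

	atomic_init(&a.suspend_wakeups, 0);
	bits |= sketch_check_wakeup(&a) << 0;     /* still anchored */

	atomic_fetch_add(&a.suspend_wakeups, 1);  /* giveback begins */
	assert(a.urb_count > 0);
	a.urb_count--;                            /* unanchored, callback pending */
	bits |= sketch_check_wakeup(&a) << 1;     /* list empty, giveback in flight */

	atomic_fetch_sub(&a.suspend_wakeups, 1);  /* callback has finished */
	bits |= sketch_check_wakeup(&a) << 2;     /* empty and nothing in flight */
	return bits;
}
```

Only the third check succeeds in this model, so sketch_trace() returns 4
(bit 2 set). The second check is the case that list emptiness alone
would get wrong.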

The surely_empty variable is necessary because usb_anchor_check_wakeup()
must be called with the lock held to prevent races. However, the spinlock
must be released and reacquired whenever the outer loop spins with an
empty URB list while waiting for the unanchor-before-complete passage to
finish, because the completer callback may well attempt to take the very
same lock.
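
The need to drop the lock between passes can be sketched with a
single-threaded model in which the "other CPU" is stepped by hand. All
names here (drain_model, completer_step(), drain() and the two wrappers)
are hypothetical, and the boolean lock_held merely stands in for the
anchor's spinlock; the max_passes budget exists only so that the model
terminates in the case where real code would spin forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; no real threads or spinlocks. */
struct drain_model {
	bool lock_held; /* stands in for anchor->lock */
	int in_flight;  /* givebacks past unanchor, callback not finished */
	int urb_count;  /* URBs still on the anchor's list */
};

/* One step of the racing completer: it needs the lock to make progress. */
static void completer_step(struct drain_model *m)
{
	if (m->in_flight == 0 || m->lock_held)
		return;     /* nothing to do, or blocked on the lock */
	m->in_flight--; /* callback took the lock, ran, and finished */
}

/*
 * Outer loop of the drain. Returns the number of passes needed, or -1 if
 * max_passes ran out before the anchor was surely empty.
 */
static int drain(struct drain_model *m, bool drop_lock, int max_passes)
{
	assert(max_passes > 0);
	for (int pass = 1; pass <= max_passes; pass++) {
		bool surely_empty;

		m->lock_held = true;
		/* The check itself is made with the lock held. */
		surely_empty = (m->urb_count == 0 && m->in_flight == 0);
		if (drop_lock)
			m->lock_held = false; /* release before spinning again */
		completer_step(m);        /* the other CPU gets a chance to run */
		if (surely_empty) {
			m->lock_held = false;
			return pass;
		}
	}
	m->lock_held = false;
	return -1;
}

/* Lock released between passes: the completer finishes, the loop ends. */
static int drain_with_unlock(void)
{
	struct drain_model m = { .lock_held = false, .in_flight = 1, .urb_count = 0 };
	return drain(&m, true, 100);
}

/* Lock kept across passes: the completer never gets in. */
static int drain_without_unlock(void)
{
	struct drain_model m = { .lock_held = false, .in_flight = 1, .urb_count = 0 };
	return drain(&m, false, 100);
}
```

With the lock released on each pass the model finishes on the second
pass; with the lock held throughout, the completer is never able to run
and the pass budget is exhausted.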

To summarize, using usb_anchor_check_wakeup() means that the patched
functions can return only when the anchor's list is empty and there is
no invisible URB being processed. Since the inner while loop finishes on
the empty list condition, the new do-while loop normally terminates
after a single pass; it repeats only when the said race condition
occurs, and then only until the racing completer callback has finished.

Signed-off-by: Eli Billauer <eli.billauer@gmail.com>
---
Difference from patch v4: Added cpu_relax() calls per Alan's advice.
 drivers/usb/core/urb.c | 89 +++++++++++++++++++++++++-----------------
 1 file changed, 54 insertions(+), 35 deletions(-)

diff --git a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c
index da923ec17612..31ca5abb4c12 100644
--- a/drivers/usb/core/urb.c
+++ b/drivers/usb/core/urb.c
@@ -772,11 +772,12 @@ void usb_block_urb(struct urb *urb)
 EXPORT_SYMBOL_GPL(usb_block_urb);
 
 /**
- * usb_kill_anchored_urbs - cancel transfer requests en masse
+ * usb_kill_anchored_urbs - kill all URBs associated with an anchor
  * @anchor: anchor the requests are bound to
  *
- * this allows all outstanding URBs to be killed starting
- * from the back of the queue
+ * This kills all outstanding URBs starting from the back of the queue,
+ * with guarantee that no completer callbacks will take place from the
+ * anchor after this function returns.
  *
  * This routine should not be called by a driver after its disconnect
  * method has returned.
@@ -784,20 +785,26 @@ EXPORT_SYMBOL_GPL(usb_block_urb);
 void usb_kill_anchored_urbs(struct usb_anchor *anchor)
 {
 	struct urb *victim;
+	int surely_empty;
 
-	spin_lock_irq(&anchor->lock);
-	while (!list_empty(&anchor->urb_list)) {
-		victim = list_entry(anchor->urb_list.prev, struct urb,
-				    anchor_list);
-		/* we must make sure the URB isn't freed before we kill it*/
-		usb_get_urb(victim);
-		spin_unlock_irq(&anchor->lock);
-		/* this will unanchor the URB */
-		usb_kill_urb(victim);
-		usb_put_urb(victim);
+	do {
 		spin_lock_irq(&anchor->lock);
-	}
-	spin_unlock_irq(&anchor->lock);
+		while (!list_empty(&anchor->urb_list)) {
+			victim = list_entry(anchor->urb_list.prev,
+					    struct urb, anchor_list);
+			/* make sure the URB isn't freed before we kill it */
+			usb_get_urb(victim);
+			spin_unlock_irq(&anchor->lock);
+			/* this will unanchor the URB */
+			usb_kill_urb(victim);
+			usb_put_urb(victim);
+			spin_lock_irq(&anchor->lock);
+		}
+		surely_empty = usb_anchor_check_wakeup(anchor);
+
+		spin_unlock_irq(&anchor->lock);
+		cpu_relax();
+	} while (!surely_empty);
 }
 EXPORT_SYMBOL_GPL(usb_kill_anchored_urbs);
 
@@ -816,21 +823,27 @@ EXPORT_SYMBOL_GPL(usb_kill_anchored_urbs);
 void usb_poison_anchored_urbs(struct usb_anchor *anchor)
 {
 	struct urb *victim;
+	int surely_empty;
 
-	spin_lock_irq(&anchor->lock);
-	anchor->poisoned = 1;
-	while (!list_empty(&anchor->urb_list)) {
-		victim = list_entry(anchor->urb_list.prev, struct urb,
-				    anchor_list);
-		/* we must make sure the URB isn't freed before we kill it*/
-		usb_get_urb(victim);
-		spin_unlock_irq(&anchor->lock);
-		/* this will unanchor the URB */
-		usb_poison_urb(victim);
-		usb_put_urb(victim);
+	do {
 		spin_lock_irq(&anchor->lock);
-	}
-	spin_unlock_irq(&anchor->lock);
+		anchor->poisoned = 1;
+		while (!list_empty(&anchor->urb_list)) {
+			victim = list_entry(anchor->urb_list.prev,
+					    struct urb, anchor_list);
+			/* make sure the URB isn't freed before we kill it */
+			usb_get_urb(victim);
+			spin_unlock_irq(&anchor->lock);
+			/* this will unanchor the URB */
+			usb_poison_urb(victim);
+			usb_put_urb(victim);
+			spin_lock_irq(&anchor->lock);
+		}
+		surely_empty = usb_anchor_check_wakeup(anchor);
+
+		spin_unlock_irq(&anchor->lock);
+		cpu_relax();
+	} while (!surely_empty);
 }
 EXPORT_SYMBOL_GPL(usb_poison_anchored_urbs);
 
@@ -970,14 +983,20 @@ void usb_scuttle_anchored_urbs(struct usb_anchor *anchor)
 {
 	struct urb *victim;
 	unsigned long flags;
+	int surely_empty;
+
+	do {
+		spin_lock_irqsave(&anchor->lock, flags);
+		while (!list_empty(&anchor->urb_list)) {
+			victim = list_entry(anchor->urb_list.prev,
+					    struct urb, anchor_list);
+			__usb_unanchor_urb(victim, anchor);
+		}
+		surely_empty = usb_anchor_check_wakeup(anchor);
 
-	spin_lock_irqsave(&anchor->lock, flags);
-	while (!list_empty(&anchor->urb_list)) {
-		victim = list_entry(anchor->urb_list.prev, struct urb,
-				    anchor_list);
-		__usb_unanchor_urb(victim, anchor);
-	}
-	spin_unlock_irqrestore(&anchor->lock, flags);
+		spin_unlock_irqrestore(&anchor->lock, flags);
+		cpu_relax();
+	} while (!surely_empty);
 }
 
 EXPORT_SYMBOL_GPL(usb_scuttle_anchored_urbs);
-- 
2.17.1



* Re: [PATCH v5] usb: core: Solve race condition in anchor cleanup functions
  2020-07-31  5:46 [PATCH v5] usb: core: Solve race condition in anchor cleanup functions eli.billauer
@ 2020-07-31 13:57 ` Alan Stern
  2020-08-03  8:38 ` Oliver Neukum
  1 sibling, 0 replies; 3+ messages in thread
From: Alan Stern @ 2020-07-31 13:57 UTC
  To: eli.billauer; +Cc: gregkh, linux-usb, hdegoede, oneukum

On Fri, Jul 31, 2020 at 08:46:50AM +0300, eli.billauer@gmail.com wrote:

Acked-by: Alan Stern <stern@rowland.harvard.edu>


* Re: [PATCH v5] usb: core: Solve race condition in anchor cleanup functions
  2020-07-31  5:46 [PATCH v5] usb: core: Solve race condition in anchor cleanup functions eli.billauer
  2020-07-31 13:57 ` Alan Stern
@ 2020-08-03  8:38 ` Oliver Neukum
  1 sibling, 0 replies; 3+ messages in thread
From: Oliver Neukum @ 2020-08-03  8:38 UTC
  To: eli.billauer, gregkh, linux-usb; +Cc: hdegoede, stern

On Friday, 2020-07-31 at 08:46 +0300, eli.billauer@gmail.com wrote:
Acked-by: Oliver Neukum <oneukum@suse.com>
