[RE-RESEND,1/4] usb: musb: Fix runtime PM race in musb_queue_resume_work
diff mbox series

Message ID 20210123142502.16980-1-paul@crapouillou.net
State Accepted
Commit 0eaa1a3714db34a59ce121de5733c3909c529463
Headers show
Series
  • [RE-RESEND,1/4] usb: musb: Fix runtime PM race in musb_queue_resume_work
Related show

Commit Message

Paul Cercueil Jan. 23, 2021, 2:24 p.m. UTC
musb_queue_resume_work() would call the provided callback if the runtime
PM status was 'active'. Otherwise, it would enqueue the request if the
hardware was still suspended (musb->is_runtime_suspended is true).

This causes a race with the runtime PM handlers, as it is possible to be
in the case where the runtime PM status is not yet 'active', but the
hardware has been awaken (PM resume function has been called).

When hitting the race, the resume work was not enqueued, which probably
triggered other bugs further down the stack. For instance, a telnet
connection on Ingenic SoCs would result in a 50/50 chance of a
segmentation fault somewhere in the musb code.

Rework the code so that either we call the callback directly if
(musb->is_runtime_suspended == 0), or enqueue the query otherwise.

Fixes: ea2f35c01d5e ("usb: musb: Fix sleeping function called from invalid context for hdrc glue")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
---
 drivers/usb/musb/musb_core.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

Comments

Sergei Shtylyov Jan. 23, 2021, 4:41 p.m. UTC | #1
On 1/23/21 5:24 PM, Paul Cercueil wrote:

> musb_queue_resume_work() would call the provided callback if the runtime
> PM status was 'active'. Otherwise, it would enqueue the request if the
> hardware was still suspended (musb->is_runtime_suspended is true).
> 
> This causes a race with the runtime PM handlers, as it is possible to be
> in the case where the runtime PM status is not yet 'active', but the
> hardware has been awaken (PM resume function has been called).

   Awakened. :-)

> When hitting the race, the resume work was not enqueued, which probably
> triggered other bugs further down the stack. For instance, a telnet
> connection on Ingenic SoCs would result in a 50/50 chance of a
> segmentation fault somewhere in the musb code.
> 
> Rework the code so that either we call the callback directly if
> (musb->is_runtime_suspended == 0), or enqueue the query otherwise.
> 
> Fixes: ea2f35c01d5e ("usb: musb: Fix sleeping function called from invalid context for hdrc glue")
> Cc: stable@vger.kernel.org # v4.9+
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> Reviewed-by: Tony Lindgren <tony@atomide.com>
> Tested-by: Tony Lindgren <tony@atomide.com>
[...]


MBR, Sergei
Paul Cercueil Jan. 24, 2021, 8:38 a.m. UTC | #2
Hi Sergei,


Le sam. 23 janv. 2021 à 19:41, Sergei Shtylyov 
<sergei.shtylyov@gmail.com> a écrit :
> On 1/23/21 5:24 PM, Paul Cercueil wrote:
> 
>>  musb_queue_resume_work() would call the provided callback if the 
>> runtime
>>  PM status was 'active'. Otherwise, it would enqueue the request if 
>> the
>>  hardware was still suspended (musb->is_runtime_suspended is true).
>> 
>>  This causes a race with the runtime PM handlers, as it is possible 
>> to be
>>  in the case where the runtime PM status is not yet 'active', but the
>>  hardware has been awaken (PM resume function has been called).
> 
>    Awakened. :-)

Oops. Hopefully Bin or Greg can fix it when merging (if I don't need to 
v2, that is to say - feedback welcome).

Cheers,
-Paul

>>  When hitting the race, the resume work was not enqueued, which 
>> probably
>>  triggered other bugs further down the stack. For instance, a telnet
>>  connection on Ingenic SoCs would result in a 50/50 chance of a
>>  segmentation fault somewhere in the musb code.
>> 
>>  Rework the code so that either we call the callback directly if
>>  (musb->is_runtime_suspended == 0), or enqueue the query otherwise.
>> 
>>  Fixes: ea2f35c01d5e ("usb: musb: Fix sleeping function called from 
>> invalid context for hdrc glue")
>>  Cc: stable@vger.kernel.org # v4.9+
>>  Signed-off-by: Paul Cercueil <paul@crapouillou.net>
>>  Reviewed-by: Tony Lindgren <tony@atomide.com>
>>  Tested-by: Tony Lindgren <tony@atomide.com>
> [...]
> 
> 
> MBR, Sergei

Patch
diff mbox series

diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
index 849e0b770130..1cd87729ba60 100644
--- a/drivers/usb/musb/musb_core.c
+++ b/drivers/usb/musb/musb_core.c
@@ -2240,32 +2240,35 @@  int musb_queue_resume_work(struct musb *musb,
 {
 	struct musb_pending_work *w;
 	unsigned long flags;
+	bool is_suspended;
 	int error;
 
 	if (WARN_ON(!callback))
 		return -EINVAL;
 
-	if (pm_runtime_active(musb->controller))
-		return callback(musb, data);
+	spin_lock_irqsave(&musb->list_lock, flags);
+	is_suspended = musb->is_runtime_suspended;
+
+	if (is_suspended) {
+		w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC);
+		if (!w) {
+			error = -ENOMEM;
+			goto out_unlock;
+		}
 
-	w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC);
-	if (!w)
-		return -ENOMEM;
+		w->callback = callback;
+		w->data = data;
 
-	w->callback = callback;
-	w->data = data;
-	spin_lock_irqsave(&musb->list_lock, flags);
-	if (musb->is_runtime_suspended) {
 		list_add_tail(&w->node, &musb->pending_list);
 		error = 0;
-	} else {
-		dev_err(musb->controller, "could not add resume work %p\n",
-			callback);
-		devm_kfree(musb->controller, w);
-		error = -EINPROGRESS;
 	}
+
+out_unlock:
 	spin_unlock_irqrestore(&musb->list_lock, flags);
 
+	if (!is_suspended)
+		error = callback(musb, data);
+
 	return error;
 }
 EXPORT_SYMBOL_GPL(musb_queue_resume_work);