All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
@ 2021-07-21 17:08 Guido Kiener
  2021-07-21 18:16 ` Alan Stern
  0 siblings, 1 reply; 20+ messages in thread
From: Guido Kiener @ 2021-07-21 17:08 UTC (permalink / raw)
  To: Alan Stern
  Cc: dave penkler, Greg KH, qiang.zhang, Dmitry Vyukov, paulmck, USB

> > > Subject: *EXT* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
> > >
> > > On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> > > > Sorry, the issue this patch is trying to fix occurs because the
> > > > current usbtmc driver resubmits the URB when it gets an EPROTO return.
> > > > The dummy usb host controller driver used in the syzbot tests
> > > > keeps returning the resubmitted URB with EPROTO causing a loop
> > > > that starves RCU. With an actual HCI driver it either recovers or returns an
> EPIPE.
> > >
> > > Are you sure about that?  Have you ever observed a usbtmc device
> > > recovering and continuing to operate after an EPROTO error?
> >
> > I can't speak for Dave and his investigations. However as you remember
> > I did tests with EPROTO errors, see thread:
> > https://marc.info/?l=linux-usb&m=162163776930423&w=2
> > In the thread you can see the recovering.
> 
> Ah yes, now I remember.
> 
> That message doesn't show the _device_ recovering and continuing to operate,
> though.  It shows the _system_ recovering and realizing that the device has been
> disconnected.
> 
> What I was asking about was whether you knew of a case where there was an
> EPROTO error but afterward the usbtmc device continued to work -- no
> disconnection.  Assuming such cases are vanishingly rare, there's no harm in
> having the driver give up whenever it encounters EPROTO.

I have no idea how often the EPROTO error can happen during normal operation and believe you that it's vanishingly rare.
When it happens, does the USB hardware protocol automatically retransmit the lost/invalid packet?
If yes, we should think about an error counter.
If not, then we really can stop the INT pipe and the user will detect that something is wrong when reading the status.

> > Since no user blamed the usbtmc driver for system locks up to now,
> > it's worth to think about whether the problem is caused by the dummy_hcd driver.
> 
> Both drivers contributed to the lockup.  The question is: Which driver was doing the
> wrong thing?  (Or which was _more_ wrong?)  I believe the usbtmc driver was.
> 
> > I still have no time for further investigations and would agree to use
> > the simple patch to get rid of the topic for the usbtmc driver. Then the syzbot will
> maybe find another usb driver.
> 
> Agreed.  So Greg should go ahead and apply the $SUBJECT patch.
> 
> Alan Stern

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21 17:08 [PATCH] USB: usbtmc: Fix RCU stall warning Guido Kiener
@ 2021-07-21 18:16 ` Alan Stern
  0 siblings, 0 replies; 20+ messages in thread
From: Alan Stern @ 2021-07-21 18:16 UTC (permalink / raw)
  To: Guido Kiener
  Cc: dave penkler, Greg KH, qiang.zhang, Dmitry Vyukov, paulmck, USB

On Wed, Jul 21, 2021 at 05:08:48PM +0000, Guido Kiener wrote:
> > > > Subject: *EXT* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
> > > >
> > > > On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> > > > > Sorry, the issue this patch is trying to fix occurs because the
> > > > > current usbtmc driver resubmits the URB when it gets an EPROTO return.
> > > > > The dummy usb host controller driver used in the syzbot tests
> > > > > keeps returning the resubmitted URB with EPROTO causing a loop
> > > > > that starves RCU. With an actual HCI driver it either recovers or returns an
> > EPIPE.
> > > >
> > > > Are you sure about that?  Have you ever observed a usbtmc device
> > > > recovering and continuing to operate after an EPROTO error?
> > >
> > > I can't speak for Dave and his investigations. However as you remember
> > > I did tests with EPROTO errors, see thread:
> > > https://marc.info/?l=linux-usb&m=162163776930423&w=2
> > > In the thread you can see the recovering.
> > 
> > Ah yes, now I remember.
> > 
> > That message doesn't show the _device_ recovering and continuing to operate,
> > though.  It shows the _system_ recovering and realizing that the device has been
> > disconnected.
> > 
> > What I was asking about was whether you knew of a case where there was an
> > EPROTO error but afterward the usbtmc device continued to work -- no
> > disconnection.  Assuming such cases are vanishingly rare, there's no harm in
> > having the driver give up whenever it encounters EPROTO.
> 
> I have no idea how often the EPROTO error can happen during normal operation and believe you that it's vanishingly rare.
> When it happens, does the USB hardware protocol automatically retransmit the lost/invalid packet?

When a low-level protocol error occurs, the USB host controller hardware 
does automatically retransmit the packet.  USB has a "3 strikes and you're 
out" approach: The error doesn't get reported until there have been three 
failed transmission attempts.

On the other hand, dummy-hcd never has these errors (for obvious reasons) 
unless the function driver has been unbound, which always results in a 
disconnect.  Or if the host-side driver does something wrong, like trying to 
communicate with a nonexistent endpoint.

> If yes, we should think about an error counter.

What for?

Besides, the ehci-hcd driver already has a higher-level retry loop for 
low-level protocol errors.  It makes at least 32 attempts before giving up 
on a transaction.

> If not, then we really can stop the INT pipe and the user will detect that something is wrong when reading the status.

Or in the most likely case, the system will realize (after a few hundred 
milliseconds) that the device has been disconnected and will clean up.  The 
only question is whether the usbtmc driver should make multiple futile 
attempts to restart the transmission during those milliseconds.  I think it 
shouldn't.

Alan Stern

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-22 17:33 Guido Kiener
@ 2021-07-23  0:36 ` Zhang, Qiang
  0 siblings, 0 replies; 20+ messages in thread
From: Zhang, Qiang @ 2021-07-23  0:36 UTC (permalink / raw)
  To: Guido Kiener, Greg KH, dave penkler
  Cc: Alan Stern, Dmitry Vyukov, paulmck, USB



________________________________________
From: Guido Kiener <Guido.Kiener@rohde-schwarz.com>
Sent: Friday, 23 July 2021 01:33
To: Greg KH; dave penkler
Cc: Zhang, Qiang; Alan Stern; Dmitry Vyukov; paulmck@kernel.org; USB
Subject: Re: [PATCH] USB: usbtmc: Fix RCU stall warning

[Please note: This e-mail is from an EXTERNAL e-mail address]

> From: Greg KH
> Sent: Wednesday, July 21, 2021 11:48 AM
> Subject: *EXT* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
>
> On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> > On Wed, 21 Jul 2021 at 09:52, Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Jul 21, 2021 at 09:41:15AM +0200, dave penkler wrote:
> > > > On Wed, 21 Jul 2021 at 09:08, Greg KH <gregkh@linuxfoundation.org>
> wrote:
> > > > >
> > > > > On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com
> wrote:
> > > > > > From: Zqiang <qiang.zhang@windriver.com>
> > > > >
> > > > > I need a "full" name here, and in the signed-off-by line please.
> > > > >
> > > > > >
> > > > > > rcu: INFO: rcu_preempt self-detected stall on CPU
> > > > > > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> > > > > >         softirq=25390/25392 fqs=3
> > > > > >         (t=12164 jiffies g=31645 q=43226)
> > > > > > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> > > > > >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > > > > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> > > > > >         OOM is now expected behavior.
> > > > > > rcu: RCU grace-period kthread stack dump:
> > > > > > task:rcu_preempt     state:R  running task
> > > > > >
> > > > > > In the case of system use dummy_hcd as usb controller, when
> > > > > > the usbtmc devices is disconnected, in usbtmc_interrupt(), if
> > > > > > the urb status is unknown, the urb will be resubmit, the urb
> > > > > > may be insert to dum_hcd->urbp_list again, this will cause the
> > > > > > dummy_timer() not to exit for a long time, beacause the
> > > > > > dummy_timer() be called in softirq and local_bh is disable,
> > > > > > this not only causes the RCU reading critical area to consume
> > > > > > too much time but also makes the tasks in the current CPU runq not run
> in time, and that triggered RCU stall.
> > > > > >
> > > > > > return directly when find the urb status is not zero to fix it.
> > > > > >
> > > > > > Reported-by:
> > > > > > syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > > > > > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> > > > >
> > > > > What commit does this fix?  Does it need to go to stable kernels?
> > > > >
> > > > > What about the usbtmc maintainers, what do they think about this?
> > > >
> > > > This patch makes the babbling endpoint retry/recovery code in the
> > > > real world usb host controller drivers redundant and would prevent
> > > > usbtmc applications from benefiting from it.
> > >
> > > I do not understand, is this change ok or not?
> > >
> > > Why do usbtmc applications need to know if babbling happens or not?
> > >
> > > confused,
> > Sorry, the issue this patch is trying to fix occurs because the
> > current usbtmc driver resubmits the URB when it gets an EPROTO return.
> > The dummy usb host controller driver used in the syzbot tests keeps
> > returning the resubmitted URB with EPROTO causing a loop that starves
> > RCU. With an actual HCI driver it either recovers or returns an EPIPE.
> > In either case no loop occurs. So for my part as a user and maintainer
> > this patch is not ok.
>
> Thanks for the review.
>
> Zqiang, can you fix this patch up based on this please?
>
> thanks,
>
> greg k-h

>Qiang,
>
>After discussions with Alan and Dave we think that fixing the >usbtmc driver is the best approach to fix the RCU stall warning.
>Your first proposal was almost ok, but I think we should use >dev_dbg() instead of dev_err() to avoid printing the EPROTO >errors. See below:
>
>Please feel free to add the following text to your patch >description.
>
>-Guido
>
>
>The function usbtmc_interrupt() resubmits urbs when the error >status
>of an urb is -EPROTO. In systems using the dummy_hcd usb >controller
>this can result in endless interrupt loops when the usbtmc device >is
>disconnected from the host system.
>
>Since host controller drivers already try to recover from >transmission
>errors, there is no need to resubmit the urb or try other solutions
>to repair the error situation.
>
>In case of errors the INT pipe just stops to wait for further packets.
>
>Reviewed-by: Guido Kiener <guido.kiener@rohde-schwarz.com>
>
>diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class>/usbtmc.c
>index 74d5a9c5238a..73f419adce61 100644
>--- a/drivers/usb/class/usbtmc.c
>+++ b/drivers/usb/class/usbtmc.c
>@@ -2324,17 +2324,10 @@ static void usbtmc_interrupt(struct >urb *urb)
>                dev_err(dev, "overflow with length %d, actual length is >%d\n",
>                        data->iin_wMaxPacketSize, urb->actual_length);
>                fallthrough;
>-       case -ECONNRESET:
>-       case -ENOENT:
>-       case -ESHUTDOWN:
>-       case -EILSEQ:
>-       case -ETIME:
>-       case -EPIPE:
>+       default:
>                /* urb terminated, clean up */
>                dev_dbg(dev, "urb terminated, status: %d\n", status);
>                return;
>-       default:
>-               dev_err(dev, "unknown status received: %d\n", status);
>        }
> exit:
>        rv = usb_submit_urb(urb, GFP_ATOMIC);
>

Thanks

I will resend v2

Qiang

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
@ 2021-07-22 17:33 Guido Kiener
  2021-07-23  0:36 ` Zhang, Qiang
  0 siblings, 1 reply; 20+ messages in thread
From: Guido Kiener @ 2021-07-22 17:33 UTC (permalink / raw)
  To: Greg KH, dave penkler
  Cc: qiang.zhang, Alan Stern, Dmitry Vyukov, paulmck, USB

> From: Greg KH
> Sent: Wednesday, July 21, 2021 11:48 AM
> Subject: *EXT* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
> 
> On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> > On Wed, 21 Jul 2021 at 09:52, Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Jul 21, 2021 at 09:41:15AM +0200, dave penkler wrote:
> > > > On Wed, 21 Jul 2021 at 09:08, Greg KH <gregkh@linuxfoundation.org>
> wrote:
> > > > >
> > > > > On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com
> wrote:
> > > > > > From: Zqiang <qiang.zhang@windriver.com>
> > > > >
> > > > > I need a "full" name here, and in the signed-off-by line please.
> > > > >
> > > > > >
> > > > > > rcu: INFO: rcu_preempt self-detected stall on CPU
> > > > > > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> > > > > >         softirq=25390/25392 fqs=3
> > > > > >         (t=12164 jiffies g=31645 q=43226)
> > > > > > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> > > > > >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > > > > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> > > > > >         OOM is now expected behavior.
> > > > > > rcu: RCU grace-period kthread stack dump:
> > > > > > task:rcu_preempt     state:R  running task
> > > > > >
> > > > > > In the case of system use dummy_hcd as usb controller, when
> > > > > > the usbtmc devices is disconnected, in usbtmc_interrupt(), if
> > > > > > the urb status is unknown, the urb will be resubmit, the urb
> > > > > > may be insert to dum_hcd->urbp_list again, this will cause the
> > > > > > dummy_timer() not to exit for a long time, beacause the
> > > > > > dummy_timer() be called in softirq and local_bh is disable,
> > > > > > this not only causes the RCU reading critical area to consume
> > > > > > too much time but also makes the tasks in the current CPU runq not run
> in time, and that triggered RCU stall.
> > > > > >
> > > > > > return directly when find the urb status is not zero to fix it.
> > > > > >
> > > > > > Reported-by:
> > > > > > syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > > > > > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> > > > >
> > > > > What commit does this fix?  Does it need to go to stable kernels?
> > > > >
> > > > > What about the usbtmc maintainers, what do they think about this?
> > > >
> > > > This patch makes the babbling endpoint retry/recovery code in the
> > > > real world usb host controller drivers redundant and would prevent
> > > > usbtmc applications from benefiting from it.
> > >
> > > I do not understand, is this change ok or not?
> > >
> > > Why do usbtmc applications need to know if babbling happens or not?
> > >
> > > confused,
> > Sorry, the issue this patch is trying to fix occurs because the
> > current usbtmc driver resubmits the URB when it gets an EPROTO return.
> > The dummy usb host controller driver used in the syzbot tests keeps
> > returning the resubmitted URB with EPROTO causing a loop that starves
> > RCU. With an actual HCI driver it either recovers or returns an EPIPE.
> > In either case no loop occurs. So for my part as a user and maintainer
> > this patch is not ok.
> 
> Thanks for the review.
> 
> Zqiang, can you fix this patch up based on this please?
> 
> thanks,
> 
> greg k-h

Qiang,

After discussions with Alan and Dave we think that fixing the usbtmc driver is the best approach to fix the RCU stall warning.
Your first proposal was almost ok, but I think we should use dev_dbg() instead of dev_err() to avoid printing the EPROTO errors. See below:

Please feel free to add the following text to your patch description.

-Guido


The function usbtmc_interrupt() resubmits urbs when the error status
of an urb is -EPROTO. In systems using the dummy_hcd usb controller
this can result in endless interrupt loops when the usbtmc device is
disconnected from the host system.
    
Since host controller drivers already try to recover from transmission
errors, there is no need to resubmit the urb or try other solutions
to repair the error situation.
    
In case of errors the INT pipe just stops to wait for further packets.

Reviewed-by: Guido Kiener <guido.kiener@rohde-schwarz.com>

diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class/usbtmc.c
index 74d5a9c5238a..73f419adce61 100644
--- a/drivers/usb/class/usbtmc.c
+++ b/drivers/usb/class/usbtmc.c
@@ -2324,17 +2324,10 @@ static void usbtmc_interrupt(struct urb *urb)
                dev_err(dev, "overflow with length %d, actual length is %d\n",
                        data->iin_wMaxPacketSize, urb->actual_length);
                fallthrough;
-       case -ECONNRESET:
-       case -ENOENT:
-       case -ESHUTDOWN:
-       case -EILSEQ:
-       case -ETIME:
-       case -EPIPE:
+       default:
                /* urb terminated, clean up */
                dev_dbg(dev, "urb terminated, status: %d\n", status);
                return;
-       default:
-               dev_err(dev, "unknown status received: %d\n", status);
        }
 exit:
        rv = usb_submit_urb(urb, GFP_ATOMIC);

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21 15:24 Guido Kiener
@ 2021-07-21 16:17 ` Alan Stern
  0 siblings, 0 replies; 20+ messages in thread
From: Alan Stern @ 2021-07-21 16:17 UTC (permalink / raw)
  To: Guido Kiener
  Cc: dave penkler, Greg KH, qiang.zhang, Dmitry Vyukov, paulmck, USB

On Wed, Jul 21, 2021 at 03:24:13PM +0000, Guido Kiener wrote:
> > -----Original Message-----
> > From: Alan Stern <stern@rowland.harvard.edu>
> > Sent: Wednesday, July 21, 2021 4:22 PM
> > To: dave penkler <dpenkler@gmail.com>
> > Cc: Greg KH <gregkh@linuxfoundation.org>; qiang.zhang@windriver.com; Dmitry
> > Vyukov <dvyukov@google.com>; paulmck@kernel.org; Kiener Guido 14DS1
> > <Guido.Kiener@rohde-schwarz.com>; USB <linux-usb@vger.kernel.org>
> > Subject: *EXT* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
> > 
> > On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> > > Sorry, the issue this patch is trying to fix occurs because the
> > > current usbtmc driver resubmits the URB when it gets an EPROTO return.
> > > The dummy usb host controller driver used in the syzbot tests keeps
> > > returning the resubmitted URB with EPROTO causing a loop that starves
> > > RCU. With an actual HCI driver it either recovers or returns an EPIPE.
> > 
> > Are you sure about that?  Have you ever observed a usbtmc device recovering and
> > continuing to operate after an EPROTO error?
> 
> I can't speak for Dave and his investigations. However as you remember I did tests with
> EPROTO errors, see thread: https://marc.info/?l=linux-usb&m=162163776930423&w=2
> In the thread you can see the recovering.

Ah yes, now I remember.

That message doesn't show the _device_ recovering and continuing to operate, 
though.  It shows the _system_ recovering and realizing that the device has 
been disconnected.

What I was asking about was whether you knew of a case where there was an 
EPROTO error but afterward the usbtmc device continued to work -- no 
disconnection.  Assuming such cases are vanishingly rare, there's no harm in 
having the driver give up whenever it encounters EPROTO.

> Since no user blamed the usbtmc driver for system locks up to now, it's worth to think about
> whether the problem is caused by the dummy_hcd driver.

Both drivers contributed to the lockup.  The question is: Which driver was 
doing the wrong thing?  (Or which was _more_ wrong?)  I believe the usbtmc 
driver was.

> I still have no time for further investigations and would agree to use the simple patch
> to get rid of the topic for the usbtmc driver. Then the syzbot will maybe find another usb driver.

Agreed.  So Greg should go ahead and apply the $SUBJECT patch.

Alan Stern

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
@ 2021-07-21 15:24 Guido Kiener
  2021-07-21 16:17 ` Alan Stern
  0 siblings, 1 reply; 20+ messages in thread
From: Guido Kiener @ 2021-07-21 15:24 UTC (permalink / raw)
  To: Alan Stern, dave penkler
  Cc: Greg KH, qiang.zhang, Dmitry Vyukov, paulmck, USB

> -----Original Message-----
> From: Alan Stern <stern@rowland.harvard.edu>
> Sent: Wednesday, July 21, 2021 4:22 PM
> To: dave penkler <dpenkler@gmail.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>; qiang.zhang@windriver.com; Dmitry
> Vyukov <dvyukov@google.com>; paulmck@kernel.org; Kiener Guido 14DS1
> <Guido.Kiener@rohde-schwarz.com>; USB <linux-usb@vger.kernel.org>
> Subject: *EXT* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
> 
> On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> > Sorry, the issue this patch is trying to fix occurs because the
> > current usbtmc driver resubmits the URB when it gets an EPROTO return.
> > The dummy usb host controller driver used in the syzbot tests keeps
> > returning the resubmitted URB with EPROTO causing a loop that starves
> > RCU. With an actual HCI driver it either recovers or returns an EPIPE.
> 
> Are you sure about that?  Have you ever observed a usbtmc device recovering and
> continuing to operate after an EPROTO error?

I can't speak for Dave and his investigations. However as you remember I did tests with
EPROTO errors, see thread: https://marc.info/?l=linux-usb&m=162163776930423&w=2
In the thread you can see the recovering.
Since no user blamed the usbtmc driver for system locks up to now, it's worth to think about
whether the problem is caused by the dummy_hcd driver.
I still have no time for further investigations and would agree to use the simple patch
to get rid of the topic for the usbtmc driver. Then the syzbot will maybe find another usb driver.

-Guido

> An EPIPE error also seems rather unlikely -- particularly if the device is not plugged
> into a high-speed hub.
> 
> Alan Stern
> 
> > In either case no loop occurs. So for my part as a user and maintainer
> > this patch is not ok.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  9:44       ` dave penkler
  2021-07-21  9:47         ` Greg KH
@ 2021-07-21 14:22         ` Alan Stern
  1 sibling, 0 replies; 20+ messages in thread
From: Alan Stern @ 2021-07-21 14:22 UTC (permalink / raw)
  To: dave penkler; +Cc: Greg KH, qiang.zhang, Dmitry Vyukov, paulmck, Guido, USB

On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> Sorry, the issue this patch is trying to fix occurs because the
> current usbtmc driver resubmits the URB when it gets an EPROTO return.
> The dummy usb host controller driver used in the syzbot tests keeps
> returning the resubmitted URB with EPROTO causing a loop that starves
> RCU. With an actual HCI driver it either recovers or returns an EPIPE.

Are you sure about that?  Have you ever observed a usbtmc device recovering 
and continuing to operate after an EPROTO error?

An EPIPE error also seems rather unlikely -- particularly if the device is 
not plugged into a high-speed hub.

Alan Stern

> In either case no loop occurs. So for my part as a user and maintainer
> this patch is not ok.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
@ 2021-07-21 11:15 Guido Kiener
  0 siblings, 0 replies; 20+ messages in thread
From: Guido Kiener @ 2021-07-21 11:15 UTC (permalink / raw)
  To: Greg KH, Zhang, Qiang; +Cc: stern, dvyukov, paulmck, dpenkler, linux-usb

> -----Original Message-----
> From: Greg KH <gregkh@linuxfoundation.org>
> Sent: Wednesday, July 21, 2021 9:53 AM
> To: Zhang, Qiang <Qiang.Zhang@windriver.com>
> Cc: stern@rowland.harvard.edu; dvyukov@google.com; paulmck@kernel.org;
> dpenkler@gmail.com; Kiener Guido 14DS1 <Guido.Kiener@rohde-schwarz.com>;
> linux-usb@vger.kernel.org
> Subject: *EXT* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
> 
> On Wed, Jul 21, 2021 at 07:30:39AM +0000, Zhang, Qiang wrote:
> >
> >
> > ________________________________________
> > From: Greg KH <gregkh@linuxfoundation.org>
> > Sent: Wednesday, 21 July 2021 15:08
> > To: Zhang, Qiang
> > Cc: stern@rowland.harvard.edu; dvyukov@google.com; paulmck@kernel.org;
> > dpenkler@gmail.com; guido.kiener@rohde-schwarz.com;
> > linux-usb@vger.kernel.org
> > Subject: Re: [PATCH] USB: usbtmc: Fix RCU stall warning
> >
> > [Please note: This e-mail is from an EXTERNAL e-mail address]
> >
> > On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> > > From: Zqiang <qiang.zhang@windriver.com>
> >
> > >I need a "full" name here, and in the signed-off-by line please.
> >
> > >
> > > rcu: INFO: rcu_preempt self-detected stall on CPU
> > > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> > >         softirq=25390/25392 fqs=3
> > >         (t=12164 jiffies g=31645 q=43226)
> > > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> > >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> > >         OOM is now expected behavior.
> > > rcu: RCU grace-period kthread stack dump:
> > > task:rcu_preempt     state:R  running task
> > >
> > > In the case of system use dummy_hcd as usb controller, when the
> > > usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> > > status is unknown, the urb will be resubmit, the urb may be insert
> > > to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> > > to exit for a long time, beacause the dummy_timer() be called in
> > > softirq and local_bh is disable, this not only causes the RCU
> > > reading critical area to consume too much time but also makes the
> > > tasks in the current CPU runq not run in time, and that triggered RCU stall.
> > >
> > > return directly when find the urb status is not zero to fix it.
> > >
> > > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> >
> > >What commit does this fix?  Does it need to go to stable kernels?
> >
> >  I will add fix tags resend,   need to go to stable kernel
> >
> > >
> > >What about the usbtmc maintainers, what do they think about this?

I'm ok with the fix. It will make the syzbot and dummy_hcd controller happy when using
the usbtmc driver. Nevertheless there are many other usb kernel driver that resubmit
the urb when the callback handler detects the urb status = -EPROTO.
So I expect the issue will rehappen with other usb drivers again.
In "normal" environments the urb status = -EPROTO will mostly happen when the cable is
disconnected, but it does not freeze the kernel and shuts down the connection as usual.
Up to now we have no customer feedback which is blaming this issue. I hope this helps.

-Guido

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  9:44       ` dave penkler
@ 2021-07-21  9:47         ` Greg KH
  2021-07-21 14:22         ` Alan Stern
  1 sibling, 0 replies; 20+ messages in thread
From: Greg KH @ 2021-07-21  9:47 UTC (permalink / raw)
  To: dave penkler; +Cc: qiang.zhang, Alan Stern, Dmitry Vyukov, paulmck, Guido, USB

On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> On Wed, 21 Jul 2021 at 09:52, Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Jul 21, 2021 at 09:41:15AM +0200, dave penkler wrote:
> > > On Wed, 21 Jul 2021 at 09:08, Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> > > > > From: Zqiang <qiang.zhang@windriver.com>
> > > >
> > > > I need a "full" name here, and in the signed-off-by line please.
> > > >
> > > > >
> > > > > rcu: INFO: rcu_preempt self-detected stall on CPU
> > > > > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> > > > >         softirq=25390/25392 fqs=3
> > > > >         (t=12164 jiffies g=31645 q=43226)
> > > > > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> > > > >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > > > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> > > > >         OOM is now expected behavior.
> > > > > rcu: RCU grace-period kthread stack dump:
> > > > > task:rcu_preempt     state:R  running task
> > > > >
> > > > > In the case of system use dummy_hcd as usb controller, when the
> > > > > usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> > > > > status is unknown, the urb will be resubmit, the urb may be insert
> > > > > to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> > > > > to exit for a long time, beacause the dummy_timer() be called in
> > > > > softirq and local_bh is disable, this not only causes the RCU reading
> > > > > critical area to consume too much time but also makes the tasks in
> > > > > the current CPU runq not run in time, and that triggered RCU stall.
> > > > >
> > > > > return directly when find the urb status is not zero to fix it.
> > > > >
> > > > > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > > > > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> > > >
> > > > What commit does this fix?  Does it need to go to stable kernels?
> > > >
> > > > What about the usbtmc maintainers, what do they think about this?
> > >
> > > This patch makes the babbling endpoint retry/recovery code in the real
> > > world usb host controller drivers redundant and would prevent usbtmc
> > > applications from benefiting from it.
> >
> > I do not understand, is this change ok or not?
> >
> > Why do usbtmc applications need to know if babbling happens or not?
> >
> > confused,
> Sorry, the issue this patch is trying to fix occurs because the
> current usbtmc driver resubmits the URB when it gets an EPROTO return.
> The dummy usb host controller driver used in the syzbot tests keeps
> returning the resubmitted URB with EPROTO causing a loop that starves
> RCU. With an actual HCI driver it either recovers or returns an EPIPE.
> In either case no loop occurs. So for my part as a user and maintainer
> this patch is not ok.

Thanks for the review.

Zqiang, can you fix this patch up based on this please?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  7:52     ` Greg KH
@ 2021-07-21  9:44       ` dave penkler
  2021-07-21  9:47         ` Greg KH
  2021-07-21 14:22         ` Alan Stern
  0 siblings, 2 replies; 20+ messages in thread
From: dave penkler @ 2021-07-21  9:44 UTC (permalink / raw)
  To: Greg KH; +Cc: qiang.zhang, Alan Stern, Dmitry Vyukov, paulmck, Guido, USB

On Wed, 21 Jul 2021 at 09:52, Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Jul 21, 2021 at 09:41:15AM +0200, dave penkler wrote:
> > On Wed, 21 Jul 2021 at 09:08, Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> > > > From: Zqiang <qiang.zhang@windriver.com>
> > >
> > > I need a "full" name here, and in the signed-off-by line please.
> > >
> > > >
> > > > rcu: INFO: rcu_preempt self-detected stall on CPU
> > > > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> > > >         softirq=25390/25392 fqs=3
> > > >         (t=12164 jiffies g=31645 q=43226)
> > > > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> > > >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> > > >         OOM is now expected behavior.
> > > > rcu: RCU grace-period kthread stack dump:
> > > > task:rcu_preempt     state:R  running task
> > > >
> > > > In the case of system use dummy_hcd as usb controller, when the
> > > > usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> > > > status is unknown, the urb will be resubmit, the urb may be insert
> > > > to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> > > > to exit for a long time, beacause the dummy_timer() be called in
> > > > softirq and local_bh is disable, this not only causes the RCU reading
> > > > critical area to consume too much time but also makes the tasks in
> > > > the current CPU runq not run in time, and that triggered RCU stall.
> > > >
> > > > return directly when find the urb status is not zero to fix it.
> > > >
> > > > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > > > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> > >
> > > What commit does this fix?  Does it need to go to stable kernels?
> > >
> > > What about the usbtmc maintainers, what do they think about this?
> >
> > This patch makes the babbling endpoint retry/recovery code in the real
> > world usb host controller drivers redundant and would prevent usbtmc
> > applications from benefiting from it.
>
> I do not understand, is this change ok or not?
>
> Why do usbtmc applications need to know if babbling happens or not?
>
> confused,
Sorry, the issue this patch is trying to fix occurs because the
current usbtmc driver resubmits the URB when it gets an EPROTO return.
The dummy usb host controller driver used in the syzbot tests keeps
returning the resubmitted URB with EPROTO causing a loop that starves
RCU. With an actual HCI driver it either recovers or returns an EPIPE.
In either case no loop occurs. So for my part as a user and maintainer
this patch is not ok.
>
> greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  7:41   ` dave penkler
  2021-07-21  7:52     ` Greg KH
@ 2021-07-21  8:34     ` Zhang, Qiang
  1 sibling, 0 replies; 20+ messages in thread
From: Zhang, Qiang @ 2021-07-21  8:34 UTC (permalink / raw)
  To: dave penkler, Greg KH; +Cc: Alan Stern, Dmitry Vyukov, paulmck, Guido, USB



________________________________________
From: dave penkler <dpenkler@gmail.com>
Sent: Wednesday, 21 July 2021 15:41
To: Greg KH
Cc: Zhang, Qiang; Alan Stern; Dmitry Vyukov; paulmck@kernel.org; Guido; USB
Subject: Re: [PATCH] USB: usbtmc: Fix RCU stall warning

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Wed, 21 Jul 2021 at 09:08, Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> > From: Zqiang <qiang.zhang@windriver.com>
>
> I need a "full" name here, and in the signed-off-by line please.
>
> >
> > rcu: INFO: rcu_preempt self-detected stall on CPU
> > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> >         softirq=25390/25392 fqs=3
> >         (t=12164 jiffies g=31645 q=43226)
> > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> >         OOM is now expected behavior.
> > rcu: RCU grace-period kthread stack dump:
> > task:rcu_preempt     state:R  running task
> >
> > In the case of system use dummy_hcd as usb controller, when the
> > usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> > status is unknown, the urb will be resubmit, the urb may be insert
> > to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> > to exit for a long time, beacause the dummy_timer() be called in
> > softirq and local_bh is disable, this not only causes the RCU reading
> > critical area to consume too much time but also makes the tasks in
> > the current CPU runq not run in time, and that triggered RCU stall.
> >
> > return directly when find the urb status is not zero to fix it.
> >
> > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
>
> What commit does this fix?  Does it need to go to stable kernels?
>
> What about the usbtmc maintainers, what do they think about this?

>This patch makes the babbling endpoint retry/recovery code in the real
>world usb host controller drivers redundant and would prevent usbtmc
>applications from benefiting from it.

Due to use dummy_hcd host,  as explained earlier  the RCU stall happens all the time,  Is there a better way to solve it ?

Thanks
Qiang
>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  7:52     ` Greg KH
@ 2021-07-21  8:28       ` Zhang, Qiang
  0 siblings, 0 replies; 20+ messages in thread
From: Zhang, Qiang @ 2021-07-21  8:28 UTC (permalink / raw)
  To: Greg KH; +Cc: stern, dvyukov, paulmck, dpenkler, guido.kiener, linux-usb



________________________________________
From: Greg KH <gregkh@linuxfoundation.org>
Sent: Wednesday, 21 July 2021 15:52
To: Zhang, Qiang
Cc: stern@rowland.harvard.edu; dvyukov@google.com; paulmck@kernel.org; dpenkler@gmail.com; guido.kiener@rohde-schwarz.com; linux-usb@vger.kernel.org
Subject: Re: [PATCH] USB: usbtmc: Fix RCU stall warning

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Wed, Jul 21, 2021 at 07:30:39AM +0000, Zhang, Qiang wrote:
>
>
> ________________________________________
> From: Greg KH <gregkh@linuxfoundation.org>
> Sent: Wednesday, 21 July 2021 15:08
> To: Zhang, Qiang
> Cc: stern@rowland.harvard.edu; dvyukov@google.com; paulmck@kernel.org; dpenkler@gmail.com; guido.kiener@rohde-schwarz.com; linux-usb@vger.kernel.org
> Subject: Re: [PATCH] USB: usbtmc: Fix RCU stall warning
>
> [Please note: This e-mail is from an EXTERNAL e-mail address]
>
> On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> > From: Zqiang <qiang.zhang@windriver.com>
>
> >I need a "full" name here, and in the signed-off-by line please.
>
> >
> > rcu: INFO: rcu_preempt self-detected stall on CPU
> > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> >         softirq=25390/25392 fqs=3
> >         (t=12164 jiffies g=31645 q=43226)
> > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> >         OOM is now expected behavior.
> > rcu: RCU grace-period kthread stack dump:
> > task:rcu_preempt     state:R  running task
> >
> > In the case of system use dummy_hcd as usb controller, when the
> > usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> > status is unknown, the urb will be resubmit, the urb may be insert
> > to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> > to exit for a long time, beacause the dummy_timer() be called in
> > softirq and local_bh is disable, this not only causes the RCU reading
> > critical area to consume too much time but also makes the tasks in
> > the current CPU runq not run in time, and that triggered RCU stall.
> >
> > return directly when find the urb status is not zero to fix it.
> >
> > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
>
> >What commit does this fix?  Does it need to go to stable kernels?
>
>  I will add fix tags resend,   need to go to stable kernel
>
> >
> >What about the usbtmc maintainers, what do they think about this?
>
>  Alan Stern reviewed this change before.
>
>I do not see that on this commit :(




Sorry,  I used the wrong words,  Alan Stern made suggestions for my patch.


The content is as follows :

On Mon, Jun 28, 2021 at 06:38:37AM +0000, Zhang, Qiang wrote:
>
>
> ________________________________________
> From: Dmitry Vyukov <dvyukov@google.com>
> Sent: Monday, 19 April 2021 15:27
> To: syzbot; Greg Kroah-Hartman; guido.kiener@rohde-schwarz.com; dpenkler@gmail.com; lee.jones@linaro.org; USB list
> Cc: bp@alien8.de; dwmw@amazon.co.uk; hpa@zytor.com; linux-kernel@vger.kernel.org; luto@kernel.org; mingo@redhat.com; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; x86@kernel.org
> Subject: Re: [syzbot] INFO: rcu detected stall in tx
>
> [Please note: This e-mail is from an EXTERNAL e-mail address]
>
> On Mon, Apr 19, 2021 at 9:19 AM syzbot
> <syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    50987bec Merge tag 'trace-v5.12-rc7' of git://git.kernel.o..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1065c5fcd00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=398c4d0fe6f66e68
> > dashboard link: https://syzkaller.appspot.com/bug?extid=e2eae5639e7203360018
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> >
> > usbtmc 5-1:0.0: unknown status received: -71
> > usbtmc 3-1:0.0: unknown status received: -71
> > usbtmc 5-1:0.0: unknown status received: -71
>
> >The log shows an infinite stream of these before the stall, so I
> >assume it's an infinite loop in usbtmc.
> >+usbtmc maintainers
> >
> >[  370.171634][    C0] usbtmc 6-1:0.0: unknown status received: >-71
> >[  370.177799][    C1] usbtmc 3-1:0.0: unknown status received: >-71

> This seems like a long time in the following cycle,  when the callback function usbtmc_interrupt() find unknown status error, it will submit urb again. the urb may be insert  urbp_list.
> due to the dummy_timer() be called in bh-disable.
> This will result in the RCU reading critical area not exiting for a long time (note: bh_disable/enable, preempt_disable/enable is regarded as the RCU critical reading area ), and prevent rcu_preempt kthread be schedule and running.

> Whether to return directly when we find the urb status is unknown error?

Yes.
> diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class/usbtmc.c
> index 74d5a9c5238a..39d44339c03f 100644
> --- a/drivers/usb/class/usbtmc.c
> +++ b/drivers/usb/class/usbtmc.c
> @@ -2335,6 +2335,7 @@ static void usbtmc_interrupt(struct urb *urb)
>                 return;
>         default:
>                 dev_err(dev, "unknown status received: %d\n", status);
> +               return;
>         }
>  exit:
>         rv = usb_submit_urb(urb, GFP_ATOMIC);
This is the right thing to do.  In fact, you should also change the code
above this.  There's no real need for special handling of the
-ECONNRESET, -ENOENT, ..., -EPIPE codes, since the driver will do the
same thing no matter what the code is.

Alan Stern

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  7:30   ` Zhang, Qiang
@ 2021-07-21  7:52     ` Greg KH
  2021-07-21  8:28       ` Zhang, Qiang
  0 siblings, 1 reply; 20+ messages in thread
From: Greg KH @ 2021-07-21  7:52 UTC (permalink / raw)
  To: Zhang, Qiang; +Cc: stern, dvyukov, paulmck, dpenkler, guido.kiener, linux-usb

On Wed, Jul 21, 2021 at 07:30:39AM +0000, Zhang, Qiang wrote:
> 
> 
> ________________________________________
> From: Greg KH <gregkh@linuxfoundation.org>
> Sent: Wednesday, 21 July 2021 15:08
> To: Zhang, Qiang
> Cc: stern@rowland.harvard.edu; dvyukov@google.com; paulmck@kernel.org; dpenkler@gmail.com; guido.kiener@rohde-schwarz.com; linux-usb@vger.kernel.org
> Subject: Re: [PATCH] USB: usbtmc: Fix RCU stall warning
> 
> [Please note: This e-mail is from an EXTERNAL e-mail address]
> 
> On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> > From: Zqiang <qiang.zhang@windriver.com>
> 
> >I need a "full" name here, and in the signed-off-by line please.
> 
> >
> > rcu: INFO: rcu_preempt self-detected stall on CPU
> > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> >         softirq=25390/25392 fqs=3
> >         (t=12164 jiffies g=31645 q=43226)
> > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> >         OOM is now expected behavior.
> > rcu: RCU grace-period kthread stack dump:
> > task:rcu_preempt     state:R  running task
> >
> > In the case of system use dummy_hcd as usb controller, when the
> > usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> > status is unknown, the urb will be resubmit, the urb may be insert
> > to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> > to exit for a long time, beacause the dummy_timer() be called in
> > softirq and local_bh is disable, this not only causes the RCU reading
> > critical area to consume too much time but also makes the tasks in
> > the current CPU runq not run in time, and that triggered RCU stall.
> >
> > return directly when find the urb status is not zero to fix it.
> >
> > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> 
> >What commit does this fix?  Does it need to go to stable kernels?
> 
>  I will add fix tags resend,   need to go to stable kernel
> 
> >
> >What about the usbtmc maintainers, what do they think about this?
> 
>  Alan Stern reviewed this change before.

I do not see that on this commit :(

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  7:41   ` dave penkler
@ 2021-07-21  7:52     ` Greg KH
  2021-07-21  9:44       ` dave penkler
  2021-07-21  8:34     ` Zhang, Qiang
  1 sibling, 1 reply; 20+ messages in thread
From: Greg KH @ 2021-07-21  7:52 UTC (permalink / raw)
  To: dave penkler; +Cc: qiang.zhang, Alan Stern, Dmitry Vyukov, paulmck, Guido, USB

On Wed, Jul 21, 2021 at 09:41:15AM +0200, dave penkler wrote:
> On Wed, 21 Jul 2021 at 09:08, Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> > > From: Zqiang <qiang.zhang@windriver.com>
> >
> > I need a "full" name here, and in the signed-off-by line please.
> >
> > >
> > > rcu: INFO: rcu_preempt self-detected stall on CPU
> > > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> > >         softirq=25390/25392 fqs=3
> > >         (t=12164 jiffies g=31645 q=43226)
> > > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> > >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> > >         OOM is now expected behavior.
> > > rcu: RCU grace-period kthread stack dump:
> > > task:rcu_preempt     state:R  running task
> > >
> > > In the case of system use dummy_hcd as usb controller, when the
> > > usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> > > status is unknown, the urb will be resubmit, the urb may be insert
> > > to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> > > to exit for a long time, beacause the dummy_timer() be called in
> > > softirq and local_bh is disable, this not only causes the RCU reading
> > > critical area to consume too much time but also makes the tasks in
> > > the current CPU runq not run in time, and that triggered RCU stall.
> > >
> > > return directly when find the urb status is not zero to fix it.
> > >
> > > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> >
> > What commit does this fix?  Does it need to go to stable kernels?
> >
> > What about the usbtmc maintainers, what do they think about this?
> 
> This patch makes the babbling endpoint retry/recovery code in the real
> world usb host controller drivers redundant and would prevent usbtmc
> applications from benefiting from it.

I do not understand, is this change ok or not?

Why do usbtmc applications need to know if babbling happens or not?

confused,

greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  7:08 ` Greg KH
  2021-07-21  7:30   ` Zhang, Qiang
@ 2021-07-21  7:41   ` dave penkler
  2021-07-21  7:52     ` Greg KH
  2021-07-21  8:34     ` Zhang, Qiang
  1 sibling, 2 replies; 20+ messages in thread
From: dave penkler @ 2021-07-21  7:41 UTC (permalink / raw)
  To: Greg KH; +Cc: qiang.zhang, Alan Stern, Dmitry Vyukov, paulmck, Guido, USB

On Wed, 21 Jul 2021 at 09:08, Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> > From: Zqiang <qiang.zhang@windriver.com>
>
> I need a "full" name here, and in the signed-off-by line please.
>
> >
> > rcu: INFO: rcu_preempt self-detected stall on CPU
> > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> >         softirq=25390/25392 fqs=3
> >         (t=12164 jiffies g=31645 q=43226)
> > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> >         OOM is now expected behavior.
> > rcu: RCU grace-period kthread stack dump:
> > task:rcu_preempt     state:R  running task
> >
> > In the case of system use dummy_hcd as usb controller, when the
> > usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> > status is unknown, the urb will be resubmit, the urb may be insert
> > to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> > to exit for a long time, beacause the dummy_timer() be called in
> > softirq and local_bh is disable, this not only causes the RCU reading
> > critical area to consume too much time but also makes the tasks in
> > the current CPU runq not run in time, and that triggered RCU stall.
> >
> > return directly when find the urb status is not zero to fix it.
> >
> > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
>
> What commit does this fix?  Does it need to go to stable kernels?
>
> What about the usbtmc maintainers, what do they think about this?

This patch makes the babbling endpoint retry/recovery code in the real
world usb host controller drivers redundant and would prevent usbtmc
applications from benefiting from it.

>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-21  7:08 ` Greg KH
@ 2021-07-21  7:30   ` Zhang, Qiang
  2021-07-21  7:52     ` Greg KH
  2021-07-21  7:41   ` dave penkler
  1 sibling, 1 reply; 20+ messages in thread
From: Zhang, Qiang @ 2021-07-21  7:30 UTC (permalink / raw)
  To: Greg KH; +Cc: stern, dvyukov, paulmck, dpenkler, guido.kiener, linux-usb



________________________________________
From: Greg KH <gregkh@linuxfoundation.org>
Sent: Wednesday, 21 July 2021 15:08
To: Zhang, Qiang
Cc: stern@rowland.harvard.edu; dvyukov@google.com; paulmck@kernel.org; dpenkler@gmail.com; guido.kiener@rohde-schwarz.com; linux-usb@vger.kernel.org
Subject: Re: [PATCH] USB: usbtmc: Fix RCU stall warning

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> From: Zqiang <qiang.zhang@windriver.com>

>I need a "full" name here, and in the signed-off-by line please.

>
> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
>         softirq=25390/25392 fqs=3
>         (t=12164 jiffies g=31645 q=43226)
> rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
>      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
>         OOM is now expected behavior.
> rcu: RCU grace-period kthread stack dump:
> task:rcu_preempt     state:R  running task
>
> In the case of system use dummy_hcd as usb controller, when the
> usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> status is unknown, the urb will be resubmit, the urb may be insert
> to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> to exit for a long time, beacause the dummy_timer() be called in
> softirq and local_bh is disable, this not only causes the RCU reading
> critical area to consume too much time but also makes the tasks in
> the current CPU runq not run in time, and that triggered RCU stall.
>
> return directly when find the urb status is not zero to fix it.
>
> Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> Signed-off-by: Zqiang <qiang.zhang@windriver.com>

>What commit does this fix?  Does it need to go to stable kernels?

 I will add fix tags resend,   need to go to stable kernel

>
>What about the usbtmc maintainers, what do they think about this?

 Alan Stern reviewed this change before.

Thanks
Qiang
>
>thanks,
>
>greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-06-29  3:32 qiang.zhang
  2021-07-07  1:29 ` Zhang, Qiang
@ 2021-07-21  7:08 ` Greg KH
  2021-07-21  7:30   ` Zhang, Qiang
  2021-07-21  7:41   ` dave penkler
  1 sibling, 2 replies; 20+ messages in thread
From: Greg KH @ 2021-07-21  7:08 UTC (permalink / raw)
  To: qiang.zhang; +Cc: stern, dvyukov, paulmck, dpenkler, guido.kiener, linux-usb

On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@windriver.com wrote:
> From: Zqiang <qiang.zhang@windriver.com>

I need a "full" name here, and in the signed-off-by line please.

> 
> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
>         softirq=25390/25392 fqs=3
>         (t=12164 jiffies g=31645 q=43226)
> rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
>      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
>         OOM is now expected behavior.
> rcu: RCU grace-period kthread stack dump:
> task:rcu_preempt     state:R  running task
> 
> In the case of system use dummy_hcd as usb controller, when the
> usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
> status is unknown, the urb will be resubmit, the urb may be insert
> to dum_hcd->urbp_list again, this will cause the dummy_timer() not
> to exit for a long time, beacause the dummy_timer() be called in
> softirq and local_bh is disable, this not only causes the RCU reading
> critical area to consume too much time but also makes the tasks in
> the current CPU runq not run in time, and that triggered RCU stall.
> 
> return directly when find the urb status is not zero to fix it.
> 
> Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> Signed-off-by: Zqiang <qiang.zhang@windriver.com>

What commit does this fix?  Does it need to go to stable kernels?

What about the usbtmc maintainers, what do they think about this?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-07-07  1:29 ` Zhang, Qiang
@ 2021-07-07  5:58   ` gregkh
  0 siblings, 0 replies; 20+ messages in thread
From: gregkh @ 2021-07-07  5:58 UTC (permalink / raw)
  To: Zhang, Qiang; +Cc: stern, dvyukov, dpenkler, guido.kiener, linux-usb

On Wed, Jul 07, 2021 at 01:29:12AM +0000, Zhang, Qiang wrote:
> 
> 
> ________________________________________
> From: Zhang, Qiang <qiang.zhang@windriver.com>
> Sent: Tuesday, 29 June 2021 11:32
> To: gregkh@linuxfoundation.org; stern@rowland.harvard.edu; dvyukov@google.com
> Cc: paulmck@kernel.org; dpenkler@gmail.com; guido.kiener@rohde-schwarz.com; linux-usb@vger.kernel.org
> 
> Hello Greg KH
> 
> Have you reviewed this change? 

Nope!

You should have got a copy of my "it's the merge window, relax!"
response from my bot, for this patch.  If not, here it is below:

-----------------

Hi,

This is the friendly semi-automated patch-bot of Greg Kroah-Hartman.
You have sent him a patch that has triggered this response.

Right now, the development tree you have sent a patch for is "closed"
due to the timing of the merge window.  Don't worry, the patch(es) you
have sent are not lost, and will be looked at after the merge window is
over (after the -rc1 kernel is released by Linus).

So thank you for your patience and your patches will be reviewed at this
later time, you do not have to do anything further, this is just a short
note to let you know the patch status and so you don't worry they didn't
make it through.

thanks,

greg k-h's patch email bot

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
  2021-06-29  3:32 qiang.zhang
@ 2021-07-07  1:29 ` Zhang, Qiang
  2021-07-07  5:58   ` gregkh
  2021-07-21  7:08 ` Greg KH
  1 sibling, 1 reply; 20+ messages in thread
From: Zhang, Qiang @ 2021-07-07  1:29 UTC (permalink / raw)
  To: gregkh, stern, dvyukov; +Cc: dpenkler, guido.kiener, linux-usb



________________________________________
From: Zhang, Qiang <qiang.zhang@windriver.com>
Sent: Tuesday, 29 June 2021 11:32
To: gregkh@linuxfoundation.org; stern@rowland.harvard.edu; dvyukov@google.com
Cc: paulmck@kernel.org; dpenkler@gmail.com; guido.kiener@rohde-schwarz.com; linux-usb@vger.kernel.org

Hello Greg KH

Have you reviewed this change? 

Thanks
Qiang

>Subject: [PATCH] USB: usbtmc: Fix RCU stall warning
>
>From: Zqiang <qiang.zhang@windriver.com>
>
>rcu: INFO: rcu_preempt self-detected stall on CPU
>rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
>        softirq=25390/25392 fqs=3
>        (t=12164 jiffies g=31645 q=43226)
>rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
>     RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
>rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
>        OOM is now expected behavior.
>rcu: RCU grace-period kthread stack dump:
>task:rcu_preempt     state:R  running task
>
>In the case of system use dummy_hcd as usb controller, when the
>usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
>status is unknown, the urb will be resubmit, the urb may be insert
>to dum_hcd->urbp_list again, this will cause the dummy_timer() not
>to exit for a long time, beacause the dummy_timer() be called in
>softirq and local_bh is disable, this not only causes the RCU reading
>critical area to consume too much time but also makes the tasks in
>the current CPU runq not run in time, and that triggered RCU stall.
>
>return directly when find the urb status is not zero to fix it.
>
>Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
>Signed-off-by: Zqiang <qiang.zhang@windriver.com>
>---
> drivers/usb/class/usbtmc.c | 12 ++----------
> 1 file changed, 2 insertions(+), 10 deletions(-)
>
>diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class/usbtmc.c
>index 74d5a9c5238a..c4e1a88fff78 100644
>--- a/drivers/usb/class/usbtmc.c
>+++ b/drivers/usb/class/usbtmc.c
>@@ -2324,17 +2324,9 @@ static void usbtmc_interrupt(struct urb *urb)
>                dev_err(dev, "overflow with length %d, actual length is %d\n",
>                        data->iin_wMaxPacketSize, urb->actual_length);
>                fallthrough;
>-       case -ECONNRESET:
>-       case -ENOENT:
>-       case -ESHUTDOWN:
>-       case -EILSEQ:
>-       case -ETIME:
>-       case -EPIPE:
>-               /* urb terminated, clean up */
>-               dev_dbg(dev, "urb terminated, status: %d\n", status);
>-               return;
>        default:
>-               dev_err(dev, "unknown status received: %d\n", status);
>+               dev_err(dev, "error status received: %d\n", status);
>+               return;
>        }
> exit:
>        rv = usb_submit_urb(urb, GFP_ATOMIC);
>--
>2.17.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] USB: usbtmc: Fix RCU stall warning
@ 2021-06-29  3:32 qiang.zhang
  2021-07-07  1:29 ` Zhang, Qiang
  2021-07-21  7:08 ` Greg KH
  0 siblings, 2 replies; 20+ messages in thread
From: qiang.zhang @ 2021-06-29  3:32 UTC (permalink / raw)
  To: gregkh, stern, dvyukov; +Cc: paulmck, dpenkler, guido.kiener, linux-usb

From: Zqiang <qiang.zhang@windriver.com>

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
        softirq=25390/25392 fqs=3
        (t=12164 jiffies g=31645 q=43226)
rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
     RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
        OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task

In the case of system use dummy_hcd as usb controller, when the
usbtmc devices is disconnected, in usbtmc_interrupt(), if the urb
status is unknown, the urb will be resubmit, the urb may be insert
to dum_hcd->urbp_list again, this will cause the dummy_timer() not
to exit for a long time, beacause the dummy_timer() be called in
softirq and local_bh is disable, this not only causes the RCU reading
critical area to consume too much time but also makes the tasks in
the current CPU runq not run in time, and that triggered RCU stall.

return directly when find the urb status is not zero to fix it.

Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
---
 drivers/usb/class/usbtmc.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class/usbtmc.c
index 74d5a9c5238a..c4e1a88fff78 100644
--- a/drivers/usb/class/usbtmc.c
+++ b/drivers/usb/class/usbtmc.c
@@ -2324,17 +2324,9 @@ static void usbtmc_interrupt(struct urb *urb)
 		dev_err(dev, "overflow with length %d, actual length is %d\n",
 			data->iin_wMaxPacketSize, urb->actual_length);
 		fallthrough;
-	case -ECONNRESET:
-	case -ENOENT:
-	case -ESHUTDOWN:
-	case -EILSEQ:
-	case -ETIME:
-	case -EPIPE:
-		/* urb terminated, clean up */
-		dev_dbg(dev, "urb terminated, status: %d\n", status);
-		return;
 	default:
-		dev_err(dev, "unknown status received: %d\n", status);
+		dev_err(dev, "error status received: %d\n", status);
+		return;
 	}
 exit:
 	rv = usb_submit_urb(urb, GFP_ATOMIC);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-07-23  0:36 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-21 17:08 [PATCH] USB: usbtmc: Fix RCU stall warning Guido Kiener
2021-07-21 18:16 ` Alan Stern
  -- strict thread matches above, loose matches on Subject: below --
2021-07-22 17:33 Guido Kiener
2021-07-23  0:36 ` Zhang, Qiang
2021-07-21 15:24 Guido Kiener
2021-07-21 16:17 ` Alan Stern
2021-07-21 11:15 Guido Kiener
2021-06-29  3:32 qiang.zhang
2021-07-07  1:29 ` Zhang, Qiang
2021-07-07  5:58   ` gregkh
2021-07-21  7:08 ` Greg KH
2021-07-21  7:30   ` Zhang, Qiang
2021-07-21  7:52     ` Greg KH
2021-07-21  8:28       ` Zhang, Qiang
2021-07-21  7:41   ` dave penkler
2021-07-21  7:52     ` Greg KH
2021-07-21  9:44       ` dave penkler
2021-07-21  9:47         ` Greg KH
2021-07-21 14:22         ` Alan Stern
2021-07-21  8:34     ` Zhang, Qiang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.