linux-kernel.vger.kernel.org archive mirror
* RE: Re: Re: Re: Re: Re: [syzbot] INFO: rcu detected stall in tx
@ 2021-05-19 16:14 Guido Kiener
  2021-05-19 17:35 ` Alan Stern
  2021-05-19 18:04 ` Re: Re: Re: Re: " Lee Jones
  0 siblings, 2 replies; 17+ messages in thread
From: Guido Kiener @ 2021-05-19 16:14 UTC (permalink / raw)
  To: Alan Stern, dave penkler
  Cc: Dmitry Vyukov, syzbot, Greg Kroah-Hartman, lee.jones, USB list,
	bp, dwmw, hpa, linux-kernel, luto, mingo, syzkaller-bugs, tglx,
	x86

> On Wed, May 19, 2021 at 10:48:29AM +0200, dave penkler wrote:
> > On Sat, 8 May 2021 at 16:29, Alan Stern <stern@rowland.harvard.edu> wrote:
> > >
> > > On Sat, May 08, 2021 at 10:14:41AM +0200, dave penkler wrote:
> > > > When the host driver detects a protocol error while processing an
> > > > URB it completes the URB with EPROTO status and marks the endpoint
> > > > as halted.
> > >
> > > Not true.  It does not mark the endpoint as halted, not unless it
> > > receives a STALL handshake from the device.  A STALL is not a
> > > protocol error.
> > >
> > > > When the class driver resubmits the URB and if the host driver
> > > > finds the endpoint still marked as halted it should return EPIPE
> > > > status on the resubmitted URB
> > >
> > > Irrelevant.
> > Not at all. The point is that when an application is talking to an
> > instrument over the usbtmc driver, the underlying host controller and
> > its driver will detect and silence a babbling endpoint.
> 
> No, they won't.  That is, they will detect a babble error and return an error status, but
> they won't silence the endpoint.  What makes you think they will?

Maybe there is a misunderstanding. I guess that Dave wanted to propose:
"EPROTO is a link level issue and needs to be handled by the host driver.
When the host driver detects a protocol error while processing an
URB it SHOULD complete the URB with EPROTO status and SHOULD mark the endpoint
as halted."
Is this a realistic fix for all host drivers?

-Guido

* Re: Re: Re: Re: Re: Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-19 16:14 Re: Re: Re: Re: Re: [syzbot] INFO: rcu detected stall in tx Guido Kiener
@ 2021-05-19 17:35 ` Alan Stern
  2021-05-19 19:38   ` Thinh Nguyen
  2021-05-19 18:04 ` Re: Re: Re: Re: " Lee Jones
  1 sibling, 1 reply; 17+ messages in thread
From: Alan Stern @ 2021-05-19 17:35 UTC (permalink / raw)
  To: Guido Kiener
  Cc: dave penkler, Dmitry Vyukov, syzbot, Greg Kroah-Hartman,
	lee.jones, USB list, bp, dwmw, hpa, linux-kernel, luto, mingo,
	syzkaller-bugs, tglx, x86

On Wed, May 19, 2021 at 04:14:20PM +0000, Guido Kiener wrote:
> > On Wed, May 19, 2021 at 10:48:29AM +0200, dave penkler wrote:
> > > On Sat, 8 May 2021 at 16:29, Alan Stern <stern@rowland.harvard.edu> wrote:
> > > >
> > > > On Sat, May 08, 2021 at 10:14:41AM +0200, dave penkler wrote:
> > > > > When the host driver detects a protocol error while processing an
> > > > > URB it completes the URB with EPROTO status and marks the endpoint
> > > > > as halted.
> > > >
> > > > Not true.  It does not mark the endpoint as halted, not unless it
> > > > receives a STALL handshake from the device.  A STALL is not a
> > > > protocol error.
> > > >
> > > > > When the class driver resubmits the URB and if the host driver
> > > > > finds the endpoint still marked as halted it should return EPIPE
> > > > > status on the resubmitted URB
> > > >
> > > > Irrelevant.
> > > Not at all. The point is that when an application is talking to an
> > > instrument over the usbtmc driver, the underlying host controller and
> > > its driver will detect and silence a babbling endpoint.
> > 
> > No, they won't.  That is, they will detect a babble error and return an error status, but
> > they won't silence the endpoint.  What makes you think they will?
> 
> Maybe there is a misunderstanding. I guess that Dave wanted to propose:
> "EPROTO is a link level issue and needs to be handled by the host driver.
> When the host driver detects a protocol error while processing an
> URB it SHOULD complete the URB with EPROTO status

The host controller drivers _do_ complete URBs with -EPROTO (or similar) 
status when a link-level error occurs...

> and SHOULD mark the endpoint
> as halted."

but they don't mark the endpoint as halted.  Even if they did, it 
wouldn't fix anything because the kernel allows URBs to be submitted to 
halted endpoints.  In fact, it doesn't even keep track of which 
endpoints are or are not halted.

> Is this a realistic fix for all host drivers?

No, it isn't.

An endpoint shouldn't be marked as halted unless it really is halted.  
Otherwise a driver might be tempted to clear the Halt feature, and 
some devices do not like to receive a Clear-Halt request for an endpoint 
that isn't halted.

What we could do is what you suggested earlier: Note the fact that the 
endpoint is in some sort of fault condition and disallow further 
communication with the endpoint until the fault condition has been 
cleared.  (It isn't entirely obvious exactly what actions should clear 
such a fault...  I guess resetting or re-enabling the endpoint, or 
resetting the entire device.)
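
A minimal userspace sketch of that fault-flag idea, for illustration only. The names `model_ep`, `ep_note_completion`, `ep_submit` and `ep_reset` are hypothetical and are not kernel APIs, and returning -EPROTO for a rejected submission is an assumption. A STALL (-EPIPE) deliberately does not set the flag, since that is a real halt rather than a link-level fault.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side bookkeeping for one endpoint. */
struct model_ep {
	bool faulted;	/* set after a link-level error, cleared by a reset */
};

/* Record the outcome of a completed transfer. */
static void ep_note_completion(struct model_ep *ep, int status)
{
	assert(ep != NULL);
	/* Link-level errors put the endpoint into the fault condition. */
	if (status == -EPROTO || status == -EILSEQ || status == -ETIME)
		ep->faulted = true;
}

/* Refuse new submissions while the fault condition is set.
 * The error code returned here is an assumption. */
static int ep_submit(const struct model_ep *ep)
{
	assert(ep != NULL);
	return ep->faulted ? -EPROTO : 0;
}

/* Stands in for resetting/re-enabling the endpoint or the device. */
static void ep_reset(struct model_ep *ep)
{
	assert(ep != NULL);
	ep->faulted = false;
}
```

With this model, a driver that blindly resubmits after -EPROTO gets an immediate failure from `ep_submit()` until something calls `ep_reset()`.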

Alan Stern

* Re: Re: Re: Re: Re: Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-19 16:14 Re: Re: Re: Re: Re: [syzbot] INFO: rcu detected stall in tx Guido Kiener
  2021-05-19 17:35 ` Alan Stern
@ 2021-05-19 18:04 ` Lee Jones
  1 sibling, 0 replies; 17+ messages in thread
From: Lee Jones @ 2021-05-19 18:04 UTC (permalink / raw)
  To: Guido Kiener
  Cc: Alan Stern, dave penkler, Dmitry Vyukov, syzbot,
	Greg Kroah-Hartman, USB list, bp, dwmw, hpa, linux-kernel, luto,
	mingo, syzkaller-bugs, tglx, x86

On Wed, 19 May 2021, Guido Kiener wrote:

> > On Wed, May 19, 2021 at 10:48:29AM +0200, dave penkler wrote:
> > > On Sat, 8 May 2021 at 16:29, Alan Stern <stern@rowland.harvard.edu> wrote:
> > > >
> > > > On Sat, May 08, 2021 at 10:14:41AM +0200, dave penkler wrote:
> > > > > When the host driver detects a protocol error while processing an
> > > > > URB it completes the URB with EPROTO status and marks the endpoint
> > > > > as halted.
> > > >
> > > > Not true.  It does not mark the endpoint as halted, not unless it
> > > > receives a STALL handshake from the device.  A STALL is not a
> > > > protocol error.
> > > >
> > > > > When the class driver resubmits the URB and if the host driver
> > > > > finds the endpoint still marked as halted it should return EPIPE
> > > > > status on the resubmitted URB
> > > >
> > > > Irrelevant.
> > > Not at all. The point is that when an application is talking to an
> > > instrument over the usbtmc driver, the underlying host controller and
> > > its driver will detect and silence a babbling endpoint.
> > 
> > No, they won't.  That is, they will detect a babble error and return an error status, but
> > they won't silence the endpoint.  What makes you think they will?
> 
> Maybe there is a misunderstanding. I guess that Dave wanted to propose:
> "EPROTO is a link level issue and needs to be handled by the host driver.
> When the host driver detects a protocol error while processing an
> URB it SHOULD complete the URB with EPROTO status and SHOULD mark the endpoint
> as halted."
> Is this a realistic fix for all host drivers?
> 
> -Guido

Guido, would you mind taking a look at your mailer settings please?  I
now have >=7 threads running through my inbox with the same subject.
For some reason your mailer is insisting on creating a new one for
each of your replies.

It's also adding odd "re: re: re: ..." prefixes.

TIA

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-19 17:35 ` Alan Stern
@ 2021-05-19 19:38   ` Thinh Nguyen
  2021-05-20  2:01     ` Alan Stern
  0 siblings, 1 reply; 17+ messages in thread
From: Thinh Nguyen @ 2021-05-19 19:38 UTC (permalink / raw)
  To: Alan Stern, Guido Kiener
  Cc: dave penkler, Dmitry Vyukov, syzbot, Greg Kroah-Hartman,
	lee.jones, USB list, bp, dwmw, hpa, linux-kernel, luto, mingo,
	syzkaller-bugs, tglx, x86

Alan Stern wrote:
> On Wed, May 19, 2021 at 04:14:20PM +0000, Guido Kiener wrote:
>>> On Wed, May 19, 2021 at 10:48:29AM +0200, dave penkler wrote:
>>>> On Sat, 8 May 2021 at 16:29, Alan Stern <stern@rowland.harvard.edu> wrote:
>>>>>
>>>>> On Sat, May 08, 2021 at 10:14:41AM +0200, dave penkler wrote:
>>>>>> When the host driver detects a protocol error while processing an
>>>>>> URB it completes the URB with EPROTO status and marks the endpoint
>>>>>> as halted.
>>>>>
>>>>> Not true.  It does not mark the endpoint as halted, not unless it
>>>>> receives a STALL handshake from the device.  A STALL is not a
>>>>> protocol error.
>>>>>
>>>>>> When the class driver resubmits the URB and if the host driver
>>>>>> finds the endpoint still marked as halted it should return EPIPE
>>>>>> status on the resubmitted URB
>>>>>
>>>>> Irrelevant.
>>>> Not at all. The point is that when an application is talking to an
>>>> instrument over the usbtmc driver, the underlying host controller and
>>>> its driver will detect and silence a babbling endpoint.
>>>
>>> No, they won't.  That is, they will detect a babble error and return an error status, but
>>> they won't silence the endpoint.  What makes you think they will?
>>
>> Maybe there is a misunderstanding. I guess that Dave wanted to propose:
>> "EPROTO is a link level issue and needs to be handled by the host driver.
>> When the host driver detects a protocol error while processing an
>> URB it SHOULD complete the URB with EPROTO status
> 
> The host controller drivers _do_ complete URBs with -EPROTO (or similar) 
> status when a link-level error occurs...
> 
>> and SHOULD mark the endpoint
>> as halted."
> 
> but they don't mark the endpoint as halted.  Even if they did, it 
> wouldn't fix anything because the kernel allows URBs to be submitted to 
> halted endpoints.  In fact, it doesn't even keep track of which 
> endpoints are or are not halted.
> 
>> Is this a realistic fix for all host drivers?
> 
> No, it isn't.
> 
> An endpoint shouldn't be marked as halted unless it really is halted.  
> Otherwise a driver might be tempted to clear the Halt feature, and 
> some devices do not like to receive a Clear-Halt request for an endpoint 
> that isn't halted.
> 
> What we could do is what you suggested earlier: Note the fact that the 
> endpoint is in some sort of fault condition and disallow further 
> communication with the endpoint until the fault condition has been 
> cleared.  (It isn't entirely obvious exactly what actions should clear 
> such a fault...  I guess resetting or re-enabling the endpoint, or 
> resetting the entire device.)
> 
> Alan Stern
> 

Hi Alan,

Sorry if this diverges from the thread, but I've been wondering whether
to add a change for this also.

For xHCI hosts, after transaction errors, the endpoint will enter
halted state. The driver will attempt a few soft-retries before giving
up. According to the xHCI spec (section 4.6.8), a host may send a
ClearFeature(endpoint_halt) to recover and restart the transfer (see
"reset a pipe" in xhci spec), and the class driver can handle this after
receiving something like -EPROTO from xhci.

However, as you've pointed out, some devices don't like
ClearFeature(ep_halt) and may not properly synchronize with the host on
where it should restart.

Some OS (such as Windows) do this. Not sure if we also want this?
Currently the recovery is just a timeout and a port reset from the class
driver, but the timeout is usually defaulted to a long time (e.g. 30
seconds for storage class driver).

Thanks,
Thinh

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-19 19:38   ` Thinh Nguyen
@ 2021-05-20  2:01     ` Alan Stern
  2021-05-20 20:30       ` Thinh Nguyen
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Stern @ 2021-05-20  2:01 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Guido Kiener, dave penkler, Dmitry Vyukov, syzbot,
	Greg Kroah-Hartman, lee.jones, USB list, bp, dwmw, hpa,
	linux-kernel, luto, mingo, syzkaller-bugs, tglx, x86

On Wed, May 19, 2021 at 07:38:52PM +0000, Thinh Nguyen wrote:
> Hi Alan,
> 
> Sorry if this diverges from the thread, but I've been wondering whether
> to add a change for this also.
> 
> For xHCI hosts, after transaction errors, the endpoint will enter
> halted state.

No.  You are misreading the xHCI spec.  Section 4.6.8 says:

	... the state of the associated Endpoint Context is set to 
	Halted...

Note this carefully.  It says "Endpoint Context", not "endpoint".

The endpoint is part of the device, whereas the endpoint context is part 
of the host controller.  The device doesn't know when a transaction 
error has occurred; consequently such errors do not affect the endpoint.  
The host controller does know, and consequently such errors do affect 
the endpoint context.

> The driver will attempt a few soft-retries before giving
> up. According to the xHCI spec (section 4.6.8), a host may send a
> ClearFeature(endpoint_halt) to recover and restart the transfer (see

Not quite.  The section of the spec you're talking about says:

	Software shall execute the following sequence to “reset a 
	pipe”....  Issue a ClearFeature(ENDPOINT_HALT) request to 
	device.

It does not say the host controller will do this; it says that software 
will do it.

> "reset a pipe" in xhci spec), and the class driver can handle this after
> receiving something like -EPROTO from xhci.
> 
> However, as you've pointed out, some devices don't like
> ClearFeature(ep_halt) and may not properly synchronize with the host on
> where it should restart.
> 
> Some OS (such as Windows) do this. Not sure if we also want this?

In general we should do the same thing as Windows does, because most 
hardware designers test their equipment on Windows systems but 
relatively few test on Linux systems.

> Currently the recovery is just a timeout and a port reset from the class

This depends on the driver.  Some perform no recovery at all.

> driver, but the timeout is usually defaulted to a long time (e.g. 30
> seconds for storage class driver).

That 30-second timeout in the mass-storage driver applies in situations 
where a command fails to complete, not in situations where it completes 
quickly but with a -EPROTO or -EPIPE error.

The fact is that only a small percentage of -EPROTO errors are 
recoverable.  Some of them can be handled by a port reset, which can be 
pretty awkward to perform but does occasionally work.  A lot of them 
occur because the USB cable has been unplugged; obviously there's no way 
to recover from that.  With only a few exceptions, the best and simplest 
approach is not to try to recover at all.

For the case in question (the syzbot bug report that started this 
thread), the class driver doesn't try to perform any recovery.  It just 
resubmits the URB, getting into a tight retry loop which consumes too 
much CPU time.  Simply giving up would be preferable.
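
A minimal userspace sketch of the "give up" policy, not the actual class driver code. `should_resubmit` and `run_transfers` are hypothetical names; the model simply replays a list of completion statuses and stops at the first error instead of resubmitting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy helper (not a kernel API): resubmit only after a
 * successful completion; treat every error status as final. */
static bool should_resubmit(int status)
{
	return status == 0;
}

/* Userspace model of a completion loop. statuses[i] is the status the
 * i-th submission completes with. Returns how many submissions were
 * made before the driver stopped or ran out of modelled completions. */
static size_t run_transfers(const int *statuses, size_t n)
{
	size_t submitted = 0;

	assert(statuses != NULL || n == 0);
	while (submitted < n) {
		int status = statuses[submitted++];

		if (!should_resubmit(status))
			break;	/* give up instead of spinning */
	}
	return submitted;
}
```

The point of the model is that a run of -EPROTO completions costs exactly one submission, rather than one per error as in the tight retry loop.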

Alan Stern

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-20  2:01     ` Alan Stern
@ 2021-05-20 20:30       ` Thinh Nguyen
  2021-05-24 15:18         ` Mathias Nyman
  0 siblings, 1 reply; 17+ messages in thread
From: Thinh Nguyen @ 2021-05-20 20:30 UTC (permalink / raw)
  To: Alan Stern, Thinh Nguyen, Mathias Nyman
  Cc: Guido Kiener, dave penkler, Dmitry Vyukov, syzbot,
	Greg Kroah-Hartman, lee.jones, USB list, bp, dwmw, hpa,
	linux-kernel, luto, mingo, syzkaller-bugs, tglx, x86

+Mathias

Alan Stern wrote:
> On Wed, May 19, 2021 at 07:38:52PM +0000, Thinh Nguyen wrote:
>> Hi Alan,
>>
>> Sorry if this diverges from the thread, but I've been wondering whether
>> to add a change for this also.
>>
>> For xHCI hosts, after transaction errors, the endpoint will enter
>> halted state.
> 
> No.  You are misreading the xHCI spec.  Section 4.6.8 says:
> 
> 	... the state of the associated Endpoint Context is set to 
> 	Halted...
> 
> Note this carefully.  It says "Endpoint Context", not "endpoint".
> 
> The endpoint is part of the device, whereas the endpoint context is part 
> of the host controller.  The device doesn't know when a transaction 
> error has occurred; consequently such errors do not affect the endpoint.  
> The host controller does know, and consequently such errors do affect 
> the endpoint context.
> 

You're right, my mistake here.

>> The driver will attempt a few soft-retries before giving
>> up. According to the xHCI spec (section 4.6.8), a host may send a
>> ClearFeature(endpoint_halt) to recover and restart the transfer (see
> 
> Not quite.  The section of the spec you're talking about says:
> 
> 	Software shall execute the following sequence to “reset a 
> 	pipe”....  Issue a ClearFeature(ENDPOINT_HALT) request to 
> 	device.
> 
> It does not say the host controller will do this; it says that software 
> will do it.

Sorry for being unclear. I meant from the class driver, see my next
sentence.

> 
>> "reset a pipe" in xhci spec), and the class driver can handle this after
>> receiving something like -EPROTO from xhci.
>>
>> However, as you've pointed out, some devices don't like
>> ClearFeature(ep_halt) and may not properly synchronize with the host on
>> where it should restart.
>>
>> Some OS (such as Windows) do this. Not sure if we also want this?
> 
> In general we should do the same thing as Windows does, because most 
> hardware designers test their equipment on Windows systems but 
> relatively few test on Linux systems.
> 
>> Currently the recovery is just a timeout and a port reset from the class
> 
> This depends on the driver.  Some perform no recovery at all.
> 
>> driver, but the timeout is usually defaulted to a long time (e.g. 30
>> seconds for storage class driver).
> 
> That 30-second timeout in the mass-storage driver applies in situations 
> where a command fails to complete, not in situations where it completes 
> quickly but with a -EPROTO or -EPIPE error.

Hm... looks like we have a couple of issues in the uas storage class
driver and the xhci driver.

We may need to fix that in the uas storage driver because it doesn't
seem to handle it. (check uas_data_cmplt() in uas.c).

As for the xhci driver, there may be a case where the stream URB never
gets to complete because the transaction err_count is not properly
updated. The err_count for transaction errors is stored in ep_ring, but
the xhci driver may not be able to look up the correct ep_ring based on
the TRB address for streams. There are cases for streams where the event
TRBs have their TRB pointer field cleared to '0' (xhci spec section
4.12.2). If the xhci driver doesn't find an ep_ring for a transaction
error, it automatically does a soft retry. We saw this in one of our
tests: the driver kept doing soft retries until the class driver timed
out.

Hi Mathias, maybe you have some comment on this? Thanks.

> 
> The fact is that only a small percentage of -EPROTO errors are 
> recoverable.  Some of them can be handled by a port reset, which can be 
> pretty awkward to perform but does occasionally work.  A lot of them 
> occur because the USB cable has been unplugged; obviously there's no way 
> to recover from that.  With only a few exceptions, the best and simplest 
> approach is not to try to recover at all.

If the cable is unplugged, then we should get a connection change event
and the driver can handle it properly.

Yes, it's probably simplest to do a port reset and let the transfer be
incomplete/corrupted. However, I think we should give
ClearFeature(ep_halt) some more thought, as I think it can be a recovery
mechanism for the storage class driver, even though it may not be
foolproof.

> 
> For the case in question (the syzbot bug report that started this 
> thread), the class driver doesn't try to perform any recovery.  It just 
> resubmits the URB, getting into a tight retry loop which consumes too 
> much CPU time.  Simply giving up would be preferable.
> 
> Alan Stern
> 

I see. By giving up, you mean doing a port reset, right? Otherwise it needs
some other mechanism to synchronize with the device side.

Thanks,
Thinh

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-20 20:30       ` Thinh Nguyen
@ 2021-05-24 15:18         ` Mathias Nyman
  2021-05-24 18:55           ` Alan Stern
  0 siblings, 1 reply; 17+ messages in thread
From: Mathias Nyman @ 2021-05-24 15:18 UTC (permalink / raw)
  To: Thinh Nguyen, Alan Stern, Mathias Nyman
  Cc: Guido Kiener, dave penkler, Dmitry Vyukov, syzbot,
	Greg Kroah-Hartman, lee.jones, USB list, bp, dwmw, hpa,
	linux-kernel, luto, mingo, syzkaller-bugs, tglx, x86

On 20.5.2021 23.30, Thinh Nguyen wrote:
> +Mathias
> 
...

> Hm... looks like we have a couple of issues in the uas storage class
> driver and the xhci driver.
> 
> We may need to fix that in the uas storage driver because it doesn't
> seem to handle it. (check uas_data_cmplt() in uas.c).
> 
> As for the xhci driver, there may be a case where the stream URB never
> gets to complete because the transaction err_count is not properly
> updated. The err_count for transaction errors is stored in ep_ring, but
> the xhci driver may not be able to look up the correct ep_ring based on
> the TRB address for streams. There are cases for streams where the event
> TRBs have their TRB pointer field cleared to '0' (xhci spec section
> 4.12.2). If the xhci driver doesn't find an ep_ring for a transaction
> error, it automatically does a soft retry. We saw this in one of our
> tests: the driver kept doing soft retries until the class driver timed
> out.
> 
> Hi Mathias, maybe you have some comment on this? Thanks.

This is true: if the TRB pointer is 0, there is no retry limit for soft retry.
We should add one and prevent a loop. After a few soft resets we can end with a
hard reset to clear the host-side endpoint halt.

We don't know the URB that was being transferred during the error, and can't
give it back with a proper error code.
In that sense we still end up waiting for a timeout and someone to cancel
the URB.
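
A rough userspace sketch of a bounded soft retry, not the xhci driver's actual code. `model_ep_ctx`, `on_transaction_error` and the limit `MODEL_MAX_SOFT_RETRY` are hypothetical. The counter is kept per endpoint rather than per ring, on the assumption that the endpoint is still known when the TRB pointer is 0.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_SOFT_RETRY 3	/* assumed limit, chosen for illustration */

enum model_recovery {
	MODEL_SOFT_RETRY,	/* retry the transfer on the host side */
	MODEL_HARD_RESET,	/* stop retrying and clear the host-side halt */
};

/* Hypothetical per-endpoint state; it does not depend on finding a ring. */
struct model_ep_ctx {
	unsigned int err_count;
};

/* Decide what to do after one more transaction error on this endpoint. */
static enum model_recovery on_transaction_error(struct model_ep_ctx *ep)
{
	assert(ep != NULL);
	if (ep->err_count < MODEL_MAX_SOFT_RETRY) {
		ep->err_count++;
		return MODEL_SOFT_RETRY;
	}
	ep->err_count = 0;	/* start fresh after the hard reset */
	return MODEL_HARD_RESET;
}
```

In this model the fourth consecutive error ends the retry loop regardless of whether the event carried a usable TRB pointer.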

-Mathias

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-24 15:18         ` Mathias Nyman
@ 2021-05-24 18:55           ` Alan Stern
  2021-05-24 19:23             ` Thinh Nguyen
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Stern @ 2021-05-24 18:55 UTC (permalink / raw)
  To: Mathias Nyman
  Cc: Thinh Nguyen, Mathias Nyman, Guido Kiener, dave penkler,
	Dmitry Vyukov, syzbot, Greg Kroah-Hartman, lee.jones, USB list,
	bp, dwmw, hpa, linux-kernel, luto, mingo, syzkaller-bugs, tglx,
	x86

On Mon, May 24, 2021 at 06:18:59PM +0300, Mathias Nyman wrote:
> On 20.5.2021 23.30, Thinh Nguyen wrote:
> > As for the xhci driver, there may be a case where the stream URB never
> > gets to complete because the transaction err_count is not properly
> > updated. The err_count for transaction errors is stored in ep_ring, but
> > the xhci driver may not be able to look up the correct ep_ring based on
> > the TRB address for streams. There are cases for streams where the event
> > TRBs have their TRB pointer field cleared to '0' (xhci spec section
> > 4.12.2). If the xhci driver doesn't find an ep_ring for a transaction
> > error, it automatically does a soft retry. We saw this in one of our
> > tests: the driver kept doing soft retries until the class driver timed
> > out.
> > 
> > Hi Mathias, maybe you have some comment on this? Thanks.
> 
> This is true: if the TRB pointer is 0, there is no retry limit for soft retry.
> We should add one and prevent a loop. After a few soft resets we can end with a
> hard reset to clear the host-side endpoint halt.
> 
> We don't know the URB that was being transferred during the error, and can't
> give it back with a proper error code.
> In that sense we still end up waiting for a timeout and someone to cancel
> the URB.

That's not good.  There may not be a timeout; drivers expect transfers 
to complete with a failure, not to be retried indefinitely.

However, if you do know which endpoint/stream the error is connected to, 
you should be able to get the URB.  It will be the first one queued for 
that endpoint/stream.

Alan Stern

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-24 18:55           ` Alan Stern
@ 2021-05-24 19:23             ` Thinh Nguyen
  2021-05-24 22:16               ` Mathias Nyman
  0 siblings, 1 reply; 17+ messages in thread
From: Thinh Nguyen @ 2021-05-24 19:23 UTC (permalink / raw)
  To: Alan Stern, Mathias Nyman
  Cc: Thinh Nguyen, Mathias Nyman, Guido Kiener, dave penkler,
	Dmitry Vyukov, syzbot, Greg Kroah-Hartman, lee.jones, USB list,
	bp, dwmw, hpa, linux-kernel, luto, mingo, syzkaller-bugs, tglx,
	x86

Alan Stern wrote:
> On Mon, May 24, 2021 at 06:18:59PM +0300, Mathias Nyman wrote:
>> On 20.5.2021 23.30, Thinh Nguyen wrote:
>>> As for the xhci driver, there may be a case where the stream URB never
>>> gets to complete because the transaction err_count is not properly
>>> updated. The err_count for transaction errors is stored in ep_ring, but
>>> the xhci driver may not be able to look up the correct ep_ring based on
>>> the TRB address for streams. There are cases for streams where the event
>>> TRBs have their TRB pointer field cleared to '0' (xhci spec section
>>> 4.12.2). If the xhci driver doesn't find an ep_ring for a transaction
>>> error, it automatically does a soft retry. We saw this in one of our
>>> tests: the driver kept doing soft retries until the class driver timed
>>> out.
>>>
>>> Hi Mathias, maybe you have some comment on this? Thanks.
>>
>> This is true: if the TRB pointer is 0, there is no retry limit for soft retry.
>> We should add one and prevent a loop. After a few soft resets we can end with a
>> hard reset to clear the host-side endpoint halt.
>>
>> We don't know the URB that was being transferred during the error, and can't
>> give it back with a proper error code.
>> In that sense we still end up waiting for a timeout and someone to cancel
>> the URB.
> 
> That's not good.  There may not be a timeout; drivers expect transfers 
> to complete with a failure, not to be retried indefinitely.
> 
> However, if you do know which endpoint/stream the error is connected to, 
> you should be able to get the URB.  It will be the first one queued for 
> that endpoint/stream.
> 

When the xhci can't recover a transfer with soft-retry, no outstanding
transfer can proceed/complete for the endpoint. If the TRB pointer is 0,
we just don't know which stream or endpoint ring it's for, but we know
all the outstanding URBs of an endpoint. We may as well return an
error status for all of them after a limited number of soft-retries.

BR,
Thinh

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-24 19:23             ` Thinh Nguyen
@ 2021-05-24 22:16               ` Mathias Nyman
  2021-05-24 22:48                 ` Thinh Nguyen
  0 siblings, 1 reply; 17+ messages in thread
From: Mathias Nyman @ 2021-05-24 22:16 UTC (permalink / raw)
  To: Thinh Nguyen, Alan Stern
  Cc: Mathias Nyman, Guido Kiener, dave penkler, Dmitry Vyukov, syzbot,
	Greg Kroah-Hartman, lee.jones, USB list, bp, dwmw, hpa,
	linux-kernel, luto, mingo, syzkaller-bugs, tglx, x86

On 24.5.2021 22.23, Thinh Nguyen wrote:
> Alan Stern wrote:
>> On Mon, May 24, 2021 at 06:18:59PM +0300, Mathias Nyman wrote:
>>> On 20.5.2021 23.30, Thinh Nguyen wrote:
>>>> As for the xhci driver, there may be a case where the stream URB never
>>>> gets to complete because the transaction err_count is not properly
>>>> updated. The err_count for transaction errors is stored in ep_ring, but
>>>> the xhci driver may not be able to look up the correct ep_ring based on
>>>> the TRB address for streams. There are cases for streams where the event
>>>> TRBs have their TRB pointer field cleared to '0' (xhci spec section
>>>> 4.12.2). If the xhci driver doesn't find an ep_ring for a transaction
>>>> error, it automatically does a soft retry. We saw this in one of our
>>>> tests: the driver kept doing soft retries until the class driver timed
>>>> out.
>>>>
>>>> Hi Mathias, maybe you have some comment on this? Thanks.
>>>
>>> This is true: if the TRB pointer is 0, there is no retry limit for soft retry.
>>> We should add one and prevent a loop. After a few soft resets we can end with a
>>> hard reset to clear the host-side endpoint halt.
>>>
>>> We don't know the URB that was being transferred during the error, and can't
>>> give it back with a proper error code.
>>> In that sense we still end up waiting for a timeout and someone to cancel
>>> the URB.
>>
>> That's not good.  There may not be a timeout; drivers expect transfers 
>> to complete with a failure, not to be retried indefinitely.
>>
>> However, if you do know which endpoint/stream the error is connected to, 
>> you should be able to get the URB.  It will be the first one queued for 
>> that endpoint/stream.
>>
> 
> When the xhci can't recover a transfer with soft-retry, no outstanding
> transfer can proceed/complete for the endpoint. If the TRB pointer is 0,
> we just don't know which stream or endpoint ring it's for, but we know
> all the outstanding URBs of an endpoint. We may as well return an
> error status for all of them after a limited number of soft-retries.

We get the endpoint, but not the stream.

I guess we could walk through each stream of this endpoint, and return the 
first URB of every stream that has a pending URB.
The xHCI spec claims to support 65533 streams per endpoint, but in real life
UAS probably only uses a few per endpoint?
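
A minimal userspace sketch of that walk, for illustration only. `model_urb`, `model_stream` and `fail_first_urb_per_stream` are hypothetical names, not xhci driver structures, and the model tracks only the head of each stream's queue.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an URB. */
struct model_urb {
	int status;
	bool given_back;
};

/* Hypothetical stream: only the first pending URB is modelled. */
struct model_stream {
	struct model_urb *head;	/* NULL when nothing is pending */
};

/* Walk every stream of one endpoint and complete the first pending URB
 * of each with the given error status. Returns how many URBs were
 * given back. */
static size_t fail_first_urb_per_stream(struct model_stream *streams,
					size_t num_streams, int status)
{
	size_t given_back = 0;

	assert(streams != NULL || num_streams == 0);
	for (size_t i = 0; i < num_streams; i++) {
		struct model_urb *urb = streams[i].head;

		if (urb == NULL)
			continue;	/* nothing pending on this stream */
		urb->status = status;
		urb->given_back = true;
		streams[i].head = NULL;	/* the model keeps no further queue */
		given_back++;
	}
	return given_back;
}
```

The cost is linear in the number of streams on the endpoint, so it stays cheap as long as devices use only a few streams in practice.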

-Mathias 

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-05-24 22:16               ` Mathias Nyman
@ 2021-05-24 22:48                 ` Thinh Nguyen
  0 siblings, 0 replies; 17+ messages in thread
From: Thinh Nguyen @ 2021-05-24 22:48 UTC (permalink / raw)
  To: Mathias Nyman, Thinh Nguyen, Alan Stern
  Cc: Mathias Nyman, Guido Kiener, dave penkler, Dmitry Vyukov, syzbot,
	Greg Kroah-Hartman, lee.jones, USB list, bp, dwmw, hpa,
	linux-kernel, luto, mingo, syzkaller-bugs, tglx, x86

Mathias Nyman wrote:
> On 24.5.2021 22.23, Thinh Nguyen wrote:
>> Alan Stern wrote:
>>> On Mon, May 24, 2021 at 06:18:59PM +0300, Mathias Nyman wrote:
>>>> On 20.5.2021 23.30, Thinh Nguyen wrote:
>>>>> As for the xhci driver, there may be a case where the stream URB never
>>>>> gets to complete because the transaction err_count is not properly
>>>>> updated. The err_count for transaction errors is stored in ep_ring, but
>>>>> the xhci driver may not be able to look up the correct ep_ring based on
>>>>> the TRB address for streams. There are cases for streams where the event
>>>>> TRBs have their TRB pointer field cleared to '0' (xhci spec section
>>>>> 4.12.2). If the xhci driver doesn't find an ep_ring for a transaction
>>>>> error, it automatically does a soft retry. We saw this in one of our
>>>>> tests: the driver kept doing soft retries until the class driver timed
>>>>> out.
>>>>>
>>>>> Hi Mathias, maybe you have some comment on this? Thanks.
>>>>
>>>> This is true: if the TRB pointer is 0, there is no retry limit for soft retry.
>>>> We should add one and prevent a loop. After a few soft resets we can end with a
>>>> hard reset to clear the host-side endpoint halt.
>>>>
>>>> We don't know the URB that was being transferred during the error, and can't
>>>> give it back with a proper error code.
>>>> In that sense we still end up waiting for a timeout and someone to cancel
>>>> the URB.
>>>
>>> That's not good.  There may not be a timeout; drivers expect transfers 
>>> to complete with a failure, not to be retried indefinitely.
>>>
>>> However, if you do know which endpoint/stream the error is connected to, 
>>> you should be able to get the URB.  It will be the first one queued for 
>>> that endpoint/stream.
>>>
>>
>> When the xhci can't recover a transfer with soft-retry, no outstanding
>> transfer can proceed/complete for the endpoint. If the TRB pointer is 0,
>> we just don't know which stream or endpoint ring it's for, but we know
>> all the outstanding URBs of an endpoint. We may as well return an
>> error status for all of them after a limited number of soft-retries.
> 
> We get the endpoint, but not the stream.

Right.

> 
> I guess we could walk through each stream of this endpoint, and return the 
> first URB of every stream that has a pending URB.
> xHCI spec claims to support 65533 streams per endpoint, but in real life
> UAS probably only uses a few per endpoint?
> 
> -Mathias 
> 

Typically, UASP devices advertise support for up to 32 streams. We have
noticed that some newer builds of Windows reject any device whose
descriptor advertises more or fewer than 32 streams (probably a bug,
though it may be intentional).

I think we only need to do this if we don't know which stream the event
belongs to. Otherwise, we can keep the old logic.
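
The idea discussed above could be sketched roughly as follows. This is a
user-space model only, not xhci driver code: every name in it
(sketch_ep, sketch_stream, SKETCH_MAX_SOFT_RETRY and so on) is
hypothetical, and the retry limit of 3 is an assumption. It follows the
variant where the first pending URB of each stream is given back once
the soft-retry budget is used up, with the counter kept per endpoint so
that it does not depend on knowing the stream.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* All names below are hypothetical; none of them are xhci driver symbols. */
#define SKETCH_MAX_SOFT_RETRY 3
#define SKETCH_NUM_STREAMS    4
#define SKETCH_QUEUE_DEPTH    2

struct sketch_urb {
	int status;    /* 0 while pending, negative errno once failed */
	int completed; /* nonzero once the URB has been given back */
};

struct sketch_stream {
	struct sketch_urb *pending[SKETCH_QUEUE_DEPTH]; /* FIFO, index 0 is oldest */
	size_t count;
};

struct sketch_ep {
	struct sketch_stream streams[SKETCH_NUM_STREAMS];
	unsigned int err_count; /* per endpoint, so no stream ID is needed */
};

/* Give back the oldest pending URB of one stream; returns 1 if there was one. */
static int sketch_fail_first_urb(struct sketch_stream *s, int status)
{
	size_t i;

	assert(s->count <= SKETCH_QUEUE_DEPTH);
	if (s->count == 0)
		return 0;

	s->pending[0]->status = status;
	s->pending[0]->completed = 1;

	/* Shift the remaining URBs forward in the queue. */
	for (i = 1; i < s->count; i++)
		s->pending[i - 1] = s->pending[i];
	s->count--;
	return 1;
}

/*
 * Transaction error whose event TRB pointer is 0 (stream unknown).
 * Returns -EAGAIN while another soft retry should be issued; once the
 * retry budget is exhausted, returns the number of URBs given back
 * with -EPROTO.
 */
int sketch_tx_error_unknown_stream(struct sketch_ep *ep)
{
	int given_back = 0;
	size_t i;

	if (ep->err_count < SKETCH_MAX_SOFT_RETRY) {
		ep->err_count++;
		return -EAGAIN;
	}

	ep->err_count = 0;
	for (i = 0; i < SKETCH_NUM_STREAMS; i++)
		given_back += sketch_fail_first_urb(&ep->streams[i], -EPROTO);
	return given_back;
}
```

When the stream is known, the existing per-ring err_count logic would
stay as it is; the model above only covers the unknown-stream case.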

BR,
Thinh


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-04-19  7:19 syzbot
  2021-04-19  7:27 ` Dmitry Vyukov
  2021-06-27 20:20 ` syzbot
@ 2021-09-04  7:55 ` syzbot
  2 siblings, 0 replies; 17+ messages in thread
From: syzbot @ 2021-09-04  7:55 UTC (permalink / raw)
  To: Guido.Kiener, Qiang.Zhang, Thinh.Nguyen, bp, dpenkler, dvyukov,
	dwmw, fgheet255t, fweisbec, gregkh, guido.kiener, hpa,
	john.stultz, lee.jones, linux-kernel, linux-usb, luto,
	mathias.nyman, mathias.nyman, mingo, mingo, qiang.zhang, sboyd,
	stable-commits, stable, stern, syzkaller-bugs, tglx,
	tonymarislogistics, x86

syzbot has found a reproducer for the following issue on:

HEAD commit:    7cca308cfdc0 Merge tag 'powerpc-5.15-1' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10535915300000
kernel config:  https://syzkaller.appspot.com/x/.config?x=9c582b69de20dde2
dashboard link: https://syzkaller.appspot.com/bug?extid=e2eae5639e7203360018
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=10e2e533300000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1294ff33300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	1-....: (288 ticks this GP) idle=bdd/1/0x4000000000000000 softirq=20305/20305 fqs=5241 
	(t=10500 jiffies g=18249 q=67)
NMI backtrace for cpu 1
CPU: 1 PID: 3254 Comm: aoe_tx0 Not tainted 5.14.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1ae/0x220 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:343
 print_cpu_stall kernel/rcu/tree_stall.h:627 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:711 [inline]
 rcu_pending kernel/rcu/tree.c:3880 [inline]
 rcu_sched_clock_irq.cold+0x9d/0x746 kernel/rcu/tree.c:2599
 update_process_times+0x16d/0x200 kernel/time/timer.c:1785
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1421
 __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
 __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:__sanitizer_cov_trace_pc+0x5c/0x60 kernel/kcov.c:207
Code: 82 18 15 00 00 83 f8 02 75 20 48 8b 8a 20 15 00 00 8b 92 1c 15 00 00 48 8b 01 48 83 c0 01 48 39 c2 76 07 48 89 34 c1 48 89 01 <c3> 0f 1f 00 41 55 41 54 49 89 fc 55 48 bd eb 83 b5 80 46 86 c8 61
RSP: 0018:ffffc90002ccfad8 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: ffff8880206e3900 RSI: ffffffff874e536f RDI: 0000000000000003
RBP: ffff88807df1b340 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffff874e5366 R11: 0000000000000000 R12: ffff88807df1b000
R13: dffffc0000000000 R14: ffff8880709ff490 R15: ffff88807df1b338
 __list_del_entry include/linux/list.h:132 [inline]
 list_move_tail include/linux/list.h:227 [inline]
 fq_codel_dequeue+0x7cf/0x1f50 net/sched/sch_fq_codel.c:299
 dequeue_skb net/sched/sch_generic.c:292 [inline]
 qdisc_restart net/sched/sch_generic.c:397 [inline]
 __qdisc_run+0x1ae/0x1700 net/sched/sch_generic.c:415
 __dev_xmit_skb net/core/dev.c:3861 [inline]
 __dev_queue_xmit+0x1f6e/0x3710 net/core/dev.c:4170
 tx+0x68/0xb0 drivers/block/aoe/aoenet.c:63
 kthread+0x1e7/0x3b0 drivers/block/aoe/aoecmd.c:1230
 kthread+0x3e5/0x4d0 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
----------------
Code disassembly (best guess), 1 bytes skipped:
   0:	18 15 00 00 83 f8    	sbb    %dl,-0x77d0000(%rip)        # 0xf8830006
   6:	02 75 20             	add    0x20(%rbp),%dh
   9:	48 8b 8a 20 15 00 00 	mov    0x1520(%rdx),%rcx
  10:	8b 92 1c 15 00 00    	mov    0x151c(%rdx),%edx
  16:	48 8b 01             	mov    (%rcx),%rax
  19:	48 83 c0 01          	add    $0x1,%rax
  1d:	48 39 c2             	cmp    %rax,%rdx
  20:	76 07                	jbe    0x29
  22:	48 89 34 c1          	mov    %rsi,(%rcx,%rax,8)
  26:	48 89 01             	mov    %rax,(%rcx)
* 29:	c3                   	retq <-- trapping instruction
  2a:	0f 1f 00             	nopl   (%rax)
  2d:	41 55                	push   %r13
  2f:	41 54                	push   %r12
  31:	49 89 fc             	mov    %rdi,%r12
  34:	55                   	push   %rbp
  35:	48 bd eb 83 b5 80 46 	movabs $0x61c8864680b583eb,%rbp
  3c:	86 c8 61


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-06-28  6:38   ` Zhang, Qiang
@ 2021-06-28 14:17     ` Alan Stern
  0 siblings, 0 replies; 17+ messages in thread
From: Alan Stern @ 2021-06-28 14:17 UTC (permalink / raw)
  To: Zhang, Qiang
  Cc: Dmitry Vyukov, syzbot, Greg Kroah-Hartman, guido.kiener,
	dpenkler, lee.jones, USB list, bp, dwmw, hpa, linux-kernel, luto,
	mingo, syzkaller-bugs, tglx, x86

On Mon, Jun 28, 2021 at 06:38:37AM +0000, Zhang, Qiang wrote:
> 
> 
> ________________________________________
> From: Dmitry Vyukov <dvyukov@google.com>
> Sent: Monday, 19 April 2021 15:27
> To: syzbot; Greg Kroah-Hartman; guido.kiener@rohde-schwarz.com; dpenkler@gmail.com; lee.jones@linaro.org; USB list
> Cc: bp@alien8.de; dwmw@amazon.co.uk; hpa@zytor.com; linux-kernel@vger.kernel.org; luto@kernel.org; mingo@redhat.com; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; x86@kernel.org
> Subject: Re: [syzbot] INFO: rcu detected stall in tx
> 
> [Please note: This e-mail is from an EXTERNAL e-mail address]
> 
> On Mon, Apr 19, 2021 at 9:19 AM syzbot
> <syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    50987bec Merge tag 'trace-v5.12-rc7' of git://git.kernel.o..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1065c5fcd00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=398c4d0fe6f66e68
> > dashboard link: https://syzkaller.appspot.com/bug?extid=e2eae5639e7203360018
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
> >
> > usbtmc 5-1:0.0: unknown status received: -71
> > usbtmc 3-1:0.0: unknown status received: -71
> > usbtmc 5-1:0.0: unknown status received: -71
> 
> >The log shows an infinite stream of these before the stall, so I
> >assume it's an infinite loop in usbtmc.
> >+usbtmc maintainers
> >
> >[  370.171634][    C0] usbtmc 6-1:0.0: unknown status received: -71
> >[  370.177799][    C1] usbtmc 3-1:0.0: unknown status received: -71

> It looks like a lot of time is spent in the following cycle: when the
> completion callback usbtmc_interrupt() sees an unknown status error, it
> submits the URB again, and the URB may be inserted back into urbp_list.
> Because dummy_timer() is called with bottom halves disabled, the RCU
> read-side critical section does not exit for a long time (note:
> bh_disable/enable and preempt_disable/enable regions are treated as RCU
> read-side critical sections), which prevents the rcu_preempt kthread
> from being scheduled and run.

> Should we return directly when the URB status is an unknown error?

Yes.

> diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class/usbtmc.c
> index 74d5a9c5238a..39d44339c03f 100644
> --- a/drivers/usb/class/usbtmc.c
> +++ b/drivers/usb/class/usbtmc.c
> @@ -2335,6 +2335,7 @@ static void usbtmc_interrupt(struct urb *urb)
>                 return;
>         default:
>                 dev_err(dev, "unknown status received: %d\n", status);
> +               return;
>         }
>  exit:
>         rv = usb_submit_urb(urb, GFP_ATOMIC);

This is the right thing to do.  In fact, you should also change the code 
above this.  There's no real need for special handling of the 
-ECONNRESET, -ENOENT, ..., -EPIPE codes, since the driver will do the 
same thing no matter what the code is.
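
As a rough illustration of that simplification (a user-space model, not
the usbtmc code; the enum and function names are hypothetical), the
decision in the completion handler reduces to "resubmit on success, stop
on any error":

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names; this only models the decision, not the driver. */
enum sketch_action {
	SKETCH_RESUBMIT, /* hand the URB back to the host controller */
	SKETCH_STOP      /* leave the URB idle; do not resubmit */
};

/* Decide what an interrupt-IN completion handler does with a finished URB. */
enum sketch_action sketch_completion_action(int status)
{
	/* An errno-style URB status is zero or negative. */
	assert(status <= 0);

	if (status == 0)
		return SKETCH_RESUBMIT;

	/*
	 * -ECONNRESET, -ENOENT, -EPIPE, -EPROTO (-71 on Linux, as in the
	 * log above) and everything else are handled the same way.
	 */
	return SKETCH_STOP;
}
```

With this shape there is no separate list of "known" error codes to keep
up to date; only the logging might still differ per code.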

Alan Stern

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-04-19  7:27 ` Dmitry Vyukov
@ 2021-06-28  6:38   ` Zhang, Qiang
  2021-06-28 14:17     ` Alan Stern
  0 siblings, 1 reply; 17+ messages in thread
From: Zhang, Qiang @ 2021-06-28  6:38 UTC (permalink / raw)
  To: Dmitry Vyukov, syzbot, Greg Kroah-Hartman, guido.kiener,
	dpenkler, lee.jones, USB list
  Cc: bp, dwmw, hpa, linux-kernel, luto, mingo, syzkaller-bugs, tglx, x86



________________________________________
From: Dmitry Vyukov <dvyukov@google.com>
Sent: Monday, 19 April 2021 15:27
To: syzbot; Greg Kroah-Hartman; guido.kiener@rohde-schwarz.com; dpenkler@gmail.com; lee.jones@linaro.org; USB list
Cc: bp@alien8.de; dwmw@amazon.co.uk; hpa@zytor.com; linux-kernel@vger.kernel.org; luto@kernel.org; mingo@redhat.com; syzkaller-bugs@googlegroups.com; tglx@linutronix.de; x86@kernel.org
Subject: Re: [syzbot] INFO: rcu detected stall in tx

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, Apr 19, 2021 at 9:19 AM syzbot
<syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    50987bec Merge tag 'trace-v5.12-rc7' of git://git.kernel.o..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1065c5fcd00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=398c4d0fe6f66e68
> dashboard link: https://syzkaller.appspot.com/bug?extid=e2eae5639e7203360018
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
>
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71

>The log shows an infinite stream of these before the stall, so I
>assume it's an infinite loop in usbtmc.
>+usbtmc maintainers
>
>[  370.171634][    C0] usbtmc 6-1:0.0: unknown status received: -71
>[  370.177799][    C1] usbtmc 3-1:0.0: unknown status received: -71
>[  370.183912][    C0] usbtmc 4-1:0.0: unknown status received: -71
>[  370.190076][    C1] usbtmc 5-1:0.0: unknown status received: -71
>[  370.196194][    C0] usbtmc 2-1:0.0: unknown status received: -71
>[  370.202387][    C1] usbtmc 3-1:0.0: unknown status received: -71
>[  370.208460][    C0] usbtmc 6-1:0.0: unknown status received: -71
>[  370.214615][    C1] usbtmc 5-1:0.0: unknown status received: -71
>[  370.220736][    C0] usbtmc 4-1:0.0: unknown status received: -71
>[  370.226902][    C1] usbtmc 3-1:0.0: unknown status received: -71
>[  370.233005][    C0] usbtmc 2-1:0.0: unknown status received: -71
>[  370.239168][    C1] usbtmc 5-1:0.0: unknown status received: -71
>[  370.245271][    C0] usbtmc 6-1:0.0: unknown status received: -71
>[  370.251426][    C1] usbtmc 3-1:0.0: unknown status received: -71
>[  370.257552][    C0] usbtmc 4-1:0.0: unknown status received: -71
>[  370.263715][    C1] usbtmc 5-1:0.0: unknown status received: -71
>[  370.269819][    C0] usbtmc 2-1:0.0: unknown status received: -71
>[  370.275974][    C1] usbtmc 3-1:0.0: unknown status received: -71
>[  370.282100][    C0] usbtmc 6-1:0.0: unknown status received: -71
>[  370.288262][    C1] usbtmc 5-1:0.0: unknown status received: -71
>[  370.294399][    C0] usbtmc 4-1:0.0: unknown status received: -71



It looks like a lot of time is spent in the following cycle: when the
completion callback usbtmc_interrupt() sees an unknown status error, it
submits the URB again, and the URB may be inserted back into urbp_list.
Because dummy_timer() is called with bottom halves disabled, the RCU
read-side critical section does not exit for a long time (note:
bh_disable/enable and preempt_disable/enable regions are treated as RCU
read-side critical sections), which prevents the rcu_preempt kthread
from being scheduled and run.

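dummy_timer()
{
restart:
        list_for_each_entry_safe(urbp, tmp, &dum_hcd->urbp_list, urbp_list) {
                .........
                ep = find_endpoint(dum, address);
                if (!ep) {
                        /* no such endpoint: fail the URB with -EPROTO */
                        status = -EPROTO;
                        goto return_urb;
                }
                ............
return_urb:
                /* runs the completion handler, which may resubmit the URB */
                usb_hcd_giveback_urb();
                /* the list may have changed, so walk it again from the start */
                goto restart;
        }
}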

Should we return directly when the URB status is an unknown error?

diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class/usbtmc.c
index 74d5a9c5238a..39d44339c03f 100644
--- a/drivers/usb/class/usbtmc.c
+++ b/drivers/usb/class/usbtmc.c
@@ -2335,6 +2335,7 @@ static void usbtmc_interrupt(struct urb *urb)
                return;
        default:
                dev_err(dev, "unknown status received: %d\n", status);
+               return;
        }
 exit:
        rv = usb_submit_urb(urb, GFP_ATOMIC);
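
To make the cycle described above concrete, here is a small user-space
model of it. It is only a sketch under simplifying assumptions: a
one-entry list stands in for urbp_list, resubmission always succeeds and
puts the URB back on the list within the same timer pass, and the
endpoint lookup always fails with -EPROTO. All names (sketch_hcd,
sketch_run_timer, SKETCH_ITERATION_CAP) are hypothetical, and the
iteration cap exists only so that the model terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_ITERATION_CAP 1000 /* hypothetical guard so the model terminates */

struct sketch_hcd {
	bool urb_queued;        /* models a one-entry urbp_list */
	bool resubmit_on_error; /* true: old handler, false: patched handler */
};

/* Models the class driver's completion handler. */
static void sketch_complete(struct sketch_hcd *hcd, int status)
{
	/* Resubmission puts the URB straight back on the list. */
	if (status == 0 || hcd->resubmit_on_error)
		hcd->urb_queued = true;
}

/* Models one timer pass; returns the number of givebacks performed. */
int sketch_run_timer(struct sketch_hcd *hcd)
{
	int givebacks = 0;

	while (hcd->urb_queued && givebacks < SKETCH_ITERATION_CAP) {
		hcd->urb_queued = false;       /* unlink the URB */
		givebacks++;
		sketch_complete(hcd, -EPROTO); /* endpoint not found */
	}

	assert(givebacks <= SKETCH_ITERATION_CAP);
	return givebacks;
}
```

With resubmit_on_error set, the pass only ends because of the artificial
cap; with it cleared, which corresponds to the added return, the pass
ends after a single giveback.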

Thanks
Qiang

> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu:    1-...!: (8580 ticks this GP) idle=72e/1/0x4000000000000000 softirq=20679/20679 fqs=0
>         (t=10500 jiffies g=27129 q=416)
> rcu: rcu_preempt kthread starved for 10500 jiffies! g27129 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> rcu:    Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> rcu: RCU grace-period kthread stack dump:
> task:rcu_preempt     state:R  running task     stack:29168 pid:   14 ppid:     2 flags:0x00004000
> Call Trace:
>  context_switch kernel/sched/core.c:4322 [inline]
>  __schedule+0x911/0x21b0 kernel/sched/core.c:5073
>  schedule+0xcf/0x270 kernel/sched/core.c:5152
>  schedule_timeout+0x14a/0x250 kernel/time/timer.c:1892
>  rcu_gp_fqs_loop kernel/rcu/tree.c:2005 [inline]
>  rcu_gp_kthread+0xd07/0x2250 kernel/rcu/tree.c:2178
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> rcu: Stack dump where RCU GP kthread last ran:
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0
> CPU: 0 PID: 3232 Comm: aoe_tx0 Not tainted 5.12.0-rc7-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:native_apic_mem_write+0x8/0x10 arch/x86/include/asm/apic.h:110
> Code: c7 40 d9 36 8f e8 c8 11 86 00 eb b0 66 0f 1f 44 00 00 be 01 00 00 00 e9 36 c7 2c 00 cc cc cc cc cc cc 89 ff 89 b7 00 c0 5f ff <c3> 0f 1f 80 00 00 00 00 48 b8 00 00 00 00 00 fc ff df 53 89 fb 48
> RSP: 0018:ffffc90000007ea8 EFLAGS: 00000046
> RAX: dffffc0000000000 RBX: ffffffff8b0a78c0 RCX: 0000000000000020
> RDX: 1ffffffff1614f1a RSI: 000000000001c285 RDI: 0000000000000380
> RBP: ffff8880b9c1f2c0 R08: 000000000000003f R09: 0000000000000000
> R10: ffffffff8166ecf7 R11: 0000000000000000 R12: 000000000001c285
> R13: 0000000000000020 R14: ffff8880b9c26340 R15: 0000006120792e26
> FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fb9e6cdb380 CR3: 0000000018792000 CR4: 00000000001506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <IRQ>
>  apic_write arch/x86/include/asm/apic.h:393 [inline]
>  lapic_next_event+0x4d/0x80 arch/x86/kernel/apic/apic.c:472
>  clockevents_program_event+0x254/0x370 kernel/time/clockevents.c:334
>  tick_program_event+0xac/0x140 kernel/time/tick-oneshot.c:44
>  hrtimer_interrupt+0x414/0xa00 kernel/time/hrtimer.c:1676
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
>  __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1106
>  sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1100
>  </IRQ>
>  asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
> RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:27 [inline]
> RIP: 0010:check_kcov_mode kernel/kcov.c:163 [inline]
> RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x60 kernel/kcov.c:197
> Code: f0 4d 89 03 e9 f2 fc ff ff b9 ff ff ff ff ba 08 00 00 00 4d 8b 03 48 0f bd ca 49 8b 45 00 48 63 c9 e9 64 ff ff ff 0f 1f 40 00 <65> 8b 05 39 fe 8d 7e 89 c1 48 8b 34 24 81 e1 00 01 00 00 65 48 8b
> RSP: 0018:ffffc900030cf6f8 EFLAGS: 00000293
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff88801aff1c40 RSI: ffffffff815c2e4f RDI: 0000000000000003
> RBP: ffffc900030cf738 R08: 0000000000000000 R09: ffffffff8fa9a96f
> R10: ffffffff815c2e45 R11: 0000000000000000 R12: 000000000000002d
> R13: ffff8880113db880 R14: 0000000000000000 R15: 0000000000000200
>  console_trylock_spinning kernel/printk/printk.c:1818 [inline]
>  vprintk_emit+0x3a5/0x560 kernel/printk/printk.c:2097
>  dev_vprintk_emit+0x36e/0x3b2 drivers/base/core.c:4434
>  dev_printk_emit+0xba/0xf1 drivers/base/core.c:4445
>  __netdev_printk+0x1c6/0x27a net/core/dev.c:11292
>  netdev_warn+0xd7/0x109 net/core/dev.c:11345
>  ieee802154_subif_start_xmit.cold+0x17/0x27 net/mac802154/tx.c:125
>  __netdev_start_xmit include/linux/netdevice.h:4825 [inline]
>  netdev_start_xmit include/linux/netdevice.h:4839 [inline]
>  xmit_one net/core/dev.c:3605 [inline]
>  dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3621
>  sch_direct_xmit+0x2e1/0xbd0 net/sched/sch_generic.c:313
>  qdisc_restart net/sched/sch_generic.c:376 [inline]
>  __qdisc_run+0x4ba/0x15f0 net/sched/sch_generic.c:384
>  qdisc_run include/net/pkt_sched.h:136 [inline]
>  qdisc_run include/net/pkt_sched.h:128 [inline]
>  __dev_xmit_skb net/core/dev.c:3807 [inline]
>  __dev_queue_xmit+0x14b9/0x2e00 net/core/dev.c:4162
>  tx+0x68/0xb0 drivers/block/aoe/aoenet.c:63
>  kthread+0x1e7/0x3a0 drivers/block/aoe/aoecmd.c:1230
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> NMI backtrace for cpu 1
> CPU: 1 PID: 37 Comm: kworker/1:1 Not tainted 5.12.0-rc7-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Workqueue: events nsim_dev_trap_report_work
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
>  nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
>  nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
>  trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
>  rcu_dump_cpu_stacks+0x222/0x2a7 kernel/rcu/tree_stall.h:341
>  print_cpu_stall kernel/rcu/tree_stall.h:622 [inline]
>  check_cpu_stall kernel/rcu/tree_stall.h:697 [inline]
>  rcu_pending kernel/rcu/tree.c:3830 [inline]
>  rcu_sched_clock_irq.cold+0x4f7/0x11dd kernel/rcu/tree.c:2650
>  update_process_times+0x16d/0x200 kernel/time/timer.c:1796
>  tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
>  tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1369
>  __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
>  __hrtimer_run_queues+0x1c0/0xe40 kernel/time/hrtimer.c:1601
>  hrtimer_interrupt+0x330/0xa00 kernel/time/hrtimer.c:1663
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
>  __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1106
>  sysvec_apic_timer_interrupt+0x40/0xc0 arch/x86/kernel/apic/apic.c:1100
>  asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
> RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
> RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70 kernel/locking/spinlock.c:191
> Code: 74 24 10 e8 ba 19 54 f8 48 89 ef e8 f2 cf 54 f8 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> d3 9d 48 f8 65 8b 05 7c 68 fc 76 85 c0 74 0a 5b 5d c3 e8 40 59
> RSP: 0018:ffffc90000dc0b28 EFLAGS: 00000206
> RAX: 0000000000000002 RBX: 0000000000000200 RCX: 1ffffffff1f5f34a
> RDX: 0000000000000000 RSI: 0000000000000103 RDI: 0000000000000001
> RBP: ffff888144fa8000 R08: 0000000000000001 R09: ffffffff8fa9a99f
> R10: 0000000000000001 R11: ffffc90013880000 R12: ffff888145047440
> R13: ffff88801ee8e500 R14: dffffc0000000000 R15: ffff888011f69c00
>  spin_unlock_irqrestore include/linux/spinlock.h:409 [inline]
>  dummy_timer+0x12f1/0x32a0 drivers/usb/gadget/udc/dummy_hcd.c:1985
>  call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1431
>  expire_timers kernel/time/timer.c:1476 [inline]
>  __run_timers.part.0+0x67c/0xa50 kernel/time/timer.c:1745
>  __run_timers kernel/time/timer.c:1726 [inline]
>  run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1758
>  __do_softirq+0x29b/0x9f6 kernel/softirq.c:345
>  do_softirq.part.0+0xd9/0x130 kernel/softirq.c:248
>  </IRQ>
>  do_softirq kernel/softirq.c:240 [inline]
>  __local_bh_enable_ip+0x102/0x120 kernel/softirq.c:198
>  spin_unlock_bh include/linux/spinlock.h:399 [inline]
>  nsim_dev_trap_report drivers/net/netdevsim/dev.c:585 [inline]
>  nsim_dev_trap_report_work+0x867/0xbd0 drivers/net/netdevsim/dev.c:611
>  process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 2-1:0.0: unknown status received: -71
> usbtmc 4-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: usb_submit_urb failed: -19
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: usb_submit_urb failed: -19
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/000000000000a9b79905c04e25a0%40google.com.

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [syzbot] INFO: rcu detected stall in tx
  2021-04-19  7:19 syzbot
  2021-04-19  7:27 ` Dmitry Vyukov
@ 2021-06-27 20:20 ` syzbot
  2021-09-04  7:55 ` syzbot
  2 siblings, 0 replies; 17+ messages in thread
From: syzbot @ 2021-06-27 20:20 UTC (permalink / raw)
  To: Guido.Kiener, Thinh.Nguyen, bp, dpenkler, dvyukov, dwmw, gregkh,
	guido.kiener, hpa, john.stultz, lee.jones, linux-kernel,
	linux-usb, luto, mathias.nyman, mathias.nyman, mingo, sboyd,
	stern, syzkaller-bugs, tglx, x86

syzbot has found a reproducer for the following issue on:

HEAD commit:    625acffd Merge tag 's390-5.13-5' of git://git.kernel.org/p..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=101a20fbd00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=279de9012e194ee1
dashboard link: https://syzkaller.appspot.com/bug?extid=e2eae5639e7203360018
compiler:       Debian clang version 11.0.1-2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17215fb8300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com

 #3: ffffffff8d5c8818 (kbd_event_lock){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
 #3: ffffffff8d5c8818 (kbd_event_lock){..-.}-{2:2}, at: kbd_event+0x97/0x3c00 drivers/tty/vt/keyboard.c:1525
 #4: ffffffff8cf15d00 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x0/0x30 arch/x86/pci/mmconfig_64.c:151
=============================================
keytouch 0003:0926:3333.00B5: can't resubmit intr, dummy_hcd.4-1/input0, status -19
keytouch 0003:0926:3333.00B5: usb_submit_urb(ctrl) failed: -19
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000 softirq=25390/25392 fqs=3 
	(t=12164 jiffies g=31645 q=43226)
rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:26384 pid:   14 ppid:     2 flags:0x00004000
Call Trace:
 context_switch kernel/sched/core.c:4339 [inline]
 __schedule+0xb98/0x1120 kernel/sched/core.c:5147
 schedule+0x14b/0x200 kernel/sched/core.c:5226
 schedule_timeout+0x1aa/0x2c0 kernel/time/timer.c:1892
 rcu_gp_fqs_loop kernel/rcu/tree.c:2004 [inline]
 rcu_gp_kthread+0x112d/0x2190 kernel/rcu/tree.c:2177
 kthread+0x39a/0x3c0 kernel/kthread.c:313
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 3234 Comm: aoe_tx0 Not tainted 5.13.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:mark_lock+0x208/0x1eb0 kernel/locking/lockdep.c:4501
Code: ff 8f 49 83 c6 50 4c 89 f0 48 c1 e8 03 42 80 3c 38 00 74 08 4c 89 f7 e8 56 5d 65 00 bb 01 00 00 00 45 85 2e 0f 84 b0 00 00 00 <48> c7 44 24 60 0e 36 e0 45 43 c7 04 27 00 00 00 00 4b c7 44 27 14
RSP: 0018:ffffc90000007580 EFLAGS: 00000002
RAX: 1ffffffff1fff3c2 RBX: 0000000000000001 RCX: ffffffff8161dad9
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffffff9026fd88
RBP: ffffc90000007810 R08: dffffc0000000000 R09: fffffbfff204dfb2
R10: fffffbfff204dfb2 R11: 0000000000000000 R12: 1ffff92000000ebc
R13: 0000000000000002 R14: ffffffff8fff9e10 R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f00730d2b00 CR3: 00000000141fd000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 mark_usage kernel/locking/lockdep.c:4365 [inline]
 __lock_acquire+0xb66/0x6040 kernel/locking/lockdep.c:4858
 lock_acquire+0x182/0x4a0 kernel/locking/lockdep.c:5514
 seqcount_lockdep_reader_access+0xe5/0x200 include/linux/seqlock.h:103
 ktime_get+0x35/0x2b0 kernel/time/timekeeping.c:827
 clockevents_program_event+0xe4/0x320 kernel/time/clockevents.c:326
 hrtimer_interrupt+0xbaa/0x1040 kernel/time/hrtimer.c:1676
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
 __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1106
 sysvec_apic_timer_interrupt+0x8c/0xb0 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:647
RIP: 0010:console_trylock_spinning+0x31b/0x3a0 kernel/printk/printk.c:1894
Code: 08 4d 85 ed 74 91 e8 94 c2 19 00 fb 31 db eb 41 e8 8a c2 19 00 e8 95 74 5b 08 4d 85 ed 74 d1 e8 7b c2 19 00 fb bb 01 00 00 00 <48> c7 c7 00 22 df 8c 31 f6 ba 01 00 00 00 31 c9 41 b8 01 00 00 00
RSP: 0018:ffffc9000288f360 EFLAGS: 00000293
RAX: ffffffff81655005 RBX: 0000000000000001 RCX: ffff888021699c40
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc9000288f430 R08: ffffffff81654fc2 R09: fffffbfff204dfcb
R10: fffffbfff204dfcb R11: 0000000000000000 R12: 1ffff92000511e6c
R13: 0000000000000200 R14: 0000000000000086 R15: dffffc0000000000
 vprintk_emit+0x201/0x2f0 kernel/printk/printk.c:2173
 dev_vprintk_emit+0x2e1/0x355 drivers/base/core.c:4525
 dev_printk_emit+0xca/0x109 drivers/base/core.c:4536
 __netdev_printk+0x339/0x419 net/core/dev.c:11392
 netdev_warn+0x110/0x158 net/core/dev.c:11445
 ieee802154_subif_start_xmit+0xbd/0x100 net/mac802154/tx.c:125
 __netdev_start_xmit include/linux/netdevice.h:4944 [inline]
 netdev_start_xmit include/linux/netdevice.h:4958 [inline]
 xmit_one net/core/dev.c:3654 [inline]
 dev_hard_start_xmit+0x20b/0x450 net/core/dev.c:3670
 sch_direct_xmit+0x2be/0xec0 net/sched/sch_generic.c:336
 qdisc_restart net/sched/sch_generic.c:401 [inline]
 __qdisc_run+0xa43/0x1c00 net/sched/sch_generic.c:409
 qdisc_run include/net/pkt_sched.h:131 [inline]
 __dev_xmit_skb net/core/dev.c:3857 [inline]
 __dev_queue_xmit+0xedd/0x2fe0 net/core/dev.c:4214
 tx+0x6f/0x110 drivers/block/aoe/aoenet.c:63
 kthread+0x22d/0x440 drivers/block/aoe/aoecmd.c:1230
 kthread+0x39a/0x3c0 kernel/kthread.c:313
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
NMI backtrace for cpu 1
CPU: 1 PID: 17756 Comm: systemd-udevd Not tainted 5.13.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x202/0x31e lib/dump_stack.c:120
 nmi_cpu_backtrace+0x16c/0x190 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x191/0x2f0 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x22d/0x390 kernel/rcu/tree_stall.h:341
 print_cpu_stall kernel/rcu/tree_stall.h:624 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:699 [inline]
 rcu_pending kernel/rcu/tree.c:3911 [inline]
 rcu_sched_clock_irq+0x1d0d/0x2a30 kernel/rcu/tree.c:2649
 update_process_times+0x197/0x200 kernel/time/timer.c:1796
 tick_sched_handle kernel/time/tick-sched.c:226 [inline]
 tick_sched_timer+0x27d/0x420 kernel/time/tick-sched.c:1374
 __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
 __hrtimer_run_queues+0x4cb/0xa60 kernel/time/hrtimer.c:1601
 hrtimer_interrupt+0x3b3/0x1040 kernel/time/hrtimer.c:1663
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
 __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1106
 sysvec_apic_timer_interrupt+0x3e/0xb0 arch/x86/kernel/apic/apic.c:1100
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:647
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xbc/0x120 kernel/locking/spinlock.c:191
Code: f0 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 f7 e8 ba ad 03 f8 f6 44 24 21 02 75 4e 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> ff 62 93 f7 65 8b 05 90 64 3e 76 85 c0 74 3f 48 c7 04 24 0e 36
RSP: 0018:ffffc90000dc0800 EFLAGS: 00000206
RAX: 1ffff920001b8104 RBX: ffff888022268000 RCX: ffffffff8161dad9
RDX: dffffc0000000000 RSI: 0000000000000102 RDI: 0000000000000001
RBP: ffffc90000dc0890 R08: dffffc0000000000 R09: fffffbfff204dfce
R10: fffffbfff204dfce R11: 0000000000000000 R12: dffffc0000000000
R13: 1ffff920001b8100 R14: ffffc90000dc0820 R15: 0000000000000a06
 dummy_timer+0x3002/0x3100 drivers/usb/gadget/udc/dummy_hcd.c:1987
 call_timer_fn+0xf6/0x210 kernel/time/timer.c:1431
 expire_timers kernel/time/timer.c:1476 [inline]
 __run_timers+0x6ff/0x910 kernel/time/timer.c:1745
 run_timer_softirq+0x63/0xf0 kernel/time/timer.c:1758
 __do_softirq+0x372/0x7a6 kernel/softirq.c:559
 invoke_softirq kernel/softirq.c:433 [inline]
 __irq_exit_rcu+0x245/0x280 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:647
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xbc/0x120 kernel/locking/spinlock.c:191
Code: f0 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 f7 e8 ba ad 03 f8 f6 44 24 21 02 75 4e 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> ff 62 93 f7 65 8b 05 90 64 3e 76 85 c0 74 3f 48 c7 04 24 0e 36
RSP: 0018:ffffc9000945f7e0 EFLAGS: 00000206
RAX: 1ffff9200128bf00 RBX: ffffffff911be368 RCX: ffffffff90e87703
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: 0000000000000001
RBP: ffffc9000945f870 R08: ffffffff81856800 R09: fffffbfff2237c6e
R10: fffffbfff2237c6e R11: 0000000000000000 R12: dffffc0000000000
R13: 1ffff9200128befc R14: ffffc9000945f800 R15: 0000000000000a06
 __debug_check_no_obj_freed lib/debugobjects.c:997 [inline]
 debug_check_no_obj_freed+0x5a2/0x650 lib/debugobjects.c:1018
 slab_free_hook mm/slub.c:1558 [inline]
 slab_free_freelist_hook+0x161/0x290 mm/slub.c:1608
 slab_free mm/slub.c:3168 [inline]
 kmem_cache_free+0x85/0x170 mm/slub.c:3184
 anon_vma_chain_free mm/rmap.c:141 [inline]
 unlink_anon_vmas+0x58b/0x600 mm/rmap.c:439
 free_pgtables+0x7f/0x2d0 mm/memory.c:413
 exit_mmap+0x2be/0x5f0 mm/mmap.c:3209
 __mmput+0x111/0x370 kernel/fork.c:1096
 exit_mm+0x67e/0x7d0 kernel/exit.c:502
 do_exit+0x6b9/0x23d0 kernel/exit.c:813
 do_group_exit+0x168/0x2d0 kernel/exit.c:923
 __do_sys_exit_group+0x13/0x20 kernel/exit.c:934
 __se_sys_exit_group+0x10/0x10 kernel/exit.c:932
 __x64_sys_exit_group+0x37/0x40 kernel/exit.c:932
 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f0072df1618
Code: Unable to access opcode bytes at RIP 0x7f0072df15ee.
RSP: 002b:00007ffc0fb77be8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007ffc0fb77cb0 RCX: 00007f0072df1618
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00007ffc0fb77d60 R08: 00000000000000e7 R09: fffffffffffffe50
R10: 00000000ffffffff R11: 0000000000000202 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000003 R15: 000000000000000e
keytouch 0003:0926:3333.00BB: usb_submit_urb(ctrl) failed: -19
keytouch 0003:0926:3333.00BC: usb_submit_urb(ctrl) failed: -19



* Re: [syzbot] INFO: rcu detected stall in tx
  2021-04-19  7:19 syzbot
@ 2021-04-19  7:27 ` Dmitry Vyukov
  2021-06-28  6:38   ` Zhang, Qiang
  2021-06-27 20:20 ` syzbot
  2021-09-04  7:55 ` syzbot
  2 siblings, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2021-04-19  7:27 UTC (permalink / raw)
  To: syzbot, Greg Kroah-Hartman, guido.kiener, dpenkler, lee.jones, USB list
  Cc: bp, dwmw, hpa, linux-kernel, luto, mingo, syzkaller-bugs, tglx, x86

On Mon, Apr 19, 2021 at 9:19 AM syzbot
<syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    50987bec Merge tag 'trace-v5.12-rc7' of git://git.kernel.o..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1065c5fcd00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=398c4d0fe6f66e68
> dashboard link: https://syzkaller.appspot.com/bug?extid=e2eae5639e7203360018
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com
>
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71

The log shows an infinite stream of these before the stall, so I
assume it's an infinite loop in usbtmc.
+usbtmc maintainers

[  370.171634][    C0] usbtmc 6-1:0.0: unknown status received: -71
[  370.177799][    C1] usbtmc 3-1:0.0: unknown status received: -71
[  370.183912][    C0] usbtmc 4-1:0.0: unknown status received: -71
[  370.190076][    C1] usbtmc 5-1:0.0: unknown status received: -71
[  370.196194][    C0] usbtmc 2-1:0.0: unknown status received: -71
[  370.202387][    C1] usbtmc 3-1:0.0: unknown status received: -71
[  370.208460][    C0] usbtmc 6-1:0.0: unknown status received: -71
[  370.214615][    C1] usbtmc 5-1:0.0: unknown status received: -71
[  370.220736][    C0] usbtmc 4-1:0.0: unknown status received: -71
[  370.226902][    C1] usbtmc 3-1:0.0: unknown status received: -71
[  370.233005][    C0] usbtmc 2-1:0.0: unknown status received: -71
[  370.239168][    C1] usbtmc 5-1:0.0: unknown status received: -71
[  370.245271][    C0] usbtmc 6-1:0.0: unknown status received: -71
[  370.251426][    C1] usbtmc 3-1:0.0: unknown status received: -71
[  370.257552][    C0] usbtmc 4-1:0.0: unknown status received: -71
[  370.263715][    C1] usbtmc 5-1:0.0: unknown status received: -71
[  370.269819][    C0] usbtmc 2-1:0.0: unknown status received: -71
[  370.275974][    C1] usbtmc 3-1:0.0: unknown status received: -71
[  370.282100][    C0] usbtmc 6-1:0.0: unknown status received: -71
[  370.288262][    C1] usbtmc 5-1:0.0: unknown status received: -71
[  370.294399][    C0] usbtmc 4-1:0.0: unknown status received: -71
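
For readers following the thread: -71 is -EPROTO and -19 is -ENODEV on Linux. The suspected loop can be illustrated with a minimal userspace sketch. It is not the usbtmc code, which this thread does not show. `struct fake_urb`, `fake_complete()` and `run_urb()` are hypothetical names, and the `stop_on_eproto` switch is only an assumed remedy used for comparison. The idea is that a completion handler which treats -EPROTO as an "unknown status" and resubmits will run again immediately if the device keeps failing the same way.

```c
#include <errno.h>

/* Hypothetical stand-in for a URB; not the kernel's struct urb. */
struct fake_urb {
	int status;       /* status handed to the completion handler */
	int completions;  /* how many times the handler has run */
};

/*
 * Hypothetical completion handler. Returns 1 if the URB should be
 * resubmitted, 0 if the handler gives up. With stop_on_eproto == 0,
 * -EPROTO takes the "unknown status" path and is resubmitted.
 */
static int fake_complete(struct fake_urb *urb, int stop_on_eproto)
{
	urb->completions++;
	switch (urb->status) {
	case 0:
		return 1;	/* normal completion: keep polling */
	case -ECONNRESET:
	case -ENOENT:
	case -ESHUTDOWN:
		return 0;	/* URB was killed: do not resubmit */
	case -EPROTO:
		if (stop_on_eproto)
			return 0;	/* assumed remedy: treat as fatal */
		/* fall through */
	default:
		return 1;	/* "unknown status received": resubmit anyway */
	}
}

/*
 * Simulate a device that keeps failing with the same status, so each
 * resubmission completes at once with the same error. max_rounds only
 * bounds the simulation; a real loop would have no such limit.
 */
static int run_urb(int status, int stop_on_eproto, int max_rounds)
{
	struct fake_urb urb = { .status = status, .completions = 0 };

	while (urb.completions < max_rounds &&
	       fake_complete(&urb, stop_on_eproto))
		;
	return urb.completions;
}
```

In this sketch, run_urb(-EPROTO, 0, 1000) uses up all 1000 rounds, while run_urb(-EPROTO, 1, 1000) stops after the first completion. Whether stopping on -EPROTO is the right behaviour for the real driver is a separate question for the maintainers.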



> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu:    1-...!: (8580 ticks this GP) idle=72e/1/0x4000000000000000 softirq=20679/20679 fqs=0
>         (t=10500 jiffies g=27129 q=416)
> rcu: rcu_preempt kthread starved for 10500 jiffies! g27129 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> rcu:    Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> rcu: RCU grace-period kthread stack dump:
> task:rcu_preempt     state:R  running task     stack:29168 pid:   14 ppid:     2 flags:0x00004000
> Call Trace:
>  context_switch kernel/sched/core.c:4322 [inline]
>  __schedule+0x911/0x21b0 kernel/sched/core.c:5073
>  schedule+0xcf/0x270 kernel/sched/core.c:5152
>  schedule_timeout+0x14a/0x250 kernel/time/timer.c:1892
>  rcu_gp_fqs_loop kernel/rcu/tree.c:2005 [inline]
>  rcu_gp_kthread+0xd07/0x2250 kernel/rcu/tree.c:2178
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> rcu: Stack dump where RCU GP kthread last ran:
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0
> CPU: 0 PID: 3232 Comm: aoe_tx0 Not tainted 5.12.0-rc7-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:native_apic_mem_write+0x8/0x10 arch/x86/include/asm/apic.h:110
> Code: c7 40 d9 36 8f e8 c8 11 86 00 eb b0 66 0f 1f 44 00 00 be 01 00 00 00 e9 36 c7 2c 00 cc cc cc cc cc cc 89 ff 89 b7 00 c0 5f ff <c3> 0f 1f 80 00 00 00 00 48 b8 00 00 00 00 00 fc ff df 53 89 fb 48
> RSP: 0018:ffffc90000007ea8 EFLAGS: 00000046
> RAX: dffffc0000000000 RBX: ffffffff8b0a78c0 RCX: 0000000000000020
> RDX: 1ffffffff1614f1a RSI: 000000000001c285 RDI: 0000000000000380
> RBP: ffff8880b9c1f2c0 R08: 000000000000003f R09: 0000000000000000
> R10: ffffffff8166ecf7 R11: 0000000000000000 R12: 000000000001c285
> R13: 0000000000000020 R14: ffff8880b9c26340 R15: 0000006120792e26
> FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fb9e6cdb380 CR3: 0000000018792000 CR4: 00000000001506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <IRQ>
>  apic_write arch/x86/include/asm/apic.h:393 [inline]
>  lapic_next_event+0x4d/0x80 arch/x86/kernel/apic/apic.c:472
>  clockevents_program_event+0x254/0x370 kernel/time/clockevents.c:334
>  tick_program_event+0xac/0x140 kernel/time/tick-oneshot.c:44
>  hrtimer_interrupt+0x414/0xa00 kernel/time/hrtimer.c:1676
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
>  __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1106
>  sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1100
>  </IRQ>
>  asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
> RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:27 [inline]
> RIP: 0010:check_kcov_mode kernel/kcov.c:163 [inline]
> RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x60 kernel/kcov.c:197
> Code: f0 4d 89 03 e9 f2 fc ff ff b9 ff ff ff ff ba 08 00 00 00 4d 8b 03 48 0f bd ca 49 8b 45 00 48 63 c9 e9 64 ff ff ff 0f 1f 40 00 <65> 8b 05 39 fe 8d 7e 89 c1 48 8b 34 24 81 e1 00 01 00 00 65 48 8b
> RSP: 0018:ffffc900030cf6f8 EFLAGS: 00000293
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff88801aff1c40 RSI: ffffffff815c2e4f RDI: 0000000000000003
> RBP: ffffc900030cf738 R08: 0000000000000000 R09: ffffffff8fa9a96f
> R10: ffffffff815c2e45 R11: 0000000000000000 R12: 000000000000002d
> R13: ffff8880113db880 R14: 0000000000000000 R15: 0000000000000200
>  console_trylock_spinning kernel/printk/printk.c:1818 [inline]
>  vprintk_emit+0x3a5/0x560 kernel/printk/printk.c:2097
>  dev_vprintk_emit+0x36e/0x3b2 drivers/base/core.c:4434
>  dev_printk_emit+0xba/0xf1 drivers/base/core.c:4445
>  __netdev_printk+0x1c6/0x27a net/core/dev.c:11292
>  netdev_warn+0xd7/0x109 net/core/dev.c:11345
>  ieee802154_subif_start_xmit.cold+0x17/0x27 net/mac802154/tx.c:125
>  __netdev_start_xmit include/linux/netdevice.h:4825 [inline]
>  netdev_start_xmit include/linux/netdevice.h:4839 [inline]
>  xmit_one net/core/dev.c:3605 [inline]
>  dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3621
>  sch_direct_xmit+0x2e1/0xbd0 net/sched/sch_generic.c:313
>  qdisc_restart net/sched/sch_generic.c:376 [inline]
>  __qdisc_run+0x4ba/0x15f0 net/sched/sch_generic.c:384
>  qdisc_run include/net/pkt_sched.h:136 [inline]
>  qdisc_run include/net/pkt_sched.h:128 [inline]
>  __dev_xmit_skb net/core/dev.c:3807 [inline]
>  __dev_queue_xmit+0x14b9/0x2e00 net/core/dev.c:4162
>  tx+0x68/0xb0 drivers/block/aoe/aoenet.c:63
>  kthread+0x1e7/0x3a0 drivers/block/aoe/aoecmd.c:1230
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> NMI backtrace for cpu 1
> CPU: 1 PID: 37 Comm: kworker/1:1 Not tainted 5.12.0-rc7-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Workqueue: events nsim_dev_trap_report_work
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
>  nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
>  nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
>  trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
>  rcu_dump_cpu_stacks+0x222/0x2a7 kernel/rcu/tree_stall.h:341
>  print_cpu_stall kernel/rcu/tree_stall.h:622 [inline]
>  check_cpu_stall kernel/rcu/tree_stall.h:697 [inline]
>  rcu_pending kernel/rcu/tree.c:3830 [inline]
>  rcu_sched_clock_irq.cold+0x4f7/0x11dd kernel/rcu/tree.c:2650
>  update_process_times+0x16d/0x200 kernel/time/timer.c:1796
>  tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
>  tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1369
>  __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
>  __hrtimer_run_queues+0x1c0/0xe40 kernel/time/hrtimer.c:1601
>  hrtimer_interrupt+0x330/0xa00 kernel/time/hrtimer.c:1663
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
>  __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1106
>  sysvec_apic_timer_interrupt+0x40/0xc0 arch/x86/kernel/apic/apic.c:1100
>  asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
> RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
> RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70 kernel/locking/spinlock.c:191
> Code: 74 24 10 e8 ba 19 54 f8 48 89 ef e8 f2 cf 54 f8 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> d3 9d 48 f8 65 8b 05 7c 68 fc 76 85 c0 74 0a 5b 5d c3 e8 40 59
> RSP: 0018:ffffc90000dc0b28 EFLAGS: 00000206
> RAX: 0000000000000002 RBX: 0000000000000200 RCX: 1ffffffff1f5f34a
> RDX: 0000000000000000 RSI: 0000000000000103 RDI: 0000000000000001
> RBP: ffff888144fa8000 R08: 0000000000000001 R09: ffffffff8fa9a99f
> R10: 0000000000000001 R11: ffffc90013880000 R12: ffff888145047440
> R13: ffff88801ee8e500 R14: dffffc0000000000 R15: ffff888011f69c00
>  spin_unlock_irqrestore include/linux/spinlock.h:409 [inline]
>  dummy_timer+0x12f1/0x32a0 drivers/usb/gadget/udc/dummy_hcd.c:1985
>  call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1431
>  expire_timers kernel/time/timer.c:1476 [inline]
>  __run_timers.part.0+0x67c/0xa50 kernel/time/timer.c:1745
>  __run_timers kernel/time/timer.c:1726 [inline]
>  run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1758
>  __do_softirq+0x29b/0x9f6 kernel/softirq.c:345
>  do_softirq.part.0+0xd9/0x130 kernel/softirq.c:248
>  </IRQ>
>  do_softirq kernel/softirq.c:240 [inline]
>  __local_bh_enable_ip+0x102/0x120 kernel/softirq.c:198
>  spin_unlock_bh include/linux/spinlock.h:399 [inline]
>  nsim_dev_trap_report drivers/net/netdevsim/dev.c:585 [inline]
>  nsim_dev_trap_report_work+0x867/0xbd0 drivers/net/netdevsim/dev.c:611
>  process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 5-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 2-1:0.0: unknown status received: -71
> usbtmc 4-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: unknown status received: -71
> usbtmc 3-1:0.0: usb_submit_urb failed: -19
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: unknown status received: -71
> usbtmc 6-1:0.0: usb_submit_urb failed: -19
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>


* [syzbot] INFO: rcu detected stall in tx
@ 2021-04-19  7:19 syzbot
  2021-04-19  7:27 ` Dmitry Vyukov
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: syzbot @ 2021-04-19  7:19 UTC (permalink / raw)
  To: bp, dwmw, hpa, linux-kernel, luto, mingo, syzkaller-bugs, tglx, x86

Hello,

syzbot found the following issue on:

HEAD commit:    50987bec Merge tag 'trace-v5.12-rc7' of git://git.kernel.o..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1065c5fcd00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=398c4d0fe6f66e68
dashboard link: https://syzkaller.appspot.com/bug?extid=e2eae5639e7203360018

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com

usbtmc 5-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 5-1:0.0: unknown status received: -71
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	1-...!: (8580 ticks this GP) idle=72e/1/0x4000000000000000 softirq=20679/20679 fqs=0 
	(t=10500 jiffies g=27129 q=416)
rcu: rcu_preempt kthread starved for 10500 jiffies! g27129 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt     state:R  running task     stack:29168 pid:   14 ppid:     2 flags:0x00004000
Call Trace:
 context_switch kernel/sched/core.c:4322 [inline]
 __schedule+0x911/0x21b0 kernel/sched/core.c:5073
 schedule+0xcf/0x270 kernel/sched/core.c:5152
 schedule_timeout+0x14a/0x250 kernel/time/timer.c:1892
 rcu_gp_fqs_loop kernel/rcu/tree.c:2005 [inline]
 rcu_gp_kthread+0xd07/0x2250 kernel/rcu/tree.c:2178
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 3232 Comm: aoe_tx0 Not tainted 5.12.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:native_apic_mem_write+0x8/0x10 arch/x86/include/asm/apic.h:110
Code: c7 40 d9 36 8f e8 c8 11 86 00 eb b0 66 0f 1f 44 00 00 be 01 00 00 00 e9 36 c7 2c 00 cc cc cc cc cc cc 89 ff 89 b7 00 c0 5f ff <c3> 0f 1f 80 00 00 00 00 48 b8 00 00 00 00 00 fc ff df 53 89 fb 48
RSP: 0018:ffffc90000007ea8 EFLAGS: 00000046
RAX: dffffc0000000000 RBX: ffffffff8b0a78c0 RCX: 0000000000000020
RDX: 1ffffffff1614f1a RSI: 000000000001c285 RDI: 0000000000000380
RBP: ffff8880b9c1f2c0 R08: 000000000000003f R09: 0000000000000000
R10: ffffffff8166ecf7 R11: 0000000000000000 R12: 000000000001c285
R13: 0000000000000020 R14: ffff8880b9c26340 R15: 0000006120792e26
FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb9e6cdb380 CR3: 0000000018792000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 apic_write arch/x86/include/asm/apic.h:393 [inline]
 lapic_next_event+0x4d/0x80 arch/x86/kernel/apic/apic.c:472
 clockevents_program_event+0x254/0x370 kernel/time/clockevents.c:334
 tick_program_event+0xac/0x140 kernel/time/tick-oneshot.c:44
 hrtimer_interrupt+0x414/0xa00 kernel/time/hrtimer.c:1676
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1106
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:27 [inline]
RIP: 0010:check_kcov_mode kernel/kcov.c:163 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x60 kernel/kcov.c:197
Code: f0 4d 89 03 e9 f2 fc ff ff b9 ff ff ff ff ba 08 00 00 00 4d 8b 03 48 0f bd ca 49 8b 45 00 48 63 c9 e9 64 ff ff ff 0f 1f 40 00 <65> 8b 05 39 fe 8d 7e 89 c1 48 8b 34 24 81 e1 00 01 00 00 65 48 8b
RSP: 0018:ffffc900030cf6f8 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88801aff1c40 RSI: ffffffff815c2e4f RDI: 0000000000000003
RBP: ffffc900030cf738 R08: 0000000000000000 R09: ffffffff8fa9a96f
R10: ffffffff815c2e45 R11: 0000000000000000 R12: 000000000000002d
R13: ffff8880113db880 R14: 0000000000000000 R15: 0000000000000200
 console_trylock_spinning kernel/printk/printk.c:1818 [inline]
 vprintk_emit+0x3a5/0x560 kernel/printk/printk.c:2097
 dev_vprintk_emit+0x36e/0x3b2 drivers/base/core.c:4434
 dev_printk_emit+0xba/0xf1 drivers/base/core.c:4445
 __netdev_printk+0x1c6/0x27a net/core/dev.c:11292
 netdev_warn+0xd7/0x109 net/core/dev.c:11345
 ieee802154_subif_start_xmit.cold+0x17/0x27 net/mac802154/tx.c:125
 __netdev_start_xmit include/linux/netdevice.h:4825 [inline]
 netdev_start_xmit include/linux/netdevice.h:4839 [inline]
 xmit_one net/core/dev.c:3605 [inline]
 dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3621
 sch_direct_xmit+0x2e1/0xbd0 net/sched/sch_generic.c:313
 qdisc_restart net/sched/sch_generic.c:376 [inline]
 __qdisc_run+0x4ba/0x15f0 net/sched/sch_generic.c:384
 qdisc_run include/net/pkt_sched.h:136 [inline]
 qdisc_run include/net/pkt_sched.h:128 [inline]
 __dev_xmit_skb net/core/dev.c:3807 [inline]
 __dev_queue_xmit+0x14b9/0x2e00 net/core/dev.c:4162
 tx+0x68/0xb0 drivers/block/aoe/aoenet.c:63
 kthread+0x1e7/0x3a0 drivers/block/aoe/aoecmd.c:1230
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
NMI backtrace for cpu 1
CPU: 1 PID: 37 Comm: kworker/1:1 Not tainted 5.12.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events nsim_dev_trap_report_work
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x222/0x2a7 kernel/rcu/tree_stall.h:341
 print_cpu_stall kernel/rcu/tree_stall.h:622 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:697 [inline]
 rcu_pending kernel/rcu/tree.c:3830 [inline]
 rcu_sched_clock_irq.cold+0x4f7/0x11dd kernel/rcu/tree.c:2650
 update_process_times+0x16d/0x200 kernel/time/timer.c:1796
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1369
 __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
 __hrtimer_run_queues+0x1c0/0xe40 kernel/time/hrtimer.c:1601
 hrtimer_interrupt+0x330/0xa00 kernel/time/hrtimer.c:1663
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1106
 sysvec_apic_timer_interrupt+0x40/0xc0 arch/x86/kernel/apic/apic.c:1100
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70 kernel/locking/spinlock.c:191
Code: 74 24 10 e8 ba 19 54 f8 48 89 ef e8 f2 cf 54 f8 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> d3 9d 48 f8 65 8b 05 7c 68 fc 76 85 c0 74 0a 5b 5d c3 e8 40 59
RSP: 0018:ffffc90000dc0b28 EFLAGS: 00000206
RAX: 0000000000000002 RBX: 0000000000000200 RCX: 1ffffffff1f5f34a
RDX: 0000000000000000 RSI: 0000000000000103 RDI: 0000000000000001
RBP: ffff888144fa8000 R08: 0000000000000001 R09: ffffffff8fa9a99f
R10: 0000000000000001 R11: ffffc90013880000 R12: ffff888145047440
R13: ffff88801ee8e500 R14: dffffc0000000000 R15: ffff888011f69c00
 spin_unlock_irqrestore include/linux/spinlock.h:409 [inline]
 dummy_timer+0x12f1/0x32a0 drivers/usb/gadget/udc/dummy_hcd.c:1985
 call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1431
 expire_timers kernel/time/timer.c:1476 [inline]
 __run_timers.part.0+0x67c/0xa50 kernel/time/timer.c:1745
 __run_timers kernel/time/timer.c:1726 [inline]
 run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1758
 __do_softirq+0x29b/0x9f6 kernel/softirq.c:345
 do_softirq.part.0+0xd9/0x130 kernel/softirq.c:248
 </IRQ>
 do_softirq kernel/softirq.c:240 [inline]
 __local_bh_enable_ip+0x102/0x120 kernel/softirq.c:198
 spin_unlock_bh include/linux/spinlock.h:399 [inline]
 nsim_dev_trap_report drivers/net/netdevsim/dev.c:585 [inline]
 nsim_dev_trap_report_work+0x867/0xbd0 drivers/net/netdevsim/dev.c:611
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 5-1:0.0: unknown status received: -71
usbtmc 5-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 5-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 5-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 2-1:0.0: unknown status received: -71
usbtmc 4-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: unknown status received: -71
usbtmc 3-1:0.0: usb_submit_urb failed: -19
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: unknown status received: -71
usbtmc 6-1:0.0: usb_submit_urb failed: -19


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

end of thread, other threads:[~2021-09-04  7:55 UTC | newest]

Thread overview: 17+ messages
2021-05-19 16:14 Re: Re: Re: Re: Re: [syzbot] INFO: rcu detected stall in tx Guido Kiener
2021-05-19 17:35 ` Alan Stern
2021-05-19 19:38   ` Thinh Nguyen
2021-05-20  2:01     ` Alan Stern
2021-05-20 20:30       ` Thinh Nguyen
2021-05-24 15:18         ` Mathias Nyman
2021-05-24 18:55           ` Alan Stern
2021-05-24 19:23             ` Thinh Nguyen
2021-05-24 22:16               ` Mathias Nyman
2021-05-24 22:48                 ` Thinh Nguyen
2021-05-19 18:04 ` Re: Re: Re: Re: " Lee Jones
  -- strict thread matches above, loose matches on Subject: below --
2021-04-19  7:19 syzbot
2021-04-19  7:27 ` Dmitry Vyukov
2021-06-28  6:38   ` Zhang, Qiang
2021-06-28 14:17     ` Alan Stern
2021-06-27 20:20 ` syzbot
2021-09-04  7:55 ` syzbot
