linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alan Stern <stern@rowland.harvard.edu>
To: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Cc: Guido Kiener <Guido.Kiener@rohde-schwarz.com>,
	dave penkler <dpenkler@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	syzbot <syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"lee.jones@linaro.org" <lee.jones@linaro.org>,
	USB list <linux-usb@vger.kernel.org>,
	"bp@alien8.de" <bp@alien8.de>,
	"dwmw@amazon.co.uk" <dwmw@amazon.co.uk>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"luto@kernel.org" <luto@kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"syzkaller-bugs@googlegroups.com"
	<syzkaller-bugs@googlegroups.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [syzbot] INFO: rcu detected stall in tx
Date: Wed, 19 May 2021 22:01:17 -0400	[thread overview]
Message-ID: <20210520020117.GA1186755@rowland.harvard.edu> (raw)
In-Reply-To: <12088413-2f7d-a1e5-5e8a-25876d85d18a@synopsys.com>

On Wed, May 19, 2021 at 07:38:52PM +0000, Thinh Nguyen wrote:
> Hi Alan,
> 
> Sorry if this diverges from the thread, but I've been wondering whether
> to add a change for this also.
> 
> For xHCI hosts, after transactions errors, the endpoint will enter
> halted state.

No.  You are misreading the xHCI spec.  Section 4.6.8 says:

	... the state of the associated Endpoint Context is set to 
	Halted...

Note this carefully.  It says "Endpoint Context", not "endpoint".

The endpoint is part of the device, whereas the endpoint context is part 
of the host controller.  The device doesn't know when a transaction 
error has occurred; consequently such errors do not affect the endpoint.  
The host controller does know, and consequently such errors do affect 
the endpoint context.

> The driver will attempt a few soft-retries before giving
> up. According to the xHCI spec (section 4.6.8), a host may send a
> ClearFeature(endpoint_halt) to recover and restart the transfer (see

Not quite.  The section of the spec you're talking about says:

	Software shall execute the following sequence to “reset a 
	pipe”....  Issue a ClearFeature(ENDPOINT_HALT) request to 
	device.

It does not say the host controller will do this; it says that software 
will do it.

> "reset a pipe" in xhci spec), and the class driver can handle this after
> receiving something like -EPROTO from xhci.
> 
> However, as you've pointed out, some devices don't like
> ClearFeature(ep_halt) and may not properly synchronize with the host on
> where it should restart.
> 
> Some OS (such as Windows) do this. Not sure if we also want this?

In general we should do the same thing as Windows does, because most 
hardware designers test their equipment on Windows systems but 
relatively few test on Linux systems.

> Currently the recovery is just a timeout and a port reset from the class

This depends on the driver.  Some perform no recovery at all.

> driver, but the timeout is usually defaulted to a long time (e.g. 30
> seconds for storage class driver).

That 30-second timeout in the mass-storage driver applies in situations 
where a command fails to complete, not in situations where it completes 
quickly but with a -EPROTO or -EPIPE error.

The fact is that only a small percentage of -EPROTO errors are 
recoverable.  Some of them can be handled by a port reset, which can be 
pretty awkward to perform but does occasionally work.  A lot of them 
occur because the USB cable has been unplugged; obviously there's no way 
to recover from that.  With only a few exceptions, the best and simplest 
approach is not to try to recover at all.

For the case in question (the syzbot bug report that started this 
thread), the class driver doesn't try to perform any recovery.  It just 
resubmits the URB, getting into a tight retry loop which consumes too 
much CPU time.  Simply giving up would be preferable.

Alan Stern

  reply	other threads:[~2021-05-20  2:01 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-19 16:14 Re: Re: Re: Re: Re: [syzbot] INFO: rcu detected stall in tx Guido Kiener
2021-05-19 17:35 ` Alan Stern
2021-05-19 19:38   ` Thinh Nguyen
2021-05-20  2:01     ` Alan Stern [this message]
2021-05-20 20:30       ` Thinh Nguyen
2021-05-24 15:18         ` Mathias Nyman
2021-05-24 18:55           ` Alan Stern
2021-05-24 19:23             ` Thinh Nguyen
2021-05-24 22:16               ` Mathias Nyman
2021-05-24 22:48                 ` Thinh Nguyen
2021-05-19 18:04 ` Re: Re: Re: Re: " Lee Jones
  -- strict thread matches above, loose matches on Subject: below --
2021-04-19  7:19 syzbot
2021-04-19  7:27 ` Dmitry Vyukov
2021-06-28  6:38   ` Zhang, Qiang
2021-06-28 14:17     ` Alan Stern
2021-06-27 20:20 ` syzbot
2021-09-04  7:55 ` syzbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210520020117.GA1186755@rowland.harvard.edu \
    --to=stern@rowland.harvard.edu \
    --cc=Guido.Kiener@rohde-schwarz.com \
    --cc=Thinh.Nguyen@synopsys.com \
    --cc=bp@alien8.de \
    --cc=dpenkler@gmail.com \
    --cc=dvyukov@google.com \
    --cc=dwmw@amazon.co.uk \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=lee.jones@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=syzbot+e2eae5639e7203360018@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).