linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ross Zwisler <zwisler@google.com>
To: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"kernel@collabora.com" <kernel@collabora.com>
Subject: Re: xhci problem -> general protection fault
Date: Thu, 7 Jan 2021 09:07:04 -0700	[thread overview]
Message-ID: <X/cxqGgHu5NraH+s@google.com> (raw)
In-Reply-To: <52b0d136-3fcc-ad7f-8e53-140c077489be@linux.intel.com>

On Thu, Jan 07, 2021 at 10:57:19AM +0200, Mathias Nyman wrote:
> On 6.1.2021 20.52, Ross Zwisler wrote:
> > On Wed, Dec 30, 2020 at 5:31 AM Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
> > <>
> >> I was able to reproduce the issue with two USB devices on a different system.
> >>
> >> I saw that the new incorrect dequeue pointer sometimes had the higher 32 bits
> >> incorrect while the lower bits were correct.
> >> So this looks like a memory access order issue.
> >>
> >> The command TRB is 16 bytes. The xhci driver writes it in four 4 byte chunks.
> >> Even if the driver writes the chunk that sets the cycle bit last, handing the TRB
> >> over to the controller, it appears that the actual write order can be different.
> >> The controller ends up reading a command TRB with updated cycle bit but old bogus
> >> values in the "new dequeue pointer" field.
> >>
> >> Adding a write memory barrier before writing the cycle bit solved the issue in my case.
> >>
> >> Whole patch series is updated, added write memory barrier, and rebased on 5.10.
> >> It can be found force-updated in the same rewrite_halt_stop_handling branch:
> >>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git rewrite_halt_stop_handling
> >> https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=rewrite_halt_stop_handling
> >>
> >> Does this work for you?
> > 
> > Yes, it does!!  I verified that I'm able to reproduce the issue in less than
> > 10 seconds with this commit which is the HEAD~1 of your series:
> > 
> > a7d053d207121 xhci: handle halting transfer event properly after endpoint stop and halt raced.
> > 
> > With this commit (the final commit in your series, adding a single patch): 
> > 
> > 96887d191a88c xhci: make sure TRB is fully written before giving it to the controller
> > 
> > I ran cleanly for over 20 minutes and haven't been able to reproduce the
> > issue.  It looks like the wmb() added in that patch makes all the difference!
> > 
> > Thank you for the fix, and you can add this tag to the series:
> > 
> > Tested-by: Ross Zwisler <zwisler@google.com>
> > 
> 
> Great, thanks for testing.
> 
> I think it makes sense to cherry pick that one-liner wmb() patch first to 5.11 and stable,
> then add rest of the rewrite later.
> 
> -Mathias

Makes sense to me.  FWIW I've verified that v5.10 reproduces the issue easily,
and v5.10 with just the wmb() patch does not.

Thanks again for the fix!

      reply	other threads:[~2021-01-07 16:07 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-17 15:30 xhci problem -> general protection fault Andrzej Pietrasiewicz
2020-09-18 10:50 ` Mathias Nyman
2020-09-18 14:20   ` Andrzej Pietrasiewicz
2020-09-25 13:40     ` Mathias Nyman
2020-09-25 21:05       ` Ross Zwisler
2020-09-28 13:32         ` Andrzej Pietrasiewicz
2020-09-29  7:13           ` Mathias Nyman
2020-10-01 14:13             ` Andrzej Pietrasiewicz
2020-09-28 22:35         ` Mathias Nyman
2020-10-01 16:43           ` zwisler
2020-10-12 19:20             ` Mathias Nyman
2020-10-12 21:53               ` zwisler
2020-10-13  7:49                 ` Mathias Nyman
2020-10-13  8:29                   ` Andrzej Pietrasiewicz
2020-10-13 16:44                     ` zwisler
2020-11-19 16:52                   ` Ross Zwisler
2020-11-23 15:06                     ` Mathias Nyman
2020-12-02 22:59                       ` Ross Zwisler
2020-12-04 18:07                         ` Mathias Nyman
2020-12-08 17:24                           ` Ross Zwisler
2020-12-09 13:11                             ` Mathias Nyman
2020-12-09 18:54                               ` Ross Zwisler
2020-12-30 12:33                                 ` Mathias Nyman
2021-01-06 18:52                                   ` Ross Zwisler
2021-01-07  8:57                                     ` Mathias Nyman
2021-01-07 16:07                                       ` Ross Zwisler [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=X/cxqGgHu5NraH+s@google.com \
    --to=zwisler@google.com \
    --cc=andrzej.p@collabora.com \
    --cc=kernel@collabora.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=mathias.nyman@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).