Linux-USB Archive on lore.kernel.org
From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Andrzej Pietrasiewicz <andrzej.p@collabora.com>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>
Cc: "kernel@collabora.com" <kernel@collabora.com>
Subject: Re: xhci problem -> general protection fault
Date: Fri, 25 Sep 2020 16:40:29 +0300
Message-ID: <133c123e-e857-7f83-d146-f39c00afe39f@linux.intel.com>
In-Reply-To: <8230c2a2-719c-ef81-e85d-5921bf8e98e6@collabora.com>

On 18.9.2020 17.20, Andrzej Pietrasiewicz wrote:
> Hi Mathias,
> 
> On 18.09.2020 at 12:50, Mathias Nyman wrote:
>> On 17.9.2020 18.30, Andrzej Pietrasiewicz wrote:
>>> Dear All,
>>>
>>> I have encountered a problem in xhci which leads to general protection fault.
>>>
>>> The problem is triggered by running this program:
>>>
>>> https://gitlab.collabora.com/andrzej.p/bulk-cancel.git
>>>
>>> $ sudo ./bulk-cancel -D /dev/bus/usb/002/006 -i 1 -b 1
>>>
>>> where /dev/bus/usb/002/006 is a Gadget Zero:
>>>
>>> It takes less than a minute until the crash happens.
>>> The DMAR (iommu) errors don't always happen, i.e. there are crashes
>>> when they are not reported in the system log. In either case the
>>>
>>> "WARN Cannot submit Set TR Deq Ptr"
>>> "A Set TR Deq Ptr command is pending."
>>> "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state."
>>>
>>> messages do appear.
>>>
>>
>> Nice testcase and report, thanks.
>>
>> I started looking at issues in this area some time ago and wrote a couple of patches, but
>> the work was left hanging. The two patches (now rebased on 5.9-rc3) can be found in my tree in the
>> fix_invalid_context_at_stop_endpoint branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git fix_invalid_context_at_stop_endpoint
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=fix_invalid_context_at_stop_endpoint
>>
>> If you could give those a try and see if they help I'd be grateful.
> 
> No, it doesn't help, albeit the errors are slightly different:
> 
> xhci_hcd 0000:00:14.0: WARN Cannot submit Set TR Deq Ptr
> xhci_hcd 0000:00:14.0: A Set TR Deq Ptr command is pending.
> dmar_fault: 44 callbacks suppressed
> DRHD: handling fault status reg 3
> DMAR: [DMA Write] Request device [00:14.0] PASID ffffffff fault addr ffcda000 [fault reason 05] PTE Write access is not set
> DMAR: DRHD: handling fault status reg 3

Ok, thanks, the DMA problems make sense to me now.

If a transfer ring stops on a transfer request block (TRB) that should be canceled (manual cancel
or error), it's not enough to just turn the TRB into a no-op.
The HW has most likely cached the TRB, so we need to move the transfer ring dequeue pointer past this TRB.
Moving the dequeue pointer also clears the controller's cache.

We do all this, but if we fail to queue the Set TR Deq command, the TRB (with its DMA pointers) stays on the ring,
and the controller will access the TRB's DMA address once it continues running. At this point the xhci driver has already
given back the URB for the canceled/erroneous TRB, and its buffer is probably unmapped already.
Hence the DMAR fault entries.

Looks like this part of the code needs a more extensive rewrite. On top of this, we are not handling
races between endpoints halted due to errors and endpoints stopped by the driver to cancel URBs.

-Mathias
