linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Tejas Joglekar <Tejas.Joglekar@synopsys.com>
Cc: "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	John Youn <John.Youn@synopsys.com>
Subject: Re: [RFC PATCH 1/4] usb: xhci: Synopsys xHC consolidate TRBs
Date: Tue, 3 Mar 2020 16:47:57 +0200	[thread overview]
Message-ID: <37e8bc7b-4dd4-704c-ab1d-933a56534213@linux.intel.com> (raw)
In-Reply-To: <d485b110-2a81-e5db-6e76-c65c498550f3@synopsys.com>

On 3.3.2020 12.24, Tejas Joglekar wrote:
> Hi,
> On 2/14/2020 2:06 PM, Tejas Joglekar wrote:
>> Hi,
>> On 2/12/2020 8:34 PM, Mathias Nyman wrote:
>>> On 7.2.2020 19.17, Tejas Joglekar wrote:
>>>> Hi,
>>>> Thanks for the review comments.
>>>> On 1/3/2020 10:14 PM, Mathias Nyman wrote:
>>>>> On 20.12.2019 15.39, Tejas Joglekar wrote:
>>>>>> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
>>>>>> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
>>>>>> for HS. The controller loads and updates the TRB cache from the transfer
>>>>>> ring in system memory whenever the driver issues a start transfer or
>>>>>> update transfer command.
>>>>>>
>>>>>> For chained TRBs, the Synopsys xHC requires that the total amount of
>>>>>> bytes for all TRBs loaded in the TRB cache be greater than or equal to 1
>>>>>> MPS. Or the chain ends within the TRB cache (with a last TRB).
>>>>>>
>>>>>> If this requirement is not met, the controller will not be able to send
>>>>>> or receive a packet and it will hang causing a driver timeout and error.
>>>>>>
>>>>>> This can be a problem if a class driver queues SG requests with many
>>>>>> small-buffer entries. The XHCI driver will create a chained TRB for each
>>>>>> entry which may trigger this issue.
>>>>>>
>>>>>> This patch adds logic to the XHCI driver to detect and prevent this from
>>>>>> happening.
>>>>>>
>>>>>> For every (TRB_CACHE_SIZE - 2) TRBs, we check the total buffer size of
>>>>>> the TRBs and if the chain continues and we don't make up at least 1 MPS,
>>>>>> we create a bounce buffer to consolidate up to the next (4 * MPS) TRBs
>>>>>> into the last TRB.
>>>>>>
>>>>>> We check at (TRB_CACHE_SIZE - 2) because it is possible that there would
>>>>>> be a link and/or event data TRB that take up to 2 of the cache entries.
>>>>>> And we consolidate the next (4 * MPS) to improve performance.
>>>>>>
>>>>>> We discovered this issue with devices on other platforms but have not
>>>>>> yet come across any device that triggers this on Linux. But it could be
>>>>>> a real problem now or in the future. All it takes is N number of small
>>>>>> chained TRBs. And other instances of the Synopsys IP may have smaller
>>>>>> values for the TRB_CACHE_SIZE which would exacerbate the problem.
>>>>>>
>>>>>> We verified this patch using our internal driver and testing framework.
>>>>>
>>>>>
>>>> I understand that in a first look it looks a complex solution, but you can suggest
>>>> the modifications/changes which would be required to make the patch more readable.
>>>> I have tried to keep the implementation similar to bounce buffer implementation 
>>>> only with addition of bounce buffer list. For the spinlock case, you can take a 
>>>> call if it is required or not.
>>>
>>> In your case you know the need for a bounce buffer much earlier than in the
>>> existing TD fragment case.
>>>
>>> Have you looked into the struct hc_driver map_urb_for_dma() and
>>> unmap_urb_for_dma() hooks? In those you could detect the need for a bounce
>>> buffer, allocate it, and bluntly copy entire scattergather buffer to the 
>>> bounce buffer. It should be fairly small anyway.
>>>
>> I will look into it, and get back to you. Thanks for the suggestion.
>>  
> I looked into it and I have a question related to approach you have suggested.
> I can detect need for the bounce buffer in map_urb_for_dma() function and can allocate bounce
> buffer and bluntly copy it, but when the SG list is having data over few MB's, I think 
> my bounce buffer allocation might fail, over a period. Do you think it is possible?

Doesn't sound very likely, the sg would need to have more than 16 entries of which the
total length of consecutive 16 entries is less than 1024 bytes, and then the following
entries should be large enough to fail memory allocation. 
(for HS the total legth < 512 for 8 entries, and rest huge)  

And no real world device has yet even triggered the first condition of having 16 (8)
sg entries with total length less than max packet size.

> 
> So to avoid that, I thought of having a list of such bounce buffers held with the URB and 
> then I can break the SG list with some fixed data length (e.g 16KB or 32 KB's) bounce buffers
> and copy the SG into that bounce buffer list.
> 
> Another option is to create a bounce sg, based on detection of bounce buffer requirement, 
> where the small size SG's are combined to create a new SG list which can satisfy the 
> TRB cache requirement for the SNPS controller.
> 
> What do you suggest? Which is the best way to go about solving the problem?

As this is extremely rare (never reported in a real use case) 
I'd keep this as simple as possible. Allocate one bounce buffer when needed, and copy all.
If memory allocation fails (unlikely) print a warning, then we immediately know what to fix.

We are talking about sizes where 16 sg entries have in total 1024 bytes of data.

ehci-tegra.c does something related to what you want. It replaces
urb->transfer_buffer with a dma aligned bounce buffer. 

-Mathias

  reply	other threads:[~2020-03-03 14:45 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-20 13:38 [RFC PATCH 0/4] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
2019-12-20 13:39 ` [RFC PATCH 1/4] usb: xhci: Synopsys xHC consolidate TRBs Tejas Joglekar
2020-01-02  9:08   ` David Laight
2020-01-03 16:44   ` Mathias Nyman
2020-02-07 17:17     ` Tejas Joglekar
2020-02-12 15:04       ` Mathias Nyman
2020-02-14  8:36         ` Tejas Joglekar
2020-03-03 10:24           ` Tejas Joglekar
2020-03-03 14:47             ` Mathias Nyman [this message]
2020-03-16 17:09               ` Tejas Joglekar
2020-03-17  8:48                 ` Mathias Nyman
2019-12-20 13:39 ` [RFC PATCH 2/4] usb: dwc3: Add device property consolidate-trbs Tejas Joglekar
2019-12-20 13:40 ` [RFC PATCH 3/4] usb: xhci: Set quirk for XHCI_CONSOLIDATE_TRBS Tejas Joglekar
2019-12-20 13:40 ` [RFC PATCH 4/4] dt-bindings: usb: Add snps,consolidate-trbs & consolidate-trbs Tejas Joglekar
2020-01-08 15:19   ` Rob Herring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=37e8bc7b-4dd4-704c-ab1d-933a56534213@linux.intel.com \
    --to=mathias.nyman@linux.intel.com \
    --cc=John.Youn@synopsys.com \
    --cc=Tejas.Joglekar@synopsys.com \
    --cc=linux-usb@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).