All of lore.kernel.org
 help / color / mirror / Atom feed
From: Felipe Balbi <felipe.balbi@linux.intel.com>
To: Thinh Nguyen <thinh.nguyen@synopsys.com>Thinh Nguyen
	<thinh.nguyen@synopsys.com>,
	Linux USB <linux-usb@vger.kernel.org>
Subject: [10/14] usb: dwc3: gadget: remove wait_end_transfer
Date: Mon, 21 Jan 2019 10:40:47 +0200	[thread overview]
Message-ID: <87r2d6jtdc.fsf@linux.intel.com> (raw)

Hi,

Thinh Nguyen <thinh.nguyen@synopsys.com> writes:
> On 1/17/2019 11:12 PM, Felipe Balbi wrote:
>> Hi,
>>
>> Thinh Nguyen <thinh.nguyen@synopsys.com> writes:
>>> @@ -1409,15 +1407,11 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>>>                 if (r == req) {
>>>                         /* wait until it is processed */
>>>                         dwc3_stop_active_transfer(dep, true);
>>>
>>> I ran into a regression with this patch. DWC3 will cleanup cancelled
>>> requests on END_TRANSFER command completion. However, if for some
>>> reason, the driver is unable to send the command, it will never
>> by why wouldn't the driver send the command? That seems to be the error
>> we should be looking at. Got some tracepoints available?
>
> Yes. Attached is the tracepoint. Device got stuck on dequeue on the last
> line of the tracepoint.

to me, it seems that we're trying to dequeue the same request
twice. Here's a filtered list for the request in question. Notice last
two lines. They are both dwc3_ep_dequeue() without a matching
dwc3_ep_queue():

 irq/16-dwc3-3983  [004] d...   130.997794: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 8192/8192 zsI ==> 0
file-storage-3982  [006] d...   130.998090: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   130.998535: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   130.998584: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   130.999196: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   130.999245: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   130.999871: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   130.999917: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.000545: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.000554: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/13 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.000830: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 13/13 zsI ==> 0
file-storage-3982  [006] d...   131.001021: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.001763: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.001809: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.002437: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.002483: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.003097: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.003144: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/8192 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.003610: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 8192/8192 zsI ==> 0
file-storage-3982  [006] d...   131.003790: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.004186: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.004217: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.004848: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.004892: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.005514: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.005560: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.006174: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.006218: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/13 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.006469: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 13/13 zsI ==> 0
file-storage-3982  [006] d...   131.006680: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.007367: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.007398: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.008050: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.008098: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.008723: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.008767: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/8192 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.009231: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 8192/8192 zsI ==> 0
file-storage-3982  [006] d...   131.009522: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.009946: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.009983: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
 irq/16-dwc3-3983  [004] d...   131.010617: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0
file-storage-3982  [006] d...   131.010663: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
file-storage-3982  [006] ....   131.065766: dwc3_ep_dequeue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
file-storage-3982  [006] ....   135.315162: dwc3_ep_dequeue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115

We need to see what happens after we print this extra dequeue line
here. We also need to know why upper layer is dequeueing the same
request twice. It seems like we're dealing with a gadget driver
bug. Granted, we need to protect ourselves from it, but we don't know
why upper layer dequeues the same thing twice; nor do we know what
happens after this last line here.

Looking at the last few lines relating to ep1in we have:

> file-storage-3982  [006] d...   131.010663: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115

queue the request in question

> file-storage-3982  [006] d...   131.010667: dwc3_prepare_trb: ep1in: trb 000000002ab8a1f9 buf 00000000bde24000 size 16384 ctrl 00000811 (Hlcs:sC:normal)

prepare a TRB for it

> file-storage-3982  [006] d...   131.010674: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [30007] params 00000000 00000000 00000000 --> status: Successful

no status update transfer

>  irq/16-dwc3-3983  [004] d...   131.010942: dwc3_event: event (00004086): ep1in: Transfer In Progress [0] (sIm)

transfer in progress

>  irq/16-dwc3-3983  [004] d...   131.010942: dwc3_complete_trb: ep1in: trb 00000000426cd8cf buf 00000000bde20000 size 0 ctrl 00000810 (hlcs:sC:normal)

complete a TRB in full. Where is our giveback? This is not a chained
TRB, so we should have a call to giveback here. This is already fishy.

> file-storage-3982  [006] d...   131.010994: dwc3_ep_queue: ep1in: req 00000000f7765e56 length 0/16384 zsI ==> -115

new request coming in

> file-storage-3982  [006] d...   131.010998: dwc3_prepare_trb: ep1in: trb 0000000065d9143d buf 00000000bde28000 size 16384 ctrl 00000811 (Hlcs:sC:normal)

another TRB

> file-storage-3982  [006] d...   131.011005: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [30007] params 00000000 00000000 00000000 --> status: Successful

update transfer

> file-storage-3982  [006] ....   131.065558: dwc3_ep_dequeue: ep1in: req 00000000f7765e56 length 0/16384 zsI ==> -115

a call to dequeue for the new request. Request is moved to cancelled list

> file-storage-3982  [006] d...   131.065687: dwc3_gadget_ep_cmd: ep1in: cmd 'End Transfer' [30d08] params 00000000 00000000 00000000 --> status: Successful

End transfer issued

>  irq/16-dwc3-3983  [004] d...   131.065729: dwc3_event: event (080301c6): ep1in: Endpoint Command Complete

End transfer completed

>  irq/16-dwc3-3983  [004] d...   131.065731: dwc3_gadget_giveback: ep1in: req 00000000f7765e56 length 0/16384 zsI ==> -104

for each request is cancelled_list, giveback.

> file-storage-3982  [006] ....   131.065766: dwc3_ep_dequeue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115
> file-storage-3982  [006] ....   135.315162: dwc3_ep_dequeue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115

Try to dequeue a request that was already completed. Odd. Why are we
missing a call to giveback??

>> What we need is a way to ensure that either END_TRANSFER happens, or
>> prove that for cases where END_TRANSFER isn't issued, is because of
>> suspend/resume or driver removal. In either case, we can safely giveback
>> the TRBs since core will be reinitialized later on.
>>
>> Still, if you have some tracepoints, I'd like to see why is it that core
>> doesn't issue END_TRANSFER.
>>
>
> The driver does send END_TRANSFER command. So, the issue here is the
> upperlayer driver can call dequeue() after the END_TRANSFER command was
> completed and before any new START_TRANSFER command. The request was not
> added to the cancelled list soon enough for it to be cleaned up on
> command completion.
>
> How about giveback the request immediately if endpoint hasn't started
> and no pending_end_transfer?

we already do that. Here's ep_dequeue with comments:

> static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> 		struct usb_request *request)
> {
> 	struct dwc3_request		*req = to_dwc3_request(request);
> 	struct dwc3_request		*r = NULL;
>
> 	struct dwc3_ep			*dep = to_dwc3_ep(ep);
> 	struct dwc3			*dwc = dep->dwc;
>
> 	unsigned long			flags;
> 	int				ret = 0;
>
> 	trace_dwc3_ep_dequeue(req);

print our trace line

> 	spin_lock_irqsave(&dwc->lock, flags);
>
> 	list_for_each_entry(r, &dep->pending_list, list) {
> 		if (r == req)
> 			break;

if we find the request on pending_list, meaning it wasn't started...

> 	}
>
> 	if (r != req) {

... this won't be true...

> 		list_for_each_entry(r, &dep->started_list, list) {
> 			if (r == req)
> 				break;
> 		}
> 		if (r == req) {
> 			/* wait until it is processed */
> 			dwc3_stop_active_transfer(dep, true);
>
> 			if (!r->trb)
> 				goto out0;
>
> 			dwc3_gadget_move_cancelled_request(req);
> 			goto out0;
> 		}
> 		dev_err(dwc->dev, "request %pK was not queued to %s\n",
> 				request, ep->name);
> 		ret = -EINVAL;
> 		goto out0;
> 	}
>
> 	dwc3_gadget_giveback(dep, req, -ECONNRESET);

... and we will run this. Request is only moved to cancelled_list if it
was previously on started_list.

> out0:
> 	spin_unlock_irqrestore(&dwc->lock, flags);
>
> 	return ret;
> }

             reply	other threads:[~2019-01-21  8:40 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-21  8:40 Felipe Balbi [this message]
  -- strict thread matches above, loose matches on Subject: below --
2019-01-22  7:27 [10/14] usb: dwc3: gadget: remove wait_end_transfer Felipe Balbi
2019-01-22  2:18 Thinh Nguyen
2019-01-21 12:15 Felipe Balbi
2019-01-21  9:31 Felipe Balbi
2019-01-21  9:29 Felipe Balbi
2019-01-19  2:51 Thinh Nguyen
2019-01-18  6:57 Felipe Balbi
2018-11-14  9:18 Felipe Balbi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r2d6jtdc.fsf@linux.intel.com \
    --to=felipe.balbi@linux.intel.com \
    --cc=thinh.nguyen@synopsys.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.