All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wesley Cheng <wcheng@codeaurora.org>
To: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
	Felipe Balbi <balbi@kernel.org>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jackp@codeauora.org" <jackp@codeauora.org>
Subject: Re: [RFC][PATCH] usb: dwc3: usb: dwc3: Force stop EP0 transfers during pullup disable
Date: Thu, 19 Aug 2021 13:51:47 -0700	[thread overview]
Message-ID: <cdd9d624-00c6-1be3-5aad-e6f923d1e8d8@codeaurora.org> (raw)
In-Reply-To: <096a03e0-a913-7188-2c2e-d801d9617160@synopsys.com>

Hi Thinh,

On 8/18/2021 5:40 PM, Thinh Nguyen wrote:
> Wesley Cheng wrote:
>> Hi Thinh/Felipe,
>>
>> On 8/16/2021 12:13 PM, Wesley Cheng wrote:
>>> Hi Thinh,
>>>
>>> On 8/15/2021 5:33 PM, Thinh Nguyen wrote:
>>>> Felipe Balbi wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:
>>>>>>>>>>>>>> If this occurs, then the entire pullup disable routine is skipped and
>>>>>>>>>>>>>> proper cleanup and halting of the controller does not complete.
>>>>>>>>>>>>>> Instead of returning an error (which is ignored from the UDC
>>>>>>>>>>>>>> perspective), do what is mentioned in the comments and force the
>>>>>>>>>>>>>> transaction to complete and put the ep0state back to the SETUP phase.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>  drivers/usb/dwc3/ep0.c    | 4 ++--
>>>>>>>>>>>>>>  drivers/usb/dwc3/gadget.c | 6 +++++-
>>>>>>>>>>>>>>  drivers/usb/dwc3/gadget.h | 3 +++
>>>>>>>>>>>>>>  3 files changed, 10 insertions(+), 3 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
>>>>>>>>>>>>>> index 6587394..abfc42b 100644
>>>>>>>>>>>>>> --- a/drivers/usb/dwc3/ep0.c
>>>>>>>>>>>>>> +++ b/drivers/usb/dwc3/ep0.c
>>>>>>>>>>>>>> @@ -218,7 +218,7 @@ int dwc3_gadget_ep0_queue(struct usb_ep *ep, struct usb_request *request,
>>>>>>>>>>>>>>  	return ret;
>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> -static void dwc3_ep0_stall_and_restart(struct dwc3 *dwc)
>>>>>>>>>>>>>> +void dwc3_ep0_stall_and_restart(struct dwc3 *dwc)
>>>>>>>>>>>>>>  {
>>>>>>>>>>>>>>  	struct dwc3_ep		*dep;
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> @@ -1073,7 +1073,7 @@ void dwc3_ep0_send_delayed_status(struct dwc3 *dwc)
>>>>>>>>>>>>>>  	__dwc3_ep0_do_control_status(dwc, dwc->eps[direction]);
>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> -static void dwc3_ep0_end_control_data(struct dwc3 *dwc, struct dwc3_ep *dep)
>>>>>>>>>>>>>> +void dwc3_ep0_end_control_data(struct dwc3 *dwc, struct dwc3_ep *dep)
>>>>>>>>>>>>>>  {
>>>>>>>>>>>>>>  	struct dwc3_gadget_ep_cmd_params params;
>>>>>>>>>>>>>>  	u32			cmd;
>>>>>>>>>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>>>>>>>>>> index 54c5a08..a0e2e4d 100644
>>>>>>>>>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>>>>>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>>>>>>>>>> @@ -2437,7 +2437,11 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
>>>>>>>>>>>>>>  				msecs_to_jiffies(DWC3_PULL_UP_TIMEOUT));
>>>>>>>>>>>>>>  		if (ret == 0) {
>>>>>>>>>>>>>>  			dev_err(dwc->dev, "timed out waiting for SETUP phase\n");
>>>>>>>>>>>>>> -			return -ETIMEDOUT;
>>>>>>>>>>>>>> +			spin_lock_irqsave(&dwc->lock, flags);
>>>>>>>>>>>>>> +			dwc3_ep0_end_control_data(dwc, dwc->eps[0]);
>>>>>>>>>>>>>> +			dwc3_ep0_end_control_data(dwc, dwc->eps[1]);
>>>>>>>>>>>>>
>>>>>>>>>>>>> End transfer command takes time, need to wait for it to complete before
>>>>>>>>>>>>> issuing Start transfer again. Also, why restart again when it's about to
>>>>>>>>>>>>> be disconnected.
>>>>>>>>>>>>
>>>>>>>>>>>> I can try without restarting it again, and see if that works.  Instead
>>>>>>>>>>>> of waiting for the command complete event, can we set the ForceRM bit,
>>>>>>>>>>>> similar to what we do for dwc3_remove_requests()?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ForceRM=1 means that the controller will ignore updating the TRBs
>>>>>>>>>>> (including not clearing the HWO and remain transfer size). The driver
>>>>>>>>>>> still needs to wait for the command to complete before issuing Start
>>>>>>>>>>> Transfer command. Otherwise Start Transfer won't go through. If we know
>>>>>>>>>>> that we're not going to issue Start Transfer any time soon, then we may
>>>>>>>>>>> be able to get away with ignoring End Transfer command completion.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I see.  Currently, in the place that we do use
>>>>>>>>>> dwc3_ep0_end_control_data(), its followed by
>>>>>>>>>> dwc3_ep0_stall_and_restart() which would execute start transfer.  For
>>>>>>>>>
>>>>>>>>> That doesn't look right. You can try to see if it can recover from a
>>>>>>>>> control write request. Often time we do control read and not write.
>>>>>>>>> (i.e. try to End Transfer and immediately Start Transfer on the same
>>>>>>>>> direction control endpoint).
>>>>>>>>>
>>>>>>>> OK, I can try, but just to clarify, I was referring to how it was being
>>>>>>>> done in:
>>>>>>>>
>>>>>>>> static void dwc3_ep0_xfernotready(struct dwc3 *dwc,
>>>>>>>> 		const struct dwc3_event_depevt *event)
>>>>>>>> {
>>>>>>>> ...
>>>>>>>> 		if (dwc->ep0_expect_in != event->endpoint_number) {
>>>>>>>> 			struct dwc3_ep	*dep = dwc->eps[dwc->ep0_expect_in];
>>>>>>>>
>>>>>>>> 			dev_err(dwc->dev, "unexpected direction for Data Phase\n");
>>>>>>>> 			dwc3_ep0_end_control_data(dwc, dep);
>>>>>>>> 			dwc3_ep0_stall_and_restart(dwc);
>>>>>>>> 			return;
>>>>>>>> 		}
>>>>>>>>
>>>>>>
>>>>>> Looking at this snippet again, it looks wrong. For control write
>>>>>> unexpected direction, if the driver hasn't setup and started the DATA
>>>>>> phase yet, then it's fine, but there is a problem if it did.
>>>>>>
>>>>>> Since dwc3_ep0_end_control_data() doesn't issue End Transfer command to
>>>>>> ep0 due to the resource_index check, it doesn't follow the control
>>>>>
>>>>> IIRC resource_index is always non-zero, so the command should be
>>>>
>>>> No, resource_index for ep0out is 0, ep0in is 1. You can check from any
>>>> of the driver tracepoint log for the return value of Start Transfer
>>>> command for the resource index of ep0. There could be a mixed up with
>>>> the undocumented return value of Set Endpoint Transfer Resource command
>>>> before when this code was written, don't mix up with that.
>>>>
>>>>> triggered. If you have access to a Lecroy USB Trainer, could you script
>>>>> this very scenario for verification?
>>>>
>>>> For anyone who wants to work on this, we don't need a LeCroy USB
>>>> trainer. If you use xhci host, just modify the xhci-ring.c to queue a
>>>> wrong direction DATA phase TRB of a particular control write request
>>>> test, and continue with the next control requests.
>>>>
>>> Let me give this a try since I already have a modified (broken :)) XHCI
>>> stack.
>>>
>>> Thanks
>>> Wesley Cheng
>>
>> Sorry for the late response.  I was trying to get a reliable change to
>> get the issue to reproduce.  I think I was able to find a set up which
>> will generate the unexpected direction issue.  I'll try my best to
>> summarize the traces here:
>>
>> Set up:
>> - Modified XHCI stack to queue an IN TRB for a three stage CONTROL OUT
>> transaction (OUT data stage)
>> - Device is using RNDIS, as that has interface specific commands (ie
>> SEND_ENCAPSUALTED messages)
>>
>> Kernel Log:
>> [ 1255.312870] msm-usb-hsphy 88e3000.hsphy: Avail curr from USB = 900
>> [ 1255.315300] dwc3_ep0_xfernotready event status = 1
>> [ 1255.315429] dwc3_ep0_xfernotready event status = 2
>> [ 1255.316390] android_work: sent uevent USB_STATE=CONFIGURED
>> [ 1255.317467] dwc3_ep0_xfernotready event status = 1
>> [ 1255.317588] dwc3_ep0_xfernotready event status = 2
>> [ 1255.334196] dwc3_ep0_xfernotready event status = 1
>> [ 1255.334217] dwc3 a600000.dwc3: unexpected direction for Data Phase
>> [ 1255.334311] rndis_msg_parser: unknown RNDIS message 0x0052033A len
>> 4456526
>> [ 1255.334328] RNDIS command error -524, 0/24
>> [ 1255.334369] ------------[ cut here ]------------
>> [ 1255.334377] dwc3 a600000.dwc3: No resource for ep0out
>> [ 1255.334440] WARNING: CPU: 0 PID: 8364 at
>> drivers/usb/dwc3/gadget.c:360 dwc3_send_gadget_ep_cmd+0x3c4/0x96c
>> ...
>>  1255.336163] WARNING: CPU: 0 PID: 8364 at drivers/usb/dwc3/ep0.c:281
>> dwc3_ep0_out_start+0x108/0x144
>>
>> So the kernel log does indeed show the concern mentioned by Thinh, where
>> after ending the transfer, we do see the dwc3_ep0_out_start() fail due
>> to no xfer resource.
>>
> 
> No, there's no end transfer command seen, and it's expected from code
> review.
> 
Sorry, yes that's correct.
>> ftrace:
>> <...>-8364    [000] d..1  1255.333988: dwc3_ctrl_req: Get Interface
>> Status(Intf = 0, Length = 24)
>> <...>-8364    [000] d..1  1255.334115: dwc3_prepare_trb: ep0out: trb
>> ffffffc01bffd000 (E1:D0) buf 00000000efb76000 size 24 ctrl 00000455
>> (HlCs:Sc:data)
>> <...>-8364    [000] d..1  1255.334128: dwc3_prepare_trb: ep0out: trb
>> ffffffc01bffd010 (E1:D0) buf 00000000efff9000 size 488 ctrl 00000c53
>> (HLcs:SC:data)
>> <...>-8364    [000] d..1  1255.334166: dwc3_gadget_ep_cmd: ep0out: cmd
>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>> Successful
>> <...>-8364    [000] d..1  1255.334239: dwc3_gadget_ep_cmd: ep0out: cmd
>> 'Set Stall' [404] params 00000000 00000000 00000000 --> status: Successful
>> <...>-8364    [000] d..1  1255.334291: dwc3_gadget_giveback: ep0out: req
>> ffffff8891724e00 length 0/24 zsI ==> -104
> 
> It detected wrong direction and sets STALL here, but no End Transfer
> command.
> 
>> <...>-8364    [000] d..1  1255.334339: dwc3_prepare_trb: ep0out: trb
>> ffffffc01bffd010 (E1:D0) buf 00000000efffa000 size 8 ctrl 00000c23
>> (HLcs:SC:setup)
> 
> The driver overwrote the active TRB with a new SETUP TRB.
> 
>> <...>-8364    [000] d..1  1255.336099: dwc3_gadget_ep_cmd: ep0out: cmd
>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status: No
>> Resource
> 
> The dep->flags DWC3_EP_TRANSFER_STARTED got cleared. When it tries to
> issue a Start Transfer a SETUP transfer, the command will fail with no
> resource because the endpoint never ended properly.
> 
>> <...>-8364    [000] d..1  1255.357594: dwc3_ctrl_req: 00 60 b7 ef 00 00
>> 00 00
> 
> Here is iffy because the behavior is undefined. The driver overwrote the
> previous TRB. The Start Transfer command didn't go through, so the
> controller still has the old TRB setup in its cache. It gets the next
> SETUP request completion anyway because the SETUP stage is a short
> packet. The driver updated its state that it's expecting the SETUP
> stage, and it doesn't check the TRB write back buffer size for more or
> less than 8 bytes or whether this is valid data.
> 
> Wesley, is this SETUP packet the correct RNDIS control request?
> 
That doesn't look to be a RNDIS control packet.  I collected a bus
analyzer log as well w/ this snippet, and nothing was transmitted from
the host side during the data stage.  Subsequent SETUP transactions were
standard USB descriptors (GET string descriptors).
>> <...>-8364    [000] d..1  1255.357680: dwc3_gadget_ep_cmd: ep0out: cmd
>> 'Set Stall' [404] params 00000000 00000000 00000000 --> status: Successful
> 
> The device Set Stall on an unrecognized request, probably from the
> application, which can be normal.
> 
>> <...>-8364    [000] d..1  1255.357696: dwc3_prepare_trb: ep0out: trb
>> ffffffc01bffd010 (E1:D0) buf 00000000efffa000 size 8 ctrl 00000c23
>> (HLcs:SC:setup)
>> <...>-8364    [000] d..1  1255.357722: dwc3_gadget_ep_cmd: ep0out: cmd
>> 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
>> Successful
>> <...>-224     [005] d..1  1266.313014: dwc3_gadget_ep_cmd: ep2out: cmd
>> 'End Transfer' [40c08] params 00000000 00000000 00000000 --> status:
>> Timed Out
>>
>> Thanks
>> Wesley Cheng
>>
> 
> Thanks for the test Wesley
> 
> BR,
> Thinh
> 

So back to the original issue, which was the SETUP timeout during pullup
disable, I went ahead and collected the ftrace w/ a change to just
remove the return statement:

ret = wait_for_completion_timeout(&dwc->ep0_in_setup,
				msecs_to_jiffies(DWC3_PULL_UP_TIMEOUT));
if (ret == 0) {
	dev_err(dwc->dev, "timed out waiting for SETUP phase\n");
	return -ETIMEDOUT;
}

This would allow the __dwc3_gadget_stop() to disable EP0/1, which would
call dwc3_gadget_stop_active_transfer() to send the end xfer command:

//Packet which will not queue a data stage (injected failure)
<...>-7098    [003] d..1   346.560711: dwc3_ctrl_req: Get String
Descriptor(Index = 3, Length = 2)

//Prepare IN data stage TRB
<...>-7098    [003] d..1   346.560865: dwc3_prepare_trb: ep0in: trb
ffffffc011edd000 (E0:D0) buf 00000000effcc000 size 2 ctrl 00000c53
(HLcs:SC:data)
<...>-7098    [003] d..1   346.560915: dwc3_gadget_ep_cmd: ep0in: cmd
'Start Transfer' [406] params 00000000 efffa000 00000000 --> status:
Successful

//Pullup disable here kicks in - __dwc3_gadget_stop()
<...>-224     [006] d..1   348.607367: dwc3_gadget_ep_disable: ep0out:
mps 64/512 streams 0 burst 1 ring 0/0 flags E:swbp:>
<...>-224     [006] d..1   348.607430: dwc3_gadget_giveback: ep0out: req
ffffff884e5f5500 length 0/2 zsI ==> -108
<...>-224     [006] d..1   348.607444: dwc3_gadget_ep_disable: ep0in:
mps 64/512 streams 0 burst 1 ring 0/0 flags E:swBp:<

//End transfer on the pending EP0 in TRB queued previously
<...>-224     [006] d..1   348.607484: dwc3_gadget_ep_cmd: ep0in: cmd
'End Transfer' [10c08] params 00000000 00000000 00000000 --> status:
Successful
<...>-224     [006] dN.1   348.607788: usb_gadget_vbus_draw: speed 3/6
state 6 0mA [sg:self-powered:activated:disconnected] --> 0
<...>-224     [006] ....   348.616888: usb_gadget_disconnect: speed 3/6
state 6 0mA [sg:self-powered:activated:disconnected] --> 0

Does this look ok, Thinh?

Thanks
Wesley Cheng

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

  reply	other threads:[~2021-08-19 20:51 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-11  2:23 [RFC][PATCH] usb: dwc3: usb: dwc3: Force stop EP0 transfers during pullup disable Wesley Cheng
2021-08-12  0:47 ` Thinh Nguyen
2021-08-12 17:12   ` Wesley Cheng
2021-08-12 21:31     ` Thinh Nguyen
2021-08-12 23:19       ` Wesley Cheng
2021-08-13 23:21         ` Thinh Nguyen
2021-08-13 23:49           ` Wesley Cheng
2021-08-14  0:30             ` Thinh Nguyen
2021-08-15  0:26               ` Thinh Nguyen
2021-08-15  6:06                 ` Felipe Balbi
2021-08-16  0:33                   ` Thinh Nguyen
2021-08-16  5:15                     ` Felipe Balbi
2021-08-17  1:25                       ` Thinh Nguyen
2021-08-16 19:13                     ` Wesley Cheng
2021-08-18 21:24                       ` Wesley Cheng
2021-08-19  0:40                         ` Thinh Nguyen
2021-08-19 20:51                           ` Wesley Cheng [this message]
2021-08-20  1:52                             ` Thinh Nguyen
2021-08-20  3:05                               ` Wesley Cheng
2021-08-19  5:36                         ` Felipe Balbi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cdd9d624-00c6-1be3-5aad-e6f923d1e8d8@codeaurora.org \
    --to=wcheng@codeaurora.org \
    --cc=Thinh.Nguyen@synopsys.com \
    --cc=balbi@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jackp@codeauora.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.