All of lore.kernel.org
 help / color / mirror / Atom feed
* dwc3: unusual handling of setup requests with wLength == 0
@ 2023-08-18  0:15 Andrey Konovalov
  2023-08-18  1:08 ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Andrey Konovalov @ 2023-08-18  0:15 UTC (permalink / raw)
  To: Alan Stern, Thinh Nguyen; +Cc: Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

Hi Alan and Thinh,

I have been testing Raw Gadget with the dwc3 UDC driver and stumbled
upon an issue related to how dwc3 handles setup requests with wLength
== 0.

When running a simple Raw Gadget-based keyboard emulator [1],
everything works as expected until the point when the host sends a
SET_CONFIGURATION request, which has wLength == 0.

For setup requests with wLength != 0, just like the other UDC drivers
I tested, dwc3 calls the gadget driver's ->setup() callback and then
waits until the gadget driver queues an URB to EP0 as a response.

However, for a setup request with wLength == 0, dwc3 does not wait
until the gadget driver queues an URB to ack the transfer. It appears
that dwc3 just acks the request internally and then proceeds with
calling the ->setup() callback for the next request received from the
host. This confuses Raw Gadget, as it does not expect to get a new
->setup() call before it explicitly acks the previous one by queuing
an URB. As a result, the emulation fails.

I suspect this issue has not been observed with other gadget drivers,
as they queue an URB immediately after receiving a ->setup() call:
dwc3 appears to somehow correctly handle this internally even though
it acks the transfer by itself. But the timings with Raw Gadget are
different, as it requires userspace to ack the transfer. Sometimes
though, the Raw Gadget-based emulator also manages to queue an URB
before the next request is received from the host and the enumeration
continues properly (until the next request with wLength == 0).

What do you think would be the best approach to deal with this?

Can this be considered a bug in dwc3 that should be fixed? There's a
seemingly related comment in dwc3 code [2], but I'm not familiar
enough with its internals to understand whether this is what leads to
the issue I'm seeing.

Or should I adapt Raw Gadget to handle this unusual dwc3 behavior?
This might be tricky to do, as I cannot change the existing userspace
API.

On a side note, as an experiment, I tried returning
USB_GADGET_DELAYED_STATUS from the Raw Gadget's ->setup() callback if
the UDC driver calls it too early: some UDC drivers, including dwc3,
appear to contain a special handling for this return value. However,
that didn't work out. Perhaps, I misunderstand the meaning of this
value.

Thank you!

[1] https://github.com/xairy/raw-gadget/blob/master/examples/keyboard.c
[2] https://elixir.bootlin.com/linux/v6.5-rc6/source/drivers/usb/dwc3/ep0.c#L145

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18  0:15 dwc3: unusual handling of setup requests with wLength == 0 Andrey Konovalov
@ 2023-08-18  1:08 ` Thinh Nguyen
  2023-08-18  2:37   ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-18  1:08 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Alan Stern, Thinh Nguyen, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

Hi,

On Fri, Aug 18, 2023, Andrey Konovalov wrote:
> Hi Alan and Thinh,
> 
> I have been testing Raw Gadget with the dwc3 UDC driver and stumbled
> upon an issue related to how dwc3 handles setup requests with wLength
> == 0.
> 
> When running a simple Raw Gadget-based keyboard emulator [1],
> everything works as expected until the point when the host sends a
> SET_CONFIGURATION request, which has wLength == 0.
> 
> For setup requests with wLength != 0, just like the other UDC drivers
> I tested, dwc3 calls the gadget driver's ->setup() callback and then
> waits until the gadget driver queues an URB to EP0 as a response.

For the lack of better term, can we use "request" or "usb_request"
instead of URB for gadget side, I get confused with the host side
whenever we mention URB.

> 
> However, for a setup request with wLength == 0, dwc3 does not wait
> until the gadget driver queues an URB to ack the transfer. It appears
> that dwc3 just acks the request internally and then proceeds with
> calling the ->setup() callback for the next request received from the

It depends on the bRequest. It should not proceed to ->setup() unless
the gadget driver already setups the request for it.

> host. This confuses Raw Gadget, as it does not expect to get a new
> ->setup() call before it explicitly acks the previous one by queuing
> an URB. As a result, the emulation fails.

If the host intent is to send a 3-stage control request with a 0-length
data packet, the gadget driver needs to return USB_GADGET_DELAYED_STATUS
to prepare a 0-length request. For SET_CONFIGURATION, we don't expect
a data phase, why should the gadget driver queue a 0-length data?

> 
> I suspect this issue has not been observed with other gadget drivers,
> as they queue an URB immediately after receiving a ->setup() call:
> dwc3 appears to somehow correctly handle this internally even though
> it acks the transfer by itself. But the timings with Raw Gadget are
> different, as it requires userspace to ack the transfer. Sometimes
> though, the Raw Gadget-based emulator also manages to queue an URB
> before the next request is received from the host and the enumeration
> continues properly (until the next request with wLength == 0).
> 
> What do you think would be the best approach to deal with this?

The communication should be clearly defined. That is, the dwc3 needs to
know if this is a 3-stage or 2-stage control transfer. It knows about
the standard requests, but not the vendor/non-standard ones. If the raw
gadget defined some unknown OUT request, it needs to tell dwc3 whether
it should expect the data stage or not.

BR,
Thinh

> 
> Can this be considered a bug in dwc3 that should be fixed? There's a
> seemingly related comment in dwc3 code [2], but I'm not familiar
> enough with its internals to understand whether this is what leads to
> the issue I'm seeing.
> 
> Or should I adapt Raw Gadget to handle this unusual dwc3 behavior?
> This might be tricky to do, as I cannot change the existing userspace
> API.
> 
> On a side note, as an experiment, I tried returning
> USB_GADGET_DELAYED_STATUS from the Raw Gadget's ->setup() callback if
> the UDC driver calls it too early: some UDC drivers, including dwc3,
> appear to contain a special handling for this return value. However,
> that didn't work out. Perhaps, I misunderstand the meaning of this
> value.
> 
> Thank you!
> 
> [1] https://urldefense.com/v3/__https://github.com/xairy/raw-gadget/blob/master/examples/keyboard.c__;!!A4F2R9G_pg!fsksN9-eooWS7Uui8MfCcBkff8awG2ovwl6UaPuJ4_v50Ny5s-WQ4YuQht1ACRl6eDYZ4o-gkVlF6l82C8UT58e7$ 
> [2] https://urldefense.com/v3/__https://elixir.bootlin.com/linux/v6.5-rc6/source/drivers/usb/dwc3/ep0.c*L145__;Iw!!A4F2R9G_pg!fsksN9-eooWS7Uui8MfCcBkff8awG2ovwl6UaPuJ4_v50Ny5s-WQ4YuQht1ACRl6eDYZ4o-gkVlF6l82C6Q3wZfW$ 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18  1:08 ` Thinh Nguyen
@ 2023-08-18  2:37   ` Alan Stern
  2023-08-18  3:10     ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-18  2:37 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Fri, Aug 18, 2023 at 01:08:19AM +0000, Thinh Nguyen wrote:
> Hi,
> 
> On Fri, Aug 18, 2023, Andrey Konovalov wrote:
> > Hi Alan and Thinh,
> > 
> > I have been testing Raw Gadget with the dwc3 UDC driver and stumbled
> > upon an issue related to how dwc3 handles setup requests with wLength
> > == 0.
> > 
> > When running a simple Raw Gadget-based keyboard emulator [1],
> > everything works as expected until the point when the host sends a
> > SET_CONFIGURATION request, which has wLength == 0.
> > 
> > For setup requests with wLength != 0, just like the other UDC drivers
> > I tested, dwc3 calls the gadget driver's ->setup() callback and then
> > waits until the gadget driver queues an URB to EP0 as a response.
> 
> For the lack of better term, can we use "request" or "usb_request"
> instead of URB for gadget side, I get confused with the host side
> whenever we mention URB.
> 
> > 
> > However, for a setup request with wLength == 0, dwc3 does not wait
> > until the gadget driver queues an URB to ack the transfer. It appears
> > that dwc3 just acks the request internally and then proceeds with
> > calling the ->setup() callback for the next request received from the
> 
> It depends on the bRequest. It should not proceed to ->setup() unless
> the gadget driver already setups the request for it.

Let's see if I understand what you're saying.  Some control transfers 
are handled directly by the UDC driver (things like SET_ADDRESS or 
CLEAR_HALT).  For these transfers, the ->setup() callback is not invoked 
and the gadget driver is completely unaware of them.  But for all other 
control transfers, the ->setup() callback _is_ invoked.

Is that what you meant?

> > host. This confuses Raw Gadget, as it does not expect to get a new
> > ->setup() call before it explicitly acks the previous one by queuing
> > an URB. As a result, the emulation fails.
> 
> If the host intent is to send a 3-stage control request with a 0-length
> data packet, the gadget driver needs to return USB_GADGET_DELAYED_STATUS
> to prepare a 0-length request. For SET_CONFIGURATION, we don't expect
> a data phase, why should the gadget driver queue a 0-length data?

The USB-2 spec prohibits 3-stage control requests with wLength == 0 (see 
sections 9.3.1 and 9.3.5).  Therefore the host's intent can never be to 
send a 3-stage control request with a 0-length Data-stage packet.

> > I suspect this issue has not been observed with other gadget drivers,
> > as they queue an URB immediately after receiving a ->setup() call:
> > dwc3 appears to somehow correctly handle this internally even though
> > it acks the transfer by itself. But the timings with Raw Gadget are
> > different, as it requires userspace to ack the transfer. Sometimes
> > though, the Raw Gadget-based emulator also manages to queue an URB
> > before the next request is received from the host and the enumeration
> > continues properly (until the next request with wLength == 0).
> > 
> > What do you think would be the best approach to deal with this?
> 
> The communication should be clearly defined. That is, the dwc3 needs to
> know if this is a 3-stage or 2-stage control transfer. It knows about
> the standard requests, but not the vendor/non-standard ones. If the raw
> gadget defined some unknown OUT request, it needs to tell dwc3 whether
> it should expect the data stage or not.

The communication _is_ clearly defined.  Here's how it works:

For control transfers that aren't handled directly by the UDC, the UDC 
driver invokes the ->setup() callback and waits for the gadget driver to 
queue a request.  If the SETUP packet's wLength value is > 0 then the 
gadget driver queues an IN or OUT request (depending on the transfer's 
direction) and the UDC waits for the host to transfer the Data stage 
packets, completing the request.  After this happens, the UDC driver 
automatically queues an internal 0-length request in the opposite 
direction for the Status stage.  Data-stage transfers are not allowed to 
span more than one usb_request.

(IMO that automatic action is a design flaw; the UDC driver should wait 
for the gadget driver to explictly queue a 0-length request or a STALL 
instead of doing it automatically.)

But if the SETUP packet's wLength value is 0 then when the gadget driver 
is ready, it queues a 0-length IN request which will act as the Status 
stage.  In this situation the UDC does not automatically create a 
Status-stage request.

Note that the gadget driver is allowed to queue its various requests 
either while the ->setup() callback is running or after it has returned.

(Another design flaw is that this design doesn't specify what should 
happen if the UDC receives another SETUP packet from the host before the 
Status stage completes.  By sending another SETUP packet, the host is 
indicating that the earlier control transfer has been aborted.  
Presumably the UDC driver will complete all the outstanding requests 
with an error status, but there's a potential race in the gadget driver 
between queuing a request for the first transfer and executing the 
->setup() callback for the second transfer.)

If the dwc3 UDC driver doesn't behave this way then it needs to be 
fixed.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18  2:37   ` Alan Stern
@ 2023-08-18  3:10     ` Thinh Nguyen
  2023-08-18  3:26       ` Thinh Nguyen
  2023-08-18  3:42       ` Alan Stern
  0 siblings, 2 replies; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-18  3:10 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Thu, Aug 17, 2023, Alan Stern wrote:
> On Fri, Aug 18, 2023 at 01:08:19AM +0000, Thinh Nguyen wrote:
> > Hi,
> > 
> > On Fri, Aug 18, 2023, Andrey Konovalov wrote:
> > > Hi Alan and Thinh,
> > > 
> > > I have been testing Raw Gadget with the dwc3 UDC driver and stumbled
> > > upon an issue related to how dwc3 handles setup requests with wLength
> > > == 0.
> > > 
> > > When running a simple Raw Gadget-based keyboard emulator [1],
> > > everything works as expected until the point when the host sends a
> > > SET_CONFIGURATION request, which has wLength == 0.
> > > 
> > > For setup requests with wLength != 0, just like the other UDC drivers
> > > I tested, dwc3 calls the gadget driver's ->setup() callback and then
> > > waits until the gadget driver queues an URB to EP0 as a response.
> > 
> > For the lack of better term, can we use "request" or "usb_request"
> > instead of URB for gadget side, I get confused with the host side
> > whenever we mention URB.
> > 
> > > 
> > > However, for a setup request with wLength == 0, dwc3 does not wait
> > > until the gadget driver queues an URB to ack the transfer. It appears
> > > that dwc3 just acks the request internally and then proceeds with
> > > calling the ->setup() callback for the next request received from the
> > 
> > It depends on the bRequest. It should not proceed to ->setup() unless
> > the gadget driver already setups the request for it.
> 
> Let's see if I understand what you're saying.  Some control transfers 
> are handled directly by the UDC driver (things like SET_ADDRESS or 
> CLEAR_HALT).  For these transfers, the ->setup() callback is not invoked 
> and the gadget driver is completely unaware of them.  But for all other 
> control transfers, the ->setup() callback _is_ invoked.
> 
> Is that what you meant?

That's not what I meant.

I was referring to the next request. It should not be processed until
the first request is completed. Depending on the type of request, if
there's a delayed_status, the dwc3 driver will not prepare for the
Status stage and Setup stage (after status completion) to proceed to the
_next_ ->setup callback.

My understanding from the described problem is that somehow dwc3
processes the next request immediately without waiting for the raw
gadget preparing the data stage.


> 
> > > host. This confuses Raw Gadget, as it does not expect to get a new
> > > ->setup() call before it explicitly acks the previous one by queuing
> > > an URB. As a result, the emulation fails.
> > 
> > If the host intent is to send a 3-stage control request with a 0-length
> > data packet, the gadget driver needs to return USB_GADGET_DELAYED_STATUS
> > to prepare a 0-length request. For SET_CONFIGURATION, we don't expect
> > a data phase, why should the gadget driver queue a 0-length data?
> 
> The USB-2 spec prohibits 3-stage control requests with wLength == 0 (see 
> sections 9.3.1 and 9.3.5).  Therefore the host's intent can never be to 
> send a 3-stage control request with a 0-length Data-stage packet.

Right. Forgot about that, but my point was about the sequential flow of
the control transfer.

> 
> > > I suspect this issue has not been observed with other gadget drivers,
> > > as they queue an URB immediately after receiving a ->setup() call:
> > > dwc3 appears to somehow correctly handle this internally even though
> > > it acks the transfer by itself. But the timings with Raw Gadget are
> > > different, as it requires userspace to ack the transfer. Sometimes
> > > though, the Raw Gadget-based emulator also manages to queue an URB
> > > before the next request is received from the host and the enumeration
> > > continues properly (until the next request with wLength == 0).
> > > 
> > > What do you think would be the best approach to deal with this?
> > 
> > The communication should be clearly defined. That is, the dwc3 needs to
> > know if this is a 3-stage or 2-stage control transfer. It knows about
> > the standard requests, but not the vendor/non-standard ones. If the raw
> > gadget defined some unknown OUT request, it needs to tell dwc3 whether
> > it should expect the data stage or not.
> 
> The communication _is_ clearly defined.  Here's how it works:
> 
> For control transfers that aren't handled directly by the UDC, the UDC 
> driver invokes the ->setup() callback and waits for the gadget driver to 
> queue a request.  If the SETUP packet's wLength value is > 0 then the 
> gadget driver queues an IN or OUT request (depending on the transfer's 
> direction) and the UDC waits for the host to transfer the Data stage 
> packets, completing the request.  After this happens, the UDC driver 
> automatically queues an internal 0-length request in the opposite 
> direction for the Status stage.  Data-stage transfers are not allowed to 
> span more than one usb_request.

I was talking in context of 0-length transfer (albeit I forgot about the
special case of control OUT doesn't have 3-stage).

If it's a vendor request 0-length transfer, without responding with
USB_GADGET_DELAYED_STATUS, the dwc3 will proceed with preparing the
status stage.

> 
> (IMO that automatic action is a design flaw; the UDC driver should wait 
> for the gadget driver to explictly queue a 0-length request or a STALL 
> instead of doing it automatically.)

Would every UDC has this capability? I recalled some aren't capable of
delayed_status.

> 
> But if the SETUP packet's wLength value is 0 then when the gadget driver 
> is ready, it queues a 0-length IN request which will act as the Status 
> stage.  In this situation the UDC does not automatically create a 
> Status-stage request.
> 
> Note that the gadget driver is allowed to queue its various requests 
> either while the ->setup() callback is running or after it has returned.
> 
> (Another design flaw is that this design doesn't specify what should 
> happen if the UDC receives another SETUP packet from the host before the 
> Status stage completes.  By sending another SETUP packet, the host is 
> indicating that the earlier control transfer has been aborted.  
> Presumably the UDC driver will complete all the outstanding requests 
> with an error status, but there's a potential race in the gadget driver 
> between queuing a request for the first transfer and executing the 
> ->setup() callback for the second transfer.)

If there's another SETUP packet coming while there's a pending control
transfer, for dwc3 UDC, the pending control TRB should be completed with
a Setup_pending status indicating aborted control transfer for dwc3
driver to handle that.

> 
> If the dwc3 UDC driver doesn't behave this way then it needs to be 
> fixed.
> 

BR,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18  3:10     ` Thinh Nguyen
@ 2023-08-18  3:26       ` Thinh Nguyen
  2023-08-18  3:42       ` Alan Stern
  1 sibling, 0 replies; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-18  3:26 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Fri, Aug 18, 2023, Thinh Nguyen wrote:
> On Thu, Aug 17, 2023, Alan Stern wrote:
> > (Another design flaw is that this design doesn't specify what should 
> > happen if the UDC receives another SETUP packet from the host before the 
> > Status stage completes.  By sending another SETUP packet, the host is 
> > indicating that the earlier control transfer has been aborted.  
> > Presumably the UDC driver will complete all the outstanding requests 
> > with an error status, but there's a potential race in the gadget driver 
> > between queuing a request for the first transfer and executing the 
> > ->setup() callback for the second transfer.)
> 
> If there's another SETUP packet coming while there's a pending control
> transfer, for dwc3 UDC, the pending control TRB should be completed with
> a Setup_pending status indicating aborted control transfer for dwc3
> driver to handle that.
> 

There may be one special case where dwc3 UDC may hit a race is when
there's back-to-back Setup Packet happening faster than the system can
DMA out the SETUP data of the aborted control transfer from the
controller FIFO. But this scenario should be very rare and should only
occur in simulation.

BR,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18  3:10     ` Thinh Nguyen
  2023-08-18  3:26       ` Thinh Nguyen
@ 2023-08-18  3:42       ` Alan Stern
  2023-08-18 19:49         ` Thinh Nguyen
  1 sibling, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-18  3:42 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Fri, Aug 18, 2023 at 03:10:48AM +0000, Thinh Nguyen wrote:
> On Thu, Aug 17, 2023, Alan Stern wrote:
> > On Fri, Aug 18, 2023 at 01:08:19AM +0000, Thinh Nguyen wrote:
> > > Hi,
> > > 
> > > On Fri, Aug 18, 2023, Andrey Konovalov wrote:
> > > > Hi Alan and Thinh,
> > > > 
> > > > I have been testing Raw Gadget with the dwc3 UDC driver and stumbled
> > > > upon an issue related to how dwc3 handles setup requests with wLength
> > > > == 0.
> > > > 
> > > > When running a simple Raw Gadget-based keyboard emulator [1],
> > > > everything works as expected until the point when the host sends a
> > > > SET_CONFIGURATION request, which has wLength == 0.
> > > > 
> > > > For setup requests with wLength != 0, just like the other UDC drivers
> > > > I tested, dwc3 calls the gadget driver's ->setup() callback and then
> > > > waits until the gadget driver queues an URB to EP0 as a response.
> > > 
> > > For the lack of better term, can we use "request" or "usb_request"
> > > instead of URB for gadget side, I get confused with the host side
> > > whenever we mention URB.
> > > 
> > > > 
> > > > However, for a setup request with wLength == 0, dwc3 does not wait
> > > > until the gadget driver queues an URB to ack the transfer. It appears
> > > > that dwc3 just acks the request internally and then proceeds with
> > > > calling the ->setup() callback for the next request received from the
> > > 
> > > It depends on the bRequest. It should not proceed to ->setup() unless
> > > the gadget driver already setups the request for it.
> > 
> > Let's see if I understand what you're saying.  Some control transfers 
> > are handled directly by the UDC driver (things like SET_ADDRESS or 
> > CLEAR_HALT).  For these transfers, the ->setup() callback is not invoked 
> > and the gadget driver is completely unaware of them.  But for all other 
> > control transfers, the ->setup() callback _is_ invoked.
> > 
> > Is that what you meant?
> 
> That's not what I meant.
> 
> I was referring to the next request. It should not be processed until
> the first request is completed. Depending on the type of request, if
> there's a delayed_status, the dwc3 driver will not prepare for the
> Status stage and Setup stage (after status completion) to proceed to the
> _next_ ->setup callback.
> 
> My understanding from the described problem is that somehow dwc3
> processes the next request immediately without waiting for the raw
> gadget preparing the data stage.

Um.  This is one of the design flaws I mentioned: a new SETUP packet 
arriving before the old control transfer is finished.  The USB spec 
requires devices to accept the new SETUP packet and abort the old 
transfer.  So in this case, processing the next request immediately is 
the right thing to do.

One question is why Andrey is observing a new ->setup() callback 
happening so soon?  The host is supposed to allow a fairly long time for 
standard control requests to complete.  If the userspace component of 
the Raw Gadget takes too long to act, the transfer could time out and be 
cancelled on the host.  But "too long" means several seconds -- is that 
really what's going on here?

> I was talking in context of 0-length transfer (albeit I forgot about the
> special case of control OUT doesn't have 3-stage).
> 
> If it's a vendor request 0-length transfer, without responding with
> USB_GADGET_DELAYED_STATUS, the dwc3 will proceed with preparing the
> status stage.

This may be a holdover from the early days of the Gadget subsystem.  My 
memory from back then isn't very good; I vaguely recall that the first 
UDC drivers would queue their automatic Status-stage requests if wLength 
was 0 and ->setup() returned 0 (which would explain why 
USB_GADGET_DELAYED_STATUS had to be invented).  Unless I'm completely 
confused, that's not how UDC drivers are supposed to act now.

> > (IMO that automatic action is a design flaw; the UDC driver should wait 
> > for the gadget driver to explictly queue a 0-length request or a STALL 
> > instead of doing it automatically.)
> 
> Would every UDC has this capability? I recalled some aren't capable of
> delayed_status.

In those cases the UDC driver would just have to do the best it can.  
Very few modern USB device controllers should have this limitation.

> > (Another design flaw is that this design doesn't specify what should 
> > happen if the UDC receives another SETUP packet from the host before the 
> > Status stage completes.  By sending another SETUP packet, the host is 
> > indicating that the earlier control transfer has been aborted.  
> > Presumably the UDC driver will complete all the outstanding requests 
> > with an error status, but there's a potential race in the gadget driver 
> > between queuing a request for the first transfer and executing the 
> > ->setup() callback for the second transfer.)
> 
> If there's another SETUP packet coming while there's a pending control
> transfer, for dwc3 UDC, the pending control TRB should be completed with
> a Setup_pending status indicating aborted control transfer for dwc3
> driver to handle that.

Right.  The difficulty doesn't involve the communication between the HCD 
and the UDC hardware; it involves the communication between the UDC 
driver and the gadget driver.  Somehow they need to synchronize so that 
when the gadget driver queues a usb_request, the UDC driver can tell 
whether the request was meant for the earlier aborted control transfer 
or the new active one.  This can matter if the gadget driver has a 
separate control thread (a work routine or a kthread, for example) that 
could be queuing requests while the ->setup() callback is running.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18  3:42       ` Alan Stern
@ 2023-08-18 19:49         ` Thinh Nguyen
  2023-08-18 20:46           ` Thinh Nguyen
  2023-08-18 23:06           ` Alan Stern
  0 siblings, 2 replies; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-18 19:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Thu, Aug 17, 2023, Alan Stern wrote:
> On Fri, Aug 18, 2023 at 03:10:48AM +0000, Thinh Nguyen wrote:
> > On Thu, Aug 17, 2023, Alan Stern wrote:
> > > On Fri, Aug 18, 2023 at 01:08:19AM +0000, Thinh Nguyen wrote:
> > > > Hi,
> > > > 
> > > > On Fri, Aug 18, 2023, Andrey Konovalov wrote:
> > > > > Hi Alan and Thinh,
> > > > > 
> > > > > I have been testing Raw Gadget with the dwc3 UDC driver and stumbled
> > > > > upon an issue related to how dwc3 handles setup requests with wLength
> > > > > == 0.
> > > > > 
> > > > > When running a simple Raw Gadget-based keyboard emulator [1],
> > > > > everything works as expected until the point when the host sends a
> > > > > SET_CONFIGURATION request, which has wLength == 0.
> > > > > 
> > > > > For setup requests with wLength != 0, just like the other UDC drivers
> > > > > I tested, dwc3 calls the gadget driver's ->setup() callback and then
> > > > > waits until the gadget driver queues an URB to EP0 as a response.
> > > > 
> > > > For the lack of better term, can we use "request" or "usb_request"
> > > > instead of URB for gadget side, I get confused with the host side
> > > > whenever we mention URB.
> > > > 
> > > > > 
> > > > > However, for a setup request with wLength == 0, dwc3 does not wait
> > > > > until the gadget driver queues an URB to ack the transfer. It appears
> > > > > that dwc3 just acks the request internally and then proceeds with
> > > > > calling the ->setup() callback for the next request received from the
> > > > 
> > > > It depends on the bRequest. It should not proceed to ->setup() unless
> > > > the gadget driver already setups the request for it.
> > > 
> > > Let's see if I understand what you're saying.  Some control transfers 
> > > are handled directly by the UDC driver (things like SET_ADDRESS or 
> > > CLEAR_HALT).  For these transfers, the ->setup() callback is not invoked 
> > > and the gadget driver is completely unaware of them.  But for all other 
> > > control transfers, the ->setup() callback _is_ invoked.
> > > 
> > > Is that what you meant?
> > 
> > That's not what I meant.
> > 
> > I was referring to the next request. It should not be processed until
> > the first request is completed. Depending on the type of request, if
> > there's a delayed_status, the dwc3 driver will not prepare for the
> > Status stage and Setup stage (after status completion) to proceed to the
> > _next_ ->setup callback.
> > 
> > My understanding from the described problem is that somehow dwc3
> > processes the next request immediately without waiting for the raw
> > gadget preparing the data stage.
> 
> Um.  This is one of the design flaws I mentioned: a new SETUP packet 
> arriving before the old control transfer is finished.  The USB spec 
> requires devices to accept the new SETUP packet and abort the old 
> transfer.  So in this case, processing the next request immediately is 
> the right thing to do.

No, as I've mentioned, from the gadget driver, it should not see the
next ->setup until the first control transfer completion, regardless
whether the transfer completed with error code due to abort or not.
Everything should happen sequentially from the gadget driver. This
should be handled in the dwc3 driver (though we missed a setup_pending
status check in the data phase that needs to be fixed now that I look at
it again).

Perhaps the core design should be so that this synchronization does not
depend on the udc driver implementation.

> 
> One question is why Andrey is observing a new ->setup() callback 
> happening so soon?  The host is supposed to allow a fairly long time for 
> standard control requests to complete.  If the userspace component of 
> the Raw Gadget takes too long to act, the transfer could time out and be 
> cancelled on the host.  But "too long" means several seconds -- is that 
> really what's going on here?

As I noted initially, it should not happen so I'm not sure what's really
the problem here. Granted that the setup_pending check for data phase is
missing in dwc3 driver, there should not be a problem unless the host
actually aborted a control transfer. Also, there should be no data phase
for wlength=0 even for IN direction if we go through the composite
layer. So, I doubt that's what happening here.

Perhaps Andrey can clarify.

> 
> > I was talking in context of 0-length transfer (albeit I forgot about the
> > special case of control OUT doesn't have 3-stage).
> > 
> > If it's a vendor request 0-length transfer, without responding with
> > USB_GADGET_DELAYED_STATUS, the dwc3 will proceed with preparing the
> > status stage.
> 
> This may be a holdover from the early days of the Gadget subsystem.  My 
> memory from back then isn't very good; I vaguely recall that the first 
> UDC drivers would queue their automatic Status-stage requests if wLength 
> was 0 and ->setup() returned 0 (which would explain why 
> USB_GADGET_DELAYED_STATUS had to be invented).  Unless I'm completely 
> confused, that's not how UDC drivers are supposed to act now.

For dwc3, it's been like this since the beginning that it automatically
queues the status upon host request. USB_GADGET_DELAYED_STATUS was
introduced to support scenarios where the device may need a longer time
to process the specific request (such as SET_CONFIGURATION).

> 
> > > (IMO that automatic action is a design flaw; the UDC driver should wait 
> > > for the gadget driver to explictly queue a 0-length request or a STALL 
> > > instead of doing it automatically.)
> > 
> > Would every UDC has this capability? I recalled some aren't capable of
> > delayed_status.
> 
> In those cases the UDC driver would just have to do the best it can.  
> Very few modern USB device controllers should have this limitation.
> 
> > > (Another design flaw is that this design doesn't specify what should 
> > > happen if the UDC receives another SETUP packet from the host before the 
> > > Status stage completes.  By sending another SETUP packet, the host is 
> > > indicating that the earlier control transfer has been aborted.  
> > > Presumably the UDC driver will complete all the outstanding requests 
> > > with an error status, but there's a potential race in the gadget driver 
> > > between queuing a request for the first transfer and executing the 
> > > ->setup() callback for the second transfer.)
> > 
> > If there's another SETUP packet coming while there's a pending control
> > transfer, for dwc3 UDC, the pending control TRB should be completed with
> > a Setup_pending status indicating aborted control transfer for dwc3
> > driver to handle that.
> 
> Right.  The difficulty doesn't involve the communication between the HCD 
> and the UDC hardware; it involves the communication between the UDC 
> driver and the gadget driver.  Somehow they need to synchronize so that 
> when the gadget driver queues a usb_request, the UDC driver can tell 
> whether the request was meant for the earlier aborted control transfer 
> or the new active one.  This can matter if the gadget driver has a 
> separate control thread (a work routine or a kthread, for example) that 
> could be queuing requests while the ->setup() callback is running.
> 

Perhaps this can be improved and enforced from the core. At the moment,
it should not be a problem for gadget driver with dwc3 driver (with a
minor fix due to missing check).

BR,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18 19:49         ` Thinh Nguyen
@ 2023-08-18 20:46           ` Thinh Nguyen
  2023-08-18 23:06           ` Alan Stern
  1 sibling, 0 replies; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-18 20:46 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Fri, Aug 18, 2023, Thinh Nguyen wrote:
> > 
> > Right.  The difficulty doesn't involve the communication between the HCD 
> > and the UDC hardware; it involves the communication between the UDC 
> > driver and the gadget driver.  Somehow they need to synchronize so that 
> > when the gadget driver queues a usb_request, the UDC driver can tell 
> > whether the request was meant for the earlier aborted control transfer 
> > or the new active one.  This can matter if the gadget driver has a 
> > separate control thread (a work routine or a kthread, for example) that 
> > could be queuing requests while the ->setup() callback is running.
> > 
> 
> Perhaps this can be improved and enforced from the core. At the moment,
> it should not be a problem for gadget driver with dwc3 driver (with a
> minor fix due to missing check).
> 

Perhaps I'm too quick to respond to overlook what you've said. I've only
thinking about the UDC driver sequence and not the gadget driver
handling the givebacks. Yes if the gadget driver has a separate control
thread, that may be problematic.

BR,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18 19:49         ` Thinh Nguyen
  2023-08-18 20:46           ` Thinh Nguyen
@ 2023-08-18 23:06           ` Alan Stern
  2023-08-19  0:06             ` Thinh Nguyen
  1 sibling, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-18 23:06 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Fri, Aug 18, 2023 at 07:49:27PM +0000, Thinh Nguyen wrote:
> On Thu, Aug 17, 2023, Alan Stern wrote:
> > On Fri, Aug 18, 2023 at 03:10:48AM +0000, Thinh Nguyen wrote:
> > > I was referring to the next request. It should not be processed until
> > > the first request is completed. Depending on the type of request, if
> > > there's a delayed_status, the dwc3 driver will not prepare for the
> > > Status stage and Setup stage (after status completion) to proceed to the
> > > _next_ ->setup callback.
> > > 
> > > My understanding from the described problem is that somehow dwc3
> > > processes the next request immediately without waiting for the raw
> > > gadget preparing the data stage.
> > 
> > Um.  This is one of the design flaws I mentioned: a new SETUP packet 
> > arriving before the old control transfer is finished.  The USB spec 
> > requires devices to accept the new SETUP packet and abort the old 
> > transfer.  So in this case, processing the next request immediately is 
> > the right thing to do.
> 
> No, as I've mentioned, from the gadget driver, it should not see the
> next ->setup until the first control transfer completion, regardless
> whether the transfer completed with error code due to abort or not.
> Everything should happen sequentially from the gadget driver. This
> should be handled in the dwc3 driver (though we missed a setup_pending
> status check in the data phase that needs to be fixed now that I look at
> it again).

Actually I agree with you.  When a new SETUP packet arrives before the 
old control transfer has finished, the UDC driver should cancel all 
pending requests and then invoke the ->setup() callback.  (I don't think 
there is a standard error code for the cancelled requests; net2280 seems 
to use -EPROTO whereas dummy-hcd seems to use -EOVERFLOW.)

> Perhaps the core design should be so that this synchronization does not
> depend on the udc driver implementation.

Do you mean the UDC core?  I don't see how it could get involved in 
managing control transfers.

> > One question is why Andrey is observing a new ->setup() callback 
> > happening so soon?  The host is supposed to allow a fairly long time for 
> > standard control requests to complete.  If the userspace component of 
> > the Raw Gadget takes too long to act, the transfer could time out and be 
> > cancelled on the host.  But "too long" means several seconds -- is that 
> > really what's going on here?
> 
> As I noted initially, it should not happen so I'm not sure what's really
> the problem here. Granted that the setup_pending check for data phase is
> missing in dwc3 driver, there should not be a problem unless the host
> actually aborted a control transfer. Also, there should be no data phase
> for wlength=0 even for IN direction if we go through the composite
> layer. So, I doubt that's what happening here.
> 
> Perhaps Andrey can clarify.

My impression from his initial email was that something different was 
happening.  It sounded like his ->setup() callback was invoked with 
wLength = 0, but before the Raw Gadget driver was ready to queue a 
response request, the UDC driver sent an automatic status, at which 
point the host sent another SETUP packet.  So the gadget driver got two 
->setup() callbacks in a row with no chance to do anything in between.

> > > I was talking in context of 0-length transfer (albeit I forgot about the
> > > special case of control OUT doesn't have 3-stage).
> > > 
> > > If it's a vendor request 0-length transfer, without responding with
> > > USB_GADGET_DELAYED_STATUS, the dwc3 will proceed with preparing the
> > > status stage.
> > 
> > This may be a holdover from the early days of the Gadget subsystem.  My 
> > memory from back then isn't very good; I vaguely recall that the first 
> > UDC drivers would queue their automatic Status-stage requests if wLength 
> > was 0 and ->setup() returned 0 (which would explain why 
> > USB_GADGET_DELAYED_STATUS had to be invented).  Unless I'm completely 
> > confused, that's not how UDC drivers are supposed to act now.

I did a little checking.  The USB_GADGET_DELAYED_STATUS mechanism was 
introduced for use by the composite framework -- not in order to make a 
UDC driver work properly.

> For dwc3, it's been like this since the beginning that it automatically
> queues the status upon host request. USB_GADGET_DELAYED_STATUS was
> introduced to support scenarios where the device may need a longer time
> to process the specific request (such as SET_CONFIGURATION).

It's hard to be sure, but that may be the cause of Andrey's problem.  He 
mentioned something about the Raw Gadget's ->setup() routine returning 
USB_GADGET_DELAYED_STATUS when the problem occurred, but I think he 
meant that it did this for the second callback, not the first one.

Still, I recommand that dwc3 not automatically queue a Status request 
when wLength is 0.  Gadget drivers don't expect UDC drivers to handle 
this for them.  For example, look at the composite_setup() function in 
composite.c.  It does this:

	/* respond with data transfer before status phase? */
	if (value >= 0 && value != USB_GADGET_DELAYED_STATUS) {
		req->length = value;
		req->context = cdev;
		req->zero = value < w_length;
		value = composite_ep0_queue(cdev, req, GFP_ATOMIC);

Clearly it doesn't want the UDC driver to queue a request for it, no 
matter what wLength or value is set to.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-18 23:06           ` Alan Stern
@ 2023-08-19  0:06             ` Thinh Nguyen
  2023-08-19  1:54               ` Andrey Konovalov
  2023-08-20 14:20               ` Alan Stern
  0 siblings, 2 replies; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-19  0:06 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Fri, Aug 18, 2023, Alan Stern wrote:
> On Fri, Aug 18, 2023 at 07:49:27PM +0000, Thinh Nguyen wrote:
> > On Thu, Aug 17, 2023, Alan Stern wrote:
> > > On Fri, Aug 18, 2023 at 03:10:48AM +0000, Thinh Nguyen wrote:
> > > > I was referring to the next request. It should not be processed until
> > > > the first request is completed. Depending on the type of request, if
> > > > there's a delayed_status, the dwc3 driver will not prepare for the
> > > > Status stage and Setup stage (after status completion) to proceed to the
> > > > _next_ ->setup callback.
> > > > 
> > > > My understanding from the described problem is that somehow dwc3
> > > > processes the next request immediately without waiting for the raw
> > > > gadget preparing the data stage.
> > > 
> > > Um.  This is one of the design flaws I mentioned: a new SETUP packet 
> > > arriving before the old control transfer is finished.  The USB spec 
> > > requires devices to accept the new SETUP packet and abort the old 
> > > transfer.  So in this case, processing the next request immediately is 
> > > the right thing to do.
> > 
> > No, as I've mentioned, from the gadget driver, it should not see the
> > next ->setup until the first control transfer completion, regardless
> > whether the transfer completed with error code due to abort or not.
> > Everything should happen sequentially from the gadget driver. This
> > should be handled in the dwc3 driver (though we missed a setup_pending
> > status check in the data phase that needs to be fixed now that I look at
> > it again).
> 
> Actually I agree with you.  When a new SETUP packet arrives before the 
> old control transfer has finished, the UDC driver should cancel all 
> pending requests and then invoke the ->setup() callback.  (I don't think 
> there is a standard error code for the cancelled requests; net2280 seems 
> to use -EPROTO whereas dummy-hcd seems to use -EOVERFLOW.)

Those are very odd choice of error codes for cancelled request. Even
though the gadget side doesn't have very defined error codes, I try to
follow the equivalent doc from the host side
(driver-api/usb/error-codes.rst), which is -ECONNRESET.

Whenever I see -EPROTO, I associate that to a specific host error:
transaction error. For -EOVERFLOW, I think of babble errors.

> 
> > Perhaps the core design should be so that this synchronization does not
> > depend on the udc driver implementation.
> 
> Do you mean the UDC core?  I don't see how it could get involved in 
> managing control transfers.

Actually I don't see it either. :)

> 
> > > One question is why Andrey is observing a new ->setup() callback 
> > > happening so soon?  The host is supposed to allow a fairly long time for 
> > > standard control requests to complete.  If the userspace component of 
> > > the Raw Gadget takes too long to act, the transfer could time out and be 
> > > cancelled on the host.  But "too long" means several seconds -- is that 
> > > really what's going on here?
> > 
> > As I noted initially, it should not happen so I'm not sure what's really
> > the problem here. Granted that the setup_pending check for data phase is
> > missing in dwc3 driver, there should not be a problem unless the host
> > actually aborted a control transfer. Also, there should be no data phase
> > for wlength=0 even for IN direction if we go through the composite
> > layer. So, I doubt that's what happening here.
> > 
> > Perhaps Andrey can clarify.
> 
> My impression from his initial email was that something different was 
> happening.  It sounded like his ->setup() callback was invoked with 
> wLength = 0, but before the Raw Gadget driver was ready to queue a 
> response request, the UDC driver sent an automatic status, at which 
> point the host sent another SETUP packet.  So the gadget driver got two 
> ->setup() callbacks in a row with no chance to do anything in between.

What else should the gadget driver do? There's no data stage for
wLength=0.

> 
> > > > I was talking in context of 0-length transfer (albeit I forgot about the
> > > > special case of control OUT doesn't have 3-stage).
> > > > 
> > > > If it's a vendor request 0-length transfer, without responding with
> > > > USB_GADGET_DELAYED_STATUS, the dwc3 will proceed with preparing the
> > > > status stage.
> > > 
> > > This may be a holdover from the early days of the Gadget subsystem.  My 
> > > memory from back then isn't very good; I vaguely recall that the first 
> > > UDC drivers would queue their automatic Status-stage requests if wLength 
> > > was 0 and ->setup() returned 0 (which would explain why 
> > > USB_GADGET_DELAYED_STATUS had to be invented).  Unless I'm completely 
> > > confused, that's not how UDC drivers are supposed to act now.
> 
> I did a little checking.  The USB_GADGET_DELAYED_STATUS mechanism was 
> introduced for use by the composite framework -- not in order to make a 
> UDC driver work properly.

Hm... perhaps we can update so that it's applicable outside of the
composite framework. Currently dwc3 treats it as such, which may not
work if the gadget driver does not know about USB_GADGET_DELAYED_STATUS.

> 
> > For dwc3, it's been like this since the beginning that it automatically
> > queues the status upon host request. USB_GADGET_DELAYED_STATUS was
> > introduced to support scenarios where the device may need a longer time
> > to process the specific request (such as SET_CONFIGURATION).
> 
> It's hard to be sure, but that may be the cause of Andrey's problem.  He 
> mentioned something about the Raw Gadget's ->setup() routine returning 
> USB_GADGET_DELAYED_STATUS when the problem occurred, but I think he 
> meant that it did this for the second callback, not the first one.

I think he was just experimenting with USB_GADGET_DELAYED_STATUS. The
problem happened without that. Regardless, we need clarifications.

> 
> Still, I recommand that dwc3 not automatically queue a Status request 
> when wLength is 0.  Gadget drivers don't expect UDC drivers to handle 
> this for them.  For example, look at the composite_setup() function in 
> composite.c.  It does this:
> 
> 	/* respond with data transfer before status phase? */
> 	if (value >= 0 && value != USB_GADGET_DELAYED_STATUS) {
> 		req->length = value;
> 		req->context = cdev;
> 		req->zero = value < w_length;
> 		value = composite_ep0_queue(cdev, req, GFP_ATOMIC);
> 
> Clearly it doesn't want the UDC driver to queue a request for it, no 
> matter what wLength or value is set to.
> 

dwc3 parse the SETUP data and determine whether it's a 3-state or
2-stage control transfer. If wLength > 0, then it must be a 3-stage
control transfer. The dwc3 driver would not queue the status immediately
until the data stage is completed.

To enforce the gadget driver to manually queue the status would take
some effort to ensure all the UDC driver comply to it. We would also
need to update the composite framework.

BR,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-19  0:06             ` Thinh Nguyen
@ 2023-08-19  1:54               ` Andrey Konovalov
  2023-08-20 14:20               ` Alan Stern
  1 sibling, 0 replies; 37+ messages in thread
From: Andrey Konovalov @ 2023-08-19  1:54 UTC (permalink / raw)
  To: Thinh Nguyen, Alan Stern; +Cc: Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

[I will skip the discussion about the racy or aborted SETUP requests
and the potential design flaws in the UDC subsystem: this is not
what's happening in my case. ]

On Sat, Aug 19, 2023 at 2:07 AM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> > > > One question is why Andrey is observing a new ->setup() callback
> > > > happening so soon?  The host is supposed to allow a fairly long time for
> > > > standard control requests to complete.  If the userspace component of
> > > > the Raw Gadget takes too long to act, the transfer could time out and be
> > > > cancelled on the host.  But "too long" means several seconds -- is that
> > > > really what's going on here?

No, this is not what's happening. All the requests get processed
relatively quickly, there are no big delays.

> > > As I noted initially, it should not happen so I'm not sure what's really
> > > the problem here. Granted that the setup_pending check for data phase is
> > > missing in dwc3 driver, there should not be a problem unless the host
> > > actually aborted a control transfer. Also, there should be no data phase
> > > for wlength=0 even for IN direction if we go through the composite
> > > layer. So, I doubt that's what happening here.
> > >
> > > Perhaps Andrey can clarify.
> >
> > My impression from his initial email was that something different was
> > happening.  It sounded like his ->setup() callback was invoked with
> > wLength = 0, but before the Raw Gadget driver was ready to queue a
> > response request, the UDC driver sent an automatic status, at which
> > point the host sent another SETUP packet.  So the gadget driver got two
> > ->setup() callbacks in a row with no chance to do anything in between.

Precisely!

> What else should the gadget driver do? There's no data stage for
> wLength=0.

Quoting one of the Alan's responses:

> But if the SETUP packet's wLength value is 0 then when the gadget driver
> is ready, it queues a 0-length IN request which will act as the Status
> stage.  In this situation the UDC does not automatically create a
> Status-stage request.
>
> Note that the gadget driver is allowed to queue its various requests
> either while the ->setup() callback is running or after it has returned.

So there's no Data stage indeed, but the problem is that dwc3 proceeds
with the Status stage by itself without waiting for an (empty) request
to be queued by the gadget driver.

> > > For dwc3, it's been like this since the beginning that it automatically
> > > queues the status upon host request. USB_GADGET_DELAYED_STATUS was
> > > introduced to support scenarios where the device may need a longer time
> > > to process the specific request (such as SET_CONFIGURATION).
> >
> > It's hard to be sure, but that may be the cause of Andrey's problem.

Yes, I believe this is exactly the problem.

> > He
> > mentioned something about the Raw Gadget's ->setup() routine returning
> > USB_GADGET_DELAYED_STATUS when the problem occurred, but I think he
> > meant that it did this for the second callback, not the first one.

Right!

I just tried changing Raw Gadget to return USB_GADGET_DELAYED_STATUS
from ->setup() whenever wLength == 0, and the enumeration now works
properly. However, I don't think this hack is the proper solution. I
think dwc3 should just do what other UDC drivers do and do not
autocomplete the Status stage for wLength == 0 requests.

> > Still, I recommand that dwc3 not automatically queue a Status request
> > when wLength is 0.

Yes, I believe this should fix the problem that I'm having.

> > Gadget drivers don't expect UDC drivers to handle
> > this for them.  For example, look at the composite_setup() function in
> > composite.c.  It does this:
> >
> >       /* respond with data transfer before status phase? */
> >       if (value >= 0 && value != USB_GADGET_DELAYED_STATUS) {
> >               req->length = value;
> >               req->context = cdev;
> >               req->zero = value < w_length;
> >               value = composite_ep0_queue(cdev, req, GFP_ATOMIC);
> >
> > Clearly it doesn't want the UDC driver to queue a request for it, no
> > matter what wLength or value is set to.
> >
>
> dwc3 parse the SETUP data and determine whether it's a 3-state or
> 2-stage control transfer. If wLength > 0, then it must be a 3-stage
> control transfer. The dwc3 driver would not queue the status immediately
> until the data stage is completed.
>
> To enforce the gadget driver to manually queue the status would take
> some effort to ensure all the UDC driver comply to it.

I think most (or all except dwc3?) UDC drivers already comply with
this behavior. At the very least, Dummy UDC, dwc2, musb, chipidea, and
net2280 drivers work with Raw Gadget, so they must be complying. dwc3
is the first UDC I tested that caused a problem.

> We would also need to update the composite framework.

I would expect that the composite framework already takes this
behavior into account. I suspect that it's the dwc3 that just happens
to work with the composite framework, as the composite framework
quickly queues an empty response to a wLength == 0 request (which gets
apparently ignored by dwc3) before the next ->setup() calls happens.

Thank you!

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-19  0:06             ` Thinh Nguyen
  2023-08-19  1:54               ` Andrey Konovalov
@ 2023-08-20 14:20               ` Alan Stern
  2023-08-21 16:13                 ` Andrey Konovalov
  2023-08-23  2:14                 ` Thinh Nguyen
  1 sibling, 2 replies; 37+ messages in thread
From: Alan Stern @ 2023-08-20 14:20 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Sat, Aug 19, 2023 at 12:06:53AM +0000, Thinh Nguyen wrote:
> On Fri, Aug 18, 2023, Alan Stern wrote:
> > Actually I agree with you.  When a new SETUP packet arrives before the 
> > old control transfer has finished, the UDC driver should cancel all 
> > pending requests and then invoke the ->setup() callback.  (I don't think 
> > there is a standard error code for the cancelled requests; net2280 seems 
> > to use -EPROTO whereas dummy-hcd seems to use -EOVERFLOW.)
> 
> Those are very odd choice of error codes for cancelled request. Even
> though the gadget side doesn't have very defined error codes, I try to
> follow the equivalent doc from the host side
> (driver-api/usb/error-codes.rst), which is -ECONNRESET.
> 
> Whenever I see -EPROTO, I associate that to a specific host error:
> transaction error. For -EOVERFLOW, I think of babble errors.

Do you have a suggestion for an error code that all the UDCs should use 
in this situation?  -ECONNRESET is currently being used for requests 
that were cancelled by usb_ep_dequeue().  Would -EREMOTEIO be more 
suitable for requests attached to an aborted control transfer?


> > My impression from his initial email was that something different was 
> > happening.  It sounded like his ->setup() callback was invoked with 
> > wLength = 0, but before the Raw Gadget driver was ready to queue a 
> > response request, the UDC driver sent an automatic status, at which 
> > point the host sent another SETUP packet.  So the gadget driver got two 
> > ->setup() callbacks in a row with no chance to do anything in between.
> 
> What else should the gadget driver do? There's no data stage for
> wLength=0.

So when wLength is 0, dwc3 should not automatically handle the Status 
stage.  It should wait for the gadget driver to submit an explicit 
Status-stage request.  As far as I know, all the gadget drivers will do 
this.

> > > > This may be a holdover from the early days of the Gadget subsystem.  My 
> > > > memory from back then isn't very good; I vaguely recall that the first 
> > > > UDC drivers would queue their automatic Status-stage requests if wLength 
> > > > was 0 and ->setup() returned 0 (which would explain why 
> > > > USB_GADGET_DELAYED_STATUS had to be invented).  Unless I'm completely 
> > > > confused, that's not how UDC drivers are supposed to act now.
> > 
> > I did a little checking.  The USB_GADGET_DELAYED_STATUS mechanism was 
> > introduced for use by the composite framework -- not in order to make a 
> > UDC driver work properly.
> 
> Hm... perhaps we can update so that it's applicable outside of the
> composite framework. Currently dwc3 treats it as such, which may not
> work if the gadget driver does not know about USB_GADGET_DELAYED_STATUS.

I think USB_GADGET_DELAYED_STATUS belongs entirely inside the composite 
framework and it should not be used by UDC drivers at all.

That return code makes some sense in composite.c, because the composite 
framework juggles several function drivers in a single gadget.  It has 
to know when all of them are ready to complete a configuration change; 
it can't assume each function is ready when the callback returns.

An alternative approach for composite.c would be _always_ to assume that 
functions aren't ready until they have called 
usb_composite_setup_continue().  However, doing this would require 
auditing each function driver.


> dwc3 parse the SETUP data and determine whether it's a 3-state or
> 2-stage control transfer. If wLength > 0, then it must be a 3-stage
> control transfer. The dwc3 driver would not queue the status immediately
> until the data stage is completed.
> 
> To enforce the gadget driver to manually queue the status would take
> some effort to ensure all the UDC driver comply to it. We would also
> need to update the composite framework.

The composite framework already does the right thing.  And as Andrey 
said, most if not all of the other UDC drivers already behave this way.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-20 14:20               ` Alan Stern
@ 2023-08-21 16:13                 ` Andrey Konovalov
  2023-08-21 17:25                   ` Alan Stern
  2023-08-23  2:14                 ` Thinh Nguyen
  1 sibling, 1 reply; 37+ messages in thread
From: Andrey Konovalov @ 2023-08-21 16:13 UTC (permalink / raw)
  To: Alan Stern, Thinh Nguyen, Sebastian Andrzej Siewior
  Cc: Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Sun, Aug 20, 2023 at 4:20 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> > > > > This may be a holdover from the early days of the Gadget subsystem.  My
> > > > > memory from back then isn't very good; I vaguely recall that the first
> > > > > UDC drivers would queue their automatic Status-stage requests if wLength
> > > > > was 0 and ->setup() returned 0 (which would explain why
> > > > > USB_GADGET_DELAYED_STATUS had to be invented).  Unless I'm completely
> > > > > confused, that's not how UDC drivers are supposed to act now.
> > >
> > > I did a little checking.  The USB_GADGET_DELAYED_STATUS mechanism was
> > > introduced for use by the composite framework -- not in order to make a
> > > UDC driver work properly.
> >
> > Hm... perhaps we can update so that it's applicable outside of the
> > composite framework. Currently dwc3 treats it as such, which may not
> > work if the gadget driver does not know about USB_GADGET_DELAYED_STATUS.
>
> I think USB_GADGET_DELAYED_STATUS belongs entirely inside the composite
> framework and it should not be used by UDC drivers at all.
>
> That return code makes some sense in composite.c, because the composite
> framework juggles several function drivers in a single gadget.  It has
> to know when all of them are ready to complete a configuration change;
> it can't assume each function is ready when the callback returns.

Out of curiosity, I also did some digging around USB_GADGET_DELAYED_STATUS.

1. USB_GADGET_DELAYED_STATUS was introduced in 1b9ba000177 ("usb:
gadget: composite: Allow function drivers to pause control
transfers"). It looks like it was indeed initially intended for the
composite framework, as nor that commit nor the directly following
commits use USB_GADGET_DELAYED_STATUS in any UDC drivers. However,
this commit had an unintended (?) side-effect of returning
USB_GADGET_DELAYED_STATUS from the ->setup() call of the composite
framework gadget driver.

2. In 5bdb1dcc6330 ("usb: dwc3: ep0: handle delayed_status again"),
the dwc3 driver was the first one to start relying on
USB_GADGET_DELAYED_STATUS to decide when to avoid auto-completing the
Status stage (+Sebastian). The commit description mentions some
previous rework of the driver that made it lose the ability of handle
delayed Status stage handling, but I couldn't figure out the exact
commit it refers to.

3. Following that, a few other UDC drivers also started using
USB_GADGET_DELAYED_STATUS, potentially using the dwc3 behavior as a
reference.

Interestingly, in 946ef68ad4e4 ("usb: gadget: ffs: Let setup() return
USB_GADGET_DELAYED_STATUS"), the FunctionFS composite driver had to
add USB_GADGET_DELAYED_STATUS to specifically avoid failures when dwc3
is used. This is the same "fix" that worked for me with Raw Gadget.

Right now dwc2, dwc3, mtu3, cdns3, bdc, and renesas have special
handling for USB_GADGET_DELAYED_STATUS. Surprisingly, dwc2 works with
Raw Gadget as is nevertheless. dwc3 fails as I described. For the
others, I don't have the hardware to test them.

I guess the proper solution would be to contain
USB_GADGET_DELAYED_STATUS within the composite framework and make all
UDCs not to handle the Status stage on their own. However, this would
require a collaborative effort from the maintainers of the UDC drivers
I mentioned.

An alternative would to declare USB_GADGET_DELAYED_STATUS to be usable
outside of the composite framework and leave everything as is
otherwise (but change Raw Gadget to return USB_GADGET_DELAYED_STATUS).
The downside is the discrepancy in the interface of different UDCs
(some require USB_GADGET_DELAYED_STATUS, others imply), but perhaps
it's not that bad provided that this discrepancy is documented.

Thank you!

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-21 16:13                 ` Andrey Konovalov
@ 2023-08-21 17:25                   ` Alan Stern
  2023-08-23  2:05                     ` Thinh Nguyen
  2023-08-23  2:30                     ` Andrey Konovalov
  0 siblings, 2 replies; 37+ messages in thread
From: Alan Stern @ 2023-08-21 17:25 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Thinh Nguyen, Sebastian Andrzej Siewior, Felipe Balbi,
	Greg Kroah-Hartman, USB list, LKML

On Mon, Aug 21, 2023 at 06:13:05PM +0200, Andrey Konovalov wrote:
> Out of curiosity, I also did some digging around USB_GADGET_DELAYED_STATUS.
> 
> 1. USB_GADGET_DELAYED_STATUS was introduced in 1b9ba000177 ("usb:
> gadget: composite: Allow function drivers to pause control
> transfers"). It looks like it was indeed initially intended for the
> composite framework, as nor that commit nor the directly following
> commits use USB_GADGET_DELAYED_STATUS in any UDC drivers. However,
> this commit had an unintended (?) side-effect of returning
> USB_GADGET_DELAYED_STATUS from the ->setup() call of the composite
> framework gadget driver.
> 
> 2. In 5bdb1dcc6330 ("usb: dwc3: ep0: handle delayed_status again"),
> the dwc3 driver was the first one to start relying on
> USB_GADGET_DELAYED_STATUS to decide when to avoid auto-completing the
> Status stage (+Sebastian). The commit description mentions some
> previous rework of the driver that made it lose the ability of handle
> delayed Status stage handling, but I couldn't figure out the exact
> commit it refers to.
> 
> 3. Following that, a few other UDC drivers also started using
> USB_GADGET_DELAYED_STATUS, potentially using the dwc3 behavior as a
> reference.
> 
> Interestingly, in 946ef68ad4e4 ("usb: gadget: ffs: Let setup() return
> USB_GADGET_DELAYED_STATUS"), the FunctionFS composite driver had to
> add USB_GADGET_DELAYED_STATUS to specifically avoid failures when dwc3
> is used. This is the same "fix" that worked for me with Raw Gadget.
> 
> Right now dwc2, dwc3, mtu3, cdns3, bdc, and renesas have special
> handling for USB_GADGET_DELAYED_STATUS. Surprisingly, dwc2 works with
> Raw Gadget as is nevertheless. dwc3 fails as I described. For the
> others, I don't have the hardware to test them.
> 
> I guess the proper solution would be to contain
> USB_GADGET_DELAYED_STATUS within the composite framework and make all
> UDCs not to handle the Status stage on their own. However, this would
> require a collaborative effort from the maintainers of the UDC drivers
> I mentioned.

Most if not all of the UDC drivers you listed are actively maintained.  
It shouldn't be too hard to get people to remove the special treatment 
of USB_GADGET_DELAYED_STATUS in them.

The necessary changes should be pretty small: Whenever wLength is 0, 
treat any non-negative return value from ->setup() as if it were 
USB_GADGET_DELAYED_STATUS.  This would probably end up shrinking the 
UDC driver code a little.  :-)

> An alternative would to declare USB_GADGET_DELAYED_STATUS to be usable
> outside of the composite framework and leave everything as is
> otherwise (but change Raw Gadget to return USB_GADGET_DELAYED_STATUS).
> The downside is the discrepancy in the interface of different UDCs
> (some require USB_GADGET_DELAYED_STATUS, others imply), but perhaps
> it's not that bad provided that this discrepancy is documented.

This alternative is less desirable, because the legacy gadgets (some of 
which don't use the composite framework) may not be compatible with it.

And as a matter of general principle, allowing UDC drivers to start 
automatically send Status replies to 0-length control transfers is a 
step in the wrong direction.  What we really should focus our energy on 
is getting them to _stop_ sending automatic Status replies to 
non-zero-length control transfers!

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-21 17:25                   ` Alan Stern
@ 2023-08-23  2:05                     ` Thinh Nguyen
  2023-08-23  2:30                     ` Andrey Konovalov
  1 sibling, 0 replies; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-23  2:05 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andrey Konovalov, Thinh Nguyen, Sebastian Andrzej Siewior,
	Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Mon, Aug 21, 2023, Alan Stern wrote:
> On Mon, Aug 21, 2023 at 06:13:05PM +0200, Andrey Konovalov wrote:
> > Out of curiosity, I also did some digging around USB_GADGET_DELAYED_STATUS.
> > 
> > 1. USB_GADGET_DELAYED_STATUS was introduced in 1b9ba000177 ("usb:
> > gadget: composite: Allow function drivers to pause control
> > transfers"). It looks like it was indeed initially intended for the
> > composite framework, as nor that commit nor the directly following
> > commits use USB_GADGET_DELAYED_STATUS in any UDC drivers. However,
> > this commit had an unintended (?) side-effect of returning
> > USB_GADGET_DELAYED_STATUS from the ->setup() call of the composite
> > framework gadget driver.
> > 
> > 2. In 5bdb1dcc6330 ("usb: dwc3: ep0: handle delayed_status again"),
> > the dwc3 driver was the first one to start relying on
> > USB_GADGET_DELAYED_STATUS to decide when to avoid auto-completing the
> > Status stage (+Sebastian). The commit description mentions some
> > previous rework of the driver that made it lose the ability of handle
> > delayed Status stage handling, but I couldn't figure out the exact
> > commit it refers to.
> > 
> > 3. Following that, a few other UDC drivers also started using
> > USB_GADGET_DELAYED_STATUS, potentially using the dwc3 behavior as a
> > reference.
> > 
> > Interestingly, in 946ef68ad4e4 ("usb: gadget: ffs: Let setup() return
> > USB_GADGET_DELAYED_STATUS"), the FunctionFS composite driver had to
> > add USB_GADGET_DELAYED_STATUS to specifically avoid failures when dwc3
> > is used. This is the same "fix" that worked for me with Raw Gadget.
> > 
> > Right now dwc2, dwc3, mtu3, cdns3, bdc, and renesas have special
> > handling for USB_GADGET_DELAYED_STATUS. Surprisingly, dwc2 works with
> > Raw Gadget as is nevertheless. dwc3 fails as I described. For the
> > others, I don't have the hardware to test them.
> > 
> > I guess the proper solution would be to contain
> > USB_GADGET_DELAYED_STATUS within the composite framework and make all
> > UDCs not to handle the Status stage on their own. However, this would
> > require a collaborative effort from the maintainers of the UDC drivers
> > I mentioned.
> 
> Most if not all of the UDC drivers you listed are actively maintained.  
> It shouldn't be too hard to get people to remove the special treatment 
> of USB_GADGET_DELAYED_STATUS in them.
> 
> The necessary changes should be pretty small: Whenever wLength is 0, 
> treat any non-negative return value from ->setup() as if it were 
> USB_GADGET_DELAYED_STATUS.  This would probably end up shrinking the 
> UDC driver code a little.  :-)

I don't think a simple removal of a few lines in dwc3 and all will be
fine. This will impact some users (including us and the tools we use for
validation). Also, this will require quite a bit of reviewing and
testing.

Nonetheless, this is noted.

> 
> > An alternative would to declare USB_GADGET_DELAYED_STATUS to be usable
> > outside of the composite framework and leave everything as is
> > otherwise (but change Raw Gadget to return USB_GADGET_DELAYED_STATUS).
> > The downside is the discrepancy in the interface of different UDCs
> > (some require USB_GADGET_DELAYED_STATUS, others imply), but perhaps
> > it's not that bad provided that this discrepancy is documented.
> 
> This alternative is less desirable, because the legacy gadgets (some of 
> which don't use the composite framework) may not be compatible with it.
> 
> And as a matter of general principle, allowing UDC drivers to start 
> automatically send Status replies to 0-length control transfers is a 
> step in the wrong direction.  What we really should focus our energy on 
> is getting them to _stop_ sending automatic Status replies to 
> non-zero-length control transfers!
> 

Thanks,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-20 14:20               ` Alan Stern
  2023-08-21 16:13                 ` Andrey Konovalov
@ 2023-08-23  2:14                 ` Thinh Nguyen
  2023-08-23 15:17                   ` Alan Stern
  1 sibling, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-23  2:14 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Sun, Aug 20, 2023, Alan Stern wrote:
> On Sat, Aug 19, 2023 at 12:06:53AM +0000, Thinh Nguyen wrote:
> > On Fri, Aug 18, 2023, Alan Stern wrote:
> > > Actually I agree with you.  When a new SETUP packet arrives before the 
> > > old control transfer has finished, the UDC driver should cancel all 
> > > pending requests and then invoke the ->setup() callback.  (I don't think 
> > > there is a standard error code for the cancelled requests; net2280 seems 
> > > to use -EPROTO whereas dummy-hcd seems to use -EOVERFLOW.)
> > 
> > Those are very odd choice of error codes for cancelled request. Even
> > though the gadget side doesn't have very defined error codes, I try to
> > follow the equivalent doc from the host side
> > (driver-api/usb/error-codes.rst), which is -ECONNRESET.
> > 
> > Whenever I see -EPROTO, I associate that to a specific host error:
> > transaction error. For -EOVERFLOW, I think of babble errors.
> 
> Do you have a suggestion for an error code that all the UDCs should use 
> in this situation?  -ECONNRESET is currently being used for requests 
> that were cancelled by usb_ep_dequeue().  Would -EREMOTEIO be more 
> suitable for requests attached to an aborted control transfer?
> 

That works. It would be great if we can document the usb error codes for
gadget side and keep them consistent across UDC drivers. I think there
are only a few common errors:

* Request aborted
* Request dequeued
* STALL
* Data dropped (isoc only)

Any other?

Thanks,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-21 17:25                   ` Alan Stern
  2023-08-23  2:05                     ` Thinh Nguyen
@ 2023-08-23  2:30                     ` Andrey Konovalov
  2023-08-23 15:48                       ` Alan Stern
  1 sibling, 1 reply; 37+ messages in thread
From: Andrey Konovalov @ 2023-08-23  2:30 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Sebastian Andrzej Siewior, Felipe Balbi,
	Greg Kroah-Hartman, USB list, LKML

On Mon, Aug 21, 2023 at 7:25 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> > I guess the proper solution would be to contain
> > USB_GADGET_DELAYED_STATUS within the composite framework and make all
> > UDCs not to handle the Status stage on their own. However, this would
> > require a collaborative effort from the maintainers of the UDC drivers
> > I mentioned.
>
> Most if not all of the UDC drivers you listed are actively maintained.
> It shouldn't be too hard to get people to remove the special treatment
> of USB_GADGET_DELAYED_STATUS in them.
>
> The necessary changes should be pretty small: Whenever wLength is 0,
> treat any non-negative return value from ->setup() as if it were
> USB_GADGET_DELAYED_STATUS.  This would probably end up shrinking the
> UDC driver code a little.  :-)

I started looking into reworking the UDC drivers to drop the special
case for USB_GADGET_DELAYED_STATUS, but this seems more complicated.

First, I noticed that some of the UDC drivers only expect to handle a
delayed Status stage for SET_CONFIGURATION requests. (Which is
reasonable, as they were developed assuming that only the composite
framework might request to delay the Status stage.) In particular,
dwc3, cdns2, and cdns3 set the gadget state to USB_STATE_CONFIGURED
when handling a delayed Status stage:

dwc3/ep0.c:136: usb_gadget_set_state(dwc->gadget, USB_STATE_CONFIGURED);
cdns3/cdns3-ep0.c:739: usb_gadget_set_state(&priv_dev->gadget,
USB_STATE_CONFIGURED);
gadget/udc/cdns2/cdns2-ep0.c:572: usb_gadget_set_state(&pdev->gadget,
USB_STATE_CONFIGURED);

So I believe an additional check for whether the request was indeed
SET_CONFIGURATION is required. (cdns2 and cdns3 also do other things
besides setting the state to USB_STATE_CONFIGURED, but it should be
possible to hide that under the same check.)

I also looked into how other UDC drivers change the gadget state to
USB_STATE_CONFIGURED:

1. isp1760, mtu3, and bdc immediately set USB_STATE_CONFIGURED once
they receive a SET_CONFIGURATION request, before calling ->setup() for
the gadget driver;
2. gr and mv_u3d do that after the ->setup() call;
3. tegra does it after the first non-control endpoint is enabled;
4. dwc3, cdns2, and cdns3 appear to not set USB_STATE_CONFIGURED if
the Status stage is not delayed;
5. dwc2, cdnsp, and all other UDCs don't set USB_STATE_CONFIGURED at all.

I'm guessing the UDCs in #4 and #5 expect the gadget driver to set
USB_STATE_CONFIGURED.

I see that the composite framework sets the gadget state to
USB_STATE_CONFIGURED even if some of the functions request a delayed
Status stage via USB_GADGET_DELAYED_STATUS. And GadgetFS also sets the
state to USB_STATE_CONFIGURED before delegating the SET_CONFIGURATION
request to userspace. However, Raw Gadget expects the userspace to
issue an ioctl that sets USB_STATE_CONFIGURED before completing the
delayed SET_CONFIGURATION request.

So I am wondering: when is proper time to set USB_STATE_CONFIGURED?
And should this be handled by the UDC driver or the gadget driver?

> > An alternative would to declare USB_GADGET_DELAYED_STATUS to be usable
> > outside of the composite framework and leave everything as is
> > otherwise (but change Raw Gadget to return USB_GADGET_DELAYED_STATUS).
> > The downside is the discrepancy in the interface of different UDCs
> > (some require USB_GADGET_DELAYED_STATUS, others imply), but perhaps
> > it's not that bad provided that this discrepancy is documented.
>
> This alternative is less desirable, because the legacy gadgets (some of
> which don't use the composite framework) may not be compatible with it.

I think GadgetFS and Raw Gadget are the only two such drivers?

> And as a matter of general principle, allowing UDC drivers to start
> automatically send Status replies to 0-length control transfers is a
> step in the wrong direction.  What we really should focus our energy on
> is getting them to _stop_ sending automatic Status replies to
> non-zero-length control transfers!

Ack!

But I don't think it's within my capability to fix all UDCs,
considering the issues I mentioned above.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-23  2:14                 ` Thinh Nguyen
@ 2023-08-23 15:17                   ` Alan Stern
  2023-08-23 17:59                     ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-23 15:17 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Wed, Aug 23, 2023 at 02:14:32AM +0000, Thinh Nguyen wrote:
> On Sun, Aug 20, 2023, Alan Stern wrote:
> > On Sat, Aug 19, 2023 at 12:06:53AM +0000, Thinh Nguyen wrote:
> > > On Fri, Aug 18, 2023, Alan Stern wrote:
> > > > Actually I agree with you.  When a new SETUP packet arrives before the 
> > > > old control transfer has finished, the UDC driver should cancel all 
> > > > pending requests and then invoke the ->setup() callback.  (I don't think 
> > > > there is a standard error code for the cancelled requests; net2280 seems 
> > > > to use -EPROTO whereas dummy-hcd seems to use -EOVERFLOW.)
> > > 
> > > Those are very odd choice of error codes for cancelled request. Even
> > > though the gadget side doesn't have very defined error codes, I try to
> > > follow the equivalent doc from the host side
> > > (driver-api/usb/error-codes.rst), which is -ECONNRESET.
> > > 
> > > Whenever I see -EPROTO, I associate that to a specific host error:
> > > transaction error. For -EOVERFLOW, I think of babble errors.
> > 
> > Do you have a suggestion for an error code that all the UDCs should use 
> > in this situation?  -ECONNRESET is currently being used for requests 
> > that were cancelled by usb_ep_dequeue().  Would -EREMOTEIO be more 
> > suitable for requests attached to an aborted control transfer?
> > 
> 
> That works. It would be great if we can document the usb error codes for
> gadget side and keep them consistent across UDC drivers. I think there
> are only a few common errors:
> 
> * Request aborted
> * Request dequeued
> * STALL
> * Data dropped (isoc only)
> 
> Any other?

* Overflow

STALL is not a valid status for usb_requests on the gadget side; it 
applies only on the host side (the host doesn't halt its endpoints).

Requests can be aborted for two different reasons:

	The endpoint is being disabled, either by an config/altsetting
	change or because of a disconnect

	The request was for an aborted control transfer

Putting this together, I get the following status codes:

-ESHUTDOWN	Request aborted because ep was disabled
-EREMOTEIO	Request was for an aborted control transfer
-ECONNRESET	Request was cancelled by usb_ep_dequeue()
-EXDEV		Data dropped (isoc only)
-EOVERFLOW	The host sent more data than the request wanted
		(will never happen if the request's length is a
		nonzero multiple of the maxpacket size)

This applies only to the .status field of struct usb_request.  Calls to 
usb_ep_queue() may return different error codes.

How does that sound?

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-23  2:30                     ` Andrey Konovalov
@ 2023-08-23 15:48                       ` Alan Stern
  2023-08-23 17:18                         ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-23 15:48 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Thinh Nguyen, Sebastian Andrzej Siewior, Felipe Balbi,
	Greg Kroah-Hartman, USB list, LKML

On Wed, Aug 23, 2023 at 04:30:23AM +0200, Andrey Konovalov wrote:
> I started looking into reworking the UDC drivers to drop the special
> case for USB_GADGET_DELAYED_STATUS, but this seems more complicated.
> 
> First, I noticed that some of the UDC drivers only expect to handle a
> delayed Status stage for SET_CONFIGURATION requests. (Which is

That expectation is wrong; gadget drivers can also want to delay the 
Status stage for a SET_INTERFACE request.  And in theory they might want 
to delay any control-OUT transfer.

> reasonable, as they were developed assuming that only the composite
> framework might request to delay the Status stage.) In particular,
> dwc3, cdns2, and cdns3 set the gadget state to USB_STATE_CONFIGURED
> when handling a delayed Status stage:
> 
> dwc3/ep0.c:136: usb_gadget_set_state(dwc->gadget, USB_STATE_CONFIGURED);
> cdns3/cdns3-ep0.c:739: usb_gadget_set_state(&priv_dev->gadget,
> USB_STATE_CONFIGURED);
> gadget/udc/cdns2/cdns2-ep0.c:572: usb_gadget_set_state(&pdev->gadget,
> USB_STATE_CONFIGURED);

This is also wrong.  SET_CONFIGURATION can tell a gadget to install 
config 0, in which case the state should be changed to 
USB_STATE_ADDRESS.

For that matter, a gadget can undergo many state changes other than the 
change into the CONFIGURED state.  It doesn't look like many of the UDC 
drivers are careful about reporting them.

> So I believe an additional check for whether the request was indeed
> SET_CONFIGURATION is required. (cdns2 and cdns3 also do other things
> besides setting the state to USB_STATE_CONFIGURED, but it should be
> possible to hide that under the same check.)
> 
> I also looked into how other UDC drivers change the gadget state to
> USB_STATE_CONFIGURED:
> 
> 1. isp1760, mtu3, and bdc immediately set USB_STATE_CONFIGURED once
> they receive a SET_CONFIGURATION request, before calling ->setup() for
> the gadget driver;
> 2. gr and mv_u3d do that after the ->setup() call;
> 3. tegra does it after the first non-control endpoint is enabled;
> 4. dwc3, cdns2, and cdns3 appear to not set USB_STATE_CONFIGURED if
> the Status stage is not delayed;
> 5. dwc2, cdnsp, and all other UDCs don't set USB_STATE_CONFIGURED at all.
> 
> I'm guessing the UDCs in #4 and #5 expect the gadget driver to set
> USB_STATE_CONFIGURED.
> 
> I see that the composite framework sets the gadget state to
> USB_STATE_CONFIGURED even if some of the functions request a delayed
> Status stage via USB_GADGET_DELAYED_STATUS. And GadgetFS also sets the
> state to USB_STATE_CONFIGURED before delegating the SET_CONFIGURATION
> request to userspace. However, Raw Gadget expects the userspace to
> issue an ioctl that sets USB_STATE_CONFIGURED before completing the
> delayed SET_CONFIGURATION request.
> 
> So I am wondering: when is proper time to set USB_STATE_CONFIGURED?
> And should this be handled by the UDC driver or the gadget driver?

The proper time isn't really well defined.  As far as the gadget driver 
is concerned, it's when the configuration change is completed (when it 
tells the composite framework to stop delaying the status stage).  But 
as far as the host is concerned, it's when the Status stage completes 
successfully.

If the Status stage of the control transfer gets corrupted, it's 
possible to end up in a situation where the gadget believes it is 
configured and the host believes it isn't.  Luckily this doesn't 
happen very often, and if it does then the host should reissue the 
transfer.

All the other state changes are (or should be) handled by the UDC 
drivers.  I guess they can handle the changes to/from the CONFIGURED 
state as well, although they will have to be more careful about it than 
they are now.

> > > An alternative would to declare USB_GADGET_DELAYED_STATUS to be usable
> > > outside of the composite framework and leave everything as is
> > > otherwise (but change Raw Gadget to return USB_GADGET_DELAYED_STATUS).
> > > The downside is the discrepancy in the interface of different UDCs
> > > (some require USB_GADGET_DELAYED_STATUS, others imply), but perhaps
> > > it's not that bad provided that this discrepancy is documented.
> >
> > This alternative is less desirable, because the legacy gadgets (some of
> > which don't use the composite framework) may not be compatible with it.
> 
> I think GadgetFS and Raw Gadget are the only two such drivers?

Yes, that appears to be so.  I didn't realize those were the only 
hold-outs.

> > And as a matter of general principle, allowing UDC drivers to start
> > automatically send Status replies to 0-length control transfers is a
> > step in the wrong direction.  What we really should focus our energy on
> > is getting them to _stop_ sending automatic Status replies to
> > non-zero-length control transfers!
> 
> Ack!
> 
> But I don't think it's within my capability to fix all UDCs,
> considering the issues I mentioned above.

One step at a time...

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-23 15:48                       ` Alan Stern
@ 2023-08-23 17:18                         ` Thinh Nguyen
  2023-08-25  1:36                           ` Andrey Konovalov
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-23 17:18 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andrey Konovalov, Thinh Nguyen, Sebastian Andrzej Siewior,
	Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Wed, Aug 23, 2023, Alan Stern wrote:
> On Wed, Aug 23, 2023 at 04:30:23AM +0200, Andrey Konovalov wrote:
> > I started looking into reworking the UDC drivers to drop the special
> > case for USB_GADGET_DELAYED_STATUS, but this seems more complicated.
> > 
> > First, I noticed that some of the UDC drivers only expect to handle a
> > delayed Status stage for SET_CONFIGURATION requests. (Which is

Just want to clarify that dwc3 would expect a delayed status for any
request returned with USB_GADGET_DELAYED_STATUS. The issue is that dwc3
assumes the _first_ delayed request would be SET_CONFIGURATION. Any
subsequence control request with delayed request, it assumes the device
is already in configured state.

> 
> That expectation is wrong; gadget drivers can also want to delay the 
> Status stage for a SET_INTERFACE request.  And in theory they might want 
> to delay any control-OUT transfer.

Agree. Thanks Andrey and Alan for looking into dwc3.

Regarding SET_INTERFACE, it should be fine because it should be done
while it's already in configured state, which is after
SET_CONFIGURATION. But it's true that dwc3 needs to fix this assumption
here.

> 
> > reasonable, as they were developed assuming that only the composite
> > framework might request to delay the Status stage.) In particular,
> > dwc3, cdns2, and cdns3 set the gadget state to USB_STATE_CONFIGURED
> > when handling a delayed Status stage:
> > 
> > dwc3/ep0.c:136: usb_gadget_set_state(dwc->gadget, USB_STATE_CONFIGURED);
> > cdns3/cdns3-ep0.c:739: usb_gadget_set_state(&priv_dev->gadget,
> > USB_STATE_CONFIGURED);
> > gadget/udc/cdns2/cdns2-ep0.c:572: usb_gadget_set_state(&pdev->gadget,
> > USB_STATE_CONFIGURED);
> 
> This is also wrong.  SET_CONFIGURATION can tell a gadget to install 
> config 0, in which case the state should be changed to 
> USB_STATE_ADDRESS.

Added on my TODO list.

> 
> For that matter, a gadget can undergo many state changes other than the 
> change into the CONFIGURED state.  It doesn't look like many of the UDC 
> drivers are careful about reporting them.
> 
> > So I believe an additional check for whether the request was indeed
> > SET_CONFIGURATION is required. (cdns2 and cdns3 also do other things
> > besides setting the state to USB_STATE_CONFIGURED, but it should be
> > possible to hide that under the same check.)
> > 
> > I also looked into how other UDC drivers change the gadget state to
> > USB_STATE_CONFIGURED:
> > 
> > 1. isp1760, mtu3, and bdc immediately set USB_STATE_CONFIGURED once
> > they receive a SET_CONFIGURATION request, before calling ->setup() for
> > the gadget driver;
> > 2. gr and mv_u3d do that after the ->setup() call;
> > 3. tegra does it after the first non-control endpoint is enabled;
> > 4. dwc3, cdns2, and cdns3 appear to not set USB_STATE_CONFIGURED if
> > the Status stage is not delayed;
> > 5. dwc2, cdnsp, and all other UDCs don't set USB_STATE_CONFIGURED at all.
> > 
> > I'm guessing the UDCs in #4 and #5 expect the gadget driver to set
> > USB_STATE_CONFIGURED.
> > 
> > I see that the composite framework sets the gadget state to
> > USB_STATE_CONFIGURED even if some of the functions request a delayed
> > Status stage via USB_GADGET_DELAYED_STATUS. And GadgetFS also sets the
> > state to USB_STATE_CONFIGURED before delegating the SET_CONFIGURATION
> > request to userspace. However, Raw Gadget expects the userspace to
> > issue an ioctl that sets USB_STATE_CONFIGURED before completing the
> > delayed SET_CONFIGURATION request.
> > 
> > So I am wondering: when is proper time to set USB_STATE_CONFIGURED?
> > And should this be handled by the UDC driver or the gadget driver?
> 
> The proper time isn't really well defined.  As far as the gadget driver 
> is concerned, it's when the configuration change is completed (when it 
> tells the composite framework to stop delaying the status stage).  But 
> as far as the host is concerned, it's when the Status stage completes 
> successfully.
> 
> If the Status stage of the control transfer gets corrupted, it's 
> possible to end up in a situation where the gadget believes it is 
> configured and the host believes it isn't.  Luckily this doesn't 
> happen very often, and if it does then the host should reissue the 
> transfer.
> 
> All the other state changes are (or should be) handled by the UDC 
> drivers.  I guess they can handle the changes to/from the CONFIGURED 
> state as well, although they will have to be more careful about it than 
> they are now.
> 

The dwc3 tries to handle these state changes. However, as pointed out,
it needs to be audited. We got a little over reliant on USB CV tests to
audit these state changes for us.

Thanks,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-23 15:17                   ` Alan Stern
@ 2023-08-23 17:59                     ` Thinh Nguyen
  2023-08-23 19:19                       ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-23 17:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Wed, Aug 23, 2023, Alan Stern wrote:
> On Wed, Aug 23, 2023 at 02:14:32AM +0000, Thinh Nguyen wrote:
> > On Sun, Aug 20, 2023, Alan Stern wrote:
> > > On Sat, Aug 19, 2023 at 12:06:53AM +0000, Thinh Nguyen wrote:
> > > > On Fri, Aug 18, 2023, Alan Stern wrote:
> > > > > Actually I agree with you.  When a new SETUP packet arrives before the 
> > > > > old control transfer has finished, the UDC driver should cancel all 
> > > > > pending requests and then invoke the ->setup() callback.  (I don't think 
> > > > > there is a standard error code for the cancelled requests; net2280 seems 
> > > > > to use -EPROTO whereas dummy-hcd seems to use -EOVERFLOW.)
> > > > 
> > > > Those are very odd choice of error codes for cancelled request. Even
> > > > though the gadget side doesn't have very defined error codes, I try to
> > > > follow the equivalent doc from the host side
> > > > (driver-api/usb/error-codes.rst), which is -ECONNRESET.
> > > > 
> > > > Whenever I see -EPROTO, I associate that to a specific host error:
> > > > transaction error. For -EOVERFLOW, I think of babble errors.
> > > 
> > > Do you have a suggestion for an error code that all the UDCs should use 
> > > in this situation?  -ECONNRESET is currently being used for requests 
> > > that were cancelled by usb_ep_dequeue().  Would -EREMOTEIO be more 
> > > suitable for requests attached to an aborted control transfer?
> > > 
> > 
> > That works. It would be great if we can document the usb error codes for
> > gadget side and keep them consistent across UDC drivers. I think there
> > are only a few common errors:
> > 
> > * Request aborted
> > * Request dequeued
> > * STALL
> > * Data dropped (isoc only)
> > 
> > Any other?
> 
> * Overflow
> 
> STALL is not a valid status for usb_requests on the gadget side; it 
> applies only on the host side (the host doesn't halt its endpoints).

The host can send a CLEAR_FEATURE(halt_ep). This will reset the data
sequence of the endpoint. In xhci spec (section 4.6.8), it suggests to
send this when the endpoint is reset. The endpoint is reset typically
when there's a transaction error.

The problem here is that typical protocol spec like MSC/UVC don't
specify how to handle CLEAR_FEATURE(halt_ep).

For Windows MSC driver, when the host recovers from the transaction
error, it sends CLEAR_FEATURE(halt_ep) and expects the transfer to be
cancelled. To synchronize with the host, the gadget driver needs to
cancel the request. Dwc3 needs to notify the gadget driver of this.

For other class driver, it may expect the transfer to resume after data
sequence reset.

As a result, for an endpoint that's STALL (or not), and if the host
sends CLEAR_FEATURE(halt_ep), the dwc3 returns the request with some
status code and let the gadget driver handle it. If the gadget driver
wants to cancel the transfer, it can drop the transfer. If the gadget
driver wants to resume, it can requeue the same requests with the saved
status to resume where it left off.

> 
> Requests can be aborted for two different reasons:
> 
> 	The endpoint is being disabled, either by an config/altsetting
> 	change or because of a disconnect
> 
> 	The request was for an aborted control transfer
> 
> Putting this together, I get the following status codes:
> 
> -ESHUTDOWN	Request aborted because ep was disabled
> -EREMOTEIO	Request was for an aborted control transfer
> -ECONNRESET	Request was cancelled by usb_ep_dequeue()
> -EXDEV		Data dropped (isoc only)
> -EOVERFLOW	The host sent more data than the request wanted
> 		(will never happen if the request's length is a
> 		nonzero multiple of the maxpacket size)
> 
> This applies only to the .status field of struct usb_request.  Calls to 
> usb_ep_queue() may return different error codes.
> 
> How does that sound?
> 

That looks great!

Thanks,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-23 17:59                     ` Thinh Nguyen
@ 2023-08-23 19:19                       ` Alan Stern
  2023-08-23 22:22                         ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-23 19:19 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Wed, Aug 23, 2023 at 05:59:07PM +0000, Thinh Nguyen wrote:
> On Wed, Aug 23, 2023, Alan Stern wrote:
> > STALL is not a valid status for usb_requests on the gadget side; it 
> > applies only on the host side (the host doesn't halt its endpoints).
> 
> The host can send a CLEAR_FEATURE(halt_ep). This will reset the data
> sequence of the endpoint. In xhci spec (section 4.6.8), it suggests to
> send this when the endpoint is reset. The endpoint is reset typically
> when there's a transaction error.

It's important to be careful about the distinction between an actual 
endpoint in the gadget and the logical representation of an endpoint 
inside a host controller.  The host cannot reset the first; it can only 
reset the second.

So yes, the usb_clear_halt() routine on the host does a 
CLEAR_FEATURE(HALT) control transfer and then calls 
usb_reset_endpoint(), which calls usb_hcd_reset_endpoint().

> The problem here is that typical protocol spec like MSC/UVC don't
> specify how to handle CLEAR_FEATURE(halt_ep).

MSC does specify this.  I don't know about UVC.

> For Windows MSC driver, when the host recovers from the transaction
> error, it sends CLEAR_FEATURE(halt_ep) and expects the transfer to be
> cancelled. To synchronize with the host, the gadget driver needs to
> cancel the request. Dwc3 needs to notify the gadget driver of this.

No, that's not what happens in the Mass Storage Class.

For the Bulk-Only Transport version of MSC, when a Windows or Linux host 
detects a transaction error, it performs a USB port reset.  This clears 
all the state on the gadget.  The gadget gets re-enumerated, and the 
host proceeds to re-issue the MSC command.  The gadget driver doesn't 
need any special notifications; outstanding requests get cancelled as a 
normal part of the reset handling.

(In fact, this is not what the BOT spec says to do.  It says that when 
the host detects a transaction error, it should a Bulk-Only Mass Storage 
Reset -- this is a special class-specific control transfer.  In 
response, the gadget driver is supposed to reset its internal state and 
cancel all of its outstanding requests.  Then the host issues 
CLEAR_FEATURE(HALT) to both the bulk-IN and bulk-OUT endpoints and 
proceeds to issue its next MSC command.  A lot of MSC devices don't 
handle this properly, probably because Windows didn't use this 
approach.)

In the UAS version of MSC, the endpoints never halt.  If there's a 
transaction error, the host simply re-issues the transaction.  If that 
fails too, error recovery is started by the SCSI layer; it involves a 
USB port reset.

But as you can see, in each case the UDC driver doesn't have to cancel 
anything in particular when it gets a Clear-Halt.

> For other class driver, it may expect the transfer to resume after data
> sequence reset.

Indeed.  In which case, the UDC driver shouldn't cancel anything.

> As a result, for an endpoint that's STALL (or not), and if the host
> sends CLEAR_FEATURE(halt_ep), the dwc3 returns the request with some
> status code and let the gadget driver handle it. If the gadget driver
> wants to cancel the transfer, it can drop the transfer. If the gadget
> driver wants to resume, it can requeue the same requests with the saved
> status to resume where it left off.

The UDC driver should not dequeue a request merely because the endpoint 
is halted.  The gadget driver can take care of everything necessary.  
After all, it knows when an endpoint gets halted, because the gadget 
driver is what calls usb_ep_set_halt() or usb_ep_set_wedge() in the 
first place.

As for handling CLEAR_FEATURE(HALT), all the UDC driver needs to do is 
clear the HALT feature for the endpoint.  (Although if the endpoint is 
wedged, the HALT feature should not be cleared.)  It doesn't need to 
cancel any outstanding requests or inform the gadget driver in any way.

(Again, this is something that a lot of USB devices don't handle 
properly.  They get very confused if the host sends a Clear-Halt 
transfer for an endpoint that isn't halted.)

> > Putting this together, I get the following status codes:
> > 
> > -ESHUTDOWN	Request aborted because ep was disabled
> > -EREMOTEIO	Request was for an aborted control transfer
> > -ECONNRESET	Request was cancelled by usb_ep_dequeue()
> > -EXDEV		Data dropped (isoc only)
> > -EOVERFLOW	The host sent more data than the request wanted
> > 		(will never happen if the request's length is a
> > 		nonzero multiple of the maxpacket size)
> > 
> > This applies only to the .status field of struct usb_request.  Calls to 
> > usb_ep_queue() may return different error codes.
> > 
> > How does that sound?
> > 
> 
> That looks great!

At some point I'll write a patch adding this to the documentation.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-23 19:19                       ` Alan Stern
@ 2023-08-23 22:22                         ` Thinh Nguyen
  2023-08-24  2:21                           ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-23 22:22 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Wed, Aug 23, 2023, Alan Stern wrote:
> On Wed, Aug 23, 2023 at 05:59:07PM +0000, Thinh Nguyen wrote:
> > On Wed, Aug 23, 2023, Alan Stern wrote:
> > > STALL is not a valid status for usb_requests on the gadget side; it 
> > > applies only on the host side (the host doesn't halt its endpoints).
> > 
> > The host can send a CLEAR_FEATURE(halt_ep). This will reset the data
> > sequence of the endpoint. In xhci spec (section 4.6.8), it suggests to
> > send this when the endpoint is reset. The endpoint is reset typically
> > when there's a transaction error.
> 
> It's important to be careful about the distinction between an actual 
> endpoint in the gadget and the logical representation of an endpoint 
> inside a host controller.  The host cannot reset the first; it can only 
> reset the second.
> 
> So yes, the usb_clear_halt() routine on the host does a 
> CLEAR_FEATURE(HALT) control transfer and then calls 
> usb_reset_endpoint(), which calls usb_hcd_reset_endpoint().
> 
> > The problem here is that typical protocol spec like MSC/UVC don't
> > specify how to handle CLEAR_FEATURE(halt_ep).
> 
> MSC does specify this.  I don't know about UVC.

No, from what I last recalled, it doesn't clearly define what should
happen here. It just indicates ClearFeature(halt_ep) for reset recovery.
However, the "reset recovery" can be implementation specific for how the
host can synchronize with the device.

> 
> > For Windows MSC driver, when the host recovers from the transaction
> > error, it sends CLEAR_FEATURE(halt_ep) and expects the transfer to be
> > cancelled. To synchronize with the host, the gadget driver needs to
> > cancel the request. Dwc3 needs to notify the gadget driver of this.
> 
> No, that's not what happens in the Mass Storage Class.
> 
> For the Bulk-Only Transport version of MSC, when a Windows or Linux host 
> detects a transaction error, it performs a USB port reset.  This clears 

No, that's implementation specific for reset recovery. Typically for
Windows, for the first recovery, it sends a ClearFeature(halt_ep) and
sends a new MSC command. If the transfer doesn't complete within a
specific time, there will be a timeout and a port reset, which is
another level of recovery.

> all the state on the gadget.  The gadget gets re-enumerated, and the 
> host proceeds to re-issue the MSC command.  The gadget driver doesn't 
> need any special notifications; outstanding requests get cancelled as a 
> normal part of the reset handling.
> 
> (In fact, this is not what the BOT spec says to do.  It says that when 
> the host detects a transaction error, it should a Bulk-Only Mass Storage 
> Reset -- this is a special class-specific control transfer.  In 
> response, the gadget driver is supposed to reset its internal state and 
> cancel all of its outstanding requests.  Then the host issues 
> CLEAR_FEATURE(HALT) to both the bulk-IN and bulk-OUT endpoints and 
> proceeds to issue its next MSC command.  A lot of MSC devices don't 
> handle this properly, probably because Windows didn't use this 
> approach.)

At the moment, the gadget driver doesn't handle CLEAR_FEATURE(halt_ep),
the UDC driver does. I don't recall this being handled in the composite
framework or in the f_mass_storage function driver. Unless we change
this, the UDC driver needs to notify the gadget driver somehow.

> 
> In the UAS version of MSC, the endpoints never halt.  If there's a 
> transaction error, the host simply re-issues the transaction.  If that 

There are multiple levels of recovery. Different driver handles it
differently. For xHCI, Initially there's retry at the packet level
(typically set to retry 3 times in a row). If it fails, host controller
driver will get a transaction error event.

In Linux xHCI, the recovery for transaction error we perform soft-reset
(xhci reset ep command with TSP=1). If it still fails, we reset the
endpoint (TSP=0) and return the request with -EPROTO to the class
driver. However, we don't send ClearFeature(halt_ep). I don't recall
Linux MSC driver handle -EPROTO and do a port reset. However it does do
a port reset due to transfer timeout.

In Windows, it doesn't do soft-reset, but it does reset endpoint (TSP=0)
and send CLEAR_FEATURE(halt_ep) without port reset initially. It then
can send the a new MSC command expecting the device to be in sync based
on the CLEAR_FEATURE(halt_ep) request. If the recovery fails and the
transfer/command timed out, it will then do a port reset to recover.

> fails too, error recovery is started by the SCSI layer; it involves a 
> USB port reset.
> 
> But as you can see, in each case the UDC driver doesn't have to cancel 
> anything in particular when it gets a Clear-Halt.
> 
> > For other class driver, it may expect the transfer to resume after data
> > sequence reset.
> 
> Indeed.  In which case, the UDC driver shouldn't cancel anything.
> 
> > As a result, for an endpoint that's STALL (or not), and if the host
> > sends CLEAR_FEATURE(halt_ep), the dwc3 returns the request with some
> > status code and let the gadget driver handle it. If the gadget driver
> > wants to cancel the transfer, it can drop the transfer. If the gadget
> > driver wants to resume, it can requeue the same requests with the saved
> > status to resume where it left off.
> 
> The UDC driver should not dequeue a request merely because the endpoint 
> is halted.  The gadget driver can take care of everything necessary.  
> After all, it knows when an endpoint gets halted, because the gadget 

No, currently it doesn't know. That's the problem. The dwc3 driver
handles the CLEAR_FEATURE(halt_ep), not the gadget driver.

> driver is what calls usb_ep_set_halt() or usb_ep_set_wedge() in the 
> first place.
> 
> As for handling CLEAR_FEATURE(HALT), all the UDC driver needs to do is 
> clear the HALT feature for the endpoint.  (Although if the endpoint is 
> wedged, the HALT feature should not be cleared.)  It doesn't need to 
> cancel any outstanding requests or inform the gadget driver in any way.

The UDC driver needs to notify the gadget driver somehow, cancelling the
request and give it back is currently the way dwc3 handling it.

> 
> (Again, this is something that a lot of USB devices don't handle 
> properly.  They get very confused if the host sends a Clear-Halt 
> transfer for an endpoint that isn't halted.)
> 
> > > Putting this together, I get the following status codes:
> > > 
> > > -ESHUTDOWN	Request aborted because ep was disabled
> > > -EREMOTEIO	Request was for an aborted control transfer
> > > -ECONNRESET	Request was cancelled by usb_ep_dequeue()
> > > -EXDEV		Data dropped (isoc only)
> > > -EOVERFLOW	The host sent more data than the request wanted
> > > 		(will never happen if the request's length is a
> > > 		nonzero multiple of the maxpacket size)
> > > 
> > > This applies only to the .status field of struct usb_request.  Calls to 
> > > usb_ep_queue() may return different error codes.
> > > 
> > > How does that sound?
> > > 
> > 
> > That looks great!
> 
> At some point I'll write a patch adding this to the documentation.
> 

Thanks!
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-23 22:22                         ` Thinh Nguyen
@ 2023-08-24  2:21                           ` Alan Stern
  2023-08-26  1:20                             ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-24  2:21 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Wed, Aug 23, 2023 at 10:22:07PM +0000, Thinh Nguyen wrote:
> On Wed, Aug 23, 2023, Alan Stern wrote:
> > On Wed, Aug 23, 2023 at 05:59:07PM +0000, Thinh Nguyen wrote:
> > > On Wed, Aug 23, 2023, Alan Stern wrote:
> > > > STALL is not a valid status for usb_requests on the gadget side; it 
> > > > applies only on the host side (the host doesn't halt its endpoints).
> > > 
> > > The host can send a CLEAR_FEATURE(halt_ep). This will reset the data
> > > sequence of the endpoint. In xhci spec (section 4.6.8), it suggests to
> > > send this when the endpoint is reset. The endpoint is reset typically
> > > when there's a transaction error.
> > 
> > It's important to be careful about the distinction between an actual 
> > endpoint in the gadget and the logical representation of an endpoint 
> > inside a host controller.  The host cannot reset the first; it can only 
> > reset the second.
> > 
> > So yes, the usb_clear_halt() routine on the host does a 
> > CLEAR_FEATURE(HALT) control transfer and then calls 
> > usb_reset_endpoint(), which calls usb_hcd_reset_endpoint().
> > 
> > > The problem here is that typical protocol spec like MSC/UVC don't
> > > specify how to handle CLEAR_FEATURE(halt_ep).
> > 
> > MSC does specify this.  I don't know about UVC.
> 
> No, from what I last recalled, it doesn't clearly define what should
> happen here. It just indicates ClearFeature(halt_ep) for reset recovery.
> However, the "reset recovery" can be implementation specific for how the
> host can synchronize with the device.

Read the BOT spec.  I quote some of the relevant parts below.

> > > For Windows MSC driver, when the host recovers from the transaction
> > > error, it sends CLEAR_FEATURE(halt_ep) and expects the transfer to be
> > > cancelled. To synchronize with the host, the gadget driver needs to
> > > cancel the request. Dwc3 needs to notify the gadget driver of this.
> > 
> > No, that's not what happens in the Mass Storage Class.
> > 
> > For the Bulk-Only Transport version of MSC, when a Windows or Linux host 
> > detects a transaction error, it performs a USB port reset.  This clears 
> 
> No, that's implementation specific for reset recovery. Typically for

I haven't tested recent versions of Windows.  Older versions did behave 
this way.  I still have the logs to prove it.

> Windows, for the first recovery, it sends a ClearFeature(halt_ep) and
> sends a new MSC command.

That's a violation of the BOT spec.  Are you sure Windows really does 
this?

>  If the transfer doesn't complete within a
> specific time, there will be a timeout and a port reset, which is
> another level of recovery.
> 
> > all the state on the gadget.  The gadget gets re-enumerated, and the 
> > host proceeds to re-issue the MSC command.  The gadget driver doesn't 
> > need any special notifications; outstanding requests get cancelled as a 
> > normal part of the reset handling.
> > 
> > (In fact, this is not what the BOT spec says to do.  It says that when 
> > the host detects a transaction error, it should a Bulk-Only Mass Storage 
> > Reset -- this is a special class-specific control transfer.  In 
> > response, the gadget driver is supposed to reset its internal state and 
> > cancel all of its outstanding requests.  Then the host issues 
> > CLEAR_FEATURE(HALT) to both the bulk-IN and bulk-OUT endpoints and 
> > proceeds to issue its next MSC command.  A lot of MSC devices don't 
> > handle this properly, probably because Windows didn't use this 
> > approach.)
> 
> At the moment, the gadget driver doesn't handle CLEAR_FEATURE(halt_ep),
> the UDC driver does.

Correct.  My point is that it works this way because the gadget driver 
doesn't _need_ to handle Clear-Halt.

>   I don't recall this being handled in the composite
> framework or in the f_mass_storage function driver. Unless we change
> this, the UDC driver needs to notify the gadget driver somehow.

No, f_mass_storage does not need to be notified about Clear-Halts.  As 
you say, it isn't getting notified now, and yet it somehow still manages 
to work with every type of host I'm aware of.

> > In the UAS version of MSC, the endpoints never halt.  If there's a 
> > transaction error, the host simply re-issues the transaction.  If that 
> 
> There are multiple levels of recovery. Different driver handles it
> differently. For xHCI, Initially there's retry at the packet level
> (typically set to retry 3 times in a row). If it fails, host controller
> driver will get a transaction error event.

That 3-strikes-and-you're-out thing is the normal USB low-level retry 
mechanism.  We're talking about what happens when it fails and the HCD 
reports a transaction error such as -EPROTO.

> In Linux xHCI, the recovery for transaction error we perform soft-reset
> (xhci reset ep command with TSP=1). If it still fails, we reset the
> endpoint (TSP=0) and return the request with -EPROTO to the class
> driver. However, we don't send ClearFeature(halt_ep). I don't recall
> Linux MSC driver handle -EPROTO and do a port reset. However it does do
> a port reset due to transfer timeout.

In usb-storage, a -EPROTO error status causes interpret_urb_result() to 
return USB_STOR_XFER_ERROR.  This causes usb_stor_invoke_transport() to 
goto Handle_Errors:, which calls usb_stor_port_reset().

In uas, a -EPROTO error will cause an error status to be returned to the 
SCSI layer, which will invoke the SCSI error handler.  After enough 
failures it will call the uas device-reset handler, and 
uas_eh_device_reset_handler() calls usb_reset_device().

> In Windows, it doesn't do soft-reset, but it does reset endpoint (TSP=0)
> and send CLEAR_FEATURE(halt_ep) without port reset initially. It then
> can send the a new MSC command expecting the device to be in sync based
> on the CLEAR_FEATURE(halt_ep) request.

That is not how the Bulk-Only Transport protocol resynchronizes after a 
protocol error.  The BOT spec mentions in several places variations of:

	If the host detects a STALL of the Bulk-Out endpoint during 
	command transport, the host shall respond with a Reset Recovery
	(see 5.3.4 - Reset Recovery).

It doesn't say specifically what to do in case of other lower-level 
protocol errors, but we have to assume that the intention is for the 
host to follow the Reset Recovery procedure, because that's what the 
device will expect to see.  The spec goes on to say:

	5.3.4	Reset Recovery

	For Reset Recovery the host shall issue in the following order:
		(a) a Bulk-Only Mass Storage Reset
		(b) a Clear Feature HALT to the Bulk-In endpoint
		(c) a Clear Feature HALT to the Bulk-Out endpoint

It most definitely does _not_ say that the host should do a Clear-Halt 
without the Bulk-Only Mass Storage Reset.  By reading the spec carefully 
you can see that such action would leave the host out of sync with the 
device.

>  If the recovery fails and the
> transfer/command timed out, it will then do a port reset to recover.
> 
> > fails too, error recovery is started by the SCSI layer; it involves a 
> > USB port reset.
> > 
> > But as you can see, in each case the UDC driver doesn't have to cancel 
> > anything in particular when it gets a Clear-Halt.
> > 
> > > For other class driver, it may expect the transfer to resume after data
> > > sequence reset.
> > 
> > Indeed.  In which case, the UDC driver shouldn't cancel anything.
> > 
> > > As a result, for an endpoint that's STALL (or not), and if the host
> > > sends CLEAR_FEATURE(halt_ep), the dwc3 returns the request with some
> > > status code and let the gadget driver handle it. If the gadget driver
> > > wants to cancel the transfer, it can drop the transfer. If the gadget
> > > driver wants to resume, it can requeue the same requests with the saved
> > > status to resume where it left off.
> > 
> > The UDC driver should not dequeue a request merely because the endpoint 
> > is halted.  The gadget driver can take care of everything necessary.  
> > After all, it knows when an endpoint gets halted, because the gadget 
> 
> No, currently it doesn't know. That's the problem. The dwc3 driver
> handles the CLEAR_FEATURE(halt_ep), not the gadget driver.

You misunderstood what I wrote.  I said that the gadget driver knows 
when an endpoint's halt feature gets _set_; I didn't say that it knows 
when the halt feature gets _cleared_.

(There is one exception: The gadget driver won't know if the host sends 
a SET_FEATURE(halt_ep).  Hosts don't normally do this and I don't think 
we need to worry about it.)

> > driver is what calls usb_ep_set_halt() or usb_ep_set_wedge() in the 
> > first place.
> > 
> > As for handling CLEAR_FEATURE(HALT), all the UDC driver needs to do is 
> > clear the HALT feature for the endpoint.  (Although if the endpoint is 
> > wedged, the HALT feature should not be cleared.)  It doesn't need to 
> > cancel any outstanding requests or inform the gadget driver in any way.
> 
> The UDC driver needs to notify the gadget driver somehow, cancelling the
> request and give it back is currently the way dwc3 handling it.

And I'm saying that the UDC driver does _not_ need to notify the gadget 
driver.

The MSC gadget driver works just fine without any such notification.  
Can you name any gadget driver that _does_ need a notification?

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-23 17:18                         ` Thinh Nguyen
@ 2023-08-25  1:36                           ` Andrey Konovalov
  2023-08-25  2:08                             ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Andrey Konovalov @ 2023-08-25  1:36 UTC (permalink / raw)
  To: Thinh Nguyen, Alan Stern
  Cc: Sebastian Andrzej Siewior, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Wed, Aug 23, 2023 at 7:18 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>
> > > First, I noticed that some of the UDC drivers only expect to handle a
> > > delayed Status stage for SET_CONFIGURATION requests. (Which is
>
> Just want to clarify that dwc3 would expect a delayed status for any
> request returned with USB_GADGET_DELAYED_STATUS. The issue is that dwc3
> assumes the _first_ delayed request would be SET_CONFIGURATION. Any
> subsequence control request with delayed request, it assumes the device
> is already in configured state.
>
> > That expectation is wrong; gadget drivers can also want to delay the
> > Status stage for a SET_INTERFACE request.  And in theory they might want
> > to delay any control-OUT transfer.
>
> Agree. Thanks Andrey and Alan for looking into dwc3.
>
> Regarding SET_INTERFACE, it should be fine because it should be done
> while it's already in configured state, which is after
> SET_CONFIGURATION. But it's true that dwc3 needs to fix this assumption
> here.

Ack.

Thank you Thinh and Alan for taking the time to look into this!

So to summarize the issue wrt the 0-length requests:

1. UDC drivers must always delay the status stage for 0-length
requests until the gadget driver queues an empty request.

2. Many UDC drivers do not do this, and only delay the status stage
when USB_GADGET_DELAYED_STATUS is returned from ->setup(). Some
drivers also assume that only a SET_CONFIGURATION request can be
delayed.

3. All such UDC drivers should be fixed and USB_GADGET_DELAYED_STATUS
should be contained within the composite framework.

4. Fixing all such UDC drivers is a non-trivial amount of work, but
this is the goal to strive towards.

Alan, would it be acceptable if I add custom handling of
USB_GADGET_DELAYED_STATUS to Raw Gadget in the meantime? It would be
great to keep it at least somewhat working with dwc3. I can also do it
for GadgetFS, if you think it's a good idea.

I can also add some clarifying comments for USB_GADGET_DELAYED_STATUS
and ->setup() to hopefully avoid having new UDC drivers being added
with the same issue (e.g. cdns2 and renesas_usbf with the same issue
were added just recently).

Maybe it's also a good idea to add a checkpatch.pl check for using
USB_GADGET_DELAYED_STATUS in UDC drivers. Or maybe ask Greg to keep an
eye out for this?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-25  1:36                           ` Andrey Konovalov
@ 2023-08-25  2:08                             ` Alan Stern
  0 siblings, 0 replies; 37+ messages in thread
From: Alan Stern @ 2023-08-25  2:08 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Thinh Nguyen, Sebastian Andrzej Siewior, Felipe Balbi,
	Greg Kroah-Hartman, USB list, LKML

On Fri, Aug 25, 2023 at 03:36:36AM +0200, Andrey Konovalov wrote:
> So to summarize the issue wrt the 0-length requests:
> 
> 1. UDC drivers must always delay the status stage for 0-length
> requests until the gadget driver queues an empty request.

Any request, not necessarily empty (although it should be).  The request 
queued by the gadget driver _is_ the Status stage response; in this 
situation the UDC driver doesn't create one automatically.

> 2. Many UDC drivers do not do this, and only delay the status stage
> when USB_GADGET_DELAYED_STATUS is returned from ->setup(). Some
> drivers also assume that only a SET_CONFIGURATION request can be
> delayed.
> 
> 3. All such UDC drivers should be fixed and USB_GADGET_DELAYED_STATUS
> should be contained within the composite framework.
> 
> 4. Fixing all such UDC drivers is a non-trivial amount of work, but
> this is the goal to strive towards.
> 
> Alan, would it be acceptable if I add custom handling of
> USB_GADGET_DELAYED_STATUS to Raw Gadget in the meantime? It would be

Sure.  It's your driver; do whatever you want with it.  :-)

> great to keep it at least somewhat working with dwc3. I can also do it
> for GadgetFS, if you think it's a good idea.

I suspect gadgetfs doesn't need it.  But go ahead and look through the 
code to check for yourself; I might be wrong.

> I can also add some clarifying comments for USB_GADGET_DELAYED_STATUS
> and ->setup() to hopefully avoid having new UDC drivers being added
> with the same issue (e.g. cdns2 and renesas_usbf with the same issue
> were added just recently).

Good idea, although mistakes like this tend to propagate more through 
copy-and-paste than by failures of documentation.

> Maybe it's also a good idea to add a checkpatch.pl check for using
> USB_GADGET_DELAYED_STATUS in UDC drivers. Or maybe ask Greg to keep an
> eye out for this?

I wouldn't try to modify checkpatch.pl.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-24  2:21                           ` Alan Stern
@ 2023-08-26  1:20                             ` Thinh Nguyen
  2023-08-26  3:10                               ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-26  1:20 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

Sorry for the delay response.

On Wed, Aug 23, 2023, Alan Stern wrote:
> On Wed, Aug 23, 2023 at 10:22:07PM +0000, Thinh Nguyen wrote:
> > On Wed, Aug 23, 2023, Alan Stern wrote:
> > > On Wed, Aug 23, 2023 at 05:59:07PM +0000, Thinh Nguyen wrote:
> > > > On Wed, Aug 23, 2023, Alan Stern wrote:
> > > > > STALL is not a valid status for usb_requests on the gadget side; it 
> > > > > applies only on the host side (the host doesn't halt its endpoints).
> > > > 
> > > > The host can send a CLEAR_FEATURE(halt_ep). This will reset the data
> > > > sequence of the endpoint. In xhci spec (section 4.6.8), it suggests to
> > > > send this when the endpoint is reset. The endpoint is reset typically
> > > > when there's a transaction error.
> > > 
> > > It's important to be careful about the distinction between an actual 
> > > endpoint in the gadget and the logical representation of an endpoint 
> > > inside a host controller.  The host cannot reset the first; it can only 
> > > reset the second.
> > > 
> > > So yes, the usb_clear_halt() routine on the host does a 
> > > CLEAR_FEATURE(HALT) control transfer and then calls 
> > > usb_reset_endpoint(), which calls usb_hcd_reset_endpoint().
> > > 
> > > > The problem here is that typical protocol spec like MSC/UVC don't
> > > > specify how to handle CLEAR_FEATURE(halt_ep).
> > > 
> > > MSC does specify this.  I don't know about UVC.
> > 
> > No, from what I last recalled, it doesn't clearly define what should
> > happen here. It just indicates ClearFeature(halt_ep) for reset recovery.
> > However, the "reset recovery" can be implementation specific for how the
> > host can synchronize with the device.
> 
> Read the BOT spec.  I quote some of the relevant parts below.
> 
> > > > For Windows MSC driver, when the host recovers from the transaction
> > > > error, it sends CLEAR_FEATURE(halt_ep) and expects the transfer to be
> > > > cancelled. To synchronize with the host, the gadget driver needs to
> > > > cancel the request. Dwc3 needs to notify the gadget driver of this.
> > > 
> > > No, that's not what happens in the Mass Storage Class.
> > > 
> > > For the Bulk-Only Transport version of MSC, when a Windows or Linux host 
> > > detects a transaction error, it performs a USB port reset.  This clears 
> > 
> > No, that's implementation specific for reset recovery. Typically for
> 
> I haven't tested recent versions of Windows.  Older versions did behave 
> this way.  I still have the logs to prove it.
> 
> > Windows, for the first recovery, it sends a ClearFeature(halt_ep) and
> > sends a new MSC command.
> 
> That's a violation of the BOT spec.  Are you sure Windows really does 
> this?

This was actually from UASP, not BOT. Sorry for not being clear.

> 
> >  If the transfer doesn't complete within a
> > specific time, there will be a timeout and a port reset, which is
> > another level of recovery.
> > 
> > > all the state on the gadget.  The gadget gets re-enumerated, and the 
> > > host proceeds to re-issue the MSC command.  The gadget driver doesn't 
> > > need any special notifications; outstanding requests get cancelled as a 
> > > normal part of the reset handling.
> > > 
> > > (In fact, this is not what the BOT spec says to do.  It says that when 
> > > the host detects a transaction error, it should a Bulk-Only Mass Storage 
> > > Reset -- this is a special class-specific control transfer.  In 
> > > response, the gadget driver is supposed to reset its internal state and 
> > > cancel all of its outstanding requests.  Then the host issues 
> > > CLEAR_FEATURE(HALT) to both the bulk-IN and bulk-OUT endpoints and 
> > > proceeds to issue its next MSC command.  A lot of MSC devices don't 
> > > handle this properly, probably because Windows didn't use this 
> > > approach.)
> > 
> > At the moment, the gadget driver doesn't handle CLEAR_FEATURE(halt_ep),
> > the UDC driver does.
> 
> Correct.  My point is that it works this way because the gadget driver 
> doesn't _need_ to handle Clear-Halt.
> 
> >   I don't recall this being handled in the composite
> > framework or in the f_mass_storage function driver. Unless we change
> > this, the UDC driver needs to notify the gadget driver somehow.
> 
> No, f_mass_storage does not need to be notified about Clear-Halts.  As 
> you say, it isn't getting notified now, and yet it somehow still manages 
> to work with every type of host I'm aware of.
> 
> > > In the UAS version of MSC, the endpoints never halt.  If there's a 
> > > transaction error, the host simply re-issues the transaction.  If that 
> > 
> > There are multiple levels of recovery. Different driver handles it
> > differently. For xHCI, Initially there's retry at the packet level
> > (typically set to retry 3 times in a row). If it fails, host controller
> > driver will get a transaction error event.
> 
> That 3-strikes-and-you're-out thing is the normal USB low-level retry 
> mechanism.  We're talking about what happens when it fails and the HCD 
> reports a transaction error such as -EPROTO.
> 
> > In Linux xHCI, the recovery for transaction error we perform soft-reset
> > (xhci reset ep command with TSP=1). If it still fails, we reset the
> > endpoint (TSP=0) and return the request with -EPROTO to the class
> > driver. However, we don't send ClearFeature(halt_ep). I don't recall
> > Linux MSC driver handle -EPROTO and do a port reset. However it does do
> > a port reset due to transfer timeout.
> 
> In usb-storage, a -EPROTO error status causes interpret_urb_result() to 
> return USB_STOR_XFER_ERROR.  This causes usb_stor_invoke_transport() to 
> goto Handle_Errors:, which calls usb_stor_port_reset().
> 
> In uas, a -EPROTO error will cause an error status to be returned to the 
> SCSI layer, which will invoke the SCSI error handler.  After enough 
> failures it will call the uas device-reset handler, and 
> uas_eh_device_reset_handler() calls usb_reset_device().

From my testing with UASP, if I recall correctly, there's a 30 second
timeout before the reset handler kicks in.

> 
> > In Windows, it doesn't do soft-reset, but it does reset endpoint (TSP=0)
> > and send CLEAR_FEATURE(halt_ep) without port reset initially. It then
> > can send the a new MSC command expecting the device to be in sync based
> > on the CLEAR_FEATURE(halt_ep) request.
> 
> That is not how the Bulk-Only Transport protocol resynchronizes after a 
> protocol error.  The BOT spec mentions in several places variations of:
> 
> 	If the host detects a STALL of the Bulk-Out endpoint during 
> 	command transport, the host shall respond with a Reset Recovery
> 	(see 5.3.4 - Reset Recovery).
> 
> It doesn't say specifically what to do in case of other lower-level 
> protocol errors, but we have to assume that the intention is for the 
> host to follow the Reset Recovery procedure, because that's what the 
> device will expect to see.  The spec goes on to say:
> 
> 	5.3.4	Reset Recovery
> 
> 	For Reset Recovery the host shall issue in the following order:
> 		(a) a Bulk-Only Mass Storage Reset
> 		(b) a Clear Feature HALT to the Bulk-In endpoint
> 		(c) a Clear Feature HALT to the Bulk-Out endpoint
> 
> It most definitely does _not_ say that the host should do a Clear-Halt 
> without the Bulk-Only Mass Storage Reset.  By reading the spec carefully 
> you can see that such action would leave the host out of sync with the 
> device.

Again, sorry for not being clear here. The tests we did was against
UASP.

From the xHCI spec, it suggests to issue a CLEAR_FEATURE(halt_ep) after
the endpoint reset (e.g. from transaction error). Whether this action
should be taken from the class driver or from the xHCI driver, it's not
clear. However, as you said, without Bulk-Only Mass Storage Reset
request, the host and device will be out of sync. The response action
taken from xHCI is independent from MSC protocol. So it would make sense
for the UDC driver to treat CLEAR_FEATURE(halt_ep) as such and not
expect a Bulk-Only Mass Storage Reset or the equivalent.

> 
> >  If the recovery fails and the
> > transfer/command timed out, it will then do a port reset to recover.
> > 
> > > fails too, error recovery is started by the SCSI layer; it involves a 
> > > USB port reset.
> > > 
> > > But as you can see, in each case the UDC driver doesn't have to cancel 
> > > anything in particular when it gets a Clear-Halt.
> > > 
> > > > For other class driver, it may expect the transfer to resume after data
> > > > sequence reset.
> > > 
> > > Indeed.  In which case, the UDC driver shouldn't cancel anything.
> > > 
> > > > As a result, for an endpoint that's STALL (or not), and if the host
> > > > sends CLEAR_FEATURE(halt_ep), the dwc3 returns the request with some
> > > > status code and let the gadget driver handle it. If the gadget driver
> > > > wants to cancel the transfer, it can drop the transfer. If the gadget
> > > > driver wants to resume, it can requeue the same requests with the saved
> > > > status to resume where it left off.
> > > 
> > > The UDC driver should not dequeue a request merely because the endpoint 
> > > is halted.  The gadget driver can take care of everything necessary.  
> > > After all, it knows when an endpoint gets halted, because the gadget 
> > 
> > No, currently it doesn't know. That's the problem. The dwc3 driver
> > handles the CLEAR_FEATURE(halt_ep), not the gadget driver.
> 
> You misunderstood what I wrote.  I said that the gadget driver knows 
> when an endpoint's halt feature gets _set_; I didn't say that it knows 
> when the halt feature gets _cleared_.
> 
> (There is one exception: The gadget driver won't know if the host sends 
> a SET_FEATURE(halt_ep).  Hosts don't normally do this and I don't think 
> we need to worry about it.)
> 
> > > driver is what calls usb_ep_set_halt() or usb_ep_set_wedge() in the 
> > > first place.
> > > 
> > > As for handling CLEAR_FEATURE(HALT), all the UDC driver needs to do is 
> > > clear the HALT feature for the endpoint.  (Although if the endpoint is 
> > > wedged, the HALT feature should not be cleared.)  It doesn't need to 
> > > cancel any outstanding requests or inform the gadget driver in any way.
> > 
> > The UDC driver needs to notify the gadget driver somehow, cancelling the
> > request and give it back is currently the way dwc3 handling it.
> 
> And I'm saying that the UDC driver does _not_ need to notify the gadget 
> driver.
> 
> The MSC gadget driver works just fine without any such notification.  
> Can you name any gadget driver that _does_ need a notification?
> 

We were testing against UASP. The reason I added this was to mimic the
behavior of common vendor UASP devices when it recovers from transaction
errors in Windows.

When the data sequence of a transfer is reset, the host needs to send
CLEAR_FEATURE(halt_ep). It should be a common behavior. Since it is
common and part of the xHCI spec, do we expect the xHCI to send this or
do we expect the class protocol to document and handle this? At the
moment, I expect it to be the former and expect the UDC driver to treat
it as a common synchronization that perhaps the only synchronization
mechanism the upper protocol depends on.

Thanks,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-26  1:20                             ` Thinh Nguyen
@ 2023-08-26  3:10                               ` Alan Stern
  2023-08-30  1:32                                 ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-26  3:10 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Sat, Aug 26, 2023 at 01:20:34AM +0000, Thinh Nguyen wrote:
> Sorry for the delay response.
> 
> On Wed, Aug 23, 2023, Alan Stern wrote:
> > In uas, a -EPROTO error will cause an error status to be returned to the 
> > SCSI layer, which will invoke the SCSI error handler.  After enough 
> > failures it will call the uas device-reset handler, and 
> > uas_eh_device_reset_handler() calls usb_reset_device().
> 
> From my testing with UASP, if I recall correctly, there's a 30 second
> timeout before the reset handler kicks in.

That timeout length comes from the SCSI error handler.  I believe the
user can control the length by setting a sysfs attribute.  Also, it
should be possible to change the uas driver to make it perform a reset
on its own right away, the way usb-storage does, without waiting for
the SCSI error handler -- if this matters.


> From the xHCI spec, it suggests to issue a CLEAR_FEATURE(halt_ep) after
> the endpoint reset (e.g. from transaction error). Whether this action
> should be taken from the class driver or from the xHCI driver, it's not
> clear. However, as you said, without Bulk-Only Mass Storage Reset
> request, the host and device will be out of sync. The response action
> taken from xHCI is independent from MSC protocol. So it would make sense
> for the UDC driver to treat CLEAR_FEATURE(halt_ep) as such and not
> expect a Bulk-Only Mass Storage Reset or the equivalent.

In USB-2, performing an endpoint reset in the host controller together
with sending a Clear-Halt is dangerous.  It can lead to data
duplication.

Here's how that can happen.  Suppose the data toggles on both the host
and gadget side are initially equal to 0 when a bulk-OUT transaction
occur.  The host sends a DATA0 packet which the gadget receives,
causing the gadget's data toggle to change to 1.  But let's say the
gadget's ACK handshake gets corrupted, causing a protocol error on the
host.  So the host does an internal endpoint reset and sends a
Clear-Halt to the gadget.  When the gadget processes this command, it
resets its data toggle back to 0.  Now the host resends the same DATA0
packet as before -- and the gadget accepts it the second time because
its data toggle is set to 0!  The duplicated data leads to
corruption.  If the gadget's data toggle had remained 1 then it would
have acknowledged the duplicate DATA0 packet but otherwise ignored it,
avoiding the corruption.

I admit the probability of this happening is very low, but a more
robust error recovery procedure at the class level would prevent this
scenario.


> > > The UDC driver needs to notify the gadget driver somehow, cancelling the
> > > request and give it back is currently the way dwc3 handling it.
> > 
> > And I'm saying that the UDC driver does _not_ need to notify the gadget 
> > driver.
> > 
> > The MSC gadget driver works just fine without any such notification.  
> > Can you name any gadget driver that _does_ need a notification?
> > 
> 
> We were testing against UASP. The reason I added this was to mimic the
> behavior of common vendor UASP devices when it recovers from transaction
> errors in Windows.
> 
> When the data sequence of a transfer is reset, the host needs to send
> CLEAR_FEATURE(halt_ep). It should be a common behavior. Since it is
> common and part of the xHCI spec, do we expect the xHCI to send this or
> do we expect the class protocol to document and handle this? At the
> moment, I expect it to be the former and expect the UDC driver to treat
> it as a common synchronization that perhaps the only synchronization
> mechanism the upper protocol depends on.

I think it should be the opposite; the class protocol should specify
how to recover from errors.  If for no other reason then to avoid the
data duplication problem for USB-2.  However, if it doesn't specify a
recovery procedure then there's not much else you can do.

But regardless, how can the gadget driver make any use of the
knowledge that the UDC received a Clear-Halt?  What would it do
differently?  If the intent is simply to clear an error condition and
continue with the existing transfer, the gadget driver doesn't need to
do anything.

Alternatively, if the procedure for clearing an error condition is
given by the class protocol then the protocol should spell out a
method for the host to inform the gadget about what it is doing --
something more than just sending a Clear-Halt.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-26  3:10                               ` Alan Stern
@ 2023-08-30  1:32                                 ` Thinh Nguyen
  2023-08-30 14:48                                   ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-30  1:32 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Fri, Aug 25, 2023, Alan Stern wrote:
> On Sat, Aug 26, 2023 at 01:20:34AM +0000, Thinh Nguyen wrote:
> > Sorry for the delay response.
> > 
> > On Wed, Aug 23, 2023, Alan Stern wrote:
> > > In uas, a -EPROTO error will cause an error status to be returned to the 
> > > SCSI layer, which will invoke the SCSI error handler.  After enough 
> > > failures it will call the uas device-reset handler, and 
> > > uas_eh_device_reset_handler() calls usb_reset_device().
> > 
> > From my testing with UASP, if I recall correctly, there's a 30 second
> > timeout before the reset handler kicks in.
> 
> That timeout length comes from the SCSI error handler.  I believe the
> user can control the length by setting a sysfs attribute.  Also, it
> should be possible to change the uas driver to make it perform a reset
> on its own right away, the way usb-storage does, without waiting for
> the SCSI error handler -- if this matters.

Sure, but my point is that the recovery for this is handled differently
compared Windows. Windows tries to recover with CLEAR_FEATURE(halt_ep)
before it gets to usb reset.

> 
> 
> > From the xHCI spec, it suggests to issue a CLEAR_FEATURE(halt_ep) after
> > the endpoint reset (e.g. from transaction error). Whether this action
> > should be taken from the class driver or from the xHCI driver, it's not
> > clear. However, as you said, without Bulk-Only Mass Storage Reset
> > request, the host and device will be out of sync. The response action
> > taken from xHCI is independent from MSC protocol. So it would make sense
> > for the UDC driver to treat CLEAR_FEATURE(halt_ep) as such and not
> > expect a Bulk-Only Mass Storage Reset or the equivalent.
> 
> In USB-2, performing an endpoint reset in the host controller together
> with sending a Clear-Halt is dangerous.  It can lead to data
> duplication.
> 
> Here's how that can happen.  Suppose the data toggles on both the host
> and gadget side are initially equal to 0 when a bulk-OUT transaction
> occur.  The host sends a DATA0 packet which the gadget receives,
> causing the gadget's data toggle to change to 1.  But let's say the
> gadget's ACK handshake gets corrupted, causing a protocol error on the
> host.  So the host does an internal endpoint reset and sends a
> Clear-Halt to the gadget.  When the gadget processes this command, it
> resets its data toggle back to 0.  Now the host resends the same DATA0
> packet as before -- and the gadget accepts it the second time because
> its data toggle is set to 0!  The duplicated data leads to
> corruption.  If the gadget's data toggle had remained 1 then it would
> have acknowledged the duplicate DATA0 packet but otherwise ignored it,
> avoiding the corruption.
> 
> I admit the probability of this happening is very low, but a more
> robust error recovery procedure at the class level would prevent this
> scenario.
> 

This actually all the more that we should not silently ignore
CLEAR_FEATURE(halt_ep) and depend on transfer timeout and usb reset.

That reminds me another thing, if the host (xhci in this case) does a
hard reset to the endpoint, it also resets the TRB pointer with dequeue
ep command. So, the transfer should not resume. It needs to be
cancelled. This xHCI behavior is the same for Windows and Linux.

> 
> > > > The UDC driver needs to notify the gadget driver somehow, cancelling the
> > > > request and give it back is currently the way dwc3 handling it.
> > > 
> > > And I'm saying that the UDC driver does _not_ need to notify the gadget 
> > > driver.
> > > 
> > > The MSC gadget driver works just fine without any such notification.  
> > > Can you name any gadget driver that _does_ need a notification?
> > > 
> > 
> > We were testing against UASP. The reason I added this was to mimic the
> > behavior of common vendor UASP devices when it recovers from transaction
> > errors in Windows.
> > 
> > When the data sequence of a transfer is reset, the host needs to send
> > CLEAR_FEATURE(halt_ep). It should be a common behavior. Since it is
> > common and part of the xHCI spec, do we expect the xHCI to send this or
> > do we expect the class protocol to document and handle this? At the
> > moment, I expect it to be the former and expect the UDC driver to treat
> > it as a common synchronization that perhaps the only synchronization
> > mechanism the upper protocol depends on.
> 
> I think it should be the opposite; the class protocol should specify
> how to recover from errors.  If for no other reason then to avoid the
> data duplication problem for USB-2.  However, if it doesn't specify a
> recovery procedure then there's not much else you can do.

Right, unfortunately that's not always the case that class protocol
spell out how to handle transaction error.

> 
> But regardless, how can the gadget driver make any use of the
> knowledge that the UDC received a Clear-Halt?  What would it do
> differently?  If the intent is simply to clear an error condition and
> continue with the existing transfer, the gadget driver doesn't need to
> do anything.

It's not simple to clear an error. It is to notify the gadget driver to
cancel the active transfer and resync with the host. This is observed in
UASP driver in Windows and how various consumer UASP devices handle it.
There no eqivalent of Bulk-Only Mass Storage Reset request from the
class protocol. We still have the USB analyzer traces for this.

> 
> Alternatively, if the procedure for clearing an error condition is
> given by the class protocol then the protocol should spell out a
> method for the host to inform the gadget about what it is doing --
> something more than just sending a Clear-Halt.
> 

Regardless whether the class protocol spells out how to handle the
transaction error, if there's transaction error, the host may send
CLEAR_FEATURE(halt_ep) as observed in Windows. The gadget driver needs
to know about it to cancel the active transfer and resync with the host.

Of course we can just wait for the host to timeout the transfer and
issue a usb reset. While I agree that the class protocol should
spell out everything, it may not always be the case. And I think UDC
driver should be handled this way to inter-op with windows.

BR,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-30  1:32                                 ` Thinh Nguyen
@ 2023-08-30 14:48                                   ` Alan Stern
  2023-08-31  2:43                                     ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-30 14:48 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Wed, Aug 30, 2023 at 01:32:28AM +0000, Thinh Nguyen wrote:
> That reminds me another thing, if the host (xhci in this case) does a
> hard reset to the endpoint, it also resets the TRB pointer with dequeue
> ep command. So, the transfer should not resume. It needs to be
> cancelled. This xHCI behavior is the same for Windows and Linux.

That's on the host side, right?  How does this affect the gadget side?

That is, cancelling a transfer on the host doesn't necessarily mean it 
has to be cancelled on the gadget.  Does it have any implications at all 
for the gadget driver?

> > I think it should be the opposite; the class protocol should specify
> > how to recover from errors.  If for no other reason then to avoid the
> > data duplication problem for USB-2.  However, if it doesn't specify a
> > recovery procedure then there's not much else you can do.
> 
> Right, unfortunately that's not always the case that class protocol
> spell out how to handle transaction error.

All too true...

> > But regardless, how can the gadget driver make any use of the
> > knowledge that the UDC received a Clear-Halt?  What would it do
> > differently?  If the intent is simply to clear an error condition and
> > continue with the existing transfer, the gadget driver doesn't need to
> > do anything.
> 
> It's not simple to clear an error. It is to notify the gadget driver to
> cancel the active transfer and resync with the host.

How does the gadget driver sync with the host if the class protocol 
doesn't say what should be done?

Also, what if there is no active transfer?  That is, what if the 
transaction that got an error on the host appeared to be successful on 
the gadget and it was the last transaction in the final transfer queued 
for the endpoint?  How would the UDC driver notify the gadget driver in 
this situation?

>  This is observed in
> UASP driver in Windows and how various consumer UASP devices handle it.

I don't understand what you're saying here.  How can you observe whether 
a transfer is cancelled in a consumer UAS device?  And how does the 
consumer device resync with the host?

> There no eqivalent of Bulk-Only Mass Storage Reset request from the
> class protocol. We still have the USB analyzer traces for this.

Can you post an example?  Not necessarily in complete detail, but enough 
so that we can see what's going on.

> Regardless whether the class protocol spells out how to handle the
> transaction error, if there's transaction error, the host may send
> CLEAR_FEATURE(halt_ep) as observed in Windows. The gadget driver needs
> to know about it to cancel the active transfer and resync with the host.

I'll be able to understand this better after seeing an example.  Do you 
have any traces that were made for a High-speed connection (say, using 
a USB-2 cable)?  It would probably be easier to follow than a SuperSpeed 
example.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-30 14:48                                   ` Alan Stern
@ 2023-08-31  2:43                                     ` Thinh Nguyen
  2023-08-31 15:40                                       ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-08-31  2:43 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Wed, Aug 30, 2023, Alan Stern wrote:
> On Wed, Aug 30, 2023 at 01:32:28AM +0000, Thinh Nguyen wrote:
> > That reminds me another thing, if the host (xhci in this case) does a
> > hard reset to the endpoint, it also resets the TRB pointer with dequeue
> > ep command. So, the transfer should not resume. It needs to be
> > cancelled. This xHCI behavior is the same for Windows and Linux.
> 
> That's on the host side, right?  How does this affect the gadget side?
> 
> That is, cancelling a transfer on the host doesn't necessarily mean it 
> has to be cancelled on the gadget.  Does it have any implications at all 
> for the gadget driver?

There are 2 things that needs to be in sync'ed between host and device:
1) The data sequence.
2) The transfer.

If host doesn't send CLEAR_FEATURE(halt_ep), best case scenario, the
data sequence does't match and the host issues usb reset after some
timeout because the packet won't go through. Worst case scenario, the
data sequence matches 0, and the wrong data is received causing
corruption.

If the device doesn't cancel the transfer in response to
CLEAR_FEATURE(halt_ep), it may send/receive data of a different transfer
because the host doesn't resume where it left off, causing corruption.

Base on the class protocol, the class driver and gadget driver know
what makes up a "transfer" and can appropriately cancel a transfer to
stay in sync.

> 
> > > I think it should be the opposite; the class protocol should specify
> > > how to recover from errors.  If for no other reason then to avoid the
> > > data duplication problem for USB-2.  However, if it doesn't specify a
> > > recovery procedure then there's not much else you can do.
> > 
> > Right, unfortunately that's not always the case that class protocol
> > spell out how to handle transaction error.
> 
> All too true...
> 
> > > But regardless, how can the gadget driver make any use of the
> > > knowledge that the UDC received a Clear-Halt?  What would it do
> > > differently?  If the intent is simply to clear an error condition and
> > > continue with the existing transfer, the gadget driver doesn't need to
> > > do anything.
> > 
> > It's not simple to clear an error. It is to notify the gadget driver to
> > cancel the active transfer and resync with the host.
> 
> How does the gadget driver sync with the host if the class protocol 
> doesn't say what should be done?
> 
> Also, what if there is no active transfer?  That is, what if the 
> transaction that got an error on the host appeared to be successful on 
> the gadget and it was the last transaction in the final transfer queued 
> for the endpoint?  How would the UDC driver notify the gadget driver in 
> this situation?

That's fine. If there's no active transfer, the gadget doesn't need to
cancel anything. As long as the host knows that the transfer did not
complete, it can retry and be in sync. For UASP, the host will send a
new MSC command to retry the failed transfer. ie. The host would
overwrite/re-read the transfer with the same transfer offset.

The problem arises if the gadget attempts to resume the incomplete
transfer.

> 
> >  This is observed in
> > UASP driver in Windows and how various consumer UASP devices handle it.
> 
> I don't understand what you're saying here.  How can you observe whether 
> a transfer is cancelled in a consumer UAS device?  And how does the 
> consumer device resync with the host?

You can see a hang if the transfer are out of sync. If the transfer
isn't cancelled, the device would only source/sink whatever the
remaining of the previous transfer but not enough to complete the new
transfer. The new transfer is seen as incomplete from host and thus the
hang and the usb reset.

> 
> > There no eqivalent of Bulk-Only Mass Storage Reset request from the
> > class protocol. We still have the USB analyzer traces for this.
> 
> Can you post an example?  Not necessarily in complete detail, but enough 
> so that we can see what's going on.
> 
> > Regardless whether the class protocol spells out how to handle the
> > transaction error, if there's transaction error, the host may send
> > CLEAR_FEATURE(halt_ep) as observed in Windows. The gadget driver needs
> > to know about it to cancel the active transfer and resync with the host.
> 
> I'll be able to understand this better after seeing an example.  Do you 
> have any traces that were made for a High-speed connection (say, using 
> a USB-2 cable)?  It would probably be easier to follow than a SuperSpeed 
> example.
> 

Unfortunately I only have LeCroy usb analyzer traces of Gen 2x1, not for
usb2 speed. It's a bit tricky converting it to text with all the proper
info to see all the context. If my explanation isn't clear, I'll try to
figure out how to proceed.

Thanks,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-31  2:43                                     ` Thinh Nguyen
@ 2023-08-31 15:40                                       ` Alan Stern
  2023-09-01  1:27                                         ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-08-31 15:40 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Thu, Aug 31, 2023 at 02:43:51AM +0000, Thinh Nguyen wrote:
> On Wed, Aug 30, 2023, Alan Stern wrote:
> > On Wed, Aug 30, 2023 at 01:32:28AM +0000, Thinh Nguyen wrote:
> > > That reminds me another thing, if the host (xhci in this case) does a
> > > hard reset to the endpoint, it also resets the TRB pointer with dequeue
> > > ep command. So, the transfer should not resume. It needs to be
> > > cancelled. This xHCI behavior is the same for Windows and Linux.
> > 
> > That's on the host side, right?  How does this affect the gadget side?
> > 
> > That is, cancelling a transfer on the host doesn't necessarily mean it 
> > has to be cancelled on the gadget.  Does it have any implications at all 
> > for the gadget driver?
> 
> There are 2 things that needs to be in sync'ed between host and device:
> 1) The data sequence.

You mean the USB-3 sequence number value?

> 2) The transfer.
> 
> If host doesn't send CLEAR_FEATURE(halt_ep), best case scenario, the
> data sequence does't match and the host issues usb reset after some
> timeout because the packet won't go through.

The data toggles in USB-2, which are analogous to the sequence numbers 
in USB-3, don't work the same way.  When a USB-2 controller receives a 
data packet with the wrong sequence number, it sends an ACK response but 
otherwise ignores it.  This prevents timeouts (but not other types of 
errors).

>  Worst case scenario, the
> data sequence matches 0, and the wrong data is received causing
> corruption.
> 
> If the device doesn't cancel the transfer in response to
> CLEAR_FEATURE(halt_ep), it may send/receive data of a different transfer
> because the host doesn't resume where it left off, causing corruption.
> 
> Base on the class protocol, the class driver and gadget driver know
> what makes up a "transfer" and can appropriately cancel a transfer to
> stay in sync.

You're still thinking of UAS in particular, right?  What I would expect 
to happen when there's a transaction error in a UAS data transfer, based 
on reading the UAS spec, is that the host would cancel the transfer on 
its side and send either an Abort Task or an I_T Nexus Reset task 
management request to the device (in addition to resetting the host 
endpoint and sending a Clear-Halt).  I would not expect the host to hope 
that the device would abandon the transfer merely because it got the 
Clear-Halt.

Does Windows really work this way?  Does it not send a task management 
request?  That would definitely seem to be against the intent of the 
spec, if not against the letter.

> > How does the gadget driver sync with the host if the class protocol 
> > doesn't say what should be done?
> > 
> > Also, what if there is no active transfer?  That is, what if the 
> > transaction that got an error on the host appeared to be successful on 
> > the gadget and it was the last transaction in the final transfer queued 
> > for the endpoint?  How would the UDC driver notify the gadget driver in 
> > this situation?
> 
> That's fine. If there's no active transfer, the gadget doesn't need to
> cancel anything. As long as the host knows that the transfer did not
> complete, it can retry and be in sync. For UASP, the host will send a
> new MSC command to retry the failed transfer. ie. The host would
> overwrite/re-read the transfer with the same transfer offset.
> 
> The problem arises if the gadget attempts to resume the incomplete
> transfer.

Quite so.  But would the host send a new MSC retry command before the 
failed command completes?

> > >  This is observed in
> > > UASP driver in Windows and how various consumer UASP devices handle it.
> > 
> > I don't understand what you're saying here.  How can you observe whether 
> > a transfer is cancelled in a consumer UAS device?  And how does the 
> > consumer device resync with the host?
> 
> You can see a hang if the transfer are out of sync. If the transfer
> isn't cancelled, the device would only source/sink whatever the
> remaining of the previous transfer but not enough to complete the new
> transfer. The new transfer is seen as incomplete from host and thus the
> hang and the usb reset.
> 
> > 
> > > There no eqivalent of Bulk-Only Mass Storage Reset request from the
> > > class protocol. We still have the USB analyzer traces for this.
> > 
> > Can you post an example?  Not necessarily in complete detail, but enough 
> > so that we can see what's going on.
> > 
> > > Regardless whether the class protocol spells out how to handle the
> > > transaction error, if there's transaction error, the host may send
> > > CLEAR_FEATURE(halt_ep) as observed in Windows. The gadget driver needs
> > > to know about it to cancel the active transfer and resync with the host.
> > 
> > I'll be able to understand this better after seeing an example.  Do you 
> > have any traces that were made for a High-speed connection (say, using 
> > a USB-2 cable)?  It would probably be easier to follow than a SuperSpeed 
> > example.
> > 
> 
> Unfortunately I only have LeCroy usb analyzer traces of Gen 2x1, not for
> usb2 speed. It's a bit tricky converting it to text with all the proper
> info to see all the context. If my explanation isn't clear, I'll try to
> figure out how to proceed.

I would appreciate seeing whatever you can provide.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-08-31 15:40                                       ` Alan Stern
@ 2023-09-01  1:27                                         ` Thinh Nguyen
  2023-09-01 17:37                                           ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-09-01  1:27 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Thu, Aug 31, 2023, Alan Stern wrote:
> On Thu, Aug 31, 2023 at 02:43:51AM +0000, Thinh Nguyen wrote:
> > On Wed, Aug 30, 2023, Alan Stern wrote:
> > > On Wed, Aug 30, 2023 at 01:32:28AM +0000, Thinh Nguyen wrote:
> > > > That reminds me another thing, if the host (xhci in this case) does a
> > > > hard reset to the endpoint, it also resets the TRB pointer with dequeue
> > > > ep command. So, the transfer should not resume. It needs to be
> > > > cancelled. This xHCI behavior is the same for Windows and Linux.
> > > 
> > > That's on the host side, right?  How does this affect the gadget side?
> > > 
> > > That is, cancelling a transfer on the host doesn't necessarily mean it 
> > > has to be cancelled on the gadget.  Does it have any implications at all 
> > > for the gadget driver?
> > 
> > There are 2 things that needs to be in sync'ed between host and device:
> > 1) The data sequence.
> 
> You mean the USB-3 sequence number value?

Yes.

> 
> > 2) The transfer.
> > 
> > If host doesn't send CLEAR_FEATURE(halt_ep), best case scenario, the
> > data sequence does't match and the host issues usb reset after some
> > timeout because the packet won't go through.
> 
> The data toggles in USB-2, which are analogous to the sequence numbers 
> in USB-3, don't work the same way.  When a USB-2 controller receives a 
> data packet with the wrong sequence number, it sends an ACK response but 
> otherwise ignores it.  This prevents timeouts (but not other types of 
> errors).
> 
> >  Worst case scenario, the
> > data sequence matches 0, and the wrong data is received causing
> > corruption.
> > 
> > If the device doesn't cancel the transfer in response to
> > CLEAR_FEATURE(halt_ep), it may send/receive data of a different transfer
> > because the host doesn't resume where it left off, causing corruption.
> > 
> > Base on the class protocol, the class driver and gadget driver know
> > what makes up a "transfer" and can appropriately cancel a transfer to
> > stay in sync.
> 
> You're still thinking of UAS in particular, right?  What I would expect 
> to happen when there's a transaction error in a UAS data transfer, based 
> on reading the UAS spec, is that the host would cancel the transfer on 
> its side and send either an Abort Task or an I_T Nexus Reset task 
> management request to the device (in addition to resetting the host 
> endpoint and sending a Clear-Halt).  I would not expect the host to hope 
> that the device would abandon the transfer merely because it got the 
> Clear-Halt.
> 
> Does Windows really work this way?  Does it not send a task management 
> request?  That would definitely seem to be against the intent of the 
> spec, if not against the letter.

Unfortunately yes, I don't see any Task Management request aborting the
transfer.

> 
> > > How does the gadget driver sync with the host if the class protocol 
> > > doesn't say what should be done?
> > > 
> > > Also, what if there is no active transfer?  That is, what if the 
> > > transaction that got an error on the host appeared to be successful on 
> > > the gadget and it was the last transaction in the final transfer queued 
> > > for the endpoint?  How would the UDC driver notify the gadget driver in 
> > > this situation?
> > 
> > That's fine. If there's no active transfer, the gadget doesn't need to
> > cancel anything. As long as the host knows that the transfer did not
> > complete, it can retry and be in sync. For UASP, the host will send a
> > new MSC command to retry the failed transfer. ie. The host would
> > overwrite/re-read the transfer with the same transfer offset.
> > 
> > The problem arises if the gadget attempts to resume the incomplete
> > transfer.
> 
> Quite so.  But would the host send a new MSC retry command before the 
> failed command completes?

The host sends a new MSC command after the incomplete command failed.

> 
> > > >  This is observed in
> > > > UASP driver in Windows and how various consumer UASP devices handle it.
> > > 
> > > I don't understand what you're saying here.  How can you observe whether 
> > > a transfer is cancelled in a consumer UAS device?  And how does the 
> > > consumer device resync with the host?
> > 
> > You can see a hang if the transfer are out of sync. If the transfer
> > isn't cancelled, the device would only source/sink whatever the
> > remaining of the previous transfer but not enough to complete the new
> > transfer. The new transfer is seen as incomplete from host and thus the
> > hang and the usb reset.
> > 
> > > 
> > > > There no eqivalent of Bulk-Only Mass Storage Reset request from the
> > > > class protocol. We still have the USB analyzer traces for this.
> > > 
> > > Can you post an example?  Not necessarily in complete detail, but enough 
> > > so that we can see what's going on.
> > > 
> > > > Regardless whether the class protocol spells out how to handle the
> > > > transaction error, if there's transaction error, the host may send
> > > > CLEAR_FEATURE(halt_ep) as observed in Windows. The gadget driver needs
> > > > to know about it to cancel the active transfer and resync with the host.
> > > 
> > > I'll be able to understand this better after seeing an example.  Do you 
> > > have any traces that were made for a High-speed connection (say, using 
> > > a USB-2 cable)?  It would probably be easier to follow than a SuperSpeed 
> > > example.
> > > 
> > 
> > Unfortunately I only have LeCroy usb analyzer traces of Gen 2x1, not for
> > usb2 speed. It's a bit tricky converting it to text with all the proper
> > info to see all the context. If my explanation isn't clear, I'll try to
> > figure out how to proceed.
> 
> I would appreciate seeing whatever you can provide.
> 

Here's a snippet captured at the SCSI level from Samsung T7 device
response to CLEAR_FEATURE(halt-ep) to IN data endpoint from host
(Windows 10). Similar behavior is observed for OUT endpoint.


_______|_______________________________________________________________________
SCSI Op(80) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x0928E800) STATUS(GOOD) Data(524288 bytes) 
_______| Time(  1.335 ms) Time Stamp(10 . 000 538 006) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(81) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x0928EC00) STATUS(GOOD) Data(524288 bytes) 
_______| Time(  1.318 ms) Time Stamp(10 . 001 872 988) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(82) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x0928F000) STATUS(GOOD) Data(524288 bytes) 
_______| Time(  1.343 ms) Time Stamp(10 . 003 191 188) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(83) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x0928F400) STATUS(GOOD) Data(524288 bytes) 
_______| Time(  1.256 ms) Time Stamp(10 . 004 534 630) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(84) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x0928F800) STATUS(GOOD) Data(524288 bytes) 
_______| Time(  1.178 ms) Time Stamp(10 . 005 791 128) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(85) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x0928FC00) Data(146432 bytes) Status(Missing)-BAD 
_______| Time(  2.681 ms) Time Stamp(10 . 006 968 662) Metrics #Xfers(2) 
_______|_______________________________________________________________________


## Transaction eror occurs here.

Transfer(289) Left("Left") G2(x1) Control(SET) ADDR(3) ENDP(0) 
_______| bRequest(CLEAR_FEATURE) wValue(ENDPOINT_HALT) wLength(0) 
_______| Time(166.322 us) Time Stamp(10 . 009 649 516) 
_______|_______________________________________________________________________

## CLEAR_FEATURE happens here.

SCSI Op(99) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x09290000) RESPONSE_CODE(OVERLAPPED TAG) 
_______| Time(365.854 us) Time Stamp(10 . 009 815 838) Metrics #Xfers(2) 
_______|_______________________________________________________________________
SCSI Op(100) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x09290400) STATUS(GOOD) Data(524288 bytes) 
_______| Time(  1.012 sec) Time Stamp(10 . 010 181 692) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(101) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x0928FC00) STATUS(GOOD) Data(524288 bytes) 
_______| Time(882.412 us) Time Stamp(11 . 022 469 104) Metrics #Xfers(3) 
_______|_______________________________________________________________________

## Host retries transfer here. Check logical block address.

SCSI Op(102) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x09290000) STATUS(GOOD) Data(524288 bytes) 
_______| Time(  1.060 ms) Time Stamp(11 . 023 351 516) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(103) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x09290800) STATUS(GOOD) Data(524288 bytes) 
_______| Time(  1.013 ms) Time Stamp(11 . 024 411 510) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(104) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x09290C00) STATUS(GOOD) Data(524288 bytes) 
_______| Time(816.594 us) Time Stamp(11 . 025 424 600) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(105) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x09291000) STATUS(GOOD) Data(524288 bytes) 
_______| Time(762.286 us) Time Stamp(11 . 026 241 194) Metrics #Xfers(3) 
_______|_______________________________________________________________________
SCSI Op(106) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
_______| Logical Block Addr(0x09291400) STATUS(GOOD) Data(524288 bytes) 
_______| Time(768.696 us) Time Stamp(11 . 027 003 480) Metrics #Xfers(3) 
_______|_______________________________________________________________________



BR,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-09-01  1:27                                         ` Thinh Nguyen
@ 2023-09-01 17:37                                           ` Alan Stern
  2023-09-01 21:14                                             ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-09-01 17:37 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Fri, Sep 01, 2023 at 01:27:34AM +0000, Thinh Nguyen wrote:
> > Does Windows really work this way?  Does it not send a task management 
> > request?  That would definitely seem to be against the intent of the 
> > spec, if not against the letter.
> 
> Unfortunately yes, I don't see any Task Management request aborting the
> transfer.

Is it possible that the packets are there but you don't see them because 
of the filtering or data presentation done by the USB analyzer?

> Here's a snippet captured at the SCSI level from Samsung T7 device
> response to CLEAR_FEATURE(halt-ep) to IN data endpoint from host
> (Windows 10). Similar behavior is observed for OUT endpoint.

Hmmm.  The SCSI level may not provide enough detail.

> _______|_______________________________________________________________________
> SCSI Op(80) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x0928E800) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(  1.335 ms) Time Stamp(10 . 000 538 006) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(81) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x0928EC00) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(  1.318 ms) Time Stamp(10 . 001 872 988) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(82) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x0928F000) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(  1.343 ms) Time Stamp(10 . 003 191 188) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(83) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x0928F400) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(  1.256 ms) Time Stamp(10 . 004 534 630) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(84) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x0928F800) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(  1.178 ms) Time Stamp(10 . 005 791 128) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(85) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x0928FC00) Data(146432 bytes) Status(Missing)-BAD 
> _______| Time(  2.681 ms) Time Stamp(10 . 006 968 662) Metrics #Xfers(2) 
> _______|_______________________________________________________________________
> 
> 
> ## Transaction eror occurs here.
> 
> Transfer(289) Left("Left") G2(x1) Control(SET) ADDR(3) ENDP(0) 
> _______| bRequest(CLEAR_FEATURE) wValue(ENDPOINT_HALT) wLength(0) 
> _______| Time(166.322 us) Time Stamp(10 . 009 649 516) 
> _______|_______________________________________________________________________
> 
> ## CLEAR_FEATURE happens here.
> 
> SCSI Op(99) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x09290000) RESPONSE_CODE(OVERLAPPED TAG) 
> _______| Time(365.854 us) Time Stamp(10 . 009 815 838) Metrics #Xfers(2) 
> _______|_______________________________________________________________________
> SCSI Op(100) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x09290400) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(  1.012 sec) Time Stamp(10 . 010 181 692) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(101) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x0928FC00) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(882.412 us) Time Stamp(11 . 022 469 104) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> 
> ## Host retries transfer here. Check logical block address.
> 
> SCSI Op(102) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x09290000) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(  1.060 ms) Time Stamp(11 . 023 351 516) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(103) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x09290800) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(  1.013 ms) Time Stamp(11 . 024 411 510) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(104) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x09290C00) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(816.594 us) Time Stamp(11 . 025 424 600) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(105) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x09291000) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(762.286 us) Time Stamp(11 . 026 241 194) Metrics #Xfers(3) 
> _______|_______________________________________________________________________
> SCSI Op(106) ADDR(3) Tag(0x0002) SCSI CDB READ(10) 
> _______| Logical Block Addr(0x09291400) STATUS(GOOD) Data(524288 bytes) 
> _______| Time(768.696 us) Time Stamp(11 . 027 003 480) Metrics #Xfers(3) 
> _______|_______________________________________________________________________

Is is possible to show all the data packets (not the token or 
handshaking ones) for both the command and status endpoints?  I'm 
particularly interested in seeing what sort of status message the device 
sends for the failed command, and where it occurs in relation to the 
other transfers.

I wouldn't mind seeing a summary of the packets on the data-IN endpoint 
as well.  Not a detailed view -- multiple 500-KB transfers aren't very 
interesting -- but just to see what happens in the vicinity of the 
transaction error.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-09-01 17:37                                           ` Alan Stern
@ 2023-09-01 21:14                                             ` Thinh Nguyen
  2023-09-02 15:15                                               ` Alan Stern
  0 siblings, 1 reply; 37+ messages in thread
From: Thinh Nguyen @ 2023-09-01 21:14 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Fri, Sep 01, 2023, Alan Stern wrote:
> On Fri, Sep 01, 2023 at 01:27:34AM +0000, Thinh Nguyen wrote:
> > > Does Windows really work this way?  Does it not send a task management 
> > > request?  That would definitely seem to be against the intent of the 
> > > spec, if not against the letter.
> > 
> > Unfortunately yes, I don't see any Task Management request aborting the
> > transfer.
> 
> Is it possible that the packets are there but you don't see them because 
> of the filtering or data presentation done by the USB analyzer?

There's no Task Management transfer. All the packets are captured, but
it's not easy to show in a single email as I need to manually export
which view to see. Some view level (such as SCSI) would group the
transfers, which may not show the entire picture. But it helps with the
high level view. If there's Task Management command, then you would see
it from the SCSI level.

> 
> > Here's a snippet captured at the SCSI level from Samsung T7 device
> > response to CLEAR_FEATURE(halt-ep) to IN data endpoint from host
> > (Windows 10). Similar behavior is observed for OUT endpoint.
> 
> Hmmm.  The SCSI level may not provide enough detail.
> 

<snip>

> 
> Is is possible to show all the data packets (not the token or 
> handshaking ones) for both the command and status endpoints?  I'm 
> particularly interested in seeing what sort of status message the device 
> sends for the failed command, and where it occurs in relation to the 
> other transfers.
> 
> I wouldn't mind seeing a summary of the packets on the data-IN endpoint 
> as well.  Not a detailed view -- multiple 500-KB transfers aren't very 
> interesting -- but just to see what happens in the vicinity of the 
> transaction error.
> 

Actually, something else was probably happened here.

Some context:
* Bulk OUT endp 4 is for command endpoint
* Bulk IN endp 1 is for data endpoint
* Bulk IN endp 3 is for status endpoint

_______|_______________________________________________________________________
Transfer(261) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
_______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
_______| READ(10) Time(184.510 us) Time Stamp(10 . 005 791 128) 
_______|_______________________________________________________________________
Transfer(264) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(1) Stream ID(0x0002) 
_______| UASP  Data IN Bytes Transferred(524288) Time(945.654 us) 
_______| Time Stamp(10 . 005 975 638) 
_______|_______________________________________________________________________
Transfer(266) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(3) Stream ID(0x0002) 
_______| UASP Sense IU  Tag(0x0002) STATUS(GOOD) Time( 47.370 us) 
_______| Time Stamp(10 . 006 921 292) 
_______|_______________________________________________________________________
Transfer(267) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
_______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
_______| READ(10) Time(161.454 us) Time Stamp(10 . 006 968 662) 
_______|_______________________________________________________________________
Transfer(272) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(1) Stream ID(0x0002) 
_______| UASP  Data IN Bytes Transferred(146432) Time(  2.519 ms) 
_______| Time Stamp(10 . 007 130 116) 
_______|_______________________________________________________________________


## Transaction error occurs, the transfer stopped short at 146432 bytes
## instead of 512K.


Transfer(289) Left("Left") G2(x1) Control(SET) ADDR(3) ENDP(0) 
_______| bRequest(CLEAR_FEATURE) wValue(ENDPOINT_HALT) wLength(0) 
_______| Time(166.322 us) Time Stamp(10 . 009 649 516) 
_______|_______________________________________________________________________


# Host issues CLEAR_FEATURE(halt_ep)


Transfer(291) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
_______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
_______| READ(10) Time(158.848 us) Time Stamp(10 . 009 815 838) 
_______|_______________________________________________________________________


# Host completely dropped the SCSI command with transaction error. It
# doesn't request for the status. Since it's dropped, tag 2 is
# available. Now, a new SCSI command can use Tag 2.


Transfer(292) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(3) Stream ID(0x0002) 
_______| UASP Response IU  Tag(0x0002) RESPONSE_CODE(OVERLAPPED TAG) 
_______| Time(207.006 us) Time Stamp(10 . 009 974 686) 
_______|_______________________________________________________________________


# Thinking that the previous Tag 2 command was still active and
# responded with an OVERLAPPED TAG. This probably causes the gadget to
# cancel the transfer and drop the command so it can be in sync again.


Transfer(295) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
_______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
_______| READ(10) Time(884.218 ms) Time Stamp(10 . 010 181 692) 
_______|_______________________________________________________________________
Transfer(313) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(1) Stream ID(0x0002) 
_______| UASP  Data IN Bytes Transferred(524288) Time(554.648 us) 
_______| Time Stamp(10 . 894 399 886) 
_______|_______________________________________________________________________
Transfer(314) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(3) Stream ID(0x0002) 
_______| UASP Sense IU  Tag(0x0002) STATUS(GOOD) Time(127.515 ms) 
_______| Time Stamp(10 . 894 954 534) 
_______|_______________________________________________________________________
Transfer(315) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
_______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
_______| READ(10) Time(278.770 us) Time Stamp(11 . 022 469 104) 
_______|_______________________________________________________________________
Transfer(316) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(1) Stream ID(0x0002) 
_______| UASP  Data IN Bytes Transferred(524288) Time(554.236 us) 
_______| Time Stamp(11 . 022 747 874) 
_______|_______________________________________________________________________
Transfer(317) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(3) Stream ID(0x0002) 
_______| UASP Sense IU  Tag(0x0002) STATUS(GOOD) Time( 49.406 us) 
_______| Time Stamp(11 . 023 302 110) 
_______|_______________________________________________________________________


So, for this scenario, the host was probably stay in sync with the
device due to the overlap command tag id canceling the transfer. I'm not
sure if this is the intent of the Windows UASP class driver, which seems
like a non-conventional way of synchronization. Perhaps it was done
because some devices may not support TASK_MANAGEMENT(Abort_task)?

Regardless, if the host resets the endpoint, the transfer must be
canceled otherwise we risk data corruption.

Also whenever there's a OVERLAPPED tag error, Windows host takes a long
time (~1 sec) to send a new command (check delta time of Transfer 295
and 313). If the gadget driver can base off of the
CLEAR_FEATURE(halt_ep), this improves performance.

BR,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-09-01 21:14                                             ` Thinh Nguyen
@ 2023-09-02 15:15                                               ` Alan Stern
  2023-09-05 22:53                                                 ` Thinh Nguyen
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Stern @ 2023-09-02 15:15 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman, USB list, LKML

On Fri, Sep 01, 2023 at 09:14:19PM +0000, Thinh Nguyen wrote:
> There's no Task Management transfer. All the packets are captured, but
> it's not easy to show in a single email as I need to manually export
> which view to see. Some view level (such as SCSI) would group the
> transfers, which may not show the entire picture. But it helps with the
> high level view. If there's Task Management command, then you would see
> it from the SCSI level.

Okay.

> Actually, something else was probably happened here.
> 
> Some context:
> * Bulk OUT endp 4 is for command endpoint
> * Bulk IN endp 1 is for data endpoint
> * Bulk IN endp 3 is for status endpoint
> 
> _______|_______________________________________________________________________
> Transfer(261) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
> _______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
> _______| READ(10) Time(184.510 us) Time Stamp(10 . 005 791 128) 
> _______|_______________________________________________________________________
> Transfer(264) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(1) Stream ID(0x0002) 
> _______| UASP  Data IN Bytes Transferred(524288) Time(945.654 us) 
> _______| Time Stamp(10 . 005 975 638) 
> _______|_______________________________________________________________________
> Transfer(266) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(3) Stream ID(0x0002) 
> _______| UASP Sense IU  Tag(0x0002) STATUS(GOOD) Time( 47.370 us) 
> _______| Time Stamp(10 . 006 921 292) 
> _______|_______________________________________________________________________
> Transfer(267) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
> _______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
> _______| READ(10) Time(161.454 us) Time Stamp(10 . 006 968 662) 
> _______|_______________________________________________________________________
> Transfer(272) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(1) Stream ID(0x0002) 
> _______| UASP  Data IN Bytes Transferred(146432) Time(  2.519 ms) 
> _______| Time Stamp(10 . 007 130 116) 
> _______|_______________________________________________________________________
> 
> 
> ## Transaction error occurs, the transfer stopped short at 146432 bytes
> ## instead of 512K.
> 
> 
> Transfer(289) Left("Left") G2(x1) Control(SET) ADDR(3) ENDP(0) 
> _______| bRequest(CLEAR_FEATURE) wValue(ENDPOINT_HALT) wLength(0) 
> _______| Time(166.322 us) Time Stamp(10 . 009 649 516) 
> _______|_______________________________________________________________________
> 
> 
> # Host issues CLEAR_FEATURE(halt_ep)

We don't really know how the device responds to this request, other 
than resetting the endpoint's sequence number to 0.

> Transfer(291) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
> _______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
> _______| READ(10) Time(158.848 us) Time Stamp(10 . 009 815 838) 
> _______|_______________________________________________________________________
> 
> 
> # Host completely dropped the SCSI command with transaction error. It
> # doesn't request for the status. Since it's dropped, tag 2 is
> # available. Now, a new SCSI command can use Tag 2.

That is what this looks like.  I'm not aware of any place in the UAS 
spec where it says that a host is allowed to do this.

It would be very interesting to know at this point if the host tried to 
read more data on the data endpoint during the 160-microsecond period 
before transfer 292.  Does the full log show whether this happened?  
Does the fact that the transfers have consecutive numbers mean that 
nothing happened on the bus during this time?  Was the device waiting 
for the host before sending more data, or did it cancel the data that 
was already queued?

> Transfer(292) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(3) Stream ID(0x0002) 
> _______| UASP Response IU  Tag(0x0002) RESPONSE_CODE(OVERLAPPED TAG) 
> _______| Time(207.006 us) Time Stamp(10 . 009 974 686) 
> _______|_______________________________________________________________________
> 
> 
> # Thinking that the previous Tag 2 command was still active and
> # responded with an OVERLAPPED TAG. This probably causes the gadget to
> # cancel the transfer and drop the command so it can be in sync again.

Indeed.  If the Clear-Halt had caused the device to abort the ongoing 
transfer, would it have responded this way?  Wouldn't it have thought 
that the previous Tag 2 command was completely finished?  Or would it 
have cancelled just the data part of the transfer, while keeping track 
of the fact that the status part still needed to be sent?

To put it another way, it seems that the OVERLAPPED TAG response was 
caused by the fact that the status was never sent.  So whether or not 
the device responded to the Clear-Halt by cancelling anything, it 
became aware that something was wrong when it received this duplicate 
tag.

> Transfer(295) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
> _______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
> _______| READ(10) Time(884.218 ms) Time Stamp(10 . 010 181 692) 
> _______|_______________________________________________________________________
> Transfer(313) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(1) Stream ID(0x0002) 
> _______| UASP  Data IN Bytes Transferred(524288) Time(554.648 us) 
> _______| Time Stamp(10 . 894 399 886) 
> _______|_______________________________________________________________________
> Transfer(314) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(3) Stream ID(0x0002) 
> _______| UASP Sense IU  Tag(0x0002) STATUS(GOOD) Time(127.515 ms) 
> _______| Time Stamp(10 . 894 954 534) 
> _______|_______________________________________________________________________
> Transfer(315) Left("Left") G2(x1) Bulk(OUT) ADDR(3) ENDP(4) UASP  Command IU 
> _______| Tag(0x0002) Logical Unit Number(0x 0000 0000 0000 0000) SCSI CDB 
> _______| READ(10) Time(278.770 us) Time Stamp(11 . 022 469 104) 
> _______|_______________________________________________________________________
> Transfer(316) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(1) Stream ID(0x0002) 
> _______| UASP  Data IN Bytes Transferred(524288) Time(554.236 us) 
> _______| Time Stamp(11 . 022 747 874) 
> _______|_______________________________________________________________________
> Transfer(317) Left("Left") G2(x1) Bulk(IN) ADDR(3) ENDP(3) Stream ID(0x0002) 
> _______| UASP Sense IU  Tag(0x0002) STATUS(GOOD) Time( 49.406 us) 
> _______| Time Stamp(11 . 023 302 110) 
> _______|_______________________________________________________________________
> 
> 
> So, for this scenario, the host was probably stay in sync with the
> device due to the overlap command tag id canceling the transfer. I'm not
> sure if this is the intent of the Windows UASP class driver, which seems
> like a non-conventional way of synchronization. Perhaps it was done
> because some devices may not support TASK_MANAGEMENT(Abort_task)?

That's possible.  Windows seems to use a non-conventional approach to 
many things.

> Regardless, if the host resets the endpoint, the transfer must be
> canceled otherwise we risk data corruption.

Do we?  What would have happened if the transfer had not been 
cancelled?  The device might have sent some data left over from the 
original command and the host might have misinterpreted it as belong to 
the new command.  But when the device sent the OVERLAPPED TAG response, 
the host would have realized that any data it received for the 
new command was invalid and abandoned the command.

In fact, that's what it did do in transfer 295.  As far as I can tell 
from the log, the Clear-Halt didn't cause the device to fully cancel 
the transfer.

> Also whenever there's a OVERLAPPED tag error, Windows host takes a long
> time (~1 sec) to send a new command (check delta time of Transfer 295
> and 313). If the gadget driver can base off of the
> CLEAR_FEATURE(halt_ep), this improves performance.

Okay, that's a good point.

However, since not cancelling the transfer apparently did not lead to 
corruption, I think it's okay to allow UDC drivers not to cancel 
requests when they receive a Clear-Halt.  This decision could be left 
up to the driver.

Alan Stern

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: dwc3: unusual handling of setup requests with wLength == 0
  2023-09-02 15:15                                               ` Alan Stern
@ 2023-09-05 22:53                                                 ` Thinh Nguyen
  0 siblings, 0 replies; 37+ messages in thread
From: Thinh Nguyen @ 2023-09-05 22:53 UTC (permalink / raw)
  To: Alan Stern
  Cc: Thinh Nguyen, Andrey Konovalov, Felipe Balbi, Greg Kroah-Hartman,
	USB list, LKML

On Sat, Sep 02, 2023, Alan Stern wrote:
> > # Host completely dropped the SCSI command with transaction error. It
> > # doesn't request for the status. Since it's dropped, tag 2 is
> > # available. Now, a new SCSI command can use Tag 2.
> 
> That is what this looks like.  I'm not aware of any place in the UAS 
> spec where it says that a host is allowed to do this.
> 
> It would be very interesting to know at this point if the host tried to 
> read more data on the data endpoint during the 160-microsecond period 
> before transfer 292.  Does the full log show whether this happened?  
> Does the fact that the transfers have consecutive numbers mean that 
> nothing happened on the bus during this time?  Was the device waiting 
> for the host before sending more data, or did it cancel the data that 
> was already queued?
> 

Here's the packet level view (w/o link commands and misc packets):

_______|_______________________________________________________________________L 
Packet(302161) Left("Left") Dir G2(x1) TP Status(4) ADDR(3) ENDP(0) 
_______| Dir(--) PP(Not Pnd)   LCW  (Hseq:9) Duration(16.540 ns) 
_______| Time(772.000 ns) Time Stamp(10 . 009 669 310) 
_______|_______________________________________________________________________R 
Packet(302169) Right("Right") Dir G2(x1) TP ACK(1) ADDR(3) ENDP(0) 
_______| TT(Control) Dir(--) SeqN(1) NumP(0) Stream ID(0x0000) PP(Not Pnd) 
_______|   LCW  (Hseq:6) Warning Duration(16.540 ns) Time(145.756 us) 
_______| Time Stamp(10 . 009 670 082) 
_______|_______________________________________________________________________L 


## After the device ACK'ed the status stage. Nothing happened until the
## host sends a new command.


Packet(303013) Left("Left") Dir G2(x1) DP Data Len(32) ADDR(3) ENDP(4) 
_______| TT(Bulk) Dir(Out) SeqN(5) EoB(N) Stream ID(0x0000) PP(Not Pnd) 
_______|   LCW  (Hseq:11) Data(32 bytes) Duration(56.240 ns) Time(  1.390 us) 
_______| Time Stamp(10 . 009 815 838) 
_______|_______________________________________________________________________



> > 
> > # Thinking that the previous Tag 2 command was still active and
> > # responded with an OVERLAPPED TAG. This probably causes the gadget to
> > # cancel the transfer and drop the command so it can be in sync again.
> 
> Indeed.  If the Clear-Halt had caused the device to abort the ongoing 
> transfer, would it have responded this way?  Wouldn't it have thought 
> that the previous Tag 2 command was completely finished?  Or would it 
> have cancelled just the data part of the transfer, while keeping track 
> of the fact that the status part still needed to be sent?
> 
> To put it another way, it seems that the OVERLAPPED TAG response was 
> caused by the fact that the status was never sent.  So whether or not 
> the device responded to the Clear-Halt by cancelling anything, it 
> became aware that something was wrong when it received this duplicate 
> tag.
> 
> > 
> > So, for this scenario, the host was probably stay in sync with the
> > device due to the overlap command tag id canceling the transfer. I'm not
> > sure if this is the intent of the Windows UASP class driver, which seems
> > like a non-conventional way of synchronization. Perhaps it was done
> > because some devices may not support TASK_MANAGEMENT(Abort_task)?
> 
> That's possible.  Windows seems to use a non-conventional approach to 
> many things.
> 
> > Regardless, if the host resets the endpoint, the transfer must be
> > canceled otherwise we risk data corruption.
> 
> Do we?  What would have happened if the transfer had not been 
> cancelled?  The device might have sent some data left over from the 
> original command and the host might have misinterpreted it as belong to 
> the new command.  But when the device sent the OVERLAPPED TAG response, 
> the host would have realized that any data it received for the 
> new command was invalid and abandoned the command.
> 

If the host resets the endpoint, the transfer must be canceled. How
cancelation is triggered is another thing. In this case, it may not
be triggered by the CLEAR_FEATURE(halt_ep), but probably through the
overlap tag detection.

> In fact, that's what it did do in transfer 295.  As far as I can tell 
> from the log, the Clear-Halt didn't cause the device to fully cancel 
> the transfer.

> 
> > Also whenever there's a OVERLAPPED tag error, Windows host takes a long
> > time (~1 sec) to send a new command (check delta time of Transfer 295
> > and 313). If the gadget driver can base off of the
> > CLEAR_FEATURE(halt_ep), this improves performance.
> 
> Okay, that's a good point.
> 
> However, since not cancelling the transfer apparently did not lead to 
> corruption, I think it's okay to allow UDC drivers not to cancel 
> requests when they receive a Clear-Halt.  This decision could be left 
> up to the driver.
> 

Ok, fair enough.

For our tests, we based the transaction errors through the
CLEAR_FEATURE(halt_ep) and dropped the MSC command altogether to stay in
sync, just as how the Windows host drops the MSC command.

But as through this long discussion, we may not need and probably can
avoid forcing the gadget driver to learn a new error code to stay in
sync with Windows's host.

Thanks,
Thinh

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2023-09-05 22:53 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-18  0:15 dwc3: unusual handling of setup requests with wLength == 0 Andrey Konovalov
2023-08-18  1:08 ` Thinh Nguyen
2023-08-18  2:37   ` Alan Stern
2023-08-18  3:10     ` Thinh Nguyen
2023-08-18  3:26       ` Thinh Nguyen
2023-08-18  3:42       ` Alan Stern
2023-08-18 19:49         ` Thinh Nguyen
2023-08-18 20:46           ` Thinh Nguyen
2023-08-18 23:06           ` Alan Stern
2023-08-19  0:06             ` Thinh Nguyen
2023-08-19  1:54               ` Andrey Konovalov
2023-08-20 14:20               ` Alan Stern
2023-08-21 16:13                 ` Andrey Konovalov
2023-08-21 17:25                   ` Alan Stern
2023-08-23  2:05                     ` Thinh Nguyen
2023-08-23  2:30                     ` Andrey Konovalov
2023-08-23 15:48                       ` Alan Stern
2023-08-23 17:18                         ` Thinh Nguyen
2023-08-25  1:36                           ` Andrey Konovalov
2023-08-25  2:08                             ` Alan Stern
2023-08-23  2:14                 ` Thinh Nguyen
2023-08-23 15:17                   ` Alan Stern
2023-08-23 17:59                     ` Thinh Nguyen
2023-08-23 19:19                       ` Alan Stern
2023-08-23 22:22                         ` Thinh Nguyen
2023-08-24  2:21                           ` Alan Stern
2023-08-26  1:20                             ` Thinh Nguyen
2023-08-26  3:10                               ` Alan Stern
2023-08-30  1:32                                 ` Thinh Nguyen
2023-08-30 14:48                                   ` Alan Stern
2023-08-31  2:43                                     ` Thinh Nguyen
2023-08-31 15:40                                       ` Alan Stern
2023-09-01  1:27                                         ` Thinh Nguyen
2023-09-01 17:37                                           ` Alan Stern
2023-09-01 21:14                                             ` Thinh Nguyen
2023-09-02 15:15                                               ` Alan Stern
2023-09-05 22:53                                                 ` Thinh Nguyen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.