linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Felipe Balbi <balbi@kernel.org>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: USB mailing list <linux-usb@vger.kernel.org>
Subject: Re: Disconnect race in Gadget core
Date: Sun, 16 May 2021 12:43:58 +0300	[thread overview]
Message-ID: <8735un6mjl.fsf@kernel.org> (raw)
In-Reply-To: <20210515153113.GB1036273@rowland.harvard.edu>

[-- Attachment #1: Type: text/plain, Size: 5060 bytes --]


Hi,

Alan Stern <stern@rowland.harvard.edu> writes:
>> Alan Stern <stern@rowland.harvard.edu> writes:
>> > On Fri, May 14, 2021 at 10:35:59AM +0300, Felipe Balbi wrote:
>> >> Alan Stern <stern@rowland.harvard.edu> writes:
>> >> > On Wed, May 12, 2021 at 10:00:41AM +0300, Felipe Balbi wrote:
>> >> >> Hmm, IIRC only the storage gadget defers work to another thread.
>> >> >
>> >> > Well, the composite core is built into almost every gadget, and doesn't 
>> >> > it handle Set-Configuration and Set-Interface requests in a separate 
>> >> > thread?  Or doesn't it expect function drivers to do so?
>> >> 
>> >> composite.c doesn't defer anything to a separate thread by itself. The
>> >> gadget driver, if it finds it necessary, should handle it
>> >> internally. Most gadget drivers handle all of this in interrupt context.
>> >> 
>> >> > After all, the gadget first learns about config or altsetting changes 
>> >> > when it receives a Set-Configuration or Set-Interface request, which 
>> >> > happens in interrupt context.  But changing configs or altsettings 
>> >> > requires disabling/enabling endpoints, which needs a process context 
>> >> > (see the kerneldoc for usb_ep_enable and usb_ep_disable).
>> >> 
>> >> Ouch, I don't think any driver is guaranteeing that other than the
>> >> storage gadget.
>> >
>> > A quick search shows that USB_GADGET_DELAYED_STATUS is used in f_fs.c, 
>> > f_tcm.c, and f_uvc.c in addition to f_mass_storage.c.  So the situation 
>> > isn't as bad as you thought, although it should be better.
>> 
>> right, that's not what the documentation means, though. We're still
>> enabling/disabling endpoints in interrupt context, just delaying the
>> status stage until a later time.
>> 
>> It looks like delaying status like this is enough for the current UDC
>> drivers so, perhaps, we should make delayed_status mandatory and update
>> the documentation accordingly.
>
> If it's okay to call those functions in interrupt context then the 
> kerneldoc definitely should be updated.  However, I don't see why you 
> would want to make DELAYED_STATUS mandatory.  If all the necessary work 
> can be done in the set_alt handler, why not return the status 
> immediately?

because we avoid a special case. Instead of having magic return value to
mean "Don't do anything until I enqueue a request" we can just make that
an assumption, i.e. gadget driver *must* enqueue requests for data and
status stages.

> BTW, does it seem odd to you that these function drivers defer some of 
> the set_alt activities into a work thread but do the ep_enable/disable 
> parts right away, in interrupt context?  I should think the drivers 
> would want to minimize the amount of processing that happens 
> in_interrupt.

it is a bit odd, yes. I'm pretty sure this is forcing odd things to
happen in at least camera gadget, which must communicate with v4l2.

IIRC, camera gadget processes frames in the same context as the
->complete callback, as well, which also runs in_interrupt.

>> > Anyway, getting back to the main point...
>> >
>> > To minimize disruption, suppose we add a new callback to usb_gadget_ops:
>> >
>> > 	void	(*udc_async_callbacks)(struct usb_gadget *, int enable);
>> >
>> > The UDC core can call this at the appropriate times, if it is not NULL.  
>> > It allows the core to tell a UDC driver whether or not to issue 
>> > callbacks for setup, disconnect, reset, suspend, and resume.  It doesn't 
>> > affect request completion callbacks.
>> >
>> > So for removing a driver, the proper sequence will be:
>> >
>> > 	usb_gadget_disconnect()
>> > 	if (gadget->ops->udc_async_callbacks)
>> > 		gadget->ops->udc_async_callbacks(gadget, 0);
>> > 	synchronize_irq()
>> > 	udc->driver->unbind()
>> > 	usb_gadget_udc_stop()
>> >
>> > Or maybe the last two steps should be reversed.  In udc_bind_to_driver, 
>
> After some more thought, I decided the last two steps are in the right 
> order now.  When udc_stop runs, it causes the UDC driver to forget about 
> the gadget driver, so there wouldn't be any way to issue the completion 
> callbacks when the unbind handler cancels the outstanding requests and 
> disables the endpoints.
>
>> > the opposite sequence is:
>> >
>> > 	bind
>> > 	udc_start
>
> Then by analogy, these two steps should be reversed.  But it doesn't 
> really matter, because the gadget driver won't try to enable endpoints 
> or do anything else until the host has enumerated the gadget.  (And if 
> there are any gadget devices which don't allow software to control the 
> pull-up separately, then they clearly will want bind to precede 
> udc_start.)

makes sense

>> > 	enable async callbacks
>> > 	connect (turn on pull-up)
>> >
>> > How does this sound?
>> 
>> Sounds reasonable and, probably, minimizes the amount of code that needs
>> to be changed. This will also enable us to fix each UDC in isolation.
>
> Right.  Okay, I'll work on a patch series.

Thank you, Alan.

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 511 bytes --]

  reply	other threads:[~2021-05-16  9:44 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-10 15:24 Disconnect race in Gadget core Alan Stern
2021-05-10 16:43 ` Felipe Balbi
2021-05-10 19:38   ` Alan Stern
2021-05-11  2:53     ` Peter Chen
2021-05-11 19:15       ` Alan Stern
2021-05-12  9:37         ` Peter Chen
2021-05-12  9:41           ` Felipe Balbi
2021-05-12 19:33           ` Alan Stern
2021-05-11  8:22     ` Felipe Balbi
2021-05-11 21:26       ` Alan Stern
2021-05-12  7:00         ` Felipe Balbi
2021-05-12 15:33           ` Alan Stern
2021-05-14  7:35             ` Felipe Balbi
2021-05-14 16:58               ` Alan Stern
2021-05-15  6:41                 ` Felipe Balbi
2021-05-15 15:31                   ` Alan Stern
2021-05-16  9:43                     ` Felipe Balbi [this message]
2021-05-16 14:51                       ` Alan Stern
2021-05-17  2:00                         ` Peter Chen
2021-05-17  5:33                           ` Felipe Balbi
2021-05-17  5:35                         ` Felipe Balbi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8735un6mjl.fsf@kernel.org \
    --to=balbi@kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).