From: Felipe Balbi <balbi@kernel.org>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: USB mailing list <linux-usb@vger.kernel.org>
Subject: Re: Disconnect race in Gadget core
Date: Tue, 11 May 2021 11:22:51 +0300 [thread overview]
Message-ID: <87r1idfzms.fsf@kernel.org> (raw)
In-Reply-To: <20210510193849.GB873147@rowland.harvard.edu>
[-- Attachment #1: Type: text/plain, Size: 6849 bytes --]
Hi,
Alan Stern <stern@rowland.harvard.edu> writes:
>> Alan Stern <stern@rowland.harvard.edu> writes:
>> > I just noticed this potential race in the Gadget core, specifically, in
>> > usb_gadget_remove_driver. Here's the relevant code:
>> >
>> > usb_gadget_disconnect(udc->gadget);
>> > if (udc->gadget->irq)
>> > synchronize_irq(udc->gadget->irq);
>> > udc->driver->unbind(udc->gadget);
>> > usb_gadget_udc_stop(udc);
>> >
>> > usb_gadget_disconnect will turn off the D+ pull-up, telling the host (or
>> > upstream hub) that the gadget is no longer connected to the bus. The
>> > udc->driver->unbind call then tells the gadget driver that it will no
>> > longer receive any callbacks from the UDC driver or the gadget core.
>> >
>> > Now suppose that at just this moment, the user unplugs the USB cable.
>> > The UDC driver will notice that the Vbus voltage has gone away and will
>> > invoke the gadget driver's ->disconnect callback. After all, it doesn't
>> > realize it should avoid making these callbacks until usb_gadget_udc_stop
>> > has run.
>> >
>> > As a result, the gadget driver's disconnect callback may be invoked
>> > _after_ the driver has been unbound from the gadget.
>>
>> argh, nice catch!
>>
>> > How should we fix this potential problem?
>>
>> I think we can just move usb_gadget_udc_stop(). Looking at a few udcs,
>> they're just either freeing or masking UDC's interrupts which should
>> prevent the race you describe while also making sure that no further
>> interrupts will trigger.
>>
>> Perhaps we could move udc_stop() before synchronize_irq(). Do you
>> foresee any issues with that for net2272 or dummy?
>
> I don't think it will be that easy. As you may remember, there have
> been a number of commits over the years fiddling with the order of
> operations during gadget driver unbinding (although I don't think any of
> them moved udc_stop from its position at the end). Still, it's a
> delicate matter.
>
> The purpose of synchronize_irq is to wait until any currently running
> IRQ handlers have finished. Obviously this doesn't do any good unless
> the software has arranged beforehand that no new interrupt requests will
> be generated. In this case, turning off the pull-up is currently not
> sufficient. But waiting until after udc_stop has returned isn't the
> answer; it wouldn't prevent callbacks from being issued between the
> unbind and the udc_stop.
>
> And I'm quite sure that calling udc_stop before unbind would be wrong.
> The gadget driver would then get various errors (like requests
> completing with -ESHUTDOWN status) it's not prepared to handle.
>
> So what we need is a way to tell the UDC driver to stop issuing
> callbacks. The ones defined in struct usb_gadget_driver are:
>
> bind and unbind,
> bus suspend and bus resume,
> setup,
> bus reset,
> disconnect.
>
> Bind and unbind aren't relevant for this discussion because they are
> synchronous, not invoked in response to interrupts.
>
> In theory there shouldn't be any bus-resume, setup, or bus-reset
> interrupts once the pull-up is turned off, but it would be good to
> protect against bad hardware which may produce them.
>
> Bus suspend is a real possibility. It occurs when the UDC hasn't
> received any packets for a few milliseconds, which certainly will be the
> case when the pull-up is off. UDC hardware and drivers may or may not
> be smart enough to realize that under this circumstance, lack of packets
> shouldn't be reported as a bus suspend.
>
> And of course, a cable disconnect can occur at any time -- nothing we
> can do about that.
>
> Putting it all together, we need to tell the UDC driver, somewhere
> between turning the pull-up off and doing the synchronize_irq, to stop
> issuing disconnect (and possibly also setup and suspend) callbacks. I
> don't think we can use the pull-up call for this purpose; a gadget
> driver may want for some reason to disconnect logically from the bus
> while still knowing whether Vbus is present. (You may disagree about
> that, but otherwise what's the point of having a disconnect callback in
> the first place?)
>
> Which means in the end that struct usb_gadget_ops needs either to have a
> new callback for this purpose or to have an existing callback augmented
> (for example, the ->pullup callback could get an additional argument
> saying whether to continue issuing callbacks). Or another possibility
> is to make UDC drivers call a core routine to do disconnect callbacks
> instead of issuing those callbacks themselves, and have the core filter
> out callbacks that come at the wrong time. Either way, every UDC driver
> would need to be modified.
>
> What do you think? Is this reasoning accurate, or have I missed
> something?
Right, I'm arguing that, perhaps, ->udc_stop() is the one that should
have said semantics. For starters, 'stop' has a very clear meaning and,
considering my quick review of 3 or 4 UDC drivers, they are just masking
or releasing interrupts which would prevent ->suspend() and
->disconnect() from being called. It could be, however, that if we
change the semantics of udc_stop to fit your description above,
->udc_start() may have to change accordingly. Using dwc3 as an example,
here are the relevant implementations:
> static int dwc3_gadget_start(struct usb_gadget *g,
> struct usb_gadget_driver *driver)
> {
> struct dwc3 *dwc = gadget_to_dwc(g);
> unsigned long flags;
> int ret;
> int irq;
>
> irq = dwc->irq_gadget;
> ret = request_threaded_irq(irq, dwc3_interrupt, dwc3_thread_interrupt,
> IRQF_SHARED, "dwc3", dwc->ev_buf);
request interrupt line and enable it. Prepare the udc to call gadget ops.
> if (ret) {
> dev_err(dwc->dev, "failed to request irq #%d --> %d\n",
> irq, ret);
> return ret;
> }
>
> spin_lock_irqsave(&dwc->lock, flags);
> dwc->gadget_driver = driver;
internal pointer cached for convenience
> spin_unlock_irqrestore(&dwc->lock, flags);
>
> return 0;
> }
>
> static int dwc3_gadget_stop(struct usb_gadget *g)
> {
> struct dwc3 *dwc = gadget_to_dwc(g);
> unsigned long flags;
>
> spin_lock_irqsave(&dwc->lock, flags);
> dwc->gadget_driver = NULL;
> spin_unlock_irqrestore(&dwc->lock, flags);
>
> free_irq(dwc->irq_gadget, dwc->ev_buf);
drop the interrupt line. This makes the synchronize_irq() call
irrelevant in usb_gadget_remove_driver().
I'm not against adding new udc methods to gadget ops, but it seems
fitting that udc_start/udc_stop would fit your description while some
new members could be given the task of priming the default control pipe
to receive the first SETUP packet.
What do you think?
--
balbi
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 511 bytes --]
next prev parent reply other threads:[~2021-05-11 8:31 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-10 15:24 Disconnect race in Gadget core Alan Stern
2021-05-10 16:43 ` Felipe Balbi
2021-05-10 19:38 ` Alan Stern
2021-05-11 2:53 ` Peter Chen
2021-05-11 19:15 ` Alan Stern
2021-05-12 9:37 ` Peter Chen
2021-05-12 9:41 ` Felipe Balbi
2021-05-12 19:33 ` Alan Stern
2021-05-11 8:22 ` Felipe Balbi [this message]
2021-05-11 21:26 ` Alan Stern
2021-05-12 7:00 ` Felipe Balbi
2021-05-12 15:33 ` Alan Stern
2021-05-14 7:35 ` Felipe Balbi
2021-05-14 16:58 ` Alan Stern
2021-05-15 6:41 ` Felipe Balbi
2021-05-15 15:31 ` Alan Stern
2021-05-16 9:43 ` Felipe Balbi
2021-05-16 14:51 ` Alan Stern
2021-05-17 2:00 ` Peter Chen
2021-05-17 5:33 ` Felipe Balbi
2021-05-17 5:35 ` Felipe Balbi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87r1idfzms.fsf@kernel.org \
--to=balbi@kernel.org \
--cc=linux-usb@vger.kernel.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).