From: Danilo Krummrich <danilokrummrich@dk-develop.de>
To: Felipe Balbi <balbi@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org,
gregkh@linuxfoundation.org, k.opasiak@samsung.com
Subject: Re: [PATCH] usb: gadget: udc: udc_stop before gadget unbind
Date: Tue, 15 Aug 2017 14:51:19 +0200 [thread overview]
Message-ID: <c19219134cb23e7e0338e42890d1604d@dk-develop.de> (raw)
In-Reply-To: <878til6ntr.fsf@linux.intel.com>
On 2017-08-15 14:02, Felipe Balbi wrote:
> Hi,
>
> Danilo Krummrich <danilokrummrich@dk-develop.de> writes:
>> thanks for reviewing.
>
> np :-)
>
>> On 2017-08-15 12:03, Felipe Balbi wrote:
>>> Hi,
>>>
>>> Danilo Krummrich <danilokrummrich@dk-develop.de> writes:
>>>> udc_stop needs to be called before gadget driver unbind. Otherwise
>>>> it
>>>> might happen that udc drivers still call into the gadget driver
>>>> (e.g.
>>>> to reset gadget after OTG event). If this happens it is likely to
>>>> get
>>>> panics from gadget driver dereferencing NULL ptr, as gadget's
>>>> drvdata
>>>> is set to NULL on unbind.
>>>
>>> seems like the problem here is with the OTG layer, not UDC core.
>>>
>> I mentioned this just as example, it can happen whenever a UDC driver
>> calls
>> the gadget driver (e.g. by calling usb_gadget_udc_reset() in ISR)
>> after
>> gadget
>> drivers unbind() was called already (e.g. by gadget configfs).
>> If this happens gadget drivers drvdata was already set to NULL by
>> unbind()
>> and reset() could result into a NULL ptr exception.
>> Therefore my assumption was that it needs to be prevented that the
>> gadget
>> driver is getting called after unbind.
>
> We have a known problem in the design of the gadget API that causes
> this
> races but we couldn't come up with a solution yet :-)
>
> Inverting these two calls is not the correct way to go about this :-)
>
Now I see, thanks for explanation below.
>>>> Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de>
>>>> ---
>>>> Actually there could still be a race:
>>>> (CPU1 code taken from dwc3 drivers dwc3_disconnect_gadget() as
>>>> exsample)
>>>>
>>>> CPU0 CPU1
>>>> ---- ----
>>>> usb_gadget_disconnect(udc->gadget);
>>>> udc->driver->disconnect(udc->gadget);
>>>> if (dwc->gadget_driver && dwc->gadget_driver->disconnect)
>>>> usb_gadget_udc_stop(udc);
>>>> udc->driver->unbind(udc->gadget);
>>>> dwc->gadget_driver->disconnect(&dwc->gadget);
>>>>
>>>> UDC drivers typically set their gadget driver pointer to NULL in
>>>> udc_stop
>>>> and check for it before calling into the gadget driver. To fix the
>>>> issue
>>>> above every udc driver could apply a lock around this.
>>>>
>>>> If you see the need for having this or another solutions I can
>>>> provide
>>>> further patches. This patch could also just serve as a base for
>>>> discussion
>>>> if someone knows a smarter solution.
>>>>
>>>> I saw this problem causing a panic on hikey960 board and provided a
>>>> quick
>>>> workaround for the same problem here:
>>>> https://android-review.googlesource.com/#/c/kernel/common/+/457476/
>>>> (panic log in the commit message of the linked patch)
>>>> ---
>>>> drivers/usb/gadget/udc/core.c | 6 +++++-
>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/usb/gadget/udc/core.c
>>>> b/drivers/usb/gadget/udc/core.c
>>>> index efce68e9a8e0..8155468afc0d 100644
>>>> --- a/drivers/usb/gadget/udc/core.c
>>>> +++ b/drivers/usb/gadget/udc/core.c
>>>> @@ -1234,8 +1234,12 @@ static void usb_gadget_remove_driver(struct
>>>> usb_udc *udc)
>>>>
>>>> usb_gadget_disconnect(udc->gadget);
>>>> udc->driver->disconnect(udc->gadget);
>>>> - udc->driver->unbind(udc->gadget);
>>>> + /* udc_stop needs to be called before gadget driver unbind to
>>>> prevent
>>>> + * udc driver calls into gadget driver after unbind which could
>>>> cause
>>>> + * a nullptr exception.
>>>> + */
>>>> usb_gadget_udc_stop(udc);
>>>> + udc->driver->unbind(udc->gadget);
>>>
>>> This patch is incorrect, it will prevent us from giving back requests
>>> to
>>> gadget driver properly. ->unbind() has to happen before ->udc_stop().
>>
>> Do you mean after udc_stop the udc driver can not call the gadget
>> driver
>> anymore? If not, I did not got your point, sorry for that. Can you
>> please
>> help me out? Would the changed order raise another issue I'm not aware
>> of?
>
> right, ->udc_stop() is supposed to completely teardown the USB
> controller, including disabling interrupts and all. The only thing it
> _can_ do from ->udc_stop() would be giving back any pending requests
> that were left (which would cause req->complete() to be called with an
> error status). But even that is unlikely in the case you mention since
> ->unbind() was already called.
>
Ok, got it. That's why req->context = cdev, to overcome being unbound
already. Thanks for clarification.
>> If I understood you correctly, without this patch udc driver can not
>> call
>> the gadget driver back as well, because this would result in a NULL
>> ptr
>> dereference, as unbind() sets drvdata to NULL.
>>
>> In any case the race described in my original message can still
>> happen,
>> regardless of the order of udc_stop and unbind. But with this patch
>> the
>> needed locking could easily done within the udc driver only. Without,
>> the
>> lock needs to be acquired before udc->driver->unbind(udc->gadget) and
>> released after usb_gadget_udc_stop(). Otherwise an ISR of the udc
>> driver
>> trying to call into the gadget driver could do this after gadget
>> driver
>> already unbound.
>
> right
Is someone working on this issue, already?
If not, I would like to offer introducing the needed locking to overcome
this issue.
If you are about to refactor the whole thing already to solve this (as
you state there's a design problem in the gadget API) would it make
sense
for you to have a workaround meanwhile (maybe not in mainline kernel,
but
somewhere else)? As e.g. on hikey960 board this causes panics very
often.
Thanks,
Danilo
prev parent reply other threads:[~2017-08-15 12:51 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-14 16:36 [PATCH] usb: gadget: udc: udc_stop before gadget unbind Danilo Krummrich
2017-08-15 10:03 ` Felipe Balbi
2017-08-15 11:51 ` Danilo Krummrich
2017-08-15 12:02 ` Felipe Balbi
2017-08-15 12:51 ` Danilo Krummrich [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c19219134cb23e7e0338e42890d1604d@dk-develop.de \
--to=danilokrummrich@dk-develop.de \
--cc=balbi@kernel.org \
--cc=gregkh@linuxfoundation.org \
--cc=k.opasiak@samsung.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-usb@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).