linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Wesley Cheng <wcheng@codeaurora.org>
To: Felipe Balbi <balbi@kernel.org>,
	gregkh@linuxfoundation.org, peter.chen@kernel.org
Cc: linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	jackp@codeaurora.org
Subject: Re: [PATCH v2] usb: dwc3: gadget: Replace list_for_each_entry_safe() if using giveback
Date: Tue, 27 Jul 2021 14:00:16 -0700	[thread overview]
Message-ID: <3c55dd4f-5ebd-3730-7428-cb15235465a7@codeaurora.org> (raw)
In-Reply-To: <87v954xjoz.fsf@kernel.org>

Hi Felipe,

On 7/20/2021 11:30 PM, Felipe Balbi wrote:
> 
> Hi Wesley,
> 
> (first of all, sorry for the super long delay. This really fell through
> the cracks)
> 

No problem, I understand that you probably get a whole bunch of things
all at once, so its hard to keep track of each one :).

> Wesley Cheng <wcheng@codeaurora.org> writes:
>> Hi Felipe,
>>
>> On 6/9/2021 1:57 PM, Wesley Cheng wrote:
>>> Hi Felipe,
>>>
>>> On 5/19/2021 1:52 AM, Wesley Cheng wrote:
>>>>
>>>>
>>>> On 5/11/2021 1:13 AM, Felipe Balbi wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Wesley Cheng <wcheng@codeaurora.org> writes:
>>>>>> The list_for_each_entry_safe() macro saves the current item (n) and
>>>>>> the item after (n+1), so that n can be safely removed without
>>>>>> corrupting the list.  However, when traversing the list and removing
>>>>>> items using gadget giveback, the DWC3 lock is briefly released,
>>>>>> allowing other routines to execute.  There is a situation where, while
>>>>>> items are being removed from the cancelled_list using
>>>>>> dwc3_gadget_ep_cleanup_cancelled_requests(), the pullup disable
>>>>>> routine is running in parallel (due to UDC unbind).  As the cleanup
>>>>>> routine removes n, and the pullup disable removes n+1, once the
>>>>>> cleanup retakes the DWC3 lock, it references a request who was already
>>>>>> removed/handled.  With list debug enabled, this leads to a panic.
>>>>>> Ensure all instances of the macro are replaced where gadget giveback
>>>>>> is used.
>>>>>>
>>>>>> Example call stack:
>>>>>>
>>>>>> Thread#1:
>>>>>> __dwc3_gadget_ep_set_halt() - CLEAR HALT
>>>>>>   -> dwc3_gadget_ep_cleanup_cancelled_requests()
>>>>>>     ->list_for_each_entry_safe()
>>>>>>     ->dwc3_gadget_giveback(n)
>>>>>>       ->dwc3_gadget_del_and_unmap_request()- n deleted[cancelled_list]
>>>>>>       ->spin_unlock
>>>>>>       ->Thread#2 executes
>>>>>>       ...
>>>>>>     ->dwc3_gadget_giveback(n+1)
>>>>>>       ->Already removed!
>>>>>>
>>>>>> Thread#2:
>>>>>> dwc3_gadget_pullup()
>>>>>>   ->waiting for dwc3 spin_lock
>>>>>>   ...
>>>>>>   ->Thread#1 released lock
>>>>>>   ->dwc3_stop_active_transfers()
>>>>>>     ->dwc3_remove_requests()
>>>>>>       ->fetches n+1 item from cancelled_list (n removed by Thread#1)
>>>>>>       ->dwc3_gadget_giveback()
>>>>>>         ->dwc3_gadget_del_and_unmap_request()- n+1 deleted[cancelled_list]
>>>>>>         ->spin_unlock
>>>>>>
>>>>>> Fixes: d4f1afe5e896 ("usb: dwc3: gadget: move requests to cancelled_list")
>>>>>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
>>>>>> Reviewed-by: Peter Chen <peter.chen@kernel.org>
>>>>>> ---
>>>>>> Changes in v2:
>>>>>>  - Updated commit message with context call stack of an example scenario
>>>>>>    seen on device.
>>>>>>
>>>>>>  drivers/usb/dwc3/gadget.c | 8 ++++----
>>>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>> index dd80e5c..efa939b 100644
>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>> @@ -1737,10 +1737,10 @@ static void dwc3_gadget_ep_skip_trbs(struct dwc3_ep *dep, struct dwc3_request *r
>>>>>>  static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
>>>>>>  {
>>>>>>  	struct dwc3_request		*req;
>>>>>> -	struct dwc3_request		*tmp;
>>>>>>  	struct dwc3			*dwc = dep->dwc;
>>>>>>  
>>>>>> -	list_for_each_entry_safe(req, tmp, &dep->cancelled_list, list) {
>>>>>> +	while (!list_empty(&dep->cancelled_list)) {
>>>>>> +		req = next_request(&dep->cancelled_list);
>>>>>
>>>>> couldn't this be solved list_replace_init() instead? Then we can keep
>>>>> using the regular list_for_each_entry_safe() which has an added semantic
>>>>> meaning due to its name.
>>>>>
>>>>
>>>> Hi Felipe,
>>>>
>>>> Sorry for the late response.  So I tried with a list_replace_init() to
>>>> within the list_for_each_entry_safe() loop to update tmp w/ the
>>>> cancelled_list list head, but the issue was still observed.  This is
>>>> because we can't replace the reference the loop already has stored in
>>>> tmp, which is simply updated as the current item on the next iteration.
>>>>
>>>> I believe this is what you were trying to achieve?
>>>>
>>> Was wondering if you had any further inputs on this change?  As
>>> mentioned, I tried a few things with list_replace_init(), which did not
>>> work.
>>>
>>
>> Sorry for the ping.  Is this change OK to add as is?  We've been running
>> into this instance pretty frequently during our testing, so just wanted
>> to close on the proper changes being merged upstream.
> 
> The idea is this:
> 
> 	struct list_head	local;
> 
>         spin_lock_irq(&lock);
>         list_replace_init(&dwc->cancelled_list, &local);
>         spin_unlock_irq(&lock);
> 
> 	list_for_each_entry_safe(req, tmp, &local, list) {
>         	/* ... */
> 	}
> 
> It looks to me this should work fine, no? You can also follow what
> drivers/usb/core/hcd.c is doing in usb_giveback_urb_bh() and restarting
> if dwc->cancelled_list is not empty after list_for_each_entry_safe().
> 
> Can you give that one a shot?
> 

Great, thanks for this suggestion!  Now I understand what you were
referring to.  I gave this a try and it works well.  Will prepare a
change to replace both places with list_replace_init()

Thanks
Wesley Cheng

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

  reply	other threads:[~2021-07-27 21:01 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-11  7:03 [PATCH v2] usb: dwc3: gadget: Replace list_for_each_entry_safe() if using giveback Wesley Cheng
2021-05-11  8:13 ` Felipe Balbi
2021-05-19  8:52   ` Wesley Cheng
2021-06-09 20:57     ` Wesley Cheng
2021-07-21  6:14       ` Wesley Cheng
2021-07-21  6:30         ` Felipe Balbi
2021-07-27 21:00           ` Wesley Cheng [this message]
2021-07-28  5:53             ` Felipe Balbi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3c55dd4f-5ebd-3730-7428-cb15235465a7@codeaurora.org \
    --to=wcheng@codeaurora.org \
    --cc=balbi@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jackp@codeaurora.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=peter.chen@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).