All of lore.kernel.org
 help / color / mirror / Atom feed
From: Felipe Balbi <balbi@kernel.org>
To: Wesley Cheng <wcheng@codeaurora.org>, gregkh@linuxfoundation.org
Cc: linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	jackp@codeaurora.org
Subject: Re: [PATCH] usb: dwc3: gadget: Disable gadget IRQ during pullup disable
Date: Fri, 11 Jun 2021 10:23:43 +0300	[thread overview]
Message-ID: <87wnr0zwxs.fsf@kernel.org> (raw)
In-Reply-To: <a59a81c1-367f-b4b0-b6bf-dbe91ab3613d@codeaurora.org>

[-- Attachment #1: Type: text/plain, Size: 7731 bytes --]

Wesley Cheng <wcheng@codeaurora.org> writes:

> On 6/10/2021 11:21 AM, Wesley Cheng wrote:
>> 
>> 
>> On 6/10/2021 4:09 AM, Felipe Balbi wrote:
>>> Wesley Cheng <wcheng@codeaurora.org> writes:
>>>
>>>> Current sequence utilizes dwc3_gadget_disable_irq() alongside
>>>> synchronize_irq() to ensure that no further DWC3 events are generated.
>>>> However, the dwc3_gadget_disable_irq() API only disables device
>>>> specific events.  Endpoint events can still be generated.  Briefly
>>>> disable the interrupt line, so that the cleanup code can run to
>>>> prevent device and endpoint events. (i.e. __dwc3_gadget_stop() and
>>>> dwc3_stop_active_transfers() respectively)
>>>>
>>>> Without doing so, it can lead to both the interrupt handler and the
>>>> pullup disable routine both writing to the GEVNTCOUNT register, which
>>>> will cause an incorrect count being read from future interrupts.
>>>>
>>>> Fixes: ae7e86108b12 ("usb: dwc3: Stop active transfers before halting the controller")
>>>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
>>>> ---
>>>>  drivers/usb/dwc3/gadget.c | 11 +++++------
>>>>  1 file changed, 5 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>> index 49ca5da..89aa9ac 100644
>>>> --- a/drivers/usb/dwc3/gadget.c
>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>> @@ -2260,13 +2260,10 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
>>>>  	}
>>>>  
>>>>  	/*
>>>> -	 * Synchronize any pending event handling before executing the controller
>>>> -	 * halt routine.
>>>> +	 * Synchronize and disable any further event handling while controller
>>>> +	 * is being enabled/disabled.
>>>>  	 */
>>>> -	if (!is_on) {
>>>> -		dwc3_gadget_disable_irq(dwc);
>>>> -		synchronize_irq(dwc->irq_gadget);
>>>> -	}
>>>> +	disable_irq(dwc->irq_gadget);
>>>>  
>>>>  	spin_lock_irqsave(&dwc->lock, flags);
>>>
>>> spin_lock_irqsave() is already disabling interrupt, right? Why do we
>>> need another call to disable_irq()?
>>>
>> 
>> Hi Felipe,
>> 
>> Yes, I remember you brought up that point as well before.  So when I
>> checked the logs (USB and scheduler ftrace) for this issue, I clearly
>> saw that we were handling a soft disconnect on CPU3 and then an DWC3 IRQ
>> being scheduled into CPU0.  Last time we discussed, I mentioned that
>> spin_lock_irqsave() only disables interrupts on that particular CPU the
>> thread is running on.
>> 
> Hi Felipe,
>
> Let me share the ftrace snippets as well:
>
> USB FTRACE:
>
> <idle>-0    [002]   304.567900:  dwc3_gadget_ep_cmd name=ep0out cmd=1030  param0=0  param1=4026523648  param2=0  cmd_status=0
> <idle>-0    [004]   304.583225:  dwc3_gadget_ep_cmd name=ep1out cmd=134152  param0=0  param1=0  param2=0  cmd_status=0
>
> //We don't have any logging in pullup disable, but this is when that
> occurs, as we start to see -ESHUTDOWN statuses due to stop active
> transfer.

argh, maybe we should add :-)

I noticed that the logs here look different. Did you modify dwc3 trace
events or is it a special tracer (just curious)? Anyway...

> <idle>-0    [004]   304.583237:  dwc3_gadget_giveback name=ep1out req=1243650560  actual=0  length=16384  status=4294967188  zero=0 short_not_ok=0  no_interrupt=0
> <idle>-0    [004]   304.583275:  dwc3_gadget_giveback name=ep1out req=1243645440  actual=0  length=16384  status=4294967188  zero=0 short_not_ok=0  no_interrupt=0
> <idle>-0    [004]   304.583282:  dwc3_gadget_giveback name=ep1out req=1243648256  actual=0  length=16384  status=4294967188  zero=0 short_not_ok=0  no_interrupt=0
> <idle>-0    [004]   304.583312:  dwc3_gadget_giveback name=ep1out req=4075146240  actual=0  length=16384  status=4294967188  zero=0 short_not_ok=0  no_interrupt=0
> ...
> //USB gadget disconnect is printed AFTER the pullup(0) routine is complete.
> <idle>-0    [004]   304.584240:  usb_gadget_disconnect   speed=3 max_speed=6  state=7  mA=500  sg_supported=1  is_otg=0 is_a_peripheral=0  b_hnp_enable=0  a_hnp_support=0 hnp_polling_support=0  host_request_flag=0  quirk_ep_out_aligned_size=0 quirk_altset_not_supp=0  quirk_stall_not_supp=0  quirk_zlp_not_supp=0 is_selfpowered=0  deactivated=0  connected=1  ret=4294967186
>
> ---------------------------------------------------------
> CPU2 SCHED FTRACE:
>
> kworker/u16:5-192    [002]   304.583354:  irq_handler_entry   irq=343 name=dwc3-msm
> kworker/u16:5-192    [002]   304.583356:  irq_handler_exit   irq=343 ret=0
> kworker/u16:5-192    [002]   304.583358:  irq_handler_entry   irq=343 name=dwc3
> kworker/u16:5-192    [002]   304.583366:  irq_handler_exit   irq=343 ret=2
> kworker/u16:5-192    [002]   304.583377:  sched_wakeup comm=irq/343-dwc3  pid=20174  prio=100  success=1  target_cpu=2
> kworker/u16:5-192   [002]  304.583411: sched_switch: prev_comm=kworker/u16:5 prev_pid=192 prev_prio=120 prev_state=R ==> next_comm=irq/343-dwc3 next_pid=20174 next_prio=100
> irq/343-dwc3-20174    [002]   304.583454:  irq_handler_entry   irq=343 name=dwc3-msm
> irq/343-dwc3-20174    [002]   304.583455:  irq_handler_exit   irq=343 ret=0
> irq/343-dwc3-20174    [002]   304.583458:  irq_handler_entry   irq=343 name=dwc3
> irq/343-dwc3-20174    [002]   304.583465:  irq_handler_exit   irq=343 ret=2
>
> So if we assume that the point of error happens @ 304.567900 due to the
> below points:
> - DWC3 pullup disable routine synchronized IRQs (w/ previous logic)
> before calling stop active transfers.
> - DWC3 request statuses shows ESHUTDOWN, which occurs during stop active
> transfer, as we call giveback w/ that error code
>
> Then we can see that on CPU2, we're still getting DWC3 core IRQs, and
> DWC3 IRQ handler is returning --> 2, which is IRQ_WAKE_THREAD.

very good breakdown of the problem, thank you :-) What if we move the
spin_lock_irqsave() before that !is_on check and do away with
synchronize IRQ?

Then the only thing left would be ignore the interrupt events if they
still come, no?

Something like:

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 612825a39f82..aff9a4673830 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2260,20 +2260,12 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
                return 0;
        }

-       /*
-        * Synchronize any pending event handling before executing the controller
-        * halt routine.
-        */
-       if (!is_on) {
-               dwc3_gadget_disable_irq(dwc);
-               synchronize_irq(dwc->irq_gadget);
-       }
-
        spin_lock_irqsave(&dwc->lock, flags);

        if (!is_on) {
                u32 count;

+               dwc3_gadget_disable_irq(dwc);
                dwc->connected = false;
                /*
                 * In the Synopsis DesignWare Cores USB3 Databook Rev. 3.30a
@@ -3873,6 +3865,9 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
        if (!count)
                return IRQ_NONE;

+       if (!dwc->connected)
+               goto out_ignore;
+
        evt->count = count;
        evt->flags |= DWC3_EVENT_PENDING;

@@ -3887,6 +3882,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
        if (amount < count)
                memcpy(evt->cache, evt->buf, count - amount);

+out_ignore:
        dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);

        return IRQ_WAKE_THREAD;


(totally untested, mind you :)

Ignoring the events in dwc3_check_event_buf() may be unnecessary since
we're already doing that in dwc3_pullup() after __dwc3_gadget_stop().

What do you think?

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 511 bytes --]

  reply	other threads:[~2021-06-11  7:23 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-21  4:23 [PATCH] usb: dwc3: gadget: Disable gadget IRQ during pullup disable Wesley Cheng
2021-06-10 11:09 ` Felipe Balbi
2021-06-10 18:21   ` Wesley Cheng
2021-06-10 20:27     ` Wesley Cheng
2021-06-11  7:23       ` Felipe Balbi [this message]
2021-06-14  7:39         ` Jack Pham

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wnr0zwxs.fsf@kernel.org \
    --to=balbi@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jackp@codeaurora.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=wcheng@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.