linux-usb.vger.kernel.org archive mirror
From: Ferry Toth <fntoth@gmail.com>
To: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
	Felipe Balbi <balbi@kernel.org>,
	Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>, USB <linux-usb@vger.kernel.org>
Subject: Re: USB network gadget / DWC3 issue
Date: Mon, 5 Apr 2021 22:59:06 +0200
Message-ID: <7963d464-44c1-f580-398c-775c694664cb@gmail.com>
In-Reply-To: <5d8459ae-4a4c-7371-6b0a-ed817e898168@gmail.com>

Hi,

On 03-04-2021 at 23:15, Ferry Toth wrote:
> Hi,
>
> On 03-04-2021 at 13:25, Ferry Toth wrote:
>> Hi,
>>
>> On 03-04-2021 at 04:02, Thinh Nguyen wrote:
>>> Ferry Toth wrote:
>>>> Hi,
>>>>
>>>> On 02-04-2021 at 22:16, Thinh Nguyen wrote:
>>>>> Ferry Toth wrote:
>>>>>> Hi
>>>>>>
>>>>>> On 30-03-2021 at 23:57, Ferry Toth wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> On 30-03-2021 at 22:26, Ferry Toth wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 30-03-2021 at 18:17, Felipe Balbi wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Andy Shevchenko <andy.shevchenko@gmail.com> writes:
>>>>>>>>>> Hi!
>>>>>>>>>>
>>>>>>>>>> I have a platform with DWC3 in Dual Role mode. Currently I'm
>>>>>>>>>> experimenting on v5.12-rc5 with a few patches (mostly configuration)
>>>>>>>>>> applied [1]. I'm using Debian Unstable on the host machine and
>>>>>>>>>> BuildRoot with the above mentioned kernel on the target.
>>>>>>>>>>
>>>>>>>>>> So, scenario 0:
>>>>>>>>>> 1. Run iperf3 -s on target
>>>>>>>>>> 2. Run iperf3 -c ... -t 0 on the host
>>>>>>>>>> 3. 0.00-10.36  sec   237 MBytes   192 Mbits/sec  receiver
>>>>>>>>>>
>>>>>>>>>> Scenario 1:
>>>>>>>>>> 1. Now, detach USB cable, wait for several seconds, attach it back,
>>>>>>>>>> repeat above:
>>>>>>>>>> 0.00-9.94   sec   209 MBytes   176 Mbits/sec  receiver
>>>>>>>>>>
>>>>>>>>>> Note the bandwidth drop (176 vs. 192).
>>>>>>>>>>
>>>>>>>>>> (Repeating scenario 1 will now give the same result)
>>>>>>>>>>
>>>>>>>>>> Scenario 2:
>>>>>>>>>> 1. Detach USB cable, attach a device, for example USB stick,
>>>>>>>>>> 2. See it being enumerated and detach it.
>>>>>>>>>> 3. Attach cable from host
>>>>>>>>>> 4. 0.00-19.36  sec   315 MBytes   136 Mbits/sec  receiver
>>>>>>>>>>
>>>>>>>>>> Note even more bandwidth drop!
>>>>>>>>>>
>>>>>>>>>> (Repeating scenario 1 keeps the same lower bandwidth)
>>>>>>>>>>
>>>>>>>>>> NOTE, sometimes in this scenario after several seconds the target
>>>>>>>>>> simply reboots (w/o any logs [from kernel] printed)!
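
As a sanity check on the three iperf3 summary lines above, here is a minimal Python sketch that recomputes each average rate from the transferred size and the interval length. It assumes iperf3's "MBytes" are 2**20 bytes and its "Mbits" are 10**6 bits; the function and variable names are made up for illustration.

```python
MIB = 1024 * 1024  # assumption: iperf3 "MBytes" means 2**20 bytes


def mbits_per_sec(mbytes: float, seconds: float) -> float:
    """Average rate in 10**6 bits/s for `mbytes` MiB moved in `seconds`."""
    return mbytes * MIB * 8 / seconds / 1e6


# (label, MBytes transferred, interval length in seconds), copied from the thread
runs = [
    ("scenario 0", 237, 10.36),
    ("scenario 1", 209, 9.94),
    ("scenario 2", 315, 19.36),
]

baseline = mbits_per_sec(runs[0][1], runs[0][2])
for label, mbytes, seconds in runs:
    rate = mbits_per_sec(mbytes, seconds)
    drop = 100 * (1 - rate / baseline)
    print(f"{label}: {rate:.0f} Mbits/sec ({drop:.0f}% below scenario 0)")
```

Expressing each run relative to scenario 0 makes it easier to compare results taken on different kernels or hosts.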
>>>>>>>>>>
>>>>>>>>>> So, any pointers on how to debug and what can be a smoking gun here?
>>>>>>>>>>
>>>>>>>>>> Ferry reported this in [2]. There are different kernel versions and
>>>>>>>>>> tools to establish the connection (like connman vs. none in my case).
>>>>>>>>>>
>>>>>>>>>> [1]: https://github.com/andy-shev/linux/
>>>>>>>>>> [2]: https://github.com/andy-shev/linux/issues/31
>>>>>>>>>>
>>>>>>>>> dwc3 tracepoints should give some initial hints. Look at packet
>>>>>>>>> sizes and period of transmission. From dwc3 side, I can't think of
>>>>>>>>> anything we would do to throttle the transmission, but tracepoints
>>>>>>>>> should tell a clearer story.
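
One way to capture such a trace, sketched in Python under some assumptions: tracefs is mounted at /sys/kernel/tracing (older setups use /sys/kernel/debug/tracing), the kernel has the dwc3 trace events built in, and the script runs as root. The helper names, the buffer size and the output file name are illustrative only and do not come from this thread.

```python
from pathlib import Path

# Assumed tracefs mount point; adjust if tracefs is mounted elsewhere.
TRACEFS = Path("/sys/kernel/tracing")


def start_dwc3_trace(tracefs: Path = TRACEFS, buffer_kb: int = 81920) -> None:
    """Enlarge the per-CPU ring buffer, clear it, and enable all dwc3 events."""
    (tracefs / "buffer_size_kb").write_text(f"{buffer_kb}\n")
    (tracefs / "trace").write_text("\n")  # writing to 'trace' clears the buffer
    (tracefs / "events" / "dwc3" / "enable").write_text("1\n")


def stop_dwc3_trace(tracefs: Path = TRACEFS,
                    out: Path = Path("dwc3-trace.txt")) -> int:
    """Disable dwc3 events, save the buffer to `out`, return the line count."""
    (tracefs / "events" / "dwc3" / "enable").write_text("0\n")
    data = (tracefs / "trace").read_text()
    out.write_text(data)
    return len(data.splitlines())


# Typical use (as root on the target):
#   start_dwc3_trace()
#   ... plug the cable, run iperf3 for a second or two ...
#   stop_dwc3_trace(out=Path("good.txt"))
```

Keeping the traced interval short matters, since a busy bulk transfer fills the buffer quickly.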
>>>>>>>>>
>>>>>>>> My testing (but yes, with a different kernel and network managed by
>>>>>>>> connman) shows:
>>>>>>>>
>>>>>>>> 1) on cold boot eem network gadget works fine
>>>>>>>>
>>>>>>>> 2) after unplug or warm reboot (which is also an unplug) it's broken,
>>>>>>>> speed is lost (12.0 Mbits/sec from 200Mb/s normally), packets lost,
>>>>>>>> no configuration received from dhcp, occasional reboot, only way to
>>>>>>>> fix is cold boot
>>>>>>>>
>>>>>>>> 3) if I run `connmanctl disable gadget` before unplug, then on
>>>>>>>> replugging and enabling it works fine
>>>>>>>>
>>>>>>>> My theory is that some HW register is disturbed on a surprise unplug,
>>>>>>>> but not reset on plug or warm boot. But on cold boot it is cleared.
>>>>>>>> Maybe that can help to narrow down tracepoints?
>>>>>>>>
>>>>>>> I captured a plug after warm and after cold boot. This includes
>>>>>>> network setup (dhcp). You can find it in [2] or via the direct link:
>>>>>>> https://github.com/andy-shev/linux/files/6232410/boot.zip
>>>>>>>
>>>>>>
>>>>>> While the above traces in boot.zip allow comparing which regs are not
>>>>>> correctly initialized on warm boot, I have now captured traces of
>>>>>> unplug/plug.
>>>>>>
>>>>>> Here kernel is 5.10.27 (LTS), cold booted with USB cable plugged and
>>>>>> the eem gadget network setup (dhcp). Then trace unplug. Then trace plug.
>>>>>>
>>>>>> After plug the eem connection is again broken.
>>>>>>
>>>>>> This might allow figuring out what goes wrong on unplug. Traces here:
>>>>>> https://github.com/andy-shev/linux/files/6250924/plug-unplug.zip
>>>>>>
>>>>> Hi,
>>>>>
>>>>> Were you able to narrow down the issue to only DWC3 device? (i.e. you
>>>>> tested with different hosts and different device controllers to
>>>>> confirm this)
>>>> I haven't tried with other devices. I have been forced to replace my
>>>> host mobo and nothing changed. But I didn't pay attention to the
>>>> particular host controller.
>>>>
>>> It'd be better if we can narrow down the culprit as this seems to me
>>> like a synchronization issue at the upper layer between the host and
>>> device.
>>>
>>>>> Did you see this issue previously? If not, is it possible to do git
>>>>> bisection?
>>>> This is with Intel Edison where main line usb gadget support appeared
>>>> around 4.19 iirc. I believed the problem appeared between 5.4 and 5.7
>>>> and tried to bisect but failed.
>>>>
>>>> I realize only now that I failed because:
>>>> 1) 5.4 already has this issue as I recently retested
>>> I'm confused, why do you believe the problem is between 5.4 and 5.7 if
>>> 5.4 already has this issue? So when did you start seeing this problem?
>>
>> Because at the time of 5.4 I didn't notice the issue as I normally
>> did cold boots due to other problems on warm boot (i.e. sdhc
>> inaccessible).
>>
>> I never knew that on a cold boot it works. Even during bisecting I
>> didn't know until the end, and then I found 5.4 has the same problem
>> as all the later kernels (tested up to 5.11)
>>
>>> Also, these kernel versions are really old, there's been a lot of
>>> updates/fixes to dwc3 since then. Can we run tests on the latest kernel?
>>
>> I have tested 5.10.27, 5.11.0 and 5.11.4-rt11.
>>
>> But of course I am completely prepared to run Andy's latest 
>> (v5.12-rc5) on the device.
>>
>>>> 2) I didn't use a reproducible criterion. After warm reboot the eem
>>>> gadget fails, but you can flip the host/gadget switch back and forth
>>>> and have the illusion that the connection is restored.
>>>>
>>>> The scenario described here is reproducible: leaving the switch in
>>>> gadget mode eem works after cold boot only. And it likely breaks on
>>>> unplug.
>>>>
>>>> A 2nd hint is that disabling gadget (I used `connmanctl disable gadget`
>>>> but I believe that has the same effect as `ip link set dev usb0 down`)
>>>> before unplug prevents messing up the driver, so you can replug and
>>>> enable again.
>>> These data points are good. However, we'd need to know where to look
>>> first. The issue isn't obvious from the DWC3 controller or the DWC3
>>> driver.
>>>
>>> Can you check a few things:
>>> 1) Any error/timeout messages from the host's dmesg? Or device side?
>>
>> I'll add log from the host side.
>>
>> For now I only see (on a warm plug):
>>
>> kernel: usb 1-11: can't set config #1, error -110
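
For reference, -110 is -ETIMEDOUT on Linux, which suggests the host's set-configuration request to the gadget timed out (an inference from the message text, not something verified here). Below is a small Python sketch for decoding such codes from host log lines; the table lists only a handful of values from the Linux asm-generic errno headers, and the function name is hypothetical.

```python
import re

# A few errno values as defined in Linux's asm-generic headers.
# The table is deliberately incomplete; extend it as needed.
LINUX_ERRNO = {
    19: "ENODEV",
    32: "EPIPE",
    71: "EPROTO",
    108: "ESHUTDOWN",
    110: "ETIMEDOUT",
}


def decode_usb_error(line: str):
    """Return the symbolic errno named in a 'error -N' log line, if any."""
    match = re.search(r"error (-?\d+)", line)
    if match is None:
        return None  # no error code on this line
    return LINUX_ERRNO.get(abs(int(match.group(1))), "unknown")


print(decode_usb_error("kernel: usb 1-11: can't set config #1, error -110"))
```

A hard-coded table is used instead of Python's errno module so the result does not depend on the platform running the script.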
>>
>>> 2) What kernel version is your host using? Can you use the latest for
>>> both host and device?
>>
>> The host is ubuntu's amd64 5.8.0-48-generic.
>>
>> I will test with v5.12-rc5 from ubuntu kernel ppa on the host. And
>> Andy's latest (v5.12-rc5) on the device.
>
> I upgraded host kernel, but not yet device and captured relevant host
> journal messages and device traces. Something did change: after cold
> boot I don't get an eem connection until after I unplug/replug. I then
> traced an iperf transfer. Then after again unplug/replug I get the
> throttled connection, which I also traced.
>
> See https://github.com/andy-shev/linux/files/6253414/transfer.zip
>

Now, with the host updated to Ubuntu kernel ppa 5.12.0-051200rc5-generic and
Edison to 5.12.0-rc5-edison-acpi-standard vanilla + 2 patches appearing
in rc6:

* "usb: dwc3: gadget: Clear DEP flags after stop transfers in ep disable"
* "usb: dwc3: pci: Enable dis_uX_susphy_quirk for Intel Merrifield"

plus one from https://github.com/andy-shev/linux/commits/eds-acpi

* "TODO: driver core: Break infinite loop when deferred probe can't be 
satisfied"

I captured one good and one bad connection, plus logs on the host side;
see journalctl-plus-comments.txt in
https://github.com/andy-shev/linux/files/6260614/5.12-rc5.zip

>
>> I am expecting results this evening.
>>
>>> 3) Snapshot of dwc3 tracepoints of active transfers between the normal
>>> vs throttled of the latest kernel
>>
>> I don't know if the problem I see is really throttling.
>>
>> I can trace an active transfer, but that does actually throttle from
>> 200Mb/s down to 139Mb/s and produces a trace of 53MB. (2x1sec of
>> iperf3).
>>
>>> BR,
>>> Thinh
