All of lore.kernel.org
 help / color / mirror / Atom feed
From: johan@kernel.org (Johan Hovold)
To: linux-arm-kernel@lists.infradead.org
Subject: usb: dwc2: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 146s
Date: Wed, 19 Apr 2017 11:55:04 +0200	[thread overview]
Message-ID: <20170419095504.GF2823@localhost> (raw)
In-Reply-To: <13d18e41-b430-37ea-afb8-d300326bfb12@i2se.com>

On Wed, Apr 19, 2017 at 11:12:17AM +0200, Stefan Wahren wrote:
> Am 19.04.2017 um 10:56 schrieb Johan Hovold:
> > On Tue, Apr 18, 2017 at 10:25:06PM +0200, Stefan Wahren wrote:
> >> Hi,
> >>
> >> [add Johan]
> >>
> >>> Stefan Wahren <stefan.wahren@i2se.com> hat am 18. April 2017 um 10:07 geschrieben:
> >>>
> >>>
> >>> Am 18.04.2017 um 00:37 schrieb Doug Anderson:
> >>>> Hi,
> >>>>
> >>>> On Mon, Apr 17, 2017 at 4:05 AM, Stefan Wahren <stefan.wahren@i2se.com> wrote:
> >>>>> Hi,
> >>>>>
> >>>>>> Stefan Wahren <stefan.wahren@i2se.com> hat am 31. Oktober 2016 um 21:34 geschrieben:
> >>>>>>
> >>>>>>
> >>>>>> I inspired by this issue [1] i build up a slightly modified
> >>>>>> setup with a Raspberry Pi B (mainline kernel 4.9rc3), a powered
> >>>>>> 7 port USB hub and 5 Prolific PL2303 USB to serial convertors. I
> >>>>>> modified the usb_test for dwc2 [2], which only tries to open all
> >>>>>> ttyUSB devices one after the other.
> >>>>>>
> >>>>>> Unfortunately the complete system stuck after opening the first
> >>>>>> ttyUSB device ( heartbeat LED stop blinking, no reaction to
> >>>>>> debug UART). The only way to reanimate the system is to
> >>>>>> powerdown the USB hub with the USB to serial convertors.
> >>>>>>
> >>>>>> [1] - https://github.com/raspberrypi/linux/issues/1692
> >> i took GPIO17 to measure _dwc2_hcd_irq and GPIO18 to measure
> >> _dwc2_hcd_urb_enqueue (patch against 4.11rc1 below).
> >>
> >> So i made my observations for 3 test cases:
> >>
> >> 1) no serial converter connected (idle)
> >> 2) 1 FTDI serial converter connected
> >> 3) 1 PL2303 serial converter connected
> >>
> >> case   | ksoftirq cpu     | mean duration | max duration  | max duration | urb_enqueue  |
> >>        |                  | hcd_irq       | hcd_irq       | urb_enqueue  | within 10 sec|
> >> -------+------------------+---------------+---------------+--------------+--------------+
> >> idle   | 0.0%             | 2 us          | 16.5 us       |     12 us    | 5            |
> >> FTDI   | 25.0%            | 8.5 us        | 18.0 us       |  31000 us    | ~ 400        |
> >> PL2303 | top doesn't work | 8.5 us        | 22.5 us       | 900000 us    | 4            |
> >>
> >> So it seems the serial USB driver has also an impact. In the analyzer
> >> trace the FTDI triggers many smaller urb_enqueue calls in the opposite
> >> to the PL2303 which only has few but huge calls.
> > What do you mean by "huge calls" above?
> 
> The time it takes until _dwc2_hcd_urb_enqueue is finished.

Yes, sorry, I realised that after I sent my reply.

> > Are you just keeping the ports open (i.e. with no data being received or
> > sent)?
> 
> Yes, only open and no data is received or sent (LEDs doesn't show any
> activity).
> 
> > FTDI devices are notorious for their status messages being sent
> > continuously while the port is open. You'll get a two-byte bulk-in
> > message every 16ms by default (these were sent every millisecond up
> > until recently due to a long-standing regression).
> >
> > PL2303 devices have an interrupt-in endpoint which is used for status
> > notifications so you would not see any completion callbacks on an
> > otherwise idle port.
> >
> >> Additional notes:
> >> After closing the serial connection to the FTDI the system is usable
> >> as before. In case of PL2303 i need to disconnect the converter in
> >> order to get a usable system.
> > Does your system lock up when you open the first pl2303 device?
> 
> Yes (directly to the onboard hub or an external hub make no difference).

Ok, good. I suggest not using the external hub until you've tracked this
down.

> >> Why do they behave so differently?
> > So the ftdi bulk-in status messages and the pl2303 interrupt-in endpoint
> > could be two reasons, but I guess so could any change in timing etc.
> >
> > Is your ftdi-device a Full Speed device like the pl2303?
> 
> Sorry, i will need to verify.

Can you reproduce this on 4.11-rc7 without the external hub and post the
corresponding soft lockup bug output?

On open, the pl2303 driver first enqueues the interrupt-in URB before
enqueueing two bulk-in URBs. From your original stack trace it looks
like the lockup happens when queueing either of those. Perhaps you can
enable debugging in the dwc2 driver or add some printks to determine how
far you get.

Another thing to try is to comment out the interrupt-in URB submission
in pl2303 and see whether that avoids triggering the lockup.

Johan

  reply	other threads:[~2017-04-19  9:55 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-31 20:34 usb: dwc2: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 146s Stefan Wahren
2016-11-01  8:26 ` Michael Zoran
2017-04-17 11:05 ` Stefan Wahren
2017-04-17 22:37   ` Doug Anderson
2017-04-18  8:07     ` Stefan Wahren
2017-04-18 20:08       ` Doug Anderson
2017-04-18 20:25       ` Stefan Wahren
2017-04-18 20:41         ` Doug Anderson
2017-04-18 20:53           ` Stefan Wahren
2017-04-19 20:25           ` Stefan Wahren
2017-04-19 21:47             ` Doug Anderson
2017-04-20  7:46               ` Stefan Wahren
2017-04-20 16:19                 ` Doug Anderson
2017-04-20 18:54             ` Eric Anholt
2017-04-20 19:45               ` Doug Anderson
2017-04-20 19:57                 ` Eric Anholt
2017-04-20 20:37                   ` Doug Anderson
2017-04-22 20:50               ` Stefan Wahren
2017-04-25 18:11                 ` Stefan Wahren
2017-05-08 20:22                   ` Stefan Wahren
2017-05-10 16:31                     ` Johan Hovold
2017-05-10 23:50                       ` Doug Anderson
2017-05-13 12:28                         ` Stefan Wahren
2017-10-16 20:49                           ` Julius Werner
2017-10-17  8:52                             ` Johan Hovold
2017-10-25 21:22                             ` Doug Anderson
2017-04-19  8:56         ` Johan Hovold
2017-04-19  9:12           ` Stefan Wahren
2017-04-19  9:55             ` Johan Hovold [this message]
2017-04-17 23:45   ` Heiko Stuebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170419095504.GF2823@localhost \
    --to=johan@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.