All of lore.kernel.org
 help / color / mirror / Atom feed
From: stefan.wahren@i2se.com (Stefan Wahren)
To: linux-arm-kernel@lists.infradead.org
Subject: usb: dwc2: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 146s
Date: Tue, 25 Apr 2017 20:11:09 +0200 (CEST)	[thread overview]
Message-ID: <2127594073.298820.1493143869792@email.1und1.de> (raw)
In-Reply-To: <1998517910.54108.1492894253010@email.1und1.de>

> Stefan Wahren <stefan.wahren@i2se.com> hat am 22. April 2017 um 22:50 geschrieben:
> 
> 
> Hi,
> 
> > Eric Anholt <eric@anholt.net> hat am 20. April 2017 um 20:54 geschrieben:
> > 
> > 
> > Stefan Wahren <stefan.wahren@i2se.com> writes:
> > 
> > > Hi,
> > >
> > >> Doug Anderson <dianders@chromium.org> hat am 18. April 2017 um 22:41 geschrieben:
> > >> 
> > >> 
> > >> It's hard to know for sure that all of this time is really in
> > >> urb_enqueue().  Possible we could have task switched out and been
> > >> blocked elsewhere.  Using ftrace to get more fine-grained timings
> > >> would be useful.  ktime_get(), ktime_sub(), and ktime_to_us() are your
> > >> friends here if you want to use trace_printk.
> > >
> > > i'm a newbie to ftrace, so i hope this would be helpful.
> > >
> > > # connect PL2303 to the onboard hub
> > > # echo 0 > options/sleep-time
> > > # echo 0 > function_profile_enabled
> > > # echo 1 > function_profile_enabled
> > > # ./usb_test
> > > # Waiting for at least 20 seconds and then disconnect PL2303
> > > # echo 0 > function_profile_enabled
> > > # cat trace_stat/function0
> > >
> > >   Function                               Hit    Time            Avg             s^2
> > >   --------                               ---    ----            ---             ---
> > >   bcm2835_handle_irq                  361347    219567633 us     607.636 us      1485199 us  
> > >   __handle_domain_irq                1082482    212639551 us     196.437 us      3642030 us  
> > >   generic_handle_irq                 1082482    100592051 us     92.927 us       50511334 us 
> > >   irq_exit                           1082482    98197771 us     90.715 us       29649040 us 
> > >   handle_level_irq                   1082482    95812379 us     88.511 us       51910093 us 
> > 
> > If I'm reading this output right, we're spending half of our interrupt
> > processing time in irq_exit(), so even if dwc2's interrupt was free (the
> > generic_handle_irq() chain), we'd be eating about half the CPU getting
> > back out of the interrupt handler, right?
> > 
> > I don't really know anything about DWC2 or USB, but is there any way we
> > could mitigate the interrupt frequency with this hardware?  If nothing
> > else, could we loop reading gintsts until it reads back 0?
> 
> first of all i updated my kernel to 4.11rc7 and the issue still occures. Today i had some time to investigate this issue further and i made some interesting observations:
> 
> 1. The lockup doesn't occure always after starting usb_test. In rare cases i was able to run the program without lockup.
> 2. In case the lockup occured we are always able to "rescue" the BCM2835 from this state by sending some serial data to the PL2303.

Based on this scenario i patched the interrupt routine to detect the interrupt storm and normal condition. So i can dump the global and host registers in both situations (bad and goodcase).

Here is the diff between both register dumps, maybe someone see something interesting:

--- badcase.txt	2017-04-25 18:02:59.000000000 +0000
+++ goodcase.txt	2017-04-25 18:02:59.000000000 +0000
@@ -4,12 +4,12 @@
  dwc2_dump_global_registers: GAHBCFG	 @0xCC850008 : 0x00000031
  dwc2_dump_global_registers: GUSBCFG	 @0xCC85000C : 0x20001700
  dwc2_dump_global_registers: GRSTCTL	 @0xCC850010 : 0x80000000
- dwc2_dump_global_registers: GINTSTS	 @0xCC850014 : 0x06000021
+ dwc2_dump_global_registers: GINTSTS	 @0xCC850014 : 0x06000029
  dwc2_dump_global_registers: GINTMSK	 @0xCC850018 : 0xF300080E
- dwc2_dump_global_registers: GRXSTSR	 @0xCC85001C : 0x00070044
+ dwc2_dump_global_registers: GRXSTSR	 @0xCC85001C : 0x000F0001
  dwc2_dump_global_registers: GRXFSIZ	 @0xCC850024 : 0x00000306
  dwc2_dump_global_registers: GNPTXFSIZ	 @0xCC850028 : 0x01000306
- dwc2_dump_global_registers: GNPTXSTS	 @0xCC85002C : 0x1D080100
+ dwc2_dump_global_registers: GNPTXSTS	 @0xCC85002C : 0x18080100
  dwc2_dump_global_registers: GI2CCTL	 @0xCC850030 : 0x00000000
  dwc2_dump_global_registers: GPVNDCTL	 @0xCC850034 : 0x00000000
  dwc2_dump_global_registers: GGPIO	 @0xCC850038 : 0x00000000
@@ -27,64 +27,64 @@
  dwc2_dump_host_registers: Host Global Registers
  dwc2_dump_host_registers: HCFG	 @0xCC850400 : 0x00000000
  dwc2_dump_host_registers: HFIR	 @0xCC850404 : 0x00001D4B
- dwc2_dump_host_registers: HFNUM	 @0xCC850408 : 0x0BF03CA1
- dwc2_dump_host_registers: HPTXSTS	 @0xCC850410 : 0x29080200
- dwc2_dump_host_registers: HAINT	 @0xCC850414 : 0x00000002
- dwc2_dump_host_registers: HAINTMSK	 @0xCC850418 : 0x00000002
+ dwc2_dump_host_registers: HFNUM	 @0xCC850408 : 0x029F3421
+ dwc2_dump_host_registers: HPTXSTS	 @0xCC850410 : 0x25080200
+ dwc2_dump_host_registers: HAINT	 @0xCC850414 : 0x00000001
+ dwc2_dump_host_registers: HAINTMSK	 @0xCC850418 : 0x00000001
  dwc2_dump_host_registers: HPRT0	 @0xCC850440 : 0x00001005
  dwc2_dump_host_registers: Host Channel 0 Specific Registers
- dwc2_dump_host_registers: HCCHAR	 @0xCC850500 : 0x01189840
- dwc2_dump_host_registers: HCSPLT	 @0xCC850504 : 0x8001C102
- dwc2_dump_host_registers: HCINT	 @0xCC850508 : 0x00000000
- dwc2_dump_host_registers: HCINTMSK	 @0xCC85050C : 0x00000000
- dwc2_dump_host_registers: HCTSIZ	 @0xCC850510 : 0x00080040
- dwc2_dump_host_registers: HCDMA	 @0xCC850514 : 0x4A07D300
+ dwc2_dump_host_registers: HCCHAR	 @0xCC850500 : 0x013C880A
+ dwc2_dump_host_registers: HCSPLT	 @0xCC850504 : 0x8000C102
+ dwc2_dump_host_registers: HCINT	 @0xCC850508 : 0x00000022
+ dwc2_dump_host_registers: HCINTMSK	 @0xCC85050C : 0x00000006
+ dwc2_dump_host_registers: HCTSIZ	 @0xCC850510 : 0x4008000A
+ dwc2_dump_host_registers: HCDMA	 @0xCC850514 : 0x4A093840
  dwc2_dump_host_registers: Host Channel 1 Specific Registers
  dwc2_dump_host_registers: HCCHAR	 @0xCC850520 : 0x01189840
  dwc2_dump_host_registers: HCSPLT	 @0xCC850524 : 0x8000C102
- dwc2_dump_host_registers: HCINT	 @0xCC850528 : 0x00000022
- dwc2_dump_host_registers: HCINTMSK	 @0xCC85052C : 0x00000006
+ dwc2_dump_host_registers: HCINT	 @0xCC850528 : 0x00000000
+ dwc2_dump_host_registers: HCINTMSK	 @0xCC85052C : 0x00000000
  dwc2_dump_host_registers: HCTSIZ	 @0xCC850530 : 0x00080040
- dwc2_dump_host_registers: HCDMA	 @0xCC850534 : 0x4A07D300
+ dwc2_dump_host_registers: HCDMA	 @0xCC850534 : 0x4A07D500
  dwc2_dump_host_registers: Host Channel 2 Specific Registers
  dwc2_dump_host_registers: HCCHAR	 @0xCC850540 : 0x01189840
  dwc2_dump_host_registers: HCSPLT	 @0xCC850544 : 0x8001C102
  dwc2_dump_host_registers: HCINT	 @0xCC850548 : 0x00000000
  dwc2_dump_host_registers: HCINTMSK	 @0xCC85054C : 0x00000000
  dwc2_dump_host_registers: HCTSIZ	 @0xCC850550 : 0x00080040
- dwc2_dump_host_registers: HCDMA	 @0xCC850554 : 0x4A07D300
+ dwc2_dump_host_registers: HCDMA	 @0xCC850554 : 0x4A07D500
  dwc2_dump_host_registers: Host Channel 3 Specific Registers
- dwc2_dump_host_registers: HCCHAR	 @0xCC850560 : 0x01189840
- dwc2_dump_host_registers: HCSPLT	 @0xCC850564 : 0x8001C102
+ dwc2_dump_host_registers: HCCHAR	 @0xCC850560 : 0x00D08040
+ dwc2_dump_host_registers: HCSPLT	 @0xCC850564 : 0x00000000
  dwc2_dump_host_registers: HCINT	 @0xCC850568 : 0x00000000
  dwc2_dump_host_registers: HCINTMSK	 @0xCC85056C : 0x00000000
- dwc2_dump_host_registers: HCTSIZ	 @0xCC850570 : 0x00080040
- dwc2_dump_host_registers: HCDMA	 @0xCC850574 : 0x4A07D300
+ dwc2_dump_host_registers: HCTSIZ	 @0xCC850570 : 0x8000003C
+ dwc2_dump_host_registers: HCDMA	 @0xCC850574 : 0x4A3B7544
  dwc2_dump_host_registers: Host Channel 4 Specific Registers
- dwc2_dump_host_registers: HCCHAR	 @0xCC850580 : 0x01189840
- dwc2_dump_host_registers: HCSPLT	 @0xCC850584 : 0x8001C102
+ dwc2_dump_host_registers: HCCHAR	 @0xCC850580 : 0x00D00040
+ dwc2_dump_host_registers: HCSPLT	 @0xCC850584 : 0x00000000
  dwc2_dump_host_registers: HCINT	 @0xCC850588 : 0x00000000
  dwc2_dump_host_registers: HCINTMSK	 @0xCC85058C : 0x00000000
- dwc2_dump_host_registers: HCTSIZ	 @0xCC850590 : 0x00080040
- dwc2_dump_host_registers: HCDMA	 @0xCC850594 : 0x4A07D300
+ dwc2_dump_host_registers: HCTSIZ	 @0xCC850590 : 0x20000008
+ dwc2_dump_host_registers: HCDMA	 @0xCC850594 : 0x4A3B74C8
  dwc2_dump_host_registers: Host Channel 5 Specific Registers
  dwc2_dump_host_registers: HCCHAR	 @0xCC8505A0 : 0x01189840
  dwc2_dump_host_registers: HCSPLT	 @0xCC8505A4 : 0x8000C102
  dwc2_dump_host_registers: HCINT	 @0xCC8505A8 : 0x00000000
  dwc2_dump_host_registers: HCINTMSK	 @0xCC8505AC : 0x00000000
  dwc2_dump_host_registers: HCTSIZ	 @0xCC8505B0 : 0x00080040
- dwc2_dump_host_registers: HCDMA	 @0xCC8505B4 : 0x4A07D300
+ dwc2_dump_host_registers: HCDMA	 @0xCC8505B4 : 0x4A07D500
  dwc2_dump_host_registers: Host Channel 6 Specific Registers
- dwc2_dump_host_registers: HCCHAR	 @0xCC8505C0 : 0x01189840
- dwc2_dump_host_registers: HCSPLT	 @0xCC8505C4 : 0x8001C102
+ dwc2_dump_host_registers: HCCHAR	 @0xCC8505C0 : 0x00D00040
+ dwc2_dump_host_registers: HCSPLT	 @0xCC8505C4 : 0x00000000
  dwc2_dump_host_registers: HCINT	 @0xCC8505C8 : 0x00000000
  dwc2_dump_host_registers: HCINTMSK	 @0xCC8505CC : 0x00000000
- dwc2_dump_host_registers: HCTSIZ	 @0xCC8505D0 : 0x00080040
- dwc2_dump_host_registers: HCDMA	 @0xCC8505D4 : 0x4A07D300
+ dwc2_dump_host_registers: HCTSIZ	 @0xCC8505D0 : 0x00000000
+ dwc2_dump_host_registers: HCDMA	 @0xCC8505D4 : 0x4B080000
  dwc2_dump_host_registers: Host Channel 7 Specific Registers
- dwc2_dump_host_registers: HCCHAR	 @0xCC8505E0 : 0x01189840
+ dwc2_dump_host_registers: HCCHAR	 @0xCC8505E0 : 0x013C880A
  dwc2_dump_host_registers: HCSPLT	 @0xCC8505E4 : 0x8001C102
  dwc2_dump_host_registers: HCINT	 @0xCC8505E8 : 0x00000000
  dwc2_dump_host_registers: HCINTMSK	 @0xCC8505EC : 0x00000000
- dwc2_dump_host_registers: HCTSIZ	 @0xCC8505F0 : 0x00080040
- dwc2_dump_host_registers: HCDMA	 @0xCC8505F4 : 0x4A07D300
+ dwc2_dump_host_registers: HCTSIZ	 @0xCC8505F0 : 0x4008000A
+ dwc2_dump_host_registers: HCDMA	 @0xCC8505F4 : 0x4A093840

  reply	other threads:[~2017-04-25 18:11 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-31 20:34 usb: dwc2: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 146s Stefan Wahren
2016-11-01  8:26 ` Michael Zoran
2017-04-17 11:05 ` Stefan Wahren
2017-04-17 22:37   ` Doug Anderson
2017-04-18  8:07     ` Stefan Wahren
2017-04-18 20:08       ` Doug Anderson
2017-04-18 20:25       ` Stefan Wahren
2017-04-18 20:41         ` Doug Anderson
2017-04-18 20:53           ` Stefan Wahren
2017-04-19 20:25           ` Stefan Wahren
2017-04-19 21:47             ` Doug Anderson
2017-04-20  7:46               ` Stefan Wahren
2017-04-20 16:19                 ` Doug Anderson
2017-04-20 18:54             ` Eric Anholt
2017-04-20 19:45               ` Doug Anderson
2017-04-20 19:57                 ` Eric Anholt
2017-04-20 20:37                   ` Doug Anderson
2017-04-22 20:50               ` Stefan Wahren
2017-04-25 18:11                 ` Stefan Wahren [this message]
2017-05-08 20:22                   ` Stefan Wahren
2017-05-10 16:31                     ` Johan Hovold
2017-05-10 23:50                       ` Doug Anderson
2017-05-13 12:28                         ` Stefan Wahren
2017-10-16 20:49                           ` Julius Werner
2017-10-17  8:52                             ` Johan Hovold
2017-10-25 21:22                             ` Doug Anderson
2017-04-19  8:56         ` Johan Hovold
2017-04-19  9:12           ` Stefan Wahren
2017-04-19  9:55             ` Johan Hovold
2017-04-17 23:45   ` Heiko Stuebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2127594073.298820.1493143869792@email.1und1.de \
    --to=stefan.wahren@i2se.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.