All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@linuxfoundation.org>
To: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>, Michal Hocko <mhocko@kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page
Date: Tue, 18 Aug 2020 15:50:45 +0200	[thread overview]
Message-ID: <20200818135045.GA495837@kroah.com> (raw)
In-Reply-To: <alpine.LSU.2.11.2008052221440.8716@eggly.anvils>

On Wed, Aug 05, 2020 at 10:46:12PM -0700, Hugh Dickins wrote:
> On Mon, 27 Jul 2020, Greg KH wrote:
> > 
> > Linus just pointed me at this thread.
> > 
> > If you could run:
> > 	echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
> > and run the same workload to see if anything shows up in the log when
> > xhci crashes, that would be great.
> 
> Thanks, I tried that, and indeed it did have a story to tell:
> 
> ep 0x81 - asked for 16 bytes, 10 bytes untransferred
> ep 0x81 - asked for 16 bytes, 10 bytes untransferred
> ep 0x81 - asked for 16 bytes, 10 bytes untransferred
>    a very large number of lines like the above, then
> Cancel URB 00000000d81602f7, dev 4, ep 0x0, starting at offset 0xfffd42c0
> // Ding dong!
> ep 0x81 - asked for 16 bytes, 10 bytes untransferred
> Stopped on No-op or Link TRB for slot 1 ep 0
> xhci_drop_endpoint called for udev 000000005bc07fa6
> drop ep 0x81, slot id 1, new drop flags = 0x8, new add flags = 0x0
> add ep 0x81, slot id 1, new drop flags = 0x8, new add flags = 0x8
> xhci_check_bandwidth called for udev 000000005bc07fa6
> // Ding dong!
> Successful Endpoint Configure command
> Cancel URB 000000006b77d490, dev 4, ep 0x81, starting at offset 0x0
> // Ding dong!
> Stopped on No-op or Link TRB for slot 1 ep 2
> Removing canceled TD starting at 0x0 (dma).
> list_del corruption: prev(ffff8fdb4de7a130)->next should be ffff8fdb41697f88,
>    but is 6b6b6b6b6b6b6b6b; next(ffff8fdb4de7a130)->prev is 6b6b6b6b6b6b6b6b.
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:53!
> RIP: 0010:__list_del_entry_valid+0x8e/0xb0
> Call Trace:
>  <IRQ>
>  handle_cmd_completion+0x7d4/0x14f0 [xhci_hcd]
>  xhci_irq+0x242/0x1ea0 [xhci_hcd]
>  xhci_msi_irq+0x11/0x20 [xhci_hcd]
>  __handle_irq_event_percpu+0x48/0x2c0
>  handle_irq_event_percpu+0x32/0x80
>  handle_irq_event+0x4a/0x80
>  handle_edge_irq+0xd8/0x1b0
>  handle_irq+0x2b/0x50
>  do_IRQ+0xb6/0x1c0
>  common_interrupt+0x90/0x90
>  </IRQ>
> 
> Info provided for your interest, not expecting any response.
> The list_del info in there is non-standard, from a patch of mine:
> I find hashed addresses in debug output less than helpful.

Thanks for this, that is really odd.

> > 
> > Although if you are using an "older version" of the driver, there's not
> > much I can suggest except update to a newer one :)
> 
> Yes, I was reluctant to post any info, since really the ball is at our
> end of the court, not yours. I did have a go at bringing in the latest
> xhci driver instead, but quickly saw that was not a sensible task for
> me. And I did scan the git log of xhci changes (especially xhci-ring.c
> changes): thought I saw a likely relevant and easily applied fix commit,
> but in fact it made no difference here.
> 
> I suspect it's in part a hardware problem, but driver not recovering
> correctly. I've replaced the machine (but also noticed that the same
> crash has occasionally been seen on other machines). I'm sure it has
> no relevance to this unlock_page() thread, though it's quite possible
> that it's triggered under stress, and Linus's changes allowed greater
> stress.

I will be willing to blame hardware problems for this as well, but will
save this report in case something else shows up in the future, thanks!

greg k-h

  reply	other threads:[~2020-08-18 13:50 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-21  6:32 [RFC PATCH] mm: silence soft lockups from unlock_page Michal Hocko
2020-07-21 11:10 ` Qian Cai
2020-07-21 11:25   ` Michal Hocko
2020-07-21 11:44     ` Qian Cai
2020-07-21 12:17       ` Michal Hocko
2020-07-21 13:23         ` Qian Cai
2020-07-21 13:38           ` Michal Hocko
2020-07-21 14:15             ` Qian Cai
2020-07-21 14:17 ` Chris Down
2020-07-21 15:00   ` Michal Hocko
2020-07-21 15:33 ` Linus Torvalds
2020-07-21 15:33   ` Linus Torvalds
2020-07-21 15:49   ` Michal Hocko
2020-07-22 18:29   ` Linus Torvalds
2020-07-22 18:29     ` Linus Torvalds
2020-07-22 21:29     ` Hugh Dickins
2020-07-22 21:29       ` Hugh Dickins
2020-07-22 22:10       ` Linus Torvalds
2020-07-22 22:10         ` Linus Torvalds
2020-07-22 23:42         ` Linus Torvalds
2020-07-22 23:42           ` Linus Torvalds
2020-07-23  0:23           ` Linus Torvalds
2020-07-23  0:23             ` Linus Torvalds
2020-07-23 12:47           ` Oleg Nesterov
2020-07-23 17:32             ` Linus Torvalds
2020-07-23 17:32               ` Linus Torvalds
2020-07-23 18:01               ` Oleg Nesterov
2020-07-23 18:22                 ` Linus Torvalds
2020-07-23 18:22                   ` Linus Torvalds
2020-07-23 19:03                   ` Linus Torvalds
2020-07-23 19:03                     ` Linus Torvalds
2020-07-24 14:45                     ` Oleg Nesterov
2020-07-23 20:03               ` Linus Torvalds
2020-07-23 20:03                 ` Linus Torvalds
2020-07-23 23:11                 ` Hugh Dickins
2020-07-23 23:11                   ` Hugh Dickins
2020-07-23 23:43                   ` Linus Torvalds
2020-07-23 23:43                     ` Linus Torvalds
2020-07-24  0:07                     ` Hugh Dickins
2020-07-24  0:07                       ` Hugh Dickins
2020-07-24  0:46                       ` Linus Torvalds
2020-07-24  0:46                         ` Linus Torvalds
2020-07-24  3:45                         ` Hugh Dickins
2020-07-24  3:45                           ` Hugh Dickins
2020-07-24 15:24                     ` Oleg Nesterov
2020-07-24 17:32                       ` Linus Torvalds
2020-07-24 17:32                         ` Linus Torvalds
2020-07-24 23:25                         ` Linus Torvalds
2020-07-24 23:25                           ` Linus Torvalds
2020-07-25  2:08                           ` Hugh Dickins
2020-07-25  2:08                             ` Hugh Dickins
2020-07-25  2:46                             ` Linus Torvalds
2020-07-25  2:46                               ` Linus Torvalds
2020-07-25 10:14                           ` Oleg Nesterov
2020-07-25 18:48                             ` Linus Torvalds
2020-07-25 18:48                               ` Linus Torvalds
2020-07-25 19:27                               ` Oleg Nesterov
2020-07-25 19:51                                 ` Linus Torvalds
2020-07-25 19:51                                   ` Linus Torvalds
2020-07-26 13:57                                   ` Oleg Nesterov
2020-07-25 21:19                               ` Hugh Dickins
2020-07-25 21:19                                 ` Hugh Dickins
2020-07-26  4:22                                 ` Hugh Dickins
2020-07-26  4:22                                   ` Hugh Dickins
2020-07-26 20:30                                   ` Hugh Dickins
2020-07-26 20:30                                     ` Hugh Dickins
2020-07-26 20:41                                     ` Linus Torvalds
2020-07-26 20:41                                       ` Linus Torvalds
2020-07-26 22:09                                       ` Hugh Dickins
2020-07-26 22:09                                         ` Hugh Dickins
2020-07-27 19:35                                     ` Greg KH
2020-08-06  5:46                                       ` Hugh Dickins
2020-08-06  5:46                                         ` Hugh Dickins
2020-08-18 13:50                                         ` Greg KH [this message]
2020-08-06  5:21                                     ` Hugh Dickins
2020-08-06  5:21                                       ` Hugh Dickins
2020-08-06 17:07                                       ` Linus Torvalds
2020-08-06 17:07                                         ` Linus Torvalds
2020-08-06 18:00                                         ` Matthew Wilcox
2020-08-06 18:32                                           ` Linus Torvalds
2020-08-06 18:32                                             ` Linus Torvalds
2020-08-07 18:41                                             ` Hugh Dickins
2020-08-07 18:41                                               ` Hugh Dickins
2020-08-07 19:07                                               ` Linus Torvalds
2020-08-07 19:07                                                 ` Linus Torvalds
2020-08-07 19:35                                               ` Matthew Wilcox
2020-08-03 13:14                           ` Michal Hocko
2020-08-03 17:56                             ` Linus Torvalds
2020-08-03 17:56                               ` Linus Torvalds
2020-07-25  9:39                         ` Oleg Nesterov
2020-07-23  8:03     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200818135045.GA495837@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=oleg@redhat.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.