From: Mark Rutland <mark.rutland@arm.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marco Elver <elver@google.com>,
Steven Rostedt <rostedt@goodmis.org>,
Anders Roxell <anders.roxell@linaro.org>,
Andrew Morton <akpm@linux-foundation.org>,
Alexander Potapenko <glider@google.com>,
Dmitry Vyukov <dvyukov@google.com>, Jann Horn <jannh@google.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>,
kasan-dev <kasan-dev@googlegroups.com>,
rcu@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
linux-arm-kernel@lists.infradead.org
Subject: Re: linux-next: stall warnings and deadlock on Arm64 (was: [PATCH] kfence: Avoid stalling...)
Date: Fri, 20 Nov 2020 18:02:06 +0000
Message-ID: <20201120180206.GF2328@C02TD0UTHF1T.local>
In-Reply-To: <20201120173824.GJ1437@paulmck-ThinkPad-P72>
On Fri, Nov 20, 2020 at 09:38:24AM -0800, Paul E. McKenney wrote:
> On Fri, Nov 20, 2020 at 03:22:00PM +0000, Mark Rutland wrote:
> > On Fri, Nov 20, 2020 at 06:39:28AM -0800, Paul E. McKenney wrote:
> > > On Fri, Nov 20, 2020 at 03:19:28PM +0100, Marco Elver wrote:
> > > > I found that disabling ftrace for some of kernel/rcu (see below) solved
> > > > the stalls (and any mention of deadlocks as a side-effect I assume),
> > > > resulting in successful boot.
> > > >
> > > > Does that provide any additional clues? I tried to narrow it down to 1-2
> > > > files, but that doesn't seem to work.
> > >
> > > There were similar issues during the x86/entry work. Are the ARM guys
> > > doing arm64/entry work now?
> >
> > I'm currently looking at it. I had been trying to shift things to C for
> > a while, and right now I'm trying to fix the lockdep state tracking,
> > which is requiring untangling lockdep/rcu/tracing.
> >
> > The main issue I see remaining atm is that we don't save/restore the
> > lockdep state over exceptions taken from kernel to kernel. That could
> > result in lockdep thinking IRQs are disabled when they're actually
> > enabled (because code in the nested context might do a save/restore
> > while IRQs are disabled, then return to a context where IRQs are
> > enabled), but AFAICT shouldn't result in the inverse in most cases since
> > the non-NMI handlers all call lockdep_hardirqs_disabled().
> >
> > I'm at a loss to explain the rcu vs ftrace bits, so if you have any
> > pointers to the issues seen with the x86 rework that'd be quite handy.
>
> There were several over a number of months. I especially recall issues
> with the direct-from-idle execution of smp_call_function*() handlers,
> and also with some of the special cases in the entry code, for example,
> reentering the kernel from the kernel. This latter could cause RCU to
> not be watching when it should have been or vice versa.

Ah; those are precisely the cases I'm currently fixing, so if we're
lucky this is an indirect result of one of those rather than a novel
source of pain...

> I would of course be most aware of the issues that impinged on RCU
> and that were located by rcutorture. This is actually not hard to run,
> especially if the ARM bits in the scripting have managed to avoid bitrot.
> The "modprobe rcutorture" approach has fewer dependencies. Either way:
> https://paulmck.livejournal.com/57769.html and later posts.

That is a very good idea. I'd been relying on Syzkaller to tickle the
issue, but the torture infrastructure is a much better fit for this
problem. I hadn't realised how comprehensive the scripting was, thanks
for this!

I'll see about giving that a go once I have the irq-from-idle cases
sorted, as those are very obviously broken if you hack
trace_hardirqs_{on,off}() to check that RCU is watching.
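
To illustrate the kind of check meant here, below is a toy user-space
model in Python. It is not kernel code: the class, its methods, and the
two orderings are made up for this sketch, and the "broken" ordering is
only an assumption about how an irq-from-idle path could trip such a
check, not the actual arm64 entry sequence.

```python
# Toy model only: none of these names are real kernel APIs.
class Cpu:
    def __init__(self):
        self.rcu_watching = True
        self.violations = []

    def trace_hardirqs(self, state):
        # The "hack": complain if IRQ tracing runs while RCU is not watching.
        if not self.rcu_watching:
            self.violations.append(f"trace_hardirqs_{state}")

    def irq_from_idle(self, rcu_first):
        # In idle, RCU has stopped watching this CPU.
        self.rcu_watching = False
        if rcu_first:
            # Assumed-good ordering: inform RCU before any tracing.
            self.rcu_watching = True
            self.trace_hardirqs("off")
            # ... IRQ handler would run here ...
            self.trace_hardirqs("on")
            self.rcu_watching = False  # return to idle
        else:
            # Assumed-broken ordering: tracing happens outside the
            # window in which RCU is watching.
            self.trace_hardirqs("off")
            self.rcu_watching = True
            # ... IRQ handler would run here ...
            self.rcu_watching = False
            self.trace_hardirqs("on")
        return list(self.violations)
```

In this model, Cpu().irq_from_idle(rcu_first=False) reports both tracing
calls as violations, while rcu_first=True reports none.
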
Thanks,
Mark.