All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marco Elver <elver@google.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Anders Roxell <anders.roxell@linaro.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexander Potapenko <glider@google.com>,
	Dmitry Vyukov <dvyukov@google.com>, Jann Horn <jannh@google.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	kasan-dev <kasan-dev@googlegroups.com>,
	rcu@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: linux-next: stall warnings and deadlock on Arm64 (was: [PATCH] kfence: Avoid stalling...)
Date: Fri, 20 Nov 2020 15:19:28 +0100	[thread overview]
Message-ID: <20201120141928.GB3120165@elver.google.com> (raw)
In-Reply-To: <20201119213512.GB1437@paulmck-ThinkPad-P72>

On Thu, Nov 19, 2020 at 01:35PM -0800, Paul E. McKenney wrote:
> On Thu, Nov 19, 2020 at 08:38:19PM +0100, Marco Elver wrote:
> > On Thu, Nov 19, 2020 at 10:48AM -0800, Paul E. McKenney wrote:
> > > On Thu, Nov 19, 2020 at 06:02:59PM +0100, Marco Elver wrote:
> 
> [ . . . ]
> 
> > > > I can try bisection again, or reverting some commits that might be
> > > > suspicious? But we'd need some selection of suspicious commits.
> > > 
> > > The report claims that one of the rcu_node ->lock fields is held
> > > with interrupts enabled, which would indeed be bad.  Except that all
> > > of the stack traces that it shows have these locks held within the
> > > scheduling-clock interrupt handler.  Now with the "rcu: Don't invoke
> > > try_invoke_on_locked_down_task() with irqs disabled" but without the
> > > "sched/core: Allow try_invoke_on_locked_down_task() with irqs disabled"
> > > commit, I understand why.  With both, I don't see how this happens.
> > 
> > I'm at a loss, but happy to keep bisecting and trying patches. I'm also
> > considering:
> > 
> > 	Is it the compiler? Probably not, I tried 2 versions of GCC.
> > 
> > 	Can we trust lockdep to precisely know IRQ state? I know there's
> > 	been some recent work around this, but hopefully we're not
> > 	affected here?
> > 
> > 	Is QEMU buggy?
> > 
> > > At this point, I am reduced to adding lockdep_assert_irqs_disabled()
> > > calls at various points in that code, as shown in the patch below.
> > > 
> > > At this point, I would guess that your first priority would be the
> > > initial bug rather than this following issue, but you never know, this
> > > might well help diagnose the initial bug.
> > 
> > I don't mind either way. I'm worried deadlocking the whole system might
> > be worse.
> 
> Here is another set of lockdep_assert_irqs_disabled() calls on the
> off-chance that they actually find something.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit bcca5277df3f24db15e15ccc8b05ecf346d05169
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Thu Nov 19 13:30:33 2020 -0800
> 
>     rcu: Add lockdep_assert_irqs_disabled() to raw_spin_unlock_rcu_node() macros

None of those triggered either.

I found that disabling ftrace for some of kernel/rcu (see below) solved
the stalls (and any mention of deadlocks as a side-effect I assume),
resulting in successful boot.

Does that provide any additional clues? I tried to narrow it down to 1-2
files, but that doesn't seem to work.

Thanks,
-- Marco

------ >8 ------

diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 0cfb009a99b9..678b4b094f94 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -3,6 +3,13 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
+ifdef CONFIG_FUNCTION_TRACER
+CFLAGS_REMOVE_update.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_sync.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_srcutree.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_tree.o = $(CC_FLAGS_FTRACE)
+endif
+
 ifeq ($(CONFIG_KCSAN),y)
 KBUILD_CFLAGS += -g -fno-omit-frame-pointer
 endif

WARNING: multiple messages have this Message-ID (diff)
From: Marco Elver <elver@google.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Anders Roxell <anders.roxell@linaro.org>,
	Jann Horn <jannh@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	rcu@vger.kernel.org, Linux-MM <linux-mm@kvack.org>,
	Alexander Potapenko <glider@google.com>,
	kasan-dev <kasan-dev@googlegroups.com>, Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: linux-next: stall warnings and deadlock on Arm64 (was: [PATCH] kfence: Avoid stalling...)
Date: Fri, 20 Nov 2020 15:19:28 +0100	[thread overview]
Message-ID: <20201120141928.GB3120165@elver.google.com> (raw)
In-Reply-To: <20201119213512.GB1437@paulmck-ThinkPad-P72>

On Thu, Nov 19, 2020 at 01:35PM -0800, Paul E. McKenney wrote:
> On Thu, Nov 19, 2020 at 08:38:19PM +0100, Marco Elver wrote:
> > On Thu, Nov 19, 2020 at 10:48AM -0800, Paul E. McKenney wrote:
> > > On Thu, Nov 19, 2020 at 06:02:59PM +0100, Marco Elver wrote:
> 
> [ . . . ]
> 
> > > > I can try bisection again, or reverting some commits that might be
> > > > suspicious? But we'd need some selection of suspicious commits.
> > > 
> > > The report claims that one of the rcu_node ->lock fields is held
> > > with interrupts enabled, which would indeed be bad.  Except that all
> > > of the stack traces that it shows have these locks held within the
> > > scheduling-clock interrupt handler.  Now with the "rcu: Don't invoke
> > > try_invoke_on_locked_down_task() with irqs disabled" but without the
> > > "sched/core: Allow try_invoke_on_locked_down_task() with irqs disabled"
> > > commit, I understand why.  With both, I don't see how this happens.
> > 
> > I'm at a loss, but happy to keep bisecting and trying patches. I'm also
> > considering:
> > 
> > 	Is it the compiler? Probably not, I tried 2 versions of GCC.
> > 
> > 	Can we trust lockdep to precisely know IRQ state? I know there's
> > 	been some recent work around this, but hopefully we're not
> > 	affected here?
> > 
> > 	Is QEMU buggy?
> > 
> > > At this point, I am reduced to adding lockdep_assert_irqs_disabled()
> > > calls at various points in that code, as shown in the patch below.
> > > 
> > > At this point, I would guess that your first priority would be the
> > > initial bug rather than this following issue, but you never know, this
> > > might well help diagnose the initial bug.
> > 
> > I don't mind either way. I'm worried deadlocking the whole system might
> > be worse.
> 
> Here is another set of lockdep_assert_irqs_disabled() calls on the
> off-chance that they actually find something.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit bcca5277df3f24db15e15ccc8b05ecf346d05169
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Thu Nov 19 13:30:33 2020 -0800
> 
>     rcu: Add lockdep_assert_irqs_disabled() to raw_spin_unlock_rcu_node() macros

None of those triggered either.

I found that disabling ftrace for some of kernel/rcu (see below) solved
the stalls (and any mention of deadlocks as a side-effect I assume),
resulting in successful boot.

Does that provide any additional clues? I tried to narrow it down to 1-2
files, but that doesn't seem to work.

Thanks,
-- Marco

------ >8 ------

diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 0cfb009a99b9..678b4b094f94 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -3,6 +3,13 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
+ifdef CONFIG_FUNCTION_TRACER
+CFLAGS_REMOVE_update.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_sync.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_srcutree.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_tree.o = $(CC_FLAGS_FTRACE)
+endif
+
 ifeq ($(CONFIG_KCSAN),y)
 KBUILD_CFLAGS += -g -fno-omit-frame-pointer
 endif

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2020-11-20 14:19 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-10 13:53 [PATCH] kfence: Avoid stalling work queue task without allocations Marco Elver
2020-11-10 13:53 ` Marco Elver
2020-11-10 14:25 ` Dmitry Vyukov
2020-11-10 14:25   ` Dmitry Vyukov
2020-11-10 14:53   ` Marco Elver
2020-11-10 14:53     ` Marco Elver
2020-11-10 23:23 ` Anders Roxell
2020-11-10 23:23   ` Anders Roxell
2020-11-11  8:29   ` Marco Elver
2020-11-11  8:29     ` Marco Elver
2020-11-11 13:38     ` Marco Elver
2020-11-11 18:05       ` Steven Rostedt
2020-11-11 18:23         ` Paul E. McKenney
2020-11-11 18:34           ` Marco Elver
2020-11-11 19:21             ` Paul E. McKenney
2020-11-11 20:21               ` Marco Elver
2020-11-12  0:11                 ` Paul E. McKenney
2020-11-12 12:49                   ` Marco Elver
2020-11-12 12:49                     ` Marco Elver
2020-11-12 16:14                     ` Marco Elver
2020-11-12 17:54                       ` Paul E. McKenney
2020-11-12 18:12                         ` Marco Elver
2020-11-12 20:00                           ` Paul E. McKenney
2020-11-13 11:06                             ` Marco Elver
2020-11-13 17:20                               ` Paul E. McKenney
2020-11-13 17:57                         ` Paul E. McKenney
2020-11-17 10:52                           ` Marco Elver
2020-11-17 18:29                             ` Paul E. McKenney
2020-11-18 22:56                               ` Marco Elver
2020-11-18 23:38                                 ` Paul E. McKenney
2020-11-19 12:53                                   ` Marco Elver
2020-11-19 15:14                                     ` Paul E. McKenney
2020-11-19 17:02                                       ` Marco Elver
2020-11-19 18:48                                         ` Paul E. McKenney
2020-11-19 19:38                                           ` linux-next: stall warnings and deadlock on Arm64 (was: [PATCH] kfence: Avoid stalling...) Marco Elver
2020-11-19 19:38                                             ` Marco Elver
2020-11-19 21:35                                             ` Paul E. McKenney
2020-11-19 21:35                                               ` Paul E. McKenney
2020-11-19 22:53                                               ` Will Deacon
2020-11-19 22:53                                                 ` Will Deacon
2020-11-20 10:30                                                 ` Mark Rutland
2020-11-20 10:30                                                   ` Mark Rutland
2020-11-20 14:03                                                   ` Marco Elver
2020-11-20 14:03                                                     ` Marco Elver
2020-11-23 19:32                                                     ` Mark Rutland
2020-11-23 19:32                                                       ` Mark Rutland
2020-11-24 14:03                                                       ` Marco Elver
2020-11-24 14:03                                                         ` Marco Elver
2020-11-24 15:01                                                         ` Paul E. McKenney
2020-11-24 15:01                                                           ` Paul E. McKenney
2020-11-24 19:43                                                           ` Mark Rutland
2020-11-24 19:43                                                             ` Mark Rutland
2020-11-24 20:32                                                             ` Steven Rostedt
2020-11-24 20:32                                                               ` Steven Rostedt
2020-11-24 19:30                                                         ` Mark Rutland
2020-11-24 19:30                                                           ` Mark Rutland
2020-11-25  9:45                                                           ` Marco Elver
2020-11-25  9:45                                                             ` Marco Elver
2020-11-25 10:28                                                             ` Mark Rutland
2020-11-25 10:28                                                               ` Mark Rutland
2020-11-20 14:19                                               ` Marco Elver [this message]
2020-11-20 14:19                                                 ` Marco Elver
2020-11-20 14:39                                                 ` Paul E. McKenney
2020-11-20 14:39                                                   ` Paul E. McKenney
2020-11-20 15:22                                                   ` Mark Rutland
2020-11-20 15:22                                                     ` Mark Rutland
2020-11-20 17:38                                                     ` Paul E. McKenney
2020-11-20 17:38                                                       ` Paul E. McKenney
2020-11-20 18:02                                                       ` Mark Rutland
2020-11-20 18:02                                                         ` Mark Rutland
2020-11-20 18:57                                                         ` Paul E. McKenney
2020-11-20 18:57                                                           ` Paul E. McKenney
2020-11-20 15:26                                                 ` Steven Rostedt
2020-11-20 15:26                                                   ` Steven Rostedt
2020-11-20 18:17                                                   ` Marco Elver
2020-11-20 18:17                                                     ` Marco Elver
2020-11-20 18:57                                                     ` Steven Rostedt
2020-11-20 18:57                                                       ` Steven Rostedt
2020-11-20 19:16                                                     ` Steven Rostedt
2020-11-20 19:16                                                       ` Steven Rostedt
2020-11-20 19:22                                                       ` Marco Elver
2020-11-20 19:22                                                         ` Marco Elver
2020-11-20 19:22                                                         ` Marco Elver
2020-11-20 19:27                                     ` [PATCH] kfence: Avoid stalling work queue task without allocations Steven Rostedt
2020-11-23 15:27                                       ` Marco Elver
2020-11-23 16:28                                         ` Steven Rostedt
2020-11-23 16:36                                           ` Steven Rostedt
2020-11-23 18:53                                             ` Marco Elver
2020-11-23 18:42                                           ` Steven Rostedt
2020-11-24  2:59                                             ` Boqun Feng
2020-11-24  3:44                                               ` Paul E. McKenney
2020-11-11 18:21       ` Paul E. McKenney
2020-11-11 15:01     ` Anders Roxell
2020-11-11 15:01       ` Anders Roxell
2020-11-11 15:22       ` Marco Elver
2020-11-11 15:22         ` Marco Elver

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201120141928.GB3120165@elver.google.com \
    --to=elver@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=anders.roxell@linaro.org \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=jannh@google.com \
    --cc=jiangshanlai@gmail.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.