From: "Paul E. McKenney" <paulmck@kernel.org>
To: Qian Cai <cai@redhat.com>
Cc: Will Deacon <will@kernel.org>,
	catalin.marinas@arm.com, kernel-team@android.com,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64/smp: Move rcu_cpu_starting() earlier
Date: Thu, 5 Nov 2020 15:28:13 -0800
Message-ID: <20201105232813.GR3249@paulmck-ThinkPad-P72>
In-Reply-To: <3b4c324abdabd12d7bd5346c18411e667afe6a55.camel@redhat.com>

On Thu, Nov 05, 2020 at 06:02:49PM -0500, Qian Cai wrote:
> On Thu, 2020-11-05 at 22:22 +0000, Will Deacon wrote:
> > On Fri, Oct 30, 2020 at 04:33:25PM +0000, Will Deacon wrote:
> > > On Wed, 28 Oct 2020 14:26:14 -0400, Qian Cai wrote:
> > > > The call to rcu_cpu_starting() in secondary_start_kernel() is not early
> > > > enough in the CPU-hotplug onlining process, which results in lockdep
> > > > splats as follows:
> > > > 
> > > >  WARNING: suspicious RCU usage
> > > >  -----------------------------
> > > >  kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
> > > > 
> > > > [...]
> > > 
> > > Applied to arm64 (for-next/fixes), thanks!
> > > 
> > > [1/1] arm64/smp: Move rcu_cpu_starting() earlier
> > >       https://git.kernel.org/arm64/c/ce3d31ad3cac
> > 
> > Hmm, this patch has caused a regression in the case that we fail to
> > online a CPU because it has incompatible CPU features and so we park it
> > in cpu_die_early(). We now get an endless spew of RCU stalls because the
> > core will never come online, but is being tracked by RCU. So I'm tempted
> > to revert this and live with the lockdep warning while we figure out a
> > proper fix.
> > 
> > What's the correct way to undo rcu_cpu_starting(), given that we cannot
> > invoke the full hotplug machinery here? Is it correct to call
> > rcutree_dying_cpu() on the bad CPU and then rcutree_dead_cpu() from the
> > CPU doing cpu_up(), or should we do something else?
> It looks to me that rcu_report_dead() does the opposite of rcu_cpu_starting(),
> so lift rcu_report_dead() out of CONFIG_HOTPLUG_CPU and use it there to rewind,
> Paul?

Yes, rcu_report_dead() should do the trick.  Presumably the earlier
online-time CPU-hotplug notifiers are also unwound?

							Thanx, Paul
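
To illustrate why a parked CPU that went through rcu_cpu_starting() keeps
grace periods from completing, and why rcu_report_dead() rewinds that, here
is a minimal userspace sketch. It is a toy model only: struct toy_rcu and
every toy_* function are hypothetical names invented for this illustration,
and the two bitmasks merely stand in for RCU's real per-node bookkeeping.
It is not the kernel implementation and not the eventual arm64 fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Hypothetical toy state; NOT the kernel's struct rcu_node. */
struct toy_rcu {
	uint32_t tracked; /* CPUs that RCU believes are online */
	uint32_t pending; /* CPUs still owing a quiescent state this GP */
};

/* Models the effect of rcu_cpu_starting(): begin tracking this CPU. */
static void toy_cpu_starting(struct toy_rcu *r, unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	r->tracked |= UINT32_C(1) << cpu;
}

/* Models the effect of rcu_report_dead(): stop tracking this CPU and
 * drop whatever it still owed to the current grace period. */
static void toy_report_dead(struct toy_rcu *r, unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	r->tracked &= ~(UINT32_C(1) << cpu);
	r->pending &= ~(UINT32_C(1) << cpu);
}

/* A new grace period waits on every CPU tracked at its start. */
static void toy_gp_start(struct toy_rcu *r)
{
	r->pending = r->tracked;
}

/* A running CPU reports that it passed through a quiescent state. */
static void toy_report_qs(struct toy_rcu *r, unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	r->pending &= ~(UINT32_C(1) << cpu);
}

/* The grace period ends only once no tracked CPU is still pending. */
static bool toy_gp_done(const struct toy_rcu *r)
{
	return r->pending == 0;
}
```

In this model, a CPU that calls toy_cpu_starting() and is then parked never
calls toy_report_qs(), so toy_gp_done() stays false indefinitely, which
corresponds to the stall warnings described above. Calling toy_report_dead()
for that CPU lets the grace period finish.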


Thread overview:
2020-10-28 18:26 [PATCH] arm64/smp: Move rcu_cpu_starting() earlier Qian Cai
2020-10-28 21:00 ` Paul E. McKenney
2020-10-29  9:10 ` Will Deacon
2020-10-29 13:17   ` Qian Cai
2020-10-30  8:15     ` Will Deacon
2020-10-29 14:09   ` Paul E. McKenney
2020-10-30 16:33 ` Will Deacon
2020-11-05 22:22   ` Will Deacon
2020-11-05 23:02     ` Qian Cai
2020-11-05 23:28       ` Paul E. McKenney [this message]
2020-11-06  2:15         ` Qian Cai
2020-11-06  4:07           ` Paul E. McKenney
2020-11-06 10:37           ` Will Deacon
2020-11-06 12:48             ` Qian Cai
