From: Peter Zijlstra <peterz@infradead.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>, Qian Cai <cai@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>, x86 <x86@kernel.org>,
	linux-kernel@vger.kernel.org, linux-tip-commits@vger.kernel.org,
	Linux Next Mailing List <linux-next@vger.kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>
Subject: Re: [tip: locking/core] lockdep: Fix lockdep recursion
Date: Thu, 15 Oct 2020 11:49:26 +0200
Message-ID: <20201015094926.GY2611@hirez.programming.kicks-ass.net>
In-Reply-To: <20201015034128.GA10260@paulmck-ThinkPad-P72>

On Wed, Oct 14, 2020 at 08:41:28PM -0700, Paul E. McKenney wrote:
> So the (untested) patch below (on top of the other two) moves the delay
> to rcu_gp_init(), in particular, to the first loop that traverses only
> the leaf rcu_node structures handling CPU hotplug.
> 
> Hopefully getting closer!

So, if I composed things right, we end up with this. Comments below.


--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1143,13 +1143,15 @@ bool rcu_lockdep_current_cpu_online(void
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
 	bool ret = false;
+	unsigned long seq;
 
 	if (in_nmi() || !rcu_scheduler_fully_active)
 		return true;
 	preempt_disable_notrace();
 	rdp = this_cpu_ptr(&rcu_data);
 	rnp = rdp->mynode;
-	if (rdp->grpmask & rcu_rnp_online_cpus(rnp))
+	seq = READ_ONCE(rnp->ofl_seq) & ~0x1;
+	if (rdp->grpmask & rcu_rnp_online_cpus(rnp) || seq != READ_ONCE(rnp->ofl_seq))
 		ret = true;
 	preempt_enable_notrace();
 	return ret;
@@ -1715,6 +1717,7 @@ static void rcu_strict_gp_boundary(void
  */
 static bool rcu_gp_init(void)
 {
+	unsigned long firstseq;
 	unsigned long flags;
 	unsigned long oldmask;
 	unsigned long mask;
@@ -1758,6 +1761,12 @@ static bool rcu_gp_init(void)
 	 */
 	rcu_state.gp_state = RCU_GP_ONOFF;
 	rcu_for_each_leaf_node(rnp) {
+		smp_mb(); // Pair with barriers used when updating ->ofl_seq to odd values.
+		firstseq = READ_ONCE(rnp->ofl_seq);
+		if (firstseq & 0x1)
+			while (firstseq == smp_load_acquire(&rnp->ofl_seq))
+				schedule_timeout_idle(1);  // Can't wake unless RCU is watching.
+		smp_mb(); // Pair with barriers used when updating ->ofl_seq to even values.
 		raw_spin_lock(&rcu_state.ofl_lock);
 		raw_spin_lock_irq_rcu_node(rnp);
 		if (rnp->qsmaskinit == rnp->qsmaskinitnext &&
@@ -4047,6 +4056,9 @@ void rcu_cpu_starting(unsigned int cpu)
 
 	rnp = rdp->mynode;
 	mask = rdp->grpmask;
+	WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
+	WARN_ON_ONCE(!(rnp->ofl_seq & 0x1));
+	smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
 	newcpu = !(rnp->expmaskinitnext & mask);
@@ -4064,6 +4076,9 @@ void rcu_cpu_starting(unsigned int cpu)
 	} else {
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	}
+	smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
+	WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
+	WARN_ON_ONCE(rnp->ofl_seq & 0x1);
 	smp_mb(); /* Ensure RCU read-side usage follows above initialization. */
 }
 
@@ -4091,6 +4106,9 @@ void rcu_report_dead(unsigned int cpu)
 
 	/* Remove outgoing CPU from mask in the leaf rcu_node structure. */
 	mask = rdp->grpmask;
+	WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
+	WARN_ON_ONCE(!(rnp->ofl_seq & 0x1));
+	smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
 	raw_spin_lock(&rcu_state.ofl_lock);
 	raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
 	rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq);
@@ -4103,6 +4121,9 @@ void rcu_report_dead(unsigned int cpu)
 	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask);
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	raw_spin_unlock(&rcu_state.ofl_lock);
+	smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
+	WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
+	WARN_ON_ONCE(rnp->ofl_seq & 0x1);
 
 	rdp->cpu_started = false;
 }
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -56,6 +56,7 @@ struct rcu_node {
 				/*  Initialized from ->qsmaskinitnext at the */
 				/*  beginning of each grace period. */
 	unsigned long qsmaskinitnext;
+	unsigned long ofl_seq;	/* CPU-hotplug operation sequence count. */
 				/* Online CPUs for next grace period. */
 	unsigned long expmask;	/* CPUs or groups that need to check in */
 				/*  to allow the current expedited GP */


Let's see if I can understand this.

 - we wrap online/offline in a seqcount, such that the count is odd
   while an operation is in progress. It uses full memory barriers such
   that, unlike a regular seqcount, it also orders later reads; is that
   important?

 - when odd, we ensure the CPU is seen as online; a notable detail seems
   to be that rcu_lockdep_current_cpu_online() is expected to be called
   in program order (PO) relative to the seqcount ops. It is unsafe when
   called concurrently with them. This seems sufficient for our goals
   today.

 - when odd, we delay the current gp.
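
To make the first two points concrete, here is a rough userspace sketch
of the scheme using C11 atomics. It is only an illustration of my
reading, not the kernel code: struct node, ofl_begin(), ofl_end() and
cpu_seen_online() are hypothetical names, atomic_fetch_add() stands in
for the WRITE_ONCE(x, x + 1) in the patch, and seq_cst fences stand in
for smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a leaf rcu_node; only two fields matter here. */
struct node {
	atomic_ulong ofl_seq;		/* odd while a hotplug op is in flight */
	atomic_ulong online_mask;	/* stand-in for ->qsmaskinitnext */
};

/* Enter the hotplug section: the count goes odd, then a full fence. */
void ofl_begin(struct node *n)
{
	unsigned long seq;

	seq = atomic_fetch_add_explicit(&n->ofl_seq, 1, memory_order_relaxed) + 1;
	assert(seq & 0x1);			/* mirrors WARN_ON_ONCE(!(seq & 0x1)) */
	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for smp_mb() */
}

/* Leave the hotplug section: a full fence, then the count goes even. */
void ofl_end(struct node *n)
{
	unsigned long seq;

	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for smp_mb() */
	seq = atomic_fetch_add_explicit(&n->ofl_seq, 1, memory_order_relaxed) + 1;
	assert(!(seq & 0x1));			/* mirrors WARN_ON_ONCE(seq & 0x1) */
}

/*
 * Shape of the check in rcu_lockdep_current_cpu_online(): report online
 * if the mask bit is set, or if the count is odd (rounding the count
 * down to even makes it differ from the live value exactly when odd).
 */
bool cpu_seen_online(struct node *n, unsigned long mask)
{
	unsigned long seq;

	seq = atomic_load_explicit(&n->ofl_seq, memory_order_relaxed) & ~0x1UL;
	return (atomic_load_explicit(&n->online_mask, memory_order_relaxed) & mask) ||
	       seq != atomic_load_explicit(&n->ofl_seq, memory_order_relaxed);
}
```

Called in program order around a mask update, cpu_seen_online() is true
for the whole odd window regardless of what the mask says, which is the
property the lockdep check wants.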


It is that last point where I think I'd like to suggest a change. Given
that both rcu_cpu_starting() and rcu_report_dead() (the naming is
slightly inconsistent) are run with IRQs disabled, spin-waiting seems
like a more natural match.

Also, I don't see the purpose of your smp_load_acquire(); nothing
happens between it and the full smp_mb() that follows.


--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1764,8 +1764,7 @@ static bool rcu_gp_init(void)
 		smp_mb(); // Pair with barriers used when updating ->ofl_seq to odd values.
 		firstseq = READ_ONCE(rnp->ofl_seq);
 		if (firstseq & 0x1)
-			while (firstseq == smp_load_acquire(&rnp->ofl_seq))
-				schedule_timeout_idle(1);  // Can't wake unless RCU is watching.
+			smp_cond_load_relaxed(&rnp->ofl_seq, VAL != firstseq);
 		smp_mb(); // Pair with barriers used when updating ->ofl_seq to even values.
 		raw_spin_lock(&rcu_state.ofl_lock);
 		raw_spin_lock_irq_rcu_node(rnp);
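
For reference, smp_cond_load_relaxed() spins until its condition is true
and returns the value it loaded, so the wait has to be phrased as "the
count changed". A rough userspace analogue with C11 atomics, where
spin_until_changed() and wait_for_hotplug() are hypothetical names, a
bare loop stands in for the cpu_relax() spin, and seq_cst fences stand
in for smp_mb():

```c
#include <stdatomic.h>

/*
 * Hypothetical analogue of smp_cond_load_relaxed(ptr, VAL != old):
 * spin on relaxed loads until the value differs from 'old', then
 * return the value that was seen.
 */
unsigned long spin_until_changed(atomic_ulong *ptr, unsigned long old)
{
	unsigned long val;

	while ((val = atomic_load_explicit(ptr, memory_order_relaxed)) == old)
		;	/* the kernel would cpu_relax() here */
	return val;
}

/* Shape of the wait in rcu_gp_init(): only spin if the count is odd. */
unsigned long wait_for_hotplug(atomic_ulong *ofl_seq)
{
	unsigned long firstseq;

	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for smp_mb() */
	firstseq = atomic_load_explicit(ofl_seq, memory_order_relaxed);
	if (firstseq & 0x1)
		firstseq = spin_until_changed(ofl_seq, firstseq);
	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for smp_mb() */
	return firstseq;
}
```

The relaxed load is enough here because the full fence after the loop
already provides the ordering; an acquire load in the loop would add
nothing.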
