LKML Archive on lore.kernel.org
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, mingo@kernel.org, jiangshanlai@gmail.com,
	dipankar@in.ibm.com, akpm@linux-foundation.org,
	mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
	tglx@linutronix.de, peterz@infradead.org, dhowells@redhat.com,
	edumazet@google.com, fweisbec@gmail.com, oleg@redhat.com,
	joel@joelfernandes.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH tip/core/rcu 22/30] rcu: Don't flag non-starting GPs before GP kthread is running
Date: Sat, 15 Feb 2020 05:42:08 -0800
Message-ID: <20200215134208.GA9879@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20200215110111.GZ2935@paulmck-ThinkPad-P72>

On Sat, Feb 15, 2020 at 03:01:11AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 14, 2020 at 10:53:05PM -0500, Steven Rostedt wrote:
> > On Fri, 14 Feb 2020 15:55:59 -0800
> > paulmck@kernel.org wrote:
> > 
> > > @@ -1252,10 +1252,10 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp)
> > >   */
> > >  static void rcu_gp_kthread_wake(void)
> > >  {
> > > -	if ((current == rcu_state.gp_kthread &&
> > > +	if ((current == READ_ONCE(rcu_state.gp_kthread) &&
> > >  	     !in_irq() && !in_serving_softirq()) ||
> > >  	    !READ_ONCE(rcu_state.gp_flags) ||
> > > -	    !rcu_state.gp_kthread)
> > > +	    !READ_ONCE(rcu_state.gp_kthread))
> > >  		return;
> > 
> > This looks buggy. You have two instances of
> > READ_ONCE(rcu_state.gp_kthread), which means they can be different. Is
> > that intentional?
> 
> It might well be a bug, but let's see...
> 
> The rcu_state.gp_kthread field is initially NULL and transitions only once
> to the non-NULL pointer to the RCU grace-period kthread's task_struct
> structure.  So yes, this does work, courtesy of the compiler not being
> allowed to change the order of READ_ONCE() instances and coherence-order
> rules for READ_ONCE() and WRITE_ONCE().
> 
> But it would clearly be way better to do just one READ_ONCE() into a
> local variable and test that local variable twice.
> 
> I will make this change, and thank you for calling my attention to it!

And does the following V2 look better?

							Thanx, Paul
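
For readers following along outside the kernel tree, the single-snapshot
idea can be sketched in userspace C11.  This is an illustration only, not
part of the patch: a relaxed atomic load stands in for READ_ONCE(), and
struct task, read_gp_kthread(), should_skip_wake(), and the in_handler
flag (replacing in_irq()/in_serving_softirq()) are hypothetical names
chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct. */
struct task {
	int id;
};

/* Stand-in for rcu_state.gp_kthread: NULL until published, then fixed. */
static _Atomic(struct task *) gp_kthread;

/* Approximates READ_ONCE() with a single relaxed atomic load. */
static struct task *read_gp_kthread(void)
{
	return atomic_load_explicit(&gp_kthread, memory_order_relaxed);
}

/*
 * Return true when the wakeup should be skipped.  The pointer is loaded
 * exactly once into a local, so both tests below see the same value.
 */
static bool should_skip_wake(const struct task *self, bool in_handler,
			     unsigned int gp_flags)
{
	struct task *t = read_gp_kthread();

	return (self == t && !in_handler) || !gp_flags || !t;
}
```

Because t is a local copy, the "current == t" and "!t" tests cannot
disagree about what was read, which is the property the two separate
READ_ONCE() calls left to a more subtle argument.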

------------------------------------------------------------------------

commit 35f7c539d30d5b595718302d07334146f8eb7304
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Tue Jan 21 12:30:22 2020 -0800

    rcu: Don't flag non-starting GPs before GP kthread is running
    
    Currently rcu_check_gp_start_stall() complains if a grace period takes
    too long to start, where "too long" is roughly one RCU CPU stall-warning
    interval.  This has worked well, but there are some debugging Kconfig
    options (such as CONFIG_EFI_PGT_DUMP=y) that can make booting take a
    very long time, so much so that the stall-warning interval has expired
    before RCU's grace-period kthread has even been spawned.
    
    This commit therefore resets the rcu_state.gp_req_activity and
    rcu_state.gp_activity timestamps just before the grace-period kthread
    is spawned, and modifies the checks and adds ordering to ensure that
    if rcu_check_gp_start_stall() sees that the grace-period kthread
    has been spawned, it will also see the resets applied to the
    rcu_state.gp_req_activity and rcu_state.gp_activity timestamps.
    
    Reported-by: Qian Cai <cai@lca.pw>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    [ paulmck: Fix whitespace issues reported by Qian Cai. ]
    Tested-by: Qian Cai <cai@lca.pw>
    [ paulmck: Simplify grace-period wakeup check per Steve Rostedt feedback. ]

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 62383ce..4a4a975 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1202,7 +1202,7 @@ static bool rcu_start_this_gp(struct rcu_node *rnp_start, struct rcu_data *rdp,
 	trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("Startedroot"));
 	WRITE_ONCE(rcu_state.gp_flags, rcu_state.gp_flags | RCU_GP_FLAG_INIT);
 	WRITE_ONCE(rcu_state.gp_req_activity, jiffies);
-	if (!rcu_state.gp_kthread) {
+	if (!READ_ONCE(rcu_state.gp_kthread)) {
 		trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("NoGPkthread"));
 		goto unlock_out;
 	}
@@ -1237,12 +1237,13 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp)
 }
 
 /*
- * Awaken the grace-period kthread.  Don't do a self-awaken (unless in
- * an interrupt or softirq handler), and don't bother awakening when there
- * is nothing for the grace-period kthread to do (as in several CPUs raced
- * to awaken, and we lost), and finally don't try to awaken a kthread that
- * has not yet been created.  If all those checks are passed, track some
- * debug information and awaken.
+ * Awaken the grace-period kthread.  Don't do a self-awaken (unless in an
+ * interrupt or softirq handler, in which case we just might immediately
+ * sleep upon return, resulting in a grace-period hang), and don't bother
+ * awakening when there is nothing for the grace-period kthread to do
+ * (as in several CPUs raced to awaken, we lost), and finally don't try
+ * to awaken a kthread that has not yet been created.  If all those checks
+ * are passed, track some debug information and awaken.
  *
  * So why do the self-wakeup when in an interrupt or softirq handler
  * in the grace-period kthread's context?  Because the kthread might have
@@ -1252,10 +1253,10 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp)
  */
 static void rcu_gp_kthread_wake(void)
 {
-	if ((current == rcu_state.gp_kthread &&
-	     !in_irq() && !in_serving_softirq()) ||
-	    !READ_ONCE(rcu_state.gp_flags) ||
-	    !rcu_state.gp_kthread)
+	struct task_struct *t = READ_ONCE(rcu_state.gp_kthread);
+
+	if ((current == t && !in_irq() && !in_serving_softirq()) ||
+	    !READ_ONCE(rcu_state.gp_flags) || !t)
 		return;
 	WRITE_ONCE(rcu_state.gp_wake_time, jiffies);
 	WRITE_ONCE(rcu_state.gp_wake_seq, READ_ONCE(rcu_state.gp_seq));
@@ -3554,7 +3555,10 @@ static int __init rcu_spawn_gp_kthread(void)
 	}
 	rnp = rcu_get_root();
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
-	rcu_state.gp_kthread = t;
+	WRITE_ONCE(rcu_state.gp_activity, jiffies);
+	WRITE_ONCE(rcu_state.gp_req_activity, jiffies);
+	// Reset .gp_activity and .gp_req_activity before setting .gp_kthread.
+	smp_store_release(&rcu_state.gp_kthread, t);  /* ^^^ */
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	wake_up_process(t);
 	rcu_spawn_nocb_kthreads();
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 488b71d..16ad7ad 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -578,6 +578,7 @@ void show_rcu_gp_kthreads(void)
 	unsigned long jw;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
+	struct task_struct *t = READ_ONCE(rcu_state.gp_kthread);
 
 	j = jiffies;
 	ja = j - READ_ONCE(rcu_state.gp_activity);
@@ -585,8 +586,7 @@ void show_rcu_gp_kthreads(void)
 	jw = j - READ_ONCE(rcu_state.gp_wake_time);
 	pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
 		rcu_state.name, gp_state_getname(rcu_state.gp_state),
-		rcu_state.gp_state,
-		rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL,
+		rcu_state.gp_state, t ? t->state : 0x1ffffL,
 		ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq),
 		(long)READ_ONCE(rcu_state.gp_seq),
 		(long)READ_ONCE(rcu_get_root()->gp_seq_needed),
@@ -633,7 +633,8 @@ static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp,
 
 	if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() ||
 	    ULONG_CMP_GE(READ_ONCE(rnp_root->gp_seq),
-			 READ_ONCE(rnp_root->gp_seq_needed)))
+			 READ_ONCE(rnp_root->gp_seq_needed)) ||
+	    !smp_load_acquire(&rcu_state.gp_kthread)) // Get stable kthread.
 		return;
 	j = jiffies; /* Expensive access, and in common case don't get here. */
 	if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||
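
The ordering described in the commit log (reset the timestamps, then
publish the kthread pointer with release semantics, and pair that with
an acquire load in the checker) can be sketched in userspace C11.  This
is a simplified illustration under stated assumptions, not the kernel
code: C11 release/acquire atomics stand in for smp_store_release() and
smp_load_acquire(), and struct task, publish_gp_kthread(), and
start_stall_suspected() are hypothetical names.  The elapsed-time test
is a reduced stand-in for the time_before() checks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct. */
struct task {
	int id;
};

static _Atomic unsigned long gp_activity;
static _Atomic unsigned long gp_req_activity;
static _Atomic(struct task *) gp_kthread;	/* NULL until published */

/* Publisher: reset both timestamps, then publish the pointer. */
static void publish_gp_kthread(struct task *t, unsigned long now)
{
	atomic_store_explicit(&gp_activity, now, memory_order_relaxed);
	atomic_store_explicit(&gp_req_activity, now, memory_order_relaxed);
	/* Release: the two stores above are ordered before this one. */
	atomic_store_explicit(&gp_kthread, t, memory_order_release);
}

/* Checker: report a suspected start stall only once the kthread exists. */
static bool start_stall_suspected(unsigned long now, unsigned long delay)
{
	unsigned long req;

	/* Acquire: a non-NULL pointer implies the resets are visible. */
	if (!atomic_load_explicit(&gp_kthread, memory_order_acquire))
		return false;
	req = atomic_load_explicit(&gp_req_activity, memory_order_relaxed);
	return now - req >= delay;	/* unsigned math tolerates wraparound */
}
```

A checker that observes a non-NULL pointer through the acquire load is
guaranteed to also observe the timestamp resets made before the release
store, so a slow boot cannot produce a complaint based on stale
timestamps from before the kthread was spawned.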
