RCU Archive on lore.kernel.org
From: "Paul E. McKenney" <paulmck@kernel.org>
To: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: linux-kernel@vger.kernel.org,
	Neeraj Upadhyay <neeraju@codeaurora.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	rcu@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH 1/2] rcu/tree: Add a warning if CPU being onlined did not report QS already
Date: Thu, 30 Jul 2020 09:21:59 -0700
Message-ID: <20200730162159.GZ9247@paulmck-ThinkPad-P72>
In-Reply-To: <20200730030221.705255-1-joel@joelfernandes.org>

On Wed, Jul 29, 2020 at 11:02:20PM -0400, Joel Fernandes (Google) wrote:
> Add a warning if CPU being onlined did not report QS already. This is to
> simplify the code in the CPU onlining path and also to make clear
> where QS is reported. The act of QS reporting in the CPU onlining path
> is likely unnecessary as shown by code reading and testing with
> rcutorture's TREE03 and hotplug parameters.

How about something like this for the commit log?


Currently, rcu_cpu_starting() checks to see if the RCU core expects a
quiescent state from the incoming CPU.  However, the current interaction
between RCU quiescent-state reporting and CPU-hotplug operations should
mean that the incoming CPU never needs to report a quiescent state.
First, the outgoing CPU reports a quiescent state if needed.  Second,
the race where the CPU is leaving just as RCU is initializing a new
grace period is handled by an explicit check for this condition.  Third,
the CPU's leaf rcu_node structure's ->lock serializes these checks.

This means that if rcu_cpu_starting() ever feels the need to report
a quiescent state, then there is a bug somewhere in the CPU hotplug
code or the RCU grace-period handling code.  This commit therefore
adds a WARN_ON_ONCE() to bring that bug to everyone's attention.
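
The three points above can be sketched as a toy, single-threaded model.
This is only an illustration under loud assumptions: the toy_* names are
hypothetical, the "lock" is a flag checked by assert() rather than a real
spinlock, and the real rcu_gp_init() and CPU-offline paths are considerably
more involved. The field names are merely modeled on the rcu_node structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a leaf rcu_node structure (simplified assumption). */
struct toy_rnp {
	bool locked;			/* models ->lock, single-threaded */
	unsigned long qsmask;		/* CPUs the current GP still waits on */
	unsigned long qsmaskinit;	/* CPUs sampled for the current GP */
	unsigned long qsmaskinitnext;	/* CPUs online for the next GP */
};

static void toy_lock(struct toy_rnp *rnp)
{
	assert(!rnp->locked);
	rnp->locked = true;
}

static void toy_unlock(struct toy_rnp *rnp)
{
	assert(rnp->locked);
	rnp->locked = false;
}

/* First: the outgoing CPU reports a quiescent state if needed. */
static void toy_cpu_dying(struct toy_rnp *rnp, unsigned long mask)
{
	toy_lock(rnp);
	rnp->qsmask &= ~mask;		/* report QS (no-op if none owed) */
	rnp->qsmaskinitnext &= ~mask;	/* CPU is now offline */
	toy_unlock(rnp);
}

/* Grace-period initialization, step 1: sample the online CPUs. */
static void toy_gp_sample(struct toy_rnp *rnp)
{
	toy_lock(rnp);
	rnp->qsmaskinit = rnp->qsmaskinitnext;
	toy_unlock(rnp);
}

/* Step 2.  Second point: explicitly drop CPUs that left after sampling. */
static void toy_gp_start(struct toy_rnp *rnp)
{
	toy_lock(rnp);
	rnp->qsmask = rnp->qsmaskinit;
	rnp->qsmask &= rnp->qsmaskinitnext;	/* the explicit race check */
	toy_unlock(rnp);
}

/*
 * Third point: all checks run under the same lock.  Returns true if the
 * incoming CPU would have had to report a quiescent state.
 */
static bool toy_cpu_starting(struct toy_rnp *rnp, unsigned long mask)
{
	bool needed;

	toy_lock(rnp);
	rnp->qsmaskinitnext |= mask;
	needed = (rnp->qsmask & mask) != 0;
	toy_unlock(rnp);
	return needed;
}
```

In this model, whether the CPU goes offline between toy_gp_sample() and
toy_gp_start() or after the grace period has started, a later
toy_cpu_starting() finds its bit already clear in ->qsmask.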


> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/tree.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 65e1b5e92319..1e51962b565b 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3996,7 +3996,19 @@ void rcu_cpu_starting(unsigned int cpu)
>  	rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
>  	rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
>  	rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
> -	if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */
> +
> +	/*
> +	 * Delete QS reporting from here, by June 2021, if warning does not
> +	 * fire. Let us make the rules for reporting QS for offline CPUs
> +	 * more explicit. The CPU onlining path does not need to report QS for
> +	 * an offline CPU. Either the QS should have been reported during CPU
> +	 * offlining, or during rcu_gp_init() if it detected a race with either
> +	 * CPU offlining or task unblocking on previously offlined CPUs. Note
> +	 * that the FQS loop also does not report QS for an offline CPU any
> +	 * longer (unless it splats due to an offline CPU blocking the GP for
> +	 * too long).
> +	 */

Let's leave at least the WARN_ON_ONCE() indefinitely.  If you don't
believe me, remove this code in your local tree, have someone give you
several branches, some with bugs injected, and then try to figure out
which have the bugs and then try to find those bugs.

This is not a fastpath, so the overhead of the check is not a concern.
Believe me, the difficulty of bug location without this check is a very
real concern!  ;-)
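
For readers unfamiliar with the pattern, here is a minimal userspace sketch
of what "warn once, but keep handling the condition" looks like. It is not
the kernel's WARN_ON_ONCE() implementation: demo_warn_on_once() and the
other demo_* names are hypothetical stand-ins, and the quiescent-state
report is reduced to clearing a bit in a mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int demo_warnings;	/* warnings actually printed */
static int demo_qs_reports;	/* fallback QS reports performed */

/*
 * Hypothetical userspace analogue of WARN_ON_ONCE(): returns the
 * condition, but prints at most once per *warned flag.
 */
static bool demo_warn_on_once(bool cond, bool *warned, const char *what)
{
	assert(warned != NULL);
	if (cond && !*warned) {
		*warned = true;
		demo_warnings++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Toy version of the onlining check: warn once, always repair. */
static void demo_cpu_starting(unsigned long *qsmask, unsigned long mask)
{
	static bool warned;	/* one flag for this call site */

	if (demo_warn_on_once((*qsmask & mask) != 0, &warned,
			      "incoming CPU still owes a quiescent state")) {
		*qsmask &= ~mask;	/* still report, so the GP can end */
		demo_qs_reports++;
	}
}
```

The check costs one mask test per CPU-online operation, and the fallback
report still runs on every occurrence, while the warning itself is printed
only the first time.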

On the other hand, I fully agree with the benefits of documenting the
design rules.  But is this really the best place to do that from the
viewpoint of someone who is trying to figure out how RCU works?

							Thanx, Paul

> +	if (WARN_ON_ONCE(rnp->qsmask & mask)) { /* RCU waiting on incoming CPU? */
>  		rcu_disable_urgency_upon_qs(rdp);
>  		/* Report QS -after- changing ->qsmaskinitnext! */
>  		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
> -- 
> 2.28.0.rc0.142.g3c755180ce-goog

