rcu.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: linux-kernel@vger.kernel.org, Andy Lutomirski <luto@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Ingo Molnar <mingo@redhat.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Petr Mladek <pmladek@suse.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	rcu@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	Yafang Shao <laoar.shao@gmail.com>
Subject: Re: [PATCH -rcu dev 1/2] Revert b8c17e6664c4 ("rcu: Maintain special bits at bottom of ->dynticks counter")
Date: Thu, 5 Sep 2019 11:36:20 -0400	[thread overview]
Message-ID: <20190905153620.GG26466@google.com> (raw)
In-Reply-To: <20190904231308.GB4125@linux.ibm.com>

On Wed, Sep 04, 2019 at 04:13:08PM -0700, Paul E. McKenney wrote:
> On Wed, Sep 04, 2019 at 09:54:20AM -0400, Joel Fernandes wrote:
> > On Wed, Sep 04, 2019 at 03:12:10AM -0700, Paul E. McKenney wrote:
> > > On Wed, Sep 04, 2019 at 12:59:10AM -0400, Joel Fernandes wrote:
> > > > On Tue, Sep 03, 2019 at 01:02:49PM -0700, Paul E. McKenney wrote:
> 
> [ . . . ]
> 
> > > If this task gets delayed betweentimes, rcu_implicit_dynticks_qs() would
> > > fail to set .rcu_need_heavy_qs because it saw it already being set,
> > > even though the corresponding ->dynticks update had already happened.
> > > (It might be a new grace period, given that the old grace period might
> > > have ended courtesy of the atomic_add_return().)
> > 
> > Makes sense and I agree.
> > 
> > Also, I would really appreciate if you can correct the nits in the above
> > patch we're reviewing, and apply them (if you can).
> > I think, there are only 2 changes left:
> > - rename special to seq.
> > - reorder the rcu_need_heavy_qs write.
> > 
> >  On a related point, when I was working on the NOHZ_FULL testing I noticed a
> >  weird issue where rcu_urgent_qs was reset but rcu_need_heavy_qs was still
> >  set indefinitely. I am a bit afraid our hints are not being cleared
> >  appropriately and I believe I fixed a similar issue a few months ago. I
> >  would rather have them cleared once they are no longer needed.  What do you
> >  think about the below patch? I did not submit it yet because I was working
> >  on other patches. 
> > 
> > ---8<-----------------------
> > 
> > From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > Subject: [RFC] rcu/tree: Reset CPU hints when reporting a quiescent state
> > 
> > While tracing, I am seeing cases where need_heavy_qs is still set even
> > though urgent_qs was cleared, after a quiescent state is reported. One
> > such case is when the softirq reports that a CPU has passed quiescent
> > state.
> > 
> > Previously in 671a63517cf9 ("rcu: Avoid unnecessary softirq when system
> > is idle"), I had fixed a bug where core_needs_qs was not being cleared.
> > I worry we keep running into similar situations. Let us just add a
> > function to clear hints and call it from all relevant places to make the
> > code more robust and avoid such stale hints which could in theory at
> > least, cause false hints after the quiescent state was already reported.
> > 
> > Tested overnight with rcutorture running for 60 minutes on all
> > configurations of RCU.
> > 
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > ---
> >  kernel/rcu/tree.c | 17 ++++++++++++++++-
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> Excellent point!  But how about if we combine it with the existing
> disabling of the scheduler tick, perhaps something like the following?
> 
> Note that the FQS clearing can come from some other CPU, hence the added
> {READ,WRITE}_ONCE() calls.  The call is moved down in rcu_report_qs_rdp()
> because something would have had to clear the bit to prevent execution
> from getting there, and I believe that the other bit-clearing events
> have calls to rcu_disable_urgency_upon_qs().  (But I easily could have
> missed something!)

Is there any harm just clearing it earlier in rcu_report_qs_rdp()? If no,
then let us just play it safe and do it that way (clear earlier in
rcu_report_qs_rdp())?

> I am OK leaving RCU urgency set on offline CPUs, hence clearing things
> at online time.

Got it, probably this point can be added to the commit message.

Added more comments below but otherwise it looks good to me:

> ------------------------------------------------------------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 68ebf0eb64c8..2b74b6c94086 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -827,7 +827,7 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
>  		incby = 1;
>  	} else if (tick_nohz_full_cpu(rdp->cpu) &&
>  		   rdp->dynticks_nmi_nesting == DYNTICK_IRQ_NONIDLE &&
> -		   rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
> +		   READ_ONCE(rdp->rcu_urgent_qs) && !rdp->rcu_forced_tick) {
>  		rdp->rcu_forced_tick = true;
>  		tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
>  	}
> @@ -892,11 +892,15 @@ void rcu_irq_enter_irqson(void)
>  }
>  
>  /*
> - * If the scheduler-clock interrupt was enabled on a nohz_full CPU
> - * in order to get to a quiescent state, disable it.
> + * If any sort of urgency was applied to the current CPU (for example,
> + * the scheduler-clock interrupt was enabled on a nohz_full CPU) in order
> + * to get to a quiescent state, disable it.
>   */
> -void rcu_disable_tick_upon_qs(struct rcu_data *rdp)
> +void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
>  {
> +	WRITE_ONCE(rdp->core_needs_qs, false);
> +	WRITE_ONCE(rdp->rcu_urgent_qs, false);
> +	WRITE_ONCE(rdp->rcu_need_heavy_qs, false);

Better to put a comment here saying _ONCE is needed to avoid data-races with
the FQS loop? Just so if anyone thinks why we are using _ONCE().

And I am guessing the __this_cpu_read(rcu_data.core_needs_qs) in
rcu_flavor_sched_clock_irq() implies READ_ONCE() so no need READ_ONCE()
there right?

>  	if (tick_nohz_full_cpu(rdp->cpu) && rdp->rcu_forced_tick) {
>  		tick_dep_clear_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
>  		rdp->rcu_forced_tick = false;
> @@ -1417,7 +1421,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
>  		trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, TPS("cpustart"));
>  		need_gp = !!(rnp->qsmask & rdp->grpmask);
>  		rdp->cpu_no_qs.b.norm = need_gp;
> -		rdp->core_needs_qs = need_gp;
> +		WRITE_ONCE(rdp->core_needs_qs, need_gp);
>  		zero_cpu_stall_ticks(rdp);
>  	}
>  	rdp->gp_seq = rnp->gp_seq;  /* Remember new grace-period state. */
> @@ -1987,7 +1991,6 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
>  		return;
>  	}
>  	mask = rdp->grpmask;
> -	rdp->core_needs_qs = false;
>  	if ((rnp->qsmask & mask) == 0) {
>  		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
>  	} else {
> @@ -1998,7 +2001,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
>  		if (!offloaded)
>  			needwake = rcu_accelerate_cbs(rnp, rdp);
>  
> -		rcu_disable_tick_upon_qs(rdp);
> +		rcu_disable_urgency_upon_qs(rdp);
>  		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
>  		/* ^^^ Released rnp->lock */
>  		if (needwake)
> @@ -2022,7 +2025,7 @@ rcu_check_quiescent_state(struct rcu_data *rdp)
>  	 * Does this CPU still need to do its part for current grace period?
>  	 * If no, return and let the other CPUs do their part as well.
>  	 */
> -	if (!rdp->core_needs_qs)
> +	if (!READ_ONCE(rdp->core_needs_qs))
>  		return;
>  
>  	/*
> @@ -2316,7 +2319,7 @@ static void force_qs_rnp(int (*f)(struct rcu_data *rdp))
>  				rdp = per_cpu_ptr(&rcu_data, cpu);
>  				if (f(rdp)) {
>  					mask |= bit;
> -					rcu_disable_tick_upon_qs(rdp);
> +					rcu_disable_urgency_upon_qs(rdp);
>  				}
>  			}
>  		}
> @@ -3004,7 +3007,7 @@ static int rcu_pending(void)
>  		return 0;
>  
>  	/* Is the RCU core waiting for a quiescent state from this CPU? */
> -	if (rdp->core_needs_qs && !rdp->cpu_no_qs.b.norm)
> +	if (READ_ONCE(rdp->core_needs_qs) && !rdp->cpu_no_qs.b.norm)
>  		return 1;
>  
>  	/* Does this CPU have callbacks ready to invoke? */
> @@ -3244,7 +3247,6 @@ int rcutree_prepare_cpu(unsigned int cpu)
>  	rdp->gp_seq = rnp->gp_seq;
>  	rdp->gp_seq_needed = rnp->gp_seq;
>  	rdp->cpu_no_qs.b.norm = true;
> -	rdp->core_needs_qs = false;

How about calling the new hint-clearing function here as well? Just for
robustness and consistency purposes?

thanks,

 - Joel

>  	rdp->rcu_iw_pending = false;
>  	rdp->rcu_iw_gp_seq = rnp->gp_seq - 1;
>  	trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuonl"));
> @@ -3359,7 +3361,7 @@ void rcu_cpu_starting(unsigned int cpu)
>  	rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
>  	rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
>  	if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */
> -		rcu_disable_tick_upon_qs(rdp);
> +		rcu_disable_urgency_upon_qs(rdp);
>  		/* Report QS -after- changing ->qsmaskinitnext! */
>  		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
>  	} else {

  reply	other threads:[~2019-09-05 15:36 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-30 16:23 [PATCH -rcu dev 1/2] Revert b8c17e6664c4 ("rcu: Maintain special bits at bottom of ->dynticks counter") Joel Fernandes (Google)
2019-08-30 16:23 ` [PATCH -rcu dev 2/2] rcu/dyntick-idle: Add better tracing Joel Fernandes (Google)
2019-09-03 20:04   ` Paul E. McKenney
2019-09-04  0:46     ` Joel Fernandes
2019-09-03 20:02 ` [PATCH -rcu dev 1/2] Revert b8c17e6664c4 ("rcu: Maintain special bits at bottom of ->dynticks counter") Paul E. McKenney
2019-09-04  4:59   ` Joel Fernandes
2019-09-04 10:12     ` Paul E. McKenney
2019-09-04 13:54       ` Joel Fernandes
2019-09-04 23:13         ` Paul E. McKenney
2019-09-05 15:36           ` Joel Fernandes [this message]
2019-09-05 16:43             ` Paul E. McKenney
2019-09-06  0:01               ` Joel Fernandes
2019-09-06 15:08                 ` Joel Fernandes
2019-09-06 15:21                   ` Paul E. McKenney
2019-09-06 15:27                     ` Paul E. McKenney
2019-09-06 16:57                       ` Joel Fernandes
2019-09-06 17:16                         ` Paul E. McKenney
2019-09-06 17:26                           ` Joel Fernandes
2019-09-07 17:28                           ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190905153620.GG26466@google.com \
    --to=joel@joelfernandes.org \
    --cc=bhelgaas@google.com \
    --cc=jiangshanlai@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=laoar.shao@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pmladek@suse.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).