From: "Paul E. McKenney" <paulmck@kernel.org>
To: Lai Jiangshan <laijs@linux.alibaba.com>
Cc: linux-kernel@vger.kernel.org,
	Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	rcu@vger.kernel.org
Subject: Re: [PATCH 01/11] rcu: avoid leaking exp_deferred_qs into next GP
Date: Thu, 31 Oct 2019 12:00:14 -0700
Message-ID: <20191031190014.GZ20975@paulmck-ThinkPad-P72>
In-Reply-To: <2cf71e70-4cb3-57f8-f542-69ddf04106dd@linux.alibaba.com>

On Fri, Nov 01, 2019 at 02:19:13AM +0800, Lai Jiangshan wrote:
> 
> 
> On 2019/10/31 9:43 PM, Paul E. McKenney wrote:
> > On Thu, Oct 31, 2019 at 10:07:56AM +0000, Lai Jiangshan wrote:
> > > If exp_deferred_qs is incorrectly set and leaked to the next
> > > exp GP, it may cause the next GP to be incorrectly prematurely
> > > completed.
> > 
> > Could you please provide the sequence of events leading to such a
> > failure?
> 
> I just felt nervous about "leaking" exp_deferred_qs.
> I didn't carefully consider the sequence of events.
> 
> It now turns out that I must have misunderstood exp_deferred_qs.
> So calling it "leaking" is the wrong concept: preempt_disable()
> is treated as rcu_read_lock(), so exp_deferred_qs
> needs to be set.

Thank you for checking, and yes, this code is a bit subtle.  So good
on you for digging into it!

							Thanx, Paul

> Thanks
> Lai
> 
> ============ no need to read:
> 
> rcu_read_lock()
> // other cpu starts exp GP_A
> preempt_schedule() // queue itself
> rcu_read_unlock() // report qs, other cpu is sending ipi to me
> preempt_disable()
>   rcu_exp_handler() interrupt for GP_A and leave an exp_deferred_qs
>   // exp GP_A finished
>   ---------------above is one possible way to leave an exp_deferred_qs
> preempt_enable()
>  interrupt before preempt_schedule()
>   rcu_read_lock()
>   rcu_read_unlock()
>    NESTED interrupt when rcu_read_lock_nesting is negative
>     rcu_read_lock()
>     // other cpu starts exp GP_B
>     NESTED interrupt for rcu_flavor_sched_clock_irq()
>      report exp qs since rcu_read_lock_nesting < 0 and \
>      exp_deferred_qs is true
>     // exp GP_B complete
>     rcu_read_unlock()
> 
> This plausible sequence also relies on a NESTED interrupt,
> and could be avoided by patch 2 if NESTED interrupts were allowed.
> 
> > 
> > Also, did you provoke such a failure in testing?  If so, an upgrade
> > to rcutorture would be good, so please tell me what you did to make
> > the failure happen.
> > 
> > I do like the reduction in state space, but I am a bit concerned about
> > the potential increase in contention on rnp->lock.  Thoughts?
> > 
> > 							Thanx, Paul
> > 
> > > Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> > > ---
> > >   kernel/rcu/tree_exp.h | 23 ++++++++++++++---------
> > >   1 file changed, 14 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> > > index a0e1e51c51c2..6dec21909b30 100644
> > > --- a/kernel/rcu/tree_exp.h
> > > +++ b/kernel/rcu/tree_exp.h
> > > @@ -603,6 +603,18 @@ static void rcu_exp_handler(void *unused)
> > >   	struct rcu_node *rnp = rdp->mynode;
> > >   	struct task_struct *t = current;
> > > +	/*
> > > +	 * Note that there is a large group of race conditions that
> > > +	 * can have caused this quiescent state to already have been
> > > +	 * reported, so we really do need to check ->expmask first.
> > > +	 */
> > > +	raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > +	if (!(rnp->expmask & rdp->grpmask)) {
> > > +		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > > +		return;
> > > +	}
> > > +	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > > +
> > >   	/*
> > >   	 * First, the common case of not being in an RCU read-side
> > >   	 * critical section.  If also enabled or idle, immediately
> > > @@ -628,17 +640,10 @@ static void rcu_exp_handler(void *unused)
> > >   	 * a future context switch.  Either way, if the expedited
> > >   	 * grace period is still waiting on this CPU, set ->deferred_qs
> > >   	 * so that the eventual quiescent state will be reported.
> > > -	 * Note that there is a large group of race conditions that
> > > -	 * can have caused this quiescent state to already have been
> > > -	 * reported, so we really do need to check ->expmask.
> > >   	 */
> > >   	if (t->rcu_read_lock_nesting > 0) {
> > > -		raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > -		if (rnp->expmask & rdp->grpmask) {
> > > -			rdp->exp_deferred_qs = true;
> > > -			t->rcu_read_unlock_special.b.exp_hint = true;
> > > -		}
> > > -		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > > +		rdp->exp_deferred_qs = true;
> > > +		WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);
> > >   		return;
> > >   	}
> > > -- 
> > > 2.20.1
> > > 
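
The control flow that the quoted patch proposes for rcu_exp_handler() can be modeled outside the kernel. What follows is a minimal user-space sketch, not kernel code: the model_* structures and functions are hypothetical stand-ins, the rnp->lock acquisition is omitted, and only the "check ->expmask first, then defer if inside a read-side critical section" ordering is represented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's rcu_node, rcu_data and task state. */
struct model_node {
	unsigned long expmask;	/* CPUs the expedited GP still waits on */
};

struct model_cpu {
	struct model_node *mynode;
	unsigned long grpmask;	/* this CPU's bit in mynode->expmask */
	bool exp_deferred_qs;
};

struct model_task {
	int rcu_read_lock_nesting;
	bool exp_hint;
};

/*
 * Model of the patched ordering.  Returns true if the quiescent state
 * was deferred.  The real handler takes rnp->lock around the ->expmask
 * check and handles further cases that are not modeled here.
 */
static bool model_exp_handler(struct model_cpu *rdp, struct model_task *t)
{
	struct model_node *rnp = rdp->mynode;

	/* The GP no longer waits on this CPU: do nothing at all. */
	if (!(rnp->expmask & rdp->grpmask))
		return false;

	/* Inside a read-side critical section: defer the report. */
	if (t->rcu_read_lock_nesting > 0) {
		rdp->exp_deferred_qs = true;
		t->exp_hint = true;
		return true;
	}

	return false;
}

/* Assumed scenario: a handler arrives after the QS was already reported. */
static void model_selfcheck(void)
{
	struct model_node n = { .expmask = 0x0 };
	struct model_cpu c = { .mynode = &n, .grpmask = 0x1 };
	struct model_task t = { .rcu_read_lock_nesting = 1 };

	assert(!model_exp_handler(&c, &t));
	assert(!c.exp_deferred_qs && !t.exp_hint);
}
```

In this model, a handler that runs after the CPU's bit has been cleared from expmask leaves exp_deferred_qs untouched, which is the state-space reduction mentioned above. The cost raised in the review, taking rnp->lock on every handler invocation, is exactly the part the model leaves out.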
