From: "Paul E. McKenney" <paulmck@kernel.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-rt-users@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Steven Price <steven.price@arm.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [PATCH 2/3] rcu/nocb: Check for migratability rather than pure preemptability
Date: Wed, 28 Jul 2021 18:04:45 -0700
Message-ID: <20210729010445.GO4397@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <20210728220137.GD293265@lothringen>

On Thu, Jul 29, 2021 at 12:01:37AM +0200, Frederic Weisbecker wrote:
> On Wed, Jul 28, 2021 at 08:34:14PM +0100, Valentin Schneider wrote:
> > On 28/07/21 01:08, Frederic Weisbecker wrote:
> > > On Wed, Jul 21, 2021 at 12:51:17PM +0100, Valentin Schneider wrote:
> > >> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> > >> ---
> > >>  kernel/rcu/tree_plugin.h | 3 +--
> > >>  1 file changed, 1 insertion(+), 2 deletions(-)
> > >>
> > >> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > >> index ad0156b86937..6c3c4100da83 100644
> > >> --- a/kernel/rcu/tree_plugin.h
> > >> +++ b/kernel/rcu/tree_plugin.h
> > >> @@ -70,8 +70,7 @@ static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
> > >>              !(lockdep_is_held(&rcu_state.barrier_mutex) ||
> > >>                (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) ||
> > >>                rcu_lockdep_is_held_nocb(rdp) ||
> > >> -		  (rdp == this_cpu_ptr(&rcu_data) &&
> > >> -		   !(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible())) ||
> > >> +		  (rdp == this_cpu_ptr(&rcu_data) && is_pcpu_safe()) ||
> > >
> > > I fear that won't work. We really need any caller of rcu_rdp_is_offloaded()
> > > on the local rdp to have preemption disabled and not just migration disabled,
> > > because we must protect against concurrent offloaded state changes.
> > >
> > > The offloaded state is changed by a workqueue that executes on the target rdp.
> > >
> > > Here is a practical example where it matters:
> > >
> > >            CPU 0
> > >            -----
> > >            // =======> task rcuc running
> > >            rcu_core {
> > >              rcu_nocb_lock_irqsave(rdp, flags) {
> > >                    if (!rcu_segcblist_is_offloaded(rdp->cblist)) {
> > >                      // is not offloaded right now, so it's going
> > >                        // to just disable IRQs. Oh no wait:
> > >            // preemption
> > >            // ========> workqueue running
> > >            rcu_nocb_rdp_offload();
> > >            // ========> task rcuc resume
> > >                      local_irq_disable();
> > >                    }
> > >                }
> > >              ....
> > >                      rcu_nocb_unlock_irqrestore(rdp, flags) {
> > >                    if (rcu_segcblist_is_offloaded(rdp->cblist)) {
> > >                        // is offloaded right now so:
> > >                        raw_spin_unlock_irqrestore(rdp, flags);
> > >
> > > And that will explode because that's an unpaired unlock on nocb_lock.
> > 
> > Harumph, that doesn't look good, thanks for pointing this out.
> > 
> > AFAICT PREEMPT_RT doesn't actually require disabling softirqs here (since
> > it forces RCU callbacks on the RCU kthreads), but disabled softirqs seem to
> > be a requirement for much of the underlying functions and even some of the
> > callbacks (delayed_put_task_struct() ~> vfree() pays close attention to
> > in_interrupt() for instance).
> > 
> > Now, if the offloaded state was (properly) protected by a local_lock, do
> > you reckon we could then keep preemption enabled?
> 
> I guess we could take such a local lock on the update side
> (rcu_nocb_rdp_offload) and then take it on rcuc kthread/softirqs
> and maybe other places.
> 
> But we must make sure that rcu_core() is preempt-safe from a general perspective
> in the first place. From a quick glance I can't find obvious issues...yet.
> 
> Paul maybe you can see something?

Let's see...

o	Extra context switches in rcu_core() mean extra quiescent
	states.  It therefore might be necessary to wrap rcu_core()
	in an rcu_read_lock() / rcu_read_unlock() pair, because
	otherwise an RCU grace period won't wait for rcu_core().

	Actually, better have local_bh_disable() imply
	rcu_read_lock() and local_bh_enable() imply rcu_read_unlock().
	But I would hope that this already happened.

o	The rcu_preempt_deferred_qs() check should still be fine,
	unless there is a raw_bh_disable() in -rt. 

o	The set_tsk_need_resched() and set_preempt_need_resched()
	might preempt immediately.  I cannot think of a problem
	with that, but careful testing is clearly in order.

o	The values checked by rcu_check_quiescent_state() could now
	change while this function is running.  I don't immediately
	see a problematic sequence of events, but here be dragons.
	I therefore suggest disabling preemption across this function.
	Or if that is impossible, taking a very careful look at the
	proposed expansion of the state space of this function.

o	I don't see any new races in the grace-period/callback check.
	New callbacks can appear in interrupt handlers, after all.

o	The rcu_check_gp_start_stall() function looks similarly
	unproblematic.

o	Callback invocation can now be preempted, but then again it
	recently started being concurrent, so this should be no
	added risk over offloading/de-offloading.

o	I don't see any problem with do_nocb_deferred_wakeup().

o	The CONFIG_RCU_STRICT_GRACE_PERIOD check should not be
	impacted.

So some adjustments might be needed, but I don't see a need for
major surgery.

This of course might be a failure of imagination on my part, so it
wouldn't hurt to double-check my observations.
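
For anyone double-checking, the hazard in Frederic's example boils down
to sampling the offloaded state twice, once on lock and once on unlock,
with a preemption window in between.  Below is a minimal userspace
sketch of that pattern, not kernel code.  Everything in it (struct
model_rdp, model_lock(), model_unlock(), offload_worker() and
run_scenario()) is a hypothetical stand-in, and nocb_lock is modelled
as a plain depth counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_rdp {
	bool offloaded;	/* models rcu_segcblist_is_offloaded() */
	int lock_depth;	/* models nocb_lock: +1 on lock, -1 on unlock */
};

/* Models the workqueue that flips the offloaded state. */
static void offload_worker(struct model_rdp *rdp)
{
	rdp->offloaded = true;
}

/*
 * Models rcu_nocb_lock_irqsave(): the lock is taken only if the state
 * was offloaded when sampled.  The optional preempt() callback models
 * a preemption between the check and the action.
 */
static void model_lock(struct model_rdp *rdp,
		       void (*preempt)(struct model_rdp *))
{
	bool offloaded = rdp->offloaded;	/* first sample */

	if (preempt)
		preempt(rdp);			/* state may change here */
	if (offloaded)
		rdp->lock_depth++;
}

/* Models rcu_nocb_unlock_irqrestore(): re-samples the state. */
static void model_unlock(struct model_rdp *rdp)
{
	if (rdp->offloaded)			/* second sample */
		rdp->lock_depth--;
}

/* Returns the final lock depth: 0 is balanced, negative is unpaired. */
int run_scenario(bool allow_preemption)
{
	struct model_rdp rdp = { .offloaded = false };

	model_lock(&rdp, allow_preemption ? offload_worker : NULL);
	model_unlock(&rdp);
	assert(rdp.lock_depth <= 0);	/* this model never over-locks */
	return rdp.lock_depth;
}
```

In this model run_scenario(false) returns 0, while run_scenario(true)
returns -1, which corresponds to the unlock without a matching lock in
the example.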

> > From a naive outsider PoV, rdp->nocb_lock looks like a decent candidate,
> > but it's a *raw* spinlock (I can't tell right now whether changing this is
> > a horrible idea or not), and then there's
> 
> Yeah that's not possible, nocb_lock is too low level and has to be called with
> IRQs disabled. So if we take that local_lock solution, we need a new lock.

No argument here!
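
To make the "new lock" idea concrete, here is a rough userspace sketch
of a separate lock that serializes offloaded-state changes against the
conditional lock/unlock pair.  It is only a model under stated
assumptions: state_trylock(), state_unlock(), try_offload() and
run_protected_scenario() are hypothetical names, the lock is a plain
flag, and a real local_lock would make the updater wait rather than
fail and retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the state lock is a new lock, not nocb_lock. */
struct model_rdp {
	bool offloaded;		/* models the offloaded state */
	bool state_lock_held;	/* models the proposed new lock */
	int nocb_lock_depth;	/* +1 on lock, -1 on unlock */
};

static bool state_trylock(struct model_rdp *rdp)
{
	if (rdp->state_lock_held)
		return false;
	rdp->state_lock_held = true;
	return true;
}

static void state_unlock(struct model_rdp *rdp)
{
	assert(rdp->state_lock_held);
	rdp->state_lock_held = false;
}

/* Update side: the state flips only while the state lock is held. */
static bool try_offload(struct model_rdp *rdp)
{
	if (!state_trylock(rdp))
		return false;	/* a real lock would block here instead */
	rdp->offloaded = true;
	state_unlock(rdp);
	return true;
}

/*
 * Reader side: hold the state lock across the whole conditional
 * lock/unlock pair so that both samples see the same state.
 * Returns the final nocb lock depth (0 means balanced).
 */
int run_protected_scenario(bool *offload_deferred)
{
	struct model_rdp rdp = { .offloaded = false };

	if (!state_trylock(&rdp))
		return -1;			/* cannot happen in this model */
	if (rdp.offloaded)			/* first sample */
		rdp.nocb_lock_depth++;
	/* Preemption point: the updater runs but cannot take the lock. */
	*offload_deferred = !try_offload(&rdp);
	if (rdp.offloaded)			/* second sample, unchanged */
		rdp.nocb_lock_depth--;
	state_unlock(&rdp);

	try_offload(&rdp);	/* the deferred update now goes through */
	return rdp.nocb_lock_depth;
}
```

Here the updater's attempt in the middle of the critical section is
deferred, so both samples agree and the depth stays at 0.  Whether
rcu_core() as a whole tolerates preemption is a separate question that
this sketch does not address.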

							Thanx, Paul

