From: "Paul E. McKenney" <paulmck@kernel.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>,
    linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-rt-users@vger.kernel.org,
    Catalin Marinas <catalin.marinas@arm.com>,
    Will Deacon <will@kernel.org>, Ingo Molnar <mingo@kernel.org>,
    Peter Zijlstra <peterz@infradead.org>,
    Thomas Gleixner <tglx@linutronix.de>,
    Steven Rostedt <rostedt@goodmis.org>,
    Daniel Bristot de Oliveira <bristot@redhat.com>,
    Josh Triplett <josh@joshtriplett.org>,
    Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
    Lai Jiangshan <jiangshanlai@gmail.com>,
    Joel Fernandes <joel@joelfernandes.org>,
    Anshuman Khandual <anshuman.khandual@arm.com>,
    Vincenzo Frascino <vincenzo.frascino@arm.com>,
    Steven Price <steven.price@arm.com>,
    Ard Biesheuvel <ardb@kernel.org>,
    Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [PATCH 2/3] rcu/nocb: Check for migratability rather than pure preemptability
Date: Wed, 28 Jul 2021 18:04:45 -0700
Message-ID: <20210729010445.GO4397@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <20210728220137.GD293265@lothringen>

On Thu, Jul 29, 2021 at 12:01:37AM +0200, Frederic Weisbecker wrote:
> On Wed, Jul 28, 2021 at 08:34:14PM +0100, Valentin Schneider wrote:
> > On 28/07/21 01:08, Frederic Weisbecker wrote:
> > > On Wed, Jul 21, 2021 at 12:51:17PM +0100, Valentin Schneider wrote:
> > >> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> > >> ---
> > >>  kernel/rcu/tree_plugin.h | 3 +--
> > >>  1 file changed, 1 insertion(+), 2 deletions(-)
> > >>
> > >> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > >> index ad0156b86937..6c3c4100da83 100644
> > >> --- a/kernel/rcu/tree_plugin.h
> > >> +++ b/kernel/rcu/tree_plugin.h
> > >> @@ -70,8 +70,7 @@ static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
> > >>      !(lockdep_is_held(&rcu_state.barrier_mutex) ||
> > >>        (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) ||
> > >>        rcu_lockdep_is_held_nocb(rdp) ||
> > >> -      (rdp == this_cpu_ptr(&rcu_data) &&
> > >> -       !(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible())) ||
> > >> +      (rdp == this_cpu_ptr(&rcu_data) && is_pcpu_safe()) ||
> > >
> > > I fear that won't work. We really need any caller of rcu_rdp_is_offloaded()
> > > on the local rdp to have preemption disabled and not just migration disabled,
> > > because we must protect against concurrent offloaded state changes.
> > >
> > > The offloaded state is changed by a workqueue that executes on the target rdp.
> > >
> > > Here is a practical example where it matters:
> > >
> > >     CPU 0
> > >     -----
> > >     // =======> task rcuc running
> > >     rcu_core {
> > >         rcu_nocb_lock_irqsave(rdp, flags) {
> > >             if (!rcu_segcblist_is_offloaded(rdp->cblist)) {
> > >                 // is not offloaded right now, so it's going
> > >                 // to just disable IRQs. Oh no wait:
> > >     // preemption
> > >     // ========> workqueue running
> > >     rcu_nocb_rdp_offload();
> > >     // ========> task rcuc resume
> > >                 local_irq_disable();
> > >             }
> > >         }
> > >         ....
> > >         rcu_nocb_unlock_irqrestore(rdp, flags) {
> > >             if (rcu_segcblist_is_offloaded(rdp->cblist)) {
> > >                 // is offloaded right now so:
> > >                 raw_spin_unlock_irqrestore(rdp, flags);
> > >
> > > And that will explode because that's an impaired unlock on nocb_lock.
> >
> > Harumph, that doesn't look good, thanks for pointing this out.
> >
> > AFAICT PREEMPT_RT doesn't actually require to disable softirqs here (since
> > it forces RCU callbacks on the RCU kthreads), but disabled softirqs seem to
> > be a requirement for much of the underlying functions and even some of the
> > callbacks (delayed_put_task_struct() ~> vfree() pays close attention to
> > in_interrupt() for instance).
> >
> > Now, if the offloaded state was (properly) protected by a local_lock, do
> > you reckon we could then keep preemption enabled?
>
> I guess we could take such a local lock on the update side
> (rcu_nocb_rdp_offload) and then take it on rcuc kthread/softirqs
> and maybe other places.
>
> But we must make sure that rcu_core() is preempt-safe from a general perspective
> in the first place. From a quick glance I can't find obvious issues...yet.
>
> Paul maybe you can see something?

Let's see...

o   Extra context switches in rcu_core() mean extra quiescent states.
    It therefore might be necessary to wrap rcu_core() in an
    rcu_read_lock() / rcu_read_unlock() pair, because otherwise an
    RCU grace period won't wait for rcu_core().

    Actually, better have local_bh_disable() imply rcu_read_lock()
    and local_bh_enable() imply rcu_read_unlock(). But I would
    hope that this already happened.

o   The rcu_preempt_deferred_qs() check should still be fine,
    unless there is a raw_bh_disable() in -rt.

o   The set_tsk_need_resched() and set_preempt_need_resched()
    might preempt immediately. I cannot think of a problem
    with that, but careful testing is clearly in order.

o   The values checked by rcu_check_quiescent_state() could now
    change while this function is running. I don't immediately
    see a problematic sequence of events, but here be dragons.
    I therefore suggest disabling preemption across this function.
    Or if that is impossible, taking a very careful look at the
    proposed expansion of the state space of this function.

o   I don't see any new races in the grace-period/callback check.
    New callbacks can appear in interrupt handlers, after all.

o   The rcu_check_gp_start_stall() function looks similarly
    unproblematic.

o   Callback invocation can now be preempted, but then again it
    recently started being concurrent, so this should be no
    added risk over offloading/de-offloading.

o   I don't see any problem with do_nocb_deferred_wakeup().

o   The CONFIG_RCU_STRICT_GRACE_PERIOD check should not be
    impacted.

So some adjustments might be needed, but I don't see a need for
major surgery.

This of course might be a failure of imagination on my part, so it
wouldn't hurt to double-check my observations.

> > From a naive outsider PoV, rdp->nocb_lock looks like a decent candidate,
> > but it's a *raw* spinlock (I can't tell right now whether changing this is
> > a horrible idea or not), and then there's
>
> Yeah that's not possible, nocb_lock is too low level and has to be called with
> IRQs disabled. So if we take that local_lock solution, we need a new lock.

No argument here!

                            Thanx, Paul
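
The unbalanced-unlock scenario quoted above can be modeled in userspace. What follows is only a minimal sketch under loud assumptions: every name in it (toy_rdp, toy_nocb_lock(), toy_nocb_unlock(), toy_offload(), toy_rcu_core()) is hypothetical and merely imitates the "lock only if offloaded" pattern of rcu_nocb_lock_irqsave() / rcu_nocb_unlock_irqrestore(). The state_pinned flag stands in for the new local lock floated in the thread, which is not part of the patch under review. A real lock would make the updater wait; this toy simply refuses the update, and the preemption is replayed sequentially rather than with real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rcu_data; NOT the kernel's definition. */
struct toy_rdp {
	bool offloaded;    /* models the offloaded state of rdp->cblist */
	int  lock_depth;   /* models nocb_lock: 1 held, 0 free, <0 is a bug */
	bool state_pinned; /* models a hypothetical local lock over the state */
};

/* Models rcu_nocb_lock_irqsave(): take the lock only if offloaded. */
static void toy_nocb_lock(struct toy_rdp *rdp)
{
	if (rdp->offloaded)
		rdp->lock_depth++;
}

/* Models rcu_nocb_unlock_irqrestore(): release only if offloaded. */
static void toy_nocb_unlock(struct toy_rdp *rdp)
{
	if (rdp->offloaded)
		rdp->lock_depth--;
}

/* Models the offloading workqueue; refused while the state is pinned. */
static bool toy_offload(struct toy_rdp *rdp)
{
	if (rdp->state_pinned)
		return false;
	rdp->offloaded = true;
	return true;
}

/*
 * Replays the quoted interleaving in order and returns the final lock
 * depth: 0 means balanced, negative means an unlock without a lock.
 */
static int toy_rcu_core(bool pin_state)
{
	struct toy_rdp rdp = {
		.offloaded = false,
		.lock_depth = 0,
		.state_pinned = pin_state,
	};

	toy_nocb_lock(&rdp);    /* not offloaded: the lock is skipped */
	toy_offload(&rdp);      /* "preemption": workqueue flips the state */
	toy_nocb_unlock(&rdp);  /* if flipped, releases a lock never taken */
	return rdp.lock_depth;
}
```

Calling toy_rcu_core(false) replays the race with the state unprotected, and toy_rcu_core(true) replays it with the state pinned across the lock/unlock pair. The sketch says nothing about whether rcu_core() as a whole is preempt-safe, which is the open question the list above addresses.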