* [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU @ 2021-07-05 23:43 Frederic Weisbecker 2021-07-05 23:43 ` [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() Frederic Weisbecker 2021-07-06 16:46 ` [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney 0 siblings, 2 replies; 7+ messages in thread From: Frederic Weisbecker @ 2021-07-05 23:43 UTC (permalink / raw) To: Paul E . McKenney Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Uladzislau Rezki, Boqun Feng, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes cond_resched() reports an RCU quiescent state only in non-preemptible TREE RCU implementation. Provide an explanation for the different behaviour in CONFIG_PREEMPT_RCU=y. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> --- kernel/sched/core.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index cf16f8fda9a6..db374cb38eb2 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7780,6 +7780,19 @@ int __sched __cond_resched(void) preempt_schedule_common(); return 1; } + /* + * A process spending a long time in the kernel space might + * have too few opportunities to report quiescent states + * when CONFIG_PREEMPT_RCU=n because then the tick can't know + * if it's interrupting an RCU read side critical section. In the + * absence of voluntary sleeps, the last resort resides in tracking + * calls to cond_resched() which always imply quiescent states. + * + * On the other hand, preemptible RCU has a real RCU read side + * tracking that allows the tick for reporting interrupted quiescent + * states or, in the worst case, deferred quiescent states after + * rcu_read_unlock(). + */ #ifndef CONFIG_PREEMPT_RCU rcu_all_qs(); #endif -- 2.25.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() 2021-07-05 23:43 [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Frederic Weisbecker @ 2021-07-05 23:43 ` Frederic Weisbecker 2021-07-06 7:51 ` Peter Zijlstra 2021-07-06 16:46 ` [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney 1 sibling, 1 reply; 7+ messages in thread From: Frederic Weisbecker @ 2021-07-05 23:43 UTC (permalink / raw) To: Paul E . McKenney Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Uladzislau Rezki, Boqun Feng, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes The preemption is already disabled when we write rcu_data.rcu_urgent_qs. We can use __this_cpu_write() directly, although that path is mostly used when CONFIG_PREEMPT=n. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> --- kernel/rcu/tree_plugin.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 27b74352cccf..38b3d01424d7 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -871,7 +871,7 @@ void rcu_all_qs(void) preempt_enable(); return; } - this_cpu_write(rcu_data.rcu_urgent_qs, false); + __this_cpu_write(rcu_data.rcu_urgent_qs, false); if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) { local_irq_save(flags); rcu_momentary_dyntick_idle(); -- 2.25.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() 2021-07-05 23:43 ` [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() Frederic Weisbecker @ 2021-07-06 7:51 ` Peter Zijlstra 2021-07-06 12:30 ` Frederic Weisbecker 0 siblings, 1 reply; 7+ messages in thread From: Peter Zijlstra @ 2021-07-06 7:51 UTC (permalink / raw) To: Frederic Weisbecker Cc: Paul E . McKenney, LKML, Uladzislau Rezki, Boqun Feng, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote: > The preemption is already disabled when we write rcu_data.rcu_urgent_qs. > We can use __this_cpu_write() directly, although that path is mostly > used when CONFIG_PREEMPT=n. > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org> > Cc: Neeraj Upadhyay <neeraju@codeaurora.org> > Cc: Joel Fernandes <joel@joelfernandes.org> > Cc: Uladzislau Rezki <urezki@gmail.com> > Cc: Boqun Feng <boqun.feng@gmail.com> > --- > kernel/rcu/tree_plugin.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > index 27b74352cccf..38b3d01424d7 100644 > --- a/kernel/rcu/tree_plugin.h > +++ b/kernel/rcu/tree_plugin.h > @@ -871,7 +871,7 @@ void rcu_all_qs(void) > preempt_enable(); > return; > } > - this_cpu_write(rcu_data.rcu_urgent_qs, false); > + __this_cpu_write(rcu_data.rcu_urgent_qs, false); There's another subtle difference between this_cpu_write() and __this_cpu_write() aside from preempt. this_cpu_write() is also IRQ-safe, while __this_cpu_write() is not. I've not looked at the usage here to see if that is relevant, but the Changelog only mentioned the preempt side of things, and that argument is incomplete in general. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() 2021-07-06 7:51 ` Peter Zijlstra @ 2021-07-06 12:30 ` Frederic Weisbecker 2021-07-06 13:28 ` Boqun Feng 0 siblings, 1 reply; 7+ messages in thread From: Frederic Weisbecker @ 2021-07-06 12:30 UTC (permalink / raw) To: Peter Zijlstra Cc: Paul E . McKenney, LKML, Uladzislau Rezki, Boqun Feng, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes On Tue, Jul 06, 2021 at 09:51:01AM +0200, Peter Zijlstra wrote: > On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote: > > The preemption is already disabled when we write rcu_data.rcu_urgent_qs. > > We can use __this_cpu_write() directly, although that path is mostly > > used when CONFIG_PREEMPT=n. > > > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org> > > Cc: Neeraj Upadhyay <neeraju@codeaurora.org> > > Cc: Joel Fernandes <joel@joelfernandes.org> > > Cc: Uladzislau Rezki <urezki@gmail.com> > > Cc: Boqun Feng <boqun.feng@gmail.com> > > --- > > kernel/rcu/tree_plugin.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > > index 27b74352cccf..38b3d01424d7 100644 > > --- a/kernel/rcu/tree_plugin.h > > +++ b/kernel/rcu/tree_plugin.h > > @@ -871,7 +871,7 @@ void rcu_all_qs(void) > > preempt_enable(); > > return; > > } > > - this_cpu_write(rcu_data.rcu_urgent_qs, false); > > + __this_cpu_write(rcu_data.rcu_urgent_qs, false); > > There's another subtle difference between this_cpu_write() and > __this_cpu_write() aside from preempt. this_cpu_write() is also > IRQ-safe, while __this_cpu_write() is not. > > I've not looked at the usage here to see if that is relevant, but the > Changelog only mentioned the preempt side of things, and that argument > is incomplete in general. You're right, I missed that. I see this rcu_urgent_qs is set by RCU TASKS from rcu_tasks_wait_gp() (did I missed another path?). Not sure if this is called from IRQ nor if it actually matters to protect against IRQs for that single write. I'm not quite used to rcu_tasks. Paul? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() 2021-07-06 12:30 ` Frederic Weisbecker @ 2021-07-06 13:28 ` Boqun Feng 2021-07-06 16:24 ` Paul E. McKenney 0 siblings, 1 reply; 7+ messages in thread From: Boqun Feng @ 2021-07-06 13:28 UTC (permalink / raw) To: Frederic Weisbecker Cc: Peter Zijlstra, Paul E . McKenney, LKML, Uladzislau Rezki, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes On Tue, Jul 06, 2021 at 02:30:58PM +0200, Frederic Weisbecker wrote: > On Tue, Jul 06, 2021 at 09:51:01AM +0200, Peter Zijlstra wrote: > > On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote: > > > The preemption is already disabled when we write rcu_data.rcu_urgent_qs. > > > We can use __this_cpu_write() directly, although that path is mostly > > > used when CONFIG_PREEMPT=n. > > > > > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org> > > > Cc: Neeraj Upadhyay <neeraju@codeaurora.org> > > > Cc: Joel Fernandes <joel@joelfernandes.org> > > > Cc: Uladzislau Rezki <urezki@gmail.com> > > > Cc: Boqun Feng <boqun.feng@gmail.com> > > > --- > > > kernel/rcu/tree_plugin.h | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > > > index 27b74352cccf..38b3d01424d7 100644 > > > --- a/kernel/rcu/tree_plugin.h > > > +++ b/kernel/rcu/tree_plugin.h > > > @@ -871,7 +871,7 @@ void rcu_all_qs(void) > > > preempt_enable(); > > > return; > > > } > > > - this_cpu_write(rcu_data.rcu_urgent_qs, false); > > > + __this_cpu_write(rcu_data.rcu_urgent_qs, false); > > > > There's another subtle difference between this_cpu_write() and > > __this_cpu_write() aside from preempt. this_cpu_write() is also > > IRQ-safe, while __this_cpu_write() is not. > > > > I've not looked at the usage here to see if that is relevant, but the > > Changelog only mentioned the preempt side of things, and that argument > > is incomplete in general. > > You're right, I missed that. I see this rcu_urgent_qs is set by > RCU TASKS from rcu_tasks_wait_gp() (did I missed another path?). > Not sure if this is called from IRQ nor if it actually matters to > protect against IRQs for that single write. I think __this_cpu_write() being IRQ-unsafe means it may overwrite percpu writes to other bytes in the same word? Let's say the rcu_urgent_qs is the lowest byte in the word, the pseduo asm code of __this_cpu_write() may be: __this_cpu_write(ptr, v): long tmp = *ptr; tmp &= ~(0xff); tmp |= v; *ptr = tmp; and the following sequence introduces an overwrite: __this_cpu_write(ptr, v): // v is 0, and *ptr is 1 long tmp = *ptr; // tmp is 1 <interrupted> this_cpu_write() // modify another byte of *ptr, make it // 0xff01 <ret from interrupt> tmp &= ~(0xff) // tmp is 0 tmp |=v; // tmp is 0 *ptr = tmp; // *ptr is 0, overwrite a percpu write on // another field. I know that many archs have byte-wise store, so compilers don't really have the reason to generate code as above, but __this_cpu_write() is just a normal write, nothing prevents this from happenning, unless I'm missing something here? Regards, Boqun > > I'm not quite used to rcu_tasks. Paul? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() 2021-07-06 13:28 ` Boqun Feng @ 2021-07-06 16:24 ` Paul E. McKenney 0 siblings, 0 replies; 7+ messages in thread From: Paul E. McKenney @ 2021-07-06 16:24 UTC (permalink / raw) To: Boqun Feng Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Uladzislau Rezki, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes On Tue, Jul 06, 2021 at 09:28:38PM +0800, Boqun Feng wrote: > On Tue, Jul 06, 2021 at 02:30:58PM +0200, Frederic Weisbecker wrote: > > On Tue, Jul 06, 2021 at 09:51:01AM +0200, Peter Zijlstra wrote: > > > On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote: > > > > The preemption is already disabled when we write rcu_data.rcu_urgent_qs. > > > > We can use __this_cpu_write() directly, although that path is mostly > > > > used when CONFIG_PREEMPT=n. > > > > > > > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org> > > > > Cc: Neeraj Upadhyay <neeraju@codeaurora.org> > > > > Cc: Joel Fernandes <joel@joelfernandes.org> > > > > Cc: Uladzislau Rezki <urezki@gmail.com> > > > > Cc: Boqun Feng <boqun.feng@gmail.com> > > > > --- > > > > kernel/rcu/tree_plugin.h | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > > > > index 27b74352cccf..38b3d01424d7 100644 > > > > --- a/kernel/rcu/tree_plugin.h > > > > +++ b/kernel/rcu/tree_plugin.h > > > > @@ -871,7 +871,7 @@ void rcu_all_qs(void) > > > > preempt_enable(); > > > > return; > > > > } > > > > - this_cpu_write(rcu_data.rcu_urgent_qs, false); > > > > + __this_cpu_write(rcu_data.rcu_urgent_qs, false); > > > > > > There's another subtle difference between this_cpu_write() and > > > __this_cpu_write() aside from preempt. this_cpu_write() is also > > > IRQ-safe, while __this_cpu_write() is not. > > > > > > I've not looked at the usage here to see if that is relevant, but the > > > Changelog only mentioned the preempt side of things, and that argument > > > is incomplete in general. > > > > You're right, I missed that. I see this rcu_urgent_qs is set by > > RCU TASKS from rcu_tasks_wait_gp() (did I missed another path?). > > Not sure if this is called from IRQ nor if it actually matters to > > protect against IRQs for that single write. > > I think __this_cpu_write() being IRQ-unsafe means it may overwrite > percpu writes to other bytes in the same word? Let's say the > rcu_urgent_qs is the lowest byte in the word, the pseduo asm code of > __this_cpu_write() may be: > > __this_cpu_write(ptr, v): > long tmp = *ptr; > tmp &= ~(0xff); > tmp |= v; > *ptr = tmp; > > and the following sequence introduces an overwrite: > > __this_cpu_write(ptr, v): // v is 0, and *ptr is 1 > long tmp = *ptr; // tmp is 1 > <interrupted> > this_cpu_write() // modify another byte of *ptr, make it > // 0xff01 > <ret from interrupt> > tmp &= ~(0xff) // tmp is 0 > tmp |=v; // tmp is 0 > *ptr = tmp; // *ptr is 0, overwrite a percpu write on > // another field. > > I know that many archs have byte-wise store, so compilers don't really > have the reason to generate code as above, but __this_cpu_write() is > just a normal write, nothing prevents this from happenning, unless I'm > missing something here? There can indeed be writes to .rcu_urgent_qs from interrupt handlers, for example in CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels from rcu_read_unlock(). More conventionally, the RCU_SOFTIRQ handler can run on the back of an interrupts, and can invoke rcu_check_quiescent_state(), which invokes rcu_report_qs_rdp(), which invokes rcu_disable_urgency_upon_qs(), which writes to ->rcu_urgent_qs. RCU takes a strict view of data races, so this wants the existing this_cpu_write(). However, RCU very likely has this_cpu_write() calls that should instead be __this_cpu_write() calls and vice versa, so please do continue treating any that you see with an appropriate level of suspicion. Thanx, Paul > Regards, > Boqun > > > > > I'm not quite used to rcu_tasks. Paul? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU 2021-07-05 23:43 [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Frederic Weisbecker 2021-07-05 23:43 ` [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() Frederic Weisbecker @ 2021-07-06 16:46 ` Paul E. McKenney 1 sibling, 0 replies; 7+ messages in thread From: Paul E. McKenney @ 2021-07-06 16:46 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Peter Zijlstra, Uladzislau Rezki, Boqun Feng, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes On Tue, Jul 06, 2021 at 01:43:43AM +0200, Frederic Weisbecker wrote: > cond_resched() reports an RCU quiescent state only in non-preemptible > TREE RCU implementation. Provide an explanation for the different > behaviour in CONFIG_PREEMPT_RCU=y. > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org> > Cc: Neeraj Upadhyay <neeraju@codeaurora.org> > Cc: Joel Fernandes <joel@joelfernandes.org> > Cc: Uladzislau Rezki <urezki@gmail.com> > Cc: Boqun Feng <boqun.feng@gmail.com> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org> > Cc: Ingo Molnar <mingo@kernel.org> > --- > kernel/sched/core.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index cf16f8fda9a6..db374cb38eb2 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -7780,6 +7780,19 @@ int __sched __cond_resched(void) > preempt_schedule_common(); > return 1; > } > + /* > + * A process spending a long time in the kernel space might > + * have too few opportunities to report quiescent states > + * when CONFIG_PREEMPT_RCU=n because then the tick can't know > + * if it's interrupting an RCU read side critical section. In the > + * absence of voluntary sleeps, the last resort resides in tracking > + * calls to cond_resched() which always imply quiescent states. > + * > + * On the other hand, preemptible RCU has a real RCU read side > + * tracking that allows the tick for reporting interrupted quiescent > + * states or, in the worst case, deferred quiescent states after > + * rcu_read_unlock(). > + */ > #ifndef CONFIG_PREEMPT_RCU > rcu_all_qs(); > #endif Looks like a good addition! I have wordsmithed the comment and commit log a bit. If Peter wants to take either version of this: Acked-by: Paul E. McKenney <paulmck@kernel.org> In the meantime, I have queued it for v5.15. Thanx, Paul ------------------------------------------------------------------------ commit 41a95363ac9f020cc0e4fcc4b73015c60b6620f0 Author: Frederic Weisbecker <frederic@kernel.org> Date: Tue Jul 6 01:43:43 2021 +0200 rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU The cond_resched() function reports an RCU quiescent state only in non-preemptible TREE RCU implementation. This commit therefore adds a comment explaining why cond_resched() does nothing in preemptible kernels. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> diff --git a/kernel/sched/core.c b/kernel/sched/core.c index cf16f8fda9a6..eae24fdf3820 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7780,6 +7780,17 @@ int __sched __cond_resched(void) preempt_schedule_common(); return 1; } + /* + * In preemptible kernels, ->rcu_read_lock_nesting tells the tick + * whether the current CPU is in an RCU read-side critical section, + * so the tick can report quiescent states even for CPUs looping + * in kernel context. In contrast, in non-preemptible kernels, + * RCU readers leave no in-memory hints, which means that CPU-bound + * processes executing in kernel context might never report an + * RCU quiescent state. Therefore, the following code causes + * cond_resched() to report a quiescent state, but only when RCU + * is in urgent need of one. + */ #ifndef CONFIG_PREEMPT_RCU rcu_all_qs(); #endif ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-07-06 16:46 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-07-05 23:43 [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Frederic Weisbecker 2021-07-05 23:43 ` [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() Frederic Weisbecker 2021-07-06 7:51 ` Peter Zijlstra 2021-07-06 12:30 ` Frederic Weisbecker 2021-07-06 13:28 ` Boqun Feng 2021-07-06 16:24 ` Paul E. McKenney 2021-07-06 16:46 ` [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).