* [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU
@ 2021-07-05 23:43 Frederic Weisbecker
2021-07-05 23:43 ` [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() Frederic Weisbecker
2021-07-06 16:46 ` [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney
0 siblings, 2 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2021-07-05 23:43 UTC (permalink / raw)
To: Paul E . McKenney
Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Uladzislau Rezki,
Boqun Feng, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes
cond_resched() reports an RCU quiescent state only in non-preemptible
TREE RCU implementation. Provide an explanation for the different
behaviour in CONFIG_PREEMPT_RCU=y.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
---
kernel/sched/core.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cf16f8fda9a6..db374cb38eb2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7780,6 +7780,19 @@ int __sched __cond_resched(void)
preempt_schedule_common();
return 1;
}
+ /*
+ * A process spending a long time in the kernel space might
+ * have too few opportunities to report quiescent states
+ * when CONFIG_PREEMPT_RCU=n because then the tick can't know
+ * if it's interrupting an RCU read side critical section. In the
+ * absence of voluntary sleeps, the last resort resides in tracking
+ * calls to cond_resched() which always imply quiescent states.
+ *
+ * On the other hand, preemptible RCU has a real RCU read side
+ * tracking that allows the tick for reporting interrupted quiescent
+ * states or, in the worst case, deferred quiescent states after
+ * rcu_read_unlock().
+ */
#ifndef CONFIG_PREEMPT_RCU
rcu_all_qs();
#endif
--
2.25.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs()
2021-07-05 23:43 [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Frederic Weisbecker
@ 2021-07-05 23:43 ` Frederic Weisbecker
2021-07-06 7:51 ` Peter Zijlstra
2021-07-06 16:46 ` [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney
1 sibling, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2021-07-05 23:43 UTC (permalink / raw)
To: Paul E . McKenney
Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Uladzislau Rezki,
Boqun Feng, Ingo Molnar, Neeraj Upadhyay, Joel Fernandes
The preemption is already disabled when we write rcu_data.rcu_urgent_qs.
We can use __this_cpu_write() directly, although that path is mostly
used when CONFIG_PREEMPT=n.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
kernel/rcu/tree_plugin.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 27b74352cccf..38b3d01424d7 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -871,7 +871,7 @@ void rcu_all_qs(void)
preempt_enable();
return;
}
- this_cpu_write(rcu_data.rcu_urgent_qs, false);
+ __this_cpu_write(rcu_data.rcu_urgent_qs, false);
if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) {
local_irq_save(flags);
rcu_momentary_dyntick_idle();
--
2.25.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs()
2021-07-05 23:43 ` [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() Frederic Weisbecker
@ 2021-07-06 7:51 ` Peter Zijlstra
2021-07-06 12:30 ` Frederic Weisbecker
0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2021-07-06 7:51 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Paul E . McKenney, LKML, Uladzislau Rezki, Boqun Feng,
Ingo Molnar, Neeraj Upadhyay, Joel Fernandes
On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote:
> The preemption is already disabled when we write rcu_data.rcu_urgent_qs.
> We can use __this_cpu_write() directly, although that path is mostly
> used when CONFIG_PREEMPT=n.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> ---
> kernel/rcu/tree_plugin.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 27b74352cccf..38b3d01424d7 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -871,7 +871,7 @@ void rcu_all_qs(void)
> preempt_enable();
> return;
> }
> - this_cpu_write(rcu_data.rcu_urgent_qs, false);
> + __this_cpu_write(rcu_data.rcu_urgent_qs, false);
There's another subtle difference between this_cpu_write() and
__this_cpu_write() aside from preempt. this_cpu_write() is also
IRQ-safe, while __this_cpu_write() is not.
I've not looked at the usage here to see if that is relevant, but the
Changelog only mentioned the preempt side of things, and that argument
is incomplete in general.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs()
2021-07-06 7:51 ` Peter Zijlstra
@ 2021-07-06 12:30 ` Frederic Weisbecker
2021-07-06 13:28 ` Boqun Feng
0 siblings, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2021-07-06 12:30 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Paul E . McKenney, LKML, Uladzislau Rezki, Boqun Feng,
Ingo Molnar, Neeraj Upadhyay, Joel Fernandes
On Tue, Jul 06, 2021 at 09:51:01AM +0200, Peter Zijlstra wrote:
> On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote:
> > The preemption is already disabled when we write rcu_data.rcu_urgent_qs.
> > We can use __this_cpu_write() directly, although that path is mostly
> > used when CONFIG_PREEMPT=n.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Uladzislau Rezki <urezki@gmail.com>
> > Cc: Boqun Feng <boqun.feng@gmail.com>
> > ---
> > kernel/rcu/tree_plugin.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 27b74352cccf..38b3d01424d7 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -871,7 +871,7 @@ void rcu_all_qs(void)
> > preempt_enable();
> > return;
> > }
> > - this_cpu_write(rcu_data.rcu_urgent_qs, false);
> > + __this_cpu_write(rcu_data.rcu_urgent_qs, false);
>
> There's another subtle difference between this_cpu_write() and
> __this_cpu_write() aside from preempt. this_cpu_write() is also
> IRQ-safe, while __this_cpu_write() is not.
>
> I've not looked at the usage here to see if that is relevant, but the
> Changelog only mentioned the preempt side of things, and that argument
> is incomplete in general.
You're right, I missed that. I see this rcu_urgent_qs is set by
RCU TASKS from rcu_tasks_wait_gp() (did I missed another path?).
Not sure if this is called from IRQ nor if it actually matters to
protect against IRQs for that single write.
I'm not quite used to rcu_tasks. Paul?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs()
2021-07-06 12:30 ` Frederic Weisbecker
@ 2021-07-06 13:28 ` Boqun Feng
2021-07-06 16:24 ` Paul E. McKenney
0 siblings, 1 reply; 7+ messages in thread
From: Boqun Feng @ 2021-07-06 13:28 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Peter Zijlstra, Paul E . McKenney, LKML, Uladzislau Rezki,
Ingo Molnar, Neeraj Upadhyay, Joel Fernandes
On Tue, Jul 06, 2021 at 02:30:58PM +0200, Frederic Weisbecker wrote:
> On Tue, Jul 06, 2021 at 09:51:01AM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote:
> > > The preemption is already disabled when we write rcu_data.rcu_urgent_qs.
> > > We can use __this_cpu_write() directly, although that path is mostly
> > > used when CONFIG_PREEMPT=n.
> > >
> > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> > > Cc: Joel Fernandes <joel@joelfernandes.org>
> > > Cc: Uladzislau Rezki <urezki@gmail.com>
> > > Cc: Boqun Feng <boqun.feng@gmail.com>
> > > ---
> > > kernel/rcu/tree_plugin.h | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > index 27b74352cccf..38b3d01424d7 100644
> > > --- a/kernel/rcu/tree_plugin.h
> > > +++ b/kernel/rcu/tree_plugin.h
> > > @@ -871,7 +871,7 @@ void rcu_all_qs(void)
> > > preempt_enable();
> > > return;
> > > }
> > > - this_cpu_write(rcu_data.rcu_urgent_qs, false);
> > > + __this_cpu_write(rcu_data.rcu_urgent_qs, false);
> >
> > There's another subtle difference between this_cpu_write() and
> > __this_cpu_write() aside from preempt. this_cpu_write() is also
> > IRQ-safe, while __this_cpu_write() is not.
> >
> > I've not looked at the usage here to see if that is relevant, but the
> > Changelog only mentioned the preempt side of things, and that argument
> > is incomplete in general.
>
> You're right, I missed that. I see this rcu_urgent_qs is set by
> RCU TASKS from rcu_tasks_wait_gp() (did I missed another path?).
> Not sure if this is called from IRQ nor if it actually matters to
> protect against IRQs for that single write.
I think __this_cpu_write() being IRQ-unsafe means it may overwrite
percpu writes to other bytes in the same word? Let's say the
rcu_urgent_qs is the lowest byte in the word, the pseduo asm code of
__this_cpu_write() may be:
__this_cpu_write(ptr, v):
long tmp = *ptr;
tmp &= ~(0xff);
tmp |= v;
*ptr = tmp;
and the following sequence introduces an overwrite:
__this_cpu_write(ptr, v): // v is 0, and *ptr is 1
long tmp = *ptr; // tmp is 1
<interrupted>
this_cpu_write() // modify another byte of *ptr, make it
// 0xff01
<ret from interrupt>
tmp &= ~(0xff) // tmp is 0
tmp |=v; // tmp is 0
*ptr = tmp; // *ptr is 0, overwrite a percpu write on
// another field.
I know that many archs have byte-wise store, so compilers don't really
have the reason to generate code as above, but __this_cpu_write() is
just a normal write, nothing prevents this from happenning, unless I'm
missing something here?
Regards,
Boqun
>
> I'm not quite used to rcu_tasks. Paul?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs()
2021-07-06 13:28 ` Boqun Feng
@ 2021-07-06 16:24 ` Paul E. McKenney
0 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2021-07-06 16:24 UTC (permalink / raw)
To: Boqun Feng
Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Uladzislau Rezki,
Ingo Molnar, Neeraj Upadhyay, Joel Fernandes
On Tue, Jul 06, 2021 at 09:28:38PM +0800, Boqun Feng wrote:
> On Tue, Jul 06, 2021 at 02:30:58PM +0200, Frederic Weisbecker wrote:
> > On Tue, Jul 06, 2021 at 09:51:01AM +0200, Peter Zijlstra wrote:
> > > On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote:
> > > > The preemption is already disabled when we write rcu_data.rcu_urgent_qs.
> > > > We can use __this_cpu_write() directly, although that path is mostly
> > > > used when CONFIG_PREEMPT=n.
> > > >
> > > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > > Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> > > > Cc: Joel Fernandes <joel@joelfernandes.org>
> > > > Cc: Uladzislau Rezki <urezki@gmail.com>
> > > > Cc: Boqun Feng <boqun.feng@gmail.com>
> > > > ---
> > > > kernel/rcu/tree_plugin.h | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > > index 27b74352cccf..38b3d01424d7 100644
> > > > --- a/kernel/rcu/tree_plugin.h
> > > > +++ b/kernel/rcu/tree_plugin.h
> > > > @@ -871,7 +871,7 @@ void rcu_all_qs(void)
> > > > preempt_enable();
> > > > return;
> > > > }
> > > > - this_cpu_write(rcu_data.rcu_urgent_qs, false);
> > > > + __this_cpu_write(rcu_data.rcu_urgent_qs, false);
> > >
> > > There's another subtle difference between this_cpu_write() and
> > > __this_cpu_write() aside from preempt. this_cpu_write() is also
> > > IRQ-safe, while __this_cpu_write() is not.
> > >
> > > I've not looked at the usage here to see if that is relevant, but the
> > > Changelog only mentioned the preempt side of things, and that argument
> > > is incomplete in general.
> >
> > You're right, I missed that. I see this rcu_urgent_qs is set by
> > RCU TASKS from rcu_tasks_wait_gp() (did I missed another path?).
> > Not sure if this is called from IRQ nor if it actually matters to
> > protect against IRQs for that single write.
>
> I think __this_cpu_write() being IRQ-unsafe means it may overwrite
> percpu writes to other bytes in the same word? Let's say the
> rcu_urgent_qs is the lowest byte in the word, the pseduo asm code of
> __this_cpu_write() may be:
>
> __this_cpu_write(ptr, v):
> long tmp = *ptr;
> tmp &= ~(0xff);
> tmp |= v;
> *ptr = tmp;
>
> and the following sequence introduces an overwrite:
>
> __this_cpu_write(ptr, v): // v is 0, and *ptr is 1
> long tmp = *ptr; // tmp is 1
> <interrupted>
> this_cpu_write() // modify another byte of *ptr, make it
> // 0xff01
> <ret from interrupt>
> tmp &= ~(0xff) // tmp is 0
> tmp |=v; // tmp is 0
> *ptr = tmp; // *ptr is 0, overwrite a percpu write on
> // another field.
>
> I know that many archs have byte-wise store, so compilers don't really
> have the reason to generate code as above, but __this_cpu_write() is
> just a normal write, nothing prevents this from happenning, unless I'm
> missing something here?
There can indeed be writes to .rcu_urgent_qs from interrupt
handlers, for example in CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels
from rcu_read_unlock(). More conventionally, the RCU_SOFTIRQ
handler can run on the back of an interrupts, and can invoke
rcu_check_quiescent_state(), which invokes rcu_report_qs_rdp(), which
invokes rcu_disable_urgency_upon_qs(), which writes to ->rcu_urgent_qs.
RCU takes a strict view of data races, so this wants the existing
this_cpu_write().
However, RCU very likely has this_cpu_write() calls that should instead
be __this_cpu_write() calls and vice versa, so please do continue treating
any that you see with an appropriate level of suspicion.
Thanx, Paul
> Regards,
> Boqun
>
> >
> > I'm not quite used to rcu_tasks. Paul?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU
2021-07-05 23:43 [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Frederic Weisbecker
2021-07-05 23:43 ` [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() Frederic Weisbecker
@ 2021-07-06 16:46 ` Paul E. McKenney
1 sibling, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2021-07-06 16:46 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Peter Zijlstra, Uladzislau Rezki, Boqun Feng, Ingo Molnar,
Neeraj Upadhyay, Joel Fernandes
On Tue, Jul 06, 2021 at 01:43:43AM +0200, Frederic Weisbecker wrote:
> cond_resched() reports an RCU quiescent state only in non-preemptible
> TREE RCU implementation. Provide an explanation for the different
> behaviour in CONFIG_PREEMPT_RCU=y.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> ---
> kernel/sched/core.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index cf16f8fda9a6..db374cb38eb2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7780,6 +7780,19 @@ int __sched __cond_resched(void)
> preempt_schedule_common();
> return 1;
> }
> + /*
> + * A process spending a long time in the kernel space might
> + * have too few opportunities to report quiescent states
> + * when CONFIG_PREEMPT_RCU=n because then the tick can't know
> + * if it's interrupting an RCU read side critical section. In the
> + * absence of voluntary sleeps, the last resort resides in tracking
> + * calls to cond_resched() which always imply quiescent states.
> + *
> + * On the other hand, preemptible RCU has a real RCU read side
> + * tracking that allows the tick for reporting interrupted quiescent
> + * states or, in the worst case, deferred quiescent states after
> + * rcu_read_unlock().
> + */
> #ifndef CONFIG_PREEMPT_RCU
> rcu_all_qs();
> #endif
Looks like a good addition! I have wordsmithed the comment and
commit log a bit. If Peter wants to take either version of this:
Acked-by: Paul E. McKenney <paulmck@kernel.org>
In the meantime, I have queued it for v5.15.
Thanx, Paul
------------------------------------------------------------------------
commit 41a95363ac9f020cc0e4fcc4b73015c60b6620f0
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Tue Jul 6 01:43:43 2021 +0200
rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU
The cond_resched() function reports an RCU quiescent state only in
non-preemptible TREE RCU implementation. This commit therefore adds a
comment explaining why cond_resched() does nothing in preemptible kernels.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cf16f8fda9a6..eae24fdf3820 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7780,6 +7780,17 @@ int __sched __cond_resched(void)
preempt_schedule_common();
return 1;
}
+ /*
+ * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
+ * whether the current CPU is in an RCU read-side critical section,
+ * so the tick can report quiescent states even for CPUs looping
+ * in kernel context. In contrast, in non-preemptible kernels,
+ * RCU readers leave no in-memory hints, which means that CPU-bound
+ * processes executing in kernel context might never report an
+ * RCU quiescent state. Therefore, the following code causes
+ * cond_resched() to report a quiescent state, but only when RCU
+ * is in urgent need of one.
+ */
#ifndef CONFIG_PREEMPT_RCU
rcu_all_qs();
#endif
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-07-06 16:46 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-05 23:43 [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Frederic Weisbecker
2021-07-05 23:43 ` [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs() Frederic Weisbecker
2021-07-06 7:51 ` Peter Zijlstra
2021-07-06 12:30 ` Frederic Weisbecker
2021-07-06 13:28 ` Boqun Feng
2021-07-06 16:24 ` Paul E. McKenney
2021-07-06 16:46 ` [PATCH 1/2] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).