* One potential issue with concurrent execution of RCU callbacks...

From: Paul E. McKenney @ 2020-12-08 14:58 UTC (permalink / raw)
To: frederic; +Cc: boqun.feng, rcu, linux-kernel

Hello, Frederic,

Boqun just asked if RCU callbacks ran in BH-disabled context to avoid
concurrent execution of the same callback. Of course, this raises the
question of whether a self-posting callback can have two instances of
itself running concurrently while a CPU is in the process of transitioning
between softirq and rcuo invocation of callbacks.

I believe that the answer is "no" because BH-disabled context is
an implicit RCU read-side critical section. Therefore, the initial
invocation of the RCU callback must complete in order for a new grace
period to complete, and a new grace period must complete before the
second invocation of that same callback can start.

Does that make sense, or am I missing something?

							Thanx, Paul
* Re: One potential issue with concurrent execution of RCU callbacks...

From: Frederic Weisbecker @ 2020-12-08 15:54 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: boqun.feng, rcu, linux-kernel

On Tue, Dec 08, 2020 at 06:58:10AM -0800, Paul E. McKenney wrote:
> Hello, Frederic,
>
> Boqun just asked if RCU callbacks ran in BH-disabled context to avoid
> concurrent execution of the same callback. Of course, this raises the
> question of whether a self-posting callback can have two instances of
> itself running concurrently while a CPU is in the process of transitioning
> between softirq and rcuo invocation of callbacks.
>
> I believe that the answer is "no" because BH-disabled context is
> an implicit RCU read-side critical section. Therefore, the initial
> invocation of the RCU callback must complete in order for a new grace
> period to complete, and a new grace period must complete before the
> second invocation of that same callback can start.
>
> Does that make sense, or am I missing something?

Sounds like a good explanation. But then why are we actually calling
the entire rcu_do_batch() under BH-disabled context? Was it to quieten
lockdep against rcu_callback_map?

Wouldn't rcu_read_lock() around callback invocation be enough? Or is
there another reason for the BH-disabled context that I'm missing?

Untested below:

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bd04b09b84b3..207eff8a4e1a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2468,6 +2468,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 
 		debug_rcu_head_unqueue(rhp);
 
+		rcu_read_lock();
 		rcu_lock_acquire(&rcu_callback_map);
 		trace_rcu_invoke_callback(rcu_state.name, rhp);
 
@@ -2476,6 +2477,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 		f(rhp);
 
 		rcu_lock_release(&rcu_callback_map);
+		rcu_read_unlock();
 
 		/*
 		 * Stop only if limit reached and CPU has something to do.
@@ -2494,11 +2496,9 @@ static void rcu_do_batch(struct rcu_data *rdp)
 	}
 	if (offloaded) {
 		WARN_ON_ONCE(in_serving_softirq());
-		local_bh_enable();
 		lockdep_assert_irqs_enabled();
 		cond_resched_tasks_rcu_qs();
 		lockdep_assert_irqs_enabled();
-		local_bh_disable();
 	}
 }

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index fd8a52e9a887..2a3d3c59d650 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2095,9 +2095,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
 		local_irq_save(flags);
 		rcu_momentary_dyntick_idle();
 		local_irq_restore(flags);
-		local_bh_disable();
 		rcu_do_batch(rdp);
-		local_bh_enable();
 		lockdep_assert_irqs_enabled();
 		rcu_nocb_lock_irqsave(rdp, flags);
 		if (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
* Re: One potential issue with concurrent execution of RCU callbacks...

From: Paul E. McKenney @ 2020-12-08 17:19 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: boqun.feng, rcu, linux-kernel

On Tue, Dec 08, 2020 at 04:54:57PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 08, 2020 at 06:58:10AM -0800, Paul E. McKenney wrote:
> > Hello, Frederic,
> >
> > Boqun just asked if RCU callbacks ran in BH-disabled context to avoid
> > concurrent execution of the same callback. Of course, this raises the
> > question of whether a self-posting callback can have two instances of
> > itself running concurrently while a CPU is in the process of transitioning
> > between softirq and rcuo invocation of callbacks.
> >
> > I believe that the answer is "no" because BH-disabled context is
> > an implicit RCU read-side critical section. Therefore, the initial
> > invocation of the RCU callback must complete in order for a new grace
> > period to complete, and a new grace period must complete before the
> > second invocation of that same callback can start.
> >
> > Does that make sense, or am I missing something?
>
> Sounds like a good explanation. But then why are we actually calling
> the entire rcu_do_batch() under BH-disabled context? Was it to quieten
> lockdep against rcu_callback_map?

Inertia and lack of complaints about it.  ;-)

Plus when called from softirq, neither local_bh_disable() nor
rcu_read_lock() is necessary, and so represents pointless overhead.

> Wouldn't rcu_read_lock() around callback invocation be enough? Or is
> there another reason for the BH-disabled context that I'm missing?

There are likely to be callback functions that use spin_lock() instead
of spin_lock_bh() because they know that they are invoked in BH-disabled
context.

But how does this change help?

							Thanx, Paul

> Untested below:
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index bd04b09b84b3..207eff8a4e1a 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2468,6 +2468,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>
>  		debug_rcu_head_unqueue(rhp);
>
> +		rcu_read_lock();
>  		rcu_lock_acquire(&rcu_callback_map);
>  		trace_rcu_invoke_callback(rcu_state.name, rhp);
>
> @@ -2476,6 +2477,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  		f(rhp);
>
>  		rcu_lock_release(&rcu_callback_map);
> +		rcu_read_unlock();
>
>  		/*
>  		 * Stop only if limit reached and CPU has something to do.
> @@ -2494,11 +2496,9 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  	}
>  	if (offloaded) {
>  		WARN_ON_ONCE(in_serving_softirq());
> -		local_bh_enable();
>  		lockdep_assert_irqs_enabled();
>  		cond_resched_tasks_rcu_qs();
>  		lockdep_assert_irqs_enabled();
> -		local_bh_disable();
>  	}
>  }
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index fd8a52e9a887..2a3d3c59d650 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -2095,9 +2095,7 @@ static void nocb_cb_wait(struct rcu_data *rdp)
>  		local_irq_save(flags);
>  		rcu_momentary_dyntick_idle();
>  		local_irq_restore(flags);
> -		local_bh_disable();
>  		rcu_do_batch(rdp);
> -		local_bh_enable();
>  		lockdep_assert_irqs_enabled();
>  		rcu_nocb_lock_irqsave(rdp, flags);
>  		if (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
* Re: One potential issue with concurrent execution of RCU callbacks...

From: Frederic Weisbecker @ 2020-12-08 17:52 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: boqun.feng, rcu, linux-kernel

On Tue, Dec 08, 2020 at 09:19:27AM -0800, Paul E. McKenney wrote:
> On Tue, Dec 08, 2020 at 04:54:57PM +0100, Frederic Weisbecker wrote:
> > On Tue, Dec 08, 2020 at 06:58:10AM -0800, Paul E. McKenney wrote:
> > > Hello, Frederic,
> > >
> > > Boqun just asked if RCU callbacks ran in BH-disabled context to avoid
> > > concurrent execution of the same callback. Of course, this raises the
> > > question of whether a self-posting callback can have two instances of
> > > itself running concurrently while a CPU is in the process of transitioning
> > > between softirq and rcuo invocation of callbacks.
> > >
> > > I believe that the answer is "no" because BH-disabled context is
> > > an implicit RCU read-side critical section. Therefore, the initial
> > > invocation of the RCU callback must complete in order for a new grace
> > > period to complete, and a new grace period must complete before the
> > > second invocation of that same callback can start.
> > >
> > > Does that make sense, or am I missing something?
> >
> > Sounds like a good explanation. But then why are we actually calling
> > the entire rcu_do_batch() under BH-disabled context? Was it to quieten
> > lockdep against rcu_callback_map?
>
> Inertia and lack of complaints about it.  ;-)
>
> Plus when called from softirq, neither local_bh_disable() nor
> rcu_read_lock() is necessary, and so represents pointless overhead.
>
> > Wouldn't rcu_read_lock() around callback invocation be enough? Or is
> > there another reason for the BH-disabled context that I'm missing?
>
> There are likely to be callback functions that use spin_lock() instead
> of spin_lock_bh() because they know that they are invoked in BH-disabled
> context.

Ah right. So perhaps we can keep local_bh_disable() instead.

> But how does this change help?

It reduces the code scope running with BH disabled.
Also, narrowing it down helps to understand what it actually protects.

Thanks.
* Re: One potential issue with concurrent execution of RCU callbacks...

From: Paul E. McKenney @ 2020-12-08 18:24 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: boqun.feng, rcu, linux-kernel

On Tue, Dec 08, 2020 at 06:52:30PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 08, 2020 at 09:19:27AM -0800, Paul E. McKenney wrote:
> > On Tue, Dec 08, 2020 at 04:54:57PM +0100, Frederic Weisbecker wrote:
> > > On Tue, Dec 08, 2020 at 06:58:10AM -0800, Paul E. McKenney wrote:
> > > > Hello, Frederic,
> > > >
> > > > Boqun just asked if RCU callbacks ran in BH-disabled context to avoid
> > > > concurrent execution of the same callback. Of course, this raises the
> > > > question of whether a self-posting callback can have two instances of
> > > > itself running concurrently while a CPU is in the process of transitioning
> > > > between softirq and rcuo invocation of callbacks.
> > > >
> > > > I believe that the answer is "no" because BH-disabled context is
> > > > an implicit RCU read-side critical section. Therefore, the initial
> > > > invocation of the RCU callback must complete in order for a new grace
> > > > period to complete, and a new grace period must complete before the
> > > > second invocation of that same callback can start.
> > > >
> > > > Does that make sense, or am I missing something?
> > >
> > > Sounds like a good explanation. But then why are we actually calling
> > > the entire rcu_do_batch() under BH-disabled context? Was it to quieten
> > > lockdep against rcu_callback_map?
> >
> > Inertia and lack of complaints about it.  ;-)
> >
> > Plus when called from softirq, neither local_bh_disable() nor
> > rcu_read_lock() is necessary, and so represents pointless overhead.
> >
> > > Wouldn't rcu_read_lock() around callback invocation be enough? Or is
> > > there another reason for the BH-disabled context that I'm missing?
> >
> > There are likely to be callback functions that use spin_lock() instead
> > of spin_lock_bh() because they know that they are invoked in BH-disabled
> > context.
>
> Ah right. So perhaps we can keep local_bh_disable() instead.
>
> > But how does this change help?
>
> It reduces the code scope running with BH disabled.
> Also, narrowing it down helps to understand what it actually protects.

I thought that you would call out unnecessarily delaying other softirq
handlers.  ;-)

But if such delays are a problem (and they might well be), then avoiding
them on non-rcu_nocb CPUs would instead/also require adding a check for
other pending softirqs to the existing early-exit checks involving time,
need_resched, and idle.  At which point, entering and exiting BH-disabled
again doesn't help, other than your point about the difference in
BH-disabled scopes on rcu_nocb and non-rcu_nocb CPUs.

Would it make sense to exit rcu_do_batch() if more than some amount
of time had elapsed and there was some non-RCU softirq pending?

My guess is that the current tlimit checks in rcu_do_batch() make this
unnecessary.

Thoughts?

							Thanx, Paul
* Re: One potential issue with concurrent execution of RCU callbacks...

From: Frederic Weisbecker @ 2020-12-08 22:04 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: boqun.feng, rcu, linux-kernel

On Tue, Dec 08, 2020 at 10:24:09AM -0800, Paul E. McKenney wrote:
> > It reduces the code scope running with BH disabled.
> > Also, narrowing it down helps to understand what it actually protects.
>
> I thought that you would call out unnecessarily delaying other softirq
> handlers.  ;-)
>
> But if such delays are a problem (and they might well be), then avoiding
> them on non-rcu_nocb CPUs would instead/also require adding a check for
> other pending softirqs to the existing early-exit checks involving time,
> need_resched, and idle.  At which point, entering and exiting BH-disabled
> again doesn't help, other than your point about the difference in
> BH-disabled scopes on rcu_nocb and non-rcu_nocb CPUs.

Wise observation!

> Would it make sense to exit rcu_do_batch() if more than some amount
> of time had elapsed and there was some non-RCU softirq pending?
>
> My guess is that the current tlimit checks in rcu_do_batch() make this
> unnecessary.

Right and nobody has complained about it so far.

But I should add a comment explaining the reason for the BH-disabled
section in my series.

Thanks.

> Thoughts?
>
> 							Thanx, Paul
* Re: One potential issue with concurrent execution of RCU callbacks...

From: Paul E. McKenney @ 2020-12-09 0:03 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: boqun.feng, rcu, linux-kernel

On Tue, Dec 08, 2020 at 11:04:38PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 08, 2020 at 10:24:09AM -0800, Paul E. McKenney wrote:
> > > It reduces the code scope running with BH disabled.
> > > Also, narrowing it down helps to understand what it actually protects.
> >
> > I thought that you would call out unnecessarily delaying other softirq
> > handlers.  ;-)
> >
> > But if such delays are a problem (and they might well be), then avoiding
> > them on non-rcu_nocb CPUs would instead/also require adding a check for
> > other pending softirqs to the existing early-exit checks involving time,
> > need_resched, and idle.  At which point, entering and exiting BH-disabled
> > again doesn't help, other than your point about the difference in
> > BH-disabled scopes on rcu_nocb and non-rcu_nocb CPUs.
>
> Wise observation!
>
> > Would it make sense to exit rcu_do_batch() if more than some amount
> > of time had elapsed and there was some non-RCU softirq pending?
> >
> > My guess is that the current tlimit checks in rcu_do_batch() make this
> > unnecessary.
>
> Right and nobody has complained about it so far.

If they did, my thought would be to add another early-exit check, but
under the tlimit check, so that pending non-RCU softirqs might set a
shorter time limit.  For example, instead of allowing up to the current
one second in rcu_do_batch(), allow only up to 100 milliseconds or
whatever.  But there are lots of choices, which is one reason to wait
until it becomes a problem.

> But I should add a comment explaining the reason for the BH-disabled
> section in my series.

That sounds like a most excellent idea, please do!

							Thanx, Paul
* Re: One potential issue with concurrent execution of RCU callbacks...

From: Boqun Feng @ 2020-12-09 2:14 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Paul E. McKenney, rcu, linux-kernel

Hi Frederic,

On Tue, Dec 08, 2020 at 11:04:38PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 08, 2020 at 10:24:09AM -0800, Paul E. McKenney wrote:
> > > It reduces the code scope running with BH disabled.
> > > Also, narrowing it down helps to understand what it actually protects.
> >
> > I thought that you would call out unnecessarily delaying other softirq
> > handlers.  ;-)
> >
> > But if such delays are a problem (and they might well be), then avoiding
> > them on non-rcu_nocb CPUs would instead/also require adding a check for
> > other pending softirqs to the existing early-exit checks involving time,
> > need_resched, and idle.  At which point, entering and exiting BH-disabled
> > again doesn't help, other than your point about the difference in
> > BH-disabled scopes on rcu_nocb and non-rcu_nocb CPUs.
>
> Wise observation!
>
> > Would it make sense to exit rcu_do_batch() if more than some amount
> > of time had elapsed and there was some non-RCU softirq pending?
> >
> > My guess is that the current tlimit checks in rcu_do_batch() make this
> > unnecessary.
>
> Right and nobody has complained about it so far.
>
> But I should add a comment explaining the reason for the BH-disabled
> section in my series.

Some background for the original question: I'm revisiting the wait
context checking feature of lockdep (which can detect bugs like
acquiring a spinlock_t lock inside a raw_spinlock_t), and I've posted
my first version:

	https://lore.kernel.org/lkml/20201208103112.2838119-1-boqun.feng@gmail.com/

and will surely copy you in the next version ;-)

The reason I asked about the RCU callback context requirement is that we
have the virtual lock (rcu_callback_map) that marks a RCU callback
context, so if RCU callback contexts have special restrictions on the
locking usage inside, we can use the wait context checking to do the
check (like what I did in patch #3 of the above series).

My current summary is that in certain configs (use_softirq is true and
nocb is disabled) RCU callbacks are executed in a softirq context, so
the least requirement for any RCU callback is that it obey the rules of
softirq contexts.  And yes, I'm aware that in some configs, RCU
callbacks are not executed in a softirq context (sometimes, even BH is
not disabled), but we need to make all the callbacks work in the
"worst" (or strictest) case (callbacks executing in softirq contexts).
Currently, the effect of using wait context for rcu_callback_map in my
patchset is that lockdep will complain if a RCU callback uses a mutex or
other sleepable lock, but using spinlock_t (even in PREEMPT_RT) won't
cause lockdep to complain.  Am I getting this correct?

Regards,
Boqun

> Thanks.
>
> > Thoughts?
> >
> > 							Thanx, Paul
* Re: One potential issue with concurrent execution of RCU callbacks...

From: Paul E. McKenney @ 2020-12-10 0:50 UTC (permalink / raw)
To: Boqun Feng; +Cc: Frederic Weisbecker, rcu, linux-kernel

On Wed, Dec 09, 2020 at 10:14:49AM +0800, Boqun Feng wrote:
> Hi Frederic,
>
> On Tue, Dec 08, 2020 at 11:04:38PM +0100, Frederic Weisbecker wrote:
> > On Tue, Dec 08, 2020 at 10:24:09AM -0800, Paul E. McKenney wrote:
> > > > It reduces the code scope running with BH disabled.
> > > > Also, narrowing it down helps to understand what it actually protects.
> > >
> > > I thought that you would call out unnecessarily delaying other softirq
> > > handlers.  ;-)
> > >
> > > But if such delays are a problem (and they might well be), then avoiding
> > > them on non-rcu_nocb CPUs would instead/also require adding a check for
> > > other pending softirqs to the existing early-exit checks involving time,
> > > need_resched, and idle.  At which point, entering and exiting BH-disabled
> > > again doesn't help, other than your point about the difference in
> > > BH-disabled scopes on rcu_nocb and non-rcu_nocb CPUs.
> >
> > Wise observation!
> >
> > > Would it make sense to exit rcu_do_batch() if more than some amount
> > > of time had elapsed and there was some non-RCU softirq pending?
> > >
> > > My guess is that the current tlimit checks in rcu_do_batch() make this
> > > unnecessary.
> >
> > Right and nobody has complained about it so far.
> >
> > But I should add a comment explaining the reason for the BH-disabled
> > section in my series.
>
> Some background for the original question: I'm revisiting the wait
> context checking feature of lockdep (which can detect bugs like
> acquiring a spinlock_t lock inside a raw_spinlock_t), and I've posted
> my first version:
>
> 	https://lore.kernel.org/lkml/20201208103112.2838119-1-boqun.feng@gmail.com/
>
> and will surely copy you in the next version ;-)
>
> The reason I asked about the RCU callback context requirement is that we
> have the virtual lock (rcu_callback_map) that marks a RCU callback
> context, so if RCU callback contexts have special restrictions on the
> locking usage inside, we can use the wait context checking to do the
> check (like what I did in patch #3 of the above series).
>
> My current summary is that in certain configs (use_softirq is true and
> nocb is disabled) RCU callbacks are executed in a softirq context, so
> the least requirement for any RCU callback is that it obey the rules of
> softirq contexts.  And yes, I'm aware that in some configs, RCU
> callbacks are not executed in a softirq context (sometimes, even BH is
> not disabled), but we need to make all the callbacks work in the
> "worst" (or strictest) case (callbacks executing in softirq contexts).
> Currently, the effect of using wait context for rcu_callback_map in my
> patchset is that lockdep will complain if a RCU callback uses a mutex or
> other sleepable lock, but using spinlock_t (even in PREEMPT_RT) won't
> cause lockdep to complain.  Am I getting this correct?

It matches what I know.  And yes, in PREEMPT_RT, softirq is preemptible,
allowing spinlock_t to be used, but there are restrictions that lockdep
enforces.  I am not going to claim to remember the exact set of
restrictions.

							Thanx, Paul