* [PATCH rcu/next 0/3] Preparatory patches borrowed from lazy rcu v5

From: Joel Fernandes (Google) @ 2022-09-15  0:14 UTC (permalink / raw)
To: linux-kernel
Cc: Joel Fernandes (Google), Frederic Weisbecker, Josh Triplett,
    Lai Jiangshan, Mathieu Desnoyers, Neeraj Upadhyay, Paul E. McKenney,
    rcu, Steven Rostedt

Hello,
Just sending a few patches ahead of lazy rcu v6, which can come out after
more testing to check the feasibility of the ideas discussed at LPC. I will
try to get rcutop pushed as well before v6, but in the meanwhile I don't
see much reason to block these few patches for 6.1. These apply to Paul's
rcu/next branch.

What do you think?

Joel Fernandes (Google) (3):
  rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask
  rcu: Move trace_rcu_callback() before bypassing
  rcu: Fix late wakeup when flush of bypass cblist happens

 kernel/rcu/tree.c      | 12 +++++++-----
 kernel/rcu/tree_nocb.h | 10 ++++++++--
 2 files changed, 15 insertions(+), 7 deletions(-)

-- 
2.37.2.789.g6183377224-goog
* [PATCH rcu/next 1/3] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask

From: Joel Fernandes (Google) @ 2022-09-15  0:14 UTC (permalink / raw)
To: linux-kernel
Cc: Joel Fernandes (Google), Frederic Weisbecker, Josh Triplett,
    Lai Jiangshan, Mathieu Desnoyers, Neeraj Upadhyay, Paul E. McKenney,
    rcu, Steven Rostedt

The rnp->qsmask is accessed locklessly from rcutree_dying_cpu(). Using
READ_ONCE() may help avoid load tearing due to the concurrent access,
silence KCSAN reports, and preserve the sanity of people reading the mask
in tracing.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0ca21ac0f064..5ec97e3f7468 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2106,7 +2106,7 @@ int rcutree_dying_cpu(unsigned int cpu)
 	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
 		return 0;
 
-	blkd = !!(rnp->qsmask & rdp->grpmask);
+	blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
 	trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
 			       blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
 	return 0;
-- 
2.37.2.789.g6183377224-goog
* Re: [PATCH rcu/next 1/3] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask

From: Frederic Weisbecker @ 2022-09-16 10:55 UTC (permalink / raw)
To: Joel Fernandes (Google)
Cc: linux-kernel, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
    Neeraj Upadhyay, Paul E. McKenney, rcu, Steven Rostedt

On Thu, Sep 15, 2022 at 12:14:17AM +0000, Joel Fernandes (Google) wrote:
> The rnp->qsmask is locklessly accessed from rcutree_dying_cpu(). This
> may help avoid load tearing due to concurrent access, KCSAN
> issues, and preserve sanity of people reading the mask in tracing.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 0ca21ac0f064..5ec97e3f7468 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2106,7 +2106,7 @@ int rcutree_dying_cpu(unsigned int cpu)
>  	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
>  		return 0;
>  
> -	blkd = !!(rnp->qsmask & rdp->grpmask);
> +	blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
>  	trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
>  			       blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
>  	return 0;

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* [PATCH rcu/next 2/3] rcu: Move trace_rcu_callback() before bypassing

From: Joel Fernandes (Google) @ 2022-09-15  0:14 UTC (permalink / raw)
To: linux-kernel
Cc: Joel Fernandes (Google), Frederic Weisbecker, Josh Triplett,
    Lai Jiangshan, Mathieu Desnoyers, Neeraj Upadhyay, Paul E. McKenney,
    rcu, Steven Rostedt

If any CB is queued into the bypass list, then trace_rcu_callback() does
not show it. This makes it unclear when a callback was actually queued,
as you only end up getting a trace_rcu_invoke_callback() trace. Fix it by
moving trace_rcu_callback() before rcu_nocb_try_bypass().

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5ec97e3f7468..9fe581be8696 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2809,10 +2809,7 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 	}
 
 	check_cb_ovld(rdp);
-	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
-		return; // Enqueued onto ->nocb_bypass, so just leave.
-	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
-	rcu_segcblist_enqueue(&rdp->cblist, head);
+
 	if (__is_kvfree_rcu_offset((unsigned long)func))
 		trace_rcu_kvfree_callback(rcu_state.name, head,
 					  (unsigned long)func,
@@ -2821,6 +2818,11 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 		trace_rcu_callback(rcu_state.name, head,
 				   rcu_segcblist_n_cbs(&rdp->cblist));
 
+	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
+		return; // Enqueued onto ->nocb_bypass, so just leave.
+	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
+	rcu_segcblist_enqueue(&rdp->cblist, head);
+
 	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
 
 	/* Go handle any RCU core processing required. */
-- 
2.37.2.789.g6183377224-goog
* Re: [PATCH rcu/next 2/3] rcu: Move trace_rcu_callback() before bypassing

From: Frederic Weisbecker @ 2022-09-16 11:09 UTC (permalink / raw)
To: Joel Fernandes (Google)
Cc: linux-kernel, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
    Neeraj Upadhyay, Paul E. McKenney, rcu, Steven Rostedt

On Thu, Sep 15, 2022 at 12:14:18AM +0000, Joel Fernandes (Google) wrote:
> If any CB is queued into the bypass list, then trace_rcu_callback() does
> not show it. This makes it not clear when a callback was actually
> queued, as you only end up getting a trace_rcu_invoke_callback() trace.
> Fix it by moving trace_rcu_callback() before
> trace_rcu_nocb_try_bypass().
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/tree.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 5ec97e3f7468..9fe581be8696 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2809,10 +2809,7 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	}
>  
>  	check_cb_ovld(rdp);
> -	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> -		return; // Enqueued onto ->nocb_bypass, so just leave.
> -	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
> -	rcu_segcblist_enqueue(&rdp->cblist, head);
> +
>  	if (__is_kvfree_rcu_offset((unsigned long)func))
>  		trace_rcu_kvfree_callback(rcu_state.name, head,
>  					  (unsigned long)func,
> @@ -2821,6 +2818,11 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
>  		trace_rcu_callback(rcu_state.name, head,
>  				   rcu_segcblist_n_cbs(&rdp->cblist));
>  
> +	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> +		return; // Enqueued onto ->nocb_bypass, so just leave.
> +	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
> +	rcu_segcblist_enqueue(&rdp->cblist, head);
> +
>  	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
>  
>  	/* Go handle any RCU core processing required. */

Two subtle changes induced here:

* rcu_segcblist_n_cbs() is now read locklessly. It's just tracing so no
  huge deal, but still, if this races with callbacks invocation we may on
  some rare occasion read stale numbers on traces while enqueuing (think
  about rcutop for example).

* trace_rcu_callback() will now show the number of callbacks _before_
  enqueuing instead of _after_. Not sure if it matters, but sometimes
  tools rely on trace events.

To avoid all that, how about a new trace_rcu_nocb_bypass() instead?

Thanks.
* Re: [PATCH rcu/next 2/3] rcu: Move trace_rcu_callback() before bypassing

From: Joel Fernandes @ 2022-09-16 14:10 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: linux-kernel, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
    Neeraj Upadhyay, Paul E. McKenney, rcu, Steven Rostedt

On Fri, Sep 16, 2022 at 01:09:49PM +0200, Frederic Weisbecker wrote:
> On Thu, Sep 15, 2022 at 12:14:18AM +0000, Joel Fernandes (Google) wrote:
> > If any CB is queued into the bypass list, then trace_rcu_callback() does
> > not show it. This makes it not clear when a callback was actually
> > queued, as you only end up getting a trace_rcu_invoke_callback() trace.
> > Fix it by moving trace_rcu_callback() before
> > trace_rcu_nocb_try_bypass().
> > 
> > [...]
> 
> Two subtle changes induced here:
> 
> * rcu_segcblist_n_cbs() is now read lockless. It's just tracing so no huge
>   deal but still, if this races with callbacks invocation, we may on some
>   rare occasion read stale numbers on traces while enqueuing (think about
>   rcu_top for example)
> 
> * trace_rcu_callback() will now show the number of callbacks _before_
>   enqueuing instead of _after_. Not sure if it matters, but sometimes
>   tools rely on trace events.
> 
> To avoid all that, how about a new trace_rcu_nocb_bypass() instead?

Great points, thanks much and you rock. How about something like the
following? That way we don't need to add yet another tracepoint:

---8<-----------------------

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Subject: [PATCH v2] rcu: Call trace_rcu_callback() also for bypassing

If any CB is queued into the bypass list, then trace_rcu_callback() does
not show it. This makes it not clear when a callback was actually
queued, as you only end up getting a trace_rcu_invoke_callback() trace.
Fix it by calling the tracing function even for bypass queue.

[Frederic: Hold lock while tracing]

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5ec97e3f7468..85609ccbb8ed 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2728,6 +2728,22 @@ static void check_cb_ovld(struct rcu_data *rdp)
 	raw_spin_unlock_rcu_node(rnp);
 }
 
+/*
+ * Trace RCU callback helper, call after enqueuing callback.
+ * The ->cblist must be locked when called.
+ */
+static void trace_rcu_callback_locked(struct rcu_head *head,
+				      struct rcu_data *rdp)
+{
+	if (__is_kvfree_rcu_offset((unsigned long)head->func))
+		trace_rcu_kvfree_callback(rcu_state.name, head,
+					  (unsigned long)head->func,
+					  rcu_segcblist_n_cbs(&rdp->cblist));
+	else
+		trace_rcu_callback(rcu_state.name, head,
+				   rcu_segcblist_n_cbs(&rdp->cblist));
+}
+
 /**
  * call_rcu() - Queue an RCU callback for invocation after a grace period.
  * @head: structure to be used for queueing the RCU updates.
@@ -2809,17 +2825,14 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 	}
 
 	check_cb_ovld(rdp);
-	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
+
+	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
+		trace_rcu_callback_locked(head, rdp);
 		return; // Enqueued onto ->nocb_bypass, so just leave.
+	}
 	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
 	rcu_segcblist_enqueue(&rdp->cblist, head);
-	if (__is_kvfree_rcu_offset((unsigned long)func))
-		trace_rcu_kvfree_callback(rcu_state.name, head,
-					  (unsigned long)func,
-					  rcu_segcblist_n_cbs(&rdp->cblist));
-	else
-		trace_rcu_callback(rcu_state.name, head,
-				   rcu_segcblist_n_cbs(&rdp->cblist));
+	trace_rcu_callback_locked(head, rdp);
 
 	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
-- 
2.37.3.968.ga6b4b080e4-goog
* Re: [PATCH rcu/next 2/3] rcu: Move trace_rcu_callback() before bypassing

From: Joel Fernandes @ 2022-09-16 14:14 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
    Neeraj Upadhyay, Paul E. McKenney, rcu, Steven Rostedt

On Fri, Sep 16, 2022 at 10:10 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> On Fri, Sep 16, 2022 at 01:09:49PM +0200, Frederic Weisbecker wrote:
> > To avoid all that, how about a new trace_rcu_nocb_bypass() instead?
>
> Great points, thanks much and you rock. How about something like the
> following? That way we don't need to add yet another trace point:
>
> [...]
>
>  	check_cb_ovld(rdp);
> -	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> +
> +	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
> +		trace_rcu_callback_locked(head, rdp);

Never mind, this is still broken. I need to move the call site into
rcu_nocb_try_bypass() to be before where it releases the lock. I just
landed, so I'll take a small break before getting back at it.

thanks,

 - Joel
* Re: [PATCH rcu/next 2/3] rcu: Move trace_rcu_callback() before bypassing

From: Joel Fernandes @ 2022-09-16 22:19 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: linux-kernel, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
    Neeraj Upadhyay, Paul E. McKenney, rcu, Steven Rostedt

On 9/16/2022 7:09 AM, Frederic Weisbecker wrote:
> On Thu, Sep 15, 2022 at 12:14:18AM +0000, Joel Fernandes (Google) wrote:
>> If any CB is queued into the bypass list, then trace_rcu_callback() does
>> not show it. This makes it not clear when a callback was actually
>> queued, as you only end up getting a trace_rcu_invoke_callback() trace.
>>
>> [...]
> 
> Two subtle changes induced here:
> 
> * rcu_segcblist_n_cbs() is now read lockless. It's just tracing so no huge
>   deal but still, if this races with callbacks invocation, we may on some
>   rare occasion read stale numbers on traces while enqueuing (think about
>   rcu_top for example)

Actually, I disagree with this point now. Changes to the number of
callbacks in the main ->cblist can be lockless. It uses atomics on
CONFIG_RCU_NOCB. On non-CONFIG_RCU_NOCB, CBs cannot be queued as
interrupts will be disabled.

Also, in rcu_do_batch(), the count is manipulated after calling
rcu_nocb_lock_irqsave(rdp, flags);

> * trace_rcu_callback() will now show the number of callbacks _before_
>   enqueuing instead of _after_. Not sure if it matters, but sometimes
>   tools rely on trace events.

Yeah, this is fixable and a good point. So how about the below?

---8<-----------------------

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Subject: [PATCH] rcu: Call trace_rcu_callback() also for bypass queuing

If any CB is queued into the bypass list, then trace_rcu_callback() does
not show it. This makes it not clear when a callback was actually
queued, as you only end up getting a trace_rcu_invoke_callback() trace.
Fix it by calling the tracing function even for bypass queue.

Also, while at it, optimize the tracing so that rcu_state is not
accessed here if tracing is disabled, because that's useless if we are
not tracing. A quick inspection of the generated assembler shows that
rcu_state is accessed even if the jump label for the tracepoint is
disabled:

__trace_rcu_callback:
	movq	8(%rdi), %rcx
	movq	rcu_state+3640(%rip), %rax
	movq	%rdi, %rdx
	cmpq	$4095, %rcx
	ja	.L3100
	movq	192(%rsi), %r8
1:	jmp	.L3101		# objtool NOPs this
	.pushsection __jump_table, "aw"
	.balign 8
	.long 1b - .
	.long .L3101 - .
	.quad __tracepoint_rcu_kvfree_callback+8 + 2 - .
	.popsection

With this change, the jump label check which is NOOPed is moved to the
beginning of the function.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5ec97e3f7468..b64df55f7f55 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2728,6 +2728,23 @@ static void check_cb_ovld(struct rcu_data *rdp)
 	raw_spin_unlock_rcu_node(rnp);
 }
 
+/*
+ * Trace RCU callback helper, call after enqueuing callback.
+ * The ->cblist must be locked when called.
+ */
+void __trace_rcu_callback(struct rcu_head *head,
+			  struct rcu_data *rdp)
+{
+	if (trace_rcu_kvfree_callback_enabled() &&
+	    __is_kvfree_rcu_offset((unsigned long)head->func))
+		trace_rcu_kvfree_callback(rcu_state.name, head,
+					  (unsigned long)head->func,
+					  rcu_segcblist_n_cbs(&rdp->cblist));
+	else if (trace_rcu_callback_enabled())
+		trace_rcu_callback(rcu_state.name, head,
+				   rcu_segcblist_n_cbs(&rdp->cblist));
+}
+
 /**
  * call_rcu() - Queue an RCU callback for invocation after a grace period.
  * @head: structure to be used for queueing the RCU updates.
@@ -2809,17 +2826,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 	}
 
 	check_cb_ovld(rdp);
-	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
+
+	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
+		__trace_rcu_callback(head, rdp);
 		return; // Enqueued onto ->nocb_bypass, so just leave.
+	}
+
 	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
 	rcu_segcblist_enqueue(&rdp->cblist, head);
-	if (__is_kvfree_rcu_offset((unsigned long)func))
-		trace_rcu_kvfree_callback(rcu_state.name, head,
-					  (unsigned long)func,
-					  rcu_segcblist_n_cbs(&rdp->cblist));
-	else
-		trace_rcu_callback(rcu_state.name, head,
-				   rcu_segcblist_n_cbs(&rdp->cblist));
+	__trace_rcu_callback(head, rdp);
 
 	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
-- 
2.37.3.968.ga6b4b080e4-goog
* Re: [PATCH rcu/next 2/3] rcu: Move trace_rcu_callback() before bypassing

From: Joel Fernandes @ 2022-09-16 22:21 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: linux-kernel, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
    Neeraj Upadhyay, Paul E. McKenney, rcu, Steven Rostedt

On Fri, Sep 16, 2022 at 10:19:14PM +0000, Joel Fernandes wrote:
[...]
> > * trace_rcu_callback() will now show the number of callbacks _before_
> >   enqueuing instead of _after_. Not sure if it matters, but sometimes
> >   tools rely on trace events.
> 
> Yeah this is fixable and good point. So how about the below?
> 
> [...]
> 
> +/*
> + * Trace RCU callback helper, call after enqueuing callback.
> + * The ->cblist must be locked when called.

Also, sorry for the spam: this comment is stale now, so I will delete
this line.
* [PATCH rcu/next 3/3] rcu: Fix late wakeup when flush of bypass cblist happens

From: Joel Fernandes (Google) @ 2022-09-15  0:14 UTC (permalink / raw)
To: linux-kernel
Cc: Joel Fernandes (Google), Frederic Weisbecker, Josh Triplett,
    Lai Jiangshan, Mathieu Desnoyers, Neeraj Upadhyay, Paul E. McKenney,
    rcu, Steven Rostedt

When the bypass cblist gets too big or its timeout has occurred, it is
flushed into the main cblist. However, the bypass timer is still running,
and the behavior is that it would eventually expire and wake the GP
thread.

Since we are going to use the bypass cblist for lazy CBs, do the wakeup
as soon as the flush happens. Otherwise, the lazy timer will go off much
later and the now-non-lazy cblist CBs can get stranded for the duration
of the timer.

This is a good thing to do anyway, since it makes the behavior consistent
with that of the other code paths, where queueing something onto the
->cblist quickly puts the GP kthread into a non-sleeping state.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree_nocb.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 0a5f0ef41484..04c87f250e01 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -433,8 +433,9 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) ||
 	    ncbs >= qhimark) {
 		rcu_nocb_lock(rdp);
+		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+
 		if (!rcu_nocb_flush_bypass(rdp, rhp, j)) {
-			*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 			if (*was_alldone)
 				trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 						    TPS("FirstQ"));
@@ -447,7 +448,12 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 			rcu_advance_cbs_nowake(rdp->mynode, rdp);
 			rdp->nocb_gp_adv_time = j;
 		}
-		rcu_nocb_unlock_irqrestore(rdp, flags);
+
+		// The flush succeeded and we moved CBs into the regular list.
+		// Don't wait for the wake up timer as it may be too far ahead.
+		// Wake up the GP thread now instead, if the cblist was empty.
+		__call_rcu_nocb_wake(rdp, *was_alldone, flags);
+
 		return true; // Callback already enqueued.
 	}
-- 
2.37.2.789.g6183377224-goog
* Re: [PATCH rcu/next 3/3] rcu: Fix late wakeup when flush of bypass cblist happens

From: Frederic Weisbecker @ 2022-09-16 12:51 UTC (permalink / raw)
To: Joel Fernandes (Google)
Cc: linux-kernel, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers,
    Neeraj Upadhyay, Paul E. McKenney, rcu, Steven Rostedt

On Thu, Sep 15, 2022 at 12:14:19AM +0000, Joel Fernandes (Google) wrote:
> When the bypass cblist gets too big or its timeout has occurred, it is
> flushed into the main cblist. However, the bypass timer is still running
> and the behavior is that it would eventually expire and wake the GP
> thread.
> 
> Since we are going to use the bypass cblist for lazy CBs, do the wakeup
> soon as the flush happens. Otherwise, the lazy-timer will go off much
> later and the now-non-lazy cblist CBs can get stranded for the duration
> of the timer.
> 
> This is a good thing to do anyway, since it makes the behavior consistent with
> behavior of other code paths where queueing something into the ->cblist makes
> the GP kthread in a non-sleeping state quickly.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>