* [PATCH rcu/next 0/3] Preparatory patches borrowed from lazy rcu v5
@ 2022-09-17 16:41 Joel Fernandes (Google)
  2022-09-17 16:41 ` [PATCH rcu/next 1/3] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask Joel Fernandes (Google)
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Joel Fernandes (Google) @ 2022-09-17 16:41 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, rushikesh.s.kadam, urezki, neeraj.iitr10, frederic,
	paulmck, rostedt, Joel Fernandes (Google)

Hello,

Just sending a few patches ahead of lazy rcu v6, which can come out
after more testing to check the feasibility of the ideas discussed at
LPC. I will try to get rcutop pushed out as well before that, but in the
meanwhile I don't see much reason to hold these few patches back from
6.1. They apply to Paul's rcu/next branch. What do you think?

For Resend:
- Added Reviewed-by tags from Frederic on 1/3 and 3/3. For 2/3, I
  implemented the changes from the last discussion.
- Better commit messages.

Joel Fernandes (Google) (3):
rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask
rcu: Fix late wakeup when flush of bypass cblist happens (v6)
rcu: Call trace_rcu_callback() also for bypass queuing (v2)

kernel/rcu/tree.c      | 32 +++++++++++++++++++++++---------
kernel/rcu/tree_nocb.h | 10 ++++++++--
2 files changed, 31 insertions(+), 11 deletions(-)

--
2.37.3.968.ga6b4b080e4-goog


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH rcu/next 1/3] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask
  2022-09-17 16:41 [PATCH rcu/next 0/3] Preparatory patches borrowed from lazy rcu v5 Joel Fernandes (Google)
@ 2022-09-17 16:41 ` Joel Fernandes (Google)
  2022-09-17 16:41 ` [PATCH rcu/next 2/3] rcu: Fix late wakeup when flush of bypass cblist happens (v6) Joel Fernandes (Google)
  2022-09-17 16:42 ` [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2) Joel Fernandes (Google)
  2 siblings, 0 replies; 9+ messages in thread
From: Joel Fernandes (Google) @ 2022-09-17 16:41 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, rushikesh.s.kadam, urezki, neeraj.iitr10, frederic,
	paulmck, rostedt, Joel Fernandes (Google)

rnp->qsmask is accessed locklessly from rcutree_dying_cpu(). Use
READ_ONCE() for the read, which helps avoid load tearing due to
concurrent access, silences KCSAN warnings, and preserves the sanity of
people reading the mask in traces.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0ca21ac0f064..5ec97e3f7468 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2106,7 +2106,7 @@ int rcutree_dying_cpu(unsigned int cpu)
 	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
 		return 0;
 
-	blkd = !!(rnp->qsmask & rdp->grpmask);
+	blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
 	trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
 			       blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
 	return 0;
-- 
2.37.3.968.ga6b4b080e4-goog



* [PATCH rcu/next 2/3] rcu: Fix late wakeup when flush of bypass cblist happens (v6)
  2022-09-17 16:41 [PATCH rcu/next 0/3] Preparatory patches borrowed from lazy rcu v5 Joel Fernandes (Google)
  2022-09-17 16:41 ` [PATCH rcu/next 1/3] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask Joel Fernandes (Google)
@ 2022-09-17 16:41 ` Joel Fernandes (Google)
  2022-09-21 14:54   ` Paul E. McKenney
  2022-09-17 16:42 ` [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2) Joel Fernandes (Google)
  2 siblings, 1 reply; 9+ messages in thread
From: Joel Fernandes (Google) @ 2022-09-17 16:41 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, rushikesh.s.kadam, urezki, neeraj.iitr10, frederic,
	paulmck, rostedt, Joel Fernandes (Google)

When the bypass cblist gets too big or its timeout has occurred, it is
flushed into the main cblist. However, the bypass timer is still
running, and it will eventually expire and wake the GP kthread.

Since we are going to use the bypass cblist for lazy CBs, do the wakeup
as soon as the flush of the "too big or too long" bypass list happens.
Otherwise, long delays can happen for callbacks which get promoted from
lazy to non-lazy.

This is a good thing to do anyway (regardless of the future lazy
patches), since it makes the behavior consistent with the other code
paths where flushing into the ->cblist quickly brings the GP kthread
out of its sleeping state.

[ Frederic Weisbecker: changes to not wake up the GP kthread unless
		       needed, comment changes ].

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree_nocb.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 0a5f0ef41484..04c87f250e01 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -433,8 +433,9 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) ||
 	    ncbs >= qhimark) {
 		rcu_nocb_lock(rdp);
+		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+
 		if (!rcu_nocb_flush_bypass(rdp, rhp, j)) {
-			*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 			if (*was_alldone)
 				trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 						    TPS("FirstQ"));
@@ -447,7 +448,12 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 			rcu_advance_cbs_nowake(rdp->mynode, rdp);
 			rdp->nocb_gp_adv_time = j;
 		}
-		rcu_nocb_unlock_irqrestore(rdp, flags);
+
+		// The flush succeeded and we moved CBs into the regular list.
+		// Don't wait for the wake up timer as it may be too far ahead.
+		// Wake up the GP thread now instead, if the cblist was empty.
+		__call_rcu_nocb_wake(rdp, *was_alldone, flags);
+
 		return true; // Callback already enqueued.
 	}
 
-- 
2.37.3.968.ga6b4b080e4-goog



* [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2)
  2022-09-17 16:41 [PATCH rcu/next 0/3] Preparatory patches borrowed from lazy rcu v5 Joel Fernandes (Google)
  2022-09-17 16:41 ` [PATCH rcu/next 1/3] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask Joel Fernandes (Google)
  2022-09-17 16:41 ` [PATCH rcu/next 2/3] rcu: Fix late wakeup when flush of bypass cblist happens (v6) Joel Fernandes (Google)
@ 2022-09-17 16:42 ` Joel Fernandes (Google)
  2022-09-17 19:58   ` Frederic Weisbecker
  2 siblings, 1 reply; 9+ messages in thread
From: Joel Fernandes (Google) @ 2022-09-17 16:42 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, rushikesh.s.kadam, urezki, neeraj.iitr10, frederic,
	paulmck, rostedt, Joel Fernandes (Google)

If a callback is queued onto the bypass list, trace_rcu_callback()
does not show it. This makes it unclear when the callback was actually
queued, as the resulting trace only includes an rcu_invoke_callback
event. Fix it by calling the tracing function even when queuing onto
the bypass list. This is needed for the future rcutop tool, which
monitors enqueuing of callbacks.

Note that, in the bypass-queuing case, the new tracing happens without
the nocb_lock held. This should be OK since, on CONFIG_RCU_NOCB_CPU
systems, the number of callbacks is represented by an atomic counter.
Also, other paths such as rcu_barrier() sample the total number of
callbacks without holding the nocb_lock.

Also, while at it, optimize the tracing so that rcu_state is not
accessed when tracing is disabled, since that access is useless if we
are not tracing. A quick inspection of the generated assembler shows
that rcu_state is accessed even when the jump label for the tracepoint
is disabled.

Here is the gcc -S output of the bad asm (note that I un-inlined the
function just for testing and illustration; the final
__trace_rcu_callback in the patch is marked static inline):

__trace_rcu_callback:
        movq    8(%rdi), %rcx
        movq    rcu_state+3640(%rip), %rax
        movq    %rdi, %rdx
        cmpq    $4095, %rcx
        ja      .L3100
        movq    192(%rsi), %r8
        1:jmp .L3101 # objtool NOPs this
        .pushsection __jump_table,  "aw"
         .balign 8
        .long 1b - .
        .long .L3101 - .
         .quad __tracepoint_rcu_kvfree_callback+8 + 2 - .
        .popsection

With this change, the jump label check which is NOOPed is moved to the
beginning of the function.
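The shape of the fix can be sketched in plain C (illustrative names
only; the kernel uses a NOOPable static-branch jump label rather than a
plain bool): test the cheap "enabled?" flag first, so the disabled path
never touches the expensive state at all.

```c
#include <stdbool.h>
#include <stdio.h>

static bool tracing_enabled;	/* stands in for the jump label */
static int state_accesses;	/* counts touches of the "rcu_state" analog */

static long read_expensive_state(void)
{
	state_accesses++;	/* the access we want to skip when disabled */
	return 42;
}

/* Bad shape: computes the trace arguments before the enabled check,
 * so the expensive state is read even when tracing is off. */
void trace_cb_bad(void)
{
	long s = read_expensive_state();

	if (tracing_enabled)
		printf("cb state=%ld\n", s);
}

/* Fixed shape: the enabled check comes first; nothing is computed
 * on the disabled path. */
void trace_cb_good(void)
{
	if (!tracing_enabled)
		return;
	printf("cb state=%ld\n", read_expensive_state());
}
```

This mirrors what trace_rcu_kvfree_callback_enabled() and
trace_rcu_callback_enabled() accomplish in the patch below.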

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/rcu/tree.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5ec97e3f7468..18f07e167d5e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2728,6 +2728,22 @@ static void check_cb_ovld(struct rcu_data *rdp)
 	raw_spin_unlock_rcu_node(rnp);
 }
 
+/*
+ * Trace RCU callback helper, call after enqueuing callback.
+ */
+static inline void __trace_rcu_callback(struct rcu_head *head,
+				      struct rcu_data *rdp)
+{
+	if (trace_rcu_kvfree_callback_enabled() &&
+	    __is_kvfree_rcu_offset((unsigned long)head->func))
+		trace_rcu_kvfree_callback(rcu_state.name, head,
+					 (unsigned long)head->func,
+					 rcu_segcblist_n_cbs(&rdp->cblist));
+	else if (trace_rcu_callback_enabled())
+		trace_rcu_callback(rcu_state.name, head,
+				   rcu_segcblist_n_cbs(&rdp->cblist));
+}
+
 /**
  * call_rcu() - Queue an RCU callback for invocation after a grace period.
  * @head: structure to be used for queueing the RCU updates.
@@ -2809,17 +2825,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 	}
 
 	check_cb_ovld(rdp);
-	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
+
+	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
+		__trace_rcu_callback(head, rdp);
 		return; // Enqueued onto ->nocb_bypass, so just leave.
+	}
+
 	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
 	rcu_segcblist_enqueue(&rdp->cblist, head);
-	if (__is_kvfree_rcu_offset((unsigned long)func))
-		trace_rcu_kvfree_callback(rcu_state.name, head,
-					 (unsigned long)func,
-					 rcu_segcblist_n_cbs(&rdp->cblist));
-	else
-		trace_rcu_callback(rcu_state.name, head,
-				   rcu_segcblist_n_cbs(&rdp->cblist));
+	__trace_rcu_callback(head, rdp);
 
 	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
 
-- 
2.37.3.968.ga6b4b080e4-goog



* Re: [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2)
  2022-09-17 16:42 ` [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2) Joel Fernandes (Google)
@ 2022-09-17 19:58   ` Frederic Weisbecker
  2022-09-17 21:43     ` Joel Fernandes
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2022-09-17 19:58 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: rcu, linux-kernel, rushikesh.s.kadam, urezki, neeraj.iitr10,
	paulmck, rostedt

On Sat, Sep 17, 2022 at 04:42:00PM +0000, Joel Fernandes (Google) wrote:
> @@ -2809,17 +2825,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	}
>  
>  	check_cb_ovld(rdp);
> -	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> +
> +	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
> +		__trace_rcu_callback(head, rdp);
>  		return; // Enqueued onto ->nocb_bypass, so just leave.
> +	}

I think the bypass enqueues should be treated differently, either by extending
the current trace_rcu_callback/trace_rcu_kvfree_callback (which might break
tools) or by creating a new trace_rcu_callback_bypass()/trace_rcu_kvfree_callback_bypass().

Those could later be paired with a trace_rcu_bypass_flush().

Thanks.


> +
>  	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
>  	rcu_segcblist_enqueue(&rdp->cblist, head);
> -	if (__is_kvfree_rcu_offset((unsigned long)func))
> -		trace_rcu_kvfree_callback(rcu_state.name, head,
> -					 (unsigned long)func,
> -					 rcu_segcblist_n_cbs(&rdp->cblist));
> -	else
> -		trace_rcu_callback(rcu_state.name, head,
> -				   rcu_segcblist_n_cbs(&rdp->cblist));
> +	__trace_rcu_callback(head, rdp);
>  
>  	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
>  
> -- 
> 2.37.3.968.ga6b4b080e4-goog
> 


* Re: [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2)
  2022-09-17 19:58   ` Frederic Weisbecker
@ 2022-09-17 21:43     ` Joel Fernandes
  2022-09-17 22:21       ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Joel Fernandes @ 2022-09-17 21:43 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Neeraj upadhyay, Paul E. McKenney, rcu, Steven Rostedt,
	Rushikesh S Kadam, Uladzislau Rezki (Sony)

On Sat, Sep 17, 2022 at 3:58 PM Frederic Weisbecker <frederic@kernel.org> wrote:
>
> On Sat, Sep 17, 2022 at 04:42:00PM +0000, Joel Fernandes (Google) wrote:
> > @@ -2809,17 +2825,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
> >       }
> >
> >       check_cb_ovld(rdp);
> > -     if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> > +
> > +     if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
> > +             __trace_rcu_callback(head, rdp);
> >               return; // Enqueued onto ->nocb_bypass, so just leave.
> > +     }
>
> I think the bypass enqueues should be treated differently. Either with extending
> the current trace_rcu_callback/trace_rcu_kvfree_callback (might break tools)
>
> or
> with creating a new trace_rcu_callback_bypass()/trace_rcu_kvfree_callback_bypass().
>
> Those could later be paired with a trace_rcu_bypass_flush().

I am having a hard time seeing why it should be treated differently.
We already increment the length of the main segcblist even when
bypassing. Why not just call the tracepoint instead of omitting it?
Otherwise it becomes a bit confusing IMO: say someone does not enable
your proposed new bypass tracepoint and only enables the existing one;
they would then see weird traces where call_rcu() is called but
trace_rcu_callback is missing. Not to mention, your suggestion will
also complicate writing tools that use the existing rcu_callback
tracepoint to monitor call_rcu().

Also, see the definition of rcu_callback: "Tracepoint for the
registration of a single RCU callback function." That pretty much
fits the usage here.

As for the flushing, I don’t care about tracing it with tracepoints at
the moment, but I don’t mind if that is added later.

Maybe let’s let Paul help resolve our disagreement on this one? :)

thanks,

 - Joel

>
>
> Thanks.
>
>
> > +
> >       // If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
> >       rcu_segcblist_enqueue(&rdp->cblist, head);
> > -     if (__is_kvfree_rcu_offset((unsigned long)func))
> > -             trace_rcu_kvfree_callback(rcu_state.name, head,
> > -                                      (unsigned long)func,
> > -                                      rcu_segcblist_n_cbs(&rdp->cblist));
> > -     else
> > -             trace_rcu_callback(rcu_state.name, head,
> > -                                rcu_segcblist_n_cbs(&rdp->cblist));
> > +     __trace_rcu_callback(head, rdp);
> >
> >       trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
> >
> > --
> > 2.37.3.968.ga6b4b080e4-goog
> >


* Re: [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2)
  2022-09-17 21:43     ` Joel Fernandes
@ 2022-09-17 22:21       ` Frederic Weisbecker
  2022-09-17 22:42         ` Joel Fernandes
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2022-09-17 22:21 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: LKML, Neeraj upadhyay, Paul E. McKenney, rcu, Steven Rostedt,
	Rushikesh S Kadam, Uladzislau Rezki (Sony)

On Sat, Sep 17, 2022 at 05:43:06PM -0400, Joel Fernandes wrote:
> On Sat, Sep 17, 2022 at 3:58 PM Frederic Weisbecker <frederic@kernel.org> wrote:
> >
> > On Sat, Sep 17, 2022 at 04:42:00PM +0000, Joel Fernandes (Google) wrote:
> > > @@ -2809,17 +2825,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
> > >       }
> > >
> > >       check_cb_ovld(rdp);
> > > -     if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> > > +
> > > +     if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
> > > +             __trace_rcu_callback(head, rdp);
> > >               return; // Enqueued onto ->nocb_bypass, so just leave.
> > > +     }
> >
> > I think the bypass enqueues should be treated differently. Either with extending
> > the current trace_rcu_callback/trace_rcu_kvfree_callback (might break tools)
> >
> > or
> > with creating a new trace_rcu_callback_bypass()/trace_rcu_kvfree_callback_bypass().
> >
> > Those could later be paired with a trace_rcu_bypass_flush().
> 
> I am having a hard time seeing why it should be treated differently.
> We already increment the length of the main segcblist even when
> bypassing. Why not just call the trace point instead of omitting it?

I'm not suggesting to omit it. I'm suggesting to improve its precision.

> Otherwise it becomes a bit confusing IMO (say someone does not enable
> your proposed new bypass tracepoint and only enables the existing one,
> then they would see weird traces where call_rcu is called but their
> traces are missing trace_rcu_callback).

Well, if they decided to see only half of the information...

> Not to mention - your
> suggestion will also complicate writing tools that use the existing
> rcu_callback tracepoint to monitor call_rcu().

If we add another tracepoint, its prototype will be the same as the
existing one's, so there are not many lines to add. If instead we extend
the existing tracepoint, it's merely a flag to check or ignore.

OTOH your suggestion doesn't provide any bypass-related information.

> 
> Also if you see the definition of rcu_callback, "Tracepoint for the
> registration of a single RCU callback function.".  That pretty much
> fits the usage here.

Doesn't tell if it's a bypass or not.

> 
> As for tracing of the flushing, I don’t care about tracing that at the
> moment using tracepoints

You will soon enough ;-)

> but I don’t mind if it is added later.
> Maybe let’s let Paul help resolve our disagreement on this one? :)

FWIW, I would be personally interested in such tracepoints (or the extension
of the existing ones, whichever way you guys prefer); they would be of great
help for debugging.

Also, if rcutop is ever released, I really hope the kernel will be ready in
case we want the tool to display bypass-related information.

Please be careful when designing tracepoints that may be consumed by released
userspace tools. Such tracepoints eventually turn into ABI, and there is no
way back after that.

Thanks.


* Re: [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2)
  2022-09-17 22:21       ` Frederic Weisbecker
@ 2022-09-17 22:42         ` Joel Fernandes
  0 siblings, 0 replies; 9+ messages in thread
From: Joel Fernandes @ 2022-09-17 22:42 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Neeraj upadhyay, Paul E. McKenney, rcu, Steven Rostedt,
	Rushikesh S Kadam, Uladzislau Rezki (Sony)

On Sat, Sep 17, 2022 at 6:21 PM Frederic Weisbecker <frederic@kernel.org> wrote:
>
> On Sat, Sep 17, 2022 at 05:43:06PM -0400, Joel Fernandes wrote:
> > On Sat, Sep 17, 2022 at 3:58 PM Frederic Weisbecker <frederic@kernel.org> wrote:
> > >
> > > On Sat, Sep 17, 2022 at 04:42:00PM +0000, Joel Fernandes (Google) wrote:
> > > > @@ -2809,17 +2825,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > >       }
> > > >
> > > >       check_cb_ovld(rdp);
> > > > -     if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> > > > +
> > > > +     if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
> > > > +             __trace_rcu_callback(head, rdp);
> > > >               return; // Enqueued onto ->nocb_bypass, so just leave.
> > > > +     }
> > >
> > > I think the bypass enqueues should be treated differently. Either with extending
> > > the current trace_rcu_callback/trace_rcu_kvfree_callback (might break tools)
> > >
> > > or
> > > with creating a new trace_rcu_callback_bypass()/trace_rcu_kvfree_callback_bypass().
> > >
> > > Those could later be paired with a trace_rcu_bypass_flush().
> >
> > I am having a hard time seeing why it should be treated differently.
> > We already increment the length of the main segcblist even when
> > bypassing. Why not just call the trace point instead of omitting it?
>
> I'm not suggesting to omit it. I'm suggesting to improve its precision.

That's exactly what I'm doing :-). It is imprecise the way it is; by
calling it where it needs to be called (not omitting it), I am making
it more precise.

> > Otherwise it becomes a bit confusing IMO (say someone does not enable
> > your proposed new bypass tracepoint and only enables the existing one,
> > then they would see weird traces where call_rcu is called but their
> > traces are missing trace_rcu_callback).
>
> Well, if they decided to see only half of the information...

It is not that they decide; there are lots of RCU tracepoints, and it
is likely common to enable just a few of them.

> > Not to mention - your
> > suggestion will also complicate writing tools that use the existing
> > rcu_callback tracepoint to monitor call_rcu().
>
> If we add another tracepoint, the prototype will be the same as the
> existing one, not many lines to add. If instead we extend the existing
> tracepoint, it's merely just a flag to check or ignore.
>
> OTOH your suggestion doesn't provide any bypass related information.

Bypass-related information is not relevant to this patch. I am already
using trace_rcu_callback() in my (yet to be released) rcutop, and I
don't use it for any bypass information.

> > Also if you see the definition of rcu_callback, "Tracepoint for the
> > registration of a single RCU callback function.".  That pretty much
> > fits the usage here.
>
> Doesn't tell if it's a bypass or not.

It doesn't tell a lot of things, so what? Saying whether it is a bypass
is not the point of this patch.

> > As for tracing of the flushing, I don’t care about tracing that at the
> > moment using tracepoints
>
> You will soon enough ;-)

I respect your experience in the matter :-)

> > but I don’t mind if it is added later.
> > Maybe let’s let Paul help resolve our disagreement on this one? :)
>
> FWIW, I would be personally interested in such tracepoints (or the extension

Understood.

> of the existing ones, whichever way you guys prefer), they would be of great help
> for debugging.
>
> Also if rcu_top is ever released, I really hope the kernel will be ready in
> case we want the tool to display bypass related informations.

I feel the main issue you have with my patch is that it does not add
the information you want; however, that information is beyond the scope
of this patch and can be added in future/different patches. This patch
only fixes an *existing* broken tracepoint.

I can certainly add a new bypass-related tracepoint in the future, but
I don't see how that's relevant to my patch.

> Please be careful while designing tracepoints that may be consumed by userspace
> released tools. Such tracepoints eventually turn into ABI and there is no way
> back after that.

Sure thing, that's why I'm fixing the broken tracepoint; currently,
some registered callbacks are invisible to the user.

 - Joel


* Re: [PATCH rcu/next 2/3] rcu: Fix late wakeup when flush of bypass cblist happens (v6)
  2022-09-17 16:41 ` [PATCH rcu/next 2/3] rcu: Fix late wakeup when flush of bypass cblist happens (v6) Joel Fernandes (Google)
@ 2022-09-21 14:54   ` Paul E. McKenney
  0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2022-09-21 14:54 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: rcu, linux-kernel, rushikesh.s.kadam, urezki, neeraj.iitr10,
	frederic, rostedt

On Sat, Sep 17, 2022 at 04:41:59PM +0000, Joel Fernandes (Google) wrote:
> When the bypass cblist gets too big or its timeout has occurred, it is
> flushed into the main cblist. However, the bypass timer is still running
> and the behavior is that it would eventually expire and wake the GP
> thread.
> 
> Since we are going to use the bypass cblist for lazy CBs, do the wakeup
> soon as the flush for "too big or too long" bypass list happens.
> Otherwise, long delays can happen for callbacks which get promoted from
> lazy to non-lazy.
> 
> This is a good thing to do anyway (regardless of future lazy patches),
> since it makes the behavior consistent with behavior of other code paths
> where flushing into the ->cblist makes the GP kthread into a
> non-sleeping state quickly.
> 
> [ Frederic Weisbec: changes to not do wake up GP thread unless needed,
> 		    comment changes ].
> 
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Queued and pushed this and 1/3, thank you both!

							Thanx, Paul

> ---
>  kernel/rcu/tree_nocb.h | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 0a5f0ef41484..04c87f250e01 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -433,8 +433,9 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>  	if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) ||
>  	    ncbs >= qhimark) {
>  		rcu_nocb_lock(rdp);
> +		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
> +
>  		if (!rcu_nocb_flush_bypass(rdp, rhp, j)) {
> -			*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
>  			if (*was_alldone)
>  				trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>  						    TPS("FirstQ"));
> @@ -447,7 +448,12 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>  			rcu_advance_cbs_nowake(rdp->mynode, rdp);
>  			rdp->nocb_gp_adv_time = j;
>  		}
> -		rcu_nocb_unlock_irqrestore(rdp, flags);
> +
> +		// The flush succeeded and we moved CBs into the regular list.
> +		// Don't wait for the wake up timer as it may be too far ahead.
> +		// Wake up the GP thread now instead, if the cblist was empty.
> +		__call_rcu_nocb_wake(rdp, *was_alldone, flags);
> +
>  		return true; // Callback already enqueued.
>  	}
>  
> -- 
> 2.37.3.968.ga6b4b080e4-goog
> 


Thread overview: 9+ messages
2022-09-17 16:41 [PATCH rcu/next 0/3] Preparatory patches borrowed from lazy rcu v5 Joel Fernandes (Google)
2022-09-17 16:41 ` [PATCH rcu/next 1/3] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask Joel Fernandes (Google)
2022-09-17 16:41 ` [PATCH rcu/next 2/3] rcu: Fix late wakeup when flush of bypass cblist happens (v6) Joel Fernandes (Google)
2022-09-21 14:54   ` Paul E. McKenney
2022-09-17 16:42 ` [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2) Joel Fernandes (Google)
2022-09-17 19:58   ` Frederic Weisbecker
2022-09-17 21:43     ` Joel Fernandes
2022-09-17 22:21       ` Frederic Weisbecker
2022-09-17 22:42         ` Joel Fernandes
