linux-kernel.vger.kernel.org archive mirror
* Tasks RCU vs Preempt RCU
@ 2018-05-18 18:36 Joel Fernandes
  2018-05-19  2:29 ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Joel Fernandes @ 2018-05-18 18:36 UTC (permalink / raw)
  To: paulmck, rostedt, byungchul.park, mathieu.desnoyers,
	Josh Triplett, Lai Jiangshan, linux-kernel
  Cc: kernel-team

Hi,

I was thinking about Tasks RCU and why it's needed. Since preempt-RCU allows
tasks to be preempted in read-side critical sections, can we not just reuse
that mechanism for the trampolines? Since we track all preempted tasks, we
would wait on all tasks preempted within a trampoline.

I am trying to understand what would _not_ work if we did that. I'm guessing
the answer is that the trampoline would have to be wrapped with
rcu_read_{lock,unlock}, which may add some overhead, but please let me know
if I'm missing something else.

The advantage, I guess, is the possible elimination of an RCU variant, and
also possibly eliminating the Tasks RCU kthread that monitors tasks. Anyway,
I was thinking more in terms of reducing the number of RCU flavors and
overall complexity.

thanks!

- Joel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Tasks RCU vs Preempt RCU
  2018-05-18 18:36 Tasks RCU vs Preempt RCU Joel Fernandes
@ 2018-05-19  2:29 ` Paul E. McKenney
  2018-05-19 22:59   ` Joel Fernandes
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2018-05-19  2:29 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: rostedt, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team

On Fri, May 18, 2018 at 11:36:23AM -0700, Joel Fernandes wrote:
> Hi,
> 
> I was thinking about Tasks RCU and why it's needed. Since preempt-RCU allows
> tasks to be preempted in read-side critical sections, can we not just reuse
> that mechanism for the trampolines? Since we track all preempted tasks, we
> would wait on all tasks preempted within a trampoline.
> 
> I am trying to understand what would _not_ work if we did that. I'm guessing
> the answer is that the trampoline would have to be wrapped with
> rcu_read_{lock,unlock}, which may add some overhead, but please let me know
> if I'm missing something else.
> 
> The advantage, I guess, is the possible elimination of an RCU variant, and
> also possibly eliminating the Tasks RCU kthread that monitors tasks. Anyway,
> I was thinking more in terms of reducing the number of RCU flavors and
> overall complexity.

The problem is that if they are preempted while executing in a trampoline,
RCU-preempt doesn't queue them nor does it wait on them.

And the problem with wrapping them with rcu_read_{lock,unlock} is that
there would be a point at which the task was already on the trampoline
but had not yet executed rcu_read_lock().  Nothing good comes from this.  ;-)

							Thanx, Paul


* Re: Tasks RCU vs Preempt RCU
  2018-05-19  2:29 ` Paul E. McKenney
@ 2018-05-19 22:59   ` Joel Fernandes
  2018-05-20  0:49     ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Joel Fernandes @ 2018-05-19 22:59 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: rostedt, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team

On Fri, May 18, 2018 at 07:29:18PM -0700, Paul E. McKenney wrote:
> On Fri, May 18, 2018 at 11:36:23AM -0700, Joel Fernandes wrote:
> > Hi,
> > 
> > I was thinking about Tasks RCU and why it's needed. Since preempt-RCU allows
> > tasks to be preempted in read-side critical sections, can we not just reuse
> > that mechanism for the trampolines? Since we track all preempted tasks, we
> > would wait on all tasks preempted within a trampoline.
> > 
> > I am trying to understand what would _not_ work if we did that. I'm guessing
> > the answer is that the trampoline would have to be wrapped with
> > rcu_read_{lock,unlock}, which may add some overhead, but please let me know
> > if I'm missing something else.
> > 
> > The advantage, I guess, is the possible elimination of an RCU variant, and
> > also possibly eliminating the Tasks RCU kthread that monitors tasks. Anyway,
> > I was thinking more in terms of reducing the number of RCU flavors and
> > overall complexity.
> 
> The problem is that if they are preempted while executing in a trampoline,
> RCU-preempt doesn't queue them nor does it wait on them.

Not if they are wrapped with rcu_read_lock() and rcu_read_unlock()? From what
I can see, rcu_preempt_ctxt_queue() prepares a list of blocked tasks that
would keep the grace period from finishing?

> And the problem with wrapping them with rcu_read_{lock,unlock} is that
> there would be a point at which the task was already on the trampoline
> but had not yet executed rcu_read_lock().  Nothing good comes from this.  ;-)

Yes, I see what you're saying. The data being protected and freed in this
case is the code itself, so relying on it to do the rcu_read_lock() seems
infeasible. Conceptually at least, I feel this can be fixed by cleverly
implementing trampolines such that the rcu_read_lock() isn't done during the
trampoline execution. But I am not experienced enough with how the
trampolines work to say definitively whether it is or isn't possible or
worth it. At least I felt it was worthwhile food for thought ;)

I actually want to trace out the trampoline executing as it pertains to RCU,
with your latest rcu/dev.. I think it will be fun :)

thanks!

 - Joel


* Re: Tasks RCU vs Preempt RCU
  2018-05-19 22:59   ` Joel Fernandes
@ 2018-05-20  0:49     ` Paul E. McKenney
  2018-05-20  0:56       ` Joel Fernandes
  2018-05-20 15:28       ` Steven Rostedt
  0 siblings, 2 replies; 16+ messages in thread
From: Paul E. McKenney @ 2018-05-20  0:49 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: rostedt, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team

On Sat, May 19, 2018 at 03:59:05PM -0700, Joel Fernandes wrote:
> On Fri, May 18, 2018 at 07:29:18PM -0700, Paul E. McKenney wrote:
> > On Fri, May 18, 2018 at 11:36:23AM -0700, Joel Fernandes wrote:
> > > Hi,
> > > 
> > > I was thinking about Tasks RCU and why it's needed. Since preempt-RCU allows
> > > tasks to be preempted in read-side critical sections, can we not just reuse
> > > that mechanism for the trampolines? Since we track all preempted tasks, we
> > > would wait on all tasks preempted within a trampoline.
> > > 
> > > I am trying to understand what would _not_ work if we did that. I'm guessing
> > > the answer is that the trampoline would have to be wrapped with
> > > rcu_read_{lock,unlock}, which may add some overhead, but please let me know
> > > if I'm missing something else.
> > > 
> > > The advantage, I guess, is the possible elimination of an RCU variant, and
> > > also possibly eliminating the Tasks RCU kthread that monitors tasks. Anyway,
> > > I was thinking more in terms of reducing the number of RCU flavors and
> > > overall complexity.
> > 
> > The problem is that if they are preempted while executing in a trampoline,
> > RCU-preempt doesn't queue them nor does it wait on them.
> 
> Not if they are wrapped with rcu_read_lock() and rcu_read_unlock()? From what
> I can see, rcu_preempt_ctxt_queue() prepares a list of blocked tasks that
> would keep the grace period from finishing?

But being on the ->blkd_tasks list doesn't necessarily block the current
grace period.  Only those tasks on that list that are also referenced
by ->gp_tasks (or that follow some task referenced by ->gp_tasks)
will block the current grace period.  This is by design -- otherwise,
an endless stream of tasks blocking in their RCU read-side critical
sections could prevent the current grace period from ever ending.

> > And the problem with wrapping them with rcu_read_{lock,unlock} is that
> > there would be a point at which the task was already on the trampoline
> > but had not yet executed rcu_read_lock().  Nothing good comes from this.  ;-)
> 
> Yes, I see what you're saying. The data being protected and freed in this
> case is the code itself, so relying on it to do the rcu_read_lock() seems
> infeasible. Conceptually at least, I feel this can be fixed by cleverly
> implementing trampolines such that the rcu_read_lock() isn't done during the
> trampoline execution. But I am not experienced enough with how the
> trampolines work to say definitively whether it is or isn't possible or
> worth it. At least I felt it was worthwhile food for thought ;)

I suggested to Steven that the rcu_read_lock() and rcu_read_unlock() might
be outside of the trampoline, but this turned out to be infeasible.  Not
that I remember why!  ;-)

> I actually want to trace out the trampoline executing as it pertains to RCU,
> with your latest rcu/dev.. I think it will be fun :)

Cool!

In addition, if you are interested, it might be worth looking for fields
in rcu_dynticks, rcu_data, rcu_node, and rcu_state that are no longer
actually used.  It might also be worth looking for RCU macros that are
no longer used.

I found a few by accident, so there are probably more...

							Thanx, Paul


* Re: Tasks RCU vs Preempt RCU
  2018-05-20  0:49     ` Paul E. McKenney
@ 2018-05-20  0:56       ` Joel Fernandes
  2018-05-20 15:28       ` Steven Rostedt
  1 sibling, 0 replies; 16+ messages in thread
From: Joel Fernandes @ 2018-05-20  0:56 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: rostedt, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team

On Sat, May 19, 2018 at 05:49:38PM -0700, Paul E. McKenney wrote:
[...]
> > > And the problem with wrapping them with rcu_read_{lock,unlock} is that
> > > there would be a point at which the task was already on the trampoline
> > > but had not yet executed rcu_read_lock().  Nothing good comes from this.  ;-)
> > 
> > Yes, I see what you're saying. The data being protected and freed in this
> > case is the code itself, so relying on it to do the rcu_read_lock() seems
> > infeasible. Conceptually at least, I feel this can be fixed by cleverly
> > implementing trampolines such that the rcu_read_lock() isn't done during the
> > trampoline execution. But I am not experienced enough with how the
> > trampolines work to say definitively whether it is or isn't possible or
> > worth it. At least I felt it was worthwhile food for thought ;)
> 
> I suggested to Steven that the rcu_read_lock() and rcu_read_unlock() might
> be outside of the trampoline, but this turned out to be infeasible.  Not
> that I remember why!  ;-)
> 
> > I actually want to trace out the trampoline executing as it pertains to RCU,
> > with your latest rcu/dev.. I think it will be fun :)
> 
> Cool!
> 
> In addition, if you are interested, it might be worth looking for fields
> in rcu_dynticks, rcu_data, rcu_node, and rcu_state that are no longer
> actually used.  It might also be worth looking for RCU macros that are
> no longer used.

Yes, definitely interested. Will keep an eye out for such fields and macros.

thanks!

 - Joel


* Re: Tasks RCU vs Preempt RCU
  2018-05-20  0:49     ` Paul E. McKenney
  2018-05-20  0:56       ` Joel Fernandes
@ 2018-05-20 15:28       ` Steven Rostedt
  2018-05-20 19:18         ` Joel Fernandes
  1 sibling, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2018-05-20 15:28 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Joel Fernandes, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team


[ Steve interrupts his time off ]

On Sat, 19 May 2018 17:49:38 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> I suggested to Steven that the rcu_read_lock() and rcu_read_unlock() might
> be outside of the trampoline, but this turned out to be infeasible.  Not
> that I remember why!  ;-)

Because the trampoline itself is what needs to be freed. The trampoline
is what mcount/fentry or an optimized kprobe jumps to.


<func>:
	nop

[ enable function tracing ]

<func>:
	call func_tramp --> set up stack
			    call function_tracer()
			    pop stack
			    ret

			    ^^^^^
			    This is the trampoline

There's no way to know when a task will be on the trampoline or not.
The trampoline is allocated, and we need RCU_tasks to know when we can
free it. The only way to make a "wrapper" is to modify more of the code
text to do whatever before calling the trampoline, which is
impractical.

The allocated trampolines were added as an optimization: two callbacks
registered with ftrace and attached to two different functions don't have to
share the same trampoline, which would have to do a loop and a hash lookup
to know which callback to call for each function. If a callback is the only
one attached to a specific function, then a trampoline is allocated that
calls that callback directly, keeping the overhead down.

There is no feasible way to know when a task is on a trampoline
without adding overhead that negates the speed up we receive by making
individual trampolines to begin with.

-- Steve

[ Steve resumes his time off ]


* Re: Tasks RCU vs Preempt RCU
  2018-05-20 15:28       ` Steven Rostedt
@ 2018-05-20 19:18         ` Joel Fernandes
  2018-05-22  1:59           ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Joel Fernandes @ 2018-05-20 19:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, byungchul.park, mathieu.desnoyers,
	Josh Triplett, Lai Jiangshan, linux-kernel, kernel-team

On Sun, May 20, 2018 at 11:28:43AM -0400, Steven Rostedt wrote:
> 
> [ Steve interrupts his time off ]

Hope you're enjoying your vacation :)

> On Sat, 19 May 2018 17:49:38 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > I suggested to Steven that the rcu_read_lock() and rcu_read_unlock() might
> > be outside of the trampoline, but this turned out to be infeasible.  Not
> > that I remember why!  ;-)
> 
> Because the trampoline itself is what needs to be freed. The trampoline
> is what mcount/fentry or an optimized kprobe jumps to.
> 
> 
> <func>:
> 	nop
> 
> [ enable function tracing ]
> 
> <func>:
> 	call func_tramp --> set up stack
> 			    call function_tracer()
> 			    pop stack
> 			    ret
> 
> 			    ^^^^^
> 			    This is the trampoline
> 
> There's no way to know when a task will be on the trampoline or not.
> The trampoline is allocated, and we need RCU_tasks to know when we can
> free it. The only way to make a "wrapper" is to modify more of the code
> text to do whatever before calling the trampoline, which is
> impractical.
> 
> The allocated trampolines were added as an optimization: two callbacks
> registered with ftrace and attached to two different functions don't have to
> share the same trampoline, which would have to do a loop and a hash lookup
> to know which callback to call for each function. If a callback is the only
> one attached to a specific function, then a trampoline is allocated that
> calls that callback directly, keeping the overhead down.

Right, I saw your trampoline prototype tree. I understand how it works now,
thanks.

> There is no feasible way to know when a task is on a trampoline
> without adding overhead that negates the speed up we receive by making
> individual trampolines to begin with.

Are you speaking of time overhead or space overhead, or both?

Just thinking out loud and probably some food for thought..

The rcu_read_lock/unlock primitives are extremely fast, so I don't personally
think there's a time hit.

Could we get around the trampoline code == data issue by say using a
multi-stage trampoline like so? :

 	call func_tramp --> (static
			    trampoline)               (dynamic trampoline)
			    rcu_read_lock()  -------> set up stack
		 	                              call function_tracer()
 			                              pop stack
                            rcu_read_unlock() <------ ret
 
I know there's probably more to it than this, but conceptually at least, it
feels like all the RCU infrastructure is already there to handle preemption
within a trampoline, and it would be cool if the trampoline were as shown
above for the dynamically allocated trampolines. At least I feel it will be
faster than the pre-trampoline code that did the hash lookups / matching to
call the right function callbacks, and could help eliminate the need for the
RCU-tasks subsystem and its kthread then.

If you still feel it's not worth it, then that's okay too, and clearly
RCU-tasks has benefits such as a simpler trampoline implementation.

thanks!

- Joel


* Re: Tasks RCU vs Preempt RCU
  2018-05-20 19:18         ` Joel Fernandes
@ 2018-05-22  1:59           ` Steven Rostedt
  2018-05-22  4:34             ` Joel Fernandes
  2018-05-22  4:54             ` Joel Fernandes
  0 siblings, 2 replies; 16+ messages in thread
From: Steven Rostedt @ 2018-05-22  1:59 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Paul E. McKenney, byungchul.park, mathieu.desnoyers,
	Josh Triplett, Lai Jiangshan, linux-kernel, kernel-team

On Sun, 20 May 2018 12:18:46 -0700
Joel Fernandes <joel@joelfernandes.org> wrote:

> > There is no feasible way to know when a task is on a trampoline
> > without adding overhead that negates the speed up we receive by making
> > individual trampolines to begin with.  
> 
> Are you speaking of time overhead or space overhead, or both?

Both.

> 
> Just thinking out loud and probably some food for thought..
> 
> The rcu_read_lock/unlock primitives are extremely fast, so I don't personally
> think there's a time hit.
> 
> Could we get around the trampoline code == data issue by say using a
> multi-stage trampoline like so? :
> 
>  	call func_tramp --> (static
> 			    trampoline)               (dynamic trampoline)
> 			    rcu_read_lock()  -------> set up stack
> 		 	                              call function_tracer()
>  			                              pop stack
>                             rcu_read_unlock() <------ ret
>  
> I know there's probably more to it than this, but conceptually at least, it

Yes, there is more to it. Think about why we create a dynamic
trampoline. It is to make a custom call per callback for a group of
functions being traced by that callback.

Now, if we make that static trampoline, we just lost the reason for the
dynamic one. How would that work if you have 5 different users of the
callbacks (and let's not forget about optimized kprobes)? How do you
jump from the static trampoline to the dynamic one with a single call?

> feels like all the RCU infrastructure is already there to handle preemption
> within a trampoline, and it would be cool if the trampoline were as shown
> above for the dynamically allocated trampolines. At least I feel it will be
> faster than the pre-trampoline code that did the hash lookups / matching to
> call the right function callbacks, and could help eliminate the need for the
> RCU-tasks subsystem and its kthread then.

I don't see how the static trampoline would be able to call. Do we
create a static trampoline for every function that is traced and never
delete it? That's a lot of memory.

Also, we trace rcu_read_lock/unlock(), and I use that for a lot of
debugging. And we also need to deal with tracing code that RCU does not
watch, because function tracing does a lot of that too. I finally gave
up trying to have the stack tracer trace those locations, because it
was a serious game of whack a mole that would never end. I don't want
to give up full function tracing for the same reason.

> 
> If you still feel it's not worth it, then that's okay too, and clearly
> RCU-tasks has benefits such as a simpler trampoline implementation.

If you are worried about making RCU simpler, we can go back to my original
thought, which was to make a home-grown RCU-like system that we can use,
as this has different requirements than normal RCU has. Like we don't
need a "lock" at all. We just need guaranteed quiescent points that we
make sure all tasks would go through before freeing the trampolines.
But it was decided to create a new flavor of RCU instead of doing that.

-- Steve


* Re: Tasks RCU vs Preempt RCU
  2018-05-22  1:59           ` Steven Rostedt
@ 2018-05-22  4:34             ` Joel Fernandes
  2018-05-22  4:54             ` Joel Fernandes
  1 sibling, 0 replies; 16+ messages in thread
From: Joel Fernandes @ 2018-05-22  4:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Joel Fernandes (Google),
	Paul McKenney, Byungchul Park, Mathieu Desnoyers, Josh Triplett,
	Lai Jiangshan, LKML, Cc: Android Kernel

On Mon, May 21, 2018 at 6:59 PM Steven Rostedt <rostedt@goodmis.org> wrote:
[...]
> >
> > Just thinking out loud and probably some food for thought..
> >
> > The rcu_read_lock/unlock primitives are extremely fast, so I don't
> > personally think there's a time hit.
> >
> > Could we get around the trampoline code == data issue by say using a
> > multi-stage trampoline like so? :
> >
> >  	call func_tramp --> (static
> > 			    trampoline)               (dynamic trampoline)
> > 			    rcu_read_lock()  -------> set up stack
> > 		 	                              call function_tracer()
> >  			                              pop stack
> >                             rcu_read_unlock() <------ ret
> >
> > I know there's probably more to it than this, but conceptually at least, it

> Yes, there is more to it. Think about why we create a dynamic
> trampoline. It is to make a custom call per callback for a group of
> functions being traced by that callback.

> Now, if we make that static trampoline, we just lost the reason for the
> dynamic one. How would that work if you have 5 different users of the
> callbacks (and let's not forget about optimized kprobes)? How do you
> jump from the static trampoline to the dynamic one with a single call?

> > feels like all the RCU infrastructure is already there to handle
> > preemption within a trampoline, and it would be cool if the trampoline
> > were as shown above for the dynamically allocated trampolines. At least
> > I feel it will be faster than the pre-trampoline code that did the hash
> > lookups / matching to call the right function callbacks, and could help
> > eliminate the need for the RCU-tasks subsystem and its kthread then.

> I don't see how the static trampoline would be able to call. Do we
> create a static trampoline for every function that is traced and never
> delete it? That's a lot of memory.

Yeah, ok. That was a dumb idea. :) I see it defeats the point.

> Also, we trace rcu_read_lock/unlock(), and I use that for a lot of
> debugging. And we also need to deal with tracing code that RCU does not
> watch, because function tracing does a lot of that too. I finally gave
> up trying to have the stack tracer trace those locations, because it
> was a serious game of whack a mole that would never end. I don't want
> to give up full function tracing for the same reason.

Yes, I understand. It's good to not have it depend on too many things which
may limit its utility.

> > If you still feel it's not worth it, then that's okay too, and clearly
> > RCU-tasks has benefits such as a simpler trampoline implementation.

> If you are worried about making RCU simpler, we can go back to my original
> thought, which was to make a home-grown RCU-like system that we can use,
> as this has different requirements than normal RCU has. Like we don't
> need a "lock" at all. We just need guaranteed quiescent points that we
> make sure all tasks would go through before freeing the trampolines.
> But it was decided to create a new flavor of RCU instead of doing that.

Yes, let's brainstorm this if you like. One way I was thinking: we could
manually check every CPU and see what state it's in (usermode, kernel, idle,
etc.) using an IPI mechanism. Once all CPUs have been seen to be in usermode
or idle at least once, then we are done. You have probably already thought
about this, so feel free to say why it's not a good idea, but to me there are
3 places that a task's quiescent state is recorded: during the timer tick,
during task sleep, and during cond_resched_tasks_qs. Of these, I feel only
the cond_resched case isn't trackable with an IPI mechanism.

thanks,

- Joel


* Re: Tasks RCU vs Preempt RCU
  2018-05-22  1:59           ` Steven Rostedt
  2018-05-22  4:34             ` Joel Fernandes
@ 2018-05-22  4:54             ` Joel Fernandes
  2018-05-22 12:38               ` Steven Rostedt
  1 sibling, 1 reply; 16+ messages in thread
From: Joel Fernandes @ 2018-05-22  4:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, byungchul.park, mathieu.desnoyers,
	Josh Triplett, Lai Jiangshan, linux-kernel, kernel-team

(Resending because my previous mail client terribly wrapped things..)

On Mon, May 21, 2018 at 09:59:51PM -0400, Steven Rostedt wrote:
[...] 
> > 
> > Just thinking out loud and probably some food for thought..
> > 
> > The rcu_read_lock/unlock primitives are extremely fast, so I don't personally
> > think there's a time hit.
> > 
> > Could we get around the trampoline code == data issue by say using a
> > multi-stage trampoline like so? :
> > 
> >  	call func_tramp --> (static
> > 			    trampoline)               (dynamic trampoline)
> > 			    rcu_read_lock()  -------> set up stack
> > 		 	                              call function_tracer()
> >  			                              pop stack
> >                             rcu_read_unlock() <------ ret
> >  
> > I know there's probably more to it than this, but conceptually at least, it
> 
> Yes, there is more to it. Think about why we create a dynamic
> trampoline. It is to make a custom call per callback for a group of
> functions being traced by that callback.
> 
> Now, if we make that static trampoline, we just lost the reason for the
> dynamic one. How would that work if you have 5 different users of the
> callbacks (and lets not forget about optimized kprobes)? How do you
> jump from the static trampoline to the dynamic one with a single call?
> 
> > feels like all the RCU infrastructure is already there to handle preemption
> > within a trampoline, and it would be cool if the trampoline were as shown
> > above for the dynamically allocated trampolines. At least I feel it will be
> > faster than the pre-trampoline code that did the hash lookups / matching to
> > call the right function callbacks, and could help eliminate the need for the
> > RCU-tasks subsystem and its kthread then.
> 
> I don't see how the static trampoline would be able to call. Do we
> create a static trampoline for every function that is traced and never
> delete it? That's a lot of memory.

Yeah, ok. I agree that was a dumb idea. :) I see it defeats the point.

> Also, we trace rcu_read_lock/unlock(), and I use that for a lot of
> debugging. And we also need to deal with tracing code that RCU does not
> watch, because function tracing does a lot of that too. I finally gave
> up trying to have the stack tracer trace those locations, because it
> was a serious game of whack a mole that would never end. I don't want
> to give up full function tracing for the same reason.

Yes, I understand. It's good to not have it depend on too many things which
may limit its utility.

> > If you still feel it's not worth it, then that's okay too, and clearly
> > RCU-tasks has benefits such as a simpler trampoline implementation.
> 
> If you are worried about making RCU simpler, we can go back to my original
> thought, which was to make a home-grown RCU-like system that we can use,
> as this has different requirements than normal RCU has. Like we don't
> need a "lock" at all. We just need guaranteed quiescent points that we
> make sure all tasks would go through before freeing the trampolines.
> But it was decided to create a new flavor of RCU instead of doing that.

Yes, let's brainstorm this if you like. One way I was thinking: we could
manually check every CPU and see what state it's in (usermode, kernel, idle,
etc.) using an IPI mechanism. Once all CPUs have been seen to be in usermode
or idle at least once, then we are done. You have probably already thought
about this, so feel free to say why it's not a good idea, but to me there are
3 places that a task's quiescent state is recorded: during the timer tick,
during task sleep, and during rcu_note_voluntary_context_switch in
cond_resched_rcu_qs. Of these, I feel only the cond_resched_rcu_qs case isn't
trackable with an IPI mechanism, which may make the detection a bit slower,
but tasks-RCU in mainline is slow right now anyway (~1 second delay if any
task was held).

thanks,

 - Joel


* Re: Tasks RCU vs Preempt RCU
  2018-05-22  4:54             ` Joel Fernandes
@ 2018-05-22 12:38               ` Steven Rostedt
  2018-05-22 16:09                 ` Paul E. McKenney
  2018-05-23  3:10                 ` Joel Fernandes
  0 siblings, 2 replies; 16+ messages in thread
From: Steven Rostedt @ 2018-05-22 12:38 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Paul E. McKenney, byungchul.park, mathieu.desnoyers,
	Josh Triplett, Lai Jiangshan, linux-kernel, kernel-team

On Mon, 21 May 2018 21:54:14 -0700
Joel Fernandes <joel@joelfernandes.org> wrote:


> Yes, let's brainstorm this if you like. One way I was thinking: we could
> manually check every CPU and see what state it's in (usermode, kernel, idle,
> etc.) using an IPI mechanism. Once all CPUs have been seen to be in usermode
> or idle at least once, then we are done. You have probably already thought

Nope, it has nothing to do with CPUs, it really has to do with tasks.

	CPU0
	----
 task 1: (pinned to CPU 0)
	 call func_tracer_trampoline
	 [on trampoline]
	 [timer tick, schedule ]

 task 2: (higher priority, also pinned to CPU 0)
	 goes to user space
	 [ Runs for a long time ]

We cannot free the trampoline until task 2 releases the CPU and lets
task 1 run again to get off the CPU.

> about this, so feel free to say why it's not a good idea, but to me there are
> 3 places that a task's quiescent state is recorded: during the timer tick,
> during task sleep, and during rcu_note_voluntary_context_switch in
> cond_resched_rcu_qs. Of these, I feel only the cond_resched_rcu_qs case isn't
> trackable with an IPI mechanism, which may make the detection a bit slower,
> but tasks-RCU in mainline is slow right now anyway (~1 second delay if any
> task was held).


The way I was originally going to handle this was with a per-task
counter that could be incremented at certain points via tracepoints.

Thus my synchronize-tasks would have connected to a bunch of
tracepoints at known quiescent states that would increment the counter,
and then checked each task until they all pass a certain point, or are in
a quiescent state (userspace or idle). But this would be doing much of
what RCU does today, and that is why we decided to hook into the RCU
infrastructure.

I have to ask, what's your motivation for getting rid of RCU tasks?

-- Steve


* Re: Tasks RCU vs Preempt RCU
  2018-05-22 12:38               ` Steven Rostedt
@ 2018-05-22 16:09                 ` Paul E. McKenney
  2018-05-22 17:27                   ` Steven Rostedt
  2018-05-23  3:10                 ` Joel Fernandes
  1 sibling, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2018-05-22 16:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Joel Fernandes, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team

On Tue, May 22, 2018 at 08:38:32AM -0400, Steven Rostedt wrote:
> On Mon, 21 May 2018 21:54:14 -0700
> Joel Fernandes <joel@joelfernandes.org> wrote:
> 
> 
> > Yes, let's brainstorm this if you like. One way I was thinking of: we could
> > manually check every CPU and see what state it's in (usermode, kernel, idle,
> > etc.) using an IPI mechanism. Once all CPUs have been seen to be in usermode
> > or idle at least once, then we are done. You have probably already thought
> 
> Nope, it has nothing to do with CPUs, it really has to do with tasks.
> 
> 	CPU0
> 	----
>  task 1: (pinned to CPU 0)
> 	 call func_tracer_trampoline
> 	 [on trampoline]
> 	 [timer tick, schedule ]
> 
>  task 2: (higher priority, also pinned to CPU 0)
> 	 goes to user space
> 	 [ Runs for a long time ]
> 
> We cannot free the trampoline until task 2 releases the CPU and lets
> task 1 run again to get off the CPU.

What Steven said!  IPIs get to CPUs, but we need to handle the
(unlikely, but very real) case where a bunch of tasks are preempted
within trampolines.

> > about this, so feel free to say why it's not a good idea, but to me there are 3
> > places where a task's quiescent state is recorded: during the timer tick,
> > during task sleep, and during rcu_note_voluntary_context_switch in
> > cond_resched_rcu_qs. Of these, I feel only the cond_resched_rcu_qs case isn't
> > trackable with an IPI mechanism, which may make the detection a bit slower, but
> > tasks-RCU in mainline is slow right now anyway (~1 second delay if any task
> > was held).
> 
> The way I was originally going to handle this was with a per-task
> counter that would be incremented at certain points via tracepoints.
> 
> Thus my synchronize-tasks call would have connected to a bunch of
> tracepoints at known quiescent states that would increment the counter,
> and then checked each task until they all pass a certain point or are in
> a quiescent state (userspace or idle). But this would be doing much of
> what RCU does today, and that is why we decided to hook into the RCU
> infrastructure.

Just for the record, if you guys really want to take over Tasks RCU,
I have no objections.  For one thing, I don't anticipate any other use
cases for it (famous last words!).  But you break it, you buy it!  ;-)

							Thanx, Paul

> I have to ask, what's your motivation for getting rid of RCU tasks?
> 
> -- Steve
> 

* Re: Tasks RCU vs Preempt RCU
  2018-05-22 16:09                 ` Paul E. McKenney
@ 2018-05-22 17:27                   ` Steven Rostedt
  2018-05-22 17:47                     ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2018-05-22 17:27 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Joel Fernandes, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team

On Tue, 22 May 2018 09:09:49 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> Just for the record, if you guys really want to take over Tasks RCU,
> I have no objections.  For one thing, I don't anticipate any other use
> cases for it (famous last words!).  But you break it, you buy it!  ;-)

It really depends on how much of a burden RCU_tasks is to RCU itself. If
it causes a lot of headaches for you, and it prevents you from cleaning
up RCU or making it better, then I would be happy to take it out of RCU
and maintain it separately myself. But if that's not the case, I'm happy
to keep it within the RCU umbrella. Which brings me to the
question: what motivation does Joel have to remove it?

-- Steve

* Re: Tasks RCU vs Preempt RCU
  2018-05-22 17:27                   ` Steven Rostedt
@ 2018-05-22 17:47                     ` Paul E. McKenney
  2018-05-23  1:19                       ` Joel Fernandes
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2018-05-22 17:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Joel Fernandes, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team

On Tue, May 22, 2018 at 01:27:00PM -0400, Steven Rostedt wrote:
> On Tue, 22 May 2018 09:09:49 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Just for the record, if you guys really want to take over Tasks RCU,
> > I have no objections.  For one thing, I don't anticipate any other use
> > cases for it (famous last words!).  But you break it, you buy it!  ;-)
> 
> It really depends on how much of a burden RCU_tasks is to RCU itself. If
> it causes a lot of headaches for you, and it prevents you from cleaning
> up RCU or making it better, then I would be happy to take it out of RCU
> and maintain it separately myself. But if that's not the case, I'm happy
> to keep it within the RCU umbrella. Which brings me to the
> question: what motivation does Joel have to remove it?

The burden on me from Tasks RCU has been quite light, so no need for a
change from my end.

Over to you, Joel!  ;-)

							Thanx, Paul

* Re: Tasks RCU vs Preempt RCU
  2018-05-22 17:47                     ` Paul E. McKenney
@ 2018-05-23  1:19                       ` Joel Fernandes
  0 siblings, 0 replies; 16+ messages in thread
From: Joel Fernandes @ 2018-05-23  1:19 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Steven Rostedt, byungchul.park, mathieu.desnoyers, Josh Triplett,
	Lai Jiangshan, linux-kernel, kernel-team

On Tue, May 22, 2018 at 10:47:11AM -0700, Paul E. McKenney wrote:
> On Tue, May 22, 2018 at 01:27:00PM -0400, Steven Rostedt wrote:
> > On Tue, 22 May 2018 09:09:49 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > Just for the record, if you guys really want to take over Tasks RCU,
> > > I have no objections.  For one thing, I don't anticipate any other use
> > > cases for it (famous last words!).  But you break it, you buy it!  ;-)
> > 
> > It really depends on how much of a burden RCU_tasks is to RCU itself. If
> > it causes a lot of headaches for you, and it prevents you from cleaning
> > up RCU or making it better, then I would be happy to take it out of RCU
> > and maintain it separately myself. But if that's not the case, I'm happy
> > to keep it within the RCU umbrella. Which brings me to the
> > question: what motivation does Joel have to remove it?
> 
> The burden on me from Tasks RCU has been quite light, so no need for a
> change from my end.
> 
> Over to you, Joel!  ;-)

My motivation was that I felt RCU-preempt already did the same thing (which I
still believe it does), so it's redundant. Although now I'm convinced from our
earlier discussions that it's not feasible to do an rcu_read_lock and
rcu_read_unlock in trampoline code.

Sorry, I didn't mean you should really nuke RCU-tasks if it has a purpose; I
was more trying to understand what purpose it serves that RCU-preempt doesn't.
That's all.

And welcome back from vacation, Steve. I'm about to send v7 of my preempt/irq
tracepoint patches, so the timing seems great. I hope you will be able to take
a look at them. Thanks!

 - Joel

* Re: Tasks RCU vs Preempt RCU
  2018-05-22 12:38               ` Steven Rostedt
  2018-05-22 16:09                 ` Paul E. McKenney
@ 2018-05-23  3:10                 ` Joel Fernandes
  1 sibling, 0 replies; 16+ messages in thread
From: Joel Fernandes @ 2018-05-23  3:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, byungchul.park, mathieu.desnoyers,
	Josh Triplett, Lai Jiangshan, linux-kernel, kernel-team

On Tue, May 22, 2018 at 08:38:32AM -0400, Steven Rostedt wrote:
> On Mon, 21 May 2018 21:54:14 -0700
> Joel Fernandes <joel@joelfernandes.org> wrote:
> 
> 
> > Yes, let's brainstorm this if you like. One way I was thinking of: we could
> > manually check every CPU and see what state it's in (usermode, kernel, idle,
> > etc.) using an IPI mechanism. Once all CPUs have been seen to be in usermode
> > or idle at least once, then we are done. You have probably already thought
> 
> Nope, it has nothing to do with CPUs, it really has to do with tasks.
> 
> 	CPU0
> 	----
>  task 1: (pinned to CPU 0)
> 	 call func_tracer_trampoline
> 	 [on trampoline]
> 	 [timer tick, schedule ]
> 
>  task 2: (higher priority, also pinned to CPU 0)
> 	 goes to user space
> 	 [ Runs for a long time ]
> 
> We cannot free the trampoline until task 2 releases the CPU and lets
> task 1 run again to get off the CPU.

Right, I totally missed that. Of course, it's not sufficient to see if a CPU is
in usermode. :(

> > about this, so feel free to say why it's not a good idea, but to me there are 3
> > places where a task's quiescent state is recorded: during the timer tick,
> > during task sleep, and during rcu_note_voluntary_context_switch in
> > cond_resched_rcu_qs. Of these, I feel only the cond_resched_rcu_qs case isn't
> > trackable with an IPI mechanism, which may make the detection a bit slower, but
> > tasks-RCU in mainline is slow right now anyway (~1 second delay if any task
> > was held).
> 
> 
> The way I was originally going to handle this was with a per-task
> counter that would be incremented at certain points via tracepoints.
> 
> Thus my synchronize-tasks call would have connected to a bunch of
> tracepoints at known quiescent states that would increment the counter,
> and then checked each task until they all pass a certain point or are in
> a quiescent state (userspace or idle). But this would be doing much of
> what RCU does today, and that is why we decided to hook into the RCU
> infrastructure.
> 
> I have to ask, what's your motivation for getting rid of RCU tasks?

Again, my motivation was just to see if we could use RCU-preempt. Linus was
talking about unifying the RCU variants/API better. I believe Paul is well on
top of that task, and I have been helping as I can with his recent series for
that. Since RCU-preempt doesn't suit this job, it looks like we have to keep
RCU tasks around. That said, I sent you a patch to speed up RCU-tasks a bit;
could you take a look? Paul was asking for your input: https://lkml.org/lkml/2018/5/20/263

Welcome back from your vacation, hope it was fun!

thanks,

 - Joel

end of thread, other threads:[~2018-05-23  3:10 UTC | newest]

Thread overview: 16+ messages
2018-05-18 18:36 Tasks RCU vs Preempt RCU Joel Fernandes
2018-05-19  2:29 ` Paul E. McKenney
2018-05-19 22:59   ` Joel Fernandes
2018-05-20  0:49     ` Paul E. McKenney
2018-05-20  0:56       ` Joel Fernandes
2018-05-20 15:28       ` Steven Rostedt
2018-05-20 19:18         ` Joel Fernandes
2018-05-22  1:59           ` Steven Rostedt
2018-05-22  4:34             ` Joel Fernandes
2018-05-22  4:54             ` Joel Fernandes
2018-05-22 12:38               ` Steven Rostedt
2018-05-22 16:09                 ` Paul E. McKenney
2018-05-22 17:27                   ` Steven Rostedt
2018-05-22 17:47                     ` Paul E. McKenney
2018-05-23  1:19                       ` Joel Fernandes
2018-05-23  3:10                 ` Joel Fernandes
