LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched()
@ 2018-08-09 20:05 Steven Rostedt
  2018-08-10 11:35 ` Oleg Nesterov
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2018-08-09 20:05 UTC (permalink / raw)
  To: LKML
  Cc: zhangwei(Jovi),
	Masami Hiramatsu, Oleg Nesterov, Namhyung Kim, Andrew Morton,
	stable


From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

While debugging another bug, I was looking at all the synchronize*()
functions being used in kernel/trace, and noticed that trace_uprobes was
using synchronize_sched(), with a comment to synchronize with
{u,ret}_probe_trace_func(). When looking at those functions, the data is
protected with "rcu_read_lock()" and not with "rcu_read_lock_sched()". This
is using the wrong synchronize_*() function.

Cc: stable@vger.kernel.org
Fixes: 70ed91c6ec7f8 ("tracing/uprobes: Support ftrace_event_file base multibuffer")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_uprobe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index bf89a51e740d..ac02fafc9f1b 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -952,7 +952,7 @@ probe_event_disable(struct trace_uprobe *tu, struct trace_event_file *file)
 
 		list_del_rcu(&link->list);
 		/* synchronize with u{,ret}probe_trace_func */
-		synchronize_sched();
+		synchronize_rcu();
 		kfree(link);
 
 		if (!list_empty(&tu->tp.files))
-- 
2.13.6


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched()
  2018-08-09 20:05 [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched() Steven Rostedt
@ 2018-08-10 11:35 ` Oleg Nesterov
  2018-08-10 11:42   ` Oleg Nesterov
  2018-08-10 12:48   ` Steven Rostedt
  0 siblings, 2 replies; 7+ messages in thread
From: Oleg Nesterov @ 2018-08-10 11:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, zhangwei(Jovi),
	Masami Hiramatsu, Namhyung Kim, Andrew Morton, stable

On 08/09, Steven Rostedt wrote:
>
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -952,7 +952,7 @@ probe_event_disable(struct trace_uprobe *tu, struct trace_event_file *file)
>
>  		list_del_rcu(&link->list);
>  		/* synchronize with u{,ret}probe_trace_func */
> -		synchronize_sched();
> +		synchronize_rcu();

Can't we change uprobe_trace_func() and uretprobe_trace_func() to use
rcu_read_lock_sched() instead? It is more cheap.


Hmm. probe_event_enable() does list_del + kfree on failure, this doesn't
look right... Not only because kfree() can race with list_for_each_entry_rcu(),
we should not put the 1st link on list until uprobe_buffer_enable().

Does the patch below make sense or I am confused?

Oleg.


--- x/kernel/trace/trace_uprobe.c
+++ x/kernel/trace/trace_uprobe.c
@@ -896,8 +896,6 @@ probe_event_enable(struct trace_uprobe *
 			return -ENOMEM;
 
 		link->file = file;
-		list_add_tail_rcu(&link->list, &tu->tp.files);
-
 		tu->tp.flags |= TP_FLAG_TRACE;
 	} else {
 		if (tu->tp.flags & TP_FLAG_TRACE)
@@ -909,7 +907,7 @@ probe_event_enable(struct trace_uprobe *
 	WARN_ON(!uprobe_filter_is_empty(&tu->filter));
 
 	if (enabled)
-		return 0;
+		goto add;
 
 	ret = uprobe_buffer_enable();
 	if (ret)
@@ -920,7 +918,8 @@ probe_event_enable(struct trace_uprobe *
 	ret = uprobe_register(tu->inode, tu->offset, &tu->consumer);
 	if (ret)
 		goto err_buffer;
-
+ add:
+	list_add_tail_rcu(&link->list, &tu->tp.files);
 	return 0;
 
  err_buffer:
@@ -928,7 +927,6 @@ probe_event_enable(struct trace_uprobe *
 
  err_flags:
 	if (file) {
-		list_del(&link->list);
 		kfree(link);
 		tu->tp.flags &= ~TP_FLAG_TRACE;
 	} else {


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched()
  2018-08-10 11:35 ` Oleg Nesterov
@ 2018-08-10 11:42   ` Oleg Nesterov
  2018-08-10 12:48   ` Steven Rostedt
  1 sibling, 0 replies; 7+ messages in thread
From: Oleg Nesterov @ 2018-08-10 11:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, zhangwei(Jovi),
	Masami Hiramatsu, Namhyung Kim, Andrew Morton, stable

On 08/10, Oleg Nesterov wrote:
>
> @@ -920,7 +918,8 @@ probe_event_enable(struct trace_uprobe *
>  	ret = uprobe_register(tu->inode, tu->offset, &tu->consumer);
>  	if (ret)
>  		goto err_buffer;
> -
> + add:
> +	list_add_tail_rcu(&link->list, &tu->tp.files);

	if (link)
		list_add_tail_rcu(&link->list, &tu->tp.files);

Oleg.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched()
  2018-08-10 11:35 ` Oleg Nesterov
  2018-08-10 11:42   ` Oleg Nesterov
@ 2018-08-10 12:48   ` Steven Rostedt
  2018-08-10 13:36     ` Oleg Nesterov
  1 sibling, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2018-08-10 12:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, zhangwei(Jovi),
	Masami Hiramatsu, Namhyung Kim, Andrew Morton, stable

On Fri, 10 Aug 2018 13:35:49 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> On 08/09, Steven Rostedt wrote:
> >
> > --- a/kernel/trace/trace_uprobe.c
> > +++ b/kernel/trace/trace_uprobe.c
> > @@ -952,7 +952,7 @@ probe_event_disable(struct trace_uprobe *tu, struct trace_event_file *file)
> >
> >  		list_del_rcu(&link->list);
> >  		/* synchronize with u{,ret}probe_trace_func */
> > -		synchronize_sched();
> > +		synchronize_rcu();  
> 
> Can't we change uprobe_trace_func() and uretprobe_trace_func() to use
> rcu_read_lock_sched() instead? It is more cheap.

Is it? rcu_read_lock_sched() is a preempt_disable(), where
rcu_read_lock() may just be a task counter increment.

> 
> 
> Hmm. probe_event_enable() does list_del + kfree on failure, this doesn't
> look right... Not only because kfree() can race with list_for_each_entry_rcu(),
> we should not put the 1st link on list until uprobe_buffer_enable().
> 
> Does the patch below make sense or I am confused?

I guess the question is, if it isn't enabled, are there any users or
even past users still running. If not, then I think the current code is
OK, as there shouldn't be anything happening to race with it.

-- Steve

> 
> Oleg.
> 
> 
> --- x/kernel/trace/trace_uprobe.c
> +++ x/kernel/trace/trace_uprobe.c
> @@ -896,8 +896,6 @@ probe_event_enable(struct trace_uprobe *
>  			return -ENOMEM;
>  
>  		link->file = file;
> -		list_add_tail_rcu(&link->list, &tu->tp.files);
> -
>  		tu->tp.flags |= TP_FLAG_TRACE;
>  	} else {
>  		if (tu->tp.flags & TP_FLAG_TRACE)
> @@ -909,7 +907,7 @@ probe_event_enable(struct trace_uprobe *
>  	WARN_ON(!uprobe_filter_is_empty(&tu->filter));
>  
>  	if (enabled)
> -		return 0;
> +		goto add;
>  
>  	ret = uprobe_buffer_enable();
>  	if (ret)
> @@ -920,7 +918,8 @@ probe_event_enable(struct trace_uprobe *
>  	ret = uprobe_register(tu->inode, tu->offset, &tu->consumer);
>  	if (ret)
>  		goto err_buffer;
> -
> + add:
> +	list_add_tail_rcu(&link->list, &tu->tp.files);
>  	return 0;
>  
>   err_buffer:
> @@ -928,7 +927,6 @@ probe_event_enable(struct trace_uprobe *
>  
>   err_flags:
>  	if (file) {
> -		list_del(&link->list);
>  		kfree(link);
>  		tu->tp.flags &= ~TP_FLAG_TRACE;
>  	} else {


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched()
  2018-08-10 12:48   ` Steven Rostedt
@ 2018-08-10 13:36     ` Oleg Nesterov
  2018-08-10 13:44       ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Nesterov @ 2018-08-10 13:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, zhangwei(Jovi),
	Masami Hiramatsu, Namhyung Kim, Andrew Morton, stable

On 08/10, Steven Rostedt wrote:
>
> On Fri, 10 Aug 2018 13:35:49 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > On 08/09, Steven Rostedt wrote:
> > >
> > > --- a/kernel/trace/trace_uprobe.c
> > > +++ b/kernel/trace/trace_uprobe.c
> > > @@ -952,7 +952,7 @@ probe_event_disable(struct trace_uprobe *tu, struct trace_event_file *file)
> > >
> > >  		list_del_rcu(&link->list);
> > >  		/* synchronize with u{,ret}probe_trace_func */
> > > -		synchronize_sched();
> > > +		synchronize_rcu();
> >
> > Can't we change uprobe_trace_func() and uretprobe_trace_func() to use
> > rcu_read_lock_sched() instead? It is more cheap.
>
> Is it? rcu_read_lock_sched() is a preempt_disable(),

which is just raw_cpu_inc()

> where
> rcu_read_lock() may just be a task counter increment.

and  __rcu_read_unlock() is more heavy.

OK, I agree, this doesn't really matter.

> > Hmm. probe_event_enable() does list_del + kfree on failure, this doesn't
> > look right... Not only because kfree() can race with list_for_each_entry_rcu(),
> > we should not put the 1st link on list until uprobe_buffer_enable().
> >
> > Does the patch below make sense or I am confused?
>
> I guess the question is, if it isn't enabled, are there any users or
> even past users still running.

Note that uprobe_register() is not "atomic".

To simplify, suppose we have 2 tasks T1 and T2 running the probed binary.
So we are going to do install_breakpoint(T1->mm) + install_breakpoint(T2->mm).
If the 2nd install_breakpoint() fails for any reason, _register() will do
remove_breakpoint(T1->mm) and return the error.

However, T1 can hit this bp right after install_breakpoint(T1->mm), so it
can call uprobe_trace_func() before list_del(&link->list).

OK, even if I am right this is mostly theoretical.

Oleg.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched()
  2018-08-10 13:36     ` Oleg Nesterov
@ 2018-08-10 13:44       ` Steven Rostedt
  2018-08-10 14:06         ` Oleg Nesterov
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2018-08-10 13:44 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: LKML, Masami Hiramatsu, Namhyung Kim, Andrew Morton, stable


[ Removing Jovi as his email is bouncing ]

On Fri, 10 Aug 2018 15:36:08 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> > > Can't we change uprobe_trace_func() and uretprobe_trace_func() to use
> > > rcu_read_lock_sched() instead? It is more cheap.  
> >
> > Is it? rcu_read_lock_sched() is a preempt_disable(),  
> 
> which is just raw_cpu_inc()
> 
> > where
> > rcu_read_lock() may just be a task counter increment.  
> 
> and  __rcu_read_unlock() is more heavy.
> 
> OK, I agree, this doesn't really matter.

Are you OK with this patch? I set it for stable and plan on pushing it
with the patches for the upcoming merge window. If you are OK, mind
giving me an Acked or Reviewed-by?

> 
> > > Hmm. probe_event_enable() does list_del + kfree on failure, this doesn't
> > > look right... Not only because kfree() can race with list_for_each_entry_rcu(),
> > > we should not put the 1st link on list until uprobe_buffer_enable().
> > >
> > > Does the patch below make sense or I am confused?  
> >
> > I guess the question is, if it isn't enabled, are there any users or
> > even past users still running.  
> 
> Note that uprobe_register() is not "atomic".
> 
> To simplify, suppose we have 2 tasks T1 and T2 running the probed binary.
> So we are going to do install_breakpoint(T1->mm) + install_breakpoint(T2->mm).
> If the 2nd install_breakpoint() fails for any reason, _register() will do
> remove_breakpoint(T1->mm) and return the error.
> 
> However, T1 can hit this bp right after install_breakpoint(T1->mm), so it
> can call uprobe_trace_func() before list_del(&link->list).
> 
> OK, even if I am right this is mostly theoretical.

Even if it is theoretical, we should make sure it can't happen. But
this is unrelated to the current patch, and if we should fix this, then
it can be a separate patch. I don't think your change hurts, and even
if it can't technically happen, it may let us sleep better at night.
Want to send a formal patch to make this change?

-- Steve

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched()
  2018-08-10 13:44       ` Steven Rostedt
@ 2018-08-10 14:06         ` Oleg Nesterov
  0 siblings, 0 replies; 7+ messages in thread
From: Oleg Nesterov @ 2018-08-10 14:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Masami Hiramatsu, Namhyung Kim, Andrew Morton, stable

On 08/10, Steven Rostedt wrote:
>
> Are you OK with this patch? I set it for stable and plan on pushing it
> with the patches for the upcoming merge window. If you are OK, mind
> giving me an Acked or Reviewed-by?

Yes, thanks, feel free to add

Acked-by: Oleg Nesterov <oleg@redhat.com>

> Even if it is theoretical, we should make sure it can't happen. But
> this is unrelated to the current patch,

agreed.

> and if we should fix this, then
> it can be a separate patch. I don't think your change hurts, and even
> if it can't technically happen, it may let us sleep better at night.
> Want to send a formal patch to make this change?

OK, will do.

Oleg.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-09 20:05 [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched() Steven Rostedt
2018-08-10 11:35 ` Oleg Nesterov
2018-08-10 11:42   ` Oleg Nesterov
2018-08-10 12:48   ` Steven Rostedt
2018-08-10 13:36     ` Oleg Nesterov
2018-08-10 13:44       ` Steven Rostedt
2018-08-10 14:06         ` Oleg Nesterov

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox