linux-kernel.vger.kernel.org archive mirror
* [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
@ 2021-06-29 13:55 Steven Rostedt
  2021-06-29 14:16 ` [syzbot] WARNING in tracepoint_add_func syzbot
  2021-07-07 22:12 ` [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing Andrii Nakryiko
  0 siblings, 2 replies; 11+ messages in thread
From: Steven Rostedt @ 2021-06-29 13:55 UTC (permalink / raw)
  To: LKML, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Mathieu Desnoyers, Ingo Molnar, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

All internal use cases for tracepoint_probe_register() are set up to never
call it with the same function and data. If that happens, it is considered
a bug, as it means the accounting of handling tracepoints is corrupted.
If the function and data for a tracepoint are already registered when
tracepoint_probe_register() is called, it will call WARN_ON_ONCE() and
return with EEXIST.

The BPF system call can end up calling tracepoint_probe_register() with
the same data, which now means that this can trigger the warning because
of a user space process. As WARN_ON_ONCE() should not be called because
user space called a system call with bad data, there needs to be a way to
register a tracepoint without triggering a warning.

Enter tracepoint_probe_register_may_exist(), which can be called, but will
not cause a WARN_ON() if the probe already exists. It will still error out
with EEXIST, which will then be sent to the user space that performed the
BPF system call.

This keeps the previous testing for issues with other users of the
tracepoint code, while letting BPF call it with duplicated data and not
warn about it.

Link: https://lore.kernel.org/lkml/20210626135845.4080-1-penguin-kernel@I-love.SAKURA.ne.jp/
Link: https://syzkaller.appspot.com/bug?id=41f4318cf01762389f4d1c1c459da4f542fe5153 [1]

Cc: stable@vger.kernel.org
Fixes: c4f6699dfcb85 ("bpf: introduce BPF_RAW_TRACEPOINT")
Reported-by: syzbot <syzbot+721aa903751db87aa244@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

 include/linux/tracepoint.h | 10 ++++++++++
 kernel/trace/bpf_trace.c   |  3 ++-
 kernel/tracepoint.c        | 33 ++++++++++++++++++++++++++++++---
 3 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 13f65420f188..ab58696d0ddd 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -41,7 +41,17 @@ extern int
 tracepoint_probe_register_prio(struct tracepoint *tp, void *probe, void *data,
 			       int prio);
 extern int
+tracepoint_probe_register_prio_may_exist(struct tracepoint *tp, void *probe, void *data,
+					 int prio);
+extern int
 tracepoint_probe_unregister(struct tracepoint *tp, void *probe, void *data);
+static inline int
+tracepoint_probe_register_may_exist(struct tracepoint *tp, void *probe,
+				    void *data)
+{
+	return tracepoint_probe_register_prio_may_exist(tp, probe, data,
+							TRACEPOINT_DEFAULT_PRIO);
+}
 extern void
 for_each_kernel_tracepoint(void (*fct)(struct tracepoint *tp, void *priv),
 		void *priv);
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 7a52bc172841..f0568b3d6bd1 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1840,7 +1840,8 @@ static int __bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *
 	if (prog->aux->max_tp_access > btp->writable_size)
 		return -EINVAL;
 
-	return tracepoint_probe_register(tp, (void *)btp->bpf_func, prog);
+	return tracepoint_probe_register_may_exist(tp, (void *)btp->bpf_func,
+						   prog);
 }
 
 int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 9f478d29b926..976bf8ce8039 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -273,7 +273,8 @@ static void tracepoint_update_call(struct tracepoint *tp, struct tracepoint_func
  * Add the probe function to a tracepoint.
  */
 static int tracepoint_add_func(struct tracepoint *tp,
-			       struct tracepoint_func *func, int prio)
+			       struct tracepoint_func *func, int prio,
+			       bool warn)
 {
 	struct tracepoint_func *old, *tp_funcs;
 	int ret;
@@ -288,7 +289,7 @@ static int tracepoint_add_func(struct tracepoint *tp,
 			lockdep_is_held(&tracepoints_mutex));
 	old = func_add(&tp_funcs, func, prio);
 	if (IS_ERR(old)) {
-		WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM);
+		WARN_ON_ONCE(warn && PTR_ERR(old) != -ENOMEM);
 		return PTR_ERR(old);
 	}
 
@@ -343,6 +344,32 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 	return 0;
 }
 
+/**
+ * tracepoint_probe_register_prio_may_exist -  Connect a probe to a tracepoint with priority
+ * @tp: tracepoint
+ * @probe: probe handler
+ * @data: tracepoint data
+ * @prio: priority of this function over other registered functions
+ *
+ * Same as tracepoint_probe_register_prio() except that it will not warn
+ * if the tracepoint is already registered.
+ */
+int tracepoint_probe_register_prio_may_exist(struct tracepoint *tp, void *probe,
+					     void *data, int prio)
+{
+	struct tracepoint_func tp_func;
+	int ret;
+
+	mutex_lock(&tracepoints_mutex);
+	tp_func.func = probe;
+	tp_func.data = data;
+	tp_func.prio = prio;
+	ret = tracepoint_add_func(tp, &tp_func, prio, false);
+	mutex_unlock(&tracepoints_mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(tracepoint_probe_register_prio_may_exist);
+
 /**
  * tracepoint_probe_register_prio -  Connect a probe to a tracepoint with priority
  * @tp: tracepoint
@@ -366,7 +393,7 @@ int tracepoint_probe_register_prio(struct tracepoint *tp, void *probe,
 	tp_func.func = probe;
 	tp_func.data = data;
 	tp_func.prio = prio;
-	ret = tracepoint_add_func(tp, &tp_func, prio);
+	ret = tracepoint_add_func(tp, &tp_func, prio, true);
 	mutex_unlock(&tracepoints_mutex);
 	return ret;
 }
-- 
2.29.2
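
For readers following the change, the effect of the new "warn" argument can
be sketched in plain user-space C. This is only an illustration under
stated assumptions, not the kernel code: struct probe_table, probe_add(),
probe_register() and probe_register_may_exist() are hypothetical names, a
fixed-size array replaces the kernel's probe array, and a "warned" flag
stands in for WARN_ON_ONCE() firing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PROBES 8

/* Hypothetical stand-in for a registered probe: identified by (func, data). */
struct probe_entry {
	void *func;
	void *data;
};

struct probe_table {
	struct probe_entry entries[MAX_PROBES];
	size_t count;
	bool warned;	/* models WARN_ON_ONCE() having fired */
};

/* Returns 0 on success, -EEXIST on a duplicate, -ENOMEM if the table is full. */
static int probe_add(struct probe_table *t, void *func, void *data, bool warn)
{
	int ret = 0;
	size_t i;

	assert(t != NULL);

	for (i = 0; i < t->count; i++) {
		if (t->entries[i].func == func && t->entries[i].data == data) {
			ret = -EEXIST;
			break;
		}
	}
	if (!ret && t->count == MAX_PROBES)
		ret = -ENOMEM;

	if (ret) {
		/* Same condition as WARN_ON_ONCE(warn && err != -ENOMEM) in the patch. */
		if (warn && ret != -ENOMEM)
			t->warned = true;
		return ret;
	}

	t->entries[t->count].func = func;
	t->entries[t->count].data = data;
	t->count++;
	return 0;
}

/* Strict variant: a duplicate is treated as a caller bug and warns. */
static int probe_register(struct probe_table *t, void *func, void *data)
{
	return probe_add(t, func, data, true);
}

/* Lenient variant: a duplicate still fails with -EEXIST, but silently. */
static int probe_register_may_exist(struct probe_table *t, void *func, void *data)
{
	return probe_add(t, func, data, false);
}
```

In this model both variants reject the duplicate with -EEXIST; the only
difference is whether the "warned" flag gets set, which matches the intent
described in the changelog above.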


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [syzbot] WARNING in tracepoint_add_func
  2021-06-29 13:55 [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing Steven Rostedt
@ 2021-06-29 14:16 ` syzbot
  2021-07-07 22:12 ` [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing Andrii Nakryiko
  1 sibling, 0 replies; 11+ messages in thread
From: syzbot @ 2021-06-29 14:16 UTC (permalink / raw)
  To: andrii, ast, bpf, daniel, linux-kernel, mathieu.desnoyers, mingo,
	netdev, penguin-kernel, rostedt, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+721aa903751db87aa244@syzkaller.appspotmail.com

Tested on:

commit:         c54b245d Merge branch 'for-linus' of git://git.kernel.org/..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=b55ee8fdb0113c34
dashboard link: https://syzkaller.appspot.com/bug?extid=721aa903751db87aa244
compiler:       
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17938e5fd00000

Note: testing is done by a robot and is best-effort only.

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-06-29 13:55 [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing Steven Rostedt
  2021-06-29 14:16 ` [syzbot] WARNING in tracepoint_add_func syzbot
@ 2021-07-07 22:12 ` Andrii Nakryiko
  2021-07-07 22:45   ` Steven Rostedt
  1 sibling, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2021-07-07 22:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Mathieu Desnoyers, Ingo Molnar, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf

On Tue, Jun 29, 2021 at 6:55 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
>
> All internal use cases for tracepoint_probe_register() are set up to never
> call it with the same function and data. If that happens, it is considered
> a bug, as it means the accounting of handling tracepoints is corrupted.
> If the function and data for a tracepoint are already registered when
> tracepoint_probe_register() is called, it will call WARN_ON_ONCE() and
> return with EEXIST.
>
> The BPF system call can end up calling tracepoint_probe_register() with
> the same data, which now means that this can trigger the warning because
> of a user space process. As WARN_ON_ONCE() should not be called because
> user space called a system call with bad data, there needs to be a way to
> register a tracepoint without triggering a warning.
>
> Enter tracepoint_probe_register_may_exist(), which can be called, but will
> not cause a WARN_ON() if the probe already exists. It will still error out
> with EEXIST, which will then be sent to the user space that performed the
> BPF system call.
>
> This keeps the previous testing for issues with other users of the
> tracepoint code, while letting BPF call it with duplicated data and not
> warn about it.

There doesn't seem to be anything conceptually wrong with attaching
the same BPF program twice to the same tracepoint. Is it a hard
requirement to have a unique tp+callback combination, or was it done
mostly to detect an API misuse? How hard is it to support such use
cases?

I was surprised to discover this is not supported (though I never had
a use for this, had to construct a test to see the warning).

>
> Link: https://lore.kernel.org/lkml/20210626135845.4080-1-penguin-kernel@I-love.SAKURA.ne.jp/
> Link: https://syzkaller.appspot.com/bug?id=41f4318cf01762389f4d1c1c459da4f542fe5153 [1]
>
> Cc: stable@vger.kernel.org
> Fixes: c4f6699dfcb85 ("bpf: introduce BPF_RAW_TRACEPOINT")
> Reported-by: syzbot <syzbot+721aa903751db87aa244@syzkaller.appspotmail.com>
> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>
> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>
>  include/linux/tracepoint.h | 10 ++++++++++
>  kernel/trace/bpf_trace.c   |  3 ++-
>  kernel/tracepoint.c        | 33 ++++++++++++++++++++++++++++++---
>  3 files changed, 42 insertions(+), 4 deletions(-)
>

[...]

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-07-07 22:12 ` [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing Andrii Nakryiko
@ 2021-07-07 22:45   ` Steven Rostedt
  2021-07-07 23:49     ` Andrii Nakryiko
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2021-07-07 22:45 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: LKML, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Mathieu Desnoyers, Ingo Molnar, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf

On Wed, 7 Jul 2021 15:12:28 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> There doesn't seem to be anything conceptually wrong with attaching
> the same BPF program twice to the same tracepoint. Is it a hard
> requirement to have a unique tp+callback combination, or was it done
> mostly to detect an API misuse? How hard is it to support such use
> cases?
> 
> I was surprised to discover this is not supported (though I never had
> a use for this, had to construct a test to see the warning).

The callback is identified by the function and its data combination. If
there's two callbacks calling the same function with the same data on
the same tracepoint, one question is, why? And the second is how do you
differentiate the two?

-- Steve

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-07-07 22:45   ` Steven Rostedt
@ 2021-07-07 23:49     ` Andrii Nakryiko
  2021-07-08  0:05       ` Steven Rostedt
  0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2021-07-07 23:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Mathieu Desnoyers, Ingo Molnar, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf

On Wed, Jul 7, 2021 at 3:45 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 7 Jul 2021 15:12:28 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > There doesn't seem to be anything conceptually wrong with attaching
> > the same BPF program twice to the same tracepoint. Is it a hard
> > requirement to have a unique tp+callback combination, or was it done
> > mostly to detect an API misuse? How hard is it to support such use
> > cases?
> >
> > I was surprised to discover this is not supported (though I never had
> > a use for this, had to construct a test to see the warning).
>
> The callback is identified by the function and its data combination. If
> there's two callbacks calling the same function with the same data on
> the same tracepoint, one question is, why? And the second is how do you
> differentiate the two?

For places where multiple BPF programs can be attached (kprobes,
cgroup programs, etc), we don't put a restriction that all programs
have to be unique. It's totally legal to have the same program
attached multiple times. So having this for tracepoints will be a
one-off behavior.

As for why the user might need that, it's up to the user and I don't
want to speculate because it will always sound contrived without a
specific production use case. But people are very creative and we try
not to dictate how and what can be done if it doesn't break any
fundamental assumption and safety.

>
> -- Steve

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-07-07 23:49     ` Andrii Nakryiko
@ 2021-07-08  0:05       ` Steven Rostedt
  2021-07-08  0:23         ` Andrii Nakryiko
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2021-07-08  0:05 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: LKML, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Mathieu Desnoyers, Ingo Molnar, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf

On Wed, 7 Jul 2021 16:49:26 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> As for why the user might need that, it's up to the user and I don't
> want to speculate because it will always sound contrived without a
> specific production use case. But people are very creative and we try
> not to dictate how and what can be done if it doesn't break any
> fundamental assumption and safety.

I guess it doesn't matter, because if they try to do it, the second
attachment will simply fail to attach.

-- Steve

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-07-08  0:05       ` Steven Rostedt
@ 2021-07-08  0:23         ` Andrii Nakryiko
  2021-07-08  0:43           ` Steven Rostedt
  2021-07-08 17:30           ` Mathieu Desnoyers
  0 siblings, 2 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2021-07-08  0:23 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Mathieu Desnoyers, Ingo Molnar, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf

On Wed, Jul 7, 2021 at 5:05 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 7 Jul 2021 16:49:26 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > As for why the user might need that, it's up to the user and I don't
> > want to speculate because it will always sound contrived without a
> > specific production use case. But people are very creative and we try
> > not to dictate how and what can be done if it doesn't break any
> > fundamental assumption and safety.
>
> I guess it doesn't matter, because if they try to do it, the second
> attachment will simply fail to attach.
>

But not for the kprobe case.

And it might not always be possible to know that the same BPF program
is being attached. It could be attached by different processes that
re-use pinned program (without being aware of each other). Or it could
be done from some generic library that just accepts prog_fd and
doesn't really know the exact BPF program and whether it was already
attached.

Not sure why it doesn't matter that attachment will fail where it is
expected to succeed. The question is rather why such restriction?

> -- Steve

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-07-08  0:23         ` Andrii Nakryiko
@ 2021-07-08  0:43           ` Steven Rostedt
  2021-07-08 20:04             ` Andrii Nakryiko
  2021-07-08 17:30           ` Mathieu Desnoyers
  1 sibling, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2021-07-08  0:43 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: LKML, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Mathieu Desnoyers, Ingo Molnar, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf

On Wed, 7 Jul 2021 17:23:54 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Wed, Jul 7, 2021 at 5:05 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Wed, 7 Jul 2021 16:49:26 -0700
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >  
> > > As for why the user might need that, it's up to the user and I don't
> > > want to speculate because it will always sound contrived without a
> > > specific production use case. But people are very creative and we try
> > > not to dictate how and what can be done if it doesn't break any
> > > fundamental assumption and safety.  
> >
> > I guess it doesn't matter, because if they try to do it, the second
> > attachment will simply fail to attach.
> >  
> 
> But not for the kprobe case.

What do you mean "not for the kprobe case"? What kprobe case?

You attach the same program twice to the same kprobe? Or do you create
two kprobes at the same location?

> 
> And it might not always be possible to know that the same BPF program
> is being attached. It could be attached by different processes that
> re-use pinned program (without being aware of each other). Or it could
> be done from some generic library that just accepts prog_fd and
> doesn't really know the exact BPF program and whether it was already
> attached.
> 
> Not sure why it doesn't matter that attachment will fail where it is
> expected to succeed. The question is rather why such restriction?

Why is it expected to succeed? It never did. And why such a
restriction? Because it complicates the code, and there's no good use
case to do so. Why complicate something for little reward?

-- Steve

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-07-08  0:23         ` Andrii Nakryiko
  2021-07-08  0:43           ` Steven Rostedt
@ 2021-07-08 17:30           ` Mathieu Desnoyers
  2021-07-08 20:11             ` Andrii Nakryiko
  1 sibling, 1 reply; 11+ messages in thread
From: Mathieu Desnoyers @ 2021-07-08 17:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: rostedt, linux-kernel, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Ingo Molnar, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, netdev, bpf

----- On Jul 7, 2021, at 8:23 PM, Andrii Nakryiko andrii.nakryiko@gmail.com wrote:

> On Wed, Jul 7, 2021 at 5:05 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>>
>> On Wed, 7 Jul 2021 16:49:26 -0700
>> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>>
>> > As for why the user might need that, it's up to the user and I don't
>> > want to speculate because it will always sound contrived without a
>> > specific production use case. But people are very creative and we try
>> > not to dictate how and what can be done if it doesn't break any
>> > fundamental assumption and safety.
>>
>> I guess it doesn't matter, because if they try to do it, the second
>> attachment will simply fail to attach.
>>
> 
> But not for the kprobe case.
> 
> And it might not always be possible to know that the same BPF program
> is being attached. It could be attached by different processes that
> re-use pinned program (without being aware of each other). Or it could
> be done from some generic library that just accepts prog_fd and
> doesn't really know the exact BPF program and whether it was already
> attached.
> 
> Not sure why it doesn't matter that attachment will fail where it is
> expected to succeed. The question is rather why such restriction?

Before eBPF came to exist, all in-kernel users of the tracepoint API never
required multiple registrations for a given (tracepoint, probe, data) tuple.

This allowed us to expose an API which can consider that the (tracepoint, probe, data)
tuple is unique for each registration/unregistration pair, and therefore use that same
tuple for unregistration. Refusing multiple registrations for a given tuple allows us to
forgo the complexity of reference counting for duplicate registrations, and provide
immediate feedback to misbehaving tracers which have duplicate registration or
unbalanced registration/unregistration pairs.
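
To illustrate the point about using the same tuple for unregistration,
here is a minimal user-space sketch. It is an assumption-laden model, not
the tracepoint implementation: struct reg_list, reg_register() and
reg_unregister() are hypothetical names, and the -ENOENT return for an
unmatched unregistration is a choice made for this sketch. Because a tuple
can appear at most once, removal is a plain lookup with no reference
count, and both a duplicate registration and an unbalanced unregistration
are reported right away.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_REGS 8

/* Hypothetical registration record keyed by (tracepoint, probe, data). */
struct reg {
	const void *tp;
	const void *probe;
	const void *data;
};

struct reg_list {
	struct reg regs[MAX_REGS];
	size_t count;
};

/* Returns the index of the matching tuple, or -1 if it is not registered. */
static int reg_find(const struct reg_list *l, const void *tp,
		    const void *probe, const void *data)
{
	size_t i;

	for (i = 0; i < l->count; i++) {
		const struct reg *r = &l->regs[i];

		if (r->tp == tp && r->probe == probe && r->data == data)
			return (int)i;
	}
	return -1;
}

/* Refuses duplicates, so each tuple identifies exactly one registration. */
static int reg_register(struct reg_list *l, const void *tp,
			const void *probe, const void *data)
{
	assert(l != NULL);

	if (reg_find(l, tp, probe, data) >= 0)
		return -EEXIST;
	if (l->count == MAX_REGS)
		return -ENOMEM;

	l->regs[l->count++] = (struct reg){ tp, probe, data };
	return 0;
}

/* The same tuple is the handle for removal; no reference count is needed. */
static int reg_unregister(struct reg_list *l, const void *tp,
			  const void *probe, const void *data)
{
	int idx;

	assert(l != NULL);

	idx = reg_find(l, tp, probe, data);
	if (idx < 0)
		return -ENOENT;	/* unbalanced unregistration */

	/* Fill the hole with the last entry; ordering is not modeled here. */
	l->regs[idx] = l->regs[--l->count];
	return 0;
}
```

Supporting duplicates in this model would require either a per-tuple
counter or some other handle to tell the instances apart, which is the
added complexity referred to above.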

From the perspective of a ring buffer tracer, the notion of multiple instances of
a given (tracepoint, probe, data) tuple is rather silly: it would mean that a given
tracepoint hit would generate many instances of the exact same event into the
same trace buffer.

AFAIR, having the WARN_ON_ONCE() within the tracepoint code to highlight this kind of misuse
allowed Steven to find a few unbalanced registration/unregistration issues while developing
ftrace in the past. I vaguely recall that it triggered for blktrace at some point as well.

Considering that allowing duplicates would add complexity to the tracepoint code,
what is the use-case justifying allowing many instances of the exact same callback
and data for a given tracepoint ?

One key difference I notice here between eBPF and ring buffer tracers is what eBPF
considers a "program". AFAIU (please let me know if I'm mistaken), the "callback"
argument provided by eBPF to the tracepoint API is a limited set of trampoline routines.
The bulk of the eBPF "program" is provided in the "data" argument. So this means the
"program" is both the eBPF code and some context.

So I understand that a given eBPF code could be loaded more than once for a given
tracepoint, but I would expect that each registration on a given tracepoint be
provided with its own "context", otherwise we end up in a similar situation as the
ring buffer's duplicated events scenario I explained above.

Also, we should discuss whether kprobes might benefit from being more strict by
rejecting duplicated (instrumentation site, probe, data) tuples.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-07-08  0:43           ` Steven Rostedt
@ 2021-07-08 20:04             ` Andrii Nakryiko
  0 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2021-07-08 20:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Mathieu Desnoyers, Ingo Molnar, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf

On Wed, Jul 7, 2021 at 5:43 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 7 Jul 2021 17:23:54 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Wed, Jul 7, 2021 at 5:05 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > On Wed, 7 Jul 2021 16:49:26 -0700
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > As for why the user might need that, it's up to the user and I don't
> > > > want to speculate because it will always sound contrived without a
> > > > specific production use case. But people are very creative and we try
> > > > not to dictate how and what can be done if it doesn't break any
> > > > fundamental assumption and safety.
> > >
> > > I guess it doesn't matter, because if they try to do it, the second
> > > attachment will simply fail to attach.
> > >
> >
> > But not for the kprobe case.
>
> What do you mean "not for the kprobe case"? What kprobe case?
>
> You attach the same program twice to the same kprobe? Or do you create
> two kprobes at the same location?
>

I meant attaching the same BPF program twice to the same kernel
function through the kprobe mechanism (through perf_event_open()
syscall). From user perspective it's one BPF program attached twice to
the same kprobe. Not entirely sure if two perf_event_open() calls will
create two kprobes or re-use one internally.

> >
> > And it might not always be possible to know that the same BPF program
> > is being attached. It could be attached by different processes that
> > re-use pinned program (without being aware of each other). Or it could
> > be done from some generic library that just accepts prog_fd and
> > doesn't really know the exact BPF program and whether it was already
> > attached.
> >
> > Not sure why it doesn't matter that attachment will fail where it is
> > expected to succeed. The question is rather why such restriction?
>
> Why is it expected to succeed? It never did. And why such a
> restriction? Because it complicates the code, and there's no good use
> case to do so. Why complicate something for little reward?

See above about kprobe for why it was my expectation.

But it was my original question whether this causes some complications
or it's just an attempt to detect API mis-use. Seems like it's the
former, alright.

>
> -- Steve

* Re: [PATCH] tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  2021-07-08 17:30           ` Mathieu Desnoyers
@ 2021-07-08 20:11             ` Andrii Nakryiko
  0 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2021-07-08 20:11 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: rostedt, linux-kernel, syzbot+721aa903751db87aa244, Tetsuo Handa,
	Ingo Molnar, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, netdev, bpf

On Thu, Jul 8, 2021 at 10:30 AM Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>
> ----- On Jul 7, 2021, at 8:23 PM, Andrii Nakryiko andrii.nakryiko@gmail.com wrote:
>
> > On Wed, Jul 7, 2021 at 5:05 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >>
> >> On Wed, 7 Jul 2021 16:49:26 -0700
> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >>
> >> > As for why the user might need that, it's up to the user and I don't
> >> > want to speculate because it will always sound contrived without a
> >> > specific production use case. But people are very creative and we try
> >> > not to dictate how and what can be done if it doesn't break any
> >> > fundamental assumption and safety.
> >>
> >> I guess it doesn't matter, because if they try to do it, the second
> >> attachment will simply fail to attach.
> >>
> >
> > But not for the kprobe case.
> >
> > And it might not always be possible to know that the same BPF program
> > is being attached. It could be attached by different processes that
> > re-use pinned program (without being aware of each other). Or it could
> > be done from some generic library that just accepts prog_fd and
> > doesn't really know the exact BPF program and whether it was already
> > attached.
> >
> > Not sure why it doesn't matter that attachment will fail where it is
> > expected to succeed. The question is rather why such restriction?
>
> Before eBPF came to exist, all in-kernel users of the tracepoint API never
> required multiple registrations for a given (tracepoint, probe, data) tuple.
>
> This allowed us to expose an API which can consider that the (tracepoint, probe, data)
> tuple is unique for each registration/unregistration pair, and therefore use that same
> tuple for unregistration. Refusing multiple registrations for a given tuple allows us to
> forgo the complexity of reference counting for duplicate registrations, and provide
> immediate feedback to misbehaving tracers which have duplicate registration or
> unbalanced registration/unregistration pairs.
>
> From the perspective of a ring buffer tracer, the notion of multiple instances of
> a given (tracepoint, probe, data) tuple is rather silly: it would mean that a given
> tracepoint hit would generate many instances of the exact same event into the
> same trace buffer.
>
> AFAIR, having the WARN_ON_ONCE() within the tracepoint code to highlight this kind of misuse
> allowed Steven to find a few unbalanced registration/unregistration issues while developing
> ftrace in the past. I vaguely recall that it triggered for blktrace at some point as well.
>
> Considering that allowing duplicates would add complexity to the tracepoint code,
> what is the use-case justifying allowing many instances of the exact same callback
> and data for a given tracepoint ?

It wasn't clear to me if supporting this would cause any added
complexity, which is why I asked.

>
> One key difference I notice here between eBPF and ring buffer tracers is what eBPF
> considers a "program". AFAIU (please let me know if I'm mistaken), the "callback"
> argument provided by eBPF to the tracepoint API is a limited set of trampoline routines.
> The bulk of the eBPF "program" is provided in the "data" argument. So this means the
> "program" is both the eBPF code and some context.
>
> So I understand that a given eBPF code could be loaded more than once for a given

No, it turns out it can't, I was just surprised to learn that.
Surprised, because AFAIK we don't have such restrictions on uniqueness
of attached BPF programs anywhere else where multiple BPF programs are
allowed.

> tracepoint, but I would expect that each registration on a given tracepoint be
> provided with its own "context", otherwise we end up in a similar situation as the
> ring buffer's duplicated events scenario I explained above.
>
> Also, we should discuss whether kprobes might benefit from being more strict by
> rejecting duplicated (instrumentation site, probe, data) tuples.
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
