bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation
@ 2020-03-01  8:10 Andrii Nakryiko
  2020-03-01  8:10 ` [PATCH bpf-next 1/3] bpf: reliably preserve btf_trace_xxx types Andrii Nakryiko
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2020-03-01  8:10 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel, ethercflow
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Fix issue with not preserving btf_trace_##call structs when compiled under
Clang. Additionally, capture raw tracepoint arguments in raw_tp_##call
structs, directly usable from BPF programs. Convert runqslower to use those
for proof of concept and to simplify code further.

Andrii Nakryiko (3):
  bpf: reliably preserve btf_trace_xxx types
  bpf: generate directly-usable raw_tp_##call structs for raw
    tracepoints
  tools/runqslower: simplify BPF code by using raw_tp_xxx structs

 include/trace/bpf_probe.h             | 37 ++++++++++++++++++++++-----
 tools/bpf/runqslower/runqslower.bpf.c | 22 ++++++----------
 2 files changed, 38 insertions(+), 21 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 1/3] bpf: reliably preserve btf_trace_xxx types
  2020-03-01  8:10 [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Andrii Nakryiko
@ 2020-03-01  8:10 ` Andrii Nakryiko
  2020-03-02 16:35   ` Yonghong Song
  2020-03-01  8:10 ` [PATCH bpf-next 2/3] bpf: generate directly-usable raw_tp_##call structs for raw tracepoints Andrii Nakryiko
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2020-03-01  8:10 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel, ethercflow
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

btf_trace_xxx types, crucial for tp_btf BPF programs (raw tracepoint with
verifier-checked direct memory access), have to be preserved in kernel BTF to
allow verifier do its job and enforce type/memory safety. It was reported
([0]) that for kernels built with Clang current type-casting approach doesn't
preserve these types.

This patch fixes it by declaring an anonymous union for each registered
tracepoint, capturing both struct bpf_raw_event_map information, as well as
recording btf_trace_##call type reliably. Structurally, it's still the same
content as for a plain struct bpf_raw_event_map, so no other changes are
necessary.

  [0] https://github.com/iovisor/bcc/issues/2770#issuecomment-591007692

Fixes: e8c423fb31fa ("bpf: Add typecast to raw_tracepoints to help BTF generation")
Reported-by: Wenbo Zhang <ethercflow@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 include/trace/bpf_probe.h | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index b04c29270973..1ce3be63add1 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -75,13 +75,17 @@ static inline void bpf_test_probe_##call(void)				\
 	check_trace_callback_type_##call(__bpf_trace_##template);	\
 }									\
 typedef void (*btf_trace_##call)(void *__data, proto);			\
-static struct bpf_raw_event_map	__used					\
-	__attribute__((section("__bpf_raw_tp_map")))			\
-__bpf_trace_tp_map_##call = {						\
-	.tp		= &__tracepoint_##call,				\
-	.bpf_func	= (void *)(btf_trace_##call)__bpf_trace_##template,	\
-	.num_args	= COUNT_ARGS(args),				\
-	.writable_size	= size,						\
+static union {								\
+	struct bpf_raw_event_map event;					\
+	btf_trace_##call handler;					\
+} __bpf_trace_tp_map_##call __used					\
+__attribute__((section("__bpf_raw_tp_map"))) = {			\
+	.event = {							\
+		.tp		= &__tracepoint_##call,			\
+		.bpf_func	= __bpf_trace_##template,		\
+		.num_args	= COUNT_ARGS(args),			\
+		.writable_size	= size,					\
+	},								\
 };
 
 #define FIRST(x, ...) x
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 2/3] bpf: generate directly-usable raw_tp_##call structs for raw tracepoints
  2020-03-01  8:10 [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Andrii Nakryiko
  2020-03-01  8:10 ` [PATCH bpf-next 1/3] bpf: reliably preserve btf_trace_xxx types Andrii Nakryiko
@ 2020-03-01  8:10 ` Andrii Nakryiko
  2020-03-02 16:36   ` Yonghong Song
  2020-03-01  8:10 ` [PATCH bpf-next 3/3] tools/runqslower: simplify BPF code by using raw_tp_xxx structs Andrii Nakryiko
  2020-03-03  0:59 ` [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Alexei Starovoitov
  3 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2020-03-01  8:10 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel, ethercflow
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

In addition to btf_trace_##call typedefs to func protos, generate a struct
raw_tp_##call with memory layout directly usable from BPF programs to access
raw tracepoint arguments. This allows for user BPF programs to directly use
such structs for their raw tracepoint BPF programs when using vmlinux.h,
without having to manually copy/paste and maintain raw tracepoint argument
declarations. Additionally, due to CO-RE and preserve_access_index attribute,
such structs are relocatable, all the CO-RE relocations and field existence
checks are available automatically to such BPF programs.

runqslower example in next patch will demonstrate this usage.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 include/trace/bpf_probe.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index 1ce3be63add1..a9e83236c181 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -55,6 +55,21 @@
 /* tracepoints with more than 12 arguments will hit build error */
 #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
 
+#define __RAW_TP_ARG(a) a __attribute__((aligned(8)))
+#define __RAW_TP_ARGS1(a,...) __RAW_TP_ARG(a);
+#define __RAW_TP_ARGS2(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS1(__VA_ARGS__);
+#define __RAW_TP_ARGS3(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS2(__VA_ARGS__);
+#define __RAW_TP_ARGS4(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS3(__VA_ARGS__);
+#define __RAW_TP_ARGS5(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS4(__VA_ARGS__);
+#define __RAW_TP_ARGS6(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS5(__VA_ARGS__);
+#define __RAW_TP_ARGS7(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS6(__VA_ARGS__);
+#define __RAW_TP_ARGS8(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS7(__VA_ARGS__);
+#define __RAW_TP_ARGS9(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS8(__VA_ARGS__);
+#define __RAW_TP_ARGS10(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS9(__VA_ARGS__);
+#define __RAW_TP_ARGS11(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS10(__VA_ARGS__);
+#define __RAW_TP_ARGS12(a,...) __RAW_TP_ARG(a); __RAW_TP_ARGS11(__VA_ARGS__);
+#define RAW_TP_ARGS(args...) CONCATENATE(__RAW_TP_ARGS, COUNT_ARGS(args))(args)
+
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 static notrace void							\
@@ -75,9 +90,13 @@ static inline void bpf_test_probe_##call(void)				\
 	check_trace_callback_type_##call(__bpf_trace_##template);	\
 }									\
 typedef void (*btf_trace_##call)(void *__data, proto);			\
+struct raw_tp_##call {							\
+	RAW_TP_ARGS(proto)						\
+};									\
 static union {								\
 	struct bpf_raw_event_map event;					\
 	btf_trace_##call handler;					\
+	struct raw_tp_##call *raw_tp_args;				\
 } __bpf_trace_tp_map_##call __used					\
 __attribute__((section("__bpf_raw_tp_map"))) = {			\
 	.event = {							\
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 3/3] tools/runqslower: simplify BPF code by using raw_tp_xxx structs
  2020-03-01  8:10 [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Andrii Nakryiko
  2020-03-01  8:10 ` [PATCH bpf-next 1/3] bpf: reliably preserve btf_trace_xxx types Andrii Nakryiko
  2020-03-01  8:10 ` [PATCH bpf-next 2/3] bpf: generate directly-usable raw_tp_##call structs for raw tracepoints Andrii Nakryiko
@ 2020-03-01  8:10 ` Andrii Nakryiko
  2020-03-02 16:36   ` Yonghong Song
  2020-03-03  0:59 ` [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Alexei Starovoitov
  3 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2020-03-01  8:10 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel, ethercflow
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Convert runqslower to utilize raw_tp_xxx structs for accessing raw tracepoint
arguments.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/bpf/runqslower/runqslower.bpf.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/tools/bpf/runqslower/runqslower.bpf.c b/tools/bpf/runqslower/runqslower.bpf.c
index 48a39f72fadf..3931ef9c9a6c 100644
--- a/tools/bpf/runqslower/runqslower.bpf.c
+++ b/tools/bpf/runqslower/runqslower.bpf.c
@@ -40,41 +40,35 @@ static int trace_enqueue(u32 tgid, u32 pid)
 }
 
 SEC("tp_btf/sched_wakeup")
-int handle__sched_wakeup(u64 *ctx)
+int handle__sched_wakeup(struct raw_tp_sched_wakeup *ctx)
 {
 	/* TP_PROTO(struct task_struct *p) */
-	struct task_struct *p = (void *)ctx[0];
-
-	return trace_enqueue(p->tgid, p->pid);
+	return trace_enqueue(ctx->p->tgid, ctx->p->pid);
 }
 
 SEC("tp_btf/sched_wakeup_new")
-int handle__sched_wakeup_new(u64 *ctx)
+int handle__sched_wakeup_new(struct raw_tp_sched_wakeup_new *ctx)
 {
 	/* TP_PROTO(struct task_struct *p) */
-	struct task_struct *p = (void *)ctx[0];
-
-	return trace_enqueue(p->tgid, p->pid);
+	return trace_enqueue(ctx->p->tgid, ctx->p->pid);
 }
 
 SEC("tp_btf/sched_switch")
-int handle__sched_switch(u64 *ctx)
+int handle__sched_switch(struct raw_tp_sched_switch *ctx)
 {
 	/* TP_PROTO(bool preempt, struct task_struct *prev,
 	 *	    struct task_struct *next)
 	 */
-	struct task_struct *prev = (struct task_struct *)ctx[1];
-	struct task_struct *next = (struct task_struct *)ctx[2];
 	struct event event = {};
 	u64 *tsp, delta_us;
 	long state;
 	u32 pid;
 
 	/* ivcsw: treat like an enqueue event and store timestamp */
-	if (prev->state == TASK_RUNNING)
-		trace_enqueue(prev->tgid, prev->pid);
+	if (ctx->prev->state == TASK_RUNNING)
+		trace_enqueue(ctx->prev->tgid, ctx->prev->pid);
 
-	pid = next->pid;
+	pid = ctx->next->pid;
 
 	/* fetch timestamp and calculate delta */
 	tsp = bpf_map_lookup_elem(&start, &pid);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 1/3] bpf: reliably preserve btf_trace_xxx types
  2020-03-01  8:10 ` [PATCH bpf-next 1/3] bpf: reliably preserve btf_trace_xxx types Andrii Nakryiko
@ 2020-03-02 16:35   ` Yonghong Song
  0 siblings, 0 replies; 11+ messages in thread
From: Yonghong Song @ 2020-03-02 16:35 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel, ethercflow
  Cc: andrii.nakryiko, kernel-team



On 3/1/20 12:10 AM, Andrii Nakryiko wrote:
> btf_trace_xxx types, crucial for tp_btf BPF programs (raw tracepoint with
> verifier-checked direct memory access), have to be preserved in kernel BTF to
> allow verifier do its job and enforce type/memory safety. It was reported
> ([0]) that for kernels built with Clang current type-casting approach doesn't
> preserve these types.
> 
> This patch fixes it by declaring an anonymous union for each registered
> tracepoint, capturing both struct bpf_raw_event_map information, as well as
> recording btf_trace_##call type reliably. Structurally, it's still the same
> content as for a plain struct bpf_raw_event_map, so no other changes are
> necessary.
> 
>    [0] https://github.com/iovisor/bcc/issues/2770#issuecomment-591007692
> 
> Fixes: e8c423fb31fa ("bpf: Add typecast to raw_tracepoints to help BTF generation")
> Reported-by: Wenbo Zhang <ethercflow@gmail.com>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

clang seems doing a little bit optimization here...
The change looks good. It is hard to have code to preserve the types in 
the header. union seems an acceptable way.

Acked-by: Yonghong Song <yhs@fb.com>

> ---
>   include/trace/bpf_probe.h | 18 +++++++++++-------
>   1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
> index b04c29270973..1ce3be63add1 100644
> --- a/include/trace/bpf_probe.h
> +++ b/include/trace/bpf_probe.h
> @@ -75,13 +75,17 @@ static inline void bpf_test_probe_##call(void)				\
>   	check_trace_callback_type_##call(__bpf_trace_##template);	\
>   }									\
>   typedef void (*btf_trace_##call)(void *__data, proto);			\
> -static struct bpf_raw_event_map	__used					\
> -	__attribute__((section("__bpf_raw_tp_map")))			\
> -__bpf_trace_tp_map_##call = {						\
> -	.tp		= &__tracepoint_##call,				\
> -	.bpf_func	= (void *)(btf_trace_##call)__bpf_trace_##template,	\
> -	.num_args	= COUNT_ARGS(args),				\
> -	.writable_size	= size,						\
> +static union {								\
> +	struct bpf_raw_event_map event;					\
> +	btf_trace_##call handler;					\
> +} __bpf_trace_tp_map_##call __used					\
> +__attribute__((section("__bpf_raw_tp_map"))) = {			\
> +	.event = {							\
> +		.tp		= &__tracepoint_##call,			\
> +		.bpf_func	= __bpf_trace_##template,		\
> +		.num_args	= COUNT_ARGS(args),			\
> +		.writable_size	= size,					\
> +	},								\
>   };
>   
>   #define FIRST(x, ...) x
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: generate directly-usable raw_tp_##call structs for raw tracepoints
  2020-03-01  8:10 ` [PATCH bpf-next 2/3] bpf: generate directly-usable raw_tp_##call structs for raw tracepoints Andrii Nakryiko
@ 2020-03-02 16:36   ` Yonghong Song
  0 siblings, 0 replies; 11+ messages in thread
From: Yonghong Song @ 2020-03-02 16:36 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel, ethercflow
  Cc: andrii.nakryiko, kernel-team



On 3/1/20 12:10 AM, Andrii Nakryiko wrote:
> In addition to btf_trace_##call typedefs to func protos, generate a struct
> raw_tp_##call with memory layout directly usable from BPF programs to access
> raw tracepoint arguments. This allows for user BPF programs to directly use
> such structs for their raw tracepoint BPF programs when using vmlinux.h,
> without having to manually copy/paste and maintain raw tracepoint argument
> declarations. Additionally, due to CO-RE and preserve_access_index attribute,
> such structs are relocatable, all the CO-RE relocations and field existence
> checks are available automatically to such BPF programs.
> 
> runqslower example in next patch will demonstrate this usage.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Acked-by: Yonghong Song <yhs@fb.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 3/3] tools/runqslower: simplify BPF code by using raw_tp_xxx structs
  2020-03-01  8:10 ` [PATCH bpf-next 3/3] tools/runqslower: simplify BPF code by using raw_tp_xxx structs Andrii Nakryiko
@ 2020-03-02 16:36   ` Yonghong Song
  0 siblings, 0 replies; 11+ messages in thread
From: Yonghong Song @ 2020-03-02 16:36 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel, ethercflow
  Cc: andrii.nakryiko, kernel-team



On 3/1/20 12:10 AM, Andrii Nakryiko wrote:
> Convert runqslower to utilize raw_tp_xxx structs for accessing raw tracepoint
> arguments.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Acked-by: Yonghong Song <yhs@fb.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation
  2020-03-01  8:10 [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2020-03-01  8:10 ` [PATCH bpf-next 3/3] tools/runqslower: simplify BPF code by using raw_tp_xxx structs Andrii Nakryiko
@ 2020-03-03  0:59 ` Alexei Starovoitov
  2020-03-03  4:10   ` Andrii Nakryiko
  3 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2020-03-03  0:59 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, netdev, ast, daniel, ethercflow, andrii.nakryiko, kernel-team

On Sun, Mar 01, 2020 at 12:10:42AM -0800, Andrii Nakryiko wrote:
> Fix issue with not preserving btf_trace_##call structs when compiled under
> Clang. Additionally, capture raw tracepoint arguments in raw_tp_##call
> structs, directly usable from BPF programs. Convert runqslower to use those
> for proof of concept and to simplify code further.

Not only folks compile kernel with clang they use the latest BPF/BTF features
with it. This is very nice to see!
I've applied 1st patch to make clang compiled kernel emit proper BTF.

As far as patch 2 I'm not sure about 'raw_tp_' prefix. tp_btf type of progs can
use the same structs. So I think there could be a better name. Also bpftool can
generate them as well while emitting vmlinux.h. I think that will avoid adding
few kilobytes to vmlinux BTF that kernel isn't going to use atm.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation
  2020-03-03  0:59 ` [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Alexei Starovoitov
@ 2020-03-03  4:10   ` Andrii Nakryiko
  2020-03-03  4:59     ` Andrii Nakryiko
  0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2020-03-03  4:10 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Wenbo Zhang, Kernel Team

On Mon, Mar 2, 2020 at 4:59 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Mar 01, 2020 at 12:10:42AM -0800, Andrii Nakryiko wrote:
> > Fix issue with not preserving btf_trace_##call structs when compiled under
> > Clang. Additionally, capture raw tracepoint arguments in raw_tp_##call
> > structs, directly usable from BPF programs. Convert runqslower to use those
> > for proof of concept and to simplify code further.
>
> Not only folks compile kernel with clang they use the latest BPF/BTF features
> with it. This is very nice to see!
> I've applied 1st patch to make clang compiled kernel emit proper BTF.
>
> As far as patch 2 I'm not sure about 'raw_tp_' prefix. tp_btf type of progs can
> use the same structs. So I think there could be a better name. Also bpftool can
> generate them as well while emitting vmlinux.h. I think that will avoid adding
> few kilobytes to vmlinux BTF that kernel isn't going to use atm.

Fair enough, I'll follow up with bpftool changes to generate such
structs. I'm thinking to use tp_args_xxx name pattern, unless someone
has a better idea :)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation
  2020-03-03  4:10   ` Andrii Nakryiko
@ 2020-03-03  4:59     ` Andrii Nakryiko
  2020-04-21 21:49       ` Andrii Nakryiko
  0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2020-03-03  4:59 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Wenbo Zhang, Kernel Team

On Mon, Mar 2, 2020 at 8:10 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Mar 2, 2020 at 4:59 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Sun, Mar 01, 2020 at 12:10:42AM -0800, Andrii Nakryiko wrote:
> > > Fix issue with not preserving btf_trace_##call structs when compiled under
> > > Clang. Additionally, capture raw tracepoint arguments in raw_tp_##call
> > > structs, directly usable from BPF programs. Convert runqslower to use those
> > > for proof of concept and to simplify code further.
> >
> > Not only folks compile kernel with clang they use the latest BPF/BTF features
> > with it. This is very nice to see!
> > I've applied 1st patch to make clang compiled kernel emit proper BTF.
> >
> > As far as patch 2 I'm not sure about 'raw_tp_' prefix. tp_btf type of progs can
> > use the same structs. So I think there could be a better name. Also bpftool can
> > generate them as well while emitting vmlinux.h. I think that will avoid adding
> > few kilobytes to vmlinux BTF that kernel isn't going to use atm.
>
> Fair enough, I'll follow up with bpftool changes to generate such
> structs. I'm thinking to use tp_args_xxx name pattern, unless someone
> has a better idea :)

Bad news. BTF_KIND_FUNC_PROTOs don't capture argument names and having
something like:

struct tp_args_sched_switch {
    bool arg1;
    struct task_struct *arg2;
    struct task_struct *arg3;
};

doesn't seem like a good solution...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation
  2020-03-03  4:59     ` Andrii Nakryiko
@ 2020-04-21 21:49       ` Andrii Nakryiko
  0 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2020-04-21 21:49 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Wenbo Zhang, Kernel Team

On Mon, Mar 2, 2020 at 8:59 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Mar 2, 2020 at 8:10 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Mar 2, 2020 at 4:59 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Sun, Mar 01, 2020 at 12:10:42AM -0800, Andrii Nakryiko wrote:
> > > > Fix issue with not preserving btf_trace_##call structs when compiled under
> > > > Clang. Additionally, capture raw tracepoint arguments in raw_tp_##call
> > > > structs, directly usable from BPF programs. Convert runqslower to use those
> > > > for proof of concept and to simplify code further.
> > >
> > > Not only folks compile kernel with clang they use the latest BPF/BTF features
> > > with it. This is very nice to see!
> > > I've applied 1st patch to make clang compiled kernel emit proper BTF.
> > >
> > > As far as patch 2 I'm not sure about 'raw_tp_' prefix. tp_btf type of progs can
> > > use the same structs. So I think there could be a better name. Also bpftool can
> > > generate them as well while emitting vmlinux.h. I think that will avoid adding
> > > few kilobytes to vmlinux BTF that kernel isn't going to use atm.
> >
> > Fair enough, I'll follow up with bpftool changes to generate such
> > structs. I'm thinking to use tp_args_xxx name pattern, unless someone
> > has a better idea :)
>
> Bad news. BTF_KIND_FUNC_PROTOs don't capture argument names and having
> something like:
>
> struct tp_args_sched_switch {
>     bool arg1;
>     struct task_struct *arg2;
>     struct task_struct *arg3;
> };
>
> doesn't seem like a good solution...

I'd like to surface this one more time. I'd like to give just a bit
more context first, though.

Currently, when using various types of BPF programs (kprobe,
fentry/fexit, lsm, etc), one can use few ways to extract input
arguments of the "intercepted" function. One of the more user-friendly
ways to do it is through BPF_PROG macro. So BPF code would look like
this:

SEC("lsm/file_mprotect")
int BPF_PROG(test_int_hook, struct vm_area_struct *vma,
             unsigned long reqprot, unsigned long prot, int ret)
{
    if (ret != 0) { ... }
    /* use cma, reqprot, etc directly as variables */
}

This is kind of nice, but involves quite a bit of macro magic and
requires user to find and copy/paste those function prototypes and
track whether they ever change.

The alternative for this would be to have a memory-layout-compatible
struct, that can be used directly as a context. For the above example:

struct btf_lsm_file_mprotect_ctx {
    struct vm_area_struct *vma;
    unsigned long reqprot;
    int ret __attribute__((aligned(8)));
};

With that, one can do a nice pure C code:

int test_int_hook(struct btf_lsm_file_mprotect_ctx *ctx)
{
    if (ctx->ret != 0) { ... }
    /* here instead of using vma, reqprot and ret directly,
     * one would access them as ctx fields: ctx->vma, ctx->reqprot
     */
}

The benefit of the latter is that, hopefully, no one will have to
write such struct definitions by hand, they would be generated as part
of vmlinux.h and would come directly from kernel BTF (in one way or
another, see below). One added benefit is that such struct is
CO-RE-relocatable, so if fields are ever, say, reordered or removed,
by using CO-RE mechanisms one can write a compiled-once BPF program to
accommodate such changes and even incompatibilities.

Now to why I'm bringing this up again.

The original plan was to use bpftool to convert func_protos that are
used directly by kernel (e.g., btf_trace_xxx ones for raw tracepoints)
to dump them as memory layout-compatible structs at the end of
vmlinux.h for use by BPF programs. Unfortunately, I don't think that
will work because func_proto arguments don't preserve their names in
BTF. Which, as I pointed out in previous email, ruins usability of
generated structs.

So I'd like to solicit feedback on how we can proceed from here. I see
few possible ways to go about this:

1. Do nothing. Always an option. It sucks for users (they need to
copy-paste function definitions, use BPF_PROG macro, etc), but is
awesome for kernel (no changes, no extra stuff).

2. Bite a bullet and add compatible struct definitions into kernel
itself. It will slightly increase kernel BTF, but will require no
changes on user-space and tooling side. vmlinux.h just magically gets
proper structs that are directly usable from BPF programs. One
objection to that is that those structs are not directly used by
kernel and thus are just a dead weight.

3. Bite even bigger bullet and convert current uses of func_proto in
kernel to struct. That way we don't have unnecessary types laying
around in the kernel, verifier actually will use these structs for
verification. There might be a concern about backwards compatibility.
Libbpf can easily accommodate such changes by searching for either
struct or func_proto, whichever is available, but one can argue that
we have to leave existing btf_trace_xxx func protos intact if that's
considered to be part of UAPI.

4. Milder variant of #3 would be to convert typedef of func_protos
into a use of proper FUNCs. They still use func_proto, but that
func_proto actually preserves argument names, and would allow bpftool
to dump proper structs. Same considerations of what to do with
backwards compatibility of btf_trace_xxx typedefs+func_proto, but
won't require much verifier changes, because it's still going to be
func_proto that need to be used for verification. This is what
Yonghong's bpfdump changes actually do: they use __init empty funcs to
provide types for verifier.

I think it's important to discuss this, because more and more BPF
program types are relying on using similar approach (e.g., LSM,
bpfdump), and it would be nice to provide a good **and uniform**
solution to let users just use a proper context structure directly,
instead of using BPF_PROG macro (best case, worst case it's a direct
u64 array conversions) and copy-pasting definition.

Thanks for any feedback upfront!

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-04-21 21:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-01  8:10 [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Andrii Nakryiko
2020-03-01  8:10 ` [PATCH bpf-next 1/3] bpf: reliably preserve btf_trace_xxx types Andrii Nakryiko
2020-03-02 16:35   ` Yonghong Song
2020-03-01  8:10 ` [PATCH bpf-next 2/3] bpf: generate directly-usable raw_tp_##call structs for raw tracepoints Andrii Nakryiko
2020-03-02 16:36   ` Yonghong Song
2020-03-01  8:10 ` [PATCH bpf-next 3/3] tools/runqslower: simplify BPF code by using raw_tp_xxx structs Andrii Nakryiko
2020-03-02 16:36   ` Yonghong Song
2020-03-03  0:59 ` [PATCH bpf-next 0/3] Improve raw tracepoint BTF types preservation Alexei Starovoitov
2020-03-03  4:10   ` Andrii Nakryiko
2020-03-03  4:59     ` Andrii Nakryiko
2020-04-21 21:49       ` Andrii Nakryiko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).