linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v6 0/2] Make eBPF programs output data to perf
@ 2015-07-21  3:08 He Kuang
  2015-07-21  3:08 ` [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len He Kuang
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: He Kuang @ 2015-07-21  3:08 UTC (permalink / raw)
  To: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel, hekuang

Hi,

Previous patch v5 url:
http://thread.gmane.org/gmane.linux.kernel/1995274

The bugfix of dynamic array length in trace event goes to
kernel/git/rostedt/linux-trace.git ftrace/urgent and confirms that the
return value of __get_dynamic_array_len() is the total allocated
length of the dynamic array. For we print the bpf output data in byte
array from patch v5, that problem does not affect our patch any more,
but some comments in patch 1/2 is updated.

Patch 2/2 is acked by Alexei.

Thank you.

He Kuang (2):
  tools lib traceevent: Support function __get_dynamic_array_len
  bpf: Introduce function for outputing data to perf event

 include/trace/events/bpf.h                         | 30 ++++++++++++
 include/uapi/linux/bpf.h                           |  7 +++
 kernel/trace/bpf_trace.c                           | 23 +++++++++
 samples/bpf/bpf_helpers.h                          |  2 +
 tools/lib/traceevent/event-parse.c                 | 56 +++++++++++++++++++++-
 tools/lib/traceevent/event-parse.h                 |  1 +
 .../perf/util/scripting-engines/trace-event-perl.c |  1 +
 .../util/scripting-engines/trace-event-python.c    |  1 +
 8 files changed, 119 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/events/bpf.h

-- 
1.8.5.2


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len
  2015-07-21  3:08 [RFC PATCH v6 0/2] Make eBPF programs output data to perf He Kuang
@ 2015-07-21  3:08 ` He Kuang
  2015-07-21 15:08   ` Namhyung Kim
  2015-07-21  3:08 ` [RFC PATCH v6 2/2] bpf: Introduce function for outputing data to perf event He Kuang
  2015-08-13 21:27 ` [RFC PATCH v6 0/2] Make eBPF programs output data to perf Alexei Starovoitov
  2 siblings, 1 reply; 9+ messages in thread
From: He Kuang @ 2015-07-21  3:08 UTC (permalink / raw)
  To: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel, hekuang

Support helper function __get_dynamic_array_len() in libtraceevent,
this function is used accompany with __print_array() or __print_hex(),
but currently it is not an available function in the function list of
process_function().

The total allocated length of the dynamic array is embedded in the top
half of __data_loc_##item field. This patch adds new arg type
PRINT_DYNAMIC_ARRAY_LEN to return the length to eval_num_arg(),

Signed-off-by: He Kuang <hekuang@huawei.com>
---
 tools/lib/traceevent/event-parse.c                 | 56 +++++++++++++++++++++-
 tools/lib/traceevent/event-parse.h                 |  1 +
 .../perf/util/scripting-engines/trace-event-perl.c |  1 +
 .../util/scripting-engines/trace-event-python.c    |  1 +
 4 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index cc25f05..3af69cf 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -783,6 +783,7 @@ static void free_arg(struct print_arg *arg)
 		free(arg->bitmask.bitmask);
 		break;
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 		free(arg->dynarray.index);
 		break;
 	case PRINT_OP:
@@ -2655,6 +2656,42 @@ process_dynamic_array(struct event_format *event, struct print_arg *arg, char **
 }
 
 static enum event_type
+process_dynamic_array_len(struct event_format *event, struct print_arg *arg,
+			  char **tok)
+{
+	struct format_field *field;
+	enum event_type type;
+	char *token;
+
+	if (read_expect_type(EVENT_ITEM, &token) < 0)
+		goto out_free;
+
+	arg->type = PRINT_DYNAMIC_ARRAY_LEN;
+
+	/* Find the field */
+	field = pevent_find_field(event, token);
+	if (!field)
+		goto out_free;
+
+	arg->dynarray.field = field;
+	arg->dynarray.index = 0;
+
+	if (read_expected(EVENT_DELIM, ")") < 0)
+		goto out_err;
+
+	type = read_token(&token);
+	*tok = token;
+
+	return type;
+
+ out_free:
+	free_token(token);
+ out_err:
+	*tok = NULL;
+	return EVENT_ERROR;
+}
+
+static enum event_type
 process_paren(struct event_format *event, struct print_arg *arg, char **tok)
 {
 	struct print_arg *item_arg;
@@ -2901,6 +2938,10 @@ process_function(struct event_format *event, struct print_arg *arg,
 		free_token(token);
 		return process_dynamic_array(event, arg, tok);
 	}
+	if (strcmp(token, "__get_dynamic_array_len") == 0) {
+		free_token(token);
+		return process_dynamic_array_len(event, arg, tok);
+	}
 
 	func = find_func_handler(event->pevent, token);
 	if (func) {
@@ -3581,14 +3622,25 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
 			goto out_warning_op;
 		}
 		break;
+	case PRINT_DYNAMIC_ARRAY_LEN:
+		offset = pevent_read_number(pevent,
+					    data + arg->dynarray.field->offset,
+					    arg->dynarray.field->size);
+		/*
+		 * The total allocated length of the dynamic array is
+		 * stored in the top half of the field, and the offset
+		 * is in the bottom half of the 32 bit field.
+		 */
+		val = (unsigned long long)(offset >> 16);
+		break;
 	case PRINT_DYNAMIC_ARRAY:
 		/* Without [], we pass the address to the dynamic data */
 		offset = pevent_read_number(pevent,
 					    data + arg->dynarray.field->offset,
 					    arg->dynarray.field->size);
 		/*
-		 * The actual length of the dynamic array is stored
-		 * in the top half of the field, and the offset
+		 * The total allocated length of the dynamic array is
+		 * stored in the top half of the field, and the offset
 		 * is in the bottom half of the 32 bit field.
 		 */
 		offset &= 0xffff;
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 063b197..94543aa 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -294,6 +294,7 @@ enum print_arg_type {
 	PRINT_OP,
 	PRINT_FUNC,
 	PRINT_BITMASK,
+	PRINT_DYNAMIC_ARRAY_LEN,
 };
 
 struct print_arg {
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 1bd593b..544509c 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -221,6 +221,7 @@ static void define_event_symbols(struct event_format *event,
 		break;
 	case PRINT_BSTRING:
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 	case PRINT_STRING:
 	case PRINT_BITMASK:
 		break;
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index ace2484..aa9e125 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -251,6 +251,7 @@ static void define_event_symbols(struct event_format *event,
 		/* gcc warns for these? */
 	case PRINT_BSTRING:
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 	case PRINT_FUNC:
 	case PRINT_BITMASK:
 		/* we should warn... */
-- 
1.8.5.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC PATCH v6 2/2] bpf: Introduce function for outputing data to perf event
  2015-07-21  3:08 [RFC PATCH v6 0/2] Make eBPF programs output data to perf He Kuang
  2015-07-21  3:08 ` [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len He Kuang
@ 2015-07-21  3:08 ` He Kuang
  2015-07-21 15:13   ` Namhyung Kim
  2015-08-13 21:27 ` [RFC PATCH v6 0/2] Make eBPF programs output data to perf Alexei Starovoitov
  2 siblings, 1 reply; 9+ messages in thread
From: He Kuang @ 2015-07-21  3:08 UTC (permalink / raw)
  To: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel, hekuang

There're scenarios that we need an eBPF program to record not only
kprobe point args, but also the PMU counters, time latencies or the
number of cache misses between two probe points and other information
when the probe point is entered.

This patch adds a new trace event to establish infrastruction for bpf to
output data to perf. Userspace perf tools can detect and use this event
as using the existing tracepoint events.

New bpf trace event entry in debugfs:

     /sys/kernel/debug/tracing/events/bpf/bpf_output_data

Userspace perf tools detect the new tracepoint event as:

     bpf:bpf_output_data                          [Tracepoint event]

Data in ring-buffer of perf events added to this event will be polled
out, sample types and other attributes can be adjusted to those events
directly without touching the original kprobe events.

The bpf helper function gives eBPF program ability to output data as
perf sample event. This helper simple call the new trace event and
userspace perf tools can record the BPF ftrace event to collect those
records.

Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/trace/events/bpf.h | 30 ++++++++++++++++++++++++++++++
 include/uapi/linux/bpf.h   |  7 +++++++
 kernel/trace/bpf_trace.c   | 23 +++++++++++++++++++++++
 samples/bpf/bpf_helpers.h  |  2 ++
 4 files changed, 62 insertions(+)
 create mode 100644 include/trace/events/bpf.h

diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
new file mode 100644
index 0000000..6b739b8
--- /dev/null
+++ b/include/trace/events/bpf.h
@@ -0,0 +1,30 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM bpf
+
+#if !defined(_TRACE_BPF_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_BPF_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(bpf_output_data,
+
+	TP_PROTO(u64 *src, int size),
+
+	TP_ARGS(src, size),
+
+	TP_STRUCT__entry(
+		__dynamic_array(u8,		buf,		size)
+	),
+
+	TP_fast_assign(
+		memcpy(__get_dynamic_array(buf), src, size);
+	),
+
+	TP_printk("%s", __print_hex(__get_dynamic_array(buf),
+				    __get_dynamic_array_len(buf)))
+);
+
+#endif /* _TRACE_BPF_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 29ef6f9..5068ab1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -249,6 +249,13 @@ enum bpf_func_id {
 	 * Return: 0 on success
 	 */
 	BPF_FUNC_get_current_comm,
+
+	/**
+	 * int bpf_output_trace_data(void *src, int size)
+	 * Return: 0 on success
+	 */
+	BPF_FUNC_output_trace_data,
+
 	__BPF_FUNC_MAX_ID,
 };
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 88a041a..219f670 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -11,7 +11,10 @@
 #include <linux/filter.h>
 #include <linux/uaccess.h>
 #include <linux/ctype.h>
+
 #include "trace.h"
+#define CREATE_TRACE_POINTS
+#include <trace/events/bpf.h>
 
 static DEFINE_PER_CPU(int, bpf_prog_active);
 
@@ -79,6 +82,24 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
+static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
+{
+	void *src = (void *) (long) r1;
+	int size = (int) r2;
+
+	trace_bpf_output_data(src, size);
+
+	return 0;
+}
+
+static const struct bpf_func_proto bpf_output_trace_data_proto = {
+	.func		= bpf_output_trace_data,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_STACK,
+	.arg2_type	= ARG_CONST_STACK_SIZE,
+};
+
 /*
  * limited trace_printk()
  * only %d %u %x %ld %lu %lx %lld %llu %llx %p conversion specifiers allowed
@@ -169,6 +190,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
 		return &bpf_map_delete_elem_proto;
 	case BPF_FUNC_probe_read:
 		return &bpf_probe_read_proto;
+	case BPF_FUNC_output_trace_data:
+		return &bpf_output_trace_data_proto;
 	case BPF_FUNC_ktime_get_ns:
 		return &bpf_ktime_get_ns_proto;
 	case BPF_FUNC_tail_call:
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index bdf1c16..0aeaebe 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -59,5 +59,7 @@ static int (*bpf_l3_csum_replace)(void *ctx, int off, int from, int to, int flag
 	(void *) BPF_FUNC_l3_csum_replace;
 static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flags) =
 	(void *) BPF_FUNC_l4_csum_replace;
+static int (*bpf_output_trace_data)(void *src, int size) =
+	(void *) BPF_FUNC_output_trace_data;
 
 #endif
-- 
1.8.5.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len
  2015-07-21  3:08 ` [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len He Kuang
@ 2015-07-21 15:08   ` Namhyung Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Namhyung Kim @ 2015-07-21 15:08 UTC (permalink / raw)
  To: He Kuang
  Cc: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	jolsa, wangnan0, pi3orama, linux-kernel

On Tue, Jul 21, 2015 at 03:08:49AM +0000, He Kuang wrote:
> Support helper function __get_dynamic_array_len() in libtraceevent,
> this function is used accompany with __print_array() or __print_hex(),
> but currently it is not an available function in the function list of
> process_function().
> 
> The total allocated length of the dynamic array is embedded in the top
> half of __data_loc_##item field. This patch adds new arg type
> PRINT_DYNAMIC_ARRAY_LEN to return the length to eval_num_arg(),
> 
> Signed-off-by: He Kuang <hekuang@huawei.com>

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung


> ---
>  tools/lib/traceevent/event-parse.c                 | 56 +++++++++++++++++++++-
>  tools/lib/traceevent/event-parse.h                 |  1 +
>  .../perf/util/scripting-engines/trace-event-perl.c |  1 +
>  .../util/scripting-engines/trace-event-python.c    |  1 +
>  4 files changed, 57 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
> index cc25f05..3af69cf 100644
> --- a/tools/lib/traceevent/event-parse.c
> +++ b/tools/lib/traceevent/event-parse.c
> @@ -783,6 +783,7 @@ static void free_arg(struct print_arg *arg)
>  		free(arg->bitmask.bitmask);
>  		break;
>  	case PRINT_DYNAMIC_ARRAY:
> +	case PRINT_DYNAMIC_ARRAY_LEN:
>  		free(arg->dynarray.index);
>  		break;
>  	case PRINT_OP:
> @@ -2655,6 +2656,42 @@ process_dynamic_array(struct event_format *event, struct print_arg *arg, char **
>  }
>  
>  static enum event_type
> +process_dynamic_array_len(struct event_format *event, struct print_arg *arg,
> +			  char **tok)
> +{
> +	struct format_field *field;
> +	enum event_type type;
> +	char *token;
> +
> +	if (read_expect_type(EVENT_ITEM, &token) < 0)
> +		goto out_free;
> +
> +	arg->type = PRINT_DYNAMIC_ARRAY_LEN;
> +
> +	/* Find the field */
> +	field = pevent_find_field(event, token);
> +	if (!field)
> +		goto out_free;
> +
> +	arg->dynarray.field = field;
> +	arg->dynarray.index = 0;
> +
> +	if (read_expected(EVENT_DELIM, ")") < 0)
> +		goto out_err;
> +
> +	type = read_token(&token);
> +	*tok = token;
> +
> +	return type;
> +
> + out_free:
> +	free_token(token);
> + out_err:
> +	*tok = NULL;
> +	return EVENT_ERROR;
> +}
> +
> +static enum event_type
>  process_paren(struct event_format *event, struct print_arg *arg, char **tok)
>  {
>  	struct print_arg *item_arg;
> @@ -2901,6 +2938,10 @@ process_function(struct event_format *event, struct print_arg *arg,
>  		free_token(token);
>  		return process_dynamic_array(event, arg, tok);
>  	}
> +	if (strcmp(token, "__get_dynamic_array_len") == 0) {
> +		free_token(token);
> +		return process_dynamic_array_len(event, arg, tok);
> +	}
>  
>  	func = find_func_handler(event->pevent, token);
>  	if (func) {
> @@ -3581,14 +3622,25 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
>  			goto out_warning_op;
>  		}
>  		break;
> +	case PRINT_DYNAMIC_ARRAY_LEN:
> +		offset = pevent_read_number(pevent,
> +					    data + arg->dynarray.field->offset,
> +					    arg->dynarray.field->size);
> +		/*
> +		 * The total allocated length of the dynamic array is
> +		 * stored in the top half of the field, and the offset
> +		 * is in the bottom half of the 32 bit field.
> +		 */
> +		val = (unsigned long long)(offset >> 16);
> +		break;
>  	case PRINT_DYNAMIC_ARRAY:
>  		/* Without [], we pass the address to the dynamic data */
>  		offset = pevent_read_number(pevent,
>  					    data + arg->dynarray.field->offset,
>  					    arg->dynarray.field->size);
>  		/*
> -		 * The actual length of the dynamic array is stored
> -		 * in the top half of the field, and the offset
> +		 * The total allocated length of the dynamic array is
> +		 * stored in the top half of the field, and the offset
>  		 * is in the bottom half of the 32 bit field.
>  		 */
>  		offset &= 0xffff;
> diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
> index 063b197..94543aa 100644
> --- a/tools/lib/traceevent/event-parse.h
> +++ b/tools/lib/traceevent/event-parse.h
> @@ -294,6 +294,7 @@ enum print_arg_type {
>  	PRINT_OP,
>  	PRINT_FUNC,
>  	PRINT_BITMASK,
> +	PRINT_DYNAMIC_ARRAY_LEN,
>  };
>  
>  struct print_arg {
> diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
> index 1bd593b..544509c 100644
> --- a/tools/perf/util/scripting-engines/trace-event-perl.c
> +++ b/tools/perf/util/scripting-engines/trace-event-perl.c
> @@ -221,6 +221,7 @@ static void define_event_symbols(struct event_format *event,
>  		break;
>  	case PRINT_BSTRING:
>  	case PRINT_DYNAMIC_ARRAY:
> +	case PRINT_DYNAMIC_ARRAY_LEN:
>  	case PRINT_STRING:
>  	case PRINT_BITMASK:
>  		break;
> diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
> index ace2484..aa9e125 100644
> --- a/tools/perf/util/scripting-engines/trace-event-python.c
> +++ b/tools/perf/util/scripting-engines/trace-event-python.c
> @@ -251,6 +251,7 @@ static void define_event_symbols(struct event_format *event,
>  		/* gcc warns for these? */
>  	case PRINT_BSTRING:
>  	case PRINT_DYNAMIC_ARRAY:
> +	case PRINT_DYNAMIC_ARRAY_LEN:
>  	case PRINT_FUNC:
>  	case PRINT_BITMASK:
>  		/* we should warn... */
> -- 
> 1.8.5.2
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v6 2/2] bpf: Introduce function for outputing data to perf event
  2015-07-21  3:08 ` [RFC PATCH v6 2/2] bpf: Introduce function for outputing data to perf event He Kuang
@ 2015-07-21 15:13   ` Namhyung Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Namhyung Kim @ 2015-07-21 15:13 UTC (permalink / raw)
  To: He Kuang
  Cc: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	jolsa, wangnan0, pi3orama, linux-kernel

On Tue, Jul 21, 2015 at 03:08:50AM +0000, He Kuang wrote:
> There're scenarios that we need an eBPF program to record not only
> kprobe point args, but also the PMU counters, time latencies or the
> number of cache misses between two probe points and other information
> when the probe point is entered.
> 
> This patch adds a new trace event to establish infrastruction for bpf to
> output data to perf. Userspace perf tools can detect and use this event
> as using the existing tracepoint events.
> 
> New bpf trace event entry in debugfs:
> 
>      /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> 
> Userspace perf tools detect the new tracepoint event as:
> 
>      bpf:bpf_output_data                          [Tracepoint event]
> 
> Data in ring-buffer of perf events added to this event will be polled
> out, sample types and other attributes can be adjusted to those events
> directly without touching the original kprobe events.
> 
> The bpf helper function gives eBPF program ability to output data as
> perf sample event. This helper simple call the new trace event and
> userspace perf tools can record the BPF ftrace event to collect those
> records.
> 
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Acked-by: Alexei Starovoitov <ast@plumgrid.com>

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung


> ---
>  include/trace/events/bpf.h | 30 ++++++++++++++++++++++++++++++
>  include/uapi/linux/bpf.h   |  7 +++++++
>  kernel/trace/bpf_trace.c   | 23 +++++++++++++++++++++++
>  samples/bpf/bpf_helpers.h  |  2 ++
>  4 files changed, 62 insertions(+)
>  create mode 100644 include/trace/events/bpf.h
> 
> diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
> new file mode 100644
> index 0000000..6b739b8
> --- /dev/null
> +++ b/include/trace/events/bpf.h
> @@ -0,0 +1,30 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM bpf
> +
> +#if !defined(_TRACE_BPF_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_BPF_H
> +
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(bpf_output_data,
> +
> +	TP_PROTO(u64 *src, int size),
> +
> +	TP_ARGS(src, size),
> +
> +	TP_STRUCT__entry(
> +		__dynamic_array(u8,		buf,		size)
> +	),
> +
> +	TP_fast_assign(
> +		memcpy(__get_dynamic_array(buf), src, size);
> +	),
> +
> +	TP_printk("%s", __print_hex(__get_dynamic_array(buf),
> +				    __get_dynamic_array_len(buf)))
> +);
> +
> +#endif /* _TRACE_BPF_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 29ef6f9..5068ab1 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -249,6 +249,13 @@ enum bpf_func_id {
>  	 * Return: 0 on success
>  	 */
>  	BPF_FUNC_get_current_comm,
> +
> +	/**
> +	 * int bpf_output_trace_data(void *src, int size)
> +	 * Return: 0 on success
> +	 */
> +	BPF_FUNC_output_trace_data,
> +
>  	__BPF_FUNC_MAX_ID,
>  };
>  
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 88a041a..219f670 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -11,7 +11,10 @@
>  #include <linux/filter.h>
>  #include <linux/uaccess.h>
>  #include <linux/ctype.h>
> +
>  #include "trace.h"
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/bpf.h>
>  
>  static DEFINE_PER_CPU(int, bpf_prog_active);
>  
> @@ -79,6 +82,24 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
>  	.arg3_type	= ARG_ANYTHING,
>  };
>  
> +static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
> +{
> +	void *src = (void *) (long) r1;
> +	int size = (int) r2;
> +
> +	trace_bpf_output_data(src, size);
> +
> +	return 0;
> +}
> +
> +static const struct bpf_func_proto bpf_output_trace_data_proto = {
> +	.func		= bpf_output_trace_data,
> +	.gpl_only	= true,
> +	.ret_type	= RET_INTEGER,
> +	.arg1_type	= ARG_PTR_TO_STACK,
> +	.arg2_type	= ARG_CONST_STACK_SIZE,
> +};
> +
>  /*
>   * limited trace_printk()
>   * only %d %u %x %ld %lu %lx %lld %llu %llx %p conversion specifiers allowed
> @@ -169,6 +190,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
>  		return &bpf_map_delete_elem_proto;
>  	case BPF_FUNC_probe_read:
>  		return &bpf_probe_read_proto;
> +	case BPF_FUNC_output_trace_data:
> +		return &bpf_output_trace_data_proto;
>  	case BPF_FUNC_ktime_get_ns:
>  		return &bpf_ktime_get_ns_proto;
>  	case BPF_FUNC_tail_call:
> diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
> index bdf1c16..0aeaebe 100644
> --- a/samples/bpf/bpf_helpers.h
> +++ b/samples/bpf/bpf_helpers.h
> @@ -59,5 +59,7 @@ static int (*bpf_l3_csum_replace)(void *ctx, int off, int from, int to, int flag
>  	(void *) BPF_FUNC_l3_csum_replace;
>  static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flags) =
>  	(void *) BPF_FUNC_l4_csum_replace;
> +static int (*bpf_output_trace_data)(void *src, int size) =
> +	(void *) BPF_FUNC_output_trace_data;
>  
>  #endif
> -- 
> 1.8.5.2
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v6 0/2] Make eBPF programs output data to perf
  2015-07-21  3:08 [RFC PATCH v6 0/2] Make eBPF programs output data to perf He Kuang
  2015-07-21  3:08 ` [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len He Kuang
  2015-07-21  3:08 ` [RFC PATCH v6 2/2] bpf: Introduce function for outputing data to perf event He Kuang
@ 2015-08-13 21:27 ` Alexei Starovoitov
  2015-08-13 21:35   ` pi3orama
  2 siblings, 1 reply; 9+ messages in thread
From: Alexei Starovoitov @ 2015-08-13 21:27 UTC (permalink / raw)
  To: He Kuang, rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra,
	mingo, namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel

On 7/20/15 8:08 PM, He Kuang wrote:
> Hi,
>
> Previous patch v5 url:
> http://thread.gmane.org/gmane.linux.kernel/1995274
>
> The bugfix of dynamic array length in trace event goes to
> kernel/git/rostedt/linux-trace.git ftrace/urgent and confirms that the
> return value of __get_dynamic_array_len() is the total allocated
> length of the dynamic array. For we print the bpf output data in byte
> array from patch v5, that problem does not affect our patch any more,
> but some comments in patch 1/2 is updated.
>
> Patch 2/2 is acked by Alexei.

what is the status of it? If I remember correctly patch 1 was fixed
differently in Steven's tree. Patch 2 probably needs refreshing?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v6 0/2] Make eBPF programs output data to perf
  2015-08-13 21:27 ` [RFC PATCH v6 0/2] Make eBPF programs output data to perf Alexei Starovoitov
@ 2015-08-13 21:35   ` pi3orama
  2015-08-13 21:49     ` Alexei Starovoitov
  0 siblings, 1 reply; 9+ messages in thread
From: pi3orama @ 2015-08-13 21:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: He Kuang, rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra,
	mingo, namhyung, jolsa, wangnan0, linux-kernel



发自我的 iPhone

> 在 2015年8月14日,上午5:27,Alexei Starovoitov <ast@plumgrid.com> 写道:
> 
>> On 7/20/15 8:08 PM, He Kuang wrote:
>> Hi,
>> 
>> Previous patch v5 url:
>> http://thread.gmane.org/gmane.linux.kernel/1995274
>> 
>> The bugfix of dynamic array length in trace event goes to
>> kernel/git/rostedt/linux-trace.git ftrace/urgent and confirms that the
>> return value of __get_dynamic_array_len() is the total allocated
>> length of the dynamic array. For we print the bpf output data in byte
>> array from patch v5, that problem does not affect our patch any more,
>> but some comments in patch 1/2 is updated.
>> 
>> Patch 2/2 is acked by Alexei.
> 
> what is the status of it? If I remember correctly patch 1 was fixed
> differently in Steven's tree. Patch 2 probably needs refreshing?
> 

I was thinking about whether to add a "type" field there, so we will have an explicit
mov const instruction before the call instruction, which can act as a mark. Also, if
we generate the type code automatically, a type field in this API can make things
easier since we don't need wrap the user structure in BPF stack. However, the
LLVM side is not ready yet, so we haven't post the new version.

Thank you.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v6 0/2] Make eBPF programs output data to perf
  2015-08-13 21:35   ` pi3orama
@ 2015-08-13 21:49     ` Alexei Starovoitov
  2015-08-19  3:05       ` Wangnan (F)
  0 siblings, 1 reply; 9+ messages in thread
From: Alexei Starovoitov @ 2015-08-13 21:49 UTC (permalink / raw)
  To: pi3orama
  Cc: He Kuang, rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra,
	mingo, namhyung, jolsa, wangnan0, linux-kernel

On 8/13/15 2:35 PM, pi3orama wrote:
> I was thinking about whether to add a "type" field there, so we will have an explicit
> mov const instruction before the call instruction, which can act as a mark. Also, if
> we generate the type code automatically, a type field in this API can make things
> easier since we don't need wrap the user structure in BPF stack. However, the
> LLVM side is not ready yet, so we haven't post the new version.

I think the helper was clean enough. Any type info probably needs to be
done as a side channel and not part of the helper anyway.
But, ok, let's figure out the type stuff first.
Also I don't think you can rely on extra insn in front of a call insn.
Compiler can freely insert other insns there. You don't want to
introduce data flow analysis in elf parser.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v6 0/2] Make eBPF programs output data to perf
  2015-08-13 21:49     ` Alexei Starovoitov
@ 2015-08-19  3:05       ` Wangnan (F)
  0 siblings, 0 replies; 9+ messages in thread
From: Wangnan (F) @ 2015-08-19  3:05 UTC (permalink / raw)
  To: Alexei Starovoitov, pi3orama, rostedt
  Cc: He Kuang, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa, linux-kernel



On 2015/8/14 5:49, Alexei Starovoitov wrote:
> On 8/13/15 2:35 PM, pi3orama wrote:
>> I was thinking about whether to add a "type" field there, so we will 
>> have an explicit
>> mov const instruction before the call instruction, which can act as a 
>> mark. Also, if
>> we generate the type code automatically, a type field in this API can 
>> make things
>> easier since we don't need wrap the user structure in BPF stack. 
>> However, the
>> LLVM side is not ready yet, so we haven't post the new version.
>
> I think the helper was clean enough. Any type info probably needs to be
> done as a side channel and not part of the helper anyway.
> But, ok, let's figure out the type stuff first.
> Also I don't think you can rely on extra insn in front of a call insn.
> Compiler can freely insert other insns there. You don't want to
> introduce data flow analysis in elf parser.
>
I agree with you. I think we can rely on user providing correct type 
information.

Then we should make this two patches go into kernel. Both 1/2 and 2/2 are
required.

The bug mentioned in patch 1/3 of v5 series
(http://lkml.kernel.org/r/1436839171-31527-2-git-send-email-hekuang@huawei.com)
has already been fixed by adjusting sample
(d6726c8145290bef950ae2538ea6ae1d96a1944b)

So we only need these two patches. Currently they can be applied to 
mainline master
clearly since you haven't add new BPF functions.

Thank you.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-08-19  3:06 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-21  3:08 [RFC PATCH v6 0/2] Make eBPF programs output data to perf He Kuang
2015-07-21  3:08 ` [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len He Kuang
2015-07-21 15:08   ` Namhyung Kim
2015-07-21  3:08 ` [RFC PATCH v6 2/2] bpf: Introduce function for outputing data to perf event He Kuang
2015-07-21 15:13   ` Namhyung Kim
2015-08-13 21:27 ` [RFC PATCH v6 0/2] Make eBPF programs output data to perf Alexei Starovoitov
2015-08-13 21:35   ` pi3orama
2015-08-13 21:49     ` Alexei Starovoitov
2015-08-19  3:05       ` Wangnan (F)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).