linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v4 0/3] Make eBPF programs output data to perf event
@ 2015-07-10 10:03 He Kuang
  2015-07-10 10:03 ` [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size He Kuang
                   ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: He Kuang @ 2015-07-10 10:03 UTC (permalink / raw)
  To: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel, hekuang

Hi,

Previous discussion url(patch v3):
http://thread.gmane.org/gmane.linux.kernel/1990197/focus=1991022

We found that creating new trace event for bpf subsystem is more
simple than adding new ftrace:bpf entry, the only thing we should do
for outputing bpf sample events is to call a predefined trace event
function. Then it is easy to extend new output formats and not
restricted to one fixed format like ftrace:function.

By using trace events, we can also benifit from the dynamic array
field type for outputing results by number of items we filled in, and
achieve the purpose of standardization of output format. Function for
getting number of items in dynamic array is added to libtraceevent and
a improper result of macro __get_dynamic_array_len is corrected.

eBPF sample code, tests outputing multiple items:

  SEC("generic_perform_write=generic_perform_write")
  int NODE_generic_perform_write(struct pt_regs *ctx)
  {
         char fmt[] = "generic_perform_write last=%lld, cur=%lld, del=%lld\n";
         u64 cur_time, del_time, result[3] = {0};
         int ind =0;
         struct time_table *last = bpf_map_lookup_elem(&global_time_table, &ind);
         struct time_table output;

         if (!last)
                 return 0;
  
         cur_time = bpf_ktime_get_ns();
  
         if (!last->last_time)
                 del_time = 0;
         else
                 del_time = cur_time - last->last_time;
  
         /* For debug */
         bpf_trace_printk(fmt, sizeof(fmt), last->last_time, cur_time, del_time);
         result[0] = last->last_time;
  
         /* Table update */
         output.last_time = cur_time;
         bpf_map_update_elem(&global_time_table, &ind, &output, BPF_ANY);
  
         /* This is a casual condition to show the funciton */
         if (del_time < 1000)
                 return 0;
  
         result[1] = cur_time;
         result[2] = del_time;
         bpf_output_trace_data(result, sizeof(result));
  
         return 0;
  }

Record bpf events:

  $ perf record -e bpf:bpf_output_data -e sample.o --
    dd if=/dev/zero of=test bs=4k count=3

Results in /sys/kernel/debug/tracing/trace:

  dd-984   [000] d...    60.894097: : generic_perform_write
           last=60560862578, cur=60654629075, del=93766497
  dd-984   [000] d...    60.896957: : generic_perform_write
           last=60654629075, cur=60657510709, del=2881634
  dd-984   [000] d...    60.897276: : generic_perform_write
           last=60657510709, cur=60657829953, del=319244
           
Results showed in perf-script:

  dd   984 [000]    60.655211: bpf:bpf_output_data: 60560862578 60654629075 93766497
  dd   984 [000]    60.657552: bpf:bpf_output_data: 60654629075 60657510709 2881634
  dd   984 [000]    60.657898: bpf:bpf_output_data: 60657510709 60657829953 319244  

Thank you.

He Kuang (3):
  tracing/events: Fix wrong sample output by storing array length
    instead of size
  tools lib traceevent: Add function to get dynamic arrays length
  bpf: Introduce function for outputing data to perf event

 include/trace/events/bpf.h                         | 30 +++++++++++++
 include/trace/trace_events.h                       |  5 ++-
 include/uapi/linux/bpf.h                           |  7 +++
 kernel/trace/bpf_trace.c                           | 23 ++++++++++
 samples/bpf/bpf_helpers.h                          |  2 +
 tools/lib/traceevent/event-parse.c                 | 52 ++++++++++++++++++++++
 tools/lib/traceevent/event-parse.h                 |  1 +
 .../util/scripting-engines/trace-event-python.c    |  1 +
 8 files changed, 119 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/events/bpf.h

-- 
1.8.5.2


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size
  2015-07-10 10:03 [RFC PATCH v4 0/3] Make eBPF programs output data to perf event He Kuang
@ 2015-07-10 10:03 ` He Kuang
  2015-07-17 14:32   ` Steven Rostedt
  2015-07-10 10:03 ` [RFC PATCH v4 2/3] tools lib traceevent: Add function to get dynamic arrays length He Kuang
  2015-07-10 10:03 ` [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event He Kuang
  2 siblings, 1 reply; 53+ messages in thread
From: He Kuang @ 2015-07-10 10:03 UTC (permalink / raw)
  To: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel, hekuang

The output result of trace_foo_bar event in traceevent samples is
wrong. This problem can be reproduced as following:

  (Build kernel with SAMPLE_TRACE_EVENTS=m)

  $ insmod trace-events-sample.ko

  $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable

  $ cat /sys/kernel/debug/tracing/trace

  event-sample-980 [000] ....  43.649559: foo_bar: foo hello 21 0x15
  BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 The array length is not right, should be {0x1}.
  (ffffffff,ffffffff)

  event-sample-980 [000] ....  44.653827: foo_bar: foo hello 22 0x16
  BIT2|BIT3|0x10
  {0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 The array length is not right, should be {0x1,0x2}.
  Gandalf (ffffffff,ffffffff)

The event defined in samples/trace_events/trace-events-sample.h uses
this helper function to output dynamic list:

   __print_array(__get_dynamic_array(list),
                 __get_dynamic_array_len(list),
                 sizeof(int))

Currently, __get_dynamic_array_len() returns the total size of the
array instead of the number of items by referencing the high 16 bits
of entry->data_loc_##item. The element size for calculating the number
of items can not be fetched by referencing fields from __entry, so
macro __get_dynamic_array_len can not return the expected value.

This patch stores array item number instead of the total size in
entry->__data_loc_##item, and makes __get_dynamic_array_len get the
right value directly. Because the function __get_bitmask() is affected
by this change, __bitmask_size is assigned to the array len by
multiplied bitmask type size.

After this patch:

  event-sample-993 [000] ....  692.348562: foo_bar: foo hello 201
  0xc9 BIT1|BIT4|0xc0 {0x1} Snoopy (ffffffff,ffffffff)
                      ^^^^^
                      Array length fixed.

  event-sample-993 [000] ....  693.349276: foo_bar: foo hello 202
  0xca BIT2|BIT4|0xc0 {0x1,0x2} Gandalf (ffffffff,ffffffff)
                      ^^^^^^^^^
                      Array length fixed.

Signed-off-by: He Kuang <hekuang@huawei.com>
---
 include/trace/trace_events.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index 43be3b0..5abe027 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -257,7 +257,8 @@ TRACE_MAKE_SYSTEM_STR();
 	({								\
 		void *__bitmask = __get_dynamic_array(field);		\
 		unsigned int __bitmask_size;				\
-		__bitmask_size = __get_dynamic_array_len(field);	\
+		__bitmask_size = (__get_dynamic_array_len(field) *	\
+				  sizeof(unsigned long));		\
 		trace_print_bitmask_seq(p, __bitmask, __bitmask_size);	\
 	})
 
@@ -453,7 +454,7 @@ trace_event_define_fields_##call(struct trace_event_call *event_call)	\
 	__item_length = (len) * sizeof(type);				\
 	__data_offsets->item = __data_size +				\
 			       offsetof(typeof(*entry), __data);	\
-	__data_offsets->item |= __item_length << 16;			\
+	__data_offsets->item |= (len) << 16;				\
 	__data_size += __item_length;
 
 #undef __string
-- 
1.8.5.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v4 2/3] tools lib traceevent: Add function to get dynamic arrays length
  2015-07-10 10:03 [RFC PATCH v4 0/3] Make eBPF programs output data to perf event He Kuang
  2015-07-10 10:03 ` [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size He Kuang
@ 2015-07-10 10:03 ` He Kuang
  2015-07-10 10:03 ` [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event He Kuang
  2 siblings, 0 replies; 53+ messages in thread
From: He Kuang @ 2015-07-10 10:03 UTC (permalink / raw)
  To: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel, hekuang

To print a trace event with a dynamic array, __print_array(array, len,
element_size) requires the number of items in the array, which can be
got by the helper function __get_dynamic_array_len(), currently it is
not an available function in the function list in process_function().

Add new arg type PRINT_DYNAMIC_ARRAY_LEN which returns the array
length embedded in the __data_loc_##item field to eval_num_arg().

Signed-off-by: He Kuang <hekuang@huawei.com>
---
 tools/lib/traceevent/event-parse.c                 | 52 ++++++++++++++++++++++
 tools/lib/traceevent/event-parse.h                 |  1 +
 .../util/scripting-engines/trace-event-python.c    |  1 +
 3 files changed, 54 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index cc25f05..55fc3b6c 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -783,6 +783,7 @@ static void free_arg(struct print_arg *arg)
 		free(arg->bitmask.bitmask);
 		break;
 	case PRINT_DYNAMIC_ARRAY:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 		free(arg->dynarray.index);
 		break;
 	case PRINT_OP:
@@ -2655,6 +2656,42 @@ process_dynamic_array(struct event_format *event, struct print_arg *arg, char **
 }
 
 static enum event_type
+process_dynamic_array_len(struct event_format *event, struct print_arg *arg,
+			  char **tok)
+{
+	struct format_field *field;
+	enum event_type type;
+	char *token;
+
+	if (read_expect_type(EVENT_ITEM, &token) < 0)
+		goto out_free;
+
+	arg->type = PRINT_DYNAMIC_ARRAY_LEN;
+
+	/* Find the field */
+	field = pevent_find_field(event, token);
+	if (!field)
+		goto out_free;
+
+	arg->dynarray.field = field;
+	arg->dynarray.index = 0;
+
+	if (read_expected(EVENT_DELIM, ")") < 0)
+		goto out_err;
+
+	type = read_token(&token);
+	*tok = token;
+
+	return type;
+
+ out_free:
+	free_token(token);
+ out_err:
+	*tok = NULL;
+	return EVENT_ERROR;
+}
+
+static enum event_type
 process_paren(struct event_format *event, struct print_arg *arg, char **tok)
 {
 	struct print_arg *item_arg;
@@ -2901,6 +2938,10 @@ process_function(struct event_format *event, struct print_arg *arg,
 		free_token(token);
 		return process_dynamic_array(event, arg, tok);
 	}
+	if (strcmp(token, "__get_dynamic_array_len") == 0) {
+		free_token(token);
+		return process_dynamic_array_len(event, arg, tok);
+	}
 
 	func = find_func_handler(event->pevent, token);
 	if (func) {
@@ -3581,6 +3622,17 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
 			goto out_warning_op;
 		}
 		break;
+	case PRINT_DYNAMIC_ARRAY_LEN:
+		offset = pevent_read_number(pevent,
+					    data + arg->dynarray.field->offset,
+					    arg->dynarray.field->size);
+		/*
+		 * The actual length of the dynamic array is stored
+		 * in the top half of the field, and the offset
+		 * is in the bottom half of the 32 bit field.
+		 */
+		val = (unsigned long long)(offset >> 16);
+		break;
 	case PRINT_DYNAMIC_ARRAY:
 		/* Without [], we pass the address to the dynamic data */
 		offset = pevent_read_number(pevent,
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 063b197..94543aa 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -294,6 +294,7 @@ enum print_arg_type {
 	PRINT_OP,
 	PRINT_FUNC,
 	PRINT_BITMASK,
+	PRINT_DYNAMIC_ARRAY_LEN,
 };
 
 struct print_arg {
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index ace2484..d309341 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -253,6 +253,7 @@ static void define_event_symbols(struct event_format *event,
 	case PRINT_DYNAMIC_ARRAY:
 	case PRINT_FUNC:
 	case PRINT_BITMASK:
+	case PRINT_DYNAMIC_ARRAY_LEN:
 		/* we should warn... */
 		return;
 	}
-- 
1.8.5.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-10 10:03 [RFC PATCH v4 0/3] Make eBPF programs output data to perf event He Kuang
  2015-07-10 10:03 ` [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size He Kuang
  2015-07-10 10:03 ` [RFC PATCH v4 2/3] tools lib traceevent: Add function to get dynamic arrays length He Kuang
@ 2015-07-10 10:03 ` He Kuang
  2015-07-10 22:10   ` Alexei Starovoitov
  2015-07-13  8:29   ` Peter Zijlstra
  2 siblings, 2 replies; 53+ messages in thread
From: He Kuang @ 2015-07-10 10:03 UTC (permalink / raw)
  To: rostedt, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel, hekuang

There're scenarios that we need an eBPF program to record not only
kprobe point args, but also the PMU counters, time latencies or the
number of cache misses between two probe points and other information
when the probe point is entered.

This patch adds a new trace event to establish infrastruction for bpf to
output data to perf. Userspace perf tools can detect and use this event
as using the existing tracepoint events.

New bpf trace event entry in debugfs:

     /sys/kernel/debug/tracing/events/bpf/bpf_output_data

Userspace perf tools detect the new tracepoint event as:

     bpf:bpf_output_data                          [Tracepoint event]

Data in ring-buffer of perf events added to this event will be polled
out, sample types and other attributes can be adjusted to those events
directly without touching the original kprobe events.

The bpf helper function gives eBPF program ability to output data as
perf sample event. This helper simple call the new trace event and
userspace perf tools can record the BPF ftrace event to collect those
records.

Signed-off-by: He Kuang <hekuang@huawei.com>
---
 include/trace/events/bpf.h | 30 ++++++++++++++++++++++++++++++
 include/uapi/linux/bpf.h   |  7 +++++++
 kernel/trace/bpf_trace.c   | 23 +++++++++++++++++++++++
 samples/bpf/bpf_helpers.h  |  2 ++
 4 files changed, 62 insertions(+)
 create mode 100644 include/trace/events/bpf.h

diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
new file mode 100644
index 0000000..a659a91
--- /dev/null
+++ b/include/trace/events/bpf.h
@@ -0,0 +1,30 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM bpf
+
+#if !defined(_TRACE_BPF_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_BPF_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(bpf_output_data,
+
+	TP_PROTO(u64 *src, int len),
+
+	TP_ARGS(src, len),
+
+	TP_STRUCT__entry(
+		__dynamic_array(u64,		buf,		len)
+	),
+
+	TP_fast_assign(
+		memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
+	),
+
+	TP_printk("%s", __print_array(__get_dynamic_array(buf),
+				      __get_dynamic_array_len(buf), 8))
+);
+
+#endif /* _TRACE_BPF_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 29ef6f9..5068ab1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -249,6 +249,13 @@ enum bpf_func_id {
 	 * Return: 0 on success
 	 */
 	BPF_FUNC_get_current_comm,
+
+	/**
+	 * int bpf_output_trace_data(void *src, int size)
+	 * Return: 0 on success
+	 */
+	BPF_FUNC_output_trace_data,
+
 	__BPF_FUNC_MAX_ID,
 };
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 88a041a..31fc31a 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -11,7 +11,10 @@
 #include <linux/filter.h>
 #include <linux/uaccess.h>
 #include <linux/ctype.h>
+
 #include "trace.h"
+#define CREATE_TRACE_POINTS
+#include <trace/events/bpf.h>
 
 static DEFINE_PER_CPU(int, bpf_prog_active);
 
@@ -79,6 +82,24 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
+static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
+{
+	void *src = (void *) (long) r1;
+	int size = (int) r2;
+
+	trace_bpf_output_data(src, size / sizeof(u64));
+
+	return 0;
+}
+
+static const struct bpf_func_proto bpf_output_trace_data_proto = {
+	.func		= bpf_output_trace_data,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_STACK,
+	.arg2_type	= ARG_CONST_STACK_SIZE,
+};
+
 /*
  * limited trace_printk()
  * only %d %u %x %ld %lu %lx %lld %llu %llx %p conversion specifiers allowed
@@ -169,6 +190,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
 		return &bpf_map_delete_elem_proto;
 	case BPF_FUNC_probe_read:
 		return &bpf_probe_read_proto;
+	case BPF_FUNC_output_trace_data:
+		return &bpf_output_trace_data_proto;
 	case BPF_FUNC_ktime_get_ns:
 		return &bpf_ktime_get_ns_proto;
 	case BPF_FUNC_tail_call:
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index bdf1c16..0aeaebe 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -59,5 +59,7 @@ static int (*bpf_l3_csum_replace)(void *ctx, int off, int from, int to, int flag
 	(void *) BPF_FUNC_l3_csum_replace;
 static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flags) =
 	(void *) BPF_FUNC_l4_csum_replace;
+static int (*bpf_output_trace_data)(void *src, int size) =
+	(void *) BPF_FUNC_output_trace_data;
 
 #endif
-- 
1.8.5.2


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-10 10:03 ` [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event He Kuang
@ 2015-07-10 22:10   ` Alexei Starovoitov
  2015-07-13  4:36     ` He Kuang
  2015-07-13  8:29   ` Peter Zijlstra
  1 sibling, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-10 22:10 UTC (permalink / raw)
  To: He Kuang, rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra,
	mingo, namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel

On 7/10/15 3:03 AM, He Kuang wrote:
> There're scenarios that we need an eBPF program to record not only
> kprobe point args, but also the PMU counters, time latencies or the
> number of cache misses between two probe points and other information
> when the probe point is entered.
>
> This patch adds a new trace event to establish infrastruction for bpf to
> output data to perf. Userspace perf tools can detect and use this event
> as using the existing tracepoint events.
>
> New bpf trace event entry in debugfs:
>
>       /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>
> Userspace perf tools detect the new tracepoint event as:
>
>       bpf:bpf_output_data                          [Tracepoint event]

Nice! This approach looks cleanest so far.

> +TRACE_EVENT(bpf_output_data,
> +
> +	TP_PROTO(u64 *src, int len),
> +
> +	TP_ARGS(src, len),
> +
> +	TP_STRUCT__entry(
> +		__dynamic_array(u64,		buf,		len)
> +	),
> +
> +	TP_fast_assign(
> +		memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));

may be make it 'u8' array? The extra multiply and...

> +static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
> +{
> +	void *src = (void *) (long) r1;
> +	int size = (int) r2;
> +
> +	trace_bpf_output_data(src, size / sizeof(u64));

.. and this silent round down could be confusing to use.
With array of u8, the program can push any structured data into it
and let user space interpret it.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-10 22:10   ` Alexei Starovoitov
@ 2015-07-13  4:36     ` He Kuang
  2015-07-13 13:52       ` Namhyung Kim
  0 siblings, 1 reply; 53+ messages in thread
From: He Kuang @ 2015-07-13  4:36 UTC (permalink / raw)
  To: Alexei Starovoitov, rostedt, masami.hiramatsu.pt, acme,
	a.p.zijlstra, mingo, namhyung, jolsa
  Cc: wangnan0, pi3orama, linux-kernel

hi, Alexei

On 2015/7/11 6:10, Alexei Starovoitov wrote:
> On 7/10/15 3:03 AM, He Kuang wrote:
>> There're scenarios that we need an eBPF program to record not only
>> kprobe point args, but also the PMU counters, time latencies or the
>> number of cache misses between two probe points and other information
>> when the probe point is entered.
>>
>> This patch adds a new trace event to establish infrastruction for bpf to
>> output data to perf. Userspace perf tools can detect and use this event
>> as using the existing tracepoint events.
>>
>> New bpf trace event entry in debugfs:
>>
>>       /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>>
>> Userspace perf tools detect the new tracepoint event as:
>>
>>       bpf:bpf_output_data                          [Tracepoint event]
>
> Nice! This approach looks cleanest so far.
>
>> +TRACE_EVENT(bpf_output_data,
>> +
>> +    TP_PROTO(u64 *src, int len),
>> +
>> +    TP_ARGS(src, len),
>> +
>> +    TP_STRUCT__entry(
>> +        __dynamic_array(u64,        buf,        len)
>> +    ),
>> +
>> +    TP_fast_assign(
>> +        memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
>
> may be make it 'u8' array? The extra multiply and...

OK

So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
0x623eb6d) will be this:

   dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
   f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00

And users are not restricted to u64 type elements. I'll change that.

>
>> +static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
>> +{
>> +    void *src = (void *) (long) r1;
>> +    int size = (int) r2;
>> +
>> +    trace_bpf_output_data(src, size / sizeof(u64));
>
> .. and this silent round down could be confusing to use.
> With array of u8, the program can push any structured data into it
> and let user space interpret it.
>
>


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-10 10:03 ` [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event He Kuang
  2015-07-10 22:10   ` Alexei Starovoitov
@ 2015-07-13  8:29   ` Peter Zijlstra
  1 sibling, 0 replies; 53+ messages in thread
From: Peter Zijlstra @ 2015-07-13  8:29 UTC (permalink / raw)
  To: He Kuang
  Cc: rostedt, ast, masami.hiramatsu.pt, acme, mingo, namhyung, jolsa,
	wangnan0, pi3orama, linux-kernel

On Fri, Jul 10, 2015 at 10:03:07AM +0000, He Kuang wrote:

>  include/trace/events/bpf.h | 30 ++++++++++++++++++++++++++++++
>  include/uapi/linux/bpf.h   |  7 +++++++
>  kernel/trace/bpf_trace.c   | 23 +++++++++++++++++++++++
>  samples/bpf/bpf_helpers.h  |  2 ++
>  4 files changed, 62 insertions(+)
>  create mode 100644 include/trace/events/bpf.h

Note that the subject is misleading, nothing in this entire patch series
has anything to do with perf events.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-13  4:36     ` He Kuang
@ 2015-07-13 13:52       ` Namhyung Kim
  2015-07-13 14:01         ` pi3orama
  0 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2015-07-13 13:52 UTC (permalink / raw)
  To: He Kuang
  Cc: Alexei Starovoitov, rostedt, masami.hiramatsu.pt, acme,
	a.p.zijlstra, mingo, jolsa, wangnan0, pi3orama, linux-kernel

Hi,

On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
> hi, Alexei
> 
> On 2015/7/11 6:10, Alexei Starovoitov wrote:
> >On 7/10/15 3:03 AM, He Kuang wrote:
> >>There're scenarios that we need an eBPF program to record not only
> >>kprobe point args, but also the PMU counters, time latencies or the
> >>number of cache misses between two probe points and other information
> >>when the probe point is entered.
> >>
> >>This patch adds a new trace event to establish infrastruction for bpf to
> >>output data to perf. Userspace perf tools can detect and use this event
> >>as using the existing tracepoint events.
> >>
> >>New bpf trace event entry in debugfs:
> >>
> >>      /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> >>
> >>Userspace perf tools detect the new tracepoint event as:
> >>
> >>      bpf:bpf_output_data                          [Tracepoint event]
> >
> >Nice! This approach looks cleanest so far.
> >
> >>+TRACE_EVENT(bpf_output_data,
> >>+
> >>+    TP_PROTO(u64 *src, int len),
> >>+
> >>+    TP_ARGS(src, len),
> >>+
> >>+    TP_STRUCT__entry(
> >>+        __dynamic_array(u64,        buf,        len)
> >>+    ),
> >>+
> >>+    TP_fast_assign(
> >>+        memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
> >
> >may be make it 'u8' array? The extra multiply and...
> 
> OK
> 
> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
> 0x623eb6d) will be this:
> 
>   dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
>   f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
> 
> And users are not restricted to u64 type elements. I'll change that.

While this general event format works well, I think it might be hard
to know which output came from which program when more than one bpf
programs used.

I was thinking about providing custom event formats for each bpf
program (if needed).  The event format definitions might be in a
specific directory or a bpf object itself.  Then perf can read those
formats and print the output data according to the formats.  Maybe we
need to add some dynamic event id to match format and data.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-13 13:52       ` Namhyung Kim
@ 2015-07-13 14:01         ` pi3orama
  2015-07-13 14:09           ` Namhyung Kim
  0 siblings, 1 reply; 53+ messages in thread
From: pi3orama @ 2015-07-13 14:01 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: He Kuang, Alexei Starovoitov, rostedt, masami.hiramatsu.pt, acme,
	a.p.zijlstra, mingo, jolsa, wangnan0, linux-kernel



发自我的 iPhone

> 在 2015年7月13日,下午9:52,Namhyung Kim <namhyung@kernel.org> 写道:
> 
> Hi,
> 
>> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
>> hi, Alexei
>> 
>>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
>>>> On 7/10/15 3:03 AM, He Kuang wrote:
>>>> There're scenarios that we need an eBPF program to record not only
>>>> kprobe point args, but also the PMU counters, time latencies or the
>>>> number of cache misses between two probe points and other information
>>>> when the probe point is entered.
>>>> 
>>>> This patch adds a new trace event to establish infrastruction for bpf to
>>>> output data to perf. Userspace perf tools can detect and use this event
>>>> as using the existing tracepoint events.
>>>> 
>>>> New bpf trace event entry in debugfs:
>>>> 
>>>>     /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>>>> 
>>>> Userspace perf tools detect the new tracepoint event as:
>>>> 
>>>>     bpf:bpf_output_data                          [Tracepoint event]
>>> 
>>> Nice! This approach looks cleanest so far.
>>> 
>>>> +TRACE_EVENT(bpf_output_data,
>>>> +
>>>> +    TP_PROTO(u64 *src, int len),
>>>> +
>>>> +    TP_ARGS(src, len),
>>>> +
>>>> +    TP_STRUCT__entry(
>>>> +        __dynamic_array(u64,        buf,        len)
>>>> +    ),
>>>> +
>>>> +    TP_fast_assign(
>>>> +        memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
>>> 
>>> may be make it 'u8' array? The extra multiply and...
>> 
>> OK
>> 
>> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
>> 0x623eb6d) will be this:
>> 
>>  dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
>>  f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
>> 
>> And users are not restricted to u64 type elements. I'll change that.
> 
> While this general event format works well, I think it might be hard
> to know which output came from which program when more than one bpf
> programs used.
> 
> I was thinking about providing custom event formats for each bpf
> program (if needed).  The event format definitions might be in a
> specific directory or a bpf object itself.  Then perf can read those
> formats and print the output data according to the formats.  Maybe we
> need to add some dynamic event id to match format and data.
> 

I think we can do it in perf side. Let BPF programs themselves encode format information into the array and make perf read and decode them. In kernel side simply support raw data should be enough, so we can make kernel code as simple as possible.

Thanks.

> Thanks,
> Namhyung


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-13 14:01         ` pi3orama
@ 2015-07-13 14:09           ` Namhyung Kim
  2015-07-13 14:29             ` pi3orama
  0 siblings, 1 reply; 53+ messages in thread
From: Namhyung Kim @ 2015-07-13 14:09 UTC (permalink / raw)
  To: pi3orama
  Cc: He Kuang, Alexei Starovoitov, rostedt, masami.hiramatsu.pt, acme,
	a.p.zijlstra, mingo, jolsa, wangnan0, linux-kernel

On Mon, Jul 13, 2015 at 10:01:26PM +0800, pi3orama wrote:
> 
> 
> 发自我的 iPhone
> 
> > 在 2015年7月13日,下午9:52,Namhyung Kim <namhyung@kernel.org> 写道:
> > 
> > Hi,
> > 
> >> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
> >> hi, Alexei
> >> 
> >>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
> >>>> On 7/10/15 3:03 AM, He Kuang wrote:
> >>>> There're scenarios that we need an eBPF program to record not only
> >>>> kprobe point args, but also the PMU counters, time latencies or the
> >>>> number of cache misses between two probe points and other information
> >>>> when the probe point is entered.
> >>>> 
> >>>> This patch adds a new trace event to establish infrastruction for bpf to
> >>>> output data to perf. Userspace perf tools can detect and use this event
> >>>> as using the existing tracepoint events.
> >>>> 
> >>>> New bpf trace event entry in debugfs:
> >>>> 
> >>>>     /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> >>>> 
> >>>> Userspace perf tools detect the new tracepoint event as:
> >>>> 
> >>>>     bpf:bpf_output_data                          [Tracepoint event]
> >>> 
> >>> Nice! This approach looks cleanest so far.
> >>> 
> >>>> +TRACE_EVENT(bpf_output_data,
> >>>> +
> >>>> +    TP_PROTO(u64 *src, int len),
> >>>> +
> >>>> +    TP_ARGS(src, len),
> >>>> +
> >>>> +    TP_STRUCT__entry(
> >>>> +        __dynamic_array(u64,        buf,        len)
> >>>> +    ),
> >>>> +
> >>>> +    TP_fast_assign(
> >>>> +        memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
> >>> 
> >>> may be make it 'u8' array? The extra multiply and...
> >> 
> >> OK
> >> 
> >> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
> >> 0x623eb6d) will be this:
> >> 
> >>  dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
> >>  f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
> >> 
> >> And users are not restricted to u64 type elements. I'll change that.
> > 
> > While this general event format works well, I think it might be hard
> > to know which output came from which program when more than one bpf
> > programs used.
> > 
> > I was thinking about providing custom event formats for each bpf
> > program (if needed).  The event format definitions might be in a
> > specific directory or a bpf object itself.  Then perf can read those
> > formats and print the output data according to the formats.  Maybe we
> > need to add some dynamic event id to match format and data.
> > 
>
> I think we can do it in perf side. Let BPF programs themselves
> encode format information into the array and make perf read and
> decode them. In kernel side simply support raw data should be
> enough, so we can make kernel code as simple as possible.

Yes, of course, I also meant that doing those work all in perf side. :)

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-13 14:09           ` Namhyung Kim
@ 2015-07-13 14:29             ` pi3orama
  2015-07-14  1:43               ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: pi3orama @ 2015-07-13 14:29 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: He Kuang, Alexei Starovoitov, rostedt, masami.hiramatsu.pt, acme,
	a.p.zijlstra, mingo, jolsa, wangnan0, linux-kernel



发自我的 iPhone

> 在 2015年7月13日,下午10:09,Namhyung Kim <namhyung@kernel.org> 写道:
> 
>> On Mon, Jul 13, 2015 at 10:01:26PM +0800, pi3orama wrote:
>> 
>> 
>> 发自我的 iPhone
>> 
>>> 在 2015年7月13日,下午9:52,Namhyung Kim <namhyung@kernel.org> 写道:
>>> 
>>> Hi,
>>> 
>>>> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
>>>> hi, Alexei
>>>> 
>>>>>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
>>>>>> On 7/10/15 3:03 AM, He Kuang wrote:
>>>>>> There're scenarios that we need an eBPF program to record not only
>>>>>> kprobe point args, but also the PMU counters, time latencies or the
>>>>>> number of cache misses between two probe points and other information
>>>>>> when the probe point is entered.
>>>>>> 
>>>>>> This patch adds a new trace event to establish infrastruction for bpf to
>>>>>> output data to perf. Userspace perf tools can detect and use this event
>>>>>> as using the existing tracepoint events.
>>>>>> 
>>>>>> New bpf trace event entry in debugfs:
>>>>>> 
>>>>>>    /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>>>>>> 
>>>>>> Userspace perf tools detect the new tracepoint event as:
>>>>>> 
>>>>>>    bpf:bpf_output_data                          [Tracepoint event]
>>>>> 
>>>>> Nice! This approach looks cleanest so far.
>>>>> 
>>>>>> +TRACE_EVENT(bpf_output_data,
>>>>>> +
>>>>>> +    TP_PROTO(u64 *src, int len),
>>>>>> +
>>>>>> +    TP_ARGS(src, len),
>>>>>> +
>>>>>> +    TP_STRUCT__entry(
>>>>>> +        __dynamic_array(u64,        buf,        len)
>>>>>> +    ),
>>>>>> +
>>>>>> +    TP_fast_assign(
>>>>>> +        memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
>>>>> 
>>>>> may be make it 'u8' array? The extra multiply and...
>>>> 
>>>> OK
>>>> 
>>>> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
>>>> 0x623eb6d) will be this:
>>>> 
>>>> dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
>>>> f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
>>>> 
>>>> And users are not restricted to u64 type elements. I'll change that.
>>> 
>>> While this general event format works well, I think it might be hard
>>> to know which output came from which program when more than one bpf
>>> programs used.
>>> 
>>> I was thinking about providing custom event formats for each bpf
>>> program (if needed).  The event format definitions might be in a
>>> specific directory or a bpf object itself.  Then perf can read those
>>> formats and print the output data according to the formats.  Maybe we
>>> need to add some dynamic event id to match format and data.
>> 
>> I think we can do it in perf side. Let BPF programs themselves
>> encode format information into the array and make perf read and
>> decode them. In kernel side simply support raw data should be
>> enough, so we can make kernel code as simple as possible.
> 
> Yes, of course, I also meant that doing those work all in perf side. :)
> 

I have an idea on it:

struct x{
 int a;
 int b;
};
struct x __x;

SEC(...)
int func(void *ctx)
{
  struct x myx;
  ...
  myx.a = 1;
  myx.b = 2;
  OUTPUT(&myx, &__x);
  ...
}

In the above program, the '&' operator will emit a relocation, so libbpf will have a chance to know the exact type of the output data. It then can translate into a unique number. The OUTPUT macro should pass the number through the raw data. When decoding, by reading the first word in the raw data perf knows the format. According to it perf can then retrieve the structure format through dwarf information. We can use more macro to make the above code simpler.

We will start working on it after this patch get accepted.

Thank you.


> Thanks,
> Namhyung


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-13 14:29             ` pi3orama
@ 2015-07-14  1:43               ` Alexei Starovoitov
  2015-07-14 11:54                 ` He Kuang
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-14  1:43 UTC (permalink / raw)
  To: pi3orama, Namhyung Kim
  Cc: He Kuang, rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra,
	mingo, jolsa, wangnan0, linux-kernel

On 7/13/15 7:29 AM, pi3orama wrote:
>>>> I was thinking about providing custom event formats for each bpf
>>>> >>>program (if needed).  The event format definitions might be in a
>>>> >>>specific directory or a bpf object itself.  Then perf can read those
>>>> >>>formats and print the output data according to the formats.  Maybe we
>>>> >>>need to add some dynamic event id to match format and data.
>>> >>
>>> >>I think we can do it in perf side. Let BPF programs themselves
>>> >>encode format information into the array and make perf read and
>>> >>decode them. In kernel side simply support raw data should be
>>> >>enough, so we can make kernel code as simple as possible.
>> >
>> >Yes, of course, I also meant that doing those work all in perf side. :)

sounds like a great idea. +1

> I have an idea on it:
>
> struct x{
>   int a;
>   int b;
> };
> struct x __x;
>
> SEC(...)
> int func(void *ctx)
> {
>    struct x myx;
>    ...
>    myx.a = 1;
>    myx.b = 2;
>    OUTPUT(&myx, &__x);
>    ...
> }
>
> In the above program, the '&' operator will emit a relocation,

relocation once emitted will be painful to remove from compiled code.
Why not to use dwarf info from the struct passed into
bpf_output_trace_data()? No extra macros would be needed from users.
I'm not sure llvm generates proper dwarf along with bpf code (I didn't
test that part. If there are any issues they should be fixable. If you
can prepapre a patch for llvm that would be even better :)


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-14  1:43               ` Alexei Starovoitov
@ 2015-07-14 11:54                 ` He Kuang
  2015-07-17  4:11                   ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: He Kuang @ 2015-07-14 11:54 UTC (permalink / raw)
  To: Alexei Starovoitov, pi3orama, Namhyung Kim
  Cc: rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo, jolsa,
	wangnan0, linux-kernel

Hi, Alexei

On 2015/7/14 9:43, Alexei Starovoitov wrote:
> On 7/13/15 7:29 AM, pi3orama wrote:
>>>>> I was thinking about providing custom event formats for each bpf
>>>>> >>>program (if needed).  The event format definitions might be in a
>>>>> >>>specific directory or a bpf object itself.  Then perf can read those
>>>>> >>>formats and print the output data according to the formats.  Maybe we
>>>>> >>>need to add some dynamic event id to match format and data.
>>>> >>
>>>> >>I think we can do it in perf side. Let BPF programs themselves
>>>> >>encode format information into the array and make perf read and
>>>> >>decode them. In kernel side simply support raw data should be
>>>> >>enough, so we can make kernel code as simple as possible.
>>> >
>>> >Yes, of course, I also meant that doing those work all in perf side. :)
>
> sounds like a great idea. +1
>
>> I have an idea on it:
>>
>> struct x{
>>   int a;
>>   int b;
>> };
>> struct x __x;
>>
>> SEC(...)
>> int func(void *ctx)
>> {
>>    struct x myx;
>>    ...
>>    myx.a = 1;
>>    myx.b = 2;
>>    OUTPUT(&myx, &__x);
>>    ...
>> }
>>
>> In the above program, the '&' operator will emit a relocation,
>
> relocation once emitted will be painful to remove from compiled code.
> Why not to use dwarf info from the struct passed into
> bpf_output_trace_data()? No extra macros would be needed from users.

We are turning to obtain infomation from dwarf_info, but still we
should store the struct type in the bpf output raw data, otherwise
perf tools can not distinguish different bpf trace points if
there're multiple trace points in bpf program, because all the
bpf_output sample entries has the same id.
      
> I'm not sure llvm generates proper dwarf along with bpf code (I didn't
> test that part. If there are any issues they should be fixable. If you
> can prepapre a patch for llvm that would be even better :)
>

I found objdump can't get dwarf info from bpf object file:

   $ objdump --dwarf=info bpf.o
   bpf.o: file format elf64-little

   $ readelf -a bpf.o |grep debug_info
   <EMPTY>

'-g' and '-gdwarf=4' options are added to clang flags, and if we
compile object file for other platform such as x86_64, there's no problem.

   $ objdump --dwarf=info x86_64_xx.o |wc -l
   179

   $ readelf -a x86_64_xx.o |grep debug_info
   [10] .debug_info       PROGBITS         0000000000000000  000002b9

I'm not very familar with llvm so can you give me some
suggestions?

Thank you.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-14 11:54                 ` He Kuang
@ 2015-07-17  4:11                   ` Alexei Starovoitov
  2015-07-17  4:14                     ` Wangnan (F)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-17  4:11 UTC (permalink / raw)
  To: He Kuang, pi3orama, Namhyung Kim
  Cc: rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo, jolsa,
	wangnan0, linux-kernel

On 7/14/15 4:54 AM, He Kuang wrote:
>> I'm not sure llvm generates proper dwarf along with bpf code (I didn't
>> test that part. If there are any issues they should be fixable. If you
>> can prepapre a patch for llvm that would be even better :)
>>
>
> I found objdump can't get dwarf info from bpf object file:
>
>    $ objdump --dwarf=info bpf.o
>    bpf.o: file format elf64-little
>
>    $ readelf -a bpf.o |grep debug_info
>    <EMPTY>

yeah. looks like this part is not working.
Interesting that when I do: clang -O2 -target bpf a.c -g -S
there is some minimal debug info in the .s, but .o lacks
debuginfo completely. Digging further...

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-17  4:11                   ` Alexei Starovoitov
@ 2015-07-17  4:14                     ` Wangnan (F)
  2015-07-17  4:27                       ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-07-17  4:14 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang, pi3orama, Namhyung Kim
  Cc: rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo, jolsa,
	linux-kernel



On 2015/7/17 12:11, Alexei Starovoitov wrote:
> On 7/14/15 4:54 AM, He Kuang wrote:
>>> I'm not sure llvm generates proper dwarf along with bpf code (I didn't
>>> test that part. If there are any issues they should be fixable. If you
>>> can prepapre a patch for llvm that would be even better :)
>>>
>>
>> I found objdump can't get dwarf info from bpf object file:
>>
>>    $ objdump --dwarf=info bpf.o
>>    bpf.o: file format elf64-little
>>
>>    $ readelf -a bpf.o |grep debug_info
>>    <EMPTY>
>
> yeah. looks like this part is not working.
> Interesting that when I do: clang -O2 -target bpf a.c -g -S
> there is some minimal debug info in the .s, but .o lacks
> debuginfo completely. Digging further...

Glad to see you start look at it. We are not familiar with LLVM, but I was
told that LLVM has a clean structure and very easy to introduce new 
features.
Could you please give us some hits on it so we can work together?

Thank you.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-17  4:14                     ` Wangnan (F)
@ 2015-07-17  4:27                       ` Alexei Starovoitov
  2015-07-23 11:54                         ` He Kuang
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-17  4:27 UTC (permalink / raw)
  To: Wangnan (F), He Kuang, pi3orama, Namhyung Kim
  Cc: rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo, jolsa,
	linux-kernel

On 7/16/15 9:14 PM, Wangnan (F) wrote:
>
>
> On 2015/7/17 12:11, Alexei Starovoitov wrote:
>> On 7/14/15 4:54 AM, He Kuang wrote:
>>>> I'm not sure llvm generates proper dwarf along with bpf code (I didn't
>>>> test that part. If there are any issues they should be fixable. If you
>>>> can prepapre a patch for llvm that would be even better :)
>>>>
>>>
>>> I found objdump can't get dwarf info from bpf object file:
>>>
>>>    $ objdump --dwarf=info bpf.o
>>>    bpf.o: file format elf64-little
>>>
>>>    $ readelf -a bpf.o |grep debug_info
>>>    <EMPTY>
>>
>> yeah. looks like this part is not working.
>> Interesting that when I do: clang -O2 -target bpf a.c -g -S
>> there is some minimal debug info in the .s, but .o lacks
>> debuginfo completely. Digging further...
>
> Glad to see you start look at it. We are not familiar with LLVM, but I was
> told that LLVM has a clean structure and very easy to introduce new
> features.
> Could you please give us some hits on it so we can work together?

sure. that would be awesome.
In general llmv is very well documented:
http://llvm.org/docs/

In this particular case start with:
clang -O2 -emit-llvm -g a.c -S -o a.ll
in a.ll you'll see llvm bitcode with corresponding debug tags.
note debug info in llvm ir in general is not compatible between
releases. So clang and llc need to match very closely.

Then use:
llc -march=bpf -print-after-all a.ll
you'll see something like:
BB#0: derived from LLVM BB %0
	DBG_VALUE %R1, %noreg, !"a", <!16>; line no:6
	DBG_VALUE %R2, %noreg, !"b", <!16>; line no:6
	DBG_VALUE %R3, %noreg, !"c", <!16>; line no:6
	%R1<def> = MOV_ri 0
	%R2<def> = MOV_ri 3
	JAL <ga:@bar>, %R0<imp-def,dead>, %R1<imp-def,dead>, %R2<imp-def,dead>, 
%R3<imp-def,dead>, %R11<imp-use>, %R1<imp-use>, %R2<imp-use>, ...; 
dbg:a.c:8:2
	%R1<def> = MOV_ri 1; dbg:a.c:9:2

which means that debug info about line numbers and variable
names/types mapping to bpf registers was preserved all the way
till the last pass of the compiler.

It means that the problem is somewhere in 'machine code emitter'
in lib/Target/BPF/MCTargetDesc/*
Likely just some switch is saying to the rest of llvm infra that
this backend is not capable of emitting debug info.

btw, if you can implement 32-bit subregister support for the backend
it would be really awesome. Many programs will benefit and will
become faster.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size
  2015-07-10 10:03 ` [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size He Kuang
@ 2015-07-17 14:32   ` Steven Rostedt
  2015-07-17 17:24     ` Sara Rostedt
  2015-07-17 18:13     ` Steven Rostedt
  0 siblings, 2 replies; 53+ messages in thread
From: Steven Rostedt @ 2015-07-17 14:32 UTC (permalink / raw)
  To: He Kuang
  Cc: ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo, namhyung,
	jolsa, wangnan0, pi3orama, linux-kernel, Alex Bennée

On Fri, 10 Jul 2015 10:03:05 +0000
He Kuang <hekuang@huawei.com> wrote:

> The output result of trace_foo_bar event in traceevent samples is
> wrong. This problem can be reproduced as following:
> 
>   (Build kernel with SAMPLE_TRACE_EVENTS=m)
> 
>   $ insmod trace-events-sample.ko
> 
>   $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable
> 
>   $ cat /sys/kernel/debug/tracing/trace
> 
>   event-sample-980 [000] ....  43.649559: foo_bar: foo hello 21 0x15
>   BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                  The array length is not right, should be {0x1}.
>   (ffffffff,ffffffff)
> 
>   event-sample-980 [000] ....  44.653827: foo_bar: foo hello 22 0x16
>   BIT2|BIT3|0x10
>   {0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                  The array length is not right, should be {0x1,0x2}.
>   Gandalf (ffffffff,ffffffff)
> 
> The event defined in samples/trace_events/trace-events-sample.h uses
> this helper function to output dynamic list:
> 
>    __print_array(__get_dynamic_array(list),
>                  __get_dynamic_array_len(list),
>                  sizeof(int))
> 
> Currently, __get_dynamic_array_len() returns the total size of the
> array instead of the number of items by referencing the high 16 bits
> of entry->data_loc_##item. The element size for calculating the number
> of items can not be fetched by referencing fields from __entry, so
> macro __get_dynamic_array_len can not return the expected value.
> 
> This patch stores array item number instead of the total size in
> entry->__data_loc_##item, and makes __get_dynamic_array_len get the
> right value directly. Because the function __get_bitmask() is affected
> by this change, __bitmask_size is assigned to the array len by
> multiplied bitmask type size.
> 
> After this patch:
> 
>   event-sample-993 [000] ....  692.348562: foo_bar: foo hello 201
>   0xc9 BIT1|BIT4|0xc0 {0x1} Snoopy (ffffffff,ffffffff)
>                       ^^^^^
>                       Array length fixed.
> 
>   event-sample-993 [000] ....  693.349276: foo_bar: foo hello 202
>   0xca BIT2|BIT4|0xc0 {0x1,0x2} Gandalf (ffffffff,ffffffff)
>                       ^^^^^^^^^
>                       Array length fixed.
> 
> Signed-off-by: He Kuang <hekuang@huawei.com>
> ---
>  include/trace/trace_events.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
> index 43be3b0..5abe027 100644
> --- a/include/trace/trace_events.h
> +++ b/include/trace/trace_events.h
> @@ -257,7 +257,8 @@ TRACE_MAKE_SYSTEM_STR();
>  	({								\
>  		void *__bitmask = __get_dynamic_array(field);		\
>  		unsigned int __bitmask_size;				\
> -		__bitmask_size = __get_dynamic_array_len(field);	\
> +		__bitmask_size = (__get_dynamic_array_len(field) *	\
> +				  sizeof(unsigned long));		\
>  		trace_print_bitmask_seq(p, __bitmask, __bitmask_size);	\
>  	})
>  
> @@ -453,7 +454,7 @@ trace_event_define_fields_##call(struct trace_event_call *event_call)	\
>  	__item_length = (len) * sizeof(type);				\
>  	__data_offsets->item = __data_size +				\
>  			       offsetof(typeof(*entry), __data);	\
> -	__data_offsets->item |= __item_length << 16;			\
> +	__data_offsets->item |= (len) << 16;				\

This change affects all callers of dymanic_array, not just bitmasks.

>  	__data_size += __item_length;
>  
>  #undef __string

BTW, if I revert commit ac01ce1410fc2 "tracing: Make
ftrace_print_array_seq compute buf_len" it works again.

I'm going to look into this some more, and maybe the answer is to go
back and just pass in buffer length here. I can't see what was broken
before that change.

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size
  2015-07-17 14:32   ` Steven Rostedt
@ 2015-07-17 17:24     ` Sara Rostedt
  2015-07-17 18:13     ` Steven Rostedt
  1 sibling, 0 replies; 53+ messages in thread
From: Sara Rostedt @ 2015-07-17 17:24 UTC (permalink / raw)
  To: He Kuang, Javi Merino
  Cc: ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo, namhyung,
	jolsa, wangnan0, pi3orama, linux-kernel, Alex Bennée,
	Dave Martin

On Fri, 17 Jul 2015 10:32:15 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> > @@ -453,7 +454,7 @@ trace_event_define_fields_##call(struct trace_event_call *event_call)	\
> >  	__item_length = (len) * sizeof(type);				\
> >  	__data_offsets->item = __data_size +				\
> >  			       offsetof(typeof(*entry), __data);	\
> > -	__data_offsets->item |= __item_length << 16;			\
> > +	__data_offsets->item |= (len) << 16;				\
> 
> This change affects all callers of dymanic_array, not just bitmasks.

This also affects what goes into the trace record and changes what
tools expect. That size is to be the size of the entire allocated
space, not the size of an individual element or the number of them.

> 
> >  	__data_size += __item_length;
> >  
> >  #undef __string
> 
> BTW, if I revert commit ac01ce1410fc2 "tracing: Make
> ftrace_print_array_seq compute buf_len" it works again.
> 
> I'm going to look into this some more, and maybe the answer is to go
> back and just pass in buffer length here. I can't see what was broken
> before that change.

I'm thinking the best thing to do is to revert that patch and make the
second argument of __print_array() the size and not the count.

-- Steve

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size
  2015-07-17 14:32   ` Steven Rostedt
  2015-07-17 17:24     ` Sara Rostedt
@ 2015-07-17 18:13     ` Steven Rostedt
  2015-07-23 19:36       ` Alex Bennée
  1 sibling, 1 reply; 53+ messages in thread
From: Steven Rostedt @ 2015-07-17 18:13 UTC (permalink / raw)
  To: He Kuang
  Cc: ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo, namhyung,
	jolsa, wangnan0, pi3orama, linux-kernel, Alex Bennée

On Fri, 17 Jul 2015 10:32:15 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:


> This change affects all callers of dymanic_array, not just bitmasks.
> 
> >  	__data_size += __item_length;
> >  
> >  #undef __string
> 
> BTW, if I revert commit ac01ce1410fc2 "tracing: Make
> ftrace_print_array_seq compute buf_len" it works again.
> 
> I'm going to look into this some more, and maybe the answer is to go
> back and just pass in buffer length here. I can't see what was broken
> before that change.

OK, the print_array() code is already being used by the thermal events
and can't be changed. But we can't make the proposed change because
that changes the user interface.

What we can change is the sample code!

-- Steve

>From 95de1e9721a2f9d05831a53d228e181a33001c55 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
Date: Fri, 17 Jul 2015 14:03:26 -0400
Subject: [PATCH] tracing: Fix sample output of dynamic arrays

He Kuang noticed that the trace event samples for arrays was broken:

"The output result of trace_foo_bar event in traceevent samples is
 wrong. This problem can be reproduced as following:

  (Build kernel with SAMPLE_TRACE_EVENTS=m)

  $ insmod trace-events-sample.ko

  $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable

  $ cat /sys/kernel/debug/tracing/trace

  event-sample-980 [000] ....  43.649559: foo_bar: foo hello 21 0x15
  BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 The array length is not right, should be {0x1}.
  (ffffffff,ffffffff)

  event-sample-980 [000] ....  44.653827: foo_bar: foo hello 22 0x16
  BIT2|BIT3|0x10
  {0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 The array length is not right, should be {0x1,0x2}.
  Gandalf (ffffffff,ffffffff)"

This was caused by an update to have __print_array()'s second parameter
be the count of items in the array and not the size of the array.

As there is already users of __print_array(), it can not change. But
the sample code can and we can also improve on the documentation about
__print_array() and __get_dynamic_array_len().

Link: http://lkml.kernel.org/r/1436839171-31527-2-git-send-email-hekuang@huawei.com

Fixes: ac01ce1410fc2 ("tracing: Make ftrace_print_array_seq compute buf_len")
Reported-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 samples/trace_events/trace-events-sample.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/samples/trace_events/trace-events-sample.h b/samples/trace_events/trace-events-sample.h
index 8965d1bb8811..125d6402f64f 100644
--- a/samples/trace_events/trace-events-sample.h
+++ b/samples/trace_events/trace-events-sample.h
@@ -168,7 +168,10 @@
  *
  *      For __dynamic_array(int, foo, bar) use __get_dynamic_array(foo)
  *            Use __get_dynamic_array_len(foo) to get the length of the array
- *            saved.
+ *            saved. Note, __get_dynamic_array_len() returns the total allocated
+ *            length of the dynamic array; __print_array() expects the second
+ *            parameter to be the number of elements. To get that, the array length
+ *            needs to be divided by the element size.
  *
  *      For __string(foo, bar) use __get_str(foo)
  *
@@ -288,7 +291,7 @@ TRACE_EVENT(foo_bar,
  *    This prints out the array that is defined by __array in a nice format.
  */
 		  __print_array(__get_dynamic_array(list),
-				__get_dynamic_array_len(list),
+				__get_dynamic_array_len(list) / sizeof(int),
 				sizeof(int)),
 		  __get_str(str), __get_bitmask(cpus))
 );
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-17  4:27                       ` Alexei Starovoitov
@ 2015-07-23 11:54                         ` He Kuang
  2015-07-23 20:49                           ` llvm bpf debug info. " Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: He Kuang @ 2015-07-23 11:54 UTC (permalink / raw)
  To: Alexei Starovoitov, Wangnan (F), pi3orama, Namhyung Kim
  Cc: rostedt, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo, jolsa,
	linux-kernel

Hi, Alexi

Thank you for your guidence, and by referencing your last mail
and other llvm backends, I found setting
BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h
and fix some unhandeled switch can make llc output debug_info,
but important information is missing in the result:

bpf:
<1><2a>: Abbrev Number: 2 (DW_TAG_subprogram)
    <2b>   DW_AT_low_pc      : 0x0
    <33>   DW_AT_high_pc     : 0x60
    <37>   Unknown AT value: 3fe7: 1
    <37>   DW_AT_frame_base  : 1 byte block: 5a         (DW_OP_reg10 (r10))
    <39>   DW_AT_name        : (indirect string, offset: 0x0): clang
                                version 3.7.0 (http://llvm.org/git/clang.git..
    <3d>   DW_AT_decl_file   : 1
    <3e>   DW_AT_decl_line   : 3
    <3f>   DW_AT_prototyped  : 1
    <3f>   DW_AT_type        : <0x65>
    <43>   DW_AT_external    : 1
    <43>   Unknown AT value: 3fe1: 1
<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <44>   DW_AT_name        : (indirect string, offset: 0x0): clang
                                version 3.7.0 (http://llvm.org/git/clang.git..
    <48>   DW_AT_decl_file   : 1
    <49>   DW_AT_decl_line   : 3
    <4a>   DW_AT_type        : <0x65>

Compares to x86 platform result:

<1><26>: Abbrev Number: 2 (DW_TAG_subprogram)
    <27>   DW_AT_low_pc      : 0x0
    <2b>   DW_AT_high_pc     : 0x16
    <2f>   Unknown AT value: 3fe7: 1
    <2f>   DW_AT_frame_base  : 1 byte block: 54         (DW_OP_reg4 (esp))
    <31>   DW_AT_name        : (indirect string, offset: 0xcf): testprog
    <35>   DW_AT_decl_file   : 1
    <36>   DW_AT_decl_line   : 3
    <37>   DW_AT_prototyped  : 1
    <37>   DW_AT_type        : <0x65>
    <3b>   DW_AT_external    : 1
    <3b>   Unknown AT value: 3fe1: 1
<2><3b>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <3c>   DW_AT_location    : 2 byte block: 91 4       (DW_OP_fbreg: 4)
    <3f>   DW_AT_name        : (indirect string, offset: 0xdc): myvar_a
    <43>   DW_AT_decl_file   : 1
    <44>   DW_AT_decl_line   : 3
    <45>   DW_AT_type        : <0x65>

The bpf result lacks of DW_AT_location, and DW_AT_name gives no
infomation.

Then I used 'llc print-after*' command to check each pass and
wanted to find by which step the debug infomation is dropped,
things looks similar until the passes between 'verify' and
'expand-isel-pseudos':

x86:
   $ llc -march=x86 --print-before-all -print-after-all
     -stop-after=expand-isel-pseudos test.ll

   # *** IR Dump Before Expand ISel Pseudo-instructions ***:
   # Machine code for function testprog: SSA
   Frame Objects:
   fi#-2: size=4, align=4, fixed, at location [SP+8]
   fi#-1: size=4, align=16, fixed, at location [SP+4]

   BB#0: derived from LLVM BB %entry
   DBG_VALUE <fi#-1>, 0, !"myvar_a", <!15>; line no:3
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   DBG_VALUE <fi#-2>, 0, !"myvar_b", <!15>; line no:3
   %vreg0<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack-2]
                                                      ; GR32:%vreg0

bpf:
   $ llc -march=bpf --print-before-all -print-after-all
     -stop-after=expand-isel-pseudos test.ll

   # *** IR Dump Before Expand ISel Pseudo-instructions ***:
   # Machine code for function testprog: SSA
   Function Live Ins: %R1 in %vreg0, %R2 in %vreg1

   BB#0: derived from LLVM BB %entry
   Live Ins: %R1 %R2
   %vreg1<def> = COPY %R2; GPR:%vreg1
   %vreg0<def> = COPY %R1; GPR:%vreg0
   %vreg2<def> = LD_imm64 2147483648; GPR:%vreg2

I think maybe this missing 'DBG_VALUE' causes the problem, but
I'm stuck here and hope you can give more advice.

Thank you!

On 2015/7/17 12:27, Alexei Starovoitov wrote:
> clang -O2 -emit-llvm -g a.c -S -o a.ll


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size
  2015-07-17 18:13     ` Steven Rostedt
@ 2015-07-23 19:36       ` Alex Bennée
  0 siblings, 0 replies; 53+ messages in thread
From: Alex Bennée @ 2015-07-23 19:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: He Kuang, ast, masami.hiramatsu.pt, acme, a.p.zijlstra, mingo,
	namhyung, jolsa, wangnan0, pi3orama, linux-kernel


Steven Rostedt <rostedt@goodmis.org> writes:

> On Fri, 17 Jul 2015 10:32:15 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
>
>> This change affects all callers of dymanic_array, not just bitmasks.
>> 
>> >  	__data_size += __item_length;
>> >  
>> >  #undef __string
>> 
>> BTW, if I revert commit ac01ce1410fc2 "tracing: Make
>> ftrace_print_array_seq compute buf_len" it works again.
>> 
>> I'm going to look into this some more, and maybe the answer is to go
>> back and just pass in buffer length here. I can't see what was broken
>> before that change.
>
> OK, the print_array() code is already being used by the thermal events
> and can't be changed. But we can't make the proposed change because
> that changes the user interface.
>
> What we can change is the sample code!
>
> -- Steve
>
> From 95de1e9721a2f9d05831a53d228e181a33001c55 Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> Date: Fri, 17 Jul 2015 14:03:26 -0400
> Subject: [PATCH] tracing: Fix sample output of dynamic arrays
>
> He Kuang noticed that the trace event samples for arrays was broken:
>
> "The output result of trace_foo_bar event in traceevent samples is
>  wrong. This problem can be reproduced as following:
>
>   (Build kernel with SAMPLE_TRACE_EVENTS=m)
>
>   $ insmod trace-events-sample.ko
>
>   $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable
>
>   $ cat /sys/kernel/debug/tracing/trace
>
>   event-sample-980 [000] ....  43.649559: foo_bar: foo hello 21 0x15
>   BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                  The array length is not right, should be {0x1}.
>   (ffffffff,ffffffff)
>
>   event-sample-980 [000] ....  44.653827: foo_bar: foo hello 22 0x16
>   BIT2|BIT3|0x10
>   {0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                  The array length is not right, should be {0x1,0x2}.
>   Gandalf (ffffffff,ffffffff)"
>
> This was caused by an update to have __print_array()'s second parameter
> be the count of items in the array and not the size of the array.
>
> As there is already users of __print_array(), it can not change. But
> the sample code can and we can also improve on the documentation about
> __print_array() and __get_dynamic_array_len().
>
> Link: http://lkml.kernel.org/r/1436839171-31527-2-git-send-email-hekuang@huawei.com
>
> Fixes: ac01ce1410fc2 ("tracing: Make ftrace_print_array_seq compute buf_len")
> Reported-by: He Kuang <hekuang@huawei.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  samples/trace_events/trace-events-sample.h | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/samples/trace_events/trace-events-sample.h b/samples/trace_events/trace-events-sample.h
> index 8965d1bb8811..125d6402f64f 100644
> --- a/samples/trace_events/trace-events-sample.h
> +++ b/samples/trace_events/trace-events-sample.h
> @@ -168,7 +168,10 @@
>   *
>   *      For __dynamic_array(int, foo, bar) use __get_dynamic_array(foo)
>   *            Use __get_dynamic_array_len(foo) to get the length of the array
> - *            saved.
> + *            saved. Note, __get_dynamic_array_len() returns the total allocated
> + *            length of the dynamic array; __print_array() expects the second
> + *            parameter to be the number of elements. To get that, the array length
> + *            needs to be divided by the element size.
>   *
>   *      For __string(foo, bar) use __get_str(foo)
>   *
> @@ -288,7 +291,7 @@ TRACE_EVENT(foo_bar,
>   *    This prints out the array that is defined by __array in a nice format.
>   */
>  		  __print_array(__get_dynamic_array(list),
> -				__get_dynamic_array_len(list),
> +				__get_dynamic_array_len(list) / sizeof(int),
>  				sizeof(int)),
>  		  __get_str(str), __get_bitmask(cpus))
>  );

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée

^ permalink raw reply	[flat|nested] 53+ messages in thread

* llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-23 11:54                         ` He Kuang
@ 2015-07-23 20:49                           ` Alexei Starovoitov
  2015-07-24  3:20                             ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-23 20:49 UTC (permalink / raw)
  To: He Kuang, Wangnan (F), pi3orama; +Cc: linux-kernel

On 7/23/15 4:54 AM, He Kuang wrote:

trimmed cc-list, since it's not related to kernel.

> Thank you for your guidence, and by referencing your last mail
> and other llvm backends, I found setting
> BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h

thanks! yes. it was missing.

> and fix some unhandeled switch can make llc output debug_info,

what do you mean ?

> but important information is missing in the result:

hmm. I see slightly different picture.
With 'clang -O2 -target bpf -g -S a.c'
I see all the right info inside .s file.
with '-c a.c' for some reasons it produces bogus offset:
    Abbrev Offset: 0xffff0000
    Pointer Size:  8
/usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset 
(ffff0000) is larger than abbrev section size (4b)

and objdump fails to parse .o
I'm using llvm trunk 3.8. Do you see this as well?


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-23 20:49                           ` llvm bpf debug info. " Alexei Starovoitov
@ 2015-07-24  3:20                             ` Alexei Starovoitov
  2015-07-24  4:16                               ` He Kuang
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-24  3:20 UTC (permalink / raw)
  To: He Kuang, Wangnan (F), pi3orama; +Cc: linux-kernel

On 7/23/15 1:49 PM, Alexei Starovoitov wrote:
> On 7/23/15 4:54 AM, He Kuang wrote:
>
> trimmed cc-list, since it's not related to kernel.
>
>> Thank you for your guidence, and by referencing your last mail
>> and other llvm backends, I found setting
>> BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h
>
> thanks! yes. it was missing.
>
>> and fix some unhandeled switch can make llc output debug_info,
>
> what do you mean ?
>
>> but important information is missing in the result:
>
> hmm. I see slightly different picture.
> With 'clang -O2 -target bpf -g -S a.c'
> I see all the right info inside .s file.
> with '-c a.c' for some reasons it produces bogus offset:
>     Abbrev Offset: 0xffff0000
>     Pointer Size:  8
> /usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset
> (ffff0000) is larger than abbrev section size (4b)
>
> and objdump fails to parse .o
> I'm using llvm trunk 3.8. Do you see this as well?

there were few issues related to relocations.
Fixed it up and pushed to llvm trunk r243087.
Please pull and give it a try.
AT_location should be correct, but AT_name still looks buggy.



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-24  3:20                             ` Alexei Starovoitov
@ 2015-07-24  4:16                               ` He Kuang
  2015-07-25 10:04                                 ` He Kuang
  0 siblings, 1 reply; 53+ messages in thread
From: He Kuang @ 2015-07-24  4:16 UTC (permalink / raw)
  To: Alexei Starovoitov, Wangnan (F), pi3orama; +Cc: linux-kernel

Hi, Alexei

On 2015/7/24 11:20, Alexei Starovoitov wrote:
> On 7/23/15 1:49 PM, Alexei Starovoitov wrote:
>> On 7/23/15 4:54 AM, He Kuang wrote:
>>
>> trimmed cc-list, since it's not related to kernel.
>>
>>> Thank you for your guidence, and by referencing your last mail
>>> and other llvm backends, I found setting
>>> BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h
>>
>> thanks! yes. it was missing.
>>
>>> and fix some unhandeled switch can make llc output debug_info,
>>
>> what do you mean ?
>>
>>> but important information is missing in the result:
>>
>> hmm. I see slightly different picture.
>> With 'clang -O2 -target bpf -g -S a.c'
>> I see all the right info inside .s file.
>> with '-c a.c' for some reasons it produces bogus offset:
>>     Abbrev Offset: 0xffff0000
>>     Pointer Size:  8
>> /usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset
>> (ffff0000) is larger than abbrev section size (4b)
>>
>> and objdump fails to parse .o
>> I'm using llvm trunk 3.8. Do you see this as well?
>
> there were few issues related to relocations.
> Fixed it up and pushed to llvm trunk r243087.
> Please pull and give it a try.
> AT_location should be correct, but AT_name still looks buggy.

I've pulled the lastest version "[bpf] initial support for
debug_info" and tested it. This version can output debug_info but
still not generate correct AT_location, I tested as following:

$ cat > main.c <<EOF
int testprog(int myvar_a, int myvar_b)
{
         int myvar_c;
         myvar_c = myvar_a + myvar_b;
         return myvar_c;
}
EOF

$ clang -g  -O2 -target bpf -c main.c -o test.obj.bpf
$ clang -g  -O2             -c main.c -o test.obj.x86

$ objdump --dwarf=info test.obj.x86   > test.obj.x86.dump
$ objdump --dwarf=info test.obj.bpf   > test.obj.bpf.dump

Compare those two dump files:

test.obj.x86.dump:

<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <44>   DW_AT_location    : 3 byte block: 55 93 4    (DW_OP_reg5 (rdi);
                                                         DW_OP_piece: 4)
    <48>   DW_AT_name        : (indirect string, offset: 0xdc): myvar_a
    <4c>   DW_AT_decl_file   : 1
    <4d>   DW_AT_decl_line   : 1
    <4e>   DW_AT_type        : <0x71>

test.obj.bpf.dump:

<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <44> DW_AT_name          : (indirect string, offset: 0x0): clang
                                version 3.8.0 (http://llvm.org/git/clang.git
                                3a7c733b80f156a547f3f1517e6fbce9c0a33026)
                                (http://llvm.org/git/llvm.git
                                 90908cb34d73460d3 a83e2194a58d82c6d1f199)
    <48>   DW_AT_decl_file   : 1
    <49>   DW_AT_decl_line   : 1
    <4a>   DW_AT_type        : <0x65>


No DW_AT_location info for formal parameters, but if we change
the function 'testprog' to 'main', DW_AT_location of formal
parameters appear but that of local variables are still missed,
don't know why..

$ cat > main.c <<EOF
#include <stdio.h>

int main(int argc, char *argv[])
{
         int myvar_a, myvar_b;
         int myvar_c;
         myvar_c = myvar_a + myvar_b;
         return myvar_c;
}

test.obj.bpf.dump:
<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <44>   DW_AT_location    : 1 byte block: 51         (DW_OP_reg1 (r1))
    <46>   DW_AT_name        : (indirect string, offset: 0x0): clang version 3.8.
    ..
<2><5d>: Abbrev Number: 4 (DW_TAG_variable)
    <5e>   DW_AT_name        : (indirect string, offset: 0x0): clang version 3.8.

test.obj.x86.dump:

<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <44>   DW_AT_location    : 3 byte block: 55 93 4    (DW_OP_reg5 (rdi);
                                                         DW_OP_piece: 4)
    <48>   DW_AT_name        : (indirect string, offset: 0xd8): argc
    ..
<2><5f>: Abbrev Number: 4 (DW_TAG_variable)
    <60>   DW_AT_name        : (indirect string, offset: 0xe7): myvar_a
    ..

Thank you.



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-24  4:16                               ` He Kuang
@ 2015-07-25 10:04                                 ` He Kuang
  2015-07-28  2:18                                   ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: He Kuang @ 2015-07-25 10:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Wangnan (F), pi3orama; +Cc: linux-kernel

Hi, Alexei

On 2015/7/24 12:16, He Kuang wrote:
> Hi, Alexei
>
> On 2015/7/24 11:20, Alexei Starovoitov wrote:
>> On 7/23/15 1:49 PM, Alexei Starovoitov wrote:
>>> On 7/23/15 4:54 AM, He Kuang wrote:
>>>
>>> trimmed cc-list, since it's not related to kernel.
>>>
>>>> Thank you for your guidence, and by referencing your last mail
>>>> and other llvm backends, I found setting
>>>> BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h
>>>
>>> thanks! yes. it was missing.
>>>
>>>> and fix some unhandeled switch can make llc output debug_info,
>>>
>>> what do you mean ?
>>>
>>>> but important information is missing in the result:
>>>
>>> hmm. I see slightly different picture.
>>> With 'clang -O2 -target bpf -g -S a.c'
>>> I see all the right info inside .s file.
>>> with '-c a.c' for some reasons it produces bogus offset:
>>>     Abbrev Offset: 0xffff0000
>>>     Pointer Size:  8
>>> /usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset
>>> (ffff0000) is larger than abbrev section size (4b)
>>>
>>> and objdump fails to parse .o
>>> I'm using llvm trunk 3.8. Do you see this as well?
>>
>> there were few issues related to relocations.
>> Fixed it up and pushed to llvm trunk r243087.
>> Please pull and give it a try.
>> AT_location should be correct, but AT_name still looks buggy.
>
> I've pulled the lastest version "[bpf] initial support for
> debug_info" and tested it. This version can output debug_info but
> still not generate correct AT_location, I tested as following:
>
> $ cat > main.c <<EOF
> int testprog(int myvar_a, int myvar_b)
> {
>          int myvar_c;
>          myvar_c = myvar_a + myvar_b;
>          return myvar_c;
> }
> EOF
>
> $ clang -g  -O2 -target bpf -c main.c -o test.obj.bpf
> $ clang -g  -O2             -c main.c -o test.obj.x86
>
> $ objdump --dwarf=info test.obj.x86   > test.obj.x86.dump
> $ objdump --dwarf=info test.obj.bpf   > test.obj.bpf.dump
>
> Compare those two dump files:
>
> test.obj.x86.dump:
>
> <2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>     <44>   DW_AT_location    : 3 byte block: 55 93 4    (DW_OP_reg5 (rdi);
>                                                          DW_OP_piece: 4)
>     <48>   DW_AT_name        : (indirect string, offset: 0xdc): myvar_a
>     <4c>   DW_AT_decl_file   : 1
>     <4d>   DW_AT_decl_line   : 1
>     <4e>   DW_AT_type        : <0x71>
>
> test.obj.bpf.dump:
>
> <2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>     <44> DW_AT_name          : (indirect string, offset: 0x0): clang
>                                 version 3.8.0 (http://llvm.org/git/clang.git
>                                 3a7c733b80f156a547f3f1517e6fbce9c0a33026)
>                                 (http://llvm.org/git/llvm.git
>                                  90908cb34d73460d3 a83e2194a58d82c6d1f199)
>     <48>   DW_AT_decl_file   : 1
>     <49>   DW_AT_decl_line   : 1
>     <4a>   DW_AT_type        : <0x65>
>
>
> No DW_AT_location info for formal parameters, but if we change
> the function 'testprog' to 'main', DW_AT_location of formal
> parameters appear but that of local variables are still missed,
> don't know why..
>
> $ cat > main.c <<EOF
> #include <stdio.h>
>
> int main(int argc, char *argv[])
> {
>          int myvar_a, myvar_b;
>          int myvar_c;
>          myvar_c = myvar_a + myvar_b;
>          return myvar_c;
> }
>
> test.obj.bpf.dump:
> <2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>     <44>   DW_AT_location    : 1 byte block: 51         (DW_OP_reg1 (r1))
>     <46>   DW_AT_name        : (indirect string, offset: 0x0): clang version 3.8.
>     ..
> <2><5d>: Abbrev Number: 4 (DW_TAG_variable)
>     <5e>   DW_AT_name        : (indirect string, offset: 0x0): clang version 3.8.
>
> test.obj.x86.dump:
>
> <2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>     <44>   DW_AT_location    : 3 byte block: 55 93 4    (DW_OP_reg5 (rdi);
>                                                          DW_OP_piece: 4)
>     <48>   DW_AT_name        : (indirect string, offset: 0xd8): argc
>     ..
> <2><5f>: Abbrev Number: 4 (DW_TAG_variable)
>     <60>   DW_AT_name        : (indirect string, offset: 0xe7): myvar_a
>     ..
>
> Thank you.
>
>

Share some infomation on debuging bpf debuginfo problem.

An error "error: failed to compute relocation: Unknown" can be
found if we use llvm-dwarfdump tools to dump the object file,
debuging on that error, it seems there's no support for type
'BPF' in llvm/include/llvm/Support/MachO.h, and llvm-dwarfdump
fails to find the corresponding VisitElf method. Then I have a
rough test which forces RelocVisitor to use 'visitELF_386_32',
and got the correct DW_AT_name in the output:

0x00000043: DW_TAG_formal_parameter [3]
             DW_AT_name [DW_FORM_strp]	( .debug_str[0x000000dc] = "myvar_a")
             DW_AT_decl_file [DW_FORM_data1]	("testllvm/main.c")
             DW_AT_decl_line [DW_FORM_data1]	(3)
             DW_AT_type [DW_FORM_ref4]	(cu + 0x0065 => {0x00000065})

I noticed that for 64-bit elf format, the reloc sections have
'Addend' in the entry, but there's no 'Addend' info in bpf elf
file(64bit). I think there must be something wrong in the process
of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
AT_name now, DW_AT_LOCATION still missed and need your help.

Thank you.




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-25 10:04                                 ` He Kuang
@ 2015-07-28  2:18                                   ` Alexei Starovoitov
  2015-07-29  9:38                                     ` He Kuang
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-28  2:18 UTC (permalink / raw)
  To: He Kuang, Wangnan (F), pi3orama; +Cc: linux-kernel

On 7/25/15 3:04 AM, He Kuang wrote:
> I noticed that for 64-bit elf format, the reloc sections have
> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
> file(64bit). I think there must be something wrong in the process
> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
> AT_name now, DW_AT_LOCATION still missed and need your help.

looks like objdump/llvm-dwarfdump can only read known EM,
but that that shouldn't be the problem for your dwarf reader right?
It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
As far as AT_location for testprog.c it seems there is no info for
local variables because they were optimized away.
With -O0 I see AT_location being emitted.
Also line number info seems to be good in both cases.
But in our case, we don't need this anyway, no? we need to see
the types of structs mainly or you have some other use cases?
I think line number info would be great to correlate the error reported
by verifier into specific line in C.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-28  2:18                                   ` Alexei Starovoitov
@ 2015-07-29  9:38                                     ` He Kuang
  2015-07-29 17:13                                       ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: He Kuang @ 2015-07-29  9:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Wangnan (F), pi3orama; +Cc: linux-kernel

Hi, Alexei

On 2015/7/28 10:18, Alexei Starovoitov wrote:
> On 7/25/15 3:04 AM, He Kuang wrote:
>> I noticed that for 64-bit elf format, the reloc sections have
>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>> file(64bit). I think there must be something wrong in the process
>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>> AT_name now, DW_AT_LOCATION still missed and need your help.
>
> looks like objdump/llvm-dwarfdump can only read known EM,
> but that that shouldn't be the problem for your dwarf reader right?
> It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
> As far as AT_location for testprog.c it seems there is no info for
> local variables because they were optimized away.
> With -O0 I see AT_location being emitted.
> Also line number info seems to be good in both cases.
> But in our case, we don't need this anyway, no? we need to see
> the types of structs mainly or you have some other use cases?
> I think line number info would be great to correlate the error reported
> by verifier into specific line in C.
>

Yes, without AT_location, we can lookup the user output data type
by line number, but there're some issues when we look deep.

There're two steps of work that should be done in user space,
first we embed data type into bpf output record, then we use this
type, or index or some other identifier to lookup the type from
dwarf info, so we got a few plans.

* Plan A. Use line number to identify the user data type

Predefined macros:

   #define DEFINE_BPF_OUTPUT_DATA(type, var)                               \
           const int BPF_OUTPUT_LINE__##var = __LINE__; type var;

   #define BPF_OUTPUT_TRACE_DATA(data, size)       \
           __bpf_output_trace_data(BPF_OUTPUT_LINE__##data, &data, size)

User defined BPF code:

   struct user_define_struct {
          ...
   };

   int testprog(int myvar_a, long myvar_b)
   {
           DEFINE_BPF_OUTPUT_DATA(struct user_define_struct, myvar_c);

           BPF_OUTPUT_TRACE_DATA(myvar_c, sizeof(myvar_c));

   ...

We use macros to embed linenum implicitly, which leads an extra
restriction that user should not define multiple variables in the
same line and not split the macro over multiple lines, like this:

   22 DEFINE_BPF_OUTPUT_DATA(struct xxtype, a); DEFINE_BPF_OUTPUT_DATA(struct xxtype, b);

Or

   22 DEFINE_BPF_OUTPUT_DATA(struct user_define_struct,
   23                               myvar_c);

DW_AT_decl_line = 22, while __LINE__ = 23

So we should add verifier in the llvm BPF backend to warn on the
above codes.

* Plan B. Lookup variable type from dwarf AT_location info

We can make use of the output data variable's address, for bpf is
a minus offset to frame base. Then lookup matched offset from
location info(e.g. "DW_OP_fbreg: -32") to identify the variable
type.

For getting the frame base address, we can use builtin functions
like __builtin_frame_base() and __builtin_dwarf_cfa() which
returns the call frame base address. Currently those builtin
functions are not implemented in BPF lower operation yet, so we
tested our bpf program by using a variable tag on frame base, as
following:

   struct user_define_struct {
      ...
   };

   typedef struct {} frame_base_tag;

   int testprog(void)
   {
       frame_base_tag BPF_FRAME_BASE;
       struct user_define_struct myvar_a;

       __bpf_trace_output_data((void *)&myvar_a - (void *)&BPF_FRAME_BASE,
                               &myvar_a, sizeof(myvar_a));
   ...

The first argument of __bpf_trace_output_data() will be caculated
and it's easy to traverse the variable DIEs in dwarf info and
check each DW_AT_location attribute to find the corresponding
variable type.

The things let us worry about is the opimization may reuse the
stack space which can cause different variables share the same
address, by some rough tests that kind of optimization does not
appear.

* Comparison

Plan A needs less effort and easy to implement, but requires more
check to ensure user not use multiple definition in the same line
and not use macro cross lines.

The advantages of plan B is that we do not need introduce macros
showed in above example and all the things are done implicitly,
but the AT_location info is the prerequisite of this plan, I'm
not sure whether we can guarantee this info in dwarf or not.

Another way we can think of is adding new builtin functions to
indicate the compilier to generate codes return the dwarf type
index directly:

   __bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);

What's your opinion on those plans, and do you have more
suggestion?

Thank you.





^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-29  9:38                                     ` He Kuang
@ 2015-07-29 17:13                                       ` Alexei Starovoitov
  2015-07-29 20:00                                         ` pi3orama
                                                           ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-29 17:13 UTC (permalink / raw)
  To: He Kuang, Wangnan (F), pi3orama; +Cc: linux-kernel

On 7/29/15 2:38 AM, He Kuang wrote:
> Hi, Alexei
>
> On 2015/7/28 10:18, Alexei Starovoitov wrote:
>> On 7/25/15 3:04 AM, He Kuang wrote:
>>> I noticed that for 64-bit elf format, the reloc sections have
>>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>> file(64bit). I think there must be something wrong in the process
>>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>> AT_name now, DW_AT_LOCATION still missed and need your help.
>>
>> looks like objdump/llvm-dwarfdump can only read known EM,
>> but that that shouldn't be the problem for your dwarf reader right?
>> It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
>> As far as AT_location for testprog.c it seems there is no info for
>> local variables because they were optimized away.
>> With -O0 I see AT_location being emitted.
>> Also line number info seems to be good in both cases.
>> But in our case, we don't need this anyway, no? we need to see
>> the types of structs mainly or you have some other use cases?
>> I think line number info would be great to correlate the error reported
>> by verifier into specific line in C.
>>
>
> Yes, without AT_location, we can lookup the user output data type
> by line number, but there're some issues when we look deep.
>
> There're two steps of work that should be done in user space,
> first we embed data type into bpf output record, then we use this
> type, or index or some other identifier to lookup the type from
> dwarf info, so we got a few plans.
>
> * Plan A. Use line number to identify the user data type
>
> Predefined macros:
>
>    #define DEFINE_BPF_OUTPUT_DATA(type,
> var)                               \
>            const int BPF_OUTPUT_LINE__##var = __LINE__; type var;
>
>    #define BPF_OUTPUT_TRACE_DATA(data, size)       \
>            __bpf_output_trace_data(BPF_OUTPUT_LINE__##data, &data, size)
>
> User defined BPF code:
>
>    struct user_define_struct {
>           ...
>    };
>
>    int testprog(int myvar_a, long myvar_b)
>    {
>            DEFINE_BPF_OUTPUT_DATA(struct user_define_struct, myvar_c);
>
>            BPF_OUTPUT_TRACE_DATA(myvar_c, sizeof(myvar_c));
>
>    ...
>
> We use macros to embed linenum implicitly, which leads an extra
> restriction that user should not define multiple variables in the
> same line and not split the macro over multiple lines, like this:
>
>    22 DEFINE_BPF_OUTPUT_DATA(struct xxtype, a);
> DEFINE_BPF_OUTPUT_DATA(struct xxtype, b);
>
> Or
>
>    22 DEFINE_BPF_OUTPUT_DATA(struct user_define_struct,
>    23                               myvar_c);
>
> DW_AT_decl_line = 22, while __LINE__ = 23
>
> So we should add verifier in the llvm BPF backend to warn on the
> above codes.
>
> * Plan B. Lookup variable type from dwarf AT_location info
>
> We can make use of the output data variable's address, for bpf is
> a minus offset to frame base. Then lookup matched offset from
> location info(e.g. "DW_OP_fbreg: -32") to identify the variable
> type.
>
> For getting the frame base address, we can use builtin functions
> like __builtin_frame_base() and __builtin_dwarf_cfa() which
> returns the call frame base address. Currently those builtin
> functions are not implemented in BPF lower operation yet, so we
> tested our bpf program by using a variable tag on frame base, as
> following:
>
>    struct user_define_struct {
>       ...
>    };
>
>    typedef struct {} frame_base_tag;
>
>    int testprog(void)
>    {
>        frame_base_tag BPF_FRAME_BASE;
>        struct user_define_struct myvar_a;
>
>        __bpf_trace_output_data((void *)&myvar_a - (void *)&BPF_FRAME_BASE,
>                                &myvar_a, sizeof(myvar_a));
>    ...
>
> The first argument of __bpf_trace_output_data() will be caculated
> and it's easy to traverse the variable DIEs in dwarf info and
> check each DW_AT_location attribute to find the corresponding
> variable type.
>
> The things let us worry about is the opimization may reuse the
> stack space which can cause different variables share the same
> address, by some rough tests that kind of optimization does not
> appear.
>
> * Comparison
>
> Plan A needs less effort and easy to implement, but requires more
> check to ensure user not use multiple definition in the same line
> and not use macro cross lines.
>
> The advantages of plan B is that we do not need introduce macros
> showed in above example and all the things are done implicitly,
> but the AT_location info is the prerequisite of this plan, I'm
> not sure whether we can guarantee this info in dwarf or not.
>
> Another way we can think of is adding new builtin functions to
> indicate the compilier to generate codes return the dwarf type
> index directly:
>
>    __bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);
>

probably both A and B won't really work when programs get bigger
and optimizations will start moving lines around.
the builtin_dwarf_type idea is actually quite interesting.
Potentially that builtin can stringify type name and later we can
search it in dwarf. Please take a look how to add such builtin.
There are few similar builtins that deal with exception handling
and need type info. May be they can be reused. Like:
int_eh_typeid_for and int_eh_dwarf_cfa


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-29 17:13                                       ` Alexei Starovoitov
@ 2015-07-29 20:00                                         ` pi3orama
  2015-07-29 22:20                                           ` Alexei Starovoitov
  2015-07-31 10:18                                         ` Wangnan (F)
  2015-08-05  8:59                                         ` [LLVMdev] Cc llvmdev: " He Kuang
  2 siblings, 1 reply; 53+ messages in thread
From: pi3orama @ 2015-07-29 20:00 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: He Kuang, Wangnan (F), linux-kernel



发自我的 iPhone

> 在 2015年7月30日,上午1:13,Alexei Starovoitov <ast@plumgrid.com> 写道:
> 
>> On 7/29/15 2:38 AM, He Kuang wrote:
>> Hi, Alexei
>> 
>>> On 2015/7/28 10:18, Alexei Starovoitov wrote:
>>>> On 7/25/15 3:04 AM, He Kuang wrote:
>>>> I noticed that for 64-bit elf format, the reloc sections have
>>>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>>> file(64bit). I think there must be something wrong in the process
>>>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>>> AT_name now, DW_AT_LOCATION still missed and need your help.
>>> 
>>> looks like objdump/llvm-dwarfdump can only read known EM,
>>> but that that shouldn't be the problem for your dwarf reader right?
>>> It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
>>> As far as AT_location for testprog.c it seems there is no info for
>>> local variables because they were optimized away.
>>> With -O0 I see AT_location being emitted.
>>> Also line number info seems to be good in both cases.
>>> But in our case, we don't need this anyway, no? we need to see
>>> the types of structs mainly or you have some other use cases?
>>> I think line number info would be great to correlate the error reported
>>> by verifier into specific line in C.
>> 
>> Yes, without AT_location, we can lookup the user output data type
>> by line number, but there're some issues when we look deep.
>> 
>> There're two steps of work that should be done in user space,
>> first we embed data type into bpf output record, then we use this
>> type, or index or some other identifier to lookup the type from
>> dwarf info, so we got a few plans.
>> 
>> * Plan A. Use line number to identify the user data type
>> 
>> Predefined macros:
>> 
>>   #define DEFINE_BPF_OUTPUT_DATA(type,
>> var)                               \
>>           const int BPF_OUTPUT_LINE__##var = __LINE__; type var;
>> 
>>   #define BPF_OUTPUT_TRACE_DATA(data, size)       \
>>           __bpf_output_trace_data(BPF_OUTPUT_LINE__##data, &data, size)
>> 
>> User defined BPF code:
>> 
>>   struct user_define_struct {
>>          ...
>>   };
>> 
>>   int testprog(int myvar_a, long myvar_b)
>>   {
>>           DEFINE_BPF_OUTPUT_DATA(struct user_define_struct, myvar_c);
>> 
>>           BPF_OUTPUT_TRACE_DATA(myvar_c, sizeof(myvar_c));
>> 
>>   ...
>> 
>> We use macros to embed linenum implicitly, which leads an extra
>> restriction that user should not define multiple variables in the
>> same line and not split the macro over multiple lines, like this:
>> 
>>   22 DEFINE_BPF_OUTPUT_DATA(struct xxtype, a);
>> DEFINE_BPF_OUTPUT_DATA(struct xxtype, b);
>> 
>> Or
>> 
>>   22 DEFINE_BPF_OUTPUT_DATA(struct user_define_struct,
>>   23                               myvar_c);
>> 
>> DW_AT_decl_line = 22, while __LINE__ = 23
>> 
>> So we should add verifier in the llvm BPF backend to warn on the
>> above codes.
>> 
>> * Plan B. Lookup variable type from dwarf AT_location info
>> 
>> We can make use of the output data variable's address, for bpf is
>> a minus offset to frame base. Then lookup matched offset from
>> location info(e.g. "DW_OP_fbreg: -32") to identify the variable
>> type.
>> 
>> For getting the frame base address, we can use builtin functions
>> like __builtin_frame_base() and __builtin_dwarf_cfa() which
>> returns the call frame base address. Currently those builtin
>> functions are not implemented in BPF lower operation yet, so we
>> tested our bpf program by using a variable tag on frame base, as
>> following:
>> 
>>   struct user_define_struct {
>>      ...
>>   };
>> 
>>   typedef struct {} frame_base_tag;
>> 
>>   int testprog(void)
>>   {
>>       frame_base_tag BPF_FRAME_BASE;
>>       struct user_define_struct myvar_a;
>> 
>>       __bpf_trace_output_data((void *)&myvar_a - (void *)&BPF_FRAME_BASE,
>>                               &myvar_a, sizeof(myvar_a));
>>   ...
>> 
>> The first argument of __bpf_trace_output_data() will be caculated
>> and it's easy to traverse the variable DIEs in dwarf info and
>> check each DW_AT_location attribute to find the corresponding
>> variable type.
>> 
>> The things let us worry about is the opimization may reuse the
>> stack space which can cause different variables share the same
>> address, by some rough tests that kind of optimization does not
>> appear.
>> 
>> * Comparison
>> 
>> Plan A needs less effort and easy to implement, but requires more
>> check to ensure user not use multiple definition in the same line
>> and not use macro cross lines.
>> 
>> The advantages of plan B is that we do not need introduce macros
>> showed in above example and all the things are done implicitly,
>> but the AT_location info is the prerequisite of this plan, I'm
>> not sure whether we can guarantee this info in dwarf or not.
>> 
>> Another way we can think of is adding new builtin functions to
>> indicate the compilier to generate codes return the dwarf type
>> index directly:
>> 
>>   __bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);
> 
> probably both A and B won't really work when programs get bigger
> and optimizations will start moving lines around.
> the builtin_dwarf_type idea is actually quite interesting.
> Potentially that builtin can stringify type name and later we can
> search it in dwarf. Please take a look how to add such builtin.
> There are few similar builtins that deal with exception handling
> and need type info. May be they can be reused. Like:
> int_eh_typeid_for and int_eh_dwarf_cfa
> 

Hi Alexei,

I was wondering if you could give us a hint on adding BPF specific builtins?

Doesn't like other machines, currently there's no Builtins.def for BPF to hold
builtins for that specific target. If we start creating such builtins, we can bring
more there. One builtins I want to see should be __builtin_bpf_strcmp(char*, char*),
with that we can filter events base on name of a task. What I really need is filtering 
events based on comm of the main thread, which should be useful on Android 
since in Android name of working threads are always something like "Binder_?", 
only the main threads names are meaningful. But let's work on strcmp first.

Thank you.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-29 20:00                                         ` pi3orama
@ 2015-07-29 22:20                                           ` Alexei Starovoitov
  0 siblings, 0 replies; 53+ messages in thread
From: Alexei Starovoitov @ 2015-07-29 22:20 UTC (permalink / raw)
  To: pi3orama; +Cc: He Kuang, Wangnan (F), linux-kernel

On 7/29/15 1:00 PM, pi3orama wrote:
> I was wondering if you could give us a hint on adding BPF specific builtins?
>
> Doesn't like other machines, currently there's no Builtins.def for BPF to hold
> builtins for that specific target. If we start creating such builtins, we can bring
> more there. One builtins I want to see should be __builtin_bpf_strcmp(char*, char*),
> with that we can filter events base on name of a task. What I really need is filtering
> events based on comm of the main thread, which should be useful on Android
> since in Android name of working threads are always something like "Binder_?",
> only the main threads names are meaningful. But let's work on strcmp first.

Currently there are 4 of them in IntrinsicsBPF.td
but I don't see how strcmp can work with current no-loops restriction.
As an alternative to filter binder threads you can use
memcmp(ptr, "Binder_", 6) since it will get unrolled fully.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-29 17:13                                       ` Alexei Starovoitov
  2015-07-29 20:00                                         ` pi3orama
@ 2015-07-31 10:18                                         ` Wangnan (F)
  2015-07-31 10:20                                           ` [LLVM PATCH] BPF: add FRAMEADDR support Wang Nan
                                                             ` (3 more replies)
  2015-08-05  8:59                                         ` [LLVMdev] Cc llvmdev: " He Kuang
  2 siblings, 4 replies; 53+ messages in thread
From: Wangnan (F) @ 2015-07-31 10:18 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang, pi3orama; +Cc: linux-kernel



On 2015/7/30 1:13, Alexei Starovoitov wrote:

[SNIP]
> probably both A and B won't really work when programs get bigger
> and optimizations will start moving lines around.
> the builtin_dwarf_type idea is actually quite interesting.
> Potentially that builtin can stringify type name and later we can
> search it in dwarf. Please take a look how to add such builtin.
> There are few similar builtins that deal with exception handling
> and need type info. May be they can be reused. Like:
> int_eh_typeid_for and int_eh_dwarf_cfa
>

Hi Alexei,

I have tested int_eh_dwarf_cfa and int_eh_typeid_for.

By implementing ISD::FRAMEADDR support in LLVM BPF backend users are 
allowed to
fetch frame address R11. __builtin_frame_addr(0) and __builtin_dwarf_cfa()
will be enabled.

By emitting llvm.eh_typeid_for in clang we can utilize it go generate an 
unique
ID from a pointer of user defined type. However, we can't use pointer of 
local
variables.

I attach a sample program and the resuling asm output at the bottom of 
this mail.

Looks like llvm.eh_typeid_for meets our goal further. However, currently 
it is
still ugly:

1. llvm.eh_typeid_for can be used on global variables only. So for each 
output
    structure we have to define a global varable.

2. We still need to find a way to connect the fetchd typeid with DWARF info.
    Inserting that ID into DWARF may workable?

    However with the newest llvm + clang the DWARF info is still incorrect:

$ objdump  --dwarf=info ./out.o
...
  <1><3f>: Abbrev Number: 3 (DW_TAG_structure_type)
     <40>   DW_AT_name        : (indirect string, offset: 0x0): clang 
version 3.8.0 (http://llvm.org/git/clang.git 
f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git 
c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
     <44>   DW_AT_byte_size   : 8
     <45>   DW_AT_decl_file   : 1
     <46>   DW_AT_decl_line   : 4
  <2><47>: Abbrev Number: 4 (DW_TAG_member)
     <48>   DW_AT_name        : (indirect string, offset: 0x0): clang 
version 3.8.0 (http://llvm.org/git/clang.git 
f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git 
c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
     <4c>   DW_AT_type        : <0x60>
     <50>   DW_AT_decl_file   : 1
     <51>   DW_AT_decl_line   : 5
     <52>   DW_AT_data_member_location: 0
  <2><53>: Abbrev Number: 4 (DW_TAG_member)
     <54>   DW_AT_name        : (indirect string, offset: 0x0): clang 
version 3.8.0 (http://llvm.org/git/clang.git 
f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git 
c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
     <58>   DW_AT_type        : <0x60>
     <5c>   DW_AT_decl_file   : 1
     <5d>   DW_AT_decl_line   : 6
     <5e>   DW_AT_data_member_location: 4
...

The DW_AT_name is missing.

I'll post 2 LLVM patches by replying this mail. Please have a look and 
help me
send them to LLVM if you think my code is correct.

Following is the sample code and resuling ASM:

========================================================

static int (*test_func)(unsigned long) = (void *) 0x1234;

struct my_str {
         int x;
         int y;
};
struct my_str __str_my_str;

struct my_str2 {
         int x;
         int y;
         int z;
};
struct my_str2 __str_my_str2;

int func(int *ctx)
{
         struct my_str var_a;
         struct my_str2 var_b;
         test_func((void*)&var_a - __builtin_dwarf_cfa());
         test_func(__builtin_bpf_typeid(&__str_my_str));
         test_func(__builtin_bpf_typeid(&__str_my_str2));
         return 0;
}

==================== the resuling asm code =============

         .text
         .globl  func
         .align  8
func:                                   # @func
# BB#0:                                 # %entry
         mov     r1, r10
         addi    r1, -8
         sub     r1, r11
         call    4660
         mov     r1, 1
         call    4660
         mov     r1, 2
         call    4660
         mov     r0, 0
         ret

         .comm   __str_my_str,8,4        # @__str_my_str
         .comm   __str_my_str2,12,4      # @__str_my_str2



^ permalink raw reply	[flat|nested] 53+ messages in thread

* [LLVM PATCH] BPF: add FRAMEADDR support
  2015-07-31 10:18                                         ` Wangnan (F)
@ 2015-07-31 10:20                                           ` Wang Nan
  2015-07-31 10:21                                           ` [LLVM CLANG PATCH] BPF: add __builtin_bpf_typeid() Wang Nan
                                                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 53+ messages in thread
From: Wang Nan @ 2015-07-31 10:20 UTC (permalink / raw)
  To: ast; +Cc: hekuang, pi3orama, linux-kernel

After this patch now BPF backend support __builtin_frame_address()
and __builtin_dwarf_cfa().

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 lib/Target/BPF/BPFISelLowering.cpp | 20 ++++++++++++++++++++
 lib/Target/BPF/BPFISelLowering.h   |  1 +
 2 files changed, 21 insertions(+)

diff --git a/lib/Target/BPF/BPFISelLowering.cpp b/lib/Target/BPF/BPFISelLowering.cpp
index 58498a1..f1934a2 100644
--- a/lib/Target/BPF/BPFISelLowering.cpp
+++ b/lib/Target/BPF/BPFISelLowering.cpp
@@ -169,6 +169,9 @@ BPFTargetLowering::BPFTargetLowering(const TargetMachine &TM,
   MaxStoresPerMemset = MaxStoresPerMemsetOptSize = 128;
   MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = 128;
   MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = 128;
+
+  // support __builtin_frame_address(0)
+  setOperationAction(ISD::FRAMEADDR, MVT::i32, Custom);
 }
 
 SDValue BPFTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
@@ -179,6 +182,8 @@ SDValue BPFTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
     return LowerGlobalAddress(Op, DAG);
   case ISD::SELECT_CC:
     return LowerSELECT_CC(Op, DAG);
+  case ISD::FRAMEADDR:
+    return LowerFRAMEADDR(Op, DAG);
   default:
     llvm_unreachable("unimplemented operand");
   }
@@ -509,6 +514,21 @@ SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const {
   return DAG.getNode(BPFISD::SELECT_CC, DL, VTs, Ops);
 }
 
+SDValue BPFTargetLowering::LowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const {
+  EVT VT = Op.getValueType();
+  unsigned FrameReg = BPF::R11;
+  unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
+
+  if (Depth != 0) {
+    SDLoc DL(Op);
+    MachineFunction &MF = DAG.getMachineFunction();
+    DiagnosticInfoUnsupported Err(DL, *MF.getFunction(),
+                                  "only frame 0 address can be fetched", SDValue());
+    DAG.getContext()->diagnose(Err);
+  }
+  return DAG.getRegister(FrameReg, VT);
+}
+
 const char *BPFTargetLowering::getTargetNodeName(unsigned Opcode) const {
   switch ((BPFISD::NodeType)Opcode) {
   case BPFISD::FIRST_NUMBER:
diff --git a/lib/Target/BPF/BPFISelLowering.h b/lib/Target/BPF/BPFISelLowering.h
index ec71dca..e4bf73d 100644
--- a/lib/Target/BPF/BPFISelLowering.h
+++ b/lib/Target/BPF/BPFISelLowering.h
@@ -50,6 +50,7 @@ private:
   SDValue LowerBR_CC(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const;
 
   // Lower the result values of a call, copying them out of physregs into vregs
   SDValue LowerCallResult(SDValue Chain, SDValue InFlag,
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [LLVM CLANG PATCH] BPF: add __builtin_bpf_typeid()
  2015-07-31 10:18                                         ` Wangnan (F)
  2015-07-31 10:20                                           ` [LLVM PATCH] BPF: add FRAMEADDR support Wang Nan
@ 2015-07-31 10:21                                           ` Wang Nan
  2015-07-31 10:48                                           ` llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event pi3orama
  2015-08-03 19:44                                           ` Alexei Starovoitov
  3 siblings, 0 replies; 53+ messages in thread
From: Wang Nan @ 2015-07-31 10:21 UTC (permalink / raw)
  To: ast; +Cc: hekuang, pi3orama, linux-kernel

This patch introduces a new builtin function __builtin_bpf_typeid()
for BPF. This function emit a 'llvm.eh_typeid_for', and will be
converted to typeid of a global variable.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 include/clang/Basic/BuiltinsBPF.def  | 16 ++++++++++++++++
 include/clang/Basic/TargetBuiltins.h |  9 +++++++++
 lib/Basic/Targets.cpp                | 14 +++++++++++++-
 lib/CodeGen/CGBuiltin.cpp            | 16 ++++++++++++++++
 lib/CodeGen/CodeGenFunction.h        |  1 +
 5 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 include/clang/Basic/BuiltinsBPF.def

diff --git a/include/clang/Basic/BuiltinsBPF.def b/include/clang/Basic/BuiltinsBPF.def
new file mode 100644
index 0000000..9e53ee0
--- /dev/null
+++ b/include/clang/Basic/BuiltinsBPF.def
@@ -0,0 +1,16 @@
+//===--- BuiltinsBPF.def - BPF Builtin function database ----*- C++ -*-===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the BPF-specific builtin function database.  Users of
+// this file must define the BUILTIN macro to make use of this information.
+//
+//===----------------------------------------------------------------------===//
+
+BUILTIN(__builtin_bpf_typeid, "Wiv*", "nc")
+
diff --git a/include/clang/Basic/TargetBuiltins.h b/include/clang/Basic/TargetBuiltins.h
index b4740c5..550f921 100644
--- a/include/clang/Basic/TargetBuiltins.h
+++ b/include/clang/Basic/TargetBuiltins.h
@@ -93,6 +93,15 @@ namespace clang {
     };
   }
 
+  namespace BPF {
+    enum {
+        LastTIBuiltin = clang::Builtin::FirstTSBuiltin-1,
+#define BUILTIN(ID, TYPE, ATTRS) BI##ID,
+#include "clang/Basic/BuiltinsBPF.def"
+        LastTSBuiltin
+    };
+  }
+
   /// \brief Flags to identify the types for overloaded Neon builtins.
   ///
   /// These must be kept in sync with the flags in utils/TableGen/NeonEmitter.h.
diff --git a/lib/Basic/Targets.cpp b/lib/Basic/Targets.cpp
index f229c99..aa7fffb 100644
--- a/lib/Basic/Targets.cpp
+++ b/lib/Basic/Targets.cpp
@@ -6111,6 +6111,7 @@ validateAsmConstraint(const char *&Name,
   };
 
 class BPFTargetInfo : public TargetInfo {
+  static const Builtin::Info BuiltinInfo[];
 public:
   BPFTargetInfo(const llvm::Triple &Triple) : TargetInfo(Triple) {
     LongWidth = LongAlign = PointerWidth = PointerAlign = 64;
@@ -6141,7 +6142,10 @@ public:
   }
 
   void getTargetBuiltins(const Builtin::Info *&Records,
-                         unsigned &NumRecords) const override {}
+                         unsigned &NumRecords) const override {
+    Records = BuiltinInfo;
+    NumRecords = clang::BPF::LastTSBuiltin - Builtin::FirstTSBuiltin;
+  }
   const char *getClobbers() const override {
     return "";
   }
@@ -6164,6 +6168,14 @@ public:
   }
 };
 
+const Builtin::Info BPFTargetInfo::BuiltinInfo[] = {
+#define BUILTIN(ID, TYPE, ATTRS) { #ID, TYPE, ATTRS, 0, ALL_LANGUAGES },
+#define LIBBUILTIN(ID, TYPE, ATTRS, HEADER) { #ID, TYPE, ATTRS, HEADER,\
+                                              ALL_LANGUAGES },
+#include "clang/Basic/BuiltinsBPF.def"
+};
+
+
 class MipsTargetInfoBase : public TargetInfo {
   virtual void setDescriptionString() = 0;
 
diff --git a/lib/CodeGen/CGBuiltin.cpp b/lib/CodeGen/CGBuiltin.cpp
index 1555d29..a46f638 100644
--- a/lib/CodeGen/CGBuiltin.cpp
+++ b/lib/CodeGen/CGBuiltin.cpp
@@ -1882,6 +1882,9 @@ Value *CodeGenFunction::EmitTargetBuiltinExpr(unsigned BuiltinID,
   case llvm::Triple::nvptx:
   case llvm::Triple::nvptx64:
     return EmitNVPTXBuiltinExpr(BuiltinID, E);
+  case llvm::Triple::bpfel:
+  case llvm::Triple::bpfeb:
+    return EmitBPFBuiltinExpr(BuiltinID, E);
   default:
     return nullptr;
   }
@@ -7056,3 +7059,16 @@ Value *CodeGenFunction::EmitNVPTXBuiltinExpr(unsigned BuiltinID,
     return nullptr;
   }
 }
+
+Value *CodeGenFunction::EmitBPFBuiltinExpr(unsigned BuiltinID,
+		                           const CallExpr *E) {
+  switch (BuiltinID) {
+  default:
+    return nullptr;
+  case BPF::BI__builtin_bpf_typeid:
+    assert(E->getArg(0)->getType()->isPointerType());
+    llvm::Value *llvm_eh_typeid_for =
+      CGM.getIntrinsic(llvm::Intrinsic::eh_typeid_for);
+    return Builder.CreateCall(llvm_eh_typeid_for, EmitScalarExpr(E->getArg(0)));
+  }
+}
diff --git a/lib/CodeGen/CodeGenFunction.h b/lib/CodeGen/CodeGenFunction.h
index b7d6bbd..863420d 100644
--- a/lib/CodeGen/CodeGenFunction.h
+++ b/lib/CodeGen/CodeGenFunction.h
@@ -2622,6 +2622,7 @@ public:
   llvm::Value *EmitAMDGPUBuiltinExpr(unsigned BuiltinID, const CallExpr *E);
   llvm::Value *EmitSystemZBuiltinExpr(unsigned BuiltinID, const CallExpr *E);
   llvm::Value *EmitNVPTXBuiltinExpr(unsigned BuiltinID, const CallExpr *E);
+  llvm::Value *EmitBPFBuiltinExpr(unsigned BuiltinID, const CallExpr *E);
 
   llvm::Value *EmitObjCProtocolExpr(const ObjCProtocolExpr *E);
   llvm::Value *EmitObjCStringLiteral(const ObjCStringLiteral *E);
-- 
1.8.3.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-31 10:18                                         ` Wangnan (F)
  2015-07-31 10:20                                           ` [LLVM PATCH] BPF: add FRAMEADDR support Wang Nan
  2015-07-31 10:21                                           ` [LLVM CLANG PATCH] BPF: add __builtin_bpf_typeid() Wang Nan
@ 2015-07-31 10:48                                           ` pi3orama
  2015-08-03 19:44                                           ` Alexei Starovoitov
  3 siblings, 0 replies; 53+ messages in thread
From: pi3orama @ 2015-07-31 10:48 UTC (permalink / raw)
  To: Wangnan (F); +Cc: Alexei Starovoitov, He Kuang, linux-kernel

Just for your information:

The buggy DW_AT_name seems to be a problem in assembler.

I built the C file using -S:

$ clang -target bpf -g -c -O2 test.c

Then in resuming test.s, replaced all BPF op by "nop";

Also adjust .file directive;

Then assemble the .s file using as:

$ as -c test.s -o test.o

Then check dwarf info using objdump:

$ objdump --dwarf=info test.o

DW_AT_name looks all correct.

Thank you.


发自我的 iPhone

> 在 2015年7月31日,下午6:18,Wangnan (F) <wangnan0@huawei.com> 写道:
> 
> 
> 
> On 2015/7/30 1:13, Alexei Starovoitov wrote:
> 
> [SNIP]
>> probably both A and B won't really work when programs get bigger
>> and optimizations will start moving lines around.
>> the builtin_dwarf_type idea is actually quite interesting.
>> Potentially that builtin can stringify type name and later we can
>> search it in dwarf. Please take a look how to add such builtin.
>> There are few similar builtins that deal with exception handling
>> and need type info. May be they can be reused. Like:
>> int_eh_typeid_for and int_eh_dwarf_cfa
> 
> Hi Alexei,
> 
> I have tested int_eh_dwarf_cfa and int_eh_typeid_for.
> 
> By implementing ISD::FRAMEADDR support in LLVM BPF backend users are allowed to
> fetch frame address R11. __builtin_frame_addr(0) and __builtin_dwarf_cfa()
> will be enabled.
> 
> By emitting llvm.eh_typeid_for in clang we can utilize it go generate an unique
> ID from a pointer of user defined type. However, we can't use pointer of local
> variables.
> 
> I attach a sample program and the resuling asm output at the bottom of this mail.
> 
> Looks like llvm.eh_typeid_for meets our goal further. However, currently it is
> still ugly:
> 
> 1. llvm.eh_typeid_for can be used on global variables only. So for each output
>   structure we have to define a global varable.
> 
> 2. We still need to find a way to connect the fetchd typeid with DWARF info.
>   Inserting that ID into DWARF may workable?
> 
>   However with the newest llvm + clang the DWARF info is still incorrect:
> 
> $ objdump  --dwarf=info ./out.o
> ...
> <1><3f>: Abbrev Number: 3 (DW_TAG_structure_type)
>    <40>   DW_AT_name        : (indirect string, offset: 0x0): clang version 3.8.0 (http://llvm.org/git/clang.git f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
>    <44>   DW_AT_byte_size   : 8
>    <45>   DW_AT_decl_file   : 1
>    <46>   DW_AT_decl_line   : 4
> <2><47>: Abbrev Number: 4 (DW_TAG_member)
>    <48>   DW_AT_name        : (indirect string, offset: 0x0): clang version 3.8.0 (http://llvm.org/git/clang.git f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
>    <4c>   DW_AT_type        : <0x60>
>    <50>   DW_AT_decl_file   : 1
>    <51>   DW_AT_decl_line   : 5
>    <52>   DW_AT_data_member_location: 0
> <2><53>: Abbrev Number: 4 (DW_TAG_member)
>    <54>   DW_AT_name        : (indirect string, offset: 0x0): clang version 3.8.0 (http://llvm.org/git/clang.git f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
>    <58>   DW_AT_type        : <0x60>
>    <5c>   DW_AT_decl_file   : 1
>    <5d>   DW_AT_decl_line   : 6
>    <5e>   DW_AT_data_member_location: 4
> ...
> 
> The DW_AT_name is missing.
> 
> I'll post 2 LLVM patches by replying this mail. Please have a look and help me
> send them to LLVM if you think my code is correct.
> 
> Following is the sample code and resuling ASM:
> 
> ========================================================
> 
> static int (*test_func)(unsigned long) = (void *) 0x1234;
> 
> struct my_str {
>        int x;
>        int y;
> };
> struct my_str __str_my_str;
> 
> struct my_str2 {
>        int x;
>        int y;
>        int z;
> };
> struct my_str2 __str_my_str2;
> 
> int func(int *ctx)
> {
>        struct my_str var_a;
>        struct my_str2 var_b;
>        test_func((void*)&var_a - __builtin_dwarf_cfa());
>        test_func(__builtin_bpf_typeid(&__str_my_str));
>        test_func(__builtin_bpf_typeid(&__str_my_str2));
>        return 0;
> }
> 
> ==================== the resuling asm code =============
> 
>        .text
>        .globl  func
>        .align  8
> func:                                   # @func
> # BB#0:                                 # %entry
>        mov     r1, r10
>        addi    r1, -8
>        sub     r1, r11
>        call    4660
>        mov     r1, 1
>        call    4660
>        mov     r1, 2
>        call    4660
>        mov     r0, 0
>        ret
> 
>        .comm   __str_my_str,8,4        # @__str_my_str
>        .comm   __str_my_str2,12,4      # @__str_my_str2
> 
> 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-31 10:18                                         ` Wangnan (F)
                                                             ` (2 preceding siblings ...)
  2015-07-31 10:48                                           ` llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event pi3orama
@ 2015-08-03 19:44                                           ` Alexei Starovoitov
  2015-08-04  9:01                                             ` Cc llvmdev: " Wangnan (F)
  2015-08-12  2:34                                             ` Wangnan (F)
  3 siblings, 2 replies; 53+ messages in thread
From: Alexei Starovoitov @ 2015-08-03 19:44 UTC (permalink / raw)
  To: Wangnan (F), He Kuang, pi3orama; +Cc: linux-kernel

On 7/31/15 3:18 AM, Wangnan (F) wrote:
>
>     However with the newest llvm + clang the DWARF info is still incorrect:
>
> $ objdump  --dwarf=info ./out.o
> ...
>   <1><3f>: Abbrev Number: 3 (DW_TAG_structure_type)
>      <40>   DW_AT_name        : (indirect string, offset: 0x0): clang
> version 3.8.0 (http://llvm.org/git/clang.git
> f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
> c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
>      <44>   DW_AT_byte_size   : 8
>      <45>   DW_AT_decl_file   : 1
>      <46>   DW_AT_decl_line   : 4
>   <2><47>: Abbrev Number: 4 (DW_TAG_member)
>      <48>   DW_AT_name        : (indirect string, offset: 0x0): clang
> version 3.8.0 (http://llvm.org/git/clang.git
> f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
> c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
>      <4c>   DW_AT_type        : <0x60>
>      <50>   DW_AT_decl_file   : 1
>      <51>   DW_AT_decl_line   : 5
>      <52>   DW_AT_data_member_location: 0
>   <2><53>: Abbrev Number: 4 (DW_TAG_member)
>      <54>   DW_AT_name        : (indirect string, offset: 0x0): clang
> version 3.8.0 (http://llvm.org/git/clang.git
> f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
> c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
>      <58>   DW_AT_type        : <0x60>
>      <5c>   DW_AT_decl_file   : 1
>      <5d>   DW_AT_decl_line   : 6
>      <5e>   DW_AT_data_member_location: 4
> ...
>
> The DW_AT_name is missing.

didn't have time to look at it.
from your llvm patches looks like you've got quite experienced
with it already :)

> I'll post 2 LLVM patches by replying this mail. Please have a look and
> help me
> send them to LLVM if you think my code is correct.

patch 1:
I don't quite understand the purpose of builtin_dwarf_cfa
returning R11. It's a special register seen inside llvm codegen
only. It doesn't have kernel meaning.

patch 2:
do we really need to hack clang?
Can you just define a function that aliases to intrinsic,
like we do for ld_abs/ld_ind ?
void bpf_store_half(void *skb, u64 off, u64 val) asm("llvm.bpf.store.half");
then no extra patches necessary.

> struct my_str {
>          int x;
>          int y;
> };
> struct my_str __str_my_str;
>
> struct my_str2 {
>          int x;
>          int y;
>          int z;
> };
> struct my_str2 __str_my_str2;
>
>          test_func(__builtin_bpf_typeid(&__str_my_str));
>          test_func(__builtin_bpf_typeid(&__str_my_str2));
>          mov     r1, 1
>          call    4660
>          mov     r1, 2
>          call    4660

this part looks good. I think it's usable.

 > 1. llvm.eh_typeid_for can be used on global variables only. So for each
 > output
 >     structure we have to define a global varable.

why? I think it should work with local pointers as well.

 > 2. We still need to find a way to connect the fetchd typeid with DWARF
 > info.
 >     Inserting that ID into DWARF may workable?

hmm, that id should be the same id we're seeing in dwarf, right?
I think it's used in exception handling which is reusing some of
the dwarf stuff for this, so the must be a way to connect that id
to actual type info. Though last time I looked at EH was
during g++ hacking days. No idea how llvm does it exactly, but
I'm assuming the logic for rtti should be similar.

btw, for any deep llvm stuff you may need to move the thread to llvmdev.
May be folks there will have other ideas.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-03 19:44                                           ` Alexei Starovoitov
@ 2015-08-04  9:01                                             ` Wangnan (F)
  2015-08-05  1:58                                               ` Wangnan (F)
  2015-08-12  2:34                                             ` Wangnan (F)
  1 sibling, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-04  9:01 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang, pi3orama; +Cc: linux-kernel, llvmdev

For people who in llvmdev:

This mail is belong to a thread in linux kernel mailing list, the first 
message
can be retrived from:

  http://lkml.kernel.org/r/55B1535E.8090406@plumgrid.com

Our goal is to fild a way to make BPF program get an unique ID for each type
so it can pass the ID to other part of kernel, then we can retrive the 
type and
decode the structure using DWARF information. Currently we have two problem
needs to solve:

1. Dwarf information generated by BPF backend lost all DW_AT_name field;

2. How to get typeid from local variable? I tried llvm.eh_typeid_for
    but it support global variable only.

Following is my response to Alexei.

On 2015/8/4 3:44, Alexei Starovoitov wrote:
> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>

[SNIP]

> didn't have time to look at it.
> from your llvm patches looks like you've got quite experienced
> with it already :)
>
>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>> help me
>> send them to LLVM if you think my code is correct.
>
> patch 1:
> I don't quite understand the purpose of builtin_dwarf_cfa
> returning R11. It's a special register seen inside llvm codegen
> only. It doesn't have kernel meaning.
>

Kernel side verifier allows us to do arithmetic computation using two 
local variable
address or local variable address and R11. Therefore, we can compute the 
location
of a local variable using:

   mark = &my_var_a - __builtin_frame_address(0);

If the stack allocation is fixed (if the location is never reused), the 
above 'mark'
can be uniquely identify a local variable. That's why I'm interesting in 
it. However
I'm not sure whether the prerequestion is hold.

> patch 2:
> do we really need to hack clang?
> Can you just define a function that aliases to intrinsic,
> like we do for ld_abs/ld_ind ?
> void bpf_store_half(void *skb, u64 off, u64 val) 
> asm("llvm.bpf.store.half");
> then no extra patches necessary.
>
>> struct my_str {
>>          int x;
>>          int y;
>> };
>> struct my_str __str_my_str;
>>
>> struct my_str2 {
>>          int x;
>>          int y;
>>          int z;
>> };
>> struct my_str2 __str_my_str2;
>>
>>          test_func(__builtin_bpf_typeid(&__str_my_str));
>>          test_func(__builtin_bpf_typeid(&__str_my_str2));
>>          mov     r1, 1
>>          call    4660
>>          mov     r1, 2
>>          call    4660
>
> this part looks good. I think it's usable.
>
> > 1. llvm.eh_typeid_for can be used on global variables only. So for each
> > output
> >     structure we have to define a global varable.
>
> why? I think it should work with local pointers as well.
>

It is defined by LLVM, in lib/CodeGen/Analysis.cpp:

/// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded in V.
GlobalValue *llvm::ExtractTypeInfo(Value *V) {
   ...
   assert((GV || isa<ConstantPointerNull>(V)) &&
          "TypeInfo must be a global variable or NULL");   <-- we can 
use only constant pointers
   return GV;
}

So from llvm::Intrinsic::eh_typeid_for we can get type of global 
variables only.

We may need a new intrinsic for that.


> > 2. We still need to find a way to connect the fetchd typeid with DWARF
> > info.
> >     Inserting that ID into DWARF may workable?
>
> hmm, that id should be the same id we're seeing in dwarf, right?

There's no 'typeid' field in dwarf originally. I'm still looking for a way
to inject this ID into dwarf infromation.

> I think it's used in exception handling which is reusing some of
> the dwarf stuff for this, so the must be a way to connect that id
> to actual type info. Though last time I looked at EH was
> during g++ hacking days. No idea how llvm does it exactly, but
> I'm assuming the logic for rtti should be similar.
>

I'm not sure whether RTTI use dwarf to deduce type information. I think not,
because dwarf infos can be stripped out.

> btw, for any deep llvm stuff you may need to move the thread to llvmdev.
> May be folks there will have other ideas.
>



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-04  9:01                                             ` Cc llvmdev: " Wangnan (F)
@ 2015-08-05  1:58                                               ` Wangnan (F)
  2015-08-05  2:05                                                 ` Wangnan (F)
  0 siblings, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-05  1:58 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang, pi3orama; +Cc: linux-kernel, llvmdev



On 2015/8/4 17:01, Wangnan (F) wrote:
> For people who in llvmdev:
>
> This mail is belong to a thread in linux kernel mailing list, the 
> first message
> can be retrived from:
>
>  http://lkml.kernel.org/r/55B1535E.8090406@plumgrid.com
>
> Our goal is to fild a way to make BPF program get an unique ID for 
> each type
> so it can pass the ID to other part of kernel, then we can retrive the 
> type and
> decode the structure using DWARF information. Currently we have two 
> problem
> needs to solve:
>
> 1. Dwarf information generated by BPF backend lost all DW_AT_name field;
>
> 2. How to get typeid from local variable? I tried llvm.eh_typeid_for
>    but it support global variable only.
>
> Following is my response to Alexei.
>
> On 2015/8/4 3:44, Alexei Starovoitov wrote:
>> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>>
>
> [SNIP]
>
>> didn't have time to look at it.
>> from your llvm patches looks like you've got quite experienced
>> with it already :)
>>
>>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>>> help me
>>> send them to LLVM if you think my code is correct.
>>
>> patch 1:
>> I don't quite understand the purpose of builtin_dwarf_cfa
>> returning R11. It's a special register seen inside llvm codegen
>> only. It doesn't have kernel meaning.
>>
>
> Kernel side verifier allows us to do arithmetic computation using two 
> local variable
> address or local variable address and R11. Therefore, we can compute 
> the location
> of a local variable using:
>
>   mark = &my_var_a - __builtin_frame_address(0);
>
> If the stack allocation is fixed (if the location is never reused), 
> the above 'mark'
> can be uniquely identify a local variable. That's why I'm interesting 
> in it. However
> I'm not sure whether the prerequestion is hold.
>
>> patch 2:
>> do we really need to hack clang?
>> Can you just define a function that aliases to intrinsic,
>> like we do for ld_abs/ld_ind ?
>> void bpf_store_half(void *skb, u64 off, u64 val) 
>> asm("llvm.bpf.store.half");
>> then no extra patches necessary.
>>
>>> struct my_str {
>>>          int x;
>>>          int y;
>>> };
>>> struct my_str __str_my_str;
>>>
>>> struct my_str2 {
>>>          int x;
>>>          int y;
>>>          int z;
>>> };
>>> struct my_str2 __str_my_str2;
>>>
>>>          test_func(__builtin_bpf_typeid(&__str_my_str));
>>>          test_func(__builtin_bpf_typeid(&__str_my_str2));
>>>          mov     r1, 1
>>>          call    4660
>>>          mov     r1, 2
>>>          call    4660
>>
>> this part looks good. I think it's usable.
>>
>> > 1. llvm.eh_typeid_for can be used on global variables only. So for 
>> each
>> > output
>> >     structure we have to define a global varable.
>>
>> why? I think it should work with local pointers as well.
>>
>
> It is defined by LLVM, in lib/CodeGen/Analysis.cpp:
>
> /// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded 
> in V.
> GlobalValue *llvm::ExtractTypeInfo(Value *V) {
>   ...
>   assert((GV || isa<ConstantPointerNull>(V)) &&
>          "TypeInfo must be a global variable or NULL");   <-- we can 
> use only constant pointers
>   return GV;
> }
>
> So from llvm::Intrinsic::eh_typeid_for we can get type of global 
> variables only.
>
> We may need a new intrinsic for that.
>
>
>> > 2. We still need to find a way to connect the fetchd typeid with DWARF
>> > info.
>> >     Inserting that ID into DWARF may workable?
>>
>> hmm, that id should be the same id we're seeing in dwarf, right?
>
> There's no 'typeid' field in dwarf originally. I'm still looking for a 
> way
> to inject this ID into dwarf infromation.
>
>> I think it's used in exception handling which is reusing some of
>> the dwarf stuff for this, so the must be a way to connect that id
>> to actual type info. Though last time I looked at EH was
>> during g++ hacking days. No idea how llvm does it exactly, but
>> I'm assuming the logic for rtti should be similar.
>>
>
> I'm not sure whether RTTI use dwarf to deduce type information. I 
> think not,
> because dwarf infos can be stripped out.
>

Hi Alexei,

Just found that llvm::Intrinsic::eh_typeid_for is function specific. ID 
of same type in
different functions may be different. Here is an example:

static int (*bpf_output_event)(unsigned long, void *buf, int size) =
         (void *) 0x1234;

struct my_str {
         int x;
         int y;
};
struct my_str __str_my_str;

struct my_str2 {
         int x;
         int y;
         int z;
};
struct my_str2 __str_my_str2;

int func(int *ctx)
{
         struct my_str var_a;
         struct my_str2 var_b;
         bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a, 
sizeof(var_a));
         bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b, 
sizeof(var_b));
         return 0;
}

int func2(int *ctx)
{
         struct my_str var_a;
         struct my_str2 var_b;

         /* change order here */
         bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b, 
sizeof(var_b));
         bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a, 
sizeof(var_a))
         return 0;
}

This program uses __builtin_bpf_typeid(llvm::Intrinsic::eh_typeid_for) 
in func and func2
for same two types but in different order. We expect same type get same ID.

Compiled with:

  $ clang -target bpf -S -O2 -c ./test_bpf_typeid.c

The result is:

           .text
         .globl  func
         .align  8
func:                                   # @func
# BB#0:                                 # %entry
         mov     r2, r10
         addi    r2, -8
         mov     r1, 1
         mov     r3, 8
         call    4660
         mov     r2, r10
         addi    r2, -24
         mov     r1, 2
         mov     r3, 12
         call    4660
         mov     r0, 0
         ret

         .globl  func2
         .align  8
func2:                                  # @func2
# BB#0:                                 # %entry
         mov     r2, r10
         addi    r2, -24
         mov     r1, 1                  <--- we want 2 here.
         mov     r3, 12
         call    4660
         mov     r2, r10
         addi    r2, -8
         mov     r1, 2                  <--- we want 1 here.
         mov     r3, 8
         call    4660
         mov     r0, 0
         ret

         .comm   __str_my_str,8,4        # @__str_my_str
         .comm   __str_my_str2,12,4      # @__str_my_str2


Conclusion: llvm::Intrinsic::eh_typeid_for is not on the right direction...

Thank you.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-05  1:58                                               ` Wangnan (F)
@ 2015-08-05  2:05                                                 ` Wangnan (F)
  2015-08-05  6:51                                                   ` [LLVMdev] " Wangnan (F)
  0 siblings, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-05  2:05 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang, pi3orama; +Cc: linux-kernel, llvm-dev

Send again since llvmdev is moved to llvm-dev@lists.llvm.org

On 2015/8/5 9:58, Wangnan (F) wrote:
>
>
> On 2015/8/4 17:01, Wangnan (F) wrote:
>> For people who in llvmdev:
>>
>> This mail is belong to a thread in linux kernel mailing list, the 
>> first message
>> can be retrived from:
>>
>>  http://lkml.kernel.org/r/55B1535E.8090406@plumgrid.com
>>
>> Our goal is to fild a way to make BPF program get an unique ID for 
>> each type
>> so it can pass the ID to other part of kernel, then we can retrive 
>> the type and
>> decode the structure using DWARF information. Currently we have two 
>> problem
>> needs to solve:
>>
>> 1. Dwarf information generated by BPF backend lost all DW_AT_name field;
>>
>> 2. How to get typeid from local variable? I tried llvm.eh_typeid_for
>>    but it support global variable only.
>>
>> Following is my response to Alexei.
>>
>> On 2015/8/4 3:44, Alexei Starovoitov wrote:
>>> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>>>
>>
>> [SNIP]
>>
>>> didn't have time to look at it.
>>> from your llvm patches looks like you've got quite experienced
>>> with it already :)
>>>
>>>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>>>> help me
>>>> send them to LLVM if you think my code is correct.
>>>
>>> patch 1:
>>> I don't quite understand the purpose of builtin_dwarf_cfa
>>> returning R11. It's a special register seen inside llvm codegen
>>> only. It doesn't have kernel meaning.
>>>
>>
>> Kernel side verifier allows us to do arithmetic computation using two 
>> local variable
>> address or local variable address and R11. Therefore, we can compute 
>> the location
>> of a local variable using:
>>
>>   mark = &my_var_a - __builtin_frame_address(0);
>>
>> If the stack allocation is fixed (if the location is never reused), 
>> the above 'mark'
>> can be uniquely identify a local variable. That's why I'm interesting 
>> in it. However
>> I'm not sure whether the prerequestion is hold.
>>
>>> patch 2:
>>> do we really need to hack clang?
>>> Can you just define a function that aliases to intrinsic,
>>> like we do for ld_abs/ld_ind ?
>>> void bpf_store_half(void *skb, u64 off, u64 val) 
>>> asm("llvm.bpf.store.half");
>>> then no extra patches necessary.
>>>
>>>> struct my_str {
>>>>          int x;
>>>>          int y;
>>>> };
>>>> struct my_str __str_my_str;
>>>>
>>>> struct my_str2 {
>>>>          int x;
>>>>          int y;
>>>>          int z;
>>>> };
>>>> struct my_str2 __str_my_str2;
>>>>
>>>>          test_func(__builtin_bpf_typeid(&__str_my_str));
>>>> test_func(__builtin_bpf_typeid(&__str_my_str2));
>>>>          mov     r1, 1
>>>>          call    4660
>>>>          mov     r1, 2
>>>>          call    4660
>>>
>>> this part looks good. I think it's usable.
>>>
>>> > 1. llvm.eh_typeid_for can be used on global variables only. So for 
>>> each
>>> > output
>>> >     structure we have to define a global varable.
>>>
>>> why? I think it should work with local pointers as well.
>>>
>>
>> It is defined by LLVM, in lib/CodeGen/Analysis.cpp:
>>
>> /// ExtractTypeInfo - Returns the type info, possibly bitcast, 
>> encoded in V.
>> GlobalValue *llvm::ExtractTypeInfo(Value *V) {
>>   ...
>>   assert((GV || isa<ConstantPointerNull>(V)) &&
>>          "TypeInfo must be a global variable or NULL");   <-- we can 
>> use only constant pointers
>>   return GV;
>> }
>>
>> So from llvm::Intrinsic::eh_typeid_for we can get type of global 
>> variables only.
>>
>> We may need a new intrinsic for that.
>>
>>
>>> > 2. We still need to find a way to connect the fetchd typeid with 
>>> DWARF
>>> > info.
>>> >     Inserting that ID into DWARF may workable?
>>>
>>> hmm, that id should be the same id we're seeing in dwarf, right?
>>
>> There's no 'typeid' field in dwarf originally. I'm still looking for 
>> a way
>> to inject this ID into dwarf infromation.
>>
>>> I think it's used in exception handling which is reusing some of
>>> the dwarf stuff for this, so the must be a way to connect that id
>>> to actual type info. Though last time I looked at EH was
>>> during g++ hacking days. No idea how llvm does it exactly, but
>>> I'm assuming the logic for rtti should be similar.
>>>
>>
>> I'm not sure whether RTTI use dwarf to deduce type information. I 
>> think not,
>> because dwarf infos can be stripped out.
>>
>
> Hi Alexei,
>
> Just found that llvm::Intrinsic::eh_typeid_for is function specific. 
> ID of same type in
> different functions may be different. Here is an example:
>
> static int (*bpf_output_event)(unsigned long, void *buf, int size) =
>         (void *) 0x1234;
>
> struct my_str {
>         int x;
>         int y;
> };
> struct my_str __str_my_str;
>
> struct my_str2 {
>         int x;
>         int y;
>         int z;
> };
> struct my_str2 __str_my_str2;
>
> int func(int *ctx)
> {
>         struct my_str var_a;
>         struct my_str2 var_b;
>         bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a, 
> sizeof(var_a));
>         bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b, 
> sizeof(var_b));
>         return 0;
> }
>
> int func2(int *ctx)
> {
>         struct my_str var_a;
>         struct my_str2 var_b;
>
>         /* change order here */
>         bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b, 
> sizeof(var_b));
>         bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a, 
> sizeof(var_a))
>         return 0;
> }
>
> This program uses __builtin_bpf_typeid(llvm::Intrinsic::eh_typeid_for) 
> in func and func2
> for same two types but in different order. We expect same type get 
> same ID.
>
> Compiled with:
>
>  $ clang -target bpf -S -O2 -c ./test_bpf_typeid.c
>
> The result is:
>
>           .text
>         .globl  func
>         .align  8
> func:                                   # @func
> # BB#0:                                 # %entry
>         mov     r2, r10
>         addi    r2, -8
>         mov     r1, 1
>         mov     r3, 8
>         call    4660
>         mov     r2, r10
>         addi    r2, -24
>         mov     r1, 2
>         mov     r3, 12
>         call    4660
>         mov     r0, 0
>         ret
>
>         .globl  func2
>         .align  8
> func2:                                  # @func2
> # BB#0:                                 # %entry
>         mov     r2, r10
>         addi    r2, -24
>         mov     r1, 1                  <--- we want 2 here.
>         mov     r3, 12
>         call    4660
>         mov     r2, r10
>         addi    r2, -8
>         mov     r1, 2                  <--- we want 1 here.
>         mov     r3, 8
>         call    4660
>         mov     r0, 0
>         ret
>
>         .comm   __str_my_str,8,4        # @__str_my_str
>         .comm   __str_my_str2,12,4      # @__str_my_str2
>
>
> Conclusion: llvm::Intrinsic::eh_typeid_for is not on the right 
> direction...
>
> Thank you.



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-05  2:05                                                 ` Wangnan (F)
@ 2015-08-05  6:51                                                   ` Wangnan (F)
  2015-08-05  7:11                                                     ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-05  6:51 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang, pi3orama; +Cc: llvm-dev, linux-kernel



On 2015/8/5 10:05, Wangnan (F) wrote:
> Send again since llvmdev is moved to llvm-dev@lists.llvm.org
>
> On 2015/8/5 9:58, Wangnan (F) wrote:
>>
>>>
>>> On 2015/8/4 3:44, Alexei Starovoitov wrote:
>>>> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>>>>
>>>
>>> [SNIP]
>>>
>>>> didn't have time to look at it.
>>>> from your llvm patches looks like you've got quite experienced
>>>> with it already :)
>>>>
>>>>> I'll post 2 LLVM patches by replying this mail. Please have a look 
>>>>> and
>>>>> help me
>>>>> send them to LLVM if you think my code is correct.
>>>>
>>>> patch 1:
>>>> I don't quite understand the purpose of builtin_dwarf_cfa
>>>> returning R11. It's a special register seen inside llvm codegen
>>>> only. It doesn't have kernel meaning.
>>>>
>>>
>>> Kernel side verifier allows us to do arithmetic computation using 
>>> two local variable
>>> address or local variable address and R11. Therefore, we can compute 
>>> the location
>>> of a local variable using:
>>>
>>>   mark = &my_var_a - __builtin_frame_address(0);
>>>
>>> If the stack allocation is fixed (if the location is never reused), 
>>> the above 'mark'
>>> can be uniquely identify a local variable. That's why I'm 
>>> interesting in it. However
>>> I'm not sure whether the prerequestion is hold.
>>>
>>>> patch 2:
>>>> do we really need to hack clang?
>>>> Can you just define a function that aliases to intrinsic,
>>>> like we do for ld_abs/ld_ind ?
>>>> void bpf_store_half(void *skb, u64 off, u64 val) 
>>>> asm("llvm.bpf.store.half");
>>>> then no extra patches necessary.
>>>>

And for this:

I tried this test function:

void bpf_store_half(void *skb, int off, int val) asm("llvm.bpf.store.half");
int func()
{
         bpf_store_half(0, 0, 0);
         return 0;
}

Compiled with:

$ clang -g -target bpf -O2 -S -c test.c

And get this:

         .text
         .globl  func
         .align  8
func:                                   # @func
# BB#0:                                 # %entry
         mov     r1, 0
         mov     r2, 0
         mov     r3, 0
         call    llvm.bpf.store.half
         mov     r0, 0
         ret

Without -S, it generate a function relocation:

$ objdump -r ./test.o

./test.o:     file format elf64-little

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE
0000000000000018 UNKNOWN           llvm.bpf.store.half


It doesn't work as you suggestion. I think we still need to do something
in clang frontend, or it can only be used in '.ll'.

Thank you.



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-05  6:51                                                   ` [LLVMdev] " Wangnan (F)
@ 2015-08-05  7:11                                                     ` Alexei Starovoitov
  2015-08-05  8:28                                                       ` Wangnan (F)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-08-05  7:11 UTC (permalink / raw)
  To: Wangnan (F), He Kuang, pi3orama; +Cc: llvm-dev, linux-kernel

On 8/4/15 11:51 PM, Wangnan (F) wrote:
> void bpf_store_half(void *skb, int off, int val)
> asm("llvm.bpf.store.half");
> int func()
> {
>          bpf_store_half(0, 0, 0);
>          return 0;
> }
>
> Compiled with:
>
> $ clang -g -target bpf -O2 -S -c test.c
>
> And get this:
>
>          .text
>          .globl  func
>          .align  8
> func:                                   # @func
> # BB#0:                                 # %entry
>          mov     r1, 0
>          mov     r2, 0
>          mov     r3, 0
>          call    llvm.bpf.store.half

it didn't work because number and types of args were incompatible.
Every samples/bpf/sockex[0-9]_kern.c is using llvm.bpf.load.* intrinsics.

the typeid changing ids with order is surprising.
I think the assertion in ExtractTypeInfo() is not hard.
Just there were no such use cases. May be we can do something
similar to what LowerIntrinsicCall() does and lower it differently
in the backend.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-05  7:11                                                     ` Alexei Starovoitov
@ 2015-08-05  8:28                                                       ` Wangnan (F)
  2015-08-06  3:22                                                         ` [llvm-dev] " Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-05  8:28 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang, pi3orama; +Cc: llvm-dev, linux-kernel



On 2015/8/5 15:11, Alexei Starovoitov wrote:
> On 8/4/15 11:51 PM, Wangnan (F) wrote:
>> void bpf_store_half(void *skb, int off, int val)
>> asm("llvm.bpf.store.half");
>> int func()
>> {
>>          bpf_store_half(0, 0, 0);
>>          return 0;
>> }
>>
>> Compiled with:
>>
>> $ clang -g -target bpf -O2 -S -c test.c
>>
>> And get this:
>>
>>          .text
>>          .globl  func
>>          .align  8
>> func:                                   # @func
>> # BB#0:                                 # %entry
>>          mov     r1, 0
>>          mov     r2, 0
>>          mov     r3, 0
>>          call    llvm.bpf.store.half
>
> it didn't work because number and types of args were incompatible.
> Every samples/bpf/sockex[0-9]_kern.c is using llvm.bpf.load.* intrinsics.
>

Make it work. Thank you.

It doesn't work for me at first since in my llvm there's only 
llvm.bpf.load.*.

I think llvm.bpf.store.* belone to some patches you haven't posted yet?

> the typeid changing ids with order is surprising.
> I think the assertion in ExtractTypeInfo() is not hard.
> Just there were no such use cases. May be we can do something
> similar to what LowerIntrinsicCall() does and lower it differently
> in the backend.
>
But in backend can we still get type information? I thought type is
meaningful in frontend only, and backend behaviors is unable to affect
DWARF generation, right?

Thank you.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-07-29 17:13                                       ` Alexei Starovoitov
  2015-07-29 20:00                                         ` pi3orama
  2015-07-31 10:18                                         ` Wangnan (F)
@ 2015-08-05  8:59                                         ` He Kuang
  2015-08-06  3:41                                           ` [llvm-dev] " Alexei Starovoitov
  2 siblings, 1 reply; 53+ messages in thread
From: He Kuang @ 2015-08-05  8:59 UTC (permalink / raw)
  To: Alexei Starovoitov, Wangnan (F), pi3orama; +Cc: linux-kernel, llvm-dev

Hi, Alexei

On 2015/7/30 1:13, Alexei Starovoitov wrote:
> On 7/29/15 2:38 AM, He Kuang wrote:
>> Hi, Alexei
>>
>> On 2015/7/28 10:18, Alexei Starovoitov wrote:
>>> On 7/25/15 3:04 AM, He Kuang wrote:
>>>> I noticed that for 64-bit elf format, the reloc sections have
>>>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>>> file(64bit). I think there must be something wrong in the process
>>>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>>> AT_name now, DW_AT_LOCATION still missed and need your help.
>>>

Another thing about DW_AT_name, we've already found that the name
string is stored indirectly and needs relocation which is
architecture specific, while the e_machine info in bpf obj file
is "unknown", both objdump and libdw cannot parse DW_AT_name
correctly.

Should we just use a known architeture for bpf object file
instead of "unknown"? If so, we can use the existing relocation
codes in libdw and get DIE name by simply invoking
dwarf_diename(). The drawback of this method is that, e.g. we
use "x86-64" instead, is hard to distinguish bpf obj file with
x86-64 elf file. Do you think this is ok?

Otherwise, for not touching libdw, we should reimplement the
relocation codes already in libdw for bpf elf file with "unknown"
machine info specially in perf. I wonder whether it is worth doing
this and what's your opinion?

Thank you.

>> index directly:
>>
>>    __bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);
>>
>
> probably both A and B won't really work when programs get bigger
> and optimizations will start moving lines around.
> the builtin_dwarf_type idea is actually quite interesting.
> Potentially that builtin can stringify type name and later we can
> search it in dwarf. Please take a look how to add such builtin.
> There are few similar builtins that deal with exception handling
> and need type info. May be they can be reused. Like:
> int_eh_typeid_for and int_eh_dwarf_cfa
>
>


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-05  8:28                                                       ` Wangnan (F)
@ 2015-08-06  3:22                                                         ` Alexei Starovoitov
  2015-08-06  4:35                                                           ` Wangnan (F)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-08-06  3:22 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Alexei Starovoitov, He Kuang, pi3orama, llvm-dev, linux-kernel

On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
> 
> It doesn't work for me at first since in my llvm there's only
> llvm.bpf.load.*.
> 
> I think llvm.bpf.store.* belone to some patches you haven't posted yet?

nope. only loads have special instructions ld_abs/ld_ind
which are represented by these intrinsics.
stores, so far, are done via single bpf_store_bytes() helper function.

> >the typeid changing ids with order is surprising.
> >I think the assertion in ExtractTypeInfo() is not hard.
> >Just there were no such use cases. May be we can do something
> >similar to what LowerIntrinsicCall() does and lower it differently
> >in the backend.
> >
> But in backend can we still get type information? I thought type is
> meaningful in frontend only, and backend behaviors is unable to affect
> DWARF generation, right?

why do we need to affect type generation? we just need to know dwarf
type id in the backend, so we can emit it as a constant.
I still think lowering eh_typeid_for differently may work.
Like instead of doing
GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
getMachineFunction().getMMI().getTypeIDFor(GV)
we can get dwarf type id from I.getArgOperand(0) if it's
any pointer to struct type.
I'm not familiar with dwarf handling part of llvm, but feels possible.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-05  8:59                                         ` [LLVMdev] Cc llvmdev: " He Kuang
@ 2015-08-06  3:41                                           ` Alexei Starovoitov
  2015-08-06  4:31                                             ` Wangnan (F)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-08-06  3:41 UTC (permalink / raw)
  To: He Kuang
  Cc: Alexei Starovoitov, Wangnan (F), pi3orama, llvm-dev, linux-kernel

On Wed, Aug 05, 2015 at 04:59:01PM +0800, He Kuang wrote:
> Hi, Alexei
> 
> On 2015/7/30 1:13, Alexei Starovoitov wrote:
> >On 7/29/15 2:38 AM, He Kuang wrote:
> >>Hi, Alexei
> >>
> >>On 2015/7/28 10:18, Alexei Starovoitov wrote:
> >>>On 7/25/15 3:04 AM, He Kuang wrote:
> >>>>I noticed that for 64-bit elf format, the reloc sections have
> >>>>'Addend' in the entry, but there's no 'Addend' info in bpf elf
> >>>>file(64bit). I think there must be something wrong in the process
> >>>>of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
> >>>>AT_name now, DW_AT_LOCATION still missed and need your help.
> >>>
> 
> Another thing about DW_AT_name, we've already found that the name
> string is stored indirectly and needs relocation which is
> architecture specific, while the e_machine info in bpf obj file
> is "unknown", both objdump and libdw cannot parse DW_AT_name
> correctly.
> 
> Should we just use a known architeture for bpf object file
> instead of "unknown"? If so, we can use the existing relocation
> codes in libdw and get DIE name by simply invoking
> dwarf_diename(). The drawback of this method is that, e.g. we
> use "x86-64" instead, is hard to distinguish bpf obj file with
> x86-64 elf file. Do you think this is ok?

The only clean way would be to register bpf as an architecture
with elf standards committee. I have no idea who is doing that and
how much such new e_machine registration may cost.
So far using EM_NONE is a hack to avoid bureaucracy.
Are dwarf relocation processor specific?
Then simple hack to elfutils/libdw to treat EM_NONE as X64
should do the trick, right?
If that indeed works, we can tweak bpf backend to use EM_X86_64,
but then the danger that such .o file will be wrongly
recognized by elf utils. imo it's safer to keep it as EM_NONE
until real number is assigned, but even after it's assigned it
will take time to propagate that value. So for now I would try
to find a solution keeping EM_NONE hack.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-06  3:41                                           ` [llvm-dev] " Alexei Starovoitov
@ 2015-08-06  4:31                                             ` Wangnan (F)
  2015-08-06  6:50                                               ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-06  4:31 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang
  Cc: Alexei Starovoitov, pi3orama, llvm-dev, linux-kernel



On 2015/8/6 11:41, Alexei Starovoitov wrote:
> On Wed, Aug 05, 2015 at 04:59:01PM +0800, He Kuang wrote:
>> Hi, Alexei
>>
>> On 2015/7/30 1:13, Alexei Starovoitov wrote:
>>> On 7/29/15 2:38 AM, He Kuang wrote:
>>>> Hi, Alexei
>>>>
>>>> On 2015/7/28 10:18, Alexei Starovoitov wrote:
>>>>> On 7/25/15 3:04 AM, He Kuang wrote:
>>>>>> I noticed that for 64-bit elf format, the reloc sections have
>>>>>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>>>>> file(64bit). I think there must be something wrong in the process
>>>>>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>>>>> AT_name now, DW_AT_LOCATION still missed and need your help.
>> Another thing about DW_AT_name, we've already found that the name
>> string is stored indirectly and needs relocation which is
>> architecture specific, while the e_machine info in bpf obj file
>> is "unknown", both objdump and libdw cannot parse DW_AT_name
>> correctly.
>>
>> Should we just use a known architeture for bpf object file
>> instead of "unknown"? If so, we can use the existing relocation
>> codes in libdw and get DIE name by simply invoking
>> dwarf_diename(). The drawback of this method is that, e.g. we
>> use "x86-64" instead, is hard to distinguish bpf obj file with
>> x86-64 elf file. Do you think this is ok?
> The only clean way would be to register bpf as an architecture
> with elf standards committee. I have no idea who is doing that and
> how much such new e_machine registration may cost.
> So far using EM_NONE is a hack to avoid bureaucracy.
> Are dwarf relocation processor specific?
> Then simple hack to elfutils/libdw to treat EM_NONE as X64
> should do the trick, right?
> If that indeed works, we can tweak bpf backend to use EM_X86_64,
> but then the danger that such .o file will be wrongly
> recognized by elf utils. imo it's safer to keep it as EM_NONE
> until real number is assigned, but even after it's assigned it
> will take time to propagate that value. So for now I would try
> to find a solution keeping EM_NONE hack.
>

What about hacking ELF binary in memory?

1. load the object into memory;
2. twist the machine code to EM_X86_64;
3. load it using elf_begin;
4. return the twested elf memory image using libdwfl's find_elf callback.

Then libdw will recognise BPF's object file as a X86_64 object file. If 
required,
relocation sections can also be twisted in this way. Should not very 
hard since
we can only consider one relocation type.

Then let's start thinking how to introduce EM_BPF. We can rely on the 
hacking
until EM_BPF symbol reaches elfutils in perf.

What do you think?

Thank you.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-06  3:22                                                         ` [llvm-dev] " Alexei Starovoitov
@ 2015-08-06  4:35                                                           ` Wangnan (F)
  2015-08-06  6:55                                                             ` Alexei Starovoitov
  0 siblings, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-06  4:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, He Kuang, pi3orama, llvm-dev, linux-kernel



On 2015/8/6 11:22, Alexei Starovoitov wrote:
> On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
>> It doesn't work for me at first since in my llvm there's only
>> llvm.bpf.load.*.
>>
>> I think llvm.bpf.store.* belone to some patches you haven't posted yet?
> nope. only loads have special instructions ld_abs/ld_ind
> which are represented by these intrinsics.
> stores, so far, are done via single bpf_store_bytes() helper function.
>
>>> the typeid changing ids with order is surprising.
>>> I think the assertion in ExtractTypeInfo() is not hard.
>>> Just there were no such use cases. May be we can do something
>>> similar to what LowerIntrinsicCall() does and lower it differently
>>> in the backend.
>>>
>> But in backend can we still get type information? I thought type is
>> meaningful in frontend only, and backend behaviors is unable to affect
>> DWARF generation, right?
> why do we need to affect type generation? we just need to know dwarf
> type id in the backend, so we can emit it as a constant.
> I still think lowering eh_typeid_for differently may work.
> Like instead of doing
> GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
> getMachineFunction().getMMI().getTypeIDFor(GV)
> we can get dwarf type id from I.getArgOperand(0) if it's
> any pointer to struct type.

I have a bad news to tell:

#include <stdio.h>
struct my_str {
         int x;
         int y;
} __gv_my_str;
struct my_str __gv_my_str_;

struct my_str2 {
         int x;
         int y;
} __gv_my_str2;

int typeid(void *p) asm("llvm.eh.typeid.for");

int main()
{
         printf("%d\n", typeid(&__gv_my_str));
         printf("%d\n", typeid(&__gv_my_str_));
         printf("%d\n", typeid(&__gv_my_str2));
         return 0;
}

Compiled with clang into x86 executable, then:

$ ./a.out
3
2
1

See? I have two types but reported 3 IDs.

And here is the implementation of getTypeIDFor, in 
lib/CodeGen/MachineModuleInfo.cpp:

unsigned MachineModuleInfo::getTypeIDFor(const GlobalValue *TI) {
   for (unsigned i = 0, N = TypeInfos.size(); i != N; ++i)
     if (TypeInfos[i] == TI) return i + 1;

   TypeInfos.push_back(TI);
   return TypeInfos.size();
}

It only checks value in a stupid way.

Now the dwarf side becomes clear (see my other response), but the 
frontend may require
totally reconsidering.

Do you know someone in LLVM-dev who can help us?

Thank you.

> I'm not familiar with dwarf handling part of llvm, but feels possible.
>



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-06  4:31                                             ` Wangnan (F)
@ 2015-08-06  6:50                                               ` Alexei Starovoitov
  0 siblings, 0 replies; 53+ messages in thread
From: Alexei Starovoitov @ 2015-08-06  6:50 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: He Kuang, Alexei Starovoitov, pi3orama, llvm-dev, linux-kernel

On Thu, Aug 06, 2015 at 12:31:26PM +0800, Wangnan (F) wrote:
> 
> 
> What about hacking ELF binary in memory?
> 
> 1. load the object into memory;
> 2. twist the machine code to EM_X86_64;
> 3. load it using elf_begin;
> 4. return the twested elf memory image using libdwfl's find_elf callback.
> 
> Then libdw will recognise BPF's object file as a X86_64 object file. If
> required,
> relocation sections can also be twisted in this way. Should not very hard
> since
> we can only consider one relocation type.
> 
> Then let's start thinking how to introduce EM_BPF. We can rely on the
> hacking
> until EM_BPF symbol reaches elfutils in perf.
> 
> What do you think?

sounds crazy, but may work. let's try it :)


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-06  4:35                                                           ` Wangnan (F)
@ 2015-08-06  6:55                                                             ` Alexei Starovoitov
  0 siblings, 0 replies; 53+ messages in thread
From: Alexei Starovoitov @ 2015-08-06  6:55 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Alexei Starovoitov, He Kuang, pi3orama, llvm-dev, linux-kernel

On Thu, Aug 06, 2015 at 12:35:30PM +0800, Wangnan (F) wrote:
> 
> 
> On 2015/8/6 11:22, Alexei Starovoitov wrote:
> >On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
> >>It doesn't work for me at first since in my llvm there's only
> >>llvm.bpf.load.*.
> >>
> >>I think llvm.bpf.store.* belone to some patches you haven't posted yet?
> >nope. only loads have special instructions ld_abs/ld_ind
> >which are represented by these intrinsics.
> >stores, so far, are done via single bpf_store_bytes() helper function.
> >
> >>>the typeid changing ids with order is surprising.
> >>>I think the assertion in ExtractTypeInfo() is not hard.
> >>>Just there were no such use cases. May be we can do something
> >>>similar to what LowerIntrinsicCall() does and lower it differently
> >>>in the backend.
> >>>
> >>But in backend can we still get type information? I thought type is
> >>meaningful in frontend only, and backend behaviors is unable to affect
> >>DWARF generation, right?
> >why do we need to affect type generation? we just need to know dwarf
> >type id in the backend, so we can emit it as a constant.
> >I still think lowering eh_typeid_for differently may work.
> >Like instead of doing
> >GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
> >getMachineFunction().getMMI().getTypeIDFor(GV)
> >we can get dwarf type id from I.getArgOperand(0) if it's
> >any pointer to struct type.
> 
> I have a bad news to tell:
> 
> #include <stdio.h>
> struct my_str {
>         int x;
>         int y;
> } __gv_my_str;
> struct my_str __gv_my_str_;
> 
> struct my_str2 {
>         int x;
>         int y;
> } __gv_my_str2;
> 
> int typeid(void *p) asm("llvm.eh.typeid.for");
> 
> int main()
> {
>         printf("%d\n", typeid(&__gv_my_str));
>         printf("%d\n", typeid(&__gv_my_str_));
>         printf("%d\n", typeid(&__gv_my_str2));
>         return 0;
> }
> 
> Compiled with clang into x86 executable, then:
> 
> $ ./a.out
> 3
> 2
> 1
> 
> See? I have two types but reported 3 IDs.

that's expected. We don't have to use default lowering
of typeid_for with getTypeIDFor. bpf backend specific
lowering can be different, though in this case it's odd
that id for __gv_my_str and __gv_my_str_ are different.
__gv_my_str and __gv_my_str2 should be different.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-03 19:44                                           ` Alexei Starovoitov
  2015-08-04  9:01                                             ` Cc llvmdev: " Wangnan (F)
@ 2015-08-12  2:34                                             ` Wangnan (F)
  2015-08-12  4:57                                               ` [llvm-dev] " Alexei Starovoitov
  1 sibling, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-12  2:34 UTC (permalink / raw)
  To: Alexei Starovoitov, He Kuang, pi3orama; +Cc: linux-kernel, llvm-dev



On 2015/8/4 3:44, Alexei Starovoitov wrote:

[SNIP]
>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>> help me
>> send them to LLVM if you think my code is correct.
>
>
[SNIP]
> patch 2:
> do we really need to hack clang?
> Can you just define a function that aliases to intrinsic,
> like we do for ld_abs/ld_ind ?
> void bpf_store_half(void *skb, u64 off, u64 val) 
> asm("llvm.bpf.store.half");
> then no extra patches necessary.

Hi Alexei,

By two weeks researching, I have to give you a sad answer that:

target specific intrinsic is not work.

I tried target specific intrinsic. However, LLVM isolates backend and
frontend, and there's no way to pass language level type information
to backend code.

Think about a program like this:

struct strA { int a; }
struct strB { int b; }
int func() {
   struct strA a;
   struct strB b;

   a.a = 1;
   b.b = 2;
   bpf_output(gettype(a), &a);
   bpf_output(gettype(b), &b);
   return 0;
}

BPF backend can't (and needn't) tell the difference between local
variables a and b in theory. In LLVM implementation, it filters type
information out using ComputeValueVTs().  Please have a look at
SelectionDAGBuilder::visitIntrinsicCall in
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
SelectionDAGBuilder::visitTargetIntrinsic in the same file. in
visitTargetIntrinsic, ComputeValueVTs acts as a barrier which strips
type information out from CallInst ("I"), and leave SDValue and SDVTList
("Ops" and "VTs") to target code. SDValue and SDVTList are wrappers of
EVT and MVT, all information we concern won't be passed here.

I think now we have 2 choices:

1. Hacking into clang, implement target specific builtin function. Now I
    have worked out a ugly but workable patch which setup a builtin 
function:
    __builtin_bpf_typeid(), which accepts local or global variable then
    returns different constant for different types.

2. Implementing an LLVM intrinsic call (llvm.typeid), make it be 
processed in
    visitIntrinsicCall(). I think we can get something useful if it is 
processed
    with that function.

The next thing should be generating debug information to map type and
constants which issued by __builtin_bpf_typeid() or llvm.typeid. Now we
have a crazy idea that, if we limit the name of the structure to 8 bytes,
we can insert the name into a u64, then there would be no need to consider
type information in DWARF. For example, in the above sample code, gettype(a)
will issue 0x0000000041727473 because its type is "strA". What do you think?

Thank you.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-12  2:34                                             ` Wangnan (F)
@ 2015-08-12  4:57                                               ` Alexei Starovoitov
  2015-08-12  5:28                                                 ` Wangnan (F)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexei Starovoitov @ 2015-08-12  4:57 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Alexei Starovoitov, He Kuang, pi3orama, llvm-dev, linux-kernel

On Wed, Aug 12, 2015 at 10:34:43AM +0800, Wangnan (F) via llvm-dev wrote:
> 
> Think about a program like this:
> 
> struct strA { int a; }
> struct strB { int b; }
> int func() {
>   struct strA a;
>   struct strB b;
> 
>   a.a = 1;
>   b.b = 2;
>   bpf_output(gettype(a), &a);
>   bpf_output(gettype(b), &b);
>   return 0;
> }
> 
> BPF backend can't (and needn't) tell the difference between local
> variables a and b in theory. In LLVM implementation, it filters type
> information out using ComputeValueVTs().  Please have a look at
> SelectionDAGBuilder::visitIntrinsicCall in
> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
> SelectionDAGBuilder::visitTargetIntrinsic in the same file. in
> visitTargetIntrinsic, ComputeValueVTs acts as a barrier which strips
> type information out from CallInst ("I"), and leave SDValue and SDVTList
> ("Ops" and "VTs") to target code. SDValue and SDVTList are wrappers of
> EVT and MVT, all information we concern won't be passed here.
> 
> I think now we have 2 choices:
> 
> 1. Hacking into clang, implement target specific builtin function. Now I
>    have worked out a ugly but workable patch which setup a builtin function:
>    __builtin_bpf_typeid(), which accepts local or global variable then
>    returns different constant for different types.
> 
> 2. Implementing an LLVM intrinsic call (llvm.typeid), make it be processed
> in
>    visitIntrinsicCall(). I think we can get something useful if it is
> processed
>    with that function.

Yeah. You're right about pure target intrinsics.
I think llvm.typeid might work. imo it's cleaner than
doing it at clang level.

> The next thing should be generating debug information to map type and
> constants which issued by __builtin_bpf_typeid() or llvm.typeid. Now we
> have a crazy idea that, if we limit the name of the structure to 8 bytes,
> we can insert the name into a u64, then there would be no need to consider
> type information in DWARF. For example, in the above sample code, gettype(a)
> will issue 0x0000000041727473 because its type is "strA". What do you think?

that's way too hacky.
I was thinking when compiling we can keep llvm ir along with .o
instead of dwarf and extract type info from there.
dwarf has names and other things that we don't need. We only
care about actual field layout of the structs.
But it probably won't be easy to parse llvm ir on perf side
instead of dwarf.

btw, if you haven't looked at iovisor/bcc, there we're solving
similar problem differently. There we use clang rewriter, so all
structs fields are visible at this level, then we use bpf backend
in JIT mode and push bpf instructions into the kernel on the fly
completely skipping ELF and .o
For example in:
https://github.com/iovisor/bcc/blob/master/examples/distributed_bridge/tunnel.c
when you see
struct ethernet_t {
  unsigned long long  dst:48;
  unsigned long long  src:48;
  unsigned int        type:16;
} BPF_PACKET_HEADER;
struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
... ethernet->src ...
is recognized by clang rewriter and ->src is converted to a different
C code that is sent again into clang.
So there is no need to use dwarf or patch clang/llvm. clang rewriter
has all the info.
I'm not sure you can live with clang/llvm on the host where you
want to run the tracing bits, but if you can that's an easier option.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-12  4:57                                               ` [llvm-dev] " Alexei Starovoitov
@ 2015-08-12  5:28                                                 ` Wangnan (F)
  2015-08-12 13:15                                                   ` Brenden Blanco
  0 siblings, 1 reply; 53+ messages in thread
From: Wangnan (F) @ 2015-08-12  5:28 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, He Kuang, pi3orama, llvm-dev, linux-kernel



On 2015/8/12 12:57, Alexei Starovoitov wrote:
> On Wed, Aug 12, 2015 at 10:34:43AM +0800, Wangnan (F) via llvm-dev wrote:
>> Think about a program like this:
>>
>> struct strA { int a; }
>> struct strB { int b; }
>> int func() {
>>    struct strA a;
>>    struct strB b;
>>
>>    a.a = 1;
>>    b.b = 2;
>>    bpf_output(gettype(a), &a);
>>    bpf_output(gettype(b), &b);
>>    return 0;
>> }
>>
>> BPF backend can't (and needn't) tell the difference between local
>> variables a and b in theory. In LLVM implementation, it filters type
>> information out using ComputeValueVTs().  Please have a look at
>> SelectionDAGBuilder::visitIntrinsicCall in
>> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
>> SelectionDAGBuilder::visitTargetIntrinsic in the same file. in
>> visitTargetIntrinsic, ComputeValueVTs acts as a barrier which strips
>> type information out from CallInst ("I"), and leave SDValue and SDVTList
>> ("Ops" and "VTs") to target code. SDValue and SDVTList are wrappers of
>> EVT and MVT, all information we concern won't be passed here.
>>
>> I think now we have 2 choices:
>>
>> 1. Hacking into clang, implement target specific builtin function. Now I
>>     have worked out a ugly but workable patch which setup a builtin function:
>>     __builtin_bpf_typeid(), which accepts local or global variable then
>>     returns different constant for different types.
>>
>> 2. Implementing an LLVM intrinsic call (llvm.typeid), make it be processed
>> in
>>     visitIntrinsicCall(). I think we can get something useful if it is
>> processed
>>     with that function.
> Yeah. You're right about pure target intrinsics.
> I think llvm.typeid might work. imo it's cleaner than
> doing it at clang level.
>
>> The next thing should be generating debug information to map type and
>> constants which issued by __builtin_bpf_typeid() or llvm.typeid. Now we
>> have a crazy idea that, if we limit the name of the structure to 8 bytes,
>> we can insert the name into a u64, then there would be no need to consider
>> type information in DWARF. For example, in the above sample code, gettype(a)
>> will issue 0x0000000041727473 because its type is "strA". What do you think?
> that's way too hacky.
> I was thinking when compiling we can keep llvm ir along with .o
> instead of dwarf and extract type info from there.
> dwarf has names and other things that we don't need. We only
> care about actual field layout of the structs.
> But it probably won't be easy to parse llvm ir on perf side
> instead of dwarf.

Shipping both llvm IR and .o to perf makes it harder to use. I'm
not sure whether it is a good idea. If we are unable to encode the
structure using a u64, let's still dig into dwarf.

We have another idea that we can utilize dwarf's existing feature.
For example, when __buildin_bpf_typeid() get called, define an enumerate
type in dwarf info, so you'll find:

  <1><2a>: Abbrev Number: 2 (DW_TAG_enumeration_type)
     <2b>   DW_AT_name        : (indirect string, offset: 0xec): TYPEINFO
     <2f>   DW_AT_byte_size   : 4
     <30>   DW_AT_decl_file   : 1
     <31>   DW_AT_decl_line   : 3
  <2><32>: Abbrev Number: 3 (DW_TAG_enumerator)
     <33>   DW_AT_name        : (indirect string, offset: 0xcc): 
__typeinfo_strA
     <37>   DW_AT_const_value : 2
  <2><38>: Abbrev Number: 3 (DW_TAG_enumerator)
     <39>   DW_AT_name        : (indirect string, offset: 0xdc): 
__typeinfo_strB
     <3d>   DW_AT_const_value : 3

or this:

  <3><54>: Abbrev Number: 4 (DW_TAG_variable)
     <55>   DW_AT_const_value : 2
     <66>   DW_AT_name        : (indirect string, offset: 0x1e): 
__typeinfo_strA
     <6a>   DW_AT_decl_file   : 1
     <6b>   DW_AT_decl_line   : 29
     <6c>   DW_AT_type        : <0x72>

then from DW_AT_name and DW_AT_const_value we can do the mapping. 
Drawback is that
all __typeinfo_ prefixed names become reserved.


> btw, if you haven't looked at iovisor/bcc, there we're solving
> similar problem differently. There we use clang rewriter, so all
> structs fields are visible at this level, then we use bpf backend
> in JIT mode and push bpf instructions into the kernel on the fly
> completely skipping ELF and .o
> For example in:
> https://github.com/iovisor/bcc/blob/master/examples/distributed_bridge/tunnel.c
> when you see
> struct ethernet_t {
>    unsigned long long  dst:48;
>    unsigned long long  src:48;
>    unsigned int        type:16;
> } BPF_PACKET_HEADER;
> struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
> ... ethernet->src ...
> is recognized by clang rewriter and ->src is converted to a different
> C code that is sent again into clang.
> So there is no need to use dwarf or patch clang/llvm. clang rewriter
> has all the info.

Could you please give us further information about your clang rewriter?
I guess you need a new .so when injecting those code into kernel?

> I'm not sure you can live with clang/llvm on the host where you
> want to run the tracing bits, but if you can that's an easier option.
>

I'm not sure. Our target platform should be embedded devices like 
smartphone.
Bringing full clang/llvm environment there is not acceptable.

Thank you.



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-12  5:28                                                 ` Wangnan (F)
@ 2015-08-12 13:15                                                   ` Brenden Blanco
  2015-08-13  6:24                                                     ` Wangnan (F)
  0 siblings, 1 reply; 53+ messages in thread
From: Brenden Blanco @ 2015-08-12 13:15 UTC (permalink / raw)
  To: Wangnan (F), Alexei Starovoitov
  Cc: llvm-dev, linux-kernel, pi3orama, Alexei Starovoitov

Hi Wangnan, I've been authoring the BCC development, so I'll answer
those specific questions.
>
>
> Could you please give us further information about your clang rewriter?
> I guess you need a new .so when injecting those code into kernel?

The rewriter runs all of its passes in a single process, creating no
files on disk and having no external dependencies in terms of
toolchain.
1. Entry point: bpf_module_create() - C API call to create module, can
take filename or directly a c string with the full contents of the
program
2. Convert contents into a clang memory buffer
3. Set up a clang driver::CompilerInvocation in the style of the clang
interpreter example
4. Run a rewriter pass over the memory buffer file, annotating and/or
doing BPF specific magic on the input source
 a. Open BPF maps with a call to bpf_create_map directly
 b. Convert references to map operations with the specific FD of the new map
 c. Convert arguments to bpf_probe_read calls as needed
 d. Collect the externed function names to avoid section() hack in the language
5. Re-run the CompilerInvocation on the modified sources
6. JIT the llvm::Module to bpf arch
7. Load the resulting in-memory ".o" to bpf_prog_load, keeping the FD
alive in the compiler process
8. Attach the FD as necessary to perf events, socket, tc, etc.
9. goto 1

The above steps are captured in the BCC github repo in src/cc, with
the clang specific bits inside of the frontends/clang subdirectory.

> I'm not sure. Our target platform should be embedded devices like
> smartphone.
> Bringing full clang/llvm environment there is not acceptable.

The artifact from the build process of BCC is a shared library, which
has the clang/llvm .a embedded within them. It is not yet a single
binary, but not unfeasible to make it so. The clang toolchain itself
does not need to exist on the target. I have not attempted to
cross-compile BCC to any architecture, currently x86_64 only.

If you have more BCC specific questions not involving clang/llvm,
perhaps you can ping Alexei/myself off of the llvm-dev list, in case
this discussion is not relevant to them.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
  2015-08-12 13:15                                                   ` Brenden Blanco
@ 2015-08-13  6:24                                                     ` Wangnan (F)
  0 siblings, 0 replies; 53+ messages in thread
From: Wangnan (F) @ 2015-08-13  6:24 UTC (permalink / raw)
  To: Brenden Blanco, Alexei Starovoitov
  Cc: llvm-dev, linux-kernel, pi3orama, Alexei Starovoitov, hekuang 00206996

Thank you for your reply.

Add He Kuang to CC list.

On 2015/8/12 21:15, Brenden Blanco wrote:
> Hi Wangnan, I've been authoring the BCC development, so I'll answer
> those specific questions.
>>
>> Could you please give us further information about your clang rewriter?
>> I guess you need a new .so when injecting those code into kernel?
> The rewriter runs all of its passes in a single process, creating no
> files on disk and having no external dependencies in terms of
> toolchain.
> 1. Entry point: bpf_module_create() - C API call to create module, can
> take filename or directly a c string with the full contents of the
> program
> 2. Convert contents into a clang memory buffer
> 3. Set up a clang driver::CompilerInvocation in the style of the clang
> interpreter example
> 4. Run a rewriter pass over the memory buffer file, annotating and/or
> doing BPF specific magic on the input source
>   a. Open BPF maps with a call to bpf_create_map directly
>   b. Convert references to map operations with the specific FD of the new map
>   c. Convert arguments to bpf_probe_read calls as needed
>   d. Collect the externed function names to avoid section() hack in the language
> 5. Re-run the CompilerInvocation on the modified sources
> 6. JIT the llvm::Module to bpf arch
> 7. Load the resulting in-memory ".o" to bpf_prog_load, keeping the FD
> alive in the compiler process
> 8. Attach the FD as necessary to perf events, socket, tc, etc.
> 9. goto 1
>
> The above steps are captured in the BCC github repo in src/cc, with
> the clang specific bits inside of the frontends/clang subdirectory.
>
>> I'm not sure. Our target platform should be embedded devices like
>> smartphone.
>> Bringing full clang/llvm environment there is not acceptable.
> The artifact from the build process of BCC is a shared library, which
> has the clang/llvm .a embedded within them. It is not yet a single
> binary, but not unfeasible to make it so. The clang toolchain itself
> does not need to exist on the target. I have not attempted to
> cross-compile BCC to any architecture, currently x86_64 only.
>
> If you have more BCC specific questions not involving clang/llvm,
> perhaps you can ping Alexei/myself off of the llvm-dev list, in case
> this discussion is not relevant to them.



^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2015-08-13  6:24 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-10 10:03 [RFC PATCH v4 0/3] Make eBPF programs output data to perf event He Kuang
2015-07-10 10:03 ` [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size He Kuang
2015-07-17 14:32   ` Steven Rostedt
2015-07-17 17:24     ` Sara Rostedt
2015-07-17 18:13     ` Steven Rostedt
2015-07-23 19:36       ` Alex Bennée
2015-07-10 10:03 ` [RFC PATCH v4 2/3] tools lib traceevent: Add function to get dynamic arrays length He Kuang
2015-07-10 10:03 ` [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event He Kuang
2015-07-10 22:10   ` Alexei Starovoitov
2015-07-13  4:36     ` He Kuang
2015-07-13 13:52       ` Namhyung Kim
2015-07-13 14:01         ` pi3orama
2015-07-13 14:09           ` Namhyung Kim
2015-07-13 14:29             ` pi3orama
2015-07-14  1:43               ` Alexei Starovoitov
2015-07-14 11:54                 ` He Kuang
2015-07-17  4:11                   ` Alexei Starovoitov
2015-07-17  4:14                     ` Wangnan (F)
2015-07-17  4:27                       ` Alexei Starovoitov
2015-07-23 11:54                         ` He Kuang
2015-07-23 20:49                           ` llvm bpf debug info. " Alexei Starovoitov
2015-07-24  3:20                             ` Alexei Starovoitov
2015-07-24  4:16                               ` He Kuang
2015-07-25 10:04                                 ` He Kuang
2015-07-28  2:18                                   ` Alexei Starovoitov
2015-07-29  9:38                                     ` He Kuang
2015-07-29 17:13                                       ` Alexei Starovoitov
2015-07-29 20:00                                         ` pi3orama
2015-07-29 22:20                                           ` Alexei Starovoitov
2015-07-31 10:18                                         ` Wangnan (F)
2015-07-31 10:20                                           ` [LLVM PATCH] BPF: add FRAMEADDR support Wang Nan
2015-07-31 10:21                                           ` [LLVM CLANG PATCH] BPF: add __builtin_bpf_typeid() Wang Nan
2015-07-31 10:48                                           ` llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event pi3orama
2015-08-03 19:44                                           ` Alexei Starovoitov
2015-08-04  9:01                                             ` Cc llvmdev: " Wangnan (F)
2015-08-05  1:58                                               ` Wangnan (F)
2015-08-05  2:05                                                 ` Wangnan (F)
2015-08-05  6:51                                                   ` [LLVMdev] " Wangnan (F)
2015-08-05  7:11                                                     ` Alexei Starovoitov
2015-08-05  8:28                                                       ` Wangnan (F)
2015-08-06  3:22                                                         ` [llvm-dev] " Alexei Starovoitov
2015-08-06  4:35                                                           ` Wangnan (F)
2015-08-06  6:55                                                             ` Alexei Starovoitov
2015-08-12  2:34                                             ` Wangnan (F)
2015-08-12  4:57                                               ` [llvm-dev] " Alexei Starovoitov
2015-08-12  5:28                                                 ` Wangnan (F)
2015-08-12 13:15                                                   ` Brenden Blanco
2015-08-13  6:24                                                     ` Wangnan (F)
2015-08-05  8:59                                         ` [LLVMdev] Cc llvmdev: " He Kuang
2015-08-06  3:41                                           ` [llvm-dev] " Alexei Starovoitov
2015-08-06  4:31                                             ` Wangnan (F)
2015-08-06  6:50                                               ` Alexei Starovoitov
2015-07-13  8:29   ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).