* [PATCH v4 bpf-next 0/4] bpf: fix stackmap on perf_events with PEBS
@ 2020-07-22 18:42 Song Liu
  2020-07-22 18:42 ` [PATCH v4 bpf-next 1/4] bpf: separate bpf_get_[stack|stackid] for perf events BPF Song Liu
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Song Liu @ 2020-07-22 18:42 UTC (permalink / raw)
  To: linux-kernel, bpf, netdev
  Cc: ast, daniel, kernel-team, john.fastabend, kpsingh, brouer,
	peterz, Song Liu

Calling get_perf_callchain() on perf_events from PEBS entries may cause
unwinder errors. To fix this issue, the perf subsystem fetches the callchain
early and marks such perf_events with __PERF_SAMPLE_CALLCHAIN_EARLY.
A similar issue exists when a BPF program calls get_perf_callchain() via
helper functions. For more information about this issue, please refer to
the discussion in [1].

This set fixes the issue by introducing separate helper protos,
bpf_get_stackid_pe and bpf_get_stack_pe, which use the callchain already
attached to perf_sample_data.

[1] https://lore.kernel.org/lkml/ED7B9430-6489-4260-B3C5-9CFA2E3AA87A@fb.com/
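
For context, here is a minimal sketch (not part of this set) of the kind of
perf_event BPF program that exercises this path; the map sizing, program
name, and section layout are illustrative assumptions modeled on the
selftests.

// SPDX-License-Identifier: GPL-2.0
/* Illustrative sketch only; map sizing and names are assumptions. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define MAX_STACK_DEPTH 127

struct {
	__uint(type, BPF_MAP_TYPE_STACK_TRACE);
	__uint(max_entries, 16384);
	__type(key, __u32);
	__uint(value_size, MAX_STACK_DEPTH * sizeof(__u64));
} stackmap SEC(".maps");

SEC("perf_event")
int oncpu(void *ctx)
{
	/* On a PEBS (precise_ip) event, this is the helper call that could
	 * previously trigger the unwinder problem described above.
	 */
	bpf_get_stackid(ctx, &stackmap, 0);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";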

Changes v3 => v4:
1. Fix error check logic in bpf_get_stackid_pe and bpf_get_stack_pe.
   (Alexei)
2. Do not allow attaching BPF programs that use bpf_get_[stack|stackid] to
   perf_events with precise_ip > 0 but without a proper callchain. (Alexei)
3. Add selftest get_stackid_cannot_attach.

Changes v2 => v3:
1. Fix handling of stackmap skip field. (Andrii)
2. Simplify the code in a few places. (Andrii)

Changes v1 => v2:
1. Simplify the design and avoid introducing new helper function. (Andrii)

Song Liu (4):
  bpf: separate bpf_get_[stack|stackid] for perf events BPF
  bpf: fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot
    work
  selftests/bpf: add callchain_stackid
  selftests/bpf: add get_stackid_cannot_attach

 include/linux/bpf.h                           |   2 +
 include/linux/filter.h                        |   3 +-
 kernel/bpf/stackmap.c                         | 184 ++++++++++++++++--
 kernel/bpf/verifier.c                         |   3 +
 kernel/events/core.c                          |  18 ++
 kernel/trace/bpf_trace.c                      |   4 +-
 .../prog_tests/get_stackid_cannot_attach.c    |  91 +++++++++
 .../bpf/prog_tests/perf_event_stackmap.c      | 116 +++++++++++
 .../selftests/bpf/progs/perf_event_stackmap.c |  59 ++++++
 9 files changed, 459 insertions(+), 21 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/perf_event_stackmap.c
 create mode 100644 tools/testing/selftests/bpf/progs/perf_event_stackmap.c

--
2.24.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v4 bpf-next 1/4] bpf: separate bpf_get_[stack|stackid] for perf events BPF
  2020-07-22 18:42 [PATCH v4 bpf-next 0/4] bpf: fix stackmap on perf_events with PEBS Song Liu
@ 2020-07-22 18:42 ` Song Liu
  2020-07-22 18:42 ` [PATCH v4 bpf-next 2/4] bpf: fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot work Song Liu
  2020-07-22 18:42 ` [PATCH v4 bpf-next 4/4] selftests/bpf: add get_stackid_cannot_attach Song Liu
  2 siblings, 0 replies; 7+ messages in thread
From: Song Liu @ 2020-07-22 18:42 UTC (permalink / raw)
  To: linux-kernel, bpf, netdev
  Cc: ast, daniel, kernel-team, john.fastabend, kpsingh, brouer,
	peterz, Song Liu

Calling get_perf_callchain() on perf_events from PEBS entries may cause
unwinder errors. To fix this issue, the callchain is fetched early. Such
perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.

Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
also cause unwinder errors. To fix this, add separate versions of these
two helpers, bpf_get_[stack|stackid]_pe. These two helpers use the callchain
in bpf_perf_event_data_kern->data->callchain.
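
For background, a rough user-space sketch (illustrative only, not part of
this patch) of opening a PEBS event whose callchain is fetched early; the
counter choice, precise_ip level, and sample period are assumptions:

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_pebs_event_with_callchain(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.precise_ip = 2;	/* request PEBS */
	/* Requesting PERF_SAMPLE_CALLCHAIN on a precise_ip event makes perf
	 * fetch the callchain early; the _pe helpers then read it from
	 * ctx->data->callchain instead of unwinding again.
	 */
	attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_CALLCHAIN;
	attr.sample_period = 100000;

	return syscall(__NR_perf_event_open, &attr, -1 /* pid */,
		       0 /* cpu */, -1 /* group fd */, 0 /* flags */);
}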

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/bpf.h      |   2 +
 kernel/bpf/stackmap.c    | 184 +++++++++++++++++++++++++++++++++++----
 kernel/trace/bpf_trace.c |   4 +-
 3 files changed, 170 insertions(+), 20 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index bae557ff2da8b..70e34a0ebbac6 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1644,6 +1644,8 @@ extern const struct bpf_func_proto bpf_get_current_comm_proto;
 extern const struct bpf_func_proto bpf_get_stackid_proto;
 extern const struct bpf_func_proto bpf_get_stack_proto;
 extern const struct bpf_func_proto bpf_get_task_stack_proto;
+extern const struct bpf_func_proto bpf_get_stackid_proto_pe;
+extern const struct bpf_func_proto bpf_get_stack_proto_pe;
 extern const struct bpf_func_proto bpf_sock_map_update_proto;
 extern const struct bpf_func_proto bpf_sock_hash_update_proto;
 extern const struct bpf_func_proto bpf_get_current_cgroup_id_proto;
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 48d8e739975fa..5beb2f8c23da1 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -4,6 +4,7 @@
 #include <linux/bpf.h>
 #include <linux/jhash.h>
 #include <linux/filter.h>
+#include <linux/kernel.h>
 #include <linux/stacktrace.h>
 #include <linux/perf_event.h>
 #include <linux/elf.h>
@@ -387,11 +388,10 @@ get_callchain_entry_for_task(struct task_struct *task, u32 init_nr)
 #endif
 }
 
-BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
-	   u64, flags)
+static long __bpf_get_stackid(struct bpf_map *map,
+			      struct perf_callchain_entry *trace, u64 flags)
 {
 	struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map);
-	struct perf_callchain_entry *trace;
 	struct stack_map_bucket *bucket, *new_bucket, *old_bucket;
 	u32 max_depth = map->value_size / stack_map_data_size(map);
 	/* stack_map_alloc() checks that max_depth <= sysctl_perf_event_max_stack */
@@ -399,21 +399,9 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
 	u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
 	u32 hash, id, trace_nr, trace_len;
 	bool user = flags & BPF_F_USER_STACK;
-	bool kernel = !user;
 	u64 *ips;
 	bool hash_matches;
 
-	if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK |
-			       BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID)))
-		return -EINVAL;
-
-	trace = get_perf_callchain(regs, init_nr, kernel, user,
-				   sysctl_perf_event_max_stack, false, false);
-
-	if (unlikely(!trace))
-		/* couldn't fetch the stack trace */
-		return -EFAULT;
-
 	/* get_perf_callchain() guarantees that trace->nr >= init_nr
 	 * and trace-nr <= sysctl_perf_event_max_stack, so trace_nr <= max_depth
 	 */
@@ -478,6 +466,30 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
 	return id;
 }
 
+BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
+	   u64, flags)
+{
+	u32 max_depth = map->value_size / stack_map_data_size(map);
+	/* stack_map_alloc() checks that max_depth <= sysctl_perf_event_max_stack */
+	u32 init_nr = sysctl_perf_event_max_stack - max_depth;
+	bool user = flags & BPF_F_USER_STACK;
+	struct perf_callchain_entry *trace;
+	bool kernel = !user;
+
+	if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK |
+			       BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID)))
+		return -EINVAL;
+
+	trace = get_perf_callchain(regs, init_nr, kernel, user,
+				   sysctl_perf_event_max_stack, false, false);
+
+	if (unlikely(!trace))
+		/* couldn't fetch the stack trace */
+		return -EFAULT;
+
+	return __bpf_get_stackid(map, trace, flags);
+}
+
 const struct bpf_func_proto bpf_get_stackid_proto = {
 	.func		= bpf_get_stackid,
 	.gpl_only	= true,
@@ -487,7 +499,77 @@ const struct bpf_func_proto bpf_get_stackid_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
+static __u64 count_kernel_ip(struct perf_callchain_entry *trace)
+{
+	__u64 nr_kernel = 0;
+
+	while (nr_kernel < trace->nr) {
+		if (trace->ip[nr_kernel] == PERF_CONTEXT_USER)
+			break;
+		nr_kernel++;
+	}
+	return nr_kernel;
+}
+
+BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx,
+	   struct bpf_map *, map, u64, flags)
+{
+	struct perf_event *event = ctx->event;
+	struct perf_callchain_entry *trace;
+	bool kernel, user;
+	__u64 nr_kernel;
+	int ret;
+
+	/* perf_sample_data doesn't have callchain, use bpf_get_stackid */
+	if (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
+		return bpf_get_stackid((unsigned long)(ctx->regs),
+				       (unsigned long) map, flags, 0, 0);
+
+	if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK |
+			       BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID)))
+		return -EINVAL;
+
+	user = flags & BPF_F_USER_STACK;
+	kernel = !user;
+
+	trace = ctx->data->callchain;
+	if (unlikely(!trace))
+		return -EFAULT;
+
+	nr_kernel = count_kernel_ip(trace);
+
+	if (kernel) {
+		__u64 nr = trace->nr;
+
+		trace->nr = nr_kernel;
+		ret = __bpf_get_stackid(map, trace, flags);
+
+		/* restore nr */
+		trace->nr = nr;
+	} else { /* user */
+		u64 skip = flags & BPF_F_SKIP_FIELD_MASK;
+
+		skip += nr_kernel;
+		if (skip > BPF_F_SKIP_FIELD_MASK)
+			return -EFAULT;
+
+		flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip;
+		ret = __bpf_get_stackid(map, trace, flags);
+	}
+	return ret;
+}
+
+const struct bpf_func_proto bpf_get_stackid_proto_pe = {
+	.func		= bpf_get_stackid_pe,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_CONST_MAP_PTR,
+	.arg3_type	= ARG_ANYTHING,
+};
+
 static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
+			    struct perf_callchain_entry *trace_in,
 			    void *buf, u32 size, u64 flags)
 {
 	u32 init_nr, trace_nr, copy_len, elem_size, num_elem;
@@ -520,7 +602,9 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
 	else
 		init_nr = sysctl_perf_event_max_stack - num_elem;
 
-	if (kernel && task)
+	if (trace_in)
+		trace = trace_in;
+	else if (kernel && task)
 		trace = get_callchain_entry_for_task(task, init_nr);
 	else
 		trace = get_perf_callchain(regs, init_nr, kernel, user,
@@ -556,7 +640,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
 BPF_CALL_4(bpf_get_stack, struct pt_regs *, regs, void *, buf, u32, size,
 	   u64, flags)
 {
-	return __bpf_get_stack(regs, NULL, buf, size, flags);
+	return __bpf_get_stack(regs, NULL, NULL, buf, size, flags);
 }
 
 const struct bpf_func_proto bpf_get_stack_proto = {
@@ -574,7 +658,7 @@ BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf,
 {
 	struct pt_regs *regs = task_pt_regs(task);
 
-	return __bpf_get_stack(regs, task, buf, size, flags);
+	return __bpf_get_stack(regs, task, NULL, buf, size, flags);
 }
 
 BTF_ID_LIST(bpf_get_task_stack_btf_ids)
@@ -591,6 +675,70 @@ const struct bpf_func_proto bpf_get_task_stack_proto = {
 	.btf_id		= bpf_get_task_stack_btf_ids,
 };
 
+BPF_CALL_4(bpf_get_stack_pe, struct bpf_perf_event_data_kern *, ctx,
+	   void *, buf, u32, size, u64, flags)
+{
+	struct perf_event *event = ctx->event;
+	struct perf_callchain_entry *trace;
+	bool kernel, user;
+	int err = -EINVAL;
+	__u64 nr_kernel;
+
+	if (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
+		return __bpf_get_stack(ctx->regs, NULL, NULL, buf, size, flags);
+
+	if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK |
+			       BPF_F_USER_BUILD_ID)))
+		goto clear;
+
+	user = flags & BPF_F_USER_STACK;
+	kernel = !user;
+
+	err = -EFAULT;
+	trace = ctx->data->callchain;
+	if (unlikely(!trace))
+		goto clear;
+
+	nr_kernel = count_kernel_ip(trace);
+
+	if (kernel) {
+		__u64 nr = trace->nr;
+
+		trace->nr = nr_kernel;
+		err = __bpf_get_stack(ctx->regs, NULL, trace, buf,
+				      size, flags);
+
+		/* restore nr */
+		trace->nr = nr;
+	} else { /* user */
+		u64 skip = flags & BPF_F_SKIP_FIELD_MASK;
+
+		skip += nr_kernel;
+		if (skip > BPF_F_SKIP_FIELD_MASK)
+			goto clear;
+
+		flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip;
+		err = __bpf_get_stack(ctx->regs, NULL, trace, buf,
+				      size, flags);
+	}
+	return err;
+
+clear:
+	memset(buf, 0, size);
+	return err;
+
+}
+
+const struct bpf_func_proto bpf_get_stack_proto_pe = {
+	.func		= bpf_get_stack_pe,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_PTR_TO_UNINIT_MEM,
+	.arg3_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg4_type	= ARG_ANYTHING,
+};
+
 /* Called from eBPF program */
 static void *stack_map_lookup_elem(struct bpf_map *map, void *key)
 {
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3cc0dcb60ca20..cb91ef902cc43 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1411,9 +1411,9 @@ pe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_perf_event_output:
 		return &bpf_perf_event_output_proto_tp;
 	case BPF_FUNC_get_stackid:
-		return &bpf_get_stackid_proto_tp;
+		return &bpf_get_stackid_proto_pe;
 	case BPF_FUNC_get_stack:
-		return &bpf_get_stack_proto_tp;
+		return &bpf_get_stack_proto_pe;
 	case BPF_FUNC_perf_prog_read_value:
 		return &bpf_perf_prog_read_value_proto;
 	case BPF_FUNC_read_branch_records:
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v4 bpf-next 2/4] bpf: fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot work
  2020-07-22 18:42 [PATCH v4 bpf-next 0/4] bpf: fix stackmap on perf_events with PEBS Song Liu
  2020-07-22 18:42 ` [PATCH v4 bpf-next 1/4] bpf: separate bpf_get_[stack|stackid] for perf events BPF Song Liu
@ 2020-07-22 18:42 ` Song Liu
  2020-07-23  5:55   ` Alexei Starovoitov
  2020-07-22 18:42 ` [PATCH v4 bpf-next 4/4] selftests/bpf: add get_stackid_cannot_attach Song Liu
  2 siblings, 1 reply; 7+ messages in thread
From: Song Liu @ 2020-07-22 18:42 UTC (permalink / raw)
  To: linux-kernel, bpf, netdev
  Cc: ast, daniel, kernel-team, john.fastabend, kpsingh, brouer,
	peterz, Song Liu

bpf_get_[stack|stackid] on perf_events with precise_ip uses the callchain
attached to perf_sample_data. If this callchain is not present, do not
allow attaching a BPF program that calls bpf_get_[stack|stackid] to such
an event.
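
A sketch of the user-visible effect (illustrative only, not part of this
patch; the helper name and error handling are assumptions): the attach via
PERF_EVENT_IOC_SET_BPF is expected to fail for such events.

#include <errno.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <sys/ioctl.h>

/* pmu_fd: a precise_ip event opened without PERF_SAMPLE_CALLCHAIN (or with
 * exclude_callchain_*); prog_fd: a loaded program that calls
 * bpf_get_[stack|stackid]. With this patch the ioctl is expected to fail
 * instead of permitting an attachment that could later hit the unwinder.
 */
static int attach_stack_prog(int pmu_fd, int prog_fd)
{
	if (ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd) < 0) {
		fprintf(stderr, "PERF_EVENT_IOC_SET_BPF failed: errno %d\n",
			errno);
		return -1;
	}
	return ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
}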

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/filter.h |  3 ++-
 kernel/bpf/verifier.c  |  3 +++
 kernel/events/core.c   | 18 ++++++++++++++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 8252572db918f..582262017a7dd 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -534,7 +534,8 @@ struct bpf_prog {
 				is_func:1,	/* program is a bpf function */
 				kprobe_override:1, /* Do we override a kprobe? */
 				has_callchain_buf:1, /* callchain buffer allocated? */
-				enforce_expected_attach_type:1; /* Enforce expected_attach_type checking at attach time */
+				enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
+				call_get_stack:1; /* Do we call bpf_get_stack() or bpf_get_stackid() */
 	enum bpf_prog_type	type;		/* Type of BPF program */
 	enum bpf_attach_type	expected_attach_type; /* For some prog types */
 	u32			len;		/* Number of filter blocks */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9a6703bc3f36f..41c9517a505ff 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4887,6 +4887,9 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 		env->prog->has_callchain_buf = true;
 	}
 
+	if (func_id == BPF_FUNC_get_stackid || func_id == BPF_FUNC_get_stack)
+		env->prog->call_get_stack = true;
+
 	if (changes_data)
 		clear_all_pkt_pointers(env);
 	return 0;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 856d98c36f562..f77d009fcce95 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9544,6 +9544,24 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
 	if (IS_ERR(prog))
 		return PTR_ERR(prog);
 
+	if (event->attr.precise_ip &&
+	    prog->call_get_stack &&
+	    (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) ||
+	     event->attr.exclude_callchain_kernel ||
+	     event->attr.exclude_callchain_user)) {
+		/*
+		 * On perf_event with precise_ip, calling bpf_get_stack()
+		 * may trigger unwinder warnings and occasional crashes.
+		 * bpf_get_[stack|stackid] works around this issue by using
+		 * the callchain attached to perf_sample_data. If the
+		 * perf_event does not have a full (kernel and user)
+		 * callchain attached to perf_sample_data, do not allow
+		 * attaching a BPF program that calls bpf_get_[stack|stackid].
+		 */
+		bpf_prog_put(prog);
+		return -EINVAL;
+	}
+
 	event->prog = prog;
 	event->orig_overflow_handler = READ_ONCE(event->overflow_handler);
 	WRITE_ONCE(event->overflow_handler, bpf_overflow_handler);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v4 bpf-next 4/4] selftests/bpf: add get_stackid_cannot_attach
  2020-07-22 18:42 [PATCH v4 bpf-next 0/4] bpf: fix stackmap on perf_events with PEBS Song Liu
  2020-07-22 18:42 ` [PATCH v4 bpf-next 1/4] bpf: separate bpf_get_[stack|stackid] for perf events BPF Song Liu
  2020-07-22 18:42 ` [PATCH v4 bpf-next 2/4] bpf: fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot work Song Liu
@ 2020-07-22 18:42 ` Song Liu
  2 siblings, 0 replies; 7+ messages in thread
From: Song Liu @ 2020-07-22 18:42 UTC (permalink / raw)
  To: linux-kernel, bpf, netdev
  Cc: ast, daniel, kernel-team, john.fastabend, kpsingh, brouer,
	peterz, Song Liu

This test confirms that a BPF program that calls bpf_get_stackid() cannot
attach to a perf_event with precise_ip > 0 but without PERF_SAMPLE_CALLCHAIN,
and cannot attach if the perf_event has exclude_callchain_kernel set.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 .../prog_tests/get_stackid_cannot_attach.c    | 91 +++++++++++++++++++
 1 file changed, 91 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c

diff --git a/tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c b/tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c
new file mode 100644
index 0000000000000..f13149d279bc9
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/get_stackid_cannot_attach.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Facebook
+#include <test_progs.h>
+#include "test_stacktrace_build_id.skel.h"
+
+void test_get_stackid_cannot_attach(void)
+{
+	struct perf_event_attr attr = {
+		/* .type = PERF_TYPE_SOFTWARE, */
+		.type = PERF_TYPE_HARDWARE,
+		.config = PERF_COUNT_HW_CPU_CYCLES,
+		.precise_ip = 1,
+		.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK,
+		.branch_sample_type = PERF_SAMPLE_BRANCH_USER |
+			PERF_SAMPLE_BRANCH_NO_FLAGS |
+			PERF_SAMPLE_BRANCH_NO_CYCLES |
+			PERF_SAMPLE_BRANCH_CALL_STACK,
+		.sample_period = 5000,
+		.size = sizeof(struct perf_event_attr),
+	};
+	struct test_stacktrace_build_id *skel;
+	__u32 duration = 0;
+	int pmu_fd, err;
+
+	skel = test_stacktrace_build_id__open();
+	if (CHECK(!skel, "skel_open", "skeleton open failed\n"))
+		return;
+
+	/* override program type */
+	bpf_program__set_perf_event(skel->progs.oncpu);
+
+	err = test_stacktrace_build_id__load(skel);
+	if (CHECK(err, "skel_load", "skeleton load failed: %d\n", err))
+		goto cleanup;
+
+	pmu_fd = syscall(__NR_perf_event_open, &attr, -1 /* pid */,
+			 0 /* cpu 0 */, -1 /* group id */,
+			 0 /* flags */);
+	if (pmu_fd < 0 && errno == ENOENT) {
+		printf("%s:SKIP:cannot open PERF_COUNT_HW_CPU_CYCLES with precise_ip > 0\n",
+		       __func__);
+		test__skip();
+		goto cleanup;
+	}
+	if (CHECK(pmu_fd < 0, "perf_event_open", "err %d errno %d\n",
+		  pmu_fd, errno))
+		goto cleanup;
+
+	skel->links.oncpu = bpf_program__attach_perf_event(skel->progs.oncpu,
+							   pmu_fd);
+	CHECK(!IS_ERR(skel->links.oncpu), "attach_perf_event_no_callchain",
+	      "should have failed\n");
+	close(pmu_fd);
+
+	/* add PERF_SAMPLE_CALLCHAIN, attach should succeed */
+	attr.sample_type |= PERF_SAMPLE_CALLCHAIN;
+
+	pmu_fd = syscall(__NR_perf_event_open, &attr, -1 /* pid */,
+			 0 /* cpu 0 */, -1 /* group id */,
+			 0 /* flags */);
+
+	if (CHECK(pmu_fd < 0, "perf_event_open", "err %d errno %d\n",
+		  pmu_fd, errno))
+		goto cleanup;
+
+	skel->links.oncpu = bpf_program__attach_perf_event(skel->progs.oncpu,
+							   pmu_fd);
+	CHECK(IS_ERR(skel->links.oncpu), "attach_perf_event_callchain",
+	      "err: %ld\n", PTR_ERR(skel->links.oncpu));
+	close(pmu_fd);
+
+	/* add exclude_callchain_kernel, attach should fail */
+	attr.exclude_callchain_kernel = 1;
+
+	pmu_fd = syscall(__NR_perf_event_open, &attr, -1 /* pid */,
+			 0 /* cpu 0 */, -1 /* group id */,
+			 0 /* flags */);
+
+	if (CHECK(pmu_fd < 0, "perf_event_open", "err %d errno %d\n",
+		  pmu_fd, errno))
+		goto cleanup;
+
+	skel->links.oncpu = bpf_program__attach_perf_event(skel->progs.oncpu,
+							   pmu_fd);
+	CHECK(!IS_ERR(skel->links.oncpu), "attach_perf_event_exclude_callchain_kernel",
+	      "should have failed\n");
+	close(pmu_fd);
+
+cleanup:
+	test_stacktrace_build_id__destroy(skel);
+}
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 bpf-next 2/4] bpf: fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot work
  2020-07-22 18:42 ` [PATCH v4 bpf-next 2/4] bpf: fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot work Song Liu
@ 2020-07-23  5:55   ` Alexei Starovoitov
  2020-07-23  6:20     ` Song Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2020-07-23  5:55 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, bpf, netdev, ast, daniel, kernel-team,
	john.fastabend, kpsingh, brouer, peterz

On Wed, Jul 22, 2020 at 11:42:08AM -0700, Song Liu wrote:
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 856d98c36f562..f77d009fcce95 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9544,6 +9544,24 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
>  	if (IS_ERR(prog))
>  		return PTR_ERR(prog);
>  
> +	if (event->attr.precise_ip &&
> +	    prog->call_get_stack &&
> +	    (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) ||
> +	     event->attr.exclude_callchain_kernel ||
> +	     event->attr.exclude_callchain_user)) {
> +		/*
> +		 * On perf_event with precise_ip, calling bpf_get_stack()
> +		 * may trigger unwinder warnings and occasional crashes.
> +		 * bpf_get_[stack|stackid] works around this issue by using
> +		 * the callchain attached to perf_sample_data. If the
> +		 * perf_event does not have a full (kernel and user)
> +		 * callchain attached to perf_sample_data, do not allow
> +		 * attaching a BPF program that calls bpf_get_[stack|stackid].
> +		 */
> +		bpf_prog_put(prog);
> +		return -EINVAL;

I suspect this will be a common error. bpftrace and others will be hitting
this issue and would need to fix how they do perf_event_open.
But EINVAL is too ambiguous and sys_perf_event_open has no ability to
return a string.
So how about we pick some different errno here to make future debugging
a bit less painful?
May be EBADFD or EPROTO or EPROTOTYPE ?
I think anything would be better than EINVAL.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 bpf-next 2/4] bpf: fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot work
  2020-07-23  5:55   ` Alexei Starovoitov
@ 2020-07-23  6:20     ` Song Liu
  2020-07-23 15:22       ` Alexei Starovoitov
  0 siblings, 1 reply; 7+ messages in thread
From: Song Liu @ 2020-07-23  6:20 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: open list, bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, john fastabend, KP Singh, Jesper Dangaard Brouer,
	peterz



> On Jul 22, 2020, at 10:55 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> On Wed, Jul 22, 2020 at 11:42:08AM -0700, Song Liu wrote:
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 856d98c36f562..f77d009fcce95 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -9544,6 +9544,24 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
>> 	if (IS_ERR(prog))
>> 		return PTR_ERR(prog);
>> 
>> +	if (event->attr.precise_ip &&
>> +	    prog->call_get_stack &&
>> +	    (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) ||
>> +	     event->attr.exclude_callchain_kernel ||
>> +	     event->attr.exclude_callchain_user)) {
>> +		/*
>> +		 * On perf_event with precise_ip, calling bpf_get_stack()
>> +		 * may trigger unwinder warnings and occasional crashes.
>> +		 * bpf_get_[stack|stackid] works around this issue by using
>> +		 * the callchain attached to perf_sample_data. If the
>> +		 * perf_event does not have a full (kernel and user)
>> +		 * callchain attached to perf_sample_data, do not allow
>> +		 * attaching a BPF program that calls bpf_get_[stack|stackid].
>> +		 */
>> +		bpf_prog_put(prog);
>> +		return -EINVAL;
> 
> I suspect this will be a common error. bpftrace and others will be hitting
> this issue and would need to fix how they do perf_event_open.
> But EINVAL is too ambiguous and sys_perf_event_open has no ability to
> return a string.
> So how about we pick some different errno here to make future debugging
> a bit less painful?
> May be EBADFD or EPROTO or EPROTOTYPE ?
> I think anything would be better than EINVAL.

I like EPROTO most. I will change it to EPROTO if we don't have better ideas.

Btw, this is not the error code on sys_perf_event_open(). It is the ioctl()
on the perf_event fd. So debugging this error will be less painful than 
debugging sys_perf_event_open() errors. 

Thanks,
Song

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 bpf-next 2/4] bpf: fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot work
  2020-07-23  6:20     ` Song Liu
@ 2020-07-23 15:22       ` Alexei Starovoitov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2020-07-23 15:22 UTC (permalink / raw)
  To: Song Liu
  Cc: open list, bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, john fastabend, KP Singh, Jesper Dangaard Brouer,
	peterz

On Wed, Jul 22, 2020 at 11:20 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Jul 22, 2020, at 10:55 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Jul 22, 2020 at 11:42:08AM -0700, Song Liu wrote:
> >> diff --git a/kernel/events/core.c b/kernel/events/core.c
> >> index 856d98c36f562..f77d009fcce95 100644
> >> --- a/kernel/events/core.c
> >> +++ b/kernel/events/core.c
> >> @@ -9544,6 +9544,24 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
> >>      if (IS_ERR(prog))
> >>              return PTR_ERR(prog);
> >>
> >> +    if (event->attr.precise_ip &&
> >> +        prog->call_get_stack &&
> >> +        (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) ||
> >> +         event->attr.exclude_callchain_kernel ||
> >> +         event->attr.exclude_callchain_user)) {
> >> +            /*
> >> +             * On perf_event with precise_ip, calling bpf_get_stack()
> >> +             * may trigger unwinder warnings and occasional crashes.
> >> +             * bpf_get_[stack|stackid] works around this issue by using
> >> +             * the callchain attached to perf_sample_data. If the
> >> +             * perf_event does not have a full (kernel and user)
> >> +             * callchain attached to perf_sample_data, do not allow
> >> +             * attaching a BPF program that calls bpf_get_[stack|stackid].
> >> +             */
> >> +            bpf_prog_put(prog);
> >> +            return -EINVAL;
> >
> > I suspect this will be a common error. bpftrace and others will be hitting
> > this issue and would need to fix how they do perf_event_open.
> > But EINVAL is too ambiguous and sys_perf_event_open has no ability to
> > return a string.
> > So how about we pick some different errno here to make future debugging
> > a bit less painful?
> > May be EBADFD or EPROTO or EPROTOTYPE ?
> > I think anything would be better than EINVAL.
>
> I like EPROTO most. I will change it to EPROTO if we don't have better ideas.
>
> Btw, this is not the error code on sys_perf_event_open(). It is the ioctl()
> on the perf_event fd. So debugging this error will be less painful than
> debugging sys_perf_event_open() errors.

ahh. right. Could you also add a string hint to libbpf when it sees this errno?
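
A rough sketch of the kind of hint being suggested, written as a standalone
helper rather than actual libbpf code; the use of EPROTO follows the errno
discussed above and is an assumption until the value is settled:

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Standalone illustration of such a hint; not actual libbpf code. */
static void explain_set_bpf_errno(int err)
{
	if (err == EPROTO)
		fprintf(stderr,
			"hint: bpf_get_stack/bpf_get_stackid on a precise_ip "
			"event needs PERF_SAMPLE_CALLCHAIN and no "
			"exclude_callchain_*\n");
	else
		fprintf(stderr, "PERF_EVENT_IOC_SET_BPF: %s\n", strerror(err));
}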

^ permalink raw reply	[flat|nested] 7+ messages in thread
