* [PATCH bpf-next 0/3] bpf: introduce bpf_get_branch_trace
@ 2021-08-24  6:01 Song Liu
  2021-08-24  6:01 ` [PATCH bpf-next 1/3] perf: enable branch record for software events Song Liu
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Song Liu @ 2021-08-24  6:01 UTC (permalink / raw)
  To: bpf, linux-kernel; +Cc: acme, peterz, mingo, kernel-team, Song Liu

A branch stack can be very useful for understanding software events. For
example, when a long function, e.g. sys_perf_event_open, returns an errno,
it is not obvious why the function failed. The branch stack can provide
very helpful information in this type of scenario.

This set adds support for reading the branch stack with a new BPF helper,
bpf_get_branch_trace(). Currently, this is only supported on Intel systems.
It is also possible to support the same feature on PowerPC.

The hardware that records the branch stack is not stopped automatically on
software events. Therefore, software must stop it as soon as possible.
Otherwise, the hardware buffers/registers will be flushed with entries
recorded after the event. One of the key design considerations in this set
is to minimize the number of branch record entries between the time the
event triggers and the time the hardware recorder is stopped. Based on this
goal, the current design differs from the discussions in the original RFC [1]:
 1) Static call is used when supported, to save function pointer
    dereference;
 2) intel_pmu_lbr_disable_all is used instead of perf_pmu_disable(),
    because the latter uses about 10 entries before stopping LBR.

With the current code, on Intel CPUs, LBR is stopped within 6 branch entries
after fexit triggers:

ID: 0 from intel_pmu_lbr_disable_all.part.10+37 to intel_pmu_lbr_disable_all.part.10+72
ID: 1 from intel_pmu_lbr_disable_all.part.10+33 to intel_pmu_lbr_disable_all.part.10+37
ID: 2 from intel_pmu_snapshot_branch_stack+46 to intel_pmu_lbr_disable_all.part.10+0
ID: 3 from __bpf_prog_enter+38 to intel_pmu_snapshot_branch_stack+0
ID: 4 from __bpf_prog_enter+8 to __bpf_prog_enter+38
ID: 5 from __brk_limit+477020214 to __bpf_prog_enter+0
ID: 6 from bpf_fexit_loop_test1+22 to __brk_limit+477020195
ID: 7 from bpf_fexit_loop_test1+20 to bpf_fexit_loop_test1+13
ID: 8 from bpf_fexit_loop_test1+20 to bpf_fexit_loop_test1+13
...

[1] https://lore.kernel.org/bpf/20210818012937.2522409-1-songliubraving@fb.com/
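
As an illustration (not part of the patch set), the number of entries lost
while stopping the recorder can be counted with the same rule the selftest in
patch 3 uses: an entry counts as wasted until the first branch whose source
and target both lie inside the traced function. The sketch below is plain
userspace C; struct branch_pair and count_wasted() are hypothetical names, and
only the from/to fields of struct perf_branch_entry are modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the from/to fields of struct perf_branch_entry. */
struct branch_pair {
	uint64_t from;
	uint64_t to;
};

/* True when addr lies in the half-open range [low, high). */
static int in_range(uint64_t addr, uint64_t low, uint64_t high)
{
	return addr >= low && addr < high;
}

/*
 * Count the entries that precede the first branch lying entirely inside
 * [low, high). entries[0] is the most recent branch, as in the dump.
 * Returns n if no entry lies entirely inside the range.
 */
static size_t count_wasted(const struct branch_pair *entries, size_t n,
			   uint64_t low, uint64_t high)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (in_range(entries[i].from, low, high) &&
		    in_range(entries[i].to, low, high))
			break;
	}
	return i;
}
```

Applied to the dump above, this rule would count IDs 0 through 6, because
ID 6 is the branch that leaves bpf_fexit_loop_test1 and so has only its
source inside the function. That is one more than the 6 entries recorded
after the trigger.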

Song Liu (3):
  perf: enable branch record for software events
  bpf: introduce helper bpf_get_branch_trace
  selftests/bpf: add test for bpf_get_branch_trace

 arch/x86/events/intel/core.c                  |   5 +-
 arch/x86/events/intel/lbr.c                   |  12 ++
 arch/x86/events/perf_event.h                  |   2 +
 include/linux/filter.h                        |   3 +-
 include/linux/perf_event.h                    |  33 ++++++
 include/uapi/linux/bpf.h                      |  16 +++
 kernel/bpf/trampoline.c                       |  15 +++
 kernel/bpf/verifier.c                         |   7 ++
 kernel/events/core.c                          |  28 +++++
 kernel/trace/bpf_trace.c                      |  30 +++++
 net/bpf/test_run.c                            |  15 ++-
 tools/include/uapi/linux/bpf.h                |  16 +++
 .../bpf/prog_tests/get_branch_trace.c         | 106 ++++++++++++++++++
 .../selftests/bpf/progs/get_branch_trace.c    |  41 +++++++
 tools/testing/selftests/bpf/trace_helpers.c   |  30 +++++
 tools/testing/selftests/bpf/trace_helpers.h   |   5 +
 16 files changed, 361 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_branch_trace.c
 create mode 100644 tools/testing/selftests/bpf/progs/get_branch_trace.c

--
2.30.2


* [PATCH bpf-next 1/3] perf: enable branch record for software events
  2021-08-24  6:01 [PATCH bpf-next 0/3] bpf: introduce bpf_get_branch_trace Song Liu
@ 2021-08-24  6:01 ` Song Liu
  2021-08-25 12:09   ` Peter Zijlstra
  2021-08-24  6:01 ` [PATCH bpf-next 2/3] bpf: introduce helper bpf_get_branch_trace Song Liu
  2021-08-24  6:01 ` [PATCH bpf-next 3/3] selftests/bpf: add test for bpf_get_branch_trace Song Liu
  2 siblings, 1 reply; 9+ messages in thread
From: Song Liu @ 2021-08-24  6:01 UTC (permalink / raw)
  To: bpf, linux-kernel; +Cc: acme, peterz, mingo, kernel-team, Song Liu

The typical way to access branch records (e.g. Intel LBR) is via a hardware
perf_event. For CPUs with FREEZE_LBRS_ON_PMI support, a PMI can capture
reliable LBR data. On the other hand, LBR can also be useful in non-PMI
scenarios. For example, in a kretprobe or bpf fexit program, LBR can
provide a lot of information on what happened within the function. Add an
API to use branch records for software events.

Note that, when the software event triggers, it is necessary to stop the
branch record hardware asap. Therefore, static_call is used to remove some
branch instructions in this process.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 arch/x86/events/intel/core.c |  5 ++++-
 arch/x86/events/intel/lbr.c  | 12 ++++++++++++
 arch/x86/events/perf_event.h |  2 ++
 include/linux/perf_event.h   | 33 +++++++++++++++++++++++++++++++++
 kernel/events/core.c         | 28 ++++++++++++++++++++++++++++
 5 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ac6fd2dabf6a2..a29649e7241cc 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -6283,8 +6283,11 @@ __init int intel_pmu_init(void)
 			x86_pmu.lbr_nr = 0;
 	}
 
-	if (x86_pmu.lbr_nr)
+	if (x86_pmu.lbr_nr) {
 		pr_cont("%d-deep LBR, ", x86_pmu.lbr_nr);
+		static_call_update(perf_snapshot_branch_stack,
+				   intel_pmu_snapshot_branch_stack);
+	}
 
 	intel_pmu_check_extra_regs(x86_pmu.extra_regs);
 
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 9e6d6eaeb4cb6..b73b444cf229d 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1862,3 +1862,15 @@ EXPORT_SYMBOL_GPL(x86_perf_get_lbr);
 struct event_constraint vlbr_constraint =
 	__EVENT_CONSTRAINT(INTEL_FIXED_VLBR_EVENT, (1ULL << INTEL_PMC_IDX_FIXED_VLBR),
 			  FIXED_EVENT_FLAGS, 1, 0, PERF_X86_EVENT_LBR_SELECT);
+
+void intel_pmu_snapshot_branch_stack(void)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	intel_pmu_lbr_disable_all();
+	intel_pmu_lbr_read();
+	memcpy(this_cpu_ptr(&perf_branch_snapshot_entries), cpuc->lbr_entries,
+	       sizeof(struct perf_branch_entry) * x86_pmu.lbr_nr);
+	*this_cpu_ptr(&perf_branch_snapshot_size) = x86_pmu.lbr_nr;
+	intel_pmu_lbr_enable_all(false);
+}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index e3ac05c97b5e5..5262083f4e13b 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1379,6 +1379,8 @@ void intel_pmu_pebs_data_source_skl(bool pmem);
 
 int intel_pmu_setup_lbr_filter(struct perf_event *event);
 
+void intel_pmu_snapshot_branch_stack(void);
+
 void intel_pt_interrupt(void);
 
 int intel_bts_interrupt(void);
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index fe156a8170aa3..7cd2af7c5eda6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -57,6 +57,7 @@ struct perf_guest_info_callbacks {
 #include <linux/cgroup.h>
 #include <linux/refcount.h>
 #include <linux/security.h>
+#include <linux/static_call.h>
 #include <asm/local.h>
 
 struct perf_callchain_entry {
@@ -1612,4 +1613,36 @@ extern void __weak arch_perf_update_userpage(struct perf_event *event,
 extern __weak u64 arch_perf_get_page_size(struct mm_struct *mm, unsigned long addr);
 #endif
 
+/*
+ * Snapshot branch stack on software events.
+ *
+ * Branch stack can be very useful in understanding software events. For
+ * example, when a long function, e.g. sys_perf_event_open, returns an
+ * errno, it is not obvious why the function failed. Branch stack could
+ * provide very helpful information in this type of scenario.
+ *
+ * On software event, it is necessary to stop the hardware branch recorder
+ * fast. Otherwise, the hardware register/buffer will be flushed with
+ * entries after the triggering event. Therefore, static call is used to
+ * stop the hardware recorder.
+ *
+ * To use the snapshot:
+ * 1) After the event triggers, call perf_snapshot_branch_stack asap;
+ * 2) On the same cpu, access the snapshot with perf_read_branch_snapshot;
+ */
+#define MAX_BRANCH_SNAPSHOT 32
+DECLARE_PER_CPU(struct perf_branch_entry,
+		perf_branch_snapshot_entries[MAX_BRANCH_SNAPSHOT]);
+DECLARE_PER_CPU(int, perf_branch_snapshot_size);
+
+void perf_default_snapshot_branch_stack(void);
+
+#ifdef CONFIG_HAVE_STATIC_CALL
+DECLARE_STATIC_CALL(perf_snapshot_branch_stack,
+		    perf_default_snapshot_branch_stack);
+#else
+extern void (*perf_snapshot_branch_stack)(void);
+#endif
+
+int perf_read_branch_snapshot(void *buf, size_t len);
 #endif /* _LINUX_PERF_EVENT_H */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 011cc5069b7ba..b42cc20451709 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -13437,3 +13437,31 @@ struct cgroup_subsys perf_event_cgrp_subsys = {
 	.threaded	= true,
 };
 #endif /* CONFIG_CGROUP_PERF */
+
+DEFINE_PER_CPU(struct perf_branch_entry,
+	       perf_branch_snapshot_entries[MAX_BRANCH_SNAPSHOT]);
+DEFINE_PER_CPU(int, perf_branch_snapshot_size);
+
+void perf_default_snapshot_branch_stack(void)
+{
+	*this_cpu_ptr(&perf_branch_snapshot_size) = 0;
+}
+
+#ifdef CONFIG_HAVE_STATIC_CALL
+DEFINE_STATIC_CALL(perf_snapshot_branch_stack,
+		   perf_default_snapshot_branch_stack);
+#else
+void (*perf_snapshot_branch_stack)(void) = perf_default_snapshot_branch_stack;
+#endif
+
+int perf_read_branch_snapshot(void *buf, size_t len)
+{
+	int cnt;
+
+	memcpy(buf, *this_cpu_ptr(&perf_branch_snapshot_entries),
+	       min_t(u32, (u32)len,
+		     sizeof(struct perf_branch_entry) * MAX_BRANCH_SNAPSHOT));
+	cnt =  *this_cpu_ptr(&perf_branch_snapshot_size);
+
+	return (cnt > 0) ? cnt : -EOPNOTSUPP;
+}
-- 
2.30.2
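
As an illustration (not part of the patch), note that perf_read_branch_snapshot()
as posted copies at most len bytes but returns the full snapshot size, so a
caller with a small buffer can be told about more entries than were written.
The userspace sketch below shows one possible variant that clamps the return
value to the number of whole entries copied. The names branch_pair,
snapshot_entries, snapshot_size and read_snapshot_clamped() are hypothetical,
and a single static buffer stands in for the per-CPU data.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SNAPSHOT 32	/* mirrors MAX_BRANCH_SNAPSHOT in the patch */

/* Stand-in for struct perf_branch_entry; only from/to are modeled. */
struct branch_pair {
	uint64_t from;
	uint64_t to;
};

/* Stand-ins for the per-CPU snapshot buffer and size (one "CPU" only). */
static struct branch_pair snapshot_entries[MAX_SNAPSHOT];
static int snapshot_size;

/*
 * Copy as many whole entries as fit in buf and return that number, so the
 * return value never exceeds what was actually written. Returns
 * -EOPNOTSUPP when no snapshot has been taken.
 */
static int read_snapshot_clamped(void *buf, size_t len)
{
	size_t fit = len / sizeof(struct branch_pair);
	size_t cnt;

	if (snapshot_size <= 0)
		return -EOPNOTSUPP;

	cnt = (size_t)snapshot_size;
	if (cnt > MAX_SNAPSHOT)
		cnt = MAX_SNAPSHOT;
	if (cnt > fit)
		cnt = fit;

	memcpy(buf, snapshot_entries, cnt * sizeof(struct branch_pair));
	return (int)cnt;
}
```

Whether the helper should clamp or leave that to the caller is a design
choice. The selftest in patch 3 sidesteps the question by passing a buffer
large enough for MAX_LBR_ENTRIES entries.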



* [PATCH bpf-next 2/3] bpf: introduce helper bpf_get_branch_trace
  2021-08-24  6:01 [PATCH bpf-next 0/3] bpf: introduce bpf_get_branch_trace Song Liu
  2021-08-24  6:01 ` [PATCH bpf-next 1/3] perf: enable branch record for software events Song Liu
@ 2021-08-24  6:01 ` Song Liu
  2021-08-25  1:14   ` kernel test robot
  2021-08-24  6:01 ` [PATCH bpf-next 3/3] selftests/bpf: add test for bpf_get_branch_trace Song Liu
  2 siblings, 1 reply; 9+ messages in thread
From: Song Liu @ 2021-08-24  6:01 UTC (permalink / raw)
  To: bpf, linux-kernel; +Cc: acme, peterz, mingo, kernel-team, Song Liu

Introduce bpf_get_branch_trace(), which allows a tracing program to get
the branch trace from hardware (e.g. Intel LBR). To use the feature, the
user needs to create a perf_event with proper branch_record filtering
on each cpu, and then call bpf_get_branch_trace in the bpf function.
On Intel CPUs, the VLBR event (raw event 0x1b00) can be used for this.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/filter.h         |  3 ++-
 include/uapi/linux/bpf.h       | 16 ++++++++++++++++
 kernel/bpf/trampoline.c        | 15 +++++++++++++++
 kernel/bpf/verifier.c          |  7 +++++++
 kernel/trace/bpf_trace.c       | 30 ++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 16 ++++++++++++++++
 6 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 7d248941ecea3..8c30712f56ab2 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -575,7 +575,8 @@ struct bpf_prog {
 				has_callchain_buf:1, /* callchain buffer allocated? */
 				enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
 				call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
-				call_get_func_ip:1; /* Do we call get_func_ip() */
+				call_get_func_ip:1, /* Do we call get_func_ip() */
+				call_get_branch:1; /* Do we call get_branch_trace() */
 	enum bpf_prog_type	type;		/* Type of BPF program */
 	enum bpf_attach_type	expected_attach_type; /* For some prog types */
 	u32			len;		/* Number of filter blocks */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 191f0b286ee39..4b1ddb76603a5 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4871,6 +4871,21 @@ union bpf_attr {
  * 	Return
  *		Value specified by user at BPF link creation/attachment time
  *		or 0, if it was not specified.
+ *
+ * long bpf_get_branch_trace(void *entries, u32 size)
+ *	Description
+ *		Get branch trace from hardware engines like Intel LBR. The
+ *		branch trace is taken soon after the trigger point of the
+ *		BPF program, so it may contain some entries after the
+ *		trigger point. The user needs to filter these entries
+ *		accordingly.
+ *
+ *		The data is stored as struct perf_branch_entry into output
+ *		buffer *entries*. *size* is the size of *entries* in bytes.
+ *
+ *	Return
+ *		> 0, number of valid output entries.
+ *		**-EOPNOTSUPP**, the hardware/kernel does not support this function
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5048,6 +5063,7 @@ union bpf_attr {
 	FN(timer_cancel),		\
 	FN(get_func_ip),		\
 	FN(get_attach_cookie),		\
+	FN(get_branch_trace),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index fe1e857324e66..c36d3d7366cc9 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -10,6 +10,7 @@
 #include <linux/rcupdate_trace.h>
 #include <linux/rcupdate_wait.h>
 #include <linux/module.h>
+#include <linux/static_call.h>
 
 /* dummy _ops. The verifier will operate on target program's ops. */
 const struct bpf_verifier_ops bpf_extension_verifier_ops = {
@@ -564,6 +565,20 @@ static void notrace inc_misses_counter(struct bpf_prog *prog)
 u64 notrace __bpf_prog_enter(struct bpf_prog *prog)
 	__acquires(RCU)
 {
+	/* Calling migrate_disable costs two entries in the LBR. To save
+	 * those entries, we call perf_snapshot_branch_stack before
+	 * migrate_disable. This is OK because we only care about the
+	 * branch trace before entering the BPF program. If migration
+	 * happens exactly here, there isn't much we can do to preserve
+	 * the data.
+	 */
+	if (prog->call_get_branch) {
+#ifdef CONFIG_HAVE_STATIC_CALL
+		static_call(perf_snapshot_branch_stack)();
+#else
+		perf_snapshot_branch_stack();
+#endif
+	}
 	rcu_read_lock();
 	migrate_disable();
 	if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f5a0077c99811..292d2b471892a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6446,6 +6446,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		env->prog->call_get_func_ip = true;
 	}
 
+	if (func_id == BPF_FUNC_get_branch_trace) {
+		if (env->prog->aux->sleepable) {
+			verbose(env, "sleepable progs cannot call get_branch_trace\n");
+			return -ENOTSUPP;
+		}
+		env->prog->call_get_branch = true;
+	}
 	if (changes_data)
 		clear_all_pkt_pointers(env);
 	return 0;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index cbc73c08c4a4e..fe0a653190a5f 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1002,6 +1002,19 @@ static const struct bpf_func_proto bpf_get_attach_cookie_proto_pe = {
 	.arg1_type	= ARG_PTR_TO_CTX,
 };
 
+BPF_CALL_2(bpf_get_branch_trace, void *, buf, u32, size)
+{
+	return perf_read_branch_snapshot(buf, size);
+}
+
+static const struct bpf_func_proto bpf_get_branch_trace_proto = {
+	.func		= bpf_get_branch_trace,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+};
+
 static const struct bpf_func_proto *
 bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -1115,6 +1128,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_snprintf_proto;
 	case BPF_FUNC_get_func_ip:
 		return &bpf_get_func_ip_proto_tracing;
+	case BPF_FUNC_get_branch_trace:
+		return &bpf_get_branch_trace_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
@@ -1849,6 +1864,21 @@ void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
 static __always_inline
 void __bpf_trace_run(struct bpf_prog *prog, u64 *args)
 {
+	/* Calling migrate_disable costs two entries in the LBR. To save
+	 * those entries, we call perf_snapshot_branch_stack before
+	 * migrate_disable. This is OK because we only care about the
+	 * branch trace before entering the BPF program. If migration
+	 * happens exactly here, there isn't much we can do to preserve
+	 * the data.
+	 */
+	if (prog->call_get_branch) {
+#ifdef CONFIG_HAVE_STATIC_CALL
+		static_call(perf_snapshot_branch_stack)();
+#else
+		perf_snapshot_branch_stack();
+#endif
+	}
+
 	cant_sleep();
 	rcu_read_lock();
 	(void) bpf_prog_run(prog, args);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 191f0b286ee39..4b1ddb76603a5 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4871,6 +4871,21 @@ union bpf_attr {
  * 	Return
  *		Value specified by user at BPF link creation/attachment time
  *		or 0, if it was not specified.
+ *
+ * long bpf_get_branch_trace(void *entries, u32 size)
+ *	Description
+ *		Get branch trace from hardware engines like Intel LBR. The
+ *		branch trace is taken soon after the trigger point of the
+ *		BPF program, so it may contain some entries after the
+ *		trigger point. The user needs to filter these entries
+ *		accordingly.
+ *
+ *		The data is stored as struct perf_branch_entry into output
+ *		buffer *entries*. *size* is the size of *entries* in bytes.
+ *
+ *	Return
+ *		> 0, number of valid output entries.
+ *		**-EOPNOTSUPP**, the hardware/kernel does not support this function
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5048,6 +5063,7 @@ union bpf_attr {
 	FN(timer_cancel),		\
 	FN(get_func_ip),		\
 	FN(get_attach_cookie),		\
+	FN(get_branch_trace),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.30.2



* [PATCH bpf-next 3/3] selftests/bpf: add test for bpf_get_branch_trace
  2021-08-24  6:01 [PATCH bpf-next 0/3] bpf: introduce bpf_get_branch_trace Song Liu
  2021-08-24  6:01 ` [PATCH bpf-next 1/3] perf: enable branch record for software events Song Liu
  2021-08-24  6:01 ` [PATCH bpf-next 2/3] bpf: introduce helper bpf_get_branch_trace Song Liu
@ 2021-08-24  6:01 ` Song Liu
  2 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2021-08-24  6:01 UTC (permalink / raw)
  To: bpf, linux-kernel; +Cc: acme, peterz, mingo, kernel-team, Song Liu

This test uses bpf_get_branch_trace from an fexit program. The test uses
a target kernel function (bpf_fexit_loop_test1) and compares the records
against kallsyms. If too few records match kallsyms, the test fails.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 net/bpf/test_run.c                            |  15 ++-
 .../bpf/prog_tests/get_branch_trace.c         | 106 ++++++++++++++++++
 .../selftests/bpf/progs/get_branch_trace.c    |  41 +++++++
 tools/testing/selftests/bpf/trace_helpers.c   |  30 +++++
 tools/testing/selftests/bpf/trace_helpers.h   |   5 +
 5 files changed, 196 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_branch_trace.c
 create mode 100644 tools/testing/selftests/bpf/progs/get_branch_trace.c

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2eb0e55ef54d2..6cc179a532c9c 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -231,6 +231,18 @@ struct sock * noinline bpf_kfunc_call_test3(struct sock *sk)
 	return sk;
 }
 
+noinline int bpf_fexit_loop_test1(int n)
+{
+	int i, sum = 0;
+
+	/* the primary goal of this test is to test LBR. Create a lot of
+	 * branches in the function, so we can catch it easily.
+	 */
+	for (i = 0; i < n; i++)
+		sum += i;
+	return sum;
+}
+
 __diag_pop();
 
 ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO);
@@ -293,7 +305,8 @@ int bpf_prog_test_run_tracing(struct bpf_prog *prog,
 		    bpf_fentry_test5(11, (void *)12, 13, 14, 15) != 65 ||
 		    bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111 ||
 		    bpf_fentry_test7((struct bpf_fentry_test_t *)0) != 0 ||
-		    bpf_fentry_test8(&arg) != 0)
+		    bpf_fentry_test8(&arg) != 0 ||
+		    bpf_fexit_loop_test1(101) != 5050)
 			goto out;
 		break;
 	case BPF_MODIFY_RETURN:
diff --git a/tools/testing/selftests/bpf/prog_tests/get_branch_trace.c b/tools/testing/selftests/bpf/prog_tests/get_branch_trace.c
new file mode 100644
index 0000000000000..67693322e0974
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/get_branch_trace.c
@@ -0,0 +1,106 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+#include <test_progs.h>
+#include "get_branch_trace.skel.h"
+
+static int *pfd_array;
+static int cpu_cnt;
+
+static int create_perf_events(void)
+{
+	struct perf_event_attr attr = {0};
+	int cpu;
+
+	/* create perf event */
+	attr.size = sizeof(attr);
+	attr.type = PERF_TYPE_RAW;
+	attr.config = 0x1b00;
+	attr.sample_type = PERF_SAMPLE_BRANCH_STACK;
+	attr.branch_sample_type = PERF_SAMPLE_BRANCH_KERNEL |
+		PERF_SAMPLE_BRANCH_USER | PERF_SAMPLE_BRANCH_ANY;
+
+	cpu_cnt = libbpf_num_possible_cpus();
+	pfd_array = malloc(sizeof(int) * cpu_cnt);
+	if (!pfd_array) {
+		cpu_cnt = 0;
+		return 1;
+	}
+
+	for (cpu = 0; cpu < cpu_cnt; cpu++) {
+		pfd_array[cpu] = syscall(__NR_perf_event_open, &attr,
+					 -1, cpu, -1, PERF_FLAG_FD_CLOEXEC);
+		if (pfd_array[cpu] < 0)
+			break;
+	}
+
+	return cpu == 0;
+}
+
+static void close_perf_events(void)
+{
+	int cpu;
+	int fd;
+
+	for (cpu = 0; cpu < cpu_cnt; cpu++) {
+		fd = pfd_array[cpu];
+		if (fd < 0)
+			break;
+		close(fd);
+	}
+	free(pfd_array);
+}
+
+void test_get_branch_trace(void)
+{
+	struct get_branch_trace *skel = NULL;
+	int err, prog_fd;
+	__u32 retval;
+
+	if (create_perf_events()) {
+		test__skip();  /* system doesn't support LBR */
+		goto cleanup;
+	}
+
+	skel = get_branch_trace__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "get_branch_trace__open_and_load"))
+		goto cleanup;
+
+	err = kallsyms_find("bpf_fexit_loop_test1", &skel->bss->address_low);
+	if (!ASSERT_OK(err, "kallsyms_find"))
+		goto cleanup;
+
+	err = kallsyms_find_next("bpf_fexit_loop_test1", &skel->bss->address_high);
+	if (!ASSERT_OK(err, "kallsyms_find_next"))
+		goto cleanup;
+
+	err = get_branch_trace__attach(skel);
+	if (!ASSERT_OK(err, "get_branch_trace__attach"))
+		goto cleanup;
+
+	prog_fd = bpf_program__fd(skel->progs.test1);
+	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+				NULL, 0, &retval, NULL);
+
+	if (!ASSERT_OK(err, "bpf_prog_test_run"))
+		goto cleanup;
+
+	if (skel->bss->total_entries < 16) {
+		/* too few entries for the hit/waste test */
+		test__skip();
+		goto cleanup;
+	}
+
+	ASSERT_GT(skel->bss->test1_hits, 5, "find_test1_in_lbr");
+
+	/* Given we stop LBR in software, we will waste a few entries.
+	 * But we should try to waste as few entries as possible. We are at
+	 * about 7 on x86_64 systems.
+	 * Add a check for < 10 so that we get heads-up when something
+	 * changes and wastes too many entries.
+	 */
+	ASSERT_LT(skel->bss->wasted_entries, 10, "check_wasted_entries");
+
+cleanup:
+	get_branch_trace__destroy(skel);
+	close_perf_events();
+}
diff --git a/tools/testing/selftests/bpf/progs/get_branch_trace.c b/tools/testing/selftests/bpf/progs/get_branch_trace.c
new file mode 100644
index 0000000000000..02ff41951b377
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/get_branch_trace.c
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+__u64 test1_hits = 0;
+__u64 address_low = 0;
+__u64 address_high = 0;
+int wasted_entries = 0;
+long total_entries = 0;
+
+#define MAX_LBR_ENTRIES 32
+
+struct perf_branch_entry entries[MAX_LBR_ENTRIES] = {};
+
+
+static inline bool in_range(__u64 val)
+{
+	return (val >= address_low) && (val < address_high);
+}
+
+SEC("fexit/bpf_fexit_loop_test1")
+int BPF_PROG(test1, int n, int ret)
+{
+	long i;
+
+	total_entries = bpf_get_branch_trace(entries, sizeof(entries));
+
+	for (i = 0; i < MAX_LBR_ENTRIES; i++) {
+		if (i >= total_entries)
+			break;
+		if (in_range(entries[i].from) && in_range(entries[i].to))
+			test1_hits++;
+		else if (!test1_hits)
+			wasted_entries++;
+	}
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index e7a19b04d4eaf..2926a3b626821 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -117,6 +117,36 @@ int kallsyms_find(const char *sym, unsigned long long *addr)
 	return err;
 }
 
+/* find the address of the next symbol, this can be used to determine the
+ * end of a function
+ */
+int kallsyms_find_next(const char *sym, unsigned long long *addr)
+{
+	char type, name[500];
+	unsigned long long value;
+	bool found = false;
+	int err = 0;
+	FILE *f;
+
+	f = fopen("/proc/kallsyms", "r");
+	if (!f)
+		return -EINVAL;
+
+	while (fscanf(f, "%llx %c %499s%*[^\n]\n", &value, &type, name) > 0) {
+		if (found) {
+			*addr = value;
+			goto out;
+		}
+		if (strcmp(name, sym) == 0)
+			found = true;
+	}
+	err = -ENOENT;
+
+out:
+	fclose(f);
+	return err;
+}
+
 void read_trace_pipe(void)
 {
 	int trace_fd;
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index d907b445524d5..bc8ed86105d94 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -16,6 +16,11 @@ long ksym_get_addr(const char *name);
 /* open kallsyms and find addresses on the fly, faster than load + search. */
 int kallsyms_find(const char *sym, unsigned long long *addr);
 
+/* find the address of the next symbol, this can be used to determine the
+ * end of a function
+ */
+int kallsyms_find_next(const char *sym, unsigned long long *addr);
+
 void read_trace_pipe(void);
 
 ssize_t get_uprobe_offset(const void *addr, ssize_t base);
-- 
2.30.2



* Re: [PATCH bpf-next 2/3] bpf: introduce helper bpf_get_branch_trace
  2021-08-24  6:01 ` [PATCH bpf-next 2/3] bpf: introduce helper bpf_get_branch_trace Song Liu
@ 2021-08-25  1:14   ` kernel test robot
  0 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2021-08-25  1:14 UTC (permalink / raw)
  To: Song Liu, bpf, linux-kernel
  Cc: kbuild-all, acme, peterz, mingo, kernel-team, Song Liu

[-- Attachment #1: Type: text/plain, Size: 1604 bytes --]

Hi Song,

I love your patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Song-Liu/bpf-introduce-bpf_get_branch_trace/20210824-140315
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: riscv-randconfig-r025-20210825 (attached as .config)
compiler: riscv64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/e32271f38de34ebc8cc48176d9f1f0972182414e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Song-Liu/bpf-introduce-bpf_get_branch_trace/20210824-140315
        git checkout e32271f38de34ebc8cc48176d9f1f0972182414e
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=riscv SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   riscv64-linux-ld: kernel/bpf/trampoline.o: in function `__bpf_prog_enter':
>> trampoline.c:(.text+0x83c): undefined reference to `perf_snapshot_branch_stack'
>> riscv64-linux-ld: trampoline.c:(.text+0x840): undefined reference to `perf_snapshot_branch_stack'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35843 bytes --]


* Re: [PATCH bpf-next 1/3] perf: enable branch record for software events
  2021-08-24  6:01 ` [PATCH bpf-next 1/3] perf: enable branch record for software events Song Liu
@ 2021-08-25 12:09   ` Peter Zijlstra
  2021-08-25 15:22     ` Song Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2021-08-25 12:09 UTC (permalink / raw)
  To: Song Liu; +Cc: bpf, linux-kernel, acme, mingo, kernel-team

On Mon, Aug 23, 2021 at 11:01:55PM -0700, Song Liu wrote:

>  arch/x86/events/intel/core.c |  5 ++++-
>  arch/x86/events/intel/lbr.c  | 12 ++++++++++++
>  arch/x86/events/perf_event.h |  2 ++
>  include/linux/perf_event.h   | 33 +++++++++++++++++++++++++++++++++
>  kernel/events/core.c         | 28 ++++++++++++++++++++++++++++
>  5 files changed, 79 insertions(+), 1 deletion(-)

No PowerPC support :/

> +void intel_pmu_snapshot_branch_stack(void)
> +{
> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +
> +	intel_pmu_lbr_disable_all();
> +	intel_pmu_lbr_read();
> +	memcpy(this_cpu_ptr(&perf_branch_snapshot_entries), cpuc->lbr_entries,
> +	       sizeof(struct perf_branch_entry) * x86_pmu.lbr_nr);
> +	*this_cpu_ptr(&perf_branch_snapshot_size) = x86_pmu.lbr_nr;
> +	intel_pmu_lbr_enable_all(false);
> +}

Still has the layering violation and issues vs PMI.

> +#ifdef CONFIG_HAVE_STATIC_CALL
> +DECLARE_STATIC_CALL(perf_snapshot_branch_stack,
> +		    perf_default_snapshot_branch_stack);
> +#else
> +extern void (*perf_snapshot_branch_stack)(void);
> +#endif

That's weird, static call should work unconditionally, and fall back to
a regular function pointer exactly like you do here. Search for:
"Generic Implementation" in include/linux/static_call.h

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 011cc5069b7ba..b42cc20451709 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c

> +#ifdef CONFIG_HAVE_STATIC_CALL
> +DEFINE_STATIC_CALL(perf_snapshot_branch_stack,
> +		   perf_default_snapshot_branch_stack);
> +#else
> +void (*perf_snapshot_branch_stack)(void) = perf_default_snapshot_branch_stack;
> +#endif

Idem.

Something like:

DEFINE_STATIC_CALL_NULL(perf_snapshot_branch_stack, void (*)(void));

with usage like: static_call_cond(perf_snapshot_branch_stack)();

Should unconditionally work.
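
As an illustration (not from the thread), the NULL-by-default pattern
suggested above behaves roughly like the following userspace model: a hook
that starts out empty, a conditional call that does nothing until an
implementation is installed, and an update step. All names here are
hypothetical stand-ins; the real kernel interfaces are
DEFINE_STATIC_CALL_NULL(), static_call_cond() and static_call_update().

```c
#include <stddef.h>

/* Stand-in for the per-CPU snapshot size written by the hook. */
static int snapshot_size;

/* The hook starts out NULL, as with DEFINE_STATIC_CALL_NULL(). */
static void (*snapshot_hook)(void);

/* Call the hook only if one is installed, like static_call_cond(). */
static void snapshot_hook_cond(void)
{
	if (snapshot_hook)
		snapshot_hook();
}

/* Hypothetical arch implementation; a real one would read the LBR. */
static void fake_arch_snapshot(void)
{
	snapshot_size = 32;
}

/* Install an implementation, like static_call_update(). */
static void snapshot_hook_update(void (*fn)(void))
{
	snapshot_hook = fn;
}
```

With this shape there is no need for a separate default function or for
#ifdef CONFIG_HAVE_STATIC_CALL at the call sites: an architecture without
branch-record support simply never installs a hook.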

> +int perf_read_branch_snapshot(void *buf, size_t len)
> +{
> +	int cnt;
> +
> +	memcpy(buf, *this_cpu_ptr(&perf_branch_snapshot_entries),
> +	       min_t(u32, (u32)len,
> +		     sizeof(struct perf_branch_entry) * MAX_BRANCH_SNAPSHOT));
> +	cnt =  *this_cpu_ptr(&perf_branch_snapshot_size);
> +
> +	return (cnt > 0) ? cnt : -EOPNOTSUPP;
> +}

Doesn't seem used at all..



* Re: [PATCH bpf-next 1/3] perf: enable branch record for software events
  2021-08-25 12:09   ` Peter Zijlstra
@ 2021-08-25 15:22     ` Song Liu
  2021-08-26  7:56       ` kajoljain
  0 siblings, 1 reply; 9+ messages in thread
From: Song Liu @ 2021-08-25 15:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: open list:BPF (Safe dynamic programs and tools),
	LKML, Arnaldo Carvalho de Melo, Ingo Molnar, Kernel Team



> On Aug 25, 2021, at 5:09 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Mon, Aug 23, 2021 at 11:01:55PM -0700, Song Liu wrote:
> 
>> arch/x86/events/intel/core.c |  5 ++++-
>> arch/x86/events/intel/lbr.c  | 12 ++++++++++++
>> arch/x86/events/perf_event.h |  2 ++
>> include/linux/perf_event.h   | 33 +++++++++++++++++++++++++++++++++
>> kernel/events/core.c         | 28 ++++++++++++++++++++++++++++
>> 5 files changed, 79 insertions(+), 1 deletion(-)
> 
> No PowerPC support :/

I don't have a PowerPC system for testing at the moment. Could we settle the
overall framework now, and ask the PowerPC folks for help with PowerPC
support later?

> 
>> +void intel_pmu_snapshot_branch_stack(void)
>> +{
>> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +
>> +	intel_pmu_lbr_disable_all();
>> +	intel_pmu_lbr_read();
>> +	memcpy(this_cpu_ptr(&perf_branch_snapshot_entries), cpuc->lbr_entries,
>> +	       sizeof(struct perf_branch_entry) * x86_pmu.lbr_nr);
>> +	*this_cpu_ptr(&perf_branch_snapshot_size) = x86_pmu.lbr_nr;
>> +	intel_pmu_lbr_enable_all(false);
>> +}
> 
> Still has the layering violation and issues vs PMI.

Yes, this is the biggest change after I tested this more. I tried
perf_pmu_[disable|enable]() and a function pointer in "struct pmu". However,
all of this logic consumes LBR entries. In one version, 22 of the 32 LBR
entries were branches taken after the fexit event, most of them from
perf_pmu_disable(). Each function pointer call also consumes 1 or 2 entries.
This would be worse for systems with fewer LBR entries.
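
To make the entry budget concrete, here is a small sketch of the arithmetic.
The 22-of-32 figure is the one above and the 6-entry overhead is the one
reported in the cover letter; the 16-entry LBR is only an assumed example of
a smaller system, and lbr_useful_entries() is a hypothetical helper.

```c
#include <assert.h>

/*
 * Number of LBR entries that still describe code executed before the
 * event, given the LBR depth and the number of entries consumed between
 * the event and the point where LBR is stopped. Clamped at zero.
 */
int lbr_useful_entries(int lbr_nr, int overhead)
{
	int left;

	assert(lbr_nr > 0 && overhead >= 0);

	left = lbr_nr - overhead;
	return left > 0 ? left : 0;
}
```

With a 32-entry LBR, an overhead of 22 leaves 10 useful entries while an
overhead of 6 leaves 26. On the assumed 16-entry LBR, an overhead of 22
leaves nothing from before the event.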

On the other hand, I think the current version is not too bad. It may corrupt
some samples when this collides with a PMI, but it should not cause serious
issues. Did I miss anything more serious?

> 
>> +#ifdef CONFIG_HAVE_STATIC_CALL
>> +DECLARE_STATIC_CALL(perf_snapshot_branch_stack,
>> +		    perf_default_snapshot_branch_stack);
>> +#else
>> +extern void (*perf_snapshot_branch_stack)(void);
>> +#endif
> 
> That's weird, static call should work unconditionally, and fall back to
> a regular function pointer exactly like you do here. Search for:
> "Generic Implementation" in include/linux/static_call.h

Thanks for the pointer. Let me look into it. 
> 
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 011cc5069b7ba..b42cc20451709 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
> 
>> +#ifdef CONFIG_HAVE_STATIC_CALL
>> +DEFINE_STATIC_CALL(perf_snapshot_branch_stack,
>> +		   perf_default_snapshot_branch_stack);
>> +#else
>> +void (*perf_snapshot_branch_stack)(void) = perf_default_snapshot_branch_stack;
>> +#endif
> 
> Idem.
> 
> Something like:
> 
> DEFINE_STATIC_CALL_NULL(perf_snapshot_branch_stack, void (*)(void));
> 
> with usage like: static_call_cond(perf_snapshot_branch_stack)();
> 
> Should unconditionally work.
> 
>> +int perf_read_branch_snapshot(void *buf, size_t len)
>> +{
>> +	int cnt;
>> +
>> +	memcpy(buf, *this_cpu_ptr(&perf_branch_snapshot_entries),
>> +	       min_t(u32, (u32)len,
>> +		     sizeof(struct perf_branch_entry) * MAX_BRANCH_SNAPSHOT));
>> +	cnt =  *this_cpu_ptr(&perf_branch_snapshot_size);
>> +
>> +	return (cnt > 0) ? cnt : -EOPNOTSUPP;
>> +}
> 
> Doesn't seem used at all..

At the moment, this is only used from the BPF side (see 2/3). We can
certainly use it from the perf side too, but that would require a discussion
of the user interface. How about we have that discussion later?

Thanks,
Song


* Re: [PATCH bpf-next 1/3] perf: enable branch record for software events
  2021-08-25 15:22     ` Song Liu
@ 2021-08-26  7:56       ` kajoljain
  2021-08-26 16:41         ` Song Liu
  0 siblings, 1 reply; 9+ messages in thread
From: kajoljain @ 2021-08-26  7:56 UTC (permalink / raw)
  To: Song Liu, Peter Zijlstra
  Cc: open list:BPF (Safe dynamic programs and tools),
	LKML, Arnaldo Carvalho de Melo, Ingo Molnar, Kernel Team



On 8/25/21 8:52 PM, Song Liu wrote:
> 
> 
>> On Aug 25, 2021, at 5:09 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Mon, Aug 23, 2021 at 11:01:55PM -0700, Song Liu wrote:
>>
>>> arch/x86/events/intel/core.c |  5 ++++-
>>> arch/x86/events/intel/lbr.c  | 12 ++++++++++++
>>> arch/x86/events/perf_event.h |  2 ++
>>> include/linux/perf_event.h   | 33 +++++++++++++++++++++++++++++++++
>>> kernel/events/core.c         | 28 ++++++++++++++++++++++++++++
>>> 5 files changed, 79 insertions(+), 1 deletion(-)
>>
>> No PowerPC support :/
> 
> I don't have a PowerPC system for testing at the moment. Could we settle the
> overall framework now, and ask the PowerPC folks for help with PowerPC
> support later?

Hi Song,
   I will look at the powerpc side to enable this.

Thanks,
Kajol Jain

> 
>>
>>> +void intel_pmu_snapshot_branch_stack(void)
>>> +{
>>> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>>> +
>>> +	intel_pmu_lbr_disable_all();
>>> +	intel_pmu_lbr_read();
>>> +	memcpy(this_cpu_ptr(&perf_branch_snapshot_entries), cpuc->lbr_entries,
>>> +	       sizeof(struct perf_branch_entry) * x86_pmu.lbr_nr);
>>> +	*this_cpu_ptr(&perf_branch_snapshot_size) = x86_pmu.lbr_nr;
>>> +	intel_pmu_lbr_enable_all(false);
>>> +}
>>
>> Still has the layering violation and issues vs PMI.
> 
> Yes, this is the biggest change after I tested this more. I tried
> perf_pmu_[disable|enable]() and a function pointer in "struct pmu". However,
> all of this logic consumes LBR entries. In one version, 22 of the 32 LBR
> entries were branches taken after the fexit event, most of them from
> perf_pmu_disable(). Each function pointer call also consumes 1 or 2 entries.
> This would be worse for systems with fewer LBR entries.
>
> On the other hand, I think the current version is not too bad. It may corrupt
> some samples when this collides with a PMI, but it should not cause serious
> issues. Did I miss anything more serious?
> 
>>
>>> +#ifdef CONFIG_HAVE_STATIC_CALL
>>> +DECLARE_STATIC_CALL(perf_snapshot_branch_stack,
>>> +		    perf_default_snapshot_branch_stack);
>>> +#else
>>> +extern void (*perf_snapshot_branch_stack)(void);
>>> +#endif
>>
>> That's weird, static call should work unconditionally, and fall back to
>> a regular function pointer exactly like you do here. Search for:
>> "Generic Implementation" in include/linux/static_call.h
> 
> Thanks for the pointer. Let me look into it. 
>>
>>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>>> index 011cc5069b7ba..b42cc20451709 100644
>>> --- a/kernel/events/core.c
>>> +++ b/kernel/events/core.c
>>
>>> +#ifdef CONFIG_HAVE_STATIC_CALL
>>> +DEFINE_STATIC_CALL(perf_snapshot_branch_stack,
>>> +		   perf_default_snapshot_branch_stack);
>>> +#else
>>> +void (*perf_snapshot_branch_stack)(void) = perf_default_snapshot_branch_stack;
>>> +#endif
>>
>> Idem.
>>
>> Something like:
>>
>> DEFINE_STATIC_CALL_NULL(perf_snapshot_branch_stack, void (*)(void));
>>
>> with usage like: static_call_cond(perf_snapshot_branch_stack)();
>>
>> Should unconditionally work.
>>
>>> +int perf_read_branch_snapshot(void *buf, size_t len)
>>> +{
>>> +	int cnt;
>>> +
>>> +	memcpy(buf, *this_cpu_ptr(&perf_branch_snapshot_entries),
>>> +	       min_t(u32, (u32)len,
>>> +		     sizeof(struct perf_branch_entry) * MAX_BRANCH_SNAPSHOT));
>>> +	cnt =  *this_cpu_ptr(&perf_branch_snapshot_size);
>>> +
>>> +	return (cnt > 0) ? cnt : -EOPNOTSUPP;
>>> +}
>>
>> Doesn't seem used at all..
> 
> At the moment, this is only used from the BPF side (see 2/3). We can
> certainly use it from the perf side too, but that would require a discussion
> of the user interface. How about we have that discussion later?
> 
> Thanks,
> Song
> 


* Re: [PATCH bpf-next 1/3] perf: enable branch record for software events
  2021-08-26  7:56       ` kajoljain
@ 2021-08-26 16:41         ` Song Liu
  0 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2021-08-26 16:41 UTC (permalink / raw)
  To: kajoljain
  Cc: Peter Zijlstra, open list:BPF (Safe dynamic programs and tools),
	LKML, Arnaldo Carvalho de Melo, Ingo Molnar, Kernel Team



> On Aug 26, 2021, at 12:56 AM, kajoljain <kjain@linux.ibm.com> wrote:
> 
> 
> 
> On 8/25/21 8:52 PM, Song Liu wrote:
>> 
>> 
>>> On Aug 25, 2021, at 5:09 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>>> 
>>> On Mon, Aug 23, 2021 at 11:01:55PM -0700, Song Liu wrote:
>>> 
>>>> arch/x86/events/intel/core.c |  5 ++++-
>>>> arch/x86/events/intel/lbr.c  | 12 ++++++++++++
>>>> arch/x86/events/perf_event.h |  2 ++
>>>> include/linux/perf_event.h   | 33 +++++++++++++++++++++++++++++++++
>>>> kernel/events/core.c         | 28 ++++++++++++++++++++++++++++
>>>> 5 files changed, 79 insertions(+), 1 deletion(-)
>>> 
>>> No PowerPC support :/
>> 
>> I don't have a PowerPC system for testing at the moment. Could we settle the
>> overall framework now, and ask the PowerPC folks for help with PowerPC
>> support later?
> 
> Hi Song,
>   I will look at the powerpc side to enable this.
> 
> Thanks,
> Kajol Jain

Thanks Kajol! 

Let me address Peter's other comments and send v2. We can then merge in
PowerPC support.

Song


Thread overview: 9+ messages
2021-08-24  6:01 [PATCH bpf-next 0/3] bpf: introduce bpf_get_branch_trace Song Liu
2021-08-24  6:01 ` [PATCH bpf-next 1/3] perf: enable branch record for software events Song Liu
2021-08-25 12:09   ` Peter Zijlstra
2021-08-25 15:22     ` Song Liu
2021-08-26  7:56       ` kajoljain
2021-08-26 16:41         ` Song Liu
2021-08-24  6:01 ` [PATCH bpf-next 2/3] bpf: introduce helper bpf_get_branch_trace Song Liu
2021-08-25  1:14   ` kernel test robot
2021-08-24  6:01 ` [PATCH bpf-next 3/3] selftests/bpf: add test for bpf_get_branch_trace Song Liu
