All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/4] Introduce perf-stat -b for BPF programs
@ 2020-12-28 17:40 Song Liu
  2020-12-28 17:40 ` [PATCH v6 1/4] bpftool: add Makefile target bootstrap Song Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, kernel-team, Song Liu

This set introduces perf-stat -b option to count events for BPF programs.
This is similar to bpftool-prog-profile. But perf-stat makes it much more
flexible.

Changes v5 => v6
  1. Update the name for bootstrap bpftool. (Jiri)

Changes v4 => v5:
  1. Add documentation. (Jiri)
  2. Silent make output for removing .bpf.o file. (Jiri)

Changes v3 => v4:
  1. Split changes in bpftool/Makefile to a separate patch
  2. Various small fixes. (Jiri)

Changes v2 => v3:
  1. Small fixes in Makefile.perf and bpf_counter.c (Jiri)
  2. Rebased on top of bpf-next. This is because 1/2 conflicts with some
     patches in bpftool/Makefile.

Changes PATCH v1 => PATCH v2:
  1. Various fixes in Makefiles. (Jiri)
  2. Fix an build warning/error with gcc-10. (Jiri)

Changes RFC v2 => PATCH v1:
  1. Support counting on multiple BPF programs.
  2. Add BPF handling to target__validate().
  3. Improve Makefile. (Jiri)

Changes RFC v1 => RFC v2:
  1. Use bootstrap version of bpftool. (Jiri)
  2. Set default to not building bpf skeletons. (Jiri)
  3. Remove util/bpf_skel/Makefile, keep all the logic in Makefile.perf.
     (Jiri)
  4. Remove dependency to vmlinux.h in the two skeletons. The goal here is
     to enable building perf without building kernel (vmlinux) first.
     Note: I also removed the logic that build vmlinux.h. We can add that
     back when we have to use it (to access big kernel structures).

Song Liu (4):
  bpftool: add Makefile target bootstrap
  perf: support build BPF skeletons with perf
  perf-stat: enable counting events for BPF programs
  perf-stat: add documentation for -b option

 tools/bpf/bpftool/Makefile                    |   2 +
 tools/build/Makefile.feature                  |   4 +-
 tools/perf/Documentation/perf-stat.txt        |  14 +
 tools/perf/Makefile.config                    |   9 +
 tools/perf/Makefile.perf                      |  49 ++-
 tools/perf/builtin-stat.c                     |  77 ++++-
 tools/perf/util/Build                         |   1 +
 tools/perf/util/bpf_counter.c                 | 296 ++++++++++++++++++
 tools/perf/util/bpf_counter.h                 |  72 +++++
 tools/perf/util/bpf_skel/.gitignore           |   3 +
 .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  93 ++++++
 tools/perf/util/evsel.c                       |   9 +
 tools/perf/util/evsel.h                       |   6 +
 tools/perf/util/stat-display.c                |   4 +-
 tools/perf/util/stat.c                        |   2 +-
 tools/perf/util/target.c                      |  34 +-
 tools/perf/util/target.h                      |  10 +
 tools/scripts/Makefile.include                |   1 +
 18 files changed, 666 insertions(+), 20 deletions(-)
 create mode 100644 tools/perf/util/bpf_counter.c
 create mode 100644 tools/perf/util/bpf_counter.h
 create mode 100644 tools/perf/util/bpf_skel/.gitignore
 create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c

--
2.24.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v6 1/4] bpftool: add Makefile target bootstrap
  2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu
@ 2020-12-28 17:40 ` Song Liu
  2020-12-28 17:40 ` [PATCH v6 2/4] perf: support build BPF skeletons with perf Song Liu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, kernel-team, Song Liu

This target is used to only build the bootstrap bpftool, which will be
used to generate bpf skeletons for other tools, like perf.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/bpf/bpftool/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index f897cb5fb12d0..e3292a3a0c461 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -148,6 +148,8 @@ VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux)				\
 		     /boot/vmlinux-$(shell uname -r)
 VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
 
+bootstrap: $(BPFTOOL_BOOTSTRAP)
+
 ifneq ($(VMLINUX_BTF)$(VMLINUX_H),)
 ifeq ($(feature-clang-bpf-co-re),1)
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v6 2/4] perf: support build BPF skeletons with perf
  2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu
  2020-12-28 17:40 ` [PATCH v6 1/4] bpftool: add Makefile target bootstrap Song Liu
@ 2020-12-28 17:40 ` Song Liu
  2020-12-29  7:01   ` Namhyung Kim
  2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu
  2020-12-28 17:40 ` [PATCH v6 4/4] perf-stat: add documentation for -b option Song Liu
  3 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, kernel-team, Song Liu

BPF programs are useful in perf to profile BPF programs. BPF skeleton is
by far the easiest way to write BPF tools. Enable building BPF skeletons
in util/bpf_skel. A dummy bpf skeleton is added. More bpf skeletons will
be added for different use cases.

Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/build/Makefile.feature        |  4 ++-
 tools/perf/Makefile.config          |  9 ++++++
 tools/perf/Makefile.perf            | 49 +++++++++++++++++++++++++++--
 tools/perf/util/bpf_skel/.gitignore |  3 ++
 tools/scripts/Makefile.include      |  1 +
 5 files changed, 63 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/util/bpf_skel/.gitignore

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 97cbfb31b7625..74e255d58d8d0 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -99,7 +99,9 @@ FEATURE_TESTS_EXTRA :=                  \
          clang                          \
          libbpf                         \
          libpfm4                        \
-         libdebuginfod
+         libdebuginfod			\
+         clang-bpf-co-re
+
 
 FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)
 
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index ce8516e4de34f..d8e59d31399a5 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -621,6 +621,15 @@ ifndef NO_LIBBPF
   endif
 endif
 
+ifdef BUILD_BPF_SKEL
+  $(call feature_check,clang-bpf-co-re)
+  ifeq ($(feature-clang-bpf-co-re), 0)
+    dummy := $(error Error: clang too old. Please install recent clang)
+  endif
+  $(call detected,CONFIG_PERF_BPF_SKEL)
+  CFLAGS += -DHAVE_BPF_SKEL
+endif
+
 dwarf-post-unwind := 1
 dwarf-post-unwind-text := BUG
 
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 62f3deb1d3a8b..d182a2dbb9bbd 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -126,6 +126,8 @@ include ../scripts/utilities.mak
 #
 # Define NO_LIBDEBUGINFOD if you do not want support debuginfod
 #
+# Define BUILD_BPF_SKEL to enable BPF skeletons
+#
 
 # As per kernel Makefile, avoid funny character set dependencies
 unexport LC_ALL
@@ -175,6 +177,12 @@ endef
 
 LD += $(EXTRA_LDFLAGS)
 
+HOSTCC  ?= gcc
+HOSTLD  ?= ld
+HOSTAR  ?= ar
+CLANG   ?= clang
+LLVM_STRIP ?= llvm-strip
+
 PKG_CONFIG = $(CROSS_COMPILE)pkg-config
 LLVM_CONFIG ?= llvm-config
 
@@ -731,7 +739,8 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc
 	$(x86_arch_prctl_code_array) \
 	$(rename_flags_array) \
 	$(arch_errno_name_array) \
-	$(sync_file_range_arrays)
+	$(sync_file_range_arrays) \
+	bpf-skel
 
 $(OUTPUT)%.o: %.c prepare FORCE
 	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
@@ -1004,7 +1013,43 @@ config-clean:
 python-clean:
 	$(python-clean)
 
-clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean
+SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
+SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
+SKELETONS :=
+
+ifdef BUILD_BPF_SKEL
+BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
+LIBBPF_SRC := $(abspath ../lib/bpf)
+BPF_INCLUDE := -I$(SKEL_TMP_OUT)/.. -I$(BPF_PATH) -I$(LIBBPF_SRC)/..
+
+$(SKEL_TMP_OUT):
+	$(Q)$(MKDIR) -p $@
+
+$(BPFTOOL): | $(SKEL_TMP_OUT)
+	CFLAGS= $(MAKE) -C ../bpf/bpftool \
+		OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
+
+$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
+	$(QUIET_CLANG)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) \
+	  -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@
+
+$(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL)
+	$(QUIET_GENSKEL)$(BPFTOOL) gen skeleton $< > $@
+
+bpf-skel: $(SKELETONS)
+
+.PRECIOUS: $(SKEL_TMP_OUT)/%.bpf.o
+
+else # BUILD_BPF_SKEL
+
+bpf-skel:
+
+endif # BUILD_BPF_SKEL
+
+bpf-skel-clean:
+	$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
+
+clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean bpf-skel-clean
 	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
 	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
 	$(Q)$(RM) $(OUTPUT).config-detected
diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore
new file mode 100644
index 0000000000000..5263e9e6c5d83
--- /dev/null
+++ b/tools/perf/util/bpf_skel/.gitignore
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+.tmp
+*.skel.h
\ No newline at end of file
diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
index 1358e89cdf7d6..62119ce69ad9a 100644
--- a/tools/scripts/Makefile.include
+++ b/tools/scripts/Makefile.include
@@ -127,6 +127,7 @@ ifneq ($(silent),1)
 			 $(MAKE) $(PRINT_DIR) -C $$subdir
 	QUIET_FLEX     = @echo '  FLEX     '$@;
 	QUIET_BISON    = @echo '  BISON    '$@;
+	QUIET_GENSKEL  = @echo '  GEN-SKEL '$@;
 
 	descend = \
 		+@echo	       '  DESCEND  '$(1); \
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu
  2020-12-28 17:40 ` [PATCH v6 1/4] bpftool: add Makefile target bootstrap Song Liu
  2020-12-28 17:40 ` [PATCH v6 2/4] perf: support build BPF skeletons with perf Song Liu
@ 2020-12-28 17:40 ` Song Liu
  2020-12-28 20:11   ` Arnaldo Carvalho de Melo
  2020-12-29  7:22   ` Namhyung Kim
  2020-12-28 17:40 ` [PATCH v6 4/4] perf-stat: add documentation for -b option Song Liu
  3 siblings, 2 replies; 25+ messages in thread
From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, kernel-team, Song Liu

Introduce perf-stat -b option, which counts events for BPF programs, like:

[root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
     1.487903822            115,200      ref-cycles
     1.487903822             86,012      cycles
     2.489147029             80,560      ref-cycles
     2.489147029             73,784      cycles
     3.490341825             60,720      ref-cycles
     3.490341825             37,797      cycles
     4.491540887             37,120      ref-cycles
     4.491540887             31,963      cycles

The example above counts cycles and ref-cycles of BPF program of id 254.
This is similar to bpftool-prog-profile command, but more flexible.

perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
programs (monitor-progs) to the target BPF program (target-prog). The
monitor-progs read perf_event before and after the target-prog, and
aggregate the difference in a BPF map. Then the user space reads data
from these maps.

A new struct bpf_counter is introduced to provide common interface that
uses BPF programs/maps to count perf events.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/Makefile.perf                      |   2 +-
 tools/perf/builtin-stat.c                     |  77 ++++-
 tools/perf/util/Build                         |   1 +
 tools/perf/util/bpf_counter.c                 | 296 ++++++++++++++++++
 tools/perf/util/bpf_counter.h                 |  72 +++++
 .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  93 ++++++
 tools/perf/util/evsel.c                       |   9 +
 tools/perf/util/evsel.h                       |   6 +
 tools/perf/util/stat-display.c                |   4 +-
 tools/perf/util/stat.c                        |   2 +-
 tools/perf/util/target.c                      |  34 +-
 tools/perf/util/target.h                      |  10 +
 12 files changed, 588 insertions(+), 18 deletions(-)
 create mode 100644 tools/perf/util/bpf_counter.c
 create mode 100644 tools/perf/util/bpf_counter.h
 create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index d182a2dbb9bbd..8c4e039c3b813 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1015,7 +1015,7 @@ python-clean:
 
 SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
 SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
-SKELETONS :=
+SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
 
 ifdef BUILD_BPF_SKEL
 BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 8cc24967bc273..09bffb3fbcdd4 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -67,6 +67,7 @@
 #include "util/top.h"
 #include "util/affinity.h"
 #include "util/pfm.h"
+#include "util/bpf_counter.h"
 #include "asm/bug.h"
 
 #include <linux/time64.h>
@@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs)
 	return 0;
 }
 
+static int read_bpf_map_counters(void)
+{
+	struct evsel *counter;
+	int err;
+
+	evlist__for_each_entry(evsel_list, counter) {
+		err = bpf_counter__read(counter);
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
 static void read_counters(struct timespec *rs)
 {
 	struct evsel *counter;
+	int err;
 
-	if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0))
-		return;
+	if (!stat_config.stop_read_counter) {
+		err = read_bpf_map_counters();
+		if (err == -EAGAIN)
+			err = read_affinity_counters(rs);
+		if (err < 0)
+			return;
+	}
 
 	evlist__for_each_entry(evsel_list, counter) {
 		if (counter->err)
@@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times)
 	return false;
 }
 
-static void enable_counters(void)
+static int enable_counters(void)
 {
+	struct evsel *evsel;
+	int err;
+
+	evlist__for_each_entry(evsel_list, evsel) {
+		err = bpf_counter__enable(evsel);
+		if (err)
+			return err;
+	}
+
 	if (stat_config.initial_delay < 0) {
 		pr_info(EVLIST_DISABLED_MSG);
-		return;
+		return 0;
 	}
 
 	if (stat_config.initial_delay > 0) {
@@ -518,6 +547,7 @@ static void enable_counters(void)
 		if (stat_config.initial_delay > 0)
 			pr_info(EVLIST_ENABLED_MSG);
 	}
+	return 0;
 }
 
 static void disable_counters(void)
@@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	const bool forks = (argc > 0);
 	bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
 	struct affinity affinity;
-	int i, cpu;
+	int i, cpu, err;
 	bool second_pass = false;
 
 	if (forks) {
@@ -737,6 +767,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	if (affinity__setup(&affinity) < 0)
 		return -1;
 
+	evlist__for_each_entry(evsel_list, counter) {
+		if (bpf_counter__load(counter, &target))
+			return -1;
+	}
+
 	evlist__for_each_cpu (evsel_list, i, cpu) {
 		affinity__set(&affinity, cpu);
 
@@ -850,7 +885,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	}
 
 	if (STAT_RECORD) {
-		int err, fd = perf_data__fd(&perf_stat.data);
+		int fd = perf_data__fd(&perf_stat.data);
 
 		if (is_pipe) {
 			err = perf_header__write_pipe(perf_data__fd(&perf_stat.data));
@@ -876,7 +911,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 
 	if (forks) {
 		evlist__start_workload(evsel_list);
-		enable_counters();
+		err = enable_counters();
+		if (err)
+			return -1;
 
 		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
 			status = dispatch_events(forks, timeout, interval, &times);
@@ -895,7 +932,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 		if (WIFSIGNALED(status))
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
-		enable_counters();
+		err = enable_counters();
+		if (err)
+			return -1;
 		status = dispatch_events(forks, timeout, interval, &times);
 	}
 
@@ -1085,6 +1124,10 @@ static struct option stat_options[] = {
 		   "stat events on existing process id"),
 	OPT_STRING('t', "tid", &target.tid, "tid",
 		   "stat events on existing thread id"),
+#ifdef HAVE_BPF_SKEL
+	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
+		   "stat events on existing bpf program id"),
+#endif
 	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_BOOLEAN('g', "group", &group,
@@ -2064,11 +2107,12 @@ int cmd_stat(int argc, const char **argv)
 		"perf stat [<options>] [<command>]",
 		NULL
 	};
-	int status = -EINVAL, run_idx;
+	int status = -EINVAL, run_idx, err;
 	const char *mode;
 	FILE *output = stderr;
 	unsigned int interval, timeout;
 	const char * const stat_subcommands[] = { "record", "report" };
+	char errbuf[BUFSIZ];
 
 	setlocale(LC_ALL, "");
 
@@ -2179,6 +2223,12 @@ int cmd_stat(int argc, const char **argv)
 	} else if (big_num_opt == 0) /* User passed --no-big-num */
 		stat_config.big_num = false;
 
+	err = target__validate(&target);
+	if (err) {
+		target__strerror(&target, err, errbuf, BUFSIZ);
+		pr_warning("%s\n", errbuf);
+	}
+
 	setup_system_wide(argc);
 
 	/*
@@ -2252,8 +2302,6 @@ int cmd_stat(int argc, const char **argv)
 		}
 	}
 
-	target__validate(&target);
-
 	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
 		target.per_thread = true;
 
@@ -2384,9 +2432,10 @@ int cmd_stat(int argc, const char **argv)
 		 * tools remain  -acme
 		 */
 		int fd = perf_data__fd(&perf_stat.data);
-		int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
-							     process_synthesized_event,
-							     &perf_stat.session->machines.host);
+
+		err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
+							 process_synthesized_event,
+							 &perf_stat.session->machines.host);
 		if (err) {
 			pr_warning("Couldn't synthesize the kernel mmap record, harmless, "
 				   "older tools may produce warnings about this file\n.");
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index e2563d0154eb6..188521f343470 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -135,6 +135,7 @@ perf-y += clockid.o
 
 perf-$(CONFIG_LIBBPF) += bpf-loader.o
 perf-$(CONFIG_LIBBPF) += bpf_map.o
+perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
 perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
 perf-$(CONFIG_LIBELF) += symbol-elf.o
 perf-$(CONFIG_LIBELF) += probe-file.o
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
new file mode 100644
index 0000000000000..f2cb86a40c882
--- /dev/null
+++ b/tools/perf/util/bpf_counter.c
@@ -0,0 +1,296 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2019 Facebook */
+
+#include <limits.h>
+#include <unistd.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <linux/err.h>
+#include <linux/zalloc.h>
+#include <bpf/bpf.h>
+#include <bpf/btf.h>
+#include <bpf/libbpf.h>
+
+#include "bpf_counter.h"
+#include "counts.h"
+#include "debug.h"
+#include "evsel.h"
+#include "target.h"
+
+#include "bpf_skel/bpf_prog_profiler.skel.h"
+
+static inline void *u64_to_ptr(__u64 ptr)
+{
+	return (void *)(unsigned long)ptr;
+}
+
+static void set_max_rlimit(void)
+{
+	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
+
+	setrlimit(RLIMIT_MEMLOCK, &rinf);
+}
+
+static struct bpf_counter *bpf_counter_alloc(void)
+{
+	struct bpf_counter *counter;
+
+	counter = zalloc(sizeof(*counter));
+	if (counter)
+		INIT_LIST_HEAD(&counter->list);
+	return counter;
+}
+
+static int bpf_program_profiler__destroy(struct evsel *evsel)
+{
+	struct bpf_counter *counter;
+
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list)
+		bpf_prog_profiler_bpf__destroy(counter->skel);
+	INIT_LIST_HEAD(&evsel->bpf_counter_list);
+	return 0;
+}
+
+static char *bpf_target_prog_name(int tgt_fd)
+{
+	struct bpf_prog_info_linear *info_linear;
+	struct bpf_func_info *func_info;
+	const struct btf_type *t;
+	char *name = NULL;
+	struct btf *btf;
+
+	info_linear = bpf_program__get_prog_info_linear(
+		tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO);
+	if (IS_ERR_OR_NULL(info_linear)) {
+		pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd);
+		return NULL;
+	}
+
+	if (info_linear->info.btf_id == 0 ||
+	    btf__get_from_id(info_linear->info.btf_id, &btf)) {
+		pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd);
+		goto out;
+	}
+
+	func_info = u64_to_ptr(info_linear->info.func_info);
+	t = btf__type_by_id(btf, func_info[0].type_id);
+	if (!t) {
+		pr_debug("btf %d doesn't have type %d\n",
+			 info_linear->info.btf_id, func_info[0].type_id);
+		goto out;
+	}
+	name = strdup(btf__name_by_offset(btf, t->name_off));
+out:
+	free(info_linear);
+	return name;
+}
+
+static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
+{
+	struct bpf_prog_profiler_bpf *skel;
+	struct bpf_counter *counter;
+	struct bpf_program *prog;
+	char *prog_name;
+	int prog_fd;
+	int err;
+
+	prog_fd = bpf_prog_get_fd_by_id(prog_id);
+	if (prog_fd < 0) {
+		pr_err("Failed to open fd for bpf prog %u\n", prog_id);
+		return -1;
+	}
+	counter = bpf_counter_alloc();
+	if (!counter) {
+		close(prog_fd);
+		return -1;
+	}
+
+	skel = bpf_prog_profiler_bpf__open();
+	if (!skel) {
+		pr_err("Failed to open bpf skeleton\n");
+		goto err_out;
+	}
+	skel->rodata->num_cpu = evsel__nr_cpus(evsel);
+
+	bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
+	bpf_map__resize(skel->maps.fentry_readings, 1);
+	bpf_map__resize(skel->maps.accum_readings, 1);
+
+	prog_name = bpf_target_prog_name(prog_fd);
+	if (!prog_name) {
+		pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id);
+		goto err_out;
+	}
+
+	bpf_object__for_each_program(prog, skel->obj) {
+		err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
+		if (err) {
+			pr_err("bpf_program__set_attach_target failed.\n"
+			       "Does bpf prog %u have BTF?\n", prog_id);
+			goto err_out;
+		}
+	}
+	set_max_rlimit();
+	err = bpf_prog_profiler_bpf__load(skel);
+	if (err) {
+		pr_err("bpf_prog_profiler_bpf__load failed\n");
+		goto err_out;
+	}
+
+	counter->skel = skel;
+	list_add(&counter->list, &evsel->bpf_counter_list);
+	close(prog_fd);
+	return 0;
+err_out:
+	free(counter);
+	close(prog_fd);
+	return -1;
+}
+
+static int bpf_program_profiler__load(struct evsel *evsel, struct target *target)
+{
+	char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p;
+	u32 prog_id;
+	int ret;
+
+	bpf_str_ = bpf_str = strdup(target->bpf_str);
+	if (!bpf_str)
+		return -1;
+
+	while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) {
+		prog_id = strtoul(tok, &p, 10);
+		if (prog_id == 0 || prog_id == UINT_MAX ||
+		    (*p != '\0' && *p != ',')) {
+			pr_err("Failed to parse bpf prog ids %s\n",
+			       target->bpf_str);
+			return -1;
+		}
+
+		ret = bpf_program_profiler_load_one(evsel, prog_id);
+		if (ret) {
+			bpf_program_profiler__destroy(evsel);
+			free(bpf_str_);
+			return -1;
+		}
+		bpf_str = NULL;
+	}
+	free(bpf_str_);
+	return 0;
+}
+
+static int bpf_program_profiler__enable(struct evsel *evsel)
+{
+	struct bpf_counter *counter;
+	int ret;
+
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		ret = bpf_prog_profiler_bpf__attach(counter->skel);
+		if (ret) {
+			bpf_program_profiler__destroy(evsel);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int bpf_program_profiler__read(struct evsel *evsel)
+{
+	int num_cpu = evsel__nr_cpus(evsel);
+	struct bpf_perf_event_value values[num_cpu];
+	struct bpf_counter *counter;
+	int reading_map_fd;
+	__u32 key = 0;
+	int err, cpu;
+
+	if (list_empty(&evsel->bpf_counter_list))
+		return -EAGAIN;
+
+	for (cpu = 0; cpu < num_cpu; cpu++) {
+		perf_counts(evsel->counts, cpu, 0)->val = 0;
+		perf_counts(evsel->counts, cpu, 0)->ena = 0;
+		perf_counts(evsel->counts, cpu, 0)->run = 0;
+	}
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		struct bpf_prog_profiler_bpf *skel = counter->skel;
+
+		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
+
+		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
+		if (err) {
+			fprintf(stderr, "failed to read value\n");
+			return err;
+		}
+
+		for (cpu = 0; cpu < num_cpu; cpu++) {
+			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
+			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
+			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
+		}
+	}
+	return 0;
+}
+
+static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
+					    int fd)
+{
+	struct bpf_prog_profiler_bpf *skel;
+	struct bpf_counter *counter;
+	int ret;
+
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		skel = counter->skel;
+		ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
+					  &cpu, &fd, BPF_ANY);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+struct bpf_counter_ops bpf_program_profiler_ops = {
+	.load       = bpf_program_profiler__load,
+	.enable	    = bpf_program_profiler__enable,
+	.read       = bpf_program_profiler__read,
+	.destroy    = bpf_program_profiler__destroy,
+	.install_pe = bpf_program_profiler__install_pe,
+};
+
+int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return 0;
+	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
+}
+
+int bpf_counter__load(struct evsel *evsel, struct target *target)
+{
+	if (target__has_bpf(target))
+		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
+
+	if (evsel->bpf_counter_ops)
+		return evsel->bpf_counter_ops->load(evsel, target);
+	return 0;
+}
+
+int bpf_counter__enable(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return 0;
+	return evsel->bpf_counter_ops->enable(evsel);
+}
+
+int bpf_counter__read(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return -EAGAIN;
+	return evsel->bpf_counter_ops->read(evsel);
+}
+
+void bpf_counter__destroy(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return;
+	evsel->bpf_counter_ops->destroy(evsel);
+	evsel->bpf_counter_ops = NULL;
+}
diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h
new file mode 100644
index 0000000000000..2eca210e5dc16
--- /dev/null
+++ b/tools/perf/util/bpf_counter.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BPF_COUNTER_H
+#define __PERF_BPF_COUNTER_H 1
+
+#include <linux/list.h>
+
+struct evsel;
+struct target;
+struct bpf_counter;
+
+typedef int (*bpf_counter_evsel_op)(struct evsel *evsel);
+typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel,
+					   struct target *target);
+typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel,
+					       int cpu,
+					       int fd);
+
+struct bpf_counter_ops {
+	bpf_counter_evsel_target_op load;
+	bpf_counter_evsel_op enable;
+	bpf_counter_evsel_op read;
+	bpf_counter_evsel_op destroy;
+	bpf_counter_evsel_install_pe_op install_pe;
+};
+
+struct bpf_counter {
+	void *skel;
+	struct list_head list;
+};
+
+#ifdef HAVE_BPF_SKEL
+
+int bpf_counter__load(struct evsel *evsel, struct target *target);
+int bpf_counter__enable(struct evsel *evsel);
+int bpf_counter__read(struct evsel *evsel);
+void bpf_counter__destroy(struct evsel *evsel);
+int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
+
+#else /* HAVE_BPF_SKEL */
+
+#include<linux/err.h>
+
+static inline int bpf_counter__load(struct evsel *evsel __maybe_unused,
+				    struct target *target __maybe_unused)
+{
+	return 0;
+}
+
+static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static inline int bpf_counter__read(struct evsel *evsel __maybe_unused)
+{
+	return -EAGAIN;
+}
+
+static inline void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
+{
+}
+
+static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused,
+					  int cpu __maybe_unused,
+					  int fd __maybe_unused)
+{
+	return 0;
+}
+
+#endif /* HAVE_BPF_SKEL */
+
+#endif /* __PERF_BPF_COUNTER_H */
diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
new file mode 100644
index 0000000000000..c7cec92d02360
--- /dev/null
+++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2020 Facebook
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+/* map of perf event fds, num_cpu * num_metric entries */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(int));
+} events SEC(".maps");
+
+/* readings at fentry */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(struct bpf_perf_event_value));
+	__uint(max_entries, 1);
+} fentry_readings SEC(".maps");
+
+/* accumulated readings */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(struct bpf_perf_event_value));
+	__uint(max_entries, 1);
+} accum_readings SEC(".maps");
+
+const volatile __u32 num_cpu = 1;
+
+SEC("fentry/XXX")
+int BPF_PROG(fentry_XXX)
+{
+	__u32 key = bpf_get_smp_processor_id();
+	struct bpf_perf_event_value *ptr;
+	__u32 zero = 0;
+	long err;
+
+	/* look up before reading, to reduce error */
+	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
+	if (!ptr)
+		return 0;
+
+	err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr));
+	if (err)
+		return 0;
+
+	return 0;
+}
+
+static inline void
+fexit_update_maps(struct bpf_perf_event_value *after)
+{
+	struct bpf_perf_event_value *before, diff, *accum;
+	__u32 zero = 0;
+
+	before = bpf_map_lookup_elem(&fentry_readings, &zero);
+	/* only account samples with a valid fentry_reading */
+	if (before && before->counter) {
+		struct bpf_perf_event_value *accum;
+
+		diff.counter = after->counter - before->counter;
+		diff.enabled = after->enabled - before->enabled;
+		diff.running = after->running - before->running;
+
+		accum = bpf_map_lookup_elem(&accum_readings, &zero);
+		if (accum) {
+			accum->counter += diff.counter;
+			accum->enabled += diff.enabled;
+			accum->running += diff.running;
+		}
+	}
+}
+
+SEC("fexit/XXX")
+int BPF_PROG(fexit_XXX)
+{
+	struct bpf_perf_event_value reading;
+	__u32 cpu = bpf_get_smp_processor_id();
+	__u32 one = 1, zero = 0;
+	int err;
+
+	/* read all events before updating the maps, to reduce error */
+	err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading));
+	if (err)
+		return 0;
+
+	fexit_update_maps(&reading);
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "Dual BSD/GPL";
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c26ea82220bd8..7265308765d73 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -25,6 +25,7 @@
 #include <stdlib.h>
 #include <perf/evsel.h>
 #include "asm/bug.h"
+#include "bpf_counter.h"
 #include "callchain.h"
 #include "cgroup.h"
 #include "counts.h"
@@ -51,6 +52,10 @@
 #include <internal/lib.h>
 
 #include <linux/ctype.h>
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#include <bpf/btf.h>
+#include "rlimit.h"
 
 struct perf_missing_features perf_missing_features;
 
@@ -247,6 +252,7 @@ void evsel__init(struct evsel *evsel,
 	evsel->bpf_obj	   = NULL;
 	evsel->bpf_fd	   = -1;
 	INIT_LIST_HEAD(&evsel->config_terms);
+	INIT_LIST_HEAD(&evsel->bpf_counter_list);
 	perf_evsel__object.init(evsel);
 	evsel->sample_size = __evsel__sample_size(attr->sample_type);
 	evsel__calc_id_pos(evsel);
@@ -1366,6 +1372,7 @@ void evsel__exit(struct evsel *evsel)
 {
 	assert(list_empty(&evsel->core.node));
 	assert(evsel->evlist == NULL);
+	bpf_counter__destroy(evsel);
 	evsel__free_counts(evsel);
 	perf_evsel__free_fd(&evsel->core);
 	perf_evsel__free_id(&evsel->core);
@@ -1781,6 +1788,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 
 			FD(evsel, cpu, thread) = fd;
 
+			bpf_counter__install_pe(evsel, cpu, fd);
+
 			if (unlikely(test_attr__enabled)) {
 				test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
 						fd, group_fd, flags);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index cd1d8dd431997..40e3946cd7518 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -10,6 +10,7 @@
 #include <internal/evsel.h>
 #include <perf/evsel.h>
 #include "symbol_conf.h"
+#include "bpf_counter.h"
 #include <internal/cpumap.h>
 
 struct bpf_object;
@@ -17,6 +18,8 @@ struct cgroup;
 struct perf_counts;
 struct perf_stat_evsel;
 union perf_event;
+struct bpf_counter_ops;
+struct target;
 
 typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
 
@@ -127,6 +130,8 @@ struct evsel {
 	 * See also evsel__has_callchain().
 	 */
 	__u64			synth_sample_type;
+	struct list_head	bpf_counter_list;
+	struct bpf_counter_ops	*bpf_counter_ops;
 };
 
 struct perf_missing_features {
@@ -424,4 +429,5 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel)
 struct perf_env *evsel__env(struct evsel *evsel);
 
 int evsel__store_ids(struct evsel *evsel, struct evlist *evlist);
+
 #endif /* __PERF_EVSEL_H */
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 583ae4f09c5d1..cce7a76d6473c 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -1045,7 +1045,9 @@ static void print_header(struct perf_stat_config *config,
 	if (!config->csv_output) {
 		fprintf(output, "\n");
 		fprintf(output, " Performance counter stats for ");
-		if (_target->system_wide)
+		if (_target->bpf_str)
+			fprintf(output, "\'BPF program(s) %s", _target->bpf_str);
+		else if (_target->system_wide)
 			fprintf(output, "\'system wide");
 		else if (_target->cpu_list)
 			fprintf(output, "\'CPU(s) %s", _target->cpu_list);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 8ce1479c98f03..0b3957323f668 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -527,7 +527,7 @@ int create_perf_stat_counter(struct evsel *evsel,
 	if (leader->core.nr_members > 1)
 		attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
 
-	attr->inherit = !config->no_inherit;
+	attr->inherit = !config->no_inherit && list_empty(&evsel->bpf_counter_list);
 
 	/*
 	 * Some events get initialized with sample_(period/type) set,
diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
index a3db13dea937c..0f383418e3df5 100644
--- a/tools/perf/util/target.c
+++ b/tools/perf/util/target.c
@@ -56,6 +56,34 @@ enum target_errno target__validate(struct target *target)
 			ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
 	}
 
+	/* BPF and CPU are mutually exclusive */
+	if (target->bpf_str && target->cpu_list) {
+		target->cpu_list = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_CPU;
+	}
+
+	/* BPF and PID/TID are mutually exclusive */
+	if (target->bpf_str && target->tid) {
+		target->tid = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_PID;
+	}
+
+	/* BPF and UID are mutually exclusive */
+	if (target->bpf_str && target->uid_str) {
+		target->uid_str = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_UID;
+	}
+
+	/* BPF and THREADS are mutually exclusive */
+	if (target->bpf_str && target->per_thread) {
+		target->per_thread = false;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD;
+	}
+
 	/* THREAD and SYSTEM/CPU are mutually exclusive */
 	if (target->per_thread && (target->system_wide || target->cpu_list)) {
 		target->per_thread = false;
@@ -109,6 +137,10 @@ static const char *target__error_str[] = {
 	"PID/TID switch overriding SYSTEM",
 	"UID switch overriding SYSTEM",
 	"SYSTEM/CPU switch overriding PER-THREAD",
+	"BPF switch overriding CPU",
+	"BPF switch overriding PID/TID",
+	"BPF switch overriding UID",
+	"BPF switch overriding THREAD",
 	"Invalid User: %s",
 	"Problems obtaining information for user %s",
 };
@@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum,
 
 	switch (errnum) {
 	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
-	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
+	     TARGET_ERRNO__BPF_OVERRIDE_THREAD:
 		snprintf(buf, buflen, "%s", msg);
 		break;
 
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 6ef01a83b24e9..f132c6c2eef81 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -10,6 +10,7 @@ struct target {
 	const char   *tid;
 	const char   *cpu_list;
 	const char   *uid_str;
+	const char   *bpf_str;
 	uid_t	     uid;
 	bool	     system_wide;
 	bool	     uses_mmap;
@@ -36,6 +37,10 @@ enum target_errno {
 	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
 	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
 	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
+	TARGET_ERRNO__BPF_OVERRIDE_CPU,
+	TARGET_ERRNO__BPF_OVERRIDE_PID,
+	TARGET_ERRNO__BPF_OVERRIDE_UID,
+	TARGET_ERRNO__BPF_OVERRIDE_THREAD,
 
 	/* for target__parse_uid() */
 	TARGET_ERRNO__INVALID_UID,
@@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target)
 	return target->system_wide || target->cpu_list;
 }
 
+static inline bool target__has_bpf(struct target *target)
+{
+	return target->bpf_str;
+}
+
 static inline bool target__none(struct target *target)
 {
 	return !target__has_task(target) && !target__has_cpu(target);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v6 4/4] perf-stat: add documentation for -b option
  2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu
                   ` (2 preceding siblings ...)
  2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu
@ 2020-12-28 17:40 ` Song Liu
  2020-12-29  7:24   ` Namhyung Kim
  3 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, kernel-team, Song Liu

Add documentation to perf-stat -b option, which stats event for BPF
programs.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/Documentation/perf-stat.txt | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 5d4a673d7621a..15b9a646e853d 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -75,6 +75,20 @@ report::
 --tid=<tid>::
         stat events on existing thread id (comma separated list)
 
+-b::
+--bpf-prog::
+        stat events on existing bpf program id (comma separated list),
+        requiring root righs. For example:
+
+  # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000
+
+   Performance counter stats for 'BPF program(s) 17247':
+
+             85,967      cycles
+             28,982      instructions              #    0.34  insn per cycle
+
+        1.102235068 seconds time elapsed
+
 ifdef::HAVE_LIBPFM[]
 --pfm-events events::
 Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu
@ 2020-12-28 20:11   ` Arnaldo Carvalho de Melo
  2020-12-28 23:43     ` Song Liu
  2020-12-29  7:22   ` Namhyung Kim
  1 sibling, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-12-28 20:11 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, peterz, mingo, alexander.shishkin, namhyung,
	mark.rutland, jolsa, kernel-team

Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
> Introduce perf-stat -b option, which counts events for BPF programs, like:
> 
> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>      1.487903822            115,200      ref-cycles
>      1.487903822             86,012      cycles
>      2.489147029             80,560      ref-cycles
>      2.489147029             73,784      cycles
>      3.490341825             60,720      ref-cycles
>      3.490341825             37,797      cycles
>      4.491540887             37,120      ref-cycles
>      4.491540887             31,963      cycles
> 
> The example above counts cycles and ref-cycles of BPF program of id 254.
> This is similar to bpftool-prog-profile command, but more flexible.
> 
> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
> programs (monitor-progs) to the target BPF program (target-prog). The
> monitor-progs read perf_event before and after the target-prog, and
> aggregate the difference in a BPF map. Then the user space reads data
> from these maps.
> 
> A new struct bpf_counter is introduced to provide common interface that
> uses BPF programs/maps to count perf events.

Segfaulting here:

[root@five ~]# bpftool prog  | grep tracepoint
110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
[root@five ~]#
[root@five ~]# gdb perf
GNU gdb (GDB) Fedora 10.1-2.fc33
Reading symbols from perf...
(gdb) run stat -e instructions,cycles -b 113 -I 1000
Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame

Program received signal SIGSEGV, Segmentation fault.
0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
(gdb) bt
#0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
#1  0x0000000000000000 in ?? ()
(gdb)

[acme@five perf]$ clang -v |& head -2
clang version 11.0.0 (Fedora 11.0.0-2.fc33)
Target: x86_64-unknown-linux-gnu
[acme@five perf]$

Do you need any extra info?

Please when resubmitting, please combine patches 3/4 and 4/4, man pages
updates usually come together with the new feature.

Thanks,

- Arnaldo

Full build output:

[acme@five perf]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;make VF=1 O=/tmp/build/perf -C tools/perf BUILD_BPF_SKEL=1 install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
  BUILD:   Doing 'make -j24' parallel build
  HOSTCC   /tmp/build/perf/fixdep.o
  HOSTLD   /tmp/build/perf/fixdep-in.o
  LINK     /tmp/build/perf/fixdep
Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h

Auto-detecting system features:
...                         dwarf: [ on  ]
...            dwarf_getlocations: [ on  ]
...                         glibc: [ on  ]
...                        libbfd: [ on  ]
...                libbfd-buildid: [ on  ]
...                        libcap: [ on  ]
...                        libelf: [ on  ]
...                       libnuma: [ on  ]
...        numa_num_possible_cpus: [ on  ]
...                       libperl: [ on  ]
...                     libpython: [ on  ]
...                     libcrypto: [ on  ]
...                     libunwind: [ on  ]
...            libdw-dwarf-unwind: [ on  ]
...                          zlib: [ on  ]
...                          lzma: [ on  ]
...                     get_cpuid: [ on  ]
...                           bpf: [ on  ]
...                        libaio: [ on  ]
...                       libzstd: [ on  ]
...        disassembler-four-args: [ on  ]
...                     backtrace: [ on  ]
...                       eventfd: [ on  ]
...                fortify-source: [ on  ]
...         sync-compare-and-swap: [ on  ]
...          get_current_dir_name: [ on  ]
...                        gettid: [ on  ]
...             libelf-getphdrnum: [ on  ]
...           libelf-gelf_getnote: [ on  ]
...          libelf-getshdrstrndx: [ on  ]
...             libpython-version: [ on  ]
...                      libslang: [ on  ]
...       libslang-include-subdir: [ on  ]
...   pthread-attr-setaffinity-np: [ on  ]
...               pthread-barrier: [ on  ]
...                  reallocarray: [ on  ]
...            stackprotector-all: [ on  ]
...                       timerfd: [ on  ]
...                  sched_getcpu: [ on  ]
...                           sdt: [ on  ]
...                         setns: [ on  ]
...                   file-handle: [ on  ]

...                        bionic: [ OFF ]
...                    compile-32: [ OFF ]
...                   compile-x32: [ OFF ]
...                cplus-demangle: [ on  ]
...                          gtk2: [ OFF ]
...                  gtk2-infobar: [ OFF ]
...                         hello: [ OFF ]
...                 libbabeltrace: [ on  ]
...                libbfd-liberty: [ OFF ]
...              libbfd-liberty-z: [ OFF ]
...                    libopencsd: [ OFF ]
...                 libunwind-x86: [ OFF ]
...              libunwind-x86_64: [ OFF ]
...                 libunwind-arm: [ OFF ]
...             libunwind-aarch64: [ OFF ]
...         libunwind-debug-frame: [ OFF ]
...     libunwind-debug-frame-arm: [ OFF ]
... libunwind-debug-frame-aarch64: [ OFF ]
...                           cxx: [ OFF ]
...                          llvm: [ OFF ]
...                  llvm-version: [ OFF ]
...                         clang: [ OFF ]
...                        libbpf: [ OFF ]
...                       libpfm4: [ OFF ]
...                 libdebuginfod: [ on  ]
...               clang-bpf-co-re: [ on  ]
...                        prefix: /home/acme
...                        bindir: /home/acme/bin
...                        libdir: /home/acme/lib64
...                    sysconfdir: /home/acme/etc
...                 LIBUNWIND_DIR:
...                     LIBDW_DIR:
...                          JDIR: /usr/lib/jvm/java-11-openjdk-11.0.9.11-4.fc33.x86_64
...     DWARF post unwind library: libunwind

  GEN      /tmp/build/perf/common-cmds.h
CFLAGS= make -C ../bpf/bpftool \
	OUTPUT=/tmp/build/perf/util/bpf_skel/.tmp/ bootstrap
  CC       /tmp/build/perf/exec-cmd.o
  CC       /tmp/build/perf/help.o
  MKDIR    /tmp/build/perf/pmu-events/
  MKDIR    /tmp/build/perf/jvmti/
  MKDIR    /tmp/build/perf/fd/
  MKDIR    /tmp/build/perf/fs/
  MKDIR    /tmp/build/perf/fs/
  HOSTCC   /tmp/build/perf/pmu-events/json.o
  CC       /tmp/build/perf/parse-options.o
  CC       /tmp/build/perf/fd/array.o
  CC       /tmp/build/perf/pager.o
  CC       /tmp/build/perf/jvmti/libjvmti.o
  CC       /tmp/build/perf/fs/fs.o
  CC       /tmp/build/perf/run-command.o
  MKDIR    /tmp/build/perf/jvmti/
  CC       /tmp/build/perf/sigchain.o
  MKDIR    /tmp/build/perf/fs/
  CC       /tmp/build/perf/fs/tracing_path.o
  CC       /tmp/build/perf/fs/cgroup.o
  MKDIR    /tmp/build/perf/pmu-events/
  CC       /tmp/build/perf/jvmti/libstring.o
  HOSTCC   /tmp/build/perf/pmu-events/jevents.o
  CC       /tmp/build/perf/jvmti/libctype.o
  CC       /tmp/build/perf/subcmd-config.o
  CC       /tmp/build/perf/jvmti/jvmti_agent.o
  LD       /tmp/build/perf/fd/libapi-in.o
  CC       /tmp/build/perf/event-parse.o
  HOSTCC   /tmp/build/perf/pmu-events/jsmn.o
  CC       /tmp/build/perf/event-plugin.o
  CC       /tmp/build/perf/cpu.o
  CC       /tmp/build/perf/trace-seq.o
  CC       /tmp/build/perf/core.o
  CC       /tmp/build/perf/parse-filter.o
  CC       /tmp/build/perf/debug.o
  CC       /tmp/build/perf/cpumap.o
  LD       /tmp/build/perf/fs/libapi-in.o
  CC       /tmp/build/perf/threadmap.o
  LD       /tmp/build/perf/libsubcmd-in.o
  HOSTLD   /tmp/build/perf/pmu-events/jevents-in.o
  CC       /tmp/build/perf/str_error_r.o
  CC       /tmp/build/perf/evsel.o
  GEN      /tmp/build/perf/bpf_helper_defs.h
  CC       /tmp/build/perf/evlist.o
  CC       /tmp/build/perf/parse-utils.o
  CC       /tmp/build/perf/zalloc.o
  CC       /tmp/build/perf/kbuffer-parse.o
  LD       /tmp/build/perf/jvmti/jvmti-in.o
  CC       /tmp/build/perf/mmap.o
  CC       /tmp/build/perf/tep_strerror.o
  CC       /tmp/build/perf/xyarray.o
  CC       /tmp/build/perf/event-parse-api.o
  CC       /tmp/build/perf/lib.o
  LINK     /tmp/build/perf/pmu-events/jevents
  LD       /tmp/build/perf/libapi-in.o
  AR       /tmp/build/perf/libsubcmd.a
  LINK     /tmp/build/perf/libperf-jvmti.so
  LD       /tmp/build/perf/libtraceevent-in.o
  CC       /tmp/build/perf/plugin_hrtimer.o
  CC       /tmp/build/perf/plugin_kmem.o
  CC       /tmp/build/perf/plugin_mac80211.o
  CC       /tmp/build/perf/plugin_kvm.o
  CC       /tmp/build/perf/plugin_jbd2.o
  LD       /tmp/build/perf/libperf-in.o
  CC       /tmp/build/perf/plugin_sched_switch.o
  CC       /tmp/build/perf/plugin_function.o
  CC       /tmp/build/perf/plugin_scsi.o
  CC       /tmp/build/perf/plugin_xen.o
  CC       /tmp/build/perf/plugin_futex.o
  CC       /tmp/build/perf/plugin_cfg80211.o
  CC       /tmp/build/perf/plugin_tlb.o
  AR       /tmp/build/perf/libapi.a
  LD       /tmp/build/perf/plugin_hrtimer-in.o
  LINK     /tmp/build/perf/libtraceevent.a
  LD       /tmp/build/perf/plugin_kvm-in.o
  LD       /tmp/build/perf/plugin_scsi-in.o
  LD       /tmp/build/perf/plugin_kmem-in.o
  LD       /tmp/build/perf/plugin_mac80211-in.o
  LD       /tmp/build/perf/plugin_futex-in.o
  LD       /tmp/build/perf/plugin_function-in.o
  LD       /tmp/build/perf/plugin_xen-in.o
  LD       /tmp/build/perf/plugin_sched_switch-in.o
  LD       /tmp/build/perf/plugin_tlb-in.o
  LD       /tmp/build/perf/plugin_jbd2-in.o
  LINK     /tmp/build/perf/plugin_hrtimer.so
  LINK     /tmp/build/perf/plugin_kmem.so
  AR       /tmp/build/perf/libperf.a
  LINK     /tmp/build/perf/plugin_scsi.so
  LINK     /tmp/build/perf/plugin_kvm.so
  LINK     /tmp/build/perf/plugin_mac80211.so
  LD       /tmp/build/perf/plugin_cfg80211-in.o
  LINK     /tmp/build/perf/plugin_futex.so
  LINK     /tmp/build/perf/plugin_xen.so
  LINK     /tmp/build/perf/plugin_function.so
  LINK     /tmp/build/perf/plugin_tlb.so
  LINK     /tmp/build/perf/plugin_jbd2.so
  LINK     /tmp/build/perf/plugin_cfg80211.so
  LINK     /tmp/build/perf/plugin_sched_switch.so
  GEN      /tmp/build/perf/pmu-events/pmu-events.c
  GEN      /tmp/build/perf/libtraceevent-dynamic-list
  MKDIR    /tmp/build/perf/staticobjs/
  MKDIR    /tmp/build/perf/staticobjs/
  MKDIR    /tmp/build/perf/staticobjs/
  MKDIR    /tmp/build/perf/staticobjs/
  MKDIR    /tmp/build/perf/staticobjs/
  MKDIR    /tmp/build/perf/staticobjs/
  MKDIR    /tmp/build/perf/staticobjs/
  PERF_VERSION = 5.11.rc1.g5eb0b370de61
  MKDIR    /tmp/build/perf/staticobjs/
  CC       /tmp/build/perf/staticobjs/libbpf_probes.o
  CC       /tmp/build/perf/staticobjs/libbpf.o
  CC       /tmp/build/perf/staticobjs/bpf.o
  CC       /tmp/build/perf/staticobjs/nlattr.o
  CC       /tmp/build/perf/staticobjs/btf.o
  CC       /tmp/build/perf/staticobjs/xsk.o
  GEN      perf-archive
  CC       /tmp/build/perf/staticobjs/hashmap.o
  GEN      perf-with-kcore
  CC       /tmp/build/perf/staticobjs/btf_dump.o
  CC       /tmp/build/perf/staticobjs/libbpf_errno.o
  CC       /tmp/build/perf/staticobjs/str_error.o
  CC       /tmp/build/perf/staticobjs/bpf_prog_linfo.o
  CC       /tmp/build/perf/staticobjs/netlink.o
  CC       /tmp/build/perf/staticobjs/ringbuf.o
  LD       /tmp/build/perf/staticobjs/libbpf-in.o
  LINK     /tmp/build/perf/libbpf.a
  CLANG    /tmp/build/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o
  DESCEND  plugins
  GEN      /tmp/build/perf/python/perf.so
  CC       /tmp/build/perf/plugins/plugin_jbd2.o
  CC       /tmp/build/perf/plugins/plugin_kmem.o
  CC       /tmp/build/perf/plugins/plugin_hrtimer.o
  CC       /tmp/build/perf/plugins/plugin_mac80211.o
  CC       /tmp/build/perf/plugins/plugin_kvm.o
  CC       /tmp/build/perf/plugins/plugin_function.o
  CC       /tmp/build/perf/plugins/plugin_xen.o
  CC       /tmp/build/perf/plugins/plugin_sched_switch.o
  CC       /tmp/build/perf/plugins/plugin_futex.o
  CC       /tmp/build/perf/plugins/plugin_scsi.o
  CC       /tmp/build/perf/plugins/plugin_tlb.o
  CC       /tmp/build/perf/plugins/plugin_cfg80211.o
  LD       /tmp/build/perf/plugins/plugin_jbd2-in.o
  LD       /tmp/build/perf/plugins/plugin_kmem-in.o
  LD       /tmp/build/perf/plugins/plugin_hrtimer-in.o
  LD       /tmp/build/perf/plugins/plugin_kvm-in.o
  LD       /tmp/build/perf/plugins/plugin_mac80211-in.o
  LD       /tmp/build/perf/plugins/plugin_function-in.o
  LD       /tmp/build/perf/plugins/plugin_xen-in.o
  LD       /tmp/build/perf/plugins/plugin_sched_switch-in.o
  LD       /tmp/build/perf/plugins/plugin_scsi-in.o
  LD       /tmp/build/perf/plugins/plugin_futex-in.o
  LD       /tmp/build/perf/plugins/plugin_cfg80211-in.o
  LD       /tmp/build/perf/plugins/plugin_tlb-in.o
  LINK     /tmp/build/perf/plugins/plugin_jbd2.so
  LINK     /tmp/build/perf/plugins/plugin_hrtimer.so
  LINK     /tmp/build/perf/plugins/plugin_kmem.so
  LINK     /tmp/build/perf/plugins/plugin_mac80211.so
  LINK     /tmp/build/perf/plugins/plugin_kvm.so
  LINK     /tmp/build/perf/plugins/plugin_sched_switch.so
  LINK     /tmp/build/perf/plugins/plugin_scsi.so
  LINK     /tmp/build/perf/plugins/plugin_xen.so
  LINK     /tmp/build/perf/plugins/plugin_function.so
  LINK     /tmp/build/perf/plugins/plugin_futex.so
  LINK     /tmp/build/perf/plugins/plugin_tlb.so
  LINK     /tmp/build/perf/plugins/plugin_cfg80211.so
  INSTALL  trace_plugins
  CC       /tmp/build/perf/pmu-events/pmu-events.o
  LD       /tmp/build/perf/pmu-events/pmu-events-in.o

Auto-detecting system features:
...                        libbfd: [ on  ]
...        disassembler-four-args: [ on  ]
...                          zlib: [ on  ]
...                        libcap: [ on  ]
...               clang-bpf-co-re: [ on  ]
...                  reallocarray: [ on  ]

  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/main.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/common.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/btf.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/json_writer.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/gen.o

Auto-detecting system features:
...                        libelf: [ on  ]
...                          zlib: [ on  ]
...                           bpf: [ on  ]

  GEN      /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/bpf_helper_defs.h
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_probes.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/nlattr.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/xsk.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_errno.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/bpf.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/netlink.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/str_error.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/hashmap.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/bpf_prog_linfo.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf_dump.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/ringbuf.o
  LD       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf-in.o
  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/libbpf.a
  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/bpftool
  GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
  CC       /tmp/build/perf/builtin-bench.o
  CC       /tmp/build/perf/builtin-annotate.o
  CC       /tmp/build/perf/builtin-diff.o
  CC       /tmp/build/perf/builtin-config.o
  CC       /tmp/build/perf/builtin-ftrace.o
  CC       /tmp/build/perf/builtin-help.o
  CC       /tmp/build/perf/builtin-evlist.o
  CC       /tmp/build/perf/builtin-sched.o
  CC       /tmp/build/perf/builtin-buildid-list.o
  CC       /tmp/build/perf/builtin-kallsyms.o
  CC       /tmp/build/perf/builtin-buildid-cache.o
  CC       /tmp/build/perf/builtin-record.o
  CC       /tmp/build/perf/builtin-report.o
  CC       /tmp/build/perf/builtin-list.o
  CC       /tmp/build/perf/builtin-stat.o
  CC       /tmp/build/perf/builtin-top.o
  CC       /tmp/build/perf/builtin-timechart.o
  CC       /tmp/build/perf/builtin-script.o
  CC       /tmp/build/perf/builtin-kmem.o
  CC       /tmp/build/perf/builtin-lock.o
  CC       /tmp/build/perf/builtin-kvm.o
  CC       /tmp/build/perf/builtin-inject.o
  CC       /tmp/build/perf/builtin-mem.o
  CC       /tmp/build/perf/builtin-version.o
  CC       /tmp/build/perf/builtin-data.o
  CC       /tmp/build/perf/builtin-trace.o
  CC       /tmp/build/perf/builtin-probe.o
  CC       /tmp/build/perf/builtin-c2c.o
  MKDIR    /tmp/build/perf/bench/
  MKDIR    /tmp/build/perf/bench/
  MKDIR    /tmp/build/perf/tests/
  CC       /tmp/build/perf/arch/common.o
  MKDIR    /tmp/build/perf/ui/
  MKDIR    /tmp/build/perf/bench/
  MKDIR    /tmp/build/perf/tests/
  CC       /tmp/build/perf/bench/sched-messaging.o
  CC       /tmp/build/perf/bench/sched-pipe.o
  CC       /tmp/build/perf/tests/builtin-test.o
  MKDIR    /tmp/build/perf/scripts/python/Perf-Trace-Util/
  MKDIR    /tmp/build/perf/scripts/perl/Perf-Trace-Util/
  MKDIR    /tmp/build/perf/tests/
  MKDIR    /tmp/build/perf/ui/
  CC       /tmp/build/perf/ui/setup.o
  CC       /tmp/build/perf/tests/attr.o
  CC       /tmp/build/perf/bench/syscall.o
  CC       /tmp/build/perf/tests/parse-events.o
  CC       /tmp/build/perf/trace/beauty/clone.o
  CC       /tmp/build/perf/scripts/python/Perf-Trace-Util/Context.o
  CC       /tmp/build/perf/tests/dso-data.o
  MKDIR    /tmp/build/perf/arch/x86/util/
  CC       /tmp/build/perf/ui/helpline.o
  CC       /tmp/build/perf/scripts/perl/Perf-Trace-Util/Context.o
  MKDIR    /tmp/build/perf/ui/
  CC       /tmp/build/perf/ui/util.o
  CC       /tmp/build/perf/arch/x86/util/header.o
  MKDIR    /tmp/build/perf/arch/x86/tests/
  CC       /tmp/build/perf/ui/hist.o
  CC       /tmp/build/perf/ui/progress.o
  CC       /tmp/build/perf/tests/vmlinux-kallsyms.o
  CC       /tmp/build/perf/arch/x86/tests/regs_load.o
  CC       /tmp/build/perf/bench/mem-functions.o
  CC       /tmp/build/perf/trace/beauty/fcntl.o
  CC       /tmp/build/perf/trace/beauty/flock.o
  CC       /tmp/build/perf/bench/futex-hash.o
  CC       /tmp/build/perf/trace/beauty/fsmount.o
  CC       /tmp/build/perf/perf.o
  CC       /tmp/build/perf/tests/openat-syscall.o
  MKDIR    /tmp/build/perf/ui/stdio/
  MKDIR    /tmp/build/perf/arch/x86/util/
  LD       /tmp/build/perf/scripts/python/Perf-Trace-Util/perf-in.o
  MKDIR    /tmp/build/perf/arch/x86/tests/
  CC       /tmp/build/perf/tests/openat-syscall-all-cpus.o
  CC       /tmp/build/perf/arch/x86/util/pmu.o
  CC       /tmp/build/perf/trace/beauty/fspick.o
  CC       /tmp/build/perf/arch/x86/util/tsc.o
  CC       /tmp/build/perf/util/annotate.o
  CC       /tmp/build/perf/tests/openat-syscall-tp-fields.o
  CC       /tmp/build/perf/trace/beauty/ioctl.o
  CC       /tmp/build/perf/bench/futex-wake.o
  CC       /tmp/build/perf/arch/x86/tests/dwarf-unwind.o
  CC       /tmp/build/perf/bench/futex-wake-parallel.o
  CC       /tmp/build/perf/ui/stdio/hist.o
  CC       /tmp/build/perf/arch/x86/tests/arch-tests.o
  CC       /tmp/build/perf/trace/beauty/kcmp.o
  CC       /tmp/build/perf/arch/x86/util/perf_regs.o
  CC       /tmp/build/perf/trace/beauty/mount_flags.o
  CC       /tmp/build/perf/arch/x86/util/kvm-stat.o
  CC       /tmp/build/perf/bench/futex-requeue.o
  CC       /tmp/build/perf/util/block-info.o
  CC       /tmp/build/perf/arch/x86/tests/rdpmc.o
  CC       /tmp/build/perf/bench/futex-lock-pi.o
  CC       /tmp/build/perf/arch/x86/tests/insn-x86.o
  CC       /tmp/build/perf/trace/beauty/move_mount.o
  CC       /tmp/build/perf/tests/mmap-basic.o
  CC       /tmp/build/perf/trace/beauty/pkey_alloc.o
  CC       /tmp/build/perf/tests/perf-record.o
  CC       /tmp/build/perf/trace/beauty/arch_prctl.o
  CC       /tmp/build/perf/arch/x86/util/topdown.o
  CC       /tmp/build/perf/tests/evsel-roundtrip-name.o
  CC       /tmp/build/perf/tests/evsel-tp-sched.o
  CC       /tmp/build/perf/arch/x86/tests/intel-pt-pkt-decoder-test.o
  CC       /tmp/build/perf/bench/epoll-wait.o
  CC       /tmp/build/perf/tests/fdarray.o
  CC       /tmp/build/perf/bench/epoll-ctl.o
  CC       /tmp/build/perf/arch/x86/util/machine.o
  CC       /tmp/build/perf/trace/beauty/prctl.o
  CC       /tmp/build/perf/ui/browser.o
  CC       /tmp/build/perf/arch/x86/util/event.o
  CC       /tmp/build/perf/arch/x86/tests/bp-modify.o
  CC       /tmp/build/perf/trace/beauty/renameat.o
  CC       /tmp/build/perf/tests/pmu.o
  CC       /tmp/build/perf/tests/pmu-events.o
  CC       /tmp/build/perf/tests/hists_common.o
  CC       /tmp/build/perf/bench/synthesize.o
  CC       /tmp/build/perf/arch/x86/util/dwarf-regs.o
  CC       /tmp/build/perf/tests/hists_link.o
  CC       /tmp/build/perf/trace/beauty/sockaddr.o
  CC       /tmp/build/perf/arch/x86/util/unwind-libunwind.o
  CC       /tmp/build/perf/trace/beauty/socket.o
  CC       /tmp/build/perf/trace/beauty/statx.o
  LD       /tmp/build/perf/arch/x86/tests/perf-in.o
  CC       /tmp/build/perf/arch/x86/util/auxtrace.o
  CC       /tmp/build/perf/arch/x86/util/archinsn.o
  CC       /tmp/build/perf/bench/kallsyms-parse.o
  CC       /tmp/build/perf/bench/find-bit-bench.o
  CC       /tmp/build/perf/arch/x86/util/intel-pt.o
  CC       /tmp/build/perf/trace/beauty/sync_file_range.o
  CC       /tmp/build/perf/arch/x86/util/intel-bts.o
  MKDIR    /tmp/build/perf/trace/beauty/tracepoints/
  MKDIR    /tmp/build/perf/ui/browsers/
  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_irq_vectors.o
  MKDIR    /tmp/build/perf/trace/beauty/tracepoints/
  CC       /tmp/build/perf/util/block-range.o
  CC       /tmp/build/perf/ui/browsers/annotate.o
  MKDIR    /tmp/build/perf/ui/browsers/
  CC       /tmp/build/perf/util/build-id.o
  CC       /tmp/build/perf/bench/inject-buildid.o
  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  MKDIR    /tmp/build/perf/ui/tui/
  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  MKDIR    /tmp/build/perf/ui/tui/
  CC       /tmp/build/perf/ui/browsers/hists.o
  CC       /tmp/build/perf/ui/browsers/map.o
  LD       /tmp/build/perf/arch/x86/util/perf-in.o
  CC       /tmp/build/perf/tests/hists_filter.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o
  CC       /tmp/build/perf/ui/tui/setup.o
  CC       /tmp/build/perf/ui/tui/util.o
  MKDIR    /tmp/build/perf/ui/tui/
  CC       /tmp/build/perf/bench/numa.o
  CC       /tmp/build/perf/util/cacheline.o
  CC       /tmp/build/perf/ui/browsers/scripts.o
  LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  CC       /tmp/build/perf/ui/tui/helpline.o
  CC       /tmp/build/perf/util/config.o
  CC       /tmp/build/perf/ui/tui/progress.o
  CC       /tmp/build/perf/ui/browsers/header.o
  LD       /tmp/build/perf/trace/beauty/perf-in.o
  CC       /tmp/build/perf/ui/browsers/res_sample.o
  CC       /tmp/build/perf/tests/hists_output.o
  LD       /tmp/build/perf/bench/perf-in.o
  CC       /tmp/build/perf/tests/hists_cumulate.o
  CC       /tmp/build/perf/util/copyfile.o
  LD       /tmp/build/perf/arch/x86/perf-in.o
  CC       /tmp/build/perf/util/ctype.o
  CC       /tmp/build/perf/util/db-export.o
  CC       /tmp/build/perf/util/env.o
  CC       /tmp/build/perf/util/event.o
  LD       /tmp/build/perf/arch/perf-in.o
  CC       /tmp/build/perf/util/evlist.o
  CC       /tmp/build/perf/util/sideband_evlist.o
  CC       /tmp/build/perf/util/evsel.o
  CC       /tmp/build/perf/util/evsel_fprintf.o
  CC       /tmp/build/perf/tests/python-use.o
  CC       /tmp/build/perf/util/perf_event_attr_fprintf.o
  CC       /tmp/build/perf/util/evswitch.o
  CC       /tmp/build/perf/util/find_bit.o
  CC       /tmp/build/perf/tests/bp_signal.o
  LD       /tmp/build/perf/ui/tui/perf-in.o
  CC       /tmp/build/perf/util/get_current_dir_name.o
  CC       /tmp/build/perf/tests/bp_signal_overflow.o
  CC       /tmp/build/perf/util/kallsyms.o
  CC       /tmp/build/perf/tests/bp_account.o
  CC       /tmp/build/perf/util/llvm-utils.o
  CC       /tmp/build/perf/util/levenshtein.o
  CC       /tmp/build/perf/util/mmap.o
  CC       /tmp/build/perf/tests/wp.o
  CC       /tmp/build/perf/util/memswap.o
  CC       /tmp/build/perf/util/perf_regs.o
  BISON    /tmp/build/perf/util/parse-events-bison.c
  CC       /tmp/build/perf/tests/task-exit.o
  CC       /tmp/build/perf/util/path.o
  CC       /tmp/build/perf/util/print_binary.o
  CC       /tmp/build/perf/util/rlimit.o
  CC       /tmp/build/perf/tests/sw-clock.o
  CC       /tmp/build/perf/tests/mmap-thread-lookup.o
  CC       /tmp/build/perf/util/argv_split.o
  CC       /tmp/build/perf/util/rbtree.o
  CC       /tmp/build/perf/tests/thread-maps-share.o
  CC       /tmp/build/perf/util/libstring.o
  CC       /tmp/build/perf/tests/switch-tracking.o
  CC       /tmp/build/perf/tests/keep-tracking.o
  CC       /tmp/build/perf/util/bitmap.o
  CC       /tmp/build/perf/util/hweight.o
  CC       /tmp/build/perf/util/smt.o
  CC       /tmp/build/perf/tests/code-reading.o
  CC       /tmp/build/perf/util/strbuf.o
  CC       /tmp/build/perf/util/string.o
  CC       /tmp/build/perf/tests/sample-parsing.o
  CC       /tmp/build/perf/tests/parse-no-sample-id-all.o
  CC       /tmp/build/perf/util/strfilter.o
  CC       /tmp/build/perf/tests/kmod-path.o
  CC       /tmp/build/perf/util/strlist.o
  CC       /tmp/build/perf/util/top.o
  CC       /tmp/build/perf/tests/thread-map.o
  CC       /tmp/build/perf/util/usage.o
  CC       /tmp/build/perf/util/dso.o
  CC       /tmp/build/perf/util/dsos.o
  CC       /tmp/build/perf/util/symbol.o
  CC       /tmp/build/perf/util/symbol_fprintf.o
  CC       /tmp/build/perf/tests/llvm.o
  CC       /tmp/build/perf/util/color.o
  CC       /tmp/build/perf/util/color_config.o
  CC       /tmp/build/perf/util/metricgroup.o
  CC       /tmp/build/perf/util/header.o
  CC       /tmp/build/perf/util/callchain.o
  CC       /tmp/build/perf/util/values.o
  CC       /tmp/build/perf/tests/bpf.o
  CC       /tmp/build/perf/util/debug.o
  CC       /tmp/build/perf/util/fncache.o
  CC       /tmp/build/perf/tests/topology.o
  CC       /tmp/build/perf/util/machine.o
  CC       /tmp/build/perf/tests/cpumap.o
  CC       /tmp/build/perf/util/map.o
  CC       /tmp/build/perf/tests/mem.o
  CC       /tmp/build/perf/util/pstack.o
  CC       /tmp/build/perf/util/session.o
  CC       /tmp/build/perf/tests/stat.o
  CC       /tmp/build/perf/tests/event_update.o
  LD       /tmp/build/perf/ui/browsers/perf-in.o
  CC       /tmp/build/perf/tests/event-times.o
  CC       /tmp/build/perf/tests/expr.o
  CC       /tmp/build/perf/util/sample-raw.o
  CC       /tmp/build/perf/util/s390-sample-raw.o
  CC       /tmp/build/perf/tests/sdt.o
  CC       /tmp/build/perf/util/syscalltbl.o
  CC       /tmp/build/perf/tests/is_printable_array.o
  CC       /tmp/build/perf/util/ordered-events.o
  CC       /tmp/build/perf/tests/backward-ring-buffer.o
  CC       /tmp/build/perf/tests/bitmap.o
  CC       /tmp/build/perf/util/namespaces.o
  CC       /tmp/build/perf/tests/perf-hooks.o
  CC       /tmp/build/perf/tests/clang.o
  CC       /tmp/build/perf/util/comm.o
  CC       /tmp/build/perf/tests/unit_number__scnprintf.o
  CC       /tmp/build/perf/tests/mem2node.o
  CC       /tmp/build/perf/tests/maps.o
  CC       /tmp/build/perf/util/thread.o
  CC       /tmp/build/perf/util/thread_map.o
  CC       /tmp/build/perf/tests/time-utils-test.o
  CC       /tmp/build/perf/tests/genelf.o
  CC       /tmp/build/perf/util/trace-event-parse.o
  BISON    /tmp/build/perf/util/pmu-bison.c
  CC       /tmp/build/perf/util/trace-event-read.o
  CC       /tmp/build/perf/tests/api-io.o
  CC       /tmp/build/perf/util/trace-event-info.o
  CC       /tmp/build/perf/util/trace-event-scripting.o
  CC       /tmp/build/perf/tests/pfm.o
  CC       /tmp/build/perf/tests/demangle-java-test.o
  CC       /tmp/build/perf/util/trace-event.o
  LD       /tmp/build/perf/ui/perf-in.o
  CC       /tmp/build/perf/tests/parse-metric.o
  CC       /tmp/build/perf/tests/pe-file-parsing.o
  CC       /tmp/build/perf/tests/expand-cgroup.o
  CC       /tmp/build/perf/tests/perf-time-to-tsc.o
  CC       /tmp/build/perf/util/svghelper.o
  CC       /tmp/build/perf/util/sort.o
  CC       /tmp/build/perf/util/hist.o
  CC       /tmp/build/perf/tests/dwarf-unwind.o
  CC       /tmp/build/perf/tests/llvm-src-base.o
  CC       /tmp/build/perf/util/cpumap.o
  CC       /tmp/build/perf/util/util.o
  CC       /tmp/build/perf/util/affinity.o
  CC       /tmp/build/perf/util/cputopo.o
  CC       /tmp/build/perf/util/target.o
  CC       /tmp/build/perf/util/cgroup.o
  CC       /tmp/build/perf/tests/llvm-src-kbuild.o
  CC       /tmp/build/perf/util/rblist.o
  CC       /tmp/build/perf/tests/llvm-src-prologue.o
  CC       /tmp/build/perf/tests/llvm-src-relocation.o
  CC       /tmp/build/perf/util/intlist.o
  CC       /tmp/build/perf/util/counts.o
  CC       /tmp/build/perf/util/vdso.o
  CC       /tmp/build/perf/util/stat.o
  CC       /tmp/build/perf/util/stat-shadow.o
  CC       /tmp/build/perf/util/stat-display.o
  CC       /tmp/build/perf/util/perf_api_probe.o
  CC       /tmp/build/perf/util/record.o
  CC       /tmp/build/perf/util/srcline.o
  LD       /tmp/build/perf/tests/perf-in.o
  CC       /tmp/build/perf/util/srccode.o
  CC       /tmp/build/perf/util/synthetic-events.o
  CC       /tmp/build/perf/util/data.o
  CC       /tmp/build/perf/util/cloexec.o
  CC       /tmp/build/perf/util/tsc.o
  CC       /tmp/build/perf/util/rwsem.o
  CC       /tmp/build/perf/util/call-path.o
  CC       /tmp/build/perf/util/thread-stack.o
  CC       /tmp/build/perf/util/spark.o
  CC       /tmp/build/perf/util/topdown.o
  CC       /tmp/build/perf/util/auxtrace.o
  CC       /tmp/build/perf/util/intel-pt.o
  CC       /tmp/build/perf/util/stream.o
  CC       /tmp/build/perf/util/intel-bts.o
  MKDIR    /tmp/build/perf/util/arm-spe-decoder/
  CC       /tmp/build/perf/util/arm-spe.o
  MKDIR    /tmp/build/perf/util/intel-pt-decoder/
  MKDIR    /tmp/build/perf/util/arm-spe-decoder/
  CC       /tmp/build/perf/util/s390-cpumsf.o
  MKDIR    /tmp/build/perf/util/intel-pt-decoder/
  MKDIR    /tmp/build/perf/util/scripting-engines/
  MKDIR    /tmp/build/perf/util/intel-pt-decoder/
  MKDIR    /tmp/build/perf/util/scripting-engines/
  CC       /tmp/build/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.o
  GEN      /tmp/build/perf/util/intel-pt-decoder/inat-tables.c
  CC       /tmp/build/perf/util/arm-spe-decoder/arm-spe-decoder.o
  CC       /tmp/build/perf/util/scripting-engines/trace-event-perl.o
  CC       /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o
  CC       /tmp/build/perf/util/intel-pt-decoder/intel-pt-log.o
  CC       /tmp/build/perf/util/dump-insn.o
  CC       /tmp/build/perf/util/parse-branch-options.o
  CC       /tmp/build/perf/util/scripting-engines/trace-event-python.o
  CC       /tmp/build/perf/util/parse-regs-options.o
  CC       /tmp/build/perf/util/parse-sublevel-options.o
  MKDIR    /tmp/build/perf/util/intel-pt-decoder/
  CC       /tmp/build/perf/util/term.o
  CC       /tmp/build/perf/util/help-unknown-cmd.o
  LD       /tmp/build/perf/util/arm-spe-decoder/perf-in.o
  CC       /tmp/build/perf/util/mem-events.o
  CC       /tmp/build/perf/util/vsprintf.o
  CC       /tmp/build/perf/util/units.o
  CC       /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o
  BISON    /tmp/build/perf/util/expr-bison.c
  CC       /tmp/build/perf/util/time-utils.o
  CC       /tmp/build/perf/util/branch.o
  CC       /tmp/build/perf/util/mem2node.o
  CC       /tmp/build/perf/util/clockid.o
  CC       /tmp/build/perf/util/bpf-loader.o
  CC       /tmp/build/perf/util/bpf_map.o
  CC       /tmp/build/perf/util/bpf_counter.o
  CC       /tmp/build/perf/util/bpf-prologue.o
  CC       /tmp/build/perf/util/symbol-elf.o
  CC       /tmp/build/perf/util/probe-file.o
  CC       /tmp/build/perf/util/probe-event.o
  CC       /tmp/build/perf/util/probe-finder.o
  CC       /tmp/build/perf/util/dwarf-aux.o
  CC       /tmp/build/perf/util/dwarf-regs.o
  LD       /tmp/build/perf/util/scripting-engines/perf-in.o
  CC       /tmp/build/perf/util/unwind-libunwind-local.o
  CC       /tmp/build/perf/util/unwind-libunwind.o
  CC       /tmp/build/perf/util/intel-pt-decoder/intel-pt-insn-decoder.o
  CC       /tmp/build/perf/util/data-convert-bt.o
  CC       /tmp/build/perf/util/zlib.o
  CC       /tmp/build/perf/util/lzma.o
  CC       /tmp/build/perf/util/cap.o
  CC       /tmp/build/perf/util/zstd.o
  CC       /tmp/build/perf/util/demangle-java.o
  CC       /tmp/build/perf/util/demangle-rust.o
  CC       /tmp/build/perf/util/genelf.o
  CC       /tmp/build/perf/util/jitdump.o
  CC       /tmp/build/perf/util/genelf_debug.o
  CC       /tmp/build/perf/util/perf-hooks.o
  CC       /tmp/build/perf/util/bpf-event.o
  FLEX     /tmp/build/perf/util/pmu-flex.c
  CC       /tmp/build/perf/util/pmu-bison.o
  CC       /tmp/build/perf/util/pmu.o
  CC       /tmp/build/perf/util/pmu-flex.o
  FLEX     /tmp/build/perf/util/parse-events-flex.c
  CC       /tmp/build/perf/util/parse-events-bison.o
  FLEX     /tmp/build/perf/util/expr-flex.c
  CC       /tmp/build/perf/util/expr-bison.o
  LD       /tmp/build/perf/util/intel-pt-decoder/perf-in.o
  CC       /tmp/build/perf/util/parse-events.o
  CC       /tmp/build/perf/util/parse-events-flex.o
  CC       /tmp/build/perf/util/expr-flex.o
  CC       /tmp/build/perf/util/expr.o
  LD       /tmp/build/perf/scripts/perl/Perf-Trace-Util/perf-in.o
  LD       /tmp/build/perf/scripts/perf-in.o
  LD       /tmp/build/perf/util/perf-in.o
  LD       /tmp/build/perf/perf-in.o
  LINK     /tmp/build/perf/perf
  INSTALL  tests
  INSTALL  binaries
  INSTALL  libperf-jvmti.so
  INSTALL  libexec
  INSTALL  bpf-headers
  INSTALL  bpf-examples
  INSTALL  perf-archive
  INSTALL  perf-with-kcore
  INSTALL  strace/groups
  INSTALL  perl-scripts
  INSTALL  python-scripts
  INSTALL  perf_completion-script
  INSTALL  perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
[acme@five perf]$
 
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  tools/perf/Makefile.perf                      |   2 +-
>  tools/perf/builtin-stat.c                     |  77 ++++-
>  tools/perf/util/Build                         |   1 +
>  tools/perf/util/bpf_counter.c                 | 296 ++++++++++++++++++
>  tools/perf/util/bpf_counter.h                 |  72 +++++
>  .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  93 ++++++
>  tools/perf/util/evsel.c                       |   9 +
>  tools/perf/util/evsel.h                       |   6 +
>  tools/perf/util/stat-display.c                |   4 +-
>  tools/perf/util/stat.c                        |   2 +-
>  tools/perf/util/target.c                      |  34 +-
>  tools/perf/util/target.h                      |  10 +
>  12 files changed, 588 insertions(+), 18 deletions(-)
>  create mode 100644 tools/perf/util/bpf_counter.c
>  create mode 100644 tools/perf/util/bpf_counter.h
>  create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index d182a2dbb9bbd..8c4e039c3b813 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -1015,7 +1015,7 @@ python-clean:
>  
>  SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
>  SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
> -SKELETONS :=
> +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
>  
>  ifdef BUILD_BPF_SKEL
>  BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 8cc24967bc273..09bffb3fbcdd4 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -67,6 +67,7 @@
>  #include "util/top.h"
>  #include "util/affinity.h"
>  #include "util/pfm.h"
> +#include "util/bpf_counter.h"
>  #include "asm/bug.h"
>  
>  #include <linux/time64.h>
> @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs)
>  	return 0;
>  }
>  
> +static int read_bpf_map_counters(void)
> +{
> +	struct evsel *counter;
> +	int err;
> +
> +	evlist__for_each_entry(evsel_list, counter) {
> +		err = bpf_counter__read(counter);
> +		if (err)
> +			return err;
> +	}
> +	return 0;
> +}
> +
>  static void read_counters(struct timespec *rs)
>  {
>  	struct evsel *counter;
> +	int err;
>  
> -	if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0))
> -		return;
> +	if (!stat_config.stop_read_counter) {
> +		err = read_bpf_map_counters();
> +		if (err == -EAGAIN)
> +			err = read_affinity_counters(rs);
> +		if (err < 0)
> +			return;
> +	}
>  
>  	evlist__for_each_entry(evsel_list, counter) {
>  		if (counter->err)
> @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times)
>  	return false;
>  }
>  
> -static void enable_counters(void)
> +static int enable_counters(void)
>  {
> +	struct evsel *evsel;
> +	int err;
> +
> +	evlist__for_each_entry(evsel_list, evsel) {
> +		err = bpf_counter__enable(evsel);
> +		if (err)
> +			return err;
> +	}
> +
>  	if (stat_config.initial_delay < 0) {
>  		pr_info(EVLIST_DISABLED_MSG);
> -		return;
> +		return 0;
>  	}
>  
>  	if (stat_config.initial_delay > 0) {
> @@ -518,6 +547,7 @@ static void enable_counters(void)
>  		if (stat_config.initial_delay > 0)
>  			pr_info(EVLIST_ENABLED_MSG);
>  	}
> +	return 0;
>  }
>  
>  static void disable_counters(void)
> @@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	const bool forks = (argc > 0);
>  	bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
>  	struct affinity affinity;
> -	int i, cpu;
> +	int i, cpu, err;
>  	bool second_pass = false;
>  
>  	if (forks) {
> @@ -737,6 +767,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	if (affinity__setup(&affinity) < 0)
>  		return -1;
>  
> +	evlist__for_each_entry(evsel_list, counter) {
> +		if (bpf_counter__load(counter, &target))
> +			return -1;
> +	}
> +
>  	evlist__for_each_cpu (evsel_list, i, cpu) {
>  		affinity__set(&affinity, cpu);
>  
> @@ -850,7 +885,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	}
>  
>  	if (STAT_RECORD) {
> -		int err, fd = perf_data__fd(&perf_stat.data);
> +		int fd = perf_data__fd(&perf_stat.data);
>  
>  		if (is_pipe) {
>  			err = perf_header__write_pipe(perf_data__fd(&perf_stat.data));
> @@ -876,7 +911,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  
>  	if (forks) {
>  		evlist__start_workload(evsel_list);
> -		enable_counters();
> +		err = enable_counters();
> +		if (err)
> +			return -1;
>  
>  		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
>  			status = dispatch_events(forks, timeout, interval, &times);
> @@ -895,7 +932,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  		if (WIFSIGNALED(status))
>  			psignal(WTERMSIG(status), argv[0]);
>  	} else {
> -		enable_counters();
> +		err = enable_counters();
> +		if (err)
> +			return -1;
>  		status = dispatch_events(forks, timeout, interval, &times);
>  	}
>  
> @@ -1085,6 +1124,10 @@ static struct option stat_options[] = {
>  		   "stat events on existing process id"),
>  	OPT_STRING('t', "tid", &target.tid, "tid",
>  		   "stat events on existing thread id"),
> +#ifdef HAVE_BPF_SKEL
> +	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
> +		   "stat events on existing bpf program id"),
> +#endif
>  	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
>  		    "system-wide collection from all CPUs"),
>  	OPT_BOOLEAN('g', "group", &group,
> @@ -2064,11 +2107,12 @@ int cmd_stat(int argc, const char **argv)
>  		"perf stat [<options>] [<command>]",
>  		NULL
>  	};
> -	int status = -EINVAL, run_idx;
> +	int status = -EINVAL, run_idx, err;
>  	const char *mode;
>  	FILE *output = stderr;
>  	unsigned int interval, timeout;
>  	const char * const stat_subcommands[] = { "record", "report" };
> +	char errbuf[BUFSIZ];
>  
>  	setlocale(LC_ALL, "");
>  
> @@ -2179,6 +2223,12 @@ int cmd_stat(int argc, const char **argv)
>  	} else if (big_num_opt == 0) /* User passed --no-big-num */
>  		stat_config.big_num = false;
>  
> +	err = target__validate(&target);
> +	if (err) {
> +		target__strerror(&target, err, errbuf, BUFSIZ);
> +		pr_warning("%s\n", errbuf);
> +	}
> +
>  	setup_system_wide(argc);
>  
>  	/*
> @@ -2252,8 +2302,6 @@ int cmd_stat(int argc, const char **argv)
>  		}
>  	}
>  
> -	target__validate(&target);
> -
>  	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
>  		target.per_thread = true;
>  
> @@ -2384,9 +2432,10 @@ int cmd_stat(int argc, const char **argv)
>  		 * tools remain  -acme
>  		 */
>  		int fd = perf_data__fd(&perf_stat.data);
> -		int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
> -							     process_synthesized_event,
> -							     &perf_stat.session->machines.host);
> +
> +		err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
> +							 process_synthesized_event,
> +							 &perf_stat.session->machines.host);
>  		if (err) {
>  			pr_warning("Couldn't synthesize the kernel mmap record, harmless, "
>  				   "older tools may produce warnings about this file\n.");
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index e2563d0154eb6..188521f343470 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -135,6 +135,7 @@ perf-y += clockid.o
>  
>  perf-$(CONFIG_LIBBPF) += bpf-loader.o
>  perf-$(CONFIG_LIBBPF) += bpf_map.o
> +perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
>  perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
>  perf-$(CONFIG_LIBELF) += symbol-elf.o
>  perf-$(CONFIG_LIBELF) += probe-file.o
> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> new file mode 100644
> index 0000000000000..f2cb86a40c882
> --- /dev/null
> +++ b/tools/perf/util/bpf_counter.c
> @@ -0,0 +1,296 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/* Copyright (c) 2019 Facebook */
> +
> +#include <limits.h>
> +#include <unistd.h>
> +#include <sys/time.h>
> +#include <sys/resource.h>
> +#include <linux/err.h>
> +#include <linux/zalloc.h>
> +#include <bpf/bpf.h>
> +#include <bpf/btf.h>
> +#include <bpf/libbpf.h>
> +
> +#include "bpf_counter.h"
> +#include "counts.h"
> +#include "debug.h"
> +#include "evsel.h"
> +#include "target.h"
> +
> +#include "bpf_skel/bpf_prog_profiler.skel.h"
> +
> +static inline void *u64_to_ptr(__u64 ptr)
> +{
> +	return (void *)(unsigned long)ptr;
> +}
> +
> +static void set_max_rlimit(void)
> +{
> +	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
> +
> +	setrlimit(RLIMIT_MEMLOCK, &rinf);
> +}
> +
> +static struct bpf_counter *bpf_counter_alloc(void)
> +{
> +	struct bpf_counter *counter;
> +
> +	counter = zalloc(sizeof(*counter));
> +	if (counter)
> +		INIT_LIST_HEAD(&counter->list);
> +	return counter;
> +}
> +
> +static int bpf_program_profiler__destroy(struct evsel *evsel)
> +{
> +	struct bpf_counter *counter;
> +
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list)
> +		bpf_prog_profiler_bpf__destroy(counter->skel);
> +	INIT_LIST_HEAD(&evsel->bpf_counter_list);
> +	return 0;
> +}
> +
> +static char *bpf_target_prog_name(int tgt_fd)
> +{
> +	struct bpf_prog_info_linear *info_linear;
> +	struct bpf_func_info *func_info;
> +	const struct btf_type *t;
> +	char *name = NULL;
> +	struct btf *btf;
> +
> +	info_linear = bpf_program__get_prog_info_linear(
> +		tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO);
> +	if (IS_ERR_OR_NULL(info_linear)) {
> +		pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd);
> +		return NULL;
> +	}
> +
> +	if (info_linear->info.btf_id == 0 ||
> +	    btf__get_from_id(info_linear->info.btf_id, &btf)) {
> +		pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd);
> +		goto out;
> +	}
> +
> +	func_info = u64_to_ptr(info_linear->info.func_info);
> +	t = btf__type_by_id(btf, func_info[0].type_id);
> +	if (!t) {
> +		pr_debug("btf %d doesn't have type %d\n",
> +			 info_linear->info.btf_id, func_info[0].type_id);
> +		goto out;
> +	}
> +	name = strdup(btf__name_by_offset(btf, t->name_off));
> +out:
> +	free(info_linear);
> +	return name;
> +}
> +
> +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
> +{
> +	struct bpf_prog_profiler_bpf *skel;
> +	struct bpf_counter *counter;
> +	struct bpf_program *prog;
> +	char *prog_name;
> +	int prog_fd;
> +	int err;
> +
> +	prog_fd = bpf_prog_get_fd_by_id(prog_id);
> +	if (prog_fd < 0) {
> +		pr_err("Failed to open fd for bpf prog %u\n", prog_id);
> +		return -1;
> +	}
> +	counter = bpf_counter_alloc();
> +	if (!counter) {
> +		close(prog_fd);
> +		return -1;
> +	}
> +
> +	skel = bpf_prog_profiler_bpf__open();
> +	if (!skel) {
> +		pr_err("Failed to open bpf skeleton\n");
> +		goto err_out;
> +	}
> +	skel->rodata->num_cpu = evsel__nr_cpus(evsel);
> +
> +	bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
> +	bpf_map__resize(skel->maps.fentry_readings, 1);
> +	bpf_map__resize(skel->maps.accum_readings, 1);
> +
> +	prog_name = bpf_target_prog_name(prog_fd);
> +	if (!prog_name) {
> +		pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id);
> +		goto err_out;
> +	}
> +
> +	bpf_object__for_each_program(prog, skel->obj) {
> +		err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
> +		if (err) {
> +			pr_err("bpf_program__set_attach_target failed.\n"
> +			       "Does bpf prog %u have BTF?\n", prog_id);
> +			goto err_out;
> +		}
> +	}
> +	set_max_rlimit();
> +	err = bpf_prog_profiler_bpf__load(skel);
> +	if (err) {
> +		pr_err("bpf_prog_profiler_bpf__load failed\n");
> +		goto err_out;
> +	}
> +
> +	counter->skel = skel;
> +	list_add(&counter->list, &evsel->bpf_counter_list);
> +	close(prog_fd);
> +	return 0;
> +err_out:
> +	free(counter);
> +	close(prog_fd);
> +	return -1;
> +}
> +
> +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target)
> +{
> +	char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p;
> +	u32 prog_id;
> +	int ret;
> +
> +	bpf_str_ = bpf_str = strdup(target->bpf_str);
> +	if (!bpf_str)
> +		return -1;
> +
> +	while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) {
> +		prog_id = strtoul(tok, &p, 10);
> +		if (prog_id == 0 || prog_id == UINT_MAX ||
> +		    (*p != '\0' && *p != ',')) {
> +			pr_err("Failed to parse bpf prog ids %s\n",
> +			       target->bpf_str);
> +			return -1;
> +		}
> +
> +		ret = bpf_program_profiler_load_one(evsel, prog_id);
> +		if (ret) {
> +			bpf_program_profiler__destroy(evsel);
> +			free(bpf_str_);
> +			return -1;
> +		}
> +		bpf_str = NULL;
> +	}
> +	free(bpf_str_);
> +	return 0;
> +}
> +
> +static int bpf_program_profiler__enable(struct evsel *evsel)
> +{
> +	struct bpf_counter *counter;
> +	int ret;
> +
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +		ret = bpf_prog_profiler_bpf__attach(counter->skel);
> +		if (ret) {
> +			bpf_program_profiler__destroy(evsel);
> +			return ret;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int bpf_program_profiler__read(struct evsel *evsel)
> +{
> +	int num_cpu = evsel__nr_cpus(evsel);
> +	struct bpf_perf_event_value values[num_cpu];
> +	struct bpf_counter *counter;
> +	int reading_map_fd;
> +	__u32 key = 0;
> +	int err, cpu;
> +
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return -EAGAIN;
> +
> +	for (cpu = 0; cpu < num_cpu; cpu++) {
> +		perf_counts(evsel->counts, cpu, 0)->val = 0;
> +		perf_counts(evsel->counts, cpu, 0)->ena = 0;
> +		perf_counts(evsel->counts, cpu, 0)->run = 0;
> +	}
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +		struct bpf_prog_profiler_bpf *skel = counter->skel;
> +
> +		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> +
> +		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
> +		if (err) {
> +			fprintf(stderr, "failed to read value\n");
> +			return err;
> +		}
> +
> +		for (cpu = 0; cpu < num_cpu; cpu++) {
> +			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
> +			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
> +			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
> +					    int fd)
> +{
> +	struct bpf_prog_profiler_bpf *skel;
> +	struct bpf_counter *counter;
> +	int ret;
> +
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +		skel = counter->skel;
> +		ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
> +					  &cpu, &fd, BPF_ANY);
> +		if (ret)
> +			return ret;
> +	}
> +	return 0;
> +}
> +
> +struct bpf_counter_ops bpf_program_profiler_ops = {
> +	.load       = bpf_program_profiler__load,
> +	.enable	    = bpf_program_profiler__enable,
> +	.read       = bpf_program_profiler__read,
> +	.destroy    = bpf_program_profiler__destroy,
> +	.install_pe = bpf_program_profiler__install_pe,
> +};
> +
> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
> +{
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return 0;
> +	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
> +}
> +
> +int bpf_counter__load(struct evsel *evsel, struct target *target)
> +{
> +	if (target__has_bpf(target))
> +		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
> +
> +	if (evsel->bpf_counter_ops)
> +		return evsel->bpf_counter_ops->load(evsel, target);
> +	return 0;
> +}
> +
> +int bpf_counter__enable(struct evsel *evsel)
> +{
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return 0;
> +	return evsel->bpf_counter_ops->enable(evsel);
> +}
> +
> +int bpf_counter__read(struct evsel *evsel)
> +{
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return -EAGAIN;
> +	return evsel->bpf_counter_ops->read(evsel);
> +}
> +
> +void bpf_counter__destroy(struct evsel *evsel)
> +{
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return;
> +	evsel->bpf_counter_ops->destroy(evsel);
> +	evsel->bpf_counter_ops = NULL;
> +}
> diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h
> new file mode 100644
> index 0000000000000..2eca210e5dc16
> --- /dev/null
> +++ b/tools/perf/util/bpf_counter.h
> @@ -0,0 +1,72 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __PERF_BPF_COUNTER_H
> +#define __PERF_BPF_COUNTER_H 1
> +
> +#include <linux/list.h>
> +
> +struct evsel;
> +struct target;
> +struct bpf_counter;
> +
> +typedef int (*bpf_counter_evsel_op)(struct evsel *evsel);
> +typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel,
> +					   struct target *target);
> +typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel,
> +					       int cpu,
> +					       int fd);
> +
> +struct bpf_counter_ops {
> +	bpf_counter_evsel_target_op load;
> +	bpf_counter_evsel_op enable;
> +	bpf_counter_evsel_op read;
> +	bpf_counter_evsel_op destroy;
> +	bpf_counter_evsel_install_pe_op install_pe;
> +};
> +
> +struct bpf_counter {
> +	void *skel;
> +	struct list_head list;
> +};
> +
> +#ifdef HAVE_BPF_SKEL
> +
> +int bpf_counter__load(struct evsel *evsel, struct target *target);
> +int bpf_counter__enable(struct evsel *evsel);
> +int bpf_counter__read(struct evsel *evsel);
> +void bpf_counter__destroy(struct evsel *evsel);
> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
> +
> +#else /* HAVE_BPF_SKEL */
> +
> +#include<linux/err.h>
> +
> +static inline int bpf_counter__load(struct evsel *evsel __maybe_unused,
> +				    struct target *target __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +static inline int bpf_counter__read(struct evsel *evsel __maybe_unused)
> +{
> +	return -EAGAIN;
> +}
> +
> +static inline void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
> +{
> +}
> +
> +static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused,
> +					  int cpu __maybe_unused,
> +					  int fd __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +#endif /* HAVE_BPF_SKEL */
> +
> +#endif /* __PERF_BPF_COUNTER_H */
> diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
> new file mode 100644
> index 0000000000000..c7cec92d02360
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
> @@ -0,0 +1,93 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +// Copyright (c) 2020 Facebook
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +/* map of perf event fds, num_cpu * num_metric entries */
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(int));
> +} events SEC(".maps");
> +
> +/* readings at fentry */
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
> +	__uint(max_entries, 1);
> +} fentry_readings SEC(".maps");
> +
> +/* accumulated readings */
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
> +	__uint(max_entries, 1);
> +} accum_readings SEC(".maps");
> +
> +const volatile __u32 num_cpu = 1;
> +
> +SEC("fentry/XXX")
> +int BPF_PROG(fentry_XXX)
> +{
> +	__u32 key = bpf_get_smp_processor_id();
> +	struct bpf_perf_event_value *ptr;
> +	__u32 zero = 0;
> +	long err;
> +
> +	/* look up before reading, to reduce error */
> +	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
> +	if (!ptr)
> +		return 0;
> +
> +	err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr));
> +	if (err)
> +		return 0;
> +
> +	return 0;
> +}
> +
> +static inline void
> +fexit_update_maps(struct bpf_perf_event_value *after)
> +{
> +	struct bpf_perf_event_value *before, diff, *accum;
> +	__u32 zero = 0;
> +
> +	before = bpf_map_lookup_elem(&fentry_readings, &zero);
> +	/* only account samples with a valid fentry_reading */
> +	if (before && before->counter) {
> +		struct bpf_perf_event_value *accum;
> +
> +		diff.counter = after->counter - before->counter;
> +		diff.enabled = after->enabled - before->enabled;
> +		diff.running = after->running - before->running;
> +
> +		accum = bpf_map_lookup_elem(&accum_readings, &zero);
> +		if (accum) {
> +			accum->counter += diff.counter;
> +			accum->enabled += diff.enabled;
> +			accum->running += diff.running;
> +		}
> +	}
> +}
> +
> +SEC("fexit/XXX")
> +int BPF_PROG(fexit_XXX)
> +{
> +	struct bpf_perf_event_value reading;
> +	__u32 cpu = bpf_get_smp_processor_id();
> +	__u32 one = 1, zero = 0;
> +	int err;
> +
> +	/* read all events before updating the maps, to reduce error */
> +	err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading));
> +	if (err)
> +		return 0;
> +
> +	fexit_update_maps(&reading);
> +	return 0;
> +}
> +
> +char LICENSE[] SEC("license") = "Dual BSD/GPL";
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index c26ea82220bd8..7265308765d73 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -25,6 +25,7 @@
>  #include <stdlib.h>
>  #include <perf/evsel.h>
>  #include "asm/bug.h"
> +#include "bpf_counter.h"
>  #include "callchain.h"
>  #include "cgroup.h"
>  #include "counts.h"
> @@ -51,6 +52,10 @@
>  #include <internal/lib.h>
>  
>  #include <linux/ctype.h>
> +#include <bpf/bpf.h>
> +#include <bpf/libbpf.h>
> +#include <bpf/btf.h>
> +#include "rlimit.h"
>  
>  struct perf_missing_features perf_missing_features;
>  
> @@ -247,6 +252,7 @@ void evsel__init(struct evsel *evsel,
>  	evsel->bpf_obj	   = NULL;
>  	evsel->bpf_fd	   = -1;
>  	INIT_LIST_HEAD(&evsel->config_terms);
> +	INIT_LIST_HEAD(&evsel->bpf_counter_list);
>  	perf_evsel__object.init(evsel);
>  	evsel->sample_size = __evsel__sample_size(attr->sample_type);
>  	evsel__calc_id_pos(evsel);
> @@ -1366,6 +1372,7 @@ void evsel__exit(struct evsel *evsel)
>  {
>  	assert(list_empty(&evsel->core.node));
>  	assert(evsel->evlist == NULL);
> +	bpf_counter__destroy(evsel);
>  	evsel__free_counts(evsel);
>  	perf_evsel__free_fd(&evsel->core);
>  	perf_evsel__free_id(&evsel->core);
> @@ -1781,6 +1788,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>  
>  			FD(evsel, cpu, thread) = fd;
>  
> +			bpf_counter__install_pe(evsel, cpu, fd);
> +
>  			if (unlikely(test_attr__enabled)) {
>  				test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
>  						fd, group_fd, flags);
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index cd1d8dd431997..40e3946cd7518 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -10,6 +10,7 @@
>  #include <internal/evsel.h>
>  #include <perf/evsel.h>
>  #include "symbol_conf.h"
> +#include "bpf_counter.h"
>  #include <internal/cpumap.h>
>  
>  struct bpf_object;
> @@ -17,6 +18,8 @@ struct cgroup;
>  struct perf_counts;
>  struct perf_stat_evsel;
>  union perf_event;
> +struct bpf_counter_ops;
> +struct target;
>  
>  typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
>  
> @@ -127,6 +130,8 @@ struct evsel {
>  	 * See also evsel__has_callchain().
>  	 */
>  	__u64			synth_sample_type;
> +	struct list_head	bpf_counter_list;
> +	struct bpf_counter_ops	*bpf_counter_ops;
>  };
>  
>  struct perf_missing_features {
> @@ -424,4 +429,5 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel)
>  struct perf_env *evsel__env(struct evsel *evsel);
>  
>  int evsel__store_ids(struct evsel *evsel, struct evlist *evlist);
> +
>  #endif /* __PERF_EVSEL_H */
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 583ae4f09c5d1..cce7a76d6473c 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -1045,7 +1045,9 @@ static void print_header(struct perf_stat_config *config,
>  	if (!config->csv_output) {
>  		fprintf(output, "\n");
>  		fprintf(output, " Performance counter stats for ");
> -		if (_target->system_wide)
> +		if (_target->bpf_str)
> +			fprintf(output, "\'BPF program(s) %s", _target->bpf_str);
> +		else if (_target->system_wide)
>  			fprintf(output, "\'system wide");
>  		else if (_target->cpu_list)
>  			fprintf(output, "\'CPU(s) %s", _target->cpu_list);
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 8ce1479c98f03..0b3957323f668 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -527,7 +527,7 @@ int create_perf_stat_counter(struct evsel *evsel,
>  	if (leader->core.nr_members > 1)
>  		attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
>  
> -	attr->inherit = !config->no_inherit;
> +	attr->inherit = !config->no_inherit && list_empty(&evsel->bpf_counter_list);
>  
>  	/*
>  	 * Some events get initialized with sample_(period/type) set,
> diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
> index a3db13dea937c..0f383418e3df5 100644
> --- a/tools/perf/util/target.c
> +++ b/tools/perf/util/target.c
> @@ -56,6 +56,34 @@ enum target_errno target__validate(struct target *target)
>  			ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
>  	}
>  
> +	/* BPF and CPU are mutually exclusive */
> +	if (target->bpf_str && target->cpu_list) {
> +		target->cpu_list = NULL;
> +		if (ret == TARGET_ERRNO__SUCCESS)
> +			ret = TARGET_ERRNO__BPF_OVERRIDE_CPU;
> +	}
> +
> +	/* BPF and PID/TID are mutually exclusive */
> +	if (target->bpf_str && target->tid) {
> +		target->tid = NULL;
> +		if (ret == TARGET_ERRNO__SUCCESS)
> +			ret = TARGET_ERRNO__BPF_OVERRIDE_PID;
> +	}
> +
> +	/* BPF and UID are mutually exclusive */
> +	if (target->bpf_str && target->uid_str) {
> +		target->uid_str = NULL;
> +		if (ret == TARGET_ERRNO__SUCCESS)
> +			ret = TARGET_ERRNO__BPF_OVERRIDE_UID;
> +	}
> +
> +	/* BPF and THREADS are mutually exclusive */
> +	if (target->bpf_str && target->per_thread) {
> +		target->per_thread = false;
> +		if (ret == TARGET_ERRNO__SUCCESS)
> +			ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD;
> +	}
> +
>  	/* THREAD and SYSTEM/CPU are mutually exclusive */
>  	if (target->per_thread && (target->system_wide || target->cpu_list)) {
>  		target->per_thread = false;
> @@ -109,6 +137,10 @@ static const char *target__error_str[] = {
>  	"PID/TID switch overriding SYSTEM",
>  	"UID switch overriding SYSTEM",
>  	"SYSTEM/CPU switch overriding PER-THREAD",
> +	"BPF switch overriding CPU",
> +	"BPF switch overriding PID/TID",
> +	"BPF switch overriding UID",
> +	"BPF switch overriding THREAD",
>  	"Invalid User: %s",
>  	"Problems obtaining information for user %s",
>  };
> @@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum,
>  
>  	switch (errnum) {
>  	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
> -	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
> +	     TARGET_ERRNO__BPF_OVERRIDE_THREAD:
>  		snprintf(buf, buflen, "%s", msg);
>  		break;
>  
> diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
> index 6ef01a83b24e9..f132c6c2eef81 100644
> --- a/tools/perf/util/target.h
> +++ b/tools/perf/util/target.h
> @@ -10,6 +10,7 @@ struct target {
>  	const char   *tid;
>  	const char   *cpu_list;
>  	const char   *uid_str;
> +	const char   *bpf_str;
>  	uid_t	     uid;
>  	bool	     system_wide;
>  	bool	     uses_mmap;
> @@ -36,6 +37,10 @@ enum target_errno {
>  	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
>  	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
>  	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
> +	TARGET_ERRNO__BPF_OVERRIDE_CPU,
> +	TARGET_ERRNO__BPF_OVERRIDE_PID,
> +	TARGET_ERRNO__BPF_OVERRIDE_UID,
> +	TARGET_ERRNO__BPF_OVERRIDE_THREAD,
>  
>  	/* for target__parse_uid() */
>  	TARGET_ERRNO__INVALID_UID,
> @@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target)
>  	return target->system_wide || target->cpu_list;
>  }
>  
> +static inline bool target__has_bpf(struct target *target)
> +{
> +	return target->bpf_str;
> +}
> +
>  static inline bool target__none(struct target *target)
>  {
>  	return !target__has_task(target) && !target__has_cpu(target);
> -- 
> 2.24.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-28 20:11   ` Arnaldo Carvalho de Melo
@ 2020-12-28 23:43     ` Song Liu
  2020-12-29  5:53       ` Song Liu
  2020-12-29 15:15       ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 25+ messages in thread
From: Song Liu @ 2020-12-28 23:43 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team



> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
>> Introduce perf-stat -b option, which counts events for BPF programs, like:
>> 
>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>>     1.487903822            115,200      ref-cycles
>>     1.487903822             86,012      cycles
>>     2.489147029             80,560      ref-cycles
>>     2.489147029             73,784      cycles
>>     3.490341825             60,720      ref-cycles
>>     3.490341825             37,797      cycles
>>     4.491540887             37,120      ref-cycles
>>     4.491540887             31,963      cycles
>> 
>> The example above counts cycles and ref-cycles of BPF program of id 254.
>> This is similar to bpftool-prog-profile command, but more flexible.
>> 
>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
>> programs (monitor-progs) to the target BPF program (target-prog). The
>> monitor-progs read perf_event before and after the target-prog, and
>> aggregate the difference in a BPF map. Then the user space reads data
>> from these maps.
>> 
>> A new struct bpf_counter is introduced to provide common interface that
>> uses BPF programs/maps to count perf events.
> 
> Segfaulting here:
> 
> [root@five ~]# bpftool prog  | grep tracepoint
> 110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
> 111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
> 112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
> 113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
> 114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
> 115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
> 116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
> 117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
> 118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
> [root@five ~]#
> [root@five ~]# gdb perf
> GNU gdb (GDB) Fedora 10.1-2.fc33
> Reading symbols from perf...
> (gdb) run stat -e instructions,cycles -b 113 -I 1000
> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> (gdb) bt
> #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> #1  0x0000000000000000 in ?? ()
> (gdb)
> 
> [acme@five perf]$ clang -v |& head -2
> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
> Target: x86_64-unknown-linux-gnu
> [acme@five perf]$
> 
> Do you need any extra info?

Hmm... I am not able to reproduce this. I am trying to setup an environment similar
to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? 

Thanks,
Song

> 
> Please when resubmitting, please combine patches 3/4 and 4/4, man pages
> updates usually come together with the new feature.
> 
> Thanks,
> 
> - Arnaldo
> 
> Full build output:
> 
> [acme@five perf]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;make VF=1 O=/tmp/build/perf -C tools/perf BUILD_BPF_SKEL=1 install-bin
> make: Entering directory '/home/acme/git/perf/tools/perf'
>  BUILD:   Doing 'make -j24' parallel build
>  HOSTCC   /tmp/build/perf/fixdep.o
>  HOSTLD   /tmp/build/perf/fixdep-in.o
>  LINK     /tmp/build/perf/fixdep
> Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
> diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
> 
> Auto-detecting system features:
> ...                         dwarf: [ on  ]
> ...            dwarf_getlocations: [ on  ]
> ...                         glibc: [ on  ]
> ...                        libbfd: [ on  ]
> ...                libbfd-buildid: [ on  ]
> ...                        libcap: [ on  ]
> ...                        libelf: [ on  ]
> ...                       libnuma: [ on  ]
> ...        numa_num_possible_cpus: [ on  ]
> ...                       libperl: [ on  ]
> ...                     libpython: [ on  ]
> ...                     libcrypto: [ on  ]
> ...                     libunwind: [ on  ]
> ...            libdw-dwarf-unwind: [ on  ]
> ...                          zlib: [ on  ]
> ...                          lzma: [ on  ]
> ...                     get_cpuid: [ on  ]
> ...                           bpf: [ on  ]
> ...                        libaio: [ on  ]
> ...                       libzstd: [ on  ]
> ...        disassembler-four-args: [ on  ]
> ...                     backtrace: [ on  ]
> ...                       eventfd: [ on  ]
> ...                fortify-source: [ on  ]
> ...         sync-compare-and-swap: [ on  ]
> ...          get_current_dir_name: [ on  ]
> ...                        gettid: [ on  ]
> ...             libelf-getphdrnum: [ on  ]
> ...           libelf-gelf_getnote: [ on  ]
> ...          libelf-getshdrstrndx: [ on  ]
> ...             libpython-version: [ on  ]
> ...                      libslang: [ on  ]
> ...       libslang-include-subdir: [ on  ]
> ...   pthread-attr-setaffinity-np: [ on  ]
> ...               pthread-barrier: [ on  ]
> ...                  reallocarray: [ on  ]
> ...            stackprotector-all: [ on  ]
> ...                       timerfd: [ on  ]
> ...                  sched_getcpu: [ on  ]
> ...                           sdt: [ on  ]
> ...                         setns: [ on  ]
> ...                   file-handle: [ on  ]
> 
> ...                        bionic: [ OFF ]
> ...                    compile-32: [ OFF ]
> ...                   compile-x32: [ OFF ]
> ...                cplus-demangle: [ on  ]
> ...                          gtk2: [ OFF ]
> ...                  gtk2-infobar: [ OFF ]
> ...                         hello: [ OFF ]
> ...                 libbabeltrace: [ on  ]
> ...                libbfd-liberty: [ OFF ]
> ...              libbfd-liberty-z: [ OFF ]
> ...                    libopencsd: [ OFF ]
> ...                 libunwind-x86: [ OFF ]
> ...              libunwind-x86_64: [ OFF ]
> ...                 libunwind-arm: [ OFF ]
> ...             libunwind-aarch64: [ OFF ]
> ...         libunwind-debug-frame: [ OFF ]
> ...     libunwind-debug-frame-arm: [ OFF ]
> ... libunwind-debug-frame-aarch64: [ OFF ]
> ...                           cxx: [ OFF ]
> ...                          llvm: [ OFF ]
> ...                  llvm-version: [ OFF ]
> ...                         clang: [ OFF ]
> ...                        libbpf: [ OFF ]
> ...                       libpfm4: [ OFF ]
> ...                 libdebuginfod: [ on  ]
> ...               clang-bpf-co-re: [ on  ]
> ...                        prefix: /home/acme
> ...                        bindir: /home/acme/bin
> ...                        libdir: /home/acme/lib64
> ...                    sysconfdir: /home/acme/etc
> ...                 LIBUNWIND_DIR:
> ...                     LIBDW_DIR:
> ...                          JDIR: /usr/lib/jvm/java-11-openjdk-11.0.9.11-4.fc33.x86_64
> ...     DWARF post unwind library: libunwind
> 
>  GEN      /tmp/build/perf/common-cmds.h
> CFLAGS= make -C ../bpf/bpftool \
> 	OUTPUT=/tmp/build/perf/util/bpf_skel/.tmp/ bootstrap
>  CC       /tmp/build/perf/exec-cmd.o
>  CC       /tmp/build/perf/help.o
>  MKDIR    /tmp/build/perf/pmu-events/
>  MKDIR    /tmp/build/perf/jvmti/
>  MKDIR    /tmp/build/perf/fd/
>  MKDIR    /tmp/build/perf/fs/
>  MKDIR    /tmp/build/perf/fs/
>  HOSTCC   /tmp/build/perf/pmu-events/json.o
>  CC       /tmp/build/perf/parse-options.o
>  CC       /tmp/build/perf/fd/array.o
>  CC       /tmp/build/perf/pager.o
>  CC       /tmp/build/perf/jvmti/libjvmti.o
>  CC       /tmp/build/perf/fs/fs.o
>  CC       /tmp/build/perf/run-command.o
>  MKDIR    /tmp/build/perf/jvmti/
>  CC       /tmp/build/perf/sigchain.o
>  MKDIR    /tmp/build/perf/fs/
>  CC       /tmp/build/perf/fs/tracing_path.o
>  CC       /tmp/build/perf/fs/cgroup.o
>  MKDIR    /tmp/build/perf/pmu-events/
>  CC       /tmp/build/perf/jvmti/libstring.o
>  HOSTCC   /tmp/build/perf/pmu-events/jevents.o
>  CC       /tmp/build/perf/jvmti/libctype.o
>  CC       /tmp/build/perf/subcmd-config.o
>  CC       /tmp/build/perf/jvmti/jvmti_agent.o
>  LD       /tmp/build/perf/fd/libapi-in.o
>  CC       /tmp/build/perf/event-parse.o
>  HOSTCC   /tmp/build/perf/pmu-events/jsmn.o
>  CC       /tmp/build/perf/event-plugin.o
>  CC       /tmp/build/perf/cpu.o
>  CC       /tmp/build/perf/trace-seq.o
>  CC       /tmp/build/perf/core.o
>  CC       /tmp/build/perf/parse-filter.o
>  CC       /tmp/build/perf/debug.o
>  CC       /tmp/build/perf/cpumap.o
>  LD       /tmp/build/perf/fs/libapi-in.o
>  CC       /tmp/build/perf/threadmap.o
>  LD       /tmp/build/perf/libsubcmd-in.o
>  HOSTLD   /tmp/build/perf/pmu-events/jevents-in.o
>  CC       /tmp/build/perf/str_error_r.o
>  CC       /tmp/build/perf/evsel.o
>  GEN      /tmp/build/perf/bpf_helper_defs.h
>  CC       /tmp/build/perf/evlist.o
>  CC       /tmp/build/perf/parse-utils.o
>  CC       /tmp/build/perf/zalloc.o
>  CC       /tmp/build/perf/kbuffer-parse.o
>  LD       /tmp/build/perf/jvmti/jvmti-in.o
>  CC       /tmp/build/perf/mmap.o
>  CC       /tmp/build/perf/tep_strerror.o
>  CC       /tmp/build/perf/xyarray.o
>  CC       /tmp/build/perf/event-parse-api.o
>  CC       /tmp/build/perf/lib.o
>  LINK     /tmp/build/perf/pmu-events/jevents
>  LD       /tmp/build/perf/libapi-in.o
>  AR       /tmp/build/perf/libsubcmd.a
>  LINK     /tmp/build/perf/libperf-jvmti.so
>  LD       /tmp/build/perf/libtraceevent-in.o
>  CC       /tmp/build/perf/plugin_hrtimer.o
>  CC       /tmp/build/perf/plugin_kmem.o
>  CC       /tmp/build/perf/plugin_mac80211.o
>  CC       /tmp/build/perf/plugin_kvm.o
>  CC       /tmp/build/perf/plugin_jbd2.o
>  LD       /tmp/build/perf/libperf-in.o
>  CC       /tmp/build/perf/plugin_sched_switch.o
>  CC       /tmp/build/perf/plugin_function.o
>  CC       /tmp/build/perf/plugin_scsi.o
>  CC       /tmp/build/perf/plugin_xen.o
>  CC       /tmp/build/perf/plugin_futex.o
>  CC       /tmp/build/perf/plugin_cfg80211.o
>  CC       /tmp/build/perf/plugin_tlb.o
>  AR       /tmp/build/perf/libapi.a
>  LD       /tmp/build/perf/plugin_hrtimer-in.o
>  LINK     /tmp/build/perf/libtraceevent.a
>  LD       /tmp/build/perf/plugin_kvm-in.o
>  LD       /tmp/build/perf/plugin_scsi-in.o
>  LD       /tmp/build/perf/plugin_kmem-in.o
>  LD       /tmp/build/perf/plugin_mac80211-in.o
>  LD       /tmp/build/perf/plugin_futex-in.o
>  LD       /tmp/build/perf/plugin_function-in.o
>  LD       /tmp/build/perf/plugin_xen-in.o
>  LD       /tmp/build/perf/plugin_sched_switch-in.o
>  LD       /tmp/build/perf/plugin_tlb-in.o
>  LD       /tmp/build/perf/plugin_jbd2-in.o
>  LINK     /tmp/build/perf/plugin_hrtimer.so
>  LINK     /tmp/build/perf/plugin_kmem.so
>  AR       /tmp/build/perf/libperf.a
>  LINK     /tmp/build/perf/plugin_scsi.so
>  LINK     /tmp/build/perf/plugin_kvm.so
>  LINK     /tmp/build/perf/plugin_mac80211.so
>  LD       /tmp/build/perf/plugin_cfg80211-in.o
>  LINK     /tmp/build/perf/plugin_futex.so
>  LINK     /tmp/build/perf/plugin_xen.so
>  LINK     /tmp/build/perf/plugin_function.so
>  LINK     /tmp/build/perf/plugin_tlb.so
>  LINK     /tmp/build/perf/plugin_jbd2.so
>  LINK     /tmp/build/perf/plugin_cfg80211.so
>  LINK     /tmp/build/perf/plugin_sched_switch.so
>  GEN      /tmp/build/perf/pmu-events/pmu-events.c
>  GEN      /tmp/build/perf/libtraceevent-dynamic-list
>  MKDIR    /tmp/build/perf/staticobjs/
>  MKDIR    /tmp/build/perf/staticobjs/
>  MKDIR    /tmp/build/perf/staticobjs/
>  MKDIR    /tmp/build/perf/staticobjs/
>  MKDIR    /tmp/build/perf/staticobjs/
>  MKDIR    /tmp/build/perf/staticobjs/
>  MKDIR    /tmp/build/perf/staticobjs/
>  PERF_VERSION = 5.11.rc1.g5eb0b370de61
>  MKDIR    /tmp/build/perf/staticobjs/
>  CC       /tmp/build/perf/staticobjs/libbpf_probes.o
>  CC       /tmp/build/perf/staticobjs/libbpf.o
>  CC       /tmp/build/perf/staticobjs/bpf.o
>  CC       /tmp/build/perf/staticobjs/nlattr.o
>  CC       /tmp/build/perf/staticobjs/btf.o
>  CC       /tmp/build/perf/staticobjs/xsk.o
>  GEN      perf-archive
>  CC       /tmp/build/perf/staticobjs/hashmap.o
>  GEN      perf-with-kcore
>  CC       /tmp/build/perf/staticobjs/btf_dump.o
>  CC       /tmp/build/perf/staticobjs/libbpf_errno.o
>  CC       /tmp/build/perf/staticobjs/str_error.o
>  CC       /tmp/build/perf/staticobjs/bpf_prog_linfo.o
>  CC       /tmp/build/perf/staticobjs/netlink.o
>  CC       /tmp/build/perf/staticobjs/ringbuf.o
>  LD       /tmp/build/perf/staticobjs/libbpf-in.o
>  LINK     /tmp/build/perf/libbpf.a
>  CLANG    /tmp/build/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o
>  DESCEND  plugins
>  GEN      /tmp/build/perf/python/perf.so
>  CC       /tmp/build/perf/plugins/plugin_jbd2.o
>  CC       /tmp/build/perf/plugins/plugin_kmem.o
>  CC       /tmp/build/perf/plugins/plugin_hrtimer.o
>  CC       /tmp/build/perf/plugins/plugin_mac80211.o
>  CC       /tmp/build/perf/plugins/plugin_kvm.o
>  CC       /tmp/build/perf/plugins/plugin_function.o
>  CC       /tmp/build/perf/plugins/plugin_xen.o
>  CC       /tmp/build/perf/plugins/plugin_sched_switch.o
>  CC       /tmp/build/perf/plugins/plugin_futex.o
>  CC       /tmp/build/perf/plugins/plugin_scsi.o
>  CC       /tmp/build/perf/plugins/plugin_tlb.o
>  CC       /tmp/build/perf/plugins/plugin_cfg80211.o
>  LD       /tmp/build/perf/plugins/plugin_jbd2-in.o
>  LD       /tmp/build/perf/plugins/plugin_kmem-in.o
>  LD       /tmp/build/perf/plugins/plugin_hrtimer-in.o
>  LD       /tmp/build/perf/plugins/plugin_kvm-in.o
>  LD       /tmp/build/perf/plugins/plugin_mac80211-in.o
>  LD       /tmp/build/perf/plugins/plugin_function-in.o
>  LD       /tmp/build/perf/plugins/plugin_xen-in.o
>  LD       /tmp/build/perf/plugins/plugin_sched_switch-in.o
>  LD       /tmp/build/perf/plugins/plugin_scsi-in.o
>  LD       /tmp/build/perf/plugins/plugin_futex-in.o
>  LD       /tmp/build/perf/plugins/plugin_cfg80211-in.o
>  LD       /tmp/build/perf/plugins/plugin_tlb-in.o
>  LINK     /tmp/build/perf/plugins/plugin_jbd2.so
>  LINK     /tmp/build/perf/plugins/plugin_hrtimer.so
>  LINK     /tmp/build/perf/plugins/plugin_kmem.so
>  LINK     /tmp/build/perf/plugins/plugin_mac80211.so
>  LINK     /tmp/build/perf/plugins/plugin_kvm.so
>  LINK     /tmp/build/perf/plugins/plugin_sched_switch.so
>  LINK     /tmp/build/perf/plugins/plugin_scsi.so
>  LINK     /tmp/build/perf/plugins/plugin_xen.so
>  LINK     /tmp/build/perf/plugins/plugin_function.so
>  LINK     /tmp/build/perf/plugins/plugin_futex.so
>  LINK     /tmp/build/perf/plugins/plugin_tlb.so
>  LINK     /tmp/build/perf/plugins/plugin_cfg80211.so
>  INSTALL  trace_plugins
>  CC       /tmp/build/perf/pmu-events/pmu-events.o
>  LD       /tmp/build/perf/pmu-events/pmu-events-in.o
> 
> Auto-detecting system features:
> ...                        libbfd: [ on  ]
> ...        disassembler-four-args: [ on  ]
> ...                          zlib: [ on  ]
> ...                        libcap: [ on  ]
> ...               clang-bpf-co-re: [ on  ]
> ...                  reallocarray: [ on  ]
> 
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/main.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/common.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/btf.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/json_writer.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/gen.o
> 
> Auto-detecting system features:
> ...                        libelf: [ on  ]
> ...                          zlib: [ on  ]
> ...                           bpf: [ on  ]
> 
>  GEN      /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/bpf_helper_defs.h
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
>  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_probes.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/nlattr.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/xsk.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_errno.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/bpf.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/netlink.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/str_error.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/hashmap.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/bpf_prog_linfo.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf_dump.o
>  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/ringbuf.o
>  LD       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf-in.o
>  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/libbpf.a
>  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/bpftool
>  GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>  CC       /tmp/build/perf/builtin-bench.o
>  CC       /tmp/build/perf/builtin-annotate.o
>  CC       /tmp/build/perf/builtin-diff.o
>  CC       /tmp/build/perf/builtin-config.o
>  CC       /tmp/build/perf/builtin-ftrace.o
>  CC       /tmp/build/perf/builtin-help.o
>  CC       /tmp/build/perf/builtin-evlist.o
>  CC       /tmp/build/perf/builtin-sched.o
>  CC       /tmp/build/perf/builtin-buildid-list.o
>  CC       /tmp/build/perf/builtin-kallsyms.o
>  CC       /tmp/build/perf/builtin-buildid-cache.o
>  CC       /tmp/build/perf/builtin-record.o
>  CC       /tmp/build/perf/builtin-report.o
>  CC       /tmp/build/perf/builtin-list.o
>  CC       /tmp/build/perf/builtin-stat.o
>  CC       /tmp/build/perf/builtin-top.o
>  CC       /tmp/build/perf/builtin-timechart.o
>  CC       /tmp/build/perf/builtin-script.o
>  CC       /tmp/build/perf/builtin-kmem.o
>  CC       /tmp/build/perf/builtin-lock.o
>  CC       /tmp/build/perf/builtin-kvm.o
>  CC       /tmp/build/perf/builtin-inject.o
>  CC       /tmp/build/perf/builtin-mem.o
>  CC       /tmp/build/perf/builtin-version.o
>  CC       /tmp/build/perf/builtin-data.o
>  CC       /tmp/build/perf/builtin-trace.o
>  CC       /tmp/build/perf/builtin-probe.o
>  CC       /tmp/build/perf/builtin-c2c.o
>  MKDIR    /tmp/build/perf/bench/
>  MKDIR    /tmp/build/perf/bench/
>  MKDIR    /tmp/build/perf/tests/
>  CC       /tmp/build/perf/arch/common.o
>  MKDIR    /tmp/build/perf/ui/
>  MKDIR    /tmp/build/perf/bench/
>  MKDIR    /tmp/build/perf/tests/
>  CC       /tmp/build/perf/bench/sched-messaging.o
>  CC       /tmp/build/perf/bench/sched-pipe.o
>  CC       /tmp/build/perf/tests/builtin-test.o
>  MKDIR    /tmp/build/perf/scripts/python/Perf-Trace-Util/
>  MKDIR    /tmp/build/perf/scripts/perl/Perf-Trace-Util/
>  MKDIR    /tmp/build/perf/tests/
>  MKDIR    /tmp/build/perf/ui/
>  CC       /tmp/build/perf/ui/setup.o
>  CC       /tmp/build/perf/tests/attr.o
>  CC       /tmp/build/perf/bench/syscall.o
>  CC       /tmp/build/perf/tests/parse-events.o
>  CC       /tmp/build/perf/trace/beauty/clone.o
>  CC       /tmp/build/perf/scripts/python/Perf-Trace-Util/Context.o
>  CC       /tmp/build/perf/tests/dso-data.o
>  MKDIR    /tmp/build/perf/arch/x86/util/
>  CC       /tmp/build/perf/ui/helpline.o
>  CC       /tmp/build/perf/scripts/perl/Perf-Trace-Util/Context.o
>  MKDIR    /tmp/build/perf/ui/
>  CC       /tmp/build/perf/ui/util.o
>  CC       /tmp/build/perf/arch/x86/util/header.o
>  MKDIR    /tmp/build/perf/arch/x86/tests/
>  CC       /tmp/build/perf/ui/hist.o
>  CC       /tmp/build/perf/ui/progress.o
>  CC       /tmp/build/perf/tests/vmlinux-kallsyms.o
>  CC       /tmp/build/perf/arch/x86/tests/regs_load.o
>  CC       /tmp/build/perf/bench/mem-functions.o
>  CC       /tmp/build/perf/trace/beauty/fcntl.o
>  CC       /tmp/build/perf/trace/beauty/flock.o
>  CC       /tmp/build/perf/bench/futex-hash.o
>  CC       /tmp/build/perf/trace/beauty/fsmount.o
>  CC       /tmp/build/perf/perf.o
>  CC       /tmp/build/perf/tests/openat-syscall.o
>  MKDIR    /tmp/build/perf/ui/stdio/
>  MKDIR    /tmp/build/perf/arch/x86/util/
>  LD       /tmp/build/perf/scripts/python/Perf-Trace-Util/perf-in.o
>  MKDIR    /tmp/build/perf/arch/x86/tests/
>  CC       /tmp/build/perf/tests/openat-syscall-all-cpus.o
>  CC       /tmp/build/perf/arch/x86/util/pmu.o
>  CC       /tmp/build/perf/trace/beauty/fspick.o
>  CC       /tmp/build/perf/arch/x86/util/tsc.o
>  CC       /tmp/build/perf/util/annotate.o
>  CC       /tmp/build/perf/tests/openat-syscall-tp-fields.o
>  CC       /tmp/build/perf/trace/beauty/ioctl.o
>  CC       /tmp/build/perf/bench/futex-wake.o
>  CC       /tmp/build/perf/arch/x86/tests/dwarf-unwind.o
>  CC       /tmp/build/perf/bench/futex-wake-parallel.o
>  CC       /tmp/build/perf/ui/stdio/hist.o
>  CC       /tmp/build/perf/arch/x86/tests/arch-tests.o
>  CC       /tmp/build/perf/trace/beauty/kcmp.o
>  CC       /tmp/build/perf/arch/x86/util/perf_regs.o
>  CC       /tmp/build/perf/trace/beauty/mount_flags.o
>  CC       /tmp/build/perf/arch/x86/util/kvm-stat.o
>  CC       /tmp/build/perf/bench/futex-requeue.o
>  CC       /tmp/build/perf/util/block-info.o
>  CC       /tmp/build/perf/arch/x86/tests/rdpmc.o
>  CC       /tmp/build/perf/bench/futex-lock-pi.o
>  CC       /tmp/build/perf/arch/x86/tests/insn-x86.o
>  CC       /tmp/build/perf/trace/beauty/move_mount.o
>  CC       /tmp/build/perf/tests/mmap-basic.o
>  CC       /tmp/build/perf/trace/beauty/pkey_alloc.o
>  CC       /tmp/build/perf/tests/perf-record.o
>  CC       /tmp/build/perf/trace/beauty/arch_prctl.o
>  CC       /tmp/build/perf/arch/x86/util/topdown.o
>  CC       /tmp/build/perf/tests/evsel-roundtrip-name.o
>  CC       /tmp/build/perf/tests/evsel-tp-sched.o
>  CC       /tmp/build/perf/arch/x86/tests/intel-pt-pkt-decoder-test.o
>  CC       /tmp/build/perf/bench/epoll-wait.o
>  CC       /tmp/build/perf/tests/fdarray.o
>  CC       /tmp/build/perf/bench/epoll-ctl.o
>  CC       /tmp/build/perf/arch/x86/util/machine.o
>  CC       /tmp/build/perf/trace/beauty/prctl.o
>  CC       /tmp/build/perf/ui/browser.o
>  CC       /tmp/build/perf/arch/x86/util/event.o
>  CC       /tmp/build/perf/arch/x86/tests/bp-modify.o
>  CC       /tmp/build/perf/trace/beauty/renameat.o
>  CC       /tmp/build/perf/tests/pmu.o
>  CC       /tmp/build/perf/tests/pmu-events.o
>  CC       /tmp/build/perf/tests/hists_common.o
>  CC       /tmp/build/perf/bench/synthesize.o
>  CC       /tmp/build/perf/arch/x86/util/dwarf-regs.o
>  CC       /tmp/build/perf/tests/hists_link.o
>  CC       /tmp/build/perf/trace/beauty/sockaddr.o
>  CC       /tmp/build/perf/arch/x86/util/unwind-libunwind.o
>  CC       /tmp/build/perf/trace/beauty/socket.o
>  CC       /tmp/build/perf/trace/beauty/statx.o
>  LD       /tmp/build/perf/arch/x86/tests/perf-in.o
>  CC       /tmp/build/perf/arch/x86/util/auxtrace.o
>  CC       /tmp/build/perf/arch/x86/util/archinsn.o
>  CC       /tmp/build/perf/bench/kallsyms-parse.o
>  CC       /tmp/build/perf/bench/find-bit-bench.o
>  CC       /tmp/build/perf/arch/x86/util/intel-pt.o
>  CC       /tmp/build/perf/trace/beauty/sync_file_range.o
>  CC       /tmp/build/perf/arch/x86/util/intel-bts.o
>  MKDIR    /tmp/build/perf/trace/beauty/tracepoints/
>  MKDIR    /tmp/build/perf/ui/browsers/
>  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_irq_vectors.o
>  MKDIR    /tmp/build/perf/trace/beauty/tracepoints/
>  CC       /tmp/build/perf/util/block-range.o
>  CC       /tmp/build/perf/ui/browsers/annotate.o
>  MKDIR    /tmp/build/perf/ui/browsers/
>  CC       /tmp/build/perf/util/build-id.o
>  CC       /tmp/build/perf/bench/inject-buildid.o
>  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
>  MKDIR    /tmp/build/perf/ui/tui/
>  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
>  MKDIR    /tmp/build/perf/ui/tui/
>  CC       /tmp/build/perf/ui/browsers/hists.o
>  CC       /tmp/build/perf/ui/browsers/map.o
>  LD       /tmp/build/perf/arch/x86/util/perf-in.o
>  CC       /tmp/build/perf/tests/hists_filter.o
>  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o
>  CC       /tmp/build/perf/ui/tui/setup.o
>  CC       /tmp/build/perf/ui/tui/util.o
>  MKDIR    /tmp/build/perf/ui/tui/
>  CC       /tmp/build/perf/bench/numa.o
>  CC       /tmp/build/perf/util/cacheline.o
>  CC       /tmp/build/perf/ui/browsers/scripts.o
>  LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
>  CC       /tmp/build/perf/ui/tui/helpline.o
>  CC       /tmp/build/perf/util/config.o
>  CC       /tmp/build/perf/ui/tui/progress.o
>  CC       /tmp/build/perf/ui/browsers/header.o
>  LD       /tmp/build/perf/trace/beauty/perf-in.o
>  CC       /tmp/build/perf/ui/browsers/res_sample.o
>  CC       /tmp/build/perf/tests/hists_output.o
>  LD       /tmp/build/perf/bench/perf-in.o
>  CC       /tmp/build/perf/tests/hists_cumulate.o
>  CC       /tmp/build/perf/util/copyfile.o
>  LD       /tmp/build/perf/arch/x86/perf-in.o
>  CC       /tmp/build/perf/util/ctype.o
>  CC       /tmp/build/perf/util/db-export.o
>  CC       /tmp/build/perf/util/env.o
>  CC       /tmp/build/perf/util/event.o
>  LD       /tmp/build/perf/arch/perf-in.o
>  CC       /tmp/build/perf/util/evlist.o
>  CC       /tmp/build/perf/util/sideband_evlist.o
>  CC       /tmp/build/perf/util/evsel.o
>  CC       /tmp/build/perf/util/evsel_fprintf.o
>  CC       /tmp/build/perf/tests/python-use.o
>  CC       /tmp/build/perf/util/perf_event_attr_fprintf.o
>  CC       /tmp/build/perf/util/evswitch.o
>  CC       /tmp/build/perf/util/find_bit.o
>  CC       /tmp/build/perf/tests/bp_signal.o
>  LD       /tmp/build/perf/ui/tui/perf-in.o
>  CC       /tmp/build/perf/util/get_current_dir_name.o
>  CC       /tmp/build/perf/tests/bp_signal_overflow.o
>  CC       /tmp/build/perf/util/kallsyms.o
>  CC       /tmp/build/perf/tests/bp_account.o
>  CC       /tmp/build/perf/util/llvm-utils.o
>  CC       /tmp/build/perf/util/levenshtein.o
>  CC       /tmp/build/perf/util/mmap.o
>  CC       /tmp/build/perf/tests/wp.o
>  CC       /tmp/build/perf/util/memswap.o
>  CC       /tmp/build/perf/util/perf_regs.o
>  BISON    /tmp/build/perf/util/parse-events-bison.c
>  CC       /tmp/build/perf/tests/task-exit.o
>  CC       /tmp/build/perf/util/path.o
>  CC       /tmp/build/perf/util/print_binary.o
>  CC       /tmp/build/perf/util/rlimit.o
>  CC       /tmp/build/perf/tests/sw-clock.o
>  CC       /tmp/build/perf/tests/mmap-thread-lookup.o
>  CC       /tmp/build/perf/util/argv_split.o
>  CC       /tmp/build/perf/util/rbtree.o
>  CC       /tmp/build/perf/tests/thread-maps-share.o
>  CC       /tmp/build/perf/util/libstring.o
>  CC       /tmp/build/perf/tests/switch-tracking.o
>  CC       /tmp/build/perf/tests/keep-tracking.o
>  CC       /tmp/build/perf/util/bitmap.o
>  CC       /tmp/build/perf/util/hweight.o
>  CC       /tmp/build/perf/util/smt.o
>  CC       /tmp/build/perf/tests/code-reading.o
>  CC       /tmp/build/perf/util/strbuf.o
>  CC       /tmp/build/perf/util/string.o
>  CC       /tmp/build/perf/tests/sample-parsing.o
>  CC       /tmp/build/perf/tests/parse-no-sample-id-all.o
>  CC       /tmp/build/perf/util/strfilter.o
>  CC       /tmp/build/perf/tests/kmod-path.o
>  CC       /tmp/build/perf/util/strlist.o
>  CC       /tmp/build/perf/util/top.o
>  CC       /tmp/build/perf/tests/thread-map.o
>  CC       /tmp/build/perf/util/usage.o
>  CC       /tmp/build/perf/util/dso.o
>  CC       /tmp/build/perf/util/dsos.o
>  CC       /tmp/build/perf/util/symbol.o
>  CC       /tmp/build/perf/util/symbol_fprintf.o
>  CC       /tmp/build/perf/tests/llvm.o
>  CC       /tmp/build/perf/util/color.o
>  CC       /tmp/build/perf/util/color_config.o
>  CC       /tmp/build/perf/util/metricgroup.o
>  CC       /tmp/build/perf/util/header.o
>  CC       /tmp/build/perf/util/callchain.o
>  CC       /tmp/build/perf/util/values.o
>  CC       /tmp/build/perf/tests/bpf.o
>  CC       /tmp/build/perf/util/debug.o
>  CC       /tmp/build/perf/util/fncache.o
>  CC       /tmp/build/perf/tests/topology.o
>  CC       /tmp/build/perf/util/machine.o
>  CC       /tmp/build/perf/tests/cpumap.o
>  CC       /tmp/build/perf/util/map.o
>  CC       /tmp/build/perf/tests/mem.o
>  CC       /tmp/build/perf/util/pstack.o
>  CC       /tmp/build/perf/util/session.o
>  CC       /tmp/build/perf/tests/stat.o
>  CC       /tmp/build/perf/tests/event_update.o
>  LD       /tmp/build/perf/ui/browsers/perf-in.o
>  CC       /tmp/build/perf/tests/event-times.o
>  CC       /tmp/build/perf/tests/expr.o
>  CC       /tmp/build/perf/util/sample-raw.o
>  CC       /tmp/build/perf/util/s390-sample-raw.o
>  CC       /tmp/build/perf/tests/sdt.o
>  CC       /tmp/build/perf/util/syscalltbl.o
>  CC       /tmp/build/perf/tests/is_printable_array.o
>  CC       /tmp/build/perf/util/ordered-events.o
>  CC       /tmp/build/perf/tests/backward-ring-buffer.o
>  CC       /tmp/build/perf/tests/bitmap.o
>  CC       /tmp/build/perf/util/namespaces.o
>  CC       /tmp/build/perf/tests/perf-hooks.o
>  CC       /tmp/build/perf/tests/clang.o
>  CC       /tmp/build/perf/util/comm.o
>  CC       /tmp/build/perf/tests/unit_number__scnprintf.o
>  CC       /tmp/build/perf/tests/mem2node.o
>  CC       /tmp/build/perf/tests/maps.o
>  CC       /tmp/build/perf/util/thread.o
>  CC       /tmp/build/perf/util/thread_map.o
>  CC       /tmp/build/perf/tests/time-utils-test.o
>  CC       /tmp/build/perf/tests/genelf.o
>  CC       /tmp/build/perf/util/trace-event-parse.o
>  BISON    /tmp/build/perf/util/pmu-bison.c
>  CC       /tmp/build/perf/util/trace-event-read.o
>  CC       /tmp/build/perf/tests/api-io.o
>  CC       /tmp/build/perf/util/trace-event-info.o
>  CC       /tmp/build/perf/util/trace-event-scripting.o
>  CC       /tmp/build/perf/tests/pfm.o
>  CC       /tmp/build/perf/tests/demangle-java-test.o
>  CC       /tmp/build/perf/util/trace-event.o
>  LD       /tmp/build/perf/ui/perf-in.o
>  CC       /tmp/build/perf/tests/parse-metric.o
>  CC       /tmp/build/perf/tests/pe-file-parsing.o
>  CC       /tmp/build/perf/tests/expand-cgroup.o
>  CC       /tmp/build/perf/tests/perf-time-to-tsc.o
>  CC       /tmp/build/perf/util/svghelper.o
>  CC       /tmp/build/perf/util/sort.o
>  CC       /tmp/build/perf/util/hist.o
>  CC       /tmp/build/perf/tests/dwarf-unwind.o
>  CC       /tmp/build/perf/tests/llvm-src-base.o
>  CC       /tmp/build/perf/util/cpumap.o
>  CC       /tmp/build/perf/util/util.o
>  CC       /tmp/build/perf/util/affinity.o
>  CC       /tmp/build/perf/util/cputopo.o
>  CC       /tmp/build/perf/util/target.o
>  CC       /tmp/build/perf/util/cgroup.o
>  CC       /tmp/build/perf/tests/llvm-src-kbuild.o
>  CC       /tmp/build/perf/util/rblist.o
>  CC       /tmp/build/perf/tests/llvm-src-prologue.o
>  CC       /tmp/build/perf/tests/llvm-src-relocation.o
>  CC       /tmp/build/perf/util/intlist.o
>  CC       /tmp/build/perf/util/counts.o
>  CC       /tmp/build/perf/util/vdso.o
>  CC       /tmp/build/perf/util/stat.o
>  CC       /tmp/build/perf/util/stat-shadow.o
>  CC       /tmp/build/perf/util/stat-display.o
>  CC       /tmp/build/perf/util/perf_api_probe.o
>  CC       /tmp/build/perf/util/record.o
>  CC       /tmp/build/perf/util/srcline.o
>  LD       /tmp/build/perf/tests/perf-in.o
>  CC       /tmp/build/perf/util/srccode.o
>  CC       /tmp/build/perf/util/synthetic-events.o
>  CC       /tmp/build/perf/util/data.o
>  CC       /tmp/build/perf/util/cloexec.o
>  CC       /tmp/build/perf/util/tsc.o
>  CC       /tmp/build/perf/util/rwsem.o
>  CC       /tmp/build/perf/util/call-path.o
>  CC       /tmp/build/perf/util/thread-stack.o
>  CC       /tmp/build/perf/util/spark.o
>  CC       /tmp/build/perf/util/topdown.o
>  CC       /tmp/build/perf/util/auxtrace.o
>  CC       /tmp/build/perf/util/intel-pt.o
>  CC       /tmp/build/perf/util/stream.o
>  CC       /tmp/build/perf/util/intel-bts.o
>  MKDIR    /tmp/build/perf/util/arm-spe-decoder/
>  CC       /tmp/build/perf/util/arm-spe.o
>  MKDIR    /tmp/build/perf/util/intel-pt-decoder/
>  MKDIR    /tmp/build/perf/util/arm-spe-decoder/
>  CC       /tmp/build/perf/util/s390-cpumsf.o
>  MKDIR    /tmp/build/perf/util/intel-pt-decoder/
>  MKDIR    /tmp/build/perf/util/scripting-engines/
>  MKDIR    /tmp/build/perf/util/intel-pt-decoder/
>  MKDIR    /tmp/build/perf/util/scripting-engines/
>  CC       /tmp/build/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.o
>  GEN      /tmp/build/perf/util/intel-pt-decoder/inat-tables.c
>  CC       /tmp/build/perf/util/arm-spe-decoder/arm-spe-decoder.o
>  CC       /tmp/build/perf/util/scripting-engines/trace-event-perl.o
>  CC       /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o
>  CC       /tmp/build/perf/util/intel-pt-decoder/intel-pt-log.o
>  CC       /tmp/build/perf/util/dump-insn.o
>  CC       /tmp/build/perf/util/parse-branch-options.o
>  CC       /tmp/build/perf/util/scripting-engines/trace-event-python.o
>  CC       /tmp/build/perf/util/parse-regs-options.o
>  CC       /tmp/build/perf/util/parse-sublevel-options.o
>  MKDIR    /tmp/build/perf/util/intel-pt-decoder/
>  CC       /tmp/build/perf/util/term.o
>  CC       /tmp/build/perf/util/help-unknown-cmd.o
>  LD       /tmp/build/perf/util/arm-spe-decoder/perf-in.o
>  CC       /tmp/build/perf/util/mem-events.o
>  CC       /tmp/build/perf/util/vsprintf.o
>  CC       /tmp/build/perf/util/units.o
>  CC       /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o
>  BISON    /tmp/build/perf/util/expr-bison.c
>  CC       /tmp/build/perf/util/time-utils.o
>  CC       /tmp/build/perf/util/branch.o
>  CC       /tmp/build/perf/util/mem2node.o
>  CC       /tmp/build/perf/util/clockid.o
>  CC       /tmp/build/perf/util/bpf-loader.o
>  CC       /tmp/build/perf/util/bpf_map.o
>  CC       /tmp/build/perf/util/bpf_counter.o
>  CC       /tmp/build/perf/util/bpf-prologue.o
>  CC       /tmp/build/perf/util/symbol-elf.o
>  CC       /tmp/build/perf/util/probe-file.o
>  CC       /tmp/build/perf/util/probe-event.o
>  CC       /tmp/build/perf/util/probe-finder.o
>  CC       /tmp/build/perf/util/dwarf-aux.o
>  CC       /tmp/build/perf/util/dwarf-regs.o
>  LD       /tmp/build/perf/util/scripting-engines/perf-in.o
>  CC       /tmp/build/perf/util/unwind-libunwind-local.o
>  CC       /tmp/build/perf/util/unwind-libunwind.o
>  CC       /tmp/build/perf/util/intel-pt-decoder/intel-pt-insn-decoder.o
>  CC       /tmp/build/perf/util/data-convert-bt.o
>  CC       /tmp/build/perf/util/zlib.o
>  CC       /tmp/build/perf/util/lzma.o
>  CC       /tmp/build/perf/util/cap.o
>  CC       /tmp/build/perf/util/zstd.o
>  CC       /tmp/build/perf/util/demangle-java.o
>  CC       /tmp/build/perf/util/demangle-rust.o
>  CC       /tmp/build/perf/util/genelf.o
>  CC       /tmp/build/perf/util/jitdump.o
>  CC       /tmp/build/perf/util/genelf_debug.o
>  CC       /tmp/build/perf/util/perf-hooks.o
>  CC       /tmp/build/perf/util/bpf-event.o
>  FLEX     /tmp/build/perf/util/pmu-flex.c
>  CC       /tmp/build/perf/util/pmu-bison.o
>  CC       /tmp/build/perf/util/pmu.o
>  CC       /tmp/build/perf/util/pmu-flex.o
>  FLEX     /tmp/build/perf/util/parse-events-flex.c
>  CC       /tmp/build/perf/util/parse-events-bison.o
>  FLEX     /tmp/build/perf/util/expr-flex.c
>  CC       /tmp/build/perf/util/expr-bison.o
>  LD       /tmp/build/perf/util/intel-pt-decoder/perf-in.o
>  CC       /tmp/build/perf/util/parse-events.o
>  CC       /tmp/build/perf/util/parse-events-flex.o
>  CC       /tmp/build/perf/util/expr-flex.o
>  CC       /tmp/build/perf/util/expr.o
>  LD       /tmp/build/perf/scripts/perl/Perf-Trace-Util/perf-in.o
>  LD       /tmp/build/perf/scripts/perf-in.o
>  LD       /tmp/build/perf/util/perf-in.o
>  LD       /tmp/build/perf/perf-in.o
>  LINK     /tmp/build/perf/perf
>  INSTALL  tests
>  INSTALL  binaries
>  INSTALL  libperf-jvmti.so
>  INSTALL  libexec
>  INSTALL  bpf-headers
>  INSTALL  bpf-examples
>  INSTALL  perf-archive
>  INSTALL  perf-with-kcore
>  INSTALL  strace/groups
>  INSTALL  perl-scripts
>  INSTALL  python-scripts
>  INSTALL  perf_completion-script
>  INSTALL  perf-tip
> make: Leaving directory '/home/acme/git/perf/tools/perf'
> [acme@five perf]$
> 
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> tools/perf/Makefile.perf                      |   2 +-
>> tools/perf/builtin-stat.c                     |  77 ++++-
>> tools/perf/util/Build                         |   1 +
>> tools/perf/util/bpf_counter.c                 | 296 ++++++++++++++++++
>> tools/perf/util/bpf_counter.h                 |  72 +++++
>> .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  93 ++++++
>> tools/perf/util/evsel.c                       |   9 +
>> tools/perf/util/evsel.h                       |   6 +
>> tools/perf/util/stat-display.c                |   4 +-
>> tools/perf/util/stat.c                        |   2 +-
>> tools/perf/util/target.c                      |  34 +-
>> tools/perf/util/target.h                      |  10 +
>> 12 files changed, 588 insertions(+), 18 deletions(-)
>> create mode 100644 tools/perf/util/bpf_counter.c
>> create mode 100644 tools/perf/util/bpf_counter.h
>> create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
>> 
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index d182a2dbb9bbd..8c4e039c3b813 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -1015,7 +1015,7 @@ python-clean:
>> 
>> SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
>> SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
>> -SKELETONS :=
>> +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
>> 
>> ifdef BUILD_BPF_SKEL
>> BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 8cc24967bc273..09bffb3fbcdd4 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -67,6 +67,7 @@
>> #include "util/top.h"
>> #include "util/affinity.h"
>> #include "util/pfm.h"
>> +#include "util/bpf_counter.h"
>> #include "asm/bug.h"
>> 
>> #include <linux/time64.h>
>> @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs)
>> 	return 0;
>> }
>> 
>> +static int read_bpf_map_counters(void)
>> +{
>> +	struct evsel *counter;
>> +	int err;
>> +
>> +	evlist__for_each_entry(evsel_list, counter) {
>> +		err = bpf_counter__read(counter);
>> +		if (err)
>> +			return err;
>> +	}
>> +	return 0;
>> +}
>> +
>> static void read_counters(struct timespec *rs)
>> {
>> 	struct evsel *counter;
>> +	int err;
>> 
>> -	if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0))
>> -		return;
>> +	if (!stat_config.stop_read_counter) {
>> +		err = read_bpf_map_counters();
>> +		if (err == -EAGAIN)
>> +			err = read_affinity_counters(rs);
>> +		if (err < 0)
>> +			return;
>> +	}
>> 
>> 	evlist__for_each_entry(evsel_list, counter) {
>> 		if (counter->err)
>> @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times)
>> 	return false;
>> }
>> 
>> -static void enable_counters(void)
>> +static int enable_counters(void)
>> {
>> +	struct evsel *evsel;
>> +	int err;
>> +
>> +	evlist__for_each_entry(evsel_list, evsel) {
>> +		err = bpf_counter__enable(evsel);
>> +		if (err)
>> +			return err;
>> +	}
>> +
>> 	if (stat_config.initial_delay < 0) {
>> 		pr_info(EVLIST_DISABLED_MSG);
>> -		return;
>> +		return 0;
>> 	}
>> 
>> 	if (stat_config.initial_delay > 0) {
>> @@ -518,6 +547,7 @@ static void enable_counters(void)
>> 		if (stat_config.initial_delay > 0)
>> 			pr_info(EVLIST_ENABLED_MSG);
>> 	}
>> +	return 0;
>> }
>> 
>> static void disable_counters(void)
>> @@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> 	const bool forks = (argc > 0);
>> 	bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
>> 	struct affinity affinity;
>> -	int i, cpu;
>> +	int i, cpu, err;
>> 	bool second_pass = false;
>> 
>> 	if (forks) {
>> @@ -737,6 +767,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> 	if (affinity__setup(&affinity) < 0)
>> 		return -1;
>> 
>> +	evlist__for_each_entry(evsel_list, counter) {
>> +		if (bpf_counter__load(counter, &target))
>> +			return -1;
>> +	}
>> +
>> 	evlist__for_each_cpu (evsel_list, i, cpu) {
>> 		affinity__set(&affinity, cpu);
>> 
>> @@ -850,7 +885,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> 	}
>> 
>> 	if (STAT_RECORD) {
>> -		int err, fd = perf_data__fd(&perf_stat.data);
>> +		int fd = perf_data__fd(&perf_stat.data);
>> 
>> 		if (is_pipe) {
>> 			err = perf_header__write_pipe(perf_data__fd(&perf_stat.data));
>> @@ -876,7 +911,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> 
>> 	if (forks) {
>> 		evlist__start_workload(evsel_list);
>> -		enable_counters();
>> +		err = enable_counters();
>> +		if (err)
>> +			return -1;
>> 
>> 		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
>> 			status = dispatch_events(forks, timeout, interval, &times);
>> @@ -895,7 +932,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>> 		if (WIFSIGNALED(status))
>> 			psignal(WTERMSIG(status), argv[0]);
>> 	} else {
>> -		enable_counters();
>> +		err = enable_counters();
>> +		if (err)
>> +			return -1;
>> 		status = dispatch_events(forks, timeout, interval, &times);
>> 	}
>> 
>> @@ -1085,6 +1124,10 @@ static struct option stat_options[] = {
>> 		   "stat events on existing process id"),
>> 	OPT_STRING('t', "tid", &target.tid, "tid",
>> 		   "stat events on existing thread id"),
>> +#ifdef HAVE_BPF_SKEL
>> +	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
>> +		   "stat events on existing bpf program id"),
>> +#endif
>> 	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
>> 		    "system-wide collection from all CPUs"),
>> 	OPT_BOOLEAN('g', "group", &group,
>> @@ -2064,11 +2107,12 @@ int cmd_stat(int argc, const char **argv)
>> 		"perf stat [<options>] [<command>]",
>> 		NULL
>> 	};
>> -	int status = -EINVAL, run_idx;
>> +	int status = -EINVAL, run_idx, err;
>> 	const char *mode;
>> 	FILE *output = stderr;
>> 	unsigned int interval, timeout;
>> 	const char * const stat_subcommands[] = { "record", "report" };
>> +	char errbuf[BUFSIZ];
>> 
>> 	setlocale(LC_ALL, "");
>> 
>> @@ -2179,6 +2223,12 @@ int cmd_stat(int argc, const char **argv)
>> 	} else if (big_num_opt == 0) /* User passed --no-big-num */
>> 		stat_config.big_num = false;
>> 
>> +	err = target__validate(&target);
>> +	if (err) {
>> +		target__strerror(&target, err, errbuf, BUFSIZ);
>> +		pr_warning("%s\n", errbuf);
>> +	}
>> +
>> 	setup_system_wide(argc);
>> 
>> 	/*
>> @@ -2252,8 +2302,6 @@ int cmd_stat(int argc, const char **argv)
>> 		}
>> 	}
>> 
>> -	target__validate(&target);
>> -
>> 	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
>> 		target.per_thread = true;
>> 
>> @@ -2384,9 +2432,10 @@ int cmd_stat(int argc, const char **argv)
>> 		 * tools remain  -acme
>> 		 */
>> 		int fd = perf_data__fd(&perf_stat.data);
>> -		int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
>> -							     process_synthesized_event,
>> -							     &perf_stat.session->machines.host);
>> +
>> +		err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
>> +							 process_synthesized_event,
>> +							 &perf_stat.session->machines.host);
>> 		if (err) {
>> 			pr_warning("Couldn't synthesize the kernel mmap record, harmless, "
>> 				   "older tools may produce warnings about this file\n.");
>> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
>> index e2563d0154eb6..188521f343470 100644
>> --- a/tools/perf/util/Build
>> +++ b/tools/perf/util/Build
>> @@ -135,6 +135,7 @@ perf-y += clockid.o
>> 
>> perf-$(CONFIG_LIBBPF) += bpf-loader.o
>> perf-$(CONFIG_LIBBPF) += bpf_map.o
>> +perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
>> perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
>> perf-$(CONFIG_LIBELF) += symbol-elf.o
>> perf-$(CONFIG_LIBELF) += probe-file.o
>> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
>> new file mode 100644
>> index 0000000000000..f2cb86a40c882
>> --- /dev/null
>> +++ b/tools/perf/util/bpf_counter.c
>> @@ -0,0 +1,296 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +/* Copyright (c) 2019 Facebook */
>> +
>> +#include <limits.h>
>> +#include <unistd.h>
>> +#include <sys/time.h>
>> +#include <sys/resource.h>
>> +#include <linux/err.h>
>> +#include <linux/zalloc.h>
>> +#include <bpf/bpf.h>
>> +#include <bpf/btf.h>
>> +#include <bpf/libbpf.h>
>> +
>> +#include "bpf_counter.h"
>> +#include "counts.h"
>> +#include "debug.h"
>> +#include "evsel.h"
>> +#include "target.h"
>> +
>> +#include "bpf_skel/bpf_prog_profiler.skel.h"
>> +
>> +static inline void *u64_to_ptr(__u64 ptr)
>> +{
>> +	return (void *)(unsigned long)ptr;
>> +}
>> +
>> +static void set_max_rlimit(void)
>> +{
>> +	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
>> +
>> +	setrlimit(RLIMIT_MEMLOCK, &rinf);
>> +}
>> +
>> +static struct bpf_counter *bpf_counter_alloc(void)
>> +{
>> +	struct bpf_counter *counter;
>> +
>> +	counter = zalloc(sizeof(*counter));
>> +	if (counter)
>> +		INIT_LIST_HEAD(&counter->list);
>> +	return counter;
>> +}
>> +
>> +static int bpf_program_profiler__destroy(struct evsel *evsel)
>> +{
>> +	struct bpf_counter *counter;
>> +
>> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list)
>> +		bpf_prog_profiler_bpf__destroy(counter->skel);
>> +	INIT_LIST_HEAD(&evsel->bpf_counter_list);
>> +	return 0;
>> +}
>> +
>> +static char *bpf_target_prog_name(int tgt_fd)
>> +{
>> +	struct bpf_prog_info_linear *info_linear;
>> +	struct bpf_func_info *func_info;
>> +	const struct btf_type *t;
>> +	char *name = NULL;
>> +	struct btf *btf;
>> +
>> +	info_linear = bpf_program__get_prog_info_linear(
>> +		tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO);
>> +	if (IS_ERR_OR_NULL(info_linear)) {
>> +		pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd);
>> +		return NULL;
>> +	}
>> +
>> +	if (info_linear->info.btf_id == 0 ||
>> +	    btf__get_from_id(info_linear->info.btf_id, &btf)) {
>> +		pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd);
>> +		goto out;
>> +	}
>> +
>> +	func_info = u64_to_ptr(info_linear->info.func_info);
>> +	t = btf__type_by_id(btf, func_info[0].type_id);
>> +	if (!t) {
>> +		pr_debug("btf %d doesn't have type %d\n",
>> +			 info_linear->info.btf_id, func_info[0].type_id);
>> +		goto out;
>> +	}
>> +	name = strdup(btf__name_by_offset(btf, t->name_off));
>> +out:
>> +	free(info_linear);
>> +	return name;
>> +}
>> +
>> +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
>> +{
>> +	struct bpf_prog_profiler_bpf *skel;
>> +	struct bpf_counter *counter;
>> +	struct bpf_program *prog;
>> +	char *prog_name;
>> +	int prog_fd;
>> +	int err;
>> +
>> +	prog_fd = bpf_prog_get_fd_by_id(prog_id);
>> +	if (prog_fd < 0) {
>> +		pr_err("Failed to open fd for bpf prog %u\n", prog_id);
>> +		return -1;
>> +	}
>> +	counter = bpf_counter_alloc();
>> +	if (!counter) {
>> +		close(prog_fd);
>> +		return -1;
>> +	}
>> +
>> +	skel = bpf_prog_profiler_bpf__open();
>> +	if (!skel) {
>> +		pr_err("Failed to open bpf skeleton\n");
>> +		goto err_out;
>> +	}
>> +	skel->rodata->num_cpu = evsel__nr_cpus(evsel);
>> +
>> +	bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
>> +	bpf_map__resize(skel->maps.fentry_readings, 1);
>> +	bpf_map__resize(skel->maps.accum_readings, 1);
>> +
>> +	prog_name = bpf_target_prog_name(prog_fd);
>> +	if (!prog_name) {
>> +		pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id);
>> +		goto err_out;
>> +	}
>> +
>> +	bpf_object__for_each_program(prog, skel->obj) {
>> +		err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
>> +		if (err) {
>> +			pr_err("bpf_program__set_attach_target failed.\n"
>> +			       "Does bpf prog %u have BTF?\n", prog_id);
>> +			goto err_out;
>> +		}
>> +	}
>> +	set_max_rlimit();
>> +	err = bpf_prog_profiler_bpf__load(skel);
>> +	if (err) {
>> +		pr_err("bpf_prog_profiler_bpf__load failed\n");
>> +		goto err_out;
>> +	}
>> +
>> +	counter->skel = skel;
>> +	list_add(&counter->list, &evsel->bpf_counter_list);
>> +	close(prog_fd);
>> +	return 0;
>> +err_out:
>> +	free(counter);
>> +	close(prog_fd);
>> +	return -1;
>> +}
>> +
>> +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target)
>> +{
>> +	char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p;
>> +	u32 prog_id;
>> +	int ret;
>> +
>> +	bpf_str_ = bpf_str = strdup(target->bpf_str);
>> +	if (!bpf_str)
>> +		return -1;
>> +
>> +	while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) {
>> +		prog_id = strtoul(tok, &p, 10);
>> +		if (prog_id == 0 || prog_id == UINT_MAX ||
>> +		    (*p != '\0' && *p != ',')) {
>> +			pr_err("Failed to parse bpf prog ids %s\n",
>> +			       target->bpf_str);
>> +			return -1;
>> +		}
>> +
>> +		ret = bpf_program_profiler_load_one(evsel, prog_id);
>> +		if (ret) {
>> +			bpf_program_profiler__destroy(evsel);
>> +			free(bpf_str_);
>> +			return -1;
>> +		}
>> +		bpf_str = NULL;
>> +	}
>> +	free(bpf_str_);
>> +	return 0;
>> +}
>> +
>> +static int bpf_program_profiler__enable(struct evsel *evsel)
>> +{
>> +	struct bpf_counter *counter;
>> +	int ret;
>> +
>> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
>> +		ret = bpf_prog_profiler_bpf__attach(counter->skel);
>> +		if (ret) {
>> +			bpf_program_profiler__destroy(evsel);
>> +			return ret;
>> +		}
>> +	}
>> +	return 0;
>> +}
>> +
>> +static int bpf_program_profiler__read(struct evsel *evsel)
>> +{
>> +	int num_cpu = evsel__nr_cpus(evsel);
>> +	struct bpf_perf_event_value values[num_cpu];
>> +	struct bpf_counter *counter;
>> +	int reading_map_fd;
>> +	__u32 key = 0;
>> +	int err, cpu;
>> +
>> +	if (list_empty(&evsel->bpf_counter_list))
>> +		return -EAGAIN;
>> +
>> +	for (cpu = 0; cpu < num_cpu; cpu++) {
>> +		perf_counts(evsel->counts, cpu, 0)->val = 0;
>> +		perf_counts(evsel->counts, cpu, 0)->ena = 0;
>> +		perf_counts(evsel->counts, cpu, 0)->run = 0;
>> +	}
>> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
>> +		struct bpf_prog_profiler_bpf *skel = counter->skel;
>> +
>> +		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>> +
>> +		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
>> +		if (err) {
>> +			fprintf(stderr, "failed to read value\n");
>> +			return err;
>> +		}
>> +
>> +		for (cpu = 0; cpu < num_cpu; cpu++) {
>> +			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
>> +			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
>> +			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
>> +		}
>> +	}
>> +	return 0;
>> +}
>> +
>> +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
>> +					    int fd)
>> +{
>> +	struct bpf_prog_profiler_bpf *skel;
>> +	struct bpf_counter *counter;
>> +	int ret;
>> +
>> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
>> +		skel = counter->skel;
>> +		ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
>> +					  &cpu, &fd, BPF_ANY);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +	return 0;
>> +}
>> +
>> +struct bpf_counter_ops bpf_program_profiler_ops = {
>> +	.load       = bpf_program_profiler__load,
>> +	.enable	    = bpf_program_profiler__enable,
>> +	.read       = bpf_program_profiler__read,
>> +	.destroy    = bpf_program_profiler__destroy,
>> +	.install_pe = bpf_program_profiler__install_pe,
>> +};
>> +
>> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
>> +{
>> +	if (list_empty(&evsel->bpf_counter_list))
>> +		return 0;
>> +	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
>> +}
>> +
>> +int bpf_counter__load(struct evsel *evsel, struct target *target)
>> +{
>> +	if (target__has_bpf(target))
>> +		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
>> +
>> +	if (evsel->bpf_counter_ops)
>> +		return evsel->bpf_counter_ops->load(evsel, target);
>> +	return 0;
>> +}
>> +
>> +int bpf_counter__enable(struct evsel *evsel)
>> +{
>> +	if (list_empty(&evsel->bpf_counter_list))
>> +		return 0;
>> +	return evsel->bpf_counter_ops->enable(evsel);
>> +}
>> +
>> +int bpf_counter__read(struct evsel *evsel)
>> +{
>> +	if (list_empty(&evsel->bpf_counter_list))
>> +		return -EAGAIN;
>> +	return evsel->bpf_counter_ops->read(evsel);
>> +}
>> +
>> +void bpf_counter__destroy(struct evsel *evsel)
>> +{
>> +	if (list_empty(&evsel->bpf_counter_list))
>> +		return;
>> +	evsel->bpf_counter_ops->destroy(evsel);
>> +	evsel->bpf_counter_ops = NULL;
>> +}
>> diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h
>> new file mode 100644
>> index 0000000000000..2eca210e5dc16
>> --- /dev/null
>> +++ b/tools/perf/util/bpf_counter.h
>> @@ -0,0 +1,72 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef __PERF_BPF_COUNTER_H
>> +#define __PERF_BPF_COUNTER_H 1
>> +
>> +#include <linux/list.h>
>> +
>> +struct evsel;
>> +struct target;
>> +struct bpf_counter;
>> +
>> +typedef int (*bpf_counter_evsel_op)(struct evsel *evsel);
>> +typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel,
>> +					   struct target *target);
>> +typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel,
>> +					       int cpu,
>> +					       int fd);
>> +
>> +struct bpf_counter_ops {
>> +	bpf_counter_evsel_target_op load;
>> +	bpf_counter_evsel_op enable;
>> +	bpf_counter_evsel_op read;
>> +	bpf_counter_evsel_op destroy;
>> +	bpf_counter_evsel_install_pe_op install_pe;
>> +};
>> +
>> +struct bpf_counter {
>> +	void *skel;
>> +	struct list_head list;
>> +};
>> +
>> +#ifdef HAVE_BPF_SKEL
>> +
>> +int bpf_counter__load(struct evsel *evsel, struct target *target);
>> +int bpf_counter__enable(struct evsel *evsel);
>> +int bpf_counter__read(struct evsel *evsel);
>> +void bpf_counter__destroy(struct evsel *evsel);
>> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
>> +
>> +#else /* HAVE_BPF_SKEL */
>> +
>> +#include<linux/err.h>
>> +
>> +static inline int bpf_counter__load(struct evsel *evsel __maybe_unused,
>> +				    struct target *target __maybe_unused)
>> +{
>> +	return 0;
>> +}
>> +
>> +static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused)
>> +{
>> +	return 0;
>> +}
>> +
>> +static inline int bpf_counter__read(struct evsel *evsel __maybe_unused)
>> +{
>> +	return -EAGAIN;
>> +}
>> +
>> +static inline void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
>> +{
>> +}
>> +
>> +static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused,
>> +					  int cpu __maybe_unused,
>> +					  int fd __maybe_unused)
>> +{
>> +	return 0;
>> +}
>> +
>> +#endif /* HAVE_BPF_SKEL */
>> +
>> +#endif /* __PERF_BPF_COUNTER_H */
>> diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
>> new file mode 100644
>> index 0000000000000..c7cec92d02360
>> --- /dev/null
>> +++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
>> @@ -0,0 +1,93 @@
>> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +// Copyright (c) 2020 Facebook
>> +#include <linux/bpf.h>
>> +#include <bpf/bpf_helpers.h>
>> +#include <bpf/bpf_tracing.h>
>> +
>> +/* map of perf event fds, num_cpu * num_metric entries */
>> +struct {
>> +	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
>> +	__uint(key_size, sizeof(__u32));
>> +	__uint(value_size, sizeof(int));
>> +} events SEC(".maps");
>> +
>> +/* readings at fentry */
>> +struct {
>> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
>> +	__uint(key_size, sizeof(__u32));
>> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
>> +	__uint(max_entries, 1);
>> +} fentry_readings SEC(".maps");
>> +
>> +/* accumulated readings */
>> +struct {
>> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
>> +	__uint(key_size, sizeof(__u32));
>> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
>> +	__uint(max_entries, 1);
>> +} accum_readings SEC(".maps");
>> +
>> +const volatile __u32 num_cpu = 1;
>> +
>> +SEC("fentry/XXX")
>> +int BPF_PROG(fentry_XXX)
>> +{
>> +	__u32 key = bpf_get_smp_processor_id();
>> +	struct bpf_perf_event_value *ptr;
>> +	__u32 zero = 0;
>> +	long err;
>> +
>> +	/* look up before reading, to reduce error */
>> +	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
>> +	if (!ptr)
>> +		return 0;
>> +
>> +	err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr));
>> +	if (err)
>> +		return 0;
>> +
>> +	return 0;
>> +}
>> +
>> +static inline void
>> +fexit_update_maps(struct bpf_perf_event_value *after)
>> +{
>> +	struct bpf_perf_event_value *before, diff, *accum;
>> +	__u32 zero = 0;
>> +
>> +	before = bpf_map_lookup_elem(&fentry_readings, &zero);
>> +	/* only account samples with a valid fentry_reading */
>> +	if (before && before->counter) {
>> +		struct bpf_perf_event_value *accum;
>> +
>> +		diff.counter = after->counter - before->counter;
>> +		diff.enabled = after->enabled - before->enabled;
>> +		diff.running = after->running - before->running;
>> +
>> +		accum = bpf_map_lookup_elem(&accum_readings, &zero);
>> +		if (accum) {
>> +			accum->counter += diff.counter;
>> +			accum->enabled += diff.enabled;
>> +			accum->running += diff.running;
>> +		}
>> +	}
>> +}
>> +
>> +SEC("fexit/XXX")
>> +int BPF_PROG(fexit_XXX)
>> +{
>> +	struct bpf_perf_event_value reading;
>> +	__u32 cpu = bpf_get_smp_processor_id();
>> +	__u32 one = 1, zero = 0;
>> +	int err;
>> +
>> +	/* read all events before updating the maps, to reduce error */
>> +	err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading));
>> +	if (err)
>> +		return 0;
>> +
>> +	fexit_update_maps(&reading);
>> +	return 0;
>> +}
>> +
>> +char LICENSE[] SEC("license") = "Dual BSD/GPL";
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index c26ea82220bd8..7265308765d73 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -25,6 +25,7 @@
>> #include <stdlib.h>
>> #include <perf/evsel.h>
>> #include "asm/bug.h"
>> +#include "bpf_counter.h"
>> #include "callchain.h"
>> #include "cgroup.h"
>> #include "counts.h"
>> @@ -51,6 +52,10 @@
>> #include <internal/lib.h>
>> 
>> #include <linux/ctype.h>
>> +#include <bpf/bpf.h>
>> +#include <bpf/libbpf.h>
>> +#include <bpf/btf.h>
>> +#include "rlimit.h"
>> 
>> struct perf_missing_features perf_missing_features;
>> 
>> @@ -247,6 +252,7 @@ void evsel__init(struct evsel *evsel,
>> 	evsel->bpf_obj	   = NULL;
>> 	evsel->bpf_fd	   = -1;
>> 	INIT_LIST_HEAD(&evsel->config_terms);
>> +	INIT_LIST_HEAD(&evsel->bpf_counter_list);
>> 	perf_evsel__object.init(evsel);
>> 	evsel->sample_size = __evsel__sample_size(attr->sample_type);
>> 	evsel__calc_id_pos(evsel);
>> @@ -1366,6 +1372,7 @@ void evsel__exit(struct evsel *evsel)
>> {
>> 	assert(list_empty(&evsel->core.node));
>> 	assert(evsel->evlist == NULL);
>> +	bpf_counter__destroy(evsel);
>> 	evsel__free_counts(evsel);
>> 	perf_evsel__free_fd(&evsel->core);
>> 	perf_evsel__free_id(&evsel->core);
>> @@ -1781,6 +1788,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>> 
>> 			FD(evsel, cpu, thread) = fd;
>> 
>> +			bpf_counter__install_pe(evsel, cpu, fd);
>> +
>> 			if (unlikely(test_attr__enabled)) {
>> 				test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
>> 						fd, group_fd, flags);
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index cd1d8dd431997..40e3946cd7518 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -10,6 +10,7 @@
>> #include <internal/evsel.h>
>> #include <perf/evsel.h>
>> #include "symbol_conf.h"
>> +#include "bpf_counter.h"
>> #include <internal/cpumap.h>
>> 
>> struct bpf_object;
>> @@ -17,6 +18,8 @@ struct cgroup;
>> struct perf_counts;
>> struct perf_stat_evsel;
>> union perf_event;
>> +struct bpf_counter_ops;
>> +struct target;
>> 
>> typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
>> 
>> @@ -127,6 +130,8 @@ struct evsel {
>> 	 * See also evsel__has_callchain().
>> 	 */
>> 	__u64			synth_sample_type;
>> +	struct list_head	bpf_counter_list;
>> +	struct bpf_counter_ops	*bpf_counter_ops;
>> };
>> 
>> struct perf_missing_features {
>> @@ -424,4 +429,5 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel)
>> struct perf_env *evsel__env(struct evsel *evsel);
>> 
>> int evsel__store_ids(struct evsel *evsel, struct evlist *evlist);
>> +
>> #endif /* __PERF_EVSEL_H */
>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>> index 583ae4f09c5d1..cce7a76d6473c 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -1045,7 +1045,9 @@ static void print_header(struct perf_stat_config *config,
>> 	if (!config->csv_output) {
>> 		fprintf(output, "\n");
>> 		fprintf(output, " Performance counter stats for ");
>> -		if (_target->system_wide)
>> +		if (_target->bpf_str)
>> +			fprintf(output, "\'BPF program(s) %s", _target->bpf_str);
>> +		else if (_target->system_wide)
>> 			fprintf(output, "\'system wide");
>> 		else if (_target->cpu_list)
>> 			fprintf(output, "\'CPU(s) %s", _target->cpu_list);
>> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
>> index 8ce1479c98f03..0b3957323f668 100644
>> --- a/tools/perf/util/stat.c
>> +++ b/tools/perf/util/stat.c
>> @@ -527,7 +527,7 @@ int create_perf_stat_counter(struct evsel *evsel,
>> 	if (leader->core.nr_members > 1)
>> 		attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
>> 
>> -	attr->inherit = !config->no_inherit;
>> +	attr->inherit = !config->no_inherit && list_empty(&evsel->bpf_counter_list);
>> 
>> 	/*
>> 	 * Some events get initialized with sample_(period/type) set,
>> diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
>> index a3db13dea937c..0f383418e3df5 100644
>> --- a/tools/perf/util/target.c
>> +++ b/tools/perf/util/target.c
>> @@ -56,6 +56,34 @@ enum target_errno target__validate(struct target *target)
>> 			ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
>> 	}
>> 
>> +	/* BPF and CPU are mutually exclusive */
>> +	if (target->bpf_str && target->cpu_list) {
>> +		target->cpu_list = NULL;
>> +		if (ret == TARGET_ERRNO__SUCCESS)
>> +			ret = TARGET_ERRNO__BPF_OVERRIDE_CPU;
>> +	}
>> +
>> +	/* BPF and PID/TID are mutually exclusive */
>> +	if (target->bpf_str && target->tid) {
>> +		target->tid = NULL;
>> +		if (ret == TARGET_ERRNO__SUCCESS)
>> +			ret = TARGET_ERRNO__BPF_OVERRIDE_PID;
>> +	}
>> +
>> +	/* BPF and UID are mutually exclusive */
>> +	if (target->bpf_str && target->uid_str) {
>> +		target->uid_str = NULL;
>> +		if (ret == TARGET_ERRNO__SUCCESS)
>> +			ret = TARGET_ERRNO__BPF_OVERRIDE_UID;
>> +	}
>> +
>> +	/* BPF and THREADS are mutually exclusive */
>> +	if (target->bpf_str && target->per_thread) {
>> +		target->per_thread = false;
>> +		if (ret == TARGET_ERRNO__SUCCESS)
>> +			ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD;
>> +	}
>> +
>> 	/* THREAD and SYSTEM/CPU are mutually exclusive */
>> 	if (target->per_thread && (target->system_wide || target->cpu_list)) {
>> 		target->per_thread = false;
>> @@ -109,6 +137,10 @@ static const char *target__error_str[] = {
>> 	"PID/TID switch overriding SYSTEM",
>> 	"UID switch overriding SYSTEM",
>> 	"SYSTEM/CPU switch overriding PER-THREAD",
>> +	"BPF switch overriding CPU",
>> +	"BPF switch overriding PID/TID",
>> +	"BPF switch overriding UID",
>> +	"BPF switch overriding THREAD",
>> 	"Invalid User: %s",
>> 	"Problems obtaining information for user %s",
>> };
>> @@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum,
>> 
>> 	switch (errnum) {
>> 	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
>> -	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
>> +	     TARGET_ERRNO__BPF_OVERRIDE_THREAD:
>> 		snprintf(buf, buflen, "%s", msg);
>> 		break;
>> 
>> diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
>> index 6ef01a83b24e9..f132c6c2eef81 100644
>> --- a/tools/perf/util/target.h
>> +++ b/tools/perf/util/target.h
>> @@ -10,6 +10,7 @@ struct target {
>> 	const char   *tid;
>> 	const char   *cpu_list;
>> 	const char   *uid_str;
>> +	const char   *bpf_str;
>> 	uid_t	     uid;
>> 	bool	     system_wide;
>> 	bool	     uses_mmap;
>> @@ -36,6 +37,10 @@ enum target_errno {
>> 	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
>> 	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
>> 	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
>> +	TARGET_ERRNO__BPF_OVERRIDE_CPU,
>> +	TARGET_ERRNO__BPF_OVERRIDE_PID,
>> +	TARGET_ERRNO__BPF_OVERRIDE_UID,
>> +	TARGET_ERRNO__BPF_OVERRIDE_THREAD,
>> 
>> 	/* for target__parse_uid() */
>> 	TARGET_ERRNO__INVALID_UID,
>> @@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target)
>> 	return target->system_wide || target->cpu_list;
>> }
>> 
>> +static inline bool target__has_bpf(struct target *target)
>> +{
>> +	return target->bpf_str;
>> +}
>> +
>> static inline bool target__none(struct target *target)
>> {
>> 	return !target__has_task(target) && !target__has_cpu(target);
>> -- 
>> 2.24.1
>> 
> 
> -- 
> 
> - Arnaldo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-28 23:43     ` Song Liu
@ 2020-12-29  5:53       ` Song Liu
  2020-12-29 15:15       ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 25+ messages in thread
From: Song Liu @ 2020-12-29  5:53 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team



> On Dec 28, 2020, at 3:43 PM, Song Liu <songliubraving@fb.com> wrote:
> 
> 
> 
>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>> 
>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
>>> Introduce perf-stat -b option, which counts events for BPF programs, like:
>>> 
>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>>>    1.487903822            115,200      ref-cycles
>>>    1.487903822             86,012      cycles
>>>    2.489147029             80,560      ref-cycles
>>>    2.489147029             73,784      cycles
>>>    3.490341825             60,720      ref-cycles
>>>    3.490341825             37,797      cycles
>>>    4.491540887             37,120      ref-cycles
>>>    4.491540887             31,963      cycles
>>> 
>>> The example above counts cycles and ref-cycles of BPF program of id 254.
>>> This is similar to bpftool-prog-profile command, but more flexible.
>>> 
>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
>>> programs (monitor-progs) to the target BPF program (target-prog). The
>>> monitor-progs read perf_event before and after the target-prog, and
>>> aggregate the difference in a BPF map. Then the user space reads data
>>> from these maps.
>>> 
>>> A new struct bpf_counter is introduced to provide common interface that
>>> uses BPF programs/maps to count perf events.
>> 
>> Segfaulting here:
>> 
>> [root@five ~]# bpftool prog  | grep tracepoint
>> 110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
>> 111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
>> 112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
>> 113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
>> 114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
>> 115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
>> 116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
>> 117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
>> 118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
>> [root@five ~]#
>> [root@five ~]# gdb perf
>> GNU gdb (GDB) Fedora 10.1-2.fc33
>> Reading symbols from perf...
>> (gdb) run stat -e instructions,cycles -b 113 -I 1000
>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>> 
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
>> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>> (gdb) bt
>> #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
>> #1  0x0000000000000000 in ?? ()
>> (gdb)
>> 
>> [acme@five perf]$ clang -v |& head -2
>> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
>> Target: x86_64-unknown-linux-gnu
>> [acme@five perf]$
>> 
>> Do you need any extra info?
> 
> Hmm... I am not able to reproduce this. I am trying to setup an environment similar
> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? 
> 

I tried it on CentOS Stream release 8, with 

  gcc version 8.4.1 20200928 (Red Hat 8.4.1-1) (GCC)
  clang version 11.0.0 (Red Hat 11.0.0-0.2.rc2.module_el8.4.0+533+50191577)

Unfortunately, I still cannot repro it. 

I didn't find the issue while looking through the code. AFAICS, the code
fail over when the skeleton is not ready, so bpf_program_profiler__read()
should find a valid skeleton. 

Could you please help run the test with the following patch? The patch is 
also available at 

    https://git.kernel.org/pub/scm/linux/kernel/git/song/linux.git perf-dash-b

Thanks,
Song


diff --git i/tools/perf/util/bpf_counter.c w/tools/perf/util/bpf_counter.c
index f2cb86a40c882..e09c571365b56 100644
--- i/tools/perf/util/bpf_counter.c
+++ w/tools/perf/util/bpf_counter.c
@@ -46,8 +46,10 @@ static int bpf_program_profiler__destroy(struct evsel *evsel)
 {
        struct bpf_counter *counter;

-       list_for_each_entry(counter, &evsel->bpf_counter_list, list)
+       list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+               pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
                bpf_prog_profiler_bpf__destroy(counter->skel);
+       }
        INIT_LIST_HEAD(&evsel->bpf_counter_list);
        return 0;
 }
@@ -141,8 +143,14 @@ static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
        counter->skel = skel;
        list_add(&counter->list, &evsel->bpf_counter_list);
        close(prog_fd);
+       pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
+       pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel);
+       pr_debug("%s return 0\n", __func__);
        return 0;
 err_out:
+       pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
+       pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel);
+       pr_debug("%s return -1\n", __func__);
        free(counter);
        close(prog_fd);
        return -1;
@@ -214,11 +222,22 @@ static int bpf_program_profiler__read(struct evsel *evsel)
        list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
                struct bpf_prog_profiler_bpf *skel = counter->skel;

+               pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
+               pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel);
+               if (!skel) {
+                       pr_err("%s !skel\n", __func__);
+                       continue;
+               }
+               if (!skel->maps.accum_readings) {
+                       pr_err("%s !skel->maps.accum_readings", __func__);
+                       continue;
+               }
+
                reading_map_fd = bpf_map__fd(skel->maps.accum_readings);

                err = bpf_map_lookup_elem(reading_map_fd, &key, values);
                if (err) {
-                       fprintf(stderr, "failed to read value\n");
+                       pr_err("failed to read value\n");
                        return err;
                }

@@ -240,6 +259,8 @@ static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,

        list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
                skel = counter->skel;
+               pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
+               pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel);
                ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
                                          &cpu, &fd, BPF_ANY);
                if (ret)

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 2/4] perf: support build BPF skeletons with perf
  2020-12-28 17:40 ` [PATCH v6 2/4] perf: support build BPF skeletons with perf Song Liu
@ 2020-12-29  7:01   ` Namhyung Kim
  2020-12-29 11:48     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 25+ messages in thread
From: Namhyung Kim @ 2020-12-29  7:01 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa,
	kernel-team

Hello,

On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
>
> BPF programs are useful in perf to profile BPF programs. BPF skeleton is

I'm having difficulties understanding the first sentence - looks like a
recursion. :)  So do you want to use two (or more) BPF programs?

Thanks,
Namhyung


> by far the easiest way to write BPF tools. Enable building BPF skeletons
> in util/bpf_skel. A dummy bpf skeleton is added. More bpf skeletons will
> be added for different use cases.
>
> Acked-by: Jiri Olsa <jolsa@redhat.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  tools/build/Makefile.feature        |  4 ++-
>  tools/perf/Makefile.config          |  9 ++++++
>  tools/perf/Makefile.perf            | 49 +++++++++++++++++++++++++++--
>  tools/perf/util/bpf_skel/.gitignore |  3 ++
>  tools/scripts/Makefile.include      |  1 +
>  5 files changed, 63 insertions(+), 3 deletions(-)
>  create mode 100644 tools/perf/util/bpf_skel/.gitignore
>
> diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
> index 97cbfb31b7625..74e255d58d8d0 100644
> --- a/tools/build/Makefile.feature
> +++ b/tools/build/Makefile.feature
> @@ -99,7 +99,9 @@ FEATURE_TESTS_EXTRA :=                  \
>           clang                          \
>           libbpf                         \
>           libpfm4                        \
> -         libdebuginfod
> +         libdebuginfod                 \
> +         clang-bpf-co-re
> +
>
>  FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)
>
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index ce8516e4de34f..d8e59d31399a5 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -621,6 +621,15 @@ ifndef NO_LIBBPF
>    endif
>  endif
>
> +ifdef BUILD_BPF_SKEL
> +  $(call feature_check,clang-bpf-co-re)
> +  ifeq ($(feature-clang-bpf-co-re), 0)
> +    dummy := $(error Error: clang too old. Please install recent clang)
> +  endif
> +  $(call detected,CONFIG_PERF_BPF_SKEL)
> +  CFLAGS += -DHAVE_BPF_SKEL
> +endif
> +
>  dwarf-post-unwind := 1
>  dwarf-post-unwind-text := BUG
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 62f3deb1d3a8b..d182a2dbb9bbd 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -126,6 +126,8 @@ include ../scripts/utilities.mak
>  #
>  # Define NO_LIBDEBUGINFOD if you do not want support debuginfod
>  #
> +# Define BUILD_BPF_SKEL to enable BPF skeletons
> +#
>
>  # As per kernel Makefile, avoid funny character set dependencies
>  unexport LC_ALL
> @@ -175,6 +177,12 @@ endef
>
>  LD += $(EXTRA_LDFLAGS)
>
> +HOSTCC  ?= gcc
> +HOSTLD  ?= ld
> +HOSTAR  ?= ar
> +CLANG   ?= clang
> +LLVM_STRIP ?= llvm-strip
> +
>  PKG_CONFIG = $(CROSS_COMPILE)pkg-config
>  LLVM_CONFIG ?= llvm-config
>
> @@ -731,7 +739,8 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc
>         $(x86_arch_prctl_code_array) \
>         $(rename_flags_array) \
>         $(arch_errno_name_array) \
> -       $(sync_file_range_arrays)
> +       $(sync_file_range_arrays) \
> +       bpf-skel
>
>  $(OUTPUT)%.o: %.c prepare FORCE
>         $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
> @@ -1004,7 +1013,43 @@ config-clean:
>  python-clean:
>         $(python-clean)
>
> -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean
> +SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
> +SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
> +SKELETONS :=
> +
> +ifdef BUILD_BPF_SKEL
> +BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
> +LIBBPF_SRC := $(abspath ../lib/bpf)
> +BPF_INCLUDE := -I$(SKEL_TMP_OUT)/.. -I$(BPF_PATH) -I$(LIBBPF_SRC)/..
> +
> +$(SKEL_TMP_OUT):
> +       $(Q)$(MKDIR) -p $@
> +
> +$(BPFTOOL): | $(SKEL_TMP_OUT)
> +       CFLAGS= $(MAKE) -C ../bpf/bpftool \
> +               OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
> +
> +$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
> +       $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) \
> +         -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@
> +
> +$(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL)
> +       $(QUIET_GENSKEL)$(BPFTOOL) gen skeleton $< > $@
> +
> +bpf-skel: $(SKELETONS)
> +
> +.PRECIOUS: $(SKEL_TMP_OUT)/%.bpf.o
> +
> +else # BUILD_BPF_SKEL
> +
> +bpf-skel:
> +
> +endif # BUILD_BPF_SKEL
> +
> +bpf-skel-clean:
> +       $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
> +
> +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean bpf-skel-clean
>         $(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
>         $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
>         $(Q)$(RM) $(OUTPUT).config-detected
> diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore
> new file mode 100644
> index 0000000000000..5263e9e6c5d83
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/.gitignore
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +.tmp
> +*.skel.h
> \ No newline at end of file
> diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
> index 1358e89cdf7d6..62119ce69ad9a 100644
> --- a/tools/scripts/Makefile.include
> +++ b/tools/scripts/Makefile.include
> @@ -127,6 +127,7 @@ ifneq ($(silent),1)
>                          $(MAKE) $(PRINT_DIR) -C $$subdir
>         QUIET_FLEX     = @echo '  FLEX     '$@;
>         QUIET_BISON    = @echo '  BISON    '$@;
> +       QUIET_GENSKEL  = @echo '  GEN-SKEL '$@;
>
>         descend = \
>                 +@echo         '  DESCEND  '$(1); \
> --
> 2.24.1
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu
  2020-12-28 20:11   ` Arnaldo Carvalho de Melo
@ 2020-12-29  7:22   ` Namhyung Kim
  2020-12-29 17:46     ` Song Liu
  1 sibling, 1 reply; 25+ messages in thread
From: Namhyung Kim @ 2020-12-29  7:22 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa,
	kernel-team

On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
>
> Introduce perf-stat -b option, which counts events for BPF programs, like:
>
> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>      1.487903822            115,200      ref-cycles
>      1.487903822             86,012      cycles
>      2.489147029             80,560      ref-cycles
>      2.489147029             73,784      cycles
>      3.490341825             60,720      ref-cycles
>      3.490341825             37,797      cycles
>      4.491540887             37,120      ref-cycles
>      4.491540887             31,963      cycles
>
> The example above counts cycles and ref-cycles of BPF program of id 254.
> This is similar to bpftool-prog-profile command, but more flexible.
>
> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
> programs (monitor-progs) to the target BPF program (target-prog). The
> monitor-progs read perf_event before and after the target-prog, and
> aggregate the difference in a BPF map. Then the user space reads data
> from these maps.
>
> A new struct bpf_counter is introduced to provide common interface that
> uses BPF programs/maps to count perf events.
>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  tools/perf/Makefile.perf                      |   2 +-
>  tools/perf/builtin-stat.c                     |  77 ++++-
>  tools/perf/util/Build                         |   1 +
>  tools/perf/util/bpf_counter.c                 | 296 ++++++++++++++++++
>  tools/perf/util/bpf_counter.h                 |  72 +++++
>  .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  93 ++++++
>  tools/perf/util/evsel.c                       |   9 +
>  tools/perf/util/evsel.h                       |   6 +
>  tools/perf/util/stat-display.c                |   4 +-
>  tools/perf/util/stat.c                        |   2 +-
>  tools/perf/util/target.c                      |  34 +-
>  tools/perf/util/target.h                      |  10 +
>  12 files changed, 588 insertions(+), 18 deletions(-)
>  create mode 100644 tools/perf/util/bpf_counter.c
>  create mode 100644 tools/perf/util/bpf_counter.h
>  create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index d182a2dbb9bbd..8c4e039c3b813 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -1015,7 +1015,7 @@ python-clean:
>
>  SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
>  SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
> -SKELETONS :=
> +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
>
>  ifdef BUILD_BPF_SKEL
>  BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 8cc24967bc273..09bffb3fbcdd4 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -67,6 +67,7 @@
>  #include "util/top.h"
>  #include "util/affinity.h"
>  #include "util/pfm.h"
> +#include "util/bpf_counter.h"
>  #include "asm/bug.h"
>
>  #include <linux/time64.h>
> @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs)
>         return 0;
>  }
>
> +static int read_bpf_map_counters(void)
> +{
> +       struct evsel *counter;
> +       int err;
> +
> +       evlist__for_each_entry(evsel_list, counter) {
> +               err = bpf_counter__read(counter);
> +               if (err)
> +                       return err;
> +       }
> +       return 0;
> +}
> +
>  static void read_counters(struct timespec *rs)
>  {
>         struct evsel *counter;
> +       int err;
>
> -       if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0))
> -               return;
> +       if (!stat_config.stop_read_counter) {
> +               err = read_bpf_map_counters();
> +               if (err == -EAGAIN)
> +                       err = read_affinity_counters(rs);

Instead of checking the error code, can we do something like

  if (target__has_bpf(target))
      read_bpf_map_counters();

?

> +               if (err < 0)
> +                       return;
> +       }
>
>         evlist__for_each_entry(evsel_list, counter) {
>                 if (counter->err)
> @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times)
>         return false;
>  }
>
> -static void enable_counters(void)
> +static int enable_counters(void)
>  {
> +       struct evsel *evsel;
> +       int err;
> +
> +       evlist__for_each_entry(evsel_list, evsel) {
> +               err = bpf_counter__enable(evsel);
> +               if (err)
> +                       return err;

Ditto.


> +       }
> +
>         if (stat_config.initial_delay < 0) {
>                 pr_info(EVLIST_DISABLED_MSG);
> -               return;
> +               return 0;
>         }
>
>         if (stat_config.initial_delay > 0) {
> @@ -518,6 +547,7 @@ static void enable_counters(void)
>                 if (stat_config.initial_delay > 0)
>                         pr_info(EVLIST_ENABLED_MSG);
>         }
> +       return 0;
>  }
>
>  static void disable_counters(void)
> @@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>         const bool forks = (argc > 0);
>         bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
>         struct affinity affinity;
> -       int i, cpu;
> +       int i, cpu, err;
>         bool second_pass = false;
>
>         if (forks) {
> @@ -737,6 +767,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>         if (affinity__setup(&affinity) < 0)
>                 return -1;
>
> +       evlist__for_each_entry(evsel_list, counter) {
> +               if (bpf_counter__load(counter, &target))
> +                       return -1;
> +       }
> +

Ditto.


>         evlist__for_each_cpu (evsel_list, i, cpu) {
>                 affinity__set(&affinity, cpu);
>
> @@ -850,7 +885,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>         }
>
>         if (STAT_RECORD) {
> -               int err, fd = perf_data__fd(&perf_stat.data);
> +               int fd = perf_data__fd(&perf_stat.data);
>
>                 if (is_pipe) {
>                         err = perf_header__write_pipe(perf_data__fd(&perf_stat.data));

[SNIP]
>  perf-$(CONFIG_LIBELF) += probe-file.o
> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> new file mode 100644
> index 0000000000000..f2cb86a40c882
> --- /dev/null
> +++ b/tools/perf/util/bpf_counter.c
> @@ -0,0 +1,296 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/* Copyright (c) 2019 Facebook */
> +
> +#include <limits.h>
> +#include <unistd.h>
> +#include <sys/time.h>
> +#include <sys/resource.h>
> +#include <linux/err.h>
> +#include <linux/zalloc.h>
> +#include <bpf/bpf.h>
> +#include <bpf/btf.h>
> +#include <bpf/libbpf.h>
> +
> +#include "bpf_counter.h"
> +#include "counts.h"
> +#include "debug.h"
> +#include "evsel.h"
> +#include "target.h"
> +
> +#include "bpf_skel/bpf_prog_profiler.skel.h"
> +
> +static inline void *u64_to_ptr(__u64 ptr)
> +{
> +       return (void *)(unsigned long)ptr;
> +}
> +
> +static void set_max_rlimit(void)
> +{
> +       struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
> +
> +       setrlimit(RLIMIT_MEMLOCK, &rinf);
> +}

This looks scary..


> +
> +static struct bpf_counter *bpf_counter_alloc(void)
> +{
> +       struct bpf_counter *counter;
> +
> +       counter = zalloc(sizeof(*counter));
> +       if (counter)
> +               INIT_LIST_HEAD(&counter->list);
> +       return counter;
> +}
> +
> +static int bpf_program_profiler__destroy(struct evsel *evsel)
> +{
> +       struct bpf_counter *counter;
> +
> +       list_for_each_entry(counter, &evsel->bpf_counter_list, list)
> +               bpf_prog_profiler_bpf__destroy(counter->skel);
> +       INIT_LIST_HEAD(&evsel->bpf_counter_list);
> +       return 0;
> +}
> +
> +static char *bpf_target_prog_name(int tgt_fd)
> +{
> +       struct bpf_prog_info_linear *info_linear;
> +       struct bpf_func_info *func_info;
> +       const struct btf_type *t;
> +       char *name = NULL;
> +       struct btf *btf;
> +
> +       info_linear = bpf_program__get_prog_info_linear(
> +               tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO);
> +       if (IS_ERR_OR_NULL(info_linear)) {
> +               pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd);
> +               return NULL;
> +       }
> +
> +       if (info_linear->info.btf_id == 0 ||
> +           btf__get_from_id(info_linear->info.btf_id, &btf)) {
> +               pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd);
> +               goto out;
> +       }
> +
> +       func_info = u64_to_ptr(info_linear->info.func_info);
> +       t = btf__type_by_id(btf, func_info[0].type_id);
> +       if (!t) {
> +               pr_debug("btf %d doesn't have type %d\n",
> +                        info_linear->info.btf_id, func_info[0].type_id);
> +               goto out;
> +       }
> +       name = strdup(btf__name_by_offset(btf, t->name_off));
> +out:
> +       free(info_linear);
> +       return name;
> +}
> +
> +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
> +{
> +       struct bpf_prog_profiler_bpf *skel;
> +       struct bpf_counter *counter;
> +       struct bpf_program *prog;
> +       char *prog_name;
> +       int prog_fd;
> +       int err;
> +
> +       prog_fd = bpf_prog_get_fd_by_id(prog_id);
> +       if (prog_fd < 0) {
> +               pr_err("Failed to open fd for bpf prog %u\n", prog_id);
> +               return -1;
> +       }
> +       counter = bpf_counter_alloc();
> +       if (!counter) {
> +               close(prog_fd);
> +               return -1;
> +       }
> +
> +       skel = bpf_prog_profiler_bpf__open();
> +       if (!skel) {
> +               pr_err("Failed to open bpf skeleton\n");
> +               goto err_out;
> +       }
> +       skel->rodata->num_cpu = evsel__nr_cpus(evsel);
> +
> +       bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
> +       bpf_map__resize(skel->maps.fentry_readings, 1);
> +       bpf_map__resize(skel->maps.accum_readings, 1);
> +
> +       prog_name = bpf_target_prog_name(prog_fd);
> +       if (!prog_name) {
> +               pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id);
> +               goto err_out;
> +       }
> +
> +       bpf_object__for_each_program(prog, skel->obj) {
> +               err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
> +               if (err) {
> +                       pr_err("bpf_program__set_attach_target failed.\n"
> +                              "Does bpf prog %u have BTF?\n", prog_id);
> +                       goto err_out;
> +               }
> +       }
> +       set_max_rlimit();
> +       err = bpf_prog_profiler_bpf__load(skel);
> +       if (err) {
> +               pr_err("bpf_prog_profiler_bpf__load failed\n");
> +               goto err_out;
> +       }
> +
> +       counter->skel = skel;
> +       list_add(&counter->list, &evsel->bpf_counter_list);
> +       close(prog_fd);
> +       return 0;
> +err_out:
> +       free(counter);
> +       close(prog_fd);

I don't know how the 'skel' part is managed, is it safe to leave?


> +       return -1;
> +}
> +
> +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target)
> +{
> +       char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p;
> +       u32 prog_id;
> +       int ret;
> +
> +       bpf_str_ = bpf_str = strdup(target->bpf_str);
> +       if (!bpf_str)
> +               return -1;
> +
> +       while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) {
> +               prog_id = strtoul(tok, &p, 10);
> +               if (prog_id == 0 || prog_id == UINT_MAX ||
> +                   (*p != '\0' && *p != ',')) {
> +                       pr_err("Failed to parse bpf prog ids %s\n",
> +                              target->bpf_str);
> +                       return -1;
> +               }
> +
> +               ret = bpf_program_profiler_load_one(evsel, prog_id);
> +               if (ret) {
> +                       bpf_program_profiler__destroy(evsel);
> +                       free(bpf_str_);
> +                       return -1;
> +               }
> +               bpf_str = NULL;
> +       }
> +       free(bpf_str_);
> +       return 0;
> +}
> +
> +static int bpf_program_profiler__enable(struct evsel *evsel)
> +{
> +       struct bpf_counter *counter;
> +       int ret;
> +
> +       list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +               ret = bpf_prog_profiler_bpf__attach(counter->skel);
> +               if (ret) {
> +                       bpf_program_profiler__destroy(evsel);
> +                       return ret;
> +               }
> +       }
> +       return 0;
> +}
> +
> +static int bpf_program_profiler__read(struct evsel *evsel)
> +{
> +       int num_cpu = evsel__nr_cpus(evsel);
> +       struct bpf_perf_event_value values[num_cpu];
> +       struct bpf_counter *counter;
> +       int reading_map_fd;
> +       __u32 key = 0;
> +       int err, cpu;
> +
> +       if (list_empty(&evsel->bpf_counter_list))
> +               return -EAGAIN;
> +
> +       for (cpu = 0; cpu < num_cpu; cpu++) {
> +               perf_counts(evsel->counts, cpu, 0)->val = 0;
> +               perf_counts(evsel->counts, cpu, 0)->ena = 0;
> +               perf_counts(evsel->counts, cpu, 0)->run = 0;
> +       }

Hmm.. not sure it's correct to reset counters here.


> +       list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +               struct bpf_prog_profiler_bpf *skel = counter->skel;
> +
> +               reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> +
> +               err = bpf_map_lookup_elem(reading_map_fd, &key, values);
> +               if (err) {
> +                       fprintf(stderr, "failed to read value\n");
> +                       return err;
> +               }
> +
> +               for (cpu = 0; cpu < num_cpu; cpu++) {
> +                       perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
> +                       perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
> +                       perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
> +               }
> +       }

So this just aggregates all the counters in BPF programs, right?


> +       return 0;
> +}
> +
> +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
> +                                           int fd)
> +{
> +       struct bpf_prog_profiler_bpf *skel;
> +       struct bpf_counter *counter;
> +       int ret;
> +
> +       list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +               skel = counter->skel;
> +               ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
> +                                         &cpu, &fd, BPF_ANY);
> +               if (ret)
> +                       return ret;
> +       }
> +       return 0;
> +}
> +
> +struct bpf_counter_ops bpf_program_profiler_ops = {
> +       .load       = bpf_program_profiler__load,
> +       .enable     = bpf_program_profiler__enable,
> +       .read       = bpf_program_profiler__read,
> +       .destroy    = bpf_program_profiler__destroy,
> +       .install_pe = bpf_program_profiler__install_pe,

What is 'pe'?

Btw, do you think other kinds of bpf programs are added later?
It seems 'perf stat -b' is somewhat coupled with this profiler ops.
Will it be possible to run other ops in a same evsel?


> +};
> +
> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
> +{
> +       if (list_empty(&evsel->bpf_counter_list))
> +               return 0;
> +       return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
> +}
> +
> +int bpf_counter__load(struct evsel *evsel, struct target *target)
> +{
> +       if (target__has_bpf(target))
> +               evsel->bpf_counter_ops = &bpf_program_profiler_ops;
> +
> +       if (evsel->bpf_counter_ops)
> +               return evsel->bpf_counter_ops->load(evsel, target);
> +       return 0;
> +}
> +
> +int bpf_counter__enable(struct evsel *evsel)
> +{
> +       if (list_empty(&evsel->bpf_counter_list))
> +               return 0;
> +       return evsel->bpf_counter_ops->enable(evsel);
> +}
> +
> +int bpf_counter__read(struct evsel *evsel)
> +{
> +       if (list_empty(&evsel->bpf_counter_list))
> +               return -EAGAIN;
> +       return evsel->bpf_counter_ops->read(evsel);
> +}
> +
> +void bpf_counter__destroy(struct evsel *evsel)
> +{
> +       if (list_empty(&evsel->bpf_counter_list))
> +               return;
> +       evsel->bpf_counter_ops->destroy(evsel);
> +       evsel->bpf_counter_ops = NULL;
> +}

[SNIP]
> diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
> index 6ef01a83b24e9..f132c6c2eef81 100644
> --- a/tools/perf/util/target.h
> +++ b/tools/perf/util/target.h
> @@ -10,6 +10,7 @@ struct target {
>         const char   *tid;
>         const char   *cpu_list;
>         const char   *uid_str;
> +       const char   *bpf_str;
>         uid_t        uid;
>         bool         system_wide;
>         bool         uses_mmap;
> @@ -36,6 +37,10 @@ enum target_errno {
>         TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
>         TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
>         TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
> +       TARGET_ERRNO__BPF_OVERRIDE_CPU,
> +       TARGET_ERRNO__BPF_OVERRIDE_PID,
> +       TARGET_ERRNO__BPF_OVERRIDE_UID,
> +       TARGET_ERRNO__BPF_OVERRIDE_THREAD,
>
>         /* for target__parse_uid() */
>         TARGET_ERRNO__INVALID_UID,
> @@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target)
>         return target->system_wide || target->cpu_list;
>  }
>
> +static inline bool target__has_bpf(struct target *target)
> +{
> +       return target->bpf_str;
> +}
> +
>  static inline bool target__none(struct target *target)
>  {
>         return !target__has_task(target) && !target__has_cpu(target);

Shouldn't it have && !target__has_bpf() too?

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 4/4] perf-stat: add documentation for -b option
  2020-12-28 17:40 ` [PATCH v6 4/4] perf-stat: add documentation for -b option Song Liu
@ 2020-12-29  7:24   ` Namhyung Kim
  2020-12-29 16:59     ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Namhyung Kim @ 2020-12-29  7:24 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa,
	kernel-team

On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
>
> Add documentation to perf-stat -b option, which stats event for BPF
> programs.
>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  tools/perf/Documentation/perf-stat.txt | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 5d4a673d7621a..15b9a646e853d 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -75,6 +75,20 @@ report::
>  --tid=<tid>::
>          stat events on existing thread id (comma separated list)
>
> +-b::
> +--bpf-prog::
> +        stat events on existing bpf program id (comma separated list),
> +        requiring root righs. For example:

Typo: rights

It'd be nice if it can show how we can get the id.

Thanks,
Namhyung


> +
> +  # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000
> +
> +   Performance counter stats for 'BPF program(s) 17247':
> +
> +             85,967      cycles
> +             28,982      instructions              #    0.34  insn per cycle
> +
> +        1.102235068 seconds time elapsed
> +
>  ifdef::HAVE_LIBPFM[]
>  --pfm-events events::
>  Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
> --
> 2.24.1
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 2/4] perf: support build BPF skeletons with perf
  2020-12-29  7:01   ` Namhyung Kim
@ 2020-12-29 11:48     ` Arnaldo Carvalho de Melo
  2020-12-29 17:14       ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-12-29 11:48 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Song Liu, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Mark Rutland, Jiri Olsa, kernel-team

Em Tue, Dec 29, 2020 at 04:01:41PM +0900, Namhyung Kim escreveu:
> On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
> > BPF programs are useful in perf to profile BPF programs. BPF skeleton is
 
> I'm having difficulties understanding the first sentence - looks like a
> recursion. :)  So do you want to use two (or more) BPF programs?

Yeah, we use perf to perf perf, so we need to use bpf with perf to perf
bpf :-)

Look at tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c, the BPF
skeleton used to create the in-kernel scaffold to profile BPF programs.

It uses two BPF programs (fentry/XXX and fexit/XXX) and some a
PERF_EVENT_ARRAY map and an array to diff counters read at exit from
counters read at exit of the profiled BPF programs and then accumulate
those diffs in another PERCPU_ARRAY.

This all ends up composing a "BPF PMU" that is what the userspace perf
tooling will read (from "accum_readings" BPF map)  and 'perf stat' will
consume as if reading from an "old style perf counter" :-)

Song, did I get it right? :-)

For convenience, it is below:

- Arnaldo
 
[acme@five perf]$ cat tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c 
// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
// Copyright (c) 2020 Facebook
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* map of perf event fds, num_cpu * num_metric entries */
struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(int));
} events SEC(".maps");

/* readings at fentry */
struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(struct bpf_perf_event_value));
	__uint(max_entries, 1);
} fentry_readings SEC(".maps");

/* accumulated readings */
struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(struct bpf_perf_event_value));
	__uint(max_entries, 1);
} accum_readings SEC(".maps");

const volatile __u32 num_cpu = 1;

SEC("fentry/XXX")
int BPF_PROG(fentry_XXX)
{
	__u32 key = bpf_get_smp_processor_id();
	struct bpf_perf_event_value *ptr;
	__u32 zero = 0;
	long err;

	/* look up before reading, to reduce error */
	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
	if (!ptr)
		return 0;

	err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr));
	if (err)
		return 0;

	return 0;
}

static inline void
fexit_update_maps(struct bpf_perf_event_value *after)
{
	struct bpf_perf_event_value *before, diff, *accum;
	__u32 zero = 0;

	before = bpf_map_lookup_elem(&fentry_readings, &zero);
	/* only account samples with a valid fentry_reading */
	if (before && before->counter) {
		struct bpf_perf_event_value *accum;

		diff.counter = after->counter - before->counter;
		diff.enabled = after->enabled - before->enabled;
		diff.running = after->running - before->running;

		accum = bpf_map_lookup_elem(&accum_readings, &zero);
		if (accum) {
			accum->counter += diff.counter;
			accum->enabled += diff.enabled;
			accum->running += diff.running;
		}
	}
}

SEC("fexit/XXX")
int BPF_PROG(fexit_XXX)
{
	struct bpf_perf_event_value reading;
	__u32 cpu = bpf_get_smp_processor_id();
	__u32 one = 1, zero = 0;
	int err;

	/* read all events before updating the maps, to reduce error */
	err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading));
	if (err)
		return 0;

	fexit_update_maps(&reading);
	return 0;
}

char LICENSE[] SEC("license") = "Dual BSD/GPL";
[acme@five perf]$

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-28 23:43     ` Song Liu
  2020-12-29  5:53       ` Song Liu
@ 2020-12-29 15:15       ` Arnaldo Carvalho de Melo
  2020-12-29 18:42         ` Song Liu
  1 sibling, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-12-29 15:15 UTC (permalink / raw)
  To: Song Liu
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team

Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu:
> 
> 
> > On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > 
> > Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
> >> Introduce perf-stat -b option, which counts events for BPF programs, like:
> >> 
> >> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
> >>     1.487903822            115,200      ref-cycles
> >>     1.487903822             86,012      cycles
> >>     2.489147029             80,560      ref-cycles
> >>     2.489147029             73,784      cycles
> >>     3.490341825             60,720      ref-cycles
> >>     3.490341825             37,797      cycles
> >>     4.491540887             37,120      ref-cycles
> >>     4.491540887             31,963      cycles
> >> 
> >> The example above counts cycles and ref-cycles of BPF program of id 254.
> >> This is similar to bpftool-prog-profile command, but more flexible.
> >> 
> >> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
> >> programs (monitor-progs) to the target BPF program (target-prog). The
> >> monitor-progs read perf_event before and after the target-prog, and
> >> aggregate the difference in a BPF map. Then the user space reads data
> >> from these maps.
> >> 
> >> A new struct bpf_counter is introduced to provide common interface that
> >> uses BPF programs/maps to count perf events.
> > 
> > Segfaulting here:
> > 
> > [root@five ~]# bpftool prog  | grep tracepoint
> > 110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
> > 111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
> > 112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
> > 113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
> > 114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
> > 115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
> > 116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
> > 117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
> > 118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
> > [root@five ~]#
> > [root@five ~]# gdb perf
> > GNU gdb (GDB) Fedora 10.1-2.fc33
> > Reading symbols from perf...
> > (gdb) run stat -e instructions,cycles -b 113 -I 1000
> > Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib64/libthread_db.so.1".
> > libbpf: elf: skipping unrecognized data section(9) .eh_frame
> > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> > libbpf: elf: skipping unrecognized data section(9) .eh_frame
> > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> > 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> > (gdb) bt
> > #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> > #1  0x0000000000000000 in ?? ()
> > (gdb)
> > 
> > [acme@five perf]$ clang -v |& head -2
> > clang version 11.0.0 (Fedora 11.0.0-2.fc33)
> > Target: x86_64-unknown-linux-gnu
> > [acme@five perf]$
> > 
> > Do you need any extra info?
> 
> Hmm... I am not able to reproduce this. I am trying to setup an environment similar
> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? 

I'll try it with a BPF proggie attached to a kprobes, but here is
something else I noticed:

[root@five perf]# export PYTHONPATH=/tmp/build/perf/python
[root@five perf]# tools/perf/python/twatch.py
Traceback (most recent call last):
  File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module>
    import perf
ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
[root@five perf]# perf test python
19: 'import perf' in python                                         : FAILED!
[root@five perf]# perf test -v python
19: 'import perf' in python                                         :
--- start ---
test child forked, pid 3198864
python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' "
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
test child finished with -1
---- end ----
'import perf' in python: FAILED!
[root@five perf]#

This should be trivial, I hope, just add the new object file to
tools/perf/util/python-ext-sources, then do a 'perf test python', if it
fails, use 'perf test -v python' to see what is preventing the python
binding from loading.

- Arnaldo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 4/4] perf-stat: add documentation for -b option
  2020-12-29  7:24   ` Namhyung Kim
@ 2020-12-29 16:59     ` Song Liu
  0 siblings, 0 replies; 25+ messages in thread
From: Song Liu @ 2020-12-29 16:59 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa,
	Kernel Team



> On Dec 28, 2020, at 11:24 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
>> 
>> Add documentation to perf-stat -b option, which stats event for BPF
>> programs.
>> 
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> tools/perf/Documentation/perf-stat.txt | 14 ++++++++++++++
>> 1 file changed, 14 insertions(+)
>> 
>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>> index 5d4a673d7621a..15b9a646e853d 100644
>> --- a/tools/perf/Documentation/perf-stat.txt
>> +++ b/tools/perf/Documentation/perf-stat.txt
>> @@ -75,6 +75,20 @@ report::
>> --tid=<tid>::
>>         stat events on existing thread id (comma separated list)
>> 
>> +-b::
>> +--bpf-prog::
>> +        stat events on existing bpf program id (comma separated list),
>> +        requiring root righs. For example:
> 
> Typo: rights
> 
> It'd be nice if it can show how we can get the id.

Thanks for the review! I fill fix these in the next version. 

Song


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 2/4] perf: support build BPF skeletons with perf
  2020-12-29 11:48     ` Arnaldo Carvalho de Melo
@ 2020-12-29 17:14       ` Song Liu
  2020-12-29 18:16         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2020-12-29 17:14 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Mark Rutland, Jiri Olsa, Kernel Team



> On Dec 29, 2020, at 3:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Tue, Dec 29, 2020 at 04:01:41PM +0900, Namhyung Kim escreveu:
>> On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
>>> BPF programs are useful in perf to profile BPF programs. BPF skeleton is
> 
>> I'm having difficulties understanding the first sentence - looks like a
>> recursion. :)  So do you want to use two (or more) BPF programs?
> 
> Yeah, we use perf to perf perf, so we need to use bpf with perf to perf
> bpf :-)
> 
> Look at tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c, the BPF
> skeleton used to create the in-kernel scaffold to profile BPF programs.
> 
> It uses two BPF programs (fentry/XXX and fexit/XXX) and some a
> PERF_EVENT_ARRAY map and an array to diff counters read at exit from
> counters read at exit of the profiled BPF programs and then accumulate
> those diffs in another PERCPU_ARRAY.
> 
> This all ends up composing a "BPF PMU" that is what the userspace perf
> tooling will read (from "accum_readings" BPF map)  and 'perf stat' will
> consume as if reading from an "old style perf counter" :-)
> 
> Song, did I get it right? :-)

Thanks Arnaldo! I don't think anyone can explain it better. :-)

Song

[...]


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29  7:22   ` Namhyung Kim
@ 2020-12-29 17:46     ` Song Liu
  2020-12-29 17:59       ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2020-12-29 17:46 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa,
	Kernel Team



> On Dec 28, 2020, at 11:22 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
>> 
>> Introduce perf-stat -b option, which counts events for BPF programs, like:
>> 
>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>>     1.487903822            115,200      ref-cycles
>>     1.487903822             86,012      cycles
>>     2.489147029             80,560      ref-cycles
>>     2.489147029             73,784      cycles
>>     3.490341825             60,720      ref-cycles
>>     3.490341825             37,797      cycles
>>     4.491540887             37,120      ref-cycles
>>     4.491540887             31,963      cycles
>> 
>> The example above counts cycles and ref-cycles of BPF program of id 254.
>> This is similar to bpftool-prog-profile command, but more flexible.
>> 
>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
>> programs (monitor-progs) to the target BPF program (target-prog). The
>> monitor-progs read perf_event before and after the target-prog, and
>> aggregate the difference in a BPF map. Then the user space reads data
>> from these maps.
>> 
>> A new struct bpf_counter is introduced to provide common interface that
>> uses BPF programs/maps to count perf events.
>> 
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> tools/perf/Makefile.perf                      |   2 +-
>> tools/perf/builtin-stat.c                     |  77 ++++-
>> tools/perf/util/Build                         |   1 +
>> tools/perf/util/bpf_counter.c                 | 296 ++++++++++++++++++
>> tools/perf/util/bpf_counter.h                 |  72 +++++
>> .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  93 ++++++
>> tools/perf/util/evsel.c                       |   9 +
>> tools/perf/util/evsel.h                       |   6 +
>> tools/perf/util/stat-display.c                |   4 +-
>> tools/perf/util/stat.c                        |   2 +-
>> tools/perf/util/target.c                      |  34 +-
>> tools/perf/util/target.h                      |  10 +
>> 12 files changed, 588 insertions(+), 18 deletions(-)
>> create mode 100644 tools/perf/util/bpf_counter.c
>> create mode 100644 tools/perf/util/bpf_counter.h
>> create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
>> 
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index d182a2dbb9bbd..8c4e039c3b813 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -1015,7 +1015,7 @@ python-clean:
>> 
>> SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
>> SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
>> -SKELETONS :=
>> +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
>> 
>> ifdef BUILD_BPF_SKEL
>> BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 8cc24967bc273..09bffb3fbcdd4 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -67,6 +67,7 @@
>> #include "util/top.h"
>> #include "util/affinity.h"
>> #include "util/pfm.h"
>> +#include "util/bpf_counter.h"
>> #include "asm/bug.h"
>> 
>> #include <linux/time64.h>
>> @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs)
>>        return 0;
>> }
>> 
>> +static int read_bpf_map_counters(void)
>> +{
>> +       struct evsel *counter;
>> +       int err;
>> +
>> +       evlist__for_each_entry(evsel_list, counter) {
>> +               err = bpf_counter__read(counter);
>> +               if (err)
>> +                       return err;
>> +       }
>> +       return 0;
>> +}
>> +
>> static void read_counters(struct timespec *rs)
>> {
>>        struct evsel *counter;
>> +       int err;
>> 
>> -       if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0))
>> -               return;
>> +       if (!stat_config.stop_read_counter) {
>> +               err = read_bpf_map_counters();
>> +               if (err == -EAGAIN)
>> +                       err = read_affinity_counters(rs);
> 
> Instead of checking the error code, can we do something like
> 
>  if (target__has_bpf(target))
>      read_bpf_map_counters();
> 
> ?

Yeah, we can do that. 

> 
>> +               if (err < 0)
>> +                       return;
>> +       }
>> 
>>        evlist__for_each_entry(evsel_list, counter) {
>>                if (counter->err)
>> @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times)
>>        return false;
>> }
>> 
>> -static void enable_counters(void)
>> +static int enable_counters(void)
>> {
>> +       struct evsel *evsel;
>> +       int err;
>> +
>> +       evlist__for_each_entry(evsel_list, evsel) {
>> +               err = bpf_counter__enable(evsel);
>> +               if (err)
>> +                       return err;
> 
> Ditto.

For this one, we still need to check the return value, as bpf_counter__enable()
may fail. We can add a global check to skip the loop. 

> 
>> +       }
>> +

[...]

>> +
>> +#include "bpf_skel/bpf_prog_profiler.skel.h"
>> +
>> +static inline void *u64_to_ptr(__u64 ptr)
>> +{
>> +       return (void *)(unsigned long)ptr;
>> +}
>> +
>> +static void set_max_rlimit(void)
>> +{
>> +       struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
>> +
>> +       setrlimit(RLIMIT_MEMLOCK, &rinf);
>> +}
> 
> This looks scary..

I guess this is OK as we requires root rights for -b?

> 

[...]
>> +       if (!counter) {
>> +               close(prog_fd);
>> +               return -1;
>> +       }
>> +
>> +       skel = bpf_prog_profiler_bpf__open();
>> +       if (!skel) {
>> +               pr_err("Failed to open bpf skeleton\n");
>> +               goto err_out;
>> +       }
>> +       skel->rodata->num_cpu = evsel__nr_cpus(evsel);
>> +
>> +       bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
>> +       bpf_map__resize(skel->maps.fentry_readings, 1);
>> +       bpf_map__resize(skel->maps.accum_readings, 1);
>> +
>> +       prog_name = bpf_target_prog_name(prog_fd);
>> +       if (!prog_name) {
>> +               pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id);
>> +               goto err_out;
>> +       }
>> +
>> +       bpf_object__for_each_program(prog, skel->obj) {
>> +               err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
>> +               if (err) {
>> +                       pr_err("bpf_program__set_attach_target failed.\n"
>> +                              "Does bpf prog %u have BTF?\n", prog_id);
>> +                       goto err_out;
>> +               }
>> +       }
>> +       set_max_rlimit();
>> +       err = bpf_prog_profiler_bpf__load(skel);
>> +       if (err) {
>> +               pr_err("bpf_prog_profiler_bpf__load failed\n");
>> +               goto err_out;
>> +       }
>> +
>> +       counter->skel = skel;
>> +       list_add(&counter->list, &evsel->bpf_counter_list);
>> +       close(prog_fd);
>> +       return 0;
>> +err_out:
>> +       free(counter);
>> +       close(prog_fd);
> 
> I don't know how the 'skel' part is managed, is it safe to leave?

Good catch! We do have bpf_program_profiler__destroy() in bpf_program_profiler__load().
But I should have counter->skel = skel in err path. Will fix. 

> 
> 
>> +       return -1;
>> +}
>> +
>> +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target)
>> +{
>> +       char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p;
>> +       u32 prog_id;
>> +       int ret;
>> +
>> +       bpf_str_ = bpf_str = strdup(target->bpf_str);
>> +       if (!bpf_str)
>> +               return -1;
>> +
>> +       while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) {
>> +               prog_id = strtoul(tok, &p, 10);
>> +               if (prog_id == 0 || prog_id == UINT_MAX ||
>> +                   (*p != '\0' && *p != ',')) {
>> +                       pr_err("Failed to parse bpf prog ids %s\n",
>> +                              target->bpf_str);
>> +                       return -1;
>> +               }
>> +
>> +               ret = bpf_program_profiler_load_one(evsel, prog_id);
>> +               if (ret) {
>> +                       bpf_program_profiler__destroy(evsel);
>> +                       free(bpf_str_);
>> +                       return -1;
>> +               }
>> +               bpf_str = NULL;
>> +       }
>> +       free(bpf_str_);
>> +       return 0;
>> +}
>> +
>> +static int bpf_program_profiler__enable(struct evsel *evsel)
>> +{
>> +       struct bpf_counter *counter;
>> +       int ret;
>> +
>> +       list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
>> +               ret = bpf_prog_profiler_bpf__attach(counter->skel);
>> +               if (ret) {
>> +                       bpf_program_profiler__destroy(evsel);
>> +                       return ret;
>> +               }
>> +       }
>> +       return 0;
>> +}
>> +
>> +static int bpf_program_profiler__read(struct evsel *evsel)
>> +{
>> +       int num_cpu = evsel__nr_cpus(evsel);
>> +       struct bpf_perf_event_value values[num_cpu];
>> +       struct bpf_counter *counter;
>> +       int reading_map_fd;
>> +       __u32 key = 0;
>> +       int err, cpu;
>> +
>> +       if (list_empty(&evsel->bpf_counter_list))
>> +               return -EAGAIN;
>> +
>> +       for (cpu = 0; cpu < num_cpu; cpu++) {
>> +               perf_counts(evsel->counts, cpu, 0)->val = 0;
>> +               perf_counts(evsel->counts, cpu, 0)->ena = 0;
>> +               perf_counts(evsel->counts, cpu, 0)->run = 0;
>> +       }
> 
> Hmm.. not sure it's correct to reset counters here.

Yeah, we need to reset the user space values here. Otherwise, the later aggregation
would give wrong number. 

> 
> 
>> +       list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
>> +               struct bpf_prog_profiler_bpf *skel = counter->skel;
>> +
>> +               reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>> +
>> +               err = bpf_map_lookup_elem(reading_map_fd, &key, values);
>> +               if (err) {
>> +                       fprintf(stderr, "failed to read value\n");
>> +                       return err;
>> +               }
>> +
>> +               for (cpu = 0; cpu < num_cpu; cpu++) {
>> +                       perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
>> +                       perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
>> +                       perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
>> +               }
>> +       }
> 
> So this just aggregates all the counters in BPF programs, right?

Yes. 

> 
> 
>> +       return 0;
>> +}
>> +
>> +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
>> +                                           int fd)
>> +{
>> +       struct bpf_prog_profiler_bpf *skel;
>> +       struct bpf_counter *counter;
>> +       int ret;
>> +
>> +       list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
>> +               skel = counter->skel;
>> +               ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
>> +                                         &cpu, &fd, BPF_ANY);
>> +               if (ret)
>> +                       return ret;
>> +       }
>> +       return 0;
>> +}
>> +
>> +struct bpf_counter_ops bpf_program_profiler_ops = {
>> +       .load       = bpf_program_profiler__load,
>> +       .enable     = bpf_program_profiler__enable,
>> +       .read       = bpf_program_profiler__read,
>> +       .destroy    = bpf_program_profiler__destroy,
>> +       .install_pe = bpf_program_profiler__install_pe,
> 
> What is 'pe'?

pe here means perf_event. 

> 
> Btw, do you think other kinds of bpf programs are added later?
> It seems 'perf stat -b' is somewhat coupled with this profiler ops.
> Will it be possible to run other ops in a same evsel?

It will be possible to add other ops. I have some idea of using BPF programs in
other perf scenarios. 

To clarify, I think each instance of evsel should only have one ops attached. 
And each session of perf-stat should only use one kind of ops. 

> 
>> 

[...]

>> +static inline bool target__has_bpf(struct target *target)
>> +{
>> +       return target->bpf_str;
>> +}
>> +
>> static inline bool target__none(struct target *target)
>> {
>>        return !target__has_task(target) && !target__has_cpu(target);
> 
> Shouldn't it have && !target__has_bpf() too?

Will fix. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29 17:46     ` Song Liu
@ 2020-12-29 17:59       ` Song Liu
  0 siblings, 0 replies; 25+ messages in thread
From: Song Liu @ 2020-12-29 17:59 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa,
	Kernel Team



> On Dec 29, 2020, at 9:46 AM, Song Liu <songliubraving@fb.com> wrote:
> 
>>> 

[...]

> 
> [...]
> 
>>> +static inline bool target__has_bpf(struct target *target)
>>> +{
>>> +       return target->bpf_str;
>>> +}
>>> +
>>> static inline bool target__none(struct target *target)
>>> {
>>>       return !target__has_task(target) && !target__has_cpu(target);
>> 
>> Shouldn't it have && !target__has_bpf() too?

Actually, we don't need target__has_bpf() here. As -b requires setting up counters
system wide (in setup_system_wide()). If we add target__has_bpf() here, we will
need something like below, which I think it not necessary. 

diff --git i/tools/perf/builtin-stat.c w/tools/perf/builtin-stat.c
index 09bffb3fbcdd4..853cec040191b 100644
--- i/tools/perf/builtin-stat.c
+++ w/tools/perf/builtin-stat.c
@@ -2081,7 +2081,7 @@ static void setup_system_wide(int forks)
         *   - there is workload specified but all requested
         *     events are system wide events
         */
-       if (!target__none(&target))
+       if (!target__none(&target) && !target__has_bpf(&target))
                return;

        if (!forks)
diff --git i/tools/perf/util/target.h w/tools/perf/util/target.h
index f132c6c2eef81..295fb11f4daff 100644
--- i/tools/perf/util/target.h
+++ w/tools/perf/util/target.h
@@ -71,7 +71,8 @@ static inline bool target__has_bpf(struct target *target)

 static inline bool target__none(struct target *target)
 {
-       return !target__has_task(target) && !target__has_cpu(target);
+       return !target__has_task(target) && !target__has_cpu(target) &&
+               !target__has_bpf(target);
 }

 static inline bool target__has_per_thread(struct target *target)



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 2/4] perf: support build BPF skeletons with perf
  2020-12-29 17:14       ` Song Liu
@ 2020-12-29 18:16         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-12-29 18:16 UTC (permalink / raw)
  To: Song Liu
  Cc: Namhyung Kim, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Mark Rutland, Jiri Olsa, Kernel Team

Em Tue, Dec 29, 2020 at 05:14:12PM +0000, Song Liu escreveu:
> > On Dec 29, 2020, at 3:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> > Em Tue, Dec 29, 2020 at 04:01:41PM +0900, Namhyung Kim escreveu:
> >> On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
> >>> BPF programs are useful in perf to profile BPF programs. BPF skeleton is

> >> I'm having difficulties understanding the first sentence - looks like a
> >> recursion. :)  So do you want to use two (or more) BPF programs?

> > Yeah, we use perf to perf perf, so we need to use bpf with perf to perf
> > bpf :-)

> > Look at tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c, the BPF
> > skeleton used to create the in-kernel scaffold to profile BPF programs.

> > It uses two BPF programs (fentry/XXX and fexit/XXX) and some a
                                                           s/some//
> > PERF_EVENT_ARRAY map and an array to diff counters read at exit from
> > counters read at exit of the profiled BPF programs and then accumulate
                    s/exit/entry/
> > those diffs in another PERCPU_ARRAY.

> > This all ends up composing a "BPF PMU" that is what the userspace perf
> > tooling will read (from "accum_readings" BPF map)  and 'perf stat' will
> > consume as if reading from an "old style perf counter" :-)

> > Song, did I get it right? :-)
 
> Thanks Arnaldo! I don't think anyone can explain it better. :-)

There, a patch :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29 15:15       ` Arnaldo Carvalho de Melo
@ 2020-12-29 18:42         ` Song Liu
  2020-12-29 18:48           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2020-12-29 18:42 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team



> On Dec 29, 2020, at 7:15 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu:
>> 
>> 
>>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>> 
>>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
>>>> Introduce perf-stat -b option, which counts events for BPF programs, like:
>>>> 
>>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>>>>    1.487903822            115,200      ref-cycles
>>>>    1.487903822             86,012      cycles
>>>>    2.489147029             80,560      ref-cycles
>>>>    2.489147029             73,784      cycles
>>>>    3.490341825             60,720      ref-cycles
>>>>    3.490341825             37,797      cycles
>>>>    4.491540887             37,120      ref-cycles
>>>>    4.491540887             31,963      cycles
>>>> 
>>>> The example above counts cycles and ref-cycles of BPF program of id 254.
>>>> This is similar to bpftool-prog-profile command, but more flexible.
>>>> 
>>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
>>>> programs (monitor-progs) to the target BPF program (target-prog). The
>>>> monitor-progs read perf_event before and after the target-prog, and
>>>> aggregate the difference in a BPF map. Then the user space reads data
>>>> from these maps.
>>>> 
>>>> A new struct bpf_counter is introduced to provide common interface that
>>>> uses BPF programs/maps to count perf events.
>>> 
>>> Segfaulting here:
>>> 
>>> [root@five ~]# bpftool prog  | grep tracepoint
>>> 110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
>>> 111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
>>> 112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
>>> 113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
>>> 114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
>>> 115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
>>> 116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
>>> 117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
>>> 118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
>>> [root@five ~]#
>>> [root@five ~]# gdb perf
>>> GNU gdb (GDB) Fedora 10.1-2.fc33
>>> Reading symbols from perf...
>>> (gdb) run stat -e instructions,cycles -b 113 -I 1000
>>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
>>> [Thread debugging using libthread_db enabled]
>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>>> 
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
>>> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>>> (gdb) bt
>>> #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
>>> #1  0x0000000000000000 in ?? ()
>>> (gdb)
>>> 
>>> [acme@five perf]$ clang -v |& head -2
>>> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
>>> Target: x86_64-unknown-linux-gnu
>>> [acme@five perf]$
>>> 
>>> Do you need any extra info?
>> 
>> Hmm... I am not able to reproduce this. I am trying to setup an environment similar
>> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? 
> 
> I'll try it with a BPF proggie attached to a kprobes, but here is
> something else I noticed:
> 
> [root@five perf]# export PYTHONPATH=/tmp/build/perf/python
> [root@five perf]# tools/perf/python/twatch.py
> Traceback (most recent call last):
>  File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module>
>    import perf
> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
> [root@five perf]# perf test python
> 19: 'import perf' in python                                         : FAILED!
> [root@five perf]# perf test -v python
> 19: 'import perf' in python                                         :
> --- start ---
> test child forked, pid 3198864
> python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' "
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
> test child finished with -1
> ---- end ----
> 'import perf' in python: FAILED!
> [root@five perf]#
> 
> This should be trivial, I hope, just add the new object file to
> tools/perf/util/python-ext-sources, then do a 'perf test python', if it
> fails, use 'perf test -v python' to see what is preventing the python
> binding from loading.

I fixed the undefined bpf_counter__destroy. But this one looks trickier:

19: 'import perf' in python                                         :
--- start ---
test child forked, pid 2714986
python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/bin/python2' "
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: XXXXX /tools/perf/python/perf.so: undefined symbol: bpf_map_update_elem

Given I already have:

diff --git i/tools/perf/util/python-ext-sources w/tools/perf/util/python-ext-sources
index a9d9c142eb7c3..2cac55273eca2 100644
--- i/tools/perf/util/python-ext-sources
+++ w/tools/perf/util/python-ext-sources
@@ -35,3 +35,6 @@ util/symbol_fprintf.c
 util/units.c
 util/affinity.c
 util/rwsem.c
+util/bpf_counter.c
+../lib/bpf/bpf.c
+../lib/bpf/libbpf.c


How should I fix this? 

Thanks,
Song

PS: I still cannot reproduce that segfault...

> 


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29 18:42         ` Song Liu
@ 2020-12-29 18:48           ` Arnaldo Carvalho de Melo
  2020-12-29 19:11             ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-12-29 18:48 UTC (permalink / raw)
  To: Song Liu
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team

Em Tue, Dec 29, 2020 at 06:42:18PM +0000, Song Liu escreveu:
> 
> 
> > On Dec 29, 2020, at 7:15 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > 
> > Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu:
> >> 
> >> 
> >>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >>> 
> >>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
> >>>> Introduce perf-stat -b option, which counts events for BPF programs, like:
> >>>> 
> >>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
> >>>>    1.487903822            115,200      ref-cycles
> >>>>    1.487903822             86,012      cycles
> >>>>    2.489147029             80,560      ref-cycles
> >>>>    2.489147029             73,784      cycles
> >>>>    3.490341825             60,720      ref-cycles
> >>>>    3.490341825             37,797      cycles
> >>>>    4.491540887             37,120      ref-cycles
> >>>>    4.491540887             31,963      cycles
> >>>> 
> >>>> The example above counts cycles and ref-cycles of BPF program of id 254.
> >>>> This is similar to bpftool-prog-profile command, but more flexible.
> >>>> 
> >>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
> >>>> programs (monitor-progs) to the target BPF program (target-prog). The
> >>>> monitor-progs read perf_event before and after the target-prog, and
> >>>> aggregate the difference in a BPF map. Then the user space reads data
> >>>> from these maps.
> >>>> 
> >>>> A new struct bpf_counter is introduced to provide common interface that
> >>>> uses BPF programs/maps to count perf events.
> >>> 
> >>> Segfaulting here:
> >>> 
> >>> [root@five ~]# bpftool prog  | grep tracepoint
> >>> 110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
> >>> 111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
> >>> 112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
> >>> 113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
> >>> 114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
> >>> 115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
> >>> 116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
> >>> 117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
> >>> 118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
> >>> [root@five ~]#
> >>> [root@five ~]# gdb perf
> >>> GNU gdb (GDB) Fedora 10.1-2.fc33
> >>> Reading symbols from perf...
> >>> (gdb) run stat -e instructions,cycles -b 113 -I 1000
> >>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
> >>> [Thread debugging using libthread_db enabled]
> >>> Using host libthread_db library "/lib64/libthread_db.so.1".
> >>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> >>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> >>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> >>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> >>> 
> >>> Program received signal SIGSEGV, Segmentation fault.
> >>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> >>> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> >>> (gdb) bt
> >>> #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> >>> #1  0x0000000000000000 in ?? ()
> >>> (gdb)
> >>> 
> >>> [acme@five perf]$ clang -v |& head -2
> >>> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
> >>> Target: x86_64-unknown-linux-gnu
> >>> [acme@five perf]$
> >>> 
> >>> Do you need any extra info?
> >> 
> >> Hmm... I am not able to reproduce this. I am trying to setup an environment similar
> >> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? 
> > 
> > I'll try it with a BPF proggie attached to a kprobes, but here is
> > something else I noticed:
> > 
> > [root@five perf]# export PYTHONPATH=/tmp/build/perf/python
> > [root@five perf]# tools/perf/python/twatch.py
> > Traceback (most recent call last):
> >  File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module>
> >    import perf
> > ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
> > [root@five perf]# perf test python
> > 19: 'import perf' in python                                         : FAILED!
> > [root@five perf]# perf test -v python
> > 19: 'import perf' in python                                         :
> > --- start ---
> > test child forked, pid 3198864
> > python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' "
> > Traceback (most recent call last):
> >  File "<stdin>", line 1, in <module>
> > ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
> > test child finished with -1
> > ---- end ----
> > 'import perf' in python: FAILED!
> > [root@five perf]#
> > 
> > This should be trivial, I hope, just add the new object file to
> > tools/perf/util/python-ext-sources, then do a 'perf test python', if it
> > fails, use 'perf test -v python' to see what is preventing the python
> > binding from loading.
> 
> I fixed the undefined bpf_counter__destroy. But this one looks trickier:
> 
> 19: 'import perf' in python                                         :
> --- start ---
> test child forked, pid 2714986
> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/bin/python2' "
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ImportError: XXXXX /tools/perf/python/perf.so: undefined symbol: bpf_map_update_elem
> 
> Given I already have:

I'll check this one to get a patch that at least moves the needle here,
i.e. probably we can leave supporting bpf counters in the python binding
for a later step.

- Arnaldo
 
> diff --git i/tools/perf/util/python-ext-sources w/tools/perf/util/python-ext-sources
> index a9d9c142eb7c3..2cac55273eca2 100644
> --- i/tools/perf/util/python-ext-sources
> +++ w/tools/perf/util/python-ext-sources
> @@ -35,3 +35,6 @@ util/symbol_fprintf.c
>  util/units.c
>  util/affinity.c
>  util/rwsem.c
> +util/bpf_counter.c
> +../lib/bpf/bpf.c
> +../lib/bpf/libbpf.c
> 
> 
> How should I fix this? 
> 
> Thanks,
> Song
> 
> PS: I still cannot reproduce that segfault...
> 
> > 
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29 18:48           ` Arnaldo Carvalho de Melo
@ 2020-12-29 19:11             ` Song Liu
  2020-12-29 19:18               ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2020-12-29 19:11 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team



> On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Tue, Dec 29, 2020 at 06:42:18PM +0000, Song Liu escreveu:
>> 
>> 
>>> On Dec 29, 2020, at 7:15 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>> 
>>> Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu:
>>>> 
>>>> 
>>>>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>>>> 
>>>>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
>>>>>> Introduce perf-stat -b option, which counts events for BPF programs, like:
>>>>>> 
>>>>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>>>>>>   1.487903822            115,200      ref-cycles
>>>>>>   1.487903822             86,012      cycles
>>>>>>   2.489147029             80,560      ref-cycles
>>>>>>   2.489147029             73,784      cycles
>>>>>>   3.490341825             60,720      ref-cycles
>>>>>>   3.490341825             37,797      cycles
>>>>>>   4.491540887             37,120      ref-cycles
>>>>>>   4.491540887             31,963      cycles
>>>>>> 
>>>>>> The example above counts cycles and ref-cycles of BPF program of id 254.
>>>>>> This is similar to bpftool-prog-profile command, but more flexible.
>>>>>> 
>>>>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
>>>>>> programs (monitor-progs) to the target BPF program (target-prog). The
>>>>>> monitor-progs read perf_event before and after the target-prog, and
>>>>>> aggregate the difference in a BPF map. Then the user space reads data
>>>>>> from these maps.
>>>>>> 
>>>>>> A new struct bpf_counter is introduced to provide common interface that
>>>>>> uses BPF programs/maps to count perf events.
>>>>> 
>>>>> Segfaulting here:
>>>>> 
>>>>> [root@five ~]# bpftool prog  | grep tracepoint
>>>>> 110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
>>>>> 111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
>>>>> 112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
>>>>> 113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
>>>>> 114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
>>>>> 115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
>>>>> 116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
>>>>> 117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
>>>>> 118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
>>>>> [root@five ~]#
>>>>> [root@five ~]# gdb perf
>>>>> GNU gdb (GDB) Fedora 10.1-2.fc33
>>>>> Reading symbols from perf...
>>>>> (gdb) run stat -e instructions,cycles -b 113 -I 1000
>>>>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
>>>>> [Thread debugging using libthread_db enabled]
>>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>>>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>>>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>>>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>>>>> 
>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
>>>>> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>>>>> (gdb) bt
>>>>> #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
>>>>> #1  0x0000000000000000 in ?? ()
>>>>> (gdb)
>>>>> 
>>>>> [acme@five perf]$ clang -v |& head -2
>>>>> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
>>>>> Target: x86_64-unknown-linux-gnu
>>>>> [acme@five perf]$
>>>>> 
>>>>> Do you need any extra info?
>>>> 
>>>> Hmm... I am not able to reproduce this. I am trying to setup an environment similar
>>>> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? 
>>> 
>>> I'll try it with a BPF proggie attached to a kprobes, but here is
>>> something else I noticed:
>>> 
>>> [root@five perf]# export PYTHONPATH=/tmp/build/perf/python
>>> [root@five perf]# tools/perf/python/twatch.py
>>> Traceback (most recent call last):
>>> File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module>
>>>   import perf
>>> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
>>> [root@five perf]# perf test python
>>> 19: 'import perf' in python                                         : FAILED!
>>> [root@five perf]# perf test -v python
>>> 19: 'import perf' in python                                         :
>>> --- start ---
>>> test child forked, pid 3198864
>>> python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' "
>>> Traceback (most recent call last):
>>> File "<stdin>", line 1, in <module>
>>> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
>>> test child finished with -1
>>> ---- end ----
>>> 'import perf' in python: FAILED!
>>> [root@five perf]#
>>> 
>>> This should be trivial, I hope, just add the new object file to
>>> tools/perf/util/python-ext-sources, then do a 'perf test python', if it
>>> fails, use 'perf test -v python' to see what is preventing the python
>>> binding from loading.
>> 
>> I fixed the undefined bpf_counter__destroy. But this one looks trickier:
>> 
>> 19: 'import perf' in python                                         :
>> --- start ---
>> test child forked, pid 2714986
>> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/bin/python2' "
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in <module>
>> ImportError: XXXXX /tools/perf/python/perf.so: undefined symbol: bpf_map_update_elem
>> 
>> Given I already have:
> 
> I'll check this one to get a patch that at least moves the needle here,
> i.e. probably we can leave supporting bpf counters in the python binding
> for a later step.

Thanks Arnaldo!

Currently, I have:
1. Fixed issues highlighted by Namhyung;
2. Merged 3/4 and 4/4;
3. NOT found segfault;
4. NOT fixed python import perf. 

I don't have good ideas with 3 and 4... Shall I send current code as v7?

Thanks,
Song

> 
> - Arnaldo
> 
>> diff --git i/tools/perf/util/python-ext-sources w/tools/perf/util/python-ext-sources
>> index a9d9c142eb7c3..2cac55273eca2 100644
>> --- i/tools/perf/util/python-ext-sources
>> +++ w/tools/perf/util/python-ext-sources
>> @@ -35,3 +35,6 @@ util/symbol_fprintf.c
>> util/units.c
>> util/affinity.c
>> util/rwsem.c
>> +util/bpf_counter.c
>> +../lib/bpf/bpf.c
>> +../lib/bpf/libbpf.c
>> 
>> 
>> How should I fix this? 
>> 
>> Thanks,
>> Song
>> 
>> PS: I still cannot reproduce that segfault...
>> 
>>> 
>> 
> 
> -- 
> 
> - Arnaldo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29 19:11             ` Song Liu
@ 2020-12-29 19:18               ` Arnaldo Carvalho de Melo
  2020-12-29 19:23                 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-12-29 19:18 UTC (permalink / raw)
  To: Song Liu
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team

Em Tue, Dec 29, 2020 at 07:11:12PM +0000, Song Liu escreveu:
> 
> 
> > On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > 
> > Em Tue, Dec 29, 2020 at 06:42:18PM +0000, Song Liu escreveu:
> >> 
> >> 
> >>> On Dec 29, 2020, at 7:15 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >>> 
> >>> Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu:
> >>>> 
> >>>> 
> >>>>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >>>>> 
> >>>>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
> >>>>>> Introduce perf-stat -b option, which counts events for BPF programs, like:
> >>>>>> 
> >>>>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
> >>>>>>   1.487903822            115,200      ref-cycles
> >>>>>>   1.487903822             86,012      cycles
> >>>>>>   2.489147029             80,560      ref-cycles
> >>>>>>   2.489147029             73,784      cycles
> >>>>>>   3.490341825             60,720      ref-cycles
> >>>>>>   3.490341825             37,797      cycles
> >>>>>>   4.491540887             37,120      ref-cycles
> >>>>>>   4.491540887             31,963      cycles
> >>>>>> 
> >>>>>> The example above counts cycles and ref-cycles of BPF program of id 254.
> >>>>>> This is similar to bpftool-prog-profile command, but more flexible.
> >>>>>> 
> >>>>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
> >>>>>> programs (monitor-progs) to the target BPF program (target-prog). The
> >>>>>> monitor-progs read perf_event before and after the target-prog, and
> >>>>>> aggregate the difference in a BPF map. Then the user space reads data
> >>>>>> from these maps.
> >>>>>> 
> >>>>>> A new struct bpf_counter is introduced to provide common interface that
> >>>>>> uses BPF programs/maps to count perf events.
> >>>>> 
> >>>>> Segfaulting here:
> >>>>> 
> >>>>> [root@five ~]# bpftool prog  | grep tracepoint
> >>>>> 110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
> >>>>> 111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
> >>>>> 112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
> >>>>> 113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
> >>>>> 114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
> >>>>> 115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
> >>>>> 116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
> >>>>> 117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
> >>>>> 118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
> >>>>> [root@five ~]#
> >>>>> [root@five ~]# gdb perf
> >>>>> GNU gdb (GDB) Fedora 10.1-2.fc33
> >>>>> Reading symbols from perf...
> >>>>> (gdb) run stat -e instructions,cycles -b 113 -I 1000
> >>>>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
> >>>>> [Thread debugging using libthread_db enabled]
> >>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
> >>>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> >>>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> >>>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> >>>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> >>>>> 
> >>>>> Program received signal SIGSEGV, Segmentation fault.
> >>>>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> >>>>> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> >>>>> (gdb) bt
> >>>>> #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> >>>>> #1  0x0000000000000000 in ?? ()
> >>>>> (gdb)
> >>>>> 
> >>>>> [acme@five perf]$ clang -v |& head -2
> >>>>> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
> >>>>> Target: x86_64-unknown-linux-gnu
> >>>>> [acme@five perf]$
> >>>>> 
> >>>>> Do you need any extra info?
> >>>> 
> >>>> Hmm... I am not able to reproduce this. I am trying to setup an environment similar
> >>>> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? 
> >>> 
> >>> I'll try it with a BPF proggie attached to a kprobes, but here is
> >>> something else I noticed:
> >>> 
> >>> [root@five perf]# export PYTHONPATH=/tmp/build/perf/python
> >>> [root@five perf]# tools/perf/python/twatch.py
> >>> Traceback (most recent call last):
> >>> File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module>
> >>>   import perf
> >>> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
> >>> [root@five perf]# perf test python
> >>> 19: 'import perf' in python                                         : FAILED!
> >>> [root@five perf]# perf test -v python
> >>> 19: 'import perf' in python                                         :
> >>> --- start ---
> >>> test child forked, pid 3198864
> >>> python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' "
> >>> Traceback (most recent call last):
> >>> File "<stdin>", line 1, in <module>
> >>> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy
> >>> test child finished with -1
> >>> ---- end ----
> >>> 'import perf' in python: FAILED!
> >>> [root@five perf]#
> >>> 
> >>> This should be trivial, I hope, just add the new object file to
> >>> tools/perf/util/python-ext-sources, then do a 'perf test python', if it
> >>> fails, use 'perf test -v python' to see what is preventing the python
> >>> binding from loading.
> >> 
> >> I fixed the undefined bpf_counter__destroy. But this one looks trickier:
> >> 
> >> 19: 'import perf' in python                                         :
> >> --- start ---
> >> test child forked, pid 2714986
> >> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/bin/python2' "
> >> Traceback (most recent call last):
> >>  File "<stdin>", line 1, in <module>
> >> ImportError: XXXXX /tools/perf/python/perf.so: undefined symbol: bpf_map_update_elem
> >> 
> >> Given I already have:
> > 
> > I'll check this one to get a patch that at least moves the needle here,
> > i.e. probably we can leave supporting bpf counters in the python binding
> > for a later step.
> 
> Thanks Arnaldo!
> 
> Currently, I have:
> 1. Fixed issues highlighted by Namhyung;
> 2. Merged 3/4 and 4/4;
> 3. NOT found segfault;
> 4. NOT fixed python import perf. 
> 
> I don't have good ideas with 3 and 4... Shall I send current code as v7?

For 4, please fold the patch below into the relevant patch, we don't
need bpf_counter.h included in util/evsel.h, you even added a forward
declaration for that 'struct bpf_counter_ops'.

And in general we should refrain from adding extra includes to header
files, .h-ell is not good.

Then we provide a stub for that bpf_counter__destroy() so that
util/evsel.o when linked into the perf python biding find it there,
links ok.

As we don't have a way to create such events via the perf python
binding, there will nothing to be done when destroying evsels created
via python.

- Arnaldo

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 40e3946cd7518113..8226b1fefa8cf2a3 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -10,7 +10,6 @@
 #include <internal/evsel.h>
 #include <perf/evsel.h>
 #include "symbol_conf.h"
-#include "bpf_counter.h"
 #include <internal/cpumap.h>
 
 struct bpf_object;
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index cc5ade85a33fc999..9609cc166d71a6f5 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -79,6 +79,21 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp,
 	return 0;
 }
 
+/*
+ * XXX: All these evsel destructors need some better mechanism, like a linked
+ * list of destructors registered when the relevant code indeed is used instead
+ * of having more and more calls in perf_evsel__delete(). -- acme
+ *
+ * For now, add one more:
+ *
+ * Not to drag the BPF bandwagon...
+ */
+void bpf_counter__destroy(struct evsel *evsel);
+
+void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
+{
+}
+
 /*
  * Support debug printing even though util/debug.c is not linked.  That means
  * implementing 'verbose' and 'eprintf'.

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29 19:18               ` Arnaldo Carvalho de Melo
@ 2020-12-29 19:23                 ` Arnaldo Carvalho de Melo
  2020-12-29 19:32                   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-12-29 19:23 UTC (permalink / raw)
  To: Song Liu
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team

Em Tue, Dec 29, 2020 at 04:18:48PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Dec 29, 2020 at 07:11:12PM +0000, Song Liu escreveu:
> > > On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > I'll check this one to get a patch that at least moves the needle here,
> > > i.e. probably we can leave supporting bpf counters in the python binding
> > > for a later step.

> > Thanks Arnaldo!

> > Currently, I have:
> > 1. Fixed issues highlighted by Namhyung;
> > 2. Merged 3/4 and 4/4;
> > 3. NOT found segfault;
> > 4. NOT fixed python import perf. 

> > I don't have good ideas with 3 and 4... Shall I send current code as v7?

> For 4, please fold the patch below into the relevant patch, we don't
> need bpf_counter.h included in util/evsel.h, you even added a forward
> declaration for that 'struct bpf_counter_ops'.
 
> And in general we should refrain from adding extra includes to header
> files, .h-ell is not good.
> 
> Then we provide a stub for that bpf_counter__destroy() so that
> util/evsel.o when linked into the perf python biding find it there,
> links ok.

Ok, one more stub is needed, I wasn't building all the time with 

  $ make BUILD_BPF_SKEL=1

Ditch the previous patch please, use the one below instead:

- Arnaldo

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 40e3946cd7518113..8226b1fefa8cf2a3 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -10,7 +10,6 @@
 #include <internal/evsel.h>
 #include <perf/evsel.h>
 #include "symbol_conf.h"
-#include "bpf_counter.h"
 #include <internal/cpumap.h>
 
 struct bpf_object;
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index cc5ade85a33fc999..278abecb5bdfc0d2 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -79,6 +79,27 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp,
 	return 0;
 }
 
+/*
+ * XXX: All these evsel destructors need some better mechanism, like a linked
+ * list of destructors registered when the relevant code indeed is used instead
+ * of having more and more calls in perf_evsel__delete(). -- acme
+ *
+ * For now, add some more:
+ *
+ * Not to drag the BPF bandwagon...
+ */
+void bpf_counter__destroy(struct evsel *evsel);
+int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
+
+void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
+{
+}
+
+int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, int cpu __maybe_unused, int fd __maybe_unused)
+{
+	return 0;
+}
+
 /*
  * Support debug printing even though util/debug.c is not linked.  That means
  * implementing 'verbose' and 'eprintf'.

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29 19:23                 ` Arnaldo Carvalho de Melo
@ 2020-12-29 19:32                   ` Arnaldo Carvalho de Melo
  2020-12-29 21:40                     ` Song Liu
  0 siblings, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-12-29 19:32 UTC (permalink / raw)
  To: Song Liu
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team

Em Tue, Dec 29, 2020 at 04:23:47PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Dec 29, 2020 at 04:18:48PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Tue, Dec 29, 2020 at 07:11:12PM +0000, Song Liu escreveu:
> > > > On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > I'll check this one to get a patch that at least moves the needle here,
> > > > i.e. probably we can leave supporting bpf counters in the python binding
> > > > for a later step.
> 
> > > Thanks Arnaldo!
> 
> > > Currently, I have:
> > > 1. Fixed issues highlighted by Namhyung;
> > > 2. Merged 3/4 and 4/4;
> > > 3. NOT found segfault;
> > > 4. NOT fixed python import perf. 

For 3, now with a kprobe:

[root@five ~]# bpftool prog | grep hrtimer -A10
99: kprobe  name hrtimer_nanosle  tag 0e77bacaf4555f83  gpl
	loaded_at 2020-12-29T16:25:34-0300  uid 0
	xlated 80B  jited 49B  memlock 4096B
	btf_id 253
[root@five ~]# perf stat -I 1000 -b 99
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
Segmentation fault (core dumped)
[root@five ~]#

(gdb) run stat -I 1000 -b 99
Starting program: /root/bin/perf stat -I 1000 -b 99
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame

Program received signal SIGSEGV, Segmentation fault.
0x000000000056559b in bpf_program_profiler__read (evsel=0xc39770) at util/bpf_counter.c:217
217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.fc33.x86_64 cyrus-sasl-lib-2.1.27-6.fc33.x86_64 elfutils-debuginfod-client-0.182-1.fc33.x86_64 elfutils-libelf-0.182-1.fc33.x86_64 elfutils-libs-0.182-1.fc33.x86_64 keyutils-libs-1.6-5.fc33.x86_64 krb5-libs-1.18.2-29.fc33.x86_64 libbabeltrace-1.5.8-3.fc33.x86_64 libbrotli-1.0.9-3.fc33.x86_64 libcap-2.26-8.fc33.x86_64 libcom_err-1.45.6-4.fc33.x86_64 libcurl-7.71.1-8.fc33.x86_64 libgcc-10.2.1-9.fc33.x86_64 libidn2-2.3.0-4.fc33.x86_64 libnghttp2-1.41.0-3.fc33.x86_64 libpsl-0.21.1-2.fc33.x86_64 libselinux-3.1-2.fc33.x86_64 libssh-0.9.5-1.fc33.x86_64 libunistring-0.9.10-9.fc33.x86_64 libunwind-1.4.0-4.fc33.x86_64 libuuid-2.36-3.fc33.x86_64 libxcrypt-4.4.17-1.fc33.x86_64 libzstd-1.4.5-5.fc33.x86_64 numactl-libs-2.0.14-1.fc33.x86_64 openldap-2.4.50-5.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64 pcre-8.44-2.fc33.x86_64 pcre2-10.36-1.fc33.x86_64 perl-libs-5.32.0-464.fc33.x86_64 popt-1.18-2.fc33.x86_64 python3-libs-3.9.1-1.fc33.x86_64 slang-2.3.2-8.fc33.x86_64 xz-libs-5.2.5-3.fc33.x86_64
(gdb) bt
#0  0x000000000056559b in bpf_program_profiler__read (evsel=0xc39770) at util/bpf_counter.c:217
#1  0x0000000000000000 in ?? ()
(gdb) p skel->maps.accum_readings
Cannot access memory at address 0x20
(gdb) p skel
$1 = (struct bpf_prog_profiler_bpf *) 0x0
(gdb) list -10
202		int reading_map_fd;
203		__u32 key = 0;
204		int err, cpu;
205
206		if (list_empty(&evsel->bpf_counter_list))
207			return -EAGAIN;
208
209		for (cpu = 0; cpu < num_cpu; cpu++) {
210			perf_counts(evsel->counts, cpu, 0)->val = 0;
211			perf_counts(evsel->counts, cpu, 0)->ena = 0;
(gdb)
212			perf_counts(evsel->counts, cpu, 0)->run = 0;
213		}
214		list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
215			struct bpf_prog_profiler_bpf *skel = counter->skel;
216
217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
218
219			err = bpf_map_lookup_elem(reading_map_fd, &key, values);
220			if (err) {
221				fprintf(stderr, "failed to read value\n");
(gdb) p counter->skel
$2 = (void *) 0x0
(gdb) p perf_evsel__name(counter)
No symbol "perf_evsel__name" in current context.
(gdb) p evsel__name(counter)
$3 = 0xc77420 "unknown attr type: 13078424"
(gdb) p evsel->type
There is no member named type.
(gdb) p evsel->core.
attr         cpus         fd           id           ids          node         nr_members   own_cpus     sample_id    system_wide  threads
(gdb) p evsel->core.attr.type
$4 = 1
(gdb) p evsel->core.attr.config
$5 = 0
(gdb) p evsel->evlist
$6 = (struct evlist *) 0xc3cfd0
(gdb) p evsel->evlist->core.nr_entries
$7 = 10
(gdb)


10 entries, the default for 'perf stat'


With just one event:

(gdb) run stat -e cycles -I 1000 -b 99
Starting program: /root/bin/perf stat -e cycles -I 1000 -b 99
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame

Program received signal SIGSEGV, Segmentation fault.
0x000000000056559b in bpf_program_profiler__read (evsel=0xc392c0) at util/bpf_counter.c:217
217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.fc33.x86_64 cyrus-sasl-lib-2.1.27-6.fc33.x86_64 elfutils-debuginfod-client-0.182-1.fc33.x86_64 elfutils-libelf-0.182-1.fc33.x86_64 elfutils-libs-0.182-1.fc33.x86_64 keyutils-libs-1.6-5.fc33.x86_64 krb5-libs-1.18.2-29.fc33.x86_64 libbabeltrace-1.5.8-3.fc33.x86_64 libbrotli-1.0.9-3.fc33.x86_64 libcap-2.26-8.fc33.x86_64 libcom_err-1.45.6-4.fc33.x86_64 libcurl-7.71.1-8.fc33.x86_64 libgcc-10.2.1-9.fc33.x86_64 libidn2-2.3.0-4.fc33.x86_64 libnghttp2-1.41.0-3.fc33.x86_64 libpsl-0.21.1-2.fc33.x86_64 libselinux-3.1-2.fc33.x86_64 libssh-0.9.5-1.fc33.x86_64 libunistring-0.9.10-9.fc33.x86_64 libunwind-1.4.0-4.fc33.x86_64 libuuid-2.36-3.fc33.x86_64 libxcrypt-4.4.17-1.fc33.x86_64 libzstd-1.4.5-5.fc33.x86_64 numactl-libs-2.0.14-1.fc33.x86_64 openldap-2.4.50-5.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64 pcre-8.44-2.fc33.x86_64 pcre2-10.36-1.fc33.x86_64 perl-libs-5.32.0-464.fc33.x86_64 popt-1.18-2.fc33.x86_64 python3-libs-3.9.1-1.fc33.x86_64 slang-2.3.2-8.fc33.x86_64 xz-libs-5.2.5-3.fc33.x86_64
(gdb) bt
#0  0x000000000056559b in bpf_program_profiler__read (evsel=0xc392c0) at util/bpf_counter.c:217
#1  0x0000000000000000 in ?? ()
(gdb) p evsel->name
$1 = 0xc37960 "cycles"
(gdb) p evsel->bpf_counter_
bpf_counter_list  bpf_counter_ops
(gdb) p evsel->bpf_counter_ops
$2 = (struct bpf_counter_ops *) 0xa08ec0 <bpf_program_profiler_ops>
(gdb) p evsel->bpf_counter_
bpf_counter_list  bpf_counter_ops
(gdb) p evsel->bpf_counter_list
$3 = {next = 0xc36e18, prev = 0xc36e18}
(gdb) p evsel->s
sample_size        side_band          stats              supported          synth_sample_type
(gdb) list -5
207			return -EAGAIN;
208
209		for (cpu = 0; cpu < num_cpu; cpu++) {
210			perf_counts(evsel->counts, cpu, 0)->val = 0;
211			perf_counts(evsel->counts, cpu, 0)->ena = 0;
212			perf_counts(evsel->counts, cpu, 0)->run = 0;
213		}
214		list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
215			struct bpf_prog_profiler_bpf *skel = counter->skel;
216
(gdb)
217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
218
219			err = bpf_map_lookup_elem(reading_map_fd, &key, values);
220			if (err) {
221				fprintf(stderr, "failed to read value\n");
222				return err;
223			}
224
225			for (cpu = 0; cpu < num_cpu; cpu++) {
226				perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
(gdb) p counter->skel
$4 = (void *) 0x0
(gdb)

skel is NULL?!

I ran out of time, have to go errands now. will bbl.

- Arnaldo
 
> > > I don't have good ideas with 3 and 4... Shall I send current code as v7?
> 
> > For 4, please fold the patch below into the relevant patch, we don't
> > need bpf_counter.h included in util/evsel.h, you even added a forward
> > declaration for that 'struct bpf_counter_ops'.
>  
> > And in general we should refrain from adding extra includes to header
> > files, .h-ell is not good.
> > 
> > Then we provide a stub for that bpf_counter__destroy() so that
> > util/evsel.o when linked into the perf python biding find it there,
> > links ok.
> 
> Ok, one more stub is needed, I wasn't building all the time with 
> 
>   $ make BUILD_BPF_SKEL=1
> 
> Ditch the previous patch please, use the one below instead:
> 
> - Arnaldo
> 
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 40e3946cd7518113..8226b1fefa8cf2a3 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -10,7 +10,6 @@
>  #include <internal/evsel.h>
>  #include <perf/evsel.h>
>  #include "symbol_conf.h"
> -#include "bpf_counter.h"
>  #include <internal/cpumap.h>
>  
>  struct bpf_object;
> diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
> index cc5ade85a33fc999..278abecb5bdfc0d2 100644
> --- a/tools/perf/util/python.c
> +++ b/tools/perf/util/python.c
> @@ -79,6 +79,27 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp,
>  	return 0;
>  }
>  
> +/*
> + * XXX: All these evsel destructors need some better mechanism, like a linked
> + * list of destructors registered when the relevant code indeed is used instead
> + * of having more and more calls in perf_evsel__delete(). -- acme
> + *
> + * For now, add some more:
> + *
> + * Not to drag the BPF bandwagon...
> + */
> +void bpf_counter__destroy(struct evsel *evsel);
> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
> +
> +void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
> +{
> +}
> +
> +int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, int cpu __maybe_unused, int fd __maybe_unused)
> +{
> +	return 0;
> +}
> +
>  /*
>   * Support debug printing even though util/debug.c is not linked.  That means
>   * implementing 'verbose' and 'eprintf'.

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-29 19:32                   ` Arnaldo Carvalho de Melo
@ 2020-12-29 21:40                     ` Song Liu
  0 siblings, 0 replies; 25+ messages in thread
From: Song Liu @ 2020-12-29 21:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team



> On Dec 29, 2020, at 11:32 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Tue, Dec 29, 2020 at 04:23:47PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Tue, Dec 29, 2020 at 04:18:48PM -0300, Arnaldo Carvalho de Melo escreveu:
>>> Em Tue, Dec 29, 2020 at 07:11:12PM +0000, Song Liu escreveu:
>>>>> On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>>>> I'll check this one to get a patch that at least moves the needle here,
>>>>> i.e. probably we can leave supporting bpf counters in the python binding
>>>>> for a later step.
>> 
>>>> Thanks Arnaldo!
>> 
>>>> Currently, I have:
>>>> 1. Fixed issues highlighted by Namhyung;
>>>> 2. Merged 3/4 and 4/4;
>>>> 3. NOT found segfault;
>>>> 4. NOT fixed python import perf. 
> 
> For 3, now with a kprobe:
> 
> [root@five ~]# bpftool prog | grep hrtimer -A10
> 99: kprobe  name hrtimer_nanosle  tag 0e77bacaf4555f83  gpl
> 	loaded_at 2020-12-29T16:25:34-0300  uid 0
> 	xlated 80B  jited 49B  memlock 4096B
> 	btf_id 253
> [root@five ~]# perf stat -I 1000 -b 99
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> Segmentation fault (core dumped)
> [root@five ~]#
> 
> (gdb) run stat -I 1000 -b 99
> Starting program: /root/bin/perf stat -I 1000 -b 99
> Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000056559b in bpf_program_profiler__read (evsel=0xc39770) at util/bpf_counter.c:217
> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.fc33.x86_64 cyrus-sasl-lib-2.1.27-6.fc33.x86_64 elfutils-debuginfod-client-0.182-1.fc33.x86_64 elfutils-libelf-0.182-1.fc33.x86_64 elfutils-libs-0.182-1.fc33.x86_64 keyutils-libs-1.6-5.fc33.x86_64 krb5-libs-1.18.2-29.fc33.x86_64 libbabeltrace-1.5.8-3.fc33.x86_64 libbrotli-1.0.9-3.fc33.x86_64 libcap-2.26-8.fc33.x86_64 libcom_err-1.45.6-4.fc33.x86_64 libcurl-7.71.1-8.fc33.x86_64 libgcc-10.2.1-9.fc33.x86_64 libidn2-2.3.0-4.fc33.x86_64 libnghttp2-1.41.0-3.fc33.x86_64 libpsl-0.21.1-2.fc33.x86_64 libselinux-3.1-2.fc33.x86_64 libssh-0.9.5-1.fc33.x86_64 libunistring-0.9.10-9.fc33.x86_64 libunwind-1.4.0-4.fc33.x86_64 libuuid-2.36-3.fc33.x86_64 libxcrypt-4.4.17-1.fc33.x86_64 libzstd-1.4.5-5.fc33.x86_64 numactl-libs-2.0.14-1.fc33.x86_64 openldap-2.4.50-5.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64 pcre-8.44-2.fc33.x86_64 pcre2-10.36-1.fc33.x86_64 perl-libs-5.32.0-464.fc33.x86_64 popt-1.18-2.fc33.x86_64 python3-libs-3.9.1-1.fc33.x86_64 slang-2.3.2-8.fc33.x86_64 xz-libs-5.2.5-3.fc33.x86_64
> (gdb) bt
> #0  0x000000000056559b in bpf_program_profiler__read (evsel=0xc39770) at util/bpf_counter.c:217
> #1  0x0000000000000000 in ?? ()
> (gdb) p skel->maps.accum_readings
> Cannot access memory at address 0x20
> (gdb) p skel
> $1 = (struct bpf_prog_profiler_bpf *) 0x0
> (gdb) list -10
> 202		int reading_map_fd;
> 203		__u32 key = 0;
> 204		int err, cpu;
> 205
> 206		if (list_empty(&evsel->bpf_counter_list))
> 207			return -EAGAIN;
> 208
> 209		for (cpu = 0; cpu < num_cpu; cpu++) {
> 210			perf_counts(evsel->counts, cpu, 0)->val = 0;
> 211			perf_counts(evsel->counts, cpu, 0)->ena = 0;
> (gdb)
> 212			perf_counts(evsel->counts, cpu, 0)->run = 0;
> 213		}
> 214		list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> 215			struct bpf_prog_profiler_bpf *skel = counter->skel;
> 216
> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> 218
> 219			err = bpf_map_lookup_elem(reading_map_fd, &key, values);
> 220			if (err) {
> 221				fprintf(stderr, "failed to read value\n");
> (gdb) p counter->skel
> $2 = (void *) 0x0
> (gdb) p perf_evsel__name(counter)
> No symbol "perf_evsel__name" in current context.
> (gdb) p evsel__name(counter)
> $3 = 0xc77420 "unknown attr type: 13078424"
> (gdb) p evsel->type
> There is no member named type.
> (gdb) p evsel->core.
> attr         cpus         fd           id           ids          node         nr_members   own_cpus     sample_id    system_wide  threads
> (gdb) p evsel->core.attr.type
> $4 = 1
> (gdb) p evsel->core.attr.config
> $5 = 0
> (gdb) p evsel->evlist
> $6 = (struct evlist *) 0xc3cfd0
> (gdb) p evsel->evlist->core.nr_entries
> $7 = 10
> (gdb)
> 
> 
> 10 entries, the default for 'perf stat'
> 
> 
> With just one event:
> 
> (gdb) run stat -e cycles -I 1000 -b 99
> Starting program: /root/bin/perf stat -e cycles -I 1000 -b 99
> Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000056559b in bpf_program_profiler__read (evsel=0xc392c0) at util/bpf_counter.c:217
> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.fc33.x86_64 cyrus-sasl-lib-2.1.27-6.fc33.x86_64 elfutils-debuginfod-client-0.182-1.fc33.x86_64 elfutils-libelf-0.182-1.fc33.x86_64 elfutils-libs-0.182-1.fc33.x86_64 keyutils-libs-1.6-5.fc33.x86_64 krb5-libs-1.18.2-29.fc33.x86_64 libbabeltrace-1.5.8-3.fc33.x86_64 libbrotli-1.0.9-3.fc33.x86_64 libcap-2.26-8.fc33.x86_64 libcom_err-1.45.6-4.fc33.x86_64 libcurl-7.71.1-8.fc33.x86_64 libgcc-10.2.1-9.fc33.x86_64 libidn2-2.3.0-4.fc33.x86_64 libnghttp2-1.41.0-3.fc33.x86_64 libpsl-0.21.1-2.fc33.x86_64 libselinux-3.1-2.fc33.x86_64 libssh-0.9.5-1.fc33.x86_64 libunistring-0.9.10-9.fc33.x86_64 libunwind-1.4.0-4.fc33.x86_64 libuuid-2.36-3.fc33.x86_64 libxcrypt-4.4.17-1.fc33.x86_64 libzstd-1.4.5-5.fc33.x86_64 numactl-libs-2.0.14-1.fc33.x86_64 openldap-2.4.50-5.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64 pcre-8.44-2.fc33.x86_64 pcre2-10.36-1.fc33.x86_64 perl-libs-5.32.0-464.fc33.x86_64 popt-1.18-2.fc33.x86_64 python3-libs-3.9.1-1.fc33.x86_64 slang-2.3.2-8.fc33.x86_64 xz-libs-5.2.5-3.fc33.x86_64
> (gdb) bt
> #0  0x000000000056559b in bpf_program_profiler__read (evsel=0xc392c0) at util/bpf_counter.c:217
> #1  0x0000000000000000 in ?? ()
> (gdb) p evsel->name
> $1 = 0xc37960 "cycles"
> (gdb) p evsel->bpf_counter_
> bpf_counter_list  bpf_counter_ops
> (gdb) p evsel->bpf_counter_ops
> $2 = (struct bpf_counter_ops *) 0xa08ec0 <bpf_program_profiler_ops>
> (gdb) p evsel->bpf_counter_
> bpf_counter_list  bpf_counter_ops
> (gdb) p evsel->bpf_counter_list
> $3 = {next = 0xc36e18, prev = 0xc36e18}
> (gdb) p evsel->s
> sample_size        side_band          stats              supported          synth_sample_type
> (gdb) list -5
> 207			return -EAGAIN;
> 208
> 209		for (cpu = 0; cpu < num_cpu; cpu++) {
> 210			perf_counts(evsel->counts, cpu, 0)->val = 0;
> 211			perf_counts(evsel->counts, cpu, 0)->ena = 0;
> 212			perf_counts(evsel->counts, cpu, 0)->run = 0;
> 213		}
> 214		list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> 215			struct bpf_prog_profiler_bpf *skel = counter->skel;
> 216
> (gdb)
> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> 218
> 219			err = bpf_map_lookup_elem(reading_map_fd, &key, values);
> 220			if (err) {
> 221				fprintf(stderr, "failed to read value\n");
> 222				return err;
> 223			}
> 224
> 225			for (cpu = 0; cpu < num_cpu; cpu++) {
> 226				perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
> (gdb) p counter->skel
> $4 = (void *) 0x0
> (gdb)
> 
> skel is NULL?!

So it is skel == NULL. In v7 (coming soon), I fixed some issues in the 
allocate/free of skel, and added some assert(). Let's see how that goes..

Thanks,
Song

> 
> I ran out of time, have to go errands now. will bbl.
> 
> - Arnaldo
> 
>>>> I don't have good ideas with 3 and 4... Shall I send current code as v7?
>> 
>>> For 4, please fold the patch below into the relevant patch, we don't
>>> need bpf_counter.h included in util/evsel.h, you even added a forward
>>> declaration for that 'struct bpf_counter_ops'.
>> 
>>> And in general we should refrain from adding extra includes to header
>>> files, .h-ell is not good.
>>> 
>>> Then we provide a stub for that bpf_counter__destroy() so that
>>> util/evsel.o when linked into the perf python biding find it there,
>>> links ok.
>> 
>> Ok, one more stub is needed, I wasn't building all the time with 
>> 
>>  $ make BUILD_BPF_SKEL=1
>> 
>> Ditch the previous patch please, use the one below instead:
>> 
>> - Arnaldo
>> 
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index 40e3946cd7518113..8226b1fefa8cf2a3 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -10,7 +10,6 @@
>> #include <internal/evsel.h>
>> #include <perf/evsel.h>
>> #include "symbol_conf.h"
>> -#include "bpf_counter.h"
>> #include <internal/cpumap.h>
>> 
>> struct bpf_object;
>> diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
>> index cc5ade85a33fc999..278abecb5bdfc0d2 100644
>> --- a/tools/perf/util/python.c
>> +++ b/tools/perf/util/python.c
>> @@ -79,6 +79,27 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp,
>> 	return 0;
>> }
>> 
>> +/*
>> + * XXX: All these evsel destructors need some better mechanism, like a linked
>> + * list of destructors registered when the relevant code indeed is used instead
>> + * of having more and more calls in perf_evsel__delete(). -- acme
>> + *
>> + * For now, add some more:
>> + *
>> + * Not to drag the BPF bandwagon...
>> + */
>> +void bpf_counter__destroy(struct evsel *evsel);
>> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
>> +
>> +void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
>> +{
>> +}
>> +
>> +int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, int cpu __maybe_unused, int fd __maybe_unused)
>> +{
>> +	return 0;
>> +}
>> +
>> /*
>>  * Support debug printing even though util/debug.c is not linked.  That means
>>  * implementing 'verbose' and 'eprintf'.
> 
> -- 
> 
> - Arnaldo


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-12-29 21:42 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu
2020-12-28 17:40 ` [PATCH v6 1/4] bpftool: add Makefile target bootstrap Song Liu
2020-12-28 17:40 ` [PATCH v6 2/4] perf: support build BPF skeletons with perf Song Liu
2020-12-29  7:01   ` Namhyung Kim
2020-12-29 11:48     ` Arnaldo Carvalho de Melo
2020-12-29 17:14       ` Song Liu
2020-12-29 18:16         ` Arnaldo Carvalho de Melo
2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu
2020-12-28 20:11   ` Arnaldo Carvalho de Melo
2020-12-28 23:43     ` Song Liu
2020-12-29  5:53       ` Song Liu
2020-12-29 15:15       ` Arnaldo Carvalho de Melo
2020-12-29 18:42         ` Song Liu
2020-12-29 18:48           ` Arnaldo Carvalho de Melo
2020-12-29 19:11             ` Song Liu
2020-12-29 19:18               ` Arnaldo Carvalho de Melo
2020-12-29 19:23                 ` Arnaldo Carvalho de Melo
2020-12-29 19:32                   ` Arnaldo Carvalho de Melo
2020-12-29 21:40                     ` Song Liu
2020-12-29  7:22   ` Namhyung Kim
2020-12-29 17:46     ` Song Liu
2020-12-29 17:59       ` Song Liu
2020-12-28 17:40 ` [PATCH v6 4/4] perf-stat: add documentation for -b option Song Liu
2020-12-29  7:24   ` Namhyung Kim
2020-12-29 16:59     ` Song Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.