* [PATCH v2 0/2] Introduce perf-stat -b for BPF programs
From: Song Liu @ 2020-12-04  6:13 UTC
  To: linux-kernel
  Cc: kernel-team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, Song Liu

This set introduces the perf-stat -b option, which counts events for BPF
programs. This is similar to bpftool-prog-profile, but perf-stat makes it
much more flexible.
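
For example, the following counts ref-cycles and cycles for the BPF
program with id 254, at 1 second intervals (same example as in patch
2/2 below):

  perf stat -e ref-cycles,cycles -b 254 -I 1000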

Changes PATCH v1 => PATCH v2:
  1. Various fixes in Makefiles. (Jiri)
  2. Fix a build warning/error with gcc-10. (Jiri)

Changes RFC v2 => PATCH v1:
  1. Support counting on multiple BPF programs.
  2. Add BPF handling to target__validate().
  3. Improve Makefile. (Jiri)

Changes RFC v1 => RFC v2:
  1. Use bootstrap version of bpftool. (Jiri)
  2. Set default to not building bpf skeletons. (Jiri)
  3. Remove util/bpf_skel/Makefile, keep all the logic in Makefile.perf.
     (Jiri)
  4. Remove the dependency on vmlinux.h in the two skeletons. The goal here
     is to enable building perf without building the kernel (vmlinux) first.
     Note: I also removed the logic that builds vmlinux.h. We can add that
     back when we have to use it (to access big kernel structures); a
     regeneration command is sketched below.
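
     For reference, when we do need vmlinux.h again, it can be regenerated
     from the running kernel's BTF with something like (assuming
     CONFIG_DEBUG_INFO_BTF is enabled):

       bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h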

Song Liu (2):
  perf: support build BPF skeletons with perf
  perf-stat: enable counting events for BPF programs

 tools/bpf/bpftool/Makefile                    |   2 +
 tools/build/Makefile.feature                  |   4 +-
 tools/perf/Makefile.config                    |   9 +
 tools/perf/Makefile.perf                      |  44 ++-
 tools/perf/builtin-stat.c                     |  77 ++++-
 tools/perf/util/Build                         |   1 +
 tools/perf/util/bpf_counter.c                 | 281 ++++++++++++++++++
 tools/perf/util/bpf_counter.h                 |  73 +++++
 tools/perf/util/bpf_skel/.gitignore           |   3 +
 .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  96 ++++++
 tools/perf/util/evsel.c                       |  11 +
 tools/perf/util/evsel.h                       |   6 +
 tools/perf/util/stat-display.c                |   4 +-
 tools/perf/util/target.c                      |  34 ++-
 tools/perf/util/target.h                      |  10 +
 tools/scripts/Makefile.include                |   1 +
 16 files changed, 637 insertions(+), 19 deletions(-)
 create mode 100644 tools/perf/util/bpf_counter.c
 create mode 100644 tools/perf/util/bpf_counter.h
 create mode 100644 tools/perf/util/bpf_skel/.gitignore
 create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c

--
2.24.1


* [PATCH v2 1/2] perf: support build BPF skeletons with perf
From: Song Liu @ 2020-12-04  6:13 UTC
  To: linux-kernel
  Cc: kernel-team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, Song Liu

BPF programs are useful in perf for profiling BPF programs, and BPF
skeletons are by far the easiest way to write such BPF tools. Enable
building BPF skeletons in util/bpf_skel. A dummy BPF skeleton is added
here; more BPF skeletons will be added for different use cases.
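
With this patch, the skeletons are built when perf is compiled with
BUILD_BPF_SKEL=1:

  make -C tools/perf BUILD_BPF_SKEL=1

For each util/bpf_skel/<name>.bpf.c, bpftool generates a <name>.skel.h
that embeds the compiled BPF object and exposes roughly the following
API (a minimal sketch; the exact identifiers are derived from the
.bpf.c file name):

  struct <name>_bpf *skel;

  skel = <name>_bpf__open();       /* parse the embedded BPF object */
  err  = <name>_bpf__load(skel);   /* create maps, load programs */
  err  = <name>_bpf__attach(skel); /* attach programs to their hooks */
  ...
  <name>_bpf__destroy(skel);       /* detach and free everything */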

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/bpf/bpftool/Makefile          |  2 ++
 tools/build/Makefile.feature        |  4 ++-
 tools/perf/Makefile.config          |  9 ++++++
 tools/perf/Makefile.perf            | 44 +++++++++++++++++++++++++++--
 tools/perf/util/bpf_skel/.gitignore |  3 ++
 tools/scripts/Makefile.include      |  1 +
 6 files changed, 60 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/util/bpf_skel/.gitignore

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index f60e6ad3a1dff..a01407ec78dc5 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -120,6 +120,8 @@ endif
 
 BPFTOOL_BOOTSTRAP := $(if $(OUTPUT),$(OUTPUT)bpftool-bootstrap,./bpftool-bootstrap)
 
+bootstrap: $(BPFTOOL_BOOTSTRAP)
+
 BOOTSTRAP_OBJS = $(addprefix $(OUTPUT),main.o common.o json_writer.o gen.o btf.o)
 OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
 
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 97cbfb31b7625..74e255d58d8d0 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -99,7 +99,9 @@ FEATURE_TESTS_EXTRA :=                  \
          clang                          \
          libbpf                         \
          libpfm4                        \
-         libdebuginfod
+         libdebuginfod			\
+         clang-bpf-co-re
+
 
 FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)
 
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index ce8516e4de34f..fe234b8bfeefb 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -621,6 +621,15 @@ ifndef NO_LIBBPF
   endif
 endif
 
+ifdef BUILD_BPF_SKEL
+  $(call feature_check,clang-bpf-co-re)
+  ifeq ($(feature-clang-bpf-co-re), 0)
+    dummy := $(error Error: clang too old. Please install recent clang)
+  endif
+  $(call detected,CONFIG_PERF_BPF_SKEL)
+  CFLAGS += -DBUILD_BPF_SKEL
+endif
+
 dwarf-post-unwind := 1
 dwarf-post-unwind-text := BUG
 
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 7ce3f2e8b9c74..a272bfb60a579 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -126,6 +126,8 @@ include ../scripts/utilities.mak
 #
 # Define NO_LIBDEBUGINFOD if you do not want support debuginfod
 #
+# Define BUILD_BPF_SKEL to enable BPF skeletons
+#
 
 # As per kernel Makefile, avoid funny character set dependencies
 unexport LC_ALL
@@ -178,6 +180,8 @@ LD += $(EXTRA_LDFLAGS)
 HOSTCC  ?= gcc
 HOSTLD  ?= ld
 HOSTAR  ?= ar
+CLANG   ?= clang
+LLVM_STRIP ?= llvm-strip
 
 PKG_CONFIG = $(CROSS_COMPILE)pkg-config
 LLVM_CONFIG ?= llvm-config
@@ -735,7 +739,8 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc
 	$(x86_arch_prctl_code_array) \
 	$(rename_flags_array) \
 	$(arch_errno_name_array) \
-	$(sync_file_range_arrays)
+	$(sync_file_range_arrays) \
+	bpf-skel
 
 $(OUTPUT)%.o: %.c prepare FORCE
 	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
@@ -1008,7 +1013,42 @@ config-clean:
 python-clean:
 	$(python-clean)
 
-clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean
+SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
+SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
+SKELETONS :=
+
+ifdef BUILD_BPF_SKEL
+BPFTOOL := $(SKEL_TMP_OUT)/bpftool-bootstrap
+LIBBPF_SRC := $(abspath ../lib/bpf)
+BPF_INCLUDE := -I$(SKEL_TMP_OUT)/..
+
+$(SKEL_TMP_OUT):
+	$(Q)$(MKDIR) -p $@
+
+$(BPFTOOL): | $(SKEL_TMP_OUT)
+	CFLAGS= $(MAKE) -C ../bpf/bpftool \
+		OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
+
+$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
+	$(call QUIET_CLANG, $@)
+	$(Q)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) -c $(filter util/bpf_skel/%.bpf.c,$^) \
+	-o $@ && $(LLVM_STRIP) -g $@
+
+$(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL)
+	$(QUIET_GENSKEL)$(BPFTOOL) gen skeleton $< > $@
+
+bpf-skel: $(SKELETONS)
+
+else # BUILD_BPF_SKEL
+
+bpf-skel:
+
+endif # BUILD_BPF_SKEL
+
+bpf-skel-clean:
+	$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
+
+clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean bpf-skel-clean
 	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
 	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
 	$(Q)$(RM) $(OUTPUT).config-detected
diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore
new file mode 100644
index 0000000000000..5263e9e6c5d83
--- /dev/null
+++ b/tools/perf/util/bpf_skel/.gitignore
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+.tmp
+*.skel.h
\ No newline at end of file
diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
index a7974638561ca..2da3b37f1c6bf 100644
--- a/tools/scripts/Makefile.include
+++ b/tools/scripts/Makefile.include
@@ -117,6 +117,7 @@ ifneq ($(silent),1)
 			 $(MAKE) $(PRINT_DIR) -C $$subdir
 	QUIET_FLEX     = @echo '  FLEX     '$@;
 	QUIET_BISON    = @echo '  BISON    '$@;
+	QUIET_GENSKEL  = @echo '  GEN-SKEL '$@;
 
 	descend = \
 		+@echo	       '  DESCEND  '$(1); \
-- 
2.24.1



* [PATCH v2 2/2] perf-stat: enable counting events for BPF programs
From: Song Liu @ 2020-12-04  6:13 UTC
  To: linux-kernel
  Cc: kernel-team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, Song Liu

Introduce the perf-stat -b option, which counts events for BPF programs, like:

[root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
     1.487903822            115,200      ref-cycles
     1.487903822             86,012      cycles
     2.489147029             80,560      ref-cycles
     2.489147029             73,784      cycles
     3.490341825             60,720      ref-cycles
     3.490341825             37,797      cycles
     4.491540887             37,120      ref-cycles
     4.491540887             31,963      cycles

The example above counts cycles and ref-cycles of the BPF program with id
254. This is similar to the bpftool-prog-profile command, but more flexible.

perf-stat -b creates per-cpu perf_events and attaches fentry/fexit BPF
programs (monitor-progs) to the target BPF program (target-prog). The
monitor-progs read the perf_event before and after the target-prog runs,
and aggregate the difference in a BPF map. User space then reads the data
from these maps.

A new struct bpf_counter is introduced to provide a common interface that
uses BPF programs/maps to count perf events.
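
A rough sketch of how these ops are invoked over an evsel's life cycle
(all of the functions below are added by this patch):

  bpf_counter__load(evsel, &target);       /* open + load a skeleton per prog id */
  /* evsel__open_cpu() then hands each perf_event fd to the BPF side: */
  bpf_counter__install_pe(evsel, cpu, fd);
  bpf_counter__enable(evsel);              /* attach the fentry/fexit programs */
  bpf_counter__read(evsel);                /* fold accum_readings into evsel->counts */
  bpf_counter__destroy(evsel);             /* called from evsel__exit() */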

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/Makefile.perf                      |   2 +-
 tools/perf/builtin-stat.c                     |  77 ++++-
 tools/perf/util/Build                         |   1 +
 tools/perf/util/bpf_counter.c                 | 281 ++++++++++++++++++
 tools/perf/util/bpf_counter.h                 |  73 +++++
 .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  96 ++++++
 tools/perf/util/evsel.c                       |  11 +
 tools/perf/util/evsel.h                       |   6 +
 tools/perf/util/stat-display.c                |   4 +-
 tools/perf/util/target.c                      |  34 ++-
 tools/perf/util/target.h                      |  10 +
 11 files changed, 578 insertions(+), 17 deletions(-)
 create mode 100644 tools/perf/util/bpf_counter.c
 create mode 100644 tools/perf/util/bpf_counter.h
 create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index a272bfb60a579..fb7de412152b5 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1015,7 +1015,7 @@ python-clean:
 
 SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
 SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
-SKELETONS :=
+SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
 
 ifdef BUILD_BPF_SKEL
 BPFTOOL := $(SKEL_TMP_OUT)/bpftool-bootstrap
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f15b2f8aa14d8..a71684446a0e1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -67,6 +67,7 @@
 #include "util/top.h"
 #include "util/affinity.h"
 #include "util/pfm.h"
+#include "util/bpf_counter.h"
 #include "asm/bug.h"
 
 #include <linux/time64.h>
@@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs)
 	return 0;
 }
 
+static int read_bpf_map_counters(void)
+{
+	struct evsel *counter;
+	int err;
+
+	evlist__for_each_entry(evsel_list, counter) {
+		err = bpf_counter__read(counter);
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
 static void read_counters(struct timespec *rs)
 {
 	struct evsel *counter;
+	int err;
 
-	if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0))
-		return;
+	if (!stat_config.stop_read_counter) {
+		err = read_bpf_map_counters();
+		if (err == -EAGAIN)
+			err = read_affinity_counters(rs);
+		if (err < 0)
+			return;
+	}
 
 	evlist__for_each_entry(evsel_list, counter) {
 		if (counter->err)
@@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times)
 	return false;
 }
 
-static void enable_counters(void)
+static int enable_counters(void)
 {
+	struct evsel *evsel;
+	int err;
+
+	evlist__for_each_entry(evsel_list, evsel) {
+		err = bpf_counter__enable(evsel);
+		if (err)
+			return err;
+	}
+
 	if (stat_config.initial_delay < 0) {
 		pr_info(EVLIST_DISABLED_MSG);
-		return;
+		return 0;
 	}
 
 	if (stat_config.initial_delay > 0) {
@@ -518,6 +547,7 @@ static void enable_counters(void)
 		if (stat_config.initial_delay > 0)
 			pr_info(EVLIST_ENABLED_MSG);
 	}
+	return 0;
 }
 
 static void disable_counters(void)
@@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	const bool forks = (argc > 0);
 	bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
 	struct affinity affinity;
-	int i, cpu;
+	int i, cpu, err;
 	bool second_pass = false;
 
 	if (forks) {
@@ -738,6 +768,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	if (affinity__setup(&affinity) < 0)
 		return -1;
 
+	evlist__for_each_entry(evsel_list, counter) {
+		if (bpf_counter__load(counter, &target))
+			return -1;
+	}
+
 	evlist__for_each_cpu (evsel_list, i, cpu) {
 		affinity__set(&affinity, cpu);
 
@@ -851,7 +886,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	}
 
 	if (STAT_RECORD) {
-		int err, fd = perf_data__fd(&perf_stat.data);
+		int fd = perf_data__fd(&perf_stat.data);
 
 		if (is_pipe) {
 			err = perf_header__write_pipe(perf_data__fd(&perf_stat.data));
@@ -877,7 +912,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 
 	if (forks) {
 		perf_evlist__start_workload(evsel_list);
-		enable_counters();
+		err = enable_counters();
+		if (err)
+			return -1;
 
 		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
 			status = dispatch_events(forks, timeout, interval, &times);
@@ -896,7 +933,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 		if (WIFSIGNALED(status))
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
-		enable_counters();
+		err = enable_counters();
+		if (err)
+			return -1;
 		status = dispatch_events(forks, timeout, interval, &times);
 	}
 
@@ -1087,6 +1126,10 @@ static struct option stat_options[] = {
 		   "stat events on existing process id"),
 	OPT_STRING('t', "tid", &target.tid, "tid",
 		   "stat events on existing thread id"),
+#ifdef BUILD_BPF_SKEL
+	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
+		   "stat events on existing bpf program id"),
+#endif
 	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_BOOLEAN('g', "group", &group,
@@ -2058,11 +2101,12 @@ int cmd_stat(int argc, const char **argv)
 		"perf stat [<options>] [<command>]",
 		NULL
 	};
-	int status = -EINVAL, run_idx;
+	int status = -EINVAL, run_idx, err;
 	const char *mode;
 	FILE *output = stderr;
 	unsigned int interval, timeout;
 	const char * const stat_subcommands[] = { "record", "report" };
+	char errbuf[BUFSIZ];
 
 	setlocale(LC_ALL, "");
 
@@ -2173,6 +2217,12 @@ int cmd_stat(int argc, const char **argv)
 	} else if (big_num_opt == 0) /* User passed --no-big-num */
 		stat_config.big_num = false;
 
+	err = target__validate(&target);
+	if (err) {
+		target__strerror(&target, err, errbuf, BUFSIZ);
+		pr_warning("%s\n", errbuf);
+	}
+
 	setup_system_wide(argc);
 
 	/*
@@ -2246,8 +2296,6 @@ int cmd_stat(int argc, const char **argv)
 		}
 	}
 
-	target__validate(&target);
-
 	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
 		target.per_thread = true;
 
@@ -2378,9 +2426,10 @@ int cmd_stat(int argc, const char **argv)
 		 * tools remain  -acme
 		 */
 		int fd = perf_data__fd(&perf_stat.data);
-		int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
-							     process_synthesized_event,
-							     &perf_stat.session->machines.host);
+
+		err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
+							 process_synthesized_event,
+							 &perf_stat.session->machines.host);
 		if (err) {
 			pr_warning("Couldn't synthesize the kernel mmap record, harmless, "
 				   "older tools may produce warnings about this file\n.");
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index e2563d0154eb6..188521f343470 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -135,6 +135,7 @@ perf-y += clockid.o
 
 perf-$(CONFIG_LIBBPF) += bpf-loader.o
 perf-$(CONFIG_LIBBPF) += bpf_map.o
+perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
 perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
 perf-$(CONFIG_LIBELF) += symbol-elf.o
 perf-$(CONFIG_LIBELF) += probe-file.o
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
new file mode 100644
index 0000000000000..6cee11cda3665
--- /dev/null
+++ b/tools/perf/util/bpf_counter.c
@@ -0,0 +1,281 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2019 Facebook */
+
+#include <limits.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <linux/err.h>
+#include <linux/zalloc.h>
+#include <bpf/bpf.h>
+#include <bpf/btf.h>
+#include <bpf/libbpf.h>
+
+#include "bpf_counter.h"
+#include "counts.h"
+#include "debug.h"
+#include "evsel.h"
+#include "target.h"
+
+#include "bpf_skel/bpf_prog_profiler.skel.h"
+
+static inline void *u64_to_ptr(__u64 ptr)
+{
+	return (void *)(unsigned long)ptr;
+}
+
+static void set_max_rlimit(void)
+{
+	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
+
+	setrlimit(RLIMIT_MEMLOCK, &rinf);
+}
+
+static inline struct bpf_counter *bpf_counter_alloc(void)
+{
+	struct bpf_counter *counter;
+
+	counter = zalloc(sizeof(*counter));
+	if (counter)
+		INIT_LIST_HEAD(&counter->list);
+	return counter;
+}
+
+static int bpf_program_profiler__destroy(struct evsel *evsel)
+{
+	struct bpf_counter *counter;
+
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list)
+		bpf_prog_profiler_bpf__destroy(counter->skel);
+	INIT_LIST_HEAD(&evsel->bpf_counter_list);
+	return 0;
+}
+
+static char *bpf_target_prog_name(int tgt_fd)
+{
+	struct bpf_prog_info_linear *info_linear;
+	struct bpf_func_info *func_info;
+	const struct btf_type *t;
+	char *name = NULL;
+	struct btf *btf;
+
+	info_linear = bpf_program__get_prog_info_linear(
+		tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO);
+	if (IS_ERR_OR_NULL(info_linear)) {
+		pr_debug2("failed to get info_linear for prog FD %d", tgt_fd);
+		return NULL;
+	}
+
+	if (info_linear->info.btf_id == 0 ||
+	    btf__get_from_id(info_linear->info.btf_id, &btf)) {
+		pr_debug2("prog FD %d doesn't have valid btf", tgt_fd);
+		goto out;
+	}
+
+	func_info = u64_to_ptr(info_linear->info.func_info);
+	t = btf__type_by_id(btf, func_info[0].type_id);
+	if (!t) {
+		pr_debug2("btf %d doesn't have type %d",
+		      info_linear->info.btf_id, func_info[0].type_id);
+		goto out;
+	}
+	name = strdup(btf__name_by_offset(btf, t->name_off));
+out:
+	free(info_linear);
+	return name;
+}
+
+static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
+{
+	struct bpf_prog_profiler_bpf *skel;
+	struct bpf_counter *counter;
+	struct bpf_program *prog;
+	char *prog_name;
+	int prog_fd;
+	int err;
+
+	prog_fd = bpf_prog_get_fd_by_id(prog_id);
+	if (prog_fd < 0) {
+		pr_debug("Failed to open fd for bpf prog %u\n", prog_id);
+		return -1;
+	}
+	counter = bpf_counter_alloc();
+	if (!counter)
+		return -1;
+
+	skel = bpf_prog_profiler_bpf__open();
+	if (!skel) {
+		pr_debug("Failed to load bpf skeleton\n");
+		free(counter);
+		return -1;
+	}
+	skel->rodata->num_cpu = evsel__nr_cpus(evsel);
+
+	bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
+	bpf_map__resize(skel->maps.fentry_readings, 1);
+	bpf_map__resize(skel->maps.accum_readings, 1);
+
+	prog_name = bpf_target_prog_name(prog_fd);
+
+	bpf_object__for_each_program(prog, skel->obj) {
+		err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
+		if (err)
+			pr_debug("bpf_program__set_attach_target failed\n");
+	}
+	set_max_rlimit();
+	err = bpf_prog_profiler_bpf__load(skel);
+	if (err)
+		pr_debug("bpf_prog_profiler_bpf__load failed\n");
+
+	counter->skel = skel;
+	list_add(&counter->list, &evsel->bpf_counter_list);
+	return 0;
+}
+
+static int bpf_program_profiler__load(struct evsel *evsel, struct target *target)
+{
+	char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p;
+	u32 prog_id;
+	int ret;
+
+	bpf_str_ = bpf_str = strdup(target->bpf_str);
+	if (!bpf_str)
+		return -1;
+
+	while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) {
+		prog_id = strtoul(tok, &p, 10);
+		if (prog_id == 0 || prog_id == UINT_MAX ||
+		    (*p != '\0' && *p != ',')) {
+			pr_debug("Failed to parse bpf prog ids %s\n",
+				 target->bpf_str);
+			return -1;
+		}
+
+		ret = bpf_program_profiler_load_one(evsel, prog_id);
+		if (ret) {
+			bpf_program_profiler__destroy(evsel);
+			free(bpf_str_);
+			return -1;
+		}
+		bpf_str = NULL;
+	}
+	free(bpf_str_);
+	return 0;
+}
+
+static int bpf_program_profiler__enable(struct evsel *evsel)
+{
+	struct bpf_counter *counter;
+	int ret;
+
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		ret = bpf_prog_profiler_bpf__attach(counter->skel);
+		if (ret) {
+			bpf_program_profiler__destroy(evsel);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int bpf_program_profiler__read(struct evsel *evsel)
+{
+	int num_cpu = evsel__nr_cpus(evsel);
+	struct bpf_perf_event_value values[num_cpu];
+	struct bpf_counter *counter;
+	int reading_map_fd;
+	__u32 key = 0;
+	int err, cpu;
+
+	if (list_empty(&evsel->bpf_counter_list))
+		return -EAGAIN;
+
+	for (cpu = 0; cpu < num_cpu; cpu++) {
+		perf_counts(evsel->counts, cpu, 0)->val = 0;
+		perf_counts(evsel->counts, cpu, 0)->ena = 0;
+		perf_counts(evsel->counts, cpu, 0)->run = 0;
+	}
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		struct bpf_prog_profiler_bpf *skel = counter->skel;
+
+		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
+
+		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
+		if (err) {
+			fprintf(stderr, "failed to read value\n");
+			return err;
+		}
+
+		for (cpu = 0; cpu < num_cpu; cpu++) {
+			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
+			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
+			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
+		}
+	}
+	return 0;
+}
+
+static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
+					    int fd)
+{
+	struct bpf_prog_profiler_bpf *skel;
+	struct bpf_counter *counter;
+	int ret;
+
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		skel = counter->skel;
+		ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
+					  &cpu, &fd, BPF_ANY);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+struct bpf_counter_ops bpf_program_profiler_ops = {
+	.load       = bpf_program_profiler__load,
+	.enable	    = bpf_program_profiler__enable,
+	.read       = bpf_program_profiler__read,
+	.destroy    = bpf_program_profiler__destroy,
+	.install_pe = bpf_program_profiler__install_pe,
+};
+
+int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return 0;
+	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
+}
+
+int bpf_counter__load(struct evsel *evsel, struct target *target)
+{
+	if (target__has_bpf(target))
+		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
+
+	if (evsel->bpf_counter_ops)
+		return evsel->bpf_counter_ops->load(evsel, target);
+	return 0;
+}
+
+int bpf_counter__enable(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return 0;
+	return evsel->bpf_counter_ops->enable(evsel);
+}
+
+int bpf_counter__read(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return -EAGAIN;
+	return evsel->bpf_counter_ops->read(evsel);
+}
+
+int bpf_counter__destroy(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return 0;
+	evsel->bpf_counter_ops->destroy(evsel);
+	evsel->bpf_counter_ops = NULL;
+	return 0;
+}
diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h
new file mode 100644
index 0000000000000..d6f10077478ee
--- /dev/null
+++ b/tools/perf/util/bpf_counter.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BPF_COUNTER_H
+#define __PERF_BPF_COUNTER_H 1
+
+#include <linux/list.h>
+
+struct evsel;
+struct target;
+struct bpf_counter;
+
+typedef int (*bpf_counter_evsel_op)(struct evsel *evsel);
+typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel,
+					   struct target *target);
+typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel,
+					       int cpu,
+					       int fd);
+
+struct bpf_counter_ops {
+	bpf_counter_evsel_target_op load;
+	bpf_counter_evsel_op enable;
+	bpf_counter_evsel_op read;
+	bpf_counter_evsel_op destroy;
+	bpf_counter_evsel_install_pe_op install_pe;
+};
+
+struct bpf_counter {
+	void *skel;
+	struct list_head list;
+};
+
+#ifdef BUILD_BPF_SKEL
+
+int bpf_counter__load(struct evsel *evsel, struct target *target);
+int bpf_counter__enable(struct evsel *evsel);
+int bpf_counter__read(struct evsel *evsel);
+int bpf_counter__destroy(struct evsel *evsel);
+int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
+
+#else
+
+#include<linux/err.h>
+
+static inline int bpf_counter__load(struct evsel *evsel __maybe_unused,
+				    struct target *target __maybe_unused)
+{
+	return 0;
+}
+
+static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static inline int bpf_counter__read(struct evsel *evsel __maybe_unused)
+{
+	return -EAGAIN;
+}
+
+static inline int bpf_counter__destroy(struct evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused,
+					  int cpu __maybe_unused,
+					  int fd __maybe_unused)
+{
+	return 0;
+}
+
+#endif /* BUILD_BPF_SKEL */
+
+#endif /* __PERF_BPF_COUNTER_H */
diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
new file mode 100644
index 0000000000000..cdde2218af86a
--- /dev/null
+++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2020 Facebook
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+/* map of perf event fds, num_cpu * num_metric entries */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(int));
+} events SEC(".maps");
+
+/* readings at fentry */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(struct bpf_perf_event_value));
+	__uint(max_entries, 1);
+} fentry_readings SEC(".maps");
+
+/* accumulated readings */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(struct bpf_perf_event_value));
+	__uint(max_entries, 1);
+} accum_readings SEC(".maps");
+
+const volatile __u32 num_cpu = 1;
+
+SEC("fentry/XXX")
+int BPF_PROG(fentry_XXX)
+{
+	__u32 key = bpf_get_smp_processor_id();
+	struct bpf_perf_event_value reading;
+	struct bpf_perf_event_value *ptr;
+	__u32 zero = 0;
+	long err;
+
+	/* look up before reading, to reduce error */
+	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
+	if (!ptr)
+		return 0;
+
+	err = bpf_perf_event_read_value(&events, key, &reading,
+					sizeof(reading));
+	if (err)
+		return 0;
+
+	*ptr = reading;
+	return 0;
+}
+
+static inline void
+fexit_update_maps(struct bpf_perf_event_value *after)
+{
+	struct bpf_perf_event_value *before, diff, *accum;
+	__u32 zero = 0;
+
+	before = bpf_map_lookup_elem(&fentry_readings, &zero);
+	/* only account samples with a valid fentry_reading */
+	if (before && before->counter) {
+		struct bpf_perf_event_value *accum;
+
+		diff.counter = after->counter - before->counter;
+		diff.enabled = after->enabled - before->enabled;
+		diff.running = after->running - before->running;
+
+		accum = bpf_map_lookup_elem(&accum_readings, &zero);
+		if (accum) {
+			accum->counter += diff.counter;
+			accum->enabled += diff.enabled;
+			accum->running += diff.running;
+		}
+	}
+}
+
+SEC("fexit/XXX")
+int BPF_PROG(fexit_XXX)
+{
+	struct bpf_perf_event_value reading;
+	__u32 cpu = bpf_get_smp_processor_id();
+	__u32 one = 1, zero = 0;
+	int err;
+
+	/* read all events before updating the maps, to reduce error */
+	err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading));
+	if (err)
+		return 0;
+
+	fexit_update_maps(&reading);
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "Dual BSD/GPL";
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1cad6051d8b08..c6a50467076f8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -25,6 +25,7 @@
 #include <stdlib.h>
 #include <perf/evsel.h>
 #include "asm/bug.h"
+#include "bpf_counter.h"
 #include "callchain.h"
 #include "cgroup.h"
 #include "counts.h"
@@ -51,6 +52,10 @@
 #include <internal/lib.h>
 
 #include <linux/ctype.h>
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#include <bpf/btf.h>
+#include "rlimit.h"
 
 struct perf_missing_features perf_missing_features;
 
@@ -247,6 +252,7 @@ void evsel__init(struct evsel *evsel,
 	evsel->bpf_obj	   = NULL;
 	evsel->bpf_fd	   = -1;
 	INIT_LIST_HEAD(&evsel->config_terms);
+	INIT_LIST_HEAD(&evsel->bpf_counter_list);
 	perf_evsel__object.init(evsel);
 	evsel->sample_size = __evsel__sample_size(attr->sample_type);
 	evsel__calc_id_pos(evsel);
@@ -1365,6 +1371,7 @@ void evsel__exit(struct evsel *evsel)
 {
 	assert(list_empty(&evsel->core.node));
 	assert(evsel->evlist == NULL);
+	bpf_counter__destroy(evsel);
 	evsel__free_counts(evsel);
 	perf_evsel__free_fd(&evsel->core);
 	perf_evsel__free_id(&evsel->core);
@@ -1770,6 +1777,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 		evsel->core.attr.sample_id_all = 0;
 
 	display_attr(&evsel->core.attr);
+	if (!list_empty(&evsel->bpf_counter_list))
+		evsel->core.attr.inherit = 0;
 
 	for (cpu = start_cpu; cpu < end_cpu; cpu++) {
 
@@ -1788,6 +1797,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 
 			FD(evsel, cpu, thread) = fd;
 
+			bpf_counter__install_pe(evsel, cpu, fd);
+
 			if (unlikely(test_attr__enabled)) {
 				test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
 						fd, group_fd, flags);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 79a860d8e3eef..1731fba702bf4 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -10,6 +10,7 @@
 #include <internal/evsel.h>
 #include <perf/evsel.h>
 #include "symbol_conf.h"
+#include "bpf_counter.h"
 #include <internal/cpumap.h>
 
 struct bpf_object;
@@ -17,6 +18,8 @@ struct cgroup;
 struct perf_counts;
 struct perf_stat_evsel;
 union perf_event;
+struct bpf_counter_ops;
+struct target;
 
 typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
 
@@ -127,6 +130,8 @@ struct evsel {
 	 * See also evsel__has_callchain().
 	 */
 	__u64			synth_sample_type;
+	struct list_head	bpf_counter_list;
+	struct bpf_counter_ops	*bpf_counter_ops;
 };
 
 struct perf_missing_features {
@@ -423,4 +428,5 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel)
 struct perf_env *evsel__env(struct evsel *evsel);
 
 int evsel__store_ids(struct evsel *evsel, struct evlist *evlist);
+
 #endif /* __PERF_EVSEL_H */
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 4b57c0c076323..1536cfe0e50b4 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -1029,7 +1029,9 @@ static void print_header(struct perf_stat_config *config,
 	if (!config->csv_output) {
 		fprintf(output, "\n");
 		fprintf(output, " Performance counter stats for ");
-		if (_target->system_wide)
+		if (_target->bpf_str)
+			fprintf(output, "\'BPF program(s) %s", _target->bpf_str);
+		else if (_target->system_wide)
 			fprintf(output, "\'system wide");
 		else if (_target->cpu_list)
 			fprintf(output, "\'CPU(s) %s", _target->cpu_list);
diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
index a3db13dea937c..0f383418e3df5 100644
--- a/tools/perf/util/target.c
+++ b/tools/perf/util/target.c
@@ -56,6 +56,34 @@ enum target_errno target__validate(struct target *target)
 			ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
 	}
 
+	/* BPF and CPU are mutually exclusive */
+	if (target->bpf_str && target->cpu_list) {
+		target->cpu_list = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_CPU;
+	}
+
+	/* BPF and PID/TID are mutually exclusive */
+	if (target->bpf_str && target->tid) {
+		target->tid = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_PID;
+	}
+
+	/* BPF and UID are mutually exclusive */
+	if (target->bpf_str && target->uid_str) {
+		target->uid_str = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_UID;
+	}
+
+	/* BPF and THREADS are mutually exclusive */
+	if (target->bpf_str && target->per_thread) {
+		target->per_thread = false;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD;
+	}
+
 	/* THREAD and SYSTEM/CPU are mutually exclusive */
 	if (target->per_thread && (target->system_wide || target->cpu_list)) {
 		target->per_thread = false;
@@ -109,6 +137,10 @@ static const char *target__error_str[] = {
 	"PID/TID switch overriding SYSTEM",
 	"UID switch overriding SYSTEM",
 	"SYSTEM/CPU switch overriding PER-THREAD",
+	"BPF switch overriding CPU",
+	"BPF switch overriding PID/TID",
+	"BPF switch overriding UID",
+	"BPF switch overriding THREAD",
 	"Invalid User: %s",
 	"Problems obtaining information for user %s",
 };
@@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum,
 
 	switch (errnum) {
 	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
-	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
+	     TARGET_ERRNO__BPF_OVERRIDE_THREAD:
 		snprintf(buf, buflen, "%s", msg);
 		break;
 
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 6ef01a83b24e9..f132c6c2eef81 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -10,6 +10,7 @@ struct target {
 	const char   *tid;
 	const char   *cpu_list;
 	const char   *uid_str;
+	const char   *bpf_str;
 	uid_t	     uid;
 	bool	     system_wide;
 	bool	     uses_mmap;
@@ -36,6 +37,10 @@ enum target_errno {
 	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
 	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
 	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
+	TARGET_ERRNO__BPF_OVERRIDE_CPU,
+	TARGET_ERRNO__BPF_OVERRIDE_PID,
+	TARGET_ERRNO__BPF_OVERRIDE_UID,
+	TARGET_ERRNO__BPF_OVERRIDE_THREAD,
 
 	/* for target__parse_uid() */
 	TARGET_ERRNO__INVALID_UID,
@@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target)
 	return target->system_wide || target->cpu_list;
 }
 
+static inline bool target__has_bpf(struct target *target)
+{
+	return target->bpf_str;
+}
+
 static inline bool target__none(struct target *target)
 {
 	return !target__has_task(target) && !target__has_cpu(target);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/2] perf: support build BPF skeletons with perf
From: Jiri Olsa @ 2020-12-07 20:25 UTC
  To: Song Liu
  Cc: linux-kernel, kernel-team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung

On Thu, Dec 03, 2020 at 10:13:09PM -0800, Song Liu wrote:

SNIP

> @@ -735,7 +739,8 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc
>  	$(x86_arch_prctl_code_array) \
>  	$(rename_flags_array) \
>  	$(arch_errno_name_array) \
> -	$(sync_file_range_arrays)
> +	$(sync_file_range_arrays) \
> +	bpf-skel
>  
>  $(OUTPUT)%.o: %.c prepare FORCE
>  	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
> @@ -1008,7 +1013,42 @@ config-clean:
>  python-clean:
>  	$(python-clean)
>  
> -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean
> +SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
> +SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
> +SKELETONS :=
> +
> +ifdef BUILD_BPF_SKEL
> +BPFTOOL := $(SKEL_TMP_OUT)/bpftool-bootstrap
> +LIBBPF_SRC := $(abspath ../lib/bpf)
> +BPF_INCLUDE := -I$(SKEL_TMP_OUT)/..

it looks good, but I still need to add the following includes so the
build can find the bpf_helper* headers

jirka


---
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index fb7de412152b..1f2fe339be85 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1020,7 +1020,7 @@ SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
 ifdef BUILD_BPF_SKEL
 BPFTOOL := $(SKEL_TMP_OUT)/bpftool-bootstrap
 LIBBPF_SRC := $(abspath ../lib/bpf)
-BPF_INCLUDE := -I$(SKEL_TMP_OUT)/..
+BPF_INCLUDE := -I$(SKEL_TMP_OUT)/.. -I$(BPF_PATH) -I$(LIBBPF_SRC)/..
 
 $(SKEL_TMP_OUT):
 	$(Q)$(MKDIR) -p $@



* Re: [PATCH v2 2/2] perf-stat: enable counting events for BPF programs
From: Jiri Olsa @ 2020-12-07 22:07 UTC
  To: Song Liu
  Cc: linux-kernel, kernel-team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung

On Thu, Dec 03, 2020 at 10:13:10PM -0800, Song Liu wrote:

SNIP

> +#include "bpf_skel/bpf_prog_profiler.skel.h"
> +
> +static inline void *u64_to_ptr(__u64 ptr)
> +{
> +	return (void *)(unsigned long)ptr;
> +}
> +
> +static void set_max_rlimit(void)
> +{
> +	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
> +
> +	setrlimit(RLIMIT_MEMLOCK, &rinf);
> +}
> +
> +static inline struct bpf_counter *bpf_counter_alloc(void)

why is this inlined?

SNIP

> +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
> +{
> +	struct bpf_prog_profiler_bpf *skel;
> +	struct bpf_counter *counter;
> +	struct bpf_program *prog;
> +	char *prog_name;
> +	int prog_fd;
> +	int err;
> +
> +	prog_fd = bpf_prog_get_fd_by_id(prog_id);
> +	if (prog_fd < 0) {
> +		pr_debug("Failed to open fd for bpf prog %u\n", prog_id);
> +		return -1;
> +	}
> +	counter = bpf_counter_alloc();
> +	if (!counter)
> +		return -1;
> +
> +	skel = bpf_prog_profiler_bpf__open();
> +	if (!skel) {
> +		pr_debug("Failed to load bpf skeleton\n");

I'm still getting

[root@dell-r440-01 perf]# ./perf stat -b 38
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
libbpf: XXX is not found in vmlinux BTF
libbpf: failed to load object 'bpf_prog_profiler_bpf'
libbpf: failed to load BPF skeleton 'bpf_prog_profiler_bpf': -2
...

with id 38 being:

38: tracepoint  name sys_enter  tag 03418b72a610af75  gpl
        loaded_at 2020-12-07T22:54:05+0100  uid 0
        xlated 272B  jited 153B  memlock 4096B  map_ids 1

how is this supposed to work when there's XXX in the
program's section? libbpf is trying to find XXX in
kernel BTF and fails of course


> +		free(counter);
> +		return -1;
> +	}
> +	skel->rodata->num_cpu = evsel__nr_cpus(evsel);
> +
> +	bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
> +	bpf_map__resize(skel->maps.fentry_readings, 1);
> +	bpf_map__resize(skel->maps.accum_readings, 1);
> +

SNIP

> +static int bpf_program_profiler__read(struct evsel *evsel)
> +{
> +	int num_cpu = evsel__nr_cpus(evsel);
> +	struct bpf_perf_event_value values[num_cpu];
> +	struct bpf_counter *counter;
> +	int reading_map_fd;
> +	__u32 key = 0;
> +	int err, cpu;
> +
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return -EAGAIN;
> +
> +	for (cpu = 0; cpu < num_cpu; cpu++) {
> +		perf_counts(evsel->counts, cpu, 0)->val = 0;
> +		perf_counts(evsel->counts, cpu, 0)->ena = 0;
> +		perf_counts(evsel->counts, cpu, 0)->run = 0;
> +	}
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +		struct bpf_prog_profiler_bpf *skel = counter->skel;
> +
> +		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> +
> +		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
> +		if (err) {
> +			fprintf(stderr, "failed to read value\n");
> +			return err;
> +		}
> +
> +		for (cpu = 0; cpu < num_cpu; cpu++) {
> +			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
> +			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
> +			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
> +		}

so we sum everything up for all provided bpf IDs,
should we count/display them separately?

SNIP

> +SEC("fentry/XXX")
> +int BPF_PROG(fentry_XXX)
> +{
> +	__u32 key = bpf_get_smp_processor_id();
> +	struct bpf_perf_event_value reading;
> +	struct bpf_perf_event_value *ptr;
> +	__u32 zero = 0;
> +	long err;
> +
> +	/* look up before reading, to reduce error */
> +	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
> +	if (!ptr)
> +		return 0;
> +
> +	err = bpf_perf_event_read_value(&events, key, &reading,
> +					sizeof(reading));

can't we read directly to ptr in here?

SNIP

>  	/* THREAD and SYSTEM/CPU are mutually exclusive */
>  	if (target->per_thread && (target->system_wide || target->cpu_list)) {
>  		target->per_thread = false;
> @@ -109,6 +137,10 @@ static const char *target__error_str[] = {
>  	"PID/TID switch overriding SYSTEM",
>  	"UID switch overriding SYSTEM",
>  	"SYSTEM/CPU switch overriding PER-THREAD",
> +	"BPF switch overriding CPU",
> +	"BPF switch overriding PID/TID",
> +	"BPF switch overriding UID",
> +	"BPF switch overriding THREAD",
>  	"Invalid User: %s",
>  	"Problems obtaining information for user %s",
>  };
> @@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum,
>  
>  	switch (errnum) {
>  	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
> -	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:

hum, this should stay, no?

thanks,
jirka

> +	     TARGET_ERRNO__BPF_OVERRIDE_THREAD:
>  		snprintf(buf, buflen, "%s", msg);
>  		break;
>  
> diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h

SNIP



* Re: [PATCH v2 1/2] perf: support build BPF skeletons with perf
From: Song Liu @ 2020-12-08  0:59 UTC
  To: Jiri Olsa
  Cc: lkml, Kernel Team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung



> On Dec 7, 2020, at 12:25 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Thu, Dec 03, 2020 at 10:13:09PM -0800, Song Liu wrote:
> 
> SNIP
> 
>> @@ -735,7 +739,8 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc
>> 	$(x86_arch_prctl_code_array) \
>> 	$(rename_flags_array) \
>> 	$(arch_errno_name_array) \
>> -	$(sync_file_range_arrays)
>> +	$(sync_file_range_arrays) \
>> +	bpf-skel
>> 
>> $(OUTPUT)%.o: %.c prepare FORCE
>> 	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
>> @@ -1008,7 +1013,42 @@ config-clean:
>> python-clean:
>> 	$(python-clean)
>> 
>> -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean
>> +SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
>> +SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
>> +SKELETONS :=
>> +
>> +ifdef BUILD_BPF_SKEL
>> +BPFTOOL := $(SKEL_TMP_OUT)/bpftool-bootstrap
>> +LIBBPF_SRC := $(abspath ../lib/bpf)
>> +BPF_INCLUDE := -I$(SKEL_TMP_OUT)/..
> 
> it looks good, but I still need to add the following includes so the
> build can find the bpf_helper* headers

Thanks! I fixed this in the next version. 

Song

> 
> jirka
> 
> 
> ---
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index fb7de412152b..1f2fe339be85 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -1020,7 +1020,7 @@ SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
> ifdef BUILD_BPF_SKEL
> BPFTOOL := $(SKEL_TMP_OUT)/bpftool-bootstrap
> LIBBPF_SRC := $(abspath ../lib/bpf)
> -BPF_INCLUDE := -I$(SKEL_TMP_OUT)/..
> +BPF_INCLUDE := -I$(SKEL_TMP_OUT)/.. -I$(BPF_PATH) -I$(LIBBPF_SRC)/..
> 
> $(SKEL_TMP_OUT):
> 	$(Q)$(MKDIR) -p $@
> 



* Re: [PATCH v2 2/2] perf-stat: enable counting events for BPF programs
From: Song Liu @ 2020-12-08  1:36 UTC
  To: Jiri Olsa
  Cc: lkml, Kernel Team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung



> On Dec 7, 2020, at 2:07 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Thu, Dec 03, 2020 at 10:13:10PM -0800, Song Liu wrote:
> 
> SNIP
> 
>> +#include "bpf_skel/bpf_prog_profiler.skel.h"
>> +
>> +static inline void *u64_to_ptr(__u64 ptr)
>> +{
>> +	return (void *)(unsigned long)ptr;
>> +}
>> +
>> +static void set_max_rlimit(void)
>> +{
>> +	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
>> +
>> +	setrlimit(RLIMIT_MEMLOCK, &rinf);
>> +}
>> +
>> +static inline struct bpf_counter *bpf_counter_alloc(void)
> 
> why is this inlined?

We don't need the inline here. I will remove it in the next version. 

> 
> SNIP
> 
>> +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
>> +{
>> +	struct bpf_prog_profiler_bpf *skel;
>> +	struct bpf_counter *counter;
>> +	struct bpf_program *prog;
>> +	char *prog_name;
>> +	int prog_fd;
>> +	int err;
>> +
>> +	prog_fd = bpf_prog_get_fd_by_id(prog_id);
>> +	if (prog_fd < 0) {
>> +		pr_debug("Failed to open fd for bpf prog %u\n", prog_id);
>> +		return -1;
>> +	}
>> +	counter = bpf_counter_alloc();
>> +	if (!counter)
>> +		return -1;
>> +
>> +	skel = bpf_prog_profiler_bpf__open();
>> +	if (!skel) {
>> +		pr_debug("Failed to load bpf skeleton\n");
> 
> I'm still getting
> 
> [root@dell-r440-01 perf]# ./perf stat -b 38
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: XXX is not found in vmlinux BTF
> libbpf: failed to load object 'bpf_prog_profiler_bpf'
> libbpf: failed to load BPF skeleton 'bpf_prog_profiler_bpf': -2
> ...
> 
> with id 38 being:
> 
> 38: tracepoint  name sys_enter  tag 03418b72a610af75  gpl
>        loaded_at 2020-12-07T22:54:05+0100  uid 0
>        xlated 272B  jited 153B  memlock 4096B  map_ids 1
> 
> how is this supposed to work when there's XXX in the
> program's section? libbpf is trying to find XXX in
> kernel BTF and fails of course

I think this is because this program doesn't have BTF. The function that
actually failed was bpf_program__set_attach_target(). So the error message
above should be "Failed to _open_ bpf skeleton". I will fix the error
messages.
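
(For reference, whether a program has BTF can be checked with e.g.
"bpftool prog show id 38"; programs with BTF list a btf_id field, which
the dump above does not have.)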

> 
> 
>> +		free(counter);
>> +		return -1;
>> +	}
>> +	skel->rodata->num_cpu = evsel__nr_cpus(evsel);
>> +
>> +	bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
>> +	bpf_map__resize(skel->maps.fentry_readings, 1);
>> +	bpf_map__resize(skel->maps.accum_readings, 1);
>> +
> 
> SNIP
> 
>> +static int bpf_program_profiler__read(struct evsel *evsel)
>> +{
>> +	int num_cpu = evsel__nr_cpus(evsel);
>> +	struct bpf_perf_event_value values[num_cpu];
>> +	struct bpf_counter *counter;
>> +	int reading_map_fd;
>> +	__u32 key = 0;
>> +	int err, cpu;
>> +
>> +	if (list_empty(&evsel->bpf_counter_list))
>> +		return -EAGAIN;
>> +
>> +	for (cpu = 0; cpu < num_cpu; cpu++) {
>> +		perf_counts(evsel->counts, cpu, 0)->val = 0;
>> +		perf_counts(evsel->counts, cpu, 0)->ena = 0;
>> +		perf_counts(evsel->counts, cpu, 0)->run = 0;
>> +	}
>> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
>> +		struct bpf_prog_profiler_bpf *skel = counter->skel;
>> +
>> +		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>> +
>> +		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
>> +		if (err) {
>> +			fprintf(stderr, "failed to read value\n");
>> +			return err;
>> +		}
>> +
>> +		for (cpu = 0; cpu < num_cpu; cpu++) {
>> +			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
>> +			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
>> +			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
>> +		}
> 
> so we sum everything up for all provided bpf IDs,
> should we count/display them separately?

I think that's the default behavior with --pid x,y,z or --cpu a,b,c. 
Do we need to separate them?

> 
> SNIP
> 
>> +SEC("fentry/XXX")
>> +int BPF_PROG(fentry_XXX)
>> +{
>> +	__u32 key = bpf_get_smp_processor_id();
>> +	struct bpf_perf_event_value reading;
>> +	struct bpf_perf_event_value *ptr;
>> +	__u32 zero = 0;
>> +	long err;
>> +
>> +	/* look up before reading, to reduce error */
>> +	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
>> +	if (!ptr)
>> +		return 0;
>> +
>> +	err = bpf_perf_event_read_value(&events, key, &reading,
>> +					sizeof(reading));
> 
> can't we read directly to ptr in here?

Yes, we can! Thanks for catching this. 
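
Something like this, I think (untested):

	err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr));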

> 
> SNIP
> 
>> 	/* THREAD and SYSTEM/CPU are mutually exclusive */
>> 	if (target->per_thread && (target->system_wide || target->cpu_list)) {
>> 		target->per_thread = false;
>> @@ -109,6 +137,10 @@ static const char *target__error_str[] = {
>> 	"PID/TID switch overriding SYSTEM",
>> 	"UID switch overriding SYSTEM",
>> 	"SYSTEM/CPU switch overriding PER-THREAD",
>> +	"BPF switch overriding CPU",
>> +	"BPF switch overriding PID/TID",
>> +	"BPF switch overriding UID",
>> +	"BPF switch overriding THREAD",
>> 	"Invalid User: %s",
>> 	"Problems obtaining information for user %s",
>> };
>> @@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum,
>> 
>> 	switch (errnum) {
>> 	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
>> -	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
> 
> hum, this should stay, no?

We need this to print warnings like:

~/perf stat -e cycles,instructions -b 245561 -C 0
BPF switch overriding CPU
...

Thanks,
Song



* Re: [PATCH v2 2/2] perf-stat: enable counting events for BPF programs
From: Jiri Olsa @ 2020-12-08 10:24 UTC
  To: Song Liu
  Cc: lkml, Kernel Team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung

On Tue, Dec 08, 2020 at 01:36:57AM +0000, Song Liu wrote:

SNIP

> > 
> > I'm still getting
> > 
> > [root@dell-r440-01 perf]# ./perf stat -b 38
> > libbpf: elf: skipping unrecognized data section(9) .eh_frame
> > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> > libbpf: XXX is not found in vmlinux BTF
> > libbpf: failed to load object 'bpf_prog_profiler_bpf'
> > libbpf: failed to load BPF skeleton 'bpf_prog_profiler_bpf': -2
> > ...
> > 
> > with id 38 being:
> > 
> > 38: tracepoint  name sys_enter  tag 03418b72a610af75  gpl
> >        loaded_at 2020-12-07T22:54:05+0100  uid 0
> >        xlated 272B  jited 153B  memlock 4096B  map_ids 1
> > 
> > how is this supposed to work when there's XXX in the
> > program's section? libbpf is trying to find XXX in
> > kernel BTF and fails of course
> 
> I think this is because this program doesn't have BTF. The function that
> actually failed was bpf_program__set_attach_target(). So the error message
> above should be "Failed to _open_ bpf skeleton". I will fix the error
> messages.

ah right, it's a bpftrace program, so there's no BTF loaded for it.
I'll check if there's a way to add it, it'd be a shame not to have this
feature for bpftrace programs

there's no way around it, right? we need the BTF id of the program to
attach fentry/fexit to it

I think we need to fail the function if an error is detected,
and also check the prog_name and fail if it's not found

plus change all those pr_debug calls to pr_err in bpf_program_profiler_load_one

jirka



* Re: [PATCH v2 2/2] perf-stat: enable counting events for BPF programs
From: Song Liu @ 2020-12-08 18:16 UTC
  To: Jiri Olsa
  Cc: lkml, Kernel Team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung



> On Dec 8, 2020, at 2:24 AM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Tue, Dec 08, 2020 at 01:36:57AM +0000, Song Liu wrote:
> 
> SNIP
> 
>>> 
>>> I'm still getting
>>> 
>>> [root@dell-r440-01 perf]# ./perf stat -b 38
>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>>> libbpf: XXX is not found in vmlinux BTF
>>> libbpf: failed to load object 'bpf_prog_profiler_bpf'
>>> libbpf: failed to load BPF skeleton 'bpf_prog_profiler_bpf': -2
>>> ...
>>> 
>>> with id 38 being:
>>> 
>>> 38: tracepoint  name sys_enter  tag 03418b72a610af75  gpl
>>>       loaded_at 2020-12-07T22:54:05+0100  uid 0
>>>       xlated 272B  jited 153B  memlock 4096B  map_ids 1
>>> 
>>> how is this supposed to work when there's XXX in the
>>> program's section? libbpf is trying to find XXX in
>>> kernel BTF and fails of course
>> 
>> I think this is because this program doesn't have BTF. The function that
>> actually failed was bpf_program__set_attach_target(). So the error message
>> above should be "Failed to _open_ bpf skeleton". I will fix the error
>> messages.
> 
> ah right, it's a bpftrace program, so there's no BTF loaded for it.
> I'll check if there's a way to add it, it'd be a shame not to have this
> feature for bpftrace programs
> 
> there's no way around it, right? we need the BTF id of the program to
> attach fentry/fexit to it

Yes, we do need BTF here (in bpf_program__set_attach_target). 
> 
> I think we need to fail the function if an error is detected,
> and also check the prog_name and fail if it's not found
> 
> plus change all those pr_debug calls to pr_err in bpf_program_profiler_load_one

I fixed these in the next version. 

Thanks,
Song




* Re: [PATCH v2 2/2] perf-stat: enable counting events for BPF programs
From: Jiri Olsa @ 2020-12-09 15:55 UTC
  To: Song Liu
  Cc: lkml, Kernel Team, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, namhyung

On Tue, Dec 08, 2020 at 01:36:57AM +0000, Song Liu wrote:

SNIP

> > SNIP
> > 
> >> +static int bpf_program_profiler__read(struct evsel *evsel)
> >> +{
> >> +	int num_cpu = evsel__nr_cpus(evsel);
> >> +	struct bpf_perf_event_value values[num_cpu];
> >> +	struct bpf_counter *counter;
> >> +	int reading_map_fd;
> >> +	__u32 key = 0;
> >> +	int err, cpu;
> >> +
> >> +	if (list_empty(&evsel->bpf_counter_list))
> >> +		return -EAGAIN;
> >> +
> >> +	for (cpu = 0; cpu < num_cpu; cpu++) {
> >> +		perf_counts(evsel->counts, cpu, 0)->val = 0;
> >> +		perf_counts(evsel->counts, cpu, 0)->ena = 0;
> >> +		perf_counts(evsel->counts, cpu, 0)->run = 0;
> >> +	}
> >> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> >> +		struct bpf_prog_profiler_bpf *skel = counter->skel;
> >> +
> >> +		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> >> +
> >> +		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
> >> +		if (err) {
> >> +			fprintf(stderr, "failed to read value\n");
> >> +			return err;
> >> +		}
> >> +
> >> +		for (cpu = 0; cpu < num_cpu; cpu++) {
> >> +			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
> >> +			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
> >> +			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
> >> +		}
> > 
> > so we sum everything up for all provided bpf IDs,
> > should we count/display them separately?
> 
> I think that's the default behavior with --pid x,y,z or --cpu a,b,c. 
> Do we need to separate them?

ah right, and we have --per-thread, which splits the output
for the specified pids

I think we should add something like that for bpf, so we
could see stats for specific programs

it's ok to do this as a follow-up patch in the future

thanks,
jirka


