* [PATCH v6 0/4] Introduce perf-stat -b for BPF programs @ 2020-12-28 17:40 Song Liu 2020-12-28 17:40 ` [PATCH v6 1/4] bpftool: add Makefile target bootstrap Song Liu ` (3 more replies) 0 siblings, 4 replies; 25+ messages in thread
From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, kernel-team, Song Liu

This set introduces the perf-stat -b option to count events for BPF programs. This is similar to bpftool-prog-profile, but perf-stat makes it much more flexible.

Changes v5 => v6:
1. Update the name for the bootstrap bpftool. (Jiri)

Changes v4 => v5:
1. Add documentation. (Jiri)
2. Silence make output for removing the .bpf.o file. (Jiri)

Changes v3 => v4:
1. Split changes in bpftool/Makefile into a separate patch.
2. Various small fixes. (Jiri)

Changes v2 => v3:
1. Small fixes in Makefile.perf and bpf_counter.c. (Jiri)
2. Rebased on top of bpf-next. This is because 1/2 conflicts with some patches in bpftool/Makefile.

Changes PATCH v1 => PATCH v2:
1. Various fixes in Makefiles. (Jiri)
2. Fix a build warning/error with gcc-10. (Jiri)

Changes RFC v2 => PATCH v1:
1. Support counting on multiple BPF programs.
2. Add BPF handling to target__validate().
3. Improve Makefile. (Jiri)

Changes RFC v1 => RFC v2:
1. Use the bootstrap version of bpftool. (Jiri)
2. Set the default to not building bpf skeletons. (Jiri)
3. Remove util/bpf_skel/Makefile; keep all the logic in Makefile.perf. (Jiri)
4. Remove the dependency on vmlinux.h in the two skeletons. The goal here is to enable building perf without building the kernel (vmlinux) first. Note: I also removed the logic that builds vmlinux.h. We can add that back when we have to use it (to access big kernel structures).
Song Liu (4): bpftool: add Makefile target bootstrap perf: support build BPF skeletons with perf perf-stat: enable counting events for BPF programs perf-stat: add documentation for -b option tools/bpf/bpftool/Makefile | 2 + tools/build/Makefile.feature | 4 +- tools/perf/Documentation/perf-stat.txt | 14 + tools/perf/Makefile.config | 9 + tools/perf/Makefile.perf | 49 ++- tools/perf/builtin-stat.c | 77 ++++- tools/perf/util/Build | 1 + tools/perf/util/bpf_counter.c | 296 ++++++++++++++++++ tools/perf/util/bpf_counter.h | 72 +++++ tools/perf/util/bpf_skel/.gitignore | 3 + .../util/bpf_skel/bpf_prog_profiler.bpf.c | 93 ++++++ tools/perf/util/evsel.c | 9 + tools/perf/util/evsel.h | 6 + tools/perf/util/stat-display.c | 4 +- tools/perf/util/stat.c | 2 +- tools/perf/util/target.c | 34 +- tools/perf/util/target.h | 10 + tools/scripts/Makefile.include | 1 + 18 files changed, 666 insertions(+), 20 deletions(-) create mode 100644 tools/perf/util/bpf_counter.c create mode 100644 tools/perf/util/bpf_counter.h create mode 100644 tools/perf/util/bpf_skel/.gitignore create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c -- 2.24.1 ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v6 1/4] bpftool: add Makefile target bootstrap
  2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu
@ 2020-12-28 17:40 ` Song Liu
  2020-12-28 17:40 ` [PATCH v6 2/4] perf: support build BPF skeletons with perf Song Liu
  ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, kernel-team, Song Liu

This target is used to only build the bootstrap bpftool, which will be
used to generate bpf skeletons for other tools, like perf.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/bpf/bpftool/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index f897cb5fb12d0..e3292a3a0c461 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -148,6 +148,8 @@ VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \
 		     /boot/vmlinux-$(shell uname -r)
 VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))

+bootstrap: $(BPFTOOL_BOOTSTRAP)
+
 ifneq ($(VMLINUX_BTF)$(VMLINUX_H),)
 ifeq ($(feature-clang-bpf-co-re),1)
--
2.24.1

^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v6 2/4] perf: support build BPF skeletons with perf 2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu 2020-12-28 17:40 ` [PATCH v6 1/4] bpftool: add Makefile target bootstrap Song Liu @ 2020-12-28 17:40 ` Song Liu 2020-12-29 7:01 ` Namhyung Kim 2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu 2020-12-28 17:40 ` [PATCH v6 4/4] perf-stat: add documentation for -b option Song Liu 3 siblings, 1 reply; 25+ messages in thread From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw) To: linux-kernel Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, kernel-team, Song Liu BPF programs are useful in perf to profile BPF programs. BPF skeleton is by far the easiest way to write BPF tools. Enable building BPF skeletons in util/bpf_skel. A dummy bpf skeleton is added. More bpf skeletons will be added for different use cases. Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> --- tools/build/Makefile.feature | 4 ++- tools/perf/Makefile.config | 9 ++++++ tools/perf/Makefile.perf | 49 +++++++++++++++++++++++++++-- tools/perf/util/bpf_skel/.gitignore | 3 ++ tools/scripts/Makefile.include | 1 + 5 files changed, 63 insertions(+), 3 deletions(-) create mode 100644 tools/perf/util/bpf_skel/.gitignore diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index 97cbfb31b7625..74e255d58d8d0 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -99,7 +99,9 @@ FEATURE_TESTS_EXTRA := \ clang \ libbpf \ libpfm4 \ - libdebuginfod + libdebuginfod \ + clang-bpf-co-re + FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC) diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index ce8516e4de34f..d8e59d31399a5 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -621,6 +621,15 @@ ifndef NO_LIBBPF endif endif +ifdef BUILD_BPF_SKEL + $(call feature_check,clang-bpf-co-re) + ifeq 
($(feature-clang-bpf-co-re), 0) + dummy := $(error Error: clang too old. Please install recent clang) + endif + $(call detected,CONFIG_PERF_BPF_SKEL) + CFLAGS += -DHAVE_BPF_SKEL +endif + dwarf-post-unwind := 1 dwarf-post-unwind-text := BUG diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index 62f3deb1d3a8b..d182a2dbb9bbd 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -126,6 +126,8 @@ include ../scripts/utilities.mak # # Define NO_LIBDEBUGINFOD if you do not want support debuginfod # +# Define BUILD_BPF_SKEL to enable BPF skeletons +# # As per kernel Makefile, avoid funny character set dependencies unexport LC_ALL @@ -175,6 +177,12 @@ endef LD += $(EXTRA_LDFLAGS) +HOSTCC ?= gcc +HOSTLD ?= ld +HOSTAR ?= ar +CLANG ?= clang +LLVM_STRIP ?= llvm-strip + PKG_CONFIG = $(CROSS_COMPILE)pkg-config LLVM_CONFIG ?= llvm-config @@ -731,7 +739,8 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc $(x86_arch_prctl_code_array) \ $(rename_flags_array) \ $(arch_errno_name_array) \ - $(sync_file_range_arrays) + $(sync_file_range_arrays) \ + bpf-skel $(OUTPUT)%.o: %.c prepare FORCE $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@ @@ -1004,7 +1013,43 @@ config-clean: python-clean: $(python-clean) -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean +SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) +SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) +SKELETONS := + +ifdef BUILD_BPF_SKEL +BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool +LIBBPF_SRC := $(abspath ../lib/bpf) +BPF_INCLUDE := -I$(SKEL_TMP_OUT)/.. -I$(BPF_PATH) -I$(LIBBPF_SRC)/.. 
+ +$(SKEL_TMP_OUT): + $(Q)$(MKDIR) -p $@ + +$(BPFTOOL): | $(SKEL_TMP_OUT) + CFLAGS= $(MAKE) -C ../bpf/bpftool \ + OUTPUT=$(SKEL_TMP_OUT)/ bootstrap + +$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT) + $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) \ + -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@ + +$(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL) + $(QUIET_GENSKEL)$(BPFTOOL) gen skeleton $< > $@ + +bpf-skel: $(SKELETONS) + +.PRECIOUS: $(SKEL_TMP_OUT)/%.bpf.o + +else # BUILD_BPF_SKEL + +bpf-skel: + +endif # BUILD_BPF_SKEL + +bpf-skel-clean: + $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS) + +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean bpf-skel-clean $(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS) $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete $(Q)$(RM) $(OUTPUT).config-detected diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore new file mode 100644 index 0000000000000..5263e9e6c5d83 --- /dev/null +++ b/tools/perf/util/bpf_skel/.gitignore @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0-only +.tmp +*.skel.h \ No newline at end of file diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include index 1358e89cdf7d6..62119ce69ad9a 100644 --- a/tools/scripts/Makefile.include +++ b/tools/scripts/Makefile.include @@ -127,6 +127,7 @@ ifneq ($(silent),1) $(MAKE) $(PRINT_DIR) -C $$subdir QUIET_FLEX = @echo ' FLEX '$@; QUIET_BISON = @echo ' BISON '$@; + QUIET_GENSKEL = @echo ' GEN-SKEL '$@; descend = \ +@echo ' DESCEND '$(1); \ -- 2.24.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v6 2/4] perf: support build BPF skeletons with perf 2020-12-28 17:40 ` [PATCH v6 2/4] perf: support build BPF skeletons with perf Song Liu @ 2020-12-29 7:01 ` Namhyung Kim 2020-12-29 11:48 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 25+ messages in thread From: Namhyung Kim @ 2020-12-29 7:01 UTC (permalink / raw) To: Song Liu Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa, kernel-team Hello, On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote: > > BPF programs are useful in perf to profile BPF programs. BPF skeleton is I'm having difficulties understanding the first sentence - looks like a recursion. :) So do you want to use two (or more) BPF programs? Thanks, Namhyung > by far the easiest way to write BPF tools. Enable building BPF skeletons > in util/bpf_skel. A dummy bpf skeleton is added. More bpf skeletons will > be added for different use cases. > > Acked-by: Jiri Olsa <jolsa@redhat.com> > Signed-off-by: Song Liu <songliubraving@fb.com> > --- > tools/build/Makefile.feature | 4 ++- > tools/perf/Makefile.config | 9 ++++++ > tools/perf/Makefile.perf | 49 +++++++++++++++++++++++++++-- > tools/perf/util/bpf_skel/.gitignore | 3 ++ > tools/scripts/Makefile.include | 1 + > 5 files changed, 63 insertions(+), 3 deletions(-) > create mode 100644 tools/perf/util/bpf_skel/.gitignore > > diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature > index 97cbfb31b7625..74e255d58d8d0 100644 > --- a/tools/build/Makefile.feature > +++ b/tools/build/Makefile.feature > @@ -99,7 +99,9 @@ FEATURE_TESTS_EXTRA := \ > clang \ > libbpf \ > libpfm4 \ > - libdebuginfod > + libdebuginfod \ > + clang-bpf-co-re > + > > FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC) > > diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config > index ce8516e4de34f..d8e59d31399a5 100644 > --- a/tools/perf/Makefile.config > +++ b/tools/perf/Makefile.config > @@ -621,6 +621,15 @@ ifndef 
NO_LIBBPF > endif > endif > > +ifdef BUILD_BPF_SKEL > + $(call feature_check,clang-bpf-co-re) > + ifeq ($(feature-clang-bpf-co-re), 0) > + dummy := $(error Error: clang too old. Please install recent clang) > + endif > + $(call detected,CONFIG_PERF_BPF_SKEL) > + CFLAGS += -DHAVE_BPF_SKEL > +endif > + > dwarf-post-unwind := 1 > dwarf-post-unwind-text := BUG > > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf > index 62f3deb1d3a8b..d182a2dbb9bbd 100644 > --- a/tools/perf/Makefile.perf > +++ b/tools/perf/Makefile.perf > @@ -126,6 +126,8 @@ include ../scripts/utilities.mak > # > # Define NO_LIBDEBUGINFOD if you do not want support debuginfod > # > +# Define BUILD_BPF_SKEL to enable BPF skeletons > +# > > # As per kernel Makefile, avoid funny character set dependencies > unexport LC_ALL > @@ -175,6 +177,12 @@ endef > > LD += $(EXTRA_LDFLAGS) > > +HOSTCC ?= gcc > +HOSTLD ?= ld > +HOSTAR ?= ar > +CLANG ?= clang > +LLVM_STRIP ?= llvm-strip > + > PKG_CONFIG = $(CROSS_COMPILE)pkg-config > LLVM_CONFIG ?= llvm-config > > @@ -731,7 +739,8 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc > $(x86_arch_prctl_code_array) \ > $(rename_flags_array) \ > $(arch_errno_name_array) \ > - $(sync_file_range_arrays) > + $(sync_file_range_arrays) \ > + bpf-skel > > $(OUTPUT)%.o: %.c prepare FORCE > $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@ > @@ -1004,7 +1013,43 @@ config-clean: > python-clean: > $(python-clean) > > -clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean > +SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) > +SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) > +SKELETONS := > + > +ifdef BUILD_BPF_SKEL > +BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool > +LIBBPF_SRC := $(abspath ../lib/bpf) > +BPF_INCLUDE := -I$(SKEL_TMP_OUT)/.. -I$(BPF_PATH) -I$(LIBBPF_SRC)/.. 
> + > +$(SKEL_TMP_OUT): > + $(Q)$(MKDIR) -p $@ > + > +$(BPFTOOL): | $(SKEL_TMP_OUT) > + CFLAGS= $(MAKE) -C ../bpf/bpftool \ > + OUTPUT=$(SKEL_TMP_OUT)/ bootstrap > + > +$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT) > + $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) \ > + -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@ > + > +$(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL) > + $(QUIET_GENSKEL)$(BPFTOOL) gen skeleton $< > $@ > + > +bpf-skel: $(SKELETONS) > + > +.PRECIOUS: $(SKEL_TMP_OUT)/%.bpf.o > + > +else # BUILD_BPF_SKEL > + > +bpf-skel: > + > +endif # BUILD_BPF_SKEL > + > +bpf-skel-clean: > + $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS) > + > +clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean bpf-skel-clean > $(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS) > $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete > $(Q)$(RM) $(OUTPUT).config-detected > diff --git a/tools/perf/util/bpf_skel/.gitignore b/tools/perf/util/bpf_skel/.gitignore > new file mode 100644 > index 0000000000000..5263e9e6c5d83 > --- /dev/null > +++ b/tools/perf/util/bpf_skel/.gitignore > @@ -0,0 +1,3 @@ > +# SPDX-License-Identifier: GPL-2.0-only > +.tmp > +*.skel.h > \ No newline at end of file > diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include > index 1358e89cdf7d6..62119ce69ad9a 100644 > --- a/tools/scripts/Makefile.include > +++ b/tools/scripts/Makefile.include > @@ -127,6 +127,7 @@ ifneq ($(silent),1) > $(MAKE) $(PRINT_DIR) -C $$subdir > QUIET_FLEX = @echo ' FLEX '$@; > QUIET_BISON = @echo ' BISON '$@; > + QUIET_GENSKEL = @echo ' GEN-SKEL '$@; > > descend = \ > +@echo ' DESCEND '$(1); \ > -- > 2.24.1 > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 2/4] perf: support build BPF skeletons with perf 2020-12-29 7:01 ` Namhyung Kim @ 2020-12-29 11:48 ` Arnaldo Carvalho de Melo 2020-12-29 17:14 ` Song Liu 0 siblings, 1 reply; 25+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-12-29 11:48 UTC (permalink / raw) To: Namhyung Kim Cc: Song Liu, linux-kernel, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa, kernel-team Em Tue, Dec 29, 2020 at 04:01:41PM +0900, Namhyung Kim escreveu: > On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote: > > BPF programs are useful in perf to profile BPF programs. BPF skeleton is > I'm having difficulties understanding the first sentence - looks like a > recursion. :) So do you want to use two (or more) BPF programs? Yeah, we use perf to perf perf, so we need to use bpf with perf to perf bpf :-) Look at tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c, the BPF skeleton used to create the in-kernel scaffold to profile BPF programs. It uses two BPF programs (fentry/XXX and fexit/XXX) and some a PERF_EVENT_ARRAY map and an array to diff counters read at exit from counters read at exit of the profiled BPF programs and then accumulate those diffs in another PERCPU_ARRAY. This all ends up composing a "BPF PMU" that is what the userspace perf tooling will read (from "accum_readings" BPF map) and 'perf stat' will consume as if reading from an "old style perf counter" :-) Song, did I get it right? 
:-) For convenience, it is below: - Arnaldo [acme@five perf]$ cat tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) // Copyright (c) 2020 Facebook #include <linux/bpf.h> #include <bpf/bpf_helpers.h> #include <bpf/bpf_tracing.h> /* map of perf event fds, num_cpu * num_metric entries */ struct { __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(int)); } events SEC(".maps"); /* readings at fentry */ struct { __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(struct bpf_perf_event_value)); __uint(max_entries, 1); } fentry_readings SEC(".maps"); /* accumulated readings */ struct { __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(struct bpf_perf_event_value)); __uint(max_entries, 1); } accum_readings SEC(".maps"); const volatile __u32 num_cpu = 1; SEC("fentry/XXX") int BPF_PROG(fentry_XXX) { __u32 key = bpf_get_smp_processor_id(); struct bpf_perf_event_value *ptr; __u32 zero = 0; long err; /* look up before reading, to reduce error */ ptr = bpf_map_lookup_elem(&fentry_readings, &zero); if (!ptr) return 0; err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr)); if (err) return 0; return 0; } static inline void fexit_update_maps(struct bpf_perf_event_value *after) { struct bpf_perf_event_value *before, diff, *accum; __u32 zero = 0; before = bpf_map_lookup_elem(&fentry_readings, &zero); /* only account samples with a valid fentry_reading */ if (before && before->counter) { struct bpf_perf_event_value *accum; diff.counter = after->counter - before->counter; diff.enabled = after->enabled - before->enabled; diff.running = after->running - before->running; accum = bpf_map_lookup_elem(&accum_readings, &zero); if (accum) { accum->counter += diff.counter; accum->enabled += diff.enabled; accum->running += diff.running; } } } SEC("fexit/XXX") int 
BPF_PROG(fexit_XXX) { struct bpf_perf_event_value reading; __u32 cpu = bpf_get_smp_processor_id(); __u32 one = 1, zero = 0; int err; /* read all events before updating the maps, to reduce error */ err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading)); if (err) return 0; fexit_update_maps(&reading); return 0; } char LICENSE[] SEC("license") = "Dual BSD/GPL"; [acme@five perf]$ ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 2/4] perf: support build BPF skeletons with perf 2020-12-29 11:48 ` Arnaldo Carvalho de Melo @ 2020-12-29 17:14 ` Song Liu 2020-12-29 18:16 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 25+ messages in thread From: Song Liu @ 2020-12-29 17:14 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Namhyung Kim, linux-kernel, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa, Kernel Team > On Dec 29, 2020, at 3:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Tue, Dec 29, 2020 at 04:01:41PM +0900, Namhyung Kim escreveu: >> On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote: >>> BPF programs are useful in perf to profile BPF programs. BPF skeleton is > >> I'm having difficulties understanding the first sentence - looks like a >> recursion. :) So do you want to use two (or more) BPF programs? > > Yeah, we use perf to perf perf, so we need to use bpf with perf to perf > bpf :-) > > Look at tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c, the BPF > skeleton used to create the in-kernel scaffold to profile BPF programs. > > It uses two BPF programs (fentry/XXX and fexit/XXX) and some a > PERF_EVENT_ARRAY map and an array to diff counters read at exit from > counters read at exit of the profiled BPF programs and then accumulate > those diffs in another PERCPU_ARRAY. > > This all ends up composing a "BPF PMU" that is what the userspace perf > tooling will read (from "accum_readings" BPF map) and 'perf stat' will > consume as if reading from an "old style perf counter" :-) > > Song, did I get it right? :-) Thanks Arnaldo! I don't think anyone can explain it better. :-) Song [...] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 2/4] perf: support build BPF skeletons with perf 2020-12-29 17:14 ` Song Liu @ 2020-12-29 18:16 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 25+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-12-29 18:16 UTC (permalink / raw) To: Song Liu Cc: Namhyung Kim, linux-kernel, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa, Kernel Team Em Tue, Dec 29, 2020 at 05:14:12PM +0000, Song Liu escreveu: > > On Dec 29, 2020, at 3:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Tue, Dec 29, 2020 at 04:01:41PM +0900, Namhyung Kim escreveu: > >> On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote: > >>> BPF programs are useful in perf to profile BPF programs. BPF skeleton is > >> I'm having difficulties understanding the first sentence - looks like a > >> recursion. :) So do you want to use two (or more) BPF programs? > > Yeah, we use perf to perf perf, so we need to use bpf with perf to perf > > bpf :-) > > Look at tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c, the BPF > > skeleton used to create the in-kernel scaffold to profile BPF programs. > > It uses two BPF programs (fentry/XXX and fexit/XXX) and some a s/some// > > PERF_EVENT_ARRAY map and an array to diff counters read at exit from > > counters read at exit of the profiled BPF programs and then accumulate s/exit/entry/ > > those diffs in another PERCPU_ARRAY. > > This all ends up composing a "BPF PMU" that is what the userspace perf > > tooling will read (from "accum_readings" BPF map) and 'perf stat' will > > consume as if reading from an "old style perf counter" :-) > > Song, did I get it right? :-) > Thanks Arnaldo! I don't think anyone can explain it better. :-) There, a patch :-) - Arnaldo ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu
  2020-12-28 17:40 ` [PATCH v6 1/4] bpftool: add Makefile target bootstrap Song Liu
  2020-12-28 17:40 ` [PATCH v6 2/4] perf: support build BPF skeletons with perf Song Liu
@ 2020-12-28 17:40 ` Song Liu
  2020-12-28 20:11 ` Arnaldo Carvalho de Melo
  2020-12-29  7:22 ` Namhyung Kim
  2020-12-28 17:40 ` [PATCH v6 4/4] perf-stat: add documentation for -b option Song Liu
  3 siblings, 2 replies; 25+ messages in thread
From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw)
To: linux-kernel
Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, kernel-team, Song Liu

Introduce the perf-stat -b option, which counts events for BPF programs, like:

[root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
     1.487903822            115,200      ref-cycles
     1.487903822             86,012      cycles
     2.489147029             80,560      ref-cycles
     2.489147029             73,784      cycles
     3.490341825             60,720      ref-cycles
     3.490341825             37,797      cycles
     4.491540887             37,120      ref-cycles
     4.491540887             31,963      cycles

The example above counts cycles and ref-cycles of the BPF program with id 254. This is similar to the bpftool-prog-profile command, but more flexible.

perf-stat -b creates per-cpu perf_events and loads fentry/fexit BPF programs (monitor-progs) onto the target BPF program (target-prog). The monitor-progs read the perf_event before and after the target-prog runs, and aggregate the difference in a BPF map. User space then reads the data from these maps.

A new struct bpf_counter is introduced to provide a common interface for using BPF programs/maps to count perf events.
Signed-off-by: Song Liu <songliubraving@fb.com> --- tools/perf/Makefile.perf | 2 +- tools/perf/builtin-stat.c | 77 ++++- tools/perf/util/Build | 1 + tools/perf/util/bpf_counter.c | 296 ++++++++++++++++++ tools/perf/util/bpf_counter.h | 72 +++++ .../util/bpf_skel/bpf_prog_profiler.bpf.c | 93 ++++++ tools/perf/util/evsel.c | 9 + tools/perf/util/evsel.h | 6 + tools/perf/util/stat-display.c | 4 +- tools/perf/util/stat.c | 2 +- tools/perf/util/target.c | 34 +- tools/perf/util/target.h | 10 + 12 files changed, 588 insertions(+), 18 deletions(-) create mode 100644 tools/perf/util/bpf_counter.c create mode 100644 tools/perf/util/bpf_counter.h create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index d182a2dbb9bbd..8c4e039c3b813 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -1015,7 +1015,7 @@ python-clean: SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) -SKELETONS := +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h ifdef BUILD_BPF_SKEL BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 8cc24967bc273..09bffb3fbcdd4 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -67,6 +67,7 @@ #include "util/top.h" #include "util/affinity.h" #include "util/pfm.h" +#include "util/bpf_counter.h" #include "asm/bug.h" #include <linux/time64.h> @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs) return 0; } +static int read_bpf_map_counters(void) +{ + struct evsel *counter; + int err; + + evlist__for_each_entry(evsel_list, counter) { + err = bpf_counter__read(counter); + if (err) + return err; + } + return 0; +} + static void read_counters(struct timespec *rs) { struct evsel *counter; + int err; - if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0)) - return; + if (!stat_config.stop_read_counter) { + err = 
read_bpf_map_counters(); + if (err == -EAGAIN) + err = read_affinity_counters(rs); + if (err < 0) + return; + } evlist__for_each_entry(evsel_list, counter) { if (counter->err) @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times) return false; } -static void enable_counters(void) +static int enable_counters(void) { + struct evsel *evsel; + int err; + + evlist__for_each_entry(evsel_list, evsel) { + err = bpf_counter__enable(evsel); + if (err) + return err; + } + if (stat_config.initial_delay < 0) { pr_info(EVLIST_DISABLED_MSG); - return; + return 0; } if (stat_config.initial_delay > 0) { @@ -518,6 +547,7 @@ static void enable_counters(void) if (stat_config.initial_delay > 0) pr_info(EVLIST_ENABLED_MSG); } + return 0; } static void disable_counters(void) @@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) const bool forks = (argc > 0); bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false; struct affinity affinity; - int i, cpu; + int i, cpu, err; bool second_pass = false; if (forks) { @@ -737,6 +767,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) if (affinity__setup(&affinity) < 0) return -1; + evlist__for_each_entry(evsel_list, counter) { + if (bpf_counter__load(counter, &target)) + return -1; + } + evlist__for_each_cpu (evsel_list, i, cpu) { affinity__set(&affinity, cpu); @@ -850,7 +885,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) } if (STAT_RECORD) { - int err, fd = perf_data__fd(&perf_stat.data); + int fd = perf_data__fd(&perf_stat.data); if (is_pipe) { err = perf_header__write_pipe(perf_data__fd(&perf_stat.data)); @@ -876,7 +911,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) if (forks) { evlist__start_workload(evsel_list); - enable_counters(); + err = enable_counters(); + if (err) + return -1; if (interval || timeout || evlist__ctlfd_initialized(evsel_list)) status = dispatch_events(forks, timeout, 
interval, ×); @@ -895,7 +932,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) if (WIFSIGNALED(status)) psignal(WTERMSIG(status), argv[0]); } else { - enable_counters(); + err = enable_counters(); + if (err) + return -1; status = dispatch_events(forks, timeout, interval, ×); } @@ -1085,6 +1124,10 @@ static struct option stat_options[] = { "stat events on existing process id"), OPT_STRING('t', "tid", &target.tid, "tid", "stat events on existing thread id"), +#ifdef HAVE_BPF_SKEL + OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id", + "stat events on existing bpf program id"), +#endif OPT_BOOLEAN('a', "all-cpus", &target.system_wide, "system-wide collection from all CPUs"), OPT_BOOLEAN('g', "group", &group, @@ -2064,11 +2107,12 @@ int cmd_stat(int argc, const char **argv) "perf stat [<options>] [<command>]", NULL }; - int status = -EINVAL, run_idx; + int status = -EINVAL, run_idx, err; const char *mode; FILE *output = stderr; unsigned int interval, timeout; const char * const stat_subcommands[] = { "record", "report" }; + char errbuf[BUFSIZ]; setlocale(LC_ALL, ""); @@ -2179,6 +2223,12 @@ int cmd_stat(int argc, const char **argv) } else if (big_num_opt == 0) /* User passed --no-big-num */ stat_config.big_num = false; + err = target__validate(&target); + if (err) { + target__strerror(&target, err, errbuf, BUFSIZ); + pr_warning("%s\n", errbuf); + } + setup_system_wide(argc); /* @@ -2252,8 +2302,6 @@ int cmd_stat(int argc, const char **argv) } } - target__validate(&target); - if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide)) target.per_thread = true; @@ -2384,9 +2432,10 @@ int cmd_stat(int argc, const char **argv) * tools remain -acme */ int fd = perf_data__fd(&perf_stat.data); - int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat, - process_synthesized_event, - &perf_stat.session->machines.host); + + err = perf_event__synthesize_kernel_mmap((void *)&perf_stat, + process_synthesized_event, + 
&perf_stat.session->machines.host); if (err) { pr_warning("Couldn't synthesize the kernel mmap record, harmless, " "older tools may produce warnings about this file\n."); diff --git a/tools/perf/util/Build b/tools/perf/util/Build index e2563d0154eb6..188521f343470 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -135,6 +135,7 @@ perf-y += clockid.o perf-$(CONFIG_LIBBPF) += bpf-loader.o perf-$(CONFIG_LIBBPF) += bpf_map.o +perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o perf-$(CONFIG_LIBELF) += symbol-elf.o perf-$(CONFIG_LIBELF) += probe-file.o diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c new file mode 100644 index 0000000000000..f2cb86a40c882 --- /dev/null +++ b/tools/perf/util/bpf_counter.c @@ -0,0 +1,296 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* Copyright (c) 2019 Facebook */ + +#include <limits.h> +#include <unistd.h> +#include <sys/time.h> +#include <sys/resource.h> +#include <linux/err.h> +#include <linux/zalloc.h> +#include <bpf/bpf.h> +#include <bpf/btf.h> +#include <bpf/libbpf.h> + +#include "bpf_counter.h" +#include "counts.h" +#include "debug.h" +#include "evsel.h" +#include "target.h" + +#include "bpf_skel/bpf_prog_profiler.skel.h" + +static inline void *u64_to_ptr(__u64 ptr) +{ + return (void *)(unsigned long)ptr; +} + +static void set_max_rlimit(void) +{ + struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY }; + + setrlimit(RLIMIT_MEMLOCK, &rinf); +} + +static struct bpf_counter *bpf_counter_alloc(void) +{ + struct bpf_counter *counter; + + counter = zalloc(sizeof(*counter)); + if (counter) + INIT_LIST_HEAD(&counter->list); + return counter; +} + +static int bpf_program_profiler__destroy(struct evsel *evsel) +{ + struct bpf_counter *counter; + + list_for_each_entry(counter, &evsel->bpf_counter_list, list) + bpf_prog_profiler_bpf__destroy(counter->skel); + INIT_LIST_HEAD(&evsel->bpf_counter_list); + return 0; +} + +static char *bpf_target_prog_name(int tgt_fd) 
+{ + struct bpf_prog_info_linear *info_linear; + struct bpf_func_info *func_info; + const struct btf_type *t; + char *name = NULL; + struct btf *btf; + + info_linear = bpf_program__get_prog_info_linear( + tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO); + if (IS_ERR_OR_NULL(info_linear)) { + pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd); + return NULL; + } + + if (info_linear->info.btf_id == 0 || + btf__get_from_id(info_linear->info.btf_id, &btf)) { + pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd); + goto out; + } + + func_info = u64_to_ptr(info_linear->info.func_info); + t = btf__type_by_id(btf, func_info[0].type_id); + if (!t) { + pr_debug("btf %d doesn't have type %d\n", + info_linear->info.btf_id, func_info[0].type_id); + goto out; + } + name = strdup(btf__name_by_offset(btf, t->name_off)); +out: + free(info_linear); + return name; +} + +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id) +{ + struct bpf_prog_profiler_bpf *skel; + struct bpf_counter *counter; + struct bpf_program *prog; + char *prog_name; + int prog_fd; + int err; + + prog_fd = bpf_prog_get_fd_by_id(prog_id); + if (prog_fd < 0) { + pr_err("Failed to open fd for bpf prog %u\n", prog_id); + return -1; + } + counter = bpf_counter_alloc(); + if (!counter) { + close(prog_fd); + return -1; + } + + skel = bpf_prog_profiler_bpf__open(); + if (!skel) { + pr_err("Failed to open bpf skeleton\n"); + goto err_out; + } + skel->rodata->num_cpu = evsel__nr_cpus(evsel); + + bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel)); + bpf_map__resize(skel->maps.fentry_readings, 1); + bpf_map__resize(skel->maps.accum_readings, 1); + + prog_name = bpf_target_prog_name(prog_fd); + if (!prog_name) { + pr_err("Failed to get program name for bpf prog %u. 
Does it have BTF?\n", prog_id); + goto err_out; + } + + bpf_object__for_each_program(prog, skel->obj) { + err = bpf_program__set_attach_target(prog, prog_fd, prog_name); + if (err) { + pr_err("bpf_program__set_attach_target failed.\n" + "Does bpf prog %u have BTF?\n", prog_id); + goto err_out; + } + } + set_max_rlimit(); + err = bpf_prog_profiler_bpf__load(skel); + if (err) { + pr_err("bpf_prog_profiler_bpf__load failed\n"); + goto err_out; + } + + counter->skel = skel; + list_add(&counter->list, &evsel->bpf_counter_list); + close(prog_fd); + return 0; +err_out: + free(counter); + close(prog_fd); + return -1; +} + +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target) +{ + char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p; + u32 prog_id; + int ret; + + bpf_str_ = bpf_str = strdup(target->bpf_str); + if (!bpf_str) + return -1; + + while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) { + prog_id = strtoul(tok, &p, 10); + if (prog_id == 0 || prog_id == UINT_MAX || + (*p != '\0' && *p != ',')) { + pr_err("Failed to parse bpf prog ids %s\n", + target->bpf_str); + free(bpf_str_); + return -1; + } + + ret = bpf_program_profiler_load_one(evsel, prog_id); + if (ret) { + bpf_program_profiler__destroy(evsel); + free(bpf_str_); + return -1; + } + bpf_str = NULL; + } + free(bpf_str_); + return 0; +} + +static int bpf_program_profiler__enable(struct evsel *evsel) +{ + struct bpf_counter *counter; + int ret; + + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { + ret = bpf_prog_profiler_bpf__attach(counter->skel); + if (ret) { + bpf_program_profiler__destroy(evsel); + return ret; + } + } + return 0; +} + +static int bpf_program_profiler__read(struct evsel *evsel) +{ + int num_cpu = evsel__nr_cpus(evsel); + struct bpf_perf_event_value values[num_cpu]; + struct bpf_counter *counter; + int reading_map_fd; + __u32 key = 0; + int err, cpu; + + if (list_empty(&evsel->bpf_counter_list)) + return -EAGAIN; + + for (cpu = 0; cpu < num_cpu; cpu++) { +
perf_counts(evsel->counts, cpu, 0)->val = 0; + perf_counts(evsel->counts, cpu, 0)->ena = 0; + perf_counts(evsel->counts, cpu, 0)->run = 0; + } + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { + struct bpf_prog_profiler_bpf *skel = counter->skel; + + reading_map_fd = bpf_map__fd(skel->maps.accum_readings); + + err = bpf_map_lookup_elem(reading_map_fd, &key, values); + if (err) { + fprintf(stderr, "failed to read value\n"); + return err; + } + + for (cpu = 0; cpu < num_cpu; cpu++) { + perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter; + perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled; + perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running; + } + } + return 0; +} + +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu, + int fd) +{ + struct bpf_prog_profiler_bpf *skel; + struct bpf_counter *counter; + int ret; + + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { + skel = counter->skel; + ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events), + &cpu, &fd, BPF_ANY); + if (ret) + return ret; + } + return 0; +} + +struct bpf_counter_ops bpf_program_profiler_ops = { + .load = bpf_program_profiler__load, + .enable = bpf_program_profiler__enable, + .read = bpf_program_profiler__read, + .destroy = bpf_program_profiler__destroy, + .install_pe = bpf_program_profiler__install_pe, +}; + +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd) +{ + if (list_empty(&evsel->bpf_counter_list)) + return 0; + return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd); +} + +int bpf_counter__load(struct evsel *evsel, struct target *target) +{ + if (target__has_bpf(target)) + evsel->bpf_counter_ops = &bpf_program_profiler_ops; + + if (evsel->bpf_counter_ops) + return evsel->bpf_counter_ops->load(evsel, target); + return 0; +} + +int bpf_counter__enable(struct evsel *evsel) +{ + if (list_empty(&evsel->bpf_counter_list)) + return 0; + return evsel->bpf_counter_ops->enable(evsel); +} + +int 
bpf_counter__read(struct evsel *evsel) +{ + if (list_empty(&evsel->bpf_counter_list)) + return -EAGAIN; + return evsel->bpf_counter_ops->read(evsel); +} + +void bpf_counter__destroy(struct evsel *evsel) +{ + if (list_empty(&evsel->bpf_counter_list)) + return; + evsel->bpf_counter_ops->destroy(evsel); + evsel->bpf_counter_ops = NULL; +} diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h new file mode 100644 index 0000000000000..2eca210e5dc16 --- /dev/null +++ b/tools/perf/util/bpf_counter.h @@ -0,0 +1,72 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __PERF_BPF_COUNTER_H +#define __PERF_BPF_COUNTER_H 1 + +#include <linux/list.h> + +struct evsel; +struct target; +struct bpf_counter; + +typedef int (*bpf_counter_evsel_op)(struct evsel *evsel); +typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel, + struct target *target); +typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel, + int cpu, + int fd); + +struct bpf_counter_ops { + bpf_counter_evsel_target_op load; + bpf_counter_evsel_op enable; + bpf_counter_evsel_op read; + bpf_counter_evsel_op destroy; + bpf_counter_evsel_install_pe_op install_pe; +}; + +struct bpf_counter { + void *skel; + struct list_head list; +}; + +#ifdef HAVE_BPF_SKEL + +int bpf_counter__load(struct evsel *evsel, struct target *target); +int bpf_counter__enable(struct evsel *evsel); +int bpf_counter__read(struct evsel *evsel); +void bpf_counter__destroy(struct evsel *evsel); +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd); + +#else /* HAVE_BPF_SKEL */ + +#include <linux/err.h> + +static inline int bpf_counter__load(struct evsel *evsel __maybe_unused, + struct target *target __maybe_unused) +{ + return 0; +} + +static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused) +{ + return 0; +} + +static inline int bpf_counter__read(struct evsel *evsel __maybe_unused) +{ + return -EAGAIN; +} + +static inline void bpf_counter__destroy(struct evsel *evsel __maybe_unused) 
+{ +} + +static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, + int cpu __maybe_unused, + int fd __maybe_unused) +{ + return 0; +} + +#endif /* HAVE_BPF_SKEL */ + +#endif /* __PERF_BPF_COUNTER_H */ diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c new file mode 100644 index 0000000000000..c7cec92d02360 --- /dev/null +++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c @@ -0,0 +1,93 @@ +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +// Copyright (c) 2020 Facebook +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> + +/* map of perf event fds, num_cpu * num_metric entries */ +struct { + __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(int)); +} events SEC(".maps"); + +/* readings at fentry */ +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(struct bpf_perf_event_value)); + __uint(max_entries, 1); +} fentry_readings SEC(".maps"); + +/* accumulated readings */ +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(struct bpf_perf_event_value)); + __uint(max_entries, 1); +} accum_readings SEC(".maps"); + +const volatile __u32 num_cpu = 1; + +SEC("fentry/XXX") +int BPF_PROG(fentry_XXX) +{ + __u32 key = bpf_get_smp_processor_id(); + struct bpf_perf_event_value *ptr; + __u32 zero = 0; + long err; + + /* look up before reading, to reduce error */ + ptr = bpf_map_lookup_elem(&fentry_readings, &zero); + if (!ptr) + return 0; + + err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr)); + if (err) + return 0; + + return 0; +} + +static inline void +fexit_update_maps(struct bpf_perf_event_value *after) +{ + struct bpf_perf_event_value *before, diff; + __u32 zero = 0; + + before = bpf_map_lookup_elem(&fentry_readings, &zero); + /* only account 
samples with a valid fentry_reading */ + if (before && before->counter) { + struct bpf_perf_event_value *accum; + + diff.counter = after->counter - before->counter; + diff.enabled = after->enabled - before->enabled; + diff.running = after->running - before->running; + + accum = bpf_map_lookup_elem(&accum_readings, &zero); + if (accum) { + accum->counter += diff.counter; + accum->enabled += diff.enabled; + accum->running += diff.running; + } + } +} + +SEC("fexit/XXX") +int BPF_PROG(fexit_XXX) +{ + struct bpf_perf_event_value reading; + __u32 cpu = bpf_get_smp_processor_id(); + __u32 one = 1, zero = 0; + int err; + + /* read all events before updating the maps, to reduce error */ + err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading)); + if (err) + return 0; + + fexit_update_maps(&reading); + return 0; +} + +char LICENSE[] SEC("license") = "Dual BSD/GPL"; diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index c26ea82220bd8..7265308765d73 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -25,6 +25,7 @@ #include <stdlib.h> #include <perf/evsel.h> #include "asm/bug.h" +#include "bpf_counter.h" #include "callchain.h" #include "cgroup.h" #include "counts.h" @@ -51,6 +52,10 @@ #include <internal/lib.h> #include <linux/ctype.h> +#include <bpf/bpf.h> +#include <bpf/libbpf.h> +#include <bpf/btf.h> +#include "rlimit.h" struct perf_missing_features perf_missing_features; @@ -247,6 +252,7 @@ void evsel__init(struct evsel *evsel, evsel->bpf_obj = NULL; evsel->bpf_fd = -1; INIT_LIST_HEAD(&evsel->config_terms); + INIT_LIST_HEAD(&evsel->bpf_counter_list); perf_evsel__object.init(evsel); evsel->sample_size = __evsel__sample_size(attr->sample_type); evsel__calc_id_pos(evsel); @@ -1366,6 +1372,7 @@ void evsel__exit(struct evsel *evsel) { assert(list_empty(&evsel->core.node)); assert(evsel->evlist == NULL); + bpf_counter__destroy(evsel); evsel__free_counts(evsel); perf_evsel__free_fd(&evsel->core); perf_evsel__free_id(&evsel->core); 
@@ -1781,6 +1788,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, FD(evsel, cpu, thread) = fd; + bpf_counter__install_pe(evsel, cpu, fd); + if (unlikely(test_attr__enabled)) { test_attr__open(&evsel->core.attr, pid, cpus->map[cpu], fd, group_fd, flags); diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index cd1d8dd431997..40e3946cd7518 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -10,6 +10,7 @@ #include <internal/evsel.h> #include <perf/evsel.h> #include "symbol_conf.h" +#include "bpf_counter.h" #include <internal/cpumap.h> struct bpf_object; @@ -17,6 +18,8 @@ struct cgroup; struct perf_counts; struct perf_stat_evsel; union perf_event; +struct bpf_counter_ops; +struct target; typedef int (evsel__sb_cb_t)(union perf_event *event, void *data); @@ -127,6 +130,8 @@ struct evsel { * See also evsel__has_callchain(). */ __u64 synth_sample_type; + struct list_head bpf_counter_list; + struct bpf_counter_ops *bpf_counter_ops; }; struct perf_missing_features { @@ -424,4 +429,5 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel) struct perf_env *evsel__env(struct evsel *evsel); int evsel__store_ids(struct evsel *evsel, struct evlist *evlist); + #endif /* __PERF_EVSEL_H */ diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c index 583ae4f09c5d1..cce7a76d6473c 100644 --- a/tools/perf/util/stat-display.c +++ b/tools/perf/util/stat-display.c @@ -1045,7 +1045,9 @@ static void print_header(struct perf_stat_config *config, if (!config->csv_output) { fprintf(output, "\n"); fprintf(output, " Performance counter stats for "); - if (_target->system_wide) + if (_target->bpf_str) + fprintf(output, "\'BPF program(s) %s", _target->bpf_str); + else if (_target->system_wide) fprintf(output, "\'system wide"); else if (_target->cpu_list) fprintf(output, "\'CPU(s) %s", _target->cpu_list); diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c index 8ce1479c98f03..0b3957323f668 100644 
--- a/tools/perf/util/stat.c +++ b/tools/perf/util/stat.c @@ -527,7 +527,7 @@ int create_perf_stat_counter(struct evsel *evsel, if (leader->core.nr_members > 1) attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP; - attr->inherit = !config->no_inherit; + attr->inherit = !config->no_inherit && list_empty(&evsel->bpf_counter_list); /* * Some events get initialized with sample_(period/type) set, diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c index a3db13dea937c..0f383418e3df5 100644 --- a/tools/perf/util/target.c +++ b/tools/perf/util/target.c @@ -56,6 +56,34 @@ enum target_errno target__validate(struct target *target) ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM; } + /* BPF and CPU are mutually exclusive */ + if (target->bpf_str && target->cpu_list) { + target->cpu_list = NULL; + if (ret == TARGET_ERRNO__SUCCESS) + ret = TARGET_ERRNO__BPF_OVERRIDE_CPU; + } + + /* BPF and PID/TID are mutually exclusive */ + if (target->bpf_str && target->tid) { + target->tid = NULL; + if (ret == TARGET_ERRNO__SUCCESS) + ret = TARGET_ERRNO__BPF_OVERRIDE_PID; + } + + /* BPF and UID are mutually exclusive */ + if (target->bpf_str && target->uid_str) { + target->uid_str = NULL; + if (ret == TARGET_ERRNO__SUCCESS) + ret = TARGET_ERRNO__BPF_OVERRIDE_UID; + } + + /* BPF and THREADS are mutually exclusive */ + if (target->bpf_str && target->per_thread) { + target->per_thread = false; + if (ret == TARGET_ERRNO__SUCCESS) + ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD; + } + /* THREAD and SYSTEM/CPU are mutually exclusive */ if (target->per_thread && (target->system_wide || target->cpu_list)) { target->per_thread = false; @@ -109,6 +137,10 @@ static const char *target__error_str[] = { "PID/TID switch overriding SYSTEM", "UID switch overriding SYSTEM", "SYSTEM/CPU switch overriding PER-THREAD", + "BPF switch overriding CPU", + "BPF switch overriding PID/TID", + "BPF switch overriding UID", + "BPF switch overriding THREAD", "Invalid User: %s", "Problems obtaining information for user 
%s", }; @@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum, switch (errnum) { case TARGET_ERRNO__PID_OVERRIDE_CPU ... - TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD: + TARGET_ERRNO__BPF_OVERRIDE_THREAD: snprintf(buf, buflen, "%s", msg); break; diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h index 6ef01a83b24e9..f132c6c2eef81 100644 --- a/tools/perf/util/target.h +++ b/tools/perf/util/target.h @@ -10,6 +10,7 @@ struct target { const char *tid; const char *cpu_list; const char *uid_str; + const char *bpf_str; uid_t uid; bool system_wide; bool uses_mmap; @@ -36,6 +37,10 @@ enum target_errno { TARGET_ERRNO__PID_OVERRIDE_SYSTEM, TARGET_ERRNO__UID_OVERRIDE_SYSTEM, TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD, + TARGET_ERRNO__BPF_OVERRIDE_CPU, + TARGET_ERRNO__BPF_OVERRIDE_PID, + TARGET_ERRNO__BPF_OVERRIDE_UID, + TARGET_ERRNO__BPF_OVERRIDE_THREAD, /* for target__parse_uid() */ TARGET_ERRNO__INVALID_UID, @@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target) return target->system_wide || target->cpu_list; } +static inline bool target__has_bpf(struct target *target) +{ + return target->bpf_str; +} + static inline bool target__none(struct target *target) { return !target__has_task(target) && !target__has_cpu(target); -- 2.24.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu @ 2020-12-28 20:11 ` Arnaldo Carvalho de Melo 2020-12-28 23:43 ` Song Liu 2020-12-29 7:22 ` Namhyung Kim 1 sibling, 1 reply; 25+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-12-28 20:11 UTC (permalink / raw) To: Song Liu Cc: linux-kernel, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, kernel-team Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu: > Introduce perf-stat -b option, which counts events for BPF programs, like: > > [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 > 1.487903822 115,200 ref-cycles > 1.487903822 86,012 cycles > 2.489147029 80,560 ref-cycles > 2.489147029 73,784 cycles > 3.490341825 60,720 ref-cycles > 3.490341825 37,797 cycles > 4.491540887 37,120 ref-cycles > 4.491540887 31,963 cycles > > The example above counts cycles and ref-cycles of BPF program of id 254. > This is similar to bpftool-prog-profile command, but more flexible. > > perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF > programs (monitor-progs) to the target BPF program (target-prog). The > monitor-progs read perf_event before and after the target-prog, and > aggregate the difference in a BPF map. Then the user space reads data > from these maps. > > A new struct bpf_counter is introduced to provide common interface that > uses BPF programs/maps to count perf events. 
Segfaulting here: [root@five ~]# bpftool prog | grep tracepoint 110: tracepoint name syscall_unaugme tag 57cd311f2e27366b gpl 111: tracepoint name sys_enter_conne tag 3555418ac9476139 gpl 112: tracepoint name sys_enter_sendt tag bc7fcadbaf7b8145 gpl 113: tracepoint name sys_enter_open tag 0e59c3ac2bea5280 gpl 114: tracepoint name sys_enter_opena tag 0baf443610f59837 gpl 115: tracepoint name sys_enter_renam tag 24664e4aca62d7fa gpl 116: tracepoint name sys_enter_renam tag 20093e51a8634ebb gpl 117: tracepoint name sys_enter tag 0bc3fc9d11754ba1 gpl 118: tracepoint name sys_exit tag 29c7ae234d79bd5c gpl [root@five ~]# [root@five ~]# gdb perf GNU gdb (GDB) Fedora 10.1-2.fc33 Reading symbols from perf... (gdb) run stat -e instructions,cycles -b 113 -I 1000 Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame Program received signal SIGSEGV, Segmentation fault. 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); (gdb) bt #0 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 #1 0x0000000000000000 in ?? () (gdb) [acme@five perf]$ clang -v |& head -2 clang version 11.0.0 (Fedora 11.0.0-2.fc33) Target: x86_64-unknown-linux-gnu [acme@five perf]$ Do you need any extra info? Please when resubmitting, please combine patches 3/4 and 4/4, man pages updates usually come together with the new feature. 
Thanks, - Arnaldo Full build output: [acme@five perf]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;make VF=1 O=/tmp/build/perf -C tools/perf BUILD_BPF_SKEL=1 install-bin make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j24' parallel build HOSTCC /tmp/build/perf/fixdep.o HOSTLD /tmp/build/perf/fixdep-in.o LINK /tmp/build/perf/fixdep Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Auto-detecting system features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] ... libbfd: [ on ] ... libbfd-buildid: [ on ] ... libcap: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ... libdw-dwarf-unwind: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] ... disassembler-four-args: [ on ] ... backtrace: [ on ] ... eventfd: [ on ] ... fortify-source: [ on ] ... sync-compare-and-swap: [ on ] ... get_current_dir_name: [ on ] ... gettid: [ on ] ... libelf-getphdrnum: [ on ] ... libelf-gelf_getnote: [ on ] ... libelf-getshdrstrndx: [ on ] ... libpython-version: [ on ] ... libslang: [ on ] ... libslang-include-subdir: [ on ] ... pthread-attr-setaffinity-np: [ on ] ... pthread-barrier: [ on ] ... reallocarray: [ on ] ... stackprotector-all: [ on ] ... timerfd: [ on ] ... sched_getcpu: [ on ] ... sdt: [ on ] ... setns: [ on ] ... file-handle: [ on ] ... bionic: [ OFF ] ... compile-32: [ OFF ] ... compile-x32: [ OFF ] ... cplus-demangle: [ on ] ... gtk2: [ OFF ] ... gtk2-infobar: [ OFF ] ... hello: [ OFF ] ... libbabeltrace: [ on ] ... libbfd-liberty: [ OFF ] ... libbfd-liberty-z: [ OFF ] ... libopencsd: [ OFF ] ... libunwind-x86: [ OFF ] ... libunwind-x86_64: [ OFF ] ... 
libunwind-arm: [ OFF ] ... libunwind-aarch64: [ OFF ] ... libunwind-debug-frame: [ OFF ] ... libunwind-debug-frame-arm: [ OFF ] ... libunwind-debug-frame-aarch64: [ OFF ] ... cxx: [ OFF ] ... llvm: [ OFF ] ... llvm-version: [ OFF ] ... clang: [ OFF ] ... libbpf: [ OFF ] ... libpfm4: [ OFF ] ... libdebuginfod: [ on ] ... clang-bpf-co-re: [ on ] ... prefix: /home/acme ... bindir: /home/acme/bin ... libdir: /home/acme/lib64 ... sysconfdir: /home/acme/etc ... LIBUNWIND_DIR: ... LIBDW_DIR: ... JDIR: /usr/lib/jvm/java-11-openjdk-11.0.9.11-4.fc33.x86_64 ... DWARF post unwind library: libunwind GEN /tmp/build/perf/common-cmds.h CFLAGS= make -C ../bpf/bpftool \ OUTPUT=/tmp/build/perf/util/bpf_skel/.tmp/ bootstrap CC /tmp/build/perf/exec-cmd.o CC /tmp/build/perf/help.o MKDIR /tmp/build/perf/pmu-events/ MKDIR /tmp/build/perf/jvmti/ MKDIR /tmp/build/perf/fd/ MKDIR /tmp/build/perf/fs/ MKDIR /tmp/build/perf/fs/ HOSTCC /tmp/build/perf/pmu-events/json.o CC /tmp/build/perf/parse-options.o CC /tmp/build/perf/fd/array.o CC /tmp/build/perf/pager.o CC /tmp/build/perf/jvmti/libjvmti.o CC /tmp/build/perf/fs/fs.o CC /tmp/build/perf/run-command.o MKDIR /tmp/build/perf/jvmti/ CC /tmp/build/perf/sigchain.o MKDIR /tmp/build/perf/fs/ CC /tmp/build/perf/fs/tracing_path.o CC /tmp/build/perf/fs/cgroup.o MKDIR /tmp/build/perf/pmu-events/ CC /tmp/build/perf/jvmti/libstring.o HOSTCC /tmp/build/perf/pmu-events/jevents.o CC /tmp/build/perf/jvmti/libctype.o CC /tmp/build/perf/subcmd-config.o CC /tmp/build/perf/jvmti/jvmti_agent.o LD /tmp/build/perf/fd/libapi-in.o CC /tmp/build/perf/event-parse.o HOSTCC /tmp/build/perf/pmu-events/jsmn.o CC /tmp/build/perf/event-plugin.o CC /tmp/build/perf/cpu.o CC /tmp/build/perf/trace-seq.o CC /tmp/build/perf/core.o CC /tmp/build/perf/parse-filter.o CC /tmp/build/perf/debug.o CC /tmp/build/perf/cpumap.o LD /tmp/build/perf/fs/libapi-in.o CC /tmp/build/perf/threadmap.o LD /tmp/build/perf/libsubcmd-in.o HOSTLD /tmp/build/perf/pmu-events/jevents-in.o CC 
/tmp/build/perf/str_error_r.o CC /tmp/build/perf/evsel.o GEN /tmp/build/perf/bpf_helper_defs.h CC /tmp/build/perf/evlist.o CC /tmp/build/perf/parse-utils.o CC /tmp/build/perf/zalloc.o CC /tmp/build/perf/kbuffer-parse.o LD /tmp/build/perf/jvmti/jvmti-in.o CC /tmp/build/perf/mmap.o CC /tmp/build/perf/tep_strerror.o CC /tmp/build/perf/xyarray.o CC /tmp/build/perf/event-parse-api.o CC /tmp/build/perf/lib.o LINK /tmp/build/perf/pmu-events/jevents LD /tmp/build/perf/libapi-in.o AR /tmp/build/perf/libsubcmd.a LINK /tmp/build/perf/libperf-jvmti.so LD /tmp/build/perf/libtraceevent-in.o CC /tmp/build/perf/plugin_hrtimer.o CC /tmp/build/perf/plugin_kmem.o CC /tmp/build/perf/plugin_mac80211.o CC /tmp/build/perf/plugin_kvm.o CC /tmp/build/perf/plugin_jbd2.o LD /tmp/build/perf/libperf-in.o CC /tmp/build/perf/plugin_sched_switch.o CC /tmp/build/perf/plugin_function.o CC /tmp/build/perf/plugin_scsi.o CC /tmp/build/perf/plugin_xen.o CC /tmp/build/perf/plugin_futex.o CC /tmp/build/perf/plugin_cfg80211.o CC /tmp/build/perf/plugin_tlb.o AR /tmp/build/perf/libapi.a LD /tmp/build/perf/plugin_hrtimer-in.o LINK /tmp/build/perf/libtraceevent.a LD /tmp/build/perf/plugin_kvm-in.o LD /tmp/build/perf/plugin_scsi-in.o LD /tmp/build/perf/plugin_kmem-in.o LD /tmp/build/perf/plugin_mac80211-in.o LD /tmp/build/perf/plugin_futex-in.o LD /tmp/build/perf/plugin_function-in.o LD /tmp/build/perf/plugin_xen-in.o LD /tmp/build/perf/plugin_sched_switch-in.o LD /tmp/build/perf/plugin_tlb-in.o LD /tmp/build/perf/plugin_jbd2-in.o LINK /tmp/build/perf/plugin_hrtimer.so LINK /tmp/build/perf/plugin_kmem.so AR /tmp/build/perf/libperf.a LINK /tmp/build/perf/plugin_scsi.so LINK /tmp/build/perf/plugin_kvm.so LINK /tmp/build/perf/plugin_mac80211.so LD /tmp/build/perf/plugin_cfg80211-in.o LINK /tmp/build/perf/plugin_futex.so LINK /tmp/build/perf/plugin_xen.so LINK /tmp/build/perf/plugin_function.so LINK /tmp/build/perf/plugin_tlb.so LINK /tmp/build/perf/plugin_jbd2.so LINK /tmp/build/perf/plugin_cfg80211.so LINK 
/tmp/build/perf/plugin_sched_switch.so GEN /tmp/build/perf/pmu-events/pmu-events.c GEN /tmp/build/perf/libtraceevent-dynamic-list MKDIR /tmp/build/perf/staticobjs/ MKDIR /tmp/build/perf/staticobjs/ MKDIR /tmp/build/perf/staticobjs/ MKDIR /tmp/build/perf/staticobjs/ MKDIR /tmp/build/perf/staticobjs/ MKDIR /tmp/build/perf/staticobjs/ MKDIR /tmp/build/perf/staticobjs/ PERF_VERSION = 5.11.rc1.g5eb0b370de61 MKDIR /tmp/build/perf/staticobjs/ CC /tmp/build/perf/staticobjs/libbpf_probes.o CC /tmp/build/perf/staticobjs/libbpf.o CC /tmp/build/perf/staticobjs/bpf.o CC /tmp/build/perf/staticobjs/nlattr.o CC /tmp/build/perf/staticobjs/btf.o CC /tmp/build/perf/staticobjs/xsk.o GEN perf-archive CC /tmp/build/perf/staticobjs/hashmap.o GEN perf-with-kcore CC /tmp/build/perf/staticobjs/btf_dump.o CC /tmp/build/perf/staticobjs/libbpf_errno.o CC /tmp/build/perf/staticobjs/str_error.o CC /tmp/build/perf/staticobjs/bpf_prog_linfo.o CC /tmp/build/perf/staticobjs/netlink.o CC /tmp/build/perf/staticobjs/ringbuf.o LD /tmp/build/perf/staticobjs/libbpf-in.o LINK /tmp/build/perf/libbpf.a CLANG /tmp/build/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o DESCEND plugins GEN /tmp/build/perf/python/perf.so CC /tmp/build/perf/plugins/plugin_jbd2.o CC /tmp/build/perf/plugins/plugin_kmem.o CC /tmp/build/perf/plugins/plugin_hrtimer.o CC /tmp/build/perf/plugins/plugin_mac80211.o CC /tmp/build/perf/plugins/plugin_kvm.o CC /tmp/build/perf/plugins/plugin_function.o CC /tmp/build/perf/plugins/plugin_xen.o CC /tmp/build/perf/plugins/plugin_sched_switch.o CC /tmp/build/perf/plugins/plugin_futex.o CC /tmp/build/perf/plugins/plugin_scsi.o CC /tmp/build/perf/plugins/plugin_tlb.o CC /tmp/build/perf/plugins/plugin_cfg80211.o LD /tmp/build/perf/plugins/plugin_jbd2-in.o LD /tmp/build/perf/plugins/plugin_kmem-in.o LD /tmp/build/perf/plugins/plugin_hrtimer-in.o LD /tmp/build/perf/plugins/plugin_kvm-in.o LD /tmp/build/perf/plugins/plugin_mac80211-in.o LD /tmp/build/perf/plugins/plugin_function-in.o LD 
/tmp/build/perf/plugins/plugin_xen-in.o LD /tmp/build/perf/plugins/plugin_sched_switch-in.o LD /tmp/build/perf/plugins/plugin_scsi-in.o LD /tmp/build/perf/plugins/plugin_futex-in.o LD /tmp/build/perf/plugins/plugin_cfg80211-in.o LD /tmp/build/perf/plugins/plugin_tlb-in.o LINK /tmp/build/perf/plugins/plugin_jbd2.so LINK /tmp/build/perf/plugins/plugin_hrtimer.so LINK /tmp/build/perf/plugins/plugin_kmem.so LINK /tmp/build/perf/plugins/plugin_mac80211.so LINK /tmp/build/perf/plugins/plugin_kvm.so LINK /tmp/build/perf/plugins/plugin_sched_switch.so LINK /tmp/build/perf/plugins/plugin_scsi.so LINK /tmp/build/perf/plugins/plugin_xen.so LINK /tmp/build/perf/plugins/plugin_function.so LINK /tmp/build/perf/plugins/plugin_futex.so LINK /tmp/build/perf/plugins/plugin_tlb.so LINK /tmp/build/perf/plugins/plugin_cfg80211.so INSTALL trace_plugins CC /tmp/build/perf/pmu-events/pmu-events.o LD /tmp/build/perf/pmu-events/pmu-events-in.o Auto-detecting system features: ... libbfd: [ on ] ... disassembler-four-args: [ on ] ... zlib: [ on ] ... libcap: [ on ] ... clang-bpf-co-re: [ on ] ... reallocarray: [ on ] MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/ MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/ CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/main.o CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/common.o CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/btf.o CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/json_writer.o CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/gen.o Auto-detecting system features: ... libelf: [ on ] ... zlib: [ on ] ... 
...                        bpf: [ on  ]

  GEN      /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/bpf_helper_defs.h
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_probes.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf.o
  <SNIP remaining libbpf objects: nlattr, xsk, libbpf_errno, bpf, netlink, str_error, hashmap, bpf_prog_linfo, btf_dump, ringbuf>
  LD       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf-in.o
  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/libbpf.a
  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/bpftool
  GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
libbpf: elf: skipping unrecognized data section(9) .eh_frame
libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
  CC       /tmp/build/perf/builtin-bench.o
  CC       /tmp/build/perf/builtin-annotate.o
  CC       /tmp/build/perf/builtin-stat.o
  <SNIP the rest of the perf build: builtins, bench/, tests/, ui/, arch/x86/, trace/beauty/, scripting engines and util/ objects>
  CC       /tmp/build/perf/util/unwind-libunwind-local.o
  CC       /tmp/build/perf/util/unwind-libunwind.o
  <SNIP remaining util/ objects and flex/bison generated sources>
  LD       /tmp/build/perf/util/perf-in.o
  LD       /tmp/build/perf/perf-in.o
  LINK     /tmp/build/perf/perf
  INSTALL  tests
  INSTALL  binaries
  INSTALL  libperf-jvmti.so
  INSTALL  libexec
  INSTALL  bpf-headers
  INSTALL  bpf-examples
  INSTALL  perf-archive
  INSTALL  perf-with-kcore
  INSTALL  strace/groups
  INSTALL  perl-scripts
  INSTALL  python-scripts
  INSTALL  perf_completion-script
  INSTALL  perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
[acme@five perf]$

> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  tools/perf/Makefile.perf                      |   2 +-
>  tools/perf/builtin-stat.c                     |  77 ++++-
>  tools/perf/util/Build                         |   1 +
>  tools/perf/util/bpf_counter.c                 | 296 ++++++++++++++++++
>  tools/perf/util/bpf_counter.h                 |  72 +++++
>  .../util/bpf_skel/bpf_prog_profiler.bpf.c     |  93 ++++++
>  tools/perf/util/evsel.c                       |   9 +
>  tools/perf/util/evsel.h                       |   6 +
>  tools/perf/util/stat-display.c                |   4 +-
>  tools/perf/util/stat.c                        |   2 +-
>  tools/perf/util/target.c                      |  34 +-
>  tools/perf/util/target.h                      |  10 +
>  12 files changed, 588 insertions(+), 18 deletions(-)
>  create mode 100644 tools/perf/util/bpf_counter.c
>  create mode 100644 tools/perf/util/bpf_counter.h
>  create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index d182a2dbb9bbd..8c4e039c3b813 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -1015,7 +1015,7 @@ python-clean:
>
>  SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
>  SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
> -SKELETONS :=
> +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
>
>  ifdef BUILD_BPF_SKEL
>  BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 8cc24967bc273..09bffb3fbcdd4 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -67,6 +67,7 @@
>  #include "util/top.h"
>  #include "util/affinity.h"
>  #include "util/pfm.h"
> +#include "util/bpf_counter.h"
>  #include "asm/bug.h"
>
>  #include <linux/time64.h>
> @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs)
>  	return 0;
>  }
>
> +static int read_bpf_map_counters(void)
> +{
> +	struct evsel *counter;
> +	int err;
> +
> +	evlist__for_each_entry(evsel_list, counter) {
> +		err = bpf_counter__read(counter);
> +		if (err)
> +			return err;
> +	}
> +	return 0;
> +}
> +
>  static void read_counters(struct timespec *rs)
>  {
>  	struct evsel *counter;
> +	int err;
>
> -	if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0))
> -		return;
> +	if (!stat_config.stop_read_counter) {
> +		err = read_bpf_map_counters();
> +		if (err == -EAGAIN)
> +			err = read_affinity_counters(rs);
> +		if (err < 0)
> +			return;
> +	}
>
>  	evlist__for_each_entry(evsel_list, counter) {
>  		if (counter->err)
> @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times)
>  	return false;
>  }
>
> -static void enable_counters(void)
> +static int enable_counters(void)
>  {
> +	struct evsel *evsel;
> +	int err;
> +
> +	evlist__for_each_entry(evsel_list, evsel) {
> +		err = bpf_counter__enable(evsel);
> +		if (err)
> +			return err;
> +	}
> +
>  	if (stat_config.initial_delay < 0) {
>  		pr_info(EVLIST_DISABLED_MSG);
> -		return;
> +		return 0;
>  	}
>
>  	if (stat_config.initial_delay > 0) {
> @@ -518,6 +547,7 @@ static void enable_counters(void)
>  		if (stat_config.initial_delay > 0)
>  			pr_info(EVLIST_ENABLED_MSG);
>  	}
> +	return 0;
>  }
>
>  static void disable_counters(void)
> @@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	const bool forks = (argc > 0);
>  	bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
>  	struct affinity affinity;
> -	int i, cpu;
> +	int i, cpu, err;
>  	bool second_pass = false;
>
>  	if (forks) {
> @@ -737,6 +767,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	if (affinity__setup(&affinity) < 0)
>  		return -1;
>
> +	evlist__for_each_entry(evsel_list, counter) {
> +		if (bpf_counter__load(counter, &target))
> +			return -1;
> +	}
> +
>  	evlist__for_each_cpu (evsel_list, i, cpu) {
>  		affinity__set(&affinity, cpu);
>
> @@ -850,7 +885,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	}
>
>  	if (STAT_RECORD) {
> -		int err, fd = perf_data__fd(&perf_stat.data);
> +		int fd = perf_data__fd(&perf_stat.data);
>
>  		if (is_pipe) {
>  			err = perf_header__write_pipe(perf_data__fd(&perf_stat.data));
> @@ -876,7 +911,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>
>  	if (forks) {
>  		evlist__start_workload(evsel_list);
> -		enable_counters();
> +		err = enable_counters();
> +		if (err)
> +			return -1;
>
>  		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
>  			status = dispatch_events(forks, timeout, interval, &times);
> @@ -895,7 +932,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  		if (WIFSIGNALED(status))
>  			psignal(WTERMSIG(status), argv[0]);
>  	} else {
> -		enable_counters();
> +		err = enable_counters();
> +		if (err)
> +			return -1;
>  		status = dispatch_events(forks, timeout, interval, &times);
>  	}
>
> @@ -1085,6 +1124,10 @@ static struct option stat_options[] = {
>  		    "stat events on existing process id"),
>  	OPT_STRING('t', "tid", &target.tid, "tid",
>  		    "stat events on existing thread id"),
> +#ifdef HAVE_BPF_SKEL
> +	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
> +		   "stat events on existing bpf program id"),
> +#endif
>  	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
>  		    "system-wide collection from all CPUs"),
>  	OPT_BOOLEAN('g', "group", &group,
> @@ -2064,11 +2107,12 @@ int cmd_stat(int argc, const char **argv)
>  		"perf stat [<options>] [<command>]",
>  		NULL
>  	};
> -	int status = -EINVAL, run_idx;
> +	int status = -EINVAL, run_idx, err;
>  	const char *mode;
>  	FILE *output = stderr;
>  	unsigned int interval, timeout;
>  	const char * const stat_subcommands[] = { "record", "report" };
> +	char errbuf[BUFSIZ];
>
>  	setlocale(LC_ALL, "");
>
> @@ -2179,6 +2223,12 @@ int cmd_stat(int argc, const char **argv)
>  	} else if (big_num_opt == 0) /* User passed --no-big-num */
>  		stat_config.big_num = false;
>
> +	err = target__validate(&target);
> +	if (err) {
> +		target__strerror(&target, err, errbuf, BUFSIZ);
> +		pr_warning("%s\n", errbuf);
> +	}
> +
>  	setup_system_wide(argc);
>
>  	/*
> @@ -2252,8 +2302,6 @@ int cmd_stat(int argc, const char **argv)
>  		}
>  	}
>
> -	target__validate(&target);
> -
>  	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
>  		target.per_thread = true;
>
> @@ -2384,9 +2432,10 @@ int cmd_stat(int argc, const char **argv)
>  		 * tools remain  -acme
>  		 */
>  		int fd = perf_data__fd(&perf_stat.data);
> -		int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
> -							     process_synthesized_event,
> -							     &perf_stat.session->machines.host);
> +
> +		err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
> +							 process_synthesized_event,
> +							 &perf_stat.session->machines.host);
>  		if (err) {
>  			pr_warning("Couldn't synthesize the kernel mmap record, harmless, "
>  				   "older tools may produce warnings about this file\n.");
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index e2563d0154eb6..188521f343470 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -135,6 +135,7 @@ perf-y += clockid.o
>
>  perf-$(CONFIG_LIBBPF) += bpf-loader.o
>  perf-$(CONFIG_LIBBPF) += bpf_map.o
> +perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
>  perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
>  perf-$(CONFIG_LIBELF) += symbol-elf.o
>  perf-$(CONFIG_LIBELF) += probe-file.o
> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> new file mode 100644
> index 0000000000000..f2cb86a40c882
> --- /dev/null
> +++ b/tools/perf/util/bpf_counter.c
> @@ -0,0 +1,296 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/* Copyright (c) 2019 Facebook  */
> +
> +#include <limits.h>
> +#include <unistd.h>
> +#include <sys/time.h>
> +#include <sys/resource.h>
> +#include <linux/err.h>
> +#include <linux/zalloc.h>
> +#include <bpf/bpf.h>
> +#include <bpf/btf.h>
> +#include <bpf/libbpf.h>
> +
> +#include "bpf_counter.h"
> +#include "counts.h"
> +#include "debug.h"
> +#include "evsel.h"
> +#include "target.h"
> +
> +#include "bpf_skel/bpf_prog_profiler.skel.h"
> +
> +static inline void *u64_to_ptr(__u64 ptr)
> +{
> +	return (void *)(unsigned long)ptr;
> +}
> +
> +static void set_max_rlimit(void)
> +{
> +	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
> +
> +	setrlimit(RLIMIT_MEMLOCK, &rinf);
> +}
> +
> +static struct bpf_counter *bpf_counter_alloc(void)
> +{
> +	struct bpf_counter *counter;
> +
> +	counter = zalloc(sizeof(*counter));
> +	if (counter)
> +		INIT_LIST_HEAD(&counter->list);
> +	return counter;
> +}
> +
> +static int bpf_program_profiler__destroy(struct evsel *evsel)
> +{
> +	struct bpf_counter *counter;
> +
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list)
> +		bpf_prog_profiler_bpf__destroy(counter->skel);
> +	INIT_LIST_HEAD(&evsel->bpf_counter_list);
> +	return 0;
> +}
> +
> +static char *bpf_target_prog_name(int tgt_fd)
> +{
> +	struct bpf_prog_info_linear *info_linear;
> +	struct bpf_func_info *func_info;
> +	const struct btf_type *t;
> +	char *name = NULL;
> +	struct btf *btf;
> +
> +	info_linear = bpf_program__get_prog_info_linear(
> +		tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO);
> +	if (IS_ERR_OR_NULL(info_linear)) {
> +		pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd);
> +		return NULL;
> +	}
> +
> +	if (info_linear->info.btf_id == 0 ||
> +	    btf__get_from_id(info_linear->info.btf_id, &btf)) {
> +		pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd);
> +		goto out;
> +	}
> +
> +	func_info = u64_to_ptr(info_linear->info.func_info);
> +	t = btf__type_by_id(btf, func_info[0].type_id);
> +	if (!t) {
> +		pr_debug("btf %d doesn't have type %d\n",
> +			 info_linear->info.btf_id, func_info[0].type_id);
> +		goto out;
> +	}
> +	name = strdup(btf__name_by_offset(btf, t->name_off));
> +out:
> +	free(info_linear);
> +	return name;
> +}
> +
> +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
> +{
> +	struct bpf_prog_profiler_bpf *skel;
> +	struct bpf_counter *counter;
> +	struct bpf_program *prog;
> +	char *prog_name;
> +	int prog_fd;
> +	int err;
> +
> +	prog_fd = bpf_prog_get_fd_by_id(prog_id);
> +	if (prog_fd < 0) {
> +		pr_err("Failed to open fd for bpf prog %u\n", prog_id);
> +		return -1;
> +	}
> +	counter = bpf_counter_alloc();
> +	if (!counter) {
> +		close(prog_fd);
> +		return -1;
> +	}
> +
> +	skel = bpf_prog_profiler_bpf__open();
> +	if (!skel) {
> +		pr_err("Failed to open bpf skeleton\n");
> +		goto err_out;
> +	}
> +	skel->rodata->num_cpu = evsel__nr_cpus(evsel);
> +
> +	bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
> +	bpf_map__resize(skel->maps.fentry_readings, 1);
> +	bpf_map__resize(skel->maps.accum_readings, 1);
> +
> +	prog_name = bpf_target_prog_name(prog_fd);
> +	if (!prog_name) {
> +		pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id);
> +		goto err_out;
> +	}
> +
> +	bpf_object__for_each_program(prog, skel->obj) {
> +		err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
> +		if (err) {
> +			pr_err("bpf_program__set_attach_target failed.\n"
> +			       "Does bpf prog %u have BTF?\n", prog_id);
> +			goto err_out;
> +		}
> +	}
> +	set_max_rlimit();
> +	err = bpf_prog_profiler_bpf__load(skel);
> +	if (err) {
> +		pr_err("bpf_prog_profiler_bpf__load failed\n");
> +		goto err_out;
> +	}
> +
> +	counter->skel = skel;
> +	list_add(&counter->list, &evsel->bpf_counter_list);
> +	close(prog_fd);
> +	return 0;
> +err_out:
> +	free(counter);
> +	close(prog_fd);
> +	return -1;
> +}
> +
> +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target)
> +{
> +	char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p;
> +	u32 prog_id;
> +	int ret;
> +
> +	bpf_str_ = bpf_str = strdup(target->bpf_str);
> +	if (!bpf_str)
> +		return -1;
> +
> +	while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) {
> +		prog_id = strtoul(tok, &p, 10);
> +		if (prog_id == 0 || prog_id == UINT_MAX ||
> +		    (*p != '\0' && *p != ',')) {
> +			pr_err("Failed to parse bpf prog ids %s\n",
> +			       target->bpf_str);
> +			return -1;
> +		}
> +
> +		ret = bpf_program_profiler_load_one(evsel, prog_id);
> +		if (ret) {
> +			bpf_program_profiler__destroy(evsel);
> +			free(bpf_str_);
> +			return -1;
> +		}
> +		bpf_str = NULL;
> +	}
> +	free(bpf_str_);
> +	return 0;
> +}
> +
> +static int bpf_program_profiler__enable(struct evsel *evsel)
> +{
> +	struct bpf_counter *counter;
> +	int ret;
> +
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +		ret = bpf_prog_profiler_bpf__attach(counter->skel);
> +		if (ret) {
> +			bpf_program_profiler__destroy(evsel);
> +			return ret;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int bpf_program_profiler__read(struct evsel *evsel)
> +{
> +	int num_cpu = evsel__nr_cpus(evsel);
> +	struct bpf_perf_event_value values[num_cpu];
> +	struct bpf_counter *counter;
> +	int reading_map_fd;
> +	__u32 key = 0;
> +	int err, cpu;
> +
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return -EAGAIN;
> +
> +	for (cpu = 0; cpu < num_cpu; cpu++) {
> +		perf_counts(evsel->counts, cpu, 0)->val = 0;
> +		perf_counts(evsel->counts, cpu, 0)->ena = 0;
> +		perf_counts(evsel->counts, cpu, 0)->run = 0;
> +	}
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +		struct bpf_prog_profiler_bpf *skel = counter->skel;
> +
> +		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> +
> +		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
> +		if (err) {
> +			fprintf(stderr, "failed to read value\n");
> +			return err;
> +		}
> +
> +		for (cpu = 0; cpu < num_cpu; cpu++) {
> +			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
> +			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
> +			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
> +					    int fd)
> +{
> +	struct bpf_prog_profiler_bpf *skel;
> +	struct bpf_counter *counter;
> +	int ret;
> +
> +	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
> +		skel = counter->skel;
> +		ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
> +					  &cpu, &fd, BPF_ANY);
> +		if (ret)
> +			return ret;
> +	}
> +	return 0;
> +}
> +
> +struct bpf_counter_ops bpf_program_profiler_ops = {
> +	.load = bpf_program_profiler__load,
> +	.enable = bpf_program_profiler__enable,
> +	.read = bpf_program_profiler__read,
> +	.destroy = bpf_program_profiler__destroy,
> +	.install_pe = bpf_program_profiler__install_pe,
> +};
> +
> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
> +{
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return 0;
> +	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
> +}
> +
> +int bpf_counter__load(struct evsel *evsel, struct target *target)
> +{
> +	if (target__has_bpf(target))
> +		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
> +
> +	if (evsel->bpf_counter_ops)
> +		return evsel->bpf_counter_ops->load(evsel, target);
> +	return 0;
> +}
> +
> +int bpf_counter__enable(struct evsel *evsel)
> +{
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return 0;
> +	return evsel->bpf_counter_ops->enable(evsel);
> +}
> +
> +int bpf_counter__read(struct evsel *evsel)
> +{
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return -EAGAIN;
> +	return evsel->bpf_counter_ops->read(evsel);
> +}
> +
> +void bpf_counter__destroy(struct evsel *evsel)
> +{
> +	if (list_empty(&evsel->bpf_counter_list))
> +		return;
> +	evsel->bpf_counter_ops->destroy(evsel);
> +	evsel->bpf_counter_ops = NULL;
> +}
> diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h
> new file mode 100644
> index 0000000000000..2eca210e5dc16
> --- /dev/null
> +++ b/tools/perf/util/bpf_counter.h
> @@ -0,0 +1,72 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __PERF_BPF_COUNTER_H
> +#define __PERF_BPF_COUNTER_H 1
> +
> +#include <linux/list.h>
> +
> +struct evsel;
> +struct target;
> +struct bpf_counter;
> +
> +typedef int (*bpf_counter_evsel_op)(struct evsel *evsel);
> +typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel,
> +					   struct target *target);
> +typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel,
> +					       int cpu,
> +					       int fd);
> +
> +struct bpf_counter_ops {
> +	bpf_counter_evsel_target_op load;
> +	bpf_counter_evsel_op enable;
> +	bpf_counter_evsel_op read;
> +	bpf_counter_evsel_op destroy;
> +	bpf_counter_evsel_install_pe_op install_pe;
> +};
> +
> +struct bpf_counter {
> +	void *skel;
> +	struct list_head list;
> +};
> +
> +#ifdef HAVE_BPF_SKEL
> +
> +int bpf_counter__load(struct evsel *evsel, struct target *target);
> +int bpf_counter__enable(struct evsel *evsel);
> +int bpf_counter__read(struct evsel *evsel);
> +void bpf_counter__destroy(struct evsel *evsel);
> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
> +
> +#else /* HAVE_BPF_SKEL */
> +
> +#include <linux/err.h>
> +
> +static inline int bpf_counter__load(struct evsel *evsel __maybe_unused,
> +				    struct target *target __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +static inline int bpf_counter__read(struct evsel *evsel __maybe_unused)
> +{
> +	return -EAGAIN;
> +}
> +
> +static inline void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
> +{
> +}
> +
> +static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused,
> +					  int cpu __maybe_unused,
> +					  int fd __maybe_unused)
> +{
> +	return 0;
> +}
> +
> +#endif /* HAVE_BPF_SKEL */
> +
> +#endif /* __PERF_BPF_COUNTER_H */
> diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
> new file mode 100644
> index 0000000000000..c7cec92d02360
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
> @@ -0,0 +1,93 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +// Copyright (c) 2020 Facebook
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +/* map of perf event fds, num_cpu * num_metric entries */
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(int));
> +} events SEC(".maps");
> +
> +/* readings at fentry */
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
> +	__uint(max_entries, 1);
> +} fentry_readings SEC(".maps");
> +
> +/* accumulated readings */
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
> +	__uint(max_entries, 1);
> +} accum_readings SEC(".maps");
> +
> +const volatile __u32 num_cpu = 1;
> +
> +SEC("fentry/XXX")
> +int BPF_PROG(fentry_XXX)
> +{
> +	__u32 key = bpf_get_smp_processor_id();
> +	struct bpf_perf_event_value *ptr;
> +	__u32 zero = 0;
> +	long err;
> +
> +	/* look up before reading, to reduce error */
> +	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
> +	if (!ptr)
> +		return 0;
> +
> +	err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr));
> +	if (err)
> +		return 0;
> +
> +	return 0;
> +}
> +
> +static inline void
> +fexit_update_maps(struct bpf_perf_event_value *after)
> +{
> +	struct bpf_perf_event_value *before, diff, *accum;
> +	__u32 zero = 0;
> +
> +	before = bpf_map_lookup_elem(&fentry_readings, &zero);
> +	/* only account samples with a valid fentry_reading */
> +	if (before && before->counter) {
> +		struct bpf_perf_event_value *accum;
> +
> +		diff.counter = after->counter - before->counter;
> +		diff.enabled = after->enabled - before->enabled;
> +		diff.running = after->running - before->running;
> +
> +		accum = bpf_map_lookup_elem(&accum_readings, &zero);
> +		if (accum) {
> +			accum->counter += diff.counter;
> +			accum->enabled += diff.enabled;
> +			accum->running += diff.running;
> +		}
> +	}
> +}
> +
> +SEC("fexit/XXX")
> +int BPF_PROG(fexit_XXX)
> +{
> +	struct bpf_perf_event_value reading;
> +	__u32 cpu = bpf_get_smp_processor_id();
> +	__u32 one = 1, zero = 0;
> +	int err;
> +
> +	/* read all events before updating the maps, to reduce error */
> +	err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading));
> +	if (err)
> +		return 0;
> +
> +	fexit_update_maps(&reading);
> +	return 0;
> +}
> +
> +char LICENSE[] SEC("license") = "Dual BSD/GPL";
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index c26ea82220bd8..7265308765d73 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -25,6 +25,7 @@
>  #include <stdlib.h>
>  #include <perf/evsel.h>
>  #include "asm/bug.h"
> +#include "bpf_counter.h"
>  #include "callchain.h"
>  #include "cgroup.h"
>  #include "counts.h"
> @@ -51,6 +52,10 @@
>  #include <internal/lib.h>
>
>  #include <linux/ctype.h>
> +#include <bpf/bpf.h>
> +#include <bpf/libbpf.h>
> +#include <bpf/btf.h>
> +#include "rlimit.h"
>
>  struct perf_missing_features perf_missing_features;
>
> @@ -247,6 +252,7 @@ void evsel__init(struct evsel *evsel,
>  	evsel->bpf_obj	   = NULL;
>  	evsel->bpf_fd	   = -1;
>  	INIT_LIST_HEAD(&evsel->config_terms);
> +	INIT_LIST_HEAD(&evsel->bpf_counter_list);
>  	perf_evsel__object.init(evsel);
>  	evsel->sample_size = __evsel__sample_size(attr->sample_type);
>  	evsel__calc_id_pos(evsel);
> @@ -1366,6 +1372,7 @@ void evsel__exit(struct evsel *evsel)
>  {
>  	assert(list_empty(&evsel->core.node));
>  	assert(evsel->evlist == NULL);
> +	bpf_counter__destroy(evsel);
>  	evsel__free_counts(evsel);
>  	perf_evsel__free_fd(&evsel->core);
>  	perf_evsel__free_id(&evsel->core);
> @@ -1781,6 +1788,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>
>  			FD(evsel, cpu, thread) = fd;
>
> +			bpf_counter__install_pe(evsel, cpu, fd);
> +
>  			if (unlikely(test_attr__enabled)) {
>  				test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
>  						fd, group_fd, flags);
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index cd1d8dd431997..40e3946cd7518 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -10,6 +10,7 @@
>  #include <internal/evsel.h>
>  #include <perf/evsel.h>
>  #include "symbol_conf.h"
> +#include "bpf_counter.h"
>  #include <internal/cpumap.h>
>
>  struct bpf_object;
> @@ -17,6 +18,8 @@ struct cgroup;
>  struct perf_counts;
>  struct perf_stat_evsel;
>  union perf_event;
> +struct bpf_counter_ops;
> +struct target;
>
>  typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
>
> @@ -127,6 +130,8 @@ struct evsel {
>  	 * See also evsel__has_callchain().
>  	 */
>  	__u64			synth_sample_type;
> +	struct list_head	bpf_counter_list;
> +	struct bpf_counter_ops	*bpf_counter_ops;
>  };
>
>  struct perf_missing_features {
> @@ -424,4 +429,5 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel)
>  struct perf_env *evsel__env(struct evsel *evsel);
>
>  int evsel__store_ids(struct evsel *evsel, struct evlist *evlist);
> +
>  #endif /* __PERF_EVSEL_H */
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 583ae4f09c5d1..cce7a76d6473c 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -1045,7 +1045,9 @@ static void print_header(struct perf_stat_config *config,
>  	if (!config->csv_output) {
>  		fprintf(output, "\n");
>  		fprintf(output, " Performance counter stats for ");
> -		if (_target->system_wide)
> +		if (_target->bpf_str)
> +			fprintf(output, "\'BPF program(s) %s", _target->bpf_str);
> +		else if (_target->system_wide)
>  			fprintf(output, "\'system wide");
>  		else if (_target->cpu_list)
>  			fprintf(output, "\'CPU(s) %s", _target->cpu_list);
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 8ce1479c98f03..0b3957323f668 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -527,7 +527,7 @@ int create_perf_stat_counter(struct evsel *evsel,
>  	if (leader->core.nr_members > 1)
>  		attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
>
> -	attr->inherit = !config->no_inherit;
> +	attr->inherit = !config->no_inherit && list_empty(&evsel->bpf_counter_list);
>
>  	/*
>  	 * Some events get initialized with sample_(period/type) set,
> diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
> index a3db13dea937c..0f383418e3df5 100644
> --- a/tools/perf/util/target.c
> +++ b/tools/perf/util/target.c
> @@ -56,6 +56,34 @@ enum target_errno target__validate(struct target *target)
>  		ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
>  	}
>
> +	/* BPF and CPU are mutually exclusive */
> +	if (target->bpf_str && target->cpu_list) {
> +
target->cpu_list = NULL; > + if (ret == TARGET_ERRNO__SUCCESS) > + ret = TARGET_ERRNO__BPF_OVERRIDE_CPU; > + } > + > + /* BPF and PID/TID are mutually exclusive */ > + if (target->bpf_str && target->tid) { > + target->tid = NULL; > + if (ret == TARGET_ERRNO__SUCCESS) > + ret = TARGET_ERRNO__BPF_OVERRIDE_PID; > + } > + > + /* BPF and UID are mutually exclusive */ > + if (target->bpf_str && target->uid_str) { > + target->uid_str = NULL; > + if (ret == TARGET_ERRNO__SUCCESS) > + ret = TARGET_ERRNO__BPF_OVERRIDE_UID; > + } > + > + /* BPF and THREADS are mutually exclusive */ > + if (target->bpf_str && target->per_thread) { > + target->per_thread = false; > + if (ret == TARGET_ERRNO__SUCCESS) > + ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD; > + } > + > /* THREAD and SYSTEM/CPU are mutually exclusive */ > if (target->per_thread && (target->system_wide || target->cpu_list)) { > target->per_thread = false; > @@ -109,6 +137,10 @@ static const char *target__error_str[] = { > "PID/TID switch overriding SYSTEM", > "UID switch overriding SYSTEM", > "SYSTEM/CPU switch overriding PER-THREAD", > + "BPF switch overriding CPU", > + "BPF switch overriding PID/TID", > + "BPF switch overriding UID", > + "BPF switch overriding THREAD", > "Invalid User: %s", > "Problems obtaining information for user %s", > }; > @@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum, > > switch (errnum) { > case TARGET_ERRNO__PID_OVERRIDE_CPU ... 
> - TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD: > + TARGET_ERRNO__BPF_OVERRIDE_THREAD: > snprintf(buf, buflen, "%s", msg); > break; > > diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h > index 6ef01a83b24e9..f132c6c2eef81 100644 > --- a/tools/perf/util/target.h > +++ b/tools/perf/util/target.h > @@ -10,6 +10,7 @@ struct target { > const char *tid; > const char *cpu_list; > const char *uid_str; > + const char *bpf_str; > uid_t uid; > bool system_wide; > bool uses_mmap; > @@ -36,6 +37,10 @@ enum target_errno { > TARGET_ERRNO__PID_OVERRIDE_SYSTEM, > TARGET_ERRNO__UID_OVERRIDE_SYSTEM, > TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD, > + TARGET_ERRNO__BPF_OVERRIDE_CPU, > + TARGET_ERRNO__BPF_OVERRIDE_PID, > + TARGET_ERRNO__BPF_OVERRIDE_UID, > + TARGET_ERRNO__BPF_OVERRIDE_THREAD, > > /* for target__parse_uid() */ > TARGET_ERRNO__INVALID_UID, > @@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target) > return target->system_wide || target->cpu_list; > } > > +static inline bool target__has_bpf(struct target *target) > +{ > + return target->bpf_str; > +} > + > static inline bool target__none(struct target *target) > { > return !target__has_task(target) && !target__has_cpu(target); > -- > 2.24.1 > -- - Arnaldo ^ permalink raw reply [flat|nested] 25+ messages in thread
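[Editor's note] The accounting done by the fentry/fexit skeleton in the patch above can be modeled in plain C without any BPF machinery. This is a hedged sketch, not perf code: `ev_value`, `on_entry` and `on_exit` are illustrative stand-ins mirroring the two BPF_MAP_TYPE_PERCPU_ARRAY slots (`fentry_readings` and `accum_readings`) and the delta-accumulation the monitor programs perform around the target program.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct bpf_perf_event_value (counter/enabled/running). */
struct ev_value {
	uint64_t counter;
	uint64_t enabled;
	uint64_t running;
};

/* Models fentry/XXX: snapshot the counter before the target prog runs. */
static void on_entry(struct ev_value *fentry_reading, const struct ev_value *now)
{
	*fentry_reading = *now;
}

/* Models fexit/XXX: accumulate the delta after the target prog returns. */
static void on_exit(struct ev_value *fentry_reading, struct ev_value *accum,
		    const struct ev_value *now)
{
	/* only account samples with a valid fentry reading */
	if (!fentry_reading->counter)
		return;
	accum->counter += now->counter - fentry_reading->counter;
	accum->enabled += now->enabled - fentry_reading->enabled;
	accum->running += now->running - fentry_reading->running;
}
```

In the real skeleton both slots live in per-CPU array maps, so entry/exit pairs on different CPUs never race; user space later sums the per-CPU `accum_readings` values.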
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs
  2020-12-28 20:11   ` Arnaldo Carvalho de Melo
@ 2020-12-28 23:43     ` Song Liu
  2020-12-29  5:53       ` Song Liu
  2020-12-29 15:15       ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 25+ messages in thread
From: Song Liu @ 2020-12-28 23:43 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland,
	jolsa, Kernel Team

> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
>> Introduce perf-stat -b option, which counts events for BPF programs, like:
>> 
>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>>      1.487903822            115,200      ref-cycles
>>      1.487903822             86,012      cycles
>>      2.489147029             80,560      ref-cycles
>>      2.489147029             73,784      cycles
>>      3.490341825             60,720      ref-cycles
>>      3.490341825             37,797      cycles
>>      4.491540887             37,120      ref-cycles
>>      4.491540887             31,963      cycles
>> 
>> The example above counts cycles and ref-cycles of BPF program of id 254.
>> This is similar to bpftool-prog-profile command, but more flexible.
>> 
>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
>> programs (monitor-progs) to the target BPF program (target-prog). The
>> monitor-progs read perf_event before and after the target-prog, and
>> aggregate the difference in a BPF map. Then the user space reads data
>> from these maps.
>> 
>> A new struct bpf_counter is introduced to provide common interface that
>> uses BPF programs/maps to count perf events.
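[Editor's note] The "common interface" mentioned in the quoted commit message is an ops table attached to each evsel (the patch adds a `struct bpf_counter_ops *bpf_counter_ops` field). The sketch below illustrates that dispatch pattern only; the names `counter`, `counter_ops` and the dummy backend are simplified stand-ins, not perf's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

struct counter;

/* Illustrative ops table, in the spirit of struct bpf_counter_ops:
 * each counting backend fills in function pointers, and generic
 * code dispatches through them without knowing the backend. */
struct counter_ops {
	int  (*load)(struct counter *c);
	int  (*enable)(struct counter *c);
	int  (*read)(struct counter *c);
	void (*destroy)(struct counter *c);
};

struct counter {
	const struct counter_ops *ops;
	long value;
};

/* A trivial backend: "reads" a fixed value instead of a BPF map. */
static int dummy_load(struct counter *c)    { c->value = 0;  return 0; }
static int dummy_enable(struct counter *c)  { (void)c;       return 0; }
static int dummy_read(struct counter *c)    { c->value = 42; return 0; }
static void dummy_destroy(struct counter *c) { c->ops = NULL; }

static const struct counter_ops dummy_ops = {
	.load    = dummy_load,
	.enable  = dummy_enable,
	.read    = dummy_read,
	.destroy = dummy_destroy,
};
```

The appeal of this design is that later counting backends (not just the program profiler) can slot in behind the same generic stat code.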
> 
> Segfaulting here:
> 
> [root@five ~]# bpftool prog | grep tracepoint
> 110: tracepoint name syscall_unaugme tag 57cd311f2e27366b gpl
> 111: tracepoint name sys_enter_conne tag 3555418ac9476139 gpl
> 112: tracepoint name sys_enter_sendt tag bc7fcadbaf7b8145 gpl
> 113: tracepoint name sys_enter_open tag 0e59c3ac2bea5280 gpl
> 114: tracepoint name sys_enter_opena tag 0baf443610f59837 gpl
> 115: tracepoint name sys_enter_renam tag 24664e4aca62d7fa gpl
> 116: tracepoint name sys_enter_renam tag 20093e51a8634ebb gpl
> 117: tracepoint name sys_enter tag 0bc3fc9d11754ba1 gpl
> 118: tracepoint name sys_exit tag 29c7ae234d79bd5c gpl
> [root@five ~]#
> [root@five ~]# gdb perf
> GNU gdb (GDB) Fedora 10.1-2.fc33
> Reading symbols from perf...
> (gdb) run stat -e instructions,cycles -b 113 -I 1000
> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> (gdb) bt
> #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
> #1  0x0000000000000000 in ?? ()
> (gdb)
> 
> [acme@five perf]$ clang -v |& head -2
> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
> Target: x86_64-unknown-linux-gnu
> [acme@five perf]$
> 
> Do you need any extra info?

Hmm... I am not able to reproduce this. I am trying to set up an environment
similar to fc33 (clang 11, etc.). Does this segfault every time, and on all
programs?
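[Editor's note] The crash above is at a dereference of `skel->maps.accum_readings` inside the read path, i.e. a counter whose skeleton pointer is invalid. The sketch below shows the general shape of a defensive read that guards against an unloaded skeleton and then sums per-CPU accumulated values; `profiler_skel` and `profiler_read` are hypothetical stand-ins, not the actual util/bpf_counter.c code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generated skeleton: holds one
 * accumulated counter value per CPU (the accum_readings map). */
struct profiler_skel {
	const uint64_t *percpu;  /* per-CPU accumulated counters */
	int nr_cpus;
};

/* Sum per-CPU readings, skipping a counter whose skeleton was
 * never loaded (or was already destroyed). */
static int profiler_read(const struct profiler_skel *skel, uint64_t *total)
{
	uint64_t sum = 0;
	int cpu;

	if (!skel || !skel->percpu)  /* guard before any dereference */
		return -1;

	for (cpu = 0; cpu < skel->nr_cpus; cpu++)
		sum += skel->percpu[cpu];

	*total = sum;
	return 0;
}
```

In the real code the per-CPU values come from a `bpf_map_lookup_elem()` on the fd returned by `bpf_map__fd()`, so the guard has to run before that fd is fetched.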
Thanks,
Song

> 
> Please when resubmitting, please combine patches 3/4 and 4/4, man pages
> updates usually come together with the new feature.
> 
> Thanks,
> 
> - Arnaldo
> 
> Full build output:
> 
> [acme@five perf]$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;make VF=1 O=/tmp/build/perf -C tools/perf BUILD_BPF_SKEL=1 install-bin
> make: Entering directory '/home/acme/git/perf/tools/perf'
> BUILD: Doing 'make -j24' parallel build
> HOSTCC /tmp/build/perf/fixdep.o
> HOSTLD /tmp/build/perf/fixdep-in.o
> LINK /tmp/build/perf/fixdep
> Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
> diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
> 
> Auto-detecting system features:
> ... dwarf: [ on ]
> ... dwarf_getlocations: [ on ]
> ... glibc: [ on ]
> ... libbfd: [ on ]
> ... libbfd-buildid: [ on ]
> ... libcap: [ on ]
> ... libelf: [ on ]
> ... libnuma: [ on ]
> ... numa_num_possible_cpus: [ on ]
> ... libperl: [ on ]
> ... libpython: [ on ]
> ... libcrypto: [ on ]
> ... libunwind: [ on ]
> ... libdw-dwarf-unwind: [ on ]
> ... zlib: [ on ]
> ... lzma: [ on ]
> ... get_cpuid: [ on ]
> ... bpf: [ on ]
> ... libaio: [ on ]
> ... libzstd: [ on ]
> ... disassembler-four-args: [ on ]
> ... backtrace: [ on ]
> ... eventfd: [ on ]
> ... fortify-source: [ on ]
> ... sync-compare-and-swap: [ on ]
> ... get_current_dir_name: [ on ]
> ... gettid: [ on ]
> ... libelf-getphdrnum: [ on ]
> ... libelf-gelf_getnote: [ on ]
> ... libelf-getshdrstrndx: [ on ]
> ... libpython-version: [ on ]
> ... libslang: [ on ]
> ... libslang-include-subdir: [ on ]
> ... pthread-attr-setaffinity-np: [ on ]
> ... pthread-barrier: [ on ]
> ... reallocarray: [ on ]
> ... stackprotector-all: [ on ]
> ... timerfd: [ on ]
> ... sched_getcpu: [ on ]
> ... sdt: [ on ]
> ... setns: [ on ]
> ... file-handle: [ on ]
> 
> ... bionic: [ OFF ]
> ... compile-32: [ OFF ]
> ... compile-x32: [ OFF ]
> ... cplus-demangle: [ on ]
> ... gtk2: [ OFF ]
> ... gtk2-infobar: [ OFF ]
> ... hello: [ OFF ]
> ... libbabeltrace: [ on ]
> ... libbfd-liberty: [ OFF ]
> ... libbfd-liberty-z: [ OFF ]
> ... libopencsd: [ OFF ]
> ... libunwind-x86: [ OFF ]
> ... libunwind-x86_64: [ OFF ]
> ... libunwind-arm: [ OFF ]
> ... libunwind-aarch64: [ OFF ]
> ... libunwind-debug-frame: [ OFF ]
> ... libunwind-debug-frame-arm: [ OFF ]
> ... libunwind-debug-frame-aarch64: [ OFF ]
> ... cxx: [ OFF ]
> ... llvm: [ OFF ]
> ... llvm-version: [ OFF ]
> ... clang: [ OFF ]
> ... libbpf: [ OFF ]
> ... libpfm4: [ OFF ]
> ... libdebuginfod: [ on ]
> ... clang-bpf-co-re: [ on ]
> ... prefix: /home/acme
> ... bindir: /home/acme/bin
> ... libdir: /home/acme/lib64
> ... sysconfdir: /home/acme/etc
> ... LIBUNWIND_DIR:
> ... LIBDW_DIR:
> ... JDIR: /usr/lib/jvm/java-11-openjdk-11.0.9.11-4.fc33.x86_64
> ... DWARF post unwind library: libunwind
> 
> GEN /tmp/build/perf/common-cmds.h
> CFLAGS= make -C ../bpf/bpftool \
>  OUTPUT=/tmp/build/perf/util/bpf_skel/.tmp/ bootstrap
> CC /tmp/build/perf/exec-cmd.o
> CC /tmp/build/perf/help.o
> MKDIR /tmp/build/perf/pmu-events/
> MKDIR /tmp/build/perf/jvmti/
> MKDIR /tmp/build/perf/fd/
> MKDIR /tmp/build/perf/fs/
> MKDIR /tmp/build/perf/fs/
> HOSTCC /tmp/build/perf/pmu-events/json.o
> CC /tmp/build/perf/parse-options.o
> CC /tmp/build/perf/fd/array.o
> CC /tmp/build/perf/pager.o
> CC /tmp/build/perf/jvmti/libjvmti.o
> CC /tmp/build/perf/fs/fs.o
> CC /tmp/build/perf/run-command.o
> MKDIR /tmp/build/perf/jvmti/
> CC /tmp/build/perf/sigchain.o
> MKDIR /tmp/build/perf/fs/
> CC /tmp/build/perf/fs/tracing_path.o
> CC /tmp/build/perf/fs/cgroup.o
> MKDIR /tmp/build/perf/pmu-events/
> CC /tmp/build/perf/jvmti/libstring.o
> HOSTCC /tmp/build/perf/pmu-events/jevents.o
> CC /tmp/build/perf/jvmti/libctype.o
> CC /tmp/build/perf/subcmd-config.o
> CC /tmp/build/perf/jvmti/jvmti_agent.o
> LD /tmp/build/perf/fd/libapi-in.o
> CC /tmp/build/perf/event-parse.o
> HOSTCC /tmp/build/perf/pmu-events/jsmn.o
> CC /tmp/build/perf/event-plugin.o
> CC /tmp/build/perf/cpu.o
> CC /tmp/build/perf/trace-seq.o
> CC /tmp/build/perf/core.o
> CC /tmp/build/perf/parse-filter.o
> CC /tmp/build/perf/debug.o
> CC /tmp/build/perf/cpumap.o
> LD /tmp/build/perf/fs/libapi-in.o
> CC /tmp/build/perf/threadmap.o
> LD /tmp/build/perf/libsubcmd-in.o
> HOSTLD /tmp/build/perf/pmu-events/jevents-in.o
> CC /tmp/build/perf/str_error_r.o
> CC /tmp/build/perf/evsel.o
> GEN /tmp/build/perf/bpf_helper_defs.h
> CC /tmp/build/perf/evlist.o
> CC /tmp/build/perf/parse-utils.o
> CC /tmp/build/perf/zalloc.o
> CC /tmp/build/perf/kbuffer-parse.o
> LD /tmp/build/perf/jvmti/jvmti-in.o
> CC /tmp/build/perf/mmap.o
> CC /tmp/build/perf/tep_strerror.o
> CC /tmp/build/perf/xyarray.o
> CC /tmp/build/perf/event-parse-api.o
> CC /tmp/build/perf/lib.o
> LINK /tmp/build/perf/pmu-events/jevents
> LD /tmp/build/perf/libapi-in.o
> AR /tmp/build/perf/libsubcmd.a
> LINK /tmp/build/perf/libperf-jvmti.so
> LD /tmp/build/perf/libtraceevent-in.o
> CC /tmp/build/perf/plugin_hrtimer.o
> CC /tmp/build/perf/plugin_kmem.o
> CC /tmp/build/perf/plugin_mac80211.o
> CC /tmp/build/perf/plugin_kvm.o
> CC /tmp/build/perf/plugin_jbd2.o
> LD /tmp/build/perf/libperf-in.o
> CC /tmp/build/perf/plugin_sched_switch.o
> CC /tmp/build/perf/plugin_function.o
> CC /tmp/build/perf/plugin_scsi.o
> CC /tmp/build/perf/plugin_xen.o
> CC /tmp/build/perf/plugin_futex.o
> CC /tmp/build/perf/plugin_cfg80211.o
> CC /tmp/build/perf/plugin_tlb.o
> AR /tmp/build/perf/libapi.a
> LD /tmp/build/perf/plugin_hrtimer-in.o
> LINK /tmp/build/perf/libtraceevent.a
> LD /tmp/build/perf/plugin_kvm-in.o
> LD /tmp/build/perf/plugin_scsi-in.o
> LD /tmp/build/perf/plugin_kmem-in.o
> LD /tmp/build/perf/plugin_mac80211-in.o
> LD /tmp/build/perf/plugin_futex-in.o
> LD /tmp/build/perf/plugin_function-in.o
> LD /tmp/build/perf/plugin_xen-in.o
> LD /tmp/build/perf/plugin_sched_switch-in.o
> LD /tmp/build/perf/plugin_tlb-in.o
> LD /tmp/build/perf/plugin_jbd2-in.o
> LINK /tmp/build/perf/plugin_hrtimer.so
> LINK /tmp/build/perf/plugin_kmem.so
> AR /tmp/build/perf/libperf.a
> LINK /tmp/build/perf/plugin_scsi.so
> LINK /tmp/build/perf/plugin_kvm.so
> LINK /tmp/build/perf/plugin_mac80211.so
> LD /tmp/build/perf/plugin_cfg80211-in.o
> LINK /tmp/build/perf/plugin_futex.so
> LINK /tmp/build/perf/plugin_xen.so
> LINK /tmp/build/perf/plugin_function.so
> LINK /tmp/build/perf/plugin_tlb.so
> LINK /tmp/build/perf/plugin_jbd2.so
> LINK /tmp/build/perf/plugin_cfg80211.so
> LINK /tmp/build/perf/plugin_sched_switch.so
> GEN /tmp/build/perf/pmu-events/pmu-events.c
> GEN /tmp/build/perf/libtraceevent-dynamic-list
> MKDIR /tmp/build/perf/staticobjs/
> MKDIR /tmp/build/perf/staticobjs/
> MKDIR /tmp/build/perf/staticobjs/
> MKDIR /tmp/build/perf/staticobjs/
> MKDIR /tmp/build/perf/staticobjs/
> MKDIR /tmp/build/perf/staticobjs/
> MKDIR /tmp/build/perf/staticobjs/
> PERF_VERSION = 5.11.rc1.g5eb0b370de61
> MKDIR /tmp/build/perf/staticobjs/
> CC /tmp/build/perf/staticobjs/libbpf_probes.o
> CC /tmp/build/perf/staticobjs/libbpf.o
> CC /tmp/build/perf/staticobjs/bpf.o
> CC /tmp/build/perf/staticobjs/nlattr.o
> CC /tmp/build/perf/staticobjs/btf.o
> CC /tmp/build/perf/staticobjs/xsk.o
> GEN perf-archive
> CC /tmp/build/perf/staticobjs/hashmap.o
> GEN perf-with-kcore
> CC /tmp/build/perf/staticobjs/btf_dump.o
> CC /tmp/build/perf/staticobjs/libbpf_errno.o
> CC /tmp/build/perf/staticobjs/str_error.o
> CC /tmp/build/perf/staticobjs/bpf_prog_linfo.o
> CC /tmp/build/perf/staticobjs/netlink.o
> CC /tmp/build/perf/staticobjs/ringbuf.o
> LD /tmp/build/perf/staticobjs/libbpf-in.o
> LINK /tmp/build/perf/libbpf.a
> CLANG /tmp/build/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o
> DESCEND plugins
> GEN /tmp/build/perf/python/perf.so
> CC /tmp/build/perf/plugins/plugin_jbd2.o
> CC /tmp/build/perf/plugins/plugin_kmem.o
> CC /tmp/build/perf/plugins/plugin_hrtimer.o
> CC /tmp/build/perf/plugins/plugin_mac80211.o
> CC /tmp/build/perf/plugins/plugin_kvm.o
> CC /tmp/build/perf/plugins/plugin_function.o
> CC /tmp/build/perf/plugins/plugin_xen.o
> CC /tmp/build/perf/plugins/plugin_sched_switch.o
> CC /tmp/build/perf/plugins/plugin_futex.o
> CC /tmp/build/perf/plugins/plugin_scsi.o
> CC /tmp/build/perf/plugins/plugin_tlb.o
> CC /tmp/build/perf/plugins/plugin_cfg80211.o
> LD /tmp/build/perf/plugins/plugin_jbd2-in.o
> LD /tmp/build/perf/plugins/plugin_kmem-in.o
> LD /tmp/build/perf/plugins/plugin_hrtimer-in.o
> LD /tmp/build/perf/plugins/plugin_kvm-in.o
> LD /tmp/build/perf/plugins/plugin_mac80211-in.o
> LD /tmp/build/perf/plugins/plugin_function-in.o
> LD /tmp/build/perf/plugins/plugin_xen-in.o
> LD /tmp/build/perf/plugins/plugin_sched_switch-in.o
> LD /tmp/build/perf/plugins/plugin_scsi-in.o
> LD /tmp/build/perf/plugins/plugin_futex-in.o
> LD /tmp/build/perf/plugins/plugin_cfg80211-in.o
> LD /tmp/build/perf/plugins/plugin_tlb-in.o
> LINK /tmp/build/perf/plugins/plugin_jbd2.so
> LINK /tmp/build/perf/plugins/plugin_hrtimer.so
> LINK /tmp/build/perf/plugins/plugin_kmem.so
> LINK /tmp/build/perf/plugins/plugin_mac80211.so
> LINK /tmp/build/perf/plugins/plugin_kvm.so
> LINK /tmp/build/perf/plugins/plugin_sched_switch.so
> LINK /tmp/build/perf/plugins/plugin_scsi.so
> LINK /tmp/build/perf/plugins/plugin_xen.so
> LINK /tmp/build/perf/plugins/plugin_function.so
> LINK /tmp/build/perf/plugins/plugin_futex.so
> LINK /tmp/build/perf/plugins/plugin_tlb.so
> LINK /tmp/build/perf/plugins/plugin_cfg80211.so
> INSTALL trace_plugins
> CC /tmp/build/perf/pmu-events/pmu-events.o
> LD /tmp/build/perf/pmu-events/pmu-events-in.o
> 
> Auto-detecting system features:
> ... libbfd: [ on ]
> ... disassembler-four-args: [ on ]
> ... zlib: [ on ]
> ... libcap: [ on ]
> ... clang-bpf-co-re: [ on ]
> ... reallocarray: [ on ]
> 
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/main.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/common.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/btf.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/json_writer.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/gen.o
> 
> Auto-detecting system features:
> ... libelf: [ on ]
> ... zlib: [ on ]
> ... bpf: [ on ]
> 
> GEN /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/bpf_helper_defs.h
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
> MKDIR /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_probes.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/nlattr.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/xsk.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_errno.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/bpf.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/netlink.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/str_error.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/hashmap.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/bpf_prog_linfo.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf_dump.o
> CC /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/ringbuf.o
> LD /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf-in.o
> LINK /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/libbpf.a
> LINK /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/bpftool
> GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
> libbpf: elf: skipping unrecognized data section(9) .eh_frame
> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
> CC /tmp/build/perf/builtin-bench.o
> CC /tmp/build/perf/builtin-annotate.o
> CC /tmp/build/perf/builtin-diff.o
> CC /tmp/build/perf/builtin-config.o
> CC /tmp/build/perf/builtin-ftrace.o
> CC /tmp/build/perf/builtin-help.o
> CC /tmp/build/perf/builtin-evlist.o
> CC /tmp/build/perf/builtin-sched.o
> CC /tmp/build/perf/builtin-buildid-list.o
> CC /tmp/build/perf/builtin-kallsyms.o
> CC /tmp/build/perf/builtin-buildid-cache.o
> CC /tmp/build/perf/builtin-record.o
> CC /tmp/build/perf/builtin-report.o
> CC /tmp/build/perf/builtin-list.o
> CC /tmp/build/perf/builtin-stat.o
> CC /tmp/build/perf/builtin-top.o
> CC /tmp/build/perf/builtin-timechart.o
> CC /tmp/build/perf/builtin-script.o
> CC /tmp/build/perf/builtin-kmem.o
> CC /tmp/build/perf/builtin-lock.o
> CC /tmp/build/perf/builtin-kvm.o
> CC /tmp/build/perf/builtin-inject.o
> CC /tmp/build/perf/builtin-mem.o
> CC /tmp/build/perf/builtin-version.o
> CC /tmp/build/perf/builtin-data.o
> CC /tmp/build/perf/builtin-trace.o
> CC /tmp/build/perf/builtin-probe.o
> CC /tmp/build/perf/builtin-c2c.o
> MKDIR /tmp/build/perf/bench/
> MKDIR /tmp/build/perf/bench/
> MKDIR /tmp/build/perf/tests/
> CC /tmp/build/perf/arch/common.o
> MKDIR /tmp/build/perf/ui/
> MKDIR /tmp/build/perf/bench/
> MKDIR /tmp/build/perf/tests/
> CC /tmp/build/perf/bench/sched-messaging.o
> CC /tmp/build/perf/bench/sched-pipe.o
> CC /tmp/build/perf/tests/builtin-test.o
> MKDIR /tmp/build/perf/scripts/python/Perf-Trace-Util/
> MKDIR /tmp/build/perf/scripts/perl/Perf-Trace-Util/
> MKDIR /tmp/build/perf/tests/
> MKDIR /tmp/build/perf/ui/
> CC /tmp/build/perf/ui/setup.o
> CC /tmp/build/perf/tests/attr.o
> CC /tmp/build/perf/bench/syscall.o
> CC /tmp/build/perf/tests/parse-events.o
> CC /tmp/build/perf/trace/beauty/clone.o
> CC /tmp/build/perf/scripts/python/Perf-Trace-Util/Context.o
> CC /tmp/build/perf/tests/dso-data.o
> MKDIR /tmp/build/perf/arch/x86/util/
> CC /tmp/build/perf/ui/helpline.o
> CC /tmp/build/perf/scripts/perl/Perf-Trace-Util/Context.o
> MKDIR /tmp/build/perf/ui/
> CC /tmp/build/perf/ui/util.o
> CC /tmp/build/perf/arch/x86/util/header.o
> MKDIR /tmp/build/perf/arch/x86/tests/
> CC /tmp/build/perf/ui/hist.o
> CC /tmp/build/perf/ui/progress.o
> CC /tmp/build/perf/tests/vmlinux-kallsyms.o
> CC /tmp/build/perf/arch/x86/tests/regs_load.o
> CC /tmp/build/perf/bench/mem-functions.o
> CC /tmp/build/perf/trace/beauty/fcntl.o
> CC /tmp/build/perf/trace/beauty/flock.o
> CC /tmp/build/perf/bench/futex-hash.o
> CC /tmp/build/perf/trace/beauty/fsmount.o
> CC /tmp/build/perf/perf.o
> CC /tmp/build/perf/tests/openat-syscall.o
> MKDIR /tmp/build/perf/ui/stdio/
> MKDIR /tmp/build/perf/arch/x86/util/
> LD /tmp/build/perf/scripts/python/Perf-Trace-Util/perf-in.o
> MKDIR /tmp/build/perf/arch/x86/tests/
> CC /tmp/build/perf/tests/openat-syscall-all-cpus.o
> CC /tmp/build/perf/arch/x86/util/pmu.o
> CC /tmp/build/perf/trace/beauty/fspick.o
> CC /tmp/build/perf/arch/x86/util/tsc.o
> CC /tmp/build/perf/util/annotate.o
> CC /tmp/build/perf/tests/openat-syscall-tp-fields.o
> CC /tmp/build/perf/trace/beauty/ioctl.o
> CC /tmp/build/perf/bench/futex-wake.o
> CC /tmp/build/perf/arch/x86/tests/dwarf-unwind.o
> CC /tmp/build/perf/bench/futex-wake-parallel.o
> CC /tmp/build/perf/ui/stdio/hist.o
> CC /tmp/build/perf/arch/x86/tests/arch-tests.o
> CC /tmp/build/perf/trace/beauty/kcmp.o
> CC /tmp/build/perf/arch/x86/util/perf_regs.o
> CC /tmp/build/perf/trace/beauty/mount_flags.o
> CC /tmp/build/perf/arch/x86/util/kvm-stat.o
> CC /tmp/build/perf/bench/futex-requeue.o
> CC /tmp/build/perf/util/block-info.o
> CC /tmp/build/perf/arch/x86/tests/rdpmc.o
> CC /tmp/build/perf/bench/futex-lock-pi.o
> CC /tmp/build/perf/arch/x86/tests/insn-x86.o
> CC /tmp/build/perf/trace/beauty/move_mount.o
> CC /tmp/build/perf/tests/mmap-basic.o
> CC /tmp/build/perf/trace/beauty/pkey_alloc.o
> CC /tmp/build/perf/tests/perf-record.o
> CC /tmp/build/perf/trace/beauty/arch_prctl.o
> CC /tmp/build/perf/arch/x86/util/topdown.o
> CC /tmp/build/perf/tests/evsel-roundtrip-name.o
> CC /tmp/build/perf/tests/evsel-tp-sched.o
> CC /tmp/build/perf/arch/x86/tests/intel-pt-pkt-decoder-test.o
> CC /tmp/build/perf/bench/epoll-wait.o
> CC /tmp/build/perf/tests/fdarray.o
> CC /tmp/build/perf/bench/epoll-ctl.o
> CC /tmp/build/perf/arch/x86/util/machine.o
> CC /tmp/build/perf/trace/beauty/prctl.o
> CC /tmp/build/perf/ui/browser.o
> CC /tmp/build/perf/arch/x86/util/event.o
> CC /tmp/build/perf/arch/x86/tests/bp-modify.o
> CC /tmp/build/perf/trace/beauty/renameat.o
> CC /tmp/build/perf/tests/pmu.o
> CC /tmp/build/perf/tests/pmu-events.o
> CC /tmp/build/perf/tests/hists_common.o
> CC /tmp/build/perf/bench/synthesize.o
> CC /tmp/build/perf/arch/x86/util/dwarf-regs.o
> CC /tmp/build/perf/tests/hists_link.o
> CC /tmp/build/perf/trace/beauty/sockaddr.o
> CC /tmp/build/perf/arch/x86/util/unwind-libunwind.o
> CC /tmp/build/perf/trace/beauty/socket.o
> CC /tmp/build/perf/trace/beauty/statx.o
> LD /tmp/build/perf/arch/x86/tests/perf-in.o
> CC /tmp/build/perf/arch/x86/util/auxtrace.o
> CC /tmp/build/perf/arch/x86/util/archinsn.o
> CC /tmp/build/perf/bench/kallsyms-parse.o
> CC /tmp/build/perf/bench/find-bit-bench.o
> CC /tmp/build/perf/arch/x86/util/intel-pt.o
> CC /tmp/build/perf/trace/beauty/sync_file_range.o
> CC /tmp/build/perf/arch/x86/util/intel-bts.o
> MKDIR /tmp/build/perf/trace/beauty/tracepoints/
> MKDIR /tmp/build/perf/ui/browsers/
> CC /tmp/build/perf/trace/beauty/tracepoints/x86_irq_vectors.o
> MKDIR /tmp/build/perf/trace/beauty/tracepoints/
> CC /tmp/build/perf/util/block-range.o
> CC /tmp/build/perf/ui/browsers/annotate.o
> MKDIR /tmp/build/perf/ui/browsers/
> CC /tmp/build/perf/util/build-id.o
> CC /tmp/build/perf/bench/inject-buildid.o
> CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
> MKDIR /tmp/build/perf/ui/tui/
> CC /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
> MKDIR /tmp/build/perf/ui/tui/
> CC /tmp/build/perf/ui/browsers/hists.o
> CC /tmp/build/perf/ui/browsers/map.o
> LD /tmp/build/perf/arch/x86/util/perf-in.o
> CC /tmp/build/perf/tests/hists_filter.o
> CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
> CC /tmp/build/perf/ui/tui/setup.o
> CC /tmp/build/perf/ui/tui/util.o
> MKDIR /tmp/build/perf/ui/tui/
> CC /tmp/build/perf/bench/numa.o
> CC /tmp/build/perf/util/cacheline.o
> CC /tmp/build/perf/ui/browsers/scripts.o
> LD /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
> CC /tmp/build/perf/ui/tui/helpline.o
> CC /tmp/build/perf/util/config.o
> CC /tmp/build/perf/ui/tui/progress.o
> CC /tmp/build/perf/ui/browsers/header.o
> LD /tmp/build/perf/trace/beauty/perf-in.o
> CC /tmp/build/perf/ui/browsers/res_sample.o
> CC /tmp/build/perf/tests/hists_output.o
> LD /tmp/build/perf/bench/perf-in.o
> CC /tmp/build/perf/tests/hists_cumulate.o
> CC /tmp/build/perf/util/copyfile.o
> LD /tmp/build/perf/arch/x86/perf-in.o
> CC /tmp/build/perf/util/ctype.o
> CC /tmp/build/perf/util/db-export.o
> CC /tmp/build/perf/util/env.o
> CC /tmp/build/perf/util/event.o
> LD /tmp/build/perf/arch/perf-in.o
> CC /tmp/build/perf/util/evlist.o
> CC /tmp/build/perf/util/sideband_evlist.o
> CC /tmp/build/perf/util/evsel.o
> CC /tmp/build/perf/util/evsel_fprintf.o
> CC /tmp/build/perf/tests/python-use.o
> CC /tmp/build/perf/util/perf_event_attr_fprintf.o
> CC /tmp/build/perf/util/evswitch.o
> CC /tmp/build/perf/util/find_bit.o
> CC /tmp/build/perf/tests/bp_signal.o
> LD /tmp/build/perf/ui/tui/perf-in.o
> CC /tmp/build/perf/util/get_current_dir_name.o
> CC /tmp/build/perf/tests/bp_signal_overflow.o
> CC /tmp/build/perf/util/kallsyms.o
> CC /tmp/build/perf/tests/bp_account.o
> CC /tmp/build/perf/util/llvm-utils.o
> CC /tmp/build/perf/util/levenshtein.o
> CC /tmp/build/perf/util/mmap.o
> CC /tmp/build/perf/tests/wp.o
> CC /tmp/build/perf/util/memswap.o
> CC /tmp/build/perf/util/perf_regs.o
> BISON /tmp/build/perf/util/parse-events-bison.c
> CC /tmp/build/perf/tests/task-exit.o
> CC /tmp/build/perf/util/path.o
> CC /tmp/build/perf/util/print_binary.o
> CC /tmp/build/perf/util/rlimit.o
> CC /tmp/build/perf/tests/sw-clock.o
> CC /tmp/build/perf/tests/mmap-thread-lookup.o
> CC /tmp/build/perf/util/argv_split.o
> CC /tmp/build/perf/util/rbtree.o
> CC /tmp/build/perf/tests/thread-maps-share.o
> CC /tmp/build/perf/util/libstring.o
> CC /tmp/build/perf/tests/switch-tracking.o
> CC /tmp/build/perf/tests/keep-tracking.o
> CC /tmp/build/perf/util/bitmap.o
> CC /tmp/build/perf/util/hweight.o
> CC /tmp/build/perf/util/smt.o
> CC /tmp/build/perf/tests/code-reading.o
> CC /tmp/build/perf/util/strbuf.o
> CC /tmp/build/perf/util/string.o
> CC /tmp/build/perf/tests/sample-parsing.o
> CC /tmp/build/perf/tests/parse-no-sample-id-all.o
> CC /tmp/build/perf/util/strfilter.o
> CC /tmp/build/perf/tests/kmod-path.o
> CC /tmp/build/perf/util/strlist.o
> CC /tmp/build/perf/util/top.o
> CC /tmp/build/perf/tests/thread-map.o
> CC /tmp/build/perf/util/usage.o
> CC /tmp/build/perf/util/dso.o
> CC /tmp/build/perf/util/dsos.o
> CC /tmp/build/perf/util/symbol.o
> CC /tmp/build/perf/util/symbol_fprintf.o
> CC /tmp/build/perf/tests/llvm.o
> CC /tmp/build/perf/util/color.o
> CC /tmp/build/perf/util/color_config.o
> CC /tmp/build/perf/util/metricgroup.o
> CC /tmp/build/perf/util/header.o
> CC /tmp/build/perf/util/callchain.o
> CC /tmp/build/perf/util/values.o
> CC /tmp/build/perf/tests/bpf.o
> CC /tmp/build/perf/util/debug.o
> CC /tmp/build/perf/util/fncache.o
> CC /tmp/build/perf/tests/topology.o
> CC /tmp/build/perf/util/machine.o
> CC /tmp/build/perf/tests/cpumap.o
> CC /tmp/build/perf/util/map.o
> CC /tmp/build/perf/tests/mem.o
> CC /tmp/build/perf/util/pstack.o
> CC /tmp/build/perf/util/session.o
> CC /tmp/build/perf/tests/stat.o
> CC /tmp/build/perf/tests/event_update.o
> LD /tmp/build/perf/ui/browsers/perf-in.o
> CC /tmp/build/perf/tests/event-times.o
> CC /tmp/build/perf/tests/expr.o
> CC /tmp/build/perf/util/sample-raw.o
> CC /tmp/build/perf/util/s390-sample-raw.o
> CC /tmp/build/perf/tests/sdt.o
> CC /tmp/build/perf/util/syscalltbl.o
> CC /tmp/build/perf/tests/is_printable_array.o
> CC /tmp/build/perf/util/ordered-events.o
> CC /tmp/build/perf/tests/backward-ring-buffer.o
> CC /tmp/build/perf/tests/bitmap.o
> CC /tmp/build/perf/util/namespaces.o
> CC /tmp/build/perf/tests/perf-hooks.o
> CC /tmp/build/perf/tests/clang.o
> CC /tmp/build/perf/util/comm.o
> CC /tmp/build/perf/tests/unit_number__scnprintf.o
> CC /tmp/build/perf/tests/mem2node.o
> CC /tmp/build/perf/tests/maps.o
> CC /tmp/build/perf/util/thread.o
> CC /tmp/build/perf/util/thread_map.o
> CC /tmp/build/perf/tests/time-utils-test.o
> CC /tmp/build/perf/tests/genelf.o
> CC /tmp/build/perf/util/trace-event-parse.o
> BISON /tmp/build/perf/util/pmu-bison.c
> CC /tmp/build/perf/util/trace-event-read.o
> CC /tmp/build/perf/tests/api-io.o
> CC /tmp/build/perf/util/trace-event-info.o
> CC /tmp/build/perf/util/trace-event-scripting.o
> CC /tmp/build/perf/tests/pfm.o
> CC /tmp/build/perf/tests/demangle-java-test.o
> CC /tmp/build/perf/util/trace-event.o
> LD /tmp/build/perf/ui/perf-in.o
> CC /tmp/build/perf/tests/parse-metric.o
> CC /tmp/build/perf/tests/pe-file-parsing.o
> CC /tmp/build/perf/tests/expand-cgroup.o
> CC /tmp/build/perf/tests/perf-time-to-tsc.o
> CC /tmp/build/perf/util/svghelper.o
> CC /tmp/build/perf/util/sort.o
> CC /tmp/build/perf/util/hist.o
> CC /tmp/build/perf/tests/dwarf-unwind.o
> CC /tmp/build/perf/tests/llvm-src-base.o
> CC /tmp/build/perf/util/cpumap.o
> CC /tmp/build/perf/util/util.o
> CC /tmp/build/perf/util/affinity.o
> CC /tmp/build/perf/util/cputopo.o
> CC /tmp/build/perf/util/target.o
> CC /tmp/build/perf/util/cgroup.o
> CC /tmp/build/perf/tests/llvm-src-kbuild.o
> CC /tmp/build/perf/util/rblist.o
> CC /tmp/build/perf/tests/llvm-src-prologue.o
> CC /tmp/build/perf/tests/llvm-src-relocation.o
> CC /tmp/build/perf/util/intlist.o
> CC /tmp/build/perf/util/counts.o
> CC /tmp/build/perf/util/vdso.o
> CC /tmp/build/perf/util/stat.o
> CC /tmp/build/perf/util/stat-shadow.o
> CC /tmp/build/perf/util/stat-display.o
> CC /tmp/build/perf/util/perf_api_probe.o
> CC /tmp/build/perf/util/record.o
> CC /tmp/build/perf/util/srcline.o
> LD /tmp/build/perf/tests/perf-in.o
> CC /tmp/build/perf/util/srccode.o
> CC /tmp/build/perf/util/synthetic-events.o
> CC /tmp/build/perf/util/data.o
> CC /tmp/build/perf/util/cloexec.o
> CC /tmp/build/perf/util/tsc.o
> CC /tmp/build/perf/util/rwsem.o
> CC /tmp/build/perf/util/call-path.o
> CC /tmp/build/perf/util/thread-stack.o
> CC /tmp/build/perf/util/spark.o
> CC /tmp/build/perf/util/topdown.o
> CC /tmp/build/perf/util/auxtrace.o
> CC /tmp/build/perf/util/intel-pt.o
> CC /tmp/build/perf/util/stream.o
> CC /tmp/build/perf/util/intel-bts.o
> MKDIR /tmp/build/perf/util/arm-spe-decoder/
> CC /tmp/build/perf/util/arm-spe.o
> MKDIR /tmp/build/perf/util/intel-pt-decoder/
> MKDIR /tmp/build/perf/util/arm-spe-decoder/
> CC /tmp/build/perf/util/s390-cpumsf.o
> MKDIR /tmp/build/perf/util/intel-pt-decoder/
> MKDIR /tmp/build/perf/util/scripting-engines/
> MKDIR /tmp/build/perf/util/intel-pt-decoder/
> MKDIR /tmp/build/perf/util/scripting-engines/
> CC /tmp/build/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.o
> GEN /tmp/build/perf/util/intel-pt-decoder/inat-tables.c
> CC /tmp/build/perf/util/arm-spe-decoder/arm-spe-decoder.o
> CC /tmp/build/perf/util/scripting-engines/trace-event-perl.o
> CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o
> CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-log.o
> CC /tmp/build/perf/util/dump-insn.o
> CC /tmp/build/perf/util/parse-branch-options.o
> CC /tmp/build/perf/util/scripting-engines/trace-event-python.o
> CC /tmp/build/perf/util/parse-regs-options.o
> CC /tmp/build/perf/util/parse-sublevel-options.o
> MKDIR /tmp/build/perf/util/intel-pt-decoder/
> CC /tmp/build/perf/util/term.o
> CC /tmp/build/perf/util/help-unknown-cmd.o
> LD /tmp/build/perf/util/arm-spe-decoder/perf-in.o
> CC /tmp/build/perf/util/mem-events.o
> CC /tmp/build/perf/util/vsprintf.o
> CC /tmp/build/perf/util/units.o
> CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o
> BISON /tmp/build/perf/util/expr-bison.c
> CC /tmp/build/perf/util/time-utils.o
> CC /tmp/build/perf/util/branch.o
> CC /tmp/build/perf/util/mem2node.o
> CC /tmp/build/perf/util/clockid.o
> CC /tmp/build/perf/util/bpf-loader.o
> CC /tmp/build/perf/util/bpf_map.o
> CC /tmp/build/perf/util/bpf_counter.o
> CC /tmp/build/perf/util/bpf-prologue.o
> CC /tmp/build/perf/util/symbol-elf.o
> CC /tmp/build/perf/util/probe-file.o
> CC /tmp/build/perf/util/probe-event.o
> CC /tmp/build/perf/util/probe-finder.o
> CC /tmp/build/perf/util/dwarf-aux.o
> CC /tmp/build/perf/util/dwarf-regs.o
> LD /tmp/build/perf/util/scripting-engines/perf-in.o
> CC /tmp/build/perf/util/unwind-libunwind-local.o
> CC /tmp/build/perf/util/unwind-libunwind.o
> CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-insn-decoder.o
> CC /tmp/build/perf/util/data-convert-bt.o
> CC /tmp/build/perf/util/zlib.o
> CC /tmp/build/perf/util/lzma.o
> CC /tmp/build/perf/util/cap.o
> CC /tmp/build/perf/util/zstd.o
> CC /tmp/build/perf/util/demangle-java.o
> CC /tmp/build/perf/util/demangle-rust.o
> CC /tmp/build/perf/util/genelf.o
> CC /tmp/build/perf/util/jitdump.o
> CC /tmp/build/perf/util/genelf_debug.o
> CC /tmp/build/perf/util/perf-hooks.o
> CC /tmp/build/perf/util/bpf-event.o
> FLEX
/tmp/build/perf/util/pmu-flex.c > CC /tmp/build/perf/util/pmu-bison.o > CC /tmp/build/perf/util/pmu.o > CC /tmp/build/perf/util/pmu-flex.o > FLEX /tmp/build/perf/util/parse-events-flex.c > CC /tmp/build/perf/util/parse-events-bison.o > FLEX /tmp/build/perf/util/expr-flex.c > CC /tmp/build/perf/util/expr-bison.o > LD /tmp/build/perf/util/intel-pt-decoder/perf-in.o > CC /tmp/build/perf/util/parse-events.o > CC /tmp/build/perf/util/parse-events-flex.o > CC /tmp/build/perf/util/expr-flex.o > CC /tmp/build/perf/util/expr.o > LD /tmp/build/perf/scripts/perl/Perf-Trace-Util/perf-in.o > LD /tmp/build/perf/scripts/perf-in.o > LD /tmp/build/perf/util/perf-in.o > LD /tmp/build/perf/perf-in.o > LINK /tmp/build/perf/perf > INSTALL tests > INSTALL binaries > INSTALL libperf-jvmti.so > INSTALL libexec > INSTALL bpf-headers > INSTALL bpf-examples > INSTALL perf-archive > INSTALL perf-with-kcore > INSTALL strace/groups > INSTALL perl-scripts > INSTALL python-scripts > INSTALL perf_completion-script > INSTALL perf-tip > make: Leaving directory '/home/acme/git/perf/tools/perf' > [acme@five perf]$ > >> Signed-off-by: Song Liu <songliubraving@fb.com> >> --- >> tools/perf/Makefile.perf | 2 +- >> tools/perf/builtin-stat.c | 77 ++++- >> tools/perf/util/Build | 1 + >> tools/perf/util/bpf_counter.c | 296 ++++++++++++++++++ >> tools/perf/util/bpf_counter.h | 72 +++++ >> .../util/bpf_skel/bpf_prog_profiler.bpf.c | 93 ++++++ >> tools/perf/util/evsel.c | 9 + >> tools/perf/util/evsel.h | 6 + >> tools/perf/util/stat-display.c | 4 +- >> tools/perf/util/stat.c | 2 +- >> tools/perf/util/target.c | 34 +- >> tools/perf/util/target.h | 10 + >> 12 files changed, 588 insertions(+), 18 deletions(-) >> create mode 100644 tools/perf/util/bpf_counter.c >> create mode 100644 tools/perf/util/bpf_counter.h >> create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c >> >> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf >> index d182a2dbb9bbd..8c4e039c3b813 100644 >> --- 
a/tools/perf/Makefile.perf >> +++ b/tools/perf/Makefile.perf >> @@ -1015,7 +1015,7 @@ python-clean: >> >> SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) >> SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) >> -SKELETONS := >> +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h >> >> ifdef BUILD_BPF_SKEL >> BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool >> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c >> index 8cc24967bc273..09bffb3fbcdd4 100644 >> --- a/tools/perf/builtin-stat.c >> +++ b/tools/perf/builtin-stat.c >> @@ -67,6 +67,7 @@ >> #include "util/top.h" >> #include "util/affinity.h" >> #include "util/pfm.h" >> +#include "util/bpf_counter.h" >> #include "asm/bug.h" >> >> #include <linux/time64.h> >> @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs) >> return 0; >> } >> >> +static int read_bpf_map_counters(void) >> +{ >> + struct evsel *counter; >> + int err; >> + >> + evlist__for_each_entry(evsel_list, counter) { >> + err = bpf_counter__read(counter); >> + if (err) >> + return err; >> + } >> + return 0; >> +} >> + >> static void read_counters(struct timespec *rs) >> { >> struct evsel *counter; >> + int err; >> >> - if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0)) >> - return; >> + if (!stat_config.stop_read_counter) { >> + err = read_bpf_map_counters(); >> + if (err == -EAGAIN) >> + err = read_affinity_counters(rs); >> + if (err < 0) >> + return; >> + } >> >> evlist__for_each_entry(evsel_list, counter) { >> if (counter->err) >> @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times) >> return false; >> } >> >> -static void enable_counters(void) >> +static int enable_counters(void) >> { >> + struct evsel *evsel; >> + int err; >> + >> + evlist__for_each_entry(evsel_list, evsel) { >> + err = bpf_counter__enable(evsel); >> + if (err) >> + return err; >> + } >> + >> if (stat_config.initial_delay < 0) { >> pr_info(EVLIST_DISABLED_MSG); >> - return; >> + return 0; >> } >> >> if 
(stat_config.initial_delay > 0) { >> @@ -518,6 +547,7 @@ static void enable_counters(void) >> if (stat_config.initial_delay > 0) >> pr_info(EVLIST_ENABLED_MSG); >> } >> + return 0; >> } >> >> static void disable_counters(void) >> @@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) >> const bool forks = (argc > 0); >> bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false; >> struct affinity affinity; >> - int i, cpu; >> + int i, cpu, err; >> bool second_pass = false; >> >> if (forks) { >> @@ -737,6 +767,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) >> if (affinity__setup(&affinity) < 0) >> return -1; >> >> + evlist__for_each_entry(evsel_list, counter) { >> + if (bpf_counter__load(counter, &target)) >> + return -1; >> + } >> + >> evlist__for_each_cpu (evsel_list, i, cpu) { >> affinity__set(&affinity, cpu); >> >> @@ -850,7 +885,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) >> } >> >> if (STAT_RECORD) { >> - int err, fd = perf_data__fd(&perf_stat.data); >> + int fd = perf_data__fd(&perf_stat.data); >> >> if (is_pipe) { >> err = perf_header__write_pipe(perf_data__fd(&perf_stat.data)); >> @@ -876,7 +911,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) >> >> if (forks) { >> evlist__start_workload(evsel_list); >> - enable_counters(); >> + err = enable_counters(); >> + if (err) >> + return -1; >> >> if (interval || timeout || evlist__ctlfd_initialized(evsel_list)) >> status = dispatch_events(forks, timeout, interval, ×); >> @@ -895,7 +932,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) >> if (WIFSIGNALED(status)) >> psignal(WTERMSIG(status), argv[0]); >> } else { >> - enable_counters(); >> + err = enable_counters(); >> + if (err) >> + return -1; >> status = dispatch_events(forks, timeout, interval, ×); >> } >> >> @@ -1085,6 +1124,10 @@ static struct option stat_options[] = { >> "stat events on existing process id"), >> 
OPT_STRING('t', "tid", &target.tid, "tid", >> "stat events on existing thread id"), >> +#ifdef HAVE_BPF_SKEL >> + OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id", >> + "stat events on existing bpf program id"), >> +#endif >> OPT_BOOLEAN('a', "all-cpus", &target.system_wide, >> "system-wide collection from all CPUs"), >> OPT_BOOLEAN('g', "group", &group, >> @@ -2064,11 +2107,12 @@ int cmd_stat(int argc, const char **argv) >> "perf stat [<options>] [<command>]", >> NULL >> }; >> - int status = -EINVAL, run_idx; >> + int status = -EINVAL, run_idx, err; >> const char *mode; >> FILE *output = stderr; >> unsigned int interval, timeout; >> const char * const stat_subcommands[] = { "record", "report" }; >> + char errbuf[BUFSIZ]; >> >> setlocale(LC_ALL, ""); >> >> @@ -2179,6 +2223,12 @@ int cmd_stat(int argc, const char **argv) >> } else if (big_num_opt == 0) /* User passed --no-big-num */ >> stat_config.big_num = false; >> >> + err = target__validate(&target); >> + if (err) { >> + target__strerror(&target, err, errbuf, BUFSIZ); >> + pr_warning("%s\n", errbuf); >> + } >> + >> setup_system_wide(argc); >> >> /* >> @@ -2252,8 +2302,6 @@ int cmd_stat(int argc, const char **argv) >> } >> } >> >> - target__validate(&target); >> - >> if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide)) >> target.per_thread = true; >> >> @@ -2384,9 +2432,10 @@ int cmd_stat(int argc, const char **argv) >> * tools remain -acme >> */ >> int fd = perf_data__fd(&perf_stat.data); >> - int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat, >> - process_synthesized_event, >> - &perf_stat.session->machines.host); >> + >> + err = perf_event__synthesize_kernel_mmap((void *)&perf_stat, >> + process_synthesized_event, >> + &perf_stat.session->machines.host); >> if (err) { >> pr_warning("Couldn't synthesize the kernel mmap record, harmless, " >> "older tools may produce warnings about this file\n."); >> diff --git a/tools/perf/util/Build b/tools/perf/util/Build >> index 
e2563d0154eb6..188521f343470 100644 >> --- a/tools/perf/util/Build >> +++ b/tools/perf/util/Build >> @@ -135,6 +135,7 @@ perf-y += clockid.o >> >> perf-$(CONFIG_LIBBPF) += bpf-loader.o >> perf-$(CONFIG_LIBBPF) += bpf_map.o >> +perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o >> perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o >> perf-$(CONFIG_LIBELF) += symbol-elf.o >> perf-$(CONFIG_LIBELF) += probe-file.o >> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c >> new file mode 100644 >> index 0000000000000..f2cb86a40c882 >> --- /dev/null >> +++ b/tools/perf/util/bpf_counter.c >> @@ -0,0 +1,296 @@ >> +// SPDX-License-Identifier: GPL-2.0 >> + >> +/* Copyright (c) 2019 Facebook */ >> + >> +#include <limits.h> >> +#include <unistd.h> >> +#include <sys/time.h> >> +#include <sys/resource.h> >> +#include <linux/err.h> >> +#include <linux/zalloc.h> >> +#include <bpf/bpf.h> >> +#include <bpf/btf.h> >> +#include <bpf/libbpf.h> >> + >> +#include "bpf_counter.h" >> +#include "counts.h" >> +#include "debug.h" >> +#include "evsel.h" >> +#include "target.h" >> + >> +#include "bpf_skel/bpf_prog_profiler.skel.h" >> + >> +static inline void *u64_to_ptr(__u64 ptr) >> +{ >> + return (void *)(unsigned long)ptr; >> +} >> + >> +static void set_max_rlimit(void) >> +{ >> + struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY }; >> + >> + setrlimit(RLIMIT_MEMLOCK, &rinf); >> +} >> + >> +static struct bpf_counter *bpf_counter_alloc(void) >> +{ >> + struct bpf_counter *counter; >> + >> + counter = zalloc(sizeof(*counter)); >> + if (counter) >> + INIT_LIST_HEAD(&counter->list); >> + return counter; >> +} >> + >> +static int bpf_program_profiler__destroy(struct evsel *evsel) >> +{ >> + struct bpf_counter *counter; >> + >> + list_for_each_entry(counter, &evsel->bpf_counter_list, list) >> + bpf_prog_profiler_bpf__destroy(counter->skel); >> + INIT_LIST_HEAD(&evsel->bpf_counter_list); >> + return 0; >> +} >> + >> +static char *bpf_target_prog_name(int tgt_fd) >> +{ >> + struct 
bpf_prog_info_linear *info_linear; >> + struct bpf_func_info *func_info; >> + const struct btf_type *t; >> + char *name = NULL; >> + struct btf *btf; >> + >> + info_linear = bpf_program__get_prog_info_linear( >> + tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO); >> + if (IS_ERR_OR_NULL(info_linear)) { >> + pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd); >> + return NULL; >> + } >> + >> + if (info_linear->info.btf_id == 0 || >> + btf__get_from_id(info_linear->info.btf_id, &btf)) { >> + pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd); >> + goto out; >> + } >> + >> + func_info = u64_to_ptr(info_linear->info.func_info); >> + t = btf__type_by_id(btf, func_info[0].type_id); >> + if (!t) { >> + pr_debug("btf %d doesn't have type %d\n", >> + info_linear->info.btf_id, func_info[0].type_id); >> + goto out; >> + } >> + name = strdup(btf__name_by_offset(btf, t->name_off)); >> +out: >> + free(info_linear); >> + return name; >> +} >> + >> +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id) >> +{ >> + struct bpf_prog_profiler_bpf *skel; >> + struct bpf_counter *counter; >> + struct bpf_program *prog; >> + char *prog_name; >> + int prog_fd; >> + int err; >> + >> + prog_fd = bpf_prog_get_fd_by_id(prog_id); >> + if (prog_fd < 0) { >> + pr_err("Failed to open fd for bpf prog %u\n", prog_id); >> + return -1; >> + } >> + counter = bpf_counter_alloc(); >> + if (!counter) { >> + close(prog_fd); >> + return -1; >> + } >> + >> + skel = bpf_prog_profiler_bpf__open(); >> + if (!skel) { >> + pr_err("Failed to open bpf skeleton\n"); >> + goto err_out; >> + } >> + skel->rodata->num_cpu = evsel__nr_cpus(evsel); >> + >> + bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel)); >> + bpf_map__resize(skel->maps.fentry_readings, 1); >> + bpf_map__resize(skel->maps.accum_readings, 1); >> + >> + prog_name = bpf_target_prog_name(prog_fd); >> + if (!prog_name) { >> + pr_err("Failed to get program name for bpf prog %u. 
Does it have BTF?\n", prog_id); >> + goto err_out; >> + } >> + >> + bpf_object__for_each_program(prog, skel->obj) { >> + err = bpf_program__set_attach_target(prog, prog_fd, prog_name); >> + if (err) { >> + pr_err("bpf_program__set_attach_target failed.\n" >> + "Does bpf prog %u have BTF?\n", prog_id); >> + goto err_out; >> + } >> + } >> + set_max_rlimit(); >> + err = bpf_prog_profiler_bpf__load(skel); >> + if (err) { >> + pr_err("bpf_prog_profiler_bpf__load failed\n"); >> + goto err_out; >> + } >> + >> + counter->skel = skel; >> + list_add(&counter->list, &evsel->bpf_counter_list); >> + close(prog_fd); >> + return 0; >> +err_out: >> + free(counter); >> + close(prog_fd); >> + return -1; >> +} >> + >> +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target) >> +{ >> + char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p; >> + u32 prog_id; >> + int ret; >> + >> + bpf_str_ = bpf_str = strdup(target->bpf_str); >> + if (!bpf_str) >> + return -1; >> + >> + while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) { >> + prog_id = strtoul(tok, &p, 10); >> + if (prog_id == 0 || prog_id == UINT_MAX || >> + (*p != '\0' && *p != ',')) { >> + pr_err("Failed to parse bpf prog ids %s\n", >> + target->bpf_str); >> + return -1; >> + } >> + >> + ret = bpf_program_profiler_load_one(evsel, prog_id); >> + if (ret) { >> + bpf_program_profiler__destroy(evsel); >> + free(bpf_str_); >> + return -1; >> + } >> + bpf_str = NULL; >> + } >> + free(bpf_str_); >> + return 0; >> +} >> + >> +static int bpf_program_profiler__enable(struct evsel *evsel) >> +{ >> + struct bpf_counter *counter; >> + int ret; >> + >> + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { >> + ret = bpf_prog_profiler_bpf__attach(counter->skel); >> + if (ret) { >> + bpf_program_profiler__destroy(evsel); >> + return ret; >> + } >> + } >> + return 0; >> +} >> + >> +static int bpf_program_profiler__read(struct evsel *evsel) >> +{ >> + int num_cpu = evsel__nr_cpus(evsel); >> + struct 
bpf_perf_event_value values[num_cpu]; >> + struct bpf_counter *counter; >> + int reading_map_fd; >> + __u32 key = 0; >> + int err, cpu; >> + >> + if (list_empty(&evsel->bpf_counter_list)) >> + return -EAGAIN; >> + >> + for (cpu = 0; cpu < num_cpu; cpu++) { >> + perf_counts(evsel->counts, cpu, 0)->val = 0; >> + perf_counts(evsel->counts, cpu, 0)->ena = 0; >> + perf_counts(evsel->counts, cpu, 0)->run = 0; >> + } >> + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { >> + struct bpf_prog_profiler_bpf *skel = counter->skel; >> + >> + reading_map_fd = bpf_map__fd(skel->maps.accum_readings); >> + >> + err = bpf_map_lookup_elem(reading_map_fd, &key, values); >> + if (err) { >> + fprintf(stderr, "failed to read value\n"); >> + return err; >> + } >> + >> + for (cpu = 0; cpu < num_cpu; cpu++) { >> + perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter; >> + perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled; >> + perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running; >> + } >> + } >> + return 0; >> +} >> + >> +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu, >> + int fd) >> +{ >> + struct bpf_prog_profiler_bpf *skel; >> + struct bpf_counter *counter; >> + int ret; >> + >> + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { >> + skel = counter->skel; >> + ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events), >> + &cpu, &fd, BPF_ANY); >> + if (ret) >> + return ret; >> + } >> + return 0; >> +} >> + >> +struct bpf_counter_ops bpf_program_profiler_ops = { >> + .load = bpf_program_profiler__load, >> + .enable = bpf_program_profiler__enable, >> + .read = bpf_program_profiler__read, >> + .destroy = bpf_program_profiler__destroy, >> + .install_pe = bpf_program_profiler__install_pe, >> +}; >> + >> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd) >> +{ >> + if (list_empty(&evsel->bpf_counter_list)) >> + return 0; >> + return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd); >> +} 
>> + >> +int bpf_counter__load(struct evsel *evsel, struct target *target) >> +{ >> + if (target__has_bpf(target)) >> + evsel->bpf_counter_ops = &bpf_program_profiler_ops; >> + >> + if (evsel->bpf_counter_ops) >> + return evsel->bpf_counter_ops->load(evsel, target); >> + return 0; >> +} >> + >> +int bpf_counter__enable(struct evsel *evsel) >> +{ >> + if (list_empty(&evsel->bpf_counter_list)) >> + return 0; >> + return evsel->bpf_counter_ops->enable(evsel); >> +} >> + >> +int bpf_counter__read(struct evsel *evsel) >> +{ >> + if (list_empty(&evsel->bpf_counter_list)) >> + return -EAGAIN; >> + return evsel->bpf_counter_ops->read(evsel); >> +} >> + >> +void bpf_counter__destroy(struct evsel *evsel) >> +{ >> + if (list_empty(&evsel->bpf_counter_list)) >> + return; >> + evsel->bpf_counter_ops->destroy(evsel); >> + evsel->bpf_counter_ops = NULL; >> +} >> diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h >> new file mode 100644 >> index 0000000000000..2eca210e5dc16 >> --- /dev/null >> +++ b/tools/perf/util/bpf_counter.h >> @@ -0,0 +1,72 @@ >> +/* SPDX-License-Identifier: GPL-2.0 */ >> +#ifndef __PERF_BPF_COUNTER_H >> +#define __PERF_BPF_COUNTER_H 1 >> + >> +#include <linux/list.h> >> + >> +struct evsel; >> +struct target; >> +struct bpf_counter; >> + >> +typedef int (*bpf_counter_evsel_op)(struct evsel *evsel); >> +typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel, >> + struct target *target); >> +typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel, >> + int cpu, >> + int fd); >> + >> +struct bpf_counter_ops { >> + bpf_counter_evsel_target_op load; >> + bpf_counter_evsel_op enable; >> + bpf_counter_evsel_op read; >> + bpf_counter_evsel_op destroy; >> + bpf_counter_evsel_install_pe_op install_pe; >> +}; >> + >> +struct bpf_counter { >> + void *skel; >> + struct list_head list; >> +}; >> + >> +#ifdef HAVE_BPF_SKEL >> + >> +int bpf_counter__load(struct evsel *evsel, struct target *target); >> +int bpf_counter__enable(struct 
evsel *evsel); >> +int bpf_counter__read(struct evsel *evsel); >> +void bpf_counter__destroy(struct evsel *evsel); >> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd); >> + >> +#else /* HAVE_BPF_SKEL */ >> + >> +#include<linux/err.h> >> + >> +static inline int bpf_counter__load(struct evsel *evsel __maybe_unused, >> + struct target *target __maybe_unused) >> +{ >> + return 0; >> +} >> + >> +static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused) >> +{ >> + return 0; >> +} >> + >> +static inline int bpf_counter__read(struct evsel *evsel __maybe_unused) >> +{ >> + return -EAGAIN; >> +} >> + >> +static inline void bpf_counter__destroy(struct evsel *evsel __maybe_unused) >> +{ >> +} >> + >> +static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, >> + int cpu __maybe_unused, >> + int fd __maybe_unused) >> +{ >> + return 0; >> +} >> + >> +#endif /* HAVE_BPF_SKEL */ >> + >> +#endif /* __PERF_BPF_COUNTER_H */ >> diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c >> new file mode 100644 >> index 0000000000000..c7cec92d02360 >> --- /dev/null >> +++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c >> @@ -0,0 +1,93 @@ >> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) >> +// Copyright (c) 2020 Facebook >> +#include <linux/bpf.h> >> +#include <bpf/bpf_helpers.h> >> +#include <bpf/bpf_tracing.h> >> + >> +/* map of perf event fds, num_cpu * num_metric entries */ >> +struct { >> + __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); >> + __uint(key_size, sizeof(__u32)); >> + __uint(value_size, sizeof(int)); >> +} events SEC(".maps"); >> + >> +/* readings at fentry */ >> +struct { >> + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); >> + __uint(key_size, sizeof(__u32)); >> + __uint(value_size, sizeof(struct bpf_perf_event_value)); >> + __uint(max_entries, 1); >> +} fentry_readings SEC(".maps"); >> + >> +/* accumulated readings */ >> +struct { >> + __uint(type, 
BPF_MAP_TYPE_PERCPU_ARRAY); >> + __uint(key_size, sizeof(__u32)); >> + __uint(value_size, sizeof(struct bpf_perf_event_value)); >> + __uint(max_entries, 1); >> +} accum_readings SEC(".maps"); >> + >> +const volatile __u32 num_cpu = 1; >> + >> +SEC("fentry/XXX") >> +int BPF_PROG(fentry_XXX) >> +{ >> + __u32 key = bpf_get_smp_processor_id(); >> + struct bpf_perf_event_value *ptr; >> + __u32 zero = 0; >> + long err; >> + >> + /* look up before reading, to reduce error */ >> + ptr = bpf_map_lookup_elem(&fentry_readings, &zero); >> + if (!ptr) >> + return 0; >> + >> + err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr)); >> + if (err) >> + return 0; >> + >> + return 0; >> +} >> + >> +static inline void >> +fexit_update_maps(struct bpf_perf_event_value *after) >> +{ >> + struct bpf_perf_event_value *before, diff, *accum; >> + __u32 zero = 0; >> + >> + before = bpf_map_lookup_elem(&fentry_readings, &zero); >> + /* only account samples with a valid fentry_reading */ >> + if (before && before->counter) { >> + struct bpf_perf_event_value *accum; >> + >> + diff.counter = after->counter - before->counter; >> + diff.enabled = after->enabled - before->enabled; >> + diff.running = after->running - before->running; >> + >> + accum = bpf_map_lookup_elem(&accum_readings, &zero); >> + if (accum) { >> + accum->counter += diff.counter; >> + accum->enabled += diff.enabled; >> + accum->running += diff.running; >> + } >> + } >> +} >> + >> +SEC("fexit/XXX") >> +int BPF_PROG(fexit_XXX) >> +{ >> + struct bpf_perf_event_value reading; >> + __u32 cpu = bpf_get_smp_processor_id(); >> + __u32 one = 1, zero = 0; >> + int err; >> + >> + /* read all events before updating the maps, to reduce error */ >> + err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading)); >> + if (err) >> + return 0; >> + >> + fexit_update_maps(&reading); >> + return 0; >> +} >> + >> +char LICENSE[] SEC("license") = "Dual BSD/GPL"; >> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c 
>> index c26ea82220bd8..7265308765d73 100644 >> --- a/tools/perf/util/evsel.c >> +++ b/tools/perf/util/evsel.c >> @@ -25,6 +25,7 @@ >> #include <stdlib.h> >> #include <perf/evsel.h> >> #include "asm/bug.h" >> +#include "bpf_counter.h" >> #include "callchain.h" >> #include "cgroup.h" >> #include "counts.h" >> @@ -51,6 +52,10 @@ >> #include <internal/lib.h> >> >> #include <linux/ctype.h> >> +#include <bpf/bpf.h> >> +#include <bpf/libbpf.h> >> +#include <bpf/btf.h> >> +#include "rlimit.h" >> >> struct perf_missing_features perf_missing_features; >> >> @@ -247,6 +252,7 @@ void evsel__init(struct evsel *evsel, >> evsel->bpf_obj = NULL; >> evsel->bpf_fd = -1; >> INIT_LIST_HEAD(&evsel->config_terms); >> + INIT_LIST_HEAD(&evsel->bpf_counter_list); >> perf_evsel__object.init(evsel); >> evsel->sample_size = __evsel__sample_size(attr->sample_type); >> evsel__calc_id_pos(evsel); >> @@ -1366,6 +1372,7 @@ void evsel__exit(struct evsel *evsel) >> { >> assert(list_empty(&evsel->core.node)); >> assert(evsel->evlist == NULL); >> + bpf_counter__destroy(evsel); >> evsel__free_counts(evsel); >> perf_evsel__free_fd(&evsel->core); >> perf_evsel__free_id(&evsel->core); >> @@ -1781,6 +1788,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, >> >> FD(evsel, cpu, thread) = fd; >> >> + bpf_counter__install_pe(evsel, cpu, fd); >> + >> if (unlikely(test_attr__enabled)) { >> test_attr__open(&evsel->core.attr, pid, cpus->map[cpu], >> fd, group_fd, flags); >> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h >> index cd1d8dd431997..40e3946cd7518 100644 >> --- a/tools/perf/util/evsel.h >> +++ b/tools/perf/util/evsel.h >> @@ -10,6 +10,7 @@ >> #include <internal/evsel.h> >> #include <perf/evsel.h> >> #include "symbol_conf.h" >> +#include "bpf_counter.h" >> #include <internal/cpumap.h> >> >> struct bpf_object; >> @@ -17,6 +18,8 @@ struct cgroup; >> struct perf_counts; >> struct perf_stat_evsel; >> union perf_event; >> +struct bpf_counter_ops; >> +struct target; 
>> >> typedef int (evsel__sb_cb_t)(union perf_event *event, void *data); >> >> @@ -127,6 +130,8 @@ struct evsel { >> * See also evsel__has_callchain(). >> */ >> __u64 synth_sample_type; >> + struct list_head bpf_counter_list; >> + struct bpf_counter_ops *bpf_counter_ops; >> }; >> >> struct perf_missing_features { >> @@ -424,4 +429,5 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel) >> struct perf_env *evsel__env(struct evsel *evsel); >> >> int evsel__store_ids(struct evsel *evsel, struct evlist *evlist); >> + >> #endif /* __PERF_EVSEL_H */ >> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c >> index 583ae4f09c5d1..cce7a76d6473c 100644 >> --- a/tools/perf/util/stat-display.c >> +++ b/tools/perf/util/stat-display.c >> @@ -1045,7 +1045,9 @@ static void print_header(struct perf_stat_config *config, >> if (!config->csv_output) { >> fprintf(output, "\n"); >> fprintf(output, " Performance counter stats for "); >> - if (_target->system_wide) >> + if (_target->bpf_str) >> + fprintf(output, "\'BPF program(s) %s", _target->bpf_str); >> + else if (_target->system_wide) >> fprintf(output, "\'system wide"); >> else if (_target->cpu_list) >> fprintf(output, "\'CPU(s) %s", _target->cpu_list); >> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c >> index 8ce1479c98f03..0b3957323f668 100644 >> --- a/tools/perf/util/stat.c >> +++ b/tools/perf/util/stat.c >> @@ -527,7 +527,7 @@ int create_perf_stat_counter(struct evsel *evsel, >> if (leader->core.nr_members > 1) >> attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP; >> >> - attr->inherit = !config->no_inherit; >> + attr->inherit = !config->no_inherit && list_empty(&evsel->bpf_counter_list); >> >> /* >> * Some events get initialized with sample_(period/type) set, >> diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c >> index a3db13dea937c..0f383418e3df5 100644 >> --- a/tools/perf/util/target.c >> +++ b/tools/perf/util/target.c >> @@ -56,6 +56,34 @@ enum target_errno 
target__validate(struct target *target) >> ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM; >> } >> >> + /* BPF and CPU are mutually exclusive */ >> + if (target->bpf_str && target->cpu_list) { >> + target->cpu_list = NULL; >> + if (ret == TARGET_ERRNO__SUCCESS) >> + ret = TARGET_ERRNO__BPF_OVERRIDE_CPU; >> + } >> + >> + /* BPF and PID/TID are mutually exclusive */ >> + if (target->bpf_str && target->tid) { >> + target->tid = NULL; >> + if (ret == TARGET_ERRNO__SUCCESS) >> + ret = TARGET_ERRNO__BPF_OVERRIDE_PID; >> + } >> + >> + /* BPF and UID are mutually exclusive */ >> + if (target->bpf_str && target->uid_str) { >> + target->uid_str = NULL; >> + if (ret == TARGET_ERRNO__SUCCESS) >> + ret = TARGET_ERRNO__BPF_OVERRIDE_UID; >> + } >> + >> + /* BPF and THREADS are mutually exclusive */ >> + if (target->bpf_str && target->per_thread) { >> + target->per_thread = false; >> + if (ret == TARGET_ERRNO__SUCCESS) >> + ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD; >> + } >> + >> /* THREAD and SYSTEM/CPU are mutually exclusive */ >> if (target->per_thread && (target->system_wide || target->cpu_list)) { >> target->per_thread = false; >> @@ -109,6 +137,10 @@ static const char *target__error_str[] = { >> "PID/TID switch overriding SYSTEM", >> "UID switch overriding SYSTEM", >> "SYSTEM/CPU switch overriding PER-THREAD", >> + "BPF switch overriding CPU", >> + "BPF switch overriding PID/TID", >> + "BPF switch overriding UID", >> + "BPF switch overriding THREAD", >> "Invalid User: %s", >> "Problems obtaining information for user %s", >> }; >> @@ -134,7 +166,7 @@ int target__strerror(struct target *target, int errnum, >> >> switch (errnum) { >> case TARGET_ERRNO__PID_OVERRIDE_CPU ... 
>> - TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD: >> + TARGET_ERRNO__BPF_OVERRIDE_THREAD: >> snprintf(buf, buflen, "%s", msg); >> break; >> >> diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h >> index 6ef01a83b24e9..f132c6c2eef81 100644 >> --- a/tools/perf/util/target.h >> +++ b/tools/perf/util/target.h >> @@ -10,6 +10,7 @@ struct target { >> const char *tid; >> const char *cpu_list; >> const char *uid_str; >> + const char *bpf_str; >> uid_t uid; >> bool system_wide; >> bool uses_mmap; >> @@ -36,6 +37,10 @@ enum target_errno { >> TARGET_ERRNO__PID_OVERRIDE_SYSTEM, >> TARGET_ERRNO__UID_OVERRIDE_SYSTEM, >> TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD, >> + TARGET_ERRNO__BPF_OVERRIDE_CPU, >> + TARGET_ERRNO__BPF_OVERRIDE_PID, >> + TARGET_ERRNO__BPF_OVERRIDE_UID, >> + TARGET_ERRNO__BPF_OVERRIDE_THREAD, >> >> /* for target__parse_uid() */ >> TARGET_ERRNO__INVALID_UID, >> @@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target) >> return target->system_wide || target->cpu_list; >> } >> >> +static inline bool target__has_bpf(struct target *target) >> +{ >> + return target->bpf_str; >> +} >> + >> static inline bool target__none(struct target *target) >> { >> return !target__has_task(target) && !target__has_cpu(target); >> -- >> 2.24.1 >> > > -- > > - Arnaldo ^ permalink raw reply [flat|nested] 25+ messages in thread
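The counting scheme used by bpf_prog_profiler.bpf.c in the patch above boils down to: the fentry program snapshots the perf counter into a per-CPU map, and the fexit program reads the counter again and adds the delta to an accumulator map. A minimal user-space C model of that accumulation logic (the struct and function names here are illustrative stand-ins, not the BPF helper API):

```c
#include <assert.h>
#include <stdint.h>

/* Models one slot of the per-CPU fentry_readings/accum_readings maps. */
struct reading {
	uint64_t counter;
	uint64_t enabled;
	uint64_t running;
};

static struct reading fentry_reading;  /* stands in for fentry_readings[0] */
static struct reading accum_reading;   /* stands in for accum_readings[0] */

/* fentry probe: snapshot the counter at target-prog entry. */
static void on_fentry(const struct reading *now)
{
	fentry_reading = *now;
}

/* fexit probe: read again and accumulate the delta, mirroring
 * fexit_update_maps() in the skeleton. */
static void on_fexit(const struct reading *now)
{
	/* only account samples with a valid fentry reading */
	if (!fentry_reading.counter)
		return;
	accum_reading.counter += now->counter - fentry_reading.counter;
	accum_reading.enabled += now->enabled - fentry_reading.enabled;
	accum_reading.running += now->running - fentry_reading.running;
}
```

perf's read path then folds accum_readings across CPUs into perf_counts, which is what bpf_program_profiler__read() in the patch does.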
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-28 23:43 ` Song Liu @ 2020-12-29 5:53 ` Song Liu 2020-12-29 15:15 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 25+ messages in thread From: Song Liu @ 2020-12-29 5:53 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team > On Dec 28, 2020, at 3:43 PM, Song Liu <songliubraving@fb.com> wrote: > > > >> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: >> >> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu: >>> Introduce perf-stat -b option, which counts events for BPF programs, like: >>> >>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 >>> 1.487903822 115,200 ref-cycles >>> 1.487903822 86,012 cycles >>> 2.489147029 80,560 ref-cycles >>> 2.489147029 73,784 cycles >>> 3.490341825 60,720 ref-cycles >>> 3.490341825 37,797 cycles >>> 4.491540887 37,120 ref-cycles >>> 4.491540887 31,963 cycles >>> >>> The example above counts cycles and ref-cycles of BPF program of id 254. >>> This is similar to bpftool-prog-profile command, but more flexible. >>> >>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF >>> programs (monitor-progs) to the target BPF program (target-prog). The >>> monitor-progs read perf_event before and after the target-prog, and >>> aggregate the difference in a BPF map. Then the user space reads data >>> from these maps. >>> >>> A new struct bpf_counter is introduced to provide common interface that >>> uses BPF programs/maps to count perf events. 
>> >> Segfaulting here: >> >> [root@five ~]# bpftool prog | grep tracepoint >> 110: tracepoint name syscall_unaugme tag 57cd311f2e27366b gpl >> 111: tracepoint name sys_enter_conne tag 3555418ac9476139 gpl >> 112: tracepoint name sys_enter_sendt tag bc7fcadbaf7b8145 gpl >> 113: tracepoint name sys_enter_open tag 0e59c3ac2bea5280 gpl >> 114: tracepoint name sys_enter_opena tag 0baf443610f59837 gpl >> 115: tracepoint name sys_enter_renam tag 24664e4aca62d7fa gpl >> 116: tracepoint name sys_enter_renam tag 20093e51a8634ebb gpl >> 117: tracepoint name sys_enter tag 0bc3fc9d11754ba1 gpl >> 118: tracepoint name sys_exit tag 29c7ae234d79bd5c gpl >> [root@five ~]# >> [root@five ~]# gdb perf >> GNU gdb (GDB) Fedora 10.1-2.fc33 >> Reading symbols from perf... >> (gdb) run stat -e instructions,cycles -b 113 -I 1000 >> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000 >> [Thread debugging using libthread_db enabled] >> Using host libthread_db library "/lib64/libthread_db.so.1". >> libbpf: elf: skipping unrecognized data section(9) .eh_frame >> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame >> libbpf: elf: skipping unrecognized data section(9) .eh_frame >> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame >> >> Program received signal SIGSEGV, Segmentation fault. >> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 >> 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); >> (gdb) bt >> #0 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 >> #1 0x0000000000000000 in ?? () >> (gdb) >> >> [acme@five perf]$ clang -v |& head -2 >> clang version 11.0.0 (Fedora 11.0.0-2.fc33) >> Target: x86_64-unknown-linux-gnu >> [acme@five perf]$ >> >> Do you need any extra info? > > Hmm... I am not able to reproduce this. I am trying to setup an environment similar > to fc33 (clang 11, etc.). 
Does this segfault every time, and on all programs? > I tried it on CentOS Stream release 8, with gcc version 8.4.1 20200928 (Red Hat 8.4.1-1) (GCC) clang version 11.0.0 (Red Hat 11.0.0-0.2.rc2.module_el8.4.0+533+50191577) Unfortunately, I still cannot repro it. I didn't find the issue while looking through the code. AFAICS, the code fail over when the skeleton is not ready, so bpf_program_profiler__read() should find a valid skeleton. Could you please help run the test with the following patch? The patch is also available at https://git.kernel.org/pub/scm/linux/kernel/git/song/linux.git perf-dash-b Thanks, Song diff --git i/tools/perf/util/bpf_counter.c w/tools/perf/util/bpf_counter.c index f2cb86a40c882..e09c571365b56 100644 --- i/tools/perf/util/bpf_counter.c +++ w/tools/perf/util/bpf_counter.c @@ -46,8 +46,10 @@ static int bpf_program_profiler__destroy(struct evsel *evsel) { struct bpf_counter *counter; - list_for_each_entry(counter, &evsel->bpf_counter_list, list) + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { + pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter); bpf_prog_profiler_bpf__destroy(counter->skel); + } INIT_LIST_HEAD(&evsel->bpf_counter_list); return 0; } @@ -141,8 +143,14 @@ static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id) counter->skel = skel; list_add(&counter->list, &evsel->bpf_counter_list); close(prog_fd); + pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter); + pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel); + pr_debug("%s return 0\n", __func__); return 0; err_out: + pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter); + pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel); + pr_debug("%s return -1\n", __func__); free(counter); close(prog_fd); return -1; @@ -214,11 +222,22 @@ static int bpf_program_profiler__read(struct evsel *evsel) list_for_each_entry(counter, &evsel->bpf_counter_list, list) { struct bpf_prog_profiler_bpf *skel = 
counter->skel; + pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter); + pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel); + if (!skel) { + pr_err("%s !skel\n", __func__); + continue; + } + if (!skel->maps.accum_readings) { + pr_err("%s !skel->maps.accum_readings", __func__); + continue; + } + reading_map_fd = bpf_map__fd(skel->maps.accum_readings); err = bpf_map_lookup_elem(reading_map_fd, &key, values); if (err) { - fprintf(stderr, "failed to read value\n"); + pr_err("failed to read value\n"); return err; } @@ -240,6 +259,8 @@ static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu, list_for_each_entry(counter, &evsel->bpf_counter_list, list) { skel = counter->skel; + pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter); + pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel); ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events), &cpu, &fd, BPF_ANY); if (ret) ^ permalink raw reply related [flat|nested] 25+ messages in thread
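The guards the debug patch adds boil down to one rule: never dereference a counter whose skeleton pointer is NULL, skip it and keep going. A stripped-down user-space sketch of that defensive traversal (plain array instead of the kernel `list_for_each_entry()`, hypothetical types):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the skeleton and the per-evsel counter. */
struct skel_sketch   { int map_fd; };
struct counter_sketch { struct skel_sketch *skel; };

/* Read every counter, tolerating half-initialized entries: a NULL
 * skeleton (the suspected cause of the segfault at bpf_counter.c:217)
 * is skipped instead of dereferenced. Returns how many were read. */
int read_counters_sketch(struct counter_sketch *c, int n)
{
	int done = 0;

	for (int i = 0; i < n; i++) {
		if (!c[i].skel)		/* guard: skeleton never loaded */
			continue;
		/* safe to use c[i].skel->map_fd from here on */
		done++;
	}
	return done;
}
```

This only papers over the symptom, of course; the pr_debug() output above is what should show how a NULL (or garbage) skeleton got onto the list in the first place.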
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-28 23:43 ` Song Liu 2020-12-29 5:53 ` Song Liu @ 2020-12-29 15:15 ` Arnaldo Carvalho de Melo 2020-12-29 18:42 ` Song Liu 1 sibling, 1 reply; 25+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-12-29 15:15 UTC (permalink / raw) To: Song Liu Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu: > > > > On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > > > Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu: > >> Introduce perf-stat -b option, which counts events for BPF programs, like: > >> > >> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 > >> 1.487903822 115,200 ref-cycles > >> 1.487903822 86,012 cycles > >> 2.489147029 80,560 ref-cycles > >> 2.489147029 73,784 cycles > >> 3.490341825 60,720 ref-cycles > >> 3.490341825 37,797 cycles > >> 4.491540887 37,120 ref-cycles > >> 4.491540887 31,963 cycles > >> > >> The example above counts cycles and ref-cycles of BPF program of id 254. > >> This is similar to bpftool-prog-profile command, but more flexible. > >> > >> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF > >> programs (monitor-progs) to the target BPF program (target-prog). The > >> monitor-progs read perf_event before and after the target-prog, and > >> aggregate the difference in a BPF map. Then the user space reads data > >> from these maps. > >> > >> A new struct bpf_counter is introduced to provide common interface that > >> uses BPF programs/maps to count perf events. 
> > > > Segfaulting here: > > > > [root@five ~]# bpftool prog | grep tracepoint > > 110: tracepoint name syscall_unaugme tag 57cd311f2e27366b gpl > > 111: tracepoint name sys_enter_conne tag 3555418ac9476139 gpl > > 112: tracepoint name sys_enter_sendt tag bc7fcadbaf7b8145 gpl > > 113: tracepoint name sys_enter_open tag 0e59c3ac2bea5280 gpl > > 114: tracepoint name sys_enter_opena tag 0baf443610f59837 gpl > > 115: tracepoint name sys_enter_renam tag 24664e4aca62d7fa gpl > > 116: tracepoint name sys_enter_renam tag 20093e51a8634ebb gpl > > 117: tracepoint name sys_enter tag 0bc3fc9d11754ba1 gpl > > 118: tracepoint name sys_exit tag 29c7ae234d79bd5c gpl > > [root@five ~]# > > [root@five ~]# gdb perf > > GNU gdb (GDB) Fedora 10.1-2.fc33 > > Reading symbols from perf... > > (gdb) run stat -e instructions,cycles -b 113 -I 1000 > > Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000 > > [Thread debugging using libthread_db enabled] > > Using host libthread_db library "/lib64/libthread_db.so.1". > > libbpf: elf: skipping unrecognized data section(9) .eh_frame > > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > > libbpf: elf: skipping unrecognized data section(9) .eh_frame > > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > > > > Program received signal SIGSEGV, Segmentation fault. > > 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 > > 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); > > (gdb) bt > > #0 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 > > #1 0x0000000000000000 in ?? () > > (gdb) > > > > [acme@five perf]$ clang -v |& head -2 > > clang version 11.0.0 (Fedora 11.0.0-2.fc33) > > Target: x86_64-unknown-linux-gnu > > [acme@five perf]$ > > > > Do you need any extra info? > > Hmm... I am not able to reproduce this. 
I am trying to setup an environment similar > to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? I'll try it with a BPF proggie attached to a kprobes, but here is something else I noticed: [root@five perf]# export PYTHONPATH=/tmp/build/perf/python [root@five perf]# tools/perf/python/twatch.py Traceback (most recent call last): File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module> import perf ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy [root@five perf]# perf test python 19: 'import perf' in python : FAILED! [root@five perf]# perf test -v python 19: 'import perf' in python : --- start --- test child forked, pid 3198864 python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' " Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy test child finished with -1 ---- end ---- 'import perf' in python: FAILED! [root@five perf]# This should be trivial, I hope, just add the new object file to tools/perf/util/python-ext-sources, then do a 'perf test python', if it fails, use 'perf test -v python' to see what is preventing the python binding from loading. - Arnaldo ^ permalink raw reply [flat|nested] 25+ messages in thread
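The monitor-prog mechanism quoted in the commit message (read the perf_event before and after the target program, aggregate the difference) reduces to a snapshot-and-delta accumulator per CPU. A hedged sketch of that logic in plain C, outside any BPF context:

```c
#include <assert.h>

/* What the fentry/fexit monitor programs conceptually do: snapshot
 * the counter value on entry, add the delta to an accumulator on
 * exit. User space later reads only the accumulated deltas, so time
 * spent outside the target BPF program is never counted. */
struct per_cpu_state {
	unsigned long long before;	/* counter reading at fentry */
	unsigned long long accum;	/* sum of (fexit - fentry)   */
};

void on_fentry(struct per_cpu_state *s, unsigned long long reading)
{
	s->before = reading;
}

void on_fexit(struct per_cpu_state *s, unsigned long long reading)
{
	s->accum += reading - s->before;
}
```

In the real implementation the readings come from a per-CPU perf_event and the accumulator lives in a BPF map (`accum_readings`), which is exactly the map whose fd lookup segfaults above.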
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 15:15 ` Arnaldo Carvalho de Melo @ 2020-12-29 18:42 ` Song Liu 2020-12-29 18:48 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 25+ messages in thread From: Song Liu @ 2020-12-29 18:42 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team > On Dec 29, 2020, at 7:15 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu: >> >> >>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: >>> >>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu: >>>> Introduce perf-stat -b option, which counts events for BPF programs, like: >>>> >>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 >>>> 1.487903822 115,200 ref-cycles >>>> 1.487903822 86,012 cycles >>>> 2.489147029 80,560 ref-cycles >>>> 2.489147029 73,784 cycles >>>> 3.490341825 60,720 ref-cycles >>>> 3.490341825 37,797 cycles >>>> 4.491540887 37,120 ref-cycles >>>> 4.491540887 31,963 cycles >>>> >>>> The example above counts cycles and ref-cycles of BPF program of id 254. >>>> This is similar to bpftool-prog-profile command, but more flexible. >>>> >>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF >>>> programs (monitor-progs) to the target BPF program (target-prog). The >>>> monitor-progs read perf_event before and after the target-prog, and >>>> aggregate the difference in a BPF map. Then the user space reads data >>>> from these maps. >>>> >>>> A new struct bpf_counter is introduced to provide common interface that >>>> uses BPF programs/maps to count perf events. 
>>> >>> Segfaulting here: >>> >>> [root@five ~]# bpftool prog | grep tracepoint >>> 110: tracepoint name syscall_unaugme tag 57cd311f2e27366b gpl >>> 111: tracepoint name sys_enter_conne tag 3555418ac9476139 gpl >>> 112: tracepoint name sys_enter_sendt tag bc7fcadbaf7b8145 gpl >>> 113: tracepoint name sys_enter_open tag 0e59c3ac2bea5280 gpl >>> 114: tracepoint name sys_enter_opena tag 0baf443610f59837 gpl >>> 115: tracepoint name sys_enter_renam tag 24664e4aca62d7fa gpl >>> 116: tracepoint name sys_enter_renam tag 20093e51a8634ebb gpl >>> 117: tracepoint name sys_enter tag 0bc3fc9d11754ba1 gpl >>> 118: tracepoint name sys_exit tag 29c7ae234d79bd5c gpl >>> [root@five ~]# >>> [root@five ~]# gdb perf >>> GNU gdb (GDB) Fedora 10.1-2.fc33 >>> Reading symbols from perf... >>> (gdb) run stat -e instructions,cycles -b 113 -I 1000 >>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000 >>> [Thread debugging using libthread_db enabled] >>> Using host libthread_db library "/lib64/libthread_db.so.1". >>> libbpf: elf: skipping unrecognized data section(9) .eh_frame >>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame >>> libbpf: elf: skipping unrecognized data section(9) .eh_frame >>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame >>> >>> Program received signal SIGSEGV, Segmentation fault. >>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 >>> 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); >>> (gdb) bt >>> #0 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 >>> #1 0x0000000000000000 in ?? () >>> (gdb) >>> >>> [acme@five perf]$ clang -v |& head -2 >>> clang version 11.0.0 (Fedora 11.0.0-2.fc33) >>> Target: x86_64-unknown-linux-gnu >>> [acme@five perf]$ >>> >>> Do you need any extra info? >> >> Hmm... I am not able to reproduce this. 
I am trying to setup an environment similar >> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? > > I'll try it with a BPF proggie attached to a kprobes, but here is > something else I noticed: > > [root@five perf]# export PYTHONPATH=/tmp/build/perf/python > [root@five perf]# tools/perf/python/twatch.py > Traceback (most recent call last): > File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module> > import perf > ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy > [root@five perf]# perf test python > 19: 'import perf' in python : FAILED! > [root@five perf]# perf test -v python > 19: 'import perf' in python : > --- start --- > test child forked, pid 3198864 > python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' " > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy > test child finished with -1 > ---- end ---- > 'import perf' in python: FAILED! > [root@five perf]# > > This should be trivial, I hope, just add the new object file to > tools/perf/util/python-ext-sources, then do a 'perf test python', if it > fails, use 'perf test -v python' to see what is preventing the python > binding from loading. I fixed the undefined bpf_counter__destroy. 
But this one looks trickier: 19: 'import perf' in python : --- start --- test child forked, pid 2714986 python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/bin/python2' " Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: XXXXX /tools/perf/python/perf.so: undefined symbol: bpf_map_update_elem Given I already have: diff --git i/tools/perf/util/python-ext-sources w/tools/perf/util/python-ext-sources index a9d9c142eb7c3..2cac55273eca2 100644 --- i/tools/perf/util/python-ext-sources +++ w/tools/perf/util/python-ext-sources @@ -35,3 +35,6 @@ util/symbol_fprintf.c util/units.c util/affinity.c util/rwsem.c +util/bpf_counter.c +../lib/bpf/bpf.c +../lib/bpf/libbpf.c How should I fix this? Thanks, Song PS: I still cannot reproduce that segfault... > ^ permalink raw reply related [flat|nested] 25+ messages in thread
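The `undefined symbol: bpf_map_update_elem` error means the python `perf.so` references a libbpf symbol that was never linked into it. One common way out, which perf's own `util/python.c` has used for other optional features, is to give the binding stub definitions so the symbol resolves even though the code path is unreachable there. An illustrative sketch (names and signature chosen for the example, not the actual fix):

```c
#include <assert.h>
#include <stddef.h>

/* Stub sketch: when the python binding is built without the heavy
 * BPF-counter objects, a failing stub satisfies the linker. The
 * binding never drives perf-stat -b, so the stub is never reached
 * in practice; it exists only so dlopen() of perf.so succeeds. */
int bpf_map_update_elem_stub(int fd, const void *key,
			     const void *value, unsigned long flags)
{
	(void)fd; (void)key; (void)value; (void)flags;
	return -1;	/* feature unavailable in the python binding */
}
```

The alternative, pulling `../lib/bpf/bpf.c` and `../lib/bpf/libbpf.c` into python-ext-sources as in the diff above, drags in libbpf's own dependencies (libelf, zlib) and tends to just move the undefined-symbol error one layer down.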
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 18:42 ` Song Liu @ 2020-12-29 18:48 ` Arnaldo Carvalho de Melo 2020-12-29 19:11 ` Song Liu 0 siblings, 1 reply; 25+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-12-29 18:48 UTC (permalink / raw) To: Song Liu Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team Em Tue, Dec 29, 2020 at 06:42:18PM +0000, Song Liu escreveu: > > > > On Dec 29, 2020, at 7:15 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > > > Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu: > >> > >> > >>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > >>> > >>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu: > >>>> Introduce perf-stat -b option, which counts events for BPF programs, like: > >>>> > >>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 > >>>> 1.487903822 115,200 ref-cycles > >>>> 1.487903822 86,012 cycles > >>>> 2.489147029 80,560 ref-cycles > >>>> 2.489147029 73,784 cycles > >>>> 3.490341825 60,720 ref-cycles > >>>> 3.490341825 37,797 cycles > >>>> 4.491540887 37,120 ref-cycles > >>>> 4.491540887 31,963 cycles > >>>> > >>>> The example above counts cycles and ref-cycles of BPF program of id 254. > >>>> This is similar to bpftool-prog-profile command, but more flexible. > >>>> > >>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF > >>>> programs (monitor-progs) to the target BPF program (target-prog). The > >>>> monitor-progs read perf_event before and after the target-prog, and > >>>> aggregate the difference in a BPF map. Then the user space reads data > >>>> from these maps. > >>>> > >>>> A new struct bpf_counter is introduced to provide common interface that > >>>> uses BPF programs/maps to count perf events. 
> >>> > >>> Segfaulting here: > >>> > >>> [root@five ~]# bpftool prog | grep tracepoint > >>> 110: tracepoint name syscall_unaugme tag 57cd311f2e27366b gpl > >>> 111: tracepoint name sys_enter_conne tag 3555418ac9476139 gpl > >>> 112: tracepoint name sys_enter_sendt tag bc7fcadbaf7b8145 gpl > >>> 113: tracepoint name sys_enter_open tag 0e59c3ac2bea5280 gpl > >>> 114: tracepoint name sys_enter_opena tag 0baf443610f59837 gpl > >>> 115: tracepoint name sys_enter_renam tag 24664e4aca62d7fa gpl > >>> 116: tracepoint name sys_enter_renam tag 20093e51a8634ebb gpl > >>> 117: tracepoint name sys_enter tag 0bc3fc9d11754ba1 gpl > >>> 118: tracepoint name sys_exit tag 29c7ae234d79bd5c gpl > >>> [root@five ~]# > >>> [root@five ~]# gdb perf > >>> GNU gdb (GDB) Fedora 10.1-2.fc33 > >>> Reading symbols from perf... > >>> (gdb) run stat -e instructions,cycles -b 113 -I 1000 > >>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000 > >>> [Thread debugging using libthread_db enabled] > >>> Using host libthread_db library "/lib64/libthread_db.so.1". > >>> libbpf: elf: skipping unrecognized data section(9) .eh_frame > >>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > >>> libbpf: elf: skipping unrecognized data section(9) .eh_frame > >>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > >>> > >>> Program received signal SIGSEGV, Segmentation fault. > >>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 > >>> 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); > >>> (gdb) bt > >>> #0 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 > >>> #1 0x0000000000000000 in ?? () > >>> (gdb) > >>> > >>> [acme@five perf]$ clang -v |& head -2 > >>> clang version 11.0.0 (Fedora 11.0.0-2.fc33) > >>> Target: x86_64-unknown-linux-gnu > >>> [acme@five perf]$ > >>> > >>> Do you need any extra info? > >> > >> Hmm... 
I am not able to reproduce this. I am trying to setup an environment similar > >> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? > > > > I'll try it with a BPF proggie attached to a kprobes, but here is > > something else I noticed: > > > > [root@five perf]# export PYTHONPATH=/tmp/build/perf/python > > [root@five perf]# tools/perf/python/twatch.py > > Traceback (most recent call last): > > File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module> > > import perf > > ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy > > [root@five perf]# perf test python > > 19: 'import perf' in python : FAILED! > > [root@five perf]# perf test -v python > > 19: 'import perf' in python : > > --- start --- > > test child forked, pid 3198864 > > python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' " > > Traceback (most recent call last): > > File "<stdin>", line 1, in <module> > > ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy > > test child finished with -1 > > ---- end ---- > > 'import perf' in python: FAILED! > > [root@five perf]# > > > > This should be trivial, I hope, just add the new object file to > > tools/perf/util/python-ext-sources, then do a 'perf test python', if it > > fails, use 'perf test -v python' to see what is preventing the python > > binding from loading. > > I fixed the undefined bpf_counter__destroy. 
But this one looks trickier: > > 19: 'import perf' in python : > --- start --- > test child forked, pid 2714986 > python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/bin/python2' " > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > ImportError: XXXXX /tools/perf/python/perf.so: undefined symbol: bpf_map_update_elem > > Given I already have: I'll check this one to get a patch that at least moves the needle here, i.e. probably we can leave supporting bpf counters in the python binding for a later step. - Arnaldo > diff --git i/tools/perf/util/python-ext-sources w/tools/perf/util/python-ext-sources > index a9d9c142eb7c3..2cac55273eca2 100644 > --- i/tools/perf/util/python-ext-sources > +++ w/tools/perf/util/python-ext-sources > @@ -35,3 +35,6 @@ util/symbol_fprintf.c > util/units.c > util/affinity.c > util/rwsem.c > +util/bpf_counter.c > +../lib/bpf/bpf.c > +../lib/bpf/libbpf.c > > > How should I fix this? > > Thanks, > Song > > PS: I still cannot reproduce that segfault... > > > > -- - Arnaldo ^ permalink raw reply [flat|nested] 25+ messages in thread
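"Leave supporting bpf counters in the python binding for a later step" can also be done at compile time: guard the call sites with the build system's feature macro so unsupported builds compile to failing no-ops. A sketch of that gate (perf's config uses `HAVE_BPF_SKEL` for skeleton support, though the function name here is hypothetical):

```c
#include <assert.h>

/* Feature-gate sketch: without the config macro defined on the
 * command line (-DHAVE_BPF_SKEL), the BPF-counter entry point
 * compiles to a stub that reports "unsupported". */
#ifdef HAVE_BPF_SKEL
int bpf_counter_load_sketch(void)
{
	return 0;	/* real skeleton loading would go here */
}
#else
int bpf_counter_load_sketch(void)
{
	return -1;	/* built without BPF skeleton support */
}
#endif
```

Built without the define, as the python binding would be, the stub path is taken and no libbpf symbol is referenced at all.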
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 18:48 ` Arnaldo Carvalho de Melo @ 2020-12-29 19:11 ` Song Liu 2020-12-29 19:18 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 25+ messages in thread From: Song Liu @ 2020-12-29 19:11 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team > On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Tue, Dec 29, 2020 at 06:42:18PM +0000, Song Liu escreveu: >> >> >>> On Dec 29, 2020, at 7:15 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: >>> >>> Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu: >>>> >>>> >>>>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: >>>>> >>>>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu: >>>>>> Introduce perf-stat -b option, which counts events for BPF programs, like: >>>>>> >>>>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 >>>>>> 1.487903822 115,200 ref-cycles >>>>>> 1.487903822 86,012 cycles >>>>>> 2.489147029 80,560 ref-cycles >>>>>> 2.489147029 73,784 cycles >>>>>> 3.490341825 60,720 ref-cycles >>>>>> 3.490341825 37,797 cycles >>>>>> 4.491540887 37,120 ref-cycles >>>>>> 4.491540887 31,963 cycles >>>>>> >>>>>> The example above counts cycles and ref-cycles of BPF program of id 254. >>>>>> This is similar to bpftool-prog-profile command, but more flexible. >>>>>> >>>>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF >>>>>> programs (monitor-progs) to the target BPF program (target-prog). The >>>>>> monitor-progs read perf_event before and after the target-prog, and >>>>>> aggregate the difference in a BPF map. Then the user space reads data >>>>>> from these maps. >>>>>> >>>>>> A new struct bpf_counter is introduced to provide common interface that >>>>>> uses BPF programs/maps to count perf events. 
>>>>> >>>>> Segfaulting here: >>>>> >>>>> [root@five ~]# bpftool prog | grep tracepoint >>>>> 110: tracepoint name syscall_unaugme tag 57cd311f2e27366b gpl >>>>> 111: tracepoint name sys_enter_conne tag 3555418ac9476139 gpl >>>>> 112: tracepoint name sys_enter_sendt tag bc7fcadbaf7b8145 gpl >>>>> 113: tracepoint name sys_enter_open tag 0e59c3ac2bea5280 gpl >>>>> 114: tracepoint name sys_enter_opena tag 0baf443610f59837 gpl >>>>> 115: tracepoint name sys_enter_renam tag 24664e4aca62d7fa gpl >>>>> 116: tracepoint name sys_enter_renam tag 20093e51a8634ebb gpl >>>>> 117: tracepoint name sys_enter tag 0bc3fc9d11754ba1 gpl >>>>> 118: tracepoint name sys_exit tag 29c7ae234d79bd5c gpl >>>>> [root@five ~]# >>>>> [root@five ~]# gdb perf >>>>> GNU gdb (GDB) Fedora 10.1-2.fc33 >>>>> Reading symbols from perf... >>>>> (gdb) run stat -e instructions,cycles -b 113 -I 1000 >>>>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000 >>>>> [Thread debugging using libthread_db enabled] >>>>> Using host libthread_db library "/lib64/libthread_db.so.1". >>>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame >>>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame >>>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame >>>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame >>>>> >>>>> Program received signal SIGSEGV, Segmentation fault. >>>>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 >>>>> 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); >>>>> (gdb) bt >>>>> #0 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 >>>>> #1 0x0000000000000000 in ?? () >>>>> (gdb) >>>>> >>>>> [acme@five perf]$ clang -v |& head -2 >>>>> clang version 11.0.0 (Fedora 11.0.0-2.fc33) >>>>> Target: x86_64-unknown-linux-gnu >>>>> [acme@five perf]$ >>>>> >>>>> Do you need any extra info? >>>> >>>> Hmm... 
I am not able to reproduce this. I am trying to setup an environment similar >>>> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? >>> >>> I'll try it with a BPF proggie attached to a kprobes, but here is >>> something else I noticed: >>> >>> [root@five perf]# export PYTHONPATH=/tmp/build/perf/python >>> [root@five perf]# tools/perf/python/twatch.py >>> Traceback (most recent call last): >>> File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module> >>> import perf >>> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy >>> [root@five perf]# perf test python >>> 19: 'import perf' in python : FAILED! >>> [root@five perf]# perf test -v python >>> 19: 'import perf' in python : >>> --- start --- >>> test child forked, pid 3198864 >>> python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' " >>> Traceback (most recent call last): >>> File "<stdin>", line 1, in <module> >>> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy >>> test child finished with -1 >>> ---- end ---- >>> 'import perf' in python: FAILED! >>> [root@five perf]# >>> >>> This should be trivial, I hope, just add the new object file to >>> tools/perf/util/python-ext-sources, then do a 'perf test python', if it >>> fails, use 'perf test -v python' to see what is preventing the python >>> binding from loading. >> >> I fixed the undefined bpf_counter__destroy. 
But this one looks trickier: >> >> 19: 'import perf' in python : >> --- start --- >> test child forked, pid 2714986 >> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/bin/python2' " >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> ImportError: XXXXX /tools/perf/python/perf.so: undefined symbol: bpf_map_update_elem >> >> Given I already have: > > I'll check this one to get a patch that at least moves the needle here, > i.e. probably we can leave supporting bpf counters in the python binding > for a later step. Thanks Arnaldo! Currently, I have: 1. Fixed issues highlighted by Namhyung; 2. Merged 3/4 and 4/4; 3. NOT found segfault; 4. NOT fixed python import perf. I don't have good ideas with 3 and 4... Shall I send current code as v7? Thanks, Song > > - Arnaldo > >> diff --git i/tools/perf/util/python-ext-sources w/tools/perf/util/python-ext-sources >> index a9d9c142eb7c3..2cac55273eca2 100644 >> --- i/tools/perf/util/python-ext-sources >> +++ w/tools/perf/util/python-ext-sources >> @@ -35,3 +35,6 @@ util/symbol_fprintf.c >> util/units.c >> util/affinity.c >> util/rwsem.c >> +util/bpf_counter.c >> +../lib/bpf/bpf.c >> +../lib/bpf/libbpf.c >> >> >> How should I fix this? >> >> Thanks, >> Song >> >> PS: I still cannot reproduce that segfault... >> >>> >> > > -- > > - Arnaldo ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 19:11 ` Song Liu @ 2020-12-29 19:18 ` Arnaldo Carvalho de Melo 2020-12-29 19:23 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 25+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-12-29 19:18 UTC (permalink / raw) To: Song Liu Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team Em Tue, Dec 29, 2020 at 07:11:12PM +0000, Song Liu escreveu: > > > > On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > > > Em Tue, Dec 29, 2020 at 06:42:18PM +0000, Song Liu escreveu: > >> > >> > >>> On Dec 29, 2020, at 7:15 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > >>> > >>> Em Mon, Dec 28, 2020 at 11:43:25PM +0000, Song Liu escreveu: > >>>> > >>>> > >>>>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > >>>>> > >>>>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu: > >>>>>> Introduce perf-stat -b option, which counts events for BPF programs, like: > >>>>>> > >>>>>> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 > >>>>>> 1.487903822 115,200 ref-cycles > >>>>>> 1.487903822 86,012 cycles > >>>>>> 2.489147029 80,560 ref-cycles > >>>>>> 2.489147029 73,784 cycles > >>>>>> 3.490341825 60,720 ref-cycles > >>>>>> 3.490341825 37,797 cycles > >>>>>> 4.491540887 37,120 ref-cycles > >>>>>> 4.491540887 31,963 cycles > >>>>>> > >>>>>> The example above counts cycles and ref-cycles of BPF program of id 254. > >>>>>> This is similar to bpftool-prog-profile command, but more flexible. > >>>>>> > >>>>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF > >>>>>> programs (monitor-progs) to the target BPF program (target-prog). The > >>>>>> monitor-progs read perf_event before and after the target-prog, and > >>>>>> aggregate the difference in a BPF map. Then the user space reads data > >>>>>> from these maps. 
> >>>>>> > >>>>>> A new struct bpf_counter is introduced to provide common interface that > >>>>>> uses BPF programs/maps to count perf events. > >>>>> > >>>>> Segfaulting here: > >>>>> > >>>>> [root@five ~]# bpftool prog | grep tracepoint > >>>>> 110: tracepoint name syscall_unaugme tag 57cd311f2e27366b gpl > >>>>> 111: tracepoint name sys_enter_conne tag 3555418ac9476139 gpl > >>>>> 112: tracepoint name sys_enter_sendt tag bc7fcadbaf7b8145 gpl > >>>>> 113: tracepoint name sys_enter_open tag 0e59c3ac2bea5280 gpl > >>>>> 114: tracepoint name sys_enter_opena tag 0baf443610f59837 gpl > >>>>> 115: tracepoint name sys_enter_renam tag 24664e4aca62d7fa gpl > >>>>> 116: tracepoint name sys_enter_renam tag 20093e51a8634ebb gpl > >>>>> 117: tracepoint name sys_enter tag 0bc3fc9d11754ba1 gpl > >>>>> 118: tracepoint name sys_exit tag 29c7ae234d79bd5c gpl > >>>>> [root@five ~]# > >>>>> [root@five ~]# gdb perf > >>>>> GNU gdb (GDB) Fedora 10.1-2.fc33 > >>>>> Reading symbols from perf... > >>>>> (gdb) run stat -e instructions,cycles -b 113 -I 1000 > >>>>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000 > >>>>> [Thread debugging using libthread_db enabled] > >>>>> Using host libthread_db library "/lib64/libthread_db.so.1". > >>>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame > >>>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > >>>>> libbpf: elf: skipping unrecognized data section(9) .eh_frame > >>>>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > >>>>> > >>>>> Program received signal SIGSEGV, Segmentation fault. > >>>>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 > >>>>> 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); > >>>>> (gdb) bt > >>>>> #0 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217 > >>>>> #1 0x0000000000000000 in ?? 
() > >>>>> (gdb) > >>>>> > >>>>> [acme@five perf]$ clang -v |& head -2 > >>>>> clang version 11.0.0 (Fedora 11.0.0-2.fc33) > >>>>> Target: x86_64-unknown-linux-gnu > >>>>> [acme@five perf]$ > >>>>> > >>>>> Do you need any extra info? > >>>> > >>>> Hmm... I am not able to reproduce this. I am trying to setup an environment similar > >>>> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? > >>> > >>> I'll try it with a BPF proggie attached to a kprobes, but here is > >>> something else I noticed: > >>> > >>> [root@five perf]# export PYTHONPATH=/tmp/build/perf/python > >>> [root@five perf]# tools/perf/python/twatch.py > >>> Traceback (most recent call last): > >>> File "/home/acme/git/perf/tools/perf/python/twatch.py", line 9, in <module> > >>> import perf > >>> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy > >>> [root@five perf]# perf test python > >>> 19: 'import perf' in python : FAILED! > >>> [root@five perf]# perf test -v python > >>> 19: 'import perf' in python : > >>> --- start --- > >>> test child forked, pid 3198864 > >>> python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' " > >>> Traceback (most recent call last): > >>> File "<stdin>", line 1, in <module> > >>> ImportError: /tmp/build/perf/python/perf.cpython-39-x86_64-linux-gnu.so: undefined symbol: bpf_counter__destroy > >>> test child finished with -1 > >>> ---- end ---- > >>> 'import perf' in python: FAILED! > >>> [root@five perf]# > >>> > >>> This should be trivial, I hope, just add the new object file to > >>> tools/perf/util/python-ext-sources, then do a 'perf test python', if it > >>> fails, use 'perf test -v python' to see what is preventing the python > >>> binding from loading. > >> > >> I fixed the undefined bpf_counter__destroy. 
But this one looks trickier: > >> > >> 19: 'import perf' in python : > >> --- start --- > >> test child forked, pid 2714986 > >> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/bin/python2' " > >> Traceback (most recent call last): > >> File "<stdin>", line 1, in <module> > >> ImportError: XXXXX /tools/perf/python/perf.so: undefined symbol: bpf_map_update_elem > >> > >> Given I already have: > > > > I'll check this one to get a patch that at least moves the needle here, > > i.e. probably we can leave supporting bpf counters in the python binding > > for a later step. > > Thanks Arnaldo! > > Currently, I have: > 1. Fixed issues highlighted by Namhyung; > 2. Merged 3/4 and 4/4; > 3. NOT found segfault; > 4. NOT fixed python import perf. > > I don't have good ideas with 3 and 4... Shall I send current code as v7? For 4, please fold the patch below into the relevant patch; we don't need bpf_counter.h included in util/evsel.h, you even added a forward declaration for that 'struct bpf_counter_ops'. And in general we should refrain from adding extra includes to header files, .h-ell is not good. Then we provide a stub for that bpf_counter__destroy() so that util/evsel.o, when linked into the perf python binding, finds it there and links ok. As we don't have a way to create such events via the perf python binding, there will be nothing to be done when destroying evsels created via python. 
- Arnaldo diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index 40e3946cd7518113..8226b1fefa8cf2a3 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -10,7 +10,6 @@ #include <internal/evsel.h> #include <perf/evsel.h> #include "symbol_conf.h" -#include "bpf_counter.h" #include <internal/cpumap.h> struct bpf_object; diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c index cc5ade85a33fc999..9609cc166d71a6f5 100644 --- a/tools/perf/util/python.c +++ b/tools/perf/util/python.c @@ -79,6 +79,21 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp, return 0; } +/* + * XXX: All these evsel destructors need some better mechanism, like a linked + * list of destructors registered when the relevant code indeed is used instead + * of having more and more calls in perf_evsel__delete(). -- acme + * + * For now, add one more: + * + * Not to drag the BPF bandwagon... + */ +void bpf_counter__destroy(struct evsel *evsel); + +void bpf_counter__destroy(struct evsel *evsel __maybe_unused) +{ +} + /* * Support debug printing even though util/debug.c is not linked. That means * implementing 'verbose' and 'eprintf'. ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 19:18 ` Arnaldo Carvalho de Melo @ 2020-12-29 19:23 ` Arnaldo Carvalho de Melo 2020-12-29 19:32 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 25+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-12-29 19:23 UTC (permalink / raw) To: Song Liu Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team Em Tue, Dec 29, 2020 at 04:18:48PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Tue, Dec 29, 2020 at 07:11:12PM +0000, Song Liu escreveu: > > > On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > > I'll check this one to get a patch that at least moves the needle here, > > > i.e. probably we can leave supporting bpf counters in the python binding > > > for a later step. > > Thanks Arnaldo! > > Currently, I have: > > 1. Fixed issues highlighted by Namhyung; > > 2. Merged 3/4 and 4/4; > > 3. NOT found segfault; > > 4. NOT fixed python import perf. > > I don't have good ideas with 3 and 4... Shall I send current code as v7? > For 4, please fold the patch below into the relevant patch, we don't > need bpf_counter.h included in util/evsel.h, you even added a forward > declaration for that 'struct bpf_counter_ops'. > And in general we should refrain from adding extra includes to header > files, .h-ell is not good. > > Then we provide a stub for that bpf_counter__destroy() so that > util/evsel.o when linked into the perf python biding find it there, > links ok. 
Ok, one more stub is needed, I wasn't building all the time with $ make BUILD_BPF_SKEL=1 Ditch the previous patch please, use the one below instead: - Arnaldo diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index 40e3946cd7518113..8226b1fefa8cf2a3 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -10,7 +10,6 @@ #include <internal/evsel.h> #include <perf/evsel.h> #include "symbol_conf.h" -#include "bpf_counter.h" #include <internal/cpumap.h> struct bpf_object; diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c index cc5ade85a33fc999..278abecb5bdfc0d2 100644 --- a/tools/perf/util/python.c +++ b/tools/perf/util/python.c @@ -79,6 +79,27 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp, return 0; } +/* + * XXX: All these evsel destructors need some better mechanism, like a linked + * list of destructors registered when the relevant code indeed is used instead + * of having more and more calls in perf_evsel__delete(). -- acme + * + * For now, add some more: + * + * Not to drag the BPF bandwagon... + */ +void bpf_counter__destroy(struct evsel *evsel); +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd); + +void bpf_counter__destroy(struct evsel *evsel __maybe_unused) +{ +} + +int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, int cpu __maybe_unused, int fd __maybe_unused) +{ + return 0; +} + /* * Support debug printing even though util/debug.c is not linked. That means * implementing 'verbose' and 'eprintf'. ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 19:23 ` Arnaldo Carvalho de Melo @ 2020-12-29 19:32 ` Arnaldo Carvalho de Melo 2020-12-29 21:40 ` Song Liu 0 siblings, 1 reply; 25+ messages in thread From: Arnaldo Carvalho de Melo @ 2020-12-29 19:32 UTC (permalink / raw) To: Song Liu Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team Em Tue, Dec 29, 2020 at 04:23:47PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Tue, Dec 29, 2020 at 04:18:48PM -0300, Arnaldo Carvalho de Melo escreveu: > > Em Tue, Dec 29, 2020 at 07:11:12PM +0000, Song Liu escreveu: > > > > On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > > > I'll check this one to get a patch that at least moves the needle here, > > > > i.e. probably we can leave supporting bpf counters in the python binding > > > > for a later step. > > > > Thanks Arnaldo! > > > > Currently, I have: > > > 1. Fixed issues highlighted by Namhyung; > > > 2. Merged 3/4 and 4/4; > > > 3. NOT found segfault; > > > 4. NOT fixed python import perf. 
For 3, now with a kprobe: [root@five ~]# bpftool prog | grep hrtimer -A10 99: kprobe name hrtimer_nanosle tag 0e77bacaf4555f83 gpl loaded_at 2020-12-29T16:25:34-0300 uid 0 xlated 80B jited 49B memlock 4096B btf_id 253 [root@five ~]# perf stat -I 1000 -b 99 libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame Segmentation fault (core dumped) [root@five ~]# (gdb) run stat -I 1000 -b 99 Starting program: /root/bin/perf stat -I 1000 -b 99 Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". 
libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame Program received signal SIGSEGV, Segmentation fault. 
0x000000000056559b in bpf_program_profiler__read (evsel=0xc39770) at util/bpf_counter.c:217 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.fc33.x86_64 cyrus-sasl-lib-2.1.27-6.fc33.x86_64 elfutils-debuginfod-client-0.182-1.fc33.x86_64 elfutils-libelf-0.182-1.fc33.x86_64 elfutils-libs-0.182-1.fc33.x86_64 keyutils-libs-1.6-5.fc33.x86_64 krb5-libs-1.18.2-29.fc33.x86_64 libbabeltrace-1.5.8-3.fc33.x86_64 libbrotli-1.0.9-3.fc33.x86_64 libcap-2.26-8.fc33.x86_64 libcom_err-1.45.6-4.fc33.x86_64 libcurl-7.71.1-8.fc33.x86_64 libgcc-10.2.1-9.fc33.x86_64 libidn2-2.3.0-4.fc33.x86_64 libnghttp2-1.41.0-3.fc33.x86_64 libpsl-0.21.1-2.fc33.x86_64 libselinux-3.1-2.fc33.x86_64 libssh-0.9.5-1.fc33.x86_64 libunistring-0.9.10-9.fc33.x86_64 libunwind-1.4.0-4.fc33.x86_64 libuuid-2.36-3.fc33.x86_64 libxcrypt-4.4.17-1.fc33.x86_64 libzstd-1.4.5-5.fc33.x86_64 numactl-libs-2.0.14-1.fc33.x86_64 openldap-2.4.50-5.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64 pcre-8.44-2.fc33.x86_64 pcre2-10.36-1.fc33.x86_64 perl-libs-5.32.0-464.fc33.x86_64 popt-1.18-2.fc33.x86_64 python3-libs-3.9.1-1.fc33.x86_64 slang-2.3.2-8.fc33.x86_64 xz-libs-5.2.5-3.fc33.x86_64 (gdb) bt #0 0x000000000056559b in bpf_program_profiler__read (evsel=0xc39770) at util/bpf_counter.c:217 #1 0x0000000000000000 in ?? 
() (gdb) p skel->maps.accum_readings Cannot access memory at address 0x20 (gdb) p skel $1 = (struct bpf_prog_profiler_bpf *) 0x0 (gdb) list -10 202 int reading_map_fd; 203 __u32 key = 0; 204 int err, cpu; 205 206 if (list_empty(&evsel->bpf_counter_list)) 207 return -EAGAIN; 208 209 for (cpu = 0; cpu < num_cpu; cpu++) { 210 perf_counts(evsel->counts, cpu, 0)->val = 0; 211 perf_counts(evsel->counts, cpu, 0)->ena = 0; (gdb) 212 perf_counts(evsel->counts, cpu, 0)->run = 0; 213 } 214 list_for_each_entry(counter, &evsel->bpf_counter_list, list) { 215 struct bpf_prog_profiler_bpf *skel = counter->skel; 216 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); 218 219 err = bpf_map_lookup_elem(reading_map_fd, &key, values); 220 if (err) { 221 fprintf(stderr, "failed to read value\n"); (gdb) p counter->skel $2 = (void *) 0x0 (gdb) p perf_evsel__name(counter) No symbol "perf_evsel__name" in current context. (gdb) p evsel__name(counter) $3 = 0xc77420 "unknown attr type: 13078424" (gdb) p evsel->type There is no member named type. (gdb) p evsel->core. attr cpus fd id ids node nr_members own_cpus sample_id system_wide threads (gdb) p evsel->core.attr.type $4 = 1 (gdb) p evsel->core.attr.config $5 = 0 (gdb) p evsel->evlist $6 = (struct evlist *) 0xc3cfd0 (gdb) p evsel->evlist->core.nr_entries $7 = 10 (gdb) 10 entries, the default for 'perf stat' With just one event: (gdb) run stat -e cycles -I 1000 -b 99 Starting program: /root/bin/perf stat -e cycles -I 1000 -b 99 Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". libbpf: elf: skipping unrecognized data section(9) .eh_frame libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame Program received signal SIGSEGV, Segmentation fault. 
0x000000000056559b in bpf_program_profiler__read (evsel=0xc392c0) at util/bpf_counter.c:217 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.fc33.x86_64 cyrus-sasl-lib-2.1.27-6.fc33.x86_64 elfutils-debuginfod-client-0.182-1.fc33.x86_64 elfutils-libelf-0.182-1.fc33.x86_64 elfutils-libs-0.182-1.fc33.x86_64 keyutils-libs-1.6-5.fc33.x86_64 krb5-libs-1.18.2-29.fc33.x86_64 libbabeltrace-1.5.8-3.fc33.x86_64 libbrotli-1.0.9-3.fc33.x86_64 libcap-2.26-8.fc33.x86_64 libcom_err-1.45.6-4.fc33.x86_64 libcurl-7.71.1-8.fc33.x86_64 libgcc-10.2.1-9.fc33.x86_64 libidn2-2.3.0-4.fc33.x86_64 libnghttp2-1.41.0-3.fc33.x86_64 libpsl-0.21.1-2.fc33.x86_64 libselinux-3.1-2.fc33.x86_64 libssh-0.9.5-1.fc33.x86_64 libunistring-0.9.10-9.fc33.x86_64 libunwind-1.4.0-4.fc33.x86_64 libuuid-2.36-3.fc33.x86_64 libxcrypt-4.4.17-1.fc33.x86_64 libzstd-1.4.5-5.fc33.x86_64 numactl-libs-2.0.14-1.fc33.x86_64 openldap-2.4.50-5.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64 pcre-8.44-2.fc33.x86_64 pcre2-10.36-1.fc33.x86_64 perl-libs-5.32.0-464.fc33.x86_64 popt-1.18-2.fc33.x86_64 python3-libs-3.9.1-1.fc33.x86_64 slang-2.3.2-8.fc33.x86_64 xz-libs-5.2.5-3.fc33.x86_64 (gdb) bt #0 0x000000000056559b in bpf_program_profiler__read (evsel=0xc392c0) at util/bpf_counter.c:217 #1 0x0000000000000000 in ?? 
() (gdb) p evsel->name $1 = 0xc37960 "cycles" (gdb) p evsel->bpf_counter_ bpf_counter_list bpf_counter_ops (gdb) p evsel->bpf_counter_ops $2 = (struct bpf_counter_ops *) 0xa08ec0 <bpf_program_profiler_ops> (gdb) p evsel->bpf_counter_ bpf_counter_list bpf_counter_ops (gdb) p evsel->bpf_counter_list $3 = {next = 0xc36e18, prev = 0xc36e18} (gdb) p evsel->s sample_size side_band stats supported synth_sample_type (gdb) list -5 207 return -EAGAIN; 208 209 for (cpu = 0; cpu < num_cpu; cpu++) { 210 perf_counts(evsel->counts, cpu, 0)->val = 0; 211 perf_counts(evsel->counts, cpu, 0)->ena = 0; 212 perf_counts(evsel->counts, cpu, 0)->run = 0; 213 } 214 list_for_each_entry(counter, &evsel->bpf_counter_list, list) { 215 struct bpf_prog_profiler_bpf *skel = counter->skel; 216 (gdb) 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); 218 219 err = bpf_map_lookup_elem(reading_map_fd, &key, values); 220 if (err) { 221 fprintf(stderr, "failed to read value\n"); 222 return err; 223 } 224 225 for (cpu = 0; cpu < num_cpu; cpu++) { 226 perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter; (gdb) p counter->skel $4 = (void *) 0x0 (gdb) skel is NULL?! I ran out of time, have to go errands now. will bbl. - Arnaldo > > > I don't have good ideas with 3 and 4... Shall I send current code as v7? > > > For 4, please fold the patch below into the relevant patch, we don't > > need bpf_counter.h included in util/evsel.h, you even added a forward > > declaration for that 'struct bpf_counter_ops'. > > > And in general we should refrain from adding extra includes to header > > files, .h-ell is not good. > > > > Then we provide a stub for that bpf_counter__destroy() so that > > util/evsel.o when linked into the perf python biding find it there, > > links ok. 
> > Ok, one more stub is needed, I wasn't building all the time with > > $ make BUILD_BPF_SKEL=1 > > Ditch the previous patch please, use the one below instead: > > - Arnaldo > > diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h > index 40e3946cd7518113..8226b1fefa8cf2a3 100644 > --- a/tools/perf/util/evsel.h > +++ b/tools/perf/util/evsel.h > @@ -10,7 +10,6 @@ > #include <internal/evsel.h> > #include <perf/evsel.h> > #include "symbol_conf.h" > -#include "bpf_counter.h" > #include <internal/cpumap.h> > > struct bpf_object; > diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c > index cc5ade85a33fc999..278abecb5bdfc0d2 100644 > --- a/tools/perf/util/python.c > +++ b/tools/perf/util/python.c > @@ -79,6 +79,27 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp, > return 0; > } > > +/* > + * XXX: All these evsel destructors need some better mechanism, like a linked > + * list of destructors registered when the relevant code indeed is used instead > + * of having more and more calls in perf_evsel__delete(). -- acme > + * > + * For now, add some more: > + * > + * Not to drag the BPF bandwagon... > + */ > +void bpf_counter__destroy(struct evsel *evsel); > +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd); > + > +void bpf_counter__destroy(struct evsel *evsel __maybe_unused) > +{ > +} > + > +int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, int cpu __maybe_unused, int fd __maybe_unused) > +{ > + return 0; > +} > + > /* > * Support debug printing even though util/debug.c is not linked. That means > * implementing 'verbose' and 'eprintf'. -- - Arnaldo ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 19:32 ` Arnaldo Carvalho de Melo @ 2020-12-29 21:40 ` Song Liu 0 siblings, 0 replies; 25+ messages in thread From: Song Liu @ 2020-12-29 21:40 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: lkml, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, Kernel Team > On Dec 29, 2020, at 11:32 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Tue, Dec 29, 2020 at 04:23:47PM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Tue, Dec 29, 2020 at 04:18:48PM -0300, Arnaldo Carvalho de Melo escreveu: >>> Em Tue, Dec 29, 2020 at 07:11:12PM +0000, Song Liu escreveu: >>>>> On Dec 29, 2020, at 10:48 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote: >>>>> I'll check this one to get a patch that at least moves the needle here, >>>>> i.e. probably we can leave supporting bpf counters in the python binding >>>>> for a later step. >> >>>> Thanks Arnaldo! >> >>>> Currently, I have: >>>> 1. Fixed issues highlighted by Namhyung; >>>> 2. Merged 3/4 and 4/4; >>>> 3. NOT found segfault; >>>> 4. NOT fixed python import perf. 
> > For 3, now with a kprobe: > > [root@five ~]# bpftool prog | grep hrtimer -A10 > 99: kprobe name hrtimer_nanosle tag 0e77bacaf4555f83 gpl > loaded_at 2020-12-29T16:25:34-0300 uid 0 > xlated 80B jited 49B memlock 4096B > btf_id 253 > [root@five ~]# perf stat -I 1000 -b 99 > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > Segmentation fault (core dumped) > [root@five ~]# > > (gdb) run stat -I 1000 -b 99 > Starting program: /root/bin/perf stat -I 1000 -b 99 > Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64 > [Thread debugging using libthread_db enabled] > Using host 
libthread_db library "/lib64/libthread_db.so.1". > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > > Program received signal SIGSEGV, Segmentation fault. 
> 0x000000000056559b in bpf_program_profiler__read (evsel=0xc39770) at util/bpf_counter.c:217 > 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); > Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.fc33.x86_64 cyrus-sasl-lib-2.1.27-6.fc33.x86_64 elfutils-debuginfod-client-0.182-1.fc33.x86_64 elfutils-libelf-0.182-1.fc33.x86_64 elfutils-libs-0.182-1.fc33.x86_64 keyutils-libs-1.6-5.fc33.x86_64 krb5-libs-1.18.2-29.fc33.x86_64 libbabeltrace-1.5.8-3.fc33.x86_64 libbrotli-1.0.9-3.fc33.x86_64 libcap-2.26-8.fc33.x86_64 libcom_err-1.45.6-4.fc33.x86_64 libcurl-7.71.1-8.fc33.x86_64 libgcc-10.2.1-9.fc33.x86_64 libidn2-2.3.0-4.fc33.x86_64 libnghttp2-1.41.0-3.fc33.x86_64 libpsl-0.21.1-2.fc33.x86_64 libselinux-3.1-2.fc33.x86_64 libssh-0.9.5-1.fc33.x86_64 libunistring-0.9.10-9.fc33.x86_64 libunwind-1.4.0-4.fc33.x86_64 libuuid-2.36-3.fc33.x86_64 libxcrypt-4.4.17-1.fc33.x86_64 libzstd-1.4.5-5.fc33.x86_64 numactl-libs-2.0.14-1.fc33.x86_64 openldap-2.4.50-5.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64 pcre-8.44-2.fc33.x86_64 pcre2-10.36-1.fc33.x86_64 perl-libs-5.32.0-464.fc33.x86_64 popt-1.18-2.fc33.x86_64 python3-libs-3.9.1-1.fc33.x86_64 slang-2.3.2-8.fc33.x86_64 xz-libs-5.2.5-3.fc33.x86_64 > (gdb) bt > #0 0x000000000056559b in bpf_program_profiler__read (evsel=0xc39770) at util/bpf_counter.c:217 > #1 0x0000000000000000 in ?? 
() > (gdb) p skel->maps.accum_readings > Cannot access memory at address 0x20 > (gdb) p skel > $1 = (struct bpf_prog_profiler_bpf *) 0x0 > (gdb) list -10 > 202 int reading_map_fd; > 203 __u32 key = 0; > 204 int err, cpu; > 205 > 206 if (list_empty(&evsel->bpf_counter_list)) > 207 return -EAGAIN; > 208 > 209 for (cpu = 0; cpu < num_cpu; cpu++) { > 210 perf_counts(evsel->counts, cpu, 0)->val = 0; > 211 perf_counts(evsel->counts, cpu, 0)->ena = 0; > (gdb) > 212 perf_counts(evsel->counts, cpu, 0)->run = 0; > 213 } > 214 list_for_each_entry(counter, &evsel->bpf_counter_list, list) { > 215 struct bpf_prog_profiler_bpf *skel = counter->skel; > 216 > 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); > 218 > 219 err = bpf_map_lookup_elem(reading_map_fd, &key, values); > 220 if (err) { > 221 fprintf(stderr, "failed to read value\n"); > (gdb) p counter->skel > $2 = (void *) 0x0 > (gdb) p perf_evsel__name(counter) > No symbol "perf_evsel__name" in current context. > (gdb) p evsel__name(counter) > $3 = 0xc77420 "unknown attr type: 13078424" > (gdb) p evsel->type > There is no member named type. > (gdb) p evsel->core. > attr cpus fd id ids node nr_members own_cpus sample_id system_wide threads > (gdb) p evsel->core.attr.type > $4 = 1 > (gdb) p evsel->core.attr.config > $5 = 0 > (gdb) p evsel->evlist > $6 = (struct evlist *) 0xc3cfd0 > (gdb) p evsel->evlist->core.nr_entries > $7 = 10 > (gdb) > > > 10 entries, the default for 'perf stat' > > > With just one event: > > (gdb) run stat -e cycles -I 1000 -b 99 > Starting program: /root/bin/perf stat -e cycles -I 1000 -b 99 > Missing separate debuginfos, use: dnf debuginfo-install glibc-2.32-2.fc33.x86_64 > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/lib64/libthread_db.so.1". > libbpf: elf: skipping unrecognized data section(9) .eh_frame > libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame > > Program received signal SIGSEGV, Segmentation fault. 
> 0x000000000056559b in bpf_program_profiler__read (evsel=0xc392c0) at util/bpf_counter.c:217 > 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); > Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.fc33.x86_64 cyrus-sasl-lib-2.1.27-6.fc33.x86_64 elfutils-debuginfod-client-0.182-1.fc33.x86_64 elfutils-libelf-0.182-1.fc33.x86_64 elfutils-libs-0.182-1.fc33.x86_64 keyutils-libs-1.6-5.fc33.x86_64 krb5-libs-1.18.2-29.fc33.x86_64 libbabeltrace-1.5.8-3.fc33.x86_64 libbrotli-1.0.9-3.fc33.x86_64 libcap-2.26-8.fc33.x86_64 libcom_err-1.45.6-4.fc33.x86_64 libcurl-7.71.1-8.fc33.x86_64 libgcc-10.2.1-9.fc33.x86_64 libidn2-2.3.0-4.fc33.x86_64 libnghttp2-1.41.0-3.fc33.x86_64 libpsl-0.21.1-2.fc33.x86_64 libselinux-3.1-2.fc33.x86_64 libssh-0.9.5-1.fc33.x86_64 libunistring-0.9.10-9.fc33.x86_64 libunwind-1.4.0-4.fc33.x86_64 libuuid-2.36-3.fc33.x86_64 libxcrypt-4.4.17-1.fc33.x86_64 libzstd-1.4.5-5.fc33.x86_64 numactl-libs-2.0.14-1.fc33.x86_64 openldap-2.4.50-5.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64 pcre-8.44-2.fc33.x86_64 pcre2-10.36-1.fc33.x86_64 perl-libs-5.32.0-464.fc33.x86_64 popt-1.18-2.fc33.x86_64 python3-libs-3.9.1-1.fc33.x86_64 slang-2.3.2-8.fc33.x86_64 xz-libs-5.2.5-3.fc33.x86_64 > (gdb) bt > #0 0x000000000056559b in bpf_program_profiler__read (evsel=0xc392c0) at util/bpf_counter.c:217 > #1 0x0000000000000000 in ?? 
() > (gdb) p evsel->name > $1 = 0xc37960 "cycles" > (gdb) p evsel->bpf_counter_ > bpf_counter_list bpf_counter_ops > (gdb) p evsel->bpf_counter_ops > $2 = (struct bpf_counter_ops *) 0xa08ec0 <bpf_program_profiler_ops> > (gdb) p evsel->bpf_counter_ > bpf_counter_list bpf_counter_ops > (gdb) p evsel->bpf_counter_list > $3 = {next = 0xc36e18, prev = 0xc36e18} > (gdb) p evsel->s > sample_size side_band stats supported synth_sample_type > (gdb) list -5 > 207 return -EAGAIN; > 208 > 209 for (cpu = 0; cpu < num_cpu; cpu++) { > 210 perf_counts(evsel->counts, cpu, 0)->val = 0; > 211 perf_counts(evsel->counts, cpu, 0)->ena = 0; > 212 perf_counts(evsel->counts, cpu, 0)->run = 0; > 213 } > 214 list_for_each_entry(counter, &evsel->bpf_counter_list, list) { > 215 struct bpf_prog_profiler_bpf *skel = counter->skel; > 216 > (gdb) > 217 reading_map_fd = bpf_map__fd(skel->maps.accum_readings); > 218 > 219 err = bpf_map_lookup_elem(reading_map_fd, &key, values); > 220 if (err) { > 221 fprintf(stderr, "failed to read value\n"); > 222 return err; > 223 } > 224 > 225 for (cpu = 0; cpu < num_cpu; cpu++) { > 226 perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter; > (gdb) p counter->skel > $4 = (void *) 0x0 > (gdb) > > skel is NULL?! So it is skel == NULL. In v7 (coming soon), I fixed some issues in the allocate/free of skel, and added some assert(). Let's see how that goes.. Thanks, Song > > I ran out of time, have to go errands now. will bbl. > > - Arnaldo > >>>> I don't have good ideas with 3 and 4... Shall I send current code as v7? >> >>> For 4, please fold the patch below into the relevant patch, we don't >>> need bpf_counter.h included in util/evsel.h, you even added a forward >>> declaration for that 'struct bpf_counter_ops'. >> >>> And in general we should refrain from adding extra includes to header >>> files, .h-ell is not good. 
>>> >>> Then we provide a stub for that bpf_counter__destroy() so that >>> util/evsel.o when linked into the perf python biding find it there, >>> links ok. >> >> Ok, one more stub is needed, I wasn't building all the time with >> >> $ make BUILD_BPF_SKEL=1 >> >> Ditch the previous patch please, use the one below instead: >> >> - Arnaldo >> >> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h >> index 40e3946cd7518113..8226b1fefa8cf2a3 100644 >> --- a/tools/perf/util/evsel.h >> +++ b/tools/perf/util/evsel.h >> @@ -10,7 +10,6 @@ >> #include <internal/evsel.h> >> #include <perf/evsel.h> >> #include "symbol_conf.h" >> -#include "bpf_counter.h" >> #include <internal/cpumap.h> >> >> struct bpf_object; >> diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c >> index cc5ade85a33fc999..278abecb5bdfc0d2 100644 >> --- a/tools/perf/util/python.c >> +++ b/tools/perf/util/python.c >> @@ -79,6 +79,27 @@ int metricgroup__copy_metric_events(struct evlist *evlist, struct cgroup *cgrp, >> return 0; >> } >> >> +/* >> + * XXX: All these evsel destructors need some better mechanism, like a linked >> + * list of destructors registered when the relevant code indeed is used instead >> + * of having more and more calls in perf_evsel__delete(). -- acme >> + * >> + * For now, add some more: >> + * >> + * Not to drag the BPF bandwagon... >> + */ >> +void bpf_counter__destroy(struct evsel *evsel); >> +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd); >> + >> +void bpf_counter__destroy(struct evsel *evsel __maybe_unused) >> +{ >> +} >> + >> +int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, int cpu __maybe_unused, int fd __maybe_unused) >> +{ >> + return 0; >> +} >> + >> /* >> * Support debug printing even though util/debug.c is not linked. That means >> * implementing 'verbose' and 'eprintf'. > > -- > > - Arnaldo ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu 2020-12-28 20:11 ` Arnaldo Carvalho de Melo @ 2020-12-29 7:22 ` Namhyung Kim 2020-12-29 17:46 ` Song Liu 1 sibling, 1 reply; 25+ messages in thread From: Namhyung Kim @ 2020-12-29 7:22 UTC (permalink / raw) To: Song Liu Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa, kernel-team On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote: > > Introduce perf-stat -b option, which counts events for BPF programs, like: > > [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 > 1.487903822 115,200 ref-cycles > 1.487903822 86,012 cycles > 2.489147029 80,560 ref-cycles > 2.489147029 73,784 cycles > 3.490341825 60,720 ref-cycles > 3.490341825 37,797 cycles > 4.491540887 37,120 ref-cycles > 4.491540887 31,963 cycles > > The example above counts cycles and ref-cycles of BPF program of id 254. > This is similar to bpftool-prog-profile command, but more flexible. > > perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF > programs (monitor-progs) to the target BPF program (target-prog). The > monitor-progs read perf_event before and after the target-prog, and > aggregate the difference in a BPF map. Then the user space reads data > from these maps. > > A new struct bpf_counter is introduced to provide common interface that > uses BPF programs/maps to count perf events. 
> > Signed-off-by: Song Liu <songliubraving@fb.com> > --- > tools/perf/Makefile.perf | 2 +- > tools/perf/builtin-stat.c | 77 ++++- > tools/perf/util/Build | 1 + > tools/perf/util/bpf_counter.c | 296 ++++++++++++++++++ > tools/perf/util/bpf_counter.h | 72 +++++ > .../util/bpf_skel/bpf_prog_profiler.bpf.c | 93 ++++++ > tools/perf/util/evsel.c | 9 + > tools/perf/util/evsel.h | 6 + > tools/perf/util/stat-display.c | 4 +- > tools/perf/util/stat.c | 2 +- > tools/perf/util/target.c | 34 +- > tools/perf/util/target.h | 10 + > 12 files changed, 588 insertions(+), 18 deletions(-) > create mode 100644 tools/perf/util/bpf_counter.c > create mode 100644 tools/perf/util/bpf_counter.h > create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c > > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf > index d182a2dbb9bbd..8c4e039c3b813 100644 > --- a/tools/perf/Makefile.perf > +++ b/tools/perf/Makefile.perf > @@ -1015,7 +1015,7 @@ python-clean: > > SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) > SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) > -SKELETONS := > +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h > > ifdef BUILD_BPF_SKEL > BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > index 8cc24967bc273..09bffb3fbcdd4 100644 > --- a/tools/perf/builtin-stat.c > +++ b/tools/perf/builtin-stat.c > @@ -67,6 +67,7 @@ > #include "util/top.h" > #include "util/affinity.h" > #include "util/pfm.h" > +#include "util/bpf_counter.h" > #include "asm/bug.h" > > #include <linux/time64.h> > @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs) > return 0; > } > > +static int read_bpf_map_counters(void) > +{ > + struct evsel *counter; > + int err; > + > + evlist__for_each_entry(evsel_list, counter) { > + err = bpf_counter__read(counter); > + if (err) > + return err; > + } > + return 0; > +} > + > static void read_counters(struct timespec *rs) > { > struct evsel *counter; > + int err; > > - 
if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0)) > - return; > + if (!stat_config.stop_read_counter) { > + err = read_bpf_map_counters(); > + if (err == -EAGAIN) > + err = read_affinity_counters(rs); Instead of checking the error code, can we do something like if (target__has_bpf(target)) read_bpf_map_counters(); ? > + if (err < 0) > + return; > + } > > evlist__for_each_entry(evsel_list, counter) { > if (counter->err) > @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times) > return false; > } > > -static void enable_counters(void) > +static int enable_counters(void) > { > + struct evsel *evsel; > + int err; > + > + evlist__for_each_entry(evsel_list, evsel) { > + err = bpf_counter__enable(evsel); > + if (err) > + return err; Ditto. > + } > + > if (stat_config.initial_delay < 0) { > pr_info(EVLIST_DISABLED_MSG); > - return; > + return 0; > } > > if (stat_config.initial_delay > 0) { > @@ -518,6 +547,7 @@ static void enable_counters(void) > if (stat_config.initial_delay > 0) > pr_info(EVLIST_ENABLED_MSG); > } > + return 0; > } > > static void disable_counters(void) > @@ -720,7 +750,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > const bool forks = (argc > 0); > bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false; > struct affinity affinity; > - int i, cpu; > + int i, cpu, err; > bool second_pass = false; > > if (forks) { > @@ -737,6 +767,11 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > if (affinity__setup(&affinity) < 0) > return -1; > > + evlist__for_each_entry(evsel_list, counter) { > + if (bpf_counter__load(counter, &target)) > + return -1; > + } > + Ditto. 
> evlist__for_each_cpu (evsel_list, i, cpu) { > affinity__set(&affinity, cpu); > > @@ -850,7 +885,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > } > > if (STAT_RECORD) { > - int err, fd = perf_data__fd(&perf_stat.data); > + int fd = perf_data__fd(&perf_stat.data); > > if (is_pipe) { > err = perf_header__write_pipe(perf_data__fd(&perf_stat.data)); [SNIP] > perf-$(CONFIG_LIBELF) += probe-file.o > diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c > new file mode 100644 > index 0000000000000..f2cb86a40c882 > --- /dev/null > +++ b/tools/perf/util/bpf_counter.c > @@ -0,0 +1,296 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +/* Copyright (c) 2019 Facebook */ > + > +#include <limits.h> > +#include <unistd.h> > +#include <sys/time.h> > +#include <sys/resource.h> > +#include <linux/err.h> > +#include <linux/zalloc.h> > +#include <bpf/bpf.h> > +#include <bpf/btf.h> > +#include <bpf/libbpf.h> > + > +#include "bpf_counter.h" > +#include "counts.h" > +#include "debug.h" > +#include "evsel.h" > +#include "target.h" > + > +#include "bpf_skel/bpf_prog_profiler.skel.h" > + > +static inline void *u64_to_ptr(__u64 ptr) > +{ > + return (void *)(unsigned long)ptr; > +} > + > +static void set_max_rlimit(void) > +{ > + struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY }; > + > + setrlimit(RLIMIT_MEMLOCK, &rinf); > +} This looks scary.. 
> + > +static struct bpf_counter *bpf_counter_alloc(void) > +{ > + struct bpf_counter *counter; > + > + counter = zalloc(sizeof(*counter)); > + if (counter) > + INIT_LIST_HEAD(&counter->list); > + return counter; > +} > + > +static int bpf_program_profiler__destroy(struct evsel *evsel) > +{ > + struct bpf_counter *counter; > + > + list_for_each_entry(counter, &evsel->bpf_counter_list, list) > + bpf_prog_profiler_bpf__destroy(counter->skel); > + INIT_LIST_HEAD(&evsel->bpf_counter_list); > + return 0; > +} > + > +static char *bpf_target_prog_name(int tgt_fd) > +{ > + struct bpf_prog_info_linear *info_linear; > + struct bpf_func_info *func_info; > + const struct btf_type *t; > + char *name = NULL; > + struct btf *btf; > + > + info_linear = bpf_program__get_prog_info_linear( > + tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO); > + if (IS_ERR_OR_NULL(info_linear)) { > + pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd); > + return NULL; > + } > + > + if (info_linear->info.btf_id == 0 || > + btf__get_from_id(info_linear->info.btf_id, &btf)) { > + pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd); > + goto out; > + } > + > + func_info = u64_to_ptr(info_linear->info.func_info); > + t = btf__type_by_id(btf, func_info[0].type_id); > + if (!t) { > + pr_debug("btf %d doesn't have type %d\n", > + info_linear->info.btf_id, func_info[0].type_id); > + goto out; > + } > + name = strdup(btf__name_by_offset(btf, t->name_off)); > +out: > + free(info_linear); > + return name; > +} > + > +static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id) > +{ > + struct bpf_prog_profiler_bpf *skel; > + struct bpf_counter *counter; > + struct bpf_program *prog; > + char *prog_name; > + int prog_fd; > + int err; > + > + prog_fd = bpf_prog_get_fd_by_id(prog_id); > + if (prog_fd < 0) { > + pr_err("Failed to open fd for bpf prog %u\n", prog_id); > + return -1; > + } > + counter = bpf_counter_alloc(); > + if (!counter) { > + close(prog_fd); > + return -1; > + } > + > + skel 
= bpf_prog_profiler_bpf__open(); > + if (!skel) { > + pr_err("Failed to open bpf skeleton\n"); > + goto err_out; > + } > + skel->rodata->num_cpu = evsel__nr_cpus(evsel); > + > + bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel)); > + bpf_map__resize(skel->maps.fentry_readings, 1); > + bpf_map__resize(skel->maps.accum_readings, 1); > + > + prog_name = bpf_target_prog_name(prog_fd); > + if (!prog_name) { > + pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id); > + goto err_out; > + } > + > + bpf_object__for_each_program(prog, skel->obj) { > + err = bpf_program__set_attach_target(prog, prog_fd, prog_name); > + if (err) { > + pr_err("bpf_program__set_attach_target failed.\n" > + "Does bpf prog %u have BTF?\n", prog_id); > + goto err_out; > + } > + } > + set_max_rlimit(); > + err = bpf_prog_profiler_bpf__load(skel); > + if (err) { > + pr_err("bpf_prog_profiler_bpf__load failed\n"); > + goto err_out; > + } > + > + counter->skel = skel; > + list_add(&counter->list, &evsel->bpf_counter_list); > + close(prog_fd); > + return 0; > +err_out: > + free(counter); > + close(prog_fd); I don't know how the 'skel' part is managed, is it safe to leave? 
> + return -1; > +} > + > +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target) > +{ > + char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p; > + u32 prog_id; > + int ret; > + > + bpf_str_ = bpf_str = strdup(target->bpf_str); > + if (!bpf_str) > + return -1; > + > + while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) { > + prog_id = strtoul(tok, &p, 10); > + if (prog_id == 0 || prog_id == UINT_MAX || > + (*p != '\0' && *p != ',')) { > + pr_err("Failed to parse bpf prog ids %s\n", > + target->bpf_str); > + return -1; > + } > + > + ret = bpf_program_profiler_load_one(evsel, prog_id); > + if (ret) { > + bpf_program_profiler__destroy(evsel); > + free(bpf_str_); > + return -1; > + } > + bpf_str = NULL; > + } > + free(bpf_str_); > + return 0; > +} > + > +static int bpf_program_profiler__enable(struct evsel *evsel) > +{ > + struct bpf_counter *counter; > + int ret; > + > + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { > + ret = bpf_prog_profiler_bpf__attach(counter->skel); > + if (ret) { > + bpf_program_profiler__destroy(evsel); > + return ret; > + } > + } > + return 0; > +} > + > +static int bpf_program_profiler__read(struct evsel *evsel) > +{ > + int num_cpu = evsel__nr_cpus(evsel); > + struct bpf_perf_event_value values[num_cpu]; > + struct bpf_counter *counter; > + int reading_map_fd; > + __u32 key = 0; > + int err, cpu; > + > + if (list_empty(&evsel->bpf_counter_list)) > + return -EAGAIN; > + > + for (cpu = 0; cpu < num_cpu; cpu++) { > + perf_counts(evsel->counts, cpu, 0)->val = 0; > + perf_counts(evsel->counts, cpu, 0)->ena = 0; > + perf_counts(evsel->counts, cpu, 0)->run = 0; > + } Hmm.. not sure it's correct to reset counters here. 
> + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { > + struct bpf_prog_profiler_bpf *skel = counter->skel; > + > + reading_map_fd = bpf_map__fd(skel->maps.accum_readings); > + > + err = bpf_map_lookup_elem(reading_map_fd, &key, values); > + if (err) { > + fprintf(stderr, "failed to read value\n"); > + return err; > + } > + > + for (cpu = 0; cpu < num_cpu; cpu++) { > + perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter; > + perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled; > + perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running; > + } > + } So this just aggregates all the counters in BPF programs, right? > + return 0; > +} > + > +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu, > + int fd) > +{ > + struct bpf_prog_profiler_bpf *skel; > + struct bpf_counter *counter; > + int ret; > + > + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { > + skel = counter->skel; > + ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events), > + &cpu, &fd, BPF_ANY); > + if (ret) > + return ret; > + } > + return 0; > +} > + > +struct bpf_counter_ops bpf_program_profiler_ops = { > + .load = bpf_program_profiler__load, > + .enable = bpf_program_profiler__enable, > + .read = bpf_program_profiler__read, > + .destroy = bpf_program_profiler__destroy, > + .install_pe = bpf_program_profiler__install_pe, What is 'pe'? Btw, do you think other kinds of bpf programs are added later? It seems 'perf stat -b' is somewhat coupled with this profiler ops. Will it be possible to run other ops in a same evsel? 
> +}; > + > +int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd) > +{ > + if (list_empty(&evsel->bpf_counter_list)) > + return 0; > + return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd); > +} > + > +int bpf_counter__load(struct evsel *evsel, struct target *target) > +{ > + if (target__has_bpf(target)) > + evsel->bpf_counter_ops = &bpf_program_profiler_ops; > + > + if (evsel->bpf_counter_ops) > + return evsel->bpf_counter_ops->load(evsel, target); > + return 0; > +} > + > +int bpf_counter__enable(struct evsel *evsel) > +{ > + if (list_empty(&evsel->bpf_counter_list)) > + return 0; > + return evsel->bpf_counter_ops->enable(evsel); > +} > + > +int bpf_counter__read(struct evsel *evsel) > +{ > + if (list_empty(&evsel->bpf_counter_list)) > + return -EAGAIN; > + return evsel->bpf_counter_ops->read(evsel); > +} > + > +void bpf_counter__destroy(struct evsel *evsel) > +{ > + if (list_empty(&evsel->bpf_counter_list)) > + return; > + evsel->bpf_counter_ops->destroy(evsel); > + evsel->bpf_counter_ops = NULL; > +} [SNIP] > diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h > index 6ef01a83b24e9..f132c6c2eef81 100644 > --- a/tools/perf/util/target.h > +++ b/tools/perf/util/target.h > @@ -10,6 +10,7 @@ struct target { > const char *tid; > const char *cpu_list; > const char *uid_str; > + const char *bpf_str; > uid_t uid; > bool system_wide; > bool uses_mmap; > @@ -36,6 +37,10 @@ enum target_errno { > TARGET_ERRNO__PID_OVERRIDE_SYSTEM, > TARGET_ERRNO__UID_OVERRIDE_SYSTEM, > TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD, > + TARGET_ERRNO__BPF_OVERRIDE_CPU, > + TARGET_ERRNO__BPF_OVERRIDE_PID, > + TARGET_ERRNO__BPF_OVERRIDE_UID, > + TARGET_ERRNO__BPF_OVERRIDE_THREAD, > > /* for target__parse_uid() */ > TARGET_ERRNO__INVALID_UID, > @@ -59,6 +64,11 @@ static inline bool target__has_cpu(struct target *target) > return target->system_wide || target->cpu_list; > } > > +static inline bool target__has_bpf(struct target *target) > +{ > + return target->bpf_str; > +} > 
+ > static inline bool target__none(struct target *target) > { > return !target__has_task(target) && !target__has_cpu(target); Shouldn't it have && !target__has_bpf() too? Thanks, Namhyung ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 7:22 ` Namhyung Kim @ 2020-12-29 17:46 ` Song Liu 2020-12-29 17:59 ` Song Liu 0 siblings, 1 reply; 25+ messages in thread From: Song Liu @ 2020-12-29 17:46 UTC (permalink / raw) To: Namhyung Kim Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa, Kernel Team > On Dec 28, 2020, at 11:22 PM, Namhyung Kim <namhyung@kernel.org> wrote: > > On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote: >> >> Introduce perf-stat -b option, which counts events for BPF programs, like: >> >> [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000 >> 1.487903822 115,200 ref-cycles >> 1.487903822 86,012 cycles >> 2.489147029 80,560 ref-cycles >> 2.489147029 73,784 cycles >> 3.490341825 60,720 ref-cycles >> 3.490341825 37,797 cycles >> 4.491540887 37,120 ref-cycles >> 4.491540887 31,963 cycles >> >> The example above counts cycles and ref-cycles of BPF program of id 254. >> This is similar to bpftool-prog-profile command, but more flexible. >> >> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF >> programs (monitor-progs) to the target BPF program (target-prog). The >> monitor-progs read perf_event before and after the target-prog, and >> aggregate the difference in a BPF map. Then the user space reads data >> from these maps. >> >> A new struct bpf_counter is introduced to provide common interface that >> uses BPF programs/maps to count perf events. 
>> >> Signed-off-by: Song Liu <songliubraving@fb.com> >> --- >> tools/perf/Makefile.perf | 2 +- >> tools/perf/builtin-stat.c | 77 ++++- >> tools/perf/util/Build | 1 + >> tools/perf/util/bpf_counter.c | 296 ++++++++++++++++++ >> tools/perf/util/bpf_counter.h | 72 +++++ >> .../util/bpf_skel/bpf_prog_profiler.bpf.c | 93 ++++++ >> tools/perf/util/evsel.c | 9 + >> tools/perf/util/evsel.h | 6 + >> tools/perf/util/stat-display.c | 4 +- >> tools/perf/util/stat.c | 2 +- >> tools/perf/util/target.c | 34 +- >> tools/perf/util/target.h | 10 + >> 12 files changed, 588 insertions(+), 18 deletions(-) >> create mode 100644 tools/perf/util/bpf_counter.c >> create mode 100644 tools/perf/util/bpf_counter.h >> create mode 100644 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c >> >> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf >> index d182a2dbb9bbd..8c4e039c3b813 100644 >> --- a/tools/perf/Makefile.perf >> +++ b/tools/perf/Makefile.perf >> @@ -1015,7 +1015,7 @@ python-clean: >> >> SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) >> SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) >> -SKELETONS := >> +SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h >> >> ifdef BUILD_BPF_SKEL >> BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool >> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c >> index 8cc24967bc273..09bffb3fbcdd4 100644 >> --- a/tools/perf/builtin-stat.c >> +++ b/tools/perf/builtin-stat.c >> @@ -67,6 +67,7 @@ >> #include "util/top.h" >> #include "util/affinity.h" >> #include "util/pfm.h" >> +#include "util/bpf_counter.h" >> #include "asm/bug.h" >> >> #include <linux/time64.h> >> @@ -409,12 +410,31 @@ static int read_affinity_counters(struct timespec *rs) >> return 0; >> } >> >> +static int read_bpf_map_counters(void) >> +{ >> + struct evsel *counter; >> + int err; >> + >> + evlist__for_each_entry(evsel_list, counter) { >> + err = bpf_counter__read(counter); >> + if (err) >> + return err; >> + } >> + return 0; >> +} >> + >> static void read_counters(struct 
timespec *rs) >> { >> struct evsel *counter; >> + int err; >> >> - if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0)) >> - return; >> + if (!stat_config.stop_read_counter) { >> + err = read_bpf_map_counters(); >> + if (err == -EAGAIN) >> + err = read_affinity_counters(rs); > > Instead of checking the error code, can we do something like > > if (target__has_bpf(target)) > read_bpf_map_counters(); > > ? Yeah, we can do that. > >> + if (err < 0) >> + return; >> + } >> >> evlist__for_each_entry(evsel_list, counter) { >> if (counter->err) >> @@ -496,11 +516,20 @@ static bool handle_interval(unsigned int interval, int *times) >> return false; >> } >> >> -static void enable_counters(void) >> +static int enable_counters(void) >> { >> + struct evsel *evsel; >> + int err; >> + >> + evlist__for_each_entry(evsel_list, evsel) { >> + err = bpf_counter__enable(evsel); >> + if (err) >> + return err; > > Ditto. For this one, we still need to check the return value, as bpf_counter__enable() may fail. We can add a global check to skip the loop. > >> + } >> + [...] >> + >> +#include "bpf_skel/bpf_prog_profiler.skel.h" >> + >> +static inline void *u64_to_ptr(__u64 ptr) >> +{ >> + return (void *)(unsigned long)ptr; >> +} >> + >> +static void set_max_rlimit(void) >> +{ >> + struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY }; >> + >> + setrlimit(RLIMIT_MEMLOCK, &rinf); >> +} > > This looks scary.. I guess this is OK as we requires root rights for -b? > [...] 
>> + if (!counter) { >> + close(prog_fd); >> + return -1; >> + } >> + >> + skel = bpf_prog_profiler_bpf__open(); >> + if (!skel) { >> + pr_err("Failed to open bpf skeleton\n"); >> + goto err_out; >> + } >> + skel->rodata->num_cpu = evsel__nr_cpus(evsel); >> + >> + bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel)); >> + bpf_map__resize(skel->maps.fentry_readings, 1); >> + bpf_map__resize(skel->maps.accum_readings, 1); >> + >> + prog_name = bpf_target_prog_name(prog_fd); >> + if (!prog_name) { >> + pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id); >> + goto err_out; >> + } >> + >> + bpf_object__for_each_program(prog, skel->obj) { >> + err = bpf_program__set_attach_target(prog, prog_fd, prog_name); >> + if (err) { >> + pr_err("bpf_program__set_attach_target failed.\n" >> + "Does bpf prog %u have BTF?\n", prog_id); >> + goto err_out; >> + } >> + } >> + set_max_rlimit(); >> + err = bpf_prog_profiler_bpf__load(skel); >> + if (err) { >> + pr_err("bpf_prog_profiler_bpf__load failed\n"); >> + goto err_out; >> + } >> + >> + counter->skel = skel; >> + list_add(&counter->list, &evsel->bpf_counter_list); >> + close(prog_fd); >> + return 0; >> +err_out: >> + free(counter); >> + close(prog_fd); > > I don't know how the 'skel' part is managed, is it safe to leave? Good catch! We do have bpf_program_profiler__destroy() in bpf_program_profiler__load(). But I should have counter->skel = skel in err path. Will fix. 
> > >> + return -1; >> +} >> + >> +static int bpf_program_profiler__load(struct evsel *evsel, struct target *target) >> +{ >> + char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p; >> + u32 prog_id; >> + int ret; >> + >> + bpf_str_ = bpf_str = strdup(target->bpf_str); >> + if (!bpf_str) >> + return -1; >> + >> + while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) { >> + prog_id = strtoul(tok, &p, 10); >> + if (prog_id == 0 || prog_id == UINT_MAX || >> + (*p != '\0' && *p != ',')) { >> + pr_err("Failed to parse bpf prog ids %s\n", >> + target->bpf_str); >> + return -1; >> + } >> + >> + ret = bpf_program_profiler_load_one(evsel, prog_id); >> + if (ret) { >> + bpf_program_profiler__destroy(evsel); >> + free(bpf_str_); >> + return -1; >> + } >> + bpf_str = NULL; >> + } >> + free(bpf_str_); >> + return 0; >> +} >> + >> +static int bpf_program_profiler__enable(struct evsel *evsel) >> +{ >> + struct bpf_counter *counter; >> + int ret; >> + >> + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { >> + ret = bpf_prog_profiler_bpf__attach(counter->skel); >> + if (ret) { >> + bpf_program_profiler__destroy(evsel); >> + return ret; >> + } >> + } >> + return 0; >> +} >> + >> +static int bpf_program_profiler__read(struct evsel *evsel) >> +{ >> + int num_cpu = evsel__nr_cpus(evsel); >> + struct bpf_perf_event_value values[num_cpu]; >> + struct bpf_counter *counter; >> + int reading_map_fd; >> + __u32 key = 0; >> + int err, cpu; >> + >> + if (list_empty(&evsel->bpf_counter_list)) >> + return -EAGAIN; >> + >> + for (cpu = 0; cpu < num_cpu; cpu++) { >> + perf_counts(evsel->counts, cpu, 0)->val = 0; >> + perf_counts(evsel->counts, cpu, 0)->ena = 0; >> + perf_counts(evsel->counts, cpu, 0)->run = 0; >> + } > > Hmm.. not sure it's correct to reset counters here. Yeah, we need to reset the user space values here. Otherwise, the later aggregation would give wrong number. 
> > >> + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { >> + struct bpf_prog_profiler_bpf *skel = counter->skel; >> + >> + reading_map_fd = bpf_map__fd(skel->maps.accum_readings); >> + >> + err = bpf_map_lookup_elem(reading_map_fd, &key, values); >> + if (err) { >> + fprintf(stderr, "failed to read value\n"); >> + return err; >> + } >> + >> + for (cpu = 0; cpu < num_cpu; cpu++) { >> + perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter; >> + perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled; >> + perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running; >> + } >> + } > > So this just aggregates all the counters in BPF programs, right? Yes. > > >> + return 0; >> +} >> + >> +static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu, >> + int fd) >> +{ >> + struct bpf_prog_profiler_bpf *skel; >> + struct bpf_counter *counter; >> + int ret; >> + >> + list_for_each_entry(counter, &evsel->bpf_counter_list, list) { >> + skel = counter->skel; >> + ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events), >> + &cpu, &fd, BPF_ANY); >> + if (ret) >> + return ret; >> + } >> + return 0; >> +} >> + >> +struct bpf_counter_ops bpf_program_profiler_ops = { >> + .load = bpf_program_profiler__load, >> + .enable = bpf_program_profiler__enable, >> + .read = bpf_program_profiler__read, >> + .destroy = bpf_program_profiler__destroy, >> + .install_pe = bpf_program_profiler__install_pe, > > What is 'pe'? pe here means perf_event. > > Btw, do you think other kinds of bpf programs are added later? > It seems 'perf stat -b' is somewhat coupled with this profiler ops. > Will it be possible to run other ops in a same evsel? It will be possible to add other ops. I have some idea of using BPF programs in other perf scenarios. To clarify, I think each instance of evsel should only have one ops attached. And each session of perf-stat should only use one kind of ops. > >> [...] 
>> +static inline bool target__has_bpf(struct target *target) >> +{ >> + return target->bpf_str; >> +} >> + >> static inline bool target__none(struct target *target) >> { >> return !target__has_task(target) && !target__has_cpu(target); > > Shouldn't it have && !target__has_bpf() too? Will fix. Thanks, Song ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs 2020-12-29 17:46 ` Song Liu @ 2020-12-29 17:59 ` Song Liu 0 siblings, 0 replies; 25+ messages in thread From: Song Liu @ 2020-12-29 17:59 UTC (permalink / raw) To: Namhyung Kim Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa, Kernel Team > On Dec 29, 2020, at 9:46 AM, Song Liu <songliubraving@fb.com> wrote: > >>> [...] > > [...] > >>> +static inline bool target__has_bpf(struct target *target) >>> +{ >>> + return target->bpf_str; >>> +} >>> + >>> static inline bool target__none(struct target *target) >>> { >>> return !target__has_task(target) && !target__has_cpu(target); >> >> Shouldn't it have && !target__has_bpf() too? Actually, we don't need target__has_bpf() here. As -b requires setting up counters system wide (in setup_system_wide()). If we add target__has_bpf() here, we will need something like below, which I think it not necessary. diff --git i/tools/perf/builtin-stat.c w/tools/perf/builtin-stat.c index 09bffb3fbcdd4..853cec040191b 100644 --- i/tools/perf/builtin-stat.c +++ w/tools/perf/builtin-stat.c @@ -2081,7 +2081,7 @@ static void setup_system_wide(int forks) * - there is workload specified but all requested * events are system wide events */ - if (!target__none(&target)) + if (!target__none(&target) && !target__has_bpf(&target)) return; if (!forks) diff --git i/tools/perf/util/target.h w/tools/perf/util/target.h index f132c6c2eef81..295fb11f4daff 100644 --- i/tools/perf/util/target.h +++ w/tools/perf/util/target.h @@ -71,7 +71,8 @@ static inline bool target__has_bpf(struct target *target) static inline bool target__none(struct target *target) { - return !target__has_task(target) && !target__has_cpu(target); + return !target__has_task(target) && !target__has_cpu(target) && + !target__has_bpf(target); } static inline bool target__has_per_thread(struct target *target) ^ permalink raw reply related [flat|nested] 25+ 
messages in thread
* [PATCH v6 4/4] perf-stat: add documentation for -b option 2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu ` (2 preceding siblings ...) 2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu @ 2020-12-28 17:40 ` Song Liu 2020-12-29 7:24 ` Namhyung Kim 3 siblings, 1 reply; 25+ messages in thread From: Song Liu @ 2020-12-28 17:40 UTC (permalink / raw) To: linux-kernel Cc: acme, peterz, mingo, alexander.shishkin, namhyung, mark.rutland, jolsa, kernel-team, Song Liu Add documentation to perf-stat -b option, which stats event for BPF programs. Signed-off-by: Song Liu <songliubraving@fb.com> --- tools/perf/Documentation/perf-stat.txt | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 5d4a673d7621a..15b9a646e853d 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -75,6 +75,20 @@ report:: --tid=<tid>:: stat events on existing thread id (comma separated list) +-b:: +--bpf-prog:: + stat events on existing bpf program id (comma separated list), + requiring root righs. For example: + + # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000 + + Performance counter stats for 'BPF program(s) 17247': + + 85,967 cycles + 28,982 instructions # 0.34 insn per cycle + + 1.102235068 seconds time elapsed + ifdef::HAVE_LIBPFM[] --pfm-events events:: Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net) -- 2.24.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v6 4/4] perf-stat: add documentation for -b option 2020-12-28 17:40 ` [PATCH v6 4/4] perf-stat: add documentation for -b option Song Liu @ 2020-12-29 7:24 ` Namhyung Kim 2020-12-29 16:59 ` Song Liu 0 siblings, 1 reply; 25+ messages in thread From: Namhyung Kim @ 2020-12-29 7:24 UTC (permalink / raw) To: Song Liu Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa, kernel-team On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote: > > Add documentation to perf-stat -b option, which stats event for BPF > programs. > > Signed-off-by: Song Liu <songliubraving@fb.com> > --- > tools/perf/Documentation/perf-stat.txt | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt > index 5d4a673d7621a..15b9a646e853d 100644 > --- a/tools/perf/Documentation/perf-stat.txt > +++ b/tools/perf/Documentation/perf-stat.txt > @@ -75,6 +75,20 @@ report:: > --tid=<tid>:: > stat events on existing thread id (comma separated list) > > +-b:: > +--bpf-prog:: > + stat events on existing bpf program id (comma separated list), > + requiring root righs. For example: Typo: rights It'd be nice if it can show how we can get the id. Thanks, Namhyung > + > + # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000 > + > + Performance counter stats for 'BPF program(s) 17247': > + > + 85,967 cycles > + 28,982 instructions # 0.34 insn per cycle > + > + 1.102235068 seconds time elapsed > + > ifdef::HAVE_LIBPFM[] > --pfm-events events:: > Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net) > -- > 2.24.1 > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v6 4/4] perf-stat: add documentation for -b option
  2020-12-29  7:24 ` Namhyung Kim
@ 2020-12-29 16:59 ` Song Liu
  0 siblings, 0 replies; 25+ messages in thread
From: Song Liu @ 2020-12-29 16:59 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Ingo Molnar, Alexander Shishkin, Mark Rutland, Jiri Olsa,
	Kernel Team

> On Dec 28, 2020, at 11:24 PM, Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Tue, Dec 29, 2020 at 2:41 AM Song Liu <songliubraving@fb.com> wrote:
>>
>> Add documentation to perf-stat -b option, which stats event for BPF
>> programs.
>>
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>>  tools/perf/Documentation/perf-stat.txt | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>> index 5d4a673d7621a..15b9a646e853d 100644
>> --- a/tools/perf/Documentation/perf-stat.txt
>> +++ b/tools/perf/Documentation/perf-stat.txt
>> @@ -75,6 +75,20 @@ report::
>>  --tid=<tid>::
>>  	stat events on existing thread id (comma separated list)
>>
>> +-b::
>> +--bpf-prog::
>> +	stat events on existing bpf program id (comma separated list),
>> +	requiring root righs. For example:
>
> Typo: rights
>
> It'd be nice if it can show how we can get the id.

Thanks for the review! I will fix these in the next version.

Song
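[As an aside, the "how we can get the id" that Namhyung asks for typically means listing loaded BPF programs with `bpftool prog list` (an alias of `bpftool prog show`) as root, and reading the leading number of each line. A minimal sketch of collecting those ids into the comma-separated list that `perf stat -b` accepts is shown below; the sample output and the program ids 17247/17248 are made up for illustration, and the real `bpftool` output has more fields than shown here:]

```shell
# Hypothetical (abridged) output of `bpftool prog list`; on a real
# system you would capture it with: sample=$(bpftool prog list)
sample='17247: kprobe  name func_begin  tag 57cd311f2e27366b  gpl
17248: kprobe  name func_end  tag 57cd311f2e27366b  gpl'

# The program id is the number before the first colon on each line;
# join the ids with commas, as expected by perf stat -b / --bpf-prog.
ids=$(printf '%s\n' "$sample" | awk -F: '/^[0-9]+:/ {printf "%s%s", sep, $1; sep=","}')
echo "$ids"

# Then, as root, count events for all of those programs at once:
#   perf stat -e cycles,instructions -b "$ids" --timeout 1000
```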
Thread overview: 25+ messages

2020-12-28 17:40 [PATCH v6 0/4] Introduce perf-stat -b for BPF programs Song Liu
2020-12-28 17:40 ` [PATCH v6 1/4] bpftool: add Makefile target bootstrap Song Liu
2020-12-28 17:40 ` [PATCH v6 2/4] perf: support build BPF skeletons with perf Song Liu
2020-12-29  7:01   ` Namhyung Kim
2020-12-29 11:48     ` Arnaldo Carvalho de Melo
2020-12-29 17:14       ` Song Liu
2020-12-29 18:16         ` Arnaldo Carvalho de Melo
2020-12-28 17:40 ` [PATCH v6 3/4] perf-stat: enable counting events for BPF programs Song Liu
2020-12-28 20:11   ` Arnaldo Carvalho de Melo
2020-12-28 23:43     ` Song Liu
2020-12-29  5:53       ` Song Liu
2020-12-29 15:15         ` Arnaldo Carvalho de Melo
2020-12-29 18:42           ` Song Liu
2020-12-29 18:48             ` Arnaldo Carvalho de Melo
2020-12-29 19:11               ` Song Liu
2020-12-29 19:18                 ` Arnaldo Carvalho de Melo
2020-12-29 19:23                   ` Arnaldo Carvalho de Melo
2020-12-29 19:32                     ` Arnaldo Carvalho de Melo
2020-12-29 21:40                       ` Song Liu
2020-12-29  7:22   ` Namhyung Kim
2020-12-29 17:46     ` Song Liu
2020-12-29 17:59       ` Song Liu
2020-12-28 17:40 ` [PATCH v6 4/4] perf-stat: add documentation for -b option Song Liu
2020-12-29  7:24   ` Namhyung Kim
2020-12-29 16:59     ` Song Liu