linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
@ 2021-03-16 21:18 Song Liu
  2021-03-16 21:18 ` [PATCH v2 1/3] perf-stat: introduce bperf, " Song Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Song Liu @ 2021-03-16 21:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: kernel-team, acme, acme, namhyung, jolsa, Song Liu

perf uses performance monitoring counters (PMCs) to monitor system
performance. The PMCs are limited hardware resources. For example,
Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.

Modern data center systems use these PMCs in many different ways:
system level monitoring, (maybe nested) container level monitoring, per
process monitoring, profiling (in sample mode), etc. In some cases,
there are more active perf_events than available hardware PMCs. To allow
all perf_events to have a chance to run, it is necessary to do expensive
time multiplexing of events.

On the other hand, many monitoring tools count the common metrics (cycles,
instructions). It is a waste to have multiple tools create multiple
perf_events of "cycles" and occupy multiple PMCs.

bperf tries to reduce such waste by allowing multiple perf_events of
"cycles" or "instructions" (at different scopes) to share PMCs. Instead
of having each perf-stat session read its own perf_events, bperf uses
BPF programs to read the perf_events and aggregate the readings into BPF
maps. The perf-stat session(s) then read the values from these BPF maps.

Changes v1 => v2:
  1. Add documentation.
  2. Add a shell test.
  3. Rename options, default path of the attr-map, and some variables.
  4. Add a separate patch that moves clock_gettime() in __run_perf_stat()
     to after enable_counters().
  5. Make perf_cpu_map for all cpus a global variable.
  6. Use sysfs__mountpoint() for default attr-map path.
  7. Use cpu__max_cpu() instead of libbpf_num_possible_cpus().
  8. Add flag "enabled" to the follower program. Then move follower attach
     to bperf__load() and simplify bperf__enable().

Song Liu (3):
  perf-stat: introduce bperf, share hardware PMCs with BPF
  perf-stat: measure t0 and ref_time after enable_counters()
  perf-test: add a test for perf-stat --bpf-counters option

 tools/perf/Documentation/perf-stat.txt        |  11 +
 tools/perf/Makefile.perf                      |   1 +
 tools/perf/builtin-stat.c                     |  20 +-
 tools/perf/tests/shell/stat_bpf_counters.sh   |  34 ++
 tools/perf/util/bpf_counter.c                 | 519 +++++++++++++++++-
 tools/perf/util/bpf_skel/bperf.h              |  14 +
 tools/perf/util/bpf_skel/bperf_follower.bpf.c |  69 +++
 tools/perf/util/bpf_skel/bperf_leader.bpf.c   |  46 ++
 tools/perf/util/bpf_skel/bperf_u.h            |  14 +
 tools/perf/util/evsel.h                       |  20 +-
 tools/perf/util/target.h                      |   4 +-
 11 files changed, 742 insertions(+), 10 deletions(-)
 create mode 100755 tools/perf/tests/shell/stat_bpf_counters.sh
 create mode 100644 tools/perf/util/bpf_skel/bperf.h
 create mode 100644 tools/perf/util/bpf_skel/bperf_follower.bpf.c
 create mode 100644 tools/perf/util/bpf_skel/bperf_leader.bpf.c
 create mode 100644 tools/perf/util/bpf_skel/bperf_u.h

--
2.30.2


* [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-16 21:18 [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Song Liu
@ 2021-03-16 21:18 ` Song Liu
  2021-03-18  5:54   ` Namhyung Kim
  2021-03-18 21:15   ` Jiri Olsa
  2021-03-16 21:18 ` [PATCH v2 2/3] perf-stat: measure t0 and ref_time after enable_counters() Song Liu
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 33+ messages in thread
From: Song Liu @ 2021-03-16 21:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: kernel-team, acme, acme, namhyung, jolsa, Song Liu

perf uses performance monitoring counters (PMCs) to monitor system
performance. The PMCs are limited hardware resources. For example,
Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.

Modern data center systems use these PMCs in many different ways:
system level monitoring, (maybe nested) container level monitoring, per
process monitoring, profiling (in sample mode), etc. In some cases,
there are more active perf_events than available hardware PMCs. To allow
all perf_events to have a chance to run, it is necessary to do expensive
time multiplexing of events.

On the other hand, many monitoring tools count the common metrics (cycles,
instructions). It is a waste to have multiple tools create multiple
perf_events of "cycles" and occupy multiple PMCs.

bperf tries to reduce such waste by allowing multiple perf_events of
"cycles" or "instructions" (at different scopes) to share PMCs. Instead
of having each perf-stat session read its own perf_events, bperf uses
BPF programs to read the perf_events and aggregate the readings into BPF
maps. The perf-stat session(s) then read the values from these BPF maps.

Please refer to the comment before the definition of bperf_ops for the
description of bperf architecture.

bperf is off by default. To enable it, pass the --bpf-counters option to
perf-stat. bperf uses a BPF hashmap to share information about the BPF
programs and maps it uses. This map is pinned to bpffs. The default
path is /sys/fs/bpf/perf_attr_map. The user can change the path with
the --bpf-attr-map option.

Signed-off-by: Song Liu <songliubraving@fb.com>

---
Known limitations:
1. Per-cgroup events are not supported;
2. Monitoring of BPF programs (perf-stat -b) is not supported;
3. Event groups are not supported;
4. Event inheritance across fork() is not supported.

The following commands have been tested:

   perf stat --bpf-counters -e cycles,ref-cycles -a
   perf stat --bpf-counters -e cycles,instructions -C 1,3,4
   perf stat --bpf-counters -e cycles -p 123
   perf stat --bpf-counters -e cycles -t 100,101
   perf stat --bpf-counters -e cycles,ref-cycles -- stressapptest ...
---
 tools/perf/Documentation/perf-stat.txt        |  11 +
 tools/perf/Makefile.perf                      |   1 +
 tools/perf/builtin-stat.c                     |  10 +
 tools/perf/util/bpf_counter.c                 | 519 +++++++++++++++++-
 tools/perf/util/bpf_skel/bperf.h              |  14 +
 tools/perf/util/bpf_skel/bperf_follower.bpf.c |  69 +++
 tools/perf/util/bpf_skel/bperf_leader.bpf.c   |  46 ++
 tools/perf/util/bpf_skel/bperf_u.h            |  14 +
 tools/perf/util/evsel.h                       |  20 +-
 tools/perf/util/target.h                      |   4 +-
 10 files changed, 701 insertions(+), 7 deletions(-)
 create mode 100644 tools/perf/util/bpf_skel/bperf.h
 create mode 100644 tools/perf/util/bpf_skel/bperf_follower.bpf.c
 create mode 100644 tools/perf/util/bpf_skel/bperf_leader.bpf.c
 create mode 100644 tools/perf/util/bpf_skel/bperf_u.h

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 08a1714494f87..d2e7656b5ef81 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -93,6 +93,17 @@ report::
 
         1.102235068 seconds time elapsed
 
+--bpf-counters::
+	Use BPF programs to aggregate readings from perf_events.  This
+	allows multiple perf-stat sessions that are counting the same metric (cycles,
+	instructions, etc.) to share hardware counters.
+
+--bpf-attr-map::
+	With option "--bpf-counters", different perf-stat sessions share
+	information about shared BPF programs and maps via a pinned hashmap.
+	Use "--bpf-attr-map" to specify the path of this pinned hashmap.
+	The default path is /sys/fs/bpf/perf_attr_map.
+
 ifdef::HAVE_LIBPFM[]
 --pfm-events events::
 Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index f6e609673de2b..ca9aa08e85a1f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1007,6 +1007,7 @@ python-clean:
 SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
 SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
 SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
+SKELETONS += $(SKEL_OUT)/bperf_leader.skel.h $(SKEL_OUT)/bperf_follower.skel.h
 
 ifdef BUILD_BPF_SKEL
 BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2e2e4a8345ea2..92696373da994 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -792,6 +792,12 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	}
 
 	evlist__for_each_cpu (evsel_list, i, cpu) {
+		/*
+		 * bperf calls evsel__open_per_cpu() in bperf__load(), so
+		 * no need to call it again here.
+		 */
+		if (target.use_bpf)
+			break;
 		affinity__set(&affinity, cpu);
 
 		evlist__for_each_entry(evsel_list, counter) {
@@ -1146,6 +1152,10 @@ static struct option stat_options[] = {
 #ifdef HAVE_BPF_SKEL
 	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
 		   "stat events on existing bpf program id"),
+	OPT_BOOLEAN(0, "bpf-counters", &target.use_bpf,
+		    "use bpf program to count events"),
+	OPT_STRING(0, "bpf-attr-map", &target.attr_map, "attr-map-path",
+		   "path to perf_event_attr map"),
 #endif
 	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
 		    "system-wide collection from all CPUs"),
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index 04f89120b3232..81d1df3c4ec0e 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -5,6 +5,7 @@
 #include <assert.h>
 #include <limits.h>
 #include <unistd.h>
+#include <sys/file.h>
 #include <sys/time.h>
 #include <sys/resource.h>
 #include <linux/err.h>
@@ -12,14 +13,45 @@
 #include <bpf/bpf.h>
 #include <bpf/btf.h>
 #include <bpf/libbpf.h>
+#include <api/fs/fs.h>
 
 #include "bpf_counter.h"
 #include "counts.h"
 #include "debug.h"
 #include "evsel.h"
+#include "evlist.h"
 #include "target.h"
+#include "cpumap.h"
+#include "thread_map.h"
 
 #include "bpf_skel/bpf_prog_profiler.skel.h"
+#include "bpf_skel/bperf_u.h"
+#include "bpf_skel/bperf_leader.skel.h"
+#include "bpf_skel/bperf_follower.skel.h"
+
+/*
+ * bperf uses a hashmap, the attr_map, to track all the leader programs.
+ * The hashmap is pinned in bpffs. flock() on this file is used to ensure
+ * no concurrent access to the attr_map.  The key of attr_map is struct
+ * perf_event_attr, and the value is struct perf_event_attr_map_entry.
+ *
+ * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
+ * leader prog, and the diff_map. Each perf-stat session holds a reference
+ * to the bpf_link to make sure the leader prog is attached to sched_switch
+ * tracepoint.
+ *
+ * Since the hashmap only contains IDs of the bpf_link and diff_map, it
+ * does not hold any references to the leader program. Once all perf-stat
+ * sessions of these events exit, the leader prog, its maps, and the
+ * perf_events will be freed.
+ */
+struct perf_event_attr_map_entry {
+	__u32 link_id;
+	__u32 diff_map_id;
+};
+
+#define DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"
+#define ATTR_MAP_SIZE 16
 
 static inline void *u64_to_ptr(__u64 ptr)
 {
@@ -274,17 +306,494 @@ struct bpf_counter_ops bpf_program_profiler_ops = {
 	.install_pe = bpf_program_profiler__install_pe,
 };
 
+static __u32 bpf_link_get_id(int fd)
+{
+	struct bpf_link_info link_info = {0};
+	__u32 link_info_len = sizeof(link_info);
+
+	bpf_obj_get_info_by_fd(fd, &link_info, &link_info_len);
+	return link_info.id;
+}
+
+static __u32 bpf_link_get_prog_id(int fd)
+{
+	struct bpf_link_info link_info = {0};
+	__u32 link_info_len = sizeof(link_info);
+
+	bpf_obj_get_info_by_fd(fd, &link_info, &link_info_len);
+	return link_info.prog_id;
+}
+
+static __u32 bpf_map_get_id(int fd)
+{
+	struct bpf_map_info map_info = {0};
+	__u32 map_info_len = sizeof(map_info);
+
+	bpf_obj_get_info_by_fd(fd, &map_info, &map_info_len);
+	return map_info.id;
+}
+
+static int bperf_lock_attr_map(struct target *target)
+{
+	char path[PATH_MAX];
+	int map_fd, err;
+
+	if (target->attr_map) {
+		scnprintf(path, PATH_MAX, "%s", target->attr_map);
+	} else {
+		scnprintf(path, PATH_MAX, "%s/%s", sysfs__mountpoint(),
+			  DEFAULT_ATTR_MAP_PATH);
+	}
+
+	if (access(path, F_OK)) {
+		map_fd = bpf_create_map(BPF_MAP_TYPE_HASH,
+					sizeof(struct perf_event_attr),
+					sizeof(struct perf_event_attr_map_entry),
+					ATTR_MAP_SIZE, 0);
+		if (map_fd < 0)
+			return -1;
+
+		err = bpf_obj_pin(map_fd, path);
+		if (err) {
+			/* someone pinned the map in parallel? */
+			close(map_fd);
+			map_fd = bpf_obj_get(path);
+			if (map_fd < 0)
+				return -1;
+		}
+	} else {
+		map_fd = bpf_obj_get(path);
+		if (map_fd < 0)
+			return -1;
+	}
+
+	err = flock(map_fd, LOCK_EX);
+	if (err) {
+		close(map_fd);
+		return -1;
+	}
+	return map_fd;
+}
+
+/* trigger the leader program on a cpu */
+static int bperf_trigger_reading(int prog_fd, int cpu)
+{
+	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
+			    .ctx_in = NULL,
+			    .ctx_size_in = 0,
+			    .flags = BPF_F_TEST_RUN_ON_CPU,
+			    .cpu = cpu,
+			    .retval = 0,
+		);
+
+	return bpf_prog_test_run_opts(prog_fd, &opts);
+}
+
+static int bperf_check_target(struct evsel *evsel,
+			      struct target *target,
+			      enum bperf_filter_type *filter_type,
+			      __u32 *filter_entry_cnt)
+{
+	if (evsel->leader->core.nr_members > 1) {
+		pr_err("bpf managed perf events do not yet support groups.\n");
+		return -1;
+	}
+
+	/* determine filter type based on target */
+	if (target->system_wide) {
+		*filter_type = BPERF_FILTER_GLOBAL;
+		*filter_entry_cnt = 1;
+	} else if (target->cpu_list) {
+		*filter_type = BPERF_FILTER_CPU;
+		*filter_entry_cnt = perf_cpu_map__nr(evsel__cpus(evsel));
+	} else if (target->tid) {
+		*filter_type = BPERF_FILTER_PID;
+		*filter_entry_cnt = perf_thread_map__nr(evsel->core.threads);
+	} else if (target->pid || evsel->evlist->workload.pid != -1) {
+		*filter_type = BPERF_FILTER_TGID;
+		*filter_entry_cnt = perf_thread_map__nr(evsel->core.threads);
+	} else {
+		pr_err("bpf managed perf events do not yet support these targets.\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static	struct perf_cpu_map *all_cpu_map;
+
+static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
+				       struct perf_event_attr_map_entry *entry)
+{
+	struct bperf_leader_bpf *skel = bperf_leader_bpf__open();
+	int link_fd, diff_map_fd, err;
+	struct bpf_link *link = NULL;
+
+	if (!skel) {
+		pr_err("Failed to open leader skeleton\n");
+		return -1;
+	}
+
+	bpf_map__resize(skel->maps.events, libbpf_num_possible_cpus());
+	err = bperf_leader_bpf__load(skel);
+	if (err) {
+		pr_err("Failed to load leader skeleton\n");
+		goto out;
+	}
+
+	err = -1;
+	link = bpf_program__attach(skel->progs.on_switch);
+	if (!link) {
+		pr_err("Failed to attach leader program\n");
+		goto out;
+	}
+
+	link_fd = bpf_link__fd(link);
+	diff_map_fd = bpf_map__fd(skel->maps.diff_readings);
+	entry->link_id = bpf_link_get_id(link_fd);
+	entry->diff_map_id = bpf_map_get_id(diff_map_fd);
+	err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, entry, BPF_ANY);
+	assert(err == 0);
+
+	evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry->link_id);
+	assert(evsel->bperf_leader_link_fd >= 0);
+
+	/*
+	 * save leader_skel for install_pe, which is called within
+	 * following evsel__open_per_cpu call
+	 */
+	evsel->leader_skel = skel;
+	evsel__open_per_cpu(evsel, all_cpu_map, -1);
+
+out:
+	bperf_leader_bpf__destroy(skel);
+	bpf_link__destroy(link);
+	return err;
+}
+
+static int bperf__load(struct evsel *evsel, struct target *target)
+{
+	struct perf_event_attr_map_entry entry = {0xffffffff, 0xffffffff};
+	int attr_map_fd, diff_map_fd = -1, err;
+	enum bperf_filter_type filter_type;
+	__u32 filter_entry_cnt, i;
+
+	if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
+		return -1;
+
+	if (!all_cpu_map) {
+		all_cpu_map = perf_cpu_map__new(NULL);
+		if (!all_cpu_map)
+			return -1;
+	}
+
+	evsel->bperf_leader_prog_fd = -1;
+	evsel->bperf_leader_link_fd = -1;
+
+	/*
+	 * Step 1: hold a fd on the leader program and the bpf_link. If
+	 * the program is already gone, reload the program.
+	 * Use flock() to ensure exclusive access to the perf_event_attr
+	 * map.
+	 */
+	attr_map_fd = bperf_lock_attr_map(target);
+	if (attr_map_fd < 0) {
+		pr_err("Failed to lock perf_event_attr map\n");
+		return -1;
+	}
+
+	err = bpf_map_lookup_elem(attr_map_fd, &evsel->core.attr, &entry);
+	if (err) {
+		err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, &entry, BPF_ANY);
+		if (err)
+			goto out;
+	}
+
+	evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry.link_id);
+	if (evsel->bperf_leader_link_fd < 0 &&
+	    bperf_reload_leader_program(evsel, attr_map_fd, &entry))
+		goto out;
+
+	/*
+	 * The bpf_link holds reference to the leader program, and the
+	 * leader program holds reference to the maps. Therefore, if
+	 * link_id is valid, diff_map_id should also be valid.
+	 */
+	evsel->bperf_leader_prog_fd = bpf_prog_get_fd_by_id(
+		bpf_link_get_prog_id(evsel->bperf_leader_link_fd));
+	assert(evsel->bperf_leader_prog_fd >= 0);
+
+	diff_map_fd = bpf_map_get_fd_by_id(entry.diff_map_id);
+	assert(diff_map_fd >= 0);
+
+	/*
+	 * bperf uses BPF_PROG_TEST_RUN to get accurate readings. Check
+	 * whether the kernel supports it.
+	 */
+	err = bperf_trigger_reading(evsel->bperf_leader_prog_fd, 0);
+	if (err) {
+		pr_err("The kernel does not support test_run for raw_tp BPF programs.\n"
+		       "Therefore, --bpf-counters might show inaccurate readings\n");
+		goto out;
+	}
+
+	/* Step 2: load the follower skeleton */
+	evsel->follower_skel = bperf_follower_bpf__open();
+	if (!evsel->follower_skel) {
+		pr_err("Failed to open follower skeleton\n");
+		goto out;
+	}
+
+	/* attach fexit program to the leader program */
+	bpf_program__set_attach_target(evsel->follower_skel->progs.fexit_XXX,
+				       evsel->bperf_leader_prog_fd, "on_switch");
+
+	/* connect to leader diff_reading map */
+	bpf_map__reuse_fd(evsel->follower_skel->maps.diff_readings, diff_map_fd);
+
+	/* set up reading map */
+	bpf_map__set_max_entries(evsel->follower_skel->maps.accum_readings,
+				 filter_entry_cnt);
+	/* set up follower filter based on target */
+	bpf_map__set_max_entries(evsel->follower_skel->maps.filter,
+				 filter_entry_cnt);
+	err = bperf_follower_bpf__load(evsel->follower_skel);
+	if (err) {
+		pr_err("Failed to load follower skeleton\n");
+		bperf_follower_bpf__destroy(evsel->follower_skel);
+		evsel->follower_skel = NULL;
+		goto out;
+	}
+
+	for (i = 0; i < filter_entry_cnt; i++) {
+		int filter_map_fd;
+		__u32 key;
+
+		if (filter_type == BPERF_FILTER_PID ||
+		    filter_type == BPERF_FILTER_TGID)
+			key = evsel->core.threads->map[i].pid;
+		else if (filter_type == BPERF_FILTER_CPU)
+			key = evsel->core.cpus->map[i];
+		else
+			break;
+
+		filter_map_fd = bpf_map__fd(evsel->follower_skel->maps.filter);
+		bpf_map_update_elem(filter_map_fd, &key, &i, BPF_ANY);
+	}
+
+	evsel->follower_skel->bss->type = filter_type;
+
+	err = bperf_follower_bpf__attach(evsel->follower_skel);
+
+out:
+	if (err && evsel->bperf_leader_link_fd >= 0)
+		close(evsel->bperf_leader_link_fd);
+	if (err && evsel->bperf_leader_prog_fd >= 0)
+		close(evsel->bperf_leader_prog_fd);
+	if (diff_map_fd >= 0)
+		close(diff_map_fd);
+
+	flock(attr_map_fd, LOCK_UN);
+	close(attr_map_fd);
+
+	return err;
+}
+
+static int bperf__install_pe(struct evsel *evsel, int cpu, int fd)
+{
+	struct bperf_leader_bpf *skel = evsel->leader_skel;
+
+	return bpf_map_update_elem(bpf_map__fd(skel->maps.events),
+				   &cpu, &fd, BPF_ANY);
+}
+
+/*
+ * Trigger the leader prog on each cpu, so that the accum_readings map gets
+ * the latest readings.
+ */
+static int bperf_sync_counters(struct evsel *evsel)
+{
+	int num_cpu, i, cpu;
+
+	num_cpu = all_cpu_map->nr;
+	for (i = 0; i < num_cpu; i++) {
+		cpu = all_cpu_map->map[i];
+		bperf_trigger_reading(evsel->bperf_leader_prog_fd, cpu);
+	}
+	return 0;
+}
+
+static int bperf__enable(struct evsel *evsel)
+{
+	evsel->follower_skel->bss->enabled = 1;
+	return 0;
+}
+
+static int bperf__read(struct evsel *evsel)
+{
+	struct bperf_follower_bpf *skel = evsel->follower_skel;
+	__u32 num_cpu_bpf = cpu__max_cpu();
+	struct bpf_perf_event_value values[num_cpu_bpf];
+	int reading_map_fd, err = 0;
+	__u32 i, j, num_cpu;
+
+	bperf_sync_counters(evsel);
+	reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
+
+	for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
+		__u32 cpu;
+
+		err = bpf_map_lookup_elem(reading_map_fd, &i, values);
+		if (err)
+			goto out;
+		switch (evsel->follower_skel->bss->type) {
+		case BPERF_FILTER_GLOBAL:
+			assert(i == 0);
+
+			num_cpu = all_cpu_map->nr;
+			for (j = 0; j < num_cpu; j++) {
+				cpu = all_cpu_map->map[j];
+				perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
+				perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
+				perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
+			}
+			break;
+		case BPERF_FILTER_CPU:
+			cpu = evsel->core.cpus->map[i];
+			perf_counts(evsel->counts, i, 0)->val = values[cpu].counter;
+			perf_counts(evsel->counts, i, 0)->ena = values[cpu].enabled;
+			perf_counts(evsel->counts, i, 0)->run = values[cpu].running;
+			break;
+		case BPERF_FILTER_PID:
+		case BPERF_FILTER_TGID:
+			perf_counts(evsel->counts, 0, i)->val = 0;
+			perf_counts(evsel->counts, 0, i)->ena = 0;
+			perf_counts(evsel->counts, 0, i)->run = 0;
+
+			for (cpu = 0; cpu < num_cpu_bpf; cpu++) {
+				perf_counts(evsel->counts, 0, i)->val += values[cpu].counter;
+				perf_counts(evsel->counts, 0, i)->ena += values[cpu].enabled;
+				perf_counts(evsel->counts, 0, i)->run += values[cpu].running;
+			}
+			break;
+		default:
+			break;
+		}
+	}
+out:
+	return err;
+}
+
+static int bperf__destroy(struct evsel *evsel)
+{
+	bperf_follower_bpf__destroy(evsel->follower_skel);
+	close(evsel->bperf_leader_prog_fd);
+	close(evsel->bperf_leader_link_fd);
+	return 0;
+}
+
+/*
+ * bperf: share hardware PMCs with BPF
+ *
+ * perf uses performance monitoring counters (PMC) to monitor system
+ * performance. The PMCs are limited hardware resources. For example,
+ * Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
+ *
+ * Modern data center systems use these PMCs in many different ways:
+ * system level monitoring, (maybe nested) container level monitoring, per
+ * process monitoring, profiling (in sample mode), etc. In some cases,
+ * there are more active perf_events than available hardware PMCs. To allow
+ * all perf_events to have a chance to run, it is necessary to do expensive
+ * time multiplexing of events.
+ *
+ * On the other hand, many monitoring tools count the common metrics
+ * (cycles, instructions). It is a waste to have multiple tools create
+ * multiple perf_events of "cycles" and occupy multiple PMCs.
+ *
+ * bperf tries to reduce such waste by allowing multiple perf_events of
+ * "cycles" or "instructions" (at different scopes) to share PMCs. Instead
+ * of having each perf-stat session read its own perf_events, bperf uses
+ * BPF programs to read the perf_events and aggregate readings to BPF maps.
+ * Then, the perf-stat session(s) read the values from these BPF maps.
+ *
+ *                                ||
+ *       shared progs and maps <- || -> per session progs and maps
+ *                                ||
+ *   ---------------              ||
+ *   | perf_events |              ||
+ *   ---------------       fexit  ||      -----------------
+ *          |             --------||----> | follower prog |
+ *       --------------- /        || ---  -----------------
+ * cs -> | leader prog |/         ||/        |         |
+ *   --> ---------------         /||  --------------  ------------------
+ *  /       |         |         / ||  | filter map |  | accum_readings |
+ * /  ------------  ------------  ||  --------------  ------------------
+ * |  | prev map |  | diff map |  ||                        |
+ * |  ------------  ------------  ||                        |
+ *  \                             ||                        |
+ * = \ ==================================================== | ============
+ *    \                                                    /   user space
+ *     \                                                  /
+ *      \                                                /
+ *    BPF_PROG_TEST_RUN                    BPF_MAP_LOOKUP_ELEM
+ *        \                                            /
+ *         \                                          /
+ *          \------  perf-stat ----------------------/
+ *
+ * The figure above shows the architecture of bperf. Note that the figure
+ * is divided into 3 regions: shared progs and maps (top left), per session
+ * progs and maps (top right), and user space (bottom).
+ *
+ * The leader prog is triggered on each context switch (cs). The leader
+ * prog reads perf_events and stores the difference (current_reading -
+ * previous_reading) to the diff map. For the same metric, e.g. "cycles",
+ * multiple perf-stat sessions share the same leader prog.
+ *
+ * Each perf-stat session creates a follower prog as fexit program to the
+ * leader prog. It is possible to attach up to BPF_MAX_TRAMP_PROGS (38)
+ * follower progs to the same leader prog. The follower prog checks current
+ * task and processor ID to decide whether to add the value from the diff
+ * map to its accumulated reading map (accum_readings).
+ *
+ * Finally, perf-stat user space reads values from the accum_readings map.
+ *
+ * Besides context switches, the leader prog must also be triggered right
+ * before perf-stat reads the values. Otherwise, the accum_readings map may
+ * not have the latest readings from the perf_events. This is achieved by
+ * running the leader prog via sys_bpf(BPF_PROG_TEST_RUN) on each CPU.
+ *
+ * The comment before the definition of struct perf_event_attr_map_entry
+ * describes how different sessions of perf-stat share information about
+ * the leader prog.
+ */
+
+struct bpf_counter_ops bperf_ops = {
+	.load       = bperf__load,
+	.enable     = bperf__enable,
+	.read       = bperf__read,
+	.install_pe = bperf__install_pe,
+	.destroy    = bperf__destroy,
+};
+
+static inline bool bpf_counter_skip(struct evsel *evsel)
+{
+	return list_empty(&evsel->bpf_counter_list) &&
+		evsel->follower_skel == NULL;
+}
+
 int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
 {
-	if (list_empty(&evsel->bpf_counter_list))
+	if (bpf_counter_skip(evsel))
 		return 0;
 	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
 }
 
 int bpf_counter__load(struct evsel *evsel, struct target *target)
 {
-	if (target__has_bpf(target))
+	if (target->bpf_str)
 		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
+	else if (target->use_bpf)
+		evsel->bpf_counter_ops = &bperf_ops;
 
 	if (evsel->bpf_counter_ops)
 		return evsel->bpf_counter_ops->load(evsel, target);
@@ -293,21 +802,21 @@ int bpf_counter__load(struct evsel *evsel, struct target *target)
 
 int bpf_counter__enable(struct evsel *evsel)
 {
-	if (list_empty(&evsel->bpf_counter_list))
+	if (bpf_counter_skip(evsel))
 		return 0;
 	return evsel->bpf_counter_ops->enable(evsel);
 }
 
 int bpf_counter__read(struct evsel *evsel)
 {
-	if (list_empty(&evsel->bpf_counter_list))
+	if (bpf_counter_skip(evsel))
 		return -EAGAIN;
 	return evsel->bpf_counter_ops->read(evsel);
 }
 
 void bpf_counter__destroy(struct evsel *evsel)
 {
-	if (list_empty(&evsel->bpf_counter_list))
+	if (bpf_counter_skip(evsel))
 		return;
 	evsel->bpf_counter_ops->destroy(evsel);
 	evsel->bpf_counter_ops = NULL;
diff --git a/tools/perf/util/bpf_skel/bperf.h b/tools/perf/util/bpf_skel/bperf.h
new file mode 100644
index 0000000000000..186a5551ddb9d
--- /dev/null
+++ b/tools/perf/util/bpf_skel/bperf.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2021 Facebook
+
+#ifndef __BPERF_STAT_H
+#define __BPERF_STAT_H
+
+typedef struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(struct bpf_perf_event_value));
+	__uint(max_entries, 1);
+} reading_map;
+
+#endif /* __BPERF_STAT_H */
diff --git a/tools/perf/util/bpf_skel/bperf_follower.bpf.c b/tools/perf/util/bpf_skel/bperf_follower.bpf.c
new file mode 100644
index 0000000000000..b8fa3cb2da230
--- /dev/null
+++ b/tools/perf/util/bpf_skel/bperf_follower.bpf.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2021 Facebook
+#include <linux/bpf.h>
+#include <linux/perf_event.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "bperf.h"
+#include "bperf_u.h"
+
+reading_map diff_readings SEC(".maps");
+reading_map accum_readings SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(__u32));
+} filter SEC(".maps");
+
+enum bperf_filter_type type = 0;
+int enabled = 0;
+
+SEC("fexit/XXX")
+int BPF_PROG(fexit_XXX)
+{
+	struct bpf_perf_event_value *diff_val, *accum_val;
+	__u32 filter_key, zero = 0;
+	__u32 *accum_key;
+
+	if (!enabled)
+		return 0;
+
+	switch (type) {
+	case BPERF_FILTER_GLOBAL:
+		accum_key = &zero;
+		goto do_add;
+	case BPERF_FILTER_CPU:
+		filter_key = bpf_get_smp_processor_id();
+		break;
+	case BPERF_FILTER_PID:
+		filter_key = bpf_get_current_pid_tgid() & 0xffffffff;
+		break;
+	case BPERF_FILTER_TGID:
+		filter_key = bpf_get_current_pid_tgid() >> 32;
+		break;
+	default:
+		return 0;
+	}
+
+	accum_key = bpf_map_lookup_elem(&filter, &filter_key);
+	if (!accum_key)
+		return 0;
+
+do_add:
+	diff_val = bpf_map_lookup_elem(&diff_readings, &zero);
+	if (!diff_val)
+		return 0;
+
+	accum_val = bpf_map_lookup_elem(&accum_readings, accum_key);
+	if (!accum_val)
+		return 0;
+
+	accum_val->counter += diff_val->counter;
+	accum_val->enabled += diff_val->enabled;
+	accum_val->running += diff_val->running;
+
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "Dual BSD/GPL";
diff --git a/tools/perf/util/bpf_skel/bperf_leader.bpf.c b/tools/perf/util/bpf_skel/bperf_leader.bpf.c
new file mode 100644
index 0000000000000..4f70d1459e86c
--- /dev/null
+++ b/tools/perf/util/bpf_skel/bperf_leader.bpf.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2021 Facebook
+#include <linux/bpf.h>
+#include <linux/perf_event.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "bperf.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(int));
+	__uint(map_flags, BPF_F_PRESERVE_ELEMS);
+} events SEC(".maps");
+
+reading_map prev_readings SEC(".maps");
+reading_map diff_readings SEC(".maps");
+
+SEC("raw_tp/sched_switch")
+int BPF_PROG(on_switch)
+{
+	struct bpf_perf_event_value val, *prev_val, *diff_val;
+	__u32 key = bpf_get_smp_processor_id();
+	__u32 zero = 0;
+	long err;
+
+	prev_val = bpf_map_lookup_elem(&prev_readings, &zero);
+	if (!prev_val)
+		return 0;
+
+	diff_val = bpf_map_lookup_elem(&diff_readings, &zero);
+	if (!diff_val)
+		return 0;
+
+	err = bpf_perf_event_read_value(&events, key, &val, sizeof(val));
+	if (err)
+		return 0;
+
+	diff_val->counter = val.counter - prev_val->counter;
+	diff_val->enabled = val.enabled - prev_val->enabled;
+	diff_val->running = val.running - prev_val->running;
+	*prev_val = val;
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "Dual BSD/GPL";
diff --git a/tools/perf/util/bpf_skel/bperf_u.h b/tools/perf/util/bpf_skel/bperf_u.h
new file mode 100644
index 0000000000000..1ce0c2c905c11
--- /dev/null
+++ b/tools/perf/util/bpf_skel/bperf_u.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2021 Facebook
+
+#ifndef __BPERF_STAT_U_H
+#define __BPERF_STAT_U_H
+
+enum bperf_filter_type {
+	BPERF_FILTER_GLOBAL = 1,
+	BPERF_FILTER_CPU,
+	BPERF_FILTER_PID,
+	BPERF_FILTER_TGID,
+};
+
+#endif /* __BPERF_STAT_U_H */
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 6026487353dd8..dd4f56f9cfdf5 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -20,6 +20,8 @@ union perf_event;
 struct bpf_counter_ops;
 struct target;
 struct hashmap;
+struct bperf_leader_bpf;
+struct bperf_follower_bpf;
 
 typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
 
@@ -130,8 +132,24 @@ struct evsel {
 	 * See also evsel__has_callchain().
 	 */
 	__u64			synth_sample_type;
-	struct list_head	bpf_counter_list;
+
+	/*
+	 * bpf_counter_ops serves two use cases:
+	 *   1. perf-stat -b          counting events used by BPF programs
+	 *   2. perf-stat --use-bpf   use BPF programs to aggregate counts
+	 */
 	struct bpf_counter_ops	*bpf_counter_ops;
+
+	/* for perf-stat -b */
+	struct list_head	bpf_counter_list;
+
+	/* for perf-stat --use-bpf */
+	int			bperf_leader_prog_fd;
+	int			bperf_leader_link_fd;
+	union {
+		struct bperf_leader_bpf *leader_skel;
+		struct bperf_follower_bpf *follower_skel;
+	};
 };
 
 struct perf_missing_features {
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index f132c6c2eef81..1bce3eb28ef25 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -16,6 +16,8 @@ struct target {
 	bool	     uses_mmap;
 	bool	     default_per_cpu;
 	bool	     per_thread;
+	bool	     use_bpf;
+	const char   *attr_map;
 };
 
 enum target_errno {
@@ -66,7 +68,7 @@ static inline bool target__has_cpu(struct target *target)
 
 static inline bool target__has_bpf(struct target *target)
 {
-	return target->bpf_str;
+	return target->bpf_str || target->use_bpf;
 }
 
 static inline bool target__none(struct target *target)
-- 
2.30.2
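
As a reading aid for the leader/follower split in this patch, here is a
minimal user-space sketch of the arithmetic, assuming the behaviour
visible in the hunks above: the leader publishes the change since its
previous reading, and the follower adds that change to an accumulator.
struct reading, leader_update() and follower_accumulate() are
hypothetical names for illustration; they are not part of the patch and
no BPF maps are involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct bpf_perf_event_value. */
struct reading {
	uint64_t counter;
	uint64_t enabled;
	uint64_t running;
};

/*
 * Leader step: publish the change since the previous reading in *diff,
 * then remember the current reading for the next context switch.
 */
void leader_update(struct reading *prev, struct reading *diff,
		   const struct reading *cur)
{
	/* Assumption: the counter only moves forward between two reads. */
	assert(cur->counter >= prev->counter);

	diff->counter = cur->counter - prev->counter;
	diff->enabled = cur->enabled - prev->enabled;
	diff->running = cur->running - prev->running;
	*prev = *cur;
}

/* Follower step: fold the leader's diff into a per-target accumulator. */
void follower_accumulate(struct reading *accum, const struct reading *diff)
{
	accum->counter += diff->counter;
	accum->enabled += diff->enabled;
	accum->running += diff->running;
}
```

Calling leader_update() and then follower_accumulate() once per switch
leaves the accumulator equal to the sum of all diffs seen so far, which
is why several followers can share one hardware counter while each keeps
its own total.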



* [PATCH v2 2/3] perf-stat: measure t0 and ref_time after enable_counters()
  2021-03-16 21:18 [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Song Liu
  2021-03-16 21:18 ` [PATCH v2 1/3] perf-stat: introduce bperf, " Song Liu
@ 2021-03-16 21:18 ` Song Liu
  2021-03-16 21:18 ` [PATCH v2 3/3] perf-test: add a test for perf-stat --bpf-counters option Song Liu
  2021-03-17  5:29 ` [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Namhyung Kim
  3 siblings, 0 replies; 33+ messages in thread
From: Song Liu @ 2021-03-16 21:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: kernel-team, acme, acme, namhyung, jolsa, Song Liu

Take measurements of t0 and ref_time after enable_counters(), so that
they only measure the time consumed when the counters are enabled.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/builtin-stat.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 92696373da994..d030c3a49a8e4 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -931,15 +931,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	/*
 	 * Enable counters and exec the command:
 	 */
-	t0 = rdclock();
-	clock_gettime(CLOCK_MONOTONIC, &ref_time);
-
 	if (forks) {
 		evlist__start_workload(evsel_list);
 		err = enable_counters();
 		if (err)
 			return -1;
 
+		t0 = rdclock();
+		clock_gettime(CLOCK_MONOTONIC, &ref_time);
+
 		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
 			status = dispatch_events(forks, timeout, interval, &times);
 		if (child_pid != -1) {
@@ -960,6 +960,10 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 		err = enable_counters();
 		if (err)
 			return -1;
+
+		t0 = rdclock();
+		clock_gettime(CLOCK_MONOTONIC, &ref_time);
+
 		status = dispatch_events(forks, timeout, interval, &times);
 	}
 
-- 
2.30.2
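
To illustrate why the clock reads move, here is a minimal sketch that
takes the reference time only after a setup step has returned, so that
the setup cost is not counted. setup(), work(), time_after_setup() and
elapsed_ns() are hypothetical names; perf's own rdclock() and
enable_counters() are not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed from *start to *end (same clock for both). */
int64_t elapsed_ns(const struct timespec *start, const struct timespec *end)
{
	return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL +
	       (end->tv_nsec - start->tv_nsec);
}

/*
 * Run setup(), then read the clock, then run work().
 * Returns the time spent in work() only; the setup cost is excluded
 * because the first clock read happens after setup() returns.
 * Returns -1 if setup() reports failure (non-zero).
 */
int64_t time_after_setup(int (*setup)(void), void (*work)(void))
{
	struct timespec t0, t1;

	if (setup())
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	work();
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return elapsed_ns(&t0, &t1);
}
```

Reading the clock before setup() instead would add the setup time to the
reported interval, which is the effect this patch removes for
enable_counters().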



* [PATCH v2 3/3] perf-test: add a test for perf-stat --bpf-counters option
  2021-03-16 21:18 [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Song Liu
  2021-03-16 21:18 ` [PATCH v2 1/3] perf-stat: introduce bperf, " Song Liu
  2021-03-16 21:18 ` [PATCH v2 2/3] perf-stat: measure t0 and ref_time after enable_counters() Song Liu
@ 2021-03-16 21:18 ` Song Liu
  2021-03-18  6:07   ` Namhyung Kim
  2021-03-17  5:29 ` [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Namhyung Kim
  3 siblings, 1 reply; 33+ messages in thread
From: Song Liu @ 2021-03-16 21:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: kernel-team, acme, acme, namhyung, jolsa, Song Liu

Add a test that compares the output of perf-stat with and without the
--bpf-counters option. If the two cycle counts differ by more than 10%,
the test fails.

For stable results between two runs (w/ and w/o --bpf-counters), the test
program should: 1) run long enough for a better signal-to-noise ratio;
2) not depend on the behavior of the IO subsystem (for less noise from
caching). So far, the best option we have found is stressapptest.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/tests/shell/stat_bpf_counters.sh | 34 +++++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100755 tools/perf/tests/shell/stat_bpf_counters.sh

diff --git a/tools/perf/tests/shell/stat_bpf_counters.sh b/tools/perf/tests/shell/stat_bpf_counters.sh
new file mode 100755
index 0000000000000..c0bcb38d6b53c
--- /dev/null
+++ b/tools/perf/tests/shell/stat_bpf_counters.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+# perf stat --bpf-counters test
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+# check whether $2 is within +/- 10% of $1
+compare_number()
+{
+	first_num=$1
+	second_num=$2
+
+	# upper bound is first_num * 110%
+	upper=$(( $first_num + $first_num / 10 ))
+	# lower bound is first_num * 90%
+	lower=$(( $first_num - $first_num / 10 ))
+
+	if [ $second_num -gt $upper ] || [ $second_num -lt $lower ]; then
+		echo "The difference between $first_num and $second_num is greater than 10%."
+		exit 1
+	fi
+}
+
+# skip if --bpf-counters is not supported
+perf stat --bpf-counters true > /dev/null 2>&1 || exit 2
+
+# skip if stressapptest is not available
+stressapptest -s 1 -M 100 -m 1 > /dev/null 2>&1 || exit 2
+
+base_cycles=$(perf stat --no-big-num -e cycles -- stressapptest -s 3 -M 100 -m 1 2>&1 | grep -e cycles | awk '{print $1}')
+bpf_cycles=$(perf stat --no-big-num --bpf-counters -e cycles -- stressapptest -s 3 -M 100 -m 1 2>&1 | grep -e cycles | awk '{print $1}')
+
+compare_number $base_cycles $bpf_cycles
+exit 0
-- 
2.30.2
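
For reference, the bound check done by compare_number() in the script
above can be sketched in C as follows. This is only an illustration of
the same integer arithmetic, not code from the patch;
within_ten_percent() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when second lies within first +/- first/10, using the
 * same truncating integer division as the shell arithmetic in
 * compare_number().
 */
bool within_ten_percent(uint64_t first, uint64_t second)
{
	uint64_t margin = first / 10;

	return second >= first - margin && second <= first + margin;
}
```

For example, assert(within_ten_percent(1000, 1100)) holds while
within_ten_percent(1000, 1101) is false. Because the division
truncates, the margin is zero for values below 10; that does not matter
for cycle counts, which are far larger.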



* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-16 21:18 [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Song Liu
                   ` (2 preceding siblings ...)
  2021-03-16 21:18 ` [PATCH v2 3/3] perf-test: add a test for perf-stat --bpf-counters option Song Liu
@ 2021-03-17  5:29 ` Namhyung Kim
  2021-03-17  9:19   ` Jiri Olsa
  2021-03-17 13:11   ` Arnaldo Carvalho de Melo
  3 siblings, 2 replies; 33+ messages in thread
From: Namhyung Kim @ 2021-03-17  5:29 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa

Hi Song,

On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
>
> perf uses performance monitoring counters (PMCs) to monitor system
> performance. The PMCs are limited hardware resources. For example,
> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
>
> Modern data center systems use these PMCs in many different ways:
> system level monitoring, (maybe nested) container level monitoring, per
> process monitoring, profiling (in sample mode), etc. In some cases,
> there are more active perf_events than available hardware PMCs. To allow
> all perf_events to have a chance to run, it is necessary to do expensive
> time multiplexing of events.
>
> On the other hand, many monitoring tools count the common metrics (cycles,
> instructions). It is a waste to have multiple tools create multiple
> perf_events of "cycles" and occupy multiple PMCs.

Right, it'd be really helpful when the PMCs are frequently or mostly shared.
But it'd also increase the overhead for uncontended cases as BPF programs
need to run on every context switch.  Depending on the workload, it may
cause a non-negligible performance impact.  So users should be aware of it.

Thanks,
Namhyung

>
> bperf tries to reduce such wastes by allowing multiple perf_events of
> "cycles" or "instructions" (at different scopes) to share PMUs. Instead
> of having each perf-stat session to read its own perf_events, bperf uses
> BPF programs to read the perf_events and aggregate readings to BPF maps.
> Then, the perf-stat session(s) reads the values from these BPF maps.
>
> Changes v1 => v2:
>   1. Add documentation.
>   2. Add a shell test.
>   3. Rename options, default path of the attr-map, and some variables.
>   4. Add a separate patch that moves clock_gettime() in __run_perf_stat()
>      to after enable_counters().
>   5. Make perf_cpu_map for all cpus a global variable.
>   6. Use sysfs__mountpoint() for default attr-map path.
>   7. Use cpu__max_cpu() instead of libbpf_num_possible_cpus().
>   8. Add flag "enabled" to the follower program. Then move follower attach
>      to bperf__load() and simplify bperf__enable().
>
> Song Liu (3):
>   perf-stat: introduce bperf, share hardware PMCs with BPF
>   perf-stat: measure t0 and ref_time after enable_counters()
>   perf-test: add a test for perf-stat --bpf-counters option
>
>  tools/perf/Documentation/perf-stat.txt        |  11 +
>  tools/perf/Makefile.perf                      |   1 +
>  tools/perf/builtin-stat.c                     |  20 +-
>  tools/perf/tests/shell/stat_bpf_counters.sh   |  34 ++
>  tools/perf/util/bpf_counter.c                 | 519 +++++++++++++++++-
>  tools/perf/util/bpf_skel/bperf.h              |  14 +
>  tools/perf/util/bpf_skel/bperf_follower.bpf.c |  69 +++
>  tools/perf/util/bpf_skel/bperf_leader.bpf.c   |  46 ++
>  tools/perf/util/bpf_skel/bperf_u.h            |  14 +
>  tools/perf/util/evsel.h                       |  20 +-
>  tools/perf/util/target.h                      |   4 +-
>  11 files changed, 742 insertions(+), 10 deletions(-)
>  create mode 100755 tools/perf/tests/shell/stat_bpf_counters.sh
>  create mode 100644 tools/perf/util/bpf_skel/bperf.h
>  create mode 100644 tools/perf/util/bpf_skel/bperf_follower.bpf.c
>  create mode 100644 tools/perf/util/bpf_skel/bperf_leader.bpf.c
>  create mode 100644 tools/perf/util/bpf_skel/bperf_u.h
>
> --
> 2.30.2


* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-17  5:29 ` [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Namhyung Kim
@ 2021-03-17  9:19   ` Jiri Olsa
  2021-03-17 13:11   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 33+ messages in thread
From: Jiri Olsa @ 2021-03-17  9:19 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Song Liu, linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim wrote:
> Hi Song,
> 
> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
> >
> > perf uses performance monitoring counters (PMCs) to monitor system
> > performance. The PMCs are limited hardware resources. For example,
> > Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
> >
> > Modern data center systems use these PMCs in many different ways:
> > system level monitoring, (maybe nested) container level monitoring, per
> > process monitoring, profiling (in sample mode), etc. In some cases,
> > there are more active perf_events than available hardware PMCs. To allow
> > all perf_events to have a chance to run, it is necessary to do expensive
> > time multiplexing of events.
> >
> > On the other hand, many monitoring tools count the common metrics (cycles,
> > instructions). It is a waste to have multiple tools create multiple
> > perf_events of "cycles" and occupy multiple PMCs.
> 
> Right, it'd be really helpful when the PMCs are frequently or mostly shared.
> But it'd also increase the overhead for uncontended cases as BPF programs
> need to run on every context switch.  Depending on the workload, it may
> cause a non-negligible performance impact.  So users should be aware of it.

right, let's get some idea of how bad that actually is

Song,
could you please get some numbers from running, for example,
'perf bench sched messaging ...' with perf stat in both normal and bpf
mode, for all supported target options?

thanks,
jirka



* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-17  5:29 ` [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Namhyung Kim
  2021-03-17  9:19   ` Jiri Olsa
@ 2021-03-17 13:11   ` Arnaldo Carvalho de Melo
  2021-03-18  3:52     ` Song Liu
  1 sibling, 1 reply; 33+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-17 13:11 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Song Liu, linux-kernel, Kernel Team, Arnaldo Carvalho de Melo, Jiri Olsa

Em Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim escreveu:
> Hi Song,
> 
> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
> >
> > perf uses performance monitoring counters (PMCs) to monitor system
> > performance. The PMCs are limited hardware resources. For example,
> > Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
> >
> > Modern data center systems use these PMCs in many different ways:
> > system level monitoring, (maybe nested) container level monitoring, per
> > process monitoring, profiling (in sample mode), etc. In some cases,
> > there are more active perf_events than available hardware PMCs. To allow
> > all perf_events to have a chance to run, it is necessary to do expensive
> > time multiplexing of events.
> >
> > On the other hand, many monitoring tools count the common metrics (cycles,
> > instructions). It is a waste to have multiple tools create multiple
> > perf_events of "cycles" and occupy multiple PMCs.
> 
> Right, it'd be really helpful when the PMCs are frequently or mostly shared.
> But it'd also increase the overhead for uncontended cases as BPF programs
> need to run on every context switch.  Depending on the workload, it may
> cause a non-negligible performance impact.  So users should be aware of it.

Would be interesting to, humm, measure both cases to have a firm number
of the impact, how many instructions are added when sharing using
--bpf-counters?

I.e. compare the "expensive time multiplexing of events" with its
avoidance by using --bpf-counters.

Song, have you performed such measurements?

- Arnaldo
 
> Thanks,
> Namhyung
> 
> >
> > bperf tries to reduce such wastes by allowing multiple perf_events of
> > "cycles" or "instructions" (at different scopes) to share PMUs. Instead
> > of having each perf-stat session to read its own perf_events, bperf uses
> > BPF programs to read the perf_events and aggregate readings to BPF maps.
> > Then, the perf-stat session(s) reads the values from these BPF maps.
> >
> > Changes v1 => v2:
> >   1. Add documentation.
> >   2. Add a shell test.
> >   3. Rename options, default path of the attr-map, and some variables.
> >   4. Add a separate patch that moves clock_gettime() in __run_perf_stat()
> >      to after enable_counters().
> >   5. Make perf_cpu_map for all cpus a global variable.
> >   6. Use sysfs__mountpoint() for default attr-map path.
> >   7. Use cpu__max_cpu() instead of libbpf_num_possible_cpus().
> >   8. Add flag "enabled" to the follower program. Then move follower attach
> >      to bperf__load() and simplify bperf__enable().
> >
> > Song Liu (3):
> >   perf-stat: introduce bperf, share hardware PMCs with BPF
> >   perf-stat: measure t0 and ref_time after enable_counters()
> >   perf-test: add a test for perf-stat --bpf-counters option
> >
> >  tools/perf/Documentation/perf-stat.txt        |  11 +
> >  tools/perf/Makefile.perf                      |   1 +
> >  tools/perf/builtin-stat.c                     |  20 +-
> >  tools/perf/tests/shell/stat_bpf_counters.sh   |  34 ++
> >  tools/perf/util/bpf_counter.c                 | 519 +++++++++++++++++-
> >  tools/perf/util/bpf_skel/bperf.h              |  14 +
> >  tools/perf/util/bpf_skel/bperf_follower.bpf.c |  69 +++
> >  tools/perf/util/bpf_skel/bperf_leader.bpf.c   |  46 ++
> >  tools/perf/util/bpf_skel/bperf_u.h            |  14 +
> >  tools/perf/util/evsel.h                       |  20 +-
> >  tools/perf/util/target.h                      |   4 +-
> >  11 files changed, 742 insertions(+), 10 deletions(-)
> >  create mode 100755 tools/perf/tests/shell/stat_bpf_counters.sh
> >  create mode 100644 tools/perf/util/bpf_skel/bperf.h
> >  create mode 100644 tools/perf/util/bpf_skel/bperf_follower.bpf.c
> >  create mode 100644 tools/perf/util/bpf_skel/bperf_leader.bpf.c
> >  create mode 100644 tools/perf/util/bpf_skel/bperf_u.h
> >
> > --
> > 2.30.2

-- 

- Arnaldo


* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-17 13:11   ` Arnaldo Carvalho de Melo
@ 2021-03-18  3:52     ` Song Liu
  2021-03-18  4:32       ` Namhyung Kim
  2021-03-18 21:14       ` Jiri Olsa
  0 siblings, 2 replies; 33+ messages in thread
From: Song Liu @ 2021-03-18  3:52 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, linux-kernel, Kernel Team,
	Arnaldo Carvalho de Melo, Jiri Olsa



> On Mar 17, 2021, at 6:11 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim escreveu:
>> Hi Song,
>> 
>> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
>>> 
>>> perf uses performance monitoring counters (PMCs) to monitor system
>>> performance. The PMCs are limited hardware resources. For example,
>>> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
>>> 
>>> Modern data center systems use these PMCs in many different ways:
>>> system level monitoring, (maybe nested) container level monitoring, per
>>> process monitoring, profiling (in sample mode), etc. In some cases,
>>> there are more active perf_events than available hardware PMCs. To allow
>>> all perf_events to have a chance to run, it is necessary to do expensive
>>> time multiplexing of events.
>>> 
>>> On the other hand, many monitoring tools count the common metrics (cycles,
>>> instructions). It is a waste to have multiple tools create multiple
>>> perf_events of "cycles" and occupy multiple PMCs.
>> 
>> Right, it'd be really helpful when the PMCs are frequently or mostly shared.
>> But it'd also increase the overhead for uncontended cases as BPF programs
>> need to run on every context switch.  Depending on the workload, it may
>> cause a non-negligible performance impact.  So users should be aware of it.
> 
> Would be interesting to, humm, measure both cases to have a firm number
> of the impact, how many instructions are added when sharing using
> --bpf-counters?
> 
> I.e. compare the "expensive time multiplexing of events" with its
> avoidance by using --bpf-counters.
> 
> Song, have you performed such measurements?

I have got some measurements with perf-bench-sched-messaging:

The system: x86_64 with 23 cores (46 HT)

The perf-stat command:
perf stat -e cycles,cycles,instructions,instructions,ref-cycles,ref-cycles <target, etc.>

The benchmark command and output:
./perf bench sched messaging -g 40 -l 50000 -t
# Running 'sched/messaging' benchmark:
# 20 sender and receiver threads per group
# 40 groups == 1600 threads run
     Total time: 10X.XXX [sec]


I use the "Total time" as the measurement, so a smaller number is better.

For each condition, I ran the command 5 times and took the median of
"Total time".

Baseline (no perf-stat)			104.873 [sec]
# global
perf stat -a				107.887 [sec]
perf stat -a --bpf-counters		106.071 [sec]
# per task
perf stat 				106.314 [sec]
perf stat --bpf-counters 		105.965 [sec]
# per cpu
perf stat -C 1,3,5 			107.063 [sec]
perf stat -C 1,3,5 --bpf-counters 	106.406 [sec]

From the data, --bpf-counters is slightly faster than regular events
for all targets. I noticed that the results are not very stable. There
are a couple of 108.xx runs in some of the conditions (w/ and w/o
--bpf-counters).
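
To put the medians in relative terms, the slowdown of each
configuration against the baseline can be computed as below. The
numbers are copied from the table above; struct target_result,
overhead_pct() and bpf_lower_everywhere() are hypothetical names used
only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Median "Total time" values in seconds, copied from the table above. */
static const double baseline_sec = 104.873;

struct target_result {
	const char *target;
	double regular_sec;	/* plain perf stat */
	double bpf_sec;		/* perf stat --bpf-counters */
};

static const struct target_result results[] = {
	{ "global (-a)",        107.887, 106.071 },
	{ "per task",           106.314, 105.965 },
	{ "per cpu (-C 1,3,5)", 107.063, 106.406 },
};

/* Slowdown of measured_sec relative to base_sec, in percent. */
double overhead_pct(double base_sec, double measured_sec)
{
	assert(base_sec > 0.0);
	return (measured_sec - base_sec) / base_sec * 100.0;
}

/* True when --bpf-counters has the lower overhead for every target. */
bool bpf_lower_everywhere(void)
{
	size_t i;

	for (i = 0; i < sizeof(results) / sizeof(results[0]); i++) {
		if (overhead_pct(baseline_sec, results[i].bpf_sec) >=
		    overhead_pct(baseline_sec, results[i].regular_sec))
			return false;
	}
	return true;
}
```

With these inputs the overhead comes out at roughly 2.9% vs 1.1%
(global), 1.4% vs 1.0% (per task) and 2.1% vs 1.5% (per cpu), regular
vs --bpf-counters. A single 108.xx run would be about 3% over baseline,
so the gaps should be read with the instability noted above in mind.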


I also measured the average runtime of the BPF programs, with 

	sysctl kernel.bpf_stats_enabled=1

For each event, if we have one leader and two followers, the total run
time is about 340ns. In other words, about 340ns for two perf-stat
sessions reading instructions, about 340ns for two sessions reading
cycles, and so on.
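
As a rough way to translate the 340ns figure into CPU time, the sketch
below multiplies it by the number of distinct shared events and by a
context-switch rate. It assumes the programs run once per context
switch per event, as the raw_tp/sched_switch attachment of the leader
suggests. The switch rate is an input the reader has to supply; it was
not measured in this thread. bperf_cpu_fraction() is a hypothetical
name.

```c
#include <assert.h>

/*
 * Fraction of one CPU's time spent in the bperf programs.
 *   ns_per_event:     run time of leader plus followers for one event
 *   nr_events:        number of distinct events being shared
 *   switches_per_sec: context switches per second on that CPU (assumed)
 */
double bperf_cpu_fraction(double ns_per_event, unsigned int nr_events,
			  double switches_per_sec)
{
	assert(ns_per_event >= 0.0 && switches_per_sec >= 0.0);
	return ns_per_event * nr_events * switches_per_sec / 1e9;
}
```

For instance, with 340ns per event, the three distinct events from the
perf-stat command above (cycles, instructions, ref-cycles) and a
hypothetical 10,000 switches per second on one CPU, the result is about
0.0102, i.e. about 1% of that CPU.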

Thanks,
Song


* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-18  3:52     ` Song Liu
@ 2021-03-18  4:32       ` Namhyung Kim
  2021-03-18  7:03         ` Song Liu
  2021-03-18 21:14       ` Jiri Olsa
  1 sibling, 1 reply; 33+ messages in thread
From: Namhyung Kim @ 2021-03-18  4:32 UTC (permalink / raw)
  To: Song Liu
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Kernel Team,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Thu, Mar 18, 2021 at 12:52 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Mar 17, 2021, at 6:11 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > Em Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim escreveu:
> >> Hi Song,
> >>
> >> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
> >>>
> >>> perf uses performance monitoring counters (PMCs) to monitor system
> >>> performance. The PMCs are limited hardware resources. For example,
> >>> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
> >>>
> >>> Modern data center systems use these PMCs in many different ways:
> >>> system level monitoring, (maybe nested) container level monitoring, per
> >>> process monitoring, profiling (in sample mode), etc. In some cases,
> >>> there are more active perf_events than available hardware PMCs. To allow
> >>> all perf_events to have a chance to run, it is necessary to do expensive
> >>> time multiplexing of events.
> >>>
> >>> On the other hand, many monitoring tools count the common metrics (cycles,
> >>> instructions). It is a waste to have multiple tools create multiple
> >>> perf_events of "cycles" and occupy multiple PMCs.
> >>
> >> Right, it'd be really helpful when the PMCs are frequently or mostly shared.
> >> But it'd also increase the overhead for uncontended cases as BPF programs
> >> need to run on every context switch.  Depending on the workload, it may
> >> cause a non-negligible performance impact.  So users should be aware of it.
> >
> > Would be interesting to, humm, measure both cases to have a firm number
> > of the impact, how many instructions are added when sharing using
> > --bpf-counters?
> >
> > I.e. compare the "expensive time multiplexing of events" with its
> > avoidance by using --bpf-counters.
> >
> > Song, have you performed such measurements?
>
> I have got some measurements with perf-bench-sched-messaging:
>
> The system: x86_64 with 23 cores (46 HT)
>
> The perf-stat command:
> perf stat -e cycles,cycles,instructions,instructions,ref-cycles,ref-cycles <target, etc.>
>
> The benchmark command and output:
> ./perf bench sched messaging -g 40 -l 50000 -t
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver threads per group
> # 40 groups == 1600 threads run
>      Total time: 10X.XXX [sec]
>
>
> I use the "Total time" as the measurement, so a smaller number is better.
>
> For each condition, I ran the command 5 times and took the median of
> "Total time".
>
> Baseline (no perf-stat)                 104.873 [sec]
> # global
> perf stat -a                            107.887 [sec]
> perf stat -a --bpf-counters             106.071 [sec]
> # per task
> perf stat                               106.314 [sec]
> perf stat --bpf-counters                105.965 [sec]
> # per cpu
> perf stat -C 1,3,5                      107.063 [sec]
> perf stat -C 1,3,5 --bpf-counters       106.406 [sec]
>
> From the data, --bpf-counters is slightly faster than regular events
> for all targets. I noticed that the results are not very stable. There
> are a couple of 108.xx runs in some of the conditions (w/ and w/o
> --bpf-counters).

Hmm, so these results are from runs where multiplexing happened, right?
I wonder how/why regular perf stat is slower.

Thanks,
Namhyung

>
>
> I also measured the average runtime of the BPF programs, with
>
>         sysctl kernel.bpf_stats_enabled=1
>
> For each event, if we have one leader and two followers, the total run
> time is about 340ns. In other words, about 340ns for two perf-stat
> sessions reading instructions, about 340ns for two sessions reading
> cycles, and so on.
>
> Thanks,
> Song


* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-16 21:18 ` [PATCH v2 1/3] perf-stat: introduce bperf, " Song Liu
@ 2021-03-18  5:54   ` Namhyung Kim
  2021-03-18  7:22     ` Song Liu
  2021-03-18 21:15   ` Jiri Olsa
  1 sibling, 1 reply; 33+ messages in thread
From: Namhyung Kim @ 2021-03-18  5:54 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
> +static int bperf_check_target(struct evsel *evsel,
> +                             struct target *target,
> +                             enum bperf_filter_type *filter_type,
> +                             __u32 *filter_entry_cnt)
> +{
> +       if (evsel->leader->core.nr_members > 1) {
> +               pr_err("bpf managed perf events do not yet support groups.\n");
> +               return -1;
> +       }
> +
> +       /* determine filter type based on target */
> +       if (target->system_wide) {
> +               *filter_type = BPERF_FILTER_GLOBAL;
> +               *filter_entry_cnt = 1;
> +       } else if (target->cpu_list) {
> +               *filter_type = BPERF_FILTER_CPU;
> +               *filter_entry_cnt = perf_cpu_map__nr(evsel__cpus(evsel));
> +       } else if (target->tid) {
> +               *filter_type = BPERF_FILTER_PID;
> +               *filter_entry_cnt = perf_thread_map__nr(evsel->core.threads);
> +       } else if (target->pid || evsel->evlist->workload.pid != -1) {
> +               *filter_type = BPERF_FILTER_TGID;
> +               *filter_entry_cnt = perf_thread_map__nr(evsel->core.threads);
> +       } else {
> +               pr_err("bpf managed perf events do not yet support these targets.\n");
> +               return -1;
> +       }
> +
> +       return 0;
> +}
> +
> +static struct perf_cpu_map *all_cpu_map;
> +
> +static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
> +                                      struct perf_event_attr_map_entry *entry)
> +{
> +       struct bperf_leader_bpf *skel = bperf_leader_bpf__open();
> +       int link_fd, diff_map_fd, err;
> +       struct bpf_link *link = NULL;
> +
> +       if (!skel) {
> +               pr_err("Failed to open leader skeleton\n");
> +               return -1;
> +       }
> +
> +       bpf_map__resize(skel->maps.events, libbpf_num_possible_cpus());
> +       err = bperf_leader_bpf__load(skel);
> +       if (err) {
> +               pr_err("Failed to load leader skeleton\n");
> +               goto out;
> +       }
> +
> +       err = -1;
> +       link = bpf_program__attach(skel->progs.on_switch);
> +       if (!link) {
> +               pr_err("Failed to attach leader program\n");
> +               goto out;
> +       }
> +
> +       link_fd = bpf_link__fd(link);
> +       diff_map_fd = bpf_map__fd(skel->maps.diff_readings);
> +       entry->link_id = bpf_link_get_id(link_fd);
> +       entry->diff_map_id = bpf_map_get_id(diff_map_fd);
> +       err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, entry, BPF_ANY);
> +       assert(err == 0);
> +
> +       evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry->link_id);
> +       assert(evsel->bperf_leader_link_fd >= 0);

Isn't it the same as link_fd?

> +
> +       /*
> +        * save leader_skel for install_pe, which is called within
> +        * following evsel__open_per_cpu call
> +        */
> +       evsel->leader_skel = skel;
> +       evsel__open_per_cpu(evsel, all_cpu_map, -1);
> +
> +out:
> +       bperf_leader_bpf__destroy(skel);
> +       bpf_link__destroy(link);

Why do we destroy it?  Is it because we get another reference?

> +       return err;
> +}
> +
> +static int bperf__load(struct evsel *evsel, struct target *target)
> +{
> +       struct perf_event_attr_map_entry entry = {0xffffffff, 0xffffffff};
> +       int attr_map_fd, diff_map_fd = -1, err;
> +       enum bperf_filter_type filter_type;
> +       __u32 filter_entry_cnt, i;
> +
> +       if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
> +               return -1;
> +
> +       if (!all_cpu_map) {
> +               all_cpu_map = perf_cpu_map__new(NULL);
> +               if (!all_cpu_map)
> +                       return -1;
> +       }
> +
> +       evsel->bperf_leader_prog_fd = -1;
> +       evsel->bperf_leader_link_fd = -1;
> +
> +       /*
> +        * Step 1: hold a fd on the leader program and the bpf_link; if
> +        * the program is already gone, reload the program.
> +        * Use flock() to ensure exclusive access to the perf_event_attr
> +        * map.
> +        */
> +       attr_map_fd = bperf_lock_attr_map(target);
> +       if (attr_map_fd < 0) {
> +               pr_err("Failed to lock perf_event_attr map\n");
> +               return -1;
> +       }
> +
> +       err = bpf_map_lookup_elem(attr_map_fd, &evsel->core.attr, &entry);
> +       if (err) {
> +               err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, &entry, BPF_ANY);
> +               if (err)
> +                       goto out;
> +       }
> +
> +       evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry.link_id);
> +       if (evsel->bperf_leader_link_fd < 0 &&
> +           bperf_reload_leader_program(evsel, attr_map_fd, &entry))
> +               goto out;
> +
> +       /*
> +        * The bpf_link holds reference to the leader program, and the
> +        * leader program holds reference to the maps. Therefore, if
> +        * link_id is valid, diff_map_id should also be valid.
> +        */
> +       evsel->bperf_leader_prog_fd = bpf_prog_get_fd_by_id(
> +               bpf_link_get_prog_id(evsel->bperf_leader_link_fd));
> +       assert(evsel->bperf_leader_prog_fd >= 0);
> +
> +       diff_map_fd = bpf_map_get_fd_by_id(entry.diff_map_id);
> +       assert(diff_map_fd >= 0);
> +
> +       /*
> +        * bperf uses BPF_PROG_TEST_RUN to get an accurate reading. Check
> +        * whether the kernel supports it
> +        */
> +       err = bperf_trigger_reading(evsel->bperf_leader_prog_fd, 0);
> +       if (err) {
> +               pr_err("The kernel does not support test_run for raw_tp BPF programs.\n"
> +                      "Therefore, --use-bpf might show inaccurate readings\n");
> +               goto out;
> +       }
> +
> +       /* Step 2: load the follower skeleton */
> +       evsel->follower_skel = bperf_follower_bpf__open();
> +       if (!evsel->follower_skel) {
> +               pr_err("Failed to open follower skeleton\n");
> +               goto out;
> +       }
> +
> +       /* attach fexit program to the leader program */
> +       bpf_program__set_attach_target(evsel->follower_skel->progs.fexit_XXX,
> +                                      evsel->bperf_leader_prog_fd, "on_switch");
> +
> +       /* connect to leader diff_reading map */
> +       bpf_map__reuse_fd(evsel->follower_skel->maps.diff_readings, diff_map_fd);
> +
> +       /* set up reading map */
> +       bpf_map__set_max_entries(evsel->follower_skel->maps.accum_readings,
> +                                filter_entry_cnt);
> +       /* set up follower filter based on target */
> +       bpf_map__set_max_entries(evsel->follower_skel->maps.filter,
> +                                filter_entry_cnt);
> +       err = bperf_follower_bpf__load(evsel->follower_skel);
> +       if (err) {
> +               pr_err("Failed to load follower skeleton\n");
> +               bperf_follower_bpf__destroy(evsel->follower_skel);
> +               evsel->follower_skel = NULL;
> +               goto out;
> +       }
> +
> +       for (i = 0; i < filter_entry_cnt; i++) {
> +               int filter_map_fd;
> +               __u32 key;
> +
> +               if (filter_type == BPERF_FILTER_PID ||
> +                   filter_type == BPERF_FILTER_TGID)
> +                       key = evsel->core.threads->map[i].pid;
> +               else if (filter_type == BPERF_FILTER_CPU)
> +                       key = evsel->core.cpus->map[i];
> +               else
> +                       break;
> +
> +               filter_map_fd = bpf_map__fd(evsel->follower_skel->maps.filter);
> +               bpf_map_update_elem(filter_map_fd, &key, &i, BPF_ANY);
> +       }
> +
> +       evsel->follower_skel->bss->type = filter_type;
> +
> +       err = bperf_follower_bpf__attach(evsel->follower_skel);
> +
> +out:
> +       if (err && evsel->bperf_leader_link_fd >= 0)
> +               close(evsel->bperf_leader_link_fd);
> +       if (err && evsel->bperf_leader_prog_fd >= 0)
> +               close(evsel->bperf_leader_prog_fd);
> +       if (diff_map_fd >= 0)
> +               close(diff_map_fd);
> +
> +       flock(attr_map_fd, LOCK_UN);
> +       close(attr_map_fd);
> +
> +       return err;
> +}
> +
> +static int bperf__install_pe(struct evsel *evsel, int cpu, int fd)
> +{
> +       struct bperf_leader_bpf *skel = evsel->leader_skel;
> +
> +       return bpf_map_update_elem(bpf_map__fd(skel->maps.events),
> +                                  &cpu, &fd, BPF_ANY);
> +}
> +
> +/*
> + * trigger the leader prog on each cpu, so the accum_reading map could get
> + * the latest readings.
> + */
> +static int bperf_sync_counters(struct evsel *evsel)
> +{
> +       int num_cpu, i, cpu;
> +
> +       num_cpu = all_cpu_map->nr;
> +       for (i = 0; i < num_cpu; i++) {
> +               cpu = all_cpu_map->map[i];
> +               bperf_trigger_reading(evsel->bperf_leader_prog_fd, cpu);
> +       }
> +       return 0;
> +}
> +
> +static int bperf__enable(struct evsel *evsel)
> +{
> +       evsel->follower_skel->bss->enabled = 1;
> +       return 0;
> +}
> +
> +static int bperf__read(struct evsel *evsel)
> +{
> +       struct bperf_follower_bpf *skel = evsel->follower_skel;
> +       __u32 num_cpu_bpf = cpu__max_cpu();
> +       struct bpf_perf_event_value values[num_cpu_bpf];
> +       int reading_map_fd, err = 0;
> +       __u32 i, j, num_cpu;
> +
> +       bperf_sync_counters(evsel);
> +       reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> +
> +       for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
> +               __u32 cpu;
> +
> +               err = bpf_map_lookup_elem(reading_map_fd, &i, values);
> +               if (err)
> +                       goto out;
> +               switch (evsel->follower_skel->bss->type) {
> +               case BPERF_FILTER_GLOBAL:
> +                       assert(i == 0);
> +
> +                       num_cpu = all_cpu_map->nr;
> +                       for (j = 0; j < num_cpu; j++) {
> +                               cpu = all_cpu_map->map[j];
> +                               perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
> +                               perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
> +                               perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;

I'm confused by this.  Does the accum_readings map contain values
for all cpus?  IIUC it has only a single entry, but you access it for each cpu.
What am I missing?

Thanks,
Namhyung


> +                       }
> +                       break;
> +               case BPERF_FILTER_CPU:
> +                       cpu = evsel->core.cpus->map[i];
> +                       perf_counts(evsel->counts, i, 0)->val = values[cpu].counter;
> +                       perf_counts(evsel->counts, i, 0)->ena = values[cpu].enabled;
> +                       perf_counts(evsel->counts, i, 0)->run = values[cpu].running;
> +                       break;
> +               case BPERF_FILTER_PID:
> +               case BPERF_FILTER_TGID:
> +                       perf_counts(evsel->counts, 0, i)->val = 0;
> +                       perf_counts(evsel->counts, 0, i)->ena = 0;
> +                       perf_counts(evsel->counts, 0, i)->run = 0;
> +
> +                       for (cpu = 0; cpu < num_cpu_bpf; cpu++) {
> +                               perf_counts(evsel->counts, 0, i)->val += values[cpu].counter;
> +                               perf_counts(evsel->counts, 0, i)->ena += values[cpu].enabled;
> +                               perf_counts(evsel->counts, 0, i)->run += values[cpu].running;
> +                       }
> +                       break;
> +               default:
> +                       break;
> +               }
> +       }
> +out:
> +       return err;
> +}
> +
> +static int bperf__destroy(struct evsel *evsel)
> +{
> +       bperf_follower_bpf__destroy(evsel->follower_skel);
> +       close(evsel->bperf_leader_prog_fd);
> +       close(evsel->bperf_leader_link_fd);
> +       return 0;
> +}


* Re: [PATCH v2 3/3] perf-test: add a test for perf-stat --bpf-counters option
  2021-03-16 21:18 ` [PATCH v2 3/3] perf-test: add a test for perf-stat --bpf-counters option Song Liu
@ 2021-03-18  6:07   ` Namhyung Kim
  2021-03-18  7:39     ` Song Liu
  0 siblings, 1 reply; 33+ messages in thread
From: Namhyung Kim @ 2021-03-18  6:07 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
>
> Add a test to compare the output of perf-stat with and without option
> --bpf-counters. If the difference is more than 10%, the test is considered
> as failed.
>
> For stable results between two runs (w/ and w/o --bpf-counters), the test
> program should: 1) be long enough for better signal-noise-ratio; 2) not
> depend on the behavior of IO subsystem (for less noise from caching). So
> far, the best option we found is stressapptest.
>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  tools/perf/tests/shell/stat_bpf_counters.sh | 34 +++++++++++++++++++++
>  1 file changed, 34 insertions(+)
>  create mode 100755 tools/perf/tests/shell/stat_bpf_counters.sh
>
> diff --git a/tools/perf/tests/shell/stat_bpf_counters.sh b/tools/perf/tests/shell/stat_bpf_counters.sh
> new file mode 100755
> index 0000000000000..c0bcb38d6b53c
> --- /dev/null
> +++ b/tools/perf/tests/shell/stat_bpf_counters.sh
> @@ -0,0 +1,34 @@
> +#!/bin/sh
> +# perf stat --bpf-counters test
> +# SPDX-License-Identifier: GPL-2.0
> +
> +set -e
> +
> +# check whether $2 is within +/- 10% of $1
> +compare_number()
> +{
> +       first_num=$1
> +       second_num=$2
> +
> +       # upper bound is first_num * 110%
> +       upper=$(( $first_num + $first_num / 10 ))
> +       # lower bound is first_num * 90%
> +       lower=$(( $first_num - $first_num / 10 ))
> +
> +       if [ $second_num -gt $upper ] || [ $second_num -lt $lower ]; then
> +               echo "The difference between $first_num and $second_num are greater than 10%."
> +               exit 1
> +       fi
> +}
> +
> +# skip if --bpf-counters is not supported
> +perf stat --bpf-counters true > /dev/null 2>&1 || exit 2
> +
> +# skip if stressapptest is not available
> +stressapptest -s 1 -M 100 -m 1 > /dev/null 2>&1 || exit 2

I don't know how popular it is, but we can print some info
in case we miss it.
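
Something along these lines could do it. This is only a sketch:
skip_if_missing is a hypothetical helper name, not something in the patch,
and it assumes POSIX `command -v` is good enough to probe for the tool.

```sh
# Hypothetical helper (not part of the patch): say why the test is
# skipped when a required tool is not installed.
skip_if_missing()
{
        tool=$1

        if ! command -v "$tool" > /dev/null 2>&1; then
                echo "Skipping: $tool not found"
                return 2
        fi
        return 0
}
```

In the test it would be called as `skip_if_missing stressapptest || exit 2`,
which keeps exit code 2 as the skip status already used by the script.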

> +
> +base_cycles=$(perf stat --no-big-num -e cycles -- stressapptest -s 3 -M 100 -m 1 2>&1 | grep -e cycles | awk '{print $1}')
> +bpf_cycles=$(perf stat --no-big-num --bpf-counters -e cycles -- stressapptest -s 3 -M 100 -m 1 2>&1 | grep -e cycles | awk '{print $1}')

I think just awk '/cycles/ {print $1}' should work.
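
This can be checked without running perf. The sketch below feeds awk an
invented sample that only resembles `perf stat --no-big-num` output (the
count is made up, not a measurement); extract_count and sample_output are
hypothetical names.

```sh
# Hypothetical wrapper: print the first field of every line that matches
# "cycles". One awk pattern-action replaces the "grep -e cycles | awk" pair.
extract_count()
{
        awk '/cycles/ {print $1}'
}

# Invented sample resembling perf stat output; not a real measurement.
sample_output()
{
        printf '%s\n' \
                " Performance counter stats:" \
                "" \
                "     1234567      cycles" \
                "" \
                "     0.001 seconds time elapsed"
}

sample_output | extract_count   # prints 1234567
```

Note that the pattern matches any line containing "cycles", so it is only
unambiguous when a single cycles event is requested, as in this test.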

Thanks,
Namhyung


> +
> +compare_number $base_cycles $bpf_cycles
> +exit 0
> --
> 2.30.2
>


* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-18  4:32       ` Namhyung Kim
@ 2021-03-18  7:03         ` Song Liu
  0 siblings, 0 replies; 33+ messages in thread
From: Song Liu @ 2021-03-18  7:03 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Kernel Team,
	Arnaldo Carvalho de Melo, Jiri Olsa



> On Mar 17, 2021, at 9:32 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> On Thu, Mar 18, 2021 at 12:52 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Mar 17, 2021, at 6:11 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>> 
>>> Em Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim escreveu:
>>>> Hi Song,
>>>> 
>>>> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
>>>>> 
>>>>> perf uses performance monitoring counters (PMCs) to monitor system
>>>>> performance. The PMCs are limited hardware resources. For example,
>>>>> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
>>>>> 
>>>>> Modern data center systems use these PMCs in many different ways:
>>>>> system level monitoring, (maybe nested) container level monitoring, per
>>>>> process monitoring, profiling (in sample mode), etc. In some cases,
>>>>> there are more active perf_events than available hardware PMCs. To allow
>>>>> all perf_events to have a chance to run, it is necessary to do expensive
>>>>> time multiplexing of events.
>>>>> 
>>>>> On the other hand, many monitoring tools count the common metrics (cycles,
>>>>> instructions). It is a waste to have multiple tools create multiple
>>>>> perf_events of "cycles" and occupy multiple PMCs.
>>>> 
>>>> Right, it'd be really helpful when the PMCs are frequently or mostly shared.
>>>> But it'd also increase the overhead for uncontended cases as BPF programs
>>>> need to run on every context switch.  Depending on the workload, it may
>>>> cause a non-negligible performance impact.  So users should be aware of it.
>>> 
>>> Would be interesting to, humm, measure both cases to have a firm number
>>> of the impact, how many instructions are added when sharing using
>>> --bpf-counters?
>>> 
>>> I.e. compare the "expensive time multiplexing of events" with its
>>> avoidance by using --bpf-counters.
>>> 
>>> Song, have you perfmormed such measurements?
>> 
>> I have got some measurements with perf-bench-sched-messaging:
>> 
>> The system: x86_64 with 23 cores (46 HT)
>> 
>> The perf-stat command:
>> perf stat -e cycles,cycles,instructions,instructions,ref-cycles,ref-cycles <target, etc.>
>> 
>> The benchmark command and output:
>> ./perf bench sched messaging -g 40 -l 50000 -t
>> # Running 'sched/messaging' benchmark:
>> # 20 sender and receiver threads per group
>> # 40 groups == 1600 threads run
>>     Total time: 10X.XXX [sec]
>> 
>> 
>> I use the "Total time" as measurement, so smaller number is better.
>> 
>> For each condition, I run the command 5 times, and took the median of
>> "Total time".
>> 
>> Baseline (no perf-stat)                 104.873 [sec]
>> # global
>> perf stat -a                            107.887 [sec]
>> perf stat -a --bpf-counters             106.071 [sec]
>> # per task
>> perf stat                               106.314 [sec]
>> perf stat --bpf-counters                105.965 [sec]
>> # per cpu
>> perf stat -C 1,3,5                      107.063 [sec]
>> perf stat -C 1,3,5 --bpf-counters       106.406 [sec]
>> 
>> From the data, --bpf-counters is slightly better than the regular event
>> for all targets. I noticed that the results are not very stable. There
>> are a couple 108.xx runs in some of the conditions (w/ and w/o
>> --bpf-counters).
> 
> Hmm.. so this result is when multiplexing happened, right?
> I wondered how/why the regular perf stat is slower..

I should have made this clearer. These numbers are for the case where
regular perf-stat has to time multiplex (2x ref-cycles on Intel). OTOH,
--bpf-counters enables sharing, so there is no time multiplexing. IOW,
this compares the overhead of BPF with the overhead of time multiplexing.
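
To put the medians quoted above in relative terms, here is a small sketch.
overhead_pct is a hypothetical helper, and the inputs are the baseline and
the two global (-a) numbers from the table.

```sh
# Hypothetical helper: slowdown of a run relative to the baseline, in
# percent. awk is used because POSIX sh arithmetic is integer only.
overhead_pct()
{
        awk -v base="$1" -v run="$2" \
                'BEGIN { printf "%.2f\n", (run - base) / base * 100 }'
}

overhead_pct 104.873 107.887    # perf stat -a (time multiplexing)
overhead_pct 104.873 106.071    # perf stat -a --bpf-counters
```

With these inputs the sketch gives about 2.87% for the time multiplexing
case and about 1.14% for --bpf-counters. Since some runs landed at 108.xx
in both setups, the gap should be read with that noise in mind.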

Thanks,
Song


* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-18  5:54   ` Namhyung Kim
@ 2021-03-18  7:22     ` Song Liu
  2021-03-18 13:49       ` Namhyung Kim
  0 siblings, 1 reply; 33+ messages in thread
From: Song Liu @ 2021-03-18  7:22 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa



> On Mar 17, 2021, at 10:54 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> 

[...]

>> +
>> +static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
>> +                                      struct perf_event_attr_map_entry *entry)
>> +{
>> +       struct bperf_leader_bpf *skel = bperf_leader_bpf__open();
>> +       int link_fd, diff_map_fd, err;
>> +       struct bpf_link *link = NULL;
>> +
>> +       if (!skel) {
>> +               pr_err("Failed to open leader skeleton\n");
>> +               return -1;
>> +       }
>> +
>> +       bpf_map__resize(skel->maps.events, libbpf_num_possible_cpus());
>> +       err = bperf_leader_bpf__load(skel);
>> +       if (err) {
>> +               pr_err("Failed to load leader skeleton\n");
>> +               goto out;
>> +       }
>> +
>> +       err = -1;
>> +       link = bpf_program__attach(skel->progs.on_switch);
>> +       if (!link) {
>> +               pr_err("Failed to attach leader program\n");
>> +               goto out;
>> +       }
>> +
>> +       link_fd = bpf_link__fd(link);
>> +       diff_map_fd = bpf_map__fd(skel->maps.diff_readings);
>> +       entry->link_id = bpf_link_get_id(link_fd);
>> +       entry->diff_map_id = bpf_map_get_id(diff_map_fd);
>> +       err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, entry, BPF_ANY);
>> +       assert(err == 0);
>> +
>> +       evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry->link_id);
>> +       assert(evsel->bperf_leader_link_fd >= 0);
> 
> Isn't it the same as link_fd?

This is a different fd on the same link. 

> 
>> +
>> +       /*
>> +        * save leader_skel for install_pe, which is called within
>> +        * following evsel__open_per_cpu call
>> +        */
>> +       evsel->leader_skel = skel;
>> +       evsel__open_per_cpu(evsel, all_cpu_map, -1);
>> +
>> +out:
>> +       bperf_leader_bpf__destroy(skel);
>> +       bpf_link__destroy(link);
> 
> Why do we destroy it?  Is it because we get an another reference?

Yes. We only need evsel->bperf_leader_link_fd to keep the whole 
skeleton attached. 

When multiple perf-stat sessions are sharing the leader skeleton, 
only the first one loads the leader skeleton, by calling 
bperf_reload_leader_program(). Other sessions simply hold a fd to 
the bpf_link. More explanation in bperf__load() below.  
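
As a loose analogy only (plain files rather than BPF objects, and nothing
taken from the patch), the same lifetime rule can be seen with two
independent fds on one file: the object stays alive as long as any fd still
refers to it. survives_close is a hypothetical name.

```sh
# Analogy, not bperf code: a kernel object stays alive while any fd
# still refers to it, even after the fd that created it is closed.
survives_close()
{
        f=$(mktemp) || return 1
        echo "still here" > "$f"

        exec 3< "$f"    # first fd, playing the role of the skeleton's link fd
        exec 4< "$f"    # a different fd on the same object
        rm -f "$f"      # drop the name
        exec 3<&-       # close the first fd, roughly what bpf_link__destroy() does

        read -r line <&4        # the object is still readable through fd 4
        exec 4<&-
        echo "$line"
}

survives_close  # prints "still here"
```

In bperf terms, fd 4 corresponds to evsel->bperf_leader_link_fd: it is the
reference that keeps the leader program attached after the skeleton is gone.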


> 
>> +       return err;
>> +}
>> +
>> +static int bperf__load(struct evsel *evsel, struct target *target)
>> +{
>> +       struct perf_event_attr_map_entry entry = {0xffffffff, 0xffffffff};
>> +       int attr_map_fd, diff_map_fd = -1, err;
>> +       enum bperf_filter_type filter_type;
>> +       __u32 filter_entry_cnt, i;
>> +
>> +       if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
>> +               return -1;
>> +
>> +       if (!all_cpu_map) {
>> +               all_cpu_map = perf_cpu_map__new(NULL);
>> +               if (!all_cpu_map)
>> +                       return -1;
>> +       }
>> +
>> +       evsel->bperf_leader_prog_fd = -1;
>> +       evsel->bperf_leader_link_fd = -1;
>> +
>> +       /*
>> +        * Step 1: hold a fd on the leader program and the bpf_link, if
>> +        * the program is not already gone, reload the program.
>> +        * Use flock() to ensure exclusive access to the perf_event_attr
>> +        * map.
>> +        */
>> +       attr_map_fd = bperf_lock_attr_map(target);
>> +       if (attr_map_fd < 0) {
>> +               pr_err("Failed to lock perf_event_attr map\n");
>> +               return -1;
>> +       }
>> +
>> +       err = bpf_map_lookup_elem(attr_map_fd, &evsel->core.attr, &entry);
>> +       if (err) {
>> +               err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, &entry, BPF_ANY);
>> +               if (err)
>> +                       goto out;
>> +       }
>> +
>> +       evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry.link_id);
>> +       if (evsel->bperf_leader_link_fd < 0 &&
>> +           bperf_reload_leader_program(evsel, attr_map_fd, &entry))
>> +               goto out;

Continuing the previous explanation: in bperf_reload_leader_program(),
we open another reference to the link and destroy the skeleton. This
brings the code to the same state as when the
evsel->bperf_leader_link_fd >= 0 condition above holds.

>> +
>> +       /*
>> +        * The bpf_link holds reference to the leader program, and the
>> +        * leader program holds reference to the maps. Therefore, if
>> +        * link_id is valid, diff_map_id should also be valid.
>> +        */
>> +       evsel->bperf_leader_prog_fd = bpf_prog_get_fd_by_id(
>> +               bpf_link_get_prog_id(evsel->bperf_leader_link_fd));
>> +       assert(evsel->bperf_leader_prog_fd >= 0);
>> +
>> +       diff_map_fd = bpf_map_get_fd_by_id(entry.diff_map_id);
>> +       assert(diff_map_fd >= 0);
>> +

[...]

>> +static int bperf__read(struct evsel *evsel)
>> +{
>> +       struct bperf_follower_bpf *skel = evsel->follower_skel;
>> +       __u32 num_cpu_bpf = cpu__max_cpu();
>> +       struct bpf_perf_event_value values[num_cpu_bpf];
>> +       int reading_map_fd, err = 0;
>> +       __u32 i, j, num_cpu;
>> +
>> +       bperf_sync_counters(evsel);
>> +       reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>> +
>> +       for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
>> +               __u32 cpu;
>> +
>> +               err = bpf_map_lookup_elem(reading_map_fd, &i, values);
>> +               if (err)
>> +                       goto out;
>> +               switch (evsel->follower_skel->bss->type) {
>> +               case BPERF_FILTER_GLOBAL:
>> +                       assert(i == 0);
>> +
>> +                       num_cpu = all_cpu_map->nr;
>> +                       for (j = 0; j < num_cpu; j++) {
>> +                               cpu = all_cpu_map->map[j];
>> +                               perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
>> +                               perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
>> +                               perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
> 
> I'm confused with this.  Does the accum_readings map contain values
> for all cpus?  IIUC it has only a single entry but you access it for each cpu.
> What am I missing?

accum_readings is a percpu array. In this case, each cpu has its own
bpf_perf_event_value at index 0. The BPF program can only access the
data on the current cpu. When reading from user space, we get #-of-cpus
entries for index 0.

Does this make sense?

Thanks,
Song



* Re: [PATCH v2 3/3] perf-test: add a test for perf-stat --bpf-counters option
  2021-03-18  6:07   ` Namhyung Kim
@ 2021-03-18  7:39     ` Song Liu
  0 siblings, 0 replies; 33+ messages in thread
From: Song Liu @ 2021-03-18  7:39 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa



> On Mar 17, 2021, at 11:07 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
>> 
>> Add a test to compare the output of perf-stat with and without option
>> --bpf-counters. If the difference is more than 10%, the test is considered
>> as failed.
>> 
>> For stable results between two runs (w/ and w/o --bpf-counters), the test
>> program should: 1) be long enough for better signal-noise-ratio; 2) not
>> depend on the behavior of IO subsystem (for less noise from caching). So
>> far, the best option we found is stressapptest.
>> 
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> tools/perf/tests/shell/stat_bpf_counters.sh | 34 +++++++++++++++++++++
>> 1 file changed, 34 insertions(+)
>> create mode 100755 tools/perf/tests/shell/stat_bpf_counters.sh
>> 
>> diff --git a/tools/perf/tests/shell/stat_bpf_counters.sh b/tools/perf/tests/shell/stat_bpf_counters.sh
>> new file mode 100755
>> index 0000000000000..c0bcb38d6b53c
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/stat_bpf_counters.sh
>> @@ -0,0 +1,34 @@
>> +#!/bin/sh
>> +# perf stat --bpf-counters test
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +set -e
>> +
>> +# check whether $2 is within +/- 10% of $1
>> +compare_number()
>> +{
>> +       first_num=$1
>> +       second_num=$2
>> +
>> +       # upper bound is first_num * 110%
>> +       upper=$(( $first_num + $first_num / 10 ))
>> +       # lower bound is first_num * 90%
>> +       lower=$(( $first_num - $first_num / 10 ))
>> +
>> +       if [ $second_num -gt $upper ] || [ $second_num -lt $lower ]; then
>> +               echo "The difference between $first_num and $second_num are greater than 10%."
>> +               exit 1
>> +       fi
>> +}
>> +
>> +# skip if --bpf-counters is not supported
>> +perf stat --bpf-counters true > /dev/null 2>&1 || exit 2
>> +
>> +# skip if stressapptest is not available
>> +stressapptest -s 1 -M 100 -m 1 > /dev/null 2>&1 || exit 2
> 
> I don't know how popular it is, but we can print some info
> in case we miss it.

I just realized that perf-bench-sched-messaging is a good test to use, 
so we don't need stressapptest. Attached the updated version below.

> 
>> +
>> +base_cycles=$(perf stat --no-big-num -e cycles -- stressapptest -s 3 -M 100 -m 1 2>&1 | grep -e cycles | awk '{print $1}')
>> +bpf_cycles=$(perf stat --no-big-num --bpf-counters -e cycles -- stressapptest -s 3 -M 100 -m 1 2>&1 | grep -e cycles | awk '{print $1}')
> 
> I think just awk '/cycles/ {print $1}' should work.

Thanks! Fixed in the new version. 

Song




cat tools/perf/tests/shell/stat_bpf_counters.sh
#!/bin/sh
# perf stat --bpf-counters test
# SPDX-License-Identifier: GPL-2.0

set -e

# check whether $2 is within +/- 10% of $1
compare_number()
{
        first_num=$1
        second_num=$2

        # upper bound is first_num * 110%
        upper=$(( $first_num + $first_num / 10 ))
        # lower bound is first_num * 90%
        lower=$(( $first_num - $first_num / 10 ))

        if [ $second_num -gt $upper ] || [ $second_num -lt $lower ]; then
                echo "The difference between $first_num and $second_num is greater than 10%."
                exit 1
        fi
}

# skip if --bpf-counters is not supported
perf stat --bpf-counters true > /dev/null 2>&1 || exit 2

base_cycles=$(perf stat --no-big-num -e cycles -- perf bench sched messaging -g 1 -l 100 -t 2>&1 | awk '/cycles/ {print $1}')
bpf_cycles=$(perf stat --no-big-num --bpf-counters -e cycles -- perf bench sched messaging -g 1 -l 100 -t 2>&1 | awk '/cycles/ {print $1}')

compare_number $base_cycles $bpf_cycles
exit 0
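
For what it is worth, the bounds logic can be tried without running perf.
The sketch below restates the same integer arithmetic in a hypothetical
within_10pct function that returns a status instead of calling exit; it is
not part of the test itself.

```sh
# Hypothetical variant of compare_number: succeed when $2 is within
# +/- 10% of $1. The division is integer, so first_num / 10 is truncated.
within_10pct()
{
        first_num=$1
        second_num=$2

        upper=$(( $first_num + $first_num / 10 ))
        lower=$(( $first_num - $first_num / 10 ))

        [ "$second_num" -le "$upper" ] && [ "$second_num" -ge "$lower" ]
}

within_10pct 1000 1100 && echo "1100: pass"
within_10pct 1000 1101 || echo "1101: fail"
```

Both bounds are inclusive, matching the -gt/-lt checks in compare_number,
so 1100 passes against a base of 1000 while 1101 does not.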







* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-18  7:22     ` Song Liu
@ 2021-03-18 13:49       ` Namhyung Kim
  2021-03-18 17:16         ` Song Liu
  0 siblings, 1 reply; 33+ messages in thread
From: Namhyung Kim @ 2021-03-18 13:49 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Thu, Mar 18, 2021 at 4:22 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Mar 17, 2021, at 10:54 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> >
>
> [...]
>
> >> +
> >> +static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
> >> +                                      struct perf_event_attr_map_entry *entry)
> >> +{
> >> +       struct bperf_leader_bpf *skel = bperf_leader_bpf__open();
> >> +       int link_fd, diff_map_fd, err;
> >> +       struct bpf_link *link = NULL;
> >> +
> >> +       if (!skel) {
> >> +               pr_err("Failed to open leader skeleton\n");
> >> +               return -1;
> >> +       }
> >> +
> >> +       bpf_map__resize(skel->maps.events, libbpf_num_possible_cpus());
> >> +       err = bperf_leader_bpf__load(skel);
> >> +       if (err) {
> >> +               pr_err("Failed to load leader skeleton\n");
> >> +               goto out;
> >> +       }
> >> +
> >> +       err = -1;
> >> +       link = bpf_program__attach(skel->progs.on_switch);
> >> +       if (!link) {
> >> +               pr_err("Failed to attach leader program\n");
> >> +               goto out;
> >> +       }
> >> +
> >> +       link_fd = bpf_link__fd(link);
> >> +       diff_map_fd = bpf_map__fd(skel->maps.diff_readings);
> >> +       entry->link_id = bpf_link_get_id(link_fd);
> >> +       entry->diff_map_id = bpf_map_get_id(diff_map_fd);
> >> +       err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, entry, BPF_ANY);
> >> +       assert(err == 0);
> >> +
> >> +       evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry->link_id);
> >> +       assert(evsel->bperf_leader_link_fd >= 0);
> >
> > Isn't it the same as link_fd?
>
> This is a different fd on the same link.

Ok

>
> >
> >> +
> >> +       /*
> >> +        * save leader_skel for install_pe, which is called within
> >> +        * following evsel__open_per_cpu call
> >> +        */
> >> +       evsel->leader_skel = skel;
> >> +       evsel__open_per_cpu(evsel, all_cpu_map, -1);
> >> +
> >> +out:
> >> +       bperf_leader_bpf__destroy(skel);
> >> +       bpf_link__destroy(link);
> >
> > Why do we destroy it?  Is it because we get an another reference?
>
> Yes. We only need evsel->bperf_leader_link_fd to keep the whole
> skeleton attached.
>
> When multiple perf-stat sessions are sharing the leader skeleton,
> only the first one loads the leader skeleton, by calling
> bperf_reload_leader_program(). Other sessions simply hold a fd to
> the bpf_link. More explanation in bperf__load() below.

Ok.

>
>
> >
> >> +       return err;
> >> +}
> >> +
> >> +static int bperf__load(struct evsel *evsel, struct target *target)
> >> +{
> >> +       struct perf_event_attr_map_entry entry = {0xffffffff, 0xffffffff};
> >> +       int attr_map_fd, diff_map_fd = -1, err;
> >> +       enum bperf_filter_type filter_type;
> >> +       __u32 filter_entry_cnt, i;
> >> +
> >> +       if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
> >> +               return -1;
> >> +
> >> +       if (!all_cpu_map) {
> >> +               all_cpu_map = perf_cpu_map__new(NULL);
> >> +               if (!all_cpu_map)
> >> +                       return -1;
> >> +       }
> >> +
> >> +       evsel->bperf_leader_prog_fd = -1;
> >> +       evsel->bperf_leader_link_fd = -1;
> >> +
> >> +       /*
> >> +        * Step 1: hold a fd on the leader program and the bpf_link, if
> >> +        * the program is not already gone, reload the program.
> >> +        * Use flock() to ensure exclusive access to the perf_event_attr
> >> +        * map.
> >> +        */
> >> +       attr_map_fd = bperf_lock_attr_map(target);
> >> +       if (attr_map_fd < 0) {
> >> +               pr_err("Failed to lock perf_event_attr map\n");
> >> +               return -1;
> >> +       }
> >> +
> >> +       err = bpf_map_lookup_elem(attr_map_fd, &evsel->core.attr, &entry);
> >> +       if (err) {
> >> +               err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, &entry, BPF_ANY);
> >> +               if (err)
> >> +                       goto out;
> >> +       }
> >> +
> >> +       evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry.link_id);
> >> +       if (evsel->bperf_leader_link_fd < 0 &&
> >> +           bperf_reload_leader_program(evsel, attr_map_fd, &entry))
> >> +               goto out;
>
> Continue with previous explanation. In bperf_reload_leader_program(),
> we open another reference to the link, and destroy the skeleton. This
> brings the code to the same state as evsel->bperf_leader_link_fd >=
> condition above.

Thanks for the explanation.

>
> >> +
> >> +       /*
> >> +        * The bpf_link holds reference to the leader program, and the
> >> +        * leader program holds reference to the maps. Therefore, if
> >> +        * link_id is valid, diff_map_id should also be valid.
> >> +        */
> >> +       evsel->bperf_leader_prog_fd = bpf_prog_get_fd_by_id(
> >> +               bpf_link_get_prog_id(evsel->bperf_leader_link_fd));
> >> +       assert(evsel->bperf_leader_prog_fd >= 0);
> >> +
> >> +       diff_map_fd = bpf_map_get_fd_by_id(entry.diff_map_id);
> >> +       assert(diff_map_fd >= 0);
> >> +
>
> [...]
>
> >> +static int bperf__read(struct evsel *evsel)
> >> +{
> >> +       struct bperf_follower_bpf *skel = evsel->follower_skel;
> >> +       __u32 num_cpu_bpf = cpu__max_cpu();
> >> +       struct bpf_perf_event_value values[num_cpu_bpf];
> >> +       int reading_map_fd, err = 0;
> >> +       __u32 i, j, num_cpu;
> >> +
> >> +       bperf_sync_counters(evsel);
> >> +       reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> >> +
> >> +       for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
> >> +               __u32 cpu;
> >> +
> >> +               err = bpf_map_lookup_elem(reading_map_fd, &i, values);
> >> +               if (err)
> >> +                       goto out;
> >> +               switch (evsel->follower_skel->bss->type) {
> >> +               case BPERF_FILTER_GLOBAL:
> >> +                       assert(i == 0);
> >> +
> >> +                       num_cpu = all_cpu_map->nr;
> >> +                       for (j = 0; j < num_cpu; j++) {
> >> +                               cpu = all_cpu_map->map[j];
> >> +                               perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
> >> +                               perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
> >> +                               perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
> >
> > I'm confused with this.  Does the accum_readings map contain values
> > for all cpus?  IIUC it has only a single entry but you access it for each cpu.
> > What am I missing?
>
> accumulated_reading is a percpu array. In this case, each cpu has its own
> bpf_perf_event_value with index 0. The BPF program could only access the
> data on current cpu. When reading from use space, we get #-of-cpus entries
> for index 0.
>
> Does this make sense?

Yep, I didn't know it returns all values when reading from user space.  Then
I think the per-cpu event case doesn't need many entries either.  Like the
global case, it can simply put the value at key 0, no?

Thanks,
Namhyung


* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-18 13:49       ` Namhyung Kim
@ 2021-03-18 17:16         ` Song Liu
  0 siblings, 0 replies; 33+ messages in thread
From: Song Liu @ 2021-03-18 17:16 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa



> On Mar 18, 2021, at 6:49 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> On Thu, Mar 18, 2021 at 4:22 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Mar 17, 2021, at 10:54 PM, Namhyung Kim <namhyung@kernel.org> wrote:
>>> 
>> 
>> [...]
>> 
>>>> +
>>>> +static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
>>>> +                                      struct perf_event_attr_map_entry *entry)
>>>> +{
>>>> +       struct bperf_leader_bpf *skel = bperf_leader_bpf__open();
>>>> +       int link_fd, diff_map_fd, err;
>>>> +       struct bpf_link *link = NULL;
>>>> +
>>>> +       if (!skel) {
>>>> +               pr_err("Failed to open leader skeleton\n");
>>>> +               return -1;
>>>> +       }
>>>> +
>>>> +       bpf_map__resize(skel->maps.events, libbpf_num_possible_cpus());
>>>> +       err = bperf_leader_bpf__load(skel);
>>>> +       if (err) {
>>>> +               pr_err("Failed to load leader skeleton\n");
>>>> +               goto out;
>>>> +       }
>>>> +
>>>> +       err = -1;
>>>> +       link = bpf_program__attach(skel->progs.on_switch);
>>>> +       if (!link) {
>>>> +               pr_err("Failed to attach leader program\n");
>>>> +               goto out;
>>>> +       }
>>>> +
>>>> +       link_fd = bpf_link__fd(link);
>>>> +       diff_map_fd = bpf_map__fd(skel->maps.diff_readings);
>>>> +       entry->link_id = bpf_link_get_id(link_fd);
>>>> +       entry->diff_map_id = bpf_map_get_id(diff_map_fd);
>>>> +       err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, entry, BPF_ANY);
>>>> +       assert(err == 0);
>>>> +
>>>> +       evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry->link_id);
>>>> +       assert(evsel->bperf_leader_link_fd >= 0);
>>> 
>>> Isn't it the same as link_fd?
>> 
>> This is a different fd on the same link.
> 
> Ok
> 
>> 
>>> 
>>>> +
>>>> +       /*
>>>> +        * save leader_skel for install_pe, which is called within
>>>> +        * following evsel__open_per_cpu call
>>>> +        */
>>>> +       evsel->leader_skel = skel;
>>>> +       evsel__open_per_cpu(evsel, all_cpu_map, -1);
>>>> +
>>>> +out:
>>>> +       bperf_leader_bpf__destroy(skel);
>>>> +       bpf_link__destroy(link);
>>> 
>>> Why do we destroy it?  Is it because we get another reference?
>> 
>> Yes. We only need evsel->bperf_leader_link_fd to keep the whole
>> skeleton attached.
>> 
>> When multiple perf-stat sessions are sharing the leader skeleton,
>> only the first one loads the leader skeleton, by calling
>> bperf_reload_leader_program(). Other sessions simply hold a fd to
>> the bpf_link. More explanation in bperf__load() below.
> 
> Ok.
> 
>> 
>> 
>>> 
>>>> +       return err;
>>>> +}
>>>> +
>>>> +static int bperf__load(struct evsel *evsel, struct target *target)
>>>> +{
>>>> +       struct perf_event_attr_map_entry entry = {0xffffffff, 0xffffffff};
>>>> +       int attr_map_fd, diff_map_fd = -1, err;
>>>> +       enum bperf_filter_type filter_type;
>>>> +       __u32 filter_entry_cnt, i;
>>>> +
>>>> +       if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
>>>> +               return -1;
>>>> +
>>>> +       if (!all_cpu_map) {
>>>> +               all_cpu_map = perf_cpu_map__new(NULL);
>>>> +               if (!all_cpu_map)
>>>> +                       return -1;
>>>> +       }
>>>> +
>>>> +       evsel->bperf_leader_prog_fd = -1;
>>>> +       evsel->bperf_leader_link_fd = -1;
>>>> +
>>>> +       /*
>>>> +        * Step 1: hold a fd on the leader program and the bpf_link. If
>>>> +        * the program is already gone, reload the program.
>>>> +        * Use flock() to ensure exclusive access to the perf_event_attr
>>>> +        * map.
>>>> +        */
>>>> +       attr_map_fd = bperf_lock_attr_map(target);
>>>> +       if (attr_map_fd < 0) {
>>>> +               pr_err("Failed to lock perf_event_attr map\n");
>>>> +               return -1;
>>>> +       }
>>>> +
>>>> +       err = bpf_map_lookup_elem(attr_map_fd, &evsel->core.attr, &entry);
>>>> +       if (err) {
>>>> +               err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, &entry, BPF_ANY);
>>>> +               if (err)
>>>> +                       goto out;
>>>> +       }
>>>> +
>>>> +       evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry.link_id);
>>>> +       if (evsel->bperf_leader_link_fd < 0 &&
>>>> +           bperf_reload_leader_program(evsel, attr_map_fd, &entry))
>>>> +               goto out;
>> 
>> Continuing the previous explanation: in bperf_reload_leader_program(),
>> we open another reference to the link, and destroy the skeleton. This
>> brings the code to the same state as the
>> evsel->bperf_leader_link_fd >= 0 condition above.
> 
> Thanks for the explanation.
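
The reference handling described above can be pictured with a small
user-space model in plain C. This is only a sketch under assumptions: it
does not call into libbpf or the kernel, and struct model_link and all
model_* functions are hypothetical names made up for the illustration.
It mirrors just one property: the link stays attached as long as at
least one session holds an fd on it, and only the first session has to
load the leader.

```c
#include <assert.h>

/* model of a kernel object that lives as long as someone holds an fd */
struct model_link {
	int refs;	/* number of open fds on the link */
	int attached;	/* program stays attached while refs > 0 */
};

static void model_put(struct model_link *l)
{
	assert(l->refs > 0);
	if (--l->refs == 0)
		l->attached = 0;	/* last fd closed: link goes away */
}

/* like getting an fd by id: fails if the link no longer exists */
static int model_get(struct model_link *l)
{
	if (!l->attached)
		return -1;
	l->refs++;
	return 0;
}

/* returns 1 if this session had to (re)load the leader, 0 if it reused it */
static int model_session_open(struct model_link *l)
{
	if (model_get(l) == 0)
		return 0;

	/* reload path: attaching gives the skeleton its own reference */
	l->attached = 1;
	l->refs = 1;
	/* take the reference that the session keeps */
	model_get(l);
	/* destroy the skeleton; only the session's reference is left */
	model_put(l);
	return 1;
}

static void model_session_close(struct model_link *l)
{
	model_put(l);
}
```

In this model both paths (reuse and reload) leave the session holding
exactly one reference, which is the "same state" mentioned above.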
> 
>> 
>>>> +
>>>> +       /*
>>>> +        * The bpf_link holds reference to the leader program, and the
>>>> +        * leader program holds reference to the maps. Therefore, if
>>>> +        * link_id is valid, diff_map_id should also be valid.
>>>> +        */
>>>> +       evsel->bperf_leader_prog_fd = bpf_prog_get_fd_by_id(
>>>> +               bpf_link_get_prog_id(evsel->bperf_leader_link_fd));
>>>> +       assert(evsel->bperf_leader_prog_fd >= 0);
>>>> +
>>>> +       diff_map_fd = bpf_map_get_fd_by_id(entry.diff_map_id);
>>>> +       assert(diff_map_fd >= 0);
>>>> +
>> 
>> [...]
>> 
>>>> +static int bperf__read(struct evsel *evsel)
>>>> +{
>>>> +       struct bperf_follower_bpf *skel = evsel->follower_skel;
>>>> +       __u32 num_cpu_bpf = cpu__max_cpu();
>>>> +       struct bpf_perf_event_value values[num_cpu_bpf];
>>>> +       int reading_map_fd, err = 0;
>>>> +       __u32 i, j, num_cpu;
>>>> +
>>>> +       bperf_sync_counters(evsel);
>>>> +       reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>>>> +
>>>> +       for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
>>>> +               __u32 cpu;
>>>> +
>>>> +               err = bpf_map_lookup_elem(reading_map_fd, &i, values);
>>>> +               if (err)
>>>> +                       goto out;
>>>> +               switch (evsel->follower_skel->bss->type) {
>>>> +               case BPERF_FILTER_GLOBAL:
>>>> +                       assert(i == 0);
>>>> +
>>>> +                       num_cpu = all_cpu_map->nr;
>>>> +                       for (j = 0; j < num_cpu; j++) {
>>>> +                               cpu = all_cpu_map->map[j];
>>>> +                               perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
>>>> +                               perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
>>>> +                               perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
>>> 
>>> I'm confused with this.  Does the accum_readings map contain values
>>> for all cpus?  IIUC it has only a single entry but you access it for each cpu.
>>> What am I missing?
>> 
>> accum_readings is a percpu array. In this case, each cpu has its own
>> bpf_perf_event_value at index 0. The BPF program can only access the
>> data of the current cpu. When reading from user space, we get #-of-cpus
>> entries for index 0.
>> 
>> Does this make sense?
> 
> Yep, I didn't know a lookup from user space returns the values of all cpus.
> Then I think per-cpu events don't need many entries either.  Like the global
> case, they can simply store the value under key 0, no?

Currently, per-cpu events use the same logic as per-task events, so we do have
multiple entries. I think it is possible to modify the logic to use one entry
for per-cpu events.
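
To illustrate the per-cpu lookup behaviour discussed above, here is a
minimal user-space model in plain C. It is a sketch under assumptions and
does not use libbpf: NR_CPUS, MAX_ENTRIES, struct reading and the model_*
functions are hypothetical names for the illustration. It mirrors only
the two properties mentioned in this thread: the BPF side updates the
slot of the current cpu, and a user-space lookup of one key returns one
value per cpu.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS     4	/* assumption: fixed cpu count for the model */
#define MAX_ENTRIES 8	/* assumption: upper bound on map entries */

/* same three fields as struct bpf_perf_event_value */
struct reading {
	uint64_t counter;
	uint64_t enabled;
	uint64_t running;
};

/* model of a per-cpu array map: every key owns one slot per cpu */
static struct reading percpu_map[MAX_ENTRIES][NR_CPUS];

/* "BPF side": only the slot of the cpu the program runs on is touched */
static void model_bpf_add(uint32_t key, uint32_t this_cpu, uint64_t delta)
{
	assert(key < MAX_ENTRIES && this_cpu < NR_CPUS);
	percpu_map[key][this_cpu].counter += delta;
}

/* "user side": one lookup copies the slots of all cpus for that key */
static int model_user_lookup(uint32_t key, struct reading values[NR_CPUS])
{
	if (key >= MAX_ENTRIES)
		return -1;
	memcpy(values, percpu_map[key], sizeof(percpu_map[key]));
	return 0;
}

/* per-task style aggregation: add up the values of all cpus */
static uint64_t model_sum_counter(const struct reading values[NR_CPUS])
{
	uint64_t sum = 0;
	uint32_t cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += values[cpu].counter;
	return sum;
}
```

In the global case the caller would copy values[cpu] into the per-cpu
counts one by one; in the per-task case it would use the sum.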

Thanks,
Song




* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-18  3:52     ` Song Liu
  2021-03-18  4:32       ` Namhyung Kim
@ 2021-03-18 21:14       ` Jiri Olsa
  2021-03-19  0:09         ` Arnaldo
  1 sibling, 1 reply; 33+ messages in thread
From: Jiri Olsa @ 2021-03-18 21:14 UTC (permalink / raw)
  To: Song Liu
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, linux-kernel,
	Kernel Team, Arnaldo Carvalho de Melo, Jiri Olsa

On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
> 
> 
> > On Mar 17, 2021, at 6:11 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > 
> > Em Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim escreveu:
> >> Hi Song,
> >> 
> >> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com> wrote:
> >>> 
> >>> perf uses performance monitoring counters (PMCs) to monitor system
> >>> performance. The PMCs are limited hardware resources. For example,
> >>> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
> >>> 
> >>> Modern data center systems use these PMCs in many different ways:
> >>> system level monitoring, (maybe nested) container level monitoring, per
> >>> process monitoring, profiling (in sample mode), etc. In some cases,
> >>> there are more active perf_events than available hardware PMCs. To allow
> >>> all perf_events to have a chance to run, it is necessary to do expensive
> >>> time multiplexing of events.
> >>> 
> >>> On the other hand, many monitoring tools count the common metrics (cycles,
> >>> instructions). It is a waste to have multiple tools create multiple
> >>> perf_events of "cycles" and occupy multiple PMCs.
> >> 
> >> Right, it'd be really helpful when the PMCs are frequently or mostly shared.
> >> But it'd also increase the overhead for uncontended cases as BPF programs
> >> need to run on every context switch.  Depending on the workload, it may
> >> cause a non-negligible performance impact.  So users should be aware of it.
> > 
> > Would be interesting to, humm, measure both cases to have a firm number
> > of the impact, how many instructions are added when sharing using
> > --bpf-counters?
> > 
> > I.e. compare the "expensive time multiplexing of events" with its
> > avoidance by using --bpf-counters.
> > 
> > Song, have you performed such measurements?
> 
> I have got some measurements with perf-bench-sched-messaging:
> 
> The system: x86_64 with 23 cores (46 HT)
> 
> The perf-stat command:
> perf stat -e cycles,cycles,instructions,instructions,ref-cycles,ref-cycles <target, etc.>
> 
> The benchmark command and output:
> ./perf bench sched messaging -g 40 -l 50000 -t
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver threads per group
> # 40 groups == 1600 threads run
>      Total time: 10X.XXX [sec]
> 
> 
> I use "Total time" as the measurement, so a smaller number is better.
> 
> For each condition, I ran the command 5 times and took the median of
> "Total time".
> 
> Baseline (no perf-stat)              104.873 [sec]
> # global
> perf stat -a                         107.887 [sec]
> perf stat -a --bpf-counters          106.071 [sec]
> # per task
> perf stat                            106.314 [sec]
> perf stat --bpf-counters             105.965 [sec]
> # per cpu
> perf stat -C 1,3,5                   107.063 [sec]
> perf stat -C 1,3,5 --bpf-counters    106.406 [sec]

I can't see why it's actually faster than normal perf ;-)
It would be worth finding out.
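
For scale, the medians quoted above can be expressed as relative overhead
over the baseline. A tiny sketch of that arithmetic (overhead_pct is a
hypothetical helper, not part of perf):

```c
#include <assert.h>

/* relative slowdown of a measured run against the baseline, in percent */
static double overhead_pct(double baseline_sec, double measured_sec)
{
	return (measured_sec - baseline_sec) / baseline_sec * 100.0;
}
```

With the numbers from the table, overhead_pct(104.873, 107.887) is a bit
under 3 percent for "perf stat -a", and overhead_pct(104.873, 106.071) is
a bit over 1 percent for "perf stat -a --bpf-counters". Given the run to
run variation mentioned below, these differences should be read with
care.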

jirka

> 
> From the data, --bpf-counters is slightly better than the regular event
> for all targets. I noticed that the results are not very stable. There 
> are a couple 108.xx runs in some of the conditions (w/ and w/o 
> --bpf-counters).
> 
> 
> I also measured the average runtime of the BPF programs, with 
> 
> 	sysctl kernel.bpf_stats_enabled=1
> 
> For each event, if we have one leader and two followers, the total run
> time is about 340ns. IOW, 340ns for two perf-stat sessions reading
> instructions, 340ns for two perf-stat sessions reading cycles, etc.
> 
> Thanks,
> Song
> 



* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-16 21:18 ` [PATCH v2 1/3] perf-stat: introduce bperf, " Song Liu
  2021-03-18  5:54   ` Namhyung Kim
@ 2021-03-18 21:15   ` Jiri Olsa
  2021-03-19 18:41     ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 33+ messages in thread
From: Jiri Olsa @ 2021-03-18 21:15 UTC (permalink / raw)
  To: Song Liu; +Cc: linux-kernel, kernel-team, acme, acme, namhyung, jolsa

On Tue, Mar 16, 2021 at 02:18:35PM -0700, Song Liu wrote:
> perf uses performance monitoring counters (PMCs) to monitor system
> performance. The PMCs are limited hardware resources. For example,
> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
> 
> Modern data center systems use these PMCs in many different ways:
> system level monitoring, (maybe nested) container level monitoring, per
> process monitoring, profiling (in sample mode), etc. In some cases,
> there are more active perf_events than available hardware PMCs. To allow
> all perf_events to have a chance to run, it is necessary to do expensive
> time multiplexing of events.
> 
> On the other hand, many monitoring tools count the common metrics (cycles,
> instructions). It is a waste to have multiple tools create multiple
> perf_events of "cycles" and occupy multiple PMCs.
> 
> bperf tries to reduce such wastes by allowing multiple perf_events of
> "cycles" or "instructions" (at different scopes) to share PMUs. Instead
> of having each perf-stat session to read its own perf_events, bperf uses
> BPF programs to read the perf_events and aggregate readings to BPF maps.
> Then, the perf-stat session(s) reads the values from these BPF maps.
> 
> Please refer to the comment before the definition of bperf_ops for the
> description of bperf architecture.
> 
> bperf is off by default. To enable it, pass --bpf-counters option to
> perf-stat. bperf uses a BPF hashmap to share information about BPF
> programs and maps used by bperf. This map is pinned to bpffs. The default
> path is /sys/fs/bpf/perf_attr_map. The user could change the path with
> option --bpf-attr-map.
> 
> Signed-off-by: Song Liu <songliubraving@fb.com>

Reviewed-by: Jiri Olsa <jolsa@redhat.com>

thanks,
jirka
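
As a reading aid for the commit message above, the sharing idea can be
modelled in a few lines of plain C. This is a sketch under assumptions,
not the code of the patch: it models a single cpu, all struct and
function names are hypothetical, and attributing each diff to the task
being switched out is my assumption about how the follower filters. The
point it shows is that one counter read by the leader serves every
follower.

```c
#include <assert.h>
#include <stdint.h>

/* shared part: one leader per event (single cpu in this model) */
struct model_leader {
	uint64_t prev;	/* last raw counter value seen */
	uint64_t diff;	/* counts since the previous context switch */
};

/* per-session part: one follower per perf-stat session */
struct model_follower {
	int enabled;
	int match_all;	/* 1: system-wide, ignore tid */
	uint32_t tid;	/* task filter, used when match_all == 0 */
	uint64_t accum;	/* what the session finally reads */
};

/* leader: runs on every context switch and publishes the new diff */
static void model_leader_run(struct model_leader *l, uint64_t raw)
{
	l->diff = raw - l->prev;
	l->prev = raw;
}

/* follower: runs right after the leader and accumulates matching diffs */
static void model_follower_run(struct model_follower *f,
			       const struct model_leader *l,
			       uint32_t outgoing_tid)
{
	if (!f->enabled)
		return;
	if (f->match_all || f->tid == outgoing_tid)
		f->accum += l->diff;
}

/* one context switch: a single counter read serves all followers */
static void model_switch(struct model_leader *l, struct model_follower *fs,
			 int nr, uint64_t raw, uint32_t outgoing_tid)
{
	int i;

	model_leader_run(l, raw);
	for (i = 0; i < nr; i++)
		model_follower_run(&fs[i], l, outgoing_tid);
}
```

A system-wide session and a per-task session can then share the same
leader: both accumulate from the same diff, and only the filter differs.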

> 
> ---
> Known limitations:
> 1. Do not support per cgroup events;
> 2. Do not support monitoring of BPF program (perf-stat -b);
> 3. Do not support event groups;
> 4. Do not support inherit events during fork().
> 
> The following commands have been tested:
> 
>    perf stat --bpf-counters -e cycles,ref-cycles -a
>    perf stat --bpf-counters -e cycles,instructions -C 1,3,4
>    perf stat --bpf-counters -e cycles -p 123
>    perf stat --bpf-counters -e cycles -t 100,101
>    perf stat --bpf-counters -e cycles,ref-cycles -- stressapptest ...
> ---
>  tools/perf/Documentation/perf-stat.txt        |  11 +
>  tools/perf/Makefile.perf                      |   1 +
>  tools/perf/builtin-stat.c                     |  10 +
>  tools/perf/util/bpf_counter.c                 | 519 +++++++++++++++++-
>  tools/perf/util/bpf_skel/bperf.h              |  14 +
>  tools/perf/util/bpf_skel/bperf_follower.bpf.c |  69 +++
>  tools/perf/util/bpf_skel/bperf_leader.bpf.c   |  46 ++
>  tools/perf/util/bpf_skel/bperf_u.h            |  14 +
>  tools/perf/util/evsel.h                       |  20 +-
>  tools/perf/util/target.h                      |   4 +-
>  10 files changed, 701 insertions(+), 7 deletions(-)
>  create mode 100644 tools/perf/util/bpf_skel/bperf.h
>  create mode 100644 tools/perf/util/bpf_skel/bperf_follower.bpf.c
>  create mode 100644 tools/perf/util/bpf_skel/bperf_leader.bpf.c
>  create mode 100644 tools/perf/util/bpf_skel/bperf_u.h
> 
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 08a1714494f87..d2e7656b5ef81 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -93,6 +93,17 @@ report::
>  
>          1.102235068 seconds time elapsed
>  
> +--bpf-counters::
> +	Use BPF programs to aggregate readings from perf_events.  This
> +	allows multiple perf-stat sessions that are counting the same metric (cycles,
> +	instructions, etc.) to share hardware counters.
> +
> +--bpf-attr-map::
> +	With option "--bpf-counters", different perf-stat sessions share
> +	information about shared BPF programs and maps via a pinned hashmap.
> +	Use "--bpf-attr-map" to specify the path of this pinned hashmap.
> +	The default path is /sys/fs/bpf/perf_attr_map.
> +
>  ifdef::HAVE_LIBPFM[]
>  --pfm-events events::
>  Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index f6e609673de2b..ca9aa08e85a1f 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -1007,6 +1007,7 @@ python-clean:
>  SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel)
>  SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp)
>  SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
> +SKELETONS += $(SKEL_OUT)/bperf_leader.skel.h $(SKEL_OUT)/bperf_follower.skel.h
>  
>  ifdef BUILD_BPF_SKEL
>  BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 2e2e4a8345ea2..92696373da994 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -792,6 +792,12 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	}
>  
>  	evlist__for_each_cpu (evsel_list, i, cpu) {
> +		/*
> +		 * bperf calls evsel__open_per_cpu() in bperf__load(), so
> +		 * no need to call it again here.
> +		 */
> +		if (target.use_bpf)
> +			break;
>  		affinity__set(&affinity, cpu);
>  
>  		evlist__for_each_entry(evsel_list, counter) {
> @@ -1146,6 +1152,10 @@ static struct option stat_options[] = {
>  #ifdef HAVE_BPF_SKEL
>  	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
>  		   "stat events on existing bpf program id"),
> +	OPT_BOOLEAN(0, "bpf-counters", &target.use_bpf,
> +		    "use bpf program to count events"),
> +	OPT_STRING(0, "bpf-attr-map", &target.attr_map, "attr-map-path",
> +		   "path to perf_event_attr map"),
>  #endif
>  	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
>  		    "system-wide collection from all CPUs"),
> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> index 04f89120b3232..81d1df3c4ec0e 100644
> --- a/tools/perf/util/bpf_counter.c
> +++ b/tools/perf/util/bpf_counter.c
> @@ -5,6 +5,7 @@
>  #include <assert.h>
>  #include <limits.h>
>  #include <unistd.h>
> +#include <sys/file.h>
>  #include <sys/time.h>
>  #include <sys/resource.h>
>  #include <linux/err.h>
> @@ -12,14 +13,45 @@
>  #include <bpf/bpf.h>
>  #include <bpf/btf.h>
>  #include <bpf/libbpf.h>
> +#include <api/fs/fs.h>
>  
>  #include "bpf_counter.h"
>  #include "counts.h"
>  #include "debug.h"
>  #include "evsel.h"
> +#include "evlist.h"
>  #include "target.h"
> +#include "cpumap.h"
> +#include "thread_map.h"
>  
>  #include "bpf_skel/bpf_prog_profiler.skel.h"
> +#include "bpf_skel/bperf_u.h"
> +#include "bpf_skel/bperf_leader.skel.h"
> +#include "bpf_skel/bperf_follower.skel.h"
> +
> +/*
> + * bperf uses a hashmap, the attr_map, to track all the leader programs.
> + * The hashmap is pinned in bpffs. flock() on this file is used to ensure
> + * no concurrent access to the attr_map.  The key of attr_map is struct
> + * perf_event_attr, and the value is struct perf_event_attr_map_entry.
> + *
> + * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
> + * leader prog, and the diff_map. Each perf-stat session holds a reference
> + * to the bpf_link to make sure the leader prog is attached to sched_switch
> + * tracepoint.
> + *
> + * Since the hashmap only contains IDs of the bpf_link and diff_map, it
> + * does not hold any references to the leader program. Once all perf-stat
> + * sessions of these events exit, the leader prog, its maps, and the
> + * perf_events will be freed.
> + */
> +struct perf_event_attr_map_entry {
> +	__u32 link_id;
> +	__u32 diff_map_id;
> +};
> +
> +#define DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"
> +#define ATTR_MAP_SIZE 16
>  
>  static inline void *u64_to_ptr(__u64 ptr)
>  {
> @@ -274,17 +306,494 @@ struct bpf_counter_ops bpf_program_profiler_ops = {
>  	.install_pe = bpf_program_profiler__install_pe,
>  };
>  
> +static __u32 bpf_link_get_id(int fd)
> +{
> +	struct bpf_link_info link_info = {0};
> +	__u32 link_info_len = sizeof(link_info);
> +
> +	bpf_obj_get_info_by_fd(fd, &link_info, &link_info_len);
> +	return link_info.id;
> +}
> +
> +static __u32 bpf_link_get_prog_id(int fd)
> +{
> +	struct bpf_link_info link_info = {0};
> +	__u32 link_info_len = sizeof(link_info);
> +
> +	bpf_obj_get_info_by_fd(fd, &link_info, &link_info_len);
> +	return link_info.prog_id;
> +}
> +
> +static __u32 bpf_map_get_id(int fd)
> +{
> +	struct bpf_map_info map_info = {0};
> +	__u32 map_info_len = sizeof(map_info);
> +
> +	bpf_obj_get_info_by_fd(fd, &map_info, &map_info_len);
> +	return map_info.id;
> +}
> +
> +static int bperf_lock_attr_map(struct target *target)
> +{
> +	char path[PATH_MAX];
> +	int map_fd, err;
> +
> +	if (target->attr_map) {
> +		scnprintf(path, PATH_MAX, "%s", target->attr_map);
> +	} else {
> +		scnprintf(path, PATH_MAX, "%s/%s", sysfs__mountpoint(),
> +			  DEFAULT_ATTR_MAP_PATH);
> +	}
> +
> +	if (access(path, F_OK)) {
> +		map_fd = bpf_create_map(BPF_MAP_TYPE_HASH,
> +					sizeof(struct perf_event_attr),
> +					sizeof(struct perf_event_attr_map_entry),
> +					ATTR_MAP_SIZE, 0);
> +		if (map_fd < 0)
> +			return -1;
> +
> +		err = bpf_obj_pin(map_fd, path);
> +		if (err) {
> +			/* someone pinned the map in parallel? */
> +			close(map_fd);
> +			map_fd = bpf_obj_get(path);
> +			if (map_fd < 0)
> +				return -1;
> +		}
> +	} else {
> +		map_fd = bpf_obj_get(path);
> +		if (map_fd < 0)
> +			return -1;
> +	}
> +
> +	err = flock(map_fd, LOCK_EX);
> +	if (err) {
> +		close(map_fd);
> +		return -1;
> +	}
> +	return map_fd;
> +}
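
The locking pattern used by bperf_lock_attr_map() (open the shared file,
then flock() it exclusively) can be tried out on an ordinary file. The
sketch below is an illustration under assumptions: locked_increment() is
a hypothetical helper, and it works on a plain text file holding a
counter instead of a pinned BPF map. Only open(), flock(), pread(),
pwrite() and ftruncate() are real calls.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: bump a decimal counter stored in 'path' while
 * holding an exclusive flock() on the file. Returns the new value, or
 * -1 on error.
 */
static long locked_increment(const char *path)
{
	char buf[32] = {0};
	long val = 0;
	ssize_t n;
	int len;
	int fd;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;

	/* blocks while another open file description holds the lock */
	if (flock(fd, LOCK_EX)) {
		close(fd);
		return -1;
	}

	/* read-modify-write is safe here: we are the only lock holder */
	n = pread(fd, buf, sizeof(buf) - 1, 0);
	if (n > 0)
		val = strtol(buf, NULL, 10);
	val++;

	len = snprintf(buf, sizeof(buf), "%ld\n", val);
	if (ftruncate(fd, 0) || pwrite(fd, buf, len, 0) != len)
		val = -1;

	flock(fd, LOCK_UN);
	close(fd);
	return val;
}
```

As in the patch, the lock is advisory: it only protects against other
processes that also take the flock() before touching the file.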
> +
> +/* trigger the leader program on a cpu */
> +static int bperf_trigger_reading(int prog_fd, int cpu)
> +{
> +	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
> +			    .ctx_in = NULL,
> +			    .ctx_size_in = 0,
> +			    .flags = BPF_F_TEST_RUN_ON_CPU,
> +			    .cpu = cpu,
> +			    .retval = 0,
> +		);
> +
> +	return bpf_prog_test_run_opts(prog_fd, &opts);
> +}
> +
> +static int bperf_check_target(struct evsel *evsel,
> +			      struct target *target,
> +			      enum bperf_filter_type *filter_type,
> +			      __u32 *filter_entry_cnt)
> +{
> +	if (evsel->leader->core.nr_members > 1) {
> +		pr_err("bpf managed perf events do not yet support groups.\n");
> +		return -1;
> +	}
> +
> +	/* determine filter type based on target */
> +	if (target->system_wide) {
> +		*filter_type = BPERF_FILTER_GLOBAL;
> +		*filter_entry_cnt = 1;
> +	} else if (target->cpu_list) {
> +		*filter_type = BPERF_FILTER_CPU;
> +		*filter_entry_cnt = perf_cpu_map__nr(evsel__cpus(evsel));
> +	} else if (target->tid) {
> +		*filter_type = BPERF_FILTER_PID;
> +		*filter_entry_cnt = perf_thread_map__nr(evsel->core.threads);
> +	} else if (target->pid || evsel->evlist->workload.pid != -1) {
> +		*filter_type = BPERF_FILTER_TGID;
> +		*filter_entry_cnt = perf_thread_map__nr(evsel->core.threads);
> +	} else {
> +		pr_err("bpf managed perf events do not yet support these targets.\n");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static	struct perf_cpu_map *all_cpu_map;
> +
> +static int bperf_reload_leader_program(struct evsel *evsel, int attr_map_fd,
> +				       struct perf_event_attr_map_entry *entry)
> +{
> +	struct bperf_leader_bpf *skel = bperf_leader_bpf__open();
> +	int link_fd, diff_map_fd, err;
> +	struct bpf_link *link = NULL;
> +
> +	if (!skel) {
> +		pr_err("Failed to open leader skeleton\n");
> +		return -1;
> +	}
> +
> +	bpf_map__resize(skel->maps.events, libbpf_num_possible_cpus());
> +	err = bperf_leader_bpf__load(skel);
> +	if (err) {
> +		pr_err("Failed to load leader skeleton\n");
> +		goto out;
> +	}
> +
> +	err = -1;
> +	link = bpf_program__attach(skel->progs.on_switch);
> +	if (!link) {
> +		pr_err("Failed to attach leader program\n");
> +		goto out;
> +	}
> +
> +	link_fd = bpf_link__fd(link);
> +	diff_map_fd = bpf_map__fd(skel->maps.diff_readings);
> +	entry->link_id = bpf_link_get_id(link_fd);
> +	entry->diff_map_id = bpf_map_get_id(diff_map_fd);
> +	err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, entry, BPF_ANY);
> +	assert(err == 0);
> +
> +	evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry->link_id);
> +	assert(evsel->bperf_leader_link_fd >= 0);
> +
> +	/*
> +	 * save leader_skel for install_pe, which is called within
> +	 * following evsel__open_per_cpu call
> +	 */
> +	evsel->leader_skel = skel;
> +	evsel__open_per_cpu(evsel, all_cpu_map, -1);
> +
> +out:
> +	bperf_leader_bpf__destroy(skel);
> +	bpf_link__destroy(link);
> +	return err;
> +}
> +
> +static int bperf__load(struct evsel *evsel, struct target *target)
> +{
> +	struct perf_event_attr_map_entry entry = {0xffffffff, 0xffffffff};
> +	int attr_map_fd, diff_map_fd = -1, err;
> +	enum bperf_filter_type filter_type;
> +	__u32 filter_entry_cnt, i;
> +
> +	if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
> +		return -1;
> +
> +	if (!all_cpu_map) {
> +		all_cpu_map = perf_cpu_map__new(NULL);
> +		if (!all_cpu_map)
> +			return -1;
> +	}
> +
> +	evsel->bperf_leader_prog_fd = -1;
> +	evsel->bperf_leader_link_fd = -1;
> +
> +	/*
> +	 * Step 1: hold a fd on the leader program and the bpf_link. If
> +	 * the program is already gone, reload the program.
> +	 * Use flock() to ensure exclusive access to the perf_event_attr
> +	 * map.
> +	 */
> +	attr_map_fd = bperf_lock_attr_map(target);
> +	if (attr_map_fd < 0) {
> +		pr_err("Failed to lock perf_event_attr map\n");
> +		return -1;
> +	}
> +
> +	err = bpf_map_lookup_elem(attr_map_fd, &evsel->core.attr, &entry);
> +	if (err) {
> +		err = bpf_map_update_elem(attr_map_fd, &evsel->core.attr, &entry, BPF_ANY);
> +		if (err)
> +			goto out;
> +	}
> +
> +	evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry.link_id);
> +	if (evsel->bperf_leader_link_fd < 0 &&
> +	    bperf_reload_leader_program(evsel, attr_map_fd, &entry))
> +		goto out;
> +
> +	/*
> +	 * The bpf_link holds reference to the leader program, and the
> +	 * leader program holds reference to the maps. Therefore, if
> +	 * link_id is valid, diff_map_id should also be valid.
> +	 */
> +	evsel->bperf_leader_prog_fd = bpf_prog_get_fd_by_id(
> +		bpf_link_get_prog_id(evsel->bperf_leader_link_fd));
> +	assert(evsel->bperf_leader_prog_fd >= 0);
> +
> +	diff_map_fd = bpf_map_get_fd_by_id(entry.diff_map_id);
> +	assert(diff_map_fd >= 0);
> +
> +	/*
> +	 * bperf uses BPF_PROG_TEST_RUN to get accurate readings. Check
> +	 * whether the kernel supports it.
> +	 */
> +	err = bperf_trigger_reading(evsel->bperf_leader_prog_fd, 0);
> +	if (err) {
> +		pr_err("The kernel does not support test_run for raw_tp BPF programs.\n"
> +		       "Therefore, --bpf-counters might show inaccurate readings\n");
> +		goto out;
> +	}
> +
> +	/* Step 2: load the follower skeleton */
> +	evsel->follower_skel = bperf_follower_bpf__open();
> +	if (!evsel->follower_skel) {
> +		pr_err("Failed to open follower skeleton\n");
> +		goto out;
> +	}
> +
> +	/* attach fexit program to the leader program */
> +	bpf_program__set_attach_target(evsel->follower_skel->progs.fexit_XXX,
> +				       evsel->bperf_leader_prog_fd, "on_switch");
> +
> +	/* connect to leader diff_reading map */
> +	bpf_map__reuse_fd(evsel->follower_skel->maps.diff_readings, diff_map_fd);
> +
> +	/* set up reading map */
> +	bpf_map__set_max_entries(evsel->follower_skel->maps.accum_readings,
> +				 filter_entry_cnt);
> +	/* set up follower filter based on target */
> +	bpf_map__set_max_entries(evsel->follower_skel->maps.filter,
> +				 filter_entry_cnt);
> +	err = bperf_follower_bpf__load(evsel->follower_skel);
> +	if (err) {
> +		pr_err("Failed to load follower skeleton\n");
> +		bperf_follower_bpf__destroy(evsel->follower_skel);
> +		evsel->follower_skel = NULL;
> +		goto out;
> +	}
> +
> +	for (i = 0; i < filter_entry_cnt; i++) {
> +		int filter_map_fd;
> +		__u32 key;
> +
> +		if (filter_type == BPERF_FILTER_PID ||
> +		    filter_type == BPERF_FILTER_TGID)
> +			key = evsel->core.threads->map[i].pid;
> +		else if (filter_type == BPERF_FILTER_CPU)
> +			key = evsel->core.cpus->map[i];
> +		else
> +			break;
> +
> +		filter_map_fd = bpf_map__fd(evsel->follower_skel->maps.filter);
> +		bpf_map_update_elem(filter_map_fd, &key, &i, BPF_ANY);
> +	}
> +
> +	evsel->follower_skel->bss->type = filter_type;
> +
> +	err = bperf_follower_bpf__attach(evsel->follower_skel);
> +
> +out:
> +	if (err && evsel->bperf_leader_link_fd >= 0)
> +		close(evsel->bperf_leader_link_fd);
> +	if (err && evsel->bperf_leader_prog_fd >= 0)
> +		close(evsel->bperf_leader_prog_fd);
> +	if (diff_map_fd >= 0)
> +		close(diff_map_fd);
> +
> +	flock(attr_map_fd, LOCK_UN);
> +	close(attr_map_fd);
> +
> +	return err;
> +}
> +
> +static int bperf__install_pe(struct evsel *evsel, int cpu, int fd)
> +{
> +	struct bperf_leader_bpf *skel = evsel->leader_skel;
> +
> +	return bpf_map_update_elem(bpf_map__fd(skel->maps.events),
> +				   &cpu, &fd, BPF_ANY);
> +}
> +
> +/*
> + * trigger the leader prog on each cpu, so the accum_reading map could get
> + * the latest readings.
> + */
> +static int bperf_sync_counters(struct evsel *evsel)
> +{
> +	int num_cpu, i, cpu;
> +
> +	num_cpu = all_cpu_map->nr;
> +	for (i = 0; i < num_cpu; i++) {
> +		cpu = all_cpu_map->map[i];
> +		bperf_trigger_reading(evsel->bperf_leader_prog_fd, cpu);
> +	}
> +	return 0;
> +}
> +
> +static int bperf__enable(struct evsel *evsel)
> +{
> +	evsel->follower_skel->bss->enabled = 1;
> +	return 0;
> +}
> +
> +static int bperf__read(struct evsel *evsel)
> +{
> +	struct bperf_follower_bpf *skel = evsel->follower_skel;
> +	__u32 num_cpu_bpf = cpu__max_cpu();
> +	struct bpf_perf_event_value values[num_cpu_bpf];
> +	int reading_map_fd, err = 0;
> +	__u32 i, j, num_cpu;
> +
> +	bperf_sync_counters(evsel);
> +	reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
> +
> +	for (i = 0; i < bpf_map__max_entries(skel->maps.accum_readings); i++) {
> +		__u32 cpu;
> +
> +		err = bpf_map_lookup_elem(reading_map_fd, &i, values);
> +		if (err)
> +			goto out;
> +		switch (evsel->follower_skel->bss->type) {
> +		case BPERF_FILTER_GLOBAL:
> +			assert(i == 0);
> +
> +			num_cpu = all_cpu_map->nr;
> +			for (j = 0; j < num_cpu; j++) {
> +				cpu = all_cpu_map->map[j];
> +				perf_counts(evsel->counts, cpu, 0)->val = values[cpu].counter;
> +				perf_counts(evsel->counts, cpu, 0)->ena = values[cpu].enabled;
> +				perf_counts(evsel->counts, cpu, 0)->run = values[cpu].running;
> +			}
> +			break;
> +		case BPERF_FILTER_CPU:
> +			cpu = evsel->core.cpus->map[i];
> +			perf_counts(evsel->counts, i, 0)->val = values[cpu].counter;
> +			perf_counts(evsel->counts, i, 0)->ena = values[cpu].enabled;
> +			perf_counts(evsel->counts, i, 0)->run = values[cpu].running;
> +			break;
> +		case BPERF_FILTER_PID:
> +		case BPERF_FILTER_TGID:
> +			perf_counts(evsel->counts, 0, i)->val = 0;
> +			perf_counts(evsel->counts, 0, i)->ena = 0;
> +			perf_counts(evsel->counts, 0, i)->run = 0;
> +
> +			for (cpu = 0; cpu < num_cpu_bpf; cpu++) {
> +				perf_counts(evsel->counts, 0, i)->val += values[cpu].counter;
> +				perf_counts(evsel->counts, 0, i)->ena += values[cpu].enabled;
> +				perf_counts(evsel->counts, 0, i)->run += values[cpu].running;
> +			}
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +out:
> +	return err;
> +}
> +
> +static int bperf__destroy(struct evsel *evsel)
> +{
> +	bperf_follower_bpf__destroy(evsel->follower_skel);
> +	close(evsel->bperf_leader_prog_fd);
> +	close(evsel->bperf_leader_link_fd);
> +	return 0;
> +}
> +
> +/*
> + * bperf: share hardware PMCs with BPF
> + *
> + * perf uses performance monitoring counters (PMC) to monitor system
> + * performance. The PMCs are limited hardware resources. For example,
> + * Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
> + *
> + * Modern data center systems use these PMCs in many different ways:
> + * system level monitoring, (maybe nested) container level monitoring, per
> + * process monitoring, profiling (in sample mode), etc. In some cases,
> + * there are more active perf_events than available hardware PMCs. To allow
> + * all perf_events to have a chance to run, it is necessary to do expensive
> + * time multiplexing of events.
> + *
> + * On the other hand, many monitoring tools count the common metrics
> + * (cycles, instructions). It is a waste to have multiple tools create
> + * multiple perf_events of "cycles" and occupy multiple PMCs.
> + *
> + * bperf tries to reduce such waste by allowing multiple perf_events of
> + * "cycles" or "instructions" (at different scopes) to share PMUs. Instead
> + * of having each perf-stat session read its own perf_events, bperf uses
> + * BPF programs to read the perf_events and aggregate readings to BPF maps.
> + * Then, the perf-stat session(s) read the values from these BPF maps.
> + *
> + *                                ||
> + *       shared progs and maps <- || -> per session progs and maps
> + *                                ||
> + *   ---------------              ||
> + *   | perf_events |              ||
> + *   ---------------       fexit  ||      -----------------
> + *          |             --------||----> | follower prog |
> + *       --------------- /        || ---  -----------------
> + * cs -> | leader prog |/         ||/        |         |
> + *   --> ---------------         /||  --------------  ------------------
> + *  /       |         |         / ||  | filter map |  | accum_readings |
> + * /  ------------  ------------  ||  --------------  ------------------
> + * |  | prev map |  | diff map |  ||                        |
> + * |  ------------  ------------  ||                        |
> + *  \                             ||                        |
> + * = \ ==================================================== | ============
> + *    \                                                    /   user space
> + *     \                                                  /
> + *      \                                                /
> + *    BPF_PROG_TEST_RUN                    BPF_MAP_LOOKUP_ELEM
> + *        \                                            /
> + *         \                                          /
> + *          \------  perf-stat ----------------------/
> + *
> + * The figure above shows the architecture of bperf. Note that the figure
> + * is divided into 3 regions: shared progs and maps (top left), per session
> + * progs and maps (top right), and user space (bottom).
> + *
> + * The leader prog is triggered on each context switch (cs). The leader
> + * prog reads perf_events and stores the difference (current_reading -
> + * previous_reading) to the diff map. For the same metric, e.g. "cycles",
> + * multiple perf-stat sessions share the same leader prog.
> + *
> + * Each perf-stat session creates a follower prog and attaches it as an
> + * fexit program to the leader prog. It is possible to attach up to
> + * BPF_MAX_TRAMP_PROGS (38) follower progs to the same leader prog. The
> + * follower prog checks the current task and processor ID to decide whether
> + * to add the value from the diff map to its accumulated reading map
> + * (accum_readings).
> + *
> + * Finally, perf-stat user space reads the value from the accum_readings
> + * map.
> + *
> + * Besides context switch, it is also necessary to trigger the leader prog
> + * before perf-stat reads the value. Otherwise, the accum_readings map may
> + * not have the latest reading from the perf_events. This is achieved by
> + * triggering the event via sys_bpf(BPF_PROG_TEST_RUN) on each CPU.
> + *
> + * The comment before the definition of struct perf_event_attr_map_entry
> + * describes how different sessions of perf-stat share information about
> + * the leader prog.
> + */
> +
> +struct bpf_counter_ops bperf_ops = {
> +	.load       = bperf__load,
> +	.enable     = bperf__enable,
> +	.read       = bperf__read,
> +	.install_pe = bperf__install_pe,
> +	.destroy    = bperf__destroy,
> +};
> +
> +static inline bool bpf_counter_skip(struct evsel *evsel)
> +{
> +	return list_empty(&evsel->bpf_counter_list) &&
> +		evsel->follower_skel == NULL;
> +}
> +
>  int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
>  {
> -	if (list_empty(&evsel->bpf_counter_list))
> +	if (bpf_counter_skip(evsel))
>  		return 0;
>  	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
>  }
>  
>  int bpf_counter__load(struct evsel *evsel, struct target *target)
>  {
> -	if (target__has_bpf(target))
> +	if (target->bpf_str)
>  		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
> +	else if (target->use_bpf)
> +		evsel->bpf_counter_ops = &bperf_ops;
>  
>  	if (evsel->bpf_counter_ops)
>  		return evsel->bpf_counter_ops->load(evsel, target);
> @@ -293,21 +802,21 @@ int bpf_counter__load(struct evsel *evsel, struct target *target)
>  
>  int bpf_counter__enable(struct evsel *evsel)
>  {
> -	if (list_empty(&evsel->bpf_counter_list))
> +	if (bpf_counter_skip(evsel))
>  		return 0;
>  	return evsel->bpf_counter_ops->enable(evsel);
>  }
>  
>  int bpf_counter__read(struct evsel *evsel)
>  {
> -	if (list_empty(&evsel->bpf_counter_list))
> +	if (bpf_counter_skip(evsel))
>  		return -EAGAIN;
>  	return evsel->bpf_counter_ops->read(evsel);
>  }
>  
>  void bpf_counter__destroy(struct evsel *evsel)
>  {
> -	if (list_empty(&evsel->bpf_counter_list))
> +	if (bpf_counter_skip(evsel))
>  		return;
>  	evsel->bpf_counter_ops->destroy(evsel);
>  	evsel->bpf_counter_ops = NULL;
> diff --git a/tools/perf/util/bpf_skel/bperf.h b/tools/perf/util/bpf_skel/bperf.h
> new file mode 100644
> index 0000000000000..186a5551ddb9d
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/bperf.h
> @@ -0,0 +1,14 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +// Copyright (c) 2021 Facebook
> +
> +#ifndef __BPERF_STAT_H
> +#define __BPERF_STAT_H
> +
> +typedef struct {
> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(struct bpf_perf_event_value));
> +	__uint(max_entries, 1);
> +} reading_map;
> +
> +#endif /* __BPERF_STAT_H */
> diff --git a/tools/perf/util/bpf_skel/bperf_follower.bpf.c b/tools/perf/util/bpf_skel/bperf_follower.bpf.c
> new file mode 100644
> index 0000000000000..b8fa3cb2da230
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/bperf_follower.bpf.c
> @@ -0,0 +1,69 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +// Copyright (c) 2021 Facebook
> +#include <linux/bpf.h>
> +#include <linux/perf_event.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include "bperf.h"
> +#include "bperf_u.h"
> +
> +reading_map diff_readings SEC(".maps");
> +reading_map accum_readings SEC(".maps");
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_HASH);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(__u32));
> +} filter SEC(".maps");
> +
> +enum bperf_filter_type type = 0;
> +int enabled = 0;
> +
> +SEC("fexit/XXX")
> +int BPF_PROG(fexit_XXX)
> +{
> +	struct bpf_perf_event_value *diff_val, *accum_val;
> +	__u32 filter_key, zero = 0;
> +	__u32 *accum_key;
> +
> +	if (!enabled)
> +		return 0;
> +
> +	switch (type) {
> +	case BPERF_FILTER_GLOBAL:
> +		accum_key = &zero;
> +		goto do_add;
> +	case BPERF_FILTER_CPU:
> +		filter_key = bpf_get_smp_processor_id();
> +		break;
> +	case BPERF_FILTER_PID:
> +		filter_key = bpf_get_current_pid_tgid() & 0xffffffff;
> +		break;
> +	case BPERF_FILTER_TGID:
> +		filter_key = bpf_get_current_pid_tgid() >> 32;
> +		break;
> +	default:
> +		return 0;
> +	}
> +
> +	accum_key = bpf_map_lookup_elem(&filter, &filter_key);
> +	if (!accum_key)
> +		return 0;
> +
> +do_add:
> +	diff_val = bpf_map_lookup_elem(&diff_readings, &zero);
> +	if (!diff_val)
> +		return 0;
> +
> +	accum_val = bpf_map_lookup_elem(&accum_readings, accum_key);
> +	if (!accum_val)
> +		return 0;
> +
> +	accum_val->counter += diff_val->counter;
> +	accum_val->enabled += diff_val->enabled;
> +	accum_val->running += diff_val->running;
> +
> +	return 0;
> +}
> +
> +char LICENSE[] SEC("license") = "Dual BSD/GPL";
> diff --git a/tools/perf/util/bpf_skel/bperf_leader.bpf.c b/tools/perf/util/bpf_skel/bperf_leader.bpf.c
> new file mode 100644
> index 0000000000000..4f70d1459e86c
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/bperf_leader.bpf.c
> @@ -0,0 +1,46 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +// Copyright (c) 2021 Facebook
> +#include <linux/bpf.h>
> +#include <linux/perf_event.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include "bperf.h"
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
> +	__uint(key_size, sizeof(__u32));
> +	__uint(value_size, sizeof(int));
> +	__uint(map_flags, BPF_F_PRESERVE_ELEMS);
> +} events SEC(".maps");
> +
> +reading_map prev_readings SEC(".maps");
> +reading_map diff_readings SEC(".maps");
> +
> +SEC("raw_tp/sched_switch")
> +int BPF_PROG(on_switch)
> +{
> +	struct bpf_perf_event_value val, *prev_val, *diff_val;
> +	__u32 key = bpf_get_smp_processor_id();
> +	__u32 zero = 0;
> +	long err;
> +
> +	prev_val = bpf_map_lookup_elem(&prev_readings, &zero);
> +	if (!prev_val)
> +		return 0;
> +
> +	diff_val = bpf_map_lookup_elem(&diff_readings, &zero);
> +	if (!diff_val)
> +		return 0;
> +
> +	err = bpf_perf_event_read_value(&events, key, &val, sizeof(val));
> +	if (err)
> +		return 0;
> +
> +	diff_val->counter = val.counter - prev_val->counter;
> +	diff_val->enabled = val.enabled - prev_val->enabled;
> +	diff_val->running = val.running - prev_val->running;
> +	*prev_val = val;
> +	return 0;
> +}
> +
> +char LICENSE[] SEC("license") = "Dual BSD/GPL";
> diff --git a/tools/perf/util/bpf_skel/bperf_u.h b/tools/perf/util/bpf_skel/bperf_u.h
> new file mode 100644
> index 0000000000000..1ce0c2c905c11
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/bperf_u.h
> @@ -0,0 +1,14 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +// Copyright (c) 2021 Facebook
> +
> +#ifndef __BPERF_STAT_U_H
> +#define __BPERF_STAT_U_H
> +
> +enum bperf_filter_type {
> +	BPERF_FILTER_GLOBAL = 1,
> +	BPERF_FILTER_CPU,
> +	BPERF_FILTER_PID,
> +	BPERF_FILTER_TGID,
> +};
> +
> +#endif /* __BPERF_STAT_U_H */
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 6026487353dd8..dd4f56f9cfdf5 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -20,6 +20,8 @@ union perf_event;
>  struct bpf_counter_ops;
>  struct target;
>  struct hashmap;
> +struct bperf_leader_bpf;
> +struct bperf_follower_bpf;
>  
>  typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
>  
> @@ -130,8 +132,24 @@ struct evsel {
>  	 * See also evsel__has_callchain().
>  	 */
>  	__u64			synth_sample_type;
> -	struct list_head	bpf_counter_list;
> +
> +	/*
> +	 * bpf_counter_ops serves two use cases:
> +	 *   1. perf-stat -b          counting events used by BPF programs
> +	 *   2. perf-stat --use-bpf   use BPF programs to aggregate counts
> +	 */
>  	struct bpf_counter_ops	*bpf_counter_ops;
> +
> +	/* for perf-stat -b */
> +	struct list_head	bpf_counter_list;
> +
> +	/* for perf-stat --use-bpf */
> +	int			bperf_leader_prog_fd;
> +	int			bperf_leader_link_fd;
> +	union {
> +		struct bperf_leader_bpf *leader_skel;
> +		struct bperf_follower_bpf *follower_skel;
> +	};
>  };
>  
>  struct perf_missing_features {
> diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
> index f132c6c2eef81..1bce3eb28ef25 100644
> --- a/tools/perf/util/target.h
> +++ b/tools/perf/util/target.h
> @@ -16,6 +16,8 @@ struct target {
>  	bool	     uses_mmap;
>  	bool	     default_per_cpu;
>  	bool	     per_thread;
> +	bool	     use_bpf;
> +	const char   *attr_map;
>  };
>  
>  enum target_errno {
> @@ -66,7 +68,7 @@ static inline bool target__has_cpu(struct target *target)
>  
>  static inline bool target__has_bpf(struct target *target)
>  {
> -	return target->bpf_str;
> +	return target->bpf_str || target->use_bpf;
>  }
>  
>  static inline bool target__none(struct target *target)
> -- 
> 2.30.2
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-18 21:14       ` Jiri Olsa
@ 2021-03-19  0:09         ` Arnaldo
  2021-03-19  0:22           ` Song Liu
  0 siblings, 1 reply; 33+ messages in thread
From: Arnaldo @ 2021-03-19  0:09 UTC (permalink / raw)
  To: Jiri Olsa, Song Liu
  Cc: Arnaldo Carvalho de Melo, Namhyung Kim, linux-kernel,
	Kernel Team, Arnaldo Carvalho de Melo, Jiri Olsa



On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
>On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
>> 
>> 
>> > On Mar 17, 2021, at 6:11 AM, Arnaldo Carvalho de Melo
><acme@kernel.org> wrote:
>> > 
>> > Em Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim escreveu:
>> >> Hi Song,
>> >> 
>> >> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com>
>wrote:
>> >>> 
>> >>> perf uses performance monitoring counters (PMCs) to monitor
>system
>> >>> performance. The PMCs are limited hardware resources. For
>example,
>> >>> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
>> >>> 
>> >>> Modern data center systems use these PMCs in many different ways:
>> >>> system level monitoring, (maybe nested) container level
>monitoring, per
>> >>> process monitoring, profiling (in sample mode), etc. In some
>cases,
>> >>> there are more active perf_events than available hardware PMCs.
>To allow
>> >>> all perf_events to have a chance to run, it is necessary to do
>expensive
>> >>> time multiplexing of events.
>> >>> 
>> >>> On the other hand, many monitoring tools count the common metrics
>(cycles,
>> >>> instructions). It is a waste to have multiple tools create
>multiple
>> >>> perf_events of "cycles" and occupy multiple PMCs.
>> >> 
>> >> Right, it'd be really helpful when the PMCs are frequently or
>mostly shared.
>> >> But it'd also increase the overhead for uncontended cases as BPF
>programs
>> >> need to run on every context switch.  Depending on the workload,
>it may
>> >> cause a non-negligible performance impact.  So users should be
>aware of it.
>> > 
>> > Would be interesting to, humm, measure both cases to have a firm
>number
>> > of the impact, how many instructions are added when sharing using
>> > --bpf-counters?
>> > 
>> > I.e. compare the "expensive time multiplexing of events" with its
>> > avoidance by using --bpf-counters.
>> > 
>> > Song, have you performed such measurements?
>> 
>> I have got some measurements with perf-bench-sched-messaging:
>> 
>> The system: x86_64 with 23 cores (46 HT)
>> 
>> The perf-stat command:
>> perf stat -e cycles,cycles,instructions,instructions,ref-cycles,ref-cycles <target, etc.>
>> 
>> The benchmark command and output:
>> ./perf bench sched messaging -g 40 -l 50000 -t
>> # Running 'sched/messaging' benchmark:
>> # 20 sender and receiver threads per group
>> # 40 groups == 1600 threads run
>>      Total time: 10X.XXX [sec]
>> 
>> 
>> I use the "Total time" as measurement, so smaller number is better. 
>> 
>> For each condition, I ran the command 5 times and took the median of
>> "Total time".
>> 
>> Baseline (no perf-stat)			104.873 [sec]
>> # global
>> perf stat -a				107.887 [sec]
>> perf stat -a --bpf-counters		106.071 [sec]
>> # per task
>> perf stat 				106.314 [sec]
>> perf stat --bpf-counters 		105.965 [sec]
>> # per cpu
>> perf stat -C 1,3,5 			107.063 [sec]
>> perf stat -C 1,3,5 --bpf-counters 	106.406 [sec]
>
>I can't see why it's actually faster than normal perf ;-)
>would be worth to find out

Isn't this all about contended cases?

>
>jirka
>
>> 
>> From the data, --bpf-counters is slightly better than the regular
>event
>> for all targets. I noticed that the results are not very stable.
>There 
>> are a couple 108.xx runs in some of the conditions (w/ and w/o 
>> --bpf-counters).
>> 
>> 
>> I also measured the average runtime of the BPF programs, with 
>> 
>> 	sysctl kernel.bpf_stats_enabled=1
>> 
>> For each event, if we have one leader and two followers, the total
>run 
>> time is about 340ns. IOW, 340ns for two perf-stat reading
>instructions, 
>> 340ns for two perf-stat reading cycles, etc. 
>> 
>> Thanks,
>> Song
>> 

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-19  0:09         ` Arnaldo
@ 2021-03-19  0:22           ` Song Liu
  2021-03-19  0:54             ` Namhyung Kim
  0 siblings, 1 reply; 33+ messages in thread
From: Song Liu @ 2021-03-19  0:22 UTC (permalink / raw)
  To: Arnaldo
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Namhyung Kim, linux-kernel,
	Kernel Team, Arnaldo Carvalho de Melo, Jiri Olsa



> On Mar 18, 2021, at 5:09 PM, Arnaldo <arnaldo.melo@gmail.com> wrote:
> 
> 
> 
> On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
>> On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
>>> 
>>> 
>>>> On Mar 17, 2021, at 6:11 AM, Arnaldo Carvalho de Melo
>> <acme@kernel.org> wrote:
>>>> 
>>>> Em Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim escreveu:
>>>>> Hi Song,
>>>>> 
>>>>> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com>
>> wrote:
>>>>>> 
>>>>>> perf uses performance monitoring counters (PMCs) to monitor
>> system
>>>>>> performance. The PMCs are limited hardware resources. For
>> example,
>>>>>> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
>>>>>> 
>>>>>> Modern data center systems use these PMCs in many different ways:
>>>>>> system level monitoring, (maybe nested) container level
>> monitoring, per
>>>>>> process monitoring, profiling (in sample mode), etc. In some
>> cases,
>>>>>> there are more active perf_events than available hardware PMCs.
>> To allow
>>>>>> all perf_events to have a chance to run, it is necessary to do
>> expensive
>>>>>> time multiplexing of events.
>>>>>> 
>>>>>> On the other hand, many monitoring tools count the common metrics
>> (cycles,
>>>>>> instructions). It is a waste to have multiple tools create
>> multiple
>>>>>> perf_events of "cycles" and occupy multiple PMCs.
>>>>> 
>>>>> Right, it'd be really helpful when the PMCs are frequently or
>> mostly shared.
>>>>> But it'd also increase the overhead for uncontended cases as BPF
>> programs
>>>>> need to run on every context switch.  Depending on the workload,
>> it may
>>>>> cause a non-negligible performance impact.  So users should be
>> aware of it.
>>>> 
>>>> Would be interesting to, humm, measure both cases to have a firm
>> number
>>>> of the impact, how many instructions are added when sharing using
>>>> --bpf-counters?
>>>> 
>>>> I.e. compare the "expensive time multiplexing of events" with its
>>>> avoidance by using --bpf-counters.
>>>> 
>>>> Song, have you performed such measurements?
>>> 
>>> I have got some measurements with perf-bench-sched-messaging:
>>> 
>>> The system: x86_64 with 23 cores (46 HT)
>>> 
>>> The perf-stat command:
>>> perf stat -e cycles,cycles,instructions,instructions,ref-cycles,ref-cycles <target, etc.>
>>> 
>>> The benchmark command and output:
>>> ./perf bench sched messaging -g 40 -l 50000 -t
>>> # Running 'sched/messaging' benchmark:
>>> # 20 sender and receiver threads per group
>>> # 40 groups == 1600 threads run
>>>     Total time: 10X.XXX [sec]
>>> 
>>> 
>>> I use the "Total time" as measurement, so smaller number is better. 
>>> 
>>> For each condition, I ran the command 5 times and took the median of
>>> "Total time".
>>> 
>>> Baseline (no perf-stat)			104.873 [sec]
>>> # global
>>> perf stat -a				107.887 [sec]
>>> perf stat -a --bpf-counters		106.071 [sec]
>>> # per task
>>> perf stat 				106.314 [sec]
>>> perf stat --bpf-counters 		105.965 [sec]
>>> # per cpu
>>> perf stat -C 1,3,5 			107.063 [sec]
>>> perf stat -C 1,3,5 --bpf-counters 	106.406 [sec]
>> 
>> I can't see why it's actually faster than normal perf ;-)
>> would be worth to find out
> 
> Isn't this all about contended cases?

Yeah, the normal perf is doing time multiplexing; while --bpf-counters 
doesn't need it. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-19  0:22           ` Song Liu
@ 2021-03-19  0:54             ` Namhyung Kim
  2021-03-19 15:35               ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 33+ messages in thread
From: Namhyung Kim @ 2021-03-19  0:54 UTC (permalink / raw)
  To: Song Liu
  Cc: Arnaldo, Jiri Olsa, Arnaldo Carvalho de Melo, linux-kernel,
	Kernel Team, Arnaldo Carvalho de Melo, Jiri Olsa

On Fri, Mar 19, 2021 at 9:22 AM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Mar 18, 2021, at 5:09 PM, Arnaldo <arnaldo.melo@gmail.com> wrote:
> >
> >
> >
> > On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
> >> On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
> >>>
> >>>
> >>>> On Mar 17, 2021, at 6:11 AM, Arnaldo Carvalho de Melo
> >> <acme@kernel.org> wrote:
> >>>>
> >>>> Em Wed, Mar 17, 2021 at 02:29:28PM +0900, Namhyung Kim escreveu:
> >>>>> Hi Song,
> >>>>>
> >>>>> On Wed, Mar 17, 2021 at 6:18 AM Song Liu <songliubraving@fb.com>
> >> wrote:
> >>>>>>
> >>>>>> perf uses performance monitoring counters (PMCs) to monitor
> >> system
> >>>>>> performance. The PMCs are limited hardware resources. For
> >> example,
> >>>>>> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
> >>>>>>
> >>>>>> Modern data center systems use these PMCs in many different ways:
> >>>>>> system level monitoring, (maybe nested) container level
> >> monitoring, per
> >>>>>> process monitoring, profiling (in sample mode), etc. In some
> >> cases,
> >>>>>> there are more active perf_events than available hardware PMCs.
> >> To allow
> >>>>>> all perf_events to have a chance to run, it is necessary to do
> >> expensive
> >>>>>> time multiplexing of events.
> >>>>>>
> >>>>>> On the other hand, many monitoring tools count the common metrics
> >> (cycles,
> >>>>>> instructions). It is a waste to have multiple tools create
> >> multiple
> >>>>>> perf_events of "cycles" and occupy multiple PMCs.
> >>>>>
> >>>>> Right, it'd be really helpful when the PMCs are frequently or
> >> mostly shared.
> >>>>> But it'd also increase the overhead for uncontended cases as BPF
> >> programs
> >>>>> need to run on every context switch.  Depending on the workload,
> >> it may
> >>>>> cause a non-negligible performance impact.  So users should be
> >> aware of it.
> >>>>
> >>>> Would be interesting to, humm, measure both cases to have a firm
> >> number
> >>>> of the impact, how many instructions are added when sharing using
> >>>> --bpf-counters?
> >>>>
> >>>> I.e. compare the "expensive time multiplexing of events" with its
> >>>> avoidance by using --bpf-counters.
> >>>>
> >>>> Song, have you performed such measurements?
> >>>
> >>> I have got some measurements with perf-bench-sched-messaging:
> >>>
> >>> The system: x86_64 with 23 cores (46 HT)
> >>>
> >>> The perf-stat command:
> >>> perf stat -e cycles,cycles,instructions,instructions,ref-cycles,ref-cycles <target, etc.>
> >>>
> >>> The benchmark command and output:
> >>> ./perf bench sched messaging -g 40 -l 50000 -t
> >>> # Running 'sched/messaging' benchmark:
> >>> # 20 sender and receiver threads per group
> >>> # 40 groups == 1600 threads run
> >>>     Total time: 10X.XXX [sec]
> >>>
> >>>
> >>> I use the "Total time" as measurement, so smaller number is better.
> >>>
> >>> For each condition, I ran the command 5 times and took the median of
> >>> "Total time".
> >>>
> >>> Baseline (no perf-stat)                     104.873 [sec]
> >>> # global
> >>> perf stat -a                                107.887 [sec]
> >>> perf stat -a --bpf-counters         106.071 [sec]
> >>> # per task
> >>> perf stat                           106.314 [sec]
> >>> perf stat --bpf-counters            105.965 [sec]
> >>> # per cpu
> >>> perf stat -C 1,3,5                  107.063 [sec]
> >>> perf stat -C 1,3,5 --bpf-counters   106.406 [sec]
> >>
> >> I can't see why it's actually faster than normal perf ;-)
> >> would be worth to find out
> >
> > Isn't this all about contended cases?
>
> Yeah, the normal perf is doing time multiplexing; while --bpf-counters
> doesn't need it.

Yep, so for uncontended cases, normal perf should be the same as the
baseline (faster than the bperf).  But for contended cases, the bperf
works faster.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-19  0:54             ` Namhyung Kim
@ 2021-03-19 15:35               ` Arnaldo Carvalho de Melo
  2021-03-19 15:58                 ` Namhyung Kim
  0 siblings, 1 reply; 33+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-19 15:35 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Song Liu, Jiri Olsa, linux-kernel, Kernel Team, Jiri Olsa

Em Fri, Mar 19, 2021 at 09:54:59AM +0900, Namhyung Kim escreveu:
> On Fri, Mar 19, 2021 at 9:22 AM Song Liu <songliubraving@fb.com> wrote:
> > > On Mar 18, 2021, at 5:09 PM, Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
> > >> On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
> > >>> perf stat -C 1,3,5                  107.063 [sec]
> > >>> perf stat -C 1,3,5 --bpf-counters   106.406 [sec]

> > >> I can't see why it's actually faster than normal perf ;-)
> > >> would be worth to find out

> > > Isn't this all about contended cases?

> > Yeah, the normal perf is doing time multiplexing; while --bpf-counters
> > doesn't need it.

> Yep, so for uncontended cases, normal perf should be the same as the
> baseline (faster than the bperf).  But for contended cases, the bperf
> works faster.

The difference should be small enough that for people that use this in a
machine where contention happens most of the time, setting a
~/.perfconfig to use it by default should be advantageous, i.e. no need
to use --bpf-counters on the command line all the time.

So, Namhyung, can I take that as an Acked-by or a Reviewed-by? I'll take
a look again now but I want to have this merged on perf/core so that I
can work on a new BPF SKEL to use this:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.bpf/bpf_perf_enable

:-)

- Arnaldo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-19 15:35               ` Arnaldo Carvalho de Melo
@ 2021-03-19 15:58                 ` Namhyung Kim
  2021-03-19 16:14                   ` Song Liu
  0 siblings, 1 reply; 33+ messages in thread
From: Namhyung Kim @ 2021-03-19 15:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Song Liu, Jiri Olsa, linux-kernel, Kernel Team, Jiri Olsa

Hi Arnaldo,

On Sat, Mar 20, 2021 at 12:35 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Fri, Mar 19, 2021 at 09:54:59AM +0900, Namhyung Kim escreveu:
> > On Fri, Mar 19, 2021 at 9:22 AM Song Liu <songliubraving@fb.com> wrote:
> > > > On Mar 18, 2021, at 5:09 PM, Arnaldo <arnaldo.melo@gmail.com> wrote:
> > > > On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
> > > >> On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
> > > >>> perf stat -C 1,3,5                  107.063 [sec]
> > > >>> perf stat -C 1,3,5 --bpf-counters   106.406 [sec]
>
> > > >> I can't see why it's actually faster than normal perf ;-)
> > > >> would be worth to find out
>
> > > > Isn't this all about contended cases?
>
> > > Yeah, the normal perf is doing time multiplexing; while --bpf-counters
> > > doesn't need it.
>
> > Yep, so for uncontended cases, normal perf should be the same as the
> > baseline (faster than the bperf).  But for contended cases, the bperf
> > works faster.
>
> The difference should be small enough that for people that use this in a
> machine where contention happens most of the time, setting a
> ~/.perfconfig to use it by default should be advantageous, i.e. no need
> to use --bpf-counters on the command line all the time.
>
> So, Namhyung, can I take that as an Acked-by or a Reviewed-by? I'll take
> a look again now but I want to have this merged on perf/core so that I
> can work on a new BPF SKEL to use this:

I have a concern for the per cpu target, but it can be done later, so

Acked-by: Namhyung Kim <namhyung@kernel.org>

>
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.bpf/bpf_perf_enable

Interesting!  Actually, I was thinking about something similar too. :)

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-19 15:58                 ` Namhyung Kim
@ 2021-03-19 16:14                   ` Song Liu
  2021-03-23 21:10                     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 33+ messages in thread
From: Song Liu @ 2021-03-19 16:14 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel, Kernel Team,
	Jiri Olsa



> On Mar 19, 2021, at 8:58 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> 
> Hi Arnaldo,
> 
> On Sat, Mar 20, 2021 at 12:35 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
>> 
>> Em Fri, Mar 19, 2021 at 09:54:59AM +0900, Namhyung Kim escreveu:
>>> On Fri, Mar 19, 2021 at 9:22 AM Song Liu <songliubraving@fb.com> wrote:
>>>>> On Mar 18, 2021, at 5:09 PM, Arnaldo <arnaldo.melo@gmail.com> wrote:
>>>>> On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
>>>>>> On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
>>>>>>> perf stat -C 1,3,5                  107.063 [sec]
>>>>>>> perf stat -C 1,3,5 --bpf-counters   106.406 [sec]
>> 
>>>>>> I can't see why it's actually faster than normal perf ;-)
>>>>>> would be worth to find out
>> 
>>>>> Isn't this all about contended cases?
>> 
>>>> Yeah, the normal perf is doing time multiplexing; while --bpf-counters
>>>> doesn't need it.
>> 
>>> Yep, so for uncontended cases, normal perf should be the same as the
>>> baseline (faster than the bperf).  But for contended cases, the bperf
>>> works faster.
>> 
>> The difference should be small enough that for people that use this in a
>> machine where contention happens most of the time, setting a
>> ~/.perfconfig to use it by default should be advantageous, i.e. no need
>> to use --bpf-counters on the command line all the time.
>> 
>> So, Namhyung, can I take that as an Acked-by or a Reviewed-by? I'll take
>> a look again now but I want to have this merged on perf/core so that I
>> can work on a new BPF SKEL to use this:
> 
> I have a concern for the per cpu target, but it can be done later, so
> 
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> 
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.bpf/bpf_perf_enable
> 
> Interesting!  Actually I was thinking about the similar too. :)

Hi Namhyung, Jiri, and Arnaldo,

Thanks a lot for your kind review. 

Here is the updated 3/3, where we use perf-bench instead of stressapptest.

Thanks,
Song


From cc79d161be9c9d24198f7e35b50058a6e15076fd Mon Sep 17 00:00:00 2001
From: Song Liu <songliubraving@fb.com>
Date: Tue, 16 Mar 2021 00:19:53 -0700
Subject: [PATCH v3 3/3] perf-test: add a test for perf-stat --bpf-counters
 option

Add a test to compare the output of perf-stat with and without the
--bpf-counters option. If the two cycle counts differ by more than 10%,
the test fails.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/tests/shell/stat_bpf_counters.sh | 31 +++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100755 tools/perf/tests/shell/stat_bpf_counters.sh

diff --git a/tools/perf/tests/shell/stat_bpf_counters.sh b/tools/perf/tests/shell/stat_bpf_counters.sh
new file mode 100755
index 0000000000000..7aabf177ce8d1
--- /dev/null
+++ b/tools/perf/tests/shell/stat_bpf_counters.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+# perf stat --bpf-counters test
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+# check whether $2 is within +/- 10% of $1
+compare_number()
+{
+       first_num=$1
+       second_num=$2
+
+       # upper bound is first_num * 110%
+       upper=$(( $first_num + $first_num / 10 ))
+       # lower bound is first_num * 90%
+       lower=$(( $first_num - $first_num / 10 ))
+
+       if [ $second_num -gt $upper ] || [ $second_num -lt $lower ]; then
+               echo "The difference between $first_num and $second_num is greater than 10%."
+               exit 1
+       fi
+}
+
+# skip if --bpf-counters is not supported
+perf stat --bpf-counters true > /dev/null 2>&1 || exit 2
+
+base_cycles=$(perf stat --no-big-num -e cycles -- perf bench sched messaging -g 1 -l 100 -t 2>&1 | awk '/cycles/ {print $1}')
+bpf_cycles=$(perf stat --no-big-num --bpf-counters -e cycles -- perf bench sched messaging -g 1 -l 100 -t 2>&1 | awk '/cycles/ {print $1}')
+
+compare_number $base_cycles $bpf_cycles
+exit 0
--
2.30.2
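
Editorial aside: the tolerance check in compare_number() above can be tried
in isolation. The following is a minimal sketch, not part of the patch.
within_10_percent is a hypothetical name; unlike compare_number() it returns
a status instead of calling exit, and it rejects empty or non-numeric input
(which the script could see if awk matched no "cycles" line).

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch.
# Returns 0 when $2 lies within +/- 10% of $1, 1 when it does not,
# and 2 when either argument is empty or not a non-negative integer.
within_10_percent()
{
        # Reject anything that contains a non-digit character.
        case "$1$2" in
        ''|*[!0-9]*) return 2 ;;
        esac
        # Reject the case where only one of the two arguments is empty.
        [ -n "$1" ] && [ -n "$2" ] || return 2

        # Same integer arithmetic as compare_number() in the patch.
        upper=$(( $1 + $1 / 10 ))
        lower=$(( $1 - $1 / 10 ))

        [ "$2" -le "$upper" ] && [ "$2" -ge "$lower" ]
}
```

For example, "within_10_percent 1000 1100" succeeds and
"within_10_percent 1000 1101" does not, because the integer division makes
the upper bound exactly 1100.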




* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-18 21:15   ` Jiri Olsa
@ 2021-03-19 18:41     ` Arnaldo Carvalho de Melo
  2021-03-19 18:55       ` Jiri Olsa
                         ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-19 18:41 UTC (permalink / raw)
  To: Song Liu, Jiri Olsa; +Cc: linux-kernel, kernel-team, acme, namhyung, jolsa

On Thu, Mar 18, 2021 at 10:15:13PM +0100, Jiri Olsa wrote:
> On Tue, Mar 16, 2021 at 02:18:35PM -0700, Song Liu wrote:
> > bperf is off by default. To enable it, pass --bpf-counters option to
> > perf-stat. bperf uses a BPF hashmap to share information about BPF
> > programs and maps used by bperf. This map is pinned to bpffs. The default
> > path is /sys/fs/bpf/perf_attr_map. The user could change the path with
> > option --bpf-attr-map.
> > 
> > Signed-off-by: Song Liu <songliubraving@fb.com>
> 
> Reviewed-by: Jiri Olsa <jolsa@redhat.com>

After applying just this first patch in the series, I'm getting the build
failure below after a 'make -C tools/ clean'. I'm now checking whether I
need a newer clang. Any ideas?

- Arnaldo

[acme@quaco perf]$ make O=/tmp/build/perf -C tools/perf BUILD_BPF_SKEL=1 PYTHON=python3 install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
  BUILD:   Doing 'make -j8' parallel build
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Auto-detecting system features:
...                         dwarf: [ on  ]
...            dwarf_getlocations: [ on  ]
...                         glibc: [ on  ]
...                        libbfd: [ on  ]
...                libbfd-buildid: [ on  ]
...                        libcap: [ on  ]
...                        libelf: [ on  ]
...                       libnuma: [ on  ]
...        numa_num_possible_cpus: [ on  ]
...                       libperl: [ on  ]
...                     libpython: [ on  ]
...                     libcrypto: [ on  ]
...                     libunwind: [ on  ]
...            libdw-dwarf-unwind: [ on  ]
...                          zlib: [ on  ]
...                          lzma: [ on  ]
...                     get_cpuid: [ on  ]
...                           bpf: [ on  ]
...                        libaio: [ on  ]
...                       libzstd: [ on  ]
...        disassembler-four-args: [ on  ]

  GEN      /tmp/build/perf/common-cmds.h
  CC       /tmp/build/perf/exec-cmd.o
  MKDIR    /tmp/build/perf/fd/
  MKDIR    /tmp/build/perf/fs/
  CC       /tmp/build/perf/fs/fs.o
  CC       /tmp/build/perf/event-parse.o
  CC       /tmp/build/perf/fd/array.o
  CC       /tmp/build/perf/core.o
  GEN      /tmp/build/perf/bpf_helper_defs.h
  CC       /tmp/build/perf/event-plugin.o
  MKDIR    /tmp/build/perf/staticobjs/
  PERF_VERSION = 5.12.rc2.g3df07f57f205
  CC       /tmp/build/perf/staticobjs/libbpf.o
  CC       /tmp/build/perf/cpu.o
  LD       /tmp/build/perf/fd/libapi-in.o
  CC       /tmp/build/perf/cpumap.o
  CC       /tmp/build/perf/help.o
  MKDIR    /tmp/build/perf/fs/
  CC       /tmp/build/perf/fs/tracing_path.o
  CC       /tmp/build/perf/fs/cgroup.o
  CC       /tmp/build/perf/trace-seq.o
  CC       /tmp/build/perf/pager.o
  CC       /tmp/build/perf/parse-options.o
  LD       /tmp/build/perf/fs/libapi-in.o
  CC       /tmp/build/perf/debug.o
  CC       /tmp/build/perf/str_error_r.o
  CC       /tmp/build/perf/run-command.o
  CC       /tmp/build/perf/sigchain.o
  LD       /tmp/build/perf/libapi-in.o
  AR       /tmp/build/perf/libapi.a
  CC       /tmp/build/perf/subcmd-config.o
  CC       /tmp/build/perf/threadmap.o
  CC       /tmp/build/perf/evsel.o
  CC       /tmp/build/perf/parse-filter.o
  MKDIR    /tmp/build/perf/staticobjs/
  CC       /tmp/build/perf/staticobjs/bpf.o
  CC       /tmp/build/perf/evlist.o
  CC       /tmp/build/perf/parse-utils.o
  CC       /tmp/build/perf/kbuffer-parse.o
  CC       /tmp/build/perf/tep_strerror.o
  CC       /tmp/build/perf/mmap.o
  CC       /tmp/build/perf/zalloc.o
  CC       /tmp/build/perf/event-parse-api.o
  LD       /tmp/build/perf/libsubcmd-in.o
  AR       /tmp/build/perf/libsubcmd.a
  CC       /tmp/build/perf/xyarray.o
  LD       /tmp/build/perf/libtraceevent-in.o
  LINK     /tmp/build/perf/libtraceevent.a
  CC       /tmp/build/perf/staticobjs/nlattr.o
  CC       /tmp/build/perf/staticobjs/btf.o
  CC       /tmp/build/perf/lib.o
  CC       /tmp/build/perf/staticobjs/libbpf_errno.o
  CC       /tmp/build/perf/staticobjs/str_error.o
  CC       /tmp/build/perf/staticobjs/netlink.o
  CC       /tmp/build/perf/staticobjs/bpf_prog_linfo.o
  CC       /tmp/build/perf/staticobjs/libbpf_probes.o
  LD       /tmp/build/perf/libperf-in.o
  AR       /tmp/build/perf/libperf.a
  MKDIR    /tmp/build/perf/pmu-events/
  HOSTCC   /tmp/build/perf/pmu-events/json.o
  CC       /tmp/build/perf/plugin_jbd2.o
  CC       /tmp/build/perf/staticobjs/xsk.o
  MKDIR    /tmp/build/perf/pmu-events/
  HOSTCC   /tmp/build/perf/pmu-events/jsmn.o
  CC       /tmp/build/perf/staticobjs/hashmap.o
  LD       /tmp/build/perf/plugin_jbd2-in.o
  CC       /tmp/build/perf/staticobjs/btf_dump.o
  CC       /tmp/build/perf/plugin_hrtimer.o
  HOSTCC   /tmp/build/perf/pmu-events/jevents.o
  LD       /tmp/build/perf/plugin_hrtimer-in.o
  CC       /tmp/build/perf/plugin_kmem.o
  CC       /tmp/build/perf/staticobjs/ringbuf.o
  LD       /tmp/build/perf/plugin_kmem-in.o
  CC       /tmp/build/perf/plugin_kvm.o
  HOSTLD   /tmp/build/perf/pmu-events/jevents-in.o
  CC       /tmp/build/perf/perf-read-vdso32
  CC       /tmp/build/perf/plugin_mac80211.o
  LD       /tmp/build/perf/plugin_kvm-in.o
  CC       /tmp/build/perf/plugin_sched_switch.o
  CC       /tmp/build/perf/plugin_function.o
  MKDIR    /tmp/build/perf/jvmti/
  CC       /tmp/build/perf/jvmti/libjvmti.o
  LD       /tmp/build/perf/plugin_mac80211-in.o
  GEN      perf-archive
  LD       /tmp/build/perf/plugin_sched_switch-in.o
  CC       /tmp/build/perf/plugin_futex.o
  CC       /tmp/build/perf/plugin_xen.o
  CC       /tmp/build/perf/plugin_scsi.o
  LD       /tmp/build/perf/plugin_function-in.o
  CC       /tmp/build/perf/plugin_cfg80211.o
  LD       /tmp/build/perf/plugin_futex-in.o
  LD       /tmp/build/perf/plugin_xen-in.o
  LINK     /tmp/build/perf/plugin_jbd2.so
  LINK     /tmp/build/perf/plugin_hrtimer.so
  CC       /tmp/build/perf/plugin_tlb.o
  LINK     /tmp/build/perf/plugin_kmem.so
  LINK     /tmp/build/perf/plugin_kvm.so
  LD       /tmp/build/perf/plugin_scsi-in.o
  LINK     /tmp/build/perf/plugin_mac80211.so
  LINK     /tmp/build/perf/plugin_sched_switch.so
  LINK     /tmp/build/perf/plugin_function.so
  LINK     /tmp/build/perf/plugin_futex.so
  LD       /tmp/build/perf/plugin_cfg80211-in.o
  LINK     /tmp/build/perf/plugin_xen.so
  LINK     /tmp/build/perf/plugin_scsi.so
  LD       /tmp/build/perf/plugin_tlb-in.o
  MKDIR    /tmp/build/perf/jvmti/
  LINK     /tmp/build/perf/plugin_cfg80211.so
  CC       /tmp/build/perf/jvmti/jvmti_agent.o
  LINK     /tmp/build/perf/plugin_tlb.so
  GEN      perf-with-kcore
  CC       /tmp/build/perf/jvmti/libstring.o
CFLAGS= make -C ../bpf/bpftool \
	OUTPUT=/tmp/build/perf/util/bpf_skel/.tmp/ bootstrap
  CC       /tmp/build/perf/jvmti/libctype.o
  GEN      /tmp/build/perf/libtraceevent-dynamic-list
  LINK     /tmp/build/perf/pmu-events/jevents
  DESCEND  plugins
  GEN      /tmp/build/perf/python/perf.so
  GEN      /tmp/build/perf/pmu-events/pmu-events.c
  CC       /tmp/build/perf/plugins/plugin_jbd2.o
  CC       /tmp/build/perf/plugins/plugin_hrtimer.o
  LD       /tmp/build/perf/plugins/plugin_jbd2-in.o
  LD       /tmp/build/perf/plugins/plugin_hrtimer-in.o
  CC       /tmp/build/perf/plugins/plugin_kmem.o
  CC       /tmp/build/perf/plugins/plugin_kvm.o
  LD       /tmp/build/perf/jvmti/jvmti-in.o
  LD       /tmp/build/perf/plugins/plugin_kmem-in.o
  LINK     /tmp/build/perf/libperf-jvmti.so
  CC       /tmp/build/perf/plugins/plugin_mac80211.o
  CC       /tmp/build/perf/plugins/plugin_sched_switch.o
  CC       /tmp/build/perf/pmu-events/pmu-events.o
  LD       /tmp/build/perf/plugins/plugin_kvm-in.o
  CC       /tmp/build/perf/plugins/plugin_function.o
  LD       /tmp/build/perf/plugins/plugin_mac80211-in.o
  CC       /tmp/build/perf/plugins/plugin_futex.o
  LD       /tmp/build/perf/plugins/plugin_sched_switch-in.o
  CC       /tmp/build/perf/plugins/plugin_xen.o
  LD       /tmp/build/perf/plugins/plugin_function-in.o
  CC       /tmp/build/perf/plugins/plugin_scsi.o
  LD       /tmp/build/perf/plugins/plugin_futex-in.o
  CC       /tmp/build/perf/plugins/plugin_cfg80211.o
  LD       /tmp/build/perf/plugins/plugin_xen-in.o
  CC       /tmp/build/perf/plugins/plugin_tlb.o
  LD       /tmp/build/perf/plugins/plugin_scsi-in.o
  LD       /tmp/build/perf/plugins/plugin_cfg80211-in.o
  LINK     /tmp/build/perf/plugins/plugin_jbd2.so
  LINK     /tmp/build/perf/plugins/plugin_hrtimer.so
  LINK     /tmp/build/perf/plugins/plugin_kmem.so
  LINK     /tmp/build/perf/plugins/plugin_kvm.so
  LINK     /tmp/build/perf/plugins/plugin_mac80211.so
  LD       /tmp/build/perf/plugins/plugin_tlb-in.o
  LINK     /tmp/build/perf/plugins/plugin_function.so
  LINK     /tmp/build/perf/plugins/plugin_sched_switch.so
  LINK     /tmp/build/perf/plugins/plugin_futex.so
  LINK     /tmp/build/perf/plugins/plugin_xen.so
  LINK     /tmp/build/perf/plugins/plugin_scsi.so
  LINK     /tmp/build/perf/plugins/plugin_cfg80211.so
  LINK     /tmp/build/perf/plugins/plugin_tlb.so
  INSTALL  trace_plugins
  LD       /tmp/build/perf/pmu-events/pmu-events-in.o

Auto-detecting system features:
...                        libbfd: [ on  ]
...        disassembler-four-args: [ on  ]
...                          zlib: [ on  ]
...                        libcap: [ on  ]
...               clang-bpf-co-re: [ on  ]

  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/main.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/common.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/json_writer.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/gen.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/btf.o
  GEN      /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/bpf_helper_defs.h
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/bpf.o
  MKDIR    /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/nlattr.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_errno.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/str_error.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/netlink.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/bpf_prog_linfo.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf_probes.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/xsk.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/hashmap.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/btf_dump.o
  CC       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/ringbuf.o
  LD       /tmp/build/perf/staticobjs/libbpf-in.o
  LINK     /tmp/build/perf/libbpf.a
  CLANG    /tmp/build/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o
  CLANG    /tmp/build/perf/util/bpf_skel/.tmp/bperf_leader.bpf.o
  CLANG    /tmp/build/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o
  LD       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf-in.o
  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/libbpf.a
  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/bpftool
  GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
  GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h
  GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h
libbpf: map 'prev_readings': unexpected def kind var.
Error: failed to open BPF object file: Invalid argument
libbpf: map 'diff_readings': unexpected def kind var.
Error: failed to open BPF object file: Invalid argument
make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h] Error 255
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h] Error 255
make[1]: *** [Makefile.perf:236: sub-make] Error 2
make: *** [Makefile:110: install-bin] Error 2
make: Leaving directory '/home/acme/git/perf/tools/perf'
[acme@quaco perf]$ clang -v
clang version 11.0.0 (https://github.com/llvm/llvm-project 67420f1b0e9c673ee638f2680fa83f468019004f)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
[acme@quaco perf]$


* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-19 18:41     ` Arnaldo Carvalho de Melo
@ 2021-03-19 18:55       ` Jiri Olsa
  2021-03-19 22:06         ` Song Liu
  2021-03-23  0:53       ` Song Liu
  2021-03-23 12:25       ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 33+ messages in thread
From: Jiri Olsa @ 2021-03-19 18:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Song Liu, linux-kernel, kernel-team, acme, namhyung, jolsa

On Fri, Mar 19, 2021 at 03:41:57PM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

>   LD       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf-in.o
>   LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/libbpf.a
>   LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/bpftool
>   GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
>   GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h
>   GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h
> libbpf: map 'prev_readings': unexpected def kind var.
> Error: failed to open BPF object file: Invalid argument
> libbpf: map 'diff_readings': unexpected def kind var.
> Error: failed to open BPF object file: Invalid argument

I get a clean build with the same options.
Could you please send the same output again, built with 'JOBS=1 V=1'?


> make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h] Error 255
> make[2]: *** Waiting for unfinished jobs....
> make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h] Error 255
> make[1]: *** [Makefile.perf:236: sub-make] Error 2
> make: *** [Makefile:110: install-bin] Error 2
> make: Leaving directory '/home/acme/git/perf/tools/perf'
> [acme@quaco perf]$ clang -v
> clang version 11.0.0 (https://github.com/llvm/llvm-project 67420f1b0e9c673ee638f2680fa83f468019004f)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: /usr/local/bin
> Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
> Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
> Candidate multilib: .;@m64
> Candidate multilib: 32;@m32
> Selected multilib: .;@m64
> [acme@quaco perf]$
> 

I have:

[jolsa@dell-r440-01 linux-perf]$ clang --version
clang version 11.0.0 (Fedora 11.0.0-2.fc33)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin


jirka



* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-19 18:55       ` Jiri Olsa
@ 2021-03-19 22:06         ` Song Liu
  0 siblings, 0 replies; 33+ messages in thread
From: Song Liu @ 2021-03-19 22:06 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Kernel Team,
	Arnaldo Carvalho de Melo, Namhyung Kim, jolsa



> On Mar 19, 2021, at 11:55 AM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Fri, Mar 19, 2021 at 03:41:57PM -0300, Arnaldo Carvalho de Melo wrote:
> 
> SNIP
> 
>>  LD       /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/staticobjs/libbpf-in.o
>>  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/libbpf/libbpf.a
>>  LINK     /tmp/build/perf/util/bpf_skel/.tmp//bootstrap/bpftool
>>  GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
>>  GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h
>>  GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h
>> libbpf: map 'prev_readings': unexpected def kind var.
>> Error: failed to open BPF object file: Invalid argument
>> libbpf: map 'diff_readings': unexpected def kind var.
>> Error: failed to open BPF object file: Invalid argument
> 
> I get a clean build with the same options.
> Could you please send the same output again, built with 'JOBS=1 V=1'?
> 
> 
>> make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h] Error 255
>> make[2]: *** Waiting for unfinished jobs....
>> make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h] Error 255
>> make[1]: *** [Makefile.perf:236: sub-make] Error 2
>> make: *** [Makefile:110: install-bin] Error 2
>> make: Leaving directory '/home/acme/git/perf/tools/perf'
>> [acme@quaco perf]$ clang -v
>> clang version 11.0.0 (https://github.com/llvm/llvm-project 67420f1b0e9c673ee638f2680fa83f468019004f)
>> Target: x86_64-unknown-linux-gnu
>> Thread model: posix
>> InstalledDir: /usr/local/bin
>> Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
>> Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
>> Candidate multilib: .;@m64
>> Candidate multilib: 32;@m32
>> Selected multilib: .;@m64
>> [acme@quaco perf]$
>> 
> 
> I have:
> 
> [jolsa@dell-r440-01 linux-perf]$ clang --version
> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: /usr/bin

I am not able to repro this error either. I tried two versions of clang:

clang version 11.0.0 (Red Hat 11.0.0-0.2.rc2.module_el8.4.0+533+50191577)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /bin

clang version 12.0.0 (https://github.com/llvm/llvm-project.git 07f1e1f44c87d1ee84caf13d6e5aa64eb7e1b068)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin

Thanks,
Song



* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-19 18:41     ` Arnaldo Carvalho de Melo
  2021-03-19 18:55       ` Jiri Olsa
@ 2021-03-23  0:53       ` Song Liu
  2021-03-23 12:25       ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 33+ messages in thread
From: Song Liu @ 2021-03-23  0:53 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, linux-kernel, Kernel Team, Arnaldo Carvalho de Melo,
	Namhyung Kim, jolsa



> On Mar 19, 2021, at 11:41 AM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> On Thu, Mar 18, 2021 at 10:15:13PM +0100, Jiri Olsa wrote:
>> On Tue, Mar 16, 2021 at 02:18:35PM -0700, Song Liu wrote:
>>> bperf is off by default. To enable it, pass --bpf-counters option to
>>> perf-stat. bperf uses a BPF hashmap to share information about BPF
>>> programs and maps used by bperf. This map is pinned to bpffs. The default
>>> path is /sys/fs/bpf/perf_attr_map. The user could change the path with
>>> option --bpf-attr-map.
>>> 
>>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> 
>> Reviewed-by: Jiri Olsa <jolsa@redhat.com>
> 
> After applying just this first patch in the series, I'm getting the build
> failure below after a 'make -C tools/ clean'. I'm now checking whether I
> need a newer clang. Any ideas?
> 
> - Arnaldo

Hi Arnaldo, 

Are you still getting this error? 

Thanks,
Song



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-19 18:41     ` Arnaldo Carvalho de Melo
  2021-03-19 18:55       ` Jiri Olsa
  2021-03-23  0:53       ` Song Liu
@ 2021-03-23 12:25       ` Arnaldo Carvalho de Melo
  2021-03-23 12:37         ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 33+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-23 12:25 UTC (permalink / raw)
  To: Song Liu, Jiri Olsa; +Cc: linux-kernel, kernel-team, acme, namhyung, jolsa

On Fri, Mar 19, 2021 at 03:41:57PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Mar 18, 2021 at 10:15:13PM +0100, Jiri Olsa wrote:
> > On Tue, Mar 16, 2021 at 02:18:35PM -0700, Song Liu wrote:
> > > bperf is off by default. To enable it, pass --bpf-counters option to
> > > perf-stat. bperf uses a BPF hashmap to share information about BPF
> > > programs and maps used by bperf. This map is pinned to bpffs. The default
> > > path is /sys/fs/bpf/perf_attr_map. The user could change the path with
> > > option --bpf-attr-map.
> > > 
> > > Signed-off-by: Song Liu <songliubraving@fb.com>
> > 
> > Reviewed-by: Jiri Olsa <jolsa@redhat.com>
> 
> After applying just this first patch in the series I'm getting this
> after a 'make -C tools/ clean', now I'm checking if I need some new
> clang, ideas?

It works now with the clang from Fedora 33; I had been using an older,
locally built one. Now I get this when running as non-root, which is
expected, but we need to improve the wording:

[acme@five perf]$ perf stat --bpf-counters sleep 1
Failed to lock perf_event_attr map
[acme@five perf]$
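
As an illustration of what a more helpful wording could look like, here is a
minimal sketch. It assumes the caller knows the pinned map path and the errno
of the failing call; format_attr_map_error() and the hint text are
hypothetical and do not come from the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): build an error message that
 * names the pinned map path and the errno text, plus a hint for permission
 * failures. Returns what snprintf() returns.
 */
static int format_attr_map_error(char *buf, size_t len, const char *path, int err)
{
	const char *hint = "";

	/* Assumed hint: permission errors usually mean missing privileges. */
	if (err == EPERM || err == EACCES)
		hint = " (try running as root, or check permissions on bpffs)";

	return snprintf(buf, len, "Failed to lock perf_event_attr map %s: %s%s",
			path, strerror(err), hint);
}
```

Called with the default path /sys/fs/bpf/perf_attr_map and EPERM, this yields
a single line that says which file was involved and why the call failed.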
 
> - Arnaldo
> 
> [acme@quaco perf]$ make O=/tmp/build/perf -C tools/perf BUILD_BPF_SKEL=1 PYTHON=python3 install-bin
> make: Entering directory '/home/acme/git/perf/tools/perf'
> [...]
>   GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
>   GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h
>   GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h
> libbpf: map 'prev_readings': unexpected def kind var.
> Error: failed to open BPF object file: Invalid argument
> libbpf: map 'diff_readings': unexpected def kind var.
> Error: failed to open BPF object file: Invalid argument
> make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h] Error 255
> make[2]: *** Waiting for unfinished jobs....
> make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h] Error 255
> make[1]: *** [Makefile.perf:236: sub-make] Error 2
> make: *** [Makefile:110: install-bin] Error 2
> make: Leaving directory '/home/acme/git/perf/tools/perf'
> [acme@quaco perf]$ clang -v
> clang version 11.0.0 (https://github.com/llvm/llvm-project 67420f1b0e9c673ee638f2680fa83f468019004f)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: /usr/local/bin
> Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
> Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
> Candidate multilib: .;@m64
> Candidate multilib: 32;@m32
> Selected multilib: .;@m64
> [acme@quaco perf]$

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-23 12:25       ` Arnaldo Carvalho de Melo
@ 2021-03-23 12:37         ` Arnaldo Carvalho de Melo
  2021-03-23 18:27           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 33+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-23 12:37 UTC (permalink / raw)
  To: Song Liu, Jiri Olsa; +Cc: linux-kernel, kernel-team, acme, namhyung, jolsa

On Tue, Mar 23, 2021 at 09:25:52AM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Mar 19, 2021 at 03:41:57PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, Mar 18, 2021 at 10:15:13PM +0100, Jiri Olsa wrote:
> > > On Tue, Mar 16, 2021 at 02:18:35PM -0700, Song Liu wrote:
> > > > bperf is off by default. To enable it, pass --bpf-counters option to
> > > > perf-stat. bperf uses a BPF hashmap to share information about BPF
> > > > programs and maps used by bperf. This map is pinned to bpffs. The default
> > > > path is /sys/fs/bpf/perf_attr_map. The user could change the path with
> > > > option --bpf-attr-map.
> > > > 
> > > > Signed-off-by: Song Liu <songliubraving@fb.com>
> > > 
> > > Reviewed-by: Jiri Olsa <jolsa@redhat.com>
> > 
> > After applying just this first patch in the series I'm getting this
> > after a 'make -C tools/ clean', now I'm checking if I need some new
> > clang, ideas?
> 
> Works now with clang from fedora 33, I was using a locally built, older,
> now I get this when trying as non-root, expected, but we need to improve
> the wording.

Fails as root as well, investigating:

[root@five ~]# ls -lad /sys/fs/bpf/
drwx-----T. 2 root root 0 Mar 23 06:03 /sys/fs/bpf/
[root@five ~]# strace -e bpf perf stat --bpf-counters sleep 1
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=120, value_size=8, max_entries=16, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = -1 EPERM (Operation not permitted)
Failed to lock perf_event_attr map
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=13916, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
+++ exited with 255 +++
[root@five ~]#
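
One thing worth ruling out for an EPERM from BPF_MAP_CREATE as root: kernels
before 5.11 charge BPF map memory against RLIMIT_MEMLOCK and return EPERM when
the limit is exceeded. That this is the cause here is an assumption, not
something the trace above confirms. A minimal sketch for checking and raising
the limit before creating maps, with hypothetical helper names:

```c
#include <stdio.h>
#include <sys/resource.h>

/* Returns 1 if the soft RLIMIT_MEMLOCK is unlimited, 0 if finite, -1 on error. */
static int memlock_is_unlimited(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl))
		return -1;
	return rl.rlim_cur == RLIM_INFINITY;
}

/*
 * Try to lift both the soft and the hard limit. Raising a finite hard limit
 * needs CAP_SYS_RESOURCE. Returns 0 on success, -1 with errno set otherwise.
 */
static int raise_memlock_limit(void)
{
	struct rlimit rl = { RLIM_INFINITY, RLIM_INFINITY };

	return setrlimit(RLIMIT_MEMLOCK, &rl);
}

/* Print the current soft limit, for comparison with 'ulimit -l'. */
static void print_memlock_limit(FILE *out)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl)) {
		fprintf(out, "getrlimit(RLIMIT_MEMLOCK) failed\n");
		return;
	}
	if (rl.rlim_cur == RLIM_INFINITY)
		fprintf(out, "RLIMIT_MEMLOCK: unlimited\n");
	else
		fprintf(out, "RLIMIT_MEMLOCK: %llu bytes\n",
			(unsigned long long)rl.rlim_cur);
}
```

If the map creation succeeds after raise_memlock_limit() returns 0, the limit
was the problem; if it still fails with EPERM, the cause lies elsewhere.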
 
> [acme@five perf]$ perf stat --bpf-counters sleep 1
> Failed to lock perf_event_attr map
> [acme@five perf]$
>  
> > - Arnaldo
> > 
> > [acme@quaco perf]$ make O=/tmp/build/perf -C tools/perf BUILD_BPF_SKEL=1 PYTHON=python3 install-bin
> > make: Entering directory '/home/acme/git/perf/tools/perf'
> > [...]
> >   GEN-SKEL /tmp/build/perf/util/bpf_skel/bpf_prog_profiler.skel.h
> >   GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h
> >   GEN-SKEL /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h
> > libbpf: map 'prev_readings': unexpected def kind var.
> > Error: failed to open BPF object file: Invalid argument
> > libbpf: map 'diff_readings': unexpected def kind var.
> > Error: failed to open BPF object file: Invalid argument
> > make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h] Error 255
> > make[2]: *** Waiting for unfinished jobs....
> > make[2]: *** [Makefile.perf:1029: /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h] Error 255
> > make[1]: *** [Makefile.perf:236: sub-make] Error 2
> > make: *** [Makefile:110: install-bin] Error 2
> > make: Leaving directory '/home/acme/git/perf/tools/perf'
> > [acme@quaco perf]$ clang -v
> > clang version 11.0.0 (https://github.com/llvm/llvm-project 67420f1b0e9c673ee638f2680fa83f468019004f)
> > Target: x86_64-unknown-linux-gnu
> > Thread model: posix
> > InstalledDir: /usr/local/bin
> > Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
> > Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/10
> > Candidate multilib: .;@m64
> > Candidate multilib: 32;@m32
> > Selected multilib: .;@m64
> > [acme@quaco perf]$
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo


* Re: [PATCH v2 1/3] perf-stat: introduce bperf, share hardware PMCs with BPF
  2021-03-23 12:37         ` Arnaldo Carvalho de Melo
@ 2021-03-23 18:27           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 33+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-23 18:27 UTC (permalink / raw)
  To: Song Liu, Jiri Olsa; +Cc: linux-kernel, kernel-team, acme, namhyung, jolsa

On Tue, Mar 23, 2021 at 09:37:42AM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Mar 23, 2021 at 09:25:52AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Fri, Mar 19, 2021 at 03:41:57PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Thu, Mar 18, 2021 at 10:15:13PM +0100, Jiri Olsa wrote:
> > > > On Tue, Mar 16, 2021 at 02:18:35PM -0700, Song Liu wrote:
> > > > > bperf is off by default. To enable it, pass --bpf-counters option to
> > > > > perf-stat. bperf uses a BPF hashmap to share information about BPF
> > > > > programs and maps used by bperf. This map is pinned to bpffs. The default
> > > > > path is /sys/fs/bpf/perf_attr_map. The user could change the path with
> > > > > option --bpf-attr-map.
> > > > > 
> > > > > Signed-off-by: Song Liu <songliubraving@fb.com>
> > > > 
> > > > Reviewed-by: Jiri Olsa <jolsa@redhat.com>
> > > 
> > > After applying just this first patch in the series I'm getting this
> > > after a 'make -C tools/ clean'. I'm now checking whether I need a
> > > newer clang. Ideas?
> > 
> > Works now with clang from Fedora 33; I was using an older, locally built
> > one. Now I get this when trying as non-root, which is expected, but we
> > need to improve the wording.
> 
> Fails as root as well, investigating:
> 
> [root@five ~]# ls -lad /sys/fs/bpf/
> drwx-----T. 2 root root 0 Mar 23 06:03 /sys/fs/bpf/
> [root@five ~]# strace -e bpf perf stat --bpf-counters sleep 1
> bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=120, value_size=8, max_entries=16, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = -1 EPERM (Operation not permitted)
> Failed to lock perf_event_attr map
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=13916, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
> +++ exited with 255 +++
> [root@five ~]#
>  
> > [acme@five perf]$ perf stat --bpf-counters sleep 1
> > Failed to lock perf_event_attr map
> > [acme@five perf]$
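
Since the attr map is pinned under bpffs, one thing that can be ruled out
quickly when the lock fails is a missing bpffs mount. A minimal sketch,
assuming /proc/mounts-style input; the helper name bpffs_mounts is made up
for illustration and is not something perf provides. It would not by itself
explain the EPERM from BPF_MAP_CREATE shown above.

```shell
#!/bin/sh
# Print the mount point of every filesystem of type "bpf" found in
# /proc/mounts-style input (fields: device, mount point, fstype, ...).
# bpffs_mounts is a hypothetical helper name, not part of perf.
bpffs_mounts() {
    awk '$3 == "bpf" { print $2 }'
}
```

Run as `bpffs_mounts < /proc/mounts`; with the default --bpf-attr-map path
one would expect /sys/fs/bpf to be among the lines printed.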

Now it works, on 5.12-rc2+

[root@five pahole]# perf stat --bpf-counters sleep 1
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame

 Performance counter stats for 'sleep 1':

              0.84 msec task-clock                #    0.001 CPUs utilized
                 3      context-switches          #    3.589 K/sec
                 2      cpu-migrations            #    2.393 K/sec
                67      page-faults               #   80.164 K/sec
         2,760,235      cycles                    #    3.303 GHz                      (88.36%)
           241,055      stalled-cycles-frontend   #    8.73% frontend cycles idle     (45.93%)
         1,251,751      stalled-cycles-backend    #   45.35% backend cycles idle      (65.67%)
         2,340,813      instructions              #    0.85  insn per cycle
                                                  #    0.53  stalled cycles per insn
           459,088      branches                  #  549.289 M/sec
            12,243      branch-misses             #    2.67% of all branches

       1.000928326 seconds time elapsed

       0.000881000 seconds user
       0.000000000 seconds sys


[root@five pahole]# clang --version
clang version 11.0.0 (Fedora 11.0.0-2.fc33)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
[root@five pahole]# uname -a
Linux five 5.12.0-rc2+ #2 SMP Tue Mar 23 12:51:43 -03 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@five pahole]#
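
The strace run that follows is long, so a per-command tally can be easier
to read than the raw trace. A minimal sketch, assuming the trace uses the
same "bpf(BPF_CMD, ...)" line format as the output in this mail; the helper
name bpf_cmd_summary is made up for illustration.

```shell
#!/bin/sh
# Count bpf() calls per command in `strace -e bpf` output read from stdin.
# Lines that are not bpf() calls (libbpf warnings, perf's own output)
# are skipped. Output is "COUNT COMMAND", most frequent first.
# bpf_cmd_summary is a hypothetical helper name, not part of perf or strace.
bpf_cmd_summary() {
    sed -n 's/^bpf(\(BPF_[A-Z0-9_]*\),.*/\1/p' |
        sort | uniq -c | sort -k1,1nr -k2 |
        awk '{ printf "%d %s\n", $1, $2 }'
}
```

One way to use it is to save the trace with
`strace -o trace.txt -e bpf perf stat --bpf-counters sleep 1` and then run
`bpf_cmd_summary < trace.txt`.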


Full BPF setup, lotsa syscalls:

[root@five pahole]# strace -e bpf perf stat --bpf-counters sleep 1
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b39760, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 4
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 120) = 4
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 120) = 4
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\08\0\0\08\0\0\0\t\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=89, btf_log_size=0, btf_log_level=0}, 120) = 4
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 120) = 4
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 4
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254c90, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="test", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 5
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=4, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 5
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=4, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 7
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=4, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 8
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_CGROUP_SOCK, insn_cnt=2, insns=0x7ffda3254c90, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_SOCK_CREATE, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 9
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=4, func_info_rec_size=8, func_info=0x1b7e6c0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 9
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=9}}, 120) = 10
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=10, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=8, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b39760, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 11
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=5, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=11, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=192, next_id=0, open_flags=0}, 120) = 4
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=195, next_id=0, open_flags=0}, 120) = 5
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=4, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=4, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=305}, 120) = 7
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=7, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=5, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 8
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 8
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=32, max_entries=1, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 9
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=5, insns=0x7ffda3254e30, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 10
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 9
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=8, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 9
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=8, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 10
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=8, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 36
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=36, key=0x7ffda3254e50, value=0x7f0a61b70000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=8, func_info_rec_size=8, func_info=0x1b80000, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b75420, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=4}, 120) = 37
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=10, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=37}}, 120) = 38
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b392b0, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 5
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 5
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=5, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 39
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=5, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 40
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=5, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 41
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=5, func_info_rec_size=8, func_info=0x1b78bb0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 42
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=42}}, 120) = 43
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=43, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=41, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b392b0, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 44
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=39, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=44, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=197, next_id=0, open_flags=0}, 120) = 5
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=203, next_id=0, open_flags=0}, 120) = 39
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=5, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=5, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=307}, 120) = 40
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=40, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=39, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 41
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 41
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=41, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 42
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=41, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 43
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=41, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 69
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=69, key=0x7ffda3254e50, value=0x7f0a61b6f000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=41, func_info_rec_size=8, func_info=0x1b805c0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b78100, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=5}, 120) = 70
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=43, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=70}}, 120) = 71
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b36840, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 39
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 39
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=39, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 72
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=39, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 73
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=39, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 74
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=39, func_info_rec_size=8, func_info=0x1b813d0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 75
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=75}}, 120) = 76
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=76, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=74, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b36840, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 77
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=72, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=77, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=201, next_id=0, open_flags=0}, 120) = 39
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=209, next_id=0, open_flags=0}, 120) = 72
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=39, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=39, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=309}, 120) = 73
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=73, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=72, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 74
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 74
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=74, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 75
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=74, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 76
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=74, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 102
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=102, key=0x7ffda3254e50, value=0x7f0a61b6e000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=74, func_info_rec_size=8, func_info=0x1b79b40, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b786a0, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=39}, 120) = 103
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=76, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=103}}, 120) = 104
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b36a70, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 72
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 72
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=72, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 105
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=72, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 106
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=72, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 107
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=72, func_info_rec_size=8, func_info=0x1b75a70, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 108
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=108}}, 120) = 109
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=109, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=107, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b36a70, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 110
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=105, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=110, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=205, next_id=0, open_flags=0}, 120) = 72
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=215, next_id=0, open_flags=0}, 120) = 105
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=72, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=72, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=311}, 120) = 106
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=106, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=105, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 107
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 107
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=107, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 108
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=107, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 109
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=107, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 135
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=135, key=0x7ffda3254e50, value=0x7f0a61b6d000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=107, func_info_rec_size=8, func_info=0x1b7b310, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b7b740, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=72}, 120) = 136
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=109, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=136}}, 120) = 137
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b3da70, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 105
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 105
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=105, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 138
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=105, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 139
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=105, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 140
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=105, func_info_rec_size=8, func_info=0x1b7a670, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 141
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=141}}, 120) = 142
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=142, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=140, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b3da70, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 143
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=138, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=143, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=209, next_id=0, open_flags=0}, 120) = 105
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=221, next_id=0, open_flags=0}, 120) = 138
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=105, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=105, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=313}, 120) = 139
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=139, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=138, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 140
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 140
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=140, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 141
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=140, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 142
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=140, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 168
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=168, key=0x7ffda3254e50, value=0x7f0a61b6c000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=140, func_info_rec_size=8, func_info=0x1b3ec60, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b3dd00, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=105}, 120) = 169
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=142, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=169}}, 120) = 170
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b7ee10, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 138
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 138
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=138, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 171
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=138, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 172
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=138, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 173
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=138, func_info_rec_size=8, func_info=0x1b7a850, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 174
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=174}}, 120) = 175
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=175, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=173, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b7ee10, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 176
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=171, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=176, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=213, next_id=0, open_flags=0}, 120) = 138
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=227, next_id=0, open_flags=0}, 120) = 171
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=138, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=138, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=315}, 120) = 172
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=172, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=171, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 173
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 173
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=173, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 174
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=173, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 175
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=173, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 201
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=201, key=0x7ffda3254e50, value=0x7f0a61b6b000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=173, func_info_rec_size=8, func_info=0x1b3fc70, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b3e4c0, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=138}, 120) = 202
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=175, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=202}}, 120) = 203
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b7f0f0, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 171
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 171
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=171, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 204
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=171, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 205
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=171, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 206
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=171, func_info_rec_size=8, func_info=0x1b3e990, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 207
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=207}}, 120) = 208
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=208, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=206, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b7f0f0, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 209
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=204, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=209, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=217, next_id=0, open_flags=0}, 120) = 171
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=233, next_id=0, open_flags=0}, 120) = 204
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=171, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=171, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=317}, 120) = 205
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=205, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=204, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 206
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 206
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=206, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 207
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=206, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 208
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=206, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 234
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=234, key=0x7ffda3254e50, value=0x7f0a61b6a000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=206, func_info_rec_size=8, func_info=0x1b3f8c0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b3fd40, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=171}, 120) = 235
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=208, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=235}}, 120) = 236
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b7f3b0, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 204
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 204
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=204, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 237
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=204, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 238
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=204, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 239
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=204, func_info_rec_size=8, func_info=0x1b40a20, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 240
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=240}}, 120) = 241
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=241, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=239, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b7f3b0, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 242
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=237, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=242, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=221, next_id=0, open_flags=0}, 120) = 204
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=239, next_id=0, open_flags=0}, 120) = 237
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=204, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=204, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=319}, 120) = 238
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=238, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=237, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 239
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 239
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=239, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 240
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=239, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 241
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=239, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 267
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=267, key=0x7ffda3254e50, value=0x7f0a61b69000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=239, func_info_rec_size=8, func_info=0x1b421c0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b41550, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=204}, 120) = 268
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=241, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=268}}, 120) = 269
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b7e8c0, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 237
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 237
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=237, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 270
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=237, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 271
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=237, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 272
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=237, func_info_rec_size=8, func_info=0x1b42200, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 273
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=273}}, 120) = 274
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=274, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=272, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b7e8c0, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 275
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=270, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=275, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=225, next_id=0, open_flags=0}, 120) = 237
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=245, next_id=0, open_flags=0}, 120) = 270
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=237, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=237, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=321}, 120) = 271
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=271, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=270, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 272
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 272
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=272, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 273
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=272, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 274
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=272, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 300
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=300, key=0x7ffda3254e50, value=0x7f0a61b68000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=272, func_info_rec_size=8, func_info=0x1b43810, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b42300, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=237}, 120) = 301
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=274, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=301}}, 120) = 302
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 120) = 3
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x1b76d00, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = -1 ENOENT (No such file or directory)
libbpf: elf: skipping unrecognized data section(7) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(7) .eh_frame
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 270
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0D\2\0\0D\2\0\0\356\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1354, btf_log_size=0, btf_log_level=0}, 120) = 270
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=24, map_flags=BPF_F_PRESERVE_ELEMS, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=270, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 303
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="prev_readings", map_ifindex=0, btf_fd=270, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 304
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="diff_readings", map_ifindex=0, btf_fd=270, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 305
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_RAW_TRACEPOINT, insn_cnt=48, insns=0x1b7fd00, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="on_switch", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=270, func_info_rec_size=8, func_info=0x1b451b0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b761b0, line_info_cnt=25, attach_btf_id=0, attach_prog_fd=0}, 120) = 306
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name="sched_switch", prog_fd=306}}, 120) = 307
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=307, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=305, info_len=80, info=0x7ffda3255200}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x1b76d00, value=0x7ffda32551a8, flags=BPF_ANY}, 120) = 0
bpf(BPF_LINK_GET_FD_BY_ID, 0x7ffda32550e0, 120) = 308
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=303, key=0x7ffda325505c, value=0x7ffda3255058, flags=BPF_ANY}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=308, info_len=32, info=0x7ffda32551b0}}, 120) = 0
bpf(BPF_PROG_GET_FD_BY_ID, {prog_id=229, next_id=0, open_flags=0}, 120) = 270
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=251, next_id=0, open_flags=0}, 120) = 303
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(13) .rel.eh_frame for section(8) .eh_frame
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=270, info_len=216, info=0x7ffda3254f50}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=270, info_len=216, info=0x1b72d20}}, 120) = 0
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=323}, 120) = 304
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=304, info_len=32, info=0x7ffda3255000}}, 120) = 0
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=303, info_len=80, info=0x7ffda32550f0}}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffda3254ff0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 120) = 305
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\204\2\0\0\204\2\0\0\250\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1604, btf_log_size=0, btf_log_level=0}, 120) = 305
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=24, max_entries=1, map_flags=0, inner_map_fd=0, map_name="accum_readings", map_ifindex=0, btf_fd=305, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 306
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=1, map_flags=0, inner_map_fd=0, map_name="filter", map_ifindex=0, btf_fd=305, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 120) = 307
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="bperf_fo.bss", map_ifindex=0, btf_fd=305, btf_key_type_id=0, btf_value_type_id=27, btf_vmlinux_value_type_id=0}, 120) = 333
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=333, key=0x7ffda3254e50, value=0x7f0a61b67000, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACING, insn_cnt=58, insns=0x1b35650, license="Dual BSD/GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 12, 0), prog_flags=0, prog_name="fexit_XXX", prog_ifindex=0, expected_attach_type=BPF_TRACE_FEXIT, prog_btf_fd=305, func_info_rec_size=8, func_info=0x1b45120, func_info_cnt=1, line_info_rec_size=16, line_info=0x1b44100, line_info_cnt=28, attach_btf_id=22, attach_prog_fd=270}, 120) = 334
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=307, key=0x7ffda32551b0, value=0x7ffda32551a4, flags=BPF_ANY}, 120) = 0
bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=334}}, 120) = 335
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=75357, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=4, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=9, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=5, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=42, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=39, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=75, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=72, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=108, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=105, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=141, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=138, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=174, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=171, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=207, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=204, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=240, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=237, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=273, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_PROG_TEST_RUN, {test={prog_fd=270, retval=0, data_size_in=0, data_size_out=0, data_in=NULL, data_out=NULL, repeat=0, duration=0, ctx_size_in=0, ctx_size_out=0, ctx_in=NULL, ctx_out=NULL}, ...}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=306, key=0x7ffda3255160, value=0x7ffda3254f10, flags=BPF_ANY}, 120) = 0

 Performance counter stats for 'sleep 1':

              1.64 msec task-clock                #    0.002 CPUs utilized          
                 3      context-switches          #    1.827 K/sec                  
                 1      cpu-migrations            #  609.091 /sec                   
                69      page-faults               #   42.027 K/sec                  
         6,661,659      cycles                    #    4.058 GHz                    
         2,098,928      stalled-cycles-frontend   #   31.51% frontend cycles idle     (88.25%)
           354,763      stalled-cycles-backend    #    5.33% backend cycles idle      (87.59%)
         2,741,209      instructions              #    0.41  insn per cycle         
                                                  #    0.77  stalled cycles per insn  (43.20%)
           463,473      branches                  #  282.297 M/sec                    (80.93%)
            13,458      branch-misses             #    2.90% of all branches        

       1.001882586 seconds time elapsed

       0.000000000 seconds user
       0.001667000 seconds sys


--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_USER, si_pid=75348, si_uid=0} ---
+++ exited with 0 +++
[root@five pahole]# 
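
A note on the percentages in parentheses in the output above, e.g. (88.25%): perf stat prints these when an event was scheduled on a PMC for only part of the measurement, which is the time multiplexing discussed in this thread. The kernel reports time_enabled and time_running for each event, and the displayed count is the raw count scaled up by their ratio. Below is a minimal sketch of that arithmetic, not perf's actual code; the function names and sample numbers are made up for illustration.

```python
# Sketch of how a multiplexed counter value is extrapolated.
# scale_count() and running_pct() are hypothetical helper names, not perf APIs.

def scale_count(raw_count: int, time_enabled_ns: int, time_running_ns: int) -> int:
    """Estimate the count over the whole enabled interval.

    raw_count       -- events counted while the event actually held a PMC
    time_enabled_ns -- time the event was enabled
    time_running_ns -- time the event was actually scheduled on a PMC
    """
    if time_running_ns == 0:
        # The event never got a PMC, so there is nothing to extrapolate from.
        return 0
    return round(raw_count * time_enabled_ns / time_running_ns)


def running_pct(time_enabled_ns: int, time_running_ns: int) -> float:
    """Percentage of the enabled time the event spent on a PMC."""
    if time_enabled_ns == 0:
        return 0.0
    return 100.0 * time_running_ns / time_enabled_ns


if __name__ == "__main__":
    # Illustrative numbers only: the event ran for half of the enabled time.
    print(scale_count(500, 1000, 500), running_pct(1000, 500))
```

With shared counters, as bperf aims for, the event stays on the PMC for the whole interval, so time_running equals time_enabled and no extrapolation is needed.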




* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-19 16:14                   ` Song Liu
@ 2021-03-23 21:10                     ` Arnaldo Carvalho de Melo
  2021-03-23 21:26                       ` Song Liu
  0 siblings, 1 reply; 33+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-23 21:10 UTC (permalink / raw)
  To: Song Liu; +Cc: Namhyung Kim, Jiri Olsa, linux-kernel, Kernel Team, Jiri Olsa

Em Fri, Mar 19, 2021 at 04:14:42PM +0000, Song Liu escreveu:
> > On Mar 19, 2021, at 8:58 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> > On Sat, Mar 20, 2021 at 12:35 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >> Em Fri, Mar 19, 2021 at 09:54:59AM +0900, Namhyung Kim escreveu:
> >>> On Fri, Mar 19, 2021 at 9:22 AM Song Liu <songliubraving@fb.com> wrote:
> >>>>> On Mar 18, 2021, at 5:09 PM, Arnaldo <arnaldo.melo@gmail.com> wrote:
> >>>>> On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
> >>>>>> On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
> >>>>>>> perf stat -C 1,3,5                  107.063 [sec]
> >>>>>>> perf stat -C 1,3,5 --bpf-counters   106.406 [sec]

> >>>>>> I can't see why it's actually faster than normal perf ;-)
> >>>>>> would be worth to find out

> >>>>> Isn't this all about contended cases?

> >>>> Yeah, the normal perf is doing time multiplexing; while --bpf-counters
> >>>> doesn't need it.

> >>> Yep, so for uncontended cases, normal perf should be the same as the
> >>> baseline (faster than the bperf).  But for contended cases, the bperf
> >>> works faster.

> >> The difference should be small enough that for people that use this in a
> >> machine where contention happens most of the time, setting a
> >> ~/.perfconfig to use it by default should be advantageous, i.e. no need
> >> to use --bpf-counters on the command line all the time.

> >> So, Namhyung, can I take that as an Acked-by or a Reviewed-by? I'll take
> >> a look again now but I want to have this merged on perf/core so that I
> >> can work on a new BPF SKEL to use this:

> > I have a concern for the per cpu target, but it can be done later, so

> > Acked-by: Namhyung Kim <namhyung@kernel.org>

> >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.bpf/bpf_perf_enable

> > Interesting!  Actually I was thinking about the similar too. :)
> 
> Hi Namhyung, Jiri, and Arnaldo,
> 
> Thanks a lot for your kind review. 
> 
> Here is updated 3/3, where we use perf-bench instead of stressapptest.

I had to apply this updated 3/3 manually, as there was some munging; it's
all now at:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/core

Please take a look at the "Committer testing" section I added to the
main patch, introducing bperf:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core&id=7fac83aaf2eecc9e7e7b72da694c49bb4ce7fdfc

And check if I made any mistake or if something else could be added.

It'll move to perf/core after my set of automated tests finishes.

- Arnaldo


* Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF
  2021-03-23 21:10                     ` Arnaldo Carvalho de Melo
@ 2021-03-23 21:26                       ` Song Liu
  0 siblings, 0 replies; 33+ messages in thread
From: Song Liu @ 2021-03-23 21:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Jiri Olsa, linux-kernel, Kernel Team, Jiri Olsa



> On Mar 23, 2021, at 2:10 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> Em Fri, Mar 19, 2021 at 04:14:42PM +0000, Song Liu escreveu:
>>> On Mar 19, 2021, at 8:58 AM, Namhyung Kim <namhyung@kernel.org> wrote:
>>> On Sat, Mar 20, 2021 at 12:35 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>>> Em Fri, Mar 19, 2021 at 09:54:59AM +0900, Namhyung Kim escreveu:
>>>>> On Fri, Mar 19, 2021 at 9:22 AM Song Liu <songliubraving@fb.com> wrote:
>>>>>>> On Mar 18, 2021, at 5:09 PM, Arnaldo <arnaldo.melo@gmail.com> wrote:
>>>>>>> On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@redhat.com> wrote:
>>>>>>>> On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
>>>>>>>>> perf stat -C 1,3,5                  107.063 [sec]
>>>>>>>>> perf stat -C 1,3,5 --bpf-counters   106.406 [sec]
> 
>>>>>>>> I can't see why it's actually faster than normal perf ;-)
>>>>>>>> would be worth to find out
> 
>>>>>>> Isn't this all about contended cases?
> 
>>>>>> Yeah, the normal perf is doing time multiplexing; while --bpf-counters
>>>>>> doesn't need it.
> 
>>>>> Yep, so for uncontended cases, normal perf should be the same as the
>>>>> baseline (faster than the bperf).  But for contended cases, the bperf
>>>>> works faster.
> 
>>>> The difference should be small enough that for people that use this in a
>>>> machine where contention happens most of the time, setting a
>>>> ~/.perfconfig to use it by default should be advantageous, i.e. no need
>>>> to use --bpf-counters on the command line all the time.
> 
>>>> So, Namhyung, can I take that as an Acked-by or a Reviewed-by? I'll take
>>>> a look again now but I want to have this merged on perf/core so that I
>>>> can work on a new BPF SKEL to use this:
> 
>>> I have a concern for the per cpu target, but it can be done later, so
> 
>>> Acked-by: Namhyung Kim <namhyung@kernel.org>
> 
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.bpf/bpf_perf_enable
> 
>>> Interesting!  Actually I was thinking about the similar too. :)
>> 
>> Hi Namhyung, Jiri, and Arnaldo,
>> 
>> Thanks a lot for your kind review. 
>> 
>> Here is updated 3/3, where we use perf-bench instead of stressapptest.
> 
> I had to apply this updated 3/3 manually, as there was some munging; it's
> all now at:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/core
> 
> Please take a look at the "Committer testing" section I added to the
> main patch, introducing bperf:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=tmp.perf/core&id=7fac83aaf2eecc9e7e7b72da694c49bb4ce7fdfc
> 
> And check if I made any mistake or if something else could be added.
> 
> It'll move to perf/core after my set of automated tests finishes.

Thanks Arnaldo! Looks great!

Song




Thread overview: 33+ messages
2021-03-16 21:18 [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Song Liu
2021-03-16 21:18 ` [PATCH v2 1/3] perf-stat: introduce bperf, " Song Liu
2021-03-18  5:54   ` Namhyung Kim
2021-03-18  7:22     ` Song Liu
2021-03-18 13:49       ` Namhyung Kim
2021-03-18 17:16         ` Song Liu
2021-03-18 21:15   ` Jiri Olsa
2021-03-19 18:41     ` Arnaldo Carvalho de Melo
2021-03-19 18:55       ` Jiri Olsa
2021-03-19 22:06         ` Song Liu
2021-03-23  0:53       ` Song Liu
2021-03-23 12:25       ` Arnaldo Carvalho de Melo
2021-03-23 12:37         ` Arnaldo Carvalho de Melo
2021-03-23 18:27           ` Arnaldo Carvalho de Melo
2021-03-16 21:18 ` [PATCH v2 2/3] perf-stat: measure t0 and ref_time after enable_counters() Song Liu
2021-03-16 21:18 ` [PATCH v2 3/3] perf-test: add a test for perf-stat --bpf-counters option Song Liu
2021-03-18  6:07   ` Namhyung Kim
2021-03-18  7:39     ` Song Liu
2021-03-17  5:29 ` [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF Namhyung Kim
2021-03-17  9:19   ` Jiri Olsa
2021-03-17 13:11   ` Arnaldo Carvalho de Melo
2021-03-18  3:52     ` Song Liu
2021-03-18  4:32       ` Namhyung Kim
2021-03-18  7:03         ` Song Liu
2021-03-18 21:14       ` Jiri Olsa
2021-03-19  0:09         ` Arnaldo
2021-03-19  0:22           ` Song Liu
2021-03-19  0:54             ` Namhyung Kim
2021-03-19 15:35               ` Arnaldo Carvalho de Melo
2021-03-19 15:58                 ` Namhyung Kim
2021-03-19 16:14                   ` Song Liu
2021-03-23 21:10                     ` Arnaldo Carvalho de Melo
2021-03-23 21:26                       ` Song Liu