linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] perf util: move bperf definitions to a libperf header
@ 2021-04-03  0:29 Song Liu
  2021-04-03  0:29 ` [PATCH 2/2] perf-stat: introduce config stat.bpf-counter-events Song Liu
  2021-04-06 14:21 ` [PATCH 1/2] perf util: move bperf definitions to a libperf header Jiri Olsa
  0 siblings, 2 replies; 6+ messages in thread
From: Song Liu @ 2021-04-03  0:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, acme, acme, namhyung, jolsa, songliubraving, Song Liu

By following the same protocol, other tools can share hardware PMCs with
perf. Move perf_event_attr_map_entry and BPERF_DEFAULT_ATTR_MAP_PATH to
bperf.h for other tools to use.

Also add bperf_attr_map_compatible() to check whether existing attr_map
is compatible with current perf binary.

Signed-off-by: Song Liu <song@kernel.org>
---
 tools/lib/perf/include/perf/bperf.h | 29 +++++++++++++++++++
 tools/perf/util/bpf_counter.c       | 44 ++++++++++++++---------------
 2 files changed, 50 insertions(+), 23 deletions(-)
 create mode 100644 tools/lib/perf/include/perf/bperf.h

diff --git a/tools/lib/perf/include/perf/bperf.h b/tools/lib/perf/include/perf/bperf.h
new file mode 100644
index 0000000000000..02b2fd5e50c75
--- /dev/null
+++ b/tools/lib/perf/include/perf/bperf.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+#ifndef __LIBPERF_BPERF_H
+#define __LIBPERF_BPERF_H
+
+/*
+ * bperf uses a hashmap, the attr_map, to track all the leader programs.
+ * The hashmap is pinned in bpffs. flock() on this file is used to ensure
+ * no concurrent access to the attr_map.  The key of attr_map is struct
+ * perf_event_attr, and the value is struct perf_event_attr_map_entry.
+ *
+ * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
+ * leader prog, and the diff_map. Each perf-stat session holds a reference
+ * to the bpf_link to make sure the leader prog is attached to sched_switch
+ * tracepoint.
+ *
+ * Since the hashmap only contains IDs of the bpf_link and diff_map, it
+ * does not hold any references to the leader program. Once all perf-stat
+ * sessions of these events exit, the leader prog, its maps, and the
+ * perf_events will be freed.
+ */
+struct perf_event_attr_map_entry {
+	__u32 link_id;
+	__u32 diff_map_id;
+};
+
+/* pin the map at sysfs__mountpoint()/BPERF_DEFAULT_ATTR_MAP_PATH */
+#define BPERF_DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"
+
+#endif /* __LIBPERF_BPERF_H */
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index 81d1df3c4ec0e..df70c8dcf7850 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -14,6 +14,7 @@
 #include <bpf/btf.h>
 #include <bpf/libbpf.h>
 #include <api/fs/fs.h>
+#include <perf/bperf.h>
 
 #include "bpf_counter.h"
 #include "counts.h"
@@ -29,28 +30,6 @@
 #include "bpf_skel/bperf_leader.skel.h"
 #include "bpf_skel/bperf_follower.skel.h"
 
-/*
- * bperf uses a hashmap, the attr_map, to track all the leader programs.
- * The hashmap is pinned in bpffs. flock() on this file is used to ensure
- * no concurrent access to the attr_map.  The key of attr_map is struct
- * perf_event_attr, and the value is struct perf_event_attr_map_entry.
- *
- * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
- * leader prog, and the diff_map. Each perf-stat session holds a reference
- * to the bpf_link to make sure the leader prog is attached to sched_switch
- * tracepoint.
- *
- * Since the hashmap only contains IDs of the bpf_link and diff_map, it
- * does not hold any references to the leader program. Once all perf-stat
- * sessions of these events exit, the leader prog, its maps, and the
- * perf_events will be freed.
- */
-struct perf_event_attr_map_entry {
-	__u32 link_id;
-	__u32 diff_map_id;
-};
-
-#define DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"
 #define ATTR_MAP_SIZE 16
 
 static inline void *u64_to_ptr(__u64 ptr)
@@ -333,6 +312,20 @@ static __u32 bpf_map_get_id(int fd)
 	return map_info.id;
 }
 
+static bool bperf_attr_map_compatible(int attr_map_fd)
+{
+	struct bpf_map_info map_info = {0};
+	__u32 map_info_len = sizeof(map_info);
+	int err;
+
+	err = bpf_obj_get_info_by_fd(attr_map_fd, &map_info, &map_info_len);
+
+	if (err)
+		return false;
+	return (map_info.key_size == sizeof(struct perf_event_attr)) &&
+		(map_info.value_size == sizeof(struct perf_event_attr_map_entry));
+}
+
 static int bperf_lock_attr_map(struct target *target)
 {
 	char path[PATH_MAX];
@@ -342,7 +335,7 @@ static int bperf_lock_attr_map(struct target *target)
 		scnprintf(path, PATH_MAX, "%s", target->attr_map);
 	} else {
 		scnprintf(path, PATH_MAX, "%s/%s", sysfs__mountpoint(),
-			  DEFAULT_ATTR_MAP_PATH);
+			  BPERF_DEFAULT_ATTR_MAP_PATH);
 	}
 
 	if (access(path, F_OK)) {
@@ -367,6 +360,11 @@ static int bperf_lock_attr_map(struct target *target)
 			return -1;
 	}
 
+	if (!bperf_attr_map_compatible(map_fd)) {
+		close(map_fd);
+		return -1;
+
+	}
 	err = flock(map_fd, LOCK_EX);
 	if (err) {
 		close(map_fd);
-- 
2.30.2
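To see what "following the same protocol" means for an outside tool, the serialization step can be sketched in isolation. This is a hedged illustration only: mkstemp() stands in for the pinned attr_map file, whose fd a real tool would obtain with libbpf's bpf_obj_get() on the bpffs pin; the flock() sequence mirrors bperf_lock_attr_map() above.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Stand-in for the flock() protocol bperf_lock_attr_map() uses on the
 * pinned attr_map.  A real tool would get the fd via bpf_obj_get() on
 * the bpffs pin instead of mkstemp(); the locking sequence is the same.
 */
static int lock_attr_map_demo(void)
{
	char path[] = "/tmp/attr_map_demo.XXXXXX";
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;

	/* Exclusive lock: no concurrent access to the attr_map. */
	if (flock(fd, LOCK_EX)) {
		close(fd);
		unlink(path);
		return -1;
	}

	/* ... look up or insert perf_event_attr_map_entry here ... */

	flock(fd, LOCK_UN);
	close(fd);
	unlink(path);
	return 0;
}
```

Any tool that takes LOCK_EX before touching the attr_map and drops it afterwards can share the leader programs safely with concurrent perf-stat sessions.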



* [PATCH 2/2] perf-stat: introduce config stat.bpf-counter-events
  2021-04-03  0:29 [PATCH 1/2] perf util: move bperf definitions to a libperf header Song Liu
@ 2021-04-03  0:29 ` Song Liu
  2021-04-06 14:21   ` Jiri Olsa
  2021-04-06 14:21 ` [PATCH 1/2] perf util: move bperf definitions to a libperf header Jiri Olsa
  1 sibling, 1 reply; 6+ messages in thread
From: Song Liu @ 2021-04-03  0:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, acme, acme, namhyung, jolsa, songliubraving, Song Liu

Currently, to use BPF to aggregate perf event counters, the user uses the
--bpf-counters option. Enable "use bpf by default" events with a config
option, stat.bpf-counter-events. This is limited to hardware events in
evsel__hw_names.

This also enables mixing BPF events and regular events in the same session.
For example:

   perf config stat.bpf-counter-events=instructions
   perf stat -e instructions,cs

The second command will use BPF for "instructions" but not "cs".

Signed-off-by: Song Liu <song@kernel.org>
---
 tools/perf/Documentation/perf-stat.txt |  2 ++
 tools/perf/builtin-stat.c              | 41 ++++++++++++++++----------
 tools/perf/util/bpf_counter.c          | 11 +++++++
 tools/perf/util/config.c               | 32 ++++++++++++++++++++
 tools/perf/util/evsel.c                |  2 ++
 tools/perf/util/evsel.h                |  1 +
 tools/perf/util/target.h               |  5 ----
 7 files changed, 74 insertions(+), 20 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 744211fa8c186..6d4733eaac170 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -97,6 +97,8 @@ report::
 	Use BPF programs to aggregate readings from perf_events.  This
 	allows multiple perf-stat sessions that are counting the same metric (cycles,
 	instructions, etc.) to share hardware counters.
+	To use BPF programs on common hardware events by default, use
+	"perf config stat.bpf-counter-events=<list_of_events>".
 
 --bpf-attr-map::
 	With option "--bpf-counters", different perf-stat sessions share
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4bb48c6b66980..5adfa708ffe68 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -423,17 +423,28 @@ static int read_affinity_counters(struct timespec *rs)
 	return 0;
 }
 
+/*
+ * Returns:
+ *     0   if all events use BPF;
+ *     1   if some events do NOT use BPF;
+ *     < 0 on errors;
+ */
 static int read_bpf_map_counters(void)
 {
+	bool has_none_bpf_events = false;
 	struct evsel *counter;
 	int err;
 
 	evlist__for_each_entry(evsel_list, counter) {
+		if (!counter->bpf_counter_ops) {
+			has_none_bpf_events = true;
+			continue;
+		}
 		err = bpf_counter__read(counter);
 		if (err)
 			return err;
 	}
-	return 0;
+	return has_none_bpf_events ? 1 : 0;
 }
 
 static void read_counters(struct timespec *rs)
@@ -442,9 +453,10 @@ static void read_counters(struct timespec *rs)
 	int err;
 
 	if (!stat_config.stop_read_counter) {
-		if (target__has_bpf(&target))
-			err = read_bpf_map_counters();
-		else
+		err = read_bpf_map_counters();
+		if (err < 0)
+			return;
+		if (err)
 			err = read_affinity_counters(rs);
 		if (err < 0)
 			return;
@@ -535,12 +547,13 @@ static int enable_counters(void)
 	struct evsel *evsel;
 	int err;
 
-	if (target__has_bpf(&target)) {
-		evlist__for_each_entry(evsel_list, evsel) {
-			err = bpf_counter__enable(evsel);
-			if (err)
-				return err;
-		}
+	evlist__for_each_entry(evsel_list, evsel) {
+		if (!evsel->bpf_counter_ops)
+			continue;
+
+		err = bpf_counter__enable(evsel);
+		if (err)
+			return err;
 	}
 
 	if (stat_config.initial_delay < 0) {
@@ -784,11 +797,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	if (affinity__setup(&affinity) < 0)
 		return -1;
 
-	if (target__has_bpf(&target)) {
-		evlist__for_each_entry(evsel_list, counter) {
-			if (bpf_counter__load(counter, &target))
-				return -1;
-		}
+	evlist__for_each_entry(evsel_list, counter) {
+		if (bpf_counter__load(counter, &target))
+			return -1;
 	}
 
 	evlist__for_each_cpu (evsel_list, i, cpu) {
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index df70c8dcf7850..ea45914af1693 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -792,6 +792,17 @@ int bpf_counter__load(struct evsel *evsel, struct target *target)
 		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
 	else if (target->use_bpf)
 		evsel->bpf_counter_ops = &bperf_ops;
+	else {
+		int i;
+
+		for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
+			if (!strcmp(evsel->name, evsel__hw_names[i])) {
+				if (evsel__use_bpf_counters[i])
+					evsel->bpf_counter_ops = &bperf_ops;
+				break;
+			}
+		}
+	}
 
 	if (evsel->bpf_counter_ops)
 		return evsel->bpf_counter_ops->load(evsel, target);
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 2daeaa9a4a241..fe2ec56258735 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -18,6 +18,7 @@
 #include "util/hist.h"  /* perf_hist_config */
 #include "util/llvm-utils.h"   /* perf_llvm_config */
 #include "util/stat.h"  /* perf_stat__set_big_num */
+#include "util/evsel.h"  /* evsel__hw_names, evsel__use_bpf_counters */
 #include "build-id.h"
 #include "debug.h"
 #include "config.h"
@@ -433,6 +434,29 @@ static int perf_buildid_config(const char *var, const char *value)
 	return 0;
 }
 
+static int perf_stat_config_parse_bpf_counter_event(const char *value)
+{
+	char *event_str, *event_str_, *tok, *saveptr = NULL;
+	int i;
+
+	event_str_ = event_str = strdup(value);
+	if (!event_str)
+		return -1;
+
+	while ((tok = strtok_r(event_str, ",", &saveptr)) != NULL) {
+		for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
+			if (!strcmp(tok, evsel__hw_names[i])) {
+				evsel__use_bpf_counters[i] = true;
+				break;
+			}
+		}
+		event_str = NULL;
+	}
+
+	free(event_str_);
+	return 0;
+}
+
 static int perf_default_core_config(const char *var __maybe_unused,
 				    const char *value __maybe_unused)
 {
@@ -454,9 +478,17 @@ static int perf_ui_config(const char *var, const char *value)
 
 static int perf_stat_config(const char *var, const char *value)
 {
+	int err = 0;
+
 	if (!strcmp(var, "stat.big-num"))
 		perf_stat__set_big_num(perf_config_bool(var, value));
 
+	if (!strcmp(var, "stat.bpf-counter-events")) {
+		err = perf_stat_config_parse_bpf_counter_event(value);
+		if (err)
+			return err;
+	}
+
 	/* Add other config variables here. */
 	return 0;
 }
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2d2614eeaa20e..592d93bcccd04 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -492,6 +492,8 @@ const char *evsel__hw_names[PERF_COUNT_HW_MAX] = {
 	"ref-cycles",
 };
 
+bool evsel__use_bpf_counters[PERF_COUNT_HW_MAX] = {false};
+
 static const char *__evsel__hw_name(u64 config)
 {
 	if (config < PERF_COUNT_HW_MAX && evsel__hw_names[config])
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index dd4f56f9cfdf5..32439c5684f7d 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -247,6 +247,7 @@ extern const char *evsel__hw_cache_op[PERF_COUNT_HW_CACHE_OP_MAX][EVSEL__MAX_ALI
 extern const char *evsel__hw_cache_result[PERF_COUNT_HW_CACHE_RESULT_MAX][EVSEL__MAX_ALIASES];
 extern const char *evsel__hw_names[PERF_COUNT_HW_MAX];
 extern const char *evsel__sw_names[PERF_COUNT_SW_MAX];
+extern bool evsel__use_bpf_counters[PERF_COUNT_HW_MAX];
 int __evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result, char *bf, size_t size);
 const char *evsel__name(struct evsel *evsel);
 
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 1bce3eb28ef25..4ff56217f2a65 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -66,11 +66,6 @@ static inline bool target__has_cpu(struct target *target)
 	return target->system_wide || target->cpu_list;
 }
 
-static inline bool target__has_bpf(struct target *target)
-{
-	return target->bpf_str || target->use_bpf;
-}
-
 static inline bool target__none(struct target *target)
 {
 	return !target__has_task(target) && !target__has_cpu(target);
-- 
2.30.2



* Re: [PATCH 1/2] perf util: move bperf definitions to a libperf header
  2021-04-03  0:29 [PATCH 1/2] perf util: move bperf definitions to a libperf header Song Liu
  2021-04-03  0:29 ` [PATCH 2/2] perf-stat: introduce config stat.bpf-counter-events Song Liu
@ 2021-04-06 14:21 ` Jiri Olsa
  2021-04-06 16:26   ` Song Liu
  1 sibling, 1 reply; 6+ messages in thread
From: Jiri Olsa @ 2021-04-06 14:21 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, kernel-team, acme, acme, namhyung, jolsa, songliubraving

On Fri, Apr 02, 2021 at 05:29:37PM -0700, Song Liu wrote:
> By following the same protocol, other tools can share hardware PMCs with
> perf. Move perf_event_attr_map_entry and BPERF_DEFAULT_ATTR_MAP_PATH to
> bperf.h for other tools to use.

hi,
so is this necessary for some other tool now?

> 
> Also add bperf_attr_map_compatible() to check whether existing attr_map
> is compatible with current perf binary.

please separate this change

> 
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  tools/lib/perf/include/perf/bperf.h | 29 +++++++++++++++++++
>  tools/perf/util/bpf_counter.c       | 44 ++++++++++++++---------------
>  2 files changed, 50 insertions(+), 23 deletions(-)
>  create mode 100644 tools/lib/perf/include/perf/bperf.h
> 
> diff --git a/tools/lib/perf/include/perf/bperf.h b/tools/lib/perf/include/perf/bperf.h
> new file mode 100644
> index 0000000000000..02b2fd5e50c75
> --- /dev/null
> +++ b/tools/lib/perf/include/perf/bperf.h

I wonder if we want to call it bpf_perf.h to be more generic?
or best just bpf.h ... but that might give us some conflict
headache in future ;-)

> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
> +#ifndef __LIBPERF_BPERF_H
> +#define __LIBPERF_BPERF_H
> +
> +/*
> + * bperf uses a hashmap, the attr_map, to track all the leader programs.
> + * The hashmap is pinned in bpffs. flock() on this file is used to ensure
> + * no concurrent access to the attr_map.  The key of attr_map is struct
> + * perf_event_attr, and the value is struct perf_event_attr_map_entry.
> + *
> + * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
> + * leader prog, and the diff_map. Each perf-stat session holds a reference
> + * to the bpf_link to make sure the leader prog is attached to sched_switch
> + * tracepoint.
> + *
> + * Since the hashmap only contains IDs of the bpf_link and diff_map, it
> + * does not hold any references to the leader program. Once all perf-stat
> + * sessions of these events exit, the leader prog, its maps, and the
> + * perf_events will be freed.
> + */
> +struct perf_event_attr_map_entry {
> +	__u32 link_id;
> +	__u32 diff_map_id;
> +};

this header file should be self contained,
so you need __u32 definitions


> +
> +/* pin the map at sysfs__mountpoint()/BPERF_DEFAULT_ATTR_MAP_PATH */
> +#define BPERF_DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"

if we are going to expose this, I think we should expose just
"perf_attr_map" ... without the 'fs/bpf' part, because that
could be mounted anywhere

thanks,
jirka

> +
> +#endif /* __LIBPERF_BPERF_H */
> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> index 81d1df3c4ec0e..df70c8dcf7850 100644
> --- a/tools/perf/util/bpf_counter.c
> +++ b/tools/perf/util/bpf_counter.c
> @@ -14,6 +14,7 @@
>  #include <bpf/btf.h>
>  #include <bpf/libbpf.h>
>  #include <api/fs/fs.h>
> +#include <perf/bperf.h>
>  
>  #include "bpf_counter.h"
>  #include "counts.h"
> @@ -29,28 +30,6 @@
>  #include "bpf_skel/bperf_leader.skel.h"
>  #include "bpf_skel/bperf_follower.skel.h"
>  
> -/*
> - * bperf uses a hashmap, the attr_map, to track all the leader programs.
> - * The hashmap is pinned in bpffs. flock() on this file is used to ensure
> - * no concurrent access to the attr_map.  The key of attr_map is struct
> - * perf_event_attr, and the value is struct perf_event_attr_map_entry.
> - *
> - * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
> - * leader prog, and the diff_map. Each perf-stat session holds a reference
> - * to the bpf_link to make sure the leader prog is attached to sched_switch
> - * tracepoint.
> - *
> - * Since the hashmap only contains IDs of the bpf_link and diff_map, it
> - * does not hold any references to the leader program. Once all perf-stat
> - * sessions of these events exit, the leader prog, its maps, and the
> - * perf_events will be freed.
> - */
> -struct perf_event_attr_map_entry {
> -	__u32 link_id;
> -	__u32 diff_map_id;
> -};
> -
> -#define DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"
>  #define ATTR_MAP_SIZE 16
>  
>  static inline void *u64_to_ptr(__u64 ptr)
> @@ -333,6 +312,20 @@ static __u32 bpf_map_get_id(int fd)
>  	return map_info.id;
>  }
>  
> +static bool bperf_attr_map_compatible(int attr_map_fd)
> +{
> +	struct bpf_map_info map_info = {0};
> +	__u32 map_info_len = sizeof(map_info);
> +	int err;
> +
> +	err = bpf_obj_get_info_by_fd(attr_map_fd, &map_info, &map_info_len);
> +
> +	if (err)
> +		return false;
> +	return (map_info.key_size == sizeof(struct perf_event_attr)) &&
> +		(map_info.value_size == sizeof(struct perf_event_attr_map_entry));
> +}
> +
>  static int bperf_lock_attr_map(struct target *target)
>  {
>  	char path[PATH_MAX];
> @@ -342,7 +335,7 @@ static int bperf_lock_attr_map(struct target *target)
>  		scnprintf(path, PATH_MAX, "%s", target->attr_map);
>  	} else {
>  		scnprintf(path, PATH_MAX, "%s/%s", sysfs__mountpoint(),
> -			  DEFAULT_ATTR_MAP_PATH);
> +			  BPERF_DEFAULT_ATTR_MAP_PATH);
>  	}
>  
>  	if (access(path, F_OK)) {
> @@ -367,6 +360,11 @@ static int bperf_lock_attr_map(struct target *target)
>  			return -1;
>  	}
>  
> +	if (!bperf_attr_map_compatible(map_fd)) {
> +		close(map_fd);
> +		return -1;
> +
> +	}
>  	err = flock(map_fd, LOCK_EX);
>  	if (err) {
>  		close(map_fd);
> -- 
> 2.30.2
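Jiri's two review points (the header must define __u32 itself, and the pinned name should not hard-code the fs/bpf prefix) would land roughly as below. This is a hedged sketch of a possible follow-up, not the actual v2 patch; in particular, the BPERF_DEFAULT_ATTR_MAP_NAME spelling is a hypothetical rename.

```c
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __LIBPERF_BPERF_H
#define __LIBPERF_BPERF_H

#include <linux/types.h>	/* __u32: makes the header self-contained */

struct perf_event_attr_map_entry {
	__u32 link_id;
	__u32 diff_map_id;
};

/*
 * Name of the pin relative to wherever bpffs is mounted; hypothetical
 * macro name, reflecting the "drop the fs/bpf part" review comment.
 */
#define BPERF_DEFAULT_ATTR_MAP_NAME "perf_attr_map"

#endif /* __LIBPERF_BPERF_H */
```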
> 



* Re: [PATCH 2/2] perf-stat: introduce config stat.bpf-counter-events
  2021-04-03  0:29 ` [PATCH 2/2] perf-stat: introduce config stat.bpf-counter-events Song Liu
@ 2021-04-06 14:21   ` Jiri Olsa
  2021-04-06 21:46     ` Song Liu
  0 siblings, 1 reply; 6+ messages in thread
From: Jiri Olsa @ 2021-04-06 14:21 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, kernel-team, acme, acme, namhyung, jolsa, songliubraving

On Fri, Apr 02, 2021 at 05:29:38PM -0700, Song Liu wrote:
> Currently, to use BPF to aggregate perf event counters, the user uses the
> --bpf-counters option. Enable "use bpf by default" events with a config
> option, stat.bpf-counter-events. This is limited to hardware events in
> evsel__hw_names.
> 
> This also enables mixing BPF events and regular events in the same session.
> For example:
> 
>    perf config stat.bpf-counter-events=instructions
>    perf stat -e instructions,cs

hum, so this will effectively allow mixing 'bpf-shared' counters
with normal ones.. I don't think we're ready for that ;-)

> 
> The second command will use BPF for "instructions" but not "cs".
> 
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  tools/perf/Documentation/perf-stat.txt |  2 ++
>  tools/perf/builtin-stat.c              | 41 ++++++++++++++++----------
>  tools/perf/util/bpf_counter.c          | 11 +++++++
>  tools/perf/util/config.c               | 32 ++++++++++++++++++++
>  tools/perf/util/evsel.c                |  2 ++
>  tools/perf/util/evsel.h                |  1 +
>  tools/perf/util/target.h               |  5 ----
>  7 files changed, 74 insertions(+), 20 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 744211fa8c186..6d4733eaac170 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -97,6 +97,8 @@ report::
>  	Use BPF programs to aggregate readings from perf_events.  This
>  	allows multiple perf-stat sessions that are counting the same metric (cycles,
>  	instructions, etc.) to share hardware counters.
> +	To use BPF programs on common hardware events by default, use
> +	"perf config stat.bpf-counter-events=<list_of_events>".
>  
>  --bpf-attr-map::
>  	With option "--bpf-counters", different perf-stat sessions share
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 4bb48c6b66980..5adfa708ffe68 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -423,17 +423,28 @@ static int read_affinity_counters(struct timespec *rs)
>  	return 0;
>  }
>  
> +/*
> + * Returns:
> + *     0   if all events use BPF;
> + *     1   if some events do NOT use BPF;
> + *     < 0 on errors;
> + */
>  static int read_bpf_map_counters(void)
>  {
> +	bool has_none_bpf_events = false;
>  	struct evsel *counter;
>  	int err;
>  
>  	evlist__for_each_entry(evsel_list, counter) {
> +		if (!counter->bpf_counter_ops) {
> +			has_none_bpf_events = true;
> +			continue;
> +		}
>  		err = bpf_counter__read(counter);
>  		if (err)
>  			return err;
>  	}
> -	return 0;
> +	return has_none_bpf_events ? 1 : 0;
>  }
>  
>  static void read_counters(struct timespec *rs)
> @@ -442,9 +453,10 @@ static void read_counters(struct timespec *rs)
>  	int err;
>  
>  	if (!stat_config.stop_read_counter) {
> -		if (target__has_bpf(&target))
> -			err = read_bpf_map_counters();
> -		else
> +		err = read_bpf_map_counters();
> +		if (err < 0)
> +			return;
> +		if (err)
>  			err = read_affinity_counters(rs);

so read_affinity_counters will also read 'bpf-shared' counters, no?
as long as it was separated, I did not see a problem, now we have
counters that either have bpf ops set or have not

it'd be great to do some generic separation.. I was thinking to move
bpf_counter_ops into some generic counter ops and we would just fill
in the proper ops for the counter.. buuut the affinity readings are
not compatible with what we are doing in bperf_read and the profiler
bpf read

so I think the solution will be just to skip those events in
read_affinity_counters and all the other code, and have some
helper like:

   bool evsel__is_bpf(evsel)

so it's clear why it's skipped

thanks,
jirka
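The helper Jiri suggests is a one-liner; a sketch under the assumption that "BPF-managed" simply means bpf_counter_ops is set (the struct evsel stub below is a minimal stand-in for perf's real one, which has many more fields):

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for tools/perf's struct evsel. */
struct bpf_counter_ops;
struct evsel {
	struct bpf_counter_ops *bpf_counter_ops;
	/* ... many other fields in the real struct ... */
};

/*
 * Readers such as read_affinity_counters() can then skip BPF-managed
 * events with an obvious predicate instead of an open-coded check.
 */
static inline bool evsel__is_bpf(const struct evsel *evsel)
{
	return evsel->bpf_counter_ops != NULL;
}
```

read_affinity_counters() and friends could then do `if (evsel__is_bpf(counter)) continue;` so the reason for skipping is self-documenting.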



* Re: [PATCH 1/2] perf util: move bperf definitions to a libperf header
  2021-04-06 14:21 ` [PATCH 1/2] perf util: move bperf definitions to a libperf header Jiri Olsa
@ 2021-04-06 16:26   ` Song Liu
  0 siblings, 0 replies; 6+ messages in thread
From: Song Liu @ 2021-04-06 16:26 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Song Liu, linux-kernel, Kernel Team, acme, acme, namhyung, jolsa



> On Apr 6, 2021, at 7:21 AM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Fri, Apr 02, 2021 at 05:29:37PM -0700, Song Liu wrote:
>> By following the same protocol, other tools can share hardware PMCs with
>> perf. Move perf_event_attr_map_entry and BPERF_DEFAULT_ATTR_MAP_PATH to
>> bperf.h for other tools to use.
> 
> hi,
> so is this necessary for some other tool now?

We have monitoring tools that do perf_event_open(). I would like to migrate these
to bperf. 

> 
>> 
>> Also add bperf_attr_map_compatible() to check whether existing attr_map
>> is compatible with current perf binary.
> 
> please separate this change

Will do. 

> 
>> 
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>> tools/lib/perf/include/perf/bperf.h | 29 +++++++++++++++++++
>> tools/perf/util/bpf_counter.c       | 44 ++++++++++++++---------------
>> 2 files changed, 50 insertions(+), 23 deletions(-)
>> create mode 100644 tools/lib/perf/include/perf/bperf.h
>> 
>> diff --git a/tools/lib/perf/include/perf/bperf.h b/tools/lib/perf/include/perf/bperf.h
>> new file mode 100644
>> index 0000000000000..02b2fd5e50c75
>> --- /dev/null
>> +++ b/tools/lib/perf/include/perf/bperf.h
> 
> I wonder if we want to call it bpf_perf.h to be more generic?
> or best just bpf.h ... but that might give us some conflict
> headache in future ;-)

I would rather avoid bpf.h... I am ok with bpf_perf.h or bperf.h

> 
>> @@ -0,0 +1,29 @@
>> +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
>> +#ifndef __LIBPERF_BPERF_H
>> +#define __LIBPERF_BPERF_H
>> +
>> +/*
>> + * bperf uses a hashmap, the attr_map, to track all the leader programs.
>> + * The hashmap is pinned in bpffs. flock() on this file is used to ensure
>> + * no concurrent access to the attr_map.  The key of attr_map is struct
>> + * perf_event_attr, and the value is struct perf_event_attr_map_entry.
>> + *
>> + * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
>> + * leader prog, and the diff_map. Each perf-stat session holds a reference
>> + * to the bpf_link to make sure the leader prog is attached to sched_switch
>> + * tracepoint.
>> + *
>> + * Since the hashmap only contains IDs of the bpf_link and diff_map, it
>> + * does not hold any references to the leader program. Once all perf-stat
>> + * sessions of these events exit, the leader prog, its maps, and the
>> + * perf_events will be freed.
>> + */
>> +struct perf_event_attr_map_entry {
>> +	__u32 link_id;
>> +	__u32 diff_map_id;
>> +};
> 
> this header file should be self contained,
> so you need __u32 definitions

Will add. 

> 
> 
>> +
>> +/* pin the map at sysfs__mountpoint()/BPERF_DEFAULT_ATTR_MAP_PATH */
>> +#define BPERF_DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"
> 
> if we are going to expose this, I think we should expose just
> "perf_attr_map" ... without the 'fs/bpf' part, because that
> could be mounted anywhere

Will fix this. 

Thanks,
Song



* Re: [PATCH 2/2] perf-stat: introduce config stat.bpf-counter-events
  2021-04-06 14:21   ` Jiri Olsa
@ 2021-04-06 21:46     ` Song Liu
  0 siblings, 0 replies; 6+ messages in thread
From: Song Liu @ 2021-04-06 21:46 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Song Liu, linux-kernel, Kernel Team, acme, acme, namhyung, jolsa



> On Apr 6, 2021, at 7:21 AM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Fri, Apr 02, 2021 at 05:29:38PM -0700, Song Liu wrote:
>> Currently, to use BPF to aggregate perf event counters, the user uses the
>> --bpf-counters option. Enable "use bpf by default" events with a config
>> option, stat.bpf-counter-events. This is limited to hardware events in
>> evsel__hw_names.
>> 
>> This also enables mixing BPF events and regular events in the same session.
>> For example:
>> 
>>   perf config stat.bpf-counter-events=instructions
>>   perf stat -e instructions,cs
> 
> hum, so this will effectively allow mixing 'bpf-shared' counters
> with normal ones.. I don't think we're ready for that ;-)

I think we are ready. :) All the bpf_counter stuff is within evsel, so mixing 
them doesn't need much work. 

> 
>> 
>> The second command will use BPF for "instructions" but not "cs".
>> 
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>> tools/perf/Documentation/perf-stat.txt |  2 ++
>> tools/perf/builtin-stat.c              | 41 ++++++++++++++++----------
>> tools/perf/util/bpf_counter.c          | 11 +++++++
>> tools/perf/util/config.c               | 32 ++++++++++++++++++++
>> tools/perf/util/evsel.c                |  2 ++
>> tools/perf/util/evsel.h                |  1 +
>> tools/perf/util/target.h               |  5 ----
>> 7 files changed, 74 insertions(+), 20 deletions(-)
>> 
>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>> index 744211fa8c186..6d4733eaac170 100644
>> --- a/tools/perf/Documentation/perf-stat.txt
>> +++ b/tools/perf/Documentation/perf-stat.txt
>> @@ -97,6 +97,8 @@ report::
>> 	Use BPF programs to aggregate readings from perf_events.  This
>> 	allows multiple perf-stat sessions that are counting the same metric (cycles,
>> 	instructions, etc.) to share hardware counters.
>> +	To use BPF programs on common hardware events by default, use
>> +	"perf config stat.bpf-counter-events=<list_of_events>".
>> 
>> --bpf-attr-map::
>> 	With option "--bpf-counters", different perf-stat sessions share
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 4bb48c6b66980..5adfa708ffe68 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -423,17 +423,28 @@ static int read_affinity_counters(struct timespec *rs)
>> 	return 0;
>> }
>> 
>> +/*
>> + * Returns:
>> + *     0   if all events use BPF;
>> + *     1   if some events do NOT use BPF;
>> + *     < 0 on errors;
>> + */
>> static int read_bpf_map_counters(void)
>> {
>> +	bool has_none_bpf_events = false;
>> 	struct evsel *counter;
>> 	int err;
>> 
>> 	evlist__for_each_entry(evsel_list, counter) {
>> +		if (!counter->bpf_counter_ops) {
>> +			has_none_bpf_events = true;
>> +			continue;
>> +		}
>> 		err = bpf_counter__read(counter);
>> 		if (err)
>> 			return err;
>> 	}
>> -	return 0;
>> +	return has_none_bpf_events ? 1 : 0;
>> }
>> 
>> static void read_counters(struct timespec *rs)
>> @@ -442,9 +453,10 @@ static void read_counters(struct timespec *rs)
>> 	int err;
>> 
>> 	if (!stat_config.stop_read_counter) {
>> -		if (target__has_bpf(&target))
>> -			err = read_bpf_map_counters();
>> -		else
>> +		err = read_bpf_map_counters();
>> +		if (err < 0)
>> +			return;
>> +		if (err)
>> 			err = read_affinity_counters(rs);
> 
> so read_affinity_counters will also read 'bpf-shared' counters, no?
> as long as it was separated, I did not see a problem, now we have
> counters that either have bpf ops set or have not
> 
> it'd be great to do some generic separation.. I was thinking to move
> bpf_counter_ops into some generic counter ops and we would just fill
> in the proper ops for the counter.. buuut the affinity readings are
> not compatible with what we are doing in bperf_read and the profiler
> bpf read
> 
> so I think the solution will be just to skip those events in
> read_affinity_counters and all the other code, and have some
> helper like:
> 
>   bool evsel__is_bpf(evsel)
> 
> so it's clear why it's skipped

Yes, this will be better! The current version does have the problem of an 
extra read in read_affinity_counters(). Will fix this. 

Thanks,
Song

