linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 1/2] perf print-events: Fix "perf list" can not display the PMU prefix for some hybrid cache events
@ 2022-09-23  3:00 zhengjun.xing
  2022-09-23  3:00 ` [PATCH v2 2/2] perf parse-events: Remove "not supported" " zhengjun.xing
  0 siblings, 1 reply; 4+ messages in thread
From: zhengjun.xing @ 2022-09-23  3:00 UTC (permalink / raw)
  To: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users, irogers, ak, kan.liang,
	zhengjun.xing, Yi Ammy

From: Zhengjun Xing <zhengjun.xing@linux.intel.com>

Some hybrid hardware cache events are only available on one CPU PMU. For
example, 'L1-dcache-load-misses' is only available on cpu_core. "perf list"
already reports this by prefixing such events with the PMU name. That
worked until the "config" argument of is_event_supported() was changed
from "u64" to "unsigned int", which caused a regression: "perf list" can
no longer display the PMU prefix for some hybrid cache events. On hybrid
systems the PMU type ID is stored in config[63:32], so declaring config as
"unsigned int" drops the PMU type ID. Declare config as "u64" again to fix
the regression.
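
To make the truncation concrete, here is a minimal standalone sketch. It is
not perf code: the helper names are made up for illustration, and only the
placement of the PMU type ID in config[63:32] is taken from the description
above. It assumes a 32-bit "unsigned int".

```c
#include <stdint.h>

/* The PMU type ID lives in config[63:32]. */
#define PMU_TYPE_SHIFT 32

/* Hypothetical helper: combine a cache event encoding with a PMU type ID. */
static uint64_t make_hybrid_config(uint32_t hw_config, uint32_t pmu_type)
{
	return (uint64_t)hw_config | ((uint64_t)pmu_type << PMU_TYPE_SHIFT);
}

/* What a callee sees when its parameter is declared "unsigned int". */
static uint64_t seen_as_unsigned_int(unsigned int config)
{
	return config;
}

/* What a callee sees when its parameter is declared "u64". */
static uint64_t seen_as_u64(uint64_t config)
{
	return config;
}

/* Extract the PMU type ID back out of config[63:32]. */
static uint32_t pmu_type_of(uint64_t config)
{
	return (uint32_t)(config >> PMU_TYPE_SHIFT);
}
```

For a config built with a made-up PMU type ID of 8,
pmu_type_of(seen_as_u64(cfg)) is still 8, while
pmu_type_of(seen_as_unsigned_int(cfg)) is 0: the PMU type ID is gone before
the event is ever probed.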

Before:
 # ./perf list |grep "Hardware cache event"
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  L1-icache-loads                                    [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  node-load-misses                                   [Hardware cache event]
  node-loads                                         [Hardware cache event]

After:
 # ./perf list |grep "Hardware cache event"
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  cpu_atom/L1-icache-loads/                          [Hardware cache event]
  cpu_core/L1-dcache-load-misses/                    [Hardware cache event]
  cpu_core/node-load-misses/                         [Hardware cache event]
  cpu_core/node-loads/                               [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]

Fixes: 9b7c7728f4e4 ("perf parse-events: Break out tracepoint and printing")
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Reported-by: Yi Ammy <ammy.yi@intel.com>
---
Change log:
  v2:
    * Adds Acked-by from Ian Rogers <irogers@google.com>
    * Adds Reported-by from Yi Ammy <ammy.yi@intel.com>

 tools/perf/util/print-events.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c
index ba1ab5134685..04050d4f6db8 100644
--- a/tools/perf/util/print-events.c
+++ b/tools/perf/util/print-events.c
@@ -239,7 +239,7 @@ void print_sdt_events(const char *subsys_glob, const char *event_glob,
 	strlist__delete(sdtlist);
 }
 
-static bool is_event_supported(u8 type, unsigned int config)
+static bool is_event_supported(u8 type, u64 config)
 {
 	bool ret = true;
 	int open_return;
-- 
2.25.1



* [PATCH v2 2/2] perf parse-events: Remove "not supported" hybrid cache events
  2022-09-23  3:00 [PATCH v2 1/2] perf print-events: Fix "perf list" can not display the PMU prefix for some hybrid cache events zhengjun.xing
@ 2022-09-23  3:00 ` zhengjun.xing
  2022-09-23 16:55   ` Ian Rogers
  0 siblings, 1 reply; 4+ messages in thread
From: zhengjun.xing @ 2022-09-23  3:00 UTC (permalink / raw)
  To: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users, irogers, ak, kan.liang,
	zhengjun.xing, Yi Ammy

From: Zhengjun Xing <zhengjun.xing@linux.intel.com>

By default, we create two hybrid cache events, one for cpu_core and one
for cpu_atom. But some hybrid hardware cache events are only available on
one CPU PMU. For example, 'L1-dcache-load-misses' is only available on
cpu_core, while 'L1-icache-loads' is only available on cpu_atom. Make
is_event_supported() a global API and use it to check whether a hybrid
cache event is supported before creating it, so that "not supported"
hybrid cache events are no longer created.
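
The idea can be sketched outside of perf roughly as follows. This is a
simplified illustration, not the code in this patch: struct hybrid_pmu,
supported_fn, create_hybrid_cache_events() and the stub only_on_type_4() are
hypothetical names, and the two macros are local stand-ins for
PERF_HW_EVENT_MASK and PERF_PMU_TYPE_SHIFT with the values I believe the
UAPI header uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for PERF_HW_EVENT_MASK and PERF_PMU_TYPE_SHIFT. */
#define HW_EVENT_MASK   0xffffffffULL
#define PMU_TYPE_SHIFT  32

/* Hypothetical, simplified stand-in for a hybrid PMU description. */
struct hybrid_pmu {
	const char *name;
	uint32_t type;
};

/* Hypothetical callback answering "can this config be opened?". */
typedef bool (*supported_fn)(uint64_t config);

/* Place the PMU type ID in config[63:32], replacing any previous one. */
static uint64_t config_for_pmu(uint64_t config, uint32_t pmu_type)
{
	return (config & HW_EVENT_MASK) | ((uint64_t)pmu_type << PMU_TYPE_SHIFT);
}

/*
 * Walk all hybrid PMUs and keep only those on which the cache event can
 * be opened. Stores the kept PMUs in out[] and returns how many were kept.
 */
static size_t create_hybrid_cache_events(const struct hybrid_pmu *pmus,
					 size_t nr, uint64_t config,
					 supported_fn supported,
					 const struct hybrid_pmu **out)
{
	size_t kept = 0;

	for (size_t i = 0; i < nr; i++) {
		/* Skip rather than create a "<not supported>" event. */
		if (!supported(config_for_pmu(config, pmus[i].type)))
			continue;
		out[kept++] = &pmus[i];
	}
	return kept;
}

/* Stub for illustration only: pretend the event exists on PMU type 4 alone. */
static bool only_on_type_4(uint64_t config)
{
	return (config >> PMU_TYPE_SHIFT) == 4;
}
```

With two made-up PMUs, cpu_core as type 4 and cpu_atom as type 8, and the
stub above, only cpu_core is kept, which mirrors the "After" output below
for an event that exists on cpu_core only.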

Before:

 # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1

 Performance counter stats for 'system wide':

            52,570      cpu_core/L1-dcache-load-misses/
   <not supported>      cpu_atom/L1-dcache-load-misses/
   <not supported>      cpu_core/L1-icache-loads/
         1,471,817      cpu_atom/L1-icache-loads/

       1.004915229 seconds time elapsed

After:

 # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1

 Performance counter stats for 'system wide':

            54,510      cpu_core/L1-dcache-load-misses/
         1,441,286      cpu_atom/L1-icache-loads/

       1.005114281 seconds time elapsed

Fixes: 30def61f64ba ("perf parse-events: Create two hybrid cache events")
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Reported-by: Yi Ammy <ammy.yi@intel.com>
---
Change log:
  v2:
    * Add a comment for removing "not supported" hybrid cache events.
    * Remove goto and add a strdup check.
    * Move "is_event_supported" to parse-events.c per Ian's suggestion.
    * Add Reported-by from Yi Ammy <ammy.yi@intel.com>

 tools/perf/util/parse-events-hybrid.c | 21 ++++++++++++---
 tools/perf/util/parse-events.c        | 39 +++++++++++++++++++++++++++
 tools/perf/util/parse-events.h        |  1 +
 tools/perf/util/print-events.c        | 39 ---------------------------
 4 files changed, 57 insertions(+), 43 deletions(-)

diff --git a/tools/perf/util/parse-events-hybrid.c b/tools/perf/util/parse-events-hybrid.c
index 284f8eabd3b9..7c9f9150bad5 100644
--- a/tools/perf/util/parse-events-hybrid.c
+++ b/tools/perf/util/parse-events-hybrid.c
@@ -33,7 +33,8 @@ static void config_hybrid_attr(struct perf_event_attr *attr,
 	 * If the PMU type ID is 0, the PERF_TYPE_RAW will be applied.
 	 */
 	attr->type = type;
-	attr->config = attr->config | ((__u64)pmu_type << PERF_PMU_TYPE_SHIFT);
+	attr->config = (attr->config & PERF_HW_EVENT_MASK) |
+			((__u64)pmu_type << PERF_PMU_TYPE_SHIFT);
 }
 
 static int create_event_hybrid(__u32 config_type, int *idx,
@@ -48,13 +49,25 @@ static int create_event_hybrid(__u32 config_type, int *idx,
 	__u64 config = attr->config;
 
 	config_hybrid_attr(attr, config_type, pmu->type);
+
+	/*
+	 * Some hybrid hardware cache events are only available on one CPU
+	 * PMU. For example, the 'L1-dcache-load-misses' is only available
+	 * on cpu_core, while the 'L1-icache-loads' is only available on
+	 * cpu_atom. We need to remove "not supported" hybrid cache events.
+	 */
+	if (attr->type == PERF_TYPE_HW_CACHE
+	    && !is_event_supported(attr->type, attr->config))
+		return 0;
+
 	evsel = parse_events__add_event_hybrid(list, idx, attr, name, metric_id,
 					       pmu, config_terms);
-	if (evsel)
+	if (evsel) {
 		evsel->pmu_name = strdup(pmu->name);
-	else
+		if (!evsel->pmu_name)
+			return -ENOMEM;
+	} else
 		return -ENOMEM;
-
 	attr->type = type;
 	attr->config = config;
 	return 0;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index f05e15acd33f..f3b2c2a87456 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -28,6 +28,7 @@
 #include "util/parse-events-hybrid.h"
 #include "util/pmu-hybrid.h"
 #include "tracepoint.h"
+#include "thread_map.h"
 
 #define MAX_NAME_LEN 100
 
@@ -157,6 +158,44 @@ struct event_symbol event_symbols_sw[PERF_COUNT_SW_MAX] = {
 #define PERF_EVENT_TYPE(config)		__PERF_EVENT_FIELD(config, TYPE)
 #define PERF_EVENT_ID(config)		__PERF_EVENT_FIELD(config, EVENT)
 
+bool is_event_supported(u8 type, u64 config)
+{
+	bool ret = true;
+	int open_return;
+	struct evsel *evsel;
+	struct perf_event_attr attr = {
+		.type = type,
+		.config = config,
+		.disabled = 1,
+	};
+	struct perf_thread_map *tmap = thread_map__new_by_tid(0);
+
+	if (tmap == NULL)
+		return false;
+
+	evsel = evsel__new(&attr);
+	if (evsel) {
+		open_return = evsel__open(evsel, NULL, tmap);
+		ret = open_return >= 0;
+
+		if (open_return == -EACCES) {
+			/*
+			 * This happens if the paranoid value
+			 * /proc/sys/kernel/perf_event_paranoid is set to 2
+			 * Re-run with exclude_kernel set; we don't do that
+			 * by default as some ARM machines do not support it.
+			 *
+			 */
+			evsel->core.attr.exclude_kernel = 1;
+			ret = evsel__open(evsel, NULL, tmap) >= 0;
+		}
+		evsel__delete(evsel);
+	}
+
+	perf_thread_map__put(tmap);
+	return ret;
+}
+
 const char *event_type(int type)
 {
 	switch (type) {
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 7e6a601d9cd0..07df7bb7b042 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -19,6 +19,7 @@ struct option;
 struct perf_pmu;
 
 bool have_tracepoints(struct list_head *evlist);
+bool is_event_supported(u8 type, u64 config);
 
 const char *event_type(int type);
 
diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c
index 04050d4f6db8..c4d5d87fae2f 100644
--- a/tools/perf/util/print-events.c
+++ b/tools/perf/util/print-events.c
@@ -22,7 +22,6 @@
 #include "probe-file.h"
 #include "string2.h"
 #include "strlist.h"
-#include "thread_map.h"
 #include "tracepoint.h"
 #include "pfm.h"
 #include "pmu-hybrid.h"
@@ -239,44 +238,6 @@ void print_sdt_events(const char *subsys_glob, const char *event_glob,
 	strlist__delete(sdtlist);
 }
 
-static bool is_event_supported(u8 type, u64 config)
-{
-	bool ret = true;
-	int open_return;
-	struct evsel *evsel;
-	struct perf_event_attr attr = {
-		.type = type,
-		.config = config,
-		.disabled = 1,
-	};
-	struct perf_thread_map *tmap = thread_map__new_by_tid(0);
-
-	if (tmap == NULL)
-		return false;
-
-	evsel = evsel__new(&attr);
-	if (evsel) {
-		open_return = evsel__open(evsel, NULL, tmap);
-		ret = open_return >= 0;
-
-		if (open_return == -EACCES) {
-			/*
-			 * This happens if the paranoid value
-			 * /proc/sys/kernel/perf_event_paranoid is set to 2
-			 * Re-run with exclude_kernel set; we don't do that
-			 * by default as some ARM machines do not support it.
-			 *
-			 */
-			evsel->core.attr.exclude_kernel = 1;
-			ret = evsel__open(evsel, NULL, tmap) >= 0;
-		}
-		evsel__delete(evsel);
-	}
-
-	perf_thread_map__put(tmap);
-	return ret;
-}
-
 int print_hwcache_events(const char *event_glob, bool name_only)
 {
 	unsigned int type, op, i, evt_i = 0, evt_num = 0, npmus = 0;
-- 
2.25.1
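
For readers skimming the diff, the probing logic of is_event_supported()
(open once, and on -EACCES retry with exclude_kernel set) can be sketched
in isolation as below. This is an illustration only: open_fn,
probe_supported() and the two stub openers are hypothetical, and the opener
is assumed to return a non-negative value on success or a negative errno,
which is how the patch treats evsel__open().

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical opener: tries to open the event and returns a
 * non-negative value on success or a negative errno on failure.
 */
typedef int (*open_fn)(bool exclude_kernel);

/* Probe an event; on -EACCES retry once with kernel counting excluded. */
static bool probe_supported(open_fn open_event)
{
	int ret = open_event(false);

	if (ret == -EACCES)
		ret = open_event(true);
	return ret >= 0;
}

/* Stub: behaves like a restrictive perf_event_paranoid setting. */
static int open_needs_exclude_kernel(bool exclude_kernel)
{
	return exclude_kernel ? 0 : -EACCES;
}

/* Stub: the event does not exist on this PMU at all. */
static int open_unsupported(bool exclude_kernel)
{
	(void)exclude_kernel;
	return -ENOENT;
}
```

The first stub is reported as supported after the retry, while the second is
reported as unsupported without a retry, since only -EACCES triggers the
second attempt.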



* Re: [PATCH v2 2/2] perf parse-events: Remove "not supported" hybrid cache events
  2022-09-23  3:00 ` [PATCH v2 2/2] perf parse-events: Remove "not supported" " zhengjun.xing
@ 2022-09-23 16:55   ` Ian Rogers
  2022-09-26 13:17     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 4+ messages in thread
From: Ian Rogers @ 2022-09-23 16:55 UTC (permalink / raw)
  To: zhengjun.xing
  Cc: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung,
	linux-kernel, linux-perf-users, ak, kan.liang, Yi Ammy

On Thu, Sep 22, 2022 at 7:58 PM <zhengjun.xing@linux.intel.com> wrote:
>
> From: Zhengjun Xing <zhengjun.xing@linux.intel.com>
>
> By default, we create two hybrid cache events, one is for cpu_core, and
> another is for cpu_atom. But Some hybrid hardware cache events are only
> available on one CPU PMU. For example, the 'L1-dcache-load-misses' is only
> available on cpu_core, while the 'L1-icache-loads' is only available on
> cpu_atom. We need to remove "not supported" hybrid cache events. By
> extending is_event_supported() to global API and using it to check if the
> hybrid cache events are supported before being created, we can remove the
> "not supported" hybrid cache events.
>
> Before:
>
>  # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>             52,570      cpu_core/L1-dcache-load-misses/
>    <not supported>      cpu_atom/L1-dcache-load-misses/
>    <not supported>      cpu_core/L1-icache-loads/
>          1,471,817      cpu_atom/L1-icache-loads/
>
>        1.004915229 seconds time elapsed
>
> After:
>
>  # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>             54,510      cpu_core/L1-dcache-load-misses/
>          1,441,286      cpu_atom/L1-icache-loads/
>
>        1.005114281 seconds time elapsed
>
> Fixes: 30def61f64ba ("perf parse-events: Create two hybrid cache events")
> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> Reported-by: Yi Ammy <ammy.yi@intel.com>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian



* Re: [PATCH v2 2/2] perf parse-events: Remove "not supported" hybrid cache events
  2022-09-23 16:55   ` Ian Rogers
@ 2022-09-26 13:17     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 4+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-09-26 13:17 UTC (permalink / raw)
  To: Ian Rogers
  Cc: zhengjun.xing, peterz, mingo, alexander.shishkin, jolsa,
	namhyung, linux-kernel, linux-perf-users, ak, kan.liang, Yi Ammy

Em Fri, Sep 23, 2022 at 09:55:16AM -0700, Ian Rogers escreveu:
> On Thu, Sep 22, 2022 at 7:58 PM <zhengjun.xing@linux.intel.com> wrote:
> >
> > From: Zhengjun Xing <zhengjun.xing@linux.intel.com>
> >
> > By default, we create two hybrid cache events, one is for cpu_core, and
> > another is for cpu_atom. But Some hybrid hardware cache events are only
> > available on one CPU PMU. For example, the 'L1-dcache-load-misses' is only
> > available on cpu_core, while the 'L1-icache-loads' is only available on
> > cpu_atom. We need to remove "not supported" hybrid cache events. By
> > extending is_event_supported() to global API and using it to check if the
> > hybrid cache events are supported before being created, we can remove the
> > "not supported" hybrid cache events.
> >
> > Before:
> >
> >  # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1
> >
> >  Performance counter stats for 'system wide':
> >
> >             52,570      cpu_core/L1-dcache-load-misses/
> >    <not supported>      cpu_atom/L1-dcache-load-misses/
> >    <not supported>      cpu_core/L1-icache-loads/
> >          1,471,817      cpu_atom/L1-icache-loads/
> >
> >        1.004915229 seconds time elapsed
> >
> > After:
> >
> >  # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1
> >
> >  Performance counter stats for 'system wide':
> >
> >             54,510      cpu_core/L1-dcache-load-misses/
> >          1,441,286      cpu_atom/L1-icache-loads/
> >
> >        1.005114281 seconds time elapsed
> >
> > Fixes: 30def61f64ba ("perf parse-events: Create two hybrid cache events")
> > Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
> > Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> > Reported-by: Yi Ammy <ammy.yi@intel.com>
> 
> Acked-by: Ian Rogers <irogers@google.com>

Thanks, applied.

- Arnaldo

 

