linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/5] Add perf stat default events for hybrid machines
@ 2022-07-21  6:57 zhengjun.xing
  2022-07-21  6:57 ` [PATCH v4 1/5] perf stat: Revert "perf stat: Add default hybrid events" zhengjun.xing
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: zhengjun.xing @ 2022-07-21  6:57 UTC (permalink / raw)
  To: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users, irogers, ak, kan.liang, zhengjun.xing

From: Zhengjun Xing <zhengjun.xing@linux.intel.com>

This patch series cleans up the existing perf stat default events and adds
support for the perf metrics Topdown on the p-core PMU to the perf stat
default. The first 4 patches clean up the code and fix the "--detailed"
issue. The last patch adds support for the perf metrics Topdown. The perf
metrics Topdown support for the e-core PMU will be implemented separately
later.

Kan Liang (4):
  perf stat: Revert "perf stat: Add default hybrid events"
  perf evsel: Add arch_evsel__hw_name()
  perf evlist: Always use arch_evlist__add_default_attrs()
  perf x86 evlist: Add default hybrid events for perf stat

Zhengjun Xing (1):
  perf stat: Add topdown metrics in the default perf stat on the hybrid
    machine

 tools/perf/arch/x86/util/evlist.c  | 64 +++++++++++++++++++++++++-----
 tools/perf/arch/x86/util/evsel.c   | 20 ++++++++++
 tools/perf/arch/x86/util/topdown.c | 51 ++++++++++++++++++++++++
 tools/perf/arch/x86/util/topdown.h |  1 +
 tools/perf/builtin-stat.c          | 50 ++++-------------------
 tools/perf/util/evlist.c           | 11 +++--
 tools/perf/util/evlist.h           |  9 ++++-
 tools/perf/util/evsel.c            |  7 +++-
 tools/perf/util/evsel.h            |  1 +
 tools/perf/util/stat-display.c     |  2 +-
 tools/perf/util/topdown.c          |  7 ++++
 tools/perf/util/topdown.h          |  3 +-
 12 files changed, 166 insertions(+), 60 deletions(-)

-- 
2.25.1



* [PATCH v4 1/5] perf stat: Revert "perf stat: Add default hybrid events"
  2022-07-21  6:57 [PATCH v4 0/5] Add perf stat default events for hybrid machines zhengjun.xing
@ 2022-07-21  6:57 ` zhengjun.xing
  2022-07-21  6:57 ` [PATCH v4 2/5] perf evsel: Add arch_evsel__hw_name() zhengjun.xing
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: zhengjun.xing @ 2022-07-21  6:57 UTC (permalink / raw)
  To: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users, irogers, ak, kan.liang, zhengjun.xing

From: Kan Liang <kan.liang@linux.intel.com>

This reverts commit ac2dc29edd21 ("perf stat: Add default hybrid
events").

Between this patch and the reverted patch, the commit 6c1912898ed2
("perf parse-events: Rename parse_events_error functions") and the
commit 07eafd4e053a ("perf parse-event: Add init and exit to
parse_event_error") cleaned up the parse_events_error_*() code. The
related changes are also reverted.

The reverted patch is hard to extend to support new default events,
e.g., Topdown events, and the existing "--detailed" option on a hybrid
platform.

A new solution is proposed in a following patch to enable the perf stat
default on a hybrid platform.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
---
Change log:
  v4: 
    * Adds Acked-by from Namhyung Kim <namhyung@kernel.org>
  v3:
    * no change since v1.

 tools/perf/builtin-stat.c | 30 ------------------------------
 1 file changed, 30 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4ce87a8eb7d7..6ac79d95f3b5 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1685,12 +1685,6 @@ static int add_default_attributes(void)
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS	},
   { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES		},
 
-};
-	struct perf_event_attr default_sw_attrs[] = {
-  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK		},
-  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES	},
-  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS		},
-  { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS		},
 };
 
 /*
@@ -1947,30 +1941,6 @@ static int add_default_attributes(void)
 	}
 
 	if (!evsel_list->core.nr_entries) {
-		if (perf_pmu__has_hybrid()) {
-			struct parse_events_error errinfo;
-			const char *hybrid_str = "cycles,instructions,branches,branch-misses";
-
-			if (target__has_cpu(&target))
-				default_sw_attrs[0].config = PERF_COUNT_SW_CPU_CLOCK;
-
-			if (evlist__add_default_attrs(evsel_list,
-						      default_sw_attrs) < 0) {
-				return -1;
-			}
-
-			parse_events_error__init(&errinfo);
-			err = parse_events(evsel_list, hybrid_str, &errinfo);
-			if (err) {
-				fprintf(stderr,
-					"Cannot set up hybrid events %s: %d\n",
-					hybrid_str, err);
-				parse_events_error__print(&errinfo, hybrid_str);
-			}
-			parse_events_error__exit(&errinfo);
-			return err ? -1 : 0;
-		}
-
 		if (target__has_cpu(&target))
 			default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
 
-- 
2.25.1



* [PATCH v4 2/5] perf evsel: Add arch_evsel__hw_name()
  2022-07-21  6:57 [PATCH v4 0/5] Add perf stat default events for hybrid machines zhengjun.xing
  2022-07-21  6:57 ` [PATCH v4 1/5] perf stat: Revert "perf stat: Add default hybrid events" zhengjun.xing
@ 2022-07-21  6:57 ` zhengjun.xing
  2022-07-21  6:57 ` [PATCH v4 3/5] perf evlist: Always use arch_evlist__add_default_attrs() zhengjun.xing
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: zhengjun.xing @ 2022-07-21  6:57 UTC (permalink / raw)
  To: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users, irogers, ak, kan.liang, zhengjun.xing

From: Kan Liang <kan.liang@linux.intel.com>

The commit 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE") extends the two types to become PMU aware types for
a hybrid system. However, the current evsel__hw_name() doesn't take the
PMU type into account. It mistakenly returns "unknown-hardware" for a
hardware event with a specific PMU type.

Add an Arch specific arch_evsel__hw_name() to specially handle the PMU
aware hardware event.

Currently, the extended PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE are
only supported by X86. Only implement the specific arch_evsel__hw_name()
for X86 in this patch.

Nothing is changed for the other Archs.
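
A minimal standalone sketch of the naming rule described above, not the
perf code itself. It assumes the PMU type lives in the upper 32 bits of
attr.config, as in the diff below. The sketch_* names, the shortened
name table, and the use of snprintf() instead of scnprintf() are
illustration-only assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to mirror PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK. */
#define SKETCH_PMU_TYPE_SHIFT 32
#define SKETCH_HW_EVENT_MASK  0xffffffffULL

/* Hypothetical, shortened stand-in for evsel__hw_names[]. */
static const char *const sketch_hw_names[] = {
	"cycles", "instructions", "cache-references", "cache-misses",
	"branches", "branch-misses",
};
#define SKETCH_HW_MAX (sizeof(sketch_hw_names) / sizeof(sketch_hw_names[0]))

/*
 * Format the name of a hardware event. The low 32 bits of config hold
 * the event id, the high 32 bits hold the PMU type (0 when the event
 * is not bound to a specific PMU).
 */
static int sketch_hw_name(uint64_t config, const char *pmu_name,
			  char *bf, size_t size)
{
	uint64_t event = config & SKETCH_HW_EVENT_MASK;
	uint64_t pmu = config >> SKETCH_PMU_TYPE_SHIFT;
	const char *event_name = "unknown-hardware";

	if (event < SKETCH_HW_MAX)
		event_name = sketch_hw_names[event];

	/* No PMU type encoded: print the bare event name. */
	if (!pmu)
		return snprintf(bf, size, "%s", event_name);

	/* PMU aware event: print it as "pmu/event/". */
	return snprintf(bf, size, "%s/%s/",
			pmu_name ? pmu_name : "cpu", event_name);
}
```

With a config of ((uint64_t)4 << 32) and a PMU name of "cpu_core" this
yields "cpu_core/cycles/". The PMU type value 4 is only an example.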

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
---
Change log:
  v4:
    * Adds Acked-by from Namhyung Kim <namhyung@kernel.org>
    * Rebase code to the latest perf/core branch
  v3:
    * no change since v1.

 tools/perf/arch/x86/util/evsel.c | 20 ++++++++++++++++++++
 tools/perf/util/evsel.c          |  7 ++++++-
 tools/perf/util/evsel.h          |  1 +
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index 882c1a8c1ded..ea3972d785d1 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -66,6 +66,26 @@ bool arch_evsel__must_be_in_group(const struct evsel *evsel)
 		 strcasestr(evsel->name, "topdown"));
 }
 
+int arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
+{
+	u64 event = evsel->core.attr.config & PERF_HW_EVENT_MASK;
+	u64 pmu = evsel->core.attr.config >> PERF_PMU_TYPE_SHIFT;
+	const char *event_name;
+
+	if (event < PERF_COUNT_HW_MAX && evsel__hw_names[event])
+		event_name = evsel__hw_names[event];
+	else
+		event_name = "unknown-hardware";
+
+	/* The PMU type is not required for the non-hybrid platform. */
+	if (!pmu)
+		return  scnprintf(bf, size, "%s", event_name);
+
+	return scnprintf(bf, size, "%s/%s/",
+			 evsel->pmu_name ? evsel->pmu_name : "cpu",
+			 event_name);
+}
+
 static void ibs_l3miss_warn(void)
 {
 	pr_warning(
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8fea51a9cd90..8199774a1dc2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -593,9 +593,14 @@ static int evsel__add_modifiers(struct evsel *evsel, char *bf, size_t size)
 	return r;
 }
 
+int __weak arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
+{
+	return scnprintf(bf, size, "%s", __evsel__hw_name(evsel->core.attr.config));
+}
+
 static int evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
 {
-	int r = scnprintf(bf, size, "%s", __evsel__hw_name(evsel->core.attr.config));
+	int r = arch_evsel__hw_name(evsel, bf, size);
 	return r + evsel__add_modifiers(evsel, bf + r, size - r);
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 92bed8e2f7d8..9ec48049ee68 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -271,6 +271,7 @@ extern const char *const evsel__hw_names[PERF_COUNT_HW_MAX];
 extern const char *const evsel__sw_names[PERF_COUNT_SW_MAX];
 extern char *evsel__bpf_counter_events;
 bool evsel__match_bpf_counter_events(const char *name);
+int arch_evsel__hw_name(struct evsel *evsel, char *bf, size_t size);
 
 int __evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result, char *bf, size_t size);
 const char *evsel__name(struct evsel *evsel);
-- 
2.25.1



* [PATCH v4 3/5] perf evlist: Always use arch_evlist__add_default_attrs()
  2022-07-21  6:57 [PATCH v4 0/5] Add perf stat default events for hybrid machines zhengjun.xing
  2022-07-21  6:57 ` [PATCH v4 1/5] perf stat: Revert "perf stat: Add default hybrid events" zhengjun.xing
  2022-07-21  6:57 ` [PATCH v4 2/5] perf evsel: Add arch_evsel__hw_name() zhengjun.xing
@ 2022-07-21  6:57 ` zhengjun.xing
  2022-07-21  6:57 ` [PATCH v4 4/5] perf x86 evlist: Add default hybrid events for perf stat zhengjun.xing
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: zhengjun.xing @ 2022-07-21  6:57 UTC (permalink / raw)
  To: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users, irogers, ak, kan.liang, zhengjun.xing

From: Kan Liang <kan.liang@linux.intel.com>

Currently, perf stat uses evlist__add_default_attrs() to add the
generic default attrs, and uses arch_evlist__add_default_attrs()
to add the Arch specific default attrs, e.g., Topdown for X86.

It works well for the non-hybrid platforms. However, for a hybrid
platform, the hard-coded generic default attrs don't work.

Use arch_evlist__add_default_attrs() to replace
evlist__add_default_attrs(). The arch_evlist__add_default_attrs() is
modified to invoke the same __evlist__add_default_attrs() for the
generic default attrs. No functional change.

Add default_null_attrs[] to indicate the Arch specific attrs.
No functional change for the Arch specific default attrs either.
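
A toy sketch of the dispatch described above, assuming a GCC/Clang style
weak attribute. All toy_* names and types are hypothetical stand-ins.
Only the shape of the weak default (return 0 for an empty array,
otherwise fall through to the generic helper) follows the diff below.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct perf_event_attr and struct evlist. */
struct toy_attr {
	int type;
	unsigned long long config;
};

struct toy_evlist {
	size_t nr_entries;
};

/* Stand-in for the generic helper: just count the added attrs. */
static int toy_add_generic(struct toy_evlist *evlist,
			   const struct toy_attr *attrs, size_t nr_attrs)
{
	(void)attrs;
	evlist->nr_entries += nr_attrs;
	return 0;
}

/*
 * Weak default: an empty array requests the Arch specific attrs, of
 * which the generic code has none. A non-empty array takes the
 * generic path. An Arch can override this symbol with a strong one.
 */
__attribute__((weak))
int toy_arch_add_default_attrs(struct toy_evlist *evlist,
			       const struct toy_attr *attrs, size_t nr_attrs)
{
	if (!nr_attrs)
		return 0;

	return toy_add_generic(evlist, attrs, nr_attrs);
}
```

A single entry point keeps the callers unchanged. Only the array that is
passed in decides which path runs.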

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
---
Change log:
  v4:
    * Adds Acked-by from Namhyung Kim <namhyung@kernel.org>
  v3:
    * no change since v1.

 tools/perf/arch/x86/util/evlist.c | 7 ++++++-
 tools/perf/builtin-stat.c         | 6 +++++-
 tools/perf/util/evlist.c          | 9 +++++++--
 tools/perf/util/evlist.h          | 7 +++++--
 4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index 68f681ad54c1..777bdf182a58 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -8,8 +8,13 @@
 #define TOPDOWN_L1_EVENTS	"{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
 #define TOPDOWN_L2_EVENTS	"{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
 
-int arch_evlist__add_default_attrs(struct evlist *evlist)
+int arch_evlist__add_default_attrs(struct evlist *evlist,
+				   struct perf_event_attr *attrs,
+				   size_t nr_attrs)
 {
+	if (nr_attrs)
+		return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
+
 	if (!pmu_have_event("cpu", "slots"))
 		return 0;
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6ac79d95f3b5..837c3ca91af1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1777,6 +1777,9 @@ static int add_default_attributes(void)
 	(PERF_COUNT_HW_CACHE_OP_PREFETCH	<<  8) |
 	(PERF_COUNT_HW_CACHE_RESULT_MISS	<< 16)				},
 };
+
+	struct perf_event_attr default_null_attrs[] = {};
+
 	/* Set attrs if no event is selected and !null_run: */
 	if (stat_config.null_run)
 		return 0;
@@ -1958,7 +1961,8 @@ static int add_default_attributes(void)
 			return -1;
 
 		stat_config.topdown_level = TOPDOWN_MAX_LEVEL;
-		if (arch_evlist__add_default_attrs(evsel_list) < 0)
+		/* Platform specific attrs */
+		if (evlist__add_default_attrs(evsel_list, default_null_attrs) < 0)
 			return -1;
 	}
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 48af7d379d82..efa5f006b5c6 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -342,9 +342,14 @@ int __evlist__add_default_attrs(struct evlist *evlist, struct perf_event_attr *a
 	return evlist__add_attrs(evlist, attrs, nr_attrs);
 }
 
-__weak int arch_evlist__add_default_attrs(struct evlist *evlist __maybe_unused)
+__weak int arch_evlist__add_default_attrs(struct evlist *evlist,
+					  struct perf_event_attr *attrs,
+					  size_t nr_attrs)
 {
-	return 0;
+	if (!nr_attrs)
+		return 0;
+
+	return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
 }
 
 struct evsel *evlist__find_tracepoint_by_id(struct evlist *evlist, int id)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1bde9ccf4e7d..129095c0fe6d 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -107,10 +107,13 @@ static inline int evlist__add_default(struct evlist *evlist)
 int __evlist__add_default_attrs(struct evlist *evlist,
 				     struct perf_event_attr *attrs, size_t nr_attrs);
 
+int arch_evlist__add_default_attrs(struct evlist *evlist,
+				   struct perf_event_attr *attrs,
+				   size_t nr_attrs);
+
 #define evlist__add_default_attrs(evlist, array) \
-	__evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array))
+	arch_evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array))
 
-int arch_evlist__add_default_attrs(struct evlist *evlist);
 struct evsel *arch_evlist__leader(struct list_head *list);
 
 int evlist__add_dummy(struct evlist *evlist);
-- 
2.25.1



* [PATCH v4 4/5] perf x86 evlist: Add default hybrid events for perf stat
  2022-07-21  6:57 [PATCH v4 0/5] Add perf stat default events for hybrid machines zhengjun.xing
                   ` (2 preceding siblings ...)
  2022-07-21  6:57 ` [PATCH v4 3/5] perf evlist: Always use arch_evlist__add_default_attrs() zhengjun.xing
@ 2022-07-21  6:57 ` zhengjun.xing
  2022-07-21  6:57 ` [PATCH v4 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine zhengjun.xing
  2022-07-29 15:03 ` [PATCH v4 0/5] Add perf stat default events for hybrid machines Ian Rogers
  5 siblings, 0 replies; 8+ messages in thread
From: zhengjun.xing @ 2022-07-21  6:57 UTC (permalink / raw)
  To: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users, irogers, ak, kan.liang, zhengjun.xing

From: Kan Liang <kan.liang@linux.intel.com>

Provide a new solution to replace the reverted commit ac2dc29edd21
("perf stat: Add default hybrid events").

For the default software attrs, nothing is changed.
For the default hardware attrs, create a new evsel for each hybrid PMU.

With the new solution, adding a new default attr no longer requires
special support for the hybrid platform.
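
A simplified sketch of the expansion described above, assuming the PMU
type is ORed into the upper 32 bits of the config as in the diff below.
The sketch_* types, the fixed-size output array, and the PMU type
numbers used in any example are hypothetical. The real code allocates
evsels and also sets their CPU maps.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to match PERF_TYPE_HARDWARE, PERF_TYPE_SOFTWARE and
 * PERF_PMU_TYPE_SHIFT. */
#define SKETCH_TYPE_HARDWARE  0
#define SKETCH_TYPE_SOFTWARE  1
#define SKETCH_PMU_TYPE_SHIFT 32

struct sketch_attr {
	uint32_t type;
	uint64_t config;
};

struct sketch_pmu {
	const char *name;
	uint32_t type;
};

struct sketch_evsel {
	struct sketch_attr attr;
	const char *pmu_name;	/* NULL for software events */
};

/*
 * Expand default attrs for a hybrid system: one entry per software
 * attr, one entry per (attr, PMU) pair otherwise. Returns the number
 * of entries written, or -1 if out[] is too small.
 */
static int sketch_expand(const struct sketch_attr *attrs, size_t nr_attrs,
			 const struct sketch_pmu *pmus, size_t nr_pmus,
			 struct sketch_evsel *out, size_t out_cap)
{
	size_t i, p, n = 0;

	for (i = 0; i < nr_attrs; i++) {
		if (attrs[i].type == SKETCH_TYPE_SOFTWARE) {
			if (n == out_cap)
				return -1;
			out[n].attr = attrs[i];
			out[n].pmu_name = NULL;
			n++;
			continue;
		}

		for (p = 0; p < nr_pmus; p++) {
			if (n == out_cap)
				return -1;
			out[n].attr = attrs[i];
			/* Bind the event to this PMU. */
			out[n].attr.config |=
				(uint64_t)pmus[p].type << SKETCH_PMU_TYPE_SHIFT;
			out[n].pmu_name = pmus[p].name;
			n++;
		}
	}

	return (int)n;
}
```

One software attr plus two hardware attrs on a system with two hybrid
PMUs expands to five entries, which matches the paired cpu_core and
cpu_atom lines in the output below.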

Also, the "--detailed" option is supported on the hybrid platform.

With the patch,

./perf stat -a -ddd sleep 1

 Performance counter stats for 'system wide':

       32,231.06 msec cpu-clock                 #   32.056 CPUs utilized
             529      context-switches          #   16.413 /sec
              32      cpu-migrations            #    0.993 /sec
              69      page-faults               #    2.141 /sec
     176,754,151      cpu_core/cycles/          #    5.484 M/sec          (41.65%)
     161,695,280      cpu_atom/cycles/          #    5.017 M/sec          (49.92%)
      48,595,992      cpu_core/instructions/    #    1.508 M/sec          (49.98%)
      32,363,337      cpu_atom/instructions/    #    1.004 M/sec          (58.26%)
      10,088,639      cpu_core/branches/        #  313.010 K/sec          (58.31%)
       6,390,582      cpu_atom/branches/        #  198.274 K/sec          (58.26%)
         846,201      cpu_core/branch-misses/   #   26.254 K/sec          (66.65%)
         676,477      cpu_atom/branch-misses/   #   20.988 K/sec          (58.27%)
      14,290,070      cpu_core/L1-dcache-loads/ #  443.363 K/sec          (66.66%)
       9,983,532      cpu_atom/L1-dcache-loads/ #  309.749 K/sec          (58.27%)
         740,725      cpu_core/L1-dcache-load-misses/ #   22.982 K/sec    (66.66%)
 <not supported>      cpu_atom/L1-dcache-load-misses/
         480,441      cpu_core/LLC-loads/       #   14.906 K/sec          (66.67%)
         326,570      cpu_atom/LLC-loads/       #   10.132 K/sec          (58.27%)
             329      cpu_core/LLC-load-misses/ #   10.208 /sec           (66.68%)
               0      cpu_atom/LLC-load-misses/ #    0.000 /sec           (58.32%)
 <not supported>      cpu_core/L1-icache-loads/
      21,982,491      cpu_atom/L1-icache-loads/ #  682.028 K/sec          (58.43%)
       4,493,189      cpu_core/L1-icache-load-misses/ #  139.406 K/sec    (33.34%)
       4,711,404      cpu_atom/L1-icache-load-misses/ #  146.176 K/sec    (50.08%)
      13,713,090      cpu_core/dTLB-loads/      #  425.462 K/sec          (33.34%)
       9,384,727      cpu_atom/dTLB-loads/      #  291.170 K/sec          (50.08%)
         157,387      cpu_core/dTLB-load-misses/ #    4.883 K/sec         (33.33%)
         108,328      cpu_atom/dTLB-load-misses/ #    3.361 K/sec         (50.08%)
 <not supported>      cpu_core/iTLB-loads/
 <not supported>      cpu_atom/iTLB-loads/
          37,655      cpu_core/iTLB-load-misses/ #    1.168 K/sec         (33.32%)
          61,661      cpu_atom/iTLB-load-misses/ #    1.913 K/sec         (50.03%)
 <not supported>      cpu_core/L1-dcache-prefetches/
 <not supported>      cpu_atom/L1-dcache-prefetches/
 <not supported>      cpu_core/L1-dcache-prefetch-misses/
 <not supported>      cpu_atom/L1-dcache-prefetch-misses/

       1.005466919 seconds time elapsed

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
---
Change log:
  v4:
    * Adds Acked-by from Namhyung Kim <namhyung@kernel.org>
  v3:
    * Use evsel__new() in place of evsel__new_idx()
  v2:
    * The index of all new evsel will be updated when adding to the evlist,
      just set 0 idx for the new evsel.

 tools/perf/arch/x86/util/evlist.c | 52 ++++++++++++++++++++++++++++++-
 tools/perf/util/evlist.c          |  2 +-
 tools/perf/util/evlist.h          |  2 ++
 3 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index 777bdf182a58..c83f8c11735f 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -4,16 +4,66 @@
 #include "util/evlist.h"
 #include "util/parse-events.h"
 #include "topdown.h"
+#include "util/event.h"
+#include "util/pmu-hybrid.h"
 
 #define TOPDOWN_L1_EVENTS	"{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
 #define TOPDOWN_L2_EVENTS	"{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
 
+static int ___evlist__add_default_attrs(struct evlist *evlist,
+					struct perf_event_attr *attrs,
+					size_t nr_attrs)
+{
+	struct perf_cpu_map *cpus;
+	struct evsel *evsel, *n;
+	struct perf_pmu *pmu;
+	LIST_HEAD(head);
+	size_t i = 0;
+
+	for (i = 0; i < nr_attrs; i++)
+		event_attr_init(attrs + i);
+
+	if (!perf_pmu__has_hybrid())
+		return evlist__add_attrs(evlist, attrs, nr_attrs);
+
+	for (i = 0; i < nr_attrs; i++) {
+		if (attrs[i].type == PERF_TYPE_SOFTWARE) {
+			evsel = evsel__new(attrs + i);
+			if (evsel == NULL)
+				goto out_delete_partial_list;
+			list_add_tail(&evsel->core.node, &head);
+			continue;
+		}
+
+		perf_pmu__for_each_hybrid_pmu(pmu) {
+			evsel = evsel__new(attrs + i);
+			if (evsel == NULL)
+				goto out_delete_partial_list;
+			evsel->core.attr.config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT;
+			cpus = perf_cpu_map__get(pmu->cpus);
+			evsel->core.cpus = cpus;
+			evsel->core.own_cpus = perf_cpu_map__get(cpus);
+			evsel->pmu_name = strdup(pmu->name);
+			list_add_tail(&evsel->core.node, &head);
+		}
+	}
+
+	evlist__splice_list_tail(evlist, &head);
+
+	return 0;
+
+out_delete_partial_list:
+	__evlist__for_each_entry_safe(&head, n, evsel)
+		evsel__delete(evsel);
+	return -1;
+}
+
 int arch_evlist__add_default_attrs(struct evlist *evlist,
 				   struct perf_event_attr *attrs,
 				   size_t nr_attrs)
 {
 	if (nr_attrs)
-		return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
+		return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);
 
 	if (!pmu_have_event("cpu", "slots"))
 		return 0;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index efa5f006b5c6..5ff4b9504828 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -309,7 +309,7 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
 	return evsel;
 }
 
-static int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
+int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
 {
 	struct evsel *evsel, *n;
 	LIST_HEAD(head);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 129095c0fe6d..351ba2887a79 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -104,6 +104,8 @@ static inline int evlist__add_default(struct evlist *evlist)
 	return __evlist__add_default(evlist, true);
 }
 
+int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs);
+
 int __evlist__add_default_attrs(struct evlist *evlist,
 				     struct perf_event_attr *attrs, size_t nr_attrs);
 
-- 
2.25.1



* [PATCH v4 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine
  2022-07-21  6:57 [PATCH v4 0/5] Add perf stat default events for hybrid machines zhengjun.xing
                   ` (3 preceding siblings ...)
  2022-07-21  6:57 ` [PATCH v4 4/5] perf x86 evlist: Add default hybrid events for perf stat zhengjun.xing
@ 2022-07-21  6:57 ` zhengjun.xing
  2022-07-29 15:03 ` [PATCH v4 0/5] Add perf stat default events for hybrid machines Ian Rogers
  5 siblings, 0 replies; 8+ messages in thread
From: zhengjun.xing @ 2022-07-21  6:57 UTC (permalink / raw)
  To: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung
  Cc: linux-kernel, linux-perf-users, irogers, ak, kan.liang, zhengjun.xing

From: Zhengjun Xing <zhengjun.xing@linux.intel.com>

Topdown metrics are missing from the default perf stat on the hybrid
machine. Add Topdown metrics to the default perf stat for hybrid systems.

Currently, we support the perf metrics Topdown for the p-core PMU in the
perf stat default. The perf metrics Topdown support for the e-core PMU will
be implemented separately later. Refactor the code and add two x86 specific
functions, arch_get_topdown_pmu_name() and topdown_parse_events(). Widen
the event name column by 7 chars, so that all metrics after the "#" become
aligned again.
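
A small sketch of why the wider column is needed, assuming the "%-*s"
format used in stat-display.c. The sketch_name_column() helper is
hypothetical and formats into a buffer so the result can be inspected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Left-justify an event name in a column of the given width. CSV
 * output uses width 0, i.e. no padding. Returns the formatted length.
 */
static int sketch_name_column(char *bf, size_t size, const char *name,
			      bool csv_output, int width)
{
	return snprintf(bf, size, "%-*s", csv_output ? 0 : width, name);
}
```

A name such as "cpu_core/topdown-br-mispredict/" is 31 chars long, so it
overflows a 25 char column and pushes the "#" to the right, while a
32 char column still pads it.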

The perf metrics topdown feature is supported on the cpu_core of ADL. The
dedicated perf metrics counter and the fixed counter 3 are used for the
topdown events. Adding the topdown metrics doesn't trigger multiplexing.
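
A condensed sketch of how the event string could be chosen, following
the selection logic in the diff below. The event strings are copied
from the patch. The SKETCH_ prefix, the helper name, and the has_l2
parameter (standing in for the pmu_have_event() check on
"topdown-heavy-ops") are illustration-only assumptions.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_L1_EVENTS       "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
#define SKETCH_L1_EVENTS_CORE  "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/}"
#define SKETCH_L2_EVENTS       "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
#define SKETCH_L2_EVENTS_CORE  "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/,cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/,cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}"

/*
 * Pick the topdown event group for a PMU. has_l2 says whether the PMU
 * exposes the level 2 events. The cpu_core PMU of a hybrid system
 * gets the PMU-qualified variant of the group.
 */
static const char *sketch_pick_topdown_events(const char *pmu_name,
					      bool has_l2)
{
	bool core = !strcmp(pmu_name, "cpu_core");

	if (has_l2)
		return core ? SKETCH_L2_EVENTS_CORE : SKETCH_L2_EVENTS;

	return core ? SKETCH_L1_EVENTS_CORE : SKETCH_L1_EVENTS;
}
```

On a non-hybrid system the PMU name is "cpu" and the unqualified groups
are used, so the existing behavior is kept.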

Before:

 # ./perf stat -a true

 Performance counter stats for 'system wide':

             53.70 msec cpu-clock                 #   25.736 CPUs utilized
                80      context-switches          #    1.490 K/sec
                24      cpu-migrations            #  446.951 /sec
                52      page-faults               #  968.394 /sec
         2,788,555      cpu_core/cycles/          #   51.931 M/sec
           851,129      cpu_atom/cycles/          #   15.851 M/sec
         2,974,030      cpu_core/instructions/    #   55.385 M/sec
           416,919      cpu_atom/instructions/    #    7.764 M/sec
           586,136      cpu_core/branches/        #   10.916 M/sec
            79,872      cpu_atom/branches/        #    1.487 M/sec
            14,220      cpu_core/branch-misses/   #  264.819 K/sec
             7,691      cpu_atom/branch-misses/   #  143.229 K/sec

       0.002086438 seconds time elapsed

After:

 # ./perf stat -a true

 Performance counter stats for 'system wide':

             61.39 msec cpu-clock                        #   24.874 CPUs utilized
                76      context-switches                 #    1.238 K/sec
                24      cpu-migrations                   #  390.968 /sec
                52      page-faults                      #  847.097 /sec
         2,753,695      cpu_core/cycles/                 #   44.859 M/sec
           903,899      cpu_atom/cycles/                 #   14.725 M/sec
         2,927,529      cpu_core/instructions/           #   47.690 M/sec
           428,498      cpu_atom/instructions/           #    6.980 M/sec
           581,299      cpu_core/branches/               #    9.470 M/sec
            83,409      cpu_atom/branches/               #    1.359 M/sec
            13,641      cpu_core/branch-misses/          #  222.216 K/sec
             8,008      cpu_atom/branch-misses/          #  130.453 K/sec
        14,761,308      cpu_core/slots/                  #  240.466 M/sec
         3,288,625      cpu_core/topdown-retiring/       #     22.3% retiring
         1,323,323      cpu_core/topdown-bad-spec/       #      9.0% bad speculation
         5,477,470      cpu_core/topdown-fe-bound/       #     37.1% frontend bound
         4,679,199      cpu_core/topdown-be-bound/       #     31.7% backend bound
           646,194      cpu_core/topdown-heavy-ops/      #      4.4% heavy operations       #     17.9% light operations
         1,244,999      cpu_core/topdown-br-mispredict/  #      8.4% branch mispredict      #      0.5% machine clears
         3,891,800      cpu_core/topdown-fetch-lat/      #     26.4% fetch latency          #     10.7% fetch bandwidth
         1,879,034      cpu_core/topdown-mem-bound/      #     12.7% memory bound           #     19.0% Core bound

       0.002467839 seconds time elapsed

Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
---
Change log:
  v4:
    * Adds Acked-by from Namhyung Kim <namhyung@kernel.org>
  v3:
    * Make the pr_warning in one line.
  v2:
    * Refactor arch_get_topdown_pmu_name() as Namhyung's suggestion.

 tools/perf/arch/x86/util/evlist.c  | 13 ++------
 tools/perf/arch/x86/util/topdown.c | 51 ++++++++++++++++++++++++++++++
 tools/perf/arch/x86/util/topdown.h |  1 +
 tools/perf/builtin-stat.c          | 14 ++------
 tools/perf/util/stat-display.c     |  2 +-
 tools/perf/util/topdown.c          |  7 ++++
 tools/perf/util/topdown.h          |  3 +-
 7 files changed, 66 insertions(+), 25 deletions(-)

diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
index c83f8c11735f..cb59ce9b9638 100644
--- a/tools/perf/arch/x86/util/evlist.c
+++ b/tools/perf/arch/x86/util/evlist.c
@@ -3,12 +3,9 @@
 #include "util/pmu.h"
 #include "util/evlist.h"
 #include "util/parse-events.h"
-#include "topdown.h"
 #include "util/event.h"
 #include "util/pmu-hybrid.h"
-
-#define TOPDOWN_L1_EVENTS	"{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
-#define TOPDOWN_L2_EVENTS	"{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
+#include "topdown.h"
 
 static int ___evlist__add_default_attrs(struct evlist *evlist,
 					struct perf_event_attr *attrs,
@@ -65,13 +62,7 @@ int arch_evlist__add_default_attrs(struct evlist *evlist,
 	if (nr_attrs)
 		return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);
 
-	if (!pmu_have_event("cpu", "slots"))
-		return 0;
-
-	if (pmu_have_event("cpu", "topdown-heavy-ops"))
-		return parse_events(evlist, TOPDOWN_L2_EVENTS, NULL);
-	else
-		return parse_events(evlist, TOPDOWN_L1_EVENTS, NULL);
+	return topdown_parse_events(evlist);
 }
 
 struct evsel *arch_evlist__leader(struct list_head *list)
diff --git a/tools/perf/arch/x86/util/topdown.c b/tools/perf/arch/x86/util/topdown.c
index f81a7cfe4d63..67c524324125 100644
--- a/tools/perf/arch/x86/util/topdown.c
+++ b/tools/perf/arch/x86/util/topdown.c
@@ -3,9 +3,17 @@
 #include "api/fs/fs.h"
 #include "util/pmu.h"
 #include "util/topdown.h"
+#include "util/evlist.h"
+#include "util/debug.h"
+#include "util/pmu-hybrid.h"
 #include "topdown.h"
 #include "evsel.h"
 
+#define TOPDOWN_L1_EVENTS       "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
+#define TOPDOWN_L1_EVENTS_CORE  "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/}"
+#define TOPDOWN_L2_EVENTS       "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
+#define TOPDOWN_L2_EVENTS_CORE  "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/,cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/,cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}"
+
 /* Check whether there is a PMU which supports the perf metrics. */
 bool topdown_sys_has_perf_metrics(void)
 {
@@ -73,3 +81,46 @@ bool arch_topdown_sample_read(struct evsel *leader)
 
 	return false;
 }
+
+const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn)
+{
+	const char *pmu_name;
+
+	if (!perf_pmu__has_hybrid())
+		return "cpu";
+
+	if (!evlist->hybrid_pmu_name) {
+		if (warn)
+			pr_warning("WARNING: default to use cpu_core topdown events\n");
+		evlist->hybrid_pmu_name = perf_pmu__hybrid_type_to_pmu("core");
+	}
+
+	pmu_name = evlist->hybrid_pmu_name;
+
+	return pmu_name;
+}
+
+int topdown_parse_events(struct evlist *evlist)
+{
+	const char *topdown_events;
+	const char *pmu_name;
+
+	if (!topdown_sys_has_perf_metrics())
+		return 0;
+
+	pmu_name = arch_get_topdown_pmu_name(evlist, false);
+
+	if (pmu_have_event(pmu_name, "topdown-heavy-ops")) {
+		if (!strcmp(pmu_name, "cpu_core"))
+			topdown_events = TOPDOWN_L2_EVENTS_CORE;
+		else
+			topdown_events = TOPDOWN_L2_EVENTS;
+	} else {
+		if (!strcmp(pmu_name, "cpu_core"))
+			topdown_events = TOPDOWN_L1_EVENTS_CORE;
+		else
+			topdown_events = TOPDOWN_L1_EVENTS;
+	}
+
+	return parse_events(evlist, topdown_events, NULL);
+}
diff --git a/tools/perf/arch/x86/util/topdown.h b/tools/perf/arch/x86/util/topdown.h
index 46bf9273e572..7eb81f042838 100644
--- a/tools/perf/arch/x86/util/topdown.h
+++ b/tools/perf/arch/x86/util/topdown.h
@@ -3,5 +3,6 @@
 #define _TOPDOWN_H 1
 
 bool topdown_sys_has_perf_metrics(void);
+int topdown_parse_events(struct evlist *evlist);
 
 #endif
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 837c3ca91af1..c6b68be78f8c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -71,6 +71,7 @@
 #include "util/bpf_counter.h"
 #include "util/iostat.h"
 #include "util/pmu-hybrid.h"
+#include "util/topdown.h"
 #include "asm/bug.h"
 
 #include <linux/time64.h>
@@ -1858,22 +1859,11 @@ static int add_default_attributes(void)
 		unsigned int max_level = 1;
 		char *str = NULL;
 		bool warn = false;
-		const char *pmu_name = "cpu";
+		const char *pmu_name = arch_get_topdown_pmu_name(evsel_list, true);
 
 		if (!force_metric_only)
 			stat_config.metric_only = true;
 
-		if (perf_pmu__has_hybrid()) {
-			if (!evsel_list->hybrid_pmu_name) {
-				pr_warning("WARNING: default to use cpu_core topdown events\n");
-				evsel_list->hybrid_pmu_name = perf_pmu__hybrid_type_to_pmu("core");
-			}
-
-			pmu_name = evsel_list->hybrid_pmu_name;
-			if (!pmu_name)
-				return -1;
-		}
-
 		if (pmu_have_event(pmu_name, topdown_metric_L2_attrs[5])) {
 			metric_attrs = topdown_metric_L2_attrs;
 			max_level = 2;
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 606f09b09226..44045565c8f8 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -374,7 +374,7 @@ static void abs_printout(struct perf_stat_config *config,
 			config->csv_output ? 0 : config->unit_width,
 			evsel->unit, config->csv_sep);
 
-	fprintf(output, "%-*s", config->csv_output ? 0 : 25, evsel__name(evsel));
+	fprintf(output, "%-*s", config->csv_output ? 0 : 32, evsel__name(evsel));
 
 	print_cgroup(config, evsel);
 }
diff --git a/tools/perf/util/topdown.c b/tools/perf/util/topdown.c
index a369f84ceb6a..1090841550f7 100644
--- a/tools/perf/util/topdown.c
+++ b/tools/perf/util/topdown.c
@@ -65,3 +65,10 @@ __weak bool arch_topdown_sample_read(struct evsel *leader __maybe_unused)
 {
 	return false;
 }
+
+__weak const char *arch_get_topdown_pmu_name(struct evlist *evlist
+					     __maybe_unused,
+					     bool warn __maybe_unused)
+{
+	return "cpu";
+}
diff --git a/tools/perf/util/topdown.h b/tools/perf/util/topdown.h
index 118e75281f93..f9531528c559 100644
--- a/tools/perf/util/topdown.h
+++ b/tools/perf/util/topdown.h
@@ -2,11 +2,12 @@
 #ifndef TOPDOWN_H
 #define TOPDOWN_H 1
 #include "evsel.h"
+#include "evlist.h"
 
 bool arch_topdown_check_group(bool *warn);
 void arch_topdown_group_warn(void);
 bool arch_topdown_sample_read(struct evsel *leader);
-
+const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn);
 int topdown_filter_events(const char **attr, char **str, bool use_group,
 			  const char *pmu_name);
 
-- 
2.25.1

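Note on the selection logic in topdown_parse_events() above: the function makes two independent decisions, namely whether the PMU exposes the level-2 event "topdown-heavy-ops" and whether the PMU is "cpu_core". The following is a minimal standalone sketch of that two-by-two choice, not the perf code itself. The GROUP_* strings are shortened stand-ins for the TOPDOWN_* macros, select_topdown_group() is a hypothetical name, and the has_l2 flag is an assumption standing in for the pmu_have_event() lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Shortened stand-ins for the TOPDOWN_* event-group macros in the patch. */
#define GROUP_L1       "{slots,topdown-retiring}"
#define GROUP_L1_CORE  "{slots,cpu_core/topdown-retiring/}"
#define GROUP_L2       "{slots,topdown-retiring,topdown-heavy-ops}"
#define GROUP_L2_CORE  "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-heavy-ops/}"

/*
 * Hypothetical helper: pick an event group from the PMU name and from
 * whether level-2 events are available. In the patch, has_l2 corresponds
 * to pmu_have_event(pmu_name, "topdown-heavy-ops").
 */
static const char *select_topdown_group(const char *pmu_name, bool has_l2)
{
	bool core;

	assert(pmu_name != NULL);
	core = strcmp(pmu_name, "cpu_core") == 0;

	if (has_l2)
		return core ? GROUP_L2_CORE : GROUP_L2;

	return core ? GROUP_L1_CORE : GROUP_L1;
}
```

Only the exact name "cpu_core" selects the PMU-qualified groups; any other name, including the non-hybrid "cpu", falls through to the unqualified ones.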

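Note on the stat-display.c hunk above: the patch does not give a reason for widening the name column from 25 to 32. One plausible reading is that PMU-qualified names such as cpu_core/topdown-br-mispredict/ (31 characters) overflow a 25-character column. The sketch below, under that assumption, measures what the "%-*s" conversion would emit for a given width; padded_name_len() is a hypothetical helper and not part of perf.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the number of characters that
 * "%-*s" would emit for name in a left-justified column of the
 * given width. snprintf() with a NULL buffer and size 0 only
 * computes the length and writes nothing.
 */
static int padded_name_len(const char *name, int width)
{
	assert(name != NULL);
	return snprintf(NULL, 0, "%-*s", width, name);
}
```

A name shorter than the width is padded up to it, while a longer name is emitted in full and pushes later columns to the right. A width of 0, as used for CSV output in the patch, disables padding.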

* Re: [PATCH v4 0/5] Add perf stat default events for hybrid machines
  2022-07-21  6:57 [PATCH v4 0/5] Add perf stat default events for hybrid machines zhengjun.xing
                   ` (4 preceding siblings ...)
  2022-07-21  6:57 ` [PATCH v4 5/5] perf stat: Add topdown metrics in the default perf stat on the hybrid machine zhengjun.xing
@ 2022-07-29 15:03 ` Ian Rogers
  2022-07-29 16:44   ` Arnaldo Carvalho de Melo
  5 siblings, 1 reply; 8+ messages in thread
From: Ian Rogers @ 2022-07-29 15:03 UTC (permalink / raw)
  To: zhengjun.xing
  Cc: acme, peterz, mingo, alexander.shishkin, jolsa, namhyung,
	linux-kernel, linux-perf-users, ak, kan.liang

On Wed, Jul 20, 2022 at 11:56 PM <zhengjun.xing@linux.intel.com> wrote:
>
> From: Zhengjun Xing <zhengjun.xing@linux.intel.com>
>
> The patch series is to clean up the existing perf stat default and support
> the perf metrics Topdown for the p-core PMU in the perf stat default. The
> first 4 patches are the clean-up patch and fixing the "--detailed" issue.
> The last patch adds support for the perf metrics Topdown, the perf metrics
> Topdown support for e-core PMU will be implemented later separately.
>
> Kan Liang (4):
>   perf stat: Revert "perf stat: Add default hybrid events"
>   perf evsel: Add arch_evsel__hw_name()
>   perf evlist: Always use arch_evlist__add_default_attrs()
>   perf x86 evlist: Add default hybrid events for perf stat
>
> Zhengjun Xing (1):
>   perf stat: Add topdown metrics in the default perf stat on the hybrid
>     machine

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

>  tools/perf/arch/x86/util/evlist.c  | 64 +++++++++++++++++++++++++-----
>  tools/perf/arch/x86/util/evsel.c   | 20 ++++++++++
>  tools/perf/arch/x86/util/topdown.c | 51 ++++++++++++++++++++++++
>  tools/perf/arch/x86/util/topdown.h |  1 +
>  tools/perf/builtin-stat.c          | 50 ++++-------------------
>  tools/perf/util/evlist.c           | 11 +++--
>  tools/perf/util/evlist.h           |  9 ++++-
>  tools/perf/util/evsel.c            |  7 +++-
>  tools/perf/util/evsel.h            |  1 +
>  tools/perf/util/stat-display.c     |  2 +-
>  tools/perf/util/topdown.c          |  7 ++++
>  tools/perf/util/topdown.h          |  3 +-
>  12 files changed, 166 insertions(+), 60 deletions(-)
>
> --
> 2.25.1
>


* Re: [PATCH v4 0/5] Add perf stat default events for hybrid machines
  2022-07-29 15:03 ` [PATCH v4 0/5] Add perf stat default events for hybrid machines Ian Rogers
@ 2022-07-29 16:44   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-07-29 16:44 UTC (permalink / raw)
  To: Ian Rogers
  Cc: zhengjun.xing, peterz, mingo, alexander.shishkin, jolsa,
	namhyung, linux-kernel, linux-perf-users, ak, kan.liang

On Fri, Jul 29, 2022 at 08:03:20AM -0700, Ian Rogers wrote:
> On Wed, Jul 20, 2022 at 11:56 PM <zhengjun.xing@linux.intel.com> wrote:
> >
> > From: Zhengjun Xing <zhengjun.xing@linux.intel.com>
> >
> > The patch series is to clean up the existing perf stat default and support
> > the perf metrics Topdown for the p-core PMU in the perf stat default. The
> > first 4 patches are the clean-up patch and fixing the "--detailed" issue.
> > The last patch adds support for the perf metrics Topdown, the perf metrics
> > Topdown support for e-core PMU will be implemented later separately.
> >
> > Kan Liang (4):
> >   perf stat: Revert "perf stat: Add default hybrid events"
> >   perf evsel: Add arch_evsel__hw_name()
> >   perf evlist: Always use arch_evlist__add_default_attrs()
> >   perf x86 evlist: Add default hybrid events for perf stat
> >
> > Zhengjun Xing (1):
> >   perf stat: Add topdown metrics in the default perf stat on the hybrid
> >     machine
> 
> Acked-by: Ian Rogers <irogers@google.com>

Thanks, applied.

- Arnaldo



