* [PATCH 1/2] perf evlist: Keep topdown counters in weak group
@ 2022-05-05  4:38 Ian Rogers
  2022-05-05  4:38 ` [PATCH 2/2] perf test: Add basic stat and topdown group test Ian Rogers
  2022-05-05 11:56 ` [PATCH 1/2] perf evlist: Keep topdown counters in weak group Liang, Kan
  0 siblings, 2 replies; 16+ messages in thread
From: Ian Rogers @ 2022-05-05  4:38 UTC (permalink / raw)
  To: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users
  Cc: Stephane Eranian, Ian Rogers

On Intel Icelake, topdown events must always be grouped with a slots
event as leader. When a metric is parsed, a weak group is formed and
retried if perf_event_open fails. The retried events aren't regrouped,
breaking the slots leader requirement. This change modifies the weak
group "reset" behavior so that topdown events aren't broken out of the
group for the retry.

Before:

$ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

 Performance counter stats for 'system wide':

    47,867,188,483      slots                                                         (92.27%)
   <not supported>      topdown-bad-spec
   <not supported>      topdown-be-bound
   <not supported>      topdown-fe-bound
   <not supported>      topdown-retiring
     2,173,346,937      branch-instructions                                           (92.27%)
        10,540,253      branch-misses             #    0.48% of all branches          (92.29%)
        96,291,140      bus-cycles                                                    (92.29%)
         6,214,202      cache-misses              #   20.120 % of all cache refs      (92.29%)
        30,886,082      cache-references                                              (76.91%)
    11,773,726,641      cpu-cycles                                                    (84.62%)
    11,807,585,307      instructions              #    1.00  insn per cycle           (92.31%)
                 0      mem-loads                                                     (92.32%)
     2,212,928,573      mem-stores                                                    (84.69%)
    10,024,403,118      ref-cycles                                                    (92.35%)
        16,232,978      baclears.any                                                  (92.35%)
        23,832,633      ARITH.DIVIDER_ACTIVE                                          (84.59%)

       0.981070734 seconds time elapsed

After:

$ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

 Performance counter stats for 'system wide':

       31040189283      slots                                                         (92.27%)
        8997514811      topdown-bad-spec          #     28.2% bad speculation         (92.27%)
       10997536028      topdown-be-bound          #     34.5% backend bound           (92.27%)
        4778060526      topdown-fe-bound          #     15.0% frontend bound          (92.27%)
        7086628768      topdown-retiring          #     22.2% retiring                (92.27%)
        1417611942      branch-instructions                                           (92.26%)
           5285529      branch-misses             #    0.37% of all branches          (92.28%)
          62922469      bus-cycles                                                    (92.29%)
           1440708      cache-misses              #    8.292 % of all cache refs      (92.30%)
          17374098      cache-references                                              (76.94%)
        8040889520      cpu-cycles                                                    (84.63%)
        7709992319      instructions              #    0.96  insn per cycle           (92.32%)
                 0      mem-loads                                                     (92.32%)
        1515669558      mem-stores                                                    (84.68%)
        6542411177      ref-cycles                                                    (92.35%)
           4154149      baclears.any                                                  (92.35%)
          20556152      ARITH.DIVIDER_ACTIVE                                          (84.59%)

       1.010799593 seconds time elapsed
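
The fix hinges on recognizing, by name, the events that must stay
grouped. A minimal standalone sketch of that check (simplified from the
arch_evsel__must_be_in_group() change in the diff below; the helper name
here is illustrative, not part of the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Sketch: the slots event and any topdown-* event must remain in their
 * group for perf_event_open to succeed on Icelake. Matching is
 * case-insensitive, as in the patch. */
static bool must_stay_grouped(const char *name)
{
	return name &&
	       (!strcasecmp(name, "slots") ||
		!strncasecmp(name, "topdown", 7));
}
```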

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/arch/x86/util/evsel.c | 12 ++++++++++++
 tools/perf/util/evlist.c         | 16 ++++++++++++++--
 tools/perf/util/evsel.c          | 10 ++++++++++
 tools/perf/util/evsel.h          |  3 +++
 4 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index ac2899a25b7a..40b171de2086 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -3,6 +3,7 @@
 #include <stdlib.h>
 #include "util/evsel.h"
 #include "util/env.h"
+#include "util/pmu.h"
 #include "linux/string.h"
 
 void arch_evsel__set_sample_weight(struct evsel *evsel)
@@ -29,3 +30,14 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
 
 	free(env.cpuid);
 }
+
+bool arch_evsel__must_be_in_group(const struct evsel *evsel)
+{
+	if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
+	    !pmu_have_event("cpu", "slots"))
+		return false;
+
+	return evsel->name &&
+		(!strcasecmp(evsel->name, "slots") ||
+		 !strncasecmp(evsel->name, "topdown", 7));
+}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 52ea004ba01e..dfa65a383502 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1790,8 +1790,17 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
 		if (evsel__has_leader(c2, leader)) {
 			if (is_open && close)
 				perf_evsel__close(&c2->core);
-			evsel__set_leader(c2, c2);
-			c2->core.nr_members = 0;
+			/*
+			 * We want to close all members of the group and reopen
+			 * them. Some events, like Intel topdown, require being
+			 * in a group and so keep these in the group.
+			 */
+			if (!evsel__must_be_in_group(c2) && c2 != leader) {
+				evsel__set_leader(c2, c2);
+				c2->core.nr_members = 0;
+				leader->core.nr_members--;
+			}
+
 			/*
 			 * Set this for all former members of the group
 			 * to indicate they get reopened.
@@ -1799,6 +1808,9 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
 			c2->reset_group = true;
 		}
 	}
+	/* Reset the leader count if all entries were removed. */
+	if (leader->core.nr_members)
+		leader->core.nr_members = 0;
 	return leader;
 }
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d38722560e80..b7c0c9775673 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3082,3 +3082,13 @@ int evsel__source_count(const struct evsel *evsel)
 	}
 	return count;
 }
+
+bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
+{
+	return false;
+}
+
+bool evsel__must_be_in_group(const struct evsel *evsel)
+{
+	return arch_evsel__must_be_in_group(evsel);
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 45d674812239..0ed2850b7ebb 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -488,6 +488,9 @@ bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
 bool evsel__is_leader(struct evsel *evsel);
 void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
 int evsel__source_count(const struct evsel *evsel);
+bool evsel__must_be_in_group(const struct evsel *evsel);
+
+bool arch_evsel__must_be_in_group(const struct evsel *evsel);
 
 /*
  * Macro to swap the bit-field postition and size.
-- 
2.36.0.464.gb9c8b46e94-goog



* [PATCH 2/2] perf test: Add basic stat and topdown group test
  2022-05-05  4:38 [PATCH 1/2] perf evlist: Keep topdown counters in weak group Ian Rogers
@ 2022-05-05  4:38 ` Ian Rogers
  2022-05-05 12:12   ` Liang, Kan
  2022-05-05 11:56 ` [PATCH 1/2] perf evlist: Keep topdown counters in weak group Liang, Kan
  1 sibling, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-05-05  4:38 UTC (permalink / raw)
  To: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users
  Cc: Stephane Eranian, Ian Rogers

Add a basic stat test.
Add two tests of grouping behavior for topdown events. Topdown events
are special in that they must be grouped with the slots event listed first.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/shell/stat.sh | 65 ++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)
 create mode 100755 tools/perf/tests/shell/stat.sh

diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh
new file mode 100755
index 000000000000..80869ea6debc
--- /dev/null
+++ b/tools/perf/tests/shell/stat.sh
@@ -0,0 +1,65 @@
+#!/bin/sh
+# perf stat tests
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+err=0
+test_default_stat() {
+  echo "Basic stat command test"
+  if ! perf stat true 2>&1 | egrep -q "Performance counter stats for 'true':"
+  then
+    echo "Basic stat command test [Failed]"
+    err=1
+    return
+  fi
+  echo "Basic stat command test [Success]"
+}
+
+test_topdown_groups() {
+  # Topdown events must be grouped with the slots event first. Test that
+  # parse-events reorders this.
+  echo "Topdown event group test"
+  if ! perf stat -e '{slots,topdown-retiring}' true > /dev/null 2>&1
+  then
+    echo "Topdown event group test [Skipped event parsing failed]"
+    return
+  fi
+  if perf stat -e '{slots,topdown-retiring}' true 2>&1 | egrep -q "<not supported>"
+  then
+    echo "Topdown event group test [Failed events not supported]"
+    err=1
+    return
+  fi
+  if perf stat -e '{topdown-retiring,slots}' true 2>&1 | egrep -q "<not supported>"
+  then
+    echo "Topdown event group test [Failed slots not reordered first]"
+    err=1
+    return
+  fi
+  echo "Topdown event group test [Success]"
+}
+
+test_topdown_weak_groups() {
+  # Weak groups break if the perf_event_open of multiple grouped events
+  # fails. Breaking a topdown group causes the events to fail. Test a very large
+  # grouping to see that the topdown events aren't broken out.
+  echo "Topdown weak groups test"
+  if ! perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring},branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE' true > /dev/null 2>&1
+  then
+    echo "Topdown weak groups test [Skipped event parsing failed]"
+    return
+  fi
+  if perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' true 2>&1 | egrep -q "<not supported>"
+  then
+    echo "Topdown weak groups test [Failed events not supported]"
+    err=1
+    return
+  fi
+  echo "Topdown weak groups test [Success]"
+}
+
+test_default_stat
+test_topdown_groups
+test_topdown_weak_groups
+exit $err
-- 
2.36.0.464.gb9c8b46e94-goog



* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-05  4:38 [PATCH 1/2] perf evlist: Keep topdown counters in weak group Ian Rogers
  2022-05-05  4:38 ` [PATCH 2/2] perf test: Add basic stat and topdown group test Ian Rogers
@ 2022-05-05 11:56 ` Liang, Kan
  2022-05-05 15:18   ` Ian Rogers
  1 sibling, 1 reply; 16+ messages in thread
From: Liang, Kan @ 2022-05-05 11:56 UTC (permalink / raw)
  To: Ian Rogers, Caleb Biggers, Perry Taylor, Kshipra Bopardikar,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ravi Bangoria, Andi Kleen, Haowen Bai, Riccardo Mancini,
	Kim Phillips, Madhavan Srinivasan, Shunsuke Nakamura,
	Florian Fischer, linux-kernel, linux-perf-users
  Cc: Stephane Eranian

Hi Ian,

Thanks for the patches.

On 5/5/2022 12:38 AM, Ian Rogers wrote:
> On Intel Icelake, topdown events must always be grouped with a slots
> event as leader. When a metric is parsed a weak group is formed and
> retried if perf_event_open fails. The retried events aren't grouped
> breaking the slots leader requirement. This change modifies the weak
> group "reset" behavior so that topdown events aren't broken from the
> group for the retry.
> 
> [...]
> +
> +bool arch_evsel__must_be_in_group(const struct evsel *evsel)
> +{
> +	if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
> +	    !pmu_have_event("cpu", "slots"))
> +		return false;
> +

The big core of ADL (Alder Lake) also supports Perf_metrics. At a
minimum, we should check both "cpu" and "cpu_core" here.
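
A hypothetical shape for such an extended check ("cpu_core" is the Alder
Lake big-core PMU name; the helper name below is illustrative and not
from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Sketch: accept both the classic "cpu" PMU and the Alder Lake big-core
 * "cpu_core" PMU, which also exposes the Perf_metrics (topdown) events. */
static bool is_perf_metrics_pmu(const char *pmu_name)
{
	return pmu_name &&
	       (!strcmp(pmu_name, "cpu") || !strcmp(pmu_name, "cpu_core"));
}
```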

Thanks,
Kan
> +	return evsel->name &&
> +		(!strcasecmp(evsel->name, "slots") ||
> +		 !strncasecmp(evsel->name, "topdown", 7));
> +}
> [...]


* Re: [PATCH 2/2] perf test: Add basic stat and topdown group test
  2022-05-05  4:38 ` [PATCH 2/2] perf test: Add basic stat and topdown group test Ian Rogers
@ 2022-05-05 12:12   ` Liang, Kan
  2022-05-05 15:22     ` Ian Rogers
  0 siblings, 1 reply; 16+ messages in thread
From: Liang, Kan @ 2022-05-05 12:12 UTC (permalink / raw)
  To: Ian Rogers, Caleb Biggers, Perry Taylor, Kshipra Bopardikar,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ravi Bangoria, Andi Kleen, Haowen Bai, Riccardo Mancini,
	Kim Phillips, Madhavan Srinivasan, Shunsuke Nakamura,
	Florian Fischer, linux-kernel, linux-perf-users, Zhengjun Xing
  Cc: Stephane Eranian



On 5/5/2022 12:38 AM, Ian Rogers wrote:
> Add a basic stat test.
> Add two tests of grouping behavior for topdown events. Topdown events
> are special as they must be grouped with the slots event first.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> [...]
> +test_topdown_weak_groups() {
> +  # Weak groups break if the perf_event_open of multiple grouped events
> +  # fails. Breaking a topdown group causes the events to fail. Test a very large
> +  # grouping to see that the topdown events aren't broken out.
> +  echo "Topdown weak groups test"
> +  if ! perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring},branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE' true > /dev/null 2>&1
> +  then
> +    echo "Topdown weak groups test [Skipped event parsing failed]"
> +    return
> +  fi
> +  if perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' true 2>&1 | egrep -q "<not supported>"
> +  then
> +    echo "Topdown weak groups test [Failed events not supported]"
> +    err=1
> +    return
> +  fi
> +  echo "Topdown weak groups test [Success]"
> +}
> +

Should we check for the existence of the slots event before the test?
The perf metrics feature is only available on platforms from ICL (Ice
Lake) onward. It doesn't work on Atom.
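
One way such a guard could look, mirroring what perf's pmu_have_event()
probes in the standard event_source sysfs tree (the function name and
the configurable sysfs root are illustrative assumptions, not code from
the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch: check whether a PMU exposes a named event by probing sysfs,
 * e.g. pmu_has_event("/sys", "cpu", "slots"). Returns false when the
 * event file is absent, so callers can skip topdown tests early. */
static bool pmu_has_event(const char *sysfs, const char *pmu,
			  const char *event)
{
	char path[512];

	snprintf(path, sizeof(path),
		 "%s/bus/event_source/devices/%s/events/%s",
		 sysfs, pmu, event);
	return access(path, F_OK) == 0;
}
```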

Also, I think the test may fail on a hybrid platform, since the big
core and the small core have different formulas for topdown. I think we
should avoid the test on hybrid platforms for now.
+Zhengjun, who is fixing the topdown gap for the hybrid platform. I
think he may take care of the hybrid support later.

Thanks,
Kan
> +test_default_stat
> +test_topdown_groups
> +test_topdown_weak_groups
> +exit $err


* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-05 11:56 ` [PATCH 1/2] perf evlist: Keep topdown counters in weak group Liang, Kan
@ 2022-05-05 15:18   ` Ian Rogers
  2022-05-05 18:15     ` Liang, Kan
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-05-05 15:18 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Stephane Eranian

On Thu, May 5, 2022 at 4:56 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> Hi Ian,
>
> Thanks for the patches.
>
> On 5/5/2022 12:38 AM, Ian Rogers wrote:
> > On Intel Icelake, topdown events must always be grouped with a slots
> > event as leader. When a metric is parsed a weak group is formed and
> > retried if perf_event_open fails. The retried events aren't grouped
> > breaking the slots leader requirement. This change modifies the weak
> > group "reset" behavior so that topdown events aren't broken from the
> > group for the retry.
> >
> > [...]
> > +
> > +bool arch_evsel__must_be_in_group(const struct evsel *evsel)
> > +{
> > +     if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
> > +         !pmu_have_event("cpu", "slots"))
> > +             return false;
> > +
>
> The big core of ADL also supports Perf_metrics. At least, we should
> check both "cpu" and "cpu_core" here.
>
> Thanks,
> Kan

Thanks Kan, we have the same problem for
arch_evlist__add_default_attrs and arch_evlist__leader:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/arch/x86/util/evlist.c?h=perf/core
So I think fixing all of these should be a follow-up. I am working to
get access to an Alderlake system; could we land this first?

Thanks,
Ian

> > +     return evsel->name &&
> > +             (!strcasecmp(evsel->name, "slots") ||
> > +              !strncasecmp(evsel->name, "topdown", 7));
> > +}
> > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> > index 52ea004ba01e..dfa65a383502 100644
> > --- a/tools/perf/util/evlist.c
> > +++ b/tools/perf/util/evlist.c
> > @@ -1790,8 +1790,17 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
> >               if (evsel__has_leader(c2, leader)) {
> >                       if (is_open && close)
> >                               perf_evsel__close(&c2->core);
> > -                     evsel__set_leader(c2, c2);
> > -                     c2->core.nr_members = 0;
> > +                     /*
> > +                      * We want to close all members of the group and reopen
> > +                      * them. Some events, like Intel topdown, require being
> > +                      * in a group and so keep these in the group.
> > +                      */
> > +                     if (!evsel__must_be_in_group(c2) && c2 != leader) {
> > +                             evsel__set_leader(c2, c2);
> > +                             c2->core.nr_members = 0;
> > +                             leader->core.nr_members--;
> > +                     }
> > +
> >                       /*
> >                        * Set this for all former members of the group
> >                        * to indicate they get reopened.
> > @@ -1799,6 +1808,9 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
> >                       c2->reset_group = true;
> >               }
> >       }
> > +     /* Reset the leader count if all entries were removed. */
> > +     if (leader->core.nr_members)
> > +             leader->core.nr_members = 0;
> >       return leader;
> >   }
> >
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index d38722560e80..b7c0c9775673 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -3082,3 +3082,13 @@ int evsel__source_count(const struct evsel *evsel)
> >       }
> >       return count;
> >   }
> > +
> > +bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
> > +{
> > +     return false;
> > +}
> > +
> > +bool evsel__must_be_in_group(const struct evsel *evsel)
> > +{
> > +     return arch_evsel__must_be_in_group(evsel);
> > +}
> > diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> > index 45d674812239..0ed2850b7ebb 100644
> > --- a/tools/perf/util/evsel.h
> > +++ b/tools/perf/util/evsel.h
> > @@ -488,6 +488,9 @@ bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
> >   bool evsel__is_leader(struct evsel *evsel);
> >   void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
> >   int evsel__source_count(const struct evsel *evsel);
> > +bool evsel__must_be_in_group(const struct evsel *evsel);
> > +
> > +bool arch_evsel__must_be_in_group(const struct evsel *evsel);
> >
> >   /*
> >    * Macro to swap the bit-field postition and size.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] perf test: Add basic stat and topdown group test
  2022-05-05 12:12   ` Liang, Kan
@ 2022-05-05 15:22     ` Ian Rogers
  2022-05-05 18:19       ` Liang, Kan
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-05-05 15:22 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Zhengjun Xing, Stephane Eranian

On Thu, May 5, 2022 at 5:12 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> On 5/5/2022 12:38 AM, Ian Rogers wrote:
> > Add a basic stat test.
> > Add two tests of grouping behavior for topdown events. Topdown events
> > are special as they must be grouped with the slots event first.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >   tools/perf/tests/shell/stat.sh | 65 ++++++++++++++++++++++++++++++++++
> >   1 file changed, 65 insertions(+)
> >   create mode 100755 tools/perf/tests/shell/stat.sh
> >
> > diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh
> > new file mode 100755
> > index 000000000000..80869ea6debc
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/stat.sh
> > @@ -0,0 +1,65 @@
> > +#!/bin/sh
> > +# perf stat tests
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +set -e
> > +
> > +err=0
> > +test_default_stat() {
> > +  echo "Basic stat command test"
> > +  if ! perf stat true 2>&1 | egrep -q "Performance counter stats for 'true':"
> > +  then
> > +    echo "Basic stat command test [Failed]"
> > +    err=1
> > +    return
> > +  fi
> > +  echo "Basic stat command test [Success]"
> > +}
> > +
> > +test_topdown_groups() {
> > +  # Topdown events must be grouped with the slots event first. Test that
> > +  # parse-events reorders this.
> > +  echo "Topdown event group test"
> > +  if ! perf stat -e '{slots,topdown-retiring}' true > /dev/null 2>&1
> > +  then
> > +    echo "Topdown event group test [Skipped event parsing failed]"
> > +    return
> > +  fi
> > +  if perf stat -e '{slots,topdown-retiring}' true 2>&1 | egrep -q "<not supported>"
> > +  then
> > +    echo "Topdown event group test [Failed events not supported]"
> > +    err=1
> > +    return
> > +  fi
> > +  if perf stat -e '{topdown-retiring,slots}' true 2>&1 | egrep -q "<not supported>"
> > +  then
> > +    echo "Topdown event group test [Failed slots not reordered first]"
> > +    err=1
> > +    return
> > +  fi
> > +  echo "Topdown event group test [Success]"
> > +}
> > +
> > +test_topdown_weak_groups() {
> > +  # Weak groups break if the perf_event_open of multiple grouped events
> > +  # fails. Breaking a topdown group causes the events to fail. Test a very large
> > +  # grouping to see that the topdown events aren't broken out.
> > +  echo "Topdown weak groups test"
> > +  if ! perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring},branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE' true > /dev/null 2>&1
> > +  then
> > +    echo "Topdown weak groups test [Skipped event parsing failed]"
> > +    return
> > +  fi
> > +  if perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' true 2>&1 | egrep -q "<not supported>"
> > +  then
> > +    echo "Topdown weak groups test [Failed events not supported]"
> > +    err=1
> > +    return
> > +  fi
> > +  echo "Topdown weak groups test [Success]"
> > +}
> > +
>
> Should we check the existence of the slots event before the test?
> The perf metrics feature is only available on new platforms after
> ICL. It doesn't work on Atom.
>
> Also, I think the test may fail on the hybrid platform, since the big core
> and small core have different formulas for the topdown. I think we should
> avoid the test for the hybrid platform for now.
> +Zhengjun, who is fixing the topdown gap for the hybrid platform. I
> think he may take care of the hybrid support later.

Thanks Kan, the test filters out systems that don't support the events
and silently skips the test. The main purpose of the test is to make
sure the somewhat complicated grouping operations for Icelake have
some coverage. Adding more coverage for hybrid would be great, but not
something I think gates this change.

Thanks,
Ian

> Thanks,
> Kan
> > +test_default_stat
> > +test_topdown_groups
> > +test_topdown_weak_groups
> > +exit $err


* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-05 15:18   ` Ian Rogers
@ 2022-05-05 18:15     ` Liang, Kan
  2022-05-05 18:31       ` Ian Rogers
  0 siblings, 1 reply; 16+ messages in thread
From: Liang, Kan @ 2022-05-05 18:15 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Stephane Eranian, Zhengjun Xing



On 5/5/2022 11:18 AM, Ian Rogers wrote:
> On Thu, May 5, 2022 at 4:56 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>> Hi Ian,
>>
>> Thanks for the patches.
>>
>> On 5/5/2022 12:38 AM, Ian Rogers wrote:
>>> On Intel Icelake, topdown events must always be grouped with a slots
>>> event as leader. When a metric is parsed a weak group is formed and
>>> retried if perf_event_open fails. The retried events aren't grouped
>>> breaking the slots leader requirement. This change modifies the weak
>>> group "reset" behavior so that topdown events aren't broken from the
>>> group for the retry.
>>>
>>> $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1
>>>
>>>    Performance counter stats for 'system wide':
>>>
>>>       47,867,188,483      slots                                                         (92.27%)
>>>      <not supported>      topdown-bad-spec
>>>      <not supported>      topdown-be-bound
>>>      <not supported>      topdown-fe-bound
>>>      <not supported>      topdown-retiring
>>>        2,173,346,937      branch-instructions                                           (92.27%)
>>>           10,540,253      branch-misses             #    0.48% of all branches          (92.29%)
>>>           96,291,140      bus-cycles                                                    (92.29%)
>>>            6,214,202      cache-misses              #   20.120 % of all cache refs      (92.29%)
>>>           30,886,082      cache-references                                              (76.91%)
>>>       11,773,726,641      cpu-cycles                                                    (84.62%)
>>>       11,807,585,307      instructions              #    1.00  insn per cycle           (92.31%)
>>>                    0      mem-loads                                                     (92.32%)
>>>        2,212,928,573      mem-stores                                                    (84.69%)
>>>       10,024,403,118      ref-cycles                                                    (92.35%)
>>>           16,232,978      baclears.any                                                  (92.35%)
>>>           23,832,633      ARITH.DIVIDER_ACTIVE                                          (84.59%)
>>>
>>>          0.981070734 seconds time elapsed
>>>
>>> After:
>>>
>>> $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1
>>>
>>>    Performance counter stats for 'system wide':
>>>
>>>          31040189283      slots                                                         (92.27%)
>>>           8997514811      topdown-bad-spec          #     28.2% bad speculation         (92.27%)
>>>          10997536028      topdown-be-bound          #     34.5% backend bound           (92.27%)
>>>           4778060526      topdown-fe-bound          #     15.0% frontend bound          (92.27%)
>>>           7086628768      topdown-retiring          #     22.2% retiring                (92.27%)
>>>           1417611942      branch-instructions                                           (92.26%)
>>>              5285529      branch-misses             #    0.37% of all branches          (92.28%)
>>>             62922469      bus-cycles                                                    (92.29%)
>>>              1440708      cache-misses              #    8.292 % of all cache refs      (92.30%)
>>>             17374098      cache-references                                              (76.94%)
>>>           8040889520      cpu-cycles                                                    (84.63%)
>>>           7709992319      instructions              #    0.96  insn per cycle           (92.32%)
>>>                    0      mem-loads                                                     (92.32%)
>>>           1515669558      mem-stores                                                    (84.68%)
>>>           6542411177      ref-cycles                                                    (92.35%)
>>>              4154149      baclears.any                                                  (92.35%)
>>>             20556152      ARITH.DIVIDER_ACTIVE                                          (84.59%)
>>>
>>>          1.010799593 seconds time elapsed
>>>
>>> Signed-off-by: Ian Rogers <irogers@google.com>
>>> ---
>>>    tools/perf/arch/x86/util/evsel.c | 12 ++++++++++++
>>>    tools/perf/util/evlist.c         | 16 ++++++++++++++--
>>>    tools/perf/util/evsel.c          | 10 ++++++++++
>>>    tools/perf/util/evsel.h          |  3 +++
>>>    4 files changed, 39 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
>>> index ac2899a25b7a..40b171de2086 100644
>>> --- a/tools/perf/arch/x86/util/evsel.c
>>> +++ b/tools/perf/arch/x86/util/evsel.c
>>> @@ -3,6 +3,7 @@
>>>    #include <stdlib.h>
>>>    #include "util/evsel.h"
>>>    #include "util/env.h"
>>> +#include "util/pmu.h"
>>>    #include "linux/string.h"
>>>
>>>    void arch_evsel__set_sample_weight(struct evsel *evsel)
>>> @@ -29,3 +30,14 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
>>>
>>>        free(env.cpuid);
>>>    }
>>> +
>>> +bool arch_evsel__must_be_in_group(const struct evsel *evsel)
>>> +{
>>> +     if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
>>> +         !pmu_have_event("cpu", "slots"))
>>> +             return false;
>>> +
>>
>> The big core of ADL also supports Perf_metrics. At least, we should
>> check both "cpu" and "cpu_core" here.
>>
>> Thanks,
>> Kan
> 
> Thanks Kan, we have the same problem for
> arch_evlist__add_default_attrs and arch_evlist__leader:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/arch/x86/util/evlist.c?h=perf/core

These are known issues. Zhengjun already has some patches to fix them,
but hasn't posted them yet.

The arch_evlist__add_default_attrs impacts the perf stat default. On the 
ADL, perf stat default doesn't display the topdown events.

The arch_evlist__leader impacts the sampling read of the topdown events.
A sampling read of the topdown events always fails with the current
perf tool on ADL.

We already have a large backlog of issues on ADL. I think it's better not to
introduce more.

> So I think fixing all of these should be a follow up. I am working to
> get access to an Alderlake system, could we land this first?
> 

I think we can use pmu_name to replace the "cpu" to fix the issue for 
the hybrid platform. For a hybrid platform, the pmu_name is either 
cpu_atom or cpu_core.

Besides, the topdown events may have a PMU prefix, e.g., 
cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.

How about the below patch?
If it's OK for you, could you please merge it into your V2 patch set?
I can do the test on a ADL system.

diff --git a/tools/perf/arch/x86/util/evsel.c 
b/tools/perf/arch/x86/util/evsel.c
index 40b171de2086..551ae2bab70e 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct 
perf_event_attr *attr)

  bool arch_evsel__must_be_in_group(const struct evsel *evsel)
  {
-	if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
-	    !pmu_have_event("cpu", "slots"))
+	const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
+
+	if (!pmu_have_event(pmu_name, "slots"))
  		return false;

  	return evsel->name &&
-		(!strcasecmp(evsel->name, "slots") ||
-		 !strncasecmp(evsel->name, "topdown", 7));
+		(strcasestr(evsel->name, "slots") ||
+		 strcasestr(evsel->name, "topdown"));
  }



Thanks,
Kan

> Thanks,
> Ian
> 
>>> +     return evsel->name &&
>>> +             (!strcasecmp(evsel->name, "slots") ||
>>> +              !strncasecmp(evsel->name, "topdown", 7));
>>> +}
>>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
>>> index 52ea004ba01e..dfa65a383502 100644
>>> --- a/tools/perf/util/evlist.c
>>> +++ b/tools/perf/util/evlist.c
>>> @@ -1790,8 +1790,17 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
>>>                if (evsel__has_leader(c2, leader)) {
>>>                        if (is_open && close)
>>>                                perf_evsel__close(&c2->core);
>>> -                     evsel__set_leader(c2, c2);
>>> -                     c2->core.nr_members = 0;
>>> +                     /*
>>> +                      * We want to close all members of the group and reopen
>>> +                      * them. Some events, like Intel topdown, require being
>>> +                      * in a group and so keep these in the group.
>>> +                      */
>>> +                     if (!evsel__must_be_in_group(c2) && c2 != leader) {
>>> +                             evsel__set_leader(c2, c2);
>>> +                             c2->core.nr_members = 0;
>>> +                             leader->core.nr_members--;
>>> +                     }
>>> +
>>>                        /*
>>>                         * Set this for all former members of the group
>>>                         * to indicate they get reopened.
>>> @@ -1799,6 +1808,9 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
>>>                        c2->reset_group = true;
>>>                }
>>>        }
>>> +     /* Reset the leader count if all entries were removed. */
>>> +     if (leader->core.nr_members)
>>> +             leader->core.nr_members = 0;
>>>        return leader;
>>>    }
>>>
>>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>>> index d38722560e80..b7c0c9775673 100644
>>> --- a/tools/perf/util/evsel.c
>>> +++ b/tools/perf/util/evsel.c
>>> @@ -3082,3 +3082,13 @@ int evsel__source_count(const struct evsel *evsel)
>>>        }
>>>        return count;
>>>    }
>>> +
>>> +bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
>>> +{
>>> +     return false;
>>> +}
>>> +
>>> +bool evsel__must_be_in_group(const struct evsel *evsel)
>>> +{
>>> +     return arch_evsel__must_be_in_group(evsel);
>>> +}
>>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>>> index 45d674812239..0ed2850b7ebb 100644
>>> --- a/tools/perf/util/evsel.h
>>> +++ b/tools/perf/util/evsel.h
>>> @@ -488,6 +488,9 @@ bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
>>>    bool evsel__is_leader(struct evsel *evsel);
>>>    void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
>>>    int evsel__source_count(const struct evsel *evsel);
>>> +bool evsel__must_be_in_group(const struct evsel *evsel);
>>> +
>>> +bool arch_evsel__must_be_in_group(const struct evsel *evsel);
>>>
>>>    /*
>>>     * Macro to swap the bit-field postition and size.


* Re: [PATCH 2/2] perf test: Add basic stat and topdown group test
  2022-05-05 15:22     ` Ian Rogers
@ 2022-05-05 18:19       ` Liang, Kan
  2022-05-05 18:35         ` Ian Rogers
  0 siblings, 1 reply; 16+ messages in thread
From: Liang, Kan @ 2022-05-05 18:19 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Zhengjun Xing, Stephane Eranian



On 5/5/2022 11:22 AM, Ian Rogers wrote:
> On Thu, May 5, 2022 at 5:12 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>> On 5/5/2022 12:38 AM, Ian Rogers wrote:
>>> Add a basic stat test.
>>> Add two tests of grouping behavior for topdown events. Topdown events
>>> are special as they must be grouped with the slots event first.
>>>
>>> Signed-off-by: Ian Rogers <irogers@google.com>
>>> ---
>>>    tools/perf/tests/shell/stat.sh | 65 ++++++++++++++++++++++++++++++++++
>>>    1 file changed, 65 insertions(+)
>>>    create mode 100755 tools/perf/tests/shell/stat.sh
>>>
>>> diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh
>>> new file mode 100755
>>> index 000000000000..80869ea6debc
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/stat.sh
>>> @@ -0,0 +1,65 @@
>>> +#!/bin/sh
>>> +# perf stat tests
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +
>>> +set -e
>>> +
>>> +err=0
>>> +test_default_stat() {
>>> +  echo "Basic stat command test"
>>> +  if ! perf stat true 2>&1 | egrep -q "Performance counter stats for 'true':"
>>> +  then
>>> +    echo "Basic stat command test [Failed]"
>>> +    err=1
>>> +    return
>>> +  fi
>>> +  echo "Basic stat command test [Success]"
>>> +}
>>> +
>>> +test_topdown_groups() {
>>> +  # Topdown events must be grouped with the slots event first. Test that
>>> +  # parse-events reorders this.
>>> +  echo "Topdown event group test"
>>> +  if ! perf stat -e '{slots,topdown-retiring}' true > /dev/null 2>&1
>>> +  then
>>> +    echo "Topdown event group test [Skipped event parsing failed]"
>>> +    return
>>> +  fi
>>> +  if perf stat -e '{slots,topdown-retiring}' true 2>&1 | egrep -q "<not supported>"
>>> +  then
>>> +    echo "Topdown event group test [Failed events not supported]"
>>> +    err=1
>>> +    return
>>> +  fi
>>> +  if perf stat -e '{topdown-retiring,slots}' true 2>&1 | egrep -q "<not supported>"
>>> +  then
>>> +    echo "Topdown event group test [Failed slots not reordered first]"
>>> +    err=1
>>> +    return
>>> +  fi
>>> +  echo "Topdown event group test [Success]"
>>> +}
>>> +
>>> +test_topdown_weak_groups() {
>>> +  # Weak groups break if the perf_event_open of multiple grouped events
>>> +  # fails. Breaking a topdown group causes the events to fail. Test a very large
>>> +  # grouping to see that the topdown events aren't broken out.
>>> +  echo "Topdown weak groups test"
>>> +  if ! perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring},branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE' true > /dev/null 2>&1
>>> +  then
>>> +    echo "Topdown weak groups test [Skipped event parsing failed]"
>>> +    return
>>> +  fi
>>> +  if perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' true 2>&1 | egrep -q "<not supported>"
>>> +  then
>>> +    echo "Topdown weak groups test [Failed events not supported]"
>>> +    err=1
>>> +    return
>>> +  fi
>>> +  echo "Topdown weak groups test [Success]"
>>> +}
>>> +
>>
>> Should we check the existence of the slots event before the test?
> >> The perf metrics feature is only available on new platforms after
> >> ICL. It doesn't work on Atom.
>>
> >> Also, I think the test may fail on the hybrid platform, since the big core
> >> and small core have different formulas for the topdown. I think we should
> >> avoid the test for the hybrid platform for now.
>> +Zhengjun, who is fixing the topdown gap for the hybrid platform. I
>> think he may take care of the hybrid support later.
> 
> Thanks Kan, the test filters out systems that don't support the events
> and silently skips the test. The main purpose of the test is to make
> sure the somewhat complicated grouping operations for Icelake have
> some coverage. Adding more coverage for hybrid would be great, but not
> something I think gates this change.
>

Sure, we can add the coverage for hybrid later. But please make sure the
test filters out both the systems which don't support perf metrics and
the hybrid systems.
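One way to gate the tests on sysfs could look like the sketch below. This helper is hypothetical (it is not part of the posted test script), and the sysfs layout it checks is an assumption: the slots event appears under the core PMU on Icelake and later, and hybrid platforms expose cpu_core/cpu_atom PMUs instead of a single cpu PMU.

```shell
# Hypothetical helper, not part of the posted patch: succeed only when the
# topdown tests should run. An alternate sysfs root can be passed for testing.
topdown_supported() {
  sysfs="${1:-/sys/bus/event_source/devices}"
  # Perf metrics require the slots event on the core PMU (Icelake and later).
  [ -e "$sysfs/cpu/events/slots" ] || return 1
  # Skip hybrid platforms (cpu_core/cpu_atom PMUs) for now.
  [ ! -e "$sysfs/cpu_core" ] || return 1
  return 0
}
```

The test script could call this before test_topdown_groups and test_topdown_weak_groups and print a "[Skipped ...]" line when it fails.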

Thanks,
Kan

> Thanks,
> Ian
> 
>> Thanks,
>> Kan
>>> +test_default_stat
>>> +test_topdown_groups
>>> +test_topdown_weak_groups
>>> +exit $err


* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-05 18:15     ` Liang, Kan
@ 2022-05-05 18:31       ` Ian Rogers
  2022-05-05 19:43         ` Liang, Kan
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-05-05 18:31 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Stephane Eranian, Zhengjun Xing

On Thu, May 5, 2022 at 11:15 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> On 5/5/2022 11:18 AM, Ian Rogers wrote:
> > On Thu, May 5, 2022 at 4:56 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
> >>
> >> Hi Ian,
> >>
> >> Thanks for the patches.
> >>
> >> On 5/5/2022 12:38 AM, Ian Rogers wrote:
> >>> On Intel Icelake, topdown events must always be grouped with a slots
> >>> event as leader. When a metric is parsed a weak group is formed and
> >>> retried if perf_event_open fails. The retried events aren't grouped
> >>> breaking the slots leader requirement. This change modifies the weak
> >>> group "reset" behavior so that topdown events aren't broken from the
> >>> group for the retry.
> >>>
> >>> $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1
> >>>
> >>>    Performance counter stats for 'system wide':
> >>>
> >>>       47,867,188,483      slots                                                         (92.27%)
> >>>      <not supported>      topdown-bad-spec
> >>>      <not supported>      topdown-be-bound
> >>>      <not supported>      topdown-fe-bound
> >>>      <not supported>      topdown-retiring
> >>>        2,173,346,937      branch-instructions                                           (92.27%)
> >>>           10,540,253      branch-misses             #    0.48% of all branches          (92.29%)
> >>>           96,291,140      bus-cycles                                                    (92.29%)
> >>>            6,214,202      cache-misses              #   20.120 % of all cache refs      (92.29%)
> >>>           30,886,082      cache-references                                              (76.91%)
> >>>       11,773,726,641      cpu-cycles                                                    (84.62%)
> >>>       11,807,585,307      instructions              #    1.00  insn per cycle           (92.31%)
> >>>                    0      mem-loads                                                     (92.32%)
> >>>        2,212,928,573      mem-stores                                                    (84.69%)
> >>>       10,024,403,118      ref-cycles                                                    (92.35%)
> >>>           16,232,978      baclears.any                                                  (92.35%)
> >>>           23,832,633      ARITH.DIVIDER_ACTIVE                                          (84.59%)
> >>>
> >>>          0.981070734 seconds time elapsed
> >>>
> >>> After:
> >>>
> >>> $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1
> >>>
> >>>    Performance counter stats for 'system wide':
> >>>
> >>>          31040189283      slots                                                         (92.27%)
> >>>           8997514811      topdown-bad-spec          #     28.2% bad speculation         (92.27%)
> >>>          10997536028      topdown-be-bound          #     34.5% backend bound           (92.27%)
> >>>           4778060526      topdown-fe-bound          #     15.0% frontend bound          (92.27%)
> >>>           7086628768      topdown-retiring          #     22.2% retiring                (92.27%)
> >>>           1417611942      branch-instructions                                           (92.26%)
> >>>              5285529      branch-misses             #    0.37% of all branches          (92.28%)
> >>>             62922469      bus-cycles                                                    (92.29%)
> >>>              1440708      cache-misses              #    8.292 % of all cache refs      (92.30%)
> >>>             17374098      cache-references                                              (76.94%)
> >>>           8040889520      cpu-cycles                                                    (84.63%)
> >>>           7709992319      instructions              #    0.96  insn per cycle           (92.32%)
> >>>                    0      mem-loads                                                     (92.32%)
> >>>           1515669558      mem-stores                                                    (84.68%)
> >>>           6542411177      ref-cycles                                                    (92.35%)
> >>>              4154149      baclears.any                                                  (92.35%)
> >>>             20556152      ARITH.DIVIDER_ACTIVE                                          (84.59%)
> >>>
> >>>          1.010799593 seconds time elapsed
> >>>
> >>> Signed-off-by: Ian Rogers <irogers@google.com>
> >>> ---
> >>>    tools/perf/arch/x86/util/evsel.c | 12 ++++++++++++
> >>>    tools/perf/util/evlist.c         | 16 ++++++++++++++--
> >>>    tools/perf/util/evsel.c          | 10 ++++++++++
> >>>    tools/perf/util/evsel.h          |  3 +++
> >>>    4 files changed, 39 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
> >>> index ac2899a25b7a..40b171de2086 100644
> >>> --- a/tools/perf/arch/x86/util/evsel.c
> >>> +++ b/tools/perf/arch/x86/util/evsel.c
> >>> @@ -3,6 +3,7 @@
> >>>    #include <stdlib.h>
> >>>    #include "util/evsel.h"
> >>>    #include "util/env.h"
> >>> +#include "util/pmu.h"
> >>>    #include "linux/string.h"
> >>>
> >>>    void arch_evsel__set_sample_weight(struct evsel *evsel)
> >>> @@ -29,3 +30,14 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
> >>>
> >>>        free(env.cpuid);
> >>>    }
> >>> +
> >>> +bool arch_evsel__must_be_in_group(const struct evsel *evsel)
> >>> +{
> >>> +     if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
> >>> +         !pmu_have_event("cpu", "slots"))
> >>> +             return false;
> >>> +
> >>
> >> The big core of ADL also supports Perf_metrics. At least, we should
> >> check both "cpu" and "cpu_core" here.
> >>
> >> Thanks,
> >> Kan
> >
> > Thanks Kan, we have the same problem for
> > arch_evlist__add_default_attrs and arch_evlist__leader:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/arch/x86/util/evlist.c?h=perf/core
>
> These are known issues. Zhengjun already have some patches to fix them,
> but hasn't posted yet.
>
> The arch_evlist__add_default_attrs impacts the perf stat default. On
> ADL, the perf stat default doesn't display the topdown events.
>
> The arch_evlist__leader impacts the sampling read of the topdown events.
> A sampling read of the topdown events should always fail with the
> current perf tool on ADL.
>
> We already have a large backlog on ADL. I think it's better not to
> introduce more issues.
>
> > So I think fixing all of these should be a follow up. I am working to
> > get access to an Alderlake system, could we land this first?
> >
>
> I think we can use pmu_name to replace the "cpu" to fix the issue for
> the hybrid platform. For a hybrid platform, the pmu_name is either
> cpu_atom or cpu_core.
>
> Besides, the topdown events may have a PMU prefix, e.g.,
> cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.
>
> How about the below patch?
> If it's OK with you, could you please merge it into your V2 patch set?
> I can do the test on an ADL system.
>
> diff --git a/tools/perf/arch/x86/util/evsel.c
> b/tools/perf/arch/x86/util/evsel.c
> index 40b171de2086..551ae2bab70e 100644
> --- a/tools/perf/arch/x86/util/evsel.c
> +++ b/tools/perf/arch/x86/util/evsel.c
> @@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct
> perf_event_attr *attr)
>
>   bool arch_evsel__must_be_in_group(const struct evsel *evsel)
>   {
> -       if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
> -           !pmu_have_event("cpu", "slots"))
> +       const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
> +
> +       if (!pmu_have_event(pmu_name, "slots"))
>                 return false;

Hmm. The idea with this test is to see if the architecture supports
topdown events before going further. There's a similar test in all the
arch_evlist functions. I think with cpu_core this needs to become:

if (!pmu_have_event("cpu", "slots") && !pmu_have_event("cpu_core", "slots"))

But we should add a helper function for this. It is odd to have this
change support Alderlake while the existing evlist code does not. Perhaps
we should just wait until Zhengjun's patches land.
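
A minimal sketch of what such a probe could look like, expressed as shell
against sysfs rather than the perf-internal pmu_have_event() API; the
function name and the SYSFS_ROOT override are illustrative assumptions,
not part of the posted patches:

```shell
# Hypothetical probe: does any known core PMU expose the "slots" event?
# "cpu" covers non-hybrid x86, "cpu_core" covers the hybrid big cores.
# SYSFS_ROOT defaults to /sys and is overridable purely for testing.
sysfs_has_slots() {
  root="${SYSFS_ROOT:-/sys}"
  for pmu in cpu cpu_core; do
    if [ -f "$root/bus/event_source/devices/$pmu/events/slots" ]; then
      return 0
    fi
  done
  return 1
}
```

In the perf tool itself this would presumably stay in C as a small helper
wrapping pmu_have_event(), shared by the arch_evsel and arch_evlist code.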

Thanks,
Ian

>         return evsel->name &&
> -               (!strcasecmp(evsel->name, "slots") ||
> -                !strncasecmp(evsel->name, "topdown", 7));
> +               (strcasestr(evsel->name, "slots") ||
> +                strcasestr(evsel->name, "topdown"));
>   }
>
>
>
> Thanks,
> Kan
>
> > Thanks,
> > Ian
> >
> >>> +     return evsel->name &&
> >>> +             (!strcasecmp(evsel->name, "slots") ||
> >>> +              !strncasecmp(evsel->name, "topdown", 7));
> >>> +}
> >>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> >>> index 52ea004ba01e..dfa65a383502 100644
> >>> --- a/tools/perf/util/evlist.c
> >>> +++ b/tools/perf/util/evlist.c
> >>> @@ -1790,8 +1790,17 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
> >>>                if (evsel__has_leader(c2, leader)) {
> >>>                        if (is_open && close)
> >>>                                perf_evsel__close(&c2->core);
> >>> -                     evsel__set_leader(c2, c2);
> >>> -                     c2->core.nr_members = 0;
> >>> +                     /*
> >>> +                      * We want to close all members of the group and reopen
> >>> +                      * them. Some events, like Intel topdown, require being
> >>> +                      * in a group and so keep these in the group.
> >>> +                      */
> >>> +                     if (!evsel__must_be_in_group(c2) && c2 != leader) {
> >>> +                             evsel__set_leader(c2, c2);
> >>> +                             c2->core.nr_members = 0;
> >>> +                             leader->core.nr_members--;
> >>> +                     }
> >>> +
> >>>                        /*
> >>>                         * Set this for all former members of the group
> >>>                         * to indicate they get reopened.
> >>> @@ -1799,6 +1808,9 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
> >>>                        c2->reset_group = true;
> >>>                }
> >>>        }
> >>> +     /* Reset the leader count if all entries were removed. */
> >>> +     if (leader->core.nr_members == 1)
> >>> +             leader->core.nr_members = 0;
> >>>        return leader;
> >>>    }
> >>>
> >>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> >>> index d38722560e80..b7c0c9775673 100644
> >>> --- a/tools/perf/util/evsel.c
> >>> +++ b/tools/perf/util/evsel.c
> >>> @@ -3082,3 +3082,13 @@ int evsel__source_count(const struct evsel *evsel)
> >>>        }
> >>>        return count;
> >>>    }
> >>> +
> >>> +bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
> >>> +{
> >>> +     return false;
> >>> +}
> >>> +
> >>> +bool evsel__must_be_in_group(const struct evsel *evsel)
> >>> +{
> >>> +     return arch_evsel__must_be_in_group(evsel);
> >>> +}
> >>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> >>> index 45d674812239..0ed2850b7ebb 100644
> >>> --- a/tools/perf/util/evsel.h
> >>> +++ b/tools/perf/util/evsel.h
> >>> @@ -488,6 +488,9 @@ bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
> >>>    bool evsel__is_leader(struct evsel *evsel);
> >>>    void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
> >>>    int evsel__source_count(const struct evsel *evsel);
> >>> +bool evsel__must_be_in_group(const struct evsel *evsel);
> >>> +
> >>> +bool arch_evsel__must_be_in_group(const struct evsel *evsel);
> >>>
> >>>    /*
> >>>     * Macro to swap the bit-field postition and size.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] perf test: Add basic stat and topdown group test
  2022-05-05 18:19       ` Liang, Kan
@ 2022-05-05 18:35         ` Ian Rogers
  2022-05-05 20:27           ` Liang, Kan
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-05-05 18:35 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Zhengjun Xing, Stephane Eranian

On Thu, May 5, 2022 at 11:19 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> On 5/5/2022 11:22 AM, Ian Rogers wrote:
> > On Thu, May 5, 2022 at 5:12 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
> >>
> >> On 5/5/2022 12:38 AM, Ian Rogers wrote:
> >>> Add a basic stat test.
> >>> Add two tests of grouping behavior for topdown events. Topdown events
> >>> are special as they must be grouped with the slots event first.
> >>>
> >>> Signed-off-by: Ian Rogers <irogers@google.com>
> >>> ---
> >>>    tools/perf/tests/shell/stat.sh | 65 ++++++++++++++++++++++++++++++++++
> >>>    1 file changed, 65 insertions(+)
> >>>    create mode 100755 tools/perf/tests/shell/stat.sh
> >>>
> >>> diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh
> >>> new file mode 100755
> >>> index 000000000000..80869ea6debc
> >>> --- /dev/null
> >>> +++ b/tools/perf/tests/shell/stat.sh
> >>> @@ -0,0 +1,65 @@
> >>> +#!/bin/sh
> >>> +# perf stat tests
> >>> +# SPDX-License-Identifier: GPL-2.0
> >>> +
> >>> +set -e
> >>> +
> >>> +err=0
> >>> +test_default_stat() {
> >>> +  echo "Basic stat command test"
> >>> +  if ! perf stat true 2>&1 | egrep -q "Performance counter stats for 'true':"
> >>> +  then
> >>> +    echo "Basic stat command test [Failed]"
> >>> +    err=1
> >>> +    return
> >>> +  fi
> >>> +  echo "Basic stat command test [Success]"
> >>> +}
> >>> +
> >>> +test_topdown_groups() {
> >>> +  # Topdown events must be grouped with the slots event first. Test that
> >>> +  # parse-events reorders this.
> >>> +  echo "Topdown event group test"
> >>> +  if ! perf stat -e '{slots,topdown-retiring}' true > /dev/null 2>&1
> >>> +  then
> >>> +    echo "Topdown event group test [Skipped event parsing failed]"
> >>> +    return
> >>> +  fi
> >>> +  if perf stat -e '{slots,topdown-retiring}' true 2>&1 | egrep -q "<not supported>"
> >>> +  then
> >>> +    echo "Topdown event group test [Failed events not supported]"
> >>> +    err=1
> >>> +    return
> >>> +  fi
> >>> +  if perf stat -e '{topdown-retiring,slots}' true 2>&1 | egrep -q "<not supported>"
> >>> +  then
> >>> +    echo "Topdown event group test [Failed slots not reordered first]"
> >>> +    err=1
> >>> +    return
> >>> +  fi
> >>> +  echo "Topdown event group test [Success]"
> >>> +}
> >>> +
> >>> +test_topdown_weak_groups() {
> >>> +  # Weak groups break if the perf_event_open of multiple grouped events
> >>> +  # fails. Breaking a topdown group causes the events to fail. Test a very large
> >>> +  # grouping to see that the topdown events aren't broken out.
> >>> +  echo "Topdown weak groups test"
> >>> +  if ! perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring},branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE' true > /dev/null 2>&1
> >>> +  then
> >>> +    echo "Topdown weak groups test [Skipped event parsing failed]"
> >>> +    return
> >>> +  fi
> >>> +  if perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' true 2>&1 | egrep -q "<not supported>"
> >>> +  then
> >>> +    echo "Topdown weak groups test [Failed events not supported]"
> >>> +    err=1
> >>> +    return
> >>> +  fi
> >>> +  echo "Topdown weak groups test [Success]"
> >>> +}
> >>> +
> >>
>> Should we check the existence of the slots event before the test?
>> The perf metrics feature is only available on new platforms after
>> ICL. It doesn't work on Atom.
> >>
>> Also, I think the test may fail on the hybrid platform, since big core
>> and small core have different formulas for topdown. I think we should
>> avoid the test for the hybrid platform for now.
> >> +Zhengjun, who is fixing the topdown gap for the hybrid platform. I
> >> think he may take care of the hybrid support later.
> >
> > Thanks Kan, the test filters out systems that don't support the events
> > and silently skips the test. The main purpose of the test is to make
> > sure the somewhat complicated grouping operations for Icelake have
> > some coverage. Adding more coverage for hybrid would be great, but not
> > something I think gates this change.
> >
>
> Sure, we can add the coverage for hybrid later. But please make sure the
> test can filter out both the systems which don't support perf metrics
> and the hybrid systems.
>
> Thanks,
> Kan

If the test fails on hybrid then that feels like value add :-) We
genuinely have broken grouping functions. We could just add to the
test a skip if /sys/devices/cpu_core and /sys/devices/cpu_atom
directories exist (making assumptions on where sysfs is mounted). I'm
not yet able to test on Alderlake hence not wanting to have a lot of
untested code.
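
A sketch of that skip, as it might look in the stat.sh test; the
hard-coded /sys/devices path is exactly the assumption mentioned above,
and the DEVS override exists only so the logic can be exercised off-ADL:

```shell
# Hypothetical hybrid check: treat the platform as hybrid when both the
# cpu_core and cpu_atom PMU directories exist under sysfs.
is_hybrid() {
  devs="${DEVS:-/sys/devices}"
  [ -d "$devs/cpu_core" ] && [ -d "$devs/cpu_atom" ]
}

# Example use inside a test function:
if is_hybrid; then
  echo "Topdown weak groups test [Skipped hybrid platform]"
fi
```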

Thanks,
Ian

> > Thanks,
> > Ian
> >
> >> Thanks,
> >> Kan
> >>> +test_default_stat
> >>> +test_topdown_groups
> >>> +test_topdown_weak_groups
> >>> +exit $err

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-05 18:31       ` Ian Rogers
@ 2022-05-05 19:43         ` Liang, Kan
  2022-05-09 17:28           ` Ian Rogers
  0 siblings, 1 reply; 16+ messages in thread
From: Liang, Kan @ 2022-05-05 19:43 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Stephane Eranian, Zhengjun Xing



On 5/5/2022 2:31 PM, Ian Rogers wrote:
>>> So I think fixing all of these should be a follow up. I am working to
>>> get access to an Alderlake system, could we land this first?
>>>
>> I think we can use pmu_name to replace the "cpu" to fix the issue for
>> the hybrid platform. For a hybrid platform, the pmu_name is either
>> cpu_atom or cpu_core.
>>
>> Besides, the topdown events may have a PMU prefix, e.g.,
>> cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.
>>
>> How about the below patch?
>> If it's OK for you, could you please merge it into your V2 patch set?
>> I can do the test on a ADL system.
>>
>> diff --git a/tools/perf/arch/x86/util/evsel.c
>> b/tools/perf/arch/x86/util/evsel.c
>> index 40b171de2086..551ae2bab70e 100644
>> --- a/tools/perf/arch/x86/util/evsel.c
>> +++ b/tools/perf/arch/x86/util/evsel.c
>> @@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct
>> perf_event_attr *attr)
>>
>>    bool arch_evsel__must_be_in_group(const struct evsel *evsel)
>>    {
>> -       if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
>> -           !pmu_have_event("cpu", "slots"))
>> +       const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
>> +
>> +       if (!pmu_have_event(pmu_name, "slots"))
>>                  return false;
> Hmm. The idea with this test is to see if the architecture supports
> topdown events before going further. There's a similar test in all the
> arch_evlist functions. I think with cpu_core this needs to become:
>

The case is a little bit different here. For the arch_evlist functions, 
the input is the evlist, not a specific evsel. So we have to check all 
the possible PMU names, which are "cpu" and "cpu_core", before deciding 
whether to go further.

The input of the evsel__must_be_in_group() is the evsel. The PMU name is 
stored in the evsel->pmu_name. I don't think we need to check all the 
possible PMU names. Using evsel->pmu_name should be good enough.

> if (!pmu_have_event("cpu", "slots") && !pmu_have_event("cpu_core", "slots") )
> 
> But we should add a helper function for this. It is odd to have this
> change supporting Alderlake but the existing evlist work not. Perhaps
> we should just wait until Zhengjun's patches land.

Yes, a helper function is good for the arch_evlist functions. But I 
don't think this patch needs the helper function. Zhengjun's patches are 
to fix the other topdown issues on ADL. There is no dependency between 
this patch and Zhengjun's patches.

Thanks,
Kan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] perf test: Add basic stat and topdown group test
  2022-05-05 18:35         ` Ian Rogers
@ 2022-05-05 20:27           ` Liang, Kan
  0 siblings, 0 replies; 16+ messages in thread
From: Liang, Kan @ 2022-05-05 20:27 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Zhengjun Xing, Stephane Eranian



On 5/5/2022 2:35 PM, Ian Rogers wrote:
> On Thu, May 5, 2022 at 11:19 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>> On 5/5/2022 11:22 AM, Ian Rogers wrote:
>>> On Thu, May 5, 2022 at 5:12 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>>>
>>>> On 5/5/2022 12:38 AM, Ian Rogers wrote:
>>>>> Add a basic stat test.
>>>>> Add two tests of grouping behavior for topdown events. Topdown events
>>>>> are special as they must be grouped with the slots event first.
>>>>>
>>>>> Signed-off-by: Ian Rogers <irogers@google.com>
>>>>> ---
>>>>>     tools/perf/tests/shell/stat.sh | 65 ++++++++++++++++++++++++++++++++++
>>>>>     1 file changed, 65 insertions(+)
>>>>>     create mode 100755 tools/perf/tests/shell/stat.sh
>>>>>
>>>>> diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh
>>>>> new file mode 100755
>>>>> index 000000000000..80869ea6debc
>>>>> --- /dev/null
>>>>> +++ b/tools/perf/tests/shell/stat.sh
>>>>> @@ -0,0 +1,65 @@
>>>>> +#!/bin/sh
>>>>> +# perf stat tests
>>>>> +# SPDX-License-Identifier: GPL-2.0
>>>>> +
>>>>> +set -e
>>>>> +
>>>>> +err=0
>>>>> +test_default_stat() {
>>>>> +  echo "Basic stat command test"
>>>>> +  if ! perf stat true 2>&1 | egrep -q "Performance counter stats for 'true':"
>>>>> +  then
>>>>> +    echo "Basic stat command test [Failed]"
>>>>> +    err=1
>>>>> +    return
>>>>> +  fi
>>>>> +  echo "Basic stat command test [Success]"
>>>>> +}
>>>>> +
>>>>> +test_topdown_groups() {
>>>>> +  # Topdown events must be grouped with the slots event first. Test that
>>>>> +  # parse-events reorders this.
>>>>> +  echo "Topdown event group test"
>>>>> +  if ! perf stat -e '{slots,topdown-retiring}' true > /dev/null 2>&1
>>>>> +  then
>>>>> +    echo "Topdown event group test [Skipped event parsing failed]"
>>>>> +    return
>>>>> +  fi
>>>>> +  if perf stat -e '{slots,topdown-retiring}' true 2>&1 | egrep -q "<not supported>"
>>>>> +  then
>>>>> +    echo "Topdown event group test [Failed events not supported]"
>>>>> +    err=1
>>>>> +    return
>>>>> +  fi
>>>>> +  if perf stat -e '{topdown-retiring,slots}' true 2>&1 | egrep -q "<not supported>"
>>>>> +  then
>>>>> +    echo "Topdown event group test [Failed slots not reordered first]"
>>>>> +    err=1
>>>>> +    return
>>>>> +  fi
>>>>> +  echo "Topdown event group test [Success]"
>>>>> +}
>>>>> +
>>>>> +test_topdown_weak_groups() {
>>>>> +  # Weak groups break if the perf_event_open of multiple grouped events
>>>>> +  # fails. Breaking a topdown group causes the events to fail. Test a very large
>>>>> +  # grouping to see that the topdown events aren't broken out.
>>>>> +  echo "Topdown weak groups test"
>>>>> +  if ! perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring},branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE' true > /dev/null 2>&1
>>>>> +  then
>>>>> +    echo "Topdown weak groups test [Skipped event parsing failed]"
>>>>> +    return
>>>>> +  fi
>>>>> +  if perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' true 2>&1 | egrep -q "<not supported>"
>>>>> +  then
>>>>> +    echo "Topdown weak groups test [Failed events not supported]"
>>>>> +    err=1
>>>>> +    return
>>>>> +  fi
>>>>> +  echo "Topdown weak groups test [Success]"
>>>>> +}
>>>>> +
>>>>
>>>> Should we check the existence of the slots event before the test?
>>>> The perf metrics feature is only available on new platforms after
>>>> ICL. It doesn't work on Atom.
>>>>
>>>> Also, I think the test may fail on the hybrid platform, since big core
>>>> and small core have different formulas for topdown. I think we should
>>>> avoid the test for the hybrid platform for now.
>>>> +Zhengjun, who is fixing the topdown gap for the hybrid platform. I
>>>> think he may take care of the hybrid support later.
>>>
>>> Thanks Kan, the test filters out systems that don't support the events
>>> and silently skips the test. The main purpose of the test is to make
>>> sure the somewhat complicated grouping operations for Icelake have
>>> some coverage. Adding more coverage for hybrid would be great, but not
>>> something I think gates this change.
>>>
>>
>> Sure, we can add the coverage for hybrid later. But please make sure the
>> test can filter out both the systems which don't support perf metrics
>> and the hybrid systems.
>>
>> Thanks,
>> Kan
> 
> If the test fails on hybrid then that feels like value add :-) 

Indeed. The test case is valid.

Could you please use architectural events to replace 
baclears.any,ARITH.DIVIDER_ACTIVE?
It's not guaranteed that those two events will be supported on future 
platforms.

I used duplicate cache-misses,cache-references events to replace 
baclears.any,ARITH.DIVIDER_ACTIVE and tested on ADL. The results 
look good.

./perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,cache-misses,cache-references}:W' true
WARNING: events in group from different hybrid PMUs!

 Performance counter stats for 'true':

         3,603,798      cpu_core/slots/
           480,506      cpu_core/topdown-bad-spec/ #     13.3% bad speculation
     <not counted>      cpu_atom/topdown-bad-spec/                      (0.00%)
           833,819      cpu_core/topdown-be-bound/ #     23.0% backend bound
     <not counted>      cpu_atom/topdown-be-bound/                      (0.00%)
         1,483,916      cpu_core/topdown-fe-bound/ #     41.0% frontend bound
     <not counted>      cpu_atom/topdown-fe-bound/                      (0.00%)
           819,687      cpu_core/topdown-retiring/ #     22.7% retiring
     <not counted>      cpu_atom/topdown-retiring/                      (0.00%)
           151,010      cpu_core/branch-instructions/
     <not counted>      cpu_atom/branch-instructions/                   (0.00%)
             4,402      cpu_core/branch-misses/
     <not counted>      cpu_atom/branch-misses/                         (0.00%)
         1,456,308      cpu_core/bus-cycles/
     <not counted>      cpu_atom/bus-cycles/                            (0.00%)
             1,132      cpu_core/cache-misses/
     <not counted>      cpu_atom/cache-misses/                          (0.00%)
            14,226      cpu_core/cache-references/
     <not counted>      cpu_atom/cache-references/                      (0.00%)
           600,633      cpu_core/cpu-cycles/
     <not counted>      cpu_atom/cpu-cycles/                            (0.00%)
           737,806      cpu_core/instructions/
     <not counted>      cpu_atom/instructions/                          (0.00%)
                 0      cpu_core/mem-loads/
     <not counted>      cpu_atom/mem-loads/                             (0.00%)
           120,151      cpu_core/mem-stores/
     <not counted>      cpu_atom/mem-stores/                            (0.00%)
         1,456,308      cpu_core/ref-cycles/
     <not counted>      cpu_atom/ref-cycles/                            (0.00%)
     <not counted>      cpu_core/cache-misses/                          (0.00%)
     <not counted>      cpu_atom/cache-misses/                          (0.00%)
     <not counted>      cpu_core/cache-references/                      (0.00%)
     <not counted>      cpu_atom/cache-references/                      (0.00%)

        0.001518955 seconds time elapsed

        0.001531000 seconds user
        0.000000000 seconds sys

Thanks,
Kan
> We
> genuinely have broken grouping functions. We could just add to the
> test a skip if /sys/devices/cpu_core and /sys/devices/cpu_atom
> directories exist (making assumptions on where sysfs is mounted). I'm
> not yet able to test on Alderlake hence not wanting to have a lot of
> untested code.
> 
> Thanks,
> Ian
> 
>>> Thanks,
>>> Ian
>>>
>>>> Thanks,
>>>> Kan
>>>>> +test_default_stat
>>>>> +test_topdown_groups
>>>>> +test_topdown_weak_groups
>>>>> +exit $err

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-05 19:43         ` Liang, Kan
@ 2022-05-09 17:28           ` Ian Rogers
  2022-05-09 21:01             ` Liang, Kan
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-05-09 17:28 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Stephane Eranian, Zhengjun Xing

On Thu, May 5, 2022 at 12:44 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 5/5/2022 2:31 PM, Ian Rogers wrote:
> >>> So I think fixing all of these should be a follow up. I am working to
> >>> get access to an Alderlake system, could we land this first?
> >>>
> >> I think we can use pmu_name to replace the "cpu" to fix the issue for
> >> the hybrid platform. For a hybrid platform, the pmu_name is either
> >> cpu_atom or cpu_core.
> >>
> >> Besides, the topdown events may have a PMU prefix, e.g.,
> >> cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.
> >>
> >> How about the below patch?
> >> If it's OK for you, could you please merge it into your V2 patch set?
> >> I can do the test on a ADL system.
> >>
> >> diff --git a/tools/perf/arch/x86/util/evsel.c
> >> b/tools/perf/arch/x86/util/evsel.c
> >> index 40b171de2086..551ae2bab70e 100644
> >> --- a/tools/perf/arch/x86/util/evsel.c
> >> +++ b/tools/perf/arch/x86/util/evsel.c
> >> @@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct
> >> perf_event_attr *attr)
> >>
> >>    bool arch_evsel__must_be_in_group(const struct evsel *evsel)
> >>    {
> >> -       if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
> >> -           !pmu_have_event("cpu", "slots"))
> >> +       const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
> >> +
> >> +       if (!pmu_have_event(pmu_name, "slots"))
> >>                  return false;
> > Hmm. The idea with this test is to see if the architecture supports
> > topdown events before going further. There's a similar test in all the
> > arch_evlist functions. I think with cpu_core this needs to become:
> >
>
> The case is a little bit different here. For the arch_evlist functions,
> the input is the evlist, not the specific evsel. So we have to check all
> the possible PMU names which are "cpu" and "cpu_core". Then we decide
> whether going further.
>
> The input of the evsel__must_be_in_group() is the evsel. The PMU name is
> stored in the evsel->pmu_name. I don't think we need to check all the
> possible PMU names. Using evsel->pmu_name should be good enough.
>
> > if (!pmu_have_event("cpu", "slots") && !pmu_have_event("cpu_core", "slots") )
> >
> > But we should add a helper function for this. It is odd to have this
> > change supporting Alderlake but the existing evlist work not. Perhaps
> > we should just wait until Zhengjun's patches land.
>
> Yes, a helper function is good for the arch_evlist functions. But I
> don't think this patch needs the helper function. Zhengjun's patches are
> to fix the other topdown issues on ADL. There is no dependency between
> this patch and zhengjun's patches.
>
> Thanks,
> Kan

TL;DR I think we can move forward with landing these patches to fix Icelake.

For Alderlake/hybrid we have a problem. To determine what happens with
grouping we need to know whether the CPU has topdown events. This is a
runtime question for doing perf_event_open and so an arch test and
weak symbol are appropriate. For Icelake we are determining the
presence of topdown events by looking at the special PMU cpu. For
Alderlake the same information can be found by looking at the PMUs
cpu_core and cpu_atom, but how do we discover those PMU names? It is
already somewhat concerning that we've hard coded "cpu" and we don't
want to have an ever-growing list of PMU names.

We have similarly hard coded "cpu" in the topology code here:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/cputopo.c?h=tmp.perf/core#n18
Is this unreasonable given cpu is already supposed to be ABI stable:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/Documentation/ABI/stable/sysfs-devices-system-cpu?h=tmp.perf/core

It is hard to say what the right hybrid fix is here. I should get a
system I can poke shortly. I'd also like to compare what's in sysfs
for Alderlake with ARM's big.LITTLE approach. I can imagine we need a
function that returns a list of CPU-like PMUs for probing. Ideally we
could work this out from sysfs and use some stable ABI.
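
One possible shape for such a probe, sketched in shell: enumerate
event_source devices whose name is "cpu" or starts with "cpu_". That
naming pattern holds for current x86 hybrid (cpu_core/cpu_atom) but is
an assumption rather than a documented ABI, which is exactly the concern
above; the function name and SYSFS_ROOT override are illustrative:

```shell
# Hypothetical enumeration of CPU-like PMUs from sysfs.
# SYSFS_ROOT defaults to /sys and is overridable for testing.
list_core_pmus() {
  root="${SYSFS_ROOT:-/sys}"
  for d in "$root"/bus/event_source/devices/cpu \
           "$root"/bus/event_source/devices/cpu_*; do
    if [ -d "$d" ]; then
      basename "$d"
    fi
  done
}
```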

Thanks,
Ian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-09 17:28           ` Ian Rogers
@ 2022-05-09 21:01             ` Liang, Kan
  2022-05-10 16:58               ` Ian Rogers
  0 siblings, 1 reply; 16+ messages in thread
From: Liang, Kan @ 2022-05-09 21:01 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Stephane Eranian, Zhengjun Xing



On 5/9/2022 1:28 PM, Ian Rogers wrote:
> On Thu, May 5, 2022 at 12:44 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 5/5/2022 2:31 PM, Ian Rogers wrote:
>>>>> So I think fixing all of these should be a follow up. I am working to
>>>>> get access to an Alderlake system, could we land this first?
>>>>>
>>>> I think we can use pmu_name to replace the "cpu" to fix the issue for
>>>> the hybrid platform. For a hybrid platform, the pmu_name is either
>>>> cpu_atom or cpu_core.
>>>>
>>>> Besides, the topdown events may have a PMU prefix, e.g.,
>>>> cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.
>>>>
>>>> How about the below patch?
>>>> If it's OK for you, could you please merge it into your V2 patch set?
>>>> I can do the test on a ADL system.
>>>>
>>>> diff --git a/tools/perf/arch/x86/util/evsel.c
>>>> b/tools/perf/arch/x86/util/evsel.c
>>>> index 40b171de2086..551ae2bab70e 100644
>>>> --- a/tools/perf/arch/x86/util/evsel.c
>>>> +++ b/tools/perf/arch/x86/util/evsel.c
>>>> @@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct
>>>> perf_event_attr *attr)
>>>>
>>>>     bool arch_evsel__must_be_in_group(const struct evsel *evsel)
>>>>     {
>>>> -       if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
>>>> -           !pmu_have_event("cpu", "slots"))
>>>> +       const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
>>>> +
>>>> +       if (!pmu_have_event(pmu_name, "slots"))
>>>>                   return false;
>>> Hmm. The idea with this test is to see if the architecture supports
>>> topdown events before going further. There's a similar test in all the
>>> arch_evlist functions. I think with cpu_core this needs to become:
>>>
>>
>> The case is a little bit different here. For the arch_evlist functions,
>> the input is the evlist, not the specific evsel. So we have to check all
>> the possible PMU names which are "cpu" and "cpu_core". Then we decide
>> whether going further.
>>
>> The input of the evsel__must_be_in_group() is the evsel. The PMU name is
>> stored in the evsel->pmu_name. I don't think we need to check all the
>> possible PMU names. Using evsel->pmu_name should be good enough.
>>
>>> if (!pmu_have_event("cpu", "slots") && !pmu_have_event("cpu_core", "slots") )
>>>
>>> But we should add a helper function for this. It is odd to have this
>>> change supporting Alderlake but the existing evlist work not. Perhaps
>>> we should just wait until Zhengjun's patches land.
>>
>> Yes, a helper function is good for the arch_evlist functions. But I
>> don't think this patch needs the helper function. Zhengjun's patches are
>> to fix the other topdown issues on ADL. There is no dependency between
>> this patch and zhengjun's patches.
>>
>> Thanks,
>> Kan
> 
> TL;DR I think we can move forward with landing these patches to fix Icelake.

This patch certainly doesn't work on the hybrid platform. I can send 
you a fix for the hybrid part if you prefer it that way; then I guess you 
may append it as patch 3 in V2.

Besides the hybrid thing, the patch set also has two other issues I 
mentioned in the previous reply.
- I don't think strcasecmp() can handle cases like 
cpu/topdown-bad-spec/ or cpu/slots/. That should be an issue for both 
hybrid and non-hybrid platforms.
- It's better not to use non-architectural events, e.g., baclears.any 
or ARITH.DIVIDER_ACTIVE, even in the test case. Non-architectural events 
may disappear on future platforms. If so, you would have to update the 
test case again for those platforms.
IMHO, the patch set is not ready.

> 
> For Alderlake/hybrid we have a problem. To determine what happens with
> grouping we need to know does the CPU have topdown events? This is a
> runtime question for doing perf_event_open and so an arch test and
> weak symbol are appropriate. For Icelake we are determining the
> presence of topdown events by looking at the special PMU cpu. For
> Alderlake the same information can be found by looking at the PMUs
> cpu_core and cpu_atom, but how to discover those PMU names?

The PMU name can be retrieved either from the event list or the perf command.
For non-hybrid platforms, the PMU name is hard-coded to "cpu" for core 
events, so users/event files don't need to specify the PMU name.
For a hybrid platform, a PMU name is required and stored in 
evsel->pmu_name. If evsel->pmu_name is NULL, we can assume that it's 
the non-hybrid core PMU, "cpu".

> It is
> already somewhat concerning that we've hard coded "cpu" and we don't
> want to have an ever growing list of PMU names.
> 
> We have similarly hard coded "cpu" in the topology code here:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/cputopo.c?h=tmp.perf/core#n18
> Is this unreasonable given cpu is already supposed to be ABI stable:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/Documentation/ABI/stable/sysfs-devices-system-cpu?h=tmp.perf/core

I don't think there is a stable ABI for a PMU name.
The PMU name may change from generation to generation because of 
differences in the micro-architecture. We will try our best to keep it 
unchanged on x86, but it's not guaranteed, especially now that hybrid 
has been introduced.


> 
> It is hard to say what the right hybrid fix is here. I should get a
> system I can poke shortly. I'd also like to compare what's in sysfs
> for Alderlake with ARM's big.little approach. I can imagine we need a
> function that returns a list of CPU like PMUs for probing. Ideally we
> could work this out from sysfs and use some stable ABI.
>

I don't think there is a standard PMU naming rule across all architectures. 
For x86 it may be possible: you can assume that names like "cpu" or 
"cpu_*" are for core PMUs. But other architectures, e.g. ARM, AFAIK 
use quite different naming rules.


Thanks,
Kan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-09 21:01             ` Liang, Kan
@ 2022-05-10 16:58               ` Ian Rogers
  2022-05-10 19:24                 ` Liang, Kan
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-05-10 16:58 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Stephane Eranian, Zhengjun Xing

On Mon, May 9, 2022 at 2:01 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> On 5/9/2022 1:28 PM, Ian Rogers wrote:
> > On Thu, May 5, 2022 at 12:44 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
> >>
> >>
> >>
> >> On 5/5/2022 2:31 PM, Ian Rogers wrote:
> >>>>> So I think fixing all of these should be a follow up. I am working to
> >>>>> get access to an Alderlake system, could we land this first?
> >>>>>
> >>>> I think we can use pmu_name to replace the "cpu" to fix the issue for
> >>>> the hybrid platform. For a hybrid platform, the pmu_name is either
> >>>> cpu_atom or cpu_core.
> >>>>
> >>>> Besides, the topdown events may have a PMU prefix, e.g.,
> >>>> cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.
> >>>>
> >>>> How about the below patch?
> >>>> If it's OK for you, could you please merge it into your V2 patch set?
> >>>> I can do the test on a ADL system.
> >>>>
> >>>> diff --git a/tools/perf/arch/x86/util/evsel.c
> >>>> b/tools/perf/arch/x86/util/evsel.c
> >>>> index 40b171de2086..551ae2bab70e 100644
> >>>> --- a/tools/perf/arch/x86/util/evsel.c
> >>>> +++ b/tools/perf/arch/x86/util/evsel.c
> >>>> @@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct
> >>>> perf_event_attr *attr)
> >>>>
> >>>>     bool arch_evsel__must_be_in_group(const struct evsel *evsel)
> >>>>     {
> >>>> -       if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
> >>>> -           !pmu_have_event("cpu", "slots"))
> >>>> +       const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
> >>>> +
> >>>> +       if (!pmu_have_event(pmu_name, "slots"))
> >>>>                   return false;
> >>> Hmm. The idea with this test is to see if the architecture supports
> >>> topdown events before going further. There's a similar test in all the
> >>> arch_evlist functions. I think with cpu_core this needs to become:
> >>>
> >>
> >> The case is a little bit different here. For the arch_evlist functions,
> >> the input is the evlist, not the specific evsel. So we have to check all
> >> the possible PMU names which are "cpu" and "cpu_core". Then we decide
> >> whether going further.
> >>
> >> The input of the evsel__must_be_in_group() is the evsel. The PMU name is
> >> stored in the evsel->pmu_name. I don't think we need to check all the
> >> possible PMU names. Using evsel->pmu_name should be good enough.
> >>
> >>> if (!pmu_have_event("cpu", "slots") && !pmu_have_event("cpu_core", "slots") )
> >>>
> >>> But we should add a helper function for this. It is odd to have this
> >>> change supporting Alderlake but the existing evlist work not. Perhaps
> >>> we should just wait until Zhengjun's patches land.
> >>
> >> Yes, a helper function is good for the arch_evlist functions. But I
> >> don't think this patch needs the helper function. Zhengjun's patches are
> >> to fix the other topdown issues on ADL. There is no dependency between
> >> this patch and zhengjun's patches.
> >>
> >> Thanks,
> >> Kan
> >
> > TL;DR I think we can move forward with landing these patches to fix Icelake.
>
> This patch doesn't work with the hybrid platform for sure. I can send
> you a fix for the hybrid part if you prefer this way? Then I guess you
> may append it as the patch 3 for V2.
>
> Besides the hybrid thing, the patch set also has other two issues I
> mentioned in the previous reply.
> - I don't think the strcasecmp() can handle the case like
> cpu/topdown-bad-spec/ or cpu/slots/. It should be an issue for both
> hybrid and non-hybrid platforms.
> - It's better not to use non-architecture events, e.g., baclears.any,
> ARITH.DIVIDER_ACTIVE, even in the test case. The non-architecture events
> may be disappear in the future platforms. If so, you have to update the
> test case again for the future platforms.
> IMHO, I don't think the patch set is ready.

So all the stated objections are that I'm checking cpu/slots/ as an
indication of topdown support and this doesn't work for hybrid? This
is identical to the arch evlist code:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/arch/x86/util/evlist.c?h=perf/core#n10
in both the functions arch_evlist__add_default_attrs and
arch_evlist__leader (one I wrote and one I didn't). So the patch set
isn't ready because I haven't fixed Alderlake, but Alderlake already
isn't working? And there is no example of how to make this work for
Alderlake. So basically your ask is that I bring up Alderlake. I think
that is stretching things for a patch fixing Icelake. The value here
is in fixing Icelake; Alderlake will have to be the next problem.

> >
> > For Alderlake/hybrid we have a problem. To determine what happens with
> > grouping we need to know does the CPU have topdown events? This is a
> > runtime question for doing perf_event_open and so an arch test and
> > weak symbol are appropriate. For Icelake we are determining the
> > presence of topdown events by looking at the special PMU cpu. For
> > Alderlake the same information can be found by looking at the PMUs
> > cpu_core and cpu_atom, but how to discover those PMU names?
>
> The PMU name can be retrieved either from the event list or perf command.
> For the non-hybrid, the PMU name is hard code to "cpu" for the core
> events. So users/event files don't need to specify the PMU name.
> For the hybrid platform, a PMU name is required and stored in the
> evsel->pmu_name. If the evsel->pmu_name is NULL, we can assume that it's
> a non-hybrid PMU, CPU.

This doesn't make sense. Hybrid implies more than one CPU type; how can
more than one type be treated as a single type to be used for the PMU?
Again, you are asking that I make all the Alderlake logic work, and I
think that should be follow-up. As shown above, the arch evlist code
also needs fixing as follow-up.

Thanks,
Ian

> > It is
> > already somewhat concerning that we've hard coded "cpu" and we don't
> > want to have an ever growing list of PMU names.
> >
> > We have similarly hard coded "cpu" in the topology code here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/cputopo.c?h=tmp.perf/core#n18
> > Is this unreasonable given cpu is already supposed to be ABI stable:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/Documentation/ABI/stable/sysfs-devices-system-cpu?h=tmp.perf/core
>
> I don't think there is a stable ABI for a PMU name.
> The PMU name may be changed generation by generation because of the
> different micro arch. We will try our best to keep it unchanged in X86
> but it's not guaranteed especially when the hybrid is introduced.
>
>
> >
> > It is hard to say what the right hybrid fix is here. I should get a
> > system I can poke shortly. I'd also like to compare what's in sysfs
> > for Alderlake with ARM's big.little approach. I can imagine we need a
> > function that returns a list of CPU like PMUs for probing. Ideally we
> > could work this out from sysfs and use some stable ABI.
> >
>
> I don't think there is a standard PMU naming rule for all the ARCHs. For
> X86, it may be possible. You can assume that the name like "cpu" or
> "cpu_*" are for core PMUs. But for other ARCH e.g., ARM, AFAIK, they use
> a quite different naming rule.
>
>
> Thanks,
> Kan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group
  2022-05-10 16:58               ` Ian Rogers
@ 2022-05-10 19:24                 ` Liang, Kan
  0 siblings, 0 replies; 16+ messages in thread
From: Liang, Kan @ 2022-05-10 19:24 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Caleb Biggers, Perry Taylor, Kshipra Bopardikar, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ravi Bangoria,
	Andi Kleen, Haowen Bai, Riccardo Mancini, Kim Phillips,
	Madhavan Srinivasan, Shunsuke Nakamura, Florian Fischer,
	linux-kernel, linux-perf-users, Stephane Eranian, Zhengjun Xing



On 5/10/2022 12:58 PM, Ian Rogers wrote:
> On Mon, May 9, 2022 at 2:01 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>> On 5/9/2022 1:28 PM, Ian Rogers wrote:
>>> On Thu, May 5, 2022 at 12:44 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>>>
>>>>
>>>>
>>>> On 5/5/2022 2:31 PM, Ian Rogers wrote:
>>>>>>> So I think fixing all of these should be a follow up. I am working to
>>>>>>> get access to an Alderlake system, could we land this first?
>>>>>>>
>>>>>> I think we can use pmu_name to replace the "cpu" to fix the issue for
>>>>>> the hybrid platform. For a hybrid platform, the pmu_name is either
>>>>>> cpu_atom or cpu_core.
>>>>>>
>>>>>> Besides, the topdown events may have a PMU prefix, e.g.,
>>>>>> cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.
>>>>>>
>>>>>> How about the below patch?
>>>>>> If it's OK for you, could you please merge it into your V2 patch set?
>>>>>> I can do the test on a ADL system.
>>>>>>
>>>>>> diff --git a/tools/perf/arch/x86/util/evsel.c
>>>>>> b/tools/perf/arch/x86/util/evsel.c
>>>>>> index 40b171de2086..551ae2bab70e 100644
>>>>>> --- a/tools/perf/arch/x86/util/evsel.c
>>>>>> +++ b/tools/perf/arch/x86/util/evsel.c
>>>>>> @@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct
>>>>>> perf_event_attr *attr)
>>>>>>
>>>>>>      bool arch_evsel__must_be_in_group(const struct evsel *evsel)
>>>>>>      {
>>>>>> -       if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
>>>>>> -           !pmu_have_event("cpu", "slots"))
>>>>>> +       const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
>>>>>> +
>>>>>> +       if (!pmu_have_event(pmu_name, "slots"))
>>>>>>                    return false;
>>>>> Hmm. The idea with this test is to see if the architecture supports
>>>>> topdown events before going further. There's a similar test in all the
>>>>> arch_evlist functions. I think with cpu_core this needs to become:
>>>>>
>>>>
>>>> The case is a little bit different here. For the arch_evlist functions,
>>>> the input is the evlist, not the specific evsel. So we have to check all
>>>> the possible PMU names which are "cpu" and "cpu_core". Then we decide
>>>> whether going further.
>>>>
>>>> The input of the evsel__must_be_in_group() is the evsel. The PMU name is
>>>> stored in the evsel->pmu_name. I don't think we need to check all the
>>>> possible PMU names. Using evsel->pmu_name should be good enough.
>>>>
>>>>> if (!pmu_have_event("cpu", "slots") && !pmu_have_event("cpu_core", "slots") )
>>>>>
>>>>> But we should add a helper function for this. It is odd to have this
>>>>> change supporting Alderlake but the existing evlist work not. Perhaps
>>>>> we should just wait until Zhengjun's patches land.
>>>>
>>>> Yes, a helper function is good for the arch_evlist functions. But I
>>>> don't think this patch needs the helper function. Zhengjun's patches are
>>>> to fix the other topdown issues on ADL. There is no dependency between
>>>> this patch and zhengjun's patches.
>>>>
>>>> Thanks,
>>>> Kan
>>>
>>> TL;DR I think we can move forward with landing these patches to fix Icelake.
>>
>> This patch doesn't work with the hybrid platform for sure. I can send
>> you a fix for the hybrid part if you prefer this way? Then I guess you
>> may append it as the patch 3 for V2.
>>
>> Besides the hybrid thing, the patch set also has other two issues I
>> mentioned in the previous reply.
>> - I don't think the strcasecmp() can handle the case like
>> cpu/topdown-bad-spec/ or cpu/slots/. It should be an issue for both
>> hybrid and non-hybrid platforms.
>> - It's better not to use non-architecture events, e.g., baclears.any,
>> ARITH.DIVIDER_ACTIVE, even in the test case. The non-architecture events
>> may be disappear in the future platforms. If so, you have to update the
>> test case again for the future platforms.
>> IMHO, I don't think the patch set is ready.
> 
> So all the stated objections are that I'm checking cpu/slots/ for an
> indication of topdown support and this doesn't work for hybrid?

No, besides the hybrid issue, I also pointed out two non-hybrid issues 
for the patch set.

Have you tried this with the patch set on ICX?

./perf stat -e 
'{cpu/slots/,cpu/topdown-bad-spec/,cpu/topdown-be-bound/,cpu/topdown-fe-bound/,cpu/topdown-retiring/,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,cache-misses,cache-references}:W' 
-a sleep 1

I don't think it works.


ARITH.DIVIDER_ACTIVE is already a deprecated event on SPR.
I don't think we want to update the test case for every platform after SPR.


> This
> is identical to the arch evlist code:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/arch/x86/util/evlist.c?h=perf/core#n10
> both in the functions arch_evlist__add_default_attrs and
> arch_evlist__leader (one I wrote and one I didn't). So the patch set
> isn't ready because I haven't fixed alderlake, but alderlake already
> isn't working? And there is no example of how to make this work for
> alderlake. So basically your ask is that I bring up alderlake. I think
> this is stretching things for a patch fixing icelake. The value here
> is in fixing icelake and alderlake will have to be the next problem.
> 
>>>
>>> For Alderlake/hybrid we have a problem. To determine what happens with
>>> grouping we need to know does the CPU have topdown events? This is a
>>> runtime question for doing perf_event_open and so an arch test and
>>> weak symbol are appropriate. For Icelake we are determining the
>>> presence of topdown events by looking at the special PMU cpu. For
>>> Alderlake the same information can be found by looking at the PMUs
>>> cpu_core and cpu_atom, but how to discover those PMU names?
>>
>> The PMU name can be retrieved either from the event list or perf command.
>> For the non-hybrid, the PMU name is hard code to "cpu" for the core
>> events. So users/event files don't need to specify the PMU name.
>> For the hybrid platform, a PMU name is required and stored in the
>> evsel->pmu_name. If the evsel->pmu_name is NULL, we can assume that it's
>> a non-hybrid PMU, CPU.
> 
> This doesn't make sense. Hybrid implies more than 1 CPU type, how can
> more than one be the same as 1 type to be used for the PMU? 

One event can only belong to one PMU; the name is saved in evsel->pmu_name.
But it may be NULL on a non-hybrid machine, because there is only one 
CPU type on a non-hybrid machine, and the old code may not set the variable.

> Again, you
> are asking I make all the alderlake logic work and I think that should
> be follow up.

I didn't ask you to make all the Alderlake logic work.

What I asked is that you not introduce more problems of this kind, 
because we already have enough to fix.

I even offered you a patch fixing the problem.

https://lore.kernel.org/all/f1c212e3-c1fd-4a2d-0dfe-bc913d4f4f36@linux.intel.com/T/#m22465c6d35be59fc853a7d86f5bd6025ce7e539b


> As shown above the arch evlist code also needs fixing as
> follow up.

Zhengjun is working on them. There is no dependency between this patch 
and that fix. Don't worry about it.


Thanks,
Kan

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-05-10 19:24 UTC | newest]

Thread overview: 16+ messages
-- links below jump to the message on this page --
2022-05-05  4:38 [PATCH 1/2] perf evlist: Keep topdown counters in weak group Ian Rogers
2022-05-05  4:38 ` [PATCH 2/2] perf test: Add basic stat and topdown group test Ian Rogers
2022-05-05 12:12   ` Liang, Kan
2022-05-05 15:22     ` Ian Rogers
2022-05-05 18:19       ` Liang, Kan
2022-05-05 18:35         ` Ian Rogers
2022-05-05 20:27           ` Liang, Kan
2022-05-05 11:56 ` [PATCH 1/2] perf evlist: Keep topdown counters in weak group Liang, Kan
2022-05-05 15:18   ` Ian Rogers
2022-05-05 18:15     ` Liang, Kan
2022-05-05 18:31       ` Ian Rogers
2022-05-05 19:43         ` Liang, Kan
2022-05-09 17:28           ` Ian Rogers
2022-05-09 21:01             ` Liang, Kan
2022-05-10 16:58               ` Ian Rogers
2022-05-10 19:24                 ` Liang, Kan
