linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] perf tool: Enable cpu list for hybrid
@ 2021-07-12  7:12 Jin Yao
  2021-07-12  7:12 ` [PATCH v3 1/3] libperf: Add perf_cpu_map__default_new() Jin Yao
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jin Yao @ 2021-07-12  7:12 UTC
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

perf-record and perf-stat already support the '-C/--cpus' option to
count or collect only on the given list of CPUs. This option needs to
be supported on hybrid platforms as well.

v3:
---
Rebase to latest perf/core branch.

v2:
---
Automatically map to hybrid pmu.

For example,

If cpu0-7 are 'cpu_core' and cpu8-11 are 'cpu_atom',

  # perf stat -e cycles -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           1,914,704      cpu_core/cycles/
           2,036,983      cpu_atom/cycles/

         1.005815641 seconds time elapsed

perf automatically selects cpu0 for cpu_core/cycles/ and cpu11 for
cpu_atom/cycles/, and prints a warning for each.

Jin Yao (3):
  libperf: Add perf_cpu_map__default_new()
  perf tools: Create hybrid flag in target
  perf tools: Enable on a list of CPUs for hybrid

 tools/lib/perf/cpumap.c              |  5 +++
 tools/lib/perf/include/perf/cpumap.h |  1 +
 tools/perf/builtin-record.c          |  7 +++
 tools/perf/builtin-stat.c            |  6 +++
 tools/perf/util/evlist-hybrid.c      | 65 ++++++++++++++++++++++++++++
 tools/perf/util/evlist-hybrid.h      |  1 +
 tools/perf/util/evlist.c             |  3 +-
 tools/perf/util/pmu.c                | 35 +++++++++++++++
 tools/perf/util/pmu.h                |  4 ++
 tools/perf/util/target.h             |  1 +
 10 files changed, 127 insertions(+), 1 deletion(-)

-- 
2.17.1



* [PATCH v3 1/3] libperf: Add perf_cpu_map__default_new()
  2021-07-12  7:12 [PATCH v3 0/3] perf tool: Enable cpu list for hybrid Jin Yao
@ 2021-07-12  7:12 ` Jin Yao
  2021-07-12  7:12 ` [PATCH v3 2/3] perf tools: Create hybrid flag in target Jin Yao
  2021-07-12  7:12 ` [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid Jin Yao
  2 siblings, 0 replies; 10+ messages in thread
From: Jin Yao @ 2021-07-12  7:12 UTC
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

libperf already has a static function, 'cpu_map__default_new()'.
Add a new API, perf_cpu_map__default_new(), to export it.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
v3:
 - No change.

v2:
 - New in v2.

 tools/lib/perf/cpumap.c              | 5 +++++
 tools/lib/perf/include/perf/cpumap.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index ca0215047c32..51b6553912e0 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -68,6 +68,11 @@ static struct perf_cpu_map *cpu_map__default_new(void)
 	return cpus;
 }
 
+struct perf_cpu_map *perf_cpu_map__default_new(void)
+{
+	return cpu_map__default_new();
+}
+
 static int cmp_int(const void *a, const void *b)
 {
 	return *(const int *)a - *(const int*)b;
diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
index 6a17ad730cbc..7c27766ea0bf 100644
--- a/tools/lib/perf/include/perf/cpumap.h
+++ b/tools/lib/perf/include/perf/cpumap.h
@@ -9,6 +9,7 @@
 struct perf_cpu_map;
 
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__dummy_new(void);
+LIBPERF_API struct perf_cpu_map *perf_cpu_map__default_new(void);
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list);
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file);
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map);
-- 
2.17.1



* [PATCH v3 2/3] perf tools: Create hybrid flag in target
  2021-07-12  7:12 [PATCH v3 0/3] perf tool: Enable cpu list for hybrid Jin Yao
  2021-07-12  7:12 ` [PATCH v3 1/3] libperf: Add perf_cpu_map__default_new() Jin Yao
@ 2021-07-12  7:12 ` Jin Yao
  2021-07-12  7:12 ` [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid Jin Yao
  2 siblings, 0 replies; 10+ messages in thread
From: Jin Yao @ 2021-07-12  7:12 UTC
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Users may count or collect only on a list of CPUs via the '-C/--cpus'
option. Previously, the cpus for an evsel were retrieved from the pmu's
sysfs entry. But if a target cpu list is defined, the retrieved cpus
are not kept and the target cpu list is used instead.

On a hybrid system, however, we can't use the target cpu list directly,
because its CPUs may not be available on the hybrid pmu (e.g. cpu_core
or cpu_atom). So we should not set the 'has_user_cpus' flag on a hybrid
system.

The difficulty is that we can't call perf_pmu__has_hybrid() in evlist.c
to check for a hybrid system, otherwise 'perf test python' would fail
(undefined symbol for perf_pmu__has_hybrid). If we added pmu.c to
python-ext-sources, there would be too many symbol dependencies to
resolve.

As an alternative, add a new 'hybrid' flag to the target and use it to
check for a hybrid system.
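
As a rough illustration only (not part of the patch), the decision made
in evlist__create_maps() can be sketched standalone. 'struct
target_sketch' and has_user_cpus() are hypothetical names that stand in
for perf's 'struct target' and the assignment to
evlist->core.has_user_cpus; the caller is assumed to have filled in
'hybrid' beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for perf's struct target. */
struct target_sketch {
	const char *cpu_list;	/* argument of -C/--cpus, or NULL */
	bool hybrid;		/* set by the caller on hybrid systems */
};

/*
 * The user cpu list overrides the per-event cpus only when a list was
 * given and the system is not hybrid.
 */
static bool has_user_cpus(const struct target_sketch *target)
{
	assert(target != NULL);
	return target->cpu_list != NULL && !target->hybrid;
}
```

With a cpu list on a non-hybrid system this yields true; on a hybrid
system it yields false, so the cpus kept on each evsel stay in effect.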

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
v3:
 - No change.

v2:
 - New in v2.

 tools/perf/util/evlist.c | 2 +-
 tools/perf/util/target.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 47581a237c7a..06f8890816c3 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1002,7 +1002,7 @@ int evlist__create_maps(struct evlist *evlist, struct target *target)
 	if (!cpus)
 		goto out_delete_threads;
 
-	evlist->core.has_user_cpus = !!target->cpu_list;
+	evlist->core.has_user_cpus = !!target->cpu_list && !target->hybrid;
 
 	perf_evlist__set_maps(&evlist->core, cpus, threads);
 
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 4ff56217f2a6..daec6cba500d 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -17,6 +17,7 @@ struct target {
 	bool	     default_per_cpu;
 	bool	     per_thread;
 	bool	     use_bpf;
+	bool	     hybrid;
 	const char   *attr_map;
 };
 
-- 
2.17.1



* [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid
  2021-07-12  7:12 [PATCH v3 0/3] perf tool: Enable cpu list for hybrid Jin Yao
  2021-07-12  7:12 ` [PATCH v3 1/3] libperf: Add perf_cpu_map__default_new() Jin Yao
  2021-07-12  7:12 ` [PATCH v3 2/3] perf tools: Create hybrid flag in target Jin Yao
@ 2021-07-12  7:12 ` Jin Yao
  2021-07-19 19:36   ` Jiri Olsa
  2 siblings, 1 reply; 10+ messages in thread
From: Jin Yao @ 2021-07-12  7:12 UTC
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

perf-record and perf-stat already support the '-C/--cpus' option to
count or collect only on the given list of CPUs. This option needs to
be supported on hybrid platforms as well.

For hybrid support, perf needs to check that the CPUs are available on
the hybrid PMU. For example, on Alder Lake, cpu0-7 are 'cpu_core' and
cpu8-11 are 'cpu_atom'.

Before:

  # perf stat -e cpu_core/cycles/ -C11 -- sleep 1

   Performance counter stats for 'CPU(s) 11':

     <not supported>      cpu_core/cycles/

         1.006179431 seconds time elapsed

perf-stat silently returned "<not supported>" without any helpful
information. It should instead report that cpu11 is not a 'cpu_core' CPU.

After:

  # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
  WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
  failed to use cpu list 11

We also need to support events specified without a pmu prefix.

  # perf stat -e cycles -C11 -- sleep 1
  WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)

   Performance counter stats for 'CPU(s) 11':

           1,067,373      cpu_atom/cycles/

         1.005544738 seconds time elapsed

The perf tool automatically creates two cycles events, cpu_core/cycles/ and
cpu_atom/cycles/. It finds that cpu11 is not a 'cpu_core' CPU, so it shows a
warning for cpu_core/cycles/ and only counts cpu_atom/cycles/.
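
To illustrate why the first command above fails while the second only
warns, here is a minimal sketch of the final return-value check in
evlist__use_cpu_list(). The helper name cpu_list_result() is made up
for illustration; the expression itself mirrors the one in this patch.

```c
#include <assert.h>

/*
 * events_nr:       hybrid events that were checked against the cpu list
 * unmatched_count: events dropped because none of the listed CPUs
 *                  belongs to their PMU
 *
 * Returns -1 when every checked event was dropped, 0 otherwise.
 * Read literally, this also yields -1 when no hybrid event was
 * checked at all (0 == 0).
 */
static int cpu_list_result(int events_nr, int unmatched_count)
{
	assert(events_nr >= 0 && unmatched_count >= 0);
	assert(unmatched_count <= events_nr);
	return (unmatched_count == events_nr) ? -1 : 0;
}
```

For '-e cpu_core/cycles/ -C11' one event is checked and one is dropped,
so the result is -1 and the tool errors out. For '-e cycles -C11' two
events are checked and only cpu_core/cycles/ is dropped, so the result
is 0 and counting continues with cpu_atom/cycles/.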

If some of the cpus are 'cpu_core' and some are 'cpu_atom', for example:

  # perf stat -e cycles -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           1,914,704      cpu_core/cycles/
           2,036,983      cpu_atom/cycles/

         1.005815641 seconds time elapsed

perf now automatically selects cpu0 for cpu_core/cycles/ and cpu11 for
cpu_atom/cycles/, and prints a warning for each.
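
The selection above boils down to splitting the user's cpu list per
PMU. The following is only a standalone sketch of that idea using plain
int arrays instead of 'struct perf_cpu_map'; cpu_in_set() and
split_cpus() are hypothetical helpers, not perf or libperf APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: is 'cpu' one of the 'nr' entries in 'set'? */
static int cpu_in_set(const int *set, size_t nr, int cpu)
{
	for (size_t i = 0; i < nr; i++)
		if (set[i] == cpu)
			return 1;
	return 0;
}

/*
 * Split 'user' into the CPUs the PMU covers ('matched') and the rest
 * ('unmatched'). Both output arrays must have room for 'user_nr'
 * entries.
 */
static void split_cpus(const int *pmu_cpus, size_t pmu_nr,
		       const int *user, size_t user_nr,
		       int *matched, size_t *matched_nr,
		       int *unmatched, size_t *unmatched_nr)
{
	assert(matched && unmatched && matched_nr && unmatched_nr);

	*matched_nr = 0;
	*unmatched_nr = 0;

	for (size_t i = 0; i < user_nr; i++) {
		if (cpu_in_set(pmu_cpus, pmu_nr, user[i]))
			matched[(*matched_nr)++] = user[i];
		else
			unmatched[(*unmatched_nr)++] = user[i];
	}
}
```

With cpu_core = 0-7, cpu_atom = 8-11 and a user list of 0,11, the split
for cpu_core gives matched = {0} and unmatched = {11}; the split for
cpu_atom gives matched = {11} and unmatched = {0}. That corresponds to
the two warnings shown above.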

Some more complex examples:

  # perf stat -e cycles,instructions -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
  WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           2,780,387      cpu_core/cycles/
           1,583,432      cpu_atom/cycles/
           3,957,277      cpu_core/instructions/
           1,167,089      cpu_atom/instructions/

         1.006005124 seconds time elapsed

  # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           3,290,301      cpu_core/cycles/
           1,953,073      cpu_atom/cycles/
           1,407,869      cpu_atom/instructions/

         1.006260912 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
v3:
 - Rebase to perf/core.
 - No functional change.

v2:
 - Automatically map to hybrid pmu.

 tools/perf/builtin-record.c     |  7 ++++
 tools/perf/builtin-stat.c       |  6 +++
 tools/perf/util/evlist-hybrid.c | 65 +++++++++++++++++++++++++++++++++
 tools/perf/util/evlist-hybrid.h |  1 +
 tools/perf/util/evlist.c        |  1 +
 tools/perf/util/pmu.c           | 35 ++++++++++++++++++
 tools/perf/util/pmu.h           |  4 ++
 7 files changed, 119 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 671a21c9ee4d..9518b028b850 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2884,6 +2884,13 @@ int cmd_record(int argc, const char **argv)
 	/* Enable ignoring missing threads when -u/-p option is defined. */
 	rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid;
 
+	if (evlist__use_cpu_list(rec->evlist, rec->opts.target.cpu_list)) {
+		pr_err("failed to use cpu list %s\n",
+		       rec->opts.target.cpu_list);
+		goto out;
+	}
+
+	rec->opts.target.hybrid = perf_pmu__has_hybrid();
 	err = -ENOMEM;
 	if (evlist__create_maps(rec->evlist, &rec->opts.target) < 0)
 		usage_with_options(record_usage, record_options);
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index d25cb8088e8c..f7067587008f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2430,6 +2430,12 @@ int cmd_stat(int argc, const char **argv)
 	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
 		target.per_thread = true;
 
+	if (evlist__use_cpu_list(evsel_list, target.cpu_list)) {
+		pr_err("failed to use cpu list %s\n", target.cpu_list);
+		goto out;
+	}
+
+	target.hybrid = perf_pmu__has_hybrid();
 	if (evlist__create_maps(evsel_list, &target) < 0) {
 		if (target__has_task(&target)) {
 			pr_err("Problems finding threads of monitor\n");
diff --git a/tools/perf/util/evlist-hybrid.c b/tools/perf/util/evlist-hybrid.c
index db3f5fbdebe1..13c9f3063dda 100644
--- a/tools/perf/util/evlist-hybrid.c
+++ b/tools/perf/util/evlist-hybrid.c
@@ -86,3 +86,68 @@ bool evlist__has_hybrid(struct evlist *evlist)
 
 	return false;
 }
+
+int evlist__use_cpu_list(struct evlist *evlist, const char *cpu_list)
+{
+	struct perf_cpu_map *cpus;
+	struct evsel *evsel, *tmp;
+	struct perf_pmu *pmu;
+	int ret, unmatched_count = 0, events_nr = 0;
+
+	if (!perf_pmu__has_hybrid() || !cpu_list)
+		return 0;
+
+	cpus = perf_cpu_map__new(cpu_list);
+	if (!cpus)
+		return -1;
+
+	evlist__for_each_entry_safe(evlist, tmp, evsel) {
+		struct perf_cpu_map *matched_cpus, *unmatched_cpus;
+		char buf1[128], buf2[128];
+
+		pmu = perf_pmu__find_hybrid_pmu(evsel->pmu_name);
+		if (!pmu)
+			continue;
+
+		ret = perf_pmu__cpus_match(pmu, cpus, &matched_cpus,
+					   &unmatched_cpus);
+		if (ret)
+			goto out;
+
+		events_nr++;
+
+		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
+		    matched_cpus->nr < cpus->nr ||
+		    matched_cpus->nr < pmu->cpus->nr)) {
+			perf_cpu_map__put(evsel->core.cpus);
+			perf_cpu_map__put(evsel->core.own_cpus);
+			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
+			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);
+
+			if (unmatched_cpus->nr > 0) {
+				cpu_map__snprint(matched_cpus, buf1, sizeof(buf1));
+				pr_warning("WARNING: use %s in '%s' for '%s', skip other cpus in list.\n",
+					   buf1, pmu->name, evsel->name);
+			}
+		}
+
+		if (matched_cpus->nr == 0) {
+			evlist__remove(evlist, evsel);
+			evsel__delete(evsel);
+
+			cpu_map__snprint(cpus, buf1, sizeof(buf1));
+			cpu_map__snprint(pmu->cpus, buf2, sizeof(buf2));
+			pr_warning("WARNING: %s isn't a '%s', please use a CPU list in the '%s' range (%s)\n",
+				   buf1, pmu->name, pmu->name, buf2);
+			unmatched_count++;
+		}
+
+		perf_cpu_map__put(matched_cpus);
+		perf_cpu_map__put(unmatched_cpus);
+	}
+
+	ret = (unmatched_count == events_nr) ? -1 : 0;
+out:
+	perf_cpu_map__put(cpus);
+	return ret;
+}
diff --git a/tools/perf/util/evlist-hybrid.h b/tools/perf/util/evlist-hybrid.h
index 19f74b4c340a..f33a4e8443a1 100644
--- a/tools/perf/util/evlist-hybrid.h
+++ b/tools/perf/util/evlist-hybrid.h
@@ -10,5 +10,6 @@
 int evlist__add_default_hybrid(struct evlist *evlist, bool precise);
 void evlist__warn_hybrid_group(struct evlist *evlist);
 bool evlist__has_hybrid(struct evlist *evlist);
+int evlist__use_cpu_list(struct evlist *evlist, const char *cpu_list);
 
 #endif /* __PERF_EVLIST_HYBRID_H */
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 06f8890816c3..5f92319ce258 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -27,6 +27,7 @@
 #include "util/perf_api_probe.h"
 #include "util/evsel_fprintf.h"
 #include "util/evlist-hybrid.h"
+#include "util/pmu.h"
 #include <signal.h>
 #include <unistd.h>
 #include <sched.h>
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 44b90d638ad5..229cb975c5d7 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1906,3 +1906,38 @@ int perf_pmu__match(char *pattern, char *name, char *tok)
 
 	return 0;
 }
+
+int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
+			 struct perf_cpu_map **mcpus_ptr,
+			 struct perf_cpu_map **ucpus_ptr)
+{
+	struct perf_cpu_map *pmu_cpus = pmu->cpus;
+	struct perf_cpu_map *matched_cpus, *unmatched_cpus;
+	int matched_nr = 0, unmatched_nr = 0;
+
+	matched_cpus = perf_cpu_map__default_new();
+	if (!matched_cpus)
+		return -1;
+
+	unmatched_cpus = perf_cpu_map__default_new();
+	if (!unmatched_cpus) {
+		perf_cpu_map__put(matched_cpus);
+		return -1;
+	}
+
+	for (int i = 0; i < cpus->nr; i++) {
+		int cpu;
+
+		cpu = perf_cpu_map__idx(pmu_cpus, cpus->map[i]);
+		if (cpu == -1)
+			unmatched_cpus->map[unmatched_nr++] = cpus->map[i];
+		else
+			matched_cpus->map[matched_nr++] = cpus->map[i];
+	}
+
+	unmatched_cpus->nr = unmatched_nr;
+	matched_cpus->nr = matched_nr;
+	*mcpus_ptr = matched_cpus;
+	*ucpus_ptr = unmatched_cpus;
+	return 0;
+}
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 926da483a141..e05ee1906f65 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -11,6 +11,7 @@
 #include "pmu-events/pmu-events.h"
 
 struct evsel_config_term;
+struct perf_cpu_map;
 
 enum {
 	PERF_PMU_FORMAT_VALUE_CONFIG,
@@ -135,4 +136,7 @@ void perf_pmu__warn_invalid_config(struct perf_pmu *pmu, __u64 config,
 bool perf_pmu__has_hybrid(void);
 int perf_pmu__match(char *pattern, char *name, char *tok);
 
+int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
+			 struct perf_cpu_map **mcpus_ptr,
+			 struct perf_cpu_map **ucpus_ptr);
 #endif /* __PMU_H */
-- 
2.17.1



* Re: [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid
  2021-07-12  7:12 ` [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid Jin Yao
@ 2021-07-19 19:36   ` Jiri Olsa
  2021-07-20  7:07     ` Jin, Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2021-07-19 19:36 UTC
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Mon, Jul 12, 2021 at 03:12:35PM +0800, Jin Yao wrote:
> The perf-record and perf-stat have supported the option '-C/--cpus'
> to count or collect only on the list of CPUs provided. This option
> needs to be supported for hybrid as well.
> 
> For hybrid support, it needs to check that the CPUs are available on
> hybrid PMU. One example for AlderLake, cpu0-7 is 'cpu_core', cpu8-11
> is 'cpu_atom'.
> 
> Before:
> 
>   # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
> 
>    Performance counter stats for 'CPU(s) 11':
> 
>      <not supported>      cpu_core/cycles/
> 
>          1.006179431 seconds time elapsed
> 
> The perf-stat silently returned "<not supported>" without any helpful
> information. It should error out that cpu11 was not 'cpu_core'.
> 
> After:
> 
>   # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
>   WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
>   failed to use cpu list 11
> 
> We also need to support the events without pmu prefix specified.
> 
>   # perf stat -e cycles -C11 -- sleep 1
>   WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
> 
>    Performance counter stats for 'CPU(s) 11':
> 
>            1,067,373      cpu_atom/cycles/
> 
>          1.005544738 seconds time elapsed
> 
> The perf tool creates two cycles events automatically, cpu_core/cycles/ and
> cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning
> for cpu_core/cycles/ and only count the cpu_atom/cycles/.
> 
> If part of cpus are 'cpu_core' and part of cpus are 'cpu_atom', the example,
> 
>   # perf stat -e cycles -C0,11 -- sleep 1
>   WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>   WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
> 
>    Performance counter stats for 'CPU(s) 0,11':
> 
>            1,914,704      cpu_core/cycles/
>            2,036,983      cpu_atom/cycles/
> 
>          1.005815641 seconds time elapsed
> 
> It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for
> cpu_atom/cycles/, and output with some warnings.
> 
> Some more complex examples,
> 
>   # perf stat -e cycles,instructions -C0,11 -- sleep 1
>   WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>   WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
>   WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
>   WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.
> 
>    Performance counter stats for 'CPU(s) 0,11':
> 
>            2,780,387      cpu_core/cycles/
>            1,583,432      cpu_atom/cycles/
>            3,957,277      cpu_core/instructions/
>            1,167,089      cpu_atom/instructions/
> 
>          1.006005124 seconds time elapsed
> 
>   # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
>   WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>   WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
>   WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.
> 
>    Performance counter stats for 'CPU(s) 0,11':
> 
>            3,290,301      cpu_core/cycles/
>            1,953,073      cpu_atom/cycles/
>            1,407,869      cpu_atom/instructions/
> 
>          1.006260912 seconds time elapsed
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
> v3:
>  - Rebase to perf/core.
>  - No functional change.
> 
> v2:
>  - Automatically map to hybrid pmu.
> 
>  tools/perf/builtin-record.c     |  7 ++++
>  tools/perf/builtin-stat.c       |  6 +++
>  tools/perf/util/evlist-hybrid.c | 65 +++++++++++++++++++++++++++++++++
>  tools/perf/util/evlist-hybrid.h |  1 +
>  tools/perf/util/evlist.c        |  1 +
>  tools/perf/util/pmu.c           | 35 ++++++++++++++++++
>  tools/perf/util/pmu.h           |  4 ++
>  7 files changed, 119 insertions(+)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 671a21c9ee4d..9518b028b850 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -2884,6 +2884,13 @@ int cmd_record(int argc, const char **argv)
>  	/* Enable ignoring missing threads when -u/-p option is defined. */
>  	rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid;
>  
> +	if (evlist__use_cpu_list(rec->evlist, rec->opts.target.cpu_list)) {
> +		pr_err("failed to use cpu list %s\n",
> +		       rec->opts.target.cpu_list);
> +		goto out;
> +	}
> +
> +	rec->opts.target.hybrid = perf_pmu__has_hybrid();
>  	err = -ENOMEM;
>  	if (evlist__create_maps(rec->evlist, &rec->opts.target) < 0)
>  		usage_with_options(record_usage, record_options);
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index d25cb8088e8c..f7067587008f 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -2430,6 +2430,12 @@ int cmd_stat(int argc, const char **argv)
>  	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
>  		target.per_thread = true;
>  
> +	if (evlist__use_cpu_list(evsel_list, target.cpu_list)) {
> +		pr_err("failed to use cpu list %s\n", target.cpu_list);
> +		goto out;
> +	}
> +
> +	target.hybrid = perf_pmu__has_hybrid();
>  	if (evlist__create_maps(evsel_list, &target) < 0) {
>  		if (target__has_task(&target)) {
>  			pr_err("Problems finding threads of monitor\n");
> diff --git a/tools/perf/util/evlist-hybrid.c b/tools/perf/util/evlist-hybrid.c
> index db3f5fbdebe1..13c9f3063dda 100644
> --- a/tools/perf/util/evlist-hybrid.c
> +++ b/tools/perf/util/evlist-hybrid.c
> @@ -86,3 +86,68 @@ bool evlist__has_hybrid(struct evlist *evlist)
>  
>  	return false;
>  }
> +
> +int evlist__use_cpu_list(struct evlist *evlist, const char *cpu_list)


the name seems not to cover what it's doing, how about something
like evlist__fix_cpus or such

> +{
> +	struct perf_cpu_map *cpus;
> +	struct evsel *evsel, *tmp;
> +	struct perf_pmu *pmu;
> +	int ret, unmatched_count = 0, events_nr = 0;
> +
> +	if (!perf_pmu__has_hybrid() || !cpu_list)
> +		return 0;
> +
> +	cpus = perf_cpu_map__new(cpu_list);
> +	if (!cpus)
> +		return -1;
> +
> +	evlist__for_each_entry_safe(evlist, tmp, evsel) {
> +		struct perf_cpu_map *matched_cpus, *unmatched_cpus;
> +		char buf1[128], buf2[128];
> +
> +		pmu = perf_pmu__find_hybrid_pmu(evsel->pmu_name);
> +		if (!pmu)
> +			continue;
> +
> +		ret = perf_pmu__cpus_match(pmu, cpus, &matched_cpus,
> +					   &unmatched_cpus);
> +		if (ret)
> +			goto out;
> +
> +		events_nr++;
> +
> +		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
> +		    matched_cpus->nr < cpus->nr ||
> +		    matched_cpus->nr < pmu->cpus->nr)) {
> +			perf_cpu_map__put(evsel->core.cpus);
> +			perf_cpu_map__put(evsel->core.own_cpus);
> +			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
> +			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);

I'm a bit confused here.. AFAIUI there are 2 evsel objects created
for hybrid 'cycles' ... shouldn't they already have the proper cpus set?

> +
> +			if (unmatched_cpus->nr > 0) {
> +				cpu_map__snprint(matched_cpus, buf1, sizeof(buf1));
> +				pr_warning("WARNING: use %s in '%s' for '%s', skip other cpus in list.\n",
> +					   buf1, pmu->name, evsel->name);
> +			}
> +		}
> +
> +		if (matched_cpus->nr == 0) {
> +			evlist__remove(evlist, evsel);
> +			evsel__delete(evsel);
> +
> +			cpu_map__snprint(cpus, buf1, sizeof(buf1));
> +			cpu_map__snprint(pmu->cpus, buf2, sizeof(buf2));
> +			pr_warning("WARNING: %s isn't a '%s', please use a CPU list in the '%s' range (%s)\n",
> +				   buf1, pmu->name, pmu->name, buf2);
> +			unmatched_count++;
> +		}

hum, should we rather fail in here?

jirka



* Re: [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid
  2021-07-19 19:36   ` Jiri Olsa
@ 2021-07-20  7:07     ` Jin, Yao
  2021-07-20  9:16       ` Jiri Olsa
  0 siblings, 1 reply; 10+ messages in thread
From: Jin, Yao @ 2021-07-20  7:07 UTC
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Jiri,

On 7/20/2021 3:36 AM, Jiri Olsa wrote:
> On Mon, Jul 12, 2021 at 03:12:35PM +0800, Jin Yao wrote:
>> The perf-record and perf-stat have supported the option '-C/--cpus'
>> to count or collect only on the list of CPUs provided. This option
>> needs to be supported for hybrid as well.
>>
>> For hybrid support, it needs to check that the CPUs are available on
>> hybrid PMU. One example for AlderLake, cpu0-7 is 'cpu_core', cpu8-11
>> is 'cpu_atom'.
>>
>> Before:
>>
>>    # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
>>
>>     Performance counter stats for 'CPU(s) 11':
>>
>>       <not supported>      cpu_core/cycles/
>>
>>           1.006179431 seconds time elapsed
>>
>> The perf-stat silently returned "<not supported>" without any helpful
>> information. It should error out that cpu11 was not 'cpu_core'.
>>
>> After:
>>
>>    # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
>>    WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
>>    failed to use cpu list 11
>>
>> We also need to support the events without pmu prefix specified.
>>
>>    # perf stat -e cycles -C11 -- sleep 1
>>    WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
>>
>>     Performance counter stats for 'CPU(s) 11':
>>
>>             1,067,373      cpu_atom/cycles/
>>
>>           1.005544738 seconds time elapsed
>>
>> The perf tool creates two cycles events automatically, cpu_core/cycles/ and
>> cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning
>> for cpu_core/cycles/ and only count the cpu_atom/cycles/.
>>
>> If part of cpus are 'cpu_core' and part of cpus are 'cpu_atom', the example,
>>
>>    # perf stat -e cycles -C0,11 -- sleep 1
>>    WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
>>
>>     Performance counter stats for 'CPU(s) 0,11':
>>
>>             1,914,704      cpu_core/cycles/
>>             2,036,983      cpu_atom/cycles/
>>
>>           1.005815641 seconds time elapsed
>>
>> It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for
>> cpu_atom/cycles/, and output with some warnings.
>>
>> Some more complex examples,
>>
>>    # perf stat -e cycles,instructions -C0,11 -- sleep 1
>>    WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
>>    WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.
>>
>>     Performance counter stats for 'CPU(s) 0,11':
>>
>>             2,780,387      cpu_core/cycles/
>>             1,583,432      cpu_atom/cycles/
>>             3,957,277      cpu_core/instructions/
>>             1,167,089      cpu_atom/instructions/
>>
>>           1.006005124 seconds time elapsed
>>
>>    # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
>>    WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.
>>
>>     Performance counter stats for 'CPU(s) 0,11':
>>
>>             3,290,301      cpu_core/cycles/
>>             1,953,073      cpu_atom/cycles/
>>             1,407,869      cpu_atom/instructions/
>>
>>           1.006260912 seconds time elapsed
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>> v3:
>>   - Rebase to perf/core.
>>   - No functional change.
>>
>> v2:
>>   - Automatically map to hybrid pmu.
>>
>>   tools/perf/builtin-record.c     |  7 ++++
>>   tools/perf/builtin-stat.c       |  6 +++
>>   tools/perf/util/evlist-hybrid.c | 65 +++++++++++++++++++++++++++++++++
>>   tools/perf/util/evlist-hybrid.h |  1 +
>>   tools/perf/util/evlist.c        |  1 +
>>   tools/perf/util/pmu.c           | 35 ++++++++++++++++++
>>   tools/perf/util/pmu.h           |  4 ++
>>   7 files changed, 119 insertions(+)
>>
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index 671a21c9ee4d..9518b028b850 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -2884,6 +2884,13 @@ int cmd_record(int argc, const char **argv)
>>   	/* Enable ignoring missing threads when -u/-p option is defined. */
>>   	rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid;
>>   
>> +	if (evlist__use_cpu_list(rec->evlist, rec->opts.target.cpu_list)) {
>> +		pr_err("failed to use cpu list %s\n",
>> +		       rec->opts.target.cpu_list);
>> +		goto out;
>> +	}
>> +
>> +	rec->opts.target.hybrid = perf_pmu__has_hybrid();
>>   	err = -ENOMEM;
>>   	if (evlist__create_maps(rec->evlist, &rec->opts.target) < 0)
>>   		usage_with_options(record_usage, record_options);
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index d25cb8088e8c..f7067587008f 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -2430,6 +2430,12 @@ int cmd_stat(int argc, const char **argv)
>>   	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
>>   		target.per_thread = true;
>>   
>> +	if (evlist__use_cpu_list(evsel_list, target.cpu_list)) {
>> +		pr_err("failed to use cpu list %s\n", target.cpu_list);
>> +		goto out;
>> +	}
>> +
>> +	target.hybrid = perf_pmu__has_hybrid();
>>   	if (evlist__create_maps(evsel_list, &target) < 0) {
>>   		if (target__has_task(&target)) {
>>   			pr_err("Problems finding threads of monitor\n");
>> diff --git a/tools/perf/util/evlist-hybrid.c b/tools/perf/util/evlist-hybrid.c
>> index db3f5fbdebe1..13c9f3063dda 100644
>> --- a/tools/perf/util/evlist-hybrid.c
>> +++ b/tools/perf/util/evlist-hybrid.c
>> @@ -86,3 +86,68 @@ bool evlist__has_hybrid(struct evlist *evlist)
>>   
>>   	return false;
>>   }
>> +
>> +int evlist__use_cpu_list(struct evlist *evlist, const char *cpu_list)
> 
> 
> the name seems not to cover what it's doing, how about something
> like evlist__fix_cpus or such
> 

OK, evlist__fix_cpus() is better. I'll use this name in v4.

>> +{
>> +	struct perf_cpu_map *cpus;
>> +	struct evsel *evsel, *tmp;
>> +	struct perf_pmu *pmu;
>> +	int ret, unmatched_count = 0, events_nr = 0;
>> +
>> +	if (!perf_pmu__has_hybrid() || !cpu_list)
>> +		return 0;
>> +
>> +	cpus = perf_cpu_map__new(cpu_list);
>> +	if (!cpus)
>> +		return -1;
>> +
>> +	evlist__for_each_entry_safe(evlist, tmp, evsel) {
>> +		struct perf_cpu_map *matched_cpus, *unmatched_cpus;
>> +		char buf1[128], buf2[128];
>> +
>> +		pmu = perf_pmu__find_hybrid_pmu(evsel->pmu_name);
>> +		if (!pmu)
>> +			continue;
>> +
>> +		ret = perf_pmu__cpus_match(pmu, cpus, &matched_cpus,
>> +					   &unmatched_cpus);
>> +		if (ret)
>> +			goto out;
>> +
>> +		events_nr++;
>> +
>> +		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
>> +		    matched_cpus->nr < cpus->nr ||
>> +		    matched_cpus->nr < pmu->cpus->nr)) {
>> +			perf_cpu_map__put(evsel->core.cpus);
>> +			perf_cpu_map__put(evsel->core.own_cpus);
>> +			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
>> +			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);
> 
> I'm bit confused in here.. AFAIUI there's 2 evsel objects create
> for hybrid 'cycles' ... should they have already proper cpus set?
> 

For 'cycles', yes, two evsels are created automatically: one for the atom CPUs (e.g. 8-11) and
one for the core CPUs (e.g. 0-7). In this example, both evsels already have their cpus set.

The 'cpus' here, however, is just the user-specified cpu list:
cpus = perf_cpu_map__new(cpu_list);

We need to check whether each cpu in 'cpus' is available on the hybrid pmu and adjust
evsel->core.cpus according to the matching results.
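
The matching step described above can be illustrated with a small standalone sketch. It is not the perf implementation: cpu_mask_t, cpu_range() and cpus_match() are hypothetical names, and a 64-bit bitmask stands in for struct perf_cpu_map. It only shows the idea of splitting the user's list into the CPUs a pmu covers and the CPUs it does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU map: bit N set means CPU N is present. */
typedef uint64_t cpu_mask_t;

/* Build a mask covering CPUs first..last inclusive (both below 64). */
static cpu_mask_t cpu_range(unsigned int first, unsigned int last)
{
	cpu_mask_t mask = 0;

	assert(first <= last && last < 64);
	for (unsigned int cpu = first; cpu <= last; cpu++)
		mask |= (cpu_mask_t)1 << cpu;
	return mask;
}

/*
 * Split the user's CPU list into the part that the pmu covers (matched)
 * and the part that it does not cover (unmatched).
 */
static void cpus_match(cpu_mask_t pmu_cpus, cpu_mask_t user_cpus,
		       cpu_mask_t *matched, cpu_mask_t *unmatched)
{
	*matched = user_cpus & pmu_cpus;
	*unmatched = user_cpus & ~pmu_cpus;
}
```

With cpu_core on 0-7, cpu_atom on 8-11 and a user list of 0,11, the core pmu matches only CPU 0 and the atom pmu matches only CPU 11, which corresponds to the two warnings in the cover letter example.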

>> +
>> +			if (unmatched_cpus->nr > 0) {
>> +				cpu_map__snprint(matched_cpus, buf1, sizeof(buf1));
>> +				pr_warning("WARNING: use %s in '%s' for '%s', skip other cpus in list.\n",
>> +					   buf1, pmu->name, evsel->name);
>> +			}
>> +		}
>> +
>> +		if (matched_cpus->nr == 0) {
>> +			evlist__remove(evlist, evsel);
>> +			evsel__delete(evsel);
>> +
>> +			cpu_map__snprint(cpus, buf1, sizeof(buf1));
>> +			cpu_map__snprint(pmu->cpus, buf2, sizeof(buf2));
>> +			pr_warning("WARNING: %s isn't a '%s', please use a CPU list in the '%s' range (%s)\n",
>> +				   buf1, pmu->name, pmu->name, buf2);
>> +			unmatched_count++;
>> +		}
> 
> hum, should we rather fail in here?
> 

perf stat -e cpu_core/cycles/,cpu_atom/instructions/ -C11

CPU11 is an atom CPU, so the evsel 'cpu_core/cycles/' fails but 'cpu_atom/instructions/' is OK.

Shouldn't we still report the partially successful event?
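
One possible reading of the unmatched_count and events_nr counters in the patch is sketched below. The tail of the function is not shown in this thread, so the return policy here is an assumption: fail only when every hybrid event lost all of its CPUs. fix_cpus_result() is a hypothetical helper, and matched_nr[i] stands for the number of matched CPUs of event i.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed policy, not confirmed by the thread: return -1 only if every
 * hybrid event ended up with no usable CPU, otherwise return 0 so the
 * partially successful events are still counted.
 */
static int fix_cpus_result(const int *matched_nr, size_t n)
{
	int unmatched_count = 0, events_nr = 0;

	assert(matched_nr || n == 0);
	for (size_t i = 0; i < n; i++) {
		events_nr++;
		if (matched_nr[i] == 0)
			unmatched_count++;	/* this event would be removed */
	}

	return (events_nr > 0 && unmatched_count == events_nr) ? -1 : 0;
}
```

For the -C11 example, cpu_core/cycles/ has 0 matched CPUs and cpu_atom/instructions/ has 1, so under this assumed policy the call succeeds and only the atom event is reported.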

Thanks
Jin Yao

> jirka
> 

* Re: [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid
  2021-07-20  7:07     ` Jin, Yao
@ 2021-07-20  9:16       ` Jiri Olsa
  2021-07-21  4:30         ` Jin, Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2021-07-20  9:16 UTC (permalink / raw)
  To: Jin, Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Tue, Jul 20, 2021 at 03:07:02PM +0800, Jin, Yao wrote:

SNIP

> 
> OK, evlist__fix_cpus() is better, use this name in v4.
> 
> > > +{
> > > +	struct perf_cpu_map *cpus;
> > > +	struct evsel *evsel, *tmp;
> > > +	struct perf_pmu *pmu;
> > > +	int ret, unmatched_count = 0, events_nr = 0;
> > > +
> > > +	if (!perf_pmu__has_hybrid() || !cpu_list)
> > > +		return 0;
> > > +
> > > +	cpus = perf_cpu_map__new(cpu_list);
> > > +	if (!cpus)
> > > +		return -1;
> > > +
> > > +	evlist__for_each_entry_safe(evlist, tmp, evsel) {
> > > +		struct perf_cpu_map *matched_cpus, *unmatched_cpus;
> > > +		char buf1[128], buf2[128];
> > > +
> > > +		pmu = perf_pmu__find_hybrid_pmu(evsel->pmu_name);
> > > +		if (!pmu)
> > > +			continue;
> > > +
> > > +		ret = perf_pmu__cpus_match(pmu, cpus, &matched_cpus,
> > > +					   &unmatched_cpus);
> > > +		if (ret)
> > > +			goto out;
> > > +
> > > +		events_nr++;
> > > +
> > > +		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
> > > +		    matched_cpus->nr < cpus->nr ||
> > > +		    matched_cpus->nr < pmu->cpus->nr)) {
> > > +			perf_cpu_map__put(evsel->core.cpus);
> > > +			perf_cpu_map__put(evsel->core.own_cpus);
> > > +			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
> > > +			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);
> > 
> > I'm bit confused in here.. AFAIUI there's 2 evsel objects create
> > for hybrid 'cycles' ... should they have already proper cpus set?
> > 
> 
> For 'cycles', yes two evsels are created automatically. One is for atom CPU
> (e.g. 8-11), the other is for core CPU (e.g. 0-7). In this example, these 2
> evsels have already the cpus set.

hum, so those evsels are created with pmu's cpus, right?

> 
> While the 'cpus' here is just the user specified cpu list.
> cpus = perf_cpu_map__new(cpu_list);

then I think they will be changed by evlist__create_maps
with whatever user wants?

could we just change __perf_evlist__propagate_maps to follow
pmu's cpus?

jirka

> 
> We need to check that the cpu in 'cpus' is available on hybrid pmu or not
> and adjust the evsel->core.cpus according the matching results.
> 
> > > +
> > > +			if (unmatched_cpus->nr > 0) {
> > > +				cpu_map__snprint(matched_cpus, buf1, sizeof(buf1));
> > > +				pr_warning("WARNING: use %s in '%s' for '%s', skip other cpus in list.\n",
> > > +					   buf1, pmu->name, evsel->name);
> > > +			}
> > > +		}
> > > +
> > > +		if (matched_cpus->nr == 0) {
> > > +			evlist__remove(evlist, evsel);
> > > +			evsel__delete(evsel);
> > > +
> > > +			cpu_map__snprint(cpus, buf1, sizeof(buf1));
> > > +			cpu_map__snprint(pmu->cpus, buf2, sizeof(buf2));
> > > +			pr_warning("WARNING: %s isn't a '%s', please use a CPU list in the '%s' range (%s)\n",
> > > +				   buf1, pmu->name, pmu->name, buf2);
> > > +			unmatched_count++;
> > > +		}
> > 
> > hum, should we rather fail in here?
> > 
> 
> perf stat -e cpu_core/cycles/,cpu_atom/instructions/ -C11
> 
> CPU11 is atom CPU so the evsel 'cpu_core/cycles/' is failed but cpu_atom/instructions/ is OK.
> 
> Don't we report the partially successful event?
> 
> Thanks
> Jin Yao
> 
> > jirka
> > 
> 


* Re: [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid
  2021-07-20  9:16       ` Jiri Olsa
@ 2021-07-21  4:30         ` Jin, Yao
  2021-07-22 10:19           ` Jiri Olsa
  0 siblings, 1 reply; 10+ messages in thread
From: Jin, Yao @ 2021-07-21  4:30 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Jiri,

On 7/20/2021 5:16 PM, Jiri Olsa wrote:
> On Tue, Jul 20, 2021 at 03:07:02PM +0800, Jin, Yao wrote:
> 
> SNIP
> 
>>
>> OK, evlist__fix_cpus() is better, use this name in v4.
>>
>>>> +{
>>>> +	struct perf_cpu_map *cpus;
>>>> +	struct evsel *evsel, *tmp;
>>>> +	struct perf_pmu *pmu;
>>>> +	int ret, unmatched_count = 0, events_nr = 0;
>>>> +
>>>> +	if (!perf_pmu__has_hybrid() || !cpu_list)
>>>> +		return 0;
>>>> +
>>>> +	cpus = perf_cpu_map__new(cpu_list);
>>>> +	if (!cpus)
>>>> +		return -1;
>>>> +
>>>> +	evlist__for_each_entry_safe(evlist, tmp, evsel) {
>>>> +		struct perf_cpu_map *matched_cpus, *unmatched_cpus;
>>>> +		char buf1[128], buf2[128];
>>>> +
>>>> +		pmu = perf_pmu__find_hybrid_pmu(evsel->pmu_name);
>>>> +		if (!pmu)
>>>> +			continue;
>>>> +
>>>> +		ret = perf_pmu__cpus_match(pmu, cpus, &matched_cpus,
>>>> +					   &unmatched_cpus);
>>>> +		if (ret)
>>>> +			goto out;
>>>> +
>>>> +		events_nr++;
>>>> +
>>>> +		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
>>>> +		    matched_cpus->nr < cpus->nr ||
>>>> +		    matched_cpus->nr < pmu->cpus->nr)) {
>>>> +			perf_cpu_map__put(evsel->core.cpus);
>>>> +			perf_cpu_map__put(evsel->core.own_cpus);
>>>> +			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
>>>> +			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);
>>>
>>> I'm bit confused in here.. AFAIUI there's 2 evsel objects create
>>> for hybrid 'cycles' ... should they have already proper cpus set?
>>>
>>
>> For 'cycles', yes two evsels are created automatically. One is for atom CPU
>> (e.g. 8-11), the other is for core CPU (e.g. 0-7). In this example, these 2
>> evsels have already the cpus set.
> 
> hum, so those evsels are created with pmu's cpus, right?
> 

Yes, that's right. But on hybrid we also check and adjust evsel->cpus against the user's cpu list
(which is what evlist__use_cpu_list() does).

>>
>> While the 'cpus' here is just the user specified cpu list.
>> cpus = perf_cpu_map__new(cpu_list);
> 
> then I think they will be changed by evlist__create_maps
> with whatever user wants?
> 

No, it will not be changed by evlist__create_maps.

In evlist__create_maps(),
evlist->core.has_user_cpus = !!target->cpu_list && !target->hybrid;

It disables has_user_cpus for hybrid.

So in __perf_evlist__propagate_maps, they will not be changed by evlist->cpus.

if (!evsel->own_cpus || evlist->has_user_cpus) {
	perf_cpu_map__put(evsel->cpus);
	evsel->cpus = perf_cpu_map__get(evlist->cpus);
	
> could we just change __perf_evlist__propagate_maps to follow
> pmu's cpus?
> 

__perf_evlist__propagate_maps already follows the pmu's cpus, because
evlist->has_user_cpus is false for hybrid.
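
The interaction of the two snippets above can be modelled in a few lines. This is a toy sketch, not libperf code: the toy_ types and functions are hypothetical, plain strings stand in for cpu maps, and the behaviour when the condition is false (the evsel keeps its own cpus) is an assumption, since only the if branch is quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the evsel and evlist structures. */
struct toy_evsel {
	const char *own_cpus;	/* cpus from the pmu, possibly already fixed up */
	const char *cpus;	/* cpus the event will actually be opened on */
};

struct toy_evlist {
	const char *cpus;	/* user-specified cpu list */
	bool has_user_cpus;
};

/* Mirrors the flag assignment quoted from evlist__create_maps(). */
static bool toy_has_user_cpus(const char *cpu_list, bool hybrid)
{
	return cpu_list != NULL && !hybrid;
}

/* Mirrors the quoted condition in __perf_evlist__propagate_maps(). */
static void toy_propagate(const struct toy_evlist *evlist,
			  struct toy_evsel *evsel)
{
	assert(evlist && evsel);
	if (!evsel->own_cpus || evlist->has_user_cpus)
		evsel->cpus = evlist->cpus;	/* user list wins */
	/* Assumed: otherwise evsel->cpus is left as set from own_cpus. */
}
```

With hybrid set, has_user_cpus is false, so an evsel whose own_cpus were already adjusted keeps them through propagation. Without hybrid, the user's list replaces them.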

Thanks
Jin Yao

> jirka
> 
>>
>> We need to check that the cpu in 'cpus' is available on hybrid pmu or not
>> and adjust the evsel->core.cpus according the matching results.
>>
>>>> +
>>>> +			if (unmatched_cpus->nr > 0) {
>>>> +				cpu_map__snprint(matched_cpus, buf1, sizeof(buf1));
>>>> +				pr_warning("WARNING: use %s in '%s' for '%s', skip other cpus in list.\n",
>>>> +					   buf1, pmu->name, evsel->name);
>>>> +			}
>>>> +		}
>>>> +
>>>> +		if (matched_cpus->nr == 0) {
>>>> +			evlist__remove(evlist, evsel);
>>>> +			evsel__delete(evsel);
>>>> +
>>>> +			cpu_map__snprint(cpus, buf1, sizeof(buf1));
>>>> +			cpu_map__snprint(pmu->cpus, buf2, sizeof(buf2));
>>>> +			pr_warning("WARNING: %s isn't a '%s', please use a CPU list in the '%s' range (%s)\n",
>>>> +				   buf1, pmu->name, pmu->name, buf2);
>>>> +			unmatched_count++;
>>>> +		}
>>>
>>> hum, should we rather fail in here?
>>>
>>
>> perf stat -e cpu_core/cycles/,cpu_atom/instructions/ -C11
>>
>> CPU11 is atom CPU so the evsel 'cpu_core/cycles/' is failed but cpu_atom/instructions/ is OK.
>>
>> Don't we report the partially successful event?
>>
>> Thanks
>> Jin Yao
>>
>>> jirka
>>>
>>
> 

* Re: [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid
  2021-07-21  4:30         ` Jin, Yao
@ 2021-07-22 10:19           ` Jiri Olsa
  2021-07-23  0:49             ` Jin, Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2021-07-22 10:19 UTC (permalink / raw)
  To: Jin, Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Wed, Jul 21, 2021 at 12:30:11PM +0800, Jin, Yao wrote:
> Hi Jiri,
> 
> On 7/20/2021 5:16 PM, Jiri Olsa wrote:
> > On Tue, Jul 20, 2021 at 03:07:02PM +0800, Jin, Yao wrote:
> > 
> > SNIP
> > 
> > > 
> > > OK, evlist__fix_cpus() is better, use this name in v4.
> > > 
> > > > > +{
> > > > > +	struct perf_cpu_map *cpus;
> > > > > +	struct evsel *evsel, *tmp;
> > > > > +	struct perf_pmu *pmu;
> > > > > +	int ret, unmatched_count = 0, events_nr = 0;
> > > > > +
> > > > > +	if (!perf_pmu__has_hybrid() || !cpu_list)
> > > > > +		return 0;
> > > > > +
> > > > > +	cpus = perf_cpu_map__new(cpu_list);
> > > > > +	if (!cpus)
> > > > > +		return -1;
> > > > > +
> > > > > +	evlist__for_each_entry_safe(evlist, tmp, evsel) {
> > > > > +		struct perf_cpu_map *matched_cpus, *unmatched_cpus;
> > > > > +		char buf1[128], buf2[128];
> > > > > +
> > > > > +		pmu = perf_pmu__find_hybrid_pmu(evsel->pmu_name);
> > > > > +		if (!pmu)
> > > > > +			continue;
> > > > > +
> > > > > +		ret = perf_pmu__cpus_match(pmu, cpus, &matched_cpus,
> > > > > +					   &unmatched_cpus);
> > > > > +		if (ret)
> > > > > +			goto out;
> > > > > +
> > > > > +		events_nr++;
> > > > > +
> > > > > +		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
> > > > > +		    matched_cpus->nr < cpus->nr ||
> > > > > +		    matched_cpus->nr < pmu->cpus->nr)) {
> > > > > +			perf_cpu_map__put(evsel->core.cpus);
> > > > > +			perf_cpu_map__put(evsel->core.own_cpus);
> > > > > +			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
> > > > > +			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);
> > > > 
> > > > I'm bit confused in here.. AFAIUI there's 2 evsel objects create
> > > > for hybrid 'cycles' ... should they have already proper cpus set?
> > > > 
> > > 
> > > For 'cycles', yes two evsels are created automatically. One is for atom CPU
> > > (e.g. 8-11), the other is for core CPU (e.g. 0-7). In this example, these 2
> > > evsels have already the cpus set.
> > 
> > hum, so those evsels are created with pmu's cpus, right?
> > 
> 
> Yes, that's right. But we also check and adjust the evsel->cpus by using
> user's cpu list on hybrid (what the evlist__use_cpu_list() does).
> 
> > > 
> > > While the 'cpus' here is just the user specified cpu list.
> > > cpus = perf_cpu_map__new(cpu_list);
> > 
> > then I think they will be changed by evlist__create_maps
> > with whatever user wants?
> > 
> 
> No, it will not be changed by evlist__create_maps.
> 
> In evlist__create_maps(),
> evlist->core.has_user_cpus = !!target->cpu_list && !target->hybrid;
> 
> It disables has_user_cpus for hybrid.
> 
> So in __perf_evlist__propagate_maps, they will not be changed by evlist->cpus.
> 
> if (!evsel->own_cpus || evlist->has_user_cpus) {
> 	perf_cpu_map__put(evsel->cpus);
> 	evsel->cpus = perf_cpu_map__get(evlist->cpus);
> 	
> > could we just change __perf_evlist__propagate_maps to follow
> > pmu's cpus?
> > 
> 
> In __perf_evlist__propagate_maps, it has already followed pmu's cpus because
> the evlist->has_user_cpus is false for hybrid.

sorry for delay

ok, so we first fix the cpus on hybrid events and then
propagate maps.. I guess it's ok, because it's in libperf
and that has no notion of hybrid so far

could you please rename that function so it's also obvious
it's for hybrid only

  evlist__fix_hybrid_cpus ? not sure ;-)

and add some comment with example to explain what the
function is doing

thanks,
jirka


* Re: [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid
  2021-07-22 10:19           ` Jiri Olsa
@ 2021-07-23  0:49             ` Jin, Yao
  0 siblings, 0 replies; 10+ messages in thread
From: Jin, Yao @ 2021-07-23  0:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Jiri,

On 7/22/2021 6:19 PM, Jiri Olsa wrote:
> On Wed, Jul 21, 2021 at 12:30:11PM +0800, Jin, Yao wrote:
>> Hi Jiri,
>>
>> On 7/20/2021 5:16 PM, Jiri Olsa wrote:
>>> On Tue, Jul 20, 2021 at 03:07:02PM +0800, Jin, Yao wrote:
>>>
>>> SNIP
>>>
>>>>
>>>> OK, evlist__fix_cpus() is better, use this name in v4.
>>>>
>>>>>> +{
>>>>>> +	struct perf_cpu_map *cpus;
>>>>>> +	struct evsel *evsel, *tmp;
>>>>>> +	struct perf_pmu *pmu;
>>>>>> +	int ret, unmatched_count = 0, events_nr = 0;
>>>>>> +
>>>>>> +	if (!perf_pmu__has_hybrid() || !cpu_list)
>>>>>> +		return 0;
>>>>>> +
>>>>>> +	cpus = perf_cpu_map__new(cpu_list);
>>>>>> +	if (!cpus)
>>>>>> +		return -1;
>>>>>> +
>>>>>> +	evlist__for_each_entry_safe(evlist, tmp, evsel) {
>>>>>> +		struct perf_cpu_map *matched_cpus, *unmatched_cpus;
>>>>>> +		char buf1[128], buf2[128];
>>>>>> +
>>>>>> +		pmu = perf_pmu__find_hybrid_pmu(evsel->pmu_name);
>>>>>> +		if (!pmu)
>>>>>> +			continue;
>>>>>> +
>>>>>> +		ret = perf_pmu__cpus_match(pmu, cpus, &matched_cpus,
>>>>>> +					   &unmatched_cpus);
>>>>>> +		if (ret)
>>>>>> +			goto out;
>>>>>> +
>>>>>> +		events_nr++;
>>>>>> +
>>>>>> +		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
>>>>>> +		    matched_cpus->nr < cpus->nr ||
>>>>>> +		    matched_cpus->nr < pmu->cpus->nr)) {
>>>>>> +			perf_cpu_map__put(evsel->core.cpus);
>>>>>> +			perf_cpu_map__put(evsel->core.own_cpus);
>>>>>> +			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
>>>>>> +			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);
>>>>>
>>>>> I'm bit confused in here.. AFAIUI there's 2 evsel objects create
>>>>> for hybrid 'cycles' ... should they have already proper cpus set?
>>>>>
>>>>
>>>> For 'cycles', yes two evsels are created automatically. One is for atom CPU
>>>> (e.g. 8-11), the other is for core CPU (e.g. 0-7). In this example, these 2
>>>> evsels have already the cpus set.
>>>
>>> hum, so those evsels are created with pmu's cpus, right?
>>>
>>
>> Yes, that's right. But we also check and adjust the evsel->cpus by using
>> user's cpu list on hybrid (what the evlist__use_cpu_list() does).
>>
>>>>
>>>> While the 'cpus' here is just the user specified cpu list.
>>>> cpus = perf_cpu_map__new(cpu_list);
>>>
>>> then I think they will be changed by evlist__create_maps
>>> with whatever user wants?
>>>
>>
>> No, it will not be changed by evlist__create_maps.
>>
>> In evlist__create_maps(),
>> evlist->core.has_user_cpus = !!target->cpu_list && !target->hybrid;
>>
>> It disables has_user_cpus for hybrid.
>>
>> So in __perf_evlist__propagate_maps, they will not be changed by evlist->cpus.
>>
>> if (!evsel->own_cpus || evlist->has_user_cpus) {
>> 	perf_cpu_map__put(evsel->cpus);
>> 	evsel->cpus = perf_cpu_map__get(evlist->cpus);
>> 	
>>> could we just change __perf_evlist__propagate_maps to follow
>>> pmu's cpus?
>>>
>>
>> In __perf_evlist__propagate_maps, it has already followed pmu's cpus because
>> the evlist->has_user_cpus is false for hybrid.
> 
> sorry for delay
> 

Never mind. :)

> ok, so we first fix the cpus on hybrid events and then
> propagate maps.. I guess it's ok, because it's in libperf
> and that has no notion of hybrid so far
> 

Yes. Making libperf hybrid-aware would require modifying its interface, which we need to avoid.
So I finally decided to adjust evsel->cpus first and then propagate maps.

> could you please rename that function so it's also obvious
> it's for hybrid only
> 
>    evlist__fix_hybrid_cpus ? not sure ;-)
> 

Sure, I will rename the function in v4.

> and add some comment with example to explain what the
> function is doing
> 

Got it!

Thanks
Jin Yao

> thanks,
> jirka
> 

end of thread, newest message: 2021-07-23  0:49 UTC

Thread overview: 10+ messages
2021-07-12  7:12 [PATCH v3 0/3] perf tool: Enable cpu list for hybrid Jin Yao
2021-07-12  7:12 ` [PATCH v3 1/3] libperf: Add perf_cpu_map__default_new() Jin Yao
2021-07-12  7:12 ` [PATCH v3 2/3] perf tools: Create hybrid flag in target Jin Yao
2021-07-12  7:12 ` [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid Jin Yao
2021-07-19 19:36   ` Jiri Olsa
2021-07-20  7:07     ` Jin, Yao
2021-07-20  9:16       ` Jiri Olsa
2021-07-21  4:30         ` Jin, Yao
2021-07-22 10:19           ` Jiri Olsa
2021-07-23  0:49             ` Jin, Yao
