linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1)
@ 2022-09-24 16:57 Namhyung Kim
  2022-09-24 16:57 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

Hello,

The system-wide evsel has a cpu map for all (online) cpus regardless
of user requested cpus.  But the cpu map handling code has some
special case for it and I think we can cleanup the code by making sure
that such a evsel has a proper cpu/thread maps from the beginning.
This patches should not cause any change in the behavior.

You can get the code from 'perf/cpumap-update-v1' branch in

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
  
Thanks,
Namhyung


Namhyung Kim (5):
  libperf: Populate system-wide evsel maps
  libperf: Propagate maps only if necessary
  perf tools: Get rid of evlist__add_on_all_cpus()
  perf tools: Add evlist__add_sched_switch()
  perf tools: Remove special handling of system-wide evsel

 tools/lib/perf/evlist.c             | 23 ++++++++-------
 tools/lib/perf/evsel.c              |  3 --
 tools/perf/arch/x86/util/intel-pt.c | 15 ++++------
 tools/perf/builtin-script.c         |  3 --
 tools/perf/tests/switch-tracking.c  | 15 ++++------
 tools/perf/util/evlist.c            | 46 ++++++++++++-----------------
 tools/perf/util/evlist.h            |  1 +
 tools/perf/util/evsel.c             | 12 ++------
 tools/perf/util/stat.c              |  3 --
 9 files changed, 44 insertions(+), 77 deletions(-)

-- 
2.37.3.998.g577e59143f-goog


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
@ 2022-09-24 16:57 ` Namhyung Kim
  2022-09-27  6:56   ` Adrian Hunter
  2022-09-24 16:57 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

Setting proper cpu and thread maps for system wide evsels regardless of
user requested cpu in __perf_evlist__propagate_maps().  Those evsels
need to be active on all cpus always.  Do it in the libperf so that we
can guarantee it has proper maps.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evlist.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6b1bafe267a4..187129652ab6 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 	 * We already have cpus for evsel (via PMU sysfs) so
 	 * keep it, if there's no target cpu list defined.
 	 */
-	if (!evsel->own_cpus ||
-	    (!evsel->system_wide && evlist->has_user_cpus) ||
-	    (!evsel->system_wide &&
-	     !evsel->requires_cpu &&
-	     perf_cpu_map__empty(evlist->user_requested_cpus))) {
+	if (evsel->system_wide) {
+		perf_cpu_map__put(evsel->cpus);
+		evsel->cpus = perf_cpu_map__new(NULL);
+	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
+		   (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
 	} else if (evsel->cpus != evsel->own_cpus) {
@@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 		evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
 	}
 
-	if (!evsel->system_wide) {
+	if (evsel->system_wide) {
+		perf_thread_map__put(evsel->threads);
+		evsel->threads = perf_thread_map__new_dummy();
+	} else {
 		perf_thread_map__put(evsel->threads);
 		evsel->threads = perf_thread_map__get(evlist->threads);
 	}
-- 
2.37.3.998.g577e59143f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
  2022-09-24 16:57 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
@ 2022-09-24 16:57 ` Namhyung Kim
  2022-09-27  7:06   ` Adrian Hunter
  2022-09-24 16:57 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

The current code propagate evsel's cpu map settings to evlist when it's
added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
be updated in perf_evlist__set_maps() later.  No need to do it before
evlist's cpus are set actually.

Actually we discarded this intermediate all_cpus maps at the beginning
of perf_evlist__set_maps().  Let's not do this.  It's only needed when
an evsel is added after the evlist cpu maps are set.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evlist.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 187129652ab6..cc070c3a134d 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -67,10 +67,6 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
 {
 	struct perf_evsel *evsel;
 
-	/* Recomputing all_cpus, so start with a blank slate. */
-	perf_cpu_map__put(evlist->all_cpus);
-	evlist->all_cpus = NULL;
-
 	perf_evlist__for_each_evsel(evlist, evsel)
 		__perf_evlist__propagate_maps(evlist, evsel);
 }
@@ -81,7 +77,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
 	evsel->idx = evlist->nr_entries;
 	list_add_tail(&evsel->node, &evlist->entries);
 	evlist->nr_entries += 1;
-	__perf_evlist__propagate_maps(evlist, evsel);
+
+	if (evlist->all_cpus)
+		__perf_evlist__propagate_maps(evlist, evsel);
 }
 
 void perf_evlist__remove(struct perf_evlist *evlist,
-- 
2.37.3.998.g577e59143f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus()
  2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
  2022-09-24 16:57 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
  2022-09-24 16:57 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
@ 2022-09-24 16:57 ` Namhyung Kim
  2022-09-27  7:07   ` Adrian Hunter
  2022-09-24 16:57 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

The cpu and thread maps are properly handled in libperf now.  No need to
do it in the perf tools anymore.  Let's remove the logic.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evlist.c | 29 ++---------------------------
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index fcfe5bcc0bcf..dcf57b271ff1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -268,28 +268,6 @@ int evlist__add_dummy(struct evlist *evlist)
 	return 0;
 }
 
-static void evlist__add_on_all_cpus(struct evlist *evlist, struct evsel *evsel)
-{
-	evsel->core.system_wide = true;
-
-	/*
-	 * All CPUs.
-	 *
-	 * Note perf_event_open() does not accept CPUs that are not online, so
-	 * in fact this CPU list will include only all online CPUs.
-	 */
-	perf_cpu_map__put(evsel->core.own_cpus);
-	evsel->core.own_cpus = perf_cpu_map__new(NULL);
-	perf_cpu_map__put(evsel->core.cpus);
-	evsel->core.cpus = perf_cpu_map__get(evsel->core.own_cpus);
-
-	/* No threads */
-	perf_thread_map__put(evsel->core.threads);
-	evsel->core.threads = perf_thread_map__new_dummy();
-
-	evlist__add(evlist, evsel);
-}
-
 struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
 {
 	struct evsel *evsel = evlist__dummy_event(evlist);
@@ -302,14 +280,11 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
 	evsel->core.attr.exclude_hv = 1;
 	evsel->core.attr.freq = 0;
 	evsel->core.attr.sample_period = 1;
+	evsel->core.system_wide = system_wide;
 	evsel->no_aux_samples = true;
 	evsel->name = strdup("dummy:u");
 
-	if (system_wide)
-		evlist__add_on_all_cpus(evlist, evsel);
-	else
-		evlist__add(evlist, evsel);
-
+	evlist__add(evlist, evsel);
 	return evsel;
 }
 
-- 
2.37.3.998.g577e59143f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/5] perf tools: Add evlist__add_sched_switch()
  2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
                   ` (2 preceding siblings ...)
  2022-09-24 16:57 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
@ 2022-09-24 16:57 ` Namhyung Kim
  2022-09-27  7:07   ` Adrian Hunter
  2022-09-24 16:57 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
  2022-09-26 19:24 ` [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Arnaldo Carvalho de Melo
  5 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

Add a help to create a system-wide sched_switch event.  One merit is
that it sets the system-wide bit before adding it to evlist so that
the libperf can handle the cpu and thread maps correctly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/arch/x86/util/intel-pt.c | 15 +++++----------
 tools/perf/tests/switch-tracking.c  | 15 +++++----------
 tools/perf/util/evlist.c            | 17 +++++++++++++++++
 tools/perf/util/evlist.h            |  1 +
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 13933020a79e..793b35f2221a 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -11,6 +11,7 @@
 #include <linux/bitops.h>
 #include <linux/log2.h>
 #include <linux/zalloc.h>
+#include <linux/err.h>
 #include <cpuid.h>
 
 #include "../../../util/session.h"
@@ -426,20 +427,14 @@ static int intel_pt_track_switches(struct evlist *evlist)
 	if (!evlist__can_select_event(evlist, sched_switch))
 		return -EPERM;
 
-	err = parse_event(evlist, sched_switch);
-	if (err) {
-		pr_debug2("%s: failed to parse %s, error %d\n",
+	evsel = evlist__add_sched_switch(evlist, true);
+	if (IS_ERR(evsel)) {
+		err = PTR_ERR(evsel);
+		pr_debug2("%s: failed to create %s, error = %d\n",
 			  __func__, sched_switch, err);
 		return err;
 	}
 
-	evsel = evlist__last(evlist);
-
-	evsel__set_sample_bit(evsel, CPU);
-	evsel__set_sample_bit(evsel, TIME);
-
-	evsel->core.system_wide = true;
-	evsel->no_aux_samples = true;
 	evsel->immediate = true;
 
 	return 0;
diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c
index 2d46af9ef935..87f565c7f650 100644
--- a/tools/perf/tests/switch-tracking.c
+++ b/tools/perf/tests/switch-tracking.c
@@ -6,6 +6,7 @@
 #include <time.h>
 #include <stdlib.h>
 #include <linux/zalloc.h>
+#include <linux/err.h>
 #include <perf/cpumap.h>
 #include <perf/evlist.h>
 #include <perf/mmap.h>
@@ -398,19 +399,13 @@ static int test__switch_tracking(struct test_suite *test __maybe_unused, int sub
 		goto out;
 	}
 
-	err = parse_event(evlist, sched_switch);
-	if (err) {
-		pr_debug("Failed to parse event %s\n", sched_switch);
+	switch_evsel = evlist__add_sched_switch(evlist, true);
+	if (IS_ERR(switch_evsel)) {
+		err = PTR_ERR(switch_evsel);
+		pr_debug("Failed to create event %s\n", sched_switch);
 		goto out_err;
 	}
 
-	switch_evsel = evlist__last(evlist);
-
-	evsel__set_sample_bit(switch_evsel, CPU);
-	evsel__set_sample_bit(switch_evsel, TIME);
-
-	switch_evsel->core.system_wide = true;
-	switch_evsel->no_aux_samples = true;
 	switch_evsel->immediate = true;
 
 	/* Test moving an event to the front */
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dcf57b271ff1..6612b00949e7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -288,6 +288,23 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
 	return evsel;
 }
 
+struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide)
+{
+	struct evsel *evsel = evsel__newtp_idx("sched", "sched_switch", 0);
+
+	if (IS_ERR(evsel))
+		return evsel;
+
+	evsel__set_sample_bit(evsel, CPU);
+	evsel__set_sample_bit(evsel, TIME);
+
+	evsel->core.system_wide = system_wide;
+	evsel->no_aux_samples = true;
+
+	evlist__add(evlist, evsel);
+	return evsel;
+};
+
 int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
 {
 	struct evsel *evsel, *n;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 9d967fe3953a..16734c6756b3 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -127,6 +127,7 @@ static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist)
 {
 	return evlist__add_aux_dummy(evlist, true);
 }
+struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide);
 
 int evlist__add_sb_event(struct evlist *evlist, struct perf_event_attr *attr,
 			 evsel__sb_cb_t cb, void *data);
-- 
2.37.3.998.g577e59143f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
  2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
                   ` (3 preceding siblings ...)
  2022-09-24 16:57 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
@ 2022-09-24 16:57 ` Namhyung Kim
  2022-09-27  7:09   ` Adrian Hunter
  2022-09-26 19:24 ` [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Arnaldo Carvalho de Melo
  5 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1.  But the code guarantees such a thread map, so no
need to handle it specially.

No functional change intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evsel.c      |  3 ---
 tools/perf/builtin-script.c |  3 ---
 tools/perf/util/evsel.c     | 12 ++----------
 tools/perf/util/stat.c      |  3 ---
 4 files changed, 2 insertions(+), 19 deletions(-)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8ce5bbd09666..8b51b008a81f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 	if (ncpus == 0 || nthreads == 0)
 		return 0;
 
-	if (evsel->system_wide)
-		nthreads = 1;
-
 	evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
 	if (evsel->sample_id == NULL)
 		return -ENOMEM;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 886f53cfa257..7fa467ed91dc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 	struct perf_cpu cpu;
 	static int header_printed;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	if (!header_printed) {
 		printf("%3s %8s %15s %15s %15s %15s %s\n",
 		       "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5776bfa70f11..e319bb17d10d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
 static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		struct perf_thread_map *threads)
 {
-	int nthreads;
+	int nthreads = perf_thread_map__nr(threads);
 
 	if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
 	    (perf_missing_features.aux_output     && evsel->core.attr.aux_output))
@@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		threads = empty_thread_map;
 	}
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
-
 	if (evsel->core.fd == NULL &&
 	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
 		return -ENOMEM;
@@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (threads == NULL)
 		threads = empty_thread_map;
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
+	nthreads = perf_thread_map__nr(threads);
 
 	if (evsel->cgrp)
 		pid = evsel->cgrp->fd;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..cef943377ad7 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
 	int ncpus = evsel__nr_cpus(counter);
 	int idx, thread;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	for (thread = 0; thread < nthreads; thread++) {
 		for (idx = 0; idx < ncpus; idx++) {
 			if (process_counter_values(config, counter, idx, thread,
-- 
2.37.3.998.g577e59143f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1)
  2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
                   ` (4 preceding siblings ...)
  2022-09-24 16:57 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
@ 2022-09-26 19:24 ` Arnaldo Carvalho de Melo
  5 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-09-26 19:24 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
	Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

Em Sat, Sep 24, 2022 at 09:57:32AM -0700, Namhyung Kim escreveu:
> Hello,
> 
> The system-wide evsel has a cpu map for all (online) cpus regardless
> of user requested cpus.  But the cpu map handling code has some
> special case for it and I think we can cleanup the code by making sure
> that such a evsel has a proper cpu/thread maps from the beginning.
> This patches should not cause any change in the behavior.
> 
> You can get the code from 'perf/cpumap-update-v1' branch in
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>   
> Thanks,
> Namhyung

Looks sane, applied locally for testing.

- Arnaldo
 
> 
> Namhyung Kim (5):
>   libperf: Populate system-wide evsel maps
>   libperf: Propagate maps only if necessary
>   perf tools: Get rid of evlist__add_on_all_cpus()
>   perf tools: Add evlist__add_sched_switch()
>   perf tools: Remove special handling of system-wide evsel
> 
>  tools/lib/perf/evlist.c             | 23 ++++++++-------
>  tools/lib/perf/evsel.c              |  3 --
>  tools/perf/arch/x86/util/intel-pt.c | 15 ++++------
>  tools/perf/builtin-script.c         |  3 --
>  tools/perf/tests/switch-tracking.c  | 15 ++++------
>  tools/perf/util/evlist.c            | 46 ++++++++++++-----------------
>  tools/perf/util/evlist.h            |  1 +
>  tools/perf/util/evsel.c             | 12 ++------
>  tools/perf/util/stat.c              |  3 --
>  9 files changed, 44 insertions(+), 77 deletions(-)
> 
> -- 
> 2.37.3.998.g577e59143f-goog

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-09-24 16:57 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
@ 2022-09-27  6:56   ` Adrian Hunter
  0 siblings, 0 replies; 26+ messages in thread
From: Adrian Hunter @ 2022-09-27  6:56 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Kan Liang, Leo Yan

On 24/09/22 19:57, Namhyung Kim wrote:
> Setting proper cpu and thread maps for system wide evsels regardless of
> user requested cpu in __perf_evlist__propagate_maps().  Those evsels
> need to be active on all cpus always.  Do it in the libperf so that we
> can guarantee it has proper maps.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  tools/lib/perf/evlist.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 6b1bafe267a4..187129652ab6 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>  	 * We already have cpus for evsel (via PMU sysfs) so
>  	 * keep it, if there's no target cpu list defined.
>  	 */
> -	if (!evsel->own_cpus ||
> -	    (!evsel->system_wide && evlist->has_user_cpus) ||
> -	    (!evsel->system_wide &&
> -	     !evsel->requires_cpu &&
> -	     perf_cpu_map__empty(evlist->user_requested_cpus))) {
> +	if (evsel->system_wide) {
> +		perf_cpu_map__put(evsel->cpus);
> +		evsel->cpus = perf_cpu_map__new(NULL);
> +	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
> +		   (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
>  		perf_cpu_map__put(evsel->cpus);
>  		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
>  	} else if (evsel->cpus != evsel->own_cpus) {
> @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>  		evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
>  	}
>  
> -	if (!evsel->system_wide) {
> +	if (evsel->system_wide) {
> +		perf_thread_map__put(evsel->threads);
> +		evsel->threads = perf_thread_map__new_dummy();
> +	} else {
>  		perf_thread_map__put(evsel->threads);
>  		evsel->threads = perf_thread_map__get(evlist->threads);
>  	}


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-24 16:57 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
@ 2022-09-27  7:06   ` Adrian Hunter
  2022-09-27 17:28     ` Namhyung Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Adrian Hunter @ 2022-09-27  7:06 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Kan Liang, Leo Yan

On 24/09/22 19:57, Namhyung Kim wrote:
> The current code propagate evsel's cpu map settings to evlist when it's
> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> be updated in perf_evlist__set_maps() later.  No need to do it before
> evlist's cpus are set actually.
> 
> Actually we discarded this intermediate all_cpus maps at the beginning
> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> an evsel is added after the evlist cpu maps are set.

That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
added to the evlist.  It can also remove an evsel from the evlist.

There might be other cases like that, but that was just one that stuck
out.

> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/lib/perf/evlist.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 187129652ab6..cc070c3a134d 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -67,10 +67,6 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>  {
>  	struct perf_evsel *evsel;
>  
> -	/* Recomputing all_cpus, so start with a blank slate. */
> -	perf_cpu_map__put(evlist->all_cpus);
> -	evlist->all_cpus = NULL;
> -
>  	perf_evlist__for_each_evsel(evlist, evsel)
>  		__perf_evlist__propagate_maps(evlist, evsel);
>  }
> @@ -81,7 +77,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
>  	evsel->idx = evlist->nr_entries;
>  	list_add_tail(&evsel->node, &evlist->entries);
>  	evlist->nr_entries += 1;
> -	__perf_evlist__propagate_maps(evlist, evsel);
> +
> +	if (evlist->all_cpus)
> +		__perf_evlist__propagate_maps(evlist, evsel);
>  }
>  
>  void perf_evlist__remove(struct perf_evlist *evlist,


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus()
  2022-09-24 16:57 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
@ 2022-09-27  7:07   ` Adrian Hunter
  0 siblings, 0 replies; 26+ messages in thread
From: Adrian Hunter @ 2022-09-27  7:07 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Kan Liang, Leo Yan

On 24/09/22 19:57, Namhyung Kim wrote:
> The cpu and thread maps are properly handled in libperf now.  No need to
> do it in the perf tools anymore.  Let's remove the logic.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  tools/perf/util/evlist.c | 29 ++---------------------------
>  1 file changed, 2 insertions(+), 27 deletions(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index fcfe5bcc0bcf..dcf57b271ff1 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -268,28 +268,6 @@ int evlist__add_dummy(struct evlist *evlist)
>  	return 0;
>  }
>  
> -static void evlist__add_on_all_cpus(struct evlist *evlist, struct evsel *evsel)
> -{
> -	evsel->core.system_wide = true;
> -
> -	/*
> -	 * All CPUs.
> -	 *
> -	 * Note perf_event_open() does not accept CPUs that are not online, so
> -	 * in fact this CPU list will include only all online CPUs.
> -	 */
> -	perf_cpu_map__put(evsel->core.own_cpus);
> -	evsel->core.own_cpus = perf_cpu_map__new(NULL);
> -	perf_cpu_map__put(evsel->core.cpus);
> -	evsel->core.cpus = perf_cpu_map__get(evsel->core.own_cpus);
> -
> -	/* No threads */
> -	perf_thread_map__put(evsel->core.threads);
> -	evsel->core.threads = perf_thread_map__new_dummy();
> -
> -	evlist__add(evlist, evsel);
> -}
> -
>  struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
>  {
>  	struct evsel *evsel = evlist__dummy_event(evlist);
> @@ -302,14 +280,11 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
>  	evsel->core.attr.exclude_hv = 1;
>  	evsel->core.attr.freq = 0;
>  	evsel->core.attr.sample_period = 1;
> +	evsel->core.system_wide = system_wide;
>  	evsel->no_aux_samples = true;
>  	evsel->name = strdup("dummy:u");
>  
> -	if (system_wide)
> -		evlist__add_on_all_cpus(evlist, evsel);
> -	else
> -		evlist__add(evlist, evsel);
> -
> +	evlist__add(evlist, evsel);
>  	return evsel;
>  }
>  


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/5] perf tools: Add evlist__add_sched_switch()
  2022-09-24 16:57 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
@ 2022-09-27  7:07   ` Adrian Hunter
  0 siblings, 0 replies; 26+ messages in thread
From: Adrian Hunter @ 2022-09-27  7:07 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Kan Liang, Leo Yan

On 24/09/22 19:57, Namhyung Kim wrote:
> Add a help to create a system-wide sched_switch event.  One merit is
> that it sets the system-wide bit before adding it to evlist so that
> the libperf can handle the cpu and thread maps correctly.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  tools/perf/arch/x86/util/intel-pt.c | 15 +++++----------
>  tools/perf/tests/switch-tracking.c  | 15 +++++----------
>  tools/perf/util/evlist.c            | 17 +++++++++++++++++
>  tools/perf/util/evlist.h            |  1 +
>  4 files changed, 28 insertions(+), 20 deletions(-)
> 
> diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
> index 13933020a79e..793b35f2221a 100644
> --- a/tools/perf/arch/x86/util/intel-pt.c
> +++ b/tools/perf/arch/x86/util/intel-pt.c
> @@ -11,6 +11,7 @@
>  #include <linux/bitops.h>
>  #include <linux/log2.h>
>  #include <linux/zalloc.h>
> +#include <linux/err.h>
>  #include <cpuid.h>
>  
>  #include "../../../util/session.h"
> @@ -426,20 +427,14 @@ static int intel_pt_track_switches(struct evlist *evlist)
>  	if (!evlist__can_select_event(evlist, sched_switch))
>  		return -EPERM;
>  
> -	err = parse_event(evlist, sched_switch);
> -	if (err) {
> -		pr_debug2("%s: failed to parse %s, error %d\n",
> +	evsel = evlist__add_sched_switch(evlist, true);
> +	if (IS_ERR(evsel)) {
> +		err = PTR_ERR(evsel);
> +		pr_debug2("%s: failed to create %s, error = %d\n",
>  			  __func__, sched_switch, err);
>  		return err;
>  	}
>  
> -	evsel = evlist__last(evlist);
> -
> -	evsel__set_sample_bit(evsel, CPU);
> -	evsel__set_sample_bit(evsel, TIME);
> -
> -	evsel->core.system_wide = true;
> -	evsel->no_aux_samples = true;
>  	evsel->immediate = true;
>  
>  	return 0;
> diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c
> index 2d46af9ef935..87f565c7f650 100644
> --- a/tools/perf/tests/switch-tracking.c
> +++ b/tools/perf/tests/switch-tracking.c
> @@ -6,6 +6,7 @@
>  #include <time.h>
>  #include <stdlib.h>
>  #include <linux/zalloc.h>
> +#include <linux/err.h>
>  #include <perf/cpumap.h>
>  #include <perf/evlist.h>
>  #include <perf/mmap.h>
> @@ -398,19 +399,13 @@ static int test__switch_tracking(struct test_suite *test __maybe_unused, int sub
>  		goto out;
>  	}
>  
> -	err = parse_event(evlist, sched_switch);
> -	if (err) {
> -		pr_debug("Failed to parse event %s\n", sched_switch);
> +	switch_evsel = evlist__add_sched_switch(evlist, true);
> +	if (IS_ERR(switch_evsel)) {
> +		err = PTR_ERR(switch_evsel);
> +		pr_debug("Failed to create event %s\n", sched_switch);
>  		goto out_err;
>  	}
>  
> -	switch_evsel = evlist__last(evlist);
> -
> -	evsel__set_sample_bit(switch_evsel, CPU);
> -	evsel__set_sample_bit(switch_evsel, TIME);
> -
> -	switch_evsel->core.system_wide = true;
> -	switch_evsel->no_aux_samples = true;
>  	switch_evsel->immediate = true;
>  
>  	/* Test moving an event to the front */
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index dcf57b271ff1..6612b00949e7 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -288,6 +288,23 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
>  	return evsel;
>  }
>  
> +struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide)
> +{
> +	struct evsel *evsel = evsel__newtp_idx("sched", "sched_switch", 0);
> +
> +	if (IS_ERR(evsel))
> +		return evsel;
> +
> +	evsel__set_sample_bit(evsel, CPU);
> +	evsel__set_sample_bit(evsel, TIME);
> +
> +	evsel->core.system_wide = system_wide;
> +	evsel->no_aux_samples = true;
> +
> +	evlist__add(evlist, evsel);
> +	return evsel;
> +};
> +
>  int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
>  {
>  	struct evsel *evsel, *n;
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 9d967fe3953a..16734c6756b3 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -127,6 +127,7 @@ static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist)
>  {
>  	return evlist__add_aux_dummy(evlist, true);
>  }
> +struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide);
>  
>  int evlist__add_sb_event(struct evlist *evlist, struct perf_event_attr *attr,
>  			 evsel__sb_cb_t cb, void *data);


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
  2022-09-24 16:57 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
@ 2022-09-27  7:09   ` Adrian Hunter
  0 siblings, 0 replies; 26+ messages in thread
From: Adrian Hunter @ 2022-09-27  7:09 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Kan Liang, Leo Yan

On 24/09/22 19:57, Namhyung Kim wrote:
> For system-wide evsels, the thread map should be dummy - i.e. it has a
> single entry of -1.  But the code guarantees such a thread map, so no
> need to handle it specially.
> 
> No functional change intended.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  tools/lib/perf/evsel.c      |  3 ---
>  tools/perf/builtin-script.c |  3 ---
>  tools/perf/util/evsel.c     | 12 ++----------
>  tools/perf/util/stat.c      |  3 ---
>  4 files changed, 2 insertions(+), 19 deletions(-)
> 
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 8ce5bbd09666..8b51b008a81f 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
>  	if (ncpus == 0 || nthreads == 0)
>  		return 0;
>  
> -	if (evsel->system_wide)
> -		nthreads = 1;
> -
>  	evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
>  	if (evsel->sample_id == NULL)
>  		return -ENOMEM;
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 886f53cfa257..7fa467ed91dc 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
>  	struct perf_cpu cpu;
>  	static int header_printed;
>  
> -	if (counter->core.system_wide)
> -		nthreads = 1;
> -
>  	if (!header_printed) {
>  		printf("%3s %8s %15s %15s %15s %15s %s\n",
>  		       "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 5776bfa70f11..e319bb17d10d 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
>  static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
>  		struct perf_thread_map *threads)
>  {
> -	int nthreads;
> +	int nthreads = perf_thread_map__nr(threads);
>  
>  	if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
>  	    (perf_missing_features.aux_output     && evsel->core.attr.aux_output))
> @@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
>  		threads = empty_thread_map;
>  	}
>  
> -	if (evsel->core.system_wide)
> -		nthreads = 1;
> -	else
> -		nthreads = threads->nr;
> -
>  	if (evsel->core.fd == NULL &&
>  	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
>  		return -ENOMEM;
> @@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>  	if (threads == NULL)
>  		threads = empty_thread_map;
>  
> -	if (evsel->core.system_wide)
> -		nthreads = 1;
> -	else
> -		nthreads = threads->nr;
> +	nthreads = perf_thread_map__nr(threads);
>  
>  	if (evsel->cgrp)
>  		pid = evsel->cgrp->fd;
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index ce5e9e372fc4..cef943377ad7 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
>  	int ncpus = evsel__nr_cpus(counter);
>  	int idx, thread;
>  
> -	if (counter->core.system_wide)
> -		nthreads = 1;
> -
>  	for (thread = 0; thread < nthreads; thread++) {
>  		for (idx = 0; idx < ncpus; idx++) {
>  			if (process_counter_values(config, counter, idx, thread,


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-27  7:06   ` Adrian Hunter
@ 2022-09-27 17:28     ` Namhyung Kim
  2022-09-28  7:53       ` Adrian Hunter
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-27 17:28 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan

Hi Adrian,

On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 24/09/22 19:57, Namhyung Kim wrote:
> > The current code propagate evsel's cpu map settings to evlist when it's
> > added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> > be updated in perf_evlist__set_maps() later.  No need to do it before
> > evlist's cpus are set actually.
> >
> > Actually we discarded this intermediate all_cpus maps at the beginning
> > of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> > an evsel is added after the evlist cpu maps are set.
>
> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
> added to the evlist.  It can also remove an evsel from the evlist.

Thanks for your review.  I think it's fine to change evsel cpus or to remove
an evsel from evlist before calling evlist__create_maps().  The function
will take care of setting evlist's all_cpus from the evsels in the evlist.
So previous changes in evsel/cpus wouldn't be any special.

After this point, adding a new evsel needs to update evlist all cpus by
propagating cpu maps.  So I think hybrid cpus should be fine.
Did I miss something?

>
> There might be other cases like that, but that was just one that stuck
> out.

Thanks for sharing your concern.  Please let me know if you could
come up with another.

Namhyung

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-27 17:28     ` Namhyung Kim
@ 2022-09-28  7:53       ` Adrian Hunter
  2022-09-28 23:46         ` Namhyung Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Adrian Hunter @ 2022-09-28  7:53 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan

On 27/09/22 20:28, Namhyung Kim wrote:
> Hi Adrian,
> 
> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>> On 24/09/22 19:57, Namhyung Kim wrote:
>>> The current code propagate evsel's cpu map settings to evlist when it's
>>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
>>> be updated in perf_evlist__set_maps() later.  No need to do it before
>>> evlist's cpus are set actually.
>>>
>>> Actually we discarded this intermediate all_cpus maps at the beginning
>>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
>>> an evsel is added after the evlist cpu maps are set.
>>
>> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
>> added to the evlist.  It can also remove an evsel from the evlist.
> 
> Thanks for your review.  I think it's fine to change evsel cpus or to remove
> an evsel from evlist before calling evlist__create_maps().  The function
> will take care of setting evlist's all_cpus from the evsels in the evlist.
> So previous changes in evsel/cpus wouldn't be any special.
> 
> After this point, adding a new evsel needs to update evlist all cpus by
> propagating cpu maps.  So I think hybrid cpus should be fine.
> Did I miss something?

I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
cpus from the target->cpu_list (using perf record -C) , since after this
patch all_cpus always starts with the target->cpu_list instead of an empty
list.  But then, in the hybrid case, it puts a dummy event that uses the
target cpu list anyway, so the result is the same.

I don't know if there are any cases where all_cpus would actually need to
exclude some of the cpus from target->cpu_list.

> 
>>
>> There might be other cases like that, but that was just one that stuck
>> out.
> 
> Thanks for sharing your concern.  Please let me know if you could
> come up with another.
> 
> Namhyung


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-28  7:53       ` Adrian Hunter
@ 2022-09-28 23:46         ` Namhyung Kim
  2022-09-29  2:07           ` Ian Rogers
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-28 23:46 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan

On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 27/09/22 20:28, Namhyung Kim wrote:
> > Hi Adrian,
> >
> > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>
> >> On 24/09/22 19:57, Namhyung Kim wrote:
> >>> The current code propagate evsel's cpu map settings to evlist when it's
> >>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> >>> be updated in perf_evlist__set_maps() later.  No need to do it before
> >>> evlist's cpus are set actually.
> >>>
> >>> Actually we discarded this intermediate all_cpus maps at the beginning
> >>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> >>> an evsel is added after the evlist cpu maps are set.
> >>
> >> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
> >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
> >> added to the evlist.  It can also remove an evsel from the evlist.
> >
> > Thanks for your review.  I think it's fine to change evsel cpus or to remove
> > an evsel from evlist before calling evlist__create_maps().  The function
> > will take care of setting evlist's all_cpus from the evsels in the evlist.
> > So previous changes in evsel/cpus wouldn't be any special.
> >
> > After this point, adding a new evsel needs to update evlist all cpus by
> > propagating cpu maps.  So I think hybrid cpus should be fine.
> > Did I miss something?
>
> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
> cpus from the target->cpu_list (using perf record -C) , since after this
> patch all_cpus always starts with the target->cpu_list instead of an empty
> list.  But then, in the hybrid case, it puts a dummy event that uses the
> target cpu list anyway, so the result is the same.
>
> I don't know if there are any cases where all_cpus would actually need to
> exclude some of the cpus from target->cpu_list.

I'm not aware of other cases to reduce cpu list.  I think it'd be fine
if it has a cpu in the evlist->all_cpus even if it's not used.  The evsel
should have a correct list anyway and we mostly use the evsel cpus
to do the real work.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-28 23:46         ` Namhyung Kim
@ 2022-09-29  2:07           ` Ian Rogers
  2022-09-29  5:09             ` Namhyung Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Rogers @ 2022-09-29  2:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar,
	Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan

On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >
> > On 27/09/22 20:28, Namhyung Kim wrote:
> > > Hi Adrian,
> > >
> > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >>
> > >> On 24/09/22 19:57, Namhyung Kim wrote:
> > >>> The current code propagate evsel's cpu map settings to evlist when it's
> > >>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> > >>> be updated in perf_evlist__set_maps() later.  No need to do it before
> > >>> evlist's cpus are set actually.
> > >>>
> > >>> Actually we discarded this intermediate all_cpus maps at the beginning
> > >>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> > >>> an evsel is added after the evlist cpu maps are set.
> > >>
> > >> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
> > >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
> > >> added to the evlist.  It can also remove an evsel from the evlist.
> > >
> > > Thanks for your review.  I think it's fine to change evsel cpus or to remove
> > > an evsel from evlist before calling evlist__create_maps().  The function
> > > will take care of setting evlist's all_cpus from the evsels in the evlist.
> > > So previous changes in evsel/cpus wouldn't be any special.
> > >
> > > After this point, adding a new evsel needs to update evlist all cpus by
> > > propagating cpu maps.  So I think hybrid cpus should be fine.
> > > Did I miss something?
> >
> > I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
> > cpus from the target->cpu_list (using perf record -C) , since after this
> > patch all_cpus always starts with the target->cpu_list instead of an empty
> > list.  But then, in the hybrid case, it puts a dummy event that uses the
> > target cpu list anyway, so the result is the same.
> >
> > I don't know if there are any cases where all_cpus would actually need to
> > exclude some of the cpus from target->cpu_list.
>
> I'm not aware of other cases to reduce cpu list.  I think it'd be fine
> if it has a cpu in the evlist->all_cpus even if it's not used.  The evsel
> should have a correct list anyway and we mostly use the evsel cpus
> to do the real work.
>
> Thanks,
> Namhyung

The affinity changes made it so that we use all_cpus probably more
often than the evsel CPU maps for real work. The reason being we want
to avoid IPIs so we do all the work on 1 CPU and then move to the next
CPU in evlist all_cpus. evsel CPU maps are used to make sure the
indices are kept accurate - for example, if an uncore event is
measured with a CPU event:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404

Thanks,
Ian

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-29  2:07           ` Ian Rogers
@ 2022-09-29  5:09             ` Namhyung Kim
  2022-09-29  5:18               ` Adrian Hunter
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-29  5:09 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar,
	Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan

On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote:
>
> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >
> > > On 27/09/22 20:28, Namhyung Kim wrote:
> > > > Hi Adrian,
> > > >
> > > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > > >>
> > > >> On 24/09/22 19:57, Namhyung Kim wrote:
> > > >>> The current code propagate evsel's cpu map settings to evlist when it's
> > > >>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> > > >>> be updated in perf_evlist__set_maps() later.  No need to do it before
> > > >>> evlist's cpus are set actually.
> > > >>>
> > > >>> Actually we discarded this intermediate all_cpus maps at the beginning
> > > >>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> > > >>> an evsel is added after the evlist cpu maps are set.
> > > >>
> > > >> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
> > > >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
> > > >> added to the evlist.  It can also remove an evsel from the evlist.
> > > >
> > > > Thanks for your review.  I think it's fine to change evsel cpus or to remove
> > > > an evsel from evlist before calling evlist__create_maps().  The function
> > > > will take care of setting evlist's all_cpus from the evsels in the evlist.
> > > > So previous changes in evsel/cpus wouldn't be any special.
> > > >
> > > > After this point, adding a new evsel needs to update evlist all cpus by
> > > > propagating cpu maps.  So I think hybrid cpus should be fine.
> > > > Did I miss something?
> > >
> > > I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
> > > cpus from the target->cpu_list (using perf record -C) , since after this
> > > patch all_cpus always starts with the target->cpu_list instead of an empty
> > > list.  But then, in the hybrid case, it puts a dummy event that uses the
> > > target cpu list anyway, so the result is the same.
> > >
> > > I don't know if there are any cases where all_cpus would actually need to
> > > exclude some of the cpus from target->cpu_list.
> >
> > I'm not aware of other cases to reduce cpu list.  I think it'd be fine
> > if it has a cpu in the evlist->all_cpus even if it's not used.  The evsel
> > should have a correct list anyway and we mostly use the evsel cpus
> > to do the real work.
> >
> > Thanks,
> > Namhyung
>
> The affinity changes made it so that we use all_cpus probably more
> often than the evsel CPU maps for real work. The reason being we want
> to avoid IPIs so we do all the work on 1 CPU and then move to the next
> CPU in evlist all_cpus. evsel CPU maps are used to make sure the
> indices are kept accurate - for example, if an uncore event is
> measured with a CPU event:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404

Right, I meant it'd check the evsel cpus eventually even if it iterates
on the evlist all_cpus.  The evlist_cpu_iterator__next() will skip a
CPU if it's not in the evsel cpus.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-29  5:09             ` Namhyung Kim
@ 2022-09-29  5:18               ` Adrian Hunter
  2022-09-29 20:42                 ` Namhyung Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Adrian Hunter @ 2022-09-29  5:18 UTC (permalink / raw)
  To: Namhyung Kim, Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, linux-perf-users, Kan Liang, Leo Yan

On 29/09/22 08:09, Namhyung Kim wrote:
> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote:
>>
>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>>>
>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>
>>>> On 27/09/22 20:28, Namhyung Kim wrote:
>>>>> Hi Adrian,
>>>>>
>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>>>
>>>>>> On 24/09/22 19:57, Namhyung Kim wrote:
>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's
>>>>>>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
>>>>>>> be updated in perf_evlist__set_maps() later.  No need to do it before
>>>>>>> evlist's cpus are set actually.
>>>>>>>
>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning
>>>>>>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
>>>>>>> an evsel is added after the evlist cpu maps are set.
>>>>>>
>>>>>> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
>>>>>> added to the evlist.  It can also remove an evsel from the evlist.
>>>>>
>>>>> Thanks for your review.  I think it's fine to change evsel cpus or to remove
>>>>> an evsel from evlist before calling evlist__create_maps().  The function
>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist.
>>>>> So previous changes in evsel/cpus wouldn't be any special.
>>>>>
>>>>> After this point, adding a new evsel needs to update evlist all cpus by
>>>>> propagating cpu maps.  So I think hybrid cpus should be fine.
>>>>> Did I miss something?
>>>>
>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
>>>> cpus from the target->cpu_list (using perf record -C) , since after this
>>>> patch all_cpus always starts with the target->cpu_list instead of an empty
>>>> list.  But then, in the hybrid case, it puts a dummy event that uses the
>>>> target cpu list anyway, so the result is the same.
>>>>
>>>> I don't know if there are any cases where all_cpus would actually need to
>>>> exclude some of the cpus from target->cpu_list.
>>>
>>> I'm not aware of other cases to reduce cpu list.  I think it'd be fine
>>> if it has a cpu in the evlist->all_cpus even if it's not used.  The evsel
>>> should have a correct list anyway and we mostly use the evsel cpus
>>> to do the real work.
>>>
>>> Thanks,
>>> Namhyung
>>
>> The affinity changes made it so that we use all_cpus probably more
>> often than the evsel CPU maps for real work. The reason being we want
>> to avoid IPIs so we do all the work on 1 CPU and then move to the next
>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the
>> indices are kept accurate - for example, if an uncore event is
>> measured with a CPU event:
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404
> 
> Right, I meant it'd check the evsel cpus eventually even if it iterates
> on the evlist all_cpus.  The evlist_cpu_iterator__next() will skip a
> CPU if it's not in the evsel cpus.
> 
> Thanks,
> Namhyung

Perhaps an alternative is to be explicit about deferring map
propagation e.g.

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 19eaea99aa4f..5ce19e62397d 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -70,6 +70,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
 	/* Recomputing all_cpus, so start with a blank slate. */
 	perf_cpu_map__put(evlist->all_cpus);
 	evlist->all_cpus = NULL;
+	evlist->defer_map_propagation = false;
 
 	perf_evlist__for_each_evsel(evlist, evsel)
 		__perf_evlist__propagate_maps(evlist, evsel);
@@ -81,7 +82,8 @@ void perf_evlist__add(struct perf_evlist *evlist,
 	evsel->idx = evlist->nr_entries;
 	list_add_tail(&evsel->node, &evlist->entries);
 	evlist->nr_entries += 1;
-	__perf_evlist__propagate_maps(evlist, evsel);
+	if (!evlist->defer_map_propagation)
+		__perf_evlist__propagate_maps(evlist, evsel);
 }
 
 void perf_evlist__remove(struct perf_evlist *evlist,
@@ -177,9 +179,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
 		evlist->threads = perf_thread_map__get(threads);
 	}
 
-	if (!evlist->all_cpus && cpus)
-		evlist->all_cpus = perf_cpu_map__get(cpus);
-
 	perf_evlist__propagate_maps(evlist);
 }
 
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 6f89aec3e608..dbe0b763f597 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -19,6 +19,7 @@ struct perf_evlist {
 	int			 nr_entries;
 	int			 nr_groups;
 	bool			 has_user_cpus;
+	bool			 defer_map_propagation;
 	/**
 	 * The cpus passed from the command line or all online CPUs by
 	 * default.
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 52d254b1530c..1c2523d66a14 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -3975,6 +3975,7 @@ int cmd_record(int argc, const char **argv)
 	rec->evlist = evlist__new();
 	if (rec->evlist == NULL)
 		return -ENOMEM;
+	rec->evlist->core.defer_map_propagation = true;
 
 	err = perf_config(perf_record_config, rec);
 	if (err)


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-29  5:18               ` Adrian Hunter
@ 2022-09-29 20:42                 ` Namhyung Kim
  2022-09-30 12:49                   ` Adrian Hunter
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-29 20:42 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ian Rogers, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar,
	Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan

On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 29/09/22 08:09, Namhyung Kim wrote:
> > On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote:
> >>
> >> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >>>
> >>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>
> >>>> On 27/09/22 20:28, Namhyung Kim wrote:
> >>>>> Hi Adrian,
> >>>>>
> >>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>>>
> >>>>>> On 24/09/22 19:57, Namhyung Kim wrote:
> >>>>>>> The current code propagate evsel's cpu map settings to evlist when it's
> >>>>>>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> >>>>>>> be updated in perf_evlist__set_maps() later.  No need to do it before
> >>>>>>> evlist's cpus are set actually.
> >>>>>>>
> >>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning
> >>>>>>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> >>>>>>> an evsel is added after the evlist cpu maps are set.
> >>>>>>
> >>>>>> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
> >>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
> >>>>>> added to the evlist.  It can also remove an evsel from the evlist.
> >>>>>
> >>>>> Thanks for your review.  I think it's fine to change evsel cpus or to remove
> >>>>> an evsel from evlist before calling evlist__create_maps().  The function
> >>>>> will take care of setting evlist's all_cpus from the evsels in the evlist.
> >>>>> So previous changes in evsel/cpus wouldn't be any special.
> >>>>>
> >>>>> After this point, adding a new evsel needs to update evlist all cpus by
> >>>>> propagating cpu maps.  So I think hybrid cpus should be fine.
> >>>>> Did I miss something?
> >>>>
> >>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
> >>>> cpus from the target->cpu_list (using perf record -C) , since after this
> >>>> patch all_cpus always starts with the target->cpu_list instead of an empty
> >>>> list.  But then, in the hybrid case, it puts a dummy event that uses the
> >>>> target cpu list anyway, so the result is the same.
> >>>>
> >>>> I don't know if there are any cases where all_cpus would actually need to
> >>>> exclude some of the cpus from target->cpu_list.
> >>>
> >>> I'm not aware of other cases to reduce cpu list.  I think it'd be fine
> >>> if it has a cpu in the evlist->all_cpus even if it's not used.  The evsel
> >>> should have a correct list anyway and we mostly use the evsel cpus
> >>> to do the real work.
> >>>
> >>> Thanks,
> >>> Namhyung
> >>
> >> The affinity changes made it so that we use all_cpus probably more
> >> often than the evsel CPU maps for real work. The reason being we want
> >> to avoid IPIs so we do all the work on 1 CPU and then move to the next
> >> CPU in evlist all_cpus. evsel CPU maps are used to make sure the
> >> indices are kept accurate - for example, if an uncore event is
> >> measured with a CPU event:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
> >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404
> >
> > Right, I meant it'd check the evsel cpus eventually even if it iterates
> > on the evlist all_cpus.  The evlist_cpu_iterator__next() will skip a
> > CPU if it's not in the evsel cpus.
> >
> > Thanks,
> > Namhyung
>
> Perhaps an alternative is to be explicit about deferring map
> propagation e.g.

Thanks for your patch.  Yeah, we can use this.

But I still think it'd be better doing it unconditionally
since any propagation before perf_evlist__set_maps
will be discarded anyway.  With this change, other
than perf record will collect all cpus before _set_maps
and then discard it.  It seems like a waste, no?

Or else, we can have allow_map_propagation initialized
to false and set it to true in perf_evlist__set_maps().

Thanks,
Namhyung


>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 19eaea99aa4f..5ce19e62397d 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -70,6 +70,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>         /* Recomputing all_cpus, so start with a blank slate. */
>         perf_cpu_map__put(evlist->all_cpus);
>         evlist->all_cpus = NULL;
> +       evlist->defer_map_propagation = false;
>
>         perf_evlist__for_each_evsel(evlist, evsel)
>                 __perf_evlist__propagate_maps(evlist, evsel);
> @@ -81,7 +82,8 @@ void perf_evlist__add(struct perf_evlist *evlist,
>         evsel->idx = evlist->nr_entries;
>         list_add_tail(&evsel->node, &evlist->entries);
>         evlist->nr_entries += 1;
> -       __perf_evlist__propagate_maps(evlist, evsel);
> +       if (!evlist->defer_map_propagation)
> +               __perf_evlist__propagate_maps(evlist, evsel);
>  }
>
>  void perf_evlist__remove(struct perf_evlist *evlist,
> @@ -177,9 +179,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
>                 evlist->threads = perf_thread_map__get(threads);
>         }
>
> -       if (!evlist->all_cpus && cpus)
> -               evlist->all_cpus = perf_cpu_map__get(cpus);
> -
>         perf_evlist__propagate_maps(evlist);
>  }
>
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 6f89aec3e608..dbe0b763f597 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -19,6 +19,7 @@ struct perf_evlist {
>         int                      nr_entries;
>         int                      nr_groups;
>         bool                     has_user_cpus;
> +       bool                     defer_map_propagation;
>         /**
>          * The cpus passed from the command line or all online CPUs by
>          * default.
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 52d254b1530c..1c2523d66a14 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -3975,6 +3975,7 @@ int cmd_record(int argc, const char **argv)
>         rec->evlist = evlist__new();
>         if (rec->evlist == NULL)
>                 return -ENOMEM;
> +       rec->evlist->core.defer_map_propagation = true;
>
>         err = perf_config(perf_record_config, rec);
>         if (err)
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-29 20:42                 ` Namhyung Kim
@ 2022-09-30 12:49                   ` Adrian Hunter
  2022-09-30 16:44                     ` Namhyung Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Adrian Hunter @ 2022-09-30 12:49 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ian Rogers, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar,
	Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan

On 29/09/22 23:42, Namhyung Kim wrote:
> On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>> On 29/09/22 08:09, Namhyung Kim wrote:
>>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote:
>>>>
>>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>>>>>
>>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>>>
>>>>>> On 27/09/22 20:28, Namhyung Kim wrote:
>>>>>>> Hi Adrian,
>>>>>>>
>>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>>>>>
>>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote:
>>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's
>>>>>>>>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
>>>>>>>>> be updated in perf_evlist__set_maps() later.  No need to do it before
>>>>>>>>> evlist's cpus are set actually.
>>>>>>>>>
>>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning
>>>>>>>>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
>>>>>>>>> an evsel is added after the evlist cpu maps are set.
>>>>>>>>
>>>>>>>> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
>>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
>>>>>>>> added to the evlist.  It can also remove an evsel from the evlist.
>>>>>>>
>>>>>>> Thanks for your review.  I think it's fine to change evsel cpus or to remove
>>>>>>> an evsel from evlist before calling evlist__create_maps().  The function
>>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist.
>>>>>>> So previous changes in evsel/cpus wouldn't be any special.
>>>>>>>
>>>>>>> After this point, adding a new evsel needs to update evlist all cpus by
>>>>>>> propagating cpu maps.  So I think hybrid cpus should be fine.
>>>>>>> Did I miss something?
>>>>>>
>>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
>>>>>> cpus from the target->cpu_list (using perf record -C) , since after this
>>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty
>>>>>> list.  But then, in the hybrid case, it puts a dummy event that uses the
>>>>>> target cpu list anyway, so the result is the same.
>>>>>>
>>>>>> I don't know if there are any cases where all_cpus would actually need to
>>>>>> exclude some of the cpus from target->cpu_list.
>>>>>
>>>>> I'm not aware of other cases to reduce cpu list.  I think it'd be fine
>>>>> if it has a cpu in the evlist->all_cpus even if it's not used.  The evsel
>>>>> should have a correct list anyway and we mostly use the evsel cpus
>>>>> to do the real work.
>>>>>
>>>>> Thanks,
>>>>> Namhyung
>>>>
>>>> The affinity changes made it so that we use all_cpus probably more
>>>> often than the evsel CPU maps for real work. The reason being we want
>>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next
>>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the
>>>> indices are kept accurate - for example, if an uncore event is
>>>> measured with a CPU event:
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404
>>>
>>> Right, I meant it'd check the evsel cpus eventually even if it iterates
>>> on the evlist all_cpus.  The evlist_cpu_iterator__next() will skip a
>>> CPU if it's not in the evsel cpus.
>>>
>>> Thanks,
>>> Namhyung
>>
>> Perhaps an alternative is to be explicit about deferring map
>> propagation e.g.
> 
> Thanks for your patch.  Yeah, we can use this.
> 
> But I still think it'd be better doing it unconditionally
> since any propagation before perf_evlist__set_maps
> will be discarded anyway.  With this change, other
> than perf record will collect all cpus before _set_maps
> and then discard it.  It seems like a waste, no?
> 
> Or else, we can have allow_map_propagation initialized
> to false and set it to true in perf_evlist__set_maps().
> 

That sounds fine.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-30 12:49                   ` Adrian Hunter
@ 2022-09-30 16:44                     ` Namhyung Kim
  2022-09-30 16:56                       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-09-30 16:44 UTC (permalink / raw)
  To: Adrian Hunter, Arnaldo Carvalho de Melo
  Cc: Ian Rogers, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML,
	linux-perf-users, Kan Liang, Leo Yan

On Fri, Sep 30, 2022 at 5:50 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 29/09/22 23:42, Namhyung Kim wrote:
> > On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>
> >> On 29/09/22 08:09, Namhyung Kim wrote:
> >>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote:
> >>>>
> >>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >>>>>
> >>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>>>
> >>>>>> On 27/09/22 20:28, Namhyung Kim wrote:
> >>>>>>> Hi Adrian,
> >>>>>>>
> >>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>>>>>
> >>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote:
> >>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's
> >>>>>>>>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> >>>>>>>>> be updated in perf_evlist__set_maps() later.  No need to do it before
> >>>>>>>>> evlist's cpus are set actually.
> >>>>>>>>>
> >>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning
> >>>>>>>>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> >>>>>>>>> an evsel is added after the evlist cpu maps are set.
> >>>>>>>>
> >>>>>>>> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
> >>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
> >>>>>>>> added to the evlist.  It can also remove an evsel from the evlist.
> >>>>>>>
> >>>>>>> Thanks for your review.  I think it's fine to change evsel cpus or to remove
> >>>>>>> an evsel from evlist before calling evlist__create_maps().  The function
> >>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist.
> >>>>>>> So previous changes in evsel/cpus wouldn't be any special.
> >>>>>>>
> >>>>>>> After this point, adding a new evsel needs to update evlist all cpus by
> >>>>>>> propagating cpu maps.  So I think hybrid cpus should be fine.
> >>>>>>> Did I miss something?
> >>>>>>
> >>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
> >>>>>> cpus from the target->cpu_list (using perf record -C) , since after this
> >>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty
> >>>>>> list.  But then, in the hybrid case, it puts a dummy event that uses the
> >>>>>> target cpu list anyway, so the result is the same.
> >>>>>>
> >>>>>> I don't know if there are any cases where all_cpus would actually need to
> >>>>>> exclude some of the cpus from target->cpu_list.
> >>>>>
> >>>>> I'm not aware of other cases to reduce cpu list.  I think it'd be fine
> >>>>> if it has a cpu in the evlist->all_cpus even if it's not used.  The evsel
> >>>>> should have a correct list anyway and we mostly use the evsel cpus
> >>>>> to do the real work.
> >>>>>
> >>>>> Thanks,
> >>>>> Namhyung
> >>>>
> >>>> The affinity changes made it so that we use all_cpus probably more
> >>>> often than the evsel CPU maps for real work. The reason being we want
> >>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next
> >>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the
> >>>> indices are kept accurate - for example, if an uncore event is
> >>>> measured with a CPU event:
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404
> >>>
> >>> Right, I meant it'd check the evsel cpus eventually even if it iterates
> >>> on the evlist all_cpus.  The evlist_cpu_iterator__next() will skip a
> >>> CPU if it's not in the evsel cpus.
> >>>
> >>> Thanks,
> >>> Namhyung
> >>
> >> Perhaps an alternative is to be explicit about deferring map
> >> propagation e.g.
> >
> > Thanks for your patch.  Yeah, we can use this.
> >
> > But I still think it'd be better doing it unconditionally
> > since any propagation before perf_evlist__set_maps
> > will be discarded anyway.  With this change, other
> > than perf record will collect all cpus before _set_maps
> > and then discard it.  It seems like a waste, no?
> >
> > Or else, we can have allow_map_propagation initialized
> > to false and set it to true in perf_evlist__set_maps().
> >
>
> That sounds fine.

Thanks!

Arnaldo, how do you want to handle it?  I can send v2 for the
whole series, but I think you already applied it.  Then do you
want me to send this change on top?

Namhyung

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-09-30 16:44                     ` Namhyung Kim
@ 2022-09-30 16:56                       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-09-30 16:56 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Adrian Hunter, Ian Rogers, Jiri Olsa, Ingo Molnar,
	Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan

Em Fri, Sep 30, 2022 at 09:44:49AM -0700, Namhyung Kim escreveu:
> On Fri, Sep 30, 2022 at 5:50 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >
> > On 29/09/22 23:42, Namhyung Kim wrote:
> > > On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >>
> > >> On 29/09/22 08:09, Namhyung Kim wrote:
> > >>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote:
> > >>>>
> > >>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >>>>>
> > >>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >>>>>>
> > >>>>>> On 27/09/22 20:28, Namhyung Kim wrote:
> > >>>>>>> Hi Adrian,
> > >>>>>>>
> > >>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >>>>>>>>
> > >>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote:
> > >>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's
> > >>>>>>>>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> > >>>>>>>>> be updated in perf_evlist__set_maps() later.  No need to do it before
> > >>>>>>>>> evlist's cpus are set actually.
> > >>>>>>>>>
> > >>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning
> > >>>>>>>>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> > >>>>>>>>> an evsel is added after the evlist cpu maps are set.
> > >>>>>>>>
> > >>>>>>>> That might not be true.  Consider evlist__fix_hybrid_cpus() which fiddles
> > >>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
> > >>>>>>>> added to the evlist.  It can also remove an evsel from the evlist.
> > >>>>>>>
> > >>>>>>> Thanks for your review.  I think it's fine to change evsel cpus or to remove
> > >>>>>>> an evsel from evlist before calling evlist__create_maps().  The function
> > >>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist.
> > >>>>>>> So previous changes in evsel/cpus wouldn't be any special.
> > >>>>>>>
> > >>>>>>> After this point, adding a new evsel needs to update evlist all cpus by
> > >>>>>>> propagating cpu maps.  So I think hybrid cpus should be fine.
> > >>>>>>> Did I miss something?
> > >>>>>>
> > >>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
> > >>>>>> cpus from the target->cpu_list (using perf record -C) , since after this
> > >>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty
> > >>>>>> list.  But then, in the hybrid case, it puts a dummy event that uses the
> > >>>>>> target cpu list anyway, so the result is the same.
> > >>>>>>
> > >>>>>> I don't know if there are any cases where all_cpus would actually need to
> > >>>>>> exclude some of the cpus from target->cpu_list.
> > >>>>>
> > >>>>> I'm not aware of other cases to reduce cpu list.  I think it'd be fine
> > >>>>> if it has a cpu in the evlist->all_cpus even if it's not used.  The evsel
> > >>>>> should have a correct list anyway and we mostly use the evsel cpus
> > >>>>> to do the real work.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Namhyung
> > >>>>
> > >>>> The affinity changes made it so that we use all_cpus probably more
> > >>>> often than the evsel CPU maps for real work. The reason being we want
> > >>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next
> > >>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the
> > >>>> indices are kept accurate - for example, if an uncore event is
> > >>>> measured with a CPU event:
> > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
> > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404
> > >>>
> > >>> Right, I meant it'd check the evsel cpus eventually even if it iterates
> > >>> on the evlist all_cpus.  The evlist_cpu_iterator__next() will skip a
> > >>> CPU if it's not in the evsel cpus.
> > >>>
> > >>> Thanks,
> > >>> Namhyung
> > >>
> > >> Perhaps an alternative is to be explicit about deferring map
> > >> propagation e.g.
> > >
> > > Thanks for your patch.  Yeah, we can use this.
> > >
> > > But I still think it'd be better doing it unconditionally
> > > since any propagation before perf_evlist__set_maps
> > > will be discarded anyway.  With this change, other
> > > than perf record will collect all cpus before _set_maps
> > > and then discard it.  It seems like a waste, no?
> > >
> > > Or else, we can have allow_map_propagation initialized
> > > to false and set it to true in perf_evlist__set_maps().
> > >
> >
> > That sounds fine.
> 
> Thanks!
> 
> Arnaldo, how do you want to handle it?  I can send v2 for the
> whole series, but I think you already applied it.  Then do you
> want me to send this change on top?

Send v2 for the whole series, I haven't yet published it so I can
replace.

- Arnaldo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-10-06 18:43   ` Ian Rogers
@ 2022-10-06 23:13     ` Namhyung Kim
  0 siblings, 0 replies; 26+ messages in thread
From: Namhyung Kim @ 2022-10-06 23:13 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

Hi Ian,

On Thu, Oct 6, 2022 at 11:44 AM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Setting proper cpu and thread maps for system wide evsels regardless of
> > user requested cpu in __perf_evlist__propagate_maps().  Those evsels
> > need to be active on all cpus always.  Do it in the libperf so that we
> > can guarantee it has proper maps.
> >
> > Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/lib/perf/evlist.c | 15 +++++++++------
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> > index 6b1bafe267a4..187129652ab6 100644
> > --- a/tools/lib/perf/evlist.c
> > +++ b/tools/lib/perf/evlist.c
> > @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> >          * We already have cpus for evsel (via PMU sysfs) so
> >          * keep it, if there's no target cpu list defined.
> >          */
>
> This comment is for the 'else if' case and so is a little out of place
> before the system wide test.

Right, I'll add a comment for system-wide evsels too.

>
> > -       if (!evsel->own_cpus ||
> > -           (!evsel->system_wide && evlist->has_user_cpus) ||
> > -           (!evsel->system_wide &&
> > -            !evsel->requires_cpu &&
> > -            perf_cpu_map__empty(evlist->user_requested_cpus))) {
> > +       if (evsel->system_wide) {
> > +               perf_cpu_map__put(evsel->cpus);
> > +               evsel->cpus = perf_cpu_map__new(NULL);
>
> Looking at perf_cpu_map__new, will this mean that in system wide mode
> every event/evsel will now read /sys/devices/system/cpu/online here?
> We may want to cache the cpumap to avoid this.

Yeah, it's redundant.  I thought it's ok since the system-wide evsels
are not common.  We can consider caching when it becomes a problem.

Thanks,
Namhyung


>
> > +       } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> > +                  (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
> >                 perf_cpu_map__put(evsel->cpus);
> >                 evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
> >         } else if (evsel->cpus != evsel->own_cpus) {
> > @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> >                 evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
> >         }
> >
> > -       if (!evsel->system_wide) {
> > +       if (evsel->system_wide) {
> > +               perf_thread_map__put(evsel->threads);
> > +               evsel->threads = perf_thread_map__new_dummy();
> > +       } else {
> >                 perf_thread_map__put(evsel->threads);
> >                 evsel->threads = perf_thread_map__get(evlist->threads);
> >         }
> > --
> > 2.38.0.rc1.362.ged0d419d3c-goog
> >

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
@ 2022-10-06 18:43   ` Ian Rogers
  2022-10-06 23:13     ` Namhyung Kim
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Rogers @ 2022-10-06 18:43 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Setting proper cpu and thread maps for system wide evsels regardless of
> user requested cpu in __perf_evlist__propagate_maps().  Those evsels
> need to be active on all cpus always.  Do it in the libperf so that we
> can guarantee it has proper maps.
>
> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/lib/perf/evlist.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 6b1bafe267a4..187129652ab6 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>          * We already have cpus for evsel (via PMU sysfs) so
>          * keep it, if there's no target cpu list defined.
>          */

This comment is for the 'else if' case and so is a little out of place
before the system wide test.

> -       if (!evsel->own_cpus ||
> -           (!evsel->system_wide && evlist->has_user_cpus) ||
> -           (!evsel->system_wide &&
> -            !evsel->requires_cpu &&
> -            perf_cpu_map__empty(evlist->user_requested_cpus))) {
> +       if (evsel->system_wide) {
> +               perf_cpu_map__put(evsel->cpus);
> +               evsel->cpus = perf_cpu_map__new(NULL);

Looking at perf_cpu_map__new, will this mean that in system wide mode
every event/evsel will now read /sys/devices/system/cpu/online here?
We may want to cache the cpumap to avoid this.

Thanks,
Ian

> +       } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> +                  (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
>                 perf_cpu_map__put(evsel->cpus);
>                 evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
>         } else if (evsel->cpus != evsel->own_cpus) {
> @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>                 evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
>         }
>
> -       if (!evsel->system_wide) {
> +       if (evsel->system_wide) {
> +               perf_thread_map__put(evsel->threads);
> +               evsel->threads = perf_thread_map__new_dummy();
> +       } else {
>                 perf_thread_map__put(evsel->threads);
>                 evsel->threads = perf_thread_map__get(evlist->threads);
>         }
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
  2022-10-06 18:43   ` Ian Rogers
  0 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

Setting proper cpu and thread maps for system wide evsels regardless of
user requested cpu in __perf_evlist__propagate_maps().  Those evsels
need to be active on all cpus always.  Do it in the libperf so that we
can guarantee it has proper maps.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evlist.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6b1bafe267a4..187129652ab6 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 	 * We already have cpus for evsel (via PMU sysfs) so
 	 * keep it, if there's no target cpu list defined.
 	 */
-	if (!evsel->own_cpus ||
-	    (!evsel->system_wide && evlist->has_user_cpus) ||
-	    (!evsel->system_wide &&
-	     !evsel->requires_cpu &&
-	     perf_cpu_map__empty(evlist->user_requested_cpus))) {
+	if (evsel->system_wide) {
+		perf_cpu_map__put(evsel->cpus);
+		evsel->cpus = perf_cpu_map__new(NULL);
+	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
+		   (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
 	} else if (evsel->cpus != evsel->own_cpus) {
@@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 		evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
 	}
 
-	if (!evsel->system_wide) {
+	if (evsel->system_wide) {
+		perf_thread_map__put(evsel->threads);
+		evsel->threads = perf_thread_map__new_dummy();
+	} else {
 		perf_thread_map__put(evsel->threads);
 		evsel->threads = perf_thread_map__get(evlist->threads);
 	}
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-09-30 17:27 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) Namhyung Kim
@ 2022-09-30 17:27 ` Namhyung Kim
  0 siblings, 0 replies; 26+ messages in thread
From: Namhyung Kim @ 2022-09-30 17:27 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

Setting proper cpu and thread maps for system wide evsels regardless of
user requested cpu in __perf_evlist__propagate_maps().  Those evsels
need to be active on all cpus always.  Do it in the libperf so that we
can guarantee it has proper maps.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evlist.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6b1bafe267a4..187129652ab6 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 	 * We already have cpus for evsel (via PMU sysfs) so
 	 * keep it, if there's no target cpu list defined.
 	 */
-	if (!evsel->own_cpus ||
-	    (!evsel->system_wide && evlist->has_user_cpus) ||
-	    (!evsel->system_wide &&
-	     !evsel->requires_cpu &&
-	     perf_cpu_map__empty(evlist->user_requested_cpus))) {
+	if (evsel->system_wide) {
+		perf_cpu_map__put(evsel->cpus);
+		evsel->cpus = perf_cpu_map__new(NULL);
+	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
+		   (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
 	} else if (evsel->cpus != evsel->own_cpus) {
@@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 		evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
 	}
 
-	if (!evsel->system_wide) {
+	if (evsel->system_wide) {
+		perf_thread_map__put(evsel->threads);
+		evsel->threads = perf_thread_map__new_dummy();
+	} else {
 		perf_thread_map__put(evsel->threads);
 		evsel->threads = perf_thread_map__get(evlist->threads);
 	}
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2022-10-06 23:14 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
2022-09-24 16:57 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
2022-09-27  6:56   ` Adrian Hunter
2022-09-24 16:57 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
2022-09-27  7:06   ` Adrian Hunter
2022-09-27 17:28     ` Namhyung Kim
2022-09-28  7:53       ` Adrian Hunter
2022-09-28 23:46         ` Namhyung Kim
2022-09-29  2:07           ` Ian Rogers
2022-09-29  5:09             ` Namhyung Kim
2022-09-29  5:18               ` Adrian Hunter
2022-09-29 20:42                 ` Namhyung Kim
2022-09-30 12:49                   ` Adrian Hunter
2022-09-30 16:44                     ` Namhyung Kim
2022-09-30 16:56                       ` Arnaldo Carvalho de Melo
2022-09-24 16:57 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
2022-09-27  7:07   ` Adrian Hunter
2022-09-24 16:57 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
2022-09-27  7:07   ` Adrian Hunter
2022-09-24 16:57 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
2022-09-27  7:09   ` Adrian Hunter
2022-09-26 19:24 ` [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Arnaldo Carvalho de Melo
2022-09-30 17:27 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) Namhyung Kim
2022-09-30 17:27 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
2022-10-06 18:43   ` Ian Rogers
2022-10-06 23:13     ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).