linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3)
@ 2022-10-03 20:46 Namhyung Kim
  2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

Hello,

The system-wide evsel has a cpu map for all (online) cpus regardless
of user requested cpus.  But the cpu map handling code has some
special case for it and I think we can cleanup the code by making sure
that such a evsel has a proper cpu/thread maps from the beginning.
This patches should not cause any change in the behavior.

Changes from v2:
 * build evlist->core.all_cpus from the beginning  (Adrian)

Changes from v1:
 * use evlist->core.needs_map_propagation field
 * add Reviewed-by from Adrian

You can get the code from 'perf/cpumap-update-v3' branch in

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung

Namhyung Kim (5):
  libperf: Populate system-wide evsel maps
  libperf: Propagate maps only if necessary
  perf tools: Get rid of evlist__add_on_all_cpus()
  perf tools: Add evlist__add_sched_switch()
  perf tools: Remove special handling of system-wide evsel

 tools/lib/perf/evlist.c                  | 26 +++++++-------
 tools/lib/perf/evsel.c                   |  3 --
 tools/lib/perf/include/internal/evlist.h |  1 +
 tools/perf/arch/x86/util/intel-pt.c      | 15 +++-----
 tools/perf/builtin-script.c              |  3 --
 tools/perf/tests/switch-tracking.c       | 15 +++-----
 tools/perf/util/evlist.c                 | 46 ++++++++++--------------
 tools/perf/util/evlist.h                 |  1 +
 tools/perf/util/evsel.c                  | 12 ++-----
 tools/perf/util/stat.c                   |  3 --
 10 files changed, 46 insertions(+), 79 deletions(-)


base-commit: 62e64c9d2fd12839c02f1b3e8b873e7cb34e8720
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
  2022-10-06 18:43   ` Ian Rogers
  2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

Setting proper cpu and thread maps for system wide evsels regardless of
user requested cpu in __perf_evlist__propagate_maps().  Those evsels
need to be active on all cpus always.  Do it in the libperf so that we
can guarantee it has proper maps.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evlist.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6b1bafe267a4..187129652ab6 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 	 * We already have cpus for evsel (via PMU sysfs) so
 	 * keep it, if there's no target cpu list defined.
 	 */
-	if (!evsel->own_cpus ||
-	    (!evsel->system_wide && evlist->has_user_cpus) ||
-	    (!evsel->system_wide &&
-	     !evsel->requires_cpu &&
-	     perf_cpu_map__empty(evlist->user_requested_cpus))) {
+	if (evsel->system_wide) {
+		perf_cpu_map__put(evsel->cpus);
+		evsel->cpus = perf_cpu_map__new(NULL);
+	} else if (!evsel->own_cpus || evlist->has_user_cpus ||
+		   (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
 	} else if (evsel->cpus != evsel->own_cpus) {
@@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 		evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
 	}
 
-	if (!evsel->system_wide) {
+	if (evsel->system_wide) {
+		perf_thread_map__put(evsel->threads);
+		evsel->threads = perf_thread_map__new_dummy();
+	} else {
 		perf_thread_map__put(evsel->threads);
 		evsel->threads = perf_thread_map__get(evlist->threads);
 	}
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
  2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
  2022-10-04 12:14   ` Arnaldo Carvalho de Melo
  2022-10-06 18:52   ` Ian Rogers
  2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

The current code propagate evsel's cpu map settings to evlist when it's
added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
be updated in perf_evlist__set_maps() later.  No need to do it before
evlist's cpus are set actually.

In fact it discards this intermediate all_cpus maps at the beginning
of perf_evlist__set_maps().  Let's not do this.  It's only needed when
an evsel is added after the evlist cpu/thread maps are set.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evlist.c                  | 11 ++++-------
 tools/lib/perf/include/internal/evlist.h |  1 +
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 187129652ab6..8ce92070086c 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
 {
 	struct perf_evsel *evsel;
 
-	/* Recomputing all_cpus, so start with a blank slate. */
-	perf_cpu_map__put(evlist->all_cpus);
-	evlist->all_cpus = NULL;
+	evlist->needs_map_propagation = true;
 
 	perf_evlist__for_each_evsel(evlist, evsel)
 		__perf_evlist__propagate_maps(evlist, evsel);
@@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
 	evsel->idx = evlist->nr_entries;
 	list_add_tail(&evsel->node, &evlist->entries);
 	evlist->nr_entries += 1;
-	__perf_evlist__propagate_maps(evlist, evsel);
+
+	if (evlist->needs_map_propagation)
+		__perf_evlist__propagate_maps(evlist, evsel);
 }
 
 void perf_evlist__remove(struct perf_evlist *evlist,
@@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
 		evlist->threads = perf_thread_map__get(threads);
 	}
 
-	if (!evlist->all_cpus && cpus)
-		evlist->all_cpus = perf_cpu_map__get(cpus);
-
 	perf_evlist__propagate_maps(evlist);
 }
 
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 6f89aec3e608..850f07070036 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -19,6 +19,7 @@ struct perf_evlist {
 	int			 nr_entries;
 	int			 nr_groups;
 	bool			 has_user_cpus;
+	bool			 needs_map_propagation;
 	/**
 	 * The cpus passed from the command line or all online CPUs by
 	 * default.
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus()
  2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
  2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
  2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
  2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
  2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
  4 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

The cpu and thread maps are properly handled in libperf now.  No need to
do it in the perf tools anymore.  Let's remove the logic.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evlist.c | 29 ++---------------------------
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index fcfe5bcc0bcf..dcf57b271ff1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -268,28 +268,6 @@ int evlist__add_dummy(struct evlist *evlist)
 	return 0;
 }
 
-static void evlist__add_on_all_cpus(struct evlist *evlist, struct evsel *evsel)
-{
-	evsel->core.system_wide = true;
-
-	/*
-	 * All CPUs.
-	 *
-	 * Note perf_event_open() does not accept CPUs that are not online, so
-	 * in fact this CPU list will include only all online CPUs.
-	 */
-	perf_cpu_map__put(evsel->core.own_cpus);
-	evsel->core.own_cpus = perf_cpu_map__new(NULL);
-	perf_cpu_map__put(evsel->core.cpus);
-	evsel->core.cpus = perf_cpu_map__get(evsel->core.own_cpus);
-
-	/* No threads */
-	perf_thread_map__put(evsel->core.threads);
-	evsel->core.threads = perf_thread_map__new_dummy();
-
-	evlist__add(evlist, evsel);
-}
-
 struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
 {
 	struct evsel *evsel = evlist__dummy_event(evlist);
@@ -302,14 +280,11 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
 	evsel->core.attr.exclude_hv = 1;
 	evsel->core.attr.freq = 0;
 	evsel->core.attr.sample_period = 1;
+	evsel->core.system_wide = system_wide;
 	evsel->no_aux_samples = true;
 	evsel->name = strdup("dummy:u");
 
-	if (system_wide)
-		evlist__add_on_all_cpus(evlist, evsel);
-	else
-		evlist__add(evlist, evsel);
-
+	evlist__add(evlist, evsel);
 	return evsel;
 }
 
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/5] perf tools: Add evlist__add_sched_switch()
  2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
                   ` (2 preceding siblings ...)
  2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
  2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
  4 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

Add a help to create a system-wide sched_switch event.  One merit is
that it sets the system-wide bit before adding it to evlist so that
the libperf can handle the cpu and thread maps correctly.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/arch/x86/util/intel-pt.c | 15 +++++----------
 tools/perf/tests/switch-tracking.c  | 15 +++++----------
 tools/perf/util/evlist.c            | 17 +++++++++++++++++
 tools/perf/util/evlist.h            |  1 +
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 13933020a79e..793b35f2221a 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -11,6 +11,7 @@
 #include <linux/bitops.h>
 #include <linux/log2.h>
 #include <linux/zalloc.h>
+#include <linux/err.h>
 #include <cpuid.h>
 
 #include "../../../util/session.h"
@@ -426,20 +427,14 @@ static int intel_pt_track_switches(struct evlist *evlist)
 	if (!evlist__can_select_event(evlist, sched_switch))
 		return -EPERM;
 
-	err = parse_event(evlist, sched_switch);
-	if (err) {
-		pr_debug2("%s: failed to parse %s, error %d\n",
+	evsel = evlist__add_sched_switch(evlist, true);
+	if (IS_ERR(evsel)) {
+		err = PTR_ERR(evsel);
+		pr_debug2("%s: failed to create %s, error = %d\n",
 			  __func__, sched_switch, err);
 		return err;
 	}
 
-	evsel = evlist__last(evlist);
-
-	evsel__set_sample_bit(evsel, CPU);
-	evsel__set_sample_bit(evsel, TIME);
-
-	evsel->core.system_wide = true;
-	evsel->no_aux_samples = true;
 	evsel->immediate = true;
 
 	return 0;
diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c
index 2d46af9ef935..87f565c7f650 100644
--- a/tools/perf/tests/switch-tracking.c
+++ b/tools/perf/tests/switch-tracking.c
@@ -6,6 +6,7 @@
 #include <time.h>
 #include <stdlib.h>
 #include <linux/zalloc.h>
+#include <linux/err.h>
 #include <perf/cpumap.h>
 #include <perf/evlist.h>
 #include <perf/mmap.h>
@@ -398,19 +399,13 @@ static int test__switch_tracking(struct test_suite *test __maybe_unused, int sub
 		goto out;
 	}
 
-	err = parse_event(evlist, sched_switch);
-	if (err) {
-		pr_debug("Failed to parse event %s\n", sched_switch);
+	switch_evsel = evlist__add_sched_switch(evlist, true);
+	if (IS_ERR(switch_evsel)) {
+		err = PTR_ERR(switch_evsel);
+		pr_debug("Failed to create event %s\n", sched_switch);
 		goto out_err;
 	}
 
-	switch_evsel = evlist__last(evlist);
-
-	evsel__set_sample_bit(switch_evsel, CPU);
-	evsel__set_sample_bit(switch_evsel, TIME);
-
-	switch_evsel->core.system_wide = true;
-	switch_evsel->no_aux_samples = true;
 	switch_evsel->immediate = true;
 
 	/* Test moving an event to the front */
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dcf57b271ff1..6612b00949e7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -288,6 +288,23 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
 	return evsel;
 }
 
+struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide)
+{
+	struct evsel *evsel = evsel__newtp_idx("sched", "sched_switch", 0);
+
+	if (IS_ERR(evsel))
+		return evsel;
+
+	evsel__set_sample_bit(evsel, CPU);
+	evsel__set_sample_bit(evsel, TIME);
+
+	evsel->core.system_wide = system_wide;
+	evsel->no_aux_samples = true;
+
+	evlist__add(evlist, evsel);
+	return evsel;
+};
+
 int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
 {
 	struct evsel *evsel, *n;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 9d967fe3953a..16734c6756b3 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -127,6 +127,7 @@ static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist)
 {
 	return evlist__add_aux_dummy(evlist, true);
 }
+struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide);
 
 int evlist__add_sb_event(struct evlist *evlist, struct perf_event_attr *attr,
 			 evsel__sb_cb_t cb, void *data);
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
  2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
                   ` (3 preceding siblings ...)
  2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
  2022-10-06 18:57   ` Ian Rogers
  4 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1.  But the code guarantees such a thread map, so no
need to handle it specially.

No functional change intended.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evsel.c      |  3 ---
 tools/perf/builtin-script.c |  3 ---
 tools/perf/util/evsel.c     | 12 ++----------
 tools/perf/util/stat.c      |  3 ---
 4 files changed, 2 insertions(+), 19 deletions(-)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8ce5bbd09666..8b51b008a81f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 	if (ncpus == 0 || nthreads == 0)
 		return 0;
 
-	if (evsel->system_wide)
-		nthreads = 1;
-
 	evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
 	if (evsel->sample_id == NULL)
 		return -ENOMEM;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 886f53cfa257..7fa467ed91dc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 	struct perf_cpu cpu;
 	static int header_printed;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	if (!header_printed) {
 		printf("%3s %8s %15s %15s %15s %15s %s\n",
 		       "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5776bfa70f11..e319bb17d10d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
 static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		struct perf_thread_map *threads)
 {
-	int nthreads;
+	int nthreads = perf_thread_map__nr(threads);
 
 	if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
 	    (perf_missing_features.aux_output     && evsel->core.attr.aux_output))
@@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		threads = empty_thread_map;
 	}
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
-
 	if (evsel->core.fd == NULL &&
 	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
 		return -ENOMEM;
@@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (threads == NULL)
 		threads = empty_thread_map;
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
+	nthreads = perf_thread_map__nr(threads);
 
 	if (evsel->cgrp)
 		pid = evsel->cgrp->fd;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..cef943377ad7 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
 	int ncpus = evsel__nr_cpus(counter);
 	int idx, thread;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	for (thread = 0; thread < nthreads; thread++) {
 		for (idx = 0; idx < ncpus; idx++) {
 			if (process_counter_values(config, counter, idx, thread,
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
@ 2022-10-04 12:14   ` Arnaldo Carvalho de Melo
  2022-10-04 13:55     ` Adrian Hunter
  2022-10-06 18:52   ` Ian Rogers
  1 sibling, 1 reply; 16+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-10-04 12:14 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
	Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

Em Mon, Oct 03, 2022 at 01:46:44PM -0700, Namhyung Kim escreveu:
> The current code propagate evsel's cpu map settings to evlist when it's
> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> be updated in perf_evlist__set_maps() later.  No need to do it before
> evlist's cpus are set actually.
> 
> In fact it discards this intermediate all_cpus maps at the beginning
> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> an evsel is added after the evlist cpu/thread maps are set.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/lib/perf/evlist.c                  | 11 ++++-------
>  tools/lib/perf/include/internal/evlist.h |  1 +
>  2 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 187129652ab6..8ce92070086c 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>  {
>  	struct perf_evsel *evsel;
>  
> -	/* Recomputing all_cpus, so start with a blank slate. */
> -	perf_cpu_map__put(evlist->all_cpus);
> -	evlist->all_cpus = NULL;
> +	evlist->needs_map_propagation = true;
>  
>  	perf_evlist__for_each_evsel(evlist, evsel)
>  		__perf_evlist__propagate_maps(evlist, evsel);
> @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
>  	evsel->idx = evlist->nr_entries;
>  	list_add_tail(&evsel->node, &evlist->entries);
>  	evlist->nr_entries += 1;
> -	__perf_evlist__propagate_maps(evlist, evsel);
> +
> +	if (evlist->needs_map_propagation)
> +		__perf_evlist__propagate_maps(evlist, evsel);
>  }
>  
>  void perf_evlist__remove(struct perf_evlist *evlist,
> @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
>  		evlist->threads = perf_thread_map__get(threads);
>  	}
>  
> -	if (!evlist->all_cpus && cpus)
> -		evlist->all_cpus = perf_cpu_map__get(cpus);
> -
>  	perf_evlist__propagate_maps(evlist);

IIRC Adrian suggested this extra thing, but he provided the Reviewed-by
for the previous patch, where this isn't present, Adrian, can you please
confirm this and if this is the case provide your Reviewed-by for this
new version?

Thanks,

- Arnaldo

>  }
>  
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 6f89aec3e608..850f07070036 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -19,6 +19,7 @@ struct perf_evlist {
>  	int			 nr_entries;
>  	int			 nr_groups;
>  	bool			 has_user_cpus;
> +	bool			 needs_map_propagation;
>  	/**
>  	 * The cpus passed from the command line or all online CPUs by
>  	 * default.
> -- 
> 2.38.0.rc1.362.ged0d419d3c-goog

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-10-04 12:14   ` Arnaldo Carvalho de Melo
@ 2022-10-04 13:55     ` Adrian Hunter
  0 siblings, 0 replies; 16+ messages in thread
From: Adrian Hunter @ 2022-10-04 13:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Namhyung Kim
  Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
	linux-perf-users, Kan Liang, Leo Yan

On 4/10/22 15:14, Arnaldo Carvalho de Melo wrote:
> Em Mon, Oct 03, 2022 at 01:46:44PM -0700, Namhyung Kim escreveu:
>> The current code propagate evsel's cpu map settings to evlist when it's
>> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
>> be updated in perf_evlist__set_maps() later.  No need to do it before
>> evlist's cpus are set actually.
>>
>> In fact it discards this intermediate all_cpus maps at the beginning
>> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
>> an evsel is added after the evlist cpu/thread maps are set.
>>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> ---
>>  tools/lib/perf/evlist.c                  | 11 ++++-------
>>  tools/lib/perf/include/internal/evlist.h |  1 +
>>  2 files changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>> index 187129652ab6..8ce92070086c 100644
>> --- a/tools/lib/perf/evlist.c
>> +++ b/tools/lib/perf/evlist.c
>> @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>>  {
>>  	struct perf_evsel *evsel;
>>  
>> -	/* Recomputing all_cpus, so start with a blank slate. */
>> -	perf_cpu_map__put(evlist->all_cpus);
>> -	evlist->all_cpus = NULL;
>> +	evlist->needs_map_propagation = true;
>>  
>>  	perf_evlist__for_each_evsel(evlist, evsel)
>>  		__perf_evlist__propagate_maps(evlist, evsel);
>> @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
>>  	evsel->idx = evlist->nr_entries;
>>  	list_add_tail(&evsel->node, &evlist->entries);
>>  	evlist->nr_entries += 1;
>> -	__perf_evlist__propagate_maps(evlist, evsel);
>> +
>> +	if (evlist->needs_map_propagation)
>> +		__perf_evlist__propagate_maps(evlist, evsel);
>>  }
>>  
>>  void perf_evlist__remove(struct perf_evlist *evlist,
>> @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
>>  		evlist->threads = perf_thread_map__get(threads);
>>  	}
>>  
>> -	if (!evlist->all_cpus && cpus)
>> -		evlist->all_cpus = perf_cpu_map__get(cpus);
>> -
>>  	perf_evlist__propagate_maps(evlist);
> 
> IIRC Adrian suggested this extra thing, but he provided the Reviewed-by
> for the previous patch, where this isn't present, Adrian, can you please
> confirm this and if this is the case provide your Reviewed-by for this
> new version?

Oops sorry, I meant this one:

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>


> 
> Thanks,
> 
> - Arnaldo
> 
>>  }
>>  
>> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
>> index 6f89aec3e608..850f07070036 100644
>> --- a/tools/lib/perf/include/internal/evlist.h
>> +++ b/tools/lib/perf/include/internal/evlist.h
>> @@ -19,6 +19,7 @@ struct perf_evlist {
>>  	int			 nr_entries;
>>  	int			 nr_groups;
>>  	bool			 has_user_cpus;
>> +	bool			 needs_map_propagation;
>>  	/**
>>  	 * The cpus passed from the command line or all online CPUs by
>>  	 * default.
>> -- 
>> 2.38.0.rc1.362.ged0d419d3c-goog
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
@ 2022-10-06 18:43   ` Ian Rogers
  2022-10-06 23:13     ` Namhyung Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-10-06 18:43 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Setting proper cpu and thread maps for system wide evsels regardless of
> user requested cpu in __perf_evlist__propagate_maps().  Those evsels
> need to be active on all cpus always.  Do it in the libperf so that we
> can guarantee it has proper maps.
>
> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/lib/perf/evlist.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 6b1bafe267a4..187129652ab6 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>          * We already have cpus for evsel (via PMU sysfs) so
>          * keep it, if there's no target cpu list defined.
>          */

This comment is for the 'else if' case and so is a little out of place
before the system wide test.

> -       if (!evsel->own_cpus ||
> -           (!evsel->system_wide && evlist->has_user_cpus) ||
> -           (!evsel->system_wide &&
> -            !evsel->requires_cpu &&
> -            perf_cpu_map__empty(evlist->user_requested_cpus))) {
> +       if (evsel->system_wide) {
> +               perf_cpu_map__put(evsel->cpus);
> +               evsel->cpus = perf_cpu_map__new(NULL);

Looking at perf_cpu_map__new, will this mean that in system wide mode
every event/evsel will now read /sys/devices/system/cpu/online here?
We may want to cache the cpumap to avoid this.

Thanks,
Ian

> +       } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> +                  (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
>                 perf_cpu_map__put(evsel->cpus);
>                 evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
>         } else if (evsel->cpus != evsel->own_cpus) {
> @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>                 evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
>         }
>
> -       if (!evsel->system_wide) {
> +       if (evsel->system_wide) {
> +               perf_thread_map__put(evsel->threads);
> +               evsel->threads = perf_thread_map__new_dummy();
> +       } else {
>                 perf_thread_map__put(evsel->threads);
>                 evsel->threads = perf_thread_map__get(evlist->threads);
>         }
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
  2022-10-04 12:14   ` Arnaldo Carvalho de Melo
@ 2022-10-06 18:52   ` Ian Rogers
  2022-10-06 23:21     ` Namhyung Kim
  1 sibling, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-10-06 18:52 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> The current code propagate evsel's cpu map settings to evlist when it's
> added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> be updated in perf_evlist__set_maps() later.  No need to do it before
> evlist's cpus are set actually.
>
> In fact it discards this intermediate all_cpus maps at the beginning
> of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> an evsel is added after the evlist cpu/thread maps are set.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/lib/perf/evlist.c                  | 11 ++++-------
>  tools/lib/perf/include/internal/evlist.h |  1 +
>  2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 187129652ab6..8ce92070086c 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>  {
>         struct perf_evsel *evsel;
>
> -       /* Recomputing all_cpus, so start with a blank slate. */
> -       perf_cpu_map__put(evlist->all_cpus);
> -       evlist->all_cpus = NULL;
> +       evlist->needs_map_propagation = true;

Might be nice to also clear this in  perf_evlist__init.

>
>         perf_evlist__for_each_evsel(evlist, evsel)
>                 __perf_evlist__propagate_maps(evlist, evsel);
> @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
>         evsel->idx = evlist->nr_entries;
>         list_add_tail(&evsel->node, &evlist->entries);
>         evlist->nr_entries += 1;
> -       __perf_evlist__propagate_maps(evlist, evsel);
> +
> +       if (evlist->needs_map_propagation)
> +               __perf_evlist__propagate_maps(evlist, evsel);

I think a comment here would be useful. Something like:
Adding events won't set the CPU maps in the evlist until
set_maps/propogate_maps is called. Catch the case that an evsel is
added after this and propagate the map.

Thanks,
Ian

>  }
>
>  void perf_evlist__remove(struct perf_evlist *evlist,
> @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
>                 evlist->threads = perf_thread_map__get(threads);
>         }
>
> -       if (!evlist->all_cpus && cpus)
> -               evlist->all_cpus = perf_cpu_map__get(cpus);
> -
>         perf_evlist__propagate_maps(evlist);
>  }
>
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 6f89aec3e608..850f07070036 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -19,6 +19,7 @@ struct perf_evlist {
>         int                      nr_entries;
>         int                      nr_groups;
>         bool                     has_user_cpus;
> +       bool                     needs_map_propagation;
>         /**
>          * The cpus passed from the command line or all online CPUs by
>          * default.
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
  2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
@ 2022-10-06 18:57   ` Ian Rogers
  0 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2022-10-06 18:57 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> For system-wide evsels, the thread map should be dummy - i.e. it has a
> single entry of -1.  But the code guarantees such a thread map, so no
> need to handle it specially.
>
> No functional change intended.
>
> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/lib/perf/evsel.c      |  3 ---
>  tools/perf/builtin-script.c |  3 ---
>  tools/perf/util/evsel.c     | 12 ++----------
>  tools/perf/util/stat.c      |  3 ---
>  4 files changed, 2 insertions(+), 19 deletions(-)
>
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 8ce5bbd09666..8b51b008a81f 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
>         if (ncpus == 0 || nthreads == 0)
>                 return 0;
>
> -       if (evsel->system_wide)
> -               nthreads = 1;
> -
>         evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
>         if (evsel->sample_id == NULL)
>                 return -ENOMEM;
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 886f53cfa257..7fa467ed91dc 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
>         struct perf_cpu cpu;
>         static int header_printed;
>
> -       if (counter->core.system_wide)
> -               nthreads = 1;
> -
>         if (!header_printed) {
>                 printf("%3s %8s %15s %15s %15s %15s %s\n",
>                        "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 5776bfa70f11..e319bb17d10d 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
>  static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
>                 struct perf_thread_map *threads)
>  {
> -       int nthreads;
> +       int nthreads = perf_thread_map__nr(threads);

A general nit for something not introduced by this change. nr/nthreads
here is really the number of perf_event_open file descriptors needed
for the thread map. I wonder if a variable name of num_thread_fds
would make the code more intention revealing.

Thanks,
Ian

>
>         if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
>             (perf_missing_features.aux_output     && evsel->core.attr.aux_output))
> @@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
>                 threads = empty_thread_map;
>         }
>
> -       if (evsel->core.system_wide)
> -               nthreads = 1;
> -       else
> -               nthreads = threads->nr;
> -
>         if (evsel->core.fd == NULL &&
>             perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
>                 return -ENOMEM;
> @@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>         if (threads == NULL)
>                 threads = empty_thread_map;
>
> -       if (evsel->core.system_wide)
> -               nthreads = 1;
> -       else
> -               nthreads = threads->nr;
> +       nthreads = perf_thread_map__nr(threads);
>
>         if (evsel->cgrp)
>                 pid = evsel->cgrp->fd;
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index ce5e9e372fc4..cef943377ad7 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
>         int ncpus = evsel__nr_cpus(counter);
>         int idx, thread;
>
> -       if (counter->core.system_wide)
> -               nthreads = 1;
> -
>         for (thread = 0; thread < nthreads; thread++) {
>                 for (idx = 0; idx < ncpus; idx++) {
>                         if (process_counter_values(config, counter, idx, thread,
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps
  2022-10-06 18:43   ` Ian Rogers
@ 2022-10-06 23:13     ` Namhyung Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-06 23:13 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

Hi Ian,

On Thu, Oct 6, 2022 at 11:44 AM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Setting proper cpu and thread maps for system wide evsels regardless of
> > user requested cpu in __perf_evlist__propagate_maps().  Those evsels
> > need to be active on all cpus always.  Do it in the libperf so that we
> > can guarantee it has proper maps.
> >
> > Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/lib/perf/evlist.c | 15 +++++++++------
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> > index 6b1bafe267a4..187129652ab6 100644
> > --- a/tools/lib/perf/evlist.c
> > +++ b/tools/lib/perf/evlist.c
> > @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> >          * We already have cpus for evsel (via PMU sysfs) so
> >          * keep it, if there's no target cpu list defined.
> >          */
>
> This comment is for the 'else if' case and so is a little out of place
> before the system wide test.

Right, I'll add a comment for system-wide evsels too.

>
> > -       if (!evsel->own_cpus ||
> > -           (!evsel->system_wide && evlist->has_user_cpus) ||
> > -           (!evsel->system_wide &&
> > -            !evsel->requires_cpu &&
> > -            perf_cpu_map__empty(evlist->user_requested_cpus))) {
> > +       if (evsel->system_wide) {
> > +               perf_cpu_map__put(evsel->cpus);
> > +               evsel->cpus = perf_cpu_map__new(NULL);
>
> Looking at perf_cpu_map__new, will this mean that in system wide mode
> every event/evsel will now read /sys/devices/system/cpu/online here?
> We may want to cache the cpumap to avoid this.

Yeah, it's redundant.  I thought it's ok since the system-wide evsels
are not common.  We can consider caching when it becomes a problem.

Thanks,
Namhyung


>
> > +       } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> > +                  (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
> >                 perf_cpu_map__put(evsel->cpus);
> >                 evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
> >         } else if (evsel->cpus != evsel->own_cpus) {
> > @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> >                 evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
> >         }
> >
> > -       if (!evsel->system_wide) {
> > +       if (evsel->system_wide) {
> > +               perf_thread_map__put(evsel->threads);
> > +               evsel->threads = perf_thread_map__new_dummy();
> > +       } else {
> >                 perf_thread_map__put(evsel->threads);
> >                 evsel->threads = perf_thread_map__get(evlist->threads);
> >         }
> > --
> > 2.38.0.rc1.362.ged0d419d3c-goog
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
  2022-10-06 18:52   ` Ian Rogers
@ 2022-10-06 23:21     ` Namhyung Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-06 23:21 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
	LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan

On Thu, Oct 6, 2022 at 11:52 AM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > The current code propagate evsel's cpu map settings to evlist when it's
> > added to an evlist.  But the evlist->all_cpus and each evsel's cpus will
> > be updated in perf_evlist__set_maps() later.  No need to do it before
> > evlist's cpus are set actually.
> >
> > In fact it discards this intermediate all_cpus maps at the beginning
> > of perf_evlist__set_maps().  Let's not do this.  It's only needed when
> > an evsel is added after the evlist cpu/thread maps are set.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/lib/perf/evlist.c                  | 11 ++++-------
> >  tools/lib/perf/include/internal/evlist.h |  1 +
> >  2 files changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> > index 187129652ab6..8ce92070086c 100644
> > --- a/tools/lib/perf/evlist.c
> > +++ b/tools/lib/perf/evlist.c
> > @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> >  {
> >         struct perf_evsel *evsel;
> >
> > -       /* Recomputing all_cpus, so start with a blank slate. */
> > -       perf_cpu_map__put(evlist->all_cpus);
> > -       evlist->all_cpus = NULL;
> > +       evlist->needs_map_propagation = true;
>
> Might be nice to also clear this in  perf_evlist__init.

It's zero-initialized so I skipped it.  I couldn't find places
where it resets the existing evlist.

But now I think we should not call perf_evlist__set_maps()
in evlist__init() from perf tools.

>
> >
> >         perf_evlist__for_each_evsel(evlist, evsel)
> >                 __perf_evlist__propagate_maps(evlist, evsel);
> > @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
> >         evsel->idx = evlist->nr_entries;
> >         list_add_tail(&evsel->node, &evlist->entries);
> >         evlist->nr_entries += 1;
> > -       __perf_evlist__propagate_maps(evlist, evsel);
> > +
> > +       if (evlist->needs_map_propagation)
> > +               __perf_evlist__propagate_maps(evlist, evsel);
>
> I think a comment here would be useful. Something like:
> Adding events won't set the CPU maps in the evlist until
> set_maps/propogate_maps is called. Catch the case that an evsel is
> added after this and propagate the map.

Looks good, will add!

Thanks,
Namhyung


> >  }
> >
> >  void perf_evlist__remove(struct perf_evlist *evlist,
> > @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
> >                 evlist->threads = perf_thread_map__get(threads);
> >         }
> >
> > -       if (!evlist->all_cpus && cpus)
> > -               evlist->all_cpus = perf_cpu_map__get(cpus);
> > -
> >         perf_evlist__propagate_maps(evlist);
> >  }
> >
> > diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> > index 6f89aec3e608..850f07070036 100644
> > --- a/tools/lib/perf/include/internal/evlist.h
> > +++ b/tools/lib/perf/include/internal/evlist.h
> > @@ -19,6 +19,7 @@ struct perf_evlist {
> >         int                      nr_entries;
> >         int                      nr_groups;
> >         bool                     has_user_cpus;
> > +       bool                     needs_map_propagation;
> >         /**
> >          * The cpus passed from the command line or all online CPUs by
> >          * default.
> > --
> > 2.38.0.rc1.362.ged0d419d3c-goog
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
  2022-09-30 17:27 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) Namhyung Kim
@ 2022-09-30 17:27 ` Namhyung Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-09-30 17:27 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1.  But the code guarantees such a thread map, so no
need to handle it specially.

No functional change intended.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evsel.c      |  3 ---
 tools/perf/builtin-script.c |  3 ---
 tools/perf/util/evsel.c     | 12 ++----------
 tools/perf/util/stat.c      |  3 ---
 4 files changed, 2 insertions(+), 19 deletions(-)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8ce5bbd09666..8b51b008a81f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 	if (ncpus == 0 || nthreads == 0)
 		return 0;
 
-	if (evsel->system_wide)
-		nthreads = 1;
-
 	evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
 	if (evsel->sample_id == NULL)
 		return -ENOMEM;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 886f53cfa257..7fa467ed91dc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 	struct perf_cpu cpu;
 	static int header_printed;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	if (!header_printed) {
 		printf("%3s %8s %15s %15s %15s %15s %s\n",
 		       "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5776bfa70f11..e319bb17d10d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
 static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		struct perf_thread_map *threads)
 {
-	int nthreads;
+	int nthreads = perf_thread_map__nr(threads);
 
 	if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
 	    (perf_missing_features.aux_output     && evsel->core.attr.aux_output))
@@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		threads = empty_thread_map;
 	}
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
-
 	if (evsel->core.fd == NULL &&
 	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
 		return -ENOMEM;
@@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (threads == NULL)
 		threads = empty_thread_map;
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
+	nthreads = perf_thread_map__nr(threads);
 
 	if (evsel->cgrp)
 		pid = evsel->cgrp->fd;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..cef943377ad7 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
 	int ncpus = evsel__nr_cpus(counter);
 	int idx, thread;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	for (thread = 0; thread < nthreads; thread++) {
 		for (idx = 0; idx < ncpus; idx++) {
 			if (process_counter_values(config, counter, idx, thread,
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
  2022-09-24 16:57 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
@ 2022-09-27  7:09   ` Adrian Hunter
  0 siblings, 0 replies; 16+ messages in thread
From: Adrian Hunter @ 2022-09-27  7:09 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
	Kan Liang, Leo Yan

On 24/09/22 19:57, Namhyung Kim wrote:
> For system-wide evsels, the thread map should be dummy - i.e. it has a
> single entry of -1.  But the code guarantees such a thread map, so no
> need to handle it specially.
> 
> No functional change intended.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  tools/lib/perf/evsel.c      |  3 ---
>  tools/perf/builtin-script.c |  3 ---
>  tools/perf/util/evsel.c     | 12 ++----------
>  tools/perf/util/stat.c      |  3 ---
>  4 files changed, 2 insertions(+), 19 deletions(-)
> 
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 8ce5bbd09666..8b51b008a81f 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
>  	if (ncpus == 0 || nthreads == 0)
>  		return 0;
>  
> -	if (evsel->system_wide)
> -		nthreads = 1;
> -
>  	evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
>  	if (evsel->sample_id == NULL)
>  		return -ENOMEM;
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 886f53cfa257..7fa467ed91dc 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
>  	struct perf_cpu cpu;
>  	static int header_printed;
>  
> -	if (counter->core.system_wide)
> -		nthreads = 1;
> -
>  	if (!header_printed) {
>  		printf("%3s %8s %15s %15s %15s %15s %s\n",
>  		       "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 5776bfa70f11..e319bb17d10d 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
>  static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
>  		struct perf_thread_map *threads)
>  {
> -	int nthreads;
> +	int nthreads = perf_thread_map__nr(threads);
>  
>  	if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
>  	    (perf_missing_features.aux_output     && evsel->core.attr.aux_output))
> @@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
>  		threads = empty_thread_map;
>  	}
>  
> -	if (evsel->core.system_wide)
> -		nthreads = 1;
> -	else
> -		nthreads = threads->nr;
> -
>  	if (evsel->core.fd == NULL &&
>  	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
>  		return -ENOMEM;
> @@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
>  	if (threads == NULL)
>  		threads = empty_thread_map;
>  
> -	if (evsel->core.system_wide)
> -		nthreads = 1;
> -	else
> -		nthreads = threads->nr;
> +	nthreads = perf_thread_map__nr(threads);
>  
>  	if (evsel->cgrp)
>  		pid = evsel->cgrp->fd;
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index ce5e9e372fc4..cef943377ad7 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
>  	int ncpus = evsel__nr_cpus(counter);
>  	int idx, thread;
>  
> -	if (counter->core.system_wide)
> -		nthreads = 1;
> -
>  	for (thread = 0; thread < nthreads; thread++) {
>  		for (idx = 0; idx < ncpus; idx++) {
>  			if (process_counter_values(config, counter, idx, thread,


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
  2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
@ 2022-09-24 16:57 ` Namhyung Kim
  2022-09-27  7:09   ` Adrian Hunter
  0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
	linux-perf-users, Kan Liang, Leo Yan

For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1.  But the code guarantees such a thread map, so no
need to handle it specially.

No functional change intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/lib/perf/evsel.c      |  3 ---
 tools/perf/builtin-script.c |  3 ---
 tools/perf/util/evsel.c     | 12 ++----------
 tools/perf/util/stat.c      |  3 ---
 4 files changed, 2 insertions(+), 19 deletions(-)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8ce5bbd09666..8b51b008a81f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 	if (ncpus == 0 || nthreads == 0)
 		return 0;
 
-	if (evsel->system_wide)
-		nthreads = 1;
-
 	evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
 	if (evsel->sample_id == NULL)
 		return -ENOMEM;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 886f53cfa257..7fa467ed91dc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 	struct perf_cpu cpu;
 	static int header_printed;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	if (!header_printed) {
 		printf("%3s %8s %15s %15s %15s %15s %s\n",
 		       "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5776bfa70f11..e319bb17d10d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
 static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		struct perf_thread_map *threads)
 {
-	int nthreads;
+	int nthreads = perf_thread_map__nr(threads);
 
 	if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
 	    (perf_missing_features.aux_output     && evsel->core.attr.aux_output))
@@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		threads = empty_thread_map;
 	}
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
-
 	if (evsel->core.fd == NULL &&
 	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
 		return -ENOMEM;
@@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (threads == NULL)
 		threads = empty_thread_map;
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
+	nthreads = perf_thread_map__nr(threads);
 
 	if (evsel->cgrp)
 		pid = evsel->cgrp->fd;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..cef943377ad7 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
 	int ncpus = evsel__nr_cpus(counter);
 	int idx, thread;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	for (thread = 0; thread < nthreads; thread++) {
 		for (idx = 0; idx < ncpus; idx++) {
 			if (process_counter_values(config, counter, idx, thread,
-- 
2.37.3.998.g577e59143f-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-10-06 23:22 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
2022-10-06 18:43   ` Ian Rogers
2022-10-06 23:13     ` Namhyung Kim
2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
2022-10-04 12:14   ` Arnaldo Carvalho de Melo
2022-10-04 13:55     ` Adrian Hunter
2022-10-06 18:52   ` Ian Rogers
2022-10-06 23:21     ` Namhyung Kim
2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
2022-10-06 18:57   ` Ian Rogers
  -- strict thread matches above, loose matches on Subject: below --
2022-09-30 17:27 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) Namhyung Kim
2022-09-30 17:27 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
2022-09-24 16:57 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
2022-09-27  7:09   ` Adrian Hunter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).