* [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3)
@ 2022-10-03 20:46 Namhyung Kim
2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
` (4 more replies)
0 siblings, 5 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
linux-perf-users, Kan Liang, Leo Yan
Hello,
The system-wide evsel has a cpu map for all (online) cpus regardless
of user requested cpus. But the cpu map handling code has some
special case for it and I think we can cleanup the code by making sure
that such a evsel has a proper cpu/thread maps from the beginning.
This patches should not cause any change in the behavior.
Changes from v2:
* build evlist->core.all_cpus from the beginning (Adrian)
Changes from v1:
* use evlist->core.needs_map_propagation field
* add Reviewed-by from Adrian
You can get the code from 'perf/cpumap-update-v3' branch in
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks,
Namhyung
Namhyung Kim (5):
libperf: Populate system-wide evsel maps
libperf: Propagate maps only if necessary
perf tools: Get rid of evlist__add_on_all_cpus()
perf tools: Add evlist__add_sched_switch()
perf tools: Remove special handling of system-wide evsel
tools/lib/perf/evlist.c | 26 +++++++-------
tools/lib/perf/evsel.c | 3 --
tools/lib/perf/include/internal/evlist.h | 1 +
tools/perf/arch/x86/util/intel-pt.c | 15 +++-----
tools/perf/builtin-script.c | 3 --
tools/perf/tests/switch-tracking.c | 15 +++-----
tools/perf/util/evlist.c | 46 ++++++++++--------------
tools/perf/util/evlist.h | 1 +
tools/perf/util/evsel.c | 12 ++-----
tools/perf/util/stat.c | 3 --
10 files changed, 46 insertions(+), 79 deletions(-)
base-commit: 62e64c9d2fd12839c02f1b3e8b873e7cb34e8720
--
2.38.0.rc1.362.ged0d419d3c-goog
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/5] libperf: Populate system-wide evsel maps
2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
2022-10-06 18:43 ` Ian Rogers
2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
` (3 subsequent siblings)
4 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
linux-perf-users, Kan Liang, Leo Yan
Setting proper cpu and thread maps for system wide evsels regardless of
user requested cpu in __perf_evlist__propagate_maps(). Those evsels
need to be active on all cpus always. Do it in the libperf so that we
can guarantee it has proper maps.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/lib/perf/evlist.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6b1bafe267a4..187129652ab6 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
* We already have cpus for evsel (via PMU sysfs) so
* keep it, if there's no target cpu list defined.
*/
- if (!evsel->own_cpus ||
- (!evsel->system_wide && evlist->has_user_cpus) ||
- (!evsel->system_wide &&
- !evsel->requires_cpu &&
- perf_cpu_map__empty(evlist->user_requested_cpus))) {
+ if (evsel->system_wide) {
+ perf_cpu_map__put(evsel->cpus);
+ evsel->cpus = perf_cpu_map__new(NULL);
+ } else if (!evsel->own_cpus || evlist->has_user_cpus ||
+ (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
perf_cpu_map__put(evsel->cpus);
evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
} else if (evsel->cpus != evsel->own_cpus) {
@@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
}
- if (!evsel->system_wide) {
+ if (evsel->system_wide) {
+ perf_thread_map__put(evsel->threads);
+ evsel->threads = perf_thread_map__new_dummy();
+ } else {
perf_thread_map__put(evsel->threads);
evsel->threads = perf_thread_map__get(evlist->threads);
}
--
2.38.0.rc1.362.ged0d419d3c-goog
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/5] libperf: Propagate maps only if necessary
2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
2022-10-04 12:14 ` Arnaldo Carvalho de Melo
2022-10-06 18:52 ` Ian Rogers
2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
` (2 subsequent siblings)
4 siblings, 2 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
linux-perf-users, Kan Liang, Leo Yan
The current code propagate evsel's cpu map settings to evlist when it's
added to an evlist. But the evlist->all_cpus and each evsel's cpus will
be updated in perf_evlist__set_maps() later. No need to do it before
evlist's cpus are set actually.
In fact it discards this intermediate all_cpus maps at the beginning
of perf_evlist__set_maps(). Let's not do this. It's only needed when
an evsel is added after the evlist cpu/thread maps are set.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/lib/perf/evlist.c | 11 ++++-------
tools/lib/perf/include/internal/evlist.h | 1 +
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 187129652ab6..8ce92070086c 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
{
struct perf_evsel *evsel;
- /* Recomputing all_cpus, so start with a blank slate. */
- perf_cpu_map__put(evlist->all_cpus);
- evlist->all_cpus = NULL;
+ evlist->needs_map_propagation = true;
perf_evlist__for_each_evsel(evlist, evsel)
__perf_evlist__propagate_maps(evlist, evsel);
@@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
evsel->idx = evlist->nr_entries;
list_add_tail(&evsel->node, &evlist->entries);
evlist->nr_entries += 1;
- __perf_evlist__propagate_maps(evlist, evsel);
+
+ if (evlist->needs_map_propagation)
+ __perf_evlist__propagate_maps(evlist, evsel);
}
void perf_evlist__remove(struct perf_evlist *evlist,
@@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
evlist->threads = perf_thread_map__get(threads);
}
- if (!evlist->all_cpus && cpus)
- evlist->all_cpus = perf_cpu_map__get(cpus);
-
perf_evlist__propagate_maps(evlist);
}
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 6f89aec3e608..850f07070036 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -19,6 +19,7 @@ struct perf_evlist {
int nr_entries;
int nr_groups;
bool has_user_cpus;
+ bool needs_map_propagation;
/**
* The cpus passed from the command line or all online CPUs by
* default.
--
2.38.0.rc1.362.ged0d419d3c-goog
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus()
2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
4 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
linux-perf-users, Kan Liang, Leo Yan
The cpu and thread maps are properly handled in libperf now. No need to
do it in the perf tools anymore. Let's remove the logic.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/evlist.c | 29 ++---------------------------
1 file changed, 2 insertions(+), 27 deletions(-)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index fcfe5bcc0bcf..dcf57b271ff1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -268,28 +268,6 @@ int evlist__add_dummy(struct evlist *evlist)
return 0;
}
-static void evlist__add_on_all_cpus(struct evlist *evlist, struct evsel *evsel)
-{
- evsel->core.system_wide = true;
-
- /*
- * All CPUs.
- *
- * Note perf_event_open() does not accept CPUs that are not online, so
- * in fact this CPU list will include only all online CPUs.
- */
- perf_cpu_map__put(evsel->core.own_cpus);
- evsel->core.own_cpus = perf_cpu_map__new(NULL);
- perf_cpu_map__put(evsel->core.cpus);
- evsel->core.cpus = perf_cpu_map__get(evsel->core.own_cpus);
-
- /* No threads */
- perf_thread_map__put(evsel->core.threads);
- evsel->core.threads = perf_thread_map__new_dummy();
-
- evlist__add(evlist, evsel);
-}
-
struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
{
struct evsel *evsel = evlist__dummy_event(evlist);
@@ -302,14 +280,11 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
evsel->core.attr.exclude_hv = 1;
evsel->core.attr.freq = 0;
evsel->core.attr.sample_period = 1;
+ evsel->core.system_wide = system_wide;
evsel->no_aux_samples = true;
evsel->name = strdup("dummy:u");
- if (system_wide)
- evlist__add_on_all_cpus(evlist, evsel);
- else
- evlist__add(evlist, evsel);
-
+ evlist__add(evlist, evsel);
return evsel;
}
--
2.38.0.rc1.362.ged0d419d3c-goog
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/5] perf tools: Add evlist__add_sched_switch()
2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
` (2 preceding siblings ...)
2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
4 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
linux-perf-users, Kan Liang, Leo Yan
Add a help to create a system-wide sched_switch event. One merit is
that it sets the system-wide bit before adding it to evlist so that
the libperf can handle the cpu and thread maps correctly.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/arch/x86/util/intel-pt.c | 15 +++++----------
tools/perf/tests/switch-tracking.c | 15 +++++----------
tools/perf/util/evlist.c | 17 +++++++++++++++++
tools/perf/util/evlist.h | 1 +
4 files changed, 28 insertions(+), 20 deletions(-)
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 13933020a79e..793b35f2221a 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -11,6 +11,7 @@
#include <linux/bitops.h>
#include <linux/log2.h>
#include <linux/zalloc.h>
+#include <linux/err.h>
#include <cpuid.h>
#include "../../../util/session.h"
@@ -426,20 +427,14 @@ static int intel_pt_track_switches(struct evlist *evlist)
if (!evlist__can_select_event(evlist, sched_switch))
return -EPERM;
- err = parse_event(evlist, sched_switch);
- if (err) {
- pr_debug2("%s: failed to parse %s, error %d\n",
+ evsel = evlist__add_sched_switch(evlist, true);
+ if (IS_ERR(evsel)) {
+ err = PTR_ERR(evsel);
+ pr_debug2("%s: failed to create %s, error = %d\n",
__func__, sched_switch, err);
return err;
}
- evsel = evlist__last(evlist);
-
- evsel__set_sample_bit(evsel, CPU);
- evsel__set_sample_bit(evsel, TIME);
-
- evsel->core.system_wide = true;
- evsel->no_aux_samples = true;
evsel->immediate = true;
return 0;
diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c
index 2d46af9ef935..87f565c7f650 100644
--- a/tools/perf/tests/switch-tracking.c
+++ b/tools/perf/tests/switch-tracking.c
@@ -6,6 +6,7 @@
#include <time.h>
#include <stdlib.h>
#include <linux/zalloc.h>
+#include <linux/err.h>
#include <perf/cpumap.h>
#include <perf/evlist.h>
#include <perf/mmap.h>
@@ -398,19 +399,13 @@ static int test__switch_tracking(struct test_suite *test __maybe_unused, int sub
goto out;
}
- err = parse_event(evlist, sched_switch);
- if (err) {
- pr_debug("Failed to parse event %s\n", sched_switch);
+ switch_evsel = evlist__add_sched_switch(evlist, true);
+ if (IS_ERR(switch_evsel)) {
+ err = PTR_ERR(switch_evsel);
+ pr_debug("Failed to create event %s\n", sched_switch);
goto out_err;
}
- switch_evsel = evlist__last(evlist);
-
- evsel__set_sample_bit(switch_evsel, CPU);
- evsel__set_sample_bit(switch_evsel, TIME);
-
- switch_evsel->core.system_wide = true;
- switch_evsel->no_aux_samples = true;
switch_evsel->immediate = true;
/* Test moving an event to the front */
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dcf57b271ff1..6612b00949e7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -288,6 +288,23 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
return evsel;
}
+struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide)
+{
+ struct evsel *evsel = evsel__newtp_idx("sched", "sched_switch", 0);
+
+ if (IS_ERR(evsel))
+ return evsel;
+
+ evsel__set_sample_bit(evsel, CPU);
+ evsel__set_sample_bit(evsel, TIME);
+
+ evsel->core.system_wide = system_wide;
+ evsel->no_aux_samples = true;
+
+ evlist__add(evlist, evsel);
+ return evsel;
+};
+
int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
{
struct evsel *evsel, *n;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 9d967fe3953a..16734c6756b3 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -127,6 +127,7 @@ static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist)
{
return evlist__add_aux_dummy(evlist, true);
}
+struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide);
int evlist__add_sb_event(struct evlist *evlist, struct perf_event_attr *attr,
evsel__sb_cb_t cb, void *data);
--
2.38.0.rc1.362.ged0d419d3c-goog
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
` (3 preceding siblings ...)
2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
@ 2022-10-03 20:46 ` Namhyung Kim
2022-10-06 18:57 ` Ian Rogers
4 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
linux-perf-users, Kan Liang, Leo Yan
For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1. But the code guarantees such a thread map, so no
need to handle it specially.
No functional change intended.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/lib/perf/evsel.c | 3 ---
tools/perf/builtin-script.c | 3 ---
tools/perf/util/evsel.c | 12 ++----------
tools/perf/util/stat.c | 3 ---
4 files changed, 2 insertions(+), 19 deletions(-)
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8ce5bbd09666..8b51b008a81f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
if (ncpus == 0 || nthreads == 0)
return 0;
- if (evsel->system_wide)
- nthreads = 1;
-
evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
if (evsel->sample_id == NULL)
return -ENOMEM;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 886f53cfa257..7fa467ed91dc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
struct perf_cpu cpu;
static int header_printed;
- if (counter->core.system_wide)
- nthreads = 1;
-
if (!header_printed) {
printf("%3s %8s %15s %15s %15s %15s %s\n",
"CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5776bfa70f11..e319bb17d10d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads)
{
- int nthreads;
+ int nthreads = perf_thread_map__nr(threads);
if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
(perf_missing_features.aux_output && evsel->core.attr.aux_output))
@@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
threads = empty_thread_map;
}
- if (evsel->core.system_wide)
- nthreads = 1;
- else
- nthreads = threads->nr;
-
if (evsel->core.fd == NULL &&
perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
return -ENOMEM;
@@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
if (threads == NULL)
threads = empty_thread_map;
- if (evsel->core.system_wide)
- nthreads = 1;
- else
- nthreads = threads->nr;
+ nthreads = perf_thread_map__nr(threads);
if (evsel->cgrp)
pid = evsel->cgrp->fd;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..cef943377ad7 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
int ncpus = evsel__nr_cpus(counter);
int idx, thread;
- if (counter->core.system_wide)
- nthreads = 1;
-
for (thread = 0; thread < nthreads; thread++) {
for (idx = 0; idx < ncpus; idx++) {
if (process_counter_values(config, counter, idx, thread,
--
2.38.0.rc1.362.ged0d419d3c-goog
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
@ 2022-10-04 12:14 ` Arnaldo Carvalho de Melo
2022-10-04 13:55 ` Adrian Hunter
2022-10-06 18:52 ` Ian Rogers
1 sibling, 1 reply; 16+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-10-04 12:14 UTC (permalink / raw)
To: Namhyung Kim
Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan
Em Mon, Oct 03, 2022 at 01:46:44PM -0700, Namhyung Kim escreveu:
> The current code propagate evsel's cpu map settings to evlist when it's
> added to an evlist. But the evlist->all_cpus and each evsel's cpus will
> be updated in perf_evlist__set_maps() later. No need to do it before
> evlist's cpus are set actually.
>
> In fact it discards this intermediate all_cpus maps at the beginning
> of perf_evlist__set_maps(). Let's not do this. It's only needed when
> an evsel is added after the evlist cpu/thread maps are set.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/lib/perf/evlist.c | 11 ++++-------
> tools/lib/perf/include/internal/evlist.h | 1 +
> 2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 187129652ab6..8ce92070086c 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> {
> struct perf_evsel *evsel;
>
> - /* Recomputing all_cpus, so start with a blank slate. */
> - perf_cpu_map__put(evlist->all_cpus);
> - evlist->all_cpus = NULL;
> + evlist->needs_map_propagation = true;
>
> perf_evlist__for_each_evsel(evlist, evsel)
> __perf_evlist__propagate_maps(evlist, evsel);
> @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
> evsel->idx = evlist->nr_entries;
> list_add_tail(&evsel->node, &evlist->entries);
> evlist->nr_entries += 1;
> - __perf_evlist__propagate_maps(evlist, evsel);
> +
> + if (evlist->needs_map_propagation)
> + __perf_evlist__propagate_maps(evlist, evsel);
> }
>
> void perf_evlist__remove(struct perf_evlist *evlist,
> @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
> evlist->threads = perf_thread_map__get(threads);
> }
>
> - if (!evlist->all_cpus && cpus)
> - evlist->all_cpus = perf_cpu_map__get(cpus);
> -
> perf_evlist__propagate_maps(evlist);
IIRC Adrian suggested this extra thing, but he provided the Reviewed-by
for the previous patch, where this isn't present, Adrian, can you please
confirm this and if this is the case provide your Reviewed-by for this
new version?
Thanks,
- Arnaldo
> }
>
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 6f89aec3e608..850f07070036 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -19,6 +19,7 @@ struct perf_evlist {
> int nr_entries;
> int nr_groups;
> bool has_user_cpus;
> + bool needs_map_propagation;
> /**
> * The cpus passed from the command line or all online CPUs by
> * default.
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
--
- Arnaldo
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
2022-10-04 12:14 ` Arnaldo Carvalho de Melo
@ 2022-10-04 13:55 ` Adrian Hunter
0 siblings, 0 replies; 16+ messages in thread
From: Adrian Hunter @ 2022-10-04 13:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Namhyung Kim
Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers,
linux-perf-users, Kan Liang, Leo Yan
On 4/10/22 15:14, Arnaldo Carvalho de Melo wrote:
> Em Mon, Oct 03, 2022 at 01:46:44PM -0700, Namhyung Kim escreveu:
>> The current code propagate evsel's cpu map settings to evlist when it's
>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will
>> be updated in perf_evlist__set_maps() later. No need to do it before
>> evlist's cpus are set actually.
>>
>> In fact it discards this intermediate all_cpus maps at the beginning
>> of perf_evlist__set_maps(). Let's not do this. It's only needed when
>> an evsel is added after the evlist cpu/thread maps are set.
>>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> ---
>> tools/lib/perf/evlist.c | 11 ++++-------
>> tools/lib/perf/include/internal/evlist.h | 1 +
>> 2 files changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>> index 187129652ab6..8ce92070086c 100644
>> --- a/tools/lib/perf/evlist.c
>> +++ b/tools/lib/perf/evlist.c
>> @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>> {
>> struct perf_evsel *evsel;
>>
>> - /* Recomputing all_cpus, so start with a blank slate. */
>> - perf_cpu_map__put(evlist->all_cpus);
>> - evlist->all_cpus = NULL;
>> + evlist->needs_map_propagation = true;
>>
>> perf_evlist__for_each_evsel(evlist, evsel)
>> __perf_evlist__propagate_maps(evlist, evsel);
>> @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
>> evsel->idx = evlist->nr_entries;
>> list_add_tail(&evsel->node, &evlist->entries);
>> evlist->nr_entries += 1;
>> - __perf_evlist__propagate_maps(evlist, evsel);
>> +
>> + if (evlist->needs_map_propagation)
>> + __perf_evlist__propagate_maps(evlist, evsel);
>> }
>>
>> void perf_evlist__remove(struct perf_evlist *evlist,
>> @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
>> evlist->threads = perf_thread_map__get(threads);
>> }
>>
>> - if (!evlist->all_cpus && cpus)
>> - evlist->all_cpus = perf_cpu_map__get(cpus);
>> -
>> perf_evlist__propagate_maps(evlist);
>
> IIRC Adrian suggested this extra thing, but he provided the Reviewed-by
> for the previous patch, where this isn't present, Adrian, can you please
> confirm this and if this is the case provide your Reviewed-by for this
> new version?
Oops sorry, I meant this one:
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
>
> Thanks,
>
> - Arnaldo
>
>> }
>>
>> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
>> index 6f89aec3e608..850f07070036 100644
>> --- a/tools/lib/perf/include/internal/evlist.h
>> +++ b/tools/lib/perf/include/internal/evlist.h
>> @@ -19,6 +19,7 @@ struct perf_evlist {
>> int nr_entries;
>> int nr_groups;
>> bool has_user_cpus;
>> + bool needs_map_propagation;
>> /**
>> * The cpus passed from the command line or all online CPUs by
>> * default.
>> --
>> 2.38.0.rc1.362.ged0d419d3c-goog
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps
2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
@ 2022-10-06 18:43 ` Ian Rogers
2022-10-06 23:13 ` Namhyung Kim
0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-10-06 18:43 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan
On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Setting proper cpu and thread maps for system wide evsels regardless of
> user requested cpu in __perf_evlist__propagate_maps(). Those evsels
> need to be active on all cpus always. Do it in the libperf so that we
> can guarantee it has proper maps.
>
> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/lib/perf/evlist.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 6b1bafe267a4..187129652ab6 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> * We already have cpus for evsel (via PMU sysfs) so
> * keep it, if there's no target cpu list defined.
> */
This comment is for the 'else if' case and so is a little out of place
before the system wide test.
> - if (!evsel->own_cpus ||
> - (!evsel->system_wide && evlist->has_user_cpus) ||
> - (!evsel->system_wide &&
> - !evsel->requires_cpu &&
> - perf_cpu_map__empty(evlist->user_requested_cpus))) {
> + if (evsel->system_wide) {
> + perf_cpu_map__put(evsel->cpus);
> + evsel->cpus = perf_cpu_map__new(NULL);
Looking at perf_cpu_map__new, will this mean that in system wide mode
every event/evsel will now read /sys/devices/system/cpu/online here?
We may want to cache the cpumap to avoid this.
Thanks,
Ian
> + } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> + (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
> perf_cpu_map__put(evsel->cpus);
> evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
> } else if (evsel->cpus != evsel->own_cpus) {
> @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
> }
>
> - if (!evsel->system_wide) {
> + if (evsel->system_wide) {
> + perf_thread_map__put(evsel->threads);
> + evsel->threads = perf_thread_map__new_dummy();
> + } else {
> perf_thread_map__put(evsel->threads);
> evsel->threads = perf_thread_map__get(evlist->threads);
> }
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
2022-10-04 12:14 ` Arnaldo Carvalho de Melo
@ 2022-10-06 18:52 ` Ian Rogers
2022-10-06 23:21 ` Namhyung Kim
1 sibling, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2022-10-06 18:52 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan
On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> The current code propagate evsel's cpu map settings to evlist when it's
> added to an evlist. But the evlist->all_cpus and each evsel's cpus will
> be updated in perf_evlist__set_maps() later. No need to do it before
> evlist's cpus are set actually.
>
> In fact it discards this intermediate all_cpus maps at the beginning
> of perf_evlist__set_maps(). Let's not do this. It's only needed when
> an evsel is added after the evlist cpu/thread maps are set.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/lib/perf/evlist.c | 11 ++++-------
> tools/lib/perf/include/internal/evlist.h | 1 +
> 2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 187129652ab6..8ce92070086c 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> {
> struct perf_evsel *evsel;
>
> - /* Recomputing all_cpus, so start with a blank slate. */
> - perf_cpu_map__put(evlist->all_cpus);
> - evlist->all_cpus = NULL;
> + evlist->needs_map_propagation = true;
Might be nice to also clear this in perf_evlist__init.
>
> perf_evlist__for_each_evsel(evlist, evsel)
> __perf_evlist__propagate_maps(evlist, evsel);
> @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
> evsel->idx = evlist->nr_entries;
> list_add_tail(&evsel->node, &evlist->entries);
> evlist->nr_entries += 1;
> - __perf_evlist__propagate_maps(evlist, evsel);
> +
> + if (evlist->needs_map_propagation)
> + __perf_evlist__propagate_maps(evlist, evsel);
I think a comment here would be useful. Something like:
Adding events won't set the CPU maps in the evlist until
set_maps/propogate_maps is called. Catch the case that an evsel is
added after this and propagate the map.
Thanks,
Ian
> }
>
> void perf_evlist__remove(struct perf_evlist *evlist,
> @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
> evlist->threads = perf_thread_map__get(threads);
> }
>
> - if (!evlist->all_cpus && cpus)
> - evlist->all_cpus = perf_cpu_map__get(cpus);
> -
> perf_evlist__propagate_maps(evlist);
> }
>
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 6f89aec3e608..850f07070036 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -19,6 +19,7 @@ struct perf_evlist {
> int nr_entries;
> int nr_groups;
> bool has_user_cpus;
> + bool needs_map_propagation;
> /**
> * The cpus passed from the command line or all online CPUs by
> * default.
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
@ 2022-10-06 18:57 ` Ian Rogers
0 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2022-10-06 18:57 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan
On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> For system-wide evsels, the thread map should be dummy - i.e. it has a
> single entry of -1. But the code guarantees such a thread map, so no
> need to handle it specially.
>
> No functional change intended.
>
> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/lib/perf/evsel.c | 3 ---
> tools/perf/builtin-script.c | 3 ---
> tools/perf/util/evsel.c | 12 ++----------
> tools/perf/util/stat.c | 3 ---
> 4 files changed, 2 insertions(+), 19 deletions(-)
>
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 8ce5bbd09666..8b51b008a81f 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
> if (ncpus == 0 || nthreads == 0)
> return 0;
>
> - if (evsel->system_wide)
> - nthreads = 1;
> -
> evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
> if (evsel->sample_id == NULL)
> return -ENOMEM;
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 886f53cfa257..7fa467ed91dc 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
> struct perf_cpu cpu;
> static int header_printed;
>
> - if (counter->core.system_wide)
> - nthreads = 1;
> -
> if (!header_printed) {
> printf("%3s %8s %15s %15s %15s %15s %s\n",
> "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 5776bfa70f11..e319bb17d10d 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
> static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads)
> {
> - int nthreads;
> + int nthreads = perf_thread_map__nr(threads);
A general nit for something not introduced by this change. nr/nthreads
here is really the number of perf_event_open file descriptors needed
for the thread map. I wonder if a variable name of num_thread_fds
would make the code more intention revealing.
Thanks,
Ian
>
> if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
> (perf_missing_features.aux_output && evsel->core.attr.aux_output))
> @@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> threads = empty_thread_map;
> }
>
> - if (evsel->core.system_wide)
> - nthreads = 1;
> - else
> - nthreads = threads->nr;
> -
> if (evsel->core.fd == NULL &&
> perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
> return -ENOMEM;
> @@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> if (threads == NULL)
> threads = empty_thread_map;
>
> - if (evsel->core.system_wide)
> - nthreads = 1;
> - else
> - nthreads = threads->nr;
> + nthreads = perf_thread_map__nr(threads);
>
> if (evsel->cgrp)
> pid = evsel->cgrp->fd;
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index ce5e9e372fc4..cef943377ad7 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
> int ncpus = evsel__nr_cpus(counter);
> int idx, thread;
>
> - if (counter->core.system_wide)
> - nthreads = 1;
> -
> for (thread = 0; thread < nthreads; thread++) {
> for (idx = 0; idx < ncpus; idx++) {
> if (process_counter_values(config, counter, idx, thread,
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps
2022-10-06 18:43 ` Ian Rogers
@ 2022-10-06 23:13 ` Namhyung Kim
0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-06 23:13 UTC (permalink / raw)
To: Ian Rogers
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan
Hi Ian,
On Thu, Oct 6, 2022 at 11:44 AM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Setting proper cpu and thread maps for system wide evsels regardless of
> > user requested cpu in __perf_evlist__propagate_maps(). Those evsels
> > need to be active on all cpus always. Do it in the libperf so that we
> > can guarantee it has proper maps.
> >
> > Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > tools/lib/perf/evlist.c | 15 +++++++++------
> > 1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> > index 6b1bafe267a4..187129652ab6 100644
> > --- a/tools/lib/perf/evlist.c
> > +++ b/tools/lib/perf/evlist.c
> > @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> > * We already have cpus for evsel (via PMU sysfs) so
> > * keep it, if there's no target cpu list defined.
> > */
>
> This comment is for the 'else if' case and so is a little out of place
> before the system wide test.
Right, I'll add a comment for system-wide evsels too.
>
> > - if (!evsel->own_cpus ||
> > - (!evsel->system_wide && evlist->has_user_cpus) ||
> > - (!evsel->system_wide &&
> > - !evsel->requires_cpu &&
> > - perf_cpu_map__empty(evlist->user_requested_cpus))) {
> > + if (evsel->system_wide) {
> > + perf_cpu_map__put(evsel->cpus);
> > + evsel->cpus = perf_cpu_map__new(NULL);
>
> Looking at perf_cpu_map__new, will this mean that in system wide mode
> every event/evsel will now read /sys/devices/system/cpu/online here?
> We may want to cache the cpumap to avoid this.
Yeah, it's redundant. I thought it's ok since the system-wide evsels
are not common. We can consider caching when it becomes a problem.
Thanks,
Namhyung
>
> > + } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> > + (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) {
> > perf_cpu_map__put(evsel->cpus);
> > evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
> > } else if (evsel->cpus != evsel->own_cpus) {
> > @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> > evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
> > }
> >
> > - if (!evsel->system_wide) {
> > + if (evsel->system_wide) {
> > + perf_thread_map__put(evsel->threads);
> > + evsel->threads = perf_thread_map__new_dummy();
> > + } else {
> > perf_thread_map__put(evsel->threads);
> > evsel->threads = perf_thread_map__get(evlist->threads);
> > }
> > --
> > 2.38.0.rc1.362.ged0d419d3c-goog
> >
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary
2022-10-06 18:52 ` Ian Rogers
@ 2022-10-06 23:21 ` Namhyung Kim
0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-10-06 23:21 UTC (permalink / raw)
To: Ian Rogers
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan
On Thu, Oct 6, 2022 at 11:52 AM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > The current code propagate evsel's cpu map settings to evlist when it's
> > added to an evlist. But the evlist->all_cpus and each evsel's cpus will
> > be updated in perf_evlist__set_maps() later. No need to do it before
> > evlist's cpus are set actually.
> >
> > In fact it discards this intermediate all_cpus maps at the beginning
> > of perf_evlist__set_maps(). Let's not do this. It's only needed when
> > an evsel is added after the evlist cpu/thread maps are set.
> >
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > tools/lib/perf/evlist.c | 11 ++++-------
> > tools/lib/perf/include/internal/evlist.h | 1 +
> > 2 files changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> > index 187129652ab6..8ce92070086c 100644
> > --- a/tools/lib/perf/evlist.c
> > +++ b/tools/lib/perf/evlist.c
> > @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> > {
> > struct perf_evsel *evsel;
> >
> > - /* Recomputing all_cpus, so start with a blank slate. */
> > - perf_cpu_map__put(evlist->all_cpus);
> > - evlist->all_cpus = NULL;
> > + evlist->needs_map_propagation = true;
>
> Might be nice to also clear this in perf_evlist__init.
It's zero-initialized so I skipped it. I couldn't find places
where it resets the existing evlist.
But now I think we should not call perf_evlist__set_maps()
in evlist__init() from perf tools.
>
> >
> > perf_evlist__for_each_evsel(evlist, evsel)
> > __perf_evlist__propagate_maps(evlist, evsel);
> > @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
> > evsel->idx = evlist->nr_entries;
> > list_add_tail(&evsel->node, &evlist->entries);
> > evlist->nr_entries += 1;
> > - __perf_evlist__propagate_maps(evlist, evsel);
> > +
> > + if (evlist->needs_map_propagation)
> > + __perf_evlist__propagate_maps(evlist, evsel);
>
> I think a comment here would be useful. Something like:
> Adding events won't set the CPU maps in the evlist until
> set_maps/propogate_maps is called. Catch the case that an evsel is
> added after this and propagate the map.
Looks good, will add!
Thanks,
Namhyung
> > }
> >
> > void perf_evlist__remove(struct perf_evlist *evlist,
> > @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
> > evlist->threads = perf_thread_map__get(threads);
> > }
> >
> > - if (!evlist->all_cpus && cpus)
> > - evlist->all_cpus = perf_cpu_map__get(cpus);
> > -
> > perf_evlist__propagate_maps(evlist);
> > }
> >
> > diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> > index 6f89aec3e608..850f07070036 100644
> > --- a/tools/lib/perf/include/internal/evlist.h
> > +++ b/tools/lib/perf/include/internal/evlist.h
> > @@ -19,6 +19,7 @@ struct perf_evlist {
> > int nr_entries;
> > int nr_groups;
> > bool has_user_cpus;
> > + bool needs_map_propagation;
> > /**
> > * The cpus passed from the command line or all online CPUs by
> > * default.
> > --
> > 2.38.0.rc1.362.ged0d419d3c-goog
> >
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
2022-09-30 17:27 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) Namhyung Kim
@ 2022-09-30 17:27 ` Namhyung Kim
0 siblings, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2022-09-30 17:27 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
linux-perf-users, Kan Liang, Leo Yan
For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1. But the code guarantees such a thread map, so no
need to handle it specially.
No functional change intended.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/lib/perf/evsel.c | 3 ---
tools/perf/builtin-script.c | 3 ---
tools/perf/util/evsel.c | 12 ++----------
tools/perf/util/stat.c | 3 ---
4 files changed, 2 insertions(+), 19 deletions(-)
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8ce5bbd09666..8b51b008a81f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
if (ncpus == 0 || nthreads == 0)
return 0;
- if (evsel->system_wide)
- nthreads = 1;
-
evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
if (evsel->sample_id == NULL)
return -ENOMEM;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 886f53cfa257..7fa467ed91dc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
struct perf_cpu cpu;
static int header_printed;
- if (counter->core.system_wide)
- nthreads = 1;
-
if (!header_printed) {
printf("%3s %8s %15s %15s %15s %15s %s\n",
"CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5776bfa70f11..e319bb17d10d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads)
{
- int nthreads;
+ int nthreads = perf_thread_map__nr(threads);
if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
(perf_missing_features.aux_output && evsel->core.attr.aux_output))
@@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
threads = empty_thread_map;
}
- if (evsel->core.system_wide)
- nthreads = 1;
- else
- nthreads = threads->nr;
-
if (evsel->core.fd == NULL &&
perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
return -ENOMEM;
@@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
if (threads == NULL)
threads = empty_thread_map;
- if (evsel->core.system_wide)
- nthreads = 1;
- else
- nthreads = threads->nr;
+ nthreads = perf_thread_map__nr(threads);
if (evsel->cgrp)
pid = evsel->cgrp->fd;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..cef943377ad7 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
int ncpus = evsel__nr_cpus(counter);
int idx, thread;
- if (counter->core.system_wide)
- nthreads = 1;
-
for (thread = 0; thread < nthreads; thread++) {
for (idx = 0; idx < ncpus; idx++) {
if (process_counter_values(config, counter, idx, thread,
--
2.38.0.rc1.362.ged0d419d3c-goog
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
2022-09-24 16:57 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
@ 2022-09-27 7:09 ` Adrian Hunter
0 siblings, 0 replies; 16+ messages in thread
From: Adrian Hunter @ 2022-09-27 7:09 UTC (permalink / raw)
To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users,
Kan Liang, Leo Yan
On 24/09/22 19:57, Namhyung Kim wrote:
> For system-wide evsels, the thread map should be dummy - i.e. it has a
> single entry of -1. But the code guarantees such a thread map, so no
> need to handle it specially.
>
> No functional change intended.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
> tools/lib/perf/evsel.c | 3 ---
> tools/perf/builtin-script.c | 3 ---
> tools/perf/util/evsel.c | 12 ++----------
> tools/perf/util/stat.c | 3 ---
> 4 files changed, 2 insertions(+), 19 deletions(-)
>
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 8ce5bbd09666..8b51b008a81f 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
> if (ncpus == 0 || nthreads == 0)
> return 0;
>
> - if (evsel->system_wide)
> - nthreads = 1;
> -
> evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
> if (evsel->sample_id == NULL)
> return -ENOMEM;
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 886f53cfa257..7fa467ed91dc 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
> struct perf_cpu cpu;
> static int header_printed;
>
> - if (counter->core.system_wide)
> - nthreads = 1;
> -
> if (!header_printed) {
> printf("%3s %8s %15s %15s %15s %15s %s\n",
> "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 5776bfa70f11..e319bb17d10d 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
> static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> struct perf_thread_map *threads)
> {
> - int nthreads;
> + int nthreads = perf_thread_map__nr(threads);
>
> if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
> (perf_missing_features.aux_output && evsel->core.attr.aux_output))
> @@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
> threads = empty_thread_map;
> }
>
> - if (evsel->core.system_wide)
> - nthreads = 1;
> - else
> - nthreads = threads->nr;
> -
> if (evsel->core.fd == NULL &&
> perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
> return -ENOMEM;
> @@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
> if (threads == NULL)
> threads = empty_thread_map;
>
> - if (evsel->core.system_wide)
> - nthreads = 1;
> - else
> - nthreads = threads->nr;
> + nthreads = perf_thread_map__nr(threads);
>
> if (evsel->cgrp)
> pid = evsel->cgrp->fd;
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index ce5e9e372fc4..cef943377ad7 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
> int ncpus = evsel__nr_cpus(counter);
> int idx, thread;
>
> - if (counter->core.system_wide)
> - nthreads = 1;
> -
> for (thread = 0; thread < nthreads; thread++) {
> for (idx = 0; idx < ncpus; idx++) {
> if (process_counter_values(config, counter, idx, thread,
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 5/5] perf tools: Remove special handling of system-wide evsel
2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
@ 2022-09-24 16:57 ` Namhyung Kim
2022-09-27 7:09 ` Adrian Hunter
0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter,
linux-perf-users, Kan Liang, Leo Yan
For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1. But the code guarantees such a thread map, so no
need to handle it specially.
No functional change intended.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/lib/perf/evsel.c | 3 ---
tools/perf/builtin-script.c | 3 ---
tools/perf/util/evsel.c | 12 ++----------
tools/perf/util/stat.c | 3 ---
4 files changed, 2 insertions(+), 19 deletions(-)
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8ce5bbd09666..8b51b008a81f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
if (ncpus == 0 || nthreads == 0)
return 0;
- if (evsel->system_wide)
- nthreads = 1;
-
evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
if (evsel->sample_id == NULL)
return -ENOMEM;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 886f53cfa257..7fa467ed91dc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
struct perf_cpu cpu;
static int header_printed;
- if (counter->core.system_wide)
- nthreads = 1;
-
if (!header_printed) {
printf("%3s %8s %15s %15s %15s %15s %s\n",
"CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5776bfa70f11..e319bb17d10d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads)
{
- int nthreads;
+ int nthreads = perf_thread_map__nr(threads);
if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
(perf_missing_features.aux_output && evsel->core.attr.aux_output))
@@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
threads = empty_thread_map;
}
- if (evsel->core.system_wide)
- nthreads = 1;
- else
- nthreads = threads->nr;
-
if (evsel->core.fd == NULL &&
perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
return -ENOMEM;
@@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
if (threads == NULL)
threads = empty_thread_map;
- if (evsel->core.system_wide)
- nthreads = 1;
- else
- nthreads = threads->nr;
+ nthreads = perf_thread_map__nr(threads);
if (evsel->cgrp)
pid = evsel->cgrp->fd;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..cef943377ad7 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
int ncpus = evsel__nr_cpus(counter);
int idx, thread;
- if (counter->core.system_wide)
- nthreads = 1;
-
for (thread = 0; thread < nthreads; thread++) {
for (idx = 0; idx < ncpus; idx++) {
if (process_counter_values(config, counter, idx, thread,
--
2.37.3.998.g577e59143f-goog
^ permalink raw reply related [flat|nested] 16+ messages in thread
end of thread, other threads:[~2022-10-06 23:22 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim
2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim
2022-10-06 18:43 ` Ian Rogers
2022-10-06 23:13 ` Namhyung Kim
2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim
2022-10-04 12:14 ` Arnaldo Carvalho de Melo
2022-10-04 13:55 ` Adrian Hunter
2022-10-06 18:52 ` Ian Rogers
2022-10-06 23:21 ` Namhyung Kim
2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim
2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim
2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
2022-10-06 18:57 ` Ian Rogers
-- strict thread matches above, loose matches on Subject: below --
2022-09-30 17:27 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) Namhyung Kim
2022-09-30 17:27 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim
2022-09-24 16:57 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim
2022-09-27 7:09 ` Adrian Hunter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).