linux-kernel.vger.kernel.org archive mirror
* [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu
@ 2022-04-22 16:23 Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 01/21] libperf evsel: Factor out perf_evsel__ioctl() Adrian Hunter
                   ` (21 more replies)
  0 siblings, 22 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Hi

Here are patches to support capturing Intel PT sideband events such as
mmap, task, context switch, text poke, etc., on every CPU even when tracing
only selected CPUs (user_requested_cpus), that is, when using the
perf record -C or --cpu option.

This is needed for:
1. text poke: a text poke on any CPU affects all CPUs
2. tracing user space: a user space process can migrate between CPUs, so
mmap events that happen on a different CPU may be needed to decode the
trace from a CPU in user_requested_cpus.

For example:

	Trace on CPU 1:

	perf record --kcore -C 1 -e intel_pt// &

	Start a task on CPU 0:

	taskset 0x1 testprog &

	Migrate it to CPU 1:

	taskset -p 0x2 <testprog pid>

	Stop tracing:

	kill %1

	Prior to these changes there will be errors decoding testprog
	in userspace because the comm and mmap events for testprog will not
	have been captured.

There is quite a bit of preparation:

The first 5 patches stop auxtrace mixing up mmap idx between evlist and
evsel.  That is going to matter when
evlist->all_cpus != evlist->user_requested_cpus != evsel->cpus:

      libperf evsel: Factor out perf_evsel__ioctl()
      libperf evsel: Add perf_evsel__enable_thread()
      perf evlist: Use libperf functions in evlist__enable_event_idx()
      perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c
      perf auxtrace: Do not mix up mmap idx
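
To illustrate why the two indexes must not be mixed up, here is a minimal
standalone sketch. It assumes a toy CPU map that is just an array of CPU
numbers; the toy_* names are hypothetical and are not the libperf API. An
index into the evlist-wide map is first turned into a CPU number, and that
CPU is then looked up in the event's own map:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a perf CPU map: an array of CPU numbers. */
struct toy_cpu_map {
	const int *cpus;
	int nr;
};

/* Return the CPU number at position idx, or -1 if idx is out of range. */
static int toy_cpu_map__cpu(const struct toy_cpu_map *map, int idx)
{
	if (idx < 0 || idx >= map->nr)
		return -1;
	return map->cpus[idx];
}

/* Return the position of cpu in the map, or -1 if it is not present. */
static int toy_cpu_map__idx(const struct toy_cpu_map *map, int cpu)
{
	for (int i = 0; i < map->nr; i++) {
		if (map->cpus[i] == cpu)
			return i;
	}
	return -1;
}

/*
 * Translate an index into the evlist-wide map into an index into one
 * event's own map. Returns -1 when the event is not present on that CPU.
 */
static int toy_evlist_idx_to_evsel_idx(const struct toy_cpu_map *all_cpus,
				       const struct toy_cpu_map *evsel_cpus,
				       int evlist_idx)
{
	int cpu;

	assert(all_cpus != NULL && evsel_cpus != NULL);

	cpu = toy_cpu_map__cpu(all_cpus, evlist_idx);
	if (cpu < 0)
		return -1;
	return toy_cpu_map__idx(evsel_cpus, cpu);
}
```

With all_cpus = {0, 1, 2, 3} and an event opened only on CPU 1, evlist
index 1 translates to event index 0, while evlist index 0 has no
counterpart and yields -1.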

The next 6 patches stop attempts to set up an auxtrace mmap for an event
that is not an auxtrace event, e.g. when mmapping the CPUs on which only
sideband is captured:

      libperf evlist: Remove ->idx() per_cpu parameter
      libperf evlist: Move ->idx() into mmap_per_evsel()
      libperf evlist: Add evsel as a parameter to ->idx()
      perf auxtrace: Record whether an auxtrace mmap is needed
      perf auxtrace: Add mmap_needed to auxtrace_mmap_params
      perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter
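
The gating described above can be sketched roughly as follows. This is a
simplified standalone model, not the perf code: the toy_* structures are
hypothetical and keep only the fields needed to show the idea that a
per-event flag decides whether an AUX area mmap is attempted at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified versions of the structures named in the series. */
struct toy_evsel {
	bool needs_auxtrace_mmap;	/* set only for the AUX area event */
};

struct toy_auxtrace_mmap_params {
	size_t len;		/* requested AUX buffer size, 0 if none */
	int idx;
	bool mmap_needed;
};

/* Record, per event, whether the AUX buffer has to be mapped at all. */
static void toy_params__set_idx(struct toy_auxtrace_mmap_params *mp,
				const struct toy_evsel *evsel, int idx)
{
	assert(mp != NULL && evsel != NULL);

	mp->mmap_needed = evsel->needs_auxtrace_mmap;
	if (!mp->mmap_needed)
		return;	/* leave idx untouched for sideband-only events */

	mp->idx = idx;
}

/* Return true if an AUX mmap would be attempted, false if it is skipped. */
static bool toy_auxtrace_mmap_wanted(const struct toy_auxtrace_mmap_params *mp)
{
	return mp->len != 0 && mp->mmap_needed;
}
```

In this model a dummy sideband event on a CPU that is not being traced
leaves mmap_needed false, so no AUX buffer is mapped for it.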

The next 5 patches switch to setting up dummy event maps before adding the
evsel so that the evsel is subject to map propagation, primarily to cause
addition of the evsel's CPUs to all_cpus.

      perf evlist: Factor out evlist__dummy_event()
      perf evlist: Add evlist__add_system_wide_dummy()
      perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke()
      perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking
      perf intel-pt: Track sideband system-wide when needed

The remaining 5 patches make more significant changes.

First, change from using user_requested_cpus to using all_cpus where necessary:

      perf tools: Allow all_cpus to be a superset of user_requested_cpus

Secondly, mmap all per-thread and all per-cpu events:

      libperf evlist: Allow mixing per-thread and per-cpu mmaps

Thirdly, stop using the system_wide flag for uncore because it will no longer work:

      perf stat: Add per_cpu_only flag for uncore

Finally, change map propagation so that system-wide events retain their CPUs and
(dummy) threads:

      perf tools: Allow system-wide events to keep their own CPUs
      perf tools: Allow system-wide events to keep their own threads


Adrian Hunter (21):
      libperf evsel: Factor out perf_evsel__ioctl()
      libperf evsel: Add perf_evsel__enable_thread()
      perf evlist: Use libperf functions in evlist__enable_event_idx()
      perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c
      perf auxtrace: Do not mix up mmap idx
      libperf evlist: Remove ->idx() per_cpu parameter
      libperf evlist: Move ->idx() into mmap_per_evsel()
      libperf evlist: Add evsel as a parameter to ->idx()
      perf auxtrace: Record whether an auxtrace mmap is needed
      perf auxtrace: Add mmap_needed to auxtrace_mmap_params
      perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter
      perf evlist: Factor out evlist__dummy_event()
      perf evlist: Add evlist__add_system_wide_dummy()
      perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke()
      perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking
      perf intel-pt: Track sideband system-wide when needed
      perf tools: Allow all_cpus to be a superset of user_requested_cpus
      libperf evlist: Allow mixing per-thread and per-cpu mmaps
      perf stat: Add per_cpu_only flag for uncore
      perf tools: Allow system-wide events to keep their own CPUs
      perf tools: Allow system-wide events to keep their own threads

 tools/lib/perf/evlist.c                  |  67 +++++++------------
 tools/lib/perf/evsel.c                   |  29 +++++++--
 tools/lib/perf/include/internal/evlist.h |   3 +-
 tools/lib/perf/include/internal/evsel.h  |   1 +
 tools/lib/perf/include/perf/evsel.h      |   1 +
 tools/perf/arch/arm/util/cs-etm.c        |   1 +
 tools/perf/arch/arm64/util/arm-spe.c     |   1 +
 tools/perf/arch/s390/util/auxtrace.c     |   1 +
 tools/perf/arch/x86/util/intel-bts.c     |   1 +
 tools/perf/arch/x86/util/intel-pt.c      |  32 ++++------
 tools/perf/builtin-record.c              |  39 +++++-------
 tools/perf/builtin-stat.c                |   5 +-
 tools/perf/util/auxtrace.c               |  31 +++++++--
 tools/perf/util/auxtrace.h               |   8 ++-
 tools/perf/util/evlist.c                 | 106 +++++++++++++++----------------
 tools/perf/util/evlist.h                 |   7 +-
 tools/perf/util/evsel.c                  |   1 +
 tools/perf/util/evsel.h                  |   1 +
 tools/perf/util/mmap.c                   |   4 +-
 tools/perf/util/parse-events.c           |   2 +-
 20 files changed, 176 insertions(+), 165 deletions(-)


Regards
Adrian


* [PATCH RFC 01/21] libperf evsel: Factor out perf_evsel__ioctl()
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 19:05   ` Arnaldo Carvalho de Melo
  2022-04-22 16:23 ` [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread() Adrian Hunter
                   ` (20 subsequent siblings)
  21 siblings, 1 reply; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Factor out perf_evsel__ioctl() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evsel.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 210ea7c06ce8..20ae9f5f8b30 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -328,6 +328,17 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int thread,
 	return 0;
 }
 
+static int perf_evsel__ioctl(struct perf_evsel *evsel, int ioc, void *arg,
+			     int cpu_map_idx, int thread)
+{
+	int *fd = FD(evsel, cpu_map_idx, thread);
+
+	if (fd == NULL || *fd < 0)
+		return -1;
+
+	return ioctl(*fd, ioc, arg);
+}
+
 static int perf_evsel__run_ioctl(struct perf_evsel *evsel,
 				 int ioc,  void *arg,
 				 int cpu_map_idx)
@@ -335,13 +346,7 @@ static int perf_evsel__run_ioctl(struct perf_evsel *evsel,
 	int thread;
 
 	for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
-		int err;
-		int *fd = FD(evsel, cpu_map_idx, thread);
-
-		if (fd == NULL || *fd < 0)
-			return -1;
-
-		err = ioctl(*fd, ioc, arg);
+		int err = perf_evsel__ioctl(evsel, ioc, arg, cpu_map_idx, thread);
 
 		if (err)
 			return err;
-- 
2.25.1



* [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread()
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 01/21] libperf evsel: Factor out perf_evsel__ioctl() Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-27 21:48   ` Namhyung Kim
  2022-05-03 16:45   ` Ian Rogers
  2022-04-22 16:23 ` [PATCH RFC 03/21] perf evlist: Use libperf functions in evlist__enable_event_idx() Adrian Hunter
                   ` (19 subsequent siblings)
  21 siblings, 2 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Add perf_evsel__enable_thread() as a counterpart to
perf_evsel__enable_cpu(), to enable all events for a thread.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evsel.c              | 10 ++++++++++
 tools/lib/perf/include/perf/evsel.h |  1 +
 2 files changed, 11 insertions(+)
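
For readers unfamiliar with the layout, the difference between enabling
per CPU and enabling per thread can be sketched with a toy file descriptor
table. This is a standalone model with hypothetical toy_* names and fixed
dimensions; it sets a flag instead of issuing a real ioctl.

```c
#include <assert.h>

#define TOY_NR_CPUS	3
#define TOY_NR_THREADS	2

/* Toy stand-in for evsel->fd: one descriptor per (cpu index, thread). */
struct toy_fd_table {
	int fd[TOY_NR_CPUS][TOY_NR_THREADS];
	int enabled[TOY_NR_CPUS][TOY_NR_THREADS];
};

/* Stand-in for the ioctl on one descriptor; fails on an unopened slot. */
static int toy_enable_one(struct toy_fd_table *t, int cpu_idx, int thread)
{
	assert(cpu_idx >= 0 && cpu_idx < TOY_NR_CPUS);
	assert(thread >= 0 && thread < TOY_NR_THREADS);

	if (t->fd[cpu_idx][thread] < 0)
		return -1;
	t->enabled[cpu_idx][thread] = 1;
	return 0;
}

/* Enable every thread's event on one CPU index (walk one row). */
static int toy_enable_cpu(struct toy_fd_table *t, int cpu_idx)
{
	int err = 0;

	for (int thread = 0; thread < TOY_NR_THREADS && !err; thread++)
		err = toy_enable_one(t, cpu_idx, thread);
	return err;
}

/* Enable one thread's event on every CPU index (walk one column). */
static int toy_enable_thread(struct toy_fd_table *t, int thread)
{
	int err = 0;

	for (int i = 0; i < TOY_NR_CPUS && !err; i++)
		err = toy_enable_one(t, i, thread);
	return err;
}
```

Both helpers stop at the first failure and return its error, which matches
the loop condition used in perf_evsel__enable_thread() below.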

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 20ae9f5f8b30..2a1f07f877be 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -360,6 +360,16 @@ int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx)
 	return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, cpu_map_idx);
 }
 
+int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread)
+{
+	int err = 0;
+	int i;
+
+	for (i = 0; i < xyarray__max_x(evsel->fd) && !err; i++)
+		err = perf_evsel__ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, i, thread);
+	return err;
+}
+
 int perf_evsel__enable(struct perf_evsel *evsel)
 {
 	int i;
diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
index 2a9516b42d15..699c0ed97d34 100644
--- a/tools/lib/perf/include/perf/evsel.h
+++ b/tools/lib/perf/include/perf/evsel.h
@@ -36,6 +36,7 @@ LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int
 				 struct perf_counts_values *count);
 LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
 LIBPERF_API int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
+LIBPERF_API int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread);
 LIBPERF_API int perf_evsel__disable(struct perf_evsel *evsel);
 LIBPERF_API int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
 LIBPERF_API struct perf_cpu_map *perf_evsel__cpus(struct perf_evsel *evsel);
-- 
2.25.1



* [PATCH RFC 03/21] perf evlist: Use libperf functions in evlist__enable_event_idx()
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 01/21] libperf evsel: Factor out perf_evsel__ioctl() Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread() Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 04/21] perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c Adrian Hunter
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

evlist__enable_event_idx() is used only for auxtrace events, which are never
system_wide. Simplify by using the libperf enable event functions.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/evlist.c | 44 ++--------------------------------------
 1 file changed, 2 insertions(+), 42 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 52ea004ba01e..9fcecf7daa62 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -334,14 +334,6 @@ int evlist__add_newtp(struct evlist *evlist, const char *sys, const char *name,
 	return 0;
 }
 
-static int evlist__nr_threads(struct evlist *evlist, struct evsel *evsel)
-{
-	if (evsel->core.system_wide)
-		return 1;
-	else
-		return perf_thread_map__nr(evlist->core.threads);
-}
-
 struct evlist_cpu_iterator evlist__cpu_begin(struct evlist *evlist, struct affinity *affinity)
 {
 	struct evlist_cpu_iterator itr = {
@@ -546,46 +538,14 @@ void evlist__toggle_enable(struct evlist *evlist)
 	(evlist->enabled ? evlist__disable : evlist__enable)(evlist);
 }
 
-static int evlist__enable_event_cpu(struct evlist *evlist, struct evsel *evsel, int cpu)
-{
-	int thread;
-	int nr_threads = evlist__nr_threads(evlist, evsel);
-
-	if (!evsel->core.fd)
-		return -EINVAL;
-
-	for (thread = 0; thread < nr_threads; thread++) {
-		int err = ioctl(FD(evsel, cpu, thread), PERF_EVENT_IOC_ENABLE, 0);
-		if (err)
-			return err;
-	}
-	return 0;
-}
-
-static int evlist__enable_event_thread(struct evlist *evlist, struct evsel *evsel, int thread)
-{
-	int cpu;
-	int nr_cpus = perf_cpu_map__nr(evlist->core.user_requested_cpus);
-
-	if (!evsel->core.fd)
-		return -EINVAL;
-
-	for (cpu = 0; cpu < nr_cpus; cpu++) {
-		int err = ioctl(FD(evsel, cpu, thread), PERF_EVENT_IOC_ENABLE, 0);
-		if (err)
-			return err;
-	}
-	return 0;
-}
-
 int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx)
 {
 	bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
 
 	if (per_cpu_mmaps)
-		return evlist__enable_event_cpu(evlist, evsel, idx);
+		return perf_evsel__enable_cpu(&evsel->core, idx);
 
-	return evlist__enable_event_thread(evlist, evsel, idx);
+	return perf_evsel__enable_thread(&evsel->core, idx);
 }
 
 int evlist__add_pollfd(struct evlist *evlist, int fd)
-- 
2.25.1



* [PATCH RFC 04/21] perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (2 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 03/21] perf evlist: Use libperf functions in evlist__enable_event_idx() Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 05/21] perf auxtrace: Do not mix up mmap idx Adrian Hunter
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

evlist__enable_event_idx() is used only by auxtrace. Move it to auxtrace.c
in preparation for making it even more auxtrace specific.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/auxtrace.c | 10 ++++++++++
 tools/perf/util/evlist.c   | 10 ----------
 tools/perf/util/evlist.h   |  2 --
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index df1c5bbbaa0d..10936a38031f 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -636,6 +636,16 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr,
 	return -EINVAL;
 }
 
+static int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx)
+{
+	bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
+
+	if (per_cpu_mmaps)
+		return perf_evsel__enable_cpu(&evsel->core, idx);
+
+	return perf_evsel__enable_thread(&evsel->core, idx);
+}
+
 int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
 {
 	struct evsel *evsel;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9fcecf7daa62..f1309b39afe4 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -538,16 +538,6 @@ void evlist__toggle_enable(struct evlist *evlist)
 	(evlist->enabled ? evlist__disable : evlist__enable)(evlist);
 }
 
-int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx)
-{
-	bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
-
-	if (per_cpu_mmaps)
-		return perf_evsel__enable_cpu(&evsel->core, idx);
-
-	return perf_evsel__enable_thread(&evsel->core, idx);
-}
-
 int evlist__add_pollfd(struct evlist *evlist, int fd)
 {
 	return perf_evlist__add_pollfd(&evlist->core, fd, NULL, POLLIN, fdarray_flag__default);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a21daaa5fc1b..4062f5aebfc1 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -196,8 +196,6 @@ void evlist__toggle_enable(struct evlist *evlist);
 void evlist__disable_evsel(struct evlist *evlist, char *evsel_name);
 void evlist__enable_evsel(struct evlist *evlist, char *evsel_name);
 
-int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx);
-
 void evlist__set_selected(struct evlist *evlist, struct evsel *evsel);
 
 int evlist__create_maps(struct evlist *evlist, struct target *target);
-- 
2.25.1



* [PATCH RFC 05/21] perf auxtrace: Do not mix up mmap idx
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (3 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 04/21] perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-27 21:54   ` Namhyung Kim
  2022-04-22 16:23 ` [PATCH RFC 06/21] libperf evlist: Remove ->idx() per_cpu parameter Adrian Hunter
                   ` (16 subsequent siblings)
  21 siblings, 1 reply; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

The idx is with respect to evlist not evsel. That hasn't mattered because
they are the same at present. Prepare for that not being the case, which it
won't be when sideband tracking events are allowed on all CPUs even when
auxtrace is limited to selected CPUs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/auxtrace.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 10936a38031f..2d015b0be549 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -640,8 +640,14 @@ static int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel,
 {
 	bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
 
-	if (per_cpu_mmaps)
-		return perf_evsel__enable_cpu(&evsel->core, idx);
+	if (per_cpu_mmaps) {
+		struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
+		int cpu = perf_cpu_map__idx(evsel->core.cpus, evlist_cpu);
+
+		if (cpu == -1)
+			return -EINVAL;
+		return perf_evsel__enable_cpu(&evsel->core, cpu);
+	}
 
 	return perf_evsel__enable_thread(&evsel->core, idx);
 }
-- 
2.25.1



* [PATCH RFC 06/21] libperf evlist: Remove ->idx() per_cpu parameter
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (4 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 05/21] perf auxtrace: Do not mix up mmap idx Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 07/21] libperf evlist: Move ->idx() into mmap_per_evsel() Adrian Hunter
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Remove the ->idx() per_cpu parameter because it isn't needed: the callback
can work it out from evlist->user_requested_cpus.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evlist.c                  | 4 ++--
 tools/lib/perf/include/internal/evlist.h | 2 +-
 tools/perf/util/evlist.c                 | 3 ++-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index a09315538a30..6d0fa7b2f417 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -517,7 +517,7 @@ mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 		int output_overwrite = -1;
 
 		if (ops->idx)
-			ops->idx(evlist, mp, thread, false);
+			ops->idx(evlist, mp, thread);
 
 		if (mmap_per_evsel(evlist, ops, thread, mp, 0, thread,
 				   &output, &output_overwrite))
@@ -544,7 +544,7 @@ mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 		int output_overwrite = -1;
 
 		if (ops->idx)
-			ops->idx(evlist, mp, cpu, true);
+			ops->idx(evlist, mp, cpu);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (mmap_per_evsel(evlist, ops, cpu, mp, cpu,
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index e3e64f37db7b..0d5c830431a7 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -38,7 +38,7 @@ struct perf_evlist {
 };
 
 typedef void
-(*perf_evlist_mmap__cb_idx_t)(struct perf_evlist*, struct perf_mmap_param*, int, bool);
+(*perf_evlist_mmap__cb_idx_t)(struct perf_evlist*, struct perf_mmap_param*, int);
 typedef struct perf_mmap*
 (*perf_evlist_mmap__cb_get_t)(struct perf_evlist*, bool, int);
 typedef int
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f1309b39afe4..09a1d3400fd9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -748,10 +748,11 @@ static struct mmap *evlist__alloc_mmap(struct evlist *evlist,
 static void
 perf_evlist__mmap_cb_idx(struct perf_evlist *_evlist,
 			 struct perf_mmap_param *_mp,
-			 int idx, bool per_cpu)
+			 int idx)
 {
 	struct evlist *evlist = container_of(_evlist, struct evlist, core);
 	struct mmap_params *mp = container_of(_mp, struct mmap_params, core);
+	bool per_cpu = !perf_cpu_map__empty(_evlist->user_requested_cpus);
 
 	auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, idx, per_cpu);
 }
-- 
2.25.1



* [PATCH RFC 07/21] libperf evlist: Move ->idx() into mmap_per_evsel()
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (5 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 06/21] libperf evlist: Remove ->idx() per_cpu parameter Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 08/21] libperf evlist: Add evsel as a parameter to ->idx() Adrian Hunter
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Move ->idx() into mmap_per_evsel() in preparation for adding evsel as a
parameter.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evlist.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 6d0fa7b2f417..673c267f900e 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -474,6 +474,9 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 			 */
 			refcount_set(&map->refcnt, 2);
 
+			if (ops->idx)
+				ops->idx(evlist, mp, idx);
+
 			if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
 				return -1;
 
@@ -516,9 +519,6 @@ mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 		int output = -1;
 		int output_overwrite = -1;
 
-		if (ops->idx)
-			ops->idx(evlist, mp, thread);
-
 		if (mmap_per_evsel(evlist, ops, thread, mp, 0, thread,
 				   &output, &output_overwrite))
 			goto out_unmap;
@@ -543,9 +543,6 @@ mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 		int output = -1;
 		int output_overwrite = -1;
 
-		if (ops->idx)
-			ops->idx(evlist, mp, cpu);
-
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (mmap_per_evsel(evlist, ops, cpu, mp, cpu,
 					   thread, &output, &output_overwrite))
-- 
2.25.1



* [PATCH RFC 08/21] libperf evlist: Add evsel as a parameter to ->idx()
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (6 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 07/21] libperf evlist: Move ->idx() into mmap_per_evsel() Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 09/21] perf auxtrace: Record whether an auxtrace mmap is needed Adrian Hunter
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Add evsel as a parameter to ->idx() in preparation for correctly
determining whether an auxtrace mmap is needed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evlist.c                  | 2 +-
 tools/lib/perf/include/internal/evlist.h | 3 ++-
 tools/perf/util/evlist.c                 | 1 +
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 673c267f900e..ad04da81c367 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -475,7 +475,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 			refcount_set(&map->refcnt, 2);
 
 			if (ops->idx)
-				ops->idx(evlist, mp, idx);
+				ops->idx(evlist, evsel, mp, idx);
 
 			if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
 				return -1;
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 0d5c830431a7..6f89aec3e608 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -38,7 +38,8 @@ struct perf_evlist {
 };
 
 typedef void
-(*perf_evlist_mmap__cb_idx_t)(struct perf_evlist*, struct perf_mmap_param*, int);
+(*perf_evlist_mmap__cb_idx_t)(struct perf_evlist*, struct perf_evsel*,
+			      struct perf_mmap_param*, int);
 typedef struct perf_mmap*
 (*perf_evlist_mmap__cb_get_t)(struct perf_evlist*, bool, int);
 typedef int
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 09a1d3400fd9..7ae56b062f44 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -747,6 +747,7 @@ static struct mmap *evlist__alloc_mmap(struct evlist *evlist,
 
 static void
 perf_evlist__mmap_cb_idx(struct perf_evlist *_evlist,
+			 struct perf_evsel *_evsel __maybe_unused,
 			 struct perf_mmap_param *_mp,
 			 int idx)
 {
-- 
2.25.1



* [PATCH RFC 09/21] perf auxtrace: Record whether an auxtrace mmap is needed
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (7 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 08/21] libperf evlist: Add evsel as a parameter to ->idx() Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 10/21] perf auxtrace: Add mmap_needed to auxtrace_mmap_params Adrian Hunter
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Add a flag needs_auxtrace_mmap to record whether an evsel needs an
auxtrace mmap, in preparation for attempting an auxtrace mmap only for
events that need one.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/arch/arm/util/cs-etm.c    | 1 +
 tools/perf/arch/arm64/util/arm-spe.c | 1 +
 tools/perf/arch/s390/util/auxtrace.c | 1 +
 tools/perf/arch/x86/util/intel-bts.c | 1 +
 tools/perf/arch/x86/util/intel-pt.c  | 1 +
 tools/perf/util/evsel.h              | 1 +
 6 files changed, 6 insertions(+)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 11c71aa219f7..1b54638d53b0 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -319,6 +319,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
 			}
 			evsel->core.attr.freq = 0;
 			evsel->core.attr.sample_period = 1;
+			evsel->needs_auxtrace_mmap = true;
 			cs_etm_evsel = evsel;
 			opts->full_auxtrace = true;
 		}
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index af4d63af8072..e24c22f187df 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -159,6 +159,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
 			}
 			evsel->core.attr.freq = 0;
 			evsel->core.attr.sample_period = arm_spe_pmu->default_config->sample_period;
+			evsel->needs_auxtrace_mmap = true;
 			arm_spe_evsel = evsel;
 			opts->full_auxtrace = true;
 		}
diff --git a/tools/perf/arch/s390/util/auxtrace.c b/tools/perf/arch/s390/util/auxtrace.c
index 0db5c58c98e8..5068baa3e092 100644
--- a/tools/perf/arch/s390/util/auxtrace.c
+++ b/tools/perf/arch/s390/util/auxtrace.c
@@ -98,6 +98,7 @@ struct auxtrace_record *auxtrace_record__init(struct evlist *evlist,
 	evlist__for_each_entry(evlist, pos) {
 		if (pos->core.attr.config == PERF_EVENT_CPUM_SF_DIAG) {
 			diagnose = 1;
+			pos->needs_auxtrace_mmap = true;
 			break;
 		}
 	}
diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c
index d68a0f48e41e..bcccfbade5c6 100644
--- a/tools/perf/arch/x86/util/intel-bts.c
+++ b/tools/perf/arch/x86/util/intel-bts.c
@@ -129,6 +129,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr,
 			}
 			evsel->core.attr.freq = 0;
 			evsel->core.attr.sample_period = 1;
+			evsel->needs_auxtrace_mmap = true;
 			intel_bts_evsel = evsel;
 			opts->full_auxtrace = true;
 		}
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 38ec2666ec12..2eaac4638aab 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -649,6 +649,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 			evsel->core.attr.freq = 0;
 			evsel->core.attr.sample_period = 1;
 			evsel->no_aux_samples = true;
+			evsel->needs_auxtrace_mmap = true;
 			intel_pt_evsel = evsel;
 			opts->full_auxtrace = true;
 		}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 041b42d33bf5..1a07694afbdd 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -120,6 +120,7 @@ struct evsel {
 	bool			merged_stat;
 	bool			reset_group;
 	bool			errored;
+	bool			needs_auxtrace_mmap;
 	struct hashmap		*per_pkg_mask;
 	int			err;
 	struct {
-- 
2.25.1



* [PATCH RFC 10/21] perf auxtrace: Add mmap_needed to auxtrace_mmap_params
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (8 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 09/21] perf auxtrace: Record whether an auxtrace mmap is needed Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 11/21] perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter Adrian Hunter
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Add mmap_needed to auxtrace_mmap_params.

Currently an auxtrace mmap is always attempted even if the event is not an
auxtrace event. That works because, when AUX area tracing, there is always
an auxtrace event first for every mmap. Prepare for that not being the
case, which it won't be when sideband tracking events are allowed on
all CPUs even when auxtrace is limited to selected CPUs.
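
The effect of the new flag can be sketched in isolation. This is a minimal
model under stated assumptions, not the perf code: the sketch_* structures
and sketch_auxtrace_mmap() are hypothetical stand-ins that keep only the
fields relevant to the guard, and fake_buffer stands in for the result of a
real mmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the perf structures. */
struct sketch_mmap_params {
	size_t len;         /* AUX buffer length, 0 when there is no AUX area */
	bool   mmap_needed; /* mirrors evsel->needs_auxtrace_mmap */
};

struct sketch_mmap {
	void *base;         /* NULL means no AUX buffer is mapped */
};

/* Returns true if an AUX mapping would be set up, false if it is skipped. */
static bool sketch_auxtrace_mmap(struct sketch_mmap *mm,
				 const struct sketch_mmap_params *mp,
				 void *fake_buffer)
{
	assert(mm && mp);

	/* Same shape as the guard in the patch: skip sideband-only mmaps. */
	if (!mp->len || !mp->mmap_needed) {
		mm->base = NULL;
		return false;
	}

	mm->base = fake_buffer; /* the real code would mmap here */
	return true;
}
```

With mmap_needed false, as it would be for a sideband-only event, the sketch
leaves base NULL even though len is non-zero.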

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/auxtrace.c | 10 ++++++++--
 tools/perf/util/auxtrace.h |  7 +++++--
 tools/perf/util/evlist.c   |  5 +++--
 tools/perf/util/mmap.c     |  1 +
 4 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 2d015b0be549..9f01ce405971 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -125,7 +125,7 @@ int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
 	mm->tid = mp->tid;
 	mm->cpu = mp->cpu.cpu;
 
-	if (!mp->len) {
+	if (!mp->len || !mp->mmap_needed) {
 		mm->base = NULL;
 		return 0;
 	}
@@ -168,9 +168,15 @@ void auxtrace_mmap_params__init(struct auxtrace_mmap_params *mp,
 }
 
 void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
-				   struct evlist *evlist, int idx,
+				   struct evlist *evlist,
+				   struct evsel *evsel, int idx,
 				   bool per_cpu)
 {
+	mp->mmap_needed = evsel->needs_auxtrace_mmap;
+
+	if (!mp->mmap_needed)
+		return;
+
 	mp->idx = idx;
 
 	if (per_cpu) {
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index dc38b6f57232..4e715e2d9291 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -353,6 +353,7 @@ struct auxtrace_mmap_params {
 	int		prot;
 	int		idx;
 	pid_t		tid;
+	bool		mmap_needed;
 	struct perf_cpu	cpu;
 };
 
@@ -490,7 +491,8 @@ void auxtrace_mmap_params__init(struct auxtrace_mmap_params *mp,
 				unsigned int auxtrace_pages,
 				bool auxtrace_overwrite);
 void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
-				   struct evlist *evlist, int idx,
+				   struct evlist *evlist,
+				   struct evsel *evsel, int idx,
 				   bool per_cpu);
 
 typedef int (*process_auxtrace_t)(struct perf_tool *tool,
@@ -863,7 +865,8 @@ void auxtrace_mmap_params__init(struct auxtrace_mmap_params *mp,
 				unsigned int auxtrace_pages,
 				bool auxtrace_overwrite);
 void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
-				   struct evlist *evlist, int idx,
+				   struct evlist *evlist,
+				   struct evsel *evsel, int idx,
 				   bool per_cpu);
 
 #define ITRACE_HELP ""
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7ae56b062f44..996bdc203616 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -747,15 +747,16 @@ static struct mmap *evlist__alloc_mmap(struct evlist *evlist,
 
 static void
 perf_evlist__mmap_cb_idx(struct perf_evlist *_evlist,
-			 struct perf_evsel *_evsel __maybe_unused,
+			 struct perf_evsel *_evsel,
 			 struct perf_mmap_param *_mp,
 			 int idx)
 {
 	struct evlist *evlist = container_of(_evlist, struct evlist, core);
 	struct mmap_params *mp = container_of(_mp, struct mmap_params, core);
 	bool per_cpu = !perf_cpu_map__empty(_evlist->user_requested_cpus);
+	struct evsel *evsel = container_of(_evsel, struct evsel, core);
 
-	auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, idx, per_cpu);
+	auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, evsel, idx, per_cpu);
 }
 
 static struct perf_mmap*
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 50502b4a7ca4..de59c4da852b 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -62,6 +62,7 @@ void __weak auxtrace_mmap_params__init(struct auxtrace_mmap_params *mp __maybe_u
 
 void __weak auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp __maybe_unused,
 					  struct evlist *evlist __maybe_unused,
+					  struct evsel *evsel __maybe_unused,
 					  int idx __maybe_unused,
 					  bool per_cpu __maybe_unused)
 {
-- 
2.25.1



* [PATCH RFC 11/21] perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (9 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 10/21] perf auxtrace: Add mmap_needed to auxtrace_mmap_params Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 12/21] perf evlist: Factor out evlist__dummy_event() Adrian Hunter
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Remove auxtrace_mmap_params__set_idx() per_cpu parameter because it isn't
needed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/auxtrace.c | 5 +++--
 tools/perf/util/auxtrace.h | 3 +--
 tools/perf/util/evlist.c   | 3 +--
 tools/perf/util/mmap.c     | 3 +--
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 9f01ce405971..246afe99a7fb 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -169,9 +169,10 @@ void auxtrace_mmap_params__init(struct auxtrace_mmap_params *mp,
 
 void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
 				   struct evlist *evlist,
-				   struct evsel *evsel, int idx,
-				   bool per_cpu)
+				   struct evsel *evsel, int idx)
 {
+	bool per_cpu = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
+
 	mp->mmap_needed = evsel->needs_auxtrace_mmap;
 
 	if (!mp->mmap_needed)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 4e715e2d9291..7931c34f749a 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -492,8 +492,7 @@ void auxtrace_mmap_params__init(struct auxtrace_mmap_params *mp,
 				bool auxtrace_overwrite);
 void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
 				   struct evlist *evlist,
-				   struct evsel *evsel, int idx,
-				   bool per_cpu);
+				   struct evsel *evsel, int idx);
 
 typedef int (*process_auxtrace_t)(struct perf_tool *tool,
 				  struct mmap *map,
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 996bdc203616..25eae096bdac 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -753,10 +753,9 @@ perf_evlist__mmap_cb_idx(struct perf_evlist *_evlist,
 {
 	struct evlist *evlist = container_of(_evlist, struct evlist, core);
 	struct mmap_params *mp = container_of(_mp, struct mmap_params, core);
-	bool per_cpu = !perf_cpu_map__empty(_evlist->user_requested_cpus);
 	struct evsel *evsel = container_of(_evsel, struct evsel, core);
 
-	auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, evsel, idx, per_cpu);
+	auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, evsel, idx);
 }
 
 static struct perf_mmap*
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index de59c4da852b..a4dff881be39 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -63,8 +63,7 @@ void __weak auxtrace_mmap_params__init(struct auxtrace_mmap_params *mp __maybe_u
 void __weak auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp __maybe_unused,
 					  struct evlist *evlist __maybe_unused,
 					  struct evsel *evsel __maybe_unused,
-					  int idx __maybe_unused,
-					  bool per_cpu __maybe_unused)
+					  int idx __maybe_unused)
 {
 }
 
-- 
2.25.1



* [PATCH RFC 12/21] perf evlist: Factor out evlist__dummy_event()
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (10 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 11/21] perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 13/21] perf evlist: Add evlist__add_system_wide_dummy() Adrian Hunter
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Factor out evlist__dummy_event() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/evlist.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 25eae096bdac..78c47cbafbc2 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -242,14 +242,20 @@ int __evlist__add_default(struct evlist *evlist, bool precise)
 	return 0;
 }
 
-int evlist__add_dummy(struct evlist *evlist)
+static struct evsel *evlist__dummy_event(struct evlist *evlist)
 {
 	struct perf_event_attr attr = {
 		.type	= PERF_TYPE_SOFTWARE,
 		.config = PERF_COUNT_SW_DUMMY,
 		.size	= sizeof(attr), /* to capture ABI version */
 	};
-	struct evsel *evsel = evsel__new_idx(&attr, evlist->core.nr_entries);
+
+	return evsel__new_idx(&attr, evlist->core.nr_entries);
+}
+
+int evlist__add_dummy(struct evlist *evlist)
+{
+	struct evsel *evsel = evlist__dummy_event(evlist);
 
 	if (evsel == NULL)
 		return -ENOMEM;
-- 
2.25.1



* [PATCH RFC 13/21] perf evlist: Add evlist__add_system_wide_dummy()
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (11 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 12/21] perf evlist: Factor out evlist__dummy_event() Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 14/21] perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke() Adrian Hunter
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Add evlist__add_system_wide_dummy() to enable creating a system-wide dummy
event that sets up the system-wide maps before map propagation.

For convenience, add evlist__add_aux_dummy() so that the logic can be used
whether or not the event needs to be system-wide.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/evlist.c | 40 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h |  5 +++++
 2 files changed, 45 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 78c47cbafbc2..58ea562ddbd2 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -264,6 +264,46 @@ int evlist__add_dummy(struct evlist *evlist)
 	return 0;
 }
 
+static void evlist__add_system_wide(struct evlist *evlist, struct evsel *evsel)
+{
+	evsel->core.system_wide = true;
+
+	/* All CPUs */
+	perf_cpu_map__put(evsel->core.own_cpus);
+	evsel->core.own_cpus = perf_cpu_map__new(NULL);
+	perf_cpu_map__put(evsel->core.cpus);
+	evsel->core.cpus = perf_cpu_map__get(evsel->core.own_cpus);
+
+	/* No threads */
+	perf_thread_map__put(evsel->core.threads);
+	evsel->core.threads = perf_thread_map__new_dummy();
+
+	evlist__add(evlist, evsel);
+}
+
+struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide)
+{
+	struct evsel *evsel = evlist__dummy_event(evlist);
+
+	if (!evsel)
+		return NULL;
+
+	evsel->core.attr.exclude_kernel = 1;
+	evsel->core.attr.exclude_guest = 1;
+	evsel->core.attr.exclude_hv = 1;
+	evsel->core.attr.freq = 0;
+	evsel->core.attr.sample_period = 1;
+	evsel->no_aux_samples = true;
+	evsel->name = strdup("dummy:u");
+
+	if (system_wide)
+		evlist__add_system_wide(evlist, evsel);
+	else
+		evlist__add(evlist, evsel);
+
+	return evsel;
+}
+
 static int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs)
 {
 	struct evsel *evsel, *n;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 4062f5aebfc1..dd1af114e033 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -114,6 +114,11 @@ int arch_evlist__add_default_attrs(struct evlist *evlist);
 struct evsel *arch_evlist__leader(struct list_head *list);
 
 int evlist__add_dummy(struct evlist *evlist);
+struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide);
+static inline struct evsel *evlist__add_system_wide_dummy(struct evlist *evlist)
+{
+	return evlist__add_aux_dummy(evlist, true);
+}
 
 int evlist__add_sb_event(struct evlist *evlist, struct perf_event_attr *attr,
 			 evsel__sb_cb_t cb, void *data);
-- 
2.25.1



* [PATCH RFC 14/21] perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke()
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (12 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 13/21] perf evlist: Add evlist__add_system_wide_dummy() Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 15/21] perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking Adrian Hunter
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Use evlist__add_system_wide_dummy() in record__config_text_poke() in
preparation for allowing system-wide events on all CPUs while the user
requested events are on only user requested CPUs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/builtin-record.c | 21 +++------------------
 1 file changed, 3 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 069825c48d40..83d2f2b5dcda 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -869,7 +869,6 @@ static int record__auxtrace_init(struct record *rec __maybe_unused)
 static int record__config_text_poke(struct evlist *evlist)
 {
 	struct evsel *evsel;
-	int err;
 
 	/* Nothing to do if text poke is already configured */
 	evlist__for_each_entry(evlist, evsel) {
@@ -877,27 +876,13 @@ static int record__config_text_poke(struct evlist *evlist)
 			return 0;
 	}
 
-	err = parse_events(evlist, "dummy:u", NULL);
-	if (err)
-		return err;
-
-	evsel = evlist__last(evlist);
+	evsel = evlist__add_system_wide_dummy(evlist);
+	if (!evsel)
+		return -ENOMEM;
 
-	evsel->core.attr.freq = 0;
-	evsel->core.attr.sample_period = 1;
 	evsel->core.attr.text_poke = 1;
 	evsel->core.attr.ksymbol = 1;
-
-	evsel->core.system_wide = true;
-	evsel->no_aux_samples = true;
 	evsel->immediate = true;
-
-	/* Text poke must be collected on all CPUs */
-	perf_cpu_map__put(evsel->core.own_cpus);
-	evsel->core.own_cpus = perf_cpu_map__new(NULL);
-	perf_cpu_map__put(evsel->core.cpus);
-	evsel->core.cpus = perf_cpu_map__get(evsel->core.own_cpus);
-
 	evsel__set_sample_bit(evsel, TIME);
 
 	return 0;
-- 
2.25.1



* [PATCH RFC 15/21] perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (13 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 14/21] perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke() Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 16/21] perf intel-pt: Track sideband system-wide when needed Adrian Hunter
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Use evlist__add_system_wide_dummy() for switch tracking in preparation for
allowing system-wide events on all CPUs while the user requested events are
on only user requested CPUs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/arch/x86/util/intel-pt.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 2eaac4638aab..e45d64dec57c 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -811,18 +811,11 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 			if (!cpu_wide && perf_can_record_cpu_wide()) {
 				struct evsel *switch_evsel;
 
-				err = parse_events(evlist, "dummy:u", NULL);
-				if (err)
-					return err;
+				switch_evsel = evlist__add_system_wide_dummy(evlist);
+				if (!switch_evsel)
+					return -ENOMEM;
 
-				switch_evsel = evlist__last(evlist);
-
-				switch_evsel->core.attr.freq = 0;
-				switch_evsel->core.attr.sample_period = 1;
 				switch_evsel->core.attr.context_switch = 1;
-
-				switch_evsel->core.system_wide = true;
-				switch_evsel->no_aux_samples = true;
 				switch_evsel->immediate = true;
 
 				evsel__set_sample_bit(switch_evsel, TID);
-- 
2.25.1



* [PATCH RFC 16/21] perf intel-pt: Track sideband system-wide when needed
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (14 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 15/21] perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-04-22 16:23 ` [PATCH RFC 17/21] perf tools: Allow all_cpus to be a superset of user_requested_cpus Adrian Hunter
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

User space tasks can migrate between CPUs, so when tracing selected CPUs,
sideband for all CPUs is still needed. This is in preparation for allowing
system-wide events on all CPUs while the user requested events are on only
user requested CPUs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/arch/x86/util/intel-pt.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index e45d64dec57c..62be78bc90b6 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -864,20 +864,22 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 
 	/* Add dummy event to keep tracking */
 	if (opts->full_auxtrace) {
+		bool need_system_wide_tracking;
 		struct evsel *tracking_evsel;
 
-		err = parse_events(evlist, "dummy:u", NULL);
-		if (err)
-			return err;
+		/*
+		 * User space tasks can migrate between CPUs, so when tracing
+		 * selected CPUs, sideband for all CPUs is still needed.
+		 */
+		need_system_wide_tracking = evlist->core.has_user_cpus &&
+					    !intel_pt_evsel->core.attr.exclude_user;
 
-		tracking_evsel = evlist__last(evlist);
+		tracking_evsel = evlist__add_aux_dummy(evlist, need_system_wide_tracking);
+		if (!tracking_evsel)
+			return -ENOMEM;
 
 		evlist__set_tracking_event(evlist, tracking_evsel);
 
-		tracking_evsel->core.attr.freq = 0;
-		tracking_evsel->core.attr.sample_period = 1;
-
-		tracking_evsel->no_aux_samples = true;
 		if (need_immediate)
 			tracking_evsel->immediate = true;
 
-- 
2.25.1



* [PATCH RFC 17/21] perf tools: Allow all_cpus to be a superset of user_requested_cpus
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (15 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 16/21] perf intel-pt: Track sideband system-wide when needed Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-05-03 17:41   ` Ian Rogers
  2022-04-22 16:23 ` [PATCH RFC 18/21] libperf evlist: Allow mixing per-thread and per-cpu mmaps Adrian Hunter
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

To support collection of system-wide events with user requested CPUs,
all_cpus must be a superset of user_requested_cpus.

For all_cpus to be a superset of user_requested_cpus, code that deals with
the CPUs of all events, rather than only the CPUs of the requested events,
must use all_cpus instead of user_requested_cpus.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evlist.c     | 12 ++++++------
 tools/perf/builtin-record.c | 18 ++++++++++++------
 tools/perf/util/auxtrace.c  |  2 +-
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index ad04da81c367..048b546f9444 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -294,7 +294,7 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
 
 int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
-	int nr_cpus = perf_cpu_map__nr(evlist->user_requested_cpus);
+	int nr_cpus = perf_cpu_map__nr(evlist->all_cpus);
 	int nr_threads = perf_thread_map__nr(evlist->threads);
 	int nfds = 0;
 	struct perf_evsel *evsel;
@@ -426,7 +426,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	       int idx, struct perf_mmap_param *mp, int cpu_idx,
 	       int thread, int *_output, int *_output_overwrite)
 {
-	struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->user_requested_cpus, cpu_idx);
+	struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->all_cpus, cpu_idx);
 	struct perf_evsel *evsel;
 	int revent;
 
@@ -536,7 +536,7 @@ mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	     struct perf_mmap_param *mp)
 {
 	int nr_threads = perf_thread_map__nr(evlist->threads);
-	int nr_cpus    = perf_cpu_map__nr(evlist->user_requested_cpus);
+	int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
 	int cpu, thread;
 
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
@@ -561,8 +561,8 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist)
 {
 	int nr_mmaps;
 
-	nr_mmaps = perf_cpu_map__nr(evlist->user_requested_cpus);
-	if (perf_cpu_map__empty(evlist->user_requested_cpus))
+	nr_mmaps = perf_cpu_map__nr(evlist->all_cpus);
+	if (perf_cpu_map__empty(evlist->all_cpus))
 		nr_mmaps = perf_thread_map__nr(evlist->threads);
 
 	return nr_mmaps;
@@ -573,7 +573,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
 			  struct perf_mmap_param *mp)
 {
 	struct perf_evsel *evsel;
-	const struct perf_cpu_map *cpus = evlist->user_requested_cpus;
+	const struct perf_cpu_map *cpus = evlist->all_cpus;
 
 	if (!ops || !ops->get || !ops->mmap)
 		return -EINVAL;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 83d2f2b5dcda..42127cfd9cc1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -967,14 +967,20 @@ static void record__thread_data_close_pipes(struct record_thread *thread_data)
 	}
 }
 
+static bool evlst__per_thread(struct evlist *evlist)
+{
+	return cpu_map__is_dummy(evlist->core.user_requested_cpus);
+}
+
 static int record__thread_data_init_maps(struct record_thread *thread_data, struct evlist *evlist)
 {
 	int m, tm, nr_mmaps = evlist->core.nr_mmaps;
 	struct mmap *mmap = evlist->mmap;
 	struct mmap *overwrite_mmap = evlist->overwrite_mmap;
-	struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
+	struct perf_cpu_map *cpus = evlist->core.all_cpus;
+	bool per_thread = evlst__per_thread(evlist);
 
-	if (cpu_map__is_dummy(cpus))
+	if (per_thread)
 		thread_data->nr_mmaps = nr_mmaps;
 	else
 		thread_data->nr_mmaps = bitmap_weight(thread_data->mask->maps.bits,
@@ -995,7 +1001,7 @@ static int record__thread_data_init_maps(struct record_thread *thread_data, stru
 		 thread_data->nr_mmaps, thread_data->maps, thread_data->overwrite_maps);
 
 	for (m = 0, tm = 0; m < nr_mmaps && tm < thread_data->nr_mmaps; m++) {
-		if (cpu_map__is_dummy(cpus) ||
+		if (per_thread ||
 		    test_bit(cpus->map[m].cpu, thread_data->mask->maps.bits)) {
 			if (thread_data->maps) {
 				thread_data->maps[tm] = &mmap[m];
@@ -1870,7 +1876,7 @@ static int record__synthesize(struct record *rec, bool tail)
 		return err;
 	}
 
-	err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->core.user_requested_cpus,
+	err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->core.all_cpus,
 					     process_synthesized_event, NULL);
 	if (err < 0) {
 		pr_err("Couldn't synthesize cpu map.\n");
@@ -3667,12 +3673,12 @@ static int record__init_thread_default_masks(struct record *rec, struct perf_cpu
 static int record__init_thread_masks(struct record *rec)
 {
 	int ret = 0;
-	struct perf_cpu_map *cpus = rec->evlist->core.user_requested_cpus;
+	struct perf_cpu_map *cpus = rec->evlist->core.all_cpus;
 
 	if (!record__threads_enabled(rec))
 		return record__init_thread_default_masks(rec, cpus);
 
-	if (cpu_map__is_dummy(cpus)) {
+	if (evlst__per_thread(rec->evlist)) {
 		pr_err("--per-thread option is mutually exclusive to parallel streaming mode.\n");
 		return -EINVAL;
 	}
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 246afe99a7fb..bac1f1eb95a7 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -181,7 +181,7 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
 	mp->idx = idx;
 
 	if (per_cpu) {
-		mp->cpu = perf_cpu_map__cpu(evlist->core.user_requested_cpus, idx);
+		mp->cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
 		if (evlist->core.threads)
 			mp->tid = perf_thread_map__pid(evlist->core.threads, 0);
 		else
-- 
2.25.1



* [PATCH RFC 18/21] libperf evlist: Allow mixing per-thread and per-cpu mmaps
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (16 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 17/21] perf tools: Allow all_cpus to be a superset of user_requested_cpus Adrian Hunter
@ 2022-04-22 16:23 ` Adrian Hunter
  2022-05-03 20:29   ` Namhyung Kim
  2022-04-22 16:24 ` [PATCH RFC 19/21] perf stat: Add requires_cpu flag for uncore Adrian Hunter
                   ` (3 subsequent siblings)
  21 siblings, 1 reply; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

mmap_per_evsel() will skip events that do not match the CPU, so all CPUs
can be iterated in any case.
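
The resulting mmap count can be illustrated with a small worked example.
This is a sketch only: sketch_nr_mmaps() is a hypothetical helper, and it
assumes that one of the two maps contributes a single dummy entry (the dummy
CPU in per-thread mode, or the dummy thread in per-cpu mode), which is what
the subtraction of 1 accounts for.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the arithmetic in perf_evlist__nr_mmaps()
 * after this patch: one mmap per CPU, one per thread, minus the single
 * dummy entry that is assumed to be present in one of the two maps.
 */
static int sketch_nr_mmaps(int nr_cpus, int nr_threads)
{
	/* Each map is assumed to hold at least one (possibly dummy) entry. */
	assert(nr_cpus >= 1 && nr_threads >= 1);

	return nr_cpus + nr_threads - 1;
}
```

Under that assumption, 4 CPUs with only the dummy thread give
4 + 1 - 1 = 4 mmaps, and 3 threads with only the dummy CPU give
1 + 3 - 1 = 3 mmaps, matching the counts the removed conditional produced.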

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evlist.c | 34 +++++-----------------------------
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 048b546f9444..37dfa9d936a7 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -508,29 +508,6 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	return 0;
 }
 
-static int
-mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
-		struct perf_mmap_param *mp)
-{
-	int thread;
-	int nr_threads = perf_thread_map__nr(evlist->threads);
-
-	for (thread = 0; thread < nr_threads; thread++) {
-		int output = -1;
-		int output_overwrite = -1;
-
-		if (mmap_per_evsel(evlist, ops, thread, mp, 0, thread,
-				   &output, &output_overwrite))
-			goto out_unmap;
-	}
-
-	return 0;
-
-out_unmap:
-	perf_evlist__munmap(evlist);
-	return -1;
-}
-
 static int
 mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	     struct perf_mmap_param *mp)
@@ -561,9 +538,12 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist)
 {
 	int nr_mmaps;
 
+	/* One for each CPU */
 	nr_mmaps = perf_cpu_map__nr(evlist->all_cpus);
-	if (perf_cpu_map__empty(evlist->all_cpus))
-		nr_mmaps = perf_thread_map__nr(evlist->threads);
+	/* One for each thread */
+	nr_mmaps += perf_thread_map__nr(evlist->threads);
+	/* Minus the dummy CPU or dummy thread */
+	nr_mmaps -= 1;
 
 	return nr_mmaps;
 }
@@ -573,7 +553,6 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
 			  struct perf_mmap_param *mp)
 {
 	struct perf_evsel *evsel;
-	const struct perf_cpu_map *cpus = evlist->all_cpus;
 
 	if (!ops || !ops->get || !ops->mmap)
 		return -EINVAL;
@@ -592,9 +571,6 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	if (perf_cpu_map__empty(cpus))
-		return mmap_per_thread(evlist, ops, mp);
-
 	return mmap_per_cpu(evlist, ops, mp);
 }
 
-- 
2.25.1



* [PATCH RFC 19/21] perf stat: Add requires_cpu flag for uncore
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (17 preceding siblings ...)
  2022-04-22 16:23 ` [PATCH RFC 18/21] libperf evlist: Allow mixing per-thread and per-cpu mmaps Adrian Hunter
@ 2022-04-22 16:24 ` Adrian Hunter
  2022-04-29 22:57   ` Namhyung Kim
  2022-04-22 16:24 ` [PATCH RFC 20/21] perf tools: Allow system-wide events to keep their own CPUs Adrian Hunter
                   ` (2 subsequent siblings)
  21 siblings, 1 reply; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Uncore events require a CPU, i.e. the CPU cannot be -1.

The evsel system_wide flag is intended for events that should be on every
CPU, which does not make sense for uncore events because uncore events do
not map one-to-one with CPUs.

These 2 requirements are not exactly the same, so introduce a new flag
'requires_cpu' for the uncore case.
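
How the two flags feed into map propagation can be sketched as a pure
decision function. This is a simplified model, not the libperf code:
sketch_pick_cpus() and the enum are hypothetical names, the boolean
arguments stand in for the evsel and evlist state tested in
__perf_evlist__propagate_maps(), and the final branch is assumed to leave
the event on its own CPUs.

```c
#include <stdbool.h>

/* Hypothetical result type: which CPU map the event ends up using. */
enum sketch_cpu_source {
	SKETCH_USER_REQUESTED_CPUS,
	SKETCH_OWN_CPUS,
};

static enum sketch_cpu_source
sketch_pick_cpus(bool has_own_cpus,    /* evsel->own_cpus != NULL */
		 bool has_user_cpus,   /* evlist->has_user_cpus */
		 bool system_wide,     /* evsel->system_wide */
		 bool requires_cpu,    /* evsel->requires_cpu */
		 bool user_cpus_empty) /* user_requested_cpus is empty */
{
	/* No CPUs of its own, or the user gave a CPU list: follow the user. */
	if (!has_own_cpus || has_user_cpus)
		return SKETCH_USER_REQUESTED_CPUS;

	/* Per-thread case: only events that can live without a CPU follow. */
	if (!system_wide && !requires_cpu && user_cpus_empty)
		return SKETCH_USER_REQUESTED_CPUS;

	/* Otherwise the event keeps its own CPUs (e.g. uncore). */
	return SKETCH_OWN_CPUS;
}
```

In this model an uncore event (requires_cpu set, system_wide clear) keeps
its own CPUs in the per-thread case, which is the behaviour that system_wide
alone provided before the flags were split.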

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evlist.c                 | 4 +++-
 tools/lib/perf/include/internal/evsel.h | 1 +
 tools/perf/builtin-stat.c               | 5 +----
 tools/perf/util/evsel.c                 | 1 +
 tools/perf/util/parse-events.c          | 2 +-
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 37dfa9d936a7..9fbcca3fc836 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -42,7 +42,9 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 	if (!evsel->own_cpus || evlist->has_user_cpus) {
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
-	} else if (!evsel->system_wide && perf_cpu_map__empty(evlist->user_requested_cpus)) {
+	} else if (!evsel->system_wide &&
+		   !evsel->requires_cpu &&
+		   perf_cpu_map__empty(evlist->user_requested_cpus)) {
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
 	} else if (evsel->cpus != evsel->own_cpus) {
diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
index cfc9ebd7968e..77fbb8b97e5c 100644
--- a/tools/lib/perf/include/internal/evsel.h
+++ b/tools/lib/perf/include/internal/evsel.h
@@ -50,6 +50,7 @@ struct perf_evsel {
 	/* parse modifier helper */
 	int			 nr_members;
 	bool			 system_wide;
+	bool			 requires_cpu;
 	int			 idx;
 };
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a96f106dc93a..8972ae546cfe 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -364,9 +364,6 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu_
 	if (!counter->supported)
 		return -ENOENT;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	for (thread = 0; thread < nthreads; thread++) {
 		struct perf_counts_values *count;
 
@@ -2224,7 +2221,7 @@ static void setup_system_wide(int forks)
 		struct evsel *counter;
 
 		evlist__for_each_entry(evsel_list, counter) {
-			if (!counter->core.system_wide &&
+			if (!counter->core.requires_cpu &&
 			    strcmp(counter->name, "duration_time")) {
 				return;
 			}
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2a1729e7aee4..81bbddb6fbc0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -382,6 +382,7 @@ struct evsel *evsel__clone(struct evsel *orig)
 	evsel->core.threads = perf_thread_map__get(orig->core.threads);
 	evsel->core.nr_members = orig->core.nr_members;
 	evsel->core.system_wide = orig->core.system_wide;
+	evsel->core.requires_cpu = orig->core.requires_cpu;
 
 	if (orig->name) {
 		evsel->name = strdup(orig->name);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index dd84fed698a3..783359017548 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -350,7 +350,7 @@ __add_event(struct list_head *list, int *idx,
 	(*idx)++;
 	evsel->core.cpus = cpus;
 	evsel->core.own_cpus = perf_cpu_map__get(cpus);
-	evsel->core.system_wide = pmu ? pmu->is_uncore : false;
+	evsel->core.requires_cpu = pmu ? pmu->is_uncore : false;
 	evsel->auto_merge_stats = auto_merge_stats;
 
 	if (name)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 20/21] perf tools: Allow system-wide events to keep their own CPUs
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (18 preceding siblings ...)
  2022-04-22 16:24 ` [PATCH RFC 19/21] perf stat: Add requires_cpu flag for uncore Adrian Hunter
@ 2022-04-22 16:24 ` Adrian Hunter
  2022-04-22 16:24 ` [PATCH RFC 21/21] perf tools: Allow system-wide events to keep their own threads Adrian Hunter
  2022-05-03 18:09 ` [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Ian Rogers
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Currently, user_requested_cpus supplants system-wide CPUs when the evlist
has_user_cpus. Change that so that system-wide events retain their own
CPUs and they are added to all_cpus.
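
For illustration only, the condition after this patch can be modelled as a
standalone predicate. This is a minimal sketch, not libperf code: the struct
propagate_inputs and the function takes_user_requested_cpus() are hypothetical
names that only mirror the booleans tested in __perf_evlist__propagate_maps().

```c
#include <stdbool.h>

/* Hypothetical model of the inputs tested by __perf_evlist__propagate_maps(). */
struct propagate_inputs {
	bool has_own_cpus;    /* evsel->own_cpus != NULL */
	bool system_wide;     /* evsel->system_wide */
	bool requires_cpu;    /* evsel->requires_cpu */
	bool has_user_cpus;   /* evlist->has_user_cpus */
	bool user_cpus_empty; /* perf_cpu_map__empty(evlist->user_requested_cpus) */
};

/*
 * Returns true when evsel->cpus would be replaced by user_requested_cpus,
 * false when the evsel keeps (or is reset to) its own CPUs.
 */
static bool takes_user_requested_cpus(const struct propagate_inputs *in)
{
	return !in->has_own_cpus ||
	       (!in->system_wide && in->has_user_cpus) ||
	       (!in->system_wide && !in->requires_cpu && in->user_cpus_empty);
}
```

With these assumptions, a system-wide evsel that has its own CPUs no longer
takes user_requested_cpus, even when the evlist has_user_cpus.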

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evlist.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 9fbcca3fc836..51fd550e326f 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -39,12 +39,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 	 * We already have cpus for evsel (via PMU sysfs) so
 	 * keep it, if there's no target cpu list defined.
 	 */
-	if (!evsel->own_cpus || evlist->has_user_cpus) {
-		perf_cpu_map__put(evsel->cpus);
-		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
-	} else if (!evsel->system_wide &&
-		   !evsel->requires_cpu &&
-		   perf_cpu_map__empty(evlist->user_requested_cpus)) {
+	if (!evsel->own_cpus ||
+	    (!evsel->system_wide && evlist->has_user_cpus) ||
+	    (!evsel->system_wide &&
+	     !evsel->requires_cpu &&
+	     perf_cpu_map__empty(evlist->user_requested_cpus))) {
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
 	} else if (evsel->cpus != evsel->own_cpus) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH RFC 21/21] perf tools: Allow system-wide events to keep their own threads
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (19 preceding siblings ...)
  2022-04-22 16:24 ` [PATCH RFC 20/21] perf tools: Allow system-wide events to keep their own CPUs Adrian Hunter
@ 2022-04-22 16:24 ` Adrian Hunter
  2022-05-03 18:09 ` [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Ian Rogers
  21 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-22 16:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

System-wide events do not have threads, so do not propagate threads to
them.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/lib/perf/evlist.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 51fd550e326f..076a27650491 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -51,8 +51,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 		evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
 	}
 
-	perf_thread_map__put(evsel->threads);
-	evsel->threads = perf_thread_map__get(evlist->threads);
+	if (!evsel->system_wide) {
+		perf_thread_map__put(evsel->threads);
+		evsel->threads = perf_thread_map__get(evlist->threads);
+	}
+
 	evlist->all_cpus = perf_cpu_map__merge(evlist->all_cpus, evsel->cpus);
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 01/21] libperf evsel: Factor out perf_evsel__ioctl()
  2022-04-22 16:23 ` [PATCH RFC 01/21] libperf evsel: Factor out perf_evsel__ioctl() Adrian Hunter
@ 2022-04-22 19:05   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 35+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-04-22 19:05 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Jiri Olsa, Ian Rogers, Alexey Bayduraev, Namhyung Kim, Leo Yan,
	linux-kernel

Em Fri, Apr 22, 2022 at 07:23:42PM +0300, Adrian Hunter escreveu:
> Factor out perf_evsel__ioctl() so it can be reused.

Cherry picking this one as I look at the patchset.

- Arnaldo
 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/lib/perf/evsel.c | 19 ++++++++++++-------
>  1 file changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 210ea7c06ce8..20ae9f5f8b30 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -328,6 +328,17 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int thread,
>  	return 0;
>  }
>  
> +static int perf_evsel__ioctl(struct perf_evsel *evsel, int ioc, void *arg,
> +			     int cpu_map_idx, int thread)
> +{
> +	int *fd = FD(evsel, cpu_map_idx, thread);
> +
> +	if (fd == NULL || *fd < 0)
> +		return -1;
> +
> +	return ioctl(*fd, ioc, arg);
> +}
> +
>  static int perf_evsel__run_ioctl(struct perf_evsel *evsel,
>  				 int ioc,  void *arg,
>  				 int cpu_map_idx)
> @@ -335,13 +346,7 @@ static int perf_evsel__run_ioctl(struct perf_evsel *evsel,
>  	int thread;
>  
>  	for (thread = 0; thread < xyarray__max_y(evsel->fd); thread++) {
> -		int err;
> -		int *fd = FD(evsel, cpu_map_idx, thread);
> -
> -		if (fd == NULL || *fd < 0)
> -			return -1;
> -
> -		err = ioctl(*fd, ioc, arg);
> +		int err = perf_evsel__ioctl(evsel, ioc, arg, cpu_map_idx, thread);
>  
>  		if (err)
>  			return err;
> -- 
> 2.25.1

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread()
  2022-04-22 16:23 ` [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread() Adrian Hunter
@ 2022-04-27 21:48   ` Namhyung Kim
  2022-04-28  4:15     ` Adrian Hunter
  2022-05-03 16:45   ` Ian Rogers
  1 sibling, 1 reply; 35+ messages in thread
From: Namhyung Kim @ 2022-04-27 21:48 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers,
	Alexey Bayduraev, Leo Yan, linux-kernel

Hi Adrian,

On Fri, Apr 22, 2022 at 9:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> Add perf_evsel__enable_thread() as a counterpart to
> perf_evsel__enable_cpu(), to enable all events for a thread.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/lib/perf/evsel.c              | 10 ++++++++++
>  tools/lib/perf/include/perf/evsel.h |  1 +
>  2 files changed, 11 insertions(+)
>
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 20ae9f5f8b30..2a1f07f877be 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -360,6 +360,16 @@ int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx)
>         return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, cpu_map_idx);
>  }
>
> +int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread)
> +{
> +       int err = 0;
> +       int i;
> +
> +       for (i = 0; i < xyarray__max_x(evsel->fd) && !err; i++)
> +               err = perf_evsel__ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, i, thread);

You might want to break the loop when it fails.

Thanks,
Namhyung


> +       return err;
> +}
> +
>  int perf_evsel__enable(struct perf_evsel *evsel)
>  {
>         int i;
> diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
> index 2a9516b42d15..699c0ed97d34 100644
> --- a/tools/lib/perf/include/perf/evsel.h
> +++ b/tools/lib/perf/include/perf/evsel.h
> @@ -36,6 +36,7 @@ LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int
>                                  struct perf_counts_values *count);
>  LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
>  LIBPERF_API int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
> +LIBPERF_API int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread);
>  LIBPERF_API int perf_evsel__disable(struct perf_evsel *evsel);
>  LIBPERF_API int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
>  LIBPERF_API struct perf_cpu_map *perf_evsel__cpus(struct perf_evsel *evsel);
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 05/21] perf auxtrace: Do not mix up mmap idx
  2022-04-22 16:23 ` [PATCH RFC 05/21] perf auxtrace: Do not mix up mmap idx Adrian Hunter
@ 2022-04-27 21:54   ` Namhyung Kim
  2022-04-28  4:29     ` Adrian Hunter
  0 siblings, 1 reply; 35+ messages in thread
From: Namhyung Kim @ 2022-04-27 21:54 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers,
	Alexey Bayduraev, Leo Yan, linux-kernel

On Fri, Apr 22, 2022 at 9:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> The idx is with respect to evlist not evsel. That hasn't mattered because
> they are the same at present. Prepare for that not being the case, which it
> won't be when sideband tracking events are allowed on all CPUs even when
> auxtrace is limited to selected CPUs.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/perf/util/auxtrace.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 10936a38031f..2d015b0be549 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -640,8 +640,14 @@ static int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel,
>  {
>         bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
>
> -       if (per_cpu_mmaps)
> -               return perf_evsel__enable_cpu(&evsel->core, idx);
> +       if (per_cpu_mmaps) {
> +               struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
> +               int cpu = perf_cpu_map__idx(evsel->core.cpus, evlist_cpu);

While it can be thought of as an index from the function name,
it'd be nice if we could be explicit like cpu_map_idx.
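
To make the distinction concrete, here is a rough standalone sketch of the
translation the patch performs: an index into the evlist's CPU list is first
turned into a CPU number, which is then looked up in the evsel's CPU list.
Plain int arrays stand in for struct perf_cpu_map, and both function names
are hypothetical, not part of libperf.

```c
/* Hypothetical helper: index of 'cpu' in 'cpus', or -1 if it is absent. */
static int cpu_map_idx_of(const int *cpus, int nr, int cpu)
{
	int i;

	for (i = 0; i < nr; i++) {
		if (cpus[i] == cpu)
			return i;
	}
	return -1;
}

/*
 * Hypothetical helper: translate an index into the evlist CPU list into an
 * index into the evsel CPU list. Returns -1 when the evsel has no event on
 * that CPU or when evlist_idx is out of range.
 */
static int evlist_idx_to_cpu_map_idx(const int *all_cpus, int nr_all,
				     const int *evsel_cpus, int nr_evsel,
				     int evlist_idx)
{
	if (evlist_idx < 0 || evlist_idx >= nr_all)
		return -1;

	return cpu_map_idx_of(evsel_cpus, nr_evsel, all_cpus[evlist_idx]);
}
```

The result is an index, not a CPU number, which is why a name such as
cpu_map_idx reads more clearly than cpu.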

Thanks,
Namhyung

> +
> +               if (cpu == -1)
> +                       return -EINVAL;
> +               return perf_evsel__enable_cpu(&evsel->core, cpu);
> +       }
>
>         return perf_evsel__enable_thread(&evsel->core, idx);
>  }
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread()
  2022-04-27 21:48   ` Namhyung Kim
@ 2022-04-28  4:15     ` Adrian Hunter
  2022-04-28 23:50       ` Namhyung Kim
  0 siblings, 1 reply; 35+ messages in thread
From: Adrian Hunter @ 2022-04-28  4:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers,
	Alexey Bayduraev, Leo Yan, linux-kernel

On 28/04/22 00:48, Namhyung Kim wrote:
> Hi Adrian,
> 
> On Fri, Apr 22, 2022 at 9:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>> Add perf_evsel__enable_thread() as a counterpart to
>> perf_evsel__enable_cpu(), to enable all events for a thread.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  tools/lib/perf/evsel.c              | 10 ++++++++++
>>  tools/lib/perf/include/perf/evsel.h |  1 +
>>  2 files changed, 11 insertions(+)
>>
>> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
>> index 20ae9f5f8b30..2a1f07f877be 100644
>> --- a/tools/lib/perf/evsel.c
>> +++ b/tools/lib/perf/evsel.c
>> @@ -360,6 +360,16 @@ int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx)
>>         return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, cpu_map_idx);
>>  }
>>
>> +int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread)
>> +{
>> +       int err = 0;
>> +       int i;
>> +
>> +       for (i = 0; i < xyarray__max_x(evsel->fd) && !err; i++)
>> +               err = perf_evsel__ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, i, thread);
> 
> You might want to break the loop when it fails.

Thanks for looking at this.  It should break because of " && !err".
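
As a small standalone illustration of that loop shape (a sketch only:
enable_all() and its results array are hypothetical stand-ins for the
per-index perf_evsel__ioctl() calls):

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the enable loop: results[i] plays the role of
 * the value returned by the ioctl for index i, and *calls counts how many
 * indexes were actually visited.
 */
static int enable_all(const int *results, size_t n, size_t *calls)
{
	int err = 0;
	size_t i;

	*calls = 0;
	/* "&& !err" ends the loop right after the first non-zero result. */
	for (i = 0; i < n && !err; i++) {
		err = results[i];
		(*calls)++;
	}

	return err;
}
```

If the second of three calls fails, the third is never attempted and the
first error is what gets returned.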

> 
> Thanks,
> Namhyung
> 
> 
>> +       return err;
>> +}
>> +
>>  int perf_evsel__enable(struct perf_evsel *evsel)
>>  {
>>         int i;
>> diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
>> index 2a9516b42d15..699c0ed97d34 100644
>> --- a/tools/lib/perf/include/perf/evsel.h
>> +++ b/tools/lib/perf/include/perf/evsel.h
>> @@ -36,6 +36,7 @@ LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int
>>                                  struct perf_counts_values *count);
>>  LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
>>  LIBPERF_API int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
>> +LIBPERF_API int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread);
>>  LIBPERF_API int perf_evsel__disable(struct perf_evsel *evsel);
>>  LIBPERF_API int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
>>  LIBPERF_API struct perf_cpu_map *perf_evsel__cpus(struct perf_evsel *evsel);
>> --
>> 2.25.1
>>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 05/21] perf auxtrace: Do not mix up mmap idx
  2022-04-27 21:54   ` Namhyung Kim
@ 2022-04-28  4:29     ` Adrian Hunter
  0 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-04-28  4:29 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers,
	Alexey Bayduraev, Leo Yan, linux-kernel

On 28/04/22 00:54, Namhyung Kim wrote:
> On Fri, Apr 22, 2022 at 9:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>> The idx is with respect to evlist not evsel. That hasn't mattered because
>> they are the same at present. Prepare for that not being the case, which it
>> won't be when sideband tracking events are allowed on all CPUs even when
>> auxtrace is limited to selected CPUs.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  tools/perf/util/auxtrace.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
>> index 10936a38031f..2d015b0be549 100644
>> --- a/tools/perf/util/auxtrace.c
>> +++ b/tools/perf/util/auxtrace.c
>> @@ -640,8 +640,14 @@ static int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel,
>>  {
>>         bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
>>
>> -       if (per_cpu_mmaps)
>> -               return perf_evsel__enable_cpu(&evsel->core, idx);
>> +       if (per_cpu_mmaps) {
>> +               struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
>> +               int cpu = perf_cpu_map__idx(evsel->core.cpus, evlist_cpu);
> 
> While it can be thought of as an index from the function name,
> it'd be nice if we could be explicit like cpu_map_idx.

Ok

> 
> Thanks,
> Namhyung
> 
>> +
>> +               if (cpu == -1)
>> +                       return -EINVAL;
>> +               return perf_evsel__enable_cpu(&evsel->core, cpu);
>> +       }
>>
>>         return perf_evsel__enable_thread(&evsel->core, idx);
>>  }
>> --
>> 2.25.1
>>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread()
  2022-04-28  4:15     ` Adrian Hunter
@ 2022-04-28 23:50       ` Namhyung Kim
  0 siblings, 0 replies; 35+ messages in thread
From: Namhyung Kim @ 2022-04-28 23:50 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers,
	Alexey Bayduraev, Leo Yan, linux-kernel

On Wed, Apr 27, 2022 at 9:15 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 28/04/22 00:48, Namhyung Kim wrote:
> > Hi Adrian,
> >
> > On Fri, Apr 22, 2022 at 9:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>
> >> Add perf_evsel__enable_thread() as a counterpart to
> >> perf_evsel__enable_cpu(), to enable all events for a thread.
> >>
> >> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> >> ---
> >>  tools/lib/perf/evsel.c              | 10 ++++++++++
> >>  tools/lib/perf/include/perf/evsel.h |  1 +
> >>  2 files changed, 11 insertions(+)
> >>
> >> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> >> index 20ae9f5f8b30..2a1f07f877be 100644
> >> --- a/tools/lib/perf/evsel.c
> >> +++ b/tools/lib/perf/evsel.c
> >> @@ -360,6 +360,16 @@ int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx)
> >>         return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, cpu_map_idx);
> >>  }
> >>
> >> +int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread)
> >> +{
> >> +       int err = 0;
> >> +       int i;
> >> +
> >> +       for (i = 0; i < xyarray__max_x(evsel->fd) && !err; i++)
> >> +               err = perf_evsel__ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, i, thread);
> >
> > You might want to break the loop when it fails.
>
> Thanks for looking at this.  It should break because of " && !err".

Oh, I missed that part, sorry!

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 19/21] perf stat: Add requires_cpu flag for uncore
  2022-04-22 16:24 ` [PATCH RFC 19/21] perf stat: Add requires_cpu flag for uncore Adrian Hunter
@ 2022-04-29 22:57   ` Namhyung Kim
  2022-04-30  1:10     ` Ian Rogers
  0 siblings, 1 reply; 35+ messages in thread
From: Namhyung Kim @ 2022-04-29 22:57 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers,
	Alexey Bayduraev, Leo Yan, linux-kernel

On Fri, Apr 22, 2022 at 9:25 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> Uncore events require a CPU i.e. it cannot be -1.
>
> The evsel system_wide flag is intended for events that should be on every
> CPU, which does not make sense for uncore events because uncore events do
> not map one-to-one with CPUs.
>
> These 2 requirements are not exactly the same, so introduce a new flag
> 'requires_cpu' for the uncore case.

Yeah, I like this change!  I was often confused by the two different things.

Thanks,
Namhyung

>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/lib/perf/evlist.c                 | 4 +++-
>  tools/lib/perf/include/internal/evsel.h | 1 +
>  tools/perf/builtin-stat.c               | 5 +----
>  tools/perf/util/evsel.c                 | 1 +
>  tools/perf/util/parse-events.c          | 2 +-
>  5 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 37dfa9d936a7..9fbcca3fc836 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -42,7 +42,9 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>         if (!evsel->own_cpus || evlist->has_user_cpus) {
>                 perf_cpu_map__put(evsel->cpus);
>                 evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
> -       } else if (!evsel->system_wide && perf_cpu_map__empty(evlist->user_requested_cpus)) {
> +       } else if (!evsel->system_wide &&
> +                  !evsel->requires_cpu &&
> +                  perf_cpu_map__empty(evlist->user_requested_cpus)) {
>                 perf_cpu_map__put(evsel->cpus);
>                 evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
>         } else if (evsel->cpus != evsel->own_cpus) {
> diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
> index cfc9ebd7968e..77fbb8b97e5c 100644
> --- a/tools/lib/perf/include/internal/evsel.h
> +++ b/tools/lib/perf/include/internal/evsel.h
> @@ -50,6 +50,7 @@ struct perf_evsel {
>         /* parse modifier helper */
>         int                      nr_members;
>         bool                     system_wide;
> +       bool                     requires_cpu;
>         int                      idx;
>  };
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index a96f106dc93a..8972ae546cfe 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -364,9 +364,6 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu_
>         if (!counter->supported)
>                 return -ENOENT;
>
> -       if (counter->core.system_wide)
> -               nthreads = 1;
> -
>         for (thread = 0; thread < nthreads; thread++) {
>                 struct perf_counts_values *count;
>
> @@ -2224,7 +2221,7 @@ static void setup_system_wide(int forks)
>                 struct evsel *counter;
>
>                 evlist__for_each_entry(evsel_list, counter) {
> -                       if (!counter->core.system_wide &&
> +                       if (!counter->core.requires_cpu &&
>                             strcmp(counter->name, "duration_time")) {
>                                 return;
>                         }
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 2a1729e7aee4..81bbddb6fbc0 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -382,6 +382,7 @@ struct evsel *evsel__clone(struct evsel *orig)
>         evsel->core.threads = perf_thread_map__get(orig->core.threads);
>         evsel->core.nr_members = orig->core.nr_members;
>         evsel->core.system_wide = orig->core.system_wide;
> +       evsel->core.requires_cpu = orig->core.requires_cpu;
>
>         if (orig->name) {
>                 evsel->name = strdup(orig->name);
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index dd84fed698a3..783359017548 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -350,7 +350,7 @@ __add_event(struct list_head *list, int *idx,
>         (*idx)++;
>         evsel->core.cpus = cpus;
>         evsel->core.own_cpus = perf_cpu_map__get(cpus);
> -       evsel->core.system_wide = pmu ? pmu->is_uncore : false;
> +       evsel->core.requires_cpu = pmu ? pmu->is_uncore : false;
>         evsel->auto_merge_stats = auto_merge_stats;
>
>         if (name)
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 19/21] perf stat: Add requires_cpu flag for uncore
  2022-04-29 22:57   ` Namhyung Kim
@ 2022-04-30  1:10     ` Ian Rogers
  0 siblings, 0 replies; 35+ messages in thread
From: Ian Rogers @ 2022-04-30  1:10 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Jiri Olsa,
	Alexey Bayduraev, Leo Yan, linux-kernel

On Fri, Apr 29, 2022 at 3:58 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Fri, Apr 22, 2022 at 9:25 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >
> > Uncore events require a CPU i.e. it cannot be -1.
> >
> > The evsel system_wide flag is intended for events that should be on every
> > CPU, which does not make sense for uncore events because uncore events do
> > not map one-to-one with CPUs.
> >
> > These 2 requirements are not exactly the same, so introduce a new flag
> > 'requires_cpu' for the uncore case.
>
> Yeah, I like this change!  I was often confused by the two different things.
>
> Thanks,
> Namhyung
>
> >
> > Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> > ---
> >  tools/lib/perf/evlist.c                 | 4 +++-
> >  tools/lib/perf/include/internal/evsel.h | 1 +
> >  tools/perf/builtin-stat.c               | 5 +----
> >  tools/perf/util/evsel.c                 | 1 +
> >  tools/perf/util/parse-events.c          | 2 +-
> >  5 files changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> > index 37dfa9d936a7..9fbcca3fc836 100644
> > --- a/tools/lib/perf/evlist.c
> > +++ b/tools/lib/perf/evlist.c
> > @@ -42,7 +42,9 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> >         if (!evsel->own_cpus || evlist->has_user_cpus) {
> >                 perf_cpu_map__put(evsel->cpus);
> >                 evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
> > -       } else if (!evsel->system_wide && perf_cpu_map__empty(evlist->user_requested_cpus)) {
> > +       } else if (!evsel->system_wide &&
> > +                  !evsel->requires_cpu &&
> > +                  perf_cpu_map__empty(evlist->user_requested_cpus)) {
> >                 perf_cpu_map__put(evsel->cpus);
> >                 evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
> >         } else if (evsel->cpus != evsel->own_cpus) {
> > diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
> > index cfc9ebd7968e..77fbb8b97e5c 100644
> > --- a/tools/lib/perf/include/internal/evsel.h
> > +++ b/tools/lib/perf/include/internal/evsel.h
> > @@ -50,6 +50,7 @@ struct perf_evsel {
> >         /* parse modifier helper */
> >         int                      nr_members;
> >         bool                     system_wide;

Nice cleanup! Could we add some comments here as to what the booleans
mean/imply?

Thanks,
Ian

> > +       bool                     requires_cpu;
> >         int                      idx;
> >  };
> >
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index a96f106dc93a..8972ae546cfe 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -364,9 +364,6 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu_
> >         if (!counter->supported)
> >                 return -ENOENT;
> >
> > -       if (counter->core.system_wide)
> > -               nthreads = 1;
> > -
> >         for (thread = 0; thread < nthreads; thread++) {
> >                 struct perf_counts_values *count;
> >
> > @@ -2224,7 +2221,7 @@ static void setup_system_wide(int forks)
> >                 struct evsel *counter;
> >
> >                 evlist__for_each_entry(evsel_list, counter) {
> > -                       if (!counter->core.system_wide &&
> > +                       if (!counter->core.requires_cpu &&
> >                             strcmp(counter->name, "duration_time")) {
> >                                 return;
> >                         }
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 2a1729e7aee4..81bbddb6fbc0 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -382,6 +382,7 @@ struct evsel *evsel__clone(struct evsel *orig)
> >         evsel->core.threads = perf_thread_map__get(orig->core.threads);
> >         evsel->core.nr_members = orig->core.nr_members;
> >         evsel->core.system_wide = orig->core.system_wide;
> > +       evsel->core.requires_cpu = orig->core.requires_cpu;
> >
> >         if (orig->name) {
> >                 evsel->name = strdup(orig->name);
> > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> > index dd84fed698a3..783359017548 100644
> > --- a/tools/perf/util/parse-events.c
> > +++ b/tools/perf/util/parse-events.c
> > @@ -350,7 +350,7 @@ __add_event(struct list_head *list, int *idx,
> >         (*idx)++;
> >         evsel->core.cpus = cpus;
> >         evsel->core.own_cpus = perf_cpu_map__get(cpus);
> > -       evsel->core.system_wide = pmu ? pmu->is_uncore : false;
> > +       evsel->core.requires_cpu = pmu ? pmu->is_uncore : false;
> >         evsel->auto_merge_stats = auto_merge_stats;
> >
> >         if (name)
> > --
> > 2.25.1
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread()
  2022-04-22 16:23 ` [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread() Adrian Hunter
  2022-04-27 21:48   ` Namhyung Kim
@ 2022-05-03 16:45   ` Ian Rogers
  1 sibling, 0 replies; 35+ messages in thread
From: Ian Rogers @ 2022-05-03 16:45 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Alexey Bayduraev,
	Namhyung Kim, Leo Yan, linux-kernel

On Fri, Apr 22, 2022 at 9:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> Add perf_evsel__enable_thread() as a counterpart to
> perf_evsel__enable_cpu(), to enable all events for a thread.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/lib/perf/evsel.c              | 10 ++++++++++
>  tools/lib/perf/include/perf/evsel.h |  1 +
>  2 files changed, 11 insertions(+)
>
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 20ae9f5f8b30..2a1f07f877be 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -360,6 +360,16 @@ int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx)
>         return perf_evsel__run_ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, cpu_map_idx);
>  }
>
> +int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread)
> +{
> +       int err = 0;
> +       int i;
> +
> +       for (i = 0; i < xyarray__max_x(evsel->fd) && !err; i++)
> +               err = perf_evsel__ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, i, thread);

Looking at the argument names to perf_evsel__ioctl, i is the
cpu_map_idx. Would it be more intention revealing here to do:

perf_cpu_map__for_each_cpu(cpu, idx, evsel->cpus) {
   err = perf_evsel__ioctl(evsel, PERF_EVENT_IOC_ENABLE, NULL, idx, thread);
   if (err)
     break;
}

or perhaps:

for (idx = 0; idx < perf_cpu_map__nr(evsel->cpus) && !err; idx++)

Thanks,
Ian

> +       return err;
> +}
> +
>  int perf_evsel__enable(struct perf_evsel *evsel)
>  {
>         int i;
> diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
> index 2a9516b42d15..699c0ed97d34 100644
> --- a/tools/lib/perf/include/perf/evsel.h
> +++ b/tools/lib/perf/include/perf/evsel.h
> @@ -36,6 +36,7 @@ LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int
>                                  struct perf_counts_values *count);
>  LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
>  LIBPERF_API int perf_evsel__enable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
> +LIBPERF_API int perf_evsel__enable_thread(struct perf_evsel *evsel, int thread);
>  LIBPERF_API int perf_evsel__disable(struct perf_evsel *evsel);
>  LIBPERF_API int perf_evsel__disable_cpu(struct perf_evsel *evsel, int cpu_map_idx);
>  LIBPERF_API struct perf_cpu_map *perf_evsel__cpus(struct perf_evsel *evsel);
> --
> 2.25.1
>
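
For readers following the review, the loop under discussion can be modelled
outside of libperf. This is only a sketch under stated assumptions: struct
fd_table, fake_enable() and enable_thread() are hypothetical names standing in
for the evsel's fd array, the PERF_EVENT_IOC_ENABLE ioctl and
perf_evsel__enable_thread(); none of them is libperf API. It shows the point
Ian raises, namely that the loop variable is an index into the CPU dimension
while the thread stays fixed, and that the loop stops at the first error.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an evsel's fd table: one fd per
 * (cpu_map_idx, thread) pair, stored row-major.
 */
struct fd_table {
	int nr_cpu_idx;		/* size of the CPU-index dimension */
	int nr_threads;		/* size of the thread dimension */
	const int *fds;		/* fds[cpu_map_idx * nr_threads + thread] */
};

/* Hypothetical stand-in for the enable ioctl: fails on an invalid fd. */
static int fake_enable(int fd)
{
	return fd < 0 ? -1 : 0;
}

/*
 * Enable every event of one thread by walking the CPU-index dimension.
 * Stops at the first failure and returns its error code; *nr_enabled
 * counts the fds that were enabled before that point.
 */
static int enable_thread(const struct fd_table *t, int thread, int *nr_enabled)
{
	int err = 0;
	int idx;

	assert(t != NULL && nr_enabled != NULL);
	assert(thread >= 0 && thread < t->nr_threads);

	for (idx = 0; idx < t->nr_cpu_idx && !err; idx++) {
		err = fake_enable(t->fds[idx * t->nr_threads + thread]);
		if (!err)
			(*nr_enabled)++;
	}
	return err;
}
```

Naming the loop variable idx rather than i makes it harder to confuse a CPU
map index with a CPU number, which is the readability concern raised above.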

* Re: [PATCH RFC 17/21] perf tools: Allow all_cpus to be a superset of user_requested_cpus
  2022-04-22 16:23 ` [PATCH RFC 17/21] perf tools: Allow all_cpus to be a superset of user_requested_cpus Adrian Hunter
@ 2022-05-03 17:41   ` Ian Rogers
  0 siblings, 0 replies; 35+ messages in thread
From: Ian Rogers @ 2022-05-03 17:41 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Alexey Bayduraev,
	Namhyung Kim, Leo Yan, linux-kernel

On Fri, Apr 22, 2022 at 9:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> To support collection of system-wide events with user requested CPUs,
> all_cpus must be a superset of user_requested_cpus.
>
> In order to support all_cpus to be a superset of user_requested_cpus,
> all_cpus must be used instead of user_requested_cpus when dealing with CPUs
> of all events instead of CPUs of requested events.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Acked-by: Ian Rogers <irogers@google.com>

> ---
>  tools/lib/perf/evlist.c     | 12 ++++++------
>  tools/perf/builtin-record.c | 18 ++++++++++++------
>  tools/perf/util/auxtrace.c  |  2 +-
>  3 files changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index ad04da81c367..048b546f9444 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -294,7 +294,7 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
>
>  int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
>  {
> -       int nr_cpus = perf_cpu_map__nr(evlist->user_requested_cpus);
> +       int nr_cpus = perf_cpu_map__nr(evlist->all_cpus);
>         int nr_threads = perf_thread_map__nr(evlist->threads);
>         int nfds = 0;
>         struct perf_evsel *evsel;
> @@ -426,7 +426,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>                int idx, struct perf_mmap_param *mp, int cpu_idx,
>                int thread, int *_output, int *_output_overwrite)
>  {
> -       struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->user_requested_cpus, cpu_idx);
> +       struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->all_cpus, cpu_idx);
>         struct perf_evsel *evsel;
>         int revent;
>
> @@ -536,7 +536,7 @@ mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>              struct perf_mmap_param *mp)
>  {
>         int nr_threads = perf_thread_map__nr(evlist->threads);
> -       int nr_cpus    = perf_cpu_map__nr(evlist->user_requested_cpus);
> +       int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
>         int cpu, thread;
>
>         for (cpu = 0; cpu < nr_cpus; cpu++) {
> @@ -561,8 +561,8 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist)
>  {
>         int nr_mmaps;
>
> -       nr_mmaps = perf_cpu_map__nr(evlist->user_requested_cpus);
> -       if (perf_cpu_map__empty(evlist->user_requested_cpus))
> +       nr_mmaps = perf_cpu_map__nr(evlist->all_cpus);
> +       if (perf_cpu_map__empty(evlist->all_cpus))
>                 nr_mmaps = perf_thread_map__nr(evlist->threads);
>
>         return nr_mmaps;
> @@ -573,7 +573,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
>                           struct perf_mmap_param *mp)
>  {
>         struct perf_evsel *evsel;
> -       const struct perf_cpu_map *cpus = evlist->user_requested_cpus;
> +       const struct perf_cpu_map *cpus = evlist->all_cpus;
>
>         if (!ops || !ops->get || !ops->mmap)
>                 return -EINVAL;
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 83d2f2b5dcda..42127cfd9cc1 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -967,14 +967,20 @@ static void record__thread_data_close_pipes(struct record_thread *thread_data)
>         }
>  }
>
> +static bool evlst__per_thread(struct evlist *evlist)
> +{
> +       return cpu_map__is_dummy(evlist->core.user_requested_cpus);
> +}
> +

This is much clearer than the previous code. Could we add a comment as
to why dummy implies per-thread? What would empty imply?

Just to note, cpu_map__is_dummy adds to the confusion over whether a
dummy can appear merged into a map:
 "Events associated with a pid, rather than a CPU, use a single dummy
map with an entry of -1"
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/cpumap.h?h=perf/core#n55
If the dummy can appear in any cpu map then we should rephrase this
comment and possibly turn this into a has_dummy_cpu function.

Thanks,
Ian

>  static int record__thread_data_init_maps(struct record_thread *thread_data, struct evlist *evlist)
>  {
>         int m, tm, nr_mmaps = evlist->core.nr_mmaps;
>         struct mmap *mmap = evlist->mmap;
>         struct mmap *overwrite_mmap = evlist->overwrite_mmap;
> -       struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
> +       struct perf_cpu_map *cpus = evlist->core.all_cpus;
> +       bool per_thread = evlst__per_thread(evlist);
>
> -       if (cpu_map__is_dummy(cpus))
> +       if (per_thread)
>                 thread_data->nr_mmaps = nr_mmaps;
>         else
>                 thread_data->nr_mmaps = bitmap_weight(thread_data->mask->maps.bits,
> @@ -995,7 +1001,7 @@ static int record__thread_data_init_maps(struct record_thread *thread_data, stru
>                  thread_data->nr_mmaps, thread_data->maps, thread_data->overwrite_maps);
>
>         for (m = 0, tm = 0; m < nr_mmaps && tm < thread_data->nr_mmaps; m++) {
> -               if (cpu_map__is_dummy(cpus) ||
> +               if (per_thread ||
>                     test_bit(cpus->map[m].cpu, thread_data->mask->maps.bits)) {
>                         if (thread_data->maps) {
>                                 thread_data->maps[tm] = &mmap[m];
> @@ -1870,7 +1876,7 @@ static int record__synthesize(struct record *rec, bool tail)
>                 return err;
>         }
>
> -       err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->core.user_requested_cpus,
> +       err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->core.all_cpus,
>                                              process_synthesized_event, NULL);
>         if (err < 0) {
>                 pr_err("Couldn't synthesize cpu map.\n");
> @@ -3667,12 +3673,12 @@ static int record__init_thread_default_masks(struct record *rec, struct perf_cpu
>  static int record__init_thread_masks(struct record *rec)
>  {
>         int ret = 0;
> -       struct perf_cpu_map *cpus = rec->evlist->core.user_requested_cpus;
> +       struct perf_cpu_map *cpus = rec->evlist->core.all_cpus;
>
>         if (!record__threads_enabled(rec))
>                 return record__init_thread_default_masks(rec, cpus);
>
> -       if (cpu_map__is_dummy(cpus)) {
> +       if (evlst__per_thread(rec->evlist)) {
>                 pr_err("--per-thread option is mutually exclusive to parallel streaming mode.\n");
>                 return -EINVAL;
>         }
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 246afe99a7fb..bac1f1eb95a7 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -181,7 +181,7 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
>         mp->idx = idx;
>
>         if (per_cpu) {
> -               mp->cpu = perf_cpu_map__cpu(evlist->core.user_requested_cpus, idx);
> +               mp->cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx);
>                 if (evlist->core.threads)
>                         mp->tid = perf_thread_map__pid(evlist->core.threads, 0);
>                 else
> --
> 2.25.1
>
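
The distinction Ian draws between "is dummy" and "has dummy" can be sketched
on a simplified model. The assumptions are stated up front: struct cpu_list
and both helpers below are hypothetical and are not the perf cpumap API, and
the "is dummy" test simply follows the cpumap.h comment quoted above (a single
map entry of -1). The "has dummy" variant illustrates the has_dummy_cpu idea
floated in the review; it is not an existing function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for a CPU map: a plain array of CPUs. */
struct cpu_list {
	int nr;
	const int *cpu;
};

/* True only when the list is exactly one entry and that entry is -1. */
static bool cpu_list_is_dummy(const struct cpu_list *l)
{
	assert(l != NULL);
	return l->nr == 1 && l->cpu[0] == -1;
}

/* True when -1 appears anywhere, e.g. merged in alongside real CPUs. */
static bool cpu_list_has_dummy(const struct cpu_list *l)
{
	int i;

	assert(l != NULL);
	for (i = 0; i < l->nr; i++) {
		if (l->cpu[i] == -1)
			return true;
	}
	return false;
}
```

In this model a list such as { -1, 0, 1 } has a dummy but is not a dummy map,
which is the case the wording of the existing comment leaves open.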

* Re: [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu
  2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
                   ` (20 preceding siblings ...)
  2022-04-22 16:24 ` [PATCH RFC 21/21] perf tools: Allow system-wide events to keep their own threads Adrian Hunter
@ 2022-05-03 18:09 ` Ian Rogers
  21 siblings, 0 replies; 35+ messages in thread
From: Ian Rogers @ 2022-05-03 18:09 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Alexey Bayduraev,
	Namhyung Kim, Leo Yan, linux-kernel

On Fri, Apr 22, 2022 at 9:24 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> Hi
>
> Here are patches to support capturing Intel PT sideband events such as
> mmap, task, context switch, text poke etc, on every CPU even when tracing
> selected user_requested_cpus.  That is, when using the perf record -C or
>  --cpu option.
>
> This is needed for:
> 1. text poke: a text poke on any CPU affects all CPUs
> 2. tracing user space: a user space process can migrate between CPUs so
> mmap events that happen on a different CPU can be needed to decode a
> user_requested_cpus CPU.
>
> For example:
>
>         Trace on CPU 1:
>
>         perf record --kcore -C 1 -e intel_pt// &
>
>         Start a task on CPU 0:
>
>         taskset 0x1 testprog &
>
>         Migrate it to CPU 1:
>
>         taskset -p 0x2 <testprog pid>
>
>         Stop tracing:
>
>         kill %1
>
>         Prior to these changes there will be errors decoding testprog
>         in userspace because the comm and mmap events for testprog will not
>         have been captured.
>
> There is quite a bit of preparation:
>
> The first 5 patches stop auxtrace mixing up mmap idx between evlist and
> evsel.  That is going to matter when
> evlist->all_cpus != evlist->user_requested_cpus != evsel->cpus:
>
>       libperf evsel: Factor out perf_evsel__ioctl()
>       libperf evsel: Add perf_evsel__enable_thread()
>       perf evlist: Use libperf functions in evlist__enable_event_idx()
>       perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c
>       perf auxtrace: Do not mix up mmap idx
>
> The next 6 patches stop attempts to auxtrace mmap when it is not an
> auxtrace event e.g. when mmapping the CPUs on which only sideband is
> captured:
>
>       libperf evlist: Remove ->idx() per_cpu parameter
>       libperf evlist: Move ->idx() into mmap_per_evsel()
>       libperf evlist: Add evsel as a parameter to ->idx()
>       perf auxtrace: Record whether an auxtrace mmap is needed
>       perf auxctrace: Add mmap_needed to auxtrace_mmap_params
>       perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter
>
> The next 5 patches switch to setting up dummy event maps before adding the
> evsel so that the evsel is subject to map propagation, primarily to cause
> addition of the evsel's CPUs to all_cpus.
>
>       perf evlist: Factor out evlist__dummy_event()
>       perf evlist: Add evlist__add_system_wide_dummy()
>       perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke()
>       perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking
>       perf intel-pt: Track sideband system-wide when needed
>
> The remaining 5 patches make more significant changes.
>
> First change from using user_requested_cpus to using all_cpus where necessary:
>
>       perf tools: Allow all_cpus to be a superset of user_requested_cpus
>
> Secondly, mmap all per-thread and all per-cpu events:
>
>       libperf evlist: Allow mixing per-thread and per-cpu mmaps
>
> Stop using system_wide flag for uncore because it will not work anymore:
>
>       perf stat: Add per_cpu_only flag for uncore
>
> Finally change map propagation so that system-wide events retain their cpus and
> (dummy) threads:
>
>       perf tools: Allow system-wide events to keep their own CPUs
>       perf tools: Allow system-wide events to keep their own threads
>
>
> Adrian Hunter (21):
>       libperf evsel: Factor out perf_evsel__ioctl()
>       libperf evsel: Add perf_evsel__enable_thread()
>       perf evlist: Use libperf functions in evlist__enable_event_idx()
>       perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c
>       perf auxtrace: Do not mix up mmap idx
>       libperf evlist: Remove ->idx() per_cpu parameter
>       libperf evlist: Move ->idx() into mmap_per_evsel()
>       libperf evlist: Add evsel as a parameter to ->idx()
>       perf auxtrace: Record whether an auxtrace mmap is needed
>       perf auxctrace: Add mmap_needed to auxtrace_mmap_params
>       perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter
>       perf evlist: Factor out evlist__dummy_event()
>       perf evlist: Add evlist__add_system_wide_dummy()
>       perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke()
>       perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking
>       perf intel-pt: Track sideband system-wide when needed
>       perf tools: Allow all_cpus to be a superset of user_requested_cpus
>       libperf evlist: Allow mixing per-thread and per-cpu mmaps
>       perf stat: Add per_cpu_only flag for uncore
>       perf tools: Allow system-wide events to keep their own CPUs
>       perf tools: Allow system-wide events to keep their own threads
>
>  tools/lib/perf/evlist.c                  |  67 +++++++------------
>  tools/lib/perf/evsel.c                   |  29 +++++++--
>  tools/lib/perf/include/internal/evlist.h |   3 +-
>  tools/lib/perf/include/internal/evsel.h  |   1 +
>  tools/lib/perf/include/perf/evsel.h      |   1 +
>  tools/perf/arch/arm/util/cs-etm.c        |   1 +
>  tools/perf/arch/arm64/util/arm-spe.c     |   1 +
>  tools/perf/arch/s390/util/auxtrace.c     |   1 +
>  tools/perf/arch/x86/util/intel-bts.c     |   1 +
>  tools/perf/arch/x86/util/intel-pt.c      |  32 ++++------
>  tools/perf/builtin-record.c              |  39 +++++-------
>  tools/perf/builtin-stat.c                |   5 +-
>  tools/perf/util/auxtrace.c               |  31 +++++++--
>  tools/perf/util/auxtrace.h               |   8 ++-
>  tools/perf/util/evlist.c                 | 106 +++++++++++++++----------------
>  tools/perf/util/evlist.h                 |   7 +-
>  tools/perf/util/evsel.c                  |   1 +
>  tools/perf/util/evsel.h                  |   1 +
>  tools/perf/util/mmap.c                   |   4 +-
>  tools/perf/util/parse-events.c           |   2 +-
>  20 files changed, 176 insertions(+), 165 deletions(-)
>
>
> Regards
> Adrian

Thanks Adrian, I'm very much in favor of this patch set. Can we add
some tests for intel-pt? They could be part of the 'perf record' shell
test:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/tests/shell/record.sh?h=perf/core
but probably better as their own thing. A command line that was broken
prior to this change would make a great test case!

Aside from testing, we should clean up what dummy CPUs in the cpu maps
mean. As per:
https://lore.kernel.org/linux-perf-users/CAP-5=fWfs2td9nZLGdEBD+C5s=upa_7SORab8tQ7qH=jX--F7w@mail.gmail.com/

I also think landing:
https://lore.kernel.org/linux-perf-users/20220503041757.2365696-3-irogers@google.com/
will help as it avoids the all_cpus map containing references to CPUs
from PMU sysfs that had been overridden.

Thanks,
Ian

* Re: [PATCH RFC 18/21] libperf evlist: Allow mixing per-thread and per-cpu mmaps
  2022-04-22 16:23 ` [PATCH RFC 18/21] libperf evlist: Allow mixing per-thread and per-cpu mmaps Adrian Hunter
@ 2022-05-03 20:29   ` Namhyung Kim
  2022-05-04  9:56     ` Adrian Hunter
  0 siblings, 1 reply; 35+ messages in thread
From: Namhyung Kim @ 2022-05-03 20:29 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers,
	Alexey Bayduraev, Leo Yan, linux-kernel

On Fri, Apr 22, 2022 at 9:25 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> mmap_per_evsel() will skip events that do not match the CPU, so all CPUs
> can be iterated in any case.
>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
[...]
> @@ -561,9 +538,12 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist)
>  {
>         int nr_mmaps;
>
> +       /* One for each CPU */
>         nr_mmaps = perf_cpu_map__nr(evlist->all_cpus);
> -       if (perf_cpu_map__empty(evlist->all_cpus))
> -               nr_mmaps = perf_thread_map__nr(evlist->threads);
> +       /* One for each thread */
> +       nr_mmaps += perf_thread_map__nr(evlist->threads);
> +       /* Minus the dummy CPU or dummy thread */
> +       nr_mmaps -= 1;

I'm not sure it'd work for per-task events with default-per-cpu mode.

Thanks,
Namhyung

>
>         return nr_mmaps;
>  }

* Re: [PATCH RFC 18/21] libperf evlist: Allow mixing per-thread and per-cpu mmaps
  2022-05-03 20:29   ` Namhyung Kim
@ 2022-05-04  9:56     ` Adrian Hunter
  0 siblings, 0 replies; 35+ messages in thread
From: Adrian Hunter @ 2022-05-04  9:56 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers,
	Alexey Bayduraev, Leo Yan, linux-kernel

On 3/05/22 23:29, Namhyung Kim wrote:
> On Fri, Apr 22, 2022 at 9:25 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>> mmap_per_evsel() will skip events that do not match the CPU, so all CPUs
>> can be iterated in any case.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
> [...]
>> @@ -561,9 +538,12 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist)
>>  {
>>         int nr_mmaps;
>>
>> +       /* One for each CPU */
>>         nr_mmaps = perf_cpu_map__nr(evlist->all_cpus);
>> -       if (perf_cpu_map__empty(evlist->all_cpus))
>> -               nr_mmaps = perf_thread_map__nr(evlist->threads);
>> +       /* One for each thread */
>> +       nr_mmaps += perf_thread_map__nr(evlist->threads);
>> +       /* Minus the dummy CPU or dummy thread */
>> +       nr_mmaps -= 1;
> 
> I'm not sure it'd work for per-task events with default-per-cpu mode.

Thanks for noticing that. nr_mmaps ends up being too high, which doesn't
fail immediately.  I need to add a check that nr_mmaps matches the number
of mmaps actually made.

> 
> Thanks,
> Namhyung
> 
>>
>>         return nr_mmaps;
>>  }
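
The arithmetic in the quoted hunk, and the kind of check Adrian mentions, can
be worked through in a small sketch. Both function names are hypothetical and
this is not the libperf implementation. The "actual" counts used in the notes
below also rest on an assumption: that in per-CPU mode each CPU gets one mmap
shared by all threads.

```c
#include <assert.h>

/*
 * Formula from the quoted hunk: one mmap per CPU entry, plus one per
 * thread entry, minus one for the dummy CPU or dummy thread.
 */
static int nr_mmaps_estimate(int nr_cpus, int nr_threads)
{
	assert(nr_cpus >= 1 && nr_threads >= 1);
	return nr_cpus + nr_threads - 1;
}

/*
 * Hypothetical consistency check: returns 0 when the estimate matches
 * the number of mmaps actually made, -1 otherwise.
 */
static int check_nr_mmaps(int estimated, int actual)
{
	return estimated == actual ? 0 : -1;
}
```

With 4 CPUs and a dummy thread map (1 entry) the estimate is 4 + 1 - 1 = 4.
With a dummy CPU map (1 entry) and 3 threads it is 1 + 3 - 1 = 3. With 4 real
CPUs and 3 real threads, the case Namhyung raises, neither map is a dummy and
the estimate is 4 + 3 - 1 = 6; under the assumption above only 4 mmaps would
be made, so a check of this kind would report the mismatch instead of letting
the too-high count pass silently.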


end of thread, other threads:[~2022-05-04  9:56 UTC | newest]

Thread overview: 35+ messages
2022-04-22 16:23 [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 01/21] libperf evsel: Factor out perf_evsel__ioctl() Adrian Hunter
2022-04-22 19:05   ` Arnaldo Carvalho de Melo
2022-04-22 16:23 ` [PATCH RFC 02/21] libperf evsel: Add perf_evsel__enable_thread() Adrian Hunter
2022-04-27 21:48   ` Namhyung Kim
2022-04-28  4:15     ` Adrian Hunter
2022-04-28 23:50       ` Namhyung Kim
2022-05-03 16:45   ` Ian Rogers
2022-04-22 16:23 ` [PATCH RFC 03/21] perf evlist: Use libperf functions in evlist__enable_event_idx() Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 04/21] perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 05/21] perf auxtrace: Do not mix up mmap idx Adrian Hunter
2022-04-27 21:54   ` Namhyung Kim
2022-04-28  4:29     ` Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 06/21] libperf evlist: Remove ->idx() per_cpu parameter Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 07/21] libperf evlist: Move ->idx() into mmap_per_evsel() Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 08/21] libperf evlist: Add evsel as a parameter to ->idx() Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 09/21] perf auxtrace: Record whether an auxtrace mmap is needed Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 10/21] perf auxctrace: Add mmap_needed to auxtrace_mmap_params Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 11/21] perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 12/21] perf evlist: Factor out evlist__dummy_event() Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 13/21] perf evlist: Add evlist__add_system_wide_dummy() Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 14/21] perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke() Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 15/21] perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 16/21] perf intel-pt: Track sideband system-wide when needed Adrian Hunter
2022-04-22 16:23 ` [PATCH RFC 17/21] perf tools: Allow all_cpus to be a superset of user_requested_cpus Adrian Hunter
2022-05-03 17:41   ` Ian Rogers
2022-04-22 16:23 ` [PATCH RFC 18/21] libperf evlist: Allow mixing per-thread and per-cpu mmaps Adrian Hunter
2022-05-03 20:29   ` Namhyung Kim
2022-05-04  9:56     ` Adrian Hunter
2022-04-22 16:24 ` [PATCH RFC 19/21] perf stat: Add requires_cpu flag for uncore Adrian Hunter
2022-04-29 22:57   ` Namhyung Kim
2022-04-30  1:10     ` Ian Rogers
2022-04-22 16:24 ` [PATCH RFC 20/21] perf tools: Allow system-wide events to keep their own CPUs Adrian Hunter
2022-04-22 16:24 ` [PATCH RFC 21/21] perf tools: Allow system-wide events to keep their own threads Adrian Hunter
2022-05-03 18:09 ` [PATCH RFC 00/21] perf intel-pt: Better support for perf record --cpu Ian Rogers
