* [PATCH v2 0/6] Make evlist CPUs more accurate
@ 2022-03-28 23:26 Ian Rogers
  2022-03-28 23:26 ` [PATCH v2 1/6] perf stat: Avoid segv if core.user_cpus isn't set Ian Rogers
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Ian Rogers @ 2022-03-28 23:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan,
	John Garry, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Kajol Jain, James Clark, German Gomez,
	Adrian Hunter, Riccardo Mancini, Andi Kleen, Alexey Bayduraev,
	Alexander Antonov, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel, netdev, bpf
  Cc: Stephane Eranian, Ian Rogers

evlist has all_cpus, computed as the merge of all evsel CPU maps, and
cpus. cpus may contain more CPUs than all_cpus, as by default cpus
holds all online CPUs while all_cpus holds the merge/union from the
evsels. For an uncore event there may be just 1 CPU per socket, giving
a far smaller CPU map than all online CPUs.

These patches rename cpus to user_requested_cpus, to reflect its
potentially user-specified nature. user_requested_cpus is set to the
current value intersected with all_cpus, so that user_requested_cpus
is always a subset of all_cpus. This fixes the metric printing code so
that unnecessary blank lines aren't printed.

To make the intersect function perform well, a perf_cpu_map__is_subset
function is added. While adding this function, it is also used in
perf_cpu_map__merge to avoid creating a new CPU map for some patterns
that are currently missed.
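
For illustration, a minimal standalone sketch of the subset test over
plain sorted CPU id arrays (the names and plain int arrays here are
made up for illustration; the real code in the patches below operates
on struct perf_cpu_map):

```
#include <stdbool.h>
#include <stdio.h>

/* Return true if every id in 'b' also appears in 'a'; both arrays
 * are assumed to be sorted, as cpu maps are kept ordered. */
static bool cpu_ids_subset(const int *a, int a_nr, const int *b, int b_nr)
{
	int i = 0, j = 0;

	while (i < a_nr && j < b_nr) {
		if (a[i] < b[j]) {
			i++;		/* CPU only in 'a', keep scanning */
		} else if (a[i] == b[j]) {
			i++;		/* matched, advance both */
			j++;
		} else {
			return false;	/* b[j] is missing from 'a' */
		}
	}
	return j == b_nr;
}

int main(void)
{
	int a[] = { 0, 1, 18, 19 };
	int b[] = { 0, 18 };

	printf("%d\n", cpu_ids_subset(a, 4, b, 2)); /* 1: b is contained in a */
	printf("%d\n", cpu_ids_subset(b, 2, a, 4)); /* 0: a has CPUs not in b */
	return 0;
}
```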

v2. Reorders the "Avoid segv" patch and makes other adjustments
    suggested by Arnaldo Carvalho de Melo <acme@kernel.org>.

Ian Rogers (6):
  perf stat: Avoid segv if core.user_cpus isn't set.
  perf evlist: Rename cpus to user_requested_cpus
  perf cpumap: Add is_subset function
  perf cpumap: More cpu map reuse by merge.
  perf cpumap: Add intersect function.
  perf evlist: Respect all_cpus when setting user_requested_cpus

 tools/lib/perf/cpumap.c                  | 73 ++++++++++++++++++++----
 tools/lib/perf/evlist.c                  | 28 ++++-----
 tools/lib/perf/include/internal/cpumap.h |  1 +
 tools/lib/perf/include/internal/evlist.h |  7 ++-
 tools/lib/perf/include/perf/cpumap.h     |  2 +
 tools/perf/arch/arm/util/cs-etm.c        |  8 +--
 tools/perf/arch/arm64/util/arm-spe.c     |  2 +-
 tools/perf/arch/x86/util/intel-bts.c     |  2 +-
 tools/perf/arch/x86/util/intel-pt.c      |  4 +-
 tools/perf/bench/evlist-open-close.c     |  2 +-
 tools/perf/builtin-ftrace.c              |  2 +-
 tools/perf/builtin-record.c              |  6 +-
 tools/perf/builtin-stat.c                | 11 ++--
 tools/perf/builtin-top.c                 |  2 +-
 tools/perf/util/auxtrace.c               |  2 +-
 tools/perf/util/bpf_ftrace.c             |  4 +-
 tools/perf/util/evlist.c                 | 17 +++---
 tools/perf/util/record.c                 |  6 +-
 tools/perf/util/sideband_evlist.c        |  3 +-
 tools/perf/util/stat-display.c           |  2 +-
 tools/perf/util/synthetic-events.c       |  2 +-
 tools/perf/util/top.c                    |  8 ++-
 22 files changed, 132 insertions(+), 62 deletions(-)

-- 
2.35.1.1021.g381101b075-goog


* [PATCH v2 1/6] perf stat: Avoid segv if core.user_cpus isn't set.
  2022-03-28 23:26 [PATCH v2 0/6] Make evlist CPUs more accurate Ian Rogers
@ 2022-03-28 23:26 ` Ian Rogers
  2022-03-28 23:26 ` [PATCH v2 2/6] perf evlist: Rename cpus to user_requested_cpus Ian Rogers
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Ian Rogers @ 2022-03-28 23:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan,
	John Garry, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Kajol Jain, James Clark, German Gomez,
	Adrian Hunter, Riccardo Mancini, Andi Kleen, Alexey Bayduraev,
	Alexander Antonov, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel, netdev, bpf
  Cc: Stephane Eranian, Ian Rogers

Passing NULL to perf_cpu_map__max doesn't make sense as there is no
valid max. Avoid this problem by adding a NULL check in
perf_stat_init_aggr_mode.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4ee40de698a4..b81ae5053218 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1472,7 +1472,10 @@ static int perf_stat_init_aggr_mode(void)
 	 * taking the highest cpu number to be the size of
 	 * the aggregation translate cpumap.
 	 */
-	nr = perf_cpu_map__max(evsel_list->core.cpus).cpu;
+	if (evsel_list->core.cpus)
+		nr = perf_cpu_map__max(evsel_list->core.cpus).cpu;
+	else
+		nr = 0;
 	stat_config.cpus_aggr_map = cpu_aggr_map__empty_new(nr + 1);
 	return stat_config.cpus_aggr_map ? 0 : -ENOMEM;
 }
-- 
2.35.1.1021.g381101b075-goog


* [PATCH v2 2/6] perf evlist: Rename cpus to user_requested_cpus
  2022-03-28 23:26 [PATCH v2 0/6] Make evlist CPUs more accurate Ian Rogers
  2022-03-28 23:26 ` [PATCH v2 1/6] perf stat: Avoid segv if core.user_cpus isn't set Ian Rogers
@ 2022-03-28 23:26 ` Ian Rogers
  2022-03-30 20:31   ` Arnaldo Carvalho de Melo
  2022-03-28 23:26 ` [PATCH v2 3/6] perf cpumap: Add is_subset function Ian Rogers
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2022-03-28 23:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan,
	John Garry, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Kajol Jain, James Clark, German Gomez,
	Adrian Hunter, Riccardo Mancini, Andi Kleen, Alexey Bayduraev,
	Alexander Antonov, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel, netdev, bpf
  Cc: Stephane Eranian, Ian Rogers

evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps
of all evsels. For non-task targets, cpus is set to be cpus requested
from the command line, defaulting to all online cpus if no cpus are
specified. For an uncore event, all_cpus may be just CPU 0 or every online
CPU. This causes all_cpus to have fewer values than the cpus variable
which is confusing given the 'all' in the name. To try to make the behavior
clearer, rename cpus to user_requested_cpus and add comments on the two
struct variables.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/evlist.c                  | 28 ++++++++++++------------
 tools/lib/perf/include/internal/evlist.h |  7 +++++-
 tools/perf/arch/arm/util/cs-etm.c        |  8 +++----
 tools/perf/arch/arm64/util/arm-spe.c     |  2 +-
 tools/perf/arch/x86/util/intel-bts.c     |  2 +-
 tools/perf/arch/x86/util/intel-pt.c      |  4 ++--
 tools/perf/bench/evlist-open-close.c     |  2 +-
 tools/perf/builtin-ftrace.c              |  2 +-
 tools/perf/builtin-record.c              |  6 ++---
 tools/perf/builtin-stat.c                | 10 ++++-----
 tools/perf/builtin-top.c                 |  2 +-
 tools/perf/util/auxtrace.c               |  2 +-
 tools/perf/util/bpf_ftrace.c             |  4 ++--
 tools/perf/util/evlist.c                 | 15 +++++++------
 tools/perf/util/record.c                 |  6 ++---
 tools/perf/util/sideband_evlist.c        |  3 ++-
 tools/perf/util/stat-display.c           |  2 +-
 tools/perf/util/synthetic-events.c       |  2 +-
 tools/perf/util/top.c                    |  8 ++++---
 19 files changed, 62 insertions(+), 53 deletions(-)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 9a770bfdc804..1b15ba13c477 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -41,10 +41,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
 	 */
 	if (!evsel->own_cpus || evlist->has_user_cpus) {
 		perf_cpu_map__put(evsel->cpus);
-		evsel->cpus = perf_cpu_map__get(evlist->cpus);
-	} else if (!evsel->system_wide && perf_cpu_map__empty(evlist->cpus)) {
+		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
+	} else if (!evsel->system_wide && perf_cpu_map__empty(evlist->user_requested_cpus)) {
 		perf_cpu_map__put(evsel->cpus);
-		evsel->cpus = perf_cpu_map__get(evlist->cpus);
+		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
 	} else if (evsel->cpus != evsel->own_cpus) {
 		perf_cpu_map__put(evsel->cpus);
 		evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
@@ -123,10 +123,10 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
-	perf_cpu_map__put(evlist->cpus);
+	perf_cpu_map__put(evlist->user_requested_cpus);
 	perf_cpu_map__put(evlist->all_cpus);
 	perf_thread_map__put(evlist->threads);
-	evlist->cpus = NULL;
+	evlist->user_requested_cpus = NULL;
 	evlist->all_cpus = NULL;
 	evlist->threads = NULL;
 	fdarray__exit(&evlist->pollfd);
@@ -155,9 +155,9 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
 	 * original reference count of 1.  If that is not the case it is up to
 	 * the caller to increase the reference count.
 	 */
-	if (cpus != evlist->cpus) {
-		perf_cpu_map__put(evlist->cpus);
-		evlist->cpus = perf_cpu_map__get(cpus);
+	if (cpus != evlist->user_requested_cpus) {
+		perf_cpu_map__put(evlist->user_requested_cpus);
+		evlist->user_requested_cpus = perf_cpu_map__get(cpus);
 	}
 
 	if (threads != evlist->threads) {
@@ -294,7 +294,7 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
 
 int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
-	int nr_cpus = perf_cpu_map__nr(evlist->cpus);
+	int nr_cpus = perf_cpu_map__nr(evlist->user_requested_cpus);
 	int nr_threads = perf_thread_map__nr(evlist->threads);
 	int nfds = 0;
 	struct perf_evsel *evsel;
@@ -426,7 +426,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	       int idx, struct perf_mmap_param *mp, int cpu_idx,
 	       int thread, int *_output, int *_output_overwrite)
 {
-	struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->cpus, cpu_idx);
+	struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->user_requested_cpus, cpu_idx);
 	struct perf_evsel *evsel;
 	int revent;
 
@@ -536,7 +536,7 @@ mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	     struct perf_mmap_param *mp)
 {
 	int nr_threads = perf_thread_map__nr(evlist->threads);
-	int nr_cpus    = perf_cpu_map__nr(evlist->cpus);
+	int nr_cpus    = perf_cpu_map__nr(evlist->user_requested_cpus);
 	int cpu, thread;
 
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
@@ -564,8 +564,8 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist)
 {
 	int nr_mmaps;
 
-	nr_mmaps = perf_cpu_map__nr(evlist->cpus);
-	if (perf_cpu_map__empty(evlist->cpus))
+	nr_mmaps = perf_cpu_map__nr(evlist->user_requested_cpus);
+	if (perf_cpu_map__empty(evlist->user_requested_cpus))
 		nr_mmaps = perf_thread_map__nr(evlist->threads);
 
 	return nr_mmaps;
@@ -576,7 +576,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
 			  struct perf_mmap_param *mp)
 {
 	struct perf_evsel *evsel;
-	const struct perf_cpu_map *cpus = evlist->cpus;
+	const struct perf_cpu_map *cpus = evlist->user_requested_cpus;
 	const struct perf_thread_map *threads = evlist->threads;
 
 	if (!ops || !ops->get || !ops->mmap)
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 4cefade540bd..e3e64f37db7b 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -19,7 +19,12 @@ struct perf_evlist {
 	int			 nr_entries;
 	int			 nr_groups;
 	bool			 has_user_cpus;
-	struct perf_cpu_map	*cpus;
+	/**
+	 * The cpus passed from the command line or all online CPUs by
+	 * default.
+	 */
+	struct perf_cpu_map	*user_requested_cpus;
+	/** The union of all evsel cpu maps. */
 	struct perf_cpu_map	*all_cpus;
 	struct perf_thread_map	*threads;
 	int			 nr_mmaps;
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index cbc555245959..11c71aa219f7 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -199,7 +199,7 @@ static int cs_etm_set_option(struct auxtrace_record *itr,
 			     struct evsel *evsel, u32 option)
 {
 	int i, err = -EINVAL;
-	struct perf_cpu_map *event_cpus = evsel->evlist->core.cpus;
+	struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus;
 	struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
 
 	/* Set option of each CPU we have */
@@ -299,7 +299,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
 				container_of(itr, struct cs_etm_recording, itr);
 	struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
 	struct evsel *evsel, *cs_etm_evsel = NULL;
-	struct perf_cpu_map *cpus = evlist->core.cpus;
+	struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
 	bool privileged = perf_event_paranoid_check(-1);
 	int err = 0;
 
@@ -522,7 +522,7 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
 {
 	int i;
 	int etmv3 = 0, etmv4 = 0, ete = 0;
-	struct perf_cpu_map *event_cpus = evlist->core.cpus;
+	struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus;
 	struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
 
 	/* cpu map is not empty, we have specific CPUs to work with */
@@ -713,7 +713,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
 	u32 offset;
 	u64 nr_cpu, type;
 	struct perf_cpu_map *cpu_map;
-	struct perf_cpu_map *event_cpus = session->evlist->core.cpus;
+	struct perf_cpu_map *event_cpus = session->evlist->core.user_requested_cpus;
 	struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
 	struct cs_etm_recording *ptr =
 			container_of(itr, struct cs_etm_recording, itr);
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index 5860bbaea95a..86e2e926aa0e 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -144,7 +144,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
 			container_of(itr, struct arm_spe_recording, itr);
 	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
 	struct evsel *evsel, *arm_spe_evsel = NULL;
-	struct perf_cpu_map *cpus = evlist->core.cpus;
+	struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
 	bool privileged = perf_event_paranoid_check(-1);
 	struct evsel *tracking_evsel;
 	int err;
diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c
index 4a76d49d25d6..d68a0f48e41e 100644
--- a/tools/perf/arch/x86/util/intel-bts.c
+++ b/tools/perf/arch/x86/util/intel-bts.c
@@ -110,7 +110,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr,
 			container_of(itr, struct intel_bts_recording, itr);
 	struct perf_pmu *intel_bts_pmu = btsr->intel_bts_pmu;
 	struct evsel *evsel, *intel_bts_evsel = NULL;
-	const struct perf_cpu_map *cpus = evlist->core.cpus;
+	const struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
 	bool privileged = perf_event_paranoid_check(-1);
 
 	if (opts->auxtrace_sample_mode) {
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 8c31578d6f4a..38ec2666ec12 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -382,7 +382,7 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
 			ui__warning("Intel Processor Trace: TSC not available\n");
 	}
 
-	per_cpu_mmaps = !perf_cpu_map__empty(session->evlist->core.cpus);
+	per_cpu_mmaps = !perf_cpu_map__empty(session->evlist->core.user_requested_cpus);
 
 	auxtrace_info->type = PERF_AUXTRACE_INTEL_PT;
 	auxtrace_info->priv[INTEL_PT_PMU_TYPE] = intel_pt_pmu->type;
@@ -632,7 +632,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 	struct perf_pmu *intel_pt_pmu = ptr->intel_pt_pmu;
 	bool have_timing_info, need_immediate = false;
 	struct evsel *evsel, *intel_pt_evsel = NULL;
-	const struct perf_cpu_map *cpus = evlist->core.cpus;
+	const struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
 	bool privileged = perf_event_paranoid_check(-1);
 	u64 tsc_bit;
 	int err;
diff --git a/tools/perf/bench/evlist-open-close.c b/tools/perf/bench/evlist-open-close.c
index de56601f69ee..5a27691469ed 100644
--- a/tools/perf/bench/evlist-open-close.c
+++ b/tools/perf/bench/evlist-open-close.c
@@ -151,7 +151,7 @@ static int bench_evlist_open_close__run(char *evstr)
 
 	init_stats(&time_stats);
 
-	printf("  Number of cpus:\t%d\n", perf_cpu_map__nr(evlist->core.cpus));
+	printf("  Number of cpus:\t%d\n", perf_cpu_map__nr(evlist->core.user_requested_cpus));
 	printf("  Number of threads:\t%d\n", evlist->core.threads->nr);
 	printf("  Number of events:\t%d (%d fds)\n",
 		evlist->core.nr_entries, evlist__count_evsel_fds(evlist));
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index ad9ce1bfffa1..7de07bb16d23 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -301,7 +301,7 @@ static int set_tracing_cpumask(struct perf_cpu_map *cpumap)
 
 static int set_tracing_cpu(struct perf_ftrace *ftrace)
 {
-	struct perf_cpu_map *cpumap = ftrace->evlist->core.cpus;
+	struct perf_cpu_map *cpumap = ftrace->evlist->core.user_requested_cpus;
 
 	if (!target__has_cpu(&ftrace->target))
 		return 0;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0b4abed555d8..ba74fab02e62 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -987,7 +987,7 @@ static int record__thread_data_init_maps(struct record_thread *thread_data, stru
 	int m, tm, nr_mmaps = evlist->core.nr_mmaps;
 	struct mmap *mmap = evlist->mmap;
 	struct mmap *overwrite_mmap = evlist->overwrite_mmap;
-	struct perf_cpu_map *cpus = evlist->core.cpus;
+	struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
 
 	thread_data->nr_mmaps = bitmap_weight(thread_data->mask->maps.bits,
 					      thread_data->mask->maps.nbits);
@@ -1881,7 +1881,7 @@ static int record__synthesize(struct record *rec, bool tail)
 		return err;
 	}
 
-	err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->core.cpus,
+	err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->core.user_requested_cpus,
 					     process_synthesized_event, NULL);
 	if (err < 0) {
 		pr_err("Couldn't synthesize cpu map.\n");
@@ -3675,7 +3675,7 @@ static int record__init_thread_default_masks(struct record *rec, struct perf_cpu
 static int record__init_thread_masks(struct record *rec)
 {
 	int ret = 0;
-	struct perf_cpu_map *cpus = rec->evlist->core.cpus;
+	struct perf_cpu_map *cpus = rec->evlist->core.user_requested_cpus;
 
 	if (!record__threads_enabled(rec))
 		return record__init_thread_default_masks(rec, cpus);
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b81ae5053218..a96f106dc93a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -804,7 +804,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	if (group)
 		evlist__set_leader(evsel_list);
 
-	if (!cpu_map__is_dummy(evsel_list->core.cpus)) {
+	if (!cpu_map__is_dummy(evsel_list->core.user_requested_cpus)) {
 		if (affinity__setup(&saved_affinity) < 0)
 			return -1;
 		affinity = &saved_affinity;
@@ -1458,7 +1458,7 @@ static int perf_stat_init_aggr_mode(void)
 	aggr_cpu_id_get_t get_id = aggr_mode__get_aggr(stat_config.aggr_mode);
 
 	if (get_id) {
-		stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus,
+		stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.user_requested_cpus,
 							 get_id, /*data=*/NULL);
 		if (!stat_config.aggr_map) {
 			pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
@@ -1472,8 +1472,8 @@ static int perf_stat_init_aggr_mode(void)
 	 * taking the highest cpu number to be the size of
 	 * the aggregation translate cpumap.
 	 */
-	if (evsel_list->core.cpus)
-		nr = perf_cpu_map__max(evsel_list->core.cpus).cpu;
+	if (evsel_list->core.user_requested_cpus)
+		nr = perf_cpu_map__max(evsel_list->core.user_requested_cpus).cpu;
 	else
 		nr = 0;
 	stat_config.cpus_aggr_map = cpu_aggr_map__empty_new(nr + 1);
@@ -1630,7 +1630,7 @@ static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
 	if (!get_id)
 		return 0;
 
-	stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus, get_id, env);
+	stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.user_requested_cpus, get_id, env);
 	if (!stat_config.aggr_map) {
 		pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
 		return -1;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9b08e44a31d9..fd8fd913c533 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1021,7 +1021,7 @@ static int perf_top__start_counters(struct perf_top *top)
 
 	evlist__for_each_entry(evlist, counter) {
 try_again:
-		if (evsel__open(counter, top->evlist->core.cpus,
+		if (evsel__open(counter, top->evlist->core.user_requested_cpus,
 				     top->evlist->core.threads) < 0) {
 
 			/*
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 9e48652662d4..df1c5bbbaa0d 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -174,7 +174,7 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
 	mp->idx = idx;
 
 	if (per_cpu) {
-		mp->cpu = perf_cpu_map__cpu(evlist->core.cpus, idx);
+		mp->cpu = perf_cpu_map__cpu(evlist->core.user_requested_cpus, idx);
 		if (evlist->core.threads)
 			mp->tid = perf_thread_map__pid(evlist->core.threads, 0);
 		else
diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
index 4f4d3aaff37c..7a4297d8fd2c 100644
--- a/tools/perf/util/bpf_ftrace.c
+++ b/tools/perf/util/bpf_ftrace.c
@@ -38,7 +38,7 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
 
 	/* don't need to set cpu filter for system-wide mode */
 	if (ftrace->target.cpu_list) {
-		ncpus = perf_cpu_map__nr(ftrace->evlist->core.cpus);
+		ncpus = perf_cpu_map__nr(ftrace->evlist->core.user_requested_cpus);
 		bpf_map__set_max_entries(skel->maps.cpu_filter, ncpus);
 	}
 
@@ -63,7 +63,7 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
 		fd = bpf_map__fd(skel->maps.cpu_filter);
 
 		for (i = 0; i < ncpus; i++) {
-			cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i).cpu;
+			cpu = perf_cpu_map__cpu(ftrace->evlist->core.user_requested_cpus, i).cpu;
 			bpf_map_update_elem(fd, &cpu, &val, BPF_ANY);
 		}
 	}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9bb79e049957..cb2cf4463c08 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -440,7 +440,7 @@ static void __evlist__disable(struct evlist *evlist, char *evsel_name)
 	bool has_imm = false;
 
 	// See explanation in evlist__close()
-	if (!cpu_map__is_dummy(evlist->core.cpus)) {
+	if (!cpu_map__is_dummy(evlist->core.user_requested_cpus)) {
 		if (affinity__setup(&saved_affinity) < 0)
 			return;
 		affinity = &saved_affinity;
@@ -500,7 +500,7 @@ static void __evlist__enable(struct evlist *evlist, char *evsel_name)
 	struct affinity saved_affinity, *affinity = NULL;
 
 	// See explanation in evlist__close()
-	if (!cpu_map__is_dummy(evlist->core.cpus)) {
+	if (!cpu_map__is_dummy(evlist->core.user_requested_cpus)) {
 		if (affinity__setup(&saved_affinity) < 0)
 			return;
 		affinity = &saved_affinity;
@@ -565,7 +565,7 @@ static int evlist__enable_event_cpu(struct evlist *evlist, struct evsel *evsel,
 static int evlist__enable_event_thread(struct evlist *evlist, struct evsel *evsel, int thread)
 {
 	int cpu;
-	int nr_cpus = perf_cpu_map__nr(evlist->core.cpus);
+	int nr_cpus = perf_cpu_map__nr(evlist->core.user_requested_cpus);
 
 	if (!evsel->core.fd)
 		return -EINVAL;
@@ -580,7 +580,7 @@ static int evlist__enable_event_thread(struct evlist *evlist, struct evsel *evse
 
 int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx)
 {
-	bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.cpus);
+	bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
 
 	if (per_cpu_mmaps)
 		return evlist__enable_event_cpu(evlist, evsel, idx);
@@ -1301,10 +1301,11 @@ void evlist__close(struct evlist *evlist)
 	struct affinity affinity;
 
 	/*
-	 * With perf record core.cpus is usually NULL.
+	 * With perf record core.user_requested_cpus is usually NULL.
 	 * Use the old method to handle this for now.
 	 */
-	if (!evlist->core.cpus || cpu_map__is_dummy(evlist->core.cpus)) {
+	if (!evlist->core.user_requested_cpus ||
+	    cpu_map__is_dummy(evlist->core.user_requested_cpus)) {
 		evlist__for_each_entry_reverse(evlist, evsel)
 			evsel__close(evsel);
 		return;
@@ -1367,7 +1368,7 @@ int evlist__open(struct evlist *evlist)
 	 * Default: one fd per CPU, all threads, aka systemwide
 	 * as sys_perf_event_open(cpu = -1, thread = -1) is EINVAL
 	 */
-	if (evlist->core.threads == NULL && evlist->core.cpus == NULL) {
+	if (evlist->core.threads == NULL && evlist->core.user_requested_cpus == NULL) {
 		err = evlist__create_syswide_maps(evlist);
 		if (err < 0)
 			goto out_err;
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 007a64681416..5b09ecbb05dc 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -106,7 +106,7 @@ void evlist__config(struct evlist *evlist, struct record_opts *opts, struct call
 	if (opts->group)
 		evlist__set_leader(evlist);
 
-	if (perf_cpu_map__cpu(evlist->core.cpus, 0).cpu < 0)
+	if (perf_cpu_map__cpu(evlist->core.user_requested_cpus, 0).cpu < 0)
 		opts->no_inherit = true;
 
 	use_comm_exec = perf_can_comm_exec();
@@ -244,7 +244,7 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str)
 
 	evsel = evlist__last(temp_evlist);
 
-	if (!evlist || perf_cpu_map__empty(evlist->core.cpus)) {
+	if (!evlist || perf_cpu_map__empty(evlist->core.user_requested_cpus)) {
 		struct perf_cpu_map *cpus = perf_cpu_map__new(NULL);
 
 		if (cpus)
@@ -252,7 +252,7 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str)
 
 		perf_cpu_map__put(cpus);
 	} else {
-		cpu = perf_cpu_map__cpu(evlist->core.cpus, 0);
+		cpu = perf_cpu_map__cpu(evlist->core.user_requested_cpus, 0);
 	}
 
 	while (1) {
diff --git a/tools/perf/util/sideband_evlist.c b/tools/perf/util/sideband_evlist.c
index 748371ac22be..388846f17bc1 100644
--- a/tools/perf/util/sideband_evlist.c
+++ b/tools/perf/util/sideband_evlist.c
@@ -114,7 +114,8 @@ int evlist__start_sb_thread(struct evlist *evlist, struct target *target)
 	}
 
 	evlist__for_each_entry(evlist, counter) {
-		if (evsel__open(counter, evlist->core.cpus, evlist->core.threads) < 0)
+		if (evsel__open(counter, evlist->core.user_requested_cpus,
+				evlist->core.threads) < 0)
 			goto out_delete_evlist;
 	}
 
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 9cbe351b141f..138e3ab9d638 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -929,7 +929,7 @@ static void print_no_aggr_metric(struct perf_stat_config *config,
 	int all_idx;
 	struct perf_cpu cpu;
 
-	perf_cpu_map__for_each_cpu(cpu, all_idx, evlist->core.cpus) {
+	perf_cpu_map__for_each_cpu(cpu, all_idx, evlist->core.user_requested_cpus) {
 		struct evsel *counter;
 		bool first = true;
 
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index b654de0841f8..27acdc5e5723 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2127,7 +2127,7 @@ int perf_event__synthesize_stat_events(struct perf_stat_config *config, struct p
 		return err;
 	}
 
-	err = perf_event__synthesize_cpu_map(tool, evlist->core.cpus, process, NULL);
+	err = perf_event__synthesize_cpu_map(tool, evlist->core.user_requested_cpus, process, NULL);
 	if (err < 0) {
 		pr_err("Couldn't synthesize thread map.\n");
 		return err;
diff --git a/tools/perf/util/top.c b/tools/perf/util/top.c
index c1ebfc5d2e0c..b8b32431d2f7 100644
--- a/tools/perf/util/top.c
+++ b/tools/perf/util/top.c
@@ -95,15 +95,17 @@ size_t perf_top__header_snprintf(struct perf_top *top, char *bf, size_t size)
 
 	if (target->cpu_list)
 		ret += SNPRINTF(bf + ret, size - ret, ", CPU%s: %s)",
-				perf_cpu_map__nr(top->evlist->core.cpus) > 1 ? "s" : "",
+				perf_cpu_map__nr(top->evlist->core.user_requested_cpus) > 1
+				? "s" : "",
 				target->cpu_list);
 	else {
 		if (target->tid)
 			ret += SNPRINTF(bf + ret, size - ret, ")");
 		else
 			ret += SNPRINTF(bf + ret, size - ret, ", %d CPU%s)",
-					perf_cpu_map__nr(top->evlist->core.cpus),
-					perf_cpu_map__nr(top->evlist->core.cpus) > 1 ? "s" : "");
+					perf_cpu_map__nr(top->evlist->core.user_requested_cpus),
+					perf_cpu_map__nr(top->evlist->core.user_requested_cpus) > 1
+					? "s" : "");
 	}
 
 	perf_top__reset_sample_counters(top);
-- 
2.35.1.1021.g381101b075-goog


* [PATCH v2 3/6] perf cpumap: Add is_subset function
  2022-03-28 23:26 [PATCH v2 0/6] Make evlist CPUs more accurate Ian Rogers
  2022-03-28 23:26 ` [PATCH v2 1/6] perf stat: Avoid segv if core.user_cpus isn't set Ian Rogers
  2022-03-28 23:26 ` [PATCH v2 2/6] perf evlist: Rename cpus to user_requested_cpus Ian Rogers
@ 2022-03-28 23:26 ` Ian Rogers
  2022-03-30 20:34   ` Arnaldo Carvalho de Melo
  2022-03-28 23:26 ` [PATCH v2 4/6] perf cpumap: More cpu map reuse by merge Ian Rogers
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2022-03-28 23:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan,
	John Garry, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Kajol Jain, James Clark, German Gomez,
	Adrian Hunter, Riccardo Mancini, Andi Kleen, Alexey Bayduraev,
	Alexander Antonov, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel, netdev, bpf
  Cc: Stephane Eranian, Ian Rogers

Returns true if the second argument is a subset of the first.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/cpumap.c                  | 20 ++++++++++++++++++++
 tools/lib/perf/include/internal/cpumap.h |  1 +
 2 files changed, 21 insertions(+)

diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index ee66760f1e63..23701024e0c0 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -319,6 +319,26 @@ struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map)
 	return map->nr > 0 ? map->map[map->nr - 1] : result;
 }
 
+/** Is 'b' a subset of 'a'. */
+bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu_map *b)
+{
+	if (a == b || !b)
+		return true;
+	if (!a || b->nr > a->nr)
+		return false;
+
+	for (int i = 0, j = 0; i < a->nr; i++) {
+		if (a->map[i].cpu > b->map[j].cpu)
+			return false;
+		if (a->map[i].cpu == b->map[j].cpu) {
+			j++;
+			if (j == b->nr)
+				return true;
+		}
+	}
+	return false;
+}
+
 /*
  * Merge two cpumaps
  *
diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
index 1973a18c096b..35dd29642296 100644
--- a/tools/lib/perf/include/internal/cpumap.h
+++ b/tools/lib/perf/include/internal/cpumap.h
@@ -25,5 +25,6 @@ struct perf_cpu_map {
 #endif
 
 int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu);
+bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu_map *b);
 
 #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
-- 
2.35.1.1021.g381101b075-goog


* [PATCH v2 4/6] perf cpumap: More cpu map reuse by merge.
  2022-03-28 23:26 [PATCH v2 0/6] Make evlist CPUs more accurate Ian Rogers
                   ` (2 preceding siblings ...)
  2022-03-28 23:26 ` [PATCH v2 3/6] perf cpumap: Add is_subset function Ian Rogers
@ 2022-03-28 23:26 ` Ian Rogers
  2022-03-30 20:34   ` Arnaldo Carvalho de Melo
  2022-03-28 23:26 ` [PATCH v2 5/6] perf cpumap: Add intersect function Ian Rogers
  2022-03-28 23:26 ` [PATCH v2 6/6] perf evlist: Respect all_cpus when setting user_requested_cpus Ian Rogers
  5 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2022-03-28 23:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan,
	John Garry, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Kajol Jain, James Clark, German Gomez,
	Adrian Hunter, Riccardo Mancini, Andi Kleen, Alexey Bayduraev,
	Alexander Antonov, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel, netdev, bpf
  Cc: Stephane Eranian, Ian Rogers

perf_cpu_map__merge will reuse one of its arguments if they are equal or
the other argument is NULL. The arguments could also be reused when it
is known that one set of values is a subset of the other. For example, a
map of 0-1 merged with a map of just 0 yields the map of 0-1. Currently,
though, a new map is created rather than a reference being taken on the
original 0-1 map.
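
A hypothetical usage sketch (not part of the patch) of the reuse this
enables, built against libperf's public cpumap API; with the change the
merge below hands back the existing map instead of allocating:

```
#include <stdio.h>
#include <perf/cpumap.h>

int main(void)
{
	/* Merge {0,1} with {0}: {0} is a subset, so the existing
	 * {0,1} map is returned rather than a newly allocated one. */
	struct perf_cpu_map *a = perf_cpu_map__new("0-1");
	struct perf_cpu_map *b = perf_cpu_map__new("0");
	struct perf_cpu_map *merged = perf_cpu_map__merge(a, b);

	printf("%d cpus, reused: %s\n", perf_cpu_map__nr(merged),
	       merged == a ? "yes" : "no");

	perf_cpu_map__put(b);
	perf_cpu_map__put(merged);	/* drops the reference held via 'a' */
	return 0;
}
```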

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/cpumap.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index 23701024e0c0..384d5e076ee4 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -355,17 +355,12 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
 	int i, j, k;
 	struct perf_cpu_map *merged;
 
-	if (!orig && !other)
-		return NULL;
-	if (!orig) {
-		perf_cpu_map__get(other);
-		return other;
-	}
-	if (!other)
-		return orig;
-	if (orig->nr == other->nr &&
-	    !memcmp(orig->map, other->map, orig->nr * sizeof(struct perf_cpu)))
+	if (perf_cpu_map__is_subset(orig, other))
 		return orig;
+	if (perf_cpu_map__is_subset(other, orig)) {
+		perf_cpu_map__put(orig);
+		return perf_cpu_map__get(other);
+	}
 
 	tmp_len = orig->nr + other->nr;
 	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
-- 
2.35.1.1021.g381101b075-goog


* [PATCH v2 5/6] perf cpumap: Add intersect function.
  2022-03-28 23:26 [PATCH v2 0/6] Make evlist CPUs more accurate Ian Rogers
                   ` (3 preceding siblings ...)
  2022-03-28 23:26 ` [PATCH v2 4/6] perf cpumap: More cpu map reuse by merge Ian Rogers
@ 2022-03-28 23:26 ` Ian Rogers
  2022-04-01 19:12   ` Arnaldo Carvalho de Melo
  2022-03-28 23:26 ` [PATCH v2 6/6] perf evlist: Respect all_cpus when setting user_requested_cpus Ian Rogers
  5 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2022-03-28 23:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan,
	John Garry, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Kajol Jain, James Clark, German Gomez,
	Adrian Hunter, Riccardo Mancini, Andi Kleen, Alexey Bayduraev,
	Alexander Antonov, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel, netdev, bpf
  Cc: Stephane Eranian, Ian Rogers

The merge function gives the union of two cpu maps. Add an intersect
function which will be used in the next change.
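
A hypothetical usage sketch (not part of the patch) against libperf's
public API; as with merge, the implementation below consumes the
reference to its first argument:

```
#include <stdio.h>
#include <perf/cpumap.h>

int main(void)
{
	/* Intersecting {0-3} with {1,2,6} yields {1,2}. */
	struct perf_cpu_map *a = perf_cpu_map__new("0-3");
	struct perf_cpu_map *b = perf_cpu_map__new("1,2,6");
	struct perf_cpu_map *c = perf_cpu_map__intersect(a, b);

	printf("intersection has %d cpus\n", perf_cpu_map__nr(c)); /* 2 */

	perf_cpu_map__put(b);
	perf_cpu_map__put(c);	/* the reference to 'a' was consumed */
	return 0;
}
```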

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/cpumap.c              | 38 ++++++++++++++++++++++++++++
 tools/lib/perf/include/perf/cpumap.h |  2 ++
 2 files changed, 40 insertions(+)

diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index 384d5e076ee4..60cccd05f243 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -390,3 +390,41 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
 	perf_cpu_map__put(orig);
 	return merged;
 }
+
+struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig,
+					     struct perf_cpu_map *other)
+{
+	struct perf_cpu *tmp_cpus;
+	int tmp_len;
+	int i, j, k;
+	struct perf_cpu_map *merged = NULL;
+
+	if (perf_cpu_map__is_subset(other, orig))
+		return orig;
+	if (perf_cpu_map__is_subset(orig, other)) {
+		perf_cpu_map__put(orig);
+		return perf_cpu_map__get(other);
+	}
+
+	tmp_len = max(orig->nr, other->nr);
+	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
+	if (!tmp_cpus)
+		return NULL;
+
+	i = j = k = 0;
+	while (i < orig->nr && j < other->nr) {
+		if (orig->map[i].cpu < other->map[j].cpu)
+			i++;
+		else if (orig->map[i].cpu > other->map[j].cpu)
+			j++;
+		else {
+			j++;
+			tmp_cpus[k++] = orig->map[i++];
+		}
+	}
+	if (k)
+		merged = cpu_map__trim_new(k, tmp_cpus);
+	free(tmp_cpus);
+	perf_cpu_map__put(orig);
+	return merged;
+}
diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
index 4a2edbdb5e2b..a2a7216c0b78 100644
--- a/tools/lib/perf/include/perf/cpumap.h
+++ b/tools/lib/perf/include/perf/cpumap.h
@@ -19,6 +19,8 @@ LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file);
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map);
 LIBPERF_API struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
 						     struct perf_cpu_map *other);
+LIBPERF_API struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig,
+							 struct perf_cpu_map *other);
 LIBPERF_API void perf_cpu_map__put(struct perf_cpu_map *map);
 LIBPERF_API struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
 LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
-- 
2.35.1.1021.g381101b075-goog


* [PATCH v2 6/6] perf evlist: Respect all_cpus when setting user_requested_cpus
  2022-03-28 23:26 [PATCH v2 0/6] Make evlist CPUs more accurate Ian Rogers
                   ` (4 preceding siblings ...)
  2022-03-28 23:26 ` [PATCH v2 5/6] perf cpumap: Add intersect function Ian Rogers
@ 2022-03-28 23:26 ` Ian Rogers
  5 siblings, 0 replies; 11+ messages in thread
From: Ian Rogers @ 2022-03-28 23:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Mathieu Poirier, Suzuki K Poulose, Mike Leach, Leo Yan,
	John Garry, Will Deacon, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Kajol Jain, James Clark, German Gomez,
	Adrian Hunter, Riccardo Mancini, Andi Kleen, Alexey Bayduraev,
	Alexander Antonov, linux-perf-users, linux-kernel, coresight,
	linux-arm-kernel, netdev, bpf
  Cc: Stephane Eranian, Ian Rogers

If all_cpus is calculated it represents the merge/union of all
evsel cpu maps. By default user_requested_cpus is computed to be
the online CPUs. For uncore events, it is currently often the case
that all_cpus is a subset of user_requested_cpus. Metrics printed
without aggregation and with metric-only, in print_no_aggr_metric,
iterate over user_requested_cpus assuming every CPU has a metric to
print. For each CPU the prefix is printed, but if the evsel's cpus
don't include that CPU nothing else is, yielding empty lines like
the following on a 2 socket, 36 core SkylakeX:

```
$ perf stat -A -M DRAM_BW_Use -a --metric-only -I 1000
     1.000453137 CPU0                       0.00
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137 CPU18                      0.00
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     1.000453137
     2.003717143 CPU0                       0.00
...
```

While it is possible to be lazier about printing the prefix and
trailing newline, making user_requested_cpus a subset of all_cpus is
preferable so that wasted work isn't done elsewhere that
user_requested_cpus is used. The change modifies user_requested_cpus
to be the intersection of the user-specified CPUs, or by default all
online CPUs, with the CPUs computed through the merge of all evsel
cpu maps.

New behavior:
```
$ perf stat -A -M DRAM_BW_Use -a --metric-only -I 1000
     1.001086325 CPU0                       0.00
     1.001086325 CPU18                      0.00
     2.003671291 CPU0                       0.00
     2.003671291 CPU18                      0.00
...
```

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/evlist.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index cb2cf4463c08..1a3308ec35f1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1036,6 +1036,8 @@ int evlist__create_maps(struct evlist *evlist, struct target *target)
 	if (!cpus)
 		goto out_delete_threads;
 
+	if (evlist->core.all_cpus)
+		cpus = perf_cpu_map__intersect(cpus, evlist->core.all_cpus);
 	evlist->core.has_user_cpus = !!target->cpu_list && !target->hybrid;
 
 	perf_evlist__set_maps(&evlist->core, cpus, threads);
-- 
2.35.1.1021.g381101b075-goog


* Re: [PATCH v2 2/6] perf evlist: Rename cpus to user_requested_cpus
  2022-03-28 23:26 ` [PATCH v2 2/6] perf evlist: Rename cpus to user_requested_cpus Ian Rogers
@ 2022-03-30 20:31   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-03-30 20:31 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Mathieu Poirier, Suzuki K Poulose,
	Mike Leach, Leo Yan, John Garry, Will Deacon, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Kajol Jain, James Clark,
	German Gomez, Adrian Hunter, Riccardo Mancini, Andi Kleen,
	Alexey Bayduraev, Alexander Antonov, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel, netdev, bpf,
	Stephane Eranian

Em Mon, Mar 28, 2022 at 04:26:44PM -0700, Ian Rogers escreveu:
> evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps
> of all evsels. For non-task targets, cpus is set to be cpus requested
> from the command line, defaulting to all online cpus if no cpus are
> specified. For an uncore event, all_cpus may be just CPU 0 or every online
> CPU. This causes all_cpus to have fewer values than the cpus variable
> which is confusing given the 'all' in the name. To try to make the behavior
> clearer, rename cpus to user_requested_cpus and add comments on the two
> struct variables.

Thanks, applied.

- Arnaldo

 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/lib/perf/evlist.c                  | 28 ++++++++++++------------
>  tools/lib/perf/include/internal/evlist.h |  7 +++++-
>  tools/perf/arch/arm/util/cs-etm.c        |  8 +++----
>  tools/perf/arch/arm64/util/arm-spe.c     |  2 +-
>  tools/perf/arch/x86/util/intel-bts.c     |  2 +-
>  tools/perf/arch/x86/util/intel-pt.c      |  4 ++--
>  tools/perf/bench/evlist-open-close.c     |  2 +-
>  tools/perf/builtin-ftrace.c              |  2 +-
>  tools/perf/builtin-record.c              |  6 ++---
>  tools/perf/builtin-stat.c                | 10 ++++-----
>  tools/perf/builtin-top.c                 |  2 +-
>  tools/perf/util/auxtrace.c               |  2 +-
>  tools/perf/util/bpf_ftrace.c             |  4 ++--
>  tools/perf/util/evlist.c                 | 15 +++++++------
>  tools/perf/util/record.c                 |  6 ++---
>  tools/perf/util/sideband_evlist.c        |  3 ++-
>  tools/perf/util/stat-display.c           |  2 +-
>  tools/perf/util/synthetic-events.c       |  2 +-
>  tools/perf/util/top.c                    |  8 ++++---
>  19 files changed, 62 insertions(+), 53 deletions(-)
> 
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 9a770bfdc804..1b15ba13c477 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -41,10 +41,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>  	 */
>  	if (!evsel->own_cpus || evlist->has_user_cpus) {
>  		perf_cpu_map__put(evsel->cpus);
> -		evsel->cpus = perf_cpu_map__get(evlist->cpus);
> -	} else if (!evsel->system_wide && perf_cpu_map__empty(evlist->cpus)) {
> +		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
> +	} else if (!evsel->system_wide && perf_cpu_map__empty(evlist->user_requested_cpus)) {
>  		perf_cpu_map__put(evsel->cpus);
> -		evsel->cpus = perf_cpu_map__get(evlist->cpus);
> +		evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus);
>  	} else if (evsel->cpus != evsel->own_cpus) {
>  		perf_cpu_map__put(evsel->cpus);
>  		evsel->cpus = perf_cpu_map__get(evsel->own_cpus);
> @@ -123,10 +123,10 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
>  
>  void perf_evlist__exit(struct perf_evlist *evlist)
>  {
> -	perf_cpu_map__put(evlist->cpus);
> +	perf_cpu_map__put(evlist->user_requested_cpus);
>  	perf_cpu_map__put(evlist->all_cpus);
>  	perf_thread_map__put(evlist->threads);
> -	evlist->cpus = NULL;
> +	evlist->user_requested_cpus = NULL;
>  	evlist->all_cpus = NULL;
>  	evlist->threads = NULL;
>  	fdarray__exit(&evlist->pollfd);
> @@ -155,9 +155,9 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
>  	 * original reference count of 1.  If that is not the case it is up to
>  	 * the caller to increase the reference count.
>  	 */
> -	if (cpus != evlist->cpus) {
> -		perf_cpu_map__put(evlist->cpus);
> -		evlist->cpus = perf_cpu_map__get(cpus);
> +	if (cpus != evlist->user_requested_cpus) {
> +		perf_cpu_map__put(evlist->user_requested_cpus);
> +		evlist->user_requested_cpus = perf_cpu_map__get(cpus);
>  	}
>  
>  	if (threads != evlist->threads) {
> @@ -294,7 +294,7 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist,
>  
>  int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
>  {
> -	int nr_cpus = perf_cpu_map__nr(evlist->cpus);
> +	int nr_cpus = perf_cpu_map__nr(evlist->user_requested_cpus);
>  	int nr_threads = perf_thread_map__nr(evlist->threads);
>  	int nfds = 0;
>  	struct perf_evsel *evsel;
> @@ -426,7 +426,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>  	       int idx, struct perf_mmap_param *mp, int cpu_idx,
>  	       int thread, int *_output, int *_output_overwrite)
>  {
> -	struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->cpus, cpu_idx);
> +	struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->user_requested_cpus, cpu_idx);
>  	struct perf_evsel *evsel;
>  	int revent;
>  
> @@ -536,7 +536,7 @@ mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>  	     struct perf_mmap_param *mp)
>  {
>  	int nr_threads = perf_thread_map__nr(evlist->threads);
> -	int nr_cpus    = perf_cpu_map__nr(evlist->cpus);
> +	int nr_cpus    = perf_cpu_map__nr(evlist->user_requested_cpus);
>  	int cpu, thread;
>  
>  	for (cpu = 0; cpu < nr_cpus; cpu++) {
> @@ -564,8 +564,8 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist)
>  {
>  	int nr_mmaps;
>  
> -	nr_mmaps = perf_cpu_map__nr(evlist->cpus);
> -	if (perf_cpu_map__empty(evlist->cpus))
> +	nr_mmaps = perf_cpu_map__nr(evlist->user_requested_cpus);
> +	if (perf_cpu_map__empty(evlist->user_requested_cpus))
>  		nr_mmaps = perf_thread_map__nr(evlist->threads);
>  
>  	return nr_mmaps;
> @@ -576,7 +576,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
>  			  struct perf_mmap_param *mp)
>  {
>  	struct perf_evsel *evsel;
> -	const struct perf_cpu_map *cpus = evlist->cpus;
> +	const struct perf_cpu_map *cpus = evlist->user_requested_cpus;
>  	const struct perf_thread_map *threads = evlist->threads;
>  
>  	if (!ops || !ops->get || !ops->mmap)
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 4cefade540bd..e3e64f37db7b 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -19,7 +19,12 @@ struct perf_evlist {
>  	int			 nr_entries;
>  	int			 nr_groups;
>  	bool			 has_user_cpus;
> -	struct perf_cpu_map	*cpus;
> +	/**
> +	 * The cpus passed from the command line or all online CPUs by
> +	 * default.
> +	 */
> +	struct perf_cpu_map	*user_requested_cpus;
> +	/** The union of all evsel cpu maps. */
>  	struct perf_cpu_map	*all_cpus;
>  	struct perf_thread_map	*threads;
>  	int			 nr_mmaps;
> diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
> index cbc555245959..11c71aa219f7 100644
> --- a/tools/perf/arch/arm/util/cs-etm.c
> +++ b/tools/perf/arch/arm/util/cs-etm.c
> @@ -199,7 +199,7 @@ static int cs_etm_set_option(struct auxtrace_record *itr,
>  			     struct evsel *evsel, u32 option)
>  {
>  	int i, err = -EINVAL;
> -	struct perf_cpu_map *event_cpus = evsel->evlist->core.cpus;
> +	struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus;
>  	struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
>  
>  	/* Set option of each CPU we have */
> @@ -299,7 +299,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
>  				container_of(itr, struct cs_etm_recording, itr);
>  	struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
>  	struct evsel *evsel, *cs_etm_evsel = NULL;
> -	struct perf_cpu_map *cpus = evlist->core.cpus;
> +	struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
>  	bool privileged = perf_event_paranoid_check(-1);
>  	int err = 0;
>  
> @@ -522,7 +522,7 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
>  {
>  	int i;
>  	int etmv3 = 0, etmv4 = 0, ete = 0;
> -	struct perf_cpu_map *event_cpus = evlist->core.cpus;
> +	struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus;
>  	struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
>  
>  	/* cpu map is not empty, we have specific CPUs to work with */
> @@ -713,7 +713,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
>  	u32 offset;
>  	u64 nr_cpu, type;
>  	struct perf_cpu_map *cpu_map;
> -	struct perf_cpu_map *event_cpus = session->evlist->core.cpus;
> +	struct perf_cpu_map *event_cpus = session->evlist->core.user_requested_cpus;
>  	struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
>  	struct cs_etm_recording *ptr =
>  			container_of(itr, struct cs_etm_recording, itr);
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> index 5860bbaea95a..86e2e926aa0e 100644
> --- a/tools/perf/arch/arm64/util/arm-spe.c
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -144,7 +144,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
>  			container_of(itr, struct arm_spe_recording, itr);
>  	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
>  	struct evsel *evsel, *arm_spe_evsel = NULL;
> -	struct perf_cpu_map *cpus = evlist->core.cpus;
> +	struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
>  	bool privileged = perf_event_paranoid_check(-1);
>  	struct evsel *tracking_evsel;
>  	int err;
> diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c
> index 4a76d49d25d6..d68a0f48e41e 100644
> --- a/tools/perf/arch/x86/util/intel-bts.c
> +++ b/tools/perf/arch/x86/util/intel-bts.c
> @@ -110,7 +110,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr,
>  			container_of(itr, struct intel_bts_recording, itr);
>  	struct perf_pmu *intel_bts_pmu = btsr->intel_bts_pmu;
>  	struct evsel *evsel, *intel_bts_evsel = NULL;
> -	const struct perf_cpu_map *cpus = evlist->core.cpus;
> +	const struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
>  	bool privileged = perf_event_paranoid_check(-1);
>  
>  	if (opts->auxtrace_sample_mode) {
> diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
> index 8c31578d6f4a..38ec2666ec12 100644
> --- a/tools/perf/arch/x86/util/intel-pt.c
> +++ b/tools/perf/arch/x86/util/intel-pt.c
> @@ -382,7 +382,7 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
>  			ui__warning("Intel Processor Trace: TSC not available\n");
>  	}
>  
> -	per_cpu_mmaps = !perf_cpu_map__empty(session->evlist->core.cpus);
> +	per_cpu_mmaps = !perf_cpu_map__empty(session->evlist->core.user_requested_cpus);
>  
>  	auxtrace_info->type = PERF_AUXTRACE_INTEL_PT;
>  	auxtrace_info->priv[INTEL_PT_PMU_TYPE] = intel_pt_pmu->type;
> @@ -632,7 +632,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
>  	struct perf_pmu *intel_pt_pmu = ptr->intel_pt_pmu;
>  	bool have_timing_info, need_immediate = false;
>  	struct evsel *evsel, *intel_pt_evsel = NULL;
> -	const struct perf_cpu_map *cpus = evlist->core.cpus;
> +	const struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
>  	bool privileged = perf_event_paranoid_check(-1);
>  	u64 tsc_bit;
>  	int err;
> diff --git a/tools/perf/bench/evlist-open-close.c b/tools/perf/bench/evlist-open-close.c
> index de56601f69ee..5a27691469ed 100644
> --- a/tools/perf/bench/evlist-open-close.c
> +++ b/tools/perf/bench/evlist-open-close.c
> @@ -151,7 +151,7 @@ static int bench_evlist_open_close__run(char *evstr)
>  
>  	init_stats(&time_stats);
>  
> -	printf("  Number of cpus:\t%d\n", perf_cpu_map__nr(evlist->core.cpus));
> +	printf("  Number of cpus:\t%d\n", perf_cpu_map__nr(evlist->core.user_requested_cpus));
>  	printf("  Number of threads:\t%d\n", evlist->core.threads->nr);
>  	printf("  Number of events:\t%d (%d fds)\n",
>  		evlist->core.nr_entries, evlist__count_evsel_fds(evlist));
> diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
> index ad9ce1bfffa1..7de07bb16d23 100644
> --- a/tools/perf/builtin-ftrace.c
> +++ b/tools/perf/builtin-ftrace.c
> @@ -301,7 +301,7 @@ static int set_tracing_cpumask(struct perf_cpu_map *cpumap)
>  
>  static int set_tracing_cpu(struct perf_ftrace *ftrace)
>  {
> -	struct perf_cpu_map *cpumap = ftrace->evlist->core.cpus;
> +	struct perf_cpu_map *cpumap = ftrace->evlist->core.user_requested_cpus;
>  
>  	if (!target__has_cpu(&ftrace->target))
>  		return 0;
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 0b4abed555d8..ba74fab02e62 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -987,7 +987,7 @@ static int record__thread_data_init_maps(struct record_thread *thread_data, stru
>  	int m, tm, nr_mmaps = evlist->core.nr_mmaps;
>  	struct mmap *mmap = evlist->mmap;
>  	struct mmap *overwrite_mmap = evlist->overwrite_mmap;
> -	struct perf_cpu_map *cpus = evlist->core.cpus;
> +	struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
>  
>  	thread_data->nr_mmaps = bitmap_weight(thread_data->mask->maps.bits,
>  					      thread_data->mask->maps.nbits);
> @@ -1881,7 +1881,7 @@ static int record__synthesize(struct record *rec, bool tail)
>  		return err;
>  	}
>  
> -	err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->core.cpus,
> +	err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->core.user_requested_cpus,
>  					     process_synthesized_event, NULL);
>  	if (err < 0) {
>  		pr_err("Couldn't synthesize cpu map.\n");
> @@ -3675,7 +3675,7 @@ static int record__init_thread_default_masks(struct record *rec, struct perf_cpu
>  static int record__init_thread_masks(struct record *rec)
>  {
>  	int ret = 0;
> -	struct perf_cpu_map *cpus = rec->evlist->core.cpus;
> +	struct perf_cpu_map *cpus = rec->evlist->core.user_requested_cpus;
>  
>  	if (!record__threads_enabled(rec))
>  		return record__init_thread_default_masks(rec, cpus);
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index b81ae5053218..a96f106dc93a 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -804,7 +804,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	if (group)
>  		evlist__set_leader(evsel_list);
>  
> -	if (!cpu_map__is_dummy(evsel_list->core.cpus)) {
> +	if (!cpu_map__is_dummy(evsel_list->core.user_requested_cpus)) {
>  		if (affinity__setup(&saved_affinity) < 0)
>  			return -1;
>  		affinity = &saved_affinity;
> @@ -1458,7 +1458,7 @@ static int perf_stat_init_aggr_mode(void)
>  	aggr_cpu_id_get_t get_id = aggr_mode__get_aggr(stat_config.aggr_mode);
>  
>  	if (get_id) {
> -		stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus,
> +		stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.user_requested_cpus,
>  							 get_id, /*data=*/NULL);
>  		if (!stat_config.aggr_map) {
>  			pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
> @@ -1472,8 +1472,8 @@ static int perf_stat_init_aggr_mode(void)
>  	 * taking the highest cpu number to be the size of
>  	 * the aggregation translate cpumap.
>  	 */
> -	if (evsel_list->core.cpus)
> -		nr = perf_cpu_map__max(evsel_list->core.cpus).cpu;
> +	if (evsel_list->core.user_requested_cpus)
> +		nr = perf_cpu_map__max(evsel_list->core.user_requested_cpus).cpu;
>  	else
>  		nr = 0;
>  	stat_config.cpus_aggr_map = cpu_aggr_map__empty_new(nr + 1);
> @@ -1630,7 +1630,7 @@ static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
>  	if (!get_id)
>  		return 0;
>  
> -	stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.cpus, get_id, env);
> +	stat_config.aggr_map = cpu_aggr_map__new(evsel_list->core.user_requested_cpus, get_id, env);
>  	if (!stat_config.aggr_map) {
>  		pr_err("cannot build %s map", aggr_mode__string[stat_config.aggr_mode]);
>  		return -1;
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 9b08e44a31d9..fd8fd913c533 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1021,7 +1021,7 @@ static int perf_top__start_counters(struct perf_top *top)
>  
>  	evlist__for_each_entry(evlist, counter) {
>  try_again:
> -		if (evsel__open(counter, top->evlist->core.cpus,
> +		if (evsel__open(counter, top->evlist->core.user_requested_cpus,
>  				     top->evlist->core.threads) < 0) {
>  
>  			/*
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 9e48652662d4..df1c5bbbaa0d 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -174,7 +174,7 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
>  	mp->idx = idx;
>  
>  	if (per_cpu) {
> -		mp->cpu = perf_cpu_map__cpu(evlist->core.cpus, idx);
> +		mp->cpu = perf_cpu_map__cpu(evlist->core.user_requested_cpus, idx);
>  		if (evlist->core.threads)
>  			mp->tid = perf_thread_map__pid(evlist->core.threads, 0);
>  		else
> diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
> index 4f4d3aaff37c..7a4297d8fd2c 100644
> --- a/tools/perf/util/bpf_ftrace.c
> +++ b/tools/perf/util/bpf_ftrace.c
> @@ -38,7 +38,7 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
>  
>  	/* don't need to set cpu filter for system-wide mode */
>  	if (ftrace->target.cpu_list) {
> -		ncpus = perf_cpu_map__nr(ftrace->evlist->core.cpus);
> +		ncpus = perf_cpu_map__nr(ftrace->evlist->core.user_requested_cpus);
>  		bpf_map__set_max_entries(skel->maps.cpu_filter, ncpus);
>  	}
>  
> @@ -63,7 +63,7 @@ int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
>  		fd = bpf_map__fd(skel->maps.cpu_filter);
>  
>  		for (i = 0; i < ncpus; i++) {
> -			cpu = perf_cpu_map__cpu(ftrace->evlist->core.cpus, i).cpu;
> +			cpu = perf_cpu_map__cpu(ftrace->evlist->core.user_requested_cpus, i).cpu;
>  			bpf_map_update_elem(fd, &cpu, &val, BPF_ANY);
>  		}
>  	}
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 9bb79e049957..cb2cf4463c08 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -440,7 +440,7 @@ static void __evlist__disable(struct evlist *evlist, char *evsel_name)
>  	bool has_imm = false;
>  
>  	// See explanation in evlist__close()
> -	if (!cpu_map__is_dummy(evlist->core.cpus)) {
> +	if (!cpu_map__is_dummy(evlist->core.user_requested_cpus)) {
>  		if (affinity__setup(&saved_affinity) < 0)
>  			return;
>  		affinity = &saved_affinity;
> @@ -500,7 +500,7 @@ static void __evlist__enable(struct evlist *evlist, char *evsel_name)
>  	struct affinity saved_affinity, *affinity = NULL;
>  
>  	// See explanation in evlist__close()
> -	if (!cpu_map__is_dummy(evlist->core.cpus)) {
> +	if (!cpu_map__is_dummy(evlist->core.user_requested_cpus)) {
>  		if (affinity__setup(&saved_affinity) < 0)
>  			return;
>  		affinity = &saved_affinity;
> @@ -565,7 +565,7 @@ static int evlist__enable_event_cpu(struct evlist *evlist, struct evsel *evsel,
>  static int evlist__enable_event_thread(struct evlist *evlist, struct evsel *evsel, int thread)
>  {
>  	int cpu;
> -	int nr_cpus = perf_cpu_map__nr(evlist->core.cpus);
> +	int nr_cpus = perf_cpu_map__nr(evlist->core.user_requested_cpus);
>  
>  	if (!evsel->core.fd)
>  		return -EINVAL;
> @@ -580,7 +580,7 @@ static int evlist__enable_event_thread(struct evlist *evlist, struct evsel *evse
>  
>  int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx)
>  {
> -	bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.cpus);
> +	bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus);
>  
>  	if (per_cpu_mmaps)
>  		return evlist__enable_event_cpu(evlist, evsel, idx);
> @@ -1301,10 +1301,11 @@ void evlist__close(struct evlist *evlist)
>  	struct affinity affinity;
>  
>  	/*
> -	 * With perf record core.cpus is usually NULL.
> +	 * With perf record core.user_requested_cpus is usually NULL.
>  	 * Use the old method to handle this for now.
>  	 */
> -	if (!evlist->core.cpus || cpu_map__is_dummy(evlist->core.cpus)) {
> +	if (!evlist->core.user_requested_cpus ||
> +	    cpu_map__is_dummy(evlist->core.user_requested_cpus)) {
>  		evlist__for_each_entry_reverse(evlist, evsel)
>  			evsel__close(evsel);
>  		return;
> @@ -1367,7 +1368,7 @@ int evlist__open(struct evlist *evlist)
>  	 * Default: one fd per CPU, all threads, aka systemwide
>  	 * as sys_perf_event_open(cpu = -1, thread = -1) is EINVAL
>  	 */
> -	if (evlist->core.threads == NULL && evlist->core.cpus == NULL) {
> +	if (evlist->core.threads == NULL && evlist->core.user_requested_cpus == NULL) {
>  		err = evlist__create_syswide_maps(evlist);
>  		if (err < 0)
>  			goto out_err;
> diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
> index 007a64681416..5b09ecbb05dc 100644
> --- a/tools/perf/util/record.c
> +++ b/tools/perf/util/record.c
> @@ -106,7 +106,7 @@ void evlist__config(struct evlist *evlist, struct record_opts *opts, struct call
>  	if (opts->group)
>  		evlist__set_leader(evlist);
>  
> -	if (perf_cpu_map__cpu(evlist->core.cpus, 0).cpu < 0)
> +	if (perf_cpu_map__cpu(evlist->core.user_requested_cpus, 0).cpu < 0)
>  		opts->no_inherit = true;
>  
>  	use_comm_exec = perf_can_comm_exec();
> @@ -244,7 +244,7 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str)
>  
>  	evsel = evlist__last(temp_evlist);
>  
> -	if (!evlist || perf_cpu_map__empty(evlist->core.cpus)) {
> +	if (!evlist || perf_cpu_map__empty(evlist->core.user_requested_cpus)) {
>  		struct perf_cpu_map *cpus = perf_cpu_map__new(NULL);
>  
>  		if (cpus)
> @@ -252,7 +252,7 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str)
>  
>  		perf_cpu_map__put(cpus);
>  	} else {
> -		cpu = perf_cpu_map__cpu(evlist->core.cpus, 0);
> +		cpu = perf_cpu_map__cpu(evlist->core.user_requested_cpus, 0);
>  	}
>  
>  	while (1) {
> diff --git a/tools/perf/util/sideband_evlist.c b/tools/perf/util/sideband_evlist.c
> index 748371ac22be..388846f17bc1 100644
> --- a/tools/perf/util/sideband_evlist.c
> +++ b/tools/perf/util/sideband_evlist.c
> @@ -114,7 +114,8 @@ int evlist__start_sb_thread(struct evlist *evlist, struct target *target)
>  	}
>  
>  	evlist__for_each_entry(evlist, counter) {
> -		if (evsel__open(counter, evlist->core.cpus, evlist->core.threads) < 0)
> +		if (evsel__open(counter, evlist->core.user_requested_cpus,
> +				evlist->core.threads) < 0)
>  			goto out_delete_evlist;
>  	}
>  
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 9cbe351b141f..138e3ab9d638 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -929,7 +929,7 @@ static void print_no_aggr_metric(struct perf_stat_config *config,
>  	int all_idx;
>  	struct perf_cpu cpu;
>  
> -	perf_cpu_map__for_each_cpu(cpu, all_idx, evlist->core.cpus) {
> +	perf_cpu_map__for_each_cpu(cpu, all_idx, evlist->core.user_requested_cpus) {
>  		struct evsel *counter;
>  		bool first = true;
>  
> diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> index b654de0841f8..27acdc5e5723 100644
> --- a/tools/perf/util/synthetic-events.c
> +++ b/tools/perf/util/synthetic-events.c
> @@ -2127,7 +2127,7 @@ int perf_event__synthesize_stat_events(struct perf_stat_config *config, struct p
>  		return err;
>  	}
>  
> -	err = perf_event__synthesize_cpu_map(tool, evlist->core.cpus, process, NULL);
> +	err = perf_event__synthesize_cpu_map(tool, evlist->core.user_requested_cpus, process, NULL);
>  	if (err < 0) {
>  		pr_err("Couldn't synthesize thread map.\n");
>  		return err;
> diff --git a/tools/perf/util/top.c b/tools/perf/util/top.c
> index c1ebfc5d2e0c..b8b32431d2f7 100644
> --- a/tools/perf/util/top.c
> +++ b/tools/perf/util/top.c
> @@ -95,15 +95,17 @@ size_t perf_top__header_snprintf(struct perf_top *top, char *bf, size_t size)
>  
>  	if (target->cpu_list)
>  		ret += SNPRINTF(bf + ret, size - ret, ", CPU%s: %s)",
> -				perf_cpu_map__nr(top->evlist->core.cpus) > 1 ? "s" : "",
> +				perf_cpu_map__nr(top->evlist->core.user_requested_cpus) > 1
> +				? "s" : "",
>  				target->cpu_list);
>  	else {
>  		if (target->tid)
>  			ret += SNPRINTF(bf + ret, size - ret, ")");
>  		else
>  			ret += SNPRINTF(bf + ret, size - ret, ", %d CPU%s)",
> -					perf_cpu_map__nr(top->evlist->core.cpus),
> -					perf_cpu_map__nr(top->evlist->core.cpus) > 1 ? "s" : "");
> +					perf_cpu_map__nr(top->evlist->core.user_requested_cpus),
> +					perf_cpu_map__nr(top->evlist->core.user_requested_cpus) > 1
> +					? "s" : "");
>  	}
>  
>  	perf_top__reset_sample_counters(top);
> -- 
> 2.35.1.1021.g381101b075-goog

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/6] perf cpumap: Add is_subset function
  2022-03-28 23:26 ` [PATCH v2 3/6] perf cpumap: Add is_subset function Ian Rogers
@ 2022-03-30 20:34   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-03-30 20:34 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Mathieu Poirier, Suzuki K Poulose,
	Mike Leach, Leo Yan, John Garry, Will Deacon, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Kajol Jain, James Clark,
	German Gomez, Adrian Hunter, Riccardo Mancini, Andi Kleen,
	Alexey Bayduraev, Alexander Antonov, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel, netdev, bpf,
	Stephane Eranian

On Mon, Mar 28, 2022 at 04:26:45PM -0700, Ian Rogers wrote:
> Returns true if the second argument is a subset of the first.

Thanks, applied.

- Arnaldo

 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/lib/perf/cpumap.c                  | 20 ++++++++++++++++++++
>  tools/lib/perf/include/internal/cpumap.h |  1 +
>  2 files changed, 21 insertions(+)
> 
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index ee66760f1e63..23701024e0c0 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -319,6 +319,26 @@ struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map)
>  	return map->nr > 0 ? map->map[map->nr - 1] : result;
>  }
>  
> +/** Is 'b' a subset of 'a'. */
> +bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu_map *b)
> +{
> +	if (a == b || !b)
> +		return true;
> +	if (!a || b->nr > a->nr)
> +		return false;
> +
> +	for (int i = 0, j = 0; i < a->nr; i++) {
> +		if (a->map[i].cpu > b->map[j].cpu)
> +			return false;
> +		if (a->map[i].cpu == b->map[j].cpu) {
> +			j++;
> +			if (j == b->nr)
> +				return true;
> +		}
> +	}
> +	return false;
> +}
> +
>  /*
>   * Merge two cpumaps
>   *
> diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> index 1973a18c096b..35dd29642296 100644
> --- a/tools/lib/perf/include/internal/cpumap.h
> +++ b/tools/lib/perf/include/internal/cpumap.h
> @@ -25,5 +25,6 @@ struct perf_cpu_map {
>  #endif
>  
>  int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu);
> +bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu_map *b);
>  
>  #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
> -- 
> 2.35.1.1021.g381101b075-goog
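
To make the semantics concrete: with a map of 0-3 and a map of 1,3, the
second is a subset of the first, while the reverse is not. A minimal
check along these lines, assuming a test program built against libperf
with the internal cpumap header on the include path (the include setup
here is an assumption, not part of the patch):

#include <assert.h>
#include <perf/cpumap.h>
#include <internal/cpumap.h>

int main(void)
{
	struct perf_cpu_map *a = perf_cpu_map__new("0-3");
	struct perf_cpu_map *b = perf_cpu_map__new("1,3");

	/* 1 and 3 are both present in 0-3, so 'b' is a subset of 'a'. */
	assert(perf_cpu_map__is_subset(a, b));
	/* The reverse does not hold: 0 and 2 are missing from 'b'. */
	assert(!perf_cpu_map__is_subset(b, a));
	/* A NULL second argument is treated as a subset of anything. */
	assert(perf_cpu_map__is_subset(a, NULL));

	perf_cpu_map__put(b);
	perf_cpu_map__put(a);
	return 0;
}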

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/6] perf cpumap: More cpu map reuse by merge.
  2022-03-28 23:26 ` [PATCH v2 4/6] perf cpumap: More cpu map reuse by merge Ian Rogers
@ 2022-03-30 20:34   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-03-30 20:34 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Mathieu Poirier, Suzuki K Poulose,
	Mike Leach, Leo Yan, John Garry, Will Deacon, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Kajol Jain, James Clark,
	German Gomez, Adrian Hunter, Riccardo Mancini, Andi Kleen,
	Alexey Bayduraev, Alexander Antonov, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel, netdev, bpf,
	Stephane Eranian

On Mon, Mar 28, 2022 at 04:26:46PM -0700, Ian Rogers wrote:
> perf_cpu_map__merge will reuse one of its arguments if they are equal or
> the other argument is NULL. The arguments could be reused if it is known
> one set of values is a subset of the other. For example, a map of 0-1
> and a map of just 0 when merged yields the map of 0-1. Currently a new
> map is created rather than adding a reference count to the original 0-1
> map.

Thanks, applied.

- Arnaldo

 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/lib/perf/cpumap.c | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index 23701024e0c0..384d5e076ee4 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -355,17 +355,12 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
>  	int i, j, k;
>  	struct perf_cpu_map *merged;
>  
> -	if (!orig && !other)
> -		return NULL;
> -	if (!orig) {
> -		perf_cpu_map__get(other);
> -		return other;
> -	}
> -	if (!other)
> -		return orig;
> -	if (orig->nr == other->nr &&
> -	    !memcmp(orig->map, other->map, orig->nr * sizeof(struct perf_cpu)))
> +	if (perf_cpu_map__is_subset(orig, other))
>  		return orig;
> +	if (perf_cpu_map__is_subset(other, orig)) {
> +		perf_cpu_map__put(orig);
> +		return perf_cpu_map__get(other);
> +	}
>  
>  	tmp_len = orig->nr + other->nr;
>  	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> -- 
> 2.35.1.1021.g381101b075-goog
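
To spell out the example from the commit message: merging 0-1 with 0 now
hands back the original 0-1 map instead of allocating a duplicate. A
rough caller-side sketch, illustrative only, following the existing
perf_cpu_map__merge() convention that the caller's reference on the
first argument is handed over to the call:

#include <perf/cpumap.h>

int main(void)
{
	struct perf_cpu_map *a = perf_cpu_map__new("0-1");
	struct perf_cpu_map *b = perf_cpu_map__new("0");
	struct perf_cpu_map *merged;

	/*
	 * 0 is a subset of 0-1, so with this patch the merge returns 'a'
	 * itself rather than building a new two-entry map.
	 */
	merged = perf_cpu_map__merge(a, b);

	/* 'a' lives on as 'merged'; do not put it separately. */
	perf_cpu_map__put(b);
	perf_cpu_map__put(merged);
	return 0;
}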

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 5/6] perf cpumap: Add intersect function.
  2022-03-28 23:26 ` [PATCH v2 5/6] perf cpumap: Add intersect function Ian Rogers
@ 2022-04-01 19:12   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-04-01 19:12 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Mathieu Poirier, Suzuki K Poulose,
	Mike Leach, Leo Yan, John Garry, Will Deacon, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Kajol Jain, James Clark,
	German Gomez, Adrian Hunter, Riccardo Mancini, Andi Kleen,
	Alexey Bayduraev, Alexander Antonov, linux-perf-users,
	linux-kernel, coresight, linux-arm-kernel, netdev, bpf,
	Stephane Eranian

On Mon, Mar 28, 2022 at 04:26:47PM -0700, Ian Rogers wrote:
> The merge function gives the union of two cpu maps. Add an intersect
> function which will be used in the next change.

So I really don't think intersect() should modify the contents of any
of its arguments; at most it should return one of them with a bumped
refcount, as an optimization.

The merge() operation is different in the sense that one expects that
one of the operands will be inserted into the other, and even then it
would be better to have a clearer semantic, i.e. merge(a, b) should mean
get the contents of b and insert into a.

Since we're talking about CPUs, it doesn't make sense to have a CPU
multiple times in the cpu_map, so we eliminate duplicates while doing
it.

Also perhaps the merge() operation should not even change any of the
operands, but instead return a new cpu_map when neither operand is
contained in the other; when one is a subset of the other, bumping the
reference count of the superset and returning it would be a valid
optimization.

But that boat has departed already, i.e. perf_cpu_map__merge() is
already an exported libperf API, sigh.

This is something we're exporting, so I think this warrants further
discussion, even if a fix ends up depending on this new API being
merged.
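
To illustrate the semantics I have in mind: keep the exact walk from the
patch, but never put() the first argument, only ever bump a refcount or
allocate a new map. Roughly (an untested sketch against the internals of
tools/lib/perf/cpumap.c, reusing cpu_map__trim_new() from that file):

struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig,
					     struct perf_cpu_map *other)
{
	struct perf_cpu *tmp_cpus;
	int tmp_len, i, j, k;
	struct perf_cpu_map *result = NULL;

	/* If one map contains the other, the intersection is the smaller map. */
	if (perf_cpu_map__is_subset(other, orig))
		return perf_cpu_map__get(orig);
	if (perf_cpu_map__is_subset(orig, other))
		return perf_cpu_map__get(other);

	tmp_len = max(orig->nr, other->nr);
	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
	if (!tmp_cpus)
		return NULL;

	/* Both maps are sorted, so one linear walk collects the common CPUs. */
	i = j = k = 0;
	while (i < orig->nr && j < other->nr) {
		if (orig->map[i].cpu < other->map[j].cpu)
			i++;
		else if (orig->map[i].cpu > other->map[j].cpu)
			j++;
		else {
			j++;
			tmp_cpus[k++] = orig->map[i++];
		}
	}
	if (k)
		result = cpu_map__trim_new(k, tmp_cpus);
	free(tmp_cpus);
	/* No perf_cpu_map__put(orig) here: the caller keeps both references. */
	return result;
}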

- Arnaldo
 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/lib/perf/cpumap.c              | 38 ++++++++++++++++++++++++++++
>  tools/lib/perf/include/perf/cpumap.h |  2 ++
>  2 files changed, 40 insertions(+)
> 
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index 384d5e076ee4..60cccd05f243 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -390,3 +390,41 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
>  	perf_cpu_map__put(orig);
>  	return merged;
>  }
> +
> +struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig,
> +					     struct perf_cpu_map *other)
> +{
> +	struct perf_cpu *tmp_cpus;
> +	int tmp_len;
> +	int i, j, k;
> +	struct perf_cpu_map *merged = NULL;
> +
> +	if (perf_cpu_map__is_subset(other, orig))
> +		return orig;
> +	if (perf_cpu_map__is_subset(orig, other)) {
> +		perf_cpu_map__put(orig);
> +		return perf_cpu_map__get(other);
> +	}
> +
> +	tmp_len = max(orig->nr, other->nr);
> +	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> +	if (!tmp_cpus)
> +		return NULL;
> +
> +	i = j = k = 0;
> +	while (i < orig->nr && j < other->nr) {
> +		if (orig->map[i].cpu < other->map[j].cpu)
> +			i++;
> +		else if (orig->map[i].cpu > other->map[j].cpu)
> +			j++;
> +		else {
> +			j++;
> +			tmp_cpus[k++] = orig->map[i++];
> +		}
> +	}
> +	if (k)
> +		merged = cpu_map__trim_new(k, tmp_cpus);
> +	free(tmp_cpus);
> +	perf_cpu_map__put(orig);
> +	return merged;
> +}
> diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
> index 4a2edbdb5e2b..a2a7216c0b78 100644
> --- a/tools/lib/perf/include/perf/cpumap.h
> +++ b/tools/lib/perf/include/perf/cpumap.h
> @@ -19,6 +19,8 @@ LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file);
>  LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map);
>  LIBPERF_API struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
>  						     struct perf_cpu_map *other);
> +LIBPERF_API struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig,
> +							 struct perf_cpu_map *other);
>  LIBPERF_API void perf_cpu_map__put(struct perf_cpu_map *map);
>  LIBPERF_API struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
>  LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
> -- 
> 2.35.1.1021.g381101b075-goog
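
As posted, the exported call follows the merge() convention for
references: the caller's reference on the first argument is handed over
to the call and a reference on the result comes back. A small usage
sketch of the API exactly as it stands in this patch (illustrative
only):

#include <stdio.h>
#include <perf/cpumap.h>

int main(void)
{
	struct perf_cpu_map *a = perf_cpu_map__new("0-3");
	struct perf_cpu_map *b = perf_cpu_map__new("2-5");
	struct perf_cpu_map *both;

	/* Neither map contains the other, so a new map {2,3} is built. */
	both = perf_cpu_map__intersect(a, b);

	/* 'a' was consumed by the call; only 'b' and the result are put. */
	printf("%d common cpus\n", perf_cpu_map__nr(both));

	perf_cpu_map__put(b);
	perf_cpu_map__put(both);
	return 0;
}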

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-04-01 19:12 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-28 23:26 [PATCH v2 0/6] Make evlist CPUs more accurate Ian Rogers
2022-03-28 23:26 ` [PATCH v2 1/6] perf stat: Avoid segv if core.user_cpus isn't set Ian Rogers
2022-03-28 23:26 ` [PATCH v2 2/6] perf evlist: Rename cpus to user_requested_cpus Ian Rogers
2022-03-30 20:31   ` Arnaldo Carvalho de Melo
2022-03-28 23:26 ` [PATCH v2 3/6] perf cpumap: Add is_subset function Ian Rogers
2022-03-30 20:34   ` Arnaldo Carvalho de Melo
2022-03-28 23:26 ` [PATCH v2 4/6] perf cpumap: More cpu map reuse by merge Ian Rogers
2022-03-30 20:34   ` Arnaldo Carvalho de Melo
2022-03-28 23:26 ` [PATCH v2 5/6] perf cpumap: Add intersect function Ian Rogers
2022-04-01 19:12   ` Arnaldo Carvalho de Melo
2022-03-28 23:26 ` [PATCH v2 6/6] perf evlist: Respect all_cpus when setting user_requested_cpus Ian Rogers
