* [PATCH 0/3] Reference count checker and cpu map related fixes
@ 2022-01-22  4:58 Ian Rogers
  2022-01-22  4:58 ` [PATCH 1/3] perf python: Fix cpu_map__item building Ian Rogers
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Ian Rogers @ 2022-01-22  4:58 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

The first patch is a fix, the second a refactor/cleanup and the third
something of an RFC.

The perf tool has a class of memory problems where reference counts
are used incorrectly. Memory sanitizers and valgrind don't provide
useful ways to debug these problems: you see a memory leak where the
only pertinent information is the original allocation site. What would
be more useful is knowing where a get fails to have a corresponding
put, or where there are double puts, etc.
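
As a hypothetical sketch of the bug class (not code from the tree):

  struct perf_cpu_map *cpus = perf_cpu_map__new(NULL); /* refcount 1 */
  struct perf_cpu_map *copy = perf_cpu_map__get(cpus); /* refcount 2 */
  ...
  perf_cpu_map__put(cpus);                             /* refcount 1 */
  /*
   * If the put for 'copy' is never made, leak sanitizer reports only
   * the allocation site in perf_cpu_map__new(), not the unmatched get.
   * If 'copy' is put twice, the map is freed while other holders may
   * still reference it.
   */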

This work was motivated by the roll-back of:
https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
where fixing a missed put resulted in a use-after-free in a different
context. There was a sense, in fixing the issue, that a game of
whack-a-mole had been embarked upon in adding missed gets and puts.

The basic approach of the change is to add a level of indirection at
the get and put calls. Get allocates a level of indirection that, if
no corresponding put is called, becomes a memory leak (with an
associated stack trace) that leak sanitizer can report. Similarly, if
two puts are called for the same get, then address sanitizer can
detect the double free. Use after put is also detected, and should
yield a segv even without a sanitizer.
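
A minimal sketch of the mechanism, simplified from patch 3 which has
the actual implementation:

  struct perf_cpu_map {                /* wrapper handed to all users */
          struct original_perf_cpu_map *orig;
  };

  struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
  {
          /*
           * Each get mallocs a fresh wrapper; a get with no matching
           * put leaves it allocated, so leak sanitizer reports it with
           * the stack trace of the get.
           */
          struct perf_cpu_map *new_wrapper = malloc(sizeof(*new_wrapper));

          refcount_inc(&map->orig->refcnt);
          new_wrapper->orig = map->orig;
          return new_wrapper;
  }

  void perf_cpu_map__put(struct perf_cpu_map *map)
  {
          if (refcount_dec_and_test(&map->orig->refcnt))
                  free(map->orig);
          map->orig = NULL;  /* use after put dereferences NULL */
          free(map);         /* a second put is a double free that
                              * address sanitizer reports */
  }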

The patches demonstrate the approach on the reference counted
perf_cpu_map API. The associated struct only has two variables, so the
necessary indirection is limited, and the API was recently refactored
so it should be fresh in mind. To avoid having two APIs, the
indirection layer takes the name perf_cpu_map when the checking is in
effect; macros are necessary to control the naming of the struct and
to strip the indirection. Patch 3 implements the checker and has a
diffstat of:
 5 files changed, 146 insertions(+), 74 deletions(-)
The number of files changed reflects the partial move of this API to
libperf and isn't typical. Overall, reference count safety is added at
a cost of about 72 lines of code, hopefully demonstrating the approach
to be cheap enough that we can repeat it on the other reference
counted structs in perf - albeit that perf_cpu_map is somewhat unusual
in already using lots of accessor functions to access the struct.

Did the approach spot any bugs? Not really. It did spot that in
perf_cpu_map__merge, get was being used for the side-effect of
incrementing a reference count while discarding the perf_cpu_map that
the get returns. Having to be accurate about this kind of behavior is
a frustration of the approach, but it demonstrates that the tool
works, and for resolving reference count issues I feel it is worth it.
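
Concretely, with checking enabled a get returns a fresh wrapper rather
than its argument, so patch 3 makes the merge early-return use the
get's result:

 -	if (!orig) {
 -		perf_cpu_map__get(other);
 -		return other;
 -	}
 +	if (!orig)
 +		return perf_cpu_map__get(other);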

An alternative that was considered was ref_tracker:
 https://lwn.net/Articles/877603/
ref_tracker requires each use of a reference counted struct to also
carry a cookie/tracker. The cookie is combined with data in a
ref_tracker_dir to spot double puts. When an object is finished with,
leaks can be detected, just as leak analysis happens with this
approach. The approach here was preferred as it doesn't introduce
cookies, spots use after put, and appears moderately more neutral to
the API. Weaknesses of the implemented approach are not being able to
do ad hoc leak detection and a preference for adding an accessor API
to structs. I believe there are other issues, hence the RFC.
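
For comparison, a rough sketch of ref_tracker usage based on the LWN
description (the exact API details may differ):

  struct foo {
          refcount_t refcnt;
          struct ref_tracker_dir dir;  /* per-object tracking state */
  };

  struct foo *obj = ...;
  struct ref_tracker *tracker;         /* the extra per-holder cookie */

  ref_tracker_alloc(&obj->dir, &tracker, GFP_KERNEL);  /* at get */
  ...
  ref_tracker_free(&obj->dir, &tracker);               /* at put */

It is that per-holder cookie that the approach here avoids.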

Ian Rogers (3):
  perf python: Fix cpu_map__item building
  perf cpumap: Migrate to libperf cpumap api
  perf cpumap: Add reference count checking

 tools/lib/perf/cpumap.c                       | 120 ++++++++++++------
 tools/lib/perf/evsel.c                        |   4 +-
 tools/lib/perf/include/internal/cpumap.h      |  14 +-
 tools/perf/bench/epoll-ctl.c                  |   2 +-
 tools/perf/bench/epoll-wait.c                 |   2 +-
 tools/perf/bench/evlist-open-close.c          |   4 +-
 tools/perf/bench/futex-hash.c                 |   2 +-
 tools/perf/bench/futex-lock-pi.c              |   2 +-
 tools/perf/bench/futex-requeue.c              |   2 +-
 tools/perf/bench/futex-wake-parallel.c        |   2 +-
 tools/perf/bench/futex-wake.c                 |   2 +-
 tools/perf/builtin-ftrace.c                   |   2 +-
 tools/perf/builtin-stat.c                     |   7 +-
 tools/perf/tests/bitmap.c                     |   4 +-
 tools/perf/tests/cpumap.c                     |  20 ++-
 tools/perf/tests/event_update.c               |   8 +-
 tools/perf/tests/mem2node.c                   |   4 +-
 tools/perf/tests/mmap-basic.c                 |   5 +-
 tools/perf/tests/topology.c                   |  37 +++---
 tools/perf/util/auxtrace.c                    |   2 +-
 tools/perf/util/counts.c                      |   2 +-
 tools/perf/util/cpumap.c                      |  42 +++---
 tools/perf/util/cpumap.h                      |   2 +-
 tools/perf/util/cputopo.c                     |   4 +-
 tools/perf/util/evlist-hybrid.c               |  11 +-
 tools/perf/util/evsel.c                       |  20 +--
 tools/perf/util/evsel.h                       |   3 +-
 tools/perf/util/mmap.c                        |   2 +-
 tools/perf/util/perf_api_probe.c              |   4 +-
 tools/perf/util/pmu.c                         |  24 ++--
 tools/perf/util/python.c                      |   6 +-
 tools/perf/util/record.c                      |   6 +-
 .../scripting-engines/trace-event-python.c    |   4 +-
 tools/perf/util/session.c                     |   4 +-
 tools/perf/util/svghelper.c                   |   4 +-
 tools/perf/util/synthetic-events.c            |  18 +--
 tools/perf/util/top.c                         |   6 +-
 37 files changed, 245 insertions(+), 162 deletions(-)

-- 
2.35.0.rc0.227.g00780c9af4-goog



* [PATCH 1/3] perf python: Fix cpu_map__item building
  2022-01-22  4:58 [PATCH 0/3] Reference count checker and cpu map related fixes Ian Rogers
@ 2022-01-22  4:58 ` Ian Rogers
  2022-01-22 20:09   ` Arnaldo Carvalho de Melo
  2022-01-22  4:58 ` [PATCH 2/3] perf cpumap: Migrate to libperf cpumap api Ian Rogers
  2022-01-22  4:58 ` [PATCH 3/3] perf cpumap: Add reference count checking Ian Rogers
  2 siblings, 1 reply; 6+ messages in thread
From: Ian Rogers @ 2022-01-22  4:58 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

The CPU map value should be built as an integer, not as a struct
perf_cpu. Switch some uses of perf_cpu_map to use the library API.

Fixes: 6d18804b963b ("perf cpumap: Give CPUs their own type")
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/python.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index f3e5131f183c..52d8995cfd73 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -638,17 +638,17 @@ static Py_ssize_t pyrf_cpu_map__length(PyObject *obj)
 {
 	struct pyrf_cpu_map *pcpus = (void *)obj;
 
-	return pcpus->cpus->nr;
+	return perf_cpu_map__nr(pcpus->cpus);
 }
 
 static PyObject *pyrf_cpu_map__item(PyObject *obj, Py_ssize_t i)
 {
 	struct pyrf_cpu_map *pcpus = (void *)obj;
 
-	if (i >= pcpus->cpus->nr)
+	if (i >= perf_cpu_map__nr(pcpus->cpus))
 		return NULL;
 
-	return Py_BuildValue("i", pcpus->cpus->map[i]);
+	return Py_BuildValue("i", perf_cpu_map__cpu(pcpus->cpus, i).cpu);
 }
 
 static PySequenceMethods pyrf_cpu_map__sequence_methods = {
-- 
2.35.0.rc0.227.g00780c9af4-goog



* [PATCH 2/3] perf cpumap: Migrate to libperf cpumap api
  2022-01-22  4:58 [PATCH 0/3] Reference count checker and cpu map related fixes Ian Rogers
  2022-01-22  4:58 ` [PATCH 1/3] perf python: Fix cpu_map__item building Ian Rogers
@ 2022-01-22  4:58 ` Ian Rogers
  2022-01-22  4:58 ` [PATCH 3/3] perf cpumap: Add reference count checking Ian Rogers
  2 siblings, 0 replies; 6+ messages in thread
From: Ian Rogers @ 2022-01-22  4:58 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

Switch from directly accessing the perf_cpu_map to using the appropriate
libperf API when possible. Using the API simplifies the job of
refactoring use of perf_cpu_map.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/evsel.c                        |  4 +-
 tools/perf/bench/epoll-ctl.c                  |  2 +-
 tools/perf/bench/epoll-wait.c                 |  2 +-
 tools/perf/bench/evlist-open-close.c          |  4 +-
 tools/perf/bench/futex-hash.c                 |  2 +-
 tools/perf/bench/futex-lock-pi.c              |  2 +-
 tools/perf/bench/futex-requeue.c              |  2 +-
 tools/perf/bench/futex-wake-parallel.c        |  2 +-
 tools/perf/bench/futex-wake.c                 |  2 +-
 tools/perf/builtin-ftrace.c                   |  2 +-
 tools/perf/builtin-stat.c                     |  7 ++--
 tools/perf/tests/bitmap.c                     |  4 +-
 tools/perf/tests/event_update.c               |  8 ++--
 tools/perf/tests/mem2node.c                   |  9 +++--
 tools/perf/tests/mmap-basic.c                 |  5 ++-
 tools/perf/tests/topology.c                   | 37 +++++++++++--------
 tools/perf/util/auxtrace.c                    |  2 +-
 tools/perf/util/counts.c                      |  2 +-
 tools/perf/util/cpumap.h                      |  2 +-
 tools/perf/util/cputopo.c                     |  4 +-
 tools/perf/util/evlist-hybrid.c               | 11 +++---
 tools/perf/util/evsel.c                       | 20 +++++-----
 tools/perf/util/evsel.h                       |  3 +-
 tools/perf/util/mmap.c                        |  2 +-
 tools/perf/util/perf_api_probe.c              |  4 +-
 tools/perf/util/record.c                      |  6 +--
 .../scripting-engines/trace-event-python.c    |  4 +-
 tools/perf/util/session.c                     |  4 +-
 tools/perf/util/svghelper.c                   |  4 +-
 tools/perf/util/synthetic-events.c            | 18 ++++-----
 tools/perf/util/top.c                         |  6 +--
 31 files changed, 99 insertions(+), 87 deletions(-)

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 7ea86a44eae5..8543a594ea2f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -141,7 +141,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
 	}
 
 	if (evsel->fd == NULL &&
-	    perf_evsel__alloc_fd(evsel, cpus->nr, threads->nr) < 0)
+		perf_evsel__alloc_fd(evsel, perf_cpu_map__nr(cpus), threads->nr) < 0)
 		return -ENOMEM;
 
 	perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
@@ -384,7 +384,7 @@ int perf_evsel__apply_filter(struct perf_evsel *evsel, const char *filter)
 {
 	int err = 0, i;
 
-	for (i = 0; i < evsel->cpus->nr && !err; i++)
+	for (i = 0; i < perf_cpu_map__nr(evsel->cpus) && !err; i++)
 		err = perf_evsel__run_ioctl(evsel,
 				     PERF_EVENT_IOC_SET_FILTER,
 				     (void *)filter, i);
diff --git a/tools/perf/bench/epoll-ctl.c b/tools/perf/bench/epoll-ctl.c
index 1a17ec83d3c4..740ae764537e 100644
--- a/tools/perf/bench/epoll-ctl.c
+++ b/tools/perf/bench/epoll-ctl.c
@@ -333,7 +333,7 @@ int bench_epoll_ctl(int argc, const char **argv)
 
 	/* default to the number of CPUs */
 	if (!nthreads)
-		nthreads = cpu->nr;
+		nthreads = perf_cpu_map__nr(cpu);
 
 	worker = calloc(nthreads, sizeof(*worker));
 	if (!worker)
diff --git a/tools/perf/bench/epoll-wait.c b/tools/perf/bench/epoll-wait.c
index 0d1dd8879197..37de970c9743 100644
--- a/tools/perf/bench/epoll-wait.c
+++ b/tools/perf/bench/epoll-wait.c
@@ -452,7 +452,7 @@ int bench_epoll_wait(int argc, const char **argv)
 
 	/* default to the number of CPUs and leave one for the writer pthread */
 	if (!nthreads)
-		nthreads = cpu->nr - 1;
+		nthreads = perf_cpu_map__nr(cpu) - 1;
 
 	worker = calloc(nthreads, sizeof(*worker));
 	if (!worker) {
diff --git a/tools/perf/bench/evlist-open-close.c b/tools/perf/bench/evlist-open-close.c
index 482738e9bdad..de56601f69ee 100644
--- a/tools/perf/bench/evlist-open-close.c
+++ b/tools/perf/bench/evlist-open-close.c
@@ -71,7 +71,7 @@ static int evlist__count_evsel_fds(struct evlist *evlist)
 	int cnt = 0;
 
 	evlist__for_each_entry(evlist, evsel)
-		cnt += evsel->core.threads->nr * evsel->core.cpus->nr;
+		cnt += evsel->core.threads->nr * perf_cpu_map__nr(evsel->core.cpus);
 
 	return cnt;
 }
@@ -151,7 +151,7 @@ static int bench_evlist_open_close__run(char *evstr)
 
 	init_stats(&time_stats);
 
-	printf("  Number of cpus:\t%d\n", evlist->core.cpus->nr);
+	printf("  Number of cpus:\t%d\n", perf_cpu_map__nr(evlist->core.cpus));
 	printf("  Number of threads:\t%d\n", evlist->core.threads->nr);
 	printf("  Number of events:\t%d (%d fds)\n",
 		evlist->core.nr_entries, evlist__count_evsel_fds(evlist));
diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 9627b6ab8670..dbcecec4eeda 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -150,7 +150,7 @@ int bench_futex_hash(int argc, const char **argv)
 	}
 
 	if (!params.nthreads) /* default to the number of CPUs */
-		params.nthreads = cpu->nr;
+		params.nthreads = perf_cpu_map__nr(cpu);
 
 	worker = calloc(params.nthreads, sizeof(*worker));
 	if (!worker)
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index a512a320df74..6fc9a3d55c1f 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -173,7 +173,7 @@ int bench_futex_lock_pi(int argc, const char **argv)
 	}
 
 	if (!params.nthreads)
-		params.nthreads = cpu->nr;
+		params.nthreads = perf_cpu_map__nr(cpu);
 
 	worker = calloc(params.nthreads, sizeof(*worker));
 	if (!worker)
diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index aca47ce8b1e7..2f59d5d1c509 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -175,7 +175,7 @@ int bench_futex_requeue(int argc, const char **argv)
 	}
 
 	if (!params.nthreads)
-		params.nthreads = cpu->nr;
+		params.nthreads = perf_cpu_map__nr(cpu);
 
 	worker = calloc(params.nthreads, sizeof(*worker));
 	if (!worker)
diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index 888ee6037945..861deb934745 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -252,7 +252,7 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 		err(EXIT_FAILURE, "calloc");
 
 	if (!params.nthreads)
-		params.nthreads = cpu->nr;
+		params.nthreads = perf_cpu_map__nr(cpu);
 
 	/* some sanity checks */
 	if (params.nwakes > params.nthreads ||
diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index aa82db51c0ab..cfda48bef1d7 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -151,7 +151,7 @@ int bench_futex_wake(int argc, const char **argv)
 	}
 
 	if (!params.nthreads)
-		params.nthreads = cpu->nr;
+		params.nthreads = perf_cpu_map__nr(cpu);
 
 	worker = calloc(params.nthreads, sizeof(*worker));
 	if (!worker)
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index 71452599f87d..dec24dc0e767 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -281,7 +281,7 @@ static int set_tracing_cpumask(struct perf_cpu_map *cpumap)
 	int ret;
 	int last_cpu;
 
-	last_cpu = perf_cpu_map__cpu(cpumap, cpumap->nr - 1).cpu;
+	last_cpu = perf_cpu_map__cpu(cpumap, perf_cpu_map__nr(cpumap) - 1).cpu;
 	mask_size = last_cpu / 4 + 2; /* one more byte for EOS */
 	mask_size += last_cpu / 32; /* ',' is needed for every 32th cpus */
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 934e992c966f..3f98689dd687 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -230,11 +230,12 @@ static bool cpus_map_matched(struct evsel *a, struct evsel *b)
 	if (!a->core.cpus || !b->core.cpus)
 		return false;
 
-	if (a->core.cpus->nr != b->core.cpus->nr)
+	if (perf_cpu_map__nr(a->core.cpus) != perf_cpu_map__nr(b->core.cpus))
 		return false;
 
-	for (int i = 0; i < a->core.cpus->nr; i++) {
-		if (a->core.cpus->map[i].cpu != b->core.cpus->map[i].cpu)
+	for (int i = 0; i < perf_cpu_map__nr(a->core.cpus); i++) {
+		if (perf_cpu_map__cpu(a->core.cpus, i).cpu !=
+		    perf_cpu_map__cpu(b->core.cpus, i).cpu)
 			return false;
 	}
 
diff --git a/tools/perf/tests/bitmap.c b/tools/perf/tests/bitmap.c
index 0bf399c49849..4965dd666956 100644
--- a/tools/perf/tests/bitmap.c
+++ b/tools/perf/tests/bitmap.c
@@ -17,8 +17,8 @@ static unsigned long *get_bitmap(const char *str, int nbits)
 	bm = bitmap_zalloc(nbits);
 
 	if (map && bm) {
-		for (i = 0; i < map->nr; i++)
-			set_bit(map->map[i].cpu, bm);
+		for (i = 0; i < perf_cpu_map__nr(map); i++)
+			set_bit(perf_cpu_map__cpu(map, i).cpu, bm);
 	}
 
 	if (map)
diff --git a/tools/perf/tests/event_update.c b/tools/perf/tests/event_update.c
index 16b6d6f47f38..78db4d704e76 100644
--- a/tools/perf/tests/event_update.c
+++ b/tools/perf/tests/event_update.c
@@ -75,10 +75,10 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
 
 	TEST_ASSERT_VAL("wrong id", ev->id == 123);
 	TEST_ASSERT_VAL("wrong type", ev->type == PERF_EVENT_UPDATE__CPUS);
-	TEST_ASSERT_VAL("wrong cpus", map->nr == 3);
-	TEST_ASSERT_VAL("wrong cpus", map->map[0].cpu == 1);
-	TEST_ASSERT_VAL("wrong cpus", map->map[1].cpu == 2);
-	TEST_ASSERT_VAL("wrong cpus", map->map[2].cpu == 3);
+	TEST_ASSERT_VAL("wrong cpus", perf_cpu_map__nr(map) == 3);
+	TEST_ASSERT_VAL("wrong cpus", perf_cpu_map__cpu(map, 0).cpu == 1);
+	TEST_ASSERT_VAL("wrong cpus", perf_cpu_map__cpu(map, 1).cpu == 2);
+	TEST_ASSERT_VAL("wrong cpus", perf_cpu_map__cpu(map, 2).cpu == 3);
 	perf_cpu_map__put(map);
 	return 0;
 }
diff --git a/tools/perf/tests/mem2node.c b/tools/perf/tests/mem2node.c
index f4a4aba33f76..4c96829510c9 100644
--- a/tools/perf/tests/mem2node.c
+++ b/tools/perf/tests/mem2node.c
@@ -25,14 +25,15 @@ static unsigned long *get_bitmap(const char *str, int nbits)
 {
 	struct perf_cpu_map *map = perf_cpu_map__new(str);
 	unsigned long *bm = NULL;
-	int i;
 
 	bm = bitmap_zalloc(nbits);
 
 	if (map && bm) {
-		for (i = 0; i < map->nr; i++) {
-			set_bit(map->map[i].cpu, bm);
-		}
+		struct perf_cpu cpu;
+		int i;
+
+		perf_cpu_map__for_each_cpu(cpu, i, map)
+			set_bit(cpu.cpu, bm);
 	}
 
 	if (map)
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 0ad62914b4d7..c3c17600f29c 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -59,11 +59,12 @@ static int test__basic_mmap(struct test_suite *test __maybe_unused, int subtest
 	}
 
 	CPU_ZERO(&cpu_set);
-	CPU_SET(cpus->map[0].cpu, &cpu_set);
+	CPU_SET(perf_cpu_map__cpu(cpus, 0).cpu, &cpu_set);
 	sched_setaffinity(0, sizeof(cpu_set), &cpu_set);
 	if (sched_setaffinity(0, sizeof(cpu_set), &cpu_set) < 0) {
 		pr_debug("sched_setaffinity() failed on CPU %d: %s ",
-			 cpus->map[0].cpu, str_error_r(errno, sbuf, sizeof(sbuf)));
+			 perf_cpu_map__cpu(cpus, 0).cpu,
+			 str_error_r(errno, sbuf, sizeof(sbuf)));
 		goto out_free_cpus;
 	}
 
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
index c4ef0c7002f1..ee1e3dcbc0bd 100644
--- a/tools/perf/tests/topology.c
+++ b/tools/perf/tests/topology.c
@@ -122,44 +122,48 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	}
 
 	// Test that CPU ID contains socket, die, core and CPU
-	for (i = 0; i < map->nr; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
 		id = aggr_cpu_id__cpu(perf_cpu_map__cpu(map, i), NULL);
-		TEST_ASSERT_VAL("Cpu map - CPU ID doesn't match", map->map[i].cpu == id.cpu.cpu);
+		TEST_ASSERT_VAL("Cpu map - CPU ID doesn't match",
+				perf_cpu_map__cpu(map, i).cpu == id.cpu.cpu);
 
 		TEST_ASSERT_VAL("Cpu map - Core ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].core_id == id.core);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].core_id == id.core);
 		TEST_ASSERT_VAL("Cpu map - Socket ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].socket_id == id.socket);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].socket_id ==
+			id.socket);
 
 		TEST_ASSERT_VAL("Cpu map - Die ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].die_id == id.die);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].die_id == id.die);
 		TEST_ASSERT_VAL("Cpu map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Cpu map - Thread is set", id.thread == -1);
 	}
 
 	// Test that core ID contains socket, die and core
-	for (i = 0; i < map->nr; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
 		id = aggr_cpu_id__core(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Core map - Core ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].core_id == id.core);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].core_id == id.core);
 
 		TEST_ASSERT_VAL("Core map - Socket ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].socket_id == id.socket);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].socket_id ==
+			id.socket);
 
 		TEST_ASSERT_VAL("Core map - Die ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].die_id == id.die);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].die_id == id.die);
 		TEST_ASSERT_VAL("Core map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Core map - Thread is set", id.thread == -1);
 	}
 
 	// Test that die ID contains socket and die
-	for (i = 0; i < map->nr; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
 		id = aggr_cpu_id__die(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Die map - Socket ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].socket_id == id.socket);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].socket_id ==
+			id.socket);
 
 		TEST_ASSERT_VAL("Die map - Die ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].die_id == id.die);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].die_id == id.die);
 
 		TEST_ASSERT_VAL("Die map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Die map - Core is set", id.core == -1);
@@ -168,10 +172,11 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	}
 
 	// Test that socket ID contains only socket
-	for (i = 0; i < map->nr; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
 		id = aggr_cpu_id__socket(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Socket map - Socket ID doesn't match",
-			session->header.env.cpu[map->map[i].cpu].socket_id == id.socket);
+			session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].socket_id ==
+			id.socket);
 
 		TEST_ASSERT_VAL("Socket map - Node ID is set", id.node == -1);
 		TEST_ASSERT_VAL("Socket map - Die ID is set", id.die == -1);
@@ -181,10 +186,10 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
 	}
 
 	// Test that node ID contains only node
-	for (i = 0; i < map->nr; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
 		id = aggr_cpu_id__node(perf_cpu_map__cpu(map, i), NULL);
 		TEST_ASSERT_VAL("Node map - Node ID doesn't match",
-			cpu__get_node(map->map[i]) == id.node);
+				cpu__get_node(perf_cpu_map__cpu(map, i)) == id.node);
 		TEST_ASSERT_VAL("Node map - Socket is set", id.socket == -1);
 		TEST_ASSERT_VAL("Node map - Die ID is set", id.die == -1);
 		TEST_ASSERT_VAL("Node map - Core is set", id.core == -1);
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 5632efc44738..825336304a37 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -174,7 +174,7 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
 	mp->idx = idx;
 
 	if (per_cpu) {
-		mp->cpu = evlist->core.cpus->map[idx];
+		mp->cpu = perf_cpu_map__cpu(evlist->core.cpus, idx);
 		if (evlist->core.threads)
 			mp->tid = perf_thread_map__pid(evlist->core.threads, 0);
 		else
diff --git a/tools/perf/util/counts.c b/tools/perf/util/counts.c
index 2b81707b9dba..7a447d918458 100644
--- a/tools/perf/util/counts.c
+++ b/tools/perf/util/counts.c
@@ -61,7 +61,7 @@ int evsel__alloc_counts(struct evsel *evsel)
 	struct perf_cpu_map *cpus = evsel__cpus(evsel);
 	int nthreads = perf_thread_map__nr(evsel->core.threads);
 
-	evsel->counts = perf_counts__new(cpus ? cpus->nr : 1, nthreads);
+	evsel->counts = perf_counts__new(perf_cpu_map__nr(cpus), nthreads);
 	return evsel->counts != NULL ? 0 : -ENOMEM;
 }
 
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 0d3c2006a15d..5c85fbd709b4 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -57,7 +57,7 @@ struct perf_cpu cpu__max_present_cpu(void);
  */
 static inline bool cpu_map__is_dummy(struct perf_cpu_map *cpus)
 {
-	return cpus->nr == 1 && cpus->map[0].cpu == -1;
+	return perf_cpu_map__nr(cpus) == 1 && perf_cpu_map__cpu(cpus, 0).cpu == -1;
 }
 
 /**
diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index e20b835a1194..d275d843c155 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -325,7 +325,7 @@ struct numa_topology *numa_topology__new(void)
 	if (!node_map)
 		goto out;
 
-	nr = (u32) node_map->nr;
+	nr = (u32) perf_cpu_map__nr(node_map);
 
 	tp = zalloc(sizeof(*tp) + sizeof(tp->nodes[0])*nr);
 	if (!tp)
@@ -334,7 +334,7 @@ struct numa_topology *numa_topology__new(void)
 	tp->nr = nr;
 
 	for (i = 0; i < nr; i++) {
-		if (load_numa_node(&tp->nodes[i], node_map->map[i].cpu)) {
+		if (load_numa_node(&tp->nodes[i], perf_cpu_map__cpu(node_map, i).cpu)) {
 			numa_topology__delete(tp);
 			tp = NULL;
 			break;
diff --git a/tools/perf/util/evlist-hybrid.c b/tools/perf/util/evlist-hybrid.c
index 7c554234b43d..7f234215147d 100644
--- a/tools/perf/util/evlist-hybrid.c
+++ b/tools/perf/util/evlist-hybrid.c
@@ -124,22 +124,23 @@ int evlist__fix_hybrid_cpus(struct evlist *evlist, const char *cpu_list)
 
 		events_nr++;
 
-		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
-		    matched_cpus->nr < cpus->nr ||
-		    matched_cpus->nr < pmu->cpus->nr)) {
+		if (perf_cpu_map__nr(matched_cpus) > 0 &&
+		    (perf_cpu_map__nr(unmatched_cpus) > 0 ||
+		     perf_cpu_map__nr(matched_cpus) < perf_cpu_map__nr(cpus) ||
+		     perf_cpu_map__nr(matched_cpus) < perf_cpu_map__nr(pmu->cpus))) {
 			perf_cpu_map__put(evsel->core.cpus);
 			perf_cpu_map__put(evsel->core.own_cpus);
 			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
 			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);
 
-			if (unmatched_cpus->nr > 0) {
+			if (perf_cpu_map__nr(unmatched_cpus) > 0) {
 				cpu_map__snprint(matched_cpus, buf1, sizeof(buf1));
 				pr_warning("WARNING: use %s in '%s' for '%s', skip other cpus in list.\n",
 					   buf1, pmu->name, evsel->name);
 			}
 		}
 
-		if (matched_cpus->nr == 0) {
+		if (perf_cpu_map__nr(matched_cpus) == 0) {
 			evlist__remove(evlist, evsel);
 			evsel__delete(evsel);
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2f6b18af49e5..fb0a2debf015 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1782,7 +1782,7 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		nthreads = threads->nr;
 
 	if (evsel->core.fd == NULL &&
-	    perf_evsel__alloc_fd(&evsel->core, cpus->nr, nthreads) < 0)
+	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
 		return -ENOMEM;
 
 	evsel->open_flags = PERF_FLAG_FD_CLOEXEC;
@@ -2020,9 +2020,10 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 			test_attr__ready();
 
 			pr_debug2_peo("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
-				pid, cpus->map[idx].cpu, group_fd, evsel->open_flags);
+				pid, perf_cpu_map__cpu(cpus, idx).cpu, group_fd, evsel->open_flags);
 
-			fd = sys_perf_event_open(&evsel->core.attr, pid, cpus->map[idx].cpu,
+			fd = sys_perf_event_open(&evsel->core.attr, pid,
+						perf_cpu_map__cpu(cpus, idx).cpu,
 						group_fd, evsel->open_flags);
 
 			FD(evsel, idx, thread) = fd;
@@ -2038,7 +2039,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 			bpf_counter__install_pe(evsel, idx, fd);
 
 			if (unlikely(test_attr__enabled)) {
-				test_attr__open(&evsel->core.attr, pid, cpus->map[idx],
+				test_attr__open(&evsel->core.attr, pid,
+						perf_cpu_map__cpu(cpus, idx),
 						fd, group_fd, evsel->open_flags);
 			}
 
@@ -2079,7 +2081,8 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (evsel__precise_ip_fallback(evsel))
 		goto retry_open;
 
-	if (evsel__ignore_missing_thread(evsel, cpus->nr, idx, threads, thread, err)) {
+	if (evsel__ignore_missing_thread(evsel, perf_cpu_map__nr(cpus),
+					 idx, threads, thread, err)) {
 		/* We just removed 1 thread, so lower the upper nthreads limit. */
 		nthreads--;
 
@@ -2119,7 +2122,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		struct perf_thread_map *threads)
 {
-	return evsel__open_cpu(evsel, cpus, threads, 0, cpus ? cpus->nr : 1);
+	return evsel__open_cpu(evsel, cpus, threads, 0, perf_cpu_map__nr(cpus));
 }
 
 void evsel__close(struct evsel *evsel)
@@ -2131,8 +2134,7 @@ void evsel__close(struct evsel *evsel)
 int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu_map_idx)
 {
 	if (cpu_map_idx == -1)
-		return evsel__open_cpu(evsel, cpus, NULL, 0,
-					cpus ? cpus->nr : 1);
+		return evsel__open_cpu(evsel, cpus, NULL, 0, perf_cpu_map__nr(cpus));
 
 	return evsel__open_cpu(evsel, cpus, NULL, cpu_map_idx, cpu_map_idx + 1);
 }
@@ -2982,7 +2984,7 @@ int evsel__store_ids(struct evsel *evsel, struct evlist *evlist)
 	struct perf_cpu_map *cpus = evsel->core.cpus;
 	struct perf_thread_map *threads = evsel->core.threads;
 
-	if (perf_evsel__alloc_id(&evsel->core, cpus->nr, threads->nr))
+	if (perf_evsel__alloc_id(&evsel->core, perf_cpu_map__nr(cpus), threads->nr))
 		return -ENOMEM;
 
 	return store_evsel_ids(evsel, evlist);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 5720ceebffac..041b42d33bf5 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -11,6 +11,7 @@
 #include <perf/evsel.h>
 #include "symbol_conf.h"
 #include <internal/cpumap.h>
+#include <perf/cpumap.h>
 
 struct bpf_object;
 struct cgroup;
@@ -191,7 +192,7 @@ static inline struct perf_cpu_map *evsel__cpus(struct evsel *evsel)
 
 static inline int evsel__nr_cpus(struct evsel *evsel)
 {
-	return evsel__cpus(evsel)->nr;
+	return perf_cpu_map__nr(evsel__cpus(evsel));
 }
 
 void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 12261ed8c15b..0e8ff8d1e206 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -250,7 +250,7 @@ static void build_node_mask(int node, struct mmap_cpu_mask *mask)
 
 	nr_cpus = perf_cpu_map__nr(cpu_map);
 	for (idx = 0; idx < nr_cpus; idx++) {
-		cpu = cpu_map->map[idx]; /* map c index to online cpu index */
+		cpu = perf_cpu_map__cpu(cpu_map, idx); /* map c index to online cpu index */
 		if (cpu__get_node(cpu) == node)
 			set_bit(cpu.cpu, mask->bits);
 	}
diff --git a/tools/perf/util/perf_api_probe.c b/tools/perf/util/perf_api_probe.c
index 734d006d9a8c..c28dd50bd571 100644
--- a/tools/perf/util/perf_api_probe.c
+++ b/tools/perf/util/perf_api_probe.c
@@ -67,7 +67,7 @@ static bool perf_probe_api(setup_probe_fn_t fn)
 	cpus = perf_cpu_map__new(NULL);
 	if (!cpus)
 		return false;
-	cpu = cpus->map[0];
+	cpu = perf_cpu_map__cpu(cpus, 0);
 	perf_cpu_map__put(cpus);
 
 	do {
@@ -144,7 +144,7 @@ bool perf_can_record_cpu_wide(void)
 	if (!cpus)
 		return false;
 
-	cpu = cpus->map[0];
+	cpu = perf_cpu_map__cpu(cpus, 0);
 	perf_cpu_map__put(cpus);
 
 	fd = sys_perf_event_open(&attr, -1, cpu.cpu, -1, 0);
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 20461f174991..007a64681416 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -106,7 +106,7 @@ void evlist__config(struct evlist *evlist, struct record_opts *opts, struct call
 	if (opts->group)
 		evlist__set_leader(evlist);
 
-	if (evlist->core.cpus->map[0].cpu < 0)
+	if (perf_cpu_map__cpu(evlist->core.cpus, 0).cpu < 0)
 		opts->no_inherit = true;
 
 	use_comm_exec = perf_can_comm_exec();
@@ -248,11 +248,11 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str)
 		struct perf_cpu_map *cpus = perf_cpu_map__new(NULL);
 
 		if (cpus)
-			cpu =  cpus->map[0];
+			cpu =  perf_cpu_map__cpu(cpus, 0);
 
 		perf_cpu_map__put(cpus);
 	} else {
-		cpu = evlist->core.cpus->map[0];
+		cpu = perf_cpu_map__cpu(evlist->core.cpus, 0);
 	}
 
 	while (1) {
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index f5ad0e62227a..e752e1f4a5f0 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1607,8 +1607,8 @@ static void python_process_stat(struct perf_stat_config *config,
 	}
 
 	for (thread = 0; thread < threads->nr; thread++) {
-		for (cpu = 0; cpu < cpus->nr; cpu++) {
-			process_stat(counter, cpus->map[cpu],
+		for (cpu = 0; cpu < perf_cpu_map__nr(cpus); cpu++) {
+			process_stat(counter, perf_cpu_map__cpu(cpus, cpu),
 				     perf_thread_map__pid(threads, thread), tstamp,
 				     perf_counts(counter->counts, cpu, thread));
 		}
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f19348dddd55..2c0d30f08e78 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2537,8 +2537,8 @@ int perf_session__cpu_bitmap(struct perf_session *session,
 		return -1;
 	}
 
-	for (i = 0; i < map->nr; i++) {
-		struct perf_cpu cpu = map->map[i];
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
+		struct perf_cpu cpu = perf_cpu_map__cpu(map, i);
 
 		if (cpu.cpu >= nr_cpus) {
 			pr_err("Requested CPU %d too large. "
diff --git a/tools/perf/util/svghelper.c b/tools/perf/util/svghelper.c
index 4c9f211249db..1e0c731fc539 100644
--- a/tools/perf/util/svghelper.c
+++ b/tools/perf/util/svghelper.c
@@ -734,8 +734,8 @@ static int str_to_bitmap(char *s, cpumask_t *b, int nr_cpus)
 	if (!m)
 		return -1;
 
-	for (i = 0; i < m->nr; i++) {
-		c = m->map[i];
+	for (i = 0; i < perf_cpu_map__nr(m); i++) {
+		c = perf_cpu_map__cpu(m, i);
 		if (c.cpu >= nr_cpus) {
 			ret = -1;
 			break;
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index c9ba8050cc2b..70f095624a0b 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -1186,12 +1186,12 @@ int perf_event__synthesize_thread_map2(struct perf_tool *tool,
 static void synthesize_cpus(struct cpu_map_entries *cpus,
 			    struct perf_cpu_map *map)
 {
-	int i;
+	int i, map_nr = perf_cpu_map__nr(map);
 
-	cpus->nr = map->nr;
+	cpus->nr = map_nr;
 
-	for (i = 0; i < map->nr; i++)
-		cpus->cpu[i] = map->map[i].cpu;
+	for (i = 0; i < map_nr; i++)
+		cpus->cpu[i] = perf_cpu_map__cpu(map, i).cpu;
 }
 
 static void synthesize_mask(struct perf_record_record_cpu_map *mask,
@@ -1202,13 +1202,13 @@ static void synthesize_mask(struct perf_record_record_cpu_map *mask,
 	mask->nr = BITS_TO_LONGS(max);
 	mask->long_size = sizeof(long);
 
-	for (i = 0; i < map->nr; i++)
-		set_bit(map->map[i].cpu, mask->mask);
+	for (i = 0; i < perf_cpu_map__nr(map); i++)
+		set_bit(perf_cpu_map__cpu(map, i).cpu, mask->mask);
 }
 
 static size_t cpus_size(struct perf_cpu_map *map)
 {
-	return sizeof(struct cpu_map_entries) + map->nr * sizeof(u16);
+	return sizeof(struct cpu_map_entries) + perf_cpu_map__nr(map) * sizeof(u16);
 }
 
 static size_t mask_size(struct perf_cpu_map *map, int *max)
@@ -1217,9 +1217,9 @@ static size_t mask_size(struct perf_cpu_map *map, int *max)
 
 	*max = 0;
 
-	for (i = 0; i < map->nr; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
 		/* bit position of the cpu is + 1 */
-		int bit = map->map[i].cpu + 1;
+		int bit = perf_cpu_map__cpu(map, i).cpu + 1;
 
 		if (bit > *max)
 			*max = bit;
diff --git a/tools/perf/util/top.c b/tools/perf/util/top.c
index 27945eeb0cb5..c1ebfc5d2e0c 100644
--- a/tools/perf/util/top.c
+++ b/tools/perf/util/top.c
@@ -95,15 +95,15 @@ size_t perf_top__header_snprintf(struct perf_top *top, char *bf, size_t size)
 
 	if (target->cpu_list)
 		ret += SNPRINTF(bf + ret, size - ret, ", CPU%s: %s)",
-				top->evlist->core.cpus->nr > 1 ? "s" : "",
+				perf_cpu_map__nr(top->evlist->core.cpus) > 1 ? "s" : "",
 				target->cpu_list);
 	else {
 		if (target->tid)
 			ret += SNPRINTF(bf + ret, size - ret, ")");
 		else
 			ret += SNPRINTF(bf + ret, size - ret, ", %d CPU%s)",
-					top->evlist->core.cpus->nr,
-					top->evlist->core.cpus->nr > 1 ? "s" : "");
+					perf_cpu_map__nr(top->evlist->core.cpus),
+					perf_cpu_map__nr(top->evlist->core.cpus) > 1 ? "s" : "");
 	}
 
 	perf_top__reset_sample_counters(top);
-- 
2.35.0.rc0.227.g00780c9af4-goog



* [PATCH 3/3] perf cpumap: Add reference count checking
  2022-01-22  4:58 [PATCH 0/3] Reference count checker and cpu map related fixes Ian Rogers
  2022-01-22  4:58 ` [PATCH 1/3] perf python: Fix cpu_map__item building Ian Rogers
  2022-01-22  4:58 ` [PATCH 2/3] perf cpumap: Migrate to libperf cpumap api Ian Rogers
@ 2022-01-22  4:58 ` Ian Rogers
  2022-01-22 20:09   ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 6+ messages in thread
From: Ian Rogers @ 2022-01-22  4:58 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

Enabled when REFCNT_CHECKING is defined. The change adds a heap
allocated indirection pointer that is interposed between the user and
the reference counted cpu map: it is created at a get and freed by a
put. The pointer replaces the original perf_cpu_map struct, so use of
perf_cpu_map via the APIs remains unchanged. Any use of the cpu map
without the API requires two versions, typically handled via an
UNWRAP macro.

This change is intended to catch:
 - use after put: using a cpumap after you have put it will cause a
   segv.
 - unbalanced puts: two puts for a get will result in a double free
   that can be captured and reported by tools like address sanitizer,
   including the associated stack traces of the allocation and frees.
 - missing puts: if a put is missing then the get turns into a memory
   leak that can be reported by leak sanitizer, including the stack
   trace at the point the get occurs.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/cpumap.c                  | 120 ++++++++++++++++-------
 tools/lib/perf/include/internal/cpumap.h |  14 ++-
 tools/perf/tests/cpumap.c                |  20 ++--
 tools/perf/util/cpumap.c                 |  42 ++++----
 tools/perf/util/pmu.c                    |  24 +++--
 5 files changed, 146 insertions(+), 74 deletions(-)

diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index ee66760f1e63..d401d133f84b 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -10,10 +10,16 @@
 #include <ctype.h>
 #include <limits.h>
 
-static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
+#ifndef REFCNT_CHECKING
+#define UNWRAP_MAP(x) x
+#else
+#define UNWRAP_MAP(x) x->orig
+#endif
+
+#ifndef REFCNT_CHECKING
+struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
 {
 	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
-
 	if (cpus != NULL) {
 		cpus->nr = nr_cpus;
 		refcount_set(&cpus->refcnt, 1);
@@ -21,13 +27,31 @@ static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
 	}
 	return cpus;
 }
+#else
+struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
+{
+	struct perf_cpu_map *wrapped_cpus = NULL;
+	struct original_perf_cpu_map *cpus =
+		malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
+	if (cpus != NULL) {
+		cpus->nr = nr_cpus;
+		refcount_set(&cpus->refcnt, 1);
+		wrapped_cpus = malloc(sizeof(*wrapped_cpus));
+		if (wrapped_cpus != NULL)
+			wrapped_cpus->orig = cpus;
+		else
+			free(cpus);
+	}
+	return wrapped_cpus;
+}
+#endif
 
 struct perf_cpu_map *perf_cpu_map__dummy_new(void)
 {
 	struct perf_cpu_map *cpus = perf_cpu_map__alloc(1);
 
 	if (cpus)
-		cpus->map[0].cpu = -1;
+		UNWRAP_MAP(cpus)->map[0].cpu = -1;
 
 	return cpus;
 }
@@ -35,23 +59,45 @@ struct perf_cpu_map *perf_cpu_map__dummy_new(void)
 static void cpu_map__delete(struct perf_cpu_map *map)
 {
 	if (map) {
-		WARN_ONCE(refcount_read(&map->refcnt) != 0,
+		WARN_ONCE(refcount_read(&UNWRAP_MAP(map)->refcnt) != 0,
 			  "cpu_map refcnt unbalanced\n");
+#ifdef REFCNT_CHECKING
+		free(map->orig);
+		map->orig = NULL;
+#endif
 		free(map);
 	}
 }
 
 struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
 {
-	if (map)
-		refcount_inc(&map->refcnt);
+	if (map) {
+#ifdef REFCNT_CHECKING
+		struct perf_cpu_map *new_wrapper;
+#endif
+		refcount_inc(&UNWRAP_MAP(map)->refcnt);
+#ifdef REFCNT_CHECKING
+		new_wrapper = malloc(sizeof(*new_wrapper));
+		new_wrapper->orig = map->orig;
+		map = new_wrapper;
+#endif
+	}
 	return map;
 }
 
 void perf_cpu_map__put(struct perf_cpu_map *map)
 {
-	if (map && refcount_dec_and_test(&map->refcnt))
-		cpu_map__delete(map);
+	if (map) {
+		if (refcount_dec_and_test(&UNWRAP_MAP(map)->refcnt))
+			cpu_map__delete(map);
+		else {
+#ifdef REFCNT_CHECKING
+			/* Free the wrapper object but the reference counted object remains. */
+			map->orig = NULL;
+			free(map);
+#endif
+		}
+	}
 }
 
 static struct perf_cpu_map *cpu_map__default_new(void)
@@ -68,7 +114,7 @@ static struct perf_cpu_map *cpu_map__default_new(void)
 		int i;
 
 		for (i = 0; i < nr_cpus; ++i)
-			cpus->map[i].cpu = i;
+			UNWRAP_MAP(cpus)->map[i].cpu = i;
 	}
 
 	return cpus;
@@ -94,15 +140,16 @@ static struct perf_cpu_map *cpu_map__trim_new(int nr_cpus, const struct perf_cpu
 	int i, j;
 
 	if (cpus != NULL) {
-		memcpy(cpus->map, tmp_cpus, payload_size);
-		qsort(cpus->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
+		memcpy(UNWRAP_MAP(cpus)->map, tmp_cpus, payload_size);
+		qsort(UNWRAP_MAP(cpus)->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
 		/* Remove dups */
 		j = 0;
 		for (i = 0; i < nr_cpus; i++) {
-			if (i == 0 || cpus->map[i].cpu != cpus->map[i - 1].cpu)
-				cpus->map[j++].cpu = cpus->map[i].cpu;
+			if (i == 0 ||
+			    UNWRAP_MAP(cpus)->map[i].cpu != UNWRAP_MAP(cpus)->map[i - 1].cpu)
+				UNWRAP_MAP(cpus)->map[j++].cpu = UNWRAP_MAP(cpus)->map[i].cpu;
 		}
-		cpus->nr = j;
+		UNWRAP_MAP(cpus)->nr = j;
 		assert(j <= nr_cpus);
 	}
 	return cpus;
@@ -263,20 +310,20 @@ struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx)
 		.cpu = -1
 	};
 
-	if (cpus && idx < cpus->nr)
-		return cpus->map[idx];
+	if (cpus && idx < UNWRAP_MAP(cpus)->nr)
+		return UNWRAP_MAP(cpus)->map[idx];
 
 	return result;
 }
 
 int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
 {
-	return cpus ? cpus->nr : 1;
+	return cpus ? UNWRAP_MAP(cpus)->nr : 1;
 }
 
 bool perf_cpu_map__empty(const struct perf_cpu_map *map)
 {
-	return map ? map->map[0].cpu == -1 : true;
+	return map ? UNWRAP_MAP(map)->map[0].cpu == -1 : true;
 }
 
 int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
@@ -287,10 +334,10 @@ int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
 		return -1;
 
 	low = 0;
-	high = cpus->nr;
+	high = UNWRAP_MAP(cpus)->nr;
 	while (low < high) {
 		int idx = (low + high) / 2;
-		struct perf_cpu cpu_at_idx = cpus->map[idx];
+		struct perf_cpu cpu_at_idx = UNWRAP_MAP(cpus)->map[idx];
 
 		if (cpu_at_idx.cpu == cpu.cpu)
 			return idx;
@@ -316,7 +363,7 @@ struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map)
 	};
 
 	// cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
-	return map->nr > 0 ? map->map[map->nr - 1] : result;
+	return UNWRAP_MAP(map)->nr > 0 ? UNWRAP_MAP(map)->map[UNWRAP_MAP(map)->nr - 1] : result;
 }
 
 /*
@@ -337,37 +384,36 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
 
 	if (!orig && !other)
 		return NULL;
-	if (!orig) {
-		perf_cpu_map__get(other);
-		return other;
-	}
+	if (!orig)
+		return perf_cpu_map__get(other);
 	if (!other)
 		return orig;
-	if (orig->nr == other->nr &&
-	    !memcmp(orig->map, other->map, orig->nr * sizeof(struct perf_cpu)))
+	if (UNWRAP_MAP(orig)->nr == UNWRAP_MAP(other)->nr &&
+	    !memcmp(UNWRAP_MAP(orig)->map, UNWRAP_MAP(other)->map,
+		    UNWRAP_MAP(orig)->nr * sizeof(struct perf_cpu)))
 		return orig;
 
-	tmp_len = orig->nr + other->nr;
+	tmp_len = UNWRAP_MAP(orig)->nr + UNWRAP_MAP(other)->nr;
 	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
 	if (!tmp_cpus)
 		return NULL;
 
 	/* Standard merge algorithm from wikipedia */
 	i = j = k = 0;
-	while (i < orig->nr && j < other->nr) {
-		if (orig->map[i].cpu <= other->map[j].cpu) {
-			if (orig->map[i].cpu == other->map[j].cpu)
+	while (i < UNWRAP_MAP(orig)->nr && j < UNWRAP_MAP(other)->nr) {
+		if (UNWRAP_MAP(orig)->map[i].cpu <= UNWRAP_MAP(other)->map[j].cpu) {
+			if (UNWRAP_MAP(orig)->map[i].cpu == UNWRAP_MAP(other)->map[j].cpu)
 				j++;
-			tmp_cpus[k++] = orig->map[i++];
+			tmp_cpus[k++] = UNWRAP_MAP(orig)->map[i++];
 		} else
-			tmp_cpus[k++] = other->map[j++];
+			tmp_cpus[k++] = UNWRAP_MAP(other)->map[j++];
 	}
 
-	while (i < orig->nr)
-		tmp_cpus[k++] = orig->map[i++];
+	while (i < UNWRAP_MAP(orig)->nr)
+		tmp_cpus[k++] = UNWRAP_MAP(orig)->map[i++];
 
-	while (j < other->nr)
-		tmp_cpus[k++] = other->map[j++];
+	while (j < UNWRAP_MAP(other)->nr)
+		tmp_cpus[k++] = UNWRAP_MAP(other)->map[j++];
 	assert(k <= tmp_len);
 
 	merged = cpu_map__trim_new(k, tmp_cpus);
diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
index 581f9ffb4237..64ad56d167a0 100644
--- a/tools/lib/perf/include/internal/cpumap.h
+++ b/tools/lib/perf/include/internal/cpumap.h
@@ -16,7 +16,12 @@ struct perf_cpu {
  * gaps if CPU numbers were used. For events associated with a pid, rather than
  * a CPU, a single dummy map with an entry of -1 is used.
  */
-struct perf_cpu_map {
+#ifndef REFCNT_CHECKING
+struct perf_cpu_map
+#else
+struct original_perf_cpu_map
+#endif
+{
 	refcount_t	refcnt;
 	/** Length of the map array. */
 	int		nr;
@@ -24,10 +29,17 @@ struct perf_cpu_map {
 	struct perf_cpu	map[];
 };
 
+#ifdef REFCNT_CHECKING
+struct perf_cpu_map {
+	struct original_perf_cpu_map *orig;
+};
+#endif
+
 #ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS	2048
 #endif
 
+struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus);
 int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu);
 
 #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index 84e87e31f119..65d9546aec6e 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -11,6 +11,12 @@
 
 struct machine;
 
+#ifndef REFCNT_CHECKING
+#define UNWRAP_CPUMAP(x) x
+#else
+#define UNWRAP_CPUMAP(x) x->orig
+#endif
+
 static int process_event_mask(struct perf_tool *tool __maybe_unused,
 			 union perf_event *event,
 			 struct perf_sample *sample __maybe_unused,
@@ -35,10 +41,10 @@ static int process_event_mask(struct perf_tool *tool __maybe_unused,
 	}
 
 	map = cpu_map__new_data(data);
-	TEST_ASSERT_VAL("wrong nr",  map->nr == 20);
+	TEST_ASSERT_VAL("wrong nr",  perf_cpu_map__nr(map) == 20);
 
 	for (i = 0; i < 20; i++) {
-		TEST_ASSERT_VAL("wrong cpu", map->map[i].cpu == i);
+		TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, i).cpu == i);
 	}
 
 	perf_cpu_map__put(map);
@@ -66,10 +72,10 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
 	TEST_ASSERT_VAL("wrong cpu",  cpus->cpu[1] == 256);
 
 	map = cpu_map__new_data(data);
-	TEST_ASSERT_VAL("wrong nr",  map->nr == 2);
-	TEST_ASSERT_VAL("wrong cpu", map->map[0].cpu == 1);
-	TEST_ASSERT_VAL("wrong cpu", map->map[1].cpu == 256);
-	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&map->refcnt) == 1);
+	TEST_ASSERT_VAL("wrong nr",  perf_cpu_map__nr(map) == 2);
+	TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, 0).cpu == 1);
+	TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, 1).cpu == 256);
+	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&UNWRAP_CPUMAP(map)->refcnt) == 1);
 	perf_cpu_map__put(map);
 	return 0;
 }
@@ -130,7 +136,7 @@ static int test__cpu_map_merge(struct test_suite *test __maybe_unused, int subte
 	struct perf_cpu_map *c = perf_cpu_map__merge(a, b);
 	char buf[100];
 
-	TEST_ASSERT_VAL("failed to merge map: bad nr", c->nr == 5);
+	TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(c) == 5);
 	cpu_map__snprint(c, buf, sizeof(buf));
 	TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, "1-2,4-5,7"));
 	perf_cpu_map__put(b);
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 12b2243222b0..e9976ca238fc 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -22,6 +22,12 @@ static int max_node_num;
  */
 static int *cpunode_map;
 
+#ifndef REFCNT_CHECKING
+#define UNWRAP_MAP(x) x
+#else
+#define UNWRAP_MAP(x) x->orig
+#endif
+
 static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
 {
 	struct perf_cpu_map *map;
@@ -37,9 +43,9 @@ static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
 			 * otherwise it would become 65535.
 			 */
 			if (cpus->cpu[i] == (u16) -1)
-				map->map[i].cpu = -1;
+				UNWRAP_MAP(map)->map[i].cpu = -1;
 			else
-				map->map[i].cpu = (int) cpus->cpu[i];
+				UNWRAP_MAP(map)->map[i].cpu = (int) cpus->cpu[i];
 		}
 	}
 
@@ -58,7 +64,7 @@ static struct perf_cpu_map *cpu_map__from_mask(struct perf_record_record_cpu_map
 		int cpu, i = 0;
 
 		for_each_set_bit(cpu, mask->mask, nbits)
-			map->map[i++].cpu = cpu;
+			UNWRAP_MAP(map)->map[i++].cpu = cpu;
 	}
 	return map;
 
@@ -84,16 +90,13 @@ size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp)
 
 struct perf_cpu_map *perf_cpu_map__empty_new(int nr)
 {
-	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(int) * nr);
+	struct perf_cpu_map *cpus = perf_cpu_map__alloc(nr);
 
 	if (cpus != NULL) {
 		int i;
 
-		cpus->nr = nr;
 		for (i = 0; i < nr; i++)
-			cpus->map[i].cpu = -1;
-
-		refcount_set(&cpus->refcnt, 1);
+			UNWRAP_MAP(cpus)->map[i].cpu = -1;
 	}
 
 	return cpus;
@@ -163,7 +166,7 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
 {
 	int idx;
 	struct perf_cpu cpu;
-	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(cpus->nr);
+	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(perf_cpu_map__nr(cpus));
 
 	if (!c)
 		return NULL;
@@ -187,7 +190,7 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
 		}
 	}
 	/* Trim. */
-	if (c->nr != cpus->nr) {
+	if (c->nr != perf_cpu_map__nr(cpus)) {
 		struct cpu_aggr_map *trimmed_c =
 			realloc(c,
 				sizeof(struct cpu_aggr_map) + sizeof(struct aggr_cpu_id) * c->nr);
@@ -494,31 +497,32 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
 
 #define COMMA first ? "" : ","
 
-	for (i = 0; i < map->nr + 1; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map) + 1; i++) {
 		struct perf_cpu cpu = { .cpu = INT_MAX };
-		bool last = i == map->nr;
+		bool last = i == perf_cpu_map__nr(map);
 
 		if (!last)
-			cpu = map->map[i];
+			cpu = perf_cpu_map__cpu(map, i);
 
 		if (start == -1) {
 			start = i;
 			if (last) {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d", COMMA,
-						map->map[i].cpu);
+						perf_cpu_map__cpu(map, i).cpu);
 			}
-		} else if (((i - start) != (cpu.cpu - map->map[start].cpu)) || last) {
+		} else if (((i - start) != (cpu.cpu - perf_cpu_map__cpu(map, start).cpu)) || last) {
 			int end = i - 1;
 
 			if (start == end) {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d", COMMA,
-						map->map[start].cpu);
+						perf_cpu_map__cpu(map, start).cpu);
 			} else {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d-%d", COMMA,
-						map->map[start].cpu, map->map[end].cpu);
+						perf_cpu_map__cpu(map, start).cpu,
+						perf_cpu_map__cpu(map, end).cpu);
 			}
 			first = false;
 			start = i;
@@ -545,7 +549,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
 	int i, cpu;
 	char *ptr = buf;
 	unsigned char *bitmap;
-	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, map->nr - 1);
+	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, perf_cpu_map__nr(map) - 1);
 
 	if (buf == NULL)
 		return 0;
@@ -556,7 +560,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
 		return 0;
 	}
 
-	for (i = 0; i < map->nr; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
 		cpu = perf_cpu_map__cpu(map, i).cpu;
 		bitmap[cpu / 8] |= 1 << (cpu % 8);
 	}
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 8dfbba15aeb8..1649321fe86f 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1992,13 +1992,20 @@ int perf_pmu__match(char *pattern, char *name, char *tok)
 	return 0;
 }
 
+#ifndef REFCNT_CHECKING
+#define UNWRAP_CPUMAP(x) x
+#else
+#define UNWRAP_CPUMAP(x) x->orig
+#endif
+
 int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
 			 struct perf_cpu_map **mcpus_ptr,
 			 struct perf_cpu_map **ucpus_ptr)
 {
 	struct perf_cpu_map *pmu_cpus = pmu->cpus;
 	struct perf_cpu_map *matched_cpus, *unmatched_cpus;
-	int matched_nr = 0, unmatched_nr = 0;
+	struct perf_cpu cpu;
+	int i, matched_nr = 0, unmatched_nr = 0;
 
 	matched_cpus = perf_cpu_map__default_new();
 	if (!matched_cpus)
@@ -2010,18 +2017,15 @@ int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
 		return -1;
 	}
 
-	for (int i = 0; i < cpus->nr; i++) {
-		int cpu;
-
-		cpu = perf_cpu_map__idx(pmu_cpus, cpus->map[i]);
-		if (cpu == -1)
-			unmatched_cpus->map[unmatched_nr++] = cpus->map[i];
+	perf_cpu_map__for_each_cpu(cpu, i, cpus) {
+		if (!perf_cpu_map__has(pmu_cpus, cpu))
+			UNWRAP_CPUMAP(unmatched_cpus)->map[unmatched_nr++] = cpu;
 		else
-			matched_cpus->map[matched_nr++] = cpus->map[i];
+			UNWRAP_CPUMAP(matched_cpus)->map[matched_nr++] = cpu;
 	}
 
-	unmatched_cpus->nr = unmatched_nr;
-	matched_cpus->nr = matched_nr;
+	UNWRAP_CPUMAP(unmatched_cpus)->nr = unmatched_nr;
+	UNWRAP_CPUMAP(matched_cpus)->nr = matched_nr;
 	*mcpus_ptr = matched_cpus;
 	*ucpus_ptr = unmatched_cpus;
 	return 0;
-- 
2.35.0.rc0.227.g00780c9af4-goog



* Re: [PATCH 1/3] perf python: Fix cpu_map__item building
  2022-01-22  4:58 ` [PATCH 1/3] perf python: Fix cpu_map__item building Ian Rogers
@ 2022-01-22 20:09   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-22 20:09 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Darren Hart,
	Davidlohr Bueso, André Almeida, James Clark, John Garry,
	Riccardo Mancini, Yury Norov, Andy Shevchenko, Andrew Morton,
	Jin Yao, Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter,
	Kan Liang, Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov, eranian

On Fri, Jan 21, 2022 at 08:58:09PM -0800, Ian Rogers wrote:
> Value should be built as an integer.
> Switch some uses of perf_cpu_map to use the library API.
> 
> Fixes: 6d18804b963b ("perf cpumap: Give CPUs their own type")
> Signed-off-by: Ian Rogers <irogers@google.com>

Thanks, applied.

- Arnaldo

> ---
>  tools/perf/util/python.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
> index f3e5131f183c..52d8995cfd73 100644
> --- a/tools/perf/util/python.c
> +++ b/tools/perf/util/python.c
> @@ -638,17 +638,17 @@ static Py_ssize_t pyrf_cpu_map__length(PyObject *obj)
>  {
>  	struct pyrf_cpu_map *pcpus = (void *)obj;
>  
> -	return pcpus->cpus->nr;
> +	return perf_cpu_map__nr(pcpus->cpus);
>  }
>  
>  static PyObject *pyrf_cpu_map__item(PyObject *obj, Py_ssize_t i)
>  {
>  	struct pyrf_cpu_map *pcpus = (void *)obj;
>  
> -	if (i >= pcpus->cpus->nr)
> +	if (i >= perf_cpu_map__nr(pcpus->cpus))
>  		return NULL;
>  
> -	return Py_BuildValue("i", pcpus->cpus->map[i]);
> +	return Py_BuildValue("i", perf_cpu_map__cpu(pcpus->cpus, i).cpu);
>  }
>  
>  static PySequenceMethods pyrf_cpu_map__sequence_methods = {
> -- 
> 2.35.0.rc0.227.g00780c9af4-goog

-- 

- Arnaldo
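
The root cause, in miniature: Py_BuildValue()'s "i" conversion expects
a C int, and after the commit named in the Fixes tag a map entry is the
one-member struct perf_cpu, so the member has to be passed rather than
the struct. An illustrative snippet, not code from the tree:

	#include <Python.h>  /* Py_BuildValue */

	struct perf_cpu { int cpu; };
	struct perf_cpu c = { .cpu = 4 };

	PyObject *bad = Py_BuildValue("i", c);     /* a struct through "i": wrong */
	PyObject *ok  = Py_BuildValue("i", c.cpu); /* the int that "i" expects    */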


* Re: [PATCH 3/3] perf cpumap: Add reference count checking
  2022-01-22  4:58 ` [PATCH 3/3] perf cpumap: Add reference count checking Ian Rogers
@ 2022-01-22 20:09   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-22 20:09 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Darren Hart,
	Davidlohr Bueso, André Almeida, James Clark, John Garry,
	Riccardo Mancini, Yury Norov, Andy Shevchenko, Andrew Morton,
	Jin Yao, Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter,
	Kan Liang, Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov, eranian

On Fri, Jan 21, 2022 at 08:58:11PM -0800, Ian Rogers wrote:
> Enabled when REFCNT_CHECKING is defined. The change adds a
> heap-allocated pointer that is interposed between the user and the
> reference-counted cpu map; it is allocated at a get and freed by a
> put. The pointer stands in for the original perf_cpu_map struct, so
> use of the perf_cpu_map via its APIs remains unchanged. Any access to
> the cpu map that bypasses the API needs two versions, typically
> handled via an UNWRAP macro.
> 
> This change is intended to catch:
>  - use after put: using a cpumap after you have put it will cause a
>    segv.
>  - unbalanced puts: two puts for a get will result in a double free
>    that can be captured and reported by tools like address sanitizer,
>    including with the associated stack traces of allocation and frees.
>  - missing puts: if a put is missing then the get turns into a memory
>    leak that can be reported by leak sanitizer, including the stack
>    trace at the point the get occurs.

I'll take a look at this after testing/pushing the perf/core queue to Linus.

- Arnaldo
 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/lib/perf/cpumap.c                  | 120 ++++++++++++++++-------
>  tools/lib/perf/include/internal/cpumap.h |  14 ++-
>  tools/perf/tests/cpumap.c                |  20 ++--
>  tools/perf/util/cpumap.c                 |  42 ++++----
>  tools/perf/util/pmu.c                    |  24 +++--
>  5 files changed, 146 insertions(+), 74 deletions(-)
> 
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index ee66760f1e63..d401d133f84b 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -10,10 +10,16 @@
>  #include <ctype.h>
>  #include <limits.h>
>  
> -static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
> +#ifndef REFCNT_CHECKING
> +#define UNWRAP_MAP(x) x
> +#else
> +#define UNWRAP_MAP(x) x->orig
> +#endif
> +
> +#ifndef REFCNT_CHECKING
> +struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
>  {
>  	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
> -
>  	if (cpus != NULL) {
>  		cpus->nr = nr_cpus;
>  		refcount_set(&cpus->refcnt, 1);
> @@ -21,13 +27,31 @@ static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
>  	}
>  	return cpus;
>  }
> +#else
> +struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
> +{
> +	struct perf_cpu_map *wrapped_cpus = NULL;
> +	struct original_perf_cpu_map *cpus =
> +		malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
> +	if (cpus != NULL) {
> +		cpus->nr = nr_cpus;
> +		refcount_set(&cpus->refcnt, 1);
> +		wrapped_cpus = malloc(sizeof(*wrapped_cpus));
> +		if (wrapped_cpus != NULL)
> +			wrapped_cpus->orig = cpus;
> +		else
> +			free(cpus);
> +	}
> +	return wrapped_cpus;
> +}
> +#endif
>  
>  struct perf_cpu_map *perf_cpu_map__dummy_new(void)
>  {
>  	struct perf_cpu_map *cpus = perf_cpu_map__alloc(1);
>  
>  	if (cpus)
> -		cpus->map[0].cpu = -1;
> +		UNWRAP_MAP(cpus)->map[0].cpu = -1;
>  
>  	return cpus;
>  }
> @@ -35,23 +59,45 @@ struct perf_cpu_map *perf_cpu_map__dummy_new(void)
>  static void cpu_map__delete(struct perf_cpu_map *map)
>  {
>  	if (map) {
> -		WARN_ONCE(refcount_read(&map->refcnt) != 0,
> +		WARN_ONCE(refcount_read(&UNWRAP_MAP(map)->refcnt) != 0,
>  			  "cpu_map refcnt unbalanced\n");
> +#ifdef REFCNT_CHECKING
> +		free(map->orig);
> +		map->orig = NULL;
> +#endif
>  		free(map);
>  	}
>  }
>  
>  struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
>  {
> -	if (map)
> -		refcount_inc(&map->refcnt);
> +	if (map) {
> +#ifdef REFCNT_CHECKING
> +		struct perf_cpu_map *new_wrapper;
> +#endif
> +		refcount_inc(&UNWRAP_MAP(map)->refcnt);
> +#ifdef REFCNT_CHECKING
> +		new_wrapper = malloc(sizeof(*new_wrapper));
> +		new_wrapper->orig = map->orig;
> +		map = new_wrapper;
> +#endif
> +	}
>  	return map;
>  }
>  
>  void perf_cpu_map__put(struct perf_cpu_map *map)
>  {
> -	if (map && refcount_dec_and_test(&map->refcnt))
> -		cpu_map__delete(map);
> +	if (map) {
> +		if (refcount_dec_and_test(&UNWRAP_MAP(map)->refcnt))
> +			cpu_map__delete(map);
> +		else {
> +#ifdef REFCNT_CHECKING
> +			/* Free the wrapper object but the reference counted object remains. */
> +			map->orig = NULL;
> +			free(map);
> +#endif
> +		}
> +	}
>  }
>  
>  static struct perf_cpu_map *cpu_map__default_new(void)
> @@ -68,7 +114,7 @@ static struct perf_cpu_map *cpu_map__default_new(void)
>  		int i;
>  
>  		for (i = 0; i < nr_cpus; ++i)
> -			cpus->map[i].cpu = i;
> +			UNWRAP_MAP(cpus)->map[i].cpu = i;
>  	}
>  
>  	return cpus;
> @@ -94,15 +140,16 @@ static struct perf_cpu_map *cpu_map__trim_new(int nr_cpus, const struct perf_cpu
>  	int i, j;
>  
>  	if (cpus != NULL) {
> -		memcpy(cpus->map, tmp_cpus, payload_size);
> -		qsort(cpus->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
> +		memcpy(UNWRAP_MAP(cpus)->map, tmp_cpus, payload_size);
> +		qsort(UNWRAP_MAP(cpus)->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
>  		/* Remove dups */
>  		j = 0;
>  		for (i = 0; i < nr_cpus; i++) {
> -			if (i == 0 || cpus->map[i].cpu != cpus->map[i - 1].cpu)
> -				cpus->map[j++].cpu = cpus->map[i].cpu;
> +			if (i == 0 ||
> +			    UNWRAP_MAP(cpus)->map[i].cpu != UNWRAP_MAP(cpus)->map[i - 1].cpu)
> +				UNWRAP_MAP(cpus)->map[j++].cpu = UNWRAP_MAP(cpus)->map[i].cpu;
>  		}
> -		cpus->nr = j;
> +		UNWRAP_MAP(cpus)->nr = j;
>  		assert(j <= nr_cpus);
>  	}
>  	return cpus;
> @@ -263,20 +310,20 @@ struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx)
>  		.cpu = -1
>  	};
>  
> -	if (cpus && idx < cpus->nr)
> -		return cpus->map[idx];
> +	if (cpus && idx < UNWRAP_MAP(cpus)->nr)
> +		return UNWRAP_MAP(cpus)->map[idx];
>  
>  	return result;
>  }
>  
>  int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
>  {
> -	return cpus ? cpus->nr : 1;
> +	return cpus ? UNWRAP_MAP(cpus)->nr : 1;
>  }
>  
>  bool perf_cpu_map__empty(const struct perf_cpu_map *map)
>  {
> -	return map ? map->map[0].cpu == -1 : true;
> +	return map ? UNWRAP_MAP(map)->map[0].cpu == -1 : true;
>  }
>  
>  int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
> @@ -287,10 +334,10 @@ int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
>  		return -1;
>  
>  	low = 0;
> -	high = cpus->nr;
> +	high = UNWRAP_MAP(cpus)->nr;
>  	while (low < high) {
>  		int idx = (low + high) / 2;
> -		struct perf_cpu cpu_at_idx = cpus->map[idx];
> +		struct perf_cpu cpu_at_idx = UNWRAP_MAP(cpus)->map[idx];
>  
>  		if (cpu_at_idx.cpu == cpu.cpu)
>  			return idx;
> @@ -316,7 +363,7 @@ struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map)
>  	};
>  
>  	// cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
> -	return map->nr > 0 ? map->map[map->nr - 1] : result;
> +	return UNWRAP_MAP(map)->nr > 0 ? UNWRAP_MAP(map)->map[UNWRAP_MAP(map)->nr - 1] : result;
>  }
>  
>  /*
> @@ -337,37 +384,36 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
>  
>  	if (!orig && !other)
>  		return NULL;
> -	if (!orig) {
> -		perf_cpu_map__get(other);
> -		return other;
> -	}
> +	if (!orig)
> +		return perf_cpu_map__get(other);
>  	if (!other)
>  		return orig;
> -	if (orig->nr == other->nr &&
> -	    !memcmp(orig->map, other->map, orig->nr * sizeof(struct perf_cpu)))
> +	if (UNWRAP_MAP(orig)->nr == UNWRAP_MAP(other)->nr &&
> +	    !memcmp(UNWRAP_MAP(orig)->map, UNWRAP_MAP(other)->map,
> +		    UNWRAP_MAP(orig)->nr * sizeof(struct perf_cpu)))
>  		return orig;
>  
> -	tmp_len = orig->nr + other->nr;
> +	tmp_len = UNWRAP_MAP(orig)->nr + UNWRAP_MAP(other)->nr;
>  	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
>  	if (!tmp_cpus)
>  		return NULL;
>  
>  	/* Standard merge algorithm from wikipedia */
>  	i = j = k = 0;
> -	while (i < orig->nr && j < other->nr) {
> -		if (orig->map[i].cpu <= other->map[j].cpu) {
> -			if (orig->map[i].cpu == other->map[j].cpu)
> +	while (i < UNWRAP_MAP(orig)->nr && j < UNWRAP_MAP(other)->nr) {
> +		if (UNWRAP_MAP(orig)->map[i].cpu <= UNWRAP_MAP(other)->map[j].cpu) {
> +			if (UNWRAP_MAP(orig)->map[i].cpu == UNWRAP_MAP(other)->map[j].cpu)
>  				j++;
> -			tmp_cpus[k++] = orig->map[i++];
> +			tmp_cpus[k++] = UNWRAP_MAP(orig)->map[i++];
>  		} else
> -			tmp_cpus[k++] = other->map[j++];
> +			tmp_cpus[k++] = UNWRAP_MAP(other)->map[j++];
>  	}
>  
> -	while (i < orig->nr)
> -		tmp_cpus[k++] = orig->map[i++];
> +	while (i < UNWRAP_MAP(orig)->nr)
> +		tmp_cpus[k++] = UNWRAP_MAP(orig)->map[i++];
>  
> -	while (j < other->nr)
> -		tmp_cpus[k++] = other->map[j++];
> +	while (j < UNWRAP_MAP(other)->nr)
> +		tmp_cpus[k++] = UNWRAP_MAP(other)->map[j++];
>  	assert(k <= tmp_len);
>  
>  	merged = cpu_map__trim_new(k, tmp_cpus);
> diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> index 581f9ffb4237..64ad56d167a0 100644
> --- a/tools/lib/perf/include/internal/cpumap.h
> +++ b/tools/lib/perf/include/internal/cpumap.h
> @@ -16,7 +16,12 @@ struct perf_cpu {
>   * gaps if CPU numbers were used. For events associated with a pid, rather than
>   * a CPU, a single dummy map with an entry of -1 is used.
>   */
> -struct perf_cpu_map {
> +#ifndef REFCNT_CHECKING
> +struct perf_cpu_map
> +#else
> +struct original_perf_cpu_map
> +#endif
> +{
>  	refcount_t	refcnt;
>  	/** Length of the map array. */
>  	int		nr;
> @@ -24,10 +29,17 @@ struct perf_cpu_map {
>  	struct perf_cpu	map[];
>  };
>  
> +#ifdef REFCNT_CHECKING
> +struct perf_cpu_map {
> +	struct original_perf_cpu_map *orig;
> +};
> +#endif
> +
>  #ifndef MAX_NR_CPUS
>  #define MAX_NR_CPUS	2048
>  #endif
>  
> +struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus);
>  int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu);
>  
>  #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
> diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
> index 84e87e31f119..65d9546aec6e 100644
> --- a/tools/perf/tests/cpumap.c
> +++ b/tools/perf/tests/cpumap.c
> @@ -11,6 +11,12 @@
>  
>  struct machine;
>  
> +#ifndef REFCNT_CHECKING
> +#define UNWRAP_CPUMAP(x) x
> +#else
> +#define UNWRAP_CPUMAP(x) x->orig
> +#endif
> +
>  static int process_event_mask(struct perf_tool *tool __maybe_unused,
>  			 union perf_event *event,
>  			 struct perf_sample *sample __maybe_unused,
> @@ -35,10 +41,10 @@ static int process_event_mask(struct perf_tool *tool __maybe_unused,
>  	}
>  
>  	map = cpu_map__new_data(data);
> -	TEST_ASSERT_VAL("wrong nr",  map->nr == 20);
> +	TEST_ASSERT_VAL("wrong nr",  perf_cpu_map__nr(map) == 20);
>  
>  	for (i = 0; i < 20; i++) {
> -		TEST_ASSERT_VAL("wrong cpu", map->map[i].cpu == i);
> +		TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, i).cpu == i);
>  	}
>  
>  	perf_cpu_map__put(map);
> @@ -66,10 +72,10 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
>  	TEST_ASSERT_VAL("wrong cpu",  cpus->cpu[1] == 256);
>  
>  	map = cpu_map__new_data(data);
> -	TEST_ASSERT_VAL("wrong nr",  map->nr == 2);
> -	TEST_ASSERT_VAL("wrong cpu", map->map[0].cpu == 1);
> -	TEST_ASSERT_VAL("wrong cpu", map->map[1].cpu == 256);
> -	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&map->refcnt) == 1);
> +	TEST_ASSERT_VAL("wrong nr",  perf_cpu_map__nr(map) == 2);
> +	TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, 0).cpu == 1);
> +	TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, 1).cpu == 256);
> +	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&UNWRAP_CPUMAP(map)->refcnt) == 1);
>  	perf_cpu_map__put(map);
>  	return 0;
>  }
> @@ -130,7 +136,7 @@ static int test__cpu_map_merge(struct test_suite *test __maybe_unused, int subte
>  	struct perf_cpu_map *c = perf_cpu_map__merge(a, b);
>  	char buf[100];
>  
> -	TEST_ASSERT_VAL("failed to merge map: bad nr", c->nr == 5);
> +	TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(c) == 5);
>  	cpu_map__snprint(c, buf, sizeof(buf));
>  	TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, "1-2,4-5,7"));
>  	perf_cpu_map__put(b);
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 12b2243222b0..e9976ca238fc 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -22,6 +22,12 @@ static int max_node_num;
>   */
>  static int *cpunode_map;
>  
> +#ifndef REFCNT_CHECKING
> +#define UNWRAP_MAP(x) x
> +#else
> +#define UNWRAP_MAP(x) x->orig
> +#endif
> +
>  static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
>  {
>  	struct perf_cpu_map *map;
> @@ -37,9 +43,9 @@ static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
>  			 * otherwise it would become 65535.
>  			 */
>  			if (cpus->cpu[i] == (u16) -1)
> -				map->map[i].cpu = -1;
> +				UNWRAP_MAP(map)->map[i].cpu = -1;
>  			else
> -				map->map[i].cpu = (int) cpus->cpu[i];
> +				UNWRAP_MAP(map)->map[i].cpu = (int) cpus->cpu[i];
>  		}
>  	}
>  
> @@ -58,7 +64,7 @@ static struct perf_cpu_map *cpu_map__from_mask(struct perf_record_record_cpu_map
>  		int cpu, i = 0;
>  
>  		for_each_set_bit(cpu, mask->mask, nbits)
> -			map->map[i++].cpu = cpu;
> +			UNWRAP_MAP(map)->map[i++].cpu = cpu;
>  	}
>  	return map;
>  
> @@ -84,16 +90,13 @@ size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp)
>  
>  struct perf_cpu_map *perf_cpu_map__empty_new(int nr)
>  {
> -	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(int) * nr);
> +	struct perf_cpu_map *cpus = perf_cpu_map__alloc(nr);
>  
>  	if (cpus != NULL) {
>  		int i;
>  
> -		cpus->nr = nr;
>  		for (i = 0; i < nr; i++)
> -			cpus->map[i].cpu = -1;
> -
> -		refcount_set(&cpus->refcnt, 1);
> +			UNWRAP_MAP(cpus)->map[i].cpu = -1;
>  	}
>  
>  	return cpus;
> @@ -163,7 +166,7 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
>  {
>  	int idx;
>  	struct perf_cpu cpu;
> -	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(cpus->nr);
> +	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(perf_cpu_map__nr(cpus));
>  
>  	if (!c)
>  		return NULL;
> @@ -187,7 +190,7 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
>  		}
>  	}
>  	/* Trim. */
> -	if (c->nr != cpus->nr) {
> +	if (c->nr != perf_cpu_map__nr(cpus)) {
>  		struct cpu_aggr_map *trimmed_c =
>  			realloc(c,
>  				sizeof(struct cpu_aggr_map) + sizeof(struct aggr_cpu_id) * c->nr);
> @@ -494,31 +497,32 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
>  
>  #define COMMA first ? "" : ","
>  
> -	for (i = 0; i < map->nr + 1; i++) {
> +	for (i = 0; i < perf_cpu_map__nr(map) + 1; i++) {
>  		struct perf_cpu cpu = { .cpu = INT_MAX };
> -		bool last = i == map->nr;
> +		bool last = i == perf_cpu_map__nr(map);
>  
>  		if (!last)
> -			cpu = map->map[i];
> +			cpu = perf_cpu_map__cpu(map, i);
>  
>  		if (start == -1) {
>  			start = i;
>  			if (last) {
>  				ret += snprintf(buf + ret, size - ret,
>  						"%s%d", COMMA,
> -						map->map[i].cpu);
> +						perf_cpu_map__cpu(map, i).cpu);
>  			}
> -		} else if (((i - start) != (cpu.cpu - map->map[start].cpu)) || last) {
> +		} else if (((i - start) != (cpu.cpu - perf_cpu_map__cpu(map, start).cpu)) || last) {
>  			int end = i - 1;
>  
>  			if (start == end) {
>  				ret += snprintf(buf + ret, size - ret,
>  						"%s%d", COMMA,
> -						map->map[start].cpu);
> +						perf_cpu_map__cpu(map, start).cpu);
>  			} else {
>  				ret += snprintf(buf + ret, size - ret,
>  						"%s%d-%d", COMMA,
> -						map->map[start].cpu, map->map[end].cpu);
> +						perf_cpu_map__cpu(map, start).cpu,
> +						perf_cpu_map__cpu(map, end).cpu);
>  			}
>  			first = false;
>  			start = i;
> @@ -545,7 +549,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
>  	int i, cpu;
>  	char *ptr = buf;
>  	unsigned char *bitmap;
> -	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, map->nr - 1);
> +	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, perf_cpu_map__nr(map) - 1);
>  
>  	if (buf == NULL)
>  		return 0;
> @@ -556,7 +560,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
>  		return 0;
>  	}
>  
> -	for (i = 0; i < map->nr; i++) {
> +	for (i = 0; i < perf_cpu_map__nr(map); i++) {
>  		cpu = perf_cpu_map__cpu(map, i).cpu;
>  		bitmap[cpu / 8] |= 1 << (cpu % 8);
>  	}
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 8dfbba15aeb8..1649321fe86f 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -1992,13 +1992,20 @@ int perf_pmu__match(char *pattern, char *name, char *tok)
>  	return 0;
>  }
>  
> +#ifndef REFCNT_CHECKING
> +#define UNWRAP_CPUMAP(x) x
> +#else
> +#define UNWRAP_CPUMAP(x) x->orig
> +#endif
> +
>  int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
>  			 struct perf_cpu_map **mcpus_ptr,
>  			 struct perf_cpu_map **ucpus_ptr)
>  {
>  	struct perf_cpu_map *pmu_cpus = pmu->cpus;
>  	struct perf_cpu_map *matched_cpus, *unmatched_cpus;
> -	int matched_nr = 0, unmatched_nr = 0;
> +	struct perf_cpu cpu;
> +	int i, matched_nr = 0, unmatched_nr = 0;
>  
>  	matched_cpus = perf_cpu_map__default_new();
>  	if (!matched_cpus)
> @@ -2010,18 +2017,15 @@ int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
>  		return -1;
>  	}
>  
> -	for (int i = 0; i < cpus->nr; i++) {
> -		int cpu;
> -
> -		cpu = perf_cpu_map__idx(pmu_cpus, cpus->map[i]);
> -		if (cpu == -1)
> -			unmatched_cpus->map[unmatched_nr++] = cpus->map[i];
> +	perf_cpu_map__for_each_cpu(cpu, i, cpus) {
> +		if (!perf_cpu_map__has(pmu_cpus, cpu))
> +			UNWRAP_CPUMAP(unmatched_cpus)->map[unmatched_nr++] = cpu;
>  		else
> -			matched_cpus->map[matched_nr++] = cpus->map[i];
> +			UNWRAP_CPUMAP(matched_cpus)->map[matched_nr++] = cpu;
>  	}
>  
> -	unmatched_cpus->nr = unmatched_nr;
> -	matched_cpus->nr = matched_nr;
> +	UNWRAP_CPUMAP(unmatched_cpus)->nr = unmatched_nr;
> +	UNWRAP_CPUMAP(matched_cpus)->nr = matched_nr;
>  	*mcpus_ptr = matched_cpus;
>  	*ucpus_ptr = unmatched_cpus;
>  	return 0;
> -- 
> 2.35.0.rc0.227.g00780c9af4-goog

-- 

- Arnaldo
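
For reference, what the three failure modes from the commit message
look like at a call site, assuming a build with REFCNT_CHECKING
defined (hypothetical usage; each commented line triggers one bug in
isolation; perf_cpu_map__new() is the ordinary libperf constructor):

	struct perf_cpu_map *a = perf_cpu_map__new("0-3");
	struct perf_cpu_map *b = perf_cpu_map__get(a); /* b is a fresh wrapper */

	perf_cpu_map__put(b);
	perf_cpu_map__put(b); /* unbalanced put: double free under ASan        */
	perf_cpu_map__nr(b);  /* use after put: segv through the freed wrapper */
	                      /* omitting perf_cpu_map__put(a) instead: leak
	                         sanitizer reports the allocation behind 'a'   */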


