linux-kernel.vger.kernel.org archive mirror
* [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes
@ 2024-01-03  5:06 Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses Ian Rogers
                   ` (25 more replies)
  0 siblings, 26 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Modify the implementation of maps to use a sorted array rather than an
rbtree as the container for maps, and fix locking and reference
counting issues.

Similar to maps, separate out and reimplement threads to use a hashmap
for lower memory consumption and faster lookup. This fixes a memory
usage regression introduced when reference count checking switched to
using non-invasive tree nodes. Reduce the default table size by 32
times and improve locking discipline. Also, fix regressions where tids
had become unordered, to make `perf report --tasks` and
`perf trace --summary` output easier to read.
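
The bucket scheme behind a small fixed-size chained hashmap keyed by
tid can be sketched as follows (illustrative only; the names and the
hash function are assumptions, not the perf threads API):

```c
#include <stddef.h>

/* Illustrative bucket selection for a small fixed-size hashmap keyed
 * by tid. The series shrinks the table from 256 to 8 buckets; the
 * modulo hash here is a stand-in, not copied from the patches. */
#define TID_TABLE_SIZE 8

static size_t tid_bucket(int tid)
{
	/* Cheap hash: the low bits of the tid pick the bucket. */
	return (size_t)tid % TID_TABLE_SIZE;
}
```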

Better encapsulate the dsos abstraction. Replace the linked list and
rbtree used for fast iteration and O(log n) lookup with a sorted
array, giving similar performance at half the memory usage per
dso. Improve reference counting and locking discipline, adding
reference count checking to dso.

v7:
 - rebase to latest perf-tools-next where 22 patches were applied by Arnaldo.
 - resolve merge conflicts, in particular with fc044c53b99f ("perf
   annotate-data: Add dso->data_types tree") that required more dso
   accessor functions.

v6 series is here:
https://lore.kernel.org/lkml/20231207011722.1220634-1-irogers@google.com/

Ian Rogers (25):
  perf maps: Switch from rbtree to lazily sorted array for addresses
  perf maps: Get map before returning in maps__find
  perf maps: Get map before returning in maps__find_by_name
  perf maps: Get map before returning in maps__find_next_entry
  perf maps: Hide maps internals
  perf maps: Locking tidy up of nr_maps
  perf dso: Reorder variables to save space in struct dso
  perf report: Sort child tasks by tid
  perf trace: Ignore thread hashing in summary
  perf machine: Move fprintf to for_each loop and a callback
  perf threads: Move threads to its own files
  perf threads: Switch from rbtree to hashmap
  perf threads: Reduce table size from 256 to 8
  perf dsos: Attempt to better abstract dsos internals
  perf dsos: Tidy reference counting and locking
  perf dsos: Add dsos__for_each_dso
  perf dso: Move dso functions out of dsos
  perf dsos: Switch more loops to dsos__for_each_dso
  perf dsos: Switch backing storage to array from rbtree/list
  perf dsos: Remove __dsos__addnew
  perf dsos: Remove __dsos__findnew_link_by_longname_id
  perf dsos: Switch hand code to bsearch
  perf dso: Add reference count checking and accessor functions
  perf dso: Reference counting related fixes
  perf dso: Use container_of to avoid a pointer in dso_data

 tools/perf/arch/x86/tests/dwarf-unwind.c      |    1 +
 tools/perf/builtin-annotate.c                 |    8 +-
 tools/perf/builtin-buildid-cache.c            |    2 +-
 tools/perf/builtin-buildid-list.c             |   18 +-
 tools/perf/builtin-inject.c                   |   96 +-
 tools/perf/builtin-kallsyms.c                 |    2 +-
 tools/perf/builtin-mem.c                      |    4 +-
 tools/perf/builtin-record.c                   |    2 +-
 tools/perf/builtin-report.c                   |  209 +--
 tools/perf/builtin-script.c                   |    8 +-
 tools/perf/builtin-top.c                      |    4 +-
 tools/perf/builtin-trace.c                    |   43 +-
 tools/perf/tests/code-reading.c               |    8 +-
 tools/perf/tests/dso-data.c                   |   67 +-
 tools/perf/tests/hists_common.c               |    6 +-
 tools/perf/tests/hists_cumulate.c             |    4 +-
 tools/perf/tests/hists_output.c               |    2 +-
 tools/perf/tests/maps.c                       |    7 +-
 tools/perf/tests/symbols.c                    |    2 +-
 tools/perf/tests/thread-maps-share.c          |    8 +-
 tools/perf/tests/vmlinux-kallsyms.c           |   16 +-
 tools/perf/ui/browsers/annotate.c             |    6 +-
 tools/perf/ui/browsers/hists.c                |    8 +-
 tools/perf/ui/browsers/map.c                  |    4 +-
 tools/perf/util/Build                         |    1 +
 tools/perf/util/annotate-data.c               |    6 +-
 tools/perf/util/annotate.c                    |   45 +-
 tools/perf/util/auxtrace.c                    |    2 +-
 tools/perf/util/block-info.c                  |    2 +-
 tools/perf/util/bpf-event.c                   |    9 +-
 tools/perf/util/bpf_lock_contention.c         |    8 +-
 tools/perf/util/build-id.c                    |  136 +-
 tools/perf/util/build-id.h                    |    2 -
 tools/perf/util/callchain.c                   |    4 +-
 tools/perf/util/data-convert-json.c           |    2 +-
 tools/perf/util/db-export.c                   |    6 +-
 tools/perf/util/dlfilter.c                    |   12 +-
 tools/perf/util/dso.c                         |  469 +++---
 tools/perf/util/dso.h                         |  549 +++++--
 tools/perf/util/dsos.c                        |  529 ++++---
 tools/perf/util/dsos.h                        |   40 +-
 tools/perf/util/event.c                       |   12 +-
 tools/perf/util/header.c                      |    8 +-
 tools/perf/util/hist.c                        |    4 +-
 tools/perf/util/intel-pt.c                    |   22 +-
 tools/perf/util/machine.c                     |  570 +++-----
 tools/perf/util/machine.h                     |   32 +-
 tools/perf/util/map.c                         |   73 +-
 tools/perf/util/maps.c                        | 1280 +++++++++++------
 tools/perf/util/maps.h                        |   65 +-
 tools/perf/util/probe-event.c                 |   26 +-
 tools/perf/util/rb_resort.h                   |    5 -
 .../util/scripting-engines/trace-event-perl.c |    6 +-
 .../scripting-engines/trace-event-python.c    |   21 +-
 tools/perf/util/session.c                     |   21 +
 tools/perf/util/session.h                     |    2 +
 tools/perf/util/sort.c                        |   19 +-
 tools/perf/util/srcline.c                     |   65 +-
 tools/perf/util/symbol-elf.c                  |  132 +-
 tools/perf/util/symbol.c                      |  217 +--
 tools/perf/util/symbol_fprintf.c              |    4 +-
 tools/perf/util/synthetic-events.c            |   24 +-
 tools/perf/util/thread.c                      |    8 +-
 tools/perf/util/thread.h                      |    6 -
 tools/perf/util/threads.c                     |  186 +++
 tools/perf/util/threads.h                     |   35 +
 tools/perf/util/unwind-libunwind-local.c      |   20 +-
 tools/perf/util/unwind-libunwind.c            |    9 +-
 tools/perf/util/vdso.c                        |   56 +-
 69 files changed, 3127 insertions(+), 2158 deletions(-)
 create mode 100644 tools/perf/util/threads.c
 create mode 100644 tools/perf/util/threads.h

-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-02-02  2:48   ` Namhyung Kim
  2024-01-03  5:06 ` [PATCH v7 02/25] perf maps: Get map before returning in maps__find Ian Rogers
                   ` (24 subsequent siblings)
  25 siblings, 1 reply; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Maps is a collection of maps primarily sorted by the starting address
of the map. Prior to this change the maps were held in an rbtree
requiring 4 pointers per node. Prior to reference count checking, the
rbnode was embedded in the map so 3 pointers per node were
necessary. This change switches the rbtree to an array lazily sorted
by address, much like the array of maps sorted by name. 1 pointer is
needed per node, but to avoid excessive resizing the backing array may
be twice the number of used elements, meaning the memory overhead is
roughly half that of the rbtree. For a
`perf record --no-bpf-event -g -a` of the `true` command, the memory
overhead of perf inject is reduced from 3.3MB to 3.0MB, so 10% or
300KB is saved.

Map inserts always happen at the end of the array. The code tracks
whether the insertion violates the sorting property. O(log n) rb-tree
complexity is switched to O(1).
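
The append-then-track-sortedness idea can be sketched independently of
the perf code (a minimal model with illustrative names, not the actual
maps API):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Minimal sketch of a lazily sorted array of [start, end) ranges. */
struct range { unsigned long start, end; };

struct ranges {
	struct range *entries;
	size_t nr, allocated;
	bool sorted;
};

/* O(1) amortized insert at the end; only records whether the sort
 * property was broken rather than restoring it eagerly. */
static int ranges__insert(struct ranges *rs, struct range r)
{
	if (rs->nr == rs->allocated) {
		size_t new_alloc = rs->allocated ? rs->allocated * 2 : 32;
		struct range *tmp = realloc(rs->entries,
					    new_alloc * sizeof(*tmp));

		if (!tmp)
			return -1;
		rs->entries = tmp;
		rs->allocated = new_alloc;
	}
	rs->entries[rs->nr] = r;
	/* Still sorted if previously empty, or if the new range starts
	 * at or after the end of the last one. */
	rs->sorted = rs->nr == 0 ||
		     (rs->sorted && rs->entries[rs->nr - 1].end <= r.start);
	rs->nr++;
	return 0;
}
```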

Removal slides later entries down in the array, so O(log n) rb-tree
complexity degrades to O(n).

A find may need to sort the array using qsort which is O(n*log n), but
in general the maps should be sorted and so average performance should
be O(log n) as with the rbtree.
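
The sort-on-first-find pattern described above can be sketched like so
(a simplified model over plain addresses, not the actual maps__find):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Sketch of sort-on-demand lookup over a lazily sorted array. */
struct addr_array {
	unsigned long *addrs;
	size_t nr;
	bool sorted;
};

static int cmp_ulong(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return x < y ? -1 : (x > y ? 1 : 0);
}

/* Sort lazily: O(n log n) once after inserts broke the order, then
 * O(log n) bsearch on subsequent calls. */
static bool addr_array__find(struct addr_array *arr, unsigned long key)
{
	if (!arr->sorted) {
		qsort(arr->addrs, arr->nr, sizeof(*arr->addrs), cmp_ulong);
		arr->sorted = true;
	}
	return bsearch(&key, arr->addrs, arr->nr, sizeof(*arr->addrs),
		       cmp_ulong) != NULL;
}
```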

An rbtree node consumes a cache line, but with the array 4 nodes fit
on a cache line. Iteration is simplified to scanning an array rather
than pointer chasing.

Overall it is expected the performance after the change should be
comparable to before, but with half of the memory consumed.

To avoid a list and repeated logic around splitting maps,
maps__merge_in is rewritten in terms of
maps__fixup_overlap_and_insert. maps__merge_in splits the given
mapping, inserting the remaining gaps. maps__fixup_overlap_and_insert
splits the existing mappings, then adds the incoming mapping. By
adding the new mapping first, then re-inserting the existing mappings,
the splitting behavior matches.
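
The splitting amounts to clipping an existing [start, end) mapping
around the incoming one, leaving optional "before" and "after"
remnants; roughly (illustrative arithmetic only, not the perf
helpers):

```c
#include <stdbool.h>

/* Illustrative only: clip an existing [old_start, old_end) range
 * around a new [new_start, new_end) one, producing the optional
 * remnants that the overlap fixup would re-insert. */
struct rng { unsigned long start, end; bool valid; };

static void split_around(unsigned long old_start, unsigned long old_end,
			 unsigned long new_start, unsigned long new_end,
			 struct rng *before, struct rng *after)
{
	/* Piece of the old mapping left of the new one, if any. */
	before->start = old_start;
	before->end = new_start;
	before->valid = old_start < new_start;
	/* Piece of the old mapping right of the new one, if any. */
	after->start = new_end;
	after->end = old_end;
	after->valid = new_end < old_end;
}
```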

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/maps.c |    3 +
 tools/perf/util/map.c   |    1 +
 tools/perf/util/maps.c  | 1183 +++++++++++++++++++++++----------------
 tools/perf/util/maps.h  |   54 +-
 4 files changed, 757 insertions(+), 484 deletions(-)

diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c
index bb3fbfe5a73e..b15417a0d617 100644
--- a/tools/perf/tests/maps.c
+++ b/tools/perf/tests/maps.c
@@ -156,6 +156,9 @@ static int test__maps__merge_in(struct test_suite *t __maybe_unused, int subtest
 	TEST_ASSERT_VAL("merge check failed", !ret);
 
 	maps__zput(maps);
+	map__zput(map_kcore1);
+	map__zput(map_kcore2);
+	map__zput(map_kcore3);
 	return TEST_OK;
 }
 
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 54c67cb7ecef..cf5a15db3a1f 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -168,6 +168,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 		if (dso == NULL)
 			goto out_delete;
 
+		assert(!dso->kernel);
 		map__init(result, start, start + len, pgoff, dso);
 
 		if (anon || no_dso) {
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index 0334fc18d9c6..6ee81160cdab 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -10,286 +10,477 @@
 #include "ui/ui.h"
 #include "unwind.h"
 
-struct map_rb_node {
-	struct rb_node rb_node;
-	struct map *map;
-};
-
-#define maps__for_each_entry(maps, map) \
-	for (map = maps__first(maps); map; map = map_rb_node__next(map))
+static void check_invariants(const struct maps *maps __maybe_unused)
+{
+#ifndef NDEBUG
+	assert(RC_CHK_ACCESS(maps)->nr_maps <= RC_CHK_ACCESS(maps)->nr_maps_allocated);
+	for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
+		struct map *map = RC_CHK_ACCESS(maps)->maps_by_address[i];
+
+		/* Check map is well-formed. */
+		assert(map__end(map) == 0 || map__start(map) <= map__end(map));
+		/* Expect at least 1 reference count. */
+		assert(refcount_read(map__refcnt(map)) > 0);
+
+		if (map__dso(map) && map__dso(map)->kernel)
+			assert(RC_CHK_EQUAL(map__kmap(map)->kmaps, maps));
+
+		if (i > 0) {
+			struct map *prev = RC_CHK_ACCESS(maps)->maps_by_address[i - 1];
+
+			/* If addresses are sorted... */
+			if (RC_CHK_ACCESS(maps)->maps_by_address_sorted) {
+				/* Maps should be in start address order. */
+				assert(map__start(prev) <= map__start(map));
+				/*
+				 * If the ends of maps aren't broken (during
+				 * construction) then they should be ordered
+				 * too.
+				 */
+				if (!RC_CHK_ACCESS(maps)->ends_broken) {
+					assert(map__end(prev) <= map__end(map));
+					assert(map__end(prev) <= map__start(map) ||
+					       map__start(prev) == map__start(map));
+				}
+			}
+		}
+	}
+	if (RC_CHK_ACCESS(maps)->maps_by_name) {
+		for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
+			struct map *map = RC_CHK_ACCESS(maps)->maps_by_name[i];
 
-#define maps__for_each_entry_safe(maps, map, next) \
-	for (map = maps__first(maps), next = map_rb_node__next(map); map; \
-	     map = next, next = map_rb_node__next(map))
+			/*
+			 * Maps by name maps should be in maps_by_address, so
+			 * the reference count should be higher.
+			 */
+			assert(refcount_read(map__refcnt(map)) > 1);
+		}
+	}
+#endif
+}
 
-static struct rb_root *maps__entries(struct maps *maps)
+static struct map **maps__maps_by_address(const struct maps *maps)
 {
-	return &RC_CHK_ACCESS(maps)->entries;
+	return RC_CHK_ACCESS(maps)->maps_by_address;
 }
 
-static struct rw_semaphore *maps__lock(struct maps *maps)
+static void maps__set_maps_by_address(struct maps *maps, struct map **new)
 {
-	return &RC_CHK_ACCESS(maps)->lock;
+	RC_CHK_ACCESS(maps)->maps_by_address = new;
+
 }
 
-static struct map **maps__maps_by_name(struct maps *maps)
+/* Not in the header, to aid reference counting. */
+static struct map **maps__maps_by_name(const struct maps *maps)
 {
 	return RC_CHK_ACCESS(maps)->maps_by_name;
+
 }
 
-static struct map_rb_node *maps__first(struct maps *maps)
+static void maps__set_maps_by_name(struct maps *maps, struct map **new)
 {
-	struct rb_node *first = rb_first(maps__entries(maps));
+	RC_CHK_ACCESS(maps)->maps_by_name = new;
 
-	if (first)
-		return rb_entry(first, struct map_rb_node, rb_node);
-	return NULL;
 }
 
-static struct map_rb_node *map_rb_node__next(struct map_rb_node *node)
+static bool maps__maps_by_address_sorted(const struct maps *maps)
 {
-	struct rb_node *next;
-
-	if (!node)
-		return NULL;
-
-	next = rb_next(&node->rb_node);
+	return RC_CHK_ACCESS(maps)->maps_by_address_sorted;
+}
 
-	if (!next)
-		return NULL;
+static void maps__set_maps_by_address_sorted(struct maps *maps, bool value)
+{
+	RC_CHK_ACCESS(maps)->maps_by_address_sorted = value;
+}
 
-	return rb_entry(next, struct map_rb_node, rb_node);
+static bool maps__maps_by_name_sorted(const struct maps *maps)
+{
+	return RC_CHK_ACCESS(maps)->maps_by_name_sorted;
 }
 
-static struct map_rb_node *maps__find_node(struct maps *maps, struct map *map)
+static void maps__set_maps_by_name_sorted(struct maps *maps, bool value)
 {
-	struct map_rb_node *rb_node;
+	RC_CHK_ACCESS(maps)->maps_by_name_sorted = value;
+}
 
-	maps__for_each_entry(maps, rb_node) {
-		if (rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map))
-			return rb_node;
-	}
-	return NULL;
+static struct rw_semaphore *maps__lock(struct maps *maps)
+{
+	/*
+	 * When the lock is acquired or released the maps invariants should
+	 * hold.
+	 */
+	check_invariants(maps);
+	return &RC_CHK_ACCESS(maps)->lock;
 }
 
 static void maps__init(struct maps *maps, struct machine *machine)
 {
-	refcount_set(maps__refcnt(maps), 1);
 	init_rwsem(maps__lock(maps));
-	RC_CHK_ACCESS(maps)->entries = RB_ROOT;
+	RC_CHK_ACCESS(maps)->maps_by_address = NULL;
+	RC_CHK_ACCESS(maps)->maps_by_name = NULL;
 	RC_CHK_ACCESS(maps)->machine = machine;
-	RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
+#ifdef HAVE_LIBUNWIND_SUPPORT
+	RC_CHK_ACCESS(maps)->addr_space = NULL;
+	RC_CHK_ACCESS(maps)->unwind_libunwind_ops = NULL;
+#endif
+	refcount_set(maps__refcnt(maps), 1);
 	RC_CHK_ACCESS(maps)->nr_maps = 0;
-	RC_CHK_ACCESS(maps)->maps_by_name = NULL;
+	RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
+	RC_CHK_ACCESS(maps)->last_search_by_name_idx = 0;
+	RC_CHK_ACCESS(maps)->maps_by_address_sorted = true;
+	RC_CHK_ACCESS(maps)->maps_by_name_sorted = false;
 }
 
-static void __maps__free_maps_by_name(struct maps *maps)
+static void maps__exit(struct maps *maps)
 {
-	/*
-	 * Free everything to try to do it from the rbtree in the next search
-	 */
-	for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
-		map__put(maps__maps_by_name(maps)[i]);
+	struct map **maps_by_address = maps__maps_by_address(maps);
+	struct map **maps_by_name = maps__maps_by_name(maps);
 
-	zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
-	RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
+	for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
+		map__zput(maps_by_address[i]);
+		if (maps_by_name)
+			map__zput(maps_by_name[i]);
+	}
+	zfree(&maps_by_address);
+	zfree(&maps_by_name);
+	unwind__finish_access(maps);
 }
 
-static int __maps__insert(struct maps *maps, struct map *map)
+struct maps *maps__new(struct machine *machine)
 {
-	struct rb_node **p = &maps__entries(maps)->rb_node;
-	struct rb_node *parent = NULL;
-	const u64 ip = map__start(map);
-	struct map_rb_node *m, *new_rb_node;
-
-	new_rb_node = malloc(sizeof(*new_rb_node));
-	if (!new_rb_node)
-		return -ENOMEM;
-
-	RB_CLEAR_NODE(&new_rb_node->rb_node);
-	new_rb_node->map = map__get(map);
+	struct maps *result;
+	RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
 
-	while (*p != NULL) {
-		parent = *p;
-		m = rb_entry(parent, struct map_rb_node, rb_node);
-		if (ip < map__start(m->map))
-			p = &(*p)->rb_left;
-		else
-			p = &(*p)->rb_right;
-	}
+	if (ADD_RC_CHK(result, maps))
+		maps__init(result, machine);
 
-	rb_link_node(&new_rb_node->rb_node, parent, p);
-	rb_insert_color(&new_rb_node->rb_node, maps__entries(maps));
-	return 0;
+	return result;
 }
 
-int maps__insert(struct maps *maps, struct map *map)
+static void maps__delete(struct maps *maps)
 {
-	int err;
-	const struct dso *dso = map__dso(map);
-
-	down_write(maps__lock(maps));
-	err = __maps__insert(maps, map);
-	if (err)
-		goto out;
+	maps__exit(maps);
+	RC_CHK_FREE(maps);
+}
 
-	++RC_CHK_ACCESS(maps)->nr_maps;
+struct maps *maps__get(struct maps *maps)
+{
+	struct maps *result;
 
-	if (dso && dso->kernel) {
-		struct kmap *kmap = map__kmap(map);
+	if (RC_CHK_GET(result, maps))
+		refcount_inc(maps__refcnt(maps));
 
-		if (kmap)
-			kmap->kmaps = maps;
-		else
-			pr_err("Internal error: kernel dso with non kernel map\n");
-	}
+	return result;
+}
 
+void maps__put(struct maps *maps)
+{
+	if (maps && refcount_dec_and_test(maps__refcnt(maps)))
+		maps__delete(maps);
+	else
+		RC_CHK_PUT(maps);
+}
 
+static void __maps__free_maps_by_name(struct maps *maps)
+{
 	/*
-	 * If we already performed some search by name, then we need to add the just
-	 * inserted map and resort.
+	 * Free everything to try to do it from the rbtree in the next search
 	 */
-	if (maps__maps_by_name(maps)) {
-		if (maps__nr_maps(maps) > RC_CHK_ACCESS(maps)->nr_maps_allocated) {
-			int nr_allocate = maps__nr_maps(maps) * 2;
-			struct map **maps_by_name = realloc(maps__maps_by_name(maps),
-							    nr_allocate * sizeof(map));
+	for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
+		map__put(maps__maps_by_name(maps)[i]);
 
-			if (maps_by_name == NULL) {
-				__maps__free_maps_by_name(maps);
-				err = -ENOMEM;
-				goto out;
-			}
+	zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
+}
 
-			RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
-			RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
+static int map__start_cmp(const void *a, const void *b)
+{
+	const struct map *map_a = *(const struct map * const *)a;
+	const struct map *map_b = *(const struct map * const *)b;
+	u64 map_a_start = map__start(map_a);
+	u64 map_b_start = map__start(map_b);
+
+	if (map_a_start == map_b_start) {
+		u64 map_a_end = map__end(map_a);
+		u64 map_b_end = map__end(map_b);
+
+		if  (map_a_end == map_b_end) {
+			/* Ensure maps with the same addresses have a fixed order. */
+			if (RC_CHK_ACCESS(map_a) == RC_CHK_ACCESS(map_b))
+				return 0;
+			return (intptr_t)RC_CHK_ACCESS(map_a) > (intptr_t)RC_CHK_ACCESS(map_b)
+				? 1 : -1;
 		}
-		maps__maps_by_name(maps)[maps__nr_maps(maps) - 1] = map__get(map);
-		__maps__sort_by_name(maps);
+		return map_a_end > map_b_end ? 1 : -1;
 	}
- out:
-	up_write(maps__lock(maps));
-	return err;
+	return map_a_start > map_b_start ? 1 : -1;
 }
 
-static void __maps__remove(struct maps *maps, struct map_rb_node *rb_node)
+static void __maps__sort_by_address(struct maps *maps)
 {
-	rb_erase_init(&rb_node->rb_node, maps__entries(maps));
-	map__put(rb_node->map);
-	free(rb_node);
+	if (maps__maps_by_address_sorted(maps))
+		return;
+
+	qsort(maps__maps_by_address(maps),
+		maps__nr_maps(maps),
+		sizeof(struct map *),
+		map__start_cmp);
+	maps__set_maps_by_address_sorted(maps, true);
 }
 
-void maps__remove(struct maps *maps, struct map *map)
+static void maps__sort_by_address(struct maps *maps)
 {
-	struct map_rb_node *rb_node;
-
 	down_write(maps__lock(maps));
-	if (RC_CHK_ACCESS(maps)->last_search_by_name == map)
-		RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
-
-	rb_node = maps__find_node(maps, map);
-	assert(rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map));
-	__maps__remove(maps, rb_node);
-	if (maps__maps_by_name(maps))
-		__maps__free_maps_by_name(maps);
-	--RC_CHK_ACCESS(maps)->nr_maps;
+	__maps__sort_by_address(maps);
 	up_write(maps__lock(maps));
 }
 
-static void __maps__purge(struct maps *maps)
+static int map__strcmp(const void *a, const void *b)
 {
-	struct map_rb_node *pos, *next;
-
-	if (maps__maps_by_name(maps))
-		__maps__free_maps_by_name(maps);
+	const struct map *map_a = *(const struct map * const *)a;
+	const struct map *map_b = *(const struct map * const *)b;
+	const struct dso *dso_a = map__dso(map_a);
+	const struct dso *dso_b = map__dso(map_b);
+	int ret = strcmp(dso_a->short_name, dso_b->short_name);
 
-	maps__for_each_entry_safe(maps, pos, next) {
-		rb_erase_init(&pos->rb_node,  maps__entries(maps));
-		map__put(pos->map);
-		free(pos);
+	if (ret == 0 && RC_CHK_ACCESS(map_a) != RC_CHK_ACCESS(map_b)) {
+		/* Ensure distinct but name equal maps have an order. */
+		return map__start_cmp(a, b);
 	}
+	return ret;
 }
 
-static void maps__exit(struct maps *maps)
+static int maps__sort_by_name(struct maps *maps)
 {
+	int err = 0;
 	down_write(maps__lock(maps));
-	__maps__purge(maps);
+	if (!maps__maps_by_name_sorted(maps)) {
+		struct map **maps_by_name = maps__maps_by_name(maps);
+
+		if (!maps_by_name) {
+			maps_by_name = malloc(RC_CHK_ACCESS(maps)->nr_maps_allocated *
+					sizeof(*maps_by_name));
+			if (!maps_by_name)
+				err = -ENOMEM;
+			else {
+				struct map **maps_by_address = maps__maps_by_address(maps);
+				unsigned int n = maps__nr_maps(maps);
+
+				maps__set_maps_by_name(maps, maps_by_name);
+				for (unsigned int i = 0; i < n; i++)
+					maps_by_name[i] = map__get(maps_by_address[i]);
+			}
+		}
+		if (!err) {
+			qsort(maps_by_name,
+				maps__nr_maps(maps),
+				sizeof(struct map *),
+				map__strcmp);
+			maps__set_maps_by_name_sorted(maps, true);
+		}
+	}
 	up_write(maps__lock(maps));
+	return err;
 }
 
-bool maps__empty(struct maps *maps)
+static unsigned int maps__by_address_index(const struct maps *maps, const struct map *map)
 {
-	return !maps__first(maps);
+	struct map **maps_by_address = maps__maps_by_address(maps);
+
+	if (maps__maps_by_address_sorted(maps)) {
+		struct map **mapp =
+			bsearch(&map, maps__maps_by_address(maps), maps__nr_maps(maps),
+				sizeof(*mapp), map__start_cmp);
+
+		if (mapp)
+			return mapp - maps_by_address;
+	} else {
+		for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
+			if (RC_CHK_ACCESS(maps_by_address[i]) == RC_CHK_ACCESS(map))
+				return i;
+		}
+	}
+	pr_err("Map missing from maps");
+	return -1;
 }
 
-struct maps *maps__new(struct machine *machine)
+static unsigned int maps__by_name_index(const struct maps *maps, const struct map *map)
 {
-	struct maps *result;
-	RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
+	struct map **maps_by_name = maps__maps_by_name(maps);
+
+	if (maps__maps_by_name_sorted(maps)) {
+		struct map **mapp =
+			bsearch(&map, maps_by_name, maps__nr_maps(maps),
+				sizeof(*mapp), map__strcmp);
+
+		if (mapp)
+			return mapp - maps_by_name;
+	} else {
+		for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
+			if (RC_CHK_ACCESS(maps_by_name[i]) == RC_CHK_ACCESS(map))
+				return i;
+		}
+	}
+	pr_err("Map missing from maps");
+	return -1;
+}
 
-	if (ADD_RC_CHK(result, maps))
-		maps__init(result, machine);
+static int __maps__insert(struct maps *maps, struct map *new)
+{
+	struct map **maps_by_address = maps__maps_by_address(maps);
+	struct map **maps_by_name = maps__maps_by_name(maps);
+	const struct dso *dso = map__dso(new);
+	unsigned int nr_maps = maps__nr_maps(maps);
+	unsigned int nr_allocate = RC_CHK_ACCESS(maps)->nr_maps_allocated;
+
+	if (nr_maps + 1 > nr_allocate) {
+		nr_allocate = !nr_allocate ? 32 : nr_allocate * 2;
+
+		maps_by_address = realloc(maps_by_address, nr_allocate * sizeof(new));
+		if (!maps_by_address)
+			return -ENOMEM;
+
+		maps__set_maps_by_address(maps, maps_by_address);
+		if (maps_by_name) {
+			maps_by_name = realloc(maps_by_name, nr_allocate * sizeof(new));
+			if (!maps_by_name) {
+				/*
+				 * If by name fails, just disable by name and it will
+				 * recompute next time it is required.
+				 */
+				__maps__free_maps_by_name(maps);
+			}
+			maps__set_maps_by_name(maps, maps_by_name);
+		}
+		RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
+	}
+	/* Insert the value at the end. */
+	maps_by_address[nr_maps] = map__get(new);
+	if (maps_by_name)
+		maps_by_name[nr_maps] = map__get(new);
 
-	return result;
+	nr_maps++;
+	RC_CHK_ACCESS(maps)->nr_maps = nr_maps;
+
+	/*
+	 * Recompute if things are sorted. If things are inserted in a sorted
+	 * manner, for example by processing /proc/pid/maps, then no
+	 * sorting/resorting will be necessary.
+	 */
+	if (nr_maps == 1) {
+		/* If there's just 1 entry then maps are sorted. */
+		maps__set_maps_by_address_sorted(maps, true);
+		maps__set_maps_by_name_sorted(maps, maps_by_name != NULL);
+	} else {
+		/* Sorted if maps were already sorted and this map starts after the last one. */
+		maps__set_maps_by_address_sorted(maps,
+			maps__maps_by_address_sorted(maps) &&
+			map__end(maps_by_address[nr_maps - 2]) <= map__start(new));
+		maps__set_maps_by_name_sorted(maps, false);
+	}
+	if (map__end(new) < map__start(new))
+		RC_CHK_ACCESS(maps)->ends_broken = true;
+	if (dso && dso->kernel) {
+		struct kmap *kmap = map__kmap(new);
+
+		if (kmap)
+			kmap->kmaps = maps;
+		else
+			pr_err("Internal error: kernel dso with non kernel map\n");
+	}
+	check_invariants(maps);
+	return 0;
 }
 
-static void maps__delete(struct maps *maps)
+int maps__insert(struct maps *maps, struct map *map)
 {
-	maps__exit(maps);
-	unwind__finish_access(maps);
-	RC_CHK_FREE(maps);
+	int ret;
+
+	down_write(maps__lock(maps));
+	ret = __maps__insert(maps, map);
+	up_write(maps__lock(maps));
+	return ret;
 }
 
-struct maps *maps__get(struct maps *maps)
+static void __maps__remove(struct maps *maps, struct map *map)
 {
-	struct maps *result;
+	struct map **maps_by_address = maps__maps_by_address(maps);
+	struct map **maps_by_name = maps__maps_by_name(maps);
+	unsigned int nr_maps = maps__nr_maps(maps);
+	unsigned int address_idx;
+
+	/* Slide later mappings over the one to remove */
+	address_idx = maps__by_address_index(maps, map);
+	map__put(maps_by_address[address_idx]);
+	memmove(&maps_by_address[address_idx],
+		&maps_by_address[address_idx + 1],
+		(nr_maps - address_idx - 1) * sizeof(*maps_by_address));
+
+	if (maps_by_name) {
+		unsigned int name_idx = maps__by_name_index(maps, map);
+
+		map__put(maps_by_name[name_idx]);
+		memmove(&maps_by_name[name_idx],
+			&maps_by_name[name_idx + 1],
+			(nr_maps - name_idx - 1) *  sizeof(*maps_by_name));
+	}
 
-	if (RC_CHK_GET(result, maps))
-		refcount_inc(maps__refcnt(maps));
+	--RC_CHK_ACCESS(maps)->nr_maps;
+	check_invariants(maps);
+}
 
-	return result;
+void maps__remove(struct maps *maps, struct map *map)
+{
+	down_write(maps__lock(maps));
+	__maps__remove(maps, map);
+	up_write(maps__lock(maps));
 }
 
-void maps__put(struct maps *maps)
+bool maps__empty(struct maps *maps)
 {
-	if (maps && refcount_dec_and_test(maps__refcnt(maps)))
-		maps__delete(maps);
-	else
-		RC_CHK_PUT(maps);
+	return maps__nr_maps(maps) == 0;
 }
 
 int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data), void *data)
 {
-	struct map_rb_node *pos;
+	bool done = false;
 	int ret = 0;
 
-	down_read(maps__lock(maps));
-	maps__for_each_entry(maps, pos)	{
-		ret = cb(pos->map, data);
-		if (ret)
-			break;
+	/* See locking/sorting note. */
+	while (!done) {
+		down_read(maps__lock(maps));
+		if (maps__maps_by_address_sorted(maps)) {
+			struct map **maps_by_address = maps__maps_by_address(maps);
+			unsigned int n = maps__nr_maps(maps);
+
+			for (unsigned int i = 0; i < n; i++) {
+				struct map *map = maps_by_address[i];
+
+				ret = cb(map, data);
+				if (ret)
+					break;
+			}
+			done = true;
+		}
+		up_read(maps__lock(maps));
+		if (!done)
+			maps__sort_by_address(maps);
 	}
-	up_read(maps__lock(maps));
 	return ret;
 }
 
 void maps__remove_maps(struct maps *maps, bool (*cb)(struct map *map, void *data), void *data)
 {
-	struct map_rb_node *pos, *next;
-	unsigned int start_nr_maps;
+	struct map **maps_by_address;
 
 	down_write(maps__lock(maps));
 
-	start_nr_maps = maps__nr_maps(maps);
-	maps__for_each_entry_safe(maps, pos, next)	{
-		if (cb(pos->map, data)) {
-			__maps__remove(maps, pos);
-			--RC_CHK_ACCESS(maps)->nr_maps;
-		}
+	maps_by_address = maps__maps_by_address(maps);
+	for (unsigned int i = 0; i < maps__nr_maps(maps);) {
+		if (cb(maps_by_address[i], data))
+			__maps__remove(maps, maps_by_address[i]);
+		else
+			i++;
 	}
-	if (maps__maps_by_name(maps) && start_nr_maps != maps__nr_maps(maps))
-		__maps__free_maps_by_name(maps);
-
 	up_write(maps__lock(maps));
 }
 
@@ -300,7 +491,7 @@ struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp)
 	/* Ensure map is loaded before using map->map_ip */
 	if (map != NULL && map__load(map) >= 0) {
 		if (mapp != NULL)
-			*mapp = map;
+			*mapp = map; // TODO: map_put on else path when find returns a get.
 		return map__find_symbol(map, map__map_ip(map, addr));
 	}
 
@@ -348,7 +539,7 @@ int maps__find_ams(struct maps *maps, struct addr_map_symbol *ams)
 	if (ams->addr < map__start(ams->ms.map) || ams->addr >= map__end(ams->ms.map)) {
 		if (maps == NULL)
 			return -1;
-		ams->ms.map = maps__find(maps, ams->addr);
+		ams->ms.map = maps__find(maps, ams->addr);  // TODO: map_get
 		if (ams->ms.map == NULL)
 			return -1;
 	}
@@ -393,24 +584,28 @@ size_t maps__fprintf(struct maps *maps, FILE *fp)
  * Find first map where end > map->start.
  * Same as find_vma() in kernel.
  */
-static struct rb_node *first_ending_after(struct maps *maps, const struct map *map)
+static unsigned int first_ending_after(struct maps *maps, const struct map *map)
 {
-	struct rb_root *root;
-	struct rb_node *next, *first;
+	struct map **maps_by_address = maps__maps_by_address(maps);
+	int low = 0, high = (int)maps__nr_maps(maps) - 1, first = high + 1;
+
+	assert(maps__maps_by_address_sorted(maps));
+	if (low <= high && map__end(maps_by_address[0]) > map__start(map))
+		return 0;
 
-	root = maps__entries(maps);
-	next = root->rb_node;
-	first = NULL;
-	while (next) {
-		struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
+	while (low <= high) {
+		int mid = (low + high) / 2;
+		struct map *pos = maps_by_address[mid];
 
-		if (map__end(pos->map) > map__start(map)) {
-			first = next;
-			if (map__start(pos->map) <= map__start(map))
+		if (map__end(pos) > map__start(map)) {
+			first = mid;
+			if (map__start(pos) <= map__start(map)) {
+				/* Entry overlaps map. */
 				break;
-			next = next->rb_left;
+			}
+			high = mid - 1;
 		} else
-			next = next->rb_right;
+			low = mid + 1;
 	}
 	return first;
 }
@@ -419,171 +614,249 @@ static struct rb_node *first_ending_after(struct maps *maps, const struct map *m
  * Adds new to maps, if new overlaps existing entries then the existing maps are
  * adjusted or removed so that new fits without overlapping any entries.
  */
-int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
+static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
 {
-
-	struct rb_node *next;
+	struct map **maps_by_address;
 	int err = 0;
 	FILE *fp = debug_file();
 
-	down_write(maps__lock(maps));
+sort_again:
+	if (!maps__maps_by_address_sorted(maps))
+		__maps__sort_by_address(maps);
 
-	next = first_ending_after(maps, new);
-	while (next && !err) {
-		struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
-		next = rb_next(&pos->rb_node);
+	maps_by_address = maps__maps_by_address(maps);
+	/*
+	 * Iterate through entries where the end of the existing entry is
+	 * greater than the new map's start.
+	 */
+	for (unsigned int i = first_ending_after(maps, new); i < maps__nr_maps(maps); ) {
+		struct map *pos = maps_by_address[i];
+		struct map *before = NULL, *after = NULL;
 
 		/*
 		 * Stop if current map starts after map->end.
 		 * Maps are ordered by start: next will not overlap for sure.
 		 */
-		if (map__start(pos->map) >= map__end(new))
+		if (map__start(pos) >= map__end(new))
 			break;
 
-		if (verbose >= 2) {
-
-			if (use_browser) {
-				pr_debug("overlapping maps in %s (disable tui for more info)\n",
-					 map__dso(new)->name);
-			} else {
-				pr_debug("overlapping maps:\n");
-				map__fprintf(new, fp);
-				map__fprintf(pos->map, fp);
-			}
+		if (use_browser) {
+			pr_debug("overlapping maps in %s (disable tui for more info)\n",
+				map__dso(new)->name);
+		} else if (verbose >= 2) {
+			pr_debug("overlapping maps:\n");
+			map__fprintf(new, fp);
+			map__fprintf(pos, fp);
 		}
 
-		rb_erase_init(&pos->rb_node, maps__entries(maps));
 		/*
 		 * Now check if we need to create new maps for areas not
 		 * overlapped by the new map:
 		 */
-		if (map__start(new) > map__start(pos->map)) {
-			struct map *before = map__clone(pos->map);
+		if (map__start(new) > map__start(pos)) {
+			/* Map starts within existing map. Need to shorten the existing map. */
+			before = map__clone(pos);
 
 			if (before == NULL) {
 				err = -ENOMEM;
-				goto put_map;
+				goto out_err;
 			}
-
 			map__set_end(before, map__start(new));
-			err = __maps__insert(maps, before);
-			if (err) {
-				map__put(before);
-				goto put_map;
-			}
 
 			if (verbose >= 2 && !use_browser)
 				map__fprintf(before, fp);
-			map__put(before);
 		}
-
-		if (map__end(new) < map__end(pos->map)) {
-			struct map *after = map__clone(pos->map);
+		if (map__end(new) < map__end(pos)) {
+			/* The new map isn't as long as the existing map. */
+			after = map__clone(pos);
 
 			if (after == NULL) {
+				map__zput(before);
 				err = -ENOMEM;
-				goto put_map;
+				goto out_err;
 			}
 
 			map__set_start(after, map__end(new));
-			map__add_pgoff(after, map__end(new) - map__start(pos->map));
-			assert(map__map_ip(pos->map, map__end(new)) ==
-				map__map_ip(after, map__end(new)));
-			err = __maps__insert(maps, after);
-			if (err) {
-				map__put(after);
-				goto put_map;
-			}
+			map__add_pgoff(after, map__end(new) - map__start(pos));
+			assert(map__map_ip(pos, map__end(new)) ==
+			       map__map_ip(after, map__end(new)));
+
 			if (verbose >= 2 && !use_browser)
 				map__fprintf(after, fp);
-			map__put(after);
 		}
-put_map:
-		map__put(pos->map);
-		free(pos);
+		/*
+		 * If adding one entry, for `before` or `after`, we can replace
+		 * the existing entry. If both `before` and `after` are
+		 * necessary then an insert is needed. If the new entry
+		 * entirely overlaps the existing entry it can just be removed.
+		 */
+		if (before) {
+			map__put(maps_by_address[i]);
+			maps_by_address[i] = before;
+			/* Maps are still ordered, go to next one. */
+			i++;
+			if (after) {
+				__maps__insert(maps, after);
+				map__put(after);
+				if (!maps__maps_by_address_sorted(maps)) {
+					/*
+					 * Sorting broken so invariants don't
+					 * hold, sort and go again.
+					 */
+					goto sort_again;
+				}
+				/*
+				 * Maps are still ordered, skip after and go to
+				 * next one (terminate loop).
+				 */
+				i++;
+			}
+		} else if (after) {
+			map__put(maps_by_address[i]);
+			maps_by_address[i] = after;
+			/* Maps are ordered, go to next one. */
+			i++;
+		} else {
+			__maps__remove(maps, pos);
+			/*
+			 * Maps are ordered but no need to increase `i` as the
+			 * later maps were moved down.
+			 */
+		}
+		check_invariants(maps);
 	}
 	/* Add the map. */
-	err = __maps__insert(maps, new);
-	up_write(maps__lock(maps));
+	__maps__insert(maps, new);
+out_err:
 	return err;
 }
 
-int maps__copy_from(struct maps *maps, struct maps *parent)
+int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
 {
 	int err;
-	struct map_rb_node *rb_node;
 
+	down_write(maps__lock(maps));
+	err = __maps__fixup_overlap_and_insert(maps, new);
+	up_write(maps__lock(maps));
+	return err;
+}
+
+int maps__copy_from(struct maps *dest, struct maps *parent)
+{
+	/* Note, if struct map were immutable then cloning could use ref counts. */
+	struct map **parent_maps_by_address;
+	int err = 0;
+	unsigned int n;
+
+	down_write(maps__lock(dest));
 	down_read(maps__lock(parent));
 
-	maps__for_each_entry(parent, rb_node) {
-		struct map *new = map__clone(rb_node->map);
+	parent_maps_by_address = maps__maps_by_address(parent);
+	n = maps__nr_maps(parent);
+	if (maps__empty(dest)) {
+		/* No existing mappings so just copy from parent to avoid reallocs in insert. */
+		unsigned int nr_maps_allocated = RC_CHK_ACCESS(parent)->nr_maps_allocated;
+		struct map **dest_maps_by_address =
+			malloc(nr_maps_allocated * sizeof(struct map *));
+		struct map **dest_maps_by_name = NULL;
 
-		if (new == NULL) {
+		if (!dest_maps_by_address)
 			err = -ENOMEM;
-			goto out_unlock;
+		else {
+			if (maps__maps_by_name(parent)) {
+				dest_maps_by_name =
+					malloc(nr_maps_allocated * sizeof(struct map *));
+			}
+
+			RC_CHK_ACCESS(dest)->maps_by_address = dest_maps_by_address;
+			RC_CHK_ACCESS(dest)->maps_by_name = dest_maps_by_name;
+			RC_CHK_ACCESS(dest)->nr_maps_allocated = nr_maps_allocated;
 		}
 
-		err = unwind__prepare_access(maps, new, NULL);
-		if (err)
-			goto out_unlock;
+		for (unsigned int i = 0; !err && i < n; i++) {
+			struct map *pos = parent_maps_by_address[i];
+			struct map *new = map__clone(pos);
 
-		err = maps__insert(maps, new);
-		if (err)
-			goto out_unlock;
+			if (!new)
+				err = -ENOMEM;
+			else {
+				err = unwind__prepare_access(dest, new, NULL);
+				if (!err) {
+					dest_maps_by_address[i] = new;
+					if (dest_maps_by_name)
+						dest_maps_by_name[i] = map__get(new);
+					RC_CHK_ACCESS(dest)->nr_maps = i + 1;
+				}
+			}
+			if (err)
+				map__put(new);
+		}
+		maps__set_maps_by_address_sorted(dest, maps__maps_by_address_sorted(parent));
+		if (!err) {
+			RC_CHK_ACCESS(dest)->last_search_by_name_idx =
+				RC_CHK_ACCESS(parent)->last_search_by_name_idx;
+			maps__set_maps_by_name_sorted(dest,
+						dest_maps_by_name &&
+						maps__maps_by_name_sorted(parent));
+		} else {
+			RC_CHK_ACCESS(dest)->last_search_by_name_idx = 0;
+			maps__set_maps_by_name_sorted(dest, false);
+		}
+	} else {
+		/* Unexpected: copying into a maps that already contains entries. */
+		for (unsigned int i = 0; !err && i < n; i++) {
+			struct map *pos = parent_maps_by_address[i];
+			struct map *new = map__clone(pos);
 
-		map__put(new);
+			if (!new)
+				err = -ENOMEM;
+			else {
+				err = unwind__prepare_access(dest, new, NULL);
+				if (!err)
+					err = maps__insert(dest, new);
+			}
+			map__put(new);
+		}
 	}
-
-	err = 0;
-out_unlock:
 	up_read(maps__lock(parent));
+	up_write(maps__lock(dest));
 	return err;
 }
 
-struct map *maps__find(struct maps *maps, u64 ip)
+static int map__addr_cmp(const void *key, const void *entry)
 {
-	struct rb_node *p;
-	struct map_rb_node *m;
-
+	const u64 ip = *(const u64 *)key;
+	const struct map *map = *(const struct map * const *)entry;
 
-	down_read(maps__lock(maps));
-
-	p = maps__entries(maps)->rb_node;
-	while (p != NULL) {
-		m = rb_entry(p, struct map_rb_node, rb_node);
-		if (ip < map__start(m->map))
-			p = p->rb_left;
-		else if (ip >= map__end(m->map))
-			p = p->rb_right;
-		else
-			goto out;
-	}
-
-	m = NULL;
-out:
-	up_read(maps__lock(maps));
-	return m ? m->map : NULL;
+	if (ip < map__start(map))
+		return -1;
+	if (ip >= map__end(map))
+		return 1;
+	return 0;
 }
 
-static int map__strcmp(const void *a, const void *b)
+struct map *maps__find(struct maps *maps, u64 ip)
 {
-	const struct map *map_a = *(const struct map **)a;
-	const struct map *map_b = *(const struct map **)b;
-	const struct dso *dso_a = map__dso(map_a);
-	const struct dso *dso_b = map__dso(map_b);
-	int ret = strcmp(dso_a->short_name, dso_b->short_name);
-
-	if (ret == 0 && map_a != map_b) {
-		/*
-		 * Ensure distinct but name equal maps have an order in part to
-		 * aid reference counting.
-		 */
-		ret = (int)map__start(map_a) - (int)map__start(map_b);
-		if (ret == 0)
-			ret = (int)((intptr_t)map_a - (intptr_t)map_b);
+	struct map *result = NULL;
+	bool done = false;
+
+	/* See locking/sorting note. */
+	while (!done) {
+		down_read(maps__lock(maps));
+		if (maps__maps_by_address_sorted(maps)) {
+			struct map **mapp =
+				bsearch(&ip, maps__maps_by_address(maps), maps__nr_maps(maps),
+					sizeof(*mapp), map__addr_cmp);
+
+			if (mapp)
+				result = *mapp; // map__get(*mapp);
+			done = true;
+		}
+		up_read(maps__lock(maps));
+		if (!done)
+			maps__sort_by_address(maps);
 	}
-
-	return ret;
+	return result;
 }
 
 static int map__strcmp_name(const void *name, const void *b)
@@ -593,126 +866,113 @@ static int map__strcmp_name(const void *name, const void *b)
 	return strcmp(name, dso->short_name);
 }
 
-void __maps__sort_by_name(struct maps *maps)
-{
-	qsort(maps__maps_by_name(maps), maps__nr_maps(maps), sizeof(struct map *), map__strcmp);
-}
-
-static int map__groups__sort_by_name_from_rbtree(struct maps *maps)
-{
-	struct map_rb_node *rb_node;
-	struct map **maps_by_name = realloc(maps__maps_by_name(maps),
-					    maps__nr_maps(maps) * sizeof(struct map *));
-	int i = 0;
-
-	if (maps_by_name == NULL)
-		return -1;
-
-	up_read(maps__lock(maps));
-	down_write(maps__lock(maps));
-
-	RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
-	RC_CHK_ACCESS(maps)->nr_maps_allocated = maps__nr_maps(maps);
-
-	maps__for_each_entry(maps, rb_node)
-		maps_by_name[i++] = map__get(rb_node->map);
-
-	__maps__sort_by_name(maps);
-
-	up_write(maps__lock(maps));
-	down_read(maps__lock(maps));
-
-	return 0;
-}
-
-static struct map *__maps__find_by_name(struct maps *maps, const char *name)
+struct map *maps__find_by_name(struct maps *maps, const char *name)
 {
-	struct map **mapp;
+	struct map *result = NULL;
+	bool done = false;
 
-	if (maps__maps_by_name(maps) == NULL &&
-	    map__groups__sort_by_name_from_rbtree(maps))
-		return NULL;
+	/* See locking/sorting note. */
+	while (!done) {
+		unsigned int i;
 
-	mapp = bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
-		       sizeof(*mapp), map__strcmp_name);
-	if (mapp)
-		return *mapp;
-	return NULL;
-}
+		down_read(maps__lock(maps));
 
-struct map *maps__find_by_name(struct maps *maps, const char *name)
-{
-	struct map_rb_node *rb_node;
-	struct map *map;
-
-	down_read(maps__lock(maps));
+		/* First check last found entry. */
+		i = RC_CHK_ACCESS(maps)->last_search_by_name_idx;
+		if (i < maps__nr_maps(maps) && maps__maps_by_name(maps)) {
+			struct dso *dso = map__dso(maps__maps_by_name(maps)[i]);
 
+			if (dso && strcmp(dso->short_name, name) == 0) {
+				result = maps__maps_by_name(maps)[i]; // TODO: map__get
+				done = true;
+			}
+		}
 
-	if (RC_CHK_ACCESS(maps)->last_search_by_name) {
-		const struct dso *dso = map__dso(RC_CHK_ACCESS(maps)->last_search_by_name);
+		/* Second search sorted array. */
+		if (!done && maps__maps_by_name_sorted(maps)) {
+			struct map **mapp =
+				bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
+					sizeof(*mapp), map__strcmp_name);
 
-		if (strcmp(dso->short_name, name) == 0) {
-			map = RC_CHK_ACCESS(maps)->last_search_by_name;
-			goto out_unlock;
+			if (mapp) {
+				result = *mapp; // TODO: map__get
+				i = mapp - maps__maps_by_name(maps);
+				RC_CHK_ACCESS(maps)->last_search_by_name_idx = i;
+			}
+			done = true;
 		}
-	}
-	/*
-	 * If we have maps->maps_by_name, then the name isn't in the rbtree,
-	 * as maps->maps_by_name mirrors the rbtree when lookups by name are
-	 * made.
-	 */
-	map = __maps__find_by_name(maps, name);
-	if (map || maps__maps_by_name(maps) != NULL)
-		goto out_unlock;
-
-	/* Fallback to traversing the rbtree... */
-	maps__for_each_entry(maps, rb_node) {
-		struct dso *dso;
-
-		map = rb_node->map;
-		dso = map__dso(map);
-		if (strcmp(dso->short_name, name) == 0) {
-			RC_CHK_ACCESS(maps)->last_search_by_name = map;
-			goto out_unlock;
+		up_read(maps__lock(maps));
+		if (!done) {
+			/* Sort and retry binary search. */
+			if (maps__sort_by_name(maps)) {
+				/*
+				 * Memory allocation failed, do a linear
+				 * search through the address-sorted maps.
+				 */
+				struct map **maps_by_address;
+				unsigned int n;
+
+				down_read(maps__lock(maps));
+				maps_by_address = maps__maps_by_address(maps);
+				n = maps__nr_maps(maps);
+				for (i = 0; i < n; i++) {
+					struct map *pos = maps_by_address[i];
+					struct dso *dso = map__dso(pos);
+
+					if (dso && strcmp(dso->short_name, name) == 0) {
+						result = pos; // TODO: map__get
+						break;
+					}
+				}
+				up_read(maps__lock(maps));
+				done = true;
+			}
 		}
 	}
-	map = NULL;
-
-out_unlock:
-	up_read(maps__lock(maps));
-	return map;
+	return result;
 }
 
 struct map *maps__find_next_entry(struct maps *maps, struct map *map)
 {
-	struct map_rb_node *rb_node = maps__find_node(maps, map);
-	struct map_rb_node *next = map_rb_node__next(rb_node);
+	unsigned int i;
+	struct map *result = NULL;
 
-	if (next)
-		return next->map;
+	down_read(maps__lock(maps));
+	i = maps__by_address_index(maps, map);
+	if (i < maps__nr_maps(maps))
+		result = maps__maps_by_address(maps)[i]; // TODO: map__get
 
-	return NULL;
+	up_read(maps__lock(maps));
+	return result;
 }
 
 void maps__fixup_end(struct maps *maps)
 {
-	struct map_rb_node *prev = NULL, *curr;
+	struct map **maps_by_address;
+	unsigned int n;
 
 	down_write(maps__lock(maps));
+	if (!maps__maps_by_address_sorted(maps))
+		__maps__sort_by_address(maps);
 
-	maps__for_each_entry(maps, curr) {
-		if (prev && (!map__end(prev->map) || map__end(prev->map) > map__start(curr->map)))
-			map__set_end(prev->map, map__start(curr->map));
+	maps_by_address = maps__maps_by_address(maps);
+	n = maps__nr_maps(maps);
+	for (unsigned int i = 1; i < n; i++) {
+		struct map *prev = maps_by_address[i - 1];
+		struct map *curr = maps_by_address[i];
 
-		prev = curr;
+		if (!map__end(prev) || map__end(prev) > map__start(curr))
+			map__set_end(prev, map__start(curr));
 	}
 
 	/*
 	 * We still haven't the actual symbols, so guess the
 	 * last map final address.
 	 */
-	if (curr && !map__end(curr->map))
-		map__set_end(curr->map, ~0ULL);
+	if (n > 0 && !map__end(maps_by_address[n - 1]))
+		map__set_end(maps_by_address[n - 1], ~0ULL);
+
+	RC_CHK_ACCESS(maps)->ends_broken = false;
 
 	up_write(maps__lock(maps));
 }
@@ -723,117 +983,92 @@ void maps__fixup_end(struct maps *maps)
  */
 int maps__merge_in(struct maps *kmaps, struct map *new_map)
 {
-	struct map_rb_node *rb_node;
-	struct rb_node *first;
-	bool overlaps;
-	LIST_HEAD(merged);
-	int err = 0;
+	unsigned int first_after_, kmaps__nr_maps;
+	struct map **kmaps_maps_by_address;
+	struct map **merged_maps_by_address;
+	unsigned int merged_nr_maps_allocated;
+
+	/* First try under a read lock. */
+	while (true) {
+		down_read(maps__lock(kmaps));
+		if (maps__maps_by_address_sorted(kmaps))
+			break;
 
-	down_read(maps__lock(kmaps));
-	first = first_ending_after(kmaps, new_map);
-	rb_node = first ? rb_entry(first, struct map_rb_node, rb_node) : NULL;
-	overlaps = rb_node && map__start(rb_node->map) < map__end(new_map);
-	up_read(maps__lock(kmaps));
+		up_read(maps__lock(kmaps));
+
+		/* first_ending_after()'s binary search requires sorted maps. Sort and try again. */
+		maps__sort_by_address(kmaps);
+	}
+	first_after_ = first_ending_after(kmaps, new_map);
+	kmaps_maps_by_address = maps__maps_by_address(kmaps);
 
-	if (!overlaps)
+	if (first_after_ >= maps__nr_maps(kmaps) ||
+	    map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
+		/* No overlap so regular insert suffices. */
+		up_read(maps__lock(kmaps));
 		return maps__insert(kmaps, new_map);
+	}
+	up_read(maps__lock(kmaps));
 
-	maps__for_each_entry(kmaps, rb_node) {
-		struct map *old_map = rb_node->map;
+	/* Plain insert with a read-lock failed, try again now with the write lock. */
+	down_write(maps__lock(kmaps));
+	if (!maps__maps_by_address_sorted(kmaps))
+		__maps__sort_by_address(kmaps);
 
-		/* no overload with this one */
-		if (map__end(new_map) < map__start(old_map) ||
-		    map__start(new_map) >= map__end(old_map))
-			continue;
+	first_after_ = first_ending_after(kmaps, new_map);
+	kmaps_maps_by_address = maps__maps_by_address(kmaps);
+	kmaps__nr_maps = maps__nr_maps(kmaps);
 
-		if (map__start(new_map) < map__start(old_map)) {
-			/*
-			 * |new......
-			 *       |old....
-			 */
-			if (map__end(new_map) < map__end(old_map)) {
-				/*
-				 * |new......|     -> |new..|
-				 *       |old....| ->       |old....|
-				 */
-				map__set_end(new_map, map__start(old_map));
-			} else {
-				/*
-				 * |new.............| -> |new..|       |new..|
-				 *       |old....|    ->       |old....|
-				 */
-				struct map_list_node *m = map_list_node__new();
+	if (first_after_ >= kmaps__nr_maps ||
+	    map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
+		/* No overlap so regular insert suffices. */
+		up_write(maps__lock(kmaps));
+		return maps__insert(kmaps, new_map);
+	}
+	/* Array to merge into, possibly 1 more for the sake of new_map. */
+	merged_nr_maps_allocated = RC_CHK_ACCESS(kmaps)->nr_maps_allocated;
+	if (kmaps__nr_maps + 1 == merged_nr_maps_allocated)
+		merged_nr_maps_allocated++;
+
+	merged_maps_by_address = malloc(merged_nr_maps_allocated * sizeof(*merged_maps_by_address));
+	if (!merged_maps_by_address) {
+		up_write(maps__lock(kmaps));
+		return -ENOMEM;
+	}
+	RC_CHK_ACCESS(kmaps)->maps_by_address = merged_maps_by_address;
+	RC_CHK_ACCESS(kmaps)->maps_by_address_sorted = true;
+	zfree(&RC_CHK_ACCESS(kmaps)->maps_by_name);
+	RC_CHK_ACCESS(kmaps)->maps_by_name_sorted = false;
+	RC_CHK_ACCESS(kmaps)->nr_maps_allocated = merged_nr_maps_allocated;
 
-				if (!m) {
-					err = -ENOMEM;
-					goto out;
-				}
+	/* Copy entries before the new_map that can't overlap. */
+	for (unsigned int i = 0; i < first_after_; i++)
+		merged_maps_by_address[i] = map__get(kmaps_maps_by_address[i]);
 
-				m->map = map__clone(new_map);
-				if (!m->map) {
-					free(m);
-					err = -ENOMEM;
-					goto out;
-				}
+	RC_CHK_ACCESS(kmaps)->nr_maps = first_after_;
 
-				map__set_end(m->map, map__start(old_map));
-				list_add_tail(&m->node, &merged);
-				map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
-				map__set_start(new_map, map__end(old_map));
-			}
-		} else {
-			/*
-			 *      |new......
-			 * |old....
-			 */
-			if (map__end(new_map) < map__end(old_map)) {
-				/*
-				 *      |new..|   -> x
-				 * |old.........| -> |old.........|
-				 */
-				map__put(new_map);
-				new_map = NULL;
-				break;
-			} else {
-				/*
-				 *      |new......| ->         |new...|
-				 * |old....|        -> |old....|
-				 */
-				map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
-				map__set_start(new_map, map__end(old_map));
-			}
-		}
-	}
+	/* Add the new map, it will be split when the later overlapping mappings are added. */
+	__maps__insert(kmaps, new_map);
 
-out:
-	while (!list_empty(&merged)) {
-		struct map_list_node *old_node;
+	/* Insert mappings after new_map, splitting new_map in the process. */
+	for (unsigned int i = first_after_; i < kmaps__nr_maps; i++)
+		__maps__fixup_overlap_and_insert(kmaps, kmaps_maps_by_address[i]);
 
-		old_node = list_entry(merged.next, struct map_list_node, node);
-		list_del_init(&old_node->node);
-		if (!err)
-			err = maps__insert(kmaps, old_node->map);
-		map__put(old_node->map);
-		free(old_node);
-	}
+	/* Release the references to the old maps. */
+	for (unsigned int i = 0; i < kmaps__nr_maps; i++)
+		map__zput(kmaps_maps_by_address[i]);
 
-	if (new_map) {
-		if (!err)
-			err = maps__insert(kmaps, new_map);
-		map__put(new_map);
-	}
-	return err;
+	free(kmaps_maps_by_address);
+	up_write(maps__lock(kmaps));
+	return 0;
 }
 
 void maps__load_first(struct maps *maps)
 {
-	struct map_rb_node *first;
-
 	down_read(maps__lock(maps));
 
-	first = maps__first(maps);
-	if (first)
-		map__load(first->map);
+	if (maps__nr_maps(maps) > 0)
+		map__load(maps__maps_by_address(maps)[0]);
 
 	up_read(maps__lock(maps));
 }
diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
index d836d04c9402..df9dd5a0e3c0 100644
--- a/tools/perf/util/maps.h
+++ b/tools/perf/util/maps.h
@@ -25,21 +25,56 @@ static inline struct map_list_node *map_list_node__new(void)
 	return malloc(sizeof(struct map_list_node));
 }
 
-struct map *maps__find(struct maps *maps, u64 addr);
+/*
+ * Locking/sorting note:
+ *
+ * Sorting is done under the write lock; iteration and binary searching happen
+ * under the read lock and require the maps to be sorted. There is a race
+ * between sorting releasing the write lock and acquiring the read lock for
+ * iteration/searching, where another thread could insert and break the sorting
+ * of the maps. In practice inserting maps should be rare, meaning that the race
+ * shouldn't lead to livelock. Removal of maps doesn't break being sorted.
+ */
 
 DECLARE_RC_STRUCT(maps) {
-	struct rb_root      entries;
 	struct rw_semaphore lock;
-	struct machine	 *machine;
-	struct map	 *last_search_by_name;
+	/**
+	 * @maps_by_address: array of maps sorted by their starting address if
+	 * maps_by_address_sorted is true.
+	 */
+	struct map	 **maps_by_address;
+	/**
+	 * @maps_by_name: optional array of maps sorted by their dso name if
+	 * maps_by_name_sorted is true.
+	 */
 	struct map	 **maps_by_name;
-	refcount_t	 refcnt;
-	unsigned int	 nr_maps;
-	unsigned int	 nr_maps_allocated;
+	struct machine	 *machine;
 #ifdef HAVE_LIBUNWIND_SUPPORT
-	void				*addr_space;
+	void		*addr_space;
 	const struct unwind_libunwind_ops *unwind_libunwind_ops;
 #endif
+	refcount_t	 refcnt;
+	/**
+	 * @nr_maps: number of maps_by_address, and possibly maps_by_name,
+	 * entries that contain maps.
+	 */
+	unsigned int	 nr_maps;
+	/**
+	 * @nr_maps_allocated: number of entries in maps_by_address and possibly
+	 * maps_by_name.
+	 */
+	unsigned int	 nr_maps_allocated;
+	/**
+	 * @last_search_by_name_idx: cache of last found by name entry's index
+	 * as frequent searches for the same dso name are common.
+	 */
+	unsigned int	 last_search_by_name_idx;
+	/** @maps_by_address_sorted: is maps_by_address sorted. */
+	bool		 maps_by_address_sorted;
+	/** @maps_by_name_sorted: is maps_by_name sorted. */
+	bool		 maps_by_name_sorted;
+	/** @ends_broken: does maps_by_address contain a map whose end value is unset/unsorted? */
+	bool		 ends_broken;
 };
 
 #define KMAP_NAME_LEN 256
@@ -102,6 +137,7 @@ size_t maps__fprintf(struct maps *maps, FILE *fp);
 int maps__insert(struct maps *maps, struct map *map);
 void maps__remove(struct maps *maps, struct map *map);
 
+struct map *maps__find(struct maps *maps, u64 addr);
 struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp);
 struct symbol *maps__find_symbol_by_name(struct maps *maps, const char *name, struct map **mapp);
 
@@ -117,8 +153,6 @@ struct map *maps__find_next_entry(struct maps *maps, struct map *map);
 
 int maps__merge_in(struct maps *kmaps, struct map *new_map);
 
-void __maps__sort_by_name(struct maps *maps);
-
 void maps__fixup_end(struct maps *maps);
 
 void maps__load_first(struct maps *maps);
-- 
2.43.0.472.g3155946c3a-goog
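
The first_ending_after() binary search introduced by this patch can be sketched
as a standalone program. This is an illustration only, not perf code: plain
longs and a struct range stand in for perf's struct map, and it assumes, as the
real code does, that the existing entries are sorted by start and don't overlap
each other.

```c
#include <assert.h>

struct range { long start, end; };	/* stand-in for struct map */

/*
 * On an array of [start, end) ranges sorted by start, return the index
 * of the first entry whose end is greater than q_start, or n if none.
 * Mirrors first_ending_after(): when an entry overlapping q_start is
 * found, the search can stop early.
 */
static unsigned int first_ending_after(const struct range *r, unsigned int n,
				       long q_start)
{
	int low = 0, high = (int)n - 1, first = (int)n;

	while (low <= high) {
		int mid = (low + high) / 2;

		if (r[mid].end > q_start) {
			first = mid;
			if (r[mid].start <= q_start)
				break;	/* this entry overlaps the query */
			high = mid - 1;
		} else {
			low = mid + 1;
		}
	}
	return (unsigned int)first;
}
```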


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v7 02/25] perf maps: Get map before returning in maps__find
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 03/25] perf maps: Get map before returning in maps__find_by_name Ian Rogers
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Finding a map is done under a lock; returning the map without holding
a reference count means it can be removed without notice, causing
use-after-free bugs. Grab a reference to the map within the lock
region and return it. Fix up the locations that need a map__put
following this change.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/arch/x86/tests/dwarf-unwind.c |  1 +
 tools/perf/tests/vmlinux-kallsyms.c      |  5 ++---
 tools/perf/util/bpf-event.c              |  1 +
 tools/perf/util/event.c                  |  4 ++--
 tools/perf/util/machine.c                | 22 ++++++++--------------
 tools/perf/util/maps.c                   | 17 ++++++++++-------
 tools/perf/util/symbol.c                 |  3 ++-
 7 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/tools/perf/arch/x86/tests/dwarf-unwind.c b/tools/perf/arch/x86/tests/dwarf-unwind.c
index 5bfec3345d59..c05c0a85dad4 100644
--- a/tools/perf/arch/x86/tests/dwarf-unwind.c
+++ b/tools/perf/arch/x86/tests/dwarf-unwind.c
@@ -34,6 +34,7 @@ static int sample_ustack(struct perf_sample *sample,
 	}
 
 	stack_size = map__end(map) - sp;
+	map__put(map);
 	stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
 
 	memcpy(buf, (void *) sp, stack_size);
diff --git a/tools/perf/tests/vmlinux-kallsyms.c b/tools/perf/tests/vmlinux-kallsyms.c
index 822f893e67d5..e808e6fc8f76 100644
--- a/tools/perf/tests/vmlinux-kallsyms.c
+++ b/tools/perf/tests/vmlinux-kallsyms.c
@@ -151,10 +151,8 @@ static int test__vmlinux_matches_kallsyms_cb2(struct map *map, void *data)
 	u64 mem_end = map__unmap_ip(args->vmlinux_map, map__end(map));
 
 	pair = maps__find(args->kallsyms.kmaps, mem_start);
-	if (pair == NULL || map__priv(pair))
-		return 0;
 
-	if (map__start(pair) == mem_start) {
+	if (pair != NULL && !map__priv(pair) && map__start(pair) == mem_start) {
 		struct dso *dso = map__dso(map);
 
 		if (!args->header_printed) {
@@ -170,6 +168,7 @@ static int test__vmlinux_matches_kallsyms_cb2(struct map *map, void *data)
 		pr_info(" %s\n", dso->name);
 		map__set_priv(pair, 1);
 	}
+	map__put(pair);
 	return 0;
 }
 
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 830711cae30d..d07fd5ffa823 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -63,6 +63,7 @@ static int machine__process_bpf_event_load(struct machine *machine,
 			dso->bpf_prog.id = id;
 			dso->bpf_prog.sub_id = i;
 			dso->bpf_prog.env = env;
+			map__put(map);
 		}
 	}
 	return 0;
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 68f45e9e63b6..198903157f9e 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -511,7 +511,7 @@ size_t perf_event__fprintf_text_poke(union perf_event *event, struct machine *ma
 		struct addr_location al;
 
 		addr_location__init(&al);
-		al.map = map__get(maps__find(machine__kernel_maps(machine), tp->addr));
+		al.map = maps__find(machine__kernel_maps(machine), tp->addr);
 		if (al.map && map__load(al.map) >= 0) {
 			al.addr = map__map_ip(al.map, tp->addr);
 			al.sym = map__find_symbol(al.map, al.addr);
@@ -641,7 +641,7 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
 		return NULL;
 	}
 	al->maps = maps__get(maps);
-	al->map = map__get(maps__find(maps, al->addr));
+	al->map = maps__find(maps, al->addr);
 	if (al->map != NULL) {
 		/*
 		 * Kernel maps might be changed when loading symbols so loading
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index b397a769006f..e8eb9f0b073f 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -896,7 +896,6 @@ static int machine__process_ksymbol_register(struct machine *machine,
 	struct symbol *sym;
 	struct dso *dso;
 	struct map *map = maps__find(machine__kernel_maps(machine), event->ksymbol.addr);
-	bool put_map = false;
 	int err = 0;
 
 	if (!map) {
@@ -913,12 +912,6 @@ static int machine__process_ksymbol_register(struct machine *machine,
 			err = -ENOMEM;
 			goto out;
 		}
-		/*
-		 * The inserted map has a get on it, we need to put to release
-		 * the reference count here, but do it after all accesses are
-		 * done.
-		 */
-		put_map = true;
 		if (event->ksymbol.ksym_type == PERF_RECORD_KSYMBOL_TYPE_OOL) {
 			dso->binary_type = DSO_BINARY_TYPE__OOL;
 			dso->data.file_size = event->ksymbol.len;
@@ -952,8 +945,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
 	}
 	dso__insert_symbol(dso, sym);
 out:
-	if (put_map)
-		map__put(map);
+	map__put(map);
 	return err;
 }
 
@@ -977,7 +969,7 @@ static int machine__process_ksymbol_unregister(struct machine *machine,
 		if (sym)
 			dso__delete_symbol(dso, sym);
 	}
-
+	map__put(map);
 	return 0;
 }
 
@@ -1005,11 +997,11 @@ int machine__process_text_poke(struct machine *machine, union perf_event *event,
 		perf_event__fprintf_text_poke(event, machine, stdout);
 
 	if (!event->text_poke.new_len)
-		return 0;
+		goto out;
 
 	if (cpumode != PERF_RECORD_MISC_KERNEL) {
 		pr_debug("%s: unsupported cpumode - ignoring\n", __func__);
-		return 0;
+		goto out;
 	}
 
 	if (dso) {
@@ -1032,7 +1024,8 @@ int machine__process_text_poke(struct machine *machine, union perf_event *event,
 		pr_debug("Failed to find kernel text poke address map for %#" PRI_lx64 "\n",
 			 event->text_poke.addr);
 	}
-
+out:
+	map__put(map);
 	return 0;
 }
 
@@ -1300,9 +1293,10 @@ static int machine__map_x86_64_entry_trampolines_cb(struct map *map, void *data)
 		return 0;
 
 	dest_map = maps__find(args->kmaps, map__pgoff(map));
-	if (dest_map != map)
+	if (RC_CHK_ACCESS(dest_map) != RC_CHK_ACCESS(map))
 		map__set_pgoff(map, map__map_ip(dest_map, map__pgoff(map)));
 
+	map__put(dest_map);
 	args->found = true;
 	return 0;
 }
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index 6ee81160cdab..17aa894721a7 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -487,15 +487,18 @@ void maps__remove_maps(struct maps *maps, bool (*cb)(struct map *map, void *data
 struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp)
 {
 	struct map *map = maps__find(maps, addr);
+	struct symbol *result = NULL;
 
 	/* Ensure map is loaded before using map->map_ip */
 	if (map != NULL && map__load(map) >= 0) {
-		if (mapp != NULL)
-			*mapp = map; // TODO: map_put on else path when find returns a get.
-		return map__find_symbol(map, map__map_ip(map, addr));
-	}
+		if (mapp)
+			*mapp = map;
 
-	return NULL;
+		result = map__find_symbol(map, map__map_ip(map, addr));
+		if (!mapp)
+			map__put(map);
+	}
+	return result;
 }
 
 struct maps__find_symbol_by_name_args {
@@ -539,7 +542,7 @@ int maps__find_ams(struct maps *maps, struct addr_map_symbol *ams)
 	if (ams->addr < map__start(ams->ms.map) || ams->addr >= map__end(ams->ms.map)) {
 		if (maps == NULL)
 			return -1;
-		ams->ms.map = maps__find(maps, ams->addr);  // TODO: map_get
+		ams->ms.map = maps__find(maps, ams->addr);
 		if (ams->ms.map == NULL)
 			return -1;
 	}
@@ -849,7 +852,7 @@ struct map *maps__find(struct maps *maps, u64 ip)
 					sizeof(*mapp), map__addr_cmp);
 
 			if (mapp)
-				result = *mapp; // map__get(*mapp);
+				result = map__get(*mapp);
 			done = true;
 		}
 		up_read(maps__lock(maps));
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index be212ba157dc..1710b89e207c 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -757,7 +757,6 @@ static int dso__load_all_kallsyms(struct dso *dso, const char *filename)
 
 static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
 {
-	struct map *curr_map;
 	struct symbol *pos;
 	int count = 0;
 	struct rb_root_cached old_root = dso->symbols;
@@ -770,6 +769,7 @@ static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
 	*root = RB_ROOT_CACHED;
 
 	while (next) {
+		struct map *curr_map;
 		struct dso *curr_map_dso;
 		char *module;
 
@@ -796,6 +796,7 @@ static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
 			pos->end -= map__start(curr_map) - map__pgoff(curr_map);
 		symbols__insert(&curr_map_dso->symbols, pos);
 		++count;
+		map__put(curr_map);
 	}
 
 	/* Symbols have been adjusted */
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v7 03/25] perf maps: Get map before returning in maps__find_by_name
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 02/25] perf maps: Get map before returning in maps__find Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 04/25] perf maps: Get map before returning in maps__find_next_entry Ian Rogers
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Finding a map is done under a lock; returning the map without a
reference count means it can be removed without notice, causing
use-after-free bugs. Grab a reference count on the map within the
lock region and return that. Fix up the locations that need a
map__put following this. Also fix some reference-counted pointer
comparisons.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/vmlinux-kallsyms.c |  5 +++--
 tools/perf/util/machine.c           |  6 ++++--
 tools/perf/util/maps.c              |  6 +++---
 tools/perf/util/probe-event.c       |  1 +
 tools/perf/util/symbol-elf.c        |  4 +++-
 tools/perf/util/symbol.c            | 18 +++++++++++-------
 6 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/tools/perf/tests/vmlinux-kallsyms.c b/tools/perf/tests/vmlinux-kallsyms.c
index e808e6fc8f76..fecbf851bb2e 100644
--- a/tools/perf/tests/vmlinux-kallsyms.c
+++ b/tools/perf/tests/vmlinux-kallsyms.c
@@ -131,9 +131,10 @@ static int test__vmlinux_matches_kallsyms_cb1(struct map *map, void *data)
 	struct map *pair = maps__find_by_name(args->kallsyms.kmaps,
 					(dso->kernel ? dso->short_name : dso->name));
 
-	if (pair)
+	if (pair) {
 		map__set_priv(pair, 1);
-	else {
+		map__put(pair);
+	} else {
 		if (!args->header_printed) {
 			pr_info("WARN: Maps only in vmlinux:\n");
 			args->header_printed = true;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e8eb9f0b073f..7031f6fddcae 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1537,8 +1537,10 @@ static int maps__set_module_path(struct maps *maps, const char *path, struct kmo
 		return 0;
 
 	long_name = strdup(path);
-	if (long_name == NULL)
+	if (long_name == NULL) {
+		map__put(map);
 		return -ENOMEM;
+	}
 
 	dso = map__dso(map);
 	dso__set_long_name(dso, long_name, true);
@@ -1552,7 +1554,7 @@ static int maps__set_module_path(struct maps *maps, const char *path, struct kmo
 		dso->symtab_type++;
 		dso->comp = m->comp;
 	}
-
+	map__put(map);
 	return 0;
 }
 
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index 17aa894721a7..b85147cc8723 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -886,7 +886,7 @@ struct map *maps__find_by_name(struct maps *maps, const char *name)
 			struct dso *dso = map__dso(maps__maps_by_name(maps)[i]);
 
 			if (dso && strcmp(dso->short_name, name) == 0) {
-				result = maps__maps_by_name(maps)[i]; // TODO: map__get
+				result = map__get(maps__maps_by_name(maps)[i]);
 				done = true;
 			}
 		}
@@ -898,7 +898,7 @@ struct map *maps__find_by_name(struct maps *maps, const char *name)
 					sizeof(*mapp), map__strcmp_name);
 
 			if (mapp) {
-				result = *mapp; // TODO: map__get
+				result = map__get(*mapp);
 				i = mapp - maps__maps_by_name(maps);
 				RC_CHK_ACCESS(maps)->last_search_by_name_idx = i;
 			}
@@ -923,7 +923,7 @@ struct map *maps__find_by_name(struct maps *maps, const char *name)
 					struct dso *dso = map__dso(pos);
 
 					if (dso && strcmp(dso->short_name, name) == 0) {
-						result = pos; // TODO: map__get
+						result = map__get(pos);
 						break;
 					}
 				}
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index a1a796043691..be71abe8b9b0 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -358,6 +358,7 @@ static int kernel_get_module_dso(const char *module, struct dso **pdso)
 		map = maps__find_by_name(machine__kernel_maps(host_machine), module_name);
 		if (map) {
 			dso = map__dso(map);
+			map__put(map);
 			goto found;
 		}
 		pr_debug("Failed to find module %s.\n", module);
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b934ed3bfd1..5990e3fabdb5 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1470,8 +1470,10 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 		dso__set_loaded(curr_dso);
 		*curr_mapp = curr_map;
 		*curr_dsop = curr_dso;
-	} else
+	} else {
 		*curr_dsop = map__dso(curr_map);
+		map__put(curr_map);
+	}
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 1710b89e207c..0785a54e832e 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -814,7 +814,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 				struct map *initial_map)
 {
 	struct machine *machine;
-	struct map *curr_map = initial_map;
+	struct map *curr_map = map__get(initial_map);
 	struct symbol *pos;
 	int count = 0, moved = 0;
 	struct rb_root_cached *root = &dso->symbols;
@@ -858,13 +858,14 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 					dso__set_loaded(curr_map_dso);
 				}
 
+				map__zput(curr_map);
 				curr_map = maps__find_by_name(kmaps, module);
 				if (curr_map == NULL) {
 					pr_debug("%s/proc/{kallsyms,modules} "
 					         "inconsistency while looking "
 						 "for \"%s\" module!\n",
 						 machine->root_dir, module);
-					curr_map = initial_map;
+					curr_map = map__get(initial_map);
 					goto discard_symbol;
 				}
 				curr_map_dso = map__dso(curr_map);
@@ -888,7 +889,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 			 * symbols at this point.
 			 */
 			goto discard_symbol;
-		} else if (curr_map != initial_map) {
+		} else if (!RC_CHK_EQUAL(curr_map, initial_map)) {
 			char dso_name[PATH_MAX];
 			struct dso *ndso;
 
@@ -899,7 +900,8 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 			}
 
 			if (count == 0) {
-				curr_map = initial_map;
+				map__zput(curr_map);
+				curr_map = map__get(initial_map);
 				goto add_symbol;
 			}
 
@@ -913,6 +915,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 					kernel_range++);
 
 			ndso = dso__new(dso_name);
+			map__zput(curr_map);
 			if (ndso == NULL)
 				return -1;
 
@@ -926,6 +929,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 
 			map__set_mapping_type(curr_map, MAPPING_TYPE__IDENTITY);
 			if (maps__insert(kmaps, curr_map)) {
+				map__zput(curr_map);
 				dso__put(ndso);
 				return -1;
 			}
@@ -936,7 +940,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 			pos->end -= delta;
 		}
 add_symbol:
-		if (curr_map != initial_map) {
+		if (!RC_CHK_EQUAL(curr_map, initial_map)) {
 			struct dso *curr_map_dso = map__dso(curr_map);
 
 			rb_erase_cached(&pos->rb_node, root);
@@ -951,12 +955,12 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 		symbol__delete(pos);
 	}
 
-	if (curr_map != initial_map &&
+	if (!RC_CHK_EQUAL(curr_map, initial_map) &&
 	    dso->kernel == DSO_SPACE__KERNEL_GUEST &&
 	    machine__is_default_guest(maps__machine(kmaps))) {
 		dso__set_loaded(map__dso(curr_map));
 	}
-
+	map__put(curr_map);
 	return count + moved;
 }
 
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 04/25] perf maps: Get map before returning in maps__find_next_entry
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (2 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 03/25] perf maps: Get map before returning in maps__find_by_name Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 05/25] perf maps: Hide maps internals Ian Rogers
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Finding a map is done under a lock; returning the map without a
reference count means it can be removed without notice, causing
use-after-free bugs. Grab a reference count on the map within the
lock region and return that. Fix up the locations that need a
map__put following this.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/machine.c | 4 +++-
 tools/perf/util/maps.c    | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 7031f6fddcae..4911734411b5 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1761,8 +1761,10 @@ int machine__create_kernel_maps(struct machine *machine)
 		struct map *next = maps__find_next_entry(machine__kernel_maps(machine),
 							 machine__kernel_map(machine));
 
-		if (next)
+		if (next) {
 			machine__set_kernel_mmap(machine, start, map__start(next));
+			map__put(next);
+		}
 	}
 
 out_put:
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index b85147cc8723..0438c417ee44 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -943,7 +943,7 @@ struct map *maps__find_next_entry(struct maps *maps, struct map *map)
 	down_read(maps__lock(maps));
 	i = maps__by_address_index(maps, map);
 	if (i < maps__nr_maps(maps))
-		result = maps__maps_by_address(maps)[i]; // TODO: map__get
+		result = map__get(maps__maps_by_address(maps)[i]);
 
 	up_read(maps__lock(maps));
 	return result;
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 05/25] perf maps: Hide maps internals
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (3 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 04/25] perf maps: Get map before returning in maps__find_next_entry Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 06/25] perf maps: Locking tidy up of nr_maps Ian Rogers
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Move the struct into the C file. Add maps__equal so callers no
longer need the struct definition for reference-count-checked
pointer comparisons. Add accessors for unwind_libunwind_ops. Move
map_list_node to its only use in symbol.c.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/thread-maps-share.c     |  8 +-
 tools/perf/util/callchain.c              |  2 +-
 tools/perf/util/maps.c                   | 96 +++++++++++++++++++++++
 tools/perf/util/maps.h                   | 97 +++---------------------
 tools/perf/util/symbol.c                 | 10 +++
 tools/perf/util/thread.c                 |  2 +-
 tools/perf/util/unwind-libunwind-local.c |  2 +-
 tools/perf/util/unwind-libunwind.c       |  7 +-
 8 files changed, 123 insertions(+), 101 deletions(-)

diff --git a/tools/perf/tests/thread-maps-share.c b/tools/perf/tests/thread-maps-share.c
index 7fa6f7c568e2..e9ecd30a5c05 100644
--- a/tools/perf/tests/thread-maps-share.c
+++ b/tools/perf/tests/thread-maps-share.c
@@ -46,9 +46,9 @@ static int test__thread_maps_share(struct test_suite *test __maybe_unused, int s
 	TEST_ASSERT_EQUAL("wrong refcnt", refcount_read(maps__refcnt(maps)), 4);
 
 	/* test the maps pointer is shared */
-	TEST_ASSERT_VAL("maps don't match", RC_CHK_EQUAL(maps, thread__maps(t1)));
-	TEST_ASSERT_VAL("maps don't match", RC_CHK_EQUAL(maps, thread__maps(t2)));
-	TEST_ASSERT_VAL("maps don't match", RC_CHK_EQUAL(maps, thread__maps(t3)));
+	TEST_ASSERT_VAL("maps don't match", maps__equal(maps, thread__maps(t1)));
+	TEST_ASSERT_VAL("maps don't match", maps__equal(maps, thread__maps(t2)));
+	TEST_ASSERT_VAL("maps don't match", maps__equal(maps, thread__maps(t3)));
 
 	/*
 	 * Verify the other leader was created by previous call.
@@ -73,7 +73,7 @@ static int test__thread_maps_share(struct test_suite *test __maybe_unused, int s
 	other_maps = thread__maps(other);
 	TEST_ASSERT_EQUAL("wrong refcnt", refcount_read(maps__refcnt(other_maps)), 2);
 
-	TEST_ASSERT_VAL("maps don't match", RC_CHK_EQUAL(other_maps, thread__maps(other_leader)));
+	TEST_ASSERT_VAL("maps don't match", maps__equal(other_maps, thread__maps(other_leader)));
 
 	/* release thread group */
 	thread__put(t3);
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 8262f69118db..7517d16c02ec 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1157,7 +1157,7 @@ int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *
 		if (al->map == NULL)
 			goto out;
 	}
-	if (RC_CHK_EQUAL(al->maps, machine__kernel_maps(machine))) {
+	if (maps__equal(al->maps, machine__kernel_maps(machine))) {
 		if (machine__is_host(machine)) {
 			al->cpumode = PERF_RECORD_MISC_KERNEL;
 			al->level = 'k';
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index 0438c417ee44..c08e412a4313 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -6,9 +6,63 @@
 #include "dso.h"
 #include "map.h"
 #include "maps.h"
+#include "rwsem.h"
 #include "thread.h"
 #include "ui/ui.h"
 #include "unwind.h"
+#include <internal/rc_check.h>
+
+/*
+ * Locking/sorting note:
+ *
+ * Sorting is done with the write lock, iteration and binary searching happens
+ * under the read lock requiring being sorted. There is a race between sorting
+ * releasing the write lock and acquiring the read lock for iteration/searching
+ * where another thread could insert and break the sorting of the maps. In
+ * practice inserting maps should be rare meaning that the race shouldn't lead
+ * to live lock. Removal of maps doesn't break being sorted.
+ */
+
+DECLARE_RC_STRUCT(maps) {
+	struct rw_semaphore lock;
+	/**
+	 * @maps_by_address: array of maps sorted by their starting address if
+	 * maps_by_address_sorted is true.
+	 */
+	struct map	 **maps_by_address;
+	/**
+	 * @maps_by_name: optional array of maps sorted by their dso name if
+	 * maps_by_name_sorted is true.
+	 */
+	struct map	 **maps_by_name;
+	struct machine	 *machine;
+#ifdef HAVE_LIBUNWIND_SUPPORT
+	void		*addr_space;
+	const struct unwind_libunwind_ops *unwind_libunwind_ops;
+#endif
+	refcount_t	 refcnt;
+	/**
+	 * @nr_maps: number of maps_by_address, and possibly maps_by_name,
+	 * entries that contain maps.
+	 */
+	unsigned int	 nr_maps;
+	/**
+	 * @nr_maps_allocated: number of entries in maps_by_address and possibly
+	 * maps_by_name.
+	 */
+	unsigned int	 nr_maps_allocated;
+	/**
+	 * @last_search_by_name_idx: cache of last found by name entry's index
+	 * as frequent searches for the same dso name are common.
+	 */
+	unsigned int	 last_search_by_name_idx;
+	/** @maps_by_address_sorted: is maps_by_address sorted. */
+	bool		 maps_by_address_sorted;
+	/** @maps_by_name_sorted: is maps_by_name sorted. */
+	bool		 maps_by_name_sorted;
+	/** @ends_broken: does the map contain a map where end values are unset/unsorted? */
+	bool		 ends_broken;
+};
 
 static void check_invariants(const struct maps *maps __maybe_unused)
 {
@@ -103,6 +157,43 @@ static void maps__set_maps_by_name_sorted(struct maps *maps, bool value)
 	RC_CHK_ACCESS(maps)->maps_by_name_sorted = value;
 }
 
+struct machine *maps__machine(const struct maps *maps)
+{
+	return RC_CHK_ACCESS(maps)->machine;
+}
+
+unsigned int maps__nr_maps(const struct maps *maps)
+{
+	return RC_CHK_ACCESS(maps)->nr_maps;
+}
+
+refcount_t *maps__refcnt(struct maps *maps)
+{
+	return &RC_CHK_ACCESS(maps)->refcnt;
+}
+
+#ifdef HAVE_LIBUNWIND_SUPPORT
+void *maps__addr_space(const struct maps *maps)
+{
+	return RC_CHK_ACCESS(maps)->addr_space;
+}
+
+void maps__set_addr_space(struct maps *maps, void *addr_space)
+{
+	RC_CHK_ACCESS(maps)->addr_space = addr_space;
+}
+
+const struct unwind_libunwind_ops *maps__unwind_libunwind_ops(const struct maps *maps)
+{
+	return RC_CHK_ACCESS(maps)->unwind_libunwind_ops;
+}
+
+void maps__set_unwind_libunwind_ops(struct maps *maps, const struct unwind_libunwind_ops *ops)
+{
+	RC_CHK_ACCESS(maps)->unwind_libunwind_ops = ops;
+}
+#endif
+
 static struct rw_semaphore *maps__lock(struct maps *maps)
 {
 	/*
@@ -440,6 +531,11 @@ bool maps__empty(struct maps *maps)
 	return maps__nr_maps(maps) == 0;
 }
 
+bool maps__equal(struct maps *a, struct maps *b)
+{
+	return RC_CHK_EQUAL(a, b);
+}
+
 int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data), void *data)
 {
 	bool done = false;
diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
index df9dd5a0e3c0..4bcba136ffe5 100644
--- a/tools/perf/util/maps.h
+++ b/tools/perf/util/maps.h
@@ -3,80 +3,15 @@
 #define __PERF_MAPS_H
 
 #include <linux/refcount.h>
-#include <linux/rbtree.h>
 #include <stdio.h>
 #include <stdbool.h>
 #include <linux/types.h>
-#include "rwsem.h"
-#include <internal/rc_check.h>
 
 struct ref_reloc_sym;
 struct machine;
 struct map;
 struct maps;
 
-struct map_list_node {
-	struct list_head node;
-	struct map *map;
-};
-
-static inline struct map_list_node *map_list_node__new(void)
-{
-	return malloc(sizeof(struct map_list_node));
-}
-
-/*
- * Locking/sorting note:
- *
- * Sorting is done with the write lock, iteration and binary searching happens
- * under the read lock requiring being sorted. There is a race between sorting
- * releasing the write lock and acquiring the read lock for iteration/searching
- * where another thread could insert and break the sorting of the maps. In
- * practice inserting maps should be rare meaning that the race shouldn't lead
- * to live lock. Removal of maps doesn't break being sorted.
- */
-
-DECLARE_RC_STRUCT(maps) {
-	struct rw_semaphore lock;
-	/**
-	 * @maps_by_address: array of maps sorted by their starting address if
-	 * maps_by_address_sorted is true.
-	 */
-	struct map	 **maps_by_address;
-	/**
-	 * @maps_by_name: optional array of maps sorted by their dso name if
-	 * maps_by_name_sorted is true.
-	 */
-	struct map	 **maps_by_name;
-	struct machine	 *machine;
-#ifdef HAVE_LIBUNWIND_SUPPORT
-	void		*addr_space;
-	const struct unwind_libunwind_ops *unwind_libunwind_ops;
-#endif
-	refcount_t	 refcnt;
-	/**
-	 * @nr_maps: number of maps_by_address, and possibly maps_by_name,
-	 * entries that contain maps.
-	 */
-	unsigned int	 nr_maps;
-	/**
-	 * @nr_maps_allocated: number of entries in maps_by_address and possibly
-	 * maps_by_name.
-	 */
-	unsigned int	 nr_maps_allocated;
-	/**
-	 * @last_search_by_name_idx: cache of last found by name entry's index
-	 * as frequent searches for the same dso name are common.
-	 */
-	unsigned int	 last_search_by_name_idx;
-	/** @maps_by_address_sorted: is maps_by_address sorted. */
-	bool		 maps_by_address_sorted;
-	/** @maps_by_name_sorted: is maps_by_name sorted. */
-	bool		 maps_by_name_sorted;
-	/** @ends_broken: does the map contain a map where end values are unset/unsorted? */
-	bool		 ends_broken;
-};
-
 #define KMAP_NAME_LEN 256
 
 struct kmap {
@@ -100,36 +35,22 @@ static inline void __maps__zput(struct maps **map)
 
 #define maps__zput(map) __maps__zput(&map)
 
+bool maps__equal(struct maps *a, struct maps *b);
+
 /* Iterate over map calling cb for each entry. */
 int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data), void *data);
 /* Iterate over map removing an entry if cb returns true. */
 void maps__remove_maps(struct maps *maps, bool (*cb)(struct map *map, void *data), void *data);
 
-static inline struct machine *maps__machine(struct maps *maps)
-{
-	return RC_CHK_ACCESS(maps)->machine;
-}
-
-static inline unsigned int maps__nr_maps(const struct maps *maps)
-{
-	return RC_CHK_ACCESS(maps)->nr_maps;
-}
-
-static inline refcount_t *maps__refcnt(struct maps *maps)
-{
-	return &RC_CHK_ACCESS(maps)->refcnt;
-}
+struct machine *maps__machine(const struct maps *maps);
+unsigned int maps__nr_maps(const struct maps *maps);
+refcount_t *maps__refcnt(struct maps *maps);
 
 #ifdef HAVE_LIBUNWIND_SUPPORT
-static inline void *maps__addr_space(struct maps *maps)
-{
-	return RC_CHK_ACCESS(maps)->addr_space;
-}
-
-static inline const struct unwind_libunwind_ops *maps__unwind_libunwind_ops(const struct maps *maps)
-{
-	return RC_CHK_ACCESS(maps)->unwind_libunwind_ops;
-}
+void *maps__addr_space(const struct maps *maps);
+void maps__set_addr_space(struct maps *maps, void *addr_space);
+const struct unwind_libunwind_ops *maps__unwind_libunwind_ops(const struct maps *maps);
+void maps__set_unwind_libunwind_ops(struct maps *maps, const struct unwind_libunwind_ops *ops);
 #endif
 
 size_t maps__fprintf(struct maps *maps, FILE *fp);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 0785a54e832e..35975189999b 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -63,6 +63,16 @@ struct symbol_conf symbol_conf = {
 	.res_sample		= 0,
 };
 
+struct map_list_node {
+	struct list_head node;
+	struct map *map;
+};
+
+static struct map_list_node *map_list_node__new(void)
+{
+	return malloc(sizeof(struct map_list_node));
+}
+
 static enum dso_binary_type binary_type_symtab[] = {
 	DSO_BINARY_TYPE__KALLSYMS,
 	DSO_BINARY_TYPE__GUEST_KALLSYMS,
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 89c47a5098e2..c59ab4d79163 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -383,7 +383,7 @@ static int thread__clone_maps(struct thread *thread, struct thread *parent, bool
 	if (thread__pid(thread) == thread__pid(parent))
 		return thread__prepare_access(thread);
 
-	if (RC_CHK_EQUAL(thread__maps(thread), thread__maps(parent))) {
+	if (maps__equal(thread__maps(thread), thread__maps(parent))) {
 		pr_debug("broken map groups on thread %d/%d parent %d/%d\n",
 			 thread__pid(thread), thread__tid(thread),
 			 thread__pid(parent), thread__tid(parent));
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index dac536e28360..6a5ac0faa6f4 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -706,7 +706,7 @@ static int _unwind__prepare_access(struct maps *maps)
 {
 	void *addr_space = unw_create_addr_space(&accessors, 0);
 
-	RC_CHK_ACCESS(maps)->addr_space = addr_space;
+	maps__set_addr_space(maps, addr_space);
 	if (!addr_space) {
 		pr_err("unwind: Can't create unwind address space.\n");
 		return -ENOMEM;
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 76cd63de80a8..2728eb4f13ea 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -12,11 +12,6 @@ struct unwind_libunwind_ops __weak *local_unwind_libunwind_ops;
 struct unwind_libunwind_ops __weak *x86_32_unwind_libunwind_ops;
 struct unwind_libunwind_ops __weak *arm64_unwind_libunwind_ops;
 
-static void unwind__register_ops(struct maps *maps, struct unwind_libunwind_ops *ops)
-{
-	RC_CHK_ACCESS(maps)->unwind_libunwind_ops = ops;
-}
-
 int unwind__prepare_access(struct maps *maps, struct map *map, bool *initialized)
 {
 	const char *arch;
@@ -60,7 +55,7 @@ int unwind__prepare_access(struct maps *maps, struct map *map, bool *initialized
 		return 0;
 	}
 out_register:
-	unwind__register_ops(maps, ops);
+	maps__set_unwind_libunwind_ops(maps, ops);
 
 	err = maps__unwind_libunwind_ops(maps)->prepare_access(maps);
 	if (initialized)
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 06/25] perf maps: Locking tidy up of nr_maps
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (4 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 05/25] perf maps: Hide maps internals Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 07/25] perf dso: Reorder variables to save space in struct dso Ian Rogers
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

After this change, maps__nr_maps is only used by tests; existing
users are migrated to maps__empty. Compute maps__empty under the
read lock.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/machine.c |  2 +-
 tools/perf/util/maps.c    | 10 ++++++++--
 tools/perf/util/maps.h    |  4 ++--
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 4911734411b5..3da92f18814a 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -440,7 +440,7 @@ static struct thread *findnew_guest_code(struct machine *machine,
 		return NULL;
 
 	/* Assume maps are set up if there are any */
-	if (maps__nr_maps(thread__maps(thread)))
+	if (!maps__empty(thread__maps(thread)))
 		return thread;
 
 	host_thread = machine__find_thread(host_machine, -1, pid);
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index c08e412a4313..cb52de9d6c2a 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -528,7 +528,13 @@ void maps__remove(struct maps *maps, struct map *map)
 
 bool maps__empty(struct maps *maps)
 {
-	return maps__nr_maps(maps) == 0;
+	bool res;
+
+	down_read(maps__lock(maps));
+	res = maps__nr_maps(maps) == 0;
+	up_read(maps__lock(maps));
+
+	return res;
 }
 
 bool maps__equal(struct maps *a, struct maps *b)
@@ -852,7 +858,7 @@ int maps__copy_from(struct maps *dest, struct maps *parent)
 
 	parent_maps_by_address = maps__maps_by_address(parent);
 	n = maps__nr_maps(parent);
-	if (maps__empty(dest)) {
+	if (maps__nr_maps(dest) == 0) {
 		/* No existing mappings so just copy from parent to avoid reallocs in insert. */
 		unsigned int nr_maps_allocated = RC_CHK_ACCESS(parent)->nr_maps_allocated;
 		struct map **dest_maps_by_address =
diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
index 4bcba136ffe5..d9aa62ed968a 100644
--- a/tools/perf/util/maps.h
+++ b/tools/perf/util/maps.h
@@ -43,8 +43,8 @@ int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data)
 void maps__remove_maps(struct maps *maps, bool (*cb)(struct map *map, void *data), void *data);
 
 struct machine *maps__machine(const struct maps *maps);
-unsigned int maps__nr_maps(const struct maps *maps);
-refcount_t *maps__refcnt(struct maps *maps);
+unsigned int maps__nr_maps(const struct maps *maps); /* Test only. */
+refcount_t *maps__refcnt(struct maps *maps); /* Test only. */
 
 #ifdef HAVE_LIBUNWIND_SUPPORT
 void *maps__addr_space(const struct maps *maps);
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 07/25] perf dso: Reorder variables to save space in struct dso
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (5 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 06/25] perf maps: Locking tidy up of nr_maps Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 08/25] perf report: Sort child tasks by tid Ian Rogers
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Save 40 bytes, shrinking struct dso from 8 to 7 cache lines. Make the
dwfl variable conditional on powerpc builds. Squeeze int/enum types
into bitfields where appropriate. Remove holes/padding by reordering
variables.

Before:
```
struct dso {
        struct mutex               lock;                 /*     0    40 */
        struct list_head           node;                 /*    40    16 */
        struct rb_node             rb_node __attribute__((__aligned__(8))); /*    56    24 */
        /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
        struct rb_root *           root;                 /*    80     8 */
        struct rb_root_cached      symbols;              /*    88    16 */
        struct symbol * *          symbol_names;         /*   104     8 */
        size_t                     symbol_names_len;     /*   112     8 */
        struct rb_root_cached      inlined_nodes;        /*   120    16 */
        /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
        struct rb_root_cached      srclines;             /*   136    16 */
        struct {
                u64                addr;                 /*   152     8 */
                struct symbol *    symbol;               /*   160     8 */
        } last_find_result;                              /*   152    16 */
        void *                     a2l;                  /*   168     8 */
        char *                     symsrc_filename;      /*   176     8 */
        unsigned int               a2l_fails;            /*   184     4 */
        enum dso_space_type        kernel;               /*   188     4 */
        /* --- cacheline 3 boundary (192 bytes) --- */
        _Bool                      is_kmod;              /*   192     1 */

        /* XXX 3 bytes hole, try to pack */

        enum dso_swap_type         needs_swap;           /*   196     4 */
        enum dso_binary_type       symtab_type;          /*   200     4 */
        enum dso_binary_type       binary_type;          /*   204     4 */
        enum dso_load_errno        load_errno;           /*   208     4 */
        u8                         adjust_symbols:1;     /*   212: 0  1 */
        u8                         has_build_id:1;       /*   212: 1  1 */
        u8                         header_build_id:1;    /*   212: 2  1 */
        u8                         has_srcline:1;        /*   212: 3  1 */
        u8                         hit:1;                /*   212: 4  1 */
        u8                         annotate_warned:1;    /*   212: 5  1 */
        u8                         auxtrace_warned:1;    /*   212: 6  1 */
        u8                         short_name_allocated:1; /*   212: 7  1 */
        u8                         long_name_allocated:1; /*   213: 0  1 */
        u8                         is_64_bit:1;          /*   213: 1  1 */

        /* XXX 6 bits hole, try to pack */

        _Bool                      sorted_by_name;       /*   214     1 */
        _Bool                      loaded;               /*   215     1 */
        u8                         rel;                  /*   216     1 */

        /* XXX 7 bytes hole, try to pack */

        struct build_id            bid;                  /*   224    32 */
        /* --- cacheline 4 boundary (256 bytes) --- */
        u64                        text_offset;          /*   256     8 */
        u64                        text_end;             /*   264     8 */
        const char  *              short_name;           /*   272     8 */
        const char  *              long_name;            /*   280     8 */
        u16                        long_name_len;        /*   288     2 */
        u16                        short_name_len;       /*   290     2 */

        /* XXX 4 bytes hole, try to pack */

        void *                     dwfl;                 /*   296     8 */
        struct auxtrace_cache *    auxtrace_cache;       /*   304     8 */
        int                        comp;                 /*   312     4 */

        /* XXX 4 bytes hole, try to pack */

        /* --- cacheline 5 boundary (320 bytes) --- */
        struct {
                struct rb_root     cache;                /*   320     8 */
                int                fd;                   /*   328     4 */
                int                status;               /*   332     4 */
                u32                status_seen;          /*   336     4 */

                /* XXX 4 bytes hole, try to pack */

                u64                file_size;            /*   344     8 */
                struct list_head   open_entry;           /*   352    16 */
                u64                elf_base_addr;        /*   368     8 */
                u64                debug_frame_offset;   /*   376     8 */
                /* --- cacheline 6 boundary (384 bytes) --- */
                u64                eh_frame_hdr_addr;    /*   384     8 */
                u64                eh_frame_hdr_offset;  /*   392     8 */
        } data;                                          /*   320    80 */
        struct {
                u32                id;                   /*   400     4 */
                u32                sub_id;               /*   404     4 */
                struct perf_env *  env;                  /*   408     8 */
        } bpf_prog;                                      /*   400    16 */
        union {
                void *             priv;                 /*   416     8 */
                u64                db_id;                /*   416     8 */
        };                                               /*   416     8 */
        struct nsinfo *            nsinfo;               /*   424     8 */
        struct dso_id              id;                   /*   432    24 */
        /* --- cacheline 7 boundary (448 bytes) was 8 bytes ago --- */
        refcount_t                 refcnt;               /*   456     4 */
        char                       name[];               /*   460     0 */

        /* size: 464, cachelines: 8, members: 49 */
        /* sum members: 440, holes: 4, sum holes: 18 */
        /* sum bitfield members: 10 bits, bit holes: 1, sum bit holes: 6 bits */
        /* padding: 4 */
        /* forced alignments: 1 */
        /* last cacheline: 16 bytes */
} __attribute__((__aligned__(8)));
```

After:
```
struct dso {
        struct mutex               lock;                 /*     0    40 */
        struct list_head           node;                 /*    40    16 */
        struct rb_node             rb_node __attribute__((__aligned__(8))); /*    56    24 */
        /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
        struct rb_root *           root;                 /*    80     8 */
        struct rb_root_cached      symbols;              /*    88    16 */
        struct symbol * *          symbol_names;         /*   104     8 */
        size_t                     symbol_names_len;     /*   112     8 */
        struct rb_root_cached      inlined_nodes;        /*   120    16 */
        /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
        struct rb_root_cached      srclines;             /*   136    16 */
        struct {
                u64                addr;                 /*   152     8 */
                struct symbol *    symbol;               /*   160     8 */
        } last_find_result;                              /*   152    16 */
        struct build_id            bid;                  /*   168    32 */
        /* --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- */
        u64                        text_offset;          /*   200     8 */
        u64                        text_end;             /*   208     8 */
        const char  *              short_name;           /*   216     8 */
        const char  *              long_name;            /*   224     8 */
        void *                     a2l;                  /*   232     8 */
        char *                     symsrc_filename;      /*   240     8 */
        struct nsinfo *            nsinfo;               /*   248     8 */
        /* --- cacheline 4 boundary (256 bytes) --- */
        struct auxtrace_cache *    auxtrace_cache;       /*   256     8 */
        union {
                void *             priv;                 /*   264     8 */
                u64                db_id;                /*   264     8 */
        };                                               /*   264     8 */
        struct {
                struct perf_env *  env;                  /*   272     8 */
                u32                id;                   /*   280     4 */
                u32                sub_id;               /*   284     4 */
        } bpf_prog;                                      /*   272    16 */
        struct {
                struct rb_root     cache;                /*   288     8 */
                struct list_head   open_entry;           /*   296    16 */
                u64                file_size;            /*   312     8 */
                /* --- cacheline 5 boundary (320 bytes) --- */
                u64                elf_base_addr;        /*   320     8 */
                u64                debug_frame_offset;   /*   328     8 */
                u64                eh_frame_hdr_addr;    /*   336     8 */
                u64                eh_frame_hdr_offset;  /*   344     8 */
                int                fd;                   /*   352     4 */
                int                status;               /*   356     4 */
                u32                status_seen;          /*   360     4 */
        } data;                                          /*   288    80 */

        /* XXX last struct has 4 bytes of padding */

        struct dso_id              id;                   /*   368    24 */
        /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
        unsigned int               a2l_fails;            /*   392     4 */
        int                        comp;                 /*   396     4 */
        refcount_t                 refcnt;               /*   400     4 */
        enum dso_load_errno        load_errno;           /*   404     4 */
        u16                        long_name_len;        /*   408     2 */
        u16                        short_name_len;       /*   410     2 */
        enum dso_binary_type       symtab_type:8;        /*   412: 0  4 */
        enum dso_binary_type       binary_type:8;        /*   412: 8  4 */
        enum dso_space_type        kernel:2;             /*   412:16  4 */
        enum dso_swap_type         needs_swap:2;         /*   412:18  4 */

        /* Bitfield combined with next fields */

        _Bool                      is_kmod:1;            /*   414: 4  1 */
        u8                         adjust_symbols:1;     /*   414: 5  1 */
        u8                         has_build_id:1;       /*   414: 6  1 */
        u8                         header_build_id:1;    /*   414: 7  1 */
        u8                         has_srcline:1;        /*   415: 0  1 */
        u8                         hit:1;                /*   415: 1  1 */
        u8                         annotate_warned:1;    /*   415: 2  1 */
        u8                         auxtrace_warned:1;    /*   415: 3  1 */
        u8                         short_name_allocated:1; /*   415: 4  1 */
        u8                         long_name_allocated:1; /*   415: 5  1 */
        u8                         is_64_bit:1;          /*   415: 6  1 */

        /* XXX 1 bit hole, try to pack */

        _Bool                      sorted_by_name;       /*   416     1 */
        _Bool                      loaded;               /*   417     1 */
        u8                         rel;                  /*   418     1 */
        char                       name[];               /*   419     0 */

        /* size: 424, cachelines: 7, members: 48 */
        /* sum members: 415 */
        /* sum bitfield members: 31 bits, bit holes: 1, sum bit holes: 1 bits */
        /* padding: 5 */
        /* paddings: 1, sum paddings: 4 */
        /* forced alignments: 1 */
        /* last cacheline: 40 bytes */
} __attribute__((__aligned__(8)));
```

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dso.h | 84 +++++++++++++++++++++----------------------
 1 file changed, 42 insertions(+), 42 deletions(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index ce9f3849a773..33a41bcea335 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -160,66 +160,66 @@ struct dso {
 		u64		addr;
 		struct symbol	*symbol;
 	} last_find_result;
-	void		 *a2l;
-	char		 *symsrc_filename;
-	unsigned int	 a2l_fails;
-	enum dso_space_type	kernel;
-	bool			is_kmod;
-	enum dso_swap_type	needs_swap;
-	enum dso_binary_type	symtab_type;
-	enum dso_binary_type	binary_type;
-	enum dso_load_errno	load_errno;
-	u8		 adjust_symbols:1;
-	u8		 has_build_id:1;
-	u8		 header_build_id:1;
-	u8		 has_srcline:1;
-	u8		 hit:1;
-	u8		 annotate_warned:1;
-	u8		 auxtrace_warned:1;
-	u8		 short_name_allocated:1;
-	u8		 long_name_allocated:1;
-	u8		 is_64_bit:1;
-	bool		 sorted_by_name;
-	bool		 loaded;
-	u8		 rel;
 	struct build_id	 bid;
 	u64		 text_offset;
 	u64		 text_end;
 	const char	 *short_name;
 	const char	 *long_name;
-	u16		 long_name_len;
-	u16		 short_name_len;
+	void		 *a2l;
+	char		 *symsrc_filename;
+#if defined(__powerpc__)
 	void		*dwfl;			/* DWARF debug info */
+#endif
+	struct nsinfo	*nsinfo;
 	struct auxtrace_cache *auxtrace_cache;
-	int		 comp;
-
+	union { /* Tool specific area */
+		void	 *priv;
+		u64	 db_id;
+	};
+	/* bpf prog information */
+	struct {
+		struct perf_env	*env;
+		u32		id;
+		u32		sub_id;
+	} bpf_prog;
 	/* dso data file */
 	struct {
 		struct rb_root	 cache;
-		int		 fd;
-		int		 status;
-		u32		 status_seen;
-		u64		 file_size;
 		struct list_head open_entry;
+		u64		 file_size;
 		u64		 elf_base_addr;
 		u64		 debug_frame_offset;
 		u64		 eh_frame_hdr_addr;
 		u64		 eh_frame_hdr_offset;
+		int		 fd;
+		int		 status;
+		u32		 status_seen;
 	} data;
-	/* bpf prog information */
-	struct {
-		u32		id;
-		u32		sub_id;
-		struct perf_env	*env;
-	} bpf_prog;
-
-	union { /* Tool specific area */
-		void	 *priv;
-		u64	 db_id;
-	};
-	struct nsinfo	*nsinfo;
 	struct dso_id	 id;
+	unsigned int	 a2l_fails;
+	int		 comp;
 	refcount_t	 refcnt;
+	enum dso_load_errno	load_errno;
+	u16		 long_name_len;
+	u16		 short_name_len;
+	enum dso_binary_type	symtab_type:8;
+	enum dso_binary_type	binary_type:8;
+	enum dso_space_type	kernel:2;
+	enum dso_swap_type	needs_swap:2;
+	bool			is_kmod:1;
+	u8		 adjust_symbols:1;
+	u8		 has_build_id:1;
+	u8		 header_build_id:1;
+	u8		 has_srcline:1;
+	u8		 hit:1;
+	u8		 annotate_warned:1;
+	u8		 auxtrace_warned:1;
+	u8		 short_name_allocated:1;
+	u8		 long_name_allocated:1;
+	u8		 is_64_bit:1;
+	bool		 sorted_by_name;
+	bool		 loaded;
+	u8		 rel;
 	char		 name[];
 };
 
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 08/25] perf report: Sort child tasks by tid
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (6 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 07/25] perf dso: Reorder variables to save space in struct dso Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 09/25] perf trace: Ignore thread hashing in summary Ian Rogers
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Commit 91e467bc568f ("perf machine: Use hashtable for machine
threads") made the iteration of thread tids unordered. The perf report
--tasks output now shows child threads in an order determined by the
hashing. For example, in this snippet tid 3 appears after tid 256 even
though they have the same ppid 2:

```
$ perf report --tasks
%      pid      tid     ppid  comm
         0        0       -1 |swapper
         2        2        0 | kthreadd
       256      256        2 |  kworker/12:1H-k
    693761   693761        2 |  kworker/10:1-mm
   1301762  1301762        2 |  kworker/1:1-mm_
   1302530  1302530        2 |  kworker/u32:0-k
         3        3        2 |  rcu_gp
...
```

The output is easier to read if threads appear in numerically
increasing order. To allow for this, read all threads into a list,
then sort with a comparator that orders by the child tasks of the
first common parent. The list creation and deletion are added as
utilities on machine. The indentation is computed by counting the
number of parents a child has.

With this change the output for the same data file is now like:
```
$ perf report --tasks
%      pid      tid     ppid  comm
         0        0       -1 |swapper
         1        1        0 | systemd
       823      823        1 |  systemd-journal
       853      853        1 |  systemd-udevd
      3230     3230        1 |  systemd-timesyn
      3236     3236        1 |  auditd
      3239     3239     3236 |   audisp-syslog
      3321     3321        1 |  accounts-daemon
...
```

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-report.c | 203 ++++++++++++++++++++----------------
 tools/perf/util/machine.c   |  30 ++++++
 tools/perf/util/machine.h   |  10 ++
 3 files changed, 155 insertions(+), 88 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index f2ed2b7e80a3..ed0cc813cebb 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -59,6 +59,7 @@
 #include <linux/ctype.h>
 #include <signal.h>
 #include <linux/bitmap.h>
+#include <linux/list_sort.h>
 #include <linux/string.h>
 #include <linux/stringify.h>
 #include <linux/time64.h>
@@ -828,35 +829,6 @@ static void tasks_setup(struct report *rep)
 	rep->tool.no_warn = true;
 }
 
-struct task {
-	struct thread		*thread;
-	struct list_head	 list;
-	struct list_head	 children;
-};
-
-static struct task *tasks_list(struct task *task, struct machine *machine)
-{
-	struct thread *parent_thread, *thread = task->thread;
-	struct task   *parent_task;
-
-	/* Already listed. */
-	if (!list_empty(&task->list))
-		return NULL;
-
-	/* Last one in the chain. */
-	if (thread__ppid(thread) == -1)
-		return task;
-
-	parent_thread = machine__find_thread(machine, -1, thread__ppid(thread));
-	if (!parent_thread)
-		return ERR_PTR(-ENOENT);
-
-	parent_task = thread__priv(parent_thread);
-	thread__put(parent_thread);
-	list_add_tail(&task->list, &parent_task->children);
-	return tasks_list(parent_task, machine);
-}
-
 struct maps__fprintf_task_args {
 	int indent;
 	FILE *fp;
@@ -900,89 +872,144 @@ static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp)
 	return args.printed;
 }
 
-static void task__print_level(struct task *task, FILE *fp, int level)
+static int thread_level(struct machine *machine, const struct thread *thread)
 {
-	struct thread *thread = task->thread;
-	struct task *child;
-	int comm_indent = fprintf(fp, "  %8d %8d %8d |%*s",
-				  thread__pid(thread), thread__tid(thread),
-				  thread__ppid(thread), level, "");
+	struct thread *parent_thread;
+	int res;
 
-	fprintf(fp, "%s\n", thread__comm_str(thread));
+	if (thread__tid(thread) <= 0)
+		return 0;
 
-	maps__fprintf_task(thread__maps(thread), comm_indent, fp);
+	if (thread__ppid(thread) <= 0)
+		return 1;
 
-	if (!list_empty(&task->children)) {
-		list_for_each_entry(child, &task->children, list)
-			task__print_level(child, fp, level + 1);
+	parent_thread = machine__find_thread(machine, -1, thread__ppid(thread));
+	if (!parent_thread) {
+		pr_err("Missing parent thread of %d\n", thread__tid(thread));
+		return 0;
 	}
+	res = 1 + thread_level(machine, parent_thread);
+	thread__put(parent_thread);
+	return res;
 }
 
-static int tasks_print(struct report *rep, FILE *fp)
+static void task__print_level(struct machine *machine, struct thread *thread, FILE *fp)
 {
-	struct perf_session *session = rep->session;
-	struct machine      *machine = &session->machines.host;
-	struct task *tasks, *task;
-	unsigned int nr = 0, itask = 0, i;
-	struct rb_node *nd;
-	LIST_HEAD(list);
+	int level = thread_level(machine, thread);
+	int comm_indent = fprintf(fp, "  %8d %8d %8d |%*s",
+				  thread__pid(thread), thread__tid(thread),
+				  thread__ppid(thread), level, "");
 
-	/*
-	 * No locking needed while accessing machine->threads,
-	 * because --tasks is single threaded command.
-	 */
+	fprintf(fp, "%s\n", thread__comm_str(thread));
 
-	/* Count all the threads. */
-	for (i = 0; i < THREADS__TABLE_SIZE; i++)
-		nr += machine->threads[i].nr;
+	maps__fprintf_task(thread__maps(thread), comm_indent, fp);
+}
 
-	tasks = malloc(sizeof(*tasks) * nr);
-	if (!tasks)
-		return -ENOMEM;
+static int task_list_cmp(void *priv, const struct list_head *la, const struct list_head *lb)
+{
+	struct machine *machine = priv;
+	struct thread_list *task_a = list_entry(la, struct thread_list, list);
+	struct thread_list *task_b = list_entry(lb, struct thread_list, list);
+	struct thread *a = task_a->thread;
+	struct thread *b = task_b->thread;
+	int level_a, level_b, res;
+
+	/* Compare a and b to root. */
+	if (thread__tid(a) == thread__tid(b))
+		return 0;
 
-	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
-		struct threads *threads = &machine->threads[i];
+	if (thread__tid(a) == 0)
+		return -1;
 
-		for (nd = rb_first_cached(&threads->entries); nd;
-		     nd = rb_next(nd)) {
-			task = tasks + itask++;
+	if (thread__tid(b) == 0)
+		return 1;
 
-			task->thread = rb_entry(nd, struct thread_rb_node, rb_node)->thread;
-			INIT_LIST_HEAD(&task->children);
-			INIT_LIST_HEAD(&task->list);
-			thread__set_priv(task->thread, task);
-		}
+	/* If parents match sort by tid. */
+	if (thread__ppid(a) == thread__ppid(b)) {
+		return thread__tid(a) < thread__tid(b)
+			? -1
+			: (thread__tid(a) > thread__tid(b) ? 1 : 0);
 	}
 
 	/*
-	 * Iterate every task down to the unprocessed parent
-	 * and link all in task children list. Task with no
-	 * parent is added into 'list'.
+	 * Find a and b such that, if one is a child of the other, a and b's
+	 * tids match; otherwise a and b have a common parent and distinct
+	 * tids to sort by. First make the depths of the threads match.
 	 */
-	for (itask = 0; itask < nr; itask++) {
-		task = tasks + itask;
-
-		if (!list_empty(&task->list))
-			continue;
-
-		task = tasks_list(task, machine);
-		if (IS_ERR(task)) {
-			pr_err("Error: failed to process tasks\n");
-			free(tasks);
-			return PTR_ERR(task);
+	level_a = thread_level(machine, a);
+	level_b = thread_level(machine, b);
+	a = thread__get(a);
+	b = thread__get(b);
+	for (int i = level_a; i > level_b; i--) {
+		struct thread *parent = machine__find_thread(machine, -1, thread__ppid(a));
+
+		thread__put(a);
+		if (!parent) {
+			pr_err("Missing parent thread of %d\n", thread__tid(a));
+			thread__put(b);
+			return -1;
 		}
+		a = parent;
+	}
+	for (int i = level_b; i > level_a; i--) {
+		struct thread *parent = machine__find_thread(machine, -1, thread__ppid(b));
 
-		if (task)
-			list_add_tail(&task->list, &list);
+		thread__put(b);
+		if (!parent) {
+			pr_err("Missing parent thread of %d\n", thread__tid(b));
+			thread__put(a);
+			return 1;
+		}
+		b = parent;
+	}
+	/* Search up to a common parent. */
+	while (thread__ppid(a) != thread__ppid(b)) {
+		struct thread *parent;
+
+		parent = machine__find_thread(machine, -1, thread__ppid(a));
+		thread__put(a);
+		if (!parent)
+			pr_err("Missing parent thread of %d\n", thread__tid(a));
+		a = parent;
+		parent = machine__find_thread(machine, -1, thread__ppid(b));
+		thread__put(b);
+		if (!parent)
+			pr_err("Missing parent thread of %d\n", thread__tid(b));
+		b = parent;
+		if (!a || !b)
+			return !a && !b ? 0 : (!a ? -1 : 1);
+	}
+	if (thread__tid(a) == thread__tid(b)) {
+		/* a is a child of b or vice-versa, deeper levels appear later. */
+		res = level_a < level_b ? -1 : (level_a > level_b ? 1 : 0);
+	} else {
+		/* Sort by tid now the parent is the same. */
+		res = thread__tid(a) < thread__tid(b) ? -1 : 1;
 	}
+	thread__put(a);
+	thread__put(b);
+	return res;
+}
+
+static int tasks_print(struct report *rep, FILE *fp)
+{
+	struct machine *machine = &rep->session->machines.host;
+	LIST_HEAD(tasks);
+	int ret;
 
-	fprintf(fp, "# %8s %8s %8s  %s\n", "pid", "tid", "ppid", "comm");
+	ret = machine__thread_list(machine, &tasks);
+	if (!ret) {
+		struct thread_list *task;
 
-	list_for_each_entry(task, &list, list)
-		task__print_level(task, fp, 0);
+		list_sort(machine, &tasks, task_list_cmp);
 
-	free(tasks);
-	return 0;
+		fprintf(fp, "# %8s %8s %8s  %s\n", "pid", "tid", "ppid", "comm");
+
+		list_for_each_entry(task, &tasks, list)
+			task__print_level(machine, task->thread, fp);
+	}
+	thread_list__delete(&tasks);
+	return ret;
 }
 
 static int __cmd_report(struct report *rep)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 3da92f18814a..7872ce92c9fc 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -3261,6 +3261,36 @@ int machines__for_each_thread(struct machines *machines,
 	return rc;
 }
 
+
+static int thread_list_cb(struct thread *thread, void *data)
+{
+	struct list_head *list = data;
+	struct thread_list *entry = malloc(sizeof(*entry));
+
+	if (!entry)
+		return -ENOMEM;
+
+	entry->thread = thread__get(thread);
+	list_add_tail(&entry->list, list);
+	return 0;
+}
+
+int machine__thread_list(struct machine *machine, struct list_head *list)
+{
+	return machine__for_each_thread(machine, thread_list_cb, list);
+}
+
+void thread_list__delete(struct list_head *list)
+{
+	struct thread_list *pos, *next;
+
+	list_for_each_entry_safe(pos, next, list, list) {
+		thread__zput(pos->thread);
+		list_del(&pos->list);
+		free(pos);
+	}
+}
+
 pid_t machine__get_current_tid(struct machine *machine, int cpu)
 {
 	if (cpu < 0 || (size_t)cpu >= machine->current_tid_sz)
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 1279acda6a8a..b738ce84817b 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -280,6 +280,16 @@ int machines__for_each_thread(struct machines *machines,
 			      int (*fn)(struct thread *thread, void *p),
 			      void *priv);
 
+struct thread_list {
+	struct list_head	 list;
+	struct thread		*thread;
+};
+
+/* Make a list of struct thread_list based on threads in the machine. */
+int machine__thread_list(struct machine *machine, struct list_head *list);
+/* Free up the nodes within the thread_list list. */
+void thread_list__delete(struct list_head *list);
+
 pid_t machine__get_current_tid(struct machine *machine, int cpu);
 int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
 			     pid_t tid);
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 09/25] perf trace: Ignore thread hashing in summary
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (7 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 08/25] perf report: Sort child tasks by tid Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 10/25] perf machine: Move fprintf to for_each loop and a callback Ian Rogers
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Commit 91e467bc568f ("perf machine: Use hashtable for machine
threads") made the iteration of thread tids unordered. The perf trace
--summary output sorts and prints each hash bucket, rather than all
threads globally. Change this behavior by turning all threads into a
list, sorting the list by the number of trace events and then by tid,
and finally printing the list. This also means the rbtree in threads
is no longer accessed outside of machine.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-trace.c  | 41 +++++++++++++++++++++----------------
 tools/perf/util/rb_resort.h |  5 -----
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 109b8e64fe69..90eaff8c0f6e 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -74,6 +74,7 @@
 #include <linux/err.h>
 #include <linux/filter.h>
 #include <linux/kernel.h>
+#include <linux/list_sort.h>
 #include <linux/random.h>
 #include <linux/stringify.h>
 #include <linux/time64.h>
@@ -4312,34 +4313,38 @@ static unsigned long thread__nr_events(struct thread_trace *ttrace)
 	return ttrace ? ttrace->nr_events : 0;
 }
 
-DEFINE_RESORT_RB(threads,
-		(thread__nr_events(thread__priv(a->thread)) <
-		 thread__nr_events(thread__priv(b->thread))),
-	struct thread *thread;
-)
+static int trace_nr_events_cmp(void *priv __maybe_unused,
+			       const struct list_head *la,
+			       const struct list_head *lb)
 {
-	entry->thread = rb_entry(nd, struct thread_rb_node, rb_node)->thread;
+	struct thread_list *a = list_entry(la, struct thread_list, list);
+	struct thread_list *b = list_entry(lb, struct thread_list, list);
+	unsigned long a_nr_events = thread__nr_events(thread__priv(a->thread));
+	unsigned long b_nr_events = thread__nr_events(thread__priv(b->thread));
+
+	if (a_nr_events != b_nr_events)
+		return a_nr_events < b_nr_events ? -1 : 1;
+
+	/* Identical number of events, place smaller tids first. */
+	return thread__tid(a->thread) < thread__tid(b->thread)
+		? -1
+		: (thread__tid(a->thread) > thread__tid(b->thread) ? 1 : 0);
 }
 
 static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp)
 {
 	size_t printed = trace__fprintf_threads_header(fp);
-	struct rb_node *nd;
-	int i;
-
-	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
-		DECLARE_RESORT_RB_MACHINE_THREADS(threads, trace->host, i);
+	LIST_HEAD(threads);
 
-		if (threads == NULL) {
-			fprintf(fp, "%s", "Error sorting output by nr_events!\n");
-			return 0;
-		}
+	if (machine__thread_list(trace->host, &threads) == 0) {
+		struct thread_list *pos;
 
-		resort_rb__for_each_entry(nd, threads)
-			printed += trace__fprintf_thread(fp, threads_entry->thread, trace);
+		list_sort(NULL, &threads, trace_nr_events_cmp);
 
-		resort_rb__delete(threads);
+		list_for_each_entry(pos, &threads, list)
+			printed += trace__fprintf_thread(fp, pos->thread, trace);
 	}
+	thread_list__delete(&threads);
 	return printed;
 }
 
diff --git a/tools/perf/util/rb_resort.h b/tools/perf/util/rb_resort.h
index 376e86cb4c3c..d927a0d25052 100644
--- a/tools/perf/util/rb_resort.h
+++ b/tools/perf/util/rb_resort.h
@@ -143,9 +143,4 @@ struct __name##_sorted *__name = __name##_sorted__new
 	DECLARE_RESORT_RB(__name)(&__ilist->rblist.entries.rb_root,		\
 				  __ilist->rblist.nr_entries)
 
-/* For 'struct machine->threads' */
-#define DECLARE_RESORT_RB_MACHINE_THREADS(__name, __machine, hash_bucket)    \
- DECLARE_RESORT_RB(__name)(&__machine->threads[hash_bucket].entries.rb_root, \
-			   __machine->threads[hash_bucket].nr)
-
 #endif /* _PERF_RESORT_RB_H_ */
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v7 10/25] perf machine: Move fprintf to for_each loop and a callback
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (8 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 09/25] perf trace: Ignore thread hashing in summary Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 11/25] perf threads: Move threads to its own files Ian Rogers
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Avoid exposing the threads data structure by switching to the
machine__for_each_thread callback approach. machine__fprintf is only
used in tests and in verbose >3 output, so don't bother gathering the
threads into a list and sorting them. Add machine__threads_nr, to be
refactored later.

Note, all existing *_fprintf routines ignore fprintf errors.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/machine.c | 43 ++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 7872ce92c9fc..e072b2115b64 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1113,29 +1113,40 @@ size_t machine__fprintf_vmlinux_path(struct machine *machine, FILE *fp)
 	return printed;
 }
 
-size_t machine__fprintf(struct machine *machine, FILE *fp)
+struct machine_fprintf_cb_args {
+	FILE *fp;
+	size_t printed;
+};
+
+static int machine_fprintf_cb(struct thread *thread, void *data)
 {
-	struct rb_node *nd;
-	size_t ret;
-	int i;
+	struct machine_fprintf_cb_args *args = data;
 
-	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
-		struct threads *threads = &machine->threads[i];
+	/* TODO: handle fprintf errors. */
+	args->printed += thread__fprintf(thread, args->fp);
+	return 0;
+}
 
-		down_read(&threads->lock);
+static size_t machine__threads_nr(const struct machine *machine)
+{
+	size_t nr = 0;
 
-		ret = fprintf(fp, "Threads: %u\n", threads->nr);
+	for (int i = 0; i < THREADS__TABLE_SIZE; i++)
+		nr += machine->threads[i].nr;
 
-		for (nd = rb_first_cached(&threads->entries); nd;
-		     nd = rb_next(nd)) {
-			struct thread *pos = rb_entry(nd, struct thread_rb_node, rb_node)->thread;
+	return nr;
+}
 
-			ret += thread__fprintf(pos, fp);
-		}
+size_t machine__fprintf(struct machine *machine, FILE *fp)
+{
+	struct machine_fprintf_cb_args args = {
+		.fp = fp,
+		.printed = 0,
+	};
+	size_t ret = fprintf(fp, "Threads: %zu\n", machine__threads_nr(machine));
 
-		up_read(&threads->lock);
-	}
-	return ret;
+	machine__for_each_thread(machine, machine_fprintf_cb, &args);
+	return ret + args.printed;
 }
 
 static struct dso *machine__get_kernel(struct machine *machine)
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 11/25] perf threads: Move threads to its own files
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (9 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 10/25] perf machine: Move fprintf to for_each loop and a callback Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 12/25] perf threads: Switch from rbtree to hashmap Ian Rogers
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Move threads out of struct machine and move thread_rb_node into the C
file. This hides the implementation of threads from the rest of the
code, allowing it to be refactored.

Locking discipline is tightened up in this change.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/Build                 |   1 +
 tools/perf/util/bpf_lock_contention.c |   8 +-
 tools/perf/util/machine.c             | 287 ++++----------------------
 tools/perf/util/machine.h             |  20 +-
 tools/perf/util/thread.c              |   2 +-
 tools/perf/util/thread.h              |   6 -
 tools/perf/util/threads.c             | 244 ++++++++++++++++++++++
 tools/perf/util/threads.h             |  35 ++++
 8 files changed, 325 insertions(+), 278 deletions(-)
 create mode 100644 tools/perf/util/threads.c
 create mode 100644 tools/perf/util/threads.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 8027f450fa3e..a0e8cd68d490 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -71,6 +71,7 @@ perf-y += ordered-events.o
 perf-y += namespaces.o
 perf-y += comm.o
 perf-y += thread.o
+perf-y += threads.o
 perf-y += thread_map.o
 perf-y += parse-events-flex.o
 perf-y += parse-events-bison.o
diff --git a/tools/perf/util/bpf_lock_contention.c b/tools/perf/util/bpf_lock_contention.c
index 31ff19afc20c..3992c8a9fd96 100644
--- a/tools/perf/util/bpf_lock_contention.c
+++ b/tools/perf/util/bpf_lock_contention.c
@@ -210,7 +210,7 @@ static const char *lock_contention_get_name(struct lock_contention *con,
 
 		/* do not update idle comm which contains CPU number */
 		if (pid) {
-			struct thread *t = __machine__findnew_thread(machine, /*pid=*/-1, pid);
+			struct thread *t = machine__findnew_thread(machine, /*pid=*/-1, pid);
 
 			if (t == NULL)
 				return name;
@@ -302,9 +302,9 @@ int lock_contention_read(struct lock_contention *con)
 		return -1;
 
 	if (con->aggr_mode == LOCK_AGGR_TASK) {
-		struct thread *idle = __machine__findnew_thread(machine,
-								/*pid=*/0,
-								/*tid=*/0);
+		struct thread *idle = machine__findnew_thread(machine,
+							      /*pid=*/0,
+							      /*tid=*/0);
 		thread__set_comm(idle, "swapper", /*timestamp=*/0);
 	}
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e072b2115b64..e668a97255f8 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -43,9 +43,6 @@
 #include <linux/string.h>
 #include <linux/zalloc.h>
 
-static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd,
-				     struct thread *th, bool lock);
-
 static struct dso *machine__kernel_dso(struct machine *machine)
 {
 	return map__dso(machine->vmlinux_map);
@@ -58,35 +55,6 @@ static void dsos__init(struct dsos *dsos)
 	init_rwsem(&dsos->lock);
 }
 
-static void machine__threads_init(struct machine *machine)
-{
-	int i;
-
-	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
-		struct threads *threads = &machine->threads[i];
-		threads->entries = RB_ROOT_CACHED;
-		init_rwsem(&threads->lock);
-		threads->nr = 0;
-		threads->last_match = NULL;
-	}
-}
-
-static int thread_rb_node__cmp_tid(const void *key, const struct rb_node *nd)
-{
-	int to_find = (int) *((pid_t *)key);
-
-	return to_find - (int)thread__tid(rb_entry(nd, struct thread_rb_node, rb_node)->thread);
-}
-
-static struct thread_rb_node *thread_rb_node__find(const struct thread *th,
-						   struct rb_root *tree)
-{
-	pid_t to_find = thread__tid(th);
-	struct rb_node *nd = rb_find(&to_find, tree, thread_rb_node__cmp_tid);
-
-	return rb_entry(nd, struct thread_rb_node, rb_node);
-}
-
 static int machine__set_mmap_name(struct machine *machine)
 {
 	if (machine__is_host(machine))
@@ -120,7 +88,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 	RB_CLEAR_NODE(&machine->rb_node);
 	dsos__init(&machine->dsos);
 
-	machine__threads_init(machine);
+	threads__init(&machine->threads);
 
 	machine->vdso_info = NULL;
 	machine->env = NULL;
@@ -221,27 +189,11 @@ static void dsos__exit(struct dsos *dsos)
 
 void machine__delete_threads(struct machine *machine)
 {
-	struct rb_node *nd;
-	int i;
-
-	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
-		struct threads *threads = &machine->threads[i];
-		down_write(&threads->lock);
-		nd = rb_first_cached(&threads->entries);
-		while (nd) {
-			struct thread_rb_node *trb = rb_entry(nd, struct thread_rb_node, rb_node);
-
-			nd = rb_next(nd);
-			__machine__remove_thread(machine, trb, trb->thread, false);
-		}
-		up_write(&threads->lock);
-	}
+	threads__remove_all_threads(&machine->threads);
 }
 
 void machine__exit(struct machine *machine)
 {
-	int i;
-
 	if (machine == NULL)
 		return;
 
@@ -254,12 +206,7 @@ void machine__exit(struct machine *machine)
 	zfree(&machine->current_tid);
 	zfree(&machine->kallsyms_filename);
 
-	machine__delete_threads(machine);
-	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
-		struct threads *threads = &machine->threads[i];
-
-		exit_rwsem(&threads->lock);
-	}
+	threads__exit(&machine->threads);
 }
 
 void machine__delete(struct machine *machine)
@@ -526,7 +473,7 @@ static void machine__update_thread_pid(struct machine *machine,
 	if (thread__pid(th) == thread__tid(th))
 		return;
 
-	leader = __machine__findnew_thread(machine, thread__pid(th), thread__pid(th));
+	leader = machine__findnew_thread(machine, thread__pid(th), thread__pid(th));
 	if (!leader)
 		goto out_err;
 
@@ -560,160 +507,55 @@ static void machine__update_thread_pid(struct machine *machine,
 	goto out_put;
 }
 
-/*
- * Front-end cache - TID lookups come in blocks,
- * so most of the time we dont have to look up
- * the full rbtree:
- */
-static struct thread*
-__threads__get_last_match(struct threads *threads, struct machine *machine,
-			  int pid, int tid)
-{
-	struct thread *th;
-
-	th = threads->last_match;
-	if (th != NULL) {
-		if (thread__tid(th) == tid) {
-			machine__update_thread_pid(machine, th, pid);
-			return thread__get(th);
-		}
-		thread__put(threads->last_match);
-		threads->last_match = NULL;
-	}
-
-	return NULL;
-}
-
-static struct thread*
-threads__get_last_match(struct threads *threads, struct machine *machine,
-			int pid, int tid)
-{
-	struct thread *th = NULL;
-
-	if (perf_singlethreaded)
-		th = __threads__get_last_match(threads, machine, pid, tid);
-
-	return th;
-}
-
-static void
-__threads__set_last_match(struct threads *threads, struct thread *th)
-{
-	thread__put(threads->last_match);
-	threads->last_match = thread__get(th);
-}
-
-static void
-threads__set_last_match(struct threads *threads, struct thread *th)
-{
-	if (perf_singlethreaded)
-		__threads__set_last_match(threads, th);
-}
-
 /*
  * Caller must eventually drop thread->refcnt returned with a successful
  * lookup/new thread inserted.
  */
-static struct thread *____machine__findnew_thread(struct machine *machine,
-						  struct threads *threads,
-						  pid_t pid, pid_t tid,
-						  bool create)
+static struct thread *__machine__findnew_thread(struct machine *machine,
+						pid_t pid,
+						pid_t tid,
+						bool create)
 {
-	struct rb_node **p = &threads->entries.rb_root.rb_node;
-	struct rb_node *parent = NULL;
-	struct thread *th;
-	struct thread_rb_node *nd;
-	bool leftmost = true;
+	struct thread *th = threads__find(&machine->threads, tid);
+	bool created;
 
-	th = threads__get_last_match(threads, machine, pid, tid);
-	if (th)
+	if (th) {
+		machine__update_thread_pid(machine, th, pid);
 		return th;
-
-	while (*p != NULL) {
-		parent = *p;
-		th = rb_entry(parent, struct thread_rb_node, rb_node)->thread;
-
-		if (thread__tid(th) == tid) {
-			threads__set_last_match(threads, th);
-			machine__update_thread_pid(machine, th, pid);
-			return thread__get(th);
-		}
-
-		if (tid < thread__tid(th))
-			p = &(*p)->rb_left;
-		else {
-			p = &(*p)->rb_right;
-			leftmost = false;
-		}
 	}
-
 	if (!create)
 		return NULL;
 
-	th = thread__new(pid, tid);
-	if (th == NULL)
-		return NULL;
-
-	nd = malloc(sizeof(*nd));
-	if (nd == NULL) {
-		thread__put(th);
-		return NULL;
-	}
-	nd->thread = th;
-
-	rb_link_node(&nd->rb_node, parent, p);
-	rb_insert_color_cached(&nd->rb_node, &threads->entries, leftmost);
-	/*
-	 * We have to initialize maps separately after rb tree is updated.
-	 *
-	 * The reason is that we call machine__findnew_thread within
-	 * thread__init_maps to find the thread leader and that would screwed
-	 * the rb tree.
-	 */
-	if (thread__init_maps(th, machine)) {
-		pr_err("Thread init failed thread %d\n", pid);
-		rb_erase_cached(&nd->rb_node, &threads->entries);
-		RB_CLEAR_NODE(&nd->rb_node);
-		free(nd);
-		thread__put(th);
-		return NULL;
-	}
-	/*
-	 * It is now in the rbtree, get a ref
-	 */
-	threads__set_last_match(threads, th);
-	++threads->nr;
-
-	return thread__get(th);
-}
+	th = threads__findnew(&machine->threads, pid, tid, &created);
+	if (created) {
+		/*
+		 * We have to initialize maps separately after rb tree is
+		 * updated.
+		 *
+		 * The reason is that we call machine__findnew_thread within
+		 * thread__init_maps to find the thread leader and that would
+		 * screwed the rb tree.
+		 */
+		if (thread__init_maps(th, machine)) {
+			pr_err("Thread init failed thread %d\n", pid);
+			threads__remove(&machine->threads, th);
+			thread__put(th);
+			return NULL;
+		}
+	} else
+		machine__update_thread_pid(machine, th, pid);
 
-struct thread *__machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid)
-{
-	return ____machine__findnew_thread(machine, machine__threads(machine, tid), pid, tid, true);
+	return th;
 }
 
-struct thread *machine__findnew_thread(struct machine *machine, pid_t pid,
-				       pid_t tid)
+struct thread *machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid)
 {
-	struct threads *threads = machine__threads(machine, tid);
-	struct thread *th;
-
-	down_write(&threads->lock);
-	th = __machine__findnew_thread(machine, pid, tid);
-	up_write(&threads->lock);
-	return th;
+	return __machine__findnew_thread(machine, pid, tid, /*create=*/true);
 }
 
-struct thread *machine__find_thread(struct machine *machine, pid_t pid,
-				    pid_t tid)
+struct thread *machine__find_thread(struct machine *machine, pid_t pid, pid_t tid)
 {
-	struct threads *threads = machine__threads(machine, tid);
-	struct thread *th;
-
-	down_read(&threads->lock);
-	th =  ____machine__findnew_thread(machine, threads, pid, tid, false);
-	up_read(&threads->lock);
-	return th;
+	return __machine__findnew_thread(machine, pid, tid, /*create=*/false);
 }
 
 /*
@@ -1127,23 +969,13 @@ static int machine_fprintf_cb(struct thread *thread, void *data)
 	return 0;
 }
 
-static size_t machine__threads_nr(const struct machine *machine)
-{
-	size_t nr = 0;
-
-	for (int i = 0; i < THREADS__TABLE_SIZE; i++)
-		nr += machine->threads[i].nr;
-
-	return nr;
-}
-
 size_t machine__fprintf(struct machine *machine, FILE *fp)
 {
 	struct machine_fprintf_cb_args args = {
 		.fp = fp,
 		.printed = 0,
 	};
-	size_t ret = fprintf(fp, "Threads: %zu\n", machine__threads_nr(machine));
+	size_t ret = fprintf(fp, "Threads: %zu\n", threads__nr(&machine->threads));
 
 	machine__for_each_thread(machine, machine_fprintf_cb, &args);
 	return ret + args.printed;
@@ -2069,36 +1901,9 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
 	return 0;
 }
 
-static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd,
-				     struct thread *th, bool lock)
-{
-	struct threads *threads = machine__threads(machine, thread__tid(th));
-
-	if (!nd)
-		nd = thread_rb_node__find(th, &threads->entries.rb_root);
-
-	if (threads->last_match && RC_CHK_EQUAL(threads->last_match, th))
-		threads__set_last_match(threads, NULL);
-
-	if (lock)
-		down_write(&threads->lock);
-
-	BUG_ON(refcount_read(thread__refcnt(th)) == 0);
-
-	thread__put(nd->thread);
-	rb_erase_cached(&nd->rb_node, &threads->entries);
-	RB_CLEAR_NODE(&nd->rb_node);
-	--threads->nr;
-
-	free(nd);
-
-	if (lock)
-		up_write(&threads->lock);
-}
-
 void machine__remove_thread(struct machine *machine, struct thread *th)
 {
-	return __machine__remove_thread(machine, NULL, th, true);
+	return threads__remove(&machine->threads, th);
 }
 
 int machine__process_fork_event(struct machine *machine, union perf_event *event,
@@ -3232,23 +3037,7 @@ int machine__for_each_thread(struct machine *machine,
 			     int (*fn)(struct thread *thread, void *p),
 			     void *priv)
 {
-	struct threads *threads;
-	struct rb_node *nd;
-	int rc = 0;
-	int i;
-
-	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
-		threads = &machine->threads[i];
-		for (nd = rb_first_cached(&threads->entries); nd;
-		     nd = rb_next(nd)) {
-			struct thread_rb_node *trb = rb_entry(nd, struct thread_rb_node, rb_node);
-
-			rc = fn(trb->thread, priv);
-			if (rc != 0)
-				return rc;
-		}
-	}
-	return rc;
+	return threads__for_each_thread(&machine->threads, fn, priv);
 }
 
 int machines__for_each_thread(struct machines *machines,
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index b738ce84817b..e28c787616fe 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -7,6 +7,7 @@
 #include "maps.h"
 #include "dsos.h"
 #include "rwsem.h"
+#include "threads.h"
 
 struct addr_location;
 struct branch_stack;
@@ -28,16 +29,6 @@ extern const char *ref_reloc_sym_names[];
 
 struct vdso_info;
 
-#define THREADS__TABLE_BITS	8
-#define THREADS__TABLE_SIZE	(1 << THREADS__TABLE_BITS)
-
-struct threads {
-	struct rb_root_cached  entries;
-	struct rw_semaphore    lock;
-	unsigned int	       nr;
-	struct thread	       *last_match;
-};
-
 struct machine {
 	struct rb_node	  rb_node;
 	pid_t		  pid;
@@ -48,7 +39,7 @@ struct machine {
 	char		  *root_dir;
 	char		  *mmap_name;
 	char		  *kallsyms_filename;
-	struct threads    threads[THREADS__TABLE_SIZE];
+	struct threads    threads;
 	struct vdso_info  *vdso_info;
 	struct perf_env   *env;
 	struct dsos	  dsos;
@@ -69,12 +60,6 @@ struct machine {
 	bool		  trampolines_mapped;
 };
 
-static inline struct threads *machine__threads(struct machine *machine, pid_t tid)
-{
-	/* Cast it to handle tid == -1 */
-	return &machine->threads[(unsigned int)tid % THREADS__TABLE_SIZE];
-}
-
 /*
  * The main kernel (vmlinux) map
  */
@@ -220,7 +205,6 @@ bool machine__is(struct machine *machine, const char *arch);
 bool machine__normalized_is(struct machine *machine, const char *arch);
 int machine__nr_cpus_avail(struct machine *machine);
 
-struct thread *__machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
 struct thread *machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
 
 struct dso *machine__findnew_dso_id(struct machine *machine, const char *filename, struct dso_id *id);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index c59ab4d79163..1aa8962dcf52 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -26,7 +26,7 @@ int thread__init_maps(struct thread *thread, struct machine *machine)
 	if (pid == thread__tid(thread) || pid == -1) {
 		thread__set_maps(thread, maps__new(machine));
 	} else {
-		struct thread *leader = __machine__findnew_thread(machine, pid, pid);
+		struct thread *leader = machine__findnew_thread(machine, pid, pid);
 
 		if (leader) {
 			thread__set_maps(thread, maps__get(thread__maps(leader)));
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 0df775b5c110..4b8f3e9e513b 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -3,7 +3,6 @@
 #define __PERF_THREAD_H
 
 #include <linux/refcount.h>
-#include <linux/rbtree.h>
 #include <linux/list.h>
 #include <stdio.h>
 #include <unistd.h>
@@ -30,11 +29,6 @@ struct lbr_stitch {
 	struct callchain_cursor_node	*prev_lbr_cursor;
 };
 
-struct thread_rb_node {
-	struct rb_node rb_node;
-	struct thread *thread;
-};
-
 DECLARE_RC_STRUCT(thread) {
 	/** @maps: mmaps associated with this thread. */
 	struct maps		*maps;
diff --git a/tools/perf/util/threads.c b/tools/perf/util/threads.c
new file mode 100644
index 000000000000..d984ec939c7b
--- /dev/null
+++ b/tools/perf/util/threads.c
@@ -0,0 +1,244 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "threads.h"
+#include "machine.h"
+#include "thread.h"
+
+struct thread_rb_node {
+	struct rb_node rb_node;
+	struct thread *thread;
+};
+
+static struct threads_table_entry *threads__table(struct threads *threads, pid_t tid)
+{
+	/* Cast it to handle tid == -1 */
+	return &threads->table[(unsigned int)tid % THREADS__TABLE_SIZE];
+}
+
+void threads__init(struct threads *threads)
+{
+	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
+		struct threads_table_entry *table = &threads->table[i];
+
+		table->entries = RB_ROOT_CACHED;
+		init_rwsem(&table->lock);
+		table->nr = 0;
+		table->last_match = NULL;
+	}
+}
+
+void threads__exit(struct threads *threads)
+{
+	threads__remove_all_threads(threads);
+	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
+		struct threads_table_entry *table = &threads->table[i];
+
+		exit_rwsem(&table->lock);
+	}
+}
+
+size_t threads__nr(struct threads *threads)
+{
+	size_t nr = 0;
+
+	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
+		struct threads_table_entry *table = &threads->table[i];
+
+		down_read(&table->lock);
+		nr += table->nr;
+		up_read(&table->lock);
+	}
+	return nr;
+}
+
+/*
+ * Front-end cache - TID lookups come in blocks,
+ * so most of the time we dont have to look up
+ * the full rbtree:
+ */
+static struct thread *__threads_table_entry__get_last_match(struct threads_table_entry *table,
+							    pid_t tid)
+{
+	struct thread *th, *res = NULL;
+
+	th = table->last_match;
+	if (th != NULL) {
+		if (thread__tid(th) == tid)
+			res = thread__get(th);
+	}
+	return res;
+}
+
+static void __threads_table_entry__set_last_match(struct threads_table_entry *table,
+						  struct thread *th)
+{
+	thread__put(table->last_match);
+	table->last_match = thread__get(th);
+}
+
+static void threads_table_entry__set_last_match(struct threads_table_entry *table,
+						struct thread *th)
+{
+	down_write(&table->lock);
+	__threads_table_entry__set_last_match(table, th);
+	up_write(&table->lock);
+}
+
+struct thread *threads__find(struct threads *threads, pid_t tid)
+{
+	struct threads_table_entry *table  = threads__table(threads, tid);
+	struct rb_node **p;
+	struct thread *res = NULL;
+
+	down_read(&table->lock);
+	res = __threads_table_entry__get_last_match(table, tid);
+	if (res)
+		return res;
+
+	p = &table->entries.rb_root.rb_node;
+	while (*p != NULL) {
+		struct rb_node *parent = *p;
+		struct thread *th = rb_entry(parent, struct thread_rb_node, rb_node)->thread;
+
+		if (thread__tid(th) == tid) {
+			res = thread__get(th);
+			break;
+		}
+
+		if (tid < thread__tid(th))
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	up_read(&table->lock);
+	if (res)
+		threads_table_entry__set_last_match(table, res);
+	return res;
+}
+
+struct thread *threads__findnew(struct threads *threads, pid_t pid, pid_t tid, bool *created)
+{
+	struct threads_table_entry *table  = threads__table(threads, tid);
+	struct rb_node **p;
+	struct rb_node *parent = NULL;
+	struct thread *res = NULL;
+	struct thread_rb_node *nd;
+	bool leftmost = true;
+
+	*created = false;
+	down_write(&table->lock);
+	p = &table->entries.rb_root.rb_node;
+	while (*p != NULL) {
+		struct thread *th;
+
+		parent = *p;
+		th = rb_entry(parent, struct thread_rb_node, rb_node)->thread;
+
+		if (thread__tid(th) == tid) {
+			__threads_table_entry__set_last_match(table, th);
+			res = thread__get(th);
+			goto out_unlock;
+		}
+
+		if (tid < thread__tid(th))
+			p = &(*p)->rb_left;
+		else {
+			leftmost = false;
+			p = &(*p)->rb_right;
+		}
+	}
+	nd = malloc(sizeof(*nd));
+	if (nd == NULL)
+		goto out_unlock;
+	res = thread__new(pid, tid);
+	if (!res)
+		free(nd);
+	else {
+		*created = true;
+		nd->thread = thread__get(res);
+		rb_link_node(&nd->rb_node, parent, p);
+		rb_insert_color_cached(&nd->rb_node, &table->entries, leftmost);
+		++table->nr;
+		__threads_table_entry__set_last_match(table, res);
+	}
+out_unlock:
+	up_write(&table->lock);
+	return res;
+}
+
+void threads__remove_all_threads(struct threads *threads)
+{
+	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
+		struct threads_table_entry *table = &threads->table[i];
+		struct rb_node *nd;
+
+		down_write(&table->lock);
+		__threads_table_entry__set_last_match(table, NULL);
+		nd = rb_first_cached(&table->entries);
+		while (nd) {
+			struct thread_rb_node *trb = rb_entry(nd, struct thread_rb_node, rb_node);
+
+			nd = rb_next(nd);
+			thread__put(trb->thread);
+			rb_erase_cached(&trb->rb_node, &table->entries);
+			RB_CLEAR_NODE(&trb->rb_node);
+			--table->nr;
+
+			free(trb);
+		}
+		assert(table->nr == 0);
+		up_write(&table->lock);
+	}
+}
+
+void threads__remove(struct threads *threads, struct thread *thread)
+{
+	struct rb_node **p;
+	struct threads_table_entry *table  = threads__table(threads, thread__tid(thread));
+	pid_t tid = thread__tid(thread);
+
+	down_write(&table->lock);
+	if (table->last_match && RC_CHK_EQUAL(table->last_match, thread))
+		__threads_table_entry__set_last_match(table, NULL);
+
+	p = &table->entries.rb_root.rb_node;
+	while (*p != NULL) {
+		struct rb_node *parent = *p;
+		struct thread_rb_node *nd = rb_entry(parent, struct thread_rb_node, rb_node);
+		struct thread *th = nd->thread;
+
+		if (RC_CHK_EQUAL(th, thread)) {
+			thread__put(nd->thread);
+			rb_erase_cached(&nd->rb_node, &table->entries);
+			RB_CLEAR_NODE(&nd->rb_node);
+			--table->nr;
+			free(nd);
+			break;
+		}
+
+		if (tid < thread__tid(th))
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	up_write(&table->lock);
+}
+
+int threads__for_each_thread(struct threads *threads,
+			     int (*fn)(struct thread *thread, void *data),
+			     void *data)
+{
+	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
+		struct threads_table_entry *table = &threads->table[i];
+		struct rb_node *nd;
+
+		for (nd = rb_first_cached(&table->entries); nd; nd = rb_next(nd)) {
+			struct thread_rb_node *trb = rb_entry(nd, struct thread_rb_node, rb_node);
+			int rc = fn(trb->thread, data);
+
+			if (rc != 0)
+				return rc;
+		}
+	}
+	return 0;
+
+}
diff --git a/tools/perf/util/threads.h b/tools/perf/util/threads.h
new file mode 100644
index 000000000000..ed67de627578
--- /dev/null
+++ b/tools/perf/util/threads.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_THREADS_H
+#define __PERF_THREADS_H
+
+#include <linux/rbtree.h>
+#include "rwsem.h"
+
+struct thread;
+
+#define THREADS__TABLE_BITS	8
+#define THREADS__TABLE_SIZE	(1 << THREADS__TABLE_BITS)
+
+struct threads_table_entry {
+	struct rb_root_cached  entries;
+	struct rw_semaphore    lock;
+	unsigned int	       nr;
+	struct thread	       *last_match;
+};
+
+struct threads {
+	struct threads_table_entry table[THREADS__TABLE_SIZE];
+};
+
+void threads__init(struct threads *threads);
+void threads__exit(struct threads *threads);
+size_t threads__nr(struct threads *threads);
+struct thread *threads__find(struct threads *threads, pid_t tid);
+struct thread *threads__findnew(struct threads *threads, pid_t pid, pid_t tid, bool *created);
+void threads__remove_all_threads(struct threads *threads);
+void threads__remove(struct threads *threads, struct thread *thread);
+int threads__for_each_thread(struct threads *threads,
+			     int (*fn)(struct thread *thread, void *data),
+			     void *data);
+
+#endif	/* __PERF_THREADS_H */
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 12/25] perf threads: Switch from rbtree to hashmap
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (10 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 11/25] perf threads: Move threads to its own files Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 13/25] perf threads: Reduce table size from 256 to 8 Ian Rogers
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

The rbtree keeps its entries sorted, but the ordering is unused. Switch
to a hashmap for O(1) rather than O(log n) find/insert/remove
complexity.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/threads.c | 146 ++++++++++++--------------------------
 tools/perf/util/threads.h |   6 +-
 2 files changed, 47 insertions(+), 105 deletions(-)

diff --git a/tools/perf/util/threads.c b/tools/perf/util/threads.c
index d984ec939c7b..55923be53180 100644
--- a/tools/perf/util/threads.c
+++ b/tools/perf/util/threads.c
@@ -3,25 +3,30 @@
 #include "machine.h"
 #include "thread.h"
 
-struct thread_rb_node {
-	struct rb_node rb_node;
-	struct thread *thread;
-};
-
 static struct threads_table_entry *threads__table(struct threads *threads, pid_t tid)
 {
 	/* Cast it to handle tid == -1 */
 	return &threads->table[(unsigned int)tid % THREADS__TABLE_SIZE];
 }
 
+static size_t key_hash(long key, void *ctx __maybe_unused)
+{
+	/* The table lookup removes low bit entropy, but this is just ignored here. */
+	return key;
+}
+
+static bool key_equal(long key1, long key2, void *ctx __maybe_unused)
+{
+	return key1 == key2;
+}
+
 void threads__init(struct threads *threads)
 {
 	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads_table_entry *table = &threads->table[i];
 
-		table->entries = RB_ROOT_CACHED;
+		hashmap__init(&table->shard, key_hash, key_equal, NULL);
 		init_rwsem(&table->lock);
-		table->nr = 0;
 		table->last_match = NULL;
 	}
 }
@@ -32,6 +37,7 @@ void threads__exit(struct threads *threads)
 	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads_table_entry *table = &threads->table[i];
 
+		hashmap__clear(&table->shard);
 		exit_rwsem(&table->lock);
 	}
 }
@@ -44,7 +50,7 @@ size_t threads__nr(struct threads *threads)
 		struct threads_table_entry *table = &threads->table[i];
 
 		down_read(&table->lock);
-		nr += table->nr;
+		nr += hashmap__size(&table->shard);
 		up_read(&table->lock);
 	}
 	return nr;
@@ -86,28 +92,13 @@ static void threads_table_entry__set_last_match(struct threads_table_entry *tabl
 struct thread *threads__find(struct threads *threads, pid_t tid)
 {
 	struct threads_table_entry *table  = threads__table(threads, tid);
-	struct rb_node **p;
-	struct thread *res = NULL;
+	struct thread *res;
 
 	down_read(&table->lock);
 	res = __threads_table_entry__get_last_match(table, tid);
-	if (res)
-		return res;
-
-	p = &table->entries.rb_root.rb_node;
-	while (*p != NULL) {
-		struct rb_node *parent = *p;
-		struct thread *th = rb_entry(parent, struct thread_rb_node, rb_node)->thread;
-
-		if (thread__tid(th) == tid) {
-			res = thread__get(th);
-			break;
-		}
-
-		if (tid < thread__tid(th))
-			p = &(*p)->rb_left;
-		else
-			p = &(*p)->rb_right;
+	if (!res) {
+		if (hashmap__find(&table->shard, tid, &res))
+			res = thread__get(res);
 	}
 	up_read(&table->lock);
 	if (res)
@@ -118,49 +109,25 @@ struct thread *threads__find(struct threads *threads, pid_t tid)
 struct thread *threads__findnew(struct threads *threads, pid_t pid, pid_t tid, bool *created)
 {
 	struct threads_table_entry *table  = threads__table(threads, tid);
-	struct rb_node **p;
-	struct rb_node *parent = NULL;
 	struct thread *res = NULL;
-	struct thread_rb_node *nd;
-	bool leftmost = true;
 
 	*created = false;
 	down_write(&table->lock);
-	p = &table->entries.rb_root.rb_node;
-	while (*p != NULL) {
-		struct thread *th;
-
-		parent = *p;
-		th = rb_entry(parent, struct thread_rb_node, rb_node)->thread;
-
-		if (thread__tid(th) == tid) {
-			__threads_table_entry__set_last_match(table, th);
-			res = thread__get(th);
-			goto out_unlock;
-		}
-
-		if (tid < thread__tid(th))
-			p = &(*p)->rb_left;
-		else {
-			leftmost = false;
-			p = &(*p)->rb_right;
-		}
-	}
-	nd = malloc(sizeof(*nd));
-	if (nd == NULL)
-		goto out_unlock;
 	res = thread__new(pid, tid);
-	if (!res)
-		free(nd);
-	else {
-		*created = true;
-		nd->thread = thread__get(res);
-		rb_link_node(&nd->rb_node, parent, p);
-		rb_insert_color_cached(&nd->rb_node, &table->entries, leftmost);
-		++table->nr;
-		__threads_table_entry__set_last_match(table, res);
+	if (res) {
+		if (hashmap__add(&table->shard, tid, res)) {
+			/* Add failed. Assume a race so find other entry. */
+			thread__put(res);
+			res = NULL;
+			if (hashmap__find(&table->shard, tid, &res))
+				res = thread__get(res);
+		} else {
+			res = thread__get(res);
+			*created = true;
+		}
+		if (res)
+			__threads_table_entry__set_last_match(table, res);
 	}
-out_unlock:
 	up_write(&table->lock);
 	return res;
 }
@@ -169,57 +136,32 @@ void threads__remove_all_threads(struct threads *threads)
 {
 	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads_table_entry *table = &threads->table[i];
-		struct rb_node *nd;
+		struct hashmap_entry *cur, *tmp;
+		size_t bkt;
 
 		down_write(&table->lock);
 		__threads_table_entry__set_last_match(table, NULL);
-		nd = rb_first_cached(&table->entries);
-		while (nd) {
-			struct thread_rb_node *trb = rb_entry(nd, struct thread_rb_node, rb_node);
-
-			nd = rb_next(nd);
-			thread__put(trb->thread);
-			rb_erase_cached(&trb->rb_node, &table->entries);
-			RB_CLEAR_NODE(&trb->rb_node);
-			--table->nr;
+		hashmap__for_each_entry_safe((&table->shard), cur, tmp, bkt) {
+			struct thread *old_value;
 
-			free(trb);
+			hashmap__delete(&table->shard, cur->key, /*old_key=*/NULL, &old_value);
+			thread__put(old_value);
 		}
-		assert(table->nr == 0);
 		up_write(&table->lock);
 	}
 }
 
 void threads__remove(struct threads *threads, struct thread *thread)
 {
-	struct rb_node **p;
 	struct threads_table_entry *table  = threads__table(threads, thread__tid(thread));
-	pid_t tid = thread__tid(thread);
+	struct thread *old_value;
 
 	down_write(&table->lock);
 	if (table->last_match && RC_CHK_EQUAL(table->last_match, thread))
 		__threads_table_entry__set_last_match(table, NULL);
 
-	p = &table->entries.rb_root.rb_node;
-	while (*p != NULL) {
-		struct rb_node *parent = *p;
-		struct thread_rb_node *nd = rb_entry(parent, struct thread_rb_node, rb_node);
-		struct thread *th = nd->thread;
-
-		if (RC_CHK_EQUAL(th, thread)) {
-			thread__put(nd->thread);
-			rb_erase_cached(&nd->rb_node, &table->entries);
-			RB_CLEAR_NODE(&nd->rb_node);
-			--table->nr;
-			free(nd);
-			break;
-		}
-
-		if (tid < thread__tid(th))
-			p = &(*p)->rb_left;
-		else
-			p = &(*p)->rb_right;
-	}
+	hashmap__delete(&table->shard, thread__tid(thread), /*old_key=*/NULL, &old_value);
+	thread__put(old_value);
 	up_write(&table->lock);
 }
 
@@ -229,11 +171,11 @@ int threads__for_each_thread(struct threads *threads,
 {
 	for (int i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads_table_entry *table = &threads->table[i];
-		struct rb_node *nd;
+		struct hashmap_entry *cur;
+		size_t bkt;
 
-		for (nd = rb_first_cached(&table->entries); nd; nd = rb_next(nd)) {
-			struct thread_rb_node *trb = rb_entry(nd, struct thread_rb_node, rb_node);
-			int rc = fn(trb->thread, data);
+		hashmap__for_each_entry((&table->shard), cur, bkt) {
+			int rc = fn((struct thread *)cur->pvalue, data);
 
 			if (rc != 0)
 				return rc;
diff --git a/tools/perf/util/threads.h b/tools/perf/util/threads.h
index ed67de627578..d03bd91a7769 100644
--- a/tools/perf/util/threads.h
+++ b/tools/perf/util/threads.h
@@ -2,7 +2,7 @@
 #ifndef __PERF_THREADS_H
 #define __PERF_THREADS_H
 
-#include <linux/rbtree.h>
+#include "hashmap.h"
 #include "rwsem.h"
 
 struct thread;
@@ -11,9 +11,9 @@ struct thread;
 #define THREADS__TABLE_SIZE	(1 << THREADS__TABLE_BITS)
 
 struct threads_table_entry {
-	struct rb_root_cached  entries;
+	/* Key is tid, value is struct thread. */
+	struct hashmap	       shard;
 	struct rw_semaphore    lock;
-	unsigned int	       nr;
 	struct thread	       *last_match;
 };
 
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v7 13/25] perf threads: Reduce table size from 256 to 8
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (11 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 12/25] perf threads: Switch from rbtree to hashmap Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 14/25] perf dsos: Attempt to better abstract dsos internals Ian Rogers
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

The threads data structure is an array of hashmaps, previously
rbtrees. The two levels allow for a fixed outer array where access is
guarded by rw_semaphores. Commit 91e467bc568f ("perf machine: Use
hashtable for machine threads") sized the outer table at 256 entries
to avoid future scalability problems; however, this means the threads
struct is sized at 30,720 bytes. As the hashmaps allow O(1) access for
the common find/insert/remove operations, lower the number of entries
to 8. This reduces the size overhead to 960 bytes.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/threads.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/threads.h b/tools/perf/util/threads.h
index d03bd91a7769..da68d2223f18 100644
--- a/tools/perf/util/threads.h
+++ b/tools/perf/util/threads.h
@@ -7,7 +7,7 @@
 
 struct thread;
 
-#define THREADS__TABLE_BITS	8
+#define THREADS__TABLE_BITS	3
 #define THREADS__TABLE_SIZE	(1 << THREADS__TABLE_BITS)
 
 struct threads_table_entry {
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 14/25] perf dsos: Attempt to better abstract dsos internals
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (12 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 13/25] perf threads: Reduce table size from 256 to 8 Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 15/25] perf dsos: Tidy reference counting and locking Ian Rogers
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Move functions from machine and build-id to dsos. Pass the dsos
struct rather than its internal state. Rename some functions to
better reflect which data structure they operate on.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c |  2 +-
 tools/perf/builtin-record.c |  2 +-
 tools/perf/util/build-id.c  | 38 +---------------------------
 tools/perf/util/build-id.h  |  2 --
 tools/perf/util/dso.h       |  6 -----
 tools/perf/util/dsos.c      | 49 ++++++++++++++++++++++++++++++++++---
 tools/perf/util/dsos.h      | 19 +++++++++++---
 tools/perf/util/machine.c   | 40 ++++++------------------------
 tools/perf/util/machine.h   |  2 ++
 tools/perf/util/session.c   | 21 ++++++++++++++++
 tools/perf/util/session.h   |  2 ++
 11 files changed, 97 insertions(+), 86 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index eb3ef5c24b66..ef73317e6ae7 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -2122,7 +2122,7 @@ static int __cmd_inject(struct perf_inject *inject)
 		 */
 		if (perf_header__has_feat(&session->header, HEADER_BUILD_ID) &&
 		    inject->have_auxtrace && !inject->itrace_synth_opts.set)
-			dsos__hit_all(session);
+			perf_session__dsos_hit_all(session);
 		/*
 		 * The AUX areas have been removed and replaced with
 		 * synthesized hardware events, so clear the feature flag.
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a89013c44fd5..cdeba474eaf6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1787,7 +1787,7 @@ record__finish_output(struct record *rec)
 		process_buildids(rec);
 
 		if (rec->buildid_all)
-			dsos__hit_all(rec->session);
+			perf_session__dsos_hit_all(rec->session);
 	}
 	perf_session__write_header(rec->session, rec->evlist, fd, true);
 
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 03c64b85383b..a617b1917e6b 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -390,42 +390,6 @@ int perf_session__write_buildid_table(struct perf_session *session,
 	return err;
 }
 
-static int __dsos__hit_all(struct list_head *head)
-{
-	struct dso *pos;
-
-	list_for_each_entry(pos, head, node)
-		pos->hit = true;
-
-	return 0;
-}
-
-static int machine__hit_all_dsos(struct machine *machine)
-{
-	return __dsos__hit_all(&machine->dsos.head);
-}
-
-int dsos__hit_all(struct perf_session *session)
-{
-	struct rb_node *nd;
-	int err;
-
-	err = machine__hit_all_dsos(&session->machines.host);
-	if (err)
-		return err;
-
-	for (nd = rb_first_cached(&session->machines.guests); nd;
-	     nd = rb_next(nd)) {
-		struct machine *pos = rb_entry(nd, struct machine, rb_node);
-
-		err = machine__hit_all_dsos(pos);
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
 void disable_buildid_cache(void)
 {
 	no_buildid_cache = true;
@@ -992,7 +956,7 @@ int perf_session__cache_build_ids(struct perf_session *session)
 
 static bool machine__read_build_ids(struct machine *machine, bool with_hits)
 {
-	return __dsos__read_build_ids(&machine->dsos.head, with_hits);
+	return __dsos__read_build_ids(&machine->dsos, with_hits);
 }
 
 bool perf_session__read_build_ids(struct perf_session *session, bool with_hits)
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 4e3a1169379b..3fa8bffb07ca 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -39,8 +39,6 @@ int build_id__mark_dso_hit(struct perf_tool *tool, union perf_event *event,
 			   struct perf_sample *sample, struct evsel *evsel,
 			   struct machine *machine);
 
-int dsos__hit_all(struct perf_session *session);
-
 int perf_event__inject_buildid(struct perf_tool *tool, union perf_event *event,
 			       struct perf_sample *sample, struct evsel *evsel,
 			       struct machine *machine);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 33a41bcea335..2b9cf9177085 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -232,12 +232,6 @@ struct dso {
 #define dso__for_each_symbol(dso, pos, n)	\
 	symbols__for_each_entry(&(dso)->symbols, pos, n)
 
-#define dsos__for_each_with_build_id(pos, head)	\
-	list_for_each_entry(pos, head, node)	\
-		if (!pos->has_build_id)		\
-			continue;		\
-		else
-
 static inline void dso__set_loaded(struct dso *dso)
 {
 	dso->loaded = true;
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index cf80aa42dd07..e65ef6762bed 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -12,6 +12,35 @@
 #include <symbol.h> // filename__read_build_id
 #include <unistd.h>
 
+void dsos__init(struct dsos *dsos)
+{
+	INIT_LIST_HEAD(&dsos->head);
+	dsos->root = RB_ROOT;
+	init_rwsem(&dsos->lock);
+}
+
+static void dsos__purge(struct dsos *dsos)
+{
+	struct dso *pos, *n;
+
+	down_write(&dsos->lock);
+
+	list_for_each_entry_safe(pos, n, &dsos->head, node) {
+		RB_CLEAR_NODE(&pos->rb_node);
+		pos->root = NULL;
+		list_del_init(&pos->node);
+		dso__put(pos);
+	}
+
+	up_write(&dsos->lock);
+}
+
+void dsos__exit(struct dsos *dsos)
+{
+	dsos__purge(dsos);
+	exit_rwsem(&dsos->lock);
+}
+
 static int __dso_id__cmp(struct dso_id *a, struct dso_id *b)
 {
 	if (a->maj > b->maj) return -1;
@@ -73,8 +102,9 @@ int dso__cmp_id(struct dso *a, struct dso *b)
 	return __dso_id__cmp(&a->id, &b->id);
 }
 
-bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
+bool __dsos__read_build_ids(struct dsos *dsos, bool with_hits)
 {
+	struct list_head *head = &dsos->head;
 	bool have_build_id = false;
 	struct dso *pos;
 	struct nscookie nsc;
@@ -303,9 +333,10 @@ struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id
 	return dso;
 }
 
-size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
+size_t __dsos__fprintf_buildid(struct dsos *dsos, FILE *fp,
 			       bool (skip)(struct dso *dso, int parm), int parm)
 {
+	struct list_head *head = &dsos->head;
 	struct dso *pos;
 	size_t ret = 0;
 
@@ -320,8 +351,9 @@ size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
 	return ret;
 }
 
-size_t __dsos__fprintf(struct list_head *head, FILE *fp)
+size_t __dsos__fprintf(struct dsos *dsos, FILE *fp)
 {
+	struct list_head *head = &dsos->head;
 	struct dso *pos;
 	size_t ret = 0;
 
@@ -331,3 +363,14 @@ size_t __dsos__fprintf(struct list_head *head, FILE *fp)
 
 	return ret;
 }
+
+int __dsos__hit_all(struct dsos *dsos)
+{
+	struct list_head *head = &dsos->head;
+	struct dso *pos;
+
+	list_for_each_entry(pos, head, node)
+		pos->hit = true;
+
+	return 0;
+}
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index 5dbec2bc6966..1c81ddf07f8f 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -21,6 +21,15 @@ struct dsos {
 	struct rw_semaphore lock;
 };
 
+#define dsos__for_each_with_build_id(pos, head)	\
+	list_for_each_entry(pos, head, node)	\
+		if (!pos->has_build_id)		\
+			continue;		\
+		else
+
+void dsos__init(struct dsos *dsos);
+void dsos__exit(struct dsos *dsos);
+
 void __dsos__add(struct dsos *dsos, struct dso *dso);
 void dsos__add(struct dsos *dsos, struct dso *dso);
 struct dso *__dsos__addnew(struct dsos *dsos, const char *name);
@@ -28,13 +37,15 @@ struct dso *__dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
 
 struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id);
  
+bool __dsos__read_build_ids(struct dsos *dsos, bool with_hits);
+
 struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso *dso,
 						const char *name, struct dso_id *id);
 
-bool __dsos__read_build_ids(struct list_head *head, bool with_hits);
-
-size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp,
+size_t __dsos__fprintf_buildid(struct dsos *dsos, FILE *fp,
 			       bool (skip)(struct dso *dso, int parm), int parm);
-size_t __dsos__fprintf(struct list_head *head, FILE *fp);
+size_t __dsos__fprintf(struct dsos *dsos, FILE *fp);
+
+int __dsos__hit_all(struct dsos *dsos);
 
 #endif /* __PERF_DSOS */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e668a97255f8..d235d65fb35b 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -48,13 +48,6 @@ static struct dso *machine__kernel_dso(struct machine *machine)
 	return map__dso(machine->vmlinux_map);
 }
 
-static void dsos__init(struct dsos *dsos)
-{
-	INIT_LIST_HEAD(&dsos->head);
-	dsos->root = RB_ROOT;
-	init_rwsem(&dsos->lock);
-}
-
 static int machine__set_mmap_name(struct machine *machine)
 {
 	if (machine__is_host(machine))
@@ -165,28 +158,6 @@ struct machine *machine__new_kallsyms(void)
 	return machine;
 }
 
-static void dsos__purge(struct dsos *dsos)
-{
-	struct dso *pos, *n;
-
-	down_write(&dsos->lock);
-
-	list_for_each_entry_safe(pos, n, &dsos->head, node) {
-		RB_CLEAR_NODE(&pos->rb_node);
-		pos->root = NULL;
-		list_del_init(&pos->node);
-		dso__put(pos);
-	}
-
-	up_write(&dsos->lock);
-}
-
-static void dsos__exit(struct dsos *dsos)
-{
-	dsos__purge(dsos);
-	exit_rwsem(&dsos->lock);
-}
-
 void machine__delete_threads(struct machine *machine)
 {
 	threads__remove_all_threads(&machine->threads);
@@ -906,11 +877,11 @@ static struct map *machine__addnew_module_map(struct machine *machine, u64 start
 size_t machines__fprintf_dsos(struct machines *machines, FILE *fp)
 {
 	struct rb_node *nd;
-	size_t ret = __dsos__fprintf(&machines->host.dsos.head, fp);
+	size_t ret = __dsos__fprintf(&machines->host.dsos, fp);
 
 	for (nd = rb_first_cached(&machines->guests); nd; nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
-		ret += __dsos__fprintf(&pos->dsos.head, fp);
+		ret += __dsos__fprintf(&pos->dsos, fp);
 	}
 
 	return ret;
@@ -919,7 +890,7 @@ size_t machines__fprintf_dsos(struct machines *machines, FILE *fp)
 size_t machine__fprintf_dsos_buildid(struct machine *m, FILE *fp,
 				     bool (skip)(struct dso *dso, int parm), int parm)
 {
-	return __dsos__fprintf_buildid(&m->dsos.head, fp, skip, parm);
+	return __dsos__fprintf_buildid(&m->dsos, fp, skip, parm);
 }
 
 size_t machines__fprintf_dsos_buildid(struct machines *machines, FILE *fp,
@@ -3281,3 +3252,8 @@ bool machine__is_lock_function(struct machine *machine, u64 addr)
 
 	return false;
 }
+
+int machine__hit_all_dsos(struct machine *machine)
+{
+	return __dsos__hit_all(&machine->dsos);
+}
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index e28c787616fe..05927aa3e813 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -306,4 +306,6 @@ int machine__map_x86_64_entry_trampolines(struct machine *machine,
 int machine__resolve(struct machine *machine, struct addr_location *al,
 		     struct perf_sample *sample);
 
+int machine__hit_all_dsos(struct machine *machine);
+
 #endif /* __PERF_MACHINE_H */
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 199d3e8df315..e7b5d360a212 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2906,3 +2906,24 @@ int perf_event__process_id_index(struct perf_session *session,
 	}
 	return 0;
 }
+
+int perf_session__dsos_hit_all(struct perf_session *session)
+{
+	struct rb_node *nd;
+	int err;
+
+	err = machine__hit_all_dsos(&session->machines.host);
+	if (err)
+		return err;
+
+	for (nd = rb_first_cached(&session->machines.guests); nd;
+	     nd = rb_next(nd)) {
+		struct machine *pos = rb_entry(nd, struct machine, rb_node);
+
+		err = machine__hit_all_dsos(pos);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index ee3715e8563b..25c0d6c9cac9 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -154,6 +154,8 @@ int perf_session__deliver_synth_event(struct perf_session *session,
 				      union perf_event *event,
 				      struct perf_sample *sample);
 
+int perf_session__dsos_hit_all(struct perf_session *session);
+
 int perf_event__process_id_index(struct perf_session *session,
 				 union perf_event *event);
 
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 15/25] perf dsos: Tidy reference counting and locking
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (13 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 14/25] perf dsos: Attempt to better abstract dsos internals Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 16/25] perf dsos: Add dsos__for_each_dso Ian Rogers
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Move more functionality, generally from machine, into dsos.c,
renaming functions to match their new usage. Make the find function
always "get" before returning a dso. Reduce the scope of locks in
vdso to match this locking paradigm.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dsos.c    | 73 +++++++++++++++++++++++++++++++++++----
 tools/perf/util/dsos.h    |  9 ++++-
 tools/perf/util/machine.c | 62 ++-------------------------------
 tools/perf/util/map.c     |  4 +--
 tools/perf/util/vdso.c    | 48 +++++++++++--------------
 5 files changed, 97 insertions(+), 99 deletions(-)

diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index e65ef6762bed..d269e09005a7 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -181,7 +181,7 @@ struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso
 			 * at the end of the list of duplicates.
 			 */
 			if (!dso || (dso == this))
-				return this;	/* Find matching dso */
+				return dso__get(this);	/* Find matching dso */
 			/*
 			 * The core kernel DSOs may have duplicated long name.
 			 * In this case, the short name should be different.
@@ -253,15 +253,20 @@ static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, struct d
 	if (cmp_short) {
 		list_for_each_entry(pos, &dsos->head, node)
 			if (__dso__cmp_short_name(name, id, pos) == 0)
-				return pos;
+				return dso__get(pos);
 		return NULL;
 	}
 	return __dsos__findnew_by_longname_id(&dsos->root, name, id);
 }
 
-struct dso *__dsos__find(struct dsos *dsos, const char *name, bool cmp_short)
+struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short)
 {
-	return __dsos__find_id(dsos, name, NULL, cmp_short);
+	struct dso *res;
+
+	down_read(&dsos->lock);
+	res = __dsos__find_id(dsos, name, NULL, cmp_short);
+	up_read(&dsos->lock);
+	return res;
 }
 
 static void dso__set_basename(struct dso *dso)
@@ -303,8 +308,6 @@ static struct dso *__dsos__addnew_id(struct dsos *dsos, const char *name, struct
 	if (dso != NULL) {
 		__dsos__add(dsos, dso);
 		dso__set_basename(dso);
-		/* Put dso here because __dsos_add already got it */
-		dso__put(dso);
 	}
 	return dso;
 }
@@ -328,7 +331,7 @@ struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id
 {
 	struct dso *dso;
 	down_write(&dsos->lock);
-	dso = dso__get(__dsos__findnew_id(dsos, name, id));
+	dso = __dsos__findnew_id(dsos, name, id);
 	up_write(&dsos->lock);
 	return dso;
 }
@@ -374,3 +377,59 @@ int __dsos__hit_all(struct dsos *dsos)
 
 	return 0;
 }
+
+struct dso *dsos__findnew_module_dso(struct dsos *dsos,
+				     struct machine *machine,
+				     struct kmod_path *m,
+				     const char *filename)
+{
+	struct dso *dso;
+
+	down_write(&dsos->lock);
+
+	dso = __dsos__find_id(dsos, m->name, NULL, /*cmp_short=*/true);
+	if (!dso) {
+		dso = __dsos__addnew(dsos, m->name);
+		if (dso == NULL)
+			goto out_unlock;
+
+		dso__set_module_info(dso, m, machine);
+		dso__set_long_name(dso, strdup(filename), true);
+		dso->kernel = DSO_SPACE__KERNEL;
+	}
+
+out_unlock:
+	up_write(&dsos->lock);
+	return dso;
+}
+
+struct dso *dsos__find_kernel_dso(struct dsos *dsos)
+{
+	struct dso *dso, *res = NULL;
+
+	down_read(&dsos->lock);
+	list_for_each_entry(dso, &dsos->head, node) {
+		/*
+		 * The cpumode passed to is_kernel_module is not the cpumode of
+		 * *this* event. If we insist on passing correct cpumode to
+		 * is_kernel_module, we should record the cpumode when we adding
+		 * this dso to the linked list.
+		 *
+		 * However we don't really need passing correct cpumode.  We
+		 * know the correct cpumode must be kernel mode (if not, we
+		 * should not link it onto kernel_dsos list).
+		 *
+		 * Therefore, we pass PERF_RECORD_MISC_CPUMODE_UNKNOWN.
+		 * is_kernel_module() treats it as a kernel cpumode.
+		 */
+		if (!dso->kernel ||
+		    is_kernel_module(dso->long_name,
+				     PERF_RECORD_MISC_CPUMODE_UNKNOWN))
+			continue;
+
+		res = dso__get(dso);
+		break;
+	}
+	up_read(&dsos->lock);
+	return res;
+}
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index 1c81ddf07f8f..a7c7f723c5ff 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -10,6 +10,8 @@
 
 struct dso;
 struct dso_id;
+struct kmod_path;
+struct machine;
 
 /*
  * DSOs are put into both a list for fast iteration and rbtree for fast
@@ -33,7 +35,7 @@ void dsos__exit(struct dsos *dsos);
 void __dsos__add(struct dsos *dsos, struct dso *dso);
 void dsos__add(struct dsos *dsos, struct dso *dso);
 struct dso *__dsos__addnew(struct dsos *dsos, const char *name);
-struct dso *__dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
+struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
 
 struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id);
  
@@ -48,4 +50,9 @@ size_t __dsos__fprintf(struct dsos *dsos, FILE *fp);
 
 int __dsos__hit_all(struct dsos *dsos);
 
+struct dso *dsos__findnew_module_dso(struct dsos *dsos, struct machine *machine,
+				     struct kmod_path *m, const char *filename);
+
+struct dso *dsos__find_kernel_dso(struct dsos *dsos);
+
 #endif /* __PERF_DSOS */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index d235d65fb35b..8d0ea17e432a 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -645,31 +645,6 @@ int machine__process_lost_samples_event(struct machine *machine __maybe_unused,
 	return 0;
 }
 
-static struct dso *machine__findnew_module_dso(struct machine *machine,
-					       struct kmod_path *m,
-					       const char *filename)
-{
-	struct dso *dso;
-
-	down_write(&machine->dsos.lock);
-
-	dso = __dsos__find(&machine->dsos, m->name, true);
-	if (!dso) {
-		dso = __dsos__addnew(&machine->dsos, m->name);
-		if (dso == NULL)
-			goto out_unlock;
-
-		dso__set_module_info(dso, m, machine);
-		dso__set_long_name(dso, strdup(filename), true);
-		dso->kernel = DSO_SPACE__KERNEL;
-	}
-
-	dso__get(dso);
-out_unlock:
-	up_write(&machine->dsos.lock);
-	return dso;
-}
-
 int machine__process_aux_event(struct machine *machine __maybe_unused,
 			       union perf_event *event)
 {
@@ -853,7 +828,7 @@ static struct map *machine__addnew_module_map(struct machine *machine, u64 start
 	if (kmod_path__parse_name(&m, filename))
 		return NULL;
 
-	dso = machine__findnew_module_dso(machine, &m, filename);
+	dso = dsos__findnew_module_dso(&machine->dsos, machine, &m, filename);
 	if (dso == NULL)
 		goto out;
 
@@ -1662,40 +1637,7 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
 		 * Should be there already, from the build-id table in
 		 * the header.
 		 */
-		struct dso *kernel = NULL;
-		struct dso *dso;
-
-		down_read(&machine->dsos.lock);
-
-		list_for_each_entry(dso, &machine->dsos.head, node) {
-
-			/*
-			 * The cpumode passed to is_kernel_module is not the
-			 * cpumode of *this* event. If we insist on passing
-			 * correct cpumode to is_kernel_module, we should
-			 * record the cpumode when we adding this dso to the
-			 * linked list.
-			 *
-			 * However we don't really need passing correct
-			 * cpumode.  We know the correct cpumode must be kernel
-			 * mode (if not, we should not link it onto kernel_dsos
-			 * list).
-			 *
-			 * Therefore, we pass PERF_RECORD_MISC_CPUMODE_UNKNOWN.
-			 * is_kernel_module() treats it as a kernel cpumode.
-			 */
-
-			if (!dso->kernel ||
-			    is_kernel_module(dso->long_name,
-					     PERF_RECORD_MISC_CPUMODE_UNKNOWN))
-				continue;
-
-
-			kernel = dso__get(dso);
-			break;
-		}
-
-		up_read(&machine->dsos.lock);
+		struct dso *kernel = dsos__find_kernel_dso(&machine->dsos);
 
 		if (kernel == NULL)
 			kernel = machine__findnew_dso(machine, machine->mmap_name);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index cf5a15db3a1f..7c1fff9e413d 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -196,9 +196,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 			 * reading the header will have the build ID set and all future mmaps will
 			 * have it missing.
 			 */
-			down_read(&machine->dsos.lock);
-			header_bid_dso = __dsos__find(&machine->dsos, filename, false);
-			up_read(&machine->dsos.lock);
+			header_bid_dso = dsos__find(&machine->dsos, filename, false);
 			if (header_bid_dso && header_bid_dso->header_build_id) {
 				dso__set_build_id(dso, &header_bid_dso->bid);
 				dso->header_build_id = 1;
diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
index df8963796187..35532dcbff74 100644
--- a/tools/perf/util/vdso.c
+++ b/tools/perf/util/vdso.c
@@ -133,8 +133,6 @@ static struct dso *__machine__addnew_vdso(struct machine *machine, const char *s
 	if (dso != NULL) {
 		__dsos__add(&machine->dsos, dso);
 		dso__set_long_name(dso, long_name, false);
-		/* Put dso here because __dsos_add already got it */
-		dso__put(dso);
 	}
 
 	return dso;
@@ -252,17 +250,15 @@ static struct dso *__machine__findnew_compat(struct machine *machine,
 	const char *file_name;
 	struct dso *dso;
 
-	dso = __dsos__find(&machine->dsos, vdso_file->dso_name, true);
+	dso = dsos__find(&machine->dsos, vdso_file->dso_name, true);
 	if (dso)
-		goto out;
+		return dso;
 
 	file_name = vdso__get_compat_file(vdso_file);
 	if (!file_name)
-		goto out;
+		return NULL;
 
-	dso = __machine__addnew_vdso(machine, vdso_file->dso_name, file_name);
-out:
-	return dso;
+	return __machine__addnew_vdso(machine, vdso_file->dso_name, file_name);
 }
 
 static int __machine__findnew_vdso_compat(struct machine *machine,
@@ -308,21 +304,21 @@ static struct dso *machine__find_vdso(struct machine *machine,
 	dso_type = machine__thread_dso_type(machine, thread);
 	switch (dso_type) {
 	case DSO__TYPE_32BIT:
-		dso = __dsos__find(&machine->dsos, DSO__NAME_VDSO32, true);
+		dso = dsos__find(&machine->dsos, DSO__NAME_VDSO32, true);
 		if (!dso) {
-			dso = __dsos__find(&machine->dsos, DSO__NAME_VDSO,
-					   true);
+			dso = dsos__find(&machine->dsos, DSO__NAME_VDSO,
+					 true);
 			if (dso && dso_type != dso__type(dso, machine))
 				dso = NULL;
 		}
 		break;
 	case DSO__TYPE_X32BIT:
-		dso = __dsos__find(&machine->dsos, DSO__NAME_VDSOX32, true);
+		dso = dsos__find(&machine->dsos, DSO__NAME_VDSOX32, true);
 		break;
 	case DSO__TYPE_64BIT:
 	case DSO__TYPE_UNKNOWN:
 	default:
-		dso = __dsos__find(&machine->dsos, DSO__NAME_VDSO, true);
+		dso = dsos__find(&machine->dsos, DSO__NAME_VDSO, true);
 		break;
 	}
 
@@ -334,37 +330,33 @@ struct dso *machine__findnew_vdso(struct machine *machine,
 {
 	struct vdso_info *vdso_info;
 	struct dso *dso = NULL;
+	char *file;
 
-	down_write(&machine->dsos.lock);
 	if (!machine->vdso_info)
 		machine->vdso_info = vdso_info__new();
 
 	vdso_info = machine->vdso_info;
 	if (!vdso_info)
-		goto out_unlock;
+		return NULL;
 
 	dso = machine__find_vdso(machine, thread);
 	if (dso)
-		goto out_unlock;
+		return dso;
 
 #if BITS_PER_LONG == 64
 	if (__machine__findnew_vdso_compat(machine, thread, vdso_info, &dso))
-		goto out_unlock;
+		return dso;
 #endif
 
-	dso = __dsos__find(&machine->dsos, DSO__NAME_VDSO, true);
-	if (!dso) {
-		char *file;
+	dso = dsos__find(&machine->dsos, DSO__NAME_VDSO, true);
+	if (dso)
+		return dso;
 
-		file = get_file(&vdso_info->vdso);
-		if (file)
-			dso = __machine__addnew_vdso(machine, DSO__NAME_VDSO, file);
-	}
+	file = get_file(&vdso_info->vdso);
+	if (!file)
+		return NULL;
 
-out_unlock:
-	dso__get(dso);
-	up_write(&machine->dsos.lock);
-	return dso;
+	return __machine__addnew_vdso(machine, DSO__NAME_VDSO, file);
 }
 
 bool dso__is_vdso(struct dso *dso)
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v7 16/25] perf dsos: Add dsos__for_each_dso
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (14 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 15/25] perf dsos: Tidy reference counting and locking Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 17/25] perf dso: Move dso functions out of dsos Ian Rogers
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

To better abstract the dsos internals, add dsos__for_each_dso, which
invokes a callback on each dso. This also means the read lock can be
held correctly during iteration.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c | 25 +++++++-----
 tools/perf/util/build-id.c  | 76 ++++++++++++++++++++-----------------
 tools/perf/util/dsos.c      | 16 ++++++++
 tools/perf/util/dsos.h      |  8 +---
 tools/perf/util/machine.c   | 40 +++++++++++--------
 5 files changed, 100 insertions(+), 65 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index ef73317e6ae7..ce5e28eaad90 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -1187,23 +1187,28 @@ static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_
 					       process_build_id, machine);
 }
 
+static int guest_session__add_build_ids_cb(struct dso *dso, void *data)
+{
+	struct guest_session *gs = data;
+	struct perf_inject *inject = container_of(gs, struct perf_inject, guest_session);
+
+	if (!dso->has_build_id)
+		return 0;
+
+	return synthesize_build_id(inject, dso, gs->machine_pid);
+
+}
+
 static int guest_session__add_build_ids(struct guest_session *gs)
 {
 	struct perf_inject *inject = container_of(gs, struct perf_inject, guest_session);
-	struct machine *machine = &gs->session->machines.host;
-	struct dso *dso;
-	int ret;
 
 	/* Build IDs will be put in the Build ID feature section */
 	perf_header__set_feat(&inject->session->header, HEADER_BUILD_ID);
 
-	dsos__for_each_with_build_id(dso, &machine->dsos.head) {
-		ret = synthesize_build_id(inject, dso, gs->machine_pid);
-		if (ret)
-			return ret;
-	}
-
-	return 0;
+	return dsos__for_each_dso(&gs->session->machines.host.dsos,
+				  guest_session__add_build_ids_cb,
+				  gs);
 }
 
 static int guest_session__ksymbol_event(struct perf_tool *tool,
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index a617b1917e6b..a6d3c253f19f 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -327,48 +327,56 @@ static int write_buildid(const char *name, size_t name_len, struct build_id *bid
 	return write_padded(fd, name, name_len + 1, len);
 }
 
-static int machine__write_buildid_table(struct machine *machine,
-					struct feat_fd *fd)
+struct machine__write_buildid_table_cb_args {
+	struct machine *machine;
+	struct feat_fd *fd;
+	u16 kmisc, umisc;
+};
+
+static int machine__write_buildid_table_cb(struct dso *dso, void *data)
 {
-	int err = 0;
-	struct dso *pos;
-	u16 kmisc = PERF_RECORD_MISC_KERNEL,
-	    umisc = PERF_RECORD_MISC_USER;
+	struct machine__write_buildid_table_cb_args *args = data;
+	const char *name;
+	size_t name_len;
+	bool in_kernel = false;
 
-	if (!machine__is_host(machine)) {
-		kmisc = PERF_RECORD_MISC_GUEST_KERNEL;
-		umisc = PERF_RECORD_MISC_GUEST_USER;
-	}
+	if (!dso->has_build_id)
+		return 0;
 
-	dsos__for_each_with_build_id(pos, &machine->dsos.head) {
-		const char *name;
-		size_t name_len;
-		bool in_kernel = false;
+	if (!dso->hit && !dso__is_vdso(dso))
+		return 0;
 
-		if (!pos->hit && !dso__is_vdso(pos))
-			continue;
+	if (dso__is_vdso(dso)) {
+		name = dso->short_name;
+		name_len = dso->short_name_len;
+	} else if (dso__is_kcore(dso)) {
+		name = args->machine->mmap_name;
+		name_len = strlen(name);
+	} else {
+		name = dso->long_name;
+		name_len = dso->long_name_len;
+	}
 
-		if (dso__is_vdso(pos)) {
-			name = pos->short_name;
-			name_len = pos->short_name_len;
-		} else if (dso__is_kcore(pos)) {
-			name = machine->mmap_name;
-			name_len = strlen(name);
-		} else {
-			name = pos->long_name;
-			name_len = pos->long_name_len;
-		}
+	in_kernel = dso->kernel || is_kernel_module(name, PERF_RECORD_MISC_CPUMODE_UNKNOWN);
+	return write_buildid(name, name_len, &dso->bid, args->machine->pid,
+			     in_kernel ? args->kmisc : args->umisc, args->fd);
+}
 
-		in_kernel = pos->kernel ||
-				is_kernel_module(name,
-					PERF_RECORD_MISC_CPUMODE_UNKNOWN);
-		err = write_buildid(name, name_len, &pos->bid, machine->pid,
-				    in_kernel ? kmisc : umisc, fd);
-		if (err)
-			break;
+static int machine__write_buildid_table(struct machine *machine, struct feat_fd *fd)
+{
+	struct machine__write_buildid_table_cb_args args = {
+		.machine = machine,
+		.fd = fd,
+		.kmisc = PERF_RECORD_MISC_KERNEL,
+		.umisc = PERF_RECORD_MISC_USER,
+	};
+
+	if (!machine__is_host(machine)) {
+		args.kmisc = PERF_RECORD_MISC_GUEST_KERNEL;
+		args.umisc = PERF_RECORD_MISC_GUEST_USER;
 	}
 
-	return err;
+	return dsos__for_each_dso(&machine->dsos, machine__write_buildid_table_cb, &args);
 }
 
 int perf_session__write_buildid_table(struct perf_session *session,
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index d269e09005a7..d43f64939b12 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -433,3 +433,19 @@ struct dso *dsos__find_kernel_dso(struct dsos *dsos)
 	up_read(&dsos->lock);
 	return res;
 }
+
+int dsos__for_each_dso(struct dsos *dsos, int (*cb)(struct dso *dso, void *data), void *data)
+{
+	struct dso *dso;
+
+	down_read(&dsos->lock);
+	list_for_each_entry(dso, &dsos->head, node) {
+		int err;
+
+		err = cb(dso, data);
+		if (err)
+			return err;
+	}
+	up_read(&dsos->lock);
+	return 0;
+}
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index a7c7f723c5ff..317a263f0e37 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -23,12 +23,6 @@ struct dsos {
 	struct rw_semaphore lock;
 };
 
-#define dsos__for_each_with_build_id(pos, head)	\
-	list_for_each_entry(pos, head, node)	\
-		if (!pos->has_build_id)		\
-			continue;		\
-		else
-
 void dsos__init(struct dsos *dsos);
 void dsos__exit(struct dsos *dsos);
 
@@ -55,4 +49,6 @@ struct dso *dsos__findnew_module_dso(struct dsos *dsos, struct machine *machine,
 
 struct dso *dsos__find_kernel_dso(struct dsos *dsos);
 
+int dsos__for_each_dso(struct dsos *dsos, int (*cb)(struct dso *dso, void *data), void *data);
+
 #endif /* __PERF_DSOS */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 8d0ea17e432a..f1186a5bb73c 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1561,16 +1561,14 @@ int machine__create_kernel_maps(struct machine *machine)
 	return ret;
 }
 
-static bool machine__uses_kcore(struct machine *machine)
+static int machine__uses_kcore_cb(struct dso *dso, void *data __maybe_unused)
 {
-	struct dso *dso;
-
-	list_for_each_entry(dso, &machine->dsos.head, node) {
-		if (dso__is_kcore(dso))
-			return true;
-	}
+	return dso__is_kcore(dso) ? 1 : 0;
+}
 
-	return false;
+static bool machine__uses_kcore(struct machine *machine)
+{
+	return dsos__for_each_dso(&machine->dsos, machine__uses_kcore_cb, NULL) != 0;
 }
 
 static bool perf_event__is_extra_kernel_mmap(struct machine *machine,
@@ -3136,16 +3134,28 @@ char *machine__resolve_kernel_addr(void *vmachine, unsigned long long *addrp, ch
 	return sym->name;
 }
 
+struct machine__for_each_dso_cb_args {
+	struct machine *machine;
+	machine__dso_t fn;
+	void *priv;
+};
+
+static int machine__for_each_dso_cb(struct dso *dso, void *data)
+{
+	struct machine__for_each_dso_cb_args *args = data;
+
+	return args->fn(dso, args->machine, args->priv);
+}
+
 int machine__for_each_dso(struct machine *machine, machine__dso_t fn, void *priv)
 {
-	struct dso *pos;
-	int err = 0;
+	struct machine__for_each_dso_cb_args args = {
+		.machine = machine,
+		.fn = fn,
+		.priv = priv,
+	};
 
-	list_for_each_entry(pos, &machine->dsos.head, node) {
-		if (fn(pos, machine, priv))
-			err = -1;
-	}
-	return err;
+	return dsos__for_each_dso(&machine->dsos, machine__for_each_dso_cb, &args);
 }
 
 int machine__for_each_kernel_map(struct machine *machine, machine__map_t fn, void *priv)
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 17/25] perf dso: Move dso functions out of dsos
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (15 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 16/25] perf dsos: Add dsos__for_each_dso Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 18/25] perf dsos: Switch more loops to dsos__for_each_dso Ian Rogers
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)

Move dso and dso_id functions to dso.c to match the struct
declarations.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dso.c  | 61 ++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/dso.h  |  4 +++
 tools/perf/util/dsos.c | 61 ------------------------------------------
 3 files changed, 65 insertions(+), 61 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 22fd5fa806ed..69b9aa256776 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1269,6 +1269,67 @@ static void dso__set_long_name_id(struct dso *dso, const char *name, struct dso_
 		__dsos__findnew_link_by_longname_id(root, dso, NULL, id);
 }
 
+static int __dso_id__cmp(struct dso_id *a, struct dso_id *b)
+{
+	if (a->maj > b->maj) return -1;
+	if (a->maj < b->maj) return 1;
+
+	if (a->min > b->min) return -1;
+	if (a->min < b->min) return 1;
+
+	if (a->ino > b->ino) return -1;
+	if (a->ino < b->ino) return 1;
+
+	/*
+	 * Synthesized MMAP events have zero ino_generation, avoid comparing
+	 * them with MMAP events with actual ino_generation.
+	 *
+	 * I found it harmful because the mismatch resulted in a new
+	 * dso that did not have a build ID whereas the original dso did have a
+	 * build ID. The build ID was essential because the object was not found
+	 * otherwise. - Adrian
+	 */
+	if (a->ino_generation && b->ino_generation) {
+		if (a->ino_generation > b->ino_generation) return -1;
+		if (a->ino_generation < b->ino_generation) return 1;
+	}
+
+	return 0;
+}
+
+bool dso_id__empty(struct dso_id *id)
+{
+	if (!id)
+		return true;
+
+	return !id->maj && !id->min && !id->ino && !id->ino_generation;
+}
+
+void dso__inject_id(struct dso *dso, struct dso_id *id)
+{
+	dso->id.maj = id->maj;
+	dso->id.min = id->min;
+	dso->id.ino = id->ino;
+	dso->id.ino_generation = id->ino_generation;
+}
+
+int dso_id__cmp(struct dso_id *a, struct dso_id *b)
+{
+	/*
+	 * The second is always dso->id, so zeroes if not set, assume passing
+	 * NULL for a means a zeroed id
+	 */
+	if (dso_id__empty(a) || dso_id__empty(b))
+		return 0;
+
+	return __dso_id__cmp(a, b);
+}
+
+int dso__cmp_id(struct dso *a, struct dso *b)
+{
+	return __dso_id__cmp(&a->id, &b->id);
+}
+
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
 {
 	dso__set_long_name_id(dso, name, NULL, name_allocated);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 2b9cf9177085..7447d7a1942a 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -237,6 +237,9 @@ static inline void dso__set_loaded(struct dso *dso)
 	dso->loaded = true;
 }
 
+int dso_id__cmp(struct dso_id *a, struct dso_id *b);
+bool dso_id__empty(struct dso_id *id);
+
 struct dso *dso__new_id(const char *name, struct dso_id *id);
 struct dso *dso__new(const char *name);
 void dso__delete(struct dso *dso);
@@ -244,6 +247,7 @@ void dso__delete(struct dso *dso);
 int dso__cmp_id(struct dso *a, struct dso *b);
 void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated);
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated);
+void dso__inject_id(struct dso *dso, struct dso_id *id);
 
 int dso__name_len(const struct dso *dso);
 
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index d43f64939b12..f816927a21ff 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -41,67 +41,6 @@ void dsos__exit(struct dsos *dsos)
 	exit_rwsem(&dsos->lock);
 }
 
-static int __dso_id__cmp(struct dso_id *a, struct dso_id *b)
-{
-	if (a->maj > b->maj) return -1;
-	if (a->maj < b->maj) return 1;
-
-	if (a->min > b->min) return -1;
-	if (a->min < b->min) return 1;
-
-	if (a->ino > b->ino) return -1;
-	if (a->ino < b->ino) return 1;
-
-	/*
-	 * Synthesized MMAP events have zero ino_generation, avoid comparing
-	 * them with MMAP events with actual ino_generation.
-	 *
-	 * I found it harmful because the mismatch resulted in a new
-	 * dso that did not have a build ID whereas the original dso did have a
-	 * build ID. The build ID was essential because the object was not found
-	 * otherwise. - Adrian
-	 */
-	if (a->ino_generation && b->ino_generation) {
-		if (a->ino_generation > b->ino_generation) return -1;
-		if (a->ino_generation < b->ino_generation) return 1;
-	}
-
-	return 0;
-}
-
-static bool dso_id__empty(struct dso_id *id)
-{
-	if (!id)
-		return true;
-
-	return !id->maj && !id->min && !id->ino && !id->ino_generation;
-}
-
-static void dso__inject_id(struct dso *dso, struct dso_id *id)
-{
-	dso->id.maj = id->maj;
-	dso->id.min = id->min;
-	dso->id.ino = id->ino;
-	dso->id.ino_generation = id->ino_generation;
-}
-
-static int dso_id__cmp(struct dso_id *a, struct dso_id *b)
-{
-	/*
-	 * The second is always dso->id, so zeroes if not set, assume passing
-	 * NULL for a means a zeroed id
-	 */
-	if (dso_id__empty(a) || dso_id__empty(b))
-		return 0;
-
-	return __dso_id__cmp(a, b);
-}
-
-int dso__cmp_id(struct dso *a, struct dso *b)
-{
-	return __dso_id__cmp(&a->id, &b->id);
-}
-
 bool __dsos__read_build_ids(struct dsos *dsos, bool with_hits)
 {
 	struct list_head *head = &dsos->head;
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 18/25] perf dsos: Switch more loops to dsos__for_each_dso
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (16 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 17/25] perf dso: Move dso functions out of dsos Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 19/25] perf dsos: Switch backing storage to array from rbtree/list Ian Rogers
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)

Switch the loops within dsos.c to use dsos__for_each_dso, and add a
variant that doesn't take the lock for callers that already hold it.
Switch some previously unlocked loops to hold the read lock.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/build-id.c |   2 +-
 tools/perf/util/dsos.c     | 258 ++++++++++++++++++++++++-------------
 tools/perf/util/dsos.h     |   8 +-
 tools/perf/util/machine.c  |   8 +-
 4 files changed, 174 insertions(+), 102 deletions(-)

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index a6d3c253f19f..864bc26b6b46 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -964,7 +964,7 @@ int perf_session__cache_build_ids(struct perf_session *session)
 
 static bool machine__read_build_ids(struct machine *machine, bool with_hits)
 {
-	return __dsos__read_build_ids(&machine->dsos, with_hits);
+	return dsos__read_build_ids(&machine->dsos, with_hits);
 }
 
 bool perf_session__read_build_ids(struct perf_session *session, bool with_hits)
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index f816927a21ff..b7fbfb877ae3 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -41,38 +41,65 @@ void dsos__exit(struct dsos *dsos)
 	exit_rwsem(&dsos->lock);
 }
 
-bool __dsos__read_build_ids(struct dsos *dsos, bool with_hits)
+
+static int __dsos__for_each_dso(struct dsos *dsos,
+				int (*cb)(struct dso *dso, void *data),
+				void *data)
+{
+	struct dso *dso;
+
+	list_for_each_entry(dso, &dsos->head, node) {
+		int err;
+
+		err = cb(dso, data);
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
+struct dsos__read_build_ids_cb_args {
+	bool with_hits;
+	bool have_build_id;
+};
+
+static int dsos__read_build_ids_cb(struct dso *dso, void *data)
 {
-	struct list_head *head = &dsos->head;
-	bool have_build_id = false;
-	struct dso *pos;
+	struct dsos__read_build_ids_cb_args *args = data;
 	struct nscookie nsc;
 
-	list_for_each_entry(pos, head, node) {
-		if (with_hits && !pos->hit && !dso__is_vdso(pos))
-			continue;
-		if (pos->has_build_id) {
-			have_build_id = true;
-			continue;
-		}
-		nsinfo__mountns_enter(pos->nsinfo, &nsc);
-		if (filename__read_build_id(pos->long_name, &pos->bid) > 0) {
-			have_build_id	  = true;
-			pos->has_build_id = true;
-		} else if (errno == ENOENT && pos->nsinfo) {
-			char *new_name = dso__filename_with_chroot(pos, pos->long_name);
-
-			if (new_name && filename__read_build_id(new_name,
-								&pos->bid) > 0) {
-				have_build_id = true;
-				pos->has_build_id = true;
-			}
-			free(new_name);
+	if (args->with_hits && !dso->hit && !dso__is_vdso(dso))
+		return 0;
+	if (dso->has_build_id) {
+		args->have_build_id = true;
+		return 0;
+	}
+	nsinfo__mountns_enter(dso->nsinfo, &nsc);
+	if (filename__read_build_id(dso->long_name, &dso->bid) > 0) {
+		args->have_build_id = true;
+		dso->has_build_id = true;
+	} else if (errno == ENOENT && dso->nsinfo) {
+		char *new_name = dso__filename_with_chroot(dso, dso->long_name);
+
+		if (new_name && filename__read_build_id(new_name, &dso->bid) > 0) {
+			args->have_build_id = true;
+			dso->has_build_id = true;
 		}
-		nsinfo__mountns_exit(&nsc);
+		free(new_name);
 	}
+	nsinfo__mountns_exit(&nsc);
+	return 0;
+}
 
-	return have_build_id;
+bool dsos__read_build_ids(struct dsos *dsos, bool with_hits)
+{
+	struct dsos__read_build_ids_cb_args args = {
+		.with_hits = with_hits,
+		.have_build_id = false,
+	};
+
+	dsos__for_each_dso(dsos, dsos__read_build_ids_cb, &args);
+	return args.have_build_id;
 }
 
 static int __dso__cmp_long_name(const char *long_name, struct dso_id *id, struct dso *b)
@@ -105,6 +132,7 @@ struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso
 
 	if (!name)
 		name = dso->long_name;
+
 	/*
 	 * Find node with the matching name
 	 */
@@ -185,17 +213,40 @@ static struct dso *__dsos__findnew_by_longname_id(struct rb_root *root, const ch
 	return __dsos__findnew_link_by_longname_id(root, NULL, name, id);
 }
 
+struct dsos__find_id_cb_args {
+	const char *name;
+	struct dso_id *id;
+	struct dso *res;
+};
+
+static int dsos__find_id_cb(struct dso *dso, void *data)
+{
+	struct dsos__find_id_cb_args *args = data;
+
+	if (__dso__cmp_short_name(args->name, args->id, dso) == 0) {
+		args->res = dso__get(dso);
+		return 1;
+	}
+	return 0;
+
+}
+
 static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, struct dso_id *id, bool cmp_short)
 {
-	struct dso *pos;
+	struct dso *res;
 
 	if (cmp_short) {
-		list_for_each_entry(pos, &dsos->head, node)
-			if (__dso__cmp_short_name(name, id, pos) == 0)
-				return dso__get(pos);
-		return NULL;
+		struct dsos__find_id_cb_args args = {
+			.name = name,
+			.id = id,
+			.res = NULL,
+		};
+
+		__dsos__for_each_dso(dsos, dsos__find_id_cb, &args);
+		return args.res;
 	}
-	return __dsos__findnew_by_longname_id(&dsos->root, name, id);
+	res = __dsos__findnew_by_longname_id(&dsos->root, name, id);
+	return res;
 }
 
 struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short)
@@ -275,48 +326,74 @@ struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id
 	return dso;
 }
 
-size_t __dsos__fprintf_buildid(struct dsos *dsos, FILE *fp,
-			       bool (skip)(struct dso *dso, int parm), int parm)
-{
-	struct list_head *head = &dsos->head;
-	struct dso *pos;
-	size_t ret = 0;
+struct dsos__fprintf_buildid_cb_args {
+	FILE *fp;
+	bool (*skip)(struct dso *dso, int parm);
+	int parm;
+	size_t ret;
+};
 
-	list_for_each_entry(pos, head, node) {
-		char sbuild_id[SBUILD_ID_SIZE];
+static int dsos__fprintf_buildid_cb(struct dso *dso, void *data)
+{
+	struct dsos__fprintf_buildid_cb_args *args = data;
+	char sbuild_id[SBUILD_ID_SIZE];
 
-		if (skip && skip(pos, parm))
-			continue;
-		build_id__sprintf(&pos->bid, sbuild_id);
-		ret += fprintf(fp, "%-40s %s\n", sbuild_id, pos->long_name);
-	}
-	return ret;
+	if (args->skip && args->skip(dso, args->parm))
+		return 0;
+	build_id__sprintf(&dso->bid, sbuild_id);
+	args->ret += fprintf(args->fp, "%-40s %s\n", sbuild_id, dso->long_name);
+	return 0;
 }
 
-size_t __dsos__fprintf(struct dsos *dsos, FILE *fp)
+size_t dsos__fprintf_buildid(struct dsos *dsos, FILE *fp,
+			       bool (*skip)(struct dso *dso, int parm), int parm)
 {
-	struct list_head *head = &dsos->head;
-	struct dso *pos;
-	size_t ret = 0;
+	struct dsos__fprintf_buildid_cb_args args = {
+		.fp = fp,
+		.skip = skip,
+		.parm = parm,
+		.ret = 0,
+	};
+
+	dsos__for_each_dso(dsos, dsos__fprintf_buildid_cb, &args);
+	return args.ret;
+}
 
-	list_for_each_entry(pos, head, node) {
-		ret += dso__fprintf(pos, fp);
-	}
+struct dsos__fprintf_cb_args {
+	FILE *fp;
+	size_t ret;
+};
 
-	return ret;
+static int dsos__fprintf_cb(struct dso *dso, void *data)
+{
+	struct dsos__fprintf_cb_args *args = data;
+
+	args->ret += dso__fprintf(dso, args->fp);
+	return 0;
 }
 
-int __dsos__hit_all(struct dsos *dsos)
+size_t dsos__fprintf(struct dsos *dsos, FILE *fp)
 {
-	struct list_head *head = &dsos->head;
-	struct dso *pos;
+	struct dsos__fprintf_cb_args args = {
+		.fp = fp,
+		.ret = 0,
+	};
 
-	list_for_each_entry(pos, head, node)
-		pos->hit = true;
+	dsos__for_each_dso(dsos, dsos__fprintf_cb, &args);
+	return args.ret;
+}
 
+static int dsos__hit_all_cb(struct dso *dso, void *data __maybe_unused)
+{
+	dso->hit = true;
 	return 0;
 }
 
+int dsos__hit_all(struct dsos *dsos)
+{
+	return dsos__for_each_dso(dsos, dsos__hit_all_cb, NULL);
+}
+
 struct dso *dsos__findnew_module_dso(struct dsos *dsos,
 				     struct machine *machine,
 				     struct kmod_path *m,
@@ -342,49 +419,44 @@ struct dso *dsos__findnew_module_dso(struct dsos *dsos,
 	return dso;
 }
 
-struct dso *dsos__find_kernel_dso(struct dsos *dsos)
+static int dsos__find_kernel_dso_cb(struct dso *dso, void *data)
 {
-	struct dso *dso, *res = NULL;
+	struct dso **res = data;
+	/*
+	 * The cpumode passed to is_kernel_module() is not the cpumode of
+	 * *this* event. If we insisted on passing the correct cpumode to
+	 * is_kernel_module(), we would have to record the cpumode when adding
+	 * this dso to the linked list.
+	 *
+	 * However, we don't really need to pass the correct cpumode. We know
+	 * the correct cpumode must be kernel mode (if not, we should not link
+	 * it onto the kernel_dsos list).
+	 *
+	 * Therefore, we pass PERF_RECORD_MISC_CPUMODE_UNKNOWN.
+	 * is_kernel_module() treats it as a kernel cpumode.
+	 */
+	if (!dso->kernel ||
+	    is_kernel_module(dso->long_name, PERF_RECORD_MISC_CPUMODE_UNKNOWN))
+		return 0;
 
-	down_read(&dsos->lock);
-	list_for_each_entry(dso, &dsos->head, node) {
-		/*
-		 * The cpumode passed to is_kernel_module is not the cpumode of
-		 * *this* event. If we insist on passing correct cpumode to
-		 * is_kernel_module, we should record the cpumode when we adding
-		 * this dso to the linked list.
-		 *
-		 * However we don't really need passing correct cpumode.  We
-		 * know the correct cpumode must be kernel mode (if not, we
-		 * should not link it onto kernel_dsos list).
-		 *
-		 * Therefore, we pass PERF_RECORD_MISC_CPUMODE_UNKNOWN.
-		 * is_kernel_module() treats it as a kernel cpumode.
-		 */
-		if (!dso->kernel ||
-		    is_kernel_module(dso->long_name,
-				     PERF_RECORD_MISC_CPUMODE_UNKNOWN))
-			continue;
+	*res = dso__get(dso);
+	return 1;
+}
 
-		res = dso__get(dso);
-		break;
-	}
-	up_read(&dsos->lock);
+struct dso *dsos__find_kernel_dso(struct dsos *dsos)
+{
+	struct dso *res = NULL;
+
+	dsos__for_each_dso(dsos, dsos__find_kernel_dso_cb, &res);
 	return res;
 }
 
 int dsos__for_each_dso(struct dsos *dsos, int (*cb)(struct dso *dso, void *data), void *data)
 {
-	struct dso *dso;
+	int err;
 
 	down_read(&dsos->lock);
-	list_for_each_entry(dso, &dsos->head, node) {
-		int err;
-
-		err = cb(dso, data);
-		if (err)
-			return err;
-	}
+	err = __dsos__for_each_dso(dsos, cb, data);
 	up_read(&dsos->lock);
-	return 0;
+	return err;
 }
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index 317a263f0e37..50bd51523475 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -33,16 +33,16 @@ struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
 
 struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id);
  
-bool __dsos__read_build_ids(struct dsos *dsos, bool with_hits);
+bool dsos__read_build_ids(struct dsos *dsos, bool with_hits);
 
 struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso *dso,
 						const char *name, struct dso_id *id);
 
-size_t __dsos__fprintf_buildid(struct dsos *dsos, FILE *fp,
+size_t dsos__fprintf_buildid(struct dsos *dsos, FILE *fp,
 			       bool (skip)(struct dso *dso, int parm), int parm);
-size_t __dsos__fprintf(struct dsos *dsos, FILE *fp);
+size_t dsos__fprintf(struct dsos *dsos, FILE *fp);
 
-int __dsos__hit_all(struct dsos *dsos);
+int dsos__hit_all(struct dsos *dsos);
 
 struct dso *dsos__findnew_module_dso(struct dsos *dsos, struct machine *machine,
 				     struct kmod_path *m, const char *filename);
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f1186a5bb73c..0210c10e616b 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -852,11 +852,11 @@ static struct map *machine__addnew_module_map(struct machine *machine, u64 start
 size_t machines__fprintf_dsos(struct machines *machines, FILE *fp)
 {
 	struct rb_node *nd;
-	size_t ret = __dsos__fprintf(&machines->host.dsos, fp);
+	size_t ret = dsos__fprintf(&machines->host.dsos, fp);
 
 	for (nd = rb_first_cached(&machines->guests); nd; nd = rb_next(nd)) {
 		struct machine *pos = rb_entry(nd, struct machine, rb_node);
-		ret += __dsos__fprintf(&pos->dsos, fp);
+		ret += dsos__fprintf(&pos->dsos, fp);
 	}
 
 	return ret;
@@ -865,7 +865,7 @@ size_t machines__fprintf_dsos(struct machines *machines, FILE *fp)
 size_t machine__fprintf_dsos_buildid(struct machine *m, FILE *fp,
 				     bool (skip)(struct dso *dso, int parm), int parm)
 {
-	return __dsos__fprintf_buildid(&m->dsos, fp, skip, parm);
+	return dsos__fprintf_buildid(&m->dsos, fp, skip, parm);
 }
 
 size_t machines__fprintf_dsos_buildid(struct machines *machines, FILE *fp,
@@ -3207,5 +3207,5 @@ bool machine__is_lock_function(struct machine *machine, u64 addr)
 
 int machine__hit_all_dsos(struct machine *machine)
 {
-	return __dsos__hit_all(&machine->dsos);
+	return dsos__hit_all(&machine->dsos);
 }
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 19/25] perf dsos: Switch backing storage to array from rbtree/list
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (17 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 18/25] perf dsos: Switch more loops to dsos__for_each_dso Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 20/25] perf dsos: Remove __dsos__addnew Ian Rogers
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

DSOs were held on a list for fast iteration and in an rbtree for fast
finds. Switch to a lazily sorted array: iteration just walks the
array, and binary searches have the same complexity as searching the
rbtree. A find may first need to sort the array, which increases its
cost, but add operations become cheaper, so overall the complexity
should remain about the same.

The set-name operations on the dso now just record that the array is
no longer sorted, avoiding the complexity of rebalancing the
rbtree. Tighter locking discipline is enforced so that the array
cannot be re-sorted while long and short names or ids are being
changed.

Each array entry is smaller, replacing 6 pointers with 2, so even with
the extra space allocated in the array (which may be up to 50%
unoccupied), the memory saving should be at least 2x.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dso.c  |  67 +++++++++------
 tools/perf/util/dso.h  |  10 +--
 tools/perf/util/dsos.c | 188 ++++++++++++++++++++++++++---------------
 tools/perf/util/dsos.h |  21 +++--
 4 files changed, 177 insertions(+), 109 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 69b9aa256776..e96369fb490b 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1241,35 +1241,35 @@ struct dso *machine__findnew_kernel(struct machine *machine, const char *name,
 	return dso;
 }
 
-static void dso__set_long_name_id(struct dso *dso, const char *name, struct dso_id *id, bool name_allocated)
+static void dso__set_long_name_id(struct dso *dso, const char *name, bool name_allocated)
 {
-	struct rb_root *root = dso->root;
+	struct dsos *dsos = dso->dsos;
 
 	if (name == NULL)
 		return;
 
-	if (dso->long_name_allocated)
-		free((char *)dso->long_name);
-
-	if (root) {
-		rb_erase(&dso->rb_node, root);
+	if (dsos) {
 		/*
-		 * __dsos__findnew_link_by_longname_id() isn't guaranteed to
-		 * add it back, so a clean removal is required here.
+		 * Need to avoid re-sorting the dsos breaking by non-atomically
+		 * renaming the dso.
 		 */
-		RB_CLEAR_NODE(&dso->rb_node);
-		dso->root = NULL;
+		down_write(&dsos->lock);
 	}
 
+	if (dso->long_name_allocated)
+		free((char *)dso->long_name);
+
 	dso->long_name		 = name;
 	dso->long_name_len	 = strlen(name);
 	dso->long_name_allocated = name_allocated;
 
-	if (root)
-		__dsos__findnew_link_by_longname_id(root, dso, NULL, id);
+	if (dsos) {
+		dsos->sorted = false;
+		up_write(&dsos->lock);
+	}
 }
 
-static int __dso_id__cmp(struct dso_id *a, struct dso_id *b)
+static int __dso_id__cmp(const struct dso_id *a, const struct dso_id *b)
 {
 	if (a->maj > b->maj) return -1;
 	if (a->maj < b->maj) return 1;
@@ -1297,7 +1297,7 @@ static int __dso_id__cmp(struct dso_id *a, struct dso_id *b)
 	return 0;
 }
 
-bool dso_id__empty(struct dso_id *id)
+bool dso_id__empty(const struct dso_id *id)
 {
 	if (!id)
 		return true;
@@ -1305,15 +1305,22 @@ bool dso_id__empty(struct dso_id *id)
 	return !id->maj && !id->min && !id->ino && !id->ino_generation;
 }
 
-void dso__inject_id(struct dso *dso, struct dso_id *id)
+void __dso__inject_id(struct dso *dso, struct dso_id *id)
 {
+	struct dsos *dsos = dso->dsos;
+
+	/* dsos write lock held by caller. */
+
 	dso->id.maj = id->maj;
 	dso->id.min = id->min;
 	dso->id.ino = id->ino;
 	dso->id.ino_generation = id->ino_generation;
+
+	if (dsos)
+		dsos->sorted = false;
 }
 
-int dso_id__cmp(struct dso_id *a, struct dso_id *b)
+int dso_id__cmp(const struct dso_id *a, const struct dso_id *b)
 {
 	/*
 	 * The second is always dso->id, so zeroes if not set, assume passing
@@ -1332,20 +1339,34 @@ int dso__cmp_id(struct dso *a, struct dso *b)
 
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
 {
-	dso__set_long_name_id(dso, name, NULL, name_allocated);
+	dso__set_long_name_id(dso, name, name_allocated);
 }
 
 void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated)
 {
+	struct dsos *dsos = dso->dsos;
+
 	if (name == NULL)
 		return;
 
+	if (dsos) {
+		/*
+		 * Need to avoid re-sorting the dsos breaking by non-atomically
+		 * renaming the dso.
+		 */
+		down_write(&dsos->lock);
+	}
 	if (dso->short_name_allocated)
 		free((char *)dso->short_name);
 
 	dso->short_name		  = name;
 	dso->short_name_len	  = strlen(name);
 	dso->short_name_allocated = name_allocated;
+
+	if (dsos) {
+		dsos->sorted = false;
+		up_write(&dsos->lock);
+	}
 }
 
 int dso__name_len(const struct dso *dso)
@@ -1381,7 +1402,7 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 		strcpy(dso->name, name);
 		if (id)
 			dso->id = *id;
-		dso__set_long_name_id(dso, dso->name, id, false);
+		dso__set_long_name_id(dso, dso->name, false);
 		dso__set_short_name(dso, dso->name, false);
 		dso->symbols = RB_ROOT_CACHED;
 		dso->symbol_names = NULL;
@@ -1405,9 +1426,6 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 		dso->is_kmod = 0;
 		dso->needs_swap = DSO_SWAP__UNSET;
 		dso->comp = COMP_ID__NONE;
-		RB_CLEAR_NODE(&dso->rb_node);
-		dso->root = NULL;
-		INIT_LIST_HEAD(&dso->node);
 		INIT_LIST_HEAD(&dso->data.open_entry);
 		mutex_init(&dso->lock);
 		refcount_set(&dso->refcnt, 1);
@@ -1423,9 +1441,8 @@ struct dso *dso__new(const char *name)
 
 void dso__delete(struct dso *dso)
 {
-	if (!RB_EMPTY_NODE(&dso->rb_node))
-		pr_err("DSO %s is still in rbtree when being deleted!\n",
-		       dso->long_name);
+	if (dso->dsos)
+		pr_err("DSO %s is still in rbtree when being deleted!\n", dso->long_name);
 
 	/* free inlines first, as they reference symbols */
 	inlines__tree_delete(&dso->inlined_nodes);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 7447d7a1942a..2e227822f10c 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -146,9 +146,7 @@ struct auxtrace_cache;
 
 struct dso {
 	struct mutex	 lock;
-	struct list_head node;
-	struct rb_node	 rb_node;	/* rbtree node sorted by long name */
-	struct rb_root	 *root;		/* root of rbtree that rb_node is in */
+	struct dsos	 *dsos;
 	struct rb_root_cached symbols;
 	struct symbol	 **symbol_names;
 	size_t		 symbol_names_len;
@@ -237,8 +235,8 @@ static inline void dso__set_loaded(struct dso *dso)
 	dso->loaded = true;
 }
 
-int dso_id__cmp(struct dso_id *a, struct dso_id *b);
-bool dso_id__empty(struct dso_id *id);
+int dso_id__cmp(const struct dso_id *a, const struct dso_id *b);
+bool dso_id__empty(const struct dso_id *id);
 
 struct dso *dso__new_id(const char *name, struct dso_id *id);
 struct dso *dso__new(const char *name);
@@ -247,7 +245,7 @@ void dso__delete(struct dso *dso);
 int dso__cmp_id(struct dso *a, struct dso *b);
 void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated);
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated);
-void dso__inject_id(struct dso *dso, struct dso_id *id);
+void __dso__inject_id(struct dso *dso, struct dso_id *id);
 
 int dso__name_len(const struct dso *dso);
 
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index b7fbfb877ae3..cfc10e1a6802 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -14,24 +14,30 @@
 
 void dsos__init(struct dsos *dsos)
 {
-	INIT_LIST_HEAD(&dsos->head);
-	dsos->root = RB_ROOT;
 	init_rwsem(&dsos->lock);
+
+	dsos->cnt = 0;
+	dsos->allocated = 0;
+	dsos->dsos = NULL;
+	dsos->sorted = true;
 }
 
 static void dsos__purge(struct dsos *dsos)
 {
-	struct dso *pos, *n;
-
 	down_write(&dsos->lock);
 
-	list_for_each_entry_safe(pos, n, &dsos->head, node) {
-		RB_CLEAR_NODE(&pos->rb_node);
-		pos->root = NULL;
-		list_del_init(&pos->node);
-		dso__put(pos);
+	for (unsigned int i = 0; i < dsos->cnt; i++) {
+		struct dso *dso = dsos->dsos[i];
+
+		dso__put(dso);
+		dso->dsos = NULL;
 	}
 
+	zfree(&dsos->dsos);
+	dsos->cnt = 0;
+	dsos->allocated = 0;
+	dsos->sorted = true;
+
 	up_write(&dsos->lock);
 }
 
@@ -46,9 +52,8 @@ static int __dsos__for_each_dso(struct dsos *dsos,
 				int (*cb)(struct dso *dso, void *data),
 				void *data)
 {
-	struct dso *dso;
-
-	list_for_each_entry(dso, &dsos->head, node) {
+	for (unsigned int i = 0; i < dsos->cnt; i++) {
+		struct dso *dso = dsos->dsos[i];
 		int err;
 
 		err = cb(dso, data);
@@ -119,16 +124,47 @@ static int dso__cmp_short_name(struct dso *a, struct dso *b)
 	return __dso__cmp_short_name(a->short_name, &a->id, b);
 }
 
+static int dsos__cmp_long_name_id_short_name(const void *va, const void *vb)
+{
+	const struct dso *a = *((const struct dso **)va);
+	const struct dso *b = *((const struct dso **)vb);
+	int rc = strcmp(a->long_name, b->long_name);
+
+	if (!rc) {
+		rc = dso_id__cmp(&a->id, &b->id);
+		if (!rc)
+			rc = strcmp(a->short_name, b->short_name);
+	}
+	return rc;
+}
+
 /*
  * Find a matching entry and/or link current entry to RB tree.
  * Either one of the dso or name parameter must be non-NULL or the
  * function will not work.
  */
-struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso *dso,
-						const char *name, struct dso_id *id)
+struct dso *__dsos__findnew_link_by_longname_id(struct dsos *dsos,
+						struct dso *dso,
+						const char *name,
+						struct dso_id *id,
+						bool write_locked)
 {
-	struct rb_node **p = &root->rb_node;
-	struct rb_node  *parent = NULL;
+	int low = 0, high = dsos->cnt - 1;
+
+	if (!dsos->sorted) {
+		if (!write_locked) {
+			up_read(&dsos->lock);
+			down_write(&dsos->lock);
+			dso = __dsos__findnew_link_by_longname_id(dsos, dso, name, id,
+								  /*write_locked=*/true);
+			up_write(&dsos->lock);
+			down_read(&dsos->lock);
+			return dso;
+		}
+		qsort(dsos->dsos, dsos->cnt, sizeof(struct dso *),
+		      dsos__cmp_long_name_id_short_name);
+		dsos->sorted = true;
+	}
 
 	if (!name)
 		name = dso->long_name;
@@ -136,11 +172,11 @@ struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso
 	/*
 	 * Find node with the matching name
 	 */
-	while (*p) {
-		struct dso *this = rb_entry(*p, struct dso, rb_node);
+	while (low <= high) {
+		int mid = (low + high) / 2;
+		struct dso *this = dsos->dsos[mid];
 		int rc = __dso__cmp_long_name(name, id, this);
 
-		parent = *p;
 		if (rc == 0) {
 			/*
 			 * In case the new DSO is a duplicate of an existing
@@ -161,56 +197,53 @@ struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso
 			}
 		}
 		if (rc < 0)
-			p = &parent->rb_left;
+			high = mid - 1;
 		else
-			p = &parent->rb_right;
-	}
-	if (dso) {
-		/* Add new node and rebalance tree */
-		rb_link_node(&dso->rb_node, parent, p);
-		rb_insert_color(&dso->rb_node, root);
-		dso->root = root;
+			low = mid + 1;
 	}
+	if (dso)
+		__dsos__add(dsos, dso);
 	return NULL;
 }
 
-void __dsos__add(struct dsos *dsos, struct dso *dso)
+int __dsos__add(struct dsos *dsos, struct dso *dso)
 {
-	list_add_tail(&dso->node, &dsos->head);
-	__dsos__findnew_link_by_longname_id(&dsos->root, dso, NULL, &dso->id);
-	/*
-	 * It is now in the linked list, grab a reference, then garbage collect
-	 * this when needing memory, by looking at LRU dso instances in the
-	 * list with atomic_read(&dso->refcnt) == 1, i.e. no references
-	 * anywhere besides the one for the list, do, under a lock for the
-	 * list: remove it from the list, then a dso__put(), that probably will
-	 * be the last and will then call dso__delete(), end of life.
-	 *
-	 * That, or at the end of the 'struct machine' lifetime, when all
-	 * 'struct dso' instances will be removed from the list, in
-	 * dsos__exit(), if they have no other reference from some other data
-	 * structure.
-	 *
-	 * E.g.: after processing a 'perf.data' file and storing references
-	 * to objects instantiated while processing events, we will have
-	 * references to the 'thread', 'map', 'dso' structs all from 'struct
-	 * hist_entry' instances, but we may not need anything not referenced,
-	 * so we might as well call machines__exit()/machines__delete() and
-	 * garbage collect it.
-	 */
-	dso__get(dso);
+	if (dsos->cnt == dsos->allocated) {
+		unsigned int to_allocate = 2;
+		struct dso **temp;
+
+		if (dsos->allocated > 0)
+			to_allocate = dsos->allocated * 2;
+		temp = realloc(dsos->dsos, sizeof(struct dso *) * to_allocate);
+		if (!temp)
+			return -ENOMEM;
+		dsos->dsos = temp;
+		dsos->allocated = to_allocate;
+	}
+	dsos->dsos[dsos->cnt++] = dso__get(dso);
+	if (dsos->cnt >= 2 && dsos->sorted) {
+		dsos->sorted = dsos__cmp_long_name_id_short_name(&dsos->dsos[dsos->cnt - 2],
+								 &dsos->dsos[dsos->cnt - 1])
+			<= 0;
+	}
+	dso->dsos = dsos;
+	return 0;
 }
 
-void dsos__add(struct dsos *dsos, struct dso *dso)
+int dsos__add(struct dsos *dsos, struct dso *dso)
 {
+	int ret;
+
 	down_write(&dsos->lock);
-	__dsos__add(dsos, dso);
+	ret = __dsos__add(dsos, dso);
 	up_write(&dsos->lock);
+	return ret;
 }
 
-static struct dso *__dsos__findnew_by_longname_id(struct rb_root *root, const char *name, struct dso_id *id)
+static struct dso *__dsos__findnew_by_longname_id(struct dsos *dsos, const char *name,
+						struct dso_id *id, bool write_locked)
 {
-	return __dsos__findnew_link_by_longname_id(root, NULL, name, id);
+	return __dsos__findnew_link_by_longname_id(dsos, NULL, name, id, write_locked);
 }
 
 struct dsos__find_id_cb_args {
@@ -231,7 +264,8 @@ static int dsos__find_id_cb(struct dso *dso, void *data)
 
 }
 
-static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, struct dso_id *id, bool cmp_short)
+static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, struct dso_id *id,
+				   bool cmp_short, bool write_locked)
 {
 	struct dso *res;
 
@@ -245,7 +279,7 @@ static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, struct d
 		__dsos__for_each_dso(dsos, dsos__find_id_cb, &args);
 		return args.res;
 	}
-	res = __dsos__findnew_by_longname_id(&dsos->root, name, id);
+	res = __dsos__findnew_by_longname_id(dsos, name, id, write_locked);
 	return res;
 }
 
@@ -254,7 +288,7 @@ struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short)
 	struct dso *res;
 
 	down_read(&dsos->lock);
-	res = __dsos__find_id(dsos, name, NULL, cmp_short);
+	res = __dsos__find_id(dsos, name, NULL, cmp_short, /*write_locked=*/false);
 	up_read(&dsos->lock);
 	return res;
 }
@@ -296,8 +330,13 @@ static struct dso *__dsos__addnew_id(struct dsos *dsos, const char *name, struct
 	struct dso *dso = dso__new_id(name, id);
 
 	if (dso != NULL) {
-		__dsos__add(dsos, dso);
+		/*
+		 * The dsos lock is held on entry, so rename the dso before
+		 * adding it to avoid needing to take the dsos lock again to say
+		 * the array isn't sorted.
+		 */
 		dso__set_basename(dso);
+		__dsos__add(dsos, dso);
 	}
 	return dso;
 }
@@ -309,10 +348,10 @@ struct dso *__dsos__addnew(struct dsos *dsos, const char *name)
 
 static struct dso *__dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id)
 {
-	struct dso *dso = __dsos__find_id(dsos, name, id, false);
+	struct dso *dso = __dsos__find_id(dsos, name, id, false, /*write_locked=*/true);
 
 	if (dso && dso_id__empty(&dso->id) && !dso_id__empty(id))
-		dso__inject_id(dso, id);
+		__dso__inject_id(dso, id);
 
 	return dso ? dso : __dsos__addnew_id(dsos, name, id);
 }
@@ -403,18 +442,27 @@ struct dso *dsos__findnew_module_dso(struct dsos *dsos,
 
 	down_write(&dsos->lock);
 
-	dso = __dsos__find_id(dsos, m->name, NULL, /*cmp_short=*/true);
+	dso = __dsos__find_id(dsos, m->name, NULL, /*cmp_short=*/true, /*write_locked=*/true);
+	if (dso) {
+		up_write(&dsos->lock);
+		return dso;
+	}
+	/*
+	 * Failed to find the dso so create it. Change the name before adding it
+	 * to the array, to avoid unnecessary sorts and potential locking
+	 * issues.
+	 */
+	dso = dso__new_id(m->name, /*id=*/NULL);
 	if (!dso) {
-		dso = __dsos__addnew(dsos, m->name);
-		if (dso == NULL)
-			goto out_unlock;
-
-		dso__set_module_info(dso, m, machine);
-		dso__set_long_name(dso, strdup(filename), true);
-		dso->kernel = DSO_SPACE__KERNEL;
+		up_write(&dsos->lock);
+		return NULL;
 	}
+	dso__set_basename(dso);
+	dso__set_module_info(dso, m, machine);
+	dso__set_long_name(dso,	strdup(filename), true);
+	dso->kernel = DSO_SPACE__KERNEL;
+	__dsos__add(dsos, dso);
 
-out_unlock:
 	up_write(&dsos->lock);
 	return dso;
 }
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index 50bd51523475..c1b3979ad4bd 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -14,20 +14,22 @@ struct kmod_path;
 struct machine;
 
 /*
- * DSOs are put into both a list for fast iteration and rbtree for fast
- * long name lookup.
+ * Collection of DSOs as an array for iteration speed, but sorted for O(log n)
+ * lookup.
  */
 struct dsos {
-	struct list_head    head;
-	struct rb_root	    root;	/* rbtree root sorted by long name */
 	struct rw_semaphore lock;
+	struct dso **dsos;
+	unsigned int cnt;
+	unsigned int allocated;
+	bool sorted;
 };
 
 void dsos__init(struct dsos *dsos);
 void dsos__exit(struct dsos *dsos);
 
-void __dsos__add(struct dsos *dsos, struct dso *dso);
-void dsos__add(struct dsos *dsos, struct dso *dso);
+int __dsos__add(struct dsos *dsos, struct dso *dso);
+int dsos__add(struct dsos *dsos, struct dso *dso);
 struct dso *__dsos__addnew(struct dsos *dsos, const char *name);
 struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
 
@@ -35,8 +37,11 @@ struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id
  
 bool dsos__read_build_ids(struct dsos *dsos, bool with_hits);
 
-struct dso *__dsos__findnew_link_by_longname_id(struct rb_root *root, struct dso *dso,
-						const char *name, struct dso_id *id);
+struct dso *__dsos__findnew_link_by_longname_id(struct dsos *dsos,
+						struct dso *dso,
+						const char *name,
+						struct dso_id *id,
+						bool write_locked);
 
 size_t dsos__fprintf_buildid(struct dsos *dsos, FILE *fp,
 			       bool (skip)(struct dso *dso, int parm), int parm);
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 20/25] perf dsos: Remove __dsos__addnew
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (18 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 19/25] perf dsos: Switch backing storage to array from rbtree/list Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 21/25] perf dsos: Remove __dsos__findnew_link_by_longname_id Ian Rogers
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

The function is no longer used, so remove it.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dsos.c | 5 -----
 tools/perf/util/dsos.h | 1 -
 2 files changed, 6 deletions(-)

diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index cfc10e1a6802..1495ab1cd7a0 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -341,11 +341,6 @@ static struct dso *__dsos__addnew_id(struct dsos *dsos, const char *name, struct
 	return dso;
 }
 
-struct dso *__dsos__addnew(struct dsos *dsos, const char *name)
-{
-	return __dsos__addnew_id(dsos, name, NULL);
-}
-
 static struct dso *__dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id)
 {
 	struct dso *dso = __dsos__find_id(dsos, name, id, false, /*write_locked=*/true);
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index c1b3979ad4bd..d1497b11d64c 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -30,7 +30,6 @@ void dsos__exit(struct dsos *dsos);
 
 int __dsos__add(struct dsos *dsos, struct dso *dso);
 int dsos__add(struct dsos *dsos, struct dso *dso);
-struct dso *__dsos__addnew(struct dsos *dsos, const char *name);
 struct dso *dsos__find(struct dsos *dsos, const char *name, bool cmp_short);
 
 struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id *id);
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 21/25] perf dsos: Remove __dsos__findnew_link_by_longname_id
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (19 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 20/25] perf dsos: Remove __dsos__addnew Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 22/25] perf dsos: Switch hand code to bsearch Ian Rogers
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

The function was only called in dsos.c, always with the dso parameter
as NULL. Remove the function and specialize it for the dso-is-NULL
case, removing other now-unused functions along the way.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dsos.c | 51 +++++++++---------------------------------
 tools/perf/util/dsos.h |  6 -----
 2 files changed, 10 insertions(+), 47 deletions(-)

diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index 1495ab1cd7a0..e4110438841b 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -119,11 +119,6 @@ static int __dso__cmp_short_name(const char *short_name, struct dso_id *id, stru
 	return rc ?: dso_id__cmp(id, &b->id);
 }
 
-static int dso__cmp_short_name(struct dso *a, struct dso *b)
-{
-	return __dso__cmp_short_name(a->short_name, &a->id, b);
-}
-
 static int dsos__cmp_long_name_id_short_name(const void *va, const void *vb)
 {
 	const struct dso *a = *((const struct dso **)va);
@@ -143,20 +138,21 @@ static int dsos__cmp_long_name_id_short_name(const void *va, const void *vb)
  * Either one of the dso or name parameter must be non-NULL or the
  * function will not work.
  */
-struct dso *__dsos__findnew_link_by_longname_id(struct dsos *dsos,
-						struct dso *dso,
-						const char *name,
-						struct dso_id *id,
-						bool write_locked)
+static struct dso *__dsos__find_by_longname_id(struct dsos *dsos,
+					       const char *name,
+					       struct dso_id *id,
+					       bool write_locked)
 {
 	int low = 0, high = dsos->cnt - 1;
 
 	if (!dsos->sorted) {
 		if (!write_locked) {
+			struct dso *dso;
+
 			up_read(&dsos->lock);
 			down_write(&dsos->lock);
-			dso = __dsos__findnew_link_by_longname_id(dsos, dso, name, id,
-								  /*write_locked=*/true);
+			dso = __dsos__find_by_longname_id(dsos, name, id,
+							  /*write_locked=*/true);
 			up_write(&dsos->lock);
 			down_read(&dsos->lock);
 			return dso;
@@ -166,9 +162,6 @@ struct dso *__dsos__findnew_link_by_longname_id(struct dsos *dsos,
 		dsos->sorted = true;
 	}
 
-	if (!name)
-		name = dso->long_name;
-
 	/*
 	 * Find node with the matching name
 	 */
@@ -178,31 +171,13 @@ struct dso *__dsos__findnew_link_by_longname_id(struct dsos *dsos,
 		int rc = __dso__cmp_long_name(name, id, this);
 
 		if (rc == 0) {
-			/*
-			 * In case the new DSO is a duplicate of an existing
-			 * one, print a one-time warning & put the new entry
-			 * at the end of the list of duplicates.
-			 */
-			if (!dso || (dso == this))
-				return dso__get(this);	/* Find matching dso */
-			/*
-			 * The core kernel DSOs may have duplicated long name.
-			 * In this case, the short name should be different.
-			 * Comparing the short names to differentiate the DSOs.
-			 */
-			rc = dso__cmp_short_name(dso, this);
-			if (rc == 0) {
-				pr_err("Duplicated dso name: %s\n", name);
-				return NULL;
-			}
+			return dso__get(this);	/* Find matching dso */
 		}
 		if (rc < 0)
 			high = mid - 1;
 		else
 			low = mid + 1;
 	}
-	if (dso)
-		__dsos__add(dsos, dso);
 	return NULL;
 }
 
@@ -240,12 +215,6 @@ int dsos__add(struct dsos *dsos, struct dso *dso)
 	return ret;
 }
 
-static struct dso *__dsos__findnew_by_longname_id(struct dsos *dsos, const char *name,
-						struct dso_id *id, bool write_locked)
-{
-	return __dsos__findnew_link_by_longname_id(dsos, NULL, name, id, write_locked);
-}
-
 struct dsos__find_id_cb_args {
 	const char *name;
 	struct dso_id *id;
@@ -279,7 +248,7 @@ static struct dso *__dsos__find_id(struct dsos *dsos, const char *name, struct d
 		__dsos__for_each_dso(dsos, dsos__find_id_cb, &args);
 		return args.res;
 	}
-	res = __dsos__findnew_by_longname_id(dsos, name, id, write_locked);
+	res = __dsos__find_by_longname_id(dsos, name, id, write_locked);
 	return res;
 }
 
diff --git a/tools/perf/util/dsos.h b/tools/perf/util/dsos.h
index d1497b11d64c..6c13b65648bc 100644
--- a/tools/perf/util/dsos.h
+++ b/tools/perf/util/dsos.h
@@ -36,12 +36,6 @@ struct dso *dsos__findnew_id(struct dsos *dsos, const char *name, struct dso_id
  
 bool dsos__read_build_ids(struct dsos *dsos, bool with_hits);
 
-struct dso *__dsos__findnew_link_by_longname_id(struct dsos *dsos,
-						struct dso *dso,
-						const char *name,
-						struct dso_id *id,
-						bool write_locked);
-
 size_t dsos__fprintf_buildid(struct dsos *dsos, FILE *fp,
 			       bool (skip)(struct dso *dso, int parm), int parm);
 size_t dsos__fprintf(struct dsos *dsos, FILE *fp);
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 22/25] perf dsos: Switch hand code to bsearch
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (20 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 21/25] perf dsos: Remove __dsos__findnew_link_by_longname_id Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 23/25] perf dso: Add reference count checking and accessor functions Ian Rogers
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Switch to using the bsearch library function rather than a
hand-written binary search. Const-ify some static functions to avoid
compiler warnings.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dsos.c | 46 +++++++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index e4110438841b..23c3fe4f2abb 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -107,13 +107,15 @@ bool dsos__read_build_ids(struct dsos *dsos, bool with_hits)
 	return args.have_build_id;
 }
 
-static int __dso__cmp_long_name(const char *long_name, struct dso_id *id, struct dso *b)
+static int __dso__cmp_long_name(const char *long_name, const struct dso_id *id,
+				const struct dso *b)
 {
 	int rc = strcmp(long_name, b->long_name);
 	return rc ?: dso_id__cmp(id, &b->id);
 }
 
-static int __dso__cmp_short_name(const char *short_name, struct dso_id *id, struct dso *b)
+static int __dso__cmp_short_name(const char *short_name, const struct dso_id *id,
+				 const struct dso *b)
 {
 	int rc = strcmp(short_name, b->short_name);
 	return rc ?: dso_id__cmp(id, &b->id);
@@ -133,6 +135,19 @@ static int dsos__cmp_long_name_id_short_name(const void *va, const void *vb)
 	return rc;
 }
 
+struct dsos__key {
+	const char *long_name;
+	const struct dso_id *id;
+};
+
+static int dsos__cmp_key_long_name_id(const void *vkey, const void *vdso)
+{
+	const struct dsos__key *key = vkey;
+	const struct dso *dso = *((const struct dso **)vdso);
+
+	return __dso__cmp_long_name(key->long_name, key->id, dso);
+}
+
 /*
  * Find a matching entry and/or link current entry to RB tree.
  * Either one of the dso or name parameter must be non-NULL or the
@@ -143,7 +158,11 @@ static struct dso *__dsos__find_by_longname_id(struct dsos *dsos,
 					       struct dso_id *id,
 					       bool write_locked)
 {
-	int low = 0, high = dsos->cnt - 1;
+	struct dsos__key key = {
+		.long_name = name,
+		.id = id,
+	};
+	struct dso **res;
 
 	if (!dsos->sorted) {
 		if (!write_locked) {
@@ -162,23 +181,12 @@ static struct dso *__dsos__find_by_longname_id(struct dsos *dsos,
 		dsos->sorted = true;
 	}
 
-	/*
-	 * Find node with the matching name
-	 */
-	while (low <= high) {
-		int mid = (low + high) / 2;
-		struct dso *this = dsos->dsos[mid];
-		int rc = __dso__cmp_long_name(name, id, this);
+	res = bsearch(&key, dsos->dsos, dsos->cnt, sizeof(struct dso *),
+		      dsos__cmp_key_long_name_id);
+	if (!res)
+		return NULL;
 
-		if (rc == 0) {
-			return dso__get(this);	/* Find matching dso */
-		}
-		if (rc < 0)
-			high = mid - 1;
-		else
-			low = mid + 1;
-	}
-	return NULL;
+	return dso__get(*res);
 }
 
 int __dsos__add(struct dsos *dsos, struct dso *dso)
-- 
2.43.0.472.g3155946c3a-goog



* [PATCH v7 23/25] perf dso: Add reference count checking and accessor functions
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (21 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 22/25] perf dsos: Switch hand code to bsearch Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 24/25] perf dso: Reference counting related fixes Ian Rogers
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Add reference count checking to struct dso; this can help with
implementing a correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.

The majority of the change is mechanical in nature and not easy to
split up.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-annotate.c                 |   8 +-
 tools/perf/builtin-buildid-cache.c            |   2 +-
 tools/perf/builtin-buildid-list.c             |  18 +-
 tools/perf/builtin-inject.c                   |  71 ++-
 tools/perf/builtin-kallsyms.c                 |   2 +-
 tools/perf/builtin-mem.c                      |   4 +-
 tools/perf/builtin-report.c                   |   6 +-
 tools/perf/builtin-script.c                   |   8 +-
 tools/perf/builtin-top.c                      |   4 +-
 tools/perf/builtin-trace.c                    |   2 +-
 tools/perf/tests/code-reading.c               |   8 +-
 tools/perf/tests/dso-data.c                   |  11 +-
 tools/perf/tests/hists_common.c               |   6 +-
 tools/perf/tests/hists_cumulate.c             |   4 +-
 tools/perf/tests/hists_output.c               |   2 +-
 tools/perf/tests/maps.c                       |   4 +-
 tools/perf/tests/symbols.c                    |   2 +-
 tools/perf/tests/vmlinux-kallsyms.c           |   6 +-
 tools/perf/ui/browsers/annotate.c             |   6 +-
 tools/perf/ui/browsers/hists.c                |   8 +-
 tools/perf/ui/browsers/map.c                  |   4 +-
 tools/perf/util/annotate-data.c               |   6 +-
 tools/perf/util/annotate.c                    |  45 +-
 tools/perf/util/auxtrace.c                    |   2 +-
 tools/perf/util/block-info.c                  |   2 +-
 tools/perf/util/bpf-event.c                   |   8 +-
 tools/perf/util/build-id.c                    |  38 +-
 tools/perf/util/callchain.c                   |   2 +-
 tools/perf/util/data-convert-json.c           |   2 +-
 tools/perf/util/db-export.c                   |   6 +-
 tools/perf/util/dlfilter.c                    |  12 +-
 tools/perf/util/dso.c                         | 365 +++++++------
 tools/perf/util/dso.h                         | 483 ++++++++++++++++--
 tools/perf/util/dsos.c                        |  54 +-
 tools/perf/util/event.c                       |   8 +-
 tools/perf/util/header.c                      |   8 +-
 tools/perf/util/hist.c                        |   4 +-
 tools/perf/util/intel-pt.c                    |  22 +-
 tools/perf/util/machine.c                     |  46 +-
 tools/perf/util/map.c                         |  69 ++-
 tools/perf/util/maps.c                        |  14 +-
 tools/perf/util/probe-event.c                 |  25 +-
 .../util/scripting-engines/trace-event-perl.c |   6 +-
 .../scripting-engines/trace-event-python.c    |  21 +-
 tools/perf/util/sort.c                        |  19 +-
 tools/perf/util/srcline.c                     |  65 +--
 tools/perf/util/symbol-elf.c                  |  92 ++--
 tools/perf/util/symbol.c                      | 186 +++----
 tools/perf/util/symbol_fprintf.c              |   4 +-
 tools/perf/util/synthetic-events.c            |  24 +-
 tools/perf/util/thread.c                      |   4 +-
 tools/perf/util/unwind-libunwind-local.c      |  18 +-
 tools/perf/util/unwind-libunwind.c            |   2 +-
 tools/perf/util/vdso.c                        |   8 +-
 54 files changed, 1140 insertions(+), 716 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 6c1cc797692d..52260a14e38f 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -217,7 +217,7 @@ static int process_branch_callback(struct evsel *evsel,
 	}
 
 	if (a.map != NULL)
-		map__dso(a.map)->hit = 1;
+		dso__set_hit(map__dso(a.map));
 
 	hist__account_cycles(sample->branch_stack, al, sample, false, NULL);
 
@@ -252,7 +252,7 @@ static int evsel__add_sample(struct evsel *evsel, struct perf_sample *sample,
 		if (al->sym != NULL) {
 			struct dso *dso = map__dso(al->map);
 
-			rb_erase_cached(&al->sym->rb_node, &dso->symbols);
+			rb_erase_cached(&al->sym->rb_node, dso__symbols(dso));
 			symbol__delete(al->sym);
 			dso__reset_find_symbol_cache(dso);
 		}
@@ -341,7 +341,7 @@ static void print_annotated_data_header(struct hist_entry *he, struct evsel *evs
 	}
 
 	printf("Annotate type: '%s' in %s (%d samples):\n",
-	       he->mem_type->self.type_name, dso->name, nr_samples);
+		he->mem_type->self.type_name, dso__name(dso), nr_samples);
 
 	if (evsel__is_group_event(evsel)) {
 		struct evsel *pos;
@@ -487,7 +487,7 @@ static void hists__find_annotations(struct hists *hists,
 		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
 		struct annotation *notes;
 
-		if (he->ms.sym == NULL || map__dso(he->ms.map)->annotate_warned)
+		if (he->ms.sym == NULL || dso__annotate_warned(map__dso(he->ms.map)))
 			goto find_next;
 
 		if (ann->sym_hist_filter &&
diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index e2a40f1d9225..b0511d16aeb6 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -286,7 +286,7 @@ static bool dso__missing_buildid_cache(struct dso *dso, int parm __maybe_unused)
 
 		pr_warning("Problems with %s file, consider removing it from the cache\n",
 			   filename);
-	} else if (memcmp(dso->bid.data, bid.data, bid.size)) {
+	} else if (memcmp(dso__bid(dso)->data, bid.data, bid.size)) {
 		pr_warning("Problems with %s file, consider removing it from the cache\n",
 			   filename);
 	}
diff --git a/tools/perf/builtin-buildid-list.c b/tools/perf/builtin-buildid-list.c
index c9037477865a..383d5de36ce4 100644
--- a/tools/perf/builtin-buildid-list.c
+++ b/tools/perf/builtin-buildid-list.c
@@ -26,16 +26,18 @@ static int buildid__map_cb(struct map *map, void *arg __maybe_unused)
 {
 	const struct dso *dso = map__dso(map);
 	char bid_buf[SBUILD_ID_SIZE];
+	const char *dso_long_name = dso__long_name(dso);
+	const char *dso_short_name = dso__short_name(dso);
 
 	memset(bid_buf, 0, sizeof(bid_buf));
-	if (dso->has_build_id)
-		build_id__sprintf(&dso->bid, bid_buf);
+	if (dso__has_build_id(dso))
+		build_id__sprintf(dso__bid_const(dso), bid_buf);
 	printf("%s %16" PRIx64 " %16" PRIx64, bid_buf, map__start(map), map__end(map));
-	if (dso->long_name != NULL) {
-		printf(" %s", dso->long_name);
-	} else if (dso->short_name != NULL) {
-		printf(" %s", dso->short_name);
-	}
+	if (dso_long_name != NULL)
+		printf(" %s", dso_long_name);
+	else if (dso_short_name != NULL)
+		printf(" %s", dso_short_name);
+
 	printf("\n");
 
 	return 0;
@@ -76,7 +78,7 @@ static int filename__fprintf_build_id(const char *name, FILE *fp)
 
 static bool dso__skip_buildid(struct dso *dso, int with_hits)
 {
-	return with_hits && !dso->hit;
+	return with_hits && !dso__hit(dso);
 }
 
 static int perf_session__list_build_ids(bool force, bool with_hits)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index ce5e28eaad90..a212678d47be 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -445,10 +445,9 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
 	}
 
 	if (dso) {
-		mutex_lock(&dso->lock);
-		nsinfo__put(dso->nsinfo);
-		dso->nsinfo = nsi;
-		mutex_unlock(&dso->lock);
+		mutex_lock(dso__lock(dso));
+		dso__set_nsinfo(dso, nsi);
+		mutex_unlock(dso__lock(dso));
 	} else
 		nsinfo__put(nsi);
 
@@ -466,8 +465,8 @@ static int perf_event__repipe_buildid_mmap(struct perf_tool *tool,
 	dso = findnew_dso(event->mmap.pid, event->mmap.tid,
 			  event->mmap.filename, NULL, machine);
 
-	if (dso && !dso->hit) {
-		dso->hit = 1;
+	if (dso && !dso__hit(dso)) {
+		dso__set_hit(dso);
 		dso__inject_build_id(dso, tool, machine, sample->cpumode, 0);
 	}
 	dso__put(dso);
@@ -492,7 +491,7 @@ static int perf_event__repipe_mmap2(struct perf_tool *tool,
 				  event->mmap2.filename, NULL, machine);
 		if (dso) {
 			/* mark it not to inject build-id */
-			dso->hit = 1;
+			dso__set_hit(dso);
 		}
 		dso__put(dso);
 	}
@@ -544,7 +543,7 @@ static int perf_event__repipe_buildid_mmap2(struct perf_tool *tool,
 				  event->mmap2.filename, NULL, machine);
 		if (dso) {
 			/* mark it not to inject build-id */
-			dso->hit = 1;
+			dso__set_hit(dso);
 		}
 		dso__put(dso);
 		perf_event__repipe(tool, event, sample, machine);
@@ -554,8 +553,8 @@ static int perf_event__repipe_buildid_mmap2(struct perf_tool *tool,
 	dso = findnew_dso(event->mmap2.pid, event->mmap2.tid,
 			  event->mmap2.filename, &dso_id, machine);
 
-	if (dso && !dso->hit) {
-		dso->hit = 1;
+	if (dso && !dso__hit(dso)) {
+		dso__set_hit(dso);
 		dso__inject_build_id(dso, tool, machine, sample->cpumode,
 				     event->mmap2.flags);
 	}
@@ -631,24 +630,24 @@ static int dso__read_build_id(struct dso *dso)
 {
 	struct nscookie nsc;
 
-	if (dso->has_build_id)
+	if (dso__has_build_id(dso))
 		return 0;
 
-	mutex_lock(&dso->lock);
-	nsinfo__mountns_enter(dso->nsinfo, &nsc);
-	if (filename__read_build_id(dso->long_name, &dso->bid) > 0)
-		dso->has_build_id = true;
-	else if (dso->nsinfo) {
-		char *new_name = dso__filename_with_chroot(dso, dso->long_name);
+	mutex_lock(dso__lock(dso));
+	nsinfo__mountns_enter(dso__nsinfo(dso), &nsc);
+	if (filename__read_build_id(dso__long_name(dso), dso__bid(dso)) > 0)
+		dso__set_has_build_id(dso);
+	else if (dso__nsinfo(dso)) {
+		char *new_name = dso__filename_with_chroot(dso, dso__long_name(dso));
 
-		if (new_name && filename__read_build_id(new_name, &dso->bid) > 0)
-			dso->has_build_id = true;
+		if (new_name && filename__read_build_id(new_name, dso__bid(dso)) > 0)
+			dso__set_has_build_id(dso);
 		free(new_name);
 	}
 	nsinfo__mountns_exit(&nsc);
-	mutex_unlock(&dso->lock);
+	mutex_unlock(dso__lock(dso));
 
-	return dso->has_build_id ? 0 : -1;
+	return dso__has_build_id(dso) ? 0 : -1;
 }
 
 static struct strlist *perf_inject__parse_known_build_ids(
@@ -700,14 +699,14 @@ static bool perf_inject__lookup_known_build_id(struct perf_inject *inject,
 		dso_name = strchr(build_id, ' ');
 		bid_len = dso_name - pos->s;
 		dso_name = skip_spaces(dso_name);
-		if (strcmp(dso->long_name, dso_name))
+		if (strcmp(dso__long_name(dso), dso_name))
 			continue;
 		for (int ix = 0; 2 * ix + 1 < bid_len; ++ix) {
-			dso->bid.data[ix] = (hex(build_id[2 * ix]) << 4 |
-					     hex(build_id[2 * ix + 1]));
+			dso__bid(dso)->data[ix] = (hex(build_id[2 * ix]) << 4 |
+						  hex(build_id[2 * ix + 1]));
 		}
-		dso->bid.size = bid_len / 2;
-		dso->has_build_id = 1;
+		dso__bid(dso)->size = bid_len / 2;
+		dso__set_has_build_id(dso);
 		return true;
 	}
 	return false;
@@ -720,9 +719,9 @@ static int dso__inject_build_id(struct dso *dso, struct perf_tool *tool,
 						  tool);
 	int err;
 
-	if (is_anon_memory(dso->long_name) || flags & MAP_HUGETLB)
+	if (is_anon_memory(dso__long_name(dso)) || flags & MAP_HUGETLB)
 		return 0;
-	if (is_no_dso_memory(dso->long_name))
+	if (is_no_dso_memory(dso__long_name(dso)))
 		return 0;
 
 	if (inject->known_build_ids != NULL &&
@@ -730,14 +729,14 @@ static int dso__inject_build_id(struct dso *dso, struct perf_tool *tool,
 		return 1;
 
 	if (dso__read_build_id(dso) < 0) {
-		pr_debug("no build_id found for %s\n", dso->long_name);
+		pr_debug("no build_id found for %s\n", dso__long_name(dso));
 		return -1;
 	}
 
 	err = perf_event__synthesize_build_id(tool, dso, cpumode,
 					      perf_event__repipe, machine);
 	if (err) {
-		pr_err("Can't synthesize build_id event for %s\n", dso->long_name);
+		pr_err("Can't synthesize build_id event for %s\n", dso__long_name(dso));
 		return -1;
 	}
 
@@ -763,8 +762,8 @@ int perf_event__inject_buildid(struct perf_tool *tool, union perf_event *event,
 	if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
 		struct dso *dso = map__dso(al.map);
 
-		if (!dso->hit) {
-			dso->hit = 1;
+		if (!dso__hit(dso)) {
+			dso__set_hit(dso);
 			dso__inject_build_id(dso, tool, machine,
 					     sample->cpumode, map__flags(al.map));
 		}
@@ -1146,8 +1145,8 @@ static bool dso__is_in_kernel_space(struct dso *dso)
 		return false;
 
 	return dso__is_kcore(dso) ||
-	       dso->kernel ||
-	       is_kernel_module(dso->long_name, PERF_RECORD_MISC_CPUMODE_UNKNOWN);
+	       dso__kernel(dso) ||
+	       is_kernel_module(dso__long_name(dso), PERF_RECORD_MISC_CPUMODE_UNKNOWN);
 }
 
 static u64 evlist__first_id(struct evlist *evlist)
@@ -1181,7 +1180,7 @@ static int synthesize_build_id(struct perf_inject *inject, struct dso *dso, pid_
 	if (!machine)
 		return -ENOMEM;
 
-	dso->hit = 1;
+	dso__set_hit(dso);
 
 	return perf_event__synthesize_build_id(&inject->tool, dso, cpumode,
 					       process_build_id, machine);
@@ -1192,7 +1191,7 @@ static int guest_session__add_build_ids_cb(struct dso *dso, void *data)
 	struct guest_session *gs = data;
 	struct perf_inject *inject = container_of(gs, struct perf_inject, guest_session);
 
-	if (!dso->has_build_id)
+	if (!dso__has_build_id(dso))
 		return 0;
 
 	return synthesize_build_id(inject, dso, gs->machine_pid);
diff --git a/tools/perf/builtin-kallsyms.c b/tools/perf/builtin-kallsyms.c
index 7f75c5b73f26..a3c2ffdc1af8 100644
--- a/tools/perf/builtin-kallsyms.c
+++ b/tools/perf/builtin-kallsyms.c
@@ -38,7 +38,7 @@ static int __cmd_kallsyms(int argc, const char **argv)
 
 		dso = map__dso(map);
 		printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
-			symbol->name, dso->short_name, dso->long_name,
+			symbol->name, dso__short_name(dso), dso__long_name(dso),
 			map__unmap_ip(map, symbol->start), map__unmap_ip(map, symbol->end),
 			symbol->start, symbol->end);
 	}
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 51499c20da01..7c2f16d25a71 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -213,7 +213,7 @@ dump_raw_samples(struct perf_tool *tool,
 	if (al.map != NULL) {
 		dso = map__dso(al.map);
 		if (dso)
-			dso->hit = 1;
+			dso__set_hit(dso);
 	}
 
 	field_sep = symbol_conf.field_sep;
@@ -255,7 +255,7 @@ dump_raw_samples(struct perf_tool *tool,
 		symbol_conf.field_sep,
 		sample->data_src,
 		symbol_conf.field_sep,
-		dso ? dso->long_name : "???",
+		dso ? dso__long_name(dso) : "???",
 		al.sym ? al.sym->name : "???");
 out_put:
 	addr_location__exit(&al);
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ed0cc813cebb..aeb29f14974a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -322,7 +322,7 @@ static int process_sample_event(struct perf_tool *tool,
 	}
 
 	if (al.map != NULL)
-		map__dso(al.map)->hit = 1;
+		dso__set_hit(map__dso(al.map));
 
 	if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode) {
 		hist__account_cycles(sample->branch_stack, &al, sample,
@@ -609,7 +609,7 @@ static void report__warn_kptr_restrict(const struct report *rep)
 		return;
 
 	if (kernel_map == NULL ||
-	     (map__dso(kernel_map)->hit &&
+	    (dso__hit(map__dso(kernel_map)) &&
 	     (kernel_kmap->ref_reloc_sym == NULL ||
 	      kernel_kmap->ref_reloc_sym->addr == 0))) {
 		const char *desc =
@@ -850,7 +850,7 @@ static int maps__fprintf_task_cb(struct map *map, void *data)
 		prot & PROT_EXEC ? 'x' : '-',
 		map__flags(map) ? 's' : 'p',
 		map__pgoff(map),
-		dso->id.ino, dso->name);
+		dso__id_const(dso)->ino, dso__name(dso));
 
 	if (ret < 0)
 		return ret;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index b1f57401ff23..e31333c5ebd2 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1011,11 +1011,11 @@ static int perf_sample__fprintf_brstackoff(struct perf_sample *sample,
 		to   = entries[i].to;
 
 		if (thread__find_map_fb(thread, sample->cpumode, from, &alf) &&
-		    !map__dso(alf.map)->adjust_symbols)
+		    !dso__adjust_symbols(map__dso(alf.map)))
 			from = map__dso_map_ip(alf.map, from);
 
 		if (thread__find_map_fb(thread, sample->cpumode, to, &alt) &&
-		    !map__dso(alt.map)->adjust_symbols)
+		    !dso__adjust_symbols(map__dso(alt.map)))
 			to = map__dso_map_ip(alt.map, to);
 
 		printed += fprintf(fp, " 0x%"PRIx64, from);
@@ -1076,7 +1076,7 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
 		pr_debug("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n", start, end);
 		goto out;
 	}
-	if (dso->data.status == DSO_DATA_STATUS_ERROR) {
+	if (dso__data(dso)->status == DSO_DATA_STATUS_ERROR) {
 		pr_debug("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n", start, end);
 		goto out;
 	}
@@ -1088,7 +1088,7 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
 	len = dso__data_read_offset(dso, machine, offset, (u8 *)buffer,
 				    end - start + MAXINSN);
 
-	*is64bit = dso->is_64_bit;
+	*is64bit = dso__is_64_bit(dso);
 	if (len <= 0)
 		pr_debug("\tcannot fetch code for block at %" PRIx64 "-%" PRIx64 "\n",
 			start, end);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index baf1ab083436..86edc0c3f525 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -129,7 +129,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
 	/*
 	 * We can't annotate with just /proc/kallsyms
 	 */
-	if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS && !dso__is_kcore(dso)) {
+	if (dso__symtab_type(dso) == DSO_BINARY_TYPE__KALLSYMS && !dso__is_kcore(dso)) {
 		pr_err("Can't annotate %s: No vmlinux file was found in the "
 		       "path\n", sym->name);
 		sleep(1);
@@ -182,7 +182,7 @@ static void ui__warn_map_erange(struct map *map, struct symbol *sym, u64 ip)
 		    "Tools:  %s\n\n"
 		    "Not all samples will be on the annotation output.\n\n"
 		    "Please report to linux-kernel@vger.kernel.org\n",
-		    ip, dso->long_name, dso__symtab_origin(dso),
+		    ip, dso__long_name(dso), dso__symtab_origin(dso),
 		    map__start(map), map__end(map), sym->start, sym->end,
 		    sym->binding == STB_GLOBAL ? 'g' :
 		    sym->binding == STB_LOCAL  ? 'l' : 'w', sym->name,
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 90eaff8c0f6e..81300965d60d 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2903,7 +2903,7 @@ static void print_location(FILE *f, struct perf_sample *sample,
 {
 
 	if ((verbose > 0 || print_dso) && al->map)
-		fprintf(f, "%s@", map__dso(al->map)->long_name);
+		fprintf(f, "%s@", dso__long_name(map__dso(al->map)));
 
 	if ((verbose > 0 || print_sym) && al->sym)
 		fprintf(f, "%s+0x%" PRIx64, al->sym->name,
diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index 7a3a7bbbec71..6208f58622d1 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -253,9 +253,9 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode,
 		goto out;
 	}
 	dso = map__dso(al.map);
-	pr_debug("File is: %s\n", dso->long_name);
+	pr_debug("File is: %s\n", dso__long_name(dso));
 
-	if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS && !dso__is_kcore(dso)) {
+	if (dso__symtab_type(dso) == DSO_BINARY_TYPE__KALLSYMS && !dso__is_kcore(dso)) {
 		pr_debug("Unexpected kernel address - skipping\n");
 		goto out;
 	}
@@ -274,7 +274,7 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode,
 	 * modules to manage long jumps. Check if the ip offset falls in stubs
 	 * sections for kernel modules. And skip module address after text end
 	 */
-	if (dso->is_kmod && al.addr > dso->text_end) {
+	if (dso__is_kmod(dso) && al.addr > dso__text_end(dso)) {
 		pr_debug("skipping the module address %#"PRIx64" after text end\n", al.addr);
 		goto out;
 	}
@@ -315,7 +315,7 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode,
 		state->done[state->done_cnt++] = map__start(al.map);
 	}
 
-	objdump_name = dso->long_name;
+	objdump_name = dso__long_name(dso);
 	if (dso__needs_decompress(dso)) {
 		if (dso__decompress_kmodule_path(dso, objdump_name,
 						 decomp_name,
diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
index 2d67422c1222..fde4eca84b6f 100644
--- a/tools/perf/tests/dso-data.c
+++ b/tools/perf/tests/dso-data.c
@@ -228,7 +228,8 @@ static void dsos__delete(int cnt)
 	for (i = 0; i < cnt; i++) {
 		struct dso *dso = dsos[i];
 
-		unlink(dso->name);
+		dso__data_close(dso);
+		unlink(dso__name(dso));
 		dso__put(dso);
 	}
 
@@ -289,14 +290,14 @@ static int test__dso_data_cache(struct test_suite *test __maybe_unused, int subt
 	}
 
 	/* verify the first one is already open */
-	TEST_ASSERT_VAL("dsos[0] is not open", dsos[0]->data.fd != -1);
+	TEST_ASSERT_VAL("dsos[0] is not open", dso__data(dsos[0])->fd != -1);
 
 	/* open +1 dso to reach the allowed limit */
 	fd = dso__data_fd(dsos[i], &machine);
 	TEST_ASSERT_VAL("failed to get fd", fd > 0);
 
 	/* should force the first one to be closed */
-	TEST_ASSERT_VAL("failed to close dsos[0]", dsos[0]->data.fd == -1);
+	TEST_ASSERT_VAL("failed to close dsos[0]", dso__data(dsos[0])->fd == -1);
 
 	/* cleanup everything */
 	dsos__delete(dso_cnt);
@@ -371,7 +372,7 @@ static int test__dso_data_reopen(struct test_suite *test __maybe_unused, int sub
 	 * dso_0 should get closed, because we reached
 	 * the file descriptor limit
 	 */
-	TEST_ASSERT_VAL("failed to close dso_0", dso_0->data.fd == -1);
+	TEST_ASSERT_VAL("failed to close dso_0", dso__data(dso_0)->fd == -1);
 
 	/* open dso_0 */
 	fd = dso__data_fd(dso_0, &machine);
@@ -381,7 +382,7 @@ static int test__dso_data_reopen(struct test_suite *test __maybe_unused, int sub
 	 * dso_1 should get closed, because we reached
 	 * the file descriptor limit
 	 */
-	TEST_ASSERT_VAL("failed to close dso_1", dso_1->data.fd == -1);
+	TEST_ASSERT_VAL("failed to close dso_1", dso__data(dso_1)->fd == -1);
 
 	/* cleanup everything */
 	close(fd_extra);
diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index d08add0f4da6..187f12f5bc21 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -146,7 +146,7 @@ struct machine *setup_fake_machine(struct machines *machines)
 				goto out;
 			}
 
-			symbols__insert(&dso->symbols, sym);
+			symbols__insert(dso__symbols(dso), sym);
 		}
 
 		dso__put(dso);
@@ -183,7 +183,7 @@ void print_hists_in(struct hists *hists)
 
 			pr_info("%2d: entry: %-8s [%-8s] %20s: period = %"PRIu64"\n",
 				i, thread__comm_str(he->thread),
-				dso->short_name,
+				dso__short_name(dso),
 				he->ms.sym->name, he->stat.period);
 		}
 
@@ -212,7 +212,7 @@ void print_hists_out(struct hists *hists)
 
 			pr_info("%2d: entry: %8s:%5d [%-8s] %20s: period = %"PRIu64"/%"PRIu64"\n",
 				i, thread__comm_str(he->thread), thread__tid(he->thread),
-				dso->short_name,
+				dso__short_name(dso),
 				he->ms.sym->name, he->stat.period,
 				he->stat_acc ? he->stat_acc->period : 0);
 		}
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 71dacb0fec4d..1e0f5a310fd5 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -164,11 +164,11 @@ static void put_fake_samples(void)
 typedef int (*test_fn_t)(struct evsel *, struct machine *);
 
 #define COMM(he)  (thread__comm_str(he->thread))
-#define DSO(he)   (map__dso(he->ms.map)->short_name)
+#define DSO(he)   (dso__short_name(map__dso(he->ms.map)))
 #define SYM(he)   (he->ms.sym->name)
 #define CPU(he)   (he->cpu)
 #define DEPTH(he) (he->callchain->max_depth)
-#define CDSO(cl)  (map__dso(cl->ms.map)->short_name)
+#define CDSO(cl)  (dso__short_name(map__dso(cl->ms.map)))
 #define CSYM(cl)  (cl->ms.sym->name)
 
 struct result {
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index ba1cccf57049..33b5cc8352a7 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -129,7 +129,7 @@ static void put_fake_samples(void)
 typedef int (*test_fn_t)(struct evsel *, struct machine *);
 
 #define COMM(he)  (thread__comm_str(he->thread))
-#define DSO(he)   (map__dso(he->ms.map)->short_name)
+#define DSO(he)   (dso__short_name(map__dso(he->ms.map)))
 #define SYM(he)   (he->ms.sym->name)
 #define CPU(he)   (he->cpu)
 #define PID(he)   (thread__tid(he->thread))
diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c
index b15417a0d617..4f1f9385ea9c 100644
--- a/tools/perf/tests/maps.c
+++ b/tools/perf/tests/maps.c
@@ -26,7 +26,7 @@ static int check_maps_cb(struct map *map, void *data)
 
 	if (map__start(map) != merged->start ||
 	    map__end(map) != merged->end ||
-	    strcmp(map__dso(map)->name, merged->name) ||
+	    strcmp(dso__name(map__dso(map)), merged->name) ||
 	    refcount_read(map__refcnt(map)) != 1) {
 		return 1;
 	}
@@ -39,7 +39,7 @@ static int failed_cb(struct map *map, void *data __maybe_unused)
 	pr_debug("\tstart: %" PRIu64 " end: %" PRIu64 " name: '%s' refcnt: %d\n",
 		map__start(map),
 		map__end(map),
-		map__dso(map)->name,
+		dso__name(map__dso(map)),
 		refcount_read(map__refcnt(map)));
 
 	return 0;
diff --git a/tools/perf/tests/symbols.c b/tools/perf/tests/symbols.c
index 16e1c5502b09..4bcb277b0cac 100644
--- a/tools/perf/tests/symbols.c
+++ b/tools/perf/tests/symbols.c
@@ -72,7 +72,7 @@ static int test_dso(struct dso *dso)
 	if (verbose > 1)
 		dso__fprintf(dso, stderr);
 
-	for (nd = rb_first_cached(&dso->symbols); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(dso__symbols(dso)); nd; nd = rb_next(nd)) {
 		struct symbol *sym = rb_entry(nd, struct symbol, rb_node);
 
 		if (sym->type != STT_FUNC && sym->type != STT_GNU_IFUNC)
diff --git a/tools/perf/tests/vmlinux-kallsyms.c b/tools/perf/tests/vmlinux-kallsyms.c
index fecbf851bb2e..e30fd55f8e51 100644
--- a/tools/perf/tests/vmlinux-kallsyms.c
+++ b/tools/perf/tests/vmlinux-kallsyms.c
@@ -129,7 +129,7 @@ static int test__vmlinux_matches_kallsyms_cb1(struct map *map, void *data)
 	 * cases.
 	 */
 	struct map *pair = maps__find_by_name(args->kallsyms.kmaps,
-					(dso->kernel ? dso->short_name : dso->name));
+					(dso__kernel(dso) ? dso__short_name(dso) : dso__name(dso)));
 
 	if (pair) {
 		map__set_priv(pair, 1);
@@ -162,11 +162,11 @@ static int test__vmlinux_matches_kallsyms_cb2(struct map *map, void *data)
 		}
 
 		pr_info("WARN: %" PRIx64 "-%" PRIx64 " %" PRIx64 " %s in kallsyms as",
-			map__start(map), map__end(map), map__pgoff(map), dso->name);
+			map__start(map), map__end(map), map__pgoff(map), dso__name(dso));
 		if (mem_end != map__end(pair))
 			pr_info(":\nWARN: *%" PRIx64 "-%" PRIx64 " %" PRIx64,
 				map__start(pair), map__end(pair), map__pgoff(pair));
-		pr_info(" %s\n", dso->name);
+		pr_info(" %s\n", dso__name(dso));
 		map__set_priv(pair, 1);
 	}
 	map__put(pair);
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index cb2eb6dcb532..c81e683131b6 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -438,7 +438,7 @@ static int sym_title(struct symbol *sym, struct map *map, char *title,
 		     size_t sz, int percent_type)
 {
 	return snprintf(title, sz, "%s  %s [Percent: %s]", sym->name,
-			map__dso(map)->long_name,
+			dso__long_name(map__dso(map)),
 			percent_type_str(percent_type));
 }
 
@@ -966,14 +966,14 @@ int symbol__tui_annotate(struct map_symbol *ms, struct evsel *evsel,
 		return -1;
 
 	dso = map__dso(ms->map);
-	if (dso->annotate_warned)
+	if (dso__annotate_warned(dso))
 		return -1;
 
 	if (not_annotated) {
 		err = symbol__annotate2(ms, evsel, &browser.arch);
 		if (err) {
 			char msg[BUFSIZ];
-			dso->annotate_warned = true;
+			dso__set_annotate_warned(dso);
 			symbol__strerror_disassemble(ms, err, msg, sizeof(msg));
 			ui__error("Couldn't annotate %s:\n%s", sym->name, msg);
 			goto out_free_offsets;
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 3061dea29e6b..dd73347ea341 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2488,7 +2488,7 @@ add_annotate_opt(struct hist_browser *browser __maybe_unused,
 {
 	struct dso *dso;
 
-	if (!ms->map || (dso = map__dso(ms->map)) == NULL || dso->annotate_warned)
+	if (!ms->map || (dso = map__dso(ms->map)) == NULL || dso__annotate_warned(dso))
 		return 0;
 
 	if (!ms->sym)
@@ -2581,7 +2581,7 @@ static int hists_browser__zoom_map(struct hist_browser *browser, struct map *map
 	} else {
 		struct dso *dso = map__dso(map);
 		ui_helpline__fpush("To zoom out press ESC or ENTER + \"Zoom out of %s DSO\"",
-				   __map__is_kernel(map) ? "the Kernel" : dso->short_name);
+				   __map__is_kernel(map) ? "the Kernel" : dso__short_name(dso));
 		browser->hists->dso_filter = dso;
 		perf_hpp__set_elide(HISTC_DSO, true);
 		pstack__push(browser->pstack, &browser->hists->dso_filter);
@@ -2607,7 +2607,7 @@ add_dso_opt(struct hist_browser *browser, struct popup_action *act,
 
 	if (asprintf(optstr, "Zoom %s %s DSO (use the 'k' hotkey to zoom directly into the kernel)",
 		     browser->hists->dso_filter ? "out of" : "into",
-		     __map__is_kernel(map) ? "the Kernel" : map__dso(map)->short_name) < 0)
+		     __map__is_kernel(map) ? "the Kernel" : dso__short_name(map__dso(map))) < 0)
 		return 0;
 
 	act->ms.map = map;
@@ -3082,7 +3082,7 @@ static int evsel__hists_browse(struct evsel *evsel, int nr_events, const char *h
 			if (!browser->selection ||
 			    !browser->selection->map ||
 			    !map__dso(browser->selection->map) ||
-			    map__dso(browser->selection->map)->annotate_warned) {
+			    dso__annotate_warned(map__dso(browser->selection->map))) {
 				continue;
 			}
 
diff --git a/tools/perf/ui/browsers/map.c b/tools/perf/ui/browsers/map.c
index 3d1b958d8832..fba55175a935 100644
--- a/tools/perf/ui/browsers/map.c
+++ b/tools/perf/ui/browsers/map.c
@@ -76,7 +76,7 @@ static int map_browser__run(struct map_browser *browser)
 {
 	int key;
 
-	if (ui_browser__show(&browser->b, map__dso(browser->map)->long_name,
+	if (ui_browser__show(&browser->b, dso__long_name(map__dso(browser->map)),
 			     "Press ESC to exit, %s / to search",
 			     verbose > 0 ? "" : "restart with -v to use") < 0)
 		return -1;
@@ -106,7 +106,7 @@ int map__browse(struct map *map)
 {
 	struct map_browser mb = {
 		.b = {
-			.entries = &map__dso(map)->symbols,
+			.entries = dso__symbols(map__dso(map)),
 			.refresh = ui_browser__rb_tree_refresh,
 			.seek	 = ui_browser__rb_tree_seek,
 			.write	 = map_browser__write,
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index f22b4f18271c..f711c0d171a9 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -141,7 +141,7 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
 	/* Check existing nodes in dso->data_types tree */
 	key.self.type_name = type_name;
 	key.self.size = size;
-	node = rb_find(&key, &dso->data_types, data_type_cmp);
+	node = rb_find(&key, dso__data_types(dso), data_type_cmp);
 	if (node) {
 		result = rb_entry(node, struct annotated_data_type, node);
 		free(type_name);
@@ -162,7 +162,7 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
 	if (symbol_conf.annotate_data_member)
 		add_member_types(result, type_die);
 
-	rb_add(&result->node, &dso->data_types, data_type_less);
+	rb_add(&result->node, dso__data_types(dso), data_type_less);
 	return result;
 }
 
@@ -288,7 +288,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	Dwarf_Die type_die;
 	u64 pc;
 
-	di = debuginfo__new(dso->long_name);
+	di = debuginfo__new(dso__long_name(dso));
 	if (di == NULL) {
 		pr_debug("cannot get the debug info\n");
 		return NULL;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 9b70ab110ce7..b6c4330a0a9c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1785,8 +1785,8 @@ int symbol__strerror_disassemble(struct map_symbol *ms, int errnum, char *buf, s
 		char bf[SBUILD_ID_SIZE + 15] = " with build id ";
 		char *build_id_msg = NULL;
 
-		if (dso->has_build_id) {
-			build_id__sprintf(&dso->bid, bf + 15);
+		if (dso__has_build_id(dso)) {
+			build_id__sprintf(dso__bid(dso), bf + 15);
 			build_id_msg = bf;
 		}
 		scnprintf(buf, buflen,
@@ -1808,11 +1808,11 @@ int symbol__strerror_disassemble(struct map_symbol *ms, int errnum, char *buf, s
 		scnprintf(buf, buflen, "Problems while parsing the CPUID in the arch specific initialization.");
 		break;
 	case SYMBOL_ANNOTATE_ERRNO__BPF_INVALID_FILE:
-		scnprintf(buf, buflen, "Invalid BPF file: %s.", dso->long_name);
+		scnprintf(buf, buflen, "Invalid BPF file: %s.", dso__long_name(dso));
 		break;
 	case SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF:
 		scnprintf(buf, buflen, "The %s BPF file has no BTF section, compile with -g or use pahole -J.",
-			  dso->long_name);
+			  dso__long_name(dso));
 		break;
 	default:
 		scnprintf(buf, buflen, "Internal error: Invalid %d error code\n", errnum);
@@ -1830,7 +1830,7 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
 	char *pos;
 	int len;
 
-	if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS &&
+	if (dso__symtab_type(dso) == DSO_BINARY_TYPE__KALLSYMS &&
 	    !dso__is_kcore(dso))
 		return SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX;
 
@@ -1839,7 +1839,7 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
 		__symbol__join_symfs(filename, filename_size, build_id_filename);
 		free(build_id_filename);
 	} else {
-		if (dso->has_build_id)
+		if (dso__has_build_id(dso))
 			return ENOMEM;
 		goto fallback;
 	}
@@ -1873,20 +1873,20 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
 		 * cache, or is just a kallsyms file, well, lets hope that this
 		 * DSO is the same as when 'perf record' ran.
 		 */
-		if (dso->kernel && dso->long_name[0] == '/')
-			snprintf(filename, filename_size, "%s", dso->long_name);
+		if (dso__kernel(dso) && dso__long_name(dso)[0] == '/')
+			snprintf(filename, filename_size, "%s", dso__long_name(dso));
 		else
-			__symbol__join_symfs(filename, filename_size, dso->long_name);
+			__symbol__join_symfs(filename, filename_size, dso__long_name(dso));
 
-		mutex_lock(&dso->lock);
-		if (access(filename, R_OK) && errno == ENOENT && dso->nsinfo) {
+		mutex_lock(dso__lock(dso));
+		if (access(filename, R_OK) && errno == ENOENT && dso__nsinfo(dso)) {
 			char *new_name = dso__filename_with_chroot(dso, filename);
 			if (new_name) {
 				strlcpy(filename, new_name, filename_size);
 				free(new_name);
 			}
 		}
-		mutex_unlock(&dso->lock);
+		mutex_unlock(dso__lock(dso));
 	}
 
 	free(build_id_path);
@@ -2172,11 +2172,11 @@ static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
 		 map__unmap_ip(map, sym->end));
 
 	pr_debug("annotating [%p] %30s : [%p] %30s\n",
-		 dso, dso->long_name, sym, sym->name);
+		 dso, dso__long_name(dso), sym, sym->name);
 
-	if (dso->binary_type == DSO_BINARY_TYPE__BPF_PROG_INFO) {
+	if (dso__binary_type(dso) == DSO_BINARY_TYPE__BPF_PROG_INFO) {
 		return symbol__disassemble_bpf(sym, args);
-	} else if (dso->binary_type == DSO_BINARY_TYPE__BPF_IMAGE) {
+	} else if (dso__binary_type(dso) == DSO_BINARY_TYPE__BPF_IMAGE) {
 		return symbol__disassemble_bpf_image(sym, args);
 	} else if (dso__is_kcore(dso)) {
 		kce.kcore_filename = symfs_filename;
@@ -2617,7 +2617,7 @@ int symbol__annotate_printf(struct map_symbol *ms, struct evsel *evsel)
 	int graph_dotted_len;
 	char buf[512];
 
-	filename = strdup(dso->long_name);
+	filename = strdup(dso__long_name(dso));
 	if (!filename)
 		return -ENOMEM;
 
@@ -2782,7 +2782,7 @@ int map_symbol__annotation_dump(struct map_symbol *ms, struct evsel *evsel)
 	}
 
 	fprintf(fp, "%s() %s\nEvent: %s\n\n",
-		ms->sym->name, map__dso(ms->map)->long_name, ev_name);
+		ms->sym->name, dso__long_name(map__dso(ms->map)), ev_name);
 	symbol__annotate_fprintf2(ms->sym, fp);
 
 	fclose(fp);
@@ -3036,7 +3036,7 @@ int symbol__tty_annotate2(struct map_symbol *ms, struct evsel *evsel)
 	if (err) {
 		char msg[BUFSIZ];
 
-		dso->annotate_warned = true;
+		dso__set_annotate_warned(dso);
 		symbol__strerror_disassemble(ms, err, msg, sizeof(msg));
 		ui__error("Couldn't annotate %s:\n%s", sym->name, msg);
 		return -1;
@@ -3045,13 +3045,12 @@ int symbol__tty_annotate2(struct map_symbol *ms, struct evsel *evsel)
 	if (annotate_opts.print_lines) {
 		srcline_full_filename = annotate_opts.full_path;
 		symbol__calc_lines(ms, &source_line);
-		print_summary(&source_line, dso->long_name);
+		print_summary(&source_line, dso__long_name(dso));
 	}
 
 	hists__scnprintf_title(hists, buf, sizeof(buf));
 	fprintf(stdout, "%s, [percent: %s]\n%s() %s\n",
-		buf, percent_type_str(annotate_opts.percent_type), sym->name,
-		dso->long_name);
+		buf, percent_type_str(annotate_opts.percent_type), sym->name, dso__long_name(dso));
 	symbol__annotate_fprintf2(sym, stdout);
 
 	annotated_source__purge(symbol__annotation(sym)->src);
@@ -3070,7 +3069,7 @@ int symbol__tty_annotate(struct map_symbol *ms, struct evsel *evsel)
 	if (err) {
 		char msg[BUFSIZ];
 
-		dso->annotate_warned = true;
+		dso__set_annotate_warned(dso);
 		symbol__strerror_disassemble(ms, err, msg, sizeof(msg));
 		ui__error("Couldn't annotate %s:\n%s", sym->name, msg);
 		return -1;
@@ -3081,7 +3080,7 @@ int symbol__tty_annotate(struct map_symbol *ms, struct evsel *evsel)
 	if (annotate_opts.print_lines) {
 		srcline_full_filename = annotate_opts.full_path;
 		symbol__calc_lines(ms, &source_line);
-		print_summary(&source_line, dso->long_name);
+		print_summary(&source_line, dso__long_name(dso));
 	}
 
 	symbol__annotate_printf(ms, evsel);
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 3684e6009b63..9b6151ec90e2 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -2652,7 +2652,7 @@ static int addr_filter__entire_dso(struct addr_filter *filt, struct dso *dso)
 	}
 
 	filt->addr = 0;
-	filt->size = dso->data.file_size;
+	filt->size = dso__data(dso)->file_size;
 
 	return 0;
 }
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index dec910989701..895ee8adf3b3 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -319,7 +319,7 @@ static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 
 	if (map && map__dso(map)) {
 		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
-				 map__dso(map)->short_name);
+				 dso__short_name(map__dso(map)));
 	}
 
 	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index d07fd5ffa823..b564d6fd078a 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -59,10 +59,10 @@ static int machine__process_bpf_event_load(struct machine *machine,
 		if (map) {
 			struct dso *dso = map__dso(map);
 
-			dso->binary_type = DSO_BINARY_TYPE__BPF_PROG_INFO;
-			dso->bpf_prog.id = id;
-			dso->bpf_prog.sub_id = i;
-			dso->bpf_prog.env = env;
+			dso__set_binary_type(dso, DSO_BINARY_TYPE__BPF_PROG_INFO);
+			dso__bpf_prog(dso)->id = id;
+			dso__bpf_prog(dso)->sub_id = i;
+			dso__bpf_prog(dso)->env = env;
 			map__put(map);
 		}
 	}
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 864bc26b6b46..83a1581e8cf1 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -60,7 +60,7 @@ int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
 
 	addr_location__init(&al);
 	if (thread__find_map(thread, sample->cpumode, sample->ip, &al))
-		map__dso(al.map)->hit = 1;
+		dso__set_hit(map__dso(al.map));
 
 	addr_location__exit(&al);
 	thread__put(thread);
@@ -272,10 +272,10 @@ char *__dso__build_id_filename(const struct dso *dso, char *bf, size_t size,
 	bool alloc = (bf == NULL);
 	int ret;
 
-	if (!dso->has_build_id)
+	if (!dso__has_build_id(dso))
 		return NULL;
 
-	build_id__sprintf(&dso->bid, sbuild_id);
+	build_id__sprintf(dso__bid_const(dso), sbuild_id);
 	linkname = build_id_cache__linkname(sbuild_id, NULL, 0);
 	if (!linkname)
 		return NULL;
@@ -340,25 +340,25 @@ static int machine__write_buildid_table_cb(struct dso *dso, void *data)
 	size_t name_len;
 	bool in_kernel = false;
 
-	if (!dso->has_build_id)
+	if (!dso__has_build_id(dso))
 		return 0;
 
-	if (!dso->hit && !dso__is_vdso(dso))
+	if (!dso__hit(dso) && !dso__is_vdso(dso))
 		return 0;
 
 	if (dso__is_vdso(dso)) {
-		name = dso->short_name;
-		name_len = dso->short_name_len;
+		name = dso__short_name(dso);
+		name_len = dso__short_name_len(dso);
 	} else if (dso__is_kcore(dso)) {
 		name = args->machine->mmap_name;
 		name_len = strlen(name);
 	} else {
-		name = dso->long_name;
-		name_len = dso->long_name_len;
+		name = dso__long_name(dso);
+		name_len = dso__long_name_len(dso);
 	}
 
-	in_kernel = dso->kernel || is_kernel_module(name, PERF_RECORD_MISC_CPUMODE_UNKNOWN);
-	return write_buildid(name, name_len, &dso->bid, args->machine->pid,
+	in_kernel = dso__kernel(dso) || is_kernel_module(name, PERF_RECORD_MISC_CPUMODE_UNKNOWN);
+	return write_buildid(name, name_len, dso__bid(dso), args->machine->pid,
 			     in_kernel ? args->kmisc : args->umisc, args->fd);
 }
 
@@ -876,11 +876,11 @@ static bool dso__build_id_mismatch(struct dso *dso, const char *name)
 	struct build_id bid;
 	bool ret = false;
 
-	mutex_lock(&dso->lock);
-	if (filename__read_build_id_ns(name, &bid, dso->nsinfo) >= 0)
+	mutex_lock(dso__lock(dso));
+	if (filename__read_build_id_ns(name, &bid, dso__nsinfo(dso)) >= 0)
 		ret = !dso__build_id_equal(dso, &bid);
 
-	mutex_unlock(&dso->lock);
+	mutex_unlock(dso__lock(dso));
 
 	return ret;
 }
@@ -890,13 +890,13 @@ static int dso__cache_build_id(struct dso *dso, struct machine *machine,
 {
 	bool is_kallsyms = dso__is_kallsyms(dso);
 	bool is_vdso = dso__is_vdso(dso);
-	const char *name = dso->long_name;
+	const char *name = dso__long_name(dso);
 	const char *proper_name = NULL;
 	const char *root_dir = NULL;
 	char *allocated_name = NULL;
 	int ret = 0;
 
-	if (!dso->has_build_id)
+	if (!dso__has_build_id(dso))
 		return 0;
 
 	if (dso__is_kcore(dso)) {
@@ -921,10 +921,10 @@ static int dso__cache_build_id(struct dso *dso, struct machine *machine,
 	if (!is_kallsyms && dso__build_id_mismatch(dso, name))
 		goto out_free;
 
-	mutex_lock(&dso->lock);
-	ret = build_id_cache__add_b(&dso->bid, name, dso->nsinfo,
+	mutex_lock(dso__lock(dso));
+	ret = build_id_cache__add_b(dso__bid(dso), name, dso__nsinfo(dso),
 				    is_kallsyms, is_vdso, proper_name, root_dir);
-	mutex_unlock(&dso->lock);
+	mutex_unlock(dso__lock(dso));
 out_free:
 	free(allocated_name);
 	return ret;
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 7517d16c02ec..68feed871809 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1205,7 +1205,7 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 	if (show_dso)
 		scnprintf(bf + printed, bfsize - printed, " %s",
 			  cl->ms.map ?
-			  map__dso(cl->ms.map)->short_name :
+			  dso__short_name(map__dso(cl->ms.map)) :
 			  "unknown");
 
 	return bf;
diff --git a/tools/perf/util/data-convert-json.c b/tools/perf/util/data-convert-json.c
index 5bb3c2ba95ca..86ef936e2e04 100644
--- a/tools/perf/util/data-convert-json.c
+++ b/tools/perf/util/data-convert-json.c
@@ -134,7 +134,7 @@ static void output_sample_callchain_entry(struct perf_tool *tool,
 		output_json_key_string(out, false, 5, "symbol", al->sym->name);
 
 		if (dso) {
-			const char *dso_name = dso->short_name;
+			const char *dso_name = dso__short_name(dso);
 
 			if (dso_name && strlen(dso_name) > 0) {
 				fputc(',', out);
diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index b9fb71ab7a73..2fe3143e6689 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -146,10 +146,10 @@ int db_export__comm_thread(struct db_export *dbe, struct comm *comm,
 int db_export__dso(struct db_export *dbe, struct dso *dso,
 		   struct machine *machine)
 {
-	if (dso->db_id)
+	if (dso__db_id(dso))
 		return 0;
 
-	dso->db_id = ++dbe->dso_last_db_id;
+	dso__set_db_id(dso, ++dbe->dso_last_db_id);
 
 	if (dbe->export_dso)
 		return dbe->export_dso(dbe, dso, machine);
@@ -184,7 +184,7 @@ static int db_ids_from_al(struct db_export *dbe, struct addr_location *al,
 		err = db_export__dso(dbe, dso, maps__machine(al->maps));
 		if (err)
 			return err;
-		*dso_db_id = dso->db_id;
+		*dso_db_id = dso__db_id(dso);
 
 		if (!al->sym) {
 			al->sym = symbol__new(al->addr, 0, 0, 0, "unknown");
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index 908e16813722..7d180bdaedbc 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -33,13 +33,13 @@ static void al_to_d_al(struct addr_location *al, struct perf_dlfilter_al *d_al)
 	if (al->map) {
 		struct dso *dso = map__dso(al->map);
 
-		if (symbol_conf.show_kernel_path && dso->long_name)
-			d_al->dso = dso->long_name;
+		if (symbol_conf.show_kernel_path && dso__long_name(dso))
+			d_al->dso = dso__long_name(dso);
 		else
-			d_al->dso = dso->name;
-		d_al->is_64_bit = dso->is_64_bit;
-		d_al->buildid_size = dso->bid.size;
-		d_al->buildid = dso->bid.data;
+			d_al->dso = dso__name(dso);
+		d_al->is_64_bit = dso__is_64_bit(dso);
+		d_al->buildid_size = dso__bid(dso)->size;
+		d_al->buildid = dso__bid(dso)->data;
 	} else {
 		d_al->dso = NULL;
 		d_al->is_64_bit = 0;
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index e96369fb490b..ddf58f594df0 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -40,6 +40,12 @@ static const char * const debuglink_paths[] = {
 	"/usr/lib/debug%s/%s"
 };
 
+void dso__set_nsinfo(struct dso *dso, struct nsinfo *nsi)
+{
+	nsinfo__put(RC_CHK_ACCESS(dso)->nsinfo);
+	RC_CHK_ACCESS(dso)->nsinfo = nsi;
+}
+
 char dso__symtab_origin(const struct dso *dso)
 {
 	static const char origin[] = {
@@ -63,14 +69,14 @@ char dso__symtab_origin(const struct dso *dso)
 		[DSO_BINARY_TYPE__GUEST_VMLINUX]		= 'V',
 	};
 
-	if (dso == NULL || dso->symtab_type == DSO_BINARY_TYPE__NOT_FOUND)
+	if (dso == NULL || dso__symtab_type(dso) == DSO_BINARY_TYPE__NOT_FOUND)
 		return '!';
-	return origin[dso->symtab_type];
+	return origin[dso__symtab_type(dso)];
 }
 
 bool dso__is_object_file(const struct dso *dso)
 {
-	switch (dso->binary_type) {
+	switch (dso__binary_type(dso)) {
 	case DSO_BINARY_TYPE__KALLSYMS:
 	case DSO_BINARY_TYPE__GUEST_KALLSYMS:
 	case DSO_BINARY_TYPE__JAVA_JIT:
@@ -117,7 +123,7 @@ int dso__read_binary_type_filename(const struct dso *dso,
 		char symfile[PATH_MAX];
 		unsigned int i;
 
-		len = __symbol__join_symfs(filename, size, dso->long_name);
+		len = __symbol__join_symfs(filename, size, dso__long_name(dso));
 		last_slash = filename + len;
 		while (last_slash != filename && *last_slash != '/')
 			last_slash--;
@@ -159,12 +165,12 @@ int dso__read_binary_type_filename(const struct dso *dso,
 
 	case DSO_BINARY_TYPE__FEDORA_DEBUGINFO:
 		len = __symbol__join_symfs(filename, size, "/usr/lib/debug");
-		snprintf(filename + len, size - len, "%s.debug", dso->long_name);
+		snprintf(filename + len, size - len, "%s.debug", dso__long_name(dso));
 		break;
 
 	case DSO_BINARY_TYPE__UBUNTU_DEBUGINFO:
 		len = __symbol__join_symfs(filename, size, "/usr/lib/debug");
-		snprintf(filename + len, size - len, "%s", dso->long_name);
+		snprintf(filename + len, size - len, "%s", dso__long_name(dso));
 		break;
 
 	case DSO_BINARY_TYPE__MIXEDUP_UBUNTU_DEBUGINFO:
@@ -173,13 +179,13 @@ int dso__read_binary_type_filename(const struct dso *dso,
 		 * /usr/lib/debug/lib when it is expected to be in
 		 * /usr/lib/debug/usr/lib
 		 */
-		if (strlen(dso->long_name) < 9 ||
-		    strncmp(dso->long_name, "/usr/lib/", 9)) {
+		if (strlen(dso__long_name(dso)) < 9 ||
+		    strncmp(dso__long_name(dso), "/usr/lib/", 9)) {
 			ret = -1;
 			break;
 		}
 		len = __symbol__join_symfs(filename, size, "/usr/lib/debug");
-		snprintf(filename + len, size - len, "%s", dso->long_name + 4);
+		snprintf(filename + len, size - len, "%s", dso__long_name(dso) + 4);
 		break;
 
 	case DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO:
@@ -187,29 +193,29 @@ int dso__read_binary_type_filename(const struct dso *dso,
 		const char *last_slash;
 		size_t dir_size;
 
-		last_slash = dso->long_name + dso->long_name_len;
-		while (last_slash != dso->long_name && *last_slash != '/')
+		last_slash = dso__long_name(dso) + dso__long_name_len(dso);
+		while (last_slash != dso__long_name(dso) && *last_slash != '/')
 			last_slash--;
 
 		len = __symbol__join_symfs(filename, size, "");
-		dir_size = last_slash - dso->long_name + 2;
+		dir_size = last_slash - dso__long_name(dso) + 2;
 		if (dir_size > (size - len)) {
 			ret = -1;
 			break;
 		}
-		len += scnprintf(filename + len, dir_size, "%s",  dso->long_name);
+		len += scnprintf(filename + len, dir_size, "%s",  dso__long_name(dso));
 		len += scnprintf(filename + len , size - len, ".debug%s",
 								last_slash);
 		break;
 	}
 
 	case DSO_BINARY_TYPE__BUILDID_DEBUGINFO:
-		if (!dso->has_build_id) {
+		if (!dso__has_build_id(dso)) {
 			ret = -1;
 			break;
 		}
 
-		build_id__sprintf(&dso->bid, build_id_hex);
+		build_id__sprintf(dso__bid_const(dso), build_id_hex);
 		len = __symbol__join_symfs(filename, size, "/usr/lib/debug/.build-id/");
 		snprintf(filename + len, size - len, "%.2s/%s.debug",
 			 build_id_hex, build_id_hex + 2);
@@ -218,23 +224,23 @@ int dso__read_binary_type_filename(const struct dso *dso,
 	case DSO_BINARY_TYPE__VMLINUX:
 	case DSO_BINARY_TYPE__GUEST_VMLINUX:
 	case DSO_BINARY_TYPE__SYSTEM_PATH_DSO:
-		__symbol__join_symfs(filename, size, dso->long_name);
+		__symbol__join_symfs(filename, size, dso__long_name(dso));
 		break;
 
 	case DSO_BINARY_TYPE__GUEST_KMODULE:
 	case DSO_BINARY_TYPE__GUEST_KMODULE_COMP:
 		path__join3(filename, size, symbol_conf.symfs,
-			    root_dir, dso->long_name);
+			    root_dir, dso__long_name(dso));
 		break;
 
 	case DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE:
 	case DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP:
-		__symbol__join_symfs(filename, size, dso->long_name);
+		__symbol__join_symfs(filename, size, dso__long_name(dso));
 		break;
 
 	case DSO_BINARY_TYPE__KCORE:
 	case DSO_BINARY_TYPE__GUEST_KCORE:
-		snprintf(filename, size, "%s", dso->long_name);
+		snprintf(filename, size, "%s", dso__long_name(dso));
 		break;
 
 	default:
@@ -310,8 +316,8 @@ bool is_kernel_module(const char *pathname, int cpumode)
 
 bool dso__needs_decompress(struct dso *dso)
 {
-	return dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
-		dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE_COMP;
+	return dso__symtab_type(dso) == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
+		dso__symtab_type(dso) == DSO_BINARY_TYPE__GUEST_KMODULE_COMP;
 }
 
 int filename__decompress(const char *name, char *pathname,
@@ -363,11 +369,10 @@ static int decompress_kmodule(struct dso *dso, const char *name,
 	if (!dso__needs_decompress(dso))
 		return -1;
 
-	if (dso->comp == COMP_ID__NONE)
+	if (dso__comp(dso) == COMP_ID__NONE)
 		return -1;
 
-	return filename__decompress(name, pathname, len, dso->comp,
-				    &dso->load_errno);
+	return filename__decompress(name, pathname, len, dso__comp(dso), dso__load_errno(dso));
 }
 
 int dso__decompress_kmodule_fd(struct dso *dso, const char *name)
@@ -468,17 +473,17 @@ void dso__set_module_info(struct dso *dso, struct kmod_path *m,
 			  struct machine *machine)
 {
 	if (machine__is_host(machine))
-		dso->symtab_type = DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE;
+		dso__set_symtab_type(dso, DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE);
 	else
-		dso->symtab_type = DSO_BINARY_TYPE__GUEST_KMODULE;
+		dso__set_symtab_type(dso, DSO_BINARY_TYPE__GUEST_KMODULE);
 
 	/* _KMODULE_COMP should be next to _KMODULE */
 	if (m->kmod && m->comp) {
-		dso->symtab_type++;
-		dso->comp = m->comp;
+		dso__set_symtab_type(dso, dso__symtab_type(dso) + 1);
+		dso__set_comp(dso, m->comp);
 	}
 
-	dso->is_kmod = 1;
+	dso__set_is_kmod(dso);
 	dso__set_short_name(dso, strdup(m->name), true);
 }
 
@@ -491,13 +496,15 @@ static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
 
 static void dso__list_add(struct dso *dso)
 {
-	list_add_tail(&dso->data.open_entry, &dso__data_open);
+	list_add_tail(&dso__data(dso)->open_entry, &dso__data_open);
+	dso__data(dso)->dso = dso__get(dso);
 	dso__data_open_cnt++;
 }
 
 static void dso__list_del(struct dso *dso)
 {
-	list_del_init(&dso->data.open_entry);
+	list_del_init(&dso__data(dso)->open_entry);
+	dso__put(dso__data(dso)->dso);
 	WARN_ONCE(dso__data_open_cnt <= 0,
 		  "DSO data fd counter out of bounds.");
 	dso__data_open_cnt--;
@@ -528,7 +535,7 @@ static int do_open(char *name)
 
 char *dso__filename_with_chroot(const struct dso *dso, const char *filename)
 {
-	return filename_with_chroot(nsinfo__pid(dso->nsinfo), filename);
+	return filename_with_chroot(nsinfo__pid(dso__nsinfo_const(dso)), filename);
 }
 
 static int __open_dso(struct dso *dso, struct machine *machine)
@@ -541,18 +548,18 @@ static int __open_dso(struct dso *dso, struct machine *machine)
 	if (!name)
 		return -ENOMEM;
 
-	mutex_lock(&dso->lock);
+	mutex_lock(dso__lock(dso));
 	if (machine)
 		root_dir = machine->root_dir;
 
-	if (dso__read_binary_type_filename(dso, dso->binary_type,
+	if (dso__read_binary_type_filename(dso, dso__binary_type(dso),
 					    root_dir, name, PATH_MAX))
 		goto out;
 
 	if (!is_regular_file(name)) {
 		char *new_name;
 
-		if (errno != ENOENT || dso->nsinfo == NULL)
+		if (errno != ENOENT || dso__nsinfo(dso) == NULL)
 			goto out;
 
 		new_name = dso__filename_with_chroot(dso, name);
@@ -568,7 +575,7 @@ static int __open_dso(struct dso *dso, struct machine *machine)
 		size_t len = sizeof(newpath);
 
 		if (dso__decompress_kmodule_path(dso, name, newpath, len) < 0) {
-			fd = -dso->load_errno;
+			fd = -(*dso__load_errno(dso));
 			goto out;
 		}
 
@@ -582,7 +589,7 @@ static int __open_dso(struct dso *dso, struct machine *machine)
 		unlink(name);
 
 out:
-	mutex_unlock(&dso->lock);
+	mutex_unlock(dso__lock(dso));
 	free(name);
 	return fd;
 }
@@ -601,13 +608,13 @@ static int open_dso(struct dso *dso, struct machine *machine)
 	int fd;
 	struct nscookie nsc;
 
-	if (dso->binary_type != DSO_BINARY_TYPE__BUILD_ID_CACHE) {
-		mutex_lock(&dso->lock);
-		nsinfo__mountns_enter(dso->nsinfo, &nsc);
-		mutex_unlock(&dso->lock);
+	if (dso__binary_type(dso) != DSO_BINARY_TYPE__BUILD_ID_CACHE) {
+		mutex_lock(dso__lock(dso));
+		nsinfo__mountns_enter(dso__nsinfo(dso), &nsc);
+		mutex_unlock(dso__lock(dso));
 	}
 	fd = __open_dso(dso, machine);
-	if (dso->binary_type != DSO_BINARY_TYPE__BUILD_ID_CACHE)
+	if (dso__binary_type(dso) != DSO_BINARY_TYPE__BUILD_ID_CACHE)
 		nsinfo__mountns_exit(&nsc);
 
 	if (fd >= 0) {
@@ -624,10 +631,10 @@ static int open_dso(struct dso *dso, struct machine *machine)
 
 static void close_data_fd(struct dso *dso)
 {
-	if (dso->data.fd >= 0) {
-		close(dso->data.fd);
-		dso->data.fd = -1;
-		dso->data.file_size = 0;
+	if (dso__data(dso)->fd >= 0) {
+		close(dso__data(dso)->fd);
+		dso__data(dso)->fd = -1;
+		dso__data(dso)->file_size = 0;
 		dso__list_del(dso);
 	}
 }
@@ -646,10 +653,10 @@ static void close_dso(struct dso *dso)
 
 static void close_first_dso(void)
 {
-	struct dso *dso;
+	struct dso_data *dso_data;
 
-	dso = list_first_entry(&dso__data_open, struct dso, data.open_entry);
-	close_dso(dso);
+	dso_data = list_first_entry(&dso__data_open, struct dso_data, open_entry);
+	close_dso(dso_data->dso);
 }
 
 static rlim_t get_fd_limit(void)
@@ -728,28 +735,29 @@ static void try_to_open_dso(struct dso *dso, struct machine *machine)
 		DSO_BINARY_TYPE__NOT_FOUND,
 	};
 	int i = 0;
+	struct dso_data *dso_data = dso__data(dso);
 
-	if (dso->data.fd >= 0)
+	if (dso_data->fd >= 0)
 		return;
 
-	if (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND) {
-		dso->data.fd = open_dso(dso, machine);
+	if (dso__binary_type(dso) != DSO_BINARY_TYPE__NOT_FOUND) {
+		dso_data->fd = open_dso(dso, machine);
 		goto out;
 	}
 
 	do {
-		dso->binary_type = binary_type_data[i++];
+		dso__set_binary_type(dso, binary_type_data[i++]);
 
-		dso->data.fd = open_dso(dso, machine);
-		if (dso->data.fd >= 0)
+		dso_data->fd = open_dso(dso, machine);
+		if (dso_data->fd >= 0)
 			goto out;
 
-	} while (dso->binary_type != DSO_BINARY_TYPE__NOT_FOUND);
+	} while (dso__binary_type(dso) != DSO_BINARY_TYPE__NOT_FOUND);
 out:
-	if (dso->data.fd >= 0)
-		dso->data.status = DSO_DATA_STATUS_OK;
+	if (dso_data->fd >= 0)
+		dso_data->status = DSO_DATA_STATUS_OK;
 	else
-		dso->data.status = DSO_DATA_STATUS_ERROR;
+		dso_data->status = DSO_DATA_STATUS_ERROR;
 }
 
 /**
@@ -763,7 +771,7 @@ static void try_to_open_dso(struct dso *dso, struct machine *machine)
  */
 int dso__data_get_fd(struct dso *dso, struct machine *machine)
 {
-	if (dso->data.status == DSO_DATA_STATUS_ERROR)
+	if (dso__data(dso)->status == DSO_DATA_STATUS_ERROR)
 		return -1;
 
 	if (pthread_mutex_lock(&dso__data_open_lock) < 0)
@@ -771,10 +779,10 @@ int dso__data_get_fd(struct dso *dso, struct machine *machine)
 
 	try_to_open_dso(dso, machine);
 
-	if (dso->data.fd < 0)
+	if (dso__data(dso)->fd < 0)
 		pthread_mutex_unlock(&dso__data_open_lock);
 
-	return dso->data.fd;
+	return dso__data(dso)->fd;
 }
 
 void dso__data_put_fd(struct dso *dso __maybe_unused)
@@ -786,10 +794,10 @@ bool dso__data_status_seen(struct dso *dso, enum dso_data_status_seen by)
 {
 	u32 flag = 1 << by;
 
-	if (dso->data.status_seen & flag)
+	if (dso__data(dso)->status_seen & flag)
 		return true;
 
-	dso->data.status_seen |= flag;
+	dso__data(dso)->status_seen |= flag;
 
 	return false;
 }
@@ -799,12 +807,13 @@ static ssize_t bpf_read(struct dso *dso, u64 offset, char *data)
 {
 	struct bpf_prog_info_node *node;
 	ssize_t size = DSO__DATA_CACHE_SIZE;
+	struct dso_bpf_prog *dso_bpf_prog = dso__bpf_prog(dso);
 	u64 len;
 	u8 *buf;
 
-	node = perf_env__find_bpf_prog_info(dso->bpf_prog.env, dso->bpf_prog.id);
+	node = perf_env__find_bpf_prog_info(dso_bpf_prog->env, dso_bpf_prog->id);
 	if (!node || !node->info_linear) {
-		dso->data.status = DSO_DATA_STATUS_ERROR;
+		dso__data(dso)->status = DSO_DATA_STATUS_ERROR;
 		return -1;
 	}
 
@@ -822,14 +831,15 @@ static ssize_t bpf_read(struct dso *dso, u64 offset, char *data)
 static int bpf_size(struct dso *dso)
 {
 	struct bpf_prog_info_node *node;
+	struct dso_bpf_prog *dso_bpf_prog = dso__bpf_prog(dso);
 
-	node = perf_env__find_bpf_prog_info(dso->bpf_prog.env, dso->bpf_prog.id);
+	node = perf_env__find_bpf_prog_info(dso_bpf_prog->env, dso_bpf_prog->id);
 	if (!node || !node->info_linear) {
-		dso->data.status = DSO_DATA_STATUS_ERROR;
+		dso__data(dso)->status = DSO_DATA_STATUS_ERROR;
 		return -1;
 	}
 
-	dso->data.file_size = node->info_linear->info.jited_prog_len;
+	dso__data(dso)->file_size = node->info_linear->info.jited_prog_len;
 	return 0;
 }
 #endif // HAVE_LIBBPF_SUPPORT
@@ -837,10 +847,10 @@ static int bpf_size(struct dso *dso)
 static void
 dso_cache__free(struct dso *dso)
 {
-	struct rb_root *root = &dso->data.cache;
+	struct rb_root *root = &dso__data(dso)->cache;
 	struct rb_node *next = rb_first(root);
 
-	mutex_lock(&dso->lock);
+	mutex_lock(dso__lock(dso));
 	while (next) {
 		struct dso_cache *cache;
 
@@ -849,12 +859,12 @@ dso_cache__free(struct dso *dso)
 		rb_erase(&cache->rb_node, root);
 		free(cache);
 	}
-	mutex_unlock(&dso->lock);
+	mutex_unlock(dso__lock(dso));
 }
 
 static struct dso_cache *__dso_cache__find(struct dso *dso, u64 offset)
 {
-	const struct rb_root *root = &dso->data.cache;
+	const struct rb_root *root = &dso__data(dso)->cache;
 	struct rb_node * const *p = &root->rb_node;
 	const struct rb_node *parent = NULL;
 	struct dso_cache *cache;
@@ -880,13 +890,13 @@ static struct dso_cache *__dso_cache__find(struct dso *dso, u64 offset)
 static struct dso_cache *
 dso_cache__insert(struct dso *dso, struct dso_cache *new)
 {
-	struct rb_root *root = &dso->data.cache;
+	struct rb_root *root = &dso__data(dso)->cache;
 	struct rb_node **p = &root->rb_node;
 	struct rb_node *parent = NULL;
 	struct dso_cache *cache;
 	u64 offset = new->offset;
 
-	mutex_lock(&dso->lock);
+	mutex_lock(dso__lock(dso));
 	while (*p != NULL) {
 		u64 end;
 
@@ -907,7 +917,7 @@ dso_cache__insert(struct dso *dso, struct dso_cache *new)
 
 	cache = NULL;
 out:
-	mutex_unlock(&dso->lock);
+	mutex_unlock(dso__lock(dso));
 	return cache;
 }
 
@@ -932,18 +942,18 @@ static ssize_t file_read(struct dso *dso, struct machine *machine,
 	pthread_mutex_lock(&dso__data_open_lock);
 
 	/*
-	 * dso->data.fd might be closed if other thread opened another
+	 * dso__data(dso)->fd might be closed if other thread opened another
 	 * file (dso) due to open file limit (RLIMIT_NOFILE).
 	 */
 	try_to_open_dso(dso, machine);
 
-	if (dso->data.fd < 0) {
-		dso->data.status = DSO_DATA_STATUS_ERROR;
+	if (dso__data(dso)->fd < 0) {
+		dso__data(dso)->status = DSO_DATA_STATUS_ERROR;
 		ret = -errno;
 		goto out;
 	}
 
-	ret = pread(dso->data.fd, data, DSO__DATA_CACHE_SIZE, offset);
+	ret = pread(dso__data(dso)->fd, data, DSO__DATA_CACHE_SIZE, offset);
 out:
 	pthread_mutex_unlock(&dso__data_open_lock);
 	return ret;
@@ -963,11 +973,11 @@ static struct dso_cache *dso_cache__populate(struct dso *dso,
 		return NULL;
 	}
 #ifdef HAVE_LIBBPF_SUPPORT
-	if (dso->binary_type == DSO_BINARY_TYPE__BPF_PROG_INFO)
+	if (dso__binary_type(dso) == DSO_BINARY_TYPE__BPF_PROG_INFO)
 		*ret = bpf_read(dso, cache_offset, cache->data);
 	else
 #endif
-	if (dso->binary_type == DSO_BINARY_TYPE__OOL)
+	if (dso__binary_type(dso) == DSO_BINARY_TYPE__OOL)
 		*ret = DSO__DATA_CACHE_SIZE;
 	else
 		*ret = file_read(dso, machine, cache_offset, cache->data);
@@ -1056,25 +1066,25 @@ static int file_size(struct dso *dso, struct machine *machine)
 	pthread_mutex_lock(&dso__data_open_lock);
 
 	/*
-	 * dso->data.fd might be closed if other thread opened another
+	 * dso__data(dso)->fd might be closed if other thread opened another
 	 * file (dso) due to open file limit (RLIMIT_NOFILE).
 	 */
 	try_to_open_dso(dso, machine);
 
-	if (dso->data.fd < 0) {
+	if (dso__data(dso)->fd < 0) {
 		ret = -errno;
-		dso->data.status = DSO_DATA_STATUS_ERROR;
+		dso__data(dso)->status = DSO_DATA_STATUS_ERROR;
 		goto out;
 	}
 
-	if (fstat(dso->data.fd, &st) < 0) {
+	if (fstat(dso__data(dso)->fd, &st) < 0) {
 		ret = -errno;
 		pr_err("dso cache fstat failed: %s\n",
 		       str_error_r(errno, sbuf, sizeof(sbuf)));
-		dso->data.status = DSO_DATA_STATUS_ERROR;
+		dso__data(dso)->status = DSO_DATA_STATUS_ERROR;
 		goto out;
 	}
-	dso->data.file_size = st.st_size;
+	dso__data(dso)->file_size = st.st_size;
 
 out:
 	pthread_mutex_unlock(&dso__data_open_lock);
@@ -1083,13 +1093,13 @@ static int file_size(struct dso *dso, struct machine *machine)
 
 int dso__data_file_size(struct dso *dso, struct machine *machine)
 {
-	if (dso->data.file_size)
+	if (dso__data(dso)->file_size)
 		return 0;
 
-	if (dso->data.status == DSO_DATA_STATUS_ERROR)
+	if (dso__data(dso)->status == DSO_DATA_STATUS_ERROR)
 		return -1;
 #ifdef HAVE_LIBBPF_SUPPORT
-	if (dso->binary_type == DSO_BINARY_TYPE__BPF_PROG_INFO)
+	if (dso__binary_type(dso) == DSO_BINARY_TYPE__BPF_PROG_INFO)
 		return bpf_size(dso);
 #endif
 	return file_size(dso, machine);
@@ -1108,7 +1118,7 @@ off_t dso__data_size(struct dso *dso, struct machine *machine)
 		return -1;
 
 	/* For now just estimate dso data size is close to file size */
-	return dso->data.file_size;
+	return dso__data(dso)->file_size;
 }
 
 static ssize_t data_read_write_offset(struct dso *dso, struct machine *machine,
@@ -1119,7 +1129,7 @@ static ssize_t data_read_write_offset(struct dso *dso, struct machine *machine,
 		return -1;
 
 	/* Check the offset sanity. */
-	if (offset > dso->data.file_size)
+	if (offset > dso__data(dso)->file_size)
 		return -1;
 
 	if (offset + size < offset)
@@ -1142,7 +1152,7 @@ static ssize_t data_read_write_offset(struct dso *dso, struct machine *machine,
 ssize_t dso__data_read_offset(struct dso *dso, struct machine *machine,
 			      u64 offset, u8 *data, ssize_t size)
 {
-	if (dso->data.status == DSO_DATA_STATUS_ERROR)
+	if (dso__data(dso)->status == DSO_DATA_STATUS_ERROR)
 		return -1;
 
 	return data_read_write_offset(dso, machine, offset, data, size, true);
@@ -1182,7 +1192,7 @@ ssize_t dso__data_write_cache_offs(struct dso *dso, struct machine *machine,
 {
 	u8 *data = (u8 *)data_in; /* cast away const to use same fns for r/w */
 
-	if (dso->data.status == DSO_DATA_STATUS_ERROR)
+	if (dso__data(dso)->status == DSO_DATA_STATUS_ERROR)
 		return -1;
 
 	return data_read_write_offset(dso, machine, offset, data, size, false);
@@ -1235,7 +1245,7 @@ struct dso *machine__findnew_kernel(struct machine *machine, const char *name,
 	 */
 	if (dso != NULL) {
 		dso__set_short_name(dso, short_name, false);
-		dso->kernel = dso_type;
+		dso__set_kernel(dso, dso_type);
 	}
 
 	return dso;
@@ -1243,7 +1253,7 @@ struct dso *machine__findnew_kernel(struct machine *machine, const char *name,
 
 static void dso__set_long_name_id(struct dso *dso, const char *name, bool name_allocated)
 {
-	struct dsos *dsos = dso->dsos;
+	struct dsos *dsos = dso__dsos(dso);
 
 	if (name == NULL)
 		return;
@@ -1256,12 +1266,12 @@ static void dso__set_long_name_id(struct dso *dso, const char *name, bool name_a
 		down_write(&dsos->lock);
 	}
 
-	if (dso->long_name_allocated)
-		free((char *)dso->long_name);
+	if (dso__long_name_allocated(dso))
+		free((char *)dso__long_name(dso));
 
-	dso->long_name		 = name;
-	dso->long_name_len	 = strlen(name);
-	dso->long_name_allocated = name_allocated;
+	RC_CHK_ACCESS(dso)->long_name = name;
+	RC_CHK_ACCESS(dso)->long_name_len = strlen(name);
+	dso__set_long_name_allocated(dso, name_allocated);
 
 	if (dsos) {
 		dsos->sorted = false;
@@ -1307,14 +1317,15 @@ bool dso_id__empty(const struct dso_id *id)
 
 void __dso__inject_id(struct dso *dso, struct dso_id *id)
 {
-	struct dsos *dsos = dso->dsos;
+	struct dsos *dsos = dso__dsos(dso);
+	struct dso_id *dso_id = dso__id(dso);
 
 	/* dsos write lock held by caller. */
 
-	dso->id.maj = id->maj;
-	dso->id.min = id->min;
-	dso->id.ino = id->ino;
-	dso->id.ino_generation = id->ino_generation;
+	dso_id->maj = id->maj;
+	dso_id->min = id->min;
+	dso_id->ino = id->ino;
+	dso_id->ino_generation = id->ino_generation;
 
 	if (dsos)
 		dsos->sorted = false;
@@ -1334,7 +1345,7 @@ int dso_id__cmp(const struct dso_id *a, const struct dso_id *b)
 
 int dso__cmp_id(struct dso *a, struct dso *b)
 {
-	return __dso_id__cmp(&a->id, &b->id);
+	return __dso_id__cmp(dso__id(a), dso__id(b));
 }
 
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
@@ -1344,7 +1355,7 @@ void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated)
 
 void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated)
 {
-	struct dsos *dsos = dso->dsos;
+	struct dsos *dsos = dso__dsos(dso);
 
 	if (name == NULL)
 		return;
@@ -1356,12 +1367,12 @@ void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated)
 		 */
 		down_write(&dsos->lock);
 	}
-	if (dso->short_name_allocated)
-		free((char *)dso->short_name);
+	if (dso__short_name_allocated(dso))
+		free((char *)dso__short_name(dso));
 
-	dso->short_name		  = name;
-	dso->short_name_len	  = strlen(name);
-	dso->short_name_allocated = name_allocated;
+	RC_CHK_ACCESS(dso)->short_name		  = name;
+	RC_CHK_ACCESS(dso)->short_name_len	  = strlen(name);
+	dso__set_short_name_allocated(dso, name_allocated);
 
 	if (dsos) {
 		dsos->sorted = false;
@@ -1374,40 +1385,44 @@ int dso__name_len(const struct dso *dso)
 	if (!dso)
 		return strlen("[unknown]");
 	if (verbose > 0)
-		return dso->long_name_len;
+		return dso__long_name_len(dso);
 
-	return dso->short_name_len;
+	return dso__short_name_len(dso);
 }
 
 bool dso__loaded(const struct dso *dso)
 {
-	return dso->loaded;
+	return RC_CHK_ACCESS(dso)->loaded;
 }
 
 bool dso__sorted_by_name(const struct dso *dso)
 {
-	return dso->sorted_by_name;
+	return RC_CHK_ACCESS(dso)->sorted_by_name;
 }
 
 void dso__set_sorted_by_name(struct dso *dso)
 {
-	dso->sorted_by_name = true;
+	RC_CHK_ACCESS(dso)->sorted_by_name = true;
 }
 
 struct dso *dso__new_id(const char *name, struct dso_id *id)
 {
-	struct dso *dso = calloc(1, sizeof(*dso) + strlen(name) + 1);
+	RC_STRUCT(dso) *dso = zalloc(sizeof(*dso) + strlen(name) + 1);
+	struct dso *res;
+	struct dso_data *data;
 
-	if (dso != NULL) {
+	if (!dso)
+		return NULL;
+
+	if (ADD_RC_CHK(res, dso)) {
 		strcpy(dso->name, name);
 		if (id)
 			dso->id = *id;
-		dso__set_long_name_id(dso, dso->name, false);
-		dso__set_short_name(dso, dso->name, false);
+		dso__set_long_name_id(res, dso->name, false);
+		dso__set_short_name(res, dso->name, false);
 		dso->symbols = RB_ROOT_CACHED;
 		dso->symbol_names = NULL;
 		dso->symbol_names_len = 0;
-		dso->data.cache = RB_ROOT;
 		dso->inlined_nodes = RB_ROOT_CACHED;
 		dso->srclines = RB_ROOT_CACHED;
 		dso->data_types = RB_ROOT;
@@ -1426,12 +1441,16 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 		dso->is_kmod = 0;
 		dso->needs_swap = DSO_SWAP__UNSET;
 		dso->comp = COMP_ID__NONE;
-		INIT_LIST_HEAD(&dso->data.open_entry);
 		mutex_init(&dso->lock);
 		refcount_set(&dso->refcnt, 1);
+		data = &dso->data;
+		data->cache = RB_ROOT;
+		data->fd = -1;
+		data->status = DSO_DATA_STATUS_UNKNOWN;
+		INIT_LIST_HEAD(&data->open_entry);
+		data->dso = NULL; /* Set when on the open_entry list. */
 	}
-
-	return dso;
+	return res;
 }
 
 struct dso *dso__new(const char *name)
@@ -1441,70 +1460,76 @@ struct dso *dso__new(const char *name)
 
 void dso__delete(struct dso *dso)
 {
-	if (dso->dsos)
-		pr_err("DSO %s is still in rbtree when being deleted!\n", dso->long_name);
+	if (dso__dsos(dso))
+		pr_err("DSO %s is still in rbtree when being deleted!\n", dso__long_name(dso));
 
 	/* free inlines first, as they reference symbols */
-	inlines__tree_delete(&dso->inlined_nodes);
-	srcline__tree_delete(&dso->srclines);
-	symbols__delete(&dso->symbols);
-	dso->symbol_names_len = 0;
-	zfree(&dso->symbol_names);
-	annotated_data_type__tree_delete(&dso->data_types);
-
-	if (dso->short_name_allocated) {
-		zfree((char **)&dso->short_name);
-		dso->short_name_allocated = false;
+	inlines__tree_delete(&RC_CHK_ACCESS(dso)->inlined_nodes);
+	srcline__tree_delete(&RC_CHK_ACCESS(dso)->srclines);
+	symbols__delete(&RC_CHK_ACCESS(dso)->symbols);
+	RC_CHK_ACCESS(dso)->symbol_names_len = 0;
+	zfree(&RC_CHK_ACCESS(dso)->symbol_names);
+	annotated_data_type__tree_delete(dso__data_types(dso));
+	if (RC_CHK_ACCESS(dso)->short_name_allocated) {
+		zfree((char **)&RC_CHK_ACCESS(dso)->short_name);
+		RC_CHK_ACCESS(dso)->short_name_allocated = false;
 	}
 
-	if (dso->long_name_allocated) {
-		zfree((char **)&dso->long_name);
-		dso->long_name_allocated = false;
+	if (RC_CHK_ACCESS(dso)->long_name_allocated) {
+		zfree((char **)&RC_CHK_ACCESS(dso)->long_name);
+		RC_CHK_ACCESS(dso)->long_name_allocated = false;
 	}
 
 	dso__data_close(dso);
-	auxtrace_cache__free(dso->auxtrace_cache);
+	auxtrace_cache__free(RC_CHK_ACCESS(dso)->auxtrace_cache);
 	dso_cache__free(dso);
 	dso__free_a2l(dso);
-	zfree(&dso->symsrc_filename);
-	nsinfo__zput(dso->nsinfo);
-	mutex_destroy(&dso->lock);
-	free(dso);
+	zfree(&RC_CHK_ACCESS(dso)->symsrc_filename);
+	nsinfo__zput(RC_CHK_ACCESS(dso)->nsinfo);
+	mutex_destroy(dso__lock(dso));
+	RC_CHK_FREE(dso);
 }
 
 struct dso *dso__get(struct dso *dso)
 {
-	if (dso)
-		refcount_inc(&dso->refcnt);
-	return dso;
+	struct dso *result;
+
+	if (RC_CHK_GET(result, dso))
+		refcount_inc(&RC_CHK_ACCESS(dso)->refcnt);
+
+	return result;
 }
 
 void dso__put(struct dso *dso)
 {
-	if (dso && refcount_dec_and_test(&dso->refcnt))
+	if (dso && refcount_dec_and_test(&RC_CHK_ACCESS(dso)->refcnt))
 		dso__delete(dso);
+	else
+		RC_CHK_PUT(dso);
 }
 
 void dso__set_build_id(struct dso *dso, struct build_id *bid)
 {
-	dso->bid = *bid;
-	dso->has_build_id = 1;
+	RC_CHK_ACCESS(dso)->bid = *bid;
+	RC_CHK_ACCESS(dso)->has_build_id = 1;
 }
 
 bool dso__build_id_equal(const struct dso *dso, struct build_id *bid)
 {
-	if (dso->bid.size > bid->size && dso->bid.size == BUILD_ID_SIZE) {
+	const struct build_id *dso_bid = dso__bid_const(dso);
+
+	if (dso_bid->size > bid->size && dso_bid->size == BUILD_ID_SIZE) {
 		/*
 		 * For the backward compatibility, it allows a build-id has
 		 * trailing zeros.
 		 */
-		return !memcmp(dso->bid.data, bid->data, bid->size) &&
-			!memchr_inv(&dso->bid.data[bid->size], 0,
-				    dso->bid.size - bid->size);
+		return !memcmp(dso_bid->data, bid->data, bid->size) &&
+			!memchr_inv(&dso_bid->data[bid->size], 0,
+				    dso_bid->size - bid->size);
 	}
 
-	return dso->bid.size == bid->size &&
-	       memcmp(dso->bid.data, bid->data, dso->bid.size) == 0;
+	return dso_bid->size == bid->size &&
+	       memcmp(dso_bid->data, bid->data, dso_bid->size) == 0;
 }
 
 void dso__read_running_kernel_build_id(struct dso *dso, struct machine *machine)
@@ -1514,8 +1539,8 @@ void dso__read_running_kernel_build_id(struct dso *dso, struct machine *machine)
 	if (machine__is_default_guest(machine))
 		return;
 	sprintf(path, "%s/sys/kernel/notes", machine->root_dir);
-	if (sysfs__read_build_id(path, &dso->bid) == 0)
-		dso->has_build_id = true;
+	if (sysfs__read_build_id(path, dso__bid(dso)) == 0)
+		dso__set_has_build_id(dso);
 }
 
 int dso__kernel_module_get_build_id(struct dso *dso,
@@ -1526,14 +1551,14 @@ int dso__kernel_module_get_build_id(struct dso *dso,
 	 * kernel module short names are of the form "[module]" and
 	 * we need just "module" here.
 	 */
-	const char *name = dso->short_name + 1;
+	const char *name = dso__short_name(dso) + 1;
 
 	snprintf(filename, sizeof(filename),
 		 "%s/sys/module/%.*s/notes/.note.gnu.build-id",
 		 root_dir, (int)strlen(name) - 1, name);
 
-	if (sysfs__read_build_id(filename, &dso->bid) == 0)
-		dso->has_build_id = true;
+	if (sysfs__read_build_id(filename, dso__bid(dso)) == 0)
+		dso__set_has_build_id(dso);
 
 	return 0;
 }
@@ -1542,21 +1567,21 @@ static size_t dso__fprintf_buildid(struct dso *dso, FILE *fp)
 {
 	char sbuild_id[SBUILD_ID_SIZE];
 
-	build_id__sprintf(&dso->bid, sbuild_id);
+	build_id__sprintf(dso__bid(dso), sbuild_id);
 	return fprintf(fp, "%s", sbuild_id);
 }
 
 size_t dso__fprintf(struct dso *dso, FILE *fp)
 {
 	struct rb_node *nd;
-	size_t ret = fprintf(fp, "dso: %s (", dso->short_name);
+	size_t ret = fprintf(fp, "dso: %s (", dso__short_name(dso));
 
-	if (dso->short_name != dso->long_name)
-		ret += fprintf(fp, "%s, ", dso->long_name);
+	if (dso__short_name(dso) != dso__long_name(dso))
+		ret += fprintf(fp, "%s, ", dso__long_name(dso));
 	ret += fprintf(fp, "%sloaded, ", dso__loaded(dso) ? "" : "NOT ");
 	ret += dso__fprintf_buildid(dso, fp);
 	ret += fprintf(fp, ")\n");
-	for (nd = rb_first_cached(&dso->symbols); nd; nd = rb_next(nd)) {
+	for (nd = rb_first_cached(dso__symbols(dso)); nd; nd = rb_next(nd)) {
 		struct symbol *pos = rb_entry(nd, struct symbol, rb_node);
 		ret += symbol__fprintf(pos, fp);
 	}
@@ -1580,7 +1605,7 @@ enum dso_type dso__type(struct dso *dso, struct machine *machine)
 
 int dso__strerror_load(struct dso *dso, char *buf, size_t buflen)
 {
-	int idx, errnum = dso->load_errno;
+	int idx, errnum = *dso__load_errno(dso);
 	/*
 	 * This must have a same ordering as the enum dso_load_errno.
 	 */
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 2e227822f10c..3e27f93898f2 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -11,6 +11,7 @@
 #include <linux/bitops.h>
 #include "build-id.h"
 #include "mutex.h"
+#include <internal/rc_check.h>
 
 struct machine;
 struct map;
@@ -100,26 +101,27 @@ enum dso_load_errno {
 	__DSO_LOAD_ERRNO__END,
 };
 
-#define DSO__SWAP(dso, type, val)			\
-({							\
-	type ____r = val;				\
-	BUG_ON(dso->needs_swap == DSO_SWAP__UNSET);	\
-	if (dso->needs_swap == DSO_SWAP__YES) {		\
-		switch (sizeof(____r)) {		\
-		case 2:					\
-			____r = bswap_16(val);		\
-			break;				\
-		case 4:					\
-			____r = bswap_32(val);		\
-			break;				\
-		case 8:					\
-			____r = bswap_64(val);		\
-			break;				\
-		default:				\
-			BUG_ON(1);			\
-		}					\
-	}						\
-	____r;						\
+#define DSO__SWAP(dso, type, val)				\
+({								\
+	type ____r = val;					\
+	enum dso_swap_type ___dst = dso__needs_swap(dso);	\
+	BUG_ON(___dst == DSO_SWAP__UNSET);			\
+	if (___dst == DSO_SWAP__YES) {				\
+		switch (sizeof(____r)) {			\
+		case 2:						\
+			____r = bswap_16(val);			\
+			break;					\
+		case 4:						\
+			____r = bswap_32(val);			\
+			break;					\
+		case 8:						\
+			____r = bswap_64(val);			\
+			break;					\
+		default:					\
+			BUG_ON(1);				\
+		}						\
+	}							\
+	____r;							\
 })
 
 #define DSO__DATA_CACHE_SIZE 4096
@@ -142,9 +144,29 @@ struct dso_cache {
 	char data[];
 };
 
+struct dso_data {
+	struct rb_root	 cache;
+	struct list_head open_entry;
+	struct dso	 *dso;
+	int		 fd;
+	int		 status;
+	u32		 status_seen;
+	u64		 file_size;
+	u64		 elf_base_addr;
+	u64		 debug_frame_offset;
+	u64		 eh_frame_hdr_addr;
+	u64		 eh_frame_hdr_offset;
+};
+
+struct dso_bpf_prog {
+	u32		id;
+	u32		sub_id;
+	struct perf_env	*env;
+};
+
 struct auxtrace_cache;
 
-struct dso {
+DECLARE_RC_STRUCT(dso) {
 	struct mutex	 lock;
 	struct dsos	 *dsos;
 	struct rb_root_cached symbols;
@@ -175,24 +197,9 @@ struct dso {
 		u64	 db_id;
 	};
 	/* bpf prog information */
-	struct {
-		struct perf_env	*env;
-		u32		id;
-		u32		sub_id;
-	} bpf_prog;
+	struct dso_bpf_prog bpf_prog;
 	/* dso data file */
-	struct {
-		struct rb_root	 cache;
-		struct list_head open_entry;
-		u64		 file_size;
-		u64		 elf_base_addr;
-		u64		 debug_frame_offset;
-		u64		 eh_frame_hdr_addr;
-		u64		 eh_frame_hdr_offset;
-		int		 fd;
-		int		 status;
-		u32		 status_seen;
-	} data;
+	struct dso_data	 data;
 	struct dso_id	 id;
 	unsigned int	 a2l_fails;
 	int		 comp;
@@ -228,11 +235,383 @@ struct dso {
  * @n: the 'struct rb_node *' to use as a temporary storage
  */
 #define dso__for_each_symbol(dso, pos, n)	\
-	symbols__for_each_entry(&(dso)->symbols, pos, n)
+	symbols__for_each_entry(dso__symbols(dso), pos, n)
+
+static inline void *dso__a2l(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->a2l;
+}
+
+static inline void dso__set_a2l(struct dso *dso, void *val)
+{
+	RC_CHK_ACCESS(dso)->a2l = val;
+}
+
+static inline unsigned int dso__a2l_fails(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->a2l_fails;
+}
+
+static inline void dso__set_a2l_fails(struct dso *dso, unsigned int val)
+{
+	RC_CHK_ACCESS(dso)->a2l_fails = val;
+}
+
+static inline bool dso__adjust_symbols(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->adjust_symbols;
+}
+
+static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
+{
+	RC_CHK_ACCESS(dso)->adjust_symbols = val;
+}
+
+static inline bool dso__annotate_warned(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->annotate_warned;
+}
+
+static inline void dso__set_annotate_warned(struct dso *dso)
+{
+	RC_CHK_ACCESS(dso)->annotate_warned = 1;
+}
+
+static inline struct auxtrace_cache *dso__auxtrace_cache(struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->auxtrace_cache;
+}
+
+static inline void dso__set_auxtrace_cache(struct dso *dso, struct auxtrace_cache *cache)
+{
+	RC_CHK_ACCESS(dso)->auxtrace_cache = cache;
+}
+
+static inline struct build_id *dso__bid(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->bid;
+}
+
+static inline const struct build_id *dso__bid_const(const struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->bid;
+}
+
+static inline struct dso_bpf_prog *dso__bpf_prog(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->bpf_prog;
+}
+
+static inline bool dso__has_build_id(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->has_build_id;
+}
+
+static inline void dso__set_has_build_id(struct dso *dso)
+{
+	RC_CHK_ACCESS(dso)->has_build_id = true;
+}
+
+static inline bool dso__has_srcline(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->has_srcline;
+}
+
+static inline void dso__set_has_srcline(struct dso *dso, bool val)
+{
+	RC_CHK_ACCESS(dso)->has_srcline = val;
+}
+
+static inline int dso__comp(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->comp;
+}
+
+static inline void dso__set_comp(struct dso *dso, int comp)
+{
+	RC_CHK_ACCESS(dso)->comp = comp;
+}
+
+static inline struct dso_data *dso__data(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->data;
+}
+
+static inline u64 dso__db_id(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->db_id;
+}
+
+static inline void dso__set_db_id(struct dso *dso, u64 db_id)
+{
+	RC_CHK_ACCESS(dso)->db_id = db_id;
+}
+
+static inline struct dsos *dso__dsos(struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->dsos;
+}
+
+static inline void dso__set_dsos(struct dso *dso, struct dsos *dsos)
+{
+	RC_CHK_ACCESS(dso)->dsos = dsos;
+}
+
+static inline bool dso__header_build_id(struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->header_build_id;
+}
+
+static inline void dso__set_header_build_id(struct dso *dso, bool val)
+{
+	RC_CHK_ACCESS(dso)->header_build_id = val;
+}
+
+static inline bool dso__hit(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->hit;
+}
+
+static inline void dso__set_hit(struct dso *dso)
+{
+	RC_CHK_ACCESS(dso)->hit = 1;
+}
+
+static inline struct dso_id *dso__id(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->id;
+}
+
+static inline const struct dso_id *dso__id_const(const struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->id;
+}
+
+static inline struct rb_root_cached *dso__inlined_nodes(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->inlined_nodes;
+}
+
+static inline bool dso__is_64_bit(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->is_64_bit;
+}
+
+static inline void dso__set_is_64_bit(struct dso *dso, bool is)
+{
+	RC_CHK_ACCESS(dso)->is_64_bit = is;
+}
+
+static inline bool dso__is_kmod(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->is_kmod;
+}
+
+static inline void dso__set_is_kmod(struct dso *dso)
+{
+	RC_CHK_ACCESS(dso)->is_kmod = 1;
+}
+
+static inline enum dso_space_type dso__kernel(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->kernel;
+}
+
+static inline void dso__set_kernel(struct dso *dso, enum dso_space_type kernel)
+{
+	RC_CHK_ACCESS(dso)->kernel = kernel;
+}
+
+static inline u64 dso__last_find_result_addr(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->last_find_result.addr;
+}
+
+static inline void dso__set_last_find_result_addr(struct dso *dso, u64 addr)
+{
+	RC_CHK_ACCESS(dso)->last_find_result.addr = addr;
+}
+
+static inline struct symbol *dso__last_find_result_symbol(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->last_find_result.symbol;
+}
+
+static inline void dso__set_last_find_result_symbol(struct dso *dso, struct symbol *symbol)
+{
+	RC_CHK_ACCESS(dso)->last_find_result.symbol = symbol;
+}
+
+static inline enum dso_load_errno *dso__load_errno(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->load_errno;
+}
 
 static inline void dso__set_loaded(struct dso *dso)
 {
-	dso->loaded = true;
+	RC_CHK_ACCESS(dso)->loaded = true;
+}
+
+static inline struct mutex *dso__lock(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->lock;
+}
+
+static inline const char *dso__long_name(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->long_name;
+}
+
+static inline bool dso__long_name_allocated(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->long_name_allocated;
+}
+
+static inline void dso__set_long_name_allocated(struct dso *dso, bool allocated)
+{
+	RC_CHK_ACCESS(dso)->long_name_allocated = allocated;
+}
+
+static inline u16 dso__long_name_len(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->long_name_len;
+}
+
+static inline const char *dso__name(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->name;
+}
+
+static inline enum dso_swap_type dso__needs_swap(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->needs_swap;
+}
+
+static inline void dso__set_needs_swap(struct dso *dso, enum dso_swap_type type)
+{
+	RC_CHK_ACCESS(dso)->needs_swap = type;
+}
+
+static inline struct nsinfo *dso__nsinfo(struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->nsinfo;
+}
+
+static inline const struct nsinfo *dso__nsinfo_const(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->nsinfo;
+}
+
+static inline struct nsinfo **dso__nsinfo_ptr(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->nsinfo;
+}
+
+void dso__set_nsinfo(struct dso *dso, struct nsinfo *nsi);
+
+static inline u8 dso__rel(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->rel;
+}
+
+static inline void dso__set_rel(struct dso *dso, u8 rel)
+{
+	RC_CHK_ACCESS(dso)->rel = rel;
+}
+
+static inline const char *dso__short_name(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->short_name;
+}
+
+static inline bool dso__short_name_allocated(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->short_name_allocated;
+}
+
+static inline void dso__set_short_name_allocated(struct dso *dso, bool allocated)
+{
+	RC_CHK_ACCESS(dso)->short_name_allocated = allocated;
+}
+
+static inline u16 dso__short_name_len(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->short_name_len;
+}
+
+static inline struct rb_root_cached *dso__srclines(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->srclines;
+}
+
+static inline struct rb_root *dso__data_types(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->data_types;
+}
+
+static inline struct rb_root_cached *dso__symbols(struct dso *dso)
+{
+	return &RC_CHK_ACCESS(dso)->symbols;
+}
+
+static inline struct symbol **dso__symbol_names(struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->symbol_names;
+}
+
+static inline void dso__set_symbol_names(struct dso *dso, struct symbol **names)
+{
+	RC_CHK_ACCESS(dso)->symbol_names = names;
+}
+
+static inline size_t dso__symbol_names_len(struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->symbol_names_len;
+}
+
+static inline void dso__set_symbol_names_len(struct dso *dso, size_t len)
+{
+	RC_CHK_ACCESS(dso)->symbol_names_len = len;
+}
+
+static inline const char *dso__symsrc_filename(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->symsrc_filename;
+}
+
+static inline void dso__set_symsrc_filename(struct dso *dso, char *val)
+{
+	RC_CHK_ACCESS(dso)->symsrc_filename = val;
+}
+
+static inline enum dso_binary_type dso__symtab_type(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->symtab_type;
+}
+
+static inline void dso__set_symtab_type(struct dso *dso, enum dso_binary_type bt)
+{
+	RC_CHK_ACCESS(dso)->symtab_type = bt;
+}
+
+static inline u64 dso__text_end(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->text_end;
+}
+
+static inline void dso__set_text_end(struct dso *dso, u64 val)
+{
+	RC_CHK_ACCESS(dso)->text_end = val;
+}
+
+static inline u64 dso__text_offset(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->text_offset;
+}
+
+static inline void dso__set_text_offset(struct dso *dso, u64 val)
+{
+	RC_CHK_ACCESS(dso)->text_offset = val;
 }
 
 int dso_id__cmp(const struct dso_id *a, const struct dso_id *b);
@@ -264,7 +643,7 @@ bool dso__loaded(const struct dso *dso);
 
 static inline bool dso__has_symbols(const struct dso *dso)
 {
-	return !RB_EMPTY_ROOT(&dso->symbols.rb_root);
+	return !RB_EMPTY_ROOT(&RC_CHK_ACCESS(dso)->symbols.rb_root);
 }
 
 char *dso__filename_with_chroot(const struct dso *dso, const char *filename);
@@ -380,21 +759,33 @@ void dso__reset_find_symbol_cache(struct dso *dso);
 size_t dso__fprintf_symbols_by_name(struct dso *dso, FILE *fp);
 size_t dso__fprintf(struct dso *dso, FILE *fp);
 
+static inline enum dso_binary_type dso__binary_type(const struct dso *dso)
+{
+	return RC_CHK_ACCESS(dso)->binary_type;
+}
+
+static inline void dso__set_binary_type(struct dso *dso, enum dso_binary_type bt)
+{
+	RC_CHK_ACCESS(dso)->binary_type = bt;
+}
+
 static inline bool dso__is_vmlinux(const struct dso *dso)
 {
-	return dso->binary_type == DSO_BINARY_TYPE__VMLINUX ||
-	       dso->binary_type == DSO_BINARY_TYPE__GUEST_VMLINUX;
+	enum dso_binary_type bt = dso__binary_type(dso);
+
+	return bt == DSO_BINARY_TYPE__VMLINUX || bt == DSO_BINARY_TYPE__GUEST_VMLINUX;
 }
 
 static inline bool dso__is_kcore(const struct dso *dso)
 {
-	return dso->binary_type == DSO_BINARY_TYPE__KCORE ||
-	       dso->binary_type == DSO_BINARY_TYPE__GUEST_KCORE;
+	enum dso_binary_type bt = dso__binary_type(dso);
+
+	return bt == DSO_BINARY_TYPE__KCORE || bt == DSO_BINARY_TYPE__GUEST_KCORE;
 }
 
 static inline bool dso__is_kallsyms(const struct dso *dso)
 {
-	return dso->kernel && dso->long_name[0] != '/';
+	return RC_CHK_ACCESS(dso)->kernel && RC_CHK_ACCESS(dso)->long_name[0] != '/';
 }
 
 bool dso__is_object_file(const struct dso *dso);
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index 23c3fe4f2abb..ab3d0c01dd63 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -29,8 +29,8 @@ static void dsos__purge(struct dsos *dsos)
 	for (unsigned int i = 0; i < dsos->cnt; i++) {
 		struct dso *dso = dsos->dsos[i];
 
+		dso__set_dsos(dso, NULL);
 		dso__put(dso);
-		dso->dsos = NULL;
 	}
 
 	zfree(&dsos->dsos);
@@ -73,22 +73,22 @@ static int dsos__read_build_ids_cb(struct dso *dso, void *data)
 	struct dsos__read_build_ids_cb_args *args = data;
 	struct nscookie nsc;
 
-	if (args->with_hits && !dso->hit && !dso__is_vdso(dso))
+	if (args->with_hits && !dso__hit(dso) && !dso__is_vdso(dso))
 		return 0;
-	if (dso->has_build_id) {
+	if (dso__has_build_id(dso)) {
 		args->have_build_id = true;
 		return 0;
 	}
-	nsinfo__mountns_enter(dso->nsinfo, &nsc);
-	if (filename__read_build_id(dso->long_name, &dso->bid) > 0) {
+	nsinfo__mountns_enter(dso__nsinfo(dso), &nsc);
+	if (filename__read_build_id(dso__long_name(dso), dso__bid(dso)) > 0) {
 		args->have_build_id = true;
-		dso->has_build_id = true;
-	} else if (errno == ENOENT && dso->nsinfo) {
-		char *new_name = dso__filename_with_chroot(dso, dso->long_name);
+		dso__set_has_build_id(dso);
+	} else if (errno == ENOENT && dso__nsinfo(dso)) {
+		char *new_name = dso__filename_with_chroot(dso, dso__long_name(dso));
 
-		if (new_name && filename__read_build_id(new_name, &dso->bid) > 0) {
+		if (new_name && filename__read_build_id(new_name, dso__bid(dso)) > 0) {
 			args->have_build_id = true;
-			dso->has_build_id = true;
+			dso__set_has_build_id(dso);
 		}
 		free(new_name);
 	}
@@ -110,27 +110,27 @@ bool dsos__read_build_ids(struct dsos *dsos, bool with_hits)
 static int __dso__cmp_long_name(const char *long_name, const struct dso_id *id,
 				const struct dso *b)
 {
-	int rc = strcmp(long_name, b->long_name);
-	return rc ?: dso_id__cmp(id, &b->id);
+	int rc = strcmp(long_name, dso__long_name(b));
+	return rc ?: dso_id__cmp(id, dso__id_const(b));
 }
 
 static int __dso__cmp_short_name(const char *short_name, const struct dso_id *id,
 				 const struct dso *b)
 {
-	int rc = strcmp(short_name, b->short_name);
-	return rc ?: dso_id__cmp(id, &b->id);
+	int rc = strcmp(short_name, dso__short_name(b));
+	return rc ?: dso_id__cmp(id, dso__id_const(b));
 }
 
 static int dsos__cmp_long_name_id_short_name(const void *va, const void *vb)
 {
 	const struct dso *a = *((const struct dso **)va);
 	const struct dso *b = *((const struct dso **)vb);
-	int rc = strcmp(a->long_name, b->long_name);
+	int rc = strcmp(dso__long_name(a), dso__long_name(b));
 
 	if (!rc) {
-		rc = dso_id__cmp(&a->id, &b->id);
+		rc = dso_id__cmp(dso__id_const(a), dso__id_const(b));
 		if (!rc)
-			rc = strcmp(a->short_name, b->short_name);
+			rc = strcmp(dso__short_name(a), dso__short_name(b));
 	}
 	return rc;
 }
@@ -209,7 +209,7 @@ int __dsos__add(struct dsos *dsos, struct dso *dso)
 								 &dsos->dsos[dsos->cnt - 1])
 			<= 0;
 	}
-	dso->dsos = dsos;
+	dso__set_dsos(dso, dsos);
 	return 0;
 }
 
@@ -275,7 +275,7 @@ static void dso__set_basename(struct dso *dso)
 	char *base, *lname;
 	int tid;
 
-	if (sscanf(dso->long_name, "/tmp/perf-%d.map", &tid) == 1) {
+	if (sscanf(dso__long_name(dso), "/tmp/perf-%d.map", &tid) == 1) {
 		if (asprintf(&base, "[JIT] tid %d", tid) < 0)
 			return;
 	} else {
@@ -283,7 +283,7 @@ static void dso__set_basename(struct dso *dso)
 	       * basename() may modify path buffer, so we must pass
                * a copy.
                */
-		lname = strdup(dso->long_name);
+		lname = strdup(dso__long_name(dso));
 		if (!lname)
 			return;
 
@@ -322,7 +322,7 @@ static struct dso *__dsos__findnew_id(struct dsos *dsos, const char *name, struc
 {
 	struct dso *dso = __dsos__find_id(dsos, name, id, false, /*write_locked=*/true);
 
-	if (dso && dso_id__empty(&dso->id) && !dso_id__empty(id))
+	if (dso && dso_id__empty(dso__id(dso)) && !dso_id__empty(id))
 		__dso__inject_id(dso, id);
 
 	return dso ? dso : __dsos__addnew_id(dsos, name, id);
@@ -351,8 +351,8 @@ static int dsos__fprintf_buildid_cb(struct dso *dso, void *data)
 
 	if (args->skip && args->skip(dso, args->parm))
 		return 0;
-	build_id__sprintf(&dso->bid, sbuild_id);
-	args->ret += fprintf(args->fp, "%-40s %s\n", sbuild_id, dso->long_name);
+	build_id__sprintf(dso__bid(dso), sbuild_id);
+	args->ret += fprintf(args->fp, "%-40s %s\n", sbuild_id, dso__long_name(dso));
 	return 0;
 }
 
@@ -396,7 +396,7 @@ size_t dsos__fprintf(struct dsos *dsos, FILE *fp)
 
 static int dsos__hit_all_cb(struct dso *dso, void *data __maybe_unused)
 {
-	dso->hit = true;
+	dso__set_hit(dso);
 	return 0;
 }
 
@@ -432,7 +432,7 @@ struct dso *dsos__findnew_module_dso(struct dsos *dsos,
 	dso__set_basename(dso);
 	dso__set_module_info(dso, m, machine);
 	dso__set_long_name(dso,	strdup(filename), true);
-	dso->kernel = DSO_SPACE__KERNEL;
+	dso__set_kernel(dso, DSO_SPACE__KERNEL);
 	__dsos__add(dsos, dso);
 
 	up_write(&dsos->lock);
@@ -455,8 +455,8 @@ static int dsos__find_kernel_dso_cb(struct dso *dso, void *data)
 	 * Therefore, we pass PERF_RECORD_MISC_CPUMODE_UNKNOWN.
 	 * is_kernel_module() treats it as a kernel cpumode.
 	 */
-	if (!dso->kernel ||
-	    is_kernel_module(dso->long_name, PERF_RECORD_MISC_CPUMODE_UNKNOWN))
+	if (!dso__kernel(dso) ||
+	    is_kernel_module(dso__long_name(dso), PERF_RECORD_MISC_CPUMODE_UNKNOWN))
 		return 0;
 
 	*res = dso__get(dso);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 198903157f9e..f32f9abf6344 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -726,7 +726,7 @@ int machine__resolve(struct machine *machine, struct addr_location *al,
 	dso = al->map ? map__dso(al->map) : NULL;
 	dump_printf(" ...... dso: %s\n",
 		dso
-		? dso->long_name
+		? dso__long_name(dso)
 		: (al->level == 'H' ? "[hypervisor]" : "<not found>"));
 
 	if (thread__is_filtered(thread))
@@ -750,10 +750,10 @@ int machine__resolve(struct machine *machine, struct addr_location *al,
 	if (al->map) {
 		if (symbol_conf.dso_list &&
 		    (!dso || !(strlist__has_entry(symbol_conf.dso_list,
-						  dso->short_name) ||
-			       (dso->short_name != dso->long_name &&
+						  dso__short_name(dso)) ||
+			       (dso__short_name(dso) != dso__long_name(dso) &&
 				strlist__has_entry(symbol_conf.dso_list,
-						   dso->long_name))))) {
+						   dso__long_name(dso)))))) {
 			al->filtered |= (1 << HIST_FILTER__DSO);
 		}
 
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index a9f71f8343f0..1f88246da187 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2308,7 +2308,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
 
 		build_id__init(&bid, bev->data, size);
 		dso__set_build_id(dso, &bid);
-		dso->header_build_id = 1;
+		dso__set_header_build_id(dso, true);
 
 		if (dso_space != DSO_SPACE__USER) {
 			struct kmod_path m = { .name = NULL, };
@@ -2316,13 +2316,13 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
 			if (!kmod_path__parse_name(&m, filename) && m.kmod)
 				dso__set_module_info(dso, &m, machine);
 
-			dso->kernel = dso_space;
+			dso__set_kernel(dso, dso_space);
 			free(m.name);
 		}
 
-		build_id__sprintf(&dso->bid, sbuild_id);
+		build_id__sprintf(dso__bid(dso), sbuild_id);
 		pr_debug("build id event received for %s: %s [%zu]\n",
-			 dso->long_name, sbuild_id, size);
+			 dso__long_name(dso), sbuild_id, size);
 		dso__put(dso);
 	}
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0888b7163b7c..a9eef8b5aff0 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -2128,7 +2128,7 @@ static bool hists__filter_entry_by_dso(struct hists *hists,
 				       struct hist_entry *he)
 {
 	if (hists->dso_filter != NULL &&
-	    (he->ms.map == NULL || map__dso(he->ms.map) != hists->dso_filter)) {
+	    (he->ms.map == NULL || !RC_CHK_EQUAL(map__dso(he->ms.map), hists->dso_filter))) {
 		he->filtered |= (1 << HIST_FILTER__DSO);
 		return true;
 	}
@@ -2808,7 +2808,7 @@ int __hists__scnprintf_title(struct hists *hists, char *bf, size_t size, bool sh
 	}
 	if (dso)
 		printed += scnprintf(bf + printed, size - printed,
-				    ", DSO: %s", dso->short_name);
+				     ", DSO: %s", dso__short_name(dso));
 	if (socket_id > -1)
 		printed += scnprintf(bf + printed, size - printed,
 				    ", Processor Socket: %d", socket_id);
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index f38893e0b036..04a291562b14 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -598,15 +598,15 @@ static struct auxtrace_cache *intel_pt_cache(struct dso *dso,
 	struct auxtrace_cache *c;
 	unsigned int bits;
 
-	if (dso->auxtrace_cache)
-		return dso->auxtrace_cache;
+	if (dso__auxtrace_cache(dso))
+		return dso__auxtrace_cache(dso);
 
 	bits = intel_pt_cache_size(dso, machine);
 
 	/* Ignoring cache creation failure */
 	c = auxtrace_cache__new(bits, sizeof(struct intel_pt_cache_entry), 200);
 
-	dso->auxtrace_cache = c;
+	dso__set_auxtrace_cache(dso, c);
 
 	return c;
 }
@@ -650,7 +650,7 @@ intel_pt_cache_lookup(struct dso *dso, struct machine *machine, u64 offset)
 	if (!c)
 		return NULL;
 
-	return auxtrace_cache__lookup(dso->auxtrace_cache, offset);
+	return auxtrace_cache__lookup(dso__auxtrace_cache(dso), offset);
 }
 
 static void intel_pt_cache_invalidate(struct dso *dso, struct machine *machine,
@@ -661,7 +661,7 @@ static void intel_pt_cache_invalidate(struct dso *dso, struct machine *machine,
 	if (!c)
 		return;
 
-	auxtrace_cache__remove(dso->auxtrace_cache, offset);
+	auxtrace_cache__remove(dso__auxtrace_cache(dso), offset);
 }
 
 static inline bool intel_pt_guest_kernel_ip(uint64_t ip)
@@ -820,8 +820,8 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 		}
 		dso = map__dso(al.map);
 
-		if (dso->data.status == DSO_DATA_STATUS_ERROR &&
-			dso__data_status_seen(dso, DSO_DATA_STATUS_SEEN_ITRACE)) {
+		if (dso__data(dso)->status == DSO_DATA_STATUS_ERROR &&
+		    dso__data_status_seen(dso, DSO_DATA_STATUS_SEEN_ITRACE)) {
 			ret = -ENOENT;
 			goto out_ret;
 		}
@@ -854,7 +854,7 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 		/* Load maps to ensure dso->is_64_bit has been updated */
 		map__load(al.map);
 
-		x86_64 = dso->is_64_bit;
+		x86_64 = dso__is_64_bit(dso);
 
 		while (1) {
 			len = dso__data_read_offset(dso, machine,
@@ -1008,7 +1008,7 @@ static int __intel_pt_pgd_ip(uint64_t ip, void *data)
 
 	offset = map__map_ip(al.map, ip);
 
-	res = intel_pt_match_pgd_ip(ptq->pt, ip, offset, map__dso(al.map)->long_name);
+	res = intel_pt_match_pgd_ip(ptq->pt, ip, offset, dso__long_name(map__dso(al.map)));
 	addr_location__exit(&al);
 	return res;
 }
@@ -3416,7 +3416,7 @@ static int intel_pt_text_poke(struct intel_pt *pt, union perf_event *event)
 		}
 
 		dso = map__dso(al.map);
-		if (!dso || !dso->auxtrace_cache)
+		if (!dso || !dso__auxtrace_cache(dso))
 			continue;
 
 		offset = map__map_ip(al.map, addr);
@@ -3436,7 +3436,7 @@ static int intel_pt_text_poke(struct intel_pt *pt, union perf_event *event)
 		} else {
 			intel_pt_cache_invalidate(dso, machine, offset);
 			intel_pt_log("Invalidated instruction cache for %s at %#"PRIx64"\n",
-				     dso->long_name, addr);
+				     dso__long_name(dso), addr);
 		}
 	}
 out:
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0210c10e616b..49b8ccd5affe 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -693,7 +693,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
 			err = -ENOMEM;
 			goto out;
 		}
-		dso->kernel = DSO_SPACE__KERNEL;
+		dso__set_kernel(dso, DSO_SPACE__KERNEL);
 		map = map__new2(0, dso);
 		dso__put(dso);
 		if (!map) {
@@ -701,8 +701,8 @@ static int machine__process_ksymbol_register(struct machine *machine,
 			goto out;
 		}
 		if (event->ksymbol.ksym_type == PERF_RECORD_KSYMBOL_TYPE_OOL) {
-			dso->binary_type = DSO_BINARY_TYPE__OOL;
-			dso->data.file_size = event->ksymbol.len;
+			dso__set_binary_type(dso, DSO_BINARY_TYPE__OOL);
+			dso__data(dso)->file_size = event->ksymbol.len;
 			dso__set_loaded(dso);
 		}
 
@@ -717,7 +717,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
 		dso__set_loaded(dso);
 
 		if (is_bpf_image(event->ksymbol.name)) {
-			dso->binary_type = DSO_BINARY_TYPE__BPF_IMAGE;
+			dso__set_binary_type(dso, DSO_BINARY_TYPE__BPF_IMAGE);
 			dso__set_long_name(dso, "", false);
 		}
 	} else {
@@ -887,17 +887,17 @@ size_t machine__fprintf_vmlinux_path(struct machine *machine, FILE *fp)
 	size_t printed = 0;
 	struct dso *kdso = machine__kernel_dso(machine);
 
-	if (kdso->has_build_id) {
+	if (dso__has_build_id(kdso)) {
 		char filename[PATH_MAX];
-		if (dso__build_id_filename(kdso, filename, sizeof(filename),
-					   false))
+
+		if (dso__build_id_filename(kdso, filename, sizeof(filename), false))
 			printed += fprintf(fp, "[0] %s\n", filename);
 	}
 
-	for (i = 0; i < vmlinux_path__nr_entries; ++i)
-		printed += fprintf(fp, "[%d] %s\n",
-				   i + kdso->has_build_id, vmlinux_path[i]);
-
+	for (i = 0; i < vmlinux_path__nr_entries; ++i) {
+		printed += fprintf(fp, "[%d] %s\n", i + dso__has_build_id(kdso),
+				   vmlinux_path[i]);
+	}
 	return printed;
 }
 
@@ -947,7 +947,7 @@ static struct dso *machine__get_kernel(struct machine *machine)
 						 DSO_SPACE__KERNEL_GUEST);
 	}
 
-	if (kernel != NULL && (!kernel->has_build_id))
+	if (kernel != NULL && (!dso__has_build_id(kernel)))
 		dso__read_running_kernel_build_id(kernel, machine);
 
 	return kernel;
@@ -1312,8 +1312,8 @@ static char *get_kernel_version(const char *root_dir)
 
 static bool is_kmod_dso(struct dso *dso)
 {
-	return dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
-	       dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE;
+	return dso__symtab_type(dso) == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
+	       dso__symtab_type(dso) == DSO_BINARY_TYPE__GUEST_KMODULE;
 }
 
 static int maps__set_module_path(struct maps *maps, const char *path, struct kmod_path *m)
@@ -1340,8 +1340,8 @@ static int maps__set_module_path(struct maps *maps, const char *path, struct kmo
 	 * we need to update the symtab_type if needed.
 	 */
 	if (m->comp && is_kmod_dso(dso)) {
-		dso->symtab_type++;
-		dso->comp = m->comp;
+		dso__set_symtab_type(dso, dso__symtab_type(dso) + 1);
+		dso__set_comp(dso, m->comp);
 	}
 	map__put(map);
 	return 0;
@@ -1642,13 +1642,13 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
 		if (kernel == NULL)
 			goto out_problem;
 
-		kernel->kernel = dso_space;
+		dso__set_kernel(kernel, dso_space);
 		if (__machine__create_kernel_maps(machine, kernel) < 0) {
 			dso__put(kernel);
 			goto out_problem;
 		}
 
-		if (strstr(kernel->long_name, "vmlinux"))
+		if (strstr(dso__long_name(kernel), "vmlinux"))
 			dso__set_short_name(kernel, "[kernel.vmlinux]", false);
 
 		if (machine__update_kernel_mmap(machine, xm->start, xm->end) < 0) {
@@ -2030,14 +2030,14 @@ static char *callchain_srcline(struct map_symbol *ms, u64 ip)
 		return srcline;
 
 	dso = map__dso(map);
-	srcline = srcline__tree_find(&dso->srclines, ip);
+	srcline = srcline__tree_find(dso__srclines(dso), ip);
 	if (!srcline) {
 		bool show_sym = false;
 		bool show_addr = callchain_param.key == CCKEY_ADDRESS;
 
 		srcline = get_srcline(dso, map__rip_2objdump(map, ip),
 				      ms->sym, show_sym, show_addr, ip);
-		srcline__tree_insert(&dso->srclines, ip, srcline);
+		srcline__tree_insert(dso__srclines(dso), ip, srcline);
 	}
 
 	return srcline;
@@ -2835,12 +2835,12 @@ static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms
 	addr = map__rip_2objdump(map, addr);
 	dso = map__dso(map);
 
-	inline_node = inlines__tree_find(&dso->inlined_nodes, addr);
+	inline_node = inlines__tree_find(dso__inlined_nodes(dso), addr);
 	if (!inline_node) {
 		inline_node = dso__parse_addr_inlines(dso, addr, sym);
 		if (!inline_node)
 			return ret;
-		inlines__tree_insert(&dso->inlined_nodes, inline_node);
+		inlines__tree_insert(dso__inlined_nodes(dso), inline_node);
 	}
 
 	ilist_ms = (struct map_symbol) {
@@ -3129,7 +3129,7 @@ char *machine__resolve_kernel_addr(void *vmachine, unsigned long long *addrp, ch
 	if (sym == NULL)
 		return NULL;
 
-	*modp = __map__is_kmodule(map) ? (char *)map__dso(map)->short_name : NULL;
+	*modp = __map__is_kmodule(map) ? (char *)dso__short_name(map__dso(map)) : NULL;
 	*addrp = map__unmap_ip(map, sym->start);
 	return sym->name;
 }
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 7c1fff9e413d..14fb8cf65b13 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -168,7 +168,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 		if (dso == NULL)
 			goto out_delete;
 
-		assert(!dso->kernel);
+		assert(!dso__kernel(dso));
 		map__init(result, start, start + len, pgoff, dso);
 
 		if (anon || no_dso) {
@@ -182,10 +182,9 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 			if (!(prot & PROT_EXEC))
 				dso__set_loaded(dso);
 		}
-		mutex_lock(&dso->lock);
-		nsinfo__put(dso->nsinfo);
-		dso->nsinfo = nsi;
-		mutex_unlock(&dso->lock);
+		mutex_lock(dso__lock(dso));
+		dso__set_nsinfo(dso, nsi);
+		mutex_unlock(dso__lock(dso));
 
 		if (build_id__is_defined(bid)) {
 			dso__set_build_id(dso, bid);
@@ -197,9 +196,9 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 			 * have it missing.
 			 */
 			header_bid_dso = dsos__find(&machine->dsos, filename, false);
-			if (header_bid_dso && header_bid_dso->header_build_id) {
-				dso__set_build_id(dso, &header_bid_dso->bid);
-				dso->header_build_id = 1;
+			if (header_bid_dso && dso__header_build_id(header_bid_dso)) {
+				dso__set_build_id(dso, dso__bid(header_bid_dso));
+				dso__set_header_build_id(dso, true);
 			}
 		}
 		dso__put(dso);
@@ -221,7 +220,7 @@ struct map *map__new2(u64 start, struct dso *dso)
 	struct map *result;
 	RC_STRUCT(map) *map;
 
-	map = calloc(1, sizeof(*map) + (dso->kernel ? sizeof(struct kmap) : 0));
+	map = calloc(1, sizeof(*map) + (dso__kernel(dso) ? sizeof(struct kmap) : 0));
 	if (ADD_RC_CHK(result, map)) {
 		/*
 		 * ->end will be filled after we load all the symbols
@@ -234,7 +233,7 @@ struct map *map__new2(u64 start, struct dso *dso)
 
 bool __map__is_kernel(const struct map *map)
 {
-	if (!map__dso(map)->kernel)
+	if (!dso__kernel(map__dso(map)))
 		return false;
 	return machine__kernel_map(maps__machine(map__kmaps((struct map *)map))) == map;
 }
@@ -251,7 +250,7 @@ bool __map__is_bpf_prog(const struct map *map)
 	const char *name;
 	struct dso *dso = map__dso(map);
 
-	if (dso->binary_type == DSO_BINARY_TYPE__BPF_PROG_INFO)
+	if (dso__binary_type(dso) == DSO_BINARY_TYPE__BPF_PROG_INFO)
 		return true;
 
 	/*
@@ -259,7 +258,7 @@ bool __map__is_bpf_prog(const struct map *map)
 	 * type of DSO_BINARY_TYPE__BPF_PROG_INFO. In such cases, we can
 	 * guess the type based on name.
 	 */
-	name = dso->short_name;
+	name = dso__short_name(dso);
 	return name && (strstr(name, "bpf_prog_") == name);
 }
 
@@ -268,7 +267,7 @@ bool __map__is_bpf_image(const struct map *map)
 	const char *name;
 	struct dso *dso = map__dso(map);
 
-	if (dso->binary_type == DSO_BINARY_TYPE__BPF_IMAGE)
+	if (dso__binary_type(dso) == DSO_BINARY_TYPE__BPF_IMAGE)
 		return true;
 
 	/*
@@ -276,7 +275,7 @@ bool __map__is_bpf_image(const struct map *map)
 	 * type of DSO_BINARY_TYPE__BPF_IMAGE. In such cases, we can
 	 * guess the type based on name.
 	 */
-	name = dso->short_name;
+	name = dso__short_name(dso);
 	return name && is_bpf_image(name);
 }
 
@@ -284,7 +283,7 @@ bool __map__is_ool(const struct map *map)
 {
 	const struct dso *dso = map__dso(map);
 
-	return dso && dso->binary_type == DSO_BINARY_TYPE__OOL;
+	return dso && dso__binary_type(dso) == DSO_BINARY_TYPE__OOL;
 }
 
 bool map__has_symbols(const struct map *map)
@@ -315,7 +314,7 @@ void map__put(struct map *map)
 void map__fixup_start(struct map *map)
 {
 	struct dso *dso = map__dso(map);
-	struct rb_root_cached *symbols = &dso->symbols;
+	struct rb_root_cached *symbols = dso__symbols(dso);
 	struct rb_node *nd = rb_first_cached(symbols);
 
 	if (nd != NULL) {
@@ -328,7 +327,7 @@ void map__fixup_start(struct map *map)
 void map__fixup_end(struct map *map)
 {
 	struct dso *dso = map__dso(map);
-	struct rb_root_cached *symbols = &dso->symbols;
+	struct rb_root_cached *symbols = dso__symbols(dso);
 	struct rb_node *nd = rb_last(&symbols->rb_root);
 
 	if (nd != NULL) {
@@ -342,7 +341,7 @@ void map__fixup_end(struct map *map)
 int map__load(struct map *map)
 {
 	struct dso *dso = map__dso(map);
-	const char *name = dso->long_name;
+	const char *name = dso__long_name(dso);
 	int nr;
 
 	if (dso__loaded(dso))
@@ -350,10 +349,10 @@ int map__load(struct map *map)
 
 	nr = dso__load(dso, map);
 	if (nr < 0) {
-		if (dso->has_build_id) {
+		if (dso__has_build_id(dso)) {
 			char sbuild_id[SBUILD_ID_SIZE];
 
-			build_id__sprintf(&dso->bid, sbuild_id);
+			build_id__sprintf(dso__bid(dso), sbuild_id);
 			pr_debug("%s with build id %s not found", name, sbuild_id);
 		} else
 			pr_debug("Failed to open %s", name);
@@ -415,7 +414,7 @@ struct map *map__clone(struct map *from)
 	size_t size = sizeof(RC_STRUCT(map));
 	struct dso *dso = map__dso(from);
 
-	if (dso && dso->kernel)
+	if (dso && dso__kernel(dso))
 		size += sizeof(struct kmap);
 
 	map = memdup(RC_CHK_ACCESS(from), size);
@@ -432,14 +431,14 @@ size_t map__fprintf(struct map *map, FILE *fp)
 	const struct dso *dso = map__dso(map);
 
 	return fprintf(fp, " %" PRIx64 "-%" PRIx64 " %" PRIx64 " %s\n",
-		       map__start(map), map__end(map), map__pgoff(map), dso->name);
+		       map__start(map), map__end(map), map__pgoff(map), dso__name(dso));
 }
 
 static bool prefer_dso_long_name(const struct dso *dso, bool print_off)
 {
-	return dso->long_name &&
+	return dso__long_name(dso) &&
 	       (symbol_conf.show_kernel_path ||
-		(print_off && (dso->name[0] == '[' || dso__is_kcore(dso))));
+		(print_off && (dso__name(dso)[0] == '[' || dso__is_kcore(dso))));
 }
 
 static size_t __map__fprintf_dsoname(struct map *map, bool print_off, FILE *fp)
@@ -450,9 +449,9 @@ static size_t __map__fprintf_dsoname(struct map *map, bool print_off, FILE *fp)
 
 	if (dso) {
 		if (prefer_dso_long_name(dso, print_off))
-			dsoname = dso->long_name;
+			dsoname = dso__long_name(dso);
 		else
-			dsoname = dso->name;
+			dsoname = dso__name(dso);
 	}
 
 	if (symbol_conf.pad_output_len_dso) {
@@ -545,18 +544,18 @@ u64 map__rip_2objdump(struct map *map, u64 rip)
 		}
 	}
 
-	if (!dso->adjust_symbols)
+	if (!dso__adjust_symbols(dso))
 		return rip;
 
-	if (dso->rel)
+	if (dso__rel(dso))
 		return rip - map__pgoff(map);
 
 	/*
 	 * kernel modules also have DSO_TYPE_USER in dso->kernel,
 	 * but all kernel modules are ET_REL, so won't get here.
 	 */
-	if (dso->kernel == DSO_SPACE__USER)
-		return rip + dso->text_offset;
+	if (dso__kernel(dso) == DSO_SPACE__USER)
+		return rip + dso__text_offset(dso);
 
 	return map__unmap_ip(map, rip) - map__reloc(map);
 }
@@ -577,18 +576,18 @@ u64 map__objdump_2mem(struct map *map, u64 ip)
 {
 	const struct dso *dso = map__dso(map);
 
-	if (!dso->adjust_symbols)
+	if (!dso__adjust_symbols(dso))
 		return map__unmap_ip(map, ip);
 
-	if (dso->rel)
+	if (dso__rel(dso))
 		return map__unmap_ip(map, ip + map__pgoff(map));
 
 	/*
 	 * kernel modules also have DSO_TYPE_USER in dso->kernel,
 	 * but all kernel modules are ET_REL, so won't get here.
 	 */
-	if (dso->kernel == DSO_SPACE__USER)
-		return map__unmap_ip(map, ip - dso->text_offset);
+	if (dso__kernel(dso) == DSO_SPACE__USER)
+		return map__unmap_ip(map, ip - dso__text_offset(dso));
 
 	return ip + map__reloc(map);
 }
@@ -604,7 +603,7 @@ struct kmap *__map__kmap(struct map *map)
 {
 	const struct dso *dso = map__dso(map);
 
-	if (!dso || !dso->kernel)
+	if (!dso || !dso__kernel(dso))
 		return NULL;
 	return (struct kmap *)(&RC_CHK_ACCESS(map)[1]);
 }
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index cb52de9d6c2a..a248abe8c363 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -76,7 +76,7 @@ static void check_invariants(const struct maps *maps __maybe_unused)
 		/* Expect at least 1 reference count. */
 		assert(refcount_read(map__refcnt(map)) > 0);
 
-		if (map__dso(map) && map__dso(map)->kernel)
+		if (map__dso(map) && dso__kernel(map__dso(map)))
 			assert(RC_CHK_EQUAL(map__kmap(map)->kmaps, maps));
 
 		if (i > 0) {
@@ -331,7 +331,7 @@ static int map__strcmp(const void *a, const void *b)
 	const struct map *map_b = *(const struct map * const *)b;
 	const struct dso *dso_a = map__dso(map_a);
 	const struct dso *dso_b = map__dso(map_b);
-	int ret = strcmp(dso_a->short_name, dso_b->short_name);
+	int ret = strcmp(dso__short_name(dso_a), dso__short_name(dso_b));
 
 	if (ret == 0 && RC_CHK_ACCESS(map_a) != RC_CHK_ACCESS(map_b)) {
 		/* Ensure distinct but name equal maps have an order. */
@@ -470,7 +470,7 @@ static int __maps__insert(struct maps *maps, struct map *new)
 	}
 	if (map__end(new) < map__start(new))
 		RC_CHK_ACCESS(maps)->ends_broken = true;
-	if (dso && dso->kernel) {
+	if (dso && dso__kernel(dso)) {
 		struct kmap *kmap = map__kmap(new);
 
 		if (kmap)
@@ -747,7 +747,7 @@ static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
 
 		if (use_browser) {
 			pr_debug("overlapping maps in %s (disable tui for more info)\n",
-				map__dso(new)->name);
+				dso__name(map__dso(new)));
 		} else if (verbose >= 2) {
 			pr_debug("overlapping maps:\n");
 			map__fprintf(new, fp);
@@ -968,7 +968,7 @@ static int map__strcmp_name(const void *name, const void *b)
 {
 	const struct dso *dso = map__dso(*(const struct map **)b);
 
-	return strcmp(name, dso->short_name);
+	return strcmp(name, dso__short_name(dso));
 }
 
 struct map *maps__find_by_name(struct maps *maps, const char *name)
@@ -987,7 +987,7 @@ struct map *maps__find_by_name(struct maps *maps, const char *name)
 		if (i < maps__nr_maps(maps) && maps__maps_by_name(maps)) {
 			struct dso *dso = map__dso(maps__maps_by_name(maps)[i]);
 
-			if (dso && strcmp(dso->short_name, name) == 0) {
+			if (dso && strcmp(dso__short_name(dso), name) == 0) {
 				result = map__get(maps__maps_by_name(maps)[i]);
 				done = true;
 			}
@@ -1024,7 +1024,7 @@ struct map *maps__find_by_name(struct maps *maps, const char *name)
 					struct map *pos = maps_by_address[i];
 					struct dso *dso = map__dso(pos);
 
-					if (dso && strcmp(dso->short_name, name) == 0) {
+					if (dso && strcmp(dso__short_name(dso), name) == 0) {
 						result = map__get(pos);
 						break;
 					}
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index be71abe8b9b0..26c084b4a4a6 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -158,8 +158,8 @@ static int kernel_get_module_map_cb(struct map *map, void *data)
 {
 	struct kernel_get_module_map_cb_args *args = data;
 	struct dso *dso = map__dso(map);
-	const char *short_name = dso->short_name; /* short_name is "[module]" */
-	u16 short_name_len =  dso->short_name_len;
+	const char *short_name = dso__short_name(dso); /* short_name is "[module]" */
+	u16 short_name_len = dso__short_name_len(dso);
 
 	if (strncmp(short_name + 1, args->module, short_name_len - 2) == 0 &&
 	    args->module[short_name_len - 2] == '\0') {
@@ -201,10 +201,9 @@ struct map *get_target_map(const char *target, struct nsinfo *nsi, bool user)
 		map = dso__new_map(target);
 		dso = map ? map__dso(map) : NULL;
 		if (dso) {
-			mutex_lock(&dso->lock);
-			nsinfo__put(dso->nsinfo);
-			dso->nsinfo = nsinfo__get(nsi);
-			mutex_unlock(&dso->lock);
+			mutex_lock(dso__lock(dso));
+			dso__set_nsinfo(dso, nsinfo__get(nsi));
+			mutex_unlock(dso__lock(dso));
 		}
 		return map;
 	} else {
@@ -367,11 +366,11 @@ static int kernel_get_module_dso(const char *module, struct dso **pdso)
 
 	map = machine__kernel_map(host_machine);
 	dso = map__dso(map);
-	if (!dso->has_build_id)
+	if (!dso__has_build_id(dso))
 		dso__read_running_kernel_build_id(dso, host_machine);
 
 	vmlinux_name = symbol_conf.vmlinux_name;
-	dso->load_errno = 0;
+	*dso__load_errno(dso) = 0;
 	if (vmlinux_name)
 		ret = dso__load_vmlinux(dso, map, vmlinux_name, false);
 	else
@@ -498,7 +497,7 @@ static struct debuginfo *open_from_debuginfod(struct dso *dso, struct nsinfo *ns
 	if (!c)
 		return NULL;
 
-	build_id__sprintf(&dso->bid, sbuild_id);
+	build_id__sprintf(dso__bid(dso), sbuild_id);
 	fd = debuginfod_find_debuginfo(c, (const unsigned char *)sbuild_id,
 					0, &path);
 	if (fd >= 0)
@@ -541,7 +540,7 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
 	if (!module || !strchr(module, '/')) {
 		err = kernel_get_module_dso(module, &dso);
 		if (err < 0) {
-			if (!dso || dso->load_errno == 0) {
+			if (!dso || *dso__load_errno(dso) == 0) {
 				if (!str_error_r(-err, reason, STRERR_BUFSIZE))
 					strcpy(reason, "(unknown)");
 			} else
@@ -558,7 +557,7 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
 			}
 			return NULL;
 		}
-		path = dso->long_name;
+		path = dso__long_name(dso);
 	}
 	nsinfo__mountns_enter(nsi, &nsc);
 	ret = debuginfo__new(path);
@@ -3796,8 +3795,8 @@ int show_available_funcs(const char *target, struct nsinfo *nsi,
 	/* Show all (filtered) symbols */
 	setup_pager();
 
-	for (size_t i = 0; i < dso->symbol_names_len; i++) {
-		struct symbol *pos = dso->symbol_names[i];
+	for (size_t i = 0; i < dso__symbol_names_len(dso); i++) {
+		struct symbol *pos = dso__symbol_names(dso)[i];
 
 		if (strfilter__compare(_filter, pos->name))
 			printf("%s\n", pos->name);
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index b072ac5d3bc2..e16257d5ab2c 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -320,10 +320,10 @@ static SV *perl_process_callchain(struct perf_sample *sample,
 			const char *dsoname = "[unknown]";
 
 			if (dso) {
-				if (symbol_conf.show_kernel_path && dso->long_name)
-					dsoname = dso->long_name;
+				if (symbol_conf.show_kernel_path && dso__long_name(dso))
+					dsoname = dso__long_name(dso);
 				else
-					dsoname = dso->name;
+					dsoname = dso__name(dso);
 			}
 			if (!hv_stores(elem, "dso", newSVpv(dsoname,0))) {
 				hv_undef(elem);
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 860e1837ba96..9cc2372c93c3 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -393,10 +393,10 @@ static const char *get_dsoname(struct map *map)
 	struct dso *dso = map ? map__dso(map) : NULL;
 
 	if (dso) {
-		if (symbol_conf.show_kernel_path && dso->long_name)
-			dsoname = dso->long_name;
+		if (symbol_conf.show_kernel_path && dso__long_name(dso))
+			dsoname = dso__long_name(dso);
 		else
-			dsoname = dso->name;
+			dsoname = dso__name(dso);
 	}
 
 	return dsoname;
@@ -799,8 +799,9 @@ static void set_sym_in_dict(PyObject *dict, struct addr_location *al,
 	if (al->map) {
 		struct dso *dso = map__dso(al->map);
 
-		pydict_set_item_string_decref(dict, dso_field, _PyUnicode_FromString(dso->name));
-		build_id__sprintf(&dso->bid, sbuild_id);
+		pydict_set_item_string_decref(dict, dso_field,
+					      _PyUnicode_FromString(dso__name(dso)));
+		build_id__sprintf(dso__bid(dso), sbuild_id);
 		pydict_set_item_string_decref(dict, dso_bid_field,
 			_PyUnicode_FromString(sbuild_id));
 		pydict_set_item_string_decref(dict, dso_map_start,
@@ -1242,14 +1243,14 @@ static int python_export_dso(struct db_export *dbe, struct dso *dso,
 	char sbuild_id[SBUILD_ID_SIZE];
 	PyObject *t;
 
-	build_id__sprintf(&dso->bid, sbuild_id);
+	build_id__sprintf(dso__bid(dso), sbuild_id);
 
 	t = tuple_new(5);
 
-	tuple_set_d64(t, 0, dso->db_id);
+	tuple_set_d64(t, 0, dso__db_id(dso));
 	tuple_set_d64(t, 1, machine->db_id);
-	tuple_set_string(t, 2, dso->short_name);
-	tuple_set_string(t, 3, dso->long_name);
+	tuple_set_string(t, 2, dso__short_name(dso));
+	tuple_set_string(t, 3, dso__long_name(dso));
 	tuple_set_string(t, 4, sbuild_id);
 
 	call_object(tables->dso_handler, t, "dso_table");
@@ -1269,7 +1270,7 @@ static int python_export_symbol(struct db_export *dbe, struct symbol *sym,
 	t = tuple_new(6);
 
 	tuple_set_d64(t, 0, *sym_db_id);
-	tuple_set_d64(t, 1, dso->db_id);
+	tuple_set_d64(t, 1, dso__db_id(dso));
 	tuple_set_d64(t, 2, sym->start);
 	tuple_set_d64(t, 3, sym->end);
 	tuple_set_s32(t, 4, sym->binding);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 30254eb63709..87c2ed6051ee 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -239,11 +239,11 @@ static int64_t _sort__dso_cmp(struct map *map_l, struct map *map_r)
 		return cmp_null(dso_r, dso_l);
 
 	if (verbose > 0) {
-		dso_name_l = dso_l->long_name;
-		dso_name_r = dso_r->long_name;
+		dso_name_l = dso__long_name(dso_l);
+		dso_name_r = dso__long_name(dso_r);
 	} else {
-		dso_name_l = dso_l->short_name;
-		dso_name_r = dso_r->short_name;
+		dso_name_l = dso__short_name(dso_l);
+		dso_name_r = dso__short_name(dso_r);
 	}
 
 	return strcmp(dso_name_l, dso_name_r);
@@ -262,7 +262,7 @@ static int _hist_entry__dso_snprintf(struct map *map, char *bf,
 	const char *dso_name = "[unknown]";
 
 	if (dso)
-		dso_name = verbose > 0 ? dso->long_name : dso->short_name;
+		dso_name = verbose > 0 ? dso__long_name(dso) : dso__short_name(dso);
 
 	return repsep_snprintf(bf, size, "%-*.*s", width, width, dso_name);
 }
@@ -364,7 +364,7 @@ static int _hist_entry__sym_snprintf(struct map_symbol *ms,
 		char o = dso ? dso__symtab_origin(dso) : '!';
 		u64 rip = ip;
 
-		if (dso && dso->kernel && dso->adjust_symbols)
+		if (dso && dso__kernel(dso) && dso__adjust_symbols(dso))
 			rip = map__unmap_ip(map, ip);
 
 		ret += repsep_snprintf(bf, size, "%-#*llx %c ",
@@ -1586,8 +1586,8 @@ sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right)
 	 */
 
 	if ((left->cpumode != PERF_RECORD_MISC_KERNEL) &&
-	    (!(map__flags(l_map) & MAP_SHARED)) && !l_dso->id.maj && !l_dso->id.min &&
-	    !l_dso->id.ino && !l_dso->id.ino_generation) {
+	    (!(map__flags(l_map) & MAP_SHARED)) && !dso__id(l_dso)->maj && !dso__id(l_dso)->min &&
+	     !dso__id(l_dso)->ino && !dso__id(l_dso)->ino_generation) {
 		/* userspace anonymous */
 
 		if (thread__pid(left->thread) > thread__pid(right->thread))
@@ -1626,7 +1626,8 @@ static int hist_entry__dcacheline_snprintf(struct hist_entry *he, char *bf,
 		if ((he->cpumode != PERF_RECORD_MISC_KERNEL) &&
 		     map && !(map__prot(map) & PROT_EXEC) &&
 		     (map__flags(map) & MAP_SHARED) &&
-		    (dso->id.maj || dso->id.min || dso->id.ino || dso->id.ino_generation))
+		     (dso__id(dso)->maj || dso__id(dso)->min || dso__id(dso)->ino ||
+		      dso__id(dso)->ino_generation))
 			level = 's';
 		else if (!map)
 			level = 'X';
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 034b496df297..7a56b8b0792a 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -27,14 +27,14 @@ bool srcline_full_filename;
 
 char *srcline__unknown = (char *)"??:0";
 
-static const char *dso__name(struct dso *dso)
+static const char *srcline_dso_name(struct dso *dso)
 {
 	const char *dso_name;
 
-	if (dso->symsrc_filename)
-		dso_name = dso->symsrc_filename;
+	if (dso__symsrc_filename(dso))
+		dso_name = dso__symsrc_filename(dso);
 	else
-		dso_name = dso->long_name;
+		dso_name = dso__long_name(dso);
 
 	if (dso_name[0] == '[')
 		return NULL;
@@ -636,7 +636,7 @@ static int addr2line(const char *dso_name, u64 addr,
 		     struct inline_node *node,
 		     struct symbol *sym __maybe_unused)
 {
-	struct child_process *a2l = dso->a2l;
+	struct child_process *a2l = dso__a2l(dso);
 	char *record_function = NULL;
 	char *record_filename = NULL;
 	unsigned int record_line_nr = 0;
@@ -653,8 +653,9 @@ static int addr2line(const char *dso_name, u64 addr,
 		if (!filename__has_section(dso_name, ".debug_line"))
 			goto out;
 
-		dso->a2l = addr2line_subprocess_init(symbol_conf.addr2line_path, dso_name);
-		a2l = dso->a2l;
+		dso__set_a2l(dso,
+			     addr2line_subprocess_init(symbol_conf.addr2line_path, dso_name));
+		a2l = dso__a2l(dso);
 	}
 
 	if (a2l == NULL) {
@@ -768,7 +769,7 @@ static int addr2line(const char *dso_name, u64 addr,
 	free(record_function);
 	free(record_filename);
 	if (io.eof) {
-		dso->a2l = NULL;
+		dso__set_a2l(dso, NULL);
 		addr2line_subprocess_cleanup(a2l);
 	}
 	return ret;
@@ -776,14 +777,14 @@ static int addr2line(const char *dso_name, u64 addr,
 
 void dso__free_a2l(struct dso *dso)
 {
-	struct child_process *a2l = dso->a2l;
+	struct child_process *a2l = dso__a2l(dso);
 
 	if (!a2l)
 		return;
 
 	addr2line_subprocess_cleanup(a2l);
 
-	dso->a2l = NULL;
+	dso__set_a2l(dso, NULL);
 }
 
 #endif /* HAVE_LIBBFD_SUPPORT */
@@ -821,33 +822,34 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
 	char *srcline;
 	const char *dso_name;
 
-	if (!dso->has_srcline)
+	if (!dso__has_srcline(dso))
 		goto out;
 
-	dso_name = dso__name(dso);
+	dso_name = srcline_dso_name(dso);
 	if (dso_name == NULL)
-		goto out;
+		goto out_err;
 
 	if (!addr2line(dso_name, addr, &file, &line, dso,
 		       unwind_inlines, NULL, sym))
-		goto out;
+		goto out_err;
 
 	srcline = srcline_from_fileline(file, line);
 	free(file);
 
 	if (!srcline)
-		goto out;
+		goto out_err;
 
-	dso->a2l_fails = 0;
+	dso__set_a2l_fails(dso, 0);
 
 	return srcline;
 
-out:
-	if (dso->a2l_fails && ++dso->a2l_fails > A2L_FAIL_LIMIT) {
-		dso->has_srcline = 0;
+out_err:
+	dso__set_a2l_fails(dso, dso__a2l_fails(dso) + 1);
+	if (dso__a2l_fails(dso) > A2L_FAIL_LIMIT) {
+		dso__set_has_srcline(dso, false);
 		dso__free_a2l(dso);
 	}
-
+out:
 	if (!show_addr)
 		return (show_sym && sym) ?
 			    strndup(sym->name, sym->namelen) : SRCLINE_UNKNOWN;
@@ -856,7 +858,7 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
 		if (asprintf(&srcline, "%s+%" PRIu64, show_sym ? sym->name : "",
 					ip - sym->start) < 0)
 			return SRCLINE_UNKNOWN;
-	} else if (asprintf(&srcline, "%s[%" PRIx64 "]", dso->short_name, addr) < 0)
+	} else if (asprintf(&srcline, "%s[%" PRIx64 "]", dso__short_name(dso), addr) < 0)
 		return SRCLINE_UNKNOWN;
 	return srcline;
 }
@@ -867,22 +869,23 @@ char *get_srcline_split(struct dso *dso, u64 addr, unsigned *line)
 	char *file = NULL;
 	const char *dso_name;
 
-	if (!dso->has_srcline)
-		goto out;
+	if (!dso__has_srcline(dso))
+		return NULL;
 
-	dso_name = dso__name(dso);
+	dso_name = srcline_dso_name(dso);
 	if (dso_name == NULL)
-		goto out;
+		goto out_err;
 
 	if (!addr2line(dso_name, addr, &file, line, dso, true, NULL, NULL))
-		goto out;
+		goto out_err;
 
-	dso->a2l_fails = 0;
+	dso__set_a2l_fails(dso, 0);
 	return file;
 
-out:
-	if (dso->a2l_fails && ++dso->a2l_fails > A2L_FAIL_LIMIT) {
-		dso->has_srcline = 0;
+out_err:
+	dso__set_a2l_fails(dso, dso__a2l_fails(dso) + 1);
+	if (dso__a2l_fails(dso) > A2L_FAIL_LIMIT) {
+		dso__set_has_srcline(dso, false);
 		dso__free_a2l(dso);
 	}
 
@@ -980,7 +983,7 @@ struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr,
 {
 	const char *dso_name;
 
-	dso_name = dso__name(dso);
+	dso_name = srcline_dso_name(dso);
 	if (dso_name == NULL)
 		return NULL;
 
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 5990e3fabdb5..de73f9fb3fe4 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -311,8 +311,8 @@ static char *demangle_sym(struct dso *dso, int kmodule, const char *elf_name)
 	 * DWARF DW_compile_unit has this, but we don't always have access
 	 * to it...
 	 */
-	if (!want_demangle(dso->kernel || kmodule))
-	    return demangled;
+	if (!want_demangle(dso__kernel(dso) || kmodule))
+		return demangled;
 
 	demangled = cxx_demangle_sym(elf_name, verbose > 0, verbose > 0);
 	if (demangled == NULL) {
@@ -469,7 +469,7 @@ static bool get_plt_sizes(struct dso *dso, GElf_Ehdr *ehdr, GElf_Shdr *shdr_plt,
 	}
 	if (*plt_entry_size)
 		return true;
-	pr_debug("Missing PLT entry size for %s\n", dso->long_name);
+	pr_debug("Missing PLT entry size for %s\n", dso__long_name(dso));
 	return false;
 }
 
@@ -653,7 +653,7 @@ static int dso__synthesize_plt_got_symbols(struct dso *dso, Elf *elf,
 		sym = symbol__new(shdr.sh_offset + i, shdr.sh_entsize, STB_GLOBAL, STT_FUNC, buf);
 		if (!sym)
 			goto out;
-		symbols__insert(&dso->symbols, sym);
+		symbols__insert(dso__symbols(dso), sym);
 	}
 	err = 0;
 out:
@@ -707,7 +707,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss)
 	plt_sym = symbol__new(shdr_plt.sh_offset, plt_header_size, STB_GLOBAL, STT_FUNC, ".plt");
 	if (!plt_sym)
 		goto out_elf_end;
-	symbols__insert(&dso->symbols, plt_sym);
+	symbols__insert(dso__symbols(dso), plt_sym);
 
 	/* Only x86 has .plt.got */
 	if (machine_is_x86(ehdr.e_machine) &&
@@ -829,7 +829,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss)
 			goto out_elf_end;
 
 		plt_offset += plt_entry_size;
-		symbols__insert(&dso->symbols, f);
+		symbols__insert(dso__symbols(dso), f);
 		++nr;
 	}
 
@@ -839,7 +839,7 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss)
 	if (err == 0)
 		return nr;
 	pr_debug("%s: problems reading %s PLT info.\n",
-		 __func__, dso->long_name);
+		 __func__, dso__long_name(dso));
 	return 0;
 }
 
@@ -1174,19 +1174,19 @@ static int dso__swap_init(struct dso *dso, unsigned char eidata)
 {
 	static unsigned int const endian = 1;
 
-	dso->needs_swap = DSO_SWAP__NO;
+	dso__set_needs_swap(dso, DSO_SWAP__NO);
 
 	switch (eidata) {
 	case ELFDATA2LSB:
 		/* We are big endian, DSO is little endian. */
 		if (*(unsigned char const *)&endian != 1)
-			dso->needs_swap = DSO_SWAP__YES;
+			dso__set_needs_swap(dso, DSO_SWAP__YES);
 		break;
 
 	case ELFDATA2MSB:
 		/* We are little endian, DSO is big endian. */
 		if (*(unsigned char const *)&endian != 0)
-			dso->needs_swap = DSO_SWAP__YES;
+			dso__set_needs_swap(dso, DSO_SWAP__YES);
 		break;
 
 	default:
@@ -1237,11 +1237,11 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
 		if (fd < 0)
 			return -1;
 
-		type = dso->symtab_type;
+		type = dso__symtab_type(dso);
 	} else {
 		fd = open(name, O_RDONLY);
 		if (fd < 0) {
-			dso->load_errno = errno;
+			*dso__load_errno(dso) = errno;
 			return -1;
 		}
 	}
@@ -1249,37 +1249,37 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
 	elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
 	if (elf == NULL) {
 		pr_debug("%s: cannot read %s ELF file.\n", __func__, name);
-		dso->load_errno = DSO_LOAD_ERRNO__INVALID_ELF;
+		*dso__load_errno(dso) = DSO_LOAD_ERRNO__INVALID_ELF;
 		goto out_close;
 	}
 
 	if (gelf_getehdr(elf, &ehdr) == NULL) {
-		dso->load_errno = DSO_LOAD_ERRNO__INVALID_ELF;
+		*dso__load_errno(dso) = DSO_LOAD_ERRNO__INVALID_ELF;
 		pr_debug("%s: cannot get elf header.\n", __func__);
 		goto out_elf_end;
 	}
 
 	if (dso__swap_init(dso, ehdr.e_ident[EI_DATA])) {
-		dso->load_errno = DSO_LOAD_ERRNO__INTERNAL_ERROR;
+		*dso__load_errno(dso) = DSO_LOAD_ERRNO__INTERNAL_ERROR;
 		goto out_elf_end;
 	}
 
 	/* Always reject images with a mismatched build-id: */
-	if (dso->has_build_id && !symbol_conf.ignore_vmlinux_buildid) {
+	if (dso__has_build_id(dso) && !symbol_conf.ignore_vmlinux_buildid) {
 		u8 build_id[BUILD_ID_SIZE];
 		struct build_id bid;
 		int size;
 
 		size = elf_read_build_id(elf, build_id, BUILD_ID_SIZE);
 		if (size <= 0) {
-			dso->load_errno = DSO_LOAD_ERRNO__CANNOT_READ_BUILDID;
+			*dso__load_errno(dso) = DSO_LOAD_ERRNO__CANNOT_READ_BUILDID;
 			goto out_elf_end;
 		}
 
 		build_id__init(&bid, build_id, size);
 		if (!dso__build_id_equal(dso, &bid)) {
 			pr_debug("%s: build id mismatch for %s.\n", __func__, name);
-			dso->load_errno = DSO_LOAD_ERRNO__MISMATCHING_BUILDID;
+			*dso__load_errno(dso) = DSO_LOAD_ERRNO__MISMATCHING_BUILDID;
 			goto out_elf_end;
 		}
 	}
@@ -1304,14 +1304,14 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
 	if (ss->opdshdr.sh_type != SHT_PROGBITS)
 		ss->opdsec = NULL;
 
-	if (dso->kernel == DSO_SPACE__USER)
+	if (dso__kernel(dso) == DSO_SPACE__USER)
 		ss->adjust_symbols = true;
 	else
 		ss->adjust_symbols = elf__needs_adjust_symbols(ehdr);
 
 	ss->name   = strdup(name);
 	if (!ss->name) {
-		dso->load_errno = errno;
+		*dso__load_errno(dso) = errno;
 		goto out_elf_end;
 	}
 
@@ -1378,7 +1378,7 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 	if (adjust_kernel_syms)
 		sym->st_value -= shdr->sh_addr - shdr->sh_offset;
 
-	if (strcmp(section_name, (curr_dso->short_name + dso->short_name_len)) == 0)
+	if (strcmp(section_name, (dso__short_name(curr_dso) + dso__short_name_len(dso))) == 0)
 		return 0;
 
 	if (strcmp(section_name, ".text") == 0) {
@@ -1387,7 +1387,7 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 		 * kallsyms and identity maps.  Overwrite it to
 		 * map to the kernel dso.
 		 */
-		if (*remap_kernel && dso->kernel && !kmodule) {
+		if (*remap_kernel && dso__kernel(dso) && !kmodule) {
 			*remap_kernel = false;
 			map__set_start(map, shdr->sh_addr + ref_reloc(kmap));
 			map__set_end(map, map__start(map) + shdr->sh_size);
@@ -1424,7 +1424,7 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 	if (!kmap)
 		return 0;
 
-	snprintf(dso_name, sizeof(dso_name), "%s%s", dso->short_name, section_name);
+	snprintf(dso_name, sizeof(dso_name), "%s%s", dso__short_name(dso), section_name);
 
 	curr_map = maps__find_by_name(kmaps, dso_name);
 	if (curr_map == NULL) {
@@ -1436,17 +1436,17 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 		curr_dso = dso__new(dso_name);
 		if (curr_dso == NULL)
 			return -1;
-		curr_dso->kernel = dso->kernel;
-		curr_dso->long_name = dso->long_name;
-		curr_dso->long_name_len = dso->long_name_len;
-		curr_dso->binary_type = dso->binary_type;
-		curr_dso->adjust_symbols = dso->adjust_symbols;
+		dso__set_kernel(curr_dso, dso__kernel(dso));
+		RC_CHK_ACCESS(curr_dso)->long_name = dso__long_name(dso);
+		RC_CHK_ACCESS(curr_dso)->long_name_len = dso__long_name_len(dso);
+		dso__set_binary_type(curr_dso, dso__binary_type(dso));
+		dso__set_adjust_symbols(curr_dso, dso__adjust_symbols(dso));
 		curr_map = map__new2(start, curr_dso);
 		dso__put(curr_dso);
 		if (curr_map == NULL)
 			return -1;
 
-		if (curr_dso->kernel)
+		if (dso__kernel(curr_dso))
 			map__kmap(curr_map)->kmaps = kmaps;
 
 		if (adjust_kernel_syms) {
@@ -1456,7 +1456,7 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 		} else {
 			map__set_mapping_type(curr_map, MAPPING_TYPE__IDENTITY);
 		}
-		curr_dso->symtab_type = dso->symtab_type;
+		dso__set_symtab_type(curr_dso, dso__symtab_type(dso));
 		if (maps__insert(kmaps, curr_map))
 			return -1;
 		/*
@@ -1482,7 +1482,7 @@ static int
 dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 		       struct symsrc *runtime_ss, int kmodule, int dynsym)
 {
-	struct kmap *kmap = dso->kernel ? map__kmap(map) : NULL;
+	struct kmap *kmap = dso__kernel(dso) ? map__kmap(map) : NULL;
 	struct maps *kmaps = kmap ? map__kmaps(map) : NULL;
 	struct map *curr_map = map;
 	struct dso *curr_dso = dso;
@@ -1515,8 +1515,8 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 	if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
 				".text", NULL)) {
-		dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
-		dso->text_end = tshdr.sh_offset + tshdr.sh_size;
+		dso__set_text_offset(dso, tshdr.sh_addr - tshdr.sh_offset);
+		dso__set_text_end(dso, tshdr.sh_offset + tshdr.sh_size);
 	}
 
 	if (runtime_ss->opdsec)
@@ -1575,16 +1575,16 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 	 * attempted to prelink vdso to its virtual address.
 	 */
 	if (dso__is_vdso(dso))
-		map__set_reloc(map, map__start(map) - dso->text_offset);
+		map__set_reloc(map, map__start(map) - dso__text_offset(dso));
 
-	dso->adjust_symbols = runtime_ss->adjust_symbols || ref_reloc(kmap);
+	dso__set_adjust_symbols(dso, runtime_ss->adjust_symbols || ref_reloc(kmap));
 	/*
 	 * Initial kernel and module mappings do not map to the dso.
 	 * Flag the fixups.
 	 */
-	if (dso->kernel) {
+	if (dso__kernel(dso)) {
 		remap_kernel = true;
-		adjust_kernel_syms = dso->adjust_symbols;
+		adjust_kernel_syms = dso__adjust_symbols(dso);
 	}
 	elf_symtab__for_each_symbol(syms, nr_syms, idx, sym) {
 		struct symbol *f;
@@ -1673,7 +1673,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 		    (sym.st_value & 1))
 			--sym.st_value;
 
-		if (dso->kernel) {
+		if (dso__kernel(dso)) {
 			if (dso__process_kernel_symbol(dso, map, &sym, &shdr, kmaps, kmap, &curr_dso, &curr_map,
 						       section_name, adjust_kernel_syms, kmodule, &remap_kernel))
 				goto out_elf_end;
@@ -1721,7 +1721,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 
 		arch__sym_update(f, &sym);
 
-		__symbols__insert(&curr_dso->symbols, f, dso->kernel);
+		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
 		nr++;
 	}
 
@@ -1729,8 +1729,8 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 	 * For misannotated, zeroed, ASM function sizes.
 	 */
 	if (nr > 0) {
-		symbols__fixup_end(&dso->symbols, false);
-		symbols__fixup_duplicate(&dso->symbols);
+		symbols__fixup_end(dso__symbols(dso), false);
+		symbols__fixup_duplicate(dso__symbols(dso));
 		if (kmap) {
 			/*
 			 * We need to fixup this here too because we create new
@@ -1750,16 +1750,16 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 	int nr = 0;
 	int err = -1;
 
-	dso->symtab_type = syms_ss->type;
-	dso->is_64_bit = syms_ss->is_64_bit;
-	dso->rel = syms_ss->ehdr.e_type == ET_REL;
+	dso__set_symtab_type(dso, syms_ss->type);
+	dso__set_is_64_bit(dso, syms_ss->is_64_bit);
+	dso__set_rel(dso, syms_ss->ehdr.e_type == ET_REL);
 
 	/*
 	 * Modules may already have symbols from kallsyms, but those symbols
 	 * have the wrong values for the dso maps, so remove them.
 	 */
 	if (kmodule && syms_ss->symtab)
-		symbols__delete(&dso->symbols);
+		symbols__delete(dso__symbols(dso));
 
 	if (!syms_ss->symtab) {
 		/*
@@ -1767,7 +1767,7 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 		 * to using kallsyms. The vmlinux runtime symbols aren't
 		 * of much use.
 		 */
-		if (dso->kernel)
+		if (dso__kernel(dso))
 			return err;
 	} else  {
 		err = dso__load_sym_internal(dso, map, syms_ss, runtime_ss,
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 35975189999b..7a065a075a32 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -515,52 +515,52 @@ static struct symbol *symbols__find_by_name(struct symbol *symbols[],
 
 void dso__reset_find_symbol_cache(struct dso *dso)
 {
-	dso->last_find_result.addr   = 0;
-	dso->last_find_result.symbol = NULL;
+	dso__set_last_find_result_addr(dso, 0);
+	dso__set_last_find_result_symbol(dso, NULL);
 }
 
 void dso__insert_symbol(struct dso *dso, struct symbol *sym)
 {
-	__symbols__insert(&dso->symbols, sym, dso->kernel);
+	__symbols__insert(dso__symbols(dso), sym, dso__kernel(dso));
 
 	/* update the symbol cache if necessary */
-	if (dso->last_find_result.addr >= sym->start &&
-	    (dso->last_find_result.addr < sym->end ||
+	if (dso__last_find_result_addr(dso) >= sym->start &&
+	    (dso__last_find_result_addr(dso) < sym->end ||
 	    sym->start == sym->end)) {
-		dso->last_find_result.symbol = sym;
+		dso__set_last_find_result_symbol(dso, sym);
 	}
 }
 
 void dso__delete_symbol(struct dso *dso, struct symbol *sym)
 {
-	rb_erase_cached(&sym->rb_node, &dso->symbols);
+	rb_erase_cached(&sym->rb_node, dso__symbols(dso));
 	symbol__delete(sym);
 	dso__reset_find_symbol_cache(dso);
 }
 
 struct symbol *dso__find_symbol(struct dso *dso, u64 addr)
 {
-	if (dso->last_find_result.addr != addr || dso->last_find_result.symbol == NULL) {
-		dso->last_find_result.addr   = addr;
-		dso->last_find_result.symbol = symbols__find(&dso->symbols, addr);
+	if (dso__last_find_result_addr(dso) != addr || dso__last_find_result_symbol(dso) == NULL) {
+		dso__set_last_find_result_addr(dso, addr);
+		dso__set_last_find_result_symbol(dso, symbols__find(dso__symbols(dso), addr));
 	}
 
-	return dso->last_find_result.symbol;
+	return dso__last_find_result_symbol(dso);
 }
 
 struct symbol *dso__find_symbol_nocache(struct dso *dso, u64 addr)
 {
-	return symbols__find(&dso->symbols, addr);
+	return symbols__find(dso__symbols(dso), addr);
 }
 
 struct symbol *dso__first_symbol(struct dso *dso)
 {
-	return symbols__first(&dso->symbols);
+	return symbols__first(dso__symbols(dso));
 }
 
 struct symbol *dso__last_symbol(struct dso *dso)
 {
-	return symbols__last(&dso->symbols);
+	return symbols__last(dso__symbols(dso));
 }
 
 struct symbol *dso__next_symbol(struct symbol *sym)
@@ -570,11 +570,11 @@ struct symbol *dso__next_symbol(struct symbol *sym)
 
 struct symbol *dso__next_symbol_by_name(struct dso *dso, size_t *idx)
 {
-	if (*idx + 1 >= dso->symbol_names_len)
+	if (*idx + 1 >= dso__symbol_names_len(dso))
 		return NULL;
 
 	++*idx;
-	return dso->symbol_names[*idx];
+	return dso__symbol_names(dso)[*idx];
 }
 
  /*
@@ -582,27 +582,29 @@ struct symbol *dso__next_symbol_by_name(struct dso *dso, size_t *idx)
   */
 struct symbol *dso__find_symbol_by_name(struct dso *dso, const char *name, size_t *idx)
 {
-	struct symbol *s = symbols__find_by_name(dso->symbol_names, dso->symbol_names_len,
-						name, SYMBOL_TAG_INCLUDE__NONE, idx);
-	if (!s)
-		s = symbols__find_by_name(dso->symbol_names, dso->symbol_names_len,
-					name, SYMBOL_TAG_INCLUDE__DEFAULT_ONLY, idx);
+	struct symbol *s = symbols__find_by_name(dso__symbol_names(dso),
+						 dso__symbol_names_len(dso),
+						 name, SYMBOL_TAG_INCLUDE__NONE, idx);
+	if (!s) {
+		s = symbols__find_by_name(dso__symbol_names(dso), dso__symbol_names_len(dso),
+					  name, SYMBOL_TAG_INCLUDE__DEFAULT_ONLY, idx);
+	}
 	return s;
 }
 
 void dso__sort_by_name(struct dso *dso)
 {
-	mutex_lock(&dso->lock);
+	mutex_lock(dso__lock(dso));
 	if (!dso__sorted_by_name(dso)) {
 		size_t len;
 
-		dso->symbol_names = symbols__sort_by_name(&dso->symbols, &len);
-		if (dso->symbol_names) {
-			dso->symbol_names_len = len;
+		dso__set_symbol_names(dso, symbols__sort_by_name(dso__symbols(dso), &len));
+		if (dso__symbol_names(dso)) {
+			dso__set_symbol_names_len(dso, len);
 			dso__set_sorted_by_name(dso);
 		}
 	}
-	mutex_unlock(&dso->lock);
+	mutex_unlock(dso__lock(dso));
 }
 
 /*
@@ -729,7 +731,7 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
 {
 	struct symbol *sym;
 	struct dso *dso = arg;
-	struct rb_root_cached *root = &dso->symbols;
+	struct rb_root_cached *root = dso__symbols(dso);
 
 	if (!symbol_type__filter(type))
 		return 0;
@@ -769,8 +771,8 @@ static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
 {
 	struct symbol *pos;
 	int count = 0;
-	struct rb_root_cached old_root = dso->symbols;
-	struct rb_root_cached *root = &dso->symbols;
+	struct rb_root_cached *root = dso__symbols(dso);
+	struct rb_root_cached old_root = *root;
 	struct rb_node *next = rb_first_cached(root);
 
 	if (!kmaps)
@@ -804,13 +806,13 @@ static int maps__split_kallsyms_for_kcore(struct maps *kmaps, struct dso *dso)
 			pos->end = map__end(curr_map);
 		if (pos->end)
 			pos->end -= map__start(curr_map) - map__pgoff(curr_map);
-		symbols__insert(&curr_map_dso->symbols, pos);
+		symbols__insert(dso__symbols(curr_map_dso), pos);
 		++count;
 		map__put(curr_map);
 	}
 
 	/* Symbols have been adjusted */
-	dso->adjust_symbols = 1;
+	dso__set_adjust_symbols(dso, true);
 
 	return count;
 }
@@ -827,7 +829,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 	struct map *curr_map = map__get(initial_map);
 	struct symbol *pos;
 	int count = 0, moved = 0;
-	struct rb_root_cached *root = &dso->symbols;
+	struct rb_root_cached *root = dso__symbols(dso);
 	struct rb_node *next = rb_first_cached(root);
 	int kernel_range = 0;
 	bool x86_64;
@@ -854,9 +856,9 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 
 			*module++ = '\0';
 			curr_map_dso = map__dso(curr_map);
-			if (strcmp(curr_map_dso->short_name, module)) {
+			if (strcmp(dso__short_name(curr_map_dso), module)) {
 				if (!RC_CHK_EQUAL(curr_map, initial_map) &&
-				    dso->kernel == DSO_SPACE__KERNEL_GUEST &&
+				    dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST &&
 				    machine__is_default_guest(machine)) {
 					/*
 					 * We assume all symbols of a module are
@@ -879,7 +881,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 					goto discard_symbol;
 				}
 				curr_map_dso = map__dso(curr_map);
-				if (curr_map_dso->loaded &&
+				if (dso__loaded(curr_map_dso) &&
 				    !machine__is_default_guest(machine))
 					goto discard_symbol;
 			}
@@ -915,7 +917,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 				goto add_symbol;
 			}
 
-			if (dso->kernel == DSO_SPACE__KERNEL_GUEST)
+			if (dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST)
 				snprintf(dso_name, sizeof(dso_name),
 					"[guest.kernel].%d",
 					kernel_range++);
@@ -929,7 +931,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 			if (ndso == NULL)
 				return -1;
 
-			ndso->kernel = dso->kernel;
+			dso__set_kernel(ndso, dso__kernel(dso));
 
 			curr_map = map__new2(pos->start, ndso);
 			if (curr_map == NULL) {
@@ -954,7 +956,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 			struct dso *curr_map_dso = map__dso(curr_map);
 
 			rb_erase_cached(&pos->rb_node, root);
-			symbols__insert(&curr_map_dso->symbols, pos);
+			symbols__insert(dso__symbols(curr_map_dso), pos);
 			++moved;
 		} else
 			++count;
@@ -966,7 +968,7 @@ static int maps__split_kallsyms(struct maps *kmaps, struct dso *dso, u64 delta,
 	}
 
 	if (!RC_CHK_EQUAL(curr_map, initial_map) &&
-	    dso->kernel == DSO_SPACE__KERNEL_GUEST &&
+	    dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST &&
 	    machine__is_default_guest(maps__machine(kmaps))) {
 		dso__set_loaded(map__dso(curr_map));
 	}
@@ -1140,7 +1142,7 @@ static int do_validate_kcore_modules_cb(struct map *old_map, void *data)
 
 	dso = map__dso(old_map);
 	/* Module must be in memory at the same address */
-	mi = find_module(dso->short_name, modules);
+	mi = find_module(dso__short_name(dso), modules);
 	if (!mi || mi->start != map__start(old_map))
 		return -EINVAL;
 
@@ -1309,7 +1311,7 @@ static int dso__load_kcore(struct dso *dso, struct map *map,
 			      &is_64_bit);
 	if (err)
 		goto out_err;
-	dso->is_64_bit = is_64_bit;
+	dso__set_is_64_bit(dso, is_64_bit);
 
 	if (list_empty(&md.maps)) {
 		err = -EINVAL;
@@ -1401,10 +1403,10 @@ static int dso__load_kcore(struct dso *dso, struct map *map,
 	 * Set the data type and long name so that kcore can be read via
 	 * dso__data_read_addr().
 	 */
-	if (dso->kernel == DSO_SPACE__KERNEL_GUEST)
-		dso->binary_type = DSO_BINARY_TYPE__GUEST_KCORE;
+	if (dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST)
+		dso__set_binary_type(dso, DSO_BINARY_TYPE__GUEST_KCORE);
 	else
-		dso->binary_type = DSO_BINARY_TYPE__KCORE;
+		dso__set_binary_type(dso, DSO_BINARY_TYPE__KCORE);
 	dso__set_long_name(dso, strdup(kcore_filename), true);
 
 	close(fd);
@@ -1465,13 +1467,13 @@ int __dso__load_kallsyms(struct dso *dso, const char *filename,
 	if (kallsyms__delta(kmap, filename, &delta))
 		return -1;
 
-	symbols__fixup_end(&dso->symbols, true);
-	symbols__fixup_duplicate(&dso->symbols);
+	symbols__fixup_end(dso__symbols(dso), true);
+	symbols__fixup_duplicate(dso__symbols(dso));
 
-	if (dso->kernel == DSO_SPACE__KERNEL_GUEST)
-		dso->symtab_type = DSO_BINARY_TYPE__GUEST_KALLSYMS;
+	if (dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST)
+		dso__set_symtab_type(dso, DSO_BINARY_TYPE__GUEST_KALLSYMS);
 	else
-		dso->symtab_type = DSO_BINARY_TYPE__KALLSYMS;
+		dso__set_symtab_type(dso, DSO_BINARY_TYPE__KALLSYMS);
 
 	if (!no_kcore && !dso__load_kcore(dso, map, filename))
 		return maps__split_kallsyms_for_kcore(kmap->kmaps, dso);
@@ -1527,7 +1529,7 @@ static int dso__load_perf_map(const char *map_path, struct dso *dso)
 		if (sym == NULL)
 			goto out_delete_line;
 
-		symbols__insert(&dso->symbols, sym);
+		symbols__insert(dso__symbols(dso), sym);
 		nr_syms++;
 	}
 
@@ -1653,15 +1655,15 @@ int dso__load_bfd_symbols(struct dso *dso, const char *debugfile)
 		if (!symbol)
 			goto out_free;
 
-		symbols__insert(&dso->symbols, symbol);
+		symbols__insert(dso__symbols(dso), symbol);
 	}
 #ifdef bfd_get_section
 #undef bfd_asymbol_section
 #endif
 
-	symbols__fixup_end(&dso->symbols, false);
-	symbols__fixup_duplicate(&dso->symbols);
-	dso->adjust_symbols = 1;
+	symbols__fixup_end(dso__symbols(dso), false);
+	symbols__fixup_duplicate(dso__symbols(dso));
+	dso__set_adjust_symbols(dso, true);
 
 	err = 0;
 out_free:
@@ -1684,17 +1686,17 @@ static bool dso__is_compatible_symtab_type(struct dso *dso, bool kmod,
 	case DSO_BINARY_TYPE__MIXEDUP_UBUNTU_DEBUGINFO:
 	case DSO_BINARY_TYPE__BUILDID_DEBUGINFO:
 	case DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO:
-		return !kmod && dso->kernel == DSO_SPACE__USER;
+		return !kmod && dso__kernel(dso) == DSO_SPACE__USER;
 
 	case DSO_BINARY_TYPE__KALLSYMS:
 	case DSO_BINARY_TYPE__VMLINUX:
 	case DSO_BINARY_TYPE__KCORE:
-		return dso->kernel == DSO_SPACE__KERNEL;
+		return dso__kernel(dso) == DSO_SPACE__KERNEL;
 
 	case DSO_BINARY_TYPE__GUEST_KALLSYMS:
 	case DSO_BINARY_TYPE__GUEST_VMLINUX:
 	case DSO_BINARY_TYPE__GUEST_KCORE:
-		return dso->kernel == DSO_SPACE__KERNEL_GUEST;
+		return dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST;
 
 	case DSO_BINARY_TYPE__GUEST_KMODULE:
 	case DSO_BINARY_TYPE__GUEST_KMODULE_COMP:
@@ -1704,7 +1706,7 @@ static bool dso__is_compatible_symtab_type(struct dso *dso, bool kmod,
 		 * kernel modules know their symtab type - it's set when
 		 * creating a module dso in machine__addnew_module_map().
 		 */
-		return kmod && dso->symtab_type == type;
+		return kmod && dso__symtab_type(dso) == type;
 
 	case DSO_BINARY_TYPE__BUILD_ID_CACHE:
 	case DSO_BINARY_TYPE__BUILD_ID_CACHE_DEBUGINFO:
@@ -1772,18 +1774,19 @@ int dso__load(struct dso *dso, struct map *map)
 	struct build_id bid;
 	struct nscookie nsc;
 	char newmapname[PATH_MAX];
-	const char *map_path = dso->long_name;
+	const char *map_path = dso__long_name(dso);
 
-	mutex_lock(&dso->lock);
-	perfmap = strncmp(dso->name, "/tmp/perf-", 10) == 0;
+	mutex_lock(dso__lock(dso));
+	perfmap = strncmp(dso__name(dso), "/tmp/perf-", 10) == 0;
 	if (perfmap) {
-		if (dso->nsinfo && (dso__find_perf_map(newmapname,
-		    sizeof(newmapname), &dso->nsinfo) == 0)) {
+		if (dso__nsinfo(dso) &&
+		    (dso__find_perf_map(newmapname, sizeof(newmapname),
+					dso__nsinfo_ptr(dso)) == 0)) {
 			map_path = newmapname;
 		}
 	}
 
-	nsinfo__mountns_enter(dso->nsinfo, &nsc);
+	nsinfo__mountns_enter(dso__nsinfo(dso), &nsc);
 
 	/* check again under the dso->lock */
 	if (dso__loaded(dso)) {
@@ -1791,15 +1794,15 @@ int dso__load(struct dso *dso, struct map *map)
 		goto out;
 	}
 
-	kmod = dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
-		dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
-		dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE ||
-		dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE_COMP;
+	kmod = dso__symtab_type(dso) == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
+		dso__symtab_type(dso) == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
+		dso__symtab_type(dso) == DSO_BINARY_TYPE__GUEST_KMODULE ||
+		dso__symtab_type(dso) == DSO_BINARY_TYPE__GUEST_KMODULE_COMP;
 
-	if (dso->kernel && !kmod) {
-		if (dso->kernel == DSO_SPACE__KERNEL)
+	if (dso__kernel(dso) && !kmod) {
+		if (dso__kernel(dso) == DSO_SPACE__KERNEL)
 			ret = dso__load_kernel_sym(dso, map);
-		else if (dso->kernel == DSO_SPACE__KERNEL_GUEST)
+		else if (dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST)
 			ret = dso__load_guest_kernel_sym(dso, map);
 
 		machine = maps__machine(map__kmaps(map));
@@ -1808,12 +1811,13 @@ int dso__load(struct dso *dso, struct map *map)
 		goto out;
 	}
 
-	dso->adjust_symbols = 0;
+	dso__set_adjust_symbols(dso, false);
 
 	if (perfmap) {
 		ret = dso__load_perf_map(map_path, dso);
-		dso->symtab_type = ret > 0 ? DSO_BINARY_TYPE__JAVA_JIT :
-					     DSO_BINARY_TYPE__NOT_FOUND;
+		dso__set_symtab_type(dso, ret > 0
+				? DSO_BINARY_TYPE__JAVA_JIT
+				: DSO_BINARY_TYPE__NOT_FOUND);
 		goto out;
 	}
 
@@ -1828,9 +1832,9 @@ int dso__load(struct dso *dso, struct map *map)
 	 * Read the build id if possible. This is required for
 	 * DSO_BINARY_TYPE__BUILDID_DEBUGINFO to work
 	 */
-	if (!dso->has_build_id &&
-	    is_regular_file(dso->long_name)) {
-	    __symbol__join_symfs(name, PATH_MAX, dso->long_name);
+	if (!dso__has_build_id(dso) &&
+	    is_regular_file(dso__long_name(dso))) {
+		__symbol__join_symfs(name, PATH_MAX, dso__long_name(dso));
 		if (filename__read_build_id(name, &bid) > 0)
 			dso__set_build_id(dso, &bid);
 	}
@@ -1864,7 +1868,7 @@ int dso__load(struct dso *dso, struct map *map)
 			nsinfo__mountns_exit(&nsc);
 
 		is_reg = is_regular_file(name);
-		if (!is_reg && errno == ENOENT && dso->nsinfo) {
+		if (!is_reg && errno == ENOENT && dso__nsinfo(dso)) {
 			char *new_name = dso__filename_with_chroot(dso, name);
 			if (new_name) {
 				is_reg = is_regular_file(new_name);
@@ -1881,7 +1885,7 @@ int dso__load(struct dso *dso, struct map *map)
 			sirc = symsrc__init(ss, dso, name, symtab_type);
 
 		if (nsexit)
-			nsinfo__mountns_enter(dso->nsinfo, &nsc);
+			nsinfo__mountns_enter(dso__nsinfo(dso), &nsc);
 
 		if (bfdrc == 0) {
 			ret = 0;
@@ -1894,8 +1898,8 @@ int dso__load(struct dso *dso, struct map *map)
 		if (!syms_ss && symsrc__has_symtab(ss)) {
 			syms_ss = ss;
 			next_slot = true;
-			if (!dso->symsrc_filename)
-				dso->symsrc_filename = strdup(name);
+			if (!dso__symsrc_filename(dso))
+				dso__set_symsrc_filename(dso, strdup(name));
 		}
 
 		if (!runtime_ss && symsrc__possibly_runtime(ss)) {
@@ -1942,11 +1946,11 @@ int dso__load(struct dso *dso, struct map *map)
 		symsrc__destroy(&ss_[ss_pos - 1]);
 out_free:
 	free(name);
-	if (ret < 0 && strstr(dso->name, " (deleted)") != NULL)
+	if (ret < 0 && strstr(dso__name(dso), " (deleted)") != NULL)
 		ret = 0;
 out:
 	dso__set_loaded(dso);
-	mutex_unlock(&dso->lock);
+	mutex_unlock(dso__lock(dso));
 	nsinfo__mountns_exit(&nsc);
 
 	return ret;
@@ -1965,7 +1969,7 @@ int dso__load_vmlinux(struct dso *dso, struct map *map,
 	else
 		symbol__join_symfs(symfs_vmlinux, vmlinux);
 
-	if (dso->kernel == DSO_SPACE__KERNEL_GUEST)
+	if (dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST)
 		symtab_type = DSO_BINARY_TYPE__GUEST_VMLINUX;
 	else
 		symtab_type = DSO_BINARY_TYPE__VMLINUX;
@@ -1978,10 +1982,10 @@ int dso__load_vmlinux(struct dso *dso, struct map *map,
 	 * an incorrect long name unless we set it here first.
 	 */
 	dso__set_long_name(dso, vmlinux, vmlinux_allocated);
-	if (dso->kernel == DSO_SPACE__KERNEL_GUEST)
-		dso->binary_type = DSO_BINARY_TYPE__GUEST_VMLINUX;
+	if (dso__kernel(dso) == DSO_SPACE__KERNEL_GUEST)
+		dso__set_binary_type(dso, DSO_BINARY_TYPE__GUEST_VMLINUX);
 	else
-		dso->binary_type = DSO_BINARY_TYPE__VMLINUX;
+		dso__set_binary_type(dso, DSO_BINARY_TYPE__VMLINUX);
 
 	err = dso__load_sym(dso, map, &ss, &ss, 0);
 	symsrc__destroy(&ss);
@@ -2074,7 +2078,7 @@ static char *dso__find_kallsyms(struct dso *dso, struct map *map)
 	bool is_host = false;
 	char path[PATH_MAX];
 
-	if (!dso->has_build_id) {
+	if (!dso__has_build_id(dso)) {
 		/*
 		 * Last resort, if we don't have a build-id and couldn't find
 		 * any vmlinux file, try the running kernel kallsyms table.
@@ -2099,7 +2103,7 @@ static char *dso__find_kallsyms(struct dso *dso, struct map *map)
 			goto proc_kallsyms;
 	}
 
-	build_id__sprintf(&dso->bid, sbuild_id);
+	build_id__sprintf(dso__bid(dso), sbuild_id);
 
 	/* Find kallsyms in build-id cache with kcore */
 	scnprintf(path, sizeof(path), "%s/%s/%s",
@@ -2192,7 +2196,7 @@ static int dso__load_kernel_sym(struct dso *dso, struct map *map)
 	free(kallsyms_allocated_filename);
 
 	if (err > 0 && !dso__is_kcore(dso)) {
-		dso->binary_type = DSO_BINARY_TYPE__KALLSYMS;
+		dso__set_binary_type(dso, DSO_BINARY_TYPE__KALLSYMS);
 		dso__set_long_name(dso, DSO__NAME_KALLSYMS, false);
 		map__fixup_start(map);
 		map__fixup_end(map);
@@ -2235,7 +2239,7 @@ static int dso__load_guest_kernel_sym(struct dso *dso, struct map *map)
 	if (err > 0)
 		pr_debug("Using %s for symbols\n", kallsyms_filename);
 	if (err > 0 && !dso__is_kcore(dso)) {
-		dso->binary_type = DSO_BINARY_TYPE__GUEST_KALLSYMS;
+		dso__set_binary_type(dso, DSO_BINARY_TYPE__GUEST_KALLSYMS);
 		dso__set_long_name(dso, machine->mmap_name, false);
 		map__fixup_start(map);
 		map__fixup_end(map);
diff --git a/tools/perf/util/symbol_fprintf.c b/tools/perf/util/symbol_fprintf.c
index 088f4abf230f..53e1af4ed9ac 100644
--- a/tools/perf/util/symbol_fprintf.c
+++ b/tools/perf/util/symbol_fprintf.c
@@ -64,8 +64,8 @@ size_t dso__fprintf_symbols_by_name(struct dso *dso,
 {
 	size_t ret = 0;
 
-	for (size_t i = 0; i < dso->symbol_names_len; i++) {
-		struct symbol *pos = dso->symbol_names[i];
+	for (size_t i = 0; i < dso__symbol_names_len(dso); i++) {
+		struct symbol *pos = dso__symbol_names(dso)[i];
 
 		ret += fprintf(fp, "%s\n", pos->name);
 	}
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 3712186353fb..10753303034a 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -385,8 +385,8 @@ static void perf_record_mmap2__read_build_id(struct perf_record_mmap2 *event,
 	id.ino_generation = event->ino_generation;
 
 	dso = dsos__findnew_id(&machine->dsos, event->filename, &id);
-	if (dso && dso->has_build_id) {
-		bid = dso->bid;
+	if (dso && dso__has_build_id(dso)) {
+		bid = *dso__bid(dso);
 		rc = 0;
 		goto out;
 	}
@@ -407,7 +407,7 @@ static void perf_record_mmap2__read_build_id(struct perf_record_mmap2 *event,
 		event->__reserved_1 = 0;
 		event->__reserved_2 = 0;
 
-		if (dso && !dso->has_build_id)
+		if (dso && !dso__has_build_id(dso))
 			dso__set_build_id(dso, &bid);
 	} else {
 		if (event->filename[0] == '/') {
@@ -684,7 +684,7 @@ static int perf_event__synthesize_modules_maps_cb(struct map *map, void *data)
 
 	dso = map__dso(map);
 	if (symbol_conf.buildid_mmap2) {
-		size = PERF_ALIGN(dso->long_name_len + 1, sizeof(u64));
+		size = PERF_ALIGN(dso__long_name_len(dso) + 1, sizeof(u64));
 		event->mmap2.header.type = PERF_RECORD_MMAP2;
 		event->mmap2.header.size = (sizeof(event->mmap2) -
 					(sizeof(event->mmap2.filename) - size));
@@ -694,11 +694,11 @@ static int perf_event__synthesize_modules_maps_cb(struct map *map, void *data)
 		event->mmap2.len   = map__size(map);
 		event->mmap2.pid   = args->machine->pid;
 
-		memcpy(event->mmap2.filename, dso->long_name, dso->long_name_len + 1);
+		memcpy(event->mmap2.filename, dso__long_name(dso), dso__long_name_len(dso) + 1);
 
 		perf_record_mmap2__read_build_id(&event->mmap2, args->machine, false);
 	} else {
-		size = PERF_ALIGN(dso->long_name_len + 1, sizeof(u64));
+		size = PERF_ALIGN(dso__long_name_len(dso) + 1, sizeof(u64));
 		event->mmap.header.type = PERF_RECORD_MMAP;
 		event->mmap.header.size = (sizeof(event->mmap) -
 					(sizeof(event->mmap.filename) - size));
@@ -708,7 +708,7 @@ static int perf_event__synthesize_modules_maps_cb(struct map *map, void *data)
 		event->mmap.len   = map__size(map);
 		event->mmap.pid   = args->machine->pid;
 
-		memcpy(event->mmap.filename, dso->long_name, dso->long_name_len + 1);
+		memcpy(event->mmap.filename, dso__long_name(dso), dso__long_name_len(dso) + 1);
 	}
 
 	if (perf_tool__process_synth_event(args->tool, event, args->machine, args->process) != 0)
@@ -2231,20 +2231,20 @@ int perf_event__synthesize_build_id(struct perf_tool *tool, struct dso *pos, u16
 	union perf_event ev;
 	size_t len;
 
-	if (!pos->hit)
+	if (!dso__hit(pos))
 		return 0;
 
 	memset(&ev, 0, sizeof(ev));
 
-	len = pos->long_name_len + 1;
+	len = dso__long_name_len(pos) + 1;
 	len = PERF_ALIGN(len, NAME_ALIGN);
-	ev.build_id.size = min(pos->bid.size, sizeof(pos->bid.data));
-	memcpy(&ev.build_id.build_id, pos->bid.data, ev.build_id.size);
+	ev.build_id.size = min(dso__bid(pos)->size, sizeof(dso__bid(pos)->data));
+	memcpy(&ev.build_id.build_id, dso__bid(pos)->data, ev.build_id.size);
 	ev.build_id.header.type = PERF_RECORD_HEADER_BUILD_ID;
 	ev.build_id.header.misc = misc | PERF_RECORD_MISC_BUILD_ID_SIZE;
 	ev.build_id.pid = machine->pid;
 	ev.build_id.header.size = sizeof(ev.build_id) + len;
-	memcpy(&ev.build_id.filename, pos->long_name, pos->long_name_len);
+	memcpy(&ev.build_id.filename, dso__long_name(pos), dso__long_name_len(pos));
 
 	return process(tool, &ev, NULL, machine);
 }
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 1aa8962dcf52..0a473112f881 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -457,14 +457,14 @@ int thread__memcpy(struct thread *thread, struct machine *machine,
 
 	dso = map__dso(al.map);
 
-	if (!dso || dso->data.status == DSO_DATA_STATUS_ERROR || map__load(al.map) < 0) {
+	if (!dso || dso__data(dso)->status == DSO_DATA_STATUS_ERROR || map__load(al.map) < 0) {
 		addr_location__exit(&al);
 		return -1;
 	}
 
 	offset = map__map_ip(al.map, ip);
 	if (is64bit)
-		*is64bit = dso->is_64_bit;
+		*is64bit = dso__is_64_bit(dso);
 
 	addr_location__exit(&al);
 
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index 6a5ac0faa6f4..cde267ea3e99 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -329,27 +329,27 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct unwind_info *ui,
 	};
 	int ret, fd;
 
-	if (dso->data.eh_frame_hdr_offset == 0) {
+	if (dso__data(dso)->eh_frame_hdr_offset == 0) {
 		fd = dso__data_get_fd(dso, ui->machine);
 		if (fd < 0)
 			return -EINVAL;
 
 		/* Check the .eh_frame section for unwinding info */
 		ret = elf_section_address_and_offset(fd, ".eh_frame_hdr",
-						     &dso->data.eh_frame_hdr_addr,
-						     &dso->data.eh_frame_hdr_offset);
-		dso->data.elf_base_addr = elf_base_address(fd);
+						     &dso__data(dso)->eh_frame_hdr_addr,
+						     &dso__data(dso)->eh_frame_hdr_offset);
+		dso__data(dso)->elf_base_addr = elf_base_address(fd);
 		dso__data_put_fd(dso);
-		if (ret || dso->data.eh_frame_hdr_offset == 0)
+		if (ret || dso__data(dso)->eh_frame_hdr_offset == 0)
 			return -EINVAL;
 	}
 
 	maps__for_each_map(thread__maps(ui->thread), read_unwind_spec_eh_frame_maps_cb, &args);
 
-	args.base_addr -= dso->data.elf_base_addr;
+	args.base_addr -= dso__data(dso)->elf_base_addr;
 	/* Address of .eh_frame_hdr */
-	*segbase = args.base_addr + dso->data.eh_frame_hdr_addr;
-	ret = unwind_spec_ehframe(dso, ui->machine, dso->data.eh_frame_hdr_offset,
+	*segbase = args.base_addr + dso__data(dso)->eh_frame_hdr_addr;
+	ret = unwind_spec_ehframe(dso, ui->machine, dso__data(dso)->eh_frame_hdr_offset,
 				   table_data, fde_count);
 	if (ret)
 		return ret;
@@ -460,7 +460,7 @@ find_proc_info(unw_addr_space_t as, unw_word_t ip, unw_proc_info_t *pi,
 		return -EINVAL;
 	}
 
-	pr_debug("unwind: find_proc_info dso %s\n", dso->name);
+	pr_debug("unwind: find_proc_info dso %s\n", dso__name(dso));
 
 	/* Check the .eh_frame section for unwinding info */
 	if (!read_unwind_spec_eh_frame(dso, ui, &table_data, &segbase, &fde_count)) {
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 2728eb4f13ea..cb8be6acfb6f 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -25,7 +25,7 @@ int unwind__prepare_access(struct maps *maps, struct map *map, bool *initialized
 		return 0;
 
 	if (maps__addr_space(maps)) {
-		pr_debug("unwind: thread map already set, dso=%s\n", dso->name);
+		pr_debug("unwind: thread map already set, dso=%s\n", dso__name(dso));
 		if (initialized)
 			*initialized = true;
 		return 0;
diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
index 35532dcbff74..1b6f8f6db7aa 100644
--- a/tools/perf/util/vdso.c
+++ b/tools/perf/util/vdso.c
@@ -148,7 +148,7 @@ static int machine__thread_dso_type_maps_cb(struct map *map, void *data)
 	struct machine__thread_dso_type_maps_cb_args *args = data;
 	struct dso *dso = map__dso(map);
 
-	if (!dso || dso->long_name[0] != '/')
+	if (!dso || dso__long_name(dso)[0] != '/')
 		return 0;
 
 	args->dso_type = dso__type(dso, args->machine);
@@ -361,7 +361,7 @@ struct dso *machine__findnew_vdso(struct machine *machine,
 
 bool dso__is_vdso(struct dso *dso)
 {
-	return !strcmp(dso->short_name, DSO__NAME_VDSO) ||
-	       !strcmp(dso->short_name, DSO__NAME_VDSO32) ||
-	       !strcmp(dso->short_name, DSO__NAME_VDSOX32);
+	return !strcmp(dso__short_name(dso), DSO__NAME_VDSO) ||
+	       !strcmp(dso__short_name(dso), DSO__NAME_VDSO32) ||
+	       !strcmp(dso__short_name(dso), DSO__NAME_VDSOX32);
 }
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v7 24/25] perf dso: Reference counting related fixes
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (22 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 23/25] perf dso: Add reference count checking and accessor functions Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-03  5:06 ` [PATCH v7 25/25] perf dso: Use container_of to avoid a pointer in dso_data Ian Rogers
  2024-01-31 22:22 ` [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

Ensure gets and puts are better aligned, fixing reference counting
checking problems.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/machine.c    |  4 ++--
 tools/perf/util/map.c        |  1 +
 tools/perf/util/symbol-elf.c | 38 +++++++++++++++++-------------------
 3 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 49b8ccd5affe..2dbb7b06b117 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -682,7 +682,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
 					     struct perf_sample *sample __maybe_unused)
 {
 	struct symbol *sym;
-	struct dso *dso;
+	struct dso *dso = NULL;
 	struct map *map = maps__find(machine__kernel_maps(machine), event->ksymbol.addr);
 	int err = 0;
 
@@ -695,7 +695,6 @@ static int machine__process_ksymbol_register(struct machine *machine,
 		}
 		dso__set_kernel(dso, DSO_SPACE__KERNEL);
 		map = map__new2(0, dso);
-		dso__put(dso);
 		if (!map) {
 			err = -ENOMEM;
 			goto out;
@@ -734,6 +733,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
 	dso__insert_symbol(dso, sym);
 out:
 	map__put(map);
+	dso__put(dso);
 	return err;
 }
 
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 14fb8cf65b13..4480134ef4ea 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -200,6 +200,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 				dso__set_build_id(dso, dso__bid(header_bid_dso));
 				dso__set_header_build_id(dso, 1);
 			}
+			dso__put(header_bid_dso);
 		}
 		dso__put(dso);
 	}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index de73f9fb3fe4..4c00463abb7e 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1366,7 +1366,7 @@ void __weak arch__sym_update(struct symbol *s __maybe_unused,
 static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 				      GElf_Sym *sym, GElf_Shdr *shdr,
 				      struct maps *kmaps, struct kmap *kmap,
-				      struct dso **curr_dsop, struct map **curr_mapp,
+				      struct dso **curr_dsop,
 				      const char *section_name,
 				      bool adjust_kernel_syms, bool kmodule, bool *remap_kernel)
 {
@@ -1416,8 +1416,8 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 			map__set_pgoff(map, shdr->sh_offset);
 		}
 
-		*curr_mapp = map;
-		*curr_dsop = dso;
+		dso__put(*curr_dsop);
+		*curr_dsop = dso__get(dso);
 		return 0;
 	}
 
@@ -1442,10 +1442,10 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 		dso__set_binary_type(curr_dso, dso__binary_type(dso));
 		dso__set_adjust_symbols(curr_dso, dso__adjust_symbols(dso));
 		curr_map = map__new2(start, curr_dso);
-		dso__put(curr_dso);
-		if (curr_map == NULL)
+		if (curr_map == NULL) {
+			dso__put(curr_dso);
 			return -1;
-
+		}
 		if (dso__kernel(curr_dso))
 			map__kmap(curr_map)->kmaps = kmaps;
 
@@ -1459,21 +1459,15 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 		dso__set_symtab_type(curr_dso, dso__symtab_type(dso));
 		if (maps__insert(kmaps, curr_map))
 			return -1;
-		/*
-		 * Add it before we drop the reference to curr_map, i.e. while
-		 * we still are sure to have a reference to this DSO via
-		 * *curr_map->dso.
-		 */
 		dsos__add(&maps__machine(kmaps)->dsos, curr_dso);
-		/* kmaps already got it */
-		map__put(curr_map);
 		dso__set_loaded(curr_dso);
-		*curr_mapp = curr_map;
+		dso__put(*curr_dsop);
 		*curr_dsop = curr_dso;
 	} else {
-		*curr_dsop = map__dso(curr_map);
-		map__put(curr_map);
+		dso__put(*curr_dsop);
+		*curr_dsop = dso__get(map__dso(curr_map));
 	}
+	map__put(curr_map);
 
 	return 0;
 }
@@ -1484,8 +1478,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 {
 	struct kmap *kmap = dso__kernel(dso) ? map__kmap(map) : NULL;
 	struct maps *kmaps = kmap ? map__kmaps(map) : NULL;
-	struct map *curr_map = map;
-	struct dso *curr_dso = dso;
+	struct dso *curr_dso;
 	Elf_Data *symstrs, *secstrs, *secstrs_run, *secstrs_sym;
 	uint32_t nr_syms;
 	int err = -1;
@@ -1586,6 +1579,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 		remap_kernel = true;
 		adjust_kernel_syms = dso__adjust_symbols(dso);
 	}
+	curr_dso = dso__get(dso);
 	elf_symtab__for_each_symbol(syms, nr_syms, idx, sym) {
 		struct symbol *f;
 		const char *elf_name = elf_sym__name(&sym, symstrs);
@@ -1674,8 +1668,11 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 			--sym.st_value;
 
 		if (dso__kernel(dso)) {
-			if (dso__process_kernel_symbol(dso, map, &sym, &shdr, kmaps, kmap, &curr_dso, &curr_map,
-						       section_name, adjust_kernel_syms, kmodule, &remap_kernel))
+			if (dso__process_kernel_symbol(dso, map, &sym, &shdr,
+						       kmaps, kmap, &curr_dso,
+						       section_name,
+						       adjust_kernel_syms,
+						       kmodule, &remap_kernel))
 				goto out_elf_end;
 		} else if ((used_opd && runtime_ss->adjust_symbols) ||
 			   (!used_opd && syms_ss->adjust_symbols)) {
@@ -1724,6 +1721,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 		__symbols__insert(dso__symbols(curr_dso), f, dso__kernel(dso));
 		nr++;
 	}
+	dso__put(curr_dso);
 
 	/*
 	 * For misannotated, zeroed, ASM function sizes.
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v7 25/25] perf dso: Use container_of to avoid a pointer in dso_data
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (23 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 24/25] perf dso: Reference counting related fixes Ian Rogers
@ 2024-01-03  5:06 ` Ian Rogers
  2024-01-31 22:22 ` [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-03  5:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, liuwenyu, linux-kernel, linux-perf-users,
	Guilherme Amadio

The dso pointer in dso_data is necessary for reference count checking
to account for the dso_data forming a global list of open dsos with
references to the dso. The dso pointer also allows for the indirection
that reference count checking needs. Outside of reference count
checking the indirection isn't needed, and container_of is more
efficient and saves space.

The reference count won't be increased by placing items onto the
global list, matching how things were before the reference count
checking change; instead we assert that the dso is in dsos, which
holds it live (and that the set of open dsos is a subset of all dsos
for the machine). Update the DSO data tests so that they use a dsos
struct to make the invariant true.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/dso-data.c | 60 ++++++++++++++++++-------------------
 tools/perf/util/dso.c       | 16 +++++++++-
 tools/perf/util/dso.h       |  2 ++
 3 files changed, 46 insertions(+), 32 deletions(-)

diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
index fde4eca84b6f..5286ae8bd2d7 100644
--- a/tools/perf/tests/dso-data.c
+++ b/tools/perf/tests/dso-data.c
@@ -10,6 +10,7 @@
 #include <sys/resource.h>
 #include <api/fs/fs.h>
 #include "dso.h"
+#include "dsos.h"
 #include "machine.h"
 #include "symbol.h"
 #include "tests.h"
@@ -123,9 +124,10 @@ static int test__dso_data(struct test_suite *test __maybe_unused, int subtest __
 	TEST_ASSERT_VAL("No test file", file);
 
 	memset(&machine, 0, sizeof(machine));
+	dsos__init(&machine.dsos);
 
-	dso = dso__new((const char *)file);
-
+	dso = dso__new(file);
+	TEST_ASSERT_VAL("Failed to add dso", !dsos__add(&machine.dsos, dso));
 	TEST_ASSERT_VAL("Failed to access to dso",
 			dso__data_fd(dso, &machine) >= 0);
 
@@ -170,6 +172,7 @@ static int test__dso_data(struct test_suite *test __maybe_unused, int subtest __
 	}
 
 	dso__put(dso);
+	dsos__exit(&machine.dsos);
 	unlink(file);
 	return 0;
 }
@@ -199,41 +202,35 @@ static long open_files_cnt(void)
 	return nr - 1;
 }
 
-static struct dso **dsos;
-
-static int dsos__create(int cnt, int size)
+static int dsos__create(int cnt, int size, struct dsos *dsos)
 {
 	int i;
 
-	dsos = malloc(sizeof(*dsos) * cnt);
-	TEST_ASSERT_VAL("failed to alloc dsos array", dsos);
+	dsos__init(dsos);
 
 	for (i = 0; i < cnt; i++) {
-		char *file;
+		struct dso *dso;
+		char *file = test_file(size);
 
-		file = test_file(size);
 		TEST_ASSERT_VAL("failed to get dso file", file);
-
-		dsos[i] = dso__new(file);
-		TEST_ASSERT_VAL("failed to get dso", dsos[i]);
+		dso = dso__new(file);
+		TEST_ASSERT_VAL("failed to get dso", dso);
+		TEST_ASSERT_VAL("failed to add dso", !dsos__add(dsos, dso));
+		dso__put(dso);
 	}
 
 	return 0;
 }
 
-static void dsos__delete(int cnt)
+static void dsos__delete(struct dsos *dsos)
 {
-	int i;
-
-	for (i = 0; i < cnt; i++) {
-		struct dso *dso = dsos[i];
+	for (unsigned int i = 0; i < dsos->cnt; i++) {
+		struct dso *dso = dsos->dsos[i];
 
 		dso__data_close(dso);
 		unlink(dso__name(dso));
-		dso__put(dso);
 	}
-
-	free(dsos);
+	dsos__exit(dsos);
 }
 
 static int set_fd_limit(int n)
@@ -267,10 +264,10 @@ static int test__dso_data_cache(struct test_suite *test __maybe_unused, int subt
 	/* and this is now our dso open FDs limit */
 	dso_cnt = limit / 2;
 	TEST_ASSERT_VAL("failed to create dsos\n",
-		!dsos__create(dso_cnt, TEST_FILE_SIZE));
+			!dsos__create(dso_cnt, TEST_FILE_SIZE, &machine.dsos));
 
 	for (i = 0; i < (dso_cnt - 1); i++) {
-		struct dso *dso = dsos[i];
+		struct dso *dso = machine.dsos.dsos[i];
 
 		/*
 		 * Open dsos via dso__data_fd(), it opens the data
@@ -290,17 +287,17 @@ static int test__dso_data_cache(struct test_suite *test __maybe_unused, int subt
 	}
 
 	/* verify the first one is already open */
-	TEST_ASSERT_VAL("dsos[0] is not open", dso__data(dsos[0])->fd != -1);
+	TEST_ASSERT_VAL("dsos[0] is not open", dso__data(machine.dsos.dsos[0])->fd != -1);
 
 	/* open +1 dso to reach the allowed limit */
-	fd = dso__data_fd(dsos[i], &machine);
+	fd = dso__data_fd(machine.dsos.dsos[i], &machine);
 	TEST_ASSERT_VAL("failed to get fd", fd > 0);
 
 	/* should force the first one to be closed */
-	TEST_ASSERT_VAL("failed to close dsos[0]", dso__data(dsos[0])->fd == -1);
+	TEST_ASSERT_VAL("failed to close dsos[0]", dso__data(machine.dsos.dsos[0])->fd == -1);
 
 	/* cleanup everything */
-	dsos__delete(dso_cnt);
+	dsos__delete(&machine.dsos);
 
 	/* Make sure we did not leak any file descriptor. */
 	nr_end = open_files_cnt();
@@ -325,9 +322,9 @@ static int test__dso_data_reopen(struct test_suite *test __maybe_unused, int sub
 	long nr_end, nr = open_files_cnt(), lim = new_limit(3);
 	int fd, fd_extra;
 
-#define dso_0 (dsos[0])
-#define dso_1 (dsos[1])
-#define dso_2 (dsos[2])
+#define dso_0 (machine.dsos.dsos[0])
+#define dso_1 (machine.dsos.dsos[1])
+#define dso_2 (machine.dsos.dsos[2])
 
 	/* Rest the internal dso open counter limit. */
 	reset_fd_limit();
@@ -347,7 +344,8 @@ static int test__dso_data_reopen(struct test_suite *test __maybe_unused, int sub
 	TEST_ASSERT_VAL("failed to set file limit",
 			!set_fd_limit((lim)));
 
-	TEST_ASSERT_VAL("failed to create dsos\n", !dsos__create(3, TEST_FILE_SIZE));
+	TEST_ASSERT_VAL("failed to create dsos\n",
+			!dsos__create(3, TEST_FILE_SIZE, &machine.dsos));
 
 	/* open dso_0 */
 	fd = dso__data_fd(dso_0, &machine);
@@ -386,7 +384,7 @@ static int test__dso_data_reopen(struct test_suite *test __maybe_unused, int sub
 
 	/* cleanup everything */
 	close(fd_extra);
-	dsos__delete(3);
+	dsos__delete(&machine.dsos);
 
 	/* Make sure we did not leak any file descriptor. */
 	nr_end = open_files_cnt();
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index ddf58f594df0..83de99e52141 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -497,14 +497,20 @@ static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
 static void dso__list_add(struct dso *dso)
 {
 	list_add_tail(&dso__data(dso)->open_entry, &dso__data_open);
+#ifdef REFCNT_CHECKING
 	dso__data(dso)->dso = dso__get(dso);
+#endif
+	/* Assume the dso is part of dsos, hence the optional reference count above. */
+	assert(dso__dsos(dso));
 	dso__data_open_cnt++;
 }
 
 static void dso__list_del(struct dso *dso)
 {
 	list_del_init(&dso__data(dso)->open_entry);
+#ifdef REFCNT_CHECKING
 	dso__put(dso__data(dso)->dso);
+#endif
 	WARN_ONCE(dso__data_open_cnt <= 0,
 		  "DSO data fd counter out of bounds.");
 	dso__data_open_cnt--;
@@ -654,9 +660,15 @@ static void close_dso(struct dso *dso)
 static void close_first_dso(void)
 {
 	struct dso_data *dso_data;
+	struct dso *dso;
 
 	dso_data = list_first_entry(&dso__data_open, struct dso_data, open_entry);
-	close_dso(dso_data->dso);
+#ifdef REFCNT_CHECKING
+	dso = dso_data->dso;
+#else
+	dso = container_of(dso_data, struct dso, data);
+#endif
+	close_dso(dso);
 }
 
 static rlim_t get_fd_limit(void)
@@ -1448,7 +1460,9 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 		data->fd = -1;
 		data->status = DSO_DATA_STATUS_UNKNOWN;
 		INIT_LIST_HEAD(&data->open_entry);
+#ifdef REFCNT_CHECKING
 		data->dso = NULL; /* Set when on the open_entry list. */
+#endif
 	}
 	return res;
 }
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 3e27f93898f2..3311c1740840 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -147,7 +147,9 @@ struct dso_cache {
 struct dso_data {
 	struct rb_root	 cache;
 	struct list_head open_entry;
+#ifdef REFCNT_CHECKING
 	struct dso	 *dso;
+#endif
 	int		 fd;
 	int		 status;
 	u32		 status_seen;
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes
  2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
                   ` (24 preceding siblings ...)
  2024-01-03  5:06 ` [PATCH v7 25/25] perf dso: Use container_of to avoid a pointer in dso_data Ian Rogers
@ 2024-01-31 22:22 ` Ian Rogers
  25 siblings, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-01-31 22:22 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Nick Terrell, Kan Liang, Andi Kleen,
	Kajol Jain, Athira Rajeev, Huacai Chen, Masami Hiramatsu,
	Vincent Whitchurch, Steinar H. Gunderson, Liam Howlett,
	Miguel Ojeda, Colin Ian King, Dmitrii Dolgov, Yang Jihong,
	Ming Wang, James Clark, K Prateek Nayak, Sean Christopherson,
	Leo Yan, Ravi Bangoria, German Gomez, Changbin Du, Paolo Bonzini,
	Li Dong, Sandipan Das, linux-kernel, linux-perf-users,
	Guilherme Amadio

On Tue, Jan 2, 2024 at 9:06 PM Ian Rogers <irogers@google.com> wrote:
>
> Modify the implementation of maps to not use an rbtree as the
> container for maps, instead use a sorted array. Improve locking and
> reference counting issues.
>
> Similar to maps, separate out and reimplement threads to use a hashmap
> for lower memory consumption and faster lookup. This fixes a
> regression in memory usage where reference count checking switched to
> using non-invasive tree nodes. Reduce its default size by a factor of 32
> and improve locking discipline. Also, fix regressions where tids had
> become unordered to make `perf report --tasks` and
> `perf trace --summary` output easier to read.
>
> Better encapsulate the dsos abstraction. Remove the linked list and
> rbtree used for faster iteration and log(n) lookup to a sorted array
> for similar performance but half the memory usage per dso. Improve
> reference counting and locking discipline, adding reference count
> checking to dso.
>
> v7:
>  - rebase to latest perf-tools-next where 22 patches were applied by Arnaldo.
>  - resolve merge conflicts, in particular with fc044c53b99f ("perf
>    annotate-data: Add dso->data_types tree") that required more dso
>    accessor functions.

Ping. No review comments:
https://lore.kernel.org/lkml/20240103050635.391888-1-irogers@google.com/

This may start to conflict with Adrian's work:
https://lore.kernel.org/lkml/20240131192416.16387-1-adrian.hunter@intel.com/
but should just need minor get/put cleanup.

Thanks,
Ian


> v6 series is here:
> https://lore.kernel.org/lkml/20231207011722.1220634-1-irogers@google.com/
>
> Ian Rogers (25):
>   perf maps: Switch from rbtree to lazily sorted array for addresses
>   perf maps: Get map before returning in maps__find
>   perf maps: Get map before returning in maps__find_by_name
>   perf maps: Get map before returning in maps__find_next_entry
>   perf maps: Hide maps internals
>   perf maps: Locking tidy up of nr_maps
>   perf dso: Reorder variables to save space in struct dso
>   perf report: Sort child tasks by tid
>   perf trace: Ignore thread hashing in summary
>   perf machine: Move fprintf to for_each loop and a callback
>   perf threads: Move threads to its own files
>   perf threads: Switch from rbtree to hashmap
>   perf threads: Reduce table size from 256 to 8
>   perf dsos: Attempt to better abstract dsos internals
>   perf dsos: Tidy reference counting and locking
>   perf dsos: Add dsos__for_each_dso
>   perf dso: Move dso functions out of dsos
>   perf dsos: Switch more loops to dsos__for_each_dso
>   perf dsos: Switch backing storage to array from rbtree/list
>   perf dsos: Remove __dsos__addnew
>   perf dsos: Remove __dsos__findnew_link_by_longname_id
>   perf dsos: Switch hand code to bsearch
>   perf dso: Add reference count checking and accessor functions
>   perf dso: Reference counting related fixes
>   perf dso: Use container_of to avoid a pointer in dso_data
>
>  tools/perf/arch/x86/tests/dwarf-unwind.c      |    1 +
>  tools/perf/builtin-annotate.c                 |    8 +-
>  tools/perf/builtin-buildid-cache.c            |    2 +-
>  tools/perf/builtin-buildid-list.c             |   18 +-
>  tools/perf/builtin-inject.c                   |   96 +-
>  tools/perf/builtin-kallsyms.c                 |    2 +-
>  tools/perf/builtin-mem.c                      |    4 +-
>  tools/perf/builtin-record.c                   |    2 +-
>  tools/perf/builtin-report.c                   |  209 +--
>  tools/perf/builtin-script.c                   |    8 +-
>  tools/perf/builtin-top.c                      |    4 +-
>  tools/perf/builtin-trace.c                    |   43 +-
>  tools/perf/tests/code-reading.c               |    8 +-
>  tools/perf/tests/dso-data.c                   |   67 +-
>  tools/perf/tests/hists_common.c               |    6 +-
>  tools/perf/tests/hists_cumulate.c             |    4 +-
>  tools/perf/tests/hists_output.c               |    2 +-
>  tools/perf/tests/maps.c                       |    7 +-
>  tools/perf/tests/symbols.c                    |    2 +-
>  tools/perf/tests/thread-maps-share.c          |    8 +-
>  tools/perf/tests/vmlinux-kallsyms.c           |   16 +-
>  tools/perf/ui/browsers/annotate.c             |    6 +-
>  tools/perf/ui/browsers/hists.c                |    8 +-
>  tools/perf/ui/browsers/map.c                  |    4 +-
>  tools/perf/util/Build                         |    1 +
>  tools/perf/util/annotate-data.c               |    6 +-
>  tools/perf/util/annotate.c                    |   45 +-
>  tools/perf/util/auxtrace.c                    |    2 +-
>  tools/perf/util/block-info.c                  |    2 +-
>  tools/perf/util/bpf-event.c                   |    9 +-
>  tools/perf/util/bpf_lock_contention.c         |    8 +-
>  tools/perf/util/build-id.c                    |  136 +-
>  tools/perf/util/build-id.h                    |    2 -
>  tools/perf/util/callchain.c                   |    4 +-
>  tools/perf/util/data-convert-json.c           |    2 +-
>  tools/perf/util/db-export.c                   |    6 +-
>  tools/perf/util/dlfilter.c                    |   12 +-
>  tools/perf/util/dso.c                         |  469 +++---
>  tools/perf/util/dso.h                         |  549 +++++--
>  tools/perf/util/dsos.c                        |  529 ++++---
>  tools/perf/util/dsos.h                        |   40 +-
>  tools/perf/util/event.c                       |   12 +-
>  tools/perf/util/header.c                      |    8 +-
>  tools/perf/util/hist.c                        |    4 +-
>  tools/perf/util/intel-pt.c                    |   22 +-
>  tools/perf/util/machine.c                     |  570 +++-----
>  tools/perf/util/machine.h                     |   32 +-
>  tools/perf/util/map.c                         |   73 +-
>  tools/perf/util/maps.c                        | 1280 +++++++++++------
>  tools/perf/util/maps.h                        |   65 +-
>  tools/perf/util/probe-event.c                 |   26 +-
>  tools/perf/util/rb_resort.h                   |    5 -
>  .../util/scripting-engines/trace-event-perl.c |    6 +-
>  .../scripting-engines/trace-event-python.c    |   21 +-
>  tools/perf/util/session.c                     |   21 +
>  tools/perf/util/session.h                     |    2 +
>  tools/perf/util/sort.c                        |   19 +-
>  tools/perf/util/srcline.c                     |   65 +-
>  tools/perf/util/symbol-elf.c                  |  132 +-
>  tools/perf/util/symbol.c                      |  217 +--
>  tools/perf/util/symbol_fprintf.c              |    4 +-
>  tools/perf/util/synthetic-events.c            |   24 +-
>  tools/perf/util/thread.c                      |    8 +-
>  tools/perf/util/thread.h                      |    6 -
>  tools/perf/util/threads.c                     |  186 +++
>  tools/perf/util/threads.h                     |   35 +
>  tools/perf/util/unwind-libunwind-local.c      |   20 +-
>  tools/perf/util/unwind-libunwind.c            |    9 +-
>  tools/perf/util/vdso.c                        |   56 +-
>  69 files changed, 3127 insertions(+), 2158 deletions(-)
>  create mode 100644 tools/perf/util/threads.c
>  create mode 100644 tools/perf/util/threads.h
>
> --
> 2.43.0.472.g3155946c3a-goog
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses
  2024-01-03  5:06 ` [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses Ian Rogers
@ 2024-02-02  2:48   ` Namhyung Kim
  2024-02-02  4:20     ` Ian Rogers
  0 siblings, 1 reply; 31+ messages in thread
From: Namhyung Kim @ 2024-02-02  2:48 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Nick Terrell, Kan Liang, Andi Kleen, Kajol Jain, Athira Rajeev,
	Huacai Chen, Masami Hiramatsu, Vincent Whitchurch,
	Steinar H. Gunderson, Liam Howlett, Miguel Ojeda, Colin Ian King,
	Dmitrii Dolgov, Yang Jihong, Ming Wang, James Clark,
	K Prateek Nayak, Sean Christopherson, Leo Yan, Ravi Bangoria,
	German Gomez, Changbin Du, Paolo Bonzini, Li Dong, Sandipan Das,
	liuwenyu, linux-kernel, linux-perf-users, Guilherme Amadio

Hi Ian,

On Tue, Jan 2, 2024 at 9:07 PM Ian Rogers <irogers@google.com> wrote:
>
> Maps is a collection of maps primarily sorted by the starting address
> of the map. Prior to this change the maps were held in an rbtree
> requiring 4 pointers per node. Prior to reference count checking, the
> rbnode was embedded in the map so 3 pointers per node were
> necessary. This change switches the rbtree to an array lazily sorted
> by address, much as the array sorting nodes by name. 1 pointer is
> needed per node, but to avoid excessive resizing the backing array may
> be twice the number of used elements. Meaning the memory overhead is
> roughly half that of the rbtree. For a perf record with
> "--no-bpf-event -g -a" of true, the memory overhead of perf inject is
> reduced from 3.3MB to 3MB, so 10% or 300KB is saved.
>
> Map inserts always happen at the end of the array. The code tracks
> whether the insertion violates the sorting property. O(log n) rb-tree
> complexity is switched to O(1).
>
> Remove slides the array, so O(log n) rb-tree complexity is degraded to
> O(n).
>
> A find may need to sort the array using qsort which is O(n*log n), but
> in general the maps should be sorted and so average performance should
> be O(log n) as with the rbtree.
>
> An rbtree node consumes a cache line, but with the array 4 nodes fit
> on a cache line. Iteration is simplified to scanning an array rather
> than pointer chasing.
>
> Overall it is expected the performance after the change should be
> comparable to before, but with half of the memory consumed.

I don't know how much performance impact it would have but I guess
search/iteration would be the most frequent operation.  So I like the
memory saving it can bring.

>
> To avoid a list and repeated logic around splitting maps,
> maps__merge_in is rewritten in terms of
> maps__fixup_overlap_and_insert. maps__merge_in splits the given
> mapping, inserting the remaining gaps. maps__fixup_overlap_and_insert
> splits the existing mappings, then adds the incoming mapping. By
> adding the new mapping first and then re-inserting the existing
> mappings, the splitting behavior matches.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/tests/maps.c |    3 +
>  tools/perf/util/map.c   |    1 +
>  tools/perf/util/maps.c  | 1183 +++++++++++++++++++++++----------------
>  tools/perf/util/maps.h  |   54 +-
>  4 files changed, 757 insertions(+), 484 deletions(-)
>
> diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c
> index bb3fbfe5a73e..b15417a0d617 100644
> --- a/tools/perf/tests/maps.c
> +++ b/tools/perf/tests/maps.c
> @@ -156,6 +156,9 @@ static int test__maps__merge_in(struct test_suite *t __maybe_unused, int subtest
>         TEST_ASSERT_VAL("merge check failed", !ret);
>
>         maps__zput(maps);
> +       map__zput(map_kcore1);
> +       map__zput(map_kcore2);
> +       map__zput(map_kcore3);
>         return TEST_OK;
>  }
>
> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> index 54c67cb7ecef..cf5a15db3a1f 100644
> --- a/tools/perf/util/map.c
> +++ b/tools/perf/util/map.c
> @@ -168,6 +168,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>                 if (dso == NULL)
>                         goto out_delete;
>
> +               assert(!dso->kernel);
>                 map__init(result, start, start + len, pgoff, dso);
>
>                 if (anon || no_dso) {
> diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
> index 0334fc18d9c6..6ee81160cdab 100644
> --- a/tools/perf/util/maps.c
> +++ b/tools/perf/util/maps.c
> @@ -10,286 +10,477 @@
>  #include "ui/ui.h"
>  #include "unwind.h"
>
> -struct map_rb_node {
> -       struct rb_node rb_node;
> -       struct map *map;
> -};
> -
> -#define maps__for_each_entry(maps, map) \
> -       for (map = maps__first(maps); map; map = map_rb_node__next(map))
> +static void check_invariants(const struct maps *maps __maybe_unused)
> +{
> +#ifndef NDEBUG
> +       assert(RC_CHK_ACCESS(maps)->nr_maps <= RC_CHK_ACCESS(maps)->nr_maps_allocated);
> +       for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
> +               struct map *map = RC_CHK_ACCESS(maps)->maps_by_address[i];
> +
> +               /* Check map is well-formed. */
> +               assert(map__end(map) == 0 || map__start(map) <= map__end(map));
> +               /* Expect at least 1 reference count. */
> +               assert(refcount_read(map__refcnt(map)) > 0);
> +
> +               if (map__dso(map) && map__dso(map)->kernel)
> +                       assert(RC_CHK_EQUAL(map__kmap(map)->kmaps, maps));
> +
> +               if (i > 0) {
> +                       struct map *prev = RC_CHK_ACCESS(maps)->maps_by_address[i - 1];
> +
> +                       /* If addresses are sorted... */
> +                       if (RC_CHK_ACCESS(maps)->maps_by_address_sorted) {
> +                               /* Maps should be in start address order. */
> +                               assert(map__start(prev) <= map__start(map));
> +                               /*
> +                                * If the ends of maps aren't broken (during
> +                                * construction) then they should be ordered
> +                                * too.
> +                                */
> +                               if (!RC_CHK_ACCESS(maps)->ends_broken) {
> +                                       assert(map__end(prev) <= map__end(map));
> +                                       assert(map__end(prev) <= map__start(map) ||
> +                                              map__start(prev) == map__start(map));
> +                               }
> +                       }
> +               }
> +       }
> +       if (RC_CHK_ACCESS(maps)->maps_by_name) {
> +               for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
> +                       struct map *map = RC_CHK_ACCESS(maps)->maps_by_name[i];
>
> -#define maps__for_each_entry_safe(maps, map, next) \
> -       for (map = maps__first(maps), next = map_rb_node__next(map); map; \
> -            map = next, next = map_rb_node__next(map))
> +                       /*
> +                        * Maps by name maps should be in maps_by_address, so
> +                        * the reference count should be higher.
> +                        */
> +                       assert(refcount_read(map__refcnt(map)) > 1);
> +               }
> +       }
> +#endif
> +}
>
> -static struct rb_root *maps__entries(struct maps *maps)
> +static struct map **maps__maps_by_address(const struct maps *maps)
>  {
> -       return &RC_CHK_ACCESS(maps)->entries;
> +       return RC_CHK_ACCESS(maps)->maps_by_address;
>  }
>
> -static struct rw_semaphore *maps__lock(struct maps *maps)
> +static void maps__set_maps_by_address(struct maps *maps, struct map **new)
>  {
> -       return &RC_CHK_ACCESS(maps)->lock;
> +       RC_CHK_ACCESS(maps)->maps_by_address = new;
> +
>  }
>
> -static struct map **maps__maps_by_name(struct maps *maps)
> +/* Not in the header, to aid reference counting. */
> +static struct map **maps__maps_by_name(const struct maps *maps)
>  {
>         return RC_CHK_ACCESS(maps)->maps_by_name;
> +
>  }
>
> -static struct map_rb_node *maps__first(struct maps *maps)
> +static void maps__set_maps_by_name(struct maps *maps, struct map **new)
>  {
> -       struct rb_node *first = rb_first(maps__entries(maps));
> +       RC_CHK_ACCESS(maps)->maps_by_name = new;
>
> -       if (first)
> -               return rb_entry(first, struct map_rb_node, rb_node);
> -       return NULL;
>  }
>
> -static struct map_rb_node *map_rb_node__next(struct map_rb_node *node)
> +static bool maps__maps_by_address_sorted(const struct maps *maps)
>  {
> -       struct rb_node *next;
> -
> -       if (!node)
> -               return NULL;
> -
> -       next = rb_next(&node->rb_node);
> +       return RC_CHK_ACCESS(maps)->maps_by_address_sorted;
> +}
>
> -       if (!next)
> -               return NULL;
> +static void maps__set_maps_by_address_sorted(struct maps *maps, bool value)
> +{
> +       RC_CHK_ACCESS(maps)->maps_by_address_sorted = value;
> +}
>
> -       return rb_entry(next, struct map_rb_node, rb_node);
> +static bool maps__maps_by_name_sorted(const struct maps *maps)
> +{
> +       return RC_CHK_ACCESS(maps)->maps_by_name_sorted;
>  }
>
> -static struct map_rb_node *maps__find_node(struct maps *maps, struct map *map)
> +static void maps__set_maps_by_name_sorted(struct maps *maps, bool value)
>  {
> -       struct map_rb_node *rb_node;
> +       RC_CHK_ACCESS(maps)->maps_by_name_sorted = value;
> +}
>
> -       maps__for_each_entry(maps, rb_node) {
> -               if (rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map))
> -                       return rb_node;
> -       }
> -       return NULL;
> +static struct rw_semaphore *maps__lock(struct maps *maps)
> +{
> +       /*
> +        * When the lock is acquired or released the maps invariants should
> +        * hold.
> +        */
> +       check_invariants(maps);
> +       return &RC_CHK_ACCESS(maps)->lock;
>  }
>
>  static void maps__init(struct maps *maps, struct machine *machine)
>  {
> -       refcount_set(maps__refcnt(maps), 1);
>         init_rwsem(maps__lock(maps));
> -       RC_CHK_ACCESS(maps)->entries = RB_ROOT;
> +       RC_CHK_ACCESS(maps)->maps_by_address = NULL;
> +       RC_CHK_ACCESS(maps)->maps_by_name = NULL;
>         RC_CHK_ACCESS(maps)->machine = machine;
> -       RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
> +#ifdef HAVE_LIBUNWIND_SUPPORT
> +       RC_CHK_ACCESS(maps)->addr_space = NULL;
> +       RC_CHK_ACCESS(maps)->unwind_libunwind_ops = NULL;
> +#endif
> +       refcount_set(maps__refcnt(maps), 1);
>         RC_CHK_ACCESS(maps)->nr_maps = 0;
> -       RC_CHK_ACCESS(maps)->maps_by_name = NULL;
> +       RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
> +       RC_CHK_ACCESS(maps)->last_search_by_name_idx = 0;
> +       RC_CHK_ACCESS(maps)->maps_by_address_sorted = true;
> +       RC_CHK_ACCESS(maps)->maps_by_name_sorted = false;
>  }
>
> -static void __maps__free_maps_by_name(struct maps *maps)
> +static void maps__exit(struct maps *maps)
>  {
> -       /*
> -        * Free everything to try to do it from the rbtree in the next search
> -        */
> -       for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
> -               map__put(maps__maps_by_name(maps)[i]);
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +       struct map **maps_by_name = maps__maps_by_name(maps);
>
> -       zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
> -       RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
> +       for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> +               map__zput(maps_by_address[i]);
> +               if (maps_by_name)
> +                       map__zput(maps_by_name[i]);
> +       }
> +       zfree(&maps_by_address);
> +       zfree(&maps_by_name);
> +       unwind__finish_access(maps);
>  }
>
> -static int __maps__insert(struct maps *maps, struct map *map)
> +struct maps *maps__new(struct machine *machine)
>  {
> -       struct rb_node **p = &maps__entries(maps)->rb_node;
> -       struct rb_node *parent = NULL;
> -       const u64 ip = map__start(map);
> -       struct map_rb_node *m, *new_rb_node;
> -
> -       new_rb_node = malloc(sizeof(*new_rb_node));
> -       if (!new_rb_node)
> -               return -ENOMEM;
> -
> -       RB_CLEAR_NODE(&new_rb_node->rb_node);
> -       new_rb_node->map = map__get(map);
> +       struct maps *result;
> +       RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
>
> -       while (*p != NULL) {
> -               parent = *p;
> -               m = rb_entry(parent, struct map_rb_node, rb_node);
> -               if (ip < map__start(m->map))
> -                       p = &(*p)->rb_left;
> -               else
> -                       p = &(*p)->rb_right;
> -       }
> +       if (ADD_RC_CHK(result, maps))
> +               maps__init(result, machine);
>
> -       rb_link_node(&new_rb_node->rb_node, parent, p);
> -       rb_insert_color(&new_rb_node->rb_node, maps__entries(maps));
> -       return 0;
> +       return result;
>  }
>
> -int maps__insert(struct maps *maps, struct map *map)
> +static void maps__delete(struct maps *maps)
>  {
> -       int err;
> -       const struct dso *dso = map__dso(map);
> -
> -       down_write(maps__lock(maps));
> -       err = __maps__insert(maps, map);
> -       if (err)
> -               goto out;
> +       maps__exit(maps);
> +       RC_CHK_FREE(maps);
> +}
>
> -       ++RC_CHK_ACCESS(maps)->nr_maps;
> +struct maps *maps__get(struct maps *maps)
> +{
> +       struct maps *result;
>
> -       if (dso && dso->kernel) {
> -               struct kmap *kmap = map__kmap(map);
> +       if (RC_CHK_GET(result, maps))
> +               refcount_inc(maps__refcnt(maps));
>
> -               if (kmap)
> -                       kmap->kmaps = maps;
> -               else
> -                       pr_err("Internal error: kernel dso with non kernel map\n");
> -       }
> +       return result;
> +}
>
> +void maps__put(struct maps *maps)
> +{
> +       if (maps && refcount_dec_and_test(maps__refcnt(maps)))
> +               maps__delete(maps);
> +       else
> +               RC_CHK_PUT(maps);
> +}
>
> +static void __maps__free_maps_by_name(struct maps *maps)
> +{
>         /*
> -        * If we already performed some search by name, then we need to add the just
> -        * inserted map and resort.
> +        * Free everything to try to do it from the rbtree in the next search
>          */
> -       if (maps__maps_by_name(maps)) {
> -               if (maps__nr_maps(maps) > RC_CHK_ACCESS(maps)->nr_maps_allocated) {
> -                       int nr_allocate = maps__nr_maps(maps) * 2;
> -                       struct map **maps_by_name = realloc(maps__maps_by_name(maps),
> -                                                           nr_allocate * sizeof(map));
> +       for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
> +               map__put(maps__maps_by_name(maps)[i]);
>
> -                       if (maps_by_name == NULL) {
> -                               __maps__free_maps_by_name(maps);
> -                               err = -ENOMEM;
> -                               goto out;
> -                       }
> +       zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
> +}
>
> -                       RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
> -                       RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
> +static int map__start_cmp(const void *a, const void *b)
> +{
> +       const struct map *map_a = *(const struct map * const *)a;
> +       const struct map *map_b = *(const struct map * const *)b;
> +       u64 map_a_start = map__start(map_a);
> +       u64 map_b_start = map__start(map_b);
> +
> +       if (map_a_start == map_b_start) {
> +               u64 map_a_end = map__end(map_a);
> +               u64 map_b_end = map__end(map_b);
> +
> +               if  (map_a_end == map_b_end) {
> +                       /* Ensure maps with the same addresses have a fixed order. */
> +                       if (RC_CHK_ACCESS(map_a) == RC_CHK_ACCESS(map_b))
> +                               return 0;
> +                       return (intptr_t)RC_CHK_ACCESS(map_a) > (intptr_t)RC_CHK_ACCESS(map_b)
> +                               ? 1 : -1;
>                 }
> -               maps__maps_by_name(maps)[maps__nr_maps(maps) - 1] = map__get(map);
> -               __maps__sort_by_name(maps);
> +               return map_a_end > map_b_end ? 1 : -1;
>         }
> - out:
> -       up_write(maps__lock(maps));
> -       return err;
> +       return map_a_start > map_b_start ? 1 : -1;
>  }
>
> -static void __maps__remove(struct maps *maps, struct map_rb_node *rb_node)
> +static void __maps__sort_by_address(struct maps *maps)
>  {
> -       rb_erase_init(&rb_node->rb_node, maps__entries(maps));
> -       map__put(rb_node->map);
> -       free(rb_node);
> +       if (maps__maps_by_address_sorted(maps))
> +               return;
> +
> +       qsort(maps__maps_by_address(maps),
> +               maps__nr_maps(maps),
> +               sizeof(struct map *),
> +               map__start_cmp);
> +       maps__set_maps_by_address_sorted(maps, true);
>  }
>
> -void maps__remove(struct maps *maps, struct map *map)
> +static void maps__sort_by_address(struct maps *maps)
>  {
> -       struct map_rb_node *rb_node;
> -
>         down_write(maps__lock(maps));
> -       if (RC_CHK_ACCESS(maps)->last_search_by_name == map)
> -               RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
> -
> -       rb_node = maps__find_node(maps, map);
> -       assert(rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map));
> -       __maps__remove(maps, rb_node);
> -       if (maps__maps_by_name(maps))
> -               __maps__free_maps_by_name(maps);
> -       --RC_CHK_ACCESS(maps)->nr_maps;
> +       __maps__sort_by_address(maps);
>         up_write(maps__lock(maps));
>  }
>
> -static void __maps__purge(struct maps *maps)
> +static int map__strcmp(const void *a, const void *b)
>  {
> -       struct map_rb_node *pos, *next;
> -
> -       if (maps__maps_by_name(maps))
> -               __maps__free_maps_by_name(maps);
> +       const struct map *map_a = *(const struct map * const *)a;
> +       const struct map *map_b = *(const struct map * const *)b;
> +       const struct dso *dso_a = map__dso(map_a);
> +       const struct dso *dso_b = map__dso(map_b);
> +       int ret = strcmp(dso_a->short_name, dso_b->short_name);
>
> -       maps__for_each_entry_safe(maps, pos, next) {
> -               rb_erase_init(&pos->rb_node,  maps__entries(maps));
> -               map__put(pos->map);
> -               free(pos);
> +       if (ret == 0 && RC_CHK_ACCESS(map_a) != RC_CHK_ACCESS(map_b)) {
> +               /* Ensure distinct but name equal maps have an order. */
> +               return map__start_cmp(a, b);
>         }
> +       return ret;
>  }
>
> -static void maps__exit(struct maps *maps)
> +static int maps__sort_by_name(struct maps *maps)
>  {
> +       int err = 0;
>         down_write(maps__lock(maps));
> -       __maps__purge(maps);
> +       if (!maps__maps_by_name_sorted(maps)) {
> +               struct map **maps_by_name = maps__maps_by_name(maps);
> +
> +               if (!maps_by_name) {
> +                       maps_by_name = malloc(RC_CHK_ACCESS(maps)->nr_maps_allocated *
> +                                       sizeof(*maps_by_name));
> +                       if (!maps_by_name)
> +                               err = -ENOMEM;
> +                       else {
> +                               struct map **maps_by_address = maps__maps_by_address(maps);
> +                               unsigned int n = maps__nr_maps(maps);
> +
> +                               maps__set_maps_by_name(maps, maps_by_name);
> +                               for (unsigned int i = 0; i < n; i++)
> +                                       maps_by_name[i] = map__get(maps_by_address[i]);
> +                       }
> +               }
> +               if (!err) {
> +                       qsort(maps_by_name,
> +                               maps__nr_maps(maps),
> +                               sizeof(struct map *),
> +                               map__strcmp);
> +                       maps__set_maps_by_name_sorted(maps, true);
> +               }
> +       }
>         up_write(maps__lock(maps));
> +       return err;
>  }
>
> -bool maps__empty(struct maps *maps)
> +static unsigned int maps__by_address_index(const struct maps *maps, const struct map *map)
>  {
> -       return !maps__first(maps);
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +
> +       if (maps__maps_by_address_sorted(maps)) {
> +               struct map **mapp =
> +                       bsearch(&map, maps__maps_by_address(maps), maps__nr_maps(maps),
> +                               sizeof(*mapp), map__start_cmp);
> +
> +               if (mapp)
> +                       return mapp - maps_by_address;
> +       } else {
> +               for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> +                       if (RC_CHK_ACCESS(maps_by_address[i]) == RC_CHK_ACCESS(map))
> +                               return i;
> +               }
> +       }
> +       pr_err("Map missing from maps");
> +       return -1;
>  }
>
> -struct maps *maps__new(struct machine *machine)
> +static unsigned int maps__by_name_index(const struct maps *maps, const struct map *map)
>  {
> -       struct maps *result;
> -       RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
> +       struct map **maps_by_name = maps__maps_by_name(maps);
> +
> +       if (maps__maps_by_name_sorted(maps)) {
> +               struct map **mapp =
> +                       bsearch(&map, maps_by_name, maps__nr_maps(maps),
> +                               sizeof(*mapp), map__strcmp);
> +
> +               if (mapp)
> +                       return mapp - maps_by_name;
> +       } else {
> +               for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> +                       if (RC_CHK_ACCESS(maps_by_name[i]) == RC_CHK_ACCESS(map))
> +                               return i;
> +               }
> +       }
> +       pr_err("Map missing from maps");
> +       return -1;
> +}
>
> -       if (ADD_RC_CHK(result, maps))
> -               maps__init(result, machine);
> +static int __maps__insert(struct maps *maps, struct map *new)
> +{
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +       struct map **maps_by_name = maps__maps_by_name(maps);
> +       const struct dso *dso = map__dso(new);
> +       unsigned int nr_maps = maps__nr_maps(maps);
> +       unsigned int nr_allocate = RC_CHK_ACCESS(maps)->nr_maps_allocated;
> +
> +       if (nr_maps + 1 > nr_allocate) {
> +               nr_allocate = !nr_allocate ? 32 : nr_allocate * 2;
> +
> +               maps_by_address = realloc(maps_by_address, nr_allocate * sizeof(new));
> +               if (!maps_by_address)
> +                       return -ENOMEM;
> +
> +               maps__set_maps_by_address(maps, maps_by_address);
> +               if (maps_by_name) {
> +                       maps_by_name = realloc(maps_by_name, nr_allocate * sizeof(new));
> +                       if (!maps_by_name) {
> +                               /*
> +                                * If by name fails, just disable by name and it will
> +                                * recompute next time it is required.
> +                                */
> +                               __maps__free_maps_by_name(maps);
> +                       }
> +                       maps__set_maps_by_name(maps, maps_by_name);
> +               }
> +               RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
> +       }
> +       /* Insert the value at the end. */
> +       maps_by_address[nr_maps] = map__get(new);
> +       if (maps_by_name)
> +               maps_by_name[nr_maps] = map__get(new);
>
> -       return result;
> +       nr_maps++;
> +       RC_CHK_ACCESS(maps)->nr_maps = nr_maps;
> +
> +       /*
> +        * Recompute if things are sorted. If things are inserted in a sorted
> +        * manner, for example by processing /proc/pid/maps, then no
> +        * sorting/resorting will be necessary.
> +        */
> +       if (nr_maps == 1) {
> +               /* If there's just 1 entry then maps are sorted. */
> +               maps__set_maps_by_address_sorted(maps, true);
> +               maps__set_maps_by_name_sorted(maps, maps_by_name != NULL);
> +       } else {
> +               /* Sorted if maps were already sorted and this map starts after the last one. */
> +               maps__set_maps_by_address_sorted(maps,
> +                       maps__maps_by_address_sorted(maps) &&
> +                       map__end(maps_by_address[nr_maps - 2]) <= map__start(new));
> +               maps__set_maps_by_name_sorted(maps, false);
> +       }
> +       if (map__end(new) < map__start(new))
> +               RC_CHK_ACCESS(maps)->ends_broken = true;
> +       if (dso && dso->kernel) {
> +               struct kmap *kmap = map__kmap(new);
> +
> +               if (kmap)
> +                       kmap->kmaps = maps;
> +               else
> +                       pr_err("Internal error: kernel dso with non kernel map\n");
> +       }
> +       check_invariants(maps);

Probably not needed as it's checked when you get the lock below.


> +       return 0;
>  }
>
> -static void maps__delete(struct maps *maps)
> +int maps__insert(struct maps *maps, struct map *map)
>  {
> -       maps__exit(maps);
> -       unwind__finish_access(maps);
> -       RC_CHK_FREE(maps);
> +       int ret;
> +
> +       down_write(maps__lock(maps));
> +       ret = __maps__insert(maps, map);
> +       up_write(maps__lock(maps));
> +       return ret;
>  }
>
> -struct maps *maps__get(struct maps *maps)
> +static void __maps__remove(struct maps *maps, struct map *map)
>  {
> -       struct maps *result;
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +       struct map **maps_by_name = maps__maps_by_name(maps);
> +       unsigned int nr_maps = maps__nr_maps(maps);
> +       unsigned int address_idx;
> +
> +       /* Slide later mappings over the one to remove */
> +       address_idx = maps__by_address_index(maps, map);
> +       map__put(maps_by_address[address_idx]);
> +       memmove(&maps_by_address[address_idx],
> +               &maps_by_address[address_idx + 1],
> +               (nr_maps - address_idx - 1) * sizeof(*maps_by_address));
> +
> +       if (maps_by_name) {
> +               unsigned int name_idx = maps__by_name_index(maps, map);
> +
> +               map__put(maps_by_name[name_idx]);
> +               memmove(&maps_by_name[name_idx],
> +                       &maps_by_name[name_idx + 1],
> +                       (nr_maps - name_idx - 1) *  sizeof(*maps_by_name));
> +       }
>
> -       if (RC_CHK_GET(result, maps))
> -               refcount_inc(maps__refcnt(maps));
> +       --RC_CHK_ACCESS(maps)->nr_maps;
> +       check_invariants(maps);

Ditto.

> +}
>
> -       return result;
> +void maps__remove(struct maps *maps, struct map *map)
> +{
> +       down_write(maps__lock(maps));
> +       __maps__remove(maps, map);
> +       up_write(maps__lock(maps));
>  }
>
> -void maps__put(struct maps *maps)
> +bool maps__empty(struct maps *maps)
>  {
> -       if (maps && refcount_dec_and_test(maps__refcnt(maps)))
> -               maps__delete(maps);
> -       else
> -               RC_CHK_PUT(maps);
> +       return maps__nr_maps(maps) == 0;
>  }
>
>  int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data), void *data)
>  {
> -       struct map_rb_node *pos;
> +       bool done = false;
>         int ret = 0;
>
> -       down_read(maps__lock(maps));
> -       maps__for_each_entry(maps, pos) {
> -               ret = cb(pos->map, data);
> -               if (ret)
> -                       break;
> +       /* See locking/sorting note. */
> +       while (!done) {
> +               down_read(maps__lock(maps));
> +               if (maps__maps_by_address_sorted(maps)) {
> +                       struct map **maps_by_address = maps__maps_by_address(maps);
> +                       unsigned int n = maps__nr_maps(maps);
> +
> +                       for (unsigned int i = 0; i < n; i++) {
> +                               struct map *map = maps_by_address[i];
> +
> +                               ret = cb(map, data);
> +                               if (ret)
> +                                       break;
> +                       }
> +                       done = true;
> +               }
> +               up_read(maps__lock(maps));
> +               if (!done)
> +                       maps__sort_by_address(maps);
>         }
> -       up_read(maps__lock(maps));
>         return ret;
>  }
>
>  void maps__remove_maps(struct maps *maps, bool (*cb)(struct map *map, void *data), void *data)
>  {
> -       struct map_rb_node *pos, *next;
> -       unsigned int start_nr_maps;
> +       struct map **maps_by_address;
>
>         down_write(maps__lock(maps));
>
> -       start_nr_maps = maps__nr_maps(maps);
> -       maps__for_each_entry_safe(maps, pos, next)      {
> -               if (cb(pos->map, data)) {
> -                       __maps__remove(maps, pos);
> -                       --RC_CHK_ACCESS(maps)->nr_maps;
> -               }
> +       maps_by_address = maps__maps_by_address(maps);
> +       for (unsigned int i = 0; i < maps__nr_maps(maps);) {
> +               if (cb(maps_by_address[i], data))
> +                       __maps__remove(maps, maps_by_address[i]);
> +               else
> +                       i++;
>         }
> -       if (maps__maps_by_name(maps) && start_nr_maps != maps__nr_maps(maps))
> -               __maps__free_maps_by_name(maps);
> -
>         up_write(maps__lock(maps));
>  }
>
> @@ -300,7 +491,7 @@ struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp)
>         /* Ensure map is loaded before using map->map_ip */
>         if (map != NULL && map__load(map) >= 0) {
>                 if (mapp != NULL)
> -                       *mapp = map;
> +                       *mapp = map; // TODO: map_put on else path when find returns a get.
>                 return map__find_symbol(map, map__map_ip(map, addr));
>         }
>
> @@ -348,7 +539,7 @@ int maps__find_ams(struct maps *maps, struct addr_map_symbol *ams)
>         if (ams->addr < map__start(ams->ms.map) || ams->addr >= map__end(ams->ms.map)) {
>                 if (maps == NULL)
>                         return -1;
> -               ams->ms.map = maps__find(maps, ams->addr);
> +               ams->ms.map = maps__find(maps, ams->addr);  // TODO: map_get
>                 if (ams->ms.map == NULL)
>                         return -1;
>         }
> @@ -393,24 +584,28 @@ size_t maps__fprintf(struct maps *maps, FILE *fp)
>   * Find first map where end > map->start.
>   * Same as find_vma() in kernel.
>   */
> -static struct rb_node *first_ending_after(struct maps *maps, const struct map *map)
> +static unsigned int first_ending_after(struct maps *maps, const struct map *map)
>  {
> -       struct rb_root *root;
> -       struct rb_node *next, *first;
> +       struct map **maps_by_address = maps__maps_by_address(maps);
> +       int low = 0, high = (int)maps__nr_maps(maps) - 1, first = high + 1;
> +
> +       assert(maps__maps_by_address_sorted(maps));
> +       if (low <= high && map__end(maps_by_address[0]) > map__start(map))
> +               return 0;
>
> -       root = maps__entries(maps);
> -       next = root->rb_node;
> -       first = NULL;
> -       while (next) {
> -               struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
> +       while (low <= high) {
> +               int mid = (low + high) / 2;
> +               struct map *pos = maps_by_address[mid];
>
> -               if (map__end(pos->map) > map__start(map)) {
> -                       first = next;
> -                       if (map__start(pos->map) <= map__start(map))
> +               if (map__end(pos) > map__start(map)) {
> +                       first = mid;
> +                       if (map__start(pos) <= map__start(map)) {
> +                               /* Entry overlaps map. */
>                                 break;
> -                       next = next->rb_left;
> +                       }
> +                       high = mid - 1;
>                 } else
> -                       next = next->rb_right;
> +                       low = mid + 1;
>         }
>         return first;
>  }
> @@ -419,171 +614,249 @@ static struct rb_node *first_ending_after(struct maps *maps, const struct map *m
>   * Adds new to maps, if new overlaps existing entries then the existing maps are
>   * adjusted or removed so that new fits without overlapping any entries.
>   */
> -int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
> +static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
>  {
> -
> -       struct rb_node *next;
> +       struct map **maps_by_address;
>         int err = 0;
>         FILE *fp = debug_file();
>
> -       down_write(maps__lock(maps));
> +sort_again:
> +       if (!maps__maps_by_address_sorted(maps))
> +               __maps__sort_by_address(maps);
>
> -       next = first_ending_after(maps, new);
> -       while (next && !err) {
> -               struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
> -               next = rb_next(&pos->rb_node);
> +       maps_by_address = maps__maps_by_address(maps);
> +       /*
> +        * Iterate through entries where the end of the existing entry is
> +        * greater-than the new map's start.
> +        */
> +       for (unsigned int i = first_ending_after(maps, new); i < maps__nr_maps(maps); ) {
> +               struct map *pos = maps_by_address[i];
> +               struct map *before = NULL, *after = NULL;
>
>                 /*
>                  * Stop if current map starts after map->end.
>                  * Maps are ordered by start: next will not overlap for sure.
>                  */
> -               if (map__start(pos->map) >= map__end(new))
> +               if (map__start(pos) >= map__end(new))
>                         break;
>
> -               if (verbose >= 2) {
> -
> -                       if (use_browser) {
> -                               pr_debug("overlapping maps in %s (disable tui for more info)\n",
> -                                        map__dso(new)->name);
> -                       } else {
> -                               pr_debug("overlapping maps:\n");
> -                               map__fprintf(new, fp);
> -                               map__fprintf(pos->map, fp);
> -                       }
> +               if (use_browser) {
> +                       pr_debug("overlapping maps in %s (disable tui for more info)\n",
> +                               map__dso(new)->name);
> +               } else if (verbose >= 2) {
> +                       pr_debug("overlapping maps:\n");
> +                       map__fprintf(new, fp);
> +                       map__fprintf(pos, fp);
>                 }
>
> -               rb_erase_init(&pos->rb_node, maps__entries(maps));
>                 /*
>                  * Now check if we need to create new maps for areas not
>                  * overlapped by the new map:
>                  */
> -               if (map__start(new) > map__start(pos->map)) {
> -                       struct map *before = map__clone(pos->map);
> +               if (map__start(new) > map__start(pos)) {
> +                       /* Map starts within existing map. Need to shorten the existing map. */
> +                       before = map__clone(pos);
>
>                         if (before == NULL) {
>                                 err = -ENOMEM;
> -                               goto put_map;
> +                               goto out_err;
>                         }
> -
>                         map__set_end(before, map__start(new));
> -                       err = __maps__insert(maps, before);
> -                       if (err) {
> -                               map__put(before);
> -                               goto put_map;
> -                       }
>
>                         if (verbose >= 2 && !use_browser)
>                                 map__fprintf(before, fp);
> -                       map__put(before);
>                 }
> -
> -               if (map__end(new) < map__end(pos->map)) {
> -                       struct map *after = map__clone(pos->map);
> +               if (map__end(new) < map__end(pos)) {
> +                       /* The new map isn't as long as the existing map. */
> +                       after = map__clone(pos);
>
>                         if (after == NULL) {
> +                               map__zput(before);
>                                 err = -ENOMEM;
> -                               goto put_map;
> +                               goto out_err;
>                         }
>
>                         map__set_start(after, map__end(new));
> -                       map__add_pgoff(after, map__end(new) - map__start(pos->map));
> -                       assert(map__map_ip(pos->map, map__end(new)) ==
> -                               map__map_ip(after, map__end(new)));
> -                       err = __maps__insert(maps, after);
> -                       if (err) {
> -                               map__put(after);
> -                               goto put_map;
> -                       }
> +                       map__add_pgoff(after, map__end(new) - map__start(pos));
> +                       assert(map__map_ip(pos, map__end(new)) ==
> +                              map__map_ip(after, map__end(new)));
> +
>                         if (verbose >= 2 && !use_browser)
>                                 map__fprintf(after, fp);
> -                       map__put(after);
>                 }
> -put_map:
> -               map__put(pos->map);
> -               free(pos);
> +               /*
> +                * If adding one entry, for `before` or `after`, we can replace
> +                * the existing entry. If both `before` and `after` are
> +                * necessary then an insert is needed. If the new entry
> +                * entirely overlaps the existing entry it can just be removed.
> +                */
> +               if (before) {
> +                       map__put(maps_by_address[i]);
> +                       maps_by_address[i] = before;
> +                       /* Maps are still ordered, go to next one. */
> +                       i++;
> +                       if (after) {
> +                               __maps__insert(maps, after);
> +                               map__put(after);
> +                               if (!maps__maps_by_address_sorted(maps)) {
> +                                       /*
> +                                        * Sorting broken so invariants don't
> +                                        * hold, sort and go again.
> +                                        */
> +                                       goto sort_again;
> +                               }
> +                               /*
> +                                * Maps are still ordered, skip after and go to
> +                                * next one (terminate loop).
> +                                */
> +                               i++;
> +                       }
> +               } else if (after) {
> +                       map__put(maps_by_address[i]);
> +                       maps_by_address[i] = after;
> +                       /* Maps are ordered, go to next one. */
> +                       i++;
> +               } else {
> +                       __maps__remove(maps, pos);
> +                       /*
> +                        * Maps are ordered but no need to increase `i` as the
> +                        * later maps were moved down.
> +                        */
> +               }
> +               check_invariants(maps);
>         }
>         /* Add the map. */
> -       err = __maps__insert(maps, new);
> -       up_write(maps__lock(maps));
> +       __maps__insert(maps, new);
> +out_err:
>         return err;
>  }
>
> -int maps__copy_from(struct maps *maps, struct maps *parent)
> +int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
>  {
>         int err;
> -       struct map_rb_node *rb_node;
>
> +       down_write(maps__lock(maps));
> +       err = __maps__fixup_overlap_and_insert(maps, new);
> +       up_write(maps__lock(maps));
> +       return err;
> +}
> +
> +int maps__copy_from(struct maps *dest, struct maps *parent)
> +{
> +       /* Note, if struct map were immutable then cloning could use ref counts. */
> +       struct map **parent_maps_by_address;
> +       int err = 0;
> +       unsigned int n;
> +
> +       down_write(maps__lock(dest));
>         down_read(maps__lock(parent));
>
> -       maps__for_each_entry(parent, rb_node) {
> -               struct map *new = map__clone(rb_node->map);
> +       parent_maps_by_address = maps__maps_by_address(parent);
> +       n = maps__nr_maps(parent);
> +       if (maps__empty(dest)) {
> +               /* No existing mappings so just copy from parent to avoid reallocs in insert. */
> +               unsigned int nr_maps_allocated = RC_CHK_ACCESS(parent)->nr_maps_allocated;
> +               struct map **dest_maps_by_address =
> +                       malloc(nr_maps_allocated * sizeof(struct map *));
> +               struct map **dest_maps_by_name = NULL;
>
> -               if (new == NULL) {
> +               if (!dest_maps_by_address)
>                         err = -ENOMEM;
> -                       goto out_unlock;
> +               else {
> +                       if (maps__maps_by_name(parent)) {
> +                               dest_maps_by_name =
> +                                       malloc(nr_maps_allocated * sizeof(struct map *));
> +                       }
> +
> +                       RC_CHK_ACCESS(dest)->maps_by_address = dest_maps_by_address;
> +                       RC_CHK_ACCESS(dest)->maps_by_name = dest_maps_by_name;
> +                       RC_CHK_ACCESS(dest)->nr_maps_allocated = nr_maps_allocated;
>                 }
>
> -               err = unwind__prepare_access(maps, new, NULL);
> -               if (err)
> -                       goto out_unlock;
> +               for (unsigned int i = 0; !err && i < n; i++) {
> +                       struct map *pos = parent_maps_by_address[i];
> +                       struct map *new = map__clone(pos);
>
> -               err = maps__insert(maps, new);
> -               if (err)
> -                       goto out_unlock;
> +                       if (!new)
> +                               err = -ENOMEM;
> +                       else {
> +                               err = unwind__prepare_access(dest, new, NULL);
> +                               if (!err) {
> +                                       dest_maps_by_address[i] = new;
> +                                       if (dest_maps_by_name)
> +                                               dest_maps_by_name[i] = map__get(new);
> +                                       RC_CHK_ACCESS(dest)->nr_maps = i + 1;
> +                               }
> +                       }
> +                       if (err)
> +                               map__put(new);
> +               }
> +               maps__set_maps_by_address_sorted(dest, maps__maps_by_address_sorted(parent));
> +               if (!err) {
> +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx =
> +                               RC_CHK_ACCESS(parent)->last_search_by_name_idx;
> +                       maps__set_maps_by_name_sorted(dest,
> +                                               dest_maps_by_name &&
> +                                               maps__maps_by_name_sorted(parent));
> +               } else {
> +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx = 0;
> +                       maps__set_maps_by_name_sorted(dest, false);
> +               }
> +       } else {
> +               /* Unexpected case: copying into a maps that already contains entries. */
> +               for (unsigned int i = 0; !err && i < n; i++) {
> +                       struct map *pos = parent_maps_by_address[i];
> +                       struct map *new = map__clone(pos);
>
> -               map__put(new);
> +                       if (!new)
> +                               err = -ENOMEM;
> +                       else {
> +                               err = unwind__prepare_access(dest, new, NULL);
> +                               if (!err)
> +                                       err = maps__insert(dest, new);

Shouldn't it be __maps__insert()?


> +                       }
> +                       map__put(new);
> +               }
>         }
> -
> -       err = 0;
> -out_unlock:
>         up_read(maps__lock(parent));
> +       up_write(maps__lock(dest));
>         return err;
>  }
>
> -struct map *maps__find(struct maps *maps, u64 ip)
> +static int map__addr_cmp(const void *key, const void *entry)
>  {
> -       struct rb_node *p;
> -       struct map_rb_node *m;
> -
> +       const u64 ip = *(const u64 *)key;
> +       const struct map *map = *(const struct map * const *)entry;
>
> -       down_read(maps__lock(maps));
> -
> -       p = maps__entries(maps)->rb_node;
> -       while (p != NULL) {
> -               m = rb_entry(p, struct map_rb_node, rb_node);
> -               if (ip < map__start(m->map))
> -                       p = p->rb_left;
> -               else if (ip >= map__end(m->map))
> -                       p = p->rb_right;
> -               else
> -                       goto out;
> -       }
> -
> -       m = NULL;
> -out:
> -       up_read(maps__lock(maps));
> -       return m ? m->map : NULL;
> +       if (ip < map__start(map))
> +               return -1;
> +       if (ip >= map__end(map))
> +               return 1;
> +       return 0;
>  }
>
> -static int map__strcmp(const void *a, const void *b)
> +struct map *maps__find(struct maps *maps, u64 ip)
>  {
> -       const struct map *map_a = *(const struct map **)a;
> -       const struct map *map_b = *(const struct map **)b;
> -       const struct dso *dso_a = map__dso(map_a);
> -       const struct dso *dso_b = map__dso(map_b);
> -       int ret = strcmp(dso_a->short_name, dso_b->short_name);
> -
> -       if (ret == 0 && map_a != map_b) {
> -               /*
> -                * Ensure distinct but name equal maps have an order in part to
> -                * aid reference counting.
> -                */
> -               ret = (int)map__start(map_a) - (int)map__start(map_b);
> -               if (ret == 0)
> -                       ret = (int)((intptr_t)map_a - (intptr_t)map_b);
> +       struct map *result = NULL;
> +       bool done = false;
> +
> +       /* See locking/sorting note. */
> +       while (!done) {
> +               down_read(maps__lock(maps));
> +               if (maps__maps_by_address_sorted(maps)) {
> +                       struct map **mapp =
> +                               bsearch(&ip, maps__maps_by_address(maps), maps__nr_maps(maps),
> +                                       sizeof(*mapp), map__addr_cmp);
> +
> +                       if (mapp)
> +                               result = *mapp; // map__get(*mapp);
> +                       done = true;
> +               }
> +               up_read(maps__lock(maps));
> +               if (!done)
> +                       maps__sort_by_address(maps);
>         }
> -
> -       return ret;
> +       return result;
>  }
>
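The address lookup above reduces to a libc bsearch() whose comparator tests the key against each entry's [start, end) interval rather than for plain equality. A standalone sketch with simplified types (the patch searches an array of struct map pointers, hence the extra dereference in map__addr_cmp(); this toy version searches the structs directly):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct range { unsigned long start, end; }; /* half-open: [start, end) */

/* Compare an address key against a range entry for bsearch(). */
static int addr_cmp(const void *key, const void *entry)
{
	unsigned long ip = *(const unsigned long *)key;
	const struct range *r = entry;

	if (ip < r->start)
		return -1;
	if (ip >= r->end)
		return 1;
	return 0; /* start <= ip < end: containing entry found */
}

/* Find the range containing ip in an array sorted by start, or NULL. */
static const struct range *range_find(const struct range *entries, size_t nr,
				      unsigned long ip)
{
	return bsearch(&ip, entries, nr, sizeof(*entries), addr_cmp);
}
```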
>  static int map__strcmp_name(const void *name, const void *b)
> @@ -593,126 +866,113 @@ static int map__strcmp_name(const void *name, const void *b)
>         return strcmp(name, dso->short_name);
>  }
>
> -void __maps__sort_by_name(struct maps *maps)
> -{
> -       qsort(maps__maps_by_name(maps), maps__nr_maps(maps), sizeof(struct map *), map__strcmp);
> -}
> -
> -static int map__groups__sort_by_name_from_rbtree(struct maps *maps)
> -{
> -       struct map_rb_node *rb_node;
> -       struct map **maps_by_name = realloc(maps__maps_by_name(maps),
> -                                           maps__nr_maps(maps) * sizeof(struct map *));
> -       int i = 0;
> -
> -       if (maps_by_name == NULL)
> -               return -1;
> -
> -       up_read(maps__lock(maps));
> -       down_write(maps__lock(maps));
> -
> -       RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
> -       RC_CHK_ACCESS(maps)->nr_maps_allocated = maps__nr_maps(maps);
> -
> -       maps__for_each_entry(maps, rb_node)
> -               maps_by_name[i++] = map__get(rb_node->map);
> -
> -       __maps__sort_by_name(maps);
> -
> -       up_write(maps__lock(maps));
> -       down_read(maps__lock(maps));
> -
> -       return 0;
> -}
> -
> -static struct map *__maps__find_by_name(struct maps *maps, const char *name)
> +struct map *maps__find_by_name(struct maps *maps, const char *name)
>  {
> -       struct map **mapp;
> +       struct map *result = NULL;
> +       bool done = false;
>
> -       if (maps__maps_by_name(maps) == NULL &&
> -           map__groups__sort_by_name_from_rbtree(maps))
> -               return NULL;
> +       /* See locking/sorting note. */
> +       while (!done) {
> +               unsigned int i;
>
> -       mapp = bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
> -                      sizeof(*mapp), map__strcmp_name);
> -       if (mapp)
> -               return *mapp;
> -       return NULL;
> -}
> +               down_read(maps__lock(maps));
>
> -struct map *maps__find_by_name(struct maps *maps, const char *name)
> -{
> -       struct map_rb_node *rb_node;
> -       struct map *map;
> -
> -       down_read(maps__lock(maps));
> +               /* First check last found entry. */
> +               i = RC_CHK_ACCESS(maps)->last_search_by_name_idx;
> +               if (i < maps__nr_maps(maps) && maps__maps_by_name(maps)) {
> +                       struct dso *dso = map__dso(maps__maps_by_name(maps)[i]);
>
> +                       if (dso && strcmp(dso->short_name, name) == 0) {
> +                               result = maps__maps_by_name(maps)[i]; // TODO: map__get
> +                               done = true;
> +                       }
> +               }
>
> -       if (RC_CHK_ACCESS(maps)->last_search_by_name) {
> -               const struct dso *dso = map__dso(RC_CHK_ACCESS(maps)->last_search_by_name);
> +               /* Second search sorted array. */
> +               if (!done && maps__maps_by_name_sorted(maps)) {
> +                       struct map **mapp =
> +                               bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
> +                                       sizeof(*mapp), map__strcmp_name);
>
> -               if (strcmp(dso->short_name, name) == 0) {
> -                       map = RC_CHK_ACCESS(maps)->last_search_by_name;
> -                       goto out_unlock;
> +                       if (mapp) {
> +                               result = *mapp; // TODO: map__get
> +                               i = mapp - maps__maps_by_name(maps);
> +                               RC_CHK_ACCESS(maps)->last_search_by_name_idx = i;
> +                       }
> +                       done = true;
>                 }
> -       }
> -       /*
> -        * If we have maps->maps_by_name, then the name isn't in the rbtree,
> -        * as maps->maps_by_name mirrors the rbtree when lookups by name are
> -        * made.
> -        */
> -       map = __maps__find_by_name(maps, name);
> -       if (map || maps__maps_by_name(maps) != NULL)
> -               goto out_unlock;
> -
> -       /* Fallback to traversing the rbtree... */
> -       maps__for_each_entry(maps, rb_node) {
> -               struct dso *dso;
> -
> -               map = rb_node->map;
> -               dso = map__dso(map);
> -               if (strcmp(dso->short_name, name) == 0) {
> -                       RC_CHK_ACCESS(maps)->last_search_by_name = map;
> -                       goto out_unlock;
> +               up_read(maps__lock(maps));
> +               if (!done) {
> +                       /* Sort and retry binary search. */
> +                       if (maps__sort_by_name(maps)) {
> +                               /*
> +                                * Memory allocation failed; do a linear search
> +                                * through the address-sorted maps.
> +                                */
> +                               struct map **maps_by_address;
> +                               unsigned int n;
> +
> +                               down_read(maps__lock(maps));
> +                               maps_by_address = maps__maps_by_address(maps);
> +                               n = maps__nr_maps(maps);
> +                               for (i = 0; i < n; i++) {
> +                                       struct map *pos = maps_by_address[i];
> +                                       struct dso *dso = map__dso(pos);
> +
> +                                       if (dso && strcmp(dso->short_name, name) == 0) {
> +                                               result = pos; // TODO: map__get
> +                                               break;
> +                                       }
> +                               }
> +                               up_read(maps__lock(maps));
> +                               done = true;
> +                       }
>                 }
>         }
> -       map = NULL;
> -
> -out_unlock:
> -       up_read(maps__lock(maps));
> -       return map;
> +       return result;
>  }
>
>  struct map *maps__find_next_entry(struct maps *maps, struct map *map)
>  {
> -       struct map_rb_node *rb_node = maps__find_node(maps, map);
> -       struct map_rb_node *next = map_rb_node__next(rb_node);
> +       unsigned int i;
> +       struct map *result = NULL;
>
> -       if (next)
> -               return next->map;
> +       down_read(maps__lock(maps));
> +       i = maps__by_address_index(maps, map);
> +       if (i < maps__nr_maps(maps))
> +               result = maps__maps_by_address(maps)[i]; // TODO: map__get
>
> -       return NULL;
> +       up_read(maps__lock(maps));
> +       return result;
>  }
>
>  void maps__fixup_end(struct maps *maps)
>  {
> -       struct map_rb_node *prev = NULL, *curr;
> +       struct map **maps_by_address;
> +       unsigned int n;
>
>         down_write(maps__lock(maps));
> +       if (!maps__maps_by_address_sorted(maps))
> +               __maps__sort_by_address(maps);
>
> -       maps__for_each_entry(maps, curr) {
> -               if (prev && (!map__end(prev->map) || map__end(prev->map) > map__start(curr->map)))
> -                       map__set_end(prev->map, map__start(curr->map));
> +       maps_by_address = maps__maps_by_address(maps);
> +       n = maps__nr_maps(maps);
> +       for (unsigned int i = 1; i < n; i++) {
> +               struct map *prev = maps_by_address[i - 1];
> +               struct map *curr = maps_by_address[i];
>
> -               prev = curr;
> +               if (!map__end(prev) || map__end(prev) > map__start(curr))
> +                       map__set_end(prev, map__start(curr));
>         }
>
>         /*
>          * We still haven't the actual symbols, so guess the
>          * last map final address.
>          */
> -       if (curr && !map__end(curr->map))
> -               map__set_end(curr->map, ~0ULL);
> +       if (n > 0 && !map__end(maps_by_address[n - 1]))
> +               map__set_end(maps_by_address[n - 1], ~0ULL);
> +
> +       RC_CHK_ACCESS(maps)->ends_broken = false;
>
>         up_write(maps__lock(maps));
>  }
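The rewritten maps__fixup_end() above is a single pass over the sorted array that clamps each entry's end to its successor's start and leaves the last entry open-ended. A standalone sketch with toy types (not the patch's API):

```c
#include <assert.h>
#include <stddef.h>

struct range { unsigned long start, end; }; /* end == 0 means "unknown" */

/*
 * Clamp missing or overlapping ends to the next entry's start; give the
 * last entry an open-ended range, as the patch does with ~0ULL.
 * Assumes entries[] is sorted by start.
 */
static void fixup_ends(struct range *entries, size_t nr)
{
	for (size_t i = 1; i < nr; i++) {
		if (!entries[i - 1].end ||
		    entries[i - 1].end > entries[i].start)
			entries[i - 1].end = entries[i].start;
	}
	if (nr > 0 && !entries[nr - 1].end)
		entries[nr - 1].end = ~0UL;
}
```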
> @@ -723,117 +983,92 @@ void maps__fixup_end(struct maps *maps)
>   */
>  int maps__merge_in(struct maps *kmaps, struct map *new_map)
>  {
> -       struct map_rb_node *rb_node;
> -       struct rb_node *first;
> -       bool overlaps;
> -       LIST_HEAD(merged);
> -       int err = 0;
> +       unsigned int first_after_, kmaps__nr_maps;
> +       struct map **kmaps_maps_by_address;
> +       struct map **merged_maps_by_address;
> +       unsigned int merged_nr_maps_allocated;
> +
> +       /* First try under a read lock. */
> +       while (true) {
> +               down_read(maps__lock(kmaps));
> +               if (maps__maps_by_address_sorted(kmaps))
> +                       break;
>
> -       down_read(maps__lock(kmaps));
> -       first = first_ending_after(kmaps, new_map);
> -       rb_node = first ? rb_entry(first, struct map_rb_node, rb_node) : NULL;
> -       overlaps = rb_node && map__start(rb_node->map) < map__end(new_map);
> -       up_read(maps__lock(kmaps));
> +               up_read(maps__lock(kmaps));
> +
> +               /* First after binary search requires sorted maps. Sort and try again. */
> +               maps__sort_by_address(kmaps);
> +       }
> +       first_after_ = first_ending_after(kmaps, new_map);
> +       kmaps_maps_by_address = maps__maps_by_address(kmaps);
>
> -       if (!overlaps)
> +       if (first_after_ >= maps__nr_maps(kmaps) ||
> +           map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
> +               /* No overlap so regular insert suffices. */
> +               up_read(maps__lock(kmaps));
>                 return maps__insert(kmaps, new_map);
> +       }
> +       up_read(maps__lock(kmaps));
>
> -       maps__for_each_entry(kmaps, rb_node) {
> -               struct map *old_map = rb_node->map;
> +       /* Plain insert with a read-lock failed, try again now with the write lock. */
> +       down_write(maps__lock(kmaps));
> +       if (!maps__maps_by_address_sorted(kmaps))
> +               __maps__sort_by_address(kmaps);
>
> -               /* no overload with this one */
> -               if (map__end(new_map) < map__start(old_map) ||
> -                   map__start(new_map) >= map__end(old_map))
> -                       continue;
> +       first_after_ = first_ending_after(kmaps, new_map);
> +       kmaps_maps_by_address = maps__maps_by_address(kmaps);
> +       kmaps__nr_maps = maps__nr_maps(kmaps);
>
> -               if (map__start(new_map) < map__start(old_map)) {
> -                       /*
> -                        * |new......
> -                        *       |old....
> -                        */
> -                       if (map__end(new_map) < map__end(old_map)) {
> -                               /*
> -                                * |new......|     -> |new..|
> -                                *       |old....| ->       |old....|
> -                                */
> -                               map__set_end(new_map, map__start(old_map));
> -                       } else {
> -                               /*
> -                                * |new.............| -> |new..|       |new..|
> -                                *       |old....|    ->       |old....|
> -                                */
> -                               struct map_list_node *m = map_list_node__new();
> +       if (first_after_ >= kmaps__nr_maps ||
> +           map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
> +               /* No overlap so regular insert suffices. */
> +               up_write(maps__lock(kmaps));
> +               return maps__insert(kmaps, new_map);

I think it could be:

        ret = __maps__insert(kmaps, new_map);
        up_write(maps__lock(kmaps));
        return ret;


> +       }
> +       /* Array to merge into, possibly 1 more for the sake of new_map. */
> +       merged_nr_maps_allocated = RC_CHK_ACCESS(kmaps)->nr_maps_allocated;
> +       if (kmaps__nr_maps + 1 == merged_nr_maps_allocated)
> +               merged_nr_maps_allocated++;
> +
> +       merged_maps_by_address = malloc(merged_nr_maps_allocated * sizeof(*merged_maps_by_address));
> +       if (!merged_maps_by_address) {
> +               up_write(maps__lock(kmaps));
> +               return -ENOMEM;
> +       }
> +       RC_CHK_ACCESS(kmaps)->maps_by_address = merged_maps_by_address;
> +       RC_CHK_ACCESS(kmaps)->maps_by_address_sorted = true;
> +       zfree(&RC_CHK_ACCESS(kmaps)->maps_by_name);
> +       RC_CHK_ACCESS(kmaps)->maps_by_name_sorted = false;
> +       RC_CHK_ACCESS(kmaps)->nr_maps_allocated = merged_nr_maps_allocated;

Why not use the accessor functions?

Thanks,
Namhyung

>
> -                               if (!m) {
> -                                       err = -ENOMEM;
> -                                       goto out;
> -                               }
> +       /* Copy entries before the new_map that can't overlap. */
> +       for (unsigned int i = 0; i < first_after_; i++)
> +               merged_maps_by_address[i] = map__get(kmaps_maps_by_address[i]);
>
> -                               m->map = map__clone(new_map);
> -                               if (!m->map) {
> -                                       free(m);
> -                                       err = -ENOMEM;
> -                                       goto out;
> -                               }
> +       RC_CHK_ACCESS(kmaps)->nr_maps = first_after_;
>
> -                               map__set_end(m->map, map__start(old_map));
> -                               list_add_tail(&m->node, &merged);
> -                               map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
> -                               map__set_start(new_map, map__end(old_map));
> -                       }
> -               } else {
> -                       /*
> -                        *      |new......
> -                        * |old....
> -                        */
> -                       if (map__end(new_map) < map__end(old_map)) {
> -                               /*
> -                                *      |new..|   -> x
> -                                * |old.........| -> |old.........|
> -                                */
> -                               map__put(new_map);
> -                               new_map = NULL;
> -                               break;
> -                       } else {
> -                               /*
> -                                *      |new......| ->         |new...|
> -                                * |old....|        -> |old....|
> -                                */
> -                               map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
> -                               map__set_start(new_map, map__end(old_map));
> -                       }
> -               }
> -       }
> +       /* Add the new map, it will be split when the later overlapping mappings are added. */
> +       __maps__insert(kmaps, new_map);
>
> -out:
> -       while (!list_empty(&merged)) {
> -               struct map_list_node *old_node;
> +       /* Insert mappings after new_map, splitting new_map in the process. */
> +       for (unsigned int i = first_after_; i < kmaps__nr_maps; i++)
> +               __maps__fixup_overlap_and_insert(kmaps, kmaps_maps_by_address[i]);
>
> -               old_node = list_entry(merged.next, struct map_list_node, node);
> -               list_del_init(&old_node->node);
> -               if (!err)
> -                       err = maps__insert(kmaps, old_node->map);
> -               map__put(old_node->map);
> -               free(old_node);
> -       }
> +       /* Release the references held by the old kmaps array. */
> +       for (unsigned int i = 0; i < kmaps__nr_maps; i++)
> +               map__zput(kmaps_maps_by_address[i]);
>
> -       if (new_map) {
> -               if (!err)
> -                       err = maps__insert(kmaps, new_map);
> -               map__put(new_map);
> -       }
> -       return err;
> +       free(kmaps_maps_by_address);
> +       up_write(maps__lock(kmaps));
> +       return 0;
>  }
>
>  void maps__load_first(struct maps *maps)
>  {
> -       struct map_rb_node *first;
> -
>         down_read(maps__lock(maps));
>
> -       first = maps__first(maps);
> -       if (first)
> -               map__load(first->map);
> +       if (maps__nr_maps(maps) > 0)
> +               map__load(maps__maps_by_address(maps)[0]);
>
>         up_read(maps__lock(maps));
>  }
> diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
> index d836d04c9402..df9dd5a0e3c0 100644
> --- a/tools/perf/util/maps.h
> +++ b/tools/perf/util/maps.h
> @@ -25,21 +25,56 @@ static inline struct map_list_node *map_list_node__new(void)
>         return malloc(sizeof(struct map_list_node));
>  }
>
> -struct map *maps__find(struct maps *maps, u64 addr);
> +/*
> + * Locking/sorting note:
> + *
> + * Sorting is done with the write lock, iteration and binary searching happens
> + * under the read lock requiring being sorted. There is a race between sorting
> + * releasing the write lock and acquiring the read lock for iteration/searching
> + * where another thread could insert and break the sorting of the maps. In
> + * practice inserting maps should be rare meaning that the race shouldn't lead
> + * to live lock. Removal of maps doesn't break being sorted.
> + */
>
>  DECLARE_RC_STRUCT(maps) {
> -       struct rb_root      entries;
>         struct rw_semaphore lock;
> -       struct machine   *machine;
> -       struct map       *last_search_by_name;
> +       /**
> +        * @maps_by_address: array of maps sorted by their starting address if
> +        * maps_by_address_sorted is true.
> +        */
> +       struct map       **maps_by_address;
> +       /**
> +        * @maps_by_name: optional array of maps sorted by their dso name if
> +        * maps_by_name_sorted is true.
> +        */
>         struct map       **maps_by_name;
> -       refcount_t       refcnt;
> -       unsigned int     nr_maps;
> -       unsigned int     nr_maps_allocated;
> +       struct machine   *machine;
>  #ifdef HAVE_LIBUNWIND_SUPPORT
> -       void                            *addr_space;
> +       void            *addr_space;
>         const struct unwind_libunwind_ops *unwind_libunwind_ops;
>  #endif
> +       refcount_t       refcnt;
> +       /**
> +        * @nr_maps: number of maps_by_address, and possibly maps_by_name,
> +        * entries that contain maps.
> +        */
> +       unsigned int     nr_maps;
> +       /**
> +        * @nr_maps_allocated: number of entries in maps_by_address and possibly
> +        * maps_by_name.
> +        */
> +       unsigned int     nr_maps_allocated;
> +       /**
> +        * @last_search_by_name_idx: cache of last found by name entry's index
> +        * as frequent searches for the same dso name are common.
> +        */
> +       unsigned int     last_search_by_name_idx;
> +       /** @maps_by_address_sorted: is maps_by_address sorted. */
> +       bool             maps_by_address_sorted;
> +       /** @maps_by_name_sorted: is maps_by_name sorted. */
> +       bool             maps_by_name_sorted;
> +       /** @ends_broken: does the map contain a map where end values are unset/unsorted? */
> +       bool             ends_broken;
>  };
>
>  #define KMAP_NAME_LEN 256
> @@ -102,6 +137,7 @@ size_t maps__fprintf(struct maps *maps, FILE *fp);
>  int maps__insert(struct maps *maps, struct map *map);
>  void maps__remove(struct maps *maps, struct map *map);
>
> +struct map *maps__find(struct maps *maps, u64 addr);
>  struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp);
>  struct symbol *maps__find_symbol_by_name(struct maps *maps, const char *name, struct map **mapp);
>
> @@ -117,8 +153,6 @@ struct map *maps__find_next_entry(struct maps *maps, struct map *map);
>
>  int maps__merge_in(struct maps *kmaps, struct map *new_map);
>
> -void __maps__sort_by_name(struct maps *maps);
> -
>  void maps__fixup_end(struct maps *maps);
>
>  void maps__load_first(struct maps *maps);
> --
> 2.43.0.472.g3155946c3a-goog
>
>


* Re: [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses
  2024-02-02  2:48   ` Namhyung Kim
@ 2024-02-02  4:20     ` Ian Rogers
  2024-02-02  4:21       ` Ian Rogers
  2024-02-06  0:37       ` Namhyung Kim
  0 siblings, 2 replies; 31+ messages in thread
From: Ian Rogers @ 2024-02-02  4:20 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Nick Terrell, Kan Liang, Andi Kleen, Kajol Jain, Athira Rajeev,
	Huacai Chen, Masami Hiramatsu, Vincent Whitchurch,
	Steinar H. Gunderson, Liam Howlett, Miguel Ojeda, Colin Ian King,
	Dmitrii Dolgov, Yang Jihong, Ming Wang, James Clark,
	K Prateek Nayak, Sean Christopherson, Leo Yan, Ravi Bangoria,
	German Gomez, Changbin Du, Paolo Bonzini, Li Dong, Sandipan Das,
	liuwenyu, linux-kernel, linux-perf-users, Guilherme Amadio

On Thu, Feb 1, 2024 at 6:48 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Tue, Jan 2, 2024 at 9:07 PM Ian Rogers <irogers@google.com> wrote:
> >
> > Maps is a collection of maps primarily sorted by the starting address
> > of the map. Prior to this change the maps were held in an rbtree
> > requiring 4 pointers per node. Prior to reference count checking, the
> > rbnode was embedded in the map so 3 pointers per node were
> > necessary. This change switches the rbtree to an array lazily sorted
> > by address, much like the existing array of maps sorted by name. 1 pointer is
> > needed per node, but to avoid excessive resizing the backing array may
> > be twice the number of used elements. Meaning the memory overhead is
> > roughly half that of the rbtree. For a perf record with
> > "--no-bpf-event -g -a" of true, the memory overhead of perf inject is
> > reduced from 3.3MB to 3MB, so 10% or 300KB is saved.
> >
> > Map inserts always happen at the end of the array. The code tracks
> > whether the insertion violates the sorting property. O(log n) rb-tree
> > complexity is switched to O(1).
> >
> > Removal slides the array, so O(log n) rb-tree complexity degrades to
> > O(n).
> >
> > A find may need to sort the array using qsort which is O(n*log n), but
> > in general the maps should be sorted and so average performance should
> > be O(log n) as with the rbtree.
> >
> > An rbtree node consumes a cache line, but with the array 4 nodes fit
> > on a cache line. Iteration is simplified to scanning an array rather
> > than pointer chasing.
> >
> > Overall, performance after the change is expected to be comparable to
> > before, but with half the memory consumed.
>
> I don't know how much performance impact it would have but I guess
> search/iteration would be the most frequent operation.  So I like the
> memory saving it can bring.
>
> >
> > To avoid a list and repeated logic around splitting maps,
> > maps__merge_in is rewritten in terms of
> > maps__fixup_overlap_and_insert. maps__merge_in splits the given mapping
> > inserting remaining gaps. maps__fixup_overlap_and_insert splits the
> > existing mappings, then adds the incoming mapping. By adding the new
> > mapping first, then re-inserting the existing mappings the splitting
> > behavior matches.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/tests/maps.c |    3 +
> >  tools/perf/util/map.c   |    1 +
> >  tools/perf/util/maps.c  | 1183 +++++++++++++++++++++++----------------
> >  tools/perf/util/maps.h  |   54 +-
> >  4 files changed, 757 insertions(+), 484 deletions(-)
> >
> > diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c
> > index bb3fbfe5a73e..b15417a0d617 100644
> > --- a/tools/perf/tests/maps.c
> > +++ b/tools/perf/tests/maps.c
> > @@ -156,6 +156,9 @@ static int test__maps__merge_in(struct test_suite *t __maybe_unused, int subtest
> >         TEST_ASSERT_VAL("merge check failed", !ret);
> >
> >         maps__zput(maps);
> > +       map__zput(map_kcore1);
> > +       map__zput(map_kcore2);
> > +       map__zput(map_kcore3);
> >         return TEST_OK;
> >  }
> >
> > diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> > index 54c67cb7ecef..cf5a15db3a1f 100644
> > --- a/tools/perf/util/map.c
> > +++ b/tools/perf/util/map.c
> > @@ -168,6 +168,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
> >                 if (dso == NULL)
> >                         goto out_delete;
> >
> > +               assert(!dso->kernel);
> >                 map__init(result, start, start + len, pgoff, dso);
> >
> >                 if (anon || no_dso) {
> > diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
> > index 0334fc18d9c6..6ee81160cdab 100644
> > --- a/tools/perf/util/maps.c
> > +++ b/tools/perf/util/maps.c
> > @@ -10,286 +10,477 @@
> >  #include "ui/ui.h"
> >  #include "unwind.h"
> >
> > -struct map_rb_node {
> > -       struct rb_node rb_node;
> > -       struct map *map;
> > -};
> > -
> > -#define maps__for_each_entry(maps, map) \
> > -       for (map = maps__first(maps); map; map = map_rb_node__next(map))
> > +static void check_invariants(const struct maps *maps __maybe_unused)
> > +{
> > +#ifndef NDEBUG
> > +       assert(RC_CHK_ACCESS(maps)->nr_maps <= RC_CHK_ACCESS(maps)->nr_maps_allocated);
> > +       for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
> > +               struct map *map = RC_CHK_ACCESS(maps)->maps_by_address[i];
> > +
> > +               /* Check map is well-formed. */
> > +               assert(map__end(map) == 0 || map__start(map) <= map__end(map));
> > +               /* Expect at least 1 reference count. */
> > +               assert(refcount_read(map__refcnt(map)) > 0);
> > +
> > +               if (map__dso(map) && map__dso(map)->kernel)
> > +                       assert(RC_CHK_EQUAL(map__kmap(map)->kmaps, maps));
> > +
> > +               if (i > 0) {
> > +                       struct map *prev = RC_CHK_ACCESS(maps)->maps_by_address[i - 1];
> > +
> > +                       /* If addresses are sorted... */
> > +                       if (RC_CHK_ACCESS(maps)->maps_by_address_sorted) {
> > +                               /* Maps should be in start address order. */
> > +                               assert(map__start(prev) <= map__start(map));
> > +                               /*
> > +                                * If the ends of maps aren't broken (during
> > +                                * construction) then they should be ordered
> > +                                * too.
> > +                                */
> > +                               if (!RC_CHK_ACCESS(maps)->ends_broken) {
> > +                                       assert(map__end(prev) <= map__end(map));
> > +                                       assert(map__end(prev) <= map__start(map) ||
> > +                                              map__start(prev) == map__start(map));
> > +                               }
> > +                       }
> > +               }
> > +       }
> > +       if (RC_CHK_ACCESS(maps)->maps_by_name) {
> > +               for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
> > +                       struct map *map = RC_CHK_ACCESS(maps)->maps_by_name[i];
> >
> > -#define maps__for_each_entry_safe(maps, map, next) \
> > -       for (map = maps__first(maps), next = map_rb_node__next(map); map; \
> > -            map = next, next = map_rb_node__next(map))
> > +                       /*
> > +                        * Maps by name maps should be in maps_by_address, so
> > +                        * the reference count should be higher.
> > +                        */
> > +                       assert(refcount_read(map__refcnt(map)) > 1);
> > +               }
> > +       }
> > +#endif
> > +}
> >
> > -static struct rb_root *maps__entries(struct maps *maps)
> > +static struct map **maps__maps_by_address(const struct maps *maps)
> >  {
> > -       return &RC_CHK_ACCESS(maps)->entries;
> > +       return RC_CHK_ACCESS(maps)->maps_by_address;
> >  }
> >
> > -static struct rw_semaphore *maps__lock(struct maps *maps)
> > +static void maps__set_maps_by_address(struct maps *maps, struct map **new)
> >  {
> > -       return &RC_CHK_ACCESS(maps)->lock;
> > +       RC_CHK_ACCESS(maps)->maps_by_address = new;
> > +
> >  }
> >
> > -static struct map **maps__maps_by_name(struct maps *maps)
> > +/* Not in the header, to aid reference counting. */
> > +static struct map **maps__maps_by_name(const struct maps *maps)
> >  {
> >         return RC_CHK_ACCESS(maps)->maps_by_name;
> > +
> >  }
> >
> > -static struct map_rb_node *maps__first(struct maps *maps)
> > +static void maps__set_maps_by_name(struct maps *maps, struct map **new)
> >  {
> > -       struct rb_node *first = rb_first(maps__entries(maps));
> > +       RC_CHK_ACCESS(maps)->maps_by_name = new;
> >
> > -       if (first)
> > -               return rb_entry(first, struct map_rb_node, rb_node);
> > -       return NULL;
> >  }
> >
> > -static struct map_rb_node *map_rb_node__next(struct map_rb_node *node)
> > +static bool maps__maps_by_address_sorted(const struct maps *maps)
> >  {
> > -       struct rb_node *next;
> > -
> > -       if (!node)
> > -               return NULL;
> > -
> > -       next = rb_next(&node->rb_node);
> > +       return RC_CHK_ACCESS(maps)->maps_by_address_sorted;
> > +}
> >
> > -       if (!next)
> > -               return NULL;
> > +static void maps__set_maps_by_address_sorted(struct maps *maps, bool value)
> > +{
> > +       RC_CHK_ACCESS(maps)->maps_by_address_sorted = value;
> > +}
> >
> > -       return rb_entry(next, struct map_rb_node, rb_node);
> > +static bool maps__maps_by_name_sorted(const struct maps *maps)
> > +{
> > +       return RC_CHK_ACCESS(maps)->maps_by_name_sorted;
> >  }
> >
> > -static struct map_rb_node *maps__find_node(struct maps *maps, struct map *map)
> > +static void maps__set_maps_by_name_sorted(struct maps *maps, bool value)
> >  {
> > -       struct map_rb_node *rb_node;
> > +       RC_CHK_ACCESS(maps)->maps_by_name_sorted = value;
> > +}
> >
> > -       maps__for_each_entry(maps, rb_node) {
> > -               if (rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map))
> > -                       return rb_node;
> > -       }
> > -       return NULL;
> > +static struct rw_semaphore *maps__lock(struct maps *maps)
> > +{
> > +       /*
> > +        * When the lock is acquired or released the maps invariants should
> > +        * hold.
> > +        */
> > +       check_invariants(maps);
> > +       return &RC_CHK_ACCESS(maps)->lock;
> >  }
> >
> >  static void maps__init(struct maps *maps, struct machine *machine)
> >  {
> > -       refcount_set(maps__refcnt(maps), 1);
> >         init_rwsem(maps__lock(maps));
> > -       RC_CHK_ACCESS(maps)->entries = RB_ROOT;
> > +       RC_CHK_ACCESS(maps)->maps_by_address = NULL;
> > +       RC_CHK_ACCESS(maps)->maps_by_name = NULL;
> >         RC_CHK_ACCESS(maps)->machine = machine;
> > -       RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
> > +#ifdef HAVE_LIBUNWIND_SUPPORT
> > +       RC_CHK_ACCESS(maps)->addr_space = NULL;
> > +       RC_CHK_ACCESS(maps)->unwind_libunwind_ops = NULL;
> > +#endif
> > +       refcount_set(maps__refcnt(maps), 1);
> >         RC_CHK_ACCESS(maps)->nr_maps = 0;
> > -       RC_CHK_ACCESS(maps)->maps_by_name = NULL;
> > +       RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
> > +       RC_CHK_ACCESS(maps)->last_search_by_name_idx = 0;
> > +       RC_CHK_ACCESS(maps)->maps_by_address_sorted = true;
> > +       RC_CHK_ACCESS(maps)->maps_by_name_sorted = false;
> >  }
> >
> > -static void __maps__free_maps_by_name(struct maps *maps)
> > +static void maps__exit(struct maps *maps)
> >  {
> > -       /*
> > -        * Free everything to try to do it from the rbtree in the next search
> > -        */
> > -       for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
> > -               map__put(maps__maps_by_name(maps)[i]);
> > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > +       struct map **maps_by_name = maps__maps_by_name(maps);
> >
> > -       zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
> > -       RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
> > +       for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> > +               map__zput(maps_by_address[i]);
> > +               if (maps_by_name)
> > +                       map__zput(maps_by_name[i]);
> > +       }
> > +       zfree(&maps_by_address);
> > +       zfree(&maps_by_name);
> > +       unwind__finish_access(maps);
> >  }
> >
> > -static int __maps__insert(struct maps *maps, struct map *map)
> > +struct maps *maps__new(struct machine *machine)
> >  {
> > -       struct rb_node **p = &maps__entries(maps)->rb_node;
> > -       struct rb_node *parent = NULL;
> > -       const u64 ip = map__start(map);
> > -       struct map_rb_node *m, *new_rb_node;
> > -
> > -       new_rb_node = malloc(sizeof(*new_rb_node));
> > -       if (!new_rb_node)
> > -               return -ENOMEM;
> > -
> > -       RB_CLEAR_NODE(&new_rb_node->rb_node);
> > -       new_rb_node->map = map__get(map);
> > +       struct maps *result;
> > +       RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
> >
> > -       while (*p != NULL) {
> > -               parent = *p;
> > -               m = rb_entry(parent, struct map_rb_node, rb_node);
> > -               if (ip < map__start(m->map))
> > -                       p = &(*p)->rb_left;
> > -               else
> > -                       p = &(*p)->rb_right;
> > -       }
> > +       if (ADD_RC_CHK(result, maps))
> > +               maps__init(result, machine);
> >
> > -       rb_link_node(&new_rb_node->rb_node, parent, p);
> > -       rb_insert_color(&new_rb_node->rb_node, maps__entries(maps));
> > -       return 0;
> > +       return result;
> >  }
> >
> > -int maps__insert(struct maps *maps, struct map *map)
> > +static void maps__delete(struct maps *maps)
> >  {
> > -       int err;
> > -       const struct dso *dso = map__dso(map);
> > -
> > -       down_write(maps__lock(maps));
> > -       err = __maps__insert(maps, map);
> > -       if (err)
> > -               goto out;
> > +       maps__exit(maps);
> > +       RC_CHK_FREE(maps);
> > +}
> >
> > -       ++RC_CHK_ACCESS(maps)->nr_maps;
> > +struct maps *maps__get(struct maps *maps)
> > +{
> > +       struct maps *result;
> >
> > -       if (dso && dso->kernel) {
> > -               struct kmap *kmap = map__kmap(map);
> > +       if (RC_CHK_GET(result, maps))
> > +               refcount_inc(maps__refcnt(maps));
> >
> > -               if (kmap)
> > -                       kmap->kmaps = maps;
> > -               else
> > -                       pr_err("Internal error: kernel dso with non kernel map\n");
> > -       }
> > +       return result;
> > +}
> >
> > +void maps__put(struct maps *maps)
> > +{
> > +       if (maps && refcount_dec_and_test(maps__refcnt(maps)))
> > +               maps__delete(maps);
> > +       else
> > +               RC_CHK_PUT(maps);
> > +}
> >
> > +static void __maps__free_maps_by_name(struct maps *maps)
> > +{
> >         /*
> > -        * If we already performed some search by name, then we need to add the just
> > -        * inserted map and resort.
> > +        * Free everything to try to do it from the rbtree in the next search
> >          */
> > -       if (maps__maps_by_name(maps)) {
> > -               if (maps__nr_maps(maps) > RC_CHK_ACCESS(maps)->nr_maps_allocated) {
> > -                       int nr_allocate = maps__nr_maps(maps) * 2;
> > -                       struct map **maps_by_name = realloc(maps__maps_by_name(maps),
> > -                                                           nr_allocate * sizeof(map));
> > +       for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
> > +               map__put(maps__maps_by_name(maps)[i]);
> >
> > -                       if (maps_by_name == NULL) {
> > -                               __maps__free_maps_by_name(maps);
> > -                               err = -ENOMEM;
> > -                               goto out;
> > -                       }
> > +       zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
> > +}
> >
> > -                       RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
> > -                       RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
> > +static int map__start_cmp(const void *a, const void *b)
> > +{
> > +       const struct map *map_a = *(const struct map * const *)a;
> > +       const struct map *map_b = *(const struct map * const *)b;
> > +       u64 map_a_start = map__start(map_a);
> > +       u64 map_b_start = map__start(map_b);
> > +
> > +       if (map_a_start == map_b_start) {
> > +               u64 map_a_end = map__end(map_a);
> > +               u64 map_b_end = map__end(map_b);
> > +
> > +               if  (map_a_end == map_b_end) {
> > +                       /* Ensure maps with the same addresses have a fixed order. */
> > +                       if (RC_CHK_ACCESS(map_a) == RC_CHK_ACCESS(map_b))
> > +                               return 0;
> > +                       return (intptr_t)RC_CHK_ACCESS(map_a) > (intptr_t)RC_CHK_ACCESS(map_b)
> > +                               ? 1 : -1;
> >                 }
> > -               maps__maps_by_name(maps)[maps__nr_maps(maps) - 1] = map__get(map);
> > -               __maps__sort_by_name(maps);
> > +               return map_a_end > map_b_end ? 1 : -1;
> >         }
> > - out:
> > -       up_write(maps__lock(maps));
> > -       return err;
> > +       return map_a_start > map_b_start ? 1 : -1;
> >  }
> >
> > -static void __maps__remove(struct maps *maps, struct map_rb_node *rb_node)
> > +static void __maps__sort_by_address(struct maps *maps)
> >  {
> > -       rb_erase_init(&rb_node->rb_node, maps__entries(maps));
> > -       map__put(rb_node->map);
> > -       free(rb_node);
> > +       if (maps__maps_by_address_sorted(maps))
> > +               return;
> > +
> > +       qsort(maps__maps_by_address(maps),
> > +               maps__nr_maps(maps),
> > +               sizeof(struct map *),
> > +               map__start_cmp);
> > +       maps__set_maps_by_address_sorted(maps, true);
> >  }
> >
> > -void maps__remove(struct maps *maps, struct map *map)
> > +static void maps__sort_by_address(struct maps *maps)
> >  {
> > -       struct map_rb_node *rb_node;
> > -
> >         down_write(maps__lock(maps));
> > -       if (RC_CHK_ACCESS(maps)->last_search_by_name == map)
> > -               RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
> > -
> > -       rb_node = maps__find_node(maps, map);
> > -       assert(rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map));
> > -       __maps__remove(maps, rb_node);
> > -       if (maps__maps_by_name(maps))
> > -               __maps__free_maps_by_name(maps);
> > -       --RC_CHK_ACCESS(maps)->nr_maps;
> > +       __maps__sort_by_address(maps);
> >         up_write(maps__lock(maps));
> >  }
> >
> > -static void __maps__purge(struct maps *maps)
> > +static int map__strcmp(const void *a, const void *b)
> >  {
> > -       struct map_rb_node *pos, *next;
> > -
> > -       if (maps__maps_by_name(maps))
> > -               __maps__free_maps_by_name(maps);
> > +       const struct map *map_a = *(const struct map * const *)a;
> > +       const struct map *map_b = *(const struct map * const *)b;
> > +       const struct dso *dso_a = map__dso(map_a);
> > +       const struct dso *dso_b = map__dso(map_b);
> > +       int ret = strcmp(dso_a->short_name, dso_b->short_name);
> >
> > -       maps__for_each_entry_safe(maps, pos, next) {
> > -               rb_erase_init(&pos->rb_node,  maps__entries(maps));
> > -               map__put(pos->map);
> > -               free(pos);
> > +       if (ret == 0 && RC_CHK_ACCESS(map_a) != RC_CHK_ACCESS(map_b)) {
> > +               /* Ensure distinct but name equal maps have an order. */
> > +               return map__start_cmp(a, b);
> >         }
> > +       return ret;
> >  }
> >
> > -static void maps__exit(struct maps *maps)
> > +static int maps__sort_by_name(struct maps *maps)
> >  {
> > +       int err = 0;
> >         down_write(maps__lock(maps));
> > -       __maps__purge(maps);
> > +       if (!maps__maps_by_name_sorted(maps)) {
> > +               struct map **maps_by_name = maps__maps_by_name(maps);
> > +
> > +               if (!maps_by_name) {
> > +                       maps_by_name = malloc(RC_CHK_ACCESS(maps)->nr_maps_allocated *
> > +                                       sizeof(*maps_by_name));
> > +                       if (!maps_by_name)
> > +                               err = -ENOMEM;
> > +                       else {
> > +                               struct map **maps_by_address = maps__maps_by_address(maps);
> > +                               unsigned int n = maps__nr_maps(maps);
> > +
> > +                               maps__set_maps_by_name(maps, maps_by_name);
> > +                               for (unsigned int i = 0; i < n; i++)
> > +                                       maps_by_name[i] = map__get(maps_by_address[i]);
> > +                       }
> > +               }
> > +               if (!err) {
> > +                       qsort(maps_by_name,
> > +                               maps__nr_maps(maps),
> > +                               sizeof(struct map *),
> > +                               map__strcmp);
> > +                       maps__set_maps_by_name_sorted(maps, true);
> > +               }
> > +       }
> >         up_write(maps__lock(maps));
> > +       return err;
> >  }
> >
> > -bool maps__empty(struct maps *maps)
> > +static unsigned int maps__by_address_index(const struct maps *maps, const struct map *map)
> >  {
> > -       return !maps__first(maps);
> > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > +
> > +       if (maps__maps_by_address_sorted(maps)) {
> > +               struct map **mapp =
> > +                       bsearch(&map, maps__maps_by_address(maps), maps__nr_maps(maps),
> > +                               sizeof(*mapp), map__start_cmp);
> > +
> > +               if (mapp)
> > +                       return mapp - maps_by_address;
> > +       } else {
> > +               for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> > +                       if (RC_CHK_ACCESS(maps_by_address[i]) == RC_CHK_ACCESS(map))
> > +                               return i;
> > +               }
> > +       }
> > +       pr_err("Map missing from maps");
> > +       return -1;
> >  }
> >
> > -struct maps *maps__new(struct machine *machine)
> > +static unsigned int maps__by_name_index(const struct maps *maps, const struct map *map)
> >  {
> > -       struct maps *result;
> > -       RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
> > +       struct map **maps_by_name = maps__maps_by_name(maps);
> > +
> > +       if (maps__maps_by_name_sorted(maps)) {
> > +               struct map **mapp =
> > +                       bsearch(&map, maps_by_name, maps__nr_maps(maps),
> > +                               sizeof(*mapp), map__strcmp);
> > +
> > +               if (mapp)
> > +                       return mapp - maps_by_name;
> > +       } else {
> > +               for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> > +                       if (RC_CHK_ACCESS(maps_by_name[i]) == RC_CHK_ACCESS(map))
> > +                               return i;
> > +               }
> > +       }
> > +       pr_err("Map missing from maps");
> > +       return -1;
> > +}
> >
> > -       if (ADD_RC_CHK(result, maps))
> > -               maps__init(result, machine);
> > +static int __maps__insert(struct maps *maps, struct map *new)
> > +{
> > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > +       struct map **maps_by_name = maps__maps_by_name(maps);
> > +       const struct dso *dso = map__dso(new);
> > +       unsigned int nr_maps = maps__nr_maps(maps);
> > +       unsigned int nr_allocate = RC_CHK_ACCESS(maps)->nr_maps_allocated;
> > +
> > +       if (nr_maps + 1 > nr_allocate) {
> > +               nr_allocate = !nr_allocate ? 32 : nr_allocate * 2;
> > +
> > +               maps_by_address = realloc(maps_by_address, nr_allocate * sizeof(new));
> > +               if (!maps_by_address)
> > +                       return -ENOMEM;
> > +
> > +               maps__set_maps_by_address(maps, maps_by_address);
> > +               if (maps_by_name) {
> > +                       maps_by_name = realloc(maps_by_name, nr_allocate * sizeof(new));
> > +                       if (!maps_by_name) {
> > +                               /*
> > +                                * If by name fails, just disable by name and it will
> > +                                * recompute next time it is required.
> > +                                */
> > +                               __maps__free_maps_by_name(maps);
> > +                       }
> > +                       maps__set_maps_by_name(maps, maps_by_name);
> > +               }
> > +               RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
> > +       }
> > +       /* Insert the value at the end. */
> > +       maps_by_address[nr_maps] = map__get(new);
> > +       if (maps_by_name)
> > +               maps_by_name[nr_maps] = map__get(new);
> >
> > -       return result;
> > +       nr_maps++;
> > +       RC_CHK_ACCESS(maps)->nr_maps = nr_maps;
> > +
> > +       /*
> > +        * Recompute if things are sorted. If things are inserted in a sorted
> > +        * manner, for example by processing /proc/pid/maps, then no
> > +        * sorting/resorting will be necessary.
> > +        */
> > +       if (nr_maps == 1) {
> > +               /* If there's just 1 entry then maps are sorted. */
> > +               maps__set_maps_by_address_sorted(maps, true);
> > +               maps__set_maps_by_name_sorted(maps, maps_by_name != NULL);
> > +       } else {
> > +               /* Sorted if maps were already sorted and this map starts after the last one. */
> > +               maps__set_maps_by_address_sorted(maps,
> > +                       maps__maps_by_address_sorted(maps) &&
> > +                       map__end(maps_by_address[nr_maps - 2]) <= map__start(new));
> > +               maps__set_maps_by_name_sorted(maps, false);
> > +       }
> > +       if (map__end(new) < map__start(new))
> > +               RC_CHK_ACCESS(maps)->ends_broken = true;
> > +       if (dso && dso->kernel) {
> > +               struct kmap *kmap = map__kmap(new);
> > +
> > +               if (kmap)
> > +                       kmap->kmaps = maps;
> > +               else
> > +                       pr_err("Internal error: kernel dso with non kernel map\n");
> > +       }
> > +       check_invariants(maps);
>
> Probably not needed as it's checked when you get the lock below.

Ack. Will remove in v3.

>
> > +       return 0;
> >  }
> >
> > -static void maps__delete(struct maps *maps)
> > +int maps__insert(struct maps *maps, struct map *map)
> >  {
> > -       maps__exit(maps);
> > -       unwind__finish_access(maps);
> > -       RC_CHK_FREE(maps);
> > +       int ret;
> > +
> > +       down_write(maps__lock(maps));
> > +       ret = __maps__insert(maps, map);
> > +       up_write(maps__lock(maps));
> > +       return ret;
> >  }
> >
> > -struct maps *maps__get(struct maps *maps)
> > +static void __maps__remove(struct maps *maps, struct map *map)
> >  {
> > -       struct maps *result;
> > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > +       struct map **maps_by_name = maps__maps_by_name(maps);
> > +       unsigned int nr_maps = maps__nr_maps(maps);
> > +       unsigned int address_idx;
> > +
> > +       /* Slide later mappings over the one to remove */
> > +       address_idx = maps__by_address_index(maps, map);
> > +       map__put(maps_by_address[address_idx]);
> > +       memmove(&maps_by_address[address_idx],
> > +               &maps_by_address[address_idx + 1],
> > +               (nr_maps - address_idx - 1) * sizeof(*maps_by_address));
> > +
> > +       if (maps_by_name) {
> > +               unsigned int name_idx = maps__by_name_index(maps, map);
> > +
> > +               map__put(maps_by_name[name_idx]);
> > +               memmove(&maps_by_name[name_idx],
> > +                       &maps_by_name[name_idx + 1],
> > +                       (nr_maps - name_idx - 1) *  sizeof(*maps_by_name));
> > +       }
> >
> > -       if (RC_CHK_GET(result, maps))
> > -               refcount_inc(maps__refcnt(maps));
> > +       --RC_CHK_ACCESS(maps)->nr_maps;
> > +       check_invariants(maps);
>
> Ditto.

Ack.

> > +}
> >
> > -       return result;
> > +void maps__remove(struct maps *maps, struct map *map)
> > +{
> > +       down_write(maps__lock(maps));
> > +       __maps__remove(maps, map);
> > +       up_write(maps__lock(maps));
> >  }
> >
> > -void maps__put(struct maps *maps)
> > +bool maps__empty(struct maps *maps)
> >  {
> > -       if (maps && refcount_dec_and_test(maps__refcnt(maps)))
> > -               maps__delete(maps);
> > -       else
> > -               RC_CHK_PUT(maps);
> > +       return maps__nr_maps(maps) == 0;
> >  }
> >
> >  int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data), void *data)
> >  {
> > -       struct map_rb_node *pos;
> > +       bool done = false;
> >         int ret = 0;
> >
> > -       down_read(maps__lock(maps));
> > -       maps__for_each_entry(maps, pos) {
> > -               ret = cb(pos->map, data);
> > -               if (ret)
> > -                       break;
> > +       /* See locking/sorting note. */
> > +       while (!done) {
> > +               down_read(maps__lock(maps));
> > +               if (maps__maps_by_address_sorted(maps)) {
> > +                       struct map **maps_by_address = maps__maps_by_address(maps);
> > +                       unsigned int n = maps__nr_maps(maps);
> > +
> > +                       for (unsigned int i = 0; i < n; i++) {
> > +                               struct map *map = maps_by_address[i];
> > +
> > +                               ret = cb(map, data);
> > +                               if (ret)
> > +                                       break;
> > +                       }
> > +                       done = true;
> > +               }
> > +               up_read(maps__lock(maps));
> > +               if (!done)
> > +                       maps__sort_by_address(maps);
> >         }
> > -       up_read(maps__lock(maps));
> >         return ret;
> >  }
> >
> >  void maps__remove_maps(struct maps *maps, bool (*cb)(struct map *map, void *data), void *data)
> >  {
> > -       struct map_rb_node *pos, *next;
> > -       unsigned int start_nr_maps;
> > +       struct map **maps_by_address;
> >
> >         down_write(maps__lock(maps));
> >
> > -       start_nr_maps = maps__nr_maps(maps);
> > -       maps__for_each_entry_safe(maps, pos, next)      {
> > -               if (cb(pos->map, data)) {
> > -                       __maps__remove(maps, pos);
> > -                       --RC_CHK_ACCESS(maps)->nr_maps;
> > -               }
> > +       maps_by_address = maps__maps_by_address(maps);
> > +       for (unsigned int i = 0; i < maps__nr_maps(maps);) {
> > +               if (cb(maps_by_address[i], data))
> > +                       __maps__remove(maps, maps_by_address[i]);
> > +               else
> > +                       i++;
> >         }
> > -       if (maps__maps_by_name(maps) && start_nr_maps != maps__nr_maps(maps))
> > -               __maps__free_maps_by_name(maps);
> > -
> >         up_write(maps__lock(maps));
> >  }
> >
> > @@ -300,7 +491,7 @@ struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp)
> >         /* Ensure map is loaded before using map->map_ip */
> >         if (map != NULL && map__load(map) >= 0) {
> >                 if (mapp != NULL)
> > -                       *mapp = map;
> > +                       *mapp = map; // TODO: map_put on else path when find returns a get.
> >                 return map__find_symbol(map, map__map_ip(map, addr));
> >         }
> >
> > @@ -348,7 +539,7 @@ int maps__find_ams(struct maps *maps, struct addr_map_symbol *ams)
> >         if (ams->addr < map__start(ams->ms.map) || ams->addr >= map__end(ams->ms.map)) {
> >                 if (maps == NULL)
> >                         return -1;
> > -               ams->ms.map = maps__find(maps, ams->addr);
> > +               ams->ms.map = maps__find(maps, ams->addr);  // TODO: map_get
> >                 if (ams->ms.map == NULL)
> >                         return -1;
> >         }
> > @@ -393,24 +584,28 @@ size_t maps__fprintf(struct maps *maps, FILE *fp)
> >   * Find first map where end > map->start.
> >   * Same as find_vma() in kernel.
> >   */
> > -static struct rb_node *first_ending_after(struct maps *maps, const struct map *map)
> > +static unsigned int first_ending_after(struct maps *maps, const struct map *map)
> >  {
> > -       struct rb_root *root;
> > -       struct rb_node *next, *first;
> > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > +       int low = 0, high = (int)maps__nr_maps(maps) - 1, first = high + 1;
> > +
> > +       assert(maps__maps_by_address_sorted(maps));
> > +       if (low <= high && map__end(maps_by_address[0]) > map__start(map))
> > +               return 0;
> >
> > -       root = maps__entries(maps);
> > -       next = root->rb_node;
> > -       first = NULL;
> > -       while (next) {
> > -               struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
> > +       while (low <= high) {
> > +               int mid = (low + high) / 2;
> > +               struct map *pos = maps_by_address[mid];
> >
> > -               if (map__end(pos->map) > map__start(map)) {
> > -                       first = next;
> > -                       if (map__start(pos->map) <= map__start(map))
> > +               if (map__end(pos) > map__start(map)) {
> > +                       first = mid;
> > +                       if (map__start(pos) <= map__start(map)) {
> > +                               /* Entry overlaps map. */
> >                                 break;
> > -                       next = next->rb_left;
> > +                       }
> > +                       high = mid - 1;
> >                 } else
> > -                       next = next->rb_right;
> > +                       low = mid + 1;
> >         }
> >         return first;
> >  }
> > @@ -419,171 +614,249 @@ static struct rb_node *first_ending_after(struct maps *maps, const struct map *m
> >   * Adds new to maps, if new overlaps existing entries then the existing maps are
> >   * adjusted or removed so that new fits without overlapping any entries.
> >   */
> > -int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
> > +static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
> >  {
> > -
> > -       struct rb_node *next;
> > +       struct map **maps_by_address;
> >         int err = 0;
> >         FILE *fp = debug_file();
> >
> > -       down_write(maps__lock(maps));
> > +sort_again:
> > +       if (!maps__maps_by_address_sorted(maps))
> > +               __maps__sort_by_address(maps);
> >
> > -       next = first_ending_after(maps, new);
> > -       while (next && !err) {
> > -               struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
> > -               next = rb_next(&pos->rb_node);
> > +       maps_by_address = maps__maps_by_address(maps);
> > +       /*
> > +        * Iterate through entries where the end of the existing entry is
> > +        * greater than the new map's start.
> > +        */
> > +       for (unsigned int i = first_ending_after(maps, new); i < maps__nr_maps(maps); ) {
> > +               struct map *pos = maps_by_address[i];
> > +               struct map *before = NULL, *after = NULL;
> >
> >                 /*
> >                  * Stop if current map starts after map->end.
> >                  * Maps are ordered by start: next will not overlap for sure.
> >                  */
> > -               if (map__start(pos->map) >= map__end(new))
> > +               if (map__start(pos) >= map__end(new))
> >                         break;
> >
> > -               if (verbose >= 2) {
> > -
> > -                       if (use_browser) {
> > -                               pr_debug("overlapping maps in %s (disable tui for more info)\n",
> > -                                        map__dso(new)->name);
> > -                       } else {
> > -                               pr_debug("overlapping maps:\n");
> > -                               map__fprintf(new, fp);
> > -                               map__fprintf(pos->map, fp);
> > -                       }
> > +               if (use_browser) {
> > +                       pr_debug("overlapping maps in %s (disable tui for more info)\n",
> > +                               map__dso(new)->name);
> > +               } else if (verbose >= 2) {
> > +                       pr_debug("overlapping maps:\n");
> > +                       map__fprintf(new, fp);
> > +                       map__fprintf(pos, fp);
> >                 }
> >
> > -               rb_erase_init(&pos->rb_node, maps__entries(maps));
> >                 /*
> >                  * Now check if we need to create new maps for areas not
> >                  * overlapped by the new map:
> >                  */
> > -               if (map__start(new) > map__start(pos->map)) {
> > -                       struct map *before = map__clone(pos->map);
> > +               if (map__start(new) > map__start(pos)) {
> > +                       /* Map starts within existing map. Need to shorten the existing map. */
> > +                       before = map__clone(pos);
> >
> >                         if (before == NULL) {
> >                                 err = -ENOMEM;
> > -                               goto put_map;
> > +                               goto out_err;
> >                         }
> > -
> >                         map__set_end(before, map__start(new));
> > -                       err = __maps__insert(maps, before);
> > -                       if (err) {
> > -                               map__put(before);
> > -                               goto put_map;
> > -                       }
> >
> >                         if (verbose >= 2 && !use_browser)
> >                                 map__fprintf(before, fp);
> > -                       map__put(before);
> >                 }
> > -
> > -               if (map__end(new) < map__end(pos->map)) {
> > -                       struct map *after = map__clone(pos->map);
> > +               if (map__end(new) < map__end(pos)) {
> > +                       /* The new map isn't as long as the existing map. */
> > +                       after = map__clone(pos);
> >
> >                         if (after == NULL) {
> > +                               map__zput(before);
> >                                 err = -ENOMEM;
> > -                               goto put_map;
> > +                               goto out_err;
> >                         }
> >
> >                         map__set_start(after, map__end(new));
> > -                       map__add_pgoff(after, map__end(new) - map__start(pos->map));
> > -                       assert(map__map_ip(pos->map, map__end(new)) ==
> > -                               map__map_ip(after, map__end(new)));
> > -                       err = __maps__insert(maps, after);
> > -                       if (err) {
> > -                               map__put(after);
> > -                               goto put_map;
> > -                       }
> > +                       map__add_pgoff(after, map__end(new) - map__start(pos));
> > +                       assert(map__map_ip(pos, map__end(new)) ==
> > +                              map__map_ip(after, map__end(new)));
> > +
> >                         if (verbose >= 2 && !use_browser)
> >                                 map__fprintf(after, fp);
> > -                       map__put(after);
> >                 }
> > -put_map:
> > -               map__put(pos->map);
> > -               free(pos);
> > +               /*
> > +                * If adding one entry, for `before` or `after`, we can replace
> > +                * the existing entry. If both `before` and `after` are
> > +                * necessary then an insert is needed. If the new map
> > +                * entirely overlaps the existing entry it can just be removed.
> > +                */
> > +               if (before) {
> > +                       map__put(maps_by_address[i]);
> > +                       maps_by_address[i] = before;
> > +                       /* Maps are still ordered, go to next one. */
> > +                       i++;
> > +                       if (after) {
> > +                               __maps__insert(maps, after);
> > +                               map__put(after);
> > +                               if (!maps__maps_by_address_sorted(maps)) {
> > +                                       /*
> > +                                        * Sorting broken so invariants don't
> > +                                        * hold, sort and go again.
> > +                                        */
> > +                                       goto sort_again;
> > +                               }
> > +                               /*
> > +                                * Maps are still ordered, skip after and go to
> > +                                * next one (terminate loop).
> > +                                */
> > +                               i++;
> > +                       }
> > +               } else if (after) {
> > +                       map__put(maps_by_address[i]);
> > +                       maps_by_address[i] = after;
> > +                       /* Maps are ordered, go to next one. */
> > +                       i++;
> > +               } else {
> > +                       __maps__remove(maps, pos);
> > +                       /*
> > +                        * Maps are ordered but no need to increase `i` as the
> > +                        * later maps were moved down.
> > +                        */
> > +               }
> > +               check_invariants(maps);
> >         }
> >         /* Add the map. */
> > -       err = __maps__insert(maps, new);
> > -       up_write(maps__lock(maps));
> > +       __maps__insert(maps, new);
> > +out_err:
> >         return err;
> >  }
> >
> > -int maps__copy_from(struct maps *maps, struct maps *parent)
> > +int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
> >  {
> >         int err;
> > -       struct map_rb_node *rb_node;
> >
> > +       down_write(maps__lock(maps));
> > +       err = __maps__fixup_overlap_and_insert(maps, new);
> > +       up_write(maps__lock(maps));
> > +       return err;
> > +}
> > +
> > +int maps__copy_from(struct maps *dest, struct maps *parent)
> > +{
> > +       /* Note, if struct map were immutable then cloning could use ref counts. */
> > +       struct map **parent_maps_by_address;
> > +       int err = 0;
> > +       unsigned int n;
> > +
> > +       down_write(maps__lock(dest));
> >         down_read(maps__lock(parent));
> >
> > -       maps__for_each_entry(parent, rb_node) {
> > -               struct map *new = map__clone(rb_node->map);
> > +       parent_maps_by_address = maps__maps_by_address(parent);
> > +       n = maps__nr_maps(parent);
> > +       if (maps__empty(dest)) {
> > +               /* No existing mappings so just copy from parent to avoid reallocs in insert. */
> > +               unsigned int nr_maps_allocated = RC_CHK_ACCESS(parent)->nr_maps_allocated;
> > +               struct map **dest_maps_by_address =
> > +                       malloc(nr_maps_allocated * sizeof(struct map *));
> > +               struct map **dest_maps_by_name = NULL;
> >
> > -               if (new == NULL) {
> > +               if (!dest_maps_by_address)
> >                         err = -ENOMEM;
> > -                       goto out_unlock;
> > +               else {
> > +                       if (maps__maps_by_name(parent)) {
> > +                               dest_maps_by_name =
> > +                                       malloc(nr_maps_allocated * sizeof(struct map *));
> > +                       }
> > +
> > +                       RC_CHK_ACCESS(dest)->maps_by_address = dest_maps_by_address;
> > +                       RC_CHK_ACCESS(dest)->maps_by_name = dest_maps_by_name;
> > +                       RC_CHK_ACCESS(dest)->nr_maps_allocated = nr_maps_allocated;
> >                 }
> >
> > -               err = unwind__prepare_access(maps, new, NULL);
> > -               if (err)
> > -                       goto out_unlock;
> > +               for (unsigned int i = 0; !err && i < n; i++) {
> > +                       struct map *pos = parent_maps_by_address[i];
> > +                       struct map *new = map__clone(pos);
> >
> > -               err = maps__insert(maps, new);
> > -               if (err)
> > -                       goto out_unlock;
> > +                       if (!new)
> > +                               err = -ENOMEM;
> > +                       else {
> > +                               err = unwind__prepare_access(dest, new, NULL);
> > +                               if (!err) {
> > +                                       dest_maps_by_address[i] = new;
> > +                                       if (dest_maps_by_name)
> > +                                               dest_maps_by_name[i] = map__get(new);
> > +                                       RC_CHK_ACCESS(dest)->nr_maps = i + 1;
> > +                               }
> > +                       }
> > +                       if (err)
> > +                               map__put(new);
> > +               }
> > +               maps__set_maps_by_address_sorted(dest, maps__maps_by_address_sorted(parent));
> > +               if (!err) {
> > +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx =
> > +                               RC_CHK_ACCESS(parent)->last_search_by_name_idx;
> > +                       maps__set_maps_by_name_sorted(dest,
> > +                                               dest_maps_by_name &&
> > +                                               maps__maps_by_name_sorted(parent));
> > +               } else {
> > +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx = 0;
> > +                       maps__set_maps_by_name_sorted(dest, false);
> > +               }
> > +       } else {
> > +               /* Unexpected copying to a maps containing entries. */
> > +               for (unsigned int i = 0; !err && i < n; i++) {
> > +                       struct map *pos = parent_maps_by_address[i];
> > +                       struct map *new = map__clone(pos);
> >
> > -               map__put(new);
> > +                       if (!new)
> > +                               err = -ENOMEM;
> > +                       else {
> > +                               err = unwind__prepare_access(dest, new, NULL);
> > +                               if (!err)
> > +                                       err = maps__insert(dest, new);
>
> Shouldn't it be __maps__insert()?

On entry, the read lock is taken on parent but no lock is taken on dest,
so the locked version is used.

> > +                       }
> > +                       map__put(new);
> > +               }
> >         }
> > -
> > -       err = 0;
> > -out_unlock:
> >         up_read(maps__lock(parent));
> > +       up_write(maps__lock(dest));
> >         return err;
> >  }
> >
> > -struct map *maps__find(struct maps *maps, u64 ip)
> > +static int map__addr_cmp(const void *key, const void *entry)
> >  {
> > -       struct rb_node *p;
> > -       struct map_rb_node *m;
> > -
> > +       const u64 ip = *(const u64 *)key;
> > +       const struct map *map = *(const struct map * const *)entry;
> >
> > -       down_read(maps__lock(maps));
> > -
> > -       p = maps__entries(maps)->rb_node;
> > -       while (p != NULL) {
> > -               m = rb_entry(p, struct map_rb_node, rb_node);
> > -               if (ip < map__start(m->map))
> > -                       p = p->rb_left;
> > -               else if (ip >= map__end(m->map))
> > -                       p = p->rb_right;
> > -               else
> > -                       goto out;
> > -       }
> > -
> > -       m = NULL;
> > -out:
> > -       up_read(maps__lock(maps));
> > -       return m ? m->map : NULL;
> > +       if (ip < map__start(map))
> > +               return -1;
> > +       if (ip >= map__end(map))
> > +               return 1;
> > +       return 0;
> >  }
> >
> > -static int map__strcmp(const void *a, const void *b)
> > +struct map *maps__find(struct maps *maps, u64 ip)
> >  {
> > -       const struct map *map_a = *(const struct map **)a;
> > -       const struct map *map_b = *(const struct map **)b;
> > -       const struct dso *dso_a = map__dso(map_a);
> > -       const struct dso *dso_b = map__dso(map_b);
> > -       int ret = strcmp(dso_a->short_name, dso_b->short_name);
> > -
> > -       if (ret == 0 && map_a != map_b) {
> > -               /*
> > -                * Ensure distinct but name equal maps have an order in part to
> > -                * aid reference counting.
> > -                */
> > -               ret = (int)map__start(map_a) - (int)map__start(map_b);
> > -               if (ret == 0)
> > -                       ret = (int)((intptr_t)map_a - (intptr_t)map_b);
> > +       struct map *result = NULL;
> > +       bool done = false;
> > +
> > +       /* See locking/sorting note. */
> > +       while (!done) {
> > +               down_read(maps__lock(maps));
> > +               if (maps__maps_by_address_sorted(maps)) {
> > +                       struct map **mapp =
> > +                               bsearch(&ip, maps__maps_by_address(maps), maps__nr_maps(maps),
> > +                                       sizeof(*mapp), map__addr_cmp);
> > +
> > +                       if (mapp)
> > +                               result = *mapp; // map__get(*mapp);
> > +                       done = true;
> > +               }
> > +               up_read(maps__lock(maps));
> > +               if (!done)
> > +                       maps__sort_by_address(maps);
> >         }
> > -
> > -       return ret;
> > +       return result;
> >  }
> >
> >  static int map__strcmp_name(const void *name, const void *b)
> > @@ -593,126 +866,113 @@ static int map__strcmp_name(const void *name, const void *b)
> >         return strcmp(name, dso->short_name);
> >  }
> >
> > -void __maps__sort_by_name(struct maps *maps)
> > -{
> > -       qsort(maps__maps_by_name(maps), maps__nr_maps(maps), sizeof(struct map *), map__strcmp);
> > -}
> > -
> > -static int map__groups__sort_by_name_from_rbtree(struct maps *maps)
> > -{
> > -       struct map_rb_node *rb_node;
> > -       struct map **maps_by_name = realloc(maps__maps_by_name(maps),
> > -                                           maps__nr_maps(maps) * sizeof(struct map *));
> > -       int i = 0;
> > -
> > -       if (maps_by_name == NULL)
> > -               return -1;
> > -
> > -       up_read(maps__lock(maps));
> > -       down_write(maps__lock(maps));
> > -
> > -       RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
> > -       RC_CHK_ACCESS(maps)->nr_maps_allocated = maps__nr_maps(maps);
> > -
> > -       maps__for_each_entry(maps, rb_node)
> > -               maps_by_name[i++] = map__get(rb_node->map);
> > -
> > -       __maps__sort_by_name(maps);
> > -
> > -       up_write(maps__lock(maps));
> > -       down_read(maps__lock(maps));
> > -
> > -       return 0;
> > -}
> > -
> > -static struct map *__maps__find_by_name(struct maps *maps, const char *name)
> > +struct map *maps__find_by_name(struct maps *maps, const char *name)
> >  {
> > -       struct map **mapp;
> > +       struct map *result = NULL;
> > +       bool done = false;
> >
> > -       if (maps__maps_by_name(maps) == NULL &&
> > -           map__groups__sort_by_name_from_rbtree(maps))
> > -               return NULL;
> > +       /* See locking/sorting note. */
> > +       while (!done) {
> > +               unsigned int i;
> >
> > -       mapp = bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
> > -                      sizeof(*mapp), map__strcmp_name);
> > -       if (mapp)
> > -               return *mapp;
> > -       return NULL;
> > -}
> > +               down_read(maps__lock(maps));
> >
> > -struct map *maps__find_by_name(struct maps *maps, const char *name)
> > -{
> > -       struct map_rb_node *rb_node;
> > -       struct map *map;
> > -
> > -       down_read(maps__lock(maps));
> > +               /* First check last found entry. */
> > +               i = RC_CHK_ACCESS(maps)->last_search_by_name_idx;
> > +               if (i < maps__nr_maps(maps) && maps__maps_by_name(maps)) {
> > +                       struct dso *dso = map__dso(maps__maps_by_name(maps)[i]);
> >
> > +                       if (dso && strcmp(dso->short_name, name) == 0) {
> > +                               result = maps__maps_by_name(maps)[i]; // TODO: map__get
> > +                               done = true;
> > +                       }
> > +               }
> >
> > -       if (RC_CHK_ACCESS(maps)->last_search_by_name) {
> > -               const struct dso *dso = map__dso(RC_CHK_ACCESS(maps)->last_search_by_name);
> > +               /* Second search sorted array. */
> > +               if (!done && maps__maps_by_name_sorted(maps)) {
> > +                       struct map **mapp =
> > +                               bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
> > +                                       sizeof(*mapp), map__strcmp_name);
> >
> > -               if (strcmp(dso->short_name, name) == 0) {
> > -                       map = RC_CHK_ACCESS(maps)->last_search_by_name;
> > -                       goto out_unlock;
> > +                       if (mapp) {
> > +                               result = *mapp; // TODO: map__get
> > +                               i = mapp - maps__maps_by_name(maps);
> > +                               RC_CHK_ACCESS(maps)->last_search_by_name_idx = i;
> > +                       }
> > +                       done = true;
> >                 }
> > -       }
> > -       /*
> > -        * If we have maps->maps_by_name, then the name isn't in the rbtree,
> > -        * as maps->maps_by_name mirrors the rbtree when lookups by name are
> > -        * made.
> > -        */
> > -       map = __maps__find_by_name(maps, name);
> > -       if (map || maps__maps_by_name(maps) != NULL)
> > -               goto out_unlock;
> > -
> > -       /* Fallback to traversing the rbtree... */
> > -       maps__for_each_entry(maps, rb_node) {
> > -               struct dso *dso;
> > -
> > -               map = rb_node->map;
> > -               dso = map__dso(map);
> > -               if (strcmp(dso->short_name, name) == 0) {
> > -                       RC_CHK_ACCESS(maps)->last_search_by_name = map;
> > -                       goto out_unlock;
> > +               up_read(maps__lock(maps));
> > +               if (!done) {
> > +                       /* Sort and retry binary search. */
> > +                       if (maps__sort_by_name(maps)) {
> > +                               /*
> > +                                * Memory allocation failed, do a linear
> > +                                * search through the address-sorted maps.
> > +                                */
> > +                               struct map **maps_by_address;
> > +                               unsigned int n;
> > +
> > +                               down_read(maps__lock(maps));
> > +                               maps_by_address = maps__maps_by_address(maps);
> > +                               n = maps__nr_maps(maps);
> > +                               for (i = 0; i < n; i++) {
> > +                                       struct map *pos = maps_by_address[i];
> > +                                       struct dso *dso = map__dso(pos);
> > +
> > +                                       if (dso && strcmp(dso->short_name, name) == 0) {
> > +                                               result = pos; // TODO: map__get
> > +                                               break;
> > +                                       }
> > +                               }
> > +                               up_read(maps__lock(maps));
> > +                               done = true;
> > +                       }
> >                 }
> >         }
> > -       map = NULL;
> > -
> > -out_unlock:
> > -       up_read(maps__lock(maps));
> > -       return map;
> > +       return result;
> >  }
> >
> >  struct map *maps__find_next_entry(struct maps *maps, struct map *map)
> >  {
> > -       struct map_rb_node *rb_node = maps__find_node(maps, map);
> > -       struct map_rb_node *next = map_rb_node__next(rb_node);
> > +       unsigned int i;
> > +       struct map *result = NULL;
> >
> > -       if (next)
> > -               return next->map;
> > +       down_read(maps__lock(maps));
> > +       i = maps__by_address_index(maps, map);
> > +       if (i < maps__nr_maps(maps))
> > +               result = maps__maps_by_address(maps)[i]; // TODO: map__get
> >
> > -       return NULL;
> > +       up_read(maps__lock(maps));
> > +       return result;
> >  }
> >
> >  void maps__fixup_end(struct maps *maps)
> >  {
> > -       struct map_rb_node *prev = NULL, *curr;
> > +       struct map **maps_by_address;
> > +       unsigned int n;
> >
> >         down_write(maps__lock(maps));
> > +       if (!maps__maps_by_address_sorted(maps))
> > +               __maps__sort_by_address(maps);
> >
> > -       maps__for_each_entry(maps, curr) {
> > -               if (prev && (!map__end(prev->map) || map__end(prev->map) > map__start(curr->map)))
> > -                       map__set_end(prev->map, map__start(curr->map));
> > +       maps_by_address = maps__maps_by_address(maps);
> > +       n = maps__nr_maps(maps);
> > +       for (unsigned int i = 1; i < n; i++) {
> > +               struct map *prev = maps_by_address[i - 1];
> > +               struct map *curr = maps_by_address[i];
> >
> > -               prev = curr;
> > +               if (!map__end(prev) || map__end(prev) > map__start(curr))
> > +                       map__set_end(prev, map__start(curr));
> >         }
> >
> >         /*
> >          * We still haven't the actual symbols, so guess the
> >          * last map final address.
> >          */
> > -       if (curr && !map__end(curr->map))
> > -               map__set_end(curr->map, ~0ULL);
> > +       if (n > 0 && !map__end(maps_by_address[n - 1]))
> > +               map__set_end(maps_by_address[n - 1], ~0ULL);
> > +
> > +       RC_CHK_ACCESS(maps)->ends_broken = false;
> >
> >         up_write(maps__lock(maps));
> >  }
> > @@ -723,117 +983,92 @@ void maps__fixup_end(struct maps *maps)
> >   */
> >  int maps__merge_in(struct maps *kmaps, struct map *new_map)
> >  {
> > -       struct map_rb_node *rb_node;
> > -       struct rb_node *first;
> > -       bool overlaps;
> > -       LIST_HEAD(merged);
> > -       int err = 0;
> > +       unsigned int first_after_, kmaps__nr_maps;
> > +       struct map **kmaps_maps_by_address;
> > +       struct map **merged_maps_by_address;
> > +       unsigned int merged_nr_maps_allocated;
> > +
> > +       /* First try under a read lock. */
> > +       while (true) {
> > +               down_read(maps__lock(kmaps));
> > +               if (maps__maps_by_address_sorted(kmaps))
> > +                       break;
> >
> > -       down_read(maps__lock(kmaps));
> > -       first = first_ending_after(kmaps, new_map);
> > -       rb_node = first ? rb_entry(first, struct map_rb_node, rb_node) : NULL;
> > -       overlaps = rb_node && map__start(rb_node->map) < map__end(new_map);
> > -       up_read(maps__lock(kmaps));
> > +               up_read(maps__lock(kmaps));
> > +
> > +               /* The first_ending_after() binary search requires sorted maps. Sort and try again. */
> > +               maps__sort_by_address(kmaps);
> > +       }
> > +       first_after_ = first_ending_after(kmaps, new_map);
> > +       kmaps_maps_by_address = maps__maps_by_address(kmaps);
> >
> > -       if (!overlaps)
> > +       if (first_after_ >= maps__nr_maps(kmaps) ||
> > +           map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
> > +               /* No overlap so regular insert suffices. */
> > +               up_read(maps__lock(kmaps));
> >                 return maps__insert(kmaps, new_map);
> > +       }
> > +       up_read(maps__lock(kmaps));
> >
> > -       maps__for_each_entry(kmaps, rb_node) {
> > -               struct map *old_map = rb_node->map;
> > +       /* Plain insert with a read-lock failed, try again now with the write lock. */
> > +       down_write(maps__lock(kmaps));
> > +       if (!maps__maps_by_address_sorted(kmaps))
> > +               __maps__sort_by_address(kmaps);
> >
> > -               /* no overload with this one */
> > -               if (map__end(new_map) < map__start(old_map) ||
> > -                   map__start(new_map) >= map__end(old_map))
> > -                       continue;
> > +       first_after_ = first_ending_after(kmaps, new_map);
> > +       kmaps_maps_by_address = maps__maps_by_address(kmaps);
> > +       kmaps__nr_maps = maps__nr_maps(kmaps);
> >
> > -               if (map__start(new_map) < map__start(old_map)) {
> > -                       /*
> > -                        * |new......
> > -                        *       |old....
> > -                        */
> > -                       if (map__end(new_map) < map__end(old_map)) {
> > -                               /*
> > -                                * |new......|     -> |new..|
> > -                                *       |old....| ->       |old....|
> > -                                */
> > -                               map__set_end(new_map, map__start(old_map));
> > -                       } else {
> > -                               /*
> > -                                * |new.............| -> |new..|       |new..|
> > -                                *       |old....|    ->       |old....|
> > -                                */
> > -                               struct map_list_node *m = map_list_node__new();
> > +       if (first_after_ >= kmaps__nr_maps ||
> > +           map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
> > +               /* No overlap so regular insert suffices. */
> > +               up_write(maps__lock(kmaps));
> > +               return maps__insert(kmaps, new_map);
>
> I think it could be:
>
>         ret = __maps__insert(kmaps, new_map);
>         up_write(maps__lock(kmaps));
>         return ret;

Ack. Will change in v3.

> > +       }
> > +       /* Array to merge into, possibly 1 more for the sake of new_map. */
> > +       merged_nr_maps_allocated = RC_CHK_ACCESS(kmaps)->nr_maps_allocated;
> > +       if (kmaps__nr_maps + 1 == merged_nr_maps_allocated)
> > +               merged_nr_maps_allocated++;
> > +
> > +       merged_maps_by_address = malloc(merged_nr_maps_allocated * sizeof(*merged_maps_by_address));
> > +       if (!merged_maps_by_address) {
> > +               up_write(maps__lock(kmaps));
> > +               return -ENOMEM;
> > +       }
> > +       RC_CHK_ACCESS(kmaps)->maps_by_address = merged_maps_by_address;
> > +       RC_CHK_ACCESS(kmaps)->maps_by_address_sorted = true;
> > +       zfree(&RC_CHK_ACCESS(kmaps)->maps_by_name);
> > +       RC_CHK_ACCESS(kmaps)->maps_by_name_sorted = false;
> > +       RC_CHK_ACCESS(kmaps)->nr_maps_allocated = merged_nr_maps_allocated;
>
> Why not use the accessor functions?

Ack. I've been holding back on accessors that are used once, but I
will add them here.
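[Editorial note: the accessor style being asked for wraps each raw `RC_CHK_ACCESS()` field access in a small getter/setter so call sites never touch struct internals directly. A simplified sketch, with plain struct fields standing in for the reference-count-checked ones; the field set shown here is illustrative, not the full struct.]

```c
#include <assert.h>
#include <stddef.h>

struct map;

/* Simplified stand-in for DECLARE_RC_STRUCT(maps). */
struct maps {
	struct map **maps_by_address;
	unsigned int nr_maps_allocated;
};

/* Getter/setter pairs keep the raw field accesses confined to one place. */
static struct map **maps__maps_by_address(const struct maps *maps)
{
	return maps->maps_by_address;
}

static void maps__set_maps_by_address(struct maps *maps, struct map **new_maps)
{
	maps->maps_by_address = new_maps;
}

static unsigned int maps__nr_maps_allocated(const struct maps *maps)
{
	return maps->nr_maps_allocated;
}

static void maps__set_nr_maps_allocated(struct maps *maps, unsigned int n)
{
	maps->nr_maps_allocated = n;
}
```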

Thanks,
Ian

> Thanks,
> Namhyung
>
> >
> > -                               if (!m) {
> > -                                       err = -ENOMEM;
> > -                                       goto out;
> > -                               }
> > +       /* Copy entries before the new_map that can't overlap. */
> > +       for (unsigned int i = 0; i < first_after_; i++)
> > +               merged_maps_by_address[i] = map__get(kmaps_maps_by_address[i]);
> >
> > -                               m->map = map__clone(new_map);
> > -                               if (!m->map) {
> > -                                       free(m);
> > -                                       err = -ENOMEM;
> > -                                       goto out;
> > -                               }
> > +       RC_CHK_ACCESS(kmaps)->nr_maps = first_after_;
> >
> > -                               map__set_end(m->map, map__start(old_map));
> > -                               list_add_tail(&m->node, &merged);
> > -                               map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
> > -                               map__set_start(new_map, map__end(old_map));
> > -                       }
> > -               } else {
> > -                       /*
> > -                        *      |new......
> > -                        * |old....
> > -                        */
> > -                       if (map__end(new_map) < map__end(old_map)) {
> > -                               /*
> > -                                *      |new..|   -> x
> > -                                * |old.........| -> |old.........|
> > -                                */
> > -                               map__put(new_map);
> > -                               new_map = NULL;
> > -                               break;
> > -                       } else {
> > -                               /*
> > -                                *      |new......| ->         |new...|
> > -                                * |old....|        -> |old....|
> > -                                */
> > -                               map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
> > -                               map__set_start(new_map, map__end(old_map));
> > -                       }
> > -               }
> > -       }
> > +       /* Add the new map, it will be split when the later overlapping mappings are added. */
> > +       __maps__insert(kmaps, new_map);
> >
> > -out:
> > -       while (!list_empty(&merged)) {
> > -               struct map_list_node *old_node;
> > +       /* Insert mappings after new_map, splitting new_map in the process. */
> > +       for (unsigned int i = first_after_; i < kmaps__nr_maps; i++)
> > +               __maps__fixup_overlap_and_insert(kmaps, kmaps_maps_by_address[i]);
> >
> > -               old_node = list_entry(merged.next, struct map_list_node, node);
> > -               list_del_init(&old_node->node);
> > -               if (!err)
> > -                       err = maps__insert(kmaps, old_node->map);
> > -               map__put(old_node->map);
> > -               free(old_node);
> > -       }
> > +       /* Copy the maps from merged into kmaps. */
> > +       for (unsigned int i = 0; i < kmaps__nr_maps; i++)
> > +               map__zput(kmaps_maps_by_address[i]);
> >
> > -       if (new_map) {
> > -               if (!err)
> > -                       err = maps__insert(kmaps, new_map);
> > -               map__put(new_map);
> > -       }
> > -       return err;
> > +       free(kmaps_maps_by_address);
> > +       up_write(maps__lock(kmaps));
> > +       return 0;
> >  }
> >
> >  void maps__load_first(struct maps *maps)
> >  {
> > -       struct map_rb_node *first;
> > -
> >         down_read(maps__lock(maps));
> >
> > -       first = maps__first(maps);
> > -       if (first)
> > -               map__load(first->map);
> > +       if (maps__nr_maps(maps) > 0)
> > +               map__load(maps__maps_by_address(maps)[0]);
> >
> >         up_read(maps__lock(maps));
> >  }
> > diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
> > index d836d04c9402..df9dd5a0e3c0 100644
> > --- a/tools/perf/util/maps.h
> > +++ b/tools/perf/util/maps.h
> > @@ -25,21 +25,56 @@ static inline struct map_list_node *map_list_node__new(void)
> >         return malloc(sizeof(struct map_list_node));
> >  }
> >
> > -struct map *maps__find(struct maps *maps, u64 addr);
> > +/*
> > + * Locking/sorting note:
> > + *
> > + * Sorting is done with the write lock, iteration and binary searching happens
> > + * under the read lock requiring being sorted. There is a race between sorting
> > + * releasing the write lock and acquiring the read lock for iteration/searching
> > + * where another thread could insert and break the sorting of the maps. In
> > + * practice inserting maps should be rare meaning that the race shouldn't lead
> > + * to live lock. Removal of maps doesn't break being sorted.
> > + */
> >
> >  DECLARE_RC_STRUCT(maps) {
> > -       struct rb_root      entries;
> >         struct rw_semaphore lock;
> > -       struct machine   *machine;
> > -       struct map       *last_search_by_name;
> > +       /**
> > +        * @maps_by_address: array of maps sorted by their starting address if
> > +        * maps_by_address_sorted is true.
> > +        */
> > +       struct map       **maps_by_address;
> > +       /**
> > +        * @maps_by_name: optional array of maps sorted by their dso name if
> > +        * maps_by_name_sorted is true.
> > +        */
> >         struct map       **maps_by_name;
> > -       refcount_t       refcnt;
> > -       unsigned int     nr_maps;
> > -       unsigned int     nr_maps_allocated;
> > +       struct machine   *machine;
> >  #ifdef HAVE_LIBUNWIND_SUPPORT
> > -       void                            *addr_space;
> > +       void            *addr_space;
> >         const struct unwind_libunwind_ops *unwind_libunwind_ops;
> >  #endif
> > +       refcount_t       refcnt;
> > +       /**
> > +        * @nr_maps: number of maps_by_address, and possibly maps_by_name,
> > +        * entries that contain maps.
> > +        */
> > +       unsigned int     nr_maps;
> > +       /**
> > +        * @nr_maps_allocated: number of entries in maps_by_address and possibly
> > +        * maps_by_name.
> > +        */
> > +       unsigned int     nr_maps_allocated;
> > +       /**
> > +        * @last_search_by_name_idx: cache of last found by name entry's index
> > +        * as frequent searches for the same dso name are common.
> > +        */
> > +       unsigned int     last_search_by_name_idx;
> > +       /** @maps_by_address_sorted: is maps_by_address sorted. */
> > +       bool             maps_by_address_sorted;
> > +       /** @maps_by_name_sorted: is maps_by_name sorted. */
> > +       bool             maps_by_name_sorted;
> > +       /** @ends_broken: does the map contain a map where end values are unset/unsorted? */
> > +       bool             ends_broken;
> >  };
> >
> >  #define KMAP_NAME_LEN 256
> > @@ -102,6 +137,7 @@ size_t maps__fprintf(struct maps *maps, FILE *fp);
> >  int maps__insert(struct maps *maps, struct map *map);
> >  void maps__remove(struct maps *maps, struct map *map);
> >
> > +struct map *maps__find(struct maps *maps, u64 addr);
> >  struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp);
> >  struct symbol *maps__find_symbol_by_name(struct maps *maps, const char *name, struct map **mapp);
> >
> > @@ -117,8 +153,6 @@ struct map *maps__find_next_entry(struct maps *maps, struct map *map);
> >
> >  int maps__merge_in(struct maps *kmaps, struct map *new_map);
> >
> > -void __maps__sort_by_name(struct maps *maps);
> > -
> >  void maps__fixup_end(struct maps *maps);
> >
> >  void maps__load_first(struct maps *maps);
> > --
> > 2.43.0.472.g3155946c3a-goog
> >
> >


* Re: [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses
  2024-02-02  4:20     ` Ian Rogers
@ 2024-02-02  4:21       ` Ian Rogers
  2024-02-06  0:37       ` Namhyung Kim
  1 sibling, 0 replies; 31+ messages in thread
From: Ian Rogers @ 2024-02-02  4:21 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Nick Terrell, Kan Liang, Andi Kleen, Kajol Jain, Athira Rajeev,
	Huacai Chen, Masami Hiramatsu, Vincent Whitchurch,
	Steinar H. Gunderson, Liam Howlett, Miguel Ojeda, Colin Ian King,
	Dmitrii Dolgov, Yang Jihong, Ming Wang, James Clark,
	K Prateek Nayak, Sean Christopherson, Leo Yan, Ravi Bangoria,
	German Gomez, Changbin Du, Paolo Bonzini, Li Dong, Sandipan Das,
	liuwenyu, linux-kernel, linux-perf-users, Guilherme Amadio

On Thu, Feb 1, 2024 at 8:20 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Feb 1, 2024 at 6:48 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Tue, Jan 2, 2024 at 9:07 PM Ian Rogers <irogers@google.com> wrote:
> > >
> > > Maps is a collection of maps primarily sorted by the starting address
> > > of the map. Prior to this change the maps were held in an rbtree
> > > requiring 4 pointers per node. Prior to reference count checking, the
> > > rbnode was embedded in the map so 3 pointers per node were
> > > necessary. This change switches the rbtree to an array lazily sorted
> > > by address, much like the existing array that sorts nodes by name. Only
> > > 1 pointer is needed per node, but to avoid excessive resizing the
> > > backing array may hold twice the number of used elements, meaning the
> > > memory overhead is roughly half that of the rbtree. For a perf record
> > > with "--no-bpf-event -g -a" of the true binary, the memory overhead of
> > > perf inject is reduced from 3.3MB to 3MB, so 10% or 300KB is saved.
> > >
> > > Map inserts always happen at the end of the array. The code tracks
> > > whether the insertion violates the sorting property. O(log n) rb-tree
> > > complexity is switched to O(1).
> > >
> > > Removal slides the array entries down, so the O(log n) rb-tree
> > > complexity degrades to O(n).
> > >
> > > A find may need to sort the array using qsort, which is O(n*log n), but
> > > in general the maps should already be sorted, so average performance
> > > should be O(log n) as with the rbtree.
> > >
> > > An rbtree node consumes a cache line, but with the array 4 nodes fit
> > > on a cache line. Iteration is simplified to scanning an array rather
> > > than pointer chasing.
> > >
> > > Overall it is expected the performance after the change should be
> > > comparable to before, but with half of the memory consumed.
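[Editorial note: the lazily sorted array described in the commit message above can be sketched, in simplified form, as below. Plain `unsigned long long` keys stand in for struct map start addresses, and the `struct addrs` type and helper names are hypothetical; insert appends in O(1) and only clears the sorted flag when ordering is violated, while find sorts lazily with qsort and then binary searches.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified sketch: u64-like keys in place of struct map pointers. */
struct addrs {
	unsigned long long *by_addr;
	unsigned int nr, allocated;
	bool sorted;
};

/* O(1) insert: append at the end, tracking whether sorting is violated. */
static int addrs_insert(struct addrs *a, unsigned long long addr)
{
	if (a->nr == a->allocated) {
		/* Double the backing array to avoid excessive resizing. */
		unsigned int n = a->allocated ? a->allocated * 2 : 4;
		unsigned long long *tmp = realloc(a->by_addr, n * sizeof(*tmp));

		if (!tmp)
			return -1;
		a->by_addr = tmp;
		a->allocated = n;
	}
	if (a->nr > 0 && a->by_addr[a->nr - 1] > addr)
		a->sorted = false;
	a->by_addr[a->nr++] = addr;
	return 0;
}

static int addr_cmp(const void *l, const void *r)
{
	unsigned long long a = *(const unsigned long long *)l;
	unsigned long long b = *(const unsigned long long *)r;

	return a == b ? 0 : (a < b ? -1 : 1);
}

/* Find: lazily qsort once (O(n log n)), then binary search (O(log n)). */
static bool addrs_find(struct addrs *a, unsigned long long addr)
{
	if (!a->sorted) {
		qsort(a->by_addr, a->nr, sizeof(*a->by_addr), addr_cmp);
		a->sorted = true;
	}
	return bsearch(&addr, a->by_addr, a->nr, sizeof(*a->by_addr),
		       addr_cmp) != NULL;
}
```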
> >
> > I don't know how much performance impact it would have but I guess
> > search/iteration would be the most frequent operation.  So I like the
> > memory saving it can bring.
> >
> > >
> > > To avoid a list and repeated logic around splitting maps,
> > > maps__merge_in is rewritten in terms of
> > > maps__fixup_overlap_and_insert. maps__merge_in splits the given
> > > mapping, inserting mappings for the remaining gaps.
> > > maps__fixup_overlap_and_insert splits the existing mappings, then adds
> > > the incoming mapping. By adding the new mapping first, then
> > > re-inserting the existing mappings, the splitting behavior matches.
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > >  tools/perf/tests/maps.c |    3 +
> > >  tools/perf/util/map.c   |    1 +
> > >  tools/perf/util/maps.c  | 1183 +++++++++++++++++++++++----------------
> > >  tools/perf/util/maps.h  |   54 +-
> > >  4 files changed, 757 insertions(+), 484 deletions(-)
> > >
> > > diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c
> > > index bb3fbfe5a73e..b15417a0d617 100644
> > > --- a/tools/perf/tests/maps.c
> > > +++ b/tools/perf/tests/maps.c
> > > @@ -156,6 +156,9 @@ static int test__maps__merge_in(struct test_suite *t __maybe_unused, int subtest
> > >         TEST_ASSERT_VAL("merge check failed", !ret);
> > >
> > >         maps__zput(maps);
> > > +       map__zput(map_kcore1);
> > > +       map__zput(map_kcore2);
> > > +       map__zput(map_kcore3);
> > >         return TEST_OK;
> > >  }
> > >
> > > diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> > > index 54c67cb7ecef..cf5a15db3a1f 100644
> > > --- a/tools/perf/util/map.c
> > > +++ b/tools/perf/util/map.c
> > > @@ -168,6 +168,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
> > >                 if (dso == NULL)
> > >                         goto out_delete;
> > >
> > > +               assert(!dso->kernel);
> > >                 map__init(result, start, start + len, pgoff, dso);
> > >
> > >                 if (anon || no_dso) {
> > > diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
> > > index 0334fc18d9c6..6ee81160cdab 100644
> > > --- a/tools/perf/util/maps.c
> > > +++ b/tools/perf/util/maps.c
> > > @@ -10,286 +10,477 @@
> > >  #include "ui/ui.h"
> > >  #include "unwind.h"
> > >
> > > -struct map_rb_node {
> > > -       struct rb_node rb_node;
> > > -       struct map *map;
> > > -};
> > > -
> > > -#define maps__for_each_entry(maps, map) \
> > > -       for (map = maps__first(maps); map; map = map_rb_node__next(map))
> > > +static void check_invariants(const struct maps *maps __maybe_unused)
> > > +{
> > > +#ifndef NDEBUG
> > > +       assert(RC_CHK_ACCESS(maps)->nr_maps <= RC_CHK_ACCESS(maps)->nr_maps_allocated);
> > > +       for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
> > > +               struct map *map = RC_CHK_ACCESS(maps)->maps_by_address[i];
> > > +
> > > +               /* Check map is well-formed. */
> > > +               assert(map__end(map) == 0 || map__start(map) <= map__end(map));
> > > +               /* Expect at least 1 reference count. */
> > > +               assert(refcount_read(map__refcnt(map)) > 0);
> > > +
> > > +               if (map__dso(map) && map__dso(map)->kernel)
> > > +                       assert(RC_CHK_EQUAL(map__kmap(map)->kmaps, maps));
> > > +
> > > +               if (i > 0) {
> > > +                       struct map *prev = RC_CHK_ACCESS(maps)->maps_by_address[i - 1];
> > > +
> > > +                       /* If addresses are sorted... */
> > > +                       if (RC_CHK_ACCESS(maps)->maps_by_address_sorted) {
> > > +                               /* Maps should be in start address order. */
> > > +                               assert(map__start(prev) <= map__start(map));
> > > +                               /*
> > > +                                * If the ends of maps aren't broken (during
> > > +                                * construction) then they should be ordered
> > > +                                * too.
> > > +                                */
> > > +                               if (!RC_CHK_ACCESS(maps)->ends_broken) {
> > > +                                       assert(map__end(prev) <= map__end(map));
> > > +                                       assert(map__end(prev) <= map__start(map) ||
> > > +                                              map__start(prev) == map__start(map));
> > > +                               }
> > > +                       }
> > > +               }
> > > +       }
> > > +       if (RC_CHK_ACCESS(maps)->maps_by_name) {
> > > +               for (unsigned int i = 0; i < RC_CHK_ACCESS(maps)->nr_maps; i++) {
> > > +                       struct map *map = RC_CHK_ACCESS(maps)->maps_by_name[i];
> > >
> > > -#define maps__for_each_entry_safe(maps, map, next) \
> > > -       for (map = maps__first(maps), next = map_rb_node__next(map); map; \
> > > -            map = next, next = map_rb_node__next(map))
> > > +                       /*
> > > +                        * Maps by name maps should be in maps_by_address, so
> > > +                        * the reference count should be higher.
> > > +                        */
> > > +                       assert(refcount_read(map__refcnt(map)) > 1);
> > > +               }
> > > +       }
> > > +#endif
> > > +}
> > >
> > > -static struct rb_root *maps__entries(struct maps *maps)
> > > +static struct map **maps__maps_by_address(const struct maps *maps)
> > >  {
> > > -       return &RC_CHK_ACCESS(maps)->entries;
> > > +       return RC_CHK_ACCESS(maps)->maps_by_address;
> > >  }
> > >
> > > -static struct rw_semaphore *maps__lock(struct maps *maps)
> > > +static void maps__set_maps_by_address(struct maps *maps, struct map **new)
> > >  {
> > > -       return &RC_CHK_ACCESS(maps)->lock;
> > > +       RC_CHK_ACCESS(maps)->maps_by_address = new;
> > > +
> > >  }
> > >
> > > -static struct map **maps__maps_by_name(struct maps *maps)
> > > +/* Not in the header, to aid reference counting. */
> > > +static struct map **maps__maps_by_name(const struct maps *maps)
> > >  {
> > >         return RC_CHK_ACCESS(maps)->maps_by_name;
> > > +
> > >  }
> > >
> > > -static struct map_rb_node *maps__first(struct maps *maps)
> > > +static void maps__set_maps_by_name(struct maps *maps, struct map **new)
> > >  {
> > > -       struct rb_node *first = rb_first(maps__entries(maps));
> > > +       RC_CHK_ACCESS(maps)->maps_by_name = new;
> > >
> > > -       if (first)
> > > -               return rb_entry(first, struct map_rb_node, rb_node);
> > > -       return NULL;
> > >  }
> > >
> > > -static struct map_rb_node *map_rb_node__next(struct map_rb_node *node)
> > > +static bool maps__maps_by_address_sorted(const struct maps *maps)
> > >  {
> > > -       struct rb_node *next;
> > > -
> > > -       if (!node)
> > > -               return NULL;
> > > -
> > > -       next = rb_next(&node->rb_node);
> > > +       return RC_CHK_ACCESS(maps)->maps_by_address_sorted;
> > > +}
> > >
> > > -       if (!next)
> > > -               return NULL;
> > > +static void maps__set_maps_by_address_sorted(struct maps *maps, bool value)
> > > +{
> > > +       RC_CHK_ACCESS(maps)->maps_by_address_sorted = value;
> > > +}
> > >
> > > -       return rb_entry(next, struct map_rb_node, rb_node);
> > > +static bool maps__maps_by_name_sorted(const struct maps *maps)
> > > +{
> > > +       return RC_CHK_ACCESS(maps)->maps_by_name_sorted;
> > >  }
> > >
> > > -static struct map_rb_node *maps__find_node(struct maps *maps, struct map *map)
> > > +static void maps__set_maps_by_name_sorted(struct maps *maps, bool value)
> > >  {
> > > -       struct map_rb_node *rb_node;
> > > +       RC_CHK_ACCESS(maps)->maps_by_name_sorted = value;
> > > +}
> > >
> > > -       maps__for_each_entry(maps, rb_node) {
> > > -               if (rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map))
> > > -                       return rb_node;
> > > -       }
> > > -       return NULL;
> > > +static struct rw_semaphore *maps__lock(struct maps *maps)
> > > +{
> > > +       /*
> > > +        * When the lock is acquired or released the maps invariants should
> > > +        * hold.
> > > +        */
> > > +       check_invariants(maps);
> > > +       return &RC_CHK_ACCESS(maps)->lock;
> > >  }
> > >
> > >  static void maps__init(struct maps *maps, struct machine *machine)
> > >  {
> > > -       refcount_set(maps__refcnt(maps), 1);
> > >         init_rwsem(maps__lock(maps));
> > > -       RC_CHK_ACCESS(maps)->entries = RB_ROOT;
> > > +       RC_CHK_ACCESS(maps)->maps_by_address = NULL;
> > > +       RC_CHK_ACCESS(maps)->maps_by_name = NULL;
> > >         RC_CHK_ACCESS(maps)->machine = machine;
> > > -       RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
> > > +#ifdef HAVE_LIBUNWIND_SUPPORT
> > > +       RC_CHK_ACCESS(maps)->addr_space = NULL;
> > > +       RC_CHK_ACCESS(maps)->unwind_libunwind_ops = NULL;
> > > +#endif
> > > +       refcount_set(maps__refcnt(maps), 1);
> > >         RC_CHK_ACCESS(maps)->nr_maps = 0;
> > > -       RC_CHK_ACCESS(maps)->maps_by_name = NULL;
> > > +       RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
> > > +       RC_CHK_ACCESS(maps)->last_search_by_name_idx = 0;
> > > +       RC_CHK_ACCESS(maps)->maps_by_address_sorted = true;
> > > +       RC_CHK_ACCESS(maps)->maps_by_name_sorted = false;
> > >  }
> > >
> > > -static void __maps__free_maps_by_name(struct maps *maps)
> > > +static void maps__exit(struct maps *maps)
> > >  {
> > > -       /*
> > > -        * Free everything to try to do it from the rbtree in the next search
> > > -        */
> > > -       for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
> > > -               map__put(maps__maps_by_name(maps)[i]);
> > > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > > +       struct map **maps_by_name = maps__maps_by_name(maps);
> > >
> > > -       zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
> > > -       RC_CHK_ACCESS(maps)->nr_maps_allocated = 0;
> > > +       for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> > > +               map__zput(maps_by_address[i]);
> > > +               if (maps_by_name)
> > > +                       map__zput(maps_by_name[i]);
> > > +       }
> > > +       zfree(&maps_by_address);
> > > +       zfree(&maps_by_name);
> > > +       unwind__finish_access(maps);
> > >  }
> > >
> > > -static int __maps__insert(struct maps *maps, struct map *map)
> > > +struct maps *maps__new(struct machine *machine)
> > >  {
> > > -       struct rb_node **p = &maps__entries(maps)->rb_node;
> > > -       struct rb_node *parent = NULL;
> > > -       const u64 ip = map__start(map);
> > > -       struct map_rb_node *m, *new_rb_node;
> > > -
> > > -       new_rb_node = malloc(sizeof(*new_rb_node));
> > > -       if (!new_rb_node)
> > > -               return -ENOMEM;
> > > -
> > > -       RB_CLEAR_NODE(&new_rb_node->rb_node);
> > > -       new_rb_node->map = map__get(map);
> > > +       struct maps *result;
> > > +       RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
> > >
> > > -       while (*p != NULL) {
> > > -               parent = *p;
> > > -               m = rb_entry(parent, struct map_rb_node, rb_node);
> > > -               if (ip < map__start(m->map))
> > > -                       p = &(*p)->rb_left;
> > > -               else
> > > -                       p = &(*p)->rb_right;
> > > -       }
> > > +       if (ADD_RC_CHK(result, maps))
> > > +               maps__init(result, machine);
> > >
> > > -       rb_link_node(&new_rb_node->rb_node, parent, p);
> > > -       rb_insert_color(&new_rb_node->rb_node, maps__entries(maps));
> > > -       return 0;
> > > +       return result;
> > >  }
> > >
> > > -int maps__insert(struct maps *maps, struct map *map)
> > > +static void maps__delete(struct maps *maps)
> > >  {
> > > -       int err;
> > > -       const struct dso *dso = map__dso(map);
> > > -
> > > -       down_write(maps__lock(maps));
> > > -       err = __maps__insert(maps, map);
> > > -       if (err)
> > > -               goto out;
> > > +       maps__exit(maps);
> > > +       RC_CHK_FREE(maps);
> > > +}
> > >
> > > -       ++RC_CHK_ACCESS(maps)->nr_maps;
> > > +struct maps *maps__get(struct maps *maps)
> > > +{
> > > +       struct maps *result;
> > >
> > > -       if (dso && dso->kernel) {
> > > -               struct kmap *kmap = map__kmap(map);
> > > +       if (RC_CHK_GET(result, maps))
> > > +               refcount_inc(maps__refcnt(maps));
> > >
> > > -               if (kmap)
> > > -                       kmap->kmaps = maps;
> > > -               else
> > > -                       pr_err("Internal error: kernel dso with non kernel map\n");
> > > -       }
> > > +       return result;
> > > +}
> > >
> > > +void maps__put(struct maps *maps)
> > > +{
> > > +       if (maps && refcount_dec_and_test(maps__refcnt(maps)))
> > > +               maps__delete(maps);
> > > +       else
> > > +               RC_CHK_PUT(maps);
> > > +}
> > >
> > > +static void __maps__free_maps_by_name(struct maps *maps)
> > > +{
> > >         /*
> > > -        * If we already performed some search by name, then we need to add the just
> > > -        * inserted map and resort.
> > > +        * Free everything to try to do it from the rbtree in the next search
> > >          */
> > > -       if (maps__maps_by_name(maps)) {
> > > -               if (maps__nr_maps(maps) > RC_CHK_ACCESS(maps)->nr_maps_allocated) {
> > > -                       int nr_allocate = maps__nr_maps(maps) * 2;
> > > -                       struct map **maps_by_name = realloc(maps__maps_by_name(maps),
> > > -                                                           nr_allocate * sizeof(map));
> > > +       for (unsigned int i = 0; i < maps__nr_maps(maps); i++)
> > > +               map__put(maps__maps_by_name(maps)[i]);
> > >
> > > -                       if (maps_by_name == NULL) {
> > > -                               __maps__free_maps_by_name(maps);
> > > -                               err = -ENOMEM;
> > > -                               goto out;
> > > -                       }
> > > +       zfree(&RC_CHK_ACCESS(maps)->maps_by_name);
> > > +}
> > >
> > > -                       RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
> > > -                       RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
> > > +static int map__start_cmp(const void *a, const void *b)
> > > +{
> > > +       const struct map *map_a = *(const struct map * const *)a;
> > > +       const struct map *map_b = *(const struct map * const *)b;
> > > +       u64 map_a_start = map__start(map_a);
> > > +       u64 map_b_start = map__start(map_b);
> > > +
> > > +       if (map_a_start == map_b_start) {
> > > +               u64 map_a_end = map__end(map_a);
> > > +               u64 map_b_end = map__end(map_b);
> > > +
> > > +               if  (map_a_end == map_b_end) {
> > > +                       /* Ensure maps with the same addresses have a fixed order. */
> > > +                       if (RC_CHK_ACCESS(map_a) == RC_CHK_ACCESS(map_b))
> > > +                               return 0;
> > > +                       return (intptr_t)RC_CHK_ACCESS(map_a) > (intptr_t)RC_CHK_ACCESS(map_b)
> > > +                               ? 1 : -1;
> > >                 }
> > > -               maps__maps_by_name(maps)[maps__nr_maps(maps) - 1] = map__get(map);
> > > -               __maps__sort_by_name(maps);
> > > +               return map_a_end > map_b_end ? 1 : -1;
> > >         }
> > > - out:
> > > -       up_write(maps__lock(maps));
> > > -       return err;
> > > +       return map_a_start > map_b_start ? 1 : -1;
> > >  }
> > >
> > > -static void __maps__remove(struct maps *maps, struct map_rb_node *rb_node)
> > > +static void __maps__sort_by_address(struct maps *maps)
> > >  {
> > > -       rb_erase_init(&rb_node->rb_node, maps__entries(maps));
> > > -       map__put(rb_node->map);
> > > -       free(rb_node);
> > > +       if (maps__maps_by_address_sorted(maps))
> > > +               return;
> > > +
> > > +       qsort(maps__maps_by_address(maps),
> > > +               maps__nr_maps(maps),
> > > +               sizeof(struct map *),
> > > +               map__start_cmp);
> > > +       maps__set_maps_by_address_sorted(maps, true);
> > >  }
> > >
> > > -void maps__remove(struct maps *maps, struct map *map)
> > > +static void maps__sort_by_address(struct maps *maps)
> > >  {
> > > -       struct map_rb_node *rb_node;
> > > -
> > >         down_write(maps__lock(maps));
> > > -       if (RC_CHK_ACCESS(maps)->last_search_by_name == map)
> > > -               RC_CHK_ACCESS(maps)->last_search_by_name = NULL;
> > > -
> > > -       rb_node = maps__find_node(maps, map);
> > > -       assert(rb_node->RC_CHK_ACCESS(map) == RC_CHK_ACCESS(map));
> > > -       __maps__remove(maps, rb_node);
> > > -       if (maps__maps_by_name(maps))
> > > -               __maps__free_maps_by_name(maps);
> > > -       --RC_CHK_ACCESS(maps)->nr_maps;
> > > +       __maps__sort_by_address(maps);
> > >         up_write(maps__lock(maps));
> > >  }
> > >
> > > -static void __maps__purge(struct maps *maps)
> > > +static int map__strcmp(const void *a, const void *b)
> > >  {
> > > -       struct map_rb_node *pos, *next;
> > > -
> > > -       if (maps__maps_by_name(maps))
> > > -               __maps__free_maps_by_name(maps);
> > > +       const struct map *map_a = *(const struct map * const *)a;
> > > +       const struct map *map_b = *(const struct map * const *)b;
> > > +       const struct dso *dso_a = map__dso(map_a);
> > > +       const struct dso *dso_b = map__dso(map_b);
> > > +       int ret = strcmp(dso_a->short_name, dso_b->short_name);
> > >
> > > -       maps__for_each_entry_safe(maps, pos, next) {
> > > -               rb_erase_init(&pos->rb_node,  maps__entries(maps));
> > > -               map__put(pos->map);
> > > -               free(pos);
> > > +       if (ret == 0 && RC_CHK_ACCESS(map_a) != RC_CHK_ACCESS(map_b)) {
> > > +               /* Ensure distinct but name equal maps have an order. */
> > > +               return map__start_cmp(a, b);
> > >         }
> > > +       return ret;
> > >  }
> > >
> > > -static void maps__exit(struct maps *maps)
> > > +static int maps__sort_by_name(struct maps *maps)
> > >  {
> > > +       int err = 0;
> > >         down_write(maps__lock(maps));
> > > -       __maps__purge(maps);
> > > +       if (!maps__maps_by_name_sorted(maps)) {
> > > +               struct map **maps_by_name = maps__maps_by_name(maps);
> > > +
> > > +               if (!maps_by_name) {
> > > +                       maps_by_name = malloc(RC_CHK_ACCESS(maps)->nr_maps_allocated *
> > > +                                       sizeof(*maps_by_name));
> > > +                       if (!maps_by_name)
> > > +                               err = -ENOMEM;
> > > +                       else {
> > > +                               struct map **maps_by_address = maps__maps_by_address(maps);
> > > +                               unsigned int n = maps__nr_maps(maps);
> > > +
> > > +                               maps__set_maps_by_name(maps, maps_by_name);
> > > +                               for (unsigned int i = 0; i < n; i++)
> > > +                                       maps_by_name[i] = map__get(maps_by_address[i]);
> > > +                       }
> > > +               }
> > > +               if (!err) {
> > > +                       qsort(maps_by_name,
> > > +                               maps__nr_maps(maps),
> > > +                               sizeof(struct map *),
> > > +                               map__strcmp);
> > > +                       maps__set_maps_by_name_sorted(maps, true);
> > > +               }
> > > +       }
> > >         up_write(maps__lock(maps));
> > > +       return err;
> > >  }
> > >
> > > -bool maps__empty(struct maps *maps)
> > > +static unsigned int maps__by_address_index(const struct maps *maps, const struct map *map)
> > >  {
> > > -       return !maps__first(maps);
> > > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > > +
> > > +       if (maps__maps_by_address_sorted(maps)) {
> > > +               struct map **mapp =
> > > +                       bsearch(&map, maps__maps_by_address(maps), maps__nr_maps(maps),
> > > +                               sizeof(*mapp), map__start_cmp);
> > > +
> > > +               if (mapp)
> > > +                       return mapp - maps_by_address;
> > > +       } else {
> > > +               for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> > > +                       if (RC_CHK_ACCESS(maps_by_address[i]) == RC_CHK_ACCESS(map))
> > > +                               return i;
> > > +               }
> > > +       }
> > > +       pr_err("Map missing from maps");
> > > +       return -1;
> > >  }
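The index recovery here relies on bsearch() returning a pointer into the array, so subtracting the array base yields the index (the `mapp - maps_by_address` expression). A minimal standalone sketch of the same trick, with hypothetical names:

```c
#include <assert.h>
#include <stdlib.h>

static int val_cmp(const void *a, const void *b)
{
	int ia = *(const int *)a, ib = *(const int *)b;

	return (ia > ib) - (ia < ib);
}

/* bsearch() returns a pointer into the sorted array; subtracting the
 * base recovers the element's index. Returns -1 when the key is absent. */
static int index_of(const int *arr, size_t n, int key)
{
	const int *p = bsearch(&key, arr, n, sizeof(*arr), val_cmp);

	return p ? (int)(p - arr) : -1;
}
```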
> > >
> > > -struct maps *maps__new(struct machine *machine)
> > > +static unsigned int maps__by_name_index(const struct maps *maps, const struct map *map)
> > >  {
> > > -       struct maps *result;
> > > -       RC_STRUCT(maps) *maps = zalloc(sizeof(*maps));
> > > +       struct map **maps_by_name = maps__maps_by_name(maps);
> > > +
> > > +       if (maps__maps_by_name_sorted(maps)) {
> > > +               struct map **mapp =
> > > +                       bsearch(&map, maps_by_name, maps__nr_maps(maps),
> > > +                               sizeof(*mapp), map__strcmp);
> > > +
> > > +               if (mapp)
> > > +                       return mapp - maps_by_name;
> > > +       } else {
> > > +               for (unsigned int i = 0; i < maps__nr_maps(maps); i++) {
> > > +                       if (RC_CHK_ACCESS(maps_by_name[i]) == RC_CHK_ACCESS(map))
> > > +                               return i;
> > > +               }
> > > +       }
> > > +       pr_err("Map missing from maps");
> > > +       return -1;
> > > +}
> > >
> > > -       if (ADD_RC_CHK(result, maps))
> > > -               maps__init(result, machine);
> > > +static int __maps__insert(struct maps *maps, struct map *new)
> > > +{
> > > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > > +       struct map **maps_by_name = maps__maps_by_name(maps);
> > > +       const struct dso *dso = map__dso(new);
> > > +       unsigned int nr_maps = maps__nr_maps(maps);
> > > +       unsigned int nr_allocate = RC_CHK_ACCESS(maps)->nr_maps_allocated;
> > > +
> > > +       if (nr_maps + 1 > nr_allocate) {
> > > +               nr_allocate = !nr_allocate ? 32 : nr_allocate * 2;
> > > +
> > > +               maps_by_address = realloc(maps_by_address, nr_allocate * sizeof(new));
> > > +               if (!maps_by_address)
> > > +                       return -ENOMEM;
> > > +
> > > +               maps__set_maps_by_address(maps, maps_by_address);
> > > +               if (maps_by_name) {
> > > +                       maps_by_name = realloc(maps_by_name, nr_allocate * sizeof(new));
> > > +                       if (!maps_by_name) {
> > > +                               /*
> > > +                                * If the by-name resize fails, just free the
> > > +                                * by-name array; it will be recomputed the
> > > +                                * next time it is required.
> > > +                                */
> > > +                               __maps__free_maps_by_name(maps);
> > > +                       }
> > > +                       maps__set_maps_by_name(maps, maps_by_name);
> > > +               }
> > > +               RC_CHK_ACCESS(maps)->nr_maps_allocated = nr_allocate;
> > > +       }
> > > +       /* Insert the value at the end. */
> > > +       maps_by_address[nr_maps] = map__get(new);
> > > +       if (maps_by_name)
> > > +               maps_by_name[nr_maps] = map__get(new);
> > >
> > > -       return result;
> > > +       nr_maps++;
> > > +       RC_CHK_ACCESS(maps)->nr_maps = nr_maps;
> > > +
> > > +       /*
> > > +        * Recompute if things are sorted. If things are inserted in a sorted
> > > +        * manner, for example by processing /proc/pid/maps, then no
> > > +        * sorting/resorting will be necessary.
> > > +        */
> > > +       if (nr_maps == 1) {
> > > +               /* If there's just 1 entry then maps are sorted. */
> > > +               maps__set_maps_by_address_sorted(maps, true);
> > > +               maps__set_maps_by_name_sorted(maps, maps_by_name != NULL);
> > > +       } else {
> > > +               /* Sorted if maps were already sorted and this map starts after the last one. */
> > > +               maps__set_maps_by_address_sorted(maps,
> > > +                       maps__maps_by_address_sorted(maps) &&
> > > +                       map__end(maps_by_address[nr_maps - 2]) <= map__start(new));
> > > +               maps__set_maps_by_name_sorted(maps, false);
> > > +       }
> > > +       if (map__end(new) < map__start(new))
> > > +               RC_CHK_ACCESS(maps)->ends_broken = true;
> > > +       if (dso && dso->kernel) {
> > > +               struct kmap *kmap = map__kmap(new);
> > > +
> > > +               if (kmap)
> > > +                       kmap->kmaps = maps;
> > > +               else
> > > +                       pr_err("Internal error: kernel dso with non kernel map\n");
> > > +       }
> > > +       check_invariants(maps);
> >
> > Probably not needed as it's checked when you get the lock below.
>
> Ack. Will remove in v3.

s/v3/v8/g :-)

Ian

> >
> > > +       return 0;
> > >  }
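The growth policy in `__maps__insert` (start at 32 slots, then double) gives amortized O(1) appends. A standalone sketch of just that policy on a plain int array, with hypothetical names (`struct vec`, `vec_push`):

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct vec {
	int *data;
	unsigned int nr, nr_alloc;
};

/* Append with the same growth policy as __maps__insert(): start at 32
 * slots, then double on each exhaustion, so n appends need only
 * O(log n) reallocations in total. */
static int vec_push(struct vec *v, int val)
{
	if (v->nr + 1 > v->nr_alloc) {
		unsigned int nr_alloc = !v->nr_alloc ? 32 : v->nr_alloc * 2;
		int *data = realloc(v->data, nr_alloc * sizeof(*data));

		if (!data)
			return -ENOMEM;
		v->data = data;
		v->nr_alloc = nr_alloc;
	}
	v->data[v->nr++] = val;
	return 0;
}
```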
> > >
> > > -static void maps__delete(struct maps *maps)
> > > +int maps__insert(struct maps *maps, struct map *map)
> > >  {
> > > -       maps__exit(maps);
> > > -       unwind__finish_access(maps);
> > > -       RC_CHK_FREE(maps);
> > > +       int ret;
> > > +
> > > +       down_write(maps__lock(maps));
> > > +       ret = __maps__insert(maps, map);
> > > +       up_write(maps__lock(maps));
> > > +       return ret;
> > >  }
> > >
> > > -struct maps *maps__get(struct maps *maps)
> > > +static void __maps__remove(struct maps *maps, struct map *map)
> > >  {
> > > -       struct maps *result;
> > > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > > +       struct map **maps_by_name = maps__maps_by_name(maps);
> > > +       unsigned int nr_maps = maps__nr_maps(maps);
> > > +       unsigned int address_idx;
> > > +
> > > +       /* Slide later mappings over the one to remove */
> > > +       address_idx = maps__by_address_index(maps, map);
> > > +       map__put(maps_by_address[address_idx]);
> > > +       memmove(&maps_by_address[address_idx],
> > > +               &maps_by_address[address_idx + 1],
> > > +               (nr_maps - address_idx - 1) * sizeof(*maps_by_address));
> > > +
> > > +       if (maps_by_name) {
> > > +               unsigned int name_idx = maps__by_name_index(maps, map);
> > > +
> > > +               map__put(maps_by_name[name_idx]);
> > > +               memmove(&maps_by_name[name_idx],
> > > +                       &maps_by_name[name_idx + 1],
> > > +                       (nr_maps - name_idx - 1) * sizeof(*maps_by_name));
> > > +       }
> > >
> > > -       if (RC_CHK_GET(result, maps))
> > > -               refcount_inc(maps__refcnt(maps));
> > > +       --RC_CHK_ACCESS(maps)->nr_maps;
> > > +       check_invariants(maps);
> >
> > Ditto.
>
> Ack.
>
> > > +}
> > >
> > > -       return result;
> > > +void maps__remove(struct maps *maps, struct map *map)
> > > +{
> > > +       down_write(maps__lock(maps));
> > > +       __maps__remove(maps, map);
> > > +       up_write(maps__lock(maps));
> > >  }
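The memmove-based removal keeps the survivors in their original relative order, so a sorted array stays sorted with no re-sort. A sketch of the slide-down on a plain int array (`remove_at` is a made-up name):

```c
#include <assert.h>
#include <string.h>

/* Remove the element at idx from an array of nr elements by sliding the
 * tail down one slot, as __maps__remove() does. Relative order of the
 * remaining elements is preserved. Returns the new element count. */
static unsigned int remove_at(int *arr, unsigned int nr, unsigned int idx)
{
	memmove(&arr[idx], &arr[idx + 1], (nr - idx - 1) * sizeof(*arr));
	return nr - 1;
}
```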
> > >
> > > -void maps__put(struct maps *maps)
> > > +bool maps__empty(struct maps *maps)
> > >  {
> > > -       if (maps && refcount_dec_and_test(maps__refcnt(maps)))
> > > -               maps__delete(maps);
> > > -       else
> > > -               RC_CHK_PUT(maps);
> > > +       return maps__nr_maps(maps) == 0;
> > >  }
> > >
> > >  int maps__for_each_map(struct maps *maps, int (*cb)(struct map *map, void *data), void *data)
> > >  {
> > > -       struct map_rb_node *pos;
> > > +       bool done = false;
> > >         int ret = 0;
> > >
> > > -       down_read(maps__lock(maps));
> > > -       maps__for_each_entry(maps, pos) {
> > > -               ret = cb(pos->map, data);
> > > -               if (ret)
> > > -                       break;
> > > +       /* See locking/sorting note. */
> > > +       while (!done) {
> > > +               down_read(maps__lock(maps));
> > > +               if (maps__maps_by_address_sorted(maps)) {
> > > +                       struct map **maps_by_address = maps__maps_by_address(maps);
> > > +                       unsigned int n = maps__nr_maps(maps);
> > > +
> > > +                       for (unsigned int i = 0; i < n; i++) {
> > > +                               struct map *map = maps_by_address[i];
> > > +
> > > +                               ret = cb(map, data);
> > > +                               if (ret)
> > > +                                       break;
> > > +                       }
> > > +                       done = true;
> > > +               }
> > > +               up_read(maps__lock(maps));
> > > +               if (!done)
> > > +                       maps__sort_by_address(maps);
> > >         }
> > > -       up_read(maps__lock(maps));
> > >         return ret;
> > >  }
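The "see locking/sorting note" loop only walks the array under the read lock when it is already sorted; otherwise it drops the lock, sorts (taking the write lock), and retries. A single-threaded sketch of that shape with the locks elided to comments (all names hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct lazy_arr {
	int vals[8];
	unsigned int nr;
	bool sorted;
};

static int lazy_cmp(const void *a, const void *b)
{
	int ia = *(const int *)a, ib = *(const int *)b;

	return (ia > ib) - (ia < ib);
}

/* Would run under the write lock in the real code. */
static void lazy_sort(struct lazy_arr *a)
{
	if (!a->sorted) {
		qsort(a->vals, a->nr, sizeof(int), lazy_cmp);
		a->sorted = true;
	}
}

/* Shape of maps__for_each_map(): iterate only when sorted, else drop
 * the (elided) read lock, sort, and retry. */
static int lazy_for_each(struct lazy_arr *a, int (*cb)(int, void *), void *data)
{
	bool done = false;
	int ret = 0;

	while (!done) {
		/* down_read(lock) */
		if (a->sorted) {
			for (unsigned int i = 0; i < a->nr && !ret; i++)
				ret = cb(a->vals[i], data);
			done = true;
		}
		/* up_read(lock) */
		if (!done)
			lazy_sort(a);
	}
	return ret;
}

/* Callback: stop iteration if values are not seen in ascending order. */
static int check_ascending(int v, void *data)
{
	int *last = data;

	if (v < *last)
		return -1;
	*last = v;
	return 0;
}
```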
> > >
> > >  void maps__remove_maps(struct maps *maps, bool (*cb)(struct map *map, void *data), void *data)
> > >  {
> > > -       struct map_rb_node *pos, *next;
> > > -       unsigned int start_nr_maps;
> > > +       struct map **maps_by_address;
> > >
> > >         down_write(maps__lock(maps));
> > >
> > > -       start_nr_maps = maps__nr_maps(maps);
> > > -       maps__for_each_entry_safe(maps, pos, next)      {
> > > -               if (cb(pos->map, data)) {
> > > -                       __maps__remove(maps, pos);
> > > -                       --RC_CHK_ACCESS(maps)->nr_maps;
> > > -               }
> > > +       maps_by_address = maps__maps_by_address(maps);
> > > +       for (unsigned int i = 0; i < maps__nr_maps(maps);) {
> > > +               if (cb(maps_by_address[i], data))
> > > +                       __maps__remove(maps, maps_by_address[i]);
> > > +               else
> > > +                       i++;
> > >         }
> > > -       if (maps__maps_by_name(maps) && start_nr_maps != maps__nr_maps(maps))
> > > -               __maps__free_maps_by_name(maps);
> > > -
> > >         up_write(maps__lock(maps));
> > >  }
> > >
> > > @@ -300,7 +491,7 @@ struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp)
> > >         /* Ensure map is loaded before using map->map_ip */
> > >         if (map != NULL && map__load(map) >= 0) {
> > >                 if (mapp != NULL)
> > > -                       *mapp = map;
> > > +                       *mapp = map; // TODO: map_put on else path when find returns a get.
> > >                 return map__find_symbol(map, map__map_ip(map, addr));
> > >         }
> > >
> > > @@ -348,7 +539,7 @@ int maps__find_ams(struct maps *maps, struct addr_map_symbol *ams)
> > >         if (ams->addr < map__start(ams->ms.map) || ams->addr >= map__end(ams->ms.map)) {
> > >                 if (maps == NULL)
> > >                         return -1;
> > > -               ams->ms.map = maps__find(maps, ams->addr);
> > > +               ams->ms.map = maps__find(maps, ams->addr);  // TODO: map_get
> > >                 if (ams->ms.map == NULL)
> > >                         return -1;
> > >         }
> > > @@ -393,24 +584,28 @@ size_t maps__fprintf(struct maps *maps, FILE *fp)
> > >   * Find first map where end > map->start.
> > >   * Same as find_vma() in kernel.
> > >   */
> > > -static struct rb_node *first_ending_after(struct maps *maps, const struct map *map)
> > > +static unsigned int first_ending_after(struct maps *maps, const struct map *map)
> > >  {
> > > -       struct rb_root *root;
> > > -       struct rb_node *next, *first;
> > > +       struct map **maps_by_address = maps__maps_by_address(maps);
> > > +       int low = 0, high = (int)maps__nr_maps(maps) - 1, first = high + 1;
> > > +
> > > +       assert(maps__maps_by_address_sorted(maps));
> > > +       if (low <= high && map__end(maps_by_address[0]) > map__start(map))
> > > +               return 0;
> > >
> > > -       root = maps__entries(maps);
> > > -       next = root->rb_node;
> > > -       first = NULL;
> > > -       while (next) {
> > > -               struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
> > > +       while (low <= high) {
> > > +               int mid = (low + high) / 2;
> > > +               struct map *pos = maps_by_address[mid];
> > >
> > > -               if (map__end(pos->map) > map__start(map)) {
> > > -                       first = next;
> > > -                       if (map__start(pos->map) <= map__start(map))
> > > +               if (map__end(pos) > map__start(map)) {
> > > +                       first = mid;
> > > +                       if (map__start(pos) <= map__start(map)) {
> > > +                               /* Entry overlaps map. */
> > >                                 break;
> > > -                       next = next->rb_left;
> > > +                       }
> > > +                       high = mid - 1;
> > >                 } else
> > > -                       next = next->rb_right;
> > > +                       low = mid + 1;
> > >         }
> > >         return first;
> > >  }
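The loop above is a lower-bound binary search: it remembers the best candidate so far in `first` and stops early when an entry overlaps the query start. A standalone sketch over a plain interval array (hypothetical `struct itv`), assuming sorted, non-overlapping intervals:

```c
#include <assert.h>

struct itv { unsigned long start, end; };

/* Find the index of the first interval whose end is > addr, the
 * lower-bound shape used by first_ending_after(). Returns n when no
 * such interval exists. */
static int first_ending_after_idx(const struct itv *v, int n, unsigned long addr)
{
	int low = 0, high = n - 1, first = n;

	while (low <= high) {
		int mid = (low + high) / 2;

		if (v[mid].end > addr) {
			first = mid;
			if (v[mid].start <= addr)
				break;	/* entry overlaps addr directly */
			high = mid - 1;
		} else
			low = mid + 1;
	}
	return first;
}
```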
> > > @@ -419,171 +614,249 @@ static struct rb_node *first_ending_after(struct maps *maps, const struct map *m
> > >   * Adds new to maps, if new overlaps existing entries then the existing maps are
> > >   * adjusted or removed so that new fits without overlapping any entries.
> > >   */
> > > -int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
> > > +static int __maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
> > >  {
> > > -
> > > -       struct rb_node *next;
> > > +       struct map **maps_by_address;
> > >         int err = 0;
> > >         FILE *fp = debug_file();
> > >
> > > -       down_write(maps__lock(maps));
> > > +sort_again:
> > > +       if (!maps__maps_by_address_sorted(maps))
> > > +               __maps__sort_by_address(maps);
> > >
> > > -       next = first_ending_after(maps, new);
> > > -       while (next && !err) {
> > > -               struct map_rb_node *pos = rb_entry(next, struct map_rb_node, rb_node);
> > > -               next = rb_next(&pos->rb_node);
> > > +       maps_by_address = maps__maps_by_address(maps);
> > > +       /*
> > > +        * Iterate through entries where the end of the existing entry is
> > > +        * greater than the new map's start.
> > > +        */
> > > +       for (unsigned int i = first_ending_after(maps, new); i < maps__nr_maps(maps); ) {
> > > +               struct map *pos = maps_by_address[i];
> > > +               struct map *before = NULL, *after = NULL;
> > >
> > >                 /*
> > >                  * Stop if current map starts after map->end.
> > >                  * Maps are ordered by start: next will not overlap for sure.
> > >                  */
> > > -               if (map__start(pos->map) >= map__end(new))
> > > +               if (map__start(pos) >= map__end(new))
> > >                         break;
> > >
> > > -               if (verbose >= 2) {
> > > -
> > > -                       if (use_browser) {
> > > -                               pr_debug("overlapping maps in %s (disable tui for more info)\n",
> > > -                                        map__dso(new)->name);
> > > -                       } else {
> > > -                               pr_debug("overlapping maps:\n");
> > > -                               map__fprintf(new, fp);
> > > -                               map__fprintf(pos->map, fp);
> > > -                       }
> > > +               if (use_browser) {
> > > +                       pr_debug("overlapping maps in %s (disable tui for more info)\n",
> > > +                               map__dso(new)->name);
> > > +               } else if (verbose >= 2) {
> > > +                       pr_debug("overlapping maps:\n");
> > > +                       map__fprintf(new, fp);
> > > +                       map__fprintf(pos, fp);
> > >                 }
> > >
> > > -               rb_erase_init(&pos->rb_node, maps__entries(maps));
> > >                 /*
> > >                  * Now check if we need to create new maps for areas not
> > >                  * overlapped by the new map:
> > >                  */
> > > -               if (map__start(new) > map__start(pos->map)) {
> > > -                       struct map *before = map__clone(pos->map);
> > > +               if (map__start(new) > map__start(pos)) {
> > > +                       /* Map starts within existing map. Need to shorten the existing map. */
> > > +                       before = map__clone(pos);
> > >
> > >                         if (before == NULL) {
> > >                                 err = -ENOMEM;
> > > -                               goto put_map;
> > > +                               goto out_err;
> > >                         }
> > > -
> > >                         map__set_end(before, map__start(new));
> > > -                       err = __maps__insert(maps, before);
> > > -                       if (err) {
> > > -                               map__put(before);
> > > -                               goto put_map;
> > > -                       }
> > >
> > >                         if (verbose >= 2 && !use_browser)
> > >                                 map__fprintf(before, fp);
> > > -                       map__put(before);
> > >                 }
> > > -
> > > -               if (map__end(new) < map__end(pos->map)) {
> > > -                       struct map *after = map__clone(pos->map);
> > > +               if (map__end(new) < map__end(pos)) {
> > > +                       /* The new map isn't as long as the existing map. */
> > > +                       after = map__clone(pos);
> > >
> > >                         if (after == NULL) {
> > > +                               map__zput(before);
> > >                                 err = -ENOMEM;
> > > -                               goto put_map;
> > > +                               goto out_err;
> > >                         }
> > >
> > >                         map__set_start(after, map__end(new));
> > > -                       map__add_pgoff(after, map__end(new) - map__start(pos->map));
> > > -                       assert(map__map_ip(pos->map, map__end(new)) ==
> > > -                               map__map_ip(after, map__end(new)));
> > > -                       err = __maps__insert(maps, after);
> > > -                       if (err) {
> > > -                               map__put(after);
> > > -                               goto put_map;
> > > -                       }
> > > +                       map__add_pgoff(after, map__end(new) - map__start(pos));
> > > +                       assert(map__map_ip(pos, map__end(new)) ==
> > > +                              map__map_ip(after, map__end(new)));
> > > +
> > >                         if (verbose >= 2 && !use_browser)
> > >                                 map__fprintf(after, fp);
> > > -                       map__put(after);
> > >                 }
> > > -put_map:
> > > -               map__put(pos->map);
> > > -               free(pos);
> > > +               /*
> > > +                * If only one of `before` or `after` is needed, it can
> > > +                * replace the existing entry. If both `before` and `after`
> > > +                * are necessary then an insert is needed. If the new entry
> > > +                * entirely overlaps the existing entry it can just be removed.
> > > +                */
> > > +               if (before) {
> > > +                       map__put(maps_by_address[i]);
> > > +                       maps_by_address[i] = before;
> > > +                       /* Maps are still ordered, go to next one. */
> > > +                       i++;
> > > +                       if (after) {
> > > +                               __maps__insert(maps, after);
> > > +                               map__put(after);
> > > +                               if (!maps__maps_by_address_sorted(maps)) {
> > > +                                       /*
> > > +                                        * Sorting broken so invariants don't
> > > +                                        * hold, sort and go again.
> > > +                                        */
> > > +                                       goto sort_again;
> > > +                               }
> > > +                               /*
> > > +                                * Maps are still ordered, skip after and go to
> > > +                                * next one (terminate loop).
> > > +                                */
> > > +                               i++;
> > > +                       }
> > > +               } else if (after) {
> > > +                       map__put(maps_by_address[i]);
> > > +                       maps_by_address[i] = after;
> > > +                       /* Maps are ordered, go to next one. */
> > > +                       i++;
> > > +               } else {
> > > +                       __maps__remove(maps, pos);
> > > +                       /*
> > > +                        * Maps are ordered but no need to increase `i` as the
> > > +                        * later maps were moved down.
> > > +                        */
> > > +               }
> > > +               check_invariants(maps);
> > >         }
> > >         /* Add the map. */
> > > -       err = __maps__insert(maps, new);
> > > -       up_write(maps__lock(maps));
> > > +       __maps__insert(maps, new);
> > > +out_err:
> > >         return err;
> > >  }
> > >
> > > -int maps__copy_from(struct maps *maps, struct maps *parent)
> > > +int maps__fixup_overlap_and_insert(struct maps *maps, struct map *new)
> > >  {
> > >         int err;
> > > -       struct map_rb_node *rb_node;
> > >
> > > +       down_write(maps__lock(maps));
> > > +       err = __maps__fixup_overlap_and_insert(maps, new);
> > > +       up_write(maps__lock(maps));
> > > +       return err;
> > > +}
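The overlap repair above reduces to interval arithmetic: an existing mapping overlapped by a new one survives as a `before` piece, an `after` piece, both, or neither. A sketch of just that computation, with made-up names:

```c
#include <assert.h>

struct piece { unsigned long start, end; };

/* Given an existing interval [estart, eend) overlapped by a new one
 * [nstart, nend), compute the surviving pieces of the existing
 * interval, mirroring the before/after clones in
 * __maps__fixup_overlap_and_insert(). Returns the piece count:
 * 0 means the existing interval was entirely covered. */
static int split_overlap(unsigned long estart, unsigned long eend,
			 unsigned long nstart, unsigned long nend,
			 struct piece out[2])
{
	int n = 0;

	if (nstart > estart)	/* head survives as `before` */
		out[n++] = (struct piece){ estart, nstart };
	if (nend < eend)	/* tail survives as `after` */
		out[n++] = (struct piece){ nend, eend };
	return n;
}
```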
> > > +
> > > +int maps__copy_from(struct maps *dest, struct maps *parent)
> > > +{
> > > +       /* Note, if struct map were immutable then cloning could use ref counts. */
> > > +       struct map **parent_maps_by_address;
> > > +       int err = 0;
> > > +       unsigned int n;
> > > +
> > > +       down_write(maps__lock(dest));
> > >         down_read(maps__lock(parent));
> > >
> > > -       maps__for_each_entry(parent, rb_node) {
> > > -               struct map *new = map__clone(rb_node->map);
> > > +       parent_maps_by_address = maps__maps_by_address(parent);
> > > +       n = maps__nr_maps(parent);
> > > +       if (maps__empty(dest)) {
> > > +               /* No existing mappings so just copy from parent to avoid reallocs in insert. */
> > > +               unsigned int nr_maps_allocated = RC_CHK_ACCESS(parent)->nr_maps_allocated;
> > > +               struct map **dest_maps_by_address =
> > > +                       malloc(nr_maps_allocated * sizeof(struct map *));
> > > +               struct map **dest_maps_by_name = NULL;
> > >
> > > -               if (new == NULL) {
> > > +               if (!dest_maps_by_address)
> > >                         err = -ENOMEM;
> > > -                       goto out_unlock;
> > > +               else {
> > > +                       if (maps__maps_by_name(parent)) {
> > > +                               dest_maps_by_name =
> > > +                                       malloc(nr_maps_allocated * sizeof(struct map *));
> > > +                       }
> > > +
> > > +                       RC_CHK_ACCESS(dest)->maps_by_address = dest_maps_by_address;
> > > +                       RC_CHK_ACCESS(dest)->maps_by_name = dest_maps_by_name;
> > > +                       RC_CHK_ACCESS(dest)->nr_maps_allocated = nr_maps_allocated;
> > >                 }
> > >
> > > -               err = unwind__prepare_access(maps, new, NULL);
> > > -               if (err)
> > > -                       goto out_unlock;
> > > +               for (unsigned int i = 0; !err && i < n; i++) {
> > > +                       struct map *pos = parent_maps_by_address[i];
> > > +                       struct map *new = map__clone(pos);
> > >
> > > -               err = maps__insert(maps, new);
> > > -               if (err)
> > > -                       goto out_unlock;
> > > +                       if (!new)
> > > +                               err = -ENOMEM;
> > > +                       else {
> > > +                               err = unwind__prepare_access(dest, new, NULL);
> > > +                               if (!err) {
> > > +                                       dest_maps_by_address[i] = new;
> > > +                                       if (dest_maps_by_name)
> > > +                                               dest_maps_by_name[i] = map__get(new);
> > > +                                       RC_CHK_ACCESS(dest)->nr_maps = i + 1;
> > > +                               }
> > > +                       }
> > > +                       if (err)
> > > +                               map__put(new);
> > > +               }
> > > +               maps__set_maps_by_address_sorted(dest, maps__maps_by_address_sorted(parent));
> > > +               if (!err) {
> > > +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx =
> > > +                               RC_CHK_ACCESS(parent)->last_search_by_name_idx;
> > > +                       maps__set_maps_by_name_sorted(dest,
> > > +                                               dest_maps_by_name &&
> > > +                                               maps__maps_by_name_sorted(parent));
> > > +               } else {
> > > +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx = 0;
> > > +                       maps__set_maps_by_name_sorted(dest, false);
> > > +               }
> > > +       } else {
> > > +               /* Unexpected copying to a maps containing entries. */
> > > +               for (unsigned int i = 0; !err && i < n; i++) {
> > > +                       struct map *pos = parent_maps_by_address[i];
> > > +                       struct map *new = map__clone(pos);
> > >
> > > -               map__put(new);
> > > +                       if (!new)
> > > +                               err = -ENOMEM;
> > > +                       else {
> > > +                               err = unwind__prepare_access(dest, new, NULL);
> > > +                               if (!err)
> > > +                                       err = maps__insert(dest, new);
> >
> > Shouldn't it be __maps__insert()?
>
> On entry the read lock is taken on parent but no lock is taken on dest
> so the locked version is used.
>
> > > +                       }
> > > +                       map__put(new);
> > > +               }
> > >         }
> > > -
> > > -       err = 0;
> > > -out_unlock:
> > >         up_read(maps__lock(parent));
> > > +       up_write(maps__lock(dest));
> > >         return err;
> > >  }
> > >
> > > -struct map *maps__find(struct maps *maps, u64 ip)
> > > +static int map__addr_cmp(const void *key, const void *entry)
> > >  {
> > > -       struct rb_node *p;
> > > -       struct map_rb_node *m;
> > > -
> > > +       const u64 ip = *(const u64 *)key;
> > > +       const struct map *map = *(const struct map * const *)entry;
> > >
> > > -       down_read(maps__lock(maps));
> > > -
> > > -       p = maps__entries(maps)->rb_node;
> > > -       while (p != NULL) {
> > > -               m = rb_entry(p, struct map_rb_node, rb_node);
> > > -               if (ip < map__start(m->map))
> > > -                       p = p->rb_left;
> > > -               else if (ip >= map__end(m->map))
> > > -                       p = p->rb_right;
> > > -               else
> > > -                       goto out;
> > > -       }
> > > -
> > > -       m = NULL;
> > > -out:
> > > -       up_read(maps__lock(maps));
> > > -       return m ? m->map : NULL;
> > > +       if (ip < map__start(map))
> > > +               return -1;
> > > +       if (ip >= map__end(map))
> > > +               return 1;
> > > +       return 0;
> > >  }
> > >
> > > -static int map__strcmp(const void *a, const void *b)
> > > +struct map *maps__find(struct maps *maps, u64 ip)
> > >  {
> > > -       const struct map *map_a = *(const struct map **)a;
> > > -       const struct map *map_b = *(const struct map **)b;
> > > -       const struct dso *dso_a = map__dso(map_a);
> > > -       const struct dso *dso_b = map__dso(map_b);
> > > -       int ret = strcmp(dso_a->short_name, dso_b->short_name);
> > > -
> > > -       if (ret == 0 && map_a != map_b) {
> > > -               /*
> > > -                * Ensure distinct but name equal maps have an order in part to
> > > -                * aid reference counting.
> > > -                */
> > > -               ret = (int)map__start(map_a) - (int)map__start(map_b);
> > > -               if (ret == 0)
> > > -                       ret = (int)((intptr_t)map_a - (intptr_t)map_b);
> > > +       struct map *result = NULL;
> > > +       bool done = false;
> > > +
> > > +       /* See locking/sorting note. */
> > > +       while (!done) {
> > > +               down_read(maps__lock(maps));
> > > +               if (maps__maps_by_address_sorted(maps)) {
> > > +                       struct map **mapp =
> > > +                               bsearch(&ip, maps__maps_by_address(maps), maps__nr_maps(maps),
> > > +                                       sizeof(*mapp), map__addr_cmp);
> > > +
> > > +                       if (mapp)
> > > +                               result = *mapp; // map__get(*mapp);
> > > +                       done = true;
> > > +               }
> > > +               up_read(maps__lock(maps));
> > > +               if (!done)
> > > +                       maps__sort_by_address(maps);
> > >         }
> > > -
> > > -       return ret;
> > > +       return result;
> > >  }
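The comparator here uses bsearch()'s key-vs-entry form: the key is a bare address, not an element of the array, and the return value steers the search until the address falls inside an entry. A standalone sketch over an array of ranges (the patch searches an array of `struct map *`, so its comparator dereferences one more level):

```c
#include <assert.h>
#include <stdlib.h>

struct range { unsigned long start, end; };

/* Key-vs-entry comparator like map__addr_cmp(): <0 sends the search
 * left, >0 right, 0 means start <= ip < end. */
static int addr_cmp(const void *key, const void *entry)
{
	unsigned long ip = *(const unsigned long *)key;
	const struct range *r = entry;

	if (ip < r->start)
		return -1;
	if (ip >= r->end)
		return 1;
	return 0;
}

/* Find the range containing ip in a sorted, non-overlapping array. */
static const struct range *range_find(const struct range *v, size_t n, unsigned long ip)
{
	return bsearch(&ip, v, n, sizeof(*v), addr_cmp);
}
```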
> > >
> > >  static int map__strcmp_name(const void *name, const void *b)
> > > @@ -593,126 +866,113 @@ static int map__strcmp_name(const void *name, const void *b)
> > >         return strcmp(name, dso->short_name);
> > >  }
> > >
> > > -void __maps__sort_by_name(struct maps *maps)
> > > -{
> > > -       qsort(maps__maps_by_name(maps), maps__nr_maps(maps), sizeof(struct map *), map__strcmp);
> > > -}
> > > -
> > > -static int map__groups__sort_by_name_from_rbtree(struct maps *maps)
> > > -{
> > > -       struct map_rb_node *rb_node;
> > > -       struct map **maps_by_name = realloc(maps__maps_by_name(maps),
> > > -                                           maps__nr_maps(maps) * sizeof(struct map *));
> > > -       int i = 0;
> > > -
> > > -       if (maps_by_name == NULL)
> > > -               return -1;
> > > -
> > > -       up_read(maps__lock(maps));
> > > -       down_write(maps__lock(maps));
> > > -
> > > -       RC_CHK_ACCESS(maps)->maps_by_name = maps_by_name;
> > > -       RC_CHK_ACCESS(maps)->nr_maps_allocated = maps__nr_maps(maps);
> > > -
> > > -       maps__for_each_entry(maps, rb_node)
> > > -               maps_by_name[i++] = map__get(rb_node->map);
> > > -
> > > -       __maps__sort_by_name(maps);
> > > -
> > > -       up_write(maps__lock(maps));
> > > -       down_read(maps__lock(maps));
> > > -
> > > -       return 0;
> > > -}
> > > -
> > > -static struct map *__maps__find_by_name(struct maps *maps, const char *name)
> > > +struct map *maps__find_by_name(struct maps *maps, const char *name)
> > >  {
> > > -       struct map **mapp;
> > > +       struct map *result = NULL;
> > > +       bool done = false;
> > >
> > > -       if (maps__maps_by_name(maps) == NULL &&
> > > -           map__groups__sort_by_name_from_rbtree(maps))
> > > -               return NULL;
> > > +       /* See locking/sorting note. */
> > > +       while (!done) {
> > > +               unsigned int i;
> > >
> > > -       mapp = bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
> > > -                      sizeof(*mapp), map__strcmp_name);
> > > -       if (mapp)
> > > -               return *mapp;
> > > -       return NULL;
> > > -}
> > > +               down_read(maps__lock(maps));
> > >
> > > -struct map *maps__find_by_name(struct maps *maps, const char *name)
> > > -{
> > > -       struct map_rb_node *rb_node;
> > > -       struct map *map;
> > > -
> > > -       down_read(maps__lock(maps));
> > > +               /* First check last found entry. */
> > > +               i = RC_CHK_ACCESS(maps)->last_search_by_name_idx;
> > > +               if (i < maps__nr_maps(maps) && maps__maps_by_name(maps)) {
> > > +                       struct dso *dso = map__dso(maps__maps_by_name(maps)[i]);
> > >
> > > +                       if (dso && strcmp(dso->short_name, name) == 0) {
> > > +                               result = maps__maps_by_name(maps)[i]; // TODO: map__get
> > > +                               done = true;
> > > +                       }
> > > +               }
> > >
> > > -       if (RC_CHK_ACCESS(maps)->last_search_by_name) {
> > > -               const struct dso *dso = map__dso(RC_CHK_ACCESS(maps)->last_search_by_name);
> > > +               /* Second search sorted array. */
> > > +               if (!done && maps__maps_by_name_sorted(maps)) {
> > > +                       struct map **mapp =
> > > +                               bsearch(name, maps__maps_by_name(maps), maps__nr_maps(maps),
> > > +                                       sizeof(*mapp), map__strcmp_name);
> > >
> > > -               if (strcmp(dso->short_name, name) == 0) {
> > > -                       map = RC_CHK_ACCESS(maps)->last_search_by_name;
> > > -                       goto out_unlock;
> > > +                       if (mapp) {
> > > +                               result = *mapp; // TODO: map__get
> > > +                               i = mapp - maps__maps_by_name(maps);
> > > +                               RC_CHK_ACCESS(maps)->last_search_by_name_idx = i;
> > > +                       }
> > > +                       done = true;
> > >                 }
> > > -       }
> > > -       /*
> > > -        * If we have maps->maps_by_name, then the name isn't in the rbtree,
> > > -        * as maps->maps_by_name mirrors the rbtree when lookups by name are
> > > -        * made.
> > > -        */
> > > -       map = __maps__find_by_name(maps, name);
> > > -       if (map || maps__maps_by_name(maps) != NULL)
> > > -               goto out_unlock;
> > > -
> > > -       /* Fallback to traversing the rbtree... */
> > > -       maps__for_each_entry(maps, rb_node) {
> > > -               struct dso *dso;
> > > -
> > > -               map = rb_node->map;
> > > -               dso = map__dso(map);
> > > -               if (strcmp(dso->short_name, name) == 0) {
> > > -                       RC_CHK_ACCESS(maps)->last_search_by_name = map;
> > > -                       goto out_unlock;
> > > +               up_read(maps__lock(maps));
> > > +               if (!done) {
> > > +                       /* Sort and retry binary search. */
> > > +                       if (maps__sort_by_name(maps)) {
> > > +                               /*
> > > +                                * Memory allocation failed, so do a linear
> > > +                                * search through the address-sorted maps.
> > > +                                */
> > > +                               struct map **maps_by_address;
> > > +                               unsigned int n;
> > > +
> > > +                               down_read(maps__lock(maps));
> > > +                               maps_by_address = maps__maps_by_address(maps);
> > > +                               n = maps__nr_maps(maps);
> > > +                               for (i = 0; i < n; i++) {
> > > +                                       struct map *pos = maps_by_address[i];
> > > +                                       struct dso *dso = map__dso(pos);
> > > +
> > > +                                       if (dso && strcmp(dso->short_name, name) == 0) {
> > > +                                               result = pos; // TODO: map__get
> > > +                                               break;
> > > +                                       }
> > > +                               }
> > > +                               up_read(maps__lock(maps));
> > > +                               done = true;
> > > +                       }
> > >                 }
> > >         }
> > > -       map = NULL;
> > > -
> > > -out_unlock:
> > > -       up_read(maps__lock(maps));
> > > -       return map;
> > > +       return result;
> > >  }
> > >
> > >  struct map *maps__find_next_entry(struct maps *maps, struct map *map)
> > >  {
> > > -       struct map_rb_node *rb_node = maps__find_node(maps, map);
> > > -       struct map_rb_node *next = map_rb_node__next(rb_node);
> > > +       unsigned int i;
> > > +       struct map *result = NULL;
> > >
> > > -       if (next)
> > > -               return next->map;
> > > +       down_read(maps__lock(maps));
> > > +       i = maps__by_address_index(maps, map);
> > > +       if (i < maps__nr_maps(maps))
> > > +               result = maps__maps_by_address(maps)[i]; // TODO: map__get
> > >
> > > -       return NULL;
> > > +       up_read(maps__lock(maps));
> > > +       return result;
> > >  }
> > >
> > >  void maps__fixup_end(struct maps *maps)
> > >  {
> > > -       struct map_rb_node *prev = NULL, *curr;
> > > +       struct map **maps_by_address;
> > > +       unsigned int n;
> > >
> > >         down_write(maps__lock(maps));
> > > +       if (!maps__maps_by_address_sorted(maps))
> > > +               __maps__sort_by_address(maps);
> > >
> > > -       maps__for_each_entry(maps, curr) {
> > > -               if (prev && (!map__end(prev->map) || map__end(prev->map) > map__start(curr->map)))
> > > -                       map__set_end(prev->map, map__start(curr->map));
> > > +       maps_by_address = maps__maps_by_address(maps);
> > > +       n = maps__nr_maps(maps);
> > > +       for (unsigned int i = 1; i < n; i++) {
> > > +               struct map *prev = maps_by_address[i - 1];
> > > +               struct map *curr = maps_by_address[i];
> > >
> > > -               prev = curr;
> > > +               if (!map__end(prev) || map__end(prev) > map__start(curr))
> > > +                       map__set_end(prev, map__start(curr));
> > >         }
> > >
> > >         /*
> > >          * We still haven't the actual symbols, so guess the
> > >          * last map final address.
> > >          */
> > > -       if (curr && !map__end(curr->map))
> > > -               map__set_end(curr->map, ~0ULL);
> > > +       if (n > 0 && !map__end(maps_by_address[n - 1]))
> > > +               map__set_end(maps_by_address[n - 1], ~0ULL);
> > > +
> > > +       RC_CHK_ACCESS(maps)->ends_broken = false;
> > >
> > >         up_write(maps__lock(maps));
> > >  }
> > > @@ -723,117 +983,92 @@ void maps__fixup_end(struct maps *maps)
> > >   */
> > >  int maps__merge_in(struct maps *kmaps, struct map *new_map)
> > >  {
> > > -       struct map_rb_node *rb_node;
> > > -       struct rb_node *first;
> > > -       bool overlaps;
> > > -       LIST_HEAD(merged);
> > > -       int err = 0;
> > > +       unsigned int first_after_, kmaps__nr_maps;
> > > +       struct map **kmaps_maps_by_address;
> > > +       struct map **merged_maps_by_address;
> > > +       unsigned int merged_nr_maps_allocated;
> > > +
> > > +       /* First try under a read lock. */
> > > +       while (true) {
> > > +               down_read(maps__lock(kmaps));
> > > +               if (maps__maps_by_address_sorted(kmaps))
> > > +                       break;
> > >
> > > -       down_read(maps__lock(kmaps));
> > > -       first = first_ending_after(kmaps, new_map);
> > > -       rb_node = first ? rb_entry(first, struct map_rb_node, rb_node) : NULL;
> > > -       overlaps = rb_node && map__start(rb_node->map) < map__end(new_map);
> > > -       up_read(maps__lock(kmaps));
> > > +               up_read(maps__lock(kmaps));
> > > +
> > > +               /* The first_ending_after() binary search requires sorted maps. Sort and try again. */
> > > +               maps__sort_by_address(kmaps);
> > > +       }
> > > +       first_after_ = first_ending_after(kmaps, new_map);
> > > +       kmaps_maps_by_address = maps__maps_by_address(kmaps);
> > >
> > > -       if (!overlaps)
> > > +       if (first_after_ >= maps__nr_maps(kmaps) ||
> > > +           map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
> > > +               /* No overlap so regular insert suffices. */
> > > +               up_read(maps__lock(kmaps));
> > >                 return maps__insert(kmaps, new_map);
> > > +       }
> > > +       up_read(maps__lock(kmaps));
> > >
> > > -       maps__for_each_entry(kmaps, rb_node) {
> > > -               struct map *old_map = rb_node->map;
> > > +       /* Plain insert with a read-lock failed, try again now with the write lock. */
> > > +       down_write(maps__lock(kmaps));
> > > +       if (!maps__maps_by_address_sorted(kmaps))
> > > +               __maps__sort_by_address(kmaps);
> > >
> > > -               /* no overload with this one */
> > > -               if (map__end(new_map) < map__start(old_map) ||
> > > -                   map__start(new_map) >= map__end(old_map))
> > > -                       continue;
> > > +       first_after_ = first_ending_after(kmaps, new_map);
> > > +       kmaps_maps_by_address = maps__maps_by_address(kmaps);
> > > +       kmaps__nr_maps = maps__nr_maps(kmaps);
> > >
> > > -               if (map__start(new_map) < map__start(old_map)) {
> > > -                       /*
> > > -                        * |new......
> > > -                        *       |old....
> > > -                        */
> > > -                       if (map__end(new_map) < map__end(old_map)) {
> > > -                               /*
> > > -                                * |new......|     -> |new..|
> > > -                                *       |old....| ->       |old....|
> > > -                                */
> > > -                               map__set_end(new_map, map__start(old_map));
> > > -                       } else {
> > > -                               /*
> > > -                                * |new.............| -> |new..|       |new..|
> > > -                                *       |old....|    ->       |old....|
> > > -                                */
> > > -                               struct map_list_node *m = map_list_node__new();
> > > +       if (first_after_ >= kmaps__nr_maps ||
> > > +           map__start(kmaps_maps_by_address[first_after_]) >= map__end(new_map)) {
> > > +               /* No overlap so regular insert suffices. */
> > > +               up_write(maps__lock(kmaps));
> > > +               return maps__insert(kmaps, new_map);
> >
> > I think it could be:
> >
> >         ret = __maps__insert(kmaps, new_map);
> >         up_write(maps__lock(kmaps));
> >         return ret;
>
> Ack. Will change in v3.
>
> > > +       }
> > > +       /* Array to merge into, possibly 1 more for the sake of new_map. */
> > > +       merged_nr_maps_allocated = RC_CHK_ACCESS(kmaps)->nr_maps_allocated;
> > > +       if (kmaps__nr_maps + 1 == merged_nr_maps_allocated)
> > > +               merged_nr_maps_allocated++;
> > > +
> > > +       merged_maps_by_address = malloc(merged_nr_maps_allocated * sizeof(*merged_maps_by_address));
> > > +       if (!merged_maps_by_address) {
> > > +               up_write(maps__lock(kmaps));
> > > +               return -ENOMEM;
> > > +       }
> > > +       RC_CHK_ACCESS(kmaps)->maps_by_address = merged_maps_by_address;
> > > +       RC_CHK_ACCESS(kmaps)->maps_by_address_sorted = true;
> > > +       zfree(&RC_CHK_ACCESS(kmaps)->maps_by_name);
> > > +       RC_CHK_ACCESS(kmaps)->maps_by_name_sorted = false;
> > > +       RC_CHK_ACCESS(kmaps)->nr_maps_allocated = merged_nr_maps_allocated;
> >
> > Why not use the accessor functions?
>
> Ack. I've been holding back on accessors that are used once, but I
> will add them here.
>
> Thanks,
> Ian
>
> > Thanks,
> > Namhyung
> >
> > >
> > > -                               if (!m) {
> > > -                                       err = -ENOMEM;
> > > -                                       goto out;
> > > -                               }
> > > +       /* Copy entries before the new_map that can't overlap. */
> > > +       for (unsigned int i = 0; i < first_after_; i++)
> > > +               merged_maps_by_address[i] = map__get(kmaps_maps_by_address[i]);
> > >
> > > -                               m->map = map__clone(new_map);
> > > -                               if (!m->map) {
> > > -                                       free(m);
> > > -                                       err = -ENOMEM;
> > > -                                       goto out;
> > > -                               }
> > > +       RC_CHK_ACCESS(kmaps)->nr_maps = first_after_;
> > >
> > > -                               map__set_end(m->map, map__start(old_map));
> > > -                               list_add_tail(&m->node, &merged);
> > > -                               map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
> > > -                               map__set_start(new_map, map__end(old_map));
> > > -                       }
> > > -               } else {
> > > -                       /*
> > > -                        *      |new......
> > > -                        * |old....
> > > -                        */
> > > -                       if (map__end(new_map) < map__end(old_map)) {
> > > -                               /*
> > > -                                *      |new..|   -> x
> > > -                                * |old.........| -> |old.........|
> > > -                                */
> > > -                               map__put(new_map);
> > > -                               new_map = NULL;
> > > -                               break;
> > > -                       } else {
> > > -                               /*
> > > -                                *      |new......| ->         |new...|
> > > -                                * |old....|        -> |old....|
> > > -                                */
> > > -                               map__add_pgoff(new_map, map__end(old_map) - map__start(new_map));
> > > -                               map__set_start(new_map, map__end(old_map));
> > > -                       }
> > > -               }
> > > -       }
> > > +       /* Add the new map, it will be split when the later overlapping mappings are added. */
> > > +       __maps__insert(kmaps, new_map);
> > >
> > > -out:
> > > -       while (!list_empty(&merged)) {
> > > -               struct map_list_node *old_node;
> > > +       /* Insert mappings after new_map, splitting new_map in the process. */
> > > +       for (unsigned int i = first_after_; i < kmaps__nr_maps; i++)
> > > +               __maps__fixup_overlap_and_insert(kmaps, kmaps_maps_by_address[i]);
> > >
> > > -               old_node = list_entry(merged.next, struct map_list_node, node);
> > > -               list_del_init(&old_node->node);
> > > -               if (!err)
> > > -                       err = maps__insert(kmaps, old_node->map);
> > > -               map__put(old_node->map);
> > > -               free(old_node);
> > > -       }
> > > +       /* Release references to the old maps, now merged into kmaps. */
> > > +       for (unsigned int i = 0; i < kmaps__nr_maps; i++)
> > > +               map__zput(kmaps_maps_by_address[i]);
> > >
> > > -       if (new_map) {
> > > -               if (!err)
> > > -                       err = maps__insert(kmaps, new_map);
> > > -               map__put(new_map);
> > > -       }
> > > -       return err;
> > > +       free(kmaps_maps_by_address);
> > > +       up_write(maps__lock(kmaps));
> > > +       return 0;
> > >  }
> > >
> > >  void maps__load_first(struct maps *maps)
> > >  {
> > > -       struct map_rb_node *first;
> > > -
> > >         down_read(maps__lock(maps));
> > >
> > > -       first = maps__first(maps);
> > > -       if (first)
> > > -               map__load(first->map);
> > > +       if (maps__nr_maps(maps) > 0)
> > > +               map__load(maps__maps_by_address(maps)[0]);
> > >
> > >         up_read(maps__lock(maps));
> > >  }
> > > diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
> > > index d836d04c9402..df9dd5a0e3c0 100644
> > > --- a/tools/perf/util/maps.h
> > > +++ b/tools/perf/util/maps.h
> > > @@ -25,21 +25,56 @@ static inline struct map_list_node *map_list_node__new(void)
> > >         return malloc(sizeof(struct map_list_node));
> > >  }
> > >
> > > -struct map *maps__find(struct maps *maps, u64 addr);
> > > +/*
> > > + * Locking/sorting note:
> > > + *
> > > + * Sorting is done under the write lock; iteration and binary searching
> > > + * happen under the read lock and require the array to be sorted. There is
> > > + * a race between sorting releasing the write lock and acquiring the read
> > > + * lock for iteration/searching, where another thread could insert and
> > > + * break the sort order of the maps. In practice, inserting maps should be
> > > + * rare, meaning that the race shouldn't lead to livelock. Removal of maps
> > > + * doesn't break the sort order.
> > > + */
> > >
> > >  DECLARE_RC_STRUCT(maps) {
> > > -       struct rb_root      entries;
> > >         struct rw_semaphore lock;
> > > -       struct machine   *machine;
> > > -       struct map       *last_search_by_name;
> > > +       /**
> > > +        * @maps_by_address: array of maps sorted by their starting address if
> > > +        * maps_by_address_sorted is true.
> > > +        */
> > > +       struct map       **maps_by_address;
> > > +       /**
> > > +        * @maps_by_name: optional array of maps sorted by their dso name if
> > > +        * maps_by_name_sorted is true.
> > > +        */
> > >         struct map       **maps_by_name;
> > > -       refcount_t       refcnt;
> > > -       unsigned int     nr_maps;
> > > -       unsigned int     nr_maps_allocated;
> > > +       struct machine   *machine;
> > >  #ifdef HAVE_LIBUNWIND_SUPPORT
> > > -       void                            *addr_space;
> > > +       void            *addr_space;
> > >         const struct unwind_libunwind_ops *unwind_libunwind_ops;
> > >  #endif
> > > +       refcount_t       refcnt;
> > > +       /**
> > > +        * @nr_maps: number of maps_by_address, and possibly maps_by_name,
> > > +        * entries that contain maps.
> > > +        */
> > > +       unsigned int     nr_maps;
> > > +       /**
> > > +        * @nr_maps_allocated: number of entries in maps_by_address and possibly
> > > +        * maps_by_name.
> > > +        */
> > > +       unsigned int     nr_maps_allocated;
> > > +       /**
> > > +        * @last_search_by_name_idx: cache of last found by name entry's index
> > > +        * as frequent searches for the same dso name are common.
> > > +        */
> > > +       unsigned int     last_search_by_name_idx;
> > > +       /** @maps_by_address_sorted: is maps_by_address sorted. */
> > > +       bool             maps_by_address_sorted;
> > > +       /** @maps_by_name_sorted: is maps_by_name sorted. */
> > > +       bool             maps_by_name_sorted;
> > > +       /** @ends_broken: do the maps contain a map whose end value is unset or out of order? */
> > > +       bool             ends_broken;
> > >  };
> > >
> > >  #define KMAP_NAME_LEN 256
> > > @@ -102,6 +137,7 @@ size_t maps__fprintf(struct maps *maps, FILE *fp);
> > >  int maps__insert(struct maps *maps, struct map *map);
> > >  void maps__remove(struct maps *maps, struct map *map);
> > >
> > > +struct map *maps__find(struct maps *maps, u64 addr);
> > >  struct symbol *maps__find_symbol(struct maps *maps, u64 addr, struct map **mapp);
> > >  struct symbol *maps__find_symbol_by_name(struct maps *maps, const char *name, struct map **mapp);
> > >
> > > @@ -117,8 +153,6 @@ struct map *maps__find_next_entry(struct maps *maps, struct map *map);
> > >
> > >  int maps__merge_in(struct maps *kmaps, struct map *new_map);
> > >
> > > -void __maps__sort_by_name(struct maps *maps);
> > > -
> > >  void maps__fixup_end(struct maps *maps);
> > >
> > >  void maps__load_first(struct maps *maps);
> > > --
> > > 2.43.0.472.g3155946c3a-goog
> > >
> > >

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses
  2024-02-02  4:20     ` Ian Rogers
  2024-02-02  4:21       ` Ian Rogers
@ 2024-02-06  0:37       ` Namhyung Kim
  1 sibling, 0 replies; 31+ messages in thread
From: Namhyung Kim @ 2024-02-06  0:37 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Nick Terrell, Kan Liang, Andi Kleen, Kajol Jain, Athira Rajeev,
	Huacai Chen, Masami Hiramatsu, Vincent Whitchurch,
	Steinar H. Gunderson, Liam Howlett, Miguel Ojeda, Colin Ian King,
	Dmitrii Dolgov, Yang Jihong, Ming Wang, James Clark,
	K Prateek Nayak, Sean Christopherson, Leo Yan, Ravi Bangoria,
	German Gomez, Changbin Du, Paolo Bonzini, Li Dong, Sandipan Das,
	liuwenyu, linux-kernel, linux-perf-users, Guilherme Amadio

Hi Ian,

Sorry for the late reply.

On Thu, Feb 1, 2024 at 8:21 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Feb 1, 2024 at 6:48 PM Namhyung Kim <namhyung@kernel.org> wrote:
[SNIP]
> > > +int maps__copy_from(struct maps *dest, struct maps *parent)
> > > +{
> > > +       /* Note, if struct map were immutable then cloning could use ref counts. */
> > > +       struct map **parent_maps_by_address;
> > > +       int err = 0;
> > > +       unsigned int n;
> > > +
> > > +       down_write(maps__lock(dest));
> > >         down_read(maps__lock(parent));
> > >
> > > -       maps__for_each_entry(parent, rb_node) {
> > > -               struct map *new = map__clone(rb_node->map);
> > > +       parent_maps_by_address = maps__maps_by_address(parent);
> > > +       n = maps__nr_maps(parent);
> > > +       if (maps__empty(dest)) {
> > > +               /* No existing mappings so just copy from parent to avoid reallocs in insert. */
> > > +               unsigned int nr_maps_allocated = RC_CHK_ACCESS(parent)->nr_maps_allocated;
> > > +               struct map **dest_maps_by_address =
> > > +                       malloc(nr_maps_allocated * sizeof(struct map *));
> > > +               struct map **dest_maps_by_name = NULL;
> > >
> > > -               if (new == NULL) {
> > > +               if (!dest_maps_by_address)
> > >                         err = -ENOMEM;
> > > -                       goto out_unlock;
> > > +               else {
> > > +                       if (maps__maps_by_name(parent)) {
> > > +                               dest_maps_by_name =
> > > +                                       malloc(nr_maps_allocated * sizeof(struct map *));
> > > +                       }
> > > +
> > > +                       RC_CHK_ACCESS(dest)->maps_by_address = dest_maps_by_address;
> > > +                       RC_CHK_ACCESS(dest)->maps_by_name = dest_maps_by_name;
> > > +                       RC_CHK_ACCESS(dest)->nr_maps_allocated = nr_maps_allocated;
> > >                 }
> > >
> > > -               err = unwind__prepare_access(maps, new, NULL);
> > > -               if (err)
> > > -                       goto out_unlock;
> > > +               for (unsigned int i = 0; !err && i < n; i++) {
> > > +                       struct map *pos = parent_maps_by_address[i];
> > > +                       struct map *new = map__clone(pos);
> > >
> > > -               err = maps__insert(maps, new);
> > > -               if (err)
> > > -                       goto out_unlock;
> > > +                       if (!new)
> > > +                               err = -ENOMEM;
> > > +                       else {
> > > +                               err = unwind__prepare_access(dest, new, NULL);
> > > +                               if (!err) {
> > > +                                       dest_maps_by_address[i] = new;
> > > +                                       if (dest_maps_by_name)
> > > +                                               dest_maps_by_name[i] = map__get(new);
> > > +                                       RC_CHK_ACCESS(dest)->nr_maps = i + 1;
> > > +                               }
> > > +                       }
> > > +                       if (err)
> > > +                               map__put(new);
> > > +               }
> > > +               maps__set_maps_by_address_sorted(dest, maps__maps_by_address_sorted(parent));
> > > +               if (!err) {
> > > +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx =
> > > +                               RC_CHK_ACCESS(parent)->last_search_by_name_idx;
> > > +                       maps__set_maps_by_name_sorted(dest,
> > > +                                               dest_maps_by_name &&
> > > +                                               maps__maps_by_name_sorted(parent));
> > > +               } else {
> > > +                       RC_CHK_ACCESS(dest)->last_search_by_name_idx = 0;
> > > +                       maps__set_maps_by_name_sorted(dest, false);
> > > +               }
> > > +       } else {
> > > +               /* Unexpected copying to a maps containing entries. */
> > > +               for (unsigned int i = 0; !err && i < n; i++) {
> > > +                       struct map *pos = parent_maps_by_address[i];
> > > +                       struct map *new = map__clone(pos);
> > >
> > > -               map__put(new);
> > > +                       if (!new)
> > > +                               err = -ENOMEM;
> > > +                       else {
> > > +                               err = unwind__prepare_access(dest, new, NULL);
> > > +                               if (!err)
> > > +                                       err = maps__insert(dest, new);
> >
> > Shouldn't it be __maps__insert()?
>
> On entry, the read lock is taken on parent but no lock is taken on dest,
> so the locked version is used.

I think you added the writer lock on dest.

Thanks,
Namhyung

>
> > > +                       }
> > > +                       map__put(new);
> > > +               }
> > >         }
> > > -
> > > -       err = 0;
> > > -out_unlock:
> > >         up_read(maps__lock(parent));
> > > +       up_write(maps__lock(dest));
> > >         return err;
> > >  }

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2024-02-06  0:37 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-03  5:06 [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
2024-01-03  5:06 ` [PATCH v7 01/25] perf maps: Switch from rbtree to lazily sorted array for addresses Ian Rogers
2024-02-02  2:48   ` Namhyung Kim
2024-02-02  4:20     ` Ian Rogers
2024-02-02  4:21       ` Ian Rogers
2024-02-06  0:37       ` Namhyung Kim
2024-01-03  5:06 ` [PATCH v7 02/25] perf maps: Get map before returning in maps__find Ian Rogers
2024-01-03  5:06 ` [PATCH v7 03/25] perf maps: Get map before returning in maps__find_by_name Ian Rogers
2024-01-03  5:06 ` [PATCH v7 04/25] perf maps: Get map before returning in maps__find_next_entry Ian Rogers
2024-01-03  5:06 ` [PATCH v7 05/25] perf maps: Hide maps internals Ian Rogers
2024-01-03  5:06 ` [PATCH v7 06/25] perf maps: Locking tidy up of nr_maps Ian Rogers
2024-01-03  5:06 ` [PATCH v7 07/25] perf dso: Reorder variables to save space in struct dso Ian Rogers
2024-01-03  5:06 ` [PATCH v7 08/25] perf report: Sort child tasks by tid Ian Rogers
2024-01-03  5:06 ` [PATCH v7 09/25] perf trace: Ignore thread hashing in summary Ian Rogers
2024-01-03  5:06 ` [PATCH v7 10/25] perf machine: Move fprintf to for_each loop and a callback Ian Rogers
2024-01-03  5:06 ` [PATCH v7 11/25] perf threads: Move threads to its own files Ian Rogers
2024-01-03  5:06 ` [PATCH v7 12/25] perf threads: Switch from rbtree to hashmap Ian Rogers
2024-01-03  5:06 ` [PATCH v7 13/25] perf threads: Reduce table size from 256 to 8 Ian Rogers
2024-01-03  5:06 ` [PATCH v7 14/25] perf dsos: Attempt to better abstract dsos internals Ian Rogers
2024-01-03  5:06 ` [PATCH v7 15/25] perf dsos: Tidy reference counting and locking Ian Rogers
2024-01-03  5:06 ` [PATCH v7 16/25] perf dsos: Add dsos__for_each_dso Ian Rogers
2024-01-03  5:06 ` [PATCH v7 17/25] perf dso: Move dso functions out of dsos Ian Rogers
2024-01-03  5:06 ` [PATCH v7 18/25] perf dsos: Switch more loops to dsos__for_each_dso Ian Rogers
2024-01-03  5:06 ` [PATCH v7 19/25] perf dsos: Switch backing storage to array from rbtree/list Ian Rogers
2024-01-03  5:06 ` [PATCH v7 20/25] perf dsos: Remove __dsos__addnew Ian Rogers
2024-01-03  5:06 ` [PATCH v7 21/25] perf dsos: Remove __dsos__findnew_link_by_longname_id Ian Rogers
2024-01-03  5:06 ` [PATCH v7 22/25] perf dsos: Switch hand code to bsearch Ian Rogers
2024-01-03  5:06 ` [PATCH v7 23/25] perf dso: Add reference count checking and accessor functions Ian Rogers
2024-01-03  5:06 ` [PATCH v7 24/25] perf dso: Reference counting related fixes Ian Rogers
2024-01-03  5:06 ` [PATCH v7 25/25] perf dso: Use container_of to avoid a pointer in dso_data Ian Rogers
2024-01-31 22:22 ` [PATCH v7 00/25] maps/threads/dsos memory improvements and fixes Ian Rogers
