linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Reference count checker and related fixes
@ 2022-01-25 20:45 Ian Rogers
  2022-01-25 20:45 ` [PATCH v2 1/4] perf cpumap: Add reference count checking Ian Rogers
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Ian Rogers @ 2022-01-25 20:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

This v2 patch set has the main reference count patch for cpu map from
the first set and then adds reference count checking to nsinfo. The
reference count checking on nsinfo helped diagnose a data race bug
which is fixed in the independent patches 2 and 3.

The perf tool has a class of memory problems where reference counts
are used incorrectly. Memory/address sanitizers and valgrind don't
provide useful ways to debug these problems, you see a memory leak
where the only pertinent information is the original allocation
site. What would be more useful is knowing where a get fails to have a
corresponding put, where there are double puts, etc.

This work was motivated by the roll-back of:
https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
where fixing a missed put resulted in a use-after-free in a different
context. There was a sense in fixing the issue that a game of
wac-a-mole had been embarked upon in adding missed gets and puts.

The basic approach of the change is to add a level of indirection at
the get and put calls. Get allocates a level of indirection that, if
no corresponding put is called, becomes a memory leak (and associated
stack trace) that leak sanitizer can report. Similarly if two puts are
called for the same get, then a double free can be detected by address
sanitizer. This can also detect the use after put, which should also
yield a segv without a sanitizer.

Adding reference count checking to cpu map was done as a proof of
concept, it yielded little other than a location where the use of get
could be cleaner by using its result. Reference count checking on
nsinfo identified a double free of the indirection layer and the
related threads, thereby identifying a data race as discussed here:
https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
Accordingly the dso->lock was extended and use to cover the race.

An alternative that was considered was ref_tracker:
 https://lwn.net/Articles/877603/
ref_tracker requires use of a reference counted struct to also use a
cookie/tracker. The cookie is combined with data in a ref_tracker_dir
to spot double puts. When an object is finished with leaks can be
detected, as with this approach when leak analysis happens. This
approach was preferred as it doesn't introduce cookies, spots use
after put and appears moderately more neutral to the API. Weaknesses
of the implemented approcah are not being able to do adhoc leak
detection and a preference for adding an accessor API to structs. I
believe there are other issues and welcome suggestions.

Ian Rogers (4):
  perf cpumap: Add reference count checking
  perf dso: Make lock error check and add BUG_ONs
  perf dso: Hold lock when accessing nsinfo
  perf namespaces: Add reference count checking

 tools/lib/perf/cpumap.c                  | 120 ++++++++++-----
 tools/lib/perf/include/internal/cpumap.h |  14 +-
 tools/perf/builtin-inject.c              |   6 +-
 tools/perf/builtin-probe.c               |   2 +-
 tools/perf/tests/cpumap.c                |  20 ++-
 tools/perf/util/build-id.c               |   4 +-
 tools/perf/util/cpumap.c                 |  42 ++---
 tools/perf/util/dso.c                    |  17 ++-
 tools/perf/util/jitdump.c                |  10 +-
 tools/perf/util/map.c                    |   7 +-
 tools/perf/util/namespaces.c             | 187 ++++++++++++++++-------
 tools/perf/util/namespaces.h             |  23 ++-
 tools/perf/util/pmu.c                    |  24 +--
 tools/perf/util/probe-event.c            |   2 +
 tools/perf/util/symbol.c                 |  10 +-
 15 files changed, 337 insertions(+), 151 deletions(-)

-- 
2.35.0.rc0.227.g00780c9af4-goog


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/4] perf cpumap: Add reference count checking
  2022-01-25 20:45 [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
@ 2022-01-25 20:45 ` Ian Rogers
  2022-01-31 14:44   ` Arnaldo Carvalho de Melo
  2022-01-25 20:46 ` [PATCH v2 2/4] perf dso: Make lock error check and add BUG_ONs Ian Rogers
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Ian Rogers @ 2022-01-25 20:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

Enabled when REFCNT_CHECKING is defined. The change adds a memory
allocated pointer that is interposed between the reference counted
cpu map at a get and freed by a put. The pointer replaces the original
perf_cpu_map struct, so use of the perf_cpu_map via APIs remains
unchanged. Any use of the cpu map without the API requires two versions,
typically handled via an UNWRAP macro.

This change is intended to catch:
 - use after put: using a cpumap after you have put it will cause a
   segv.
 - unbalanced puts: two puts for a get will result in a double free
   that can be captured and reported by tools like address sanitizer,
   including with the associated stack traces of allocation and frees.
 - missing puts: if a put is missing then the get turns into a memory
   leak that can be reported by leak sanitizer, including the stack
   trace at the point the get occurs.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/perf/cpumap.c                  | 120 ++++++++++++++++-------
 tools/lib/perf/include/internal/cpumap.h |  14 ++-
 tools/perf/tests/cpumap.c                |  20 ++--
 tools/perf/util/cpumap.c                 |  42 ++++----
 tools/perf/util/pmu.c                    |  24 +++--
 5 files changed, 146 insertions(+), 74 deletions(-)

diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index ee66760f1e63..d401d133f84b 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -10,10 +10,16 @@
 #include <ctype.h>
 #include <limits.h>
 
-static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
+#ifndef REFCNT_CHECKING
+#define UNWRAP_MAP(x) x
+#else
+#define UNWRAP_MAP(x) x->orig
+#endif
+
+#ifndef REFCNT_CHECKING
+struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
 {
 	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
-
 	if (cpus != NULL) {
 		cpus->nr = nr_cpus;
 		refcount_set(&cpus->refcnt, 1);
@@ -21,13 +27,31 @@ static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
 	}
 	return cpus;
 }
+#else
+struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
+{
+	struct perf_cpu_map *wrapped_cpus = NULL;
+	struct original_perf_cpu_map *cpus =
+		malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
+	if (cpus != NULL) {
+		cpus->nr = nr_cpus;
+		refcount_set(&cpus->refcnt, 1);
+		wrapped_cpus = malloc(sizeof(*wrapped_cpus));
+		if (wrapped_cpus != NULL)
+			wrapped_cpus->orig = cpus;
+		else
+			free(cpus);
+	}
+	return wrapped_cpus;
+}
+#endif
 
 struct perf_cpu_map *perf_cpu_map__dummy_new(void)
 {
 	struct perf_cpu_map *cpus = perf_cpu_map__alloc(1);
 
 	if (cpus)
-		cpus->map[0].cpu = -1;
+		UNWRAP_MAP(cpus)->map[0].cpu = -1;
 
 	return cpus;
 }
@@ -35,23 +59,45 @@ struct perf_cpu_map *perf_cpu_map__dummy_new(void)
 static void cpu_map__delete(struct perf_cpu_map *map)
 {
 	if (map) {
-		WARN_ONCE(refcount_read(&map->refcnt) != 0,
+		WARN_ONCE(refcount_read(&UNWRAP_MAP(map)->refcnt) != 0,
 			  "cpu_map refcnt unbalanced\n");
+#ifdef REFCNT_CHECKING
+		free(map->orig);
+		map->orig = NULL;
+#endif
 		free(map);
 	}
 }
 
 struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
 {
-	if (map)
-		refcount_inc(&map->refcnt);
+	if (map) {
+#ifdef REFCNT_CHECKING
+		struct perf_cpu_map *new_wrapper;
+#endif
+		refcount_inc(&UNWRAP_MAP(map)->refcnt);
+#ifdef REFCNT_CHECKING
+		new_wrapper = malloc(sizeof(*new_wrapper));
+		new_wrapper->orig = map->orig;
+		map = new_wrapper;
+#endif
+	}
 	return map;
 }
 
 void perf_cpu_map__put(struct perf_cpu_map *map)
 {
-	if (map && refcount_dec_and_test(&map->refcnt))
-		cpu_map__delete(map);
+	if (map) {
+		if (refcount_dec_and_test(&UNWRAP_MAP(map)->refcnt))
+			cpu_map__delete(map);
+		else {
+#ifdef REFCNT_CHECKING
+			/* Free the wrapper object but the reference counted object remains. */
+			map->orig = NULL;
+			free(map);
+#endif
+		}
+	}
 }
 
 static struct perf_cpu_map *cpu_map__default_new(void)
@@ -68,7 +114,7 @@ static struct perf_cpu_map *cpu_map__default_new(void)
 		int i;
 
 		for (i = 0; i < nr_cpus; ++i)
-			cpus->map[i].cpu = i;
+			UNWRAP_MAP(cpus)->map[i].cpu = i;
 	}
 
 	return cpus;
@@ -94,15 +140,16 @@ static struct perf_cpu_map *cpu_map__trim_new(int nr_cpus, const struct perf_cpu
 	int i, j;
 
 	if (cpus != NULL) {
-		memcpy(cpus->map, tmp_cpus, payload_size);
-		qsort(cpus->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
+		memcpy(UNWRAP_MAP(cpus)->map, tmp_cpus, payload_size);
+		qsort(UNWRAP_MAP(cpus)->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
 		/* Remove dups */
 		j = 0;
 		for (i = 0; i < nr_cpus; i++) {
-			if (i == 0 || cpus->map[i].cpu != cpus->map[i - 1].cpu)
-				cpus->map[j++].cpu = cpus->map[i].cpu;
+			if (i == 0 ||
+			    UNWRAP_MAP(cpus)->map[i].cpu != UNWRAP_MAP(cpus)->map[i - 1].cpu)
+				UNWRAP_MAP(cpus)->map[j++].cpu = UNWRAP_MAP(cpus)->map[i].cpu;
 		}
-		cpus->nr = j;
+		UNWRAP_MAP(cpus)->nr = j;
 		assert(j <= nr_cpus);
 	}
 	return cpus;
@@ -263,20 +310,20 @@ struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx)
 		.cpu = -1
 	};
 
-	if (cpus && idx < cpus->nr)
-		return cpus->map[idx];
+	if (cpus && idx < UNWRAP_MAP(cpus)->nr)
+		return UNWRAP_MAP(cpus)->map[idx];
 
 	return result;
 }
 
 int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
 {
-	return cpus ? cpus->nr : 1;
+	return cpus ? UNWRAP_MAP(cpus)->nr : 1;
 }
 
 bool perf_cpu_map__empty(const struct perf_cpu_map *map)
 {
-	return map ? map->map[0].cpu == -1 : true;
+	return map ? UNWRAP_MAP(map)->map[0].cpu == -1 : true;
 }
 
 int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
@@ -287,10 +334,10 @@ int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
 		return -1;
 
 	low = 0;
-	high = cpus->nr;
+	high = UNWRAP_MAP(cpus)->nr;
 	while (low < high) {
 		int idx = (low + high) / 2;
-		struct perf_cpu cpu_at_idx = cpus->map[idx];
+		struct perf_cpu cpu_at_idx = UNWRAP_MAP(cpus)->map[idx];
 
 		if (cpu_at_idx.cpu == cpu.cpu)
 			return idx;
@@ -316,7 +363,7 @@ struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map)
 	};
 
 	// cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
-	return map->nr > 0 ? map->map[map->nr - 1] : result;
+	return UNWRAP_MAP(map)->nr > 0 ? UNWRAP_MAP(map)->map[UNWRAP_MAP(map)->nr - 1] : result;
 }
 
 /*
@@ -337,37 +384,36 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
 
 	if (!orig && !other)
 		return NULL;
-	if (!orig) {
-		perf_cpu_map__get(other);
-		return other;
-	}
+	if (!orig)
+		return perf_cpu_map__get(other);
 	if (!other)
 		return orig;
-	if (orig->nr == other->nr &&
-	    !memcmp(orig->map, other->map, orig->nr * sizeof(struct perf_cpu)))
+	if (UNWRAP_MAP(orig)->nr == UNWRAP_MAP(other)->nr &&
+	    !memcmp(UNWRAP_MAP(orig)->map, UNWRAP_MAP(other)->map,
+		    UNWRAP_MAP(orig)->nr * sizeof(struct perf_cpu)))
 		return orig;
 
-	tmp_len = orig->nr + other->nr;
+	tmp_len = UNWRAP_MAP(orig)->nr + UNWRAP_MAP(other)->nr;
 	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
 	if (!tmp_cpus)
 		return NULL;
 
 	/* Standard merge algorithm from wikipedia */
 	i = j = k = 0;
-	while (i < orig->nr && j < other->nr) {
-		if (orig->map[i].cpu <= other->map[j].cpu) {
-			if (orig->map[i].cpu == other->map[j].cpu)
+	while (i < UNWRAP_MAP(orig)->nr && j < UNWRAP_MAP(other)->nr) {
+		if (UNWRAP_MAP(orig)->map[i].cpu <= UNWRAP_MAP(other)->map[j].cpu) {
+			if (UNWRAP_MAP(orig)->map[i].cpu == UNWRAP_MAP(other)->map[j].cpu)
 				j++;
-			tmp_cpus[k++] = orig->map[i++];
+			tmp_cpus[k++] = UNWRAP_MAP(orig)->map[i++];
 		} else
-			tmp_cpus[k++] = other->map[j++];
+			tmp_cpus[k++] = UNWRAP_MAP(other)->map[j++];
 	}
 
-	while (i < orig->nr)
-		tmp_cpus[k++] = orig->map[i++];
+	while (i < UNWRAP_MAP(orig)->nr)
+		tmp_cpus[k++] = UNWRAP_MAP(orig)->map[i++];
 
-	while (j < other->nr)
-		tmp_cpus[k++] = other->map[j++];
+	while (j < UNWRAP_MAP(other)->nr)
+		tmp_cpus[k++] = UNWRAP_MAP(other)->map[j++];
 	assert(k <= tmp_len);
 
 	merged = cpu_map__trim_new(k, tmp_cpus);
diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
index 581f9ffb4237..64ad56d167a0 100644
--- a/tools/lib/perf/include/internal/cpumap.h
+++ b/tools/lib/perf/include/internal/cpumap.h
@@ -16,7 +16,12 @@ struct perf_cpu {
  * gaps if CPU numbers were used. For events associated with a pid, rather than
  * a CPU, a single dummy map with an entry of -1 is used.
  */
-struct perf_cpu_map {
+#ifndef REFCNT_CHECKING
+struct perf_cpu_map
+#else
+struct original_perf_cpu_map
+#endif
+{
 	refcount_t	refcnt;
 	/** Length of the map array. */
 	int		nr;
@@ -24,10 +29,17 @@ struct perf_cpu_map {
 	struct perf_cpu	map[];
 };
 
+#ifdef REFCNT_CHECKING
+struct perf_cpu_map {
+	struct original_perf_cpu_map *orig;
+};
+#endif
+
 #ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS	2048
 #endif
 
+struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus);
 int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu);
 
 #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index 84e87e31f119..65d9546aec6e 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -11,6 +11,12 @@
 
 struct machine;
 
+#ifndef REFCNT_CHECKING
+#define UNWRAP_CPUMAP(x) x
+#else
+#define UNWRAP_CPUMAP(x) x->orig
+#endif
+
 static int process_event_mask(struct perf_tool *tool __maybe_unused,
 			 union perf_event *event,
 			 struct perf_sample *sample __maybe_unused,
@@ -35,10 +41,10 @@ static int process_event_mask(struct perf_tool *tool __maybe_unused,
 	}
 
 	map = cpu_map__new_data(data);
-	TEST_ASSERT_VAL("wrong nr",  map->nr == 20);
+	TEST_ASSERT_VAL("wrong nr",  perf_cpu_map__nr(map) == 20);
 
 	for (i = 0; i < 20; i++) {
-		TEST_ASSERT_VAL("wrong cpu", map->map[i].cpu == i);
+		TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, i).cpu == i);
 	}
 
 	perf_cpu_map__put(map);
@@ -66,10 +72,10 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
 	TEST_ASSERT_VAL("wrong cpu",  cpus->cpu[1] == 256);
 
 	map = cpu_map__new_data(data);
-	TEST_ASSERT_VAL("wrong nr",  map->nr == 2);
-	TEST_ASSERT_VAL("wrong cpu", map->map[0].cpu == 1);
-	TEST_ASSERT_VAL("wrong cpu", map->map[1].cpu == 256);
-	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&map->refcnt) == 1);
+	TEST_ASSERT_VAL("wrong nr",  perf_cpu_map__nr(map) == 2);
+	TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, 0).cpu == 1);
+	TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, 1).cpu == 256);
+	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&UNWRAP_CPUMAP(map)->refcnt) == 1);
 	perf_cpu_map__put(map);
 	return 0;
 }
@@ -130,7 +136,7 @@ static int test__cpu_map_merge(struct test_suite *test __maybe_unused, int subte
 	struct perf_cpu_map *c = perf_cpu_map__merge(a, b);
 	char buf[100];
 
-	TEST_ASSERT_VAL("failed to merge map: bad nr", c->nr == 5);
+	TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(c) == 5);
 	cpu_map__snprint(c, buf, sizeof(buf));
 	TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, "1-2,4-5,7"));
 	perf_cpu_map__put(b);
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 12b2243222b0..e9976ca238fc 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -22,6 +22,12 @@ static int max_node_num;
  */
 static int *cpunode_map;
 
+#ifndef REFCNT_CHECKING
+#define UNWRAP_MAP(x) x
+#else
+#define UNWRAP_MAP(x) x->orig
+#endif
+
 static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
 {
 	struct perf_cpu_map *map;
@@ -37,9 +43,9 @@ static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
 			 * otherwise it would become 65535.
 			 */
 			if (cpus->cpu[i] == (u16) -1)
-				map->map[i].cpu = -1;
+				UNWRAP_MAP(map)->map[i].cpu = -1;
 			else
-				map->map[i].cpu = (int) cpus->cpu[i];
+				UNWRAP_MAP(map)->map[i].cpu = (int) cpus->cpu[i];
 		}
 	}
 
@@ -58,7 +64,7 @@ static struct perf_cpu_map *cpu_map__from_mask(struct perf_record_record_cpu_map
 		int cpu, i = 0;
 
 		for_each_set_bit(cpu, mask->mask, nbits)
-			map->map[i++].cpu = cpu;
+			UNWRAP_MAP(map)->map[i++].cpu = cpu;
 	}
 	return map;
 
@@ -84,16 +90,13 @@ size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp)
 
 struct perf_cpu_map *perf_cpu_map__empty_new(int nr)
 {
-	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(int) * nr);
+	struct perf_cpu_map *cpus = perf_cpu_map__alloc(nr);
 
 	if (cpus != NULL) {
 		int i;
 
-		cpus->nr = nr;
 		for (i = 0; i < nr; i++)
-			cpus->map[i].cpu = -1;
-
-		refcount_set(&cpus->refcnt, 1);
+			UNWRAP_MAP(cpus)->map[i].cpu = -1;
 	}
 
 	return cpus;
@@ -163,7 +166,7 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
 {
 	int idx;
 	struct perf_cpu cpu;
-	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(cpus->nr);
+	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(perf_cpu_map__nr(cpus));
 
 	if (!c)
 		return NULL;
@@ -187,7 +190,7 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
 		}
 	}
 	/* Trim. */
-	if (c->nr != cpus->nr) {
+	if (c->nr != perf_cpu_map__nr(cpus)) {
 		struct cpu_aggr_map *trimmed_c =
 			realloc(c,
 				sizeof(struct cpu_aggr_map) + sizeof(struct aggr_cpu_id) * c->nr);
@@ -494,31 +497,32 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
 
 #define COMMA first ? "" : ","
 
-	for (i = 0; i < map->nr + 1; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map) + 1; i++) {
 		struct perf_cpu cpu = { .cpu = INT_MAX };
-		bool last = i == map->nr;
+		bool last = i == perf_cpu_map__nr(map);
 
 		if (!last)
-			cpu = map->map[i];
+			cpu = perf_cpu_map__cpu(map, i);
 
 		if (start == -1) {
 			start = i;
 			if (last) {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d", COMMA,
-						map->map[i].cpu);
+						perf_cpu_map__cpu(map, i).cpu);
 			}
-		} else if (((i - start) != (cpu.cpu - map->map[start].cpu)) || last) {
+		} else if (((i - start) != (cpu.cpu - perf_cpu_map__cpu(map, start).cpu)) || last) {
 			int end = i - 1;
 
 			if (start == end) {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d", COMMA,
-						map->map[start].cpu);
+						perf_cpu_map__cpu(map, start).cpu);
 			} else {
 				ret += snprintf(buf + ret, size - ret,
 						"%s%d-%d", COMMA,
-						map->map[start].cpu, map->map[end].cpu);
+						perf_cpu_map__cpu(map, start).cpu,
+						perf_cpu_map__cpu(map, end).cpu);
 			}
 			first = false;
 			start = i;
@@ -545,7 +549,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
 	int i, cpu;
 	char *ptr = buf;
 	unsigned char *bitmap;
-	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, map->nr - 1);
+	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, perf_cpu_map__nr(map) - 1);
 
 	if (buf == NULL)
 		return 0;
@@ -556,7 +560,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
 		return 0;
 	}
 
-	for (i = 0; i < map->nr; i++) {
+	for (i = 0; i < perf_cpu_map__nr(map); i++) {
 		cpu = perf_cpu_map__cpu(map, i).cpu;
 		bitmap[cpu / 8] |= 1 << (cpu % 8);
 	}
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 8dfbba15aeb8..1649321fe86f 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1992,13 +1992,20 @@ int perf_pmu__match(char *pattern, char *name, char *tok)
 	return 0;
 }
 
+#ifndef REFCNT_CHECKING
+#define UNWRAP_CPUMAP(x) x
+#else
+#define UNWRAP_CPUMAP(x) x->orig
+#endif
+
 int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
 			 struct perf_cpu_map **mcpus_ptr,
 			 struct perf_cpu_map **ucpus_ptr)
 {
 	struct perf_cpu_map *pmu_cpus = pmu->cpus;
 	struct perf_cpu_map *matched_cpus, *unmatched_cpus;
-	int matched_nr = 0, unmatched_nr = 0;
+	struct perf_cpu cpu;
+	int i, matched_nr = 0, unmatched_nr = 0;
 
 	matched_cpus = perf_cpu_map__default_new();
 	if (!matched_cpus)
@@ -2010,18 +2017,15 @@ int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
 		return -1;
 	}
 
-	for (int i = 0; i < cpus->nr; i++) {
-		int cpu;
-
-		cpu = perf_cpu_map__idx(pmu_cpus, cpus->map[i]);
-		if (cpu == -1)
-			unmatched_cpus->map[unmatched_nr++] = cpus->map[i];
+	perf_cpu_map__for_each_cpu(cpu, i, cpus) {
+		if (!perf_cpu_map__has(pmu_cpus, cpu))
+			UNWRAP_CPUMAP(unmatched_cpus)->map[unmatched_nr++] = cpu;
 		else
-			matched_cpus->map[matched_nr++] = cpus->map[i];
+			UNWRAP_CPUMAP(matched_cpus)->map[matched_nr++] = cpu;
 	}
 
-	unmatched_cpus->nr = unmatched_nr;
-	matched_cpus->nr = matched_nr;
+	UNWRAP_CPUMAP(unmatched_cpus)->nr = unmatched_nr;
+	UNWRAP_CPUMAP(matched_cpus)->nr = matched_nr;
 	*mcpus_ptr = matched_cpus;
 	*ucpus_ptr = unmatched_cpus;
 	return 0;
-- 
2.35.0.rc0.227.g00780c9af4-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/4] perf dso: Make lock error check and add BUG_ONs
  2022-01-25 20:45 [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
  2022-01-25 20:45 ` [PATCH v2 1/4] perf cpumap: Add reference count checking Ian Rogers
@ 2022-01-25 20:46 ` Ian Rogers
  2022-01-25 20:46 ` [PATCH v2 3/4] perf dso: Hold lock when accessing nsinfo Ian Rogers
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2022-01-25 20:46 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

Make the pthread mutex on dso use the error check type. This allows
deadlock checking via the return type. Assert the returned value from
mutex lock is always 0.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/dso.c    | 12 +++++++++---
 tools/perf/util/symbol.c |  2 +-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 9cc8a1772b4b..6beccffeef7b 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -784,7 +784,7 @@ dso_cache__free(struct dso *dso)
 	struct rb_root *root = &dso->data.cache;
 	struct rb_node *next = rb_first(root);
 
-	pthread_mutex_lock(&dso->lock);
+	BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
 	while (next) {
 		struct dso_cache *cache;
 
@@ -830,7 +830,7 @@ dso_cache__insert(struct dso *dso, struct dso_cache *new)
 	struct dso_cache *cache;
 	u64 offset = new->offset;
 
-	pthread_mutex_lock(&dso->lock);
+	BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
 	while (*p != NULL) {
 		u64 end;
 
@@ -1259,6 +1259,8 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 	struct dso *dso = calloc(1, sizeof(*dso) + strlen(name) + 1);
 
 	if (dso != NULL) {
+		pthread_mutexattr_t lock_attr;
+
 		strcpy(dso->name, name);
 		if (id)
 			dso->id = *id;
@@ -1286,8 +1288,12 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 		dso->root = NULL;
 		INIT_LIST_HEAD(&dso->node);
 		INIT_LIST_HEAD(&dso->data.open_entry);
-		pthread_mutex_init(&dso->lock, NULL);
+		pthread_mutexattr_init(&lock_attr);
+		pthread_mutexattr_settype(&lock_attr, PTHREAD_MUTEX_ERRORCHECK);
+		pthread_mutex_init(&dso->lock, &lock_attr);
+		pthread_mutexattr_destroy(&lock_attr);
 		refcount_set(&dso->refcnt, 1);
+
 	}
 
 	return dso;
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b2ed3140a1fa..43f47532696f 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1783,7 +1783,7 @@ int dso__load(struct dso *dso, struct map *map)
 	}
 
 	nsinfo__mountns_enter(dso->nsinfo, &nsc);
-	pthread_mutex_lock(&dso->lock);
+	BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
 
 	/* check again under the dso->lock */
 	if (dso__loaded(dso)) {
-- 
2.35.0.rc0.227.g00780c9af4-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 3/4] perf dso: Hold lock when accessing nsinfo
  2022-01-25 20:45 [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
  2022-01-25 20:45 ` [PATCH v2 1/4] perf cpumap: Add reference count checking Ian Rogers
  2022-01-25 20:46 ` [PATCH v2 2/4] perf dso: Make lock error check and add BUG_ONs Ian Rogers
@ 2022-01-25 20:46 ` Ian Rogers
  2022-01-25 20:46 ` [PATCH v2 4/4] perf namespaces: Add reference count checking Ian Rogers
  2022-01-27 21:33 ` [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
  4 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2022-01-25 20:46 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

There may be threads racing to update dso->nsinfo:
https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
Holding the dso->lock avoids use-after-free, memory leaks and other
such bugs. Apply the fix in:
https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
of there being a missing nsinfo__put now that the accesses are data race
free.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c   | 4 ++++
 tools/perf/util/dso.c         | 5 ++++-
 tools/perf/util/map.c         | 3 +++
 tools/perf/util/probe-event.c | 2 ++
 tools/perf/util/symbol.c      | 2 +-
 5 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index fbf43a454cba..bede332bf0e2 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -363,8 +363,10 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
 	}
 
 	if (dso) {
+		BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
 		nsinfo__put(dso->nsinfo);
 		dso->nsinfo = nsi;
+		pthread_mutex_unlock(&dso->lock);
 	} else
 		nsinfo__put(nsi);
 
@@ -547,7 +549,9 @@ static int dso__read_build_id(struct dso *dso)
 	if (dso->has_build_id)
 		return 0;
 
+	BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
 	nsinfo__mountns_enter(dso->nsinfo, &nsc);
+	pthread_mutex_unlock(&dso->lock);
 	if (filename__read_build_id(dso->long_name, &dso->bid) > 0)
 		dso->has_build_id = true;
 	nsinfo__mountns_exit(&nsc);
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 6beccffeef7b..b2f570adba35 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -548,8 +548,11 @@ static int open_dso(struct dso *dso, struct machine *machine)
 	int fd;
 	struct nscookie nsc;
 
-	if (dso->binary_type != DSO_BINARY_TYPE__BUILD_ID_CACHE)
+	if (dso->binary_type != DSO_BINARY_TYPE__BUILD_ID_CACHE) {
+		BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
 		nsinfo__mountns_enter(dso->nsinfo, &nsc);
+		pthread_mutex_unlock(&dso->lock);
+	}
 	fd = __open_dso(dso, machine);
 	if (dso->binary_type != DSO_BINARY_TYPE__BUILD_ID_CACHE)
 		nsinfo__mountns_exit(&nsc);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 8af693d9678c..ae99b52502d5 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -192,7 +192,10 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 			if (!(prot & PROT_EXEC))
 				dso__set_loaded(dso);
 		}
+		BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
+		nsinfo__put(dso->nsinfo);
 		dso->nsinfo = nsi;
+		pthread_mutex_unlock(&dso->lock);
 
 		if (build_id__is_defined(bid))
 			dso__set_build_id(dso, bid);
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index a834918a0a0d..7444e689ece7 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -180,8 +180,10 @@ struct map *get_target_map(const char *target, struct nsinfo *nsi, bool user)
 
 		map = dso__new_map(target);
 		if (map && map->dso) {
+			BUG_ON(pthread_mutex_lock(&map->dso->lock) != 0);
 			nsinfo__put(map->dso->nsinfo);
 			map->dso->nsinfo = nsinfo__get(nsi);
+			pthread_mutex_unlock(&map->dso->lock);
 		}
 		return map;
 	} else {
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 43f47532696f..a504346feb05 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1774,6 +1774,7 @@ int dso__load(struct dso *dso, struct map *map)
 	char newmapname[PATH_MAX];
 	const char *map_path = dso->long_name;
 
+	BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
 	perfmap = strncmp(dso->name, "/tmp/perf-", 10) == 0;
 	if (perfmap) {
 		if (dso->nsinfo && (dso__find_perf_map(newmapname,
@@ -1783,7 +1784,6 @@ int dso__load(struct dso *dso, struct map *map)
 	}
 
 	nsinfo__mountns_enter(dso->nsinfo, &nsc);
-	BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
 
 	/* check again under the dso->lock */
 	if (dso__loaded(dso)) {
-- 
2.35.0.rc0.227.g00780c9af4-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 4/4] perf namespaces: Add reference count checking
  2022-01-25 20:45 [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
                   ` (2 preceding siblings ...)
  2022-01-25 20:46 ` [PATCH v2 3/4] perf dso: Hold lock when accessing nsinfo Ian Rogers
@ 2022-01-25 20:46 ` Ian Rogers
  2022-01-27 21:33 ` [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
  4 siblings, 0 replies; 20+ messages in thread
From: Ian Rogers @ 2022-01-25 20:46 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov
  Cc: eranian, Ian Rogers

Add reference count checking controlled by REFCNT_CHECKING ifdef. The
reference count checking interposes an allocated pointer between the
reference counted struct on a get and frees the pointer on a put.
Accesses after a put cause faults and use after free, missed puts are
caughts as leaks and double puts are double frees.

This checking helped resolve a memory leak and use after free:
https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-inject.c  |   2 +-
 tools/perf/builtin-probe.c   |   2 +-
 tools/perf/util/build-id.c   |   4 +-
 tools/perf/util/jitdump.c    |  10 +-
 tools/perf/util/map.c        |   4 +-
 tools/perf/util/namespaces.c | 187 +++++++++++++++++++++++++----------
 tools/perf/util/namespaces.h |  23 ++++-
 tools/perf/util/symbol.c     |   8 +-
 8 files changed, 168 insertions(+), 72 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index bede332bf0e2..f7917c390e96 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -354,7 +354,7 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename,
 		nnsi = nsinfo__copy(nsi);
 		if (nnsi) {
 			nsinfo__put(nsi);
-			nnsi->need_setns = false;
+			nsinfo__clear_need_setns(nnsi);
 			nsi = nnsi;
 		}
 		dso = machine__findnew_vdso(machine, thread);
diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index c31627af75d4..f62298f5db3b 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -217,7 +217,7 @@ static int opt_set_target_ns(const struct option *opt __maybe_unused,
 			return ret;
 		}
 		nsip = nsinfo__new(ns_pid);
-		if (nsip && nsip->need_setns)
+		if (nsip && nsinfo__need_setns(nsip))
 			params.nsi = nsinfo__get(nsip);
 		nsinfo__put(nsip);
 
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index e32e8f2ff3bd..7a5821c87f94 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -706,7 +706,7 @@ build_id_cache__add(const char *sbuild_id, const char *name, const char *realnam
 		if (is_kallsyms) {
 			if (copyfile("/proc/kallsyms", filename))
 				goto out_free;
-		} else if (nsi && nsi->need_setns) {
+		} else if (nsi && nsinfo__need_setns(nsi)) {
 			if (copyfile_ns(name, filename, nsi))
 				goto out_free;
 		} else if (link(realname, filename) && errno != EEXIST &&
@@ -730,7 +730,7 @@ build_id_cache__add(const char *sbuild_id, const char *name, const char *realnam
 				goto out_free;
 			}
 			if (access(filename, F_OK)) {
-				if (nsi && nsi->need_setns) {
+				if (nsi && nsinfo__need_setns(nsi)) {
 					if (copyfile_ns(debugfile, filename,
 							nsi))
 						goto out_free;
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 917a9c707371..a23255773c60 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -382,15 +382,15 @@ jit_inject_event(struct jit_buf_desc *jd, union perf_event *event)
 
 static pid_t jr_entry_pid(struct jit_buf_desc *jd, union jr_entry *jr)
 {
-	if (jd->nsi && jd->nsi->in_pidns)
-		return jd->nsi->tgid;
+	if (jd->nsi && nsinfo__in_pidns(jd->nsi))
+		return nsinfo__tgid(jd->nsi);
 	return jr->load.pid;
 }
 
 static pid_t jr_entry_tid(struct jit_buf_desc *jd, union jr_entry *jr)
 {
-	if (jd->nsi && jd->nsi->in_pidns)
-		return jd->nsi->pid;
+	if (jd->nsi && nsinfo__in_pidns(jd->nsi))
+		return nsinfo__pid(jd->nsi);
 	return jr->load.tid;
 }
 
@@ -779,7 +779,7 @@ jit_detect(char *mmap_name, pid_t pid, struct nsinfo *nsi)
 	 * pid does not match mmap pid
 	 * pid==0 in system-wide mode (synthesized)
 	 */
-	if (pid && pid2 != nsi->nstgid)
+	if (pid && pid2 != nsinfo__nstgid(nsi))
 		return -1;
 	/*
 	 * validate suffix
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index ae99b52502d5..5b93a91bb0f9 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -151,7 +151,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 
 		if ((anon || no_dso) && nsi && (prot & PROT_EXEC)) {
 			snprintf(newfilename, sizeof(newfilename),
-				 "/tmp/perf-%d.map", nsi->pid);
+				 "/tmp/perf-%d.map", nsinfo__pid(nsi));
 			filename = newfilename;
 		}
 
@@ -168,7 +168,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 			nnsi = nsinfo__copy(nsi);
 			if (nnsi) {
 				nsinfo__put(nsi);
-				nnsi->need_setns = false;
+				nsinfo__clear_need_setns(nnsi);
 				nsi = nnsi;
 			}
 			pgoff = 0;
diff --git a/tools/perf/util/namespaces.c b/tools/perf/util/namespaces.c
index 48aa3217300b..4187179107ec 100644
--- a/tools/perf/util/namespaces.c
+++ b/tools/perf/util/namespaces.c
@@ -30,6 +30,12 @@ static const char *perf_ns__names[] = {
 	[CGROUP_NS_INDEX]	= "cgroup",
 };
 
+#ifndef REFCNT_CHECKING
+#define UNWRAP_NSI(x) x
+#else
+#define UNWRAP_NSI(x) x->orig
+#endif
+
 const char *perf_ns__name(unsigned int id)
 {
 	if (id >= ARRAY_SIZE(perf_ns__names))
@@ -60,7 +66,7 @@ void namespaces__free(struct namespaces *namespaces)
 	free(namespaces);
 }
 
-static int nsinfo__get_nspid(struct nsinfo *nsi, const char *path)
+static int nsinfo__get_nspid(pid_t *tgid, pid_t *nstgid, bool *in_pidns, const char *path)
 {
 	FILE *f = NULL;
 	char *statln = NULL;
@@ -74,19 +80,18 @@ static int nsinfo__get_nspid(struct nsinfo *nsi, const char *path)
 	while (getline(&statln, &linesz, f) != -1) {
 		/* Use tgid if CONFIG_PID_NS is not defined. */
 		if (strstr(statln, "Tgid:") != NULL) {
-			nsi->tgid = (pid_t)strtol(strrchr(statln, '\t'),
-						     NULL, 10);
-			nsi->nstgid = nsi->tgid;
+			*tgid = (pid_t)strtol(strrchr(statln, '\t'), NULL, 10);
+			*nstgid = *tgid;
 		}
 
 		if (strstr(statln, "NStgid:") != NULL) {
 			nspid = strrchr(statln, '\t');
-			nsi->nstgid = (pid_t)strtol(nspid, NULL, 10);
+			*nstgid = (pid_t)strtol(nspid, NULL, 10);
 			/*
 			 * If innermost tgid is not the first, process is in a different
 			 * PID namespace.
 			 */
-			nsi->in_pidns = (statln + sizeof("NStgid:") - 1) != nspid;
+			*in_pidns = (statln + sizeof("NStgid:") - 1) != nspid;
 			break;
 		}
 	}
@@ -108,7 +113,7 @@ int nsinfo__init(struct nsinfo *nsi)
 	if (snprintf(oldns, PATH_MAX, "/proc/self/ns/mnt") >= PATH_MAX)
 		return rv;
 
-	if (asprintf(&newns, "/proc/%d/ns/mnt", nsi->pid) == -1)
+	if (asprintf(&newns, "/proc/%d/ns/mnt", UNWRAP_NSI(nsi)->pid) == -1)
 		return rv;
 
 	if (stat(oldns, &old_stat) < 0)
@@ -121,24 +126,47 @@ int nsinfo__init(struct nsinfo *nsi)
 	 * want to switch as part of looking up dso/map data.
 	 */
 	if (old_stat.st_ino != new_stat.st_ino) {
-		nsi->need_setns = true;
-		nsi->mntns_path = newns;
+		UNWRAP_NSI(nsi)->need_setns = true;
+		UNWRAP_NSI(nsi)->mntns_path = newns;
 		newns = NULL;
 	}
 
 	/* If we're dealing with a process that is in a different PID namespace,
 	 * attempt to work out the innermost tgid for the process.
 	 */
-	if (snprintf(spath, PATH_MAX, "/proc/%d/status", nsi->pid) >= PATH_MAX)
+	if (snprintf(spath, PATH_MAX, "/proc/%d/status", UNWRAP_NSI(nsi)->pid) >= PATH_MAX)
 		goto out;
 
-	rv = nsinfo__get_nspid(nsi, spath);
+	rv = nsinfo__get_nspid(&UNWRAP_NSI(nsi)->tgid, &UNWRAP_NSI(nsi)->nstgid,
+			       &UNWRAP_NSI(nsi)->in_pidns, spath);
 
 out:
 	free(newns);
 	return rv;
 }
 
+static struct nsinfo *nsinfo__alloc(void)
+{
+	struct nsinfo *res;
+#ifdef REFCNT_CHECKING
+	res = malloc(sizeof(*res));
+	if (!res)
+		return NULL;
+
+	res->orig = calloc(1, sizeof(struct original_nsinfo));
+	if (!res->orig) {
+		free(res);
+		res = NULL;
+	} else
+		refcount_set(&res->orig->refcnt, 1);
+#else
+	res = calloc(1, sizeof(*res));
+	if (res)
+		refcount_set(&res->refcnt, 1);
+#endif
+	return res;
+}
+
 struct nsinfo *nsinfo__new(pid_t pid)
 {
 	struct nsinfo *nsi;
@@ -146,70 +174,121 @@ struct nsinfo *nsinfo__new(pid_t pid)
 	if (pid == 0)
 		return NULL;
 
-	nsi = calloc(1, sizeof(*nsi));
-	if (nsi != NULL) {
-		nsi->pid = pid;
-		nsi->tgid = pid;
-		nsi->nstgid = pid;
-		nsi->need_setns = false;
-		nsi->in_pidns = false;
-		/* Init may fail if the process exits while we're trying to look
-		 * at its proc information.  In that case, save the pid but
-		 * don't try to enter the namespace.
-		 */
-		if (nsinfo__init(nsi) == -1)
-			nsi->need_setns = false;
-
-		refcount_set(&nsi->refcnt, 1);
-	}
+	nsi = nsinfo__alloc();
+	if (!nsi)
+		return NULL;
+
+	UNWRAP_NSI(nsi)->pid = pid;
+	UNWRAP_NSI(nsi)->tgid = pid;
+	UNWRAP_NSI(nsi)->nstgid = pid;
+	UNWRAP_NSI(nsi)->need_setns = false;
+	UNWRAP_NSI(nsi)->in_pidns = false;
+	/* Init may fail if the process exits while we're trying to look at its
+	 * proc information. In that case, save the pid but don't try to enter
+	 * the namespace.
+	 */
+	if (nsinfo__init(nsi) == -1)
+		UNWRAP_NSI(nsi)->need_setns = false;
 
 	return nsi;
 }
 
-struct nsinfo *nsinfo__copy(struct nsinfo *nsi)
+struct nsinfo *nsinfo__copy(const struct nsinfo *nsi)
 {
 	struct nsinfo *nnsi;
 
 	if (nsi == NULL)
 		return NULL;
 
-	nnsi = calloc(1, sizeof(*nnsi));
-	if (nnsi != NULL) {
-		nnsi->pid = nsi->pid;
-		nnsi->tgid = nsi->tgid;
-		nnsi->nstgid = nsi->nstgid;
-		nnsi->need_setns = nsi->need_setns;
-		nnsi->in_pidns = nsi->in_pidns;
-		if (nsi->mntns_path) {
-			nnsi->mntns_path = strdup(nsi->mntns_path);
-			if (!nnsi->mntns_path) {
-				free(nnsi);
-				return NULL;
-			}
+	nnsi = nsinfo__alloc();
+	if (!nnsi)
+		return NULL;
+
+	UNWRAP_NSI(nnsi)->pid = UNWRAP_NSI(nsi)->pid;
+	UNWRAP_NSI(nnsi)->tgid = UNWRAP_NSI(nsi)->tgid;
+	UNWRAP_NSI(nnsi)->nstgid = UNWRAP_NSI(nsi)->nstgid;
+	UNWRAP_NSI(nnsi)->need_setns = UNWRAP_NSI(nsi)->need_setns;
+	UNWRAP_NSI(nnsi)->in_pidns = UNWRAP_NSI(nsi)->in_pidns;
+	if (UNWRAP_NSI(nsi)->mntns_path) {
+		UNWRAP_NSI(nnsi)->mntns_path = strdup(UNWRAP_NSI(nsi)->mntns_path);
+		if (!UNWRAP_NSI(nnsi)->mntns_path) {
+			nsinfo__put(nnsi);
+			return NULL;
 		}
-		refcount_set(&nnsi->refcnt, 1);
 	}
 
 	return nnsi;
 }
 
-void nsinfo__delete(struct nsinfo *nsi)
+static void nsinfo__delete(struct nsinfo *nsi)
 {
-	zfree(&nsi->mntns_path);
-	free(nsi);
+	if (nsi) {
+		WARN_ONCE(refcount_read(&UNWRAP_NSI(nsi)->refcnt) != 0,
+			"nsinfo refcnt unbalanced\n");
+		zfree(&UNWRAP_NSI(nsi)->mntns_path);
+#ifdef REFCNT_CHECKING
+		zfree(&nsi->orig);
+#endif
+		free(nsi);
+	}
 }
 
 struct nsinfo *nsinfo__get(struct nsinfo *nsi)
 {
-	if (nsi)
-		refcount_inc(&nsi->refcnt);
+	if (nsi) {
+#ifdef REFCNT_CHECKING
+		struct nsinfo *new_wrapper;
+#endif
+		refcount_inc(&UNWRAP_NSI(nsi)->refcnt);
+#ifdef REFCNT_CHECKING
+		new_wrapper = malloc(sizeof(*new_wrapper));
+		new_wrapper->orig = nsi->orig;
+		nsi = new_wrapper;
+#endif
+	}
 	return nsi;
 }
 
 void nsinfo__put(struct nsinfo *nsi)
 {
-	if (nsi && refcount_dec_and_test(&nsi->refcnt))
+	if (nsi && refcount_dec_and_test(&UNWRAP_NSI(nsi)->refcnt))
 		nsinfo__delete(nsi);
+#ifdef REFCNT_CHECKING
+	else if (nsi) {
+		nsi->orig = NULL;
+		free(nsi);
+	}
+#endif
+}
+
+bool nsinfo__need_setns(const struct nsinfo *nsi)
+{
+	return UNWRAP_NSI(nsi)->need_setns;
+}
+
+void nsinfo__clear_need_setns(struct nsinfo *nsi)
+{
+	UNWRAP_NSI(nsi)->need_setns = false;
+}
+
+pid_t nsinfo__tgid(const struct nsinfo  *nsi)
+{
+	return UNWRAP_NSI(nsi)->tgid;
+}
+
+pid_t nsinfo__nstgid(const struct nsinfo  *nsi)
+{
+	return UNWRAP_NSI(nsi)->nstgid;
+}
+
+pid_t nsinfo__pid(const struct nsinfo  *nsi)
+{
+	return UNWRAP_NSI(nsi)->pid;
+}
+
+pid_t nsinfo__in_pidns(const struct nsinfo  *nsi)
+{
+	return UNWRAP_NSI(nsi)->in_pidns;
 }
 
 void nsinfo__mountns_enter(struct nsinfo *nsi,
@@ -226,7 +305,7 @@ void nsinfo__mountns_enter(struct nsinfo *nsi,
 	nc->oldns = -1;
 	nc->newns = -1;
 
-	if (!nsi || !nsi->need_setns)
+	if (!nsi || !UNWRAP_NSI(nsi)->need_setns)
 		return;
 
 	if (snprintf(curpath, PATH_MAX, "/proc/self/ns/mnt") >= PATH_MAX)
@@ -240,7 +319,7 @@ void nsinfo__mountns_enter(struct nsinfo *nsi,
 	if (oldns < 0)
 		goto errout;
 
-	newns = open(nsi->mntns_path, O_RDONLY);
+	newns = open(UNWRAP_NSI(nsi)->mntns_path, O_RDONLY);
 	if (newns < 0)
 		goto errout;
 
@@ -309,9 +388,9 @@ int nsinfo__stat(const char *filename, struct stat *st, struct nsinfo *nsi)
 
 bool nsinfo__is_in_root_namespace(void)
 {
-	struct nsinfo nsi;
+	pid_t tgid = 0, nstgid = 0;
+	bool in_pidns = false;
 
-	memset(&nsi, 0x0, sizeof(nsi));
-	nsinfo__get_nspid(&nsi, "/proc/self/status");
-	return !nsi.in_pidns;
+	nsinfo__get_nspid(&tgid, &nstgid, &in_pidns, "/proc/self/status");
+	return !in_pidns;
 }
diff --git a/tools/perf/util/namespaces.h b/tools/perf/util/namespaces.h
index 9ceea9643507..673c7bcdb9b0 100644
--- a/tools/perf/util/namespaces.h
+++ b/tools/perf/util/namespaces.h
@@ -29,7 +29,12 @@ struct namespaces {
 struct namespaces *namespaces__new(struct perf_record_namespaces *event);
 void namespaces__free(struct namespaces *namespaces);
 
-struct nsinfo {
+#ifndef REFCNT_CHECKING
+struct nsinfo
+#else
+struct original_nsinfo
+#endif
+{
 	pid_t			pid;
 	pid_t			tgid;
 	pid_t			nstgid;
@@ -39,6 +44,12 @@ struct nsinfo {
 	refcount_t		refcnt;
 };
 
+#ifdef REFCNT_CHECKING
+struct nsinfo {
+	struct original_nsinfo *orig;
+};
+#endif
+
 struct nscookie {
 	int			oldns;
 	int			newns;
@@ -47,12 +58,18 @@ struct nscookie {
 
 int nsinfo__init(struct nsinfo *nsi);
 struct nsinfo *nsinfo__new(pid_t pid);
-struct nsinfo *nsinfo__copy(struct nsinfo *nsi);
-void nsinfo__delete(struct nsinfo *nsi);
+struct nsinfo *nsinfo__copy(const struct nsinfo *nsi);
 
 struct nsinfo *nsinfo__get(struct nsinfo *nsi);
 void nsinfo__put(struct nsinfo *nsi);
 
+bool nsinfo__need_setns(const struct nsinfo *nsi);
+void nsinfo__clear_need_setns(struct nsinfo *nsi);
+pid_t nsinfo__tgid(const struct nsinfo  *nsi);
+pid_t nsinfo__nstgid(const struct nsinfo  *nsi);
+pid_t nsinfo__pid(const struct nsinfo  *nsi);
+pid_t nsinfo__in_pidns(const struct nsinfo  *nsi);
+
 void nsinfo__mountns_enter(struct nsinfo *nsi, struct nscookie *nc);
 void nsinfo__mountns_exit(struct nscookie *nc);
 
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a504346feb05..40477f4d16fa 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1735,8 +1735,8 @@ static int dso__find_perf_map(char *filebuf, size_t bufsz,
 
 	nsi = *nsip;
 
-	if (nsi->need_setns) {
-		snprintf(filebuf, bufsz, "/tmp/perf-%d.map", nsi->nstgid);
+	if (nsinfo__need_setns(nsi)) {
+		snprintf(filebuf, bufsz, "/tmp/perf-%d.map", nsinfo__nstgid(nsi));
 		nsinfo__mountns_enter(nsi, &nsc);
 		rc = access(filebuf, R_OK);
 		nsinfo__mountns_exit(&nsc);
@@ -1748,8 +1748,8 @@ static int dso__find_perf_map(char *filebuf, size_t bufsz,
 	if (nnsi) {
 		nsinfo__put(nsi);
 
-		nnsi->need_setns = false;
-		snprintf(filebuf, bufsz, "/tmp/perf-%d.map", nnsi->tgid);
+		nsinfo__clear_need_setns(nnsi);
+		snprintf(filebuf, bufsz, "/tmp/perf-%d.map", nsinfo__tgid(nnsi));
 		*nsip = nnsi;
 		rc = 0;
 	}
-- 
2.35.0.rc0.227.g00780c9af4-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-25 20:45 [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
                   ` (3 preceding siblings ...)
  2022-01-25 20:46 ` [PATCH v2 4/4] perf namespaces: Add reference count checking Ian Rogers
@ 2022-01-27 21:33 ` Ian Rogers
  2022-01-28  5:23   ` Masami Hiramatsu
  4 siblings, 1 reply; 20+ messages in thread
From: Ian Rogers @ 2022-01-27 21:33 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov, masami.hiramatsu.pt
  Cc: eranian

On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
>
> This v2 patch set has the main reference count patch for cpu map from
> the first set and then adds reference count checking to nsinfo. The
> reference count checking on nsinfo helped diagnose a data race bug
> which is fixed in the independent patches 2 and 3.
>
> The perf tool has a class of memory problems where reference counts
> are used incorrectly. Memory/address sanitizers and valgrind don't
> provide useful ways to debug these problems, you see a memory leak
> where the only pertinent information is the original allocation
> site. What would be more useful is knowing where a get fails to have a
> corresponding put, where there are double puts, etc.
>
> This work was motivated by the roll-back of:
> https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> where fixing a missed put resulted in a use-after-free in a different
> context. There was a sense in fixing the issue that a game of
> wac-a-mole had been embarked upon in adding missed gets and puts.
>
> The basic approach of the change is to add a level of indirection at
> the get and put calls. Get allocates a level of indirection that, if
> no corresponding put is called, becomes a memory leak (and associated
> stack trace) that leak sanitizer can report. Similarly if two puts are
> called for the same get, then a double free can be detected by address
> sanitizer. This can also detect the use after put, which should also
> yield a segv without a sanitizer.
>
> Adding reference count checking to cpu map was done as a proof of
> concept, it yielded little other than a location where the use of get
> could be cleaner by using its result. Reference count checking on
> nsinfo identified a double free of the indirection layer and the
> related threads, thereby identifying a data race as discussed here:
> https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> Accordingly the dso->lock was extended and use to cover the race.
>
> An alternative that was considered was ref_tracker:
>  https://lwn.net/Articles/877603/
> ref_tracker requires use of a reference counted struct to also use a
> cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> to spot double puts. When an object is finished with leaks can be
> detected, as with this approach when leak analysis happens. This
> approach was preferred as it doesn't introduce cookies, spots use
> after put and appears moderately more neutral to the API. Weaknesses
> of the implemented approcah are not being able to do adhoc leak
> detection and a preference for adding an accessor API to structs. I
> believe there are other issues and welcome suggestions.

And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
by Masami Hiramatsu. In this work he adds a leak sanitizer style
reference count checker that will describe locations of puts and gets
for diagnosis. Firstly that's an awesome achievement! This work is
different in that it isn't trying to invent a leak sanitizer, it is
just using the existing one. By adding a level of indirection this
work can catch use after put and pairs gets with puts to make lifetime
analysis more automatic. An advantage of Masami's work is that it
doesn't change data-structures and after the initial patch-set is
somewhat transparent. Overall I prefer the approach in these patches,
future patches can look to clean up the API as Masami has.

Thanks,
Ian

> Ian Rogers (4):
>   perf cpumap: Add reference count checking
>   perf dso: Make lock error check and add BUG_ONs
>   perf dso: Hold lock when accessing nsinfo
>   perf namespaces: Add reference count checking
>
>  tools/lib/perf/cpumap.c                  | 120 ++++++++++-----
>  tools/lib/perf/include/internal/cpumap.h |  14 +-
>  tools/perf/builtin-inject.c              |   6 +-
>  tools/perf/builtin-probe.c               |   2 +-
>  tools/perf/tests/cpumap.c                |  20 ++-
>  tools/perf/util/build-id.c               |   4 +-
>  tools/perf/util/cpumap.c                 |  42 ++---
>  tools/perf/util/dso.c                    |  17 ++-
>  tools/perf/util/jitdump.c                |  10 +-
>  tools/perf/util/map.c                    |   7 +-
>  tools/perf/util/namespaces.c             | 187 ++++++++++++++++-------
>  tools/perf/util/namespaces.h             |  23 ++-
>  tools/perf/util/pmu.c                    |  24 +--
>  tools/perf/util/probe-event.c            |   2 +
>  tools/perf/util/symbol.c                 |  10 +-
>  15 files changed, 337 insertions(+), 151 deletions(-)
>
> --
> 2.35.0.rc0.227.g00780c9af4-goog
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-27 21:33 ` [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
@ 2022-01-28  5:23   ` Masami Hiramatsu
  2022-01-28  6:24     ` Ian Rogers
  0 siblings, 1 reply; 20+ messages in thread
From: Masami Hiramatsu @ 2022-01-28  5:23 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Thu, 27 Jan 2022 13:33:23 -0800
Ian Rogers <irogers@google.com> wrote:

> On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> >
> > This v2 patch set has the main reference count patch for cpu map from
> > the first set and then adds reference count checking to nsinfo. The
> > reference count checking on nsinfo helped diagnose a data race bug
> > which is fixed in the independent patches 2 and 3.
> >
> > The perf tool has a class of memory problems where reference counts
> > are used incorrectly. Memory/address sanitizers and valgrind don't
> > provide useful ways to debug these problems, you see a memory leak
> > where the only pertinent information is the original allocation
> > site. What would be more useful is knowing where a get fails to have a
> > corresponding put, where there are double puts, etc.
> >
> > This work was motivated by the roll-back of:
> > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > where fixing a missed put resulted in a use-after-free in a different
> > context. There was a sense in fixing the issue that a game of
> > wac-a-mole had been embarked upon in adding missed gets and puts.
> >
> > The basic approach of the change is to add a level of indirection at
> > the get and put calls. Get allocates a level of indirection that, if
> > no corresponding put is called, becomes a memory leak (and associated
> > stack trace) that leak sanitizer can report. Similarly if two puts are
> > called for the same get, then a double free can be detected by address
> > sanitizer. This can also detect the use after put, which should also
> > yield a segv without a sanitizer.
> >
> > Adding reference count checking to cpu map was done as a proof of
> > concept, it yielded little other than a location where the use of get
> > could be cleaner by using its result. Reference count checking on
> > nsinfo identified a double free of the indirection layer and the
> > related threads, thereby identifying a data race as discussed here:
> > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > Accordingly the dso->lock was extended and use to cover the race.
> >
> > An alternative that was considered was ref_tracker:
> >  https://lwn.net/Articles/877603/
> > ref_tracker requires use of a reference counted struct to also use a
> > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > to spot double puts. When an object is finished with leaks can be
> > detected, as with this approach when leak analysis happens. This
> > approach was preferred as it doesn't introduce cookies, spots use
> > after put and appears moderately more neutral to the API. Weaknesses
> > of the implemented approcah are not being able to do adhoc leak
> > detection and a preference for adding an accessor API to structs. I
> > believe there are other issues and welcome suggestions.
> 
> And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> by Masami Hiramatsu. In this work he adds a leak sanitizer style
> reference count checker that will describe locations of puts and gets
> for diagnosis. Firstly that's an awesome achievement! This work is
> different in that it isn't trying to invent a leak sanitizer, it is
> just using the existing one. By adding a level of indirection this
> work can catch use after put and pairs gets with puts to make lifetime
> analysis more automatic. An advantage of Masami's work is that it
> doesn't change data-structures and after the initial patch-set is
> somewhat transparent. Overall I prefer the approach in these patches,
> future patches can look to clean up the API as Masami has.

Thanks for referring my series :-D The series aimed to solve the refcount
usage issue in the perf which lead the object leaks. At that moment,
I found that there were 2 patterns, refcount start from 0 and start from 1.
That made me confused what I should do for using a object.
But the perf uses linux/refcount.h now, I hope such issue has already gone.
(but the object leakage seems not fixed fully yet, as you found.)

BTW, I think the introducing UNWRAP_* macro may put a burden on future
development. If it is inevitable, we should consider it as carefully as
possible. Or, it may cause another issue (it is easily missed that the new
patch does not use UNWRAP_* for object reference, because it is natual.)

So I agree with you that you to clean up the API. :-)
I think we can make yet another refcount.h for user space debugging and
replace it with the linux/refcount.h.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-28  5:23   ` Masami Hiramatsu
@ 2022-01-28  6:24     ` Ian Rogers
  2022-01-28 15:34       ` Masami Hiramatsu
  0 siblings, 1 reply; 20+ messages in thread
From: Ian Rogers @ 2022-01-28  6:24 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Thu, Jan 27, 2022 at 9:24 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 27 Jan 2022 13:33:23 -0800
> Ian Rogers <irogers@google.com> wrote:
>
> > On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> > >
> > > This v2 patch set has the main reference count patch for cpu map from
> > > the first set and then adds reference count checking to nsinfo. The
> > > reference count checking on nsinfo helped diagnose a data race bug
> > > which is fixed in the independent patches 2 and 3.
> > >
> > > The perf tool has a class of memory problems where reference counts
> > > are used incorrectly. Memory/address sanitizers and valgrind don't
> > > provide useful ways to debug these problems, you see a memory leak
> > > where the only pertinent information is the original allocation
> > > site. What would be more useful is knowing where a get fails to have a
> > > corresponding put, where there are double puts, etc.
> > >
> > > This work was motivated by the roll-back of:
> > > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > > where fixing a missed put resulted in a use-after-free in a different
> > > context. There was a sense in fixing the issue that a game of
> > > wac-a-mole had been embarked upon in adding missed gets and puts.
> > >
> > > The basic approach of the change is to add a level of indirection at
> > > the get and put calls. Get allocates a level of indirection that, if
> > > no corresponding put is called, becomes a memory leak (and associated
> > > stack trace) that leak sanitizer can report. Similarly if two puts are
> > > called for the same get, then a double free can be detected by address
> > > sanitizer. This can also detect the use after put, which should also
> > > yield a segv without a sanitizer.
> > >
> > > Adding reference count checking to cpu map was done as a proof of
> > > concept, it yielded little other than a location where the use of get
> > > could be cleaner by using its result. Reference count checking on
> > > nsinfo identified a double free of the indirection layer and the
> > > related threads, thereby identifying a data race as discussed here:
> > > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > > Accordingly the dso->lock was extended and use to cover the race.
> > >
> > > An alternative that was considered was ref_tracker:
> > >  https://lwn.net/Articles/877603/
> > > ref_tracker requires use of a reference counted struct to also use a
> > > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > > to spot double puts. When an object is finished with leaks can be
> > > detected, as with this approach when leak analysis happens. This
> > > approach was preferred as it doesn't introduce cookies, spots use
> > > after put and appears moderately more neutral to the API. Weaknesses
> > > of the implemented approcah are not being able to do adhoc leak
> > > detection and a preference for adding an accessor API to structs. I
> > > believe there are other issues and welcome suggestions.
> >
> > And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> > https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> > by Masami Hiramatsu. In this work he adds a leak sanitizer style
> > reference count checker that will describe locations of puts and gets
> > for diagnosis. Firstly that's an awesome achievement! This work is
> > different in that it isn't trying to invent a leak sanitizer, it is
> > just using the existing one. By adding a level of indirection this
> > work can catch use after put and pairs gets with puts to make lifetime
> > analysis more automatic. An advantage of Masami's work is that it
> > doesn't change data-structures and after the initial patch-set is
> > somewhat transparent. Overall I prefer the approach in these patches,
> > future patches can look to clean up the API as Masami has.
>
> Thanks for referring my series :-D The series aimed to solve the refcount
> usage issue in the perf which lead the object leaks. At that moment,
> I found that there were 2 patterns, refcount start from 0 and start from 1.
> That made me confused what I should do for using a object.
> But the perf uses linux/refcount.h now, I hope such issue has already gone.
> (but the object leakage seems not fixed fully yet, as you found.)
>
> BTW, I think the introducing UNWRAP_* macro may put a burden on future
> development. If it is inevitable, we should consider it as carefully as
> possible. Or, it may cause another issue (it is easily missed that the new
> patch does not use UNWRAP_* for object reference, because it is natual.)
>
> So I agree with you that you to clean up the API. :-)
> I think we can make yet another refcount.h for user space debugging and
> replace it with the linux/refcount.h.

Thanks Masami,

Agreed on the UNWRAP_ macros, hence wanting to hide them behind
accessors. Making accessors could be automated with macros, for
example, have a list of variables, have a macro declare the struct
using the list, another macro can use the list to declare accessors. I
didn't find adding the UNWRAP_ macros in this change particularly
burdensome as any use of the wrapping pointer as the original type
caused a compile time error telling you what and where to fix. The
macro is extra stuff in the way of using just the raw object, but
that's fairly typical in C++ with shared_ptr, scoped_lock, etc. The
question is, is it worth it to make sure use of the reference counted
object is correct and misuse is easier to diagnose? I think it is near
as least offensive as possible while providing the best information -
hence being able to solve the dso->nsinfo put data race, that has been
a problem to solve up to this point. I'm open to better suggestions
though :-)

Thanks again,
Ian

> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-28  6:24     ` Ian Rogers
@ 2022-01-28 15:34       ` Masami Hiramatsu
  2022-01-28 18:26         ` Ian Rogers
  0 siblings, 1 reply; 20+ messages in thread
From: Masami Hiramatsu @ 2022-01-28 15:34 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Thu, 27 Jan 2022 22:24:59 -0800
Ian Rogers <irogers@google.com> wrote:

> On Thu, Jan 27, 2022 at 9:24 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 27 Jan 2022 13:33:23 -0800
> > Ian Rogers <irogers@google.com> wrote:
> >
> > > On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > This v2 patch set has the main reference count patch for cpu map from
> > > > the first set and then adds reference count checking to nsinfo. The
> > > > reference count checking on nsinfo helped diagnose a data race bug
> > > > which is fixed in the independent patches 2 and 3.
> > > >
> > > > The perf tool has a class of memory problems where reference counts
> > > > are used incorrectly. Memory/address sanitizers and valgrind don't
> > > > provide useful ways to debug these problems, you see a memory leak
> > > > where the only pertinent information is the original allocation
> > > > site. What would be more useful is knowing where a get fails to have a
> > > > corresponding put, where there are double puts, etc.
> > > >
> > > > This work was motivated by the roll-back of:
> > > > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > > > where fixing a missed put resulted in a use-after-free in a different
> > > > context. There was a sense in fixing the issue that a game of
> > > > wac-a-mole had been embarked upon in adding missed gets and puts.
> > > >
> > > > The basic approach of the change is to add a level of indirection at
> > > > the get and put calls. Get allocates a level of indirection that, if
> > > > no corresponding put is called, becomes a memory leak (and associated
> > > > stack trace) that leak sanitizer can report. Similarly if two puts are
> > > > called for the same get, then a double free can be detected by address
> > > > sanitizer. This can also detect the use after put, which should also
> > > > yield a segv without a sanitizer.
> > > >
> > > > Adding reference count checking to cpu map was done as a proof of
> > > > concept, it yielded little other than a location where the use of get
> > > > could be cleaner by using its result. Reference count checking on
> > > > nsinfo identified a double free of the indirection layer and the
> > > > related threads, thereby identifying a data race as discussed here:
> > > > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > > > Accordingly the dso->lock was extended and use to cover the race.
> > > >
> > > > An alternative that was considered was ref_tracker:
> > > >  https://lwn.net/Articles/877603/
> > > > ref_tracker requires use of a reference counted struct to also use a
> > > > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > > > to spot double puts. When an object is finished with leaks can be
> > > > detected, as with this approach when leak analysis happens. This
> > > > approach was preferred as it doesn't introduce cookies, spots use
> > > > after put and appears moderately more neutral to the API. Weaknesses
> > > > of the implemented approcah are not being able to do adhoc leak
> > > > detection and a preference for adding an accessor API to structs. I
> > > > believe there are other issues and welcome suggestions.
> > >
> > > And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> > > https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> > > by Masami Hiramatsu. In this work he adds a leak sanitizer style
> > > reference count checker that will describe locations of puts and gets
> > > for diagnosis. Firstly that's an awesome achievement! This work is
> > > different in that it isn't trying to invent a leak sanitizer, it is
> > > just using the existing one. By adding a level of indirection this
> > > work can catch use after put and pairs gets with puts to make lifetime
> > > analysis more automatic. An advantage of Masami's work is that it
> > > doesn't change data-structures and after the initial patch-set is
> > > somewhat transparent. Overall I prefer the approach in these patches,
> > > future patches can look to clean up the API as Masami has.
> >
> > Thanks for referring my series :-D The series aimed to solve the refcount
> > usage issue in the perf which lead the object leaks. At that moment,
> > I found that there were 2 patterns, refcount start from 0 and start from 1.
> > That made me confused what I should do for using a object.
> > But the perf uses linux/refcount.h now, I hope such issue has already gone.
> > (but the object leakage seems not fixed fully yet, as you found.)
> >
> > BTW, I think the introducing UNWRAP_* macro may put a burden on future
> > development. If it is inevitable, we should consider it as carefully as
> > possible. Or, it may cause another issue (it is easily missed that the new
> > patch does not use UNWRAP_* for object reference, because it is natual.)
> >
> > So I agree with you that you to clean up the API. :-)
> > I think we can make yet another refcount.h for user space debugging and
> > replace it with the linux/refcount.h.
> 
> Thanks Masami,
> 
> Agreed on the UNWRAP_ macros, hence wanting to hide them behind
> accessors. Making accessors could be automated with macros, for
> example, have a list of variables, have a macro declare the struct
> using the list, another macro can use the list to declare accessors. I
> didn't find adding the UNWRAP_ macros in this change particularly
> burdensome as any use of the wrapping pointer as the original type
> caused a compile time error telling you what and where to fix. The
> macro is extra stuff in the way of using just the raw object, but
> that's fairly typical in C++ with shared_ptr, scoped_lock, etc.

Hi Ian,

Hmm, but such a macro is not usual for C which perf is written in.
If I understand correctly, you might want to use memory leak
analyzer to detect refcount leak, and that analyzer will show
what data structure is leaked.

If so, maybe you can do the same thing by introducing a dummy
list node for each data structure which you want to debug.

struct perf_cpu_map__refdebug {
	struct perf_cpu_map__refdebug *orig;
};

And expand refcount_t as.

typedef struct refcount_struct {
	atomic_t refs;
#ifdef REFCNT_CHECKING
	void *orig;
#endif
} refcount_t;

And change the get/put as below

struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
{
	if (map) {
#ifdef REFCNT_CHECKING
		struct perf_cpu_map__refdebug *new_node;
#endif
		refcount_inc(&map->refcnt);
#ifdef REFCNT_CHECKING
		new_node = malloc(sizeof(*new_node));
		new_node->orig = map->refcnt->orig;
		map->refcnt->orig = new_node;
#endif
	}
	return map;
}

void perf_cpu_map__put(struct perf_cpu_map *map)
{
	if (map) {
		if (refcount_dec_and_test(&map->refcnt))
			cpu_map__delete(map);
		else {
#ifdef REFCNT_CHECKING
			struct perf_cpu_map__refdebug *node = map->refcnt->orig;

			map->refcnt->orig = node->orig;
			free(node);
#endif
		}
	}
}

This need a bit complex get/put, but no need to change other parts.

> The
> question is, is it worth it to make sure use of the reference counted
> object is correct and misuse is easier to diagnose?

You mean the stackdump for every get/put as I did? That's a good
question. Let's think what may happen.

For example, if funcA() expects its caller funcB() will put the object
but actually funcB() doesn't, or the funcC() which is the another
caller of funcA()) doesn't expect the funcA() gets the object.

funcA() {
	get(obj);
	return obj;
}

funcB() {
	obj = funcA();
	...
	// wrong! it should do put(obj);
}

funcC() {
	obj = funcA();
	get(obj);		// this is wrong get().
	...
	put(obj);
}

If we just list the non-released object, both logs seems same because
funcB()'s get/put pair will be skipped. If the analyzer shows the
stacktrace when the object was got, maybe we can notice the difference
of funcB() and funcC() path, but this is the simplest case. funcA()
can be called from funcB/C via several different functions.
But perhaps I'm too worried.

Thank you,

> I think it is near
> as least offensive as possible while providing the best information -
> hence being able to solve the dso->nsinfo put data race, that has been
> a problem to solve up to this point. I'm open to better suggestions
> though :-)
> 
> Thanks again,
> Ian
> 
> > Thank you,
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-28 15:34       ` Masami Hiramatsu
@ 2022-01-28 18:26         ` Ian Rogers
  2022-01-28 19:59           ` Arnaldo Carvalho de Melo
                             ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Ian Rogers @ 2022-01-28 18:26 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Fri, Jan 28, 2022 at 7:35 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 27 Jan 2022 22:24:59 -0800
> Ian Rogers <irogers@google.com> wrote:
>
> > On Thu, Jan 27, 2022 at 9:24 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Thu, 27 Jan 2022 13:33:23 -0800
> > > Ian Rogers <irogers@google.com> wrote:
> > >
> > > > On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> > > > >
> > > > > This v2 patch set has the main reference count patch for cpu map from
> > > > > the first set and then adds reference count checking to nsinfo. The
> > > > > reference count checking on nsinfo helped diagnose a data race bug
> > > > > which is fixed in the independent patches 2 and 3.
> > > > >
> > > > > The perf tool has a class of memory problems where reference counts
> > > > > are used incorrectly. Memory/address sanitizers and valgrind don't
> > > > > provide useful ways to debug these problems, you see a memory leak
> > > > > where the only pertinent information is the original allocation
> > > > > site. What would be more useful is knowing where a get fails to have a
> > > > > corresponding put, where there are double puts, etc.
> > > > >
> > > > > This work was motivated by the roll-back of:
> > > > > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > > > > where fixing a missed put resulted in a use-after-free in a different
> > > > > context. There was a sense in fixing the issue that a game of
> > > > > wac-a-mole had been embarked upon in adding missed gets and puts.
> > > > >
> > > > > The basic approach of the change is to add a level of indirection at
> > > > > the get and put calls. Get allocates a level of indirection that, if
> > > > > no corresponding put is called, becomes a memory leak (and associated
> > > > > stack trace) that leak sanitizer can report. Similarly if two puts are
> > > > > called for the same get, then a double free can be detected by address
> > > > > sanitizer. This can also detect the use after put, which should also
> > > > > yield a segv without a sanitizer.
> > > > >
> > > > > Adding reference count checking to cpu map was done as a proof of
> > > > > concept, it yielded little other than a location where the use of get
> > > > > could be cleaner by using its result. Reference count checking on
> > > > > nsinfo identified a double free of the indirection layer and the
> > > > > related threads, thereby identifying a data race as discussed here:
> > > > > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > > > > Accordingly the dso->lock was extended and use to cover the race.
> > > > >
> > > > > An alternative that was considered was ref_tracker:
> > > > >  https://lwn.net/Articles/877603/
> > > > > ref_tracker requires use of a reference counted struct to also use a
> > > > > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > > > > to spot double puts. When an object is finished with leaks can be
> > > > > detected, as with this approach when leak analysis happens. This
> > > > > approach was preferred as it doesn't introduce cookies, spots use
> > > > > after put and appears moderately more neutral to the API. Weaknesses
> > > > > of the implemented approcah are not being able to do adhoc leak
> > > > > detection and a preference for adding an accessor API to structs. I
> > > > > believe there are other issues and welcome suggestions.
> > > >
> > > > And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> > > > https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> > > > by Masami Hiramatsu. In this work he adds a leak sanitizer style
> > > > reference count checker that will describe locations of puts and gets
> > > > for diagnosis. Firstly that's an awesome achievement! This work is
> > > > different in that it isn't trying to invent a leak sanitizer, it is
> > > > just using the existing one. By adding a level of indirection this
> > > > work can catch use after put and pairs gets with puts to make lifetime
> > > > analysis more automatic. An advantage of Masami's work is that it
> > > > doesn't change data-structures and after the initial patch-set is
> > > > somewhat transparent. Overall I prefer the approach in these patches,
> > > > future patches can look to clean up the API as Masami has.
> > >
> > > Thanks for referring my series :-D The series aimed to solve the refcount
> > > usage issue in the perf which lead the object leaks. At that moment,
> > > I found that there were 2 patterns, refcount start from 0 and start from 1.
> > > That made me confused what I should do for using a object.
> > > But the perf uses linux/refcount.h now, I hope such issue has already gone.
> > > (but the object leakage seems not fixed fully yet, as you found.)
> > >
> > > BTW, I think the introducing UNWRAP_* macro may put a burden on future
> > > development. If it is inevitable, we should consider it as carefully as
> > > possible. Or, it may cause another issue (it is easily missed that the new
> > > patch does not use UNWRAP_* for object reference, because it is natual.)
> > >
> > > So I agree with you that you to clean up the API. :-)
> > > I think we can make yet another refcount.h for user space debugging and
> > > replace it with the linux/refcount.h.
> >
> > Thanks Masami,
> >
> > Agreed on the UNWRAP_ macros, hence wanting to hide them behind
> > accessors. Making accessors could be automated with macros, for
> > example, have a list of variables, have a macro declare the struct
> > using the list, another macro can use the list to declare accessors. I
> > didn't find adding the UNWRAP_ macros in this change particularly
> > burdensome as any use of the wrapping pointer as the original type
> > caused a compile time error telling you what and where to fix. The
> > macro is extra stuff in the way of using just the raw object, but
> > that's fairly typical in C++ with shared_ptr, scoped_lock, etc.
>
> Hi Ian,
>
> Hmm, but such a macro is not usual for C which perf is written in.
> If I understand correctly, you might want to use memory leak
> analyzer to detect refcount leak, and that analyzer will show
> what data structure is leaked.

Firstly, thanks for the conversation - this is really useful to
improve the code!

I think in an ideal world we'd somehow educate things like address
sanitizer of reference counted data structures and they would do a
better job of tracking gets and puts. The problem is pairing gets and
puts. In C++ you use RAII types so that the destructor ensures a put -
this can be complex when using data types like lists where you want to
move or swap things onto the list, to keep the single pointer
property. In the C code in Linux we use gotos, similarly to how defer
is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
the get/put pairing problem by adding a cookie that is passed around.
The problem with that is that then the cookie becomes part of the API.
To avoid that the approach here is just to change the original data
type and add in a layer of indirection, that layer has become the
cookie. A benefit of this approach is that once the cookie/indirection
is freed, use of it becomes an obvious failure - we leverage address
sanitizer for use after free.

> If so, maybe you can do the same thing by introducing a dummy
> list node for each data structure which you want to debug.
>
> struct perf_cpu_map__refdebug {
>         struct perf_cpu_map__refdebug *orig;
> };
>
> And expand refcount_t as.
>
> typedef struct refcount_struct {
>         atomic_t refs;
> #ifdef REFCNT_CHECKING
>         void *orig;
> #endif
> } refcount_t;
>
> And change the get/put as below
>
> struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
> {
>         if (map) {
> #ifdef REFCNT_CHECKING
>                 struct perf_cpu_map__refdebug *new_node;
> #endif
>                 refcount_inc(&map->refcnt);
> #ifdef REFCNT_CHECKING
>                 new_node = malloc(sizeof(*new_node));
>                 new_node->orig = map->refcnt->orig;
>                 map->refcnt->orig = new_node;
> #endif
>         }
>         return map;
> }
>
> void perf_cpu_map__put(struct perf_cpu_map *map)
> {
>         if (map) {
>                 if (refcount_dec_and_test(&map->refcnt))
>                         cpu_map__delete(map);
>                 else {
> #ifdef REFCNT_CHECKING
>                         struct perf_cpu_map__refdebug *node = map->refcnt->orig;
>
>                         map->refcnt->orig = node->orig;
>                         free(node);
> #endif
>                 }
>         }
> }
>
> This need a bit complex get/put, but no need to change other parts.

Adding a list like this gives an ability to say something like of the
current reference count of 3 what indirection objects exist. This
could be useful for diagnosis but you probably want to pair it with a
stack trace, and the approach here is punting that problem to the
address/leak sanitizer. I'm also concerned that there should be a lock
around the list. I think pursuing this ends up with something like
ref_tracker.

If we're using indirection, as in my proposal, then adding a common
indirection struct is problematic as anything declared to be a "struct
cpumap" now needs to be either the indirection or the original type -
hence using macros to hide that in the code. If we embed the
information into the refcount_t then we end up with something like
ref_tracker, API problems and losing use-after-put checking. Outside
of the macros, I think there is a simplicity to the approach I've put
forward.

> > The
> > question is, is it worth it to make sure use of the reference counted
> > object is correct and misuse is easier to diagnose?
>
> You mean the stackdump for every get/put as I did? That's a good
> question. Let's think what may happen.
>
> For example, if funcA() expects its caller funcB() will put the object
> but actually funcB() doesn't, or the funcC() which is the another
> caller of funcA()) doesn't expect the funcA() gets the object.
>
> funcA() {
>         get(obj);
>         return obj;
> }
>
> funcB() {
>         obj = funcA();
>         ...
>         // wrong! it should do put(obj);
> }
>
> funcC() {
>         obj = funcA();
>         get(obj);               // this is wrong get().
>         ...
>         put(obj);
> }
>
> If we just list the non-released object, both logs seems same because
> funcB()'s get/put pair will be skipped. If the analyzer shows the
> stacktrace when the object was got, maybe we can notice the difference
> of funcB() and funcC() path, but this is the simplest case. funcA()
> can be called from funcB/C via several different functions.
> But perhaps I'm too worried.

So in the logs we should see for funcB:

Memory leak of ... at:
malloc...
get...
funcA
funcB
...

as the put on the indirection object was missed and this is now a leak
of the indirection object. For funcC we should see:

Memory leak of ... at:
malloc..
get..
funcA
funcC

So from the stack traces we can see that there is an unpaired get
happening in funcA called from either funcB and funcC, which means we
need to a put there. In the funcC case we can see the put was missed
from a call to funcA, rather than a get it made.

As the code in perf is complex, multi-threaded and sometimes
unintentionally racy a get may happen on 1 thread, the object is
placed in a global, the object is put by another thread and also
accessed by a 3rd thread. This is what was happening in the
dso->nsinfo case. The bug there is that there was a double put
happening by the third thread because of a race. Leak sanitizer treats
memory visible from a global as not a leak, this can mean to get the
most information on leaks in perf we need to aggressively
free/delete/deconstruct when terminating so that leaks become visible.
This feels to me like good hygiene, but it could also be argued to be
a tax.

Anyway, I think I'm still at the same point I was when I posted these
changes. That an indirection object is the simplest, smallest,
cleanest way to get the most information. I think making the rest of
the reference counted data structures have this feature would be
great, so I'd like to merge the 4 patches here and work to add more. I
think we can also build on that foundation for extra debug
information.

Thanks,
Ian

> Thank you,
>
> > I think it is near
> > as least offensive as possible while providing the best information -
> > hence being able to solve the dso->nsinfo put data race, that has been
> > a problem to solve up to this point. I'm open to better suggestions
> > though :-)
> >
> > Thanks again,
> > Ian
> >
> > > Thank you,
> > >
> > > --
> > > Masami Hiramatsu <mhiramat@kernel.org>
>
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-28 18:26         ` Ian Rogers
@ 2022-01-28 19:59           ` Arnaldo Carvalho de Melo
  2022-01-30  8:04             ` Masami Hiramatsu
  2022-01-30  7:54           ` Masami Hiramatsu
  2022-01-31 13:56           ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-28 19:59 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Masami Hiramatsu, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Darren Hart, Davidlohr Bueso, André Almeida, James Clark,
	John Garry, Riccardo Mancini, Yury Norov, Andy Shevchenko,
	Andrew Morton, Jin Yao, Adrian Hunter, Leo Yan, Andi Kleen,
	Thomas Richter, Kan Liang, Madhavan Srinivasan,
	Shunsuke Nakamura, Song Liu, Steven Rostedt, Miaoqian Lin,
	Stephen Brennan, Kajol Jain, Alexey Bayduraev, German Gomez,
	linux-perf-users, linux-kernel, Eric Dumazet, Dmitry Vyukov,
	masami.hiramatsu.pt, eranian

Em Fri, Jan 28, 2022 at 10:26:20AM -0800, Ian Rogers escreveu:
> On Fri, Jan 28, 2022 at 7:35 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 27 Jan 2022 22:24:59 -0800
> > Ian Rogers <irogers@google.com> wrote:
> >
> > > On Thu, Jan 27, 2022 at 9:24 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > On Thu, 27 Jan 2022 13:33:23 -0800
> > > > Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > > On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> > > > > >
> > > > > > This v2 patch set has the main reference count patch for cpu map from
> > > > > > the first set and then adds reference count checking to nsinfo. The
> > > > > > reference count checking on nsinfo helped diagnose a data race bug
> > > > > > which is fixed in the independent patches 2 and 3.
> > > > > >
> > > > > > The perf tool has a class of memory problems where reference counts
> > > > > > are used incorrectly. Memory/address sanitizers and valgrind don't
> > > > > > provide useful ways to debug these problems, you see a memory leak
> > > > > > where the only pertinent information is the original allocation
> > > > > > site. What would be more useful is knowing where a get fails to have a
> > > > > > corresponding put, where there are double puts, etc.
> > > > > >
> > > > > > This work was motivated by the roll-back of:
> > > > > > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > > > > > where fixing a missed put resulted in a use-after-free in a different
> > > > > > context. There was a sense in fixing the issue that a game of
> > > > > > wac-a-mole had been embarked upon in adding missed gets and puts.
> > > > > >
> > > > > > The basic approach of the change is to add a level of indirection at
> > > > > > the get and put calls. Get allocates a level of indirection that, if
> > > > > > no corresponding put is called, becomes a memory leak (and associated
> > > > > > stack trace) that leak sanitizer can report. Similarly if two puts are
> > > > > > called for the same get, then a double free can be detected by address
> > > > > > sanitizer. This can also detect the use after put, which should also
> > > > > > yield a segv without a sanitizer.
> > > > > >
> > > > > > Adding reference count checking to cpu map was done as a proof of
> > > > > > concept, it yielded little other than a location where the use of get
> > > > > > could be cleaner by using its result. Reference count checking on
> > > > > > nsinfo identified a double free of the indirection layer and the
> > > > > > related threads, thereby identifying a data race as discussed here:
> > > > > > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > > > > > Accordingly the dso->lock was extended and use to cover the race.
> > > > > >
> > > > > > An alternative that was considered was ref_tracker:
> > > > > >  https://lwn.net/Articles/877603/
> > > > > > ref_tracker requires use of a reference counted struct to also use a
> > > > > > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > > > > > to spot double puts. When an object is finished with leaks can be
> > > > > > detected, as with this approach when leak analysis happens. This
> > > > > > approach was preferred as it doesn't introduce cookies, spots use
> > > > > > after put and appears moderately more neutral to the API. Weaknesses
> > > > > > of the implemented approcah are not being able to do adhoc leak
> > > > > > detection and a preference for adding an accessor API to structs. I
> > > > > > believe there are other issues and welcome suggestions.
> > > > >
> > > > > And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> > > > > https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> > > > > by Masami Hiramatsu. In this work he adds a leak sanitizer style
> > > > > reference count checker that will describe locations of puts and gets
> > > > > for diagnosis. Firstly that's an awesome achievement! This work is
> > > > > different in that it isn't trying to invent a leak sanitizer, it is
> > > > > just using the existing one. By adding a level of indirection this
> > > > > work can catch use after put and pairs gets with puts to make lifetime
> > > > > analysis more automatic. An advantage of Masami's work is that it
> > > > > doesn't change data-structures and after the initial patch-set is
> > > > > somewhat transparent. Overall I prefer the approach in these patches,
> > > > > future patches can look to clean up the API as Masami has.
> > > >
> > > > Thanks for referring my series :-D The series aimed to solve the refcount
> > > > usage issue in the perf which lead the object leaks. At that moment,
> > > > I found that there were 2 patterns, refcount start from 0 and start from 1.
> > > > That made me confused what I should do for using a object.
> > > > But the perf uses linux/refcount.h now, I hope such issue has already gone.
> > > > (but the object leakage seems not fixed fully yet, as you found.)
> > > >
> > > > BTW, I think the introducing UNWRAP_* macro may put a burden on future
> > > > development. If it is inevitable, we should consider it as carefully as
> > > > possible. Or, it may cause another issue (it is easily missed that the new
> > > > patch does not use UNWRAP_* for object reference, because it is natual.)
> > > >
> > > > So I agree with you that you to clean up the API. :-)
> > > > I think we can make yet another refcount.h for user space debugging and
> > > > replace it with the linux/refcount.h.
> > >
> > > Thanks Masami,
> > >
> > > Agreed on the UNWRAP_ macros, hence wanting to hide them behind
> > > accessors. Making accessors could be automated with macros, for
> > > example, have a list of variables, have a macro declare the struct
> > > using the list, another macro can use the list to declare accessors. I
> > > didn't find adding the UNWRAP_ macros in this change particularly
> > > burdensome as any use of the wrapping pointer as the original type
> > > caused a compile time error telling you what and where to fix. The
> > > macro is extra stuff in the way of using just the raw object, but
> > > that's fairly typical in C++ with shared_ptr, scoped_lock, etc.
> >
> > Hi Ian,
> >
> > Hmm, but such a macro is not usual for C which perf is written in.
> > If I understand correctly, you might want to use memory leak
> > analyzer to detect refcount leak, and that analyzer will show
> > what data structure is leaked.
> 
> Firstly, thanks for the conversation - this is really useful to
> improve the code!
> 
> I think in an ideal world we'd somehow educate things like address
> sanitizer of reference counted data structures and they would do a
> better job of tracking gets and puts. The problem is pairing gets and
> puts. In C++ you use RAII types so that the destructor ensures a put -
> this can be complex when using data types like lists where you want to
> move or swap things onto the list, to keep the single pointer
> property. In the C code in Linux we use gotos, similarly to how defer
> is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> the get/put pairing problem by adding a cookie that is passed around.
> The problem with that is that then the cookie becomes part of the API.
> To avoid that the approach here is just to change the original data
> type and add in a layer of indirection, that layer has become the
> cookie. A benefit of this approach is that once the cookie/indirection
> is freed, use of it becomes an obvious failure - we leverage address
> sanitizer for use after free.

I went back to that discussion and saw this part where I brainstormed
about doing all this in unmodified binaries:

https://lore.kernel.org/all/20151209134138.GB15864@kernel.org/

Even Alexei chimed in and replied to that thinking it was doable:

https://lore.kernel.org/all/20151210033139.GA10056@ast-mbp.thefacebook.com/#t

And nowadays we have much better BPF infrastructure, much faster probes,
etc.

But anyway, like at that opportunity, I thank you guys for working on
such infrastructure, in 2015 several bugs were found and fixed with that
refcount debbuger, as is now the case with Ian's attempt.

Thanks!

- Arnaldo
 
> > If so, maybe you can do the same thing by introducing a dummy
> > list node for each data structure which you want to debug.
> >
> > struct perf_cpu_map__refdebug {
> >         struct perf_cpu_map__refdebug *orig;
> > };
> >
> > And expand refcount_t as.
> >
> > typedef struct refcount_struct {
> >         atomic_t refs;
> > #ifdef REFCNT_CHECKING
> >         void *orig;
> > #endif
> > } refcount_t;
> >
> > And change the get/put as below
> >
> > struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
> > {
> >         if (map) {
> > #ifdef REFCNT_CHECKING
> >                 struct perf_cpu_map__refdebug *new_node;
> > #endif
> >                 refcount_inc(&map->refcnt);
> > #ifdef REFCNT_CHECKING
> >                 new_node = malloc(sizeof(*new_node));
> >                 new_node->orig = map->refcnt->orig;
> >                 map->refcnt->orig = new_node;
> > #endif
> >         }
> >         return map;
> > }
> >
> > void perf_cpu_map__put(struct perf_cpu_map *map)
> > {
> >         if (map) {
> >                 if (refcount_dec_and_test(&map->refcnt))
> >                         cpu_map__delete(map);
> >                 else {
> > #ifdef REFCNT_CHECKING
> >                         struct perf_cpu_map__refdebug *node = map->refcnt->orig;
> >
> >                         map->refcnt->orig = node->orig;
> >                         free(node);
> > #endif
> >                 }
> >         }
> > }
> >
> > This need a bit complex get/put, but no need to change other parts.
> 
> Adding a list like this gives an ability to say something like of the
> current reference count of 3 what indirection objects exist. This
> could be useful for diagnosis but you probably want to pair it with a
> stack trace, and the approach here is punting that problem to the
> address/leak sanitizer. I'm also concerned that there should be a lock
> around the list. I think pursuing this ends up with something like
> ref_tracker.
> 
> If we're using indirection, as in my proposal, then adding a common
> indirection struct is problematic as anything declared to be a "struct
> cpumap" now needs to be either the indirection or the original type -
> hence using macros to hide that in the code. If we embed the
> information into the refcount_t then we end up with something like
> ref_tracker, API problems and losing use-after-put checking. Outside
> of the macros, I think there is a simplicity to the approach I've put
> forward.
> 
> > > The
> > > question is, is it worth it to make sure use of the reference counted
> > > object is correct and misuse is easier to diagnose?
> >
> > You mean the stackdump for every get/put as I did? That's a good
> > question. Let's think what may happen.
> >
> > For example, if funcA() expects its caller funcB() will put the object
> > but actually funcB() doesn't, or the funcC() which is the another
> > caller of funcA()) doesn't expect the funcA() gets the object.
> >
> > funcA() {
> >         get(obj);
> >         return obj;
> > }
> >
> > funcB() {
> >         obj = funcA();
> >         ...
> >         // wrong! it should do put(obj);
> > }
> >
> > funcC() {
> >         obj = funcA();
> >         get(obj);               // this is wrong get().
> >         ...
> >         put(obj);
> > }
> >
> > If we just list the non-released object, both logs seems same because
> > funcB()'s get/put pair will be skipped. If the analyzer shows the
> > stacktrace when the object was got, maybe we can notice the difference
> > of funcB() and funcC() path, but this is the simplest case. funcA()
> > can be called from funcB/C via several different functions.
> > But perhaps I'm too worried.
> 
> So in the logs we should see for funcB:
> 
> Memory leak of ... at:
> malloc...
> get...
> funcA
> funcB
> ...
> 
> as the put on the indirection object was missed and this is now a leak
> of the indirection object. For funcC we should see:
> 
> Memory leak of ... at:
> malloc..
> get..
> funcA
> funcC
> 
> So from the stack traces we can see that there is an unpaired get
> happening in funcA called from either funcB and funcC, which means we
> need to a put there. In the funcC case we can see the put was missed
> from a call to funcA, rather than a get it made.
> 
> As the code in perf is complex, multi-threaded and sometimes
> unintentionally racy a get may happen on 1 thread, the object is
> placed in a global, the object is put by another thread and also
> accessed by a 3rd thread. This is what was happening in the
> dso->nsinfo case. The bug there is that there was a double put
> happening by the third thread because of a race. Leak sanitizer treats
> memory visible from a global as not a leak, this can mean to get the
> most information on leaks in perf we need to aggressively
> free/delete/deconstruct when terminating so that leaks become visible.
> This feels to me like good hygiene, but it could also be argued to be
> a tax.
> 
> Anyway, I think I'm still at the same point I was when I posted these
> changes. That an indirection object is the simplest, smallest,
> cleanest way to get the most information. I think making the rest of
> the reference counted data structures have this feature would be
> great, so I'd like to merge the 4 patches here and work to add more. I
> think we can also build on that foundation for extra debug
> information.
> 
> Thanks,
> Ian
> 
> > Thank you,
> >
> > > I think it is near
> > > as least offensive as possible while providing the best information -
> > > hence being able to solve the dso->nsinfo put data race, that has been
> > > a problem to solve up to this point. I'm open to better suggestions
> > > though :-)
> > >
> > > Thanks again,
> > > Ian
> > >
> > > > Thank you,
> > > >
> > > > --
> > > > Masami Hiramatsu <mhiramat@kernel.org>
> >
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-28 18:26         ` Ian Rogers
  2022-01-28 19:59           ` Arnaldo Carvalho de Melo
@ 2022-01-30  7:54           ` Masami Hiramatsu
  2022-01-30 17:40             ` Ian Rogers
  2022-01-31 13:56           ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 20+ messages in thread
From: Masami Hiramatsu @ 2022-01-30  7:54 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Fri, 28 Jan 2022 10:26:20 -0800
Ian Rogers <irogers@google.com> wrote:

> On Fri, Jan 28, 2022 at 7:35 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 27 Jan 2022 22:24:59 -0800
> > Ian Rogers <irogers@google.com> wrote:
> >
> > > On Thu, Jan 27, 2022 at 9:24 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > On Thu, 27 Jan 2022 13:33:23 -0800
> > > > Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > > On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> > > > > >
> > > > > > This v2 patch set has the main reference count patch for cpu map from
> > > > > > the first set and then adds reference count checking to nsinfo. The
> > > > > > reference count checking on nsinfo helped diagnose a data race bug
> > > > > > which is fixed in the independent patches 2 and 3.
> > > > > >
> > > > > > The perf tool has a class of memory problems where reference counts
> > > > > > are used incorrectly. Memory/address sanitizers and valgrind don't
> > > > > > provide useful ways to debug these problems, you see a memory leak
> > > > > > where the only pertinent information is the original allocation
> > > > > > site. What would be more useful is knowing where a get fails to have a
> > > > > > corresponding put, where there are double puts, etc.
> > > > > >
> > > > > > This work was motivated by the roll-back of:
> > > > > > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > > > > > where fixing a missed put resulted in a use-after-free in a different
> > > > > > context. There was a sense in fixing the issue that a game of
> > > > > > wac-a-mole had been embarked upon in adding missed gets and puts.
> > > > > >
> > > > > > The basic approach of the change is to add a level of indirection at
> > > > > > the get and put calls. Get allocates a level of indirection that, if
> > > > > > no corresponding put is called, becomes a memory leak (and associated
> > > > > > stack trace) that leak sanitizer can report. Similarly if two puts are
> > > > > > called for the same get, then a double free can be detected by address
> > > > > > sanitizer. This can also detect the use after put, which should also
> > > > > > yield a segv without a sanitizer.
> > > > > >
> > > > > > Adding reference count checking to cpu map was done as a proof of
> > > > > > concept, it yielded little other than a location where the use of get
> > > > > > could be cleaner by using its result. Reference count checking on
> > > > > > nsinfo identified a double free of the indirection layer and the
> > > > > > related threads, thereby identifying a data race as discussed here:
> > > > > > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > > > > > Accordingly the dso->lock was extended and use to cover the race.
> > > > > >
> > > > > > An alternative that was considered was ref_tracker:
> > > > > >  https://lwn.net/Articles/877603/
> > > > > > ref_tracker requires use of a reference counted struct to also use a
> > > > > > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > > > > > to spot double puts. When an object is finished with leaks can be
> > > > > > detected, as with this approach when leak analysis happens. This
> > > > > > approach was preferred as it doesn't introduce cookies, spots use
> > > > > > after put and appears moderately more neutral to the API. Weaknesses
> > > > > > of the implemented approcah are not being able to do adhoc leak
> > > > > > detection and a preference for adding an accessor API to structs. I
> > > > > > believe there are other issues and welcome suggestions.
> > > > >
> > > > > And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> > > > > https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> > > > > by Masami Hiramatsu. In this work he adds a leak sanitizer style
> > > > > reference count checker that will describe locations of puts and gets
> > > > > for diagnosis. Firstly that's an awesome achievement! This work is
> > > > > different in that it isn't trying to invent a leak sanitizer, it is
> > > > > just using the existing one. By adding a level of indirection this
> > > > > work can catch use after put and pairs gets with puts to make lifetime
> > > > > analysis more automatic. An advantage of Masami's work is that it
> > > > > doesn't change data-structures and after the initial patch-set is
> > > > > somewhat transparent. Overall I prefer the approach in these patches,
> > > > > future patches can look to clean up the API as Masami has.
> > > >
> > > > Thanks for referring my series :-D The series aimed to solve the refcount
> > > > usage issue in the perf which lead the object leaks. At that moment,
> > > > I found that there were 2 patterns, refcount start from 0 and start from 1.
> > > > That made me confused what I should do for using a object.
> > > > But the perf uses linux/refcount.h now, I hope such issue has already gone.
> > > > (but the object leakage seems not fixed fully yet, as you found.)
> > > >
> > > > BTW, I think the introducing UNWRAP_* macro may put a burden on future
> > > > development. If it is inevitable, we should consider it as carefully as
> > > > possible. Or, it may cause another issue (it is easily missed that the new
> > > > patch does not use UNWRAP_* for object reference, because it is natual.)
> > > >
> > > > So I agree with you that you to clean up the API. :-)
> > > > I think we can make yet another refcount.h for user space debugging and
> > > > replace it with the linux/refcount.h.
> > >
> > > Thanks Masami,
> > >
> > > Agreed on the UNWRAP_ macros, hence wanting to hide them behind
> > > accessors. Making accessors could be automated with macros, for
> > > example, have a list of variables, have a macro declare the struct
> > > using the list, another macro can use the list to declare accessors. I
> > > didn't find adding the UNWRAP_ macros in this change particularly
> > > burdensome as any use of the wrapping pointer as the original type
> > > caused a compile time error telling you what and where to fix. The
> > > macro is extra stuff in the way of using just the raw object, but
> > > that's fairly typical in C++ with shared_ptr, scoped_lock, etc.
> >
> > Hi Ian,
> >
> > Hmm, but such a macro is not usual for C which perf is written in.
> > If I understand correctly, you might want to use memory leak
> > analyzer to detect refcount leak, and that analyzer will show
> > what data structure is leaked.
> 
> Firstly, thanks for the conversation - this is really useful to
> improve the code!

Hi Ian,

You're welcome! This conversation also useful to me to understand
the issue deeper :-)

> I think in an ideal world we'd somehow educate things like address
> sanitizer of reference counted data structures and they would do a
> better job of tracking gets and puts. The problem is pairing gets and
> puts.

Agreed. From my experience, pairing gets and puts are hard without
reviewing the source code, since the refcounter is not only used in a
single function, but it is for keeping the object not released until
some long process is finished.

For example, if the correct pair is like below, funcA-funcD, funcB-funcC,
funcA (get)
funcB (get)
funcC (put)
funcD (put)

But depending on the execution timing, funcC/funcD can be swapped.
funcA (get)
funcB (get)
funcD (put)
funcC (put)

And if there is a bug, funcX may get the object by mistake.
funcA (get)
funcX (get)
funcB (get)
funcD (put)
funcC (put)

Or, funcC() might miss to put.
funcA (get)
funcB (get)
funcD (put)

In any case, just tracking the get/put, it is hard to know which pair
is not right. I saw these patterns when I debugged it. :(

> In C++ you use RAII types so that the destructor ensures a put -
> this can be complex when using data types like lists where you want to
> move or swap things onto the list, to keep the single pointer
> property. In the C code in Linux we use gotos, similarly to how defer
> is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> the get/put pairing problem by adding a cookie that is passed around.

Cool! I will check the ref_tracker :)

> The problem with that is that then the cookie becomes part of the API.

What the cookie is? some pairing function address??

> To avoid that the approach here is just to change the original data
> type and add in a layer of indirection, that layer has become the
> cookie. A benefit of this approach is that once the cookie/indirection
> is freed, use of it becomes an obvious failure - we leverage address
> sanitizer for use after free.

Ah, got it. I thought that was anyway caught by address sanitizer by default, wasn't?

> 
> > If so, maybe you can do the same thing by introducing a dummy
> > list node for each data structure which you want to debug.
> >
> > struct perf_cpu_map__refdebug {
> >         struct perf_cpu_map__refdebug *orig;
> > };
> >
> > And expand refcount_t as.
> >
> > typedef struct refcount_struct {
> >         atomic_t refs;
> > #ifdef REFCNT_CHECKING
> >         void *orig;
> > #endif
> > } refcount_t;
> >
> > And change the get/put as below
> >
> > struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
> > {
> >         if (map) {
> > #ifdef REFCNT_CHECKING
> >                 struct perf_cpu_map__refdebug *new_node;
> > #endif
> >                 refcount_inc(&map->refcnt);
> > #ifdef REFCNT_CHECKING
> >                 new_node = malloc(sizeof(*new_node));
> >                 new_node->orig = map->refcnt->orig;
> >                 map->refcnt->orig = new_node;
> > #endif
> >         }
> >         return map;
> > }
> >
> > void perf_cpu_map__put(struct perf_cpu_map *map)
> > {
> >         if (map) {
> >                 if (refcount_dec_and_test(&map->refcnt))
> >                         cpu_map__delete(map);
> >                 else {
> > #ifdef REFCNT_CHECKING
> >                         struct perf_cpu_map__refdebug *node = map->refcnt->orig;
> >
> >                         map->refcnt->orig = node->orig;
> >                         free(node);
> > #endif
> >                 }
> >         }
> > }
> >
> > This need a bit complex get/put, but no need to change other parts.
> 
> Adding a list like this gives an ability to say something like of the
> current reference count of 3 what indirection objects exist. This
> could be useful for diagnosis but you probably want to pair it with a
> stack trace, and the approach here is punting that problem to the
> address/leak sanitizer.

I think the pairing get/put should be done by manual. Without
reviewing the code, it is hard to know what was expected by the
original developer. :(

> I'm also concerned that there should be a lock
> around the list. I think pursuing this ends up with something like
> ref_tracker.

Good catch! Indeed, a lock is needed.

> If we're using indirection, as in my proposal, then adding a common
> indirection struct is problematic as anything declared to be a "struct
> cpumap" now needs to be either the indirection or the original type -
> hence using macros to hide that in the code. If we embed the
> information into the refcount_t then we end up with something like
> ref_tracker, API problems and losing use-after-put checking. Outside
> of the macros, I think there is a simplicity to the approach I've put
> forward.

OK, so the "use-after-put" is what you solve. That is indeed hard to
check without introducing indirect pointer, because "use-after-put"
is different from "use-after-free", it will use the same "object"
but that has just a decremented refcounter.

funcA(obj) {
  get(obj);
  ...
  put(obj);
}

funcA2(obj) {
  ...
  obj_access(obj); <- //use-after-put? (or use-no-get?)
}

funcB() {
  obj = new_obj()
  get(obj);
  funcA(obj);
  funcA2(obj);
  put(obj);
}

So my question is that we need to pay the cost to use UNWRAP_ macro
on all those object just for finding the "use-after-put" case.
Usually the case that "use-after-put" causes actual problem is very
limited, or it can be "use-after-free".

funcA(obj) {
  get(obj);
  ...
  put(obj);	// without funcB()'s get, obj will be freed here.
}

funcB() {
  obj = new_obj()
  // get(obj) <-- forgot to get the refcounter
  funcA(obj)
  obj_access(obj) <- //use after put!
}

In above case, use-after-put will be use-after-free, because obj is
freed.

If we can re-define the '->' operator, this is completely hidden
from programmers, but C has no such feature. Maybe we can use a
hidden data structure and accessors to prohibit user using '->'
on such object (e.g. elfutils) but it may increase runtime
overhead (LTO might be a remedy). And anyway inside the object
definition file, it need to use UNWRAP_ macro (also, it must use
object pointer returned from get() method.)

> 
> > > The
> > > question is, is it worth it to make sure use of the reference counted
> > > object is correct and misuse is easier to diagnose?
> >
> > You mean the stackdump for every get/put as I did? That's a good
> > question. Let's think what may happen.
> >
> > For example, if funcA() expects its caller funcB() will put the object
> > but actually funcB() doesn't, or the funcC() which is the another
> > caller of funcA()) doesn't expect the funcA() gets the object.
> >
> > funcA() {
> >         get(obj);
> >         return obj;
> > }
> >
> > funcB() {
> >         obj = funcA();
> >         ...
> >         // wrong! it should do put(obj);
> > }
> >
> > funcC() {
> >         obj = funcA();
> >         get(obj);               // this is wrong get().
> >         ...
> >         put(obj);
> > }
> >
> > If we just list the non-released object, both logs seems same because
> > funcB()'s get/put pair will be skipped. If the analyzer shows the
> > stacktrace when the object was got, maybe we can notice the difference
> > of funcB() and funcC() path, but this is the simplest case. funcA()
> > can be called from funcB/C via several different functions.
> > But perhaps I'm too worried.
> 
> So in the logs we should see for funcB:
> 
> Memory leak of ... at:
> malloc...
> get...
> funcA
> funcB
> ...

Yeah, OK. and I think this can be done without anything because "obj"
is not released anyway. Am I correct?

> 
> as the put on the indirection object was missed and this is now a leak
> of the indirection object. For funcC we should see:
> 
> Memory leak of ... at:
> malloc..
> get..
> funcA
> funcC

And this is also be done with my proposal too. (of course the leaked
object type name will be perf_cpu_map__refdebug)

> 
> So from the stack traces we can see that there is an unpaired get
> happening in funcA called from either funcB and funcC, which means we
> need to a put there. In the funcC case we can see the put was missed
> from a call to funcA, rather than a get it made.

So you meant we only need a log of the malloced (== get) place with
backtrace, right? Yeah, I think that may be enoguh. When I wrote my
previous work (2015), I tried to expose information as much as possible. 

> 
> As the code in perf is complex, multi-threaded and sometimes
> unintentionally racy a get may happen on 1 thread, the object is
> placed in a global, the object is put by another thread and also
> accessed by a 3rd thread. This is what was happening in the
> dso->nsinfo case. The bug there is that there was a double put
> happening by the third thread because of a race. Leak sanitizer treats
> memory visible from a global as not a leak, this can mean to get the
> most information on leaks in perf we need to aggressively
> free/delete/deconstruct when terminating so that leaks become visible.
> This feels to me like good hygiene, but it could also be argued to be
> a tax.

Got it. So we need to make the global object visible to leak sanitizer.
For this reason you made get() to malloc() (get will allocate a
local object.)

> 
> Anyway, I think I'm still at the same point I was when I posted these
> changes. That an indirection object is the simplest, smallest,
> cleanest way to get the most information. I think making the rest of
> the reference counted data structures have this feature would be
> great, so I'd like to merge the 4 patches here and work to add more. I
> think we can also build on that foundation for extra debug
> information.

For me, your change aims to 2 goals, (1) catch the global object leak,
(2) catch the "use-after-put". And I think (1) is possible to be solved
without the indirection object because the required technic is changing
get() into malloc() some data. On the other hand, the (2) is solved only
by the indirection object. And I think the case that (2) becomes real
problem is when it turns into "use-after-free". So I think it may be
possible to be caught by the address sanitizer.


Thank you,

> 
> Thanks,
> Ian
> 
> > Thank you,
> >
> > > I think it is near
> > > as least offensive as possible while providing the best information -
> > > hence being able to solve the dso->nsinfo put data race, that has been
> > > a problem to solve up to this point. I'm open to better suggestions
> > > though :-)
> > >
> > > Thanks again,
> > > Ian
> > >
> > > > Thank you,
> > > >
> > > > --
> > > > Masami Hiramatsu <mhiramat@kernel.org>
> >
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-28 19:59           ` Arnaldo Carvalho de Melo
@ 2022-01-30  8:04             ` Masami Hiramatsu
  2022-01-31 14:28               ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Masami Hiramatsu @ 2022-01-30  8:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ian Rogers, Masami Hiramatsu, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Fri, 28 Jan 2022 16:59:13 -0300
Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Em Fri, Jan 28, 2022 at 10:26:20AM -0800, Ian Rogers escreveu:
> > On Fri, Jan 28, 2022 at 7:35 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Thu, 27 Jan 2022 22:24:59 -0800
> > > Ian Rogers <irogers@google.com> wrote:
> > >
> > > > On Thu, Jan 27, 2022 at 9:24 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > >
> > > > > On Thu, 27 Jan 2022 13:33:23 -0800
> > > > > Ian Rogers <irogers@google.com> wrote:
> > > > >
> > > > > > On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> > > > > > >
> > > > > > > This v2 patch set has the main reference count patch for cpu map from
> > > > > > > the first set and then adds reference count checking to nsinfo. The
> > > > > > > reference count checking on nsinfo helped diagnose a data race bug
> > > > > > > which is fixed in the independent patches 2 and 3.
> > > > > > >
> > > > > > > The perf tool has a class of memory problems where reference counts
> > > > > > > are used incorrectly. Memory/address sanitizers and valgrind don't
> > > > > > > provide useful ways to debug these problems, you see a memory leak
> > > > > > > where the only pertinent information is the original allocation
> > > > > > > site. What would be more useful is knowing where a get fails to have a
> > > > > > > corresponding put, where there are double puts, etc.
> > > > > > >
> > > > > > > This work was motivated by the roll-back of:
> > > > > > > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > > > > > > where fixing a missed put resulted in a use-after-free in a different
> > > > > > > context. There was a sense in fixing the issue that a game of
> > > > > > > wac-a-mole had been embarked upon in adding missed gets and puts.
> > > > > > >
> > > > > > > The basic approach of the change is to add a level of indirection at
> > > > > > > the get and put calls. Get allocates a level of indirection that, if
> > > > > > > no corresponding put is called, becomes a memory leak (and associated
> > > > > > > stack trace) that leak sanitizer can report. Similarly if two puts are
> > > > > > > called for the same get, then a double free can be detected by address
> > > > > > > sanitizer. This can also detect the use after put, which should also
> > > > > > > yield a segv without a sanitizer.
> > > > > > >
> > > > > > > Adding reference count checking to cpu map was done as a proof of
> > > > > > > concept, it yielded little other than a location where the use of get
> > > > > > > could be cleaner by using its result. Reference count checking on
> > > > > > > nsinfo identified a double free of the indirection layer and the
> > > > > > > related threads, thereby identifying a data race as discussed here:
> > > > > > > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > > > > > > Accordingly the dso->lock was extended and use to cover the race.
> > > > > > >
> > > > > > > An alternative that was considered was ref_tracker:
> > > > > > >  https://lwn.net/Articles/877603/
> > > > > > > ref_tracker requires use of a reference counted struct to also use a
> > > > > > > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > > > > > > to spot double puts. When an object is finished with leaks can be
> > > > > > > detected, as with this approach when leak analysis happens. This
> > > > > > > approach was preferred as it doesn't introduce cookies, spots use
> > > > > > > after put and appears moderately more neutral to the API. Weaknesses
> > > > > > > of the implemented approcah are not being able to do adhoc leak
> > > > > > > detection and a preference for adding an accessor API to structs. I
> > > > > > > believe there are other issues and welcome suggestions.
> > > > > >
> > > > > > And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> > > > > > https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> > > > > > by Masami Hiramatsu. In this work he adds a leak sanitizer style
> > > > > > reference count checker that will describe locations of puts and gets
> > > > > > for diagnosis. Firstly that's an awesome achievement! This work is
> > > > > > different in that it isn't trying to invent a leak sanitizer, it is
> > > > > > just using the existing one. By adding a level of indirection this
> > > > > > work can catch use after put and pairs gets with puts to make lifetime
> > > > > > analysis more automatic. An advantage of Masami's work is that it
> > > > > > doesn't change data-structures and after the initial patch-set is
> > > > > > somewhat transparent. Overall I prefer the approach in these patches,
> > > > > > future patches can look to clean up the API as Masami has.
> > > > >
> > > > > Thanks for referring my series :-D The series aimed to solve the refcount
> > > > > usage issue in the perf which lead the object leaks. At that moment,
> > > > > I found that there were 2 patterns, refcount start from 0 and start from 1.
> > > > > That made me confused what I should do for using a object.
> > > > > But the perf uses linux/refcount.h now, I hope such issue has already gone.
> > > > > (but the object leakage seems not fixed fully yet, as you found.)
> > > > >
> > > > > BTW, I think the introducing UNWRAP_* macro may put a burden on future
> > > > > development. If it is inevitable, we should consider it as carefully as
> > > > > possible. Or, it may cause another issue (it is easily missed that the new
> > > > > patch does not use UNWRAP_* for object reference, because it is natual.)
> > > > >
> > > > > So I agree with you that you to clean up the API. :-)
> > > > > I think we can make yet another refcount.h for user space debugging and
> > > > > replace it with the linux/refcount.h.
> > > >
> > > > Thanks Masami,
> > > >
> > > > Agreed on the UNWRAP_ macros, hence wanting to hide them behind
> > > > accessors. Making accessors could be automated with macros, for
> > > > example, have a list of variables, have a macro declare the struct
> > > > using the list, another macro can use the list to declare accessors. I
> > > > didn't find adding the UNWRAP_ macros in this change particularly
> > > > burdensome as any use of the wrapping pointer as the original type
> > > > caused a compile time error telling you what and where to fix. The
> > > > macro is extra stuff in the way of using just the raw object, but
> > > > that's fairly typical in C++ with shared_ptr, scoped_lock, etc.
> > >
> > > Hi Ian,
> > >
> > > Hmm, but such a macro is not usual for C which perf is written in.
> > > If I understand correctly, you might want to use memory leak
> > > analyzer to detect refcount leak, and that analyzer will show
> > > what data structure is leaked.
> > 
> > Firstly, thanks for the conversation - this is really useful to
> > improve the code!
> > 
> > I think in an ideal world we'd somehow educate things like address
> > sanitizer of reference counted data structures and they would do a
> > better job of tracking gets and puts. The problem is pairing gets and
> > puts. In C++ you use RAII types so that the destructor ensures a put -
> > this can be complex when using data types like lists where you want to
> > move or swap things onto the list, to keep the single pointer
> > property. In the C code in Linux we use gotos, similarly to how defer
> > is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> > the get/put pairing problem by adding a cookie that is passed around.
> > The problem with that is that then the cookie becomes part of the API.
> > To avoid that the approach here is just to change the original data
> > type and add in a layer of indirection, that layer has become the
> > cookie. A benefit of this approach is that once the cookie/indirection
> > is freed, use of it becomes an obvious failure - we leverage address
> > sanitizer for use after free.
> 
> I went back to that discussion and saw this part where I brainstormed
> about doing all this in unmodified binaries:
> 
> https://lore.kernel.org/all/20151209134138.GB15864@kernel.org/
> 
> Even Alexei chimed in and replied to that thinking it was doable:
> 
> https://lore.kernel.org/all/20151210033139.GA10056@ast-mbp.thefacebook.com/#t
> 
> And nowadays we have much better BPF infrastructure, much faster probes,
> etc.

Yeah I think now we (will) have faster user-event[1] too. :)

[1] https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/T/#u

So instead of allocating an indirect object on get(), we also can define
an event and send it to the kernel, and run a BPF to analyze it.
Note that this will *NOT* be able to detect the "use-after-put" unless
we automatically trace the all object field access ;-)

Hm, apart from this topic, isn't it good to introduce user-space trace
event( macro)s in perf tools? :-)

Thank you,

> 
> But anyway, like at that opportunity, I thank you guys for working on
> such infrastructure, in 2015 several bugs were found and fixed with that
> refcount debbuger, as is now the case with Ian's attempt.
> 
> Thanks!
> 
> - Arnaldo
>  
> > > If so, maybe you can do the same thing by introducing a dummy
> > > list node for each data structure which you want to debug.
> > >
> > > struct perf_cpu_map__refdebug {
> > >         struct perf_cpu_map__refdebug *orig;
> > > };
> > >
> > > And expand refcount_t as.
> > >
> > > typedef struct refcount_struct {
> > >         atomic_t refs;
> > > #ifdef REFCNT_CHECKING
> > >         void *orig;
> > > #endif
> > > } refcount_t;
> > >
> > > And change the get/put as below
> > >
> > > struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
> > > {
> > >         if (map) {
> > > #ifdef REFCNT_CHECKING
> > >                 struct perf_cpu_map__refdebug *new_node;
> > > #endif
> > >                 refcount_inc(&map->refcnt);
> > > #ifdef REFCNT_CHECKING
> > >                 new_node = malloc(sizeof(*new_node));
> > >                 new_node->orig = map->refcnt->orig;
> > >                 map->refcnt->orig = new_node;
> > > #endif
> > >         }
> > >         return map;
> > > }
> > >
> > > void perf_cpu_map__put(struct perf_cpu_map *map)
> > > {
> > >         if (map) {
> > >                 if (refcount_dec_and_test(&map->refcnt))
> > >                         cpu_map__delete(map);
> > >                 else {
> > > #ifdef REFCNT_CHECKING
> > >                         struct perf_cpu_map__refdebug *node = map->refcnt->orig;
> > >
> > >                         map->refcnt->orig = node->orig;
> > >                         free(node);
> > > #endif
> > >                 }
> > >         }
> > > }
> > >
> > > This need a bit complex get/put, but no need to change other parts.
> > 
> > Adding a list like this gives an ability to say something like of the
> > current reference count of 3 what indirection objects exist. This
> > could be useful for diagnosis but you probably want to pair it with a
> > stack trace, and the approach here is punting that problem to the
> > address/leak sanitizer. I'm also concerned that there should be a lock
> > around the list. I think pursuing this ends up with something like
> > ref_tracker.
> > 
> > If we're using indirection, as in my proposal, then adding a common
> > indirection struct is problematic as anything declared to be a "struct
> > cpumap" now needs to be either the indirection or the original type -
> > hence using macros to hide that in the code. If we embed the
> > information into the refcount_t then we end up with something like
> > ref_tracker, API problems and losing use-after-put checking. Outside
> > of the macros, I think there is a simplicity to the approach I've put
> > forward.
> > 
> > > > The
> > > > question is, is it worth it to make sure use of the reference counted
> > > > object is correct and misuse is easier to diagnose?
> > >
> > > You mean the stackdump for every get/put as I did? That's a good
> > > question. Let's think what may happen.
> > >
> > > For example, if funcA() expects its caller funcB() will put the object
> > > but actually funcB() doesn't, or the funcC() which is the another
> > > caller of funcA()) doesn't expect the funcA() gets the object.
> > >
> > > funcA() {
> > >         get(obj);
> > >         return obj;
> > > }
> > >
> > > funcB() {
> > >         obj = funcA();
> > >         ...
> > >         // wrong! it should do put(obj);
> > > }
> > >
> > > funcC() {
> > >         obj = funcA();
> > >         get(obj);               // this is wrong get().
> > >         ...
> > >         put(obj);
> > > }
> > >
> > > If we just list the non-released object, both logs seems same because
> > > funcB()'s get/put pair will be skipped. If the analyzer shows the
> > > stacktrace when the object was got, maybe we can notice the difference
> > > of funcB() and funcC() path, but this is the simplest case. funcA()
> > > can be called from funcB/C via several different functions.
> > > But perhaps I'm too worried.
> > 
> > So in the logs we should see for funcB:
> > 
> > Memory leak of ... at:
> > malloc...
> > get...
> > funcA
> > funcB
> > ...
> > 
> > as the put on the indirection object was missed and this is now a leak
> > of the indirection object. For funcC we should see:
> > 
> > Memory leak of ... at:
> > malloc..
> > get..
> > funcA
> > funcC
> > 
> > So from the stack traces we can see that there is an unpaired get
> > happening in funcA called from either funcB and funcC, which means we
> > need to a put there. In the funcC case we can see the put was missed
> > from a call to funcA, rather than a get it made.
> > 
> > As the code in perf is complex, multi-threaded and sometimes
> > unintentionally racy a get may happen on 1 thread, the object is
> > placed in a global, the object is put by another thread and also
> > accessed by a 3rd thread. This is what was happening in the
> > dso->nsinfo case. The bug there is that there was a double put
> > happening by the third thread because of a race. Leak sanitizer treats
> > memory visible from a global as not a leak, this can mean to get the
> > most information on leaks in perf we need to aggressively
> > free/delete/deconstruct when terminating so that leaks become visible.
> > This feels to me like good hygiene, but it could also be argued to be
> > a tax.
> > 
> > Anyway, I think I'm still at the same point I was when I posted these
> > changes. That an indirection object is the simplest, smallest,
> > cleanest way to get the most information. I think making the rest of
> > the reference counted data structures have this feature would be
> > great, so I'd like to merge the 4 patches here and work to add more. I
> > think we can also build on that foundation for extra debug
> > information.
> > 
> > Thanks,
> > Ian
> > 
> > > Thank you,
> > >
> > > > I think it is near
> > > > as least offensive as possible while providing the best information -
> > > > hence being able to solve the dso->nsinfo put data race, that has been
> > > > a problem to solve up to this point. I'm open to better suggestions
> > > > though :-)
> > > >
> > > > Thanks again,
> > > > Ian
> > > >
> > > > > Thank you,
> > > > >
> > > > > --
> > > > > Masami Hiramatsu <mhiramat@kernel.org>
> > >
> > >
> > > --
> > > Masami Hiramatsu <mhiramat@kernel.org>
> 
> -- 
> 
> - Arnaldo


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-30  7:54           ` Masami Hiramatsu
@ 2022-01-30 17:40             ` Ian Rogers
  2022-02-04 14:57               ` Masami Hiramatsu
  0 siblings, 1 reply; 20+ messages in thread
From: Ian Rogers @ 2022-01-30 17:40 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Sat, Jan 29, 2022 at 11:55 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Fri, 28 Jan 2022 10:26:20 -0800
> Ian Rogers <irogers@google.com> wrote:
>
> > On Fri, Jan 28, 2022 at 7:35 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Thu, 27 Jan 2022 22:24:59 -0800
> > > Ian Rogers <irogers@google.com> wrote:
> > >
> > > > On Thu, Jan 27, 2022 at 9:24 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > >
> > > > > On Thu, 27 Jan 2022 13:33:23 -0800
> > > > > Ian Rogers <irogers@google.com> wrote:
> > > > >
> > > > > > On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> > > > > > >
> > > > > > > This v2 patch set has the main reference count patch for cpu map from
> > > > > > > the first set and then adds reference count checking to nsinfo. The
> > > > > > > reference count checking on nsinfo helped diagnose a data race bug
> > > > > > > which is fixed in the independent patches 2 and 3.
> > > > > > >
> > > > > > > The perf tool has a class of memory problems where reference counts
> > > > > > > are used incorrectly. Memory/address sanitizers and valgrind don't
> > > > > > > provide useful ways to debug these problems, you see a memory leak
> > > > > > > where the only pertinent information is the original allocation
> > > > > > > site. What would be more useful is knowing where a get fails to have a
> > > > > > > corresponding put, where there are double puts, etc.
> > > > > > >
> > > > > > > This work was motivated by the roll-back of:
> > > > > > > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > > > > > > where fixing a missed put resulted in a use-after-free in a different
> > > > > > > context. There was a sense in fixing the issue that a game of
> > > > > > > wac-a-mole had been embarked upon in adding missed gets and puts.
> > > > > > >
> > > > > > > The basic approach of the change is to add a level of indirection at
> > > > > > > the get and put calls. Get allocates a level of indirection that, if
> > > > > > > no corresponding put is called, becomes a memory leak (and associated
> > > > > > > stack trace) that leak sanitizer can report. Similarly if two puts are
> > > > > > > called for the same get, then a double free can be detected by address
> > > > > > > sanitizer. This can also detect the use after put, which should also
> > > > > > > yield a segv without a sanitizer.
> > > > > > >
> > > > > > > Adding reference count checking to cpu map was done as a proof of
> > > > > > > concept, it yielded little other than a location where the use of get
> > > > > > > could be cleaner by using its result. Reference count checking on
> > > > > > > nsinfo identified a double free of the indirection layer and the
> > > > > > > related threads, thereby identifying a data race as discussed here:
> > > > > > > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > > > > > > Accordingly the dso->lock was extended and use to cover the race.
> > > > > > >
> > > > > > > An alternative that was considered was ref_tracker:
> > > > > > >  https://lwn.net/Articles/877603/
> > > > > > > ref_tracker requires use of a reference counted struct to also use a
> > > > > > > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > > > > > > to spot double puts. When an object is finished with leaks can be
> > > > > > > detected, as with this approach when leak analysis happens. This
> > > > > > > approach was preferred as it doesn't introduce cookies, spots use
> > > > > > > after put and appears moderately more neutral to the API. Weaknesses
> > > > > > > of the implemented approcah are not being able to do adhoc leak
> > > > > > > detection and a preference for adding an accessor API to structs. I
> > > > > > > believe there are other issues and welcome suggestions.
> > > > > >
> > > > > > And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> > > > > > https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> > > > > > by Masami Hiramatsu. In this work he adds a leak sanitizer style
> > > > > > reference count checker that will describe locations of puts and gets
> > > > > > for diagnosis. Firstly that's an awesome achievement! This work is
> > > > > > different in that it isn't trying to invent a leak sanitizer, it is
> > > > > > just using the existing one. By adding a level of indirection this
> > > > > > work can catch use after put and pairs gets with puts to make lifetime
> > > > > > analysis more automatic. An advantage of Masami's work is that it
> > > > > > doesn't change data-structures and after the initial patch-set is
> > > > > > somewhat transparent. Overall I prefer the approach in these patches,
> > > > > > future patches can look to clean up the API as Masami has.
> > > > >
> > > > > Thanks for referring my series :-D The series aimed to solve the refcount
> > > > > usage issue in the perf which lead the object leaks. At that moment,
> > > > > I found that there were 2 patterns, refcount start from 0 and start from 1.
> > > > > That made me confused what I should do for using a object.
> > > > > But the perf uses linux/refcount.h now, I hope such issue has already gone.
> > > > > (but the object leakage seems not fixed fully yet, as you found.)
> > > > >
> > > > > BTW, I think the introducing UNWRAP_* macro may put a burden on future
> > > > > development. If it is inevitable, we should consider it as carefully as
> > > > > possible. Or, it may cause another issue (it is easily missed that the new
> > > > > patch does not use UNWRAP_* for object reference, because it is natual.)
> > > > >
> > > > > So I agree with you that you to clean up the API. :-)
> > > > > I think we can make yet another refcount.h for user space debugging and
> > > > > replace it with the linux/refcount.h.
> > > >
> > > > Thanks Masami,
> > > >
> > > > Agreed on the UNWRAP_ macros, hence wanting to hide them behind
> > > > accessors. Making accessors could be automated with macros, for
> > > > example, have a list of variables, have a macro declare the struct
> > > > using the list, another macro can use the list to declare accessors. I
> > > > didn't find adding the UNWRAP_ macros in this change particularly
> > > > burdensome as any use of the wrapping pointer as the original type
> > > > caused a compile time error telling you what and where to fix. The
> > > > macro is extra stuff in the way of using just the raw object, but
> > > > that's fairly typical in C++ with shared_ptr, scoped_lock, etc.
> > >
> > > Hi Ian,
> > >
> > > Hmm, but such a macro is not usual for C which perf is written in.
> > > If I understand correctly, you might want to use memory leak
> > > analyzer to detect refcount leak, and that analyzer will show
> > > what data structure is leaked.
> >
> > Firstly, thanks for the conversation - this is really useful to
> > improve the code!
>
> Hi Ian,
>
> You're welcome! This conversation also useful to me to understand
> the issue deeper :-)
>
> > I think in an ideal world we'd somehow educate things like address
> > sanitizer of reference counted data structures and they would do a
> > better job of tracking gets and puts. The problem is pairing gets and
> > puts.
>
> Agreed. From my experience, pairing gets and puts are hard without
> reviewing the source code, since the refcounter is not only used in a
> single function, but it is for keeping the object not released until
> some long process is finished.
>
> For example, if the correct pair is like below, funcA-funcD, funcB-funcC,
> funcA (get)
> funcB (get)
> funcC (put)
> funcD (put)
>
> But depending on the execution timing, funcC/funcD can be swapped.
> funcA (get)
> funcB (get)
> funcD (put)
> funcC (put)
>
> And if there is a bug, funcX may get the object by mistake.
> funcA (get)
> funcX (get)
> funcB (get)
> funcD (put)
> funcC (put)
>
> Or, funcC() might miss to put.
> funcA (get)
> funcB (get)
> funcD (put)
>
> In any case, just tracking the get/put, it is hard to know which pair
> is not right. I saw these patterns when I debugged it. :(

Yep, I've found this issue too :-) The get is being used for the
side-effect of incrementing a reference count rather than for
returning the value. This happened in cpumap merge and was easy to fix
there.

This problem is possible in general, but I think if it were common we
are doomed. I don't think this pattern is common though. In general a
reference count is owned by something, the scope of a function or the
lifetime of a list. If puts were adhoc then it would mean that one
occurring in a function could be doing it for the side effect of
freeing on a list. I don't think the code aims to do that. Making the
code clean with pairing gets and puts is an issue, like with the
cpumap merge change.

> > In C++ you use RAII types so that the destructor ensures a put -
> > this can be complex when using data types like lists where you want to
> > move or swap things onto the list, to keep the single pointer
> > property. In the C code in Linux we use gotos, similarly to how defer
> > is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> > the get/put pairing problem by adding a cookie that is passed around.
>
> Cool! I will check the ref_tracker :)
>
> > The problem with that is that then the cookie becomes part of the API.
>
> What the cookie is? some pairing function address??

As I understand it, a token to identify a get.

> > To avoid that the approach here is just to change the original data
> > type and add in a layer of indirection, that layer has become the
> > cookie. A benefit of this approach is that once the cookie/indirection
> > is freed, use of it becomes an obvious failure - we leverage address
> > sanitizer for use after free.
>
> Ah, got it. I thought that was anyway caught by address sanitizer by default, wasn't?

Out of the box, address sanitizer will tell you where the reference
counted thing was created and about double frees. Leak sanitizer, that
runs with address sanitizer, will tell about leaks.  Our problem is
knowing we have leaks but not having enough information to debug the
unpaired gets and puts.

> > > If so, maybe you can do the same thing by introducing a dummy
> > > list node for each data structure which you want to debug.
> > >
> > > struct perf_cpu_map__refdebug {
> > >         struct perf_cpu_map__refdebug *orig;
> > > };
> > >
> > > And expand refcount_t as.
> > >
> > > typedef struct refcount_struct {
> > >         atomic_t refs;
> > > #ifdef REFCNT_CHECKING
> > >         void *orig;
> > > #endif
> > > } refcount_t;
> > >
> > > And change the get/put as below
> > >
> > > struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
> > > {
> > >         if (map) {
> > > #ifdef REFCNT_CHECKING
> > >                 struct perf_cpu_map__refdebug *new_node;
> > > #endif
> > >                 refcount_inc(&map->refcnt);
> > > #ifdef REFCNT_CHECKING
> > >                 new_node = malloc(sizeof(*new_node));
> > >                 new_node->orig = map->refcnt->orig;
> > >                 map->refcnt->orig = new_node;
> > > #endif
> > >         }
> > >         return map;
> > > }
> > >
> > > void perf_cpu_map__put(struct perf_cpu_map *map)
> > > {
> > >         if (map) {
> > >                 if (refcount_dec_and_test(&map->refcnt))
> > >                         cpu_map__delete(map);
> > >                 else {
> > > #ifdef REFCNT_CHECKING
> > >                         struct perf_cpu_map__refdebug *node = map->refcnt->orig;
> > >
> > >                         map->refcnt->orig = node->orig;
> > >                         free(node);
> > > #endif
> > >                 }
> > >         }
> > > }
> > >
> > > This need a bit complex get/put, but no need to change other parts.
> >
> > Adding a list like this gives an ability to say something like of the
> > current reference count of 3 what indirection objects exist. This
> > could be useful for diagnosis but you probably want to pair it with a
> > stack trace, and the approach here is punting that problem to the
> > address/leak sanitizer.
>
> I think the pairing get/put should be done by manual. Without
> reviewing the code, it is hard to know what was expected by the
> original developer. :(
>
> > I'm also concerned that there should be a lock
> > around the list. I think pursuing this ends up with something like
> > ref_tracker.
>
> Good catch! Indeed, a lock is needed.
>
> > If we're using indirection, as in my proposal, then adding a common
> > indirection struct is problematic as anything declared to be a "struct
> > cpumap" now needs to be either the indirection or the original type -
> > hence using macros to hide that in the code. If we embed the
> > information into the refcount_t then we end up with something like
> > ref_tracker, API problems and losing use-after-put checking. Outside
> > of the macros, I think there is a simplicity to the approach I've put
> > forward.
>
> OK, so the "use-after-put" is what you solve. That is indeed hard to
> check without introducing indirect pointer, because "use-after-put"
> is different from "use-after-free", it will use the same "object"
> but that has just a decremented refcounter.
>
> funcA(obj) {
>   get(obj);
>   ...
>   put(obj);
> }
>
> funcA2(obj) {
>   ...
>   obj_access(obj); <- //use-after-put? (or use-no-get?)
> }
>
> funcB() {
>   obj = new_obj()
>   get(obj);
>   funcA(obj);
>   funcA2(obj);
>   put(obj);
> }
>
> So my question is that we need to pay the cost to use UNWRAP_ macro
> on all those object just for finding the "use-after-put" case.
> Usually the case that "use-after-put" causes actual problem is very
> limited, or it can be "use-after-free".

So the dso->nsinfo case was a use after put once we added in the
missing put - it could also be thought of as a double put/free. In
general use-after-put is going to show where a strict get-then-put
isn't being followed, if we make sure of that property then the
reference counting will be accurate.

A case that came up previously was maps__find:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.c#n974
this code retuns a map but without doing a get on it, even though a
map is reference counted:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.h#n46
If we change that code to return a get of the map then we add overhead
for simple cases of checking a map is present - you can infer you have
a reference count on the map if you have it on the maps. The
indirection allows the code to remain as-is, while being able to catch
misuse.

> funcA(obj) {
>   get(obj);
>   ...
>   put(obj);     // without funcB()'s get, obj will be freed here.
> }
>
> funcB() {
>   obj = new_obj()
>   // get(obj) <-- forgot to get the refcounter
>   funcA(obj)
>   obj_access(obj) <- //use after put!
> }
>
> In above case, use-after-put will be use-after-free, because obj is
> freed.
>
> If we can re-define the '->' operator, this is completely hidden
> from programmers, but C has no such feature. Maybe we can use a
> hidden data structure and accessors to prohibit user using '->'
> on such object (e.g. elfutils) but it may increase runtime
> overhead (LTO might be a remedy). And anyway inside the object
> definition file, it need to use UNWRAP_ macro (also, it must use
> object pointer returned from get() method.)

Yeah, I could imagine adding attributes to the struct that address
sanitizer in the compiler could use to inject things.

> >
> > > > The
> > > > question is, is it worth it to make sure use of the reference counted
> > > > object is correct and misuse is easier to diagnose?
> > >
> > > You mean the stackdump for every get/put as I did? That's a good
> > > question. Let's think what may happen.
> > >
> > > For example, if funcA() expects its caller funcB() will put the object
> > > but actually funcB() doesn't, or the funcC() which is the another
> > > caller of funcA()) doesn't expect the funcA() gets the object.
> > >
> > > funcA() {
> > >         get(obj);
> > >         return obj;
> > > }
> > >
> > > funcB() {
> > >         obj = funcA();
> > >         ...
> > >         // wrong! it should do put(obj);
> > > }
> > >
> > > funcC() {
> > >         obj = funcA();
> > >         get(obj);               // this is wrong get().
> > >         ...
> > >         put(obj);
> > > }
> > >
> > > If we just list the non-released object, both logs seems same because
> > > funcB()'s get/put pair will be skipped. If the analyzer shows the
> > > stacktrace when the object was got, maybe we can notice the difference
> > > of funcB() and funcC() path, but this is the simplest case. funcA()
> > > can be called from funcB/C via several different functions.
> > > But perhaps I'm too worried.
> >
> > So in the logs we should see for funcB:
> >
> > Memory leak of ... at:
> > malloc...
> > get...
> > funcA
> > funcB
> > ...
>
> Yeah, OK. and I think this can be done without anything because "obj"
> is not released anyway. Am I correct?

It needs the indirection otherwise you don't know which get is the leak.

> >
> > as the put on the indirection object was missed and this is now a leak
> > of the indirection object. For funcC we should see:
> >
> > Memory leak of ... at:
> > malloc..
> > get..
> > funcA
> > funcC
>
> And this is also be done with my proposal too. (of course the leaked
> object type name will be perf_cpu_map__refdebug)
>
> >
> > So from the stack traces we can see that there is an unpaired get
> > happening in funcA called from either funcB and funcC, which means we
> > need to a put there. In the funcC case we can see the put was missed
> > from a call to funcA, rather than a get it made.
>
> So you meant we only need a log of the malloced (== get) place with
> backtrace, right? Yeah, I think that may be enoguh. When I wrote my
> previous work (2015), I tried to expose information as much as possible.
>
> >
> > As the code in perf is complex, multi-threaded and sometimes
> > unintentionally racy a get may happen on 1 thread, the object is
> > placed in a global, the object is put by another thread and also
> > accessed by a 3rd thread. This is what was happening in the
> > dso->nsinfo case. The bug there is that there was a double put
> > happening by the third thread because of a race. Leak sanitizer treats
> > memory visible from a global as not a leak, this can mean to get the
> > most information on leaks in perf we need to aggressively
> > free/delete/deconstruct when terminating so that leaks become visible.
> > This feels to me like good hygiene, but it could also be argued to be
> > a tax.
>
> Got it. So we need to make the global object visible to leak sanitizer.
> For this reason you made get() to malloc() (get will allocate a
> local object.)
>
> >
> > Anyway, I think I'm still at the same point I was when I posted these
> > changes. That an indirection object is the simplest, smallest,
> > cleanest way to get the most information. I think making the rest of
> > the reference counted data structures have this feature would be
> > great, so I'd like to merge the 4 patches here and work to add more. I
> > think we can also build on that foundation for extra debug
> > information.
>
> For me, your change aims to 2 goals, (1) catch the global object leak,
> (2) catch the "use-after-put". And I think (1) is possible to be solved
> without the indirection object because the required technic is changing
> get() into malloc() some data. On the other hand, the (2) is solved only
> by the indirection object. And I think the case that (2) becomes real
> problem is when it turns into "use-after-free". So I think it may be
> possible to be caught by the address sanitizer.

I think use-after-free can be a good signal, but for trying to get the
hygiene right use-after-put is what we need. I'm basically using the
leak sanitizer and the indirection as a scoped thing, using a scoped
thing outside of its scope is only caught by use-after-put.

Thanks,
Ian

> Thank you,
>
> >
> > Thanks,
> > Ian
> >
> > > Thank you,
> > >
> > > > I think it is near
> > > > as least offensive as possible while providing the best information -
> > > > hence being able to solve the dso->nsinfo put data race, that has been
> > > > a problem to solve up to this point. I'm open to better suggestions
> > > > though :-)
> > > >
> > > > Thanks again,
> > > > Ian
> > > >
> > > > > Thank you,
> > > > >
> > > > > --
> > > > > Masami Hiramatsu <mhiramat@kernel.org>
> > >
> > >
> > > --
> > > Masami Hiramatsu <mhiramat@kernel.org>
>
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-28 18:26         ` Ian Rogers
  2022-01-28 19:59           ` Arnaldo Carvalho de Melo
  2022-01-30  7:54           ` Masami Hiramatsu
@ 2022-01-31 13:56           ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-31 13:56 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Masami Hiramatsu, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Darren Hart, Davidlohr Bueso, André Almeida, James Clark,
	John Garry, Riccardo Mancini, Yury Norov, Andy Shevchenko,
	Andrew Morton, Jin Yao, Adrian Hunter, Leo Yan, Andi Kleen,
	Thomas Richter, Kan Liang, Madhavan Srinivasan,
	Shunsuke Nakamura, Song Liu, Steven Rostedt, Miaoqian Lin,
	Stephen Brennan, Kajol Jain, Alexey Bayduraev, German Gomez,
	linux-perf-users, linux-kernel, Eric Dumazet, Dmitry Vyukov,
	masami.hiramatsu.pt, eranian

Em Fri, Jan 28, 2022 at 10:26:20AM -0800, Ian Rogers escreveu:
> On Fri, Jan 28, 2022 at 7:35 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 27 Jan 2022 22:24:59 -0800
> > Ian Rogers <irogers@google.com> wrote:
> >
> > > On Thu, Jan 27, 2022 at 9:24 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > On Thu, 27 Jan 2022 13:33:23 -0800
> > > > Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > > On Tue, Jan 25, 2022 at 12:46 PM Ian Rogers <irogers@google.com> wrote:
> > > > > >
> > > > > > This v2 patch set has the main reference count patch for cpu map from
> > > > > > the first set and then adds reference count checking to nsinfo. The
> > > > > > reference count checking on nsinfo helped diagnose a data race bug
> > > > > > which is fixed in the independent patches 2 and 3.
> > > > > >
> > > > > > The perf tool has a class of memory problems where reference counts
> > > > > > are used incorrectly. Memory/address sanitizers and valgrind don't
> > > > > > provide useful ways to debug these problems, you see a memory leak
> > > > > > where the only pertinent information is the original allocation
> > > > > > site. What would be more useful is knowing where a get fails to have a
> > > > > > corresponding put, where there are double puts, etc.
> > > > > >
> > > > > > This work was motivated by the roll-back of:
> > > > > > https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@google.com/
> > > > > > where fixing a missed put resulted in a use-after-free in a different
> > > > > > context. There was a sense in fixing the issue that a game of
> > > > > > wac-a-mole had been embarked upon in adding missed gets and puts.
> > > > > >
> > > > > > The basic approach of the change is to add a level of indirection at
> > > > > > the get and put calls. Get allocates a level of indirection that, if
> > > > > > no corresponding put is called, becomes a memory leak (and associated
> > > > > > stack trace) that leak sanitizer can report. Similarly if two puts are
> > > > > > called for the same get, then a double free can be detected by address
> > > > > > sanitizer. This can also detect the use after put, which should also
> > > > > > yield a segv without a sanitizer.
> > > > > >
> > > > > > Adding reference count checking to cpu map was done as a proof of
> > > > > > concept, it yielded little other than a location where the use of get
> > > > > > could be cleaner by using its result. Reference count checking on
> > > > > > nsinfo identified a double free of the indirection layer and the
> > > > > > related threads, thereby identifying a data race as discussed here:
> > > > > > https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/
> > > > > > Accordingly the dso->lock was extended and use to cover the race.
> > > > > >
> > > > > > An alternative that was considered was ref_tracker:
> > > > > >  https://lwn.net/Articles/877603/
> > > > > > ref_tracker requires use of a reference counted struct to also use a
> > > > > > cookie/tracker. The cookie is combined with data in a ref_tracker_dir
> > > > > > to spot double puts. When an object is finished with leaks can be
> > > > > > detected, as with this approach when leak analysis happens. This
> > > > > > approach was preferred as it doesn't introduce cookies, spots use
> > > > > > after put and appears moderately more neutral to the API. Weaknesses
> > > > > > of the implemented approcah are not being able to do adhoc leak
> > > > > > detection and a preference for adding an accessor API to structs. I
> > > > > > believe there are other issues and welcome suggestions.
> > > > >
> > > > > And so we've been here before (Dec 2015 to be exact). Namhyung pointed me to:
> > > > > https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
> > > > > by Masami Hiramatsu. In this work he adds a leak sanitizer style
> > > > > reference count checker that will describe locations of puts and gets
> > > > > for diagnosis. Firstly that's an awesome achievement! This work is
> > > > > different in that it isn't trying to invent a leak sanitizer, it is
> > > > > just using the existing one. By adding a level of indirection this
> > > > > work can catch use after put and pairs gets with puts to make lifetime
> > > > > analysis more automatic. An advantage of Masami's work is that it
> > > > > doesn't change data-structures and after the initial patch-set is
> > > > > somewhat transparent. Overall I prefer the approach in these patches,
> > > > > future patches can look to clean up the API as Masami has.
> > > >
> > > > Thanks for referring my series :-D The series aimed to solve the refcount
> > > > usage issue in the perf which lead the object leaks. At that moment,
> > > > I found that there were 2 patterns, refcount start from 0 and start from 1.
> > > > That made me confused what I should do for using a object.
> > > > But the perf uses linux/refcount.h now, I hope such issue has already gone.
> > > > (but the object leakage seems not fixed fully yet, as you found.)
> > > >
> > > > BTW, I think the introducing UNWRAP_* macro may put a burden on future
> > > > development. If it is inevitable, we should consider it as carefully as
> > > > possible. Or, it may cause another issue (it is easily missed that the new
> > > > patch does not use UNWRAP_* for object reference, because it is natual.)
> > > >
> > > > So I agree with you that you to clean up the API. :-)
> > > > I think we can make yet another refcount.h for user space debugging and
> > > > replace it with the linux/refcount.h.
> > >
> > > Thanks Masami,
> > >
> > > Agreed on the UNWRAP_ macros, hence wanting to hide them behind
> > > accessors. Making accessors could be automated with macros, for
> > > example, have a list of variables, have a macro declare the struct
> > > using the list, another macro can use the list to declare accessors. I
> > > didn't find adding the UNWRAP_ macros in this change particularly
> > > burdensome as any use of the wrapping pointer as the original type
> > > caused a compile time error telling you what and where to fix. The
> > > macro is extra stuff in the way of using just the raw object, but
> > > that's fairly typical in C++ with shared_ptr, scoped_lock, etc.
> >
> > Hi Ian,
> >
> > Hmm, but such a macro is not usual for C which perf is written in.
> > If I understand correctly, you might want to use memory leak
> > analyzer to detect refcount leak, and that analyzer will show
> > what data structure is leaked.
> 
> Firstly, thanks for the conversation - this is really useful to
> improve the code!
> 
> I think in an ideal world we'd somehow educate things like address
> sanitizer of reference counted data structures and they would do a
> better job of tracking gets and puts. The problem is pairing gets and
> puts. In C++ you use RAII types so that the destructor ensures a put -
> this can be complex when using data types like lists where you want to
> move or swap things onto the list, to keep the single pointer
> property. In the C code in Linux we use gotos, similarly to how defer
> is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> the get/put pairing problem by adding a cookie that is passed around.
> The problem with that is that then the cookie becomes part of the API.
> To avoid that the approach here is just to change the original data
> type and add in a layer of indirection, that layer has become the
> cookie. A benefit of this approach is that once the cookie/indirection
> is freed, use of it becomes an obvious failure - we leverage address
> sanitizer for use after free.
> 
> > If so, maybe you can do the same thing by introducing a dummy
> > list node for each data structure which you want to debug.
> >
> > struct perf_cpu_map__refdebug {
> >         struct perf_cpu_map__refdebug *orig;
> > };
> >
> > And expand refcount_t as.
> >
> > typedef struct refcount_struct {
> >         atomic_t refs;
> > #ifdef REFCNT_CHECKING
> >         void *orig;
> > #endif
> > } refcount_t;
> >
> > And change the get/put as below
> >
> > struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
> > {
> >         if (map) {
> > #ifdef REFCNT_CHECKING
> >                 struct perf_cpu_map__refdebug *new_node;
> > #endif
> >                 refcount_inc(&map->refcnt);
> > #ifdef REFCNT_CHECKING
> >                 new_node = malloc(sizeof(*new_node));
> >                 new_node->orig = map->refcnt->orig;
> >                 map->refcnt->orig = new_node;
> > #endif
> >         }
> >         return map;
> > }
> >
> > void perf_cpu_map__put(struct perf_cpu_map *map)
> > {
> >         if (map) {
> >                 if (refcount_dec_and_test(&map->refcnt))
> >                         cpu_map__delete(map);
> >                 else {
> > #ifdef REFCNT_CHECKING
> >                         struct perf_cpu_map__refdebug *node = map->refcnt->orig;
> >
> >                         map->refcnt->orig = node->orig;
> >                         free(node);
> > #endif
> >                 }
> >         }
> > }
> >
> > This need a bit complex get/put, but no need to change other parts.
> 
> Adding a list like this gives an ability to say something like of the
> current reference count of 3 what indirection objects exist. This
> could be useful for diagnosis but you probably want to pair it with a
> stack trace, and the approach here is punting that problem to the
> address/leak sanitizer. I'm also concerned that there should be a lock
> around the list. I think pursuing this ends up with something like
> ref_tracker.
> 
> If we're using indirection, as in my proposal, then adding a common
> indirection struct is problematic as anything declared to be a "struct
> cpumap" now needs to be either the indirection or the original type -
> hence using macros to hide that in the code. If we embed the
> information into the refcount_t then we end up with something like
> ref_tracker, API problems and losing use-after-put checking. Outside
> of the macros, I think there is a simplicity to the approach I've put
> forward.
> 
> > > The
> > > question is, is it worth it to make sure use of the reference counted
> > > object is correct and misuse is easier to diagnose?
> >
> > You mean the stackdump for every get/put as I did? That's a good
> > question. Let's think what may happen.
> >
> > For example, if funcA() expects its caller funcB() will put the object
> > but actually funcB() doesn't, or the funcC() which is the another
> > caller of funcA()) doesn't expect the funcA() gets the object.
> >
> > funcA() {
> >         get(obj);
> >         return obj;
> > }
> >
> > funcB() {
> >         obj = funcA();
> >         ...
> >         // wrong! it should do put(obj);
> > }
> >
> > funcC() {
> >         obj = funcA();
> >         get(obj);               // this is wrong get().
> >         ...
> >         put(obj);
> > }
> >
> > If we just list the non-released object, both logs seems same because
> > funcB()'s get/put pair will be skipped. If the analyzer shows the
> > stacktrace when the object was got, maybe we can notice the difference
> > of funcB() and funcC() path, but this is the simplest case. funcA()
> > can be called from funcB/C via several different functions.
> > But perhaps I'm too worried.
> 
> So in the logs we should see for funcB:
> 
> Memory leak of ... at:
> malloc...
> get...
> funcA
> funcB
> ...
> 
> as the put on the indirection object was missed and this is now a leak
> of the indirection object. For funcC we should see:
> 
> Memory leak of ... at:
> malloc..
> get..
> funcA
> funcC
> 
> So from the stack traces we can see that there is an unpaired get
> happening in funcA called from either funcB and funcC, which means we
> need to a put there. In the funcC case we can see the put was missed
> from a call to funcA, rather than a get it made.
> 
> As the code in perf is complex, multi-threaded and sometimes
> unintentionally racy a get may happen on 1 thread, the object is
> placed in a global, the object is put by another thread and also
> accessed by a 3rd thread. This is what was happening in the
> dso->nsinfo case. The bug there is that there was a double put
> happening by the third thread because of a race. Leak sanitizer treats
> memory visible from a global as not a leak, this can mean to get the
> most information on leaks in perf we need to aggressively
> free/delete/deconstruct when terminating so that leaks become visible.
> This feels to me like good hygiene, but it could also be argued to be
> a tax.

We can have it perfect, i.e. freeing everything but then leaving that
disabled in !DEBUG builds so that the exit time is kept fast.
 
> Anyway, I think I'm still at the same point I was when I posted these
> changes. That an indirection object is the simplest, smallest,
> cleanest way to get the most information. I think making the rest of
> the reference counted data structures have this feature would be
> great, so I'd like to merge the 4 patches here and work to add more. I
> think we can also build on that foundation for extra debug
> information.
> 
> Thanks,
> Ian
> 
> > Thank you,
> >
> > > I think it is near
> > > as least offensive as possible while providing the best information -
> > > hence being able to solve the dso->nsinfo put data race, that has been
> > > a problem to solve up to this point. I'm open to better suggestions
> > > though :-)
> > >
> > > Thanks again,
> > > Ian
> > >
> > > > Thank you,
> > > >
> > > > --
> > > > Masami Hiramatsu <mhiramat@kernel.org>
> >
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-30  8:04             ` Masami Hiramatsu
@ 2022-01-31 14:28               ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-31 14:28 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ian Rogers, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Thomas Gleixner,
	Darren Hart, Davidlohr Bueso, André Almeida, James Clark,
	John Garry, Riccardo Mancini, Yury Norov, Andy Shevchenko,
	Andrew Morton, Jin Yao, Adrian Hunter, Leo Yan, Andi Kleen,
	Thomas Richter, Kan Liang, Madhavan Srinivasan,
	Shunsuke Nakamura, Song Liu, Steven Rostedt, Miaoqian Lin,
	Stephen Brennan, Kajol Jain, Alexey Bayduraev, German Gomez,
	linux-perf-users, linux-kernel, Eric Dumazet, Dmitry Vyukov,
	masami.hiramatsu.pt, eranian

Em Sun, Jan 30, 2022 at 05:04:18PM +0900, Masami Hiramatsu escreveu:
> On Fri, 28 Jan 2022 16:59:13 -0300 Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > I went back to that discussion and saw this part where I brainstormed
> > about doing all this in unmodified binaries:

> > https://lore.kernel.org/all/20151209134138.GB15864@kernel.org/

> > Even Alexei chimed in and replied to that thinking it was doable:

> > https://lore.kernel.org/all/20151210033139.GA10056@ast-mbp.thefacebook.com/#t

> > And nowadays we have much better BPF infrastructure, much faster probes,
> > etc.

> Yeah I think now we (will) have faster user-event[1] too. :)

> [1] https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/T/#u

> So instead of allocating an indirect object on get(), we also can define
> an event and send it to the kernel, and run a BPF to analyze it.
> Note that this will *NOT* be able to detect the "use-after-put" unless
> we automatically trace the all object field access ;-)

Humm, reading https://blog.janestreet.com/magic-trace/:

"I spent a bunch of time looking for a better solution and eventually I
found a really satisfying one in the perf_event_open docs. It turns out
that perf_event_open can use hardware breakpoints and notify you when a
memory address is executed or accessed."

I.e. after the last put we could automagically add a:

  mem:<addr>[/len][:access]                          [Hardware breakpoint]

But there are only HBP_NUM hardware breakpoints (4 on x86)... So some
sort of scheduling would be needed, or after last put add it then, leave
it there for some time, then stop tracking it, reusing it for some other
object, etc. We would be able to catch some of the problems sometimes.

For things that do use-after-free straight away we would get some of
these and fix it, making this tunable (the time to track a object after
it is last put) should be possible.

> Hm, apart from this topic, isn't it good to introduce user-space trace
> event( macro)s in perf tools? :-)

Yeah, this seems to be an interesting case for that.

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/4] perf cpumap: Add reference count checking
  2022-01-25 20:45 ` [PATCH v2 1/4] perf cpumap: Add reference count checking Ian Rogers
@ 2022-01-31 14:44   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-01-31 14:44 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Darren Hart,
	Davidlohr Bueso, André Almeida, James Clark, John Garry,
	Riccardo Mancini, Yury Norov, Andy Shevchenko, Andrew Morton,
	Jin Yao, Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter,
	Kan Liang, Madhavan Srinivasan, Shunsuke Nakamura, Song Liu,
	Masami Hiramatsu, Steven Rostedt, Miaoqian Lin, Stephen Brennan,
	Kajol Jain, Alexey Bayduraev, German Gomez, linux-perf-users,
	linux-kernel, Eric Dumazet, Dmitry Vyukov, eranian

Em Tue, Jan 25, 2022 at 12:45:59PM -0800, Ian Rogers escreveu:
> Enabled when REFCNT_CHECKING is defined. The change adds a memory
> allocated pointer that is interposed between the reference counted
> cpu map at a get and freed by a put. The pointer replaces the original
> perf_cpu_map struct, so use of the perf_cpu_map via APIs remains
> unchanged. Any use of the cpu map without the API requires two versions,
> typically handled via an UNWRAP macro.
> 
> This change is intended to catch:
>  - use after put: using a cpumap after you have put it will cause a
>    segv.
>  - unbalanced puts: two puts for a get will result in a double free
>    that can be captured and reported by tools like address sanitizer,
>    including with the associated stack traces of allocation and frees.
>  - missing puts: if a put is missing then the get turns into a memory
>    leak that can be reported by leak sanitizer, including the stack
>    trace at the point the get occurs.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/lib/perf/cpumap.c                  | 120 ++++++++++++++++-------
>  tools/lib/perf/include/internal/cpumap.h |  14 ++-
>  tools/perf/tests/cpumap.c                |  20 ++--
>  tools/perf/util/cpumap.c                 |  42 ++++----
>  tools/perf/util/pmu.c                    |  24 +++--
>  5 files changed, 146 insertions(+), 74 deletions(-)
> 
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index ee66760f1e63..d401d133f84b 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -10,10 +10,16 @@
>  #include <ctype.h>
>  #include <limits.h>
>  
> -static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
> +#ifndef REFCNT_CHECKING
> +#define UNWRAP_MAP(x) x
> +#else
> +#define UNWRAP_MAP(x) x->orig
> +#endif
> +
> +#ifndef REFCNT_CHECKING
> +struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
>  {
>  	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
> -
>  	if (cpus != NULL) {
>  		cpus->nr = nr_cpus;
>  		refcount_set(&cpus->refcnt, 1);
> @@ -21,13 +27,31 @@ static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
>  	}
>  	return cpus;
>  }
> +#else
> +struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
> +{
> +	struct perf_cpu_map *wrapped_cpus = NULL;
> +	struct original_perf_cpu_map *cpus =
> +		malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
> +	if (cpus != NULL) {
> +		cpus->nr = nr_cpus;
> +		refcount_set(&cpus->refcnt, 1);
> +		wrapped_cpus = malloc(sizeof(*wrapped_cpus));
> +		if (wrapped_cpus != NULL)
> +			wrapped_cpus->orig = cpus;
> +		else
> +			free(cpus);
> +	}
> +	return wrapped_cpus;
> +}
> +#endif
>  
>  struct perf_cpu_map *perf_cpu_map__dummy_new(void)
>  {
>  	struct perf_cpu_map *cpus = perf_cpu_map__alloc(1);
>  
>  	if (cpus)
> -		cpus->map[0].cpu = -1;
> +		UNWRAP_MAP(cpus)->map[0].cpu = -1;
>  
>  	return cpus;
>  }
> @@ -35,23 +59,45 @@ struct perf_cpu_map *perf_cpu_map__dummy_new(void)
>  static void cpu_map__delete(struct perf_cpu_map *map)
>  {
>  	if (map) {
> -		WARN_ONCE(refcount_read(&map->refcnt) != 0,
> +		WARN_ONCE(refcount_read(&UNWRAP_MAP(map)->refcnt) != 0,
>  			  "cpu_map refcnt unbalanced\n");
> +#ifdef REFCNT_CHECKINGyy
> +		free(map->orig);
> +		map->orig = NULL;
> +#endif

Could this be:

		refcntchk__delete(map);

And then:

#ifdef REFCNT_CHECKING
#define refcntchk__delete(object) zfree(&object->orig)
#else
#define refcntchk__delete(object)
#endif

So that we introduce less clutter?

Ditto for other places, where we would just call things like:

#ifdef REFCNT_CHECKING
	refcntchk__new(object) (object->orig = malloc(sizeof(*object))) == NULL)
#else
#define refcntchk__new(object)
#endif

	and then use:

		if (refcntchk__new(cpu_map))
			bail out


With this we would also be marking the places where the refcntchk needs
per-object space/do checks, etc that we would then turn into SDT events
to have all the checks made by some general purpose tool, eBPF based so
that we don't keep tons of info for post processing?

>  		free(map);
>  	}
>  }
>  
>  struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
>  {
> -	if (map)
> -		refcount_inc(&map->refcnt);
> +	if (map) {
> +#ifdef REFCNT_CHECKING
> +		struct perf_cpu_map *new_wrapper;
> +#endif
> +		refcount_inc(&UNWRAP_MAP(map)->refcnt);
> +#ifdef REFCNT_CHECKING
> +		new_wrapper = malloc(sizeof(*new_wrapper));
> +		new_wrapper->orig = map->orig;
> +		map = new_wrapper;
> +#endif
> +	}
>  	return map;
>  }
>  
>  void perf_cpu_map__put(struct perf_cpu_map *map)
>  {
> -	if (map && refcount_dec_and_test(&map->refcnt))
> -		cpu_map__delete(map);
> +	if (map) {
> +		if (refcount_dec_and_test(&UNWRAP_MAP(map)->refcnt))
> +			cpu_map__delete(map);
> +		else {
> +#ifdef REFCNT_CHECKING
> +			/* Free the wrapper object but the reference counted object remains. */
> +			map->orig = NULL;
> +			free(map);
> +#endif
> +		}
> +	}
>  }
>  
>  static struct perf_cpu_map *cpu_map__default_new(void)
> @@ -68,7 +114,7 @@ static struct perf_cpu_map *cpu_map__default_new(void)
>  		int i;
>  
>  		for (i = 0; i < nr_cpus; ++i)
> -			cpus->map[i].cpu = i;
> +			UNWRAP_MAP(cpus)->map[i].cpu = i;


Humm, perhaps shorter, something like _(cpus)->map[i].cpu = i; ?

>  	}
>  
>  	return cpus;
> @@ -94,15 +140,16 @@ static struct perf_cpu_map *cpu_map__trim_new(int nr_cpus, const struct perf_cpu
>  	int i, j;
>  
>  	if (cpus != NULL) {
> -		memcpy(cpus->map, tmp_cpus, payload_size);
> -		qsort(cpus->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
> +		memcpy(UNWRAP_MAP(cpus)->map, tmp_cpus, payload_size);
> +		qsort(UNWRAP_MAP(cpus)->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
>  		/* Remove dups */
>  		j = 0;
>  		for (i = 0; i < nr_cpus; i++) {
> -			if (i == 0 || cpus->map[i].cpu != cpus->map[i - 1].cpu)
> -				cpus->map[j++].cpu = cpus->map[i].cpu;
> +			if (i == 0 ||
> +			    UNWRAP_MAP(cpus)->map[i].cpu != UNWRAP_MAP(cpus)->map[i - 1].cpu)
> +				UNWRAP_MAP(cpus)->map[j++].cpu = UNWRAP_MAP(cpus)->map[i].cpu;
>  		}
> -		cpus->nr = j;
> +		UNWRAP_MAP(cpus)->nr = j;
>  		assert(j <= nr_cpus);
>  	}
>  	return cpus;
> @@ -263,20 +310,20 @@ struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx)
>  		.cpu = -1
>  	};
>  
> -	if (cpus && idx < cpus->nr)
> -		return cpus->map[idx];
> +	if (cpus && idx < UNWRAP_MAP(cpus)->nr)
> +		return UNWRAP_MAP(cpus)->map[idx];
>  
>  	return result;
>  }
>  
>  int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
>  {
> -	return cpus ? cpus->nr : 1;
> +	return cpus ? UNWRAP_MAP(cpus)->nr : 1;
>  }
>  
>  bool perf_cpu_map__empty(const struct perf_cpu_map *map)
>  {
> -	return map ? map->map[0].cpu == -1 : true;
> +	return map ? UNWRAP_MAP(map)->map[0].cpu == -1 : true;
>  }
>  
>  int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
> @@ -287,10 +334,10 @@ int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
>  		return -1;
>  
>  	low = 0;
> -	high = cpus->nr;
> +	high = UNWRAP_MAP(cpus)->nr;
>  	while (low < high) {
>  		int idx = (low + high) / 2;
> -		struct perf_cpu cpu_at_idx = cpus->map[idx];
> +		struct perf_cpu cpu_at_idx = UNWRAP_MAP(cpus)->map[idx];
>  
>  		if (cpu_at_idx.cpu == cpu.cpu)
>  			return idx;
> @@ -316,7 +363,7 @@ struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map)
>  	};
>  
>  	// cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
> -	return map->nr > 0 ? map->map[map->nr - 1] : result;
> +	return UNWRAP_MAP(map)->nr > 0 ? UNWRAP_MAP(map)->map[UNWRAP_MAP(map)->nr - 1] : result;
>  }
>  
>  /*
> @@ -337,37 +384,36 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
>  
>  	if (!orig && !other)
>  		return NULL;
> -	if (!orig) {
> -		perf_cpu_map__get(other);
> -		return other;
> -	}
> +	if (!orig)
> +		return perf_cpu_map__get(other);
>  	if (!other)
>  		return orig;
> -	if (orig->nr == other->nr &&
> -	    !memcmp(orig->map, other->map, orig->nr * sizeof(struct perf_cpu)))
> +	if (UNWRAP_MAP(orig)->nr == UNWRAP_MAP(other)->nr &&
> +	    !memcmp(UNWRAP_MAP(orig)->map, UNWRAP_MAP(other)->map,
> +		    UNWRAP_MAP(orig)->nr * sizeof(struct perf_cpu)))
>  		return orig;
>  
> -	tmp_len = orig->nr + other->nr;
> +	tmp_len = UNWRAP_MAP(orig)->nr + UNWRAP_MAP(other)->nr;
>  	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
>  	if (!tmp_cpus)
>  		return NULL;
>  
>  	/* Standard merge algorithm from wikipedia */
>  	i = j = k = 0;
> -	while (i < orig->nr && j < other->nr) {
> -		if (orig->map[i].cpu <= other->map[j].cpu) {
> -			if (orig->map[i].cpu == other->map[j].cpu)
> +	while (i < UNWRAP_MAP(orig)->nr && j < UNWRAP_MAP(other)->nr) {
> +		if (UNWRAP_MAP(orig)->map[i].cpu <= UNWRAP_MAP(other)->map[j].cpu) {
> +			if (UNWRAP_MAP(orig)->map[i].cpu == UNWRAP_MAP(other)->map[j].cpu)
>  				j++;
> -			tmp_cpus[k++] = orig->map[i++];
> +			tmp_cpus[k++] = UNWRAP_MAP(orig)->map[i++];
>  		} else
> -			tmp_cpus[k++] = other->map[j++];
> +			tmp_cpus[k++] = UNWRAP_MAP(other)->map[j++];
>  	}
>  
> -	while (i < orig->nr)
> -		tmp_cpus[k++] = orig->map[i++];
> +	while (i < UNWRAP_MAP(orig)->nr)
> +		tmp_cpus[k++] = UNWRAP_MAP(orig)->map[i++];
>  
> -	while (j < other->nr)
> -		tmp_cpus[k++] = other->map[j++];
> +	while (j < UNWRAP_MAP(other)->nr)
> +		tmp_cpus[k++] = UNWRAP_MAP(other)->map[j++];
>  	assert(k <= tmp_len);
>  
>  	merged = cpu_map__trim_new(k, tmp_cpus);
> diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h
> index 581f9ffb4237..64ad56d167a0 100644
> --- a/tools/lib/perf/include/internal/cpumap.h
> +++ b/tools/lib/perf/include/internal/cpumap.h
> @@ -16,7 +16,12 @@ struct perf_cpu {
>   * gaps if CPU numbers were used. For events associated with a pid, rather than
>   * a CPU, a single dummy map with an entry of -1 is used.
>   */
> -struct perf_cpu_map {
> +#ifndef REFCNT_CHECKING
> +struct perf_cpu_map
> +#else
> +struct original_perf_cpu_map
> +#endif
> +{
>  	refcount_t	refcnt;
>  	/** Length of the map array. */
>  	int		nr;
> @@ -24,10 +29,17 @@ struct perf_cpu_map {
>  	struct perf_cpu	map[];
>  };
>  
> +#ifdef REFCNT_CHECKING
> +struct perf_cpu_map {
> +	struct original_perf_cpu_map *orig;
> +};
> +#endif
> +
>  #ifndef MAX_NR_CPUS
>  #define MAX_NR_CPUS	2048
>  #endif
>  
> +struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus);
>  int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu);
>  
>  #endif /* __LIBPERF_INTERNAL_CPUMAP_H */
> diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
> index 84e87e31f119..65d9546aec6e 100644
> --- a/tools/perf/tests/cpumap.c
> +++ b/tools/perf/tests/cpumap.c
> @@ -11,6 +11,12 @@
>  
>  struct machine;
>  
> +#ifndef REFCNT_CHECKING
> +#define UNWRAP_CPUMAP(x) x
> +#else
> +#define UNWRAP_CPUMAP(x) x->orig
> +#endif
> +
>  static int process_event_mask(struct perf_tool *tool __maybe_unused,
>  			 union perf_event *event,
>  			 struct perf_sample *sample __maybe_unused,
> @@ -35,10 +41,10 @@ static int process_event_mask(struct perf_tool *tool __maybe_unused,
>  	}
>  
>  	map = cpu_map__new_data(data);
> -	TEST_ASSERT_VAL("wrong nr",  map->nr == 20);
> +	TEST_ASSERT_VAL("wrong nr",  perf_cpu_map__nr(map) == 20);
>  
>  	for (i = 0; i < 20; i++) {
> -		TEST_ASSERT_VAL("wrong cpu", map->map[i].cpu == i);
> +		TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, i).cpu == i);
>  	}
>  
>  	perf_cpu_map__put(map);
> @@ -66,10 +72,10 @@ static int process_event_cpus(struct perf_tool *tool __maybe_unused,
>  	TEST_ASSERT_VAL("wrong cpu",  cpus->cpu[1] == 256);
>  
>  	map = cpu_map__new_data(data);
> -	TEST_ASSERT_VAL("wrong nr",  map->nr == 2);
> -	TEST_ASSERT_VAL("wrong cpu", map->map[0].cpu == 1);
> -	TEST_ASSERT_VAL("wrong cpu", map->map[1].cpu == 256);
> -	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&map->refcnt) == 1);
> +	TEST_ASSERT_VAL("wrong nr",  perf_cpu_map__nr(map) == 2);
> +	TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, 0).cpu == 1);
> +	TEST_ASSERT_VAL("wrong cpu", perf_cpu_map__cpu(map, 1).cpu == 256);

The above should be in as separate patch making it use the accessors? I
think I merged patches like that already for some other cases?

> +	TEST_ASSERT_VAL("wrong refcnt", refcount_read(&UNWRAP_CPUMAP(map)->refcnt) == 1);
>  	perf_cpu_map__put(map);
>  	return 0;
>  }
> @@ -130,7 +136,7 @@ static int test__cpu_map_merge(struct test_suite *test __maybe_unused, int subte
>  	struct perf_cpu_map *c = perf_cpu_map__merge(a, b);
>  	char buf[100];
>  
> -	TEST_ASSERT_VAL("failed to merge map: bad nr", c->nr == 5);
> +	TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(c) == 5);
>  	cpu_map__snprint(c, buf, sizeof(buf));
>  	TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, "1-2,4-5,7"));
>  	perf_cpu_map__put(b);
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 12b2243222b0..e9976ca238fc 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -22,6 +22,12 @@ static int max_node_num;
>   */
>  static int *cpunode_map;
>  
> +#ifndef REFCNT_CHECKING
> +#define UNWRAP_MAP(x) x
> +#else
> +#define UNWRAP_MAP(x) x->orig
> +#endif
> +
>  static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
>  {
>  	struct perf_cpu_map *map;
> @@ -37,9 +43,9 @@ static struct perf_cpu_map *cpu_map__from_entries(struct cpu_map_entries *cpus)
>  			 * otherwise it would become 65535.
>  			 */
>  			if (cpus->cpu[i] == (u16) -1)
> -				map->map[i].cpu = -1;
> +				UNWRAP_MAP(map)->map[i].cpu = -1;
>  			else
> -				map->map[i].cpu = (int) cpus->cpu[i];
> +				UNWRAP_MAP(map)->map[i].cpu = (int) cpus->cpu[i];
>  		}
>  	}
>  
> @@ -58,7 +64,7 @@ static struct perf_cpu_map *cpu_map__from_mask(struct perf_record_record_cpu_map
>  		int cpu, i = 0;
>  
>  		for_each_set_bit(cpu, mask->mask, nbits)
> -			map->map[i++].cpu = cpu;
> +			UNWRAP_MAP(map)->map[i++].cpu = cpu;
>  	}
>  	return map;
>  
> @@ -84,16 +90,13 @@ size_t cpu_map__fprintf(struct perf_cpu_map *map, FILE *fp)
>  
>  struct perf_cpu_map *perf_cpu_map__empty_new(int nr)
>  {
> -	struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(int) * nr);
> +	struct perf_cpu_map *cpus = perf_cpu_map__alloc(nr);
>  
>  	if (cpus != NULL) {
>  		int i;
>  
> -		cpus->nr = nr;
>  		for (i = 0; i < nr; i++)
> -			cpus->map[i].cpu = -1;
> -
> -		refcount_set(&cpus->refcnt, 1);
> +			UNWRAP_MAP(cpus)->map[i].cpu = -1;
>  	}
>  
>  	return cpus;
> @@ -163,7 +166,7 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
>  {
>  	int idx;
>  	struct perf_cpu cpu;
> -	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(cpus->nr);
> +	struct cpu_aggr_map *c = cpu_aggr_map__empty_new(perf_cpu_map__nr(cpus));
>  
>  	if (!c)
>  		return NULL;
> @@ -187,7 +190,7 @@ struct cpu_aggr_map *cpu_aggr_map__new(const struct perf_cpu_map *cpus,
>  		}
>  	}
>  	/* Trim. */
> -	if (c->nr != cpus->nr) {
> +	if (c->nr != perf_cpu_map__nr(cpus)) {
>  		struct cpu_aggr_map *trimmed_c =
>  			realloc(c,
>  				sizeof(struct cpu_aggr_map) + sizeof(struct aggr_cpu_id) * c->nr);
> @@ -494,31 +497,32 @@ size_t cpu_map__snprint(struct perf_cpu_map *map, char *buf, size_t size)
>  
>  #define COMMA first ? "" : ","
>  
> -	for (i = 0; i < map->nr + 1; i++) {
> +	for (i = 0; i < perf_cpu_map__nr(map) + 1; i++) {
>  		struct perf_cpu cpu = { .cpu = INT_MAX };
> -		bool last = i == map->nr;
> +		bool last = i == perf_cpu_map__nr(map);
>  
>  		if (!last)
> -			cpu = map->map[i];
> +			cpu = perf_cpu_map__cpu(map, i);
>  
>  		if (start == -1) {
>  			start = i;
>  			if (last) {
>  				ret += snprintf(buf + ret, size - ret,
>  						"%s%d", COMMA,
> -						map->map[i].cpu);
> +						perf_cpu_map__cpu(map, i).cpu);
>  			}
> -		} else if (((i - start) != (cpu.cpu - map->map[start].cpu)) || last) {
> +		} else if (((i - start) != (cpu.cpu - perf_cpu_map__cpu(map, start).cpu)) || last) {
>  			int end = i - 1;
>  
>  			if (start == end) {
>  				ret += snprintf(buf + ret, size - ret,
>  						"%s%d", COMMA,
> -						map->map[start].cpu);
> +						perf_cpu_map__cpu(map, start).cpu);
>  			} else {
>  				ret += snprintf(buf + ret, size - ret,
>  						"%s%d-%d", COMMA,
> -						map->map[start].cpu, map->map[end].cpu);
> +						perf_cpu_map__cpu(map, start).cpu,
> +						perf_cpu_map__cpu(map, end).cpu);
>  			}
>  			first = false;
>  			start = i;
> @@ -545,7 +549,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
>  	int i, cpu;
>  	char *ptr = buf;
>  	unsigned char *bitmap;
> -	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, map->nr - 1);
> +	struct perf_cpu last_cpu = perf_cpu_map__cpu(map, perf_cpu_map__nr(map) - 1);
>  
>  	if (buf == NULL)
>  		return 0;
> @@ -556,7 +560,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
>  		return 0;
>  	}
>  
> -	for (i = 0; i < map->nr; i++) {
> +	for (i = 0; i < perf_cpu_map__nr(map); i++) {
>  		cpu = perf_cpu_map__cpu(map, i).cpu;
>  		bitmap[cpu / 8] |= 1 << (cpu % 8);
>  	}
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 8dfbba15aeb8..1649321fe86f 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -1992,13 +1992,20 @@ int perf_pmu__match(char *pattern, char *name, char *tok)
>  	return 0;
>  }
>  
> +#ifndef REFCNT_CHECKING
> +#define UNWRAP_CPUMAP(x) x
> +#else
> +#define UNWRAP_CPUMAP(x) x->orig
> +#endif
> +
>  int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
>  			 struct perf_cpu_map **mcpus_ptr,
>  			 struct perf_cpu_map **ucpus_ptr)
>  {
>  	struct perf_cpu_map *pmu_cpus = pmu->cpus;
>  	struct perf_cpu_map *matched_cpus, *unmatched_cpus;
> -	int matched_nr = 0, unmatched_nr = 0;
> +	struct perf_cpu cpu;
> +	int i, matched_nr = 0, unmatched_nr = 0;
>  
>  	matched_cpus = perf_cpu_map__default_new();
>  	if (!matched_cpus)
> @@ -2010,18 +2017,15 @@ int perf_pmu__cpus_match(struct perf_pmu *pmu, struct perf_cpu_map *cpus,
>  		return -1;
>  	}
>  
> -	for (int i = 0; i < cpus->nr; i++) {
> -		int cpu;
> -
> -		cpu = perf_cpu_map__idx(pmu_cpus, cpus->map[i]);
> -		if (cpu == -1)
> -			unmatched_cpus->map[unmatched_nr++] = cpus->map[i];
> +	perf_cpu_map__for_each_cpu(cpu, i, cpus) {
> +		if (!perf_cpu_map__has(pmu_cpus, cpu))
> +			UNWRAP_CPUMAP(unmatched_cpus)->map[unmatched_nr++] = cpu;
>  		else
> -			matched_cpus->map[matched_nr++] = cpus->map[i];
> +			UNWRAP_CPUMAP(matched_cpus)->map[matched_nr++] = cpu;
>  	}
>  
> -	unmatched_cpus->nr = unmatched_nr;
> -	matched_cpus->nr = matched_nr;
> +	UNWRAP_CPUMAP(unmatched_cpus)->nr = unmatched_nr;
> +	UNWRAP_CPUMAP(matched_cpus)->nr = matched_nr;
>  	*mcpus_ptr = matched_cpus;
>  	*ucpus_ptr = unmatched_cpus;
>  	return 0;
> -- 
> 2.35.0.rc0.227.g00780c9af4-goog

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-01-30 17:40             ` Ian Rogers
@ 2022-02-04 14:57               ` Masami Hiramatsu
  2022-02-04 19:11                 ` Ian Rogers
  0 siblings, 1 reply; 20+ messages in thread
From: Masami Hiramatsu @ 2022-02-04 14:57 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Sun, 30 Jan 2022 09:40:21 -0800
Ian Rogers <irogers@google.com> wrote:

> > > > Hi Ian,
> > > >
> > > > Hmm, but such a macro is not usual for C which perf is written in.
> > > > If I understand correctly, you might want to use memory leak
> > > > analyzer to detect refcount leak, and that analyzer will show
> > > > what data structure is leaked.
> > >
> > > Firstly, thanks for the conversation - this is really useful to
> > > improve the code!
> >
> > Hi Ian,
> >
> > You're welcome! This conversation also useful to me to understand
> > the issue deeper :-)
> >
> > > I think in an ideal world we'd somehow educate things like address
> > > sanitizer of reference counted data structures and they would do a
> > > better job of tracking gets and puts. The problem is pairing gets and
> > > puts.
> >
> > Agreed. From my experience, pairing gets and puts are hard without
> > reviewing the source code, since the refcounter is not only used in a
> > single function, but it is for keeping the object not released until
> > some long process is finished.
> >
> > For example, if the correct pair is like below, funcA-funcD, funcB-funcC,
> > funcA (get)
> > funcB (get)
> > funcC (put)
> > funcD (put)
> >
> > But depending on the execution timing, funcC/funcD can be swapped.
> > funcA (get)
> > funcB (get)
> > funcD (put)
> > funcC (put)
> >
> > And if there is a bug, funcX may get the object by mistake.
> > funcA (get)
> > funcX (get)
> > funcB (get)
> > funcD (put)
> > funcC (put)
> >
> > Or, funcC() might miss to put.
> > funcA (get)
> > funcB (get)
> > funcD (put)
> >
> > In any case, just tracking the get/put, it is hard to know which pair
> > is not right. I saw these patterns when I debugged it. :(
> 
> Yep, I've found this issue too :-) The get is being used for the
> side-effect of incrementing a reference count rather than for
> returning the value. This happened in cpumap merge and was easy to fix
> there.
> 
> This problem is possible in general, but I think if it were common we
> are doomed. I don't think this pattern is common though. In general a
> reference count is owned by something, the scope of a function or the
> lifetime of a list. If puts were adhoc then it would mean that one
> occurring in a function could be doing it for the side effect of
> freeing on a list. I don't think the code aims to do that. Making the
> code clean with pairing gets and puts is an issue, like with the
> cpumap merge change.

Hi Ian,
Sorry for waiting.

I got the pairing of get/put is not so hard if we use your
programing pattern. The problem was the posession of the object.
As you proposed, if we force users to use the returning "token"
instead of object pointer itself as below;

funcA(obj) {
  token = get(obj);
  get_obj(token)->...
  put(token);
}

Then it is clear who leaks the token.

 funcA (get-> token1)
 funcX (get-> token3)
 funcB (get-> token2)
 funcD (put-> token2)
 funcC (put-> token1)

In this case token3 is not freed, thus the funcX's pair is lost.
Or,

 funcA (get-> token1)
 funcB (get-> token2)
 funcD (put-> token2)

In this case funcA's pair is lost.

And if the user access object with the token which already put,
it can be detected.


> 
> > > In C++ you use RAII types so that the destructor ensures a put -
> > > this can be complex when using data types like lists where you want to
> > > move or swap things onto the list, to keep the single pointer
> > > property. In the C code in Linux we use gotos, similarly to how defer
> > > is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> > > the get/put pairing problem by adding a cookie that is passed around.
> >
> > Cool! I will check the ref_tracker :)
> >
> > > The problem with that is that then the cookie becomes part of the API.
> >
> > What the cookie is? some pairing function address??
> 
> As I understand it, a token to identify a get.

Yeah, I slightly missed that your API will force caller to use the
returning token instead of object.
So, what about using token always, instead of wrapping the object
pointer only when debugging?

I felt uncomfortable about changing the data structure name according
to the debug macro. Instead, it is better way for me if get() always
returns a token of the object and users need to convert the
token to the object. For example;

struct perf_cpu_map {
...
};

#ifdef REFCNT_CHECKING
typedef struct {struct perf_cpu_map *orig;} perf_cpu_map_token_t;
#else
typedef unsigned long perf_cpu_map_token_t;	/* actually this is "struct perf_cpu_map *" */
#endif

perf_cpu_map_token_t perf_cpu_map__get(struct perf_cpu_map *map);
void perf_cpu_map__put(struct perf_cpu_map_token_t tok);

This explicitly forces users to convert the token to the object
when using it. Of course if a user uses the object pointer ("map" here)
directly, the token is not used. But we can check such wrong usage by
compilation.

[...]
> > So my question is that we need to pay the cost to use UNWRAP_ macro
> > on all those object just for finding the "use-after-put" case.
> > Usually the case that "use-after-put" causes actual problem is very
> > limited, or it can be "use-after-free".
> 
> So the dso->nsinfo case was a use after put once we added in the
> missing put - it could also be thought of as a double put/free. In
> general use-after-put is going to show where a strict get-then-put
> isn't being followed, if we make sure of that property then the
> reference counting will be accurate.

The double free/put will be detected different way. But indeed the
use-after-put can be overlooked (I think there actually be the
case, it should be "use-after-free" but it depends on the timing.)

> 
> A case that came up previously was maps__find:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.c#n974
> this code retuns a map but without doing a get on it, even though a
> map is reference counted:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.h#n46
> If we change that code to return a get of the map then we add overhead
> for simple cases of checking a map is present - you can infer you have
> a reference count on the map if you have it on the maps. The
> indirection allows the code to remain as-is, while being able to catch
> misuse.

I don't think using the UNWRAP_* macro is "remain as-is" ;) but I agree
with you that can catch the misuse.

IMHO, I rather like using the explicit token. I don't like to see
"UNWRAP_map(map)->field", but "GET_map(token)->field" is good for me.
This is because the "map" should be a pointer of data structure (so
its field can be accessed without any wrapper), but token is just a
value (so this implies that it must be converted always). 
In other words, "map->field" looks natural for reviewing, but
"token->field" looks obviously strange.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-02-04 14:57               ` Masami Hiramatsu
@ 2022-02-04 19:11                 ` Ian Rogers
  2022-02-05  4:41                   ` Masami Hiramatsu
  0 siblings, 1 reply; 20+ messages in thread
From: Ian Rogers @ 2022-02-04 19:11 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Fri, Feb 4, 2022 at 6:57 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Sun, 30 Jan 2022 09:40:21 -0800
> Ian Rogers <irogers@google.com> wrote:
>
> > > > > Hi Ian,
> > > > >
> > > > > Hmm, but such a macro is not usual for C which perf is written in.
> > > > > If I understand correctly, you might want to use memory leak
> > > > > analyzer to detect refcount leak, and that analyzer will show
> > > > > what data structure is leaked.
> > > >
> > > > Firstly, thanks for the conversation - this is really useful to
> > > > improve the code!
> > >
> > > Hi Ian,
> > >
> > > You're welcome! This conversation also useful to me to understand
> > > the issue deeper :-)
> > >
> > > > I think in an ideal world we'd somehow educate things like address
> > > > sanitizer of reference counted data structures and they would do a
> > > > better job of tracking gets and puts. The problem is pairing gets and
> > > > puts.
> > >
> > > Agreed. From my experience, pairing gets and puts are hard without
> > > reviewing the source code, since the refcounter is not only used in a
> > > single function, but it is for keeping the object not released until
> > > some long process is finished.
> > >
> > > For example, if the correct pair is like below, funcA-funcD, funcB-funcC,
> > > funcA (get)
> > > funcB (get)
> > > funcC (put)
> > > funcD (put)
> > >
> > > But depending on the execution timing, funcC/funcD can be swapped.
> > > funcA (get)
> > > funcB (get)
> > > funcD (put)
> > > funcC (put)
> > >
> > > And if there is a bug, funcX may get the object by mistake.
> > > funcA (get)
> > > funcX (get)
> > > funcB (get)
> > > funcD (put)
> > > funcC (put)
> > >
> > > Or, funcC() might miss to put.
> > > funcA (get)
> > > funcB (get)
> > > funcD (put)
> > >
> > > In any case, just tracking the get/put, it is hard to know which pair
> > > is not right. I saw these patterns when I debugged it. :(
> >
> > Yep, I've found this issue too :-) The get is being used for the
> > side-effect of incrementing a reference count rather than for
> > returning the value. This happened in cpumap merge and was easy to fix
> > there.
> >
> > This problem is possible in general, but I think if it were common we
> > are doomed. I don't think this pattern is common though. In general a
> > reference count is owned by something, the scope of a function or the
> > lifetime of a list. If puts were adhoc then it would mean that one
> > occurring in a function could be doing it for the side effect of
> > freeing on a list. I don't think the code aims to do that. Making the
> > code clean with pairing gets and puts is an issue, like with the
> > cpumap merge change.
>
> Hi Ian,
> Sorry for waiting.
>
> I got the pairing of get/put is not so hard if we use your
> programing pattern. The problem was the posession of the object.
> As you proposed, if we force users to use the returning "token"
> instead of object pointer itself as below;
>
> funcA(obj) {
>   token = get(obj);
>   get_obj(token)->...
>   put(token);
> }
>
> Then it is clear who leaks the token.
>
>  funcA (get-> token1)
>  funcX (get-> token3)
>  funcB (get-> token2)
>  funcD (put-> token2)
>  funcC (put-> token1)
>
> In this case token3 is not freed, thus the funcX's pair is lost.
> Or,
>
>  funcA (get-> token1)
>  funcB (get-> token2)
>  funcD (put-> token2)
>
> In this case funcA's pair is lost.
>
> And if the user access object with the token which already put,
> it can be detected.
>
>
> >
> > > > In C++ you use RAII types so that the destructor ensures a put -
> > > > this can be complex when using data types like lists where you want to
> > > > move or swap things onto the list, to keep the single pointer
> > > > property. In the C code in Linux we use gotos, similarly to how defer
> > > > is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> > > > the get/put pairing problem by adding a cookie that is passed around.
> > >
> > > Cool! I will check the ref_tracker :)
> > >
> > > > The problem with that is that then the cookie becomes part of the API.
> > >
> > > What the cookie is? some pairing function address??
> >
> > As I understand it, a token to identify a get.
>
> Yeah, I slightly missed that your API will force caller to use the
> returning token instead of object.
> So, what about using token always, instead of wrapping the object
> pointer only when debugging?
>
> I felt uncomfortable about changing the data structure name according
> to the debug macro. Instead, it is better way for me if get() always
> returns a token of the object and users need to convert the
> token to the object. For example;
>
> struct perf_cpu_map {
> ...
> };
>
> #ifdef REFCNT_CHECKING
> typedef struct {struct perf_cpu_map *orig;} perf_cpu_map_token_t;
> #else
> typedef unsigned long perf_cpu_map_token_t;     /* actually this is "struct perf_cpu_map *" */
> #endif
>
> perf_cpu_map_token_t perf_cpu_map__get(struct perf_cpu_map *map);
> void perf_cpu_map__put(struct perf_cpu_map_token_t tok);
>
> This explicitly forces users to convert the token to the object
> when using it. Of course if a user uses the object pointer ("map" here)
> directly, the token is not used. But we can check such wrong usage by
> compilation.
>
> [...]
> > > So my question is that we need to pay the cost to use UNWRAP_ macro
> > > on all those object just for finding the "use-after-put" case.
> > > Usually the case that "use-after-put" causes actual problem is very
> > > limited, or it can be "use-after-free".
> >
> > So the dso->nsinfo case was a use after put once we added in the
> > missing put - it could also be thought of as a double put/free. In
> > general use-after-put is going to show where a strict get-then-put
> > isn't being followed, if we make sure of that property then the
> > reference counting will be accurate.
>
> The double free/put will be detected different way. But indeed the
> use-after-put can be overlooked (I think there actually be the
> case, it should be "use-after-free" but it depends on the timing.)
>
> >
> > A case that came up previously was maps__find:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.c#n974
> > this code retuns a map but without doing a get on it, even though a
> > map is reference counted:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.h#n46
> > If we change that code to return a get of the map then we add overhead
> > for simple cases of checking a map is present - you can infer you have
> > a reference count on the map if you have it on the maps. The

> > indirection allows the code to remain as-is, while being able to catch
> > misuse.
>
> I don't think using the UNWRAP_* macro is "remain as-is" ;) but I agree
> with you that can catch the misuse.
>
> IMHO, I rather like using the explicit token. I don't like to see
> "UNWRAP_map(map)->field", but "GET_map(token)->field" is good for me.
> This is because the "map" should be a pointer of data structure (so
> its field can be accessed without any wrapper), but token is just a
> value (so this implies that it must be converted always).
> In other words, "map->field" looks natural for reviewing, but
> "token->field" looks obviously strange.

Thanks Masami, this is all very constructive. I think with the naming
we can get there. I'm a little uncomfortable paying a cost for the
token when not debugging, but that is more of a nit as I doubt the
performance overhead really matters. I am planning a v3 patch set to
use macros, break out more stuff that can be merged regardless, etc.

I've also started to take a look at the maps and map structs. maps is
an easy struct to add the check to, no more difficult than cpumap and
nsinfo. So that's pretty good, we've been able to have good leak
detection on 3 structs catching missing puts, uses after puts, etc.
The problem I'm having is with struct map as not only is it reference
counted, it is a node in a red-black tree or a list:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.h?h=perf/core#n20

For nsinfo we add a layer of indirection between the dso and the
nsinfo objects. There is a single pointer, the indirection lives
there, life is easy. For a red-black tree node there are references
from the children and the parent, none of which in the current code
impact the refcnt field. I can in the 'find' code add a layer of
wrapping, but this is a layer of wrapping that doesn't correspond to
an actual get and means things aren't working as intended. My feeling
here is that the 'struct map' needs a refactor. The redblack tree and
list code should be their own structs - struct map_rbnode and
map_listnode or some such. When those structs point to the reference
counted data then they add a reference count of 1 and have the
indirection layer associated with that reference count. The separation
is easier for humans and machines to understand. Where this fails is
if the current code wants to look from the map to a rbtree neighbor or
list node neighbor, which may not be a problem.

Overall it could be that the 'struct map' code was just in need of a
refactor or it could point to this approach adding yet more overhead.
I'm going to push on it as even if this just turns into a fork, I can
catch bugs that are real.

Thanks,
Ian

> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/4] Reference count checker and related fixes
  2022-02-04 19:11                 ` Ian Rogers
@ 2022-02-05  4:41                   ` Masami Hiramatsu
  0 siblings, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2022-02-05  4:41 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Darren Hart, Davidlohr Bueso,
	André Almeida, James Clark, John Garry, Riccardo Mancini,
	Yury Norov, Andy Shevchenko, Andrew Morton, Jin Yao,
	Adrian Hunter, Leo Yan, Andi Kleen, Thomas Richter, Kan Liang,
	Madhavan Srinivasan, Shunsuke Nakamura, Song Liu, Steven Rostedt,
	Miaoqian Lin, Stephen Brennan, Kajol Jain, Alexey Bayduraev,
	German Gomez, linux-perf-users, linux-kernel, Eric Dumazet,
	Dmitry Vyukov, masami.hiramatsu.pt, eranian

On Fri, 4 Feb 2022 11:11:48 -0800
Ian Rogers <irogers@google.com> wrote:

> On Fri, Feb 4, 2022 at 6:57 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Sun, 30 Jan 2022 09:40:21 -0800
> > Ian Rogers <irogers@google.com> wrote:
> >
> > > > > > Hi Ian,
> > > > > >
> > > > > > Hmm, but such a macro is not usual for C which perf is written in.
> > > > > > If I understand correctly, you might want to use memory leak
> > > > > > analyzer to detect refcount leak, and that analyzer will show
> > > > > > what data structure is leaked.
> > > > >
> > > > > Firstly, thanks for the conversation - this is really useful to
> > > > > improve the code!
> > > >
> > > > Hi Ian,
> > > >
> > > > You're welcome! This conversation also useful to me to understand
> > > > the issue deeper :-)
> > > >
> > > > > I think in an ideal world we'd somehow educate things like address
> > > > > sanitizer of reference counted data structures and they would do a
> > > > > better job of tracking gets and puts. The problem is pairing gets and
> > > > > puts.
> > > >
> > > > Agreed. From my experience, pairing gets and puts are hard without
> > > > reviewing the source code, since the refcounter is not only used in a
> > > > single function, but it is for keeping the object not released until
> > > > some long process is finished.
> > > >
> > > > For example, if the correct pair is like below, funcA-funcD, funcB-funcC,
> > > > funcA (get)
> > > > funcB (get)
> > > > funcC (put)
> > > > funcD (put)
> > > >
> > > > But depending on the execution timing, funcC/funcD can be swapped.
> > > > funcA (get)
> > > > funcB (get)
> > > > funcD (put)
> > > > funcC (put)
> > > >
> > > > And if there is a bug, funcX may get the object by mistake.
> > > > funcA (get)
> > > > funcX (get)
> > > > funcB (get)
> > > > funcD (put)
> > > > funcC (put)
> > > >
> > > > Or, funcC() might miss to put.
> > > > funcA (get)
> > > > funcB (get)
> > > > funcD (put)
> > > >
> > > > In any case, just tracking the get/put, it is hard to know which pair
> > > > is not right. I saw these patterns when I debugged it. :(
> > >
> > > Yep, I've found this issue too :-) The get is being used for the
> > > side-effect of incrementing a reference count rather than for
> > > returning the value. This happened in cpumap merge and was easy to fix
> > > there.
> > >
> > > This problem is possible in general, but I think if it were common we
> > > are doomed. I don't think this pattern is common though. In general a
> > > reference count is owned by something, the scope of a function or the
> > > lifetime of a list. If puts were adhoc then it would mean that one
> > > occurring in a function could be doing it for the side effect of
> > > freeing on a list. I don't think the code aims to do that. Making the
> > > code clean with pairing gets and puts is an issue, like with the
> > > cpumap merge change.
> >
> > Hi Ian,
> > Sorry for waiting.
> >
> > I got the pairing of get/put is not so hard if we use your
> > programing pattern. The problem was the posession of the object.
> > As you proposed, if we force users to use the returning "token"
> > instead of object pointer itself as below;
> >
> > funcA(obj) {
> >   token = get(obj);
> >   get_obj(token)->...
> >   put(token);
> > }
> >
> > Then it is clear who leaks the token.
> >
> >  funcA (get-> token1)
> >  funcX (get-> token3)
> >  funcB (get-> token2)
> >  funcD (put-> token2)
> >  funcC (put-> token1)
> >
> > In this case token3 is not freed, thus the funcX's pair is lost.
> > Or,
> >
> >  funcA (get-> token1)
> >  funcB (get-> token2)
> >  funcD (put-> token2)
> >
> > In this case funcA's pair is lost.
> >
> > And if the user access object with the token which already put,
> > it can be detected.
> >
> >
> > >
> > > > > In C++ you use RAII types so that the destructor ensures a put -
> > > > > this can be complex when using data types like lists where you want to
> > > > > move or swap things onto the list, to keep the single pointer
> > > > > property. In the C code in Linux we use gotos, similarly to how defer
> > > > > is used in Go. Anyway, the ref_tracker that Eric Dumazet added solved
> > > > > the get/put pairing problem by adding a cookie that is passed around.
> > > >
> > > > Cool! I will check the ref_tracker :)
> > > >
> > > > > The problem with that is that then the cookie becomes part of the API.
> > > >
> > > > What the cookie is? some pairing function address??
> > >
> > > As I understand it, a token to identify a get.
> >
> > Yeah, I slightly missed that your API will force caller to use the
> > returning token instead of object.
> > So, what about using token always, instead of wrapping the object
> > pointer only when debugging?
> >
> > I felt uncomfortable about changing the data structure name according
> > to the debug macro. Instead, it is better way for me if get() always
> > returns a token of the object and users need to convert the
> > token to the object. For example;
> >
> > struct perf_cpu_map {
> > ...
> > };
> >
> > #ifdef REFCNT_CHECKING
> > typedef struct {struct perf_cpu_map *orig;} perf_cpu_map_token_t;
> > #else
> > typedef unsigned long perf_cpu_map_token_t;     /* actually this is "struct perf_cpu_map *" */
> > #endif
> >
> > perf_cpu_map_token_t perf_cpu_map__get(struct perf_cpu_map *map);
> > void perf_cpu_map__put(struct perf_cpu_map_token_t tok);
> >
> > This explicitly forces users to convert the token to the object
> > when using it. Of course if a user uses the object pointer ("map" here)
> > directly, the token is not used. But we can check such wrong usage by
> > compilation.
> >
> > [...]
> > > > So my question is that we need to pay the cost to use UNWRAP_ macro
> > > > on all those object just for finding the "use-after-put" case.
> > > > Usually the case that "use-after-put" causes actual problem is very
> > > > limited, or it can be "use-after-free".
> > >
> > > So the dso->nsinfo case was a use after put once we added in the
> > > missing put - it could also be thought of as a double put/free. In
> > > general use-after-put is going to show where a strict get-then-put
> > > isn't being followed, if we make sure of that property then the
> > > reference counting will be accurate.
> >
> > The double free/put will be detected different way. But indeed the
> > use-after-put can be overlooked (I think there actually be the
> > case, it should be "use-after-free" but it depends on the timing.)
> >
> > >
> > > A case that came up previously was maps__find:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.c#n974
> > > this code retuns a map but without doing a get on it, even though a
> > > map is reference counted:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.h#n46
> > > If we change that code to return a get of the map then we add overhead
> > > for simple cases of checking a map is present - you can infer you have
> > > a reference count on the map if you have it on the maps. The
> 
> > > indirection allows the code to remain as-is, while being able to catch
> > > misuse.
> >
> > I don't think using the UNWRAP_* macro is "remain as-is" ;) but I agree
> > with you that can catch the misuse.
> >
> > IMHO, I rather like using the explicit token. I don't like to see
> > "UNWRAP_map(map)->field", but "GET_map(token)->field" is good for me.
> > This is because the "map" should be a pointer of data structure (so
> > its field can be accessed without any wrapper), but token is just a
> > value (so this implies that it must be converted always).
> > In other words, "map->field" looks natural for reviewing, but
> > "token->field" looks obviously strange.
> 
> Thanks Masami, this is all very constructive. I think with the naming
> we can get there. I'm a little uncomfortable paying a cost for the
> token when not debugging, but that is more of a nit as I doubt the
> performance overhead really matters. I am planning a v3 patch set to
> use macros, break out more stuff that can be merged regardless, etc.

Thanks for preparing the next version :-) I think even with my method
there is no overhead when not debugging, because it is just a unsigned-long
casted pointer (so internally it is treated as a pointer to map.)
Let me explain what the comment 'actually this is "struct perf_cpu_map *"'
meant.

#ifdef REFCNT_CHECKING
typedef struct {struct perf_cpu_map *orig;} perf_cpu_map_token_t;
#define REF_CPUMAP(tok) (tok->orig)
#else
typedef unsigned long perf_cpu_map_token_t;     /* actually this is "struct perf_cpu_map *" */
#define REF_CPUMAP(tok) ((struct perf_cpu_map *)tok)
#endif

This motivates (and forces) developers to use the REF_* macro
for the token. In your method, "obj->field" will cause an error only
when RECNT_CHECKING is defined. So when not debugging, the code must
be passed.
With the above token, "tok->field" must cause an error always. :-)

> 
> I've also started to take a look at the maps and map structs. maps is
> an easy struct to add the check to, no more difficult than cpumap and
> nsinfo. So that's pretty good, we've been able to have good leak
> detection on 3 structs catching missing puts, uses after puts, etc.

OK.

> The problem I'm having is with struct map as not only is it reference
> counted, it is a node in a red-black tree or a list:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/map.h?h=perf/core#n20
> 
> For nsinfo we add a layer of indirection between the dso and the
> nsinfo objects. There is a single pointer, the indirection lives
> there, life is easy. For a red-black tree node there are references
> from the children and the parent, none of which in the current code
> impact the refcnt field. I can in the 'find' code add a layer of
> wrapping, but this is a layer of wrapping that doesn't correspond to
> an actual get and means things aren't working as intended. My feeling
> here is that the 'struct map' needs a refactor. The redblack tree and
> list code should be their own structs - struct map_rbnode and
> map_listnode or some such. When those structs point to the reference
> counted data then they add a reference count of 1 and have the
> indirection layer associated with that reference count. The separation
> is easier for humans and machines to understand. Where this fails is
> if the current code wants to look from the map to a rbtree neighbor or
> list node neighbor, which may not be a problem.

That sounds good idea. I think there is another choice if map is always
in the rbtree. In this case, when we allocates the new map, it always
is placed in the rbtree (new() function always returns the object in
the tree) and when the refcount is 0, the map is removed from the tree
in the put() function. But this depends on how the map is used. If it
is not on the tree until it is initialized, or can be removed before
releasing object, your pattern (split list node and the object) is better.

> 
> Overall it could be that the 'struct map' code was just in need of a
> refactor or it could point to this approach adding yet more overhead.
> I'm going to push on it as even if this just turns into a fork, I can
> catch bugs that are real.


Thank you,


> 
> Thanks,
> Ian
> 
> > Thank you,
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-02-05  4:41 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-25 20:45 [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
2022-01-25 20:45 ` [PATCH v2 1/4] perf cpumap: Add reference count checking Ian Rogers
2022-01-31 14:44   ` Arnaldo Carvalho de Melo
2022-01-25 20:46 ` [PATCH v2 2/4] perf dso: Make lock error check and add BUG_ONs Ian Rogers
2022-01-25 20:46 ` [PATCH v2 3/4] perf dso: Hold lock when accessing nsinfo Ian Rogers
2022-01-25 20:46 ` [PATCH v2 4/4] perf namespaces: Add reference count checking Ian Rogers
2022-01-27 21:33 ` [PATCH v2 0/4] Reference count checker and related fixes Ian Rogers
2022-01-28  5:23   ` Masami Hiramatsu
2022-01-28  6:24     ` Ian Rogers
2022-01-28 15:34       ` Masami Hiramatsu
2022-01-28 18:26         ` Ian Rogers
2022-01-28 19:59           ` Arnaldo Carvalho de Melo
2022-01-30  8:04             ` Masami Hiramatsu
2022-01-31 14:28               ` Arnaldo Carvalho de Melo
2022-01-30  7:54           ` Masami Hiramatsu
2022-01-30 17:40             ` Ian Rogers
2022-02-04 14:57               ` Masami Hiramatsu
2022-02-04 19:11                 ` Ian Rogers
2022-02-05  4:41                   ` Masami Hiramatsu
2022-01-31 13:56           ` Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).