* "perf top -g" leaking ~300MB per second.
From: Markus Trippelsdorf @ 2014-12-13  8:48 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

Running "perf top -g" built from current Linus tree apparently leaks
~300MB of memory every second on my machine.

-- 
Markus

* Re: "perf top -g" leaking ~300MB per second.
From: Markus Trippelsdorf @ 2014-12-13  9:03 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

On 2014.12.13 at 09:48 +0100, Markus Trippelsdorf wrote:
> Running "perf top -g" built from current Linus tree apparently leaks
> ~300MB of memory every second on my machine.

Hmm, this is a much older problem. I just noticed it for the first time
today.
To reproduce: Compile some application in the background (make -j4 in my
case) and run "perf top -g". Perf will continue to accumulate memory
until the system starts to swap and the OOM killer eventually kicks in.

-- 
Markus

* Re: "perf top -g" leaking ~300MB per second.
From: Arnaldo Carvalho de Melo @ 2014-12-13 15:26 UTC
  To: Markus Trippelsdorf
  Cc: linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar

Em Sat, Dec 13, 2014 at 10:03:31AM +0100, Markus Trippelsdorf escreveu:
> On 2014.12.13 at 09:48 +0100, Markus Trippelsdorf wrote:
> > Running "perf top -g" built from current Linus tree apparently leaks
> > ~300MB of memory every second on my machine.
> 
> Hmm, this is a much older problem. I just noticed this the first time
> today. 
> To reproduce: Compile some application in the background (make -j4 in my
> case) and run "perf top -g". Perf will continue to accumulate memory
> until the system starts to swap and the OOM killer eventually kicks in.

Yeap, longstanding problem. As a workaround, try reducing its impact by
using a lower sampling frequency.

The callchain code was initially written for 'report', and when I made
'top' reuse the hist_entry code, allowing 'top' to collect callchains was
too easy. But we still need to go through the callchain/hists/hist_entry
code to make sure that it doesn't leak; will try to do it...

- Arnaldo

* Re: "perf top -g" leaking ~300MB per second.
From: David Ahern @ 2014-12-13 18:16 UTC
  To: Arnaldo Carvalho de Melo, Markus Trippelsdorf
  Cc: linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar

On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> The callchain code was done initially for 'report' and when I made 'top'
> reuse the hist_entry code allowing 'top' to collect callchains was too
> easy, but then we need to go thru the callchain/hists/hist_entry code to
> make sure that they don't leak, will try to do it...
>

As I recall, it is a buildup of the dead_threads list.

David


* Re: "perf top -g" leaking ~300MB per second.
From: Namhyung Kim @ 2014-12-30  5:38 UTC
  To: David Ahern
  Cc: Arnaldo Carvalho de Melo, Markus Trippelsdorf, linux-kernel,
	Peter Zijlstra, Paul Mackerras, Ingo Molnar

Hi David and Markus,

On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> >The callchain code was done initially for 'report' and when I made 'top'
> >reuse the hist_entry code allowing 'top' to collect callchains was too
> >easy, but then we need to go thru the callchain/hists/hist_entry code to
> >make sure that they don't leak, will try to do it...
> >
> 
> As I recall it is build up of the dead_threads list.

Maybe.  But I guess it's because of leaking callchains.

Markus, could you please test the patch below and see how much it helps?


From b29ccd79727654653986ab1170e0b1f5d6518035 Mon Sep 17 00:00:00 2001
From: Namhyung Kim <namhyung@kernel.org>
Date: Tue, 30 Dec 2014 14:28:45 +0900
Subject: [PATCH] perf callchain: Free callchains when hist entries are deleted

Markus reported that "perf top -g" can leak ~300MB per second on his
machine.  This is partly because callchains are not freed when
hist entries are deleted.  Fix it.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |  2 ++
 tools/perf/util/hist.c      |  1 +
 3 files changed, 33 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 64b377e591e4..14e7a123d43b 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -841,3 +841,33 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 
 	return bf;
 }
+
+static void free_callchain_node(struct callchain_node *node)
+{
+	struct callchain_list *list, *tmp;
+	struct callchain_node *child;
+	struct rb_node *n;
+
+	list_for_each_entry_safe(list, tmp, &node->val, list) {
+		list_del(&list->list);
+		free(list);
+	}
+
+	n = rb_first(&node->rb_root_in);
+	while (n) {
+		child = container_of(n, struct callchain_node, rb_node_in);
+		n = rb_next(n);
+		rb_erase(&child->rb_node_in, &node->rb_root_in);
+
+		free_callchain_node(child);
+		free(child);
+	}
+}
+
+void free_callchain(struct callchain_root *root)
+{
+	if (!symbol_conf.use_callchain)
+		return;
+
+	free_callchain_node(&root->node);
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index dbc08cf5f970..c0ec1acc38e4 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -198,4 +198,6 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
 char *callchain_list__sym_name(struct callchain_list *cl,
 			       char *bf, size_t bfsize, bool show_dso);
 
+void free_callchain(struct callchain_root *root);
+
 #endif	/* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 30ff2cb92884..e17163fcb702 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -945,6 +945,7 @@ void hist_entry__delete(struct hist_entry *he)
 	zfree(&he->mem_info);
 	zfree(&he->stat_acc);
 	free_srcline(he->srcline);
+	free_callchain(he->callchain);
 	free(he);
 }
 
-- 
2.1.3
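
The freeing order used by the patch above (release a node's own entries, then free each child only after its subtree has been released) can be sketched outside of perf. This is a minimal illustration, not the perf code: `struct frame`, `struct node`, the helper names and the `nr_freed` counter are all hypothetical, and a plain sibling list stands in for the rbtree of children that perf uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one callchain entry. */
struct frame {
	struct frame *next;
	unsigned long ip;
};

/* Hypothetical stand-in for a callchain node: a list of entries plus a
 * list of children (perf keeps the children in an rbtree instead). */
struct node {
	struct frame *frames;   /* entries owned by this node */
	struct node *children;  /* first child */
	struct node *sibling;   /* next child of the same parent */
};

/* Counts free() calls so a caller can check that nothing was skipped. */
static size_t nr_freed;

/* Free everything the node owns, but not the node itself: the root is
 * assumed to be embedded in its owner, so only children are heap nodes. */
static void node_free_contents(struct node *node)
{
	struct frame *f, *fnext;
	struct node *child, *cnext;

	assert(node != NULL);

	for (f = node->frames; f; f = fnext) {
		fnext = f->next;            /* read the link before freeing */
		free(f);
		nr_freed++;
	}
	node->frames = NULL;

	for (child = node->children; child; child = cnext) {
		cnext = child->sibling;     /* advance before the child goes away */
		node_free_contents(child);  /* descendants first */
		free(child);
		nr_freed++;
	}
	node->children = NULL;
}

/* Allocate a child and link it under parent; returns NULL on failure. */
static struct node *node_add_child(struct node *parent)
{
	struct node *child = calloc(1, sizeof(*child));

	if (!child)
		return NULL;
	child->sibling = parent->children;
	parent->children = child;
	return child;
}

/* Attach one entry to node; returns 0 on success, -1 on failure. */
static int node_add_frame(struct node *node, unsigned long ip)
{
	struct frame *f = calloc(1, sizeof(*f));

	if (!f)
		return -1;
	f->ip = ip;
	f->next = node->frames;
	node->frames = f;
	return 0;
}
```

As in the patch, the iterator is advanced before the current element is released, so the traversal never reads freed memory.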


* Re: "perf top -g" leaking ~300MB per second.
From: Markus Trippelsdorf @ 2014-12-30  8:35 UTC
  To: Namhyung Kim
  Cc: David Ahern, Arnaldo Carvalho de Melo, linux-kernel,
	Peter Zijlstra, Paul Mackerras, Ingo Molnar

On 2014.12.30 at 14:38 +0900, Namhyung Kim wrote:
> Hi David and Markus,
> 
> On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> > >The callchain code was done initially for 'report' and when I made 'top'
> > >reuse the hist_entry code allowing 'top' to collect callchains was too
> > >easy, but then we need to go thru the callchain/hists/hist_entry code to
> > >make sure that they don't leak, will try to do it...
> > >
> > 
> > As I recall it is build up of the dead_threads list.
> 
> Maybe.  But I guess it's because of leak of callchains..
> 
> Markus, could you please test below patch how much it affects?

Thanks Namhyung. It leaks an order of magnitude less memory now:
~30MB/sec on my machine.

Valgrind shows (last entries of the list):
...
==20512== 7,225,920 bytes in 17,370 blocks are possibly lost in loss record 295 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4A6996: zalloc (util.h:189)
==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
==20512==    by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x4395EE: __cmd_top (builtin-top.c:974)
==20512==    by 0x4395EE: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 8,922,480 bytes in 159,330 blocks are possibly lost in loss record 296 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4821FE: zalloc (util.h:189)
==20512==    by 0x4821FE: fill_node (callchain.c:450)
==20512==    by 0x4821FE: add_child (callchain.c:473)
==20512==    by 0x4821FE: append_chain_children (callchain.c:596)
==20512==    by 0x48514E: callchain_append (callchain.c:672)
==20512==    by 0x4A8947: iter_add_single_cumulative_entry (hist.c:739)
==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 11,050,136 bytes in 1,663 blocks are definitely lost in loss record 297 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x44F7BF: zalloc (util.h:189)
==20512==    by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
==20512==    by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
==20512==    by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
==20512==    by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
==20512==    by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
==20512==    by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 24,920,064 bytes in 59,904 blocks are possibly lost in loss record 298 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4A6996: zalloc (util.h:189)
==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
==20512==    by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 26,147,680 bytes in 62,855 blocks are possibly lost in loss record 299 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4A6996: zalloc (util.h:189)
==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
==20512==    by 0x4A8728: iter_add_next_cumulative_entry (hist.c:803)
==20512==    by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 44,939,792 bytes in 3,788 blocks are definitely lost in loss record 300 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x44F7BF: zalloc (util.h:189)
==20512==    by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
==20512==    by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
==20512==    by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
==20512==    by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
==20512==    by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
==20512==    by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x4395EE: __cmd_top (builtin-top.c:974)
==20512==    by 0x4395EE: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 81,248,440 bytes in 1,450,865 blocks are possibly lost in loss record 301 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4821FE: zalloc (util.h:189)
==20512==    by 0x4821FE: fill_node (callchain.c:450)
==20512==    by 0x4821FE: add_child (callchain.c:473)
==20512==    by 0x4821FE: append_chain_children (callchain.c:596)
==20512==    by 0x48514E: callchain_append (callchain.c:672)
==20512==    by 0x4A8768: iter_add_next_cumulative_entry (hist.c:812)
==20512==    by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== LEAK SUMMARY:
==20512==    definitely lost: 56,810,427 bytes in 5,479 blocks
==20512==    indirectly lost: 4,000 bytes in 125 blocks
==20512==      possibly lost: 210,493,048 bytes in 2,552,578 blocks
==20512==    still reachable: 18,258 bytes in 242 blocks
==20512==         suppressed: 0 bytes in 0 blocks

-- 
Markus

* Re: "perf top -g" leaking ~300MB per second.
From: Arnaldo Carvalho de Melo @ 2014-12-30 23:04 UTC
  To: Markus Trippelsdorf
  Cc: Namhyung Kim, David Ahern, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar

Em Tue, Dec 30, 2014 at 09:35:24AM +0100, Markus Trippelsdorf escreveu:
> On 2014.12.30 at 14:38 +0900, Namhyung Kim wrote:
> > Hi David and Markus,
> > 
> > On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> > > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> > > >The callchain code was done initially for 'report' and when I made 'top'
> > > >reuse the hist_entry code allowing 'top' to collect callchains was too
> > > >easy, but then we need to go thru the callchain/hists/hist_entry code to
> > > >make sure that they don't leak, will try to do it...
> > > >
> > > 
> > > As I recall it is build up of the dead_threads list.
> > 
> > Maybe.  But I guess it's because of leak of callchains..
> > 
> > Markus, could you please test below patch how much it affects?
> 
> Thanks Namhyung. It leaks an order of magnitude less memory now:
> ~30MB/sec on my machine.
> 
> Valgrind shows (last entries of the list):
> ...
> ==20512== 7,225,920 bytes in 17,370 blocks are possibly lost in loss record 295 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4A6996: zalloc (util.h:189)
> ==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512==    by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
> ==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x4395EE: __cmd_top (builtin-top.c:974)
> ==20512==    by 0x4395EE: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)

#1) OK, those are the hist_entries that were not decayed yet. If top had
kept running they would eventually have been decayed and freed, i.e. the
exit of top is equivalent to the last decay.
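
The decay behaviour described here can be illustrated with a small sketch: every refresh ages each entry, and an entry whose weight reaches zero is unlinked and freed. This is not the perf implementation. `struct entry`, `decay_entries()` and `entry_push()` are hypothetical names, a singly linked list stands in for the real hists container, and the 7/8 aging factor is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified histogram entry kept on a singly linked list. */
struct entry {
	struct entry *next;
	uint64_t period;        /* accumulated sample weight */
};

/* Age every entry by 7/8 (assumed factor) and free those that reach zero.
 * Returns the number of entries freed in this pass. */
static size_t decay_entries(struct entry **head)
{
	struct entry **link = head;
	size_t freed = 0;

	assert(head != NULL);

	while (*link) {
		struct entry *e = *link;

		e->period = (e->period * 7) / 8;
		if (e->period == 0) {
			*link = e->next;    /* unlink before freeing */
			free(e);
			freed++;
		} else {
			link = &e->next;    /* keep the entry, move on */
		}
	}
	return freed;
}

/* Push a new entry with the given weight; returns NULL on failure. */
static struct entry *entry_push(struct entry **head, uint64_t period)
{
	struct entry *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->period = period;
	e->next = *head;
	*head = e;
	return e;
}
```

Under this scheme, entries that are still alive when the program exits were never leaked in the usual sense; they simply had not yet decayed to zero, which is why Valgrind reports them at exit.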

> ==20512== 
> ==20512== 8,922,480 bytes in 159,330 blocks are possibly lost in loss record 296 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4821FE: zalloc (util.h:189)
> ==20512==    by 0x4821FE: fill_node (callchain.c:450)
> ==20512==    by 0x4821FE: add_child (callchain.c:473)
> ==20512==    by 0x4821FE: append_chain_children (callchain.c:596)
> ==20512==    by 0x48514E: callchain_append (callchain.c:672)
> ==20512==    by 0x4A8947: iter_add_single_cumulative_entry (hist.c:739)
> ==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)
> ==20512== 

> ==20512== 11,050,136 bytes in 1,663 blocks are definitely lost in loss record 297 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x44F7BF: zalloc (util.h:189)
> ==20512==    by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
> ==20512==    by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
> ==20512==    by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
> ==20512==    by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
> ==20512==    by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
> ==20512==    by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)

These (symbol__alloc_hist) we need to free once the decaying per-RIP
annotation counts get to zero... Will check where to do this...

> ==20512== 
> ==20512== 24,920,064 bytes in 59,904 blocks are possibly lost in loss record 298 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4A6996: zalloc (util.h:189)
> ==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512==    by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
> ==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)

Same thing as #1.

> ==20512== 
> ==20512== 26,147,680 bytes in 62,855 blocks are possibly lost in loss record 299 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4A6996: zalloc (util.h:189)
> ==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512==    by 0x4A8728: iter_add_next_cumulative_entry (hist.c:803)
> ==20512==    by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)
> ==20512== 
> ==20512== 44,939,792 bytes in 3,788 blocks are definitely lost in loss record 300 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x44F7BF: zalloc (util.h:189)
> ==20512==    by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
> ==20512==    by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
> ==20512==    by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
> ==20512==    by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
> ==20512==    by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
> ==20512==    by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x4395EE: __cmd_top (builtin-top.c:974)
> ==20512==    by 0x4395EE: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)
> ==20512== 
> ==20512== 81,248,440 bytes in 1,450,865 blocks are possibly lost in loss record 301 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4821FE: zalloc (util.h:189)
> ==20512==    by 0x4821FE: fill_node (callchain.c:450)
> ==20512==    by 0x4821FE: add_child (callchain.c:473)
> ==20512==    by 0x4821FE: append_chain_children (callchain.c:596)
> ==20512==    by 0x48514E: callchain_append (callchain.c:672)
> ==20512==    by 0x4A8768: iter_add_next_cumulative_entry (hist.c:812)
> ==20512==    by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)
> ==20512== 
> ==20512== LEAK SUMMARY:
> ==20512==    definitely lost: 56,810,427 bytes in 5,479 blocks
> ==20512==    indirectly lost: 4,000 bytes in 125 blocks
> ==20512==      possibly lost: 210,493,048 bytes in 2,552,578 blocks
> ==20512==    still reachable: 18,258 bytes in 242 blocks
> ==20512==         suppressed: 0 bytes in 0 blocks
> 
> -- 
> Markus

* Re: "perf top -g" leaking ~300MB per second.
From: Arnaldo Carvalho de Melo @ 2015-01-08 15:09 UTC
  To: Namhyung Kim
  Cc: David Ahern, Markus Trippelsdorf, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar

Em Tue, Dec 30, 2014 at 02:38:13PM +0900, Namhyung Kim escreveu:
> Subject: [PATCH] perf callchain: Free callchains when hist entries are deleted
 
> Markus reported that "perf top -g" can leak ~300MB per second on his
> machine.  This is partly because it missed to free callchains when
> hist entries are deleted.  Fix it.

Thanks, applied.

- Arnaldo

* [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted
From: tip-bot for Namhyung Kim @ 2015-01-09 12:31 UTC
  To: linux-tip-commits
  Cc: hpa, mingo, linux-kernel, namhyung, acme, fweisbec, dsahern,
	a.p.zijlstra, markus, tglx, paulus, mingo

Commit-ID:  d114960c488b5a95705a04bba305f931cef0efd6
Gitweb:     http://git.kernel.org/tip/d114960c488b5a95705a04bba305f931cef0efd6
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Tue, 30 Dec 2014 14:38:13 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 8 Jan 2015 11:56:35 -0300

perf callchain: Free callchains when hist entries are deleted

Markus reported that "perf top -g" can leak ~300MB per second on his
machine.  This is partly because callchains are not freed when hist
entries are deleted.  Fix it.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20141230053813.GD6081@sejong
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |  2 ++
 tools/perf/util/hist.c      |  1 +
 3 files changed, 33 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 64b377e..14e7a12 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -841,3 +841,33 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 
 	return bf;
 }
+
+static void free_callchain_node(struct callchain_node *node)
+{
+	struct callchain_list *list, *tmp;
+	struct callchain_node *child;
+	struct rb_node *n;
+
+	list_for_each_entry_safe(list, tmp, &node->val, list) {
+		list_del(&list->list);
+		free(list);
+	}
+
+	n = rb_first(&node->rb_root_in);
+	while (n) {
+		child = container_of(n, struct callchain_node, rb_node_in);
+		n = rb_next(n);
+		rb_erase(&child->rb_node_in, &node->rb_root_in);
+
+		free_callchain_node(child);
+		free(child);
+	}
+}
+
+void free_callchain(struct callchain_root *root)
+{
+	if (!symbol_conf.use_callchain)
+		return;
+
+	free_callchain_node(&root->node);
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index dbc08cf..c0ec1ac 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -198,4 +198,6 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
 char *callchain_list__sym_name(struct callchain_list *cl,
 			       char *bf, size_t bfsize, bool show_dso);
 
+void free_callchain(struct callchain_root *root);
+
 #endif	/* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0ced178..1823955 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -947,6 +947,7 @@ void hist_entry__free(struct hist_entry *he)
 	zfree(&he->mem_info);
 	zfree(&he->stat_acc);
 	free_srcline(he->srcline);
+	free_callchain(he->callchain);
 	free(he);
 }
 
