linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* "perf top -g" leaking ~300MB per second.
@ 2014-12-13  8:48 Markus Trippelsdorf
  2014-12-13  9:03 ` Markus Trippelsdorf
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Trippelsdorf @ 2014-12-13  8:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

Running "perf top -g" built from current Linus tree apparently leaks
~300MB of memory every second an my machine.

-- 
Markus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: "perf top -g" leaking ~300MB per second.
  2014-12-13  8:48 "perf top -g" leaking ~300MB per second Markus Trippelsdorf
@ 2014-12-13  9:03 ` Markus Trippelsdorf
  2014-12-13 15:26   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Trippelsdorf @ 2014-12-13  9:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

On 2014.12.13 at 09:48 +0100, Markus Trippelsdorf wrote:
> Running "perf top -g" built from current Linus tree apparently leaks
> ~300MB of memory every second an my machine.

Hmm, this is a much older problem. I just noticed this the first time
today. 
To reproduce: Compile some application in the background (make -j4 in my
case) and run "perf top -g". Perf will continue to accumulate memory
until the system starts to swap and the OOM killer eventually kicks in.

-- 
Markus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: "perf top -g" leaking ~300MB per second.
  2014-12-13  9:03 ` Markus Trippelsdorf
@ 2014-12-13 15:26   ` Arnaldo Carvalho de Melo
  2014-12-13 18:16     ` David Ahern
  0 siblings, 1 reply; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-13 15:26 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar

Em Sat, Dec 13, 2014 at 10:03:31AM +0100, Markus Trippelsdorf escreveu:
> On 2014.12.13 at 09:48 +0100, Markus Trippelsdorf wrote:
> > Running "perf top -g" built from current Linus tree apparently leaks
> > ~300MB of memory every second an my machine.
> 
> Hmm, this is a much older problem. I just noticed this the first time
> today. 
> To reproduce: Compile some application in the background (make -j4 in my
> case) and run "perf top -g". Perf will continue to accumulate memory
> until the system starts to swap and the OOM killer eventually kicks in.

Yeap, longstanding problem, try minimizing the problem using a lower
frequency.

The callchain code was done initially for 'report' and when I made 'top'
reuse the hist_entry code allowing 'top' to collect callchains was too
easy, but then we need to go thru the callchain/hists/hist_entry code to
make sure that they don't leak, will try to do it...

- Arnaldo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: "perf top -g" leaking ~300MB per second.
  2014-12-13 15:26   ` Arnaldo Carvalho de Melo
@ 2014-12-13 18:16     ` David Ahern
  2014-12-30  5:38       ` Namhyung Kim
  0 siblings, 1 reply; 9+ messages in thread
From: David Ahern @ 2014-12-13 18:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Markus Trippelsdorf
  Cc: linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar

On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> The callchain code was done initially for 'report' and when I made 'top'
> reuse the hist_entry code allowing 'top' to collect callchains was too
> easy, but then we need to go thru the callchain/hists/hist_entry code to
> make sure that they don't leak, will try to do it...
>

As I recall it is build up of the dead_threads list.

David


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: "perf top -g" leaking ~300MB per second.
  2014-12-13 18:16     ` David Ahern
@ 2014-12-30  5:38       ` Namhyung Kim
  2014-12-30  8:35         ` Markus Trippelsdorf
                           ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Namhyung Kim @ 2014-12-30  5:38 UTC (permalink / raw)
  To: David Ahern
  Cc: Arnaldo Carvalho de Melo, Markus Trippelsdorf, linux-kernel,
	Peter Zijlstra, Paul Mackerras, Ingo Molnar

Hi David and Markus,

On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> >The callchain code was done initially for 'report' and when I made 'top'
> >reuse the hist_entry code allowing 'top' to collect callchains was too
> >easy, but then we need to go thru the callchain/hists/hist_entry code to
> >make sure that they don't leak, will try to do it...
> >
> 
> As I recall it is build up of the dead_threads list.

Maybe.  But I guess it's because of leak of callchains..

Markus, could you please test below patch how much it affects?


>From b29ccd79727654653986ab1170e0b1f5d6518035 Mon Sep 17 00:00:00 2001
From: Namhyung Kim <namhyung@kernel.org>
Date: Tue, 30 Dec 2014 14:28:45 +0900
Subject: [PATCH] perf callchain: Free callchains when hist entries are deleted

Markus reported that "perf top -g" can leak ~300MB per second on his
machine.  This is partly because it missed to free callchains when
hist entries are deleted.  Fix it.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |  2 ++
 tools/perf/util/hist.c      |  1 +
 3 files changed, 33 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 64b377e591e4..14e7a123d43b 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -841,3 +841,33 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 
 	return bf;
 }
+
+static void free_callchain_node(struct callchain_node *node)
+{
+	struct callchain_list *list, *tmp;
+	struct callchain_node *child;
+	struct rb_node *n;
+
+	list_for_each_entry_safe(list, tmp, &node->val, list) {
+		list_del(&list->list);
+		free(list);
+	}
+
+	n = rb_first(&node->rb_root_in);
+	while (n) {
+		child = container_of(n, struct callchain_node, rb_node_in);
+		n = rb_next(n);
+		rb_erase(&child->rb_node_in, &node->rb_root_in);
+
+		free_callchain_node(child);
+		free(child);
+	}
+}
+
+void free_callchain(struct callchain_root *root)
+{
+	if (!symbol_conf.use_callchain)
+		return;
+
+	free_callchain_node(&root->node);
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index dbc08cf5f970..c0ec1acc38e4 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -198,4 +198,6 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
 char *callchain_list__sym_name(struct callchain_list *cl,
 			       char *bf, size_t bfsize, bool show_dso);
 
+void free_callchain(struct callchain_root *root);
+
 #endif	/* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 30ff2cb92884..e17163fcb702 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -945,6 +945,7 @@ void hist_entry__delete(struct hist_entry *he)
 	zfree(&he->mem_info);
 	zfree(&he->stat_acc);
 	free_srcline(he->srcline);
+	free_callchain(he->callchain);
 	free(he);
 }
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: "perf top -g" leaking ~300MB per second.
  2014-12-30  5:38       ` Namhyung Kim
@ 2014-12-30  8:35         ` Markus Trippelsdorf
  2014-12-30 23:04           ` Arnaldo Carvalho de Melo
  2015-01-08 15:09         ` Arnaldo Carvalho de Melo
  2015-01-09 12:31         ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim
  2 siblings, 1 reply; 9+ messages in thread
From: Markus Trippelsdorf @ 2014-12-30  8:35 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: David Ahern, Arnaldo Carvalho de Melo, linux-kernel,
	Peter Zijlstra, Paul Mackerras, Ingo Molnar

On 2014.12.30 at 14:38 +0900, Namhyung Kim wrote:
> Hi David and Markus,
> 
> On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> > >The callchain code was done initially for 'report' and when I made 'top'
> > >reuse the hist_entry code allowing 'top' to collect callchains was too
> > >easy, but then we need to go thru the callchain/hists/hist_entry code to
> > >make sure that they don't leak, will try to do it...
> > >
> > 
> > As I recall it is build up of the dead_threads list.
> 
> Maybe.  But I guess it's because of leak of callchains..
> 
> Markus, could you please test below patch how much it affects?

Thanks Namhyung. It leaks an order of magnitude less memory now:
~30MB/sec on my machine.

Valgrind shows (last entries of the list):
...
==20512== 7,225,920 bytes in 17,370 blocks are possibly lost in loss record 295 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4A6996: zalloc (util.h:189)
==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
==20512==    by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x4395EE: __cmd_top (builtin-top.c:974)
==20512==    by 0x4395EE: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 8,922,480 bytes in 159,330 blocks are possibly lost in loss record 296 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4821FE: zalloc (util.h:189)
==20512==    by 0x4821FE: fill_node (callchain.c:450)
==20512==    by 0x4821FE: add_child (callchain.c:473)
==20512==    by 0x4821FE: append_chain_children (callchain.c:596)
==20512==    by 0x48514E: callchain_append (callchain.c:672)
==20512==    by 0x4A8947: iter_add_single_cumulative_entry (hist.c:739)
==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 11,050,136 bytes in 1,663 blocks are definitely lost in loss record 297 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x44F7BF: zalloc (util.h:189)
==20512==    by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
==20512==    by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
==20512==    by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
==20512==    by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
==20512==    by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
==20512==    by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 24,920,064 bytes in 59,904 blocks are possibly lost in loss record 298 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4A6996: zalloc (util.h:189)
==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
==20512==    by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 26,147,680 bytes in 62,855 blocks are possibly lost in loss record 299 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4A6996: zalloc (util.h:189)
==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
==20512==    by 0x4A8728: iter_add_next_cumulative_entry (hist.c:803)
==20512==    by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 44,939,792 bytes in 3,788 blocks are definitely lost in loss record 300 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x44F7BF: zalloc (util.h:189)
==20512==    by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
==20512==    by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
==20512==    by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
==20512==    by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
==20512==    by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
==20512==    by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x4395EE: __cmd_top (builtin-top.c:974)
==20512==    by 0x4395EE: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== 81,248,440 bytes in 1,450,865 blocks are possibly lost in loss record 301 of 301
==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512==    by 0x4821FE: zalloc (util.h:189)
==20512==    by 0x4821FE: fill_node (callchain.c:450)
==20512==    by 0x4821FE: add_child (callchain.c:473)
==20512==    by 0x4821FE: append_chain_children (callchain.c:596)
==20512==    by 0x48514E: callchain_append (callchain.c:672)
==20512==    by 0x4A8768: iter_add_next_cumulative_entry (hist.c:812)
==20512==    by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
==20512==    by 0x41B702: run_builtin (perf.c:341)
==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
==20512==    by 0x41AE51: run_argv (perf.c:444)
==20512==    by 0x41AE51: main (perf.c:559)
==20512== 
==20512== LEAK SUMMARY:
==20512==    definitely lost: 56,810,427 bytes in 5,479 blocks
==20512==    indirectly lost: 4,000 bytes in 125 blocks
==20512==      possibly lost: 210,493,048 bytes in 2,552,578 blocks
==20512==    still reachable: 18,258 bytes in 242 blocks
==20512==         suppressed: 0 bytes in 0 blocks

-- 
Markus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: "perf top -g" leaking ~300MB per second.
  2014-12-30  8:35         ` Markus Trippelsdorf
@ 2014-12-30 23:04           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-30 23:04 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Namhyung Kim, David Ahern, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar

Em Tue, Dec 30, 2014 at 09:35:24AM +0100, Markus Trippelsdorf escreveu:
> On 2014.12.30 at 14:38 +0900, Namhyung Kim wrote:
> > Hi David and Markus,
> > 
> > On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> > > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> > > >The callchain code was done initially for 'report' and when I made 'top'
> > > >reuse the hist_entry code allowing 'top' to collect callchains was too
> > > >easy, but then we need to go thru the callchain/hists/hist_entry code to
> > > >make sure that they don't leak, will try to do it...
> > > >
> > > 
> > > As I recall it is build up of the dead_threads list.
> > 
> > Maybe.  But I guess it's because of leak of callchains..
> > 
> > Markus, could you please test below patch how much it affects?
> 
> Thanks Namhyung. It leaks an order of magnitude less memory now:
> ~30MB/sec on my machine.
> 
> Valgrind shows (last entries of the list):
> ...
> ==20512== 7,225,920 bytes in 17,370 blocks are possibly lost in loss record 295 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4A6996: zalloc (util.h:189)
> ==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512==    by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
> ==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x4395EE: __cmd_top (builtin-top.c:974)
> ==20512==    by 0x4395EE: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)

#1) Ok, those are the hist_entries that were not decayed, if top continued
they would eventually be decayed, freed, etc, i.e. the exit of top is
equivalent to the last decay.

> ==20512== 
> ==20512== 8,922,480 bytes in 159,330 blocks are possibly lost in loss record 296 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4821FE: zalloc (util.h:189)
> ==20512==    by 0x4821FE: fill_node (callchain.c:450)
> ==20512==    by 0x4821FE: add_child (callchain.c:473)
> ==20512==    by 0x4821FE: append_chain_children (callchain.c:596)
> ==20512==    by 0x48514E: callchain_append (callchain.c:672)
> ==20512==    by 0x4A8947: iter_add_single_cumulative_entry (hist.c:739)
> ==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)
> ==20512== 

> ==20512== 11,050,136 bytes in 1,663 blocks are definitely lost in loss record 297 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x44F7BF: zalloc (util.h:189)
> ==20512==    by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
> ==20512==    by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
> ==20512==    by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
> ==20512==    by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
> ==20512==    by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
> ==20512==    by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)

These (symbol__alloc_hist) we need to free up the decaying of per RIP
annotation gets to zero... Will check where to do this...

> ==20512== 
> ==20512== 24,920,064 bytes in 59,904 blocks are possibly lost in loss record 298 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4A6996: zalloc (util.h:189)
> ==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512==    by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
> ==20512==    by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)

Same thing as #1.

> ==20512== 
> ==20512== 26,147,680 bytes in 62,855 blocks are possibly lost in loss record 299 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4A6996: zalloc (util.h:189)
> ==20512==    by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512==    by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512==    by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512==    by 0x4A8728: iter_add_next_cumulative_entry (hist.c:803)
> ==20512==    by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)
> ==20512== 
> ==20512== 44,939,792 bytes in 3,788 blocks are definitely lost in loss record 300 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x44F7BF: zalloc (util.h:189)
> ==20512==    by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
> ==20512==    by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
> ==20512==    by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
> ==20512==    by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
> ==20512==    by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
> ==20512==    by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x4395EE: __cmd_top (builtin-top.c:974)
> ==20512==    by 0x4395EE: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)
> ==20512== 
> ==20512== 81,248,440 bytes in 1,450,865 blocks are possibly lost in loss record 301 of 301
> ==20512==    at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512==    by 0x4821FE: zalloc (util.h:189)
> ==20512==    by 0x4821FE: fill_node (callchain.c:450)
> ==20512==    by 0x4821FE: add_child (callchain.c:473)
> ==20512==    by 0x4821FE: append_chain_children (callchain.c:596)
> ==20512==    by 0x48514E: callchain_append (callchain.c:672)
> ==20512==    by 0x4A8768: iter_add_next_cumulative_entry (hist.c:812)
> ==20512==    by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
> ==20512==    by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512==    by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512==    by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512==    by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512==    by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512==    by 0x41B702: run_builtin (perf.c:341)
> ==20512==    by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512==    by 0x41AE51: run_argv (perf.c:444)
> ==20512==    by 0x41AE51: main (perf.c:559)
> ==20512== 
> ==20512== LEAK SUMMARY:
> ==20512==    definitely lost: 56,810,427 bytes in 5,479 blocks
> ==20512==    indirectly lost: 4,000 bytes in 125 blocks
> ==20512==      possibly lost: 210,493,048 bytes in 2,552,578 blocks
> ==20512==    still reachable: 18,258 bytes in 242 blocks
> ==20512==         suppressed: 0 bytes in 0 blocks
> 
> -- 
> Markus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: "perf top -g" leaking ~300MB per second.
  2014-12-30  5:38       ` Namhyung Kim
  2014-12-30  8:35         ` Markus Trippelsdorf
@ 2015-01-08 15:09         ` Arnaldo Carvalho de Melo
  2015-01-09 12:31         ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim
  2 siblings, 0 replies; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-08 15:09 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: David Ahern, Markus Trippelsdorf, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar

Em Tue, Dec 30, 2014 at 02:38:13PM +0900, Namhyung Kim escreveu:
> Subject: [PATCH] perf callchain: Free callchains when hist entries are deleted
 
> Markus reported that "perf top -g" can leak ~300MB per second on his
> machine.  This is partly because it missed to free callchains when
> hist entries are deleted.  Fix it.

Thanks, applied.

- Arnaldo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted
  2014-12-30  5:38       ` Namhyung Kim
  2014-12-30  8:35         ` Markus Trippelsdorf
  2015-01-08 15:09         ` Arnaldo Carvalho de Melo
@ 2015-01-09 12:31         ` tip-bot for Namhyung Kim
  2 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-09 12:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, mingo, linux-kernel, namhyung, acme, fweisbec, dsahern,
	a.p.zijlstra, markus, tglx, paulus, mingo

Commit-ID:  d114960c488b5a95705a04bba305f931cef0efd6
Gitweb:     http://git.kernel.org/tip/d114960c488b5a95705a04bba305f931cef0efd6
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Tue, 30 Dec 2014 14:38:13 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 8 Jan 2015 11:56:35 -0300

perf callchain: Free callchains when hist entries are deleted

Markus reported that "perf top -g" can leak ~300MB per second on his
machine.  This is partly because it missed to free callchains when hist
entries are deleted.  Fix it.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20141230053813.GD6081@sejong
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |  2 ++
 tools/perf/util/hist.c      |  1 +
 3 files changed, 33 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 64b377e..14e7a12 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -841,3 +841,33 @@ char *callchain_list__sym_name(struct callchain_list *cl,
 
 	return bf;
 }
+
+static void free_callchain_node(struct callchain_node *node)
+{
+	struct callchain_list *list, *tmp;
+	struct callchain_node *child;
+	struct rb_node *n;
+
+	list_for_each_entry_safe(list, tmp, &node->val, list) {
+		list_del(&list->list);
+		free(list);
+	}
+
+	n = rb_first(&node->rb_root_in);
+	while (n) {
+		child = container_of(n, struct callchain_node, rb_node_in);
+		n = rb_next(n);
+		rb_erase(&child->rb_node_in, &node->rb_root_in);
+
+		free_callchain_node(child);
+		free(child);
+	}
+}
+
+void free_callchain(struct callchain_root *root)
+{
+	if (!symbol_conf.use_callchain)
+		return;
+
+	free_callchain_node(&root->node);
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index dbc08cf..c0ec1ac 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -198,4 +198,6 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
 char *callchain_list__sym_name(struct callchain_list *cl,
 			       char *bf, size_t bfsize, bool show_dso);
 
+void free_callchain(struct callchain_root *root);
+
 #endif	/* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0ced178..1823955 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -947,6 +947,7 @@ void hist_entry__free(struct hist_entry *he)
 	zfree(&he->mem_info);
 	zfree(&he->stat_acc);
 	free_srcline(he->srcline);
+	free_callchain(he->callchain);
 	free(he);
 }
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-01-09 12:31 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-13  8:48 "perf top -g" leaking ~300MB per second Markus Trippelsdorf
2014-12-13  9:03 ` Markus Trippelsdorf
2014-12-13 15:26   ` Arnaldo Carvalho de Melo
2014-12-13 18:16     ` David Ahern
2014-12-30  5:38       ` Namhyung Kim
2014-12-30  8:35         ` Markus Trippelsdorf
2014-12-30 23:04           ` Arnaldo Carvalho de Melo
2015-01-08 15:09         ` Arnaldo Carvalho de Melo
2015-01-09 12:31         ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).