* "perf top -g" leaking ~300MB per second.
@ 2014-12-13 8:48 Markus Trippelsdorf
2014-12-13 9:03 ` Markus Trippelsdorf
0 siblings, 1 reply; 9+ messages in thread
From: Markus Trippelsdorf @ 2014-12-13 8:48 UTC (permalink / raw)
To: linux-kernel
Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo
Running "perf top -g" built from current Linus tree apparently leaks
~300MB of memory every second an my machine.
--
Markus
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second.
2014-12-13 8:48 "perf top -g" leaking ~300MB per second Markus Trippelsdorf
@ 2014-12-13 9:03 ` Markus Trippelsdorf
2014-12-13 15:26 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 9+ messages in thread
From: Markus Trippelsdorf @ 2014-12-13 9:03 UTC (permalink / raw)
To: linux-kernel
Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo
On 2014.12.13 at 09:48 +0100, Markus Trippelsdorf wrote:
> Running "perf top -g" built from current Linus tree apparently leaks
> ~300MB of memory every second an my machine.
Hmm, this is a much older problem. I just noticed this the first time
today.
To reproduce: Compile some application in the background (make -j4 in my
case) and run "perf top -g". Perf will continue to accumulate memory
until the system starts to swap and the OOM killer eventually kicks in.
--
Markus
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second.
2014-12-13 9:03 ` Markus Trippelsdorf
@ 2014-12-13 15:26 ` Arnaldo Carvalho de Melo
2014-12-13 18:16 ` David Ahern
0 siblings, 1 reply; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-13 15:26 UTC (permalink / raw)
To: Markus Trippelsdorf
Cc: linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar
Em Sat, Dec 13, 2014 at 10:03:31AM +0100, Markus Trippelsdorf escreveu:
> On 2014.12.13 at 09:48 +0100, Markus Trippelsdorf wrote:
> > Running "perf top -g" built from current Linus tree apparently leaks
> > ~300MB of memory every second an my machine.
>
> Hmm, this is a much older problem. I just noticed this the first time
> today.
> To reproduce: Compile some application in the background (make -j4 in my
> case) and run "perf top -g". Perf will continue to accumulate memory
> until the system starts to swap and the OOM killer eventually kicks in.
Yeap, longstanding problem, try minimizing the problem using a lower
frequency.
The callchain code was done initially for 'report' and when I made 'top'
reuse the hist_entry code allowing 'top' to collect callchains was too
easy, but then we need to go thru the callchain/hists/hist_entry code to
make sure that they don't leak, will try to do it...
- Arnaldo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second.
2014-12-13 15:26 ` Arnaldo Carvalho de Melo
@ 2014-12-13 18:16 ` David Ahern
2014-12-30 5:38 ` Namhyung Kim
0 siblings, 1 reply; 9+ messages in thread
From: David Ahern @ 2014-12-13 18:16 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Markus Trippelsdorf
Cc: linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar
On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> The callchain code was done initially for 'report' and when I made 'top'
> reuse the hist_entry code allowing 'top' to collect callchains was too
> easy, but then we need to go thru the callchain/hists/hist_entry code to
> make sure that they don't leak, will try to do it...
>
As I recall it is build up of the dead_threads list.
David
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second.
2014-12-13 18:16 ` David Ahern
@ 2014-12-30 5:38 ` Namhyung Kim
2014-12-30 8:35 ` Markus Trippelsdorf
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Namhyung Kim @ 2014-12-30 5:38 UTC (permalink / raw)
To: David Ahern
Cc: Arnaldo Carvalho de Melo, Markus Trippelsdorf, linux-kernel,
Peter Zijlstra, Paul Mackerras, Ingo Molnar
Hi David and Markus,
On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> >The callchain code was done initially for 'report' and when I made 'top'
> >reuse the hist_entry code allowing 'top' to collect callchains was too
> >easy, but then we need to go thru the callchain/hists/hist_entry code to
> >make sure that they don't leak, will try to do it...
> >
>
> As I recall it is build up of the dead_threads list.
Maybe. But I guess it's because of leak of callchains..
Markus, could you please test below patch how much it affects?
>From b29ccd79727654653986ab1170e0b1f5d6518035 Mon Sep 17 00:00:00 2001
From: Namhyung Kim <namhyung@kernel.org>
Date: Tue, 30 Dec 2014 14:28:45 +0900
Subject: [PATCH] perf callchain: Free callchains when hist entries are deleted
Markus reported that "perf top -g" can leak ~300MB per second on his
machine. This is partly because it missed to free callchains when
hist entries are deleted. Fix it.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++++++
tools/perf/util/callchain.h | 2 ++
tools/perf/util/hist.c | 1 +
3 files changed, 33 insertions(+)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 64b377e591e4..14e7a123d43b 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -841,3 +841,33 @@ char *callchain_list__sym_name(struct callchain_list *cl,
return bf;
}
+
+static void free_callchain_node(struct callchain_node *node)
+{
+ struct callchain_list *list, *tmp;
+ struct callchain_node *child;
+ struct rb_node *n;
+
+ list_for_each_entry_safe(list, tmp, &node->val, list) {
+ list_del(&list->list);
+ free(list);
+ }
+
+ n = rb_first(&node->rb_root_in);
+ while (n) {
+ child = container_of(n, struct callchain_node, rb_node_in);
+ n = rb_next(n);
+ rb_erase(&child->rb_node_in, &node->rb_root_in);
+
+ free_callchain_node(child);
+ free(child);
+ }
+}
+
+void free_callchain(struct callchain_root *root)
+{
+ if (!symbol_conf.use_callchain)
+ return;
+
+ free_callchain_node(&root->node);
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index dbc08cf5f970..c0ec1acc38e4 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -198,4 +198,6 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
char *callchain_list__sym_name(struct callchain_list *cl,
char *bf, size_t bfsize, bool show_dso);
+void free_callchain(struct callchain_root *root);
+
#endif /* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 30ff2cb92884..e17163fcb702 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -945,6 +945,7 @@ void hist_entry__delete(struct hist_entry *he)
zfree(&he->mem_info);
zfree(&he->stat_acc);
free_srcline(he->srcline);
+ free_callchain(he->callchain);
free(he);
}
--
2.1.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second.
2014-12-30 5:38 ` Namhyung Kim
@ 2014-12-30 8:35 ` Markus Trippelsdorf
2014-12-30 23:04 ` Arnaldo Carvalho de Melo
2015-01-08 15:09 ` Arnaldo Carvalho de Melo
2015-01-09 12:31 ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim
2 siblings, 1 reply; 9+ messages in thread
From: Markus Trippelsdorf @ 2014-12-30 8:35 UTC (permalink / raw)
To: Namhyung Kim
Cc: David Ahern, Arnaldo Carvalho de Melo, linux-kernel,
Peter Zijlstra, Paul Mackerras, Ingo Molnar
On 2014.12.30 at 14:38 +0900, Namhyung Kim wrote:
> Hi David and Markus,
>
> On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> > >The callchain code was done initially for 'report' and when I made 'top'
> > >reuse the hist_entry code allowing 'top' to collect callchains was too
> > >easy, but then we need to go thru the callchain/hists/hist_entry code to
> > >make sure that they don't leak, will try to do it...
> > >
> >
> > As I recall it is build up of the dead_threads list.
>
> Maybe. But I guess it's because of leak of callchains..
>
> Markus, could you please test below patch how much it affects?
Thanks Namhyung. It leaks an order of magnitude less memory now:
~30MB/sec on my machine.
Valgrind shows (last entries of the list):
...
==20512== 7,225,920 bytes in 17,370 blocks are possibly lost in loss record 295 of 301
==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512== by 0x4A6996: zalloc (util.h:189)
==20512== by 0x4A6996: hist_entry__new (hist.c:309)
==20512== by 0x4A8249: add_hist_entry (hist.c:431)
==20512== by 0x4A8249: __hists__add_entry (hist.c:477)
==20512== by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512== by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
==20512== by 0x4395EE: __cmd_top (builtin-top.c:974)
==20512== by 0x4395EE: cmd_top (builtin-top.c:1266)
==20512== by 0x41B702: run_builtin (perf.c:341)
==20512== by 0x41AE51: handle_internal_command (perf.c:400)
==20512== by 0x41AE51: run_argv (perf.c:444)
==20512== by 0x41AE51: main (perf.c:559)
==20512==
==20512== 8,922,480 bytes in 159,330 blocks are possibly lost in loss record 296 of 301
==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512== by 0x4821FE: zalloc (util.h:189)
==20512== by 0x4821FE: fill_node (callchain.c:450)
==20512== by 0x4821FE: add_child (callchain.c:473)
==20512== by 0x4821FE: append_chain_children (callchain.c:596)
==20512== by 0x48514E: callchain_append (callchain.c:672)
==20512== by 0x4A8947: iter_add_single_cumulative_entry (hist.c:739)
==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
==20512== by 0x41B702: run_builtin (perf.c:341)
==20512== by 0x41AE51: handle_internal_command (perf.c:400)
==20512== by 0x41AE51: run_argv (perf.c:444)
==20512== by 0x41AE51: main (perf.c:559)
==20512==
==20512== 11,050,136 bytes in 1,663 blocks are definitely lost in loss record 297 of 301
==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512== by 0x44F7BF: zalloc (util.h:189)
==20512== by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
==20512== by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
==20512== by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
==20512== by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
==20512== by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
==20512== by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
==20512== by 0x41B702: run_builtin (perf.c:341)
==20512== by 0x41AE51: handle_internal_command (perf.c:400)
==20512== by 0x41AE51: run_argv (perf.c:444)
==20512== by 0x41AE51: main (perf.c:559)
==20512==
==20512== 24,920,064 bytes in 59,904 blocks are possibly lost in loss record 298 of 301
==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512== by 0x4A6996: zalloc (util.h:189)
==20512== by 0x4A6996: hist_entry__new (hist.c:309)
==20512== by 0x4A8249: add_hist_entry (hist.c:431)
==20512== by 0x4A8249: __hists__add_entry (hist.c:477)
==20512== by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876)
==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
==20512== by 0x41B702: run_builtin (perf.c:341)
==20512== by 0x41AE51: handle_internal_command (perf.c:400)
==20512== by 0x41AE51: run_argv (perf.c:444)
==20512== by 0x41AE51: main (perf.c:559)
==20512==
==20512== 26,147,680 bytes in 62,855 blocks are possibly lost in loss record 299 of 301
==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512== by 0x4A6996: zalloc (util.h:189)
==20512== by 0x4A6996: hist_entry__new (hist.c:309)
==20512== by 0x4A8249: add_hist_entry (hist.c:431)
==20512== by 0x4A8249: __hists__add_entry (hist.c:477)
==20512== by 0x4A8728: iter_add_next_cumulative_entry (hist.c:803)
==20512== by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
==20512== by 0x41B702: run_builtin (perf.c:341)
==20512== by 0x41AE51: handle_internal_command (perf.c:400)
==20512== by 0x41AE51: run_argv (perf.c:444)
==20512== by 0x41AE51: main (perf.c:559)
==20512==
==20512== 44,939,792 bytes in 3,788 blocks are definitely lost in loss record 300 of 301
==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512== by 0x44F7BF: zalloc (util.h:189)
==20512== by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
==20512== by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
==20512== by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
==20512== by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
==20512== by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
==20512== by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512== by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
==20512== by 0x4395EE: __cmd_top (builtin-top.c:974)
==20512== by 0x4395EE: cmd_top (builtin-top.c:1266)
==20512== by 0x41B702: run_builtin (perf.c:341)
==20512== by 0x41AE51: handle_internal_command (perf.c:400)
==20512== by 0x41AE51: run_argv (perf.c:444)
==20512== by 0x41AE51: main (perf.c:559)
==20512==
==20512== 81,248,440 bytes in 1,450,865 blocks are possibly lost in loss record 301 of 301
==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
==20512== by 0x4821FE: zalloc (util.h:189)
==20512== by 0x4821FE: fill_node (callchain.c:450)
==20512== by 0x4821FE: add_child (callchain.c:473)
==20512== by 0x4821FE: append_chain_children (callchain.c:596)
==20512== by 0x48514E: callchain_append (callchain.c:672)
==20512== by 0x4A8768: iter_add_next_cumulative_entry (hist.c:812)
==20512== by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
==20512== by 0x41B702: run_builtin (perf.c:341)
==20512== by 0x41AE51: handle_internal_command (perf.c:400)
==20512== by 0x41AE51: run_argv (perf.c:444)
==20512== by 0x41AE51: main (perf.c:559)
==20512==
==20512== LEAK SUMMARY:
==20512== definitely lost: 56,810,427 bytes in 5,479 blocks
==20512== indirectly lost: 4,000 bytes in 125 blocks
==20512== possibly lost: 210,493,048 bytes in 2,552,578 blocks
==20512== still reachable: 18,258 bytes in 242 blocks
==20512== suppressed: 0 bytes in 0 blocks
--
Markus
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second.
2014-12-30 8:35 ` Markus Trippelsdorf
@ 2014-12-30 23:04 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-30 23:04 UTC (permalink / raw)
To: Markus Trippelsdorf
Cc: Namhyung Kim, David Ahern, linux-kernel, Peter Zijlstra,
Paul Mackerras, Ingo Molnar
Em Tue, Dec 30, 2014 at 09:35:24AM +0100, Markus Trippelsdorf escreveu:
> On 2014.12.30 at 14:38 +0900, Namhyung Kim wrote:
> > Hi David and Markus,
> >
> > On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote:
> > > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote:
> > > >The callchain code was done initially for 'report' and when I made 'top'
> > > >reuse the hist_entry code allowing 'top' to collect callchains was too
> > > >easy, but then we need to go thru the callchain/hists/hist_entry code to
> > > >make sure that they don't leak, will try to do it...
> > > >
> > >
> > > As I recall it is build up of the dead_threads list.
> >
> > Maybe. But I guess it's because of leak of callchains..
> >
> > Markus, could you please test below patch how much it affects?
>
> Thanks Namhyung. It leaks an order of magnitude less memory now:
> ~30MB/sec on my machine.
>
> Valgrind shows (last entries of the list):
> ...
> ==20512== 7,225,920 bytes in 17,370 blocks are possibly lost in loss record 295 of 301
> ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512== by 0x4A6996: zalloc (util.h:189)
> ==20512== by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512== by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512== by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512== by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
> ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512== by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
> ==20512== by 0x4395EE: __cmd_top (builtin-top.c:974)
> ==20512== by 0x4395EE: cmd_top (builtin-top.c:1266)
> ==20512== by 0x41B702: run_builtin (perf.c:341)
> ==20512== by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512== by 0x41AE51: run_argv (perf.c:444)
> ==20512== by 0x41AE51: main (perf.c:559)
#1) Ok, those are the hist_entries that were not decayed, if top continued
they would eventually be decayed, freed, etc, i.e. the exit of top is
equivalent to the last decay.
> ==20512==
> ==20512== 8,922,480 bytes in 159,330 blocks are possibly lost in loss record 296 of 301
> ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512== by 0x4821FE: zalloc (util.h:189)
> ==20512== by 0x4821FE: fill_node (callchain.c:450)
> ==20512== by 0x4821FE: add_child (callchain.c:473)
> ==20512== by 0x4821FE: append_chain_children (callchain.c:596)
> ==20512== by 0x48514E: callchain_append (callchain.c:672)
> ==20512== by 0x4A8947: iter_add_single_cumulative_entry (hist.c:739)
> ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512== by 0x41B702: run_builtin (perf.c:341)
> ==20512== by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512== by 0x41AE51: run_argv (perf.c:444)
> ==20512== by 0x41AE51: main (perf.c:559)
> ==20512==
> ==20512== 11,050,136 bytes in 1,663 blocks are definitely lost in loss record 297 of 301
> ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512== by 0x44F7BF: zalloc (util.h:189)
> ==20512== by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
> ==20512== by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
> ==20512== by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
> ==20512== by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
> ==20512== by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
> ==20512== by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
> ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512== by 0x41B702: run_builtin (perf.c:341)
> ==20512== by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512== by 0x41AE51: run_argv (perf.c:444)
> ==20512== by 0x41AE51: main (perf.c:559)
These (symbol__alloc_hist) we need to free up the decaying of per RIP
annotation gets to zero... Will check where to do this...
> ==20512==
> ==20512== 24,920,064 bytes in 59,904 blocks are possibly lost in loss record 298 of 301
> ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512== by 0x4A6996: zalloc (util.h:189)
> ==20512== by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512== by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512== by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512== by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730)
> ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876)
> ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512== by 0x41B702: run_builtin (perf.c:341)
> ==20512== by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512== by 0x41AE51: run_argv (perf.c:444)
> ==20512== by 0x41AE51: main (perf.c:559)
Same thing as #1.
> ==20512==
> ==20512== 26,147,680 bytes in 62,855 blocks are possibly lost in loss record 299 of 301
> ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512== by 0x4A6996: zalloc (util.h:189)
> ==20512== by 0x4A6996: hist_entry__new (hist.c:309)
> ==20512== by 0x4A8249: add_hist_entry (hist.c:431)
> ==20512== by 0x4A8249: __hists__add_entry (hist.c:477)
> ==20512== by 0x4A8728: iter_add_next_cumulative_entry (hist.c:803)
> ==20512== by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
> ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512== by 0x41B702: run_builtin (perf.c:341)
> ==20512== by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512== by 0x41AE51: run_argv (perf.c:444)
> ==20512== by 0x41AE51: main (perf.c:559)
> ==20512==
> ==20512== 44,939,792 bytes in 3,788 blocks are definitely lost in loss record 300 of 301
> ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512== by 0x44F7BF: zalloc (util.h:189)
> ==20512== by 0x44F7BF: symbol__alloc_hist (annotate.c:455)
> ==20512== by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507)
> ==20512== by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521)
> ==20512== by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195)
> ==20512== by 0x437A65: hist_iter__top_callback (builtin-top.c:688)
> ==20512== by 0x4A8AC8: hist_entry_iter__add (hist.c:892)
> ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512== by 0x4395EE: perf_top__mmap_read (builtin-top.c:871)
> ==20512== by 0x4395EE: __cmd_top (builtin-top.c:974)
> ==20512== by 0x4395EE: cmd_top (builtin-top.c:1266)
> ==20512== by 0x41B702: run_builtin (perf.c:341)
> ==20512== by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512== by 0x41AE51: run_argv (perf.c:444)
> ==20512== by 0x41AE51: main (perf.c:559)
> ==20512==
> ==20512== 81,248,440 bytes in 1,450,865 blocks are possibly lost in loss record 301 of 301
> ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623)
> ==20512== by 0x4821FE: zalloc (util.h:189)
> ==20512== by 0x4821FE: fill_node (callchain.c:450)
> ==20512== by 0x4821FE: add_child (callchain.c:473)
> ==20512== by 0x4821FE: append_chain_children (callchain.c:596)
> ==20512== by 0x48514E: callchain_append (callchain.c:672)
> ==20512== by 0x4A8768: iter_add_next_cumulative_entry (hist.c:812)
> ==20512== by 0x4A8AA5: hist_entry_iter__add (hist.c:887)
> ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787)
> ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854)
> ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871)
> ==20512== by 0x43967E: __cmd_top (builtin-top.c:996)
> ==20512== by 0x43967E: cmd_top (builtin-top.c:1266)
> ==20512== by 0x41B702: run_builtin (perf.c:341)
> ==20512== by 0x41AE51: handle_internal_command (perf.c:400)
> ==20512== by 0x41AE51: run_argv (perf.c:444)
> ==20512== by 0x41AE51: main (perf.c:559)
> ==20512==
> ==20512== LEAK SUMMARY:
> ==20512== definitely lost: 56,810,427 bytes in 5,479 blocks
> ==20512== indirectly lost: 4,000 bytes in 125 blocks
> ==20512== possibly lost: 210,493,048 bytes in 2,552,578 blocks
> ==20512== still reachable: 18,258 bytes in 242 blocks
> ==20512== suppressed: 0 bytes in 0 blocks
>
> --
> Markus
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second.
2014-12-30 5:38 ` Namhyung Kim
2014-12-30 8:35 ` Markus Trippelsdorf
@ 2015-01-08 15:09 ` Arnaldo Carvalho de Melo
2015-01-09 12:31 ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim
2 siblings, 0 replies; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-08 15:09 UTC (permalink / raw)
To: Namhyung Kim
Cc: David Ahern, Markus Trippelsdorf, linux-kernel, Peter Zijlstra,
Paul Mackerras, Ingo Molnar
Em Tue, Dec 30, 2014 at 02:38:13PM +0900, Namhyung Kim escreveu:
> Subject: [PATCH] perf callchain: Free callchains when hist entries are deleted
> Markus reported that "perf top -g" can leak ~300MB per second on his
> machine. This is partly because it missed to free callchains when
> hist entries are deleted. Fix it.
Thanks, applied.
- Arnaldo
^ permalink raw reply [flat|nested] 9+ messages in thread
* [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted
2014-12-30 5:38 ` Namhyung Kim
2014-12-30 8:35 ` Markus Trippelsdorf
2015-01-08 15:09 ` Arnaldo Carvalho de Melo
@ 2015-01-09 12:31 ` tip-bot for Namhyung Kim
2 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-09 12:31 UTC (permalink / raw)
To: linux-tip-commits
Cc: hpa, mingo, linux-kernel, namhyung, acme, fweisbec, dsahern,
a.p.zijlstra, markus, tglx, paulus, mingo
Commit-ID: d114960c488b5a95705a04bba305f931cef0efd6
Gitweb: http://git.kernel.org/tip/d114960c488b5a95705a04bba305f931cef0efd6
Author: Namhyung Kim <namhyung@kernel.org>
AuthorDate: Tue, 30 Dec 2014 14:38:13 +0900
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 8 Jan 2015 11:56:35 -0300
perf callchain: Free callchains when hist entries are deleted
Markus reported that "perf top -g" can leak ~300MB per second on his
machine. This is partly because it missed to free callchains when hist
entries are deleted. Fix it.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20141230053813.GD6081@sejong
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++++++
tools/perf/util/callchain.h | 2 ++
tools/perf/util/hist.c | 1 +
3 files changed, 33 insertions(+)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 64b377e..14e7a12 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -841,3 +841,33 @@ char *callchain_list__sym_name(struct callchain_list *cl,
return bf;
}
+
+static void free_callchain_node(struct callchain_node *node)
+{
+ struct callchain_list *list, *tmp;
+ struct callchain_node *child;
+ struct rb_node *n;
+
+ list_for_each_entry_safe(list, tmp, &node->val, list) {
+ list_del(&list->list);
+ free(list);
+ }
+
+ n = rb_first(&node->rb_root_in);
+ while (n) {
+ child = container_of(n, struct callchain_node, rb_node_in);
+ n = rb_next(n);
+ rb_erase(&child->rb_node_in, &node->rb_root_in);
+
+ free_callchain_node(child);
+ free(child);
+ }
+}
+
+void free_callchain(struct callchain_root *root)
+{
+ if (!symbol_conf.use_callchain)
+ return;
+
+ free_callchain_node(&root->node);
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index dbc08cf..c0ec1ac 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -198,4 +198,6 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
char *callchain_list__sym_name(struct callchain_list *cl,
char *bf, size_t bfsize, bool show_dso);
+void free_callchain(struct callchain_root *root);
+
#endif /* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0ced178..1823955 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -947,6 +947,7 @@ void hist_entry__free(struct hist_entry *he)
zfree(&he->mem_info);
zfree(&he->stat_acc);
free_srcline(he->srcline);
+ free_callchain(he->callchain);
free(he);
}
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2015-01-09 12:31 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-13 8:48 "perf top -g" leaking ~300MB per second Markus Trippelsdorf
2014-12-13 9:03 ` Markus Trippelsdorf
2014-12-13 15:26 ` Arnaldo Carvalho de Melo
2014-12-13 18:16 ` David Ahern
2014-12-30 5:38 ` Namhyung Kim
2014-12-30 8:35 ` Markus Trippelsdorf
2014-12-30 23:04 ` Arnaldo Carvalho de Melo
2015-01-08 15:09 ` Arnaldo Carvalho de Melo
2015-01-09 12:31 ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).