* [BUG] segfault in perf-top -- thread refcnt
@ 2015-03-27 17:31 David Ahern
  2015-03-27 19:51 ` Arnaldo Carvalho de Melo
  2015-03-27 20:11 ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 24+ messages in thread

From: David Ahern @ 2015-03-27 17:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Stephane Eranian; +Cc: LKML

Hi Arnaldo:

If I leave 'perf top -z' running it eventually crashes with the backtrace
in the new thread refcnt code:

(gdb) bt
#0  __list_del_entry (thread=0x670bd30)
    at /home/dahern/kernels/linux.git/tools/perf/util/include/../../../../include/linux/list.h:102
#1  list_del_init (thread=0x670bd30)
    at /home/dahern/kernels/linux.git/tools/perf/util/include/../../../../include/linux/list.h:145
#2  thread__put (thread=0x670bd30) at util/thread.c:94
#3  0x00000000001bfa3c in __thread__zput (he=0x820adc0) at util/thread.h:48
#4  hist_entry__delete (he=0x820adc0) at util/hist.c:939
#5  0x00000000001c18a4 in hists__delete_entry (hists=0x4fe0f8) at util/hist.c:255
#6  hists__delete_entries (hists=0x4fe0f8) at util/hist.c:283
#7  0x00000000001250e8 in perf_top__sort_new_samples (arg=0x7feffffbca8) at builtin-top.c:549
#8  0x00000000001e9108 in hist_browser__run (evsel=0x4fdfc0, nr_events=<value optimized out>,
    helpline=<value optimized out>, left_exits=<value optimized out>,
    hbt=0xfff800012cd7cce0, min_pcnt=0, env=0x4fe4c0) at ui/browsers/hists.c:431
#9  perf_evsel__hists_browse (evsel=0x4fdfc0, nr_events=<value optimized out>,
    helpline=<value optimized out>, left_exits=<value optimized out>,
    hbt=0xfff800012cd7cce0, min_pcnt=0, env=0x4fe4c0) at ui/browsers/hists.c:1501
#10 0x00000000001eb390 in perf_evlist__tui_browse_hists (evlist=0x4c9ae0,
    help=0x28dc00 "For a higher level overview, try: perf top --sort comm,dso",
    hbt=0xfff800012cd7cce0, min_pcnt=<value optimized out>, env=0x4fe4c0)
    at ui/browsers/hists.c:2022
#11 0x0000000000126188 in display_thread_tui (arg=<value optimized out>) at builtin-top.c:582
#12 0xfff8000100130a68 in
start_thread () from /lib64/libpthread.so.0
#13 0xfff800010139c598 in __thread_start () from /lib64/libc.so.6
#14 0xfff800010139c598 in __thread_start () from /lib64/libc.so.6

(gdb) fra 2
#2  thread__put (thread=0x670bd30) at util/thread.c:94
94      util/thread.c: No such file or directory.
        in util/thread.c
(gdb) p *thread
$2 = {{rb_node = {__rb_parent_color = 106933920, rb_right = 0x0, rb_left = 0x0},
  node = {next = 0x65faea0, prev = 0x0}},
  mg = 0x670be20, pid_ = 153034, tid = 153034, ppid = -1, cpu = -1, refcnt = 0,
  shortname = "\000\000", comm_set = false, dead = false,
  comm_list = {next = 0x670bdc0, prev = 0x670bdc0}, comm_len = 7, db_id = 0,
  priv = 0x0, ts = 0x0}

In the node element, prev is NULL.
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-27 17:31 [BUG] segfault in perf-top -- thread refcnt David Ahern @ 2015-03-27 19:51 ` Arnaldo Carvalho de Melo 2015-03-27 20:11 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-27 19:51 UTC (permalink / raw) To: David Ahern; +Cc: Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML Em Fri, Mar 27, 2015 at 11:31:22AM -0600, David Ahern escreveu: > Hi Arnaldo: > > If I leave 'perf top -z' running it eventually crashes with the backtrace in > the new thread refcnt code: Debugging, thanks for the report! - Arnaldo > (gdb) bt > #0 __list_del_entry (thread=0x670bd30) > at /home/dahern/kernels/linux.git/tools/perf/util/include/../../../../include/linux/list.h:102 > #1 list_del_init (thread=0x670bd30) > at /home/dahern/kernels/linux.git/tools/perf/util/include/../../../../include/linux/list.h:145 > #2 thread__put (thread=0x670bd30) at util/thread.c:94 > #3 0x00000000001bfa3c in __thread__zput (he=0x820adc0) at util/thread.h:48 > #4 hist_entry__delete (he=0x820adc0) at util/hist.c:939 > #5 0x00000000001c18a4 in hists__delete_entry (hists=0x4fe0f8) at > util/hist.c:255 > #6 hists__delete_entries (hists=0x4fe0f8) at util/hist.c:283 > #7 0x00000000001250e8 in perf_top__sort_new_samples (arg=0x7feffffbca8) at > builtin-top.c:549 > #8 0x00000000001e9108 in hist_browser__run (evsel=0x4fdfc0, > nr_events=<value optimized out>, > helpline=<value optimized out>, left_exits=<value optimized out>, > hbt=0xfff800012cd7cce0, min_pcnt=0, env=0x4fe4c0) > at ui/browsers/hists.c:431 > #9 perf_evsel__hists_browse (evsel=0x4fdfc0, nr_events=<value optimized > out>, helpline=<value optimized out>, > left_exits=<value optimized out>, hbt=0xfff800012cd7cce0, min_pcnt=0, > env=0x4fe4c0) at ui/browsers/hists.c:1501 > #10 0x00000000001eb390 in perf_evlist__tui_browse_hists (evlist=0x4c9ae0, > help=0x28dc00 "For a higher level overview, try: perf top --sort > comm,dso", 
hbt=0xfff800012cd7cce0, > min_pcnt=<value optimized out>, env=0x4fe4c0) at > ui/browsers/hists.c:2022 > #11 0x0000000000126188 in display_thread_tui (arg=<value optimized out>) at > builtin-top.c:582 > #12 0xfff8000100130a68 in start_thread () from /lib64/libpthread.so.0 > #13 0xfff800010139c598 in __thread_start () from /lib64/libc.so.6 > #14 0xfff800010139c598 in __thread_start () from /lib64/libc.so.6 > > > (gdb) fra 2 > #2 thread__put (thread=0x670bd30) at util/thread.c:94 > 94 util/thread.c: No such file or directory. > in util/thread.c > (gdb) p *thread > $2 = {{rb_node = {__rb_parent_color = 106933920, rb_right = 0x0, rb_left = > 0x0}, node = {next = 0x65faea0, prev = 0x0}}, > mg = 0x670be20, pid_ = 153034, tid = 153034, ppid = -1, cpu = -1, refcnt = > 0, shortname = "\000\000", comm_set = false, > dead = false, comm_list = {next = 0x670bdc0, prev = 0x670bdc0}, comm_len = > 7, db_id = 0, priv = 0x0, ts = 0x0} > > > In node element, prev is NULL ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-27 17:31 [BUG] segfault in perf-top -- thread refcnt David Ahern 2015-03-27 19:51 ` Arnaldo Carvalho de Melo @ 2015-03-27 20:11 ` Arnaldo Carvalho de Melo 2015-03-27 20:13 ` David Ahern 1 sibling, 1 reply; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-27 20:11 UTC (permalink / raw) To: David Ahern; +Cc: Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML Em Fri, Mar 27, 2015 at 11:31:22AM -0600, David Ahern escreveu: > Hi Arnaldo: > > If I leave 'perf top -z' running it eventually crashes with the backtrace in > the new thread refcnt code: Humm, tried here with: perf top -d 1 -z I.e. as you, with the --tui, and also with --stdio, so far couldn't reproduce. > (gdb) bt > #0 __list_del_entry (thread=0x670bd30) > at /home/dahern/kernels/linux.git/tools/perf/util/include/../../../../include/linux/list.h:102 > #1 list_del_init (thread=0x670bd30) > at /home/dahern/kernels/linux.git/tools/perf/util/include/../../../../include/linux/list.h:145 > #2 thread__put (thread=0x670bd30) at util/thread.c:94 > #3 0x00000000001bfa3c in __thread__zput (he=0x820adc0) at util/thread.h:48 > #4 hist_entry__delete (he=0x820adc0) at util/hist.c:939 > #5 0x00000000001c18a4 in hists__delete_entry (hists=0x4fe0f8) at > util/hist.c:255 > #6 hists__delete_entries (hists=0x4fe0f8) at util/hist.c:283 > #7 0x00000000001250e8 in perf_top__sort_new_samples (arg=0x7feffffbca8) at > builtin-top.c:549 > #8 0x00000000001e9108 in hist_browser__run (evsel=0x4fdfc0, > nr_events=<value optimized out>, > helpline=<value optimized out>, left_exits=<value optimized out>, > hbt=0xfff800012cd7cce0, min_pcnt=0, env=0x4fe4c0) > at ui/browsers/hists.c:431 > #9 perf_evsel__hists_browse (evsel=0x4fdfc0, nr_events=<value optimized > out>, helpline=<value optimized out>, > left_exits=<value optimized out>, hbt=0xfff800012cd7cce0, min_pcnt=0, > env=0x4fe4c0) at ui/browsers/hists.c:1501 > #10 0x00000000001eb390 in perf_evlist__tui_browse_hists 
(evlist=0x4c9ae0, > help=0x28dc00 "For a higher level overview, try: perf top --sort > comm,dso", hbt=0xfff800012cd7cce0, > min_pcnt=<value optimized out>, env=0x4fe4c0) at > ui/browsers/hists.c:2022 > #11 0x0000000000126188 in display_thread_tui (arg=<value optimized out>) at > builtin-top.c:582 > #12 0xfff8000100130a68 in start_thread () from /lib64/libpthread.so.0 > #13 0xfff800010139c598 in __thread_start () from /lib64/libc.so.6 > #14 0xfff800010139c598 in __thread_start () from /lib64/libc.so.6 > > > (gdb) fra 2 > #2 thread__put (thread=0x670bd30) at util/thread.c:94 > 94 util/thread.c: No such file or directory. > in util/thread.c > (gdb) p *thread > $2 = {{rb_node = {__rb_parent_color = 106933920, rb_right = 0x0, rb_left = > 0x0}, node = {next = 0x65faea0, prev = 0x0}}, > mg = 0x670be20, pid_ = 153034, tid = 153034, ppid = -1, cpu = -1, refcnt = > 0, shortname = "\000\000", comm_set = false, > dead = false, comm_list = {next = 0x670bdc0, prev = 0x670bdc0}, comm_len = > 7, db_id = 0, priv = 0x0, ts = 0x0} > > > In node element, prev is NULL ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt
  2015-03-27 20:11 ` Arnaldo Carvalho de Melo
@ 2015-03-27 20:13   ` David Ahern
  2015-03-30  8:07     ` Jiri Olsa
  0 siblings, 1 reply; 24+ messages in thread

From: David Ahern @ 2015-03-27 20:13 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML

On 3/27/15 2:11 PM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Mar 27, 2015 at 11:31:22AM -0600, David Ahern escreveu:
>>> Hi Arnaldo:
>>>
>>> If I leave 'perf top -z' running it eventually crashes with the backtrace in
>>> the new thread refcnt code:
> Humm, tried here with:
>
>   perf top -d 1 -z
>
> I.e. as you, with the --tui, and also with --stdio, so far couldn't
> reproduce.

2 things:

1. Let it run for a long time. Go about using the server, do lots of builds,
   etc. It takes time.

2. Use a box with a LOT of cpus (1024 in my case).

Make sure ulimit is set so you get the core.
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-27 20:13 ` David Ahern @ 2015-03-30 8:07 ` Jiri Olsa 2015-03-30 10:22 ` Jiri Olsa 0 siblings, 1 reply; 24+ messages in thread From: Jiri Olsa @ 2015-03-30 8:07 UTC (permalink / raw) To: David Ahern Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML On Fri, Mar 27, 2015 at 02:13:26PM -0600, David Ahern wrote: > On 3/27/15 2:11 PM, Arnaldo Carvalho de Melo wrote: > >Em Fri, Mar 27, 2015 at 11:31:22AM -0600, David Ahern escreveu: > >>>Hi Arnaldo: > >>> > >>>If I leave 'perf top -z' running it eventually crashes with the backtrace in > >>>the new thread refcnt code: > >Humm, tried here with: > > > >perf top -d 1 -z > > > >I.e. as you, with the --tui, and also with --stdio, so far couldn't > >reproduce. > > > > 2 things: > 1. let run for a long time. go about using the server. do lots of builds, > etc. it takes time > > 2. use a box with a LOT of cpus (1024 in my case) > > Make sure ulimit is set to get the core. reproduced under 24 cpu box with kernel build (make -j25) running on background.. will try to look closer perf: Segmentation fault -------- backtrace -------- ./perf[0x4fd79b] /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0] ./perf(thread__put+0x5b)[0x4b1a7b] ./perf(hists__delete_entries+0x70)[0x4c8670] ./perf[0x436a88] ./perf[0x4fa73d] ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437] ./perf[0x4381d0] /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5] /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d] [0x0] jirka ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 8:07 ` Jiri Olsa @ 2015-03-30 10:22 ` Jiri Olsa 2015-03-30 11:21 ` Jiri Olsa ` (2 more replies) 0 siblings, 3 replies; 24+ messages in thread From: Jiri Olsa @ 2015-03-30 10:22 UTC (permalink / raw) To: David Ahern Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: SNIP > > > > 2 things: > > 1. let run for a long time. go about using the server. do lots of builds, > > etc. it takes time > > > > 2. use a box with a LOT of cpus (1024 in my case) > > > > Make sure ulimit is set to get the core. > > reproduced under 24 cpu box with kernel build (make -j25) > running on background.. will try to look closer > > perf: Segmentation fault > -------- backtrace -------- > ./perf[0x4fd79b] > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0] > ./perf(thread__put+0x5b)[0x4b1a7b] > ./perf(hists__delete_entries+0x70)[0x4c8670] > ./perf[0x436a88] > ./perf[0x4fa73d] > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437] > ./perf[0x4381d0] > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5] > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d] > [0x0] looks like race among __machine__findnew_thread and thread__put over the machine->threads rb_tree insert/removal is there a reason why thread__put does not erase itself from machine->threads? I'm trying attached patch.. 
so far so gut ;-) jirka --- diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index e335330..7e6abc7 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) dsos__init(&machine->kernel_dsos); machine->threads = RB_ROOT; + pthread_mutex_init(&machine->threads_lock, NULL); INIT_LIST_HEAD(&machine->dead_threads); machine->last_match = NULL; @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine, if (!create) return NULL; - th = thread__new(pid, tid); + th = thread__new(machine, pid, tid); if (th != NULL) { + + pthread_mutex_lock(&machine->threads_lock); rb_link_node(&th->rb_node, parent, p); rb_insert_color(&th->rb_node, &machine->threads); + pthread_mutex_unlock(&machine->threads_lock); /* * We have to initialize map_groups separately @@ -394,8 +398,10 @@ static struct thread *__machine__findnew_thread(struct machine *machine, * leader and that would screwed the rb tree. */ if (thread__init_map_groups(th, machine)) { + pthread_mutex_lock(&machine->threads_lock); rb_erase(&th->rb_node, &machine->threads); thread__delete(th); + pthread_mutex_unlock(&machine->threads_lock); return NULL; } /* diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h index e2faf3b..e3468d6 100644 --- a/tools/perf/util/machine.h +++ b/tools/perf/util/machine.h @@ -30,6 +30,7 @@ struct machine { bool comm_exec; char *root_dir; struct rb_root threads; + pthread_mutex_t threads_lock; struct list_head dead_threads; struct thread *last_match; struct vdso_info *vdso_info; diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c index 1c8fbc9..4592fc4 100644 --- a/tools/perf/util/thread.c +++ b/tools/perf/util/thread.c @@ -26,7 +26,7 @@ int thread__init_map_groups(struct thread *thread, struct machine *machine) return thread->mg ? 
0 : -1; } -struct thread *thread__new(pid_t pid, pid_t tid) +struct thread *thread__new(struct machine *machine, pid_t pid, pid_t tid) { char *comm_str; struct comm *comm; @@ -38,6 +38,7 @@ struct thread *thread__new(pid_t pid, pid_t tid) thread->ppid = -1; thread->cpu = -1; INIT_LIST_HEAD(&thread->comm_list); + thread->machine = machine; if (unwind__prepare_access(thread) < 0) goto err_thread; @@ -91,7 +92,14 @@ struct thread *thread__get(struct thread *thread) void thread__put(struct thread *thread) { if (thread && --thread->refcnt == 0) { + struct machine *machine = thread->machine; + list_del_init(&thread->node); + + pthread_mutex_lock(&machine->threads_lock); + rb_erase(&thread->rb_node, &machine->threads); + pthread_mutex_unlock(&machine->threads_lock); + thread__delete(thread); } } diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h index 9b8a54d..df6fb69 100644 --- a/tools/perf/util/thread.h +++ b/tools/perf/util/thread.h @@ -31,12 +31,13 @@ struct thread { void *priv; struct thread_stack *ts; + struct machine *machine; }; struct machine; struct comm; -struct thread *thread__new(pid_t pid, pid_t tid); +struct thread *thread__new(struct machine *machine, pid_t pid, pid_t tid); int thread__init_map_groups(struct thread *thread, struct machine *machine); void thread__delete(struct thread *thread); ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt
  2015-03-30 10:22 ` Jiri Olsa
@ 2015-03-30 11:21   ` Jiri Olsa
  2015-03-30 11:49     ` Jiri Olsa
  0 siblings, 1 reply; 24+ messages in thread

From: Jiri Olsa @ 2015-03-30 11:21 UTC (permalink / raw)
To: David Ahern
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML

On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote:
> On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote:
>
> SNIP
>
> > > 2 things:
> > > 1. let run for a long time. go about using the server. do lots of builds,
> > > etc. it takes time
> > >
> > > 2. use a box with a LOT of cpus (1024 in my case)
> > >
> > > Make sure ulimit is set to get the core.
> >
> > reproduced under 24 cpu box with kernel build (make -j25)
> > running on background.. will try to look closer
> >
> > perf: Segmentation fault
> > -------- backtrace --------
> > ./perf[0x4fd79b]
> > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0]
> > ./perf(thread__put+0x5b)[0x4b1a7b]
> > ./perf(hists__delete_entries+0x70)[0x4c8670]
> > ./perf[0x436a88]
> > ./perf[0x4fa73d]
> > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437]
> > ./perf[0x4381d0]
> > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5]
> > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d]
> > [0x0]
>
> looks like race among __machine__findnew_thread and thread__put
> over the machine->threads rb_tree insert/removal
>
> is there a reason why thread__put does not erase itself from machine->threads?
>
> I'm trying attached patch.. so far so gut ;-)

argh.. it blew up during lunch :-\

jirka
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 11:21 ` Jiri Olsa @ 2015-03-30 11:49 ` Jiri Olsa 2015-03-30 12:48 ` Namhyung Kim 2015-03-30 13:22 ` Arnaldo Carvalho de Melo 0 siblings, 2 replies; 24+ messages in thread From: Jiri Olsa @ 2015-03-30 11:49 UTC (permalink / raw) To: David Ahern Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote: > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: > > > > SNIP > > > > > > > > > > 2 things: > > > > 1. let run for a long time. go about using the server. do lots of builds, > > > > etc. it takes time > > > > > > > > 2. use a box with a LOT of cpus (1024 in my case) > > > > > > > > Make sure ulimit is set to get the core. > > > > > > reproduced under 24 cpu box with kernel build (make -j25) > > > running on background.. will try to look closer > > > > > > perf: Segmentation fault > > > -------- backtrace -------- > > > ./perf[0x4fd79b] > > > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0] > > > ./perf(thread__put+0x5b)[0x4b1a7b] > > > ./perf(hists__delete_entries+0x70)[0x4c8670] > > > ./perf[0x436a88] > > > ./perf[0x4fa73d] > > > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437] > > > ./perf[0x4381d0] > > > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5] > > > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d] > > > [0x0] > > > > looks like race among __machine__findnew_thread and thread__put > > over the machine->threads rb_tree insert/removal > > > > is there a reason why thread__put does not erase itself from machine->threads? that was the reason.. we do this separately.. not in thread__put.. is there a reason for this? ;-) testing attached patch.. 
jirka --- diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c index f7fb258..966564a 100644 --- a/tools/perf/util/build-id.c +++ b/tools/perf/util/build-id.c @@ -60,7 +60,6 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused, event->fork.ppid, event->fork.ptid); if (thread) { - rb_erase(&thread->rb_node, &machine->threads); if (machine->last_match == thread) thread__zput(machine->last_match); thread__put(thread); diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index e335330..a8443ef 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) dsos__init(&machine->kernel_dsos); machine->threads = RB_ROOT; + pthread_mutex_init(&machine->threads_lock, NULL); INIT_LIST_HEAD(&machine->dead_threads); machine->last_match = NULL; @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine, if (!create) return NULL; - th = thread__new(pid, tid); + th = thread__new(machine, pid, tid); if (th != NULL) { + + pthread_mutex_lock(&machine->threads_lock); rb_link_node(&th->rb_node, parent, p); rb_insert_color(&th->rb_node, &machine->threads); + pthread_mutex_unlock(&machine->threads_lock); /* * We have to initialize map_groups separately @@ -394,7 +398,9 @@ static struct thread *__machine__findnew_thread(struct machine *machine, * leader and that would screwed the rb tree. 
*/ if (thread__init_map_groups(th, machine)) { + pthread_mutex_lock(&machine->threads_lock); rb_erase(&th->rb_node, &machine->threads); + pthread_mutex_unlock(&machine->threads_lock); thread__delete(th); return NULL; } @@ -1258,7 +1264,6 @@ static void machine__remove_thread(struct machine *machine, struct thread *th) if (machine->last_match == th) thread__zput(machine->last_match); - rb_erase(&th->rb_node, &machine->threads); /* * Move it first to the dead_threads list, then drop the reference, * if this is the last reference, then the thread__delete destructor diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h index e2faf3b..e3468d6 100644 --- a/tools/perf/util/machine.h +++ b/tools/perf/util/machine.h @@ -30,6 +30,7 @@ struct machine { bool comm_exec; char *root_dir; struct rb_root threads; + pthread_mutex_t threads_lock; struct list_head dead_threads; struct thread *last_match; struct vdso_info *vdso_info; diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c index 1c8fbc9..4592fc4 100644 --- a/tools/perf/util/thread.c +++ b/tools/perf/util/thread.c @@ -26,7 +26,7 @@ int thread__init_map_groups(struct thread *thread, struct machine *machine) return thread->mg ? 
0 : -1; } -struct thread *thread__new(pid_t pid, pid_t tid) +struct thread *thread__new(struct machine *machine, pid_t pid, pid_t tid) { char *comm_str; struct comm *comm; @@ -38,6 +38,7 @@ struct thread *thread__new(pid_t pid, pid_t tid) thread->ppid = -1; thread->cpu = -1; INIT_LIST_HEAD(&thread->comm_list); + thread->machine = machine; if (unwind__prepare_access(thread) < 0) goto err_thread; @@ -91,7 +92,14 @@ struct thread *thread__get(struct thread *thread) void thread__put(struct thread *thread) { if (thread && --thread->refcnt == 0) { + struct machine *machine = thread->machine; + list_del_init(&thread->node); + + pthread_mutex_lock(&machine->threads_lock); + rb_erase(&thread->rb_node, &machine->threads); + pthread_mutex_unlock(&machine->threads_lock); + thread__delete(thread); } } diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h index 9b8a54d..df6fb69 100644 --- a/tools/perf/util/thread.h +++ b/tools/perf/util/thread.h @@ -31,12 +31,13 @@ struct thread { void *priv; struct thread_stack *ts; + struct machine *machine; }; struct machine; struct comm; -struct thread *thread__new(pid_t pid, pid_t tid); +struct thread *thread__new(struct machine *machine, pid_t pid, pid_t tid); int thread__init_map_groups(struct thread *thread, struct machine *machine); void thread__delete(struct thread *thread); ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 11:49 ` Jiri Olsa @ 2015-03-30 12:48 ` Namhyung Kim 2015-03-30 12:56 ` Jiri Olsa 2015-03-30 14:58 ` Arnaldo Carvalho de Melo 2015-03-30 13:22 ` Arnaldo Carvalho de Melo 1 sibling, 2 replies; 24+ messages in thread From: Namhyung Kim @ 2015-03-30 12:48 UTC (permalink / raw) To: Jiri Olsa Cc: David Ahern, Arnaldo Carvalho de Melo, Jiri Olsa, Stephane Eranian, LKML Hi Jiri, On Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa wrote: > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: > > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote: > > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: > > > > > > SNIP > > > > > > > > > > > > > 2 things: > > > > > 1. let run for a long time. go about using the server. do lots of builds, > > > > > etc. it takes time > > > > > > > > > > 2. use a box with a LOT of cpus (1024 in my case) > > > > > > > > > > Make sure ulimit is set to get the core. > > > > > > > > reproduced under 24 cpu box with kernel build (make -j25) > > > > running on background.. will try to look closer > > > > > > > > perf: Segmentation fault > > > > -------- backtrace -------- > > > > ./perf[0x4fd79b] > > > > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0] > > > > ./perf(thread__put+0x5b)[0x4b1a7b] > > > > ./perf(hists__delete_entries+0x70)[0x4c8670] > > > > ./perf[0x436a88] > > > > ./perf[0x4fa73d] > > > > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437] > > > > ./perf[0x4381d0] > > > > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5] > > > > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d] > > > > [0x0] > > > > > > looks like race among __machine__findnew_thread and thread__put > > > over the machine->threads rb_tree insert/removal > > > > > > is there a reason why thread__put does not erase itself from machine->threads? > > that was the reason.. we do this separately.. not in thread__put.. > is there a reason for this? ;-) > > testing attached patch.. 
> > jirka > > > --- > diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c > index f7fb258..966564a 100644 > --- a/tools/perf/util/build-id.c > +++ b/tools/perf/util/build-id.c > @@ -60,7 +60,6 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused, > event->fork.ppid, event->fork.ptid); > > if (thread) { > - rb_erase(&thread->rb_node, &machine->threads); > if (machine->last_match == thread) > thread__zput(machine->last_match); > thread__put(thread); > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > index e335330..a8443ef 100644 > --- a/tools/perf/util/machine.c > +++ b/tools/perf/util/machine.c > @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) > dsos__init(&machine->kernel_dsos); > > machine->threads = RB_ROOT; > + pthread_mutex_init(&machine->threads_lock, NULL); > INIT_LIST_HEAD(&machine->dead_threads); > machine->last_match = NULL; > > @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine, > if (!create) > return NULL; > > - th = thread__new(pid, tid); > + th = thread__new(machine, pid, tid); > if (th != NULL) { > + > + pthread_mutex_lock(&machine->threads_lock); > rb_link_node(&th->rb_node, parent, p); > rb_insert_color(&th->rb_node, &machine->threads); > + pthread_mutex_unlock(&machine->threads_lock); I think you also need to protect the rb tree traversal above. But this makes every sample processing grabs and releases the lock so might cause high overhead. It can be a problem if such processing is done parallelly like my multi-thread work. :-/ Thanks, Namhyung > > /* > * We have to initialize map_groups separately > @@ -394,7 +398,9 @@ static struct thread *__machine__findnew_thread(struct machine *machine, > * leader and that would screwed the rb tree. 
> */ > if (thread__init_map_groups(th, machine)) { > + pthread_mutex_lock(&machine->threads_lock); > rb_erase(&th->rb_node, &machine->threads); > + pthread_mutex_unlock(&machine->threads_lock); > thread__delete(th); > return NULL; > } > @@ -1258,7 +1264,6 @@ static void machine__remove_thread(struct machine *machine, struct thread *th) > if (machine->last_match == th) > thread__zput(machine->last_match); > > - rb_erase(&th->rb_node, &machine->threads); > /* > * Move it first to the dead_threads list, then drop the reference, > * if this is the last reference, then the thread__delete destructor > diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h > index e2faf3b..e3468d6 100644 > --- a/tools/perf/util/machine.h > +++ b/tools/perf/util/machine.h > @@ -30,6 +30,7 @@ struct machine { > bool comm_exec; > char *root_dir; > struct rb_root threads; > + pthread_mutex_t threads_lock; > struct list_head dead_threads; > struct thread *last_match; > struct vdso_info *vdso_info; > diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c > index 1c8fbc9..4592fc4 100644 > --- a/tools/perf/util/thread.c > +++ b/tools/perf/util/thread.c > @@ -26,7 +26,7 @@ int thread__init_map_groups(struct thread *thread, struct machine *machine) > return thread->mg ? 
0 : -1; > } > > -struct thread *thread__new(pid_t pid, pid_t tid) > +struct thread *thread__new(struct machine *machine, pid_t pid, pid_t tid) > { > char *comm_str; > struct comm *comm; > @@ -38,6 +38,7 @@ struct thread *thread__new(pid_t pid, pid_t tid) > thread->ppid = -1; > thread->cpu = -1; > INIT_LIST_HEAD(&thread->comm_list); > + thread->machine = machine; > > if (unwind__prepare_access(thread) < 0) > goto err_thread; > @@ -91,7 +92,14 @@ struct thread *thread__get(struct thread *thread) > void thread__put(struct thread *thread) > { > if (thread && --thread->refcnt == 0) { > + struct machine *machine = thread->machine; > + > list_del_init(&thread->node); > + > + pthread_mutex_lock(&machine->threads_lock); > + rb_erase(&thread->rb_node, &machine->threads); > + pthread_mutex_unlock(&machine->threads_lock); > + > thread__delete(thread); > } > } > diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h > index 9b8a54d..df6fb69 100644 > --- a/tools/perf/util/thread.h > +++ b/tools/perf/util/thread.h > @@ -31,12 +31,13 @@ struct thread { > > void *priv; > struct thread_stack *ts; > + struct machine *machine; > }; > > struct machine; > struct comm; > > -struct thread *thread__new(pid_t pid, pid_t tid); > +struct thread *thread__new(struct machine *machine, pid_t pid, pid_t tid); > int thread__init_map_groups(struct thread *thread, struct machine *machine); > void thread__delete(struct thread *thread); > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 12:48 ` Namhyung Kim @ 2015-03-30 12:56 ` Jiri Olsa 2015-03-30 13:06 ` Namhyung Kim 2015-03-30 13:07 ` Arnaldo Carvalho de Melo 2015-03-30 14:58 ` Arnaldo Carvalho de Melo 1 sibling, 2 replies; 24+ messages in thread From: Jiri Olsa @ 2015-03-30 12:56 UTC (permalink / raw) To: Namhyung Kim Cc: David Ahern, Arnaldo Carvalho de Melo, Jiri Olsa, Stephane Eranian, LKML On Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim wrote: > Hi Jiri, > > On Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa wrote: > > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: > > > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote: > > > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: > > > > > > > > SNIP > > > > > > > > > > > > > > > > 2 things: > > > > > > 1. let run for a long time. go about using the server. do lots of builds, > > > > > > etc. it takes time > > > > > > > > > > > > 2. use a box with a LOT of cpus (1024 in my case) > > > > > > > > > > > > Make sure ulimit is set to get the core. > > > > > > > > > > reproduced under 24 cpu box with kernel build (make -j25) > > > > > running on background.. will try to look closer > > > > > > > > > > perf: Segmentation fault > > > > > -------- backtrace -------- > > > > > ./perf[0x4fd79b] > > > > > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0] > > > > > ./perf(thread__put+0x5b)[0x4b1a7b] > > > > > ./perf(hists__delete_entries+0x70)[0x4c8670] > > > > > ./perf[0x436a88] > > > > > ./perf[0x4fa73d] > > > > > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437] > > > > > ./perf[0x4381d0] > > > > > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5] > > > > > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d] > > > > > [0x0] > > > > > > > > looks like race among __machine__findnew_thread and thread__put > > > > over the machine->threads rb_tree insert/removal > > > > > > > > is there a reason why thread__put does not erase itself from machine->threads? 
> > > > that was the reason.. we do this separately.. not in thread__put.. > > is there a reason for this? ;-) > > > > testing attached patch.. > > > > jirka > > > > > > --- > > diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c > > index f7fb258..966564a 100644 > > --- a/tools/perf/util/build-id.c > > +++ b/tools/perf/util/build-id.c > > @@ -60,7 +60,6 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused, > > event->fork.ppid, event->fork.ptid); > > > > if (thread) { > > - rb_erase(&thread->rb_node, &machine->threads); > > if (machine->last_match == thread) > > thread__zput(machine->last_match); > > thread__put(thread); > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > index e335330..a8443ef 100644 > > --- a/tools/perf/util/machine.c > > +++ b/tools/perf/util/machine.c > > @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) > > dsos__init(&machine->kernel_dsos); > > > > machine->threads = RB_ROOT; > > + pthread_mutex_init(&machine->threads_lock, NULL); > > INIT_LIST_HEAD(&machine->dead_threads); > > machine->last_match = NULL; > > > > @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine, > > if (!create) > > return NULL; > > > > - th = thread__new(pid, tid); > > + th = thread__new(machine, pid, tid); > > if (th != NULL) { > > + > > + pthread_mutex_lock(&machine->threads_lock); > > rb_link_node(&th->rb_node, parent, p); > > rb_insert_color(&th->rb_node, &machine->threads); > > + pthread_mutex_unlock(&machine->threads_lock); > > I think you also need to protect the rb tree traversal above. yep, I already have another version.. but it blows on another place ;-) > > But this makes every sample processing grabs and releases the lock so > might cause high overhead. It can be a problem if such processing is > done parallelly like my multi-thread work. :-/ yep.. 
perhaps instead of more locking we need to find a way where only single thread do the update on hists/threads jirka ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 12:56 ` Jiri Olsa @ 2015-03-30 13:06 ` Namhyung Kim 2015-03-30 14:02 ` Arnaldo Carvalho de Melo 2015-03-30 13:07 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 24+ messages in thread From: Namhyung Kim @ 2015-03-30 13:06 UTC (permalink / raw) To: Jiri Olsa Cc: David Ahern, Arnaldo Carvalho de Melo, Jiri Olsa, Stephane Eranian, LKML On Mon, Mar 30, 2015 at 9:56 PM, Jiri Olsa <jolsa@redhat.com> wrote: > On Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim wrote: >> Hi Jiri, >> >> On Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa wrote: >> > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: >> > > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote: >> > > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: >> > > > >> > > > SNIP >> > > > >> > > > > > >> > > > > > 2 things: >> > > > > > 1. let run for a long time. go about using the server. do lots of builds, >> > > > > > etc. it takes time >> > > > > > >> > > > > > 2. use a box with a LOT of cpus (1024 in my case) >> > > > > > >> > > > > > Make sure ulimit is set to get the core. >> > > > > >> > > > > reproduced under 24 cpu box with kernel build (make -j25) >> > > > > running on background.. 
will try to look closer >> > > > > >> > > > > perf: Segmentation fault >> > > > > -------- backtrace -------- >> > > > > ./perf[0x4fd79b] >> > > > > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0] >> > > > > ./perf(thread__put+0x5b)[0x4b1a7b] >> > > > > ./perf(hists__delete_entries+0x70)[0x4c8670] >> > > > > ./perf[0x436a88] >> > > > > ./perf[0x4fa73d] >> > > > > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437] >> > > > > ./perf[0x4381d0] >> > > > > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5] >> > > > > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d] >> > > > > [0x0] >> > > > >> > > > looks like race among __machine__findnew_thread and thread__put >> > > > over the machine->threads rb_tree insert/removal >> > > > >> > > > is there a reason why thread__put does not erase itself from machine->threads? >> > >> > that was the reason.. we do this separately.. not in thread__put.. >> > is there a reason for this? ;-) >> > >> > testing attached patch.. >> > >> > jirka >> > >> > >> > --- >> > diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c >> > index f7fb258..966564a 100644 >> > --- a/tools/perf/util/build-id.c >> > +++ b/tools/perf/util/build-id.c >> > @@ -60,7 +60,6 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused, >> > event->fork.ppid, event->fork.ptid); >> > >> > if (thread) { >> > - rb_erase(&thread->rb_node, &machine->threads); >> > if (machine->last_match == thread) >> > thread__zput(machine->last_match); >> > thread__put(thread); >> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c >> > index e335330..a8443ef 100644 >> > --- a/tools/perf/util/machine.c >> > +++ b/tools/perf/util/machine.c >> > @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) >> > dsos__init(&machine->kernel_dsos); >> > >> > machine->threads = RB_ROOT; >> > + pthread_mutex_init(&machine->threads_lock, NULL); >> > INIT_LIST_HEAD(&machine->dead_threads); >> > machine->last_match = NULL; 
>> > >> > @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine, >> > if (!create) >> > return NULL; >> > >> > - th = thread__new(pid, tid); >> > + th = thread__new(machine, pid, tid); >> > if (th != NULL) { >> > + >> > + pthread_mutex_lock(&machine->threads_lock); >> > rb_link_node(&th->rb_node, parent, p); >> > rb_insert_color(&th->rb_node, &machine->threads); >> > + pthread_mutex_unlock(&machine->threads_lock); >> >> I think you also need to protect the rb tree traversal above. > > yep, I already have another version.. but it blows on another place ;-) > >> >> But this makes every sample processing grabs and releases the lock so >> might cause high overhead. It can be a problem if such processing is >> done parallelly like my multi-thread work. :-/ > > yep.. perhaps instead of more locking we need to find a way where > only single thread do the update on hists/threads Agreed. AFAIK the reason we do ref-counting is to cleanup dead/exited thread for live session like perf top. In that case we can somehow mark to-be-deleted thread and kill it in a safe time/place.. Thanks, Namhyung ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 13:06 ` Namhyung Kim @ 2015-03-30 14:02 ` Arnaldo Carvalho de Melo 2015-03-31 0:15 ` Namhyung Kim 0 siblings, 1 reply; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-30 14:02 UTC (permalink / raw) To: Namhyung Kim; +Cc: Jiri Olsa, David Ahern, Jiri Olsa, Stephane Eranian, LKML Em Mon, Mar 30, 2015 at 10:06:35PM +0900, Namhyung Kim escreveu: > On Mon, Mar 30, 2015 at 9:56 PM, Jiri Olsa <jolsa@redhat.com> wrote: > > On Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim wrote: > >> > @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine, > >> > if (!create) > >> > return NULL; > >> > > >> > - th = thread__new(pid, tid); > >> > + th = thread__new(machine, pid, tid); > >> > if (th != NULL) { > >> > + > >> > + pthread_mutex_lock(&machine->threads_lock); > >> > rb_link_node(&th->rb_node, parent, p); > >> > rb_insert_color(&th->rb_node, &machine->threads); > >> > + pthread_mutex_unlock(&machine->threads_lock); > >> > >> I think you also need to protect the rb tree traversal above. > > > > yep, I already have another version.. but it blows on another place ;-) > > > >> > >> But this makes every sample processing grabs and releases the lock so > >> might cause high overhead. It can be a problem if such processing is > >> done parallelly like my multi-thread work. :-/ > > > > yep.. perhaps instead of more locking we need to find a way where > > only single thread do the update on hists/threads > > Agreed. > > AFAIK the reason we do ref-counting is to cleanup dead/exited thread > for live session like perf top. In that case we can somehow mark > to-be-deleted thread and kill it in a safe time/place.. Humm, you mean have another list node in struct threads and add threads to another dead_threads like list, i.e. 
one that is _really_ dead as no more refcounts point to it, and then amortize the costs of removing it from the rb_tree by removing multiple threads instead of just one? - Arnaldo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 14:02 ` Arnaldo Carvalho de Melo @ 2015-03-31 0:15 ` Namhyung Kim 0 siblings, 0 replies; 24+ messages in thread From: Namhyung Kim @ 2015-03-31 0:15 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Jiri Olsa, David Ahern, Jiri Olsa, Stephane Eranian, LKML Hi Arnaldo, On Mon, Mar 30, 2015 at 11:02:39AM -0300, Arnaldo Carvalho de Melo wrote: > Em Mon, Mar 30, 2015 at 10:06:35PM +0900, Namhyung Kim escreveu: > > On Mon, Mar 30, 2015 at 9:56 PM, Jiri Olsa <jolsa@redhat.com> wrote: > > > On Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim wrote: > > >> > @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine, > > >> > if (!create) > > >> > return NULL; > > >> > > > >> > - th = thread__new(pid, tid); > > >> > + th = thread__new(machine, pid, tid); > > >> > if (th != NULL) { > > >> > + > > >> > + pthread_mutex_lock(&machine->threads_lock); > > >> > rb_link_node(&th->rb_node, parent, p); > > >> > rb_insert_color(&th->rb_node, &machine->threads); > > >> > + pthread_mutex_unlock(&machine->threads_lock); > > >> > > >> I think you also need to protect the rb tree traversal above. > > > > > > yep, I already have another version.. but it blows on another place ;-) > > > > > >> > > >> But this makes every sample processing grabs and releases the lock so > > >> might cause high overhead. It can be a problem if such processing is > > >> done parallelly like my multi-thread work. :-/ > > > > > > yep.. perhaps instead of more locking we need to find a way where > > > only single thread do the update on hists/threads > > > > Agreed. > > > > AFAIK the reason we do ref-counting is to cleanup dead/exited thread > > for live session like perf top. In that case we can somehow mark > > to-be-deleted thread and kill it in a safe time/place.. > > Humm, you mean have another list node in struct threads and add threads > to another dead_threads like list, i.e. 
one that is _really_ dead as no > more refcounts point to it, and then amortize the costs of removing it > from the rb_tree by removing multiple threads instead of just one? Yes, I really want to avoid any overhead on the fastpath. Instead of refcnt, how about marking and deleting dead threads by the perf top's display thread unless it's not a selected one? Thanks, Namhyung ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 12:56 ` Jiri Olsa 2015-03-30 13:06 ` Namhyung Kim @ 2015-03-30 13:07 ` Arnaldo Carvalho de Melo 2015-03-30 13:20 ` Jiri Olsa 1 sibling, 1 reply; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-30 13:07 UTC (permalink / raw) To: Jiri Olsa; +Cc: Namhyung Kim, David Ahern, Jiri Olsa, Stephane Eranian, LKML Em Mon, Mar 30, 2015 at 02:56:31PM +0200, Jiri Olsa escreveu: > On Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim wrote: > > Hi Jiri, > > > > On Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa wrote: > > > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: > > > > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote: > > > > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: > > > - th = thread__new(pid, tid); > > > + th = thread__new(machine, pid, tid); > > > if (th != NULL) { > > > + > > > + pthread_mutex_lock(&machine->threads_lock); > > > rb_link_node(&th->rb_node, parent, p); > > > rb_insert_color(&th->rb_node, &machine->threads); > > > + pthread_mutex_unlock(&machine->threads_lock); > > > > I think you also need to protect the rb tree traversal above. > > yep, I already have another version.. but it blows on another place ;-) Well, why? The point of refcounting is that the structure will not go away while we have it in the rbtree. Or are you talking about two threads trying to insert entries in the rbtree? Can you point where this can happen? > > But this makes every sample processing grabs and releases the lock so > > might cause high overhead. It can be a problem if such processing is > > done parallelly like my multi-thread work. :-/ > yep.. perhaps instead of more locking we need to find a way where > only single thread do the update on hists/threads But that should be really rare, no? The problem is to search in one thread (the fast path) and inserting entries (slow path), no? - Arnaldo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 13:07 ` Arnaldo Carvalho de Melo @ 2015-03-30 13:20 ` Jiri Olsa 2015-03-30 13:59 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 24+ messages in thread From: Jiri Olsa @ 2015-03-30 13:20 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Namhyung Kim, David Ahern, Jiri Olsa, Stephane Eranian, LKML On Mon, Mar 30, 2015 at 10:07:08AM -0300, Arnaldo Carvalho de Melo wrote: > Em Mon, Mar 30, 2015 at 02:56:31PM +0200, Jiri Olsa escreveu: > > On Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim wrote: > > > Hi Jiri, > > > > > > On Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa wrote: > > > > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: > > > > > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote: > > > > > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: > > > > - th = thread__new(pid, tid); > > > > + th = thread__new(machine, pid, tid); > > > > if (th != NULL) { > > > > + > > > > + pthread_mutex_lock(&machine->threads_lock); > > > > rb_link_node(&th->rb_node, parent, p); > > > > rb_insert_color(&th->rb_node, &machine->threads); > > > > + pthread_mutex_unlock(&machine->threads_lock); > > > > > > I think you also need to protect the rb tree traversal above. > > > > yep, I already have another version.. but it blows on another place ;-) > > Well, why? The point of refcounting is that the structure will not go > away while we have it in the rbtree. > > Or are you talking about two threads trying to insert entries in the > rbtree? Can you point where this can happen? yep, as I wrote in previous email: > looks like race among __machine__findnew_thread and thread__put > over the machine->threads rb_tree insert/removal update thread: perf_event__process_sample perf_event__preprocess_sample machine__findnew_thread traverse machine->threads display thread: perf_top__sort_new_samples hists__delete_entries ... 
hist_entry__delete(struct hist_entry *he) thread__zput(he->thread); > > > > But this makes every sample processing grabs and releases the lock so > > > might cause high overhead. It can be a problem if such processing is > > > done parallelly like my multi-thread work. :-/ > > > yep.. perhaps instead of more locking we need to find a way where > > only single thread do the update on hists/threads > > But that should be really rare, no? The problem is to search in one > thread (the fast path) and inserting entries (slow path), no? there're many new threads on kernel make -j25 workload ;-) jirka ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 13:20 ` Jiri Olsa @ 2015-03-30 13:59 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-30 13:59 UTC (permalink / raw) To: Jiri Olsa; +Cc: Namhyung Kim, David Ahern, Jiri Olsa, Stephane Eranian, LKML Em Mon, Mar 30, 2015 at 03:20:59PM +0200, Jiri Olsa escreveu: > On Mon, Mar 30, 2015 at 10:07:08AM -0300, Arnaldo Carvalho de Melo wrote: > > Em Mon, Mar 30, 2015 at 02:56:31PM +0200, Jiri Olsa escreveu: > > > On Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim wrote: > > > > Hi Jiri, > > > > > > > > On Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa wrote: > > > > > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: > > > > > > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote: > > > > > > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: > > > > > - th = thread__new(pid, tid); > > > > > + th = thread__new(machine, pid, tid); > > > > > if (th != NULL) { > > > > > + > > > > > + pthread_mutex_lock(&machine->threads_lock); > > > > > rb_link_node(&th->rb_node, parent, p); > > > > > rb_insert_color(&th->rb_node, &machine->threads); > > > > > + pthread_mutex_unlock(&machine->threads_lock); > > > > > > > > I think you also need to protect the rb tree traversal above. > > > > > > yep, I already have another version.. but it blows on another place ;-) > > > > Well, why? The point of refcounting is that the structure will not go > > away while we have it in the rbtree. > > > > Or are you talking about two threads trying to insert entries in the > > rbtree? Can you point where this can happen? 
> > yep, as I wrote in previous email: > > > looks like race among __machine__findnew_thread and thread__put > > over the machine->threads rb_tree insert/removal > > update thread: > perf_event__process_sample > perf_event__preprocess_sample > machine__findnew_thread > traverse machine->threads > > display thread: > perf_top__sort_new_samples > hists__delete_entries > ... > hist_entry__delete(struct hist_entry *he) > thread__zput(he->thread); > > > > > > > But this makes every sample processing grabs and releases the lock so > > > > might cause high overhead. It can be a problem if such processing is > > > > done parallelly like my multi-thread work. :-/ > > > > > yep.. perhaps instead of more locking we need to find a way where > > > only single thread do the update on hists/threads > > > > But that should be really rare, no? The problem is to search in one > > thread (the fast path) and inserting entries (slow path), no? > > there're many new threads on kernel make -j25 workload ;-) yeah, dumb me, hey, its still early here in .br ;-) - Arnaldo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 12:48 ` Namhyung Kim 2015-03-30 12:56 ` Jiri Olsa @ 2015-03-30 14:58 ` Arnaldo Carvalho de Melo 2015-03-30 15:13 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-30 14:58 UTC (permalink / raw) To: Namhyung Kim; +Cc: Jiri Olsa, David Ahern, Jiri Olsa, Stephane Eranian, LKML Em Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim escreveu: > Hi Jiri, > > On Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa wrote: > > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: > > > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote: > > > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote: > > > > > > > > SNIP > > > > > > > > > > > > > > > > 2 things: > > > > > > 1. let run for a long time. go about using the server. do lots of builds, > > > > > > etc. it takes time > > > > > > > > > > > > 2. use a box with a LOT of cpus (1024 in my case) > > > > > > > > > > > > Make sure ulimit is set to get the core. > > > > > > > > > > reproduced under 24 cpu box with kernel build (make -j25) > > > > > running on background.. will try to look closer > > > > > > > > > > perf: Segmentation fault > > > > > -------- backtrace -------- > > > > > ./perf[0x4fd79b] > > > > > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0] > > > > > ./perf(thread__put+0x5b)[0x4b1a7b] > > > > > ./perf(hists__delete_entries+0x70)[0x4c8670] > > > > > ./perf[0x436a88] > > > > > ./perf[0x4fa73d] > > > > > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437] > > > > > ./perf[0x4381d0] > > > > > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5] > > > > > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d] > > > > > [0x0] > > > > > > > > looks like race among __machine__findnew_thread and thread__put > > > > over the machine->threads rb_tree insert/removal > > > > > > > > is there a reason why thread__put does not erase itself from machine->threads? > > > > that was the reason.. 
we do this separately.. not in thread__put.. > > is there a reason for this? ;-) > > > > testing attached patch.. > > > > jirka > > > > > > --- > > diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c > > index f7fb258..966564a 100644 > > --- a/tools/perf/util/build-id.c > > +++ b/tools/perf/util/build-id.c > > @@ -60,7 +60,6 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused, > > event->fork.ppid, event->fork.ptid); > > > > if (thread) { > > - rb_erase(&thread->rb_node, &machine->threads); > > if (machine->last_match == thread) > > thread__zput(machine->last_match); > > thread__put(thread); > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > index e335330..a8443ef 100644 > > --- a/tools/perf/util/machine.c > > +++ b/tools/perf/util/machine.c > > @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) > > dsos__init(&machine->kernel_dsos); > > > > machine->threads = RB_ROOT; > > + pthread_mutex_init(&machine->threads_lock, NULL); > > INIT_LIST_HEAD(&machine->dead_threads); > > machine->last_match = NULL; > > > > @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine, > > if (!create) > > return NULL; > > > > - th = thread__new(pid, tid); > > + th = thread__new(machine, pid, tid); > > if (th != NULL) { > > + > > + pthread_mutex_lock(&machine->threads_lock); > > rb_link_node(&th->rb_node, parent, p); > > rb_insert_color(&th->rb_node, &machine->threads); > > + pthread_mutex_unlock(&machine->threads_lock); > > I think you also need to protect the rb tree traversal above. > > But this makes every sample processing grabs and releases the lock so > might cause high overhead. It can be a problem if such processing is > done parallelly like my multi-thread work. :-/ Still untested, using rw lock, next step is auditing the machine__findnew_thread users that really should be using machine__find_thread, i.e. 
grabbing just the reader lock, and measuring the overhead of using a pthread rw lock instead of pthread_mutex_t as Jiri is doing. - Arnaldo diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c index f7fb2587df69..e3c80bab47a3 100644 --- a/tools/perf/util/build-id.c +++ b/tools/perf/util/build-id.c @@ -60,7 +60,9 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused, event->fork.ppid, event->fork.ptid); if (thread) { + pthread_rwlock_wrlock(&machine->threads_lock); rb_erase(&thread->rb_node, &machine->threads); + pthread_rwlock_unlock(&machine->threads_lock); if (machine->last_match == thread) thread__zput(machine->last_match); thread__put(thread); diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index e45c8f33a8fd..b901ed27a793 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid) dsos__init(&machine->kernel_dsos); machine->threads = RB_ROOT; + pthread_rwlock_init(&machine->threads_lock, NULL); INIT_LIST_HEAD(&machine->dead_threads); machine->last_match = NULL; @@ -111,6 +112,7 @@ void machine__exit(struct machine *machine) vdso__exit(machine); zfree(&machine->root_dir); zfree(&machine->current_tid); + pthread_rwlock_destroy(&machines->threads_lock); } void machine__delete(struct machine *machine) @@ -411,13 +413,22 @@ static struct thread *__machine__findnew_thread(struct machine *machine, struct thread *machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid) { - return __machine__findnew_thread(machine, pid, tid, true); + struct thread *th; + + pthread_rwlock_wrlock(&machine->threads_lock); + th = __machine__findnew_thread(machine, pid, tid, true); + pthread_rwlock_unlock(&machine->threads_lock); + return th; } struct thread *machine__find_thread(struct machine *machine, pid_t pid, pid_t tid) { - return __machine__findnew_thread(machine, pid, tid, false); + struct thread *th; + 
pthread_rwlock_rdlock(&machine->threads_lock); + th = __machine__findnew_thread(machine, pid, tid, false); + pthread_rwlock_unlock(&machine->threads_lock); + return th; } struct comm *machine__thread_exec_comm(struct machine *machine, @@ -1258,7 +1269,9 @@ static void machine__remove_thread(struct machine *machine, struct thread *th) if (machine->last_match == th) thread__zput(machine->last_match); + pthread_rwlock_wrlock(&machine->threads_lock); rb_erase(&th->rb_node, &machine->threads); + pthread_rwlock_unlock(&machine->threads_lock); /* * Move it first to the dead_threads list, then drop the reference, * if this is the last reference, then the thread__delete destructor diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h index e2faf3b47e7b..c2b9402921fc 100644 --- a/tools/perf/util/machine.h +++ b/tools/perf/util/machine.h @@ -30,6 +30,7 @@ struct machine { bool comm_exec; char *root_dir; struct rb_root threads; + pthread_rwlock_t threads_lock; struct list_head dead_threads; struct thread *last_match; struct vdso_info *vdso_info; ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 14:58 ` Arnaldo Carvalho de Melo @ 2015-03-30 15:13 ` Arnaldo Carvalho de Melo 2015-03-31 0:27 ` Namhyung Kim 0 siblings, 1 reply; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-30 15:13 UTC (permalink / raw) To: Namhyung Kim; +Cc: Jiri Olsa, David Ahern, Jiri Olsa, Stephane Eranian, LKML Em Mon, Mar 30, 2015 at 11:58:05AM -0300, Arnaldo Carvalho de Melo escreveu: > Em Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim escreveu: > > But this makes every sample processing grabs and releases the lock so > > might cause high overhead. It can be a problem if such processing is > > done parallelly like my multi-thread work. :-/ > > Still untested, using rw lock, next step is auditing the > machine__findnew_thread users that really should be using > machine__find_thread, i.e. grabbing just the reader lock, and measuring > the overhead of using a pthread rw lock instead of pthread_mutex_t as > Jiri is doing. Don't bother trying it, doesn't even compile ;-\ As I said, untested, more as a heads up, will fix it up properly and test after lunch. - Arnaldo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 15:13 ` Arnaldo Carvalho de Melo @ 2015-03-31 0:27 ` Namhyung Kim 2015-03-31 0:46 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 24+ messages in thread From: Namhyung Kim @ 2015-03-31 0:27 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Jiri Olsa, David Ahern, Jiri Olsa, Stephane Eranian, LKML On Mon, Mar 30, 2015 at 12:13:03PM -0300, Arnaldo Carvalho de Melo wrote: > Em Mon, Mar 30, 2015 at 11:58:05AM -0300, Arnaldo Carvalho de Melo escreveu: > > Em Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim escreveu: > > > But this makes every sample processing grabs and releases the lock so > > > might cause high overhead. It can be a problem if such processing is > > > done parallelly like my multi-thread work. :-/ > > > > Still untested, using rw lock, next step is auditing the > > machine__findnew_thread users that really should be using > > machine__find_thread, i.e. grabbing just the reader lock, and measuring > > the overhead of using a pthread rw lock instead of pthread_mutex_t as > > Jiri is doing. > > Don't bother trying it, doesn't even compile ;-\ OK. :) But I think rw lock still has not-so-low overhead as it involves atomic operations and cache misses. Thanks, Namhyung ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-31 0:27 ` Namhyung Kim @ 2015-03-31 0:46 ` Arnaldo Carvalho de Melo 2015-03-31 7:21 ` Namhyung Kim 0 siblings, 1 reply; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-31 0:46 UTC (permalink / raw) To: Namhyung Kim; +Cc: Jiri Olsa, David Ahern, Jiri Olsa, Stephane Eranian, LKML Em Tue, Mar 31, 2015 at 09:27:30AM +0900, Namhyung Kim escreveu: > On Mon, Mar 30, 2015 at 12:13:03PM -0300, Arnaldo Carvalho de Melo wrote: > > Em Mon, Mar 30, 2015 at 11:58:05AM -0300, Arnaldo Carvalho de Melo escreveu: > > > Em Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim escreveu: > > > > But this makes every sample processing grabs and releases the lock so > > > > might cause high overhead. It can be a problem if such processing is > > > > done parallelly like my multi-thread work. :-/ > > > > > > Still untested, using rw lock, next step is auditing the > > > machine__findnew_thread users that really should be using > > > machine__find_thread, i.e. grabbing just the reader lock, and measuring > > > the overhead of using a pthread rw lock instead of pthread_mutex_t as > > > Jiri is doing. > > > > Don't bother trying it, doesn't even compile ;-\ > > OK. :) > > But I think rw lock still has not-so-low overhead as it involves > atomic operations and cache misses. But we will have to serialize access to the data structure at some point... - Arnaldo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-31 0:46 ` Arnaldo Carvalho de Melo @ 2015-03-31 7:21 ` Namhyung Kim 0 siblings, 0 replies; 24+ messages in thread From: Namhyung Kim @ 2015-03-31 7:21 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Jiri Olsa, David Ahern, Jiri Olsa, Stephane Eranian, LKML On Mon, Mar 30, 2015 at 09:46:35PM -0300, Arnaldo Carvalho de Melo wrote: > Em Tue, Mar 31, 2015 at 09:27:30AM +0900, Namhyung Kim escreveu: > > On Mon, Mar 30, 2015 at 12:13:03PM -0300, Arnaldo Carvalho de Melo wrote: > > > Em Mon, Mar 30, 2015 at 11:58:05AM -0300, Arnaldo Carvalho de Melo escreveu: > > > > Em Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim escreveu: > > > > > But this makes every sample processing grabs and releases the lock so > > > > > might cause high overhead. It can be a problem if such processing is > > > > > done parallelly like my multi-thread work. :-/ > > > > > > > > Still untested, using rw lock, next step is auditing the > > > > machine__findnew_thread users that really should be using > > > > machine__find_thread, i.e. grabbing just the reader lock, and measuring > > > > the overhead of using a pthread rw lock instead of pthread_mutex_t as > > > > Jiri is doing. > > > > > > Don't bother trying it, doesn't even compile ;-\ > > > > OK. :) > > > > But I think rw lock still has not-so-low overhead as it involves > > atomic operations and cache misses. > > But we will have to serialize access to the data structure at some > point... Yes, as long as we keep ref-counting. I'm guessing if we only focus on the perf top case, there might be a way to cleanup dead threads without ref-counting (i.e. w/o affecting fastpath on the perf report). Thanks, Namhyung ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt 2015-03-30 11:49 ` Jiri Olsa 2015-03-30 12:48 ` Namhyung Kim @ 2015-03-30 13:22 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 24+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-03-30 13:22 UTC (permalink / raw) To: Jiri Olsa; +Cc: David Ahern, Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML Em Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa escreveu: > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote: > > > looks like race among __machine__findnew_thread and thread__put > > > over the machine->threads rb_tree insert/removal > > > is there a reason why thread__put does not erase itself from machine->threads? > that was the reason.. we do this separately.. not in thread__put.. > is there a reason for this? ;-) > testing attached patch.. > void thread__put(struct thread *thread) > { > if (thread && --thread->refcnt == 0) { > + struct machine *machine = thread->machine; > + > list_del_init(&thread->node); > + > + pthread_mutex_lock(&machine->threads_lock); > + rb_erase(&thread->rb_node, &machine->threads); > + pthread_mutex_unlock(&machine->threads_lock); > + > thread__delete(thread); You can't do that, look at the definition of struct thread: struct thread { union { struct rb_node rb_node; struct list_head node; }; I.e. a thread can't be at the same time in the rbtree (live threads) and on the list (dead threads). When the thread dies (PERF_RECORD_EXIT) we transition it from the live threads rb_tree to the dead_threads list, where it will wait till the last hist_entry holding a reference to it to call thread__put(), when we will want to remove it _only_ from the dead_threads _list_. - Arnaldo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt
  2015-03-30 10:22 ` Jiri Olsa
  2015-03-30 11:21 ` Jiri Olsa
@ 2015-03-30 13:09 ` Arnaldo Carvalho de Melo
  2015-03-30 13:17 ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 24+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-30 13:09 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: David Ahern, Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML

Em Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa escreveu:
> On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote:
>
> SNIP
>
> > > 2 things:
> > > 1. let it run for a long time. go about using the server. do lots of builds,
> > >    etc. it takes time
> > >
> > > 2. use a box with a LOT of cpus (1024 in my case)
> > >
> > > Make sure ulimit is set to get the core.
> >
> > reproduced on a 24 cpu box with a kernel build (make -j25)
> > running in the background.. will try to look closer
> >
> > perf: Segmentation fault
> > -------- backtrace --------
> > ./perf[0x4fd79b]
> > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0]
> > ./perf(thread__put+0x5b)[0x4b1a7b]
> > ./perf(hists__delete_entries+0x70)[0x4c8670]
> > ./perf[0x436a88]
> > ./perf[0x4fa73d]
> > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437]
> > ./perf[0x4381d0]
> > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5]
> > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d]
> > [0x0]
>
> looks like a race between __machine__findnew_thread and thread__put
> over the machine->threads rb_tree insert/removal
>
> is there a reason why thread__put does not erase itself from machine->threads?

thread__put() should only remove stuff from the rbtree when the refcount
hits zero, and then it should grab a lock (that is the buggy part, as this
lock isn't there) and remove the entry from the rbtree.

Now I see what Namhyung was talking about...

- Arnaldo

> I'm trying attached patch.. so far so gut ;-)
>
> jirka
>
> ---
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index e335330..7e6abc7 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
>  	dsos__init(&machine->kernel_dsos);
>  
>  	machine->threads = RB_ROOT;
> +	pthread_mutex_init(&machine->threads_lock, NULL);
>  	INIT_LIST_HEAD(&machine->dead_threads);
>  	machine->last_match = NULL;
>  
> @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine,
>  	if (!create)
>  		return NULL;
>  
> -	th = thread__new(pid, tid);
> +	th = thread__new(machine, pid, tid);
>  	if (th != NULL) {
> +
> +		pthread_mutex_lock(&machine->threads_lock);
>  		rb_link_node(&th->rb_node, parent, p);
>  		rb_insert_color(&th->rb_node, &machine->threads);
> +		pthread_mutex_unlock(&machine->threads_lock);
>  
>  		/*
>  		 * We have to initialize map_groups separately
> @@ -394,8 +398,10 @@ static struct thread *__machine__findnew_thread(struct machine *machine,
>  		 * leader and that would screwed the rb tree.
>  		 */
>  		if (thread__init_map_groups(th, machine)) {
> +			pthread_mutex_lock(&machine->threads_lock);
>  			rb_erase(&th->rb_node, &machine->threads);
>  			thread__delete(th);
> +			pthread_mutex_unlock(&machine->threads_lock);
>  			return NULL;
>  		}
>  		/*
> diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
> index e2faf3b..e3468d6 100644
> --- a/tools/perf/util/machine.h
> +++ b/tools/perf/util/machine.h
> @@ -30,6 +30,7 @@ struct machine {
>  	bool comm_exec;
>  	char *root_dir;
>  	struct rb_root threads;
> +	pthread_mutex_t threads_lock;
>  	struct list_head dead_threads;
>  	struct thread *last_match;
>  	struct vdso_info *vdso_info;
> diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> index 1c8fbc9..4592fc4 100644
> --- a/tools/perf/util/thread.c
> +++ b/tools/perf/util/thread.c
> @@ -26,7 +26,7 @@ int thread__init_map_groups(struct thread *thread, struct machine *machine)
>  	return thread->mg ? 0 : -1;
>  }
>  
> -struct thread *thread__new(pid_t pid, pid_t tid)
> +struct thread *thread__new(struct machine *machine, pid_t pid, pid_t tid)
>  {
>  	char *comm_str;
>  	struct comm *comm;
> @@ -38,6 +38,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
>  	thread->ppid = -1;
>  	thread->cpu = -1;
>  	INIT_LIST_HEAD(&thread->comm_list);
> +	thread->machine = machine;
>  
>  	if (unwind__prepare_access(thread) < 0)
>  		goto err_thread;
> @@ -91,7 +92,14 @@ struct thread *thread__get(struct thread *thread)
>  void thread__put(struct thread *thread)
>  {
>  	if (thread && --thread->refcnt == 0) {
> +		struct machine *machine = thread->machine;
> +
>  		list_del_init(&thread->node);
> +
> +		pthread_mutex_lock(&machine->threads_lock);
> +		rb_erase(&thread->rb_node, &machine->threads);
> +		pthread_mutex_unlock(&machine->threads_lock);
> +
>  		thread__delete(thread);
>  	}
>  }
> diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
> index 9b8a54d..df6fb69 100644
> --- a/tools/perf/util/thread.h
> +++ b/tools/perf/util/thread.h
> @@ -31,12 +31,13 @@ struct thread {
>  
>  	void *priv;
>  	struct thread_stack *ts;
> +	struct machine *machine;
>  };
>  
>  struct machine;
>  struct comm;
>  
> -struct thread *thread__new(pid_t pid, pid_t tid);
> +struct thread *thread__new(struct machine *machine, pid_t pid, pid_t tid);
>  int thread__init_map_groups(struct thread *thread, struct machine *machine);
>  void thread__delete(struct thread *thread);

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [BUG] segfault in perf-top -- thread refcnt
  2015-03-30 10:22 ` Jiri Olsa
  2015-03-30 11:21 ` Jiri Olsa
  2015-03-30 13:09 ` Arnaldo Carvalho de Melo
@ 2015-03-30 13:17 ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 24+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-30 13:17 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: David Ahern, Jiri Olsa, Namhyung Kim, Stephane Eranian, LKML

Em Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa escreveu:
> On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote:
> looks like a race between __machine__findnew_thread and thread__put
> over the machine->threads rb_tree insert/removal
>
> is there a reason why thread__put does not erase itself from machine->threads?

IIRC it has to first pass through machine__remove_thread(), which drops the
reference obtained when the thread was inserted into the rbtree and moves it
to the dead_threads list, where it stays until the last reference, obtained
when the thread was added to some hist_entry, is finally dropped via
thread__put().

- Arnaldo

^ permalink raw reply	[flat|nested] 24+ messages in thread
end of thread, other threads:[~2015-03-31 7:27 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-27 17:31 [BUG] segfault in perf-top -- thread refcnt David Ahern
2015-03-27 19:51 ` Arnaldo Carvalho de Melo
2015-03-27 20:11 ` Arnaldo Carvalho de Melo
2015-03-27 20:13 ` David Ahern
2015-03-30  8:07 ` Jiri Olsa
2015-03-30 10:22 ` Jiri Olsa
2015-03-30 11:21 ` Jiri Olsa
2015-03-30 11:49 ` Jiri Olsa
2015-03-30 12:48 ` Namhyung Kim
2015-03-30 12:56 ` Jiri Olsa
2015-03-30 13:06 ` Namhyung Kim
2015-03-30 14:02 ` Arnaldo Carvalho de Melo
2015-03-31  0:15 ` Namhyung Kim
2015-03-30 13:07 ` Arnaldo Carvalho de Melo
2015-03-30 13:20 ` Jiri Olsa
2015-03-30 13:59 ` Arnaldo Carvalho de Melo
2015-03-30 14:58 ` Arnaldo Carvalho de Melo
2015-03-30 15:13 ` Arnaldo Carvalho de Melo
2015-03-31  0:27 ` Namhyung Kim
2015-03-31  0:46 ` Arnaldo Carvalho de Melo
2015-03-31  7:21 ` Namhyung Kim
2015-03-30 13:22 ` Arnaldo Carvalho de Melo
2015-03-30 13:09 ` Arnaldo Carvalho de Melo
2015-03-30 13:17 ` Arnaldo Carvalho de Melo