From: Namhyung Kim <namhyung@kernel.org>
To: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>,
	Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Stephane Eranian <eranian@google.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] segfault in perf-top -- thread refcnt
Date: Mon, 30 Mar 2015 22:06:35 +0900
Message-ID: <CAM9d7chr8SuH+6Qoa2mmY2OdemD1KMmbUv_ODS7HLiiZ2xGYGg@mail.gmail.com>
In-Reply-To: <20150330125631.GI1413@krava>

On Mon, Mar 30, 2015 at 9:56 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> On Mon, Mar 30, 2015 at 09:48:52PM +0900, Namhyung Kim wrote:
>> Hi Jiri,
>>
>> On Mon, Mar 30, 2015 at 01:49:07PM +0200, Jiri Olsa wrote:
>> > On Mon, Mar 30, 2015 at 01:21:08PM +0200, Jiri Olsa wrote:
>> > > On Mon, Mar 30, 2015 at 12:22:20PM +0200, Jiri Olsa wrote:
>> > > > On Mon, Mar 30, 2015 at 10:07:37AM +0200, Jiri Olsa wrote:
>> > > >
>> > > > SNIP
>> > > >
>> > > > > >
>> > > > > > 2 things:
>> > > > > > 1. let run for a long time. go about using the server. do lots of builds,
>> > > > > > etc. it takes time
>> > > > > >
>> > > > > > 2. use a box with a LOT of cpus (1024 in my case)
>> > > > > >
>> > > > > > Make sure ulimit is set to get the core.
>> > > > >
>> > > > > reproduced under 24 cpu box with kernel build (make -j25)
>> > > > > running on background.. will try to look closer
>> > > > >
>> > > > > perf: Segmentation fault
>> > > > > -------- backtrace --------
>> > > > > ./perf[0x4fd79b]
>> > > > > /lib64/libc.so.6(+0x358f0)[0x7f9cbff528f0]
>> > > > > ./perf(thread__put+0x5b)[0x4b1a7b]
>> > > > > ./perf(hists__delete_entries+0x70)[0x4c8670]
>> > > > > ./perf[0x436a88]
>> > > > > ./perf[0x4fa73d]
>> > > > > ./perf(perf_evlist__tui_browse_hists+0x97)[0x4fc437]
>> > > > > ./perf[0x4381d0]
>> > > > > /lib64/libpthread.so.0(+0x7ee5)[0x7f9cc1ff2ee5]
>> > > > > /lib64/libc.so.6(clone+0x6d)[0x7f9cc0011b8d]
>> > > > > [0x0]
>> > > >
>> > > > looks like race among __machine__findnew_thread and thread__put
>> > > > over the machine->threads rb_tree insert/removal
>> > > >
>> > > > is there a reason why thread__put does not erase itself from machine->threads?
>> >
>> > that was the reason.. we do this separately.. not in thread__put..
>> > is there a reason for this? ;-)
>> >
>> > testing attached patch..
>> >
>> > jirka
>> >
>> >
>> > ---
>> > diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
>> > index f7fb258..966564a 100644
>> > --- a/tools/perf/util/build-id.c
>> > +++ b/tools/perf/util/build-id.c
>> > @@ -60,7 +60,6 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused,
>> >                 event->fork.ppid, event->fork.ptid);
>> >
>> >     if (thread) {
>> > -           rb_erase(&thread->rb_node, &machine->threads);
>> >             if (machine->last_match == thread)
>> >                     thread__zput(machine->last_match);
>> >             thread__put(thread);
>> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
>> > index e335330..a8443ef 100644
>> > --- a/tools/perf/util/machine.c
>> > +++ b/tools/perf/util/machine.c
>> > @@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
>> >     dsos__init(&machine->kernel_dsos);
>> >
>> >     machine->threads = RB_ROOT;
>> > +   pthread_mutex_init(&machine->threads_lock, NULL);
>> >     INIT_LIST_HEAD(&machine->dead_threads);
>> >     machine->last_match = NULL;
>> >
>> > @@ -380,10 +381,13 @@ static struct thread *__machine__findnew_thread(struct machine *machine,
>> >     if (!create)
>> >             return NULL;
>> >
>> > -   th = thread__new(pid, tid);
>> > +   th = thread__new(machine, pid, tid);
>> >     if (th != NULL) {
>> > +
>> > +           pthread_mutex_lock(&machine->threads_lock);
>> >             rb_link_node(&th->rb_node, parent, p);
>> >             rb_insert_color(&th->rb_node, &machine->threads);
>> > +           pthread_mutex_unlock(&machine->threads_lock);
>>
>> I think you also need to protect the rb tree traversal above.
>
> yep, I already have another version.. but it blows on another place ;-)
>
>>
>> But this makes every sample processing grabs and releases the lock so
>> might cause high overhead.  It can be a problem if such processing is
>> done parallelly like my multi-thread work. :-/
>
> yep.. perhaps instead of more locking we need to find a way where
> only single thread do the update on hists/threads

Agreed.

AFAIK the reason we do ref-counting is to clean up dead/exited threads
in live sessions like perf top.  In that case we could mark a thread
as to-be-deleted and free it later at a safe time and place.
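
To make the idea concrete, here is a rough, untested-in-perf sketch of
"mark now, free later".  None of these names are the real perf API: the
structs are simplified stand-ins, a plain linked list replaces the
rb tree, and I assume a single thread owns the lists and calls the reap
function at a known safe point (for example between display refreshes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for perf's structures. */
struct thread {
	int tid;
	int refcnt;
	bool dead;		/* marked for deletion, not yet freed */
	struct thread *next;	/* link in the live or the dead list */
};

struct machine {
	struct thread *threads;		/* live threads */
	struct thread *dead_threads;	/* waiting to be reaped */
};

/* Create a thread; the machine holds the initial reference. */
struct thread *machine_add_thread(struct machine *m, int tid)
{
	struct thread *t = calloc(1, sizeof(*t));

	if (t == NULL)
		return NULL;
	t->tid = tid;
	t->refcnt = 1;
	t->next = m->threads;
	m->threads = t;
	return t;
}

struct thread *thread_get(struct thread *t)
{
	t->refcnt++;
	return t;
}

/* Drop a reference.  Never frees: freeing is left to the reaper. */
void thread_put(struct thread *t)
{
	assert(t->refcnt > 0);
	t->refcnt--;
}

/* On exit: unlink from the live list, mark, and park on the dead list. */
void machine_mark_dead(struct machine *m, struct thread *t)
{
	struct thread **pp;

	for (pp = &m->threads; *pp; pp = &(*pp)->next) {
		if (*pp != t)
			continue;
		*pp = t->next;
		t->dead = true;
		t->next = m->dead_threads;
		m->dead_threads = t;
		thread_put(t);		/* drop the machine's reference */
		return;
	}
}

/* Safe point: free dead threads nobody references.  Returns the count. */
int machine_reap_dead(struct machine *m)
{
	struct thread **pp = &m->dead_threads;
	int freed = 0;

	while (*pp) {
		struct thread *t = *pp;

		if (t->refcnt == 0) {
			*pp = t->next;
			free(t);
			freed++;
		} else {
			pp = &t->next;
		}
	}
	return freed;
}
```

With this shape a thread that is still referenced (say, by a hist
entry) survives the reap and is only freed on a later pass, once the
last reference has been dropped.  The reference count itself would
still need to be atomic if puts can come from more than one thread.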

Thanks,
Namhyung
