All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf sched latency: Fix removed thread issue
@ 2015-11-02 11:10 Jiri Olsa
  2015-11-02 22:27 ` Namhyung Kim
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jiri Olsa @ 2015-11-02 11:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
	Mohit Agrawal

If machine's thread gets excited (EXIT event is received),
we set thread->dead = true and it is later on removed from
machine's tree if the pid is reused on new thread.

The latency subcommand holds tree of working atoms sorted
by thread's pid/tid. If there's new thread with same pid
and tid, the old working atom is found and assert bug
condition is hit in search function:

  thread_atoms_search: Assertion `!(thread != atoms->thread)' failed

Changing the sort function to use thread object pointers
together with pid and tid check. This way new thread will
never find old one with same pid/tid.

I think we could change this to the sort based on timestamp
of thread creation, once it's added within Namhyung's thread
patchset.

Reported-by: Mohit Agrawal <moagrawa@redhat.com>
Link: http://lkml.kernel.org/n/tip-o4doazhhv0zax5zshkg8hnys@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-sched.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 0ee6d900e100..e3d3e32c0a93 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1203,12 +1203,13 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
 
 static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
 {
+	if (l->thread == r->thread)
+		return 0;
 	if (l->thread->tid < r->thread->tid)
 		return -1;
 	if (l->thread->tid > r->thread->tid)
 		return 1;
-
-	return 0;
+	return (int)(l->thread - r->thread);
 }
 
 static int avg_cmp(struct work_atoms *l, struct work_atoms *r)
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] perf sched latency: Fix removed thread issue
  2015-11-02 11:10 [PATCH] perf sched latency: Fix removed thread issue Jiri Olsa
@ 2015-11-02 22:27 ` Namhyung Kim
  2015-11-03  8:04   ` Jiri Olsa
  2015-11-02 22:53 ` Arnaldo Carvalho de Melo
  2015-11-08  7:31 ` [tip:perf/urgent] perf sched latency: Fix thread pid reuse issue tip-bot for Jiri Olsa
  2 siblings, 1 reply; 8+ messages in thread
From: Namhyung Kim @ 2015-11-02 22:27 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, lkml, David Ahern, Ingo Molnar,
	Peter Zijlstra, Mohit Agrawal

On Mon, Nov 02, 2015 at 12:10:25PM +0100, Jiri Olsa wrote:
> If machine's thread gets excited (EXIT event is received),

Why a thread get *excited* when it received EXIT event? :)


> we set thread->dead = true and it is later on removed from
> machine's tree if the pid is reused on new thread.
> 
> The latency subcommand holds tree of working atoms sorted
> by thread's pid/tid. If there's new thread with same pid
> and tid, the old working atom is found and assert bug
> condition is hit in search function:
> 
>   thread_atoms_search: Assertion `!(thread != atoms->thread)' failed
> 
> Changing the sort function to use thread object pointers
> together with pid and tid check. This way new thread will
> never find old one with same pid/tid.
> 
> I think we could change this to the sort based on timestamp
> of thread creation, once it's added within Namhyung's thread
> patchset.

Right.  I'll work on the v6 soon.

As a work around:

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung


> 
> Reported-by: Mohit Agrawal <moagrawa@redhat.com>
> Link: http://lkml.kernel.org/n/tip-o4doazhhv0zax5zshkg8hnys@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/builtin-sched.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 0ee6d900e100..e3d3e32c0a93 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1203,12 +1203,13 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>  
>  static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
>  {
> +	if (l->thread == r->thread)
> +		return 0;
>  	if (l->thread->tid < r->thread->tid)
>  		return -1;
>  	if (l->thread->tid > r->thread->tid)
>  		return 1;
> -
> -	return 0;
> +	return (int)(l->thread - r->thread);
>  }
>  
>  static int avg_cmp(struct work_atoms *l, struct work_atoms *r)
> -- 
> 2.4.3
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] perf sched latency: Fix removed thread issue
  2015-11-02 11:10 [PATCH] perf sched latency: Fix removed thread issue Jiri Olsa
  2015-11-02 22:27 ` Namhyung Kim
@ 2015-11-02 22:53 ` Arnaldo Carvalho de Melo
  2015-11-03  7:41   ` Jiri Olsa
  2015-11-08  7:31 ` [tip:perf/urgent] perf sched latency: Fix thread pid reuse issue tip-bot for Jiri Olsa
  2 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-11-02 22:53 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: lkml, David Ahern, Ingo Molnar, Namhyung Kim, Peter Zijlstra,
	Mohit Agrawal

Em Mon, Nov 02, 2015 at 12:10:25PM +0100, Jiri Olsa escreveu:
> If machine's thread gets excited (EXIT event is received),
> we set thread->dead = true and it is later on removed from
> machine's tree if the pid is reused on new thread.

> The latency subcommand holds tree of working atoms sorted
> by thread's pid/tid. If there's new thread with same pid

Humm, wher is the latency subcommand handling the EXIT event?

I see:

   perf_sched__lat
     perf_sched__read_events
       session = perf_session__new(&file, false, &sched->tool);
       perf_session__process_events(session)

And sched->tool->exit() is not set, which will make
perf_session__process_events(), when calling perf_tool__fill_defaults()
set it to process_event_stub() which will do nothing for
PERF_RECORD_EXIT events, no?

Is there some perf.data file somewhere to reproduce this problem?

- Arnaldo

> and tid, the old working atom is found and assert bug
> condition is hit in search function:
 
>   thread_atoms_search: Assertion `!(thread != atoms->thread)' failed
> 
> Changing the sort function to use thread object pointers
> together with pid and tid check. This way new thread will
> never find old one with same pid/tid.
> 
> I think we could change this to the sort based on timestamp
> of thread creation, once it's added within Namhyung's thread
> patchset.
> 
> Reported-by: Mohit Agrawal <moagrawa@redhat.com>
> Link: http://lkml.kernel.org/n/tip-o4doazhhv0zax5zshkg8hnys@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/builtin-sched.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 0ee6d900e100..e3d3e32c0a93 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1203,12 +1203,13 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>  
>  static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
>  {
> +	if (l->thread == r->thread)
> +		return 0;
>  	if (l->thread->tid < r->thread->tid)
>  		return -1;
>  	if (l->thread->tid > r->thread->tid)
>  		return 1;
> -
> -	return 0;
> +	return (int)(l->thread - r->thread);
>  }
>  
>  static int avg_cmp(struct work_atoms *l, struct work_atoms *r)
> -- 
> 2.4.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] perf sched latency: Fix removed thread issue
  2015-11-02 22:53 ` Arnaldo Carvalho de Melo
@ 2015-11-03  7:41   ` Jiri Olsa
  2015-11-03 18:33     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: Jiri Olsa @ 2015-11-03  7:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, lkml, David Ahern, Ingo Molnar, Namhyung Kim,
	Peter Zijlstra, Mohit Agrawal

On Mon, Nov 02, 2015 at 07:53:53PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 02, 2015 at 12:10:25PM +0100, Jiri Olsa escreveu:
> > If machine's thread gets excited (EXIT event is received),
> > we set thread->dead = true and it is later on removed from
> > machine's tree if the pid is reused on new thread.
> 
> > The latency subcommand holds tree of working atoms sorted
> > by thread's pid/tid. If there's new thread with same pid
> 
> Humm, wher is the latency subcommand handling the EXIT event?
> 
> I see:
> 
>    perf_sched__lat
>      perf_sched__read_events
>        session = perf_session__new(&file, false, &sched->tool);
>        perf_session__process_events(session)
> 
> And sched->tool->exit() is not set, which will make
> perf_session__process_events(), when calling perf_tool__fill_defaults()
> set it to process_event_stub() which will do nothing for
> PERF_RECORD_EXIT events, no?

yep, latency command does not handle EXIT event, but the
thread is removed via FORK event.. the first changelog
paragraph might be a little misleading sorry ;-)

could you please change it to:

---
If machine's thread gets excited (EXIT event is received),
we set thread->dead = true and it is later on removed from
machine's tree if the pid is reused on new thread.

We dont handle EXIT command in 'perf sched latency',
however the old thread is removed anyway when FORK
event is received for new thrad with same pid/tid.
---

thanks,
jirka

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] perf sched latency: Fix removed thread issue
  2015-11-02 22:27 ` Namhyung Kim
@ 2015-11-03  8:04   ` Jiri Olsa
  0 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2015-11-03  8:04 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, David Ahern,
	Ingo Molnar, Peter Zijlstra, Mohit Agrawal

On Tue, Nov 03, 2015 at 07:27:21AM +0900, Namhyung Kim wrote:
> On Mon, Nov 02, 2015 at 12:10:25PM +0100, Jiri Olsa wrote:
> > If machine's thread gets excited (EXIT event is received),
> 
> Why a thread get *excited* when it received EXIT event? :)

wouldn't you? I would.. ;-)

jirka

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] perf sched latency: Fix removed thread issue
  2015-11-03  7:41   ` Jiri Olsa
@ 2015-11-03 18:33     ` Arnaldo Carvalho de Melo
  2015-11-04  7:34       ` Jiri Olsa
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-11-03 18:33 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, lkml, David Ahern, Ingo Molnar, Namhyung Kim,
	Peter Zijlstra, Mohit Agrawal

Em Tue, Nov 03, 2015 at 08:41:48AM +0100, Jiri Olsa escreveu:
> On Mon, Nov 02, 2015 at 07:53:53PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Nov 02, 2015 at 12:10:25PM +0100, Jiri Olsa escreveu:
> > > If machine's thread gets excited (EXIT event is received),
> > > we set thread->dead = true and it is later on removed from
> > > machine's tree if the pid is reused on new thread.
> > 
> > > The latency subcommand holds tree of working atoms sorted
> > > by thread's pid/tid. If there's new thread with same pid
> > 
> > Humm, wher is the latency subcommand handling the EXIT event?
> > 
> > I see:
> > 
> >    perf_sched__lat
> >      perf_sched__read_events
> >        session = perf_session__new(&file, false, &sched->tool);
> >        perf_session__process_events(session)
> > 
> > And sched->tool->exit() is not set, which will make
> > perf_session__process_events(), when calling perf_tool__fill_defaults()
> > set it to process_event_stub() which will do nothing for
> > PERF_RECORD_EXIT events, no?
> 
> yep, latency command does not handle EXIT event, but the
> thread is removed via FORK event.. the first changelog

So its not related to processing an EXIT event as described in the
changelog, ok. And I don't see where is that thread->dead is set, i.e.
the sequence is:

  machine__process_fork_event()
    machine__remove_thread()

The only place where thread->dead is set to true is in thread__exited()
and that is only called by machine__process_exit_event(), which is never
called by 'perf sched'.

It is not "later removed from machine's tree", it is removed straight
away, in __machine__remove_thread().

Anyway, I'm downloading that perf.data file to try to debug this here.

- Arnaldo

> paragraph might be a little misleading sorry ;-)
> 
> could you please change it to:
> 
> ---
> If machine's thread gets excited (EXIT event is received),
> we set thread->dead = true and it is later on removed from
> machine's tree if the pid is reused on new thread.
> 
> We dont handle EXIT command in 'perf sched latency',
> however the old thread is removed anyway when FORK
> event is received for new thrad with same pid/tid.
> ---
> 
> thanks,
> jirka

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] perf sched latency: Fix removed thread issue
  2015-11-03 18:33     ` Arnaldo Carvalho de Melo
@ 2015-11-04  7:34       ` Jiri Olsa
  0 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2015-11-04  7:34 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, lkml, David Ahern, Ingo Molnar, Namhyung Kim,
	Peter Zijlstra, Mohit Agrawal

On Tue, Nov 03, 2015 at 03:33:10PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 03, 2015 at 08:41:48AM +0100, Jiri Olsa escreveu:
> > On Mon, Nov 02, 2015 at 07:53:53PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Nov 02, 2015 at 12:10:25PM +0100, Jiri Olsa escreveu:
> > > > If machine's thread gets excited (EXIT event is received),
> > > > we set thread->dead = true and it is later on removed from
> > > > machine's tree if the pid is reused on new thread.
> > > 
> > > > The latency subcommand holds tree of working atoms sorted
> > > > by thread's pid/tid. If there's new thread with same pid
> > > 
> > > Humm, wher is the latency subcommand handling the EXIT event?
> > > 
> > > I see:
> > > 
> > >    perf_sched__lat
> > >      perf_sched__read_events
> > >        session = perf_session__new(&file, false, &sched->tool);
> > >        perf_session__process_events(session)
> > > 
> > > And sched->tool->exit() is not set, which will make
> > > perf_session__process_events(), when calling perf_tool__fill_defaults()
> > > set it to process_event_stub() which will do nothing for
> > > PERF_RECORD_EXIT events, no?
> > 
> > yep, latency command does not handle EXIT event, but the
> > thread is removed via FORK event.. the first changelog
> 
> So its not related to processing an EXIT event as described in the
> changelog, ok. And I don't see where is that thread->dead is set, i.e.
> the sequence is:
> 
>   machine__process_fork_event()
>     machine__remove_thread()
> 
> The only place where thread->dead is set to true is in thread__exited()
> and that is only called by machine__process_exit_event(), which is never
> called by 'perf sched'.
> 
> It is not "later removed from machine's tree", it is removed straight
> away, in __machine__remove_thread().

it is removed later via FORK event, as stated in changelog
and in updated changelog I sent in previous email:

SNIP

> > could you please change it to:
> > 
> > ---
> > If machine's thread gets excited (EXIT event is received),
> > we set thread->dead = true and it is later on removed from
> > machine's tree if the pid is reused on new thread.
> > 
> > We dont handle EXIT command in 'perf sched latency',
> > however the old thread is removed anyway when FORK
> > event is received for new thrad with same pid/tid.
> > ---

thanks,
jirka

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tip:perf/urgent] perf sched latency: Fix thread pid reuse issue
  2015-11-02 11:10 [PATCH] perf sched latency: Fix removed thread issue Jiri Olsa
  2015-11-02 22:27 ` Namhyung Kim
  2015-11-02 22:53 ` Arnaldo Carvalho de Melo
@ 2015-11-08  7:31 ` tip-bot for Jiri Olsa
  2 siblings, 0 replies; 8+ messages in thread
From: tip-bot for Jiri Olsa @ 2015-11-08  7:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, a.p.zijlstra, acme, moagrawa, namhyung, mingo,
	jolsa, dsahern, tglx

Commit-ID:  0014de172d228e450377d1fd079d94e67128d27f
Gitweb:     http://git.kernel.org/tip/0014de172d228e450377d1fd079d94e67128d27f
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Mon, 2 Nov 2015 12:10:25 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 5 Nov 2015 12:51:00 -0300

perf sched latency: Fix thread pid reuse issue

The latency subcommand holds a tree of working atoms sorted by thread's
pid/tid. If there's new thread with same pid and tid, the old working atom is
found and assert bug condition is hit in search function:

  thread_atoms_search: Assertion `!(thread != atoms->thread)' failed

Changing the sort function to use thread object pointers together with pid and
tid check. This way new thread will never find old one with same pid/tid.

Link: http://lkml.kernel.org/n/tip-o4doazhhv0zax5zshkg8hnys@git.kernel.org
Reported-by: Mohit Agrawal <moagrawa@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446462625-15807-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 0ee6d90..e3d3e32 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1203,12 +1203,13 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
 
 static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
 {
+	if (l->thread == r->thread)
+		return 0;
 	if (l->thread->tid < r->thread->tid)
 		return -1;
 	if (l->thread->tid > r->thread->tid)
 		return 1;
-
-	return 0;
+	return (int)(l->thread - r->thread);
 }
 
 static int avg_cmp(struct work_atoms *l, struct work_atoms *r)

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-11-08  7:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-02 11:10 [PATCH] perf sched latency: Fix removed thread issue Jiri Olsa
2015-11-02 22:27 ` Namhyung Kim
2015-11-03  8:04   ` Jiri Olsa
2015-11-02 22:53 ` Arnaldo Carvalho de Melo
2015-11-03  7:41   ` Jiri Olsa
2015-11-03 18:33     ` Arnaldo Carvalho de Melo
2015-11-04  7:34       ` Jiri Olsa
2015-11-08  7:31 ` [tip:perf/urgent] perf sched latency: Fix thread pid reuse issue tip-bot for Jiri Olsa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.