All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf tools: Do not fail on processing out of order event
@ 2014-11-26 15:39 Jiri Olsa
  2014-11-26 15:44 ` Arnaldo Carvalho de Melo
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Jiri Olsa @ 2014-11-26 15:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Linus Torvalds, Andi Kleen, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo

Linus reported perf report command being interrupted due to
processing of 'out of order' event, with following error:

  Timestamp below last timeslice flush
  0x5733a8 [0x28]: failed to process type: 3

I could reproduce the issue and in my case it was caused by one
CPU (mmap) being behind during record and userspace mmap reader
seeing the data after other CPUs data were already stored.

It'd be nice to find the source of this hiccup, meanwhile I'm
submitting change that does not kill the processing, but only
tracks the number of out of order events and warn user.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/event.h   |  1 +
 tools/perf/util/session.c | 13 +++++++------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 09b9e8d3fcf7..c4ffe2bd0738 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -242,6 +242,7 @@ struct events_stats {
 	u32 nr_invalid_chains;
 	u32 nr_unknown_id;
 	u32 nr_unprocessable_samples;
+	u32 nr_unordered_events;
 };
 
 struct attr_event {
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 6ac62ae6b8fa..17a85f527bfc 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -533,15 +533,11 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
 		return -ETIME;
 
 	if (timestamp < oe->last_flush) {
-		WARN_ONCE(1, "Timestamp below last timeslice flush\n");
-
-		pr_oe_time(timestamp,      "out of order event");
+		pr_oe_time(timestamp,      "out of order event\n");
 		pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
 			   oe->last_flush_type);
 
-		/* We could get out of order messages after forced flush. */
-		if (oe->last_flush_type != OE_FLUSH__HALF)
-			return -EINVAL;
+		s->stats.nr_unordered_events++;
 	}
 
 	new = ordered_events__new(oe, timestamp, event);
@@ -1118,6 +1114,11 @@ static void perf_session__warn_about_errors(const struct perf_session *session,
 			    "Do you have a KVM guest running and not using 'perf kvm'?\n",
 			    session->stats.nr_unprocessable_samples);
 	}
+
+	if (session->stats.nr_unordered_events != 0) {
+		ui__warning("%u out of order events recorded.\n",
+			    session->stats.nr_unordered_events);
+	}
 }
 
 volatile int session_done;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-26 15:39 [PATCH] perf tools: Do not fail on processing out of order event Jiri Olsa
@ 2014-11-26 15:44 ` Arnaldo Carvalho de Melo
  2014-11-27 10:56   ` Ingo Molnar
  2014-11-27 19:44 ` Andi Kleen
  2014-12-09 10:16 ` [tip:perf/urgent] perf session: " tip-bot for Jiri Olsa
  2 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-11-26 15:44 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Linus Torvalds, Andi Kleen, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

Em Wed, Nov 26, 2014 at 04:39:31PM +0100, Jiri Olsa escreveu:
> Linus reported perf report command being interrupted due to
> processing of 'out of order' event, with following error:
> 
>   Timestamp below last timeslice flush
>   0x5733a8 [0x28]: failed to process type: 3
> 
> I could reproduce the issue and in my case it was caused by one
> CPU (mmap) being behind during record and userspace mmap reader
> seeing the data after other CPUs data were already stored.
> 
> It'd be nice to find the source of this hiccup, meanwhile I'm
> submitting change that does not kill the processing, but only
> tracks the number of out of order events and warn user.

Works, for me, thanks for the fast response!

- Arnaldo
 
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Matt Fleming <matt.fleming@intel.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Stephane Eranian <eranian@google.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/util/event.h   |  1 +
>  tools/perf/util/session.c | 13 +++++++------
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 09b9e8d3fcf7..c4ffe2bd0738 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -242,6 +242,7 @@ struct events_stats {
>  	u32 nr_invalid_chains;
>  	u32 nr_unknown_id;
>  	u32 nr_unprocessable_samples;
> +	u32 nr_unordered_events;
>  };
>  
>  struct attr_event {
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 6ac62ae6b8fa..17a85f527bfc 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -533,15 +533,11 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
>  		return -ETIME;
>  
>  	if (timestamp < oe->last_flush) {
> -		WARN_ONCE(1, "Timestamp below last timeslice flush\n");
> -
> -		pr_oe_time(timestamp,      "out of order event");
> +		pr_oe_time(timestamp,      "out of order event\n");
>  		pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
>  			   oe->last_flush_type);
>  
> -		/* We could get out of order messages after forced flush. */
> -		if (oe->last_flush_type != OE_FLUSH__HALF)
> -			return -EINVAL;
> +		s->stats.nr_unordered_events++;
>  	}
>  
>  	new = ordered_events__new(oe, timestamp, event);
> @@ -1118,6 +1114,11 @@ static void perf_session__warn_about_errors(const struct perf_session *session,
>  			    "Do you have a KVM guest running and not using 'perf kvm'?\n",
>  			    session->stats.nr_unprocessable_samples);
>  	}
> +
> +	if (session->stats.nr_unordered_events != 0) {
> +		ui__warning("%u out of order events recorded.\n",
> +			    session->stats.nr_unordered_events);
> +	}
>  }
>  
>  volatile int session_done;
> -- 
> 1.9.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-26 15:44 ` Arnaldo Carvalho de Melo
@ 2014-11-27 10:56   ` Ingo Molnar
  2014-11-27 12:54     ` Jiri Olsa
  2014-11-27 13:53     ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 12+ messages in thread
From: Ingo Molnar @ 2014-11-27 10:56 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, linux-kernel, Linus Torvalds, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian


* Arnaldo Carvalho de Melo <acme@redhat.com> wrote:

> Em Wed, Nov 26, 2014 at 04:39:31PM +0100, Jiri Olsa escreveu:
> > Linus reported perf report command being interrupted due to
> > processing of 'out of order' event, with following error:
> > 
> >   Timestamp below last timeslice flush
> >   0x5733a8 [0x28]: failed to process type: 3
> > 
> > I could reproduce the issue and in my case it was caused by one
> > CPU (mmap) being behind during record and userspace mmap reader
> > seeing the data after other CPUs data were already stored.
> > 
> > It'd be nice to find the source of this hiccup, meanwhile I'm
> > submitting change that does not kill the processing, but only
> > tracks the number of out of order events and warn user.
> 
> Works, for me, thanks for the fast response!

Thanks Jiri!

> > --- a/tools/perf/util/session.c
> > +++ b/tools/perf/util/session.c
> > @@ -533,15 +533,11 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
> >  		return -ETIME;
> >  
> >  	if (timestamp < oe->last_flush) {
> > -		WARN_ONCE(1, "Timestamp below last timeslice flush\n");
> > -
> > -		pr_oe_time(timestamp,      "out of order event");
> > +		pr_oe_time(timestamp,      "out of order event\n");
> >  		pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
> >  			   oe->last_flush_type);
> >  
> > -		/* We could get out of order messages after forced flush. */
> > -		if (oe->last_flush_type != OE_FLUSH__HALF)
> > -			return -EINVAL;
> > +		s->stats.nr_unordered_events++;

Btw., in the forced flush case we'll get out of order events that 
are 'expected'. Shouldn't we count them separately and not warn 
about them, or so?

> > +	if (session->stats.nr_unordered_events != 0) {
> > +		ui__warning("%u out of order events recorded.\n",
> > +			    session->stats.nr_unordered_events);
> > +	}

Nit: I'd suggest keeping the message printout on a single line:
 
	if (session->stats.nr_unordered_events != 0) {
		ui__warning("%u out of order events recorded.\n", session->stats.nr_unordered_events);

as IMHO the cure for this col80 linebreak checkpatch warning is 
worse than the disease! :-)

Barring those details:

  Acked-by: Ingo Molnar <mingo@kernel.org>

Arnaldo, I suppose the fix will go upstream via your tree, as a 
pull request for v3.18 fixes?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-27 10:56   ` Ingo Molnar
@ 2014-11-27 12:54     ` Jiri Olsa
  2014-11-27 13:54       ` Arnaldo Carvalho de Melo
  2014-11-27 13:53     ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2014-11-27 12:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel,
	Linus Torvalds, Andi Kleen, Corey Ashford, David Ahern,
	Frederic Weisbecker, Matt Fleming, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Thu, Nov 27, 2014 at 11:56:03AM +0100, Ingo Molnar wrote:

SNIP

> > > -		pr_oe_time(timestamp,      "out of order event");
> > > +		pr_oe_time(timestamp,      "out of order event\n");
> > >  		pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
> > >  			   oe->last_flush_type);
> > >  
> > > -		/* We could get out of order messages after forced flush. */
> > > -		if (oe->last_flush_type != OE_FLUSH__HALF)
> > > -			return -EINVAL;
> > > +		s->stats.nr_unordered_events++;
> 
> Btw., in the forced flush case we'll get out of order events that 
> are 'expected'. Shouldn't we count them separately and not warn 
> about them, or so?

hum, we warned about them anyway, we just did not fail processing..
and the impact of both cases should be the same.. it's just at the
forced flush we expected/allowed out of order events

so I think it's ok to share the same counter and warn about
them the same way

> 
> > > +	if (session->stats.nr_unordered_events != 0) {
> > > +		ui__warning("%u out of order events recorded.\n",
> > > +			    session->stats.nr_unordered_events);
> > > +	}
> 
> Nit: I'd suggest keeping the message printout on a single line:
>  
> 	if (session->stats.nr_unordered_events != 0) {
> 		ui__warning("%u out of order events recorded.\n", session->stats.nr_unordered_events);
> 
> as IMHO the cure for this col80 linebreak checkpatch warning is 
> worse than the disease! :-)

ok ;-)

> 
> Barring those details:
> 
>   Acked-by: Ingo Molnar <mingo@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-27 10:56   ` Ingo Molnar
  2014-11-27 12:54     ` Jiri Olsa
@ 2014-11-27 13:53     ` Arnaldo Carvalho de Melo
  2014-11-27 14:07       ` Jiri Olsa
  1 sibling, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-11-27 13:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jiri Olsa, linux-kernel, Linus Torvalds, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

Em Thu, Nov 27, 2014 at 11:56:03AM +0100, Ingo Molnar escreveu:
> 
> * Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> 
> > Em Wed, Nov 26, 2014 at 04:39:31PM +0100, Jiri Olsa escreveu:
> > > Linus reported perf report command being interrupted due to
> > > processing of 'out of order' event, with following error:
> > > 
> > >   Timestamp below last timeslice flush
> > >   0x5733a8 [0x28]: failed to process type: 3
> > > 
> > > I could reproduce the issue and in my case it was caused by one
> > > CPU (mmap) being behind during record and userspace mmap reader
> > > seeing the data after other CPUs data were already stored.
> > > 
> > > It'd be nice to find the source of this hiccup, meanwhile I'm
> > > submitting change that does not kill the processing, but only
> > > tracks the number of out of order events and warn user.
> > 
> > Works, for me, thanks for the fast response!
> 
> Thanks Jiri!
> 
> > > --- a/tools/perf/util/session.c
> > > +++ b/tools/perf/util/session.c
> > > @@ -533,15 +533,11 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
> > >  		return -ETIME;
> > >  
> > >  	if (timestamp < oe->last_flush) {
> > > -		WARN_ONCE(1, "Timestamp below last timeslice flush\n");
> > > -
> > > -		pr_oe_time(timestamp,      "out of order event");
> > > +		pr_oe_time(timestamp,      "out of order event\n");
> > >  		pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
> > >  			   oe->last_flush_type);
> > >  
> > > -		/* We could get out of order messages after forced flush. */
> > > -		if (oe->last_flush_type != OE_FLUSH__HALF)
> > > -			return -EINVAL;
> > > +		s->stats.nr_unordered_events++;
> 
> Btw., in the forced flush case we'll get out of order events that 
> are 'expected'. Shouldn't we count them separately and not warn 
> about them, or so?
> 
> > > +	if (session->stats.nr_unordered_events != 0) {
> > > +		ui__warning("%u out of order events recorded.\n",
> > > +			    session->stats.nr_unordered_events);
> > > +	}
> 
> Nit: I'd suggest keeping the message printout on a single line:
>  
> 	if (session->stats.nr_unordered_events != 0) {
> 		ui__warning("%u out of order events recorded.\n", session->stats.nr_unordered_events);
> 
> as IMHO the cure for this col80 linebreak checkpatch warning is 
> worse than the disease! :-)
> 
> Barring those details:
> 
>   Acked-by: Ingo Molnar <mingo@kernel.org>
> 
> Arnaldo, I suppose the fix will go upstream via your tree, as a 
> pull request for v3.18 fixes?

Right, I can do that.

And I agree with the "expected" for events that are out of the current
reordering window, i.e. we can't insert something into previous windows,
so those are, humm, what would be a good name:
stats->out_of_reordering_window, while stats->reordered, would be for
events that were found out of order, but were successfully sorted as
part of a flush operation, right?

Also I'd drop that nr_ prefix, its a 'stat', so it probably is a "number",
right? :-)
 
- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-27 12:54     ` Jiri Olsa
@ 2014-11-27 13:54       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-11-27 13:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ingo Molnar, Jiri Olsa, linux-kernel, Linus Torvalds, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

Em Thu, Nov 27, 2014 at 01:54:16PM +0100, Jiri Olsa escreveu:
> On Thu, Nov 27, 2014 at 11:56:03AM +0100, Ingo Molnar wrote:
> 
> SNIP
> 
> > > > -		pr_oe_time(timestamp,      "out of order event");
> > > > +		pr_oe_time(timestamp,      "out of order event\n");
> > > >  		pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
> > > >  			   oe->last_flush_type);
> > > >  
> > > > -		/* We could get out of order messages after forced flush. */
> > > > -		if (oe->last_flush_type != OE_FLUSH__HALF)
> > > > -			return -EINVAL;
> > > > +		s->stats.nr_unordered_events++;
> > 
> > Btw., in the forced flush case we'll get out of order events that 
> > are 'expected'. Shouldn't we count them separately and not warn 
> > about them, or so?
> 
> hum, we warned about them anyway, we just did not fail processing..
> and the impact of both cases should be the same.. it's just at the
> forced flush we expected/allowed out of order events
> 
> so I think it's ok to share the same counter and warn about
> them the same way

Sure? See my other message, aren't those two kinds of reordering? I.e.
one that we can "fix" (aka reorder as part of a flush), one that we
can't?
 
> > 
> > > > +	if (session->stats.nr_unordered_events != 0) {
> > > > +		ui__warning("%u out of order events recorded.\n",
> > > > +			    session->stats.nr_unordered_events);
> > > > +	}
> > 
> > Nit: I'd suggest keeping the message printout on a single line:
> >  
> > 	if (session->stats.nr_unordered_events != 0) {
> > 		ui__warning("%u out of order events recorded.\n", session->stats.nr_unordered_events);
> > 
> > as IMHO the cure for this col80 linebreak checkpatch warning is 
> > worse than the disease! :-)
> 
> ok ;-)
> 
> > 
> > Barring those details:
> > 
> >   Acked-by: Ingo Molnar <mingo@kernel.org>
> 
> thanks,
> jirka

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-27 13:53     ` Arnaldo Carvalho de Melo
@ 2014-11-27 14:07       ` Jiri Olsa
  2014-11-27 14:16         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2014-11-27 14:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Jiri Olsa, linux-kernel, Linus Torvalds, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

On Thu, Nov 27, 2014 at 11:53:00AM -0200, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 27, 2014 at 11:56:03AM +0100, Ingo Molnar escreveu:
> > 
> > * Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> > 
> > > Em Wed, Nov 26, 2014 at 04:39:31PM +0100, Jiri Olsa escreveu:
> > > > Linus reported perf report command being interrupted due to
> > > > processing of 'out of order' event, with following error:
> > > > 
> > > >   Timestamp below last timeslice flush
> > > >   0x5733a8 [0x28]: failed to process type: 3
> > > > 
> > > > I could reproduce the issue and in my case it was caused by one
> > > > CPU (mmap) being behind during record and userspace mmap reader
> > > > seeing the data after other CPUs data were already stored.
> > > > 
> > > > It'd be nice to find the source of this hiccup, meanwhile I'm
> > > > submitting change that does not kill the processing, but only
> > > > tracks the number of out of order events and warn user.
> > > 
> > > Works, for me, thanks for the fast response!
> > 
> > Thanks Jiri!
> > 
> > > > --- a/tools/perf/util/session.c
> > > > +++ b/tools/perf/util/session.c
> > > > @@ -533,15 +533,11 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
> > > >  		return -ETIME;
> > > >  
> > > >  	if (timestamp < oe->last_flush) {
> > > > -		WARN_ONCE(1, "Timestamp below last timeslice flush\n");
> > > > -
> > > > -		pr_oe_time(timestamp,      "out of order event");
> > > > +		pr_oe_time(timestamp,      "out of order event\n");
> > > >  		pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
> > > >  			   oe->last_flush_type);
> > > >  
> > > > -		/* We could get out of order messages after forced flush. */
> > > > -		if (oe->last_flush_type != OE_FLUSH__HALF)
> > > > -			return -EINVAL;
> > > > +		s->stats.nr_unordered_events++;
> > 
> > Btw., in the forced flush case we'll get out of order events that 
> > are 'expected'. Shouldn't we count them separately and not warn 
> > about them, or so?
> > 
> > > > +	if (session->stats.nr_unordered_events != 0) {
> > > > +		ui__warning("%u out of order events recorded.\n",
> > > > +			    session->stats.nr_unordered_events);
> > > > +	}
> > 
> > Nit: I'd suggest keeping the message printout on a single line:
> >  
> > 	if (session->stats.nr_unordered_events != 0) {
> > 		ui__warning("%u out of order events recorded.\n", session->stats.nr_unordered_events);
> > 
> > as IMHO the cure for this col80 linebreak checkpatch warning is 
> > worse than the disease! :-)
> > 
> > Barring those details:
> > 
> >   Acked-by: Ingo Molnar <mingo@kernel.org>
> > 
> > Arnaldo, I suppose the fix will go upstream via your tree, as a 
> > pull request for v3.18 fixes?
> 
> Right, I can do that.
> 
> And I agree with the "expected" for events that are out of the current
> reordering window, i.e. we can't insert something into previous windows,
> so those are, humm, what would be a good name:
> stats->out_of_reordering_window, while stats->reordered, would be for
> events that were found out of order, but were successfully sorted as
> part of a flush operation, right?

the forced flush is when we find out we crossed the allowed allocation
space for the samples queue.. so we take the half of the sorted queue
and flush it.. for this case we break the flushing logic and we might
(probably just in theory) get out of order events

but IMO both cases of out of order event are equal.. we dont do
anything special for forced flushed AFAIK

jirka

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-27 14:07       ` Jiri Olsa
@ 2014-11-27 14:16         ` Arnaldo Carvalho de Melo
  2014-11-27 14:23           ` Jiri Olsa
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-11-27 14:16 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ingo Molnar, Jiri Olsa, linux-kernel, Linus Torvalds, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

Em Thu, Nov 27, 2014 at 03:07:07PM +0100, Jiri Olsa escreveu:
> On Thu, Nov 27, 2014 at 11:53:00AM -0200, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Nov 27, 2014 at 11:56:03AM +0100, Ingo Molnar escreveu:
> > > Arnaldo, I suppose the fix will go upstream via your tree, as a 
> > > pull request for v3.18 fixes?

> > Right, I can do that.

> > And I agree with the "expected" for events that are out of the current
> > reordering window, i.e. we can't insert something into previous windows,
> > so those are, humm, what would be a good name:
> > stats->out_of_reordering_window, while stats->reordered, would be for
> > events that were found out of order, but were successfully sorted as
> > part of a flush operation, right?
 
> the forced flush is when we find out we crossed the allowed allocation
> space for the samples queue.. so we take the half of the sorted queue
> and flush it.. for this case we break the flushing logic and we might

Well, we make it more likely than without a forced flush to find out of
order events, because the window suddenly became smaller, its like we
found a FLUSH event right there, no?

> (probably just in theory) get out of order events
 
> but IMO both cases of out of order event are equal.. we dont do
> anything special for forced flushed AFAIK

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-27 14:16         ` Arnaldo Carvalho de Melo
@ 2014-11-27 14:23           ` Jiri Olsa
  2014-11-27 14:45             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2014-11-27 14:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Jiri Olsa, linux-kernel, Linus Torvalds, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

On Thu, Nov 27, 2014 at 12:16:26PM -0200, Arnaldo Carvalho de Melo wrote:
> Em Thu, Nov 27, 2014 at 03:07:07PM +0100, Jiri Olsa escreveu:
> > On Thu, Nov 27, 2014 at 11:53:00AM -0200, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Nov 27, 2014 at 11:56:03AM +0100, Ingo Molnar escreveu:
> > > > Arnaldo, I suppose the fix will go upstream via your tree, as a 
> > > > pull request for v3.18 fixes?
> 
> > > Right, I can do that.
> 
> > > And I agree with the "expected" for events that are out of the current
> > > reordering window, i.e. we can't insert something into previous windows,
> > > so those are, humm, what would be a good name:
> > > stats->out_of_reordering_window, while stats->reordered, would be for
> > > events that were found out of order, but were successfully sorted as
> > > part of a flush operation, right?
>  
> > the forced flush is when we find out we crossed the allowed allocation
> > space for the samples queue.. so we take the half of the sorted queue
> > and flush it.. for this case we break the flushing logic and we might
> 
> Well, we make it more likely than without a forced flush to find out of
> order events, because the window suddenly became smaller, its like we
> found a FLUSH event right there, no?

yep, still I dont see the need to count those 2 cases separately,
both these types (of out of order event) have same implications
for the report

jirka

> 
> > (probably just in theory) get out of order events
>  
> > but IMO both cases of out of order event are equal.. we dont do
> > anything special for forced flushed AFAIK
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-27 14:23           ` Jiri Olsa
@ 2014-11-27 14:45             ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-11-27 14:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ingo Molnar, Jiri Olsa, linux-kernel, Linus Torvalds, Andi Kleen,
	Corey Ashford, David Ahern, Frederic Weisbecker, Matt Fleming,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

Em Thu, Nov 27, 2014 at 03:23:21PM +0100, Jiri Olsa escreveu:
> On Thu, Nov 27, 2014 at 12:16:26PM -0200, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Nov 27, 2014 at 03:07:07PM +0100, Jiri Olsa escreveu:
> > > On Thu, Nov 27, 2014 at 11:53:00AM -0200, Arnaldo Carvalho de Melo wrote:
> > > > Em Thu, Nov 27, 2014 at 11:56:03AM +0100, Ingo Molnar escreveu:
> > > > > Arnaldo, I suppose the fix will go upstream via your tree, as a 
> > > > > pull request for v3.18 fixes?
> > 
> > > > Right, I can do that.
> > 
> > > > And I agree with the "expected" for events that are out of the current
> > > > reordering window, i.e. we can't insert something into previous windows,
> > > > so those are, humm, what would be a good name:
> > > > stats->out_of_reordering_window, while stats->reordered, would be for
> > > > events that were found out of order, but were successfully sorted as
> > > > part of a flush operation, right?
> >  
> > > the forced flush is when we find out we crossed the allowed allocation
> > > space for the samples queue.. so we take the half of the sorted queue
> > > and flush it.. for this case we break the flushing logic and we might
> > 
> > Well, we make it more likely than without a forced flush to find out of
> > order events, because the window suddenly became smaller, its like we
> > found a FLUSH event right there, no?
 
> yep, still I dont see the need to count those 2 cases separately,
> both these types (of out of order event) have same implications
> for the report

Ok, I misundertood it when thinking it was about reordered events inside
a flush window versus events for a previous, closed flush window.

Yeah, if we detect events for a previous flush window in a forced flush
or in a normal flush, its all the same: events we can't consider anymore
because they are for a closed (old) flush window.

- Arnaldo
 
> jirka
> 
> > 
> > > (probably just in theory) get out of order events
> >  
> > > but IMO both cases of out of order event are equal.. we dont do
> > > anything special for forced flushed AFAIK
> > 
> > - Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] perf tools: Do not fail on processing out of order event
  2014-11-26 15:39 [PATCH] perf tools: Do not fail on processing out of order event Jiri Olsa
  2014-11-26 15:44 ` Arnaldo Carvalho de Melo
@ 2014-11-27 19:44 ` Andi Kleen
  2014-12-09 10:16 ` [tip:perf/urgent] perf session: " tip-bot for Jiri Olsa
  2 siblings, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2014-11-27 19:44 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Linus Torvalds, Corey Ashford, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Matt Fleming, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo

On Wed, Nov 26, 2014 at 04:39:31PM +0100, Jiri Olsa wrote:
> Linus reported perf report command being interrupted due to
> processing of 'out of order' event, with following error:
> 
>   Timestamp below last timeslice flush
>   0x5733a8 [0x28]: failed to process type: 3
> 
> I could reproduce the issue and in my case it was caused by one
> CPU (mmap) being behind during record and userspace mmap reader
> seeing the data after other CPUs data were already stored.

I like the change. This problem hit me a couple of times too.

-Andi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [tip:perf/urgent] perf session: Do not fail on processing out of order event
  2014-11-26 15:39 [PATCH] perf tools: Do not fail on processing out of order event Jiri Olsa
  2014-11-26 15:44 ` Arnaldo Carvalho de Melo
  2014-11-27 19:44 ` Andi Kleen
@ 2014-12-09 10:16 ` tip-bot for Jiri Olsa
  2 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Jiri Olsa @ 2014-12-09 10:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, jolsa, linux-kernel, dsahern, eranian, tglx, mingo,
	matt.fleming, paulus, torvalds, a.p.zijlstra, hpa, ak, cjashfor,
	namhyung, fweisbec

Commit-ID:  f61ff6c06dc8f32c7036013ad802c899ec590607
Gitweb:     http://git.kernel.org/tip/f61ff6c06dc8f32c7036013ad802c899ec590607
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Wed, 26 Nov 2014 16:39:31 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 28 Nov 2014 18:19:37 -0300

perf session: Do not fail on processing out of order event

Linus reported perf report command being interrupted due to processing
of 'out of order' event, with following error:

  Timestamp below last timeslice flush
  0x5733a8 [0x28]: failed to process type: 3

I could reproduce the issue and in my case it was caused by one CPU
(mmap) being behind during record and userspace mmap reader seeing the
data after other CPUs data were already stored.

This is expected under some circumstances because we need to limit the
number of events that we queue for reordering when we receive a
PERF_RECORD_FINISHED_ROUND or when we force flush due to memory
pressure.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1417016371-30249-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.h   |  1 +
 tools/perf/util/session.c | 11 +++++------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 5699e7e..50a7b11 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -214,6 +214,7 @@ struct events_stats {
 	u32 nr_invalid_chains;
 	u32 nr_unknown_id;
 	u32 nr_unprocessable_samples;
+	u32 nr_unordered_events;
 };
 
 struct attr_event {
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 6702ac2..80dbba0 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -521,15 +521,11 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
 		return -ETIME;
 
 	if (timestamp < oe->last_flush) {
-		WARN_ONCE(1, "Timestamp below last timeslice flush\n");
-
-		pr_oe_time(timestamp,      "out of order event");
+		pr_oe_time(timestamp,      "out of order event\n");
 		pr_oe_time(oe->last_flush, "last flush, last_flush_type %d\n",
 			   oe->last_flush_type);
 
-		/* We could get out of order messages after forced flush. */
-		if (oe->last_flush_type != OE_FLUSH__HALF)
-			return -EINVAL;
+		s->stats.nr_unordered_events++;
 	}
 
 	new = ordered_events__new(oe, timestamp, event);
@@ -1057,6 +1053,9 @@ static void perf_session__warn_about_errors(const struct perf_session *session,
 			    "Do you have a KVM guest running and not using 'perf kvm'?\n",
 			    session->stats.nr_unprocessable_samples);
 	}
+
+	if (session->stats.nr_unordered_events != 0)
+		ui__warning("%u out of order events recorded.\n", session->stats.nr_unordered_events);
 }
 
 volatile int session_done;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-12-09 10:17 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-26 15:39 [PATCH] perf tools: Do not fail on processing out of order event Jiri Olsa
2014-11-26 15:44 ` Arnaldo Carvalho de Melo
2014-11-27 10:56   ` Ingo Molnar
2014-11-27 12:54     ` Jiri Olsa
2014-11-27 13:54       ` Arnaldo Carvalho de Melo
2014-11-27 13:53     ` Arnaldo Carvalho de Melo
2014-11-27 14:07       ` Jiri Olsa
2014-11-27 14:16         ` Arnaldo Carvalho de Melo
2014-11-27 14:23           ` Jiri Olsa
2014-11-27 14:45             ` Arnaldo Carvalho de Melo
2014-11-27 19:44 ` Andi Kleen
2014-12-09 10:16 ` [tip:perf/urgent] perf session: " tip-bot for Jiri Olsa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.