linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tracing: Fix the outmost stupidity of tracing_off()
@ 2012-06-14 22:47 Thomas Gleixner
  2012-06-14 22:50 ` Thomas Gleixner
  2012-06-14 22:58 ` Steven Rostedt
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Gleixner @ 2012-06-14 22:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

I lost another day of waiting for a complex bug to trigger and turn
off tracing, which did not happen due to:

 commit 499e54705(tracing/ring-buffer: Only have tracing_on disable
 tracing buffers)

The subject line itself is supsicious itself:

  Only have tracing_on disable tracing buffers

tracing_on as far as I'm concerned is supposed to start tracing. So
disabling trace buffers on tracing_on is counterproductive.

The patch description is not really a better source of enlightment:

  As the ring-buffer code is being used by other facilities in the
  kernel, having tracing_on file disable *all* buffers is not a desired
  affect. It should only disable the ftrace buffers that are being used.
    
  Move the code into the trace.c file and use the buffer disabling
  for tracing_on() and tracing_off(). This way only the ftrace buffers
  will be affected by them and other kernel utilities will not be
  confused to why their output suddenly stopped.

How should the casual reader of this commit message figure out that
tracing_on is a debugfs file which accepts 0 resp. 1 to it, where 0
causes the buffers to be disabled?

Yes, it's obvious to the patch author, but ....

While I account that to the usual inability to write proper
changelogs, I'm amazed about the following parts of the patch

-void tracing_off(void)
-{
-       clear_bit(RB_BUFFERS_ON_BIT, &ring_buffer_flags);
-}

...

+void tracing_off(void)
+{
+       if (global_trace.buffer)
+               ring_buffer_record_on(global_trace.buffer);

This is so amazingly stupid, that it's not even funny anymore.

Is it really too much to expect that a patch of the size

 4 files changed, 171 insertions(+), 97 deletions(-)

which touches fundamental functionality of a subsystem is at least
tested to maintain the previous functionality?

Signed-off-by: Thomas [royally annoyed] Gleixner <tglx@linutronix.de>

---
 kernel/trace/trace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/kernel/trace/trace.c
===================================================================
--- linux-2.6.orig/kernel/trace/trace.c
+++ linux-2.6/kernel/trace/trace.c
@@ -371,7 +371,7 @@ EXPORT_SYMBOL_GPL(tracing_on);
 void tracing_off(void)
 {
 	if (global_trace.buffer)
-		ring_buffer_record_on(global_trace.buffer);
+		ring_buffer_record_off(global_trace.buffer);
 	/*
 	 * This flag is only looked at when buffers haven't been
 	 * allocated yet. We don't really care about the race

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-14 22:47 [PATCH] tracing: Fix the outmost stupidity of tracing_off() Thomas Gleixner
@ 2012-06-14 22:50 ` Thomas Gleixner
  2012-06-14 23:38   ` Steven Rostedt
  2012-06-14 22:58 ` Steven Rostedt
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2012-06-14 22:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

Steven,

On Fri, 15 Jun 2012, Thomas Gleixner wrote:
>  void tracing_off(void)
>  {
>  	if (global_trace.buffer)
> -		ring_buffer_record_on(global_trace.buffer);
> +		ring_buffer_record_off(global_trace.buffer);

this is the gazillionst time that I wasted valuable time relying on
the not so extraordinary expectation that the kernel tracer stays
functional at least in the most basic ways.

I'm really tired of wasting hours and days just because you are unable
to keep your stuff straight and functional.

I avoided to make public fuzz about that so far, but my patience is
exhausted now.

You know yourself how many times I ranted to you on IRC or in private
conversations, but you just keep on to bask in your egocentric
mastering the uber complexity treat.

Your self description (https://lwn.net/Articles/500388/):

 Complexity is my Sun, and I am the planet that orbits around it.

is pretty accurate.

You can orbit around whatever you want and as long as you want, just
please do that on your own playground and not on the expense of
others.

I always held up to ftrace, but the repeating experience of
disfunctionality caused by your refusal to put your priorites straight
is just too annoying.

It's not the first time, that tracing_off() or other stuff I'm
depending on has been broken and caused me exhaustive loss of time.

I know that I'm not the average user of tracing, but that's not an
excuse for pushing out crap over and over again.

I'm tired of it and to be honest, I'm going to spend some quality time
on evaluating LLTNG. If it can do what I need then I switch over and
see whether it can hold up to the basic expectations of a primary
user. AFAICT from previous experience it can do that, and if Matthieu
is willing and able to cooperate on integrating with perf, then I
happily kick you off into your self chosen complexity orbit forever.

Sadly,

	Thomas


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-14 22:47 [PATCH] tracing: Fix the outmost stupidity of tracing_off() Thomas Gleixner
  2012-06-14 22:50 ` Thomas Gleixner
@ 2012-06-14 22:58 ` Steven Rostedt
  2012-06-14 23:12   ` Thomas Gleixner
  1 sibling, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2012-06-14 22:58 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

On Fri, 2012-06-15 at 00:47 +0200, Thomas Gleixner wrote:
> I lost another day of waiting for a complex bug to trigger and turn
> off tracing, which did not happen due to:
> 
>  commit 499e54705(tracing/ring-buffer: Only have tracing_on disable
>  tracing buffers)

I submitted this last week: https://lkml.org/lkml/2012/6/6/701

I forgot to Cc you.

Sorry,

-- Steve




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-14 22:58 ` Steven Rostedt
@ 2012-06-14 23:12   ` Thomas Gleixner
  2012-06-14 23:48     ` Steven Rostedt
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2012-06-14 23:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

On Thu, 14 Jun 2012, Steven Rostedt wrote:

> On Fri, 2012-06-15 at 00:47 +0200, Thomas Gleixner wrote:
> > I lost another day of waiting for a complex bug to trigger and turn
> > off tracing, which did not happen due to:
> > 
> >  commit 499e54705(tracing/ring-buffer: Only have tracing_on disable
> >  tracing buffers)
> 
> I submitted this last week: https://lkml.org/lkml/2012/6/6/701

And how does that make up for a lost day?
 
> I forgot to Cc you.

That's still not an excuse for utter stupidity, really. Just check the
commit date vs. your discovery time.

I told you often enough to be more careful, but that one is really
beyond comprehension.

       Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-14 22:50 ` Thomas Gleixner
@ 2012-06-14 23:38   ` Steven Rostedt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2012-06-14 23:38 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

On Fri, 2012-06-15 at 00:50 +0200, Thomas Gleixner wrote:
> Steven,
> 
> On Fri, 15 Jun 2012, Thomas Gleixner wrote:
> >  void tracing_off(void)
> >  {
> >  	if (global_trace.buffer)
> > -		ring_buffer_record_on(global_trace.buffer);
> > +		ring_buffer_record_off(global_trace.buffer);
> 
> this is the gazillionst time that I wasted valuable time relying on
> the not so extraordinary expectation that the kernel tracer stays
> functional at least in the most basic ways.

I admit this bug should have never made it in. I'm not sure how it did
because I know I tested the tracing_off() code before submitting it. But
it could have been something stupid as making the fix but never pushing
it. There were some conflicts I had with updating to the latest tip and
I may not have checked that what made it into the tip repo was what was
in my latest.

That's not my normal workflow. I usually use git to make sure I can fast
forward all my changes to tip/perf/core, but it may have missed this
case.

> 
> I'm really tired of wasting hours and days just because you are unable
> to keep your stuff straight and functional.
> 
> I avoided to make public fuzz about that so far, but my patience is
> exhausted now.

I'm fine with you making your complaints public. I prefer it that way.
To get things out in the open, whether good or bad.

> 
> You know yourself how many times I ranted to you on IRC or in private
> conversations, but you just keep on to bask in your egocentric
> mastering the uber complexity treat.
> 
> Your self description (https://lwn.net/Articles/500388/):
> 
>  Complexity is my Sun, and I am the planet that orbits around it.

Actually, that saying was inspired by your previous comments to me.
Although, I didn't mean that to be that I like to do things in complex
ways, but that complex tasks seem to draw me in. There is a difference.



> 
> is pretty accurate.
> 
> You can orbit around whatever you want and as long as you want, just
> please do that on your own playground and not on the expense of
> others.

This isn't my own playground. That change happened to come about when
someone else started implementing using the ring buffer for something
completely different than tracing. I found that the tracing 'tracing_on'
file was interfering with that code. I had to make it so that the file
would only disable tracing, not all users of the ring buffer facility.

That's actually something I thought you would respect. As why would
echoing 0 into tracing_on for ftrace disable recording of data that has
nothing to do with tracing.

That file should have never been part of the ringbuffer, it should have
been contained by ftrace from day one. Yes, that was my own fault that I
did not do that.


> 
> I always held up to ftrace, but the repeating experience of
> disfunctionality caused by your refusal to put your priorites straight
> is just too annoying.
> 
> It's not the first time, that tracing_off() or other stuff I'm
> depending on has been broken and caused me exhaustive loss of time.

I remember the beginning with 'tracing_enabled' which was heavy weight
and required knowledge of the tracers. You hated it because this complex
tie in to the tracers made it difficult to stop and start tracing
between kernel and userspace. I agreed, and added the quick approach to
the ring buffer. But the ring buffer is generic, and that file should
have never been a big hammer to disable all ring buffer users. Just
ftrace's buffer.

> 
> I know that I'm not the average user of tracing, but that's not an
> excuse for pushing out crap over and over again.

I haven't used that excuse in a long time. Just when new code went in
and you were able to find all the corner cases. This bug was not a
corner case and I admit, this bug caused me a red face. When I saw it, I
really did crap. I said to myself, "I know I tested this, how the hell
did it get in like this??". I really think I pushed the wrong branch. As
I have used tracing_off() recently and it did work. But that was with my
own stale code. This failure was a hiccup in my work flow, and I have no
excuse for it.

> 
> I'm tired of it and to be honest, I'm going to spend some quality time
> on evaluating LLTNG. If it can do what I need then I switch over and
> see whether it can hold up to the basic expectations of a primary
> user. AFAICT from previous experience it can do that, and if Matthieu
> is willing and able to cooperate on integrating with perf, then I
> happily kick you off into your self chosen complexity orbit forever.

I'm all for competition. That's what open source is all about. If it
does come in and satisfies everyone's needs, I'll back off, and maybe
even contribute. In the past, I have tried to make contributions to
integrate perf and ftrace, but things did not go so well.

But a lot of my "complex" code gave perf the ability to do function
tracing. That was a lot of work and its main purpose was to give perf
that ability. With the integration of the libtraceevents library, things
are about to move faster as well.

I wanted to let perf have access to the ftrace ring buffering too, such
that we can get perf and ftrace events integrated. That was one of my
works in progress. We'll see how that goes.

-- Steve


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-14 23:12   ` Thomas Gleixner
@ 2012-06-14 23:48     ` Steven Rostedt
  2012-06-15  0:00       ` Thomas Gleixner
  2012-06-15  4:18       ` Steven Rostedt
  0 siblings, 2 replies; 10+ messages in thread
From: Steven Rostedt @ 2012-06-14 23:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

On Fri, 2012-06-15 at 01:12 +0200, Thomas Gleixner wrote:

> That's still not an excuse for utter stupidity, really. Just check the
> commit date vs. your discovery time.
> 
> I told you often enough to be more careful, but that one is really
> beyond comprehension.

I'm really thinking that this was due to a bad push. As I believe I had
stale code (made this fix locally, but pushed the broken branch) and all
my changes were on top of it. When updating to the latest tip, I had
code I didn't want to merge, or conflicts happened, that I ended up
cherry-picking changes. Which breaks from my normal workflow, which is
to make sure my old changes fast forward to tip before doing a rebase of
my new code.

I have several branches that I work on and when they are ready, I rebase
them on top of tip, run a bunch of tests, and then push them out.

I'm a heavy user of tracing_off() too, and since the development of my
work is usually on older branches, I would not have noticed the bug. As
after I get things working I usually rebase on top of tip before doing
my final tests and asking for the pull request.

I'm usually very careful about checking that things fast forward and
such, and that rebases don't have strange conflicts. This one slipped
by, and yes, I'm extremely ashamed and embarrassed by it. That's why I
said in my pull request 'This was one of those "oh crap" moments'.

-- Steve



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-14 23:48     ` Steven Rostedt
@ 2012-06-15  0:00       ` Thomas Gleixner
  2012-06-15  0:32         ` Steven Rostedt
  2012-06-15  4:18       ` Steven Rostedt
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2012-06-15  0:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

On Thu, 14 Jun 2012, Steven Rostedt wrote:
> On Fri, 2012-06-15 at 01:12 +0200, Thomas Gleixner wrote:
> 
> > That's still not an excuse for utter stupidity, really. Just check the
> > commit date vs. your discovery time.
> > 
> > I told you often enough to be more careful, but that one is really
> > beyond comprehension.
> 
> I'm really thinking that this was due to a bad push. As I believe I had
> stale code (made this fix locally, but pushed the broken branch) and all
> my changes were on top of it. When updating to the latest tip, I had
> code I didn't want to merge, or conflicts happened, that I ended up
> cherry-picking changes. Which breaks from my normal workflow, which is
> to make sure my old changes fast forward to tip before doing a rebase of
> my new code.
> 
> I have several branches that I work on and when they are ready, I rebase
> them on top of tip, run a bunch of tests, and then push them out.
> 
> I'm a heavy user of tracing_off() too, and since the development of my
> work is usually on older branches, I would not have noticed the bug. As
> after I get things working I usually rebase on top of tip before doing
> my final tests and asking for the pull request.

Bla, bla, bla.....

 commit 499e547057f5bba5cd6f87ebe59b05d0c59da905
 Author: Steven Rostedt <srostedt@redhat.com>
 Date:   Wed Feb 22 15:50:28 2012 -0500

Did you actually read what I wrote above?

> > That's still not an excuse for utter stupidity, really. Just check the
> > commit date vs. your discovery time.

Emphasis on discovery time.

Si tacuisses, philosophus mansisses


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-15  0:00       ` Thomas Gleixner
@ 2012-06-15  0:32         ` Steven Rostedt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2012-06-15  0:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

On Fri, 2012-06-15 at 02:00 +0200, Thomas Gleixner wrote:

>  commit 499e547057f5bba5cd6f87ebe59b05d0c59da905
>  Author: Steven Rostedt <srostedt@redhat.com>
>  Date:   Wed Feb 22 15:50:28 2012 -0500
> 
> Did you actually read what I wrote above?
> 
> > > That's still not an excuse for utter stupidity, really. Just check the
> > > commit date vs. your discovery time.
> 
> Emphasis on discovery time.
> 

Commit date was the local version. The pull request was March 14. It was
a hodge podge of updates that I pulled together. This could have been
where I pulled a stale branch. In fact, the fact that commit date was
Feb, and my push was almost a month later, could be exactly because I
grabbed the wrong branch :-/

Then most of my development goes on separate git branches. Sometimes I
rebase on top of tip and continue development, and sometimes I just
continue development on the branch I'm at and rebase when its done. It
may be months before I'm reusing code that I pushed out if the code I'm
working with is unrelated.

If this is an issue, I can change my work flow to always update to the
latest. Lately, I've actually been doing that. But a lot of time I just
work on my separate pieces, as they don't always have to do with each
other.

Since that change was submitted, I've pulled in and tested other
developers code, and did the 'remove stop machine with breakpoints'.
That code has been under development for about a year. And when it was
finally done, I even waited a few releases before pushing, and then only
pushing bits and pieces at a time.

Basically, yeah, the change went in in March, but I admit, I haven't
been developing with the latest code. A lot of the code I've been
working on is a year old, and the branches have not been updated to the
latest. June 6th came and Arnaldo asked for a feature in ftrace, and I
checked out a new branch based on latest tip. This was the first time I
encountered the bug.

I also use ftrace and tracing_off() extensively for work. But as we are
still on the 3.2-rt kernel, that would also keep me from seeing this as
well.

-- Steve



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-14 23:48     ` Steven Rostedt
  2012-06-15  0:00       ` Thomas Gleixner
@ 2012-06-15  4:18       ` Steven Rostedt
  2012-06-15 11:54         ` Steven Rostedt
  1 sibling, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2012-06-15  4:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

On Thu, 2012-06-14 at 19:48 -0400, Steven Rostedt wrote:

> I have several branches that I work on and when they are ready, I rebase
> them on top of tip, run a bunch of tests, and then push them out.

One more thing. I do run specific tests during development aimed at the
changes I make. I may rebase on top of tip afterward, run generic tests,
and then push out. These activities are not always done right after each
other. I probably screwed up by either cherry-picking or rebasing a
separate branch, noticing the bug (I remember fixing it before) but then
get side-tracked, and use the original branch that didn't have the fix.
Or it could have simply gotten reverted with a git reset --hard, where I
forgot to do the --amend. For testing something like this, I would have
added tracing_off() in the code, and I do reset --hard to remove my
testing code. If I made the fix and forgot to --amend it, the fix was
lost.

Anyway, I added to my generic tests: testing
 'echo schedule:traceon > set_ftrace_filter' as well as
 'echo schedule:traceoff > set_ftrace_filter' and make sure that they do
enable and disable tracing.

The traceon and traceoff triggers use tracing_on() and tracing_off()
underneath the covers. If they break, the test will fail (I tested the
broken patch and it does fail). This regression wont appear again.

Since the last time you've yelled at me, I've added a lot of stress
tests to ftrace. I test the tracing_on file to make sure it enables and
disables. But my generic tests did not include the internal
tracing_off() testing. My mistake, and yes I screwed up.

-- Steve



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off()
  2012-06-15  4:18       ` Steven Rostedt
@ 2012-06-15 11:54         ` Steven Rostedt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2012-06-15 11:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	H. Peter Anvin, LKML, Mathieu Desnoyers

On Fri, 2012-06-15 at 00:18 -0400, Steven Rostedt wrote:
> For testing something like this, I would have
> added tracing_off() in the code, and I do reset --hard to remove my
> testing code. If I made the fix and forgot to --amend it, the fix was
> lost.

I'm almost positive this is what happened.

tracing_off() is similar to trace_printk(), it's something that is
usually added during development and not something that is normally
executed in production. With an unmodified kernel, to test it, you need
to start tracing and trigger a crash and have ftrace_dump_on_opps set,
and have something to read it so that you know the tracing stopped. Not
very reliable, and not something to put into generic testing.

I forgot that the function tracer 'traceoff' trigger uses it, otherwise
I would have added that new test a while ago.

For my testing, I put in tracing_off() at various places in the kernel,
and made sure that it worked. I did see that it was broken before and
fixed it. But for some reason, maybe I was distracted by other work, I
never did the git commit --amend (similar to forgetting to do a quilt
refresh). All my tests worked, and I reverted the test code I added with
either a 'git reset --hard', or a 'git diff | patch -p1 -R', both of
which would have removed my fix.

I usually scan the 'git diff' before doing the revert, but as this
change was a one liner, it could easily have been missed.

The code I've been working on for the last few months was debugging the
breakpoint code, which, when it failed, triple faulted the box. Tracing
was never an option. Which means that I would not have played with
tracing_off() then. And the older code that I've been working with
(3.2-rt), did not have this bug.

Again, I know you are one of the biggest users of ftrace, and I'm very
sorry (and ashamed) that his bug got in and caused you a days worth of
frustration.

-- Steve



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-06-15 11:54 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-14 22:47 [PATCH] tracing: Fix the outmost stupidity of tracing_off() Thomas Gleixner
2012-06-14 22:50 ` Thomas Gleixner
2012-06-14 23:38   ` Steven Rostedt
2012-06-14 22:58 ` Steven Rostedt
2012-06-14 23:12   ` Thomas Gleixner
2012-06-14 23:48     ` Steven Rostedt
2012-06-15  0:00       ` Thomas Gleixner
2012-06-15  0:32         ` Steven Rostedt
2012-06-15  4:18       ` Steven Rostedt
2012-06-15 11:54         ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).