* [PATCH] tracing: Fix the outmost stupidity of tracing_off() @ 2012-06-14 22:47 Thomas Gleixner 2012-06-14 22:50 ` Thomas Gleixner 2012-06-14 22:58 ` Steven Rostedt 0 siblings, 2 replies; 10+ messages in thread From: Thomas Gleixner @ 2012-06-14 22:47 UTC (permalink / raw) To: Linus Torvalds Cc: Steven Rostedt, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers I lost another day of waiting for a complex bug to trigger and turn off tracing, which did not happen due to: commit 499e54705(tracing/ring-buffer: Only have tracing_on disable tracing buffers) The subject line itself is supsicious itself: Only have tracing_on disable tracing buffers tracing_on as far as I'm concerned is supposed to start tracing. So disabling trace buffers on tracing_on is counterproductive. The patch description is not really a better source of enlightment: As the ring-buffer code is being used by other facilities in the kernel, having tracing_on file disable *all* buffers is not a desired affect. It should only disable the ftrace buffers that are being used. Move the code into the trace.c file and use the buffer disabling for tracing_on() and tracing_off(). This way only the ftrace buffers will be affected by them and other kernel utilities will not be confused to why their output suddenly stopped. How should the casual reader of this commit message figure out that tracing_on is a debugfs file which accepts 0 resp. 1 to it, where 0 causes the buffers to be disabled? Yes, it's obvious to the patch author, but .... While I account that to the usual inability to write proper changelogs, I'm amazed about the following parts of the patch -void tracing_off(void) -{ - clear_bit(RB_BUFFERS_ON_BIT, &ring_buffer_flags); -} ... +void tracing_off(void) +{ + if (global_trace.buffer) + ring_buffer_record_on(global_trace.buffer); This is so amazingly stupid, that it's not even funny anymore. Is it really too much to expect that a patch of the size 4 files changed, 171 insertions(+), 97 deletions(-) which touches fundamental functionality of a subsystem is at least tested to maintain the previous functionality? Signed-off-by: Thomas [royally annoyed] Gleixner <tglx@linutronix.de> --- kernel/trace/trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/kernel/trace/trace.c =================================================================== --- linux-2.6.orig/kernel/trace/trace.c +++ linux-2.6/kernel/trace/trace.c @@ -371,7 +371,7 @@ EXPORT_SYMBOL_GPL(tracing_on); void tracing_off(void) { if (global_trace.buffer) - ring_buffer_record_on(global_trace.buffer); + ring_buffer_record_off(global_trace.buffer); /* * This flag is only looked at when buffers haven't been * allocated yet. We don't really care about the race ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-14 22:47 [PATCH] tracing: Fix the outmost stupidity of tracing_off() Thomas Gleixner @ 2012-06-14 22:50 ` Thomas Gleixner 2012-06-14 23:38 ` Steven Rostedt 2012-06-14 22:58 ` Steven Rostedt 1 sibling, 1 reply; 10+ messages in thread From: Thomas Gleixner @ 2012-06-14 22:50 UTC (permalink / raw) To: Linus Torvalds Cc: Steven Rostedt, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers Steven, On Fri, 15 Jun 2012, Thomas Gleixner wrote: > void tracing_off(void) > { > if (global_trace.buffer) > - ring_buffer_record_on(global_trace.buffer); > + ring_buffer_record_off(global_trace.buffer); this is the gazillionst time that I wasted valuable time relying on the not so extraordinary expectation that the kernel tracer stays functional at least in the most basic ways. I'm really tired of wasting hours and days just because you are unable to keep your stuff straight and functional. I avoided to make public fuzz about that so far, but my patience is exhausted now. You know yourself how many times I ranted to you on IRC or in private conversations, but you just keep on to bask in your egocentric mastering the uber complexity treat. Your self description (https://lwn.net/Articles/500388/): Complexity is my Sun, and I am the planet that orbits around it. is pretty accurate. You can orbit around whatever you want and as long as you want, just please do that on your own playground and not on the expense of others. I always held up to ftrace, but the repeating experience of disfunctionality caused by your refusal to put your priorites straight is just too annoying. It's not the first time, that tracing_off() or other stuff I'm depending on has been broken and caused me exhaustive loss of time. I know that I'm not the average user of tracing, but that's not an excuse for pushing out crap over and over again. I'm tired of it and to be honest, I'm going to spend some quality time on evaluating LLTNG. If it can do what I need then I switch over and see whether it can hold up to the basic expectations of a primary user. AFAICT from previous experience it can do that, and if Matthieu is willing and able to cooperate on integrating with perf, then I happily kick you off into your self chosen complexity orbit forever. Sadly, Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-14 22:50 ` Thomas Gleixner @ 2012-06-14 23:38 ` Steven Rostedt 0 siblings, 0 replies; 10+ messages in thread From: Steven Rostedt @ 2012-06-14 23:38 UTC (permalink / raw) To: Thomas Gleixner Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers On Fri, 2012-06-15 at 00:50 +0200, Thomas Gleixner wrote: > Steven, > > On Fri, 15 Jun 2012, Thomas Gleixner wrote: > > void tracing_off(void) > > { > > if (global_trace.buffer) > > - ring_buffer_record_on(global_trace.buffer); > > + ring_buffer_record_off(global_trace.buffer); > > this is the gazillionst time that I wasted valuable time relying on > the not so extraordinary expectation that the kernel tracer stays > functional at least in the most basic ways. I admit this bug should have never made it in. I'm not sure how it did because I know I tested the tracing_off() code before submitting it. But it could have been something stupid as making the fix but never pushing it. There were some conflicts I had with updating to the latest tip and I may not have checked that what made it into the tip repo was what was in my latest. That's not my normal workflow. I usually use git to make sure I can fast forward all my changes to tip/perf/core, but it may have missed this case. > > I'm really tired of wasting hours and days just because you are unable > to keep your stuff straight and functional. > > I avoided to make public fuzz about that so far, but my patience is > exhausted now. I'm fine with you making your complaints public. I prefer it that way. To get things out in the open, whether good or bad. > > You know yourself how many times I ranted to you on IRC or in private > conversations, but you just keep on to bask in your egocentric > mastering the uber complexity treat. > > Your self description (https://lwn.net/Articles/500388/): > > Complexity is my Sun, and I am the planet that orbits around it. Actually, that saying was inspired by your previous comments to me. Although, I didn't mean that to be that I like to do things in complex ways, but that complex tasks seem to draw me in. There is a difference. > > is pretty accurate. > > You can orbit around whatever you want and as long as you want, just > please do that on your own playground and not on the expense of > others. This isn't my own playground. That change happened to come about when someone else started implementing using the ring buffer for something completely different than tracing. I found that the tracing 'tracing_on' file was interfering with that code. I had to make it so that the file would only disable tracing, not all users of the ring buffer facility. That's actually something I thought you would respect. As why would echoing 0 into tracing_on for ftrace disable recording of data that has nothing to do with tracing. That file should have never been part of the ringbuffer, it should have been contained by ftrace from day one. Yes, that was my own fault that I did not do that. > > I always held up to ftrace, but the repeating experience of > disfunctionality caused by your refusal to put your priorites straight > is just too annoying. > > It's not the first time, that tracing_off() or other stuff I'm > depending on has been broken and caused me exhaustive loss of time. I remember the beginning with 'tracing_enabled' which was heavy weight and required knowledge of the tracers. You hated it because this complex tie in to the tracers made it difficult to stop and start tracing between kernel and userspace. I agreed, and added the quick approach to the ring buffer. But the ring buffer is generic, and that file should have never been a big hammer to disable all ring buffer users. Just ftrace's buffer. > > I know that I'm not the average user of tracing, but that's not an > excuse for pushing out crap over and over again. I haven't used that excuse in a long time. Just when new code went in and you were able to find all the corner cases. This bug was not a corner case and I admit, this bug caused me a red face. When I saw it, I really did crap. I said to myself, "I know I tested this, how the hell did it get in like this??". I really think I pushed the wrong branch. As I have used tracing_off() recently and it did work. But that was with my own stale code. This failure was a hiccup in my work flow, and I have no excuse for it. > > I'm tired of it and to be honest, I'm going to spend some quality time > on evaluating LLTNG. If it can do what I need then I switch over and > see whether it can hold up to the basic expectations of a primary > user. AFAICT from previous experience it can do that, and if Matthieu > is willing and able to cooperate on integrating with perf, then I > happily kick you off into your self chosen complexity orbit forever. I'm all for competition. That's what open source is all about. If it does come in and satisfies everyone's needs, I'll back off, and maybe even contribute. In the past, I have tried to make contributions to integrate perf and ftrace, but things did not go so well. But a lot of my "complex" code gave perf the ability to do function tracing. That was a lot of work and its main purpose was to give perf that ability. With the integration of the libtraceevents library, things are about to move faster as well. I wanted to let perf have access to the ftrace ring buffering too, such that we can get perf and ftrace events integrated. That was one of my works in progress. We'll see how that goes. -- Steve ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-14 22:47 [PATCH] tracing: Fix the outmost stupidity of tracing_off() Thomas Gleixner 2012-06-14 22:50 ` Thomas Gleixner @ 2012-06-14 22:58 ` Steven Rostedt 2012-06-14 23:12 ` Thomas Gleixner 1 sibling, 1 reply; 10+ messages in thread From: Steven Rostedt @ 2012-06-14 22:58 UTC (permalink / raw) To: Thomas Gleixner Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers On Fri, 2012-06-15 at 00:47 +0200, Thomas Gleixner wrote: > I lost another day of waiting for a complex bug to trigger and turn > off tracing, which did not happen due to: > > commit 499e54705(tracing/ring-buffer: Only have tracing_on disable > tracing buffers) I submitted this last week: https://lkml.org/lkml/2012/6/6/701 I forgot to Cc you. Sorry, -- Steve ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-14 22:58 ` Steven Rostedt @ 2012-06-14 23:12 ` Thomas Gleixner 2012-06-14 23:48 ` Steven Rostedt 0 siblings, 1 reply; 10+ messages in thread From: Thomas Gleixner @ 2012-06-14 23:12 UTC (permalink / raw) To: Steven Rostedt Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers On Thu, 14 Jun 2012, Steven Rostedt wrote: > On Fri, 2012-06-15 at 00:47 +0200, Thomas Gleixner wrote: > > I lost another day of waiting for a complex bug to trigger and turn > > off tracing, which did not happen due to: > > > > commit 499e54705(tracing/ring-buffer: Only have tracing_on disable > > tracing buffers) > > I submitted this last week: https://lkml.org/lkml/2012/6/6/701 And how does that make up for a lost day? > I forgot to Cc you. That's still not an excuse for utter stupidity, really. Just check the commit date vs. your discovery time. I told you often enough to be more careful, but that one is really beyond comprehension. Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-14 23:12 ` Thomas Gleixner @ 2012-06-14 23:48 ` Steven Rostedt 2012-06-15 0:00 ` Thomas Gleixner 2012-06-15 4:18 ` Steven Rostedt 0 siblings, 2 replies; 10+ messages in thread From: Steven Rostedt @ 2012-06-14 23:48 UTC (permalink / raw) To: Thomas Gleixner Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers On Fri, 2012-06-15 at 01:12 +0200, Thomas Gleixner wrote: > That's still not an excuse for utter stupidity, really. Just check the > commit date vs. your discovery time. > > I told you often enough to be more careful, but that one is really > beyond comprehension. I'm really thinking that this was due to a bad push. As I believe I had stale code (made this fix locally, but pushed the broken branch) and all my changes were on top of it. When updating to the latest tip, I had code I didn't want to merge, or conflicts happened, that I ended up cherry-picking changes. Which breaks from my normal workflow, which is to make sure my old changes fast forward to tip before doing a rebase of my new code. I have several branches that I work on and when they are ready, I rebase them on top of tip, run a bunch of tests, and then push them out. I'm a heavy user of tracing_off() too, and since the development of my work is usually on older branches, I would not have noticed the bug. As after I get things working I usually rebase on top of tip before doing my final tests and asking for the pull request. I'm usually very careful about checking that things fast forward and such, and that rebases don't have strange conflicts. This one slipped by, and yes, I'm extremely ashamed and embarrassed by it. That's why I said in my pull request 'This was one of those "oh crap" moments'. -- Steve ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-14 23:48 ` Steven Rostedt @ 2012-06-15 0:00 ` Thomas Gleixner 2012-06-15 0:32 ` Steven Rostedt 2012-06-15 4:18 ` Steven Rostedt 1 sibling, 1 reply; 10+ messages in thread From: Thomas Gleixner @ 2012-06-15 0:00 UTC (permalink / raw) To: Steven Rostedt Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers On Thu, 14 Jun 2012, Steven Rostedt wrote: > On Fri, 2012-06-15 at 01:12 +0200, Thomas Gleixner wrote: > > > That's still not an excuse for utter stupidity, really. Just check the > > commit date vs. your discovery time. > > > > I told you often enough to be more careful, but that one is really > > beyond comprehension. > > I'm really thinking that this was due to a bad push. As I believe I had > stale code (made this fix locally, but pushed the broken branch) and all > my changes were on top of it. When updating to the latest tip, I had > code I didn't want to merge, or conflicts happened, that I ended up > cherry-picking changes. Which breaks from my normal workflow, which is > to make sure my old changes fast forward to tip before doing a rebase of > my new code. > > I have several branches that I work on and when they are ready, I rebase > them on top of tip, run a bunch of tests, and then push them out. > > I'm a heavy user of tracing_off() too, and since the development of my > work is usually on older branches, I would not have noticed the bug. As > after I get things working I usually rebase on top of tip before doing > my final tests and asking for the pull request. Bla, bla, bla..... commit 499e547057f5bba5cd6f87ebe59b05d0c59da905 Author: Steven Rostedt <srostedt@redhat.com> Date: Wed Feb 22 15:50:28 2012 -0500 Did you actually read what I wrote above? > > That's still not an excuse for utter stupidity, really. Just check the > > commit date vs. your discovery time. Emphasis on discovery time. Si tacuisses, philosophus mansisses ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-15 0:00 ` Thomas Gleixner @ 2012-06-15 0:32 ` Steven Rostedt 0 siblings, 0 replies; 10+ messages in thread From: Steven Rostedt @ 2012-06-15 0:32 UTC (permalink / raw) To: Thomas Gleixner Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers On Fri, 2012-06-15 at 02:00 +0200, Thomas Gleixner wrote: > commit 499e547057f5bba5cd6f87ebe59b05d0c59da905 > Author: Steven Rostedt <srostedt@redhat.com> > Date: Wed Feb 22 15:50:28 2012 -0500 > > Did you actually read what I wrote above? > > > > That's still not an excuse for utter stupidity, really. Just check the > > > commit date vs. your discovery time. > > Emphasis on discovery time. > Commit date was the local version. The pull request was March 14. It was a hodge podge of updates that I pulled together. This could have been where I pulled a stale branch. In fact, the fact that commit date was Feb, and my push was almost a month later, could be exactly because I grabbed the wrong branch :-/ Then most of my development goes on separate git branches. Sometimes I rebase on top of tip and continue development, and sometimes I just continue development on the branch I'm at and rebase when its done. It may be months before I'm reusing code that I pushed out if the code I'm working with is unrelated. If this is an issue, I can change my work flow to always update to the latest. Lately, I've actually been doing that. But a lot of time I just work on my separate pieces, as they don't always have to do with each other. Since that change was submitted, I've pulled in and tested other developers code, and did the 'remove stop machine with breakpoints'. That code has been under development for about a year. And when it was finally done, I even waited a few releases before pushing, and then only pushing bits and pieces at a time. Basically, yeah, the change went in in March, but I admit, I haven't been developing with the latest code. A lot of the code I've been working on is a year old, and the branches have not been updated to the latest. June 6th came and Arnaldo asked for a feature in ftrace, and I checked out a new branch based on latest tip. This was the first time I encountered the bug. I also use ftrace and tracing_off() extensively for work. But as we are still on the 3.2-rt kernel, that would also keep me from seeing this as well. -- Steve ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-14 23:48 ` Steven Rostedt 2012-06-15 0:00 ` Thomas Gleixner @ 2012-06-15 4:18 ` Steven Rostedt 2012-06-15 11:54 ` Steven Rostedt 1 sibling, 1 reply; 10+ messages in thread From: Steven Rostedt @ 2012-06-15 4:18 UTC (permalink / raw) To: Thomas Gleixner Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers On Thu, 2012-06-14 at 19:48 -0400, Steven Rostedt wrote: > I have several branches that I work on and when they are ready, I rebase > them on top of tip, run a bunch of tests, and then push them out. One more thing. I do run specific tests during development aimed at the changes I make. I may rebase on top of tip afterward, run generic tests, and then push out. These activities are not always done right after each other. I probably screwed up by either cherry-picking or rebasing a separate branch, noticing the bug (I remember fixing it before) but then get side-tracked, and use the original branch that didn't have the fix. Or it could have simply gotten reverted with a git reset --hard, where I forgot to do the --amend. For testing something like this, I would have added tracing_off() in the code, and I do reset --hard to remove my testing code. If I made the fix and forgot to --amend it, the fix was lost. Anyway, I added to my generic tests: testing 'echo schedule:traceon > set_ftrace_filter' as well as 'echo schedule:traceoff > set_ftrace_filter' and make sure that they do enable and disable tracing. The traceon and traceoff triggers use tracing_on() and tracing_off() underneath the covers. If they break, the test will fail (I tested the broken patch and it does fail). This regression wont appear again. Since the last time you've yelled at me, I've added a lot of stress tests to ftrace. I test the tracing_on file to make sure it enables and disables. But my generic tests did not include the internal tracing_off() testing. My mistake, and yes I screwed up. -- Steve ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] tracing: Fix the outmost stupidity of tracing_off() 2012-06-15 4:18 ` Steven Rostedt @ 2012-06-15 11:54 ` Steven Rostedt 0 siblings, 0 replies; 10+ messages in thread From: Steven Rostedt @ 2012-06-15 11:54 UTC (permalink / raw) To: Thomas Gleixner Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Peter Zijlstra, H. Peter Anvin, LKML, Mathieu Desnoyers On Fri, 2012-06-15 at 00:18 -0400, Steven Rostedt wrote: > For testing something like this, I would have > added tracing_off() in the code, and I do reset --hard to remove my > testing code. If I made the fix and forgot to --amend it, the fix was > lost. I'm almost positive this is what happened. tracing_off() is similar to trace_printk(), it's something that is usually added during development and not something that is normally executed in production. With an unmodified kernel, to test it, you need to start tracing and trigger a crash and have ftrace_dump_on_opps set, and have something to read it so that you know the tracing stopped. Not very reliable, and not something to put into generic testing. I forgot that the function tracer 'traceoff' trigger uses it, otherwise I would have added that new test a while ago. For my testing, I put in tracing_off() at various places in the kernel, and made sure that it worked. I did see that it was broken before and fixed it. But for some reason, maybe I was distracted by other work, I never did the git commit --amend (similar to forgetting to do a quilt refresh). All my tests worked, and I reverted the test code I added with either a 'git reset --hard', or a 'git diff | patch -p1 -R', both of which would have removed my fix. I usually scan the 'git diff' before doing the revert, but as this change was a one liner, it could easily have been missed. The code I've been working on for the last few months was debugging the breakpoint code, which, when it failed, triple faulted the box. Tracing was never an option. Which means that I would not have played with tracing_off() then. And the older code that I've been working with (3.2-rt), did not have this bug. Again, I know you are one of the biggest users of ftrace, and I'm very sorry (and ashamed) that his bug got in and caused you a days worth of frustration. -- Steve ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2012-06-15 11:54 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-06-14 22:47 [PATCH] tracing: Fix the outmost stupidity of tracing_off() Thomas Gleixner 2012-06-14 22:50 ` Thomas Gleixner 2012-06-14 23:38 ` Steven Rostedt 2012-06-14 22:58 ` Steven Rostedt 2012-06-14 23:12 ` Thomas Gleixner 2012-06-14 23:48 ` Steven Rostedt 2012-06-15 0:00 ` Thomas Gleixner 2012-06-15 0:32 ` Steven Rostedt 2012-06-15 4:18 ` Steven Rostedt 2012-06-15 11:54 ` Steven Rostedt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).