linux-kernel.vger.kernel.org archive mirror
From: linas@austin.ibm.com
To: linux-kernel@vger.kernel.org
Subject: KDB in the mainstream 2.4.x kernels?
Date: Fri, 18 Jul 2003 15:06:33 -0500
Message-ID: <20030718150633.A50102@forte.austin.ibm.com>


                                                                                
Hi,
                                                                                
Will there be a day that I can expect to find KDB in the 2.4.x kernel?
I know that a traditional answer has been 'never', but I would like the
various influencers and decision makers to reconsider ...
                                                                                
I agree with Linus Torvalds that debuggers are 100% useless when you
are working on code that you know intimately.  I know, I've written a
lot of code, I'm proud of it, and I sneer at people who use words like
'development environment'.   Crap, if you can't figure out why your
code crashed, you shouldn't be a programmer.  But these days, I am not
debugging my code. I'm debugging code that I've never seen before.
And for that, I use KDB.
                                                                                
Right now, I work in a job where the *only* thing that I do is to analyze
and sometimes (when I'm lucky) fix kernel crashes.   It's all I do.
I don't write any new code, don't do any porting at all.  I also don't
debug any 2.5/2.6 'unstable' kernels, nor do I handle any new/unstable
device drivers.  I focus entirely on the 2.4.x kernels, and, with a
small team here, there are more than enough kernel bugs to keep us all
completely busy.  The crashes are generated by a test team of 8 people
with dozens of machines.   Ostensibly their mission is to test new
hardware, but in fact, almost all the crashes that they find are kernel
bugs.  The *only* thing that the test team does is to run stress tests.
Basic stuff. Kernel stress. File create/delete/copy. Reiser, JFS, ext3,
swap, OOM, SCSI. Network, NFS, Samba.   Some tests take hours to crash
the kernel, some take days.   But the kernel crashes. It's always crashing.
Corruption, races, missing locks, typos, bad hardware, you name it.
When I get it, it has a KDB prompt in front of it.  KDB is great.
I can figure out where it crashed, I can look at the assembly, I can
examine memory locations. I can chase pointers by hand.  And I can
do it all symbolically, with the symbol names in front of me.  Now,
KDB rarely points right at the bug, but it is invaluable for figuring
out where to start looking.  Sometimes I even find the bug, often
I don't.  But anyway, this is all academic, because it's at work, in
a controlled environment, where I have the time and resources I need.
                                                                                
But the real reason I write this note is that I want to have the same
capability at home.  It suddenly occurred to me that the servers I run
at home sometimes (rarely) crash with the same symptoms as those at work.
Sure, I can probably blame buggy PC hardware.  But .. I dunno.  I've been
consistently ignoring these crashes 'cause it's just too much of a hassle
to try to debug them.  It's not worth the effort.   But hey ... if I had
KDB at home... maybe it would be worth looking into the hangs. I could
see getting motivated to look into some of these.  At least get some
idea of where the machine got hung.  Maybe no fix, but at least
somewhere to lay the blame.
                                                                                
Yes, of course I could just apply the KDB patches myself, but frankly
it's a hassle.  I already play the patch game and I hate it. Every new
kernel, I have to try to remember where to find patch x, how to apply
it, fix up this and that... it's just plain painful.
                                                                                
I know that this is not a forceful argument.  But crashes are a fact of
life, whatever the reason may be.  And the crashes almost always happen
in a piece of code I have *never* laid eyes on before, so it's unrealistic
to try to puzzle it out with the small dollop of info from magic SysRq.
Debuggers can be useless, or worse than useless, when you are a developer
on a piece of code you know well.  But when plunging into foreign territory,
all the tools and firepower that you can muster are worth every bit.
This is why KDB belongs in the mainstream kernel distros.
                                                                                
--linas
                                                                                


