* Re: [lkcd-general] Re: What's left over.
[not found] <551170412@toto.iv>
@ 2002-11-04 3:03 ` Peter Chubb
2002-11-04 13:08 ` Alan Cox
0 siblings, 1 reply; 35+ messages in thread
From: Peter Chubb @ 2002-11-04 3:03 UTC (permalink / raw)
To: linux; +Cc: linux-kernel
>>>>> "linux" == linux <linux@horizon.com> writes:
linux> While a crash dump to just half of one of those mirrors is
linux> fine, finding it might be a little bit tricky. And the fact
linux> that the kernel reassembles the mirrors automatically on boot
linux> might make retrieving the data a little bit tricky, too.
What most other unices do is crash dump to a dedicated swap
partition. LKCD appears to be able to do this. So the setup of MD
etc., isn't going to affect anything.
Peter C
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 3:03 ` [lkcd-general] Re: What's left over Peter Chubb
@ 2002-11-04 13:08 ` Alan Cox
0 siblings, 0 replies; 35+ messages in thread
From: Alan Cox @ 2002-11-04 13:08 UTC (permalink / raw)
To: Peter Chubb; +Cc: linux, Linux Kernel Mailing List
On Mon, 2002-11-04 at 03:03, Peter Chubb wrote:
> >>>>> "linux" == linux <linux@horizon.com> writes:
>
>
> linux> While a crash dump to just half of one of those mirrors is
> linux> fine, finding it might be a little bit tricky. And the fact
> linux> that the kernel reassembles the mirrors automatically on boot
> linux> might make retrieving the data a little bit tricky, too.
>
> What most other unices do is crash dump to a dedicated swap
> partition. LKCD appears to be able to do this. So the setup of MD
> etc., isn't going to affect anything.
I have raid1 swap. That does make a difference to the problem space.
When we get into encrypted raid5 swap over nbd (the security paranoia
dept - store all my swap crypted split into 4 disks in four
jurisdictions...) it gets really fun.
For the normal cases it doesn't seem a problem, even for raid0 swap
since before crash time you can generate a list of device/blocknumber
values and store it in the signed area
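
As an illustration of the precomputed device/blocknumber list Alan describes for raid0 swap, here is a minimal user-space sketch in C. It assumes a simplified striping model with equal-sized member disks. The names dump_target, raid0_map and build_block_list are hypothetical and are not taken from the MD driver or from LKCD.

```c
#include <stdint.h>

/* One entry in the precomputed list: where a dump block physically lands. */
struct dump_target {
    unsigned dev;    /* index of the member disk (hypothetical numbering) */
    uint64_t block;  /* block number on that member disk */
};

/*
 * Map a logical block of a striped (raid0-style) area onto a member disk.
 * chunk_blocks is the stripe chunk size in blocks. This is a simplified
 * model for illustration, not the real MD layout code.
 */
static struct dump_target raid0_map(uint64_t logical, unsigned ndisks,
                                    uint64_t chunk_blocks)
{
    uint64_t chunk = logical / chunk_blocks;  /* which chunk overall */
    struct dump_target t;

    t.dev = (unsigned)(chunk % ndisks);       /* chunks rotate over disks */
    t.block = (chunk / ndisks) * chunk_blocks + logical % chunk_blocks;
    return t;
}

/* Fill out[] with the targets for logical blocks [first, first + count). */
static void build_block_list(struct dump_target *out, uint64_t first,
                             uint64_t count, unsigned ndisks,
                             uint64_t chunk_blocks)
{
    for (uint64_t i = 0; i < count; i++)
        out[i] = raid0_map(first + i, ndisks, chunk_blocks);
}
```

The point of building the list ahead of time is that the crash path then only needs raw (device, block) writes and never has to consult the striping code, which may no longer be trustworthy after a crash.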
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
@ 2002-11-05 20:37 Dr. Greg Wettstein
0 siblings, 0 replies; 35+ messages in thread
From: Dr. Greg Wettstein @ 2002-11-05 20:37 UTC (permalink / raw)
To: Alan Cox, Bill Davidsen
Cc: Matt D. Robinson, Steven King, Linus Torvalds, Joel Becker,
Chris Friesen, Rusty Russell, Linux Kernel Mailing List,
lkcd-general, lkcd-devel
> On Sun, 2002-11-03 at 14:33, Bill Davidsen wrote:
> > If you define "unmaintainably bad" as "having features you don't need"
> > then I agree. But since dump to disk is in almost every other commercial
> > UNIX, maybe someone would question why it's good for others but not for
> > Linux.
Perhaps the other OS's have made bad decisions.
I've only seen one OS in the last 20 years which, by industry
consensus, seems to have some hope of becoming a viable contender to a
monopolistic position. I would hope that we would contemplate the
factors that helped give rise to that situation.
}-- End of excerpt from Alan Cox
As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-4950 WWW: http://www.enjellic.com
FAX: 701-281-3949 EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"How appropriate, you fight like a cow."
-- Guybrush Threepwood
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-03 16:32 ` Alan Cox
@ 2002-11-05 18:07 ` Bill Davidsen
0 siblings, 0 replies; 35+ messages in thread
From: Bill Davidsen @ 2002-11-05 18:07 UTC (permalink / raw)
To: Alan Cox
Cc: Matt D. Robinson, Steven King, Linus Torvalds, Joel Becker,
Chris Friesen, Rusty Russell, Linux Kernel Mailing List,
lkcd-general, lkcd-devel
On 3 Nov 2002, Alan Cox wrote:
> On Sun, 2002-11-03 at 14:33, Bill Davidsen wrote:
> > If you define "unmaintainably bad" as "having features you don't need"
> > then I agree. But since dump to disk is in almost every other commercial
> > UNIX, maybe someone would question why it's good for others but not for
> > Linux.
>
> It isn't about features, it's about clean maintainable code. netdump to me
> doesn't mean no dump to disk option. In fact I'd rather like to be able
> to insmod dump-foo.o. The correctness issues are hard but if the
> dump-foo is standalone, resets the hardware and has an SHA integrity
> check then it can be done (think of it as a post crash variant of the
> trusted computing TCB verification problem)
I certainly don't disagree, but the one critical problem is writing the
dump to the right place, or at least not writing to the wrong place. I'd
love to have disk, net, NVram, whatever choices, but disk is the one which
would help the most. AIX and ISC have dump to swap, and the swapon copies
the data back or clears it, with a fresh O/S load to ensure writing the
right place.
> > uses the crash dump in AIX, the person who wants to send a compressed dump
> > and money to IBM and get back a fix. Netdump assumes external resources
>
> Lots of interesting legal issues but yes you can do it sometimes (DMCA,
> privacy, financial duties sometimes make it horribly complex). Even in
> the case where you only dump the oops it's still valuable.
Agreed, I would think about doing that with a mail server. But even an
oops like ksymoops would be helpful. I started on systems with dumps,
ksymoops is wonderful by comparison.
> > and a functional secure network (is the dump encrypted and I missed it?)
> > which home users surely don't have, and remote servers often lack as well.
>
> Encrypting the dump with the new crypto lib in the kernel would be easy,
> right now it doesn't.
>
> My disk dump concerns are purely those of correctness. That means
>
> 1. After loading the module getting the block list for the dump target
That could all be built as part of init, clearly you can't depend on
demand loading the module.
> 2. Resetting and scratch initializing the dump device
If the modules are to be really self-sufficient it would have to include
the driver. I'll let someone tell me that's not always the case if the
driver can have its own data area.
> 3. Not relying on any code outside of the dump TCB that may have
> been corrupted
Yes, although with separate code, stack and data that's less likely. In
the bad old days self-modifying code was common.
> 4. At dump time turning off all bus masters, doing the dump TCB
> verification and then dumping
The first part of that looks medium hard, particularly if the code has to
be part of the dump module.
> Most of the pieces already exist.
Clearly it can be done even better than the current implementation, and
given an interface standard a replacement in the whole could be done.
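
The "verify the dump TCB before dumping" step from Alan's list can be sketched in user-space C as follows. This is only an illustration under stated assumptions: FNV-1a is used here as a short, non-cryptographic stand-in for the SHA check he mentions, and struct dump_tcb, tcb_seal and tcb_verify are hypothetical names, not part of any existing dump code.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 64-bit: a simple NON-cryptographic stand-in for the SHA digest. */
static uint64_t digest(const unsigned char *p, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;  /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;           /* FNV prime */
    }
    return h;
}

/* The region the dump path depends on, plus the digest recorded for it. */
struct dump_tcb {
    const unsigned char *start;
    size_t len;
    uint64_t expected;
};

/* At module load time: remember what the dump code and data looked like. */
static void tcb_seal(struct dump_tcb *t, const unsigned char *start,
                     size_t len)
{
    t->start = start;
    t->len = len;
    t->expected = digest(start, len);
}

/* At crash time: return 1 if the region is unchanged, 0 if it was modified. */
static int tcb_verify(const struct dump_tcb *t)
{
    return digest(t->start, t->len) == t->expected;
}
```

A dump path built this way would refuse to write anything when tcb_verify() returns 0, which addresses the concern above about writing to the wrong place with corrupted code.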
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 16:57 ` Alan Cox
@ 2002-11-05 9:05 ` Suparna Bhattacharya
0 siblings, 0 replies; 35+ messages in thread
From: Suparna Bhattacharya @ 2002-11-05 9:05 UTC (permalink / raw)
To: Alan Cox
Cc: Linus Torvalds, Richard J Moore, Oliver Xymoron, Dave Anderson,
Linux Kernel Mailing List, lkcd-general, lkcd-general-admin,
Rusty Russell, Matt D. Robinson
On Mon, Nov 04, 2002 at 04:40:11PM +0000, Alan Cox wrote:
> Let me ask another question here
>
> Other than "register_reboot_notifier()" and adding a
> "register_exception_notifier()" chain what else does a dump tool need.
> Register_exception_notifier seems to solve about 90% of the insmod gdb
> problem space as well ?
>
>
I had tried to list these in an earlier mail, added a few more
comments now marked by ">>"
1.Enabling IPI to collect CPU state on all processors in the
system right when dump is triggered (may not be a normal
situation, so NMIs where supported are the best option)
>> set/register_nmi_callback could also help in part (though
>> synchronization issues need to be thought through so that
>> the effect on regular system operation is as low as possible),
>> but we also need an interface to generate the NMI ipi when
>> required, and something that generalises on all architectures.
2.Ability to quiesce (silence) the system before dumping
(and if in non-disruptive mode, then restore it back)
>> smp_call_function may not be the ideal option for many situations
>> - in general we would like to have a separate "force" path
>> available for some troublesome situations, and it would be
>> nice to be able to tackle non-disruptive (but accurate) dumping
>> as well.
>> maybe 1 & 2 can be combined in some form
>> Dump should preferably not overlap with a regularly used IPI.
3. Calls into dump from kernel paths (panic, oops, sysrq
etc).
>> This is where your register_xxx_notifier(s) fit in
4. Exports of symbols to help with physical memory
traversal and verification
>> Covers what Andi Kleen referred to as
>> iterate_over_memmap_and_give_me_type()
>> (a way to figure out the type of memory - true ram or other)
Regards
Suparna
--
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Labs, India
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 14:45 ` Henning P. Schmiedehausen
2002-11-04 15:29 ` Alan Cox
@ 2002-11-05 4:57 ` Werner Almesberger
1 sibling, 0 replies; 35+ messages in thread
From: Werner Almesberger @ 2002-11-05 4:57 UTC (permalink / raw)
To: Henning P. Schmiedehausen; +Cc: linux-kernel
Henning P. Schmiedehausen wrote:
> Good! This means, people debugging the code have actually to think and
> don't produce "turn on debugger, step here, there, patch a band aid,
> done" solutions you see with various other "commercial products"
Unfortunately, just making it hard doesn't guarantee that they
won't try anyway. If you're lucky, at least their band aid will
be so disgusting that you won't be fooled into thinking they
might be right.
But ultimately, it's an attitude problem. Even people who learn
about their bugs by source code reading may then produce a
shabby fix.
Hmm, I wonder if Linus has ever done any protocol design,
followed by validation. I always find the havoc a protocol
validator (e.g. Spin) wreaks a very instructive demonstration
of how much source code level "correctness" really buys you :-)
(Or what chances you'd stand of realizing what happened just
from an Oops.)
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 16:22 ` Linus Torvalds
@ 2002-11-04 16:57 ` Alan Cox
2002-11-05 9:05 ` Suparna Bhattacharya
0 siblings, 1 reply; 35+ messages in thread
From: Alan Cox @ 2002-11-04 16:57 UTC (permalink / raw)
To: Linus Torvalds
Cc: Richard J Moore, Oliver Xymoron, Dave Anderson,
Linux Kernel Mailing List, lkcd-devel, lkcd-general,
lkcd-general-admin, Rusty Russell, Matt D. Robinson
Let me ask another question here
Other than "register_reboot_notifier()" and adding a
"register_exception_notifier()" chain what else does a dump tool need.
Register_exception_notifier seems to solve about 90% of the insmod gdb
problem space as well ?
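
To make the notifier idea concrete, here is a small user-space model in C of a priority-ordered callback chain of the kind a "register_exception_notifier()" could be built on. It is a sketch only: register_exception_notifier() is a proposal in this thread, not an existing interface, and every name below (struct notifier, notifier_register, notifier_call_chain, the two example handlers) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define NOTIFY_OK   0  /* keep walking the chain */
#define NOTIFY_STOP 1  /* this handler claimed the event */

struct notifier {
    int (*call)(struct notifier *self, unsigned long event, void *data);
    int priority;            /* higher priority runs first */
    struct notifier *next;
};

/* Insert n into the chain, keeping it sorted by descending priority. */
static void notifier_register(struct notifier **head, struct notifier *n)
{
    while (*head && (*head)->priority >= n->priority)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Call each handler in order until one returns NOTIFY_STOP. */
static int notifier_call_chain(struct notifier *head, unsigned long event,
                               void *data)
{
    int ret = NOTIFY_OK;

    for (struct notifier *n = head; n; n = n->next) {
        ret = n->call(n, event, data);
        if (ret == NOTIFY_STOP)
            break;
    }
    return ret;
}

/* Example handlers: each appends a tag to the string passed as data. */
static int debugger_handler(struct notifier *self, unsigned long event,
                            void *data)
{
    (void)self; (void)event;
    strcat((char *)data, "gdb;");
    return NOTIFY_OK;
}

static int dump_handler(struct notifier *self, unsigned long event,
                        void *data)
{
    (void)self; (void)event;
    strcat((char *)data, "dump;");
    return NOTIFY_OK;
}
```

With one chain like this, a debugger stub and a dump module could both be loaded independently and would each see the exception, which is the "insmod gdb" overlap Alan points at.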
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
[not found] ` <1036429035.1718.99.camel@irongate.swansea.linux.org.uk.suse.lists.linux.kernel>
@ 2002-11-04 16:53 ` Andi Kleen
0 siblings, 0 replies; 35+ messages in thread
From: Andi Kleen @ 2002-11-04 16:53 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-kernel
Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> Let me ask another question here
>
> Other than "register_reboot_notifier()" and adding a
> "register_exception_notifier()" chain what else does a dump tool need.
> Register_exception_notifier seems to solve about 90% of the insmod gdb
> problem space as well ?
A memory dumper needs some infrastructure to find out what page is ram
and what is hole etc.
Basically an iterate_over_memmap_and_give_me_type() function.
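
A rough user-space sketch in C of what such an iterator might look like is given below. The region table, the mem_type values and the callback signature are all assumptions made for illustration; no such kernel interface is defined in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a page frame. */
enum mem_type { MEM_RAM, MEM_HOLE, MEM_RESERVED };

/* A contiguous run of page frames sharing one type. */
struct mem_region {
    uint64_t start_pfn;
    uint64_t npages;
    enum mem_type type;
};

typedef void (*page_fn)(uint64_t pfn, enum mem_type type, void *ctx);

/* Visit every page frame described by the map and report its type. */
static void iterate_over_memmap(const struct mem_region *map, size_t n,
                                page_fn fn, void *ctx)
{
    for (size_t i = 0; i < n; i++)
        for (uint64_t p = 0; p < map[i].npages; p++)
            fn(map[i].start_pfn + p, map[i].type, ctx);
}

/* Example callback: count the pages a dumper would actually write out. */
static void count_ram(uint64_t pfn, enum mem_type type, void *ctx)
{
    (void)pfn;
    if (type == MEM_RAM)
        ++*(uint64_t *)ctx;
}
```

A dumper would replace count_ram with a callback that writes RAM pages and skips holes, so it never touches addresses that are not backed by real memory.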
-Andi
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 15:38 ` Patrick Finnegan
@ 2002-11-04 16:51 ` Henning P. Schmiedehausen
0 siblings, 0 replies; 35+ messages in thread
From: Henning P. Schmiedehausen @ 2002-11-04 16:51 UTC (permalink / raw)
To: linux-kernel
Patrick Finnegan <pat@purdueriots.com> writes:
>On Mon, 4 Nov 2002, Henning P. Schmiedehausen wrote:
>> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
>>
>> >On Mon, 2002-11-04 at 14:45, Henning P. Schmiedehausen wrote:
>> >> Good! This means, people debugging the code have actually to think and
>> >> don't produce "turn on debugger, step here, there, patch a band aid,
>>
>> >Some of us debug hardware. Regardless of the nice theories about
>> >reviewing your code they don't actually work on hardware because no
>> >amount of code review will let you discover things like undocumented
>> >2uS deskew delays, or errors in DMA engines
>>
>> A debugger won't help you here either. A pci bus probe, a 'scope and a
>> logic analyzer do.
>>
>> (And experience, elbow grease, experience and a nice amount of ESP :-)
>> I do hate hardware. Had to debug too much of it (and just on
>> m68k/MCS-51 where the clock rates are low and the parts easy to
>> solder...).
>I find that hard to believe. You're saying it's impossible to use a
>software debugger to debug the interface between the software and the
No. IMHO it is impossible to use a software debugger to catch 2uS
deskew delays or errors in DMA engines. That's what logic analyzers
are for. If you attach or fire up the debugger, the timing changes and
you're no longer testing the failure case but something different.
>(No Linus, I'm not pushing them, just stating my opinion.)
I am, BTW, completely of your opinion. Personally I find it horrid that
"the XIAFS resurrection" is winked through with "will be probably
accepted for the hack value" and LKCD is rejected with "bloat"
arguments.
But hey, it _is_ Linus' kernel and he may choose as he likes. I
e.g. run vendor kernels (for 2.4).
Regards
Henning
--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH hps@intermeta.de
Am Schwabachgrund 22 Fon.: 09131 / 50654-0 info@intermeta.de
D-91054 Buckenhof Fax.: 09131 / 50654-20
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 11:59 Richard J Moore
2002-11-04 12:27 ` Lars Marowsky-Bree
2002-11-04 16:16 ` John Alvord
@ 2002-11-04 16:22 ` Linus Torvalds
2002-11-04 16:57 ` Alan Cox
2 siblings, 1 reply; 35+ messages in thread
From: Linus Torvalds @ 2002-11-04 16:22 UTC (permalink / raw)
To: Richard J Moore
Cc: Oliver Xymoron, Dave Anderson, linux-kernel, lkcd-devel,
lkcd-general, lkcd-general-admin, Rusty Russell,
Matt D. Robinson
On Mon, 4 Nov 2002, Richard J Moore wrote:
>
> Are you sure? Isn't what Linus is saying is that he understands that some
> problems can be solved using dumps, some from the oops message and some by
> source code inspection and some by others means. But, he's not interested
> in a timely resolution;
Ok, with tons of explanation:
- I'm clearly not interested. I've not seen any discussion of the usage
of the tools or how great it is, and that's apparently because all the
LKCD people are off in their own mailing lists and do not want to have
anything to do with the rest of the world. Except when they come out of
the blue one week before feature freeze and _demand_ that I accept
their patches that I've never seen before or heard anybody talk about.
Hint: think about this part. Deeply. And then go and bother SOMEBODY
ELSE.
- Since I'm not personally convinced, it's not going into my tree.
It's as simple as that. I take stuff that I feel is good. Often that
feeling of goodness comes from trusting the person who sends it to me,
simply by past performance. At other times, it is because I think the
feature is cool, or well done, or whatever.
Hint: if you want stuff in my tree, make me trust you. Or work on
things that I feel are innately interesting. Don't bother dragging me
into your flame-wars and trying to convince me that I "must" apply your
patches.
- If it doesn't go into my tree, is that bad?
NO! Open source is all about _other_ people being able to make their
changes. It by no means means that those changes have to be accepted
back: the license basically only boils down to that I must be _able_ to
accept them back. But the really important thing, the thing that really
makes a difference, is that you, your dog, and your company can make
your OWN changes.
- If it doesn't have to happen in my tree, then whose tree _does_ it have
to happen in?
Doesn't much matter, actually. You can keep it in your tree, for all I
care. OSDL has already picked it up and apparently maintains it in
their tree. The only thing that matters is whether it gets used or not,
and whether it proves itself.
More people use vendor trees than my tree. And if you don't find a
vendor who will apply your patches, there are several "personal
vendors" out there, with the -ac, -aa and -mm trees being the obvious
ones. Many of those trees are not just used, they are also
obviously backed by people I do trust, which brings us back to the
criteria for _me_ to apply patches.
- Considering the above, if you still want it to _eventually_ make it
into my tree, what should you do?
Do you think pestering me makes me like the patches any more and trust
you? And if it doesn't, then how do you expect it to help, considering
my patch acceptance criteria?
No. The way to get it into my tree is not to whine about it. There are
a few different ways to get it into my tree:
(a) prove me wrong. And btw, it doesn't help to do so in your LKCD
mailing list. You need to get those patches out there to
_other_ people, or convince your own people that living in
your little hole just means that nobody else knows or cares
about you.
(b) If you can't convince me, convince somebody else. Maybe that
somebody else is somebody I trust, and that somebody else
feels that I was wrong and since _he_ believes in the project
he will try to convince me about it.
And trust me, the people I trust don't revere me and think I'm
always right. These people call me "pinhead" and tell me when
I'm full of shit. If these people don't believe in your
project, don't blame me and think it's because I "poisoned
their minds".
(c) Push your vendor. I have absolutely _zero_ incentives to care
about whining users (I care deeply about the non-whining
kind), but vendors do. Sometimes they do things just to get
their users off their backs.
And once it's in a vendor tree, that doesn't guarantee I pick
it up, but it _does_ guarantee that the patch is at least
widely used and thus we get more easily to (a) - proving me
wrong outside your own little world.
- Never whine about a patch. I know whining works with a lot of people
("Oh, for chrissake, I'll just do it to get him off my back") but it
works remarkably badly with me. Trust me on this.
Was this clear enough? Any confusion on any particular issue?
In short: convince somebody else. So far, the only thing that the
discussion has convinced me of is that people somehow seem to think that
they are ENTITLED to being merged into my tree. Tough. It ain't so. That
tree is called "Linus' tree" for a reason. The only thing you are
ENTITLED to is to have your own tree.
Linus
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 11:59 Richard J Moore
2002-11-04 12:27 ` Lars Marowsky-Bree
@ 2002-11-04 16:16 ` John Alvord
2002-11-04 16:22 ` Linus Torvalds
2 siblings, 0 replies; 35+ messages in thread
From: John Alvord @ 2002-11-04 16:16 UTC (permalink / raw)
To: Richard J Moore
Cc: Oliver Xymoron, Dave Anderson, linux-kernel, lkcd-devel,
lkcd-general, lkcd-general-admin, Rusty Russell, Linus Torvalds,
Matt D. Robinson
On Mon, 4 Nov 2002 11:59:23 +0000, "Richard J Moore"
<richardj_moore@uk.ibm.com> wrote:
>
>
>> What he really wants is for Andrew or Alan or someone else he trusts
>> to merge it, get actual field results, and declare it useful. If
>> people start visibly passing around crash dump results on l-k and
>> solving problems with them, that'll help too. Until then all he has is
>> his gut feel to go on.
>
>Are you sure? Isn't what Linus is saying is that he understands that some
>problems can be solved using dumps, some from the oops message and some by
>source code inspection and some by others means. But, he's not interested
>in a timely resolution; he has a preference for solving the problems by
>looking at the source and only that way. That's his preference: arguments
>relating to timeliness and commercial considerations are of no interest to
>him - simply because they argue for benefits in which he has no interest.
>Because LKCD doesn't personally interest him he has declared that he will
>not merge it; it's up to some trusted advocate.
What you describe is certainly Linus' general philosophy.
But he also said that the feature was in "vendor push" mode, which
means if enough vendors adopt the feature he would consider. Why do
you think reiserfs got into the mainline - certainly not because he
uses it personally.
He also said he has seen no evidence of its usefulness... not one
report on L-K of kernel problems resolved.
Seems pretty clear to me...
john alvord
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 15:27 ` Henning P. Schmiedehausen
@ 2002-11-04 15:38 ` Patrick Finnegan
2002-11-04 16:51 ` Henning P. Schmiedehausen
0 siblings, 1 reply; 35+ messages in thread
From: Patrick Finnegan @ 2002-11-04 15:38 UTC (permalink / raw)
To: linux-kernel
On Mon, 4 Nov 2002, Henning P. Schmiedehausen wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
>
> >On Mon, 2002-11-04 at 14:45, Henning P. Schmiedehausen wrote:
> >> Good! This means, people debugging the code have actually to think and
> >> don't produce "turn on debugger, step here, there, patch a band aid,
>
> >Some of us debug hardware. Regardless of the nice theories about
> >reviewing your code they don't actually work on hardware because no
> >amount of code review will let you discover things like undocumented
> >2uS deskew delays, or errors in DMA engines
>
> A debugger won't help you here either. A pci bus probe, a 'scope and a
> logic analyzer do.
>
> (And experience, elbow grease, experience and a nice amount of ESP :-)
> I do hate hardware. Had to debug too much of it (and just on
> m68k/MCS-51 where the clock rates are low and the parts easy to
> solder...).
I find that hard to believe. You're saying it's impossible to use a
software debugger to debug the interface between the software and the
hardware? Eg. errors in the hardware that cause periodic anomalies in the
output read by the software would be one thing they could catch, along
with diagnosing that a problem is caused by flaky hardware rather than the
latest not-well-tested VM code. In that last case, since bad hardware can
usually cause a panic, I see crash dumps as an invaluable resource ;-).
(No Linus, I'm not pushing them, just stating my opinion.)
Pat
--
Purdue University ITAP/RCS
Information Technology at Purdue
Research Computing and Storage
http://www-rcd.cc.purdue.edu
http://dilbert.com/comics/dilbert/archive/images/dilbert2040637020924.gif
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 14:45 ` Henning P. Schmiedehausen
@ 2002-11-04 15:29 ` Alan Cox
2002-11-04 15:27 ` Henning P. Schmiedehausen
2002-11-05 4:57 ` Werner Almesberger
1 sibling, 1 reply; 35+ messages in thread
From: Alan Cox @ 2002-11-04 15:29 UTC (permalink / raw)
To: hps; +Cc: Linux Kernel Mailing List
On Mon, 2002-11-04 at 14:45, Henning P. Schmiedehausen wrote:
> Good! This means, people debugging the code have actually to think and
> don't produce "turn on debugger, step here, there, patch a band aid,
Some of us debug hardware. Regardless of the nice theories about
reviewing your code they don't actually work on hardware because no
amount of code review will let you discover things like undocumented
2uS deskew delays, or errors in DMA engines
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 15:29 ` Alan Cox
@ 2002-11-04 15:27 ` Henning P. Schmiedehausen
2002-11-04 15:38 ` Patrick Finnegan
0 siblings, 1 reply; 35+ messages in thread
From: Henning P. Schmiedehausen @ 2002-11-04 15:27 UTC (permalink / raw)
To: linux-kernel
Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
>On Mon, 2002-11-04 at 14:45, Henning P. Schmiedehausen wrote:
>> Good! This means, people debugging the code have actually to think and
>> don't produce "turn on debugger, step here, there, patch a band aid,
>Some of us debug hardware. Regardless of the nice theories about
>reviewing your code they don't actually work on hardware because no
>amount of code review will let you discover things like undocumented
>2uS deskew delays, or errors in DMA engines
A debugger won't help you here either. A pci bus probe, a 'scope and a
logic analyzer do.
(And experience, elbow grease, experience and a nice amount of ESP :-)
I do hate hardware. Had to debug too much of it (and just on
m68k/MCS-51 where the clock rates are low and the parts easy to
solder...).
Regards
Henning
--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH hps@intermeta.de
Am Schwabachgrund 22 Fon.: 09131 / 50654-0 info@intermeta.de
D-91054 Buckenhof Fax.: 09131 / 50654-20
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 2:44 ` [lkcd-general] " Jennie Haywood
@ 2002-11-04 14:45 ` Henning P. Schmiedehausen
2002-11-04 15:29 ` Alan Cox
2002-11-05 4:57 ` Werner Almesberger
0 siblings, 2 replies; 35+ messages in thread
From: Henning P. Schmiedehausen @ 2002-11-04 14:45 UTC (permalink / raw)
To: linux-kernel
Jennie Haywood <jehaywood@compuserve.com> writes:
>The Linux kernel is _extremely_ painful to debug compared to AIX.
Good! This means, people debugging the code have actually to think and
don't produce "turn on debugger, step here, there, patch a band aid,
done" solutions you see with various other "commercial products" (can
anyone really say "Internet Explorer" on this list and live? ;-) )
Regards
Henning
--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH hps@intermeta.de
Am Schwabachgrund 22 Fon.: 09131 / 50654-0 info@intermeta.de
D-91054 Buckenhof Fax.: 09131 / 50654-20
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
@ 2002-11-04 12:34 Richard J Moore
0 siblings, 0 replies; 35+ messages in thread
From: Richard J Moore @ 2002-11-04 12:34 UTC (permalink / raw)
To: Lars Marowsky-Bree
Cc: lars, linux-kernel, lkcd-devel, lkcd-general, lkcd-general-admin
> But arguing about "I have so many fortune 100 companies just lined up ready to
> say that they support this campaign!" is marketing speak. Go away with that
> from Linux kernel, will you.
Thank-you - you have restated my point.
Richard
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-04 11:59 Richard J Moore
@ 2002-11-04 12:27 ` Lars Marowsky-Bree
2002-11-04 16:16 ` John Alvord
2002-11-04 16:22 ` Linus Torvalds
2 siblings, 0 replies; 35+ messages in thread
From: Lars Marowsky-Bree @ 2002-11-04 12:27 UTC (permalink / raw)
To: Richard J Moore
Cc: linux-kernel, lkcd-devel, lkcd-general, lkcd-general-admin
On 2002-11-04T11:59:23,
Richard J Moore <richardj_moore@uk.ibm.com> said:
> So, for those of us who passionately care whether Linux has a system
> dumping mechanism, we need to regroup, we need to decide the correct
> strategy for gaining LKCD's inclusion into the kernel. Many of the
> arguments relate to timeliness and ultimately have a commercial benefit. I
> suggest we actively campaign among the various distros who are interested
> in selling Linux to businesses and providing support. We also need to
> concentrate on consolidating the various requirements of a system crash
> dump - it's going to be much easier for everyone if there is a consensus on
> system dumping technology.
I think you are somewhat missing the point.
Both RH and UnitedLinux seem to care enough for system dump facilities that
they ship patched kernels (netdump / LKCD, respectively). Anyone who cares can
simply apply the patch themselves, if they want to compile from vanilla
sources. Just buy RH AS or any enterprise product powered by United Linux, and
off you go. I assume that your "enterprise customers" will want to do that
anyway because they need all those very useful certifications...
And since l-k (rightly!) mostly refuses to deal with crash/oops reports from
vendor patched kernels anyway, the distributors have to deal with the
diagnosis themselves already and do so as part of the support contracts.
Anyone who runs their own patched kernels probably also is able to do so.
While I can see the issue that having the patch included in the mainstream
kernel offers the usual advantages, it is by no means the absolute requirement
you make it out to be.
It appears that the facilities are all there now; so 2.6 should be the
perfect time to test the various approaches in the field. (And face it, field
experience is rather limited still, but I am very sure it will grow soon
because it is such a useful feature)
Then it can be included. This is how Linux has always worked. reiserfs has
gone through this, as has ext3, XFS, quite a few of the VM patches etc. So no
worries, nobody is being exceptionally harsh in any fashion.
But arguing about "I have so many fortune 100 companies just lined up ready to
say that they support this campaign!" is marketing speak. Go away with that
from Linux kernel, will you.
Come back when it is "I have so many fortune 100 companies actively using this
feature and have solved many problems with it!".
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
Principal Squirrel
SuSE Labs - Research & Development, SuSE Linux AG
"If anything can go wrong, it will." "Chance favors the prepared (mind)."
-- Capt. Edward A. Murphy -- Louis Pasteur
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
@ 2002-11-04 11:59 Richard J Moore
2002-11-04 12:27 ` Lars Marowsky-Bree
` (2 more replies)
0 siblings, 3 replies; 35+ messages in thread
From: Richard J Moore @ 2002-11-04 11:59 UTC (permalink / raw)
To: Oliver Xymoron
Cc: Dave Anderson, linux-kernel, lkcd-devel, lkcd-general,
lkcd-general-admin, Rusty Russell, Linus Torvalds,
Matt D. Robinson
> What he really wants is for Andrew or Alan or someone else he trusts
> to merge it, get actual field results, and declare it useful. If
> people start visibly passing around crash dump results on l-k and
> solving problems with them, that'll help too. Until then all he has is
> his gut feel to go on.
Are you sure? Isn't what Linus is saying is that he understands that some
problems can be solved using dumps, some from the oops message and some by
source code inspection and some by others means. But, he's not interested
in a timely resolution; he has a preference for solving the problems by
looking at the source and only that way. That's his preference: arguments
relating to timeliness and commercial considerations are of no interest to
him - simply because they argue for benefits in which he has no interest.
Because LKCD doesn't personally interest him he has declared that he will
not merge it; it's up to some trusted advocate.
So, for those of us who passionately care whether Linux has a system
dumping mechanism, we need to regroup, we need to decide the correct
strategy for gaining LKCD's inclusion into the kernel. Many of the
arguments relate to timeliness and ultimately have a commercial benefit. I
suggest we actively campaign among the various distros who are interested
in selling Linux to businesses and providing support. We also need to
concentrate on consolidating the various requirements of a system crash
dump - it's going to be much easier for everyone if there is a consensus on
system dumping technology.
First crucial question - are there any avenues still open for 2.5?
Richard J Moore
RAS Project Lead - IBM Linux Technology Centre
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-03 13:48 Bill Davidsen
@ 2002-11-04 2:44 ` Jennie Haywood
2002-11-04 14:45 ` Henning P. Schmiedehausen
0 siblings, 1 reply; 35+ messages in thread
From: Jennie Haywood @ 2002-11-04 2:44 UTC (permalink / raw)
To: Bill Davidsen
Cc: Alan Cox, Linus Torvalds, Chris Friesen, Matt D. Robinson,
Rusty Russell, Linux Kernel Mailing List, lkcd-general,
lkcd-devel
Bill Davidsen wrote:
>
> On 1 Nov 2002, Alan Cox wrote:
>
> > On Fri, 2002-11-01 at 06:34, Bill Davidsen wrote:
> > > From the standpoint of just the driver that's true. However, the remote
> > > machine and all the network bits between them are a string of single
> > > points of failure. Isn't it good that both disk and network can be
> > > supported.
> >
> The AIX support has a group just to beat on dumps customers send. What
> more evidence is needed that people can and do use the capability.
>
AIX has 4 people doing dumps in Austin (otherwise known as ZTRANS). There are
others in other countries. The folks from other countries were brought to
Austin for training (usually for 3 months).
There is usually one person in L3 doing dumps in Austin for service, although
every subsystem has someone that specializes in reading dumps for that subsystem.
The first 4 people only do a scan of the dump to see if it's a known problem. If
it's not a known problem AND it's in AIX code it goes to whoever it is that owns
that subsystem.
Dumps are only the beginning with AIX. Trace hooks along with dumps are VERY
useful. The trace hooks are also what the performance people use.
The Linux kernel is _extremely_ painful to debug compared to AIX.
--
Jennie Haywood
jehaywood@compuserve.com
Everyone is crazy. It's just a matter of degree.
jehaywood@yahoo.com
-
The oak tree in your backyard is just a nut that held its ground.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-03 17:08 linux
@ 2002-11-03 19:14 ` jw schultz
0 siblings, 0 replies; 35+ messages in thread
From: jw schultz @ 2002-11-03 19:14 UTC (permalink / raw)
To: linux-kernel
On Sun, Nov 03, 2002 at 05:08:23PM -0000, linux@horizon.com wrote:
> Just to complicate things, consider this setup:
>
> # cat /proc/swaps
> Filename Type Size Used Priority
> /dev/md5 partition 999864 16904 0
> /dev/md6 partition 999864 16924 0
> /dev/md7 partition 999864 16920 0
>
> Those are all RAID-1 mirrors, a measure whose ass-saving value I have
> enjoyed.
>
> While a crash dump to just half of one of those mirrors is fine, finding it
> might be a little bit tricky. And the fact that the kernel reassembles
> the mirrors automatically on boot might make retrieving the data a little
> bit tricky, too.
>
> (After a crash, the mirrors will be inconsistent, so one will get copied
> over the other, but I'm not too clear on which direction it'll happen in.)
>
> I can't NOT reassemble at least some mirrors on boot because / is mirrored!
>
> Now, to that, add the case that each of those is significantly smaller than
> main memory. (2/3 size would still allow swap = 2*ram.)
You would want a dump2disk that could span devices.
Probably a module that would put a header on each part with
a dumpID and sequence#. Compression would also help here as
well. The right compression would actually accelerate the
process.
Early userspace would locate and assemble the pieces and put
the dump somewhere. This might happen between mounting /
and assembling the other mirrors. That would be up to you.
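A minimal sketch of the per-piece header idea above, in Python for brevity
rather than as kernel code. The header layout (magic, dump ID, sequence
number, piece count, payload length), the "DCHK" magic and both function
names are assumptions made for illustration; none of them come from LKCD.
zlib stands in for whatever compression would actually be used.

```python
import struct
import zlib

# Hypothetical header written in front of each piece of a spanned dump:
# magic, dump ID, sequence number, total piece count, payload length.
HEADER = struct.Struct("<4sQIII")
MAGIC = b"DCHK"


def split_dump(image, dump_id, capacities):
    """Compress a memory image and cut it into pieces that fit the devices."""
    data = zlib.compress(image)
    payloads = []
    offset = 0
    for capacity in capacities:
        if offset >= len(data):
            break
        room = capacity - HEADER.size
        if room <= 0:
            continue  # device cannot hold a header plus any data
        payloads.append(data[offset:offset + room])
        offset += room
    if offset < len(data):
        raise ValueError("not enough space on the dump devices")
    return [
        HEADER.pack(MAGIC, dump_id, seq, len(payloads), len(payload)) + payload
        for seq, payload in enumerate(payloads)
    ]


def assemble_dump(blobs, dump_id):
    """Pick the pieces of one dump out of device contents, in any order."""
    found = {}
    total = None
    for blob in blobs:
        if len(blob) < HEADER.size:
            continue
        magic, found_id, seq, count, length = HEADER.unpack_from(blob)
        if magic != MAGIC or found_id != dump_id:
            continue  # not a piece of the dump we are looking for
        found[seq] = blob[HEADER.size:HEADER.size + length]
        total = count
    if total is None or sorted(found) != list(range(total)):
        raise ValueError("dump is incomplete")
    return zlib.decompress(b"".join(found[seq] for seq in range(total)))
```

Early userspace would call something like assemble_dump() on whatever it
reads from the start of each candidate device; the sequence numbers make
the order in which devices are probed irrelevant.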
--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: jw@pegasys.ws
Remember Cernan and Schmitt
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
@ 2002-11-03 17:08 linux
2002-11-03 19:14 ` jw schultz
0 siblings, 1 reply; 35+ messages in thread
From: linux @ 2002-11-03 17:08 UTC (permalink / raw)
To: linux-kernel
Just to complicate things, consider this setup:
# cat /proc/swaps
Filename Type Size Used Priority
/dev/md5 partition 999864 16904 0
/dev/md6 partition 999864 16924 0
/dev/md7 partition 999864 16920 0
Those are all RAID-1 mirrors, a measure whose ass-saving value I have
enjoyed.
While a crash dump to just half of one of those mirrors is fine, finding it
might be a little bit tricky. And the fact that the kernel reassembles
the mirrors automatically on boot might make retrieving the data a little
bit tricky, too.
(After a crash, the mirrors will be inconsistent, so one will get copied
over the other, but I'm not too clear on which direction it'll happen in.)
I can't NOT reassemble at least some mirrors on boot because / is mirrored!
Now, to that, add the case that each of those is significantly smaller than
main memory. (2/3 size would still allow swap = 2*ram.)
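To make the arithmetic concrete, here is a small sketch in Python. It
assumes the Size column of /proc/swaps is in kilobytes, and it assumes a
hypothetical RAM size of 1.5 times one swap device (so each device is 2/3
of RAM); the function names are made up for illustration.

```python
SWAPS = """\
Filename Type Size Used Priority
/dev/md5 partition 999864 16904 0
/dev/md6 partition 999864 16924 0
/dev/md7 partition 999864 16920 0
"""


def swap_sizes(text):
    """Map each swap device to its size, skipping the header line."""
    sizes = {}
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 5:
            sizes[fields[0]] = int(fields[2])
    return sizes


def dump_fits(ram_kib, sizes):
    """Return (fits on one device, fits when spanning all devices)."""
    single = any(size >= ram_kib for size in sizes.values())
    spanning = sum(sizes.values()) >= ram_kib
    return single, spanning


sizes = swap_sizes(SWAPS)
ram_kib = 999864 * 3 // 2          # hypothetical: each device is 2/3 of RAM
print(dump_fits(ram_kib, sizes))   # no single device is big enough
print(sum(sizes.values()) == 2 * ram_kib)
```

With those assumed numbers the three devices together give exactly twice
RAM, yet an uncompressed dump of all of memory fits on none of them alone.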
The problem is that hardware is getting more and more sophisticated and
requiring ever more elaborate device drivers. Eventually you have to
have a cutoff and say that something is too complex to talk to after a
crash, even though it's theoretically available. Where is that line?
USB? iSCSI? This situation?
A reasonable fallback is to just drop in a cheap crappy dedicated
IDE drive for catching crash dumps, but I'd like the crash dumper to
know how to wake it up from sleep mode; I'd hate to leave it spinning
all the time...
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-03 14:33 ` Bill Davidsen
2002-11-03 15:34 ` Bernd Eckenfels
@ 2002-11-03 16:32 ` Alan Cox
2002-11-05 18:07 ` Bill Davidsen
1 sibling, 1 reply; 35+ messages in thread
From: Alan Cox @ 2002-11-03 16:32 UTC (permalink / raw)
To: Bill Davidsen
Cc: Matt D. Robinson, Steven King, Linus Torvalds, Joel Becker,
Chris Friesen, Rusty Russell, Linux Kernel Mailing List,
lkcd-general, lkcd-devel
On Sun, 2002-11-03 at 14:33, Bill Davidsen wrote:
> If you define "unmaintainably bad" as "having features you don't need"
> then I agree. But since dump to disk is in almost every other commercial
> UNIX, maybe someone would question why it's good for others but not for
> Linux.
It isn't about features, it's about clean maintainable code. netdump to me
doesn't mean no dump to disk option. In fact I'd rather like to be able
to insmod dump-foo.o. The correctness issues are hard but if the
dump-foo is standalone, resets the hardware and has an SHA integrity
check then it can be done (think of it as a post crash variant of the
trusted computing TCB verification problem)
> uses the crash dump in AIX, the person who wants to send a compressed dump
> and money to IBM and get back a fix. Netdump assumes external resources
Lots of interesting legal issues but yes you can do it sometimes (DMCA,
privacy, financial duties sometimes make it horribly complex). Even in
the case where you only dump the oops it's still valuable.
> and a functional secure network (is the dump encrypted and I missed it?)
> which home users surely don't have, and remote servers often lack as well.
Encrypting the dump with the new crypto lib in the kernel would be easy;
right now it doesn't do that.
My disk dump concerns are purely those of correctness. That means
1. After loading the module getting the block list for the dump target
2. Resetting and scratch initializing the dump device
3. Not relying on any code outside of the dump TCB that may have
been corrupted
4. At dump time turning off all bus masters, doing the dump TCB
verification and then dumping
Most of the pieces already exist.
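A userspace model of steps 1, 3 and 4, sketched in Python. It is only an
illustration under assumptions: SHA-256 stands in for the unspecified SHA
check, DumpPlan, prepare() and dump() are hypothetical names, and the
hardware reset and bus-master shutdown are not modelled at all.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class DumpPlan:
    blocks: tuple   # block numbers on the dump target, resolved in advance
    digest: bytes   # hash of the dump code, taken while the system is healthy


def prepare(dump_code, blocks):
    """Module load time: record the block list and a digest of the dump code."""
    return DumpPlan(tuple(blocks), hashlib.sha256(dump_code).digest())


def dump(plan, dump_code, pages, write_block):
    """Crash time: verify the dump code first, then write to known blocks only."""
    current = hashlib.sha256(dump_code).digest()
    if not hmac.compare_digest(current, plan.digest):
        return False    # dump code was corrupted; refuse to touch the disk
    if len(pages) > len(plan.blocks):
        return False    # dump target is too small for this dump
    for block, page in zip(plan.blocks, pages):
        write_block(block, page)
    return True
```

The point of the ordering is that nothing is written unless the code doing
the writing still matches what was loaded, and nothing outside the
pre-computed block list is ever touched.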
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-03 14:33 ` Bill Davidsen
@ 2002-11-03 15:34 ` Bernd Eckenfels
2002-11-03 16:32 ` Alan Cox
1 sibling, 0 replies; 35+ messages in thread
From: Bernd Eckenfels @ 2002-11-03 15:34 UTC (permalink / raw)
To: linux-kernel
In article <Pine.LNX.3.96.1021103092330.5197D-100000@gatekeeper.tmr.com> you wrote:
> If you define "unmaintainably bad" as "having features you don't need"
> then I agree. But since dump to disk is in almost every other commercial
> UNIX, maybe someone would question why it's good for others but not for
> Linux.
It is even in FreeBSD or Windows > ME
Greetings
Bernd
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-03 1:49 ` Alan Cox
@ 2002-11-03 14:33 ` Bill Davidsen
2002-11-03 15:34 ` Bernd Eckenfels
2002-11-03 16:32 ` Alan Cox
0 siblings, 2 replies; 35+ messages in thread
From: Bill Davidsen @ 2002-11-03 14:33 UTC (permalink / raw)
To: Alan Cox
Cc: Matt D. Robinson, Steven King, Linus Torvalds, Joel Becker,
Chris Friesen, Rusty Russell, Linux Kernel Mailing List,
lkcd-general, lkcd-devel
On 3 Nov 2002, Alan Cox wrote:
> I would hope IBM have more intelligence than to attempt to destroy the
> product by trying to force all sorts of junk into it. The Linux world
> has a process for filtering crap, it isn't IBM applying force. That path
> leads to Star Office 5.2, Netscape 4 and other similar scales of horror
> code that become unmaintainably bad.
If you define "unmaintainably bad" as "having features you don't need"
then I agree. But since dump to disk is in almost every other commercial
UNIX, maybe someone would question why it's good for others but not for
Linux.
I can agree on stuff the non-hacker wouldn't use, but that is exactly who
uses the crash dump in AIX, the person who wants to send a compressed dump
and money to IBM and get back a fix. Netdump assumes external resources
and a functional secure network (is the dump encrypted and I missed it?)
which home users surely don't have, and remote servers often lack as well.
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-03 1:24 ` [lkcd-general] " Matt D. Robinson
2002-11-03 1:49 ` Alan Cox
@ 2002-11-03 3:10 ` Christoph Hellwig
1 sibling, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2002-11-03 3:10 UTC (permalink / raw)
To: Matt D. Robinson
Cc: Alan Cox, Bill Davidsen, Steven King, Linus Torvalds,
Joel Becker, Chris Friesen, Rusty Russell,
Linux Kernel Mailing List, lkcd-general, lkcd-devel
On Sat, Nov 02, 2002 at 05:24:17PM -0800, Matt D. Robinson wrote:
> P.S. IBM shouldn't have signed a contract with Red Hat without
> requiring certain features in Red Hat's OS(es). Pushing for
> LKCD, kprobes, LTT, etc., wouldn't be on this list for a whole
> variety of cases if that had been done in the first place.
Bah, it's enough that IBMs money totally fucked up the tree of one popular
distribution..
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-03 1:24 ` [lkcd-general] " Matt D. Robinson
@ 2002-11-03 1:49 ` Alan Cox
2002-11-03 14:33 ` Bill Davidsen
2002-11-03 3:10 ` Christoph Hellwig
1 sibling, 1 reply; 35+ messages in thread
From: Alan Cox @ 2002-11-03 1:49 UTC (permalink / raw)
To: Matt D. Robinson
Cc: Bill Davidsen, Steven King, Linus Torvalds, Joel Becker,
Chris Friesen, Rusty Russell, Linux Kernel Mailing List,
lkcd-general, lkcd-devel
On Sun, 2002-11-03 at 01:24, Matt D. Robinson wrote:
> P.S. IBM shouldn't have signed a contract with Red Hat without
> requiring certain features in Red Hat's OS(es). Pushing for
> LKCD, kprobes, LTT, etc., wouldn't be on this list for a whole
> variety of cases if that had been done in the first place.
I would hope IBM have more intelligence than to attempt to destroy the
product by trying to force all sorts of junk into it. The Linux world
has a process for filtering crap, it isn't IBM applying force. That path
leads to Star Office 5.2, Netscape 4 and other similar scales of horror
code that become unmaintainably bad.
> P.S. As an aside, too many engineers try and make product marketing
> decisions at Red Hat. I personally think that's really bad for
> their business model as a whole (and I'm not referring to LKCD).
You think things like EVMS are a product marketing decision. I'm very
glad you don't run a Linux distro. It would turn into something like the
old 3com rapops rather rapidly by your models (3com rapops btw ceased to
exist and for good reasons)
Alan
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-11-02 15:29 Alan Cox
@ 2002-11-03 1:24 ` Matt D. Robinson
2002-11-03 1:49 ` Alan Cox
2002-11-03 3:10 ` Christoph Hellwig
0 siblings, 2 replies; 35+ messages in thread
From: Matt D. Robinson @ 2002-11-03 1:24 UTC (permalink / raw)
To: Alan Cox
Cc: Bill Davidsen, Steven King, Linus Torvalds, Joel Becker,
Chris Friesen, Rusty Russell, Linux Kernel Mailing List,
lkcd-general, lkcd-devel
On 2 Nov 2002, Alan Cox wrote:
|>On Sat, 2002-11-02 at 05:17, Bill Davidsen wrote:
|>> I was hoping Alan would push Redhat to put this in their Linux so we
|>> could resolve some of the ongoing problems which don't write an oops to a
|>> log, but I guess none of the developers has to actually support production
|>> servers and find out why they crash.
|>
|>I think several Red Hat people would disagree very strongly. Red Hat
|>shipped with the kernel symbol decoding oops reporter for a good reason,
|>and also acquired netdump for a good reason.
It would be great if crash dumping were an option, at the very least
to unify the netdump, oops reporter and disk dumping (for those that
want it) into a single infrastructure. Long term, that's probably
where this is going anyway. It takes away the religious "who is right"
argument, which is fundamentally silly.
Maybe one day. I think quite a few Red Hat customers would
appreciate it.
--Matt
P.S. IBM shouldn't have signed a contract with Red Hat without
requiring certain features in Red Hat's OS(es). Pushing for
LKCD, kprobes, LTT, etc., wouldn't be on this list for a whole
variety of cases if that had been done in the first place.
P.S. As an aside, too many engineers try and make product marketing
decisions at Red Hat. I personally think that's really bad for
their business model as a whole (and I'm not referring to LKCD).
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] RE: What's left over.
2002-10-31 22:47 Perez-Gonzalez, Inaky
@ 2002-11-01 13:06 ` Jan Iven
0 siblings, 0 replies; 35+ messages in thread
From: Jan Iven @ 2002-11-01 13:06 UTC (permalink / raw)
To: 'Linus Torvalds'
Cc: 'linux-kernel@vger.kernel.org',
'lkcd-general@lists.sourceforge.net',
'lkcd-devel@lists.sourceforge.net'
>>>>> "PI" == Perez-Gonzalez, Inaky <inaky.perez-gonzalez@intel.com> writes:
>> THAT is what I mean by vendor-driven. If vendors decide they
>> really want the patches, and I actually start seeing noises on
>> linux-kernel or getting
>> requests for it being merged from _users_ rather than developers, then
>> that means that the vendor is on to something.
For what it is worth, CERN has been using LKCD kernels for the last
6 months or so, enabled mostly on headless farm machines (but the
kernels get deployed to desktops as well). Please consider including
it into the mainstream kernel.
Jan Iven
Linux support / CERN
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-10-31 22:20 Shawn
@ 2002-10-31 23:14 ` Bernhard Kaindl
0 siblings, 0 replies; 35+ messages in thread
From: Bernhard Kaindl @ 2002-10-31 23:14 UTC (permalink / raw)
To: linux-kernel; +Cc: lkcd-general
On Thu, 31 Oct 2002, Shawn wrote:
>
> Linus has to "keep up" with all the changes coming into his inbox as
> well, and the more features, the more breakage that can happen when
> Linus accepts a patch.
Yes, but lkcd differs from the other changes because it can make life
easier even for people who don't need the patch in the first place,
and help quality and shorten the time to fix bugs.
If someone triggers a problem, one can take a free partition or set up
a network dump server, run, and if it happens again, there is a good
chance that all that is needed to fix the problem is in the dump,
the System.map and the Kerntypes file from the kernel, which can
be consolidated into a report with symbolic stack traces of the
CPUs and tasks quite easily.
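As a rough illustration of the System.map half of that, here is a sketch
in Python of turning a raw address from a dump into symbol+offset. It
assumes the usual "address type name" line format; the symbol names and
addresses in the sample are invented, and the function names are
hypothetical. It does not use Kerntypes and is not how the lkcd tools
themselves are implemented.

```python
import bisect

# Invented sample in System.map layout: hex address, symbol type, name.
SYSTEM_MAP = """\
c0100000 T example_start
c0100400 T example_read
c0100a00 t example_write
"""


def load_system_map(text):
    """Parse System.map text into a list of (address, name), sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def symbolize(symbols, address):
    """Return 'name+0xoffset' for the nearest symbol at or below the address."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return hex(address)   # below the first known symbol
    base, name = symbols[index]
    return f"{name}+0x{address - base:x}"


symbols = load_system_map(SYSTEM_MAP)
print(symbolize(symbols, 0xC010041C))
```

Since System.map carries no symbol sizes, an address past the end of the
last symbol still resolves to that symbol; a real tool would want to bound
the lookup.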
Original source, patches and configuration options are good for
analysing but not required if the Kerntypes file is there. The
config options could be even read from the dump if this would
be a liked feature. :-)
> Really, Linus wants to push some of his maintenance overhead to distros,
> who get paid to do it, but also to provide sexy bullet point items for
> users, so they buy "Linux" stuff.
Sure, but the work of the distros could be even better if the base
kernel has lkcd, LTT and dprobes (you don't have to enable them if
you don't need them) because then they would have more resources
to do other, even more useful things. But it's up to whoever
merges the stuff.
Bernd
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
@ 2002-10-31 21:58 Richard J Moore
0 siblings, 0 replies; 35+ messages in thread
From: Richard J Moore @ 2002-10-31 21:58 UTC (permalink / raw)
To: Jeff Garzik
Cc: linux-kernel, lkcd-devel, lkcd-general, lkcd-general-admin,
Rusty Russell, Linus Torvalds, Matt D. Robinson
> So, I think the stock kernel does need some form of disk dumping,
> regardless of any presence/absence of netdump. But LKCD isn't there
> yet...
But if we get into 2.5 the minimal kernel piece we need, we can continue to
enhance and expand dumping capability independently of the kernel via the
dump module. And in this respect we have been actively working on
integrating the netdump concept with lkcd.
Richard
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-10-31 19:57 ` george anzinger
@ 2002-10-31 20:48 ` Stephen Hemminger
0 siblings, 0 replies; 35+ messages in thread
From: Stephen Hemminger @ 2002-10-31 20:48 UTC (permalink / raw)
To: george anzinger
Cc: Patrick Mochel, Dave Craft, Linus Torvalds, Matt D. Robinson,
Rusty Russell, Kernel List, lkcd-general, lkcd-devel
On Thu, 2002-10-31 at 11:57, george anzinger wrote:
> Stephen Hemminger wrote:
> > FYI the criteria I apply for what goes into DCL is:
> > * Applies to large systems and databases
> > * Vendor support
> > * Conforms to Linux standard style
> > * Active project and maintainer that accepts feedback
> > * Community rejection has been mostly positive.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Could you decode this :)
s/rejection/reaction/
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-10-31 19:16 ` Stephen Hemminger
@ 2002-10-31 19:57 ` george anzinger
2002-10-31 20:48 ` Stephen Hemminger
0 siblings, 1 reply; 35+ messages in thread
From: george anzinger @ 2002-10-31 19:57 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Patrick Mochel, Dave Craft, Linus Torvalds, Matt D. Robinson,
Rusty Russell, Kernel List, lkcd-general, lkcd-devel
Stephen Hemminger wrote:
> FYI the criteria I apply for what goes into DCL is:
> * Applies to large systems and databases
> * Vendor support
> * Conforms to Linux standard style
> * Active project and maintainer that accepts feedback
> * Community rejection has been mostly positive.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Could you decode this :)
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-10-31 18:45 ` Patrick Mochel
@ 2002-10-31 19:16 ` Stephen Hemminger
2002-10-31 19:57 ` george anzinger
0 siblings, 1 reply; 35+ messages in thread
From: Stephen Hemminger @ 2002-10-31 19:16 UTC (permalink / raw)
To: Patrick Mochel
Cc: Dave Craft, Linus Torvalds, Matt D. Robinson, Rusty Russell,
Kernel List, lkcd-general, lkcd-devel
On Thu, 2002-10-31 at 10:45, Patrick Mochel wrote:
>
> So, this is precisely where something like OSDL's Carrier Grade and Data
> Center working groups can come into play, amazingly enough.
>
> By now, nearly everyone has heard about the working groups and nearly
> every developer that has, despises them. Even I resist association with
> them. But, they can have some real value to the vendors and the OEMs in
> exactly the way you describe.
>
> Take for example DCL. It's a kernel tree with several base patches
> intended to make Linux better in the data center. The base is not fancy,
> and includes things like LKCD and kdb (I think). It's actively maintained
> and updated more often than Linus makes a release (by virtue of
> bitkeeper).
LKCD is in and I try to keep it up to date with the patch stream.
KDB is not in yet, because the current posted patches are not up to date
to apply cleanly against 2.5.44 or 2.5.45.
> The intent is to later have multiple child trees that implement features
> for a specific application space (e.g. databases), while maintaining the
> same base set of features. People wishing to use the most recent kernel
> with those features can use the DCL tree directly. Or an OEM FAE can use
> the tree to build something for the vendor, or add extra features.
CGL hasn't decided what they want to change to.
DCL is going to have one tree focused on databases.
> Note that it's not a distribution. We don't even make real releases, since
> we don't create tarballs or patches (it's only in BK, which actually kinda
> sucks). It's merely a means to have these features actively maintained and
> kept in synch.
For DCL there is both a bitkeeper tree bk://bk.osdl.org/dcl-2.5 and
regular snapshots available on sourceforge
http://osdldcl.sourceforge.net
> And really, that's what everyone wants. Linus doesn't want the features,
> as don't other developers, regardless of the Buzzword or Coolness factors.
> Some vendors and users do want them. The developers of the features and
> distributors of features don't want to deal with the tedium and pain of
> updating patches each and every release.
>
> In the end, it comes down to the fact that Linus's tree is Linus's tree.
> Other people can have their trees. I'm not going to tell you go off and
> make your own if you want those features so bad, because I know what a
> pain in the ass it is, and I know having someone else do it is a lot
> easier.
>
FYI the criteria I apply for what goes into DCL is:
* Applies to large systems and databases
* Vendor support
* Conforms to Linux standard style
* Active project and maintainer that accepts feedback
* Community rejection has been mostly positive.
> DCL and CGL have their trees, for purposes probably very very similar to
> what your customers need. I encourage you to check them out and work with
> them (or talk to people in your company that are). Try and make it work,
> and everyone can be happy (relatively). And, if DCL and CGL aren't
> satisfying the space that you need, please speak up to OSDL and the
> working groups. People are listening, and willing to take your suggestions
> into consideration.
>
> Relevant URLs:
>
> http://osdl.org/projects/cgl/
> http://osdl.org/projects/dcl/
Stephen Hemminger
Data Center Linux (DCL) Maintainer/Coordinater
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-10-31 17:55 ` [lkcd-general] " Dave Craft
@ 2002-10-31 18:45 ` Patrick Mochel
2002-10-31 19:16 ` Stephen Hemminger
0 siblings, 1 reply; 35+ messages in thread
From: Patrick Mochel @ 2002-10-31 18:45 UTC (permalink / raw)
To: Dave Craft
Cc: Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-kernel,
lkcd-general, lkcd-devel
On Thu, 31 Oct 2002, Dave Craft wrote:
> On Thu, 31 Oct 2002, Linus Torvalds wrote:
>
> > What I'm saying by "vendor driven" is that it has no relevance for the
> > standard kernel, and since it has no relevance to that, then I have no
> > incentives to merge it. The crash dump is only useful with people who
> > actively look at the dumps, and I don't know _anybody_ outside of the
> > specialized vendors you mention who actually do that.
>
> Unfortunately the vast majority of the customers I deal with
> buy a distribution and then put a kernel from kernel.org
> on. I believe this comes about because of either needing fixes
> or function that appear in later kernels that have not made
> it to the distributions' kernels yet.
>
> Even if the distribution included LKCD in their kernel,
> I lose lots of debug ability once customers switch over to
> kernel.org and no longer have the LKCD patch.
>
> Thus we are currently left with having to maintain LKCD patches for
> many arbitrary kernel.org kernels and convince customers to apply
> it BEFORE they start encountering problems that we'll have to look at.
> Application of patches that aren't automatically included in kernel.org
> rarely happens with our customer set (before problems occur),
> no matter how much we flag the issue to them up front.
So, this is precisely where something like OSDL's Carrier Grade and Data
Center working groups can come into play, amazingly enough.
By now, nearly everyone has heard about the working groups and nearly
every developer that has, despises them. Even I resist association with
them. But, they can have some real value to the vendors and the OEMs in
exactly the way you describe.
Take for example DCL. It's a kernel tree with several base patches
intended to make Linux better in the data center. The base is not fancy,
and includes things like LKCD and kdb (I think). It's actively maintained
and updated more often than Linus makes a release (by virtue of
bitkeeper).
The intent is to later have multiple child trees that implement features
for a specific application space (e.g. databases), while maintaining the
same base set of features. People wishing to use the most recent kernel
with those features can use the DCL tree directly. Or an OEM FAE can use
the tree to build something for the vendor, or add extra features.
Note that it's not a distribution. We don't even make real releases, since
we don't create tarballs or patches (it's only in BK, which actually kinda
sucks). It's merely a means to have these features actively maintained and
kept in synch.
And really, that's what everyone wants. Linus doesn't want the features,
as don't other developers, regardless of the Buzzword or Coolness factors.
Some vendors and users do want them. The developers of the features and
distributors of features don't want to deal with the tedium and pain of
updating patches each and every release.
In the end, it comes down to the fact that Linus's tree is Linus's tree.
Other people can have their trees. I'm not going to tell you go off and
make your own if you want those features so bad, because I know what a
pain in the ass it is, and I know having someone else do it is a lot
easier.
DCL and CGL have their trees, for purposes probably very very similar to
what your customers need. I encourage you to check them out and work with
them (or talk to people in your company that are). Try and make it work,
and everyone can be happy (relatively). And, if DCL and CGL aren't
satisfying the space that you need, please speak up to OSDL and the
working groups. People are listening, and willing to take your suggestions
into consideration.
Relevant URLs:
http://osdl.org/projects/cgl/
http://osdl.org/projects/dcl/
-pat "kissing serious butt" mochel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [lkcd-general] Re: What's left over.
2002-10-31 15:46 Linus Torvalds
@ 2002-10-31 17:55 ` Dave Craft
2002-10-31 18:45 ` Patrick Mochel
0 siblings, 1 reply; 35+ messages in thread
From: Dave Craft @ 2002-10-31 17:55 UTC (permalink / raw)
To: Linus Torvalds
Cc: Matt D. Robinson, Rusty Russell, linux-kernel, lkcd-general, lkcd-devel
On Thu, 31 Oct 2002, Linus Torvalds wrote:
> What I'm saying by "vendor driven" is that it has no relevance for the
> standard kernel, and since it has no relevance to that, then I have no
> incentives to merge it. The crash dump is only useful with people who
> actively look at the dumps, and I don't know _anybody_ outside of the
> specialized vendors you mention who actually do that.
Unfortunately the vast majority of the customers I deal with
buy a distribution and then put a kernel from kernel.org
on. I believe this comes about because of either needing fixes
or function that appear in later kernels that have not made
it to the distributions' kernels yet.
Even if the distribution included LKCD in their kernel,
I lose lots of debug ability once customers switch over to
kernel.org and no longer have the LKCD patch.
Thus we are currently left with having to maintain LKCD patches for
many arbitrary kernel.org kernels and convince customers to apply
it BEFORE they start encountering problems that we'll have to look at.
Application of patches that aren't automatically included in kernel.org
rarely happens with our customer set (before problems occur),
no matter how much we flag the issue to them up front.
I realize that while my current capacity makes me fall into
the 'vendor' support you speak of, I believe I am actually
advocating its inclusion on behalf of real live customers.
Vendors can and do actually help linux development, by screening,
researching fixes, and/or directly fixing lots of customer
problems that you never have to deal with. To do that, LKCD
is the debug weapon of choice.
I request you reconsider the inclusion of LKCD.
Regards, Dave
Mail : dave@austin.ibm.com Phone : 512-838-8248
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2002-11-05 20:35 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
[not found] <551170412@toto.iv>
2002-11-04 3:03 ` [lkcd-general] Re: What's left over Peter Chubb
2002-11-04 13:08 ` Alan Cox
2002-11-05 20:37 Dr. Greg Wettstein
[not found] <Pine.LNX.4.44.0211040727330.771-100000@home.transmeta.com.suse.lists.linux.kernel>
[not found] ` <1036429035.1718.99.camel@irongate.swansea.linux.org.uk.suse.lists.linux.kernel>
2002-11-04 16:53 ` Andi Kleen
-- strict thread matches above, loose matches on Subject: below --
2002-11-04 12:34 Richard J Moore
2002-11-04 11:59 Richard J Moore
2002-11-04 12:27 ` Lars Marowsky-Bree
2002-11-04 16:16 ` John Alvord
2002-11-04 16:22 ` Linus Torvalds
2002-11-04 16:57 ` Alan Cox
2002-11-05 9:05 ` Suparna Bhattacharya
2002-11-03 17:08 linux
2002-11-03 19:14 ` jw schultz
2002-11-03 13:48 Bill Davidsen
2002-11-04 2:44 ` [lkcd-general] " Jennie Haywood
2002-11-04 14:45 ` Henning P. Schmiedehausen
2002-11-04 15:29 ` Alan Cox
2002-11-04 15:27 ` Henning P. Schmiedehausen
2002-11-04 15:38 ` Patrick Finnegan
2002-11-04 16:51 ` Henning P. Schmiedehausen
2002-11-05 4:57 ` Werner Almesberger
2002-11-02 15:29 Alan Cox
2002-11-03 1:24 ` [lkcd-general] " Matt D. Robinson
2002-11-03 1:49 ` Alan Cox
2002-11-03 14:33 ` Bill Davidsen
2002-11-03 15:34 ` Bernd Eckenfels
2002-11-03 16:32 ` Alan Cox
2002-11-05 18:07 ` Bill Davidsen
2002-11-03 3:10 ` Christoph Hellwig
2002-10-31 22:47 Perez-Gonzalez, Inaky
2002-11-01 13:06 ` [lkcd-general] " Jan Iven
2002-10-31 22:20 Shawn
2002-10-31 23:14 ` [lkcd-general] " Bernhard Kaindl
2002-10-31 21:58 Richard J Moore
2002-10-31 15:46 Linus Torvalds
2002-10-31 17:55 ` [lkcd-general] " Dave Craft
2002-10-31 18:45 ` Patrick Mochel
2002-10-31 19:16 ` Stephen Hemminger
2002-10-31 19:57 ` george anzinger
2002-10-31 20:48 ` Stephen Hemminger