linux-kernel.vger.kernel.org archive mirror
* Re: Are we going too fast?
@ 2001-08-15 20:13 Roy Murphy
  0 siblings, 0 replies; 51+ messages in thread
From: Roy Murphy @ 2001-08-15 20:13 UTC (permalink / raw)
  To: linux-kernel

'Twas brillig when Mike Edwards scrobe:
>I think that's a bit unfair. Rather, I suspect people see the
>word 'stable', and assume, for some unknown reason, that the kernel is
>stable. *AHEM*

Whatever truth value 2.4 has for the variable stable, it cannot be stored in
a boolean type.

'Stable' means that the direction of development is intended to reduce the number
of bugs, not to add new features unless they can reasonably be shown not to introduce
major bugs.  That the 2.5 tree has not been opened indicates the recognition
that additional concentrated work on 2.4 is needed.

>Seriously, though - even distributions are including 2.4 kernels now. 
>RedHat, Mandrake, Slackware ... Should the latest versions of these 
>distributions be considered unstable as well? 

Even older releases of distributions are maintained.  Should we ever get to
kernel 2.2.38, the distribution maintainers should be releasing bugfix patches
for older distributions with the latest 2.2 kernel.

>Perhaps it needs to be made clear to people that these kernels still 
>aren't all they could be. 

No kernel is perfect.  The judgement was that it was ready to switch from adding
features to increasing stability.  Thus it has ever been since my first kernel
~= 0.95.
-- 
Roy Murphy      \ CSpice -- A mailing list for Clergy Spouses
murphy@panix.com \  http://www.panix.com/~murphy/CSpice.html

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
@ 2001-08-16 21:42 PinkFreud
  0 siblings, 0 replies; 51+ messages in thread
From: PinkFreud @ 2001-08-16 21:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Dr. Kelsey Hudson, Alan Cox

> On Mon, 13 Aug 2001, PinkFreud wrote:
> 
> > This even happens after the BIOS flash - the first few times I switched
> > consoles, it actually survived.  After that, it locked up again.
> 
> Could you perchance be running a framebuffer console?

Not at the moment.  I have tried in the past, curious to see if it resolved
the problem - it didn't.

Again, this only seems to happen in the 2.4 kernels (I just read that 2.4.9
has been released, but haven't tried it yet).  2.2.19 is stable when doing this.

Alan, do you have any more thoughts on why this might be happening?

Thanks.


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.



* Re: Are we going too fast?
  2001-08-14  0:04 ` PinkFreud
  2001-08-14  7:24   ` Francois Romieu
@ 2001-08-15 23:24   ` Dr. Kelsey Hudson
  1 sibling, 0 replies; 51+ messages in thread
From: Dr. Kelsey Hudson @ 2001-08-15 23:24 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel, Alan Cox

On Mon, 13 Aug 2001, PinkFreud wrote:

> This even happens after the BIOS flash - the first few times I switched
> consoles, it actually survived.  After that, it locked up again.

Could you perchance be running a framebuffer console?

 Kelsey Hudson                                           khudson@ctica.com
 Software Engineer
 Compendium Technologies, Inc                               (619) 725-0771
---------------------------------------------------------------------------



* Re: Are we going too fast?
  2001-08-14 20:47 ` Alan Cox
@ 2001-08-15  0:07   ` PinkFreud
  0 siblings, 0 replies; 51+ messages in thread
From: PinkFreud @ 2001-08-15  0:07 UTC (permalink / raw)
  To: Alan Cox; +Cc: Anders Larsen, linux-kernel

On Tue, 14 Aug 2001, Alan Cox wrote:

> Date: Tue, 14 Aug 2001 21:47:03 +0100 (BST)
> From: Alan Cox <alan@lxorguk.ukuu.org.uk>
> To: Anders Larsen <anders@alarsen.net>
> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>, PinkFreud <pf-kernel1@mirkwood.net>,
>      linux-kernel@vger.kernel.org
> Subject: Re: Are we going too fast?
> 
> > Mike didn't mention any details of the hardware where he's experiencing this
> > bug, but is it possibly a multiprocessor machine?
> > Since I only have UP's to test on, the qnxfs might have SMP issues.
> > 
> > Could someone please glance through the code in fs/qnx4 to check if there
> > are any obvious problems?
> 
> If I get time tomorrow I'll test the qnxfs code on a dual PPro


Woah, good point.  While the system I'm trying this on is only a single
cpu machine (AMD K6), I noticed (after I compiled, alas) that the kernel
was compiled with the SMP option enabled - whoops.  I meant to change that
back to UP, but haven't done so as of yet.


Linux boromir 2.4.8 #1 SMP Sun Aug 12 14:08:25 EDT 2001 i586 unknown

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 5
model		: 6
model name	: AMD-K6tm w/ multimedia extensions
stepping	: 1
cpu MHz		: 199.684
cache size	: 64 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr mce cx8 mmx
bogomips	: 398.13
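
For anyone wanting to check for the same mismatch (an SMP-built kernel on a single-CPU box), here is a minimal sketch in Python. It assumes only the uname line and the /proc/cpuinfo layout shown above; the function names are made up for illustration and are not part of the kernel or any standard tool. An SMP kernel on a UP machine is a legal combination, so this flags something worth a second look rather than an error.

```python
import re

def kernel_is_smp(uname_line: str) -> bool:
    """True if the uname version string carries the SMP build tag."""
    return re.search(r"\bSMP\b", uname_line) is not None

def count_processors(cpuinfo_text: str) -> int:
    """Count 'processor : N' entries in /proc/cpuinfo-style text."""
    return len(re.findall(r"^processor\s*:\s*\d+", cpuinfo_text, flags=re.MULTILINE))

def smp_kernel_on_up_box(uname_line: str, cpuinfo_text: str) -> bool:
    """True when an SMP-built kernel reports exactly one processor."""
    return kernel_is_smp(uname_line) and count_processors(cpuinfo_text) == 1

# Sample input copied from the output above (cpuinfo shortened to three fields).
UNAME = "Linux boromir 2.4.8 #1 SMP Sun Aug 12 14:08:25 EDT 2001 i586 unknown"
CPUINFO = "processor\t: 0\nvendor_id\t: AuthenticAMD\ncpu family\t: 5\n"

print(smp_kernel_on_up_box(UNAME, CPUINFO))
```

On a live system the two strings would come from `uname -a` and from reading /proc/cpuinfo instead of the constants used here.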


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.




* Re: Are we going too fast?
  2001-08-14 12:48     ` Alan Cox
@ 2001-08-14 22:30       ` Paul G. Allen
  0 siblings, 0 replies; 51+ messages in thread
From: Paul G. Allen @ 2001-08-14 22:30 UTC (permalink / raw)
  Cc: linux-kernel

Alan Cox wrote:
> 
> >       - use the uhci USB driver when I'm using a USB printer.  If I
> >         use the usb-uhci driver with my USB printer, the whole system
> >         locks.  This has been reported a few times on LKML,
> >         linux-usb-users, and linux-usb-developers and nobody helped,
> >         but a few people wrote back with "me too"s.  It was broken in
> >         the transition from 2.4.3 to 2.4.4 and only seems to affect
> >         SMP systems.  I just gave up on USB printing and went back to
> >         my parallel port.
> 
> usb-uhci seems to not be SMP safe. Ultimately we don't need both uhci
> drivers so that hasn't been one that worried me.  Probably we should drop
> the other uhci driver over time (2.5 maybe)
> 

When I first installed RH 7.1 on my Tyan K7, the system would lock upon boot when the USB module was loaded. I disabled the USB controller in the BIOS and all
was fine. After compiling 2.4.7-ac10 and running it for some time reliably, I re-enabled USB and re-compiled making sure the USB modules were included. They now
load just fine.

Note that I do not (yet) have any USB devices.

PGA
-- 
Paul G. Allen
UNIX Admin II/Programmer
Akamai Technologies, Inc.
www.akamai.com
Work: (858)909-3630
Cell: (858)395-5043


* Re: Are we going too fast?
  2001-08-14 12:49     ` Alan Cox
@ 2001-08-14 22:27       ` Paul G. Allen
  0 siblings, 0 replies; 51+ messages in thread
From: Paul G. Allen @ 2001-08-14 22:27 UTC (permalink / raw)
  Cc: linux-kernel

Alan Cox wrote:
> 
> > If this is truly the case, I'd suggest that kernel.org be modified, as
> > it refers to them as *stable*
> > as of 9:18PM PDT, direct copy & paste from kernel.org page:
> >
> > The latest stable version of the Linux kernel is: 2.4.8 2001-08-11 04:13
> > UTC Changelog
> >
> > The latest prepatch (alpha) version appears to be: 2.4.9-pre3 2001-08-13
> > 23:56 UTC Changelog
> 
> Kernel.org certainly should list the 2.2 status (hey I maintain it I'm
> allowed to be biased). It's unfortunate in many ways that people are still so
> programmed to the "latest version" obsession of the proprietary world
> sometimes. For most people 2.4 is the right choice but for absolute stability
> why change 8)

Agreed. 2.2.x works just fine for us on our servers (some have been up for over a year, some maybe longer, but the longer they're up without problems, the
easier it is to forget they even exist ;) I am using 2.4 because my personal MoBo is so new, it's the only kernel that will work worth a darn on it. I am also
wanting to upgrade some servers as soon as a more stable kernel is available because there are some improvements in the newer kernels that I feel could be of
great benefit (but then that's my personal view, and not necessarily a company view). It has been long known that even numbered kernels are stable kernels, not
necessarily bug free (nothing is, except for what I write ;-), and odd numbered are development kernels. By this definition, 2.4.x kernels are stable (in most
cases it seems it's the hardware that's not).

PGA

-- 
Paul G. Allen
UNIX Admin II/Programmer
Akamai Technologies, Inc.
www.akamai.com
Work: (858)909-3630
Cell: (858)395-5043


* Re: Are we going too fast?
       [not found] <no.id>
  2001-08-13 20:24 ` Alan Cox
@ 2001-08-14 20:47 ` Alan Cox
  2001-08-15  0:07   ` PinkFreud
  1 sibling, 1 reply; 51+ messages in thread
From: Alan Cox @ 2001-08-14 20:47 UTC (permalink / raw)
  To: Anders Larsen; +Cc: Alan Cox, PinkFreud, linux-kernel

> Mike didn't mention any details of the hardware where he's experiencing this
> bug, but is it possibly a multiprocessor machine?
> Since I only have UP's to test on, the qnxfs might have SMP issues.
> 
> Could someone please glance through the code in fs/qnx4 to check if there
> are any obvious problems?

If I get time tomorrow I'll test the qnxfs code on a dual PPro


* Re: Are we going too fast?
  2001-08-14 18:51   ` Anders Larsen
@ 2001-08-14 20:29     ` Anders Larsen
  0 siblings, 0 replies; 51+ messages in thread
From: Anders Larsen @ 2001-08-14 20:29 UTC (permalink / raw)
  To: Alan Cox; +Cc: PinkFreud, linux-kernel

Anders Larsen wrote:
> 
> On 2001-08-13 15:11 Alan Cox wrote:
> > > emergency sync) when attempting to use 'ls' on a mounted QNX filesystem
> > > (ls comes up fine, then system crashes - nothing sent to syslog, no errors
> > > on screen, nothing!) - and this latest is with 2.4.8!
> >
> > The qnxfs code is experimental - so I can believe it might fail in 2.4. I'd
> > be very interested in info on that one.
> 
> The qnxfs code is really quite stable - that's the first time in more than a
> year that I hear of any problem reading a qnx file-system; actually, I've been
> considering removing the 'experimental' tag, but now I'll reconsider...

Come to think of it...
Mike didn't mention any details of the hardware where he's experiencing this
bug, but is it possibly a multiprocessor machine?
Since I only have UP's to test on, the qnxfs might have SMP issues.

Could someone please glance through the code in fs/qnx4 to check if there
are any obvious problems?

cheers
  Anders (maintainer, qnx4fs)
-- 
"In theory there is no difference between theory and practice.
 In practice there is." - Yogi Berra


* Re: Are we going too fast?
@ 2001-08-14 20:20 Per Jessen
  0 siblings, 0 replies; 51+ messages in thread
From: Per Jessen @ 2001-08-14 20:20 UTC (permalink / raw)
  To: David Ford; +Cc: linux-kernel

On Tue, 14 Aug 2001 15:54:33 -0400, David Ford wrote:

>Per Jessen wrote:
>
>>>On Mon, 13 Aug 2001 14:11:32 +0100 (BST), Alan Cox wrote:
>>>
>>>If you want maximum stability you want to be running 2.2 or even 2.0. Newer
>>>less tested code is always less stable. 2.4 won't be as stable as 2.2 for a
>>>year yet.
>>>
>>
>>Couldn't have put that any better. On mission-critical systems, this is
>>exactly what people do. Personally, my experience is from the big-iron
>>world of S390 -  if you're a bleeding-edge organisation, you'll be out
>>there applying the latest PTFs, you'll be running the latest OS/390 etc. 
>>If you're conservative, you're at least 2, maybe 3 releases (in today's
>>OS390 this means about 18-24 months) behind. If you're ultra-conservative,
>>you'll wait for the point where you can no longer avoid an upgrade.
>>
>
>Unfortunately, this methodology also introduces another important 
>factor.  You are the most likely target for exploits and 
>vulnerabilities.  As is ever so strongly evidenced by the great numbers 
>of people being exploited because the version of software they have is 
>outdated.
>
>It's a gross measure of risks; where does the risk come from, how can it 
>affect you, and what can you do about it.

Completely agree. This is also why most big-iron shops employ a couple
of people to do change-management, AKA risk-management. And of course
they'll try to evaluate the risk in moving to the latest OS390 and/or Linux  
versus possible external threats. In the OS390 world, external threats
are limited, and being conservative often pays off very well. In the
Linux world, lots of systems have tight internet connections, and being
alert and up to date will pay off. It all depends - there is no cure-all.


/Per

PS: is it customary to copy posters on a posting to lkml ? I don't
mind, but just to avoid flames.

regards,
Per Jessen, Zurich
http://www.enidan.com - home of the J1 serial console.

Windows 2001: "I'm sorry Dave ...  I'm afraid I can't do that."




* Re: Are we going too fast?
@ 2001-08-14 20:07 Per Jessen
  0 siblings, 0 replies; 51+ messages in thread
From: Per Jessen @ 2001-08-14 20:07 UTC (permalink / raw)
  To: linux-kernel, PinkFreud; +Cc: Alan Cox

On Tue, 14 Aug 2001 12:32:31 -0400 (EDT), PinkFreud wrote:

>> > The latest stable version of the Linux kernel is: 2.4.8 2001-08-11 04:13 
>> > UTC Changelog 
><snip>
>> 
>> Kernel.org certainly should list the 2.2 status (hey I maintain it I'm
>> allowed to be biased). It's unfortunate in many ways that people are still so
>> programmed to the "latest version" obsession of the proprietary world
>> sometimes. For most people 2.4 is the right choice but for absolute stability
>> why change 8)
>
>I think that's a bit unfair.  Rather, I suspect people see the word 'stable',
>and assume, for some unknown reason, that the kernel is stable.  *AHEM*
>
>Seriously, though - even distributions are including 2.4 kernels now.  RedHat,
>Mandrake, Slackware ... Should the latest versions of these distributions be
>considered unstable as well?

SuSE started shipping 7.1 with a 2.4.0 kernel (optional). I think I installed
it on a development workstation just about the time when 2.4.2 was released.

For what we do (www.enidan.com), I tend to be more conservative, so we were
using 2.0.36 for quite some time, until we decided to move entirely to 2.2.12.
Our 16CPU cluster is up at 2.4.8 - trying to break things :-) -  but for things 
that people depend on, it's 2.2.19. Some workstations are at 2.4.x - depends.

/Per


regards,
Per Jessen, Zurich
http://www.enidan.com - home of the J1 serial console.

Windows 2001: "I'm sorry Dave ...  I'm afraid I can't do that."




* Re: Are we going too fast?
  2001-08-13 18:46 Per Jessen
  2001-08-14 13:58 ` Andrew Scott
@ 2001-08-14 19:54 ` David Ford
  1 sibling, 0 replies; 51+ messages in thread
From: David Ford @ 2001-08-14 19:54 UTC (permalink / raw)
  To: Per Jessen; +Cc: linux-kernel

Per Jessen wrote:

>>On Mon, 13 Aug 2001 14:11:32 +0100 (BST), Alan Cox wrote:
>>
>>If you want maximum stability you want to be running 2.2 or even 2.0. Newer
>>less tested code is always less stable. 2.4 won't be as stable as 2.2 for a
>>year yet.
>>
>
>Couldn't have put that any better. On mission-critical systems, this is
>exactly what people do. Personally, my experience is from the big-iron
>world of S390 -  if you're a bleeding-edge organisation, you'll be out
>there applying the latest PTFs, you'll be running the latest OS/390 etc. 
>If you're conservative, you're at least 2, maybe 3 releases (in today's
>OS390 this means about 18-24 months) behind. If you're ultra-conservative,
>you'll wait for the point where you can no longer avoid an upgrade.
>

Unfortunately, this methodology also introduces another important 
factor.  You are the most likely target for exploits and 
vulnerabilities.  As is ever so strongly evidenced by the great numbers 
of people being exploited because the version of software they have is 
outdated.

It's a gross measure of risks; where does the risk come from, how can it 
affect you, and what can you do about it.

One of the most common questions asked in support areas is (take IIS
for example) "My server is being exploited, how can I stop it?" and the 
most common answer to that is "Upgrade and install all necessary patches."

Save for the rare occasion of issue, I run a few different server farms 
and they all perform very well and are all rock solid stable.  I should 
also note that they are all 2.4 kernels.  For servers I seem to have 
really good success stories, for my workstation I tend to have issues 
which is fairly natural, my workstation has numerous accessory cards and 
features.

To be honest, save for either power outage or kernel upgrade, I rarely
have to deal with reboots.  I tend to keep my servers within a few 
releases of the current code.  Due to this policy I rarely have exploit 
and vulnerability issues.  One particular server (which has a VIA 
chipset...is it jinxed? :) has problems now and then but they get fixed.

David




* Re: Are we going too fast?
@ 2001-08-14 19:47 Per Jessen
  0 siblings, 0 replies; 51+ messages in thread
From: Per Jessen @ 2001-08-14 19:47 UTC (permalink / raw)
  To: Helge Hafting, linux-kernel, PinkFreud

On Tue, 14 Aug 2001 09:57:29 +0200, Helge Hafting wrote:

>PinkFreud wrote:
>[...]
>
>> > Matter of opinion. I would say that Linux-2.4 has been way long to come
>> > and wasn't quite ready for stable status. There are numerous other O/Ses
>> 
>> That's what I've been attempting to say, as well.  It seems to have been
>> released too quickly - minimal testing, too many bugs.
>
>The testing isn't minimal - it is merely ongoing.  Users don't
>pay for the kernel, so they are part of the testing team.
>
>If you use anything but a distribution kernel, keep previous
>kernels around when you upgrade.  If the new one fails, report
>it here and go back to the previous one.  The only way to get wide
>testing is when enough people do this.

Very true, although I get the feeling that the 2.2. series was far more
'stable' than the current 2.4 series. Just a feeling, but .... 
What you're saying seems to apply more to a 2.<odd> kernel series, IMHO ?

I haven't done this myself, but perhaps we ought to look at the frequency
of new 2.4 releases compared to new 2.2 releases. Shouldn't their frequency
be roughly equal ? ie. the speed with which we're seeing new 2.4 releases 
should be - roughly - that of which we saw new 2.2 kernels emerging ?
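
The comparison itself is simple arithmetic. A minimal sketch in Python, assuming you have collected the release dates for each series yourself: the function name is hypothetical, and the dates below are placeholders chosen only to make the example run, not real 2.2 or 2.4 release history.

```python
from datetime import date

def mean_release_interval(release_dates):
    """Average number of days between consecutive releases."""
    ordered = sorted(release_dates)
    if len(ordered) < 2:
        raise ValueError("need at least two release dates")
    # Gap in days between each release and the one that follows it.
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)

# Placeholder dates only; substitute the real dates from the changelogs.
series_a = [date(2000, 1, 1), date(2000, 1, 15), date(2000, 2, 12)]
series_b = [date(2000, 1, 1), date(2000, 3, 1), date(2000, 5, 30)]

print(mean_release_interval(series_a))
print(mean_release_interval(series_b))
```

A smaller mean interval for one series would mean it is being released more often; whether that says anything about stability is a separate question.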

comments ?



regards,
Per Jessen, Zurich
http://www.enidan.com - home of the J1 serial console.

Windows 2001: "I'm sorry Dave ...  I'm afraid I can't do that."




* Re: Are we going too fast?
  2001-08-13 13:11 ` Alan Cox
@ 2001-08-14 18:51   ` Anders Larsen
  2001-08-14 20:29     ` Anders Larsen
  0 siblings, 1 reply; 51+ messages in thread
From: Anders Larsen @ 2001-08-14 18:51 UTC (permalink / raw)
  To: Alan Cox; +Cc: PinkFreud, linux-kernel

On 2001-08-13 15:11 Alan Cox wrote:
> > emergency sync) when attempting to use 'ls' on a mounted QNX filesystem
> > (ls comes up fine, then system crashes - nothing sent to syslog, no errors
> > on screen, nothing!) - and this latest is with 2.4.8!
> 
> The qnxfs code is experimental - so I can believe it might fail in 2.4. I'd
> be very interested in info on that one.

The qnxfs code is really quite stable - that's the first time in more than a
year that I hear of any problem reading a qnx file-system; actually, I've been
considering removing the 'experimental' tag, but now I'll reconsider...

Incidentally, I use the qnxfs in a production environment here (tried'em all
up to 2.4.7 - guess I'd better switch to 2.4.8 right now to check, although
the qnxfs code has not changed from 2.4.7)

I'll be very interested in hearing what you find out.

cheers
  Anders (maintainer, qnx4fs)
-- 
"In theory there is no difference between theory and practice.
 In practice there is." - Yogi Berra


* Re: Are we going too fast?
@ 2001-08-14 16:32 PinkFreud
  0 siblings, 0 replies; 51+ messages in thread
From: PinkFreud @ 2001-08-14 16:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Cox

> > The latest stable version of the Linux kernel is: 2.4.8 2001-08-11 04:13 
> > UTC Changelog 
<snip>
> 
> Kernel.org certainly should list the 2.2 status (hey I maintain it I'm
> allowed to be biased). It's unfortunate in many ways that people are still so
> programmed to the "latest version" obsession of the proprietary world
> sometimes. For most people 2.4 is the right choice but for absolute stability
> why change 8)

I think that's a bit unfair.  Rather, I suspect people see the word 'stable',
and assume, for some unknown reason, that the kernel is stable.  *AHEM*

Seriously, though - even distributions are including 2.4 kernels now.  RedHat,
Mandrake, Slackware ... Should the latest versions of these distributions be
considered unstable as well?

Perhaps it needs to be made clear to people that these kernels still aren't
all they could be.


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.



* Re: Are we going too fast?
@ 2001-08-14 16:25 PinkFreud
  0 siblings, 0 replies; 51+ messages in thread
From: PinkFreud @ 2001-08-14 16:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Francois Romieu

> PinkFreud <pf-kernel@mirkwood.net> :
> [...]
> > The unthinkable has happened - it locked up again.  Same problem.  No
> > keyboard, no mouse, no display, no network.  It was as far gone as
> > possible.
> 
> Is the nmi_oopser (Documentation/nmi_watchdog.txt) inefficient here ?

From Documentation/nmi_watchdog.txt:
NOTE: currently the NMI-oopser is enabled unconditionally on x86 SMP
boxes.

I'm not specifically enabling it in LILO, but according to the docs, it's
enabled already.  Unfortunately, the lockup happens when switching between
virtual consoles, so even if something WERE printed to the screen, I'm unlikely
to see it.

Side note: The lockup does *NOT* occur on 2.2.19 with SMP.

> Ueimor


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.



* Re: Are we going too fast?
  2001-08-13 18:46 Per Jessen
@ 2001-08-14 13:58 ` Andrew Scott
  2001-08-14 19:54 ` David Ford
  1 sibling, 0 replies; 51+ messages in thread
From: Andrew Scott @ 2001-08-14 13:58 UTC (permalink / raw)
  To: linux-kernel

On 13 Aug 2001, at 20:46, Per Jessen wrote:

> >On Mon, 13 Aug 2001 14:11:32 +0100 (BST), Alan Cox wrote:
> >
> >If you want maximum stability you want to be running 2.2 or even 2.0. Newer
> >less tested code is always less stable. 2.4 won't be as stable as 2.2 for a
> >year yet.
> 
> Couldn't have put that any better. On mission-critical systems, this is
> exactly what people do. Personally, my experience is from the big-iron
> world of S390 -  if you're a bleeding-edge organisation, you'll be out
> there applying the latest PTFs, you'll be running the latest OS/390 etc. 
> If you're conservative, you're at least 2, maybe 3 releases (in today's
> OS390 this means about 18-24 months) behind. If you're ultra-conservative,
> you'll wait for the point where you can no longer avoid an upgrade.

We've only just now moved from 2.0.36 to 2.2.18, and cautiously at 
that. We've started to run into applications that won't run on the 
older kernel/lib combinations that we need.


                      _
                     / \   / ascott@casdn.neu.edu
                    / \ \ /
                   /   \_/


* Re: Are we going too fast?
  2001-08-14  4:19   ` Nicholas Knight
@ 2001-08-14 12:49     ` Alan Cox
  2001-08-14 22:27       ` Paul G. Allen
  0 siblings, 1 reply; 51+ messages in thread
From: Alan Cox @ 2001-08-14 12:49 UTC (permalink / raw)
  To: tegeran; +Cc: David Ford, PinkFreud, linux-kernel

> If this is truly the case, I'd suggest that kernel.org be modified, as
> it refers to them as *stable*
> as of 9:18PM PDT, direct copy & paste from kernel.org page:
> 
> The latest stable version of the Linux kernel is: 2.4.8 2001-08-11 04:13 
> UTC Changelog 
> 
> The latest prepatch (alpha) version appears to be: 2.4.9-pre3 2001-08-13 
> 23:56 UTC Changelog

Kernel.org certainly should list the 2.2 status (hey I maintain it I'm
allowed to be biased). It's unfortunate in many ways that people are still so
programmed to the "latest version" obsession of the proprietary world
sometimes. For most people 2.4 is the right choice but for absolute stability
why change 8)


* Re: Are we going too fast?
  2001-08-14  4:21   ` Pete Toscano
@ 2001-08-14 12:48     ` Alan Cox
  2001-08-14 22:30       ` Paul G. Allen
  0 siblings, 1 reply; 51+ messages in thread
From: Alan Cox @ 2001-08-14 12:48 UTC (permalink / raw)
  To: Pete Toscano; +Cc: Francois Romieu, PinkFreud, linux-kernel

> 	- use the uhci USB driver when I'm using a USB printer.  If I
> 	  use the usb-uhci driver with my USB printer, the whole system
> 	  locks.  This has been reported a few times on LKML,
> 	  linux-usb-users, and linux-usb-developers and nobody helped,
> 	  but a few people wrote back with "me too"s.  It was broken in
> 	  the transition from 2.4.3 to 2.4.4 and only seems to affect
> 	  SMP systems.  I just gave up on USB printing and went back to
> 	  my parallel port.

usb-uhci seems to not be SMP safe. Ultimately we don't need both uhci
drivers so that hasn't been one that worried me.  Probably we should drop
the other uhci driver over time (2.5 maybe)

Alan


* Re: Are we going too fast?
  2001-08-13 21:36 PinkFreud
@ 2001-08-14  7:57 ` Helge Hafting
  0 siblings, 0 replies; 51+ messages in thread
From: Helge Hafting @ 2001-08-14  7:57 UTC (permalink / raw)
  To: PinkFreud, linux-kernel

PinkFreud wrote:
[...]

> > Matter of opinion. I would say that Linux-2.4 has been way long to come
> > and wasn't quite ready for stable status. There are numerous other O/Ses
> 
> That's what I've been attempting to say, as well.  It seems to have been
> released too quickly - minimal testing, too many bugs.

The testing isn't minimal - it is merely ongoing.  Users don't
pay for the kernel, so they are part of the testing team.

If you use anything but a distribution kernel, keep previous
kernels around when you upgrade.  If the new one fails, report
it here and go back to the previous one.  The only way to get wide
testing is when enough people do this.

Helge Hafting


* Re: Are we going too fast?
  2001-08-13 21:20 ` Alan Cox
  2001-08-13 21:41   ` Rogério Brito
  2001-08-14  0:56   ` Ben Ford
@ 2001-08-14  7:34   ` Peter Wächtler
  2 siblings, 0 replies; 51+ messages in thread
From: Peter Wächtler @ 2001-08-14  7:34 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan Cox wrote:
> 
> Don't suppose you know where I can get a qnx file system to play with ?
> 

You can download the newer QNX RTP at http://get.qnx.com.

You can install it in a separate partition ("primary" ones) or
into an existing Windows FAT FS.

If you want to avoid the effort, I can generate a floppy image and
email one to you.


* Re: Are we going too fast?
  2001-08-14  0:04 ` PinkFreud
@ 2001-08-14  7:24   ` Francois Romieu
  2001-08-15 23:24   ` Dr. Kelsey Hudson
  1 sibling, 0 replies; 51+ messages in thread
From: Francois Romieu @ 2001-08-14  7:24 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel

PinkFreud <pf-kernel@mirkwood.net> :
[...]
> The unthinkable has happened - it locked up again.  Same problem.  No
> keyboard, no mouse, no display, no network.  It was as far gone as
> possible.

Is the nmi_oopser (Documentation/nmi_watchdog.txt) inefficient here ?

-- 
Ueimor


* Re: Are we going too fast?
  2001-08-13  8:55 ` Francois Romieu
@ 2001-08-14  4:21   ` Pete Toscano
  2001-08-14 12:48     ` Alan Cox
  0 siblings, 1 reply; 51+ messages in thread
From: Pete Toscano @ 2001-08-14  4:21 UTC (permalink / raw)
  To: Francois Romieu; +Cc: PinkFreud, linux-kernel

I'm running a SMP (2xPIII 600) on a Tyan Tiger mobo (Via Apollo Pro 133a
chipset) with a G400 and it runs fine, when I do the following:

	- disable APIC ("noapic" as a boot parameter).  Then again, the
	  system won't boot without APIC disabled.
	- use the ALSA drivers for my SoundBlaster Live.  (I haven't
	  tried the kernel-based drivers for a few versions now, so this
	  situation might have changed, but up until I switched to ALSA,
	  I had crashes all the time during medium to high I/O.)
	- use the uhci USB driver when I'm using a USB printer.  If I
	  use the usb-uhci driver with my USB printer, the whole system
	  locks.  This has been reported a few times on LKML,
	  linux-usb-users, and linux-usb-developers and nobody helped,
	  but a few people wrote back with "me too"s.  It was broken in
	  the transition from 2.4.3 to 2.4.4 and only seems to affect
	  SMP systems.  I just gave up on USB printing and went back to
	  my parallel port.

Finally, I'm using RedHat 7.1.  This system has no stability problems
now (after a long series of all kinds of stability problems).  Maybe
it's a load thing, I don't know, but it now runs stable.  

pete

On Mon, 13 Aug 2001, Francois Romieu wrote:

> I'm not convinced that gaining stability on a VIA + G400 + X + smp 
> combo is an easy task anyway.


* Re: Are we going too fast?
  2001-08-14  2:24 ` David Ford
@ 2001-08-14  4:19   ` Nicholas Knight
  2001-08-14 12:49     ` Alan Cox
  0 siblings, 1 reply; 51+ messages in thread
From: Nicholas Knight @ 2001-08-14  4:19 UTC (permalink / raw)
  To: David Ford, PinkFreud; +Cc: linux-kernel

On Monday 13 August 2001 07:24 pm, David Ford wrote:
> PinkFreud wrote:
> >I wasn't aware VIA nor Matrox were broken.  I've seen someone else
> > mention in this thread that perhaps some old HOWTOs on hardware need
> > to be maintained again - I think I agree with that.
>
> VIA comes up as a bloody thorn quite often it seems.  I have a VIA 586B
> system and it seems to work decently but I think I'm just lucky
> considering the large number of broken VIA chipset complaints.
>
> >Perhaps series name should be changed from 'stable' to something else
> > - 'release'?
>
> Erm...they are called release.  2.<even> is a release kernel and
> 2.<odd> is a development kernel.  Some people (myself included) fight
> the impression given by a lot of people of stable/unstable naming.
> Typically called slashdotters...</humor>

If this is truly the case, I'd suggest that kernel.org be modified, as
it refers to them as *stable*
as of 9:18PM PDT, direct copy & paste from kernel.org page:

The latest stable version of the Linux kernel is: 2.4.8 2001-08-11 04:13 
UTC Changelog 

The latest prepatch (alpha) version appears to be: 2.4.9-pre3 2001-08-13 
23:56 UTC Changelog

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 21:07 PinkFreud
  2001-08-13 21:20 ` Alan Cox
@ 2001-08-14  2:24 ` David Ford
  2001-08-14  4:19   ` Nicholas Knight
  1 sibling, 1 reply; 51+ messages in thread
From: David Ford @ 2001-08-14  2:24 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel

PinkFreud wrote:

>I wasn't aware VIA nor Matrox were broken.  I've seen someone else mention in
>this thread that perhaps some old HOWTOs on hardware need to be maintained
>again - I think I agree with that.
>

VIA comes up as a bloody thorn quite often it seems.  I have a VIA 586B 
system and it seems to work decently but I think I'm just lucky 
considering the large number of broken VIA chipset complaints.

>Perhaps series name should be changed from 'stable' to something else - 
>'release'?
>

Erm...they are called release.  2.<even> is a release kernel and 2.<odd> 
is a development kernel.  Some people (myself included) fight the 
impression given by a lot of people of stable/unstable naming. 
 Typically called slashdotters...</humor>
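
As a rough illustration of the even/odd convention described above, here is
a minimal sketch.  The function name series_kind is hypothetical and not part
of any kernel tooling; it only assumes version strings of the form
2.<minor>.<patch> with an optional suffix such as -pre3.

```python
def series_kind(version: str) -> str:
    """Classify a 2.x kernel version string by the parity of its minor number."""
    # Drop any suffix such as "-pre3" before splitting on dots.
    base = version.split("-", 1)[0]
    minor = int(base.split(".")[1])
    # Even minor: release series; odd minor: development series.
    return "release" if minor % 2 == 0 else "development"


if __name__ == "__main__":
    for v in ("2.2.13", "2.4.8", "2.4.9-pre3", "2.5.0"):
        print(v, series_kind(v))
```

Note that a prepatch such as 2.4.9-pre3 still belongs to the release series
under this rule; the sketch says nothing about how well tested it is.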


David



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 21:20 ` Alan Cox
  2001-08-13 21:41   ` Rogério Brito
@ 2001-08-14  0:56   ` Ben Ford
  2001-08-14  7:34   ` Peter Wächtler
  2 siblings, 0 replies; 51+ messages in thread
From: Ben Ford @ 2001-08-14  0:56 UTC (permalink / raw)
  To: linux-kernel

Alan Cox wrote:

>>Unfortunately, that's all the info I have.  Console switching was still
>>working, so I tried enabling logging to a console - no output.  System just
>>hangs.  Any suggestions on what I might try to get more information for you?
>>
>
>Dont suppose you know where I can get a qnx file system to play with ?
>

http://get.qnx.com/

-- 
Number of restrictions placed on "Alice in Wonderland" (public domain) eBook:  5

Maximum penalty for reading "Alice in Wonderland" aloud (possible DMCA    
violation):  5 years jail
                                
Average sentence for committing Rape: 5 years




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 21:44 PinkFreud
@ 2001-08-14  0:04 ` PinkFreud
  2001-08-14  7:24   ` Francois Romieu
  2001-08-15 23:24   ` Dr. Kelsey Hudson
  0 siblings, 2 replies; 51+ messages in thread
From: PinkFreud @ 2001-08-14  0:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Cox

On Mon, 13 Aug 2001, PinkFreud wrote:

> Date: Mon, 13 Aug 2001 17:44:57 -0400 (EDT)
> From: PinkFreud <pf-kernel@mirkwood.net>
> To: linux-kernel@vger.kernel.org
> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
> Subject: Re: Are we going too fast?
> 
> > VIA has some chipset bugs, Matrox G400 cards seem to abuse the PCI spec for 
> > benchmarketing dirties.
> > 
> > (All chipsets have bugs in truth, its just how they appear and if they
> > affect users. As of 2.4.8 the VIA ones should be in the users not affected
> > camp)
> 
> I'll give 2.4.8 a try on the SMP box, and let you know the outcome.

The unthinkable has happened - it locked up again.  Same problem.  No
keyboard, no mouse, no display, no network.  It was as far gone as
possible.

This even happens after the BIOS flash - the first few times I switched
consoles, it actually survived.  After that, it locked up again.

	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
@ 2001-08-13 21:44 PinkFreud
  2001-08-14  0:04 ` PinkFreud
  0 siblings, 1 reply; 51+ messages in thread
From: PinkFreud @ 2001-08-13 21:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Cox

> > Unfortunately, that's all the info I have.  Console switching was still
> > working, so I tried enabling logging to a console - no output.  System just
> > hangs.  Any suggestions on what I might try to get more information for you?
> 
> Dont suppose you know where I can get a qnx file system to play with ?

Same place I got it.  http://get.qnx.com/

> > this thread that perhaps some old HOWTOs on hardware need to be maintained
> > again - I think I agree with that.
> 
> VIA has some chipset bugs, Matrox G400 cards seem to abuse the PCI spec for 
> benchmarketing dirties.
> 
> (All chipsets have bugs in truth, its just how they appear and if they
> affect users. As of 2.4.8 the VIA ones should be in the users not affected
> camp)

I'll give 2.4.8 a try on the SMP box, and let you know the outcome.

> Alan


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 21:20 ` Alan Cox
@ 2001-08-13 21:41   ` Rogério Brito
  2001-08-14  0:56   ` Ben Ford
  2001-08-14  7:34   ` Peter Wächtler
  2 siblings, 0 replies; 51+ messages in thread
From: Rogério Brito @ 2001-08-13 21:41 UTC (permalink / raw)
  To: linux-kernel


On Aug 13 2001, Alan Cox wrote:
> VIA has some chipset bugs, Matrox G400 cards seem to abuse the PCI spec for 
> benchmarketing dirties.

	If I had to purchase a motherboard and graphics card for a
	desktop that were running Linux, which ones should I be
	buying? Are AMD's chipsets better than those made by VIA? And
	what about ATI's cards?

	Of course, doing business with an open source-friendly vendor is a
	requirement that I've been insisting on for, say, 2 or 3 years.


	[]s, Roger...

P.S.: I have an Asus A7V/VIA KT133 board here with a Matrox G400 card
and I wish the performance were better (especially when I'm playing a
DVD). At least, when I'm using the Promise IDE interface and ignore
the IDE interface supplied by the southbridge, Linux feels faster.
-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  Rogério Brito - rbrito@ime.usp.br - http://www.ime.usp.br/~rbrito/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
@ 2001-08-13 21:36 PinkFreud
  2001-08-14  7:57 ` Helge Hafting
  0 siblings, 1 reply; 51+ messages in thread
From: PinkFreud @ 2001-08-13 21:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Gérard Roudier


> On Mon, 13 Aug 2001, PinkFreud wrote:
> 
> > Have a contact address for LSILOGIC?  I'll be happy to CC them in any
> > future bug reports.  It may also be useful to place this address in
> > comments at the top of the ncr53c8xx driver as well.
> 
> It is in fact Pamela Delaney, a LSILOGIC employee, who added support for
> the 53C1010 in the sym53c8xx driver version 1.6. This allowed me time for
> porting version 1.5 (without C1010 support) to FreeBSD (sym driver) and to
> add my own variant of C1010 support to sym. Pamela didn't seem to want to
> add her name and address in the source.  Anyway, you may want to have a
> look at the LSILOGIC web and ftp site for the Linux support. I would be
> surprised if you cannot find Pamela's email address.

I'll look.  Thanks.

> > I think I've proven a number of things to be broken in the 2.4.x series -
> > but they don't seem to be getting fixed.  My point was, perhaps more
> > effort should be put into fixing these bugs, rather than adding new
> > features to a supposedly stable series.
> 
> Matter of opinion. I would say that Linux-2.4 has been way long to come
> and wasn't quite ready for stable status. There are numerous other O/Ses

That's what I've been attempting to say, as well.  It seems to have been
released too quickly - minimal testing, too many bugs.

> that have had to suffer such a problem in their long life, especially
> commercial ones. Nothing that only applies to Linux here, in my
> experience.

I think Linux is something of a unique case here, though.  Linus wanted to get
2.4.x out quickly - and now there's more bugs to deal with than ever.  For
this level of (in?)stability, I'd still expect to see this as a development
kernel.  Please, don't get me wrong - I *do* realize the earlier in a kernel
series we are, the more problems will appear.  I just happen to think that
there are far too many for a series labeled 'stable'.  Either a third series
should be created for the interim, or perhaps the kernels need to be in
'devel' for a bit longer?

Just my 2 cents.

> > If Linux is trying to prove itself usable for the business world, how is
> > that going to help?  I'm not implying that I'm a business in any way,
> > shape, or form - but given that I think the majority of us want to see
> > Linux in the server rooms, and even on the desktop, what does this mean
> > for those users?
> 
> Btw, we are using some Linux machines at the company I work for. They
> do not seem to run 2.4 kernels for the moment. As I am the only guy that
> also uses FreeBSD, I do not want to risk FreeBSD 5 for real work for the
> same reason. :)

5.0 is current, 4.3 is release.  As I understand it, 'current' is the
equivalent of Linux's 'devel' and 'stable' the equivalent of Linux's 'stable'.

If that's the case, your refusal to use a 'current' release on a production
machine would be like refusing to use 2.3.x or 2.5.x on a production machine -
a very sound decision.  But what of 2.4.x?  It's called stable, but yet has a
ways to go.

> OTOH, we have software that explodes Solaris 8 in a millisecond but that
> works reliably on previous Solaris releases, but Solaris 8 is not that
> young an OS release as we know. Just an example that applies to a
> commercial Unix O/S...

True.  But is that due to a bug in the particular software, or the OS?  :P

> > I know plenty of Windows users who are quite upset at the lack of
> > stability.  They either don't know/understand that there are alternatives,
> > or feel it's too hard to switch to an alternative.
> 
> A Windows machine is generally some melting pot of [an O/S + broken
> hardware + broken drivers + broken applications + viruses] driven by
> unaware users. It is a miracle for such a thing to work well enough for
> real work to be possible. Personally, I have no problems with Windows. It
> runs games just fine and since I do not use it for anything else, it just
> fits my needs. :-)

There's plenty of unaware users using Linux nowadays (RedHat, Mandrake, ...).
What does this mean for them?  Distributions are now including 2.4.x kernels.
What happens when their systems blow up, as the 3 I've used here have?

> > I definitely believe this (the random panic) to be a bug in your
> > ncr53c8xx driver.  ksymoops seemed to believe it to be the case, and
> > NetBSD seems to be working fine, which means it's not faulty hardware.
> 
> I have retrieved your bug report (emailed on 28 July 2001). I was on
> vacation from that date until yesterday. I cannot read thousands of emails
> in a couple of hours, sorry.

My apologies.  I understand you're busy.  I just got a bit frustrated when I
found that all three systems I've tried 2.4.x kernels on blew up.

> The problem is due to a NULL pointer being read from the driver DONE
> queue. This queue uses 0xfffffff as a tag for empty entries and valid
> addresses for entries pointing to completed CCBs. Since this driver has
> actually been stable for years (only sym53c8xx was under development), it
> is likely that the driver data structures are being corrupted from some
> other place, rather than this being a driver bug, in my opinion. If this
> also happens on a 2.2.x (x>=18) kernel release, it will be another story,
> obviously.

I haven't tried the later 2.2.x kernels on that machine.  Since I do plan on
using that system in some sort of production capacity, and since it's currently
running NetBSD without a problem, I don't think I'm going to get the chance to
run Linux on it any time in the near future.  I do, as mentioned earlier, have
an Alpha with the same controller, which currently operates just fine with
2.2.14.  I will be more than happy to install the latest 2.2.x kernel on it
when the NetBSD system replaces what it does, and see if it blows up.

> Regards,
>   Gérard.


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 21:07 PinkFreud
@ 2001-08-13 21:20 ` Alan Cox
  2001-08-13 21:41   ` Rogério Brito
                     ` (2 more replies)
  2001-08-14  2:24 ` David Ford
  1 sibling, 3 replies; 51+ messages in thread
From: Alan Cox @ 2001-08-13 21:20 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel, Alan Cox

> Unfortunately, that's all the info I have.  Console switching was still
> working, so I tried enabling logging to a console - no output.  System just
> hangs.  Any suggestions on what I might try to get more information for you?

Dont suppose you know where I can get a qnx file system to play with ?

> this thread that perhaps some old HOWTOs on hardware need to be maintained
> again - I think I agree with that.

VIA has some chipset bugs, Matrox G400 cards seem to abuse the PCI spec for 
benchmarketing dirties.

(All chipsets have bugs in truth, its just how they appear and if they
affect users. As of 2.4.8 the VIA ones should be in the users not affected
camp)

Alan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
@ 2001-08-13 21:07 PinkFreud
  2001-08-13 21:20 ` Alan Cox
  2001-08-14  2:24 ` David Ford
  0 siblings, 2 replies; 51+ messages in thread
From: PinkFreud @ 2001-08-13 21:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Cox

> > of them have suffered from one malady or another - from the dual PIII with
> > the VIA chipset and Matrox G400 card, which locks up nicely when I switch
> 
> Welcome to wacky hardware. To get a G400 stable on x86 you need at least
> 
> XFree86 4.1 if you are running hardware 3D (and DRM 4.1)

I run 4.1.0 on that system.  DRM, I don't believe, is currently enabled,
though I'd like it to be.

> 2.4.8 or higher with the VIA fixes

Oooooh.  So .8 *does* have fixes for VIA... I think I'll give that a try now.

> Preferably a very recent BIOS update for the VIA box

Hmm.  I'll also check VIA to see if they have any updates for this system.
Thanks for the suggestion.

> Of those only the XFree hardware 3d stuff is software bug related.

I'm not currently using 3D - yet the system insists on locking up when I
switch from X to a text console and back.  Again, this only occurs with an
SMP kernel (this is an SMP system).  This does NOT occur with a uniprocessor
kernel.

> > emergency sync) when attempting to use 'ls' on a mounted QNX filesystem
> > (ls comes up fine, then system crashes - nothing sent to syslog, no errors
> > on screen, nothing!) - and this latest is with 2.4.8!
> 
> The qnxfs code is experimental - so I can believe it might fail in 2.4. I'd
> be very interested in info on that one.

Unfortunately, that's all the info I have.  Console switching was still
working, so I tried enabling logging to a console - no output.  System just
hangs.  Any suggestions on what I might try to get more information for you?

> > Should development continue on the latest and supposedly greatest
> > drivers?  Or should the existing bugs be fixed first?  I've got at least
> > three up there that need taking care of, and I'm sure others on this list
> > have found more.  3 separate crashes on 3 separate installs on 3 separate
> > boxes - that's 100% failure rate.  If I get 100% failure on my installs,
> > what are others seeing?
> 
> Near enough 0%. But then I try and avoid buying broken chipsets.

I wasn't aware VIA nor Matrox were broken.  I've seen someone else mention in
this thread that perhaps some old HOWTOs on hardware need to be maintained
again - I think I agree with that.

> > I like Linux.  I'd like to stick with it.  But if it's going to
> > continually crash, I'm going to jump ship - and I'll start recommending to
> 
> If you want maximum stability you want to be running 2.2 or even 2.0. Newer
> less tested code is always less stable. 2.4 won't be as stable as 2.2 for a
> year yet.

Perhaps series name should be changed from 'stable' to something else - 
'release'?

> Alan


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 20:24 ` Alan Cox
@ 2001-08-13 21:06   ` Anthony Barbachan
  0 siblings, 0 replies; 51+ messages in thread
From: Anthony Barbachan @ 2001-08-13 21:06 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

> > > Welcome to wacky hardware. To get a G400 stable on x86 you need at least
> > >
> > > XFree86 4.1 if you are running hardware 3D (and DRM 4.1)
> > > 2.4.8 or higher with the VIA fixes
> > > Preferably a very recent BIOS update for the VIA box
> >
> > I'm sorry, but what "VIA fixes" are we referring to?
>
> Certain VIA chipsets had some nasty bugs that caused corruption. The older
> kernels have a workaround that mostly does the job but has a few side
> effects. The 2.4.8 kernel has the official VIA provided workaround, which
> makes sbpci128 cards work again, and sorts out some bus hangs, especially
> with matrox cards

    Could these "fixes" resolve any issues with the vt82c686a Southbridge?
For the life of me I have yet to be able to get my FIC VA-503A (that uses a
vt82c686a Southbridge for UDMA66 support) working correctly under Linux
2.4.x (or 2.2.x with the enhanced IDE patch) with DMA enabled by default.
And yes I have already tried switching the 80 pin cables 7 times.  Heck, I
even get CRC errors on UDMA33 drives using 40 pin cables; albeit a lesser
amount.  I have also noticed a hanging issue on a FIC VA-503+ board in which
the PC speaker can hang, in mid beep, along with the system for a short
while occasionally when the speaker issues a beep.  By the way, any ideas on
how I can help debug this particular problem?  There is no Oops so I am not
sure how I can help out.  Both systems otherwise work very well and
perfectly on Win9x, Win2k, FreeBSD, and OpenBSD.  I'm starting to take a
very dim view of Linux on VIA boards.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 17:53 PinkFreud
@ 2001-08-13 20:27 ` Gérard Roudier
  0 siblings, 0 replies; 51+ messages in thread
From: Gérard Roudier @ 2001-08-13 20:27 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel



On Mon, 13 Aug 2001, PinkFreud wrote:

> > On Mon, 13 Aug 2001, PinkFreud wrote:
> >

[...]

> > About ncr53c8xx problem reports, I cannot reply to all of them. You may
> > also send them to LSILOGIC support. They also want Linux to work with the
> > ncr/sym/lsi/53c8xx PCI-SCSI controllers, even with old NCR ones. Some
> > other vendors seem to just ignore old hardware. For example NVIDIA, which
> > killed (bought?) 3DFX, does not seem interested in maintaining drivers for
> > the 3DFX graphic chips.
>
> Have a contact address for LSILOGIC?  I'll be happy to CC them in any
> future bug reports.  It may also be useful to place this address in
> comments at the top of the ncr53c8xx driver as well.

It is in fact Pamela Delaney, a LSILOGIC employee, who added support for
the 53C1010 in the sym53c8xx driver version 1.6. This allowed me time for
porting version 1.5 (without C1010 support) to FreeBSD (sym driver) and to
add my own variant of C1010 support to sym. Pamela didn't seem to want to
add her name and address in the source.  Anyway, you may want to have a
look at the LSILOGIC web and ftp site for the Linux support. I would be
surprised if you cannot find Pamela's email address.

> > I have used Linux since some 0.99.x (it was the Yggdrasil distribution). My
> > experience has been that 1.2.13, 2.0.27 and 2.2.13 worked reliably enough
> > for me.
>
> I've used all three of those kernels, and I tend to agree - except for the
> nasty security hole in 2.2.13 (but that happens with any OS - look at
> Windows!).
>
> > 'Stable' does not mean reliable for any workload. It means that we stop
> > developing (which implies changing large portions of code or modifying
> > interfaces) and only focus on fixing the software with its current design
> > (which implies only changing what is proven to be broken).  This applies to all
>
> I think I've proven a number of things to be broken in the 2.4.x series -
> but they don't seem to be getting fixed.  My point was, perhaps more
> effort should be put into fixing these bugs, rather than adding new
> features to a supposedly stable series.

Matter of opinion. I would say that Linux-2.4 has been way long to come
and wasn't quite ready for stable status. There are numerous other O/Ses
that have had to suffer such a problem in their long life, especially
commercial ones. Nothing that only applies to Linux here, in my
experience.

> > software, not only to Linux. As a result, early stable releases still
> > have numerous bugs that may prevent numerous systems from working
> > reliably. It is up to the user to check releases and switch to the one
> > that fits his expectations.
>
> If Linux is trying to prove itself usable for the business world, how is
> that going to help?  I'm not implying that I'm a business in any way,
> shape, or form - but given that I think the majority of us want to see
> Linux in the server rooms, and even on the desktop, what does this mean
> for those users?

Btw, we are using some Linux machines at the company I work for. They
do not seem to run 2.4 kernels for the moment. As I am the only guy that
also uses FreeBSD, I do not want to risk FreeBSD 5 for real work for the
same reason. :)
OTOH, we have software that explodes Solaris 8 in a millisecond but that
works reliably on previous Solaris releases, but Solaris 8 is not that
young an OS release as we know. Just an example that applies to a
commercial Unix O/S...

> > > This brings me to the subject of this rant: are we going too fast?  New
> > > drivers are still showing up in each successive kernel, and yet no one
> > > seems to be able to fix the old bugs that already exist.  Are we looking
> > > to have the reliability of Windows?  It's starting to seem so - each
> > > successive kernel series just seems to crash more and more often.  When
> > > will we reach the point where Windows, on the average, will have greater
> > > uptime than Linux systems?  Perhaps it's time to slow down, and do some
> > > debugging.
> >
> > The reliability of Windows seems to be just fine for most users since it is
> > the O/S most of them want to use.:-)
>
> I know plenty of Windows users who are quite upset at the lack of
> stability.  They either don't know/understand that there are alternatives,
> or feel it's too hard to switch to an alternative.

A Windows machine is generally some melting pot of [an O/S + broken
hardware + broken drivers + broken applications + viruses] driven by
unaware users. It is a miracle for such a thing to work well enough for
real work to be possible. Personally, I have no problems with Windows. It
runs games just fine and since I do not use it for anything else, it just
fits my needs. :-)

> > > Should development continue on the latest and supposedly greatest
> > > drivers?  Or should the existing bugs be fixed first?  I've got at least
> > > three up there that need taking care of, and I'm sure others on this list
> > > have found more.  3 separate crashes on 3 separate installs on 3 separate
> > > boxes - that's 100% failure rate.  If I get 100% failure on my installs,
> > > what are others seeing?
> >
> > Hopefully you aren't a typical computer user or you just have bad luck
> > with computers. :-)
>
> Certainly the case with the former, and sadly, you're not the first to
> suggest the latter.  :)
>
> > All software developers and maintainers want their software to work and
> > thus bugs to be fixed. It is just sometimes hard to know what is
> > actually broken. My experience is that no more than 10% of bug reports
> > about a piece of software are due to a bug in the software that the
> > report points at. And for these less than 10% relevant reports, maintainers
> > must find what is broken... not simple, as you can imagine...
>
> I definitely believe this (the random panic) to be a bug in your
> ncr53c8xx driver.  ksymoops seemed to believe it to be the case, and
> NetBSD seems to be working fine, which means it's not faulty hardware.

I have retrieved your bug report (emailed on 28 July 2001). I was on
vacation from that date until yesterday. I cannot read thousands of emails
in a couple of hours, sorry.

The problem is due to a NULL pointer being read from the driver DONE
queue. This queue uses 0xfffffff as a tag for empty entries and valid
addresses for entries pointing to completed CCBs. Since this driver has
actually been stable for years (only sym53c8xx was under development), it
is likely that the driver data structures are being corrupted from some
other place, rather than this being a driver bug, in my opinion. If this
also happens on a 2.2.x (x>=18) kernel release, it will be another story,
obviously.
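
To make the reasoning concrete, here is a minimal sketch of a DONE queue of
the kind described, written in Python rather than taken from the driver.  The
class and constant names are hypothetical; only the idea of an empty-slot tag
(value copied as written above) versus a valid address comes from the
description.  A NULL entry is neither of the two states the driver itself
ever writes, which is why it points at corruption from elsewhere.

```python
EMPTY = 0xfffffff  # tag for an empty slot, value as written in the message
NULL = 0           # a state the queue's own producer never stores


class DoneQueue:
    """Ring of slots holding either EMPTY or the address of a completed CCB."""

    def __init__(self, size):
        self.slots = [EMPTY] * size
        self.head = 0  # next slot the consumer will inspect

    def pop(self):
        """Return the next completed CCB address, or None if nothing is done."""
        value = self.slots[self.head]
        if value == EMPTY:
            return None
        if value == NULL:
            # Neither empty nor a valid address: something else overwrote it.
            raise RuntimeError("NULL in DONE queue: slot is neither empty nor valid")
        # Mark the slot empty again and advance around the ring.
        self.slots[self.head] = EMPTY
        self.head = (self.head + 1) % len(self.slots)
        return value
```

In this sketch a consumer that finds NULL can only report it; finding out who
wrote the NULL is the hard part.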

[...]

Regards,
  Gérard.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
       [not found] <no.id>
@ 2001-08-13 20:24 ` Alan Cox
  2001-08-13 21:06   ` Anthony Barbachan
  2001-08-14 20:47 ` Alan Cox
  1 sibling, 1 reply; 51+ messages in thread
From: Alan Cox @ 2001-08-13 20:24 UTC (permalink / raw)
  To: John Weber; +Cc: linux-kernel

> > Welcome to wacky hardware. To get a G400 stable on x86 you need at least
> > 
> > XFree86 4.1 if you are running hardware 3D (and DRM 4.1)
> > 2.4.8 or higher with the VIA fixes
> > Preferably a very recent BIOS update for the VIA box
> 
> I'm sorry, but what "VIA fixes" are we referring to?

Certain VIA chipsets had some nasty bugs that caused corruption. The older
kernels have a workaround that mostly does the job but has a few side
effects. The 2.4.8 kernel has the official VIA provided workaround, which
makes sbpci128 cards work again, and sorts out some bus hangs, especially
with matrox cards

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
       [not found] ` <fa.g70as7v.1722ipv@ifi.uio.no>
@ 2001-08-13 19:14   ` John Weber
  0 siblings, 0 replies; 51+ messages in thread
From: John Weber @ 2001-08-13 19:14 UTC (permalink / raw)
  To: linux-kernel



Alan Cox wrote:

>>of them have suffered from one malady or another - from the dual PIII with
>>the VIA chipset and Matrox G400 card, which locks up nicely when I switch
>>
> 
> Welcome to wacky hardware. To get a G400 stable on x86 you need at least
> 
> XFree86 4.1 if you are running hardware 3D (and DRM 4.1)
> 2.4.8 or higher with the VIA fixes
> Preferably a very recent BIOS update for the VIA box
> 


I'm sorry, but what "VIA fixes" are we referring to?

My hardware:
- VIA  Apollo Pro 133A - VIA VT82C686A South, VIA VT82C693A North

The only problem I have ever had with my system had to do with the
onboard sound (via82cxxx_audio driver specifically), and Mr. Jeff
Garzik promptly issued a patch which corrected my problem.

I'm currently running linux kernel 2.4.8 with no problems whatsoever.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
@ 2001-08-13 18:53 Petr Vandrovec
  0 siblings, 0 replies; 51+ messages in thread
From: Petr Vandrovec @ 2001-08-13 18:53 UTC (permalink / raw)
  To: Francois Romieu; +Cc: linux-kernel, pf-kernel

On 13 Aug 01 at 10:55, Francois Romieu wrote:
> 
> Try and send specific bug-reports to the maintainers. 
> l-k archives may give you some light on issues with VIA chipsets.
> 
> I'm not convinced that gaining stability on a VIA + G400 + X + smp 
> combo is an easy task anyway.

VIA (694X) (Gigabyte 6VXD7), G450, XF4.0/XF4.1, SMP (2xPIII/833) works
fine provided that
(1) you do not use the matrox module from Matrox,
(2) there is no PCI activity targeting the G400 while X initializes the
    hardware (during start or a console switch), and
(3) you avoid DRI, which is strongly discouraged because it touches the
    G400 hardware even when X is not in the foreground.

If it is too limiting for you, look for another chipset. With i440BX
you'll get at least 2x faster PCI->AGP transfers than with VIA: i440BX
can handle 60MBps (32bpp full PAL) without any problems, while 694x has 
problems with 30MBps (16bpp full PAL) (IDE disk accesses are visible
as dropouts on picture).

There is nothing the Linux kernel can do for the stability of such a box.
                                            Best regards,
                                                Petr Vandrovec
                                                vandrove@vc.cvut.cz
                                                

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
@ 2001-08-13 18:46 Per Jessen
  2001-08-14 13:58 ` Andrew Scott
  2001-08-14 19:54 ` David Ford
  0 siblings, 2 replies; 51+ messages in thread
From: Per Jessen @ 2001-08-13 18:46 UTC (permalink / raw)
  To: linux-kernel

>On Mon, 13 Aug 2001 14:11:32 +0100 (BST), Alan Cox wrote:
>
>If you want maximum stability you want to be running 2.2 or even 2.0. Newer
>less tested code is always less stable. 2.4 won't be as stable as 2.2 for a
>year yet.

Couldn't have put that any better. On mission-critical systems, this is
exactly what people do. Personally, my experience is from the big-iron
world of S390 -  if you're a bleeding-edge organisation, you'll be out
there applying the latest PTFs, you'll be running the latest OS/390 etc. 
If you're conservative, you're at least 2, maybe 3 releases (in today's 
OS390 this means about 18-24 months) behind. If you're ultra-conservative,
you'll wait for the point where you can no longer avoid an upgrade.


regards,
Per Jessen


regards,
Per Jessen, Zurich
http://www.enidan.com - home of the J1 serial console.

Windows 2001: "I'm sorry Dave ...  I'm afraid I can't do that."



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
@ 2001-08-13 17:53 PinkFreud
  2001-08-13 20:27 ` Gérard Roudier
  0 siblings, 1 reply; 51+ messages in thread
From: PinkFreud @ 2001-08-13 17:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: Gérard Roudier

> On Mon, 13 Aug 2001, PinkFreud wrote:
>
> > Please CC me in any replies, I am not subscribed to this list.

This still holds true - I'm not subscribed to this list right now.

> >
> > Please forgive me if I seem incoherent.  It's after 3:30 AM here.
>
> So, you will be forgiven, otherwise ... :-)

Thanks.  :)

> You may want to elaborate on the ncr53c8xx problems (I maintain this
> driver). More generally, you must not ignore the thousands of bugs in the
> hardware you are using, but software developers don't have access to all
> errata descriptions since hardware vendors do not like to make this
> information freely available.

I have elaborated.  See below.

>
> About ncr53c8xx problem reports, I cannot reply to all of them. You may
> also send them to LSILOGIC support. They also want Linux to work with the
> ncr/sym/lsi/53c8xx PCI-SCSI controllers, even with old NCR ones. Some
> other vendors seem to just ignore old hardware. For example NVIDIA, which
> killed (bought?) 3DFX, does not seem interested in maintaining drivers for
> the 3DFX graphics chips.

Have a contact address for LSILOGIC?  I'll be happy to CC them in any
future bug reports.  It may also be useful to place this address in
comments at the top of the ncr53c8xx driver as well.

> I have used Linux since some 0.99.x (it was the Yggdrasil distribution). My
> experience has been that 1.2.13, 2.0.27 and 2.2.13 worked reliably enough for me.

I've used all three of those kernels, and I tend to agree - except for the
nasty security hole in 2.2.13 (but that happens with any OS - look at
Windows!).

> 'Stable' does not mean reliable for any workload. It means that we stop
> developing (implies changing large portions of code or modifying
> interfaces) but only focus on fixing the software with its current design
> (implies only changing what is proven to be broken).  This applies to all

I think I've proven a number of things to be broken in the 2.4.x series -
but they don't seem to be getting fixed.  My point was, perhaps more
effort should be put into fixing these bugs, rather than adding new
features to a supposedly stable series.

> software, not only to Linux. As a result, early stable releases still
> have numerous bugs that may prevent numerous systems from working
> reliably. It is up to the user to check releases and switch to the one that
> fits his expectations.

If Linux is trying to prove itself usable for the business world, how is
that going to help?  I'm not implying that I'm a business in any way,
shape, or form - but given that I think the majority of us want to see
Linux in the server rooms, and even on the desktop, what does this mean
for those users?

> > This brings me to the subject of this rant: are we going too fast?  New
> > drivers are still showing up in each successive kernel, and yet no one
> > seems to be able to fix the old bugs that already exist.  Are we looking
> > to have the reliability of Windows?  It's starting to seem so - each
> > successive kernel series just seems to crash more and more often.  When
> > will we reach the point where Windows, on the average, will have greater
> > uptime than Linux systems?  Perhaps it's time to slow down, and do some
> > debugging.
>
> The reliability of Windows seems to be just fine for most users since it is
> the O/S most of them want to use. :-)

I know plenty of Windows users who are quite upset at the lack of
stability.  They either don't know/understand that there are alternatives,
or feel it's too hard to switch to an alternative.

> > Should development continue on the latest and supposedly greatest
> > drivers?  Or should the existing bugs be fixed first?  I've got at least
> > three up there that need taking care of, and I'm sure others on this list
> > have found more.  3 separate crashes on 3 separate installs on 3 separate
> > boxes - that's 100% failure rate.  If I get 100% failure on my installs,
> > what are others seeing?
>
> Hopefully you aren't a typical computer user or you just have bad luck
> with computers. :-)

Certainly the case with the former, and sadly, you're not the first to
suggest the latter.  :)

> All software developers and maintainers want their software to work and
> thus bugs to be fixed. It is just sometimes hard to know what is
> actually broken. My experience is that no more than 10% of bug reports about
> a piece of software are due to a bug in the software that is pointed out by
> the report. And for these less than 10% relevant reports, maintainers must
> find what is broken... not as simple as you might imagine...

I definitely believe this (the random panic) to be a bug in your
ncr53c8xx driver.  ksymoops seemed to believe it to be the case, and
NetBSD seems to be working fine, which means it's not faulty hardware.

> Btw, I use SYM-2 driver under Linux, FreeBSD and NetBSD 1.5. I have no
> problem with it. If you plan to use Ultra-160 LSI53C1010 chips, the NetBSD
> SIOP driver may be sub-optimal and, btw, it does not seem to know about
> the C1010 chip errata.

I'll keep that in mind.  However, the box in question is an older system,
so I doubt it'll ever see one of those.

By the way, the driver seems to work with 2.2.14 on an Alpha.  On this
system, though, 2.4.x just manages to blow up.

> You do not seem to have given FreeBSD a try. Were there some strong
> reasons for that?

Actually, I was considering both Free- and NetBSD.  I just chose NetBSD.

> > sauron@rivendell:~$ uptime
> >  3:17AM  up 12 days, 15:20, 2 users, load averages: 1.48, 0.66, 0.31
> > sauron@rivendell:~$ uname -a
> > NetBSD rivendell 1.5.1 NetBSD 1.5.1 (RIVENDELL) #0: Tue Jul 31 22:58:54
> > EDT 2001     root@rivendell:/usr/src/sys/arch/i386/compile/RIVENDELL i386
> > sauron@rivendell:~$ dmesg | grep -i sym
> > siop0 at pci0 dev 6 function 0: Symbios Logic 53c810 (fast scsi)
> >
> > (The controller is old - it was made by NCR before it became Symbios Logic
> > - hence, why I was using the NCR driver for it, rather than the Symbios
> > driver, in Linux.)
> >
> > Working on 13 days uptime.  That's well over twice the uptime for Linux on
> > that box.  That's what happens when the kernel has bugs.
>
> You seem so sure it is the ncr53c8xx driver that breaks your Linux ...
> If it was so broken, maybe I would have heard about it. :-)

You should have heard about it.  The last two messages were sent to the
address you have listed in your driver.


Date: Fri, 20 Jul 2001 13:26:12 -0400 (EDT)
From: PinkFreud <pf-kernel@mirkwood.net>
To: linux-kernel@vger.kernel.org
Subject: two seperate 2.4.x problems...
Message-ID: <Pine.LNX.4.20.0107201305350.5411-100000@eriador.mirkwood.net>

                                                                          
Date: Mon, 23 Jul 2001 14:11:35 -0400 (EDT)
From: PinkFreud <pf-kernel@mirkwood.net>
To: linux-kernel@vger.kernel.org
cc: Gerard Roudier <groudier@club-internet.fr>
Subject: 2.4.6 NCR53C8XX bug?  (was: 2.4.x problems (this is *not* a
    distribution related question!))
Message-ID: <Pine.LNX.4.20.0107231347060.5411-100000@eriador.mirkwood.net>


Date: Sat, 28 Jul 2001 22:20:20 -0400 (EDT)
From: PinkFreud <pf-kernel@mirkwood.net>
To: linux-kernel@vger.kernel.org
cc: Gerard Roudier <groudier@club-internet.fr>
Subject: 2.4.7 oops + panic in ncr53c8xx (ncr_wakeup_done)
Message-ID: <Pine.LNX.4.20.0107282207180.316-100000@eriador.mirkwood.net>


Would you like me to re-send the ksymoops output?

> > Take this rant for what you will.  Personally, I switched from Windows to
> > Linux 5 years ago for the stability.  If I need to switch OSs again to
> > continue to have stability, I will.  Somehow, I suspect, if kernel
> > development continues down this path, many others will wind up switching
> > to other OSs as well.
>
> If NetBSD fits your need, then let me encourage you to use it.

It is for the moment.  I hoped Linux would fit my need, though.

> > I like Linux.  I'd like to stick with it.  But if it's going to
> > continually crash, I'm going to jump ship - and I'll start recommending to
> > others that they do the same.
>
> That's an unwise recommendation, in my opinion.
> For example, my children are happy using Windows 98 and I do not want to
> recommend anything else to them.

I recommend using what I feel is usable (which includes stability) - which
is why I never recommend using Windows.  But that's just me.


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13  7:43 PinkFreud
                   ` (6 preceding siblings ...)
  2001-08-13 13:55 ` Anton Altaparmakov
@ 2001-08-13 17:16 ` Stephen Satchell
  7 siblings, 0 replies; 51+ messages in thread
From: Stephen Satchell @ 2001-08-13 17:16 UTC (permalink / raw)
  To: Alan Cox, PinkFreud; +Cc: linux-kernel

At 02:11 PM 8/13/01 +0100, Alan Cox wrote:
> > Should development continue on the latest and supposedly greatest
> > drivers?  Or should the existing bugs be fixed first?  I've got at least
> > three up there that need taking care of, and I'm sure others on this list
> > have found more.  3 separate crashes on 3 separate installs on 3 separate
> > boxes - that's 100% failure rate.  If I get 100% failure on my installs,
> > what are others seeing?
>
>Near enough 0%. But then I try and avoid buying broken chipsets.

Back in the old 2.0 days there were a couple of HOWTOs that discussed what
hardware worked and didn't work under Linux.  I remember using the
Ethernet-HOWTO as a "shopping list" when going to ham fests, "Weird Stuff",
and similar consignment/surplus/boneyard houses.  Have these HOWTOs been 
maintained?

(I've noticed that a number of very, very useful HOWTOs have fallen into 
the "unmaintained" category.)

I happen to think that "stories around the campfire" are fine, but with an 
OS like Linux we should have the "Campfire FAQ"...  :)

Satch


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 10:29   ` Justin Guyett
  2001-08-13 12:56     ` Andrzej Krzysztofowicz
@ 2001-08-13 16:54     ` Gérard Roudier
  1 sibling, 0 replies; 51+ messages in thread
From: Gérard Roudier @ 2001-08-13 16:54 UTC (permalink / raw)
  To: Justin Guyett; +Cc: linux-kernel



On Mon, 13 Aug 2001, Justin Guyett wrote:

> On Mon, 13 Aug 2001, Gérard Roudier wrote:
>
> > You may want to elaborate on the ncr53c8xx problems (I maintain this
> > driver). More generally, you must not ignore the thousands of bugs in the
> > hardware you are using, but software developers don't have access to all
> > errata descriptions since hardware vendors do not like to make this
> > information freely available.
>
> I've got a quick unrelated question.
>
> Why not change the name (or at least the description) of sym53c8xx to
> include the 53c1010 chips, which this driver seems to work on (and on a
> SMP box, no less)?

I have an SMP box and a C1010 chip. I like the 'way' this hardware
combination 'seems' to work for me. :-)

About the driver name, I have changed it to just 'sym' for the port to
FreeBSD. Software modules are usually named using a short name under this
O/S. At the time I did the port, LSI had bought SYMBIOS, but they seemed
to want to keep with SYMBIOS name for the 53C8XX PCI-SCSI family.

Then they (LSI) invented the 53C1010, called it LSI53C1010 and designed it
in a way that made it possible to use a single driver for the SYM53C8xx
series (NCR+SYMBIOS) and this one.

The current LSI53C10xx family consists of 2 different architectures that
require 2 different software drivers:

- [PCI host interface + SCSI Ultra-160 interface + SCSI scripts
  processor] that are supported by the sym53c8xx driver, the
  derived 'sym' under FreeBSD and the latest SYM-2 (derived from sym)
  that wants to be portable.

- [PCI/X host interface + SCSI Ultra-320 interface + ARM based IO
  processor] that requires a new driver. I heard that LSILOGIC wants to
  provide a driver for Linux. Note that the LSI FC controllers and those
  ones should possibly share the same software drivers.

Then, what better name for the sym53c8xx driver?

- sym has been obsoleted by lsi.
- sym53c8,10xx is confusing, given the 10xx family weirdness (see above).
- lsi is too vague a name, given the numerous chips supplied by LSI.
- siop (SCSI IO PROCESSOR) is already used and looks to me more vague
  than all other names that have ever been used for an HBA driver. :)

To my taste, sym53c8xx is still quite a good name for the following reasons:

1) It is the SYMBIOS company that made the greatest development for the
   53C8XX family, in my opinion.
2) The 53C10xx chips that can be driven by the sym* drivers use a 53C1010
   core, even if they may be named 53C1000, etc., for marketing reasons.
3) All future 53C10xx HBAs will probably be based on PCI/X + U320 + ARM
   and so will not be supported by sym* (same for SIOP, btw).

Now, we could change everything that wants to describe this driver.
Here is my current one (from sym-2 that also supports NCR53C8xx chips):

 * Device driver for the SYMBIOS/LSILOGIC 53C8XX and 53C1010 family
 * of PCI-SCSI IO processors.

Just my 0.02 euro. :)

  Gérard.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13  7:43 PinkFreud
                   ` (5 preceding siblings ...)
  2001-08-13 13:46 ` hugang
@ 2001-08-13 13:55 ` Anton Altaparmakov
  2001-08-13 17:16 ` Stephen Satchell
  7 siblings, 0 replies; 51+ messages in thread
From: Anton Altaparmakov @ 2001-08-13 13:55 UTC (permalink / raw)
  To: Alan Cox; +Cc: PinkFreud, linux-kernel

At 14:11 13/08/01, Alan Cox wrote:
> > Should development continue on the latest and supposedly greatest
> > drivers?  Or should the existing bugs be fixed first?  I've got at least
> > three up there that need taking care of, and I'm sure others on this list
> > have found more.  3 separate crashes on 3 separate installs on 3 separate
> > boxes - that's 100% failure rate.  If I get 100% failure on my installs,
> > what are others seeing?
>
>Near enough 0%. But then I try and avoid buying broken chipsets.

0% here. Admittedly I have only one VIA box and only as of this weekend but 
it works great with all kernels tried. [Epox 8KTA3 mobo, KT133A chipset 
running 266MHz FSB and 1.33GHz Athlon Thunderbird, PC133 CL2 RAM from 
Micron, active PFC power supply, BIOS setup to the fastest settings it 
allows me to set it to while staying in spec (i.e. no o/c of CPU or
busses!), IBM Ericson ATA100 41GiB HD attached to the VIA controller].

Tried kernels are 2.4.4-4GB (from SuSE 7.1 binary install Pentium 
optimizations I believe), 2.4.8 and 2.4.8-ac1 (the latter two both compiled 
with Athlon optimizations) and the system is absolutely fine. bonnie 
scores >30MiB/s (mostly 34-39 MiB/s) on intelligent read/write tests
with DMA enabled and a working set 2x size of RAM (i.e. 512MiB test size on 
256MiB RAM). Even when using fsync after every operation i/o speed is 
almost unaffected (it drops a bit but only by about 1-2MiB/s).

Copying 15 GiB of data from one partition to the other on the same disk 
worked fine. Compiling kernels works fine (Admittedly it only takes just 
over 3 minutes to compile the kernel with make -j 2 bzImage, so it's not 
too much of a stress test).

Oh and the VIA AC97 audio codec seems to work beautifully, too. X
(both 3.3.x and 4.0.3) is fine, too. (I use an ancient ET6000 PCI gfx card.)

So, basically no problems here. I was quite worried about buying a VIA 
chipset but now it seems like a great buy. (-:

The only thing that's slightly annoying is that during boot three of the 
PCI resources from the VIA chipset are reported as "unknown, treating
transparently" (or some similar msg), don't have box handy to say what
they were exactly... if anyone is interested in exact messages I can provide
dmesg + lspci -vvv output once I get home tonight.

> > I like Linux.  I'd like to stick with it.  But if it's going to
> > continually crash, I'm going to jump ship - and I'll start recommending to
>
>If you want maximum stability you want to be running 2.2 or even 2.0. Newer
>less tested code is always less stable. 2.4 won't be as stable as 2.2 for a
>year yet.

Or alternatively buy quality components that other people have tested under 
Linux with kernel 2.4...

Anton


-- 
   "Nothing succeeds like success." - Alexandre Dumas
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13  7:43 PinkFreud
                   ` (4 preceding siblings ...)
  2001-08-13 13:11 ` Alan Cox
@ 2001-08-13 13:46 ` hugang
  2001-08-13 13:55 ` Anton Altaparmakov
  2001-08-13 17:16 ` Stephen Satchell
  7 siblings, 0 replies; 51+ messages in thread
From: hugang @ 2001-08-13 13:46 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel

On Mon, 13 Aug 2001 03:43:05 -0400 (EDT)
PinkFreud <pf-kernel@mirkwood.net> wrote:

> 
> I have installed various 2.4.x kernels on 3 separate systems here.  *ALL*
> of them have suffered from one malady or another - from the dual PIII with
> the VIA chipset and Matrox G400 card, which locks up nicely when I switch
> from X to a text console and back to X (but NOT under a uniprocessor
> kernel!), to the system with the NCR 53c810 SCSI board, which suffered
> random kernel panics anywhere from 2 hours to 5 days after booting, due to
> the ncr53c8xx driver, to YET ANOTHER system which has shown a penchant for
> crashing (read: no response on console, can use magic sysrq, but fails to
> emergency sync) when attempting to use 'ls' on a mounted QNX filesystem
> (ls comes up fine, then system crashes - nothing sent to syslog, no errors
> on screen, nothing!) - and this latest is with 2.4.8!
> 

> sauron@rivendell:~$ uptime
>  3:17AM  up 12 days, 15:20, 2 users, load averages: 1.48, 0.66, 0.31
> sauron@rivendell:~$ uname -a
> NetBSD rivendell 1.5.1 NetBSD 1.5.1 (RIVENDELL) #0: Tue Jul 31 22:58:54
> EDT 2001     root@rivendell:/usr/src/sys/arch/i386/compile/RIVENDELL i386
> sauron@rivendell:~$ dmesg | grep -i sym
> siop0 at pci0 dev 6 function 0: Symbios Logic 53c810 (fast scsi)
> 
> (The controller is old - it was made by NCR before it became Symbios Logic
> - hence, why I was using the NCR driver for it, rather than the Symbios
> driver, in Linux.)
> 
> Working on 13 days uptime.  That's well over twice the uptime for Linux on
> that box.  That's what happens when the kernel has bugs.
> 
> Take this rant for what you will.  Personally, I switched from Windows to
> Linux 5 years ago for the stability.  If I need to switch OSs again to
> continue to have stability, I will.  Somehow, I suspect, if kernel
> development continues down this path, many others will wind up switching
> to other OSs as well.
> 

	I think this problem can be fixed if you can report the crash message (Oops, ...).
Because NetBSD works fine on it, I trust the Linux kernel can also work fine on it.

-- 
Best regards!
----------------------------------------------------
hugang : 胡刚 	GNU/Linux User
email  : gang_hu@soul.com.cn linuxbest@soul.com.cn
Tel    : +861068425741/2/3/4
Web    : http://www.soul.com.cn

	Beijing Soul technology Co.Ltd.
	   北京众志和达科技有限公司
----------------------------------------------------

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13  7:43 PinkFreud
                   ` (3 preceding siblings ...)
  2001-08-13 10:09 ` Chris Wilson
@ 2001-08-13 13:11 ` Alan Cox
  2001-08-14 18:51   ` Anders Larsen
  2001-08-13 13:46 ` hugang
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 51+ messages in thread
From: Alan Cox @ 2001-08-13 13:11 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel

> of them have suffered from one malady or another - from the dual PIII with
> the VIA chipset and Matrox G400 card, which locks up nicely when I switch

Welcome to wacky hardware. To get a G400 stable on x86 you need at least

XFree86 4.1 if you are running hardware 3D (and DRM 4.1)
2.4.8 or higher with the VIA fixes
Preferably a very recent BIOS update for the VIA box

Of those only the XFree hardware 3d stuff is software bug related.

> emergency sync) when attempting to use 'ls' on a mounted QNX filesystem
> (ls comes up fine, then system crashes - nothing sent to syslog, no errors
> on screen, nothing!) - and this latest is with 2.4.8!

The qnxfs code is experimental - so I can believe it might fail in 2.4. I'd
be very interested in info on that one.

> Should development continue on the latest and supposedly greatest
> drivers?  Or should the existing bugs be fixed first?  I've got at least
> three up there that need taking care of, and I'm sure others on this list
> have found more.  3 separate crashes on 3 separate installs on 3 separate
> boxes - that's 100% failure rate.  If I get 100% failure on my installs,
> what are others seeing?

Near enough 0%. But then I try and avoid buying broken chipsets.

> I like Linux.  I'd like to stick with it.  But if it's going to
> continually crash, I'm going to jump ship - and I'll start recommending to

If you want maximum stability you want to be running 2.2 or even 2.0. Newer
less tested code is always less stable. 2.4 won't be as stable as 2.2 for a
year yet.

Alan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 10:29   ` Justin Guyett
@ 2001-08-13 12:56     ` Andrzej Krzysztofowicz
  2001-08-13 16:54     ` Gérard Roudier
  1 sibling, 0 replies; 51+ messages in thread
From: Andrzej Krzysztofowicz @ 2001-08-13 12:56 UTC (permalink / raw)
  To: Justin Guyett; +Cc: linux-kernel

"Justin Guyett wrote:"
> Why not change the name (or at least the description) of sym53c8xx to
> include the 53c1010 chips, which this driver seems to work on (and on a
> SMP box, no less)?

For backward compatibility reasons ?
We are in a stable kernels series.

-- 
=======================================================================
  Andrzej M. Krzysztofowicz               ankry@mif.pg.gda.pl
  phone (48)(58) 347 14 61
Faculty of Applied Phys. & Math.,   Technical University of Gdansk

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 10:09 ` Chris Wilson
@ 2001-08-13 11:09   ` szonyi calin
  0 siblings, 0 replies; 51+ messages in thread
From: szonyi calin @ 2001-08-13 11:09 UTC (permalink / raw)
  To: Chris Wilson; +Cc: linux-kernel


--- Chris Wilson <jakdaw@lists.jakdaw.org> wrote:

> Certainly seems to be moving backwards in that respect - from 2.4.6
> onwards the bog standard PS/2 keyboard does not work (at all from bootup -
> "keyboard: Timeout - AT keyboard not present?(ed)") on my SMP VIA w. G400
> (although I've not got as far as loading X on it without the keyboard!) :(
> 
> I can't even see what's changed in 2.4.6 that might cause this - soft_irq?
> or is that completely unrelated to the keyboard???
> 
> 
> Chris
> -

I too have a PS/2 keyboard and no problem at all with
2.4.x kernels, 2.4.0-2.4.8.
Cx486 UP 



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13 10:03 ` Gérard Roudier
@ 2001-08-13 10:29   ` Justin Guyett
  2001-08-13 12:56     ` Andrzej Krzysztofowicz
  2001-08-13 16:54     ` Gérard Roudier
  0 siblings, 2 replies; 51+ messages in thread
From: Justin Guyett @ 2001-08-13 10:29 UTC (permalink / raw)
  To: Gérard Roudier; +Cc: linux-kernel

On Mon, 13 Aug 2001, Gérard Roudier wrote:

> You may want to elaborate on the ncr53c8xx problems (I maintain this
> driver). More generally, you must not ignore the thousands of bugs in the
> hardware you are using, but software developers don't have access to all
> errata descriptions since hardware vendors do not like to make this
> information freely available.

I've got a quick unrelated question.

Why not change the name (or at least the description) of sym53c8xx to
include the 53c1010 chips, which this driver seems to work on (and on a
SMP box, no less)?


justin


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13  7:43 PinkFreud
                   ` (2 preceding siblings ...)
  2001-08-13 10:03 ` Gérard Roudier
@ 2001-08-13 10:09 ` Chris Wilson
  2001-08-13 11:09   ` szonyi calin
  2001-08-13 13:11 ` Alan Cox
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2001-08-13 10:09 UTC (permalink / raw)
  To: linux-kernel



> > until 2.2.10!).  Furthermore, I have had a HELL of a time trying
> > to get responses to the first two problems (this is the first report for
> > the third).  It used to be that I could ask a question on this list, and
> > receive responses.  Not anymore.  I can't seem to get the time of day from
> > anyone on this list now.
> 
> Try and send specific bug-reports to the maintainers. 
> l-k archives may give you some light on issues with VIA chipsets.
> 
> I'm not convinced that gaining stability on a VIA + G400 + X + smp 
> combo is an easy task anyway.

Certainly seems to be moving backwards in that respect - from 2.4.6
onwards the bog standard PS/2 keyboard does not work (at all from bootup -
"keyboard: Timeout - AT keyboard not present?(ed)") on my SMP VIA w. G400
(although I've not got as far as loading X on it without the keyboard!) :(

I can't even see what's changed in 2.4.6 that might cause this - soft_irq?
or is that completely unrelated to the keyboard???


Chris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13  7:43 PinkFreud
  2001-08-13  8:52 ` Brian
  2001-08-13  8:55 ` Francois Romieu
@ 2001-08-13 10:03 ` Gérard Roudier
  2001-08-13 10:29   ` Justin Guyett
  2001-08-13 10:09 ` Chris Wilson
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 51+ messages in thread
From: Gérard Roudier @ 2001-08-13 10:03 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel



On Mon, 13 Aug 2001, PinkFreud wrote:

> Please CC me in any replies, I am not subscribed to this list.
>
> Please forgive me if I seem incoherent.  It's after 3:30 AM here.

So, you will be forgiven, otherwise ... :-)

> I have installed various 2.4.x kernels on 3 separate systems here.  *ALL*
> of them have suffered from one malady or another - from the dual PIII with
> the VIA chipset and Matrox G400 card, which locks up nicely when I switch
> from X to a text console and back to X (but NOT under a uniprocessor
> kernel!), to the system with the NCR 53c810 SCSI board, which suffered
> random kernel panics anywhere from 2 hours to 5 days after booting, due to
> the ncr53c8xx driver, to YET ANOTHER system which has shown a penchant for
> crashing (read: no response on console, can use magic sysrq, but fails to
> emergency sync) when attempting to use 'ls' on a mounted QNX filesystem
> (ls comes up fine, then system crashes - nothing sent to syslog, no errors
> on screen, nothing!) - and this latest is with 2.4.8!

You may want to elaborate on the ncr53c8xx problems (I maintain this
driver). More generally, you must not ignore the thousands of bugs in the
hardware you are using, but software developers don't have access to all
errata descriptions since hardware vendors do not like to make this
information freely available.

About ncr53c8xx problem reports, I cannot reply to all of them. You may
also send them to LSILOGIC support. They also want Linux to work with the
ncr/sym/lsi/53c8xx PCI-SCSI controllers, even with old NCR ones. Some
other vendors seem to just ignore old hardware. For example NVIDIA, which
killed (bought?) 3DFX, does not seem interested in maintaining drivers for
the 3DFX graphics chips.

> I've used Linux for over 5 years now.  In all the time I've used it, I
> have never seen this much instability in a single kernel
> series - though I've noticed each successive 'stable' series having
> more bugs than the last (2.2.x crashed once a week with SMP
> until 2.2.10!).  Furthermore, I have had a HELL of a time trying
> to get responses to the first two problems (this is the first report for
> the third).  It used to be that I could ask a question on this list, and
> receive responses.  Not anymore.  I can't seem to get the time of day from
> anyone on this list now.

I have used Linux since some 0.99.x (it was the Yggdrasil distribution). My
experience has been that 1.2.13, 2.0.27 and 2.2.13 worked reliably enough for me.
'Stable' does not mean reliable for any workload. It means that we stop
developing (implies changing large portions of code or modifying
interfaces) but only focus on fixing the software with its current design
(implies only changing what is proven to be broken).  This applies to all
software, not only to Linux. As a result, early stable releases still
have numerous bugs that may prevent numerous systems from working
reliably. It is up to the user to check releases and switch to the one that
fits his expectations.

> This brings me to the subject of this rant: are we going too fast?  New
> drivers are still showing up in each successive kernel, and yet no one
> seems to be able to fix the old bugs that already exist.  Are we looking
> to have the reliability of Windows?  It's starting to seem so - each
> successive kernel series just seems to crash more and more often.  When
> will we reach the point where Windows, on the average, will have greater
> uptime than Linux systems?  Perhaps it's time to slow down, and do some
> debugging.

The reliability of Windows seems to be just fine for most users since it is
the O/S most of them want to use. :-)
I use Win98/SE, Win/NT 4.0, Linux-2.2.19, FreeBSD-4.2, and sometimes
NetBSD-1.5 on my personal machine. They all work reliably enough for my
personal usage. But, indeed, this is a workstation and not a server, even if
I do sometimes stress the system under Linux and *BSDs.

> This is supposed to be a 'stable' kernel series?  I see nothing stable
> about it.
>
> Should development continue on the latest and supposedly greatest
> drivers?  Or should the existing bugs be fixed first?  I've got at least
> three up there that need taking care of, and I'm sure others on this list
> have found more.  3 separate crashes on 3 separate installs on 3 separate
> boxes - that's 100% failure rate.  If I get 100% failure on my installs,
> what are others seeing?

Hopefully you aren't a typical computer user or you just have bad luck
with computers. :-)

> To those of you who would tell me to fix them myself: I am an
> administrator.  I know Perl.  I am not all that familiar with C, nor with
> kernel programming.  They're not my bugs, but I would fix them if I were
> able to.  I'd hope the authors of said bugs would be willing to fix them -
> but given the track record I've seen for the first two problems, I'm not
> holding my breath for the third to be fixed any time soon.

All software developers and maintainers want their software to work and
thus bugs to be fixed. It is just sometimes hard to know what is
actually broken. My experience is that no more than 10% of bug reports about
a piece of software are due to a bug in the software that is pointed out by
the report. And for these less than 10% relevant reports, maintainers must
find what is broken... not as simple as you might imagine...

> I don't know about the rest of you, but I'm going to give up soon and
> switch to NetBSD.  I've already done it on the system with the NCR 53c810
> board - and it's proven to be far more stable than 2.4.x kernels have ever
> managed to be on it.  What does that say?

You can break any O/S given an appropriate workload. Add to this all the
hidden/unknown hardware bugs that can be randomly triggered ...

Btw, I use SYM-2 driver under Linux, FreeBSD and NetBSD 1.5. I have no
problem with it. If you plan to use Ultra-160 LSI53C1010 chips, the NetBSD
SIOP driver may be sub-optimal and, btw, it does not seem to know about
the C1010 chip errata.

You do not seem to have given FreeBSD a try. Were there some strong
reasons for that?

> sauron@rivendell:~$ uptime
>  3:17AM  up 12 days, 15:20, 2 users, load averages: 1.48, 0.66, 0.31
> sauron@rivendell:~$ uname -a
> NetBSD rivendell 1.5.1 NetBSD 1.5.1 (RIVENDELL) #0: Tue Jul 31 22:58:54
> EDT 2001     root@rivendell:/usr/src/sys/arch/i386/compile/RIVENDELL i386
> sauron@rivendell:~$ dmesg | grep -i sym
> siop0 at pci0 dev 6 function 0: Symbios Logic 53c810 (fast scsi)
>
> (The controller is old - it was made by NCR before it became Symbios Logic
> - hence, why I was using the NCR driver for it, rather than the Symbios
> driver, in Linux.)
>
> Working on 13 days uptime.  That's well over twice the uptime for Linux on
> that box.  That's what happens when the kernel has bugs.

You seem so sure it is the ncr53c8xx driver that breaks your Linux ...
If it were so broken, maybe I would have heard about it. :-)

> Take this rant for what you will.  Personally, I switched from Windows to
> Linux 5 years ago for the stability.  If I need to switch OSs again to
> continue to have stability, I will.  Somehow, I suspect, if kernel
> development continues down this path, many others will wind up switching
> to other OSs as well.

If NetBSD fits your need, then let me encourage you to use it.

> I like Linux.  I'd like to stick with it.  But if it's going to
> continually crash, I'm going to jump ship - and I'll start recommending to
> others that they do the same.

That's an unwise recommendation, in my opinion.
For example, my children are happy using Windows 98 and I do not want to
recommend anything else to them.

> 	Mike Edwards
>
> Brainbench certified Master Linux Administrator
> http://www.brainbench.com/transcript.jsp?pid=158188
> -----------------------------------
> Unsolicited advertisements to this address are not welcome.

Regards,
  Gérard.



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13  7:43 PinkFreud
  2001-08-13  8:52 ` Brian
@ 2001-08-13  8:55 ` Francois Romieu
  2001-08-14  4:21   ` Pete Toscano
  2001-08-13 10:03 ` Gérard Roudier
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 51+ messages in thread
From: Francois Romieu @ 2001-08-13  8:55 UTC (permalink / raw)
  To: PinkFreud; +Cc: linux-kernel

PinkFreud <pf-kernel@mirkwood.net> :
[...]
> kernel!), to the system with the NCR 53c810 SCSI board, which suffered
> random kernel panics anywhere from 2 hours to 5 days after booting, due to
> the ncr53c8xx driver, to YET ANOTHER system which has shown a penchant for

The (ksymoops-processed) oopses may help. You could give the
sym53c8xx driver a try. It performs well here:
- 53c875 adapter + BX + 2.4.3/2.4.7-ac11/2.4.8 + raid1 (small server)
- 53c810 + VP3 + 2.4.2 (instant reboot at startup with 2.4.8, I guess I
fscked some option).

[...]
> until 2.2.10!).  Furthermore, I have had a HELL of a time trying
> to get responses to the first two problems (this is the first report for
> the third).  It used to be that I could ask a question on this list, and
> receive responses.  Not anymore.  I can't seem to get the time of day from
> anyone on this list now.

Try sending specific bug reports to the maintainers.
The l-k archives may shed some light on issues with VIA chipsets.

I'm not convinced that gaining stability on a VIA + G400 + X + smp 
combo is an easy task anyway.

-- 
Ueimor

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Are we going too fast?
  2001-08-13  7:43 PinkFreud
@ 2001-08-13  8:52 ` Brian
  2001-08-13  8:55 ` Francois Romieu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Brian @ 2001-08-13  8:52 UTC (permalink / raw)
  To: PinkFreud, linux-kernel

4:30am here.  I'm sure it won't seem quite as diplomatic in the morning.

Back on 2.2, up until ~.12 or so, I could hang an Intel NIC in a day and a 
half.  A 3Com or a tulip held out for three.  Considering I run web 
servers and kinda needed network access, that was a problem.  I actually 
had a burn of FreeBSD sitting on my desk when the bugs finally got squashed.

Around that time, I was in IRC with a couple of *BSDers.  One commented 
that Linux's eagerness to support anything and everything is exactly what 
made Windows the Swiss cheese we know today.  I laughed it off at the 
time, but it rang brutally true this evening.

I rolled our 2.4.7-based web server back to 2.2.19 tonight.  As if I 
needed further convincing, 2.4.7 apparently panicked on its way out (I was 
logged in from remote, but the symptoms are all too familiar).  That 
leaves our squid servers as the only 2.4-based servers left; they all run 
test12 since the 'stable' 2.4s didn't do very well at the time (of course, 
they've been up for 67-140 days, so that may not be the case anymore).

This is about what my situation was when 2.3 branched off.  I can't say I 
care for it, but at least there's precedent that the wrinkles may iron out.
	-- Brian

On Monday 13 August 2001 03:43 am, PinkFreud wrote:
> Please CC me in any replies, I am not subscribed to this list.
>
> Please forgive me if I seem incoherent.  It's after 3:30 AM here.
>
>
> I have installed various 2.4.x kernels on 3 separate systems here.
> *ALL* of them have suffered from one malady or another - from the dual
> PIII with the VIA chipset and Matrox G400 card, which locks up nicely
> when I switch from X to a text console and back to X (but NOT under a
> uniprocessor kernel!), to the system with the NCR 53c810 SCSI board,
> which suffered random kernel panics anywhere from 2 hours to 5 days
> after booting, due to the ncr53c8xx driver, to YET ANOTHER system which
> has shown a penchant for crashing (read: no response on console, can use
> magic sysrq, but fails to emergency sync) when attempting to use 'ls' on
> a mounted QNX filesystem (ls comes up fine, then system crashes -
> nothing sent to syslog, no errors on screen, nothing!) - and this latest
> is with 2.4.8!
>
> I've used Linux for over 5 years now.  In all the time I've used it, I
> have never seen this much instability in a single kernel
> series - though I've noticed each successive 'stable' series having
> more bugs than the last (2.2.x crashed once a week with SMP
> until 2.2.10!).  Furthermore, I have had a HELL of a time trying
> to get responses to the first two problems (this is the first report for
> the third).  It used to be that I could ask a question on this list, and
> receive responses.  Not anymore.  I can't seem to get the time of day
> from anyone on this list now.
>
> This brings me to the subject of this rant: are we going too fast?  New
> drivers are still showing up in each successive kernel, and yet no one
> seems to be able to fix the old bugs that already exist.  Are we looking
> to have the reliability of Windows?  It's starting to seem so - each
> successive kernel series just seems to crash more and more often.  When
> will we reach the point where Windows, on the average, will have greater
> uptime than Linux systems?  Perhaps it's time to slow down, and do some
> debugging.
>
> This is supposed to be a 'stable' kernel series?  I see nothing stable
> about it.
>
> Should development continue on the latest and supposedly greatest
> drivers?  Or should the existing bugs be fixed first?  I've got at least
> three up there that need taking care of, and I'm sure others on this
> list have found more.  3 separate crashes on 3 separate installs on 3
> separate boxes - that's 100% failure rate.  If I get 100% failure on my
> installs, what are others seeing?
>
> To those of you who would tell me to fix them myself: I am an
> administrator.  I know Perl.  I am not all that familiar with C, nor
> with kernel programming.  They're not my bugs, but I would fix them if I
> were able to.  I'd hope the authors of said bugs would be willing to fix
> them - but given the track record I've seen for the first two problems,
> I'm not holding my breath for the third to be fixed any time soon.
>
> I don't know about the rest of you, but I'm going to give up soon and
> switch to NetBSD.  I've already done it on the system with the NCR
> 53c810 board - and it's proven to be far more stable than 2.4.x kernels
> have ever managed to be on it.  What does that say?
>
> sauron@rivendell:~$ uptime
>  3:17AM  up 12 days, 15:20, 2 users, load averages: 1.48, 0.66, 0.31
> sauron@rivendell:~$ uname -a
> NetBSD rivendell 1.5.1 NetBSD 1.5.1 (RIVENDELL) #0: Tue Jul 31 22:58:54
> EDT 2001     root@rivendell:/usr/src/sys/arch/i386/compile/RIVENDELL i386
> sauron@rivendell:~$ dmesg | grep -i sym
> siop0 at pci0 dev 6 function 0: Symbios Logic 53c810 (fast scsi)
>
> (The controller is old - it was made by NCR before it became Symbios
> Logic - hence, why I was using the NCR driver for it, rather than the
> Symbios driver, in Linux.)
>
> Working on 13 days uptime.  That's well over twice the uptime for Linux
> on that box.  That's what happens when the kernel has bugs.
>
> Take this rant for what you will.  Personally, I switched from Windows
> to Linux 5 years ago for the stability.  If I need to switch OSs again
> to continue to have stability, I will.  Somehow, I suspect, if kernel
> development continues down this path, many others will wind up switching
> to other OSs as well.
>
> I like Linux.  I'd like to stick with it.  But if it's going to
> continually crash, I'm going to jump ship - and I'll start recommending
> to others that they do the same.
>
>
> 	Mike Edwards
>
> Brainbench certified Master Linux Administrator
> http://www.brainbench.com/transcript.jsp?pid=158188
> -----------------------------------
> Unsolicited advertisements to this address are not welcome.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Are we going too fast?
@ 2001-08-13  7:43 PinkFreud
  2001-08-13  8:52 ` Brian
                   ` (7 more replies)
  0 siblings, 8 replies; 51+ messages in thread
From: PinkFreud @ 2001-08-13  7:43 UTC (permalink / raw)
  To: linux-kernel

Please CC me in any replies, I am not subscribed to this list.

Please forgive me if I seem incoherent.  It's after 3:30 AM here.


I have installed various 2.4.x kernels on 3 separate systems here.  *ALL*
of them have suffered from one malady or another - from the dual PIII with
the VIA chipset and Matrox G400 card, which locks up nicely when I switch
from X to a text console and back to X (but NOT under a uniprocessor
kernel!), to the system with the NCR 53c810 SCSI board, which suffered
random kernel panics anywhere from 2 hours to 5 days after booting, due to
the ncr53c8xx driver, to YET ANOTHER system which has shown a penchant for
crashing (read: no response on console, can use magic sysrq, but fails to
emergency sync) when attempting to use 'ls' on a mounted QNX filesystem
(ls comes up fine, then system crashes - nothing sent to syslog, no errors
on screen, nothing!) - and this latest is with 2.4.8!

I've used Linux for over 5 years now.  In all the time I've used it, I
have never seen this much instability in a single kernel
series - though I've noticed each successive 'stable' series having
more bugs than the last (2.2.x crashed once a week with SMP 
until 2.2.10!).  Furthermore, I have had a HELL of a time trying
to get responses to the first two problems (this is the first report for
the third).  It used to be that I could ask a question on this list, and
receive responses.  Not anymore.  I can't seem to get the time of day from
anyone on this list now.

This brings me to the subject of this rant: are we going too fast?  New
drivers are still showing up in each successive kernel, and yet no one
seems to be able to fix the old bugs that already exist.  Are we looking
to have the reliability of Windows?  It's starting to seem so - each
successive kernel series just seems to crash more and more often.  When
will we reach the point where Windows, on the average, will have greater
uptime than Linux systems?  Perhaps it's time to slow down, and do some
debugging.

This is supposed to be a 'stable' kernel series?  I see nothing stable
about it.

Should development continue on the latest and supposedly greatest
drivers?  Or should the existing bugs be fixed first?  I've got at least
three up there that need taking care of, and I'm sure others on this list
have found more.  3 separate crashes on 3 separate installs on 3 separate
boxes - that's 100% failure rate.  If I get 100% failure on my installs,
what are others seeing?

To those of you who would tell me to fix them myself: I am an
administrator.  I know Perl.  I am not all that familiar with C, nor with
kernel programming.  They're not my bugs, but I would fix them if I were
able to.  I'd hope the authors of said bugs would be willing to fix them -
but given the track record I've seen for the first two problems, I'm not
holding my breath for the third to be fixed any time soon.

I don't know about the rest of you, but I'm going to give up soon and
switch to NetBSD.  I've already done it on the system with the NCR 53c810
board - and it's proven to be far more stable than 2.4.x kernels have ever
managed to be on it.  What does that say?

sauron@rivendell:~$ uptime
 3:17AM  up 12 days, 15:20, 2 users, load averages: 1.48, 0.66, 0.31
sauron@rivendell:~$ uname -a
NetBSD rivendell 1.5.1 NetBSD 1.5.1 (RIVENDELL) #0: Tue Jul 31 22:58:54
EDT 2001     root@rivendell:/usr/src/sys/arch/i386/compile/RIVENDELL i386
sauron@rivendell:~$ dmesg | grep -i sym
siop0 at pci0 dev 6 function 0: Symbios Logic 53c810 (fast scsi)

(The controller is old - it was made by NCR before it became Symbios Logic
- hence, why I was using the NCR driver for it, rather than the Symbios
driver, in Linux.)

Working on 13 days uptime.  That's well over twice the uptime for Linux on
that box.  That's what happens when the kernel has bugs.

Take this rant for what you will.  Personally, I switched from Windows to
Linux 5 years ago for the stability.  If I need to switch OSs again to
continue to have stability, I will.  Somehow, I suspect, if kernel
development continues down this path, many others will wind up switching
to other OSs as well.

I like Linux.  I'd like to stick with it.  But if it's going to
continually crash, I'm going to jump ship - and I'll start recommending to
others that they do the same.


	Mike Edwards

Brainbench certified Master Linux Administrator
http://www.brainbench.com/transcript.jsp?pid=158188
-----------------------------------
Unsolicited advertisements to this address are not welcome.


^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2001-08-16 21:42 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-08-15 20:13 Are we going too fast? Roy Murphy
  -- strict thread matches above, loose matches on Subject: below --
2001-08-16 21:42 PinkFreud
2001-08-14 20:20 Per Jessen
2001-08-14 20:07 Per Jessen
2001-08-14 19:47 Per Jessen
2001-08-14 16:32 PinkFreud
2001-08-14 16:25 PinkFreud
2001-08-13 21:44 PinkFreud
2001-08-14  0:04 ` PinkFreud
2001-08-14  7:24   ` Francois Romieu
2001-08-15 23:24   ` Dr. Kelsey Hudson
2001-08-13 21:36 PinkFreud
2001-08-14  7:57 ` Helge Hafting
2001-08-13 21:07 PinkFreud
2001-08-13 21:20 ` Alan Cox
2001-08-13 21:41   ` Rogério Brito
2001-08-14  0:56   ` Ben Ford
2001-08-14  7:34   ` Peter Wächtler
2001-08-14  2:24 ` David Ford
2001-08-14  4:19   ` Nicholas Knight
2001-08-14 12:49     ` Alan Cox
2001-08-14 22:27       ` Paul G. Allen
     [not found] <no.id>
2001-08-13 20:24 ` Alan Cox
2001-08-13 21:06   ` Anthony Barbachan
2001-08-14 20:47 ` Alan Cox
2001-08-15  0:07   ` PinkFreud
     [not found] <fa.l9dq0tv.7gqnhh@ifi.uio.no>
     [not found] ` <fa.g70as7v.1722ipv@ifi.uio.no>
2001-08-13 19:14   ` John Weber
2001-08-13 18:53 Petr Vandrovec
2001-08-13 18:46 Per Jessen
2001-08-14 13:58 ` Andrew Scott
2001-08-14 19:54 ` David Ford
2001-08-13 17:53 PinkFreud
2001-08-13 20:27 ` Gérard Roudier
2001-08-13  7:43 PinkFreud
2001-08-13  8:52 ` Brian
2001-08-13  8:55 ` Francois Romieu
2001-08-14  4:21   ` Pete Toscano
2001-08-14 12:48     ` Alan Cox
2001-08-14 22:30       ` Paul G. Allen
2001-08-13 10:03 ` Gérard Roudier
2001-08-13 10:29   ` Justin Guyett
2001-08-13 12:56     ` Andrzej Krzysztofowicz
2001-08-13 16:54     ` Gérard Roudier
2001-08-13 10:09 ` Chris Wilson
2001-08-13 11:09   ` szonyi calin
2001-08-13 13:11 ` Alan Cox
2001-08-14 18:51   ` Anders Larsen
2001-08-14 20:29     ` Anders Larsen
2001-08-13 13:46 ` hugang
2001-08-13 13:55 ` Anton Altaparmakov
2001-08-13 17:16 ` Stephen Satchell
