linux-kernel.vger.kernel.org archive mirror
* Linux 2.4.21-rc7
@ 2003-06-03 17:04 Marcelo Tosatti
  2003-06-03 18:02 ` Tomas Szepe
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Marcelo Tosatti @ 2003-06-03 17:04 UTC
  To: lkml


Hallo,

Now I really hope it's the last one, all these rc's are making me mad.

Ok, here it is.


Summary of changes from v2.4.21-rc6 to v2.4.21-rc7
============================================

<ehabkost@conectiva.com.br>:
  o [SPARC]: Export phys_base on sparc32

<jgarzik@pobox.com>:
  o fix olympic driver build

<lethal@linux-sh.org>:
  o Fix Solution Engine 7751 Build
  o Define VM_DATA_DEFAULT_FLAGS for SH

<wesolows@foobazco.org>:
  o [sparc]: Attempt mul/div emulation handling on all cpus

David S. Miller <davem@nuts.ninka.net>:
  o [SPARC]: Fix sys_ipc to return ENOSYS instead of EINVAL as appropriate
  o [SPARC64]: Implement dump_stack in 2.4.x
  o [SPARC64]: Only use power interrupt when button property exists
  o [IPV4/IPV6]: Use Jenkins hash for fragment reassembly handling
  o [IPV6]: Input full addresses into TCP_SYNQ hash function
  o [IPV4]: Add sysctl to control ipfrag_secret_interval
  o [SPARC64]: Fix probe error handling in envctrl.c driver
  o [SPARC64]: Fix probe error handling in bbc_{envctrl,i2c}.c driver
  o [SPARC64]: Fix exploitable holes and bugs in ioctl32 translations

Douglas Gilbert <dougg@torque.net>:
  o sg: Fix side effect introduced by last "off by one" fix

Eric Brower <ebrower@usa.net>:
  o [SPARC]: Refactor AUXIO support

Marcelo Tosatti <marcelo@freak.distro.conectiva>:
  o Changed EXTRAVERSION to -rc7

Pete Zaitcev <zaitcev@redhat.com>:
  o [sparc] Force type in __put_user
  o [SPARC]: Fix gcc-3.x builds

Rob Radez <rob@osinvestor.com>:
  o [sparc]: Fix uninitialized spinlock in SRMMU code
  o [SPARC]: Kill initialize_secondary, unused


* Re: Linux 2.4.21-rc7
  2003-06-03 17:04 Linux 2.4.21-rc7 Marcelo Tosatti
@ 2003-06-03 18:02 ` Tomas Szepe
  2003-06-03 18:07   ` Marcelo Tosatti
  2003-06-03 18:30 ` Alex Romosan
  2003-06-05 12:09 ` Andreas Haumer
  2 siblings, 1 reply; 27+ messages in thread
From: Tomas Szepe @ 2003-06-03 18:02 UTC
  To: Marcelo Tosatti; +Cc: lkml, alan

> [marcelo@conectiva.com.br]
> 
> Now I really hope it's the last one, all these rc's are making me mad.

Are you quite sure you don't want Alan to get you the updates necessary
for IDE to build as modules for .21 final?

-- 
Tomas Szepe <szepe@pinerecords.com>

* Re: Linux 2.4.21-rc7
  2003-06-03 18:02 ` Tomas Szepe
@ 2003-06-03 18:07   ` Marcelo Tosatti
  2003-06-03 19:15     ` lk
  0 siblings, 1 reply; 27+ messages in thread
From: Marcelo Tosatti @ 2003-06-03 18:07 UTC
  To: Tomas Szepe; +Cc: lkml, alan



On Tue, 3 Jun 2003, Tomas Szepe wrote:

> > [marcelo@conectiva.com.br]
> >
> > Now I really hope it's the last one, all these rc's are making me mad.
>
> Are you quite sure you don't want Alan to get you the updates necessary
> for IDE to build as modules for .21 final?

Well, I can for sure release -rc8 with that.

I just want this possible -rc8 to be released no later than tonight.

Alan?

* Re: Linux 2.4.21-rc7
  2003-06-03 17:04 Linux 2.4.21-rc7 Marcelo Tosatti
  2003-06-03 18:02 ` Tomas Szepe
@ 2003-06-03 18:30 ` Alex Romosan
  2003-06-03 19:27   ` Jeff Garzik
  2003-06-05 12:09 ` Andreas Haumer
  2 siblings, 1 reply; 27+ messages in thread
From: Alex Romosan @ 2003-06-03 18:30 UTC
  To: Marcelo Tosatti; +Cc: lkml

Marcelo Tosatti <marcelo@conectiva.com.br> writes:

> Now I really hope it's the last one, all these rc's are making me mad.

i still can't get it to compile for sparc32:

gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -m32 -pipe -mno-fpu -fcall-used-g5 -fcall-used-g7   -nostdinc -iwithprefix include -DKBUILD_BASENAME=ksyms  -DEXPORT_SYMTAB -c ksyms.c
/usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_nocheck':
/usr/src/linux/include/asm/checksum.h:59: error: asm-specifier for variable `d' conflicts with asm clobber list
/usr/src/linux/include/asm/checksum.h:59: error: asm-specifier for variable `l' conflicts with asm clobber list
/usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_from_user':
/usr/src/linux/include/asm/checksum.h:81: error: asm-specifier for variable `d' conflicts with asm clobber list
/usr/src/linux/include/asm/checksum.h:81: error: asm-specifier for variable `l' conflicts with asm clobber list
/usr/src/linux/include/asm/checksum.h:81: error: asm-specifier for variable `s' conflicts with asm clobber list
/usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_to_user':
/usr/src/linux/include/asm/checksum.h:108: error: asm-specifier for variable `d' conflicts with asm clobber list
/usr/src/linux/include/asm/checksum.h:108: error: asm-specifier for variable `l' conflicts with asm clobber list
/usr/src/linux/include/asm/checksum.h:108: error: asm-specifier for variable `s' conflicts with asm clobber list
make[3]: *** [ksyms.o] Error 1
make[3]: Leaving directory `/usr/src/linux/kernel'
make[2]: *** [first_rule] Error 2
make[2]: Leaving directory `/usr/src/linux/kernel'
make[1]: *** [_dir_kernel] Error 2
make[1]: Leaving directory `/usr/src/linux'
make: *** [stamp-build] Error 2

not sure when this started. the last kernel i managed to compile was
rc2 (skipped rc3 and rc4, rc5 didn't compile). the last one that will
boot was 2.4.21-pre1. this is on a sun4m Fujitsu TurboSparc.

--alex--

-- 
| I believe the moment is at hand when, by a paranoiac and active |
|  advance of the mind, it will be possible (simultaneously with  |
|  automatism and other passive states) to systematize confusion  |
|  and thus to help to discredit completely the world of reality. |

* Re: Linux 2.4.21-rc7
  2003-06-03 18:07   ` Marcelo Tosatti
@ 2003-06-03 19:15     ` lk
  2003-06-03 19:40       ` Alan Cox
  0 siblings, 1 reply; 27+ messages in thread
From: lk @ 2003-06-03 19:15 UTC
  To: Marcelo Tosatti; +Cc: lkml

> > > Now I really hope it's the last one, all these rc's are making me mad.
> >
> > Are you quite sure you don't want Alan to get you the updates necessary
> > for IDE to build as modules for .21 final?
> 
> Well, I can for sure release -rc8 with that.
> 
> I just want this possible -rc8 to be released no later than tonight.

Unfortunately I just committed my test box to production and can't test 
Alan's SiImage fixes in rc6-ac2, but if they pan out, please try to 
include them in -rc8 as well.



* Re: Linux 2.4.21-rc7
  2003-06-03 18:30 ` Alex Romosan
@ 2003-06-03 19:27   ` Jeff Garzik
  2003-06-03 19:58     ` Alex Romosan
  0 siblings, 1 reply; 27+ messages in thread
From: Jeff Garzik @ 2003-06-03 19:27 UTC
  To: Alex Romosan; +Cc: Marcelo Tosatti, lkml

On Tue, Jun 03, 2003 at 11:30:59AM -0700, Alex Romosan wrote:
> Marcelo Tosatti <marcelo@conectiva.com.br> writes:
> 
> > Now I really hope it's the last one, all these rc's are making me mad.
> 
> i still can't get it to compile for sparc32:
> 
> gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -m32 -pipe -mno-fpu -fcall-used-g5 -fcall-used-g7   -nostdinc -iwithprefix include -DKBUILD_BASENAME=ksyms  -DEXPORT_SYMTAB -c ksyms.c
> /usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_nocheck':
> /usr/src/linux/include/asm/checksum.h:59: error: asm-specifier for variable `d' conflicts with asm clobber list
> /usr/src/linux/include/asm/checksum.h:59: error: asm-specifier for variable `l' conflicts with asm clobber list
> /usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_from_user':

That looks like you either need a different compiler version,
or different binutils version...

	Jeff




* Re: Linux 2.4.21-rc7
  2003-06-03 19:15     ` lk
@ 2003-06-03 19:40       ` Alan Cox
  0 siblings, 0 replies; 27+ messages in thread
From: Alan Cox @ 2003-06-03 19:40 UTC
  To: lk; +Cc: Marcelo Tosatti, lkml

On Tue, 2003-06-03 at 20:15, lk@trolloc.com wrote:
> Unfortunately I just committed my test box to production and can't test 
> Alan's SiImage fixes in rc6-ac2, but if they pan out, please try to 
> include them in -rc8 as well.

You could add the dma autoenable but the rest should be avoided


* Re: Linux 2.4.21-rc7
  2003-06-03 19:27   ` Jeff Garzik
@ 2003-06-03 19:58     ` Alex Romosan
  2003-06-03 20:14       ` Tom Rini
  0 siblings, 1 reply; 27+ messages in thread
From: Alex Romosan @ 2003-06-03 19:58 UTC
  To: Jeff Garzik; +Cc: Marcelo Tosatti, lkml

Jeff Garzik <jgarzik@pobox.com> writes:

> On Tue, Jun 03, 2003 at 11:30:59AM -0700, Alex Romosan wrote:
>> Marcelo Tosatti <marcelo@conectiva.com.br> writes:
>> 
>> > Now I really hope it's the last one, all these rc's are making me mad.
>> 
>> i still can't get it to compile for sparc32:
>> 
>> gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -m32 -pipe -mno-fpu -fcall-used-g5 -fcall-used-g7   -nostdinc -iwithprefix include -DKBUILD_BASENAME=ksyms  -DEXPORT_SYMTAB -c ksyms.c
>> /usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_nocheck':
>> /usr/src/linux/include/asm/checksum.h:59: error: asm-specifier for variable `d' conflicts with asm clobber list
>> /usr/src/linux/include/asm/checksum.h:59: error: asm-specifier for variable `l' conflicts with asm clobber list
>> /usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_from_user':
>
> That looks like you either need a different compiler version,
> or different binutils version...

gcc (GCC) 3.3 (Debian)
GNU ld version 2.14.90.0.4 20030523 Debian GNU/Linux

the same versions work on i386 though...

--alex--

-- 
| I believe the moment is at hand when, by a paranoiac and active |
|  advance of the mind, it will be possible (simultaneously with  |
|  automatism and other passive states) to systematize confusion  |
|  and thus to help to discredit completely the world of reality. |

* Re: Linux 2.4.21-rc7
  2003-06-03 19:58     ` Alex Romosan
@ 2003-06-03 20:14       ` Tom Rini
  2003-06-04  3:35         ` David S. Miller
  0 siblings, 1 reply; 27+ messages in thread
From: Tom Rini @ 2003-06-03 20:14 UTC
  To: Alex Romosan; +Cc: Jeff Garzik, Marcelo Tosatti, lkml

On Tue, Jun 03, 2003 at 12:58:40PM -0700, Alex Romosan wrote:

> Jeff Garzik <jgarzik@pobox.com> writes:
> 
> > On Tue, Jun 03, 2003 at 11:30:59AM -0700, Alex Romosan wrote:
> >> Marcelo Tosatti <marcelo@conectiva.com.br> writes:
> >> 
> >> > Now I really hope it's the last one, all these rc's are making me mad.
> >> 
> >> i still can't get it to compile for sparc32:
> >> 
> >> gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-frame-pointer -m32 -pipe -mno-fpu -fcall-used-g5 -fcall-used-g7   -nostdinc -iwithprefix include -DKBUILD_BASENAME=ksyms  -DEXPORT_SYMTAB -c ksyms.c
> >> /usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_nocheck':
> >> /usr/src/linux/include/asm/checksum.h:59: error: asm-specifier for variable `d' conflicts with asm clobber list
> >> /usr/src/linux/include/asm/checksum.h:59: error: asm-specifier for variable `l' conflicts with asm clobber list
> >> /usr/src/linux/include/asm/checksum.h: In function `csum_partial_copy_from_user':
> >
> > That looks like you either need a different compiler version,
> > or different binutils version...
> 
> gcc (GCC) 3.3 (Debian)
> GNU ld version 2.14.90.0.4 20030523 Debian GNU/Linux

That would do it.

> the same versions work on i386 though...

Yes, but i386 either didn't have the now-invalid clobber lists, or they
were fixed during the -pre series (as they were on PPC32 as well).

-- 
Tom Rini
http://gate.crashing.org/~trini/

* Re: Linux 2.4.21-rc7
  2003-06-03 20:14       ` Tom Rini
@ 2003-06-04  3:35         ` David S. Miller
  2003-06-04 15:09           ` Mr. James W. Laferriere
  2003-06-04 23:37           ` Alex Romosan
  0 siblings, 2 replies; 27+ messages in thread
From: David S. Miller @ 2003-06-04  3:35 UTC
  To: Tom Rini; +Cc: Alex Romosan, Jeff Garzik, Marcelo Tosatti, lkml

On Tue, 2003-06-03 at 13:14, Tom Rini wrote:
> > gcc (GCC) 3.3 (Debian)
> > GNU ld version 2.14.90.0.4 20030523 Debian GNU/Linux
> 
> That would do it.

I don't trust anything past gcc-3.2.x on sparc and sparc64.
Use 3.3.x and later at your own peril.
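
The rule of thumb above could be turned into a quick pre-build check.
What follows is only a sketch and not anything from the kernel tree:
the names parse_gcc_version and is_trusted_for_sparc are made up for
illustration, and the banner format is assumed to look like the
"gcc (GCC) 3.3 (Debian)" line quoted above.

```python
import re


def parse_gcc_version(banner):
    """Return (major, minor) from a banner such as 'gcc (GCC) 3.3 (Debian)'."""
    match = re.search(r"\b(\d+)\.(\d+)(?:\.\d+)?\b", banner)
    if match is None:
        raise ValueError("no version number found in %r" % banner)
    return int(match.group(1)), int(match.group(2))


def is_trusted_for_sparc(banner):
    """True if the compiler is gcc 3.2.x or older (assumed cut-off)."""
    return parse_gcc_version(banner) <= (3, 2)


if __name__ == "__main__":
    for banner in ("gcc (GCC) 3.3 (Debian)", "gcc (GCC) 3.2.3"):
        verdict = "ok" if is_trusted_for_sparc(banner) else "use at your own peril"
        print("%-24s -> %s" % (banner, verdict))
```

The cut-off at 3.2 simply encodes the advice given here; it says
nothing about which later compiler releases might turn out to be fine.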

-- 
David S. Miller <davem@redhat.com>

* Re: Linux 2.4.21-rc7
  2003-06-04  3:35         ` David S. Miller
@ 2003-06-04 15:09           ` Mr. James W. Laferriere
  2003-06-04 23:37           ` Alex Romosan
  1 sibling, 0 replies; 27+ messages in thread
From: Mr. James W. Laferriere @ 2003-06-04 15:09 UTC
  To: David S. Miller
  Cc: Tom Rini, Alex Romosan, Jeff Garzik, Marcelo Tosatti, lkml

	Hello Dave, thank you for the warning. Now how about the why,
	in layman's terms? TIA, JimL

On Tue, 3 Jun 2003, David S. Miller wrote:
> On Tue, 2003-06-03 at 13:14, Tom Rini wrote:
> > > gcc (GCC) 3.3 (Debian)
> > > GNU ld version 2.14.90.0.4 20030523 Debian GNU/Linux
> > That would do it.
> I don't trust anything past gcc-3.2.x on sparc and sparc64.
> Use 3.3.x and later at your own peril.
-- 
       +------------------------------------------------------------------+
       | James   W.   Laferriere | System    Techniques | Give me VMS     |
       | Network        Engineer |     P.O. Box 854     |  Give me Linux  |
       | babydr@baby-dragons.com | Coudersport PA 16915 |   only  on  AXP |
       +------------------------------------------------------------------+

* Re: Linux 2.4.21-rc7
  2003-06-04  3:35         ` David S. Miller
  2003-06-04 15:09           ` Mr. James W. Laferriere
@ 2003-06-04 23:37           ` Alex Romosan
  1 sibling, 0 replies; 27+ messages in thread
From: Alex Romosan @ 2003-06-04 23:37 UTC
  To: David S. Miller; +Cc: Tom Rini, Jeff Garzik, Marcelo Tosatti, lkml

"David S. Miller" <davem@redhat.com> writes:

> On Tue, 2003-06-03 at 13:14, Tom Rini wrote:
>> > gcc (GCC) 3.3 (Debian)
>> > GNU ld version 2.14.90.0.4 20030523 Debian GNU/Linux
>> 
>> That would do it.
>
> I don't trust anything past gcc-3.2.x on sparc and sparc64.
> Use 3.3.x and later at your own peril.

recompiled with gcc-3.2.3 and the kernel not only compiled but also
booted. thank you.

--alex--

-- 
| I believe the moment is at hand when, by a paranoiac and active |
|  advance of the mind, it will be possible (simultaneously with  |
|  automatism and other passive states) to systematize confusion  |
|  and thus to help to discredit completely the world of reality. |

* Re: Linux 2.4.21-rc7
  2003-06-03 17:04 Linux 2.4.21-rc7 Marcelo Tosatti
  2003-06-03 18:02 ` Tomas Szepe
  2003-06-03 18:30 ` Alex Romosan
@ 2003-06-05 12:09 ` Andreas Haumer
  2003-06-07 15:46   ` Andreas Haumer
  2 siblings, 1 reply; 27+ messages in thread
From: Andreas Haumer @ 2003-06-05 12:09 UTC
  To: Marcelo Tosatti; +Cc: lkml

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Marcelo Tosatti wrote:
> Hallo,
>
> Now I really hope it's the last one, all these rc's are making me mad.
>
;-)

So, here's a report on the more positive side...

As I mentioned in some e-mails in the last few days,
I'm currently testing an Asus AP1700-S5 server with
a single Xeon 2.4GHz CPU (FSB533), 512MB RAM and
4x36GB U320SCSI drives (3 of them are assembled as RAID5),
connected via GBit Ethernet to our internal network

root@setup:~ {533} $ lspci
00:00.0 Host bridge: ServerWorks CNB20-HE Host Bridge (rev 31)
00:00.1 Host bridge: ServerWorks CNB20-HE Host Bridge
00:00.2 Host bridge: ServerWorks CNB20-HE Host Bridge
00:02.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet Controller (rev 02)
00:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 ISA bridge: ServerWorks CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 05)
00:0f.3 Host bridge: ServerWorks GCLE Host Bridge
00:10.0 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
00:10.2 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
00:11.0 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
00:11.2 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
02:04.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 (rev 07)
02:04.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 (rev 07)
03:02.0 Ethernet controller: Intel Corp. 82544GC Gigabit Ethernet Controller (LOM) (rev 02)

root@setup:~ {538} $ uptime
  2:05pm  up 18:09, 11 users,  load average: 8.03, 8.45, 8.15

This system has been running 2.4.21-rc7 for more than 18 hours
now with the following load:

*) an endless loop to create and remove a large file on the
   RAID5 (ext3 filesystem):
   while true; do time dd if=/dev/zero of=/var/tmp/largefile bs=1M count=2000 ; rm -f /var/tmp/largefile; done

*) some commands to create additional load:
   cd /
   find . boot/ usr/ tmp/ opt/ var/ -xdev -type f -exec md5sum {} \;

*) NFS copy of a whole 40GB filesystem tree from a Linux NFS server
   to the RAID5 (in a loop)

*) the system is also NFS serving a Linux NFS client, which
   copies the whole server filesystem into /dev/null

*) Additionally, I have the following programs running:
   - Squid (currently used as proxy for our internal web browsers)
   - Apache
   - jedit (with j2sdk-1.4.1_01)
   - StarOffice-5.2
   - Mozilla-1.3.1
   - and lots of additional programs (shell, sshd, emacs), but
     no X server (we are using Linux workstations as X-Terminals)
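
The create-and-remove loop from the first item can be sketched in a
portable form as follows. This is an illustration only, not the script
used for the test: the function name write_and_remove and the target
file name are assumptions, the sizes are deliberately tiny, and unlike
a plain dd run the sketch calls fsync before deleting the file.

```python
import os
import tempfile
import time


def write_and_remove(path, block_size, count):
    """Write `count` zero-filled blocks to `path`, sync, then delete it.

    Returns the elapsed wall-clock time in seconds.
    """
    block = b"\0" * block_size
    start = time.time()
    with open(path, "wb") as handle:
        for _ in range(count):
            handle.write(block)
        handle.flush()
        os.fsync(handle.fileno())  # force the data out to the disk
    os.remove(path)
    return time.time() - start


if __name__ == "__main__":
    # Small sizes keep the sketch harmless; the real test wrote 2000 x 1M.
    target = os.path.join(tempfile.gettempdir(), "largefile.sketch")
    for run in range(3):
        elapsed = write_and_remove(target, block_size=1024 * 1024, count=4)
        print("run %d: %.3f s" % (run, elapsed))
```

Raising count and pointing the path at the RAID5 filesystem would bring
it closer to the load described above.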

All in all, there have been more than 190 processes at any point
in time over the past 18 hours.
All this produces a sustained load average between 7 and 9.

vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  4  4 111720   3220  11344 423820   0   0     4 18976 4892  4273   2  68  30
 0  4  3 111720   3204  11352 423728  32   0    80 25216 1460  2095   0  15  85
 0  4  3 111716   3332  11352 423364  76   0    92 25796 1432  1895   2  14  84
 0  4  3 111716   3208  11372 423392  48   0   712 26336 1566  2346   4  14  81
 0  6  3 111716   3208  11412 423196 132   0   420 32820 1774  3113  12  19  69
 0  5  3 111716   3376  11440 422340 704   0   924 24444 1570  2811   3  17  79
 6  2  4 111716   2328  11560 423988 536   0   700 32088 2268  4590   6  73  21
11  3  4 111764  63352  11604 321148  16 308   310 36868 2267  5390  12  46  42
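
To condense output like this into a few numbers, a small parser is
enough. The sketch below is an assumption-laden illustration: the
helper names parse_vmstat and mean are made up, the sample is the first
three data rows shown above, and the column layout is taken from the
header row rather than hard-coded.

```python
SAMPLE = """\
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  4  4 111720   3220  11344 423820   0   0     4 18976 4892  4273   2  68  30
 0  4  3 111720   3204  11352 423728  32   0    80 25216 1460  2095   0  15  85
 0  4  3 111716   3332  11352 423364  76   0    92 25796 1432  1895   2  14  84
"""


def parse_vmstat(text):
    """Turn vmstat output into a list of {column: int} dicts."""
    columns = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            columns = fields  # the second header row names the columns
        elif columns and fields[0].isdigit() and len(fields) == len(columns):
            rows.append(dict(zip(columns, map(int, fields))))
    return rows


def mean(rows, column):
    """Arithmetic mean of one column over all parsed rows."""
    return sum(row[column] for row in rows) / float(len(rows))


if __name__ == "__main__":
    rows = parse_vmstat(SAMPLE)
    for column in ("bo", "sy", "id"):
        print("mean %s: %.1f" % (column, mean(rows, column)))
```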

root@setup:~ {537} $ uptime
  1:37pm  up 17:41, 10 users,  load average: 7.94, 7.31, 7.18

Under these circumstances, I made the following observations:

a) The system has been running stably for more than 18 hours now

b) It seems to behave quite well, given the load.
   Response time for all services (web-proxy, web-server)
   is reasonably low (you almost don't notice any delay)

c) Interactive programs (Mozilla, StarOffice, JEdit) are
   still quite usable. There is some delay when opening
   a file in SO (say, about 2-3 seconds), but that's fine

d) Sometimes (but not really reproducibly) I noticed a
   _big_ delay when connecting to the server using SSH
   (by "big", I mean 1 minute or so). I eventually
   get a connection, and can then work as normal.

e) The server uses a single, but hyperthreaded CPU.
   Hyperthreading is enabled, and Linux shows both
   logical CPUs:

root@setup:~ {529} $ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
stepping        : 7
cpu MHz         : 2392.169
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 4771.02

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
stepping        : 7
cpu MHz         : 2392.169
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 4771.02

   But interrupt distribution seems a little bit strange:

root@setup:~ {530} $ cat /proc/interrupts
           CPU0       CPU1
  0:    6318080          0    IO-APIC-edge  timer
  1:        967          0    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  4:      32477          0    IO-APIC-edge  serial
  5:   55629300          0   IO-APIC-level  eth0
  9:   85639064          0   IO-APIC-level  acpi, ioc0, ioc1
 11:          0          0   IO-APIC-level  usb-ohci
 15:          2          0    IO-APIC-edge  ide1
NMI:          0          0
LOC:    6318529    6318527
ERR:          0
MIS:          0

   With 2.4.21-rc6-ac1, interrupts were counted for both
   logical CPUs. Is this a bug or a feature?
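
The lopsided distribution can also be spotted mechanically. The sketch
below is illustrative only: parse_interrupts and handled_by_one_cpu are
made-up names, the sample is a subset of the /proc/interrupts listing
above, and rows with a single global counter (ERR, MIS) are skipped.

```python
SAMPLE = """\
           CPU0       CPU1
  0:    6318080          0    IO-APIC-edge  timer
  2:          0          0          XT-PIC  cascade
  5:   55629300          0   IO-APIC-level  eth0
  9:   85639064          0   IO-APIC-level  acpi, ioc0, ioc1
LOC:    6318529    6318527
ERR:          0
"""


def parse_interrupts(text):
    """Return (cpu_names, {irq_label: [count_per_cpu, ...]})."""
    lines = [line for line in text.splitlines() if line.strip()]
    cpus = lines[0].split()
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()[:len(cpus)]
        # Keep only rows that carry one numeric counter per CPU.
        if len(fields) == len(cpus) and all(f.isdigit() for f in fields):
            counts[label.strip()] = [int(f) for f in fields]
    return cpus, counts


def handled_by_one_cpu(counts):
    """Labels whose non-zero interrupt total was counted on a single CPU."""
    return sorted(label for label, per_cpu in counts.items()
                  if sum(per_cpu) > 0 and max(per_cpu) == sum(per_cpu))


if __name__ == "__main__":
    cpus, counts = parse_interrupts(SAMPLE)
    print("CPUs:", ", ".join(cpus))
    print("counted on one CPU only:", ", ".join(handled_by_one_cpu(counts)))
```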

HTH

- - andreas

- --
Andreas Haumer                     | mailto:andreas@xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQE+3zMOxJmyeGcXPhERAu6CAKCILyOUfPyGaKG8pvbl4droch6B+ACbBNB/
Dw1L/tRv2JSrOHA12B8BaHM=
=rWPF
-----END PGP SIGNATURE-----


* Re: Linux 2.4.21-rc7
  2003-06-05 12:09 ` Andreas Haumer
@ 2003-06-07 15:46   ` Andreas Haumer
  2003-06-09 10:16     ` [2.4.21-rc7] AP1700-S5 system freeze :-(( Andreas Haumer
  2003-06-11 20:48     ` Linux 2.4.21-rc7 Marcelo Tosatti
  0 siblings, 2 replies; 27+ messages in thread
From: Andreas Haumer @ 2003-06-07 15:46 UTC
  To: Andreas Haumer; +Cc: Marcelo Tosatti, lkml

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Andreas Haumer wrote:
> Hi!
>
> Marcelo Tosatti wrote:
>
>>Hallo,
>>
>>Now I really hope it's the last one, all these rc's are making me mad.
>>
>
> ;-)
>
> So, here's a report on the more positive side...
>
I think I have to take that back... :-((

> As I mentioned in some e-mails in the last few days,
> I'm currently testing an Asus AP1700-S5 server with
> a single Xeon 2.4GHz CPU (FSB533), 512MB RAM and
> 4x36GB U320SCSI drives (3 of them are assembled as RAID5),
> connected via GBit Ethernet to our internal network
>
I had this system running under heavy load for about 24 hours
without problems. I then stopped the stress testing, and have had
several system freezes since then.

By system freeze I mean:

*) machine doesn't respond to ping, no reaction to console
   keyboard, no message on the console screen, no message
   in logfile, no oops, no noticeable system activity

I changed several BIOS settings (disabled hyperthreading,
disabled USB, disabled power management) and tried to run
the kernel with "acpi=off" and "noapic".
I also changed root disk, because I found a SCSI error
message in the logs once.

Nothing seems to help. The system just freezes under light
load at some time between 1 and 8 hours uptime.
It's really strange that it survived heavy load for
more than 24 hours in the first place.

I found some problem reports from several people,
which sound quite similar to the freeze I see here.
These people all had motherboards with serverworks
chipset, GBit ethernet and noticed similar lockups
or system freeze symptoms. From the reports I'm not
sure if the problems still persist or if they should
be solved now. Can someone please comment on that?

Here is some info from the system again:

root@server:~ {505} $ cat /proc/interrupts
           CPU0
  0:     118748    IO-APIC-edge  timer
  1:        274    IO-APIC-edge  keyboard
  2:          0          XT-PIC  cascade
  4:       7011    IO-APIC-edge  serial
  9:    1181037   IO-APIC-level  ioc0, ioc1
 14:       1685   IO-APIC-level  eth0
 15:          2    IO-APIC-edge  ide1
NMI:          0
LOC:     118700
ERR:          0
MIS:          0

root@server:~ {506} $ cat /proc/cmdline
auto BOOT_IMAGE=lx2421rc7 ro root=100 acpi=off

root@server:~ {507} $ uname -a
Linux server 2.4.21-rc7 #1 SMP Wed Jun 4 18:31:15 CEST 2003 i686 unknown

root@server:~ {508} $ lsmod
Module                  Size  Used by    Not tainted
af_packet              13256   1  (autoclean)
e1000                  50028   1  (autoclean)
ext3                   60832   2  (autoclean)
jbd                    40056   2  (autoclean) [ext3]
raid5                  17704   1  (autoclean)
md                     57472   2  (autoclean) [raid5]
xor                     8868   0  (autoclean) [raid5]
unix                   15664  38  (autoclean)
ext2                   33440   4  (autoclean)
sd_mod                 10652  18  (autoclean)
isense                 32404   0  (autoclean) (unused)
mptctl                 19116   0  (autoclean) (unused)
mptscsih               29696   9  (autoclean)
mptbase                32640   5  (autoclean) [isense mptctl mptscsih]
scsi_mod               95748   2  (autoclean) [sd_mod mptscsih]

root@server:~ {511} $ lspci -vvvv
00:00.0 Host bridge: ServerWorks CNB20-HE Host Bridge (rev 31)
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:00.1 Host bridge: ServerWorks CNB20-HE Host Bridge
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:00.2 Host bridge: ServerWorks CNB20-HE Host Bridge
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:02.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet Controller (rev 02)
        Subsystem: Intel Corp. 82540EM Gigabit Ethernet Controller
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (63750ns min), cache line size 08
        Interrupt: pin A routed to IRQ 14
        Region 0: Memory at fd800000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at d800 [size=64]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [e4] PCI-X non-bridge device.
                Command: DPERE- ERO+ RBC=0 OST=0
                Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-
        Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000

00:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA])
        Subsystem: ATI Technologies Inc: Unknown device 8008
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min), cache line size 08
        Interrupt: pin A routed to IRQ 10
        Region 0: Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: I/O ports at d400 [size=256]
        Region 2: Memory at fb800000 (32-bit, non-prefetchable) [size=4K]
        Expansion ROM at febe0000 [disabled] [size=128K]
        Capabilities: [5c] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0f.0 ISA bridge: ServerWorks CSB5 South Bridge (rev 93)
        Subsystem: ServerWorks CSB5 South Bridge
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Latency: 32

00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93) (prog-if 88 [Master SecP])
        Subsystem: ServerWorks CSB5 IDE Controller
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, cache line size 08
        Region 0: I/O ports at <ignored>
        Region 1: I/O ports at <ignored>
        Region 2: I/O ports at <ignored>
        Region 3: I/O ports at <ignored>
        Region 4: I/O ports at a800 [size=16]

00:0f.3 Host bridge: ServerWorks GCLE Host Bridge
        Subsystem: ServerWorks: Unknown device 0230
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0

00:10.0 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr+ DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Capabilities: [60]
00:10.2 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Capabilities: [60]
00:11.0 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr+ DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Capabilities: [60]
00:11.2 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr+ DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Capabilities: [60]
02:04.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 (rev 07)
        Subsystem: LSI Logic / Symbios Logic: Unknown device 1000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 72 (4250ns min, 4500ns max), cache line size 08
        Interrupt: pin A routed to IRQ 9
        Region 0: I/O ports at a000 [size=256]
        Region 1: Memory at fa000000 (64-bit, non-prefetchable) [size=64K]
        Region 3: Memory at f9800000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at fe900000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [68]
02:04.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 (rev 07)
        Subsystem: LSI Logic / Symbios Logic: Unknown device 1000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 72 (4250ns min, 4500ns max), cache line size 08
        Interrupt: pin B routed to IRQ 9
        Region 0: I/O ports at 9800 [size=256]
        Region 1: Memory at f9000000 (64-bit, non-prefetchable) [size=64K]
        Region 3: Memory at f8800000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at fe800000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [68]
03:02.0 Ethernet controller: Intel Corp. 82544GC Gigabit Ethernet Controller (LOM) (rev 02)
        Subsystem: Intel Corp.: Unknown device 110d
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (63750ns min), cache line size 08
        Interrupt: pin A routed to IRQ 5
        Region 0: Memory at f8000000 (64-bit, non-prefetchable) [size=128K]
        Region 2: Memory at f7800000 (64-bit, non-prefetchable) [size=128K]
        Region 4: I/O ports at 9400 [size=32]
        Expansion ROM at fe7e0000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [e4] PCI-X non-bridge device.
                Command: DPERE- ERO+ RBC=0 OST=0
                Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-
        Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000

Any idea how I should proceed now?
I really could use some help here, I'm running out
of ideas... :-((

- - andreas

- --
Andreas Haumer                     | mailto:andreas@xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQE+4gjsxJmyeGcXPhERAsT4AJ9sylkxso5kXO51+6c5bfskVV2meACgrF33
t8xXYpu6FGPsiQ9VBmnk6ek=
=Yov+
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [2.4.21-rc7] AP1700-S5 system freeze :-((
  2003-06-07 15:46   ` Andreas Haumer
@ 2003-06-09 10:16     ` Andreas Haumer
  2003-06-09 11:46       ` Stephan von Krawczynski
  2003-06-11 20:48     ` Linux 2.4.21-rc7 Marcelo Tosatti
  1 sibling, 1 reply; 27+ messages in thread
From: Andreas Haumer @ 2003-06-09 10:16 UTC (permalink / raw)
  To: Andreas Haumer; +Cc: lkml

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Note: I'm reporting this with a different subject line now,
as I got zero replies to my first bugreport. This is still
the same Asus AP1700-S5 server as in my previous reports,
though:

Asus AP1700-S5 server, single Xeon 2.4GHz CPU (FSB533)
512MB registered DDR with ECC, Asus PR-DLS533 motherboard
with ServerWorks GCLE chipset

root@server:~ {535} $ lspci
00:00.0 Host bridge: ServerWorks CNB20-HE Host Bridge (rev 31)
00:00.1 Host bridge: ServerWorks CNB20-HE Host Bridge
00:00.2 Host bridge: ServerWorks CNB20-HE Host Bridge
00:03.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 ISA bridge: ServerWorks CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)
00:0f.3 Host bridge: ServerWorks GCLE Host Bridge
00:10.0 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
00:10.2 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
00:11.0 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
00:11.2 Host bridge: ServerWorks: Unknown device 0101 (rev 03)
01:02.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 74)
02:04.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 (rev 07)
02:04.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 (rev 07)

Andreas Haumer wrote:
[...]
> I had this system running under heavy load for about 24 hours
> without problems. I then stopped the stress testing, and had
> several system freezes since then.
>
> With system freeze I mean:
>
> *) machine doesn't answer to ping, no reaction to console
>    keyboard, no message on the console screen, no message
>    in logfile, no oops, no noticeable system activity
>
I just had another freeze or lockup of this system,
after 1 day and 14 hours uptime. :-(

This time the machine was running with a 3Com 3c905c
100MBit NIC, with the onboard e1000 GBit controllers disabled.
Obviously, this didn't help either...

When I noticed the freeze, I tried to ping the server,
and got a few replies back, but with a delay of more than
60 seconds! I didn't wait that long when I tried to ping
the server on the previous lockups, so maybe the "no answer
to ping" symptom I described is more a "big delay in
answering ping packets" symptom. Does that ring any bell?

Any idea anyone?

- - andreas

- --
Andreas Haumer                     | mailto:andreas@xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQE+5F6HxJmyeGcXPhERApOfAJ4klAsR0lA8Zzk5s22quImzxud6agCgvAi1
FXZuNQV3C4UaKVi9gOvtJFM=
=qL4B
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [2.4.21-rc7] AP1700-S5 system freeze :-((
  2003-06-09 10:16     ` [2.4.21-rc7] AP1700-S5 system freeze :-(( Andreas Haumer
@ 2003-06-09 11:46       ` Stephan von Krawczynski
  2003-06-09 12:21         ` Andreas Haumer
  0 siblings, 1 reply; 27+ messages in thread
From: Stephan von Krawczynski @ 2003-06-09 11:46 UTC (permalink / raw)
  To: Andreas Haumer; +Cc: linux-kernel

Hello Andreas,

I am not quite sure if you are experiencing something similar to my problem.
Fact is this:

I have a serverworks based dual PIII board and I am experiencing freezes just
about every day. 

Equal setups:

Kernel 2.4.21-rc7
00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (me: rev 23 you: rev 31)
00:00.1 Host bridge: ServerWorks CNB20HE Host Bridge (rev 01)

Lockups during light load


Differing:

Just about everything else:
                       yours:            mine:
Storage System:        Symbios           AIC
VGA           :        ATI Rage XL       ATI Radeon RV200
Network       :        Intel/3com        Intel/Broadcom
Processor     :        Xeon UP           PIII SMP


I could already produce oops-messages on the problem and mine all come up in
kmem_cache_alloc_batch. It would be interesting where your box freezes. It
cannot be at this same place, because the code is not there in UP.
Try this (in case you are not working in front of the box):

Start box and switch to text console, enter "setterm -blank 0" to disable
screen blanker. Wait for oops. If we are lucky you will see something, get a
pencil then :-)

-- 
Regards,
Stephan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [2.4.21-rc7] AP1700-S5 system freeze :-((
  2003-06-09 11:46       ` Stephan von Krawczynski
@ 2003-06-09 12:21         ` Andreas Haumer
  0 siblings, 0 replies; 27+ messages in thread
From: Andreas Haumer @ 2003-06-09 12:21 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Many thanks for your reply!

Stephan von Krawczynski wrote:
> Hello Andreas,
>
> I am not quite sure if you are experiencing something similar to my problem.
> Fact is this:
>
> I have a serverworks based dual PIII board and I am experiencing freezes just
> about every day.
>
> Equal setups:
>
> Kernel 2.4.21-rc7
> 00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (me: rev 23 you: rev 31)
> 00:00.1 Host bridge: ServerWorks CNB20HE Host Bridge (rev 01)
>
> Lockups during light load
>
Me too.
I had it running for 24 hours with heavy stress testing
and a load above 7 all the time without problems. I then
stopped this test, and the box locked up 2 hours later,
and locked up about 7 or 8 times in the past few days :-(

>
> Differing:
>
> Just about everything else:
>                        yours:            mine:
> Storage System:        Symbios           AIC

This is not a "normal" symbios logic "sym53c8xx"
storage controller, but a "Symbios Logic 53c1030",
which uses the Fusion MPT driver. This is the first
time I'm running this driver, so I don't know if it's
considered stable (but I guess so)
Unfortunately I can't replace it as I don't have any
spare SCSI controller which fits right now.

> VGA           :        ATI Rage XL       ATI Radeon RV200
> Network       :        Intel/3com        Intel/Broadcom
> Processor     :        Xeon UP           PIII SMP
>
>
> I could already produce oops-messages on the problem and mine all come up in
> kmem_cache_alloc_batch. It would be interesting where your box freezes. It
> cannot be at this same place, because the code is not there in UP.
> Try this (in case you are not working in front of the box):
>
> Start box and switch to text console, enter "setterm -blank 0" to disable
> screen blanker. Wait for oops. If we are lucky you will see something, get a
> pencil then :-)
>
I always have the system running with text console and
screen blanking disabled. Alas, I see no oops :-(

IMHO it doesn't look like the kernel crashes with an oops,
it does look more like it suddenly goes into an endless
loop or ridiculously high load somehow.
Last time I had this freeze, I noticed that the system
answered my ICMP ping messages with a delay of more than
60 seconds. This looked like the system was very busy
at that time.

I'm now running with 2.4.20rc2, and also have syslog
routed to another system on the network. We'll see if
I can get any more information out of this.

- - andreas

- --
Andreas Haumer                     | mailto:andreas@xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQE+5HvjxJmyeGcXPhERAvOvAJ94cQS4tlzylHiVU084v7FK/e/aowCgw4w9
M3YWSHXzx9IuKeU4Z6WicEk=
=8102
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
  2003-06-07 15:46   ` Andreas Haumer
  2003-06-09 10:16     ` [2.4.21-rc7] AP1700-S5 system freeze :-(( Andreas Haumer
@ 2003-06-11 20:48     ` Marcelo Tosatti
       [not found]       ` <1055408183.2552.18.camel@tor.trudheim.com>
  1 sibling, 1 reply; 27+ messages in thread
From: Marcelo Tosatti @ 2003-06-11 20:48 UTC (permalink / raw)
  To: Andreas Haumer; +Cc: lkml



On Sat, 7 Jun 2003, Andreas Haumer wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi!
>
> Andreas Haumer wrote:
> > Hi!
> >
> > Marcelo Tosatti wrote:
> >
> >>Hallo,
> >>
> >>Now I really hope its the last one, all this rc's are making me mad.
> >>
> >
> > ;-)
> >
> > So, here's a report on the more positive side...
> >
> I think, I have to take that back... :-((
>
> > As I mentioned in some e-mails in the last few days,
> > I'm currently testing an Asus AP1700-S5 server with
> > a single Xeon 2.4GHz CPU (FSB533), 512MB RAM and
> > 4x36GB U320SCSI drives (3 of them are assembled as RAID5),
> > connected via GBit Ethernet to our internal network
> >
> I had this system running under heavy load for about 24 hours
> without problems. I then stopped the stress testing, and had
> several system freezes since then.
>
> With system freeze I mean:
>
> *) machine doesn't answer to ping, no reaction to console
>    keyboard, no message on the console screen, no message
>    in logfile, no oops, no noticeable system activity

Maybe the NMI oopser helps?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
       [not found]       ` <1055408183.2552.18.camel@tor.trudheim.com>
@ 2003-06-12  9:35         ` Andreas Haumer
  0 siblings, 0 replies; 27+ messages in thread
From: Andreas Haumer @ 2003-06-12  9:35 UTC (permalink / raw)
  To: Anders Karlsson; +Cc: Marcelo Tosatti, Linux Kernel Mailing List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Anders Karlsson wrote:
> On Wed, 2003-06-11 at 21:48, Marcelo Tosatti wrote:
>
>>On Sat, 7 Jun 2003, Andreas Haumer wrote:
>
> [snip]
>
>>>I had this system running under heavy load for about 24 hours
>>>without problems. I then stopped the stress testing, and had
>>>several system freezes since then.
>>>
>>>With system freeze I mean:
>>>
>>>*) machine doesn't answer to ping, no reaction to console
>>>   keyboard, no message on the console screen, no message
>>>   in logfile, no oops, no noticeable system activity
>
>
> I have this problem without actually stressing the machine too hard. The
> average load on my Thinkpad over a weekend would perhaps be 0.05, yet I
> can have several hard hangs where there seems to be no trace of a hang
> at all in logfiles.
>
I have to admit that "system freeze" is a quite unspecific
symptom. It could have a zillion different causes.

In my case I'm currently chasing SCSI errors which I think
could have something to do with it (besides, it's _not_ an Adaptec
controller, but a LSI 53c1030 with Fusion MPT driver... :-)

In my server logs I sometimes see SCSI timeouts like this:

[...]
scsi : aborting command due to timeout : pid 1148093, scsi0, channel 0, id 1, lun 0 Read (10) 00 00 00 0f af 00 00 10 00
mptscsih: OldAbort scheduling ABORT SCSI IO (sc=dfca8e00)
  IOs outstanding = 3
mptscsih: ioc0: Issue of TaskMgmt Successful!
SCSI host 0 abort (pid 1148093) timed out - resetting
SCSI bus is being reset for host 0 channel 0.
mptscsih: OldReset scheduling BUS_RESET (sc=dfca8e00)
  IOs outstanding = 4
SCSI Error Report =-=-= (0:0:0)
  SCSI_Status=02h (CHECK CONDITION)
  Original_CDB[]: 2A 00 00 3C 4D 78 00 00 02 00 - "WRITE(10)"
  SenseData[20h]: 70 00 06 00 00 00 00 18 00 00 00 00 29 02 00 00 00 00 ...
  SenseKey=6h (UNIT ATTENTION); FRU=00h
  ASC/ASCQ=29h/02h "SCSI BUS RESET OCCURRED"
SCSI Error Report =-=-= (0:1:0)
  SCSI_Status=02h (CHECK CONDITION)
  Original_CDB[]: 28 00 00 00 0F AF 00 00 10 00 - "READ(10)"
  SenseData[20h]: 70 00 06 00 00 00 00 18 00 00 00 00 29 02 00 00 00 00 ...
  SenseKey=6h (UNIT ATTENTION); FRU=00h
  ASC/ASCQ=29h/02h "SCSI BUS RESET OCCURRED"
SCSI Error Report =-=-= (0:2:0)
  SCSI_Status=02h (CHECK CONDITION)
  Original_CDB[]: 28 00 00 4E 0A 37 00 00 08 00 - "READ(10)"
  SenseData[20h]: 70 00 06 00 00 00 00 18 00 00 00 00 29 02 00 00 00 00 ...
  SenseKey=6h (UNIT ATTENTION); FRU=00h
  ASC/ASCQ=29h/02h "SCSI BUS RESET OCCURRED"
SCSI Error Report =-=-= (0:3:0)
  SCSI_Status=02h (CHECK CONDITION)
  Original_CDB[]: 28 00 03 B0 08 6F 00 00 08 00 - "READ(10)"
  SenseData[20h]: 70 00 06 00 00 00 00 18 00 00 00 00 29 02 00 00 00 00 ...
  SenseKey=6h (UNIT ATTENTION); FRU=00h
  ASC/ASCQ=29h/02h "SCSI BUS RESET OCCURRED"
[...]
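
The fields in the error reports above can be cross-checked by hand.
What follows is a minimal Python sketch, assuming the SenseData bytes
are fixed-format sense data (response code 70h) and the CDBs are
10-byte READ(10)/WRITE(10) commands. The function names and the small
lookup tables are illustrative only; they are not taken from the
mptscsih driver.

```python
# Sketch: decode fixed-format SCSI sense data and a 10-byte CDB.
# Names and tables here are illustrative, not from the mptscsih driver.

SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def parse_hex(text):
    """Turn a string like '70 00 06' into bytes."""
    return bytes(int(tok, 16) for tok in text.split())


def decode_sense(sense):
    """Extract sense key, ASC and ASCQ from fixed-format sense data."""
    if len(sense) < 14 or (sense[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = sense[2] & 0x0F          # low nibble of byte 2
    return {
        "sense_key": key,
        "sense_key_name": SENSE_KEYS.get(key, "unknown"),
        "asc": sense[12],          # additional sense code
        "ascq": sense[13],         # additional sense code qualifier
    }


def decode_cdb10(cdb):
    """Extract opcode, logical block address and block count from a 10-byte CDB."""
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    return {
        "opcode": OPCODES.get(cdb[0], hex(cdb[0])),
        "lba": int.from_bytes(cdb[2:6], "big"),
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }
```

Fed the bytes from the (0:1:0) report, this should give sense key 6h
with ASC/ASCQ 29h/02h, matching what the driver printed, so the
reports look like the normal aftermath of the bus reset rather than
independent disk errors.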

There are 4 hot swap SCSI disks in the server, and all of them
eventually report those timeouts (so it's not specific to a single
disk).
I already replaced cabling, tried a different hot swap (SCA)
cage, and I'm now trying to replace the disks one by one to
eventually find the culprit.
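
To see whether the timeouts really spread evenly over the disks, the
"aborting command due to timeout" lines can be tallied per target.
This is only a sketch: the regular expression is modelled on the one
log line quoted above, and other kernels or drivers may word the
message differently.

```python
import re
from collections import Counter

# Pattern modelled on the single timeout line quoted in this mail;
# treat it as an assumption, not a stable log format.
TIMEOUT_RE = re.compile(
    r"aborting command due to timeout\s*:\s*pid\s+(\d+),"
    r"\s*scsi(\d+),\s*channel\s+(\d+),\s*id\s+(\d+),\s*lun\s+(\d+)"
)


def count_timeouts(lines):
    """Count timeout aborts per (host, channel, id, lun) tuple."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            _pid, host, channel, target, lun = map(int, match.groups())
            counts[(host, channel, target, lun)] += 1
    return counts
```

Passing it the lines of the syslog file (wherever the system keeps
it) gives one counter entry per disk; a single target dominating the
counts would point at that disk, an even spread more at the bus,
cage or controller.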

There are two problems with this approach:

1.) After each change I have to wait several hours up to two
    days for a SCSI timeout to occur as I can not reproduce
    the problem at will.

2.) I'm not _sure_ if those SCSI timeouts are related to the server
    freeze symptoms I see. It's just an assumption.
    IMHO it could work as follows: SCSI timeouts occur sometimes.
    The driver then aborts the command and resets the SCSI bus
    to get it into a sane state again. But what if the bus reset
    doesn't work as expected and the bus remains unusable for a
    while? Could this bring the whole system into this "freeze"
    state (the system is still running, but everything waits for
    the SCSI bus to recover)? Could this explain the symptom of
    those big delays of ICMP ping answer messages I saw?

So the most precious resource for chasing this problem is time,
and this is also the resource which I don't have available as
much as I'd like to... :-(

>
>>Maybe the NMI oopser helps?
>
>
> Marcelo, where can I get hold of this and would there be documentation
> included with it for how to install/use it?
>
Look at /usr/src/linux/Documentation/nmi_watchdog.txt

Regards,

- - andreas

- --
Andreas Haumer                     | mailto:andreas@xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQE+6El7xJmyeGcXPhERAqykAKCumORTm/lDofkrg52FX33rOfgC/ACeNxR7
l9/znrbi0lZoR/zw+LTdNhI=
=W7Gt
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
  2003-06-08 20:17 Clayton Weaver
  2003-06-08 20:51 ` Bartlomiej Zolnierkiewicz
@ 2003-06-08 21:47 ` Willy Tarreau
  1 sibling, 0 replies; 27+ messages in thread
From: Willy Tarreau @ 2003-06-08 21:47 UTC (permalink / raw)
  To: Clayton Weaver; +Cc: willy, linux-kernel

 
> Note that "nodma" is unnecessary on this
> same box running kernel 2.4.19-rc2. Why would
> 2.4.21-rcX need it? To pin down whether the
> problem is in the ide dma code or some other
> part of the ide code?

exactly, because DMA needs more conditions than PIO to run at all
and even more to run reliably. There are lots of cases where DMA
doesn't work while PIO does.
 
> It does not die more easily with 2.4.19-rc2
> (in my opinion). It dies in a threads context
> but not in a forks context, where the threads
> and the forks are doing the same i/o to/from
> the same controller/disk (different versions
> of same program).
>
> I have also seen it freeze with an unlucky
> mouse click in XFree86 4.0 under 2.4.19-rc2,
> so I did not assume that the threads hang
> was necessarily ide-relevant. Something
> disk i/o intensive was merely what it
> happened to be doing with those threads,
> but that problem seemed to me more thread
> related than ide related. (Guess I'll have
> to spawn a bunch of threads doing some other
> kind of i/o to test that assumption.)

OK, but a freeze isn't acceptable anyway, whatever you were doing,
because it always means a bug somewhere.

Cheers,
Willy

PS: your lines were shorter this way :-)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
  2003-06-08 20:17 Clayton Weaver
@ 2003-06-08 20:51 ` Bartlomiej Zolnierkiewicz
  2003-06-08 21:47 ` Willy Tarreau
  1 sibling, 0 replies; 27+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-06-08 20:51 UTC (permalink / raw)
  To: Clayton Weaver; +Cc: linux-kernel


Please stop comparing 2.4.19-rc2 to 2.4.21-rc7.
Just go through 2.4.20-pre/-rc and 2.4.21-pre/-rc
and find when things broke if you want them fixed.
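
The search itself is a plain bisection over an ordered list of
releases. A minimal Python sketch, assuming the breakage is monotonic
(once a release is bad, all later ones are bad too). The is_bad
callback is hypothetical: in practice it means "boot that kernel, run
the workload, see whether it freezes".

```python
def first_bad(versions, is_bad):
    """Return the first version for which is_bad() is true, or None.

    versions must be ordered oldest to newest, and the breakage is
    assumed to persist once introduced.
    """
    if not versions or not is_bad(versions[-1]):
        return None                  # nothing to find
    if is_bad(versions[0]):
        return versions[0]           # already broken at the start
    lo, hi = 0, len(versions) - 1    # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Only releases named in this thread; not the full list of -pre/-rc
# kernels that would really have to be walked through.
VERSIONS = ["2.4.19-rc2", "2.4.20", "2.4.21-pre1", "2.4.21-rc6", "2.4.21-rc7"]
```

With N candidate kernels this needs about log2(N) boot-and-test
rounds instead of N, which matters when each round can take hours.
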
--
Bartlomiej


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
@ 2003-06-08 20:17 Clayton Weaver
  2003-06-08 20:51 ` Bartlomiej Zolnierkiewicz
  2003-06-08 21:47 ` Willy Tarreau
  0 siblings, 2 replies; 27+ messages in thread
From: Clayton Weaver @ 2003-06-08 20:17 UTC (permalink / raw)
  To: willy; +Cc: linux-kernel

----- Original Message -----
From: Willy Tarreau <willy@w.ods.org>
Date: Sun, 8 Jun 2003 11:47:29 +0200
To: Clayton Weaver <cgweav@email.com>
Subject: Re: Linux 2.4.21-rc7

> Hi !

Greets.

> [ first, please fix your mailer and cut your lines, it's not easy to quote you in replies ]

Long lines?

email.com is a web mailer. If it is failing
to wrap where I put newlines, I'll see what I
can do.
 
> On Sun, Jun 08, 2003 at 03:54:48AM -0500, Clayton Weaver wrote:
> > > Now I really hope its the last one, all this
> > > rc's are making me mad.

> > We still have ide problems, and I don't see any
> > potential fixes for that in the changelog between -rc6 and -rc7.
> > 
> > I tried -rc6 on a whim and had hda report
> > a timeout (dma, I think, but the message went by kind of quick), then the big freeze with the
> > disk light stuck on. Never happened in 6 months on the same hardware running
> > 2.4.19-rc2 (with glibc-2.2.5, gcc-2.95.3, binutils-2.12.90.0.9, all ext2 filesystems).
 
> Did you try with "ide0=nodma", or other similar options ?

No.

Note that "nodma" is unnecessary on this
same box running kernel 2.4.19-rc2. Why would
2.4.21-rcX need it? To pin down whether the
problem is in the ide dma code or some other
part of the ide code?

> > SiS530/5513, k6-II/450, udma33 Maxtor drive that 2.4.19-rc2 has no problems with.

Here is the data on the drive from hdparm
while running under 2.4.19-rc2. rc.local
executes "hdparm -c 1 /dev/hda" at boot.

hdparm -v:

/dev/hda:
 multcount    = 16 (on)
 IO_support   =  1 (32-bit)
 unmaskirq    =  0 (off)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 1655/255/63, sectors = 26588016, start = 0

hdparm -i:

/dev/hda:

 Model=Maxtor 91360U4, FwRev=MA540RR0, SerialNo=C40LMAFC
 Config={ Fixed }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
 BuffType=DualPortCache, BuffSize=2048kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=26588016
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 *udma2
 AdvancedPM=yes: disabled (255) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-4 T13 1153D revision 17:  1 2 3 4 5

> That's not exactly what you said below. You said that you could reliably kill it with 32 threads...
> Perhaps you have broken hardware, and 2.4.21 stresses it more than 2.4.19-rc2. Perhaps it's
> really an old driver bug, in which case reporting it when you first encountered it would have been
> more constructive than telling us at 2.4.21 time that it dies even more easily than a one year old
> 2.4.19-rc2.

It does not die more easily with 2.4.19-rc2
(in my opinion). It dies in a threads context
but not in a forks context, where the threads
and the forks are doing the same i/o to/from
the same controller/disk (different versions
of same program).

I have also seen it freeze with an unlucky
mouse click in XFree86 4.0 under 2.4.19-rc2,
so I did not assume that the threads hang
was necessarily ide-relevant. Something
disk i/o intensive was merely what it
happened to be doing with those threads,
but that problem seemed to me more thread
related than ide related. (Guess I'll have
to spawn a bunch of threads doing some other
kind of i/o to test that assumption.)

[]

> > (Better to find out sooner than release
> > 2.4.21-stable and watch 52 different bug reports on it arrive at the list the next day.)
 
> Well, look through the archives, there have been two patches by Lionel Bouton and Vojtech Pavlik
> posted in May for the 5513 driver, to support newer chipsets. I don't know if they have been
> included, nor if they also fixed old bugs. Perhaps you'll be interested in checking them.

(SiS530 is not newer, k6-II era, but it
is worth a look anyway.)

The SiS5513 driver seems fine. You can
hammer on it all day with this motherboard
with gcc, multiple smb mounts, gigabyte ftp or
sftp transfers, etc, in parallel, and no blinks from the hard drive (modulo threads or the X-server under 2.4.19-rc2).

(Why 2.4.19-rc2? It mostly works, ie it is
stable for what I typically use that box
for. Someone running a different application
mix or different hardware might consider it useless crap. It has the lcall fix and a
few other minor bug fixes that were posted
to the kernel list between then and now.)
 
> BTW, someone reported yesterday that his 5513 worked flawlessly in 2.4.20, but behaved like yours
> on 2.5.70. Have you tested 2.4.20, or better, have you tried to narrow the problem down to a
> particular version (but I bet it will be tied to the introduction of the newer IDE code).

No. (I do actually need this thing to work at
times.) The newer ide code as the source of the
problem matches my hunch. Maybe the kernel
debugging that I enabled at compile time will
come up with something (*before* the
deadlock, so it can actually log an anomaly).

The newer ide code may have found a bug in
the SiS5513 driver that the old code did
not exercise. Let us hope not, because then
a fix only fixes it for me and other users
of that driver, while lots of people with
other kinds of ide hardware seem to be
reporting similar problems.

My guess is that the problems are upstream
of any specific driver, but that is merely
a hunch. (It is possible that they all do
the same wrong thing *in the drivers*.)

> You may also try -ac kernels which have more recent, but less tested code.

> Regards,
> Willy

Thanks for the insight.

Regards,

Clayton Weaver
<mailto: cgweav@email.com>

-- 
_______________________________________________
Sign-up for your own FREE Personalized E-mail at Mail.com
http://www.mail.com/?sr=signup


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
  2003-06-08  8:54 Clayton Weaver
@ 2003-06-08  9:47 ` Willy Tarreau
  0 siblings, 0 replies; 27+ messages in thread
From: Willy Tarreau @ 2003-06-08  9:47 UTC (permalink / raw)
  To: Clayton Weaver; +Cc: linux-kernel

Hi !

[ first, please fix your mailer and cut your lines, it's not easy to quote you in replies ]

On Sun, Jun 08, 2003 at 03:54:48AM -0500, Clayton Weaver wrote:
> > Now I really hope its the last one, all this
> > rc's are making me mad.
> 
> We still have ide problems, and I don't see any
> potential fixes for that in the changelog between -rc6 and -rc7.
> 
> I tried -rc6 on a whim and had hda report
> a timeout (dma, I think, but the message went by kind of quick), then the big freeze with the
> disk light stuck on. Never happened in 6 months on the same hardware running
> 2.4.19-rc2 (with glibc-2.2.5, gcc-2.95.3, binutils-2.12.90.0.9, all ext2 filesystems).

Did you try with "ide0=nodma", or other similar options ?

> SiS530/5513, k6-II/450, udma33 Maxtor drive that 2.4.19-rc2 has no problems with.

That's not exactly what you said below. You said that you could reliably kill it with 32 threads...
Perhaps you have broken hardware, and 2.4.21 stresses it more than 2.4.19-rc2. Perhaps it's
really an old driver bug, in which case reporting it when you first encountered it would have been
more constructive than telling us at 2.4.21 time that it dies even more easily than a one year old
2.4.19-rc2.

> You can release a 2.4.21 anyway, of course, but without finding out where the ide livelock (and other big freezes, thinking of the report on the all-scsi system already posted) originates, calling it "stable" would be a bit fanciful.

That's what -pre and -rc are for : bug reports. The ide code has been included in 2.4.21-pre1,
several months ago. There's always a risk of breaking someone's setup, but obviously, if people
don't try pre-releases and don't report problems in time, how could they hope to get a stable
kernel on their hardware ?

> Not what you wanted to hear, right? Oh well.
> 
> (Better to find out sooner than release
> 2.4.21-stable and watch 52 different bug reports on it arrive at the list the next day.)

Well, look through the archives, there have been two patches by Lionel Bouton and Vojtech Pavlik
posted in May for the 5513 driver, to support newer chipsets. I don't know if they have been
included, nor if they also fixed old bugs. Perhaps you'll be interested in checking them.

BTW, someone reported yesterday that his 5513 worked flawlessly in 2.4.20, but behaved like yours
on 2.5.70. Have you tested 2.4.20, or better, have you tried to narrow the problem down to a
particular version (but I bet it will be tied to the introduction of the newer IDE code).

You may also try -ac kernels which have more recent, but less tested code.

Regards,
Willy


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
@ 2003-06-08  8:54 Clayton Weaver
  2003-06-08  9:47 ` Willy Tarreau
  0 siblings, 1 reply; 27+ messages in thread
From: Clayton Weaver @ 2003-06-08  8:54 UTC (permalink / raw)
  To: linux-kernel

> Now I really hope its the last one, all this
> rc's are making me mad.

We still have ide problems, and I don't see any
potential fixes for that in the changelog between -rc6 and -rc7.

I tried -rc6 on a whim and had hda report
a timeout (dma, I think, but the message went by kind of quick), then the big freeze with the
disk light stuck on. Never happened
in 6 months on the same hardware running
2.4.19-rc2 (with glibc-2.2.5, gcc-2.95.3,
binutils-2.12.90.0.9, all ext2 filesystems).

I recompiled with all kernel debugging options
enabled and disabled partition statistics, since that was the one thing that was obviously new about the enabled ide options (I didn't select
any other new options, but of course the kernel code underneath is probably different, so one could not conclude anything from such meager
testing). It ran for about 8 hours without freezing, with that drive doing a lot more
work than it was doing when it livelocked.

e2fsck reported errors on the next reboot, though,
and it's been rebooted into 2.4.19-rc2 to get some
other work done with it since then (caching the source for an upgrade of a 2.2.x box, different libc, yada yada, needs to be reliable until
that is finished).

SiS530/5513, k6-II/450, udma33 Maxtor drive that 2.4.19-rc2 has no problems with.

You can release a 2.4.21 anyway, of course, but without finding out where the ide livelock (and other big freezes, thinking of the report on the all-scsi system already posted) originates, calling it "stable" would be a bit fanciful.

(2.4.19-rc2 has its own quirks, of course, but
not "single-threaded ide livelock with this
chipset and ide drive". I can reliably kill it with 32 threads depth-first scanning different directory trees on that same disk in parallel, unfortunately without an oops to show for it.
It is not running out of memory (no ENOMEM reports), merely some mundane race condition or missing lock or whatever. Change it to 32 forks running in parallel, and they finish normally, though of course not all that quickly while seek-thrashing one and the same disk between them.)
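
The two load patterns described above (N threads in one process versus
N forked processes, each walking its own directory tree depth-first)
can be sketched roughly as follows. This is only an illustrative Python
sketch, not the program used for the report; the names scan_tree,
scan_with_threads and scan_with_forks are hypothetical, and the fork
variant assumes a POSIX system.

```python
import os
import threading


def scan_tree(root):
    """Depth-first walk of one directory tree; returns the number of entries seen."""
    count = 0
    stack = [root]  # LIFO stack of directories still to visit
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    count += 1
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable or vanished directory: skip it and keep scanning.
            continue
    return count


def scan_with_threads(roots):
    """One thread per tree, all inside a single process."""
    results = [0] * len(roots)

    def worker(index, root):
        results[index] = scan_tree(root)

    threads = [threading.Thread(target=worker, args=(i, r))
               for i, r in enumerate(roots)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def scan_with_forks(roots):
    """One forked child process per tree (POSIX only)."""
    children = []
    for root in roots:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: scan, report the count through the pipe, exit at once.
            os.close(read_fd)
            status = 1
            try:
                os.write(write_fd, str(scan_tree(root)).encode())
                status = 0
            finally:
                os._exit(status)
        os.close(write_fd)
        children.append((pid, read_fd))
    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as pipe:
            results.append(int(pipe.read() or 0))
        os.waitpid(pid, 0)
    return results
```

Passing a list of 32 distinct directory trees on the same disk to
scan_with_threads and then to scan_with_forks gives the two cases
compared above. Whether either one triggers a freeze depends on the
kernel and hardware under test; the sketch only generates the load.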

Not what you wanted to hear, right? Oh well.

(Better to find out sooner than release
2.4.21-stable and watch 52 different bug reports on it arrive at the list the next day.)

Regards,

Clayton Weaver
<mailto: cgweav@email.com>



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
  2003-06-03 18:50 ` Marc-Christian Petersen
@ 2003-06-03 19:38   ` Christoph Hellwig
  0 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2003-06-03 19:38 UTC (permalink / raw)
  To: Marc-Christian Petersen
  Cc: Margit Schubert-While, linux-kernel, Marcelo Tosatti

On Tue, Jun 03, 2003 at 08:50:00PM +0200, Marc-Christian Petersen wrote:
> On Tuesday 03 June 2003 20:45, Margit Schubert-While wrote:
> 
> > if [ -r System.map ]; then /sbin/depmod -ae -F System.map  2.4.21-rc7; fi
> > depmod: *** Unresolved symbols in
> > /lib/modules/2.4.21-rc7/kernel/drivers/net/wan/comx.o
> > depmod:         proc_get_inode
> 
> attached.
> 
> hch: I know what you'll say, so don't reply ;-))

So add the message yourself if you don't want me to reply.

For those who haven't heard before:  this is _not_ a correct
fix.  proc_get_inode is not exported for a reason and the whole
procfs mess in comx needs a rewrite.  Given that no one looked
into this over the last three years I guess we should rather
remove the driver..


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
  2003-06-03 18:45 Margit Schubert-While
@ 2003-06-03 18:50 ` Marc-Christian Petersen
  2003-06-03 19:38   ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Marc-Christian Petersen @ 2003-06-03 18:50 UTC (permalink / raw)
  To: Margit Schubert-While, linux-kernel; +Cc: Marcelo Tosatti

[-- Attachment #1: Type: text/plain, Size: 337 bytes --]

On Tuesday 03 June 2003 20:45, Margit Schubert-While wrote:

> if [ -r System.map ]; then /sbin/depmod -ae -F System.map  2.4.21-rc7; fi
> depmod: *** Unresolved symbols in
> /lib/modules/2.4.21-rc7/kernel/drivers/net/wan/comx.o
> depmod:         proc_get_inode

attached.

hch: I know what you'll say, so don't reply ;-))

ciao, Marc



[-- Attachment #2: 01_comx-driver-compile-1.patch --]
[-- Type: text/x-diff, Size: 258 bytes --]

--- 2.4.19pre8aa2/fs/proc/root.c.~1~	Fri May  3 02:12:18 2002
+++ 2.4.19pre8aa2/fs/proc/root.c	Sat May  4 13:45:30 2002
@@ -145,3 +145,4 @@
 EXPORT_SYMBOL(proc_net);
 EXPORT_SYMBOL(proc_bus);
 EXPORT_SYMBOL(proc_root_driver);
+EXPORT_SYMBOL(proc_get_inode);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Linux 2.4.21-rc7
@ 2003-06-03 18:45 Margit Schubert-While
  2003-06-03 18:50 ` Marc-Christian Petersen
  0 siblings, 1 reply; 27+ messages in thread
From: Margit Schubert-While @ 2003-06-03 18:45 UTC (permalink / raw)
  To: linux-kernel

if [ -r System.map ]; then /sbin/depmod -ae -F System.map  2.4.21-rc7; fi
depmod: *** Unresolved symbols in 
/lib/modules/2.4.21-rc7/kernel/drivers/net/wan/comx.o
depmod:         proc_get_inode

Margit


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2003-06-12  9:24 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
2003-06-03 17:04 Linux 2.4.21-rc7 Marcelo Tosatti
2003-06-03 18:02 ` Tomas Szepe
2003-06-03 18:07   ` Marcelo Tosatti
2003-06-03 19:15     ` lk
2003-06-03 19:40       ` Alan Cox
2003-06-03 18:30 ` Alex Romosan
2003-06-03 19:27   ` Jeff Garzik
2003-06-03 19:58     ` Alex Romosan
2003-06-03 20:14       ` Tom Rini
2003-06-04  3:35         ` David S. Miller
2003-06-04 15:09           ` Mr. James W. Laferriere
2003-06-04 23:37           ` Alex Romosan
2003-06-05 12:09 ` Andreas Haumer
2003-06-07 15:46   ` Andreas Haumer
2003-06-09 10:16     ` [2.4.21-rc7] AP1700-S5 system freeze :-(( Andreas Haumer
2003-06-09 11:46       ` Stephan von Krawczynski
2003-06-09 12:21         ` Andreas Haumer
2003-06-11 20:48     ` Linux 2.4.21-rc7 Marcelo Tosatti
     [not found]       ` <1055408183.2552.18.camel@tor.trudheim.com>
2003-06-12  9:35         ` Andreas Haumer
2003-06-03 18:45 Margit Schubert-While
2003-06-03 18:50 ` Marc-Christian Petersen
2003-06-03 19:38   ` Christoph Hellwig
2003-06-08  8:54 Clayton Weaver
2003-06-08  9:47 ` Willy Tarreau
2003-06-08 20:17 Clayton Weaver
2003-06-08 20:51 ` Bartlomiej Zolnierkiewicz
2003-06-08 21:47 ` Willy Tarreau
